Why patching especially matters in a post-Meltdown and Spectre world.
For anyone in the cybersecurity industry, 2018 began on January 3rd — the day a trio of CPU bugs was announced. What trio? You probably recall Meltdown and Spectre, but from our perspective, the latter bug is really two for the price of one. While Meltdown and Spectre both got plenty of coverage in media outlets and security blogs around the globe (yes, that includes us, too), there’s an important distinction to make and more to say on this matter.
First, to be clear, and as a reminder: Meltdown is a bug that mostly affected Intel CPUs (with a few exceptions). Spectre, meanwhile, is a design flaw that is present in almost all CPUs out there. Unfortunately, within the last few weeks, it has become clear that more fallout is yet to come.
Here’s what we know: Eight new Spectre-like bugs have been announced, collectively called Spectre-NG (next generation). Four of these bugs are considered “critical.” The researchers who reported these bugs to both Intel and the German magazine Heise had not been identified at the time of the announcement.
While we wait for the full Spectre-NG disclosure, the gears of the Spectre machine show no sign of slowing. The first two of this set of eight were disclosed on May 21 as variants 3a and 4, tracked as CVE-2018-3640 and CVE-2018-3639 respectively. These variants were found by Microsoft and Google’s Project Zero.
So what is the fundamental problem with the design of modern CPUs? Here is an explanation.
Over time, we have created CPUs that are essentially oracles — CPUs that can predict the future … or at least predict what’s going to happen next. How so? First, it’s important to understand how a CPU communicates with memory. Today’s CPUs are so fast that asking memory for data slows them down. So, in order for a CPU to be faster, it requires something called a cache — a small, fast, and expensive piece of memory that keeps frequently-used data readily accessible.
Any time a CPU accesses data in its main memory, it first consults the cache. If the data is there, great; but if not, a copy of the needed data gets moved from the main memory into the cache, and then further to the CPU. The next time the CPU wants that same data, it will already be in the cache.
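To make the lookup order concrete, here is a minimal sketch of that behavior in Python. It is a toy model, not how real hardware is built: the class name `ToyCache`, the addresses, and the values are all made up for illustration.

```python
class ToyCache:
    """A tiny model of a CPU cache sitting in front of slow main memory."""

    def __init__(self, memory):
        self.memory = memory   # address -> value (stands in for main memory)
        self.lines = {}        # address -> value copies currently in the cache
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:      # fast path: the data is already cached
            self.hits += 1
            return self.lines[address]
        self.misses += 1               # slow path: go to main memory
        value = self.memory[address]
        self.lines[address] = value    # keep a copy for the next access
        return value


memory = {0x10: 42, 0x20: 7}           # hypothetical memory contents
cache = ToyCache(memory)
cache.read(0x10)                       # first access: miss, fetched from memory
cache.read(0x10)                       # second access: hit, served from cache
print(f"hits={cache.hits} misses={cache.misses}")
```

The point to keep in mind for what follows is the side effect: every read leaves a copy behind in the cache, whether or not the program ends up using the value.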
Modern CPUs also save time by working on several instructions at once rather than strictly one after another. Imagine that a CPU reaches some point in a program where it must decide which of two ways to go based on the content (value) of some place (address) in memory. That value is not in the cache, so it needs to be fetched from main memory.
In the meantime, the CPU becomes impatient and starts to speculate, “What if the value I’m waiting for makes me go this way? Because this way is usually taken ... I’ve gone through it several times before.”
So it speculatively executes a branch of code. Part of this code loads data from another place in memory (let’s call it a secret place) where the secret value is stored. Meanwhile, the real value finally arrives from main memory. Now let’s say the CPU recognizes that it must go the other way; it discards all the results it had from the speculative execution and continues on the correct path.
Now imagine that the value the CPU was waiting for is a security check that tells it not to touch the secret place. But it already did. Does it matter? Is that a big deal? After all, it was only done speculatively and all the results have been thrown away, right? Not quite. Remember the cache? The secret value is still there. By using a special technique (which I won’t dive into), the value can, in fact, be easily recovered.
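Without going into a real attack, the core idea can be illustrated with a toy simulation. This is only a sketch under heavy assumptions: `ToyCPU`, `run_victim`, and `recover_secret` are hypothetical names, the "cache" is a plain Python set, and real attacks infer what is cached by measuring access times rather than by asking directly.

```python
class ToyCPU:
    """Toy model: speculation leaves a cache footprint that is never rolled back."""

    def __init__(self, secret_memory):
        self.secret_memory = secret_memory   # address -> byte the program must not read
        self.cached_slots = set()            # which of 256 probe slots are in the cache

    def run_victim(self, address, access_allowed):
        # Speculation: the CPU guesses the security check will pass and runs ahead.
        speculative_value = self.secret_memory[address]
        # The speculated code touches a probe slot chosen by the secret value,
        # which pulls that slot into the cache.
        self.cached_slots.add(speculative_value)
        # The real result of the security check finally arrives.
        if not access_allowed:
            # The result is discarded, but cached_slots is left as it is.
            return None
        return speculative_value

    def is_fast(self, slot):
        # Stand-in for timing a memory access: cached slots respond quickly.
        return slot in self.cached_slots


def recover_secret(cpu, address):
    cpu.cached_slots.clear()                 # start with an empty cache
    result = cpu.run_victim(address, access_allowed=False)
    assert result is None                    # officially, nothing was read
    for slot in range(256):                  # probe every possible byte value
        if cpu.is_fast(slot):
            return slot                      # the one fast slot reveals the byte
    return None


cpu = ToyCPU({0x1000: 83})                   # hypothetical secret byte at 0x1000
leaked = recover_secret(cpu, 0x1000)
print(leaked)
```

The program never receives the secret as a return value, yet the one probe slot that responds quickly gives it away. That gap between the discarded result and the cache state that stays behind is what this family of bugs exploits.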
So this is the bare bones of the issue. As we raced for faster CPUs, we traded security for speed and speculation. Unfortunately, this cannot be easily fixed. At least not without sacrificing some performance. And why aren’t old CPUs affected? The answer is simple: they don’t speculate.
While these vulnerabilities are proven and serious (proof-of-concept code exists for all of them), there is no proof that any malware has successfully exploited them. To exploit them in a real-world case, the attacker has to know the “anatomy” of the target process and system. Therefore, it is very unlikely that these vulnerabilities will be exploited on a mass scale.
You may be asking: Should I patch my system, or not worry about it? Will the fixes slow down my computer?
The answer is yes, some mitigations might reduce performance, by up to 30%. And yes, you should definitely patch. However, be aware that to patch properly, you sometimes need to deploy both OS and CPU patches. Typically, the OS patch is automatic and, as such, straightforward. When prompted to patch, say YES. But because these are hardware and architectural issues in the CPU itself, you must also patch your CPU. You usually need to update your BIOS, which then delivers the so-called “CPU microcode” patch to your CPU every time your computer starts up. This is needed because any “patch” to a CPU is not permanent and therefore must be delivered again every time the CPU starts.
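If you run Linux, one way to see what your system reports after patching is to read the status files that recent kernels expose under /sys/devices/system/cpu/vulnerabilities. The sketch below assumes that directory exists; on Windows, macOS, or older kernels it simply returns an empty result. The function name `read_mitigation_status` is our own, not part of any library.

```python
from pathlib import Path

# Directory where recent Linux kernels report per-vulnerability status.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(directory=VULN_DIR):
    """Return {vulnerability name: status text}, or {} if the directory is absent."""
    directory = Path(directory)
    if not directory.is_dir():
        return {}
    status = {}
    for entry in sorted(directory.iterdir()):
        if not entry.is_file():
            continue
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            status[entry.name] = "unreadable"   # e.g. a permission problem
    return status


report = read_mitigation_status()
if not report:
    print("No vulnerability status found (not Linux, or an older kernel).")
for name, text in report.items():
    print(f"{name}: {text}")
```

Each line of the output names a vulnerability followed by the kernel's own assessment, which typically begins with a word such as "Mitigation", "Vulnerable", or "Not affected". Treat it as a quick sanity check rather than a full audit.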
Speaking of degradation in performance, it varies widely. Expect no more than a 30% performance hit, and in most cases, it will be lower. This performance hit comes from the fact that, to “mitigate” the Meltdown issue and protect OS-privileged (kernel) memory from leaking its content into unprivileged mode, OS vendors have taken special measures to further isolate these two memory spaces. As a result, every switch between privileged (kernel) and unprivileged (user) mode incurs a performance penalty. How much this costs you depends on how often such switches happen, which in turn depends on the type of application you are running. If you have a modern CPU (Skylake, Kaby Lake or newer), the performance overhead is a single-digit percentage. On older CPUs the hit may be more noticeable.
So what should you do?
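A rough back-of-the-envelope model shows why the application type matters so much. Every number below is hypothetical and chosen only for illustration: the added cost per switch and the switch rates are assumptions, not measurements.

```python
def estimated_slowdown(switches_per_second, extra_ns_per_switch):
    """Fraction of each second spent on the added kernel/user isolation work."""
    return switches_per_second * extra_ns_per_switch / 1e9


EXTRA_NS_PER_SWITCH = 1_000   # assumed added cost per switch (hypothetical)

workloads = {
    # name: assumed kernel/user switches per second (hypothetical)
    "number crunching": 1_000,
    "busy database or file server": 200_000,
}

for name, rate in workloads.items():
    slowdown = estimated_slowdown(rate, EXTRA_NS_PER_SWITCH)
    print(f"{name}: about {slowdown:.1%} slower")
```

With the same per-switch cost, a program that rarely asks the OS for anything barely notices the change, while one that constantly reads files or handles network traffic pays the penalty many times per second.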