Bygone, forgotten, over? One year after Meltdown of processor security

May 29 2019

by Werner Haas

Do you recall the turn of the year 2017/18? Of course, I am not referring to the New Year’s resolutions that usually fall by the wayside after a couple of weeks. Back then, I (together with a small team of other security researchers) was waiting for Intel to disclose security vulnerabilities we had discovered in its microprocessor hardware. We expected a fair bit of excitement because the industry had been scrambling to get mitigations in place. However, I was thoroughly gobsmacked by the kind of delayed fireworks unfolding in the media. More than a year has elapsed since then, so it is only fair to ask what is left now that the smoke has cleared - and why it was not the beginning of the end of the familiar IT universe, as predicted by a couple of pessimists.

We were invited by AVANTEC to provide answers, since we had the pleasure of explaining the original Meltdown/Spectre vulnerabilities at their IT-Security INSIDE event in 2018. This blog post closely follows the corresponding article written in German.

What has actually happened?

Pre-Meltdown/-Spectre, the dominant causes of IT security issues could be classified into two major categories: protection mechanisms were bypassed either via software bugs or by abusing legitimate user credentials. The revolutionary characteristic of the new exploits was that supposedly secret data was accessible despite fully validated software running on bug-free computer hardware, without any breach of access credentials.

We have already covered the technical background of different hardware-level vulnerabilities in other blog posts (Meltdown, SpectreV4, LazyFP, L1TF, ZombieLoad), so I will only provide a brief summary. Meltdown and Spectre represent two fundamentally different mechanisms. What they have in common is that they rely on the basic operating principles of microprocessors. Progress in semiconductor technology has been mind-boggling, but the enormous gain in compute performance is more a result of innovations in computer architecture than a direct consequence of improved physical properties. It is quite common to compare instruction processing with assembly line work. As usual, however, such analogies have their limitations. Let’s take the production of simple bolts and nuts as an example. No production manager would think about making the moulding of blanks dependent on the colour of the packaging.

Unfortunately, a processor has to cope with exactly this kind of dependency between early and late processing steps. It is a consequence of the programming model of the first electronic general purpose computer, ENIAC. John von Neumann became so involved in the project that the underlying architecture is often associated with his name. Even today, roughly 75 years later, the vast majority of computer systems adhere to its principles. Underneath the surface of the programming model, however, things have changed considerably. To pick up the production plant analogy again, the quality control at the end ensures that all products leaving the factory meet the specification.

Meltdown-like attacks evade safeguards during the production of results, such as access restrictions for user-level code. The Spectre class, on the other hand, leverages the fact that the processor has to guess at possible branches because the corresponding information, i.e. the branch condition and/or the next instruction’s address, takes some time to calculate. Although the quality control still does its job and discards all faulty products, a vigilant observer can draw conclusions about the nonconforming products by watching the goods received by the factory. This kind of information gathering is called a side channel because the regular outgoing goods, i.e. the effects of executing instructions, exactly match the programmer’s expectations.
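To make the factory picture a little more tangible, here is a minimal sketch in Python. It is a toy model, not an exploit: the "speculation" and the "cache" are ordinary Python constructs, and all names (ToyCPU, victim, leak_byte, the PUBLIC/SECRET memory layout) are made up for illustration. The victim performs a bounds-checked read and never returns an out-of-bounds value, yet the guessed execution leaves a footprint that an observer can read back afterwards.

```python
# Toy model: "speculation" and the "cache" are plain Python, not real hardware.

PUBLIC = b"public"          # data the victim is willing to hand out
SECRET = b"KEY"             # data sitting right behind it in memory
MEMORY = PUBLIC + SECRET    # hypothetical flat memory layout


class ToyCPU:
    def __init__(self, memory, array_len):
        self.memory = memory
        self.array_len = array_len   # architectural bound of the public array
        self.cache = set()           # probe lines touched so far

    def victim(self, index, predict_in_bounds=True):
        """Bounds-checked read; out-of-bounds requests return None."""
        in_bounds = 0 <= index < self.array_len
        if in_bounds or predict_in_bounds:
            # Runs before the bounds check has resolved (the "guess").
            value = self.memory[index]
            self.cache.add(value)    # footprint: probe line number `value`
            if in_bounds:
                return value         # result passes quality control
        return None                  # result discarded, footprint remains


def leak_byte(cpu, index):
    """Recover one out-of-bounds byte without ever being handed it."""
    cpu.cache.clear()                   # "flush" the probe array
    assert cpu.victim(index) is None    # officially, nothing came back
    (leaked,) = cpu.cache               # "reload": exactly one line is warm
    return leaked


cpu = ToyCPU(MEMORY, len(PUBLIC))
recovered = bytes(leak_byte(cpu, len(PUBLIC) + i) for i in range(len(SECRET)))
print(recovered)  # prints b'KEY'
```

On real hardware the footprint is not a Python set but the timing difference between cached and uncached memory, and the guess comes from a branch predictor the attacker has to train first. The sketch only shows why a correct architectural result (None) and a leaked secret can coexist.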

Was the excitement exaggerated?

Truth be told, I am not aware of a single case where Meltdown and/or Spectre was used for an actual cyber attack. This supports the critics’ claim that the media reaction went overboard. However, they overlook the fact that insiders were aware of the issues well in advance of the public disclosure and worked frantically on mitigations. Google’s Project Zero had informed Intel months ahead of us, and the main reason for the long quiet period was work on appropriate operating system patches.

To this end, developers had to change the memory management, a core functionality one does not toy with lightly. Of course, this also had performance implications. Fortunately, the technique known as Kernel Page Table Isolation (KPTI) in the Linux world is highly effective against Meltdown. It was also badly needed: we observed first-hand how quickly computer science students were able to leak data, given just the basic concept. Spectre requires more sophistication, but in its simplest form it would have been possible to reach sensitive user data via malicious banner ads.
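For readers who want to see what their own Linux machine reports, recent kernels expose their view of these issues as small text files under /sys/devices/system/cpu/vulnerabilities. The following sketch simply reads that directory; the function name read_mitigation_status is my own, and the code assumes a kernel recent enough to provide the directory (on other systems it returns an empty result).

```python
from pathlib import Path

# Directory where recent Linux kernels report hardware vulnerability status.
SYSFS_DIR = "/sys/devices/system/cpu/vulnerabilities"


def read_mitigation_status(base=SYSFS_DIR):
    """Return {vulnerability name: kernel status line}, or {} if unavailable."""
    root = Path(base)
    if not root.is_dir():
        return {}                    # not Linux, or kernel too old
    status = {}
    for entry in sorted(root.iterdir()):
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            continue                 # unreadable entry: skip it
    return status


if __name__ == "__main__":
    for name, state in read_mitigation_status().items():
        print(f"{name}: {state}")
```

With KPTI active, the meltdown entry typically reads "Mitigation: PTI". Keep in mind that this is the kernel's own assessment, so it says nothing about issues the running kernel does not know about yet.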

In summary, I believe it was first and foremost thanks to the adequate response of all stakeholders that the damage was kept under control. And, to borrow from the physical meaning of meltdown, nuclear bombs also carry significantly more weight in the news than plain old firearms, despite the fact that the latter have caused many more deaths.

This comparison may be blown out of proportion, but it leads me to another aspect. Sadly, computer crime has evolved to follow standard business rules. While Spectre-class attacks can be very powerful, they also require significant effort to mount. As long as primitive methods offer sufficient chances of success, it is simply not worth the trouble to apply more complex techniques. A handgun is more than sufficient for street robbery; the situation is completely different at the national level. Hence we always recommend a rational risk analysis in order to assess the real threat posed by hardware-level attacks and to deploy adequate countermeasures.

How has the situation evolved?

Our prophecy from January 2018 that Meltdown/Spectre was just the tip of the iceberg turned out to be true. Many more vulnerabilities in processor hardware have been discovered since then. Security researchers looked more systematically across different products. They also fed further processor submodules with carefully crafted, unusual execution patterns in order to trigger data leaks.

As suspected, no processor manufacturer was able to claim complete immunity, although Intel stood in the spotlight most of the time. That comes, of course, with being the market leader, but the questionable design decisions that led to Meltdown also invited researchers to dig for more. It has become a tradition of sorts for Intel to issue new security advisories every couple of months (May’18, June’18, August’18, April’19, May’19) - with no end in sight. On the other hand, Intel also applied some fixes in silicon, so the performance-draining operating system patches for Meltdown are no longer necessary on processors from the Cascade Lake and Whiskey Lake families. Buyers had better read the fine print, though. For example, only the most recent generation of Coffee Lake CPUs features the required microarchitectural changes.

Against Spectre-class attacks there is no hardware-only solution in sight (and, from my point of view, one is not very likely either). The mitigations in software, however, have improved significantly, i.e. their overhead has shrunk. The corresponding keyword is “Retpoline”, an algorithm conceived by Google engineers. While its reputation was tainted by the most recent updates to Windows 10, the performance loss most likely stems from other changes. Retpoline is interesting for another reason, too: as usual, technology by itself is neither good nor bad. Originally, the algorithm was conceived as a protection mechanism for computer programs. In the meantime, however, we realized that the basic idea can also be used to make our new attacks on different processor components more efficient. So the cat-and-mouse game continues…

Coming back to the headline: the media hype is certainly bygone, with new vulnerabilities barely getting any attention. But it is safe to say that Meltdown/Spectre will not be forgotten in the relevant circles any time soon. And the problems are far from over, as evidenced by the disclosure of the data sampling attacks on May 14th.
