Ask HN: Why didn't CPU manufacturers foresee Meltdown?
26 points by rwx------ on Jan 9, 2018 | 8 comments



The only thing harder than trying to figure out how to execute a side channel attack is attempting to make hardware immune to side channel attacks.
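
For a sense of how simple the attacker's half can be, here is a minimal sketch (assuming an x86-64 CPU with rdtscp and GCC/Clang intrinsics) of the timing primitive that cache side channels such as Flush+Reload are built on: time a load, and a cached line answers in tens of cycles while a flushed one takes hundreds.

    /* Minimal sketch: the cache-timing measurement behind Flush+Reload-style
     * side channels. Assumes x86-64 with rdtscp and GCC/Clang intrinsics. */
    #include <stdint.h>
    #include <stdio.h>
    #include <x86intrin.h>

    static uint64_t time_access(volatile uint8_t *p) {
        unsigned int aux;
        uint64_t start = __rdtscp(&aux);   /* timestamp before the load */
        (void)*p;                          /* the load being timed */
        return __rdtscp(&aux) - start;     /* elapsed cycles */
    }

    int main(void) {
        static uint8_t probe[4096];

        probe[0] = 1;                      /* touch the line so it is cached */
        printf("cached:   %llu cycles\n",
               (unsigned long long)time_access(&probe[0]));

        _mm_clflush(&probe[0]);            /* evict the line from the caches */
        _mm_mfence();
        printf("uncached: %llu cycles\n",
               (unsigned long long)time_access(&probe[0]));
        return 0;
    }

Meltdown's trick is getting speculatively read kernel data to decide which line of an attacker-controlled array ends up cached; recovering it afterwards is just this measurement in a loop.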

And this is a good question: the Department of Defense and other intelligence agencies have researched attacks like this for decades. Many mainframe/time-sharing systems have had much deeper levels of isolation built in because of government security requirements. It is my personal opinion that Intel especially overstepped its knowledge domain when moving from systems that ran mostly single-user 'trusted' code to systems that were multi-user and running untrusted code.

Processors stopped getting significantly faster and started getting wider (more cores), which began the push to commodity virtualization: one processor could run multiple operating systems at once. Again, much like the shift from disconnected computers with no security to everything being always connected, people wanted to use what they had already invested in. If Intel told you that you needed a $10,000 processor and had to recompile all your code (hmm, they actually tried that; it sank like the Titanic), you would tell them no.

AMD did a much better job at context isolation, but they have always been behind Intel in performance, possibly because of this exact issue. With customers not really caring about security, and the only attacks being theoretical, things played out just as we see.

So why did this happen?

* Legacy never goes away.

* Performance is king.

* People will go with 'wrong' but cheaper if the problem only shows up sometime in the future.

* Marketing departments like graphs with bigger numbers.

* Real isolation is expensive.

* Some people have to be set on fire before they accept they are not fireproof.


> AMD did a much better job at context isolation, but they have always been behind Intel in performance, possibly because of this exact issue.

Do we need to re-evaluate historical processor metrics? I wonder how Ryzen compares now to equivalent Intel processors. I'm aware that both are vulnerable to Spectre, but if I have understood things correctly, a fix for that isn't coming any time soon, and it's the lesser of the two evils.


It's a good question and I bet there's going to be a lot of speculation and pointing to documents and people that predicted these issues years ago. But it's important to remember that hindsight is 20/20.

All of us probably have, on some level, risky behaviors we participate in. Maybe you're reusing a password across a few sites, maybe you're driving a little too fast sometimes, or maybe you're sitting at your computer a bit more than you should. We take these risks daily, both at home and at work. Risk is a fact of life and of business.

Most of the risks we take turn out OK for us. But some don't. The problem is we tend to ask "why didn't we foresee it" only when something goes wrong, regardless of whether that risk was worth taking at the time. So the question is incredibly biased. Risks don't exist in a vacuum.

Anyway, I'm not an expert on CPUs or the CPU industry, but for what it's worth I'm guessing that they knew that there was a theoretical way to exploit this architecture. Maybe they knew how bad it could be, maybe they didn't. But market forces today are a lot stronger than some theoretical problem that may or may not happen in the future, especially if that theoretical problem would also hurt your competitors. I bet that fixing the problem is expensive in either money or performance. If that's the case, then fixing it when the risk is theoretical could be a bad business decision and might give your competitors an advantage (cheaper or more performant hardware).

I think a much more interesting question to ask is how we incentivize this industry to avoid these types of problems in the future.


You make a good point.

Sometimes trying to prepare for a crisis that might never come is the worst strategy.

Supposedly, this is what makes democracies stronger than authoritarian governments. Democracies argue endlessly, never taking action until a crisis hits, and then they pull off the impossible.

Authoritarian governments will try some crazy engineering solution for a problem they really don’t have (like irrigating Kazakhstan), and then 15 years later 90% of the Aral Sea is gone.

In other cases preparation pays off anyway: preparing for Y2K, although that crisis never came, turned out to be extremely useful in responding to the attacks of 9/11. A lot of the protocol used that day was originally going to be deployed 18 months earlier, on New Year's Eve, December 31st, 1999.


The main reason is the cost to secure versus the profits to be made. We technically could have more secure chips, but that would delay the regular release cycle of processors that businesses, investors, and consumers have become accustomed to.

Now, if things were secure by default with no option to disable it, we would be in a much better place right now security-wise (though I'm not sure how easy the usability would have been at first), as engineers would have to adapt to programming more securely at both the hardware and software level. Though I did find it strange that when you read the Intel documentation, not everything is accurate, and there is a large amount of illegal/undocumented opcodes [0].
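
As a toy illustration of what that kind of probing amounts to (sandsifter itself uses a smarter page-fault trick to infer instruction lengths; this sketch, assuming x86-64 Linux where writable+executable mappings are allowed, just executes a candidate byte sequence and checks whether the CPU raises #UD):

    /* Toy sketch: does the CPU accept a given byte sequence as an instruction?
     * Not sandsifter's actual technique, just the basic observation behind it.
     * Assumes x86-64 Linux and that writable+executable mappings are allowed. */
    #include <setjmp.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    static sigjmp_buf recover;

    static void on_sigill(int sig) {
        (void)sig;
        siglongjmp(recover, 1);            /* bail out of the faulting code */
    }

    /* Returns 1 if the bytes execute, 0 if the CPU raises #UD (SIGILL). */
    static int executes(const unsigned char *bytes, size_t len) {
        unsigned char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (page == MAP_FAILED)
            return 0;
        memcpy(page, bytes, len);
        page[len] = 0xC3;                  /* ret, so control comes back if it ran */

        int ok = 0;
        if (sigsetjmp(recover, 1) == 0) {
            ((void (*)(void))page)();      /* run the candidate instruction */
            ok = 1;
        }
        munmap(page, 4096);
        return ok;
    }

    int main(void) {
        signal(SIGILL, on_sigill);

        const unsigned char nop[] = { 0x90 };        /* documented: nop */
        const unsigned char ud2[] = { 0x0F, 0x0B };  /* documented to fault: ud2 */

        printf("nop: %s\n", executes(nop, sizeof nop) ? "executes" : "illegal");
        printf("ud2: %s\n", executes(ud2, sizeof ud2) ? "executes" : "illegal");
        return 0;
    }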

There is a chance that these vulnerabilities were reported internally or externally but never made it past a manager, were never reported to the vendor, or were seen as theoretical or not exploitable by management or a senior engineer, until someone else found them and, given time, figured out how to exploit them.

It is the same story with encryption algorithms implemented on chip: in theory they are hard to brute force, but with time a method is usually found that cuts that time dramatically, using artifacts in the algorithm or symptoms that occur during encryption/decryption, a game changer that was not thought of or documented.
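
A classic software analogue of that kind of leak, as a sketch (not tied to any particular on-chip implementation): an early-exit comparison whose running time reveals how many leading bytes of a guess are correct, which turns one big search over a 16-byte tag into sixteen small per-byte searches.

    #include <stddef.h>
    #include <stdint.h>

    /* Leaky: returns at the first mismatch, so the running time tells an
     * attacker how many leading bytes of the guess were right. */
    int leaky_equal(const uint8_t *secret, const uint8_t *guess, size_t n) {
        for (size_t i = 0; i < n; i++)
            if (secret[i] != guess[i])
                return 0;                  /* early exit leaks position i */
        return 1;
    }

    /* Constant-time: touches every byte regardless, accumulating differences
     * without branching, so timing no longer depends on the secret. */
    int ct_equal(const uint8_t *secret, const uint8_t *guess, size_t n) {
        uint8_t diff = 0;
        for (size_t i = 0; i < n; i++)
            diff |= secret[i] ^ guess[i];
        return diff == 0;
    }

The hardware variants are the same idea, except the 'symptom' is power draw, electromagnetic emissions, or cache state rather than wall-clock time.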

[0] https://github.com/xoreaxeaxeax/sandsifter


I think it would be a mistake to assume that Meltdown/Spectre are somehow obvious and should have been spotted early in the design process. Remember that these vulnerabilities have been present for about 10 years and only now have people found them. If the issues had been spotted within the first year of the CPUs' release, you could argue that they should have been more obvious to the designers as well.


Eh, a lot of it has to do with how the way we use computers has changed.

The x86 world went from either single-user systems (Windows 95) or multi-user trusted-code systems (business systems running their own code) to multi-user untrusted-code systems (VMs that execute JavaScript from the internet). Many mainframe systems have had much better code/cache separation because they were specifically designed not to leak information between multiple users. These systems were also far more expensive than even the most expensive x86 systems.


That is true, but all IBM systems, including mainframe, AS/400, and AIX boxes, are vulnerable to Spectre.



