Intel's "cripple AMD" function (agner.org)
212 points by kierank on Jan 3, 2010 | 80 comments



I was an intern for AMD a few years ago (these are my views and not AMD's). I was pretty skeptical about AMD's antitrust claims against Intel until I went to work there. I'm as free market as the day is long, but there's a whole untold story of the evil things that go on in the back meeting rooms, even outside of sales, where most of the public lawsuit claims are/were.

The thing to remember is that AMD is a small fraction of the size of Intel, and they have to cover the same market segments. If they try to specialize (say, servers, or notebooks), Intel will just sell that segment at a loss. AMD has to cover everything with only a fraction of the people to stay competitive, and it's really hard.

Even while I was there, we had what I suspect (but have no proof) were incidents of people leaking product plans, roadmaps, etc. (but no IP) to Intel. It's sad, really.


"Even while I was there, we had what I suspect (but have no proof) were incidents of people leaking product plans, roadmaps, etc. (but no IP) to Intel."

I can't imagine Steve Jobs allowing this to happen at Apple. They have definitely caught people leaking things, and the consequences were swift and unpleasant for the leaker. Why can't AMD catch these people? Is there something preventing them from implementing the same kinds of measures to catch leakers as Apple?

(Using Apple just as an example, of course. I'm sure there are other companies who find leakers and make an example of them through the legal system.)


There was one guy who was strung up as a leaker ten years ago. Don't remember his name, but it was a big deal.

AMD's culture is just different than Apple's. For one, there are no "secret teams" like iPhone, iTablet, etc. (well, at least none that I knew about). For another, developers have real autonomy to make business decisions, something that would never happen at Apple. For instance, I, a lowly intern, redeployed software to the production line during an emergency. If something went wrong, chips would actually stop rolling out of the factory. I would imagine normal (non-senior) Apple engineers don't have that kind of autonomy.

The other major difference, which perhaps you caught above, is that AMD actually manufactures their own stuff. So, not only are there US engineers, but engineers overseas in the plants that AMD owns, engineers in Dresden, etc. Not to say that foreign engineers are somehow bad, but it is a lot harder to control leaks when you have engineers working literally 24/7 all around the world. And at the scale that you're making your own stuff, there are just more people than there are at Apple, and things are way harder to control. It's like herding cats.


> For instance, I, a lowly intern, redeployed software to the production line during an emergency. If something went wrong, chips would actually stop rolling out of the factory.

While I'm all for letting engineers react to things proactively, the fact that you, as an intern, were in a position where a screw-up could have shut down a fab is nothing short of terrifying to me.


> The fact that you, as an intern, were in a position where a screw-up could have shut down a fab is nothing short of terrifying to me.

To me, the converse is a lot scarier: what if I, feeling no personal responsibility for yield, hadn't been there after-hours looking for bugs in the first place? Or what if I did find the critical bug, but had to wait weeks of paperwork before the fix was pushed through? Or was blocked by office politics?

There's nothing more disheartening than having a fix for something serious that you can't push through. I've worked at companies like that: taking away the power to break something means taking away the power to make it better.

Not to say that I somehow dislike code reviews or generally fly by the seat of my pants: the situation really was a genuine emergency. I can't talk specifics, but the bug had already cost more than the damage I could have done by breaking something.


Toyota have a very similar rule that they actively promote: every worker has a button with which they can shut the line down if they see something wrong.

It seems to work for them...


Culture is part of it. How many people there had worked for Intel in the past, and how many will leave AMD to go to Intel in the future? I'd expect that's part of it too.


The problem here is that you can easily detect leaks to the press. Tainted information gets published and you know who leaked by how the leak deviates from reality.

It's less so with leaks between companies. A company that deals this way will never allow the information it gathers to show up like that. Even analysts who examine the leaked materials will have little information about the origins of what they are looking at (not so little in the case of Intel - they have like two competitors).


"I can't imagine Steve Jobs allowing this to happen at Apple. They have definitely caught people leaking things, and the consequences were swift and unpleasant for the leaker."

Indeed, you don't wanna mess with Apple: http://news.cnet.com/8301-13579_3-10291701-37.html


Well, AMD is a small fraction of the size of Apple as well. What's more, Apple's business isn't oriented towards cutting into the market share of a large competitor. And there have been significant leaks at Apple.


"The thing to remember is that AMD is a small fraction of the size of Intel, and they have to cover the same market segments."

Not sure what you mean by this statement. Should AMD get some sort of special consideration because they are smaller than Intel?


The point does not stand alone, you are right. drewcrawford said "If they try to specialize (say, servers, or notebooks), Intel will just sell that segment at a loss"

That's the point & that is arguably anticompetitive. I.e., they are small, and they have to spread wide to avoid being exposed to predatory pricing.


Yes, but they voluntarily entered into their business.

What is AMD's competitive advantage? The best I've ever seen is that they are cheaper than Intel for a given processor class/speed. However, that doesn't seem to have gotten them much traction. Competing on price alone rarely makes for a successful company.


Back when Opteron and the AMD64 instruction set came out, they were pretty convincingly beating Intel in performance, and for quite a long time; after that they made a big advertising push about their processing power per watt.


" Competing on price alone rarely makes for a successful company"

You sure?


Yes. It's fairly well established that chasing the lowest-price path usually doesn't make for a very stable business model in the manufacturing business.

This is even more significant when the competitive options are significantly limited (eg: processors). In larger markets (eg: automobiles) there is more room for low-cost competitors to make some money, but they are not often powerhouses of the industry.

If you have data to the contrary please post (and by data I mean more than just 1-off examples).


Well, first you said companies, not just manufacturing companies, so I would have pointed to Wal-Mart. But since you have limited the range it's more difficult, because manufacturing companies are not as high profile as retailers. I will point to China as a whole as a contemporary example: Chinese companies have consistently beaten the previous generation's manufacturers by achieving lower costs.

Anyway, since you say "it is fairly established", the burden should be on you to tell me where & by whom. What I know is established is that internet marketing gurus speaking to small & micro businesses recommend finding non-price differentiators. This arguably makes sense for small businesses, where market size is not an issue. Most large companies, however, need to go after large markets & that means lower prices.

My old marketing textbooks say that there are two broad positioning strategies & corresponding pricing strategies: niche (differentiated) & penetration (low cost). The larger share of the pie usually belongs to the latter, with high margins often going to the former.

I would argue that low cost strategies are probably more "stable" since they do not rely on innovation & other constant miracles. Even Apple may flop two or three major products in a row & die again.


Being a low cost producer is a great survival strategy, even for a new entrant. Walmart, Amazon, and PC vs. Minicomputer are good examples. You can either use your cost advantage to undercut prices profitably, or use your large margins to outspend on R&D, marketing, etc.

Selling at low prices given the same cost structure is not a good survival strategy. "We lose money on every unit but we'll make it up on volume!"


What about Honda vs... I don't know... Detroit?

That is a good example of manufacturing companies winning on cost.


Honda, Toyota, et al. got a leg up on the US automakers by delivering a higher quality and more efficient product. They also managed to do it cheaper, but their competitive advantage was NOT primarily price.

Car companies that tried to deliver a low-cost product without the quality behind it (eg Yugo) ultimately failed.


Honda's/Toyota's advantage was cost. They produced a better product for slightly less money because they had lower production costs. If you look at early Toyota cars, they were all low-end and cheap, because that's all they could make. Just look at the size of the Accord when it was introduced vs. the Civic vs. the Fit. It's all about starting at the low end of the market and working your way up to the less price-focused customer.

PS: Don't forget the market decides price; you can only really control cost if you want to be more than a niche player.


Yes but the Japanese strategy was not "sell at the lowest possible price". It was "we have good control over our costs, let's pick a price and see what we can deliver at that price". At that point in time, US automakers didn't even know what making a car cost them, they were so focussed on revenue they lost sight of the bottom line.


AMD voluntarily entered the desktop processor business, IIRC, at a time when the US government required a second source before it would buy x86 chips. How does that make anti-competitive behaviour from a company many times their size acceptable?


It is anti-competitive to sell anything at a loss.


When using IPP, I had to rewrite the CPU detector, even for new Intel chips as they came out. This code should be better... really, it should just benchmark all the options and catch processor exceptions to pick a supported path.

Instead, the idea is to do a static dispatch for 'known' chips, which is really bad. When the Core2Duo came out, the version of IPP we used reverted to basic MMX code instead of SSE2, about 2.5x slower. This is just bad code, and it's bad on Intel chips, not just AMD.
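
To sketch the "benchmark the options and catch processor exceptions" approach (a minimal illustration, assuming a Unix-like system and GCC-style inline assembly; all names here are made up): probe each candidate path under a SIGILL handler, discard the ones that fault, then time the survivors.

    #include <setjmp.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>

    static sigjmp_buf probe_env;

    static void on_sigill(int sig) {
        (void)sig;
        siglongjmp(probe_env, 1);   /* bail out of the faulting probe */
    }

    /* Returns 1 if try_fn executes without raising SIGILL. */
    static int cpu_supports(void (*try_fn)(void)) {
        struct sigaction sa, old;
        int ok = 0;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_sigill;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGILL, &sa, &old);
        if (sigsetjmp(probe_env, 1) == 0) {
            try_fn();               /* faults here if unsupported */
            ok = 1;
        }
        sigaction(SIGILL, &old, NULL);
        return ok;
    }

    static void try_sse2(void) {
        __asm__ volatile ("xorpd %%xmm0, %%xmm0" ::: "xmm0");  /* an SSE2 instruction */
    }

    int main(void) {
        printf("SSE2 path: %s\n", cpu_supports(try_sse2) ? "usable" : "unsupported");
        return 0;
    }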

Also there is the "optimized for benchmarking" piece. It's not always good to use all your cores for one job, for instance, but a lot of these libraries make the assumption that your CPU has nothing else to do.


Isn't this the textbook reason for using - and contributing to - open-source compilers and libraries?


And GCC is not that bad after all. I used OpenMP to parallelize my program on a Core i7 860, which supports 8 threads. But with icc as the compiler, it would only utilize 7 cores, and that does affect performance (about 10% slower in wall time than GCC using all 8 cores). I suspect it has something to do with icc's dynamically linked OpenMP library.


Really? I find Intel's compiler to outperform GCC on pretty much all of the numerical work I do. I build with "-O3 -xHost" and make use of OpenMP.

Dynamic linking of the OpenMP library is almost certainly not the cause of the slowness you're observing. If you really want to force the Intel OpenMP runtime to use all 8 cores:

  export OMP_DYNAMIC=false
  export OMP_NUM_THREADS=8
  export KMP_LIBRARY=throughput

  # "KMP_BLOCKTIME" is how long an idle worker thread
  # should enter a blocking wait for more work before
  # sleeping, in milliseconds. default value is 200ms
  export KMP_BLOCKTIME=1000

  # following are needed if you use Intel MKL
  export MKL_DYNAMIC=false
  export MKL_NUM_THREADS=8
For more info: http://software.intel.com/sites/products/documentation/hpc/c...
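
A quick way to verify how many threads the runtime actually launches under those settings (a minimal sketch; build with "gcc -fopenmp" or "icc -openmp"):

    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        int launched = 0;
        /* Measure inside a parallel region - that's what matters,
           not just what the environment variables request. */
        #pragma omp parallel
        {
            #pragma omp single
            launched = omp_get_num_threads();
        }
        printf("max threads: %d, actually launched: %d\n",
               omp_get_max_threads(), launched);
        return 0;
    }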


"I find Intel's compiler to outperform GCC on pretty much all of the numerical work I do."

It's only a question of time. I suspect that if more companies decided to pool resources around GCC (or any other free C compiler, like pcc or clang), they would pretty much bury Intel.

Intel is a chip company. The only conceivable reason for them to want to maintain a C compiler is to make a C compiler that's better than the competition on Intel processors and that sucks as much as possible on competing architectures.

Icc is not a compiler. It's a sales tool.


I fully agree. I'm looking forward to LLVM becoming fully mature; it's a great platform already, and just needs to be fleshed out with some more optimizations/analyses/etc. And with the clang front-end, we can get rid of the unmaintainable pile of crap that is GCC.


A good compiler without bias would help the whole market, but not enough to be worth the expense.


What are some kinds (examples?) of code that you find ICC to compile better than GCC? Like, what's a typical loop that ICC can vectorize but GCC can't? I always have the damnedest time pinpointing when and where these optimizations fire and I've pretty much given up on the compiler when it comes to them. Rather I just develop code as normal and then when it's done find the top 3 or 4 functions in gprof (or Shark or whatever) and vectorize those by hand. Either that or try every compiler you have available and pick the one that yields the best time, but in my experience it's not always Intel.


ICC's vectorization is nearly useless. I've run it on thousands of lines of basic DSP code and gotten almost nothing--at best a single bad autovectorization.

The reasons ICC is better are many, but unrelated to vectorization. One optimization I noticed is that it will compile a stretch of code that depends heavily on aliasing concerns twice, and branch to one code path or the other depending on whether the relevant pointers alias. This branch is usually predictable, since in reality the pointers will probably never alias, but the compiler has to abide by the C spec.
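
Written out by hand, the transformation looks roughly like this (my own illustration of the idea, not ICC's actual output; the function and names are invented):

    #include <stddef.h>
    #include <stdint.h>

    /* Two versions of the same loop: one the compiler may freely
       vectorize because the pointers provably don't overlap, and a
       conservative one preserving the exact order the C spec
       requires when they might alias. */
    void scale_add(float *dst, const float *src, float k, size_t n) {
        uintptr_t d = (uintptr_t)dst, s = (uintptr_t)src;
        size_t bytes = n * sizeof(float);
        if (d + bytes <= s || s + bytes <= d) {        /* usually true */
            float *restrict dr = dst;
            const float *restrict sr = src;
            for (size_t i = 0; i < n; i++)             /* fast path */
                dr[i] += sr[i] * k;
        } else {
            for (size_t i = 0; i < n; i++)             /* aliasing path */
                dst[i] += src[i] * k;
        }
    }

The overlap test is cheap and, as noted, almost always predicted correctly.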

There are probably a few dozen more things like this that add up to make it a few percent better than GCC. Though GCC is so buggy, and many of its heuristics (especially inlining and storing array/struct elements in registers) so utterly hackneyed, that beating it is not extraordinarily difficult.


Sounds like AMD should just start setting the vendor string to "GenuineIntel", then. (This is something like the "like Mozilla" in every user agent string. If dumb software is going to do dumb tests, you need to fool the dumb tests to get your interoperability.)


Better yet: make it writable.

That way the OS could change it per process/thread/context and the code would be happy.


VIA has a writable vendor string which they have used to reveal this type of shenanigans in the past.


I always liked those guys. The über-486 Centaur built was brilliant design and out-of-the-box thinking from top to bottom.

Weren't the VIAs able to trounce Xeons in some crypto stuff?


"Weren't the VIAs able to trounce Xeons in some crypto stuff?"

Yes, since they had crypto instructions and nobody else did. Now I would expect Westmere to be faster.


From the end of the article:

It is possible to change the CPUID of AMD processors by using the AMD virtualization instructions. I hope that somebody will volunteer to make a program for this purpose. This will make it easy for anybody to check if their benchmark is fair and to improve the performance of software compiled with the Intel compiler on AMD processors.


That idea is flawed, because setting that string makes the compiler's dispatcher pick paths tuned for specific Intel chips, which can be worse than taking the path meant for an AMD chip: even when the dispatcher chooses the least "optimized" option, it is still choosing a path intended for the chip it sees. The AMD fallback is at least generically optimized, but the Intel path is extra-optimized only for Intel - not for the AMD chip actually running it.


Yes and no. I'm fairly certain that often the optimizations will be better than the fallback codepath in practice. This is especially true if you note that this is defeating the check for the vendor id, not defeating a lot of the other checks the dispatcher does.


Sounds like the Pre and iTunes. Haha.


That would be passing off / trademark infringement FWIW.


No it wouldn't. All else aside, if this were the case, every browser for the past decade would have been sued for including "Mozilla" and/or "MSIE" in their user-agent strings.


The important point is whether the mark is being abused as a false indication of the origin of the goods. In the case of UA strings this is plainly not the case (now). Vendor IDs are still considered to be indicative of the vendor - such an objection could be avoided by a disclaimer, but this may still not be enough if the vendor ID were the common method to determine origin.

Of course there are rules against using as a TM something that is not indicative of origin (generic names of products &c.), and so if a name were needed for interoperability purposes, free use could well be argued for.

I perhaps trolled slightly. Mea culpa.

As for the Chrome browser identifying as Safari: I think not - what does Help > About say?


Well, it's somewhat unclear, since "GenuineIntel" and "Intel" are not quite the same. I'm betting a good lawyer could make the case that the first explicitly seeks to confuse the user and thus dilutes the trademark. It would be interesting to see this play out in court, though. I wonder if the judge could actually tell them that using their trademark in a technical sense like this causes dilution by itself.

As always, with the law it's a bit more complex than it might seem.

Also note that most browser UAs explicitly state "like X."


Chrome identifies itself as Safari, and we know how protective Apple is of its trademarks.


Has anyone tried the Sun Studio compilers? They're free and supposed to be as good as Intel, but I've seen virtually no discussion of them.


For x86, they fall behind a bit. For x64, they're faster than ICC in general. Definitely worth a look if you don't mind going to OSol.


My experience on x64 has been that Sun is usually competitive with GCC/ICC, but not clearly better. Sun's C99 compiler does really atrocious things with SSE intrinsics, strange since their C++ compiler handles intrinsics almost as well as GCC/ICC. Note that they also work fine on Linux.


Not only should they fix it, they should open source the code, so AMD can contribute.

Intel often makes noises about open source, so they should put their money where their mouth is.


Compiler discussion to one side for a minute...

Intel does more than just make noises about open source. Their wifi and graphics chipset support has been excellent over the years. Prior to the recent changes at ATI, they were pretty much the only company doing that.


Agreed, and it's good that they do, since I use their graphics and wifi drivers on my Ubuntu laptop.

My point was that if they are truly committed to open source they would do this too; they are, after all, a hardware company and should compete by making the best hardware.


I do not know much about processor benchmarking, but is it not a little weird that the benchmarkers use software that is not independent of the hardware they are testing? It seems like they are asking to be manipulated: why do they do this?


They think the software is processor-independent but it's really Intel-biased; that's the problem.


Sensational headline: this article is only about the Intel C Compiler, which, as far as I can see, is only used for benchmarketing and research purposes.


As a systems researcher, I have often used the Intel C Compiler when I wanted to make the fairest comparison possible, because it is generally accepted to produce the best code for x86 processors. With that background, I correctly guessed from the headline what the content of the article was.


"Only" benchmarketing? If any published benchmarks are affected by this misfeature, it's pitchforks and torches for Intel.


Does it really come as news to anyone that if Intel wants to show their CPU in best light, they'll use their own compiler?

Caveat emptor.

The bulk of the x86 world uses Microsoft C++ or GCC - end of story.


I disagree. ICC is often used to compile computation-heavy code, as it has a reputation for producing faster code.


I think the parent was talking more about third parties doing benchmarks. If someone (not Intel) uses benchmarking software compiled with ICC, it might report erroneously bad results on an AMD system.


Yup, that was my point. Also, 'caveat emptor' has limits - Intel should at least state that their compiler produces suboptimal code for their competitors' CPUs.


Bottom line: it's a business decision. Code generated by the Intel compiler "works" on AMD chips, although it may not be optimal. For Intel to support the optimal codepaths on AMD chips would require a substantial amount of research. I don't think they're intentionally crippling AMD chips; just declining to invest the effort to support them optimally.


Nope. Checking the vendor string to determine capabilities when the CPUID instruction already has flags for different capabilities is unjustifiable. When a CPU claims SSE2 support, the compiler should enable SSE2, regardless of the vendor string. If AMD's implementation of SSE2 is buggy, that's their problem, and Intel should have no trouble making it into a PR win.

This is really no better than printer manufacturers putting chips into their cartridges so that they can use the DMCA to prevent third parties from refilling or making compatible cartridges.


That isn't exactly how it works.

The proper way to do it:

    if( CPUIDbits & SSE1_CAPABLE ) { enable SSE1 }
    if( CPUIDbits & SSE2_CAPABLE ) { enable SSE2 }
    [etc]

The even better way to do it:

    if( CPUIDbits & SSE1_CAPABLE ) { enable SSE1 }
    if( CPUIDbits & SSE2_CAPABLE ) { enable SSE2 }
    if( CPU is Athlon 64 ) { disable some SSE2 functions }
    if( CPU is Pentium-M ) { disable all SSE2 functions }
    [etc]

Intel's way of doing it:

    if( CPU is Pentium 3 ) { enable SSE1 }
    if( CPU is Pentium 4 ) { enable SSE1/SSE2 }
    if( CPU is Core 2 ) { enable SSE1/SSE2/SSE3/SSSE3 }
    [etc]

Practically all sane applications do things the first way; a couple do things the second way. Anyone doing things the third way is just asking for trouble, both in terms of future compatibility and resilience to unexpected situations. For example, some VMs disable certain instruction sets, which would result in SIGILLs under the last method.
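
For reference, the first way is only a few lines of real code with GCC's cpuid.h helpers (a minimal sketch; other compilers expose the same CPUID leaf-1 feature bits under different names):

    #include <cpuid.h>   /* GCC/clang wrapper around the CPUID instruction */
    #include <stdio.h>

    int main(void) {
        unsigned eax, ebx, ecx, edx;
        /* Leaf 1 reports the feature flags; no vendor string involved. */
        if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
            return 1;
        printf("SSE:   %s\n", (edx & bit_SSE)   ? "yes" : "no");
        printf("SSE2:  %s\n", (edx & bit_SSE2)  ? "yes" : "no");
        printf("SSE3:  %s\n", (ecx & bit_SSE3)  ? "yes" : "no");
        printf("SSSE3: %s\n", (ecx & bit_SSSE3) ? "yes" : "no");
        return 0;
    }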


If you read the article, it looks like Intel's CPU-type dispatcher actually does it the second way (sorta; it appears to only check Intel CPU family IDs), but at the bottom of that list there's a big "if(CPU string is not "GenuineIntel") { disable everything and use crappy fallback code path }".


In addition to just the question of which path is optimal, they'd have to keep track of all the bugs in AMD's, Cyrix's, Transmeta's, and other implementations, which aren't the same as the bugs on Intel x86 chips. Falling back to a subset of the architecture that is more likely to produce the right behavior is the sane thing to do.

e.g. http://www.amd.com/us-en/assets/content_type/white_papers_an... v. http://download.intel.com/design/processor/specupdt/320836.p... - these aren't just a couple gotchas you can put on the back of an index card


Have you actually looked at the details of the AMD errata? I scanned through quite a few, and none of them looked at all relevant to whether a compiler should enable certain SIMD extensions. Also, for the bugs that do need workarounds, the fix is typically in the BIOS or kernel, not the compiler.

Regardless of whatever bugs may or may not exist in AMD's chips, it is anti-competitive for Intel's compiler to refuse to enable SSEx extensions on AMD processors that claim compatibility, when the user has requested SSEx instructions to be used, unless Intel has specific knowledge that AMD has never produced a sufficiently bug-free implementation of SSEx.


I was an AMD intern responsible for writing the software that tests for those bugs (these are my views and not those of AMD).

You're entirely on the mark--the bugs are generally not SSE-related, and are typically worked around at the BIOS level. In addition, these bugs aren't "bugs" in the software sense, but more in the engineering sense; running some specific set of 10 million instructions on 3% of the chips while holding at exactly 28 deg C flips a byte in the L2 cache or something. Not something that's easily reproducible.

OP is also right insofar as there would probably be a few gotchas with weird software that would make some engineer somewhere scratch his head because his for loop returns too quickly every third Monday. That doesn't mean that it's fair for Intel to disable SSEx, but it is probably an accurate observation--AMD and Intel have different bugs.


Clearly the cost of checking that AMD's errata is accurate is greater than the cost of having icc produce suboptimal code for AMD processors.

The results of incorrect code generation on any given platform would more likely be people ceasing to use Intel's compiler, and that's not something they want to deal with, I'm sure. As to it being anti-competitive, there's nothing stopping AMD from making their own compiler that produces more optimal code and trusts everyone's processors to work as advertised. If it produced better results, people would likely use it.


"As to it being anti-competitive, there's nothing stopping AMD from making their own compiler that produces more optimal code and trusts everyone's processors to work as advertised."

Sure there is: time and money. Huge companies that are near-monopolies have enormous reservoirs of both in comparison to the competition.

What Intel is accused of doing is considered anti-competitive because they are not actually improving their product. Instead, they are using their stronger market position to degrade the value of another company's product. This harms the overall market, and particularly the consumers. Hence, it's illegal.


The objection in the article is not what the compiler does, but how it's advertised. If it was claimed to be an optimizing compiler "for Intel CPUs only", then there would be no problem.

(Of course, if they did that, would people use it?)


This doesn't really make any sense. All you would need to do is compile the code on an Intel machine to get the fast version, and then you can run it on an AMD machine. It shouldn't really cause any problems as long as developers build on genuine Intel machines. Of course that is irritating, but it shouldn't cause any slowdown on other machines.


I think the compiler generates code which checks processor type at runtime, not compile time. If the compiled code is running on an AMD processor, the "safe" version of the compiled code is chosen automagically.


Wouldn't that make the code twice as large?


It's pretty common for runtime libraries to optimize low-level routines like memcpy, math, etc. with multiple different paths chosen on the basis of CPU capability bits. It's not the whole code that's twice the size; it's small functions which are implemented 2 or 3 or 5 times depending on what features are available.


Perhaps, but size doesn't really affect runtime performance that much, especially if most codepaths never execute - there's no processor cache churn, because the unused paths are never touched.

I don't really know anything about this compiler, so I'm certainly speculating. My assumption is that one writes some function foo() and the compiler prepends a dispatcher in front which forks (code paths, not processes) to one of N optimized but functionally equivalent codepaths based on the actual processor upon which the code runs.
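
Something like this, perhaps (a hand-rolled sketch of the pattern, not Intel's actual mechanism; the foo_* names are hypothetical):

    #include <stdio.h>

    static void foo_sse2(void)    { puts("SSE2 version"); }
    static void foo_generic(void) { puts("generic version"); }

    static int cpu_has_sse2(void) { return 1; }  /* stand-in for a real CPUID check */

    static void foo_select(void);
    /* Every call site goes through this pointer; it starts out aimed
       at the selector, which patches it on first use. */
    static void (*foo)(void) = foo_select;

    static void foo_select(void) {
        foo = cpu_has_sse2() ? foo_sse2 : foo_generic;
        foo();
    }

    int main(void) {
        foo();   /* first call: selects a path, then runs it */
        foo();   /* later calls: jump straight to the chosen version */
        return 0;
    }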


Size does affect performance because of the cache.

If the forks are inline, and the cache works in blocks, then you are wasting cache space for code that never runs.

But considering it's Intel, I'm sure they thought of that.


I suspect it patches a jumptable at initialisation time based on CPU type, and all the code used by one type of CPU is bunched close together. The unused code probably isn't even paged into physical RAM.


Certainly. These are performance optimisations; they must have measured the results.



