This is a great paper; it is very readable and well motivated, and I learned quite a bit. I'm also now looking forward to perusing the 2012 Ninja C paper.
One small change I would make to the preprint would be to better normalize the graph symbols. In particular, failing to read the legends of each graph carefully might cause readers to misattribute results in subsequent graphs (for example, my mind wanted to associate Intel's HRC with "the lighter gray ones with the boxes", which is not a stable representation across all graphs).
It is interesting to read how Haskell optimized the algorithm based on the intrinsic properties of the data structures. In contrast, the C compilers leveraged knowledge of the underlying machine. It is amazing how far Haskell compilers have come.
It's a good conclusion, I feel: this is always the issue with language benchmarks - who wrote the code, and how good were they with each of the languages?
Similarly, as the article points out, the compiler matters a lot: ICC can in certain cases be more than 200% faster than GCC with similar flags, and is generally 15-20% faster anyway, mainly due to more intelligent inlining and much faster (and more accurate with fpmath=fast) math libs.
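To make the fast-math point concrete, here is a minimal C sketch (the kernel is hypothetical; the flags are GCC's -ffast-math and ICC's default -fp-model fast):

    /* Strict IEEE semantics fix the order of the adds, so the compiler
       may only vectorise this reduction when it is allowed to
       reassociate them (gcc -O3 -ffast-math, or icc, whose default
       -fp-model fast already permits it). Illustrative sketch only. */
    #include <stddef.h>

    double sum(const double *x, size_t n) {
        double s = 0.0;
        for (size_t i = 0; i < n; i++)
            s += x[i];   /* one serial dependency chain without reassociation */
        return s;
    }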
As an AMD user I really hope most programmers know this by now. If you make a build for the general public as opposed to only targeting Intel machines, please don't use ICC.
..suggests nothing has changed. Search for "non-Intel".
Maybe ICC generates code which beats GCC even when run on an AMD chip in one particular benchmark, but that doesn't mean it generates better code in general.
Personally I will never trust the Intel compiler, because it's part of their business strategy to generate bad code for AMD processors.
Even if the claim in the original post about being "generally 15-20% faster" were true for Intel chips, then either it wouldn't be 15-20% faster on AMD, or Intel's documentation - which clearly states the compiler generates inferior code for non-Intel chips - is wrong.
You could look at the graphs - it's pretty obvious ICC wins in almost all benchmarks on all processors, not just one benchmark.
Intel state they do different optimisations for different chips, and I'd guess the compiler decides based on how many load/store ports there are, as this seriously affects the fp throughput.
These change per-chip - e.g. the Core i7 Sandy Bridge doubled the number of front-end float load ports over Nehalem, so more OoO execution can be done, and thus the compiler can generate code differently to take this into account.
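As an aside, this kind of per-chip dispatch isn't ICC-only; GCC can do something similar with target_clones. A sketch (mine, not ICC's mechanism), which notably dispatches on feature bits rather than vendor string:

    /* GCC emits one version of the function per listed target and
       selects among them at program load, based on the features the
       running CPU reports. */
    __attribute__((target_clones("default", "sse4.2", "avx2")))
    double dot(const double *a, const double *b, long n) {
        double s = 0.0;
        for (long i = 0; i < n; i++)
            s += a[i] * b[i];
        return s;
    }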
You can't expect Intel to optimise their compiler very thoroughly for all the processor models of their competitors.
>Intel state they do different optimisations for different chips,
The documentation says "more highly optimized for Intel® microprocessors than for non-Intel microprocessors" again and again. Not just different: inferior.
Also, the optimization notice mentioned in the Wikipedia article I linked - the one Intel was mandated to add by the courts - is still there. Maybe I should quote the current version in full:
"Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice."
>You can't expect Intel to optimise very thoroughly their compiler for all the processor models of their competitors.
As I said, I don't; I expect it to generate bad code for the products of the competition. And that's what it did and does. AMD dragged Intel to court over this and won. That's why the compiler documentation is now full of these "non-Intel" disclaimers.
ICC exists to sell Intel processors; one should always remember that. Intel isn't trying to make money selling compilers.
Because they're not going to spend time working out by trial and error (it's possible, by timing carefully chosen code) how many float ops/cycle each AMD chip can do. For their own chips they know the numbers themselves.
So they make an assumption for non Intel chips. Maybe they assume 2 when some AMD chips can do 4.
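The trial-and-error measurement itself is simple enough to sketch; something like this (illustrative only - a real measurement would pin the core, watch the clock frequency, and read the generated asm):

    #include <stdio.h>
    #include <time.h>

    #define N 100000000UL

    int main(void) {
        volatile float seed = 1.0f;      /* defeat constant folding */
        float a = seed, b = seed, c = seed, d = seed;
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (unsigned long i = 0; i < N; i++) {
            /* four independent chains, so adds can issue in parallel */
            a += 1.0f; b += 1.0f; c += 1.0f; d += 1.0f;
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec)
                    + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("%.2f adds/ns (sink %f)\n",
               4.0 * N / (secs * 1e9), a + b + c + d);
        return 0;
    }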
> ICC exists to sell Intel processors
And strangely enough, if you use ICC you'll generally get better code out the other end than from the other two major compilers, regardless of what chip you run it on.
I don't know the current state of ICC, but previously it would ignore the instruction set the CPU claimed to support and not use a large fraction of SSE instructions on non-Intel CPUs that supported them.
Centaur did a study where they changed their CPUID vendor string to claim to be an Intel part, and they got a significant performance boost when running code compiled with ICC.
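For anyone who hasn't poked at this: the vendor string and the feature bits are both right there in CPUID, so a dispatcher that wants to key on actual capabilities can. A small C sketch of the two strategies being contrasted (my illustration, not ICC's actual dispatch code):

    #include <stdio.h>
    #include <string.h>
    #include <cpuid.h>   /* GCC/Clang helper; MSVC uses __cpuid instead */

    int main(void) {
        unsigned int eax, ebx, ecx, edx;
        char vendor[13] = {0};

        /* leaf 0: vendor string, e.g. "GenuineIntel" or "AuthenticAMD" */
        __get_cpuid(0, &eax, &ebx, &ecx, &edx);
        memcpy(vendor + 0, &ebx, 4);
        memcpy(vendor + 4, &edx, 4);
        memcpy(vendor + 8, &ecx, 4);

        /* leaf 1: feature bits */
        __get_cpuid(1, &eax, &ebx, &ecx, &edx);
        int sse2 = (edx >> 26) & 1;   /* CPUID.1:EDX bit 26 */
        int sse3 = ecx & 1;           /* CPUID.1:ECX bit 0  */

        printf("vendor %s, sse2 %d, sse3 %d\n", vendor, sse2, sse3);

        /* vendor-string dispatch: fast path only for "GenuineIntel";
           feature-bit dispatch: fast path whenever sse2/sse3 are set. */
        return 0;
    }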
> The documentation says "more highly optimized for Intel® microprocessors than for non-Intel microprocessors" again and again. Not just different: inferior.
No, it's "less superior".
I don't see anything wrong with this, unless somehow Intel is preventing AMD from writing their own compiler and investing more into code generation for AMD chips than Intel does.
> ICC exists to sell Intel processors, one should always remember that.
I think it's probably more reasonable to look at it as ICC exists to ensure that Intel can ship new features, optimizations, and instructions in their chips and not be totally dependent on 3rd parties to make them available to C/C++ programmers.
I would like to add that I am not actually universally opposed to using ICC for general builds, but one should carefully benchmark the resulting binary in those cases... on a non-Intel machine.
As I pointed out, it is reasonable to assume that the code will run slower on non-Intel chips. Basically I just replied to the OP because I was worried about people blindly doing release builds with ICC assuming it means "higher performance for free". That may be the case in some situations, but given the background of the compiler one should always make sure.
> who wrote the code, and how good were they with each of the languages
I'm repeatedly surprised when programmers don't emphasize how much these comparisons are really about good programs versus not-so-good programs, and instead drift into somewhat tribal language comparisons.
I kind of think we know better, but that knowledge just doesn't serve language advocacy, so it's not mentioned.
As we often get hired by programming language, language advocacy has obvious importance to us ;-)
I'm surprised by how dramatic the difference is between the speed of C and Haskell. One of my professors (at The University of Glasgow, so appropriately a Haskell fan) once claimed that it had "C-like performance".
I suppose that's the point of this paper though: "C-like performance" is a terribly vague term, meaningless without knowledge of the specific comparisons being made.
For lots of algorithms, fairly naive Haskell can get very close (within 10%) to the performance of pretty decent C or C++.
For example, [1] shows that for very demanding algorithms (such as BLAS routines), Haskell can be very performant - with the optimisation being reusable and transparent to the programmer.
That's kind of an odd comparison, setting unfused C code against fused Haskell code. Their point seems to focus on stream fusion's advantages, and possibly that optimizing C takes more effort.
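For readers who haven't seen the term: a rough picture of what fusion buys, sketched in C (hypothetical kernels, not the paper's benchmarks):

    #include <stddef.h>

    /* unfused: map, then sum, with a temporary array in between */
    double sum_scaled_unfused(const double *x, double *tmp, size_t n) {
        for (size_t i = 0; i < n; i++)
            tmp[i] = 2.0 * x[i] + 1.0;   /* first pass writes memory */
        double s = 0.0;
        for (size_t i = 0; i < n; i++)
            s += tmp[i];                 /* second pass reads it back */
        return s;
    }

    /* fused: same result in a single pass, no temporary traffic -
       the rewrite GHC's stream fusion performs automatically */
    double sum_scaled_fused(const double *x, size_t n) {
        double s = 0.0;
        for (size_t i = 0; i < n; i++)
            s += 2.0 * x[i] + 1.0;
        return s;
    }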
To quote the paper: "Clearly “properly”-written C++ can outperform Haskell. The challenge is in figuring out what “proper” means."
Summary: Intel's HRC (Haskell Research Compiler) is an optimizing compiler for GHC's (Glasgow Haskell Compiler) "Core" intermediate language.* On six common benchmarks, it improves the performance of Haskell dramatically. But Haskell is still 4 times slower than the best C implementations of these benchmarks, on average.
* Core is just desugared Haskell and should not be confused with GHC's other intermediate languages, STG and C--. And there is no relation to Intel's "Core" microarchitecture.
Isn't Google's viewer really the default PDF viewer in Chrome? The one which doesn't add a toolbar or any tooltips that pop up even though my mouse isn't hovering over them.
Scribd is for people who want to share a PDF but don't know how to do it in any other way. It has pretty much never been welcome on HN, because that's the only problem it solves, and everything else it does is inferior to just having a native PDF you can view and save with no problems. We are not the target market, so I never understood why it was pushed on HN at all.
> Isn't Google's viewer really the default PDF viewer in Chrome? The one which doesn't add a toolbar or any tooltips that pop up even though my mouse isn't hovering over them.
No, it's not - unless Google figured out a way to install their PDF viewer plugin in my Firefox without me noticing. ;) The link 6ren posted is to Google Docs's file viewing component, now sold separately: https://docs.google.com/viewer
I'm not completely sure, but I think that for PDFs what the Google Docs viewer does is (1) generate a PNG image of each page, (2) extract the text and put it on the page as invisible HTML, positioned carefully to correspond to the PNG. The point of (2) is that it makes text selectable.
Did Scribd get 'better' in an empirical performance comparison with Google's viewer?
For me to take such a comparison seriously, you need to present the evidence transparently, with the understanding that every such comparison inevitably relies on making choices, and is hence only meaningful insofar as those choices can be seen and understood by the reader.