
Really? I find Intel's compiler to outperform GCC on pretty much all of the numerical work I do. I build with "-O3 -xHost" and make use of OpenMP.

Dynamic linking of the OpenMP library is almost certainly not the cause of the slowness you're observing. If you really want to force the Intel OpenMP runtime to use all 8 cores:

  export OMP_DYNAMIC=false
  export OMP_NUM_THREADS=8
  export KMP_LIBRARY=throughput

  # "KMP_BLOCKTIME" is how long an idle worker thread
  # should enter a blocking wait for more work before
  # sleeping, in milliseconds. default value is 200ms
  export KMP_BLOCKTIME=1000

  # following are needed if you use Intel MKL
  export MKL_DYNAMIC=false
  export MKL_NUM_THREADS=8

For more info: http://software.intel.com/sites/products/documentation/hpc/c...



"I find Intel's compiler to outperform GCC on pretty much all of the numerical work I do."

It's only a question of time. I suspect that if more companies decided to pool resources around GCC (or any other free C compiler, like pcc or clang), they would pretty much bury Intel.

Intel is a chip company. The only conceivable reason for them to want to maintain a C compiler is to make a C compiler that's better than the competition on Intel processors and that sucks as much as possible on competing architectures.

Icc is not a compiler. It's a sales tool.


I fully agree. I'm looking forward to LLVM becoming fully mature; it's a great platform already, and just needs to be fleshed out with some more optimizations/analyses/etc. And with the clang front-end, we can get rid of the unmaintainable pile of crap that is GCC.


A good compiler without bias would help the whole market, but not enough to be worth the expense.


What are some kinds (examples?) of code that you find ICC to compile better than GCC? Like, what's a typical loop that ICC can vectorize but GCC can't? I always have the damnedest time pinpointing when and where these optimizations fire and I've pretty much given up on the compiler when it comes to them. Rather I just develop code as normal and then when it's done find the top 3 or 4 functions in gprof (or Shark or whatever) and vectorize those by hand. Either that or try every compiler you have available and pick the one that yields the best time, but in my experience it's not always Intel.


ICC's vectorization is nearly useless. I've run it on thousands of lines of basic DSP code and gotten almost nothing--at best a single bad autovectorization.

The reasons ICC is better are many but unrelated to vectorization: one optimization I noticed is that it will compile code that depends heavily on aliasing concerns twice, then branch at run time to the appropriate code path depending on whether the relevant pointers alias or not. This branch is usually predictable, since the pointers in reality will probably never alias, but the compiler still has to abide by the C spec and keep the slow path around.
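To make the aliasing trick concrete, here is roughly what that transformation looks like if you write it by hand. This is only a sketch of the idea, not ICC's actual output: the function names and the `ranges_overlap` helper are made up for illustration, and comparing addresses through `uintptr_t` assumes a flat address space.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if [a, a+n) and [b, b+n) share any bytes.
   Comparing addresses via uintptr_t assumes a flat address space. */
static int ranges_overlap(const float *a, const float *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(float));
    return pa < pb + bytes && pb < pa + bytes;
}

/* Fast copy of the loop: restrict promises the compiler that dst and src
   never overlap, so it is free to reorder or vectorize the accesses. */
static void scale_add_noalias(float *restrict dst, const float *restrict src,
                              float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}

/* Conservative copy: same loop, but each store may feed a later load,
   so the element-by-element order has to be preserved. */
static void scale_add_mayalias(float *dst, const float *src,
                               float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}

/* Dispatcher: one cheap, highly predictable branch picks the version. */
void scale_add(float *dst, const float *src, float k, size_t n)
{
    if (!ranges_overlap(dst, src, n))
        scale_add_noalias(dst, src, k, n);
    else
        scale_add_mayalias(dst, src, k, n);
}
```

Calling `scale_add(a + 1, a, k, n)` on a single array takes the conservative path, while two separate arrays take the `restrict` path. Both give the result the C spec requires; only the freedom the compiler has inside the loop differs.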

There are probably a few dozen more things like this that add up to make it a few percent better than GCC. Though GCC is so buggy, and many of its heuristics (especially inlining and storing array/struct elements in registers) so utterly hackneyed, that beating it is not extraordinarily difficult.



