
If, like me, you spend most of your time in high-level, garbage-collected "scripting" languages, it's really worth spending a little time writing a few simple C applications from scratch. It is astonishing how fast a computer is without the overhead most modern languages bring in.

That overhead adds tons of value, certainly. I still use higher-level languages most of the time. But it's useful to have a sense of how fast you could make some computation go if you really needed to.



>It is astonishing how fast a computer is without the overhead most modern languages bring in.

No kidding! I once implemented Knuth's Dancing Links/Algorithm X [1] in Python as part of a Sudoku solver. At one step in the algorithm you need to choose from a group of "things to do next"; the C version I was using as a reference just chose the next item in the group (it was stored as a linked list). Changing that to the more efficient method -- scanning all the items and choosing the one that most constrains the search space -- made no obvious difference in the time it took to solve a 9x9 Sudoku: practically instant.

In the Python version, the efficient strategy also solved 9x9s practically instantly; the simpler strategy ended up taking several minutes! I forget exactly how many unnecessary steps the simpler version took -- hundreds of thousands, maybe -- but the C code just got out of the way and let the processor crunch through them too quickly to care about.

[1] https://www.ocf.berkeley.edu/~jchu/publicportal/sudoku/00110...
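
For anyone who hasn't seen Algorithm X: each column header tracks how many rows remain in that column, and "most constraining" means picking the smallest column. Here's a minimal C sketch of the two selection strategies, assuming Knuth-style circular header lists; the names are mine, not from either implementation:

    struct column {
        struct column *left, *right;  /* circular list of column headers */
        int size;                     /* rows remaining in this column */
    };

    /* Naive strategy: take whichever column happens to be first. */
    struct column *choose_next(struct column *root) {
        return root->right;
    }

    /* Constraining strategy: scan all columns and pick the smallest,
       which prunes the search tree far more aggressively. */
    struct column *choose_smallest(struct column *root) {
        struct column *best = root->right;
        for (struct column *c = best->right; c != root; c = c->right)
            if (c->size < best->size)
                best = c;
        return best;
    }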


Worth checking out Common Lisp. It's as high-level a language as you can get, and yet good compilers (like SBCL, or the commercial ones from Franz and LispWorks) can compile it to tight assembly with performance very close to that of C++. (You need to disable some runtime checks for that, but you can do it on a per-function level, so it's much less of a problem than it sounds.)


Haskell is also really fast. Most Stack Overflow questions asking "why is this Haskell program slow" get a detailed answer with an implementation that's about as fast as -- or even faster than -- C:

http://stackoverflow.com/questions/42771348/why-is-haskell-s...

http://stackoverflow.com/questions/6964392/speed-comparison-...

http://stackoverflow.com/questions/29875886/haskell-why-does...


None of those are very good examples. They spend a lot of time optimizing already dead-slow algorithms, and they end up with an optimized dead-slow Haskell version that is faster than the non-optimized dead-slow C version. Not very impressive.

I thought I would implement some naive versions:

For the second link, I just whipped up this naive version in Chez Scheme; it runs in 0.5s, and that should really be the baseline: https://pastebin.com/JXGLA4TR Incidentally, this seems to be on par with the fastest Haskell version posted that does not use precomputed primes for factorisation. It could be made a lot faster by using only primes, but I couldn't be bothered.
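
For reference, the problem behind the second link is Project Euler 12 (the first triangle number with more than 1000 divisors). A naive C baseline in the same spirit -- my own sketch, not the pastebin code -- looks like this:

    #include <stdio.h>

    /* Count divisors by trial division up to sqrt(n). */
    static int divisors(long n) {
        int count = 0;
        for (long i = 1; i * i <= n; i++)
            if (n % i == 0)
                count += (i * i == n) ? 1 : 2;
        return count;
    }

    int main(void) {
        long tri = 0;
        for (long n = 1; ; n++) {
            tri += n;  /* n-th triangle number */
            if (divisors(tri) > 1000) {
                printf("%ld\n", tri);
                return 0;
            }
        }
    }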

And I doubt you will find a Haskell version that finds the longest Collatz sequence faster than this: https://pastebin.com/FAdiHA3X (0.01s using gcc -O2). Yet again a simple algorithm, but this time with bolted-on memoization.
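
I can't paste the whole thing here, but a memoized Collatz search in that spirit is only a screenful of C. This is a rough sketch assuming the usual Project Euler 14 setup (longest chain for a start value below one million), not the exact pastebin code:

    #include <stdio.h>
    #include <stdint.h>

    #define LIMIT 1000000
    static uint32_t memo[LIMIT];  /* chain lengths for n < LIMIT */

    static uint32_t chain(uint64_t n) {
        if (n < LIMIT && memo[n])
            return memo[n];
        uint32_t len = 1 + chain(n & 1 ? 3 * n + 1 : n / 2);
        if (n < LIMIT)
            memo[n] = len;
        return len;
    }

    int main(void) {
        memo[1] = 1;
        uint64_t best = 1;
        uint32_t best_len = 1;
        for (uint64_t n = 2; n < LIMIT; n++) {
            uint32_t len = chain(n);
            if (len > best_len) { best_len = len; best = n; }
        }
        printf("%llu (chain length %u)\n",
               (unsigned long long)best, (unsigned)best_len);
        return 0;
    }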

Edit: The first link contains the most naive Fibonacci function. It might be mathematically elegant, but it is dirt slow. An iterative version is far faster:

    (define (fib n)
      ;; Iterative accumulator version: O(n) additions instead of
      ;; the exponential call tree of the naive recursion.
      (let loop ([n n] [a 0] [b 1])
        (if (zero? n)
            b
            (loop (- n 1) b (+ a b)))))
That one takes 0.1s to calculate the 100,000th Fibonacci number, and there are even faster ways.
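
One of those faster ways is fast doubling, which uses F(2k) = F(k) * (2*F(k+1) - F(k)) and F(2k+1) = F(k)^2 + F(k+1)^2 to finish in O(log n) steps. A C sketch, capped at n = 93 because without bignums F(94) overflows 64 bits:

    #include <stdio.h>
    #include <stdint.h>

    /* Sets *f = F(n) and *g = F(n+1) by fast doubling. */
    static void fib_pair(uint64_t n, uint64_t *f, uint64_t *g) {
        if (n == 0) { *f = 0; *g = 1; return; }
        uint64_t a, b;
        fib_pair(n >> 1, &a, &b);
        uint64_t c = a * (2 * b - a);  /* F(2k)   */
        uint64_t d = a * a + b * b;    /* F(2k+1) */
        if (n & 1) { *f = d; *g = c + d; }
        else       { *f = c; *g = d; }
    }

    int main(void) {
        uint64_t f, g;
        fib_pair(93, &f, &g);  /* F(93) is the largest Fibonacci number in 64 bits */
        printf("%llu\n", (unsigned long long)f);
        return 0;
    }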


But you still don't get any control over the memory layout of your data or over how allocations are performed, and that could easily cost you an order of magnitude in performance.
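
To make that concrete: one classic example of such control is choosing a struct-of-arrays layout so a hot loop streams one dense array instead of striding past cold fields. A made-up C sketch, not tied to any particular codebase:

    #include <stddef.h>

    /* Array-of-structs: each element drags its cold fields into cache. */
    struct particle_aos { float x, y, z, mass; char name[48]; };

    /* Struct-of-arrays: every field is its own contiguous array. */
    struct particles_soa {
        float *x, *y, *z, *mass;
        size_t n;
    };

    /* Summing masses now streams one dense array: a 64-byte cache
       line holds 16 masses instead of one whole particle. */
    static float total_mass(const struct particles_soa *p) {
        float sum = 0.0f;
        for (size_t i = 0; i < p->n; i++)
            sum += p->mass[i];
        return sum;
    }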


True, at least without using any vendor-specific extensions (that may or may not help here; I haven't investigated). My point was that in CL, you get close-to-C++ performance levels by default, you can get even closer by optimizing specific areas in your code, and all of that while still writing in a very high-level language. Of course you won't beat memory-optimized C++ code in performance, but you're already close enough that you rarely need to.


Yep, I've recently written an RFC 2822 (and friends) message parsing library in C that can parse 4GB of email messages per second (including decoding full bodies and normalizing everything to UTF-8) on my old Ivy Bridge i5. Part of the trick is never tying the number of malloc() calls to the number of messages or the size of the input, not using crappy OOP-like interfaces, and avoiding string comparisons -- all of which is possible. My library has no malloc(), no strcmp(), and no OOP. :)

It's about 4 times faster than gmime (also written in C -- though it's kind of an unfair comparison, because gmime doesn't do the body decoding and conversion to UTF-8 by default) and 20 times faster than Python (in a single thread). Now if only I could have an SSD that could read my mail archive that fast. :)
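
This isn't my library's actual code, but one trick for avoiding strcmp() looks like this: compare short header-name prefixes as packed integers and switch on them (little-endian assumed; names made up):

    #include <stdint.h>
    #include <string.h>

    /* Pack the first 4 bytes of a header name into a uint32_t and
       lowercase ASCII letters. Assumes a little-endian host and at
       least 4 readable bytes at p. */
    static uint32_t tag4(const char *p) {
        uint32_t t;
        memcpy(&t, p, 4);
        return t | 0x20202020u;
    }

    #define TAG(a, b, c, d) \
        ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
         ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

    enum header { H_DATE, H_FROM, H_SUBJECT, H_OTHER };

    /* Classify a header by its first four bytes: one integer
       compare per case instead of a strcmp() per candidate. */
    static enum header classify(const char *name) {
        switch (tag4(name)) {
        case TAG('d', 'a', 't', 'e'): return H_DATE;
        case TAG('f', 'r', 'o', 'm'): return H_FROM;
        case TAG('s', 'u', 'b', 'j'): return H_SUBJECT;  /* "Subject" prefix */
        default:                      return H_OTHER;
        }
    }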


I would love to read through some of that. Would that be possible?


>It is astonishing how fast a computer is without the overhead most modern languages bring in.

The other interesting thing about high-level languages (using Python as an example) is that a small number of rarely used features keeps the language from running significantly faster (see PyPy, for example).

Seems like there should be room for a language with a slightly reduced set of Python features (Py--) that runs much faster.


I worked through Project Euler using both C and Chez Scheme. Even after spending a lot of time optimizing an algorithm in Chez, you can sometimes squeeze out a further 5x speed increase just by porting it to C.

And Chez is _fast_.



