GHC optimizes well within a single module (and seems to strike a reasonable balance between polymorphism, modularity, and linking), and it can inline some things across module boundaries. But compiling the whole program at once makes more advanced optimizations possible; I gather there are tradeoffs there too.
http://mlton.org/
https://www.reddit.com/r/haskell/comments/2tpmbo/what_on_ear...
http://stackoverflow.com/questions/4720499/possible-optimiza...
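
To make the cross-module part concrete, here is a minimal sketch, not taken from any of the links. `sumSquares` is a made-up name, and everything is kept in one file so it runs as-is; in a real project the function would sit in its own library module. As I understand it, the `INLINABLE` pragma asks GHC to keep the function's definition in the interface file so that importing modules can inline or specialise it when compiled with `-O`.

```haskell
module Main (main) where

import Data.List (foldl')

-- Hypothetical library function (the name is an assumption for illustration).
-- It is polymorphic in the numeric type, so without specialisation each call
-- goes through a Num dictionary passed at run time.
-- INLINABLE asks GHC to expose the definition to importing modules.
sumSquares :: Num a => [a] -> a
sumSquares = foldl' (\acc x -> acc + x * x) 0
{-# INLINABLE sumSquares #-}

-- A call site at a concrete type. With optimisation on, GHC may generate an
-- Int-specialised copy here instead of using the generic version.
main :: IO ()
main = print (sumSquares [1 .. 10 :: Int])  -- prints 385
```

Whether GHC actually specialises or inlines at a given call site depends on the optimisation flags and its own heuristics, so this only shows the mechanism. A whole-program compiler like MLton does not need the hint, since it sees every definition and every call site at once.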