The speedup depends heavily on the structure of the benchmarked code. Natively written Go code tends to do most of its computation in local loops without many function calls, so the optimization has less effect there.
An interpreter often calls a function for every single instruction it executes. That means a lot of function calls inside loops, sometimes one for every operation. Code like that, of course, benefits massively from this optimization.
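
To make the contrast concrete, here is a minimal sketch of the two code shapes. All names in it (`sumLoop`, `run`, `instr`, `op`, `add`, `mul`) are made up for illustration and are not taken from any real interpreter. `sumLoop` does its work in a local loop with no calls at all, while `run` makes one indirect function call per instruction, so its cost is dominated by call overhead.

```go
package main

import "fmt"

// op is a hypothetical instruction handler: it takes the accumulator and an
// operand and returns the new accumulator.
type op func(acc, arg int) int

func add(acc, arg int) int { return acc + arg }
func mul(acc, arg int) int { return acc * arg }

// instr pairs a handler with its operand (illustrative only).
type instr struct {
	fn  op
	arg int
}

// sumLoop is the "native" shape: all work happens in a local loop,
// with no function calls inside it.
func sumLoop(n int) int {
	total := 0
	for i := 0; i < n; i++ {
		total += i
	}
	return total
}

// run is the "interpreter" shape: one indirect function call
// for every instruction executed.
func run(prog []instr, acc int) int {
	for _, in := range prog {
		acc = in.fn(acc, in.arg)
	}
	return acc
}

func main() {
	prog := []instr{{add, 2}, {mul, 3}, {add, 1}}
	fmt.Println(sumLoop(10))  // 0+1+...+9 = 45
	fmt.Println(run(prog, 1)) // (1+2)*3+1 = 10
}
```

Anything that makes a call cheaper pays off once per iteration in `run`, but has nothing to act on inside `sumLoop`. To measure the difference on your own machine, wrapping both functions in `testing.B` benchmarks would be the usual approach.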