Thanks for the link. It seems the implementations are already improving: judging from the posted benchmarks, it has become about twice as fast. That is still slow for some use cases (like ephemeral use for forward secrecy) and for slower devices. The best solution will be hardware support.