Thanks for the explanation! The only hardware I have either has no FPU (Teensy 3.2) or has a single-precision FPU (Teensy 3.5)... I suppose in that case it would be better to write a fixed-point implementation?
Fixed point often works better: for a given word size it has better resolution, because no bits are spent on an exponent. The flip side is that the range of a fixed-point number is very limited (again, because it has no exponent). Practically, that means you need to do some analysis to make sure you don't overflow or saturate, while still using the bulk of your fixed-point range so that you actually get the resolution benefit. If you do that, fixed point is great.
If you don't have time or knowledge to do that, floating point works better because you can be more ham-fisted with your scaling and still not overflow or saturate.
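To make the overflow and saturation point concrete, here is a minimal sketch of saturating arithmetic in C. It assumes a Q15 format (a signed 16-bit value with 15 fractional bits); the `q15_t` type and the `q15_*` helper names are made up for this illustration and are not from any particular library.

```c
#include <stdint.h>

/* Q15 (assumed format for this sketch): a signed 16-bit integer that
 * represents values in [-1, 1 - 2^-15], with one LSB equal to 2^-15. */
typedef int16_t q15_t;

/* Clamp a 32-bit intermediate result to the Q15 range instead of
 * letting it wrap around. */
static q15_t q15_saturate(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (q15_t)x;
}

/* Saturating add: the sum is formed in 32 bits, so it cannot overflow
 * before it is clamped. */
static q15_t q15_add(q15_t a, q15_t b)
{
    return q15_saturate((int32_t)a + (int32_t)b);
}

/* Saturating multiply: Q15 * Q15 gives a Q30 product, which is rounded
 * and shifted back down to Q15.
 * Assumption: >> on a negative int32_t is an arithmetic shift. That is
 * implementation-defined in C, but it is what GCC does on ARM. */
static q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;  /* Q30 product */
    p += (int32_t)1 << 14;                /* add half an LSB to round */
    return q15_saturate(p >> 15);
}
```

With these helpers, `q15_add(30000, 10000)` clamps to 32767 rather than wrapping to a negative number, and `q15_mul(16384, 16384)` (0.5 times 0.5) gives 8192 (0.25). Saturation only limits the damage, though. The signal is still clipped, which is why the scaling analysis is worth doing up front.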
I have the choice between a slower processor (96 MHz) without an FPU and a faster one (120 MHz) with a 32-bit FPU. I might use the slower one, but with a fixed-point implementation.
Thanks for your tips!