Comparing Floating-Point Numbers Is Tricky

jepler · on March 30, 2017

The operator overlooks my pet peeve when it comes to comparing floating-point numbers: in C++ the "standard associative containers" (std::set and std::map) are based on ordering using an less-than relationship which must be a "strict weak ordering". Many times, the methods suggested for comparing floating point types do not satisfy the requirements of "strict weak ordering", and in this case the C++ standard says you've entered the realm of undefined behavior. In the code at $DAY_JOB, "undefined behavior" turned out in this case to include such pleasant side-effects as double frees(!).

Specifically: when you have a less-than relationship "<", then !(a<b) && !(b<a) implies that a and b are equal (a==b). And if a==b and b==c then it must be the case that a==c, or the requirements of the ordering predicate are not met. Unfortunately, under most of these FP comparison schemes, for numbers a and b that are "close but not too close", it's the case that a<b, but for x=(a+b)/2, a==x and x==b!

Negitivefrags · on March 30, 2017

This is actually caused by the use of the x87 80 bit floating point registers. (Infamous GCC bug #323)

When the float is first inserted into the set/map it has 80 bit precision, but that gets truncated to float or double precision during the store. This breaks the ordering as you are saying, but it's not an inherent flaw with floats as such.

The problem goes away if you compile with -mfpmath=sse because then the math will be performed in the same precision as the storage format.

Bug #323 is responsible for a huge amount of mistrust of floats that they don't deserve. Other compilers don't have this problem because they truncate the floats before any comparison.

jepler · on March 30, 2017

Yes, though I'm specifically talking about when you decide to define a 'bool less_than(double, double)' that uses some kind of fuzzy comparison approach internally. This can affect any platform, not just one with the "bug #323" behavior in it.

Negitivefrags · on March 31, 2017

You don't need (and shouldn't do) any kind of fuzzy comparison for less than for floats.

jepler · on March 31, 2017

You sure might imagine you need to. For instance, suppose you want to average pieces of data that are timestamped "almost the same", but could arrive to be processed with varying delays, including out of order arrival (so you can't just say "is this datum at 'about the same time as' the datum received just prior"). There are better approaches, but the one I inherited involved using a std::map which used a fuzzy less than as the ordering predicate; and my main task was to diagnose why, once in a blue moon, a segmentation fault could occur when doing some operation on the map (insertion, I think).

enqk · on March 31, 2017

It seems more logical to pre-quantise the timestamps before insertion. (Actual solution I've seen practiced)

tgjsrkghruksd · on March 31, 2017

As bug #323 points out, truncating floats is an incomplete solution and brings its own problems. The GNU people were not simply being lazy or ignorant, their approach was valid.

The "bug" is x87 design, period. And x87 is the past. It's old, and bad. SSE and IEEE 754 is the present.

mmarx · on March 31, 2017

> Specifically: when you have a less-than relationship "<", then !(a<b) && !(b<a) implies that a and b are equal (a==b). And if a==b and b==c then it must be the case that a==c, or the requirements of the ordering predicate are not met.

¬(a<b) ∧ ¬(b<a) → a=b is not, in fact, a requirement on <. Rather, the point is that behind the scenes, any two elements satisfying ¬(a<b) ∧ ¬(b<a) are treated as equivalent by these containers.

To see the difference, consider the (somewhat counter-intuitive) behavior of NaNs: for any two NaNs m, n, we have ¬(m<n) and ¬(n<m), yet also m≠n. If that implication were an actual requirement, then the usual ordering < on floats would not be a suitable ordering predicate. What happens, though, is that the containers will implicitly treat NaNs as equivalent, i.e., the notion of equivalence the container uses for the elements depends only on < and might not coincide with the usual ==.

srean · on March 31, 2017

Ah! fond memories, related one of the most vexing and entertaining bug that I discovered in my code.

I was using C++'s std::sort and soon enough things would go wrong. It would crash at unpredictable times, I suspected I was corrupting memory somewhere. Parts of my data structures would get overwritten by parts from some other data structure. I checked and checked and checked my code, nothing seemed wrong.

Its only after opening the covers, when I started peering into the sort that I realized my mistake. I was passing the comparison operator that was a "less than equal" relation.

noobermin · on March 30, 2017

>When comparing to some known value—especially zero or values near it—use a fixed ϵ that makes sense for your calculations.

If you're ever doing mathematical calculations of any sort, it is a good practice to have a handle on the scale your numbers will lie within. If not just to be a better professional, it helps you choose an ϵ that matches.

fnj · on March 30, 2017

Comparing floating point numbers for equality us tricky. In fact it is a classic fool's errand. Comparing for lesser or greater is not tricky.

deathanatos · on March 31, 2017

I have seen bugs that, essentially, were caused by a floating point less-than comparison. You can still get bitten if you're not careful. (A calculation was being passed to acos(x), which is undefined for x > 1. In our case, mathematically, the inputs evaluated to exactly 1, but in the land of floating points, it was slightly off.)

(The above bug is a restatement of bubblethink's sibling comment; it's an inequality where the two values are extremely (exactly!) close.)

The Go bug on the lack of a round function had several broken implementations, none of which had a direct equality[1].

[1]: https://github.com/golang/go/issues/4594

pasta · on March 31, 2017

Comparing for equality is not tricky because it works:

3.1415927410125732421875 equals 3.1415927410125732421875

The problem is that programmers expect the float to be a decimal (smaller than the float) or integer.

For example if you want to calculate that two vectors are parallel then comparing for lesser or greater won't give you the expected result if you need them to be zero in difference. You might think that the vectors (0, 10) and (10, 10) are parallel but using !(x > 0 || x < 0) won't help you. You still end up using a threshold of the smallest float, something like (abs(x - 0) < 0.0000000000000000000001)

bubblethink · on March 31, 2017

Why is that so ? If you can do a<b and a>b reliably, you can get equality as !(a<b || b<a). Inequality has the same issues for very close values.

wumpus · on March 31, 2017

Try putting NaN into your equation: not equal to itself.

keithnz · on March 31, 2017

There was another good article posted on HN on floating points - http://lemire.me/blog/2017/02/28/how-many-floating-point-num...

where the key takeaway is that between -1 -> 1 is where the %50 of floats live.

hprotagonist · on March 30, 2017

comparing floats is arguably nondeterministic.

https://randomascii.wordpress.com/2013/07/16/floating-point-...

kwhitefoot · on March 31, 2017

Comparing numbers is easy. The operators are there in the manual.

The hard part is understanding when it is appropriate to compare floating point numbers and how to produce them.

I regularly use floating point numbers as keys in dictionaries and of course all the code quality tools whine about comparisons being inexact. But in my case there is no fuzziness because the keys are all produced by the same method and hence do not suffer from any different rounding errors.

Pretty much every new member of the team sees floats being compared and has heart attack yet the code in question is in the oldest and most reliable component of the whole million line program.

Just don't expect two numbers produced by different expressions that would be mathematically equivalent but have operators in a different order to produce identical results.

Safety1stClyde · on March 31, 2017

I got confused reading this, because first of all it has a picture of 64 bit floating point, then it starts comparing float to int32_t. Obviously 64 bit floating point is "double" not "float", but then it has a picture of 64 bit and program code using int32_t.

What I would say is that when you're considering comparison of floating point numbers, it's important to understand what the operation means in terms of the data which you're representing using the floating point numbers, in other words what does it mean in terms of the data for two values to be equal or not equal. Usually there is a precision inherent in the data itself which will guide you to how to formulate equality, if necessary.

exDM69 · on March 31, 2017

Here's my less-than-scientific floating point near-equality test I use.

    bool zero(float x) ( return x*x < FLT_EPSILON; }
    bool equal_float(float a, float b) {
        return (zero(a) && zero(b)) ||   // both are zero
            zero((a-b)*(a-b) / (a*a + b*b)); // or relative error squared is zero
    }

This checks equality to about four decimal digits for 32 bit single precision and seven digits for 64 bit floats. Inf/NaN special values are not considered.

Critique and comments welcome.

janco · on March 31, 2017

`FLT_EPSILON` represent the minimum difference between two adjacent floats around 1.0; it should be scaled according to the input argument. E.g. your `equal_float` returns `true` for 2e-6 and 4e-6 which are clearly not the same number.

A better comparison would check for zero somehow like this:

    bool zero(float x) { return std::abs(x) <= std::abs(x)*FLT_EPSILON;  }

exDM69 · on March 31, 2017

Yes, this is by design and it works as intended.

I typically use this with doubles and DBL_EPSILON, which is much much smaller than FLT_EPSILON.

With FLT_EPSILON this roughly equals to "zero" being "less than 0.001". If the zero check is omitted, there's going to be a division by near-zero which will make the results nonsense (and you have to draw the line somewhere). With DBL_EPSILON "zero is less than 0.000000001".

If this is too loose, then `zero(x) = abs(x) < FLT_EPSILON` makes it much stricter (about 1e-7).

This is good enough for my purposes, I don't deal with very small numbers in float and doubles give more than enough precision.

NOTE: I usually use this kind of comparison in testing by comparing known "gold" figures against the results of the code being tested. I don't test accuracy, I test for "in the ballpark" because the stuff I deal with has built-in inaccuracy in the algorithm and numerics.

The version you posted will always return false if I read it correctly.

hcrisp · on March 30, 2017

There was a very good article explaining this using MATLAB, but I can't find it right now. This one is pretty close and explains the concepts of overflow, underflow, etc. The diagrams about "eps" are pretty good, even if your language of choice is Python, C/C++, etc.

http://blogs.mathworks.com/cleve/2014/07/07/floating-point-n...

ubernostrum · on March 31, 2017

Python has math.isclose() in the standard library, with configurable tolerances:

https://docs.python.org/3/library/math.html#math.isclose

mikeash · on March 30, 2017

Formatting note for the author: on Safari Mac, something is causing ff and fl ligatures to be applied even to the monospaced code, which makes it look kind of weird.

mrkline · on March 30, 2017

Is it better now?

mikeash · on March 30, 2017

I still see it in a couple of places where code-formatted text is inline with regular text, for example "relative_difference." The code blocks themselves look good.

mrkline · on March 31, 2017

Hopefully that fixes it. Sorry I suck at CSS.

mikeash · on March 31, 2017

Looks good now!

mjevans · on March 31, 2017

I try to always remember:

Floats are great for quick, mostly correct, math.

Think REALLY carefully about any process that then 'compares' the result. Usually you're "doing it wrong" if that's the case.

I think I might even find it useful if compilers could be instructed to warn whenever comparison operators were used on a float type value.

munro · on March 31, 2017

Hm, this is interesting--in over a decade the only time I can think I've ever needed to compare floats are 1) deduping duplicate data, naive comparison is what I want 2) disambiguating messy user data, which I would take floats over strings any day

JoelJacobson · on March 30, 2017

Can someone explain why float is "better" than numeric in this example?

pg1:joel=#* select 1/7::numeric7; ?column? ------------------------ 0.99999999999999999998 (1 row)

pg1:joel=# select 1/7::float*7; ?column? ---------- 1 (1 row)

mikeash · on March 30, 2017

It looks like numeric is a decimal type. 1/7 can't be represented exactly in either binary or decimal, so it's going to come down to rounding. It just so happens that 1/7 in binary, rounded to float precision, then multiplied by 7, is equal to 1. Do the same in decimal with whatever precision numeric gives you, and the result is not 1. I don't think there's any deep reason for it, it's just how it happens to work out. You can probably find a value where the opposite is true.

luhn · on March 30, 2017

Floating point math is a bit fuzzy, so many languages will round to the nearest integer if the float is within a certain margin.

Numeric types are meant to be exact, so they will be represented as an exact value.

tmyklebu · on March 30, 2017

Floating-point math is carefully-defined. There are multiple independently-developed but interoperable implementations and an IEEE standard that talks in detail about how floating-point math is supposed to work. It's not "a bit fuzzy."

ghettoimp · on March 31, 2017

It most certainly is a bit fuzzy. For a fun critique of the newer IEEE standard by a real FP expert, see:

http://www.russinoff.com/papers/ieee.pdf

For something more concrete, consider Section 8, "Variations Allowed by the IEEE Floating-Point Standard", of the TestFloat tool for testing floating point implementations for IEEE compliance:

http://www.jhauser.us/arithmetic/TestFloat-3c/doc/TestFloat-...

And of course, many arithmetic operations (e.g., trig functions) aren't even covered by the standard, which occasionally provokes consternation like this...

https://forums.theregister.co.uk/forum/1/2014/10/10/intel_un...

simonbyrne · on March 31, 2017

The problem with the "floating point is fuzzy" comment is that people start treating it as some sort of black box, or as if the results are random somehow. Sure there are a few weird things with floating point status flags, but mostly "fuzziness" is perfectly understandable when you grasp what it is doing (including the usual 0.1 + 0.2 != 0.3).

Also the standard does specify trig functions (§9.2 Recommended correctly rounded functions), but that's one of the optional parts, and as far as I know no one has actually implemented them fully (CRlibm came close, but I don't think their pow function has been fully proven to be correctly rounded, and in any case it isn't widely used).

This is actually a big problem with most standards: a lot of them contain finicky details about which only a very small subset of people care. As far as I know, there still isn't a C compiler that implements all the PRAGMAs specified in the C-1999/C-2011 specs.

tmyklebu · on March 31, 2017

I wasn't aware of the ambiguity surrounding the underflow flag noted in your second link. However, the other complaints I'm reading from your links (things not specified by the standard may vary in behaviour; the standard is written in English and could have been written in different or better English) do not impact the semantics of floating-point arithmetic in a material way.

jnordwick · on March 31, 2017

> so many languages will round to the nearest integer if the float is within a certain margin

What language does this? It is more likely that it is printing out a truncated form but the number still contains that small difference, such as it will print "1.1" even though 1.1 isn't exactly representable in binary. This comes from the often used Grisu set of algorithms for printing which (in Grisu3) prints the smallest string that can represent the floating point bit pattern.

luhn · on March 31, 2017

That behavior was what I was referring to. I only have limited understanding on the topic; it seems my explanation was not quite right. Thanks for correcting me.

rusk · on March 31, 2017

My basic rule of thumb is to only use floats at the presentation layer, for storage, or for measurements. If you're doing any serious calculations you need to normalise to a more reliable format first.