Improbable fun with IEEE Floating Point: agh

agh

Improbable fun with IEEE Floating Point

Sep 14, 2007 08:27

Had I played the lottery last Tuesday, I might have won. Something seemingly insignificant but perhaps even more improbable happened. I’ll have to give some background first before the punchline.

So, you want to compare floating point values. Plan A is to use ==. Unfortunately, you aren’t normally supposed to compare exactly, as there can be rounding errors and such, since things like associativity don’t hold.

Plan B is seeing whether they are within a fixed ε of each other. Strike two. The decimal precision of a float scales inversely with how near the values are to zero. A float consists of a sign bit, an exponent, and a mantissa. In other words, think scientific notation, .000005322 and 5,322,000 take the same number of digits: 5.322 * 10x in each case. Thus, rounding errors vary in the amount of imprecision based on how large the number is.

We need a plan C. Luckily, the IEEE standard cleverly builds floating point values so that they can be compared. 32-bit floats have 1 sign bit (s), 8 exponent bits (x), and 23 mantissa bits (m). Let M = 224, and E = 150 A float’s value is calculated as

(-1)s * 2(e-E) * (M + m)

The M factor is totally crucial, because it forces each real value into a particular float representation. Without that M, you could inversely vary the exponent and the mantissa and have many ways to represent the same value (e.g., 3*22 and 6*21). In particular, this would mean that there would be no plan C for comparing floats.

But because of this setup, we can basically just convert the bits to integers and compare them. This technique will scale nicely to the size of the floats, and was proposed by a coworker for some test code:

bool AreEqual(float a, float b)

{

// Reinterpret as integers

int aAsInt = *(int *)&a;

int bAsInt = *(int *)&b;

// Makes sure 0.f and -0.f are equal.

if (aAsInt < 0)

aAsInt = 0x80000000 - aAsInt;

if (bAsInt < 0)

bAsInt = 0x80000000 - bAsInt;

// Allow, say, four units of precision difference or less.

if (abs(aAsInt - bAsInt) <= 4)

return true;

else

return false;

}

Other than the trickery with the subtraction from 0x80000000, it’s pretty straightforward. However, that part bothered me, because I quickly looked at it and thought that a float and its negative would compare to be the same. I plugged in 2 and -2, and lo and behold, I was right! Without thinking further, I shot off a mail to my coworker showing the bug. He wrote some garbage back about “interesting case” and “minimum negative integer” and crap. I replied, saying “I think we’re talking about two different things. I’m saying the function is broken for any a and -a!”

I had just sent the mail when my hackles rose, and I decied to start plugging in other numbers. Sure enough, 2 and -2 were the only pair of numbers that failed. The reason was completely unrelated to my concern: the argument to the absolute value function was MININT, and since there’s one more negative integer than positive, abs cannot return a positive integer when given MININT, so it returns MININT. Of course, MININT is less than 4…

How could this happen? There are (232)2 pairs of floats, and at most 232 pairs that would subtract out of this formula to be MININT. The probability that I’d pick such a pair at random would be ½32. Now, my pick wasn’t exactly random, nor is the floating point standard, but still…

I wish I’d found those odds in a powerball ticket. I also missed the opportunity to pretend I had found the failure case through careful deduction and calculation.

code