An aside, but I recently learned -- if one is willing to use a very modest amount of memory -- summing floating-point numbers with no loss of precision is effectively a solved problem with the XSUM algorithm.
That paper explains some useful optimisation details, but the core observation is simple: every float is (infinities aside) an integer multiple of a known tiny fraction, the smallest positive non-zero float, so we can certainly sum them exactly.
Not if the ratio between the largest and smallest floats is very large: on the order of 2^(2^n), where n is the number of bits in the exponent.
I think you either haven't thought about this or you did your math wrong.
You need (2^e)+m+1 bits, where e is the number of exponent bits and m the number of mantissa bits. That is more bits than would fit in the cheap machine integer type you just have lying around, but it's not that many in real terms.
Let's try a tiny one first: the "half-precision" or f16 type has 5 bits of exponent, 10 bits of fraction, and 1 sign bit, so we need 2^5 + 10 + 1 = 43 bits. That will actually fit in the 64-bit signed integer type on a modern CPU.
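To make that concrete, here's a minimal sketch of the idea (my own illustration, not the XSUM paper's code; numpy.float16 is used only to get a half-precision type):

```python
import numpy as np

# Sketch: every finite f16 is an integer multiple of 2**-24, its smallest
# positive subnormal, so the exact running sum fits in an ordinary integer
# comfortably under 64 bits.

F16_UNIT_EXP = 24  # smallest positive f16 subnormal is 2**-24

def f16_units(x) -> int:
    """Exact value of an f16, as an integer count of 2**-24 units."""
    n, d = float(np.float16(x)).as_integer_ratio()  # exact; d is a power of 2 <= 2**24
    return n * ((1 << F16_UNIT_EXP) // d)

def exact_f16_sum(values) -> float:
    acc = 0
    for v in values:
        acc += f16_units(v)
    assert acc.bit_length() < 64      # ~43 bits plus log2(len(values)) of headroom
    return acc / (1 << F16_UNIT_EXP)  # rounds once, on the way back out

vals = [0.1] * 10_000
naive = np.float16(0)
for v in vals:
    naive += np.float16(v)            # small addends get lost to rounding as the sum grows
print(float(naive), exact_f16_sum(vals))
```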
Now let's try f64, the big daddy: 11 exponent bits, 52 fraction bits, 1 sign bit, for a total of 2^11 + 52 + 1 = 2048 + 52 + 1 = 2101 bits. As I said, that doesn't fit in our machine integer types, but it's still much smaller than a kilobyte of RAM.
Edited: I can't count, though it doesn't make a huge difference.
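The same sketch for f64 (again my illustration, not the paper's implementation), with Python's arbitrary-precision integers standing in for that ~2101-bit accumulator:

```python
# Count every finite double as an integer multiple of 2**-1074, the smallest
# positive double subnormal, and accumulate those counts exactly in a big int.

F64_UNIT_EXP = 1074  # smallest positive f64 subnormal is 2**-1074

def f64_units(x: float) -> int:
    """Exact value of a finite double, as an integer count of 2**-1074 units."""
    n, d = x.as_integer_ratio()               # exact; d is a power of 2 <= 2**1074
    return n * ((1 << F64_UNIT_EXP) // d)

def exact_f64_sum(values) -> float:
    acc = sum(f64_units(v) for v in values)   # never rounds; a few hundred bytes at most
    return acc / (1 << F64_UNIT_EXP)          # one correctly rounded conversion at the end

vals = [1e16, 1.0, -1e16]
print(sum(vals))                              # 0.0 -- the 1.0 is lost to rounding
print(exact_f64_sum(vals))                    # 1.0 -- the exact sum
```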