> For small n we can directly implement the definition. For large n, the direct approach would be slow and would accumulate floating point error.
Is there a reason the direct definition would be slow, if we cache the prior harmonic number to calculate the next?
I think it's fair to say that summing the series directly would be slow, even if it's not slow when you already happen to have summed the previous n-1 terms.
Not least because, for modestly sized target sums, the number of terms you need to sum is more than is actually feasible. For instance, if you're interested in approximating a sum of 100 then you need something on the order of exp(100) or about 10^43 terms. You can't just say "well, it's not slow to add up 10^43 numbers, because it's quick if you've already done the first 10^43-1 of them".
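To put a number on that: a minimal sketch, assuming the standard asymptotic H_n ≈ ln(n) + γ (γ being the Euler–Mascheroni constant); the class and variable names here are just for illustration.

```java
public class HarmonicScale {
    // Euler–Mascheroni constant, to double precision.
    static final double EULER_GAMMA = 0.5772156649015329;

    public static void main(String[] args) {
        // Invert H_n ≈ ln(n) + γ to estimate the n needed for a target sum S:
        // n ≈ exp(S - γ).
        double target = 100.0;
        double termsNeeded = Math.exp(target - EULER_GAMMA);
        System.out.printf("Terms needed for H_n ≈ %.0f: about %.3e%n", target, termsNeeded);
        // Prints roughly 1.5e+43, which is hopeless to sum directly, cached or not.
    }
}
```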
It’s a natural observation, but it doesn’t address the floating point problem. I think the author should have said “slow or would accumulate floating point error” instead of “slow and would accumulate floating point error”.
You could compute in the reverse direction, starting from 1/n instead of from 1. Adding the smallest terms first keeps them from being swallowed by a large running total, so the floating point sum is stable, but this method is still slow.
Edit: Of course, for very large n, 1/n becomes unrepresentable in floating point.
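For what it’s worth, a minimal sketch of that effect (class name and n are arbitrary), using float rather than double so the forward-sum error shows up at a modest n:

```java
public class HarmonicSum {
    public static void main(String[] args) {
        int n = 10_000_000;

        // Forward: once the running total is large, the tiny 1/k terms
        // fall below its precision and are rounded away.
        float forward = 0f;
        for (int k = 1; k <= n; k++) forward += 1f / k;

        // Reverse: the smallest terms accumulate first, while the running
        // total is still small enough to absorb them accurately.
        float reverse = 0f;
        for (int k = n; k >= 1; k--) reverse += 1f / k;

        // Double-precision sum as a rough reference value.
        double reference = 0.0;
        for (int k = 1; k <= n; k++) reference += 1.0 / k;

        System.out.printf("forward:   %.6f%n", forward);   // noticeably low
        System.out.printf("reverse:   %.6f%n", reverse);   // close to reference
        System.out.printf("reference: %.6f%n", reference); // ~16.695311
    }
}
```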
Four techniques I’ve used to handle floating point imprecision/error (a rough sketch of 2–4 follows the list):
1. Use storage that handles the level of scale and precision you need.
2. Use long/integer (if it fits). This is how some systems store money, e.g. as micros. It’s sensible, but there is still a limit: a wild swing of inflation may force you to migrate to different units, and then a wild swing of deflation may leave you up in arms over data loss. It also sounds great until you need arbitrary scale and precision, where it can be a pain.
3. Use ranges when doing comparisons, to handle floating point error by fuzzy-matching numbers. This isn’t applicable to everything, but I’ve used it before; it worked fine and was much faster than BigDecimal, which mattered at the time. Long integers are really the best for this sort of thing, though; they’re much faster to work with.
4. BigDecimal. The problem with this is memory and speed. Also, since pi is irrational, you can’t store it fully in a BigDecimal, and doing calculations with pi at that kind of precision would be so slow things would come to a halt; it’s probably the way aliens do encryption.
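Here’s a minimal sketch of options 2–4 above (the names toMicros and EPSILON, and the tolerance value, are illustrative, not from any particular library):

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class PrecisionOptions {
    // (2) Money as long micros: 1 currency unit = 1,000,000 micros.
    // longValueExact() throws if the amount overflows or has sub-micro digits.
    static long toMicros(String amount) {
        return new BigDecimal(amount).movePointRight(6).longValueExact();
    }

    // (3) Fuzzy comparison: treat doubles as equal within a relative tolerance.
    static final double EPSILON = 1e-9;
    static boolean approxEquals(double a, double b) {
        double scale = Math.max(1.0, Math.max(Math.abs(a), Math.abs(b)));
        return Math.abs(a - b) <= EPSILON * scale;
    }

    public static void main(String[] args) {
        // Exact integer arithmetic on money: no floating point drift.
        long price = toMicros("19.99");   // 19_990_000 micros
        long total = 3 * price;           // 59_970_000 micros, exact
        System.out.println("total micros: " + total);

        // The classic 0.1 + 0.2 != 0.3 problem, handled by fuzzy matching.
        System.out.println(0.1 + 0.2 == 0.3);               // false
        System.out.println(approxEquals(0.1 + 0.2, 0.3));   // true

        // (4) BigDecimal: arbitrary precision, paid for in memory and speed.
        BigDecimal third = BigDecimal.ONE.divide(new BigDecimal(3), new MathContext(50));
        System.out.println(third); // 0.33333... to 50 significant digits
    }
}
```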