Hacker News

Floating point defines n/0 the same as math. It's infinity as long as n isn't zero.

In almost all forms of math, the value n/0 is undefined. It's definitely not infinity, for two reasons: depending on the sign of n, it could be negative; and neither inf nor -inf are numbers, so they can't be the result of an equation (unless you look at transfinite equations).

What you can do in math is talk about the limit of a sequence of fractions as the denominator approaches 0, and that's where you get some relation to infinity or -infinity. But the limit can also be any other number, if the numerator also gets closer to 0; or it may not exist at all, if the function oscillates.
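A quick numeric sketch of those three cases, in Python. The helper name `ratio_samples` and the particular functions are just illustrative picks, not anything from the thread, and sampling a few points only suggests a limit rather than proving one.

```python
import math

def ratio_samples(numerator, denominator, xs):
    """Evaluate numerator(x) / denominator(x) at each sample point x."""
    return [numerator(x) / denominator(x) for x in xs]

# Sample points approaching 0 from the right and from the left.
right = [10.0 ** -k for k in range(1, 6)]
left = [-x for x in right]

# Fixed numerator: 1/x grows without bound, and the sign depends on the side.
one_over_x_right = ratio_samples(lambda x: 1.0, lambda x: x, right)
one_over_x_left = ratio_samples(lambda x: 1.0, lambda x: x, left)

# Numerator also goes to 0: sin(x)/x tends to 1, while 3x/x stays at 3.
sinc_right = ratio_samples(math.sin, lambda x: x, right)
three_right = ratio_samples(lambda x: 3.0 * x, lambda x: x, right)

# Oscillation: sin(1/x) keeps swinging between +1 and -1 as x shrinks,
# so it has no limit at 0. These x values are chosen to hit the extremes.
peaks = [1.0 / (math.pi / 2 + k * math.pi) for k in range(4)]
oscillating = [math.sin(1.0 / x) for x in peaks]

print(one_over_x_right[-1], one_over_x_left[-1])
print(sinc_right[-1], three_right[-1])
print(oscillating)
```

Same denominator going to 0 in each case, and the outcome depends entirely on what the numerator is doing.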

StilesCrisis 7 hours ago

I explicitly didn't say "infinity or negative infinity" because I didn't think that level of pedantry would be needed here on HN. I guess I was wrong.

jdiff 6 hours ago

It's not positive or negative infinity. It is simply undefined. Math has many conventions, and you can define your own convention that it does equal some flavor of infinity, but that is only a convention, and not a universal one.

throw-the-towel 5 hours ago

All discussions of mathematics assume maximal possible pedantry.

simiones 4 hours ago

That's not the problem, and this is not just pedantry. It's just not correct to say that n/0 = inf, nor even to say that positive_n / 0 = inf, in any normal math context.

For example, if you accepted that n/0 = inf just like n/1 = n, then you'd conclude that n/0 + 3 = inf + 3 = inf, so n/0 + 3 = n/0, and subtracting n/0 from both sides gives 3 = 0. Or you'd want to do weird things like asking what sin(inf) is.
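For contrast, here is roughly how IEEE 754 floats, as exposed by Python's `math` module, sidestep that contradiction. This is only a sketch of float behavior, not a claim about ordinary arithmetic: infinity absorbs the 3, but the step that would cancel it from both sides yields NaN instead of 0.

```python
import math

inf = math.inf

# Adding a finite number to infinity leaves it unchanged.
absorbed = (inf + 3 == inf)

# Cancelling infinity from both sides is not allowed to succeed:
# inf - inf is NaN rather than 0, so "3 = 0" never follows.
difference = inf - inf

# sin(inf) has no meaningful value; math.sin raises ValueError for it.
try:
    math.sin(inf)
    sin_inf = "returned a value"
except ValueError:
    sin_inf = "ValueError"

print(absorbed, difference, sin_inf)
```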

freehorse 7 hours ago

> as long as n isn't zero

Which is the case with the softmax function: for T=0 you end up with a fraction that becomes either 0/0 or inf/inf [0]. So you do need branching, as floating point arithmetic is not gonna get you there.

[0] except for weights that are exactly 0

edit: thinking more about it, one could always express the softmax formula in a way that works with floating point arithmetic, but it would be very inefficient and sort of pointless
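A minimal sketch of what the branching version might look like, in plain Python. The function name `softmax_with_temperature` is made up for illustration, the T=0 branch assumes the intended result is the T -> 0+ limit (all mass on the largest logit, split evenly among ties), and subtracting the max before exponentiating is a common stability trick rather than something stated above.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, with an explicit branch for T == 0."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    top = max(logits)
    if temperature == 0:
        # Assumed limit as T -> 0+: one-hot on the maximum,
        # with ties sharing the probability mass equally.
        winners = [1.0 if x == top else 0.0 for x in logits]
        count = sum(winners)
        return [w / count for w in winners]
    # Shift by the max so the largest exponent is exactly 0; this avoids
    # overflow in exp() and the inf/inf case for small positive T.
    exps = [math.exp((x - top) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax_with_temperature([1.0, 3.0, 2.0], 0))
print(softmax_with_temperature([1.0, 3.0, 2.0], 1.0))
```

Without the `temperature == 0` branch, Python would raise ZeroDivisionError on the division, and languages that follow IEEE 754 semantics directly would hit 0/0 for the largest logit.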