From the article:

"As temperature approaches zero from the negative side, the model output will again be deterministic — but this time, the least likely tokens will be output."

I understand this to mean that a negative temperature far from zero is also quite random (just with a distribution that leans toward unlikely tokens).

Yep! Very large negative temperatures and very large positive temperatures give essentially the same, nearly uniform distribution. This is clearer if you consider thermodynamic beta, β = 1/T, where T = +∞ and T = −∞ both correspond to β = 0.
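
For anyone who wants to see it numerically, here is a minimal sketch. It assumes sampling probabilities are computed as softmax(logits / T), which is the usual convention but not something the article excerpt spells out. The function name `softmax_with_temperature` and the logits are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature). Temperature may be negative, but not zero."""
    beta = 1.0 / temperature               # thermodynamic beta (assumes k_B = 1)
    scaled = [beta * x for x in logits]
    m = max(scaled)                        # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits: index 0 is the most likely token, index 2 the least likely.
logits = [2.0, 1.0, 0.0]

for t in (1e6, -1e6, 0.01, -0.01):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 4) for p in probs])
```

With T = 1e6 and T = -1e6 the three probabilities are all close to 1/3, since β is close to 0 in both cases. With T = 0.01 almost all of the mass lands on the most likely token, and with T = -0.01 almost all of it lands on the least likely one.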