The transformation function in JPEG (the DCT) is generally well-defined math. While lossy, most of the information is reproducible.
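A quick sketch of that point, using scipy.fft.dctn as a stand-in for the 8x8 block DCT JPEG uses: the transform by itself round-trips a block essentially exactly.

```python
import numpy as np
from scipy.fft import dctn, idctn

# An 8x8 block of pixel values, the unit JPEG operates on
block = np.random.default_rng(0).integers(0, 256, (8, 8)).astype(float)

# Forward DCT, then inverse DCT, with nothing in between
coeffs = dctn(block, norm="ortho")
recovered = idctn(coeffs, norm="ortho")

# Reconstruction error is at floating-point noise level (~1e-12)
print(np.max(np.abs(block - recovered)))
```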
An LLM is layers upon layers of non-linear transformations, and it's hard to say exactly how information is accumulated. You can inspect the activations for individual tokens, but it's not clear how to define what the function is actually doing, so the error is poorly understood.
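You can see this with Hugging Face transformers, pulling per-layer hidden states out of a small model (GPT-2 here purely as an illustration): the activations are easy to dump, but the numbers alone don't say what each layer is contributing.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

inputs = tok("The cat sat on the mat", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# hidden_states: embedding output plus one tensor per layer,
# each of shape (batch, sequence, hidden_dim)
for i, h in enumerate(out.hidden_states):
    print(f"layer {i}: shape={tuple(h.shape)}, norm={h.norm().item():.1f}")
```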
JPEG is actually similar. The DCT itself is invertible, but its output is quantized, which is where much of the compression happens (DCT -> quantization -> IDCT), so the end-to-end process is not truly invertible. Maybe that's an analogy to the non-linearities in between the linear steps in deep learning.
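Extending the earlier sketch with a quantization step makes the loss visible. The flat divisor Q = 16 below is a simplification; real JPEG uses an 8x8 quantization table that rounds high-frequency coefficients more aggressively.

```python
import numpy as np
from scipy.fft import dctn, idctn

Q = 16.0  # simplified uniform quantization step (stand-in for JPEG's table)

block = np.random.default_rng(0).integers(0, 256, (8, 8)).astype(float)

coeffs = dctn(block, norm="ortho")
quantized = np.round(coeffs / Q)      # the lossy, non-invertible step
dequantized = quantized * Q
recovered = idctn(dequantized, norm="ortho")

# Now the error is clearly nonzero: the rounding is where information is lost
print(np.max(np.abs(block - recovered)))
```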