Hacker News

No, LLMs only do this for language. They don't try to do this for arbitrary data.

There are many ways around this, the simplest being to treat bytes as tokens (cf. Google's ByT5[1]). See also BLT[2] from Meta and ByteFormer[3] from Apple.

[1]: https://arxiv.org/abs/2105.13626

[2]: https://arxiv.org/abs/2412.09871

[3]: https://arxiv.org/abs/2306.00238
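
For anyone wondering what "bytes as tokens" looks like in practice, here is a minimal sketch. It is not ByT5's actual tokenizer (which, as far as I know, also reserves a few ids for special tokens); the function names are made up for illustration. The point is only that a 256-entry vocabulary covers any input, text or not.

```python
# Byte-level "tokenization": every byte value 0..255 is its own token id.
# Hypothetical helper names; real byte-level models add special tokens on top.

def bytes_to_tokens(data: bytes) -> list[int]:
    """Map raw bytes to token ids in the range 0..255."""
    return list(data)


def tokens_to_bytes(tokens: list[int]) -> bytes:
    """Inverse mapping: token ids back to the original bytes."""
    return bytes(tokens)


# Text goes through UTF-8 first, so a non-ASCII character becomes several tokens.
text_tokens = bytes_to_tokens("héllo".encode("utf-8"))

# Arbitrary binary data needs no special handling at all.
binary_tokens = bytes_to_tokens(bytes([0x00, 0xFF, 0x89, 0x50]))

print(text_tokens)
print(binary_tokens)
```

The trade-off is sequence length: one token per byte makes sequences several times longer than with a subword vocabulary.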

energy123 10 hours ago

Transformers do this for any stream of tokens; those tokens can map to anything you want, and you will still get lossy compression. Text produced by humans just happens to be dense, available, and a useful prior, but it is not intrinsically required. See 3D vision transformers for example.

jeremyjh 7 hours ago

It is not possible to compress arbitrary data. If the data is already compressed, or it is encrypted, or it is randomly generated, it cannot be compressed with any method. This is foundational information theory.

https://en.wikipedia.org/wiki/Lossless_compression#Limitatio...
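
A quick way to see the lossless limit for yourself, as a rough sketch using Python's standard `zlib`. The sample inputs and the `ratio` helper are my own choices, not anything from the linked article. Pseudorandom bytes come out no smaller (slightly larger, once the container overhead is counted), while repetitive bytes shrink dramatically.

```python
import random
import zlib


def ratio(data: bytes) -> float:
    """Compressed size divided by original size (below 1.0 means it shrank)."""
    return len(zlib.compress(data, 9)) / len(data)


# Seeded so the run is repeatable; these bytes have no structure for zlib to exploit.
rng = random.Random(0)
random_data = bytes(rng.getrandbits(8) for _ in range(100_000))

# Highly redundant input of similar size.
repetitive_data = b"hacker news " * 8_000

print(f"random:     {ratio(random_data):.4f}")
print(f"repetitive: {ratio(repetitive_data):.4f}")
```

The underlying reason is a counting argument: there are 2^n bit strings of length n but only 2^n - 1 strings that are shorter, so no lossless scheme can shrink every input.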

thadt 5 hours ago

Whereas if we're talking about lossy compression (as is the person to whom you replied) we certainly can compress arbitrary data - almost as much as we want.

The hard question, then, is how much the decompressed output looks like the original.
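
To make that trade-off concrete, here is a toy lossy scheme that works on any byte stream, random or not: keep only the top few bits of each byte. It is a sketch for illustration (all names are hypothetical, and nobody would ship this); the size is fixed by `keep_bits`, and what varies is how far the output drifts from the input.

```python
import random


def lossy_compress(data: bytes, keep_bits: int) -> bytes:
    """Keep only the top `keep_bits` bits of every byte and pack them tightly."""
    if not 1 <= keep_bits <= 8:
        raise ValueError("keep_bits must be between 1 and 8")
    acc = 0
    for b in data:
        acc = (acc << keep_bits) | (b >> (8 - keep_bits))
    total_bits = len(data) * keep_bits
    return acc.to_bytes((total_bits + 7) // 8, "big")


def lossy_decompress(blob: bytes, n: int, keep_bits: int) -> bytes:
    """Rebuild n bytes; the discarded low bits come back as zeros."""
    acc = int.from_bytes(blob, "big")
    mask = (1 << keep_bits) - 1
    out = bytearray(n)
    for i in range(n - 1, -1, -1):
        out[i] = (acc & mask) << (8 - keep_bits)
        acc >>= keep_bits
    return bytes(out)


def roundtrip(data: bytes, keep_bits: int) -> bytes:
    return lossy_decompress(lossy_compress(data, keep_bits), len(data), keep_bits)


def mean_abs_error(a: bytes, b: bytes) -> float:
    """Average per-byte distance between original and reconstruction."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Incompressible (pseudorandom) input, seeded for repeatability.
rng = random.Random(0)
data = bytes(rng.getrandbits(8) for _ in range(1024))

for bits in (8, 4, 2, 1):
    size = len(lossy_compress(data, bits))
    err = mean_abs_error(data, roundtrip(data, bits))
    print(f"keep_bits={bits}: {size} bytes, mean abs error {err:.1f}")
```

Even on random input the size drops to whatever you ask for; the cost shows up entirely in the error column.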