Decompression is equivalent to executing code for a specialized virtual machine. It should be possible to automate this process of finding "small" programs that generate "large" outputs. Could even be an interesting AI benchmark.
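To make that concrete, here is a minimal Python sketch of my own (the numbers are ballpark, not measured): the DEFLATE stream plays the role of the "program" and zlib's decoder plays the role of the virtual machine.

    import zlib

    # The compressed bytes are the "program"; zlib's decoder is the VM that
    # executes it. A highly repetitive payload yields a tiny program whose
    # execution emits a much larger output (roughly 1000:1 for runs of zeros).
    payload = b"\x00" * (100 * 1024 * 1024)      # 100 MiB of zeros
    program = zlib.compress(payload, 9)          # the small "program"
    output = zlib.decompress(program)            # "executing" it

    print(f"program size: {len(program):,} bytes")   # on the order of 100 KiB
    print(f"output size:  {len(output):,} bytes")    # 100 MiB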
Many of them already do this. [0]
It is a much easier problem to solve than you would expect. No need to drag in a data centre when heuristics can get you close enough (sketch below the footnote).
[0] https://sources.debian.org/patches/unzip/6.0-29/23-cve-2019-...
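To illustrate the heuristic point (a sketch of mine, unrelated to the patch linked in [0]): just ranking a handful of candidate payloads by expansion ratio immediately lands on the obvious answer, maximally repetitive data.

    import os
    import zlib

    # Crude heuristic "search": try a few candidate payloads and keep whichever
    # expands the most when decompressed. Repetition wins immediately; no data
    # centre required.
    candidates = {
        "zeros":   b"\x00" * 1_000_000,
        "pattern": b"abcd" * 250_000,
        "random":  os.urandom(1_000_000),
    }
    for name, data in candidates.items():
        ratio = len(data) / len(zlib.compress(data, 9))
        print(f"{name:8s} expansion ratio ~ {ratio:.0f}x")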
I meant it should be possible to take a specialized virtual machine that is equivalent to decompressing some compressed bitstream & figure out how to write programs for it that are small but generate large outputs. I did not mean that it should be possible to do static analysis & figure out whether a given small program will generate a large output, although that is also an interesting problem to solve & would also make an interesting AI benchmark.
My guess is this is a subset of the halting problem (does this program accept data with non-halting decompression), and is therefore beautifully undecidable. You are free to leave zip/tgz/whatever fork bombs as little mines for live-off-the-land advanced persistent threats in your filesystems.
It's not. Decompression always terminates, since the decoder only ever moves forward through the stream. But it might take a while.
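In practice that "might take a while" (and might emit far more than you can store) is the real hazard, so decompressors defend with an output cap rather than any halting analysis. A sketch of the usual pattern, using zlib's incremental API with an assumed 100 MiB limit:

    import zlib

    MAX_OUTPUT = 100 * 1024 * 1024  # assumed policy: refuse more than 100 MiB

    def bounded_decompress(blob: bytes, limit: int = MAX_OUTPUT) -> bytes:
        """Decompress a zlib stream but refuse to emit more than `limit` bytes."""
        d = zlib.decompressobj()
        # max_length stops the decoder after `limit` output bytes; any input it
        # did not get to is left in unconsumed_tail.
        out = d.decompress(blob, limit)
        if d.unconsumed_tail or not d.eof:
            raise ValueError("output limit exceeded (or truncated stream): possible bomb")
        return out

Real tools usually cap the expansion ratio rather than an absolute size, but the shape is the same: the decoder always terminates, and the limit bounds how long "a while" can be.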