https://www.ioccc.org/2024/stedolan/index.html
Reading last years entries, this image decompression oneliner outputs its own logo when passed the hash of its source code?! Pretty neat.
https://www.ioccc.org/2024/stedolan/index.html
Reading last years entries, this image decompression oneliner outputs its own logo when passed the hash of its source code?! Pretty neat.
I was wondering how the hash thing worked. Got my answer from the notes, which are otherwise hilarious.
"While terseness was preferred over obscurity, this program hopefully still lives down to IOCCC’s usual standards of clarity."
Also:
“Several variants of this program were considered. Several.”
(They must have tried billions of variants to find the match on md5sum. Luckily, you don’t have to compile or run the code to find that match)
Since 2009, it has not been that hard to create a collision (where you control both inputs and only care that they end up with the same hash as each other). https://archive.org/details/pocorgtfo14/page/n45/mode/1up After reading this article, scroll up to the top and see that the PDF has its own MD5 hash on the cover.
Or this gif that animates its own MD5sum: https://shells.aachen.ccc.de/~spq/md5.gif
I was mostly interested if the solution did that or calculated its own hash on the fly, for example by reading its own source. Thinking about it and considering how short the source is it must have been the former all along, but that was my motivation looking into the notes. I didn't regret it.
This isn’t (fully) a situation where you control both inputs. The MD5 hash code has to be a ‘program’ that produces a nice icon when processed by the program whose MD5 hash code has that value.
Indeed!
> By an astonishing coincidence, the number of bits in the input format is approximately the log2 of the number of MD5 evaluations that a five-year-old GPU can do in an hour.
I still don't understand how it works, the source code doesn't have any obvious sections to insert data for brute-forcing. I'm as confused as the judges:
> There are no magic numbers in the program, and bits of the input map to pixels of the output in a regular way, yet it outputs a nice icon for itself, if given the MD5 hash of its own source. How?
I'm assuming the fact that MD5 is completely broken plays into it somehow...
Thinking about it more, I guess you can just change the single-letter variable names and positioning to quite easily get up to a 40 bit brute-force.
> By an astonishing coincidence, the number of bits in the input format is approximately the log2 of the number of MD5 evaluations that a five-year-old GPU can do in an hour.
I read this to mean he calculated a matching hash on his GPU, in order to tune the magic constants of his app to produce the desired output.
> There are no magic numbers in the program
There are several:
Most of those numbers aren't possible to change, 80x80 is the output size, %10 is the input size.
Changing any of the other numbers would surely impact the function of the program, I don't think you could get to a 40 bit brute-force there.
I think the brute-force was done by making non-semantic changes like variable names and changing the order of the declarations.
> I think the brute-force was done by making non-semantic changes like variable names and changing the order of the declarations.
Yep. You have 17 bits of entropy just in the names of three single-letter variables.
Same guy wrote jq
In 135 bytes !
This is right there in the mad scientist territory.
Incredible, pure bananas.
Yeah I just read that one, insane!
The guy obviously enjoys obscure piping commands, the other one being jq