It's interesting to me that most of Bellard's work is basically turning specs into C.
His most important projects are ffmpeg (codec specs), QEMU (ISA specs), QuickJS (the ECMAScript spec), TinyCC (the C spec), and his telecom company (LTE specs). I guess the pi calculations and neural network stuff are exceptions.
Just to be clear, this doesn't make his work any less impressive. Highly performant codec and emulator implementations are no easy feat; it's just interesting that most of this work falls into that relatively narrow area.
It's worth noting that most communications specifications that involve an encoder/decoder pair communicating over a channel only specify the encoder. Standards purposely leave the decoder open, both to let systems improve as technology develops and to allow competition between implementations. This also makes a standard simpler, as a decoder is usually more complex than an encoder since it has to deal with noise and other effects introduced by the channel. Consequently, implementing a competitive standards-compliant decoder involves R&D and is not a case of following a predefined path.
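To make that concrete, here is a toy sketch in Python, not taken from any real standard. It assumes a 3x repetition code as the "specified" encoder and a made-up noisy channel; `encode`, `transmit`, `decode_hard` and `decode_soft` are all hypothetical names. Both decoders are compliant with the same encoder, but they do not always agree on noisy input.

```python
import random

def encode(bits, n=3):
    """The 'standardized' part: repeat every bit n times."""
    return [b for b in bits for _ in range(n)]

def transmit(coded, noise=0.8, seed=0):
    """Hypothetical channel: map bits to +1/-1 and add Gaussian noise."""
    rng = random.Random(seed)
    return [(1.0 if b else -1.0) + rng.gauss(0.0, noise) for b in coded]

def decode_hard(samples, n=3):
    """Decoder A: threshold each sample first, then take a majority vote."""
    out = []
    for i in range(0, len(samples), n):
        votes = sum(1 for s in samples[i:i + n] if s > 0)
        out.append(1 if votes * 2 > n else 0)
    return out

def decode_soft(samples, n=3):
    """Decoder B: sum the raw samples in each group, then threshold once."""
    return [1 if sum(samples[i:i + n]) > 0 else 0
            for i in range(0, len(samples), n)]

if __name__ == "__main__":
    rng = random.Random(1)
    bits = [rng.randint(0, 1) for _ in range(2000)]
    received = transmit(encode(bits))
    for name, dec in (("hard", decode_hard), ("soft", decode_soft)):
        wrong = sum(a != b for a, b in zip(bits, dec(received)))
        print(name, "bit errors:", wrong)
```

The spec only has to say "repeat each bit three times". Whether the receiver votes on thresholded bits or keeps the analog values around is an implementation choice, and keeping the soft information typically does better under Gaussian noise. Real decoders (Viterbi, turbo, LDPC) are far more involved, but the division of labour is the same.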
I've always seen Bellard as an engineer who programs rather than a pure programmer.
It is exactly the opposite for MPEG, which only specifies the decoder (i.e. how frames should be decoded).
Maybe they meant encoding, the file format.
But that only specifies the decoder.
The format for all modern video codecs is not the kind of format where any specific piece of uncompressed input should always be encoded the same way. It is more like a very restricted programming language that gives the encoder a lot of tools to compress the video; which tools an encoder uses, and how it uses them, is up to the encoder.
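A toy illustration of the "restricted programming language" idea, in Python. This is an invented two-instruction format (`lit` and `copy`, loosely LZ77-like), not any real codec, and every name in it is hypothetical. Only `decode` is "the spec"; the two encoders are free to emit different programs as long as the decoder reproduces the input.

```python
def decode(ops):
    """The 'spec': exactly how each instruction extends the output."""
    out = bytearray()
    for op in ops:
        if op[0] == "lit":          # ("lit", raw_bytes)
            out += op[1]
        else:                       # ("copy", distance, length)
            _, dist, length = op
            for _ in range(length):
                out.append(out[-dist])  # byte-wise, so overlapping copies work
    return bytes(out)

def encode_lazy(data):
    """Valid but wasteful: a single literal, no compression at all."""
    return [("lit", data)] if data else []

def encode_greedy(data, min_len=3):
    """Also valid: brute-force search of earlier data for the longest repeat."""
    ops, lit, i = [], bytearray(), 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for j in range(i):
            k = 0
            while i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len >= min_len:
            if lit:
                ops.append(("lit", bytes(lit)))
                lit = bytearray()
            ops.append(("copy", best_dist, best_len))
            i += best_len
        else:
            lit.append(data[i])
            i += 1
    if lit:
        ops.append(("lit", bytes(lit)))
    return ops

if __name__ == "__main__":
    data = b"abcabcabcabc"
    for enc in (encode_lazy, encode_greedy):
        ops = enc(data)
        print(enc.__name__, ops, decode(ops) == data)
```

For `b"abcabcabcabc"` the lazy encoder emits one 12-byte literal, while the greedy one emits a 3-byte literal followed by a single copy. Both streams are conformant and decode to the same bytes. A real video encoder makes the same kind of choice with far richer tools (prediction modes, motion vectors, quantization), which is where the engineering effort goes.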
> ffmpeg (codec specs)
if your mental model is that somebody writes codec specs and then fabrice bellard comes in and turns the specs into C, you are dead wrong. first of all, codecs are usually reverse-engineered, there is no spec. second of all, even when a well specified document describes the codec, that spec does not describe how to efficiently encode or decode with that codec. people like fabrice bellard develop the algorithms that do that.
Vocabulary please. A "codec" is software that CODes and DECodes multimedia content, while specs describe an encoded file or stream format (occasionally involving network protocols and other concerns).
In a normal standard development process experimental codecs come first, then those that have proved to work well, including having good enough performance, are described in the spec; after standardization there's very little room to "develop the algorithms" because nonconformant implementations would be useless.
Reverse engineering is limited to the abnormal case of having access to some codec but not to the standard that describes it.
> after standardization there's very little room to "develop the algorithms" because nonconformant implementations would be useless.
there is A LOT OF ROOM to develop the algorithms. it seems that you are confused about what an algorithm is, since you seemingly think that there can be only 1 algorithm that can decode a given media file.
There is a lot of room to do exactly the same thing more efficiently, which doesn't count as different algorithms.
The way to criticize that comment is to point out that the major codecs most commonly used with ffmpeg do not come from the ffmpeg project. The x264 and x265 encoders (for H.264 and H.265), libmp3lame, speex, libfdk_aac, etc. all come from other projects. What ffmpeg does is provide libraries for converting decoded data between formats and for calling into encoders, decoders, multiplexers, and bitstream formats.
It may also be worth pointing out, in terms of apportioning credit fairly, that ffmpeg has not been Bellard's project since 2004. The thing we see today is no more his project than GCC or Emacs are Stallman's projects.
There was a time when we would spend an enormous amount of time defining a spec, so that we can farm out the code. Now, we farm out the spec so that we can spend an enormous amount of time with the code.
That's actually how I was trained. The spec and the implementation (and the testing) were separate areas, sometimes done by different people.
These days, I tend to mix them all together, and I think I get good results.
I strongly suspect that a lot of folks, these days, only do the middle one.
> I strongly suspect that a lot of folks, these days, only do the middle one.
Ain't no one willing to pay for all of that. The clear separation is something you only see remaining in academia and in industries where code quality issues have legal consequences (e.g. aerospace, marine, automotive and medical), and even there, pressure is high to relax rules viewed as "arcane".
Writing good specifications, documentation, implementation code, and tests is each an art form in itself.
If you actually work with ffmpeg, it's quite impressive how pluggable the architecture is. The codecs have a huge number of quirks and disagreements about basics (what is a "frame" in the audio, subtitle, and video worlds?) and even about their environment (passing frames around between software and hardware coders is a very different thing).
The fact that you can (almost) freely mix and match processing between such different worlds is quite an achievement, and libav (IMO) is decently well designed to allow that.
Interesting observation. It's a similar way of working to Linus Torvalds: these guys implement existing ideas well, consistently, and in the open, but are not inventors.
Maybe pi is a spec. Just not written by man.
But I was told “spec implementers” were prime candidates for LLM replacement.
I don’t think the distinction is actually that interesting, as you could call any piece of software a spec.