A workable C compiler is a ~10-50KLOC program, and a fairly simple one at that (batch, with no concurrency or interaction). That Anthropic's swarm of agents wrote 100KLOC before failing is a symptom of the problem. It's certainly possible that many programs are in the sub 5KLOC range, but it's definitely not "most software". Plus, almost no software has this level of detailed spec, ready-made tests, and a selection of existing implementations of the same spec.
My first thought when reading Anthropic's description of the experiment was that it is unrealistically easy. It's hard to come up with realistic jobs in the 10-50KLOC range that would be this easy for an LLM. That it failed only shows how much further we still have to go.
A bit off topic, but see how Anthropic publicity stunts went from "Claude C Compiler" with 100K LOC to the recent Bun Rust rewrite with 1M LOC (10x!) in just 3 months.
I get that it's "novel" creation vs porting, but given that they reported that the C compiler cost them $20k in API costs, the Bun rewrite must be at least $200k, maybe even closer to a million. Pure madness.
Asking an LLM to change the programming language of an implementation is completely different from asking it to code from spec. It's orders of magnitude simpler in practice. I converted some 60kloc of Java to C++ and it works. There were some issues where the Java implementation used runtime reflection, because that needs creative workarounds, and not all of the C++ translations worked on the first try. And that was my first serious attempt at a task with an LLM. I could likely do better now. An important task simplification here is that a well designed codebase can be converted in small pieces and then joined back together. So the total amount of code converted becomes an irrelevant metric.
Yes, the task is very different, but also it will be months to a year until we know the results of the Bun experiment.
I don't know how it could fail - Bun loses popularity among devs? Is it an objective metric? From what I understand, Node.js remains dominant across the industry as a whole, with Deno and Bun mostly used by startups.
Anthropic can always fire the Opus/Mythos token machine gun on any problem (bugs, features, security) to ensure PR success, and there would be plenty of AI-sphere startups already drinking the kool-aid that would consider the whole vibe-coding thing to Bun's benefit.
> Anthropic can always fire the Opus/Mythos token machine gun on any problem (bugs, features, security) to ensure PR success,
Can they, though? They tried and failed to do it in their C compiler experiment. The experimenter wrote: "I tried (hard!) to fix several of the above limitations but wasn’t fully successful. New features and bugfixes frequently broke existing functionality."
It could fail due to maintenance burden. There is a lot of code now that no one wrote.
Are we assuming, all tests pass == software done?
Does Firefox not have tests? Then how were over 200 CVEs found?
Are we going to be comfortable running a piece of software that has 1M lines, with who knows how many zero-days in it?
Yes, sure, they are going to use LLMs to find the CVEs, and so will the hackers. You need a day or two to fix a security issue; a hacker just needs to put it to use.
And good luck debugging a million line code base.
1M LOC == already failed.
The compiler that Claude made went way beyond workable. It could compile the full Linux kernel afaik. That goes well beyond even standard C.
People who independently tried to use it reported that it is very much not workable:
- "CCC compiled every single C source file in the Linux 6.9 kernel without a single compiler error (0 errors, 96 warnings). This is genuinely impressive for a compiler built entirely by an AI. However, the build failed at the linker stage with ~40,784 undefined reference errors." (https://github.com/harshavmb/compare-claude-compiler)
- Overall it’s an interesting experiment, and shows the current bleeding edge of Claude’s Opus 4.6 model. However the resulting product is also a clear example of the throwaway nature of projects generated almost entirely by AI code agents with little human oversight. The prototype is really impressive, but there is no real path forward for it to be further developed. It can build the Linux kernel [for RISC-V], which is impressive. It can also build other things… if you are lucky, but you really cannot rely on it to work. (https://voxelmanip.se/2026/02/06/trying-out-claudes-c-compil...)
Anthropic themselves said that the codebase was effectively bricked and that their agents could not salvage it.
Well then as you say a 10-50KLOC C compiler is workable. Could you show me the C compiler that does manage to compile a modern Linux kernel that is of that size?
TCC did several years ago. It could boot Linux from source in under 10 seconds. It wasn't that big of a C compiler. It's in the 50,000 lines of code range.
This was 20 years ago from what I can find. Besides, Linux is now a vastly different codebase than it was 20 years ago. That effort also did not compile Linux unmodified; it required several changes: https://bellard.org/tcc/tccboot_readme.html
Not really.
I can make a C compiler in a couple weeks just by looking up open source libraries and copying them.
I can't make any software that people will pay me money to use without taking months/years of development, research, experimentation and iteration.
Just because the original people who invented compilers had to be geniuses doesn't mean anyone has to spend much time or thought in copying that work now.
I built a compiler for a simpler language as part of my compilers course in a CS degree. It was a non-trivial exercise well beyond the majority of software applications. What open source libraries did you have in mind and what are you copying?
If you can truly write a C compiler in weeks then kudos to you. How many compilers have you written so far for how many languages?
I work for big tech and I would say a large % of developers are incapable of producing a working C compiler on any reasonable time scale, certainly not weeks, even with looking at open source. I'm sure they can download one and run it. Most developers today don't even know C or assembler. They don't know how to approach the C language spec. The top 5-10% of developers/engineers can do it but even for them it's non-trivial.
> It was a non-trivial exercise well beyond the majority of software applications
Maybe if you include every application ever written, including every variation of "hello world", but if you are claiming that most serious production quality software could be written by a CS student who is simultaneously working on other classes, I'm gonna have to disagree with you.
I'd copy and paste from all the thousands of open source ones, what do you mean?
There are plenty of open source compilers from which I can copy and paste whatever I need. I don't get why you think this would have any level of difficulty?
Of course I couldn't make a brand new compiler that was better than what's out there...
Just like a game engine, I could clone one of the thousands of engines out there pretty easily - making something better or novel would be difficult. Just making a bare bones clone of what already exists by referencing documentation and pre-existing code is relatively easy now.
Yeah, when I made a mediocre 3d game engine 20 years ago, it was brain breaking difficult work. I can make one infinitely better in a micro fraction of the time now because most of the hard stuff is done and can just be looked up now.
Do you not agree?
If you copy and paste an entire compiler you didn't make anything. If you copy pieces from different compilers they won't work together. So I'm not sure how you "make" a compiler by copying and pasting from open source compilers. Are you saying you'll take one file from clang, one from gcc, and another from yet another compiler?
Sure. You can clone gcc and build it. You can clone a game engine and use it.
> It was a non-trivial exercise well beyond the majority of software applications
That depends on how you count. By number of programs that may well be right, but that's not what matters in terms of impact on the industry, as software value roughly corresponds to the number of people working on a particular piece of software (or lines of code, if you wish). By number of people/LOC most software is not in the "simpler than a C compiler" category.
I do think being able to write a compiler is a milestone indicator of your computer science knowledge. Most developers probably don't understand pointers either, because "most developers" are people who did a React bootcamp.