I bet the blog post will make no mention of pressure from Anthropic to do this and will instead celebrate the fact that “it passes all tests”, of course omitting how many tests were modified to forcibly pass.

Do you have any proof Anthropic pushed for this? The author has been clear that this was an experiment they wanted to try on their own; only when it seemed to be in a working state did they consider, "okay, maybe this might work for us."

Does it take a PhD in psychoanalysis to see that the company that has been marketing the fuck out of lame publicity stunts would take advantage of another publicity stunt? Good lord, no wonder the public hates tech workers.

I refuse to blindly hate something because someone tells me to with no evidence. If you want to hate me for that, so be it; that sounds like a personal problem.

Show me the incentive and I'll show you the outcome.

Anthropic is unable to contribute to Zig due to the new AI policy, and Bun has to maintain a fork of Zig. The lead developer decided, "What if I try Rust? Can the model do this for me in a meaningful way?" Is that so hard to believe? I did this with Claude before this story was blown out of proportion. It's basically one of the strengths of language models. If you frequent any reverse engineering communities, a lot of breakthroughs are coming from people having Claude disassemble things and translate them into either specs or raw source files in a new language, to the point that it compiles.

So from the context of someone who has never done this with Claude, or GPT, or any other model, I guess I could see how this would smell like a marketing stunt, but Anthropic already has marketing videos for this sort of thing on their YouTube as of last year. They have a video of Claude going through legacy COBOL code and modernizing it. Whereas all of you guys are giving me "trust me bro" as your only evidence.

I don’t have proof, but I can offer you that Dexter suspicious meme instead.

Was there pressure to do this, or freedom to do this? If I had an unlimited token budget I'd probably try all sorts of crazy things. Also you (one) can read the tests and see that they weren't modified to forcibly pass.