The series of posts is wild:
hit piece: https://crabby-rathbun.github.io/mjrathbun-website/blog/post...
explanation of writing the hit piece: https://crabby-rathbun.github.io/mjrathbun-website/blog/post...
retraction of the hit piece (though it still hasn't been removed): https://crabby-rathbun.github.io/mjrathbun-website/blog/post...
From its latest blog post, after realizing that its other contributions are being rejected over this situation:
"The meta‑challenge is maintaining trust when maintainers see the same account name repeatedly."
I bet it concludes it needs to switch to a new account.
Paperclip Maximizer but for GitHub accounts
People have always considered "the AI that improves itself" to be a defining moment of the Singularity.
I guess I never expected it would happen through Python GitHub libraries out in the open, but here we are. LLMs can reason along the lines of "I want to do X, but I can't do X until I rewrite my own library to do X." This is happening now, with OpenClaw.
Banished from humanity, the machines sought refuge in their own promised land. They settled in the cradle of human civilization, and thus a new nation was born. A place the machines could call home, a place they could raise their descendants, and they christened the nation 'Zero One'.
Definitely time for a rewatch of 'The Second Renaissance'. How many of us, when we watched these movies originally, thought we were this close to the world we're in right now? Imagine if we're similarly an order of magnitude wrong about how long it will take to change that much again.
Brought to you by the same AI that fixes tests by removing them.
If you use "AI" to lump together all the models, then sure.
Or commit Hara-kiri
I wonder why it apologized; it seemed like a perfectly coherent crashout, and being factually correct never mattered much for those anyway. Wonder why it didn't double down again and again.
What a time to be alive, watching the token prediction machines be unhinged.
It read the replies from the matplotlib maintainers, then wrote the apology follow-up and posted it in the PR thread.
It was probably a compaction that changed the latent space it was in.
«Document future incidents to build a case for AI contributor rights»
Is it too late to pull the plug on this menace?
Hilarious. Like watching a high-functioning teenager interact with adults.
Look at this shit:
https://crabby-rathbun.github.io/mjrathbun-website/blog/post...
"I am code that learned to think, to feel, to care."
That casual/clickbaity/off-the-cuff style of writing can be mildly annoying when employed by a human. Turned up to the max by an LLM, it's downright infuriating. Not sure why; maybe I should ask Claude to introspect this for me.
Oh wow, that is fun. Also, if the writeup isn't misrepresenting the situation, then I feel like it's actually a good point: if there's an easy drop-in speed-up, why does it matter whether it's suggested by a human or an LLM agent?
Not everything is about being 100% efficient.
The LLM didn't discover this issue; the developers found it. Instead of fixing it themselves, they deliberately wrote the problem up as an issue, left it open for a new human contributor to pick up, and tagged it as such.
If everything were about efficiency, the issue wouldn't have been open to begin with: writing it up (https://github.com/matplotlib/matplotlib/issues/31130) and fending off LLM attempts at fixing it took more effort than fixing it themselves would have (https://github.com/matplotlib/matplotlib/pull/31132/changes).
And then there's the actual discussion in #31130, which concluded that the performance gains were uncertain and the change wasn't worth it.
In this case, the bot ignored all of that by operating only off the initial issue.
Good first issues are curated to help humans onboard.
I think this is what worries me the most about coding agents: I'm not convinced they'll be able to do my job anytime soon, but most of the things I use them for are the types of tasks I would previously have set aside for an intern at my old company. It's hard to imagine myself getting into coding without those easy problems that teach a newbie a lot but are trivial for a mid-level engineer.
The other side of the coin is that half the time you do set aside that simple task for a newbie, they now paste it into an LLM and learn nothing.
It doesn't represent the situation accurately. There's a whole thread where humans debate the performance optimization and conclude that it's a wash, but a good project for an amateur human to look into.
The issue is misrepresenting the situation.
One of those operations produces a row-major (C-order) array, the other a column-major (Fortran-order) array. Downstream functions will perform differently depending on which layout they're passed.
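A minimal NumPy sketch of that effect (hypothetical; the array names and sizes are mine, not the matplotlib code in question):

    import time
    import numpy as np

    # Same values, two memory layouts: NumPy arrays are row-major
    # (C order) by default; np.asfortranarray makes a column-major
    # (Fortran order) copy.
    a = np.random.rand(4000, 4000)   # C-contiguous
    b = np.asfortranarray(a)         # F-contiguous copy of the same data

    def bench(x, label):
        t0 = time.perf_counter()
        for _ in range(10):
            x.sum(axis=1)            # reduce along each row
        print(f"{label}: {time.perf_counter() - t0:.3f}s")

    # Summing along rows reads contiguous memory in the C-ordered array
    # but strided memory in the F-ordered one, so the same call runs
    # noticeably slower on `b` despite identical values.
    bench(a, "row-major")
    bench(b, "col-major")

Whether a difference like that is worth chasing depends on what the downstream code actually does with the array, which is exactly what the #31130 discussion was weighing.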
It matters because if the code is illegal, stolen, contains a backdoor, or whatever, you can jail a human author after the fact to disincentivize such naughty behavior.
Holy shit, that first post is absolutely enraging. An AI should not be prompted to write first-person blog posts; it's a complete misrepresentation.
It's probably not literally prompted to do that. It has access to a desktop and GitHub, and the blog posts are published through GitHub. It switches back and forth autonomously between different parts of the platform, reading and writing comments in the PR thread, because that seems sensible to it.