Hacker News

I strongly agree with this take, and that’s partly why the article posted here leaves me scratching my head. PRs are already the gate, right? I don’t care what an agent does or doesn’t do within the confines of its workspace, as long as its contributions are gated through a git repository and it doesn’t need exotic access to a production environment to do its development.

I’m also with you on the junior / mid-level engineer framing (a “brilliant” junior engineer perhaps, one who graduated at the top of their class from the best CS program in the country), with a big caveat: AI is like a junior engineer who doesn’t know how to learn.

It’s like you’re working with the guy from Memento. Every day your LLM reports to work having learned nothing from your work so far. Every day is the first day!

Now, like the Memento guy, you can help them scatter sticky notes and reminders all over their workspace. With some effort you can start to approximate that thing called “learning”, which is LITERALLY the most important trait of every single software developer on a team.

But I confess it’s a struggle for me and the available tooling isn’t there yet. The best I’ve done looks closer to the “second brain” people use tools like Obsidian for. Sadly I don’t think a second brain is a substitute for a first brain. And to be 100% honest any engineer who exhibited the same inability to learn and grow as an AI agent would be sacked after their first month on the job at any company I’ve ever worked at.

I’m actually reasonably optimistic that either the main AI providers or someone else will improve on this in the coming years. A decent memory paired with a well-architected thinking system, one that’s better at contextually injecting memories and at capturing real learnings without supervision, doesn’t feel like an impossible task requiring novel technical structures. (I find LLMs today don’t know what they don’t know unless you force them to put metaphorical sticky notes all over the place.)

Anyhow I’d love to be wrong about some of the above and I’m always reading articles like this one hoping that someone has solved these problems already and that I’m just slow on the uptake. But as of today, I’m only modestly better at architecting such agents than I was when I started.