> most "AI Agents" that make it to production aren't actually that agentic. The best ones are mostly just well-engineered software with LLMs sprinkled in at key points

I've been saying that forever, and I think that anyone who actually implements AI in an enterprise context has come to the same conclusion. Using the Anthropic vernacular, AI "workflows" are the solution 90% of the time and AI "agents" maybe 10%. But everyone wants the shiny new object on their CV and the LLM vendors want to bias the market in that direction because running LLMs in a loop drives token consumption through the roof.
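To make the workflow/agent distinction concrete, here's a minimal sketch in the spirit of Anthropic's definitions. Everything here is illustrative (`call_llm` is a stand-in, not any real API): the workflow keeps control flow in deterministic code with the LLM filling in scoped steps, while the agent lets the LLM decide what to do next in a loop.

```python
def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call an actual LLM API.
    return f"response to: {prompt}"

# Workflow: your code owns the control flow; the LLM handles
# well-scoped steps. Easy to test, easy to bound.
def workflow(ticket: str) -> str:
    summary = call_llm(f"Summarize: {ticket}")
    category = call_llm(f"Classify as billing/tech/other: {summary}")
    return call_llm(f"Draft a {category} reply for: {summary}")

# Agent: the LLM owns the control flow, looping until it declares
# itself done. Much harder to bound, debug, or guarantee.
def agent(goal: str, max_steps: int = 10) -> str:
    state = goal
    for _ in range(max_steps):
        action = call_llm(f"Given state '{state}', pick next action or say DONE")
        if "DONE" in action:
            break
        state = call_llm(f"Apply '{action}' to '{state}'")
    return state
```

The point of the sketch is the structural difference, not the stubbed logic: in the workflow every LLM call sits behind a fixed step you can unit-test; in the agent the loop itself is model-driven.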

Everyone wants to go the agent route until the agent messes up once after working 99 times in a row. "Why did it make a silly mistake?" We don't know. "Well, let's put a few more guard rails around it." Sounds good... back to "workflows."

"But what about having another agent that quality controls your first agent?"

You should watch the CDO-squared scene from The Big Short again.

THIS so much. People are like "why human supervision when we can have agent supervision?" and I always respond

> look, if you don't trust the LLM to make the thing right in the first place, how are you gonna trust PROBABLY THE SAME LLM to fix it?

yes, I know multiple passes improve performance, but they don't guarantee anything. for a lot of tools you might wanna call, 90% or even 99% accuracy isn't enough
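A quick back-of-the-envelope shows why per-step accuracy that sounds high still isn't enough once calls are chained. Assuming (hypothetically) independent steps with a fixed success rate:

```python
# Probability that every step in a chain succeeds, assuming a fixed,
# independent per-step success rate (a simplifying assumption).
def chain_success(per_step: float, steps: int) -> float:
    return per_step ** steps

# At 99% per step, a 20-step run succeeds only ~82% of the time;
# at 90% per step, it drops to ~12%.
```

Real agent runs aren't independent trials, of course, but the compounding direction is the same: long loops amplify even small per-step error rates.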

Yup

I think it started when AI tools for things like cancer detection, based purely on deep learning, began to outperform tools where humans guide the models on what to look for. The expectation became that eventually this would happen for LLM agents too, if only we could add more horsepower. But it seems like we've hit a bit of a ceiling there. The latest releases from OpenAI and Meta were largely duds despite their size, still very far from anything you'd trust for anything important, and there's little left to add to their training corpora that isn't already in there.

Of course a new breakthrough could happen any day and get through that ceiling. Or "common sense" may be something that's out of reach for a machine without life experience. Until that shakes out, I'd be reluctant to make any big bets on any AI-for-everything solutions.

> Or "common sense" may be something that's out of reach for a machine without life experience

Maybe Doug Lenat's idea of a common sense knowledge base wasn't such a bad one.

I keep trying to tell my PM this

[deleted]

I screenshotted that comment to send to my PM.