I use agentic tools daily, and SOTA models have certainly improved a lot in the last year. But still in a linear, "they don't light my repo on fire as often when they get a confusing compiler error" kind of way, not an "I would now trust Opus 4.6 to respond to every work email and hands-off manage my banking and investment portfolio" kind of way.
They're still afflicted by the same fundamental problems that hold LLMs back from being a truly autonomous "drop-in human replacement" that would enable an entire new world of use cases.
And finally live up to the hype/dreams many of us couldn't help but feel were right around the corner circa 2022/3 when things really started taking off.