So, in a nutshell, Claude becoming smarter means that logic which once resided in the agent is being moved into the model?
If that's the case, it's important to assess whether it'll be consistent when operating at that higher level, less dependent on the software layer that governs the agent. Otherwise there's a risk of Claude also becoming more erratic.
I'm going to be paranoid and guess they're trying to segment users into those who'll notice the system being dumbed down (via caching, or a quantized, downgraded model) and those who expect the fully capable tools.
Thariq (who's on the Claude Code team) swears up and down that they do not do this.
Honestly, man, this is just weird new tech. We're asking a probabilistic model to generate English and JSON and Bash all at once in an inherently mutable environment, and then Anthropic also releases one or two updates most workdays containing tweaks to the system prompt and new feature flags that get flipped every which way. I don't think you have to believe in a conspiracy theory to understand why it's a little wobbly sometimes.