> when I step outside my area of deep knowledge I can no longer call BS on the agents
It's still funny that 4 years into this mania the models can hallucinate basic ground truths, while humans increasingly skip reviewing the output and misuse LLMs where simple automation would suffice.
My wife does project management and works with a lot of tech leads. They came to her with a project plan deck, and she started questioning some weird dates.
The LLM was able to pull artifacts out of their issue tracker, but it just... hallucinated some of the dates in the process of creating a project plan deck out of the underlying data. These guys didn't bother to review it and didn't notice, and who knows what else it hallucinated content-wise. They were happy to send this project plan multiple levels up the food chain with hallucinated, unreviewed dates.
5 years ago they would have just written a script and had none of this mess.
That’s why I use AI more like: Write a tool for me that does this.
Instead of directly: do this.
Ideally I would interleave code and AI queries, with some functions waiting on a prompt result, I think? That keeps each context small and avoids the hallucinations that come with overly large contexts.
I mean that would work for my use cases.
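Roughly what I have in mind, as a minimal sketch and not anyone's actual setup: the code copies the dates straight from the tracker data, and the model only gets one small prompt per issue for the summary text. `Issue`, `build_plan_rows` and `fake_summarize` are made-up names for illustration; `fake_summarize` is a stand-in for whatever LLM client you would really call.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable


@dataclass(frozen=True)
class Issue:
    # Hypothetical shape of a record exported from an issue tracker.
    key: str
    title: str
    due: date  # copied verbatim from the tracker, never generated


def build_plan_rows(
    issues: Iterable[Issue],
    summarize: Callable[[str], str],
) -> list[dict[str, str]]:
    """Build project-plan rows; only the summary text comes from the model."""
    rows = []
    for issue in sorted(issues, key=lambda i: i.due):
        # One small prompt per issue keeps the context short.
        blurb = summarize(f"Summarize in one line: {issue.title}")
        rows.append({
            "key": issue.key,
            "due": issue.due.isoformat(),  # deterministic, from source data
            "summary": blurb,
        })
    return rows


def fake_summarize(prompt: str) -> str:
    # Stand-in for a real LLM call (assumption); swap in your own client.
    return prompt.split(": ", 1)[1][:40]


if __name__ == "__main__":
    issues = [
        Issue("PROJ-2", "Migrate billing database", date(2025, 3, 14)),
        Issue("PROJ-1", "Design review for new API", date(2025, 2, 1)),
    ]
    for row in build_plan_rows(issues, fake_summarize):
        print(row)
```

The model can still write a bad summary, but it never touches the dates, so that whole class of error goes away.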
What I've learned, at least, is that the less the AI itself does in the context, the better, since the probability of a critical LLM mistake approaches 100% over time.
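Back-of-the-envelope version of that, assuming each step the model takes has the same small, independent chance of a critical mistake (real errors are probably not independent, so treat this as an illustration only). The function name and the 1% figure are made up for the example.

```python
def p_at_least_one_error(p_step: float, n_steps: int) -> float:
    """Chance of at least one error across n independent steps."""
    # Probability that every step is clean is (1 - p)^n; take the complement.
    return 1.0 - (1.0 - p_step) ** n_steps


if __name__ == "__main__":
    # Hypothetical 1% per-step error rate, growing number of steps.
    for n in (10, 100, 500):
        print(n, round(p_at_least_one_error(0.01, n), 3))
```

Under those assumptions, 100 steps at 1% each already gives you roughly a 63% chance of at least one mistake, and it only climbs from there.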
The crazy thing is how many people who can write code (with or without AI) are in fact using the LLMs in the latter "go do this" mode.
There are a lot of non-tech people using these products in this manner.
Along these lines, my friend is CTO at a non-tech firm, and there's vibe coding happening in one department on a project that is going to churn through $1M of tokens. The head of that department told him it's OK because instead of paying a SWE an annual salary, they'll just pay $1M for tokens once and be done forever.
People don't know what they don't know about software, SDLC, support, maintenance, etc. If code was something you write once and never think about again, most tech orgs could be 75% smaller.