> Anyone overseeing work from multiple people has to?
That's not a black box though. Someone is still reading the code.
> At some point you have to let go and trust people‘s judgement
Where's the people in this case?
> People who do that (I‘m not one of them btw) must rely on higher level reports.
Does such a thing exist here? Just "done".
> Someone is still reading the code.
But you are not. That’s the point?
> Where's the people in this case?
Juniors build worse code than codex. Their superiors also can‘t check everything they do. They need to have some level of trust for doing dumb shit, or they can’t hire juniors.
> Does such a thing exist here? Just "done".
Not sure what you mean. You can definitely ask the agent what it built, why it built it, and what could be improved. You will get only part of the info vs when you read the output, but it won’t be zero info.
You: "Why did you build this?"
LLM: "Because the embeddings in your prompt are close to some embeddings in my training data. Here's some seemingly explanatory text with that is just similar embeddings to other 'why?' questions."
You: "What could be improved?"
LLM: "Here's some different stuff based on other training data with embeddings close to the original embeddings, but different.
---
It's near zero useful information. Example imformation might be "it builds" (baseline necessity, so useless info), "it passes some tests" (fairly baseline, more useful, but actually useless if you don't know what the tests are doing), or "it's different" (duh).
If I asked you, about a piece of code that you didn’t build, „What would you improve?“, how would that be fundamentally different?
> You can definitely ask the agent what it built, why it built it, and what could be improved.
If that was true we’d have what they call AGI.
So no, it doesn’t actually give you those since it can’t reason and logic in such a way.
What does that have to do with AGI?
What you asked for is AGI. How else does it think, reason and logic to answer your "why"?
It doesn't do that currently even if you think it does.