Using models to go from spec to program is one use case, but it’s not the whole story. I’m not hand-writing specs; I use LLMs to iteratively develop the spec, the validation harness, and then the implementation. I’m hands-on with the agents themselves, and hands-off within the workflow style we call Attractor.

In practice, we try to close the loop with agents: plan -> generate -> run tests/validators -> fix -> repeat. What I mainly contribute is taste and deciding what to do next: what to build, what "done" means, and how to decompose the work so models can execute it. With a strong definition of done and a good harness, the system can often converge with minimal human input. For debugging, we also have a system that ingests app logs plus agent traces (via CXDB).
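To make that loop concrete, here is a minimal sketch in Python. Everything in it is hypothetical: plan_task, generate_patch, run_validators, and fix_failures are stand-in placeholders for agent and harness calls, not the API of Attractor or any real framework.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    passed: bool
    failures: list[str] = field(default_factory=list)


def plan_task(task: str) -> str:
    # Placeholder: an LLM drafts a plan from the spec.
    return f"plan for: {task}"


def generate_patch(plan: str) -> str:
    # Placeholder: an LLM writes the initial implementation.
    return f"patch implementing: {plan}"


def run_validators(patch: str) -> Report:
    # Placeholder: apply the patch, then run the tests and validators
    # that define "done".
    return Report(passed=True)


def fix_failures(plan: str, report: Report) -> str:
    # Placeholder: an LLM repairs the code based on the failure report.
    return f"patch addressing {report.failures}"


def close_the_loop(task: str, max_attempts: int = 5) -> bool:
    """Plan -> generate -> validate -> fix, until "done" or the budget runs out."""
    plan = plan_task(task)
    patch = generate_patch(plan)
    for _ in range(max_attempts):
        report = run_validators(patch)
        if report.passed:
            return True  # converged with minimal human input
        patch = fix_failures(plan, report)
    return False  # out of budget; escalate to a human


if __name__ == "__main__":
    print(close_the_loop("implement the parser described in the spec"))
```

The point of the sketch is just the shape of the loop: the validators, not a human, decide when the work is done, and the human effort goes into the definition of done and the decomposition fed in at the top.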

The more reps you get, the better your intuition for where models work and where you need tighter specs. You also have to keep updating your priors with each new model release or harness change.

This might not have been a clear answer, but I am happy to keep clarifying as needed!

But what is the result of your work? What do you commit to the repo? What do you show to new folks when they join your team?