Sharing my current MO:

I start with a high-level design md doc which an AI helps write. Then I ask another AI - either the same model without the context, or a different model - to critique it and spot bugs, gaps and omissions. It always finds obvious-in-hindsight stuff. So I ask it to summarize its findings, and I paste that into the first AI and ask its opinion. We form an agreed change, make it, and carry on this adversarial round-robin until no model can suggest anything that seems weighty.
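The round-robin above can be sketched as a simple loop. This is only an illustration of the process, not real tooling: `revise` and the critic callables are hypothetical stand-ins for whatever model APIs you happen to use.

```python
from typing import Callable

# A model call, reduced to its essence: prompt in, text out.
Ask = Callable[[str], str]

def round_robin_review(doc: str, revise: Ask, critics: list[Ask],
                       max_rounds: int = 5) -> str:
    """Circulate a design doc among critic models until none reports
    a weighty finding, feeding critiques back to the authoring model."""
    for _ in range(max_rounds):
        # Each critic sees only the doc, not the author's chat context.
        findings = [f for f in (c(doc) for c in critics) if f.strip()]
        if not findings:
            break  # no model can suggest anything that seems weighty
        # Hand the combined findings back to the authoring model to revise.
        doc = revise(doc + "\n\nCritiques to address:\n" + "\n".join(findings))
    return doc
```

The `max_rounds` cap matters in practice: without it, two models that keep nitpicking each other's output never converge.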

I then ask the AI to make a plan, and I round-robin that through a bunch of AIs adversarially as well. In the end, the plan looks solid.

Then the end-to-end test-case plan, and so on.

By the end of the first day or week or month - depending on the scale of the system - we are ready to code.

And as code gets made I paste that into other AIs with the spec and plan and ask them to spot bugs, omissions and gaps too and so on. Continually using other AI to check on the main one implementing.

And of course you have to go read the code, because I have found that the AI misses polish.

The discourse around AI is that we’ve unlocked a whole new unsupervised paradigm of development; but you’re basically describing how Google has built code for a decade, just with humans of different levels of trust instead of AI.

And I’m not saying that to poke fun at you (my workflow is essentially identical to yours), or at Google, but rather to say that there’s nothing new :)

AI is a fantastic accelerator of effective and ineffective workflows alike. It's showing us which is which on way shorter timescales / in realtime!

That is actually reassuring. I used to try to work this way with people, but the culture where I work didn't align, and I found it easier to work this way alone by putting myself into critique mode and so on. It's much better now to get AIs to do it. And I find the more I polish the plan, the less expensive the model needed to implement it.

This sort of "spec-driven development" was the USP behind AWS Kiro: https://kiro.dev/docs/specs/

> And of course you have to go read the code, because I have found that the AI misses polish

Since you mentioned using other agents, do you get mileage out of code reviews with another agent polishing the unpolished bits? My colleagues swear by it, though I personally remain skeptical about its value without a human reviewer.

> Then I ask another AI

Maybe thesis-antithesis-synthesis works better in applied computer science... https://en.wikipedia.org/wiki/Dialectic#Criticisms

How much faster/slower are you with that process compared to writing code yourself?

Developer of 20+ years here; can't give you an accurate multiplier, but I am faster.

Because spotting holes in specs has never been one of my strengths. And working without technical colleagues much of the time, it's a boon to be able to "rubber-duck" my ideas with something that is at least more intelligent than plastic.

Grabbing multipliers from thin air, the coding bit may only be 2x faster with a poorer-quality outcome, but working out what's needed is a good 5x faster.

And yes, I'm using the same adversarial AI MO as @wood_spirit, combined with Matt Pocock's excellent /grill-me and /grill-with-docs skills [1] and Plannotator [2] to review the plans.

1. https://github.com/mattpocock/skills

2. https://github.com/backnotprop/plannotator

I actually use LLMs a lot to rubber duck my problems and help develop plans. Then I manually code, to ensure my skills don't deteriorate. I feel like I'm a lot faster, with few of the downsides. Do you have any thoughts on this process?

If you can type code fast and accurately, it sounds a great process to use. You're using LLMs for the bit where they bring great value, and yourself as a higher quality coding agent :)

Have you considered incorporating formal modelling?

Like:

[0] https://csci1710.github.io/2026/ and https://forge-fm.github.io/book/2026/

[1] https://elliotswart.github.io/pragmaticformalmodeling/

[2] https://quint.sh/

Only at the "hmm that seems an interesting idea" level.

Thanks for the links, going to have a read and see if I can apply any to my work.

Thanks for sharing those. They look interesting.

Can't speak for GP or OP, but I see about 10x the output and 2-4x the value of what I would be able to get by hand. The gap between the 2-4x and the 10x is largely design documents, user/dev documentation and testing that I might not have produced to nearly the extent that I do when using AI.

I haven't been using multiple AIs adversarially like OP, but I might consider giving it a try with Codex and Opus. That said, my AI workflow has been pretty similar... lots of iterations on just design, then iterations on documentation, testing, etc., then iterations on implementation, testing, and validation, with human review in the mix.

My analogy is that it's really close to working with a foreign dev team, but your turnaround is in minutes instead of days, so it's much more interactive.

I'm seeing the same, with the gains coming largely from documentation.

I feel strange making "dev" documentation though, since it seems a bit redundant/superfluous. I fully suspect nobody is going to read it at this point.

Fair... but the AI will/may read it, as you use agents for dealing with issues/bugs, etc.

For me, sometimes faster/sometimes slower, but there are a lot of other benefits besides speed:

* I can work in code I'm not familiar with much more easily.

* LLMs often identify confusion or uncertainty upfront, so I can address it earlier.

* I'm much less mentally taxed so I can go for longer at my top end.

* Meetings, disruptions, and end-of-day stops are WAY less costly since I can lean on the LLM to get back into things.

* I can do something else productive while the LLM is running. Bug fixes, documentation, PR reviews, etc.

Having tried something similar, I found that the perceived speedup does not last in the steady state.

To get a quality, lasting result, you ultimately have to study everything carefully; otherwise you quickly accumulate cognitive debt, and the speedup soon shrinks as you constantly revisit the initial approaches.
