My process: start ideating and get the AI to poke holes in your reasoning, your vision, scalability, etc. Do this for a few days while taking breaks. This is all contained in one Markdown file with Mermaid diagrams and sections.

Then use the ideation to architect. Dive into details and tell the AI exactly what your choices are: how certain methods should be called, how logging and observability should be set up, what language to use, type checking, coding style (configure ruthless linting and formatting before you write a single line of code), what testing methodology and framework (unit, integration, e2e), database choices, how you will handle migrations. Constrain as much as possible, so the AI is confined to doing things the way you would.
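As one concrete illustration of "configure ruthless linting and formatting before you write a single line of code" (the specific tools and settings here are my own hypothetical picks for a Python project, not from the comment above), a `pyproject.toml` might pin everything down like this:

```toml
# Hypothetical example: lock down style, types, and tests up front
# so the AI has as little room to improvise as possible.

[tool.ruff]
line-length = 100
target-version = "py312"

[tool.ruff.lint]
select = ["E", "F", "I", "B"]   # errors, pyflakes, import order, bugbear

[tool.mypy]
strict = true                   # no implicit Optional, no untyped defs

[tool.pytest.ini_options]
addopts = "--strict-markers"
testpaths = ["tests"]
```

The point is less about these particular tools and more that the rules exist, in the repo, before the first generated line.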

Then create a plan file and have the AI manage it like a task list, implementing in parts. Before starting, it needs to present you a plan. In it you will notice it makes mistakes, misunderstands things that you maybe didn't clarify before, or simply forgets. You add to AGENTS.md or whatever, make changes to the AI's plan, tell it to update plan.md, and when satisfied, proceed.
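A plan file like that doesn't need to be fancy; one hypothetical shape (the task names are made up for illustration) is a checklist the AI must keep current:

```markdown
# Plan: <feature name>

## Context
Link to the ideation / architecture Markdown doc.

## Tasks
- [x] 1. Add users table migration (schema only, no seed data)
- [ ] 2. Implement repository layer (required, non-nullable fields)
- [ ] 3. Unit + integration tests per the agreed framework

## Open questions / corrections from review
- ...
```

Having the AI tick items off and log corrections here is what makes the "review the plan, fix it, then proceed" loop cheap.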

Once it's done, review the code. You will notice there is always something to fix: hardcoded variables, a SQL migration with seed data that should not actually be a migration, just generally crazy stuff.

The worst is that the AI is always very loose on requirements. You will notice all its fields are nullable and records have little to no validation; you report an error when testing and it tries to solve it with a brittle async workaround, like LISTEN/NOTIFY or a callback, instead of the architecturally correct solution. These are things that at scale are hell to debug, especially if you did not write the code.
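To make "loose on requirements" concrete, here is a hypothetical sketch (the class and field names are invented for illustration) of the everything-nullable records AIs tend to generate, versus what the requirements usually actually call for:

```python
from dataclasses import dataclass
from typing import Optional

# What the AI tends to generate: every field nullable, nothing validated.
# Bad data slips in silently and blows up far from where it entered.
@dataclass
class LooseUser:
    email: Optional[str] = None
    age: Optional[int] = None

# What the requirements usually mean: required fields, validated at
# construction time, so bad data fails fast and close to its source.
@dataclass(frozen=True)
class StrictUser:
    email: str
    age: int

    def __post_init__(self) -> None:
        if "@" not in self.email:
            raise ValueError(f"invalid email: {self.email!r}")
        if not 0 <= self.age < 150:
            raise ValueError(f"invalid age: {self.age}")

LooseUser()                                 # accepted silently
StrictUser(email="a@example.com", age=30)   # accepted
# StrictUser(email="nope", age=30) would raise ValueError immediately
```

Pinning this down in the architecture step (which fields are required, what validation runs where) is exactly the kind of constraint the AI won't impose on itself.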

If you do this and iterate you will gradually end up with a solid harness and you will need to review less.

Then port it to other projects.

> Once it's done, review the code. You will notice there is always something to fix: hardcoded variables, a SQL migration with seed data that should not actually be a migration, just generally crazy stuff.
>
> The worst is that the AI is always very loose on requirements. You will notice all its fields are nullable and records have little to no validation; you report an error when testing and it tries to solve it with a brittle async workaround, like LISTEN/NOTIFY or a callback, instead of the architecturally correct solution. These are things that at scale are hell to debug, especially if you did not write the code.

For that I usually get it reviewed by LLMs first, before reviewing it myself.

Same model but a clean session, and also different models from different providers. And multiple (at least 2) automated rounds: review -> triage by the implementing session -> addressing, plus reasons for deferring or ignoring any feedback it defers or ignores -> review -> triage by the implementing session -> ...

Works wonders.
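The loop described above could be orchestrated with a few lines of glue. This is a hypothetical sketch: `call_llm` is a stand-in for a real provider API call (stubbed here so the control flow itself runs), and all the names are invented:

```python
# Hypothetical orchestration of the multi-round review loop:
# review -> triage by the implementing session -> address -> repeat.

def call_llm(model: str, prompt: str) -> str:
    # Stub standing in for an actual LLM API call.
    return f"[{model}] response to: {prompt[:40]}"

def review_round(diff: str, reviewers: list[str]) -> list[str]:
    # Each reviewer is a clean session: it sees only the diff (and the
    # committed spec/plan), never the implementing session's chat history.
    return [call_llm(m, f"Review this diff against the plan:\n{diff}") for m in reviewers]

def run_review_loop(diff: str, reviewers: list[str], rounds: int = 2) -> list[dict]:
    log = []
    for n in range(1, rounds + 1):
        feedback = review_round(diff, reviewers)
        # The implementing session triages: fix, or record a reason to defer.
        triage = call_llm("implementer", "Triage and address:\n" + "\n".join(feedback))
        log.append({"round": n, "feedback": feedback, "triage": triage})
    return log

log = run_review_loop("fix: tighten user validation", ["model-a", "model-b"])
assert len(log) == 2  # at least two automated rounds, per the comment above
```

The log of feedback plus triage decisions is also what you, the human, review last, which is where the real time savings show up.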

Committing the initial spec / plan also helps the reviewers compare the actual implementation to what was planned. Didn’t expect it, but it’s worked nicely.

LISTEN/NOTIFY is not brittle, we use it for millions of events per day.

I agree! It should be very stable, IMO. If not, then please send a bug report and we'll look into it. Also, now it scales well with the number of listening connections (given clients listen on unique channel names): https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit...

The LISTEN/NOTIFY feature really just doesn’t get enough PR. It is perfectly suitable for production workloads yet people still want to reach for more complicated solutions they don’t need.

It's not the feature itself, it's how and what the LLM tries to use it for. It uses it to cross any and all architectural boundaries.


I find it very interesting that you assume this method would branch out to other projects. I find it even more interesting that you assume all software codebases use a database, give a damn about async anything, and that these ideas percolate out to general software engineering.

Sounds like a solid way to make CRUD web apps though.

GP is clearly providing examples of categories of tasks. Sure, not all languages do “async fn foo()”, but almost all problem domains involve some sort of making sure the right things happen at the right times, which is in a similar ballpark.

Holier than thou “yeah well I work on stuff that doesn’t use databases, checkmate!” doesn’t really land - data still gets moved around somehow, and often over a network!

Not trying to "land" anything.