> what works best with AI-coding: a strong and thorough idea of what you want, broken up into hundreds of smaller problems, with specific architectural steers on the really critical pieces

This has worked extremely well for me.

I have been working on an end-to-end modeling solution for my day job and I'm doing it entirely w/Claude.

I am on full-rework iteration three, learning as I go what works best, and this is definitely the way. I'm going to be giving a presentation to my team on using AI to accelerate and extend their day-to-day work on things like this, and here's my general outline:

1. Tell the LLM your overall goal and have it craft a thoughtful product plan from start to finish.

2. Take that plan and have it break each part into many smaller pieces, each well-planned and thoroughly documented, then ask it for a plan on how best to execute those pieces with LLMs.

3. Then go piece by piece, refining as you go.
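The three steps above can be sketched as a loop around any chat-completion API. This is just an illustration, not the commenter's actual setup: `ask` is a stand-in for whatever client you use (Claude, GPT, etc.), and the prompts are hypothetical.

```python
def ask(prompt: str) -> str:
    """Stand-in for a call to your LLM of choice; replace with a real API client."""
    return f"[model response to: {prompt[:30]}...]"  # canned reply so the sketch runs

# 1. Overall goal -> thoughtful product plan, start to finish
plan = ask("Goal: an end-to-end modeling tool. Draft a product plan from start to finish.")

# 2. Plan -> many small, well-documented pieces, plus an execution strategy
pieces = ask("Break this plan into small, well-documented pieces and say how to "
             "execute each with an LLM:\n" + plan)

# 3. Go piece by piece, refining as you go
for piece in pieces.split("\n\n"):
    draft = ask("Implement this piece:\n" + piece)
    # review, test, and refine the draft before moving on
```

The point of the structure is that each `ask` gets a small, well-scoped prompt rather than the whole project at once.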

The tool sets up an environment, gets the data from the warehouse, models it, and visualizes it in great detail. It took me about 22 hours of total time and roughly 2 hours of active time.
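Those four stages (environment setup, warehouse pull, modeling, visualization) commonly decompose into something like the skeleton below. Every name here is hypothetical and the logic is deliberately trivial; it is only meant to show the shape of such a tool, not the commenter's code.

```python
from dataclasses import dataclass

@dataclass
class Pipeline:
    warehouse_uri: str  # hypothetical connection string

    def setup(self) -> None:
        # create the environment: venv, dependencies, credentials, etc.
        print("environment ready")

    def ingest(self) -> list[dict]:
        # pull rows from the warehouse (stubbed with synthetic data here)
        return [{"x": i, "y": 2 * i} for i in range(5)]

    def model(self, rows: list[dict]) -> float:
        # fit something trivial: slope of y on x via least squares
        n = len(rows)
        sx = sum(r["x"] for r in rows)
        sy = sum(r["y"] for r in rows)
        sxx = sum(r["x"] ** 2 for r in rows)
        sxy = sum(r["x"] * r["y"] for r in rows)
        return (n * sxy - sx * sy) / (n * sxx - sx * sx)

    def visualize(self, slope: float) -> str:
        # a real tool would render charts; here we just report the fit
        return f"fitted slope: {slope:.1f}"

p = Pipeline("warehouse://example")
slope = p.model(p.ingest())
print(p.visualize(slope))  # fitted slope: 2.0
```

Breaking the tool along these seams is also what makes the piece-by-piece LLM workflow tractable: each stage is a self-contained prompt.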

It's beautiful, fast, and fully featured. I am honestly BLOWN AWAY by what it did and I can't wait to see what others on my team do w/this. We could have all done the setup, data ingestion, and modeling, no question; the visualization platform it built for me we absolutely could NOT have done w/the expertise we have on staff. And the time it took? The first three pieces would probably have been a few days, but the last part, I have no idea. Weeks? Months?

Amazing.

I wrote a whole PRD for this very simple idea, but the bug still persisted, even though I started from scratch four times. Granted, some attempts had different bugs.

I guess sometimes I have to do some minor debugging myself. But I really haven't encountered what you're experiencing.

Early on, I realized that you have to start a new "chat" after so many messages or the LLM will become incoherent. I've found that GPT-4.1 hits this threshold much sooner than o3. Maybe that's affecting your workflow and you're not realizing it?

No, that's why I started again, because it's a fairly simple problem and I was worried that the context would get saturated. A sibling commenter said that browser rendering bugs on mobile are just too hard, which seems to be the case here.

Have you tried with both Claude Opus 4 and Gemini 2.5 Pro?

Opus 4, Sonnet 4, o3, o4-mini-high.