I wrote a whole PRD for this very simple idea, but the bug still persisted, even though I started from scratch four times. Granted, some attempts had different bugs.

I guess sometimes I have to do some minor debugging myself. But I really haven't encountered what you're experiencing.

Early on, I realized that you have to start a new "chat" after so many messages or the LLM will become incoherent. I've found that gpt-4.1 has a much lower threshold for this than o3. Maybe that's affecting your workflow and you're not realizing it?

No, that's why I started again, because it's a fairly simple problem and I was worried that the context would get saturated. A sibling commenter said that browser rendering bugs on mobile are just too hard, which seems to be the case here.

Have you tried with both Claude Opus 4 and Gemini 2.5 Pro?

Opus 4, Sonnet 4, o3, o4-mini-high.