Yet it still codes like a junior developer who memorized all of Stack Overflow.

Even if the code were like that (it isn't), the power of the current crop of models to analyze data for patterns and build context out of code is leaps and bounds beyond what it was even a year ago. And any developer will tell you that the hardest part of fixing a bug is knowing where the bug is in the first place. Once you know where it is, fixing it is usually trivial.

There is serious magic happening in the construction of model context.

PhDs code like that too. Especially if they're statisticians :)

Personally I don't find this to be true anymore! It's not always great and still often tends towards unneeded complexity (especially if not pushed a bit), but I often find GPT 5.5 writing code I would have written myself. This was very much not true with earlier models (which would make something that worked, but I'd always have to rewrite it to make it "good code").

Personally I found 5.5 a massive step back from 5.4. Both of them still use way too many fallbacks and unnecessary checks, especially if you're having it output PHP. It's fine if you're just one person checking everything and able to catch and correct problems. But it's really bad when you have a team all using it, not checking the output and just trusting it, which leads to spaghetti code. It technically works, but it's very messy and will no doubt lead to buggy code.

It still writes like a junior dev, in that despite AI being able to get a picture of an entire repo, its changes are typically confined to the task it's working on, and it will opt to duplicate logic to keep changes contained. Again, technically works, not ideal.

Clearly you've never supervised junior developers.

That's literally my job...

Or PhDs