They do improve, but the general creativity and sparkle we see with increasing scale comes mostly from scaling up pretraining/parameter size, so it's quite slow and expensive compared to the speed (and decreasing cost) people have come to take for granted in math/coding in small cheap models. Hence the reaction to GPT-4.5: its taste and discernment improved exactly as much as scaling laws predicted, yet it was regarded almost universally as a colossal failure. It was as unpopular as the original GPT-3 was when its paper was released, because people look at the log-esque gains from scaling up 10x or 100x and are disappointed. "Is that all?! What has the Bitter Lesson or scaling done for me lately?"
So, you can expect coding skills to continue to outpace the native LLM taste.
I think we're basically agreeing here. Your point (if I'm reading it right) is that taste and discernment do scale, but the gains come through pretraining/parameter scaling, which is slow and expensive compared to the fast, cheap wins in math/coding from smaller models. So taste is more of a lagging indicator of scale: it improves, but it's the last thing people notice because the benchmarkable stuff races ahead. Which also means taste isn't really a moat, just late to get commoditized.
My point is more that since you can expect taste's commoditization to lag behind for deep fundamental reasons, taste does serve as a moat. Just perhaps a weaker one than one would naively expect, and one you will have to frantically keep investing in to stay ahead of the LLMs slowly catching up, as opposed to a permanent lock-in you can lazily, monopolistically coast on indefinitely. (I'm reminded of Neal Stephenson's La Brea tar pit analogy for open source vs proprietary software in _In the Beginning... Was the Command Line_.)
Fair enough. I really like the tar pit analogy; I wasn't familiar with it. You can keep pulling your feet out faster than the tar rises, as long as you're willing to keep spending the energy, possibly with diminishing returns over time.