Sharing my 2 cents.

In the past 2 months I've been using all the SOTA models to help me design a new DSL for narrative scripting (such as game storytelling) and a C# runtime implementation of the script player engine.

The language spec and design are about 95% authored by me up to this point; I have the LLMs work on the 2nd layer (the implementation specs/guidelines) and the 3rd layer (the concrete C# implementation).

Since it's a new language, I consider it a somewhat novel task for LLMs (at least, not boilerplate stuff like an HTTP API or a CRUD service). I'd say these LLMs have been very helpful. You can tell they sometimes get confused and have trouble complying with the unfamiliar language spec and design, but they are mostly smart enough to carry out the objectives, and they got better and better once the project was on track and had plenty of files/resources to read and reference.

And I'd also say "prompt better" is an important factor, just a much more nuanced/complicated one. I started with 0 experience with LLM agents, learned a lot about how to tame them, and developed a protocol for collaborating with agents. All of this came from countless rounds of trial and error, but in the end it boils down to "prompt better".

I wonder if my intuition here is correct; I would posit that “PL implementation” is a far more popular and well-explored field than it seems. How many toy/small/labor-of-love langs make it to Show HN? How many more simply don’t?

I’ve never personally caught the language implementation bug. I appreciate your perspective here.

I totally agree, and I was fully aware of how commonly people make languages for fun when I replied.

But I feel like the rationale still stands: given the nature of LLMs, common boilerplate tasks are easy because they can kind of just "decompress" from training data. But for a new language design, unless the language is almost identical to some other language captured by the model, "decompression" would just fail.