After decades of writing software, I feel like I have a pretty good sense for "this can't possibly be idiomatic" in a new language. If I sniff something is off, I start Googling for reference code, large projects in that language, etc.

You can also just ask the LLM: are you sure this is idiomatic?

Of course it may lie to you...

> You can also just ask the LLM: are you sure this is idiomatic?

I've found the reverse flow to be better. Never argue. Start by asking questions: "What is the idiomatic way of doing x in y?" or "Describe idiomatic y when working on x" or similar.

Then distill the useful bits out of the "pedantic" generations and add them to your constraints file, model.md, task.md, or whatever your setup uses.

You can also use this for a feedback loop. "Here's a task and some code, here are some idiomatic concepts in y, please provide feedback on adherence to these standards".
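For concreteness, here's roughly the shape that takes for me. The file name and the specific idioms below are just placeholders (a hypothetical idioms.md for a TypeScript project), not a recommendation:

```
# idioms.md (hypothetical; distilled from a few "what's idiomatic TypeScript?" generations)
- Prefer `unknown` over `any`; narrow with type guards.
- Model variant data with discriminated unions rather than combinations of boolean flags.
- Use async/await instead of raw promise chains.

# Feedback-loop prompt
"Here is a task, the code I wrote for it, and idioms.md. Review the code
strictly for adherence to idioms.md and list each deviation with a suggested fix."
```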

> If I sniff something is off, I start Googling for reference code, large projects in that language, etc.

This works so long as you know how to ask the question. But in my experience, an LLM directed at a task will do something, and I don't even know how to frame its behavior in terms that would make sense to search for.

(My experience here is with frontend in particular: I'm not much of a JS/TS/HTML/CSS person, and LLMs produce outputs that look really good to me. But I don't know how to even begin to verify that they are in fact good or idiomatic, since more often than not there are multiple layers of intermediating abstractions that I'm not already familiar with.)

I'm not much of a JS/TS/HTML/CSS person either. But if I think something looks off and it's something I care about, then I'll lose a day boning up on that thing.

To your point that you're not sure what to search for, I do the same thing I always do: I start searching for reference documentation, reading it, and augmenting that with whatever prominent code bases/projects I can find.

This motivates the question: if you're doing all this work to verify the LLM, is the LLM really saving you any time?

After just a few weeks in this brave new world my answer is: it depends, and I'm not really sure.

I think that over time, as the LLMs get better and I get better at working with them, I'll start trusting them more.

One thing that would help with that would be for them to become a lot less random and less sensitive to their prompts.

> and I don't even know how to frame its behavior in terms that would make sense to search for.

Have you tried recursion? Something like: "Using idiomatic terminology from the foo language ecosystem, explain what function x is doing."

If all goes well, it will hand you the correct terminology to frame your earlier question. Then you can do what the adjacent comment describes and ask it what the idiomatic way of doing p in q is.
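For example (the snippet and the wording are purely illustrative; I'm using TypeScript/React since that's the kind of code the parent said they can't evaluate):

```
// Prompt: "Using idiomatic terminology from the TypeScript/React ecosystem,
// explain what this function is doing and name the patterns it uses."

import { useEffect, useState } from "react";

// Hypothetical snippet the LLM wrote for you earlier.
function useDebouncedValue<T>(value: T, delayMs: number): T {
  const [debounced, setDebounced] = useState(value);
  useEffect(() => {
    // Only publish the latest value after it has been stable for delayMs.
    const id = setTimeout(() => setDebounced(value), delayMs);
    // Cleanup: cancel the pending update if the value changes again or the component unmounts.
    return () => clearTimeout(id);
  }, [value, delayMs]);
  return debounced;
}
```

A reasonable answer will hand back terms like "custom hook", "debounced state", and "effect cleanup", which are exactly the search terms you were missing.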

I think you're missing the point, which is that I'm not qualified to evaluate the LLM's output in this context. Having it self-report doesn't change that fact; it's just a shell game that moves the evaluation around.

Not at all - my point was that it can tutor you well enough to figure out whether the code it wrote earlier was passable. These things are unbelievably good at knowledge retrieval and synthesis. Gemini makes lots of boneheaded mistakes when it comes to the finer points of C++, but it has an uncanny ability to produce documentation and snippets in the immediate vicinity of what I'm after.

Sure, that approach could fail if the model has solidly internalized a completely backwards conception of an entire area. But that seems exceedingly unlikely to me.

It will also be incredibly time-consuming if you're starting from zero on the topic in question. But then, if you're trying to write related code, you were already committed to that uphill battle, right?