Hmm, yeah, given LLMs' ability to churn out lots of code quickly, and to be overly verbose in that code, that is a potential downside: a single quick edit could create so much intellectual overhead that Python becomes the wrong language for understanding what is going on.
What language do you feel is easier to reason about in the large?
Haskell would be my vote, and Rust too, actually, both because of their very strong type systems. The type system lets you very quickly figure out what something is before you figure out what it does, and it turns out that separating those two concerns as hard as those two languages do often lets you land the whole one-two punch faster.
Haskell does not qualify for a large training set, though. (Nor for readability in my opinion)
I don't think I have ever seen Haskell software made with LLMs, but then, aside from university, I have not seen Haskell code at all. (Also, I would associate Haskell purists with people who avoid LLMs.)
I would rather go with Rust given these choices.
But I have good results with TypeScript (or JavaScript for simpler things). Really large set of examples. Tools are optimized for it, and agents debugging in the browser work almost out of the box. And, well, an elaborate type system.
I used Claude to generate Haskell and it works really well. Claude struggles sometimes with respecting abstraction boundaries, but Haskell enforces parts of those boundaries in its type system better than a lot of other languages (if a module can’t do IO, for example).
Works well, in my experience. Sometimes the agent does weird stuff that you have to rewrite, but I get the sense that this happens in any language.
Maybe Haskell’s training set is not as large, but it seems to work well despite that.
Bit of a nit: it isn't the strong typing that makes Rust great for LLMs, it's the very strict compiler.
Plenty of languages have strong (enough) typing but their compilers happily let you or the LLM footgun yourself.
I'd say Java, because it has a massive footprint amenable to training, and a strong type system (though it does not have sum types, and those are trendy).
You'd have to steer the LLM to use the style you want, and not to massively overarchitect things, but that's going to be an issue regardless of language.
Java does have sum types: they are fairly recent, built from sealed interfaces with records as the variants, and can be exhaustively pattern matched in a switch.
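A minimal sketch of what that looks like (hypothetical `Shape` example, assuming Java 21+ for pattern matching in switch): the sealed interface closes the set of variants, so the compiler rejects the switch if a permitted subtype is missing, giving the exhaustiveness check that sum types are prized for.

```java
// A sum type in modern Java: a sealed interface whose only
// permitted implementations are records.
sealed interface Shape permits Circle, Rect {}
record Circle(double r) implements Shape {}
record Rect(double w, double h) implements Shape {}

public class Area {
    // Exhaustive switch: no default branch needed, and removing a
    // case (or adding a new Shape variant) is a compile error here.
    static double area(Shape s) {
        return switch (s) {
            case Circle c -> Math.PI * c.r() * c.r();
            case Rect r -> r.w() * r.h();
        };
    }

    public static void main(String[] args) {
        System.out.println(area(new Rect(3, 4)));
    }
}
```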
(I do agree however, Java is a great target for LLMs)
C# is as close to an ideal language as you can get for most things, IMO. I find AI does a great job with it.
I do agree. C# is a hidden gem for AI. There are not that many different ways to get somewhere, so the model has probably been trained on the framework and libraries everybody uses (the Microsoft ones).
Compared to most languages, including Java, C# makes it hard to compile incoherent code.
You barely need any dependencies other than aspnetcore and efcore for most applications and your AI knows them well.
It’s easy to do TDD with it, so it’s easy to keep your AI from hallucinating.
I definitely agree with the sentiment. However, this part:
> There are not that much different ways to get somewhere
This is far from true. C# is a language where you can operate on raw pointers through the unsafe keyword; at the other end of the spectrum, you can have duck typing in dynamic blocks.
For operating on collections, you can use old-style loops, chains of lambdas, or SQL-like query syntax.
I have been coding in C# the old-school way for most of my life at this point, and I still feel like I'm in a foreign land reading code from some other C# projects.