This claim needs to be backed up by evals. I could just as well argue the opposite, that LLMs are best at coding Python because there are two orders of magnitude more Python in their training sets than C++ or Rust.
In any case, you can easily get most of the benefits of typed languages by adding a rule that requires the LLM to always output Python code with type annotations and validate its output by running ruff and ty.
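For what it's worth, here is a minimal sketch of that validation step. It assumes `ruff` and `ty` are installed and on PATH and are invoked as `ruff check <path>` and `ty check <path>`; the helper names (`CheckResult`, `run_check`, `validate`, `report`) are made up for illustration and are not part of either tool.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_check(name: str, command: list[str]) -> CheckResult:
    """Run one checker; a non-zero exit code counts as a failure."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        # The tool is not installed; treat that as a failed check.
        return CheckResult(name, False, f"{command[0]} not found on PATH")
    return CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr)


def validate(path: str) -> list[CheckResult]:
    """Run the linter and the type checker over the generated code."""
    # Assumption: both tools are invoked as `<tool> check <path>`.
    checks = {
        "ruff": ["ruff", "check", path],
        "ty": ["ty", "check", path],
    }
    return [run_check(name, command) for name, command in checks.items()]


def report(results: list[CheckResult]) -> bool:
    """Print the output of failed checks to stderr; return True if all passed."""
    for result in results:
        if not result.passed:
            print(f"[{result.name}] failed:\n{result.output}", file=sys.stderr)
    return all(result.passed for result in results)
```

The agent would call `report(validate("src/"))` after each generation and feed the stderr output back into the next iteration until it returns `True`.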
> In any case, you can easily get most of the benefits of typed languages by adding a rule that requires the LLM to always output Python code with type annotations and validate its output by running ruff and ty.
My personal experience is that by doing exactly that, productivity, code readability, and correctness go through the roof, at a slight increase in cost due to having to iterate more.
And since that is an actual language-independent comparison, it leads me to believe that yes, static typing does in fact help substantially, and that the current differences between vibe coding languages are, just like you say, due to the relative quantity of training data.
I agree that the training sets for LLMs have much more training data for Python than for Rust. But C++ has existed before Python I believe. So I doubt there are two orders of magnitude more Python code than C++ code.
You're overlooking how many fewer programmers there were in the early years, how little of that code was ever public, and, even where it was, how useful it still is, since C++ has changed drastically since, say, what we used to write in 2001.
It's not just a question of whether there is more actual code in a given language, but how much is available in the public and private training data.
I've done work on reviewing and fine-tuning training data with a couple of providers, and the amount of Python code I got to see outstripped the C++ code by far more than 2 orders of magnitude. It could be a heavily biased sample, but I have no problem believing it could also be representative.
Python is pretty old, so I had a quick look.
https://en.wikipedia.org/wiki/C%2B%2B#History
In 1985, the first edition of The C++ Programming Language was released, which became the definitive reference for the language, as there was not yet an official standard.[31] The first commercial implementation of C++ was released in October of the same year.[28]
In 1998, C++98 was released, standardizing the language, and a minor update (C++03) was released in 2003.
https://en.wikipedia.org/wiki/History_of_Python
The programming language Python was conceived in the late 1980s,[1] and its implementation was started in December 1989[2] by Guido van Rossum at CWI in the Netherlands as a successor to ABC capable of exception handling and interfacing with the Amoeba operating system.[3]
Python reached version 1.0 in January 1994.
Of course, it's hard to say how much of that is reflected in the code available, or whether any of the old code is still valid input for modern use. It does broadly look like C++ is older, in general.
> But C++ has existed before Python I believe.
Sure, C++ is 42 years old, Python is “only” 34. Both are older than the online code hosts (or even the web itself) from which the code for AI training data is sourced, so age probably isn't a key factor in how much code there is for each; popularity with projects hosted in accessible public code repos is more relevant.
My experience with GitHub Copilot and Python has been that it _does_ generate better code completions for Python. It's sometimes shockingly good at predicting what you want to do in the next 30-50 lines of code based on a few well-named variables. But that shockingly good code is also filled with hallucinated classes, methods, parameter ordering, etc., which completely negate its usefulness.
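To make that concrete, here is a toy example of the two failure modes (swapped parameters and a made-up attribute). The `Invoice` class and `apply_discount` function are invented for illustration. With the annotations in place, a static checker such as mypy or pyright should flag both commented-out lines before anything runs; without a checker, they only blow up at runtime.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    customer: str
    amount_cents: int


def apply_discount(invoice: Invoice, percent: int) -> Invoice:
    """Return a new invoice with an integer percentage discount applied."""
    discounted = invoice.amount_cents * (100 - percent) // 100
    return Invoice(invoice.customer, discounted)


invoice = Invoice("acme", 10_000)
print(apply_discount(invoice, 15).amount_cents)  # prints 8500

# Typical completion mistakes, left as comments so the file still runs:
# apply_discount(15, invoice)  # swapped arguments: int passed where Invoice is expected
# invoice.total                # attribute that does not exist on Invoice
```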
ty still misses things caught by mypy. It also doesn't have the same level of support for Pydantic yet. I use it (because it's so damn fast), but alongside mypy, not as a replacement yet.
Yes, mypy is slow, but who cares if it's the agent waiting on it to complete.
Yep, I prefer pyright though, but ty is too early to be relied on (though I love love ruff so I'm sure they will get there).
I think you vastly overestimate the capacity of Python typing.