The logic above can support exactly the opposite conclusion: LLMs can do dynamically typed languages better, since they don't need to solve type errors and can save context tokens.
In practice, it has been reported that LLM-backed coding agents simply work around type errors by casting to `any` in gradually typed languages like TypeScript. I've personally observed this multiple times.
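For concreteness, the pattern looks something like this (a contrived TypeScript sketch; the names are made up, but I've seen agents produce essentially this):

```typescript
interface User {
  id: number;
  name: string;
}

// Hypothetical API response whose shape doesn't match User:
// id comes back as a string.
const response = { id: "42", name: "Ada" };

// const user: User = response;      // compile error: id is string, not number
const user: User = response as any;  // the "fix": cast through any, error silenced

// The mismatch now only surfaces at runtime:
console.log(user.id + 1);            // prints "421", not 43
```

The compiler is happy, the type system has been defeated, and the bug ships.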
I also tried using LLM agents with languages that have stronger type systems, like Rust. When complex type errors occurred, the agents struggled to fix them and eventually just reached for `todo!()`.
The behavior above may just come down to insufficient training data, but it illustrates the importance of actual evals over ideological speculation.
In my experience you can get around the `any` escape hatch by adding a linter rule that disallows it, plus a local CLAUDE.md file instructing the agent to fix any lint issues every time it changes something.
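Concretely, something like this (a minimal sketch assuming typescript-eslint's flat config; the `no-explicit-any` rule is real, the rest is project-specific boilerplate):

```javascript
// eslint.config.mjs
import tseslint from "typescript-eslint";

export default tseslint.config(
  ...tseslint.configs.recommended,
  {
    rules: {
      // Forbid the escape hatch the agent keeps reaching for,
      // including `as any` assertions.
      "@typescript-eslint/no-explicit-any": "error",
    },
  },
);
```

The CLAUDE.md side is then just a line like "run eslint after every change and fix all reported issues before finishing".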
You can equally get around a significant portion of the purported issues with dynamically typed languages by having Claude run the tests and actually execute the code.
I have no problem believing they will handle some languages better than others, but I don't think we'll know whether typing makes a significant difference vs. other factors without actual tests.
I always add instructions to have the LLM run `task build` before claiming a task is done.
Build runs linters and tests and actually builds the project, kinda-sorta confirming that nothing major broke.
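For reference, mine is roughly shaped like this (a minimal Taskfile.yml sketch for the Task runner; the commands are placeholders for whatever your project actually uses, shown here for a TypeScript toolchain):

```yaml
# Taskfile.yml
version: "3"

tasks:
  build:
    desc: lint, type-check, test, and build -- run before claiming a task is done
    cmds:
      - npx eslint .        # linters
      - npx tsc --noEmit    # type check
      - npm test            # tests
      - npm run build       # actual build
```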
It does not always work in my experience, due to complex type definitions. Extra tool calls and time are also needed to fix the lint errors.
Or just bad training data. I've seen `any` casually used everywhere.
>The logic above can support exactly the opposite conclusion: LLMs can do dynamically typed languages better, since they don't need to solve type errors and can save context tokens.
If the goal is just to output code that does not show any linter errors, then yes, choose a dynamically typed language.
But for code that works at runtime? Types are a huge helper for humans and LLMs alike.