It’s interesting because my career went from a higher-level language (Python) to lower-level ones (C++ and C). Opus and the like are amazing at Python, honestly sometimes better than me, though they do occasionally make some really stupid architectural decisions. But when it comes to embedded stuff, it’s still like a junior engineer. Unsure if that will ever change, but I wonder if it’s just the quality and availability of training data. This is why I find it hard to believe LLMs will replace hardware engineers anytime soon (I was a MechE for a decade).
How was the career transition from MechE? Looking to do the same myself.
As someone who did Python professionally from a software engineering perspective, I've actually found its Python to be pretty crappy: unaware of the _good_ idioms that live outside tutorials and outside the likely 90% of Python code out there that was simply hacked together quickly.
I have not tested this, but I would expect more niche ecosystems like Rust, Haskell, or Erlang to have a better overall training set (developers who care about good engineering focus on them), and potentially to produce the best output.
For C and C++, I'd expect a situation similar to Python's: while not as approachable, they are also pushed on beginning software engineers, so the training data would naturally have plenty of bad code.
I think it's pretty good at Elixir, so that tracks.
Came here to say this.
Can you recommend some books that teach these idioms? I know not everything is in books, but I suspect a bit of it is.
I've found it's ok at Rust. I think a lot of existing Rust code is high quality and also the stricter Rust compiler enforces that the output of the LLM is somewhat reasonable.
Yes, it's nice to have a strict compiler, so the agent has to keep fixing its bugs until it actually compiles. Rust and TypeScript are great for this.
A big downside with Rust is the compile times. Being in a tight AI loop just wasn't part of the design of any existing programming languages.
As languages designed for (and probably written by) AI come out over the next decade, it will be really interesting to see what design tradeoffs they make.
"cargo check" is fast and it's enough for the AI to know the code is correct.
I would argue that because Rust is so strict, having the agent compile and run tests on every iteration is actually less needed than in other languages.
I program mostly in Python, but I keep my projects strictly typed with basedpyright, and that greatly reduces the number of errors the agent makes because it gets immediate feedback when it has done something stupid.
Of course you still need to review the code because it doesn't solve logic bugs.
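To make that distinction concrete, here is a minimal sketch of what a strict type checker can and cannot catch. The `Reading` class and both functions are hypothetical names made up for illustration, not taken from any project in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical record type used only for this illustration."""
    sensor_id: str
    celsius: float


def mean_celsius(readings: list[Reading]) -> float:
    """Average temperature; raises ValueError on an empty list."""
    if not readings:
        raise ValueError("no readings")
    return sum(r.celsius for r in readings) / len(readings)


def buggy_mean_celsius(readings: list[Reading]) -> float:
    # Fully annotated and type-correct, but the logic is wrong:
    # it divides by len + 1. A type checker has no way to notice this;
    # only a test or a human reviewer will.
    if not readings:
        raise ValueError("no readings")
    return sum(r.celsius for r in readings) / (len(readings) + 1)


# A strict checker would reject the following call before anything runs,
# because a list of str is not a list of Reading:
#     mean_celsius(["21.5", "22.0"])

readings = [Reading("a", 20.0), Reading("b", 22.0)]
print(mean_celsius(readings))        # 21.0
print(buggy_mean_celsius(readings))  # 14.0
```

The commented-out call is the kind of mistake the agent gets immediate feedback on; the off-by-one in `buggy_mean_celsius` is the kind that still needs review.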
cargo check is faster; it's not fast
>Being in a tight AI loop just wasn't part of the design of any existing programming languages.
I would dare to say that any Lisp (Common Lisp, Clojure, Racket, whatever) is perfect for a tight AI loop thanks to REPL-driven development. It's an interesting space to explore, and I know that the Clojure community at least is trying to figure out something there.
Quite sure it's not about the language but the domain.
Agreed. When I've written very low level code where there are "odd" constraints ("this function must never take a lock, no system calls can be made" etc) the LLM would accidentally violate them. It seems sort of obvious why - the vast majority of code it is trained on does not have those constraints.
It is really good at writing C++ for Arduino; it can one-shot most programs.
I'd say the chance of me one shotting C++ is veeeery low. Same for bash scripts etc. This is where the LLM really shines for me.
I've had a similar experience as a graphics programmer who works in C++ every day.
Writing quick Python scripts works a lot better than niche domain-specific code.
Unfortunately, I’ve found it’s really good at Wayland and OpenGL. It even knows how to use the Clutter and Meta frameworks from the GNOME Mutter stack. Makes me wonder why I learned all this in the first place.
To be able to determine that it's really good.
LLMs do great with Rust though.
I think the combinatorial space is just too much. When I did web dev it was mostly transforming HTML/JSON from well-defined type A to well-defined type B. Everything is in text. There's nothing to reason about besides what is in the prompt itself. But constructing and maintaining a mental model of a chip and all of its instructions and all of the empirical data from profiling is just too much for SOTA to handle reliably.
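As a rough illustration of the "well-defined type A to well-defined type B" kind of work, here is a minimal sketch. The `ApiUser` and `UserCard` shapes and the sample payload are invented for the example; the point is only that everything needed to write the transformation is visible in the text itself.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ApiUser:
    """Hypothetical 'type A': the shape of an incoming JSON payload."""
    id: int
    first_name: str
    last_name: str


@dataclass(frozen=True)
class UserCard:
    """Hypothetical 'type B': the shape the consumer wants."""
    user_id: int
    display_name: str


def to_card(user: ApiUser) -> UserCard:
    # The whole transformation is determined by the two type definitions.
    return UserCard(
        user_id=user.id,
        display_name=f"{user.first_name} {user.last_name}",
    )


raw = '{"id": 7, "first_name": "Ada", "last_name": "Lovelace"}'
card = to_card(ApiUser(**json.loads(raw)))
print(json.dumps(asdict(card)))
# {"user_id": 7, "display_name": "Ada Lovelace"}
```

Nothing here depends on hidden state such as timing, register contents, or profiling data, which is the contrast being drawn with chip-level work.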
Nor backend web engineers who are not doing standard CRUD work.
I have seen these shine on frontend work