I had good fun transliterating it to Rust as a learning experience (https://github.com/stochastical/microgpt-rs). The trickiest part was working out how to represent the autograd graph data structure with Rust types. I'm finalising some small tweaks to make it run in the browser via WebAssembly and then compile it up for my blog :) Andrej's code is really quite poetic, I love how much it packs into such a concise program.

Storing the partial derivatives into the weights structure is quite the hack, to be honest. But everybody seems to do it like that.
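
For anyone wondering what that pattern looks like, here is a minimal Python sketch: each node keeps its value and its accumulated partial derivative side by side, so the weights end up carrying their own gradients. The `Value` class and the `_parents` / `_local_grads` names are my own illustration and assumptions, not code taken from microgpt or any of the ports mentioned here.

```python
class Value:
    """Scalar node in an autograd graph (illustrative sketch only)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data                  # forward value
        self.grad = 0.0                   # d(output)/d(this node), stored on the node itself
        self._parents = parents           # nodes this one was computed from
        self._local_grads = local_grads   # d(this node)/d(parent), one per parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        # Build a topological order with a depth-first walk.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node._parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        # Apply the chain rule from the output back to the leaves.
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


w, x, b = Value(3.0), Value(2.0), Value(1.0)
y = w * x + b
y.backward()
print(y.data, w.grad, x.grad, b.grad)
```

The alternative would be to return the gradients in a separate structure keyed by parameter, which is cleaner but needs more bookkeeping.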

Great work! Might do it too in some other language...

I did a conversion to Java. It worked (at least I think...) on the first try.

Next I want to convert this to my own programming language (which transpiles to C). I like these tiny projects very much!

Zig, here.

Anything but Python

At least Python can do this exercise without pulling in third-party dependencies :)

What's missing from Zig and its std lib for this?

Zig version [0] doesn't need any external dependencies.

0. https://tangled.org/m17e.co/microgpt