It’s just linear algebra. Work your way from feed forward to CNN to RNN to LSTM to attention then maybe a small inference engine. Kaparthy’s llama2.c is only ~300 lines on the latter and it pragma simds so you don’t need fancy GPUs
It’s just linear algebra. Work your way from feed forward to CNN to RNN to LSTM to attention then maybe a small inference engine. Kaparthy’s llama2.c is only ~300 lines on the latter and it pragma simds so you don’t need fancy GPUs