> * Way way way more code in the training set.
Why not convert the training code to AST?
You could, but it is extremely expensive to train an LLM that is competitive on coding evals, so I was assuming the use of a model someone else trained.
Also, if it is only trained on code, it's likely to miss out on all the world knowledge that comes from the rest of the data.
Fine-tuning an existing model instead of training from scratch might help.
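For anyone wondering what "convert the training code to AST" might look like in practice, here is a minimal sketch using Python's standard `ast` module. It assumes the training code is Python and that a plain-text dump of the tree is an acceptable serialization. The helper names `code_to_ast_text` and `ast_text_roundtrip` are made up for illustration, and nothing in this thread specifies the actual format a model would be trained on.

```python
import ast


def code_to_ast_text(source: str) -> str:
    """Parse Python source and return a textual dump of its AST."""
    tree = ast.parse(source)
    return ast.dump(tree)


def ast_text_roundtrip(source: str) -> str:
    """Parse source to an AST and regenerate source from it (Python 3.9+)."""
    return ast.unparse(ast.parse(source))


if __name__ == "__main__":
    snippet = "x = 1 + 2"
    # The exact dump format differs between Python versions.
    print(code_to_ast_text(snippet))
    # Comments and original formatting are not preserved by the round trip.
    print(ast_text_roundtrip(snippet))
```

The dump is usually much longer than the source it came from, so the same code would likely cost noticeably more tokens in this form.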