The point of the article is boring, but training LLMs on documents from a particular time period is actually pretty interesting.
Assembling 6 GB of training data is rather impressive, given the constraints.