It mostly doesn't. At 9M parameters, it has very limited capacity. The whole idea of this project is to demonstrate how language models work.