"everything else is just efficiency" is a nice line, but the efficiency is the hard part. the core of a search engine is also trivial: rank documents by relevance. google's moat was making it work at scale. same applies here.
Sure, but understanding the core concepts is essential to making things efficient, and as far as I understand, this is mainly for educational purposes (it does not even run on a GPU).
yep, agreed. wasn’t knocking the project at all, it’s great for exactly that purpose
I think the hard part is improving on the basic concept.
The current top-of-the-line models are extremely overfitted and produce so much nonsense that they are useless for anything but the simplest tasks.
This architecture was an interesting experiment, but is not the future.