GPT-4 allegedly has 1.8 trillion parameters.
Imagine having a bunch of 2D matrices holding a combined 1.8 trillion numbers, from which you pick out blocks of numbers in a loop, then merge and combine them to form a token.
Good luck figuring out what number represents what.
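To make that concrete, here's a rough toy sketch of the "matrices in a loop, then a token" idea. Everything in it is made up for illustration: the 5-word vocabulary, the sizes `DIM` and `N_LAYERS`, and the random weights. A real model also has attention, normalization, and so on, so treat this as a cartoon, not how GPT-4 actually works.

```python
import random

random.seed(0)  # fixed seed so the toy weights are reproducible

VOCAB = ["the", "cat", "sat", "on", "mat"]  # hypothetical 5-word vocabulary
DIM = 4        # toy hidden size (assumption)
N_LAYERS = 3   # toy layer count (assumption)


def rand_matrix(rows, cols):
    """Build a rows x cols matrix of random 'parameters'."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix by a vector using plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# The whole "model" is just these matrices of numbers.
embed = rand_matrix(len(VOCAB), DIM)                      # token -> vector
layers = [rand_matrix(DIM, DIM) for _ in range(N_LAYERS)]  # the loop body
unembed = rand_matrix(len(VOCAB), DIM)                    # vector -> token scores


def count_params():
    """Total count of numbers across all matrices."""
    return sum(len(m) * len(m[0]) for m in [embed, *layers, unembed])


def next_token(token):
    """Push one token through every layer and pick the highest-scoring word."""
    h = embed[VOCAB.index(token)]
    for w in layers:
        # Matrix multiply followed by a ReLU nonlinearity.
        h = [max(0.0, x) for x in matvec(w, h)]
    scores = matvec(unembed, h)
    return VOCAB[scores.index(max(scores))]


print(count_params())      # 20 + 3 * 16 + 20 = 88 parameters
print(next_token("cat"))   # some word from VOCAB, decided by the random weights
```

Even with only 88 numbers, nothing in those matrices is labeled with what it "means". Now scale that to an alleged 1.8 trillion.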
Wouldn't that just mean it's impractical for day-to-day use, but still something a researcher or team of researchers could work out?
Anthropic has a tool that lets them do this, but apparently doing it for even one prompt can take an entire day of work.
That’s so much faster than I expected.