I’m curious (as someone who knows nothing about this stuff!)—the context window is basically a record of the conversation so far and other info that isn’t part of the model, right?

I’m a bit surprised that 8GB is useful as a context window if that is the case—it just seems like you could fit a ton of research papers, emails, and textbooks in 2GB, for example.

But, I’m commenting from a place of ignorance and curiosity. Do models blow up the info in the context window, maybe do some processing to pre-digest it?

Yes, every token is expanded into a vector that can have many thousands of dimensions, and those vectors are stored for every token at every layer (the model's KV cache).

You absolutely cannot fit even a single research paper into 2 GB, much less an entire book.
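
A minimal back-of-envelope sketch makes the scale clear, assuming a Llama-2-7B-style model (32 layers, 4096-dimensional key and value vectors per layer, stored in fp16; those parameters are my assumption, not something stated above):

    # Rough KV-cache size per token, assuming a Llama-2-7B-like model
    n_layers = 32         # transformer layers (assumed)
    d_model = 4096        # key/value vector dimensions per layer (assumed)
    bytes_per_val = 2     # fp16
    per_token = 2 * n_layers * d_model * bytes_per_val  # key + value
    print(per_token)                    # 524288 bytes, i.e. 512 KiB per token
    print((2 * 1024**3) // per_token)   # 4096 tokens fit in a 2 GiB budget

At roughly half a megabyte of cache per token, 2 GB holds only about 4,000 tokens, a few pages of text, while a typical research paper runs on the order of 10,000 tokens.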