The context window is not on your system. It's on the server with the model. There may be some local prompt caching of some sort, but you're not hosting the context locally unless you're also hosting the model locally.

Chat history is kept locally; generally you have to send the whole history to the model each turn.
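A minimal sketch of what that stateless loop looks like on the client side; `call_model` here is a made-up stand-in for whatever completion endpoint you're using:

```python
# Stateless chat loop: the client resends the full message history
# on every turn. `call_model` is a placeholder for a real API call.
def call_model(messages):
    # a real implementation would POST `messages` to the API
    return f"reply after {len(messages)} messages"

history = [{"role": "system", "content": "You are helpful."}]

for user_text in ["hi", "tell me more", "thanks"]:
    history.append({"role": "user", "content": user_text})
    reply = call_model(history)  # the whole history goes over the wire each turn
    history.append({"role": "assistant", "content": reply})

# 1 system + 3 user + 3 assistant messages
print(len(history))  # → 7
```

Note the payload grows every turn, which is exactly why providers added prompt caching.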

That's just the plain text (or whatever files); it's not the context the model is directly working with on the server, which is tokenized, embedded into vectors, and has attention run against those vectors. The local history is generally quite small, the server-side context quite a bit larger. A text conversation of a few hundred kilobytes in plain text can be gigabytes in context.
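The "gigabytes" claim checks out as rough KV-cache arithmetic. A sketch with made-up but plausible model dimensions (these numbers are illustrative assumptions, not any specific model):

```python
# Back-of-envelope KV-cache size for a hypothetical large model.
n_layers = 80        # transformer layers (assumed)
n_kv_heads = 8       # KV heads, assuming grouped-query attention
head_dim = 128       # dimension per head (assumed)
dtype_bytes = 2      # fp16/bf16

# Both keys and values are cached per layer, hence the factor of 2.
bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * dtype_bytes
print(bytes_per_token)     # → 327680  (≈ 320 KiB per token)

# A few hundred KB of plain text is roughly 100K tokens (~4 chars/token).
n_tokens = 100_000
total_gb = bytes_per_token * n_tokens / 1e9
print(round(total_gb, 1))  # → 32.8  (GB of KV cache)
```

With full multi-head attention (no GQA) the per-token figure is several times larger still.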

Only "generally"? I'm curious which API has moved away from this protocol, which seems more adapted to conversations with humans than to agentic loops.

With the standard API you pass it all along each turn, but I think some of OpenAI's APIs are different: the Responses API, for example, can store conversation state server-side so you only send the new turn.
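To make the contrast concrete, here are the two request shapes side by side, as payload dicts only (no network; the response id is made up, and field names follow OpenAI's Chat Completions and Responses APIs as I recall them, so treat this as illustrative):

```python
# Stateless style (Chat Completions): full history every turn.
stateless_request = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "hi"},
        {"role": "assistant", "content": "hello"},
        {"role": "user", "content": "and now?"},
    ],
}

# Stateful style (Responses API): only the new turn, plus a pointer
# to the server-stored previous response ("resp_abc123" is invented).
stateful_request = {
    "model": "gpt-4o",
    "previous_response_id": "resp_abc123",
    "input": "and now?",
}
```

The second payload stays constant-size per turn, which is the part that suits agentic loops.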