I am curious whether it reuses the LLM across all tabs. It's hard to imagine most machines can boot up 1-2 copies of any 4 GB model unless it's a more powerful system.
I think it obviously will. What would be the benefit of spinning up more than one copy?
It should only need to load one copy of the weights, but each tab/site will need a separate context and KV cache.
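To put rough numbers on that, here is a back-of-the-envelope sketch. Every model dimension in it (32 layers, 8 KV heads, head size 128, a 4096-token context, fp16 cache values) is a made-up assumption for illustration, not the actual model's configuration, and the function names are hypothetical.

```python
GIB = 2**30

def kv_cache_bytes(layers, kv_heads, head_dim, context_tokens, bytes_per_value=2):
    # One key tensor and one value tensor per layer (hence the factor of 2),
    # each storing kv_heads * head_dim values per token in the context.
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value

def shared_weights_total(weight_bytes, tabs, per_tab_cache):
    # One copy of the weights, plus a separate KV cache for every tab.
    return weight_bytes + tabs * per_tab_cache

def separate_copies_total(weight_bytes, tabs, per_tab_cache):
    # Naive alternative: every tab loads its own weights and its own cache.
    return tabs * (weight_bytes + per_tab_cache)

# Assumed (hypothetical) dimensions; only the 4 GB weight size comes from the thread.
weights = 4 * GIB
cache = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, context_tokens=4096)

print(f"KV cache per tab: {cache / GIB:.1f} GiB")                                  # 0.5 GiB
print(f"5 tabs, shared weights: {shared_weights_total(weights, 5, cache) / GIB:.1f} GiB")   # 6.5 GiB
print(f"5 tabs, separate copies: {separate_copies_total(weights, 5, cache) / GIB:.1f} GiB") # 22.5 GiB
```

Under those assumptions the per-tab cost is the cache, not the weights, so the total grows by about half a GiB per tab instead of 4.5 GiB. A real implementation could shrink that further with a shorter context or a quantized cache.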