Hey, contributor here. That's right, GoModel embeds requests and does vector similarity lookup before proxying. Regarding the cache invalidation, there is no "purging" involved – the model is part of the namespace (params_hash includes the LLM model, path, guardrails hash, etc). TTL takes care of the cleanup later.