I really like the simplicity of this! What's retrieval performance and speed like?

Minimalism is my design philosophy :-)

Good question. Since it's just an LLM reading files, retrieval speed depends entirely on how fast the model can make tool calls, which in turn comes down to its tokens/s.
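Conceptually the loop is trivial (hypothetical sketch, not the actual implementation; `answer_query` is a stand-in for the model's tool-call loop):

```python
import os
import tempfile

def read_file(path):
    # Tool exposed to the model: return a file's contents.
    with open(path) as f:
        return f.read()

def answer_query(query, paths):
    # Stand-in for the LLM loop: the model picks files to read,
    # calls the tool, then answers from the gathered context.
    context = [read_file(p) for p in paths]
    # In practice latency is dominated by the model's tokens/s,
    # not the file reads themselves.
    return f"answer to {query!r} using {len(context)} file(s)"

# Usage: point it at a couple of note files.
tmp = tempfile.mkdtemp()
for name in ("a.md", "b.md"):
    with open(os.path.join(tmp, name), "w") as f:
        f.write(f"notes in {name}\n")

result = answer_query("what's in my notes?",
                      [os.path.join(tmp, n) for n in ("a.md", "b.md")])
print(result)
```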

Haven't done a formal benchmark, but based on vibes, each query takes a few seconds with GPT-5.4-high.

There's also an implicit "caching" mechanism, so the more you use it, the smoother it feels.