Everybody builds one. And then they usually figure out that making the model fill its context with a bunch of memories hurts performance more often than it helps.

That's why I always ask: got benchmarks?

Yes: cargo run -p mnemo-bench. It ships with 12 benchmarks. The full retrieval pipeline takes ~4ms on a debug build. Numbers are in the README performance table.

I don't care how fast it is if it makes the model dumber by cluttering up the context.
