Building HEBBS — a memory engine for AI agents, written in Rust.

The problem: every agent framework bolts together a vector DB for recall, a KV store for state, maybe a graph DB for relationships, and then hopes the duct tape holds. You get one retrieval path (similarity search), no decay, no consolidation, and the agent forgets everything the moment context gets trimmed.

HEBBS replaces that stack with a single embedded binary (RocksDB underneath, ONNX for local embeddings). Nine operations in three groups: write (remember, revise, forget), read (recall, prime, subscribe), and consolidate (reflect, insights, policy). The interesting part is four recall strategies — similarity, temporal, causal, and analogical — instead of just "nearest vector."
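To make the recall-strategy idea concrete, here's a toy sketch in Python — not the HEBBS API, just an illustration of how "similarity" and "temporal" can rank the same memory set differently (the memory records, vectors, and half-life parameter are all made up):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy memories: a 2-d "embedding" plus a timestamp.
memories = [
    {"text": "pricing objection on call #3", "vec": [0.9, 0.1], "ts": 100.0},
    {"text": "asked about SSO support",      "vec": [0.2, 0.8], "ts": 200.0},
]

def recall(query_vec, now, strategy="similarity", half_life=100.0):
    if strategy == "similarity":
        # nearest-vector ranking
        key = lambda m: cosine(query_vec, m["vec"])
    elif strategy == "temporal":
        # exponential recency decay: newer memories score higher
        key = lambda m: 0.5 ** ((now - m["ts"]) / half_life)
    return max(memories, key=key)

print(recall([1.0, 0.0], now=250.0)["text"])
# -> "pricing objection on call #3" (closest vector)
print(recall([1.0, 0.0], now=250.0, strategy="temporal")["text"])
# -> "asked about SSO support" (most recent)
```

The point is that the ranking function, not the storage, is what changes per strategy — causal and analogical recall would slot in as further scoring/traversal functions over the same store.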

Some technical decisions I'm happy with:

- No network calls on the hot path. Embeddings run locally via ONNX; LLM calls only happen in the background reflect pipeline.

- Recall runs at 2 ms p50 / 8 ms p99 with 10M memories on a 2 vCPU instance.

- Append-only event model for memories — sync is conflict-free, and forget is itself a logged event (useful for GDPR).

- Lineage tracking: insights link back to source memories, revisions track predecessors.
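The event model and lineage bullets can be sketched together. This is a minimal toy in Python, assuming nothing about HEBBS internals: state is derived by replaying an append-only log, `forget` is a tombstone event rather than a deletion, and a revision carries a pointer to its predecessor:

```python
import itertools

log = []                  # append-only: entries are never mutated or removed
_ids = itertools.count(1)

def append(op, **fields):
    event = {"id": next(_ids), "op": op, **fields}
    log.append(event)
    return event["id"]

def materialize():
    """Replay the log into current state; forget and revise are just later events."""
    state = {}
    for e in log:
        if e["op"] == "remember":
            state[e["id"]] = {"text": e["text"], "prev": None}
        elif e["op"] == "revise":
            state[e["id"]] = {"text": e["text"], "prev": e["prev"]}  # lineage link
            state.pop(e["prev"], None)
        elif e["op"] == "forget":
            state.pop(e["target"], None)  # tombstone: the event stays in the log
    return state

m1 = append("remember", text="prefers annual billing")
m2 = append("remember", text="budget is $10k")
m3 = append("revise", prev=m1, text="prefers monthly billing")
append("forget", target=m2)

state = materialize()
# Only the revision survives, and it still points at its predecessor;
# the log itself retains all four events, including the forget.
```

Because replicas only ever append, two divergent logs merge by union, which is why sync stays conflict-free; and the forget tombstone is exactly the audit record you want for GDPR-style erasure requests.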

SDKs for Python, TypeScript, and Rust. CLI with a REPL. gRPC + REST.

There's a reference demo — an AI sales agent that uses HEBBS for multi-session memory, objection handling recall, and background consolidation of conversation patterns.

Still early. The part I'm wrestling with now is tuning the reflect pipeline — figuring out when and how aggressively to consolidate episodic memories into semantic insights without losing useful detail. Curious if anyone working on agent memory has opinions on that tradeoff, or if you've found other approaches that work.
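One way to frame the tradeoff is a support threshold: only consolidate a group of episodic memories into an insight once there's enough repeated evidence, and keep the raw episodes otherwise. A toy Python sketch (grouping by a hand-labeled topic tag here purely for illustration — a real pipeline would cluster embeddings):

```python
from collections import defaultdict

episodes = [
    {"id": 1, "topic": "pricing",  "text": "call A: price too high"},
    {"id": 2, "topic": "pricing",  "text": "call B: asked for discount"},
    {"id": 3, "topic": "pricing",  "text": "call C: budget concerns"},
    {"id": 4, "topic": "security", "text": "call A: SOC 2 question"},
]

def reflect(episodes, min_support=3):
    """Consolidate a topic into one insight only once it has enough
    repeated evidence; below the threshold, keep the raw episodes."""
    by_topic = defaultdict(list)
    for e in episodes:
        by_topic[e["topic"]].append(e)
    insights, residue = [], []
    for topic, group in by_topic.items():
        if len(group) >= min_support:
            insights.append({
                "summary": f"{topic}: recurring pattern across {len(group)} episodes",
                "sources": [e["id"] for e in group],  # lineage back to episodes
            })
        else:
            residue.extend(group)  # too little evidence: don't lose the detail yet
    return insights, residue

insights, residue = reflect(episodes)
```

With this framing, "how aggressively to consolidate" becomes two knobs — the support threshold and whether source episodes are retained or demoted after consolidation — which at least makes the detail-loss tradeoff explicit and tunable.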

https://github.com/hebbs-ai/hebbs