Hacker News

I am currently working on a deep context query that uses dynamically generated regexes to pull only the relevant context blocks. By using lightweight regex pattern matching to detect semantic intent and filter structured context sections accordingly, you avoid the attention degradation that comes from stuffing semantically redundant information into the window.

https://jdsemrau.substack.com/p/tokenmaxxing-and-optimizing-...
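
A minimal sketch of the idea above, not the author's implementation: the patterns here are hand-written for illustration, whereas the comment describes generating them dynamically. The names `INTENT_PATTERNS`, `CONTEXT_SECTIONS`, and `select_context`, along with the section contents, are all hypothetical.

```python
import re

# Hypothetical intent patterns; the original post does not list its own.
INTENT_PATTERNS = {
    "billing": re.compile(r"\b(invoice|refund|payment)s?\b", re.IGNORECASE),
    "shipping": re.compile(r"\b(deliver\w*|shipping|tracking)\b", re.IGNORECASE),
}

# Hypothetical structured context, keyed by the same section names.
CONTEXT_SECTIONS = {
    "billing": "Refunds are processed within 5 business days.",
    "shipping": "Orders ship from the Berlin warehouse.",
    "legal": "Terms of service, version 3.",
}


def select_context(query: str) -> list[str]:
    """Return only the sections whose intent pattern matches the query."""
    return [
        CONTEXT_SECTIONS[name]
        for name, pattern in INTENT_PATTERNS.items()
        if pattern.search(query)
    ]


print(select_context("Where is my refund?"))
```

Only the matching section would be placed into the model's context window; the "legal" section has no pattern in this sketch, so it is never included.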

structuredPizza 16 hours ago

The more real-world use cases we see, the more often a well-thought-out regex serves as a bridge from probabilistic to deterministic behavior.

ogogmad 6 hours ago

This is one of the most interesting comments I've read on this website.

pbronez 7 hours ago

Interesting approach.

> Prioritize recall over precision.

Have you tried stemming the terms in your regex? That would help you catch messages where a different form of your word appears. For example, instead of “story” you look for “stor”, which catches “stories” as well.
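
A small sketch of that stemming suggestion, assuming a word-boundary prefix match; the sample messages are made up for illustration.

```python
import re

# Match any word that starts with the stem "stor".
stem_pattern = re.compile(r"\bstor\w*", re.IGNORECASE)

# Hypothetical message history.
messages = [
    "Tell me a story",
    "I liked those stories",
    "The store is closed",
    "Nothing relevant here",
]

matches = [m for m in messages if stem_pattern.search(m)]
print(matches)
```

Note that the stem also catches "store", which is the kind of false positive you accept when you prioritize recall over precision.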

Then you might think, could we do an even better job by figuring out the general semantic intent of the query and history? Let’s project them into a semantic vector space! That’s an embedding.

Then you want to query that, which means you need a vector database. So now we can take the query, embed it, query the vector DB with that embedding and retrieve the N closest history documents. You can use that to augment the generation of the response to your prompt.

This is RAG.
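
To make the retrieval steps concrete, here is a toy sketch. The `embed` function is a bag-of-words stand-in, not a real embedding model, so it only matches shared words rather than meaning, and a plain list stands in for the vector database. All names and the sample history are assumptions for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], n: int = 2) -> list[str]:
    """Return the n documents closest to the query (brute-force search)."""
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:n]


def build_prompt(query: str, documents: list[str], n: int = 1) -> str:
    """Augment the prompt with the retrieved history documents."""
    context = "\n".join(retrieve(query, documents, n))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical conversation history.
history = [
    "We discussed the quarterly revenue forecast",
    "The user asked for a bedtime story about dragons",
    "Deployment failed because of a missing environment variable",
]

print(build_prompt("tell another story about dragons", history))
```

A real setup would swap `embed` for a learned embedding model and the sorted list for an approximate nearest-neighbor index.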

Anyway, interesting to see different degrees of sophistication here. Certainly a handful of naive regexes is very snappy.

There’s probably a hybrid approach where you use sophisticated NLP and embedding techniques to robustly define topics, then train a regex to approximate that well.

jsemrau 7 hours ago

That assumes one layer of memory. In my experience you need at least four layers of memory for this to work well, and each of them has different retrieval requirements. Everything in short-term memory (state of the app, current conversation, current workspace artefact) requires low latency and high precision. For example, if you want to edit a segment in a financial analysis, a blog post, or a program, you only want to edit that segment. RAG on a vector DB is overkill for that, in my opinion.
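
One way to picture the short-term case, as a sketch only: the `Workspace` class and its segment ids are hypothetical, and the comment does not describe how its layers are actually implemented. The point is that an exact key lookup is both fast and precise, with no similarity search involved.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Hypothetical short-term memory: document segments addressed by id."""

    segments: dict[str, str] = field(default_factory=dict)

    def get(self, segment_id: str) -> str:
        """Exact lookup of one segment."""
        return self.segments[segment_id]

    def edit(self, segment_id: str, new_text: str) -> None:
        """Replace one segment and leave every other segment untouched."""
        if segment_id not in self.segments:
            raise KeyError(segment_id)
        self.segments[segment_id] = new_text


ws = Workspace({"summary": "Revenue grew.", "risks": "FX exposure."})
ws.edit("risks", "FX exposure and supplier concentration.")
print(ws.get("risks"))
```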