An effective "vectorless RAG" approach is to have an LLM write search queries against the documents. For example, if you store your documents in Postgres, let the LLM construct a regex that finds relevant matches. If you were searching for "Martin Luther King Jr.", it might write something like:
SELECT id, body
FROM docs
WHERE body ~* E'(?x)  # x = expanded mode: whitespace and #-comments ignored
(?:\\m(?:dr|rev(?:erend)?)\\.?\\M[\\s.]+)?  # optional title: Dr., Rev., Reverend
(  # name forms
    (?:\\mmartin\\M[\\s.]+(?:\\mluther\\M[\\s.]+)?\\mking\\M)  # "Martin (Luther)? King"
  | (?:\\mm\\.?\\M[\\s.]+(?:\\ml\\.?\\M[\\s.]+)?\\mking\\M)    # "M. (L.)? King" / "M L King"
  | (?:\\mmlk\\M)                                              # "MLK"
)
(?:[\\s.,-]*\\m(?:jr|junior)\\M\\.?)*  # optional suffix(es): Jr, Jr., Junior
';
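To make the idea concrete, here is a minimal, self-contained sketch of the retrieval loop in Python. The `generate_query` function is a hypothetical stand-in for the LLM call (it hard-codes the pattern above), the `DOCS` dict is toy data standing in for the `docs` table, and the regex is translated to Python's `re` syntax, which uses `\b` where Postgres uses `\m`/`\M`:

```python
import re

# Toy corpus standing in for the `docs` table (hypothetical data).
DOCS = {
    1: "Dr. Martin Luther King Jr. led the march.",
    2: "MLK spoke in Washington.",
    3: "The king of Spain visited.",          # bare "king" should NOT match
    4: "M. L. King wrote from Birmingham.",
}

def generate_query(question: str) -> str:
    """Stand-in for the LLM call that writes the search regex.

    Here it just returns the hand-written pattern for
    "Martin Luther King Jr.", in Python syntax (\\b word
    boundaries instead of Postgres's \\m/\\M)."""
    return r"""(?x)
        (?:\b(?:dr|rev(?:erend)?)\.?\b[\s.]+)?               # optional title
        (?:
            \bmartin\b[\s.]+(?:\bluther\b[\s.]+)?\bking\b    # "Martin (Luther)? King"
          | \bm\.?\b[\s.]+(?:\bl\.?\b[\s.]+)?\bking\b        # "M. (L.)? King"
          | \bmlk\b                                          # "MLK"
        )
        (?:[\s.,-]*\b(?:jr|junior)\b\.?)*                    # optional suffix
    """

def search(question: str) -> list[int]:
    """Compile the LLM-written regex and scan the corpus,
    like `SELECT id FROM docs WHERE body ~* pattern`."""
    pattern = re.compile(generate_query(question), re.IGNORECASE)
    return [doc_id for doc_id, body in DOCS.items() if pattern.search(body)]
```

Running `search("Where did Martin Luther King Jr. march?")` returns the matching document ids `[1, 2, 4]`; the bare "king" in document 3 is rejected because each alternative anchors on a full name form, not the surname alone.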
Won't that be slower than vector DBs by an order of magnitude or more?
I guess the major focus in certain use cases is not speed but accuracy and retrieval quality.
Faster is not always better. In certain situations, we may choose to sacrifice speed for increased accuracy.