Sure, but vector search is the dominant form of RAG; the rest are niche. Saying "RAG doesn't have to use vectors" is like saying "LLMs don't have to use transformers". Technically true, but irrelevant when 99% of what's in use today does.

How are they niche? The default mode of search for most dedicated RAG apps nowadays is hybrid search that blends classical BM25 ranking with HNSW-based embedding search. That already breaks the definition.

A search is a search. The architecture doesn't care whether it's doing a vector search, a text search, a keyword search, or a regex search; it's all the same. Deploying a RAG app means trying different search methods, or using multiple methods simultaneously or sequentially, to get the best performance for your corpus and use case.
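To make the "a search is a search" point concrete, here is a minimal sketch of a pluggable retriever: the search functions (`keyword_search`, `substring_search`) and the `retrieve` wrapper are hypothetical names for illustration, not any particular library's API. The downstream RAG pipeline consumes the same list of documents regardless of which methods produced it.

```python
from typing import Callable, List

# Any function with this shape can act as a retriever.
SearchFn = Callable[[str, List[str]], List[str]]

def keyword_search(query: str, docs: List[str]) -> List[str]:
    """Return docs sharing at least one whole word with the query."""
    terms = set(query.lower().split())
    return [d for d in docs if terms & set(d.lower().split())]

def substring_search(query: str, docs: List[str]) -> List[str]:
    """Return docs containing the query as a substring."""
    return [d for d in docs if query.lower() in d.lower()]

def retrieve(query: str, docs: List[str], methods: List[SearchFn]) -> List[str]:
    """Run several interchangeable search methods; dedupe, preserve order."""
    seen, hits = set(), []
    for method in methods:
        for d in method(query, docs):
            if d not in seen:
                seen.add(d)
                hits.append(d)
    return hits

docs = ["the cat sat", "dogs bark loudly", "cat videos"]
print(retrieve("cat", docs, [keyword_search, substring_search]))
# → ['the cat sat', 'cat videos']
```

A dense-vector method would slot into the same `SearchFn` signature; the prompt-assembly step after `retrieve` never has to know which method fired.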

Most hybrid stacks (BM25 + dense retrieval via HNSW/IVF) still rely on embeddings as a first-class signal. In practice the vector side carries recall on paraphrase, synonymy, and out-of-vocabulary terms, while BM25 stabilizes precision on exact-term and short-document cases. So my point still stands.
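For reference, one common way these two signals get blended is reciprocal rank fusion (RRF), which combines ranked lists without needing comparable scores. This is a generic sketch, not any specific stack's implementation; the doc IDs and the `k=60` smoothing constant are illustrative (60 is a conventional default).

```python
from typing import Dict, List

def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Reciprocal Rank Fusion: each list contributes 1/(k + rank) per doc;
    docs ranked highly by multiple retrievers float to the top."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

bm25_hits = ["d3", "d1", "d7"]    # exact-term matches
dense_hits = ["d1", "d9", "d3"]   # paraphrase/synonym matches
print(rrf_fuse([bm25_hits, dense_hits]))
# → ['d1', 'd3', 'd9', 'd7']
```

Note how `d1` wins despite topping neither list: appearing near the top of both retrievers outweighs a single first-place finish, which is exactly the complementarity the comment describes.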

> The architecture doesn't care

The architecture does care because latency, recall shape, and failure modes differ.

I don't know of any serious RAG deployments that don't use vectors. I'm referring to large-scale systems, not hobby projects or small sites.