How are they niche? The default mode of search for most dedicated RAG apps nowadays is hybrid search that blends classical BM25 search with HNSW-based embedding search. That's already breaking the definition.
A search is a search. The architecture doesn't care whether it's doing a vector search, a text search, a keyword search, or a regex search; it's all the same. Deploying a RAG app means trying different search methods, or using multiple methods simultaneously or sequentially, to get the best performance for your corpus and use case.
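Concretely, here's a toy sketch of what I mean (all names are made up, not any particular library's API): the pipeline just asks a retriever for top-k passages, and whether that retriever is keyword-based or vector-based is an implementation detail you can swap or stack.

```python
# Toy sketch: a retriever-agnostic RAG pipeline. Retriever, KeywordRetriever,
# and answer() are hypothetical names, not from a specific framework.
from typing import Protocol


class Retriever(Protocol):
    def retrieve(self, query: str, k: int) -> list[str]: ...


class KeywordRetriever:
    """Naive term-overlap search, standing in for BM25/keyword/regex methods."""

    def __init__(self, docs: list[str]):
        self.docs = docs

    def retrieve(self, query: str, k: int) -> list[str]:
        terms = set(query.lower().split())
        scored = [(len(terms & set(d.lower().split())), d) for d in self.docs]
        return [d for score, d in sorted(scored, key=lambda x: -x[0]) if score > 0][:k]


def answer(query: str, retrievers: list[Retriever], llm, k: int = 5) -> str:
    """The pipeline doesn't care which retrievers it was handed; it just
    concatenates (deduplicated) context and asks the model."""
    context: list[str] = []
    for r in retrievers:  # run methods sequentially; merging in parallel works too
        context.extend(r.retrieve(query, k))
    unique_context = "\n".join(dict.fromkeys(context))
    return llm(f"Context:\n{unique_context}\n\nQuestion: {query}")
```

A vector retriever (or an HNSW index behind one) would plug into the same `retrieve` signature without the rest of the pipeline changing.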
Most hybrid stacks (BM25 + dense via HNSW/IVF) still rely on embeddings as a first-class signal. In practice the vector side carries recall on paraphrase, synonymy, and out-of-vocabulary terms, while BM25 stabilizes precision on exact-term and short-document cases. So my point still stands.
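And the "blend" in most of those stacks is often just reciprocal rank fusion over the two result lists. Rough sketch (the function name and the conventional k=60 constant are illustrative, not tied to any specific engine):

```python
# Reciprocal rank fusion (RRF): fuse a BM25 ranking and a dense/HNSW ranking.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked doc-id lists; a doc at rank r contributes 1 / (k + r)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


bm25_ranking = ["doc3", "doc1", "doc7"]    # exact-term matches
dense_ranking = ["doc5", "doc3", "doc2"]   # paraphrase/synonymy matches
print(rrf([bm25_ranking, dense_ranking]))  # doc3 rises because both signals agree
```

The vector leg is still doing the heavy lifting on recall there, which is the point: you can't drop it and call the rest "hybrid."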
> The architecture doesn't care
The architecture does care: latency, recall behavior, and failure modes all differ between methods.
I don't know of any serious RAG deployments that don't use vectors. I'm referring to large-scale systems, not hobby projects or small sites.