This is a crucial clarification about the architecture.
The MCA algorithm does not perform the majority of document retrieval; it is merely a mechanism that complements FAISS and BM25 for greater coverage. My system uses a truly hybrid retriever:
FAISS (with BGE-large-en-v1.5, 1024D) handles the primary retrieval load, pulling in 60%+ of the retrieved documents.
MCA acts as a specialized safety net. The logic is: when two retrievers miss the ground truth (GT), the third one catches it. They complement each other—what FAISS and BM25 miss, MCA finds.
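The complementary-coverage idea can be sketched as a simple union over ranked ID lists. This is a minimal illustration, not the actual FAISS/BM25/MCA code; the retriever outputs below are hypothetical stand-ins.

```python
def union_candidates(retriever_results, limit=None):
    """Merge ranked doc-ID lists, preserving first-seen order and deduplicating."""
    seen = set()
    merged = []
    for results in retriever_results:
        for doc_id in results:
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged[:limit] if limit else merged

# Hypothetical example: FAISS carries most of the load,
# while MCA contributes the GT document both others missed.
faiss_hits = ["d1", "d2", "d3", "d4"]
bm25_hits  = ["d2", "d5"]
mca_hits   = ["d9"]  # GT document missed by FAISS and BM25

candidates = union_candidates([faiss_hits, bm25_hits, mca_hits])
```

Because the union deduplicates but never drops a retriever's hits, any GT document found by at least one retriever survives into the candidate pool.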
Pipeline Magic: Despite this aggressive union coverage (which often exceeds 85% of documents), the reranker plays an equally critical role. The ground truth (GT) doesn't always reach the top 15, and the final LLM doesn't always grasp the context even when it's present. All the magic is in the deterministic pipeline orchestration.
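The orchestration described above — union retrieval, rerank, truncate to the top 15, then generate — can be sketched as follows. Note that `rerank_score` and the injected `generate_answer` are hypothetical placeholders; the real system would use a cross-encoder reranker and an LLM call.

```python
TOP_K = 15  # only the top 15 reranked documents reach the LLM

def rerank_score(query, doc):
    # Placeholder scorer: counts query-word overlap.
    # The real pipeline would call a cross-encoder reranker here.
    return sum(1 for w in query.lower().split() if w in doc.lower())

def answer(query, docs, generate_answer):
    # Deterministic orchestration: rerank -> cut to TOP_K -> generate.
    ranked = sorted(docs, key=lambda d: rerank_score(query, d), reverse=True)
    context = ranked[:TOP_K]  # the GT may or may not survive this cut
    return generate_answer(query, context)
```

Keeping the generation step as an injected function is one way to make the final LLM a swappable component, since everything upstream of it is deterministic.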
LLM Agnosticism: The LLM (gpt-4o-mini) is involved only in the final answer generation, which makes my system highly robust and LLM-agnostic. You can switch to a weaker or stronger generative model; the overall accuracy will not change by more than ±10 points.