In this approach, the documents are pre-processed once to build a tree structure, which is slower than the indexing step of the current vector-based method. At retrieval time, however, it only needs to condition the LLM on the context at each node; no embedding model is required to convert the query into a vector. This makes it efficient when the tree is small. When the tree is large, though, this approach can become slower than vector search, since each traversal step trades speed for accuracy. If you prioritize speed over accuracy, a vector DB is probably the better choice.
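To make the trade-off concrete, here is a minimal sketch of what LLM-guided tree retrieval could look like. Everything here is hypothetical: the `Node` structure, the traversal loop, and especially `relevance`, which is a trivial keyword-overlap stand-in for the actual LLM relevance judgment. The point it illustrates is that retrieval cost scales with tree depth times branching factor (the model only reads each node's children), not with the total number of chunks as in a flat vector search.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    summary: str                          # text the model conditions on
    children: list["Node"] = field(default_factory=list)

def relevance(query: str, text: str) -> int:
    # Stand-in for an LLM relevance judgment: count shared words.
    # A real system would prompt the LLM to pick the best child instead.
    return len(set(query.lower().split()) & set(text.lower().split()))

def tree_retrieve(root: Node, query: str) -> str:
    node = root
    while node.children:
        # Only the children's summaries are read at each step --
        # no embedding of the query is ever computed.
        node = max(node.children, key=lambda c: relevance(query, c.summary))
    return node.summary

tree = Node("all docs", [
    Node("networking docs", [Node("TCP retransmission details")]),
    Node("database index docs", [Node("B-tree index internals")]),
])

print(tree_retrieve(tree, "how does a B-tree index work"))
```

With a small tree this is just a handful of comparisons per query, but each "comparison" in a real system is an LLM call, so for a large tree the accumulated calls per traversal can easily cost more than one embedding lookup plus an approximate nearest-neighbor search.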