In this approach, the documents need to be pre-processed once to generate a tree structure, which is slower than the current vector-based method. However, during retrieval, this approach only requires conditioning the LLM on the context and does not need an embedding model to convert the query into vectors. As a result, it can be efficient when the tree is small. When the tree is large, however, this approach may be slower than the vector-based method since it prioritizes accuracy. If you prioritize speed over accuracy, then a vector DB is probably the better choice.
The approach used here, breaking large documents down into summarized chunks that can more easily be reasoned about, is how a lot of AI systems deal with documents that surpass effective context limits in general. But in my experience this approach only works up to a certain point, and then the summaries start to hide enough detail that you do need semantic search or another RAG approach like GraphRAG. I think the efficacy of this approach really falls apart after a certain number of documents.
Would've loved to see the author run experiments on how this compares to other RAG approaches, or on what its limitations are.
Thanks, that’s a great point! That’s why we use the tree structure, which can be searched layer by layer without putting the whole tree into the context (which would compromise the summary quality). We’ll update with more examples and experiments on this. Thanks for the suggestion!
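To make the layer-by-layer idea concrete, here is a minimal sketch of what that kind of traversal might look like. The `Node` structure and the `choose` callback are hypothetical names for illustration, not the author's actual implementation; the point is only that the LLM sees one layer's summaries at a time, never the whole tree.

```python
# Minimal sketch of layer-by-layer tree retrieval (hypothetical names,
# not the author's actual implementation).
from dataclasses import dataclass, field

@dataclass
class Node:
    summary: str                              # summary of this subtree
    text: str = ""                            # raw chunk text (leaves only)
    children: list["Node"] = field(default_factory=list)

def retrieve(root: Node, query: str, choose) -> str:
    """Walk the tree one layer at a time.

    `choose(query, summaries)` is expected to ask an LLM which child
    summary best matches the query and return its index.
    """
    node = root
    while node.children:
        # Only this layer's summaries enter the prompt, so context use
        # stays bounded by the branching factor, not the corpus size.
        idx = choose(query, [c.summary for c in node.children])
        node = node.children[idx]
    return node.text  # reached a leaf chunk
```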
Can you elaborate on this, please?
To put it in terms of data structures: a vector DB is more like a Map, while this is more like a Tree.
For the C++ programmers among us, I think that means it's more like `unordered_map` than `map`.
Lol, you mean the vector DB is more like a hash map; `map` is tree-based.
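The analogy is loose, since a vector DB does nearest-neighbor search over embeddings rather than exact key lookup, but roughly the contrast looks like this (made-up data, illustrative only):

```python
# Rough sketch of the "map vs. tree" analogy (illustrative only).
import math

# "Map"-style: a vector DB keeps a flat index of embedded chunks and
# answers a query with a nearest-neighbor lookup. (Real vector DBs use
# ANN indexes, not this linear scan.)
index = {"chunk-1": [0.1, 0.9], "chunk-2": [0.8, 0.2]}

def vector_lookup(query_vec):
    return min(index, key=lambda k: math.dist(index[k], query_vec))

# "Tree"-style: retrieval walks a hierarchy of summaries, choosing one
# branch per level, like descending a search tree.
tree = {"summary": "whole corpus",
        "children": [{"summary": "part A", "children": []},
                     {"summary": "part B", "children": []}]}

def tree_lookup(node, choose):
    while node["children"]:
        node = choose(node["children"])   # e.g. an LLM picks the branch
    return node["summary"]
```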