> Moreover, GraphRAG preprocessing is insanely expensive and decidedly does not scale linearly with your dataset.
Sounds interesting. What exactly is the expensive computation?
On a separate note: I have a feeling RAG could benefit from a kind of “simultaneous vector search” across several different embedding spaces, sort of like AND in an SQL database. Do you agree?
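Something like the sketch below, purely illustrative: the random matrices stand in for the same documents embedded by two different models, and the AND is just an intersection of the top-k hits from each space.

```python
import numpy as np

def top_k_ids(query_vec, doc_matrix, k=20):
    """Brute-force cosine similarity search; returns the indices of the k closest docs."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = d @ q
    return set(np.argsort(-scores)[:k])

# Hypothetical setup: the same 1000 documents embedded in two different spaces,
# e.g. one general semantic model and one keyword/code-oriented model.
docs_space_a = np.random.rand(1000, 384)   # embeddings from model A
docs_space_b = np.random.rand(1000, 768)   # embeddings from model B

query_a = np.random.rand(384)              # query embedded with model A
query_b = np.random.rand(768)              # query embedded with model B

# "AND" semantics: only keep documents that rank highly in BOTH spaces.
hits = top_k_ids(query_a, docs_space_a) & top_k_ids(query_b, docs_space_b)
print(sorted(hits))
```

In practice you'd run the two searches against real vector indexes and maybe relax the strict intersection into a score fusion, but the AND-like filtering idea is the same.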
GraphRAG does full entity extraction across the entire dataset, then looks at every relation between those entities in the documents, then looks at every “community” of relations and generates narratives/descriptions for everything at all of those levels. That is, to say the least, not linear scaling with your data, and because questions will be answered on the basis of this preprocessing you don't want to just use the stupidest/cheapest LLM available. It adds up pretty quickly, and most of the preprocessing turns out to be useless for the questions you'll actually ask. The OP's approach is more expensive per query, but you're more likely to get good results for that particular question.
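Back-of-the-envelope, just to show how it adds up. Every number below is a made-up assumption, not measured GraphRAG behaviour, and the real relation/community counts depend on how dense the entity graph gets:

```python
# Rough cost model for GraphRAG-style indexing vs. a query-time-heavy approach.
# All parameters are illustrative assumptions, not the actual GraphRAG pipeline.

def indexing_llm_calls(num_chunks,
                       relations_per_chunk=5,
                       community_levels=3,
                       communities_per_level_ratio=0.1):
    """Estimate how many LLM calls the preprocessing phase makes."""
    entity_extraction = num_chunks                       # one extraction pass per chunk
    relation_descriptions = num_chunks * relations_per_chunk
    # Each hierarchy level groups the graph into communities, and every
    # community gets its own narrative/summary call.
    community_summaries = sum(
        int(num_chunks * communities_per_level_ratio / (level + 1))
        for level in range(community_levels)
    )
    return entity_extraction + relation_descriptions + community_summaries

def per_query_llm_calls(num_subqueries=3, rerank_passes=1):
    """The query-time-heavy alternative: a handful of calls per question instead."""
    return num_subqueries + rerank_passes

upfront = indexing_llm_calls(num_chunks=50_000)
print(f"indexing: ~{upfront:,} LLM calls before anyone asks anything")
print(f"query:    ~{per_query_llm_calls()} LLM calls per question")
# With these (made-up) numbers the upfront bill dwarfs what you'd spend
# answering even thousands of individual questions at query time.
```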