It is just as "vibe-ish" as vector search and notably does require chunking (document chunks are fed to the indexer to build the table of contents). That said, I don't find vector search any less "vibey". While "mathematical similarity" is a structured operation, the "conversion to high-dimensional vectors" part is predicated on the encoder, which can be trained towards any objective.
> scaling will become problematic as the doc structure approaches the context limit of the LLM doing the retrieval
IIUC, retrieval is based on traversing a tree structure, so only the root nodes have to fit in the context window. I find that kinda cool about this approach. But yes, still "vibe retrieval".
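Under that reading, a traversal could look something like this hedged sketch, assuming a generic `llm()` completion callable and nodes shaped like `{title, summary, nodes}` (both are my assumptions, not PageIndex's actual code). Only one level of the table of contents sits in the prompt at a time, so the full document never has to fit in context:

```python
# Hedged sketch of level-by-level tree-descent retrieval.
def retrieve(node: dict, query: str, llm) -> dict:
    while node.get("nodes"):  # descend until we hit a leaf
        menu = "\n".join(
            f"{i}: {child['title']} - {child['summary']}"
            for i, child in enumerate(node["nodes"])
        )
        answer = llm(
            f"Question: {query}\n"
            f"Which section most likely contains the answer?\n{menu}\n"
            "Reply with the number only."
        )
        node = node["nodes"][int(answer.strip())]
    return node  # leaf: the caller fetches its pages from the source doc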
It doesn't look like it's just the root nodes from the structure; it appears to be the entire structure, including a summary for each node but excluding the text content itself.
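Hypothetically, that digest might look something like the following (the field names are my illustration, not the project's actual schema): summaries and page pointers everywhere, body text nowhere.

```json
{
  "title": "Example Annual Report",
  "nodes": [
    {
      "title": "1. Financial Overview",
      "summary": "Revenue, margins, and year-over-year comparisons.",
      "pages": [12, 18],
      "nodes": [
        {
          "title": "1.1 Revenue by Segment",
          "summary": "Breakdown of revenue across business units.",
          "pages": [13, 15],
          "nodes": []
        }
      ]
    },
    {
      "title": "2. Risk Factors",
      "summary": "Market, regulatory, and operational risks.",
      "pages": [19, 27],
      "nodes": []
    }
  ]
}
```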
The end result is that a pre-summarized digest is fed into each prompt, and the LLM selects whatever it decides on. The PageIndex value-add here is ostensibly the creation of that summary structure, but this too is done with LLM assistance. I've been through the code now, and what I see is essentially JSON creation and parsing during the indexing process, with LLM prompts as the creation engine for all of that as well.
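That indexing pattern, roughly, again assuming a generic `llm()` callable (the prompt wording and schema are illustrative guesses, not the project's actual code):

```python
import json

# Rough sketch of the indexing pattern described above: the tree is
# built by prompting an LLM per section and parsing its JSON back out.
def index_section(title: str, text: str, llm) -> dict:
    raw = llm(
        "Summarize the following section in two sentences. Return JSON "
        'shaped like {"title": ..., "summary": ...}.\n\n'
        f"# {title}\n\n{text}"
    )
    node = json.loads(raw)  # LLM output parsed straight into the index
    node["nodes"] = []      # children get attached by the same process, recursively
    return node
```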
Yes, it is technically vectorless RAG, but it gets there entirely through iterative and recursive calls to an LLM on all sides.
Looking through the rest of their code & API, the API exists to do these things: