That directory is huge already! I guess the index.md helps the agent find what it needs, but even that markdown file is very long, so reading it would consume a ton of tokens.

Also, I wonder who or what decides which papers go in there.

In the blog post, the agent is allowed to do its own search.

Check out the Researcher and Process Leads skills in ctoth/research-papers-plugin. I have basically automated the literature review end to end.

Having an "indexed global data collection" of the markdown would be a kumbaya moment for AI. There's so much data out there but finite disk space. Maybe torrents or IPFS could work for this?
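
To make the idea a bit more concrete, here's a rough sketch of content addressing, the principle that both torrents and IPFS build on: a file's ID is derived from a hash of its bytes, so identical markdown stored by different people collapses to a single entry. This is a toy in-memory version, not IPFS's actual CID format or API; `ContentStore`, `put`, and `get` are names made up for illustration.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, not an IPFS API)."""

    def __init__(self):
        self._blobs = {}  # maps hex digest -> raw bytes

    def put(self, text: str) -> str:
        """Store text and return its content-derived key."""
        data = text.encode("utf-8")
        key = hashlib.sha256(data).hexdigest()
        # Identical content always hashes to the same key, so a second
        # copy costs no extra space.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> str:
        """Fetch text by key; raises KeyError if the key is unknown."""
        return self._blobs[key].decode("utf-8")

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentStore()
first = store.put("# Paper notes\nSome extracted claims.\n")
second = store.put("# Paper notes\nSome extracted claims.\n")  # duplicate copy
other = store.put("# Different paper\n")
```

Here `first` and `second` are the same key and the store holds two blobs, not three. That deduplication is what would let many people share one global collection without each keeping a full copy of everything.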

I'm actually sort of working on this! https://github.com/ctoth/propstore is like Cyc, but with no single answer. Plus, the knowledge bases are literally git repos that you can fork and merge. Research-papers-plugin is the frontend: we extract the knowledge, then we need somewhere to put it :)