The relevance here is pretty weak.

https://sturdystatistics.com/deepdive?fast=0&q=reinforcement...

I think only 1 in 10 of the articles is really on topic.

I see that the model has not yet finished training: I think you are referring to the "Raw Search Results Section".

Our tool works a little differently from LLM-style tools. We do a bulk search (for academic search, ~1000 papers) and then train a hierarchical Bayesian model to organize the results. Once the model has trained, it provides a visual representation of the high-level themes, which you can then use to explore the results.

The trade-off is that we are willing to lower the relevance filter to enable broader exploration.
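
To make the trade-off concrete, here is a toy sketch in Python. It is not our actual pipeline: the hits, scores, and theme labels are all made up, and the themes are hard-coded where a trained model would normally infer them. It only shows how a looser relevance cutoff lets more themes through for exploration.

```python
from collections import defaultdict

# Hypothetical search hits: (title, relevance score in [0, 1], theme label).
# Assumption: in a real system the theme would come from a trained model;
# here it is hard-coded purely for illustration.
HITS = [
    ("Policy gradients for robotics", 0.92, "core RL methods"),
    ("Q-learning convergence proofs", 0.88, "core RL methods"),
    ("Reward shaping in recommender systems", 0.41, "applications"),
    ("Bandit algorithms for ad placement", 0.35, "applications"),
    ("Dopamine and reward prediction error", 0.22, "neuroscience"),
]


def filter_hits(hits, min_relevance):
    """Keep only hits whose relevance score meets the cutoff."""
    return [hit for hit in hits if hit[1] >= min_relevance]


def group_by_theme(hits):
    """Group hit titles under their theme label."""
    themes = defaultdict(list)
    for title, _score, theme in hits:
        themes[theme].append(title)
    return dict(themes)


# A strict cutoff keeps only the most on-topic papers.
strict = group_by_theme(filter_hits(HITS, min_relevance=0.8))

# A loose cutoff admits weaker matches, which surfaces more themes.
broad = group_by_theme(filter_hits(HITS, min_relevance=0.2))

for name, themes in (("strict", strict), ("broad", broad)):
    print(name, {theme: len(titles) for theme, titles in themes.items()})
```

With the strict cutoff you only see the one core theme; with the loose cutoff the raw list looks noisier, but the adjacent themes become available to browse.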