Hacker News

What are people's experiences with using LLMs to mine information from scientific papers?

My own experience: I first attempted to extract the anti-drug antibody (ADA) rate from each of 3730 clinical-trial papers, all indexed in PubMed. I started from PDFs. Claude Opus 4.7 analyzed each PDF against a written rules document that we had formulated. Running all the papers took about a week because I kept hitting session limits; the total cost was ~$25 (USD). We got actual rates from 909 papers (about 24%). The rest were mostly cases where the rate was not reported or did not meet our criteria, such as the requirement that only one drug be administered at a time.

I read thirty of the papers and re-read those where I got a different answer from Claude, concluding that it had erred one time and I had erred three times.
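A rough way to see how tightly a 30-paper check pins down the error rate is a Wilson score interval. This is only a sketch: it assumes the thirty papers were a random sample of the full set and treats the one Claude error as 1 failure in 30 trials. The function name is made up for illustration.

```python
import math


def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = errors / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Assumption: 1 model error observed among 30 hand-checked papers.
low, high = wilson_interval(1, 30)
print(f"95% interval for the error rate: {low:.1%} to {high:.1%}")
```

Under those assumptions the interval runs from well under 1% to somewhere in the mid-teens, so a sample of thirty is encouraging but does not pin the error rate down very precisely.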

So this works, but it is not entirely convenient: session limits mean that I can't start it up and walk away, or at least I don't know how to engineer that. I was also curious how local models would perform.
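One way the walk-away behavior might be engineered is a resumable loop that writes each result to disk as soon as it arrives and backs off when a limit is hit. The sketch below is not what I ran. `extract` is a hypothetical callable wrapping whatever model call is in use, and it is assumed to raise an exception when a rate or session limit is reached. The file layout (one JSON object per line) and the delay values are also assumptions.

```python
import json
import time
from pathlib import Path


def load_done(results_path):
    """Return the set of paper IDs already recorded in the JSONL results file."""
    path = Path(results_path)
    if not path.exists():
        return set()
    with path.open() as f:
        return {json.loads(line)["paper_id"] for line in f if line.strip()}


def run_batch(paper_ids, extract, results_path,
              max_retries=5, base_delay=60.0, sleep=time.sleep):
    """Process papers one at a time, appending each result to disk immediately.

    `extract` (paper_id -> JSON-serializable result) is hypothetical; any
    exception it raises is treated as a transient limit and retried.
    """
    done = load_done(results_path)
    with open(results_path, "a") as out:
        for paper_id in paper_ids:
            if paper_id in done:
                continue  # finished in an earlier run
            for attempt in range(max_retries):
                try:
                    result = extract(paper_id)
                except Exception:
                    # Exponential backoff: 1x, 2x, 4x, ... the base delay.
                    sleep(base_delay * 2 ** attempt)
                    continue
                out.write(json.dumps({"paper_id": paper_id, "result": result}) + "\n")
                out.flush()
                done.add(paper_id)
                break
            # If every retry failed, the paper is left for the next run.
    return done
```

Because finished papers are skipped on startup, the script can be rerun (for example from cron) until the results file covers every paper.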

To that end I tried Llama 3.3 70B on my Mac M5 Max (128 GB of memory). I used Ollama with the Q4_K_M quantization and a 128k context; the paper came to ~80k input tokens after pdftotext -layout.
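For anyone who wants to try something similar, the local setup can be approximated as below. This is a sketch, not my exact script: the model tag `llama3.3:70b`, the prompt layout, and the `temperature` setting are assumptions, and `rules` stands in for the text of the rules document. It relies on Ollama's local `/api/generate` endpoint and on `pdftotext` from Poppler being installed.

```python
import json
import subprocess
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def pdf_to_text(pdf_path):
    """Convert a PDF to layout-preserving text; '-' sends the output to stdout."""
    proc = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return proc.stdout


def build_request(paper_text, rules, model="llama3.3:70b", num_ctx=131072):
    """Build the JSON body for one non-streaming generate call.

    The model tag and prompt layout are assumptions for illustration.
    """
    prompt = f"{rules}\n\n--- PAPER ---\n{paper_text}\n\n--- ANSWER ---\n"
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # num_ctx must cover the whole prompt, or the input gets truncated.
        "options": {"num_ctx": num_ctx, "temperature": 0},
    }


def ask_ollama(body, url=OLLAMA_URL):
    """POST the request to a running Ollama server and return the generated text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Usage would look like `ask_ollama(build_request(pdf_to_text("paper.pdf"), rules))`, where `paper.pdf` is a placeholder path.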

One paper took 18 minutes, and the model was unable to determine the ADA rate even though it is clearly stated in the paper. One paper is not a proper benchmark, but at this speed a proper test is impractical: at 18 minutes per paper, all 3730 papers would take about 47 days. Clearly part of the speed issue is that Claude runs on a server farm, whereas I'm running on a single Mac. This is part of the practical problem that anyone would face with local computation.

What is the state of the art on this type of problem, whether answering questions one paper at a time or across many papers at once? I'd love to hear success stories!

smartypant 9 hours ago

Why don't you use DeepSeek V4 or Kimi K2.5 or K2.6? They are really good models and you will not hit these token limits.

davidbjaffe 4 hours ago

Wow! I had not tried DeepSeek. I just had Claude teach me how to use it, and then tested it on the same example, which it got right, and for which it charged next to nothing. This is wonderful and completely hilarious -- that Claude would happily give away business. Thank you!! P.S. Now it occurs to me that maybe I'm not comfortable with DeepSeek having access to directories on my computer.