Interesting that they manually transcribed the data into Excel. It would also be interesting to know how they mapped the Excel files to the final dataset. I wonder whether LLMs could convert the scans to structured data more efficiently, and how much accuracy would be lost in the process.