We've been trying to automate since the beginning. A lot of it is automated, but mostly for the easier and less damaged parts of the scrolls. Scanning takes a few days for the biggest scrolls, but the human refinement is still a multi-month process.
Random shower thought: I wonder if it would be better in the long term to stop digging out archaeological finds. The more we excavate, the more damage we do for future archaeologists who will have the superpower of reading these texts without even needing to dig the scrolls free and open them.
Archaeologists think about this a lot. Many digs leave portions intact specifically so that future scientists, with access to techniques and technologies beyond what's available now, can research them.
There is an active debate on exactly this topic when it comes to whether or not to excavate the tomb of Qin Shi Huang.
https://en.wikipedia.org/wiki/Mausoleum_of_Qin_Shi_Huang
Modern archaeologists are painfully aware that theirs is a destructive science, and do their best to mitigate that. The most extreme example is probably the tomb of the First Emperor, Qin Shi Huang, where official policy on excavation can be boiled down to "not yet".
We stand on the shoulders of those who came before us. People have been trying to unroll and read the scrolls for 250-some-odd years now. Had they not laid the groundwork for all that time, we wouldn't be making the progress we are now.
How many scrolls are intact (worldwide, rather than just France) that might still be recoverable?
IIRC 99% of all of the existing scrolls are still in Italy's possession. I think the breakdown is something like ~350 are mostly intact, another ~1000 are damaged but still "scroll-like", and the remaining hundreds are shattered fragments.
Has there been any progress on trying to dig more out of the ground? I know there was some thinking that a lot of scrolls might still be down there.
Not yet, as far as I am aware. Digging progress is decided by the Italian government at multiple levels and would be a many-year-long effort. We have our hands full for the foreseeable future with the 30 or so scrolls we've already scanned. We're getting more and more efficient on the scanning and automation fronts, though, and are hoping that we can get our hands on the other 300 or so intact scrolls, but that in and of itself is a multi-year project that will require more money and time. As I've mentioned in a different comment, scanning is _not cheap_ and we pay for it ourselves from our own funding and donations in order to release the data for free with permissive licensing. We hope that we can improve our processes to be able to work with cheaper, lower-resolution CT methods, but right now we are focused on extracting as much as possible from the best scan source in the world. Productization of cheaper scanning methods is a secondary or tertiary priority at the moment.
Funding-wise, I asked a year or so back about crowdfunding, but heard nothing back. My means are limited, but I'm sure there are a lot of people like me who could band together. The project seems content with the big donors right now?
My god, but that sounds wonderful.
...plus the ones that have not been dug out yet... the site is still partially buried
Could you please tell us how much effort goes into each type of task in those months?
Where else do you think these techniques could be applied?
We are a core team of about 10 researchers and developers working full time on work that applies to all of the scrolls. We also have 4 full-time annotators who tend to work on one scroll at a time. The amount of time spent on any given scroll varies with how difficult and large it is.
There is an extremely large overlap between a lot of the work we do and medical imaging, CT scanning, X-ray technology, and the like. A lot of the ML models and frameworks we have used and adapted for our purposes originated in the medical field for things like cancer detection or segmenting different body parts.