Well, since we're all talking about sourcing training material to "benchmaxx" for social proof, and without litigating the whole "AI bubble" debate, here's just a sample of the entire cottage industry of data curation firms:

https://scale.com/data-engine

https://www.appen.com/llm-training-data

https://www.cogitotech.com/generative-ai/

https://www.telusdigital.com/solutions/data-for-ai-training/...

https://www.nexdata.ai/industries/generative-ai

---

P.S. Google Comms would have been consulted re putting a pelican in the I/O keynote :-)

https://x.com/simonw/status/1924909405906338033

Cool. At least they are working across the board and benchmaxxing random things like theory of mind.