Not sure where you are based, but if you are in the EU and have no commercial intentions, you might want to consider adding the crawls from OpenWebSearch.eu, an EU-funded research project set up to provide an open crawl of a substantial part of the Web, together with its plain text and an index (they also collaborate with Common Crawl):
https://openwebsearch.eu/
It would be fantastic if someone could provide a decent-quality, not-for-profit Web search engine.
Why the hell do all those “renowned institutions” need this (admittedly brilliant) guy to turn their crawl into a usable search engine? What’s wrong with this continent…?