Here's the latest (although it looks truncated to TLDs having > 1M pages):
https://commoncrawl.github.io/cc-crawl-statistics/plots/tld/...
The complete list hides in the web graph:
https://data.commoncrawl.org/projects/hyperlinkgraph/cc-main...
and the specific file listing every host we've seen in the latest 3 crawls is:
https://data.commoncrawl.org/projects/hyperlinkgraph/cc-main...
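If you want the untruncated per-TLD counts, you can tally them from that host list yourself. A rough sketch, not an official tool: it assumes each line is a tab-separated record whose last field is the host name in reversed form (e.g. `com.example.www`), so the TLD is the first label. Check a few lines of the real file before trusting that; `count_tlds` and the `reversed_names` flag are names I made up for illustration.

```python
import gzip
from collections import Counter
from typing import Iterable


def count_tlds(lines: Iterable[str], reversed_names: bool = True) -> Counter:
    """Count hosts per TLD from an iterable of host-list lines.

    Assumption: the host name is the last tab-separated field on each line.
    With reversed_names=True the host is expected as "com.example.www",
    otherwise as "www.example.com".
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        host = line.split("\t")[-1]
        labels = host.split(".")
        tld = labels[0] if reversed_names else labels[-1]
        counts[tld.lower()] += 1
    return counts


def count_tlds_in_gzip(path: str, reversed_names: bool = True) -> Counter:
    """Stream a local gzip-compressed host list and count hosts per TLD."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return count_tlds(fh, reversed_names=reversed_names)
```

Download the file first, then call `count_tlds_in_gzip("hosts.txt.gz").most_common(20)` (the local filename here is just a placeholder). It streams line by line, so memory use stays small even for a very large host list.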