Number of pages, distribution of top-level domains, crawl overlaps, etc. - basic metrics about Common Crawl Monthly Crawl Archives Latest crawl: CC-MAIN-2024-33 View the Project on GitHub Distribution of Languages The language of a document is identified by Compact Language Detector 2 (CLD2). It is able to identify 160 different languages and up to 3 languages per document. The table lists the per