<span class="vcard">/u/stefan59867958</span>

Common Crawl’s Impact on Generative AI

Common Crawl is a massive archive of web crawl data created by a small nonprofit. Because of its size and free availability, the archive has become a central building block for generative AI (more specifically, LLMs). Yet so far, its role and influence on generat…