I don't know how people can use the data. There's so much of it! I don't see any hard drives that are 80TB. It seems like people would need some kind of RAID setup that can handle 200+TB of uncompressed data.
You don't need to download the whole thing. You can stream the WARC files from S3 and extract only the information you want (like pages with actual content). It's a lot smaller when you only keep the links and text.
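Roughly like this in Python, using warcio to stream a single WARC file and keep only the HTML responses. The WARC path and the regex-based link extraction are placeholders, just to show the shape of it:

    # Sketch: stream one Common Crawl WARC file and keep only the HTML
    # responses' links and text, instead of downloading the whole crawl.
    # Real WARC paths come from the crawl's warc.paths.gz listing.
    import re

    import requests
    from warcio.archiveiterator import ArchiveIterator  # pip install warcio

    WARC_URL = (
        "https://data.commoncrawl.org/"
        "crawl-data/CC-MAIN-2024-10/.../example.warc.gz"  # placeholder path
    )

    LINK_RE = re.compile(rb'href="(https?://[^"]+)"')  # crude link extraction

    def extract_links_and_text(warc_url):
        """Yield (target_uri, links, raw_html) for each HTML response record."""
        resp = requests.get(warc_url, stream=True)
        resp.raise_for_status()
        for record in ArchiveIterator(resp.raw):
            if record.rec_type != "response":
                continue
            ctype = record.http_headers.get_header("Content-Type", "")
            if "text/html" not in ctype:
                continue
            payload = record.content_stream().read()
            links = [l.decode("utf-8", "replace") for l in LINK_RE.findall(payload)]
            yield record.rec_headers.get_header("WARC-Target-URI"), links, payload

    if __name__ == "__main__":
        for uri, links, _html in extract_links_and_text(WARC_URL):
            print(uri, len(links), "links")

You process one file at a time this way, so nothing close to 200TB ever touches your disk.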
A search index is often made of smaller independent pieces called segments, so you can download and process the data progressively on a local machine, upload the resulting segments to object storage, and run queries against them. That's what we did for this project: https://quickwit.io/blog/commoncrawl
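Very roughly, the loop looks like this. This is only a sketch of the idea, with hypothetical bucket names and a stubbed-out parsing step, not the actual setup from the blog post:

    # Outline of the "process locally, upload a segment, repeat" loop.
    # Bucket names, paths, and warc_to_docs() are illustrative placeholders.
    import json

    import boto3

    s3 = boto3.client("s3")
    SEGMENT_BUCKET = "my-index-segments"  # hypothetical object-storage bucket

    def warc_to_docs(warc_path):
        """Placeholder for the WARC parsing step (e.g. links + text per page)."""
        yield {"url": "https://example.com/", "text": "..."}  # stand-in document

    def build_segments(warc_paths):
        # One WARC file at a time: write a small local batch, push it to
        # object storage, then move on -- the full crawl never has to fit
        # on local disk, and each uploaded batch is independently queryable.
        for i, warc_path in enumerate(warc_paths):
            local_batch = "/tmp/segment-%05d.jsonl" % i
            with open(local_batch, "w") as f:
                for doc in warc_to_docs(warc_path):
                    f.write(json.dumps(doc) + "\n")
            s3.upload_file(local_batch, SEGMENT_BUCKET, "segments/%05d.jsonl" % i)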