The data stops before it reaches useful sizes (10^7 elements and up). How are people implementing sorting algorithms not routinely working at 10^9 to 10^12, where the workload actually becomes a bottleneck?
I run ML algorithms like boosted trees (e.g. XGBoost) on data sets with 30k-1M rows and 200-2k columns. Sorting is the bottleneck; it's what the algorithm spends its time doing. I doubt I'm special, and I'm sure these sizes are common.
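To make the "sorting is the bottleneck" point concrete: exact split finding in gradient-boosted trees scans feature values in sorted order, so every feature column gets sorted (or pre-sorted and re-indexed) before gains can be evaluated. Here's a minimal numpy sketch of that pattern, not XGBoost's actual internals; the function name, the crude gain proxy, and the sizes are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (not XGBoost code): exact split finding sorts each
# feature column, then scans prefix sums of the gradients to pick a split.
rng = np.random.default_rng(0)
n_rows, n_cols = 30_000, 200            # low end of the sizes mentioned above
X = rng.standard_normal((n_rows, n_cols))
grad = rng.standard_normal(n_rows)       # per-row gradients from the loss

def best_split_per_feature(X, grad):
    """Return, per column, the split index with the largest gradient imbalance."""
    best = np.empty(X.shape[1], dtype=np.int64)
    for j in range(X.shape[1]):
        order = np.argsort(X[:, j], kind="stable")  # the sort: O(n log n) per column
        prefix = np.cumsum(grad[order])             # left-child gradient sums
        best[j] = np.argmax(np.abs(prefix[:-1]))    # crude stand-in for the real gain formula
    return best

splits = best_split_per_feature(X, grad)
```

At these shapes that is already hundreds of O(n log n) sorts per split evaluation, and trees have many nodes, so the sort dominates the runtime unless you switch to a histogram/binned method that trades the sort away.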