An understated benefit of merge sort is that all of its access patterns have excellent locality. Regardless of big-O, this is a huge advantage because it reduces cache thrashing.
Like it probably doesn't matter if you're sorting 10k entries, but when you're sorting several gigabytes it really does.
Locality is good during the merge, but there's no parallelism: there's a data dependency between the previous and next iteration, because we don't know which pointer we're supposed to advance until the comparison resolves. I think it's still unclear whether the cache locality necessarily compensates for that.
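For concreteness, here's the inner loop of a plain two-way merge (just a sketch); which element gets loaded next depends on the previous comparison, which is the serial dependency being described:

```cpp
#include <vector>

// Plain two-way merge. Which index advances in an iteration depends on the
// comparison in that iteration, so consecutive iterations are serially
// dependent, even though both input streams are read sequentially.
std::vector<int> merge(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] <= b[j])
            out.push_back(a[i++]);   // advance the left pointer...
        else
            out.push_back(b[j++]);   // ...or the right one, decided per element
    }
    // One input is exhausted; append the rest of the other.
    out.insert(out.end(), a.begin() + i, a.end());
    out.insert(out.end(), b.begin() + j, b.end());
    return out;
}
```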
Quite the opposite. Merge sort is much more susceptible to CPU cache misses. You have to go out of your way to optimize merge sort to avoid cache issues. But the devil is in the details, and it really depends on the arch, the dataset you're sorting (numbers vs. strings makes a big difference), the cores and cache levels available, and the memory available.
In practice, to avoid worst-case quadratic time, you need rather involved pivot selection in Quicksort. E.g. nine random or evenly distributed elements, which you put in three groups of three and take the median of the medians.
Collecting those nine elements looks terrible for locality.
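For reference, that "ninther" (median of medians of three) looks roughly like this; note the nine loads scattered across the whole range, which is where the locality complaint comes from (a sketch, helper names are mine):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Median of three values without enumerating all orderings.
int median3(int a, int b, int c) {
    return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Ninther pivot for v[lo, hi): nine evenly spaced samples, grouped in
// threes, median of each group, then the median of those medians.
// Assumes hi - lo >= 9.
int ninther_pivot(const std::vector<int>& v, std::size_t lo, std::size_t hi) {
    const std::size_t step = (hi - lo - 1) / 8;
    auto s = [&](std::size_t k) { return v[lo + k * step]; };
    return median3(median3(s(0), s(1), s(2)),
                   median3(s(3), s(4), s(5)),
                   median3(s(6), s(7), s(8)));
}
```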
Funnelsort is a cache-oblivious variant of merge sort which is provably cache-optimal in some sense. I haven't tried implementing it.
To handle the quadratic edge cases you just switch to heapsort when the recursion depth goes above a threshold, i.e. introsort; no need for convoluted pivot selection methods.
That's usually called Introsort, but I think you'll find that most of these still consider at least the middle element for a pivot, so they still have a more complex cache behaviour than the naïve linear sweep from both ends of a simple Quicksort.
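Roughly the scheme being described, with a middle-element pivot, a Hoare-style sweep from both ends, and a heapsort fallback once the depth budget runs out (a sketch of the idea, not any particular library's std::sort; names are mine):

```cpp
#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>

namespace {

// Hoare partition of v[lo, hi). The middle element is swapped to the front
// and used as the pivot value; returns j such that every element of
// v[lo..j] is <= every element of v[j+1..hi-1].
int hoare_partition(std::vector<int>& v, int lo, int hi) {
    std::swap(v[lo], v[lo + (hi - lo) / 2]);  // middle element as pivot
    const int pivot = v[lo];
    int i = lo - 1, j = hi;
    while (true) {
        do { ++i; } while (v[i] < pivot);     // sweep up from the left
        do { --j; } while (v[j] > pivot);     // sweep down from the right
        if (i >= j) return j;
        std::swap(v[i], v[j]);
    }
}

void sort_impl(std::vector<int>& v, int lo, int hi, int depth) {
    if (hi - lo <= 1) return;
    if (depth == 0) {
        // Too many bad splits: finish this range with heapsort,
        // which is O(n log n) in the worst case.
        std::make_heap(v.begin() + lo, v.begin() + hi);
        std::sort_heap(v.begin() + lo, v.begin() + hi);
        return;
    }
    const int mid = hoare_partition(v, lo, hi);
    sort_impl(v, lo, mid + 1, depth - 1);
    sort_impl(v, mid + 1, hi, depth - 1);
}

}  // namespace

void introsort(std::vector<int>& v) {
    if (v.size() < 2) return;
    const int depth = 2 * static_cast<int>(std::log2(v.size())) + 1;
    sort_impl(v, 0, static_cast<int>(v.size()), depth);
}
```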
At one point I worked at a place that had a production process that used several hundred IBM 3480 (if I remember correctly) tape cartridges. I would do an LSB radix sort afterwards to get them back in order.
I think back on it with a certain amount of nostalgia now, but it was a stupid way to do operations, even at the time. They had invested big in computers in the '70s and were still maintaining the same tech stack in the early '00s when I worked there.
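For anyone who hasn't seen it, the LSB/LSD radix sort mentioned above is the same trick in software: re-rack everything by the last digit, then the next digit, and so on, relying on each pass being stable. A rough sketch on integer keys:

```cpp
#include <array>
#include <cstddef>
#include <vector>

// LSD ("least significant digit first") radix sort on unsigned keys:
// a stable counting sort on the lowest byte, then the next byte, and so on.
// Same idea as re-racking numbered cartridges one digit at a time.
void lsd_radix_sort(std::vector<unsigned>& keys) {
    std::vector<unsigned> buf(keys.size());
    for (unsigned shift = 0; shift < 32; shift += 8) {
        std::array<std::size_t, 257> count{};  // histogram of this byte
        for (unsigned k : keys) ++count[((k >> shift) & 0xFF) + 1];
        for (std::size_t i = 1; i < count.size(); ++i) count[i] += count[i - 1];
        // Stable placement into the output buffer.
        for (unsigned k : keys) buf[count[(k >> shift) & 0xFF]++] = k;
        keys.swap(buf);
    }
}
```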
Merging stacks of papers by hand isn't the same as exchange-based computer mergesort: if the cost of inspecting and comparing items is negligible relative to actually moving them, the basic unit-cost operation is splitting a sequence into three possibly empty sequences and splicing the middle piece into an arbitrary position in another sequence. In some cases reversing the spliced sequence is free ("pancake" sorting).
This cost model is an important reason to merge bottom-up, from sorted subsequences, without arbitrary splitting that can add work but not reduce it.
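A software analogue of that cost model: merge two sorted piles by splicing maximal runs, so the cost is counted in splices (one physical move of a chunk of papers each) rather than in individual items. A sketch using linked lists, with names of my own choosing:

```cpp
#include <list>

// Merge two sorted "piles" by splicing maximal runs. std::list doesn't
// literally give constant-time range splices, but the shape of the
// algorithm is the point: a handful of splices instead of n item moves.
std::list<int> merge_by_runs(std::list<int> a, std::list<int> b) {
    std::list<int> out;
    while (!a.empty() && !b.empty()) {
        // Take from whichever pile has the smaller front element.
        std::list<int>& src = (a.front() <= b.front()) ? a : b;
        std::list<int>& other = (&src == &a) ? b : a;
        // Inspecting/comparing is the "free" part: find the end of the
        // maximal prefix of src that still belongs before other.front().
        auto it = src.begin();
        while (it != src.end() && *it <= other.front()) ++it;
        // Moving is the unit-cost part: one splice for the whole run.
        out.splice(out.end(), src, src.begin(), it);
    }
    // Whatever remains is already sorted and goes on the end in one move.
    out.splice(out.end(), a);
    out.splice(out.end(), b);
    return out;
}
```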
I like to think so, but I'm not sure that's true. At least I don't think I've ever seen anyone divide a stack in half, then divide the left one in half, then divide the left one of that in half, etc., as if they're doing recursive mergesort. I feel like most people start from the first card and build upward, either with insertion sort or with something like iterative mergesort.
What I've done is: grading data structures exams (appropriately enough), I'd grab a handful of them, grade them, and then sort the little stack. Then I'd put it aside to deal with later.
Eventually I realized I was running out of desk space, so I started merging little stacks of around equal size.
So the base case wasn't 1 element, but a larger base case is conventional in actual implementations anyway. And the splitting wasn't quite recursive, I guess. But partial credit at least.
Only for reasonably small stacks. Insertions at random indices won't stay O(1) when you're dealing with 100k pages or more – i.e., when you can't pick them up in one movement.
I think most people would shortcut the "recursive task-spawning" part by just taking all the pages and spreading them out individually side-by-side on a work surface.
That's why I said it's like the iterative version of mergesort, which goes bottom-up. People don't divide things in half like in traditional recursive mergesort, is what I was saying.
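That bottom-up version, for reference: start with runs of length 1 and repeatedly merge adjacent runs of width 1, 2, 4, and so on; there's no recursive halving anywhere. A quick sketch:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Bottom-up (iterative) mergesort: treat the input as n sorted runs of
// length 1, then merge adjacent runs of width 1, 2, 4, ... into a buffer.
void bottom_up_mergesort(std::vector<int>& v) {
    const std::size_t n = v.size();
    std::vector<int> buf(n);
    for (std::size_t width = 1; width < n; width *= 2) {
        for (std::size_t lo = 0; lo < n; lo += 2 * width) {
            const std::size_t mid = std::min(lo + width, n);
            const std::size_t hi = std::min(lo + 2 * width, n);
            // Merge the adjacent sorted runs v[lo, mid) and v[mid, hi).
            std::merge(v.begin() + lo, v.begin() + mid,
                       v.begin() + mid, v.begin() + hi,
                       buf.begin() + lo);
        }
        v.swap(buf);  // merged data is now in v; buf gets overwritten next pass
    }
}
```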
When I taught, I would sort graded papers/tests using quicksort. Put all the a–l in one pile and m–z in the other, then split each pile into halves and manually sort the quarters. And since I knew what the sorted set looked like, finding pivot points was trivial.
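That's essentially a partition step with a pivot value you already know rather than one sampled from the data; something like this (illustrative sketch, names are mine):

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// One "grading" pass: everything filed under a-l goes to the front of the
// stack, m-z to the back, using the known pivot letter 'm'.
void split_roster(std::vector<std::string>& names) {
    std::partition(names.begin(), names.end(), [](const std::string& s) {
        return !s.empty() &&
               std::tolower(static_cast<unsigned char>(s[0])) < 'm';
    });
}
```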