"The limit for sorting speed on modern systems is not the comparison computations, it's the data movement; so the old established sorting algorithms are no longer the best. Sorting is fastest by getting the data to the parallel cores faster. mergesort is a good parallel sort with a min-heap data structure. You size the heaps to fit into the CPU cache."