Patience is a Virtue: Revisiting Merge and Sort on Modern Processors
The vast quantities of log-based data appearing in data centers has generated an interest in sorting almost-sorted datasets. We revisit the problem of sorting and merging data in main memory, and show that a long-forgotten technique called Patience Sort can, with some key modifications, be made competitive with today’s best comparison-based sorting techniques for both random and almost sorted data. Patience sort consists of two phases: the creation of sorted runs, and the merging of these runs. Through a combination of algorithmic and architectural innovations, we dramatically improve Patience sort for both random and almost-ordered data. Of particular interest is a new technique called ping-pong merge for merging sorted runs in main memory. Together, these innovations produce an extremely fast sorting technique that we call P3 Sort (for Ping-Pong Patience+ Sort), which is competitive with or better than the popular implementations of the fastest comparison-based sort techniques of today. For example, our implementation of P3 sort is around 20% faster than GNU Quicksort on random data, and 20% to 4x faster than Timsort for almost sorted data. Finally, we investigate replacement selection sort in the context of single-pass sorting of logs with bounded disorder, and leverage P3 sort to improve replacement selection. Experiments show that our proposal, P3 replacement selection, significantly improves performance, with speedups of 3x to 20x over classical replacement selection.