Skip to content

Counting Sorts

Kevin Lin edited this page Oct 17, 2025 · 2 revisions

Note

Objectives

  1. Explain the worst-case lower bound for comparison sorting.
  2. Describe counting sort, MSD radix sort, and LSD radix sort.
  3. Explain how the subsort is used in MSD and LSD radix sort.

Code: MSD.java, LSD.java

In practice, Java's system sorts like Timsort and dual-pivot quicksort have a linearithmic order of growth. Is it possible to sort in faster than worst case $\Theta(N \log N)$ time? In the best case, we know that sorting algorithms like insertion sort can run in linear time if we happen to have an array that is already-sorted. But can we design a sorting algorithm that runs in linear time in the worst case?

Sorting decision tree

How many comparisons do we need to make in order to know how to sort an array?

In merge sort, we relied on knowing that a single element subarray is already sorted with respect to itself:

  • Sorting an array with 1 element requires 0 comparisons.
  • Sorting an array with 2 elements requires 1 comparison.
  • Sorting an array with 3 elements requires either 2 or 3 comparisons depending on the questions.

We can draw a sorting decision tree to see exactly what questions we need to ask to determine the sorted order for 3 elements. Each leaf in the decision tree represents a possible sorted order for the elements, and each branch represents a choice of answering either "Yes" or "No" to each comparison question.

Sorting decision tree for 3 elements

This decision tree is not only a useful concept, but it can also be implemented as a program consisting of a hierarchy of nested if conditional statements. This decision tree represents the sorting network for 3 elements: a comparison sort that requires the absolute least number of comparisons to sort a given input array.

if (a < b)
    if      (b < c) return {a, b, c};
    else if (a < c) return {a, c, b};
    else            return {c, a, b};
else
    if      (a < c) return {b, a, c};
    else if (b < c) return {b, c, a};
    else            return {c, b, a};
If 3 elements have 6 permutations, how many permutations are there for 4 elements?

For each of the 6 permutations in a, b, c, we can insert the element d before, in-between, or after each element. For example, if we consider the permutation {a, b, c}, we can insert d in 4 different places: {d, a, b, c}, {a, d, b, c}, {a, b, d, c}, and {a, b, c, d}. Ultimately, we take the 6 permutations we had for 3 elements and multiply by 4 to get 24 total permutations for 4 elements. More generally, the number of permutations can be described using the factorial function.

If $N$ elements have $N!$ (factorial) potential permutations, and each potential permutation is a leaf in a balanced sorting decision tree, then the sorting network in the worst case needs about $\log_2 N!$ comparisons to determine the correct sorting of the elements. Stirling's approximation can be used to show that $\log_2 N! \in \Theta(N \log N)$.

Tip

Since the optimal sorting network for a problem with $N$ elements requires $\Theta(N \log N)$ comparisons in the worst case, it's not possible to design a comparison sorting algorithm that takes linear time in the worst case.

Definitions

The worst case lower bound on comparison sorting algorithms assumed we used comparison operators like <, >, or compareTo to determine the relative order of pairs of elements. What if we changed the rules of the 'game' of sorting to no longer rely on comparison operators? How would that even work?

A counting sort sorts an array of elements by relying on enumeration instead of comparison. Elements are considered comparable if they can be compared with one another. Elements are considered enumerable if they can be listed-out in a sequence from first to last.

  1. Create a count array that will be used to store the number of times each element occurs in the input array.
  2. Iterate over the input array, updating the count array to reflect the occurrence of each element.
  3. Iterate over the count array, unraveling the number of times each element occurs back into the input array.

The idea of counting sort is much like the way that people sort a deck of cards by their numeric value. Instead of comparing pairs of cards against each other, one approach is to take the deck of cards and, for each card, put it in a pile with other cards of the same value. Then, at the end, collect all the cards from the smallest value to the largest value.

Important

Open the VisuAlgo module on sorting algorithms. Press Esc to exit the e-Lecture Mode, and choose COU from the top navigation bar to switch to counting sort. Run the sorting algorithm using Sort from the bottom left menu.

Then, review the Key-Indexed Counting Demo.

Radix sorts

Counting sorts work great with small integer values when we know the range between the smallest integer and the largest integer. But many other data types, like strings, are more difficult to enumerate because we can always make a more complicated string. For example, suppose we have a string "a" and we decide to put it at index 0 in the count array. Where would the string "b" belong in the count array? We know "b" comes after "a", but how far after "a"? We might run into strings like "aa", "aaa", "aaaa", etc. and not know how many spaces to reserve for these elements.

To address this issue, we can take inspiration from tries. Just as tries divided a string into its constituent letters and processed each letter individually, we can do the same and apply counting sort on each letter. Radix sorts represent a category of counting sorts that divide strings (or string-like objects) into individual subunits that can be separately counting-sorted. We'll focus on two types of radix sorts:

  • Most-Significant Digit (MSD) radix sort starts from the leftmost character and proceeds to the right, recursively counting sort each next-character separately.
  • Least-Significant Digit (LSD) radix sort starts from the rightmost character and proceeds to the left, iteratively counting sort all elements each time.

Important

Open the VisuAlgo module on sorting algorithms. Press Esc to exit the e-Lecture Mode, and choose RAD from the top navigation bar to switch to an LSD radix sort. Run the sorting algorithm using Sort from the bottom left menu.

If a trie is like MSD radix sort, what sort would a ternary search tree be like?

A ternary search tree combines elements of tries with binary search trees. We learned that quicksort is like a binary search tree. Combining these two ideas gets us 3-way radix quicksort, which is a radix sort that works like a ternary search tree.

  1. Select a pivot element for the current index into the strings.
  2. Partition the array into elements less than, equal to, and greater than the pivot element.
  3. Recursively sort each of the less than, equal to, and greater than subarrays.

Clone this wiki locally