44D. Analysis of Heap Sort

Let's look at how long it takes to run Heap Sort on an array of size n, in the worst case.


Overcounting the reheapDownMax calls

The loop performs reheapDownMax a total of n−1 times. Although the heap keeps getting smaller and smaller, imagine that its size is always n; that will only cause us to overcount the total cost.

It takes time O(log(n)) to perform a reheapDownMax call on a heap with n things in it. Since there are fewer than n calls to reheapDownMax, the total time for the loop is O(nlog(n)).
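As a rough check on this bound, here is a minimal Python sketch of the sorting loop that counts swaps. The function names and the array layout (children of index i at 2i+1 and 2i+2) are assumptions made for illustration; they are not taken from the course's own reheapDownMax code.

```python
import math
import random

def reheap_down_max(a, i, size):
    """Sift a[i] down within the max-heap a[0:size]; return the number of swaps."""
    swaps = 0
    while True:
        left = 2 * i + 1
        right = left + 1
        largest = i
        if left < size and a[left] > a[largest]:
            largest = left
        if right < size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return swaps
        a[i], a[largest] = a[largest], a[i]
        i = largest
        swaps += 1

def heap_sort_loop_stats(a):
    """Heap-sort a in place.

    Returns (calls, worst), where calls is the number of reheap_down_max
    calls made by the sorting loop and worst is the largest number of
    swaps done by any single one of those calls.
    """
    n = len(a)
    # buildHeap step: sift down every non-leaf node, last one first.
    for i in range(n // 2 - 1, -1, -1):
        reheap_down_max(a, i, n)
    calls = 0
    worst = 0
    # Sorting loop: move the maximum to the end, then repair the smaller heap.
    for size in range(n - 1, 0, -1):
        a[0], a[size] = a[size], a[0]
        worst = max(worst, reheap_down_max(a, 0, size))
        calls += 1
    return calls, worst

if __name__ == "__main__":
    data = [random.randrange(10**6) for _ in range(1000)]
    calls, worst = heap_sort_loop_stats(data)
    print(calls, worst, math.floor(math.log2(len(data))))
```

For an array of size n, the loop makes n−1 calls, and no single call can do more swaps than the height of the heap, which is at most floor(log(n)) with the logarithm taken to base 2.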


Undercounting the reheapDownMax calls

It is easy to see that we have not overcounted by much. Half of the reheapDownMax calls are done on heaps of size at least n/2. In the worst case, each of those calls costs about log(n/2) = log(n) − 1, where the logarithm is to base 2. So the first n/2 iterations of the loop take time about (n/2)(log(n) − 1), which is Θ(nlog(n)). That is a lower bound on the worst-case cost of the loop, and O(nlog(n)) is an upper bound, so the total worst-case time for the loop is Θ(nlog(n)).
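To see how close the two estimates are, the following small Python sketch compares the undercount (n/2)(log(n) − 1) with the overcount nlog(n). The function name is made up for this illustration, and base-2 logarithms are assumed.

```python
import math

def undercount_ratio(n):
    """Ratio of the undercount (n/2)(log2(n) - 1) to the overcount n*log2(n)."""
    lower = (n / 2) * (math.log2(n) - 1)
    upper = n * math.log2(n)
    return lower / upper

if __name__ == "__main__":
    for n in (2**10, 2**20):
        print(n, undercount_ratio(n))
```

The ratio simplifies to (log(n) − 1)/(2log(n)), so it is 0.45 for n = 2^10 and 0.475 for n = 2^20, and it approaches 1/2 as n grows. The two estimates differ by only a constant factor, which is what the Θ(nlog(n)) claim requires.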


BuildHeap

That only leaves the buildHeap step. For simplicity, look at a heap where every level is full, and count the number of swaps that are done by buildHeap. About half of the nodes are in the bottom level, and they cost nothing. About 1/4 of the nodes are in the next level up from the bottom, and they cost at most 1 because there is only one level below them, and they can't move below the bottom level. About 1/8 of the nodes are at the next level up, and they cost no more than 2, since that is as far as they can move down, and so on. Adding everything up, the total cost is at most:

 1(n/4) + 2(n/8) + 3(n/16) + 4(n/32) + ...

   = n(1/4 + 2/8 + 3/16 + 4/32 + ...).

The sum (1/4 + 2/8 + 3/16 + 4/32 + ...) is less than 1 no matter how many terms are included. (If you carry it out to infinity, it is exactly 1.) So buildHeap does fewer than n swaps, and the time required to do the initial buildHeap call is O(n).
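Both claims can be checked with a short Python sketch: one function counts the swaps done by a buildHeap on a full heap, and another computes partial sums of the series exactly. The function names and the array layout (children of index i at 2i+1 and 2i+2) are assumptions for illustration, not the course's own code.

```python
from fractions import Fraction

def reheap_down_max(a, i, size):
    """Sift a[i] down within the max-heap a[0:size]; return the number of swaps."""
    swaps = 0
    while True:
        left = 2 * i + 1
        right = left + 1
        largest = i
        if left < size and a[left] > a[largest]:
            largest = left
        if right < size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return swaps
        a[i], a[largest] = a[largest], a[i]
        i = largest
        swaps += 1

def build_heap_swaps(a):
    """Turn a into a max-heap in place; return the total number of swaps."""
    n = len(a)
    swaps = 0
    # Leaves cost nothing, so start at the last non-leaf node.
    for i in range(n // 2 - 1, -1, -1):
        swaps += reheap_down_max(a, i, n)
    return swaps

def partial_sum(terms):
    """Exact value of 1/4 + 2/8 + 3/16 + ... taken to the given number of terms."""
    return sum(Fraction(k, 2 ** (k + 1)) for k in range(1, terms + 1))

if __name__ == "__main__":
    n = 2**10 - 1                  # every level of the heap is full
    a = list(range(n))             # ascending input, far from a max-heap
    print(build_heap_swaps(a), n)  # swap count next to n
    print(float(partial_sum(20)))  # close to 1, but still below it
```

The partial sum to K terms works out to 1 − (K+2)/2^(K+1), which is why it stays below 1 and tends to 1 as K grows.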