41A. Sorting


Insertion into a sorted list

We have seen how to insert a value into a sorted linked list in such a way that the list remains sorted. Here is the code that we derived.

  // insert(x,L) inserts x into list L, changing L.
  //
  // L must be in ascending order when insert is called,
  // and x is inserted in the correct place so that
  // L is sorted after the insertion.

  void insert(const int x, List& L)
  {
    if(L == NULL || x <= L->head)
    {
      L = cons(x, L);
    }
    else
    {
      insert(x, L->tail);
    }
  }

That builds a new cell for the inserted item (as you can see from the call to cons). But for our purposes here, we want a function that is given a cell containing x and just inserts that cell into L, so that insert does not create any new cells. Here is the modified function.

  // insertCell(C,L) inserts cell C into list L, changing L.
  //
  // L must be in ascending order when insertCell is called.
  // Cell C is inserted, preserving its head value,
  // in the correct place so that L is sorted after
  // the insertion.

  void insertCell(ListCell* C, List& L)
  {
    if(L == NULL || C->head <= L->head)
    {
      C->tail = L;
      L = C;
    }
    else
    {
      insertCell(C, L->tail);
    }
  }
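To see that insertCell relinks an existing cell rather than allocating a new one, here is a minimal sketch. The ListCell definition with its constructor and the helper demoInsertCell are assumptions made for illustration (this section does not show the cell type), and the assert on C is an addition of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed cell type; the text uses head and tail but does not show the definition.
struct ListCell
{
  int       head;
  ListCell* tail;

  ListCell(int h, ListCell* t) : head(h), tail(t) {}
};

typedef ListCell* List;

// insertCell as in the text, plus a check (added for this sketch) that C is a real cell.
void insertCell(ListCell* C, List& L)
{
  assert(C != NULL);
  if(L == NULL || C->head <= L->head)
  {
    C->tail = L;
    L = C;
  }
  else
  {
    insertCell(C, L->tail);
  }
}

// demoInsertCell() (hypothetical) builds the sorted list [2, 5, 9], links an
// existing cell holding 7 into it, and returns the members in list order.
std::vector<int> demoInsertCell()
{
  List      L = new ListCell(2, new ListCell(5, new ListCell(9, NULL)));
  ListCell* C = new ListCell(7, NULL);

  insertCell(C, L);

  std::vector<int> members;
  while(L != NULL)
  {
    ListCell* next = L->tail;
    members.push_back(L->head);
    delete L;               // the demo owns every cell, including C
    L = next;
  }
  return members;
}
```

Calling demoInsertCell() yields the members 2, 5, 7, 9. Only four cells are ever allocated, so the cell holding 7 is the same cell C that was passed in.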

Sorting a linked list

Let's use insertCell to write a function that sorts a linked list without presuming anything about its order at the start. Here are thoughts about that.

  1. An empty list is in ascending order in a trivial way. Do nothing to it.

  2. To sort a nonempty list L, sort the tail of L. Then insert the head of L into the sorted tail.

That leads to the following sorting function, called Insertion Sort.

  // InsertionSort(L) sorts list L into ascending order,
  // changing list L.

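  void InsertionSort(List& L)
  {
    if(L != NULL)
    {
      List rest = L->tail;     // hold the tail separately from the first cell
      InsertionSort(rest);
      insertCell(L, rest);     // put the first cell into the sorted tail
      L = rest;                // rest is now the whole sorted list
    }
  }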

They don't come much simpler than that.

Notice that pointer L can be viewed as a pointer to a linked list or, as in the call to insertCell, as a pointer to the first cell in the linked list. Both viewpoints are valid.
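The two thoughts above can be tried out end to end with the following sketch. The ListCell definition and the helpers fromVector and sortedCopy are assumptions made for illustration. This version of InsertionSort holds the tail in a local variable named rest, so that insertCell never modifies the tail field of the very cell it is inserting, and then makes L refer to the result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed cell type; the text uses head and tail but does not show the definition.
struct ListCell
{
  int       head;
  ListCell* tail;

  ListCell(int h, ListCell* t) : head(h), tail(t) {}
};

typedef ListCell* List;

// insertCell as in the text, plus a check (added for this sketch) that C is a real cell.
void insertCell(ListCell* C, List& L)
{
  assert(C != NULL);
  if(L == NULL || C->head <= L->head)
  {
    C->tail = L;
    L = C;
  }
  else
  {
    insertCell(C, L->tail);
  }
}

// InsertionSort(L) sorts list L into ascending order, changing L.
void InsertionSort(List& L)
{
  if(L != NULL)
  {
    List rest = L->tail;     // the tail, held apart from the first cell
    InsertionSort(rest);     // thought 2: sort the tail
    insertCell(L, rest);     // then insert the first cell into it
    L = rest;
  }
}

// fromVector(v) (hypothetical) builds a list holding the members of v, in order.
List fromVector(const std::vector<int>& v)
{
  List L = NULL;
  for(std::size_t i = v.size(); i > 0; i--)
  {
    L = new ListCell(v[i-1], L);
  }
  return L;
}

// sortedCopy(v) (hypothetical) sorts a list built from v and returns its
// members in list order, freeing the cells as it goes.
std::vector<int> sortedCopy(const std::vector<int>& v)
{
  List L = fromVector(v);
  InsertionSort(L);

  std::vector<int> members;
  while(L != NULL)
  {
    ListCell* next = L->tail;
    members.push_back(L->head);
    delete L;
    L = next;
  }
  return members;
}
```

For example, sortedCopy applied to 5, 2, 9, 2, 7 gives 2, 2, 5, 7, 9. No cells are created during the sort itself; the cells built by fromVector are simply relinked.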


Analysis of Insertion Sort

It is easy to see that, if list L has n members, then insertCell(C, L) takes time Θ(n) in the worst case. It takes a constant amount of time to look at each cell, and the worst case occurs when the new cell needs to be inserted at the end of L.

What about InsertionSort? If list L has n members, then InsertionSort will do n calls to insertCell, one for each member of L. (Each cell clearly needs to be inserted.)

Let's charge insertCell(C, L) one unit for each cell that it might need to handle: every cell in L, plus cell C itself. The last cell in L will be inserted into an empty list, at a cost of 1 unit. The next-to-last cell in L will be inserted into a list of length 1, at a cost of at most 2 units. A little thought shows that the worst-case insertCell costs, from the back of L to the front of L, are: 1, 2, 3, 4, …, n.

All of those insertCell calls must be done. Adding up their costs gives 1 + 2 + 3 + … + n = n(n+1)/2 ≈ n²/2. That is, the cost of InsertionSort(L) is Θ(n²) in the worst case.

That is quite slow. When n = 10,000, n² is 100,000,000, or a hundred million. That seems like a high cost.
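The sum n(n+1)/2 can be checked by counting. The sketch below charges one unit per call to insertCell, which matches the charging scheme above: one unit for each cell of L that is passed over, plus one for placing C. The counter units, the helper sortUnits, and the ListCell definition are assumptions made for illustration, and InsertionSort holds the tail in a local variable while it works.

```cpp
#include <cassert>
#include <cstddef>

// Assumed cell type; the text uses head and tail but does not show the definition.
struct ListCell
{
  int       head;
  ListCell* tail;

  ListCell(int h, ListCell* t) : head(h), tail(t) {}
};

typedef ListCell* List;

// units (hypothetical) counts the calls made to insertCell.
static long units = 0;

void insertCell(ListCell* C, List& L)
{
  assert(C != NULL);
  units++;                   // one unit: either pass over a cell of L or place C
  if(L == NULL || C->head <= L->head)
  {
    C->tail = L;
    L = C;
  }
  else
  {
    insertCell(C, L->tail);
  }
}

void InsertionSort(List& L)
{
  if(L != NULL)
  {
    List rest = L->tail;
    InsertionSort(rest);
    insertCell(L, rest);
    L = rest;
  }
}

// sortUnits(n, descending) (hypothetical) sorts a list of the integers 1..n,
// initially in descending order if descending is true and in ascending order
// otherwise, and returns the number of units charged.
long sortUnits(int n, bool descending)
{
  List L = NULL;
  for(int i = 1; i <= n; i++)
  {
    // Each new cell goes on the front, so pushing 1, 2, ..., n gives a
    // descending list and pushing n, ..., 2, 1 gives an ascending list.
    L = new ListCell(descending ? i : n + 1 - i, L);
  }

  units = 0;
  InsertionSort(L);
  long charged = units;

  while(L != NULL)
  {
    ListCell* next = L->tail;
    delete L;
    L = next;
  }
  return charged;
}
```

A list that starts in descending order forces every cell to be inserted at the end of the sorted tail, so sortUnits(100, true) is 100·101/2 = 5050. A list that is already in ascending order costs only one unit per cell, so sortUnits(100, false) is 100. The quadratic cost is a worst-case figure.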