27. Equations and Algorithms on Lists


Defining functions by equations

Let's think about writing a definition of function length(L) that returns the length of list L. For example, length([2, 4, 6]) = 3 and length([ ]) = 0. It suffices to say how to compute the length of an empty list and how to compute the length of a nonempty list. But instead of thinking about an algorithm right now, let's focus on facts.

Here are two important observations.

(length.1) length([]) = 0
(length.2) length(L)  = 1 + length(tail(L))  (when L is not empty)

So we have two facts (equations) about the length function. But what we really want is an algorithm to compute the length of a list. Is it reasonable to say that those two equations define an algorithm? Let's try to compute length([2, 4, 6]) using them. The only thing we do is replace expressions with equal expressions, using facts (length.1) and (length.2) and a little arithmetic.

  length([2,4,6])
    = 1 + length([4,6])           by (length.2), since tail([2,4,6]) = [4,6]

    = 1 + (1 + length([6]))       by (length.2), since tail([4,6]) = [6]

    = 1 + (1 + (1 + length([])))  by (length.2), since tail([6]) = []

    = 1 + (1 + (1 + 0))           by (length.1)

    = 3                           by arithmetic


Converting to C++

Equations (length.1) and (length.2) are easy to convert into C++. Since there are two equations, there are two cases.

  int length(ConstList L)
  {
    if(isEmpty(L)) 
    {
       return 0;                   // by (length.1)
    }
    else 
    {
       return 1 + length(tail(L));  // by (length.2)
    }
  }
It is also fine to use C++ notation directly.
  int length(ConstList L)
  {
    if(L == NULL) 
    {
       return 0;
    }
    else 
    {
       return 1 + length(L->tail);
    }
  }
Use whichever form you prefer.
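
To try either form, the list type itself is needed. The following is a minimal sketch that assumes a List is a pointer to a ListCell with fields head and tail, and that ConstList is the pointer-to-const version. The definitions of ListCell, cons, isEmpty and tail shown here are assumptions made for illustration; the list library that you are using may define them differently.

```cpp
#include <cstddef>   // NULL

// Assumed list representation (an illustration, not necessarily
// the representation in your list library).
struct ListCell
{
  int       head;   // first member of the list
  ListCell* tail;   // the rest of the list (NULL if there is no more)

  ListCell(int h, ListCell* t) : head(h), tail(t) {}
};

typedef ListCell*       List;
typedef const ListCell* ConstList;

// cons(h, t) builds list h:t. Cells are never freed in this sketch.
List cons(int h, List t)
{
  return new ListCell(h, t);
}

bool isEmpty(ConstList L)
{
  return L == NULL;
}

ConstList tail(ConstList L)
{
  return L->tail;
}

int length(ConstList L)
{
  if(isEmpty(L))
  {
    return 0;                    // by (length.1)
  }
  else
  {
    return 1 + length(tail(L));  // by (length.2)
  }
}
```

Under those assumptions, length(cons(2, cons(4, cons(6, NULL)))) yields 3, which agrees with the hand evaluation of length([2,4,6]).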


Example: Membership test

Let's use equations to define function member(x, L), which is true if x belongs to list L. For example, member(3, [2, 6, 3, 5]) = true and member(4, [2, 6, 3, 5]) = false.

Here are some equations. Operator 'or' is the logical or operator, the same as || in C++.

(member.1) member(x, []) = false
(member.2) member(x, L)  = head(L) == x  or  member(x, tail(L))  (when L is not empty)
The first equation should be clear. The empty list does not have any members. Let's try some examples of the second equation, (member.2). According to that equation,

member(6, 2:[4,6,8]) 
  = 6 == 2  or  member(6, [4,6,8])
  = false  or  true
  = true
(Remember the rule for hand-simulating recursive algorithms. Recursive calls are assumed to work correctly. You can do the same for checking equations.)


member(2, 2:[4,6,8]) 
  = 2 == 2  or  member(2, [4,6,8])
  = true    or  member(2, [4,6,8])
  = true
Notice that we do not need to evaluate member(2, [4, 6, 8]) because (true or x) is true regardless of what x is. This is a search problem, and the algorithm does not need to look at the entire list once it finds what it is looking for.


member(5, 2:[4,6,8]) 
  = 5 == 2  or  member(5, [4,6,8])
  = false   or  false
  = false

Let's convert Equations (member.1) and (member.2) into a definition of member in C++.

  bool member(const int x, ConstList L)
  {
    if(L == NULL)
    {
      return false;
    }
    else
    {
      return x == L->head || member(x, L->tail);
    }
  }
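
The remark that member stops as soon as it finds x can be checked by counting comparisons. The following sketch does that. The ListCell representation, cons, and the global counter comparisons are assumptions added here only for the experiment; the counter is not part of member proper. C++ evaluates || left to right and skips the right-hand operand when the left-hand one is true, so the recursive call is not made after a match.

```cpp
#include <cstddef>   // NULL

// Assumed list representation, for illustration only.
struct ListCell
{
  int       head;
  ListCell* tail;

  ListCell(int h, ListCell* t) : head(h), tail(t) {}
};

typedef ListCell*       List;
typedef const ListCell* ConstList;

// cons(h, t) builds list h:t. Cells are never freed in this sketch.
List cons(int h, List t)
{
  return new ListCell(h, t);
}

// Hypothetical counter: the number of list members that have been
// compared to x since the counter was last set to 0.
int comparisons = 0;

bool member(const int x, ConstList L)
{
  if(L == NULL)
  {
    return false;                               // by (member.1)
  }
  else
  {
    comparisons++;
    return x == L->head || member(x, L->tail);  // by (member.2)
  }
}
```

For list [2,4,6,8], member(2, L) makes one comparison, member(6, L) makes three, and member(5, L) makes four, since an unsuccessful search must look at every member.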

Notice that, for both length and member, we wrote one equation for an empty list and one for a nonempty list. That is typical. In some cases, you need (or prefer) more equations, as the next example illustrates.


Example: Prefix test

Let's write equations that define function isPrefix(A, B), which is true if list A is a prefix of list B. For example, isPrefix([1,3], [1,3,5]) is true. Every list is a prefix of itself, so isPrefix([2, 4, 6], [2, 4, 6]) is true, and the empty list is a prefix of every list. Here are equations for isPrefix.

(isPrefix.1) isPrefix([], B) = true
(isPrefix.2) isPrefix(A, []) = false
                                (when A is not [])
(isPrefix.3) isPrefix(A, B)  = head(A) == head(B) and 
                               isPrefix(tail(A), tail(B))
                                (when A ≠ [] and B ≠ [])
The first two equations should be evident. Equation (isPrefix.1) says that the empty list is a prefix of every list. Equation (isPrefix.2) says that a nonempty list is not a prefix of an empty list. We assume that the equations are tried in order, so (isPrefix.2) is only relevant when A is not []. The third equation is easiest to understand through an example.
  isPrefix([3,5], [3,5,7,9])
    = 3 == 3  and  isPrefix([5], [5,7,9])   by (isPrefix.3) 
    = true  and  isPrefix([5], [5,7,9])
    = isPrefix([5], [5,7,9])
    = 5 == 5  and  isPrefix([], [7,9])      by (isPrefix.3)
    = true  and  isPrefix([], [7,9])
    = isPrefix([], [7,9])
    = true                                  by (isPrefix.1)

Here is a C++ definition of isPrefix.

  bool isPrefix(ConstList A, ConstList B)
  {
    if(A == NULL)
    {
      return true;
    }
    else if(B == NULL)
    {
      return false;
    }
    else
    {
      return A->head == B->head && isPrefix(A->tail, B->tail);
    }
  }
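
To exercise all three equations, isPrefix can be compiled together with a list type. The sketch below assumes the same ListCell representation and cons function as the rest of this chapter's code appears to use; those two definitions are assumptions supplied here so that the example stands alone.

```cpp
#include <cstddef>   // NULL

// Assumed list representation, for illustration only.
struct ListCell
{
  int       head;
  ListCell* tail;

  ListCell(int h, ListCell* t) : head(h), tail(t) {}
};

typedef ListCell*       List;
typedef const ListCell* ConstList;

// cons(h, t) builds list h:t. Cells are never freed in this sketch.
List cons(int h, List t)
{
  return new ListCell(h, t);
}

bool isPrefix(ConstList A, ConstList B)
{
  if(A == NULL)
  {
    return true;                     // by (isPrefix.1)
  }
  else if(B == NULL)
  {
    return false;                    // by (isPrefix.2)
  }
  else
  {
    // by (isPrefix.3): the heads must match and the rest of A
    // must be a prefix of the rest of B.
    return A->head == B->head && isPrefix(A->tail, B->tail);
  }
}
```

With A = [3,5] and B = [3,5,7,9], isPrefix(A, B) is true and isPrefix(B, A) is false. The second call reaches (isPrefix.2) after two matching heads, when the second list runs out first.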

Example: Concatenation

The concatenation function cat(A, B) glues two lists A and B together into a single list. For example,

(cat-example.1) cat([2,5,7], [3,6]) = [2,5,7,3,6].
Two equations should be obvious.
(cat.1) cat([], B) = B
(cat.2) cat(A, []) = A
All that is left is the case where A and B are both nonempty. Let's think about that case, and concentrate on how to find the head and the tail of the answer.

  1. The head of cat(A, B) is the same as the head of A. That should be evident from (cat-example.1), where head(A) = 2 and head(cat(A, B)) = 2.

  2. Look again at (cat-example.1). The tail of result [2, 5, 7, 3, 6] is [5, 7, 3, 6], which is equal to cat([5, 7], [3, 6]). That is, the tail of cat(A, B) is cat(tail(A), B).

  3. Remember that h:t is the list whose head is h and whose tail is t. If we know what h and t are, we can build list h:t. Putting the above observations to work,

    (cat.3) cat(A, B) = head(A):cat(tail(A), B)
    
    when A is not empty.

Notice that Equation (cat.3) does not require list B to be nonempty. For example, cat([2, 3, 4], [ ]) = 2:cat([3, 4], [ ]). Equation (cat.1) tells how to compute cat(A, B) when A is empty and equation (cat.3) tells how to compute cat(A, B) when A is not empty. That covers all possibilities, so there is no need for equation (cat.2). That leads to the following equations for cat.

(cat.1) cat([], B) = B
(cat.3) cat(A, B)  = head(A):cat(tail(A), B)  (when A is not empty)
Let's do a hand simulation of cat([2, 4], [6, 8]).
  cat([2,4], [6,8]) 
    = 2:cat([4], [6,8])         by (cat.3)
    = 2:(4:cat([], [6,8]))      by (cat.3)
    = 2:(4:[6,8])               by (cat.1)
    = 2:[4,6,8]                 since 4:[6,8] = [4,6,8]
    = [2,4,6,8]                 since 2:[4,6,8] = [2,4,6,8]

Converting (cat.1) and (cat.3) to C++ is straightforward, except for one catch, remarked on just after the definition.

  List cat(ConstList A, List B)
  {
    if(A == NULL)
    {
      return B;
    }
    else
    {
      return cons(A->head, cat(A->tail, B));
    }
  }

Notice that parameter B has type List, not ConstList. If the return type is to be List, then that is necessary. If B has type ConstList, then line

  return B;
requires conversion of B from ConstList to List, which is not allowed.


Memory sharing

Notice that cat(NULL, B) returns B. But really, a List is a pointer to a ListCell. That means that pointer B ends up in two different lists: B and the result of cat(A, B). If C = cat(A, B), then the cells of list B are also the last cells of list C.

As long as you don't change lists, memory sharing does not cause problems, and it can greatly reduce both time and memory utilization. But we will shortly look at functions that make changes to lists. That does not get along well with memory sharing. If C = cat(A, B) and you change a member of list B from 3 to 5, then list C is changed as well, and that can cause confusion in a program.
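
Memory sharing can be observed directly by comparing pointers. The following sketch assumes the ListCell representation and cons function shown; both are assumptions made for illustration, and demoSharing is a hypothetical function written only for this experiment. It uses the lists from (cat-example.1).

```cpp
#include <cassert>
#include <cstddef>   // NULL

// Assumed list representation, for illustration only.
struct ListCell
{
  int       head;
  ListCell* tail;

  ListCell(int h, ListCell* t) : head(h), tail(t) {}
};

typedef ListCell*       List;
typedef const ListCell* ConstList;

// cons(h, t) builds list h:t. Cells are never freed in this sketch.
List cons(int h, List t)
{
  return new ListCell(h, t);
}

List cat(ConstList A, List B)
{
  if(A == NULL)
  {
    return B;                               // by (cat.1)
  }
  else
  {
    return cons(A->head, cat(A->tail, B));  // by (cat.3)
  }
}

// Hypothetical demonstration that cat(A, B) shares memory with B.
void demoSharing()
{
  List A = cons(2, cons(5, cons(7, NULL)));
  List B = cons(3, cons(6, NULL));
  List C = cat(A, B);                  // C = [2,5,7,3,6]

  // The members of A are copied into new cells, so C is not A.
  assert(C != A);

  // The fourth cell of C is the first cell of B itself, not a copy.
  assert(C->tail->tail->tail == B);

  // Changing B therefore changes C as well.
  B->head = 5;
  assert(C->tail->tail->tail->head == 5);
}
```

After demoSharing changes the head of B from 3 to 5, list C is [2,5,7,5,6] even though no statement mentions C. That is the kind of surprise that memory sharing can cause once lists are modified.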


Example: Select some of the members

Let's write equations that define function evens(L), which yields a list of all members of list L that are even, preserving their order. For example, evens([2, 5, 6, 7, 9, 10]) = [2, 6, 10]. First, an obvious equation.

(evens.1) evens([]) = []
Now suppose that L starts with an even number. Then that even number will be the first value in the result list, and it will be followed by all even numbers in tail(L).
(evens.2) evens(L) = head(L):evens(tail(L))  (when head(L) is even)
Finally, if L does not begin with an even number, just ignore that number.
(evens.3) evens(L) = evens(tail(L))  (when head(L) is odd)
Putting those three equations together defines an algorithm for evens(L).

  List evens(ConstList L)
  {
    if(L == NULL)
    {
      return NULL;
    }
    else
    {
      int       h = L->head;
      ConstList t = L->tail;
      if(h % 2 == 0)
      {
        return cons(h, evens(t));
      }
      else
      {
        return evens(t);
      }
    }
  }
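
Here is a self-contained sketch for trying evens on the example above. The ListCell representation and cons function are assumptions supplied for illustration. Notice that evens builds every cell of its result with cons and ends the result with NULL, so its result shares no memory with L.

```cpp
#include <cstddef>   // NULL

// Assumed list representation, for illustration only.
struct ListCell
{
  int       head;
  ListCell* tail;

  ListCell(int h, ListCell* t) : head(h), tail(t) {}
};

typedef ListCell*       List;
typedef const ListCell* ConstList;

// cons(h, t) builds list h:t. Cells are never freed in this sketch.
List cons(int h, List t)
{
  return new ListCell(h, t);
}

List evens(ConstList L)
{
  if(L == NULL)
  {
    return NULL;                 // by (evens.1)
  }
  else
  {
    int       h = L->head;
    ConstList t = L->tail;
    if(h % 2 == 0)
    {
      return cons(h, evens(t));  // by (evens.2)
    }
    else
    {
      return evens(t);           // by (evens.3)
    }
  }
}
```

For L = [2, 5, 6, 7, 9, 10], the result of evens(L) has members 2, 6 and 10, in that order, followed by NULL.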

Exercises

  1. The following equation about cat is false. Give a counterexample that shows it is wrong. Evaluate the two sides for your counterexample and show that they are not equal.

      cat(h:t, u:v) = h:(u:(cat(t, v)))
    

  2. The following equation about isPrefix is false. Give a counterexample that shows it is wrong. Evaluate the two sides for your counterexample and show that they are not equal.

      isPrefix(h:t, L) = isPrefix(t, L)
    

  3. Using equations (length.1) and (length.2), show an evaluation of length([6,5,4,3]) by only replacing expressions by equal expressions.

  4. Using equations (cat.1) and (cat.3), show an evaluation of cat([1,2,3], [4,5,6]) by only replacing expressions by equal expressions.

  5. Using equations (isPrefix.1–isPrefix.3), show an evaluation of isPrefix([2,3,4], [2,4,3]) by only replacing expressions by equal expressions.

  6. Using equations (isPrefix.1–isPrefix.3), show an evaluation of isPrefix([2,3], [2]) by only replacing expressions by equal expressions.

  7. Write equations for function sum(L), which yields the sum of all members of list L. For example, sum([2,3,4]) = 2 + 3 + 4 = 9. The sum of an empty list is 0. Make sure that your equations contain enough information to determine the value of sum(L) for any list of integers L.

  8. Using your equations for sum(L) from the preceding question, do an evaluation of sum([3, 5, 7]) by only replacing expressions by equal expressions.

  9. Following your equations, write a C++ definition of sum(L).

  10. Write equations for function smallest(L), which yields the smallest member of nonempty list L. For example, smallest([2, 3, 4]) = 2 and smallest([6, 4, 8, 7]) = 4. Your equations should not try to define smallest([ ]), since the list is required to be nonempty. You can use function min(x, y) to compute the smaller of two integers x and y.

  11. Write a definition of smallest based on your equations.

  12. Write equations for function prefix(L, n), which yields the length n prefix of list L. For example, prefix([2, 4, 6, 8, 10], 3) = [2, 4, 6]. If n is larger than the length of L then prefix(L, n) should return L. For example, prefix([2, 4, 6, 8, 10], 50) = [2, 4, 6, 8, 10].

  13. Convert your equations for prefix to a C++ definition of prefix(L, n).