15. Recursion


Recursion

Recall that, whenever a function is called, it gets a new frame in the run-time stack. So each call has its own variables.

A consequence is that a function can call itself. In effect, it calls a copy of itself, with a new frame. A function that calls itself is said to be recursive, or to use recursion.


Recursion is a substitute for a loop

Do not try to use both recursion and a loop in the same function until you are an expert. Use one or the other. Every job that can be done with a loop can also be done with recursion and every job that can be done with recursion can be done with a loop. But one is often much easier than the other.

You will find that loops work well for some problems, but they are somewhat inflexible, and require you to do complicated workarounds for some problems. Recursion is more flexible and often gives you short and easy solutions to problems that are difficult to solve with a loop.
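To make the comparison concrete, here is a small sketch that does the same job both ways. The problem (counting the decimal digits of n) and the names numDigitsLoop and numDigits are made up for illustration; they do not come from the text above.

```cpp
#include <cassert>

// Hypothetical example: the number of decimal digits in n.
//
// Requirement: n >= 0.

// Loop version: changes the variables count and m as it goes.
int numDigitsLoop(const int n)
{
  assert(n >= 0);
  int count = 1;
  int m = n;
  while(m >= 10)
  {
    m = m / 10;
    count++;
  }
  return count;
}

// Recursive version: no variable changes after it gets a value.
int numDigits(const int n)
{
  assert(n >= 0);
  if(n < 10)
  {
    return 1;
  }
  else
  {
    return 1 + numDigits(n / 10);
  }
}
```

Both versions should give the same answer for every n >= 0. Notice that neither one mixes a loop with recursion.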


Example

Let's write a definition of function factorial(n) that returns n! = 1·2·…·n. Notice the following two facts.

0! = 1  (by definition)
n! = n·(n−1)!  (if n > 0)

That suggests the following definition of factorial.

  int factorial(const int n)
  {
    if(n == 0)
    {
      return 1;
    }
    else
    {
      return n * factorial(n-1);
    }
  }

Hand simulation of recursion 1

Let's use frames to simulate computation of factorial(3). To make the simulation easier to show, we break the statement in the else-part into statements. Here is the modified definition of factorial.

  int factorial(const int n)
  {
    if(n == 0)
    {
      return 1;
    }
    else
    {
      const int r = factorial(n-1);
      return n * r;
    }
  }

Initially, a frame for factorial(3) is created.

factorial
n = 3
at: if(n == 0)

Since n is not 0, factorial moves to the else part and calls factorial(2).

factorial
n = 3
at: int r = factorial(n-1);
factorial
n = 2
at: if(n == 0)

The call to factorial(2) sees that n is 2, not 0, so it goes to the else part and calls factorial(1).

factorial
n = 3
at: int r = factorial(n-1);
factorial
n = 2
at: int r = factorial(n-1);
factorial
n = 1
at: if(n == 0)

Since n is not 0, the call to factorial(1) calls factorial(0).

factorial
n = 3
at: int r = factorial(n-1);
factorial
n = 2
at: int r = factorial(n-1);
factorial
n = 1
at: int r = factorial(n-1);
factorial
n = 0
at: if(n == 0)

Since n is 0, the call to factorial(0) returns 1.

factorial
n = 3
at: int r = factorial(n-1);
factorial
n = 2
at: int r = factorial(n-1);
factorial
n = 1
r = 1
at: return n * r;

Factorial(1) returns 1*1 = 1.

factorial
n = 3
at: int r = factorial(n-1);
factorial
n = 2
r = 1
at: return n * r;

Factorial(2) returns 2*1 = 2.

factorial
n = 3
r = 2
at: return n * r;

Finally, factorial(3) returns 3*2 = 6.
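If you would like the computer to do the hand simulation for you, one possible approach is sketched below. The function tracedFactorial and its extra parameters depth and trace are hypothetical additions for illustration only. They are not part of the definition of factorial. The function changes the string trace, which is unusual for a recursive definition and is done here only to record what happens.

```cpp
#include <cassert>
#include <string>

// tracedFactorial(n, depth, trace) returns n!, like factorial(n).
// It also appends one line to trace for each call and each return,
// indented by 2*depth spaces so that nested calls are visible.
//
// Requirement: n >= 0 and depth >= 0.

int tracedFactorial(const int n, const int depth, std::string& trace)
{
  assert(n >= 0 && depth >= 0);
  const std::string indent(2*depth, ' ');
  trace += indent + "call factorial(" + std::to_string(n) + ")\n";
  if(n == 0)
  {
    trace += indent + "return 1\n";
    return 1;
  }
  else
  {
    const int r = tracedFactorial(n-1, depth+1, trace);
    trace += indent + "return " + std::to_string(n * r) + "\n";
    return n * r;
  }
}
```

Starting with an empty string t, the call tracedFactorial(3, 0, t) returns 6 and leaves the following text in t. Each level of indentation corresponds to one frame in the simulation above.

  call factorial(3)
    call factorial(2)
      call factorial(1)
        call factorial(0)
        return 1
      return 1
    return 2
  return 6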


Variables and recursion

Factorial has a const parameter, n. The original definition of factorial does not have any other variables.

Most recursive function definitions only use constants. They do not change the value of any variable, once the variable has a value.

That can come as a surprise to someone who has become accustomed to using loops. How can a repetition be accomplished if no variables ever change?

The answer to that question is simple. Because each call to a function creates a new frame for that function, recursion can create a lot of different variables. The variation needed for repetition is not accomplished by changing the value of one variable, but rather by creating many variables with different values.
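The following sketch illustrates that idea. The function downAndUp is a made-up example, not something defined earlier. Each call has its own n and its own constant s, and each one still has its original value after the recursive call returns.

```cpp
#include <cassert>
#include <string>

// downAndUp(n) returns a string that counts down from n to 0
// and then back up to n, with numbers separated by spaces.
//
// Requirement: n >= 0.

std::string downAndUp(const int n)
{
  assert(n >= 0);
  if(n == 0)
  {
    return "0";
  }
  else
  {
    // s belongs to this call only. The recursive call below
    // creates its own n and s in a new frame.
    const std::string s = std::to_string(n);
    return s + " " + downAndUp(n-1) + " " + s;
  }
}
```

For example, downAndUp(3) returns "3 2 1 0 1 2 3". No variable is ever changed, yet seven numbers are produced from four frames, each holding a different value of n.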

That probably sounds extravagant. And, in some cases, it is. But computer resources are relatively cheap, while human resources are much more expensive. What people have found is that, for a lot of problems, recursion takes less development time than looping, and it leads to function definitions that work the first time, with no debugging, more often than you think you have a right to. So recursion can reduce utilization of the resource that really matters: your time.


Another example

Let's look at the problem of computing x^k (x to the k-th power) where x is a real number and k is a positive integer. Breaking this down into cases yields a simple case and a more complicated case.

  1. If k = 1 then x^k = x.

  2. If k > 1 then it is clear that x^k = x * x^(k−1). We can use our function to compute x^(k−1).

That leads to the following definition of power(x, k).

  // power(x,k) returns x to the k-th power.
  //
  // Requirement: k > 0.

  double power(const double x, const int k)
  {
    if(k == 1)
    {
      return x;
    }
    else {
      return x * power(x, k-1);
    }
  }

That works, but we can do better. Imagine computing x^8. Our algorithm does it as follows.

  x^8 = x * x^7
      = x * x * x^6
      = x * x * x * x^5
      = x * x * x * x * x^4
      = x * x * x * x * x * x^3
      = x * x * x * x * x * x * x^2
      = x * x * x * x * x * x * x * x^1
      = x * x * x * x * x * x * x * x

It ends up doing 7 multiplications. But we can compute x^8 more efficiently as follows.

  x^8 = (x^4)*(x^4)
  x^4 = (x^2)*(x^2)
  x^2 = (x^1)*(x^1)

The advantage is that each right-hand side consists of a value multiplied by itself. We only need to compute that value once, and we end up computing x^8 using only 3 multiplications. The same idea allows us to compute x^1024 using only 10 multiplications.

So we use the rule:

  1. If k is even then x^k = (x^(k/2))^2

(We need k to be even to ensure that k/2 is an integer.) Here is the improved function definition.

  // power(x,k) returns x to the k-th power.
  //
  // Requirement: k > 0.

  double power(const double x, const int k)
  {
    if(k == 1)
    {
      return x;
    }
    else if(k % 2 == 0)
    {
      const double p = power(x, k/2);
      return p*p;
    }
    else 
    {
      return x * power(x, k-1);
    }
  }

That is a much more efficient algorithm than the previous one. For example, to compute x^9, it uses the rule

  x^9 = x * x^8

then it uses the efficient algorithm for x^8 shown above. To compute x^18, it uses

  x^18 = x^9 * x^9

and we have seen that x^9 is computed efficiently. In fact, the new function computes x^k using no more than 2·log₂(k) multiplications.
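One way to check those multiplication counts is to write functions that follow the same cases as the two definitions of power but count multiplications instead of doing them. The names slowMults and fastMults below are hypothetical. This is only a sketch for experimenting, not part of the power function.

```cpp
#include <cassert>

// slowMults(k) returns the number of multiplications that the
// first definition of power(x,k) performs.
//
// Requirement: k > 0.

int slowMults(const int k)
{
  assert(k > 0);
  if(k == 1)
  {
    return 0;
  }
  else
  {
    return 1 + slowMults(k-1);    // one multiplication: x * power(x, k-1)
  }
}

// fastMults(k) returns the number of multiplications that the
// improved definition of power(x,k) performs.
//
// Requirement: k > 0.

int fastMults(const int k)
{
  assert(k > 0);
  if(k == 1)
  {
    return 0;
  }
  else if(k % 2 == 0)
  {
    return 1 + fastMults(k/2);    // one multiplication: p*p
  }
  else
  {
    return 1 + fastMults(k-1);    // one multiplication: x * power(x, k-1)
  }
}
```

For example, slowMults(8) is 7, while fastMults(8) is 3 and fastMults(1024) is 10, which agrees with the counts above. Also, fastMults(9) is 4 and fastMults(18) is 5.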

The definition of 'power' creates a constant, p. Recursive definitions occasionally create new constants, but rarely new variables. If you find yourself changing the value of a variable in a recursive definition at this point in your studies, you are probably doing something wrong.


Summary

Recursion is the ability of a function to call itself. It actually calls a copy of itself, with different variables.

Recursion has some advantages over loops. Recursive definitions of many functions take less time to develop than definitions based on loops.


Exercises

  1. What is the value of g(4), given the definition of g below? (Hint. Work out g(1), then g(2), then g(3), then g(4), in that order. Keep your work organized.)

      int g(const int n)
      {
        if(n == 1) 
        {
          return 2;
        }
        else 
        {
          return g(n-1) + 3;
        }
      }
    

  2. Write a C++ definition of function sum(a, b) that returns a + (a+1) + ... + b. For example, sum(2,5) = 2 + 3 + 4 + 5 = 14. More precisely, sum(a, b) returns the sum of all integers that are greater than or equal to a and less than or equal to b. For this question, do not use any kind of loop. Use recursion instead.

  3. Suppose the power function is written as follows. Notice that it uses power to do the squaring in the second case. Does it work? (Hint. Do a hand simulation of power(3.0, 2).)

      // power(x,k) returns x to the k-th power.
      //
      // Requirement: k > 0.
    
      double power(const double x, const int k)
      {
        if(k == 1)
        {
          return x;
        }
        else if(k % 2 == 0)
        {
          return power(power(x, k/2), 2);
        }
        else 
        {
          return x * power(x, k-1);
        }
      }
    

  4. Suppose the power function is written as follows. Does it work?

      // power(x,k) returns x to the k-th power.
      //
      // Requirement: k > 0.
    
      double power(const double x, const int k)
      {
        if(k == 1)
        {
          return x;
        }
        else if(k % 2 == 0)
        {
          return power(x*x, k/2);
        }
        else 
        {
          return x * power(x, k-1);
        }
      }
    