Computer Science 2530
Spring 2020
Programming Assignment 4

Assigned: Thursday, February 27
Due: Friday, March 6, 11:59pm
Points: 350

Table of Contents

  1. Purpose of this assignment
  2. What you will submit
  3. Background
  4. The assignment
  5. An algorithm for implementing an equivalence manager
  6. Getting started
  7. A refinement plan
  8. Compiling and testing your program on xlogin
  9. Issues to be aware of
  10. Submitting your work
  11. Late submissions
  12. Asking questions
  13. Extra credit: improving the equivalence manager.

Purpose of This Assignment

This assignment exercises your knowledge of arrays and functions. It also gives you experience implementing a module that is not an application but is intended to be part of a larger application.

Initially, this will look difficult. In reality, it is very short and simple. You need to write six short functions, plus documentation. Strive to keep it simple and follow the instructions.

Read the entire assignment except for the extra credit part before you start working on it. Be sure to follow the instructions.

Implement exactly the functions that are indicated. Keep the parameter order as shown here. If you change the parameter order, your module will not compile correctly with my tester. Do not add extra responsibilities to functions.


What You Will Submit

You will need to submit two files, equiv.cpp and equiv.h.


Background

An equivalence relation ≡ is a relation with the following properties for all x, y and z.

  1. ≡ is reflexive: xx.

  2. ≡ is symmetric: If xy then yx.

  3. ≡ is transitive: If xy and yz then xz.

An equivalence relation on a set always partitions the set into equivalence classes. Each value belongs to exactly one equivalence class. Two values x and y are equivalent if and only if they belong to the same equivalence class.

For example, suppose that ≡ is a relation on set {1, 2, 3, 4, 5, 6} with equivalence classes {1, 3, 4}, {2, 6}, {5}. Then 1 ≡ 4 and 2 ≡ 6 but 2 is not equivalent to 3.


The Assignment

Your goal for this assignment is to create a module that manages an equivalence relation that can be changed in a specific way by the program while the program runs. The assignment will involve creating two files, equiv.h and equiv.cpp.

For this assignment, the equivalence relation is always over a set of integers {1, 2, 3, …, n} for some positive integer n. The equivalence classes are sets of integers from 1 to n.

This module is not a complete program. It is intended to be part of a larger program. File equiv.cpp must not contain a main function.

Equiv.h will contain function prototypes, but it must not contain any full function definitions. (There must be no function bodies in equiv.h.) Equiv.cpp must contain line

  #include "equiv.h"
before any function definitions.

Interface

The interface tells exactly what this module provides for other modules to use. Other modules must not use anything that is not described here. Briefly, the interface includes a type, ER, which is the type of an equivalence relation, and the following functions.

None of these functions reads or writes anything.

ER newER(int n)

Statement

  ER r = newER(n);
returns a new equivalence relation over {1, …, n}. Initally, all of the equivalence classes are singleton. That is, the equivalence classes are {1}, {2}, ..., {n}.

void destroyER(ER r)

DestroyER(r) recycles the memory occupied by r.

bool equivalent(ER r, int x, int y)

Equivalent(r, x, y) returns true if x and y are currently in the same equivalence class in equivalence relation r (and returns false otherwise).

void combine(ER r, int x, int y)

Combine(r, x, y) merges the equivalence classes of x and y in equivalence relation r.

void showER(const ER r, int n)

ShowER(r, n) shows the boss array r of equivalence relation r, where r is over {1, …, n}. ShowER is only for debugging.

There is one more function that is part of the implementation but not part of the interface. You can use it for debugging, though.

int leader(ER r, int x)

Leader is described below.

Example

For example, suppose that n = 7. The following shows a sequence of operations on a particular equivalence relation and shows the equivalence classes after each combine operation.

Step Result
ER R = newER(7) R = {1} {2} {3} {4} {5} {6} {7}
combine(R, 1, 5) R = {1, 5} {2} {3} {4} {6} {7}
combine(R, 2, 7) R = {1, 5} {2, 7} {3} {4} {6}
equivalent(R, 1, 5) yields true
equivalent(R, 1, 7) yields false
combine(R, 5, 7) R = {1, 2, 5, 7} {3} {4} {6}
equivalent(R, 2, 5) yields true
equivalent(R, 2, 3) yields false yields false
combine(R, 2, 3) R = {1, 2, 3, 5, 7} {4} {6}
equivalent(R, 3, 7) yields true
equivalent(R, 4, 7) yields false
combine(R, 4, 6) R = {1, 2, 3, 5, 7} {4, 6}
combine(R, 2, 3) R = {1, 2, 3, 5, 7} {4, 6}

As you can see from the last step, it is allowed to combine two values that are already equivalent. That should cause no change.


An Algorithm for Managing an Equivalence Relation

You will not store the equivalence classes directly. Instead, you will store them implicitly, using the following ideas. You are required to implement an equivalence relation manager in this way. You will receive no credit for a module that does not follow this algorithm.

  1. Each equivalence class has a leader, which is one of the members of that equivalence class. You will create a function leader(r, x) that returns the current leader of the equivalence class that contains x in equivalence relation r.

    Two values are equivalent if they have the same leader.

  2. There is another idea that is similar to a leader, but not exactly the same. Each value has a boss, which is a value in its equivalence class. For the purposes of describing the idea, let's write boss[x] for x's boss.

    1. If x is the leader of its equivalence class then boss[x] = x. A leader is its own boss.

    2. If x is not the leader of its equivalence class then boss[x] ≠ x and boss[x] is closer to the leader, in the following sense. If you look at the values x, boss[x], boss[boss[x]], boss[boss[boss[x]]], … then you will eventually encounter x's leader.

Details on the algorithm

Use an array to store the bosses. Declaration

  typedef int* ER;
defines type ER to be the same as int*. Put that line in equiv.h and not in equiv.cpp.

NewER(n) should allocate an array r of size n+1 so that it has indices 1, …, n. It must initialize the array so r[i] = i for i = 1, …, n.


Getting Started

Create a directory for assignment 4 and download

into that directory.

Create file equiv.cpp. Copy and paste the module-template into it. Add your name, the assignment number and the file name (equiv.cpp). Add line

#include "equiv.h"
to equiv.cpp.


A Refinement Plan

  1. Write a program comment telling what this module will provide when it is finished. It provides an implementation of combine/equivalent equivalence relations. Don't describe what happens when you run this, since it is not an application.

  2. Write a contract, then an implementation, of newER. In logical terms (which the contract is concerned with), newER returns an equivalence relation. In physical terms, which the implementation is concerned with, newER returns an array, which we call the boss array.

  3. Write a contract, then an implementation, of showER Do not try to be too fancy. You want to see what the boss array looks like for debugging. Be sure that showER(r, n) shows both k and k's boss r[k], for each k from 1 to n.

  4. Create a file test.cpp. It should contain a main function that creates an equivalence relation and shows it. Be sure that test.cpp contains

    #include "equiv.h"
    

    before any function definitions. Compile test.cpp and equiv.cpp together as follows.

      g++ -Wall -o test test.cpp equiv.cpp
    
    Then run test by
      ./test
    

  5. Add a contract and definition for leader. To compute x's leader, follow the bosses up to the leader.

  6. Add a contract and definition for combine. If x and y are leaders then combine just needs to make r[x] = y. But be careful. Combine must never change the boss of a nonleader. Start by getting leaders.

  7. Modify test.cpp so that it performs a few combine steps and shows the boss array after each step. Do the results look sensible?

  8. Add a contract and definition for equivalent. Two values are equivalent if they have the same leader.

    You now have enough to run the automated tester. Run it and look at the results. If your implementation is correct, you will see only the results of a few calls to showER. Incorrect results are clearly flagged.

  9. Add a contract and definition for destroyER.

  10. When you are satisified that your program works and is well written, submit your work.


Compiling and Testing Your Program on Xlogin

Use the following commands to compile and run your program.

make equiv.o
This compiles equiv.cpp if it needs compiling.

make testequiv.o
This compile testequiv.cpp if it needs compiling.

make testequiv
This compiles equiv.cpp (if necessary) and testequiv.cpp (if necessary) and links them, yielding executable file testequiv.

make test
This does make testequiv and then runs an automated tester.

Note. To write the output from the tester into file testout.txt, use command

  make test > testout.txt
Then you can look at the test results using a text editor.

make debug
Do make testequiv then run the tester via the gdb debugger.

make clean
This remove machine-generated files.

Issues to Be Aware of

As always, your module must follow the coding standards. Pay attention to the following issues.

  1. An equivalence relation has type ER. Any time you need to write the type of an equivalence relation, write ER, not int*.

  2. Be cautious not to use a nonexistent index of an array. If array A has size n, then A[n] does not exist. Pay attention to this. In the past, it was a very common source of serious mistakes.

  3. Be cautious not to change the boss of a nonleader. This has also been a source of mistakes in prior terms.

  4. Every function is required to have a clear, concise and precise contract. Pay attention to this. In the past, students have lost a lot of points because they did not put enough care into writing clear, understandable contracts.

  5. Put a blank line before and after each contract. Use correct spelling and punctuation. Do not omit the subject of a sentence. That can't be difficult. If it is, enroll in a basic English class.

  6. Indent well. Avoid long lines. Limit lines to about 80 characters.

  7. A function body must not change the value of a call-by-value parameter. Pay attention to this. In the past, many students have lost points for violating this requirement.

  8. Avoid code duplication.

  9. Do not use redundant tests in if-statements.


Submitting Your Work

You must submit your program using the following method. Email submissions will not be accepted. An excuse that you do not know how to use Linux will not be accepted.

To turn in your work, log into xlogin, change your directory for the one for assignment 4, and use the following command.

  ~abrahamsonk/2530/bin/submit 4 equiv.cpp equiv.h
After submitting, you should receive confirmation that the submission was successful. If you do not receive confirmation, assume that the submission did not work.

Command

  ~abrahamsonk/2530/bin/submit 4
will show you what you have submitted for assignment 4.

You can do repeated submissions. New submissions will replace old ones.


Late Submissions

Late submissions will be accepted for 24 hours after the due date. If you miss a late submission deadline by a microsecond, your work will not be accepted.


Asking Questions

To ask a question about your program, first submit it, but use assignment name q4. For example, use command

  ~abrahamsonk/2530/bin/submit q4 equiv.cpp equiv.h
Then send me an email with your question. Do not expect me to read your mind. Tell me what your questions are. I will look at the files that you have submitted as q4. If you have another question later, resubmit your new file as assignment q4.


Extra Credit: Improving the Equivalence Relation Manager.

For extra credit (up to 45 points), implement two improvements on the equivalence manager algorithm.

But be careful. Make sure that they work and that they do not ruin the equivalence relation manager. You will get no extra credit for an improvement that does not work. If your "improvements" ruin the equivalence relation manager, you will received less credit than you would for a working equivalence relation manager without the improvements.

You can only receive extra credit if your program passes the tests that I run on it.

The First Improvement: Improving the Leader Function (15 pts)

The first improvement involves a change to the leader function.

Suppose array R contains the following.

  R[1] = 2
  R[2] = 4
  R[3] = 3
  R[4] = 6
  R[5] = 3
  R[6] = 6
  R[7] = 4
  R[8] = 5
  R[9] = 1
Then computing the leader of 1 requires chaining through 1, 2, 4 and 6, stopping at 6. If the program computes the leader of 1 several times, it must redo that scan each time.

After the leader function scans through a chain of boss links to find the leader of a value, it should go back through the chain and put the current leader in the boss array for every number that was looked at in the chain. That way, subsequent leader computations will reach the leader much more quickly. The improvement changes the contents of the array to the following, by installing the correct leader (6) of each of the numbers that was looked at.
  R[1] = 6
  R[2] = 6
  R[3] = 3
  R[4] = 6
  R[5] = 3
  R[6] = 6
  R[7] = 4
  R[8] = 5
  R[9] = 1
You can do this with another loop that rescans through the chain, the same way the chain was scanned the first time, but now putting the leader into the array as you go. Alternatively, make the leader function be recursive, and just change the boss after each recursive call.

Notice that we have not scanned the entire array from beginning to end! R[8] is still 5, even though the leader of 8 is 3. R[9] is still 1, since 9 was not encounted in the scan from 1 to 6. Only the numbers that were looked at in the original scan have their boss values changed. If you try to change everything in the array, you make the implementation slower, not faster.

Also notice that it was not just the boss of 1 that was changed. All of the numbers that were examined in the chain have their bosses set to their leaders. For example R[2] was changed too.

Be sure to test your improved leader function. It is easy to write it incorrectly. Try it by hand to see whether it seems to do the right thing.

(In the past, most students who did this improvement using a loop got it wrong, so be careful. Most students who did it with recursion got it right.)

The Second Improvement: Improving Combine (30 pts)

To do this improvement you will need to use structures and arrays of structures which we will cover a little later.

Each number that is a leader has a collection of constituents, namely the members of its equivalence class. For example, in the above array, number 3 is a leader, and the constituents of 3 are 3, 5 and 8, so 3 has three constituents, counting itself.

When doing a combine operation, you find two values s and t that are leaders. You can then either change the boss of s to t (so s is no longer a leader) or change the boss of t to s (so t is no longer a leader). Either one will accomplish the goal of combining the two sets, but the choice of which to do influences the efficiency of the implementation. The best choice is to change the boss of the value that has the fewest constituents. That tends to keep the lengths of the boss chains up to the leader short.

Modify your ER type so that it is an array of structures. Each structure holds a boss and a constituent count. So, if R has type ER then R[k].boss is the boss of k in equivalence relation R. If k is a leader then R[k].numConstituents is the size of the equivalence class that k leads. (If k is not a leader then R[k].numConstituents should be 0, since a nonleader has no constituents.)

A picture of the initial array, before any combines have been done, might look like this. Notice that each number has one constituent, itself.

  index    boss  numConstituents
    1        1           1
    2        2           1
    3        3           1
    4        4           1
    5        5           1
    6        6           1
    7        7           1

If you do combine(R, 3, 5) with the above array, you might arbitrarily decide to make the boss of 3 be 5, since they have the same constituent count. Then the array looks like this.

  index   boss  numConstituents
    1        1           1
    2        2           1
    3        5           0
    4        4           1
    5        5           2
    6        6           1
    7        7           1
If you now do combine(R, 5, 1), you must change the boss of 1, since it has fewer constituents than 5. The array ends up looking like this.
  index   boss  numConstituents
    1        5           0
    2        2           1
    3        5           0
    4        4           1
    5        5           3
    6        6           1
    7        7           1
As before, only change the boss of a number that is currently a leader. If you now do combine(R, 2, 1), you must realize that you are really being asked to combine 2 and 5, since 5 is the leader of 1. Since 5 has more constituents, you change 2's boss, yielding
  index   boss  numConstituents
    1        5           0
    2        5           0
    3        5           0
    4        4           1
    5        5           4
    6        6           1
    7        7           1
As you can see, this improvement tends to lead to shorter chains of bosses before the leader is found.

Suppose you continue by combining 6 and 7. You might get the following.

  index   boss  numConstituents
    1        5           0
    2        5           0
    3        5           0
    4        4           1
    5        5           4
    6        7           0
    7        7           2
Now combine 1 and 6. Their leaders are 5 and 7. Since 5 has more constituents, change the boss of 7 to be 5. The new information is as follows.
  index   boss  numConstituents
    1        5           0
    2        5           0
    3        5           0
    4        4           1
    5        5           6
    6        7           0
    7        5           0
Now 5 has six constituents. Although 6 is one of 5's constituents, its boss is still 7. The boss chains are shorter, but you still need to do the leader calculation using a loop (or recursion). A value's boss is still not necessarily its leader.

Important note. This improvement requires you to change the definition of type ER. The types of functions newER, equivalent and combine must not change because of this improvement. For example, the type of equivalent should be

  bool equivalent(ER R, const int x, const int y);
Do not change that to
  bool equivalent(ER* R, const int x, const int y);

Efficiency Before and After Improvements

If you do neither improvement, then the equivalence relation manager can take time proportional to n2 to process n combine and equivalent requests. With both improvements, the time is no worse than proportional to n α(n) where α(n), the inverse of Ackerman's function, is a very slowly growing function of n, so slow that, for all remotely practical values of n, α(n) ≤ 6.