Assigned: | Thursday, February 27 |
Due: | Friday, March 6, 11:59pm |
Points: | 350 |
This assignment exercises your knowledge of arrays and functions. It also gives you experience implementing a module that is not an application but is intended to be part of a larger application.
Initially, this will look difficult. In reality, it is very short and simple. You need to write six short functions, plus documentation. Strive to keep it simple and follow the instructions.
Read the entire assignment except for the extra credit part before you start working on it. Be sure to follow the instructions.
Implement exactly the functions that are indicated. Keep the parameter order as shown here. If you change the parameter order, your module will not compile correctly with my tester. Do not add extra responsibilities to functions.
You will need to submit two files, equiv.cpp and equiv.h.
An equivalence relation ≡ is a relation with the following properties for all x, y and z.
≡ is reflexive: x ≡ x.
≡ is symmetric: If x ≡ y then y ≡ x.
≡ is transitive: If x ≡ y and y ≡ z then x ≡ z.
An equivalence relation on a set always partitions the set into equivalence classes. Each value belongs to exactly one equivalence class. Two values x and y are equivalent if and only if they belong to the same equivalence class.
For example, suppose that ≡ is a relation on set {1, 2, 3, 4, 5, 6} with equivalence classes {1, 3, 4}, {2, 6}, {5}. Then 1 ≡ 4 and 2 ≡ 6 but 2 is not equivalent to 3.
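One way to make the example concrete is to tag each value with the class that it belongs to and compare tags. The sketch below does that for the six values above. The names classOf and sameClass are made up for illustration and are not part of this assignment's interface.

```cpp
#include <cassert>

// Hypothetical illustration: label each value 1..6 with an arbitrary
// class number. Index 0 is unused so that values index directly.
//   {1, 3, 4} -> class 0,  {2, 6} -> class 1,  {5} -> class 2

const int classOf[7] = {-1, 0, 1, 0, 0, 2, 1};

// sameClass(x, y) returns true exactly when x and y belong to the
// same equivalence class in the example above.

bool sameClass(int x, int y)
{
  assert(1 <= x && x <= 6 && 1 <= y && y <= 6);
  return classOf[x] == classOf[y];
}
```

With this labeling, sameClass(1, 4) and sameClass(2, 6) hold, while sameClass(2, 3) does not.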
Your goal for this assignment is to create a module that manages an equivalence relation that can be changed in a specific way by the program while the program runs. The assignment will involve creating two files, equiv.h and equiv.cpp.
For this assignment, the equivalence relation is always over a set of integers {1, 2, 3, …, n} for some positive integer n. The equivalence classes are sets of integers from 1 to n.
This module is not a complete program. It is intended to be part of a larger program. File equiv.cpp must not contain a main function.
Equiv.h will contain function prototypes, but it must not contain any full function definitions. (There must be no function bodies in equiv.h.) Equiv.cpp must contain line
#include "equiv.h"
before any function definitions.
The interface tells exactly what this module provides for other modules to use. Other modules must not use anything that is not described here. Briefly, the interface includes a type, ER, which is the type of an equivalence relation, and the following functions.
None of these functions reads or writes anything.
ER newER(int n)
Statement
ER r = newER(n);
returns a new equivalence relation over {1, …, n}. Initially, all of the equivalence classes are singletons. That is, the equivalence classes are {1}, {2}, …, {n}.
void destroyER(ER r)
bool equivalent(ER r, int x, int y)
void combine(ER r, int x, int y)
void showER(const ER r, int n)
There is one more function that is part of the implementation but not part of the interface. You can use it for debugging, though.
int leader(ER r, int x)
Leader is described below.
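Putting the prototypes above together, equiv.h might look roughly like the following sketch. The include-guard name EQUIV_H is an assumption, as is the choice to declare leader here so that test code can call it for debugging. The typedef is the one that this assignment gives for type ER. If the module template says otherwise, follow the template.

```cpp
#ifndef EQUIV_H        // Assumed include-guard name.
#define EQUIV_H

typedef int* ER;       // The type of an equivalence relation.

// Public interface.

ER   newER(int n);
void destroyER(ER r);
bool equivalent(ER r, int x, int y);
void combine(ER r, int x, int y);
void showER(const ER r, int n);

// Not part of the interface; declared here (an assumption) only so
// that a test program can call it while debugging.

int  leader(ER r, int x);

#endif
```

Only prototypes appear here. The function bodies belong in equiv.cpp.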
For example, suppose that n = 7. The following shows a sequence of operations on a particular equivalence relation and shows the equivalence classes after each combine operation.
Step | Result |
---|---|
ER R = newER(7) | R = {1} {2} {3} {4} {5} {6} {7} |
combine(R, 1, 5) | R = {1, 5} {2} {3} {4} {6} {7} |
combine(R, 2, 7) | R = {1, 5} {2, 7} {3} {4} {6} |
equivalent(R, 1, 5) | yields true |
equivalent(R, 1, 7) | yields false |
combine(R, 5, 7) | R = {1, 2, 5, 7} {3} {4} {6} |
equivalent(R, 2, 5) | yields true |
equivalent(R, 2, 3) | yields false |
combine(R, 2, 3) | R = {1, 2, 3, 5, 7} {4} {6} |
equivalent(R, 3, 7) | yields true |
equivalent(R, 4, 7) | yields false |
combine(R, 4, 6) | R = {1, 2, 3, 5, 7} {4, 6} |
combine(R, 2, 3) | R = {1, 2, 3, 5, 7} {4, 6} |
As you can see from the last step, it is allowed to combine two values that are already equivalent. That should cause no change.
You will not store the equivalence classes directly. Instead, you will store them implicitly, using the following ideas. You are required to implement an equivalence relation manager in this way. You will receive no credit for a module that does not follow this algorithm.
Each equivalence class has a leader, which is one of the members of that equivalence class. You will create a function leader(r, x) that returns the current leader of the equivalence class that contains x in equivalence relation r.
Two values are equivalent if they have the same leader.
There is another idea that is similar to a leader, but not exactly the same. Each value has a boss, which is a value in its equivalence class. For the purposes of describing the idea, let's write boss[x] for x's boss.
If x is the leader of its equivalence class then boss[x] = x. A leader is its own boss.
If x is not the leader of its equivalence class then boss[x] ≠ x and boss[x] is closer to the leader, in the following sense. If you look at the values x, boss[x], boss[boss[x]], boss[boss[boss[x]]], … then you will eventually encounter x's leader.
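The chain x, boss[x], boss[boss[x]], … can be followed with a short loop. The following is a minimal sketch, assuming that the bosses are stored in an int array indexed from 1, with index 0 unused. The assert is an optional sanity check, not something that this assignment asks for.

```cpp
#include <cassert>

typedef int* ER;

// leader(r, x) returns the leader of the equivalence class that
// contains x in equivalence relation r. It does not change r.

int leader(ER r, int x)
{
  assert(x >= 1);          // Optional check; index 0 is never used.

  int k = x;               // Walk with k so that parameter x is unchanged.
  while(r[k] != k)         // A leader is its own boss.
  {
    k = r[k];
  }
  return k;
}
```

The loop stops at the first value that is its own boss, which is the leader by definition.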
Use an array to store the bosses. Declaration
typedef int* ER;
defines type ER to be the same as int*. Put that line in equiv.h and not in equiv.cpp.
NewER(n) should allocate an array r of size n+1 so that it has indices 1, …, n. It must initialize the array so r[i] = i for i = 1, …, n.
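A minimal sketch of that allocation follows. It assumes that the array is allocated with new, in which case the matching destroyER from the interface would presumably release it with delete []. The assert is an optional check that n is positive.

```cpp
#include <cassert>

typedef int* ER;

// newER(n) returns a new equivalence relation over {1, ..., n} in
// which every value is in an equivalence class by itself.

ER newER(int n)
{
  assert(n > 0);               // Optional check: n must be positive.

  ER r = new int[n + 1];       // Indices 0..n exist; index 0 is unused.
  for(int i = 1; i <= n; i++)
  {
    r[i] = i;                  // Each value starts as its own boss.
  }
  return r;
}

// destroyER(r) gives back the memory occupied by r. After this
// call, r must not be used. (Assumes r was created by newER.)

void destroyER(ER r)
{
  delete [] r;
}
```

Allocating n+1 cells is what makes r[n] a legal index.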
Create a directory for assignment 4 and download
into that directory.
Create file equiv.cpp. Copy and paste the module-template into it. Add your name, the assignment number and the file name (equiv.cpp). Add line
#include "equiv.h"
to equiv.cpp.
Write a program comment telling what this module will provide when it is finished. It provides an implementation of combine/equivalent equivalence relations. Don't describe what happens when you run this, since it is not an application.
Write a contract, then an implementation, of newER. In logical terms (which the contract is concerned with), newER returns an equivalence relation. In physical terms, which the implementation is concerned with, newER returns an array, which we call the boss array.
Write a contract, then an implementation, of showER. Do not try to be too fancy. You want to see what the boss array looks like for debugging. Be sure that showER(r, n) shows both k and k's boss r[k], for each k from 1 to n.
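A plain showER might be sketched as follows. The output layout (a heading line, then one line per value) is only one possibility, and the assert is an optional check that is not required by this assignment.

```cpp
#include <cstdio>
#include <cassert>

typedef int* ER;

// showER(r, n) prints, for debugging, each k from 1 to n together
// with its boss r[k], one pair per line. It does not change r.

void showER(const ER r, int n)
{
  assert(r != NULL);           // Optional sanity check.

  printf("  k  boss\n");
  for(int k = 1; k <= n; k++)
  {
    printf("%3d  %4d\n", k, r[k]);
  }
}
```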
Create a file test.cpp. It should contain a main function that creates an equivalence relation and shows it. Be sure that test.cpp contains
#include "equiv.h"
before any function definitions. Compile test.cpp and equiv.cpp together as follows.
g++ -Wall -o test test.cpp equiv.cpp
Then run test by
./test
Add a contract and definition for leader. To compute x's leader, follow the bosses up to the leader.
Add a contract and definition for combine. If x and y are leaders then combine just needs to make r[x] = y. But be careful. Combine must never change the boss of a nonleader. Start by getting leaders.
Modify test.cpp so that it performs a few combine steps and shows the boss array after each step. Do the results look sensible?
Add a contract and definition for equivalent. Two values are equivalent if they have the same leader.
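Taken together, leader, combine and equivalent might be sketched as follows. This is one possible reading of the steps above, not a reference solution. It assumes the simple boss array, without either extra-credit improvement.

```cpp
#include <cassert>

typedef int* ER;

// leader(r, x) returns the leader of x in equivalence relation r.

int leader(ER r, int x)
{
  assert(x >= 1);              // Optional check; index 0 is unused.

  int k = x;
  while(r[k] != k)
  {
    k = r[k];
  }
  return k;
}

// combine(r, x, y) modifies r by merging the equivalence classes
// of x and y. If x and y are already equivalent, r is unchanged.

void combine(ER r, int x, int y)
{
  int xLeader = leader(r, x);
  int yLeader = leader(r, y);

  // Only a leader's boss is changed. If the two leaders are the
  // same value, this assignment stores what is already there.

  r[xLeader] = yLeader;
}

// equivalent(r, x, y) returns true if x and y are in the same
// equivalence class of r, and false otherwise.

bool equivalent(ER r, int x, int y)
{
  return leader(r, x) == leader(r, y);
}
```

Note that combine never writes r[x] or r[y] directly. It always goes through the leaders.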
You now have enough to run the automated tester. Run it and look at the results. If your implementation is correct, you will see only the results of a few calls to showER. Incorrect results are clearly flagged.
Add a contract and definition for destroyER.
When you are satisfied that your program works and is well written, submit your work.
Use the following commands to compile and run your program.
Note. To write the output from the tester into file testout.txt, use command
make test > testout.txt
Then you can look at the test results using a text editor.
As always, your module must follow the coding standards. Pay attention to the following issues.
An equivalence relation has type ER. Any time you need to write the type of an equivalence relation, write ER, not int*.
Be cautious not to use a nonexistent index of an array. If array A has size n, then A[n] does not exist. Pay attention to this. In the past, it was a very common source of serious mistakes.
Be cautious not to change the boss of a nonleader. This has also been a source of mistakes in prior terms.
Every function is required to have a clear, concise and precise contract. Pay attention to this. In the past, students have lost a lot of points because they did not put enough care into writing clear, understandable contracts.
Put a blank line before and after each contract. Use correct spelling and punctuation. Do not omit the subject of a sentence. That can't be difficult. If it is, enroll in a basic English class.
Indent well. Avoid long lines. Limit lines to about 80 characters.
A function body must not change the value of a call-by-value parameter. Pay attention to this. In the past, many students have lost points for violating this requirement.
Avoid code duplication.
Do not use redundant tests in if-statements.
You must submit your program using the following method. Email submissions will not be accepted. An excuse that you do not know how to use Linux will not be accepted.
To turn in your work, log into xlogin, change your directory to the one for assignment 4, and use the following command.
~abrahamsonk/2530/bin/submit 4 equiv.cpp equiv.h
After submitting, you should receive confirmation that the submission was successful. If you do not receive confirmation, assume that the submission did not work.
Command
~abrahamsonk/2530/bin/submit 4
will show you what you have submitted for assignment 4.
You can do repeated submissions. New submissions will replace old ones.
Late submissions will be accepted for 24 hours after the due date. If you miss a late submission deadline by a microsecond, your work will not be accepted.
To ask a question about your program, first submit it, but use assignment name q4. For example, use command
~abrahamsonk/2530/bin/submit q4 equiv.cpp equiv.h
Then send me an email with your question. Do not expect me to read your mind. Tell me what your questions are. I will look at the files that you have submitted as q4. If you have another question later, resubmit your new file as assignment q4.
For extra credit (up to 45 points), implement two improvements on the equivalence manager algorithm.
But be careful. Make sure that they work and that they do not ruin the equivalence relation manager. You will get no extra credit for an improvement that does not work. If your "improvements" ruin the equivalence relation manager, you will receive less credit than you would for a working equivalence relation manager without the improvements.
You can only receive extra credit if your program passes the tests that I run on it.
The first improvement involves a change to the leader function.
Suppose array R contains the following.
R[1] = 2
R[2] = 4
R[3] = 3
R[4] = 6
R[5] = 3
R[6] = 6
R[7] = 4
R[8] = 5
R[9] = 1
Then computing the leader of 1 requires chaining through 1, 2, 4 and 6, stopping at 6. If the program computes the leader of 1 several times, it must redo that scan each time. After the leader function scans through a chain of boss links to find the leader of a value, it should go back through the chain and put the current leader in the boss array for every number that was looked at in the chain. That way, subsequent leader computations will reach the leader much more quickly. The improvement changes the contents of the array to the following, by installing the correct leader (6) of each of the numbers that was looked at.
R[1] = 6
R[2] = 6
R[3] = 3
R[4] = 6
R[5] = 3
R[6] = 6
R[7] = 4
R[8] = 5
R[9] = 1
You can do this with another loop that rescans through the chain, the same way the chain was scanned the first time, but now putting the leader into the array as you go. Alternatively, make the leader function be recursive, and just change the boss after each recursive call.
Notice that we have not scanned the entire array from beginning to end! R[8] is still 5, even though the leader of 8 is 3. R[9] is still 1, since 9 was not encountered in the scan from 1 to 6. Only the numbers that were looked at in the original scan have their boss values changed. If you try to change everything in the array, you make the implementation slower, not faster.
Also notice that it was not just the boss of 1 that was changed. All of the numbers that were examined in the chain have their bosses set to their leaders. For example R[2] was changed too.
Be sure to test your improved leader function. It is easy to write it incorrectly. Try it by hand to see whether it seems to do the right thing.
(In the past, most students who did this improvement using a loop got it wrong, so be careful. Most students who did it with recursion got it right.)
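Both approaches to the first improvement can be sketched as follows, assuming the plain int boss array. The recursive version is called leader. The name leaderByLoop is made up here so that the two versions can sit side by side. The assert is an optional check.

```cpp
#include <cassert>

typedef int* ER;

// leader(r, x) returns the leader of x in equivalence relation r.
// As a side effect, it makes the leader the boss of every value
// on the boss chain from x up to the leader.

int leader(ER r, int x)
{
  assert(x >= 1);              // Optional check; index 0 is unused.

  if(r[x] == x)
  {
    return x;
  }
  else
  {
    int lead = leader(r, r[x]);
    r[x] = lead;               // Install the leader on the way back.
    return lead;
  }
}

// leaderByLoop(r, x) does the same job as leader(r, x), using two
// loops instead of recursion. (Hypothetical name.)

int leaderByLoop(ER r, int x)
{
  int lead = x;
  while(r[lead] != lead)       // First scan: find the leader.
  {
    lead = r[lead];
  }

  int k = x;
  while(k != lead)             // Second scan: same chain again.
  {
    int next = r[k];           // Remember the old boss before
    r[k] = lead;               // overwriting it.
    k = next;
  }
  return lead;
}
```

In the loop version, saving the old boss in next before overwriting r[k] is the step that is easy to get wrong.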
The second improvement involves a change to the combine function. To do this improvement, you will need to use structures and arrays of structures, which we will cover a little later.
Each number that is a leader has a collection of constituents, namely the members of its equivalence class. For example, in the above array, number 3 is a leader, and the constituents of 3 are 3, 5 and 8, so 3 has three constituents, counting itself.
When doing a combine operation, you find two values s and t that are leaders. You can then either change the boss of s to t (so s is no longer a leader) or change the boss of t to s (so t is no longer a leader). Either one will accomplish the goal of combining the two sets, but the choice of which to do influences the efficiency of the implementation. The best choice is to change the boss of the value that has fewer constituents. That tends to keep the lengths of the boss chains up to the leader short.
Modify your ER type so that it is an array of structures. Each structure holds a boss and a constituent count. So, if R has type ER then R[k].boss is the boss of k in equivalence relation R. If k is a leader then R[k].numConstituents is the size of the equivalence class that k leads. (If k is not a leader then R[k].numConstituents should be 0, since a nonleader has no constituents.)
A picture of the initial array, before any combines have been done, might look like this. Notice that each number has one constituent, itself.
index | boss | numConstituents |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 3 | 1 |
4 | 4 | 1 |
5 | 5 | 1 |
6 | 6 | 1 |
7 | 7 | 1 |
If you do combine(R, 3, 5) with the above array, you might arbitrarily decide to make the boss of 3 be 5, since they have the same constituent count. Then the array looks like this.
index | boss | numConstituents |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 5 | 0 |
4 | 4 | 1 |
5 | 5 | 2 |
6 | 6 | 1 |
7 | 7 | 1 |

If you now do combine(R, 5, 1), you must change the boss of 1, since it has fewer constituents than 5. The array ends up looking like this.
index | boss | numConstituents |
---|---|---|
1 | 5 | 0 |
2 | 2 | 1 |
3 | 5 | 0 |
4 | 4 | 1 |
5 | 5 | 3 |
6 | 6 | 1 |
7 | 7 | 1 |

As before, only change the boss of a number that is currently a leader. If you now do combine(R, 2, 1), you must realize that you are really being asked to combine 2 and 5, since 5 is the leader of 1. Since 5 has more constituents, you change 2's boss, yielding
index | boss | numConstituents |
---|---|---|
1 | 5 | 0 |
2 | 5 | 0 |
3 | 5 | 0 |
4 | 4 | 1 |
5 | 5 | 4 |
6 | 6 | 1 |
7 | 7 | 1 |

As you can see, this improvement tends to lead to shorter chains of bosses before the leader is found.
Suppose you continue by combining 6 and 7. You might get the following.
index | boss | numConstituents |
---|---|---|
1 | 5 | 0 |
2 | 5 | 0 |
3 | 5 | 0 |
4 | 4 | 1 |
5 | 5 | 4 |
6 | 7 | 0 |
7 | 7 | 2 |

Now combine 1 and 6. Their leaders are 5 and 7. Since 5 has more constituents, change the boss of 7 to be 5. The new information is as follows.
index | boss | numConstituents |
---|---|---|
1 | 5 | 0 |
2 | 5 | 0 |
3 | 5 | 0 |
4 | 4 | 1 |
5 | 5 | 6 |
6 | 7 | 0 |
7 | 5 | 0 |

Now 5 has six constituents. Although 6 is one of 5's constituents, its boss is still 7. The boss chains are shorter, but you still need to do the leader calculation using a loop (or recursion). A value's boss is still not necessarily its leader.
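The walkthrough above can be sketched in code as follows. The structure name ERCell is an assumption. Only the field names boss and numConstituents come from the description above. On a tie, this sketch changes the boss of the first argument's leader, which matches the arbitrary choices made in the example. The assert is an optional check.

```cpp
#include <cassert>

struct ERCell                  // Hypothetical structure name.
{
  int boss;
  int numConstituents;
};

typedef ERCell* ER;

// newER(n) returns a new equivalence relation over {1, ..., n} in
// which every value is in an equivalence class by itself.

ER newER(int n)
{
  assert(n > 0);               // Optional check: n must be positive.

  ER r = new ERCell[n + 1];    // Index 0 is unused.
  for(int i = 1; i <= n; i++)
  {
    r[i].boss = i;
    r[i].numConstituents = 1;
  }
  return r;
}

// leader(r, x) returns the leader of x in equivalence relation r.

int leader(ER r, int x)
{
  int k = x;
  while(r[k].boss != k)
  {
    k = r[k].boss;
  }
  return k;
}

// combine(r, x, y) merges the equivalence classes of x and y in r.
// The leader with fewer constituents gets the other as its boss.

void combine(ER r, int x, int y)
{
  int s = leader(r, x);
  int t = leader(r, y);

  if(s != t)                   // Needed so that counts are not doubled.
  {
    int smaller = s;           // On a tie, s loses its leadership.
    int larger  = t;
    if(r[s].numConstituents > r[t].numConstituents)
    {
      smaller = t;
      larger  = s;
    }
    r[smaller].boss = larger;
    r[larger].numConstituents += r[smaller].numConstituents;
    r[smaller].numConstituents = 0;
  }
}
```

Running combine(R, 3, 5), combine(R, 5, 1), combine(R, 2, 1), combine(R, 6, 7) and combine(R, 1, 6) on newER(7) reproduces the tables above under that tie-breaking choice.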
Important note. This improvement requires you to change the definition of type ER. The types of functions newER, equivalent and combine must not change because of this improvement. For example, the type of equivalent should be
bool equivalent(ER R, const int x, const int y);
Do not change that to
bool equivalent(ER* R, const int x, const int y);
If you do neither improvement, then the equivalence relation manager can take time proportional to n² to process n combine and equivalent requests. With both improvements, the time is no worse than proportional to n α(n), where α(n), the inverse of Ackermann's function, is a very slowly growing function of n, so slow that, for all remotely practical values of n, α(n) ≤ 6.