Assigned:  Friday, March 2 
Due:  Saturday, March 17, 11:59pm 
This assignment gives you experience using structures and arrays and writing a program with two modules.
It is important to follow the algorithm and design described here. Do not make up a different algorithm or design.
An undirected graph (or simply a graph) is a collection of numbered vertices connected by a collection of edges. An edge connects exactly two different vertices to one another. You can draw a picture of a graph by showing the vertices as circles and the edges as lines connecting them. Here is an example.
I will use variables such as v_{1} and v_{2} to stand for vertex numbers. Each variable can refer to any vertex. For example, variable v_{1} might refer to vertex 4.
A path from vertex v_{1} to vertex v_{k} is a sequence of vertices (v_{1}, v_{2}, …, v_{k}) such that there is an edge between v_{i} and v_{i+1}, for i = 1, …, k − 1.
We will only be concerned with connected graphs. A graph is connected if there is a path from every vertex to every other vertex. For example, in the above graph, there is a path from 3 to 4 that goes (3, 1, 2, 4). There are other paths from 3 to 4 as well. For any two vertices u and v, there is a path from u to v in the example graph.
A weighted graph is a graph in which each edge has a number attached to it, called the weight of the edge. Here is a picture of a weighted graph.
A spanning tree of a connected graph is obtained by deleting as many edges as possible without making the graph so that it is no longer connected. That is, we still need to have a path from each vertex to each other vertex. For example, the following is a spanning tree of the above weighted graph.
If graph G has n vertices then every spanning tree of G has exactly n−1 edges.
The weight of a spanning tree is the sum of the weights of its edges. For example, the weight of the above spanning tree is 55. Here is another spanning tree for the same graph. Its weight is 50.
Obviously, some spanning trees have smaller weight than others. A minimal spanning tree is a spanning tree with the smallest possible weight.
There is a wellknown algorithm, called Kruskal's algorithm, for computing a minimal spanning tree of a weighted graph G. It goes as follows.
Sort the edges of G from small weight to larger weight. Call the sorted list of edges (e_{0}, …, e_{m−1}). So, for example, e_{0} is the edge with the smallest weight. (Technically, e_{0} is an edge with the smallest weight, since two or more edges can have the same weight.)
Start with an empty graph K. It has all of the vertices of G, but no edges yet. Edges will be added to K until, at the end, it is the minimal spanning tree.
For i = 0, …, m−1 do
Look at edge e_{i}. Suppose that it connects vertices u and v.
If there is not already a path in K between vertices u and v, then add edge e_{i} to K. Otherwise, do not add it to K.
When the algorithm is done, K is a minimal spanning tree of your graph.
Kruskal's algorithm is an example of a greedy algorithm, which is an algorithm that tries to optimize on a large scale by optimizing on a small scale. When Kruskal's algorithm chooses an edge to add to K, it chooses the edge with the smallest weight that does the job (because it looks at the edges in increasing order of weight). That is doing the cheapest thing to do for this step. There is no thought about whether that decision will be a good one in the long run.
It turns out that Kruskal's algorithm works, and always finds a minimal weight spanning tree. Watch out, though. Many greedy algorithms for other problems at first appear to be reasonable, but fail to produce a correct result on some inputs.
See below for an argument that Kruskal's algorithm works.
Write a program that reads, from the standard input, a description of a weighted graph with integer weights, in the input format shown below. Then the program should write, on the standard output:
the original graph G;
the edges that are part of a minimal spanning tree K of G;
the total weight of K.
For each graph, show the number of vertices and the number of edges. Make it clear just what each part of the output is. Don't make a reader guess what he or she is looking at. The output of your program might look like this.
Input graph: There are 5 vertices and 6 edges vertices weight 1 2 9 1 3 12 2 4 18 2 3 6 2 5 20 3 5 15 Minimal spanning tree: There are 5 vertices and 4 edges vertices weight 2 3 6 1 2 9 3 5 15 2 4 18 The total weight of the spanning tree is 48.
Start from the standard template. Call your program mst.cpp.
Development plan  

1. Create a directory to hold assignment 5.
Copy equiv.h and equiv.cpp from Assignment 4 into that directory. Put all of the files for this assignment in that directory.  
2. Create a file called mst.cpp
Copy and paste the template into it. Edit the file. Add your name and the assignment number. If you will use tabs, say how far apart the tab stops are.  
3. Document the program
Write a comment in mst.cpp telling what this program will do when it is finished. Include the input format (below), an example input and an example output. when it is finished.  
4. Vertex numbers and edge numbers
Do not confuse edges with vertices. Keep track of the number of vertices in a graph separately from the number of edges. Remember that the first number in the input is the number of vertices, not the number of edges. The vertices are numbered 1, …, n. You will need to have an array of edges. It is a good idea to start numbering the edges from 0, not from 1. So put the first edge that you read at index 0 in the array of edges. Any bugs that result from ignoring this advise are on you. To help to avoid confusing vertex numbers with edge numbers, create types vertexNum and edgeNum as follows. typedef int vertexNum; typedef int edgeNum;Then use type vertexNum for the type of every integer that is a vertex number, and use type edgeNum for the type of every integer that is an edge number. Seeing the type will help you to know which kind of integer you are talking about. Do not confuse a vertex number with a count of the number of vertices in a graph. If you talk about vertex number 1, then you use type vertexNumber. If a variable represents an edge number, it should have type edgeNum. Edge weights are not vertex numbers or edge numbers. Their type should be int.  
5. Representing a Weighted Graph
Start by defining two structure types, Edge and Graph, in mst.cpp. Be sure to document each.
All structure types must be documented by saying (1) what an object of this structure type represents and (2) what property each field represents.  
6. Adding an Edge
In mst.cpp, define a function insertEdge(u, v, w, g) that adds an edge between u and v of weight w to graph g. If the edge array is full, insertEdge should refuse to add the edge. Here is a suitable contract for insertEdge. Note its similarity to the preceding paragraph. // insertEdge(u, v, w, g) inserts an edge of weight w between vertices // u and v into graph g. // // If there is not enough room in g to add the edge, then // insertEdge does nothing. It is critical that g be passed by reference to insertEdge. Otherwise, insertEdge will modify a copy of your graph, which is not what you want. Here is a suitable heading for insertEdge. void insertEdge(vertexNum u, vertexNum v, int w, Graph &g) Make sure that insertEdge does the whole job of inserting an edge. Don't force its caller to do part of the job.  
7. Input and Input Format
In mst.cpp, define a function readGraph(G) that takes a graph G and reads a description of a graph from the standard input, storing it into G. First, write a contract. Then write the function definition. The input starts with a line that tells how many vertices the graph has. If there are five vertices, then those vertices have numbers 1, 2, 3, 4 and 5. In general, if there are n vertices, then they are numbered 1, …, n. Following the first line are the edges, one per line. Each edge line has three integers on it. Line 2 4 50indicates that there is an edge of weight 50 between vertices 2 and 4. The weights are integers. The end of the input is signaled by a line that contains just a 0. An input describing graph looks like this. 5 1 2 9 1 3 12 2 4 18 2 3 8 2 5 20 3 5 15 0 Function readGraph must use insertEdge to add each edge.  
8. Output and Output Format
In mst.cpp, define and document a function writeGraph(G) that writes a description of graph G on the standard output. See the sample output above for a reasonable output format. WriteGraph should not say that the graph is "the input graph". It should just show the graph. Naming the graph is the responsibility of whoever calls writeGraph. WriteGraph will need a variable that holds an index in the array of edges. That variable is an edge number. So it should have type edgeNum.  
9. Main
In mst.cpp, define a main function that just reads a graph and writes it. Test it. Do not move on until this works. Do not skip testing this!  
10. Sorting an Array of Edges
In mst.cpp, define and document a function sortEdges(G) that sorts the array of edges in graph G. Use library function qsort to do this. (Should the contract say that sortEdges uses qsort?) Function qsort is part of the C standard library. It is a generalpurpose sorting function that sorts an array using a variant of the Quicksort algorithm. You should include header file <cstdlib> to use qsort. Qsort needs information about the array and how to sort it. It also needs for you to perform some conversions that, in general, would be questionable, but that work correctly here. Qsort takes four parameters. Function call qsort(base, numElements, elementSize, compare);sorts array base, where:
To run qsort, define a new type, QSORT_COMPARE_TYPE, as follows. It is the type of the compare parameter that qsort expects to receive. typedef int (*QSORT_COMPARE_TYPE)(const void*, const void*);Now statement qsort((void*) Arr, n, sizeof(Edge), (QSORT_COMPARE_TYPE) compareEdges);will sort array of n edges Arr according to weights, with smaller weights toward the beginning of the array.
 
11. Building a Minimal Spanning Tree
In mst.cpp, define and document a function minimalSpanningTree(G) that will take a Graph G as a parameter and return a minimal spanning tree of G (also of type Graph). Document minimalSpanningTree as it will be when finished. It will return a minimal spanning tree of G. For now, only make it sort the edges of G and return G. You will write more later. Modify main so that it computes the minimal spanning tree (by calling minimalSpanningTree) and prints that tree. Test this, keeping in mind that minimalSpanningTree currently just sorts the edges. Are the edges sorted correctly? Do not skip testing this!  
12. Performing Kruskal's Algorithm
Now modify your minimalSpanningTree function definition so that it uses Kruskal's algorithm to compute a minimal spanning tree of G. Create a new graph K without any edges. Look at each edge in G and decide whether to add it to K. If so, then add it to K. You already have a function that adds an edge to a graph. Use it. MinimalSpanningTree will need a variable that holds an index in the edge array. What should its type be? Think about it. See the next step, "determining connections," to see how to decide whether to add an edge to the minimal spanning tree. The only modification of G that minimalSpanningTree(G) is allowed to do is to sort the array of edges. You are not allowed to destroy the array of edges in G in order to build the array of edges of the minimal spanning tree. Graph G must be intact after running minimalSpanningTree(G). Don't be naive about copies of graphs. You do not need to copy G. Read about shallow copying.  
13. Determining Connections
For two vertices u and v, say that u ~ v just when there is a path between u and v. Then ~ is an equivalence relation. (~ is reflexive: the path from u to u has length 0 and is just (u). It should be easy to see that ~ is also symmetric and transitive.) Use your equivalence relation manager from the previous assignment to keep track of the equivalence classes of ~. Initially, each vertex is in an equivalence class by itself. When Kruskal's algorithm calls for adding an edge between u and v, tell the equivalence relation manager to merge u and v. To ask if there is a path between u and v, just ask if u and v are in the same equivalence class. Do not copy your equivalence manager into your file that implements Kruskal's algorithm. Keep it in separate files equiv.cpp and equiv.h. Just include "equiv.h" in mst.cpp. Do not include equiv.cpp in mst.cpp! Now test this. Is the minimal spanning tree correct? Try more than one graph! Many students have submitted programs that they thought were correct but that failed some tests.  
14. Getting the Total Weight
In mst.cpp, define and document a function to compute the total weight of a graph by adding up the weights of the edges. It should return the weight, and must not write anything. Add a call to this function in main. Test it.  
15. Submit your work.

A Makefile is provided for you. If you put it in the directory that you have created for assignment 5, then you can use the following commands.
You can put test inputs into files, and I strongly recommend that you do that. Just redirect the standard input to a file. Command
./mst <graph1.txtruns executable file mst with its standard input coming from file graph1.txt.
Your program must follow the outline in the section "A refinement plan and additional requirements." It can have additional functions, but it must have at least the functions indicated in the design.
All functions must have a clear and correct contract, including any extra functions that you write.
As always, your program must follow the coding standards.
Check the following.
Be sure that your program reads from the standard input and writes to the standard output.
Do not ignore compiler warnings. If you do not understand what a warning means, ask about it.
It is crucial that equiv.cpp works correctly. If you still need to fix it, then fix it.
But do not modify equiv.h or equiv.cpp unless you are only fixing errors. No minimalspanningtree code should find its way into equiv.h or equiv.cpp.
Pay attention to contracts. In the past, many students have become lax about contracts in this assignment, and it has cost them a lot of points.
Proofread contracts. Do they explain how every parameter affects what the function accomplishes? Do they refer to parameter by name? Do they explain what the returned value means, if there is a returned value?
Be sure to document your structure type definitions. Say what an object of the structure type represents and what information each field holds.
Ensure that you are implementing Kruskal's algorithm correctly. Test your program on more than one graph. Test your program on a graph where all of the edges are used in the minimal spanning tree. Check the output. Is it right? Don't just assume that the output is correct.
Make sure that arrays are large enough. If array A has size n then A[n] is out of bounds.
Your mst.cpp file must not make use of the fact that type ER is a pointer or an array. Just use the functions defined in equiv.cpp on ER objects. You can expect to lose a lot of points if you ignore this.
Each function can have at most one loop in its body. Do not use one loop to simulate two nested loops by manipulating the loop parameters.
Each function body must have no more than 16 noncomment lines.
A function body must not change the value of a callbyvalue parameter. It can change a callbyreference parameter.
Make sure that each function does its whole job., not just part of it job. Make sure that a function does not have undocumented requirements.
Avoid code duplication and duplicate calls.
To turn in your work, log into xlogin, change your directory for the one for assignment 5, and use the following command.
~abrahamsonk/2530/bin/submit 5 mst.cpp equiv.cpp equiv.hAfter submitting, you should receive confirmation that the submission was successful. If you do not receive confirmation, assume that the submission did not work.
Be sure to submit all three files.
Command
~abrahamsonk/2530/bin/submit 5will show you what you have submitted for assignment 5.
You can do repeated submissions. New submissions will replace old ones.
Late submissions will be accepted for 24 hours after the due date. If you miss a late submission deadline by a microsecond, your work will not be accepted.
To ask a question about your program, first submit it, but use assignment name q5. For example, use command
~abrahamsonk/2530/bin/submit q5 mst.cpp equiv.cpp equiv.hInclude a data file if appropriate. Don't expect me to create your data file.
Then send me an email with your question. Do not expect me to read your mind. Tell me what your questions are. I will look at the files that you have submitted as q5. If you have another question later, resubmit your new file as assignment q5.