Computer Science 2530
Spring 2020
Programming Assignment 5

Assigned: Wednesday, March 18
Due: Wednesday, April 1, 11:59pm
Points: 600

Table of Contents

  1. Purpose of this assignment
  2. What you will submit
  3. Background: graphs and spanning trees
  4. An algorithm to find a minimal spanning tree
  5. Determining connections
  6. The assignment
  7. Getting started
  8. Functions
  9. Refinement barriers
  10. A refinement plan
  11. Compiling and running your program on xlogin
  12. Issues to be aware of
  13. Submitting your work
  14. Late submissions
  15. Asking questions


Purpose of This Assignment

This assignment gives you experience using structures and arrays and writing a program with two modules.

It is important to follow the algorithm and design described here. Do not make up a different algorithm or design. You will need to define two structure types and eight functions, including main.

Read the entire assignment before you start working on it. Be sure to follow the instructions.


What You Will Submit

You will need to submit three files, mst.cpp, equiv.cpp and equiv.h. Submit all three files, even if I already have equiv.cpp and equiv.h from Assignment 4.


Background: Graphs and Spanning Trees

Graphs and Paths

An undirected graph (or simply a graph) is a collection of numbered vertices connected by a collection of edges. An edge connects exactly two different vertices to one another. You can draw a diagram of a graph by showing the vertices as circles and the edges as lines connecting the vertices. Here is an example.

I will use variables such as u, v, v1, v2 to stand for vertex numbers. Each variable can refer to any vertex. For example, variable v1 might refer to vertex 4.

A path from vertex v1 to vertex vk is a sequence of vertices (v1, v2, …, vk) such that there is an edge between vi and vi+1, for i = 1, …, k − 1.

We will only be concerned with connected graphs. A graph is connected if there is a path from every vertex to every other vertex. For example, in the above graph, there is a path from 3 to 4 that goes (3, 1, 2, 4). There are other paths from 3 to 4 as well. For any two vertices u and v, there is a path from u to v in the example graph, so it is connected.

Weighted Graphs

A weighted graph is a graph in which each edge has a number attached to it, called the weight of the edge. Here is a diagram of a weighted graph.

If e is an edge, we write W(e) to indicate the weight of e.

Spanning Trees

A spanning tree of a connected graph is obtained by deleting as many edges as possible without making the graph so that it is no longer connected. That is, we still need to have a path from each vertex to each other vertex. For example, the following is a spanning tree of the above weighted graph.

If graph G has n vertices then every spanning tree of G has n − 1 edges.

The weight of a spanning tree is the sum of the weights of its edges. For example, the weight of the above spanning tree is 55. Here is another spanning tree for the same graph. Its weight is 50.

Obviously, some spanning trees have smaller total weight than others. A minimal spanning tree is a spanning tree with the smallest possible weight.


An Algorithm to Find a Minimal Spanning Tree

There is a well-known algorithm, called Kruskal's algorithm, for computing a minimal spanning tree of a weighted graph G. It goes as follows, expressed in conceptual, not physical, terms.

  1. Make a list (or array) of the edges in G. Suppose there are m edges. Sort the edges from small weight to larger weight, giving list (e0, e1, …, em − 1) of edges. Because they have been sorted by weight, we know that (W(e0) ≤ W(e1) ≤ … ≤ W(em − 1)

  2. Start with an empty graph T. It has all of the vertices of G but no edges yet. Edges will be added to T until, at the end, it is the minimal spanning tree.

  3. For i = 0, …, m − 1 do

    1. Look at edge ei. Suppose that it connects vertices u and v.

    2. If there is not already a path in T between vertices u and v, then add edge ei to T. Otherwise, do not add it to T.

When the algorithm is done, T is a minimal spanning tree of G.

Kruskal's algorithm is an example of a greedy algorithm, which is an algorithm that tries to optimize on a large scale by optimizing on a small scale. When Kruskal's algorithm chooses an edge to add to T, it chooses the edge with the smallest weight that does not cause a cycle (because Kruskal's algorithm looks at the edges in increasing order of weight). That is the cheapest edge to add for this step. There is no thought about whether that decision will be a good one in the long run.

It turns out that Kruskal's algorithm works, and always finds a minimal weight spanning tree. Watch out, though. Many greedy algorithms for other problems at first appear to be reasonable, but fail to produce a correct result on some inputs.

Determining Connections

Kruskal's algorithm needs to be able to determine whether two vertices are connected by the edges that have already been added to the tree.

For two vertices u and v, say that u ~ v just when there is a path between u and v. Then ~ is an equivalence relation. (~ is reflexive: the path from u to u has length 0 and is just (u). It should be easy to see that ~ is also symmetric and transitive.)

Use your equivalence relation manager from the previous assignment to keep track of the equivalence classes of ~. Initially, there are no edges, so each vertex is in an equivalence class by itself. When Kruskal's algorithm considers adding an edge between vertices u and v, it consults the equivalence relation manager. If u ~ v, then the edge between u and v is omitted.

If u and v are not equivalent, then the algorithm adds this edge to the tree and tells the equivalence relation manager to combine the equivalence classes of u and v.


The Assignment

Write a program that reads, from the standard input, a description of a weighted graph with integer weights, in the input format shown below. Then the program should write, on the standard output:

  1. the original graph G;

  2. the computed minimal spanning tree T of G;

  3. the total weight of T.

To show a graph (or tree), show the number of vertices, the number of edges and a list of all of the edges, with their weights.

Output format

Make it clear just what each part of the output is. Don't make a reader guess what he or she is looking at. The output of your program might look like this.


Input graph:

There are 5 vertices and 6 edges

  vertices    weight
  1   2            9
  1   3           12
  2   4           18
  2   3            6
  2   5           20
  3   5           15
  
Minimal spanning tree:

There are 5 vertices and 4 edges

  vertices    weight
  2   3            6
  1   2            9
  3   5           15
  2   4           18

The total weight of the spanning tree is 48.

Input format

The input starts with a line that tells how many vertices the graph has. If there are five vertices, then those vertices have numbers 1, 2, 3, 4 and 5. In general, if there are n vertices, then they are numbered 1, …, n.

Following the first line are the edges, one per line. Each edge line has three integers on it. Line

  2 4 50
indicates that there is an edge of weight 50 between vertices 2 and 4. The weights are integers. The end of the input is signaled by a line that contains just a 0. Input
5
1 2  9
1 3 12
2 4 18
2 3  8
2 5 20
3 5 15
0
describes graph

Note. The input is not required to be broken into lines as above. For example, the numbers might all be on one line, as in

5 1 2 9 1 3 12 2 4 18 2 3 8 2 5 20 3 5 15 0
Do not try to force the input to move to the next line. Just read one integer after another. If that involves moving to a new line, that will be done automatically by scanf.


Getting Started

Start by creating a directory to hold assignment 5. Copy equiv.h and equiv.cpp from Assignment 4 into that directory. Download files

into that directory.

Create a file called mst.cpp. Copy and paste the template into it. Edit mst.cpp. Add your name and the assignment number. If you will use tabs, say how far apart the tab stops are. Add line

  #include "equiv.h"
to mst.cpp.


Functions

Define the following functions in mst.cpp. Use exactly the headings shown.

void insertEdge(int u, int v, int w, Graph* g)

InsertEdge inserts an edge between vertices u and v, of weight w, into graph g.

Graph* readGraph(int e)

ReadGraph allocates a graph in the heap with enough room for e edges. It reads a graph from the standard input in the input format and returns a pointer to that graph.

void writeGraph(const Graph* g)

WriteGraph writes graph g in the output format. WriteGraph shows the number of vertices and the number of edges, but otherwise does not write a heading. It is the caller's responsibility to label the graph.

WriteGraph(g) must not change any part of the representation of g.

void sortEdges(Graph* g)

SortEdges sorts the edges of g into nondecreasing order by weight.

Graph* minimalSpanningTree(Graph* g)

MinimalSpanningTree returns a minimal spanning tree of g. It must not change g.

int totalWeight(const Graph* g)

TotalWeight returns the sum of the weights of the edges in g.

Refinement Barriers

You are required to develop this program by successive refinement, as outlined in the next section. The following refinement barriers will be enforced.

  1. If readGraph is so bad that it cannot possibly provide a graph to be processed by the remaining functions, then you will only receive credit readGraph, writeGraph and their helpers, regardless of how much beyond that you have written.

  2. If sortEdges does not to its job at all, you will receive no credit for minimalSpanningTree or totalWeight.


A Refinement Plan

  1. Write a comment in mst.cpp telling what this program will do when it is finished. Include information to help a reader understand what this program does and how to use it, including:

    1. the fact that this program reads a weighted graph g from the standard input and writes a minimal spanning tree of g to the standard output;

    2. the input format, with an example input;

  2. Define two structure types, Edge and Graph, in mst.cpp. Be sure to document each.

    1. An object of type Edge represents one edge in a graph. It has three fields: two vertex numbers and a weight. Include a parameterless constructor that sets all three fields to 0.

    2. An object of type Graph represents a weighted graph. It has four fields:

      • the number of vertices in the graph,
      • the number of edges in the graph,
      • a pointer to an array of Edge structures (allocated in the heap) that holds the edges,
      • the physical size of the array of edges.

      Choose clear and sensible names for the fields.

      Include a constructor Graph(n, e) that yields a graph with n vertices and no edges, but with an array of edges whose physical size is e.

      Do not confuse the physical size of the array with the current number of edges. The physical size is the maximum number of edges allowed. The current number of edges might be less than that.

    Both Edge and Graph must be documented by saying

    1. what an object of this structure type represents and
    2. what property of the object each field represents.

    Don't make the reader guess.

  3. Add a contract and definition of insertEdge(u, v, w, g).

  4. Add a contract and definition of readGraph(e).

    readGraph must use insertEdge to insert an edge.

    It is tempting to read three numbers per line, no matter what. On the last line, only one number will be read successfully. But programs tend to be modified, and, wherever it is not too much work, it is a good idea to be prepared for modifications. What if the program is changed so that there is more input after the graph? You don't want readGraph to read into the what comes after the graph.

    Write readGraph so that it does not depend on the line containing just 0 to be the last thing in the input. That is easy to do. Read the first number and check it before you read the next two.

  5. Add a contract and definition of writeGraph(g).

  6. Write a definition of main. Make it just read a graph and write the graph. Test your program using the automated tester. Does it correctly read and write each graph? Do not go on until readGraph and writeGraph work correctly.

  7. Add a contract and definition of sortEdges.

    Use library function qsort to do this. (Should the contract for sortEdges say that sortEdges uses qsort?) See using qsort.

  8. Add a contract and heading for minimalSpanningTree.

    Initially, make minimalSpanningTree(g) just call sortEdges(g). Modify main so that it calls minimimalSpanningTree and shows the "tree" that minimimalSpanningTree returns. Test your program. You should see that the array of edges has been sorted.

  9. Now make minimimalSpanningTree(g) compute a minimal spanning tree of g. Start by creating a new graph T (in the heap) with the same number of vertices as g but without any edges. How large does the array of edges in T need to be? Do not make the array be too small or too large.

    Look at each edge in g and decide whether to add it to T. You already have a function that adds an edge to a graph. Use it.

    See the determining connections for how to decide whether to add an edge to the minimal spanning tree.

    The only modification of g that minimalSpanningTree(g) is allowed to do is to sort the array of edges. You are not allowed to destroy the array of edges in g in order to build the array of edges of the minimal spanning tree! Graph g must be intact after running minimalSpanningTree(g). Making a shallow copy of g won't help. Do not copy g.

    Test your program. Does it work?

  10. To ensure that minimimalSpanningTree(g) does not destroy g, modify main so that it writes the original graph after writing the minimal spanning tree. Test your program. Does it work? If so, remove the added call to writeGraph.

  11. Add a contract and definition of totalWeight.

    Modify main so that it shows the total weight of the minimal spanning tree.

    Test your program.

  12. Proofread your contracts. Make sure that there are no spelling or grammatical errors. Be sure that each contract explains how each parameter affects the function, and refers to each parameter by its name.

  13. When you are satisified that your program works and is well written, submit your work.


Compiling and Running Your Program on Xlogin

You can use the following commands to compile, run or test your program.

make mst
Just compile mst.cpp and equiv.cpp, if necessary, and link them to create executable file mst.

make run
Do make mst then run the program.

make test
Do make mst then run the program on some predefined graphs.

make debug
Do make mst then run the program via the gdb debugger.

make clean
Remove all machine-generated files.

Consider adding your own tests. You can put test inputs into files, and I strongly recommend that you do that. Just redirect the standard input to a file. Command

  ./mst <graph1.txt
runs executable file mst with its standard input coming from file graph1.txt.


Issues to Be Aware of

As always, your program must follow the coding standards. You are required to follow the instructions. A program that ignores the instructions will receive a score of 0.

Check the following.

  1. Do not confuse edges with vertices. Keep track of the number of vertices in a graph separately from the number of edges. Remember that the first number in the input is the number of vertices, not the number of edges.

    The vertices are numbered 1, …, n, where n is the first number in the input. You will need to have an array of edges. There are usually (but not always) more edges than vertices. It is a good idea to start numbering the edges from 0, not from 1. So put the first edge that you read at index 0 in the array of edges.

  2. Do not ignore compiler warnings. If you do not understand what a warning means, ask about it.

  3. It is crucial that equiv.cpp works correctly. If you still need to fix it, then fix it. You cannot get mst.cpp to work with a faulty equivalence relation manager.

    But do not modify equiv.h or equiv.cpp unless you are only fixing errors. No minimal-spanning-tree code should find its way into equiv.h or equiv.cpp.

  4. Pay attention to contracts. In the past, many students have become lax about contracts in this assignment, and it has cost them a lot of points.

    Be sure to document your structure type definitions. Say what an object of the structure type represents and what information each field holds.

  5. Make sure that arrays are large enough. If array A has size n then A[n] is out of bounds.

  6. Your mst.cpp file must not make use of the fact that type ER is a pointer or an array. It should restrict itself to the interface of the equivalence relation manager module. You can expect to lose a lot of points if you ignore this.

  7. Each function can have at most one loop in its body. Do not use one loop to simulate two nested loops by manipulating the loop parameters.

  8. Each function body must have no more than 16 noncomment lines.

  9. A function body must not change the value of a call-by-value parameter. It can change a call-by-reference parameter or the value pointed to by a call-by-pointer parameter.

  10. Make sure that each function does its whole job., not just part of it job. Make sure that a function does not have undocumented requirements.

  11. Avoid code duplication and duplicate calls.


Submitting Your Work

To turn in your work, log into xlogin, change your directory for the one for assignment 5, and use the following command.

  ~abrahamsonk/2530/bin/submit 5 mst.cpp equiv.cpp equiv.h
After submitting, you should receive confirmation that the submission was successful. If you do not receive confirmation, assume that the submission did not work.

Be sure to submit all three files.

Command

  ~abrahamsonk/2530/bin/submit 5
will show you what you have submitted for assignment 5.

You can do repeated submissions. New submissions will replace old ones.

Late Submissions

Late submissions will be accepted for 24 hours after the due date. If you miss a late submission deadline by a microsecond, your work will not be accepted.


Asking Questions

To ask a question about your program, first submit it, but use assignment name q5. For example, use command

  ~abrahamsonk/2530/bin/submit q5 mst.cpp equiv.cpp equiv.h
Include a data file if appropriate. Don't expect me to create your data file.

Then send me an email with your question. Do not expect me to read your mind. Tell me what your questions are. I will look at the files that you have submitted as q5. If you have another question later, resubmit your new file as assignment q5.