23A. Modules


Modules

Only small computer programs are written as a single file. Larger programs are normally broken into separate modules, each module in one file. When you add more code to software, instead of making the files larger, you add more files. As a rule of thumb, you limit modules to about 1000 lines, and much smaller modules are common.

Modules are critical for working in teams. Imagine several programmers working on the same file all at once!

Of course, to make an executable file, you need to link all of the modules together. You normally compile each module separately, using the -c option of g++ to create an object file for each. For example, command

  g++ -c A.cpp
compiles file A.cpp and produces object file A.o. We will see below how object files are combined to create an executable file.

But first, there is another issue: header files.


Header files

When compiling one file, the compiler needs to know about the resources that other files will provide, once the entire executable file is produced.

For example, suppose that A.cpp defines function jump and B.cpp wants to use jump. The compiler needs to know that jump will be available after the object files are combined. It also needs to know information about jump such as how many arguments it takes, what the argument types are and what type of value jump returns.

To provide the required information, each file that defines (or exports) functions for other modules to use has a header file describing those functions. The header file for A.cpp is normally called A.h.

Each module that wants to use the resources provided by A.cpp includes A.h, using line

 #include "A.h"

Note the quotes in the #include line. When compiling B.cpp, the compiler looks for "A.h" first in the same directory, or folder, as B.cpp. Writing <A.h> tells the compiler to look in the directories that hold the header files for the C++ library.

A.cpp also always includes A.h. That is important for two reasons. First, A.h often contains definitions of types that are not repeated in A.cpp. Second, you always want to ensure that you have not made a mistake. By including A.h in A.cpp, you allow the compiler to check that the prototypes in A.h are consistent with the definitions in A.cpp.
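
As a rough illustration of the arrangement described above, here is a minimal sketch that shows the contents of three files in one listing, so that it compiles as a single file. The parameter names, the body of jump, and the function landAfterTwoJumps are assumptions made up for this example. The one-line declaration of jump is what the next section calls a prototype.

```cpp
// Sketch only: three files shown in one listing so that it compiles as-is.
// In a real project, each marked part would be in its own file.

// ----- A.h (hypothetical contents) -----
// Tells other modules that jump exists, takes two ints and returns an int.
int jump(int start, int distance);

// ----- B.cpp (hypothetical contents) -----
// #include "A.h"     <-- a real B.cpp would get the declaration this way
// B.cpp can call jump even though the definition is in another file.
int landAfterTwoJumps(int start, int distance)
{
  return jump(jump(start, distance), distance);
}

// ----- A.cpp (hypothetical contents) -----
// #include "A.h"     <-- A.cpp includes its own header too
// Because the declaration is visible here, a definition with a
// different return type (say, double) would be rejected by the compiler.
int jump(int start, int distance)
{
  return start + distance;
}
```

With the parts split into real files, B.cpp would compile on its own with g++ -c because A.h tells the compiler everything it needs to know about jump.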


Prototypes

File A.h typically contains prototypes for the functions that A.cpp exports. A prototype is a function heading followed by a semicolon. For example, suppose A.cpp provides function parent defined as follows.

  int parent(const int n)
  {
    return (n+1)/2;
  }
Then, to export parent, A.h contains prototype
  int parent(const int n);
Really, a prototype is just a promise that a function will be defined somewhere in the collection of modules that will be linked together. Normally, though, a prototype in A.h is assumed to indicate that the corresponding function is defined in A.cpp.

If A.cpp defines functions that are only for its own use, then those functions are not mentioned in file A.h.
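
Here is a small sketch of that idea, again with two files shown in one listing. It assumes a variant of A.cpp in which parent relies on a helper; the helper roundUpHalf is hypothetical and is not part of the original example.

```cpp
// Sketch only: two files shown in one listing so that it compiles as-is.

// ----- A.h (hypothetical contents) -----
// Only the exported function is declared here.
int parent(const int n);

// ----- A.cpp (hypothetical contents) -----
// #include "A.h"     <-- a real A.cpp would include its own header

// roundUpHalf is a made-up helper used only inside A.cpp,
// so A.h does not mention it.
int roundUpHalf(const int n)
{
  return (n + 1) / 2;
}

// parent is exported: its prototype appears in A.h.
int parent(const int n)
{
  return roundUpHalf(n);
}
```

A module that includes A.h sees parent but is told nothing about roundUpHalf.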

See sample header file gcd.h and file gcd.cpp.

Prototypes are not limited to header files. You can put them in .cpp files too. By doing that, you allow yourself to use a function before its definition (but after its prototype).
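
For instance, a prototype inside a .cpp file makes it possible for two functions to call each other. The functions isEven and isOdd below are assumptions chosen only to illustrate the point; they are not part of any module discussed here.

```cpp
// Prototype: promises that isOdd will be defined later in this file.
bool isOdd(unsigned n);

// isEven can call isOdd because the prototype above has already been seen.
bool isEven(unsigned n)
{
  return n == 0 ? true : isOdd(n - 1);
}

// The definition that fulfills the promise made by the prototype.
bool isOdd(unsigned n)
{
  return n == 0 ? false : isEven(n - 1);
}
```

Without the prototype, the compiler would reach the call to isOdd inside isEven before it had seen anything about isOdd, and it would report an error.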

Watch out: Don't forget the semicolon

If you omit the semicolon at the end of a prototype, the compiler will think that everything after that point is the body of a function. You will probably get pages of errors from that one missing semicolon.


Linkage

Header files only provide information that the compiler needs. In order to create an executable file from object files, you need to perform a step called linkage to put the object files together and write the executable file. The g++ compiler knows how to do linkage. Just list the object files to link. For example, command

  g++ -o run main.o tools.o grape.o
creates executable file run by linking together object files main.o, tools.o and grape.o.


Compiling several files at once

The g++ compiler will compile and link several files at once if you prefer. For example, command

  g++ -Wall -o run main.cpp tools.cpp grape.cpp
compiles main.cpp, tools.cpp and grape.cpp (each separately) and then links them together and writes executable file run. That is not the preferred way of doing it (see make), but it works in a pinch.


Do not include .cpp files

It is tempting to link files by including all of the .cpp files in one module, as in the following.

  #include "A.cpp"
  #include "B.cpp"
  ...
If you do that, any change to any module requires recompiling them all. That is not a problem for small programs, but it is a big problem for large pieces of software. Also, you still need header files to, for example, allow functions in A.cpp to use functions in B.cpp.

When I test your submissions, I will link them in the correct way, as shown above. If you include .cpp files, there will be duplicate definitions. That forces me to change your program in order to run it. It is not my job to fix mistakes in your software, and I will penalize you for forcing me to do that.


Summary

A C++ module usually has two parts: a header file (extension .h) and an implementation file (extension .cpp). A .cpp file always includes its own header file. You compile each module separately, then link the resulting object files together to make an executable program.

Include header files, not .cpp files. Do not make a header file include a .cpp file.


Exercises

  1. What is the purpose of a header file?

  2. How does a C++ program include header file tools.h that is in the current directory?

  3. How can you link together object files tools.o and main.o, producing an executable file called go?

  4. What is the difference between

      #include "tools.h"

    and

      #include <tools.h>