CS 122 W'15: Valgrind and Make

Goals

In this lab, you will learn about two very useful tools: valgrind and make.

To get started, please run git pull upstream master to pick up the lab4 directory.

Valgrind

The purpose of this part of the lab is to give you experience with valgrind, a tool for debugging and profiling Linux (or Linux-like) executables. As you will see, valgrind is useful for finding memory leaks and bad memory accesses.

This part of the lab is based on a lab from Colorado State, which in turn is based on a lab from The Hebrew University.

We will work through a series of very simple programs, each of which illustrates a different kind of bug.

Valgrind basics

Take a look at the code in the file ex1.c, then compile it with gcc (which names the executable a.out by default) and run it. You'll see that the code runs to completion without any problems. Now run it with valgrind:

valgrind ./a.out

You'll see a lot of output, including the message "definitely lost: 100 bytes in 1 blocks" and the instruction "Rerun with --leak-check=full to see details of leaked memory." Follow this instruction and run it again using the suggested flag:

valgrind --leak-check=full ./a.out

The output still contains a lot of information that you probably don't need. You can suppress this information using the -q flag:

valgrind -q --leak-check=full ./a.out

If you look at the output of the last command, you might notice that valgrind is trying to tell you where in the code the leaked memory was allocated. (As an aside, memory is said to have leaked if it was allocated, but never deallocated.) Unfortunately, when you compile the code without the -g flag, there is very little information about the source code in the resulting executable. As a result, valgrind cannot tie the leaked memory to a specific line in the code. Recompile ex1.c with the -g flag and then run valgrind again:

gcc -Wall -g ex1.c
valgrind -q --leak-check=full ./a.out

Now when you look at valgrind's output, you will see that it tells you exactly where the leaked space was allocated.

More examples

The files ex2.c through ex5.c each contain a different type of bug. Use valgrind to identify these bugs.

Compile and run ex6.c without valgrind. You'll notice that the C runtime system generates a cryptic message. Now run it again with valgrind and you'll see that valgrind's message is much more informative.

Valgrind does not identify all types of out-of-bounds array errors. Compile ex7.c and run it with valgrind. Notice that valgrind does not report an error, even though there is an out-of-bounds array reference. What is the difference between the error in ex7.c and the very similar error in ex4.c?

While it is very likely that you could have identified the bugs in these examples merely by inspecting the code, the exercises are useful because they will allow you to see the type of output valgrind generates for a number of different problems.

Valgrind is open source software and is available for different variants of Linux and OS X.

Make

Now we will move on to make, a tool that is useful for building programs.

This section of the lab was adapted from an old CS154 lab that was designed by Lars Bergstrom.

Compiling

Compilation of a C program takes place in two phases: first the C source files are each compiled separately, then they are all linked together. Both steps are handled by the gcc program. Separating compilation from linking is especially important for large projects in which a lot of time can be saved by only re-compiling the C source files that change, not all the files.

Make

Make is a tool for building programs. It reads a Makefile, expands the variables in it, and executes the rule named on the command line. If no rule is named, make runs the first rule it finds in the Makefile.

A Makefile has two kinds of entries: variables and rules. Variables are of the form:

OBJS = foo.o bar.o baz.o blork.o

By convention, variables are in all capital letters. Rules are of the form:

main: foo.o
    gcc foo.o -o main

clean:
    rm -rf *.o

The name to the left of the colon is the rule's target: either a file name or a phony target (as in the rule for cleaning up your directory). In either case, make looks on disk for a file with the target's name. If no such file exists, or if it exists but is older than at least one of the files listed to the right of the colon, make runs the command on the second line.

IMPORTANT! The second line must start with a tab. If there is anything other than a tab at the start of that line, make will not run the commands.

Make has a comprehensive manual.

Walk-through

Execute an ls in your lab4 directory and look at the files. There's a Makefile, two .h files, and a collection of .c files. To build the sources into an executable, just run make:

$ make

You will see output from gcc compiling two files and then linking them into a single binary. Now do another ls. You will see two .o files and an executable binary named main. Edit the file named foo.c: use your favorite editor to add a blank line someplace safe, save, and exit. Now re-run make as above.

You should see just the file foo.c rebuilt into foo.o, and then the binary named main rebuilt. bar.o should not have been rebuilt. Open up the Makefile to see why! When you ran make, it loaded the Makefile and built the first target, main. The target main depends on foo.o and bar.o. The files that bar.o depends on have not changed, so it did not need to be updated. However, foo.o's last modified date was older than foo.c's, so it was rebuilt. The important thing to take away from this discussion is that the Makefile is what specifies the dependencies, not the compiler or the source! If you don't keep the dependencies up to date, make will not know what to rebuild when a file is edited.

Makefiles often have a phony target named clean that is used to remove unneeded files, such as .o files. Run the command:

$ make clean

With the current Makefile, this will remove the .o files and prepare for another build.

You can revert the change you made to foo.c with the git checkout command:

$ git checkout foo.c

For the next exercise, you will alter a Makefile.

Exercise

Part 1: If you look at the rules, they have a lot of copied code! Create variables CC for the compiler, CFLAGS for the compiler flags, etc. and move all repeated constants to the top of the file as variables. As an example, if a rule looked like the following:

clean:
    rm -rf a.o b.o c.o

You would turn it into:

RM = rm
RMFLAGS = -rf
OBJS = a.o b.o c.o

clean:
    $(RM) $(RMFLAGS) $(OBJS)

FYI, the -c flag tells gcc to generate an object file (.o) rather than an executable. Object files can then be combined to create an executable.

Part 2: Listing all of those files both in the prerequisite line and on the command-line seems like a lot of redundant redundancy! Use automatic variables to simplify them. As an example, if a rule looked like the following:

frob: thing1.c thing2.c thing3.c
    tweak thing1.c thing2.c thing3.c -o frob

You would turn it into:

frob: thing1.c thing2.c thing3.c
    tweak $^ -o $@

$@ and $^ are automatic variables. The former ($@) refers to the target of the rule (that is, the string to the left of the colon) and the latter ($^) refers to all of the files listed as dependencies.

Part 3: The command for each object file is very similar to the command for every other object file. This can be simplified using pattern rules. Pattern rules contain a % character, which will match any non-empty string. An example would look like:

%.o: %.c

This indicates that any .o file depends only on the corresponding .c file.

Using pattern rules, combine the two rules that compile foo and bar into one rule that compiles either of them.

Be sure to run make clean followed by make periodically to check that you didn't break anything! There are multiple correct final solutions to this exercise.