Rules With Multiple Outputs in GNU Make

Mr. Make
Summary:
One problem that Makefile writers sometimes have is the need to write a single rule that produces multiple output files in order to accommodate tools that don't fit the standard one-command-one-output model generally assumed by Make. Eric Melski takes a look at a few alternatives, including the one and only way to truly capture the relationship in GNU Make syntax.

One problem that Makefile writers sometimes have is the need to write a single rule that produces multiple output files in order to accommodate tools that don't fit the standard one-command-one-output model generally assumed by Make. The classic example is bison, a parser generator used in crafting compilers and interpreters. Bison takes an input file like parser.i and generates both parser.c and parser.h. The Makefile hacker is left with a dilemma: how do you express this relationship in GNU Make syntax? In this article we'll look at the obvious answer and why it is wrong. Then we'll look at a few alternatives, including the one and only way to truly capture the relationship in GNU Make syntax.

The obvious but wrong solution
Faced with this problem, many Makefile hackers will write something like this:

all: parser.h parser.c

parser.h parser.c: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@sleep 2
	@touch parser.c parser.h

Unfortunately this Makefile does not describe a single rule with two outputs, but rather two distinct rules that each have a single output, and that happen to use the same series of commands.  In a serial build this distinction is often irrelevant and sometimes even undetectable:  although GNU Make will schedule both rules to run, the second rule will never do any work because its output file will already have been updated (by the first rule).  But try running this build in parallel with gmake -j 2:

Generating parser.h and parser.c from parser.i
Generating parser.h and parser.c from parser.i

Because there are two distinct rules which each update both output files, the files are actually updated twice.  In the best case, this just results in a little wasted work.  In the worst case, the rules both try to update the output files at the same time, resulting in corrupted output.  So, this approach will work if you only ever run serial builds, but nobody does that these days. So what's a Makefile hacker to do?
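Before looking at fixes, it may help to spell out what GNU Make effectively sees in the makefile above. A rule with several explicit targets is treated as if it had been written once per target, so the following sketch should behave the same way as the original (the comments are added for illustration):

```makefile
all: parser.h parser.c

# Rule 1: as far as make knows, this recipe produces only parser.h.
parser.h: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@sleep 2
	@touch parser.c parser.h

# Rule 2: as far as make knows, this recipe produces only parser.c.
parser.c: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@sleep 2
	@touch parser.c parser.h
```

Written this way, it is easier to see why a parallel build is free to run both recipes at once: nothing tells make that they are the same job.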

A Crude Fix

Our first attempt at fixing the problem is to add a dependency between the two output files.  This is a bit crude since there is no actual dependency between the files, but at least it will ensure that the rules run one at a time:

all: parser.h parser.c

parser.h parser.c: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@sleep 2
	@touch parser.c parser.h

parser.h: parser.c

This modification has made the makefile parallel-safe, but it has introduced a surprising side effect: even in a serialized build, the files are now generated twice.  Go ahead and try it yourself.  This is a result of the particular dependency graph algorithms that GNU Make employs. But here's what will really bake your noodle: if you reverse the order in which the output files are listed as prerequisites of all, suddenly the files are generated only once, because those algorithms are very sensitive to the order in which dependencies are declared.  So this approach seems to work, but it's too brittle to be considered seriously.
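For reference, here is the reordered variant described above. It is meant only to illustrate the brittleness: the sole difference from the previous makefile is the order of the prerequisites of all, and whether the recipe runs once or twice depends on make internals that a build should not rely on.

```makefile
# Same rules as before; only the prerequisite order of "all" has changed.
all: parser.c parser.h

parser.h parser.c: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@sleep 2
	@touch parser.c parser.h

# Artificial dependency that serializes the two rules.
parser.h: parser.c
```

Running a serial build from scratch with each ordering is a quick way to compare how many times the "Generating" line is printed.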

Another attempt
It seems that we might be able to fix some of the problems with our first crude attempt by rewriting the makefile so that there are no commands for one of the two targets:

all: parser.h parser.c

parser.c: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@sleep 2
	@touch parser.c parser.h

parser.h: parser.c

At first this seems pretty good. Now there is only one rule that can update the files, so there's no risk of duplicating work or corrupting outputs.  In a from-scratch build, this construct will work fine. But it has some trouble in incremental builds.  What happens if you delete just parser.h, but not parser.c?  Since there are no commands specified for generating parser.h, make will not know how to produce it, and since parser.c is already up-to-date, it will not run the commands that would generate the files.  Even if you explicitly ask make to build parser.h, by running "gmake parser.h", you're stuck:

gmake: Nothing to be done for `parser.h'.

If you only do from-scratch full builds, this solution may work for you, but if you do incrementals as well, use caution.

Dummy targets

Another approach is to use a dummy target to do the work, and have the actual output files depend on the dummy: 

all: parser.h parser.c

parser.h parser.c: generate_parser

generate_parser: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@sleep 2
	@touch parser.c parser.h

This looks promising: we have a single rule that generates both files, so it will work correctly in a parallel build.  But this approach has one significant drawback:  it breaks the relationship between the input file and the output files derived from it.  We no longer have a dependency that says, "if parser.i is newer than parser.c or parser.h, rebuild those files."  Instead, we have a dependency that says, "if parser.i is newer than 'generate_parser' (whatever that is), rebuild it". This makefile will rebuild parser.c and parser.h every time it is run, because make is comparing the times on parser.c and parser.h with generate_parser.  Since generate_parser doesn't exist, make will run that rule.  It doesn't matter if parser.i is older than parser.c and parser.h, because there is no direct relationship between those files in this makefile.

We can work around this by changing the generate_parser rule so that it also creates a file on disk named "generate_parser"; then on an incremental build, make will see that the file "generate_parser" is newer than parser.i and will not rebuild.  But this is messy:  we'll have an extra file hanging around that serves no purpose other than to work around a deficiency in the build tool, and we need to remember to manage that file along with the other outputs of the build.  It should be deleted by "make clean", for example.  And if somebody does something like "touch generate_parser" in between builds, then make may not be able to correctly detect that parser.c and parser.h must be rebuilt.  As with the previous solution, if you only do from-scratch full builds, this solution will work fine, but with incrementals you need to be careful.
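A minimal sketch of that workaround might look like the following. The extra touch line, the clean target, and the .PHONY declaration are illustrative additions, not part of the original makefile:

```makefile
all: parser.h parser.c

# The real outputs still hang off the dummy target and have no recipe.
parser.h parser.c: generate_parser

# The dummy target is now a real file on disk (a "stamp" file), so make
# can compare its timestamp with parser.i on incremental builds.
generate_parser: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@sleep 2
	@touch parser.c parser.h
	@touch $@

# The stamp file has to be cleaned up along with the real outputs.
clean:
	rm -f parser.c parser.h generate_parser

.PHONY: all clean
```

Note that this sketch inherits the caveats above: if parser.h alone is deleted while generate_parser is still up to date, make has no recipe for regenerating it.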

The GNU Make "right" way
In GNU Make syntax there is really only one correct way to get a single rule with multiple outputs, and that is to use a pattern rule:

all: parser.h parser.c

%.h %.c: %.i
	@echo Generating $*.h and $*.c from $*.i
	@sleep 2
	@touch $*.c $*.h

In direct contrast to the first example, this actually will create a single rule that creates two outputs. If you run with gmake -j 2, you'll see the files are updated only once:

Generating parser.h and parser.c from parser.i

This is the only construct that produces the correct dependency graph and behavior.  Unfortunately, it has one significant shortcoming: It requires that the input and all the outputs share a common stem, such as parser in our simple examples, so it's not as flexible as we'd like it to be.  Still, if your files do fit this restriction, then this is the best solution for you.
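As a more realistic illustration of the shared-stem requirement, here is a sketch that applies the same pattern-rule technique to bison itself. It assumes a grammar file named parser.y (rather than the parser.i used in the examples above) and that bison is installed; with the -d option, bison writes parser.tab.h alongside parser.tab.c by default, so both outputs share the stem parser with the input:

```makefile
all: parser.tab.h parser.tab.c

# One pattern rule with two targets: make runs the recipe once per stem.
# $< expands to the first prerequisite, here parser.y.
%.tab.c %.tab.h: %.y
	bison -d $<
```

If the tool's output names cannot be arranged to share a stem with the input like this, the pattern rule does not apply, which is exactly the limitation described above.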

Conclusion
Now you know some of the ways to create a makefile that generates multiple outputs from a single command.  If possible, you should use a pattern rule with multiple outputs.  If that doesn't work for you, hopefully one of the alternatives I've shown you will.

About the Author
Eric Melski was part of the team that founded Electric Cloud and is now Architect for ElectricAccelerator. Before Electric Cloud, he was a Software Engineer at Scriptics, Inc. and Interwoven. He holds a BS in Computer Science from the University of Wisconsin.

CMCrossroads is a TechWell community.