Descrambling Parallel Build Logs

Summary:

One of GNU make's many features allows you to shorten build times by running more than one command at a time. If your dependencies are all correct, or nearly correct, this can give you a significant improvement, and since it's built into the tool you get it "for free." But GNU make's parallel build feature—often called "dash j mode," after the command-line option that is used to enable it—is not without drawbacks. The worst of these is that GNU make parallel builds can produce incorrect results if the build dependencies are not sufficiently correct. Fixing that is way outside the scope of this column. But we can do something about the second biggest problem: the scrambled build log. It's not a perfect solution, as you'll see, but it addresses a large portion of the problem, and it's simple to implement.

A typical scrambled build log
If you haven't used GNU make parallel builds, you may not be familiar with the problem we're trying to solve here. The issue is that with multiple commands running simultaneously, all writing output and error text to the same location, the build log becomes a jumbled mess. The output from each command gets interleaved with the output from other commands. Worse, the error messages get mixed up too, so it becomes difficult or impossible to tell which commands are producing the errors. Consider this trivial makefile:

all: a b c
a b c: ; @for n in 1 2 3 4 ; do echo $@-$$n && sleep 1 ; done
Run serially, the output looks like this:

a-1
a-2
a-3
a-4
b-1
b-2
b-3
b-4
c-1
c-2
c-3
c-4

But run in parallel, the output is all mixed up:

a-1
b-1
c-1
b-2
a-2
c-2
a-3
b-3
c-3
b-4
a-4
c-4

If you run it in parallel again, the output is all mixed up again, but in a different way:

a-1
c-1
b-1
a-2
c-2
b-2
a-3
c-3
b-3
a-4
c-4
b-4

In fact, you might get a different permutation of the output every time you run the build. That's no problem if you never have any errors in your build, but for those of us without that kind of luck, this situation is a complete disaster. Unfortunately, you can't completely solve this problem without help from GNU make itself, but we can make things a little better by doing a bit of programming, and leveraging another, often overlooked, feature of GNU make.

Locker
The first piece of our solution is a simple utility that uses file locking to ensure only one command can write to the build log at a time. We'll look at a Linux version of the utility here to get a better idea of how it works.

// locker
//
//      A command invocation wrapper that captures stdout/stderr from the
//      command, then grabs an exclusive lock on a specified file and copies
//      the command stdout/stderr to the real stdout. Intended for use with
//      GNU make parallel builds to keep stderr/stdout from each command
//      distinct.
//
//      Usage: locker

#include
#include // For flock().
#include // For malloc().
#include // for tmpfile().
#include // For fork(), execvp(), dup2().
#include // For waitpid().

int main(int argc, char *argv[])
{
pid_t child;
int outfd = -1;
int status;
int lockfd = -1;
char buffer[64 * 1024];
ssize_t bytesRead;
FILE *outfile = tmpfile();

outfd = fileno(outfile);

child = fork();
if (child == (pid_t) 0) {
// In the child. Run the real shell.

char * const shArgv[4] = {
"/bin/sh",
"-c",
argv[3],
0
};
close(1);
close(2);
dup2(outfd, 1);
dup2(outfd, 2);
execvp("/bin/sh", shArgv);

// The following should never execute.

return 17;
} else if (child == (pid_t) -1) {
// Error in fork; this should never happen.

return 23;
}

// In the parent. Wait for the child to exit, then relay the output to
// the real stdout.

waitpid(child, &status, 0);
lockfd = open(argv[1], O_WRONLY|O_CREAT, 0600);
flock(lockfd, LOCK_EX);
fseek(outfile, 0, SEEK_SET);

while ((bytesRead = fread(buffer, 1, 64 * 1024, outfile)) > 0) {
write(1, buffer, bytesRead);
}
fclose(outfile);
flock(lockfd, LOCK_UN);
close(lockfd);

// Relay the exit code from the child.

return status;
}
There's not much to this utility. It just runs the specified command as a child process, redirecting the command's output and errors to a temporary file. When the command finishes, the wrapper acquires an exclusive lock on a specified lock file. This is the crux of the wrapper, as it ensures that the output from each command is written atomically, without interference from other commands that might be running at the same time. After copying the redirected output to the real standard output stream, the lock is released and the wrapper exits, propagating the exit status from the command. At this point we have a way to ensure that the output from each command is handled atomically. We just need to make sure each command is invoked using our wrapper. The obvious solution is to modify the makefile to prefix each command with locker lockfile -c, but on any non-trivial build that would be a nightmare. Fortunately there is a better way.

Overriding SHELL for fun and profit
Normally GNU make only uses the shell to invoke commands when the command-line requires the shell for correct operation, such as commands that use process pipelines, or that invoke shell built-in commands. Simple commands are invoked directly by GNU make, to avoid the overhead of creating the extraneous shell process. However, a little known feature of GNU make is that it will use the shell for every command invocation if you have explicitly set the SHELL variable in your makefile. We can exploit this fact to automatically prefix every command in the build with our wrapper, without requiring any other changes to the makefile. If SHELL is set, GNU make creates command-lines by appending -c commandline to the value of SHELL. That is, if the original command-line is gcc -o foo foo.c, then GNU make will actually invoke the command $(SHELL) -c "gcc -o foo foo.c". Therefore, we should set SHELL to simply locker lockfile. For example, we could modify our trivial test makefile like so:

SHELL:=$(shell pwd)/locker lockfile
all: a b c
a b c: ; @for n in 1 2 3 4 ; do echo $@-$$n && sleep 1 ; done
With this modification, the parallel build output might look like this:
b-1
b-2
b-3
b-4
a-1
a-2
a-3
a-4
c-1
c-2
c-3
c-4
Success! The output from individual commands is no longer interleaved with output from other commands. Not a bad return for the effort invested.

Shortcomings
Of course this solution is not perfect. As I said earlier, to really solve this problem correctly you need help from the make tool. The principal drawbacks of this approach are:

There is a small amount of additional overhead introduced by the addition of the wrapper to the command.

Output is handled on a per-command basis, not per-target. That means that if you have a target with a multi-line rule body, it is possible that the output of the different commands could get separated.

The order of output is still non-deterministic. That means that if you run the same build in parallel again, the build log might be in a different order. For example:

a-1
a-2
a-3
a-4
b-1
b-2
b-3
b-4
c-1
c-2
c-3
c-4

So it's not perfect, but I think it is demonstrably better than the previous sorry state of affairs. Here's hoping you find it useful. If you have other ideas for dealing with this problem, please share them in the comments!

Tags:

configuration management

About the author

Eric Melski

Eric is Chief Architect for ElectricAccelerator, a high-performance implementation of make from Electric Cloud, Inc. He obtained a BS in Computer Science from the University of Wisconsin in Madison in 1999. In 2002 Eric co-founded Electric Cloud, where he has spent more than a decade developing distributed, parallel systems designed to accelerate build processes. He is named on seven patents related to his work on build acceleration at Electric Cloud.

Apr 28	STAREAST Software Testing Conference in Orlando & Online
Jun 02	AI Con USA Bridging Minds and Machines
Sep 22	STARWEST Software Testing Conference in Anaheim & Online
Oct 13	Agile + DevOps USA The Conference for Agile and DevOps Professionals

May 09	Beyond Compliance: Ensuring Accessibility with Innovative Testing Strategies
May 16	Test Ahead of the Curve: Insights for Developing a Superior Test Coverage
On Demand	Building Confidence in Your Automation
On Demand	Leveraging Open Source Tools for DevSecOps
On Demand	Five Reasons Why Agile Isn't Working

Welcome

Descrambling Parallel Build Logs

About the author

Eric Melski

Connect with Me

Descrambling Parallel Build Logs

About the author

Eric Melski

Connect with Me

More Like This