Convention over Configuration: Replace Scripting with New Build Names

Summary:
Bernie Zelitch writes that his company’s build system scales well because early on, they scrutinized their build naming convention, saw its implications for the build ecosystem, and made radical changes. Their new naming convention took some getting used to, but once it was fully adopted, it improved economy, flexibility, and functionality.

In two years, a single build engineer has grown our centralized build dashboard from two projects to two hundred fifty as we acquired companies and added new products.

Our company’s build system scales well because early on, we scrutinized the build naming convention. We saw its implications for the build ecosystem and made radical changes.

To understand our story, let's compare the original build names with their current equivalents.

| Original name | Current example display name |
| --- | --- |
| myABILITY trunk continuous deployment | dpl_mya0-4.6.0_150707-140922-a67v75e_Debug_dev201_DB |
| myABILITY trunk qa manual deployment | dpl_mya0-4.6.0_150707-140922-a67v75e_Release_qa170_DB |

The original convention is well-suited for human understanding. Descriptive names are simple, so they are common in software departments.

But two years ago, some of us saw trouble. A change to configuration took a three-page recipe. Whole departments stressed out when a team moved its underlying source control branch or asked for a new project. Many meetings, timing negotiations, and emails led to mistakes. People committed to the wrong branch, or the build project failed as it pointed to a new branch.

Our first bid for relief was to make the build name more semantically rich. Could its name instruct automation scripts how to get its source control, how to build, and where to place build artifacts?

The answer was yes and no. As brilliant as the names looked on a whiteboard, they were just an academic exercise. We could not condense a three-page recipe into a thirty-five-character build string name—not without redesigning our code branching and other release patterns.

We made radical adjustments in branching strategy, artifact paths and names, and even build machine names. We aligned schedule and naming practices throughout the entire release cycle. Branches and build projects started and ended with the release cycle, with well-defined product string identifiers and major-minor-patch versioning strings.

So in our example above, we shortened the myABILITY product name to the identifier “mya0”. The resulting branch name with its major-minor-patch digits, “mya0-4.6.0”, was enforced throughout that release cycle, from the source code branch to the build name to the build artifacts. Deployment targets became known by names such as “qa231” or “dev201”, which signify their QA or development purpose followed by the last octet of their IP address. Finally, we constructed our artifact repository as a hierarchy of product identifier, branch, and source control identifier.
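The target naming rule is simple enough to sketch in a few lines of shell. This is only an illustration: the function name `target_name` and the sample addresses are hypothetical, and the article does not say how the target names were actually generated.

```shell
# Hypothetical helper: build a deployment target name from its purpose
# ("qa" or "dev") and the last octet of its IPv4 address.
target_name() {
    purpose=$1
    ip=$2
    # ${ip##*.} strips everything up to and including the last dot.
    printf '%s%s\n' "$purpose" "${ip##*.}"
}

target_name dev 192.0.2.201   # prints dev201 (the address is made up)
target_name qa 192.0.2.231    # prints qa231
```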

All these ideas arose from how a build naming convention could inform every place it touched, upstream and downstream. We modeled the build naming convention after an old friend: a shell script with positional parameters. This supported our goals of economy, flexibility, and functionality.

To see the analogy, imagine a deployment script, “dpl”, that accepts five positional parameters. Substitute a space for each underscore in the names above and you can see the idea:

| Position | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- |
| Description | product identifier and version | source control commit time and revision | type | deploy target | reinstall database? |
| Format | productIdentifier-major.minor.patch | yymmdd-hhmmss-sourceControlRevision | Debug or Release | targetIdentifier | DB (yes) or db (no) |
| Example arguments | mya0-4.6.0 | 150707-140922-a67v75e | Debug | qa170 | DB |
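To make the analogy concrete, here is a minimal sketch that splits a display name on underscores, much as a shell splits a command line on spaces. It is not the script described in this article. The function name `parse_build_name` and the variable names are hypothetical, and the field order follows the example names shown earlier (build type before deploy target).

```shell
# Hypothetical sketch: split a display name into named fields.
# Results are left in global variables for the caller to use.
parse_build_name() {
    name=$1
    old_ifs=$IFS
    IFS=_
    set -f            # disable glob expansion while splitting
    set -- $name      # the fields become positional parameters
    set +f
    IFS=$old_ifs
    if [ "$#" -ne 6 ]; then
        echo "expected 6 fields, got $#" >&2
        return 1
    fi
    script=$1
    product_version=$2
    commit=$3
    build_type=$4
    target=$5
    db_flag=$6
}

parse_build_name "dpl_mya0-4.6.0_150707-140922-a67v75e_Debug_dev201_DB"
echo "$script deploys $product_version ($build_type) to $target"
# prints: dpl deploys mya0-4.6.0 (Debug) to dev201
```

The sketch relies on underscores never appearing inside a field, which is why the convention uses hyphens and dots within the product and commit fields.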

Now, let us clarify our first table, where we called our example names display names. We could also call them instantiated names, because the possible values are already filled in. Using such filled-in names as build names would be useless because the resulting build would always produce the same outcome. So the actual build names use an algebraic X wherever a selection is to be offered. In our example, look at the timestamp and commit ID, 150707-140922-a67v75e:

dpl_mya0-4.6.0_150707-140922-a67v75e_Debug_dev201_DB

The actual build name would replace this with X. We would configure the build project to offer dropdown choices for this value, with the latest commit as the default:

dpl_mya0-4.6.0_X_Debug_dev201_DB

You can start to see the power of this build project name. Simply copying this example build project and changing the branch name—say, from mya0-4.6.0 to mya0-4.7.0—is pretty much all it takes to set up a new working build:

dpl_mya0-4.7.0_X_Debug_dev201_DB
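Because the branch is just the second field of the name, deriving the name for a new branch is a small string operation. The following is a hedged sketch with a hypothetical function name, `retarget_build_name`. It only computes the new name; copying the build project itself is left to whatever the build server provides.

```shell
# Hypothetical sketch: print a build name with its product-version
# field (the second "_"-separated field) replaced.
# The body runs in a subshell, so its variables stay local.
retarget_build_name() (
    name=$1
    new_version=$2
    prefix=${name%%_*}        # text before the first "_", e.g. "dpl"
    rest=${name#*_}           # everything after the first "_"
    remainder=${rest#*_}      # everything after the old product-version
    printf '%s_%s_%s\n' "$prefix" "$new_version" "$remainder"
)

retarget_build_name "dpl_mya0-4.6.0_X_Debug_dev201_DB" "mya0-4.7.0"
# prints: dpl_mya0-4.7.0_X_Debug_dev201_DB
```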

Now consider the art of naming. We use the right mix of X’s and hard-coded parameters to tell the build’s story. Look at our continuous deployment example above, dpl_mya0-4.6.0_X_Debug_dev201_DB. Only the source control information changes. We set it up to run with every commit and pick the head commit for its lone X. The product, branch, destination (dev201), and database operation are fixed. In contrast, we give the QA team many X choices for their manually triggered deployment (dpl_mya0-4.6.0_X_X_X_X). To make it easier for them, we set up the project to default to their usual preferences, including the head of the source control branch.
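One way to picture the X placeholders is as slots that are filled, left to right, by the values a user selects or the project defaults to. The sketch below assumes exactly that behavior. The function name `instantiate_build_name` is hypothetical, and the real build projects take these values from dropdown selections rather than from command-line arguments.

```shell
# Hypothetical sketch: replace each "X" field in a build name with the
# next supplied value; hard-coded fields pass through unchanged.
# The body runs in a subshell, so IFS and variables stay local.
instantiate_build_name() (
    template=$1
    shift
    result=
    IFS=_
    set -f
    for field in $template; do
        if [ "$field" = "X" ]; then
            if [ "$#" -eq 0 ]; then
                echo "not enough values for the X fields in $template" >&2
                exit 1
            fi
            field=$1
            shift
        fi
        result=${result:+${result}_}$field
    done
    if [ "$#" -ne 0 ]; then
        echo "more values than X fields in $template" >&2
        exit 1
    fi
    printf '%s\n' "$result"
)

# Continuous deployment: only the commit is open.
instantiate_build_name "dpl_mya0-4.6.0_X_Debug_dev201_DB" "150707-140922-a67v75e"
# prints: dpl_mya0-4.6.0_150707-140922-a67v75e_Debug_dev201_DB

# Manual QA deployment: four open fields.
instantiate_build_name "dpl_mya0-4.6.0_X_X_X_X" "150707-140922-a67v75e" Release qa170 DB
# prints: dpl_mya0-4.6.0_150707-140922-a67v75e_Release_qa170_DB
```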

Of course, the build process needs to understand the arguments so it can do the right work. As a first step, a script reads environment variables and uses them to construct new ones. It writes them to a standard Bash or Java properties file available in the build workspace. A starting point is parsing the build name ($JOB_NAME in Jenkins) into its arguments, replacing the X’s with the user-selected values. We use a Bash script to create language-portable array emulation, as shown in this fragment of a properties file. After splitting $JOB_NAME by “_”, it splits each piece further by “-” and “.” to give two more dimensions:

    # Emulated arrays properties table
    # variable=value
    __1=mya0-4.6.0
    __1_0=mya0
    __1_2=4.6.0
    __1_2_0=4
    __1_2_1=6
    __1_2_2=0
    __2=150707-140922-a67v75e
    # Derived variables
    product=$__1_0
    artifact_path=//repo/$product/${__1}_$__2/
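A table like this can be generated with a short recursive split. What follows is a minimal sketch, not the production script. The function names `emit` and `emit_job_properties` are hypothetical, and the index numbering is this sketch's own choice: zero-based at every level, with a deeper level emitted only when the split yields more than one piece. It may not match the numbering in the fragment above. The sketch also assumes the job name contains only letters, digits, “_”, “-”, and “.”, which is what makes loading the output with eval tolerable here.

```shell
# emit PREFIX VALUE DELIMS
# Print "PREFIX=VALUE", then split VALUE on the first character of
# DELIMS and recurse on each piece with the remaining delimiters.
# The body runs in a subshell, so IFS and variables stay local.
emit() (
    prefix=$1
    value=$2
    delims=$3
    printf '%s=%s\n' "$prefix" "$value"
    [ -n "$delims" ] || exit 0
    rest=${delims#?}              # delimiters for deeper levels
    IFS=${delims%"$rest"}         # first delimiter only
    set -f
    set -- $value
    [ "$#" -gt 1 ] || exit 0      # nothing to split at this level
    i=0
    for piece in "$@"; do
        emit "${prefix}_$i" "$piece" "$rest"
        i=$((i + 1))
    done
)

# Split the job name on "_" and emit one emulated array per field.
emit_job_properties() (
    job_name=$1
    IFS=_
    set -f
    set -- $job_name
    i=0
    for field in "$@"; do
        emit "__$i" "$field" "-."
        i=$((i + 1))
    done
)

# Load the table into the current shell, then derive further variables.
eval "$(emit_job_properties "dpl_mya0-4.6.0_150707-140922-a67v75e_Debug_dev201_DB")"
product=$__1_0
artifact_path="//repo/$product/${__1}_$__2/"
printf '%s\n' "$artifact_path"
# prints: //repo/mya0/mya0-4.6.0_150707-140922-a67v75e/
```

In a real job, the output of `emit_job_properties` would more likely be written to a properties file in the workspace, so that later build steps in any language can read it.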

   

Our consistent naming pattern welcomes other efficiencies. In our original design, both builds compiled code and then deployed the artifacts. Now, we split off the compile step into an upstream packaging build, pkg_mya0-4.6.0_X. This build specializes in monitoring for code changes, compiling, and publishing the resulting package. Through a Jenkins dashboard setting, it triggers the continuous deployment build, which knows where to get the package. The same package is offered as a dropdown choice when QA is ready to run the more controlled manual build. This is an example of “Build once, deploy many times.”

Creating new build projects from existing branches is so easy that we’ve made it self-serve. A Jenkins project allows a team lead to instantly create the new projects associated with a new branch. The underlying scripts are only fifteen hundred lines, which further simplifies creation of new projects.

The new naming convention has had an unintended side effect. What originally looked as cryptic as a doctor’s prescription order has entered office lingo. We might hear somebody ask, “When is mya0-4.6.0 releasing to production?” We’ve come full circle. Names that were intended to be short and clear for computers now also work for people.
