What to Do With Those Pesky Tags
Tags, also known as labels, are a common piece of metadata
in version control systems. They symbolically link many files/revisions
together under a human-readable name. They come in many types and flavors,
and their use is often hotly contested, sometimes all the way to the political
level. First, a little summary of what tags are and what they are often used
for.
Fixed Tags:
Fixed tags are tags that are applied once and then "never"
moved. By "never," I mean by anyone other than a CM person. They are used to
identify a fixed set of file/revisions and generally correspond to a milestone
point in time (also called a version). Even a Configuration Manager should
refrain from playing with fixed tags as it could easily invalidate an auditable
milestone. All changes to fixed tags should either be logged and tracked by the
Version Control (VC) tool itself, or by the CM person who made the changes in
an official CM Record.
Branch Tags:
Branch tags are tags that identify a specific branch.
Depending on the VC tool in use, how these tags are defined and used will vary,
but in general referencing a branch tag will yield the latest revision along
that branch. Some tools allow branch tags to be used in conjunction with a timestamp
to select file/revision sets along the branch that are earlier than, later
than, earlier than or equal to, or later than or equal to a timestamp.
Floating Tags:
Floating tags are tags that move from revision to revision somewhat like branch tags, but
they cannot be used to specify a branch. They are generally used to select
file/revisions that will be part of some future milestone. In general, they float to the latest revision on the
branch they are created on unless some tool-specific restriction is invoked.
Some of the more common restrictions are:
- A number in the revision numbering sequence is
manually forced, thus causing a gap in the numbering sequence
- A specific version is promoted to a different
state (e.g., Development to Test)
- The tag is frozen
Frozen Tags:
Frozen tags are tags that are blocked from being changed.
The most common use is to change a floating tag into a fixed tag. The second
most common use is to lock a tag from being modified by anyone. Of course, CM
Administrators can use special tools or modes to allow them to thaw frozen tags
and in some cases manipulate frozen tags directly, but good practices and all
regulatory agencies require the CM person who made the changes to note the fact
of modification in an official CM Record.
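The freeze-and-record rule can be sketched in a few lines of Python. This is an illustrative model only: the `Tag` class, the `move_tag` function, the `is_admin` flag, and the list standing in for the CM Record are all hypothetical and do not correspond to any specific VC tool.

```python
class TagError(Exception):
    """Raised when a frozen tag is moved without administrator rights."""

class Tag:
    def __init__(self, name, revision):
        self.name = name
        self.revision = revision
        self.frozen = False

def move_tag(tag, revision, user, cm_record, is_admin=False):
    """Move a tag, refusing frozen tags unless an administrator does it.

    Every administrator change to a frozen tag is appended to cm_record,
    a plain list that stands in for the official CM Record.
    """
    if tag.frozen:
        if not is_admin:
            raise TagError(f"{tag.name} is frozen")
        cm_record.append(
            {"user": user, "tag": tag.name, "from": tag.revision, "to": revision}
        )
    tag.revision = revision
```

Freezing a floating tag (setting `frozen = True`) effectively turns it into a fixed tag, and the record entry preserves the audit trail if an administrator later has to move it.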
Uses for Non-Branch Tags:
The most
common uses for non-branch tags include:
- Identifying files/revisions used in a build
- Identifying the promotion state of a build
- Identifying the components of a release
candidate
- Identifying the source and destination revisions
involved in a merge
- Identifying the base revisions involved in a
branch creation
In our environment I may have up to 8 branches in use at
any one time. Some of those branches have hundreds of builds tagged. Each
branch will have at least one merge source tag. The trunk will have one branch
base tag and one merge destination tag per branch plus trunk build tags
(currently around 1100 and counting). As you can see, that's a lot
of tags, and we only tag successful builds! Having all of these build tags allows
me to determine exactly what changed between builds - down to the character
level if necessary. It lets me trend the number of elements added, modified or
removed between builds and, in conjunction with a defect tracking system, lets
me assess the overall quality of the changes. Management can in turn use this
to objectively support release/delay decisions.
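To make the build-to-build comparison concrete, here is a minimal Python sketch. It assumes each build tag has already been resolved into a mapping of file path to revision number; the paths, revisions, and the `compare_builds` name are invented for illustration.

```python
def compare_builds(old, new):
    """Classify elements as added, modified, or removed between two tagged builds.

    `old` and `new` map file paths to the revision carrying each build tag.
    """
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return {"added": added, "modified": modified, "removed": removed}

# Hypothetical contents of two consecutive build tags.
BUILD_41 = {"src/main.c": "1.4", "src/util.c": "1.2", "doc/README": "1.1"}
BUILD_42 = {"src/main.c": "1.5", "src/util.c": "1.2", "src/net.c": "1.1"}

changes = compare_builds(BUILD_41, BUILD_42)
# Per-build counts like these are the raw material for trending.
counts = {kind: len(paths) for kind, paths in changes.items()}
```

A character-level comparison would then be a matter of diffing the two revisions of each path listed under `modified`.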
A particular build can have additional tags applied to
indicate its promotion state (such as Functional Test, Systems Test, Acceptance
Test or Release, though the actual tag more often resembles FuncTest_1_3). Many
of today's VC tools provide this information using metadata other than tags,
but it is always possible to use the tags as a form of lowest common
denominator.
Using tags to identify a release candidate is accomplished
by applying a second tag on top of a build tag. This double-tagging allows for
easier filtering, especially if one does a lot of builds. It is also not
uncommon for this type of tag to be the only one used for promotion tagging.
Merging is always a problem, and keeping track of which
revisions were merged together is always a challenge. Some tools, such as
ClearCase and CM+, make this much easier and provide mechanisms other than tags
for doing so. Once again, however, tags form the lowest common denominator. If
only a diff-2 style merge is used, then the source of a merge is tagged using a
tag such as MRGSRC_ACS_20061010 and the resulting revision is tagged using a
tag such as MRGTGT_ACS_20061010. If a diff-3 style merge is performed, then the
common ancestor revision is also tagged using a tag such as
MRGBASE_ACS_20061010, though it is more common for the common ancestor to be
the base of a branch (see below). In practice, I use four tags: Common
Ancestor, Merge Source, Pre-merge Target and Post-merge Target. This allows me
to also determine what changes were made during the conflict resolution phase
of the merge. Once again, the primary motivators to using these tags are
traceability, auditability and metrics.
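A naming scheme like this is easy to generate consistently. The Python sketch below builds the four tag names for one merge. The MRGBASE and MRGSRC prefixes follow the examples above; MRGPRE and MRGPOST are hypothetical prefixes for the pre-merge and post-merge target tags, and treating "ACS" as a branch or project label is likewise an assumption.

```python
from datetime import date

# MRGBASE and MRGSRC come from the examples in the text. MRGPRE and MRGPOST
# are hypothetical names for the pre-merge and post-merge target tags.
PREFIXES = {
    "ancestor": "MRGBASE",
    "source": "MRGSRC",
    "pre_target": "MRGPRE",
    "post_target": "MRGPOST",
}

def merge_tags(label, day):
    """Return the four tag names for a merge of `label` performed on `day`."""
    stamp = day.strftime("%Y%m%d")
    return {role: f"{prefix}_{label}_{stamp}" for role, prefix in PREFIXES.items()}

tags = merge_tags("ACS", date(2006, 10, 10))
```

With both target tags in place, the changes made during conflict resolution are whatever differs between the pre-merge and post-merge targets beyond what the source revisions contributed.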
The final type of tag listed above is applied to the
branch point when a branch is created. It is generally easy to compute the base
point of a branch for any individual element, but doing so for the entire
branched code base is not a pretty sight. The base tag makes it possible to
determine what has changed during the life of the child branch as well as
what has changed on the parent branch during the same time. Comparing the two
yields at best a list of potential
conflicts that may need manual intervention and at worst a ballpark
approximation of the difficulty in merging the child branch back into the
parent.
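The comparison described above amounts to intersecting two change sets. Here is a minimal Python sketch, assuming the base tag and the two branch tips have each been resolved into a mapping of file path to revision; the function names and sample data are invented for illustration.

```python
def changed_since(base, tip):
    """Paths whose revision differs between the base tag and a branch tip.

    Files added or removed since the base count as changed.
    """
    paths = set(base) | set(tip)
    return {p for p in paths if base.get(p) != tip.get(p)}

def potential_conflicts(base, parent_tip, child_tip):
    """Paths changed on BOTH the parent and the child since the branch point."""
    return sorted(changed_since(base, parent_tip) & changed_since(base, child_tip))

# Hypothetical file/revision sets.
BASE = {"a.c": "1.3", "b.c": "1.7", "c.c": "1.2"}
PARENT = {"a.c": "1.4", "b.c": "1.7", "c.c": "1.3"}
CHILD = {"a.c": "1.3.2.1", "b.c": "1.7.2.2", "c.c": "1.2"}

conflicts = potential_conflicts(BASE, PARENT, CHILD)
```

Here only `a.c` changed on both sides, so it is the one file likely to need manual attention; the length of the list gives the ballpark measure of merge difficulty.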
Project Management periodically asks if I can prune the
tags so there is less "useless" information being presented to developers. Now
comes the fun part. I live and breathe CM. I know
that I will need to reproduce a build from six months or a year ago. I know I
will need to do data mining for future QA analysis. I know that
getting rid of metadata such as tags makes traceability and auditability much more
difficult (and using lists generated from the tag metadata before it is removed
gives me even more files I need to version!). I know that I would have to document each tag removed in a CM Record,
along with the file/revisions that were affected if I am to do things "The CM
Way." Explaining this to Management (or even developers) is like explaining why
a pack rat hides things or why test-first is such an important aspect of the
Agile development model.
So I do what I can. I set up the GUIs and IDEs to filter
out most of the tags. I generate reports that list the information they need
without extraneous data that tends to confuse the audience. Some of the VC
tools even make this simple; others seem to go out of their way to make sure we
always see all of the metadata whether we want to or not. I do whatever I can
to keep my tags. And I keep the metadata duplicated in a non-tool-specific
metadata repository just in case.
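One simple form such a tool-neutral copy could take is a plain JSON document. The Python sketch below is only an assumption about what a duplicate might look like, not a description of the repository actually in use; the tag names, paths, and function names are hypothetical.

```python
import json

def export_tags(tags):
    """Serialize {tag: {path: revision}} to a stable, tool-neutral JSON document."""
    # sort_keys keeps the output deterministic, so the export itself diffs cleanly.
    return json.dumps(tags, indent=2, sort_keys=True)

def import_tags(text):
    """Load a previously exported tag document back into a dictionary."""
    return json.loads(text)

# Hypothetical tag metadata pulled from the VC tool.
TAGS = {
    "BUILD_42": {"src/main.c": "1.5", "src/net.c": "1.1"},
    "FuncTest_1_3": {"src/main.c": "1.5"},
}

snapshot = export_tags(TAGS)
```

Because the snapshot is ordinary text, it can be archived or versioned independently of whichever VC tool produced the tags.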
I would love to hear the experiences of others in how they
have used tags, or could have used tags to make things easier. I would also
love to hear the occasional disaster story. All is grist for the mill...
Ben Weatherall is currently based in Fort Worth,
Texas where he practices Practical CM on a daily basis using a
combination of CVS and custom tools to support a modified Agile-SCRUM
development methodology. He is a member of IEEE, ASEE (Association of
Software Engineering Excellence – The SEI’s Dallas-based SPIN Affiliate), NTLUG (North Texas Linux Users Group), and PLUG (Phoenix Linux Users Group).