Derived Object As Stereotypical Configuration Item
Traditional SCM, as well as CM, is plagued by a revision blindness: the assumption that configuration items must necessarily be source elements, explicitly added to the versioned database. Under this fallacy, SCM is often confused with RCS, or even understood as source code management.
The example of ClearCase over the last 15 years or so shows, however, to whoever cares to look at it, that this need not be the case. Of course, the behaviour demonstrated by ClearCase is crude, and restricted in arbitrary (hysterical) ways. For instance, it is limited to the realm of build management and, furthermore, placed under the control of the clearmake tool, and thus unfortunately bound to the use of makefiles.
A derived object produced (or modified) under auditing is automatically attributed a database id. Furthermore, if the same or a different user attempts to reproduce it (under the same conditions), the build is avoided and an existing suitable object, if found, is promoted to shared status (the first time) and winked in. The net result (beyond some possible time saving) is that derived objects are shared, which reduces the noise related to duplication. This is, by the way, the essential mechanism characteristic of configuration items!
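The build avoidance described above can be sketched in a few lines. This is only an illustration under stated assumptions, not clearmake's actual algorithm: the names `fingerprint`, `DerivedObject` and `DerivedObjectStore` are hypothetical, and "the same conditions" is approximated here by a hash of the build command and the content of its inputs.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field
from typing import Callable


def fingerprint(command: str, inputs: dict[str, bytes]) -> str:
    """Hash the build command with every input path and its content."""
    h = hashlib.sha256()
    h.update(command.encode())
    for path in sorted(inputs):
        h.update(b"\0" + path.encode() + b"\0")
        h.update(hashlib.sha256(inputs[path]).digest())
    return h.hexdigest()


@dataclass
class DerivedObject:
    object_id: int        # the "database id" attributed on first build
    content: bytes
    shared: bool = False  # set the first time the object is reused


@dataclass
class DerivedObjectStore:
    """Hypothetical database of derived objects, keyed by build conditions."""
    objects: dict[str, DerivedObject] = field(default_factory=dict)
    builds_run: int = 0

    def build(
        self,
        command: str,
        inputs: dict[str, bytes],
        run: Callable[[dict[str, bytes]], bytes],
    ) -> DerivedObject:
        key = fingerprint(command, inputs)
        existing = self.objects.get(key)
        if existing is not None:
            # Same conditions: avoid the build, promote and reuse ("wink in").
            existing.shared = True
            return existing
        # No suitable object: actually run the build and record the result.
        self.builds_run += 1
        obj = DerivedObject(object_id=len(self.objects) + 1, content=run(inputs))
        self.objects[key] = obj
        return obj
```

Calling `build` twice with the same command and inputs runs the build function once. The second call returns the stored object, now marked as shared.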
Two remarkable aspects must be stressed:
- Source files are an extremely small portion of the items composing a software configuration. In addition, they are typically of primary interest only to their authors: all other users would rather ignore them, and will be reminded of their existence only while investigating a breakdown. Derived objects, on the contrary, cover virtually anything (source elements being a degenerate case).
- The space of sources is flat (from a generic SCM perspective). Any structure among source files can only be specific to their type and format, and subject to interpretation or intent. Through dependencies, the domain of derived objects is, on the contrary, structured in a generic way, suitable for tool support, and hence a basis for objective semantics.
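The generic structure mentioned in the second point can be made concrete with a small sketch. It assumes that dependencies are recorded as a mapping from each derived object to its direct inputs. The mapping `DEPS`, the file names and the helper functions are all invented for illustration. A tool can then compute what an object was built from without knowing anything about file types or formats.

```python
from __future__ import annotations


def closure(deps: dict[str, list[str]], target: str) -> set[str]:
    """Return everything `target` depends on, directly or indirectly."""
    seen: set[str] = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for dep in deps.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def sources(deps: dict[str, list[str]], target: str) -> set[str]:
    """Leaves of the graph: items that were not themselves derived."""
    return {item for item in closure(deps, target) if item not in deps}


# Hypothetical dependency records, as an audited build might produce them.
DEPS = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.c", "util.h"],
    "util.o": ["util.c", "util.h"],
}
```

Here `sources(DEPS, "app")` yields the three source files. They appear only as the degenerate case of nodes without recorded inputs.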
A postmodern SCM could thus be built around this design. To sketch its implementation, think of it as a front-end to some kind of sophisticated strace utility, linked to a database. The challenge is to interpret some system calls as transaction boundaries, and to record and process the file handling so bracketed. Threads are tricky to support, given the various possible usage patterns (but one may already meet similar difficulties, e.g. with co-processes...). Another non-trivial aspect (beyond the scope of clearmake) would be that of managing local copies of remote resources.
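As a rough sketch of the idea, the function below reads lines written in the style of `strace -f` output and groups successful `openat` calls by process. It rests on several assumptions that the text does not settle: a process lifetime (ending at its `exited with` line) is taken as the transaction, only `openat` calls relative to `AT_FDCWD` are recognised, and a non-zero exit status discards the record. The function name `audit` and the record layout are hypothetical, and threads, co-processes and remote resources are ignored entirely.

```python
import re
from collections import defaultdict

# Simplified patterns for lines such as:
#   100  openat(AT_FDCWD, "a.c", O_RDONLY) = 3
#   100  +++ exited with 0 +++
OPEN_RE = re.compile(
    r'^(\d+)\s+openat\(AT_FDCWD, "([^"]+)", ([A-Z_|]+)[^)]*\)\s*=\s*(-?\d+)'
)
EXIT_RE = re.compile(r'^(\d+)\s+\+\+\+ exited with (\d+) \+\+\+')


def audit(trace_lines):
    """Return one record of files read and written per successful process."""
    pending = defaultdict(lambda: {"reads": set(), "writes": set()})
    records = []
    for line in trace_lines:
        m = OPEN_RE.match(line)
        if m:
            pid, path, flags, result = m.groups()
            if int(result) < 0:
                continue  # the open failed, so nothing was accessed
            flag_set = flags.split("|")
            writing = "O_WRONLY" in flag_set or "O_RDWR" in flag_set
            pending[pid]["writes" if writing else "reads"].add(path)
            continue
        m = EXIT_RE.match(line)
        if m:
            pid, status = m.group(1), int(m.group(2))
            files = pending.pop(pid, {"reads": set(), "writes": set()})
            if status == 0:
                # Transaction boundary: commit what this process touched.
                records.append({"pid": pid, **files})
    return records
```

A real front-end would have to follow many more system calls (renames, unlinks, changes of directory, forks) and decide how child processes nest inside the transaction of their parent.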
A virtual file system providing referential transparency (paths stable across version selection) is a key to reusing existing objects.
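To illustrate what "paths stable across version selection" means, here is a toy model, not a file system. The classes `VersionedStore` and `View`, the labels and the path are all hypothetical. The point is only that two views read the same path and obtain different versions, so that recorded paths remain comparable from one view to another.

```python
from __future__ import annotations


class VersionedStore:
    """Hypothetical store: every path maps to its labelled versions."""

    def __init__(self) -> None:
        self._versions: dict[str, dict[str, bytes]] = {}

    def add(self, path: str, label: str, content: bytes) -> None:
        self._versions.setdefault(path, {})[label] = content

    def read(self, path: str, label: str) -> bytes:
        return self._versions[path][label]


class View:
    """Resolves plain paths to versions according to selection rules."""

    def __init__(
        self, store: VersionedStore, selection: dict[str, str], default: str
    ) -> None:
        self.store = store
        self.selection = selection  # per-path overrides
        self.default = default      # label used when no override applies

    def read(self, path: str) -> bytes:
        # The caller never names a version: the path itself stays the same.
        label = self.selection.get(path, self.default)
        return self.store.read(path, label)
```

With `src/util.h` stored under labels `v1` and `v2`, a view defaulting to `v1` and a view overriding that path to `v2` both call `read("src/util.h")`, and each gets its own version.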
There are also direct implications for distribution.
In summary:
- One can conceive an SCM system not based on version control or revisions.
- An SCM implicitly managing arbitrary file objects would be far more general than the scope of CM has ever been. It could benefit non-developer end users, and thus have a mass market.
- This is obviously not the same thing as flatly staging every file, which has been done many times (from VAX VMS to some Subversion front end, via the Elephant File System, which never forgets).
--
MarcGirod - 22 May 2007