This month, I will explore the various situations wherein a repository is modified, starting with the simplest case of a single developer making a change to a single file.

Editing a Single File

Consider the simple situation where a developer needs to make a change to one source file. This case is obviously rather simple:

Step 1: Checkout

Checking out a file has two basic effects:
File checkouts are a way of communicating your intentions to others. When you have a file checked out, other users can be aware and avoid making changes to that file until you are done with it. The checkout status of a file is usually displayed somewhere in the user interface of the SCM client application. For example, in the following screendump from Vault, users can see that I have checked out libsgdcore.cpp:
Best Practice: Use checkouts and locks carefully

Sometimes the SCM tool will allow multiple people to checkout a file at the same time. SourceSafe and Vault both offer this capability as an option. When this "multiple checkouts" feature is used, things can get a bit more complicated. I'll talk more about this later.

If the SCM tool prevents anyone else from checking out a file which I have checked out, then my checkout is "exclusive" and may be described as a "lock." In the screendump above, the user interface is indicating that I have an exclusive lock on libsgdcore.cpp. Vault will allow no one else to checkout this file.

The client side of checkout

On the client side, the effect of a checkout is quite simple: If necessary, the latest version of the file is retrieved from the server. The working file is then made writable, if it was not in that state already. All of the files in a working folder are made read-only when the SCM tool retrieves them from the repository. A file is not made writable until it is checked out. This prevents the developer from accidentally editing a file.

Undoing a checkout

Normally, a checkout ends when a checkin happens. However, sometimes we checkout a file and subsequently decide that we did not need to do so. When this happens, we "undo the checkout." Most SCM tools have a command which offers this functionality. On the server side, the command will remove the checkout and release any exclusive lock that was being held. On the client side, Vault offers the user three choices for how the working file should be treated:
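Stepping back to the checkout mechanics described in this section, the following Python sketch may help make them concrete. It is only an illustration, not how Vault or SourceSafe is implemented: `ToyServer`, `LockError`, and `set_writable` are hypothetical names, the server is just an in-memory table, and making the working file read-only again on undo is an assumption chosen for simplicity rather than a statement about Vault's options.

```python
import os
import stat


class LockError(Exception):
    """Raised when another user already holds an exclusive checkout."""


class ToyServer:
    """Hypothetical in-memory stand-in for the server's checkout table."""

    def __init__(self):
        self.locks = {}  # file name -> user holding the exclusive lock

    def checkout(self, name, user):
        holder = self.locks.get(name)
        if holder is not None and holder != user:
            raise LockError(f"{name} is checked out by {holder}")
        self.locks[name] = user

    def undo_checkout(self, name, user):
        # Remove the checkout and release the exclusive lock, if we hold it.
        if self.locks.get(name) == user:
            del self.locks[name]


def set_writable(working_file, writable):
    """Turn the write permission bits of a working file on or off."""
    mode = os.stat(working_file).st_mode
    if writable:
        os.chmod(working_file, mode | stat.S_IWUSR)
    else:
        read_only = mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)
        os.chmod(working_file, read_only)


def checkout(server, user, name, working_file):
    """Server side: record the exclusive lock. Client side: make the file writable."""
    server.checkout(name, user)  # raises LockError if someone else has it
    set_writable(working_file, True)


def undo_checkout(server, user, name, working_file):
    """Release the lock; this sketch simply makes the file read-only again."""
    server.undo_checkout(name, user)
    set_writable(working_file, False)
```

In this toy model, a second user calling `checkout` on the same file gets a `LockError` until the first user checks in or undoes the checkout, which is the "exclusive" behavior described above.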
Step 3: Checkin

After the file is checked out, the developer proceeds to make her changes. She edits the file and verifies that her change is correct. Having completed all this, she is ready to submit her changes to the repository. Doing so will make her change permanent and official. Submitting her changes to the repository is the operation we call "checkin." The process of a checkin isn't terribly complicated:

One issue does deserve special mention. Most SCM tools ask the user to enter a comment when making a checkin. This comment will be stored in the repository forever along with the changes being submitted. The comment provides a place for the developer to explain what was changed and why the change was made.

Best Practice: Explain your Checkins Completely

Every SCM tool provides a way to associate a comment when checking changes into the repository. This comment is important. If we consistently use good checkin comments, our repository's history contains not only every change we have ever made, but it also contains an explanation of why those changes happened. These kinds of records can be invaluable later as we forget things. I believe developers should be encouraged to enter checkin comments which are as long as necessary to explain what is going on. Don't just type "minor change." Tell us what the minor change was. Don't just tell us "fixed bug 1234." Tell us what bug 1234 is and tell us a little bit about the changes that were necessary to fix it.
Checkins are Additive

It is reassuring to remember one fundamental axiom of source control: Nothing is ever destroyed. Let us suppose that we are editing a file which is currently at version 4. When we checkin our changes, our new version of the file becomes version 5. Clients will be notified that the latest version is now 5. Clients that are still holding version 4 in their working folder will be warned that the file is now "Old." But version 4 is still there. If we ask the server for the latest version, we will get 5. But if we specifically ask for version 4, or for any previous version, we can still get it. Each checkin adds to the history of our repository. We never subtract anything from that history.

Other Kinds of Checkins

We will informally use the word "checkin" to refer to any change which is made to the repository. It is common for a developer to say, "I made some checkins this afternoon to fix that bug," using the word "checkin" to include any of the following types of changes to the repository:
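The additive nature of checkins described under "Checkins are Additive" can be sketched as an append-only list of versions. This is a toy model under stated assumptions, not the storage format of any real SCM tool: `FileHistory` and its methods are hypothetical names, and a real server would typically store deltas rather than full copies.

```python
class FileHistory:
    """Toy append-only version history for a single file (hypothetical model)."""

    def __init__(self, initial_contents, comment="initial version"):
        # Version numbers start at 1; each entry is (contents, checkin comment).
        self._versions = [(initial_contents, comment)]

    @property
    def latest_version(self):
        return len(self._versions)

    def checkin(self, contents, comment):
        """Add a new version. Existing versions are never changed or removed."""
        self._versions.append((contents, comment))
        return self.latest_version

    def get(self, version=None):
        """Return the contents of a specific version, or the latest by default."""
        if version is None:
            version = self.latest_version
        if not 1 <= version <= self.latest_version:
            raise KeyError(f"no such version: {version}")
        return self._versions[version - 1][0]
```

With a history that currently stands at version 4, calling `checkin` produces version 5, `get()` then returns the contents of version 5, and `get(4)` still returns exactly what version 4 contained.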
I will take this opportunity to say a few things about how these operations behave. If we conceptually think of a folder as a list of files and subfolders, each of these operations is actually a modification of a folder. When we create a folder inside folder A, then we are modifying folder A to include a new subfolder in its list. When we rename a file or folder, the parent folder is being modified. Just as the version number of a file is incremented when we modify it, these folder-level changes cause the version number of a folder to be incremented. If we ask for the previous version of a folder, we can still retrieve it just the way it was before. The renamed file will be back to the old name. The deleted file will reappear exactly where it was before. It may bother you to realize that the "delete" command in your SCM tool doesn't actually delete anything. However, you'll get used to it.

Atomic Transactions

I've been talking mostly about the simple case of making a change to a single source code file. However, most programming tasks require us to make multiple repository changes. Perhaps we need to edit more than one file to accomplish our task. Perhaps our task requires more than just file modifications, but also folder-level changes like the addition of new files or the renaming of a file. When faced with a complex task that requires several different operations, we would like to be able to submit all the related changes together in a single checkin operation. Although tools like SourceSafe and CVS do not offer this capability, some source control systems (like Vault and Subversion) do include support for "atomic transactions."

Best Practice: Group your Checkins Logically

I recommend that each transaction you check into the repository should correspond to one task. A "task" might be a bug fix or a feature. Include all of the repository changes which were necessary to complete that task, and nothing else.
Avoid fixing multiple bugs in a single checkin transaction.

The concept is similar to the behavior of atomic transactions in a SQL database. The Vault server guarantees that all operations within a transaction will stay together. Either they will all succeed, or they will all fail. It is impossible for the repository to end up in a state with only half of the operations done. The integrity of the repository is assured.

To ensure that a transaction can contain all kinds of operations, Vault supports the notion of a pending change set. Essentially, the Vault client keeps a running list of changes you have made which are waiting to be sent to the server. When you invoke the Delete command, not only will it not actually delete anything, but it doesn't even send the command to the server. It merely adds the Delete operation to the pending change set, so that it can be sent later as part of a group.

In the following screen dump, my pending change set contains three operations. I have modified libsgdcore.cpp. I have renamed libsgdcore.h to headerfile.h. And I have deleted libsgdcore_diff_file.c.

Note that these operations have not actually happened yet. They won't happen unless I submit them to the server, at which time they will take place as a single atomic transaction. Vault persists the pending change set between sessions. If I shut down my Vault client and turn off my computer, next time I launch the Vault client the pending change set will contain the same items it does now.

The Church of "Edit-Merge-Commit"

Up until now, I have explained everything about checkouts and checkins in a very "matter of fact" fashion. I have claimed that working files are always read-only until they are checked out, and I have claimed that files are always checked out before they are checked in. I have made broad generalizations and I have explained things in terms that sound very absolute. I lied.

In reality, there are two very distinct doctrines for how this basic interaction with an SCM tool can work. I have been describing the doctrine I call "checkout-edit-checkin." Reviewing the simple case when a developer needs to modify a single file, the practice of this faith involves the following steps:
Followers of the "checkout-edit-checkin" doctrine are effectively submitting to live according to the following rules:
This approach is the default behavior for SourceSafe and for Vault. However, CVS doesn't work this way at all. CVS uses the doctrine I call "edit-merge-commit." Practicers of this religion will perform the following steps to modify a single file:
The edit-merge-commit doctrine is a liberal denomination which preaches a message of freedom from structure. Its followers live by these rules:
Each of these approaches corresponds to a different style of managing concurrent development on a team. People tend to have very strong feelings about which style they prefer. The religious flame war between these two churches can get very intense.

Holy Wars

The "checkout-edit-checkin" doctrine is obviously more traditional and conservative. When applied strictly, it is impossible for two people to modify a given file at the same time, thus avoiding the necessity of merging two versions of a file into one. The "edit-merge-commit" doctrine teaches a lifestyle which is riskier. The risk is that the merge step may be tedious or cause problems. However, the acceptance of this risk rewards us with a concurrent development style which causes developers to trip over each other a lot less often.

Still, these risks are real, and we will not flippantly disregard them. A detailed discussion of file merging appears in the next chapter. For now I will simply mention that most SCM tools include features that can safely do a three-way merge automatically. Not all developers are willing to trust this feature, but many do. So, when using the "edit-merge-commit" approach, the merge must happen, and we are left with two choices:
Eric Sink is a software developer at SourceGear, which makes source control (aka "version control," "SCM") tools for Windows developers. He founded the AbiWord project and was responsible for much of the original design and implementation. Prior to SourceGear, he was the Project Lead for the browser team at Spyglass (now OpenTV), which built the original versions of the browser you now know as "Internet Explorer." Eric received his B.S. in Computer Science from the University of Illinois at Urbana-Champaign. The title on Eric's business card says "Software Craftsman." You can reach Eric at eric@sourcegear.com

This series of articles from Eric Sink is part of his online book called Source Control HOWTO, a best practices guide on source control, version control, and configuration management. You can find it online at http://software.ericsink.com/scm/source_control.html
Last Updated on Sunday, 03 August 2008 18:12






