In this month's article I will be drawing on a recent personal experience. You have been warned.

A while ago I was having an increasingly difficult time breathing and I was able to do less and less. I went to the doctor several times and followed up with phone calls as, while some symptoms got better, overall I was doing worse. It got to the point where I was unable to drive to work without pulling off once or twice to take "power naps" of around 10 minutes each. Without this, the world would turn grey, I would drift across lanes and eventually black out. Fortunately, I was always able to pull off, but not even the fear was able to snap me out of it. As some of you may know, I drive a little yellow car, along with a group of other like-minded men, in parades throughout the year. During practice, the drills last from 8 to 10 minutes, but after 6 minutes I had to pull out or pass out.

Finally, I was referred to a specialist, though I had to wait for a week to see him. Within a few minutes of his walking into the room, he asked if I had any family plans over the upcoming holidays. I told him no and he stuck his head out the door and told one of his assistants to "get me a room tonight!" Thus began 5 days of having my lungs back-flushed, antibiotics, steroids, respiratory therapy, diet manipulation, etc.

The good news is that I am better now. It will still take a while to build myself back to where I need to be, maybe up to another year. The bad news is that the downward spiral had been going on for over 15 months without anyone noticing it. The primary reason for this was multiple causes masking each other. As the doctors and I addressed any one problem, the others kept the downward spiral going. This is a classic definition of a mess - multiple problems that fully or partially mask each other so that addressing one of them only changes the symptoms detected instead of satisfactorily addressing the issue. You might wonder what this has to do with SCM.
When I was spiraling down, we did not have in place the controls [1] necessary to detect messes [2] and kept addressing single problems in a reactive fashion. As SCM'ers, we are often guilty of doing the same thing. We set the basic tools in place and either work within some other group's workflow or control the process ourselves. Then we sit back and tend to our daily grind. Like developers, we keep our heads down trying to keep things working. As a group, we are typically understaffed, so we tend to take on multiple roles such as:
This leaves little time to determine which controls need to be put in place, much less do the actual work of creating and automating them. We depend on our memories and experience to "feel" the trends, or at least problems in the making. So being sick for the first 12 months was "felt" to be just a run of bad luck, not a set of problems that were not being addressed.

Most of the frameworks [3] that address Software Configuration Management treat this type of monitoring (metrics) and controls (alarms generated from metrics and trend analysis) as a best practice. Going back to the early days of the CMM, Software Configuration Management evolved along with the rest of the process system. What was necessary for achieving Level 2 was much less than what was needed at higher Levels.

In a reactive SCM environment, we have to perform heroic measures, often of a draconian nature, in order to correct a mess. And after we succeed (failure is not an option - right?) we still do not have any controls in place to keep it from happening again. We may not even know how to detect it in the future if some, but not all, of the component problems recur. We need to become more proactive at defining controls and getting them in place - even if we have to do it gradually.

Remember, a control is not generally based on a single set of data. Collecting information on cholesterol levels is only one of many indicators of potential heart problems. Trending cholesterol levels is more informative, but still cannot be taken in isolation. Other measures, such as blood pressure, resting pulse rate, weight, height, BMI, ECG and age at the time of data collection all go together to constitute the control.

After you determine the controls you want, determine what data is necessary and how the data needs to be analyzed to put them in place. It will often be true that the raw data is already being captured in one or more of the existing CM tools, but in a form that is not useful for controls.
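To make the difference between a metric and a control concrete (see note [1]), here is a minimal Python sketch of controls that combine several metrics and name the action they trigger. It is only an illustration: the metric names, the thresholds and the `Control` and `evaluate` helpers are all hypothetical and do not come from any particular CM tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metrics = Dict[str, float]


@dataclass
class Control:
    """A condition over one or more metrics plus the action it triggers."""
    name: str
    is_tripped: Callable[[Metrics], bool]
    action: str


def evaluate(controls: List[Control], metrics: Metrics) -> List[str]:
    """Return the actions triggered by the current metric values."""
    return [f"{c.name}: {c.action}" for c in controls if c.is_tripped(metrics)]


# All metric names and thresholds below are hypothetical.
CONTROLS = [
    # The release example from note [1]: a release with open critical defects.
    Control(
        name="release gate",
        is_tripped=lambda m: m["release_requested"] > 0
        and m["critical_defects_open"] > 0,
        action="reject release and notify",
    ),
    # A control built from two data sources: a rising trend AND a raw count.
    Control(
        name="process drift",
        is_tripped=lambda m: (
            m["process_bypasses_this_month"] > m["process_bypasses_last_month"]
            and m["commits_without_issue_id"] > 5
        ),
        action="notify SCM lead",
    ),
]

snapshot = {
    "release_requested": 1,
    "critical_defects_open": 2,
    "process_bypasses_this_month": 1,
    "process_bypasses_last_month": 3,
    "commits_without_issue_id": 9,
}
print(evaluate(CONTROLS, snapshot))
```

With this snapshot only the release gate trips, because the bypass count is falling. The point of the second control is that neither number alone triggers anything; it takes the combination.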
Similarly, the analysis that needs to be performed is most often simple in nature, but requires data from multiple sources. Note that I did not say that we should necessarily automate data analysis at the same time. Collect enough data for whatever analysis you plan to do to be statistically meaningful, while using something as simple as a spreadsheet to go beyond the earlier prototyping. The idea is to do the analysis refinement and automation while a reasonable database is being generated. Of course, if you are starting with mined historical data, then the two can be developed in parallel.

Some of the typical areas where problems occur are:

· Process

Process failures occur when people decide to bypass the process or do not know/understand the process. The second can be corrected with training, but only if it is detected before too much damage is done. The first indicates a personnel problem, a management problem or a process problem.

When only a relatively small number of personnel decide to bypass a process, they need to be identified and management made aware of them. SCM is all about consistency and reproducibility. We can deal with Agile Development, we can even deal with Agile SCM, but if established processes are being bypassed, then so are the checks and balances necessary to ensure the integrity of the content in our repositories and the quality of the associated products.

If there is a management problem, it typically manifests itself in one of two ways:

  o Personnel are directed to violate process
  o Offending personnel do not have their behavior addressed

If there is a process problem, it is necessary to address it as quickly as possible. It may lead to personnel bypassing it, or it may be affecting productivity, quality or reliability.

· Procedures

Procedural failures mean steps are skipped, performed in the wrong order or executed improperly.
This could be as simple as not following a commenting convention when checking code into a Version Control system or, where SCM is integrated into a workflow or ALM [4] system, someone assigned to a specific role taking an incorrect path. Any time steps are skipped or taken out of order, problems can occur that may never be detected under normal conditions.

To detect problems of this nature it may become necessary to have automatic logging occur (if the steps are performed through tools) or to have some monitoring software detect changes to the environment and log them, along with any additional information that can be determined. It is not uncommon for any one set of captured data to be insufficient to detect procedural failures; combining multiple sources, however, may be sufficient.

· Naming conventions/standards

There are, or should be, naming conventions not only for files/modules/classes, but also for such things as supporting documents, directories, branches/streams, builds and release identifiers. The purpose behind naming conventions is threefold:

  o Make it simpler to find what you are looking for
  o Make it less error prone when you think you know what something is
  o Make it more amenable to automation

Detecting things that violate naming conventions/standards allows early corrective actions. Stopping these violations is an important, though often ignored, task.

· Evil Siblings

Evil Siblings, often called Evil Twins, are not just an artifact of Version Control. They can exist in DIET [5] systems as well.

In Version Control systems, Evil Siblings are instances where the same item exists on multiple branches/streams, but the instances do not share a common history. It is relatively easy to monitor for elements being created in one branch/stream whose name has already been used in the same directory in another stream. It is a little more difficult to detect that an element of that name exists in another branch/stream, but has been renamed.
It is even more difficult to detect when an element in another branch/stream has been moved to another location (and perhaps renamed as well).

In a DIET system, Evil Siblings occur when you have duplicate issues that are not identified as such. It is very difficult to automatically detect that a set of records are related after they have been created. There are some methods that can be applied heuristically to identify potential duplicates, but they are not common and are rarely built into the product. The only way I have found to automate this myself was to search for records that had similar keywords and either referenced the same elements (or elements that were cloned from each other), or which were marked as "Could not duplicate" in a subsequent release. This was somewhat helpful, but I feel that it caught only "low-hanging fruit."

· Conflicting defects

Unlike Evil Sibling defects, these defects are in direct opposition to each other. If one says that under a specific condition a flag should be set to TRUE, then the other says it should be set to FALSE. Every time one of the defects is resolved, the other reappears. This typically reflects a problem at the Requirements level, but not always. Again, detecting this condition automatically is difficult; however, any such detection should cause an immediate notification to occur.

Generic controls are those that could detect any or all of these classes of problems. Some controls, in addition to any mentioned above, that might be put in place are:
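As one illustration (a sketch, not a feature of any particular tool), the simplest Evil Sibling check described above could look like this in Python. It assumes you can export, for each branch/stream, a mapping from path to some identifier of the element's history. The `find_evil_siblings` function, the branch names and the element IDs are all hypothetical. It only catches the easy case of the same name in the same directory; renamed or moved elements would need history data this sketch does not use.

```python
from collections import defaultdict
from typing import Dict


def find_evil_siblings(
    branches: Dict[str, Dict[str, str]]
) -> Dict[str, Dict[str, str]]:
    """Return {path: {branch: element_id}} for every path that exists on
    more than one branch under different element IDs (no shared history)."""
    by_path: Dict[str, Dict[str, str]] = defaultdict(dict)
    for branch, elements in branches.items():
        for path, element_id in elements.items():
            by_path[path][branch] = element_id
    return {
        path: owners
        for path, owners in by_path.items()
        if len(set(owners.values())) > 1
    }


# Hypothetical export: branch -> {path: element ID}.
branches = {
    "main": {"src/util.c": "e1", "src/io.c": "e2"},
    "feature": {"src/util.c": "e1", "src/io.c": "e9"},
}
print(find_evil_siblings(branches))
```

Here `src/util.c` shares one history across both branches and is ignored, while `src/io.c` was created separately on each branch and is reported. Wired to a notification, a report like this becomes a control and not just a metric.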
Should I worry about tool specifics?

The data you collect should be maintained outside of all the existing tools if at all possible. This prevents the loss of the data and the supporting analysis and notification infrastructure as the tools evolve or are replaced.

Another thing about this analogy: even though we identified what we believe to be all five of the root causes, I have only resolved four of them. I have an internal resistance to changing a process I am familiar and comfortable with, even though I know I need to. This is not uncommon within the SCM arena either. There are some things that are just "too difficult" or take "too much time we don't have" to fix. I don't know if it is this "failure to resolve" that is causing it to take so much longer to "heal" than I thought it would, or if there is a sixth root cause still hiding.

[1] Unlike metrics, which simply measure predefined conditions, controls provide thresholds or limits where specific actions are triggered. A simple example of a metric is the number of critical severity defects remaining in a build. The corresponding control might be that if a release is initiated and the number of critical severity defects remaining is non-zero, then the release attempt is automatically rejected and suitable notifications are sent.

[2] More than one co-masking problem

[3] Frameworks, to use Bob Aiello's explanation, are formalized encapsulations of cumulative wisdom that go before review bodies. They exist somewhere between guidelines and standards. Under this definition, CMM and CMMI are two examples of frameworks.

[4] ALM - Application Lifecycle Management

[5] DIET - Defect, Issue and Enhancement Tracking

Ben Weatherall is currently based in Fort Worth, Texas where he practices Practical CM on a daily basis supporting a modified Agile-SCRUM development methodology. He uses a combination of AccuRev, CVS, Bugzilla and AnthillPro (as well as custom tools).
He is a member of IEEE, ASEE (Association of Software Engineering Excellence – The SEI’s Dallas based SPIN Affiliate), FWLUG (Fort Worth Linux Users Group), NTLUG (North Texas Linux Users Group) and PLUG (Phoenix Linux Users Group).
Last Updated on Tuesday, 18 August 2009 16:45


