Deployment management, although not strictly part of build management, is still a very important topic, because poor deployment practices can render even very strong build management practices almost useless. Recently I've run into several development teams that have what I would call "terrible" deployment practices. The cardinal sin perpetrated by all these teams comes down to modifying deployed artifacts directly. The problem is that modifying deployed artifacts breaks the traceability of those artifacts back to the source. All of a sudden, you find yourself in a situation where you do not know what is actually deployed. And that is not a very good situation to be in. :)
This is an issue that hits web applications particularly hard.

The Example

Perhaps the problem, and how easy it is to get caught up in it, is best illustrated by an example. In this hypothetical example, let's say that we have a web application. The application can be written in pretty much any language, but let's say that it's written in Java. As a typical web application, it is made up of some HTML files, image files, JSP files, some jar files and some configuration files. In the Java world, our application can be deployed either as a single binary file called a war (web application archive) or as a directory tree containing all the files making up the application. It turns out that a war file is pretty much just a zip file containing the directory tree holding all the application files.

Let's say that we have a pretty good build process in place on our project. Each build produces our build artifacts, namely our war file, with a unique version number. Our build process also ensures that we maintain traceability between the build artifacts and the sources used to produce them. This is accomplished by applying a label to our source code repository.

At some point during the development of the project, we decide to do a build that will go into testing. The artifacts produced by this build get handed off to the QA team (to be deployed in one of the test environments), or to the customer so that the customer's team can do testing. During the testing several "small" problems emerge. And as the timeline is getting uncomfortably tight, the development team is told to fix the problems right there on the test server.

At this point, I think I know what you're thinking -- that such things never happen. I have to admit that I am having some difficulty suspending my disbelief as well, and I'm the author.
But let me assure you that these types of situations happen quite often, and perhaps the most surprising thing is that I've seen this happen even when there is great talent and experience on the project team; it happens in places that you would never expect.

The Problems Caused by Poor Deployment Practices

Many seemingly unexplained problems may arise out of the above example situation. The root of all this evil is that the link between the deployment environment and the build artifacts, and thus the sources used to produce them, is broken. This often manifests itself as a situation where many of the problems that were supposedly fixed in the previous build suddenly resurface in subsequent builds. That happens because if you make fixes in the deployment environment, there is a high likelihood that such fixes never actually make it to the source control system. And as I'm fond of saying, "if it's not in source control, then it doesn't exist." I don't think I need to go into much more detail on this problem.

Another often seen repercussion of such deployment practices is the inability to reproduce problems seen in one environment on another environment. A particularly heartbreaking example of this is the inability to reproduce in a staging environment a problem that suddenly appears in the production environment. Luckily, most teams have very strict access controls on the production environment so that very few people are actually able to touch it. But even leaving production environments out of it, a great many hours can be wasted trying to track down the reason why two seemingly identical test environments display different behavior or problems. Of course the cause of this is pretty obvious -- the two environments are not identical at all.
Even assuming that the same version of the build artifacts was deployed into each environment, changes made directly to the environments after deployment ensure that the two environments differ in ways nobody may know.

Potential Solutions

I don't think that there is one easy solution to the deployment problem. That's because the mechanics of deployment are different for almost every application. After all, deploying a web application on Microsoft IIS is different from deploying a web application to a J2EE server, which is different from deploying a Linux RPM or a Windows MSI. Rather than trying to present a single solution, we're going to look at a few guiding principles. The hope is that following these principles will lead to practices that prevent the hijacked deployment problem that we've looked at above.

When looking for ways to deal with the hijacked deployment problem, there are two general areas we can look at: prevention, and discovery and cure. Let's take a look at prevention first, and next month we'll go into discovery and cure.

Prevention

The Lockdown

The easy thing to do is lock down access to all deployment environments so that the development staff does not have enough access to modify (and therefore hijack) the deployment. This route is often taken in production systems but it is seldom followed in test environments. I'm not entirely sure why that is; perhaps it is due to the human psyche finding the test environment not as important as the production environment. But even if your test environment is locked down, oftentimes that is not in itself enough to guarantee perfect deployments.

Personally, I do not see any reason why test environments should be treated differently than production environments. If the developers need to tweak the application (not the configuration, which we'll get to next month) then that's a sign that something is wrong with the application.
And if something is wrong with the application, it should be fixed in the development environment and then redeployed to the test environment.

Locking down the deployment environment may still lead to imperfect deployments if the deployment mechanism is very manual and thus error prone. What I'm talking about is the strong tendency in web applications to want to deploy only the files that have changed since the last deployment. This desire for optimization can often lead to problems, especially when performed in the most common fashion -- manually. I don't think that I need to go into any more detail explaining why deploying files one by one, even to a locked down environment, provides no traceability and is a recipe for a failed deployment.

Coarse Grained Unexploded Archive Files

The obvious answer to this, and perhaps the next principle, is to make deployments as coarse grained as possible. Ideally, deployment would be as easy as moving a single archive file to the target environment and perhaps providing some configuration settings. Part of this ideal deployment is that the archive file would remain intact even while the application runs. So unlike an RPM that gets exploded into potentially thousands of files placed throughout the file system, or a zip file that gets exploded into a directory tree, the archive file I'm talking about would never get exploded. This is only an ideal since it would require server applications that are able to mount the archive file and treat it as a filesystem. I'm talking about something like an Apache module that can load a website from a tar.gz or a zip file rather than from a directory tree.

The Java 2 Enterprise Edition platform does come close to this ideal as it supports deployment of web applications via a single WAR (web application archive) file. But even there, the J2EE spec does not go far enough, as some J2EE servers explode the war file onto the filesystem before using it.
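To make the idea of an intact archive a little more concrete, here is a minimal sketch in Python, not tied to any particular build tool or server. It assumes that the build records a SHA-256 digest of the war file when the file is produced; the function names and the artifact name are hypothetical and exist only for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_build(deployed: Path, recorded_digest: str) -> bool:
    """True if the deployed archive is byte-identical to the build artifact.

    `recorded_digest` is assumed to be the hex digest the build stored
    when it produced the archive (an assumption of this sketch).
    """
    return sha256_of(deployed) == recorded_digest.strip().lower()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical artifact name; the bytes stand in for a real war file.
        war = Path(tmp) / "myapp-1.0.42.war"
        war.write_bytes(b"stand-in for real archive contents")
        recorded = sha256_of(war)  # what the build would record

        print(matches_build(war, recorded))  # True: untouched since the build

        # Simulate someone changing the deployed artifact in place.
        war.write_bytes(b"tweaked directly on the test server")
        print(matches_build(war, recorded))  # False: no longer traceable
```

As long as the archive is never exploded, one comparison like this covers the whole deployment; with an exploded directory tree the same check would have to be repeated for every file.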
The spec also allows developers to deploy applications via either war files or exploded war files, which are directory trees whose structure mirrors the internal structure of war files.

So why is deploying applications via coarse-grained unexploded archive files ideal? The answer has more to do with human psychology than with technology. I'm a firm believer that the procedures that succeed are the ones that are most natural, and consequently the easiest and simplest to follow. People have a natural tendency to do the easiest thing that will get the job done. And since we want developers to tweak the application in the development environment rather than the test or production environments, we have to design a system that reinforces that behavior by making it the easiest and simplest, and thus the most natural. If deployments were performed with coarse grained unexploded archive files, it would be a lot easier for developers to tweak the application in the development environment, rebuild and redeploy, than to explode the archive, make the tweak, and then rebuild the archive, all in the deployment environment. In this way, we can structure a system that lends itself to good practices because it makes them easy, while at the same time discouraging bad practices by making them hard.

The other reason why unexploded coarse-grained deployments are ideal is that it is a lot easier to verify one file than it is to verify a hundred or a thousand files. For this reason traceability is a lot easier to accomplish if we are tracing only a single file, or a small number of files.

Conclusion

These two principles, locking down all deployment environments except the development environment and using coarse grained unexploded archive files during deployments, complete the prevention side of things.
Granted, not everything that we've talked about thus far is completely doable right now, but I think that we've laid down some principles that can help steer us in the right direction and set some goals to work towards. Next month we'll pick up this thread with the discovery and cure section as well as a discussion about patch builds and deployments.

As always, I invite your feedback. Please feel free to share with me your thoughts and/or personal experiences on this topic.

Maciej Zawadzki is the President of Urbancode, makers of the enterprise level build management server Anthill Pro. In addition to being an active developer, Maciej is a recognized conference speaker and co-author of the book Professional Struts Applications. His background includes over a decade of experience in the software industry, with the last six years focused on the development of enterprise level server side Java applications. You can reach Maciej by email at mbz@urbancode.com
Last Updated on Monday, 26 June 2006 05:06



