Continuous Delivery versus Frequent Releases

Continuous delivery and more traditional versioned releases are close, but a little different. In this post I dig into some of the practical details and issues that differ between the approaches. There is no rocket science in this post, just a summary of knowledge gained from practical experience.

Continuous Delivery

In continuous delivery, the idea is that you continuously compile the current version of the source code and, if it’s good, you can roll it to site. Purists would probably say “if it passes the tests, then always roll it to site”. Pragmatists would probably say “a human decides when to roll it to site” (continuous delivery here means we are able to roll the code at any time, but only if we choose to).

If the code is not being rolled fairly frequently (at worst, say, once a month) then I don’t think you can call it continuous delivery. Not rolling regularly can increase risk, as the chance of issues when you finally do a roll is higher: jumping from build 300 to build 400 may never have been tested before (only 300 to 301, 301 to 302, and so on).

Because the code is continually checked out, compiled, and tested, build numbers are normally a part of the release number assigned to artifacts. Multiple versions of a “release” (such as “1.0.0”) may be made – it’s the build number that uniquely identifies a build. It may be argued the major and minor release numbers are almost pointless in this model.

To get this to work with Maven, the normal approach is to use a parameter in the pom files to hold the build number, or “-SNAPSHOT” for developer desktop builds. All the code is recompiled per release, so the release numbers of all artifacts needed to build a release are known (they all share the same build number).
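As a rough illustration of the numbering scheme described above (this is not Maven itself), the sketch below builds an artifact version from a release number plus a CI build number, and falls back to “-SNAPSHOT” on a developer desktop. The function name `artifact_version` and the dotted `1.0.0.301` layout are assumptions made for the example; any format that embeds the build number would do.

```python
from typing import Optional


def artifact_version(release: str, build_number: Optional[int] = None) -> str:
    """Return the version string to stamp on every artifact in one build.

    Hypothetical helper: a CI build appends its build number, while a
    developer desktop build (no build number) gets "-SNAPSHOT" instead.
    """
    if build_number is None:
        return f"{release}-SNAPSHOT"
    if build_number < 0:
        raise ValueError("build number must be non-negative")
    # Assumed layout: release number, a dot, then the build number.
    return f"{release}.{build_number}"


# Every artifact compiled in the same CI run shares this one version.
ci_version = artifact_version("1.0.0", 301)
desktop_version = artifact_version("1.0.0")
```

Because the build number is what distinguishes one build from the next, two CI runs of the same “1.0.0” release produce different artifact versions.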

Traditional Releases

In Maven, the more traditional release process is to assign version numbers to artifacts (typically jar files representing one component of the system). A very common scheme is to have major and minor numbers, followed by a patch number (for example 3.4.1). Each release typically increments the minor number, with the major number being incremented to highlight some significant change (where a human defines what “significant” means). Patches are minor fixes to a release (typically where the API is guaranteed not to have changed). Patches are normally made in a branch created from the code of the release being patched.

In this model, the build number is not used as a part of the release number of an artifact. You may include the build number within the release artifacts, but it does not form a part of the jar file name. The reason for this is that all cross jar file dependencies are defined in terms of release numbers. Each jar file has its own history: an official release is made (with a release number such as 1.0.0), then when further changes are to be made to the jar file the version number is changed to the next snapshot (for example 1.0.1-SNAPSHOT). The “-SNAPSHOT” suffix is special to Maven. It tells Maven the version is not final and not to be shipped externally; it is a current work in progress. Developers know snapshot versions are just to play with. When the code is ready to roll (e.g. to site), a release is made by committing the pom file with the “-SNAPSHOT” suffix removed. That code is then compiled and “installed” in the release Maven repo. No other change should then be made to that release of the code. If a change is required, a new version is created by updating the pom file to increment the release number and add the “-SNAPSHOT” suffix. (A common mistake is to release new versions of previously released jar files under the same version number. This inevitably leads to problems due to the caching nature of Maven.)
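The snapshot and release cycle described above can be sketched as two small string operations. This is an illustration only, not how Maven or any of its plugins implement it; the helper names are hypothetical, and bumping the last numeric component is an assumption (a team might equally bump the minor number).

```python
SNAPSHOT_SUFFIX = "-SNAPSHOT"


def is_snapshot(version: str) -> bool:
    """True for work-in-progress versions such as "1.0.1-SNAPSHOT"."""
    return version.endswith(SNAPSHOT_SUFFIX)


def release_version(version: str) -> str:
    """Strip the suffix to make a release: "1.0.1-SNAPSHOT" -> "1.0.1"."""
    if not is_snapshot(version):
        # A released version is frozen and must never be rebuilt.
        raise ValueError(f"{version} is already a release and must not change")
    return version[: -len(SNAPSHOT_SUFFIX)]


def next_snapshot(released: str) -> str:
    """Start the next development cycle: "1.0.1" -> "1.0.2-SNAPSHOT".

    Assumption for this sketch: the last numeric component is bumped.
    """
    if is_snapshot(released):
        raise ValueError(f"{released} has not been released yet")
    parts = released.split(".")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts) + SNAPSHOT_SUFFIX
```

Refusing to “release” a version that has no snapshot suffix mirrors the rule that a released version never changes.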

Further, Maven has the concept of dependencies. Artifact A may depend on artifact B. Artifact B may depend on artifact C. These dependencies are defined based on version numbers. So A might be at version 4.1.2, B might be at version 1.0.2, and C might be at version 2.1.0. Before A can be released, B and C must be released if not done already (the code must be compiled without the “-SNAPSHOT” suffix). If B currently depends on a snapshot version of C, then C must be officially released (without the “-SNAPSHOT” suffix), with all future changes going into the next release number for C. This may require a new version of B to be released as well, and finally a new version of A. The important thing is that, for A to be released, everything in the dependency graph that A depends on (directly and indirectly) must have been released as well (no “-SNAPSHOT” suffix). Once A is released, that version never changes.
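The rule that everything A depends on, directly or indirectly, must be released first amounts to a walk over the dependency graph. The sketch below is a minimal illustration under assumed data structures (a plain dict for the graph and another for current versions); it is not a Maven API, and the function name is hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical dependency graph: artifact -> artifacts it depends on.
DEPENDS_ON: Dict[str, List[str]] = {"A": ["B"], "B": ["C"], "C": []}


def unreleased_dependencies(
    artifact: str,
    versions: Dict[str, str],
    depends_on: Dict[str, List[str]],
) -> List[str]:
    """Return every direct or indirect dependency still at a snapshot."""
    blocked: List[str] = []
    seen: Set[str] = set()
    stack = list(depends_on.get(artifact, []))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue  # already visited via another path
        seen.add(dep)
        if versions[dep].endswith("-SNAPSHOT"):
            blocked.append(dep)
        stack.extend(depends_on.get(dep, []))
    return sorted(blocked)


# B is released, but B still pulls in a snapshot of C, so A is blocked.
versions = {"A": "4.1.2-SNAPSHOT", "B": "1.0.2", "C": "2.1.0-SNAPSHOT"}
blockers = unreleased_dependencies("A", versions, DEPENDS_ON)
```

An empty result means A can be released. Otherwise the listed artifacts have to be released first, which in practice can force new releases further up the chain as well.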

In this model, releases are made manually, as each one involves incrementing the release number. Humans decide when the version number is ready for change. The point in the source code control system history where the release was made is frequently tagged for easy reference. The release version must undergo significant testing, but once done there is no need to test it again. It will never change. The continuous integration framework does continue to compile the most recent version in the source code control system, but more as an early warning system that the code has been broken (it won’t be built into a releasable version until a human decides that it is an appropriate time to do so).

It is worth pausing here to look at Git versus Subversion support for branching. In Git, branches are labels assigned to a particular version in the version history of the whole repository. This causes problems in a repository where multiple jars are released with different version numbers. In Subversion, by contrast, a branch is in fact simply a tree copy within the repository. There is a convention of a repository having “/branches” as a top level directory. It is however completely acceptable to have the top level directory be a module name, under which the “branches” directory is placed (e.g. /widgets/branches, /gizmos/branches, …), under which the source code goes. The benefit of this approach is that different modules can be versioned and branched separately within the same Subversion repository. Git does not support this approach. It would instead require creating multiple repositories so they can be versioned separately. (If I am wrong here please let me know!)

For Git, a common solution to keep the number of repositories down (hopefully to just one, but maybe to a small number of them) is to make everything in the repository be versioned together. For jars A, B, and C, if any one needs a new version number then they all get the same new version number. Different artifacts are not given different version numbers if they reside in the same Git repository. To get independent version numbers they must be moved out; otherwise everything in the repository should be synced up to use the new version number. This is normally achieved by having sub-modules of a single parent module, where the parent module pom file holds the version number shared by all sub-modules (and the artifacts they build). This means that for a single jar file there may be multiple releases of identical code with different version numbers. This is not always ideal, but sometimes it is fine. It is a matter of working out when different jars can be upgraded in sync and when they must not.
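To make the side effect of lockstep versioning concrete, here is a small sketch that gives every module in a repository the same new version and reports which ones were re-released without any code change. The function name and the dict-based inputs are assumptions for illustration; in a real build the shared version would live in the parent pom.

```python
from typing import Dict, List, Set, Tuple


def lockstep_release(
    modules: Dict[str, str],
    changed: Set[str],
    new_version: str,
) -> Tuple[Dict[str, str], List[str]]:
    """Return (new versions, modules re-released with identical code)."""
    unknown = changed - set(modules)
    if unknown:
        raise ValueError(f"unknown modules: {sorted(unknown)}")
    # Every module takes the shared version, whether it changed or not.
    new_versions = {name: new_version for name in modules}
    unchanged = sorted(name for name in modules if name not in changed)
    return new_versions, unchanged


modules = {"A": "1.4.0", "B": "1.4.0", "C": "1.4.0"}
new_versions, rereleased = lockstep_release(modules, {"B"}, "1.5.0")
```

Here only B changed, yet A and C are also published as 1.5.0 with the same code they had at 1.4.0.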

Multiple Git repositories can be painful to manage, but they also have advantages. Developers are not required to check out code from all the Git repositories. They can use Maven to download the released artifacts they depend on. Developers only need to check out in their local environment the code they need to change and compile.

(There is another way to have a single Git repository with different version numbers per artifact, but it requires discipline to get right. When a branch is created, only one module (which builds one artifact) is ever changed in that branch. This needs a bit of care and thought, otherwise merge conflict hell can result.)

Summary

In a nutshell, the more traditional Maven release approach tends to work well when developers work in individual areas of code. But it is not really a continuous release approach. Releases are made when a human makes the decision to create a release, and the version number identifies that release. The continuous integration server that continually compiles and tests the code works on the head of the repository, which typically will be a snapshot version (not an official release). It helps decide whether a release can be made.

It is also not clear to me how to have one Git repository host all the code when there are differently versioned artifacts within the repository. A solution may be to force everything in the repository to have the same version number, but this is not always a good approach.

Personally I am more used to the frequent release model rather than truly continuous deployment. Humans decide when something is to be released. You can still get many of the advantages of continuous delivery as long as you keep the release overheads to a minimum (allowing you to manually release frequently). I have not seen as good tooling around for true continuous delivery, so if you do go the full continuous delivery route, expect to have to invest more to get it up and going properly.

2 comments

  1. Thanks for such a nice explanation, basically I agree with your idea of frequent release in place of continuous deployment.

  2. Committing and building code often doesn’t mean that code should be released often. However, it does shorten the feedback loop and increase the likelihood that bugs will be found sooner.
