subbu.org

Design for Upgrade

Software upgrades are not fun. When you upgrade software, things usually go wrong, and you spend several hours fixing them. Most users do not seem surprised by this, and are prepared to accept it.

On the other side of the fence, software developers usually treat upgrades as a necessary annoyance to deal with, just like writing tests or writing design docs. Software vendors promise upgrades, but always come up with some legalese in license agreements that in effect means “if you are upgrading to this latest and greatest version, we thank you, but if things go wrong, go blame yourself”. “Oh! Really!” and not “Of course” would be the likely reaction when a software development team learns that one of their users was able to upgrade successfully. Such is the state of software upgrades.

A few weeks ago, one of the developers I know spent several hours figuring out some JNDI/JDBC issues that she encountered when she moved their web apps from a previous version to a recent version of Apache Tomcat. After some searching around newsgroups, she found that others had similar issues, and that someone had posted a solution. In an ideal world, such things should not happen. I don’t mean to single out Tomcat here. I bet horror stories about upgrades of various software products would fill volumes.

It is not just the users that pay for upgrade mishaps. Software development teams and vendors face the consequences of not engineering their software for seamless upgrades. When a new version of the software is offered, users don’t necessarily jump to upgrade to the latest version because of the cost of upgrades. In the extreme case, users don’t want anything in the new versions other than bug fixes that are important to them. No new features, and no fixes not reported by them! As a software developer, I can imagine the frustration of hearing such demands from users. But users are completely justified in making such demands. There is also the added cost of maintaining multiple active versions of the software - older versions with bug fixes, and new versions with features. If you make upgrades so easy and painless that users barely notice them, maybe they will be more inclined to switch to newer versions!

Does it have to be like this? Is upgrade such a tough, expensive, or intractable problem that very few can design software from the ground up to offer seamless upgrades? Of course not. Consider this rather fancy scenario for upgrading your favorite app server.

  • You go to a server in the staging/test area, open a web page, and click on “search for updates” - just like the way most M$ Windoz users get their latest fixes (along with latest security holes - pun intended - I don’t use Windows).
  • You get a list of updates, pick up the ones that are important to you, and select “update”.
  • The server will then check for compatibility, any configuration or data that needs to be migrated, download the updates, let you preview those changes, and then do the actual upgrade.
  • Once this is done, you select an option that reads “migrate these changes to the production server”. Once selected, the staging server will talk to the production server, and schedule an upgrade.
  • At the scheduled time, the staging server will coordinate an upgrade of the production environment.
  • The production server is also upgrade-smart, and upgrades each server node without incurring any downtime.
  • If unexpected things happen, the upgrade process will revert all the changes, and collect diagnostic information for future analysis. (This reminds me of a very large software company that will stay unnamed here that is so infamous for its installers that if the installation goes wrong in the middle, you can’t repeat the installation without cleaning up the Windows registry manually.)

You can also apply the same scenario to applications built on top of application servers.
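The last two steps of the scenario, upgrading node by node and reverting everything when something unexpected happens, can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Node` class, its `install` and `health_check` methods, and the `healthy_after_upgrade` flag are hypothetical stand-ins for whatever a real app server would expose, not any product's API.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical cluster node that only tracks its installed version."""
    name: str
    version: str
    # Assumption for the sketch: simulates whether the new version starts cleanly.
    healthy_after_upgrade: bool = True

    def install(self, version: str) -> None:
        self.version = version

    def health_check(self) -> bool:
        return self.healthy_after_upgrade


def rolling_upgrade(nodes, new_version):
    """Upgrade nodes one at a time; on failure, revert every node touched so far.

    Returns a (succeeded, diagnostics) tuple.
    """
    previous = {}     # node name -> version installed before the upgrade
    diagnostics = []  # messages kept for later analysis
    for node in nodes:
        previous[node.name] = node.version
        node.install(new_version)
        if not node.health_check():
            diagnostics.append(f"{node.name}: health check failed on {new_version}")
            # Revert only the nodes that were actually changed.
            for touched in nodes:
                if touched.name in previous:
                    touched.install(previous[touched.name])
            return False, diagnostics
    return True, diagnostics
```

Because only one node changes at a time, the remaining nodes keep serving requests, and a failed health check leaves the cluster exactly where it started, with the failure recorded for later analysis.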

This scenario may be too fancy, but there is software today that can do some of the steps. My favorite example is yum on Linux. I have used yum to upgrade my desktop from Fedora Core 2 to Core 3, and then from Core 3 to Core 4 without losing any data or settings, with just one reboot each time. I have also upgraded KDE across several versions using yum. Of course, yum has its kinks, mostly due to broken repositories. On a few upgrades, I had to rebuild my VPN client and reconfigure XFree86. I’m not advocating that we all use yum to upgrade, but there are ideas and some precedent to follow when designing and building software for upgrade.

I don’t think software development teams ignore upgrades completely. The problem is that most software, at its core, is not architected and designed to facilitate easy and guaranteed upgrades. To most software, upgrade is an afterthought. It is something that is slapped on to the software sometime before delivering it to the users. In fact, how many software development teams continuously test upgrades during the development cycle?

When you consider “automatic and guaranteed upgrade” as one of the primary use cases during the initial development phases, you may end up designing your software differently. Consider the number of problems that I assumed solved in the above scenario.

  • New releases are structured such that users can choose the features/fixes they want to install and not install features they don’t want. This problem itself can influence the way source code is structured, built, tested, and versioned.
  • Library changes can be applied dynamically. In the case of server side Java, this means that the core layer of your application server should be able to manage classloaders such that it can unload old classes and load new classes in a different class loader, and still provide continuity of operations. For those interested, the latest release of WebLogic Server 9.0 lets you upgrade applications without having to take applications off-line. This is a move in the right direction.
  • Data can be upgraded automatically. I also assumed that the software is able to upgrade data from an older schema to a new schema (XML, SQL, or other file formats), and perform the updates dynamically without requiring an offline file/data upgrade process. That means each change must be both backward- and forward-compatible.
  • Old and new versions can co-exist at runtime. When software is running on clustered servers, this is an important problem. In a cluster, nodes can talk to each other, or use shared data. Since it is quite unreasonable to expect that all nodes are upgraded at precisely the same moment, each node must be able to co-exist with nodes running older or newer versions of the software. For a more real-world discussion of this problem, see the interview with Phil Smoot of MSN-Hotmail in the December 2005 issue of ACM Queue. In this interview, Phil talks about multiple versions of the software running at the same time. Considering the number of servers hosted for email services like Hotmail, each change to a data format or a communication protocol can make or break the system.
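To make the data and co-existence problems above a little more concrete, here is a minimal sketch of one common approach: every record carries a schema version, readers upgrade older records on the fly, and readers leave fields written by newer versions untouched. The record layout, the `schema` field, and the v1-to-v2 change (splitting `name` into two fields) are all made up for illustration.

```python
CURRENT_SCHEMA = 2  # the schema version this (hypothetical) node understands


def _v1_to_v2(record):
    # Hypothetical change: v2 splits "name" into "first_name" and "last_name".
    upgraded = dict(record)
    first, _, last = upgraded.pop("name", "").partition(" ")
    upgraded["first_name"] = first
    upgraded["last_name"] = last
    upgraded["schema"] = 2
    return upgraded


# One migration step per schema version, applied in order.
MIGRATIONS = {1: _v1_to_v2}


def read_record(record):
    """Return a record this node can work with, whatever version wrote it.

    Older records are upgraded step by step (backward compatibility).
    Newer records are returned as-is, with their unknown fields preserved,
    so this node does not destroy data written by a newer node
    (forward compatibility).
    """
    record = dict(record)
    version = record.get("schema", 1)
    while version < CURRENT_SCHEMA:
        record = MIGRATIONS[version](record)
        version = record["schema"]
    return record
```

With this shape, a node running the old code and a node running the new code can share the same data store during a rolling upgrade, provided that newer versions only add fields and never change the meaning of existing ones.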

These problems are complex, and solutions influence the architecture of the software greatly. In many cases, mending existing software to address these problems can take significant motivation and effort. That’s why it is crucial to consider these problems early in the design phase, and validate them continually during the development phase. Design for upgrade!

March 11th, 2006 at 9:07 pm
