
When possible, I report the problem. How long does it take to fix the problem? Sometimes, it takes those vendors days to fix their site. I expect a fix to take only ten to fifteen minutes because I expect them to use rollback, not just fast deployment.
Rollback is not new. It is a tried-and-true idea from at least the 1980s. As an industry, we know that new releases can break what used to work. That's okay. We can roll back to a previously known good version. Will it have the new features? No. But it won't have the new problems, either: the product or the website still works. That's why rollbacks work.
My History with Rollbacks
When I first started to program in the 1970s, we did not have version control. Instead, we noted who had last changed each file with an “MRU” (Most Recently Updated). That allowed us to either ask ourselves what the heck we had done or ask whoever else had changed the code.
That is possibly version identification. It is not version control.
I first used version control in the early 1980s: first, SCCS, then RCS. They were clunky tools, but allowed us to mark known good versions. That was key for being able to roll back when we introduced problems anywhere: in the product or with deployment.
I moved into project management in the mid-1980s. That's when I realized this important idea:
Installation is the first thing the customer sees. That's the first customer interaction with the product. Let's make it a good interaction.
If installation is not easy, the customer is already predisposed to hate the product. Back in the 80s, my product managers focused on the features the product was supposed to have. That was fine, except the product managers did not consider installation. They assumed the developers would “take care of” installation themselves.
Developers are human, too. They did not want to focus on installation—that was not a “difficult” problem to solve. Yet, too often, I worked on products that had terrible installation procedures and documentation.
That's why I focused some of my project management work on making sure the product was easy to install. I wanted the customers to have a good interaction from the start.
But installation and adding more features are just one part of the equation. The second part is to make sure you do not leave the customer in a bad state. That's the point of a rollback.
Rollbacks Can Prevent or Manage Bad State Problems
What happens when installation breaks in some way? Or fails partway through? Or, as I have experienced too often in the past few weeks, what happens when the site is in a corrupted or compromised state and the user cannot do what she wants?
That's a terrible user experience. I am much less interested in continuing to use a site if the vendor can't keep it up for normal activities.
Enter rollback.
When a product or a site has a rollback procedure, the vendor should be able to restore from that known good place in a short period of time. How short? Back in the 80s, I wanted it to be 10 minutes, max. Why would it be different now?
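Here is a minimal sketch of what "restore from that known good place" might look like in code. Everything in it is an assumption for illustration: the `Release` and `ReleaseHistory` names, the version strings, and the idea that a release gets marked known good only after it passes its checks. A real rollback also has to restore files, configuration, and data, which this sketch leaves out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Release:
    version: str
    known_good: bool = False  # set only after the release passes its checks


@dataclass
class ReleaseHistory:
    """Hypothetical deployment history, oldest release first."""

    releases: List[Release] = field(default_factory=list)
    live: Optional[str] = None

    def deploy(self, version: str) -> None:
        # A freshly deployed release is live but not yet known good.
        self.releases.append(Release(version))
        self.live = version

    def mark_known_good(self, version: str) -> None:
        for release in self.releases:
            if release.version == version:
                release.known_good = True
                return
        raise KeyError(f"unknown version: {version}")

    def rollback(self) -> str:
        """Make the most recent known good release (other than the live one) live."""
        for release in reversed(self.releases):
            if release.known_good and release.version != self.live:
                self.live = release.version
                return release.version
        raise RuntimeError("no known good release to roll back to")


history = ReleaseHistory()
history.deploy("1.0")
history.mark_known_good("1.0")
history.deploy("1.1")          # new release, not yet verified
restored = history.rollback()  # 1.1 misbehaves, so go back
print(restored)                # 1.0
```

The design choice that matters is the `known_good` flag. Without it, "the previous version" is only a guess. With it, the rollback target is a decision someone made before the trouble started, and that is what makes a ten-minute restore plausible.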
I can hear all of you developers talking about database upgrades, one-way-only deployments, etc. I understand the need for moving forward. And I challenge you: test the hell out of the installation and the product before you make a one-way upgrade. Otherwise, make it easy to roll back.
Because if you don't, and I have to restore my data after your product destroyed it? I will find a replacement for your product.
And if you don't, and I “only” use your website? I will find a way to complain and/or find a replacement for your product.
State problems are not the user's fault. They are the vendor's or producer's fault. Worse, most of these problems are the result of insufficient testing.
We Need More Testing, Not Less
I hope that you want your product or website to continue working, even in the face of strange states. Remember, lines of code are not an asset. Even the number of tests is not an asset. But tests that stress the installation of your product or updates to your site? Those are an asset, just as much of an asset as running, tested features. (Because the tests are how you get to running, tested features.)
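As one example of a test that stresses installation, here is a sketch that simulates an update failing partway through and then checks that the previously installed version is untouched. The `install` function, its `fail_after` parameter, and the file names are all hypothetical. The stage-then-swap approach is just one way an installer might protect the known good state, not a claim about how any particular product does it.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Dict, Optional


def install(target: Path, files: Dict[str, str], fail_after: Optional[int] = None) -> None:
    """Hypothetical installer: stage the new files, then swap them in.

    fail_after simulates a crash after that many files have been staged.
    """
    staging = target.with_name(target.name + ".staging")
    staging.mkdir()
    try:
        for count, (name, content) in enumerate(files.items()):
            if fail_after is not None and count >= fail_after:
                raise RuntimeError("simulated failure partway through install")
            (staging / name).write_text(content)
    except Exception:
        shutil.rmtree(staging)  # discard the half-built install
        raise

    # Only a completely staged install replaces the live one.
    previous = target.with_name(target.name + ".previous")
    if target.exists():
        target.rename(previous)
    staging.rename(target)
    if previous.exists():
        shutil.rmtree(previous)


def test_failed_update_leaves_known_good_install() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "app"
        install(target, {"app.txt": "v1", "config.txt": "v1"})

        try:
            install(target, {"app.txt": "v2", "config.txt": "v2"}, fail_after=1)
        except RuntimeError:
            pass
        else:
            raise AssertionError("the simulated failure did not happen")

        # The user should still have a complete, working v1.
        assert (target / "app.txt").read_text() == "v1"
        assert (target / "config.txt").read_text() == "v1"
        assert not (Path(tmp) / "app.staging").exists()


test_failed_update_leaves_known_good_install()
```

The test does not care how the installer protects the old version. It only checks the promise: after a failed update, the user is not left in a bad state.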
I want modern products and websites. But I only want them when they work. That's why all producers need to consider how they will roll back to a previously known good state.
Don't send me an email to go to your site and then not deliver on the promise of your features. That's an installation defect. Those installation defects break your users' trust, both in you as a producer and in your claim to solve their problems.
Fix that with more testing and add a rollback procedure. Rollback will give you breathing room to fix the problems without your users yelling at you.
Because yes, when you break the promise of the problems you solve, your users will search for an alternative. Remember, managers do not want customer churn. (See How to Link the Team Measures to What Managers Want.) Managers want ever-increasing revenue and customer retention.
Rollbacks are a tried-and-true way to keep your product or site working. Learn to use rollbacks and more testing to make your users and managers happy. Remember, installation is the first thing your customers see.