Wednesday, April 1, 2020

A sorry tale of software quality: what went wrong and some lessons

I’m going to tell you a story about a software quality drive that went horribly wrong and what we can learn from it. To protect the guilty, I’ve disguised some details, but the main points are all true.

(The Antares launch failure. Image credit: Nasa. Source: Wikimedia Commons. Public domain image.)

I worked for an organization that got the quality bug. They decided that the cause of the software failures the company had experienced was poor quality processes, so in response, we were going to become a world leader in software quality. To do this, we (the development team) were all going to be certified to software quality standard ABC (not the standard’s name). The project was going to be completed in a year and we were going to see astounding benefits.

The company hired a number of highly paid contractors to advise us, most of whom were well-qualified. Unfortunately, many of the contractors showed two major personality traits: arrogance and condescension. Instead of creating a cooperative approach, they went for a command and control style that was disastrous. I heard there were multiple complaints about how they treated my colleagues.

Not all the contractors were well qualified. In one case, the person hired to create and manage software processes had never worked in software before and had never written any software. As you might expect, they ended up proposing weird metrics to measure quality, for example, counting the number of version control updates as a metric of quality (presumably, more updates meaning better software? It was never defined).

Everyone was trained multiple times, no expense was spared on training, and the sky was the limit for spending time on processes. We went to very, very long training sessions about once a month. These often included long descriptions of the benefits of the process and the fantastic results we might see at the end of it. In one notable case, the presenter (an external contractor) was caught out showing 'expected' results as real results - but our management team was true believers so the presenter got away with it.

As we started to put together processes, we were encouraged to meet once a week to discuss quality, which everyone did. We faithfully started to implement the processes proposed by the consultants and as customized by our teams. Most of these processes were a bit silly, didn’t really add any quality, and took time, but we implemented them anyway because that's what the leadership team wanted. The weekly meetings started to get longer and ended up taking a whole morning.

A few brave people suggested that we should have metrics that measured project deliverables, deadlines, and outages, but the ABC quality standard people wanted us all to focus on measuring the process, not the outcome, so that’s what we did.

Despite all these efforts, the failures kept on coming. The one thing that did change noticeably was that deliverables were now taking twice as long. Executive leadership was unhappy.

As the quality project started to lose support and fail, the leadership tried to change their approach from stick to carrot. One of the things they did was hold a contest for people to suggest 'fun' alternative words for the project acronym, if the project was ABC, an alternative meaning might be 'A Big Cheese'. Unfortunately, a lot of the entries were obscene or negative about the project.

Eventually, the inevitable happened. Executive leadership saw the project was going nowhere, they did a management reshuffle and effectively killed the quality drive. The contractors were shown the door. Everything went back to normal, except the promoters of ABC were no longer in senior positions. The standard ABC became a dirty word and people stopped talking about quality improvements; sadly, the whole ABC debacle meant that discussions of more sensible process improvements were taboo.

The core problem was, the process became about getting certified in ABC and not about reducing defects. We had the wrong goal.

I’ve had some time to think about how this all played out and what I would have done to make it a success.

  • Start small-scale and learn. I would have started with one team and measured the results very carefully. People talk to each other, so success with one team would have been communicated virally. To oversimplify: If it can’t work with one team, it can’t work with many.
  • Scale-up slowly.
  • Bring in contractors, but on a one-off basis and clearly as advisors. Any sign of arrogance and they would be removed. Contractors should be available on a regular basis to ensure change is permanent.
  • Focus on end results metrics, not processes. If the goal was to reduce defects, then that’s the metric that should have been measured and made public. 
  • Flexibility to change processes in response to increased knowledge.
  • No certification until the target is achieved - if certification is an option, then inevitably certification becomes the goal. Certification is a reasonable goal, but only once the targets have been reached.

I’m not against quality standards, I think they have an important role. But I do think they need to be implemented with great care and a focus on people. If you don’t do it right, you’ll get what you asked for, but not what you need.

No comments:

Post a Comment