Wednesday, September 15, 2010

The pitfalls of software metrics

In this post I'm going to talk about software metrics in general, and specifically about the pitfalls involved in using them. The idea behind software metrics is to capture certain aspects of a software system in a number that is, at first glance, easy to understand. This is why metrics are fascinating, especially for project leads and managers. Software is a kind of "invisible product" for people who cannot read source code, so any means of shedding light on what this expensive bunch of developers is actually doing all day is very welcome.

If you're an architect or technical lead on your project, it is part of your job to improve transparency for management. The people providing the money certainly should know what progress you are making and whether the quality of the produced code is OK. Metrics can help you satisfy this demand for information and, yes, control. But in my experience, metrics can cause more harm to a project than good if they are applied carelessly.

To prevent this, we first need a basic understanding of what software metrics actually are. Typically, software metrics are divided into three categories:

  • Product metrics: Metrics of this kind measure properties of the actual software system. Examples are system size, complexity, response time, error rates, etc.
  • Process metrics: Process metrics tell you something about the quality of your development process. For example, you could measure how long it takes on average to fix a bug.
  • Project metrics: Project metrics describe the project that builds the system. Typical examples are the budget and the team size.
All metrics can be further classified by how they are measured. Some metrics can only be measured manually (like team size), while others are calculated automatically from the project artifacts. The most popular example of the latter type is the LOC (lines of code) metric.
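To make the "calculated automatically" part concrete, here is a minimal sketch of a LOC counter in Python. Note that what counts as a "line of code" is itself a design decision; this sketch simply treats blank lines and `#`-style comment lines as non-code, which is one of several reasonable conventions.

```python
def count_loc(source: str) -> int:
    """Naive LOC count: non-blank lines that are not '#' comments."""
    loc = 0
    for line in source.splitlines():
        stripped = line.strip()
        # Blank lines and pure comment lines do not count as code here.
        if stripped and not stripped.startswith("#"):
            loc += 1
    return loc

sample = """
# a comment
def add(a, b):
    return a + b
"""
print(count_loc(sample))  # -> 2
```

A real tool would additionally handle block comments, strings, and language-specific syntax, which is exactly why two tools rarely report the same LOC number for the same code base.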

The LOC metric is a good candidate to explain why careless use of software metrics can be harmful. Let's assume management requests some form of weekly or monthly report on the progress of the running projects. At first glance, an easy way to fulfil this request is to run a tool on the source code, calculate the LOC (and possibly some other metrics the tool happens to support), put everything into an Excel sheet, generate a chart, and include it in the project status report. I am pretty sure people will like it. First, charts have been very welcome in nearly every status meeting I've ever attended. And second, the LOC metric is very easy to understand. More lines of code than at the last meeting sounds good to anyone.

However, things will get complicated as soon as the LOC number stays constant or, even worse, shrinks from meeting to meeting. Now, don't blame the people attending the meeting for getting nervous. They asked for a progress indicator and you gave them LOC. The good thing about the LOC metric is that it is easy to calculate. The bad thing is that it does not measure the system size. It measures the number of source code lines (whatever that is, but we'll ignore this detail for now). The count of source code lines is only an indicator of how "big" a software system is. For example, a skilful developer will reuse code as often as possible. This involves reworking existing code so that new code can be implemented efficiently. At the end of the day, perhaps only a few new lines of code have been "produced", maybe none at all, or the line count has even DECREASED. Nonetheless, the system size has increased. Unfortunately, only the coolest of all project leads and managers in a status meeting will believe this ...
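A toy illustration of how LOC can shrink while the system grows: below, two near-duplicate report functions (hypothetical names, invented for this example) are replaced by a single reusable one. The refactored version has fewer lines, yet it now supports arbitrary report types instead of just two.

```python
# Before refactoring: duplicated logic, more lines, exactly two reports.
def report_sales(data):
    total = sum(data)
    return f"SALES: total={total}"

def report_costs(data):
    total = sum(data)
    return f"COSTS: total={total}"

# After refactoring: fewer lines, and ANY report kind is supported.
def report(kind, data):
    return f"{kind.upper()}: total={sum(data)}"

print(report("sales", [1, 2, 3]))   # -> SALES: total=6
print(report("refunds", [4, 5]))    # -> REFUNDS: total=9
```

Measured purely in LOC, this refactoring looks like negative progress; measured in capability, it is a clear gain.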

To avoid the described scenario, we must select software metrics carefully. The selection criterion must not be that a metric is supported by a certain tool out of the box. Instead, the goals of the measurement process are what matters. In the example, the overall goal was to report on the project's progress. Next, we formulate questions that we have to answer in order to reach this goal. Since we're talking about a software project here, a good question would be "How many use cases have already been implemented?". This is still rather generic, so we go into more detail:

  • Is the source code checked into the repository?
  • Do unit tests exist, and do they run without errors?
  • Have the testers confirmed that the implementation is feature-complete according to the use case description?
In a third step, we derive metrics that can answer the questions. The metric for the above example would be called Completed Use Cases. If the project uses a task management system like Jira, Redmine, Trac or whatever, the metric can even be calculated automatically using the reporting functions.

Because of the three steps Goal, Question and Metric, this approach is known as the GQM approach. In my experience it works well. One of its main advantages is that the metrics in use are well-documented: just take a look at the questions. Additionally, the GQM approach is easy to apply in practice, everybody understands it, and it puts the tools at the end of the process rather than at the beginning.
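The GQM chain for the example in this post can be written down as a tiny record; this is a sketch, not a prescribed format, but it shows how the question list doubles as the metric's documentation:

```python
# One GQM record: a goal, the questions that refine it,
# and the metric derived to answer them.
gqm = {
    "goal": "Report on the progress of the project",
    "questions": [
        "Is the source code checked into the repository?",
        "Do unit tests exist, and do they run without errors?",
        "Is the implementation feature-complete per the use case description?",
    ],
    "metric": "Completed Use Cases",
}

# Anyone asking what the metric means can simply read the questions.
print(f"{gqm['metric']} answers {len(gqm['questions'])} questions")
```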
