Software Metrics: Some Background
I realized the power of software metrics the first time I participated in an SEI/CMM process improvement initiative at my employer. Instead of handing us a ready-made process to follow, each product organization was tasked with developing its own processes, with the goal of achieving Level 2 certification. Each group formed teams around the key process areas (KPAs) to develop the practices to be followed. It was without a doubt the best process improvement experience of my career, and it permanently changed the way I manage and view software projects.
When the process areas were complete, it was my responsibility as a Technical Project Leader to use the new processes on my next assignment. While there was skepticism about the initiative, the organization I worked in decided to support it and give it the old college try, and we did. Some organizations simply wrote the procedures and followed them loosely, which is more often the case.
That’s when the epiphany happened. One of the key principles of estimating is to first size the project and then convert the size into a duration, using historical data or, when no historical data exists, industry standard data. Since we had never recorded this type of data before, we used industry standard data, or what we believed it to be. Once the estimates were developed and work began, the power of that approach revealed itself when I began to track the project. The tracking KPA required that all estimating assumptions be tracked, both time and size.
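The size-then-duration step can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the choice of lines of code as the size unit, and all the numbers are hypothetical, and the model assumes effort divides evenly across developers, which real projects rarely do.

```python
def estimate_duration_weeks(size_loc, loc_per_dev_week, developers=1):
    """Convert a size estimate into a calendar duration in weeks.

    size_loc:          estimated size of the work (hypothetical unit: LOC)
    loc_per_dev_week:  productivity rate from historical or industry data
    developers:        number of people assigned (assumes linear scaling)
    """
    if loc_per_dev_week <= 0 or developers <= 0:
        raise ValueError("rate and developer count must be positive")
    effort_dev_weeks = size_loc / loc_per_dev_week  # total effort
    return effort_dev_weeks / developers            # calendar time


# Hypothetical example: 4,000 LOC at 200 LOC per developer-week, 2 developers.
print(estimate_duration_weeks(4000, 200, developers=2))  # 10.0
```

Both inputs, the size and the productivity rate, are estimating assumptions, which is why both need to be tracked once work begins.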
Benefits
Tracking both time and size had many benefits, but the most important was that I could be wrong in my estimating assumptions and still detect the error early enough to do something about it. If I had tracked only one of the two dimensions, time or size, I would not have had the fidelity to draw meaningful inferences about how well the project was tracking to the schedule until it was too late. If we only track time, we really do not know whether the work is on track until we get close to completing the assignment, which is typically too late to correct things. If we only track size, we know only whether widgets are being produced, but we can’t draw good inferences about whether the work is on track to complete on time. If I had to pick one or the other to track, I’d pick size, because time only tells you that time has been consumed; it gives you no clue as to whether that time was spent making progress toward your delivery goals. For example, consuming 50% of the budgeted time does not tell you that 50% of the work is complete. Projects deliver completed work, not completed time.
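A rough sketch of why the two dimensions together are more informative than either alone. The class, field names, and numbers here are hypothetical, and the projection simply assumes the production rate observed so far continues unchanged.

```python
from dataclasses import dataclass


@dataclass
class Status:
    planned_size: float    # estimated size of the deliverable
    planned_weeks: float   # estimated duration
    completed_size: float  # size actually produced so far
    elapsed_weeks: float   # time actually consumed so far


def fraction_time_used(s):
    return s.elapsed_weeks / s.planned_weeks


def fraction_work_done(s):
    return s.completed_size / s.planned_size


def projected_weeks(s):
    """Project total duration from the rate observed so far.

    Returns None when there is not yet any progress to extrapolate from.
    """
    if s.completed_size <= 0 or s.elapsed_weeks <= 0:
        return None
    observed_rate = s.completed_size / s.elapsed_weeks
    return s.planned_size / observed_rate


# Hypothetical project: halfway through the time budget.
status = Status(planned_size=4000, planned_weeks=10,
                completed_size=1000, elapsed_weeks=5)
print(fraction_time_used(status))   # 0.5
print(fraction_work_done(status))   # 0.25
print(projected_weeks(status))      # 20.0
```

Time alone says the project is halfway done; size alone says a quarter of the work exists. Only the pair shows that, at the current rate, the work would take about twice as long as planned.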
Better Communication
Interesting things began to happen when I drew inferences from the data. I was able to see more clearly which deliverables were progressing well and which were not, and most importantly, I was able to quantify it. As a result, the conversations with the developers became more interesting and beneficial for both parties. The questions were better, and consequently the answers were better. Some of the conversations we could have were as follows:
- It looks like this assignment is bigger than we estimated it to be, and here is why I’m drawing that conclusion. What do you think?
- Now that you’ve been working on the assignment and your knowledge is much better, how much bigger do you think it’s going to be?
- It looks like progress has really slowed this week, and here’s why I’m drawing that conclusion. Is the delivery smaller than we thought it would be? If not, is there something slowing you down? You might find that when you pulled the developer off to do some other necessary task, it had a bigger impact than you expected. In the past you wouldn’t have asked the question, because you couldn’t see the change.
This also permitted better discussions with management and the product manager. I could now go to management with hard data and request an additional resource. For example, I could say to management: we underestimated feature A by eight weeks, and we had Joe assigned to deliver feature B once A was complete. All the other team members are fully assigned for the duration of the project. If I can get one developer for twelve weeks, I can keep to the schedule. On the other hand, you can have a similar discussion with the Product Manager: if we deliver feature B and don’t get another developer on the project, we’re going to need to slip the schedule. We scheduled feature B last because it was a low priority feature. We would need to slip the schedule by six weeks to deliver it, or, since it was the lowest priority feature, we can remove it from this release and deliver it in the next one. Which do you prefer?
Interesting Results
It’s not that these conversations couldn’t or didn’t happen before. The difference is that they happen earlier. Typically you wouldn’t have this conversation until after you had slipped the completion of feature A by a number of weeks. Now you are having the conversation before feature A was even scheduled to be delivered. The trust is also greater because you have data backing up your argument. Showing that something was wrongly scheduled because it is quantifiably bigger than estimated is more credible than the frequent argument that the assignment just took longer than we thought. Bigger implies that it would take anyone longer than planned, but longer than thought may simply mean it was mismanaged, or that the person assigned to it was not very productive. It’s a subtle nuance, but one that makes all the difference.
Other interesting things happened when I began managing this way. Developers were more productive. They were working significantly less overtime and delivering more, which I will discuss in more depth later in this series. On one project where I took over management after the team had just completed a death march delivery, one of the experienced team members commented, “It doesn’t feel like we’re working very hard on this project, yet we are on schedule, quality is good, and we’re producing more.”
What’s Next
In this installment my goal was to give you a little history of my experience with metrics and how I came to appreciate the benefits of using software metrics to manage projects. In the next few installments, I plan to give further support for using metrics, with some obvious examples from other industries that illuminate the benefits without complicating the picture with how we might use metrics in software projects.


October 4th, 2007 at 2:49 pm
Quantifying the size always worked great for new development, but where it gets dicey is in updates to existing source code. How do you measure a developer rewriting 1,000 lines of code? You start with a thousand and end with a thousand. It becomes even more complex when the developer estimates he/she will be modifying 1,000 lines of a 10,000 line code base. As a manager, how do you monitor that the developer has already altered 500 lines within the first week of a 10 week estimate? The developer is in more code than we thought, but we can’t measure that.
This I believe is the main problem with LOC. And the best LOC counting programs I’ve seen all have problems with modifying code. They just don’t count it right.
October 4th, 2007 at 11:01 pm
You’re getting into an area that I plan to cover in the series, so I don’t want to say too much on it. We’ve worked on the same product together for almost 7 years. During all of the time we worked on it, it was a maintenance project. I can tell you that there wasn’t one release of that product where I didn’t add way more code than I changed, and I believe that to be true for the rest of the developers on the project as well, so I think we’re obscuring the technique with something that doesn’t happen very often in practice: a release where the code base doesn’t grow. I’ve been practicing this technique now for over 8 years, all on maintenance releases, delivering in toto over 1 million LOC on existing code bases. In every release the code base grew significantly. I think this is a red herring that serves to undermine an otherwise excellent practice. That’s all I would like to say about it for now.