This is the third post in a series on “The Seven Deadly Sins of Agile Measurement.” Get an introduction and read about Sin #1, then learn about Sins #2 and #3. Find Sin #7 here.

### Sin #4 - Too-costly metrics

Sometimes it’s too expensive or burdensome to get the exact metric you want. Even when the actual cost (the time it takes your developers to record the extra information needed to calculate a metric) is low, the perceived burden can be much greater, especially in an Agile environment. Developers are often your most valuable resource, and you do not want to burden them unnecessarily. Ask your developers to enter data manually only if they believe it leads to a measurement whose value exceeds the perceived burden (see Sin #1).

This means that in many cases qualitative insight is your best alternative. That's OK, maybe even desirable (see Sin #3). You can also turn a qualitative perception into quantitative insight via a survey. That’s why two of the SDPI dimensions are customer satisfaction and employee engagement. We recommend a very lightweight metric (perhaps a single question, such as net promoter score) that keeps the burden minimal while capturing a lot of value.

When assessing the value of a particular metric, make sure you include the “cost” of the perceived burden on your developers to record that metric. Qualitative insights can become valuable quantitative data.
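As a rough illustration of how a one-question survey becomes a number, here is a minimal sketch of the standard net promoter score calculation: respondents rate 0 to 10, ratings of 9 or 10 count as promoters, and ratings of 0 to 6 count as detractors. The function name and the sample ratings are assumptions made up for this example, not data from the series.

```python
def net_promoter_score(ratings):
    """Return NPS: percentage of promoters minus percentage of detractors.

    `ratings` is a list of integer answers (0-10) to a single
    "how likely are you to recommend us?" question.
    """
    if not ratings:
        raise ValueError("need at least one survey response")
    promoters = sum(1 for r in ratings if r >= 9)   # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings of 0 through 6
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical responses from five team members.
sample_ratings = [10, 10, 9, 8, 5]
print(net_promoter_score(sample_ratings))
```

One question per respondent keeps the perceived burden low while still producing a number you can track over time.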

### Sin #5 - Using a lazy metric

Better measurement leads to better insights, which in turn lead to better decisions and eventually better outcomes. With this measurement chain of effect in mind, many people start by choosing measurements that are easy to acquire. However, measuring what’s easy can often drive the wrong behavior. Let’s use a sports analogy to illustrate this point.

In 2010, Monta Ellis of the Golden State Warriors was the ninth-highest scorer in the NBA. Carmelo Anthony of the Denver Nuggets was the eighth-highest scorer. Measuring individual scoring totals is easy, and you might assume that because these players were prolific scorers, their teams were winning games.

However, it turns out that the more they played, the less their teams won. Scoring is itself a function of two measures: the number of shots taken and the percentage of those shots that go in the basket. These two “stars” rate high on the former but low on the latter: they are high scorers because they take more shots, but their shooting percentages are quite low. Because they are not creating more shot opportunities (their teams take almost exactly the same number of shots per game no matter how much they play), they are figuratively stealing shots from teammates who might have a better chance of scoring.
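The shot-stealing effect can be sketched with a few lines of arithmetic. Every number below is hypothetical (not the players' real statistics), and the model is deliberately simplified: every made shot is worth two points, and the team's total shots per game are fixed.

```python
def expected_points(shots, make_pct, points_per_make=2.0):
    """Points = shots taken x fraction made x points per made shot."""
    return shots * make_pct * points_per_make


def team_points(team_shots, star_share, star_pct, others_pct):
    """Expected team points when the star takes `star_share` of a fixed shot total."""
    star_shots = team_shots * star_share
    other_shots = team_shots - star_shots
    return expected_points(star_shots, star_pct) + expected_points(other_shots, others_pct)


# Hypothetical inputs: 80 team shots per game, a star who makes 42% of
# his shots, and teammates who make 47% of theirs.
TEAM_SHOTS = 80
STAR_PCT = 0.42
OTHERS_PCT = 0.47

for share in (0.20, 0.30):
    star = expected_points(TEAM_SHOTS * share, STAR_PCT)
    team = team_points(TEAM_SHOTS, share, STAR_PCT, OTHERS_PCT)
    print(f"star takes {share:.0%} of shots: star scores {star:.2f}, team scores {team:.2f}")
```

Under these assumptions, raising the star's share of shots raises his personal scoring total while lowering the team's expected points, which is the gap between the easy metric and the desired outcome.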

So while the flow of learning goes from measures to outcomes, the best way to make measurement decisions is to start by identifying the desired outcomes. That’s why we call this ODIM:

#### better OUTCOMES ← better DECISIONS ← better INSIGHTS ← better MEASURES

Applying this to our sports analogy, the NBA stars should focus on the outcome of winning more games rather than being high scorers. If they used insights gained from measurements, such as the overall likelihood of the team scoring under various conditions, it would help them make better game-time decisions and achieve the right outcome.

Even the best strategy can fail in the execution, one small decision at a time. Another way to think about measurement in this context is that it provides the feedback to help your people know which decisions best align with your strategy.

### Sin #6 - Bad analysis

Imagine setting a service level agreement, thinking that you would miss it in just 0.2% of your cases, but in reality you miss it in 1.0% of your cases. If you had allocated $200,000 as a contingency, you’d actually need $1,000,000. Yikes! Though the percentages seem small, their impact in dollars is not. This kind of bad analysis is actually a fairly easy mistake to make.
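The arithmetic behind those figures is a simple ratio. This sketch assumes the contingency scales linearly with the miss rate, which is what the numbers above imply. The function name is made up for illustration.

```python
def contingency_needed(planned_budget, assumed_miss_rate, actual_miss_rate):
    """Scale a contingency budget by the ratio of actual to assumed miss rates.

    Assumes cost is proportional to the number of missed cases.
    """
    if assumed_miss_rate <= 0:
        raise ValueError("assumed_miss_rate must be positive")
    return planned_budget * actual_miss_rate / assumed_miss_rate


# Figures from the example: 0.2% assumed, 1.0% actual, $200,000 planned.
needed = contingency_needed(200_000, 0.002, 0.010)
print(f"${needed:,.0f}")
```

A miss rate five times higher than assumed means a contingency five times larger, even though both percentages look tiny.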

Here’s the same chart we discussed in Sin #1 (using measurement as a lever to drive someone else’s behavior), where we looked at how coloring the dots red is a "lever" rather than "feedback". Each dot represents a user story: the higher the dot, the longer the story took to complete. The upper control limit is supposed to be a two-standard-deviation, or 97.7%, threshold, but the data are not normally distributed, so it's really only about a 92% threshold.

Many phenomena in the knowledge-work domain exhibit fat-tail behavior, and the tail is frequently what interests us. Careless statistical analysis that assumes a normal distribution will hugely underestimate that tail.
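One way to check a control limit against your own data is to count how many points actually fall under it, rather than trusting the percentage a normal distribution would give. The sketch below uses synthetic, log-normally distributed "cycle times" as a stand-in for skewed data. They are an assumption for illustration, not the data behind the chart, so the coverage it prints will not match the 92% figure above.

```python
import math
import random
import statistics


def two_sigma_limit(samples):
    """Upper control limit computed as mean + 2 standard deviations."""
    return statistics.fmean(samples) + 2 * statistics.pstdev(samples)


def coverage(samples, limit):
    """Fraction of samples at or below the limit."""
    return sum(1 for x in samples if x <= limit) / len(samples)


def empirical_limit(samples, q):
    """Smallest observed value with at least a fraction q of samples at or below it."""
    ordered = sorted(samples)
    index = max(0, min(len(ordered) - 1, math.ceil(q * len(ordered)) - 1))
    return ordered[index]


# Synthetic skewed data (hypothetical): log-normal cycle times, fixed seed.
rng = random.Random(42)
cycle_times = [rng.lognormvariate(0.0, 1.0) for _ in range(20_000)]

# What a two-sigma limit covers if the data really were normal (about 97.7%).
normal_coverage = statistics.NormalDist().cdf(2.0)

# What the same formula actually covers on the skewed sample.
actual_coverage = coverage(cycle_times, two_sigma_limit(cycle_times))

# A limit read directly from the data at the intended percentile.
target_limit = empirical_limit(cycle_times, normal_coverage)

print(f"coverage expected under normality: {normal_coverage:.4f}")
print(f"coverage of mean + 2 sd on this sample: {actual_coverage:.4f}")
print(f"mean + 2 sd limit: {two_sigma_limit(cycle_times):.2f}")
print(f"empirical limit at the intended percentile: {target_limit:.2f}")
```

Reading the threshold off the empirical distribution sidesteps the normality assumption entirely, at the cost of needing enough data points to estimate the tail.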

This is just one example of the way in which bad analysis can drive misinformed decisions. The heavenly virtue to go with this sin is to invest in the expertise that enables you to do correct analysis.