
The harder truth is that metrics only appear to work. As soon as you use metrics to judge performance, people will start gaming them. They’ll do whatever it takes to get a good score, regardless of whether it’s good for the company or its customers.

The metrics “work,” in that they’ll go up, but the things you don’t measure will get worse, often (eventually) catastrophically.

In the case of velocity, you’ll get people taking shortcuts and sacrificing quality, both internal and external, so they can juice their numbers. The outcome is technical debt and arguments about what it means for something to be done, resulting in slower progress overall.

Source: I’ve been consulting in this space for a few decades now, and have seen the consequences of using velocity as a performance measure time and time again.

(Other metrics are just as bad. See Robert Austin, Measuring and Managing Performance in Organizations, for an explanation of why knowledge work like software development is incompletely measurable and thus subject to measurement dysfunction.)



The truth is somewhere in between.

Metrics are great for diagnosing the overall process and getting a sense of what's going on, one that can be superior to our qualitative feel for it.

And metrics can also spot the occasional outlier or performance problem. Used sparingly, they don't encourage juicing the numbers.


Yes, that is true. But metrics cannot pinpoint the cause of the problem, which is where the engineering approach fails. You cannot accurately measure the stress levels of individual developers the way you can with force plates, or detect developers taking shortcuts in their code the way you can measure gear slippage. Similarly, you cannot accurately predict the critical load of a particular configuration of developers the way you can for beams in bridge construction, nor can you weigh a feature request the way you can weigh a vehicle. And you cannot measure the "velocity" of two developer teams working on different codebases and then assume the metrics are comparable just because you're measuring the same quantity.

The only way software development metrics are useful is as an indication of the performance over time of the same team working on the same codebase. That should give some sense of the overall trend, but when the numbers start going down, how will you accurately determine the cause? Will you treat the developer team as a black box and insert more probes, or will you talk to them and rely on their qualitative assessments after all?
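
As a concrete illustration, here's a minimal sketch of that trend-over-time use (the sprint numbers are made up):

    # Minimal sketch: watch one team's velocity as a rolling trend rather
    # than a number to hit. The sprint values here are invented.
    velocities = [21, 23, 22, 24, 19, 18, 17]  # story points per sprint

    def rolling_mean(values, window=3):
        return [
            sum(values[i - window + 1 : i + 1]) / window
            for i in range(window - 1, len(values))
        ]

    print(rolling_mean(velocities))
    # A sustained decline is a prompt to go talk to the team,
    # not an explanation of why it's happening.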


Using data and metrics for analysis and self-reflection is great, when used thoughtfully. The problems arise when they’re used to judge performance, or even perceived to be used that way. That’s why they’re so tricky to use well. You have to set up a situation where it’s systematically impossible to abuse metrics, typically by keeping the data and the analysis/judgement at the same level of the organization, and only reporting aggregated/anonymized results and qualitative conclusions rather than the raw data.
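
For what it's worth, a minimal sketch of what "aggregated/anonymized" can look like in practice (names and numbers are invented): per-person raw data stays with the team, and only a team-level summary is reported upward.

    # Minimal sketch with invented data: raw per-person numbers stay with
    # the team; only an aggregated, anonymized summary is reported.
    from statistics import mean, median

    raw = {"alice": 8, "bob": 5, "carol": 13, "dave": 3}  # never reported as-is

    report = {
        "team_total": sum(raw.values()),
        "median_per_person": median(raw.values()),
        "mean_per_person": mean(raw.values()),
    }
    print(report)
    # {'team_total': 29, 'median_per_person': 6.5, 'mean_per_person': 7.25}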

Some people don’t know how to manage without measurements, punishments, and rewards. It’s a correctable flaw. Measurement-based management is called “Theory X” management, but knowledge work needs “Theory Y” management. There’s a lot of material out there on how to do it, including a section on it in my book.




