Practical Agility LLC


Can We Measure Developer Productivity and Is That Even the Right Question to Ask? - Part 2

In Part 1 of this blog article, we summarized the main points of the McKinsey article entitled “Yes, You Can Measure Software Developer Productivity” and shared our concerns about some of the proposed metrics and whether this is even the best question for us to ask if we want to move the needle on delivering more customer value.

Now we turn our attention to notable responses to the original article that were penned by Kent Beck and Gergely Orosz.

For those who don’t know, Kent Beck is one of the preeminent figures in modern software development and the creator of Extreme Programming. When it comes to software development topics, I’m always interested in his unique perspective. Gergely Orosz is best known as the creator of “The Pragmatic Engineer”, the #1 technology newsletter on Substack.

In their two-part response (part 1, part 2), Beck and Orosz introduce a framework for discussing productivity, measurement, and impact that I really like. They focus on four components of any work process: Effort, Output, Outcomes, and Impact. Outcomes and Impact are the most valuable to measure and optimize, but Effort and Output are the easiest to measure. One salient criticism they level at the measures McKinsey suggests (excluding the DORA and SPACE components) is that almost all of them focus on Effort or Output.

Beck and Orosz also discuss a question CEOs and CFOs often ask: if we can measure sales and recruiting performance individually, why can’t we do the same for software developers? They raise some good points about how sales and recruiting are often individual-focused activities, whereas software development is a more collaborative endeavor. This makes it easier to measure the Outcomes and Impact of individuals in sales and recruiting, and harder to attribute Outcomes and Impact to individuals in software development. One additional point I would make on the sales and recruiting front is that the underlying assumption, that making a sale or hiring a person is the Outcome we should primarily be measuring, is not quite correct. If we extend this out to Impact, we would need to consider whether the sale is “good revenue” or “bad revenue” (that is, whether there are externalities tied to the sale, like future delivery commitments or false promises, that will erode its long-term value). Likewise, in the recruiting scenario, wouldn’t it be better to look at the actual contributions of the hire and how long they remain at the company? These aren’t as easy to measure, but they are far more useful than total sales or total hires.

Another major point that Beck and Orosz make is that measuring activity instead of productivity is problematic. When we measure Effort and Output, and especially when those measures are tied to performance management (reviews, compensation, promotions), people will absolutely game the system to give you exactly what you are measuring. As Goodhart’s Law states, “When a measure becomes a target, it ceases to be a good measure.” This gaming of metrics, they note, can turn a collaborative environment into a cutthroat one. Measuring individual developer productivity carries a significant cost in culture and collaboration, while the benefits of the measurement are minimal in relation to that cost. I discussed similar concerns in two of my previous blog posts, Leaders Have the Power to Align Rewards, Metrics, and Culture to Desired Ways of Working and Be Careful What You Wish For - You Might Just Get It!.

Another focus of their responses is why CEOs and CFOs are asking the individual developer productivity question in the first place. In most of the cases they mention, looking at individual developer productivity isn’t the best way to address the underlying question that prompted the request. Instead, they suggest reframing the question about individual productivity as the underlying question that actually needs to be answered, and then determining the best way to address it. Often, it comes down to CEOs and CFOs wanting to make sure that you, as an engineering leader, are demonstrating strong stewardship of your engineering teams. I can confirm this from my many years of experience as a development leader. Developers are among the most expensive positions an organization funds on a per capita basis, so there is considerable and understandable scrutiny on ensuring that the expenditure drives positive business impact.

Beck and Orosz offer up two metrics they believe make good sense to use in addition to the DORA and SPACE-inspired metrics:

  1. Please the customer at least once per team per week, focus on delivering impact

  2. Business impact goals that a team can commit to

Finally, they offer two uses for Effort and Output metrics. First, these metrics can help you understand why you are not achieving your desired Outcomes and Impact, since they are early inputs that eventually drive Outcomes and Impact. Second, individuals can use them for their own self-improvement. Used in these ways, Effort and Output metrics do have some value.

In conclusion, I share Beck and Orosz’s concerns about measuring Effort and Output, the dangers of gaming metrics, and the erosion of the collaboration needed to continuously deliver value in our software. Likewise, I agree that engineering leaders need to get better at understanding what CEOs and CFOs are really asking for when they request individual developer productivity metrics, and at formulating meaningful responses that truly reflect the value (Outcomes and Impact) produced through the product development process. Getting better at speaking to these concerns means we spend less time on individual developer productivity metrics that aren’t very relevant and carry significant downsides.

The cover image for this blog is from Simone Secci @simonesecci.