Reflecting on our Defects First Experiment

Throughout 2020, I led the frontend rewrite for our online estimating product and its companion administrative application. Two greenfield products, combined with a supportive product team, created the perfect environment for a radical engineering experiment: defects before features. In this article I'll outline the context for the experiment, the results we saw, and the special nuances other teams should be aware of before employing a similar process.

Context

At first glance, this paradigm might not seem that revolutionary. Shouldn't every product strive for quality? Why would a software product allow defects to persist? However, the reality of modern software engineering makes this a challenging goal to attain because the vast majority of technology companies are understaffed relative to their product expectations.

Software engineers are incredibly expensive, so product groups will often create a surplus of features in an effort to ensure that every engineering hour is spent on actual engineering.

While this makes sense from a productivity standpoint, the effect on technical leadership groups is a perpetual feeling of being behind. Knowing full well that we can't accomplish everything that's asked of us, we begin to prioritize every task based on what our product group values, paying no mind to the task type. And what does every product group value?

Features. Features. Features.

Features are the presence of positive customer value, whereas defects are the presence of negative customer value. A defect-free codebase, then, just means there's an absence of negative customer value. When was the last time you valued the absence of a negative thing in your life? It's human nature to value the positive and rarely give any thought to the absence of the negative.

With this context in mind, it's easy to see how defects can build up in our applications and why it's so hard to create a sustainable codebase.

Defects First

With the chance to start from scratch, I wanted to test a hypothesis: a defects first policy would create a defect-free codebase while also increasing feature velocity.

After a year plus of religiously following this paradigm as a team, I can confidently say it's a success, though I've had to change my expectations slightly.

As of today, we have a single-digit defect count for our portion of the two applications we manage. Additionally, we have no known defects in our internal component library (~50 UI components). This is huge, as our library is now being consumed by two internal teams for their applications.

It's also been interesting to see how this paradigm shift has changed our code review process. While our code reviews were good before, they're now incredibly thorough. It's not uncommon to see 30+ comments on even the best pull requests.

I've come to realize that fixing defects first is a keystone habit, something Charles Duhigg details in his book The Power of Habit.

Small changes or habits that people introduce into their routines that unintentionally carry over into other aspects of their lives.

Our defects first approach has caused a massive shift in our code review process.

When defective code enters a codebase managed by a team that doesn't practice defects first, the typical workflow is as follows:

1. The defect is identified.
2. The defect is prioritized according to existing defects and features.
3. Weeks or months later, the ticket is picked up by a member of the team.

Something to note here is that the developer who fixes the defect is often not the one who created it. Additionally, a lot of velocity is lost when a feature is no longer fresh in one's mind. Compare that to a similar workflow under our defects first process:

1. The defect is identified.
2. The defect jumps above existing features.
3. The defect is quickly addressed.

In this workflow, the defect is often picked up the same day it's found because it leapfrogs all the features and there are few to no defects in front of it. The speed of addressing the defect ensures that, more often than not, the developer who introduced the bug is also the one fixing it. Additionally, the context is fresh in the developer's mind.

I believe that the process bleeds over into our code reviews because the consequences for bad code (or poor code reviews) are more immediate. Nobody wants to work on (or review) the same thing twice. Developers pay more attention to the details before opening a pull request and reviewers make sure everything works before approving.
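To make the ordering rule concrete, here is a minimal sketch of how a backlog could be sorted so that defects always jump above features. It is an illustration only, not the tooling our team uses: the `Ticket` shape, its field names, and the tie-breaking rules (product priority, then age) are all assumptions.

```typescript
// Hypothetical ticket shape; the field names are assumptions for illustration.
type TicketKind = "defect" | "feature";

interface Ticket {
  id: string;
  kind: TicketKind;
  productPriority: number; // lower number = more valued by product
  createdAt: number; // epoch milliseconds
}

// Defects first ordering: every defect sorts ahead of every feature.
// Within the same kind, fall back to product priority, then to age.
function compareDefectsFirst(a: Ticket, b: Ticket): number {
  if (a.kind !== b.kind) {
    return a.kind === "defect" ? -1 : 1;
  }
  if (a.productPriority !== b.productPriority) {
    return a.productPriority - b.productPriority;
  }
  return a.createdAt - b.createdAt;
}

// Returns a new sorted array and leaves the input backlog untouched.
function orderBacklog(backlog: readonly Ticket[]): Ticket[] {
  return [...backlog].sort(compareDefectsFirst);
}

const backlog: Ticket[] = [
  { id: "FEAT-1", kind: "feature", productPriority: 1, createdAt: 100 },
  { id: "BUG-7", kind: "defect", productPriority: 5, createdAt: 300 },
  { id: "FEAT-2", kind: "feature", productPriority: 2, createdAt: 200 },
];

const orderedIds = orderBacklog(backlog).map((ticket) => ticket.id);
console.log(orderedIds); // [ 'BUG-7', 'FEAT-1', 'FEAT-2' ]
```

Note that the defect lands first even though product ranked it lowest. The ticket type is checked before anything else, which is the whole point of the policy.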

This is definitely something we'll continue to do moving into 2021.

First Nuance: This Process Takes Time

While it ended up being a success, this process wasn't without its challenges. For one, it took a good amount of diligence before we started seeing increases in velocity and quality.

With greenfield projects, the quality is going to be consistent from day 1 because you'll always have a small defect count. However, it comes at the cost of velocity because in addition to addressing every defect, you're still working on infrastructure (build systems, environments, integration with other products). While the quality of the product was always high, we didn't start seeing a noticeable velocity improvement until we were 10 months in.

I imagine the timeline would be similar for a legacy application because there are often a ton of existing defects. For a project of that nature, the strategy will hinge on the existing defect count as well as the development team's relationship with the product team.

If product is okay with features not being developed for an extended period of time, go ahead and clear out the defects. Oftentimes, though, that's not possible, so you may need to modify the strategy: address defects first on all new development while finding some sort of balance between old defect fixes and new feature development.

The defect count should be going down month after month. If it's not, you need to be more aggressive in addressing the defects and scale back the number of new features you're working on.
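One way to keep yourself honest about that trend is a small check over monthly counts. The sketch below is hypothetical: the function name, the input format (one open-defect count per month, oldest first), and the decision to treat "holding at zero" as healthy are my assumptions, not part of any tool we used.

```typescript
// Returns true when the open-defect count drops every month.
// Staying at zero also counts as healthy, since there is nothing left to fix.
// With fewer than two data points there is no trend to violate, so it returns true.
function defectTrendIsHealthy(monthlyOpenDefects: readonly number[]): boolean {
  for (let i = 1; i < monthlyOpenDefects.length; i++) {
    const previous = monthlyOpenDefects[i - 1];
    const current = monthlyOpenDefects[i];
    const heldAtZero = previous === 0 && current === 0;
    if (!heldAtZero && current >= previous) {
      return false;
    }
  }
  return true;
}

console.log(defectTrendIsHealthy([40, 31, 22, 9])); // true
console.log(defectTrendIsHealthy([40, 42, 30])); // false
```

A `false` result is the signal described above: be more aggressive on defects and scale back new feature work.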

Second Nuance: Product Needs to Believe It

During this last year there were a handful of uncomfortable moments where a pending deadline for a given feature pushed us to the brink of abandoning the process. Lucky for us, our product leadership group saw the merit in our process and supported us in addressing defects before we addressed features.

Sometimes this meant we missed our intermediary deadlines. By intermediary, I mean that some deadlines are fixed (e.g. we have to have this by the end of the year) while others exist to ensure we're on track for the fixed ones (e.g. we need this by the end of the year, so getting it in by September will probably put us on pace for other deadlines). We did miss some of those intermediary deadlines but, at least from a frontend perspective, we've hit all of our fixed deadlines. As an aside, we've actually completed 3 large features that we didn't plan on doing in 2020 and we're pushing for a fourth before year's end.

But I digress. The point is that you need a strong enough relationship with your product team that it's okay to miss an intermediary deadline, or have nothing major to show for an upcoming demo, in order to follow this process long enough to see results. While it hasn't always been comfortable in those moments, our product team has supported us in sticking with it throughout the year, and we're now enjoying a high quality product and a strong feature development velocity.