Survivor Bias for Software Teams

I have been thinking about how drastically my understanding, opinions, and even convictions change over the course of a project. It’s only natural and hopefully a good thing considering it is the product of experience. But if we don’t take time to think critically about those changes, are they always good?

Survivor bias is an interesting phenomenon. Basically, if you only ever look at the successes, you may be missing the most important facts. A Smart Bear has a great story to illustrate the problem (and how it relates to business and startups):

During World War II the English sent daily bombing raids into Germany. Many planes never returned; those that did were often riddled with bullet holes from anti-air machine guns and German fighters.

Wanting to improve the odds of getting a crew home alive, English engineers studied the locations of the bullet holes. Where the planes were hit most, they reasoned, is where they should attach heavy armor plating. Sure enough, a pattern emerged: Bullets clustered on the wings, tail, and rear gunner’s station. Few bullets were found in the main cockpit or fuel tanks.

The logical conclusion is that they should add armor plating to the spots that get hit most often by bullets. But that’s wrong.

Planes with bullets in the cockpit or fuel tanks didn’t make it home; the bullet holes in returning planes were “found” in places that were by definition relatively benign. The real data is in the planes that were shot down, not the ones that survived.

Below, I am going to outline a few lessons I’ve learned through failure. Hopefully, you can avoid some of the mistakes I’ve made, even though the methods behind them appeared to succeed.

Survivor Bias for Process

We build processes to facilitate software development and to help us avoid common pitfalls. One very common practice is continuous integration with tools such as Jenkins. When deadlines come due and there is a big push to the end, it can be tempting to declare martial law and accept that master will fail in order to deliver the code.

I have had to accept this on many occasions. Trust me, I don’t take it lightly, and generally the team doesn’t either. It often takes weeks or months afterward to get everything to pass, and we’ve usually introduced a number of new bugs in the process. That hurts all the more because our team generally tests only important features and functions, so every test failure is significant.

After successful deliveries, in spite of master failing, it seems that the practice of hand testing everything and fixing only critical bugs is acceptable. In this case, that is the process that weathered the storm and allowed us to succeed. A fool would look at a number of these deliveries and say that this is how successful teams operate.

In fact, the times that we were most successful were the times that master passed for long periods of time. The reason was that we were able to focus more on development and integration than testing and hot-fixing. These are the times that we want to emulate, not the ones when we were able to come through and no one thought we could. The logical conclusion is actually not only that we want master to always pass – it’s that we want to build schedules and make promises which allow us to keep a constant velocity without panic, and have a stable build throughout the process.

Where were the cockpit and fuel tanks in this scenario? Scheduling.

Survivor Bias for Technical Debt

Our convictions are most tested when the going gets tough. Oftentimes teams need to accept technical debt to meet their schedules and their customers’ needs. This is a normal part of software. A poor evaluation of the situation might lead some to think that this practice, as a whole, is what allows a team to be successful.

My experience has been that both successful and unsuccessful teams share this practice. The irony is that the worst teams actually did it more. In fact, the most serious threat of demise I’ve seen came when a team decided to “budge” on too many issues and wound up incurring a debt load that was unsustainable.

So the survivor bias says that it is a good practice to sometimes accept technical debt. The hidden truth is that accepting technical debt can be crippling. Instead, the real lesson needs to be that successful teams consider the debt carefully.

Survivor Bias for Ideas

We’ve all heard it. “This is how we’ve always done it.” or “Nothing else works because…”

It can be frustrating to argue this point since things are already working. It may even be tempting to simply fix the issues with the current system, rather than consider replacing it.

This sort of thinking is a direct analogue to the WWII story. Just because something has survived does not make it the best solution. Here, as a lead, it is important to consider the costs and benefits of potential new ideas. It may seem easier to keep patching the current system and keep it limping along, but will that wind up being more work than simply replacing it with a better system?

Survivor Bias for People

To me, this seems like one of the most interesting, and hardest, things to get right. In my current position, I’ve seen untold amounts of turnover, even at the top of the organization, much more than I did when I worked retail. When it’s time to staff up or replace someone, it is all too easy to fall prey to survivor bias. It’s tempting to look for people like the team you already have. After all, they are the ones who can survive the environment, right?

What’s wrong with this plan? The problem is that you’re not taking into account “all of the bombers that didn’t make it home.” The real questions at this time need to be “What is my team missing?” and “What drove away the people who had these traits?”

Moral for leads

It’s sometimes easy to feel like we’re invincible, especially when we’ve been successful through hard times or difficult situations. The trick is that, as a software lead, you need to reflect on what actually makes you successful and how to maintain it. The decisions that made you successful might not always be obvious, and may in fact be obscure, but they are the real nuggets of experience we need to focus on.

As one of the principles behind the Agile Manifesto says, “At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.” This is also true for individuals, especially leads.