Embracing Failure: A Key to Success in the Modern Workplace

[Image: a coffee spill on a tiled floor, with a trail leading to a smaller puddle and an upturned coffee cup lid nearby.]

This post is part of an ongoing series on the books that I have read as part of my continual professional development (CPD). All of my CPD posts are available at the following link: Continual Professional Development

Amy C. Edmondson’s “Right Kind of Wrong: The Science of Failing Well” is one of the books I find myself recommending most often when I work with engineering leaders who are stuck on incident response. The book’s argument is unfussy: most organisations get failure wrong, the wrongness is patterned and predictable, and the fix is largely a matter of deciding to behave differently when something goes wrong. This post is my CPD review, focused on the parts that have changed how I think about working with teams in the aftermath of mistakes.

A note before I begin: this is a book about embracing failure, but the principle of Extreme Ownership still applies. I covered that in Leading with Ownership. Embracing failure is not the same thing as abdicating responsibility for it.

The Balanced Approach to Expecting Outcomes

Edmondson opens by separating two related ideas that most organisations conflate: the expectation that you will perform well, and the expectation that you will perform perfectly. The first is reasonable. The second is corrosive. A team that expects perfection treats every imperfection as failure, which means most of the team’s days are recorded as failures. That accounting will eventually shape the team.

If you expect to do everything perfectly to win every contest, you will be disappointed or even distressed when it doesn’t happen. In contrast, if you expect to try your best, accepting that you might not achieve everything that you want, you’re likely to have a more balanced and healthy relationship with failure.

The reframe is small but the effect is substantial. A team that expects setbacks treats them as data when they arrive. A team that does not treats them as evidence of inadequacy. The first response leads to debrief; the second leads to defence. Only one of those leads to the next improvement.

The Fallibility of Humans and the Importance of Effective Failure

Edmondson is careful to say that failing well is not natural. Humans have an emotional aversion to failure, and even when we have overcome the aversion intellectually, our reflexive responses still follow the old pattern. The skill of failing effectively (which is what the whole book sets out to teach) therefore has to be built deliberately and maintained, like any other professional habit.

Each of us is a fallible human being, living and working with other fallible human beings. Even if we work to overcome our emotional aversion to failure, failing effectively isn’t automatic.

The connection to psychological safety is direct. If a team has not built the conditions in which mistakes can be admitted, every failure produces two costs: the cost of the failure itself, and the cost of the cover-up. The cover-up cost almost always exceeds the original cost. I have written about this dynamic at greater length in Your Team’s Silence Is Telling You Something.

The Pitfalls of Punishing Errors and the Fundamental Attribution Error

The most common organisational response to failure is to find the responsible person and apply consequences. The intent is usually deterrence. The actual effect is to teach the team that the safest move when something goes wrong is to hide it. The next failure is just as likely to occur, but now nobody will tell you about it until the impact is large enough that hiding it is no longer an option.

The important thing to remember about errors is that they are unintended—and punishing them as a strategy for preventing failure will backfire. It encourages people not to admit errors, which ironically increases the likelihood of preventable basic failure.

Edmondson also names the cognitive bias that makes the punitive response feel natural: the fundamental attribution error. When we look at our own failures, we explain them by reference to the circumstances. When we look at other people’s failures, we explain them by reference to their character or competence. Almost every incident review I have sat in on contains some form of this bias unless someone in the room is actively defending against it.

💡 Remember

The fundamental attribution error pulls in the opposite direction to Extreme Ownership, under which we take complete responsibility for outcomes, stay humble, learn from mistakes, and build a collaborative team environment.

"Poka-yoke" and the Toyota Production System in Agile Development

Edmondson devotes a chapter to error-proofing, by which she means designing systems so that the easiest path is the correct one. The Japanese word for this is poka-yoke, originating in the Toyota Production System and propagating from there into Lean manufacturing and (with some intermediate steps) Agile software development.

Poka-yoke, which means "error-proofing" in Japanese, a term that originated with the Toyota Production System (TPS), is a valued practice in modern manufacturing. That so many of the objects we use benefit from poka-yoke is evidence of basic failure’s ubiquity. We all experience instances of inattention. We can all hold faulty assumptions and be overconfident. The goal is to take measures to reduce the number of basic failures these tendencies cause.

In software the principle shows up in places we no longer recognise it: type systems that refuse to compile when an obvious mistake has been made, linters that catch issues before code review, deployment pipelines that fail fast when an assumption has been violated, unit tests that run automatically on commit. None of these eliminate failure. They make particular kinds of failure cheap to discover and trivial to fix, before they become incidents.
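To make the idea concrete, here is a minimal sketch of poka-yoke in code. It is my own illustration rather than anything from the book, and the names (`Environment`, `DeployRequest`, `parse_environment`) are hypothetical. The point is that the shape of the data rules out the common slip: a mistyped environment name or a zero replica count is rejected where it is created, not discovered later in production.

```python
from dataclasses import dataclass
from enum import Enum


class Environment(Enum):
    # Hypothetical deployment targets; only these values can exist.
    STAGING = "staging"
    PRODUCTION = "production"


@dataclass(frozen=True)
class DeployRequest:
    # Hypothetical request object, used here only for illustration.
    service: str
    environment: Environment
    replicas: int

    def __post_init__(self) -> None:
        # Fail fast: reject an invalid request at construction time,
        # before it can reach any deployment step.
        if not self.service:
            raise ValueError("service name must not be empty")
        if self.replicas < 1:
            raise ValueError("replicas must be at least 1")


def parse_environment(raw: str) -> Environment:
    # A typo such as "prodcution" raises ValueError here instead of
    # being passed along as an unchecked string.
    return Environment(raw.strip().lower())
```

Calling `parse_environment("prodcution")` or `DeployRequest("api", Environment.STAGING, 0)` raises a `ValueError` immediately. Neither check prevents the slip from being made; each just makes it cheap to catch at the moment it happens.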

Complex Failures and the Importance of Systemic Change

Edmondson splits failure into three categories: basic, complex, and intelligent. Basic failures are caused by inattention or lack of skill. Intelligent failures are the expected outcome of well-designed experiments. Complex failures, which she spends most of the book on, are the dangerous ones: failures that emerge from the interaction of multiple factors within a system, often with warning signs that nobody connected at the time.

Complex failures happen in familiar settings, which is what distinguishes them from intelligent failures. Despite being familiar, these settings present a degree of complexity where multiple factors can interact in unexpected ways. Usually, complex failures are preceded by subtle warning signs. Finally, they often include at least one external, seemingly uncontrollable, factor.

The actionable point is that complex failures do not have a single root cause, and pretending they do produces the wrong response. The team meeting that ends with “so let’s all be more careful” has misdiagnosed a complex failure as a basic one. The fix has to operate on the system that produced the conditions, which usually means changing a process or a design rather than changing a person.

The Danger of Overconfidence and the Power of Responsibility

Overconfidence is the bridge between basic and complex failures. A practitioner who could once produce a particular kind of work without error develops the habit of confidence, then stops checking the assumptions that made the habit safe. The conditions shift; the confidence does not.

More generally, when you find yourself thinking, "I can do this in my sleep," watch out! Overconfidence is a precursor to complex failure, just as it is to basic failure.

The corrective is unfashionable. Stay sceptical of your own work, especially the parts you have become good at. Welcome the colleague who tells you they don’t understand what you did or why you did it, because their confusion is more useful to you than your team’s deference. Take responsibility for your contribution to the failure without taking on responsibility for the entire system that produced it; the latter is self-flagellation in a costume.

Learning from Failures: A Path to Wisdom

What Edmondson is asking for, finally, is a shift in what the team treats as a source of useful information. Most organisations treat near-miss reports, anomalies, and bug reports as administrative overhead to be processed. The teams that fail well treat them as the richest signal the team produces, because they tell you where the system is fraying before it snaps.

Understanding how systems produce failures—and especially which kinds of systems are especially failure prone—helps take blame out of the equation. It also helps us to focus on reducing failure by changing the system rather than by changing or replacing an individual who works in a faulty system.

This is unglamorous work. Nobody writes case studies about the company that quietly stopped repeating the same incident every six months because they changed the way they ran retrospectives. The companies that do this work just become, year on year, slightly harder to surprise. From the outside that looks like luck. From the inside it is the compounding effect of a team that learned to fail well.

If you’d like to build a psychologically safe culture where failure is treated as a learning opportunity rather than a liability, let’s talk.