Seeing that my natural vacation sanctuary, where I normally go to take a break from life’s stresses and enjoy time with my family, is about to be permanently destroyed, I decided to break with the Idea Management and Lean tone of this blog in order to reflect a little on quality management. For many years, I helped Mercedes-Benz suppliers improve their quality through lean tools, but also through the use of statistics. Even though I was never formally trained as a Six Sigma ‘grasshopper’, much less a Sensei, I did use many of the statistical tools found in the Six Sigma toolbox.

The FMEA has always been a key tool in the auto industry for identifying areas of product quality risk and planning how to mitigate them. Those components that can play a role in a potential catastrophic failure, loss of life or loss of property get treated with extra care. To generalize, these components are ‘serialized’, and data is recorded along the entire manufacturing chain. Every critical process is monitored, and equipment is designed to “inspect” its own quality; in the case of critical characteristics, it is also designed to check the quality of preceding processes. Redundancy is built in so that if one inspection process fails, the next one will catch the defect. These redundant checks are designed in layers, and their ultimate goal is to ensure no bad parts exit the manufacturing process.

With that said, to put Six Sigma quality in perspective, aircraft are a good example of redundant systems at work. Critical systems in aircraft are designed with multiple backup systems. (Keeping the math simple, and not using real-life numbers.) If a hydraulic system has a natural tendency to fail once in 100,000 uses, adding a backup system that fails independently at the same rate means a simultaneous failure of both will only occur once in 10 billion uses (100,000 × 100,000).
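
For readers who want to check the arithmetic behind redundancy, here is a minimal sketch in Python. It rests on one big assumption that real systems rarely meet perfectly: each system fails independently of the others, so the individual failure rates simply multiply. The function name `simultaneous_failure_rate` is my own, made up for this illustration, and the 1-in-100,000 figure is the same simplified number used above, not real aircraft data.

```python
from fractions import Fraction


def simultaneous_failure_rate(single_rate: Fraction, layers: int) -> Fraction:
    """Chance that every one of `layers` redundant systems fails on the same use.

    Assumption: the systems fail independently and share the same failure
    rate, so the combined rate is the single rate raised to the number of layers.
    """
    return single_rate ** layers


# Illustrative number only: one failure in 100,000 uses.
single = Fraction(1, 100_000)
both = simultaneous_failure_rate(single, 2)

print(f"One system fails once in {single.denominator:,} uses")
print(f"Primary and backup fail together once in {both.denominator:,} uses")
# Prints:
# One system fails once in 100,000 uses
# Primary and backup fail together once in 10,000,000,000 uses
```

The same multiplication applies to layered inspections: each added independent layer shrinks the chance of a defect slipping through by the same factor. If the layers share a common cause of failure (the same power supply, the same skipped procedure), the independence assumption breaks and the real risk is higher than this calculation suggests.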

In general there are two major reasons for quality failures. The first is the failure to identify a potential failure mode, and therefore to guard against it; this is normally due to the lack of a historical reference or a lesson learned. The second is by far the worst: the failure of people to follow established procedures. This is critical because it is not a reflection on the actual workers, but rather a reflection on management.

Bunji Tozawa said, “Blame the process and not the person.” What he alludes to is that management is responsible for the processes, and thus a failure is essentially their fault.

Like the auto industry, big oil relies on suppliers, and it’s extremely critical to ensure these suppliers manage and maintain their internal procedures. In the auto industry we don’t only measure and rate suppliers by their ability to supply good parts, but we also audit their adherence to their quality systems and have different means of flagging potential problems before they occur. The proactive approach is taken to ensure that human lives are not lost driving cars.

Having a deep understanding of quality systems and redundancy, as well as personal ties inside the oil industry, I find it almost impossible to believe that the sinking of the BP platform, along with BP’s overall safety and environmental record (pipeline 2006, refinery explosion 2005), all happened because of ‘bad luck’ or failed equipment! This wasn’t a failure as in case one, not identifying a potential failure mode, but instead a failure to follow procedures and adhere to best practices.

One reason the global oil safety record is generally good is its strict processes and procedures. (I also wrote about managing safety and how Schlumberger uses idea management systems to manage identified safety risks.) Keep in mind that there are more oil wells in operation than there are aircraft in the air on any given day, and the number of catastrophic incidents pales in comparison to the aeronautical industry. (Here’s an old CNN article showing the worst accidents through 2001.)

The bottom line: when the dust clears, a thorough investigation will likely reveal a lack of self-auditing and supplier auditing practices inside BP, and management’s inability or unwillingness to ensure that the entire corporate culture is driven by adherence to established procedures. A good indicator here will be BP’s response. As they start to blame Transocean (the operator of the rig), a “faulty blow-out” control system, a missed maintenance step, or operator error, what they will really be saying is that management has been incompetent and unable to drive a corporate culture that adheres to strict safety and environmentally relevant procedures.