There are many tools and methods that people refer to as Root Cause Analysis. Unfortunately, most of them are ineffective, and conventional wisdom continues to propagate the ignorance that abounds. Conventional wisdom in most subjects is nearly always wrong, but when coupled with the intellectual laziness and resistance to change inherent in the human condition, conventional wisdom leads to stasis and a dedication to ignorance. Such is the case with human problem-solving.
Let's take a look at the various methods people have created to help them solve event-type problems and compare these methods to the RealityCharting™ process. Because there is no subject or discipline dedicated to effective problem-solving in the education system, businesses have taken it upon themselves to create their own problem-solving processes. These different methods are generally referred to as "Root Cause Analysis or RCA," and there are many sources available today that discuss these conventional tools and processes. To save you the pain of having to sort out the effective from the ineffective, we have done the work for you.
Many purveyors of root cause analysis believe the process is so complicated that you should use several methods for each problem or select one based on which type of problem you are experiencing. We've found that the reason some people think this way is that they don't understand the cause-and-effect principles. To quote Albert Einstein: "If you can't say it simply, you probably don't understand it."
In order to compare the many problem-solving tools and methods, we created six criteria that define what we believe are the essential qualities required to ensure we find effective solutions. One point is scored for each criterion that is met, and Limited is scored as 0.5 points. Details of each tool or method are provided in the links on the left side of this page.
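The scoring rule can be written out as a small sketch. The rating labels and the example ratings below are hypothetical placeholders, not scores taken from the comparison itself, and scoring an unmet criterion as zero is an assumption.

```python
# Points per rating, as described above: a met criterion scores 1 and
# "Limited" scores 0.5. Scoring an unmet criterion as 0 is an assumption.
POINTS = {"yes": 1.0, "limited": 0.5, "no": 0.0}


def score_method(ratings):
    """Sum the points for one method's ratings against the six criteria."""
    if len(ratings) != 6:
        raise ValueError("expected one rating per criterion (six in total)")
    return sum(POINTS[r] for r in ratings)


# Hypothetical ratings for an unnamed method, for illustration only.
example = ["yes", "limited", "no", "yes", "limited", "no"]
print(score_method(example))  # 3.0
```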
To properly evaluate the many so-called root cause analysis methods and tools, we need a standard to which they can be compared.
It is generally agreed that the purpose of root cause analysis is to find effective solutions that prevent recurrence of the defined problem. Accordingly, an effective root cause analysis process should provide a clear understanding of exactly how proposed solutions meet their goal.
We believe that an effective problem-solving process should meet the following six criteria.
This is a simple causal process whereby one asks why of a predefined problem, answers with at least two causes in the form of an action and condition, then asks why of each answer and continues asking why of each stated cause until there are no more answers. At that time, a search for the unknown is launched and the process is repeated several times until a complete cause-and-effect chart, called a Realitychart, is created, showing all the known causes and their interrelationships.
Every cause on the chart has evidence to support its existence or a "?" is used to reflect an unknown and thus a risk. All causes are then examined to find a way to change them with a solution that is within your control, prevents recurrence, meets your goals and objectives, and does not cause other problems.
The result is clear causal connections between your solutions and the defined problem. Because all stakeholders can insert their causal relationships into the Realitychart, buy-in of the solutions is readily attained.
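One way to picture the chart described above is as a tree of causes, each carrying either its evidence or a "?". The sketch below is only an illustration of that idea, not the RealityCharting software or its data model; the `Cause` class, the `unknowns` helper, and the fire example are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cause:
    description: str
    kind: str            # "action", "condition", or "effect" for the primary effect
    evidence: str = "?"  # "?" marks an unknown, and thus a risk
    causes: list = field(default_factory=list)  # the answers to "why?"


def unknowns(cause):
    """Walk the chart and collect every cause that still lacks evidence."""
    found = [cause.description] if cause.evidence == "?" else []
    for child in cause.causes:
        found.extend(unknowns(child))
    return found


# Hypothetical chart: the effect is answered with an action and a condition.
fire = Cause("Open fire", "effect", evidence="Photo of scene", causes=[
    Cause("Match struck", "action", evidence="Witness statement"),
    Cause("Oily rags present", "condition"),  # no evidence yet
])
print(unknowns(fire))  # ['Oily rags present']
```

Listing the causes that still carry a "?" shows where the search for the unknown would continue.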
This is a complicated process that first identifies a sequence of events and aligns the events with the conditions that caused them.
These events and their respective conditions are aligned along a time line. Events and conditions that have evidence are shown in a solid line but evidence is not listed; all other observations are shown in dashed lines. After this representation of the problem is complete, an assessment is made by "walking" the chart and asking if the problem would be different if the events or conditions were changed. This leads to identifying causal factors such as training not adequate, management less than adequate, or barrier failed, which are identified by evaluating a tree diagram.
Events and Causal Factor Charting can provide the time line to help discover the action causes, but it is generally inefficient and ineffective because it mixes storytelling with conditional causes. It thus produces complicated relationships that are not necessarily causal, which only adds confusion rather than clarity. Instead of identifying the many causal relationships of a given event, events and causal factor charting resorts to categorizing the important causes as causal factors, which are then evaluated as solution candidates using the same method as the categorization schemes.
Events and Causal Factor Charting does not follow the principles of cause and effect.
This is a six-step process that describes the event or problem, then describes the same situation without the problem, compares the two situations, documents all the differences, analyzes the differences, and identifies the consequences of the differences.
The results of the change analysis identify the cause of the change. That cause will frequently be tied to the passage of time and therefore fits easily into an events and causal factors chart, showing when and what existed before, during, and after the change.
Change analysis is nearly always used in conjunction with another RCA method to provide a specific cause, not necessarily a root cause.
Change Analysis is a very good tool to help determine specific causes or causal elements, but it does not provide a clear understanding of the causal relationships of a given event. Unfortunately, many people who use this method simply ask why the change occurred and fail to complete a comprehensive analysis.
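The comparison at the heart of change analysis can be sketched as a difference between two descriptions of the same situation. This is a minimal illustration under assumed data; the factor names and values are hypothetical, and the analysis of each difference remains a human step.

```python
def differences(without_problem, with_problem):
    """Return {factor: (before, after)} for every factor that differs."""
    factors = sorted(set(without_problem) | set(with_problem))
    return {
        f: (without_problem.get(f), with_problem.get(f))
        for f in factors
        if without_problem.get(f) != with_problem.get(f)
    }


# Hypothetical descriptions of the situation without and with the problem.
baseline = {"operator": "day shift", "valve": "model A", "procedure": "rev 3"}
incident = {"operator": "day shift", "valve": "model B", "procedure": "rev 3"}
print(differences(baseline, incident))  # {'valve': ('model A', 'model B')}
```

The output is a list of candidate causes only. It says nothing about how the change in valve model connects causally to the event.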
This incident analysis identifies barriers used to protect a target from harm and analyzes the event to see if the barriers held, failed, or were compromised in some way by tracing the path of the threat from the harmful action to the target.
A simple example is a knife in a sheath. The knife is the threat, the sheath is the barrier, and the target is a human. If the sheath somehow fails and a human is injured, the barrier analysis would seek to find out why the barrier failed. The cause of this failure is then identified as the root cause.
Barrier analysis can provide an excellent tool for determining where to start your root cause analysis, but it is not a method for finding effective solutions because it does not identify why a barrier failed or was missing. This is beyond the scope of the barrier analysis. To determine root causes, the findings of the barrier analysis must be fed into a principle-based method to discover why the barrier failed.
This type of root cause analysis is very common and goes by many names such as Ishikawa Fishbone Diagram, Management Oversight and Risk Tree (MORT) Analysis, Human Performance Evaluation Systems (HPES), and many other commercial brands. These methods use a predefined list of causal factors arranged like a fault tree.
Ishikawa uses manpower, methods, machinery and environment as the top-level categories. Each of these categories has sub-categories. For example, within the category of manpower, we may find management systems; within management systems we may find training; and within training we may find training less than adequate; and so on.
All categorization methods use the same basic logic with the premise that every problem has causes that lie within a predefined set of categories. These methods ask you to focus on one of the categories such as people and, in reviewing what you know of your event, to choose some causal factors from the list provided.
These Tree Diagrams, also known as Categorization Schemes, are steadily being replaced with RealityCharting® but continue to retain a few followers because they appeal to our sense of order and “push-button” type thinking. There are at least seven major weaknesses in the tree diagram model.
Weakness 1. A tree diagram is clearly not a cause-and-effect chart, as the proponents of these methods would have us believe. It simply does not show all the causal relationships between the primary effect and the root causes.
Weakness 2. No two categorization schemes are the same, nor can they be, because we each have a different way of perceiving the world. When asked to categorize a given set of causes it is very difficult to find a consensus in any group. For example, what category does “Pushed Button” fall into? Some will see this as hardware; some will see it as people; and some will see it as procedure. If you have ever used any of these categorization methods to find a root cause, I know you have incurred many a wasted hour debating which is the correct category.
Weakness 3. The notion that anyone can create a list of causal factors that includes all the possible causes or causal factors of every human event should insult our intelligence. Ask yourself if your behavior can be categorized in a simple list and then ask if it is identical to that of every other human on the planet. The very fact that a method uses the term “causal factor” should be a heads-up that it does not provide a specific actionable cause but rather a broader categorical term representing many possible specific causes. At best, it acts as a checklist of possible causes for a given effect, but it does not provide any causal relationships. Since this error in logic is very contentious with those who use these methods, it raises the question of why these methods seem to work for them. What I have discovered, after talking with many people who claim success in using these methods, is that they work in spite of themselves by providing some structure for the experienced investigator whose mind provides the actual causal relationships. It is not the methodology that works, but the experience of the investigator who is actually thinking causally. And while these methods seem to work for the experienced investigator, they are still incapable of communicating the reality of causal relationships to others. This inability to effectively communicate prevents the synergy among stakeholders necessary to fully understand the causes of the event, which is required to get buy-in for the solutions.
Weakness 4. These processes do not provide a means of showing how we know that a cause exists. There is no evidence provided to support the “causal factors” in the list, so it is not uncommon for causal factors to be included that are politically inspired with no basis in fact. With these methods, the best storytellers or the boss often get what they want, and the problem repeats. This may help explain why many managers and self-proclaimed leaders like this method.
Weakness 5. Categorization schemes restrict thinking by causing the investigator to stop at the categorical cause. Some methods reinforce this fallacy by providing a “root cause dictionary,” implying that it is a well-defined and recognized cause.
Weakness 6. Categorization methods perpetuate the root cause myth, based on the belief that it is a root cause we seek and that solutions are secondary. Because these methods do not identify complete causal relationships, it is not obvious which causes can be controlled to prevent recurrence; therefore, you are asked to guess and vote on which causal factors are the root causes. Only after root causes are chosen are you asked to identify solutions, and without a clear understanding of all known causal relationships between the solution and the primary effect, this method works by chance, not by design.
Weakness 7. Some of these categorical methods provide what is called an “expert system” that includes solutions for a given root cause. Expert systems can be quite useful for a very specific system such as a car or production line where most of the causal relationships are well known and have a long history of repeatability. To presume that one could provide an expert system applicable to all event-based problems seems to me to be incredibly arrogant. How could anyone presume to know the causal relationships for all systems, how they interrelate, and what constitutes the best solution for every organization or individual? Beware the salesperson.
As you can see from all these weaknesses, Tree Diagrams are people-centric and do not follow the principles of cause and effect.
One of the many brainstorming methods, also known as "the Five Whys method," is the most simplistic root cause analysis process and involves repeatedly asking why at least five times or until you can no longer answer the question. Five is an arbitrary figure.
The theory is that after asking why five times you will probably arrive at the root cause. That is, the root cause has been identified when asking "why" doesn't provide any more useful information.
This method produces a linear set of causal relationships and uses the experience of the problem owner to determine the root cause and corresponding solutions.
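The linear nature of the method can be sketched as a single chain in which every effect has exactly one recorded cause. The `five_whys` function and the car example below are hypothetical illustrations, not part of any published procedure.

```python
def five_whys(problem, answers, limit=5):
    """Follow one chain of 'why?' answers until none remains or the limit is hit."""
    chain = [problem]
    while len(chain) - 1 < limit and chain[-1] in answers:
        chain.append(answers[chain[-1]])
    return chain


# Hypothetical answers: each effect maps to exactly one cause.
answers = {
    "Car will not start": "Battery is dead",
    "Battery is dead": "Alternator is not charging",
    "Alternator is not charging": "Alternator belt is broken",
}
chain = five_whys("Car will not start", answers)
print(chain[-1])  # Alternator belt is broken
```

Because the mapping holds only one answer per effect, the sketch cannot represent an effect that has both an action cause and a condition cause, which is the limitation of any single linear chain.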
The Five Whys method is inappropriate for any complicated event, but it is actually quite useful when used on minor problems that require nothing more than some basic discussion of the event. Unlike most of the other methods, it identifies causal relationships, but still subscribes to the root cause myth of first finding the root cause and then assigning solutions. It should never be used for formal incident investigations, but is perfectly acceptable for informal discussions of cause. A better approach to simple problems is RealityCharting Simplified™, an easy-to-use software application that follows the Five Whys philosophy, but includes principle-based causal logic.
This is a statistical approach to problem solving that uses a database of problems to identify the number of predefined causal factors that have occurred in your business or system. It is based on the Pareto principle, also known as the 80-20 rule, which presumes that 80% of your problems are caused by 20% of the causes.
It is intended to direct resources toward the most common causes. Often misused as an RCA method, Pareto analysis is best used as a tool for determining where you should start your analysis.
Pareto Analysis uses a failure database to trend the frequency of categorical failures. This process is fraught with many landmines, a few of which are discussed below.
1. The accuracy of a Pareto chart is limited by the accuracy of the data used to create it. If you use a failed approach like tree diagrams to determine the causal factors, the Pareto chart will only reflect causal factors from the predefined list provided.
2. The cause-and-effect principle dictates that all causes and effects are part of the same continuum. In many cases, certain causes will be closely linked (i.e., close to each other). For example, the cause “procedures not followed” could be caused by “procedures not accurate.” In the Pareto analysis, this causal connection is lost. Instead, we see both “procedures not followed” and “procedures not accurate” in those top cause categories, so we end up working on solving both problems when in reality we may only need to solve the “procedures not accurate” problem. In this example, the incomplete view of reality provided by a Pareto analysis may have caused you to expend more resources than necessary.
3. Pareto analysis can mask larger, more systemic issues. For example, if quality management has transitioned into a state of dysfunction, this can cause symptoms in many different areas, such as poor procedures, inadequate resources, outdated methods, high failure rates, low morale, etc. Pareto analysis has you capturing all these symptoms of a larger problem as causes, and wasting time solving the symptoms rather than the problem.
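The frequency trending behind a Pareto chart can be sketched in a few lines. The failure records and the `pareto` helper below are hypothetical; the 0.8 threshold simply mirrors the 80-20 rule mentioned above.

```python
from collections import Counter


def pareto(causal_factors, threshold=0.8):
    """Return the most frequent factors that together reach `threshold` of all records."""
    counts = Counter(causal_factors)
    total = sum(counts.values())
    selected, running = [], 0
    for factor, n in counts.most_common():
        if running / total >= threshold:
            break  # the factors already selected cover enough records
        selected.append(factor)
        running += n
    return selected


# Hypothetical failure database with one categorical factor per record.
records = (
    ["procedures not followed"] * 5
    + ["procedures not accurate"] * 3
    + ["equipment failure"]
    + ["other"]
)
print(pareto(records))
# ['procedures not followed', 'procedures not accurate']
```

Both procedure categories land in the top group, yet nothing in the counts shows that one may be the cause of the other, which is the lost connection described in the second point.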
This is not really a root cause analysis method but is often passed off as one, so it is included for completeness. It is the single most common incident investigation method and is used by nearly every business and government entity. It typically uses predefined forms that include problem definition, a description of the event, who made a mistake, and what is going to be done to prevent recurrence. There is often a short list of root causes to choose from so a Pareto chart can be created to show where most problems originate.
Also known as the fill-out-a-form method, storytelling should never be used to find effective solutions. The primary difficulty with this approach is that you are relying completely on the experience and judgment of the report authors in assuring that the recommended solutions connect to the causes of the problems. Because they do not know, let alone follow, the principles of causation, the authors often fail to find effective solutions.
The primary purpose of this method is to document the investigation findings and corrective actions. These forms usually do a good job of capturing the what, when, and where of the event, but little or no analysis occurs. Consequently, the corrective actions fail to prevent recurrence most of the time.
With such poor results, you might be wondering why organizations continue to use this method. The answer is twofold. First, most organizations do not measure the effectiveness of their corrective actions, so they don’t know they are ineffective. Second, there is a false belief that everyone is a good problem solver, and all they need to do is document it on a form. For those organizations that recognize they are having repeat events, a more detailed form is often created that forces the users to follow a specified line of questions with the belief that an effective solution will emerge.
This is a false promise because the human thinking process cannot be reduced to a form. In our attempt to standardize the thinking process, we restrict our thinking to a predefined set of causes and solutions. The form tacitly signals the user to turn off their mind, fill in the blanks, and check the boxes. Because effective problem solving has been short-circuited, the reports are incomplete and the problems keep occurring.
Fault Tree Analysis (FTA) is a quantitative causal diagram used to identify possible failures in a system. It is a common engineering tool used in the design stages of a project and works well to identify possible causal relationships.
It requires the use of specific data regarding known failure rates of components. Causal relationships can be identified with "and" and "or" relationships or various combinations thereof.
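As a rough illustration of how "and" and "or" relationships combine component failure rates, the sketch below computes the probability of a top event. It assumes the input failures are statistically independent, which is a simplifying assumption made for this example, and the cooling system and its probabilities are hypothetical.

```python
from math import prod


def p_and(*probs):
    """AND gate: the output fails only if every input fails (independence assumed)."""
    return prod(probs)


def p_or(*probs):
    """OR gate: the output fails if at least one input fails (independence assumed)."""
    return 1 - prod(1 - p for p in probs)


# Hypothetical failure probabilities for two redundant pumps and their power supply.
pump_a, pump_b, power = 0.1, 0.1, 0.01

# Cooling is lost if both pumps fail, or if power fails.
loss_of_cooling = p_or(p_and(pump_a, pump_b), power)
print(round(loss_of_cooling, 4))  # 0.0199
```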
It is not normally used as a root cause analysis method, primarily because it does not work well when human actions are inserted as a cause. This is because the wide variance of possible human failure rates prevents accurate results. But it works extremely well at defining engineered systems and can be used to supplement an RCA in the following ways:
1. Finding causes by reviewing the assumptions and design decisions made during the system’s original design.
2. Determining if certain causal scenarios are probable, and
3. Selecting the appropriate solution(s) based on probability.
Failure modes and effects analysis (FMEA) is similar to fault tree analysis in that it is primarily used in the design of engineered systems rather than root cause analysis.
As the name implies, it identifies a component, subjectively lists all the possible failures (modes) that could happen, and then makes an assessment of the consequences (effect) of each failure.
Sometimes a relative score is given to how critical the failure mode is to the operability of the system or component.
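One widely used form of such a score is the risk priority number, the product of severity, occurrence, and detection ratings. The text above does not say which scoring scheme is meant, so the sketch below is an assumption; the 1 to 10 scales, the pump failure modes, and their ratings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    component: str
    mode: str
    effect: str
    severity: int    # 1 (minor) to 10 (severe), assumed scale
    occurrence: int  # 1 (rare) to 10 (frequent), assumed scale
    detection: int   # 1 (easily detected) to 10 (hard to detect), assumed scale

    @property
    def rpn(self):
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes for a single component.
modes = [
    FailureMode("pump", "seal leak", "loss of pressure", 7, 4, 3),
    FailureMode("pump", "bearing seizure", "pump stops", 9, 2, 5),
]
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
print([(m.mode, m.rpn) for m in ranked])
# [('bearing seizure', 90), ('seal leak', 84)]
```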
FMEA is sometimes used to find the cause of a component failure. Like many of the other tools discussed herein, it can be used to help you find a causal element within a Realitychart. However, it does not work well on systems or complex problems because it cannot show evidence-based causal relationships beyond the specific failure mode being analyzed.