It’s not. This guidance is just one big “calm down people” letter from the agency. They publish these sorts of guidance every now and then because we as an industry can sometimes learn the wrong lessons.
So read the guidance, but don’t panic. You are either following it already or you just need to spend some time getting better at risk assessments and creating some matrix approaches.
The Preliminary Hazard Analysis (PHA) is a risk tool that is used during initial design and development, thus the name “preliminary”, to identify systematic hazards that affect the intended function of the design to provide an opportunity to modify requirements that will help avoid issues in the design.
Like a fair amount of tools used in risk, the PHA was created by the US Army. ANSI/ASSP Z.590.3 “Prevention through Design, Guidelines for Addressing Occupational Hazards and Risks in Design and Redesign Processes” makes this one of the eight risk assessment tools everyone should know.
Taking the time to perform a PHA early on in the design will speed up the design process and avoid costly mistakes. Any identified hazards that cannot be avoided or eliminated are then controlled so that the risk is reduced to an acceptable level.
PHAs can also be used to examine existing systems, prioritize risk levels and select those systems requiring further study. The use of a single PHA may also be appropriate for simple, less compelx systems.
Like a Structured What-If, the Preliminary Hazard Analysis benefits from an established list of general categories:
by the source of risk: raw materials, environmental, equipment, usability and human factors, safety hazards, etc.
by consequence, aspects or dimensions of objectives or performance
Based on the established list, a preliminary hazard list is identified which lists the potential, significant hazards associated with a design. The purpose of the preliminary hazard list is to initially identify the most evident or worst-credible hazards that could occur in the system being designed. Such hazards may be inherent to the design or created by the interaction with other systems/environment/etc.
A team should be involved in collecting and reviewing.
B. Sequence of Events
Once the hazards are identified, the sequence of events that leads from each hazard to various hazardous situations is identified.
C. Hazardous Situation
For each sequence of events, we identify one or more hazardous situations.
D. Impact
For each hazardous situation, we identify one or more outcomes (or harms).
E. Severity and occurrence of the impact
Based on the identified outcomes/harms the severity is determined. An occurrence or probability is determined for each sequence of events that leads from the hazard to the hazardous situation to the outcome.
Based on severity and likelihood of occurrence a risk level is determined.
From hazard to a variety of harms
I tend to favor a 5×5 matrix for a PHA, though some use 3×3, and I’ve even seen 4×5.
Intended
outcomes
Likelihood of
Occurrence
Severity
Rating
Impact to failure
scale
1
Very unlikely
2
Likely
3
Possible
4
Likely
5
Very Likely
5
Complete
failure
5
10
15
20
25
4
Maximum tolerable
failure
4
8
12
16
20
3
Maximum
anticipated failure
3
6
9
12
15
2
Minimum
anticipated failure
2
4
6
8
10
1
Negligible
1
2
3
4
5
Very high risk: 15 or greater, High risk 9-14, Medium
risk 5-8, Low risk 1-4
F. Risk Control Measures
Based on the risk level risk controls and developed and applied. These risk controls will help the design team create new requirements that will drive the design.
On-going risks should be evaluated for the risk register.
The pictorial tree is a favorite for representing branching paths. Some common ones include Fault Tree Analysis (FTA); Cause Trees to analyze used retrospectively to analyze events that have already occurred; Question Trees to aid in problem-solving; and, even Success Trees to figure out why something went right.
Apple Tree illustration with long branching roots
Inductive or Deductive
Inductive Reasoning: Induction is reasoning from individual cases to a general conclusion. We start from a particular initiating condition and attempt to ascertain the effect of that fault or condition on a system. Deductive Reasoning: Deduction is reasoning from the general to the specific. We start with the way the system has failed and we attempt to find out what modes of system behavior contribute to this failure.
The beauty of a pictorial representation is that depending on which way you go on the tree pictorially represents the form of reasoning that is used.
Inductive reasoning is the branches, and tools like a Cause Tree, are used to determine what system states (usually failed states) are possible. The inductive techniques provide answers to the generic question, “What happens if–?” The process consists of assuming a particular state of existence of a component or components and analyzing to determine the effect of that condition on the system.
Deductive reasoning is the roots, and tools like Fault Tree Analysis, take some specific system state, which is generally a failure state, and chains of more basic faults contributing to this undesired events are built up in a systematic way to determine how a given failure can occur.
Success/Failure Space
We operate in a success/failure space. We are constantly identifying ways a thing can fail or the various ways of success.
Success/Failure Space
These are really just two sides of the coin in many ways, with identifiable points in success space coinciding with analogous points in failure space. “Maximum anticipated success” in success space coincides with “minimum anticipated failure” in failure space.
Like everything, how we frame the question helps us find answers. Certain questions require us to think in terms of failure space, others in success. There are advantages in both, but in risk management, the failure space is incredibly valuable.
Fault Tree Analysis
Fault Tree Analysis (FTA) is a tool for identifying and analyzing factors that contribute to an undesired event (called the “top event”). The top event is analyzed by first identifying its immediate and necessary causes. The logical relationship between these causes is represented by several gates such as AND and OR gates. Each cause is then analyzed step-wise in the same way until further analysis becomes unproductive. The result is a graphical representation of a Boolean equation in a tree diagram.
The Undesired Event
Fault tree analysis is a deductive failure analysis that focuses on one particular undesired event to determine the causes of this event. The undesired event constitutes the top event in a fault tree diagram and generally consists of a complete failure.
If the top event is too general, the analysis becomes unmanageable; if it is too specific, the analysis does not provide a sufficiently broad view.
Top events are usually a failure of a critical requirement or process step.
A fault tree is not a model of all possible failures or all possible causes for failure. A fault tree is tailored to its top event which corresponds to some particular failure mode, and the fault tree includes only those faults that contribute to this top event. These faults are not exhaustive – they cover only the most credible faults as assessed by the risk team.
The Symbology of a Fault Tree
Events
Base Event
The circle describes a basic initiating fault event that requires no further development. The circle signifies that the appropriate limit of resolution has been reached.
Undeveloped Event
The diamond describes a specific fault event that is not further developed, either because the event is of insufficient consequence or because information relevant to the event is unavailable.
Conditioning Event
The ellipse is used to record any conditions or restrictions that apply to any logic gate. It is used primarily with the INHIBIT and PRIORITY AND-gates.
External Event
The house is used to signify an event that is normally expected to occur: e.g., a phase change. The house symbol displays events that are not, of themselves, faults.
External does not mean external to the organization.
Intermediate Event
An intermediate event is a fault event that occurs because of one or more antecedent causes acting through logic gates. All intermediate events are symbolized by rectangles.
Event Symbols used in a Fault Tree Analysis
Gates
There are two basic types of fault tree gates: the OR-gate and the AND-gate. All other gates are really special cases of these two basic types.
OR-gate
The OR-gate is used to show that the output event occurs only if one or more of the input events occur. There may be any number of input events to an OR-gate.
AND-gate
The AND-gate is used to show that the output fault occurs only if all the input faults occur. There may be any number of input faults to an AND-gate.
INHIBIT-gate
The INHIBIT-gate, represented by the hexagon, is a special case of the AND-gate. The output is caused by a single input, but some qualifying condition must be satisfied before the input can produce the output. The condition that must exist is the conditional input. A description of this conditional input is spelled out within an ellipse drawn to the right of the gate.
EXCLUSIVE OR-gate
The EXCLUSIVE OR-gate is a special case of the OR-gate in which the output event occurs only if exactly one of the input events occur
PRIORITY AND-gate
The PRIORITY AND-gate is a special case of the AND-gate in which the output event occurs only if all input events occur in a specified ordered sequence. The sequence is usually shown inside an ellipse drawn to the right of the gate. In practice, the necessity of having a specific sequence is not usually encountered.
Gate Symbols used in a Fault Tree Analysis
Procedure
Identify the system or process that will be examined, including boundaries that will limit the analysis. FTA often stems from a previous risk assessment, such as a FMEA or Structured What-If; or, it comes from a root cause analysis.
Identify the Top Event, the type of failure that will be analyzed as narrowly and specifically as possible.
Identify the events that may be immediate cause sof the top event. Write these events at the level below te event they cause.
For each event ask “Is this a basic failure? Or can it be analyzed for its immediate causes?”
If the event is a basic failure, draw a circle around it.
If it can be analyzed for its own causes draw a rectangle around it (NOTE: if appropriate, other event types are possible)
Ask “How are these events related to the one they cause?” Use the gate symbols to show the relationships. The lower-level events are the input events. They one they cause, above the gate is the output event.
For each event that is not basic, repeat steps 4 and 5. Continue until all branches of the tree end in a basic or undeveloped event.
To determine the mathematical probability of failure, assign probabilities to each of the basic events. Use Boolean algebra to calculate the probability of each high-level event and the top event. Discussions of the math is a very different post.
Analyze the tree to understand the relations between the causes and to find ways to prevent failures. Use the gate relationships to find the most efficient ways to reduce risk. Focus attention on the causes most likely to happen.
FTA example using lack of team (basic)
The Question Tree
A critical task in problem-solving is determining what kinds of analysis and corresponding data would best solve the problem. Rather than a shortage of techniques, there are too many to choose from, we can often reflexively use the same few, basic tools out of familiarity and habit. This can mislead when the situation is complex, non-routine, and/or unfamiliar.
This is where a Question Tree comes in handy to determine what analyses and data are suited for a particular problem-solving situation. This tool is also known as a logic tree or a decision tree. Question Trees are structures for seeing the elements of a problem clearly, and keeping track of different levels of the problem, which we can liken to trunks, branches, twigs, and leaves. You can arrange them from left to right, right to left, or top to bottom— whatever makes the elements easier for you to visualize. Think of a Question Tree as a mental model of your problem. Better trees have a clearer and more complete logic of relationships linking the parts to each other, are more comprehensive and have no overlap.
The Question Tree is very powerful when working through broad and complex problems that no single analysis or framework can solve. By developing a set of questions that are connected to one another in the form of a tree we can determine what data analysis is needed, which can help us break out of the habit of using the same analysis tool even when it is a bad one for the job.
The core question is the starting point. It is made easier to solve by decomposing it into a few, more specific sub-questions. The logic of decomposition is such that the answers to these sub-questions should together fully answer the question they emerge from. The first level of sub-questions may still be too broad to solve with specific analyses and data, so each is decomposed further.
The process of decomposition continues until a sub-question is reached that can be answered using a particular technique or framework, and the data needed is specific enough to be identified. A Question Tree is thus constructed, and the final set of questions indicates the analyses and data needed. As much as possible, the questions on the tree are also framed such that they have “yes” or “no” as potential answers.
They, too, are hypotheses to be settled with data, analysis, and evidence. They can also be used to test assumptions and beliefs, evaluate expectations, explore puzzles and oddities, and generate solution options.
In decomposing a question, ask whether the sub-questions are mutually exclusive and collectively exhaustive. This can help generate the sub-question not asked, and thus, reduce errors of omission. Building a Question Tree is an iterative and nonlinear process. If later information so dictates, previously done work on the tree should be adjusted.
Judgment plays a role in building a Question Tree, so it is unlikely that two people working independently on a complex starting question will create identical trees, but these are bound to overlap. Expertise matters and the strength of teams should be leveraged.
Success and Cause Trees
These are variations of the fault tree analysis: a Success Tree where the top event is desired and a Cause Tree used to investigate a past event as part of root cause analysis.