Managing Events Systematically

Being good at problem-solving is critical to success in an organization. I’ve written quite a bit on problem-solving, but here I want to tackle the amount of effort we should apply.

Not all problems should be treated the same. There are also levels of problems. And these two aspects can contribute to some poor problem-solving practices.

It helps to look at problems systematically across our organization. The iceberg analogy is a pretty popular way to break this down, focusing on Events, Patterns, Underlying Structure, and Mental Model.

Iceberg analogy

Events

Events start with the observation or discovery of a situation that is different in some way. What is being observed is a symptom and we want to quickly identify the problem and then determine the effort needed to address it.

This is where Art Smalley’s Four Types of Problems comes in handy to help us take a risk-based approach to determining our level of effort.

Type 1 problems, troubleshooting, allow us to resolve problems where we have a clear understanding of the issue and a clear pathway. Have a flat tire? Fix it. Have a document error? Fix it using good documentation practices.

It is valuable to work through common troubleshooting scenarios and ensure the appropriate linkages between the different processes, so that there is a system-wide approach to problem-solving.

Corrective maintenance is a great example of troubleshooting, as it involves restoring the original state of an asset. It includes documentation, a return to service, and analysis of data. From that analysis of data, problems are identified that require going deeper into problem-solving. It should have appropriate tie-ins to evaluate when the impact of an asset breaking leads to other problems (for example, impact to product), which can also require additional problem-solving.

It can be helpful for the organization to build decision trees that help folks decide if a given problem stays as troubleshooting or if it also requires going to type 2, “gap from standard.”

Type 2 problems, gap from standard, mean that the actual result does not meet the expected result and there is a potential of not meeting the core requirements (objectives) of the process, product, or service. This is the place we start deeper problem-solving, including root cause analysis.

Please note that troubleshooting is often done within a type 2 problem; we call that a correction. If the bioreactor cannot maintain temperature during a run, that is a type 2 problem, but I am certainly going to immediately apply troubleshooting as well.
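
As a rough sketch of what such a decision tree could look like, here is a small Python example. The three yes/no questions, the `Event` class, and the `triage` function are all hypothetical and only illustrate the idea; a real decision tree would be built from the organization's own procedures.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical answers to three triage questions for an observed event."""
    known_fix_exists: bool         # a clear, proceduralized fix is available
    result_meets_standard: bool    # the actual result still meets the expected one
    may_affect_requirements: bool  # potential impact on core requirements


def triage(event: Event) -> str:
    """Suggest a problem type using an illustrative (assumed) decision tree."""
    gap = (not event.result_meets_standard) or event.may_affect_requirements
    if gap and event.known_fix_exists:
        # Apply the immediate correction, then continue with deeper problem-solving.
        return "type 2 (gap from standard) with correction"
    if gap:
        return "type 2 (gap from standard)"
    if event.known_fix_exists:
        return "type 1 (troubleshooting)"
    # Assumption: no known fix and no identified gap means someone should take a look.
    return "needs review"


# A simple document error with a known fix stays in troubleshooting.
print(triage(Event(True, True, False)))
# A bioreactor that cannot hold temperature gets a correction and a deeper look.
print(triage(Event(True, False, True)))
```

The point of writing it down this way is that the questions become explicit and can be challenged, not that these particular questions are the right ones.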

Take documentation errors. There is a practice in place, part of good documentation practices, for addressing troubleshooting around documents (how to correct, how to record a comment, etc.). By working through the various ways documentation can go wrong, and identifying which ones are solved through troubleshooting and don’t involve type 2 problems, we can avoid creating a lot of noise in our system.

Patterns

Core to the quality system is trending, looking for possible signals that require additional effort. Trending can help determine where problems lie and can also drive up the level of effort necessary.
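
A very simple form of trending can be sketched in a few lines of Python. This is only an illustration: the `(period, category)` record layout, the `trend_signals` name, and the threshold of three are assumptions, and a real trending program would use whatever statistical rules the quality system defines.

```python
from collections import Counter


def trend_signals(events: list[tuple[str, str]], threshold: int = 3) -> list[str]:
    """Return categories seen at least `threshold` times within any one period.

    `events` is a list of (period, category) pairs. Both the layout and the
    threshold are assumptions made for this sketch.
    """
    counts = Counter(events)  # counts each (period, category) pair
    flagged = {category for (_period, category), n in counts.items() if n >= threshold}
    return sorted(flagged)


# Hypothetical events, for illustration only.
events = [
    ("2024-01", "documentation error"),
    ("2024-01", "documentation error"),
    ("2024-01", "documentation error"),
    ("2024-01", "equipment failure"),
    ("2024-02", "equipment failure"),
]
print(trend_signals(events))
```

Each flagged category is a signal that the individual troubleshooting fixes may not be enough and that more effort is warranted.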

Underlying Structure

Root Cause Analysis is about finding the underlying structure of the problem that defines the work applied to a type 2 problem.

Not all problems require the same amount of effort, and type 2 problems really have a scale based on consequences that can help drive the level of effort. This should be based on the impact to the organization’s ability to meet the quality objectives, the requirements behind the product or service.

For example, in the pharma world there are three major criteria:

  • safety, rights, or well-being of patients (including subjects and participants, human and non-human)
  • data integrity (includes confidence in the results, outcome, or decision dependent on the data)
  • ability to meet regulatory requirements (which stem from but can be a lot broader than the first two)

These three criteria can be sliced and diced a lot of ways, but serve our example well.

To these three criteria we add a scale of possible harm to derive our criticality. An example can look like this:

| Classification | Description |
| --- | --- |
| Critical | The event has resulted in, or is clearly likely to result in, any one of the following outcomes: significant harm to the safety, rights, or well-being of subjects or participants (human or non-human), or patients; compromised data integrity to the extent that confidence in the results, outcome, or decision dependent on the data is significantly impacted; or regulatory action against the company. |
| Major | The event(s), were they to persist over time or become more serious, could potentially, though not imminently, result in any one of the following outcomes: harm to the safety, rights, or well-being of subjects or participants (human or non-human), or patients; compromised data integrity to the extent that confidence in the results, outcome, or decision dependent on the data is significantly impacted. |
| Minor | An isolated or recurring triggering event that does not otherwise meet the definitions of Critical or Major quality impacts. |

Example of Classification of Events in a Pharmaceutical Quality System
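
To make the logic of the example classification concrete, here is a minimal Python sketch. The `Impact` levels and the `classify` function are hypothetical names of my own, and the rules simply mirror the example table as written (including the fact that only Critical mentions regulatory action); they are not a standard.

```python
from enum import Enum


class Impact(Enum):
    """Assumed impact levels for each of the three criteria."""
    NONE = 0
    POTENTIAL = 1         # could result if the event persists or becomes more serious
    ACTUAL_OR_LIKELY = 2  # has resulted in, or is clearly likely to result in


def classify(patient: Impact, data_integrity: Impact, regulatory: Impact) -> str:
    """Map impact on the three criteria to Critical, Major, or Minor."""
    if Impact.ACTUAL_OR_LIKELY in (patient, data_integrity, regulatory):
        return "Critical"
    # The example's Major definition covers only patient and data integrity impact.
    if Impact.POTENTIAL in (patient, data_integrity):
        return "Major"
    return "Minor"


print(classify(Impact.NONE, Impact.POTENTIAL, Impact.NONE))
```

Writing the rules out like this tends to surface the gray areas quickly, for example what to do with a purely potential regulatory impact.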

This level of classification will drive the level of effort on the investigation, and will also determine whether the CAPA addresses underlying structures alone or goes on to address the mental models, and thus drives culture change.

Mental Model

Here is where we address building a quality culture. In CAPA lingo this is usually more a preventive action than a corrective action. In the simplest of terms, corrective actions address the underlying structures of the problem in the process/asset where the event happened. Preventive actions deal with underlying structures in other (usually related) processes/assets or get to the mindsets that allowed the underlying structures to exist in the first place.

Solving Problems Systematically

By applying this system perspective to our problem solving, by realizing that not everything needs a complete rebuild of the foundation, by looking holistically across our systems, we can ensure that we are driving a level of effort to truly build the house of quality.

Treating All Investigations the Same

Stephanie Gaulding, a colleague in the ASQ, recently wrote an excellent post for Redica on “How to Avoid Three Common Deviation Investigation Pitfalls,” a subject near and dear to my heart.

The three pitfalls Stephanie gives are:

  1. Not getting to root cause
  2. Inadequate scoping
  3. Treating investigations the same

All three are right on the nose, and I’ve posted a bunch on the topics. Definitely go and read the post.

What I want to delve deeper into is Stephanie’s point that “Deviation systems should also be built to triage events into risk-based categories with sufficient time allocated to each category to drive risk-based investigations and focus the most time and effort on the highest risk and most complex events.”

That is an accurate breakdown, and exactly what regulators are asking for. However, I think the implementation of risk-based categories can sometimes lead to confusion, and we can spend some time unpacking the concept.

Risk is the possible effect of uncertainty. Risk is often described in terms of risk sources, potential events, their consequences, and their likelihoods (where we get likelihood × severity from).

But there are a lot of types of uncertainty. IEC 31010, “Risk management – Risk assessment techniques,” lists the following examples:

  • uncertainty as to the truth of assumptions, including presumptions about how people or systems might behave
  • variability in the parameters on which a decision is to be based
  • uncertainty in the validity or accuracy of models which have been established to make predictions about the future
  • events (including changes in circumstances or conditions) whose occurrence, character or consequences are uncertain
  • uncertainty associated with disruptive events
  • the uncertain outcomes of systemic issues, such as shortages of competent staff, that can have wide ranging impacts which cannot be clearly defined
  • lack of knowledge which arises when uncertainty is recognized but not fully understood
  • unpredictability
  • uncertainty arising from the limitations of the human mind, for example in understanding complex data, predicting situations with long-term consequences or making bias-free judgments.

Most of these are only, at best, obliquely relevant to risk categorizing deviations.

So it is important to first build the risk categories on consequences. At the end of the day these are the consequences that matter in the pharmaceutical/medical device world:

  • harm to the safety, rights, or well-being of patients, subjects or participants (human or non-human)
  • compromised data integrity so that confidence in the results, outcome, or decision dependent on the data is impacted

These are some pretty hefty areas, and really hard for the average user to get their minds around. This is why building good requirements and understanding how systems work are so critical. Building breadcrumbs into our procedures to let folks know which deviations fall in which category is a good practice.

There is nothing wrong with recognizing that different areas have different decision trees. Harm to safety in GMP can mean different things than safety in a GLP study.

The second place I’ve seen this go wrong has to do with likelihood, and folks getting symptom confused with problem confused with cause.

All deviations are situations that differ in some way from expected results. Deviations start with the symptom and, through analysis, end up with a root cause. So when building your decision tree, ensure it looks at symptoms and how the symptom is observed. That is surprisingly hard to do, which is why a lot of deviation criticality scales tend to focus only on severity.

4 major types of symptoms

Problem Statement Framing

A well-framed problem statement opens possibilities, while a bad problem statement closes down alternatives and quickly sends you down dead ends of facile thinking.

Consider a few typical problem statements you might hear during a management review:

  1. We have too many deviations
  2. We do not have enough people to process the deviations we get
  3. 45% of deviations are recurring

You hear this sort of framing regularly. Notice that only the third is a problem; the other two are solutions. In the case of the first statement, it can lead to some negative results. The second just has you throw more resources at the problem, which may or may not be a good thing. In both cases we are biasing the problem-solving process just as we begin.

The third problem statement pushes us to think. A measurable fact raises other questions that will help us develop better solutions: Why are our deviations recurring? Why are we not solving issues when they first occur? What processes/areas are they recurring in? Are we putting the right amount of effort on important deviations? How can we eliminate these deviations?

If a problem statement has only one solution, reframe it to avoid jumping to conclusions.

By focusing on a problem statement with objective facts (45% of deviations are recurring) we can ask deeper, thoughtful questions which will lead to wisdom, and to better solutions.
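
An objective figure like the one above has to come from the data. As a minimal sketch in Python, assuming each deviation record carries a `process` and a `defect` field and that a repeated (process, defect) pair counts as "recurring" (both are assumptions for illustration, not a standard definition):

```python
from collections import Counter


def recurrence_rate(deviations):
    """Fraction of deviations whose (process, defect) pair appears more than once.

    Treating a repeated (process, defect) pair as "recurring" is an assumption
    made for this sketch; a real system would define recurrence in procedure.
    """
    if not deviations:
        return 0.0
    keys = [(d["process"], d["defect"]) for d in deviations]
    counts = Counter(keys)
    recurring = sum(1 for key in keys if counts[key] > 1)
    return recurring / len(keys)


def recurring_by_process(deviations):
    """Count recurring deviations per process, to show where they cluster."""
    keys = [(d["process"], d["defect"]) for d in deviations]
    counts = Counter(keys)
    return Counter(process for process, defect in keys if counts[(process, defect)] > 1)


# Hypothetical records, for illustration only.
sample = [
    {"process": "filling", "defect": "missed signature"},
    {"process": "filling", "defect": "missed signature"},
    {"process": "packaging", "defect": "label mix-up"},
    {"process": "cleaning", "defect": "expired solution"},
]
print(f"{recurrence_rate(sample):.0%} of deviations are recurring")
print(recurring_by_process(sample))
```

The second function speaks to one of the follow-up questions: once we know how much is recurring, we want to know where.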

To build a good problem statement:

  1. Begin with observable facts, not opinions, judgments, or interpretations.
  2. Describe what is happening by answering questions like “How much/How many/How long/How often.” This creates room for exploration and discovery.
  3. Iterate on the problem statement. As you think more deeply about the situation, modify your first version. This is a sign that you understand more about the situation. This is the kind of data that will join with the facts you discover to lead towards sound decisions.

The 5W2H tool is always a good place to start.

| 5W2H | Typical questions | Contains |
| --- | --- | --- |
| Who? | Who are the people directly concerned with the problem? Who does this? Who should be involved but wasn’t? Was someone involved who shouldn’t be? | Roles and departments |
| What? | What happened? | Action, steps, description |
| When? | When did the problem occur? | Times, dates, place in process |
| Where? | Where did the problem occur? | Location |
| Why is it important? | Why did we do this? What are the requirements? What is the expected condition? | Justification, reason |
| How? | How did we discover it? Where in the process was it? | Method, process, procedure |
| How Many? How Much? | How many things are involved? How often did the situation happen? How much did it impact? | Number, frequency |

Remember this can be iterative as you discover more information and the problem statement at the end might not necessarily be the problem statement at the beginning.

| Elements | Problem Statement |
| --- | --- |
| Is used to… | Understand and target a problem; provide a scope; evaluate any risks; make objective decisions |
| Answers the following… (5W2H) | What? (problem that occurred); When? (timing of what occurred); Where? (location of what occurred); Who? (persons involved/observers); Why? (why it matters, not why it occurred); How Much/Many? (volume or count); How Often? (first/only occurrence or multiple) |
| Contains… | Object (What was affected?); Defect (What went wrong?) |
| Provides direction for… | Escalation(s); Investigation |
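
One way to make the 5W2H habit stick is to treat the problem statement as a small structured record and check which elements are still unanswered. The sketch below is only an illustration in Python; the `ProblemStatement` class and its field names are hypothetical, loosely following the 5W2H elements.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProblemStatement:
    """Hypothetical 5W2H record; each field holds a short factual answer."""
    what: Optional[str] = None
    when: Optional[str] = None
    where: Optional[str] = None
    who: Optional[str] = None
    why_it_matters: Optional[str] = None  # why it matters, not why it occurred
    how_discovered: Optional[str] = None
    how_many: Optional[str] = None

    def missing(self) -> list[str]:
        """Names of the elements that have not been answered yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# First iteration: only the observable facts we have so far.
draft = ProblemStatement(
    what="Deviations are recurring",
    how_many="45% of deviations",
)
print(draft.missing())
```

Because the statement is iterative, the list of missing elements doubles as a to-do list for the next round of fact-finding.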

Seven elements of good problem-solving

Logic

Perhaps more than anything else, we want our people to be able to think and then act rationally in decision making and problem-solving. The basic structure and technique embodied in problem solving is a combination of discipline when executing PDCA mixed with a heavy dose of the scientific method of investigation.

Logical thinking is tremendously powerful because it creates consistent, socially constructed approaches to problems, so that members within the organization spend less time spinning their wheels or trying to figure out how another person is approaching a given situation. This is an important dynamic necessary for quality culture.

The right processes and tools reinforce this as the underlying thinking pattern, helping to promote and reinforce logical thought processes that are thorough and address all important details, consider numerous potential avenues, take into account the effects of implementation, anticipate possible stumbling blocks, and incorporate contingencies. The processes apply to issues of goal setting, policymaking, and daily decision making just as much as they do to problem-solving.

Objectivity

Because human observation is inherently subjective, every person sees the world a little bit differently. The mental representations of the reality people experience can be quite different, and each tends to believe their representation is the “right” one. Individuals within an organization usually have enough common understanding that they can communicate and work together to get things done. But quite often, when they get into the details of the situation, the common understanding starts to break down, and the differences in how we see reality become apparent.

Problem-solving involves reconciling those multiple viewpoints – a view of the situation that includes multiple perspectives tends to be more objective than any single viewpoint. We start with one picture of the situation and make it explicit so that we can better share it with others and test it. Collecting quantitative (that is, objective) facts and discussing this picture with others is a key way of verifying that the picture is accurate. If it is not, appropriate adjustments are made until it is an accurate representation of a co-constructed reality. In other words, it is a co-constructed representation of a co-constructed reality.

Objectivity is a central component of the problem-solving mindset. Effective problem-solvers continually test their understanding of a situation for assumptions, biases, and misconceptions. The process begins by framing the problem with relevant facts and details, as objectively as possible. Furthermore, suggested remedies or recommended courses of action should promote the organizational good, not (even if subconsciously) personal agendas.

Results and Process

Results are not favored over the process used to achieve them, nor is process elevated above results. Both are necessary and critical to an effective organization.

Synthesis, Distillation and Visualization

We want to drive synthesis of the learning acquired in the course of understanding a problem or opportunity and discussing it with others. Through this, multiple pieces of information from different sources are integrated into a coherent picture of the situation and recommended future action.

Visual thinking plays a vital role in conveying information and the act of creating the visualization aids the synthesis and distillation process.

Alignment

Effective implementation of a change often hinges on obtaining prior consensus among the parties involved. With consensus, everyone pulls together to overcome obstacles and make the change happen. The problem-solving team communicates horizontally with other groups in the organization that may be affected by the proposed change and incorporates their concerns into the solution. The team also communicates vertically with individuals who are on the front lines to see how they may be affected, and with managers up the hierarchy to determine whether any broader issues have not been addressed. Finally, it is important that the history of the situation be taken into account, including past remedies, and that recommendations for action consider possible exigencies that may occur in the future. Taking all these into consideration will result in mutually agreeable, innovative solutions.

Coherency and Consistency

Problem-solving efforts are sometimes ineffective simply because the problem-solvers do not maintain coherency. They tackle problems that are not important to the organization’s goals, propose solutions that do not address the root causes, or even outline implementation plans that leave out key pieces of the proposed solution. So coherency within the problem-solving approach is paramount to effective problem resolution.

Consistent approaches to problem-solving speed up communication and aid in establishing shared understanding. Organizational members understand the implicit logic of the approach, so they can anticipate and offer information that will be helpful to the problem-solvers as they move through the process.

Systems Thinking

Good system thinking means good problem-solving.

Practice Paying Attention for Good Problem Solving

Situational awareness is built on perception. Problem-solving requires it. Perception is a building block of agile thinking and pretty much everything else we need to do to succeed in today’s idea-based businesses.

As individuals we should be striving to develop perception, and as organizations we need to be developing training and practices to reinforce it. There are a few aspects we need to build.

Look inward to analyze previous mistakes

How often have you or some expert said “No one could have predicted that” or “It wasn’t my job to see the warning signs.” Rarely do you hear them acknowledge their own responsibility with comments such as “I didn’t think about how that change could affect our organization” or “I didn’t ask for more information.”

When a problem arises, consider the decisions you’ve made and the role you and your team played. Did you miss warning signs? Is there an incentive to overlook what was going on? What are your weak spots and how can you fix them to prevent future problems?

Take an outsider’s view

If you have ever encountered the “things aren’t done that way” response to new solutions, push harder. There is usually no logical reason why a change can’t be made, and there is a bad habit that needs to be broken.

Look for signs, symptoms and syndromes

  1. Signs – something is not right or expected
  2. Symptoms – some signs are symptoms, but usually signs point to symptoms, an underlying problem or set of problems
  3. Syndrome – false beliefs that can generate symptoms, usually part of a wider set of causes

Avoid Willful Blindness