When Water Systems Fail: Unpacking the LeMaitre Vascular Warning Letter

The FDA’s August 11, 2025 warning letter to LeMaitre Vascular reads like a masterclass in how fundamental water system deficiencies can cascade into comprehensive quality system failures. This warning letter offers lessons about the interconnected nature of pharmaceutical water systems and the regulatory expectations that surround them.

The Foundation Cracks

What makes this warning letter particularly instructive is how it demonstrates that water systems aren’t just utilities—they’re critical manufacturing infrastructure whose failures ripple through every aspect of product quality. LeMaitre’s North Brunswick facility, which manufactures Artegraft Collagen Vascular Grafts, found itself facing six major violations, with water system inadequacies serving as the primary catalyst.

The Artegraft device itself—a bovine carotid artery graft processed through enzymatic digestion and preserved in USP purified water and ethyl alcohol—places unique demands on water system reliability. When that foundation fails, everything built upon it becomes suspect.

Water Sampling: The Devil in the Details

The first violation strikes at something discussed extensively in previous posts: representative sampling. LeMaitre’s USP water sampling procedures contained what the FDA termed “inconsistent and conflicting requirements” that fundamentally compromised the representativeness of their sampling.

Consider the regulatory expectation here. As outlined in the ISPE guideline, “sampling a POU must include any pathway that the water travels to reach the process”. Yet LeMaitre was taking samples through methods that included purging, flushing, and disinfection steps that bore no resemblance to actual production use. This isn’t just a procedural misstep; it’s a fundamental misunderstanding of what water sampling is meant to accomplish.

The FDA’s criticism centers on three critical sampling failures:

  • Sampling Location Discrepancies: Taking samples through different pathways than production water actually follows. This violates the basic principle that quality control sampling should “mimic the way the water is used for manufacturing”.
  • Pre-Sampling Conditioning: The procedures required extensive purging and cleaning before sampling—activities that would never occur during normal production use. This creates “aspirational data”—results that reflect what we wish our system looked like rather than how it actually performs.
  • Inconsistent Documentation: Failure to document required replacement activities during sampling, creating gaps in the very records meant to demonstrate control.

The Sterilant Switcheroo

Perhaps more concerning was LeMaitre’s unauthorized change of sterilant solutions for their USP water system sanitization. The company switched sterilants sometime in 2024 without documenting the change control, assessing biocompatibility impacts, or evaluating potential contaminant differences.

This represents a fundamental failure in change control—one of the most basic requirements in pharmaceutical manufacturing. Every change to a validated system requires formal assessment, particularly when that change could affect product safety. The fact that LeMaitre couldn’t provide documentation allowing for this change during inspection suggests a broader systemic issue with their change control processes.

Environmental Monitoring: Missing the Forest for the Trees

The second major violation addressed LeMaitre’s environmental monitoring program—specifically, their practice of cleaning surfaces before sampling. This mirrors issues we see repeatedly in pharmaceutical manufacturing, where the desire for “good” data overrides the need for representative data.

Environmental monitoring serves a specific purpose: to detect contamination that could reasonably be expected to occur during normal operations. When you clean surfaces before sampling, you’re essentially asking, “How clean can we make things when we try really hard?” rather than “How clean are things under normal operating conditions?”

The regulatory expectation is clear: environmental monitoring should reflect actual production conditions, including normal personnel traffic and operational activities. LeMaitre’s procedures required cleaning surfaces and minimizing personnel traffic around air samplers—creating an artificial environment that bore little resemblance to actual production conditions.

Sterilization Validation: Building on Shaky Ground

The third violation highlighted inadequate sterilization process validation for the Artegraft products. LeMaitre failed to consider bioburden of raw materials, their storage conditions, and environmental controls during manufacturing—all fundamental requirements for sterilization validation.

This connects directly back to the water system failures. When your water system monitoring doesn’t provide representative data, and your environmental monitoring doesn’t reflect actual conditions, how can you adequately assess the bioburden challenges your sterilization process must overcome?

The FDA noted that LeMaitre had six out-of-specification bioburden results between September 2024 and March 2025, yet took no action to evaluate whether testing frequency should be increased. This represents a fundamental misunderstanding of how bioburden data should inform sterilization validation and ongoing process control.
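
One way to make that evaluation routine is to encode a trigger rule in the trending procedure. The sketch below is only an illustration, not LeMaitre’s procedure or an FDA requirement: the function name, the 365-day look-back window, and the threshold of two results are all assumptions, and the dates are invented to fall inside the period the letter describes.

```python
from datetime import date, timedelta

def needs_frequency_review(oos_dates, as_of, window_days=365, threshold=2):
    """Return True when the count of out-of-specification (OOS) bioburden
    results inside the look-back window meets the hypothetical threshold."""
    window_start = as_of - timedelta(days=window_days)
    recent = [d for d in oos_dates if window_start <= d <= as_of]
    return len(recent) >= threshold

# Illustrative dates only: six OOS results between September 2024 and
# March 2025. The warning letter does not give the actual dates.
oos = [date(2024, 9, 15), date(2024, 10, 20), date(2024, 12, 5),
       date(2025, 1, 10), date(2025, 2, 14), date(2025, 3, 3)]

print(needs_frequency_review(oos, as_of=date(2025, 3, 31)))
```

The point of a rule like this is that the review is triggered by the data rather than by someone remembering to ask the question.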

CAPA: When Process Discipline Breaks Down

The final violations addressed LeMaitre’s Corrective and Preventive Action (CAPA) system, where multiple CAPAs exceeded their own established timeframes by significant margins. A high-risk CAPA took 81 days instead of the required timeframe, while medium and low-risk CAPAs exceeded deadlines by 120-216 days.
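
Tracking this kind of slippage is simple arithmetic. As a minimal sketch, assuming hypothetical target durations per risk level (the warning letter does not state LeMaitre’s actual limits) and invented open and close dates, overdue days could be computed like this:

```python
from datetime import date

# Hypothetical target durations in days; not taken from the warning letter.
TARGET_DAYS = {"high": 30, "medium": 60, "low": 90}

def days_overdue(risk, opened, closed_or_today):
    """Days by which a CAPA exceeds its target; 0 if it is within target."""
    elapsed = (closed_or_today - opened).days
    return max(0, elapsed - TARGET_DAYS[risk])

# Invented dates: a high-risk CAPA that stayed open for 81 days.
print(days_overdue("high", date(2025, 1, 1), date(2025, 3, 23)))
```

Running a check like this against every open CAPA on a fixed schedule surfaces aging records before they become inspection findings.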

This isn’t just about missing deadlines—it’s about the erosion of process discipline. When CAPA systems lose their urgency and rigor, it signals a broader cultural issue where quality requirements become suggestions rather than requirements.

The Recall That Wasn’t

Perhaps most concerning was LeMaitre’s failure to report a device recall to the FDA. The company distributed grafts manufactured using raw material from a non-approved supplier, with one graft implanted in a patient before the recall was initiated. This constituted a reportable removal under 21 CFR Part 806, yet LeMaitre failed to notify the FDA as required.

This represents the ultimate failure: when quality system breakdowns reach patients. The cascade from water system failures to inadequate environmental monitoring to poor change control ultimately resulted in a product safety issue that required patient intervention.

Gap Assessment Questions

For organizations conducting their own gap assessments based on this warning letter, consider these critical questions:

Water System Controls

  • Are your water sampling procedures representative of actual production use conditions?
  • Do you have documented change control for any modifications to water system sterilants or sanitization procedures?
  • Are all water system sampling activities properly documented, including any maintenance or replacement activities?
  • Have you assessed the impact of any sterilant changes on product biocompatibility?

Environmental Monitoring

  • Do your environmental monitoring procedures reflect normal production conditions?
  • Are surfaces cleaned before environmental sampling, and if so, is this representative of normal operations?
  • Does your environmental monitoring capture the impact of actual personnel traffic and operational activities?
  • Are your sampling frequencies and locations justified by risk assessment?

Sterilization and Bioburden Control

  • Does your sterilization validation consider bioburden from all raw materials and components?
  • Have you established appropriate bioburden testing frequencies based on historical data and risk assessment?
  • Do you have procedures for evaluating when bioburden testing frequency should be increased based on out-of-specification results?
  • Are bioburden results from raw materials and packaging components included in your sterilization validation?

CAPA System Integrity

  • Are CAPA timelines consistently met according to your established procedures?
  • Do you have documented rationales for any CAPA deadline extensions?
  • Is CAPA effectiveness verification consistently performed and documented?
  • Are supplier corrective actions properly tracked and their effectiveness verified?

Change Control and Documentation

  • Are all changes to validated systems properly documented and assessed?
  • Do you have procedures for notifying relevant departments when suppliers change materials or processes?
  • Are the impacts of changes on product quality and safety systematically evaluated?
  • Is there a formal process for assessing when changes require revalidation?

Regulatory Compliance

  • Are all required reports (corrections, removals, MDRs) submitted within regulatory timeframes?
  • Do you have systems in place to identify when product removals constitute reportable events?
  • Are all regulatory communications properly documented and tracked?

Learning from LeMaitre’s Missteps

This warning letter serves as a reminder that pharmaceutical manufacturing is a system of interconnected controls, where failures in fundamental areas like water systems can cascade through every aspect of operations. The path from water sampling deficiencies to patient safety issues is shorter than many organizations realize.

The most sobering aspect of this warning letter is how preventable these violations were. Representative sampling, proper change control, and timely CAPA completion aren’t cutting-edge regulatory science—they’re fundamental GMP requirements that have been established for decades.

For quality professionals, this warning letter reinforces the importance of treating utility systems with the same rigor we apply to manufacturing processes. Water isn’t just a raw material—it’s a critical quality attribute that deserves the same level of control, monitoring, and validation as any other aspect of your manufacturing process.

The question isn’t whether your water system works when everything goes perfectly. The question is whether your monitoring and control systems will detect problems before they become patient safety issues. Based on LeMaitre’s experience, that’s a question worth asking—and answering—before the FDA does it for you.

Causal Reasoning: A Transformative Approach to Root Cause Analysis

Energy Safety Canada recently published a white paper on causal reasoning that offers valuable insights for quality professionals across industries. As someone who has spent decades examining how we investigate deviations and perform root cause analysis, I found their framework refreshing and remarkably aligned with the challenges we face in pharmaceutical quality. The paper proposes a fundamental shift in how we approach investigations, moving from what they call “negative reasoning” to “causal reasoning” that could significantly improve our ability to prevent recurring issues and drive meaningful improvement.

The Problem with Traditional Root Cause Analysis

Many of us in quality have experienced the frustration of seeing the same types of deviations recur despite thorough investigations and seemingly robust CAPAs. The Energy Safety Canada white paper offers a compelling explanation for this phenomenon: our investigations often focus on what did not happen rather than what actually occurred.

This approach, which the authors term “negative reasoning,” leads investigators to identify counterfactuals: things that did not occur, such as “operators not following procedures” or “personnel not stopping work when they should have”. The problem is fundamental: what was not happening cannot create the outcomes we experienced. As the authors aptly state, these counterfactuals “exist only in retrospection and never actually influenced events,” yet they dominate many of our investigation conclusions.

This insight resonates strongly with what I’ve observed in pharmaceutical quality. In 2019, the MHRA cited 210 companies for inadequate root cause analysis and CAPA development, including 6 critical findings. That data takes on renewed significance in light of Sanofi’s 2025 FDA warning letter. Most of the cited organizations likely believed their investigation processes were robust, as Sanofi presumably did before its contamination failures surfaced. Taken together, these cases across regulatory bodies and years expose a persistent industry-wide disconnect between perceived and actual investigation effectiveness, and they show how superficial root cause analysis creates a dangerous illusion of control.

Negative Reasoning vs. Causal Reasoning: A Critical Distinction

The white paper makes a distinction that I find particularly valuable: negative reasoning seeks to explain outcomes based on what was missing from the system, while causal reasoning looks for what was actually present or what happened. This difference may seem subtle, but it fundamentally changes the nature and outcomes of our investigations.

When we use negative reasoning, we create what the white paper calls “an illusion of cause without being causal”. We identify things like “failure to follow procedures” or “inadequate risk assessment,” which may feel satisfying but don’t explain why those conditions existed in the first place. These conclusions often lead to generic corrective actions that fail to address underlying issues.

In contrast, causal reasoning requires statements that have time, place, and magnitude. It focuses on what was necessary and sufficient to create the effect, building a logically tight cause-and-effect diagram. This approach helps reveal how work is actually done rather than comparing reality to an imagined ideal.
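
To make the time, place, and magnitude requirement concrete, here is a minimal sketch of how an investigation template might screen draft statements. Everything in it is an assumption for illustration: the `CausalStatement` structure, the `review` function, and the list of marker phrases are mine, not the white paper’s.

```python
from dataclasses import dataclass
from typing import List

# Phrases that usually signal a counterfactual. This list is an assumption
# for illustration and is not taken from the white paper.
NEGATIVE_MARKERS = ("did not", "didn't", "failed to", "failure to",
                    "lack of", "inadequate", "should have")

@dataclass
class CausalStatement:
    what: str       # what was present or what happened
    time: str       # when it happened
    place: str      # where it happened
    magnitude: str  # how much or how many

def review(stmt: CausalStatement) -> List[str]:
    """Return a list of problems; an empty list means the statement passes."""
    problems = []
    text = stmt.what.lower()
    for marker in NEGATIVE_MARKERS:
        if marker in text:
            problems.append(f"counterfactual wording: '{marker}'")
    for field_name in ("time", "place", "magnitude"):
        if not getattr(stmt, field_name).strip():
            problems.append(f"missing {field_name}")
    return problems

weak = CausalStatement("Operator failed to follow the procedure", "", "", "")
print(review(weak))
```

A keyword screen like this cannot judge whether a statement is truly causal, but it does catch the most common negative-reasoning phrasing before a report reaches review.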

This distinction parallels the gap between “work-as-imagined” (the black line) and “work-as-done” (the blue line). Too often, our investigations focus only on deviations from work-as-imagined without trying to understand why work-as-done developed differently.

A Tale of Two Analyses: The Power of Causal Reasoning

The white paper presents a compelling case study involving a propane release and operator injury that illustrates the difference between these two approaches. When initially analyzed through negative reasoning, investigators concluded the operator:

  • Used an improper tool
  • Deviated from good practice
  • Failed to recognize hazards
  • Failed to learn from past experiences

These conclusions placed blame squarely on the individual and led leadership to consider terminating the operator.

However, when the same incident was examined through causal reasoning, a different picture emerged:

  • The operator used the pipe wrench because it was available at the pump specifically for this purpose
  • The pipe wrench had been deliberately left at that location because operators knew the valve was hard to close
  • The operator acted quickly because he perceived a risk to the plant and colleagues
  • Leadership had actually endorsed this workaround four years earlier during a turnaround

This causally reasoned analysis revealed that what appeared to be an individual failure was actually a system-level issue that had been normalized over time. Rather than punishing the operator, leadership recognized their own role in creating the conditions for the incident and implemented systemic improvements.

This example reminded me of our discussions on barrier analysis, where we examine barriers that failed, weren’t used, or didn’t exist. But causal reasoning takes this further by exploring why those conditions existed in the first place, creating a much richer understanding of how work actually happens.

First 24 Hours: Where Causal Reasoning Meets The Golden Day

In my recent post on “The Golden Start to a Deviation Investigation,” I emphasized how critical the first 24 hours are after discovering a deviation. This initial window represents our best opportunity to capture accurate information and set the stage for a successful investigation. The Energy Safety Canada white paper complements this concept perfectly by providing guidance on how to use those critical hours effectively.

When we apply causal reasoning during these early stages, we focus on collecting specific, factual information about what actually occurred rather than immediately jumping to what should have happened. This means documenting events with specificity (time, place, magnitude) and avoiding premature judgments about deviations from procedures or expectations.

As I’ve previously noted, clear and precise problem definition forms the foundation of any effective investigation. Causal reasoning enhances this process by ensuring we document using specific, factual language that describes what occurred rather than what didn’t happen. This creates a much stronger foundation for the entire investigation.

Beyond Human Error: System Thinking and Leadership’s Role

One of the most persistent challenges in our field is the tendency to attribute events to “human error.” As I’ve discussed before, when human error is suspected or identified as the cause, this should be justified only after ensuring that process, procedural, or system-based errors have not been overlooked. The white paper reinforces this point, noting that human actions and decisions are influenced by the system in which people work.

In fact, the paper presents a hierarchy of causes that resonates strongly with systems thinking principles I’ve advocated for previously. Outcomes arise from physical mechanisms influenced by human actions and decisions, which are in turn governed by systemic factors. If we only address physical mechanisms or human behaviors without changing the system, performance will eventually migrate back to where it has always been.

This connects directly to what I’ve written about quality culture being fundamental to providing quality. The white paper emphasizes that leadership involvement is directly correlated with performance improvement. When leaders engage to set conditions and provide resources, they create an environment where investigations can reveal systemic issues rather than just identify procedural deviations or human errors.

Implementing Causal Reasoning in Pharmaceutical Quality

For pharmaceutical quality professionals looking to implement causal reasoning in their investigation processes, I recommend starting with these practical steps:

1. Develop Investigator Competencies

As I’ve discussed in my analysis of Sanofi’s FDA warning letter, having competent investigators is crucial. Organizations should:

  • Define required competencies for investigators
  • Provide comprehensive training on causal reasoning techniques
  • Implement mentoring programs for new investigators
  • Regularly assess and refresh investigator skills

2. Shift from Counterfactuals to Causal Statements

Review your recent investigations and look for counterfactual statements like “operators did not follow the procedure.” Replace these with causal statements that describe what actually happened and why it made sense to the people involved at the time.

3. Implement a Sponsor-Driven Approach

The white paper emphasizes the importance of investigation sponsors (otherwise known as Area Managers) who set clear conditions and expectations. This aligns perfectly with my belief that quality culture requires alignment between top management behavior and quality system philosophy. Sponsors should:

  • Clearly define the purpose and intent of investigations
  • Specify that a causal reasoning orientation should be used
  • Provide resources and access needed to find data and translate it into causes
  • Remain engaged throughout the investigation process

4. Use Structured Causal Analysis Tools

While the M-based frameworks I’ve discussed previously (4M, 5M, 6M) remain valuable for organizing contributing factors, they should be complemented with tools that support causal reasoning. The Cause-Consequence Analysis (CCA) I described in a recent post offers one such approach, combining elements of fault tree analysis and event tree analysis to provide a holistic view of risk scenarios.

From Understanding to Improvement

The Energy Safety Canada white paper’s emphasis on causal reasoning represents a valuable contribution to how we think about investigations across industries. For pharmaceutical quality professionals, this approach offers a way to move beyond compliance-focused investigations to truly understand how our systems operate and how to improve them.

As the authors note, “The capacity for an investigation to improve performance is dependent on the type of reasoning used by investigators”. By adopting causal reasoning, we can build investigations that reveal how work actually happens rather than simply identifying deviations from how we imagine it should happen.

This approach aligns perfectly with my long-standing belief that without a strong quality culture, people will not be ready to commit and involve themselves fully in building and supporting a robust quality management system. Causal reasoning creates the transparency and learning that form the foundation of such a culture.

I encourage quality professionals to download and read the full white paper, reflect on their current investigation practices, and consider how causal reasoning might enhance their approach to understanding and preventing deviations. The most important questions to consider are:

  1. Do your investigation conclusions focus on what didn’t happen rather than what did?
  2. How often do you identify “human error” without exploring the system conditions that made that error likely?
  3. Are your leaders engaged as sponsors who set conditions for successful investigations?
  4. What barriers exist in your organization that prevent learning from events?

As we continue to evolve our understanding of quality and safety, approaches like causal reasoning offer valuable tools for creating the transparency needed to navigate complexity and drive meaningful improvement.

Understanding the Distinction Between Impact and Risk

Two concepts, impact and risk, are often discussed but sometimes conflated within quality systems. While related, they serve distinct purposes and drive different decisions throughout the quality system. Let’s explore.

The Fundamental Difference: Impact vs. Risk

The difference between impact and risk is fundamental to effective quality management. Impact is best thought of as “What do I need to do to make the change?” Risk is “What could go wrong in making this change?”

Impact assessment focuses on evaluating the effects of a proposed change on various elements such as documentation, equipment, processes, and training. It helps identify the scope and reach of a change. Risk assessment, by contrast, looks ahead to identify potential failures that might occur due to the change – it’s preventive and focused on possible consequences.

This distinction isn’t merely academic – it directly affects how we approach actions and decisions in our quality systems, impacting core functions of CAPA, Change Control and Management Review.

| Aspect | Impact | Risk |
|---|---|---|
| Definition | The effect or influence a change, event, or deviation has on product quality, process, or system | The probability and severity of harm or failure occurring as a result of a change, event, or deviation |
| Focus | What is affected and to what extent (scope and magnitude of consequences) | What could go wrong, how likely it is to happen, and how severe the outcome could be |
| Assessment Type | Evaluates the direct consequences of an action or event | Evaluates the likelihood and severity of potential adverse outcomes |
| Typical Use | Used in change control to determine which documents, systems, or processes are impacted | Used to prioritize actions, allocate resources, and implement controls to minimize negative outcomes |
| Measurement | Usually described qualitatively (e.g., minor, moderate, major, critical) | Often quantified by combining probability and impact scores to assign a risk level (e.g., low, medium, high) |
| Example | A change in raw material supplier impacts the manufacturing process and documentation. | The risk is that the new supplier’s material could fail to meet quality standards, leading to product defects. |
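
The supplier example from the table can be sketched as a record that keeps the two assessments separate. This is an illustrative structure only: the class names, the 1 to 5 scales, and the scores assigned are assumptions, not part of any guidance.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Risk:
    failure_mode: str
    probability: int  # 1 (rare) to 5 (frequent); the scale is an assumption
    severity: int     # 1 (negligible) to 5 (critical); the scale is an assumption

    @property
    def score(self) -> int:
        # Simple probability x severity product, one common scoring convention.
        return self.probability * self.severity

@dataclass
class ChangeRecord:
    title: str
    impacted: List[str] = field(default_factory=list)  # impact: what must change
    risks: List[Risk] = field(default_factory=list)    # risk: what could go wrong

change = ChangeRecord(
    title="Change raw material supplier",
    impacted=["manufacturing process", "documentation"],
    risks=[Risk("new material fails quality standards", probability=2, severity=4)],
)

print(change.impacted)
print([(r.failure_mode, r.score) for r in change.risks])
```

Keeping `impacted` and `risks` as separate fields forces the author of a change to answer both questions instead of letting one stand in for the other.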

Change Control: Different Questions, Different Purposes

Within change management, the PIC/S Recommendation PI 054-1 notes that “In some cases, especially for simple and minor/low risk changes, an impact assessment is sufficient to document the risk-based rationale for a change without the use of more formal risk assessment tools or approaches.”

Impact Assessment in Change Control

  • Determines what documentation requires updating
  • Identifies affected systems, equipment, and processes
  • Establishes validation requirements
  • Determines training needs

Risk Assessment in Change Control

  • Identifies potential failures that could result from the change
  • Evaluates possible consequences to product quality and patient safety
  • Determines likelihood of those consequences occurring
  • Guides preventive measures

A common mistake is conflating these concepts or shortcutting one assessment. For example, companies often rush to designate changes as “like-for-like” without supporting data, effectively bypassing proper risk assessment. This highlights why maintaining the distinction is crucial.

Validation: Complementary Approaches

In validation, the impact-risk distinction shapes our entire approach.

Impact in validation relates to identifying what aspects of product quality could be affected by a system or process. For example, when qualifying manufacturing equipment, we determine which critical quality attributes (CQAs) might be influenced by the equipment’s performance.

Risk assessment in validation explores what could go wrong with the equipment or process that might lead to quality failures. Risk management plays a pivotal role in validation by enabling a risk-based approach to defining validation strategies, ensuring regulatory compliance, mitigating product quality and safety risks, facilitating continuous improvement, and promoting cross-functional collaboration.

In Design Qualification, we verify that the critical aspects (CAs) and critical design elements (CDEs) necessary to control risks identified during the quality risk assessment (QRA) are present in the design. This illustrates how impact assessment (identifying critical aspects) works together with risk assessment (identifying what could go wrong).

When we perform Design Review and Design Qualification, we focus on critical aspects, prioritizing design elements that directly impact product quality and patient safety. Here, impact assessment identifies critical aspects, while risk assessment helps prioritize based on potential consequences.

Following Design Qualification, Verification activities such as Installation Qualification (IQ), Operational Qualification (OQ), and Performance Qualification (PQ) serve to confirm that the system or equipment performs as intended under actual operating conditions. Here, impact assessment identifies the specific parameters and functions that must be verified to ensure no critical quality attributes are compromised. Simultaneously, risk assessment guides the selection and extent of tests by focusing on areas with the highest potential for failure or deviation. This dual approach ensures that verification not only confirms the intended impact of the design but also proactively mitigates risks before routine use.

Validation does not end with initial qualification. Continuous Validation involves ongoing monitoring and trending of process performance and product quality to confirm that the validated state is maintained over time. Impact assessment plays a role in identifying which parameters and quality attributes require ongoing scrutiny, while risk assessment helps prioritize monitoring efforts based on the likelihood and severity of potential deviations. This continuous cycle allows quality systems to detect emerging risks early and implement corrective actions promptly, reinforcing a proactive, risk-based culture that safeguards product quality throughout the product lifecycle.

Data Integrity: A Clear Example

Data integrity offers perhaps the clearest illustration of the impact-risk distinction.

As I’ve previously noted, data quality is not a risk. Poor data quality is a causal factor that can influence the likelihood or severity of a failure.

When assessing data integrity issues:

  • Impact assessment identifies what data is affected and which processes rely on that data
  • Risk assessment evaluates potential consequences of data integrity lapses

In my risk-based data integrity assessment methodology, I use a risk rating system that considers both impact and risk factors:

| Risk Rating | Action | Mitigation |
|---|---|---|
| >25 | High Risk: Potential Impact to Patient Safety or Product Quality | Mandatory |
| 12-25 | Moderate Risk: No Impact to Patient Safety or Product Quality but Potential Regulatory Risk | Recommended |
| <12 | Negligible DI Risk | Not Required |

This system integrates both impact (on patient safety or product quality) and risk (likelihood and detectability of issues) to guide mitigation decisions.
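
The thresholds in the table translate directly into a lookup. The sketch below covers only that mapping; the function name is hypothetical, and how the numeric rating itself is calculated is deliberately left out because it is not described here.

```python
from typing import Tuple

def di_mitigation(risk_rating: int) -> Tuple[str, str]:
    """Map a data integrity risk rating to (risk level, mitigation)
    using the bands from the table: >25, 12-25, <12."""
    if risk_rating > 25:
        return ("High Risk", "Mandatory")
    if risk_rating >= 12:
        return ("Moderate Risk", "Recommended")
    return ("Negligible DI Risk", "Not Required")

for rating in (30, 25, 12, 11):
    print(rating, di_mitigation(rating))
```

Note the boundary handling: a rating of exactly 25 falls in the moderate band, and exactly 12 is the lowest moderate value.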

The Golden Day: Impact and Risk in Deviation Management

The Golden Day concept for deviation management provides an excellent practical example. Within the first 24 hours of discovering a deviation, we conduct:

  1. An impact assessment to determine:
    • Which products, materials, or batches are affected
    • Potential effects on critical quality attributes
    • Possible regulatory implications
  2. A risk assessment to evaluate:
    • Patient safety implications
    • Product quality impact
    • Compliance with registered specifications
    • Level of investigation required

This impact assessment doubles as the initial risk assessment, which helps guide the level of effort put into the deviation. It shows how the two concepts, while distinct, work together to inform quality decisions.

Quality Escalation: When Impact Triggers a Response

In quality escalation, we often use specific criteria based on both impact and risk:

| Escalation Criteria | Examples of Quality Events for Escalation |
|---|---|
| Potential to adversely affect quality, safety, efficacy, performance or compliance of product | Contamination; Product defect/deviation from process parameters or specification; Significant GMP deviations |
| Product counterfeiting, tampering, theft | Product counterfeiting, tampering, theft reportable to Health Authority; Lost/stolen IMP |
| Product shortage likely to disrupt patient care | Disruption of product supply due to product quality events |
| Potential to cause patient harm associated with a product quality event | Urgent Safety Measure, Serious Breach, Significant Product Complaint |

These criteria demonstrate how we use both impact (what’s affected) and risk (potential consequences) to determine when issues require escalation.

Both Are Essential

Understanding the difference between impact and risk fundamentally changes how we approach quality management. Impact assessment without risk assessment may identify what’s affected but fails to prevent potential issues. Risk assessment without impact assessment might focus on theoretical problems without understanding the actual scope.

The pharmaceutical quality system requires both perspectives:

  1. Impact tells us the scope – what’s affected
  2. Risk tells us the consequences – what could go wrong

By maintaining this distinction and applying both concepts appropriately across change control, validation, and data integrity management, we build more robust quality systems that not only comply with regulations but actually protect product quality and patient safety.

The Golden Start to a Deviation Investigation

How you respond in the first 24 hours after discovering a deviation can make the difference between a minor quality issue and a major compliance problem. This critical window, which I call “The Golden Day,” represents your best opportunity to capture accurate information, contain potential risks, and set the stage for a successful investigation. When managed effectively, this initial day creates the foundation for identifying true root causes and implementing effective corrective actions that protect product quality and patient safety.

Why the First 24 Hours Matter: The Evidence

The initial response to a deviation is crucial for both regulatory compliance and effective problem-solving. Industry practice and regulatory expectations align on the importance of quick, systematic responses to deviations.

  • Regulatory expectations explicitly state that deviation investigation and root cause determination should be completed in a timely manner, and industry expectations usually align on deviations being completed within 30 days of discovery.
  • In the landmark U.S. v. Barr Laboratories case, “the Court declared that all failure investigations must be performed promptly, within thirty business days of the problem’s occurrence”
  • Best practices recommend assembling a cross-functional team immediately after deviation discovery and conducting an initial risk assessment within 24 hours
  • Initial actions taken in the first day directly impact the quality and effectiveness of the entire investigation process

When you capitalize on this golden window, you’re working with fresh memories, intact evidence, and the highest chance of observing actual conditions that contributed to the deviation.

Identifying the Problem: Clarity from the Start

Clear, precise problem definition forms the foundation of any effective investigation. Vague or incomplete problem statements lead to misdirected investigations and ultimately, inadequate corrective actions.

  • Document using specific, factual language that describes what occurred versus what was expected
  • Include all relevant details such as procedure and equipment numbers, product names and lot numbers
  • Apply the 5W2H method (What, When, Where, Who, Why if known, How much is involved, and How it was discovered)
  • Avoid speculation about causes in the initial description
  • Remember that the description should incorporate relevant records and photographs of discovered defects.
5W2H | Typical questions | Contains
Who? | Who are the people directly concerned with the problem? Who does this? Who should be involved but wasn’t? Was someone involved who shouldn’t be? | User IDs, roles and departments
What? | What happened? | Action, steps, description
When? | When did the problem occur? | Times, dates, place in process
Where? | Where did the problem occur? | Location
Why is it important? | Why did we do this? What are the requirements? What is the expected condition? | Justification, reason
How? | How did we discover it? Where in the process was it? | Method, process, procedure
How Many? How Much? | How many things are involved? How often did the situation happen? How much did it impact? | Number, frequency

The quality of your deviation documentation begins with this initial identification. As I’ve emphasized in previous posts, the investigation/deviation report should tell a story that can be easily understood by all parties well after the event and the investigation. This narrative begins with clear identification on day one.

Elements | Problem Statement
Is used to… | Understand and target a problem; provide a scope; evaluate any risks; make objective decisions
Answers the following… (5W2H) | What? (problem that occurred); When? (timing of what occurred); Where? (location of what occurred); Who? (persons involved/observers); Why? (why it matters, not why it occurred); How Much/Many? (volume or count); How Often? (first/only occurrence or multiple)
Contains… | Object (What was affected?); Defect (What went wrong?)
Provides direction for… | Escalation(s); Investigation
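One way to keep day-one problem statements complete is to treat the 5W2H elements as required fields and flag whatever is still blank. The following is a minimal sketch under that assumption; the `ProblemStatement` class, its field names, and the sample draft are hypothetical and not taken from any particular quality management system.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProblemStatement:
    """Hypothetical 5W2H record; each field holds facts, not suspected causes."""
    what: Optional[str] = None            # object and defect
    when: Optional[str] = None
    where: Optional[str] = None
    who: Optional[str] = None             # persons involved or observing
    why_it_matters: Optional[str] = None  # requirement or expected condition
    how_discovered: Optional[str] = None
    how_much: Optional[str] = None        # count, volume, or frequency


def missing_elements(statement: ProblemStatement) -> list[str]:
    """List the 5W2H elements that are still blank."""
    return [
        f.name
        for f in fields(statement)
        if not (getattr(statement, f.name) or "").strip()
    ]


# Made-up draft with only three elements filled in.
draft = ProblemStatement(
    what="Hold time exceeded for buffer lot (example)",
    when="During second shift",
    where="Buffer prep suite",
)
print(missing_elements(draft))
```

Note that the "why" field captures why the event matters, not why it occurred, which keeps speculation about causes out of the initial description.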

Going to the GEMBA: Being Where the Action Is

GEMBA, the actual place where work happens, is a cornerstone concept in quality management. When a deviation occurs, there is no substitute for being physically present at the location.

  • Observe the actual conditions and environment firsthand
  • Notice details that might not be captured in written reports
  • Understand the workflow and context surrounding the deviation
  • Gather physical evidence before it’s lost or conditions change
  • Create the opportunity for meaningful conversations with operators

Human error occurs because we are human beings. The extent of our knowledge, training, and skill has little to do with the mistakes we make. We tire, our minds wander and lose concentration, and we must navigate complex processes while satisfying competing goals and priorities – compliance, schedule adherence, efficiency, etc.

Foremost to understanding human performance is knowing that people do what makes sense to them given the available cues, tools, and focus of their attention at the time. Simply put, people come to work to do a good job – if it made sense for them to do what they did, it will make sense to others given similar conditions. The following factors significantly shape human performance and should be the focus of any human error investigation:

  • Physical Environment: Environment, tools, procedures, process design
  • Organizational Culture: Just- or blame-culture, attitude towards error
  • Management and Supervision: Management of personnel, training, procedures
  • Stress Factors: Personal, circumstantial, organizational

We do not want to see or experience human error – but when we do, it’s imperative to view it as a valuable opportunity to improve the system or process. This mindset is the heart of effective human error prevention.

Conducting an Effective GEMBA Walk for Deviations

When conducting your GEMBA walk specifically for deviation investigation:

  • Arrive with a clear purpose and structured approach
  • Observe before asking questions
  • Document observations with photos when appropriate
  • Look for environmental factors that might not appear in reports
  • Pay attention to equipment configuration and conditions
  • Note how operators interact with the process or equipment

A deviation gemba is a cross-functional team meeting that is assembled where a potential deviation event occurred. Going to the gemba and “freezing the scene” as close as possible to the time the event occurred will yield valuable clues about the environment that existed at the time – and fresher memories will provide higher quality interviews. This gemba has specific objectives:

  • Obtain a common understanding of the event: what happened, when and where it happened, who observed it, who was involved – all the facts surrounding the event. Is it a deviation?
  • Clearly describe actions taken, or that need to be taken, to contain impact from the event: product quarantine, physical or mechanical interventions, management or regulatory notifications, etc.
  • Interview involved operators: ask open-ended questions, like how the event unfolded or was discovered, from their perspective, or how the event could have been prevented, in their opinion – insights from personnel experienced with the process can prove invaluable during an investigation.

Deviation GEMBA Tips

Typically there is time between when notification of a deviation gemba goes out and when the team is scheduled to assemble. It is important to come prepared to help facilitate an efficient gemba:

  • Assemble procedures and other relevant documents and records. This will make references easier during the gemba.
  • Keep your team on-track – the gemba should end with the team having a common understanding of the event, actions taken to contain impact, and the agreed-upon next steps of the investigation.

You will gain plenty of investigational leads from your observations and interviews at the gemba – which documents to review, which personnel to interview, which equipment history to inspect, and more. The gemba is such an invaluable experience that, for many minor events, root cause and CAPA can be determined fairly easily from information gathered solely at the gemba.

Informal Rubric for Conducting a Good Deviation GEMBA

  • Describe the timeliness of the team gathering at the gemba.
  • Were all required roles and experts present?
  • Was someone leading or facilitating the gemba?
  • Describe any interviews the team performed during the gemba.
  • Did the team get sidetracked or off-topic during the gemba?
  • Was the team prepared with relevant documentation or information?
  • Did the team determine batch impact and any reportability requirements?
  • Did the team satisfy the objectives of the gemba?
  • What did the team do well?
  • What could the team improve upon?

Speaking with Operators: The Power of Cognitive Interviewing

Interviewing personnel who were present when the deviation occurred requires special techniques to elicit accurate, complete information. Traditional questioning often fails to capture critical details.

Cognitive interviewing, as I outlined in my previous post on “Interviewing,” was originally created for law enforcement and later adopted during accident investigations by the National Transportation Safety Board (NTSB). This approach is based on two key principles:

  • Witnesses need time and encouragement to recall information
  • Retrieval cues enhance memory recall

How to Apply Cognitive Interviewing in Deviation Investigations

  • Mental Reinstatement: Encourage the interviewee to mentally recreate the environment and people involved
  • In-Depth Reporting: Encourage the reporting of all details, even if they seem minor or not directly related
  • Multiple Perspectives: Ask the interviewee to recall the event from others’ points of view
  • Several Orders: Ask the interviewee to recount the timeline in different ways, such as beginning to end and end to beginning

Most importantly, conduct these interviews at the actual location where the deviation occurred. Retrieval cues help people access memory, which is why interviewing on the scene (at the gemba) is so effective.

Component | What It Consists of
Mental Reinstatement | Encourage the interviewee to mentally recreate the environment and people involved.
In-Depth Reporting | Encourage the reporting of all the details.
Multiple Perspectives | Ask the interviewee to recall the event from others’ points of view.
Several Orders | Ask the interviewee to recount the timeline in different ways.
  • Approach the Interviewee Positively:
    • Ask for the interview.
    • State the purpose of the interview.
    • Tell interviewee why he/she was selected.
    • Avoid statements that imply blame.
    • Focus on the need to capture knowledge.
    • Answer questions about the interview.
    • Acknowledge and respond to concerns.
    • Manage negative emotions.
  • Apply these Four Components:
    • Use mental reinstatement.
    • Report everything.
    • Change the perspective.
    • Change the order.
  • Apply these Two Principles:
    • Witnesses need time and encouragement to recall information.
    • Retrieval cues enhance memory recall.
  • Demonstrate these Skills:
    • Recreate the original context and have them walk you through the process.
    • Tell the witness to actively generate information.
    • Adopt the witness’s perspective.
    • Listen actively, do not interrupt, and pause before asking follow-up questions.
    • Ask open-ended questions.
    • Encourage the witness to use imagery.
    • Perform interview at the Gemba.
    • Follow sequence of the four major components.
    • Bring support materials.
    • Establish a connection with the witness.
    • Do not tell them how they made the mistake.

Initial Impact Assessment: Understanding the Scope

Within the first 24 hours, a preliminary impact assessment is essential for determining the scope of the deviation and the appropriate response.

  • Apply a risk-based approach to categorize the deviation as critical, major, or minor
  • Evaluate all potentially affected products, materials, or batches
  • Consider potential effects on critical quality attributes
  • Assess possible regulatory implications
  • Determine if released products may be affected

This impact assessment is also the initial risk assessment, which will help guide the level of effort put into the deviation.

Factors to Consider in Initial Risk Assessment

  • Patient safety implications
  • Product quality impact
  • Compliance with registered specifications
  • Potential for impact on other batches or products
  • Regulatory reporting requirements
  • Level of investigation required

This initial assessment will guide subsequent decisions about quarantine, notification requirements, and the depth of investigation needed. Remember, this is a preliminary assessment that will be refined as the investigation progresses.
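To make the categorization step concrete, here is a minimal sketch of how the factors above might feed a preliminary critical/major/minor call. The rule set is an assumption made purely for illustration (patient safety drives critical, any quality, specification, or multi-batch concern drives major); your own procedure's criteria take precedence, and the `classify` function and `Severity` names are hypothetical.

```python
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    MAJOR = "major"
    MINOR = "minor"


def classify(patient_safety: bool, quality_impact: bool,
             outside_registered_spec: bool, other_batches: bool) -> Severity:
    """Assumed rule set for illustration only; use your own procedure's criteria."""
    if patient_safety:
        return Severity.CRITICAL
    if quality_impact or outside_registered_spec or other_batches:
        return Severity.MAJOR
    return Severity.MINOR


# Preliminary call for a made-up event with a quality impact but no safety concern.
print(classify(patient_safety=False, quality_impact=True,
               outside_registered_spec=False, other_batches=False).value)
```

Because this is a day-one assessment, the result should be treated as provisional and revisited as the investigation produces more facts.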

Immediate Actions: Containing the Issue

Once you’ve identified the deviation and assessed its potential impact, immediate actions must be taken to contain the issue and prevent further risk.

  • Quarantine potentially affected products or materials to prevent their release or further use
  • Notify key stakeholders, including quality assurance, production supervision, and relevant department heads
  • Implement temporary corrective or containment measures
  • Document the deviation in your quality management system
  • Secure relevant evidence and documentation
  • Consider whether to stop related processes

Industry best practices emphasize that you should report the deviation in real time, notify QA within 24 hours, and hold the GEMBA. Remember that “if you don’t document it, it didn’t happen” – thorough documentation of both the deviation and your immediate response is essential.

Affected vs Related Batches

Not every impact is the same, so it can be helpful to have two concepts: Affected and Related.

  • Affected Batch: Product directly impacted by the event at the time of discovery, for instance, the batch being manufactured or tested when the deviation occurred.
  • Related Batch: Product manufactured or tested under the same conditions or parameters using the process in which the deviation occurred and determined as part of the deviation investigation process to have no impact on product quality.
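A first-pass split between the two concepts can be sketched as a simple filter. This is a simplified example under stated assumptions: the `Batch` record, its fields, and the batch IDs are hypothetical, and the "related" list here is only a set of candidates, since the definition above requires the investigation itself to confirm there is no impact on product quality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Batch:
    """Hypothetical batch record; field names are illustrative."""
    batch_id: str
    process: str
    in_process_at_discovery: bool


def triage_batches(batches: list[Batch],
                   deviating_process: str) -> tuple[list[str], list[str]]:
    """Split batches into affected and candidate-related lists (assumed rule)."""
    same_process = [b for b in batches if b.process == deviating_process]
    affected = [b.batch_id for b in same_process if b.in_process_at_discovery]
    related = [b.batch_id for b in same_process if not b.in_process_at_discovery]
    return affected, related


# Made-up batches: one in process at discovery, one earlier on the same process,
# and one on an unrelated process.
batches = [
    Batch("B-101", "filling", True),
    Batch("B-100", "filling", False),
    Batch("B-099", "packaging", False),
]
affected, related = triage_batches(batches, "filling")
print(affected, related)
```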

Setting Up for a Successful Full Investigation

The final step in the golden day is establishing the foundation for the comprehensive investigation that will follow.

  • Assemble a cross-functional investigation team with relevant expertise
  • Define clear roles and responsibilities for team members
  • Establish a timeline for the investigation (remembering the 30-day guideline)
  • Identify additional data or evidence that needs to be collected
  • Plan for any necessary testing or analysis
  • Schedule follow-up interviews or observations

In my post on handling deviations, I emphasized that you must perform a time-sensitive and thorough investigation within 30 days. The groundwork laid during the golden day will make this timeline achievable while maintaining investigation quality.
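The two clocks discussed in this post (QA notification within 24 hours and investigation closure within 30 days) are easy to compute at the moment of discovery. The sketch below assumes calendar days for the 30-day target, which is a simplification; the Barr language quoted earlier refers to business days, so adjust to whatever your procedure specifies. The function name and dictionary keys are hypothetical.

```python
from datetime import datetime, timedelta


def golden_day_deadlines(discovered: datetime) -> dict[str, datetime]:
    """Compute target dates from the discovery time (calendar days assumed)."""
    return {
        "notify_qa_by": discovered + timedelta(hours=24),
        "investigation_due": discovered + timedelta(days=30),
    }


# Made-up discovery time.
discovered = datetime(2025, 3, 3, 14, 30)
deadlines = golden_day_deadlines(discovered)
for name, due in deadlines.items():
    print(f"{name}: {due:%Y-%m-%d %H:%M}")
```

Anchoring both dates to the discovery time, rather than to when the record was opened, keeps the timeline honest about how long the event has actually been open.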

Planning for Root Cause Analysis

During this setup phase, you should also begin planning which root cause analysis tools might be most appropriate for your investigation. Select tools based on the event complexity and the number of potential root causes. When “human error” appears to be involved, prepare to dig deeper, as this is rarely the true root cause.

Identifying Phase of your Investigation

If | Then you are at
The problem is not understood. Boundaries have not been set. There could be more than one problem | Problem Understanding
Data needs to be collected. There are questions about frequency or occurrence. You have not had interviews | Data Collection
Data has been collected but not analyzed | Data Analysis
The root cause needs to be determined from the analyzed data | Identify Root Cause
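The phase table reads naturally as a sequence of gates: you are in the earliest phase whose work is not yet done. A minimal sketch of that logic follows; the function and parameter names are hypothetical, and reducing each phase to a single yes/no flag is a deliberate simplification.

```python
def investigation_phase(problem_understood: bool, data_collected: bool,
                        data_analyzed: bool) -> str:
    """Return the earliest investigation phase that is still incomplete."""
    if not problem_understood:
        return "Problem Understanding"
    if not data_collected:
        return "Data Collection"
    if not data_analyzed:
        return "Data Analysis"
    return "Identify Root Cause"


# Example: the problem is bounded and data is in hand, but not yet analyzed.
print(investigation_phase(problem_understood=True, data_collected=True,
                          data_analyzed=False))
```

Checking the gates in order matters: collecting more data does not help if the problem boundaries have not been set.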

Root Cause Analysis Tools Chart

Purpose | Tool | Description
Problem Understanding | Process Map | A picture of the separate steps of a process in sequential order, including materials or services entering or leaving the process (inputs and outputs); decisions that must be made; people who become involved; time involved at each step; and/or process measurements.
Problem Understanding | Critical Incident Technique (CIT) | A process used for collecting direct observations of human behavior that have critical significance and meet methodically defined criteria.
Problem Understanding | Comparative Analysis | A technique that focuses a problem-solving team on a problem. It compares one or more elements of a problem or process to evaluate elements that are similar or different (e.g. comparing a standard process to a failing process).
Problem Understanding | Performance Matrix | A tool that describes the participation by various roles in completing tasks or deliverables for a project or business process. Note: It is especially useful in clarifying roles and responsibilities in cross-functional/departmental positions.
Problem Understanding | 5W2H Analysis | An approach that defines a problem and its underlying contributing factors by systematically asking questions related to who, what, when, where, why, how, and how much/often.
Data Collection | Surveys | A technique for gathering data from a targeted audience based on a standard set of criteria.
Data Collection | Check Sheets | A technique to compile data or observations to detect and show trends/patterns.
Data Collection | Cognitive Interview | An interview technique used by investigators to help the interviewee recall specific memories from a specific event.
Data Collection | KNOT Chart | A data collection and classification tool to organize data based on what is Known, Need to know, Opinion, and Think we know.
Data Analysis | Pareto Chart | A technique that focuses efforts on problems offering the greatest potential for improvement.
Data Analysis | Histogram | A tool that summarizes data collected over a period of time and graphically presents frequency distribution.
Data Analysis | Scatter Chart | A tool to study possible relationships between changes in two different sets of variables.
Data Analysis | Run Chart | A tool that captures study data for trends/patterns over time.
Data Analysis | Affinity Diagram | A technique for brainstorming and summarizing ideas into natural groupings to understand a problem.
Root Cause Analysis | Interrelationship Digraphs | A tool to identify, analyze, and classify cause and effect relationships among issues so that drivers become part of an effective solution.
Root Cause Analysis | Why-Why | A technique that allows one to explore the cause-and-effect relationships of a particular problem by asking why; drilling down through the underlying contributing causes to identify root cause.
Root Cause Analysis | Is/Is Not | A technique that guides the search for causes of a problem by isolating the who, what, when, where, and how of an event. It narrows the investigation to factors that have an impact and eliminates factors that do not have an impact. By comparing what the problem is with what the problem is not, we can see what is distinctive about a problem which leads to possible causes.
Root Cause Analysis | Structured Brainstorming | A technique to identify, explore, and display the factors within each root cause category that may be affecting the problem/issue, and/or the effect being studied through this structured idea-generating tool.
Root Cause Analysis | Cause and Effect Diagram (Ishikawa/Fishbone) | A tool to display potential causes of an event based on root cause categories defined by structured brainstorming using this tool as a visual aid.
Root Cause Analysis | Causal Factor Charting | A tool to analyze human factors and behaviors that contribute to errors, and identify behavior-influencing factors and gaps.
Other Tools | Prioritization Matrix | A tool to systematically compare choices through applying and weighting criteria.
Other Tools | Control Chart | A tool to monitor process performance over time by studying its variation and source.
Other Tools | Process Capability | A tool to determine whether a process is capable of meeting requirements or specifications.

Making the Most of Your Golden Day

The first 24 hours after discovering a deviation represent a unique opportunity that should not be wasted. By following the structured approach outlined in this post (identifying the problem clearly, going to the GEMBA, interviewing operators using cognitive techniques, conducting an initial impact assessment, taking immediate containment actions, and setting up for the full investigation), you maximize the value of this golden day.

Remember that excellent deviation management is directly linked to product quality, patient safety, and regulatory compliance. Each well-managed deviation is an opportunity to strengthen your quality system.

I encourage you to assess your current approach to the first 24 hours of deviation management. Are you capturing the full value of this golden day, or are you letting critical information slip away? Implement these strategies, train your team on proper deviation triage, and transform your deviation response from reactive to proactive.

Your deviation management effectiveness doesn’t begin when the investigation report is initiated; it begins the moment a deviation is discovered. Make that golden day count.

When Your Deviation/CAPA Program Runs Smoothly Expect a Period of Increased Deviations

One reason to invest in the CAPA program is that you will see fewer deviations over time as you fix issues. That is true, but it takes time. Yes, you’ve dealt with your backlog, improved your investigations, integrated risk management, built problem-solving into your processes, and are truly driving preventative actions. And yet your deviations remain high. What is going on?

It’s because you are getting good at things and working your way through the bolus of problems. Here’s what is going on:

  1. Improved Detection and Reporting: As a CAPA program matures, it enhances an organization’s ability to detect and report deviations. Employees become more adept at identifying and documenting deviations due to better training and awareness, leading to a temporary increase in reported deviations.
  2. Thorough Root Cause Analysis: A well-functioning CAPA program emphasizes thorough root cause analysis. This process often uncovers previously unnoticed issues and identifies additional deviations that need to be addressed.
  3. Increased Scrutiny and Compliance: As the CAPA program gains momentum, management usually scrutinizes it more, which can lead to the discovery of more deviations. Organizations become more vigilant in maintaining compliance, resulting in more deviations being reported and documented.
  4. Systematic Process Improvements: The CAPA process often leads to systemic improvements in processes and procedures. As these improvements are implemented, any deviations from the new standards are more likely to be identified and recorded, contributing to an initial rise in deviation reports.
  5. Cultural Shift Towards Quality: A successful CAPA program fosters a culture of quality and continuous improvement. Employees may feel more empowered and responsible for reporting deviations, increasing the number of deviations captured.

Expect these changes and build your metric program around them. Avoid introducing a metric like a reduction in deviations in the first year, as such a metric will drive bad behavior. Instead, focus on metrics that demonstrate the success of the changes and, over time, introduce metrics to see the overall benefits.