Sidney Dekker: The Safety Scientist Who Influences How I Think About Quality

Over the past decades, as I’ve grown and now led quality organizations in biotechnology, I’ve encountered many thinkers who’ve shaped my approach to investigation and risk management. But few have fundamentally altered my perspective like Sidney Dekker. His work didn’t just add to my toolkit—it forced me to question some of my most basic assumptions about human error, system failure, and what it means to create genuinely effective quality systems.

Dekker’s challenge to move beyond “safety theater” toward authentic learning resonates deeply with my own frustrations about quality systems that look impressive on paper but fail when tested by real-world complexity.

Why Dekker Matters for Quality Leaders

Professor Sidney Dekker brings a unique combination of academic rigor and operational experience to safety science. As both a commercial airline pilot and the Director of the Safety Science Innovation Lab at Griffith University, he understands the gap between how work is supposed to happen and how it actually gets done. This dual perspective—practitioner and scholar—gives his critiques of traditional safety approaches unusual credibility.

But what initially drew me to Dekker’s work wasn’t his credentials. It was his ability to articulate something I’d been experiencing but couldn’t quite name: the growing disconnect between our increasingly sophisticated compliance systems and our actual ability to prevent quality problems. His concept of “drift into failure” provided a framework for understanding why organizations with excellent procedures and well-trained personnel still experience systemic breakdowns.

The “New View” Revolution

Dekker’s most fundamental contribution is what he calls the “new view” of human error—a complete reframing of how we understand system failures. Having spent years investigating deviations and CAPAs, I can attest to how transformative this shift in perspective can be.

The Traditional Approach I Used to Take:

Human error causes problems
People are unreliable; systems need protection from human variability
Solutions focus on better training, clearer procedures, more controls

Dekker’s New View That Changed My Practice:

Human error is a symptom of deeper systemic issues
People are the primary source of system reliability, not the threat to it
Variability and adaptation are what make complex systems work

This isn’t just academic theory—it has practical implications for every investigation I lead. When I encounter “operator error” in a deviation investigation, Dekker’s framework pushes me to ask different questions: What made this action reasonable to the operator at the time? What system conditions shaped their decision-making? How did our procedures and training actually perform under real-world conditions?

This shift aligns perfectly with the causal reasoning approaches I’ve been developing on this blog. Instead of stopping at “failure to follow procedure,” we dig into the specific mechanisms that drove the event—exactly what Dekker’s view demands.

Drift Into Failure: Why Good Organizations Go Bad

Perhaps Dekker’s most powerful concept for quality leaders is “drift into failure”—the idea that organizations gradually migrate toward disaster through seemingly rational local decisions. This isn’t sudden catastrophic failure; it’s incremental erosion of safety margins through competitive pressure, resource constraints, and normalized deviance.

I’ve seen this pattern repeatedly. For example, a cleaning validation program starts with robust protocols, but over time, small shortcuts accumulate: sampling points that are “difficult to access” get moved, hold times get shortened when production pressure increases, acceptance criteria get “clarified” in ways that gradually expand limits.

Each individual decision seems reasonable in isolation. But collectively, they represent drift—a gradual migration away from the original safety margins toward conditions that enable failure. The contamination events and data integrity issues that plague our industry often represent the endpoint of these drift processes, not sudden breakdowns in otherwise reliable systems.

Beyond Root Cause: Understanding Contributing Conditions

Traditional root cause analysis seeks the single factor that “caused” an event, but complex system failures emerge from multiple interacting conditions. The take-the-best heuristic I’ve been exploring on this blog—focusing on the most causally powerful factor—builds directly on Dekker’s insight that we need to understand mechanisms, not hunt for someone to blame.

When I investigate a failure now, I’m not looking for THE root cause. I’m trying to understand how various factors combined to create conditions for failure. What pressures were operators experiencing? How did procedures perform under actual conditions? What information was available to decision-makers? What made their actions reasonable given their understanding of the situation?

This approach generates investigations that actually help prevent recurrence rather than just satisfying regulatory expectations for “complete” investigations.

Just Culture: Moving Beyond Blame

Dekker’s evolution of just culture thinking has been particularly influential in my leadership approach. His latest work moves beyond simple “blame-free” environments toward restorative justice principles—asking not “who broke the rule” but “who was hurt and how can we address underlying needs.”

This shift has practical implications for how I handle deviations and quality events. Instead of focusing on disciplinary action, I’m asking: What systemic conditions contributed to this outcome? What support do people need to succeed? How can we address the underlying vulnerabilities this event revealed?

This doesn’t mean eliminating accountability—it means creating accountability systems that actually improve performance rather than just satisfying our need to assign blame.

Safety Theater: The Problem with Compliance Performance

Dekker’s most recent work on “safety theater” hits particularly close to home in our regulated environment. He defines safety theater as the performance of compliance when under surveillance that retreats to actual work practices when supervision disappears.

I’ve watched organizations prepare for inspections by creating impressive documentation packages that bear little resemblance to how work actually gets done. Procedures get rewritten to sound more rigorous, training records get updated, and everyone rehearses the “right” answers for auditors. But once the inspection ends, work reverts to the adaptive practices that actually make operations function.

This theater emerges from our desire for perfect, controllable systems, but it paradoxically undermines genuine safety by creating inauthenticity. People learn to perform compliance rather than create genuine safety and quality outcomes.

The falsifiable quality systems I’ve been advocating on this blog represent one response to this problem—creating systems that can be tested and potentially proven wrong rather than just demonstrated as compliant.

Six Practical Takeaways for Quality Leaders

After years of applying Dekker’s insights in biotechnology manufacturing, here are the six most practical lessons for quality professionals:

1. Treat “Human Error” as the Beginning of Investigation, Not the End

When investigations conclude with “human error,” they’ve barely started. This should prompt deeper questions: Why did this action make sense? What system conditions shaped this decision? What can we learn about how our procedures and training actually perform under pressure?

2. Understand Work-as-Done, Not Just Work-as-Imagined

There’s always a gap between procedures (work-as-imagined) and actual practice (work-as-done). Understanding this gap and why it exists is more valuable than trying to force compliance with unrealistic procedures. Some of the most important quality improvements I’ve implemented came from understanding how operators actually solve problems under real conditions.

3. Measure Positive Capacities, Not Just Negative Events

Traditional quality metrics focus on what didn’t happen—no deviations, no complaints, no failures. I’ve started developing metrics around investigation quality, learning effectiveness, and adaptive capacity rather than just counting problems. How quickly do we identify and respond to emerging issues? How effectively do we share learning across sites? How well do our people handle unexpected situations?

4. Create Psychological Safety for Learning

Fear and punishment shut down the flow of safety-critical information. Organizations that want to learn from failures must create conditions where people can report problems, admit mistakes, and share concerns without fear of retribution. This is particularly challenging in our regulated environment, but it’s essential for moving beyond compliance theater toward genuine learning.

5. Focus on Contributing Conditions, Not Root Causes

Complex failures emerge from multiple interacting factors, not single root causes. The take-the-best approach I’ve been developing helps identify the most causally powerful factor while avoiding the trap of seeking THE cause. Understanding mechanisms is more valuable than finding someone to blame.

6. Embrace Adaptive Capacity Instead of Fighting Variability

People’s ability to adapt and respond to unexpected conditions is what makes complex systems work, not a threat to be controlled. Rather than trying to eliminate human variability through ever-more-prescriptive procedures, we should understand how that variability creates resilience and design systems that support rather than constrain adaptive problem-solving.

Connection to Investigation Excellence

Dekker’s work provides the theoretical foundation for many approaches I’ve been exploring on this blog. His emphasis on testable hypotheses rather than compliance theater directly supports falsifiable quality systems. His new view framework underlies the causal reasoning methods I’ve been developing. His focus on understanding normal work, not just failures, informs my approach to risk management.

Most importantly, his insistence on moving beyond negative reasoning (“what didn’t happen”) to positive causal statements (“what actually happened and why”) has transformed how I approach investigations. Instead of documenting failures to follow procedures, we’re understanding the specific mechanisms that drove events—and that makes all the difference in preventing recurrence.

Essential Reading for Quality Leaders

If you’re leading quality organizations in today’s complex regulatory environment, these Dekker works are essential:

Start Here:

The Field Guide to Understanding Human Error (3rd Edition) – Foundation for the new view approach
Just Culture: Restorative Justice and Accountability (4th Edition) – Practical approach to accountability without blame

For Investigation Excellence:

Behind Human Error (with Woods, Cook, et al.) – Comprehensive framework for moving beyond blame
Drift into Failure – Understanding how good organizations gradually deteriorate

For Current Challenges:

Safety Theater – Critique of compliance-focused approaches particularly relevant to regulated industries
Patient Safety: A Human Factors Approach – Healthcare applications that translate well to pharmaceutical quality

The Leadership Challenge

Dekker’s work challenges us as quality leaders to move beyond the comfortable certainty of compliance-focused approaches toward the more demanding work of creating genuine learning systems. This requires admitting that our procedures and training might not work as intended. It means supporting people when they make mistakes rather than just punishing them. It demands that we measure our success by how well we learn and adapt, not just how well we document compliance.

This isn’t easy work. It requires the kind of organizational humility that Amy Edmondson and other leadership researchers emphasize—the willingness to be proven wrong in service of getting better. But in my experience, organizations that embrace this challenge develop more robust quality systems and, ultimately, better outcomes for patients.

The question isn’t whether Sidney Dekker is right about everything—it’s whether we’re willing to test his ideas and learn from the results. That’s exactly the kind of falsifiable approach that both his work and effective quality systems demand.

Take-the-Best Heuristic for Causal Investigation

The integration of Gigerenzer’s take-the-best heuristic with a causal reasoning framework creates a powerful approach to root cause analysis that addresses one of the most persistent problems in quality investigations: the tendency to generate exhaustive lists of contributing factors without identifying the causal mechanisms that actually drove the event.

Traditional root cause analysis often suffers from what we might call “factor proliferation”—the systematic identification of every possible contributing element without distinguishing between those that were causally necessary for the outcome and those that merely provide context. This comprehensive approach feels thorough but often obscures the most important causal relationships by giving equal weight to diagnostic and non-diagnostic factors.

The take-the-best heuristic offers an elegant solution by focusing investigative effort on identifying the single most causally powerful factor—the factor that, if changed, would have been most likely to prevent the event from occurring. This approach aligns perfectly with causal reasoning’s emphasis on identifying what was actually present and necessary for the outcome, rather than cataloging everything that might have been relevant.

From Counterfactuals to Causal Mechanisms

The most significant advantage of applying take-the-best to causal investigation is its natural resistance to the negative reasoning trap that dominates traditional root cause analysis. When investigators ask “What single factor was most causally responsible for this outcome?” they’re forced to identify positive causal mechanisms rather than falling back on counterfactuals like “failure to follow procedure” or “inadequate training.”

Consider a typical pharmaceutical deviation where a batch fails specification due to contamination. Traditional analysis might identify multiple contributing factors: inadequate cleaning validation, operator error, environmental monitoring gaps, supplier material variability, and equipment maintenance issues. Each factor receives roughly equal attention in the investigation report, leading to broad but shallow corrective actions.

A take-the-best causal approach would ask: “Which single factor, if it had been different, would most likely have prevented this contamination?” The investigation might reveal that the cleaning validation was adequate under normal conditions, but a specific equipment configuration created dead zones that weren’t addressed in the original validation. This equipment configuration becomes the take-the-best factor because changing it would have directly prevented the contamination, regardless of other contributing elements.

This focus on the most causally powerful factor doesn’t ignore other contributing elements—it prioritizes them based on their causal necessity rather than their mere presence during the event.

The Diagnostic Power of Singular Focus

One of Gigerenzer’s key insights about take-the-best is that focusing on the single most diagnostic factor can actually improve decision accuracy compared to complex multivariate approaches. In causal investigation, this translates to identifying the factor that had the greatest causal influence on the outcome—the factor that represents the strongest link in the causal chain.

This approach forces investigators to move beyond correlation and association toward genuine causal understanding. Instead of asking “What factors were present during this event?” the investigation asks “What factor was most necessary and sufficient for this specific outcome to occur?” This question naturally leads to the kind of specific, testable causal statements.

For example, rather than concluding that “multiple factors contributed to the deviation including inadequate procedures, training gaps, and environmental conditions,” a take-the-best causal analysis might conclude that “the deviation occurred because the procedure specified a 30-minute hold time that was insufficient for complete mixing under the actual environmental conditions present during manufacturing, leading to stratification that caused the observed variability.” This statement identifies the specific causal mechanism (insufficient hold time leading to incomplete mixing) while providing the time, place, and magnitude specificity that causal reasoning demands.

Preventing the Generic CAPA Trap

The take-the-best approach to causal investigation naturally prevents one of the most common failures in pharmaceutical quality: the generation of generic, unfocused corrective actions that address symptoms rather than causes. When investigators identify multiple contributing factors without clear causal prioritization, the resulting CAPAs often become diffuse efforts to “improve” everything without addressing the specific mechanisms that drove the event.

By focusing on the single most causally powerful factor, take-the-best investigations generate targeted corrective actions that address the specific mechanism identified as most necessary for the outcome. This creates more effective prevention strategies while avoiding the resource dilution that often accompanies broad-based improvement efforts.

The causal reasoning framework enhances this focus by requiring that the identified factor be described in terms of what actually happened rather than what failed to happen. Instead of “failure to follow cleaning procedures,” the investigation might identify “use of abbreviated cleaning cycle during shift change because operators prioritized production schedule over cleaning thoroughness.” This causal statement directly leads to specific corrective actions: modify shift change procedures, clarify prioritization guidance, or redesign cleaning cycles to be robust against time pressure.

Systematic Application

Implementing take-the-best causal investigation in pharmaceutical quality requires systematic attention to identifying and testing causal hypotheses rather than simply cataloging potential contributing factors. This process follows a structured approach:

Step 1: Event Reconstruction with Causal Focus – Document what actually happened during the event, emphasizing the sequence of causal mechanisms rather than deviations from expected procedure. Focus on understanding why actions made sense to the people involved at the time they occurred.

Step 2: Causal Hypothesis Generation – Develop specific hypotheses about which single factor was most necessary and sufficient for the observed outcome. These hypotheses should make testable predictions about system behavior under different conditions.

Step 3: Diagnostic Testing – Systematically test each causal hypothesis to determine which factor had the greatest influence on the outcome. This might involve data analysis, controlled experiments, or systematic comparison with similar events.

Step 4: Take-the-Best Selection – Identify the single factor that testing reveals to be most causally powerful—the factor that, if changed, would be most likely to prevent recurrence of the specific event.

Step 5: Mechanistic CAPA Development – Design corrective actions that specifically address the identified causal mechanism rather than implementing broad-based improvements across all potential contributing factors.

Integration with Falsifiable Quality Systems

The take-the-best approach to causal investigation creates naturally falsifiable hypotheses that can be tested and validated over time. When an investigation concludes that a specific factor was most causally responsible for an event, this conclusion makes testable predictions about system behavior that can be validated through subsequent experience.

For example, if a contamination investigation identifies equipment configuration as the take-the-best causal factor, this conclusion predicts that similar contamination events will be prevented by addressing equipment configuration issues, regardless of training improvements or procedural changes. This prediction can be tested systematically as the organization gains experience with similar situations.

This integration with falsifiable quality systems creates a learning loop where investigation conclusions are continuously refined based on their predictive accuracy. Investigations that correctly identify the most causally powerful factors will generate effective prevention strategies, while investigations that miss the key causal mechanisms will be revealed through continued problems despite implemented corrective actions.

The Leadership and Cultural Implications

Implementing take-the-best causal investigation requires leadership commitment to genuine learning rather than blame assignment. This approach often reveals system-level factors that leadership helped create or maintain, requiring the kind of organizational humility that the Energy Safety Canada framework emphasizes.

The cultural shift from comprehensive factor identification to focused causal analysis can be challenging for organizations accustomed to demonstrating thoroughness through exhaustive documentation. Leaders must support investigators in making causal judgments and prioritizing factors based on their diagnostic power rather than their visibility or political sensitivity.

This cultural change aligns with the broader shift toward scientific quality management that both the adaptive toolbox and falsifiable quality frameworks require. Organizations must develop comfort with making specific causal claims that can be tested and potentially proven wrong, rather than maintaining the false safety of comprehensive but non-specific factor lists.

The take-the-best approach to causal investigation represents a practical synthesis of rigorous scientific thinking and adaptive decision-making. By focusing on the single most causally powerful factor while maintaining the specific, testable language that causal reasoning demands, this approach generates investigations that are both scientifically valid and operationally useful—exactly what pharmaceutical quality management needs to move beyond the recurring problems that plague traditional root cause analysis.

When Investigation Excellence Meets Contamination Reality: Lessons from the Rechon Life Science Warning Letter

The FDA’s April 30, 2025 warning letter to Rechon Life Science AB serves as a great learning opportunity about the importance robust investigation systems to contamination control to drive meaningful improvements. This Swedish contract manufacturer’s experience offers profound lessons for quality professionals navigating the intersection of EU Annex 1‘s contamination control strategy requirements and increasingly regulatory expectations. It is a mistake to think that just because the FDA doesn’t embrace the prescriptive nature of Annex 1 the agency is not fully aligned with the intent.

This Warning Letter resonates with similar systemic failures at companies like LeMaitre Vascular, Sanofi and others. The Rechon warning letter demonstrates a troubling but instructive pattern: organizations that fail to conduct meaningful contamination investigations inevitably find themselves facing regulatory action that could have been prevented through better investigation practices and systematic contamination control approaches.

The Cascade of Investigation Failures: Rechon’s Contamination Control Breakdown

Aseptic Process Failures and the Investigation Gap

Rechon’s primary violation centered on a fundamental breakdown in aseptic processing—operators were routinely touching critical product contact surfaces with gloved hands, a practice that was not only observed but explicitly permitted in their standard operating procedures. This represents more than poor technique; it reveals an organization that had normalized contamination risks through inadequate investigation and assessment processes.

The FDA’s citation noted that Rechon failed to provide environmental monitoring trend data for surface swab samples, representing exactly the kind of “aspirational data” problem. When investigation systems don’t capture representative information about actual manufacturing conditions, organizations operate in a state of regulatory blindness, making decisions based on incomplete or misleading data.

This pattern reflects a broader failure in contamination investigation methodology: environmental monitoring excursions require systematic evaluation that includes all environmental data (i.e. viable and non-viable tests) and must include areas that are physically adjacent or where related activities are performed. Rechon’s investigation gaps suggest they lacked these fundamental systematic approaches.

Environmental Monitoring Investigations: When Trend Analysis Fails

Perhaps more concerning was Rechon’s approach to persistent contamination with objectionable microorganisms—gram-negative organisms and spore formers—in ISO 5 and 7 areas since 2022. Their investigation into eight occurrences of gram-negative organisms concluded that the root cause was “operators talking in ISO 7 areas and an increase of staff illness,” a conclusion that demonstrates fundamental misunderstanding of contamination investigation principles.

As an aside, ISO7/Grade C is not normally an area we see face masks.

Effective investigations must provide comprehensive evaluation including:

Background and chronology of events with detailed timeline analysis
Investigation and data gathering activities including interviews and training record reviews
SME assessments from qualified microbiology and manufacturing science experts
Historical data review and trend analysis encompassing the full investigation zone
Manufacturing process assessment to determine potential contributing factors
Environmental conditions evaluation including HVAC, maintenance, and cleaning activities

Rechon’s investigation lacked virtually all of these elements, focusing instead on convenient behavioral explanations that avoided addressing systematic contamination sources. The persistence of gram-negative organisms and spore formers over a three-year period represented a clear adverse trend requiring a comprehensive investigation approach.

The Annex 1 Contamination Control Strategy Imperative: Beyond Compliance to Integration

The Paradigm Shift in Contamination Control

The revised EU Annex 1, effective since August 2023 demonstrates the current status of regulatory expectations around contamination control, moving from isolated compliance activities toward integrated risk management systems. The mandatory Contamination Control Strategy (CCS) requires manufacturers to develop comprehensive, living documents that integrate all aspects of contamination risk identification, mitigation, and monitoring.

Industry implementation experience since 2023 has revealed that many organizations are faiing to make meaningful connections between existing quality systems and the Annex 1 CCS requirements. Organizations struggle with the time and resource requirements needed to map existing contamination controls into coherent strategies, which often leads to discovering significant gaps in their understanding of their own processes.

Representative Environmental Monitoring Under Annex 1

The updated guidelines place emphasis on continuous monitoring and representative sampling that reflects actual production conditions rather than idealized scenarios. Rechon’s failure to provide comprehensive trend data demonstrates exactly the kind of gap that Annex 1 was designed to address.

Environmental monitoring must function as part of an integrated knowledge system that combines explicit knowledge (documented monitoring data, facility design specifications, cleaning validation reports) with tacit knowledge about facility-specific contamination risks and operational nuances. This integration demands investigation systems capable of revealing actual contamination patterns rather than providing comfortable explanations for uncomfortable realities.

The Design-First Philosophy

One of Annex 1’s most significant philosophical shifts is the emphasis on design-based contamination control rather than monitoring-based approaches. As we see from Warning Letters, and other regulatory intelligence, design gaps are frequently being cited as primary compliance failures, reinforcing the principle that organizations cannot monitor or control their way out of poor design.

This design-first philosophy fundamentally changes how contamination investigations must be conducted. Instead of simply investigating excursions after they occur, robust investigation systems must evaluate whether facility and process designs create inherent contamination risks that make excursions inevitable. Rechon’s persistent contamination issues suggest their investigation systems never addressed these fundamental design questions.

Best Practice 1: Implement Comprehensive Microbial Assessment Frameworks

Structured Organism Characterization

Effective contamination investigations begin with proper microbial assessments that characterize organisms based on actual risk profiles rather than convenient categorizations.

Complete microorganism documentation encompassing organism type, Gram stain characteristics, potential sources, spore-forming capability, and objectionable organism status. The structured approach outlined in formal assessment templates ensures consistent evaluation across different sample types (in-process, environmental monitoring, water and critical utilities).
Quantitative occurrence assessment using standardized vulnerability scoring systems that combine occurrence levels (Low, Medium, High) with nature and history evaluations. This matrix approach prevents investigators from minimizing serious contamination events through subjective assessments.
Severity evaluation based on actual manufacturing impact rather than theoretical scenarios. For environmental monitoring excursions, severity assessments must consider whether microorganisms were detected in controlled environments during actual production activities, the potential for product contamination, and the effectiveness of downstream processing steps.
Risk determination through systematic integration of vulnerability scores and severity ratings, providing objective classification of contamination risks that drives appropriate corrective action responses.

Rechon’s superficial investigation approach suggests they lacked these systematic assessment frameworks, focusing instead on behavioral explanations that avoided comprehensive organism characterization and risk assessment.

Best Practice 2: Establish Cross-Functional Investigation Teams with Defined Competencies

Investigation Team Composition and Qualifications

Major contamination investigations require dedicated cross-functional teams with clearly defined responsibilities and demonstrated competencies. The investigation lead must possess not only appropriate training and experience but also technical knowledge of the process and cGMP/quality system requirements, and ability to apply problem-solving tools.

Minimum team composition requirements for major investigations must include:

Impacted Department representatives (Manufacturing, Facilities) with direct operational knowledge
Subject Matter Experts (Manufacturing Sciences and Technology, QC Microbiology) with specialized technical expertise
Contamination Control specialists serving as Quality Assurance approvers with regulatory and risk assessment expertise

Investigation scope requirements must encompass systematic evaluation including background/chronology documentation, comprehensive data gathering activities (interviews, training record reviews), SME assessments, impact statement development, historical data review and trend analysis, and laboratory investigation summaries.

Training and Competency Management

Investigation team effectiveness depends on systematic competency development and maintenance. Teams must demonstrate proficiency in:

Root cause analysis methodologies including fishbone analysis, why-why questioning, fault tree analysis, and failure mode and effects analysis approaches suited to contamination investigation contexts.
Contamination microbiology principles including organism identification, source determination, growth condition assessment, and disinfectant efficacy evaluation specific to pharmaceutical manufacturing environments.
Risk assessment and impact evaluation capabilities that can translate investigation findings into meaningful product, process, and equipment risk determinations.
Regulatory requirement understanding encompassing both domestic and international contamination control expectations, investigation documentation standards, and CAPA development requirements.

The superficial nature of Rechon’s gram-negative organism investigation suggests their teams lacked these fundamental competencies, resulting in conclusions that satisfied neither regulatory expectations nor contamination control best practices.

Best Practice 3: Conduct Meaningful Historical Data Review and Comprehensive Trend Analysis

Investigation Zone Definition and Data Integration

Effective contamination investigations require comprehensive trend analysis that extends beyond simple excursion counting to encompass systematic pattern identification across related operational areas. As established in detailed investigation procedures, historical data review must include:

Physically adjacent areas and related activities recognition that contamination events rarely occur in isolation. Processing activities spanning multiple rooms, secondary gowning areas leading to processing zones, material transfer airlocks, and all critical utility distribution points must be included in investigation zones.
Comprehensive environmental data analysis encompassing all environmental data (i.e. viable and non-viable tests) to identify potential correlations between different contamination indicators that might not be apparent when examining single test types in isolation.
Extended historical review capabilities for situations where limited or no routine monitoring was performed during the questioned time frame, requiring investigation teams to expand their analytical scope to capture relevant contamination patterns.
Microorganism identification pattern assessment to determine shifts in routine microflora or atypical or objectionable organisms, enabling detection of contamination source changes that might indicate facility or process deterioration.

Temporal Correlation Analysis

Sophisticated trend analysis must correlate contamination events with operational activities, environmental conditions, and facility modifications that might contribute to adverse trends:

Manufacturing activity correlation examining whether contamination patterns correlate with specific production campaigns, personnel schedules, cleaning activities, or maintenance operations that might introduce contamination sources.
Environmental condition assessment including HVAC system performance, pressure differential maintenance, temperature and humidity control, and compressed air quality that could influence contamination recovery patterns.
Facility modification impact evaluation determining whether physical environment changes, equipment installations, utility upgrades, or process modifications correlate with contamination trend emergence or intensification.

Rechon’s three-year history of gram-negative and spore-former recovery represented exactly the kind of adverse trend requiring this comprehensive analytical approach. Their failure to conduct meaningful trend analysis prevented identification of systematic contamination sources that behavioral explanations could never address.

Best Practice 4: Integrate Investigation Findings with Dynamic Contamination Control Strategy

Knowledge Management and CCS Integration

Under Annex 1 requirements, investigation findings must feed directly into the overall Contamination Control Strategy, creating continuous improvement cycles that enhance contamination risk understanding and control effectiveness. This integration requires sophisticated knowledge management systems that capture both explicit investigation data and tacit operational insights.

Explicit knowledge integration encompasses formal investigation reports, corrective action documentation, trending analysis results, and regulatory correspondence that must be systematically incorporated into CCS risk assessments and control measure evaluations.
Tacit knowledge capture including personnel experiences with contamination events, operational observations about facility or process vulnerabilities, and institutional understanding about contamination source patterns that may not be fully documented but represent critical CCS inputs.

Risk Assessment Dynamic Updates

CCS implementation demands that investigation findings trigger systematic risk assessment updates that reflect enhanced understanding of contamination vulnerabilities:

Contamination source identification updates based on investigation findings that reveal previously unrecognized or underestimated contamination pathways requiring additional control measures or monitoring enhancements.
Control measure effectiveness verification through post-investigation monitoring that demonstrates whether implemented corrective actions actually reduce contamination risks or require further enhancement.
Monitoring program optimization based on investigation insights about contamination patterns that may indicate needs for additional sampling locations, modified sampling frequencies, or enhanced analytical methods.

Continuous Improvement Integration

The CCS must function as a living document that evolves based on investigation findings rather than remaining static until the next formal review cycle:

Investigation-driven CCS updates that incorporate new contamination risk understanding into facility design assessments, process control evaluations, and personnel training requirements.
Performance metrics integration that tracks investigation quality indicators alongside traditional contamination control metrics to ensure investigation systems themselves contribute to contamination risk reduction.
Cross-site knowledge sharing mechanisms that enable investigation insights from one facility to enhance contamination control strategies at related manufacturing sites.

Best Practice 5: Establish Investigation Quality Metrics and Systematic Oversight

Investigation Completeness and Quality Assessment

Organizations must implement systematic approaches to ensure investigation quality and prevent the superficial analysis demonstrated by Rechon. This requires comprehensive quality metrics that evaluate both investigation process compliance and outcome effectiveness:

Investigation completeness verification using a rubric or other standardized checklists that ensure all required investigation elements have been addressed before investigation closure. These must verify background documentation adequacy, data gathering comprehensiveness, SME assessment completion, impact evaluation thoroughness, and corrective action appropriateness.
Root cause determination quality assessment evaluating whether investigation conclusions demonstrate scientific rigor and logical connection between identified causes and observed contamination events. This includes verification that root cause analysis employed appropriate methodologies and that conclusions can withstand independent technical review.
Corrective action effectiveness verification through systematic post-implementation monitoring that demonstrates whether corrective actions achieved their intended contamination risk reduction objectives.

Management Review and Challenge Processes

Effective investigation oversight requires management systems that actively challenge investigation conclusions and ensure scientific rationale supports all determinations:

Technical review panels comprising independent SMEs who evaluate investigation methodology, data interpretation, and conclusion validity before investigation closure approval for major and critical deviations. I strongly recommend this as part of qualification and re-qualification activities.
Regulatory perspective integration ensuring investigation approaches and conclusions align with current regulatory expectations and enforcement trends rather than relying on outdated compliance interpretations.
Cross-functional impact assessment verifying that investigation findings and corrective actions consider all affected operational areas and don’t create unintended contamination risks in other facility areas.

CAPA System Integration and Effectiveness Tracking

Investigation findings must integrate with robust CAPA systems that ensure systematic improvements rather than isolated fixes:

Systematic improvement identification that links investigation findings to broader facility or process enhancement opportunities rather than limiting corrective actions to immediate excursion sources.
CAPA implementation quality management including resource allocation verification, timeline adherence monitoring, and effectiveness verification protocols that ensure corrective actions achieve intended risk reduction.
Knowledge management integration that captures investigation insights for application to similar contamination risks across the organization and incorporates lessons learned into training programs and preventive maintenance activities.

Rechon’s continued contamination issues despite previous investigations suggest their CAPA processes lacked this systematic improvement approach, treating each contamination event as isolated rather than symptoms of broader contamination control weaknesses.

A visual diagram presents a "Living Contamination Control Strategy" progressing toward a "Holistic Approach" through a winding path marked by five key best practices. Each best practice is highlighted in a circular node along the colored pathway.

Best Practice 01: Comprehensive microbial assessment frameworks through structured organism characterization.

Best Practice 02: Cross functional teams with the right competencies.

Best Practice 03: Meaningful historic data through investigation zones and temporal correlation.

Best Practice 04: Investigations integrated with Contamination Control Strategy.

Best Practice 05: Systematic oversight through metrics and challenge process.

The diagram represents a continuous improvement journey from foundational practices focused on organism assessment and team competency to integrating data, investigations, and oversight, culminating in a holistic contamination control strategy.

The Investigation-Annex 1 Integration Challenge: Building Investigation Resilience

Holistic Contamination Risk Assessment

Contamination control requires investigation systems that function as integral components of comprehensive strategies rather than reactive compliance activities.

Design-Investigation Integration demands that investigation findings inform facility design assessments and process modification evaluations. When investigations reveal design-related contamination sources, CCS updates must address whether facility modifications or process changes can eliminate contamination risks at their source rather than relying on monitoring and control measures.

Process Knowledge Enhancement through investigation activities that systematically build organizational understanding of contamination vulnerabilities, control measure effectiveness, and operational factors that influence contamination risk profiles.

Personnel Competency Development that leverages investigation findings to identify training needs, competency gaps, and behavioral factors that contribute to contamination risks requiring systematic rather than individual corrective approaches.

Technology Integration and Future Investigation Capabilities

Advanced Monitoring and Investigation Support Systems

The increasing sophistication of regulatory expectations necessitates corresponding advances in investigation support technologies that enable more comprehensive and efficient contamination risk assessment:

Real-time monitoring integration that provides investigation teams with comprehensive environmental data streams enabling correlation analysis between contamination events and operational variables that might not be captured through traditional discrete sampling approaches.

Automated trend analysis capabilities that identify contamination patterns and correlations across multiple data sources, facility areas, and time periods that might not be apparent through manual analysis methods.

Integrated knowledge management platforms that capture investigation insights, corrective action outcomes, and operational observations in formats that enable systematic application to future contamination risk assessments and control strategy optimization.

Investigation Standardization and Quality Enhancement

Technology solutions must also address investigation process standardization and quality improvement:

Investigation workflow management systems that ensure consistent application of investigation methodologies, prevent shortcuts that compromise investigation quality, and provide audit trails demonstrating compliance with regulatory expectations.

Cross-site investigation coordination capabilities that enable investigation insights from one facility to inform contamination risk assessments and investigation approaches at related manufacturing sites.

Building Organizational Investigation Excellence

Cultural Transformation Requirements

The evolution from compliance-focused contamination investigations toward risk-based contamination control strategies requires fundamental cultural changes that extend beyond procedural updates:

Leadership commitment demonstration through resource allocation for investigation system enhancement, personnel competency development, and technology infrastructure investment that enables comprehensive contamination risk assessment rather than minimal compliance achievement.

Cross-functional collaboration enhancement that breaks down organizational silos preventing comprehensive investigation approaches and ensures investigation teams have access to all relevant operational expertise and information sources.

Continuous improvement mindset development that views contamination investigations as opportunities for systematic facility and process enhancement rather than unfortunate compliance burdens to be minimized.

Investigation as Strategic Asset

Organizations that excel in contamination investigation develop capabilities that provide competitive advantages beyond regulatory compliance:

Process optimization opportunities identification through investigation activities that reveal operational inefficiencies, equipment performance issues, and facility design limitations that, when addressed, improve both contamination control and operational effectiveness.

Risk management capability enhancement that enables proactive identification and mitigation of contamination risks before they result in regulatory scrutiny or product quality issues requiring costly remediation.

Regulatory relationship management through demonstration of investigation competence and commitment to continuous improvement that can influence regulatory inspection frequency and focus areas.

The Cost of Investigation Mediocrity: Lessons from Enforcement

Regulatory Consequences and Business Impact

Rechon’s experience demonstrates the ultimate cost of inadequate contamination investigations: comprehensive regulatory action that threatens market access and operational continuity. The FDA’s requirements for extensive remediation—including independent assessment of investigation systems, comprehensive personnel and environmental monitoring program reviews, and retrospective out-of-specification result analysis—represent exactly the kind of work that should be conducted proactively rather than reactively.

Resource Allocation and Opportunity Cost

The remediation requirements imposed on companies receiving warning letters far exceed the resource investment required for proactive investigation system development:

Independent consultant engagement costs for comprehensive facility and system assessment that could be avoided through internal investigation capability development and systematic contamination control strategy implementation.
Production disruption resulting from regulatory holds, additional sampling requirements, and corrective action implementation that interrupts normal manufacturing operations and delays product release.
Market access limitations including potential product recalls, import restrictions, and regulatory approval delays that affect revenue streams and competitive positioning.

Reputation and Trust Impact

Beyond immediate regulatory and financial consequences, investigation failures create lasting reputation damage that affects customer relationships, regulatory standing, and business development opportunities:

Customer confidence erosion when investigation failures become public through warning letters, regulatory databases, and industry communications that affect long-term business relationships.
Regulatory relationship deterioration that can influence future inspection focus areas, approval timelines, and enforcement approaches that extend far beyond the original contamination control issues.
Industry standing impact that affects ability to attract quality personnel, develop partnerships, and maintain competitive positioning in increasingly regulated markets.

Gap Assessment Framework: Organizational Investigation Readiness

Investigation System Evaluation Criteria

Organizations should systematically assess their investigation capabilities against current regulatory expectations and best practice standards. This assessment encompasses multiple evaluation dimensions:

Technical Competency Assessment
- Do investigation teams possess demonstrated expertise in contamination microbiology, facility design, process engineering, and regulatory requirements?
- Are investigation methodologies standardized, documented, and consistently applied across different contamination scenarios?
- Does investigation scope routinely include comprehensive trend analysis, adjacent area assessment, and environmental correlation analysis?
- Are investigation conclusions supported by scientific rationale and independent technical review?
Resource Adequacy Evaluation
- Are sufficient personnel resources allocated to enable comprehensive investigation completion within reasonable timeframes?
- Do investigation teams have access to necessary analytical capabilities, reference materials, and technical support resources?
- Are investigation budgets adequate to support comprehensive data gathering, expert consultation, and corrective action implementation?
- Does management demonstrate commitment through resource allocation and investigation priority establishment?
Integration and Effectiveness Assessment
- Are investigation findings systematically integrated into contamination control strategy updates and facility risk assessments?
- Do CAPA systems ensure investigation insights drive systematic improvements rather than isolated fixes?
- Are investigation outcomes tracked and verified to confirm contamination risk reduction achievement?
- Do knowledge management systems capture and apply investigation insights across the organization?

From Investigation Adequacy to Investigation Excellence

Rechon Life Science’s experience serves as a cautionary tale about the consequences of investigation mediocrity, but it also illustrates the transformation potential inherent in comprehensive contamination control strategy implementation. When organizations invest in systematic investigation capabilities—encompassing proper team composition, comprehensive analytical approaches, effective knowledge management, and continuous improvement integration—they build competitive advantages that extend far beyond regulatory compliance.

The key insight emerging from regulatory enforcement patterns is that contamination control has evolved from a specialized technical discipline into a comprehensive business capability that affects every aspect of pharmaceutical manufacturing. The quality of an organization’s contamination investigations often determines whether contamination events become learning opportunities that strengthen operations or regulatory nightmares that threaten business continuity.

For quality professionals responsible for contamination control, the message is unambiguous: investigation excellence is not an optional enhancement to existing compliance programs—it’s a fundamental requirement for sustainable pharmaceutical manufacturing in the modern regulatory environment. The organizations that recognize this reality and invest accordingly will find themselves well-positioned not only for regulatory success but for operational excellence that drives competitive advantage in increasingly complex global markets.

The regulatory landscape has fundamentally changed, and traditional approaches to contamination investigation are no longer sufficient. Organizations must decide whether to embrace the investigation excellence imperative or face the consequences of continuing with approaches that regulatory agencies have clearly indicated are inadequate. The choice is clear, but the window for proactive transformation is narrowing as regulatory expectations continue to evolve and enforcement intensifies.

The question facing every pharmaceutical manufacturer is not whether contamination control investigations will face increased scrutiny—it’s whether their investigation systems will demonstrate the excellence necessary to transform regulatory challenges into competitive advantages. Those that choose investigation excellence will thrive; those that don’t will join Rechon Life Science and others in explaining their investigation failures to regulatory agencies rather than celebrating their contamination control successes in the marketplace.

Barriers and Root Cause Analysis: A Comprehensive Framework

Barriers, or controls, are one of the fundamental elements of root cause analysis. By understanding barriers—including their types and functions—we can understand both why a problem happened and how it can be prevented in the future. An evaluation of current process controls as part of root cause analysis can help determine whether all the current barriers pertaining to the problem you are investigating were present and effective.

Understanding Barrier Analysis

At its simplest, barrier analysis is a three-part brainstorm that examines the status and effectiveness of safety measures:

Barrier Analysis
Barriers that failed
Barriers that were not used
Barriers that did not exist

The key to this brainstorming session is to try to find all of the failed, unused, or nonexistent barriers. Do not be concerned if you are not certain which category they belong in initially.

Types of Barriers: Technical, Human, and Organizational

Most forms of barrier analysis examine two primary types: technical and administrative. Administrative barriers can be further broken down into “human” and “organizational” categories.

Choose	Technical	Human	Organizational
If	A technical or engineering control exists	The control relies on a human reviewer or operator	The control involves a transfer of responsibility. For example, a document reviewed by both manufacturing and quality.
Examples	Separation among manufacturing or packaging lines Emergency power supply Dedicated equipment Barcoding Keypad controlled doors Separated storage for components Software that prevents a workflow from going further if a field is not completed Redundant designs	Training and certifications Use of checklist Verification of critical task by a second person	Clear procedures and policies Adequate supervision Adequate load of work Periodic process audits

Preventive vs. Mitigative Barriers: A Critical Distinction

A fundamental aspect of barrier analysis involves understanding the difference between preventive and mitigative barriers. This distinction is crucial for comprehensive risk management and aligns with widely used frameworks such as bow-tie analysis.

Preventive Barriers

Preventive barriers are measures designed to prevent the top event from occurring. These barriers:

Focus on stopping incidents before they happen
Act as the first line of defense against threats
Aim to reduce the likelihood that a risk will materialize
Are proactive in nature, addressing potential causes before they can lead to unwanted events

Examples of preventive barriers include:

Regular equipment maintenance programs
Training and certification programs
Access controls and authentication systems
Equipment qualification protocols (IQ/OQ/PQ) validating proper installation and operation

Mitigative Barriers

Mitigative barriers are designed to reduce the impact and severity of consequences after the top event has occurred. These barriers:

Focus on damage control rather than prevention
Act to minimize harm when preventive measures have failed
Reduce the severity or substantially decrease the likelihood of consequences occurring
Are reactive in nature, coming into play after a risk has materialized

Examples of mitigative barriers include:

Alarm systems and response procedures
Containment measures for hazards
Emergency response teams and protocols
Backup power systems for critical operations

Timeline and Implementation Differences

The timing of barrier implementation and failure differs significantly between preventive and mitigative barriers:

Preventive barriers often fail over days, weeks, or years before the top event occurs, providing more opportunities for identification and intervention
Mitigative barriers often fail over minutes or hours after the top event occurs, requiring higher reliability and immediate effectiveness
This timing difference leads to higher reliance on mitigative barriers working correctly the first time

Enhanced Barrier Analysis Framework

Building on the traditional three-part analysis, organizations should incorporate the preventive vs. mitigative distinction into their barrier evaluation:

Enhanced Barrier Analysis
Preventive barriers that failed
Preventive barriers that were not used
Preventive barriers that did not exist
Mitigative barriers that failed
Mitigative barriers that were not used
Mitigative barriers that did not exist

Integration with Risk Assessment

These barriers are the same as current controls in risk assessment, which is key in a wide variety of risk assessment tools. The optimal approach involves balancing both preventive and mitigative barriers without placing reliance on just one type. Some companies may favor prevention by placing high confidence in their systems and practices, while others may emphasize mitigation through reactive policies, but neither approach alone is advisable as they each result in over-reliance on one type of barrier.

Practical Application

When conducting barrier analysis as part of root cause investigation:

Identify all relevant barriers that were supposed to protect against the incident
Classify each barrier as preventive or mitigative based on its intended function
Determine the barrier type: technical, human, or organizational
Assess barrier status: failed, not used, or did not exist
Evaluate the balance between preventive and mitigative measures
Develop corrective actions that address gaps in both preventive and mitigative barriers

This comprehensive approach to barrier analysis provides a more nuanced understanding of how incidents occur and how they can be prevented or their consequences minimized in the future. By understanding both the preventive and mitigative functions of barriers, organizations can develop more robust risk management strategies that address threats at multiple points in the incident timeline.

Causal Reasoning: A Transformative Approach to Root Cause Analysis

Energy Safety Canada recently published a white paper on causal reasoning that offers valuable insights for quality professionals across industries. As someone who has spent decades examining how we investigate deviations and perform root cause analysis, I found their framework refreshing and remarkably aligned with the challenges we face in pharmaceutical quality. The paper proposes a fundamental shift in how we approach investigations, moving from what they call “negative reasoning” to “causal reasoning” that could significantly improve our ability to prevent recurring issues and drive meaningful improvement.

The Problem with Traditional Root Cause Analysis

Many of us in quality have experienced the frustration of seeing the same types of deviations recur despite thorough investigations and seemingly robust CAPAs. The Energy Safety Canada white paper offers a compelling explanation for this phenomenon: our investigations often focus on what did not happen rather than what actually occurred.

This approach, which the authors term “negative reasoning,” leads investigators to identify counterfactuals-things that did not occur, such as “operators not following procedures” or “personnel not stopping work when they should have”. The problem is fundamental: what was not happening cannot create the outcomes we experienced. As the authors aptly state, these counterfactuals “exist only in retrospection and never actually influenced events,” yet they dominate many of our investigation conclusions.

This insight resonates strongly with what I’ve observed in pharmaceutical quality. Six years ago the MHRA’s 2019 citation of 210 companies for inadequate root cause analysis and CAPA development – including 6 critical findings – takes on renewed significance in light of Sanofi’s 2025 FDA warning letter. While most cited organizations likely believed their investigation processes were robust (as Sanofi presumably did before their contamination failures surfaced), these parallel cases across regulatory bodies and years expose a persistent industry-wide disconnect between perceived and actual investigation effectiveness. These continued failures exemplify how superficial root cause analysis creates dangerous illusions of control – precisely the systemic flaw the MHRA data highlighted six years prior.

Negative Reasoning vs. Causal Reasoning: A Critical Distinction

The white paper makes a distinction that I find particularly valuable: negative reasoning seeks to explain outcomes based on what was missing from the system, while causal reasoning looks for what was actually present or what happened. This difference may seem subtle, but it fundamentally changes the nature and outcomes of our investigations.

When we use negative reasoning, we create what the white paper calls “an illusion of cause without being causal”. We identify things like “failure to follow procedures” or “inadequate risk assessment,” which may feel satisfying but don’t explain why those conditions existed in the first place. These conclusions often lead to generic corrective actions that fail to address underlying issues.

In contrast, causal reasoning requires statements that have time, place, and magnitude. It focuses on what was necessary and sufficient to create the effect, building a logically tight cause-and-effect diagram. This approach helps reveal how work is actually done rather than comparing reality to an imagined ideal.

This distinction parallels the gap between “work-as-imagined” (the black line) and “work-as-done” (the blue line). Too often, our investigations focus only on deviations from work-as-imagined without trying to understand why work-as-done developed differently.

A Tale of Two Analyses: The Power of Causal Reasoning

The white paper presents a compelling case study involving a propane release and operator injury that illustrates the difference between these two approaches. When initially analyzed through negative reasoning, investigators concluded the operator:

Used an improper tool
Deviated from good practice
Failed to recognize hazards
Failed to learn from past experiences

These conclusions placed blame squarely on the individual and led leadership to consider terminating the operator.

However, when the same incident was examined through causal reasoning, a different picture emerged:

The operator used the pipe wrench because it was available at the pump specifically for this purpose
The pipe wrench had been deliberately left at that location because operators knew the valve was hard to close
The operator acted quickly because he perceived a risk to the plant and colleagues
Leadership had actually endorsed this workaround four years earlier during a turnaround

This causally reasoned analysis revealed that what appeared to be an individual failure was actually a system-level issue that had been normalized over time. Rather than punishing the operator, leadership recognized their own role in creating the conditions for the incident and implemented systemic improvements.

This example reminded me of our discussions on barrier analysis, where we examine barriers that failed, weren’t used, or didn’t exist. But causal reasoning takes this further by exploring why those conditions existed in the first place, creating a much richer understanding of how work actually happens.

First 24 Hours: Where Causal Reasoning Meets The Golden Day

In my recent post on “The Golden Start to a Deviation Investigation,” I emphasized how critical the first 24 hours are after discovering a deviation. This initial window represents our best opportunity to capture accurate information and set the stage for a successful investigation. The Energy Safety Canada white paper complements this concept perfectly by providing guidance on how to use those critical hours effectively.

When we apply causal reasoning during these early stages, we focus on collecting specific, factual information about what actually occurred rather than immediately jumping to what should have happened. This means documenting events with specificity (time, place, magnitude) and avoiding premature judgments about deviations from procedures or expectations.

As I’ve previously noted, clear and precise problem definition forms the foundation of any effective investigation. Causal reasoning enhances this process by ensuring we document using specific, factual language that describes what occurred rather than what didn’t happen. This creates a much stronger foundation for the entire investigation.

Beyond Human Error: System Thinking and Leadership’s Role

One of the most persistent challenges in our field is the tendency to attribute events to “human error.” As I’ve discussed before, when human error is suspected or identified as the cause, this should be justified only after ensuring that process, procedural, or system-based errors have not been overlooked. The white paper reinforces this point, noting that human actions and decisions are influenced by the system in which people work.

In fact, the paper presents a hierarchy of causes that resonates strongly with systems thinking principles I’ve advocated for previously. Outcomes arise from physical mechanisms influenced by human actions and decisions, which are in turn governed by systemic factors. If we only address physical mechanisms or human behaviors without changing the system, performance will eventually migrate back to where it has always been.

This connects directly to what I’ve written about quality culture being fundamental to providing quality. The white paper emphasizes that leadership involvement is directly correlated with performance improvement. When leaders engage to set conditions and provide resources, they create an environment where investigations can reveal systemic issues rather than just identify procedural deviations or human errors.

Implementing Causal Reasoning in Pharmaceutical Quality

For pharmaceutical quality professionals looking to implement causal reasoning in their investigation processes, I recommend starting with these practical steps:

1. Develop Investigator Competencies

As I’ve discussed in my analysis of Sanofi’s FDA warning letter, having competent investigators is crucial. Organizations should:

Define required competencies for investigators
Provide comprehensive training on causal reasoning techniques
Implement mentoring programs for new investigators
Regularly assess and refresh investigator skills

2. Shift from Counterfactuals to Causal Statements

Review your recent investigations and look for counterfactual statements like “operators did not follow the procedure.” Replace these with causal statements that describe what actually happened and why it made sense to the people involved at the time.

3. Implement a Sponsor-Driven Approach

The white paper emphasizes the importance of investigation sponsors (otherwise known as Area Managers) who set clear conditions and expectations. This aligns perfectly with my belief that quality culture requires alignment between top management behavior and quality system philosophy. Sponsors should:

Clearly define the purpose and intent of investigations
Specify that a causal reasoning orientation should be used
Provide resources and access needed to find data and translate it into causes
Remain engaged throughout the investigation process

Infographic capturing the 4 things a sponsor should do above

4. Use Structured Causal Analysis Tools

While the M-based frameworks I’ve discussed previously (4M, 5M, 6M) remain valuable for organizing contributing factors, they should be complemented with tools that support causal reasoning. The Cause-Consequence Analysis (CCA) I described in a recent post offers one such approach, combining elements of fault tree analysis and event tree analysis to provide a holistic view of risk scenarios.

From Understanding to Improvement

The Energy Safety Canada white paper’s emphasis on causal reasoning represents a valuable contribution to how we think about investigations across industries. For pharmaceutical quality professionals, this approach offers a way to move beyond compliance-focused investigations to truly understand how our systems operate and how to improve them.

As the authors note, “The capacity for an investigation to improve performance is dependent on the type of reasoning used by investigators”. By adopting causal reasoning, we can build investigations that reveal how work actually happens rather than simply identifying deviations from how we imagine it should happen.

This approach aligns perfectly with my long-standing belief that without a strong quality culture, people will not be ready to commit and involve themselves fully in building and supporting a robust quality management system. Causal reasoning creates the transparency and learning that form the foundation of such a culture.

I encourage quality professionals to download and read the full white paper, reflect on their current investigation practices, and consider how causal reasoning might enhance their approach to understanding and preventing deviations. The most important questions to consider are:

Do your investigation conclusions focus on what didn’t happen rather than what did?
How often do you identify “human error” without exploring the system conditions that made that error likely?
Are your leaders engaged as sponsors who set conditions for successful investigations?
What barriers exist in your organization that prevent learning from events?

As we continue to evolve our understanding of quality and safety, approaches like causal reasoning offer valuable tools for creating the transparency needed to navigate complexity and drive meaningful improvement.