The Hidden Contamination Hazards: What the Catalent Warning Letter Reveals About Systemic Aseptic Processing Failures

The November 2025 FDA Warning Letter to Catalent Indiana, LLC reads like an autopsy report—a detailed dissection of how contamination hazards aren’t discovered but rather engineered into aseptic operations through a constellation of decisions that individually appear defensible yet collectively create what I’ve previously termed the “zemblanity field” in pharmaceutical quality. Section 2, addressing failures under 21 CFR 211.113(b), exposes contamination hazards that didn’t emerge from random misfortune but from deliberate choices about decontamination strategies, sampling methodologies, intervention protocols, and investigation rigor.​

What makes this warning letter particularly instructive isn’t the presence of contamination events—every aseptic facility battles microbial ingress—but rather the systematic architectural failures that allowed contamination hazards to persist unrecognized, uninvestigated, and unmitigated despite multiple warning signals spanning more than 20 deviations and customer complaints. The FDA’s critique centers on three interconnected contamination hazard categories: VHP decontamination failures involving occluded surfaces, inadequate environmental monitoring methods that substituted convenience for detection capability, and intervention risk assessments that ignored documented contamination routes.

For those of us responsible for contamination control in aseptic manufacturing, this warning letter demands we ask uncomfortable questions: How many of our VHP cycles are validated against surfaces that remain functionally occluded? How often have we chosen contact plates over swabs because they’re faster, not because they’re more effective? When was the last time we terminated a media fill and treated it with the investigative rigor of a batch contamination event?

The Occluded Surface Problem: When Decontamination Becomes Theatre

The FDA’s identification of occluded surfaces as contamination sources during VHP decontamination represents a failure mode I’ve observed with troubling frequency across aseptic facilities. The fundamental physics are unambiguous: vaporized hydrogen peroxide achieves sporicidal efficacy through direct surface contact at validated concentration-time profiles. Any surface the vapor doesn’t contact—or contacts at insufficient concentration—remains a potential contamination reservoir regardless of cycle completion indicators showing “successful” decontamination.​

The Catalent situation involved two distinct occluded surface scenarios, each revealing different architectural failures in contamination hazard assessment. First, equipment surfaces occluded during VHP decontamination that subsequently became contamination sources during atypical interventions involving equipment changes. The FDA noted that “the most probable root cause” of an environmental monitoring failure was equipment surfaces occluded during VHP decontamination, with contamination occurring during execution of an atypical intervention involving changes to components integral to stopper seating.​

This finding exposes a conceptual error I frequently encounter: treating VHP decontamination as a universal solution that overcomes design deficiencies rather than as a validated process with specific performance boundaries. The Catalent facility’s own risk assessments advised against interventions that could disturb potentially occluded surfaces, yet these interventions continued—creating the precise contamination pathway their risk assessments identified as unacceptable.​

The second occluded surface scenario involved wrapped components within the filling line where insufficient VHP exposure allowed potential contamination. The FDA cited “occluded surfaces on wrapped [components] within the [equipment] as the potential cause of contamination”. This represents a validation failure: if wrapping materials prevent adequate VHP penetration, either the wrapping must be eliminated, the decontamination method must change, or these surfaces must be treated through alternative validated processes.​

The literature on VHP decontamination is explicit about occluded surface risks. As Sandle notes, surfaces must be “designed and installed so that operations, maintenance, and repairs can be performed outside the cleanroom” and where unavoidable, “all surfaces needing decontaminated” must be explicitly identified. The PIC/S guidance is similarly unambiguous: “Continuously occluded surfaces do not qualify for such trials as they cannot be exposed to the process and should have been eliminated”. Yet facilities continue to validate VHP cycles that demonstrate biological indicator kill on readily accessible flat coupons while ignoring the complex geometries, wrapped items, and recessed surfaces actually present in their filling environments.

What does a robust approach to occluded surface assessment look like? Based on the regulatory expectations and technical literature, facilities should:

Conduct comprehensive occluded surface mapping during design qualification. Every component introduced into VHP-decontaminated spaces must undergo geometric analysis to identify surfaces that may not receive adequate vapor exposure. This includes crevices, threaded connections, wrapped items, hollow spaces, and any surface shadowed by another object. The mapping should document not just that surfaces exist but their accessibility to vapor flow based on the specific VHP distribution characteristics of the equipment.​

Validate VHP distribution using chemical and biological indicators placed on identified occluded surfaces. Flat coupon placement on readily accessible horizontal surfaces tells you nothing about vapor penetration into wrapped components or recessed geometries. Biological indicators should be positioned specifically where vapor exposure is questionable—inside wrapped items, within threaded connections, under equipment flanges, in dead-legs of transfer lines. If biological indicators in these locations don’t achieve the validated log reduction, the surfaces are occluded and require design modification or alternative decontamination methods.​

Establish clear intervention protocols that distinguish between “sterile-to-sterile” and “potentially contaminated” surface contact. The Catalent finding reveals that atypical interventions involving equipment changes exposed the Grade A environment to surfaces not reliably exposed to VHP. Intervention risk assessments must explicitly categorize whether the intervention involves only VHP-validated surfaces or introduces components from potentially occluded areas. The latter category demands heightened controls: localized Grade A air protection, pre-intervention surface swabbing and disinfection, real-time environmental monitoring during the intervention, and post-intervention investigation if environmental monitoring shows any deviation.​

Implement post-decontamination surface monitoring that targets historically occluded locations. If your facility has identified occluded surfaces that cannot be designed out, these become critical sampling locations for post-VHP environmental monitoring. Trending of these specific locations provides early detection of decontamination effectiveness degradation before contamination reaches product-contact surfaces.

The FDA’s remediation demand is appropriately comprehensive: “a review of VHP exposure to decontamination methods as well as permitted interventions, including a retrospective historical review of routine interventions and atypical interventions to determine their risks, a comprehensive identification of locations that are not reliably exposed to VHP decontamination (i.e., occluded surfaces), your plan to reduce occluded surfaces where feasible, review of currently permitted interventions and elimination of high-risk interventions entailing equipment manipulations during production campaigns that expose the ISO 5 environment to surfaces not exposed to a validated decontamination process, and redesign of any intervention that poses an unacceptable contamination risk”.​

This remediation framework represents best practice for any aseptic facility using VHP decontamination. The occluded surface problem isn’t limited to Catalent—it’s an industry-wide vulnerability wherever VHP validation focuses on demonstrating sporicidal activity under ideal conditions rather than confirming adequate vapor contact across all surfaces within the validated space.

Contact Plates Versus Swabs: The Detection Capability Trade-Off

The FDA’s critique of Catalent’s environmental monitoring methodology exposes a decision I’ve challenged repeatedly throughout my career: the use of contact plates for sampling irregular, product-contact surfaces in Grade A environments. The technical limitations are well-established, yet contact plates persist because they’re faster and operationally simpler—prioritizing workflow convenience over contamination detection capability.

The specific Catalent deficiency involved sampling filling line components using “contact plate, sampling [surfaces] with one sweeping sampling motion.” The FDA identified two fundamental inadequacies: “With this method, you are unable to attribute contamination events to specific [locations]” and “your firm’s use of contact plates is not as effective as using swab methods”. These limitations aren’t novel discoveries—they’re inherent to contact plate methodology and have been documented in the microbiological literature for decades.​

Contact plates—rigid agar surfaces pressed against the area to be sampled—were designed for flat, smooth surfaces where complete agar-to-surface contact can be achieved with uniform pressure. They perform adequately on stainless steel benchtops, isolator walls, and other horizontal surfaces. But filling line components—particularly those identified in the warning letter—present complex geometries: curved surfaces, corners, recesses, and irregular topographies where rigid agar cannot conform to achieve complete surface contact.

The microbial recovery implications are significant. When a contact plate fails to achieve complete surface contact, microorganisms in uncontacted areas remain unsampled. The result is a false-negative environmental monitoring reading that suggests contamination control while actual contamination persists undetected. Worse, the “sweeping sampling motion” described in the warning letter—moving a single contact plate across multiple locations—creates the additional problem the FDA identified: inability to attribute any recovered contamination to a specific surface. Was the contamination on the first component contacted? The third? Somewhere in between? This sampling approach provides data too imprecise for meaningful contamination source investigation.

The alternative—swab sampling—addresses both deficiencies. Swabs conform to irregular surfaces, accessing corners, recesses, and curved topographies that contact plates cannot reach. Swabs can be applied to specific, discrete locations, enabling precise attribution of any contamination recovered to a particular surface. The trade-off is operational: swab sampling requires more time, involves additional manipulative steps within Grade A environments, and demands different operator technique validation.​

Yet the Catalent warning letter makes clear that this operational inconvenience doesn’t justify compromised detection capability for critical product-contact surfaces. The FDA’s expectation—acknowledged in Catalent’s response—is swab sampling “to replace use of contact plates to sample irregular surfaces”. This represents a fundamental shift from convenience-optimized to detection-optimized environmental monitoring.​

What should a risk-based surface sampling strategy look like? The differentiation should be based on surface geometry and criticality:

Contact plates remain appropriate for flat, smooth, readily accessible surfaces where complete agar contact can be verified and where contamination risk is lower (Grade B floors, isolator walls, equipment external surfaces). The speed and simplicity advantages of contact plates justify their continued use in these applications.

Swab sampling should be mandatory for product-contact surfaces, irregular geometries, recessed areas, and any location where contact plate conformity is questionable. This includes filling needles, stopper bowls, vial transport mechanisms, crimping heads, and the specific equipment components cited in the Catalent letter. The additional time required for swab sampling is trivial compared to the contamination risk from inadequate monitoring.

Surface sampling protocols must specify the exact location sampled, not general equipment categories. Rather than “sample stopper bowl,” protocols should identify “internal rim of stopper bowl,” “external base of stopper bowl,” “stopper agitation mechanism interior surfaces.” This specificity enables contamination source attribution during investigations and ensures sampling actually reaches the highest-risk surfaces.

Swab technique must be validated to ensure consistent recovery from target surfaces. Simply switching from contact plates to swabs doesn’t guarantee improved detection unless swab technique—pressure applied, surface area contacted, swab saturation, transfer to growth media—is standardized and demonstrated to achieve adequate microbial recovery from the specific materials and geometries being sampled.​

The EU GMP Annex 1 and FDA guidance documents emphasize detection capability over convenience in environmental monitoring. The expectation isn’t perfect contamination prevention—that’s impossible in aseptic processing—but rather monitoring systems sensitive enough to detect contamination events when they occur, enabling investigation and corrective action before product impact. Contact plates on irregular surfaces fail this standard by design, not because of operator error or inadequate validation but because the fundamental methodology cannot access the surfaces requiring monitoring.​

The Intervention Paradox: When Risk Assessments Identify Hazards But Operations Ignore Them

Perhaps the most troubling element of the Catalent contamination hazards section isn’t the presence of occluded surfaces or inadequate sampling methods but rather the intervention management failure that reveals a disconnect between risk assessment and operational decision-making. Catalent’s risk assessments explicitly “advised against interventions that can disturb potentially occluded surfaces,” yet these high-risk interventions continued during production campaigns.​

This represents what I’ve termed “investigation theatre” in previous posts—creating the superficial appearance of risk-based decision-making while actual operations proceed according to production convenience rather than contamination risk mitigation. The risk assessment identified the hazard. The environmental monitoring data confirmed the hazard when contamination occurred during the intervention. Yet the intervention continued as an accepted operational practice.​

The specific intervention involved equipment changes to components “integral to stopper seating in the [filling line]”. These components operate at the critical interface between the sterile stopper and the vial—precisely the location where any contamination poses direct product impact risk. The intervention occurred during production campaigns rather than between campaigns when comprehensive decontamination and validation could occur. The intervention involved surfaces potentially occluded during VHP decontamination, meaning their microbiological state was unknown when introduced into the Grade A filling environment.​

Every element of this scenario screams “unacceptable contamination risk,” yet it persisted as accepted practice until FDA inspection. How does this happen? Based on my experience across multiple aseptic facilities, the failure mode follows a predictable pattern:

Production scheduling drives intervention timing rather than contamination risk assessment. Stopping a campaign for equipment maintenance creates schedule disruption, yield loss, and capacity constraints. The pressure to maintain campaign continuity overwhelms contamination risk considerations that appear theoretical compared to the immediate, quantifiable production impact.

Risk assessments become compliance artifacts disconnected from operational decision-making. The quality unit conducts a risk assessment, documents that certain interventions pose unacceptable contamination risk, and files the assessment. But when production encounters the situation requiring that intervention, the actual decision-making process references production need, equipment availability, and batch schedules—not the risk assessment that identified the intervention as high-risk.

Interventions become “normalized deviance”—accepted operational practices despite documented risks. After performing a high-risk intervention successfully (meaning without detected contamination) multiple times, it transitions from “high-risk intervention requiring exceptional controls” to “routine intervention” in operational thinking. The fact that adequate controls prevented contamination detection gets inverted into evidence that the intervention isn’t actually high-risk.

Environmental monitoring provides false assurance when contamination goes undetected. If a high-risk intervention occurs and subsequent environmental monitoring shows no contamination, operations interprets this as validation that the intervention is acceptable. But as discussed in the contact plate section, inadequate sampling methodology may fail to detect contamination that actually occurred. The absence of detected contamination becomes “proof” that contamination didn’t occur, reinforcing the normalization of high-risk interventions.

The EU GMP Annex 1 requirements for intervention management represent regulatory recognition of these failure modes. Annex 1 Section 8.16 requires “the list of interventions evaluated via risk analysis” and Section 9.36 requires that aseptic process simulations include “interventions and associated risks”. The framework is explicit: identify interventions, assess their contamination risk, validate that operators can perform them aseptically through media fills, and eliminate interventions that cannot be performed without unacceptable contamination risk.​

What does robust intervention risk management look like in practice?

Categorize interventions by contamination risk based on specific, documented criteria. The categorization should consider: surfaces contacted (sterile-to-sterile vs. potentially contaminated), duration of exposure, proximity to open product, operator actions required, first air protection feasibility, and frequency. This creates a risk hierarchy that enables differentiated control strategies rather than treating all interventions equivalently.​

Establish clear decision authorities for different intervention risk levels. Routine interventions (low contamination risk, validated through media fills, performed regularly) can proceed under operator judgment following standard procedures. High-risk interventions (those involving occluded surfaces, extended exposure, or proximity to open product) should require quality unit pre-approval including documented risk assessment and enhanced controls specification. Interventions identified as posing unacceptable risk should be prohibited until equipment redesign or process modification eliminates the contamination hazard.​

Validate intervention execution through media fills that specifically simulate the intervention’s contamination challenges. Generic media fills demonstrating overall aseptic processing capability don’t validate specific high-risk interventions. If your risk assessment identifies a particular intervention as posing contamination risk, your media fill program must include that intervention, performed by the operators who will execute it, under the conditions (campaign timing, equipment state, environmental conditions) where it will actually occur.​

Implement intervention-specific environmental monitoring that targets the contamination pathways identified in risk assessments. If the risk assessment identifies that an intervention may expose product to surfaces not reliably decontaminated, environmental monitoring immediately following that intervention should specifically sample those surfaces and adjacent areas. Trending this intervention-specific monitoring data separately from routine environmental monitoring enables detection of intervention-associated contamination patterns.​

Conduct post-intervention investigations when environmental monitoring shows any deviation. The Catalent warning letter describes an environmental monitoring failure whose “most probable root cause” was an atypical intervention involving equipment changes. This temporal association between intervention and contamination should trigger automatic investigation even if environmental monitoring results remain within action levels. The investigation should assess whether intervention protocols require modification or whether the intervention should be eliminated.​

The FDA’s remediation demand addresses this gap directly: “review of currently permitted interventions and elimination of high-risk interventions entailing equipment manipulations during production campaigns that expose the ISO 5 environment to surfaces not exposed to a validated decontamination process”. This requirement forces facilities to confront the intervention paradox: if your risk assessment identifies an intervention as high-risk, you cannot simultaneously permit it as routine operational practice. Either modify the intervention to reduce risk, validate enhanced controls that mitigate the risk, or eliminate the intervention entirely.​

Media Fill Terminations: When Failures Become Invisible

The Catalent warning letter’s discussion of media fill terminations exposes an investigation failure mode that reveals deeper quality system inadequacies. Since November 2023, Catalent terminated more than five media fill batches representing the filling line. Following two terminations for stoppering issues and extrinsic particle contamination, the facility “failed to open a deviation or an investigation at the time of each failure, as required by your SOPs”.​

Read that again. Media fills—the fundamental aseptic processing validation tool, the simulation specifically designed to challenge contamination control—were terminated due to failures, and no deviation was opened, no investigation initiated. The failures simply disappeared from the quality system, becoming invisible until FDA inspection revealed their existence.

The rationalization is predictable: “there was no impact to the SISPQ (Safety, Identity, Strength, Purity, Quality) of the terminated media batches or to any customer batches” because “these media fills were re-executed successfully with passing results”. This reasoning exposes a fundamental misunderstanding of media fill purpose that I’ve encountered with troubling frequency across the industry.​

A media fill is not a “test” that you pass or fail with product consequences. It is a simulation—a deliberate challenge to your aseptic processing capability using growth medium instead of product specifically to identify contamination risks without product impact. When a media fill is terminated due to a processing failure, that termination is itself the critical finding. The termination reveals that your process is vulnerable to exactly the failure mode that caused termination: stoppering problems that could occur during commercial filling, extrinsic particles that could contaminate product.

The FDA’s response is appropriately uncompromising: “You do not provide the investigations with a root cause that justifies aborting and re-executing the media fills, nor do you provide the corrective actions taken for each terminated media fill to ensure effective CAPAs were promptly initiated”. The regulatory expectation is clear: media fill terminations require investigation identical in rigor to commercial batch failures. Why did the stoppering issue occur? What equipment, material, or operator factors contributed? How do we prevent recurrence? What commercial batches may have experienced similar failures that went undetected?​

The re-execution logic is particularly insidious. By immediately re-running the media fill and achieving passing results, Catalent created the appearance of successful validation while ignoring the process vulnerability revealed by the termination. The successful re-execution proved only that under ideal conditions—now with heightened operator awareness following the initial failure—the process could be executed successfully. It provided no assurance that commercial operations, without that heightened awareness and under the same conditions that caused the initial termination, wouldn’t experience identical failures.

What should media fill termination management look like?

Treat every media fill termination as a critical deviation requiring immediate investigation initiation. The investigation should identify the root cause of the termination, assess whether the failure mode could occur during commercial manufacturing, evaluate whether previous commercial batches may have experienced similar failures, and establish corrective actions that prevent recurrence. This investigation must occur before re-execution, not instead of investigation.​

Require quality unit approval before media fill re-execution. The approval should be based on documented investigation findings demonstrating that the termination cause is understood, corrective actions are implemented, and re-execution will validate process capability under conditions that include the corrective actions. Re-execution without investigation approval perpetuates the “keep running until we get a pass” mentality that defeats media fill purpose.​

Implement media fill termination trending as a critical quality indicator. A facility terminating “more than five media fill batches” in a period should recognize this as a signal of fundamental process capability problems, not as a series of unrelated events requiring re-execution. Trending should identify common factors: specific operators, equipment states, intervention types, campaign timing.​

Ensure deviation tracking systems cannot exclude media fill terminations. The Catalent situation arose partly because “you failed to initiate a deviation record to capture the lack of an investigation for each of the terminated media fills, resulting in an undercounting of the deviations”. Quality metrics that exclude media fill terminations from deviation totals create perverse incentives to avoid formal deviation documentation, rendering media fill findings invisible to quality system oversight.​

The broader issue extends beyond media fill terminations to how aseptic processing validation integrates with quality systems. Media fills should function as early warning indicators—detecting aseptic processing vulnerabilities before product impact occurs. But this detection value requires that findings from media fills drive investigations, corrective actions, and process improvements with the same rigor as commercial batch deviations. When media fill failures can be erased through re-execution without investigation, the entire validation framework becomes performative rather than protective.

The Stopper Supplier Qualification Failure: Accepting Contamination at the Source

The stopper contamination issues discussed throughout the warning letter—mammalian hair found in or around stopper regions of vials from nearly 20 batches across multiple products—reveal a supplier qualification and incoming inspection failure that compounds the contamination hazards already discussed. The FDA’s critique focuses on Catalent’s “inappropriate reliance on pre-shipment samples (tailgate samples)” and failure to implement “enhanced or comparative sampling of stoppers from your other suppliers”.​

The pre-shipment or “tailgate” sample approach represents a fundamental violation of GMP sampling principles. Under this approach, the stopper supplier—not Catalent—collected samples from lots prior to shipment and sent these samples directly to Catalent for quality testing. Catalent then made accept/reject decisions for incoming stopper lots based on testing of supplier-selected samples that never passed through Catalent’s receiving or storage processes.​

Why does this matter? Because representative sampling requires that samples be selected from the material population actually received by the facility, stored under facility conditions, and handled through facility processes. Supplier-selected pre-shipment samples bypass every opportunity to detect contamination introduced during shipping, storage transitions, or handling. They enable a supplier to selectively sample from cleaner portions of production lots while shipping potentially contaminated material in the same lot to the customer.

The FDA guidance on this issue is explicit and has been for decades: samples for quality attribute testing “are to be taken at your facility from containers after receipt to ensure they are representative of the components in question”. This isn’t a new expectation emerging from enhanced regulatory scrutiny—it’s a baseline GMP requirement that Catalent systematically violated through reliance on tailgate samples.​

But the tailgate sample issue represents only one element of broader supplier qualification failures. The warning letter notes that “while stoppers from [one supplier] were the primary source of extrinsic particles, they were not the only source of foreign matter.” Yet Catalent implemented “limited, enhanced sampling strategy for one of your suppliers” while failing to “increase sampling oversight” for other suppliers. This selective enhancement—focusing remediation only on the most problematic supplier while ignoring systemic contamination risks across the stopper supply base—predictably failed to resolve ongoing contamination issues.​

What should stopper supplier qualification and incoming inspection look like for aseptic filling operations?

Eliminate pre-shipment or tailgate sampling entirely. All quality testing must be conducted on samples taken from received lots, stored in facility conditions, and selected using documented random sampling procedures. If suppliers require pre-shipment testing for their internal quality release, that’s their process requirement—it doesn’t substitute for the purchaser’s independent incoming inspection using facility-sampled material.​

Implement risk-based incoming inspection that intensifies sampling when contamination history indicates elevated risk. The warning letter notes that Catalent recognized stoppers as “a possible contributing factor for contamination with mammalian hairs” in July 2024 but didn’t implement enhanced sampling until May 2025—a ten-month delay. The inspection enhancement should be automatic and immediate when contamination events implicate incoming materials. The sampling intensity should remain elevated until trending data demonstrates sustained contamination reduction across multiple lots.​

Apply visual inspection with reject criteria specific to the defect types that create product contamination risk. Generic visual inspection looking for general “defects” fails to detect the specific contamination types—embedded hair, extrinsic particles, material fragments—that create sterile product risks. Inspection protocols must specify mammalian hair, fiber contamination, and particulate matter as reject criteria with sensitivity adequate to detect single-particle contamination in sampled stoppers.​

Require supplier process changes—not just enhanced sampling—when contamination trends indicate process capability problems. The warning letter acknowledges Catalent “worked with your suppliers to reduce the likelihood of mammalian hair contamination events” but notes that despite these efforts, “you continued to receive complaints from customers who observed mammalian hair contamination in drug products they received from you”. Enhanced sampling detects contamination; it doesn’t prevent it. Suppliers demonstrating persistent contamination require process audits, environmental control improvements, and validated contamination reduction demonstrated through process capability studies—not just promises to improve quality.​

Implement finished product visual inspection with heightened sensitivity for products using stoppers from suppliers with contamination history. The FDA notes that Catalent indicated “future batches found during visual inspection of finished drug products would undergo a re-inspection followed by tightened acceptable quality limit to ensure defective units would be removed” but didn’t provide the re-inspection procedure. This two-stage inspection approach—initial inspection followed by re-inspection with enhanced criteria for lots from high-risk suppliers—provides additional contamination detection but must be validated to demonstrate adequate defect removal.​

The broader lesson extends beyond stoppers to supplier qualification for any component used in sterile manufacturing. Components introduce contamination risks—microbial bioburden, particulate matter, chemical residues—that cannot be fully mitigated through end-product testing. Supplier qualification must function as a contamination prevention tool, ensuring that materials entering aseptic operations meet microbiological and particulate quality standards appropriate for their role in maintaining sterility. Reliance on tailgate samples, delayed sampling enhancement, and acceptance of persistent supplier contamination all represent failures to recognize suppliers as critical contamination control points requiring rigorous qualification and oversight.

The Systemic Pattern: From Contamination Hazards to Quality System Architecture

Stepping back from individual contamination hazards—occluded surfaces, inadequate sampling, high-risk interventions, media fill terminations, supplier qualification failures—a systemic pattern emerges that connects this warning letter to the broader zemblanity framework I’ve explored in previous posts. These aren’t independent, unrelated deficiencies that coincidentally occurred at the same facility. They represent interconnected architectural failures in how the quality system approaches contamination control.​

The pattern reveals itself through three consistent characteristics:

Detection systems optimized for convenience rather than capability. Contact plates instead of swabs for irregular surfaces. Pre-shipment samples instead of facility-based incoming inspection. Generic visual inspection instead of defect-specific contamination screening. Each choice prioritizes operational ease and workflow efficiency over contamination detection sensitivity. The result is a quality system that generates reassuring data—passing environmental monitoring, acceptable incoming inspection results, successful visual inspection—while actual contamination persists undetected.

Risk assessments that identify hazards without preventing their occurrence. Catalent’s risk assessments advised against interventions disturbing potentially occluded surfaces, yet these interventions continued. The facility recognized stoppers as contamination sources in July 2024 but delayed enhanced sampling until May 2025. Media fill terminations revealed aseptic processing vulnerabilities but triggered re-execution rather than investigation. Risk identification became separated from risk mitigation—the assessment process functioned as compliance theatre rather than decision-making input.​

Investigation systems that erase failures rather than learn from them. Media fill terminations occurred without deviation initiation. Mammalian hair contamination events were investigated individually without recognizing the trend across 20+ deviations. Root cause investigations concluded “no product impact” based on passing sterility tests rather than addressing the contamination source enabling future events. The investigation framework optimized for batch release justification rather than contamination prevention.​

These patterns don’t emerge from incompetent quality professionals or inadequate resource allocation. They emerge from quality system design choices that prioritize production efficiency, workflow continuity, and batch release over contamination detection, investigation rigor, and source elimination. The system delivers what it was designed to deliver: maximum throughput with minimum disruption. It fails to deliver what patients require: contamination control capable of detecting and eliminating sterility risks before product impact.

Recommendations: Building Contamination Hazard Detection Into System Architecture

What does effective contamination hazard management look like at the quality system architecture level? Based on the Catalent failures and broader industry patterns, several principles should guide aseptic operations:

Design decontamination validation around worst-case geometries, not ideal conditions. VHP validation using flat coupons on horizontal surfaces tells you nothing about vapor penetration into the complex geometries, wrapped components, and recessed surfaces actually present in your filling line. Biological indicator placement should target occluded surfaces specifically—if you can’t achieve validated kill on these locations, they’re contamination hazards requiring design modification or alternative decontamination methods.

Select environmental monitoring methods based on detection capability for the surfaces and conditions actually requiring monitoring. Contact plates are adequate for flat, smooth surfaces. They’re inadequate for irregular product-contact surfaces, recessed areas, and complex geometries. Swab sampling takes more time but provides contamination detection capability that contact plates cannot match. The operational convenience sacrifice is trivial compared to the contamination risk from monitoring methods incapable of detecting contamination when it occurs.​

Establish intervention risk classification with decision authorities proportional to contamination risk. Routine low-risk interventions validated through media fills can proceed under operator judgment. High-risk interventions—those involving occluded surfaces, extended exposure, or proximity to open product—require quality unit pre-approval with documented enhanced controls. Interventions identified as posing unacceptable risk should be prohibited pending equipment redesign.​

Treat media fill terminations as critical deviations requiring investigation before re-execution. The termination reveals process vulnerability—the investigation must identify root cause, assess commercial batch risk, and establish corrective actions before validation continues. Re-execution without investigation perpetuates the failures that caused termination.​

Implement supplier qualification with facility-based sampling, contamination-specific inspection criteria, and automatic sampling enhancement when contamination trends emerge. Tailgate samples cannot provide representative material assessment. Visual inspection must target the specific contamination types—mammalian hair, particulate matter, material fragments—that create product risks. Enhanced sampling should be automatic and sustained when contamination history indicates elevated risk.​

Build investigation systems that learn from contamination events rather than erasing them through re-execution or “no product impact” conclusions. Contamination events represent failures in contamination control regardless of whether subsequent testing shows product remains within specification. The investigation purpose is preventing recurrence, not justifying release.​

The FDA’s comprehensive remediation demands represent what quality system architecture should look like: independent assessment of investigation capability, CAPA effectiveness evaluation, contamination hazard risk assessment covering material flows and equipment placement, detailed remediation with specific improvements, and ongoing management oversight throughout the manufacturing lifecycle.​

The Contamination Control Strategy as Living System

The Catalent warning letter’s contamination hazards section serves as a case study in how quality systems can simultaneously maintain surface-level compliance while allowing fundamental contamination control failures to persist. The facility conducted VHP decontamination cycles, performed environmental monitoring, executed media fills, and inspected incoming materials—checking every compliance box. Yet contamination hazards proliferated because these activities optimized for operational convenience and batch release justification rather than contamination detection and source elimination.

The EU GMP Annex 1 Contamination Control Strategy requirement represents regulatory recognition that contamination control cannot be achieved through isolated compliance activities. It requires integrated systems where facility design, decontamination processes, environmental monitoring, intervention protocols, material qualification, and investigation practices function cohesively to detect, investigate, and eliminate contamination sources. The Catalent failures reveal what happens when these elements remain disconnected: decontamination cycles that don’t reach occluded surfaces, monitoring that can’t detect contamination on irregular geometries, interventions that proceed despite identified risks, investigations that erase failures through re-execution​

For those of us responsible for contamination control in aseptic manufacturing, the question isn’t whether our facilities face similar vulnerabilities—they do. The question is whether our quality systems are architected to detect these vulnerabilities before regulators discover them. Are your VHP validations addressing actual occluded surfaces or ideal flat coupons? Are you using contact plates because they detect contamination effectively or because they’re operationally convenient? Do your intervention protocols prevent the high-risk activities your risk assessments identify? When media fills terminate, do investigations occur before re-execution?

The Catalent warning letter provides a diagnostic framework for assessing contamination hazard management. Use it. Map your own decontamination validation against the occluded surface criteria. Evaluate your environmental monitoring method selection against detection capability requirements. Review intervention protocols for alignment with risk assessments. Examine media fill termination handling for investigation rigor. Assess supplier qualification for facility-based sampling and contamination-specific inspection.

The contamination hazards are already present in your aseptic operations. The question is whether your quality system architecture can detect them.

When Water Systems Fail: Unpacking the LeMaitre Vascular Warning Letter

The FDA’s August 11, 2025 warning letter to LeMaitre Vascular reads like a masterclass in how fundamental water system deficiencies can cascade into comprehensive quality system failures. This warning letter offers lessons about the interconnected nature of pharmaceutical water systems and the regulatory expectations that surround them.

The Foundation Cracks

What makes this warning letter particularly instructive is how it demonstrates that water systems aren’t just utilities—they’re critical manufacturing infrastructure whose failures ripple through every aspect of product quality. LeMaitre’s North Brunswick facility, which manufactures Artegraft Collagen Vascular Grafts, found itself facing six major violations, with water system inadequacies serving as the primary catalyst.

The Artegraft device itself—a bovine carotid artery graft processed through enzymatic digestion and preserved in USP purified water and ethyl alcohol—places unique demands on water system reliability. When that foundation fails, everything built upon it becomes suspect.

Water Sampling: The Devil in the Details

The first violation strikes at something discussed extensively in previous posts: representative sampling. LeMaitre’s USP water sampling procedures contained what the FDA termed “inconsistent and conflicting requirements” that fundamentally compromised the representativeness of their sampling.

Consider the regulatory expectation here. As outlined in ISPE guideline, “sampling a POU must include any pathway that the water travels to reach the process”. Yet LeMaitre was taking samples through methods that included purging, flushing, and disinfection steps that bore no resemblance to actual production use. This isn’t just a procedural misstep—it’s a fundamental misunderstanding of what water sampling is meant to accomplish.

The FDA’s criticism centers on three critical sampling failures:

  • Sampling Location Discrepancies: Taking samples through different pathways than production water actually follows. This violates the basic principle that quality control sampling should “mimic the way the water is used for manufacturing”.
  • Pre-Sampling Conditioning: The procedures required extensive purging and cleaning before sampling—activities that would never occur during normal production use. This creates “aspirational data”—results that reflect what we wish our system looked like rather than how it actually performs.
  • Inconsistent Documentation: Failure to document required replacement activities during sampling, creating gaps in the very records meant to demonstrate control.

The Sterilant Switcheroo

Perhaps more concerning was LeMaitre’s unauthorized change of sterilant solutions for their USP water system sanitization. The company switched sterilants sometime in 2024 without documenting the change control, assessing biocompatibility impacts, or evaluating potential contaminant differences.

This represents a fundamental failure in change control—one of the most basic requirements in pharmaceutical manufacturing. Every change to a validated system requires formal assessment, particularly when that change could affect product safety. The fact that LeMaitre couldn’t provide documentation allowing for this change during inspection suggests a broader systemic issue with their change control processes.

Environmental Monitoring: Missing the Forest for the Trees

The second major violation addressed LeMaitre’s environmental monitoring program—specifically, their practice of cleaning surfaces before sampling. This mirrors issues we see repeatedly in pharmaceutical manufacturing, where the desire for “good” data overrides the need for representative data.

Environmental monitoring serves a specific purpose: to detect contamination that could reasonably be expected to occur during normal operations. When you clean surfaces before sampling, you’re essentially asking, “How clean can we make things when we try really hard?” rather than “How clean are things under normal operating conditions?”

The regulatory expectation is clear: environmental monitoring should reflect actual production conditions, including normal personnel traffic and operational activities. LeMaitre’s procedures required cleaning surfaces and minimizing personnel traffic around air samplers—creating an artificial environment that bore little resemblance to actual production conditions.

Sterilization Validation: Building on Shaky Ground

The third violation highlighted inadequate sterilization process validation for the Artegraft products. LeMaitre failed to consider bioburden of raw materials, their storage conditions, and environmental controls during manufacturing—all fundamental requirements for sterilization validation.

This connects directly back to the water system failures. When your water system monitoring doesn’t provide representative data, and your environmental monitoring doesn’t reflect actual conditions, how can you adequately assess the bioburden challenges your sterilization process must overcome?

The FDA noted that LeMaitre had six out-of-specification bioburden results between September 2024 and March 2025, yet took no action to evaluate whether testing frequency should be increased. This represents a fundamental misunderstanding of how bioburden data should inform sterilization validation and ongoing process control.

CAPA: When Process Discipline Breaks Down

The final violations addressed LeMaitre’s Corrective and Preventive Action (CAPA) system, where multiple CAPAs exceeded their own established timeframes by significant margins. A high-risk CAPA took 81 days instead of the required timeframe, while medium and low-risk CAPAs exceeded deadlines by 120-216 days.

This isn’t just about missing deadlines—it’s about the erosion of process discipline. When CAPA systems lose their urgency and rigor, it signals a broader cultural issue where quality requirements become suggestions rather than requirements.

The Recall That Wasn’t

Perhaps most concerning was LeMaitre’s failure to report a device recall to the FDA. The company distributed grafts manufactured using raw material from a non-approved supplier, with one graft implanted in a patient before the recall was initiated. This constituted a reportable removal under 21 CFR Part 806, yet LeMaitre failed to notify the FDA as required.

This represents the ultimate failure: when quality system breakdowns reach patients. The cascade from water system failures to inadequate environmental monitoring to poor change control ultimately resulted in a product safety issue that required patient intervention.

Gap Assessment Questions

For organizations conducting their own gap assessments based on this warning letter, consider these critical questions:

Water System Controls

  • Are your water sampling procedures representative of actual production use conditions?
  • Do you have documented change control for any modifications to water system sterilants or sanitization procedures?
  • Are all water system sampling activities properly documented, including any maintenance or replacement activities?
  • Have you assessed the impact of any sterilant changes on product biocompatibility?

Environmental Monitoring

  • Do your environmental monitoring procedures reflect normal production conditions?
  • Are surfaces cleaned before environmental sampling, and if so, is this representative of normal operations?
  • Does your environmental monitoring capture the impact of actual personnel traffic and operational activities?
  • Are your sampling frequencies and locations justified by risk assessment?

Sterilization and Bioburden Control

  • Does your sterilization validation consider bioburden from all raw materials and components?
  • Have you established appropriate bioburden testing frequencies based on historical data and risk assessment?
  • Do you have procedures for evaluating when bioburden testing frequency should be increased based on out-of-specification results?
  • Are bioburden results from raw materials and packaging components included in your sterilization validation?

CAPA System Integrity

  • Are CAPA timelines consistently met according to your established procedures?
  • Do you have documented rationales for any CAPA deadline extensions?
  • Is CAPA effectiveness verification consistently performed and documented?
  • Are supplier corrective actions properly tracked and their effectiveness verified?

Change Control and Documentation

  • Are all changes to validated systems properly documented and assessed?
  • Do you have procedures for notifying relevant departments when suppliers change materials or processes?
  • Are the impacts of changes on product quality and safety systematically evaluated?
  • Is there a formal process for assessing when changes require revalidation?

Regulatory Compliance

  • Are all required reports (corrections, removals, MDRs) submitted within regulatory timeframes?
  • Do you have systems in place to identify when product removals constitute reportable events?
  • Are all regulatory communications properly documented and tracked?

Learning from LeMaitre’s Missteps

This warning letter serves as a reminder that pharmaceutical manufacturing is a system of interconnected controls, where failures in fundamental areas like water systems can cascade through every aspect of operations. The path from water sampling deficiencies to patient safety issues is shorter than many organizations realize.

The most sobering aspect of this warning letter is how preventable these violations were. Representative sampling, proper change control, and timely CAPA completion aren’t cutting-edge regulatory science—they’re fundamental GMP requirements that have been established for decades.

For quality professionals, this warning letter reinforces the importance of treating utility systems with the same rigor we apply to manufacturing processes. Water isn’t just a raw material—it’s a critical quality attribute that deserves the same level of control, monitoring, and validation as any other aspect of your manufacturing process.

The question isn’t whether your water system works when everything goes perfectly. The question is whether your monitoring and control systems will detect problems before they become patient safety issues. Based on LeMaitre’s experience, that’s a question worth asking—and answering—before the FDA does it for you.

Causal Reasoning: A Transformative Approach to Root Cause Analysis

Energy Safety Canada recently published a white paper on causal reasoning that offers valuable insights for quality professionals across industries. As someone who has spent decades examining how we investigate deviations and perform root cause analysis, I found their framework refreshing and remarkably aligned with the challenges we face in pharmaceutical quality. The paper proposes a fundamental shift in how we approach investigations, moving from what they call “negative reasoning” to “causal reasoning” that could significantly improve our ability to prevent recurring issues and drive meaningful improvement.

The Problem with Traditional Root Cause Analysis

Many of us in quality have experienced the frustration of seeing the same types of deviations recur despite thorough investigations and seemingly robust CAPAs. The Energy Safety Canada white paper offers a compelling explanation for this phenomenon: our investigations often focus on what did not happen rather than what actually occurred.

This approach, which the authors term “negative reasoning,” leads investigators to identify counterfactuals-things that did not occur, such as “operators not following procedures” or “personnel not stopping work when they should have”. The problem is fundamental: what was not happening cannot create the outcomes we experienced. As the authors aptly state, these counterfactuals “exist only in retrospection and never actually influenced events,” yet they dominate many of our investigation conclusions.

This insight resonates strongly with what I’ve observed in pharmaceutical quality. Six years ago the MHRA’s 2019 citation of 210 companies for inadequate root cause analysis and CAPA development – including 6 critical findings – takes on renewed significance in light of Sanofi’s 2025 FDA warning letter. While most cited organizations likely believed their investigation processes were robust (as Sanofi presumably did before their contamination failures surfaced), these parallel cases across regulatory bodies and years expose a persistent industry-wide disconnect between perceived and actual investigation effectiveness. These continued failures exemplify how superficial root cause analysis creates dangerous illusions of control – precisely the systemic flaw the MHRA data highlighted six years prior.

Negative Reasoning vs. Causal Reasoning: A Critical Distinction

The white paper makes a distinction that I find particularly valuable: negative reasoning seeks to explain outcomes based on what was missing from the system, while causal reasoning looks for what was actually present or what happened. This difference may seem subtle, but it fundamentally changes the nature and outcomes of our investigations.

When we use negative reasoning, we create what the white paper calls “an illusion of cause without being causal”. We identify things like “failure to follow procedures” or “inadequate risk assessment,” which may feel satisfying but don’t explain why those conditions existed in the first place. These conclusions often lead to generic corrective actions that fail to address underlying issues.

In contrast, causal reasoning requires statements that have time, place, and magnitude. It focuses on what was necessary and sufficient to create the effect, building a logically tight cause-and-effect diagram. This approach helps reveal how work is actually done rather than comparing reality to an imagined ideal.

This distinction parallels the gap between “work-as-imagined” (the black line) and “work-as-done” (the blue line). Too often, our investigations focus only on deviations from work-as-imagined without trying to understand why work-as-done developed differently.

A Tale of Two Analyses: The Power of Causal Reasoning

The white paper presents a compelling case study involving a propane release and operator injury that illustrates the difference between these two approaches. When initially analyzed through negative reasoning, investigators concluded the operator:

  • Used an improper tool
  • Deviated from good practice
  • Failed to recognize hazards
  • Failed to learn from past experiences

These conclusions placed blame squarely on the individual and led leadership to consider terminating the operator.

However, when the same incident was examined through causal reasoning, a different picture emerged:

  • The operator used the pipe wrench because it was available at the pump specifically for this purpose
  • The pipe wrench had been deliberately left at that location because operators knew the valve was hard to close
  • The operator acted quickly because he perceived a risk to the plant and colleagues
  • Leadership had actually endorsed this workaround four years earlier during a turnaround

This causally reasoned analysis revealed that what appeared to be an individual failure was actually a system-level issue that had been normalized over time. Rather than punishing the operator, leadership recognized their own role in creating the conditions for the incident and implemented systemic improvements.

This example reminded me of our discussions on barrier analysis, where we examine barriers that failed, weren’t used, or didn’t exist. But causal reasoning takes this further by exploring why those conditions existed in the first place, creating a much richer understanding of how work actually happens.

First 24 Hours: Where Causal Reasoning Meets The Golden Day

In my recent post on “The Golden Start to a Deviation Investigation,” I emphasized how critical the first 24 hours are after discovering a deviation. This initial window represents our best opportunity to capture accurate information and set the stage for a successful investigation. The Energy Safety Canada white paper complements this concept perfectly by providing guidance on how to use those critical hours effectively.

When we apply causal reasoning during these early stages, we focus on collecting specific, factual information about what actually occurred rather than immediately jumping to what should have happened. This means documenting events with specificity (time, place, magnitude) and avoiding premature judgments about deviations from procedures or expectations.

As I’ve previously noted, clear and precise problem definition forms the foundation of any effective investigation. Causal reasoning enhances this process by ensuring we document using specific, factual language that describes what occurred rather than what didn’t happen. This creates a much stronger foundation for the entire investigation.

Beyond Human Error: System Thinking and Leadership’s Role

One of the most persistent challenges in our field is the tendency to attribute events to “human error.” As I’ve discussed before, when human error is suspected or identified as the cause, this should be justified only after ensuring that process, procedural, or system-based errors have not been overlooked. The white paper reinforces this point, noting that human actions and decisions are influenced by the system in which people work.

In fact, the paper presents a hierarchy of causes that resonates strongly with systems thinking principles I’ve advocated for previously. Outcomes arise from physical mechanisms influenced by human actions and decisions, which are in turn governed by systemic factors. If we only address physical mechanisms or human behaviors without changing the system, performance will eventually migrate back to where it has always been.

This connects directly to what I’ve written about quality culture being fundamental to providing quality. The white paper emphasizes that leadership involvement is directly correlated with performance improvement. When leaders engage to set conditions and provide resources, they create an environment where investigations can reveal systemic issues rather than just identify procedural deviations or human errors.

Implementing Causal Reasoning in Pharmaceutical Quality

For pharmaceutical quality professionals looking to implement causal reasoning in their investigation processes, I recommend starting with these practical steps:

1. Develop Investigator Competencies

As I’ve discussed in my analysis of Sanofi’s FDA warning letter, having competent investigators is crucial. Organizations should:

  • Define required competencies for investigators
  • Provide comprehensive training on causal reasoning techniques
  • Implement mentoring programs for new investigators
  • Regularly assess and refresh investigator skills

2. Shift from Counterfactuals to Causal Statements

Review your recent investigations and look for counterfactual statements like “operators did not follow the procedure.” Replace these with causal statements that describe what actually happened and why it made sense to the people involved at the time.

3. Implement a Sponsor-Driven Approach

The white paper emphasizes the importance of investigation sponsors (otherwise known as Area Managers) who set clear conditions and expectations. This aligns perfectly with my belief that quality culture requires alignment between top management behavior and quality system philosophy. Sponsors should:

  • Clearly define the purpose and intent of investigations
  • Specify that a causal reasoning orientation should be used
  • Provide resources and access needed to find data and translate it into causes
  • Remain engaged throughout the investigation process
Infographic capturing the 4 things a sponsor should do above

4. Use Structured Causal Analysis Tools

While the M-based frameworks I’ve discussed previously (4M, 5M, 6M) remain valuable for organizing contributing factors, they should be complemented with tools that support causal reasoning. The Cause-Consequence Analysis (CCA) I described in a recent post offers one such approach, combining elements of fault tree analysis and event tree analysis to provide a holistic view of risk scenarios.

From Understanding to Improvement

The Energy Safety Canada white paper’s emphasis on causal reasoning represents a valuable contribution to how we think about investigations across industries. For pharmaceutical quality professionals, this approach offers a way to move beyond compliance-focused investigations to truly understand how our systems operate and how to improve them.

As the authors note, “The capacity for an investigation to improve performance is dependent on the type of reasoning used by investigators”. By adopting causal reasoning, we can build investigations that reveal how work actually happens rather than simply identifying deviations from how we imagine it should happen.

This approach aligns perfectly with my long-standing belief that without a strong quality culture, people will not be ready to commit and involve themselves fully in building and supporting a robust quality management system. Causal reasoning creates the transparency and learning that form the foundation of such a culture.

I encourage quality professionals to download and read the full white paper, reflect on their current investigation practices, and consider how causal reasoning might enhance their approach to understanding and preventing deviations. The most important questions to consider are:

  1. Do your investigation conclusions focus on what didn’t happen rather than what did?
  2. How often do you identify “human error” without exploring the system conditions that made that error likely?
  3. Are your leaders engaged as sponsors who set conditions for successful investigations?
  4. What barriers exist in your organization that prevent learning from events?

As we continue to evolve our understanding of quality and safety, approaches like causal reasoning offer valuable tools for creating the transparency needed to navigate complexity and drive meaningful improvement.

Understanding the Distinction Between Impact and Risk

Two concepts—impact and risk — are often discussed but sometimes conflated within quality systems. While related, these concepts serve distinct purposes and drive different decisions throughout the quality system. Let’s explore.

The Fundamental Difference: Impact vs. Risk

The difference between impact and risk is fundamental to effective quality management. The difference between impact and risk is critical. Impact is best thought of as ‘What do I need to do to make the change.’ Risk is ‘What could go wrong in making this change?'”

Impact assessment focuses on evaluating the effects of a proposed change on various elements such as documentation, equipment, processes, and training. It helps identify the scope and reach of a change. Risk assessment, by contrast, looks ahead to identify potential failures that might occur due to the change – it’s preventive and focused on possible consequences.

This distinction isn’t merely academic – it directly affects how we approach actions and decisions in our quality systems, impacting core functions of CAPA, Change Control and Management Review.

AspectImpactRisk
DefinitionThe effect or influence a change, event, or deviation has on product quality, process, or systemThe probability and severity of harm or failure occurring as a result of a change, event, or deviation
FocusWhat is affected and to what extent (scope and magnitude of consequences)What could go wrong, how likely it is to happen, and how severe the outcome could be
Assessment TypeEvaluates the direct consequences of an action or eventEvaluates the likelihood and severity of potential adverse outcomes
Typical UseUsed in change control to determine which documents, systems, or processes are impactedUsed to prioritize actions, allocate resources, and implement controls to minimize negative outcomes
MeasurementUsually described qualitatively (e.g., minor, moderate, major, critical)Often quantified by combining probability and impact scores to assign a risk level (e.g., low, medium, high)
ExampleA change in raw material supplier impacts the manufacturing process and documentation.The risk is that the new supplier’s material could fail to meet quality standards, leading to product defects.

Change Control: Different Questions, Different Purposes

Within change management, the PIC/S Recommendation PI 054-1 notes that “In some cases, especially for simple and minor/low risk changes, an impact assessment is sufficient to document the risk-based rationale for a change without the use of more formal risk assessment tools or approaches.”

Impact Assessment in Change Control

  • Determines what documentation requires updating
  • Identifies affected systems, equipment, and processes
  • Establishes validation requirements
  • Determines training needs

Risk Assessment in Change Control

  • Identifies potential failures that could result from the change
  • Evaluates possible consequences to product quality and patient safety
  • Determines likelihood of those consequences occurring
  • Guides preventive measures

A common mistake is conflating these concepts or shortcutting one assessment. For example, companies often rush to designate changes as “like-for-like” without supporting data, effectively bypassing proper risk assessment. This highlights why maintaining the distinction is crucial.

Validation: Complementary Approaches

In validation, the impact-risk distinction shapes our entire approach.

Impact in validation relates to identifying what aspects of product quality could be affected by a system or process. For example, when qualifying manufacturing equipment, we determine which critical quality attributes (CQAs) might be influenced by the equipment’s performance.

Risk assessment in validation explores what could go wrong with the equipment or process that might lead to quality failures. Risk management plays a pivotal role in validation by enabling a risk-based approach to defining validation strategies, ensuring regulatory compliance, mitigating product quality and safety risks, facilitating continuous improvement, and promoting cross-functional collaboration.

In Design Qualification, we verify that the critical aspects (CAs) and critical design elements (CDEs) necessary to control risks identified during the quality risk assessment (QRA) are present in the design. This illustrates how impact assessment (identifying critical aspects) works together with risk assessment (identifying what could go wrong).

When we perform Design Review and Design Qualification, we focus on Critical Aspects: Prioritize design elements that directly impact product quality and patient safety. Here, impact assessment identifies critical aspects, while risk assessment helps prioritize based on potential consequences.

Following Design Qualification, Verification activities such as Installation Qualification (IQ), Operational Qualification (OQ), and Performance Qualification (PQ) serve to confirm that the system or equipment performs as intended under actual operating conditions. Here, impact assessment identifies the specific parameters and functions that must be verified to ensure no critical quality attributes are compromised. Simultaneously, risk assessment guides the selection and extent of tests by focusing on areas with the highest potential for failure or deviation. This dual approach ensures that verification not only confirms the intended impact of the design but also proactively mitigates risks before routine use.

Validation does not end with initial qualification. Continuous Validation involves ongoing monitoring and trending of process performance and product quality to confirm that the validated state is maintained over time. Impact assessment plays a role in identifying which parameters and quality attributes require ongoing scrutiny, while risk assessment helps prioritize monitoring efforts based on the likelihood and severity of potential deviations. This continuous cycle allows quality systems to detect emerging risks early and implement corrective actions promptly, reinforcing a proactive, risk-based culture that safeguards product quality throughout the product lifecycle.

Data Integrity: A Clear Example

Data integrity offers perhaps the clearest illustration of the impact-risk distinction.

As I’ve previously noted, Data quality is not a risk. It is a causal factor in the failure or severity. Poor data quality isn’t itself a risk; rather, it’s a factor that can influence the severity or likelihood of risks.

When assessing data integrity issues:

  • Impact assessment identifies what data is affected and which processes rely on that data
  • Risk assessment evaluates potential consequences of data integrity lapses

In my risk-based data integrity assessment methodology, I use a risk rating system that considers both impact and risk factors:

Risk RatingActionMitigation
>25High Risk-Potential Impact to Patient Safety or Product QualityMandatory
12-25Moderate Risk-No Impact to Patient Safety or Product Quality but Potential Regulatory RiskRecommended
<12Negligible DI RiskNot Required

This system integrates both impact (on patient safety or product quality) and risk (likelihood and detectability of issues) to guide mitigation decisions.

The Golden Day: Impact and Risk in Deviation Management

The Golden Day concept for deviation management provides an excellent practical example. Within the first 24 hours of discovering a deviation, we conduct:

  1. An impact assessment to determine:
    • Which products, materials, or batches are affected
    • Potential effects on critical quality attributes
    • Possible regulatory implications
  2. A risk assessment to evaluate:
    • Patient safety implications
    • Product quality impact
    • Compliance with registered specifications
    • Level of investigation required

This impact assessment is also the initial risk assessment, which will help guide the level of effort put into the deviation. This statement shows how the two concepts, while distinct, work together to inform quality decisions.

Quality Escalation: When Impact Triggers a Response

In quality escalation, we often use specific criteria based on both impact and risk:

Escalation CriteriaExamples of Quality Events for Escalation
Potential to adversely affect quality, safety, efficacy, performance or compliance of product– Contamination – Product defect/deviation from process parameters or specification – Significant GMP deviations
Product counterfeiting, tampering, theft– Product counterfeiting, tampering, theft reportable to Health Authority – Lost/stolen IMP
Product shortage likely to disrupt patient care– Disruption of product supply due to product quality events
Potential to cause patient harm associated with a product quality event– Urgent Safety Measure, Serious Breach, Significant Product Complaint

These criteria demonstrate how we use both impact (what’s affected) and risk (potential consequences) to determine when issues require escalation.

Both Are Essential

Understanding the difference between impact and risk fundamentally changes how we approach quality management. Impact assessment without risk assessment may identify what’s affected but fails to prevent potential issues. Risk assessment without impact assessment might focus on theoretical problems without understanding the actual scope.

The pharmaceutical quality system requires both perspectives:

  1. Impact tells us the scope – what’s affected
  2. Risk tells us the consequences – what could go wrong

By maintaining this distinction and applying both concepts appropriately across change control, validation, and data integrity management, we build more robust quality systems that not only comply with regulations but actually protect product quality and patient safety.

The Golden Start to a Deviation Investigation

How you respond in the first 24 hours after discovering a deviation can make the difference between a minor quality issue and a major compliance problem. This critical window-what I call “The Golden Day”-represents your best opportunity to capture accurate information, contain potential risks, and set the stage for a successful investigation. When managed effectively, this initial day creates the foundation for identifying true root causes and implementing effective corrective actions that protect product quality and patient safety.

Why the First 24 Hours Matter: The Evidence

The initial response to a deviation is crucial for both regulatory compliance and effective problem-solving. Industry practice and regulatory expectations align on the importance of quick, systematic responses to deviations.

  • Regulatory expectations explicitly state that deviation investigation and root cause determination should be completed in a timely manner, and industry expectations usually align on deviations being completed within 30 days of discovery.
  • In the landmark U.S. v. Barr Laboratories case, “the Court declared that all failure investigations must be performed promptly, within thirty business days of the problem’s occurrence”
  • Best practices recommend assembling a cross-functional team immediately after deviation discovery and conduct initial risk assessment within 24 hours”
  • Initial actions taken in the first day directly impact the quality and effectiveness of the entire investigation process

When you capitalize on this golden window, you’re working with fresh memories, intact evidence, and the highest chance of observing actual conditions that contributed to the deviation.

Identifying the Problem: Clarity from the Start

Clear, precise problem definition forms the foundation of any effective investigation. Vague or incomplete problem statements lead to misdirected investigations and ultimately, inadequate corrective actions.

  • Document using specific, factual language that describes what occurred versus what was expected
  • Include all relevant details such as procedure and equipment numbers, product names and lot numbers
  • Apply the 5W2H method (What, When, Where, Who, Why if known, How much is involved, and How it was discovered)
  • Avoid speculation about causes in the initial description
  • Remember that the description should incorporate relevant records and photographs of discovered defects.
5W2HTypical questionsContains
Who?Who are the people directly concerned with the problem? Who does this? Who should be involved but wasn’t? Was someone involved who shouldn’t be?User IDs, Roles and Departments
What?What happened?Action, steps, description
When?When did the problem occur?Times, dates, place In process
Where?Where did the problem occur?Location
Why is it important?Why did we do this? What are the requirements? What is the expected condition?Justification, reason
How?How did we discover. Where in the process was it?Method, process, procedure
How Many? How Much?How many things are involved? How often did the situation happen? How much did it impact?Number, frequency

The quality of your deviation documentation begins with this initial identification. As I’ve emphasized in previous posts, the investigation/deviation report should tell a story that can be easily understood by all parties well after the event and the investigation. This narrative begins with clear identification on day one.

ElementsProblem Statement
Is used to…Understand and target a problem. Providing a scope. Evaluate any risks. Make objective decisions
Answers the following… (5W2H)What? (problem that occurred);When? (timing of what occurred); Where? (location of what occurred); Who? (persons involved/observers); Why? (why it matters, not why it occurred); How Much/Many? (volume or count); How Often? (First/only occurrence or multiple)
Contains…Object (What was affected?); Defect (What went wrong?)
Provides direction for…Escalation(s); Investigation

Going to the GEMBA: Being Where the Action Is

GEMBA-the actual place where work happens-is a cornerstone concept in quality management. When a deviation occurs, there is no substitute for being physically present at the location.

  • Observe the actual conditions and environment firsthand
  • Notice details that might not be captured in written reports
  • Understand the workflow and context surrounding the deviation
  • Gather physical evidence before it’s lost or conditions change
  • Create the opportunity for meaningful conversations with operators

Human error occurs because we are human beings. The extent of our knowledge, training, and skill has little to do with the mistakes we make. We tire, our minds wander and lose concentration, and we must navigate complex processes while satisfying competing goals and priorities – compliance, schedule adherence, efficiency, etc.

Foremost to understanding human performance is knowing that people do what makes sense to them given the available cues, tools, and focus of their attention at the time. Simply put, people come to work to do a good job – if it made sense for them to do what they did, it will make sense to others given similar conditions. The following factors significantly shape human performance and should be the focus of any human error investigation:

Physical Environment
Environment, tools, procedures, process design
Organizational Culture
Just- or blame-culture, attitude towards error
Management and Supervision
Management of personnel, training, procedures
Stress Factors
Personal, circumstantial, organizational

We do not want to see or experience human error – but when we do, it’s imperative to view it as a valuable opportunity to improve the system or process. This mindset is the heart of effective human error prevention.

Conducting an Effective GEMBA Walk for Deviations

When conducting your GEMBA walk specifically for deviation investigation:

  • Arrive with a clear purpose and structured approach
  • Observe before asking questions
  • Document observations with photos when appropriate
  • Look for environmental factors that might not appear in reports
  • Pay attention to equipment configuration and conditions
  • Note how operators interact with the process or equipment

A deviation gemba is a cross-functional team meeting that is assembled where a potential deviation event occurred. Going to the gemba and “freezing the scene” as close as possible to the time the event occurred will yield valuable clues about the environment that existed at the time – and fresher memories will provide higher quality interviews. This gemba has specific objectives:

  • Obtain a common understanding of the event: what happened, when and where it happened, who observed it, who was involved – all the facts surrounding the event. Is it a deviation?
  • Clearly describe actions taken, or that need to be taken, to contain impact from the event: product quarantine, physical or mechanical interventions, management or regulatory notifications, etc.
  • Interview involved operators: ask open-ended questions, like how the event unfolded or was discovered, from their perspective, or how the event could have been prevented, in their opinion – insights from personnel experienced with the process can prove invaluable during an investigation.

Deviation GEMBA Tips

Typically there is time between when notification of a deviation gemba goes out and when the team is scheduled to assemble. It is important to come prepared to help facilitate an efficient gemba:

  • Assemble procedures and other relevant documents and records. This will make references easier during the gemba.
  • Keep your team on-track – the gemba should end with the team having a common understanding of the event, actions taken to contain impact, and the agreed-upon next steps of the investigation.

You will gain plenty of investigational leads from your observations and interviews at the gemba – which documents to review, which personnel to interview, which equipment history to inspect, and more. The gemba is such an invaluable experience that, for many minor events, root cause and CAPA can be determined fairly easily from information gathered solely at the gemba.

Informal Rubric for Conducting a Good Deviation GEMBA

  • Describe the timeliness of the team gathering at the gemba.
  • Were all required roles and experts present?
  • Was someone leading or facilitating the gemba?
  • Describe any interviews the team performed during the gemba.
  • Did the team get sidetracked or off-topic during the gemba
  • Was the team prepared with relevant documentation or information?
  • Did the team determine batch impact and any reportability requirements?
  • Did the team satisfy the objectives of the gemba?
  • What did the team do well?
  • What could the team improve upon?

Speaking with Operators: The Power of Cognitive Interviewing

Interviewing personnel who were present when the deviation occurred requires special techniques to elicit accurate, complete information. Traditional questioning often fails to capture critical details.

Cognitive interviewing, as I outlined in my previous post on “Interviewing,” was originally created for law enforcement and later adopted during accident investigations by the National Transportation Safety Board (NTSB). This approach is based on two key principles:

  • Witnesses need time and encouragement to recall information
  • Retrieval cues enhance memory recall

How to Apply Cognitive Interviewing in Deviation Investigations

  • Mental Reinstatement: Encourage the interviewee to mentally recreate the environment and people involved
  • In-Depth Reporting: Encourage the reporting of all the details, even if it is minor or not directly related
  • Multiple Perspectives: Ask the interviewee to recall the event from others’ points of view
  • Several Orders: Ask the interviewee to recount the timeline in different ways. Beginning to end, end to beginning

Most importantly, conduct these interviews at the actual location where the deviation occurred. A key part of this is that retrieval cues access memory. This is why doing the interview on the scene (or Gemba) is so effective.

ComponentWhat It Consists of
Mental ReinstatementEncourage the interviewee to mentally recreate the environment and people involved.
In-Depth ReportingEncourage the reporting of all the details.
Multiple PerspectivesAsk the interviewee to recall the event from others’ points of view.
Several OrdersAsk the interviewee to recount the timeline in different ways.
  • Approach the Interviewee Positively:
    • Ask for the interview.
    • State the purpose of the interview.
    • Tell interviewee why he/she was selected.
    • Avoid statements that imply blame.
    • Focus on the need to capture knowledge
    • Answer questions about the interview.
    • Acknowledge and respond to concerns.
    • Manage negative emotions.
  • Apply these Four Components:
    • Use mental reinstatement.
    • Report everything.
    • Change the perspective.
    • Change the order.
  • Apply these Two Principles:
    • Witnesses need time and encouragement to recall information.
    • Retrieval cues enhance memory recall.
  • Demonstrate these Skills:
    • Recreate the original context and had them walk you through process.
    • Tell the witness to actively generate information.
    • Adopt the witness’s perspective.
    • Listen actively, do not interrupt, and pause before asking follow-up questions.
    • Ask open-ended questions.
    • Encourage the witness to use imagery.
    • Perform interview at the Gemba.
    • Follow sequence of the four major components.
    • Bring support materials.
    • Establish a connection with the witness.
    • Do Not tell them how they made the mistake.

Initial Impact Assessment: Understanding the Scope

Within the first 24 hours, a preliminary impact assessment is essential for determining the scope of the deviation and the appropriate response.

  • Apply a risk-based approach to categorize the deviation as critical, major, or minor
  • Evaluate all potentially affected products, materials, or batches
  • Consider potential effects on critical quality attributes
  • Assess possible regulatory implications
  • Determine if released products may be affected

This impact assessment is also the initial risk assessment, which will help guide the level of effort put into the deviation.

Factors to Consider in Initial Risk Assessment

  • Patient safety implications
  • Product quality impact
  • Compliance with registered specifications
  • Potential for impact on other batches or products
  • Regulatory reporting requirements
  • Level of investigation required

This initial assessment will guide subsequent decisions about quarantine, notification requirements, and the depth of investigation needed. Remember, this is a preliminary assessment that will be refined as the investigation progresses.

Immediate Actions: Containing the Issue

Once you’ve identified the deviation and assessed its potential impact, immediate actions must be taken to contain the issue and prevent further risk.

  • Quarantine potentially affected products or materials to prevent their release or further use
  • Notify key stakeholders, including quality assurance, production supervision, and relevant department heads
  • Implement temporary corrective or containment measures
  • Document the deviation in your quality management system
  • Secure relevant evidence and documentation
  • Consider whether to stop related processes

Industry best practices emphasize that you should Report the deviation in real-time. Notify QA within 24 hours and hold the GEMBA. Remember that “if you don’t document it, it didn’t happen” – thorough documentation of both the deviation and your immediate response is essential.

Affected vs Related Batches

Not every Impact is the same, so it can be helpful to have two concepts: Affected and Related.

  • Affected Batch:  Product directly impacted by the event at the time of discovery, for instance, the batch being manufactured or tested when the deviation occurred.
  • Related Batch:  Product manufactured or tested under the same conditions or parameters using the process in which the deviation occurred and determined as part of the deviation investigation process to have no impact on product quality.

Setting Up for a Successful Full Investigation

The final step in the golden day is establishing the foundation for the comprehensive investigation that will follow.

  • Assemble a cross-functional investigation team with relevant expertise
  • Define clear roles and responsibilities for team members
  • Establish a timeline for the investigation (remembering the 30-day guideline)
  • Identify additional data or evidence that needs to be collected
  • Plan for any necessary testing or analysis
  • Schedule follow-up interviews or observations

In my post on handling deviations, I emphasized that you must perform a time-sensitive and thorough investigation within 30 days. The groundwork laid during the golden day will make this timeline achievable while maintaining investigation quality.

Planning for Root Cause Analysis

During this setup phase, you should also begin planning which root cause analysis tools might be most appropriate for your investigation. Select tools based on the event complexity and the number of potential root causes and when “human error” appears to be involved, prepare to dig deeper as this is rarely the true root cause

Identifying Phase of your Investigation

IfThen you are at
The problem is not understood. Boundaries have not been set. There could be more than one problemProblem Understanding
Data needs to be collected. There are questions about frequency or occurrence. You have not had interviewsData Collection
Data has been collected but not analyszedData Analysis
The root cause needs to be determined from the analyzed dataIdentify Root Cause
Root Cause Analysis Tools Chart body { font-family: Arial, sans-serif; line-height: 1.6; margin: 20px; } table { border-collapse: collapse; width: 100%; margin-bottom: 20px; } th, td { border: 1px solid ; padding: 8px 12px; vertical-align: top; } th { background-color: ; font-weight: bold; text-align: left; } tr:nth-child(even) { background-color: ; } .purpose-cell { font-weight: bold; } h1 { text-align: center; color: ; } ul { margin: 0; padding-left: 20px; }

Root Cause Analysis Tools Chart

Purpose Tool Description
Problem Understanding Process Map A picture of the separate steps of a process in sequential order, including:
  • materials or services entering or leaving the process (inputs and outputs)
  • decisions that must be made
  • people who become involved
  • time involved at each step, and/or
  • process measurements.
Critical Incident Technique (CIT) A process used for collecting direct observations of human behavior that
  • have critical significance, and
  • meet methodically defined criteria.
Comparative Analysis A technique that focuses a problem-solving team on a problem. It compares one or more elements of a problem or process to evaluate elements that are similar or different (e.g. comparing a standard process to a failing process).
Performance Matrix A tool that describes the participation by various roles in completing tasks or deliverables for a project or business process.
Note: It is especially useful in clarifying roles and responsibilities in cross-functional/departmental positions.
5W2H Analysis An approach that defines a problem and its underlying contributing factors by systematically asking questions related to who, what, when, where, why, how, and how much/often.
Data Collection Surveys A technique for gathering data from a targeted audience based on a standard set of criteria.
Check Sheets A technique to compile data or observations to detect and show trends/patterns.
Cognitive Interview An interview technique used by investigators to help the interviewee recall specific memories from a specific event.
KNOT Chart A data collection and classification tool to organize data based on what is
  • Known
  • Need to know
  • Opinion, and
  • Think we know.
Data Analysis Pareto Chart A technique that focuses efforts on problems offering the greatest potential for improvement.
Histogram A tool that
  • summarizes data collected over a period of time, and
  • graphically presents frequency distribution.
Scatter Chart A tool to study possible relationships between changes in two different sets of variables.
Run Chart A tool that captures study data for trends/patterns over time.
Affinity Diagram A technique for brainstorming and summarizing ideas into natural groupings to understand a problem.
Root Cause Analysis Interrelationship Digraphs A tool to identify, analyze, and classify cause and effect relationships among issues so that drivers become part of an effective solution.
Why-Why A technique that allows one to explore the cause-and-effect relationships of a particular problem by asking why; drilling down through the underlying contributing causes to identify root cause.
Is/Is Not A technique that guides the search for causes of a problem by isolating the who, what, when, where, and how of an event. It narrows the investigation to factors that have an impact and eliminates factors that do not have an impact. By comparing what the problem is with what the problem is not, we can see what is distinctive about a problem which leads to possible causes.
Structured Brainstorming A technique to identify, explore, and display the
  • factors within each root cause category that may be affecting the problem/issue, and/or
  • effect being studied through this structured idea-generating tool.
Cause and Effect Diagram (Ishikawa/Fishbone) A tool to display potential causes of an event based on root cause categories defined by structured brainstorming using this tool as a visual aid.
Causal Factor Charting A tool to
  • analyze human factors and behaviors that contribute to errors, and
  • identify behavior-influencing factors and gaps.
Other Tools Prioritization Matrix A tool to systematically compare choices through applying and weighting criteria.
Control Chart A tool to monitor process performance over time by studying its variation and source.
Process Capability A tool to determine whether a process is capable of meeting requirements or specifications.

Making the Most of Your Golden Day

The first 24 hours after discovering a deviation represent a unique opportunity that should not be wasted. By following the structured approach outlined in this post-identifying the problem clearly, going to the GEMBA, interviewing operators using cognitive techniques, conducting an initial impact assessment, taking immediate containment actions, and setting up for the full investigation-you maximize the value of this golden day.

Remember that excellent deviation management is directly linked to product quality, patient safety, and regulatory compliance. Each well-managed deviation is an opportunity to strengthen your quality system.

I encourage you to assess your current approach to the first 24 hours of deviation management. Are you capturing the full value of this golden day, or are you letting critical information slip away? Implement these strategies, train your team on proper deviation triage, and transform your deviation response from reactive to proactive.

Your deviation management effectiveness doesn’t begin when the investigation report is initiated-it begins the moment a deviation is discovered. Make that golden day count.