A 2025 Retrospective for Investigations of a Dog

If the history of pharmaceutical quality management were written as a geological timeline, 2025 would hopefully mark the end of the Holocene of Compliance—a long, stable epoch where “following the procedure” was sufficient to ensure survival—and the beginning of the Anthropocene of Complexity.

For decades, our industry has operated under a tacit social contract. We agreed to pretend that “compliance” was synonymous with “quality.” We agreed to pretend that a validated method would work forever because we proved it worked once in a controlled protocol three years ago. We agreed to pretend that “zero deviations” meant “perfect performance,” rather than “blind surveillance.” We agreed to pretend that if we wrote enough documents, reality would conform to them.

If I had my wish, 2025 would be the year that contract finally dissolved.

Throughout the year—across dozens of posts, technical analyses, and industry critiques on this blog—I have tried to dismantle the comfortable illusions of “Compliance Theater” and show how this theater collides violently with the unforgiving reality of complex systems.

The connecting thread running through every one of these developments is the concept I have returned to obsessively this year: Falsifiable Quality.

This Year in Review is not merely a summary of blog posts. It is an attempt to synthesize the fragmented lessons of 2025 into a coherent argument. The argument is this: A quality system that cannot be proven wrong is a quality system that cannot be trusted.

If our systems—our validation protocols, our risk assessments, our environmental monitoring programs—are designed only to confirm what we hope is true (the “Happy Path”), they are not quality systems at all. They are comfort blankets. And 2025 was the year we finally started pulling the blanket off.

The Philosophy of Doubt

(Reflecting on: The Effectiveness Paradox, Sidney Dekker, and Gerd Gigerenzer)

Before we dissect the technical failures of 2025, let me first establish the philosophical framework that defined this year’s analysis.

In August, I published “The Effectiveness Paradox: Why ‘Nothing Bad Happened’ Doesn’t Prove Your Quality System Works.” It became one of the most discussed posts of the year because it attacked the most sacred metric in our industry: the trend line that stays flat.

We are conditioned to view stability as success. If Environmental Monitoring (EM) data shows zero excursions for six months, we throw a pizza party. If a method validation passes all acceptance criteria on the first try, we commend the development team. If a year goes by with no Critical deviations, we pay out bonuses.

But through the lens of Falsifiable Quality—a concept heavily influenced by the philosophy of Karl Popper, the challenging insights of Deming, and the safety science of Sidney Dekker, whom we discussed in November—these “successes” look suspiciously like failures of inquiry.

The Problem with Unfalsifiable Systems

Karl Popper famously argued that a theory is scientific only if it makes predictions that can be tested and, in principle, proven false. “All swans are white” is a scientific statement because finding one black swan falsifies it. “God is love” is not, because no empirical observation can disprove it.

In 2025, I argued that most Pharmaceutical Quality Systems (PQS) are designed to be unfalsifiable.

  • The Unfalsifiable Alert Limit: We set alert limits based on historical averages + 3 standard deviations. This ensures that we only react to statistical outliers, effectively blinding us to gradual drift or systemic degradation that remains “within the noise.”
  • The Unfalsifiable Robustness Study: We design validation protocols that test parameters we already know are safe (e.g., pH +/- 0.1), avoiding the “cliff edges” where the method actually fails. We prove the method works where it works, rather than finding where it breaks.
  • The Unfalsifiable Risk Assessment: We write FMEAs where the conclusion (“The risk is acceptable”) is decided in advance, and the RPN scores are reverse-engineered to justify it.
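The alert-limit bullet above can be made concrete. What follows is a minimal sketch, not a recommended monitoring design: the CFU counts are invented, and the one-sided tabular CUSUM (with assumed tuning values `k=0.5` and `h=4.0`) stands in for any drift-sensitive statistic a site might actually choose. It shows a data set in which no single point crosses the mean + 3 SD limit, yet the accumulated shift is detectable.

```python
from statistics import mean, stdev

def alert_limit(history):
    """Conventional alert limit: historical mean plus three standard deviations."""
    return mean(history) + 3 * stdev(history)

def cusum_first_signal(values, target, sd, k=0.5, h=4.0):
    """One-sided (upper) tabular CUSUM.

    k is the allowance and h the decision interval, both in units of sd.
    Returns the index of the first point where the cumulative sum exceeds h,
    or None if it never does.
    """
    s = 0.0
    for i, x in enumerate(values):
        s = max(0.0, s + (x - target) / sd - k)
        if s > h:
            return i
    return None

# Hypothetical CFU counts, invented purely for illustration.
baseline = [2, 3, 1, 2, 4, 2, 3, 1, 2, 3, 2, 3]   # period used to set the limit
recent = [3, 3, 4, 3, 4, 3, 4, 4, 3, 4]           # slightly elevated, never extreme

limit = alert_limit(baseline)
excursions = [x for x in recent if x > limit]
signal_at = cusum_first_signal(recent, mean(baseline), stdev(baseline))

print(f"alert limit = {limit:.2f}")
print(f"points above alert limit: {excursions}")
print(f"CUSUM first signal at index: {signal_at}")
```

The design point is that the alert limit asks "is this point extreme?" while the cumulative sum asks "has the level moved?" Only the second question can falsify the claim that the process is where it was when the limit was set.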

This is “Safety Theater,” a term Dekker uses to describe the rituals organizations perform to look safe rather than be safe.

Safety-I vs. Safety-II

In November’s post “Sidney Dekker: The Safety Scientist Who Influences How I Think About Quality,” I explored Dekker’s distinction between Safety-I (minimizing things that go wrong) and Safety-II (understanding how things usually go right).

Traditional Quality Assurance is obsessed with Safety-I. We count deviations. We count OOS results. We count complaints. When those counts are low, we assume the system is healthy.
But as the LeMaitre Vascular warning letter showed us this year (discussed in Part III), a system can have “zero deviations” simply because it has stopped looking for them. LeMaitre had excellent water data—because they were cleaning the valves before they sampled them. They were measuring their ritual, not their water.

Falsifiable Quality is the bridge to Safety-II. It demands that we treat every batch record not as a compliance artifact, but as a hypothesis test.

  • Hypothesis: “The contamination control strategy is effective.”
  • Test: Aggressive monitoring in worst-case locations, not just the “representative” center of the room.
  • Result: If we find nothing, the hypothesis survives another day. If we find something, we have successfully falsified the hypothesis—which is a good thing because it reveals reality.
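How much does "we found nothing" actually tell us? A rough sketch under a strong simplifying assumption (every sample is an independent trial with the same chance of detecting an event, which real EM data rarely satisfies) puts numbers on it. The function names and the 26-sample example are hypothetical.

```python
def upper_bound_zero_events(n_samples, confidence=0.95):
    """Exact one-sided upper confidence bound on the per-sample event rate
    when zero events were observed in n independent samples.

    Solves (1 - p) ** n = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_samples)

def detection_probability(true_rate, n_samples):
    """Probability of observing at least one event in n independent samples."""
    return 1.0 - (1.0 - true_rate) ** n_samples

# Hypothetical example: one sample per week for six months, zero excursions.
n = 26
bound = upper_bound_zero_events(n)
miss_chance = 1.0 - detection_probability(0.02, n)

print(f"0 excursions in {n} samples is still consistent with a rate up to {bound:.1%}")
print(f"chance of seeing nothing if the true rate is 2%: {miss_chance:.1%}")
```

Under these assumptions, six months of clean weekly results cannot rule out a per-sample excursion rate of roughly one in ten. A surviving hypothesis is only as strong as the test's ability to have failed it.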

The shift from “fearing the deviation” to “seeking the falsification” is a cultural pivot point of 2025.

The Epistemological Crisis in the Lab (Method Validation)

(Reflecting on: USP <1225>, Method Qualification vs. Validation, and Lifecycle Management)

Nowhere was the battle for Falsifiable Quality fought more fiercely in 2025 than in the analytical laboratory.

The proposed revision to USP <1225> Validation of Compendial Procedures (published in Pharmacopeial Forum 51(6)) arrived late in the year, but it serves as the perfect capstone to the arguments I’ve been making since January.

For forty years, analytical validation has been the ultimate exercise in “Validation as an Event.” You develop a method. You write a protocol. You execute the protocol over three days with your best analyst and fresh reagents. You print the report. You bind it. You never look at it again.

This model is unfalsifiable. It assumes that because the method worked in the “Work-as-Imagined” conditions of the validation study, it will work in the “Work-as-Done” reality of routine QC for the next decade.

The Reportable Result: Validating Decisions, Not Signals

The revised USP <1225>, aligned with ICH Q14 (Analytical Procedure Development) and USP <1220> (The Lifecycle Approach), destroys this assumption. It introduces concepts that force falsifiability into the lab.

The most critical of these is the Reportable Result.

Historically, we validated “the instrument” or “the measurement.” We proved that the HPLC could inject the same sample ten times with < 1.0% RSD.

But the Reportable Result is the final value used for decision-making—the value that appears on the Certificate of Analysis. It is the product of a complex chain: Sampling -> Transport -> Storage -> Preparation -> Dilution -> Injection -> Integration -> Calculation -> Averaging.

Validating the injection precision (the end of the chain) tells us nothing about the sampling variability (the beginning of the chain).
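A small numerical sketch shows why. It assumes the steps in the chain contribute independent errors, so their variances add, and it uses invented %RSD values for three of the steps. Neither the numbers nor the three-way split come from USP <1225>.

```python
from math import sqrt

def total_rsd(component_rsds):
    """Combine independent variance components (given as %RSD) into a total %RSD."""
    return sqrt(sum(rsd ** 2 for rsd in component_rsds.values()))

def variance_share(component_rsds, name):
    """Fraction of the total variance contributed by one component."""
    total_variance = sum(rsd ** 2 for rsd in component_rsds.values())
    return component_rsds[name] ** 2 / total_variance

# Hypothetical %RSD contributions, for illustration only.
components = {"sampling": 2.0, "preparation": 1.5, "injection": 0.5}

overall = total_rsd(components)
injection_share = variance_share(components, "injection")

print(f"total %RSD of the reportable result: {overall:.2f}")
print(f"share of variance explained by injection: {injection_share:.1%}")
```

With these made-up figures, an excellent injection precision accounts for only a few percent of the variability in the number that reaches the Certificate of Analysis.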

By shifting focus to the Reportable Result, USP <1225> forces us to ask: “Does this method generate decisions we can trust?”

The Replication Strategy: Validating “Work-as-Done”

The new guidance insists that validation must mimic the replication strategy of routine testing.
If your SOP says “We report the average of 3 independent preparations,” then your validation must evaluate the precision and accuracy of that average, not of the individual preparations.

This seems subtle, but it is revolutionary. It prevents the common trick of “averaging away” variability during validation to pass the criteria, only to face OOS results in routine production because the routine procedure doesn’t use the same averaging scheme.

It forces the validation study to mirror the messy reality of the “Work-as-Done,” making the validation data a falsifiable predictor of routine performance, rather than a theoretical maximum capability.
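The arithmetic behind the replication strategy is short enough to sketch. The model below is an assumption on my part: independent preparation and injection errors, with the reportable result defined as the mean over all injections of all preparations. The standard deviations are invented.

```python
from math import sqrt

def sd_of_reportable_result(sd_prep, sd_inj, n_preps, n_inj_per_prep):
    """Standard deviation of a reportable result defined as the mean of
    n_preps independent preparations, each injected n_inj_per_prep times.
    Assumes independent preparation and injection errors."""
    return sqrt(sd_prep ** 2 / n_preps + sd_inj ** 2 / (n_preps * n_inj_per_prep))

# Hypothetical variance components (in % of label claim).
SD_PREP, SD_INJ = 1.5, 0.5

validation_scheme = sd_of_reportable_result(SD_PREP, SD_INJ, n_preps=6, n_inj_per_prep=2)
routine_scheme = sd_of_reportable_result(SD_PREP, SD_INJ, n_preps=3, n_inj_per_prep=1)
single_prep = sd_of_reportable_result(SD_PREP, SD_INJ, n_preps=1, n_inj_per_prep=1)

print(f"validated as mean of 6 preps x 2 injections: SD = {validation_scheme:.2f}")
print(f"routine SOP, mean of 3 preps x 1 injection:  SD = {routine_scheme:.2f}")
print(f"routine shortcut, 1 prep x 1 injection:      SD = {single_prep:.2f}")
```

The same method, with the same underlying variability, produces very different reportable-result precision depending on the averaging scheme. This is why the validation has to use the scheme the SOP actually prescribes.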

Method Qualification vs. Validation: The June Distinction

In June, I wrote “Method Qualification and Validation,” clarifying a distinction that often confuses the industry.

  • Qualification is the “discovery phase” where we explore the method’s limits. It is inherently falsifiable—we want to find where the method breaks.
  • Validation has traditionally been the “confirmation phase” where we prove it works.

The danger, as I noted in that post, is when we skip the falsifiable Qualification step and go straight to Validation. We write the protocol based on hope, not data.

USP <1225> essentially argues that Validation must retain the falsifiable spirit of Qualification. It is not a coronation; it is a stress test.

The Death of “Method Transfer” as We Know It

In a Falsifiable Quality system, a method is never “done.” The Analytical Target Profile (ATP)—a concept from ICH Q14 that permeates the new thinking—is a standing hypothesis: “This method measures Potency within +/- 2%.”

Every time we run a system suitability check, every time we run a control standard, we are testing that hypothesis.

If the method starts drifting—even if it still passes broad system suitability limits—a falsifiable system flags the drift. An unfalsifiable system waits for the OOS.
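One simple way to flag drift inside the limits is a run rule. The sketch below is illustrative only. The control-standard recoveries and the 98 to 102 suitability window are invented, and the threshold of eight consecutive points on one side of the target is one common convention (some rule sets use nine).

```python
def longest_run_one_side(values, center):
    """Length of the longest run of consecutive points strictly on one side of center."""
    longest = current = 0
    last_side = 0
    for v in values:
        side = (v > center) - (v < center)   # +1 above, -1 below, 0 on the line
        if side != 0 and side == last_side:
            current += 1
        elif side != 0:
            current = 1
        else:
            current = 0
        last_side = side
        longest = max(longest, current)
    return longest

# Hypothetical control-standard recoveries (%), all inside a 98-102 window.
recoveries = [100.2, 99.8, 100.1, 99.9, 99.7, 99.6, 99.5, 99.4, 99.3, 99.1, 99.0, 98.9]
TARGET, LOW, HIGH = 100.0, 98.0, 102.0
RUN_THRESHOLD = 8   # assumed convention

all_pass = all(LOW <= r <= HIGH for r in recoveries)
run = longest_run_one_side(recoveries, TARGET)
drift_flag = run >= RUN_THRESHOLD

print(f"every point passes system suitability: {all_pass}")
print(f"longest run on one side of target: {run}, drift flagged: {drift_flag}")
```

Every individual result here passes, so an OOS-driven system sees nothing. The run rule treats the sequence, not the single point, as the evidence.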

The draft revision of USP <1225> is a call to arms. It asks us to stop treating validation as a “ticket to ride”—a one-time toll we pay to enter GMP compliance—and start treating it as a “ticket to doubt.” Validation gives us permission to use the method, but only as long as the data continues to support the hypothesis of fitness.

The Reality Check (The “Unholy Trinity” of Warning Letters)

Philosophy and guidelines are fine, but in 2025, reality kicked in the door. The regulatory year was defined by three critical warning letters (Sanofi, LeMaitre, and Rechon) that collectively dismantled the industry’s illusions of control.

It began, as these things often do, with a ghost from the past.

Sanofi Framingham: The Pendulum Swings Back

(Reflecting on: Failure to Investigate Critical Deviations and The Sanofi Warning Letter)

The year opened with a shock. On January 15, 2025, the FDA issued a warning letter to Sanofi’s Framingham facility, the sister site to the legacy Genzyme Allston Landing site, whose consent decree defined an entire generation of biotech compliance and of my career.

In my January analysis (Failure to Investigate Critical Deviations: A Cautionary Tale), I noted that the FDA’s primary citation was a failure to “thoroughly investigate any unexplained discrepancy.”

This is the cardinal sin of Falsifiable Quality.

An “unexplained discrepancy” is a signal from reality. It is the system telling you, “Your hypothesis about this process is wrong.”

  • The Falsifiable Response: You dive into the discrepancy. You assume your control strategy missed something. You use Causal Reasoning (the topic of my May post) to find the mechanism of failure.
  • The Sanofi Response: As the warning letter detailed, they frequently attributed failures to “isolated incidents” or superficial causes without genuine evidence.

This is the “Refusal to Falsify.” By failing to investigate thoroughly, the firm protects the comfortable status quo. They choose to believe the “Happy Path” (the process is robust) over the evidence (the discrepancy).

The Pendulum of Compliance

In my companion post (“Sanofi Warning Letter”), I discussed the “pendulum of compliance.” The Framingham site was supposed to be the fortress of quality, built on the lessons of the Genzyme crisis.

The failure at Sanofi wasn’t a lack of SOPs; it was a lack of curiosity.

The investigators likely had checklists, templates, and timelines (Compliance Theater), but they lacked the mandate, or perhaps the expertise, to actually solve the problem.

This set the thematic stage for the rest of 2025. Sanofi showed us that “closing the deviation” is not the same as fixing the problem. This insight led directly into my August argument in The Effectiveness Paradox: You can close 100% of your deviations on time and still have a manufacturing process that is spinning out of control.

If Sanofi was the failure of investigation (looking back), Rechon and LeMaitre were failures of surveillance (looking forward). Together, they form a complete picture of why unfalsifiable systems fail.

(Reflecting on: Rechon Life Science and LeMaitre Vascular)

Two warning letters in 2025, Rechon Life Science (September) and LeMaitre Vascular (August), provided brutal case studies in what happens when “representative sampling” is treated as a buzzword rather than a statistical requirement.

Rechon Life Science: The Map vs. The Territory

The Rechon Life Science warning letter was a significant regulatory signal of 2025 regarding sterile manufacturing. It wasn’t just a list of observations; it was an indictment of unfalsifiable Contamination Control Strategies (CCS).

We spent 2023 and 2024 writing massive CCS documents to satisfy Annex 1. Hundreds of pages detailing airflows, gowning procedures, and material flows. We felt good about them. We felt “compliant.”

Then the FDA walked into Rechon and essentially asked: “If your CCS is so good, why does your smoke study show turbulence over the open vials?”

The warning letter highlighted a disconnect I’ve called “The Map vs. The Territory.”

  • The Map: The CCS document says the airflow is unidirectional and protects the product.
  • The Territory: The smoke study video shows air eddying backward from the operator to the sterile core.

In an unfalsifiable system, we ignore the smoke study (or film it from a flattering angle) because it contradicts the CCS. We prioritize the documentation (the claim) over the observation (the evidence).

In a falsifiable system, the smoke study is the test. If the smoke shows turbulence, the CCS is falsified. We don’t defend the CCS; we rewrite it. We redesign the line.

The FDA’s critique of Rechon’s “dynamic airflow visualization” was devastating because it showed that Rechon was using the smoke study as a marketing video, not a diagnostic tool. They filmed “representative” operations that were carefully choreographed to look clean, rather than the messy reality of interventions.

LeMaitre Vascular: The Sin of “Aspirational Data”

If Rechon was about air, LeMaitre Vascular (analyzed in my August post “When Water Systems Fail”) was about water. And it contained an even more egregious sin against falsifiability.

The FDA observed that LeMaitre’s water sampling procedures required cleaning and purging the sample valves before taking the sample.

Let’s pause and consider the epistemology of this.

  • The Goal: To measure the quality of the water used in manufacturing.
  • The Reality: Manufacturing operators do not purge and sanitize the valve for 10 minutes before filling the tank. They open the valve and use the water.
  • The Sample: By sanitizing the valve before sampling, LeMaitre was measuring the quality of the sampling process, not the quality of the water system.

I call this “Aspirational Data.” It is data that reflects the system as we wish it existed, not as it actually exists. It is the ultimate unfalsifiable metric. You can never find biofilm in a valve if you scrub the valve with alcohol before you open it.

The FDA’s warning letter was clear: “Sampling… must include any pathway that the water travels to reach the process.”

LeMaitre also performed an unauthorized “Sterilant Switcheroo,” changing their sanitization agent without change control or biocompatibility assessment. This is the hallmark of an unfalsifiable culture: making changes based on convenience, assuming they are safe, and never designing the study to check if that assumption is wrong.

The “Representative” Trap

Both warning letters pivot on the misuse of the word “representative.”

Firms love to claim their EM sampling locations are “representative.” But representative of what? Usually, they are representative of the average condition of the room—the clean, empty spaces where nothing happens.

But contamination is not an “average” event. It is a specific, localized failure. A falsifiable EM program places probes in the “worst-case” locations: near the door, near the operator’s hands, near the crimping station. It tries to find contamination. It tries to falsify the claim that the zone is sterile, aseptic, or bioburden-reducing.

When Rechon and LeMaitre failed to justify their sampling locations, they were guilty of designing an unfalsifiable experiment. They placed the “microscope” where they knew they wouldn’t find germs.

2025 taught us that regulators are no longer impressed by the thickness of the CCS binder. They are looking for the logic of control. They are testing your hypothesis. And if you haven’t tested it yourself, you will fail.

The Investigation as Evidence

(Reflecting on: The Golden Start to a Deviation Investigation, Causal Reasoning, Take-the-Best Heuristics, and The Catalent Case)

If Rechon, LeMaitre, and Sanofi teach us anything, it is that the quality system’s ability to discover failure is more important than its ability to prevent failure.

A perfect manufacturing process that no one is looking at is indistinguishable from a collapsing process disguised by poor surveillance. But a mediocre process that is rigorously investigated, understood, and continuously improved is a path toward genuine control.

The investigation itself—how we respond to a deviation, how we reason about causation, how we design corrective actions—is where falsifiable quality either succeeds or fails.

The Golden Day: When Theory Meets Work-as-Done

In April, I published “The Golden Start to a Deviation Investigation,” which made a deceptively simple argument: The first 24 hours after a deviation is discovered are where your quality system either commits to discovering truth or retreats into theater.

This argument sits at the heart of falsifiable quality.

When a deviation occurs, you have a narrow window—what I call the “Golden Day”—where evidence is fresh, memories are intact, and the actual conditions that produced the failure still exist. If you waste this window with vague problem statements and abstract discussions, you permanently lose the ability to test causal hypotheses later.

The post outlined a structured protocol:

First, crystallize the problem. Not “potency was low”—but “Lot X234, potency measured at 87% on January 15th at 14:32, three hours after completion of blending in Vessel C-2.” Precision matters because only specific, bounded statements can be falsified. A vague problem statement can always be “explained away.”

Second, go to the Gemba. This is the antidote to “work-as-imagined” investigation. The SOP says the temperature controller should maintain 37°C +/- 2°C. But the Gemba walk reveals that the probe is positioned six inches from the heating element, the data logger is in a recessed pocket where humidity accumulates, and the operator checks it every four hours despite a requirement to check hourly. These are the facts that predict whether the deviation will recur.

Third, interview with cognitive discipline. Most investigations fail not because investigators lack information, but because they extract information poorly. Cognitive interviewing, developed by psychologists Ronald Fisher and Edward Geiselman and adopted in law enforcement and accident investigation, uses mental reinstatement, multiple perspectives, and sequential reordering to access accurate recall rather than confabulated narrative. The investigator asks the operator to walk through the event in different orders, from different viewpoints, each time triggering different memory pathways. This is not a “soft” technique; it is a mechanism for generating falsifiable evidence.

The Golden Day post makes it clear: You do not investigate deviations to document compliance. You investigate deviations to gather evidence about whether your understanding of the process is correct.

Causal Reasoning: Moving Beyond “What Was Missing”

Most investigation tools fail not because they are flawed, but because they are applied with the wrong mindset. In my May post “Causal Reasoning: A Transformative Approach to Root Cause Analysis,” I argued that pharmaceutical investigations are often trapped in “negative reasoning.”

Negative reasoning asks: “What barrier was missing? What should have been done but wasn’t?” This mindset leads to unfalsifiable conclusions like “Procedure not followed” or “Training was inadequate.” These are dead ends because they describe the absence of an ideal, not the presence of a cause.

Causal reasoning flips the script. It asks: “What was present in the system that made the observed outcome inevitable?”

Instead of settling for “human error,” causal reasoning demands we ask: What environmental cues made the action sensible to the operator at that moment? Were the instructions ambiguous? Did competing priorities make compliance impossible? Was the process design fragile?

This shift transforms the investigation from a compliance exercise into a scientific inquiry.

Consider the LeMaitre example:

  • Negative Reasoning: “Why didn’t they sample the true condition?” Answer: “Because they didn’t follow the intent of the sampling plan.”
  • Causal Reasoning: “What made the pre-cleaning practice sensible to them?” Answer: “They believed it ensured sample validity by removing valve residue.”

By understanding the why, we identify a knowledge gap that can be tested and corrected, rather than a negligence gap that can only be punished.

In September, “Take-the-Best Heuristic for Causal Investigation” provided a practical framework for this. Instead of listing every conceivable cause—a process that often leads to paralysis—the “Take-the-Best” heuristic directs investigators to focus on the most information-rich discriminators. These are the factors that, if different, would have prevented the deviation. This approach focuses resources where they matter most, turning the investigation into a targeted search for truth.
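For readers who have not met the heuristic, here is a bare-bones sketch of take-the-best as Gigerenzer and colleagues describe it for a choice between two options: check cues from most to least valid, and stop at the first cue that discriminates. Applying it to rank two candidate causes, and every cue name, cue ordering, and candidate below, is a hypothetical illustration rather than the framework from the September post.

```python
def take_the_best(option_a, option_b, cues_by_validity):
    """Compare two options on boolean cues, checked from most to least valid.

    The first cue on which the options differ decides, and all later cues are
    ignored. Returns (winner_name, deciding_cue), or (None, None) if no cue
    discriminates.
    """
    for cue in cues_by_validity:
        a, b = option_a["cues"][cue], option_b["cues"][cue]
        if a != b:
            winner = option_a if a else option_b
            return winner["name"], cue
    return None, None

# Hypothetical cue ordering, most valid first.
CUES = ["changed_before_first_failure", "present_only_in_failed_batches", "mechanism_is_plausible"]

gasket = {"name": "worn valve gasket",
          "cues": {"changed_before_first_failure": True,
                   "present_only_in_failed_batches": True,
                   "mechanism_is_plausible": True}}
technique = {"name": "operator technique",
             "cues": {"changed_before_first_failure": True,
                      "present_only_in_failed_batches": False,
                      "mechanism_is_plausible": True}}

first_to_test, deciding_cue = take_the_best(gasket, technique, CUES)
print(f"test first: {first_to_test} (decided by: {deciding_cue})")
```

The value of the heuristic lies in what it refuses to do. It does not weigh and sum every factor; it spends effort only on the first cue that separates the candidates.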

CAPA: Predictions, Not Promises

The Sanofi warning letter—analyzed in January—showed the destination of unfalsifiable investigation: CAPAs that exist mainly as paperwork.

Sanofi had investigation reports. They had “corrective actions.” But the FDA noted that deviations recurred in similar patterns, suggesting that the investigation had identified symptoms, not mechanisms, and that the “corrective” action had not actually addressed causation.

This is the sin of treating CAPA as a promise rather than a hypothesis.

A falsifiable CAPA is structured as an explicit prediction: “If we implement X change, then Y undesirable outcome will not recur under conditions Z.”

This can be tested. If it fails the test, the CAPA itself becomes evidence—not of failure, but of incomplete causal understanding. Which is valuable.
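As a sketch of what "structured as a prediction" could look like in a tracking system, the record below forces the X, Y, and Z fields to be stated up front and makes the effectiveness check return one of three explicit verdicts. The class, field names, and review-window logic are all hypothetical and not drawn from any eQMS product.

```python
from dataclasses import dataclass

@dataclass
class CapaPrediction:
    change: str            # X: what we will implement
    outcome: str           # Y: the undesirable outcome predicted not to recur
    conditions: str        # Z: the conditions under which the prediction holds
    review_batches: int    # how many batches must be observed before judging

    def evaluate(self, recurrences_observed, batches_observed):
        """Return 'falsified', 'inconclusive', or 'survived'."""
        if recurrences_observed > 0:
            return "falsified"          # the causal understanding was incomplete
        if batches_observed < self.review_batches:
            return "inconclusive"       # not enough exposure to count as a test
        return "survived"               # provisional, never 'proven'

capa = CapaPrediction(
    change="Relocate the temperature probe away from the heating element",
    outcome="Low potency after blending",
    conditions="All lots blended in Vessel C-2 at full batch size",
    review_batches=10,
)

print(capa.evaluate(recurrences_observed=0, batches_observed=4))
print(capa.evaluate(recurrences_observed=0, batches_observed=10))
print(capa.evaluate(recurrences_observed=1, batches_observed=3))
```

Note that there is no "effective" verdict. A prediction that has not yet failed has merely survived the exposure it was given.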

In the Rechon analysis, this showed up concretely: The FDA’s real criticism was not just that contamination was found; it was that Rechon’s Contamination Control Strategy had no mechanism to falsify itself. If the CCS said “unidirectional airflow protects the product,” and smoke studies showed bidirectional eddies, the CCS had been falsified. But Rechon treated the falsification as an anomaly to be explained away, rather than evidence that the CCS hypothesis was wrong.

A falsifiable organization would say: “Our CCS predicted that Grade A in an isolator with this airflow pattern would remain sterile. The smoke study proves that prediction wrong. Therefore, the CCS is false. We redesign.”

Instead, they filmed from a different angle and said the aerodynamics were “acceptable.”

Knowledge Integration: When Deviations Become the Curriculum

The final piece of falsifiable investigation is what I call “knowledge integration.” A single deviation is a data point. But across the organization, deviations should form a curriculum about how systems actually fail.

Sanofi’s failure was not that they investigated each deviation badly (though they did). It was that they investigated them in isolation. Each deviation closed on its own. Each CAPA addressed its own batch. There was no organizational learning—no mechanism for a pattern of similar deviations to trigger a hypothesis that the control strategy itself was fundamentally flawed.

This is where the Catalent case study, analyzed in September’s “When 483s Reveal Zemblanity,” becomes instructive. Zemblanity is the opposite of serendipity: the seemingly random recurrence of the same failure through different paths. Catalent’s 483 observations were not isolated mistakes; they formed a pattern that revealed a systemic assumption (about equipment capability, about environmental control, about material consistency) that was false across multiple products and locations.

A falsifiable quality system catches zemblanity early by:

  1. Treating each deviation as a test of organizational hypotheses, not as an isolated incident.
  2. Trending deviation patterns to detect when the same causal mechanism is producing failures across different products, equipment, or operators.
  3. Revising control strategies when patterns falsify the original assumptions, rather than tightening parameters at the margins.
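The second step in the list above is mostly a data-modeling problem: deviations have to be coded by causal mechanism, not just by batch or department, before any pattern can appear. A minimal sketch, using invented deviation records and a hypothetical `mechanism` field:

```python
from collections import defaultdict

def recurring_mechanisms(deviations, min_products=2):
    """Group deviations by causal mechanism and keep the mechanisms that
    appear across at least min_products distinct products."""
    products_by_mechanism = defaultdict(set)
    for dev in deviations:
        products_by_mechanism[dev["mechanism"]].add(dev["product"])
    return {
        mechanism: sorted(products)
        for mechanism, products in products_by_mechanism.items()
        if len(products) >= min_products
    }

# Hypothetical deviation log, for illustration only.
deviation_log = [
    {"id": "DEV-001", "product": "A", "mechanism": "gasket wear"},
    {"id": "DEV-002", "product": "B", "mechanism": "label mix-up"},
    {"id": "DEV-003", "product": "C", "mechanism": "gasket wear"},
    {"id": "DEV-004", "product": "A", "mechanism": "gasket wear"},
    {"id": "DEV-005", "product": "B", "mechanism": "gasket wear"},
]

patterns = recurring_mechanisms(deviation_log)
print(patterns)
```

Each of these records could have been closed on its own as an isolated incident. Grouped by mechanism, they become a single hypothesis about the control strategy that is ready to be tested.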

The Digital Hallucination (CSA, AI, and the Expertise Crisis)

(Reflecting on: CSA: The Emperor’s New Clothes, Annex 11, and The Expertise Crisis)

While we battled microbes in the cleanroom, a different battle was raging in the server room. 2025 was the year the industry tried to “modernize” validation through Computer Software Assurance (CSA) and AI, and in many ways, it was the year we tried to automate our way out of thinking.

CSA: The Emperor’s New Validation Clothes

In September, I published “Computer System Assurance: The Emperor’s New Validation Clothes,” a critique of the contortions being made around the FDA’s guidance. The narrative sold by consultants for years was that traditional Computer System Validation (CSV) was “broken” (too much documentation, too much testing) and that CSA was a revolutionary new paradigm of “critical thinking.”

My analysis showed that this narrative is historically illiterate.

The principles of CSA—risk-based testing, leveraging vendor audits, focusing on intended use—are not new. They are the core principles of GAMP5 and have been applied for decades now.

The industry didn’t need a new guidance to tell us to use critical thinking; we had simply chosen not to use the critical thinking tools we already had. We had chosen to apply “one-size-fits-all” templates because they were safe (unfalsifiable).

The CSA guidance is effectively the FDA saying: “Please read the GAMP5 guide you claimed to be following for the last 15 years.”

The danger of the “CSA Revolution” narrative is that it encourages a swing to the opposite extreme: “Unscripted Testing” that becomes “No Testing.”

In a falsifiable system, “unscripted testing” is highly rigorous—it is an expert trying to break the software (“Ad Hoc testing”). But in an unfalsifiable system, “unscripted testing” becomes “I clicked around for 10 minutes and it looked fine.”

The Expertise Crisis: AI and the Death of the Apprentice

This leads directly to the Expertise Crisis. In September, I wrote “The Expertise Crisis: Why AI’s War on Entry-Level Jobs Threatens Quality’s Future.” This was perhaps the most personal topic I covered this year, because it touches on the very survival of our profession.

We are rushing to integrate Artificial Intelligence (AI) into quality systems. We have AI writing deviations, AI drafting SOPs, AI summarizing regulatory changes. The efficiency gains are undeniable. But the cost is hidden, and it is epistemological.

Falsifiability requires expertise.
To falsify a claim—to look at a draft investigation report and say, “No, that conclusion doesn’t follow from the data”—you need deep, intuitive knowledge of the process. You need to know what a “normal” pH curve looks like so you can spot the “abnormal” one that the AI smoothed over.

Where does that intuition come from? It comes from the “grunt work.” It comes from years of reviewing batch records, years of interviewing operators, years of struggling to write a root cause analysis statement.

The Expertise Crisis is this: If we give all the entry-level work to AI, where will the next generation of Quality Leaders come from?

  • The Junior Associate doesn’t review the raw data; the AI summarizes it.
  • The Junior Associate doesn’t write the deviation; the AI generates the text.
  • Therefore, the Junior Associate never builds the mental models necessary to critique the AI.

The Loop of Unfalsifiable Hallucination

We are creating a closed loop of unfalsifiability.

  1. The AI generates a plausible-sounding investigation report.
  2. The human reviewer (who has been “de-skilled” by years of AI reliance) lacks the deep expertise to spot the subtle logical flaw or the missing data point.
  3. The report is approved.
  4. The “hallucination” becomes the official record.

In a falsifiable quality system, the human must remain the adversary of the algorithm. The human’s job is to try to break the AI’s logic, to check the citations, to verify the raw data.
But in 2025, we saw the beginnings of a “Compliance Autopilot”—a desire to let the machine handle the “boring stuff.”

My warning in September remains urgent: Efficiency without expertise is just accelerated incompetence. If we lose the ability to falsify our own tools, we are no longer quality professionals; we are just passengers in a car driven by a statistical model that doesn’t know what “truth” is.

My post “The Missing Middle in GMP Decision Making: How Annex 22 Redefines Human-Machine Collaboration in Pharmaceutical Quality Assurance” goes a lot deeper here.

Annex 11 and Data Governance

In August, I analyzed the draft Annex 11 (Computerised Systems) in the post “Data Governance Systems: A Fundamental Shift.”

The Europeans are ahead of the FDA here. While the FDA talks about “Assurance” (testing less), the EU is talking about “Governance” (controlling more). The new Annex 11 makes it clear: You cannot validate a system if you do not control the data lifecycle. Validation is not a test script; it is a state of control.

This aligns perfectly with USP <1225> and <1220>. Whether it’s a chromatograph or an ERP system, the requirement is the same: Prove that the data is trustworthy, not just that the software is installed.

The Process as a Hypothesis (CPV & Cleaning)

(Reflecting on: Continuous Process Verification and Hypothesis Formation)

The final frontier of validation we explored in 2025 was the manufacturing process itself.

CPV: Continuous Falsification

In March, I published “Continuous Process Verification (CPV) Methodology and Tool Selection.”
CPV is the ultimate expression of Falsifiable Quality in manufacturing.

  • Traditional Validation (3 Batches): “We made 3 good batches, therefore the process is perfect forever.” (Unfalsifiable extrapolation).
  • CPV: “We made 3 good batches, so we have a license to manufacture, but we will statistically monitor every subsequent batch to detect drift.” (Continuous hypothesis testing).

The challenge with CPV, as discussed in the post, is that it requires statistical literacy. You cannot implement CPV if your quality unit doesn’t understand the difference between Cpk and Ppk, or between control limits and specification limits.
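The Cpk versus Ppk distinction is easy to state and easy to get wrong, so a worked sketch may help. It uses the usual individuals-chart convention, in which short-term sigma is estimated from the average moving range divided by d2 = 1.128 and long-term sigma is the ordinary sample standard deviation. The assay values and specification limits are invented.

```python
from statistics import mean, stdev

D2_SPAN_2 = 1.128   # control-chart constant d2 for moving ranges of two points

def capability(values, lsl, usl):
    """Return Cpk (short-term sigma from moving ranges) and Ppk (overall sigma)."""
    center = mean(values)
    sigma_overall = stdev(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma_within = mean(moving_ranges) / D2_SPAN_2

    def index(sigma):
        return min(usl - center, center - lsl) / (3 * sigma)

    return {"Cpk": index(sigma_within), "Ppk": index(sigma_overall)}

# Hypothetical assay results from consecutive batches: small batch-to-batch
# steps, but a steady upward drift across the campaign.
assay = [99.0, 99.2, 99.4, 99.6, 99.8, 100.0, 100.2, 100.4, 100.6, 100.8, 101.0]

result = capability(assay, lsl=97.0, usl=103.0)
print(f"Cpk = {result['Cpk']:.2f}, Ppk = {result['Ppk']:.2f}")
```

When a process drifts, the short-term estimate stays small while the overall spread grows, so Cpk flatters the process and Ppk does not. A large gap between the two is itself a signal that the process is not in statistical control. Separately, control limits are computed from the process data, while the specification limits (`lsl`, `usl`) are supplied from outside it.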

This circles back to the Expertise Crisis. We are implementing complex statistical tools (CPV software) at the exact moment we are de-skilling the workforce. We risk creating a “CPV Dashboard” that turns red, but no one knows why or what to do about it.

Cleaning Validation: The Science of Residue

In August, I tried to apply falsifiability to one of the most stubborn areas of dogma: Cleaning Validation.

In “Building Decision-Making with Structured Hypothesis Formation,” I argued that cleaning validation should not be about “proving it’s clean.” It should be about “understanding why it gets dirty.”

  • Traditional Approach: Swab 10 spots. If they pass, we are good.
  • Hypothesis Approach: “We hypothesize that the gasket on the bottom valve is the hardest to clean. We predict that if we reduce rinse time by 1 minute, that gasket will fail.”

By testing the boundaries—by trying to make the cleaning fail—we understand the Design Space of the cleaning process.

We discussed the “Visual Inspection” paradox in cleaning: If you can see the residue, it failed. But if you can’t see it, does it pass?

Only if you have scientifically determined the Visible Residue Limit (VRL). Using “visually clean” without a validated VRL is—you guessed it—unfalsifiable.

To: Jeremiah Genest
From: Perplexity Research
Subject: Draft Content – Single-Use Systems & E&L Section

Here is a section on Single-Use Systems (SUS) and Extractables & Leachables (E&L).

I have positioned this piece to bridge the gap between “Part III: The Reality Check” (Contamination/Water) and “Part V: The Process as a Hypothesis” (Cleaning Validation).

The argument here is that by switching from Stainless Steel to Single-Use, we traded a visible risk (cleaning residue) for an invisible one (chemical migration), and that our current approach to E&L is often just “Paper Safety”—relying on vendor data that doesn’t reflect the “Work-as-Done” reality of our specific process conditions.

The Plastic Paradox (Single-Use Systems and the E&L Mirage)

If the Rechon and LeMaitre warning letters were about the failure to control biological contaminants we can find, the industry’s struggle with Single-Use Systems (SUS) in 2025 was about the chemical contaminants we choose not to find.

We have spent the last decade aggressively swapping stainless steel for plastic. The value proposition was irresistible: Eliminate cleaning validation, eliminate cross-contamination, increase flexibility. We traded the “devil we know” (cleaning residue) for the “devil we don’t” (Extractables and Leachables).

But in 2025, with the enforcement reality of USP <665> (Plastic Components and Systems) settling in, we had to confront the uncomfortable truth: Most E&L risk assessments are unfalsifiable.

The Vendor Data Trap

The standard industry approach to E&L is the ultimate form of “Compliance Theater.”

  1. We buy a single-use bag.
  2. We request the vendor’s regulatory support package (the “Map”).
  3. We see that the vendor extracted the film with aggressive solvents (ethanol, hexane) for 7 days.
  4. We conclude: “Our process uses water for 24 hours; therefore, we are safe.”

This logic is epistemologically bankrupt. It assumes that the Vendor’s Model (aggressive solvents/short time) maps perfectly to the User’s Reality (complex buffers/long duration/specific surfactants).

It ignores the fact that plastics are dynamic systems. Polymers age. Gamma irradiation initiates free radical cascades that evolve over months. A bag manufactured in January might have a different leachable profile than a bag manufactured in June, especially if the resin supplier made a “minor” change that didn’t trigger a notification.

By relying solely on the vendor’s static validation package, we are choosing not to falsify our safety hypothesis. We are effectively saying, “If the vendor says it’s clean, we will not look for dirt.”

USP <665>: A Baseline, Not a Ceiling

The full adoption of USP <665> was supposed to bring standardization. And it has—it provides a standard set of extraction conditions. But standards can become ceilings.

In 2025, I observed a troubling trend of “Compliance by Citation.” Firms are citing USP <665> compliance as proof of absence of risk, stopping the inquiry there.

A Falsifiable E&L Strategy goes further. It asks:

  • “What if the vendor data is irrelevant to my specific surfactant?”
  • “What if the gamma irradiation dose varied?”
  • “What if the interaction between the tubing and the connector creates a new species?”

The Invisible Process Aid

We must stop viewing Single-Use Systems as inert piping. They are active process components. They are chemically reactive vessels that participate in our reaction kinetics.

When we treat them as inert, we are engaging in the same “Aspirational Thinking” that LeMaitre used on their water valves. We are modeling the system we want (pure, inert plastic), not the system we have (a complex soup of antioxidants, slip agents, and degradants).

The lesson of 2025 is that Material Qualification cannot be a paper exercise. If you haven’t done targeted simulation studies that mimic your actual “Work-as-Done” conditions, you haven’t validated the system. You’ve just filed the receipt.

The Mandate for 2026

As we look toward 2026, the path is clear. We cannot go back to the comfortable fiction of the pre-2025 era.

The regulatory environment (Annex 1, ICH Q14, USP <1225>, Annex 11) is explicitly demanding evidence of control, not just evidence of compliance. The technological environment (AI) is demanding that we sharpen our human expertise to avoid becoming obsolete. The physical environment (contamination, supply chain complexity) is demanding systems that are robust, not just rigid.

The mandate for the coming year is to build Falsifiable Quality Systems.

What does that look like practically?

  1. In the Lab: Implement USP <1225> logic now. Don’t wait for the official date. Validate your reportable results. Add “challenge tests” to your routine monitoring.
  2. In the Plant: Redesign your Environmental Monitoring to hunt for contamination, not to avoid it. If you have a “perfect” record in a Grade C area, move the plates until you find the dirt.
  3. In the Office: Treat every investigation as a chance to falsify the control strategy. If a deviation occurs that the control strategy said was impossible, update the control strategy.
  4. In the Culture: Reward the messenger. The person who finds the crack in the system is not a troublemaker; they are the most valuable asset you have. They just falsified a false sense of security.
  5. In Design: Embrace the Elegant Quality System (discussed in May). Complexity is the enemy of falsifiability. Complex systems hide failures; simple, elegant systems reveal them.

2025 was the year we stopped pretending. 2026 must be the year we start building. We must build systems that are honest enough to fail, so that we can build processes that are robust enough to endure.

Thank you for reading, challenging, and thinking with me this year. The investigation continues.

Meeting Worst-Case Testing Requirements Through Hypothesis-Driven Validation

The integration of hypothesis-driven validation with traditional worst-case testing requirements represents a fundamental evolution in how we approach pharmaceutical process validation. Rather than replacing worst-case concepts, the hypothesis-driven approach provides scientific rigor and enhanced understanding while fully satisfying regulatory expectations for challenging process conditions under extreme scenarios.

The Evolution of Worst-Case Concepts in Modern Validation

The concept of “worst-case” testing has undergone significant refinement since the original 1987 FDA guidance, which defined worst-case as “a set of conditions encompassing upper and lower limits and circumstances, including those within standard operating procedures, which pose the greatest chance of process or product failure when compared to ideal conditions”. The FDA’s 2011 Process Validation guidance shifted emphasis from conducting validation runs under worst-case conditions to incorporating worst-case considerations throughout the process design and qualification phases.

This evolution aligns perfectly with hypothesis-driven validation principles. Rather than conducting three validation batches under artificially extreme conditions that may not represent actual manufacturing scenarios, the modern lifecycle approach integrates worst-case testing throughout process development, qualification, and continued verification stages. Hypothesis-driven validation enhances this approach by making the scientific rationale for worst-case selection explicit and testable.

| Guidance/Regulation | Agency | Year Published | Page | Requirement |
| --- | --- | --- | --- | --- |
| EU Annex 15 Qualification and Validation | EMA | 2015 | 5 | PPQ should include tests under normal operating conditions with worst case batch sizes |
| EU Annex 15 Qualification and Validation | EMA | 2015 | 16 | Definition: Worst Case – A condition or set of conditions encompassing upper and lower processing limits and circumstances, within standard operating procedures, which pose the greatest chance of product or process failure |
| EMA Process Validation for Biotechnology-Derived Active Substances | EMA | 2016 | 5 | Evaluation of selected step(s) operating in worst case and/or non-standard conditions (e.g. impurity spiking challenge) can be performed to support process robustness |
| EMA Process Validation for Biotechnology-Derived Active Substances | EMA | 2016 | 10 | Evaluation of purification steps operating in worst case and/or non-standard conditions (e.g. process hold times, spiking challenge) to document process robustness |
| EMA Process Validation for Biotechnology-Derived Active Substances | EMA | 2016 | 11 | Studies conducted under worst case conditions and/or non-standard conditions (e.g. higher temperature, longer time) to support suitability of claimed conditions |
| WHO GMP Validation Guidelines (Annex 3) | WHO | 2015 | 125 | Where necessary, worst-case situations or specific challenge tests should be considered for inclusion in the qualification and validation |
| PIC/S Validation Master Plan Guide (PI 006-3) | PIC/S | 2007 | 13 | Challenge element to determine robustness of the process, generally referred to as a “worst case” exercise using starting materials on the extremes of specification |
| FDA Process Validation General Principles and Practices | FDA | 2011 | Not specified | While not explicitly requiring worst case testing for PPQ, emphasizes understanding and controlling variability and process robustness |

Scientific Framework for Worst-Case Integration

Hypothesis-Based Worst-Case Definition

Traditional worst-case selection often relies on subjective expert judgment or generic industry practices. The hypothesis-driven approach transforms this into a scientifically rigorous process by developing specific, testable hypotheses about which conditions truly represent the most challenging scenarios for process performance.

For the mAb cell culture example, instead of generically testing “upper and lower limits” of all parameters, we develop specific hypotheses about worst-case interactions:

Hypothesis-Based Worst-Case Selection: The combination of minimum pH (6.95), maximum temperature (37.5°C), and minimum dissolved oxygen (35%) during high cell density phase (days 8-12) represents the worst-case scenario for maintaining both titer and product quality, as this combination will result in >25% reduction in viable cell density and >15% increase in acidic charge variants compared to center-point conditions.

This hypothesis is falsifiable and provides clear scientific justification for why these specific conditions constitute “worst-case” rather than other possible extreme combinations.

Process Design Stage Integration

ICH Q7 and modern validation approaches emphasize that worst-case considerations should be integrated during process design rather than only during validation execution. The hypothesis-driven approach strengthens this integration by ensuring worst-case scenarios are based on mechanistic understanding rather than arbitrary parameter combinations.

Design Space Boundary Testing

During process development, systematic testing of design space boundaries provides scientific evidence for worst-case identification. For example, if our hypothesis predicts that pH-temperature interactions are critical, we systematically test these boundaries to identify the specific combinations that represent genuine worst-case conditions rather than simply testing all possible parameter extremes.

Regulatory Compliance Through Enhanced Scientific Rigor

EMA Biotechnology Guidance Alignment

The EMA guidance on biotechnology-derived active substances specifically requires that “Studies conducted under worst case conditions should be performed to document the robustness of the process”. The hypothesis-driven approach exceeds these requirements by:

  1. Scientific Justification: Providing mechanistic understanding of why specific conditions represent worst-case scenarios
  2. Predictive Capability: Enabling prediction of process behavior under conditions not directly tested
  3. Risk-Based Assessment: Linking worst-case selection to patient safety through quality attribute impact assessment

ICH Q7 Process Validation Requirements

ICH Q7 requires that process validation demonstrate “that the process operates within established parameters and yields product meeting its predetermined specifications and quality characteristics”. The hypothesis-driven approach satisfies these requirements while providing additional value:

Traditional ICH Q7 Compliance:

  • Demonstrates process operates within established parameters
  • Shows consistent product quality
  • Provides documented evidence

Enhanced Hypothesis-Driven Compliance:

  • Demonstrates process operates within established parameters
  • Shows consistent product quality
  • Provides documented evidence
  • Explains why parameters are set at specific levels
  • Predicts process behavior under untested conditions
  • Provides scientific basis for parameter range justification

Practical Implementation of Worst-Case Hypothesis Testing

Cell Culture Bioreactor Example

For a CHO cell culture process, worst-case testing integration follows this structured approach:

Phase 1: Worst-Case Hypothesis Development

Instead of testing arbitrary parameter combinations, develop specific hypotheses about failure mechanisms:

Metabolic Stress Hypothesis: The worst-case metabolic stress condition occurs when glucose depletion coincides with high lactate accumulation (>4 g/L) and elevated CO₂ (>10%) simultaneously, leading to >50% reduction in specific productivity within 24 hours.

Product Quality Degradation Hypothesis: The worst-case condition for charge variant formation is the combination of extended culture duration (>14 days) with pH drift above 7.2 for >12 hours, resulting in >10% increase in acidic variants.

Phase 2: Systematic Worst-Case Testing Design

Rather than three worst-case validation batches, integrate systematic testing throughout process qualification:

| Study Phase | Traditional Approach | Hypothesis-Driven Integration |
| --- | --- | --- |
| Process Development | Limited worst-case exploration | Systematic boundary testing to validate worst-case hypotheses |
| Process Qualification | 3 batches under arbitrary worst-case | Multiple studies testing specific worst-case mechanisms |
| Commercial Monitoring | Reactive deviation investigation | Proactive monitoring for predicted worst-case indicators |

Phase 3: Worst-Case Challenge Studies

Design specific studies to test worst-case hypotheses under controlled conditions:

Controlled pH Deviation Study:

  • Deliberately induce pH drift to 7.3 for 18 hours during production phase
  • Testable Prediction: Acidic variants will increase by 8-12%
  • Falsification Criteria: If variant increase is <5% or >15%, hypothesis requires revision
  • Regulatory Value: Demonstrates process robustness under worst-case pH conditions

Metabolic Stress Challenge:

  • Create controlled glucose limitation combined with high CO₂ environment
  • Testable Prediction: Cell viability will drop to <80% within 36 hours
  • Falsification Criteria: If viability remains >90%, worst-case assumptions are incorrect
  • Regulatory Value: Provides quantitative data on process failure mechanisms
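The prediction and falsification criteria in these challenge studies can be written down before the study runs, so the outcome cannot be reinterpreted afterwards. The following is a minimal Python sketch under stated assumptions: the class name ChallengeHypothesis and its fields are hypothetical, and the "inconclusive" verdict for results that fall between the prediction window and the falsification limits is my own addition, since the study design above leaves that zone undefined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChallengeHypothesis:
    """A pre-registered prediction with explicit falsification limits."""
    name: str
    predicted_low: float   # lower bound of the testable prediction
    predicted_high: float  # upper bound of the testable prediction
    falsify_below: float   # an observed value below this falsifies the hypothesis
    falsify_above: float   # an observed value above this falsifies the hypothesis

    def evaluate(self, observed: float) -> str:
        if observed < self.falsify_below or observed > self.falsify_above:
            return "falsified"
        if self.predicted_low <= observed <= self.predicted_high:
            return "supported"
        # Assumption: results between the two windows are neither support nor refutation
        return "inconclusive"


# Controlled pH deviation study: predicted 8-12% increase in acidic variants,
# hypothesis requires revision if the increase is <5% or >15%
ph_drift = ChallengeHypothesis(
    name="pH 7.3 for 18 h: increase in acidic variants (%)",
    predicted_low=8.0,
    predicted_high=12.0,
    falsify_below=5.0,
    falsify_above=15.0,
)

for observed in (10.0, 6.5, 4.0):
    print(f"{observed}% -> {ph_drift.evaluate(observed)}")
```

Writing the limits as data rather than prose forces the team to decide what an awkward middle result means before anyone has a stake in the answer.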

Meeting Matrix and Bracketing Requirements

Traditional validation often uses matrix and bracketing approaches to reduce validation burden while ensuring worst-case coverage. The hypothesis-driven approach enhances these strategies by providing scientific justification for grouping and worst-case selection decisions.

Enhanced Matrix Approach

Instead of grouping based on similar equipment size or configuration, group based on mechanistic similarity as defined by validated hypotheses:

Traditional Matrix Grouping: All 1000L bioreactors with similar impeller configuration are grouped together.

Hypothesis-Driven Matrix Grouping: All bioreactors where oxygen mass transfer coefficient (kLa) falls within 15% and mixing time is <30 seconds are grouped together, as validated hypotheses demonstrate these parameters control product quality variability.
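As an illustration only, the grouping rule can be expressed as a small check. This sketch assumes that "within 15%" means within 15% of a chosen reference bioreactor's kLa, which the text does not specify; the function name, bioreactor IDs, and all numbers are hypothetical.

```python
def same_matrix_group(reference_kla, candidate_kla, candidate_mixing_time_s,
                      kla_tolerance=0.15, max_mixing_time_s=30.0):
    """True if a bioreactor is mechanistically similar to the reference unit."""
    # Assumption: tolerance is measured relative to the reference kLa
    kla_ok = abs(candidate_kla - reference_kla) <= kla_tolerance * reference_kla
    mixing_ok = candidate_mixing_time_s < max_mixing_time_s
    return kla_ok and mixing_ok


# Hypothetical characterization data: ID -> (kLa in 1/h, mixing time in seconds)
bioreactors = {
    "BR-101": (10.0, 22.0),
    "BR-102": (11.0, 25.0),
    "BR-201": (12.5, 24.0),  # kLa is 25% above the reference
    "BR-202": (10.5, 35.0),  # mixing time exceeds 30 s
}

reference_kla = bioreactors["BR-101"][0]
group = [
    name for name, (kla, mixing_time) in bioreactors.items()
    if same_matrix_group(reference_kla, kla, mixing_time)
]
print(group)
```

The point of the sketch is that membership in the group becomes a measured property of each vessel, so a unit that drifts out of the kLa or mixing-time window drops out of the matrix instead of staying in it by nameplate.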

Scientific Bracketing Strategy

The hypothesis-driven approach transforms bracketing from arbitrary extreme testing to mechanistically justified boundary evaluation:

Bracketing Hypothesis: If the process performs adequately under maximum metabolic demand conditions (highest cell density with minimum nutrient feeding rate) and minimum metabolic demand conditions (lowest cell density with maximum feeding rate), then all intermediate conditions will perform within acceptable ranges because metabolic stress is the primary driver of process failure.

This hypothesis can be tested and potentially falsified, providing genuine scientific basis for bracketing strategies rather than regulatory convenience.

Enhanced Validation Reports

Hypothesis-driven validation reports provide regulators with significantly more insight than traditional approaches:

Traditional Worst-Case Documentation: Three validation batches were executed under worst-case conditions (maximum and minimum parameter ranges). All batches met specifications, demonstrating process robustness.

Hypothesis-Driven Documentation: Process robustness was demonstrated through systematic testing of six specific hypotheses about failure mechanisms. Worst-case conditions were scientifically selected based on mechanistic understanding of metabolic stress, pH sensitivity, and product degradation pathways. Results confirm process operates reliably even under conditions that challenge the primary failure mechanisms.

Regulatory Submission Enhancement

The hypothesis-driven approach strengthens regulatory submissions by providing:

  1. Scientific Rationale: Clear explanation of worst-case selection criteria
  2. Predictive Capability: Evidence that process behavior can be predicted under untested conditions
  3. Risk Assessment: Quantitative understanding of failure probability under different scenarios
  4. Continuous Improvement: Framework for ongoing process optimization based on mechanistic understanding

Integration with Quality by Design (QbD) Principles

The hypothesis-driven approach to worst-case testing aligns perfectly with ICH Q8-Q11 Quality by Design principles while satisfying traditional validation requirements:

Design Space Verification

Instead of arbitrary worst-case testing, systematically verify design space boundaries through hypothesis testing:

Design Space Hypothesis: Operation anywhere within the defined design space (pH 6.95-7.10, Temperature 36-37°C, DO 35-50%) will result in product meeting CQA specifications with >95% confidence.

Worst-Case Verification: Test this hypothesis by deliberately operating at design space boundaries and measuring CQA response, providing scientific evidence for design space validity rather than compliance demonstration.
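One simple way to plan "operating at design space boundaries" is to enumerate every low/high corner of the stated ranges. The sketch below does that for the three parameters in the design space hypothesis. Treating the corners as a full set of boundary runs is my assumption, not a prescribed study design; a real plan would likely add center points and use prior knowledge to prune combinations.

```python
from itertools import product

# Design space ranges taken from the hypothesis above: (low, high)
design_space = {
    "pH": (6.95, 7.10),
    "temperature_C": (36.0, 37.0),
    "dissolved_oxygen_pct": (35.0, 50.0),
}


def boundary_runs(space):
    """Return every low/high corner of the design space as a list of dicts."""
    names = list(space)
    return [
        dict(zip(names, corner))
        for corner in product(*(space[name] for name in names))
    ]


runs = boundary_runs(design_space)
for number, run in enumerate(runs, start=1):
    print(number, run)
```

Three parameters at two levels give eight corner conditions, each of which is a concrete, pre-declared attempt to falsify the design space claim.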

Control Strategy Justification

Hypothesis-driven worst-case testing provides scientific justification for control strategy elements:

Traditional Control Strategy: pH must be controlled between 6.95-7.10 based on validation data.

Enhanced Control Strategy: pH must be controlled between 6.95-7.10 because validated hypotheses demonstrate that pH excursions above 7.15 for >8 hours increase acidic variants beyond specification limits, while pH below 6.90 reduces cell viability by >20% within 12 hours.

Scientific Rigor Enhances Regulatory Compliance

The hypothesis-driven approach to validation doesn’t circumvent worst-case testing requirements—it elevates them from compliance exercises to genuine scientific inquiry. By developing specific, testable hypotheses about what constitutes worst-case conditions and why, we satisfy regulatory expectations while building genuine process understanding that supports continuous improvement and regulatory flexibility.

This approach provides regulators with the scientific evidence they need to have confidence in process robustness while giving manufacturers the process understanding necessary for lifecycle management, change control, and optimization. The result is validation that serves both compliance and business objectives through enhanced scientific rigor rather than additional bureaucracy.

The integration of worst-case testing with hypothesis-driven validation represents the evolution of pharmaceutical process validation from documentation exercises toward genuine scientific methodology, an evolution that strengthens rather than weakens regulatory compliance while providing the process understanding necessary for 21st-century pharmaceutical manufacturing.

Statistical Process Control (SPC): Methodology, Tools, and Strategic Application

Statistical Process Control (SPC) is both a standalone methodology and a critical component of broader quality management systems. Rooted in statistical principles, SPC enables organizations to monitor, control, and improve processes by distinguishing between inherent (common-cause) and assignable (special-cause) variation. This blog post explores SPC’s role in modern quality strategies, control charts as its primary tools, and practical steps for implementation, while emphasizing its integration into holistic frameworks like Six Sigma and Quality by Design (QbD).

SPC as a Methodology and Its Strategic Integration

SPC serves as a core methodology for achieving process stability through statistical tools, but its true value emerges when embedded within larger quality systems. For instance:

  • Quality by Design (QbD): In pharmaceutical manufacturing, SPC aligns with QbD’s proactive approach, where critical process parameters (CPPs) and material attributes are predefined using risk assessment. Control charts monitor these parameters to ensure they remain within Normal Operating Ranges (NORs) and Proven Acceptable Ranges (PARs), safeguarding product quality.
  • Six Sigma: SPC tools like control charts are integral to the “Measure” and “Control” phases of the DMAIC (Define-Measure-Analyze-Improve-Control) framework. By reducing variability, SPC helps achieve Six Sigma’s goal of near-perfect processes.
  • Regulatory Compliance: In regulated industries, SPC supports Ongoing Process Verification (OPV) and lifecycle management. For example, the FDA’s Process Validation Guidance emphasizes SPC for maintaining validated states, requiring trend analysis of quality metrics like deviations and out-of-specification (OOS) results.

This integration ensures SPC is not just a technical tool but a strategic asset for continuous improvement and compliance.

When to Use Statistical Process Control

SPC is most effective in environments where process stability and variability reduction are critical. Below are key scenarios for its application:

High-Volume Manufacturing

In industries like automotive or electronics, where thousands of units are produced daily, SPC identifies shifts in process mean or variability early. For example, control charts for variables data (e.g., X-bar/R charts) monitor dimensions of machined parts, ensuring consistency across high-volume production runs. The ASTM E2587 standard highlights that SPC is particularly valuable when subgroup data (e.g., 20–25 subgroups) are available to establish reliable control limits.

Batch Processes with Critical Quality Attributes

In pharmaceuticals or food production, batch processes require strict adherence to specifications. Attribute control charts (e.g., p-charts for defect rates) track deviations or OOS results, while individual/moving range (I-MR) charts monitor parameters that yield a single measurement per batch.

Regulatory and Compliance Requirements

Regulated industries (e.g., pharmaceutical, medical devices, aerospace) use SPC to meet standards like ISO 9001 or ICH Q10. For instance, SPC’s role in Continuous Process Verification (CPV) ensures processes remain in a state of control post-validation. The FDA’s emphasis on data-driven decision-making aligns with SPC’s ability to provide evidence of process capability and stability.

Continuous Improvement Initiatives

SPC is indispensable in projects aimed at reducing waste and variation. By identifying special causes (e.g., equipment malfunctions, raw material inconsistencies), teams can implement corrective actions. Western Electric Rules applied to control charts detect subtle shifts, enabling root-cause analysis and preventive measures.

Early-Stage Process Development

During process design, SPC helps characterize variability and set realistic tolerances. Exponentially Weighted Moving Average (EWMA) charts detect small shifts in pilot-scale batches, informing scale-up decisions. ASTM E2587 notes that SPC is equally applicable to both early-stage development and mature processes, provided rational subgrouping is used.
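For readers who have not built one, here is a minimal sketch of an EWMA chart using the standard recursion z_i = λ·x_i + (1 − λ)·z_(i−1) with time-varying limits. The smoothing weight of 0.2, the limit width of 3, the target, sigma, and the pilot-scale values are all illustrative assumptions.

```python
import math


def ewma_chart(values, target, sigma, lam=0.2, width=3.0):
    """Return (ewma, lcl, ucl, signal) for each observation."""
    rows = []
    z = target  # the EWMA starts at the process target
    for i, x in enumerate(values, start=1):
        z = lam * x + (1 - lam) * z
        # Limits widen with i and approach width * sigma * sqrt(lam / (2 - lam))
        half_width = width * sigma * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * i))
        )
        signal = abs(z - target) > half_width
        rows.append((z, target - half_width, target + half_width, signal))
    return rows


# Hypothetical pilot-scale results: five batches on target, then a 1.5-sigma shift
values = [100.0] * 5 + [101.5] * 10
chart = ewma_chart(values, target=100.0, sigma=1.0)
for number, (z, lcl, ucl, signal) in enumerate(chart, start=1):
    print(f"{number:2d}  ewma={z:7.3f}  limits=({lcl:.3f}, {ucl:.3f})  signal={signal}")
```

No single value in this series is anywhere near a 3σ limit on an individuals chart, yet the accumulated evidence eventually pushes the EWMA outside its limits. That is the sense in which the chart detects small shifts.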

Supply Chain and Supplier Quality

SPC extends beyond internal processes to supplier quality management. c-charts or u-charts monitor defect rates from suppliers, ensuring incoming materials meet specifications.
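A u-chart for incoming lots can be sketched in a few lines. The limits use the standard form ū ± 3·sqrt(ū / n) for each lot, with the lower limit floored at zero; the defect counts and lot sizes below are hypothetical.

```python
import math


def u_chart(defects, units):
    """Return the average defects per unit and per-lot limits for a u-chart."""
    u_bar = sum(defects) / sum(units)
    rows = []
    for d, n in zip(defects, units):
        half_width = 3 * math.sqrt(u_bar / n)  # limits depend on each lot's size
        u = d / n
        lcl = max(0.0, u_bar - half_width)
        ucl = u_bar + half_width
        rows.append({"u": u, "lcl": lcl, "ucl": ucl,
                     "out_of_control": u < lcl or u > ucl})
    return u_bar, rows


# Hypothetical supplier data: defects found and units inspected per incoming lot
defects = [4, 6, 3, 5, 20, 4]
units = [100, 150, 80, 120, 100, 110]

u_bar, rows = u_chart(defects, units)
flagged = [i + 1 for i, row in enumerate(rows) if row["out_of_control"]]
print(f"u-bar = {u_bar:.4f}, lots flagged: {flagged}")
```

Because the lots differ in size, each one gets its own limits, which is the reason to choose a u-chart over a c-chart here.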

In all cases, SPC requires sufficient data (typically ≥20 subgroups) and a commitment to data-driven culture. It is less effective in one-off production or where measurement systems lack precision.

Control Charts: The Engine of SPC

Control charts are graphical tools that plot process data over time against statistically derived control limits. They serve two purposes:

  1. Monitor Stability: Detect shifts or trends indicating special causes.
  2. Drive Improvement: Provide data for root-cause analysis and corrective actions.

Types of Control Charts

Control charts are categorized by data type:

| Data Type | Chart Type | Use Case |
| --- | --- | --- |
| Variables (Continuous) | X-bar & R | Monitor process mean and variability (subgroups of 2–10). |
| Variables (Continuous) | X-bar & S | Similar to X-bar & R but uses standard deviation. |
| Variables (Continuous) | Individual & Moving Range (I-MR) | For single measurements (e.g., batch processes). |
| Attributes (Discrete) | p-chart | Proportion of defective units (variable subgroup size). |
| Attributes (Discrete) | np-chart | Number of defective units (fixed subgroup size). |
| Attributes (Discrete) | c-chart | Count of defects per unit (fixed inspection interval). |
| Attributes (Discrete) | u-chart | Defects per unit (variable inspection interval). |
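As a worked example of the first row, the sketch below computes X-bar and R chart limits from subgroup data using the standard Shewhart constants A2, D3, and D4. Only subgroup sizes 2 to 5 are tabulated here, and the three tiny subgroups are purely illustrative; a real chart would need the 20 or more subgroups discussed earlier.

```python
import statistics

# Standard Shewhart constants (A2, D3, D4) keyed by subgroup size
XBAR_R_CONSTANTS = {
    2: (1.880, 0.0, 3.267),
    3: (1.023, 0.0, 2.574),
    4: (0.729, 0.0, 2.282),
    5: (0.577, 0.0, 2.114),
}


def xbar_r_limits(subgroups):
    """Return center lines and control limits for X-bar and R charts."""
    n = len(subgroups[0])
    if any(len(s) != n for s in subgroups):
        raise ValueError("all subgroups must have the same size")
    a2, d3, d4 = XBAR_R_CONSTANTS[n]
    grand_mean = statistics.fmean(statistics.fmean(s) for s in subgroups)
    r_bar = statistics.fmean(max(s) - min(s) for s in subgroups)
    return {
        "xbar_center": grand_mean,
        "xbar_lcl": grand_mean - a2 * r_bar,
        "xbar_ucl": grand_mean + a2 * r_bar,
        "r_center": r_bar,
        "r_lcl": d3 * r_bar,
        "r_ucl": d4 * r_bar,
    }


# Hypothetical measurements: three subgroups of three parts each
subgroups = [[10, 12, 11], [11, 13, 12], [9, 11, 10]]
limits = xbar_r_limits(subgroups)
for name, value in limits.items():
    print(f"{name}: {value:.3f}")
```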

Decision Rules: Western Electric and Nelson Rules

Control charts become actionable when paired with decision rules to identify non-random variation:

Western Electric Rules

A process is out of control if:

  1. 1 point exceeds the 3σ limits.
  2. 2 of 3 consecutive points exceed 2σ on the same side of the center line.
  3. 4 of 5 consecutive points exceed 1σ on the same side of the center line.
  4. 8 consecutive points fall on the same side of the center line.

Nelson Rules

The Nelson rules expand detection to include:

  1. 6+ consecutive points trending.
  2. 14+ alternating points (up/down).
  3. 15 points within 1σ of the mean.

Note: Overusing rules increases false alarms; apply judiciously.
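A minimal sketch of the four Western Electric rules follows, assuming the center line and sigma have already been established from a stable baseline. Rule 4 is implemented in its standard form: eight consecutive points on one side of the center line. The function name and test series are hypothetical, and each rule is evaluated on the window of points ending at the current index.

```python
def western_electric_violations(values, center, sigma):
    """Return {index: [rule numbers]} for every point that completes a violation."""
    z = [(v - center) / sigma for v in values]

    def run_rule(i, window, needed, threshold):
        """True if `needed` of the last `window` points ending at index i lie
        beyond `threshold` sigma on the same side of the center line."""
        if i + 1 < window:
            return False
        recent = z[i + 1 - window:i + 1]
        above = sum(1 for p in recent if p > threshold)
        below = sum(1 for p in recent if p < -threshold)
        return above >= needed or below >= needed

    violations = {}
    for i in range(len(z)):
        hits = []
        if abs(z[i]) > 3:
            hits.append(1)  # Rule 1: one point beyond 3 sigma
        if run_rule(i, window=3, needed=2, threshold=2):
            hits.append(2)  # Rule 2: 2 of 3 beyond 2 sigma, same side
        if run_rule(i, window=5, needed=4, threshold=1):
            hits.append(3)  # Rule 3: 4 of 5 beyond 1 sigma, same side
        if run_rule(i, window=8, needed=8, threshold=0):
            hits.append(4)  # Rule 4: 8 in a row on one side of the center line
        if hits:
            violations[i] = hits
    return violations


# Hypothetical standardized series (center 0, sigma 1)
print(western_electric_violations([0.5, -0.3, 3.2, 0.1, -0.2], center=0.0, sigma=1.0))
print(western_electric_violations([0.2, 2.3, 0.4, 2.5, -0.1], center=0.0, sigma=1.0))
print(western_electric_violations([0.3] * 8, center=0.0, sigma=1.0))
```

Each additional rule adds its own false-alarm rate, which is the practical reason behind the caution about overusing them.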


SPC in Control Strategies and Trending

SPC is vital for maintaining validated states and continuous improvement:

  1. Control Strategy Integration:
  • Define Normal Operating Ranges (NORs) and Proven Acceptable Ranges (PARs) for CPPs.
  • Set alert limits (e.g., 2σ) and action limits (3σ) for KPIs like deviations or OOS results.
  2. Trending Practices:
  • Quarterly Reviews: Assess control charts for special causes.
  • Annual NOR Reviews: Re-evaluate limits after process changes.
  • CAPA Integration: Investigate trends and implement corrective actions.
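The alert and action limits above can be sketched as a simple classification against a baseline. This assumes the KPI is roughly normally distributed and that the baseline period was itself in control, which is a simplification for count data such as deviations; the function name and the quarterly counts are hypothetical.

```python
import statistics


def classify_against_limits(baseline, new_values, alert_k=2.0, action_k=3.0):
    """Label each new value as normal, alert (beyond 2 sigma) or action (beyond 3 sigma)."""
    mean = statistics.fmean(baseline)
    sigma = statistics.stdev(baseline)
    results = []
    for value in new_values:
        distance = abs(value - mean) / sigma  # distance from the mean in sigma units
        if distance > action_k:
            status = "action"
        elif distance > alert_k:
            status = "alert"
        else:
            status = "normal"
        results.append((value, status))
    return results


# Hypothetical deviations per quarter over a two-year baseline
baseline = [4, 5, 6, 5, 4, 6, 5, 5]
results = classify_against_limits(baseline, [5, 7, 8])
for value, status in results:
    print(value, status)
```

An alert should prompt a closer look at the next review; an action result is the kind of signal that feeds the CAPA integration step.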

Conclusion

SPC is a powerhouse methodology that thrives when embedded within broader quality systems. By aligning SPC with control strategies—through NORs, PARs, and structured trending—organizations achieve not just compliance, but excellence. Whether in pharmaceuticals, manufacturing, or beyond, SPC remains a timeless tool for mastering variability.

The Pre-Mortem

A pre-mortem is a proactive risk management exercise that enables pharmaceutical teams to anticipate and mitigate failures before they occur. This tool can transform compliance from a reactive checklist into a strategic asset for safeguarding product quality.


Pre-Mortems in Pharmaceutical Quality Systems

In GMP environments, where deviations in drug substance purity or drug product stability can cascade into global recalls, pre-mortems provide a structured framework to challenge assumptions. For example, a team developing a monoclonal antibody might hypothesize that aggregation occurred during drug substance purification due to inadequate temperature control in bioreactors. By contrast, a tablet manufacturing team might explore why dissolution specifications failed because of inconsistent API particle size distribution. These exercises align with ICH Q9’s requirement for systematic hazard analysis and ICH Q10’s emphasis on knowledge management, forcing teams to document tacit insights about process boundaries and failure modes.

Pre-mortems excel at identifying “unknown unknowns” through creative thinking. Their value lies in uncovering risks that traditional assessments miss. A pre-mortem is often best used to identify focus areas that warrant a more rigorous tool, such as an FMEA. In practice, the two are synergistic: a layered approach satisfies ICH Q9’s requirement for both creative hazard identification and structured risk evaluation, turning hypothetical failures into validated control strategies.

By combining pre-mortems’ exploratory power with FMEA’s rigor, teams can address both systemic and technical risks, ensuring compliance while advancing operational resilience.


Implementing Pre-Mortems

1. Scenario Definition and Stakeholder Engagement

Begin by framing the hypothetical failure as the risk question. For drug substances, this might involve declaring, “The API batch was rejected due to genotoxic impurity levels exceeding ICH M7 limits.” For drug products, consider, “Lyophilized vials failed sterility testing due to vial closure integrity breaches.” Assemble a team spanning technical operations, quality control, and regulatory affairs to ensure diverse viewpoints.

2. Failure Mode Elicitation

To overcome groupthink biases in traditional brainstorming, teams should begin with brainwriting—a silent, written idea-generation technique. The prompt is a request to list reasons behind the risk question, such as “List reasons why the API batch failed impurity specifications”. Participants anonymously write risks on structured templates for 10–15 minutes, ensuring all experts contribute equally.

The collected ideas are then synthesized into a fishbone (Ishikawa) diagram, categorizing causes into relevant branches using the 6M technique.

This method ensures comprehensive risk identification while maintaining traceability for regulatory audits.

3. Risk Prioritization and Control Strategy Development

Risks identified during the pre-mortem are evaluated using a severity-probability-detectability matrix, structured similarly to Failure Mode and Effects Analysis (FMEA).
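A minimal sketch of that prioritization, assuming the common FMEA convention of multiplying the three scores into a risk priority number (RPN). The 1 to 10 scales, the scores themselves, and the class name are hypothetical; the failure modes are borrowed from examples elsewhere in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    description: str
    severity: int       # 1 (negligible) to 10 (critical)
    probability: int    # 1 (remote) to 10 (frequent)
    detectability: int  # 1 (always detected) to 10 (undetectable)

    @property
    def rpn(self) -> int:
        """Risk priority number: severity x probability x detectability."""
        return self.severity * self.probability * self.detectability


def prioritize(modes):
    """Sort failure modes from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)


# Hypothetical scores for failure modes raised in a pre-mortem session
modes = [
    FailureMode("Blister packaging with insufficient moisture barrier", 6, 4, 3),
    FailureMode("Inadequate nitrogen sparging during final isolation", 8, 5, 6),
    FailureMode("Inconsistent API particle size distribution", 7, 3, 4),
]

ranked = prioritize(modes)
for mode in ranked:
    print(f"RPN {mode.rpn:3d}  {mode.description}")
```

The ranking is only as good as the scores behind it, so the useful output of the exercise is the discussion that produces them, not the multiplication.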

4. Integration into Pharmaceutical Quality Systems

Mitigation plans are formalized in control strategies and other mechanisms.


Case Study: Preventing Drug Substance Oxidation in a Small Molecule API

A company developing an oxidation-prone API conducted a pre-mortem anticipating discoloration and potency loss. The exercise revealed:

  • Drug substance risk: Inadequate nitrogen sparging during final isolation led to residual oxygen in crystallization vessels.
  • Drug product risk: Blister packaging with insufficient moisture barrier exacerbated degradation.

Mitigations included installing dissolved oxygen probes in purification tanks and switching to aluminum-foil blisters with desiccants. Process validation batches showed a 90% reduction in oxidation byproducts, avoiding a potential FDA Postmarketing Commitment.

Continuous Process Verification (CPV) Methodology and Tool Selection: A Framework Guided by FDA Process Validation

Continued Process Verification (CPV) represents the final and most dynamic stage of the FDA’s process validation lifecycle, designed to ensure manufacturing processes remain validated during routine production. The methodology for CPV and the selection of appropriate tools are deeply rooted in the FDA’s 2011 guidance, Process Validation: General Principles and Practices, which emphasizes a science- and risk-based approach to quality assurance. This blog post examines how CPV methodologies align with regulatory frameworks and how tools are selected to meet compliance and operational objectives.

3 stages of process validation, with CPV in green as the 3rd stage

CPV Methodology: Anchored in the FDA’s Lifecycle Approach

The FDA’s process validation framework divides activities into three stages: Process Design (Stage 1), Process Qualification (Stage 2), and Continued Process Verification (Stage 3). CPV, as Stage 3, is not an isolated activity but a continuation of the knowledge gained in earlier stages. This lifecycle approach is our framework.

Stage 1: Process Design

During Stage 1, manufacturers define Critical Quality Attributes (CQAs) and Critical Process Parameters (CPPs) through risk assessments and experimental design. This phase establishes the scientific basis for monitoring and control strategies. For example, if a parameter’s variability is inherently low (e.g., clustering near the Limit of Quantification, or LOQ), this knowledge informs later decisions about CPV tools.

Stage 2: Process Qualification

Stage 2 confirms that the process, when operated within established parameters, consistently produces quality products. Data from this stage—such as process capability indices (Cpk/Ppk)—provide baseline metrics for CPV. For instance, a high Cpk (>2) for a parameter near LOQ signals that traditional control charts may be inappropriate due to limited variability.

Stage 3: Continued Process Verification

CPV methodology is defined by two pillars:

  1. Ongoing Monitoring: Continuous collection and analysis of CPP/CQA data.
  2. Adaptive Control: Adjustments to maintain process control, informed by statistical and risk-based insights.

Regulatory agencies expect CPV methodologies to be tailored to the process’s unique characteristics. For example, a parameter with data clustered near LOQ (as in the case study below) demands a different approach than one with normal variability.

Selecting CPV Tools: Aligning with Data and Risk

The framework emphasizes that CPV tools must be scientifically justified, with selection criteria based on data suitability, risk criticality, and regulatory alignment.

Data Suitability Assessments

Data suitability assessments form the bedrock of effective Continued Process Verification (CPV) programs, ensuring that monitoring tools align with the statistical and analytical realities of the process. These assessments are not merely technical exercises but strategic activities rooted in regulatory expectations, scientific rigor, and risk management. Below, we explore the three pillars of data suitability (distribution analysis, process capability evaluation, and analytical performance considerations) and their implications for CPV tool selection.

The foundation of any statistical monitoring system lies in understanding the distribution of the data being analyzed. Many traditional tools, such as control charts, assume that data follows a normal (Gaussian) distribution. This assumption underpins the calculation of control limits (e.g., ±3σ) and the interpretation of rule violations. To validate this assumption, manufacturers employ tests such as the Shapiro-Wilk test or Anderson-Darling test, which quantitatively assess normality. Visual tools like Q-Q plots or histograms complement these tests by providing intuitive insights into data skewness, kurtosis, or clustering.
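
As a rough first screen before running a formal test, sample skewness and excess kurtosis can be computed with the standard library alone. This is a sketch, not a substitute for Shapiro-Wilk or Anderson-Darling; the function name and the example data are assumptions made for illustration.

```python
import statistics


def shape_summary(values):
    """Return (skewness, excess kurtosis) from population moments.

    Both are approximately 0 for normally distributed data.
    """
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


# Illustrative data: most results sit at a reporting floor, a few are higher
results = [0.05] * 18 + [0.12, 0.20]
skew, kurt = shape_summary(results)
print(f"skewness={skew:.2f}, excess kurtosis={kurt:.2f}")
```

The sign of the skewness shows which side the long tail is on, and values far from zero are a cue to look at a Q-Q plot and run a formal normality test before trusting ±3σ limits.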

When data deviates significantly from normality (common in parameters with values clustered near detection or quantification limits such as the LOQ), the use of parametric tools like control charts becomes problematic. For instance, a parameter with 95% of its data below the LOQ may exhibit a right-skewed distribution (a floor at the LOQ with a tail toward higher values), where the calculated mean and standard deviation are distorted by the analytical method’s noise rather than reflecting true process behavior. In such cases, traditional control charts generate misleading signals, such as Rule 1 violations (±3σ), which flag analytical variability rather than process shifts.
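
To make the Rule 1 mechanics concrete, here is a minimal sketch that sets limits at the baseline mean ± 3 standard deviations and flags later results outside them. It assumes limits come from a fixed baseline set and uses the overall sample standard deviation; individuals charts often estimate sigma from moving ranges instead. All values are made up.

```python
import statistics


def rule1_limits(baseline):
    """Return (lower, upper) control limits at mean +/- 3 sample std devs."""
    center = statistics.fmean(baseline)
    sigma = statistics.stdev(baseline)
    return center - 3 * sigma, center + 3 * sigma


def rule1_violations(baseline, new_results):
    """Return the new results that fall outside the baseline limits."""
    lower, upper = rule1_limits(baseline)
    return [x for x in new_results if x < lower or x > upper]


# Hypothetical impurity results (%), tightly clustered near a 0.05% LOQ
baseline = [0.050, 0.051, 0.050, 0.049, 0.050, 0.051, 0.049, 0.050]
new_results = [0.050, 0.054, 0.049]

print(rule1_limits(baseline))
print(rule1_violations(baseline, new_results))
```

With data this tightly clustered, a result only 0.004% above the baseline mean trips the rule, a difference that may be no larger than the method’s own noise this close to the LOQ.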

To address non-normal data, manufacturers must transition to non-parametric methods that do not rely on distributional assumptions. Tolerance intervals, which define ranges covering a specified proportion of the population with a given confidence level, are particularly useful for skewed datasets when computed in their distribution-free form. For example, a 95/99 tolerance interval (covering 95% of the population with 99% confidence) can replace ±3σ limits for non-normal data, reducing false positives. Bootstrapping, a resampling technique, offers another alternative, enabling robust estimation of control limits without assuming normality.
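
One distribution-free way to build such an interval uses order statistics: if the largest of n independent results is taken as an upper bound, the confidence that it covers a proportion p of the population is 1 - p^n. The sketch below finds the smallest n for a target coverage and confidence. It assumes independent, identically distributed results, and the function names are illustrative.

```python
import math


def confidence_of_max(n, coverage):
    """Confidence that the maximum of n results exceeds at least
    `coverage` of the population (one-sided, distribution-free)."""
    return 1.0 - coverage ** n


def min_samples_one_sided(coverage, confidence):
    """Smallest n for which the sample maximum is an upper tolerance bound
    with the requested coverage and confidence."""
    n = math.ceil(math.log(1.0 - confidence) / math.log(coverage))
    # Guard against floating-point rounding at the boundary
    while confidence_of_max(n, coverage) < confidence:
        n += 1
    return n


print(min_samples_one_sided(0.95, 0.95))  # 59
print(min_samples_one_sided(0.95, 0.99))  # 90
```

In other words, with fewer than 90 results the sample maximum alone cannot support a 95% coverage, 99% confidence claim without a distributional assumption, which is worth knowing before committing to this approach for a low-volume product.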

Process Capability: Aligning Tools with Inherent Variability

Process capability indices, such as Cp and Cpk, quantify a parameter’s ability to meet specifications relative to its natural variability. A high Cp (>2) indicates that the process variability is small compared to the specification range, often resulting from tight manufacturing controls or robust product designs. While high capability is desirable for quality, it complicates CPV tool selection. For example, a parameter with a Cp of 3 and data clustered near the LOQ will exhibit minimal variability, rendering control charts ineffective. The narrow spread of data means that control limits shrink, increasing the likelihood of false alarms from minor analytical noise.
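
For reference, a short sketch of the two indices for a parameter with two-sided specifications. It uses the overall sample standard deviation, which some texts would label Pp/Ppk rather than Cp/Cpk; the function name, data, and specification limits are hypothetical.

```python
import statistics


def capability(values, lsl, usl):
    """Return (Cp, Cpk) from sample data and two-sided specification limits."""
    mean = statistics.fmean(values)
    sigma = statistics.stdev(values)
    cp = (usl - lsl) / (6 * sigma)                   # spec width vs. process spread
    cpk = min(usl - mean, mean - lsl) / (3 * sigma)  # penalizes off-center processes
    return cp, cpk


# Hypothetical assay results (% label claim) against a 95.0 to 105.0 spec
assay = [99.8, 100.1, 100.0, 99.9, 100.2, 100.0, 99.9, 100.1]
cp, cpk = capability(assay, lsl=95.0, usl=105.0)
print(f"Cp={cp:.2f}, Cpk={cpk:.2f}")
```

When both values land far above 2, as they do for this made-up dataset, the control limits derived from the same sigma become very tight relative to the specification range, which is the situation the paragraph above describes.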

In such scenarios, traditional SPC tools like control charts lose their utility. Instead, manufacturers should adopt attribute-based monitoring or batch-wise trending. Attribute-based approaches classify results as pass/fail against predefined thresholds (e.g., LOQ breaches), simplifying signal interpretation. Batch-wise trending aggregates data across production lots, identifying shifts over time without overreacting to individual outliers. For instance, a manufacturer with a high-capability dissolution parameter might track the percentage of batches meeting dissolution criteria monthly, rather than plotting individual tablet results.

The FDA’s emphasis on risk-based monitoring further supports this shift. ICH Q9 guidelines encourage manufacturers to prioritize resources for high-risk parameters, allowing low-risk, high-capability parameters to be monitored with simpler tools. This approach reduces administrative burden while maintaining compliance.

Analytical Performance: Decoupling Noise from Process Signals

Parameters operating near analytical limits of detection (LOD) or quantification (LOQ) present unique challenges. At these extremes, measurement systems contribute significant variability, often overshadowing true process signals. For example, a purity assay with an LOQ of 0.1% may report values as “<0.1%” for 98% of batches, creating a dataset dominated by the analytical method’s imprecision. In such cases, failing to decouple analytical variability from process performance leads to misguided investigations and wasted resources.

To address this, manufacturers must isolate analytical variability through dedicated method monitoring programs. This involves:

  1. Analytical Method Validation: Rigorous characterization of precision, accuracy, and detection capabilities (e.g., determining the Practical Quantitation Limit, or PQL, which reflects real-world method performance).
  2. Separate Trending: Implementing control charts or capability analyses for the analytical method itself (e.g., monitoring LOQ stability across batches).
  3. Threshold-Based Alerts: Replacing statistical rules with binary triggers (e.g., investigating only results above LOQ).

For example, a manufacturer analyzing residual solvents near the LOQ might use detection capability indices to set action limits. If the analytical method’s variability (e.g., ±0.02% at LOQ) exceeds the process variability, threshold alerts focused on detecting values above 0.1% + 3σ_analytical would provide more meaningful signals than traditional control charts.
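
The threshold rule in this example can be sketched in a few lines. The sketch treats the ±0.02% figure as one analytical standard deviation, which is an assumption, and represents results reported as below LOQ with `None`; the function names are illustrative.

```python
def action_limit(loq, sigma_analytical, k=3.0):
    """Action limit set at LOQ + k * analytical standard deviation."""
    return loq + k * sigma_analytical


def needs_investigation(result, loq, sigma_analytical):
    """True when a reported result exceeds the action limit.

    `result` is None when the lab reports the value as below LOQ.
    """
    if result is None:
        return False
    return result > action_limit(loq, sigma_analytical)


# Values from the example above: LOQ 0.1%, analytical sigma taken as 0.02%
LOQ = 0.10
SIGMA = 0.02

for reported in [None, 0.12, 0.15, 0.19]:
    print(reported, needs_investigation(reported, LOQ, SIGMA))
```

Only the 0.19% result exceeds the 0.16% action limit here; the 0.12% and 0.15% results are quantifiable but fall within what the method’s noise alone could produce.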

Integration with Regulatory Expectations

Regulatory agencies, including the FDA and EMA, mandate that CPV methodologies be “scientifically sound” and “statistically valid” (FDA 2011 Guidance). This requires documented justification for tool selection, including:

  • Normality Testing: Evidence that data distribution aligns with tool assumptions (e.g., Shapiro-Wilk test results).
  • Capability Analysis: Cp/Cpk values demonstrating the rationale for simplified monitoring.
  • Analytical Validation Data: Method performance metrics justifying decoupling strategies.

A 2024 FDA warning letter highlighted the consequences of neglecting these steps. A firm using control charts for non-normal dissolution data received a 483 observation for lacking statistical rationale, underscoring the need for rigor in data suitability assessments.

Case Study Application:
A manufacturer monitoring a CQA with 98% of data below LOQ initially used control charts, triggering frequent Rule 1 violations (±3σ). These violations reflected analytical noise, not process shifts. Transitioning to threshold-based alerts (investigating only LOQ breaches) reduced false positives by 72% while maintaining compliance.

Risk-Based Tool Selection

The ICH Q9 Quality Risk Management (QRM) framework provides a structured methodology for identifying, assessing, and controlling risks to pharmaceutical product quality, with a strong emphasis on aligning tool selection with the parameter’s impact on patient safety and product efficacy. Central to this approach is the principle that the rigor of risk management activities—including the selection of tools—should be proportionate to the criticality of the parameter under evaluation. This ensures resources are allocated efficiently, focusing on high-impact risks while avoiding overburdening low-risk areas.

Prioritizing Tools Through the Lens of Risk Impact

The ICH Q9 framework categorizes risks based on their potential to compromise product quality, guided by factors such as severity, detectability, and probability. Parameters with a direct impact on critical quality attributes (CQAs)—such as potency, purity, or sterility—are classified as high-risk and demand robust analytical tools. Conversely, parameters with minimal impact may require simpler methods. For example:

  • High-Impact Parameters: Use Failure Mode and Effects Analysis (FMEA) or Fault Tree Analysis (FTA) to dissect failure modes, root causes, and mitigation strategies.
  • Medium-Impact Parameters: Apply a tool such as a Preliminary Hazard Analysis (PHA).
  • Low-Impact Parameters: Utilize checklists or flowcharts for basic risk identification.

This tiered approach ensures that the complexity of the tool matches the parameter’s risk profile. Three factors (importance, complexity, and uncertainty, abbreviated ICU) determine that profile:

  1. Importance: The parameter’s criticality to patient safety or product efficacy.
  2. Complexity: The interdependencies of the system or process being assessed.
  3. Uncertainty: Gaps in knowledge about the parameter’s behavior or controls.

For instance, a high-purity active pharmaceutical ingredient (API) with narrow specification limits (high importance) and variable raw material inputs (high complexity) would necessitate FMEA to map failure modes across the supply chain. In contrast, a non-critical excipient with stable sourcing (low uncertainty) might only require a simplified risk ranking matrix.
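
One way to keep such choices consistent across a site is to encode the tiers as a lookup. The sketch below takes the tool lists from the tiers above; the escalation rule (high complexity or high uncertainty moves a parameter up one tier) is an assumption added for illustration and is not taken from ICH Q9.

```python
from enum import Enum


class Impact(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


# Tool suggestions taken from the impact tiers listed above
TOOLS_BY_IMPACT = {
    Impact.HIGH: ["FMEA", "FTA"],
    Impact.MEDIUM: ["PHA"],
    Impact.LOW: ["checklist", "flowchart"],
}


def suggest_tools(impact, high_complexity=False, high_uncertainty=False):
    """Suggest risk tools for a parameter.

    Assumed rule (not from ICH Q9): high complexity or high uncertainty
    moves a parameter up one tier before the lookup.
    """
    order = [Impact.LOW, Impact.MEDIUM, Impact.HIGH]
    idx = order.index(impact)
    if high_complexity or high_uncertainty:
        idx = min(idx + 1, len(order) - 1)
    return TOOLS_BY_IMPACT[order[idx]]


print(suggest_tools(Impact.LOW))
print(suggest_tools(Impact.MEDIUM, high_complexity=True))
```

A lookup like this does not replace judgment, but it makes the default explicit so that deviations from it have to be justified in writing.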

Implementing a Risk-Based Approach

1. Assess Parameter Criticality

Begin by categorizing parameters based on their impact on CQAs, as defined during Stage 1 (Process Design) of the FDA’s validation lifecycle. Parameters are classified as:

  • Critical: Directly affecting safety/efficacy
  • Key: Influencing quality but not directly linked to safety
  • Non-Critical: No measurable impact on quality

This classification informs the depth of risk assessment and tool selection.

2. Select Tools Using the ICU Framework

  • Importance-Driven Tools: High-importance parameters warrant tools that quantify risk severity and detectability. FMEA is ideal for linking failure modes to patient harm, while Statistical Process Control (SPC) charts monitor real-time variability.
  • Complexity-Driven Tools: For multi-step processes (e.g., bioreactor operations), HACCP identifies critical control points, while Ishikawa diagrams map cause-effect relationships.
  • Uncertainty-Driven Tools: Parameters with limited historical data (e.g., novel drug formulations) benefit from Bayesian statistical models or Monte Carlo simulations to address knowledge gaps.

3. Document and Justify Tool Selection

Regulatory agencies require documented rationale for tool choices. For example, a firm using FMEA for a high-risk sterilization process must reference its ability to evaluate worst-case scenarios and prioritize mitigations. This documentation is typically embedded in Quality Risk Management (QRM) Plans or validation protocols.

Integration with Living Risk Assessments

Living risk assessments are dynamic, evolving documents that reflect real-time process knowledge and data. Unlike static, ad-hoc assessments, they are continually updated through:

1. Ongoing Data Integration

Data from Continued Process Verification (CPV), such as trend analyses of CPPs/CQAs, feeds directly into living risk assessments. For example, shifts in fermentation yield detected via SPC charts trigger updates to bioreactor risk profiles, prompting tool adjustments (e.g., upgrading from checklists to FMEA).

2. Periodic Review Cycles

Living assessments undergo scheduled reviews (e.g., biannually) and event-driven updates (e.g., post-deviation). A QRM Master Plan, as outlined in ICH Q9(R1), orchestrates these reviews by mapping assessment frequencies to parameter criticality. High-impact parameters may be reviewed quarterly, while low-impact ones are assessed annually.
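
A review calendar of this kind is easy to generate. In the sketch below the quarterly and annual intervals follow the text above, while the six-month interval for medium-impact parameters is an assumption; the function and dictionary names are illustrative.

```python
import calendar
from datetime import date

# Review intervals in months. High and low follow the text above;
# the medium value is an assumption for illustration.
REVIEW_INTERVAL_MONTHS = {"high": 3, "medium": 6, "low": 12}


def add_months(start, months):
    """Return `start` shifted forward by whole months, clamping the day
    to the last day of the target month when needed."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def next_review(last_review, impact):
    """Next scheduled review date for a parameter of the given impact tier."""
    return add_months(last_review, REVIEW_INTERVAL_MONTHS[impact])


print(next_review(date(2025, 1, 31), "high"))  # 2025-04-30
print(next_review(date(2025, 6, 15), "low"))   # 2026-06-15
```

Event-driven updates, such as a post-deviation review, would reset `last_review` and therefore the whole schedule for that parameter.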

3. Cross-Functional Collaboration

Quality, manufacturing, and regulatory teams collaborate to interpret CPV data and update risk controls. For instance, a rise in particulate matter in vials (detected via CPV) prompts a joint review of filling line risk assessments, potentially revising tooling from HACCP to FMEA to address newly identified failure modes.

Regulatory Expectations and Compliance

Regulatory agencies require documented justification for CPV tool selection, emphasizing:

  • Protocol Preapproval: CPV plans must be submitted during Stage 2, detailing tool selection criteria.
  • Change Control: Transitions between tools (e.g., SPC → thresholds) require risk assessments and documentation.
  • Training: Staff must be proficient in both traditional (e.g., Shewhart charts) and modern tools (e.g., AI).

A 2024 FDA warning letter cited a firm for using control charts on non-normal data without validation, underscoring the consequences of poor tool alignment.

A Framework for Adaptive Excellence

The FDA’s CPV framework is not prescriptive but principles-based, allowing flexibility in methodology and tool selection. Successful implementation hinges on:

  1. Science-Driven Decisions: Align tools with data characteristics and process capability.
  2. Risk-Based Prioritization: Focus resources on high-impact parameters.
  3. Regulatory Agility: Justify tool choices through documented risk assessments and lifecycle data.

CPV is a living system that must evolve alongside processes, leveraging tools that balance compliance with operational pragmatism. By anchoring decisions in the FDA’s lifecycle approach, manufacturers can transform CPV from a regulatory obligation into a strategic asset for quality excellence.