A 2025 Retrospective for Investigations of a Dog

If the history of pharmaceutical quality management were written as a geological timeline, 2025 would hopefully mark the end of the Holocene of Compliance—a long, stable epoch where “following the procedure” was sufficient to ensure survival—and the beginning of the Anthropocene of Complexity.

For decades, our industry has operated under a tacit social contract. We agreed to pretend that “compliance” was synonymous with “quality.” We agreed to pretend that a validated method would work forever because we proved it worked once in a controlled protocol three years ago. We agreed to pretend that “zero deviations” meant “perfect performance,” rather than “blind surveillance.” We agreed to pretend that if we wrote enough documents, reality would conform to them.

If I had my wish, 2025 would be the year that contract finally dissolved.

Throughout the year—across dozens of posts, technical analyses, and industry critiques on this blog—I have tried to dismantle the comfortable illusions of “Compliance Theater” and show how this theater collides violently with the unforgiving reality of complex systems.

The connecting thread running through every one of these developments is the concept I have returned to obsessively this year: Falsifiable Quality.

This Year in Review is not merely a summary of blog posts. It is an attempt to synthesize the fragmented lessons of 2025 into a coherent argument. The argument is this: A quality system that cannot be proven wrong is a quality system that cannot be trusted.

If our systems—our validation protocols, our risk assessments, our environmental monitoring programs—are designed only to confirm what we hope is true (the “Happy Path”), they are not quality systems at all. They are comfort blankets. And 2025 was the year we finally started pulling the blanket off.

The Philosophy of Doubt

(Reflecting on: The Effectiveness Paradox, Sidney Dekker, and Gerd Gigerenzer)

Before we dissect the technical failures of 2025, let me first establish the philosophical framework that defined this year’s analysis.

In August, I published “The Effectiveness Paradox: Why ‘Nothing Bad Happened’ Doesn’t Prove Your Quality System Works.” It became one of the most discussed posts of the year because it attacked the most sacred metric in our industry: the trend line that stays flat.

We are conditioned to view stability as success. If Environmental Monitoring (EM) data shows zero excursions for six months, we throw a pizza party. If a method validation passes all acceptance criteria on the first try, we commend the development team. If a year goes by with no Critical deviations, we pay out bonuses.

But through the lens of Falsifiable Quality—a concept heavily influenced by the philosophy of Karl Popper, the challenging insights of Deming, and the safety science of Sidney Dekker, whom we discussed in November—these “successes” look suspiciously like failures of inquiry.

The Problem with Unfalsifiable Systems

Karl Popper famously argued that a scientific theory is only valid if it makes predictions that can be tested and proven false. “All swans are white” is a scientific statement because finding one black swan falsifies it. “God is love” is not, because no empirical observation can disprove it.

In 2025, I argued that most Pharmaceutical Quality Systems (PQS) are designed to be unfalsifiable.

  • The Unfalsifiable Alert Limit: We set alert limits based on historical averages + 3 standard deviations. This ensures that we only react to statistical outliers, effectively blinding us to gradual drift or systemic degradation that remains “within the noise.”
  • The Unfalsifiable Robustness Study: We design validation protocols that test parameters we already know are safe (e.g., pH +/- 0.1), avoiding the “cliff edges” where the method actually fails. We prove the method works where it works, rather than finding where it breaks.
  • The Unfalsifiable Risk Assessment: We write FMEAs where the conclusion (“The risk is acceptable”) is decided in advance, and the RPN scores are reverse-engineered to justify it.
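The alert-limit problem can be made concrete with a small simulation. This is an illustrative sketch with invented CFU numbers, not real EM data: a classic mean + 3σ alert limit reacts late or not at all to a slow upward drift, while a CUSUM statistic, which accumulates small deviations from the historical mean, flags it early.

```python
import numpy as np

rng = np.random.default_rng(7)

# Historical baseline: stable EM counts (illustrative CFU per plate)
baseline = rng.normal(loc=5.0, scale=1.5, size=100)
alert_limit = baseline.mean() + 3 * baseline.std(ddof=1)

# New data: gradual upward drift that mostly stays "within the noise"
drift = rng.normal(loc=5.0, scale=1.5, size=60) + np.linspace(0, 3.0, 60)

# Shewhart-style alert limit: reacts only to individual outliers
shewhart_flags = int((drift > alert_limit).sum())

# CUSUM: accumulates small shifts above the historical mean
k = 0.5 * baseline.std(ddof=1)   # allowance (half a sigma)
h = 5.0 * baseline.std(ddof=1)   # decision interval
cusum, flagged_at = 0.0, None
for i, x in enumerate(drift):
    cusum = max(0.0, cusum + (x - baseline.mean()) - k)
    if cusum > h and flagged_at is None:
        flagged_at = i

print(f"alert limit: {alert_limit:.2f} CFU")
print(f"points above alert limit: {shewhart_flags}")
print(f"CUSUM flags drift at sample index: {flagged_at}")
```

The point is not the specific statistic; it is that a system built only on outlier limits cannot be falsified by drift, because drift never produces the one data point the system is designed to see.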

This is “Safety Theater,” a term Dekker uses to describe the rituals organizations perform to look safe rather than be safe.

Safety-I vs. Safety-II

In November’s post Sidney Dekker: The Safety Scientist Who Influences How I Think About Quality, I explored Dekker’s distinction between Safety-I (minimizing things that go wrong) and Safety-II (understanding how things usually go right).

Traditional Quality Assurance is obsessed with Safety-I. We count deviations. We count OOS results. We count complaints. When those counts are low, we assume the system is healthy.
But as the LeMaitre Vascular warning letter showed us this year (discussed in Part III), a system can have “zero deviations” simply because it has stopped looking for them. LeMaitre had excellent water data—because they were cleaning the valves before they sampled them. They were measuring their ritual, not their water.

Falsifiable Quality is the bridge to Safety-II. It demands that we treat every batch record not as a compliance artifact, but as a hypothesis test.

  • Hypothesis: “The contamination control strategy is effective.”
  • Test: Aggressive monitoring in worst-case locations, not just the “representative” center of the room.
  • Result: If we find nothing, the hypothesis survives another day. If we find something, we have successfully falsified the hypothesis—which is a good thing because it reveals reality.

The shift from “fearing the deviation” to “seeking the falsification” is a cultural pivot point of 2025.

The Epistemological Crisis in the Lab (Method Validation)

(Reflecting on: USP <1225>, Method Qualification vs. Validation, and Lifecycle Management)

Nowhere was the battle for Falsifiable Quality fought more fiercely in 2025 than in the analytical laboratory.

The proposed revision to USP <1225> Validation of Compendial Procedures (published in Pharmacopeial Forum 51(6)) arrived late in the year, but it serves as the perfect capstone to the arguments I’ve been making since January.

For forty years, analytical validation has been the ultimate exercise in “Validation as an Event.” You develop a method. You write a protocol. You execute the protocol over three days with your best analyst and fresh reagents. You print the report. You bind it. You never look at it again.

This model is unfalsifiable. It assumes that because the method worked in the “Work-as-Imagined” conditions of the validation study, it will work in the “Work-as-Done” reality of routine QC for the next decade.

The Reportable Result: Validating Decisions, Not Signals

The revised USP <1225>—aligned with ICH Q14 (Analytical Procedure Development) and USP <1220> (The Lifecycle Approach)—destroys this assumption. It introduces concepts that force falsifiability into the lab.

The most critical of these is the Reportable Result.

Historically, we validated “the instrument” or “the measurement.” We proved that the HPLC could inject the same sample ten times with < 1.0% RSD.

But the Reportable Result is the final value used for decision-making—the value that appears on the Certificate of Analysis. It is the product of a complex chain: Sampling -> Transport -> Storage -> Preparation -> Dilution -> Injection -> Integration -> Calculation -> Averaging.

Validating the injection precision (the end of the chain) tells us nothing about the sampling variability (the beginning of the chain).
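A quick way to see why injection precision says little about the reportable result is to propagate variance along the chain. Independent stages add in variance, not in standard deviation. The stage SDs below are purely hypothetical:

```python
import math

# Hypothetical stage standard deviations (% of label claim) contributing
# to one reportable result -- illustrative numbers, not from any real method.
stages = {
    "sampling":    1.2,
    "preparation": 0.8,
    "dilution":    0.4,
    "injection":   0.3,   # the only stage a classic precision study sees
}

# For independent stages, variances add; standard deviations do not.
total_sd = math.sqrt(sum(sd**2 for sd in stages.values()))

injection_share = stages["injection"]**2 / total_sd**2

print(f"total SD of the reportable result: {total_sd:.2f}%")
print(f"share of variance explained by injection precision: {injection_share:.1%}")
```

With these (invented) numbers, injection precision explains under 5% of the total variance of the reportable result—which is exactly why a beautiful system-suitability RSD can coexist with untrustworthy Certificates of Analysis.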

By shifting focus to the Reportable Result, USP <1225> forces us to ask: “Does this method generate decisions we can trust?”

The Replication Strategy: Validating “Work-as-Done”

The new guidance insists that validation must mimic the replication strategy of routine testing.
If your SOP says “We report the average of 3 independent preparations,” then your validation must evaluate the precision and accuracy of that average, not of the individual preparations.

This seems subtle, but it is revolutionary. It prevents the common trick of “averaging away” variability during validation to pass the criteria, only to face OOS results in routine production because the routine procedure doesn’t use the same averaging scheme.

It forces the validation study to mirror the messy reality of the “Work-as-Done,” making the validation data a falsifiable predictor of routine performance, rather than a theoretical maximum capability.
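To see how an averaging mismatch hides variability, here is a small sketch (all numbers hypothetical) comparing the out-of-specification risk implied by validating the mean of three preparations against the risk of routinely reporting single preparations:

```python
from statistics import NormalDist

prep_sd = 1.5                 # between-preparation SD, % label claim (illustrative)
true_mean = 99.0
spec_lo, spec_hi = 95.0, 105.0

def oos_probability(sd: float) -> float:
    """Probability that a reportable result falls outside specification."""
    d = NormalDist(mu=true_mean, sigma=sd)
    return d.cdf(spec_lo) + (1 - d.cdf(spec_hi))

# Validation averaged n=3 preparations; routine reports single preparations.
sd_of_mean_of_3 = prep_sd / 3**0.5

validation_risk = oos_probability(sd_of_mean_of_3)
routine_risk = oos_probability(prep_sd)

print(f"OOS risk implied by validation (mean of 3): {validation_risk:.5%}")
print(f"OOS risk in routine use (single prep):      {routine_risk:.5%}")
```

Averaging shrinks the SD by √3, so the validation study can look comfortably capable while the routine replication scheme carries orders of magnitude more OOS risk—the “averaging away” trick in miniature.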

Method Qualification vs. Validation: The June Distinction

In June, I wrote “Method Qualification and Validation,” clarifying a distinction that often confuses the industry.

  • Qualification is the “discovery phase” where we explore the method’s limits. It is inherently falsifiable—we want to find where the method breaks.
  • Validation has traditionally been the “confirmation phase” where we prove it works.

The danger, as I noted in that post, is when we skip the falsifiable Qualification step and go straight to Validation. We write the protocol based on hope, not data.

USP <1225> essentially argues that Validation must retain the falsifiable spirit of Qualification. It is not a coronation; it is a stress test.

The Death of “Method Transfer” as We Know It

In a Falsifiable Quality system, a method is never “done.” The Analytical Target Profile (ATP)—a concept from ICH Q14 that permeates the new thinking—is a standing hypothesis: “This method measures Potency within +/- 2%.”

Every time we run a system suitability check, every time we run a control standard, we are testing that hypothesis.

If the method starts drifting—even if it still passes broad system suitability limits—a falsifiable system flags the drift. An unfalsifiable system waits for the OOS.

The draft revision of USP <1225> is a call to arms. It asks us to stop treating validation as a “ticket to ride”—a one-time toll we pay to enter GMP compliance—and start treating it as a “ticket to doubt.” Validation gives us permission to use the method, but only as long as the data continues to support the hypothesis of fitness.

The Reality Check (The “Unholy Trinity” of Warning Letters)

Philosophy and guidelines are fine, but in 2025, reality kicked in the door. The regulatory year was defined by three critical warning letters—Sanofi, LeMaitre, and Rechon—that collectively dismantled the industry’s illusions of control.

It began, as these things often do, with a ghost from the past.

Sanofi Framingham: The Pendulum Swings Back

(Reflecting on: Failure to Investigate Critical Deviations and The Sanofi Warning Letter)

The year opened with a shock. On January 15, 2025, the FDA issued a warning letter to Sanofi’s Framingham facility—the sister site to the legacy Genzyme Allston Landing facility, whose consent decree defined an entire generation of biotech compliance—and much of my career.

In my January analysis (Failure to Investigate Critical Deviations: A Cautionary Tale), I noted that the FDA’s primary citation was a failure to “thoroughly investigate any unexplained discrepancy.”

This is the cardinal sin of Falsifiable Quality.

An “unexplained discrepancy” is a signal from reality. It is the system telling you, “Your hypothesis about this process is wrong.”

  • The Falsifiable Response: You dive into the discrepancy. You assume your control strategy missed something. You use Causal Reasoning (the topic of my May post) to find the mechanism of failure.
  • The Sanofi Response: As the warning letter detailed, they frequently attributed failures to “isolated incidents” or superficial causes without genuine evidence.

This is the “Refusal to Falsify.” By failing to investigate thoroughly, the firm protects the comfortable status quo. They choose to believe the “Happy Path” (the process is robust) over the evidence (the discrepancy).

The Pendulum of Compliance

In my companion post (“Sanofi Warning Letter”), I discussed the “pendulum of compliance.” The Framingham site was supposed to be the fortress of quality, built on the lessons of the Genzyme crisis.

The failure at Sanofi wasn’t a lack of SOPs; it was a lack of curiosity.

The investigators likely had checklists, templates, and timelines (Compliance Theater), but they lacked the mandate—or perhaps the expertise—to actually solve the problem.

This set the thematic stage for the rest of 2025. Sanofi showed us that “closing the deviation” is not the same as fixing the problem. This insight led directly into my August argument in The Effectiveness Paradox: You can close 100% of your deviations on time and still have a manufacturing process that is spinning out of control.

If Sanofi was the failure of investigation (looking back), Rechon and LeMaitre were failures of surveillance (looking forward). Together, they form a complete picture of why unfalsifiable systems fail.

Reflecting on: Rechon Life Science and LeMaitre Vascular

Two warning letters—LeMaitre Vascular (August) and Rechon Life Science (September)—provided brutal case studies in what happens when “representative sampling” is treated as a buzzword rather than a statistical requirement.

Rechon Life Science: The Map vs. The Territory

The Rechon Life Science warning letter was one of the most significant regulatory signals of 2025 for sterile manufacturing. It wasn’t just a list of observations; it was an indictment of unfalsifiable Contamination Control Strategies (CCS).

We spent 2023 and 2024 writing massive CCS documents to satisfy Annex 1. Hundreds of pages detailing airflows, gowning procedures, and material flows. We felt good about them. We felt “compliant.”

Then the FDA walked into Rechon and essentially asked: “If your CCS is so good, why does your smoke study show turbulence over the open vials?”

The warning letter highlighted a disconnect I’ve called “The Map vs. The Territory.”

  • The Map: The CCS document says the airflow is unidirectional and protects the product.
  • The Territory: The smoke study video shows air eddying backward from the operator to the sterile core.

In an unfalsifiable system, we ignore the smoke study (or film it from a flattering angle) because it contradicts the CCS. We prioritize the documentation (the claim) over the observation (the evidence).

In a falsifiable system, the smoke study is the test. If the smoke shows turbulence, the CCS is falsified. We don’t defend the CCS; we rewrite it. We redesign the line.

The FDA’s critique of Rechon’s “dynamic airflow visualization” was devastating because it showed that Rechon was using the smoke study as a marketing video, not a diagnostic tool. They filmed “representative” operations that were carefully choreographed to look clean, rather than the messy reality of interventions.

LeMaitre Vascular: The Sin of “Aspirational Data”

If Rechon was about air, LeMaitre Vascular (analyzed in my August post When Water Systems Fail) was about water. And it contained an even more egregious sin against falsifiability.

The FDA observed that LeMaitre’s water sampling procedures required cleaning and purging the sample valves before taking the sample.

Let’s pause and consider the epistemology of this.

  • The Goal: To measure the quality of the water used in manufacturing.
  • The Reality: Manufacturing operators do not purge and sanitize the valve for 10 minutes before filling the tank. They open the valve and use the water.
  • The Sample: By sanitizing the valve before sampling, LeMaitre was measuring the quality of the sampling process, not the quality of the water system.

I call this “Aspirational Data.” It is data that reflects the system as we wish it existed, not as it actually exists. It is the ultimate unfalsifiable metric. You can never find biofilm in a valve if you scrub the valve with alcohol before you open it.

The FDA’s warning letter was clear: “Sampling… must include any pathway that the water travels to reach the process.”

LeMaitre also performed an unauthorized “Sterilant Switcheroo,” changing their sanitization agent without change control or biocompatibility assessment. This is the hallmark of an unfalsifiable culture: making changes based on convenience, assuming they are safe, and never designing the study to check if that assumption is wrong.

The “Representative” Trap

Both warning letters pivot on the misuse of the word “representative.”

Firms love to claim their EM sampling locations are “representative.” But representative of what? Usually, they are representative of the average condition of the room—the clean, empty spaces where nothing happens.

But contamination is not an “average” event. It is a specific, localized failure. A falsifiable EM program places probes in the “worst-case” locations—near the door, near the operator’s hands, near the crimping station. It tries to find contamination. It tries to falsify the claim that the zone is sterile, aseptic, or bioburden-reduced.

When Rechon and LeMaitre failed to justify their sampling locations, they were guilty of designing an unfalsifiable experiment. They placed the “microscope” where they knew they wouldn’t find germs.
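The arithmetic of this is unforgiving. As a toy model (purely hypothetical numbers), suppose a localized source sheds into only a small fraction of possible sample locations; the probability of ever detecting it depends entirely on where you point the “microscope”:

```python
def detection_probability(hit_fraction: float, n_samples: int) -> float:
    """P(at least one sample lands where contamination is present),
    assuming independent samples and a source covering hit_fraction
    of candidate locations."""
    return 1 - (1 - hit_fraction) ** n_samples

# "Representative" plan: average-of-room locations, source covers ~2% of them.
# Worst-case plan: targeted locations where the source is 10x more likely.
for label, frac in [("average-of-room plan", 0.02), ("worst-case plan", 0.20)]:
    print(f"{label}: P(detect) with 4 plates = {detection_probability(frac, 4):.1%}")
```

Under these assumptions the average-of-room plan detects the source less than one time in ten; the worst-case plan detects it more often than not. Same room, same microbe—the difference is whether the experiment was designed to be falsifiable.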

2025 taught us that regulators are no longer impressed by the thickness of the CCS binder. They are looking for the logic of control. They are testing your hypothesis. And if you haven’t tested it yourself, you will fail.

The Investigation as Evidence

(Reflecting on: The Golden Start to a Deviation Investigation, Causal Reasoning, Take-the-Best Heuristics, and The Catalent Case)

If Rechon, LeMaitre, and Sanofi teach us anything, it is that the quality system’s ability to discover failure is more important than its ability to prevent failure.

A perfect manufacturing process that no one is looking at is indistinguishable from a collapsing process disguised by poor surveillance. But a mediocre process that is rigorously investigated, understood, and continuously improved is a path toward genuine control.

The investigation itself—how we respond to a deviation, how we reason about causation, how we design corrective actions—is where falsifiable quality either succeeds or fails.

The Golden Day: When Theory Meets Work-as-Done

In April, I published “The Golden Start to a Deviation Investigation,” which made a deceptively simple argument: The first 24 hours after a deviation is discovered are where your quality system either commits to discovering truth or retreats into theater.

This argument sits at the heart of falsifiable quality.

When a deviation occurs, you have a narrow window—what I call the “Golden Day”—where evidence is fresh, memories are intact, and the actual conditions that produced the failure still exist. If you waste this window with vague problem statements and abstract discussions, you permanently lose the ability to test causal hypotheses later.

The post outlined a structured protocol:

First, crystallize the problem. Not “potency was low”—but “Lot X234, potency measured at 87% on January 15th at 14:32, three hours after completion of blending in Vessel C-2.” Precision matters because only specific, bounded statements can be falsified. A vague problem statement can always be “explained away.”

Second, go to the Gemba. This is the antidote to “work-as-imagined” investigation. The SOP says the temperature controller should maintain 37°C +/- 2°C. But the Gemba walk reveals that the probe is positioned six inches from the heating element, the data logger is in a recessed pocket where humidity accumulates, and the operator checks it every four hours despite a requirement to check hourly. These are the facts that predict whether the deviation will recur.

Third, interview with cognitive discipline. Most investigations fail not because investigators lack information, but because they extract information poorly. Cognitive interviewing—developed by the FBI and the National Transportation Safety Board—uses mental reinstatement, multiple perspectives, and sequential reordering to access accurate recall rather than confabulated narrative. The investigator asks the operator to walk through the event in different orders, from different viewpoints, each time triggering different memory pathways. This is not “soft” technique; it is a mechanism for generating falsifiable evidence.

The Golden Day post makes it clear: You do not investigate deviations to document compliance. You investigate deviations to gather evidence about whether your understanding of the process is correct.

Causal Reasoning: Moving Beyond “What Was Missing”

Most investigation tools fail not because they are flawed, but because they are applied with the wrong mindset. In my May post “Causal Reasoning: A Transformative Approach to Root Cause Analysis,” I argued that pharmaceutical investigations are often trapped in “negative reasoning.”

Negative reasoning asks: “What barrier was missing? What should have been done but wasn’t?” This mindset leads to unfalsifiable conclusions like “Procedure not followed” or “Training was inadequate.” These are dead ends because they describe the absence of an ideal, not the presence of a cause.

Causal reasoning flips the script. It asks: “What was present in the system that made the observed outcome inevitable?”

Instead of settling for “human error,” causal reasoning demands we ask: What environmental cues made the action sensible to the operator at that moment? Were the instructions ambiguous? Did competing priorities make compliance impossible? Was the process design fragile?

This shift transforms the investigation from a compliance exercise into a scientific inquiry.

Consider the LeMaitre example:

  • Negative Reasoning: “Why didn’t they sample the true condition?” Answer: “Because they didn’t follow the intent of the sampling plan.”
  • Causal Reasoning: “What made the pre-cleaning practice sensible to them?” Answer: “They believed it ensured sample validity by removing valve residue.”

By understanding the why, we identify a knowledge gap that can be tested and corrected, rather than a negligence gap that can only be punished.

In September, “Take-the-Best Heuristic for Causal Investigation” provided a practical framework for this. Instead of listing every conceivable cause—a process that often leads to paralysis—the “Take-the-Best” heuristic directs investigators to focus on the most information-rich discriminators. These are the factors that, if different, would have prevented the deviation. This approach focuses resources where they matter most, turning the investigation into a targeted search for truth.

CAPA: Predictions, Not Promises

The Sanofi warning letter—analyzed in January—showed the destination of unfalsifiable investigation: CAPAs that exist mainly as paperwork.

Sanofi had investigation reports. They had “corrective actions.” But the FDA noted that deviations recurred in similar patterns, suggesting that the investigation had identified symptoms, not mechanisms, and that the “corrective” action had not actually addressed causation.

This is the sin of treating CAPA as a promise rather than a hypothesis.

A falsifiable CAPA is structured as an explicit prediction: “If we implement X change, then Y undesirable outcome will not recur under conditions Z.”

This can be tested. If it fails the test, the CAPA itself becomes evidence—not of failure, but of incomplete causal understanding. Which is valuable.
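One simple way to make a CAPA effectiveness check genuinely falsifiable is to define the statistical criterion before collecting the post-implementation data. A minimal sketch, with entirely hypothetical recurrence counts:

```python
from math import comb

def binom_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical effectiveness check: before the CAPA, the deviation recurred
# in 6 of 50 batches (12%). The CAPA predicts the rate is now much lower.
baseline_rate = 6 / 50
n_post, recurrences_post = 50, 1

# If the CAPA did nothing, how likely is observing <= 1 recurrence in 50 batches?
p_if_unchanged = 1 - binom_tail(recurrences_post + 1, n_post, baseline_rate)

print(f"P(<= {recurrences_post} recurrence in {n_post} batches "
      f"if nothing changed) = {p_if_unchanged:.3f}")
```

A small value means the post-CAPA data are inconsistent with “no improvement”: the prediction survived a genuine attempt at falsification. A large value means the CAPA closed on paper but the hypothesis was never really tested.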

In the Rechon analysis, this showed up concretely: The FDA’s real criticism was not just that contamination was found; it was that Rechon’s Contamination Control Strategy had no mechanism to falsify itself. If the CCS said “unidirectional airflow protects the product,” and smoke studies showed bidirectional eddies, the CCS had been falsified. But Rechon treated the falsification as an anomaly to be explained away, rather than evidence that the CCS hypothesis was wrong.

A falsifiable organization would say: “Our CCS predicted that Grade A in an isolator with this airflow pattern would remain sterile. The smoke study proves that prediction wrong. Therefore, the CCS is false. We redesign.”

Instead, they filmed from a different angle and said the aerodynamics were “acceptable.”

Knowledge Integration: When Deviations Become the Curriculum

The final piece of falsifiable investigation is what I call “knowledge integration.” A single deviation is a data point. But across the organization, deviations should form a curriculum about how systems actually fail.

Sanofi’s failure was not that they investigated each deviation badly (though they did). It was that they investigated them in isolation. Each deviation closed on its own. Each CAPA addressed its own batch. There was no organizational learning—no mechanism for a pattern of similar deviations to trigger a hypothesis that the control strategy itself was fundamentally flawed.

This is where the Catalent case study, analyzed in September’s “When 483s Reveal Zemblanity,” becomes instructive. Zemblanity is the opposite of serendipity: the seemingly random recurrence of the same failure through different paths. Catalent’s 483 observations were not isolated mistakes; they formed a pattern that revealed a systemic assumption (about equipment capability, about environmental control, about material consistency) that was false across multiple products and locations.

A falsifiable quality system catches zemblanity early by:

  1. Treating each deviation as a test of organizational hypotheses, not as an isolated incident.
  2. Trending deviation patterns to detect when the same causal mechanism is producing failures across different products, equipment, or operators.
  3. Revising control strategies when patterns falsify the original assumptions, rather than tightening parameters at the margins.

The Digital Hallucination (CSA, AI, and the Expertise Crisis)

(Reflecting on: CSA: The Emperor’s New Clothes, Annex 11, and The Expertise Crisis)

While we battled microbes in the cleanroom, a different battle was raging in the server room. 2025 was the year the industry tried to “modernize” validation through Computer Software Assurance (CSA) and AI, and in many ways, it was the year we tried to automate our way out of thinking.

CSA: The Emperor’s New Validation Clothes

In September, I published “Computer System Assurance: The Emperor’s New Validation Clothes,” a critique of the contortions being made around the FDA’s guidance. The narrative sold by consultants for years was that traditional Computer System Validation (CSV) was “broken”—too much documentation, too much testing—and that CSA was a revolutionary new paradigm of “critical thinking.”

My analysis showed that this narrative is historically illiterate.

The principles of CSA—risk-based testing, leveraging vendor audits, focusing on intended use—are not new. They are the core principles of GAMP5 and have been applied for decades now.

The industry didn’t need a new guidance to tell us to use critical thinking; we had simply chosen not to use the critical thinking tools we already had. We had chosen to apply “one-size-fits-all” templates because they were safe (unfalsifiable).

The CSA guidance is effectively the FDA saying: “Please read the GAMP5 guide you claimed to be following for the last 15 years.”

The danger of the “CSA Revolution” narrative is that it encourages a swing to the opposite extreme: “Unscripted Testing” that becomes “No Testing.”

In a falsifiable system, “unscripted testing” is highly rigorous—it is an expert trying to break the software (“Ad Hoc testing”). But in an unfalsifiable system, “unscripted testing” becomes “I clicked around for 10 minutes and it looked fine.”

The Expertise Crisis: AI and the Death of the Apprentice

This leads directly to the Expertise Crisis. In September, I wrote “The Expertise Crisis: Why AI’s War on Entry-Level Jobs Threatens Quality’s Future.” This was perhaps the most personal topic I covered this year, because it touches on the very survival of our profession.

We are rushing to integrate Artificial Intelligence (AI) into quality systems. We have AI writing deviations, AI drafting SOPs, AI summarizing regulatory changes. The efficiency gains are undeniable. But the cost is hidden, and it is epistemological.

Falsifiability requires expertise.
To falsify a claim—to look at a draft investigation report and say, “No, that conclusion doesn’t follow from the data”—you need deep, intuitive knowledge of the process. You need to know what a “normal” pH curve looks like so you can spot the “abnormal” one that the AI smoothed over.

Where does that intuition come from? It comes from the “grunt work.” It comes from years of reviewing batch records, years of interviewing operators, years of struggling to write a root cause analysis statement.

The Expertise Crisis is this: If we give all the entry-level work to AI, where will the next generation of Quality Leaders come from?

  • The Junior Associate doesn’t review the raw data; the AI summarizes it.
  • The Junior Associate doesn’t write the deviation; the AI generates the text.
  • Therefore, the Junior Associate never builds the mental models necessary to critique the AI.

The Loop of Unfalsifiable Hallucination

We are creating a closed loop of unfalsifiability.

  1. The AI generates a plausible-sounding investigation report.
  2. The human reviewer (who has been “de-skilled” by years of AI reliance) lacks the deep expertise to spot the subtle logical flaw or the missing data point.
  3. The report is approved.
  4. The “hallucination” becomes the official record.

In a falsifiable quality system, the human must remain the adversary of the algorithm. The human’s job is to try to break the AI’s logic, to check the citations, to verify the raw data.
But in 2025, we saw the beginnings of a “Compliance Autopilot”—a desire to let the machine handle the “boring stuff.”

My warning in September remains urgent: Efficiency without expertise is just accelerated incompetence. If we lose the ability to falsify our own tools, we are no longer quality professionals; we are just passengers in a car driven by a statistical model that doesn’t know what “truth” is.

My post “The Missing Middle in GMP Decision Making: How Annex 22 Redefines Human-Machine Collaboration in Pharmaceutical Quality Assurance” goes a lot deeper here.

Annex 11 and Data Governance

In August, I analyzed the draft Annex 11 (Computerised Systems) in the post “Data Governance Systems: A Fundamental Shift.”

The Europeans are ahead of the FDA here. While the FDA talks about “Assurance” (testing less), the EU is talking about “Governance” (controlling more). The new Annex 11 makes it clear: You cannot validate a system if you do not control the data lifecycle. Validation is not a test script; it is a state of control.

This aligns perfectly with USP <1225> and <1220>. Whether it’s a chromatograph or an ERP system, the requirement is the same: Prove that the data is trustworthy, not just that the software is installed.

The Process as a Hypothesis (CPV & Cleaning)

(Reflecting on: Continuous Process Verification and Hypothesis Formation)

The final frontier of validation we explored in 2025 was the manufacturing process itself.

CPV: Continuous Falsification

In March, I published “Continuous Process Verification (CPV) Methodology and Tool Selection.” CPV is the ultimate expression of Falsifiable Quality in manufacturing.

  • Traditional Validation (3 Batches): “We made 3 good batches, therefore the process is perfect forever.” (Unfalsifiable extrapolation).
  • CPV: “We made 3 good batches, so we have a license to manufacture, but we will statistically monitor every subsequent batch to detect drift.” (Continuous hypothesis testing).

The challenge with CPV, as discussed in the post, is that it requires statistical literacy. You cannot implement CPV if your quality unit doesn’t understand the difference between Cpk and Ppk, or between control limits and specification limits.
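The Cpk/Ppk distinction can be shown in a few lines. This sketch uses invented assay values containing a deliberate level shift; the short-term (within) SD is estimated from the average moving range, a common individuals-chart convention:

```python
import statistics

# Illustrative CPV data: assay results (% label claim) with a level shift
# halfway through -- hypothetical numbers, not from any real process.
batches = [99.0, 99.2, 98.9, 99.1, 99.3, 99.0, 98.8, 99.2,
           100.1, 100.3, 100.0, 100.2, 100.4, 100.1, 100.3, 100.2]
spec_lo, spec_hi = 95.0, 105.0

mean = statistics.mean(batches)
overall_sd = statistics.stdev(batches)              # long-term: feeds Ppk

# Short-term SD from the average moving range (MR-bar / d2, with d2 = 1.128)
moving_ranges = [abs(b - a) for a, b in zip(batches, batches[1:])]
within_sd = statistics.mean(moving_ranges) / 1.128  # feeds Cpk

cpk = min(mean - spec_lo, spec_hi - mean) / (3 * within_sd)
ppk = min(mean - spec_lo, spec_hi - mean) / (3 * overall_sd)

# Control limits describe the voice of the process; specs, the customer's.
ucl, lcl = mean + 3 * within_sd, mean - 3 * within_sd

print(f"Cpk (short-term) = {cpk:.2f},  Ppk (long-term) = {ppk:.2f}")
print(f"control limits: [{lcl:.2f}, {ucl:.2f}]  vs  specs: [{spec_lo}, {spec_hi}]")
```

Every batch here is comfortably within specification, yet Cpk greatly exceeds Ppk and several points fall outside the control limits: the shift is invisible to a specifications-only view and obvious to a CPV program. That gap between the two indices is exactly the kind of drift signal an unfalsifiable system never sees.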

This circles back to the Expertise Crisis. We are implementing complex statistical tools (CPV software) at the exact moment we are de-skilling the workforce. We risk creating a “CPV Dashboard” that turns red, but no one knows why or what to do about it.

Cleaning Validation: The Science of Residue

In August, I tried to apply falsifiability to one of the most stubborn areas of dogma: Cleaning Validation.

In "Building Decision-Making with Structured Hypothesis Formation," I argued that cleaning validation should not be about "proving it's clean." It should be about "understanding why it gets dirty."

  • Traditional Approach: Swab 10 spots. If they pass, we are good.
  • Hypothesis Approach: “We hypothesize that the gasket on the bottom valve is the hardest to clean. We predict that if we reduce rinse time by 1 minute, that gasket will fail.”

By testing the boundaries—by trying to make the cleaning fail—we understand the Design Space of the cleaning process.

We discussed the “Visual Inspection” paradox in cleaning: If you can see the residue, it failed. But if you can’t see it, does it pass?

Only if you have scientifically determined the Visible Residue Limit (VRL). Using “visually clean” without a validated VRL is—you guessed it—unfalsifiable.

The Plastic Paradox (Single-Use Systems and the E&L Mirage)

If the Rechon and LeMaitre warning letters were about the failure to control biological contaminants we can find, the industry’s struggle with Single-Use Systems (SUS) in 2025 was about the chemical contaminants we choose not to find.

We have spent the last decade aggressively swapping stainless steel for plastic. The value proposition was irresistible: Eliminate cleaning validation, eliminate cross-contamination, increase flexibility. We traded the “devil we know” (cleaning residue) for the “devil we don’t” (Extractables and Leachables).

But in 2025, with the enforcement reality of USP <665> (Plastic Components and Systems) settling in, we had to confront the uncomfortable truth: Most E&L risk assessments are unfalsifiable.

The Vendor Data Trap

The standard industry approach to E&L is the ultimate form of “Compliance Theater.”

  1. We buy a single-use bag.
  2. We request the vendor’s regulatory support package (the “Map”).
  3. We see that the vendor extracted the film with aggressive solvents (ethanol, hexane) for 7 days.
  4. We conclude: “Our process uses water for 24 hours; therefore, we are safe.”

This logic is epistemologically bankrupt. It assumes that the Vendor’s Model (aggressive solvents/short time) maps perfectly to the User’s Reality (complex buffers/long duration/specific surfactants).

It ignores the fact that plastics are dynamic systems. Polymers age. Gamma irradiation initiates free radical cascades that evolve over months. A bag manufactured in January might have a different leachable profile than a bag manufactured in June, especially if the resin supplier made a “minor” change that didn’t trigger a notification.

By relying solely on the vendor’s static validation package, we are choosing not to falsify our safety hypothesis. We are effectively saying, “If the vendor says it’s clean, we will not look for dirt.”

USP <665>: A Baseline, Not a Ceiling

The full adoption of USP <665> was supposed to bring standardization. And it has—it provides a standard set of extraction conditions. But standards can become ceilings.

In 2025, I observed a troubling trend of “Compliance by Citation.” Firms are citing USP <665> compliance as proof of absence of risk, stopping the inquiry there.

A Falsifiable E&L Strategy goes further. It asks:

  • “What if the vendor data is irrelevant to my specific surfactant?”
  • “What if the gamma irradiation dose varied?”
  • “What if the interaction between the tubing and the connector creates a new species?”

The Invisible Process Aid

We must stop viewing Single-Use Systems as inert piping. They are active process components. They are chemically reactive vessels that participate in our reaction kinetics.

When we treat them as inert, we are engaging in the same “Aspirational Thinking” that LeMaitre used on their water valves. We are modeling the system we want (pure, inert plastic), not the system we have (a complex soup of antioxidants, slip agents, and degradants).

The lesson of 2025 is that Material Qualification cannot be a paper exercise. If you haven’t done targeted simulation studies that mimic your actual “Work-as-Done” conditions, you haven’t validated the system. You’ve just filed the receipt.

The Mandate for 2026

As we look toward 2026, the path is clear. We cannot go back to the comfortable fiction of the pre-2025 era.

The regulatory environment (Annex 1, ICH Q14, USP <1225>, Annex 11) is explicitly demanding evidence of control, not just evidence of compliance. The technological environment (AI) is demanding that we sharpen our human expertise to avoid becoming obsolete. The physical environment (contamination, supply chain complexity) is demanding systems that are robust, not just rigid.

The mandate for the coming year is to build Falsifiable Quality Systems.

What does that look like practically?

  1. In the Lab: Implement USP <1225> logic now. Don’t wait for the official date. Validate your reportable results. Add “challenge tests” to your routine monitoring.
  2. In the Plant: Redesign your Environmental Monitoring to hunt for contamination, not to avoid it. If you have a “perfect” record in a Grade C area, move the plates until you find the dirt.
  3. In the Office: Treat every investigation as a chance to falsify the control strategy. If a deviation occurs that the control strategy said was impossible, update the control strategy.
  4. In the Culture: Reward the messenger. The person who finds the crack in the system is not a troublemaker; they are the most valuable asset you have. They just falsified a false sense of security.
  5. In Design: Embrace the Elegant Quality System (discussed in May). Complexity is the enemy of falsifiability. Complex systems hide failures; simple, elegant systems reveal them.

2025 was the year we stopped pretending. 2026 must be the year we start building. We must build systems that are honest enough to fail, so that we can build processes that are robust enough to endure.

Thank you for reading, challenging, and thinking with me this year. The investigation continues.

USP <1225> Revised: Aligning Compendial Validation with ICH Q2(R2) and Q14’s Lifecycle Vision

The United States Pharmacopeia’s proposed revision of General Chapter <1225> Validation of Compendial Procedures, published in Pharmacopeial Forum 51(6), represents the continuation of a fundamental shift in how we conceptualize analytical method validation—moving from static demonstration of compliance toward dynamic lifecycle management of analytical capability.

This revision challenges us to think differently about what validation actually means. The revised chapter introduces concepts like reportable result, fitness for purpose, replication strategy, and combined evaluation of accuracy and precision that force us to confront uncomfortable questions: What are we actually validating? For what purpose? Under what conditions? And most critically—how do we know our analytical procedures remain fit for purpose once validation is “complete”?

The timing of this revision is deliberate. USP is working to align <1225> more closely with ICH Q2(R2) Validation of Analytical Procedures and ICH Q14 Analytical Procedure Development, both finalized in 2023. Together with the already-official USP <1220> Analytical Procedure Life Cycle (May 2022), these documents form an interconnected framework that demands we abandon the comfortable fiction that validation is a discrete event rather than an ongoing commitment to analytical quality.

Traditional validation approaches can create the illusion of control without delivering genuine analytical reliability. Methods that “passed validation” fail when confronted with real-world variability. System suitability tests that looked rigorous on paper prove inadequate for detecting performance drift. Acceptance criteria established during development turn out to be disconnected from what actually matters for product quality decisions.

The revised USP <1225> offers conceptual tools to address these failures—if we’re willing to use them honestly rather than simply retrofitting compliance theater onto existing practices. This post explores what the revision actually says, how it relates to ICH Q2(R2) and Q14, and what it demands from quality leaders who want to build genuinely robust analytical systems rather than just impressive validation packages.

The Validation Paradigm Shift: From Compliance Theater to Lifecycle Management

Traditional analytical method validation follows a familiar script. We conduct studies demonstrating acceptable performance for specificity, accuracy, precision, linearity, range, and (depending on the method category) detection and quantitation limits. We generate validation reports showing data meets predetermined acceptance criteria. We file these reports in regulatory submission dossiers or archive them for inspection readiness. Then we largely forget about them until transfer, revalidation, or regulatory scrutiny forces us to revisit the method’s performance characteristics.

This approach treats validation as what Sidney Dekker would call “safety theater”—a performance of rigor that may or may not reflect the method’s actual capability to generate reliable results under routine conditions. The validation study represents work-as-imagined: controlled experiments conducted by experienced analysts using freshly prepared standards and reagents, with carefully managed environmental conditions and full attention to procedural details. What happens during routine testing—work-as-done—often looks quite different.

The lifecycle perspective championed by ICH Q14 and USP <1220> fundamentally challenges this validation-as-event paradigm. From a lifecycle view, validation becomes just one stage in a continuous process of ensuring analytical fitness for purpose. Method development (Stage 1 in USP <1220>) generates understanding of how method parameters affect performance. Validation (Stage 2) confirms the method performs as intended under specified conditions. But the critical innovation is Stage 3—ongoing performance verification that treats method capability as dynamic rather than static.

The revised USP <1225> attempts to bridge these worldviews. It maintains the structure of traditional validation studies while introducing concepts that only make sense within a lifecycle framework. Reportable result—the actual output of the analytical procedure that will be used for quality decisions—forces us to think beyond individual measurements to what we’re actually trying to accomplish. Fitness for purpose demands we articulate specific performance requirements linked to how results will be used, not just demonstrate acceptable performance against generic criteria. Replication strategy acknowledges that the variability observed during validation must reflect the variability expected during routine use.

These aren’t just semantic changes. They represent a shift from asking “does this method meet validation acceptance criteria?” to “will this method reliably generate results adequate for their intended purpose under actual operating conditions?” That second question is vastly more difficult to answer honestly, which is why many organizations will be tempted to treat the new concepts as compliance checkboxes rather than genuine analytical challenges.

I’ve advocated on this blog for falsifiable quality systems—systems that make testable predictions that could be proven wrong through empirical observation. The lifecycle validation paradigm, properly implemented, is inherently more falsifiable than traditional validation. Instead of a one-time demonstration that a method “works,” lifecycle validation makes an ongoing claim: “This method will continue to generate results of acceptable quality when operated within specified conditions.” That claim can be tested—and potentially falsified—every time the method is used. The question is whether we’ll design our Stage 3 performance verification systems to actually test that claim or simply monitor for obviously catastrophic failures.

Core Concepts in the Revised USP <1225>

The revised chapter introduces several concepts that deserve careful examination because they change not just what we do but how we think about analytical validation.

Reportable Result: The Target That Matters

Reportable result may be the most consequential new concept in the revision. It’s defined as the final analytical result that will be reported and used for quality decisions—not individual sample preparations, not replicate injections, but the actual value that appears on a Certificate of Analysis or stability report.

This distinction matters enormously because validation historically focused on demonstrating acceptable performance of individual measurements without always considering how those measurements would be combined to generate reportable values. A method might show excellent repeatability for individual injections while exhibiting problematic variability when the full analytical procedure—including sample preparation, multiple preparations, and averaging—is executed under intermediate precision conditions.

The reportable result concept forces us to validate what we actually use. If our SOP specifies reporting the mean of duplicate sample preparations, each prepared in duplicate and injected in triplicate, then validation should evaluate the precision and accuracy of that mean value, not just the repeatability of individual injections. This seems obvious when stated explicitly, but review your validation protocols and ask honestly: are you validating the reportable result or just demonstrating that the instrument performs acceptably?
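The arithmetic behind that question is worth making explicit. Under a standard variance-components view, the SD of the reportable mean depends on where the variability lives. The `sd_prep` and `sd_inj` values below are hypothetical intermediate-precision estimates, chosen only to illustrate the point:

```python
import math

def reportable_sd(sd_prep, sd_inj, n_prep, n_inj):
    # Variance of the reported mean when n_prep preparations are each
    # injected n_inj times and everything is averaged:
    #   Var(mean) = sd_prep^2 / n_prep + sd_inj^2 / (n_prep * n_inj)
    return math.sqrt(sd_prep**2 / n_prep + sd_inj**2 / (n_prep * n_inj))

# Hypothetical estimates: preparation SD 0.8%, injection SD 0.3%.
single = reportable_sd(0.8, 0.3, n_prep=1, n_inj=1)  # one prep, one injection
scheme = reportable_sd(0.8, 0.3, n_prep=2, n_inj=3)  # duplicate preps x triplicate
print(f"single measurement SD: {single:.2f}%")
print(f"reportable result SD:  {scheme:.2f}%")
```

Note that once preparation variability dominates, adding injections barely moves the reportable-result SD; it is the number of preparations that matters. That is precisely why validating injection repeatability alone tells you little about the value on the Certificate of Analysis.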

This concept aligns perfectly with the Analytical Target Profile (ATP) from ICH Q14, which specifies required performance characteristics for the reportable result. Together, these frameworks push us toward outcome-focused validation rather than activity-focused validation. The question isn’t “did we complete all the required validation experiments?” but “have we demonstrated that the reportable results this method generates will be adequate for their intended use?”

Fitness for Purpose: Beyond Checkbox Validation

Fitness for purpose appears throughout the revised chapter as an organizing principle for validation strategy. But what does it actually mean beyond regulatory rhetoric?

In the falsifiable quality systems framework I’ve been developing, fitness for purpose requires explicit articulation of how analytical results will be used and what performance characteristics are necessary to support those decisions. An assay method used for batch release needs different performance characteristics than the same method used for stability trending. A method measuring a critical quality attribute directly linked to safety or efficacy requires more stringent validation than a method monitoring a process parameter with wide acceptance ranges.

The revised USP <1225> pushes toward risk-based validation strategies that match validation effort to analytical criticality and complexity. This represents a significant shift from the traditional category-based approach (Categories I-IV) that prescribed specific validation parameters based on method type rather than method purpose.

However, fitness for purpose creates interpretive challenges that could easily devolve into justification for reduced rigor. Organizations might claim methods are “fit for purpose” with minimal validation because “we’ve been using this method for years without problems.” This reasoning commits what I call the effectiveness fallacy—assuming that absence of detected failures proves adequate performance. In reality, inadequate analytical methods often fail silently, generating subtly inaccurate results that don’t trigger obvious red flags but gradually degrade our understanding of product quality.

True fitness for purpose requires explicit, testable claims about method performance: “This method will detect impurity X at levels down to 0.05% with 95% confidence” or “This assay will measure potency within ±5% of true value under normal operating conditions.” These are falsifiable statements that ongoing performance verification can test. Vague assertions that methods are “adequate” or “appropriate” are not.
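Claims like these can be wired directly into routine checks. ICH Q2 gives the familiar slope-based estimates DL = 3.3σ/S and QL = 10σ/S; here is a sketch of turning the 0.05% claim into an executable assertion (the σ and slope values are hypothetical):

```python
def detection_limit(sigma, slope):
    # ICH Q2: DL = 3.3 * sigma / S, where sigma is the SD of the
    # blank/low-level response and S is the calibration slope
    # (response units per % impurity).
    return 3.3 * sigma / slope

def quantitation_limit(sigma, slope):
    # ICH Q2: QL = 10 * sigma / S
    return 10 * sigma / slope

# Hypothetical values from a low-level precision study.
sigma, slope = 0.8, 60.0
dl = detection_limit(sigma, slope)
# The falsifiable claim: "this method detects impurity X down to 0.05%".
assert dl <= 0.05, f"claim falsified: DL = {dl:.3f}%"
```

If a periodic re-estimate of σ ever pushes the computed DL above 0.05%, the claim has been falsified and the method's fitness for purpose must be revisited. That is the whole point of stating it as a testable prediction rather than a vague assertion of adequacy.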

Replication Strategy: Understanding Real Variability

The replication strategy concept addresses a fundamental disconnect in traditional validation: the mismatch between how we conduct validation experiments and how we’ll actually use the method. Validation studies often use simplified replication schemes optimized for experimental efficiency rather than reflecting the full procedural reality of routine testing.

The revised chapter emphasizes that validation should employ the same replication strategy that will be used for routine sample analysis to generate reportable results. If your SOP calls for analyzing samples in duplicate on separate days, validation should incorporate that time-based variability. If sample preparation involves multiple extraction steps that might be performed by different analysts, intermediate precision studies should capture that source of variation.

This requirement aligns validation more closely with work-as-done rather than work-as-imagined. But it also makes validation more complex and time-consuming. Organizations accustomed to streamlined validation protocols will face pressure to either expand their validation studies or simplify their routine testing procedures to match validation replication strategies.

From a quality systems perspective, this tension reveals important questions: Have we designed our analytical procedures to be unnecessarily complex? Are we requiring replication beyond what’s needed for adequate measurement uncertainty? Or conversely, are our validation replication schemes unrealistically simplified compared to the variability we’ll encounter during routine use?

The replication strategy concept forces these questions into the open rather than allowing validation and routine operation to exist in separate conceptual spaces.

Statistical Intervals: Combined Accuracy and Precision

Perhaps the most technically sophisticated addition in the revised chapter is guidance on combined evaluation of accuracy and precision using statistical intervals. Traditional validation treats these as separate performance characteristics evaluated through different experiments. But in reality, what matters for reportable results is the total error combining both bias (accuracy) and variability (precision).

The chapter describes approaches for computing statistical intervals that account for both accuracy and precision simultaneously. These intervals can then be compared against acceptance criteria to determine if the method is validated. If the computed interval falls completely within acceptable limits, the method demonstrates adequate performance for both characteristics together.

This approach is more scientifically rigorous than separate accuracy and precision evaluations because it recognizes that these characteristics interact. A highly precise method with moderate bias might generate reportable results within acceptable ranges, while a method with excellent accuracy but poor precision might not. Traditional validation approaches that evaluate these characteristics separately can miss such interactions.

However, combined evaluation requires more sophisticated statistical expertise than many analytical laboratories possess. The chapter provides references to USP <1210> Statistical Tools for Procedure Validation, which describes appropriate methodologies, but implementation will challenge organizations lacking strong statistical support for their analytical functions.

This creates risk of what I’ve called procedural simulation—going through the motions of applying advanced statistical methods without genuine understanding of what they reveal about method performance. Quality leaders need to ensure that if their teams adopt combined accuracy-precision evaluation approaches, they actually understand the results rather than just feeding data into software and accepting whatever output emerges.

Knowledge Management: Building on What We Know

The revised chapter emphasizes knowledge management more explicitly than previous versions, acknowledging that validation doesn’t happen in isolation from development activities and prior experience. Data generated during method development, platform knowledge from similar methods, and experience with related products all constitute legitimate inputs to validation strategy.

This aligns with ICH Q14’s enhanced approach and ICH Q2(R2)’s acknowledgment that development data can support validation. But it also creates interpretive challenges around what constitutes adequate prior knowledge and how to appropriately leverage it.

In my experience leading quality organizations, knowledge management is where good intentions often fail in practice. Organizations claim to be “leveraging prior knowledge” while actually just cutting corners on validation studies. Platform approaches that worked for previous products get applied indiscriminately to new products with different critical quality attributes. Development data generated under different conditions gets repurposed for validation without rigorous evaluation of its applicability.

Effective knowledge management requires disciplined documentation of what we actually know (with supporting evidence), explicit identification of knowledge gaps, and honest assessment of when prior experience is genuinely applicable versus superficially similar. The revised USP <1225> provides the conceptual framework for this discipline but can’t force organizations to apply it honestly.

Comparing the Frameworks: USP <1225>, ICH Q2(R2), and ICH Q14

Understanding how these three documents relate—and where they diverge—is essential for quality professionals trying to build coherent analytical validation programs.

Analytical Target Profile: Q14’s North Star

ICH Q14 introduced the Analytical Target Profile (ATP) as a prospective description of performance characteristics needed for an analytical procedure to be fit for its intended purpose. The ATP specifies what needs to be measured (the quality attribute), required performance criteria (accuracy, precision, specificity, etc.), and the anticipated performance based on product knowledge and regulatory requirements.

The ATP concept doesn’t explicitly appear in revised USP <1225>, though the chapter’s emphasis on fitness for purpose and reportable result requirements creates conceptual space for ATP-like thinking. This represents a subtle tension between the documents. ICH Q14 treats the ATP as foundational for both enhanced and minimal approaches to method development, while USP <1225> maintains its traditional structure without explicitly requiring ATP documentation.

In practice, this means organizations can potentially comply with revised USP <1225> without fully embracing the ATP concept. They can validate methods against acceptance criteria without articulating why those particular criteria are necessary for the reportable result’s intended use. This risks perpetuating validation-as-compliance-exercise rather than forcing honest engagement with whether methods are actually adequate.

Quality leaders serious about lifecycle validation should treat the ATP as essential even when working with USP <1225>, using it to bridge method development, validation, and ongoing performance verification. The ATP makes explicit what traditional validation often leaves implicit—the link between analytical performance and product quality requirements.

Performance Characteristics: Evolution from Q2(R1) to Q2(R2)

ICH Q2(R2) substantially revises the performance characteristics framework from the 1996 Q2(R1) guideline. Key changes include:

Specificity/Selectivity are now explicitly addressed together rather than treated as equivalent. The revision acknowledges these terms have been used inconsistently across regions and provides unified definitions. Specificity refers to the ability to assess the analyte unequivocally in the presence of expected components, while selectivity relates to the ability to measure the analyte in a complex mixture. In practice, most analytical methods need to demonstrate both, and the revised guidance provides clearer expectations for this demonstration.

Range now explicitly encompasses non-linear calibration models, acknowledging that not all analytical relationships follow simple linear functions. The guidance describes how to demonstrate that methods perform adequately across the reportable range even when the underlying calibration relationship is non-linear. This is particularly relevant for biological assays and certain spectroscopic techniques where non-linearity is inherent to the measurement principle.

Accuracy and Precision can be evaluated separately or through combined approaches, as discussed earlier. This flexibility accommodates both traditional methodology and more sophisticated statistical approaches while maintaining the fundamental requirement that both characteristics be adequate for intended use.

Revised USP <1225> incorporates these changes while maintaining its compendial focus. The chapter continues to reference validation categories (I-IV) as a familiar framework while noting that risk-based approaches considering the method’s intended use should guide validation strategy. This creates some conceptual tension—the categories imply that method type determines validation requirements, while fitness-for-purpose thinking suggests that method purpose should drive validation design.

Organizations need to navigate this tension thoughtfully. The categories provide useful starting points for validation planning, but they shouldn’t become straitjackets preventing appropriate customization based on specific analytical needs and risks.

The Enhanced Approach: When and Why

ICH Q14 distinguishes between minimal and enhanced approaches to analytical procedure development. The minimal approach uses traditional univariate optimization and risk assessment based on prior knowledge and analyst experience. The enhanced approach employs systematic risk assessment, design of experiments, establishment of parameter ranges (PARs or MODRs), and potentially multivariate analysis.

The enhanced approach offers clear advantages: deeper understanding of method performance, identification of critical parameters and their acceptable ranges, and potentially more robust control strategies that can accommodate changes without requiring full revalidation. But it also demands substantially more development effort, statistical expertise, and time.

Neither ICH Q2(R2) nor revised USP <1225> mandates the enhanced approach, though both acknowledge it as a valid strategy. This leaves organizations facing difficult decisions about when enhanced development is worth the investment. In my experience, several factors should drive this decision:

  • Product criticality and lifecycle stage: Biologics products with complex quality profiles and long commercial lifecycles benefit substantially from enhanced analytical development because the upfront investment pays dividends in robust control strategies and simplified change management.
  • Analytical complexity: Multivariate spectroscopic methods (NIR, Raman, mass spectrometry) are natural candidates for enhanced approaches because their complexity demands systematic exploration of parameter spaces that univariate approaches can’t adequately address.
  • Platform potential: When developing methods that might be applied across multiple products, enhanced approaches can generate knowledge that benefits the entire platform, amortizing development costs across the portfolio.
  • Regulatory landscape: Biosimilar programs and products in competitive generic spaces may benefit from enhanced approaches that strengthen regulatory submissions and simplify lifecycle management in response to originator changes.

However, enhanced approaches can also become expensive validation theater if organizations go through the motions of design of experiments and parameter range studies without genuine commitment to using the resulting knowledge for method control and change management. I’ve seen impressive MODRs filed in regulatory submissions that are then completely ignored during commercial manufacturing because operational teams weren’t involved in development and don’t understand or trust the parameter ranges.

The decision between minimal and enhanced approaches should be driven by honest assessment of whether the additional knowledge generated will actually improve method performance and lifecycle management, not by belief that “enhanced” is inherently better or that regulators will be impressed by sophisticated development.

Validation Categories vs Risk-Based Approaches

USP <1225> has traditionally organized validation requirements using four method categories:

  • Category I: Methods for quantitation of major components (assay methods)
  • Category II: Methods for quantitation of impurities and degradation products
  • Category III: Methods for determination of performance characteristics (dissolution, drug release)
  • Category IV: Identification tests

Each category specifies which performance characteristics require evaluation. This framework provides clarity and consistency, making it easy to design validation protocols for common method types.

However, the category-based approach can create perverse incentives. Organizations might design methods to fit into categories with less demanding validation requirements rather than choosing the most appropriate analytical approach for their specific needs. A method capable of quantitating impurities might be deliberately operated only as a limit test (Category II modified) to avoid full quantitation validation requirements.

The revised chapter maintains the categories while increasingly emphasizing that fitness for purpose should guide validation strategy. This creates interpretive flexibility that can be used constructively or abused. Quality leaders need to ensure their teams use the categories as starting points for validation design, not as rigid constraints or opportunities for gaming the system.

Risk-based validation asks different questions than category-based approaches: What decisions will be made using this analytical data? What happens if results are inaccurate or imprecise beyond acceptable limits? How critical is this measurement to product quality and patient safety? These questions should inform validation design regardless of which traditional category the method falls into.

Specificity/Selectivity: Terminology That Matters

The evolution of specificity/selectivity terminology across these documents deserves attention because terminology shapes how we think about analytical challenges. ICH Q2(R1) treated the terms as equivalent, leading to regional confusion as different pharmacopeias and regulatory authorities developed different preferences.

ICH Q2(R2) addresses this by defining both terms clearly and acknowledging they address related but distinct aspects of method performance. Specificity is the ability to assess the analyte unequivocally—can we be certain our measurement reflects only the intended analyte and not interference from other components? Selectivity is the ability to measure the analyte in the presence of other components—can we accurately quantitate our analyte even in a complex matrix?

For monoclonal antibody product characterization, for instance, a method might be specific for the antibody molecule versus other proteins but show poor selectivity among different glycoforms or charge variants. Distinguishing these concepts helps us design studies that actually demonstrate what we need to know rather than generically “proving the method is specific.”

Revised USP <1225> adopts the ICH Q2(R2) terminology while acknowledging that compendial procedures typically focus on specificity because they’re designed for relatively simple matrices (standards and reference materials). The chapter notes that when compendial procedures are applied to complex samples like drug products, selectivity may need additional evaluation during method verification or extension.

This distinction has practical implications for how we think about method transfer and method suitability. A method validated for drug substance might require additional selectivity evaluation when applied to drug product, even though the fundamental specificity has been established. Recognizing this prevents the false assumption that validation automatically confers suitability for all potential applications.

The Three-Stage Lifecycle: Where USP <1220>, <1225>, and ICH Guidelines Converge

The analytical procedure lifecycle framework provides the conceptual backbone for understanding how these various guidance documents fit together. USP <1220> explicitly describes three stages:

Stage 1: Procedure Design and Development

This stage encompasses everything from initial selection of analytical technique through systematic development and optimization to establishment of an analytical control strategy. ICH Q14 provides detailed guidance for this stage, describing both minimal and enhanced approaches.

Key activities include:

  • Knowledge gathering: Understanding the analyte, sample matrix, and measurement requirements based on the ATP or intended use
  • Risk assessment: Identifying analytical procedure parameters that might impact performance, using tools from ICH Q9
  • Method optimization: Systematically exploring parameter spaces through univariate or multivariate experiments
  • Robustness evaluation: Understanding how method performance responds to deliberate variations in parameters
  • Analytical control strategy: Establishing set points, acceptable ranges (PARs/MODRs), and system suitability criteria

Stage 1 generates the knowledge that makes Stage 2 validation more efficient and Stage 3 performance verification more meaningful. Organizations that shortcut development, rushing to validation with poorly understood methods, pay for those shortcuts through validation failures, unexplained variability during routine use, and an inability to respond effectively to performance issues.

The causal reasoning approach I’ve advocated for investigations applies equally to method development. When development experiments produce unexpected results, the instinct is often to explain them away or adjust conditions to achieve desired outcomes. But unexpected results during development are opportunities to understand causal mechanisms governing method performance. Methods developed with genuine understanding of these mechanisms prove more robust than methods optimized through trial and error.

Stage 2: Procedure Performance Qualification (Validation)

This is where revised USP <1225> and ICH Q2(R2) provide detailed guidance. Stage 2 confirms that the method performs as intended under specified conditions, generating reportable results of adequate quality for their intended use.

The knowledge generated in Stage 1 directly informs Stage 2 protocol design. Risk assessment identifies which performance characteristics need most rigorous evaluation. Robustness studies reveal which parameters need tight control versus which have wide acceptable ranges. The analytical control strategy defines system suitability criteria and measurement conditions.

However, validation historically has been treated as disconnected from development, with validation protocols designed primarily to satisfy regulatory expectations rather than genuinely confirm method fitness. The revised documents push toward more integrated thinking—validation should test the specific knowledge claims generated during development.

From a falsifiable systems perspective, validation makes explicit predictions about method performance: “When operated within these conditions, this method will generate results meeting these performance criteria.” Stage 3 exists to continuously test whether those predictions hold under routine operating conditions.

Organizations that treat validation as a compliance hurdle rather than a genuine test of method fitness often discover that methods “pass validation” but perform poorly in routine use. The validation succeeded at demonstrating compliance but failed to establish that the method would actually work under real operating conditions with normal analyst variability, standard material lot changes, and equipment variations.

Stage 3: Continued Procedure Performance Verification

Stage 3 is where lifecycle validation thinking diverges most dramatically from traditional approaches. Once a method is validated and in routine use, traditional practice involved occasional revalidation driven by changes or regulatory requirements, but no systematic ongoing verification of performance.

USP <1220> describes Stage 3 as continuous performance verification through routine monitoring of performance-related data. This might include:

  • System suitability trending: Not just pass/fail determination but statistical trending to detect performance drift
  • Control charting: Monitoring QC samples, reference standards, or replicate analyses to track method stability
  • Comparative testing: Periodic evaluation against orthogonal methods or reference laboratories
  • Investigation of anomalous results: Treating unexplained variability or atypical results as potential signals of method performance issues

Stage 3 represents the “work-as-done” reality of analytical methods—how they actually perform under routine conditions with real samples, typical analysts, normal equipment status, and unavoidable operational variability. Methods that looked excellent during validation (work-as-imagined) sometimes reveal limitations during Stage 3 that weren’t apparent in controlled validation studies.

Neither ICH Q2(R2) nor revised USP <1225> provides detailed Stage 3 guidance. This represents what I consider the most significant gap in the current guidance landscape. We’ve achieved reasonable consensus around development (ICH Q14) and validation (ICH Q2(R2), USP <1225>), but Stage 3—arguably the longest and most important phase of the analytical lifecycle—remains underdeveloped from a regulatory guidance perspective.

Organizations serious about lifecycle validation need to develop robust Stage 3 programs even without detailed regulatory guidance. This means defining what ongoing verification looks like for different method types and criticality levels, establishing monitoring systems that generate meaningful performance data, and creating processes that actually respond to performance trending before methods drift into inadequate performance.

Practical Implications for Quality Professionals

Understanding what these documents say matters less than knowing how to apply their principles to build better analytical quality systems. Several practical implications deserve attention.

Moving Beyond Category I-IV Thinking

The validation categories provided useful structure when analytical methods were less diverse and quality systems were primarily compliance-focused. But modern pharmaceutical development, particularly for biologics, involves analytical challenges that don’t fit neatly into traditional categories.

An LC-MS method for characterizing post-translational modifications might measure major species (Category I), minor variants (Category II), and contribute to product identification (Category IV) simultaneously. Multivariate spectroscopic methods like NIR or Raman might predict multiple attributes across ranges spanning both major and minor components.

Rather than contorting methods to fit categories or conducting redundant validation studies to satisfy multiple category requirements, risk-based thinking asks: What do we need this method to do? What performance is necessary for those purposes? What validation evidence would demonstrate adequate performance?

This requires more analytical thinking than category-based validation, which is why many organizations resist it. Following category-based templates is easier than designing fit-for-purpose validation strategies. But template-based validation often generates massive data packages that don’t actually demonstrate whether methods will perform adequately under routine conditions.

Quality leaders should push their teams to articulate validation strategies in terms of fitness for purpose first, then verify that category-based requirements are addressed, rather than simply executing category-based templates without thinking about what they’re actually demonstrating.

Robustness: From Development to Control Strategy

Traditional validation often treated robustness as an afterthought—a set of small deliberate variations tested at the end of validation to identify factors that might influence performance. ICH Q2(R1) explicitly stated that robustness evaluation should be considered during development, not validation.

ICH Q2(R2) and Q14 formalize this by moving robustness firmly into Stage 1 development. The purpose shifts from demonstrating that small variations don’t affect performance to understanding how method parameters influence performance and establishing appropriate control strategies.

This changes what robustness studies look like. Instead of testing whether pH ±0.2 units or temperature ±2°C affect performance, enhanced approaches use design of experiments to systematically map performance across parameter ranges, identifying critical parameters that need tight control versus robust parameters that can vary within wide ranges.
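The difference between testing ±0.2 pH units in isolation and mapping a parameter space can be sketched with a two-level full-factorial main-effects calculation. Everything below, including the factors, levels, and response values, is a hypothetical stand-in for measured experimental data:

```python
# Main effects from a 2-level full-factorial robustness screen.
from itertools import product

def main_effects(levels, responses):
    """Average response change between the high and low level of each factor;
    a larger |effect| marks a more critical parameter."""
    effects = {}
    for i, factor in enumerate(levels):
        low, high = levels[factor]
        hi = [r for run, r in responses.items() if run[i] == high]
        lo = [r for run, r in responses.items() if run[i] == low]
        effects[factor] = sum(hi) / len(hi) - sum(lo) / len(lo)
    return effects

# Two-level design around the set point: mobile-phase pH and column temperature
levels = {"pH": (2.8, 3.2), "temp_C": (28, 32)}
runs = list(product(*levels.values()))  # the four design points
# Hypothetical resolution responses (in practice, measured experimentally)
responses = {(2.8, 28): 2.1, (2.8, 32): 2.0, (3.2, 28): 1.4, (3.2, 32): 1.3}
effects = main_effects(levels, responses)
# pH dominates resolution here, so it would get a tight control range,
# while temperature could float within a wider acceptable range (PAR).
```

Real enhanced-approach studies would use fractional or response-surface designs with replication and proper statistical analysis; the point of this sketch is only the logic of ranking parameters by effect size to inform the control strategy.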

The analytical control strategy emerging from this work defines what needs to be controlled, how tightly, and how that control will be verified through system suitability. Parameters proven robust across wide ranges don’t need tight control or continuous monitoring. Parameters identified as critical get appropriate control measures and verification.

Revised USP <1225> acknowledges this evolution while maintaining compatibility with traditional robustness testing for organizations using minimal development approaches. The practical implication is that organizations need to decide whether their robustness studies are compliance exercises demonstrating nothing really matters, or genuine explorations of parameter effects informing control strategies.

In my experience, most robustness studies fall into the former category—demonstrating that the developer knew enough about the method to avoid obviously critical parameters when designing the robustness protocol. Studies that actually reveal important parameter sensitivities are rare because developers already controlled those parameters tightly during development.

Platform Methods and Prior Knowledge

Biotechnology companies developing multiple monoclonal antibodies or other platform products can achieve substantial efficiency through platform analytical methods—methods developed once with appropriate robustness and then applied across products with minimal product-specific validation.

ICH Q2(R2) and revised USP <1225> both acknowledge that prior knowledge and platform experience constitute legitimate validation input. A platform charge variant method that has been thoroughly validated for multiple products can be applied to new products with reduced validation, focusing on product-specific aspects like impurity specificity and acceptance criteria rather than repeating full performance characterization.

However, organizations often claim platform status for methods that aren’t genuinely robust across the platform scope. A method that worked well for three high-expressing stable molecules might fail for a molecule with unusual post-translational modifications or stability challenges. Declaring something a “platform method” doesn’t automatically make it appropriate for all platform products.

Effective platform approaches require disciplined knowledge management documenting what’s actually known about method performance across product diversity, explicit identification of product attributes that might challenge method suitability, and honest assessment of when product-specific factors require more extensive validation.

The work-as-done reality is that platform methods often perform differently across products but these differences go unrecognized because validation strategies assume platform applicability rather than testing it. Quality leaders should ensure that platform method programs include ongoing monitoring of performance across products, not just initial validation studies.

What This Means for Investigations

The connection between analytical method validation and quality investigations is profound but often overlooked. When products fail specification, stability trends show concerning patterns, or process monitoring reveals unexpected variability, investigations invariably rely on analytical data. The quality of those investigations depends entirely on whether the analytical methods actually perform as assumed.

I’ve advocated for causal reasoning in investigations—focusing on what actually happened and why rather than cataloging everything that didn’t happen. This approach demands confidence in analytical results. If we can’t trust that our analytical methods are accurately measuring what we think they’re measuring, causal reasoning becomes impossible. We can’t identify causal mechanisms when we can’t reliably observe the phenomena we’re investigating.

The lifecycle validation paradigm, properly implemented, strengthens investigation capability by ensuring analytical methods remain fit for purpose throughout their use. Stage 3 performance verification should detect analytical performance drift before it creates false signals that trigger fruitless investigations or masks genuine quality issues that should be investigated.

However, this requires that investigation teams understand analytical method limitations and consider measurement uncertainty when evaluating results. An assay result of 98% when specification is 95-105% doesn’t necessarily represent genuine process variation if the method’s measurement uncertainty spans several percentage points. Understanding what analytical variation is normal versus unusual requires engagement with the analytical validation and ongoing verification data—engagement that happens far too rarely in practice.
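A minimal sketch of that uncertainty check, with the expanded uncertainty value and results invented for illustration:

```python
def within_uncertainty_of_limit(result, limit, u_expanded):
    """True when the result's expanded-uncertainty interval overlaps the
    specification limit, i.e. the measurement alone cannot distinguish
    conformity from nonconformity."""
    return abs(result - limit) <= u_expanded

# Assay result of 98.0% against a 95.0-105.0% specification, with an
# assumed expanded measurement uncertainty (k = 2) of 1.5 percentage points
result, lower_spec, U = 98.0, 95.0, 1.5
ambiguous = within_uncertainty_of_limit(result, lower_spec, U)
# |98.0 - 95.0| = 3.0 > 1.5, so this result is unambiguously inside spec;
# a result of 96.0 would sit within the uncertainty band of the limit.
```

Even this toy check forces the question investigations too often skip: do we actually know the method's measurement uncertainty well enough to say whether a result near a limit means anything?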

Quality organizations should build explicit links between their analytical lifecycle management programs and investigation processes. Investigation templates should prompt consideration of measurement uncertainty. Trending programs should monitor analytical variation separately from product variation. Investigation training should include analytical performance concepts so investigators understand what questions to ask when analytical results seem anomalous.

The Work-as-Done Reality of Method Validation

Perhaps the most important practical implication involves honest reckoning with how validation actually happens versus how guidance documents describe it. Validation protocols present idealized experimental sequences with carefully controlled conditions and expert execution. The work-as-imagined of validation assumes adequate resources, appropriate timeline, skilled analysts, stable equipment, and consistent materials.

Work-as-done validation often involves constrained timelines driving corner-cutting, resource limitations forcing compromise, analyst skill gaps requiring extensive supervision, equipment variability creating unexplained results, and material availability forcing substitutions. These conditions shape validation study quality in ways that rarely appear in validation reports.

Organizations under regulatory pressure to validate quickly might conduct studies before development is genuinely complete, generating data that meets protocol acceptance criteria without establishing genuine confidence in method fitness. Analytical labs struggling with staffing shortages might rely on junior analysts for validation studies that require expert judgment. Equipment with marginal suitability might be used because better alternatives aren’t available within timeline constraints.

These realities don’t disappear because we adopt lifecycle validation frameworks or implement ATP concepts. Quality leaders must create organizational conditions where work-as-done validation can reasonably approximate work-as-imagined validation. This means adequate resources, appropriate timelines that don’t force rushing, investment in analyst training and equipment capability, and willingness to acknowledge when validation studies reveal genuine limitations requiring method redevelopment.

The alternative is validation theater—impressive documentation packages describing validation studies that didn’t actually happen as reported or didn’t genuinely demonstrate what they claim to demonstrate. Such theater satisfies regulatory inspections while creating quality systems built on foundations of misrepresentation—exactly the kind of organizational inauthenticity that Sidney Dekker’s work warns against.

Critical Analysis: What USP <1225> Gets Right (and Where Questions Remain)

The revised USP <1225> deserves credit for several important advances while also raising questions about implementation and potential for misuse.

Strengths of the Revision

Lifecycle integration: By explicitly connecting to USP <1220> and acknowledging ICH Q14 and Q2(R2), the chapter positions compendial validation within the broader analytical lifecycle framework. This represents significant conceptual progress from treating validation as an isolated event.

Reportable result focus: Emphasizing that validation should address the actual output used for quality decisions rather than intermediate measurements aligns validation with its genuine purpose—ensuring reliable decision-making data.

Combined accuracy-precision evaluation: Providing guidance on total error approaches acknowledges the statistical reality that these characteristics interact and should be evaluated together when appropriate.

Knowledge management: Explicit acknowledgment that development data, prior knowledge, and platform experience constitute legitimate validation inputs encourages more efficient validation strategies and better integration across analytical lifecycle stages.

Flexibility for risk-based approaches: While maintaining traditional validation categories, the revision provides conceptual space for fitness-for-purpose thinking and risk-based validation strategies.
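One simple version of the combined accuracy-precision (total error) idea noted above treats |bias| + k*SD as a worst-case error and compares it to an acceptance limit. The k multiplier, the limit, and the recoveries below are my assumptions; a rigorous implementation would use beta-expectation tolerance intervals rather than this shortcut:

```python
# Simple total-error check: |bias| + k * SD against an acceptance limit.
import statistics

def total_error(measured, true_value, k=2.0):
    """Combine bias and precision into one worst-case error estimate."""
    bias = statistics.fmean(measured) - true_value
    sd = statistics.stdev(measured)  # sample standard deviation
    return abs(bias) + k * sd

# Hypothetical recoveries (%) at the 100% level from a validation study
recoveries = [99.1, 100.4, 99.8, 100.9, 99.6, 100.2]
te = total_error(recoveries, 100.0)
acceptable = te <= 3.0  # assumed acceptance limit of 3% total error
```

The value of evaluating the two characteristics jointly is visible even here: a method could pass separate bias and precision criteria while its combined error still exceeds what the decision actually tolerates.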

Potential Implementation Challenges

Statistical sophistication requirements: Combined accuracy-precision evaluation and other advanced approaches require statistical expertise many analytical laboratories lack. Without adequate support, organizations might misapply statistical methods or avoid them entirely, losing the benefits the revision offers.

Interpretive ambiguity: Concepts like fitness for purpose and appropriate use of prior knowledge create interpretive flexibility that can be used constructively or abused. Without clear examples and expectations, organizations might claim compliance while failing to genuinely implement lifecycle thinking.

Resource implications: Validating with replication strategies matching routine use, conducting robust Stage 3 verification, and maintaining appropriate knowledge management all require resources beyond traditional validation. Organizations already stretched thin might struggle to implement these practices meaningfully.

Integration with existing systems: Companies with established validation programs built around traditional category-based approaches face significant effort to transition toward lifecycle validation thinking, particularly for legacy methods already in use.

Regulatory expectations uncertainty: Until regulatory agencies provide clear inspection and review expectations around the revised chapter’s concepts, organizations face uncertainty about what will be considered adequate implementation versus what might trigger deficiency citations.

The Risk of New Compliance Theater

My deepest concern about the revision is that organizations might treat new concepts as additional compliance checkboxes rather than genuine analytical challenges. Instead of honestly grappling with whether methods are fit for purpose, they might add “fitness for purpose justification” sections to validation reports that provide ritualistic explanations without meaningful analysis.

Reportable result definitions could become templates copied across validation protocols without consideration of what’s actually being reported. Replication strategies might nominally match routine use while validation continues to be conducted under unrealistically controlled conditions. Combined accuracy-precision evaluations might be performed because the guidance mentions them without understanding what the statistical intervals reveal about method performance.

This theater would be particularly insidious because it would satisfy document review while completely missing the point. Organizations could claim to be implementing lifecycle validation principles while actually maintaining traditional validation-as-event practices with updated terminology.

Preventing this outcome requires quality leaders who understand the conceptual foundations of lifecycle validation and insist on genuine implementation rather than cosmetic compliance. It requires analytical organizations willing to acknowledge when they don’t understand new concepts and seek appropriate expertise. It requires resource commitment to do lifecycle validation properly rather than trying to achieve it within existing resource constraints.

Questions for the Pharmaceutical Community

Several questions deserve broader community discussion as organizations implement the revised chapter:

How will regulatory agencies evaluate fitness-for-purpose justifications? What level of rigor is expected? How will reviewers distinguish between thoughtful risk-based strategies and efforts to minimize validation requirements?

What constitutes adequate Stage 3 verification for different method types and criticality levels? Without detailed guidance, organizations must develop their own programs. Will regulatory consensus emerge around what adequate verification looks like?

How should platform methods be validated and verified? What documentation demonstrates platform applicability? How much product-specific validation is expected?

What happens to legacy methods validated under traditional approaches? Is retrospective alignment with lifecycle concepts expected? How should organizations prioritize analytical lifecycle improvement efforts?

How will contract laboratories implement lifecycle validation? Many analytical testing organizations operate under fee-for-service models that don’t easily accommodate ongoing Stage 3 verification. How will sponsor oversight adapt?

These questions don’t have obvious answers, which means early implementers will shape emerging practices through their choices. Quality leaders should engage actively with peers, standards bodies, and regulatory agencies to help develop community understanding of reasonable implementation approaches.

Building Falsifiable Analytical Systems

Throughout this blog, I’ve advocated for falsifiable quality systems—systems designed to make testable predictions that could be proven wrong through empirical observation. The lifecycle validation paradigm, properly implemented, enables genuinely falsifiable analytical systems.

Traditional validation generates unfalsifiable claims: “This method was validated according to ICH Q2 requirements” or “Validation demonstrated acceptable performance for all required characteristics.” These statements can’t be proven false because they describe historical activities rather than make predictions about ongoing performance.

Lifecycle validation creates falsifiable claims: “This method will generate reportable results meeting the Analytical Target Profile requirements when operated within the defined analytical control strategy.” This prediction can be tested—and potentially falsified—through Stage 3 performance verification.

Every batch tested, every stability sample analyzed, every investigation that relies on analytical results provides opportunity to test whether the method continues performing as validation claimed it would. System suitability results, QC sample trending, interlaboratory comparisons, and investigation findings all generate evidence that either supports or contradicts the fundamental claim that the method remains fit for purpose.
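In code, that kind of falsification test might look like the following sketch, where the claimed RSD, the tolerance margin, and the routine data are all assumed for illustration:

```python
# Comparing routine-use precision against the precision claimed at validation.
import statistics

def rsd_percent(values):
    """Relative standard deviation, as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)

def claim_holds(routine_values, claimed_rsd, margin=1.5):
    """Crude falsification check: flag when the observed routine RSD exceeds
    the validated claim by more than an allowed margin multiplier."""
    return rsd_percent(routine_values) <= claimed_rsd * margin

# Validation claimed <= 1.0% RSD; hypothetical routine replicates disagree
routine = [98.2, 101.5, 97.9, 102.3, 99.0, 101.1]
ok = claim_holds(routine, claimed_rsd=1.0)  # False: the claim is falsified
```

A serious program would use proper hypothesis tests or tolerance intervals with defined sample sizes, but the structure is the point: the validation claim is stated as a number that routine data can contradict.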

Building falsifiable analytical systems requires:

  • Explicit performance predictions: The ATP or fitness-for-purpose justification must articulate specific, measurable performance criteria that can be objectively verified, not vague assertions of adequacy.
  • Ongoing performance monitoring: Stage 3 verification must actually measure the performance characteristics claimed during validation and detect degradation before methods drift into inadequate performance.
  • Investigation of anomalies: Unexpected results, system suitability failures, or performance trending outside normal ranges should trigger investigation of whether the method continues to perform as validated, not just whether samples or equipment caused the anomaly.
  • Willingness to invalidate: Organizations must be willing to acknowledge when ongoing evidence falsifies validation claims—when methods prove inadequate despite “passing validation”—and take appropriate corrective action including method redevelopment or replacement.

This last requirement is perhaps most challenging. Admitting that a validated method doesn’t actually work threatens regulatory commitments, creates resource demands for method improvement, and potentially reveals years of questionable analytical results. The organizational pressure to maintain the fiction that validated methods remain adequate is immense.

But genuinely robust quality systems require this honesty. Methods that seemed adequate during validation sometimes prove inadequate under routine conditions. Technology advances reveal limitations in historical methods. Understanding of critical quality attributes evolves, changing performance requirements. Falsifiable analytical systems acknowledge these realities and adapt, while unfalsifiable systems maintain comforting fictions about adequacy until external pressure forces change.

The connection to investigation excellence is direct. When investigations rely on analytical results generated by methods known to be marginal but maintained because they’re “validated,” investigation findings become questionable. We might be investigating analytical artifacts rather than genuine quality issues, or failing to investigate real issues because inadequate analytical methods don’t detect them.

Investigations founded on falsifiable analytical systems can have greater confidence that anomalous results reflect genuine events worth investigating rather than analytical noise. This confidence enables the kind of causal reasoning that identifies true mechanisms rather than documenting procedural deviations that might or might not have contributed to observed results.

The Validation Revolution We Need

The convergence of revised USP <1225>, ICH Q2(R2), and ICH Q14 represents potential for genuine transformation in how pharmaceutical organizations approach analytical validation—if we’re willing to embrace the conceptual challenges these documents present rather than treating them as updated compliance templates.

The core shift is from validation-as-event to validation-as-lifecycle-stage. Methods aren’t validated once and then assumed adequate until problems force revalidation. They’re developed with systematic understanding, validated to confirm fitness for purpose, and continuously verified to ensure they remain adequate under evolving conditions. Knowledge accumulates across the lifecycle, informing method improvements and transfer while building organizational capability.

This transformation demands intellectual honesty about whether our methods actually perform as claimed, organizational willingness to invest resources in genuine lifecycle management rather than minimal compliance, and leadership that insists on substance over theater. These demands are substantial, which is why many organizations will implement the letter of revised requirements while missing their spirit.

For quality leaders committed to building genuinely robust analytical systems, the path forward involves:

  • Developing organizational capability in lifecycle validation thinking, ensuring analytical teams understand concepts beyond superficial compliance requirements and can apply them thoughtfully to specific analytical challenges.
  • Creating systems and processes that support Stage 3 verification, not just Stage 2 validation, acknowledging that ongoing performance monitoring is where lifecycle validation either succeeds or fails in practice.
  • Building bridges between analytical validation and other quality functions, particularly investigations, trending, and change management, so that analytical performance information actually informs decision-making across the quality system.
  • Maintaining falsifiability in analytical systems, insisting on explicit, testable performance claims rather than vague adequacy assertions, and creating organizational conditions where evidence of inadequate performance prompts honest response rather than rationalization.
  • Engaging authentically with what methods can and cannot do, avoiding the twin errors of assuming validated methods are perfect or maintaining methods known to be inadequate because they’re “validated.”

The pharmaceutical industry has an opportunity to advance analytical quality substantially through thoughtful implementation of lifecycle validation principles. The revised USP <1225>, aligned with ICH Q2(R2) and Q14, provides the conceptual framework. Whether we achieve genuine transformation or merely update compliance theater depends on choices quality leaders make about how to implement these frameworks in practice.

The stakes are substantial. Analytical methods are how we know what we think we know about product quality. When those methods are inadequate—whether because validation was theatrical, ongoing performance has drifted, or fitness for purpose was never genuinely established—our entire quality system rests on questionable foundations. We might be releasing product that doesn’t meet specifications, investigating artifacts rather than genuine quality issues, or maintaining comfortable confidence in systems that don’t actually work as assumed.

Lifecycle validation, implemented with genuine commitment to falsifiable quality systems, offers a path toward analytical capabilities we can actually trust rather than merely document. The question is whether pharmaceutical organizations will embrace this transformation or simply add new compliance layers onto existing practices while fundamental problems persist.

The answer to that question will emerge not from reading guidance documents but from how quality leaders choose to lead, what they demand from their analytical organizations, and what they’re willing to acknowledge about the gap between validation documents and validation reality. The revised USP <1225> provides tools for building better analytical systems. Whether we use those tools constructively or merely as updated props for compliance theater is entirely up to us.

Method Qualification and Validation

The terms “method qualification” and “method validation” are often used in the context of analytical procedures in the pharmaceutical and biotechnology industries. While related, they serve different purposes and are applied at different stages of method development. Here is a detailed comparison of the two:

Method Qualification

Definition

Method qualification demonstrates that an analytical method is suitable for its intended use during the early stages of development. It involves preliminary testing to ensure the method can produce reliable and reproducible results for the specific application.

Purpose

  • Early Development: This is typically performed during the early phases of drug development (e.g., preclinical and Phase I clinical trials) to assess the method’s feasibility.
  • Optimization: Helps in optimizing the method to ensure it meets the necessary performance criteria before full validation.
  • Feasibility Studies: Often referred to as feasibility or pre-validation studies, method qualification helps understand the method’s performance characteristics and establish preliminary acceptance criteria.

Characteristics

  • Flexibility: The method can still be modified and optimized based on the results obtained during qualification.
  • Parameters: Qualification evaluates fewer parameters than validation, focusing on key performance characteristics such as specificity, linearity, accuracy, and precision.
  • Voluntary: This is not always required by regulatory authorities, but it is a good practice to ensure the method is on the right track for future validation.

Example

A company developing a new drug might perform method qualification to ensure that their analytical method can accurately measure the drug’s concentration in biological samples before moving on to more rigorous validation studies.
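A feasibility check of this kind boils down to estimating accuracy (recovery against nominal spiked concentrations) and precision (RSD). The following sketch illustrates the arithmetic; the data, the 98–102% recovery window, and the 2% RSD threshold are hypothetical illustrations, not regulatory criteria:

```python
# Sketch: a pre-validation feasibility check on spiked samples.
# Data and the 98-102% / 2% RSD criteria are illustrative assumptions.
import statistics

def recovery_pct(measured, spiked):
    """Mean percent recovery of measured vs. nominal spiked concentrations."""
    return statistics.mean(m / s * 100.0 for m, s in zip(measured, spiked))

def rsd_pct(values):
    """Relative standard deviation (precision) as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0

spiked   = [10.0, 10.0, 10.0, 50.0, 50.0, 50.0]   # nominal ng/mL
measured = [9.8, 10.1, 9.9, 49.2, 50.6, 49.9]     # assay results

acc = recovery_pct(measured, spiked)
prec = rsd_pct([m / s for m, s in zip(measured, spiked)])
print(f"recovery {acc:.1f}%, RSD {prec:.1f}%")
print("feasible" if 98.0 <= acc <= 102.0 and prec <= 2.0 else "rework method")
```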

Method Validation

Definition

Method validation proves that an analytical method is suitable for its intended purpose and can consistently produce reliable and reproducible results under specified conditions. It is a regulatory requirement for methods used to test drug substances and products.

Purpose

  • Regulatory Compliance: Required by regulatory authorities (e.g., FDA, EMA) for methods used in quality control and release testing of pharmaceutical products.
  • Late Development: This is typically performed during the later stages of drug development (e.g., Phase III clinical trials and commercial production), when the method is fully developed and optimized.
  • Consistency: Ensures that the method produces consistent results over time and across different laboratories.

Characteristics

  • Rigidity: The method must be fully developed and optimized before validation. Any changes to the method after validation would require re-validation.
  • Comprehensive: Involves a thorough evaluation of multiple parameters as defined by guidelines such as ICH Q2(R2), including accuracy, precision (repeatability and intermediate precision), specificity, linearity, range, limit of detection (LOD), limit of quantitation (LOQ), and robustness.
  • Mandatory: Required by regulatory authorities to ensure the quality, reliability, and consistency of analytical results.

Example

Before a pharmaceutical company can market a new drug, it must validate its analytical methods to demonstrate that they can reliably measure the drug’s potency, purity, and stability according to regulatory standards.
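Two of the validation characteristics named above, linearity and the detection/quantitation limits, have well-known arithmetic: ICH Q2 permits estimating LOD and LOQ from the calibration curve as 3.3·σ/S and 10·σ/S, where S is the slope and σ the residual standard deviation. A minimal sketch with illustrative calibration data (the numbers are assumptions, not from any real method):

```python
# Sketch: linearity and LOD/LOQ estimation from a calibration curve, using the
# ICH Q2 formulas LOD = 3.3*sigma/S and LOQ = 10*sigma/S, where S is the slope
# and sigma the residual standard deviation. Data are illustrative.
import math

def linear_fit(x, y):
    """Ordinary least squares: returns (slope, intercept, residual_sd, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - my) ** 2 for yi in y)
    residual_sd = math.sqrt(ss_res / (n - 2))
    return slope, intercept, residual_sd, 1 - ss_res / ss_tot

conc     = [10.0, 25.0, 50.0, 75.0, 100.0]       # % of target concentration
response = [101.0, 252.0, 498.0, 751.0, 1003.0]  # peak areas (illustrative)

slope, intercept, sd, r2 = linear_fit(conc, response)
print(f"R^2 = {r2:.5f}, LOD = {3.3 * sd / slope:.2f}, LOQ = {10 * sd / slope:.2f}")
```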

Key Differences

| Aspect | Method Qualification | Method Validation |
|---|---|---|
| Stage of Development | Early stages (pre-clinical, Phase I) | Later stages (Phase III, commercial production) |
| Purpose | Feasibility and optimization | Regulatory compliance and consistency |
| Flexibility | Method can be modified | Method must be fully developed and optimized |
| Parameters Evaluated | Fewer, key performance indicators | Comprehensive, as per regulatory guidelines |
| Regulatory Requirement | Voluntary, good practice | Mandatory |

In summary, method qualification is an early-stage activity aimed at ensuring that an analytical method is on the right path to becoming reliable and reproducible, while method validation is a more rigorous, comprehensive process required to demonstrate that the method meets all regulatory requirements for its intended use.

The State of the Analytical Lifecycle

There has been a lot of change in the way pharma thinks about analytical lifecycles in the last few years. With changes in technology, new product modalities, the release of ICH Q2(R2) and ICH Q14 in November 2023, and USP <1220> in 2022, it is fair to say we are all catching up with our analytical lifecycle programs.

Let’s discuss what I think are the four pivotal documents that provide direction.

ICH Q2(R2) and ICH Q14

ICH Q2(R2) and ICH Q14 are complementary guidelines that provide a comprehensive framework for the development, validation, and lifecycle management of analytical procedures used in the pharmaceutical industry.

ICH Q14 describes the scientific principles and risk-based approaches for developing and maintaining suitable analytical procedures throughout their lifecycle. It outlines the key elements and considerations for analytical procedure development, including:

  • Defining an Analytical Target Profile (ATP)
  • Knowledge management and risk assessment
  • Evaluating robustness and parameter ranges
  • Establishing an Analytical Procedure Control Strategy
  • Lifecycle management and post-approval changes
  • Multivariate analytical procedures
  • Real-time release testing
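The ATP is the element of Q14 that most rewards being treated as a checkable artifact rather than a document. A hypothetical sketch of that idea (the field names and acceptance criteria are my illustrative assumptions, not ICH Q14 prescriptions):

```python
# Sketch: an Analytical Target Profile as a checkable object. Field names and
# acceptance criteria are hypothetical illustrations, not ICH Q14 requirements.
from dataclasses import dataclass

@dataclass(frozen=True)
class ATP:
    analyte: str
    range_pct: tuple      # (low, high) reportable range, % of target
    max_bias_pct: float   # allowable accuracy bias
    max_rsd_pct: float    # allowable precision (RSD)

    def accepts(self, bias_pct: float, rsd_pct: float) -> bool:
        """A candidate procedure conforms only if bias and RSD both pass."""
        return abs(bias_pct) <= self.max_bias_pct and rsd_pct <= self.max_rsd_pct

atp = ATP(analyte="Drug X assay", range_pct=(80, 120),
          max_bias_pct=2.0, max_rsd_pct=1.0)

# Any candidate procedure (HPLC, UPLC, a future replacement) is judged
# against the same profile -- the ATP, not the technique, is the requirement.
print(atp.accepts(bias_pct=0.8, rsd_pct=0.6))   # conforming performance
print(atp.accepts(bias_pct=2.5, rsd_pct=0.6))   # bias outside the profile
```

The design choice worth noting: because the ATP is technique-agnostic, replacing a method later is a question of demonstrating conformance to the same profile, not of re-litigating the requirement.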

On the other hand, ICH Q2(R2) provides specific guidance on validating analytical procedures to demonstrate their suitability for the intended use. It covers various validation tests, methodologies, and evaluation criteria, such as:

  • Specificity/selectivity
  • Working range
  • Accuracy and precision
  • Robustness
  • Stability-indicating properties
  • Multivariate analytical procedures

In summary, ICH Q14 establishes the overarching principles and approaches for analytical procedure development, while ICH Q2(R2) focuses on the validation aspects that ensure analytical procedures are fit for purpose and meet quality requirements throughout their lifecycle. The two guidelines are intended to be applied together, with ICH Q14 providing the framework for development and ICH Q2(R2) specifying the validation requirements.

USP <1220> Analytical Procedure Lifecycle and USP <1058> Analytical Instrument Qualification

USP <1220> Analytical Procedure Lifecycle and USP <1058> Analytical Instrument Qualification are closely connected, complementary chapters that together provide a comprehensive framework for ensuring data integrity and quality in analytical procedures throughout their lifecycle.

The key connections between USP <1220> and USP <1058> are:

  1. USP <1220> establishes the principles and requirements for managing the entire lifecycle of analytical procedures, from procedure design and development to retirement. It emphasizes the importance of defining an Analytical Target Profile (ATP) and implementing an Analytical Procedure Control Strategy.
  2. USP <1058> focuses explicitly on the qualification of analytical instruments that execute analytical procedures. It outlines the requirements for ensuring instruments are suitable for their intended use through proper qualification (Design, Installation, Operational, and Performance Qualification).
  3. The instrument qualification activities described in USP <1058> are critical to the overall Analytical Procedure Control Strategy outlined in USP <1220>. Proper instrument qualification as per <1058> helps ensure the quality and integrity of data generated by analytical procedures throughout their lifecycle.
  4. Both guidelines stress the importance of defining user requirements (ATP in <1220> and User Requirements Specification in <1058>) as the basis for procedure development and instrument qualification activities.
  5. USP <1220> requires ongoing monitoring and periodic requalification of analytical procedures, which includes re-evaluating the suitability of the analytical instruments used, as described in the Performance Qualification section of <1058>.

USP <1220> provides the overarching framework for holistically managing analytical procedures. USP <1058> focuses on ensuring the instruments used to execute those procedures are properly qualified and suitable for their intended use. The two guidelines work together to maintain data integrity and quality across the entire analytical lifecycle.

Complementary Approaches

USP <1220> Analytical Procedure Lifecycle is closely related to and complements the ICH Q2(R2) and ICH Q14 guidelines.

  1. USP <1220> aligns with the principles outlined in ICH Q14 for managing the entire lifecycle of analytical procedures, from design and development to retirement. Both emphasize defining an Analytical Target Profile and implementing an Analytical Procedure Control Strategy.
  2. The validation activities described in ICH Q2(R2), such as evaluating specificity, accuracy, precision, and robustness, are critical components of the Analytical Procedure Control Strategy required by USP <1220>.
  3. USP <1220> requires ongoing monitoring and periodic requalification of analytical procedures, which aligns with the lifecycle management approach promoted in ICH Q14 and the validation during the lifecycle section in Q2(R2).
  4. All these guidelines stress the importance of knowledge management, risk management, and a science/risk-based approach throughout the analytical procedure lifecycle.
  5. The instrument qualification requirements outlined in USP <1058> are an integral part of the overall Analytical Procedure Control Strategy described in USP <1220>, ensuring instruments are suitable as per ICH Q2(R2) validation principles.

In essence, USP <1220> provides a comprehensive framework for analytical procedure lifecycle management that incorporates and operationalizes the scientific principles and validation activities detailed in the ICH Q14 and Q2(R2) guidelines, while USP <1058> provides the roadmap for instrument qualification. These four documents establish harmonized best practices for analytical procedures from development through retirement.

GMP Lab Warning Letter – A Baseline of Expectations

A February 2022 FDA Warning Letter to Accu Bio-Chem Laboratories provides a good baseline for what your audit programs should look at and what your own labs should focus on.

Add a good lab instrument qualification review and solid supplier/raw-materials management, and you have a pretty strong program.