You Gotta Have Heart: Combating Human Error

The persistent attribution of human error as a root cause of deviations reveals far more about systemic weaknesses than individual failings. The label often masks deeper organizational, procedural, and cultural flaws. Like cracks in a foundation, recurring human errors signal where quality management systems (QMS) fail to account for the complexities of human cognition, communication, and operational realities.

The Myth of Human Error as a Root Cause

Regulatory agencies increasingly reject “human error” as an acceptable conclusion in deviation investigations. This shift recognizes that human actions occur within a web of systemic influences. A technician’s missed documentation step or a formulation error rarely stems from carelessness alone; it emerges from organizational, procedural, and environmental conditions.

The aviation industry’s “Tower of Babel” problem—where siloed teams develop isolated communication loops—parallels pharmaceutical manufacturing. The Quality Unit may prioritize regulatory compliance, while production focuses on throughput, creating disjointed interpretations of “quality.” These disconnects manifest as errors when cross-functional risks go unaddressed.

Cognitive Architecture and Error Propagation

Human cognition operates under predictable constraints. Attentional biases, memory limitations, and heuristic decision-making—while evolutionarily advantageous—create vulnerabilities in GMP environments. For example:

  • Attentional tunneling: An operator hyper-focused on solving an equipment jam may overlook a temperature excursion alert.
  • Procedural drift: Subtle deviations from written protocols accumulate over time as workers optimize for perceived efficiency.
  • Complacency cycles: Over-familiarity with routine tasks reduces vigilance, particularly during night shifts or prolonged operations.

These cognitive patterns aren’t failures but features of human neurobiology. Effective QMS design anticipates them through:

  1. Error-proofing: Automated checkpoints that detect deviations before critical process stages
  2. Cognitive load management: Procedures (including batch records) tailored to cognitive load principles with decision-support prompts
  3. Resilience engineering: Simulations that train teams to recognize and recover from near-misses
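As an illustration of the first point, error-proofing can be as simple as a software gate that refuses to advance a process while any critical parameter is out of range. The following is a hypothetical sketch; the parameter names and limits are invented, not taken from any particular control system:

```python
# Illustrative error-proofing checkpoint: verify critical parameters
# before a critical process stage may begin. Names/limits are invented.

CRITICAL_LIMITS = {
    "temperature_c": (2.0, 8.0),   # validated storage range, degrees C
    "ph": (6.8, 7.4),
}

def checkpoint(readings: dict[str, float]) -> list[str]:
    """Return a list of deviations; an empty list means the stage may proceed."""
    deviations = []
    for parameter, (low, high) in CRITICAL_LIMITS.items():
        value = readings[parameter]
        if not (low <= value <= high):
            deviations.append(f"{parameter}={value} outside [{low}, {high}]")
    return deviations

# A temperature excursion is flagged before the stage starts,
# rather than relying on the operator to notice the alert.
issues = checkpoint({"temperature_c": 9.1, "ph": 7.0})
print(issues)
```

The point of the sketch is that the check runs unconditionally at the stage boundary, so an attentionally tunneled operator cannot skip it.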

Strategies for Reframing Human Error Analysis

Conduct Cognitive Autopsies

Move beyond 5-Whys to adopt human factors analysis frameworks:

  • Human Error Assessment and Reduction Technique (HEART): Quantifies the likelihood of specific error types based on task characteristics
  • Critical Action and Decision (CAD) timelines: Maps decision points where system defenses failed

For example, a labeling mix-up might reveal:

  • Task factors: Nearly identical packaging for two products (29% contribution to error likelihood)
  • Environmental factors: Poor lighting in labeling area (18%)
  • Organizational factors: Inadequate change control when adding new SKUs (53%)

Redesign for Intuitive Use

Redesigning for intuitive use requires multilayered approaches based on understanding how human brains actually work. At the foundation lies procedural chunking, an evidence-based method that restructures complex standard operating procedures (SOPs) into digestible cognitive units aligned with working memory limitations. This approach segments multiphase processes like aseptic filling into discrete verification checkpoints, reducing cognitive overload while maintaining procedural integrity through sequenced validation gates. By mirroring the brain’s natural pattern recognition capabilities, chunked protocols demonstrate significantly higher compliance rates compared to traditional monolithic SOP formats.

Complementing this cognitive scaffolding, mistake-proof redesigns create inherent error detection mechanisms.

To sustain these engineered safeguards, progressive facilities implement peer-to-peer audit protocols during critical operations and transition periods.

Leverage Error Data Analytics

The integration of data analytics into organizational processes has emerged as a critical strategy for minimizing human error, enhancing accuracy, and driving informed decision-making. By leveraging advanced computational techniques, automation, and machine learning, data analytics addresses systemic vulnerabilities.
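Even before machine learning enters the picture, simple aggregation can surface the dominant error-producing conditions in a site's deviation history. A minimal sketch, using invented deviation records:

```python
# Minimal error-trend analytics: rank error-producing conditions by how
# often they appear across deviation records. The records are invented.

from collections import Counter

deviations = [
    {"id": "DEV-001", "epcs": ["time shortage", "poor feedback"]},
    {"id": "DEV-002", "epcs": ["time shortage"]},
    {"id": "DEV-003", "epcs": ["unfamiliarity", "time shortage"]},
]

# Count each EPC once per deviation it appears in.
epc_counts = Counter(epc for record in deviations for epc in record["epcs"])

# A Pareto-style ranking points improvement efforts at the biggest driver.
for epc, count in epc_counts.most_common():
    print(f"{epc}: {count}")
```

In practice the same ranking, fed from a deviation-management system rather than a hard-coded list, tells you which systemic condition to fix first.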

Human Error Assessment and Reduction Technique (HEART): A Systematic Framework for Error Mitigation

Benefits of the Human Error Assessment and Reduction Technique (HEART)

1. Simplicity and Speed: HEART is designed to be straightforward and does not require complex tools, software, or large datasets. This makes it accessible to organizations without extensive human factors expertise and allows for rapid assessments. The method is easy to understand and apply, even in time-constrained or resource-limited environments.

2. Flexibility and Broad Applicability: HEART can be used across a wide range of industries—including nuclear, healthcare, aviation, rail, process industries, and engineering—due to its generic task classification and adaptability to different operational contexts. It is suitable for both routine and complex tasks.

3. Systematic Identification of Error Influences: The technique systematically identifies and quantifies Error Producing Conditions (EPCs) that increase the likelihood of human error. This structured approach helps organizations recognize the specific factors—such as time pressure, distractions, or poor procedures—that most affect reliability.

4. Quantitative Error Prediction: HEART provides a numerical estimate of human error probability for specific tasks, which can be incorporated into broader risk assessments, safety cases, or design reviews. This quantification supports evidence-based decision-making and prioritization of interventions.

5. Actionable Risk Reduction: By highlighting which EPCs most contribute to error, HEART offers direct guidance on where to focus improvement efforts—whether through engineering redesign, training, procedural changes, or automation. This can lead to reduced error rates, improved safety, fewer incidents, and increased productivity.

6. Supports Accident Investigation and Design: HEART is not only a predictive tool but also valuable in investigating incidents and guiding the design of safer systems and procedures. It helps clarify how and why errors occurred, supporting root cause analysis and preventive action planning.

7. Encourages Safety and Quality Culture and Awareness: Regular use of HEART increases awareness of human error risks and the importance of control measures among staff and management, fostering a proactive culture.

When Is HEART Best Used?

  • Risk Assessment for Critical Tasks: When evaluating tasks where human error could have severe consequences (e.g., operating nuclear control systems, administering medication, critical maintenance), HEART helps quantify and reduce those risks.
  • Design and Review of Procedures: During the design or revision of operational procedures, HEART can identify steps most vulnerable to error and suggest targeted improvements.
  • Incident Investigation: After a failure or near-miss, HEART helps reconstruct the event, identify contributing EPCs, and recommend changes to prevent recurrence.
  • Training and Competence Assessment: HEART can inform training programs by highlighting the conditions and tasks where errors are most likely, allowing for focused skill development and awareness.
  • Resource-Limited or Fast-Paced Environments: Its simplicity and speed make HEART ideal for organizations needing quick, reliable human error assessments without extensive resources or data.

Generic Task Types (GTTs): Establishing Baselines

HEART classifies human activities into nine Generic Task Types (GTTs) with predefined nominal human error probabilities (NHEPs) derived from decades of industrial incident data:

| GTT Code | Task Description | Nominal HEP (range) |
| --- | --- | --- |
| A | Complex, novel tasks requiring problem-solving | 0.55 (0.35–0.97) |
| B | Shifting attention between multiple systems | 0.26 (0.14–0.42) |
| C | High-skill tasks under time constraints | 0.16 (0.12–0.28) |
| D | Rule-based diagnostics under stress | 0.09 (0.06–0.13) |
| E | Routine procedural tasks | 0.02 (0.007–0.045) |
| F | Restoring system states | 0.003 (0.0008–0.007) |
| G | Highly practiced routine operations | 0.0004 (0.00008–0.009) |
| H | Supervised automated actions | 0.00002 (0.000006–0.0009) |
| M | Miscellaneous/undefined tasks | 0.03 (0.008–0.11) |

Comprehensive Taxonomy of Error-Producing Conditions (EPCs)

HEART’s 38 Error-Producing Conditions (EPCs) represent contextual amplifiers of error probability, categorized under the 4M Framework (Man, Machine, Media, Management):

| EPC Code | Description | Max Effect | 4M Category |
| --- | --- | --- | --- |
| 1 | Unfamiliarity with task | 17× | Man |
| 2 | Time shortage | 11× | Management |
| 3 | Low signal-to-noise ratio | 10× | Machine |
| 4 | Override capability of safety features | 9× | Machine |
| 5 | Spatial/functional incompatibility | 8× | Machine |
| 6 | Model mismatch between mental and system states | 8× | Man |
| 7 | Irreversible actions | 8× | Machine |
| 8 | Channel overload (information density) | 6× | Media |
| 9 | Technique unlearning | 6× | Man |
| 10 | Inadequate knowledge transfer | 5.5× | Management |
| 11 | Performance ambiguity | 5× | Media |
| 12 | Misperception of risk | 4× | Man |
| 13 | Poor feedback systems | 4× | Machine |
| 14 | Delayed/incomplete feedback | 4× | Media |
| 15 | Operator inexperience | 3× | Man |
| 16 | Impoverished information quality | 3× | Media |
| 17 | Inadequate checking procedures | 3× | Management |
| 18 | Conflicting objectives | 2.5× | Management |
| 19 | Lack of information diversity | 2.5× | Media |
| 20 | Educational/training mismatch | 2× | Management |
| 21 | Dangerous incentives | 2× | Management |
| 22 | Lack of skill practice | 1.8× | Man |
| 23 | Unreliable instrumentation | 1.6× | Machine |
| 24 | Need for absolute judgments | 1.6× | Man |
| 25 | Unclear functional allocation | 1.6× | Management |
| 26 | No progress tracking | 1.4× | Media |
| 27 | Physical capability mismatches | 1.4× | Man |
| 28 | Low semantic meaning of information | 1.4× | Media |
| 29 | Emotional stress | 1.3× | Man |
| 30 | Ill-health | 1.2× | Man |
| 31 | Low workforce morale | 1.2× | Management |
| 32 | Inconsistent interface design | 1.15× | Machine |
| 33 | Poor environmental conditions | 1.1× | Media |
| 34 | Low mental workload | 1.1× | Man |
| 35 | Circadian rhythm disruption | 1.06× | Man |
| 36 | External task pacing | 1.03× | Management |
| 37 | Supernumerary staffing issues | 1.03× | Management |
| 38 | Age-related capability decline | 1.02× | Man |

HEP Calculation Methodology

The HEART equation incorporates both multiplicative and additive effects of EPCs:

HEP = NHEP × ∏_i [(EPC_i − 1) × APOE_i + 1]

Where:

  • NHEP: Nominal Human Error Probability from the GTT
  • EPC_i: Maximum effect of the i-th EPC
  • APOE_i: Assessed Proportion of Effect (0–1)
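In code, the calculation reduces to a running product over the assessed EPCs. The sketch below follows the standard HEART form; the cap at 1.0 is a common practitioner convention, since a probability cannot exceed certainty:

```python
# Sketch of the HEART calculation:
#   HEP = NHEP * prod((EPC_i - 1) * APOE_i + 1)

def heart_hep(nhep: float, epcs: list[tuple[float, float]]) -> float:
    """Compute Human Error Probability.

    nhep -- nominal HEP for the Generic Task Type
    epcs -- list of (max_effect, apoe) pairs, with APOE in [0, 1]
    """
    hep = nhep
    for max_effect, apoe in epcs:
        hep *= (max_effect - 1) * apoe + 1  # assessed multiplier for this EPC
    return min(hep, 1.0)  # cap: a probability cannot exceed 1

# Example: GTT E (0.02) with time shortage (11x) assessed at 40% effect:
# 0.02 * ((11 - 1) * 0.4 + 1) = 0.1
print(heart_hep(0.02, [(11, 0.4)]))
```

Note how a single strong EPC at moderate assessed effect already moves a routine task from a 2% to a 10% error probability.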

HEART Case Study: Operator Error During Biologics Drug Substance Manufacturing

A biotech facility was producing a monoclonal antibody (mAb) drug substance using mammalian cell culture in large-scale bioreactors. The process involved upstream cell culture (expansion and production), followed by downstream purification (protein A chromatography, filtration), and final bulk drug substance filling. The manufacturing process required strict adherence to parameters such as temperature, pH, and feed rates to ensure product quality, safety, and potency.

During a late-night shift, an operator was responsible for initiating a nutrient feed into a 2,000L production bioreactor. The standard operating procedure (SOP) required the feed to be started at 48 hours post-inoculation, with a precise flow rate of 1.5 L/hr for 12 hours. The operator, under time pressure and after a recent shift change, incorrectly programmed the feed rate as 15 L/hr rather than 1.5 L/hr.

Outcome:

  • The rapid addition of nutrients caused a metabolic imbalance, leading to excessive cell growth, increased waste metabolite (lactate/ammonia) accumulation, and a sharp drop in product titer and purity.
  • The batch failed to meet quality specifications for potency and purity, resulting in the loss of an entire production lot.
  • Investigation revealed no system alarms for the high feed rate, and the error was only detected during routine in-process testing several hours later.

HEART Analysis

Task Definition

  • Task: Programming and initiating nutrient feed in a GMP biologics manufacturing bioreactor.
  • Criticality: Direct impact on cell culture health, product yield, and batch quality.

Generic Task Type (GTT)

| GTT Code | Description | Nominal HEP |
| --- | --- | --- |
| E | Routine procedural task with checking | 0.02 |

Error-Producing Conditions (EPCs) Using the 5M Model

| 5M Category | EPC (HEART) | Max Effect | APOE | Example in Incident |
| --- | --- | --- | --- | --- |
| Man | Inexperience with new feed system (EPC 15) | 3× | 0.8 | Operator recently trained on upgraded control interface |
| Machine | Poor feedback: no alarm for high feed rate (EPC 13) | 4× | 0.7 | System did not alert on out-of-range input |
| Media | Ambiguous SOP wording (EPC 11) | 5× | 0.5 | SOP listed feed rate as “1.5 L/hr” in a table, not text |
| Management | Time pressure to meet batch deadlines (EPC 2) | 11× | 0.6 | Shift was behind schedule due to earlier equipment delay |
| Milieu | Distraction during shift change (EPC 36) | 1.03× | 0.9 | Handover occurred mid-setup, leading to divided attention |

Human Error Probability (HEP) Calculation

HEP = 0.02 × [(3−1)(0.8)+1] × [(4−1)(0.7)+1] × [(5−1)(0.5)+1] × [(11−1)(0.6)+1] × [(1.03−1)(0.9)+1]
HEP = 0.02 × 2.6 × 3.1 × 3.0 × 7.0 × 1.027 ≈ 3.5

Since a probability cannot exceed 1, a calculated value this far above 1 means error was effectively certain under these conditions. This extremely high error probability highlights a systemic vulnerability, not just an individual lapse.

Root Cause and Contributing Factors

  • Operator: Recently trained, unfamiliar with new interface (Man)
  • System: No feedback or alarm for out-of-spec feed rate (Machine)
  • SOP: Ambiguous presentation of critical parameter (Media)
  • Management: High pressure to recover lost time (Management)
  • Environment: Shift handover mid-task, causing distraction (Milieu)

Corrective Actions

Technical Controls

  • Automated Range Checks: Bioreactor control software now prevents entry of feed rates outside validated ranges and requires supervisor override for exceptions.
  • Visual SOP Enhancements: Critical parameters are now highlighted in both text and tables, and reviewed during operator training.
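The automated range check described above might look like the following sketch. The function name, validated range, and messages are illustrative, not any particular control system’s API:

```python
# Hypothetical range check for feed-rate entry: values outside the
# validated range are rejected unless a supervisor explicitly overrides.

VALIDATED_FEED_RANGE_L_PER_HR = (0.5, 3.0)  # illustrative validated range

def set_feed_rate(rate: float, supervisor_override: bool = False) -> str:
    low, high = VALIDATED_FEED_RANGE_L_PER_HR
    if low <= rate <= high:
        return f"accepted: {rate} L/hr"
    if supervisor_override:
        # Exceptions are allowed but leave an audit trail.
        return f"accepted with override: {rate} L/hr (logged for review)"
    raise ValueError(f"{rate} L/hr outside validated range {low}-{high} L/hr")

print(set_feed_rate(1.5))   # the intended setting is accepted
# set_feed_rate(15) raises ValueError, blocking the 15 L/hr entry
# from the case study at the point of data entry.
```

A check like this converts the EPC “poor feedback” from a 4× error amplifier into a hard stop at the moment the wrong value is typed.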

Human Factors & Training

  • Simulation-Based Training: Operators practice feed setup in a virtual environment simulating distractions and time pressure.
  • Shift Handover Protocol: Critical steps cannot be performed during handover periods; tasks must be paused or completed before/after shift changes.

Management & Environmental Controls

  • Production Scheduling: Buffer time added to schedules to reduce time pressure during critical steps.
  • Alarm System Upgrade: Real-time alerts for any parameter entry outside validated ranges.

Outcomes (6-Month Review)

| Metric | Pre-Intervention | Post-Intervention |
| --- | --- | --- |
| Feed rate programming errors | 4/year | 0/year |
| Batch failures (due to feed) | 2/year | 0/year |
| Operator confidence (survey) | 62/100 | 91/100 |

Lessons Learned

  • Systemic Safeguards: Reliance on operator vigilance alone is insufficient in complex biologics manufacturing; layered controls are essential.
  • Human Factors: Addressing EPCs across the 5M model—Man, Machine, Media, Management, Milieu—dramatically reduces error probability.
  • Continuous Improvement: Regular review of near-misses and operator feedback is crucial for maintaining process robustness in biologics manufacturing.

This case underscores how a HEART-based approach, tailored to biologics drug substance manufacturing, can identify and mitigate multi-factorial risks before they result in costly failures.

Peer Checking

Peer checking is a technique where two individuals work together to prevent errors before and during a specific action or task. Here are the key points about peer checking:

  • It involves a performer (the person doing the task) and a peer checker (someone familiar with the task who observes the performer).
  • The purpose is to prevent errors by the performer by having a second set of eyes verify the correct action is being taken.
  • The performer and peer checker first agree on the intended action and component. Then, the performer performs the action while the peer observes to confirm it was done correctly.
  • It augments self-checking by the performer but does not replace self-checking. Both individuals self-check in parallel.
  • The peer checker provides a fresh perspective that is not trapped in the performer’s task mindset, allowing them to potentially identify hazards or consequences the performer may miss.
  • It is recommended for critical, irreversible steps or error-likely situations where an extra verification can prevent mistakes.
  • Peer checking should be used judiciously and not mandated for all actions, as overuse can make it become a mechanical process that loses effectiveness.
  • It can also be used to evaluate potential fatigue or stress in a co-worker before starting a task.

Personally, I think we overcheck, and the whole process loses effectiveness. A big part of automation and computerized systems like an MES is removing the need for peer checking. But frankly, I’m pretty sure it will never go away.

Peer-Checking is the Check/Witness

Self-Checking in Work-As-Done

Self-checking is one of the most effective tools we can teach and use. Rooted in the four aspects of risk-based thinking (anticipate, monitor, respond, and learn), it refers to the procedures and checks that employees perform as part of their routine tasks to ensure the quality and accuracy of their work. This practice is often implemented in industries where precision is critical, and errors can lead to significant consequences. For instance, in manufacturing or engineering, workers might perform self-checks to verify that their work meets the required specifications before moving on to the next production stage.

A proactive approach enhances the reliability, safety, and quality of various systems and practices by allowing for immediate detection and correction of errors, thereby preventing potential failures or flaws from escalating into more significant issues.

The memory aid STAR (stop, think, act, review) helps the user recall the thoughts and actions associated with self-checking.

  1. Stop – Just before conducting a task, pause to:
    • Eliminate distractions.
    • Focus attention on the task.
  2. Think – Understand what will happen when the action is performed.
    • Verify the action is appropriate.
    • Recall the critical parameters and the action’s expected result(s).
    • Consider contingencies to mitigate harm if an unexpected result occurs.
    • If there is any doubt, STOP and get help.
  3. Act – Perform the task per work-as-prescribed.
  4. Review – Verify that the expected result is obtained.
    • Verify the desired change in critical parameters.
    • Stop work if criteria are not met.
    • Perform the contingency if an unexpected result occurs.

Experts think differently

Research on expertise has identified the following differences between expert performers and beginners:

  • Experts have larger and more integrative knowledge units, and their representations of information are more functional and abstract than those of novices, whose knowledge base is more fragmentary. For example, a beginning piano player reads sheet music note by note, whereas a concert pianist is able to see the whole row or even several rows of music notation at the same time.
  • When solving problems, experts may spend more time on the initial problem evaluation and planning than novices. This enables them to form a holistic and in-depth understanding of the task and usually to reach a solution more swiftly than beginners.
  • Basic functions related to tasks or the job are automated in experts, whereas beginners need to pay attention to these functions. For instance, in a driving school, a young driver focuses his or her attention on controlling devices and pedals, while an experienced driver performs basic strokes automatically. For this reason, an expert driver can observe and anticipate traffic situations better than a beginning driver.
  • Experts outperform novices in their metacognitive and reflective thinking. In other words, they make sharp observations of their own ways of thinking, acting, and working, especially in non-routine situations when automated activities are challenged. Beginners’ knowledge is mainly explicit and they are dependent on learned rules. In addition to explicit knowledge, experts have tacit or implicit knowledge that accumulates with experience. This kind of knowledge makes it possible to make fast decisions on the basis of what is often called intuition.
  • In situations where something has gone wrong or when experts face totally new problems but are not required to make fast decisions, they critically reflect on their actions. Unlike beginners, experienced professionals focus their thinking not only on details but rather on the totality consisting of the details.
  • Experts’ thinking is more holistic than the thinking of novices. It seems that the quality of thinking is associated with the quality and amount of knowledge. With a fragmentary knowledge base, a novice in any field may remain on lower levels of thinking: things are seen as black and white, without any nuances. In contrast, more experienced colleagues with a more organized and holistic knowledge base can access more material for their thinking, and thus may begin to explore different perspectives on matters and develop more relativistic views concerning certain problems. At the highest levels of thinking, an individual is able to reconcile different perspectives, either by forming a synthesis or by integrating different approaches or views.
| Level | Performance |
| --- | --- |
| Beginner | Follows simple directions |
| Novice | Performs using memory of facts and simple rules |
| Competent | Makes simple judgments for typical tasks; may need help with complex or unusual tasks; may lack speed and flexibility |
| Proficient | Performance guided by deeper experience; able to figure out the most critical aspects of a situation; sees nuances missed by less-skilled performers; flexible performance |
| Expert | Performance guided by extensive practice and easily retrievable knowledge and skills; notices nuances, connections, and patterns; intuitive understanding based on extensive practice; able to solve difficult problems, learn quickly, and find needed resources |

Levels of Performance

Sources

  • Clark, R. 2003. Building Expertise: Cognitive Methods for Training and Performance Improvement, 2nd ed. Silver Spring, MD: International Society for Performance Improvement.
  • Ericsson, K.A., and R. Pool. 2016. Peak: Secrets from the New Science of Expertise. Boston: Houghton Mifflin Harcourt.
  • Kallio, E., ed. 2020. Development of Adult Thinking: Interdisciplinary Perspectives on Cognitive Development and Adult Learning. Taylor & Francis Group.
