Evaluating Controls as Part of Risk Management

When I teach an introductory risk management class, I usually use an icebreaker of “What is the riskiest activity you can think of doing?” Inevitably you will get some version of skydiving, swimming with sharks, or jumping off bridges. This activity is great because it starts conversations around likelihood and severity. At heart, the question brings out the concept of risk important activities and the nature of controls.

The things people think of, such as skydiving, are great examples of activities that are surrounded by activities that control risk. The very activity is based on reducing risk as low as possible and then proceeding along the safest possible pathway. These risk important activities are the mechanisms just before a critical step that:

  1. Ensure the appropriate transfer of information and skill
  2. Ensure the appropriate number of actions to reduce risk
  3. Influence the presence or effectiveness of barriers
  4. Influence the ability to maintain positive control of the moderation of hazards

Risk important activities are a concept central to safety thinking and sit at the center of many human error reduction tools and practices. They are all about thinking through the right set of controls, building them into the procedure, and successfully executing them before reaching the critical step of no return. Checklists are a great example of this mindset at work, but there are many other ways of doing this.

Hospitals use a great thought process, the “Five Rights of Safe Medication Practices”: 1) right patient, 2) right drug, 3) right dose, 4) right route, and 5) right time. Next time you are getting medication in the doctor’s office or hospital, evaluate just what your caregiver is doing and how it fits into that process. Those are examples of risk important activities.

Assessing controls during risk assessment

Risk is affected by the overall effectiveness of any controls that are in place.

The key aspects of controls are:

  • the mechanism by which the controls are intended to modify risk
  • whether the controls are in place, are capable of operating as intended, and are achieving the expected results
  • whether there are shortcomings in the design of controls or the way they are applied
  • whether there are gaps in controls
  • whether controls function independently, or if they need to function collectively to be effective
  • whether there are factors, conditions, vulnerabilities or circumstances that can reduce or eliminate control effectiveness including common cause failures
  • whether controls themselves introduce additional risks.

A risk can have more than one control and controls can affect more than one risk.
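As a rough illustration of that many-to-many relationship, here is a minimal Python sketch. The control and risk names are hypothetical examples, not taken from any real assessment. It inverts a control-to-risk mapping so you can spot risks with no control at all (a gap) and controls that several risks lean on (worth a closer look for common cause failure).

```python
from collections import defaultdict

# Hypothetical example data: each control lists the risks it is meant to modify.
control_to_risks = {
    "second-person verification": ["wrong dose", "wrong patient"],
    "barcode scan": ["wrong patient"],
}
all_risks = ["wrong dose", "wrong patient", "wrong route"]


def invert(mapping):
    """Build the risk -> controls view from the control -> risks view."""
    risk_to_controls = defaultdict(list)
    for control, risks in mapping.items():
        for risk in risks:
            risk_to_controls[risk].append(control)
    return dict(risk_to_controls)


risk_to_controls = invert(control_to_risks)

# Risks that no control touches are gaps in controls.
gaps = [risk for risk in all_risks if risk not in risk_to_controls]

# Controls that cover more than one risk: if they fail, several risks are exposed.
shared = [c for c, risks in control_to_risks.items() if len(risks) > 1]

print(gaps)    # ['wrong route']
print(shared)  # ['second-person verification']
```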

We always want to distinguish between controls that change likelihood, consequences or both, and controls that change how the burden of risk is shared between stakeholders.

Any assumptions made during risk analysis about the actual effect and reliability of controls should be validated where possible, with a particular emphasis on individual or combinations of controls that are assumed to have a substantial modifying effect. This should take into account information gained through routine monitoring and review of controls.

Risk Important Activities, Critical Steps and Process

Critical steps are the way we meet our critical-to-quality requirements. They are the activities that ensure our product/service meets the needs of the organization.

These critical steps are the points of no return, the points where the work-product is transformed into something else. Risk important activities are what we do to remove the danger of executing that critical step.

Beyond that critical step, you have rejection or rework. When I am cooking there is a lot of prep work, which can be a mixture of critical steps from which there is no return. If I break the egg wrong and get eggshells in my batter, there is a degree of rework necessary. This is true for all our processes.

The risk-based approach to the process is to understand the critical steps and put mitigating controls around them.

We are thinking through the following:

  • Critical Step: The action that triggers irreversibility. Think in terms of critical-to-quality attributes.
  • Input: What came before in the process
  • Output: The desired result (positive) or the possible difficulty (negative)
  • Preconditions: Technical conditions that must exist before the critical step
  • Resources: What is needed for the critical step to be completed
  • Local factors: Things that could influence the critical step. When human beings are involved, this is usually what can influence the performer’s thinking and actions before and during the critical step
  • Defenses: Controls, barriers and safeguards
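One way to keep these elements together is a simple record per critical step. The sketch below is only an illustration: the class name, field names, and the `open_questions` helper are my own assumptions, and the example values come from the egg-cracking story rather than any real process.

```python
from dataclasses import dataclass, field


@dataclass
class CriticalStep:
    """One critical-step review; the fields mirror the list above."""
    critical_step: str
    inputs: list[str] = field(default_factory=list)
    output: str = ""
    preconditions: list[str] = field(default_factory=list)
    resources: list[str] = field(default_factory=list)
    local_factors: list[str] = field(default_factory=list)
    defenses: list[str] = field(default_factory=list)

    def open_questions(self) -> list[str]:
        """Return the names of the elements that are still empty."""
        return [name for name, value in vars(self).items() if not value]


# Hypothetical example based on the cooking analogy.
egg = CriticalStep(
    critical_step="Crack the egg into the batter",
    inputs=["Dry ingredients mixed"],
    output="Batter with egg and no shell",
    defenses=["Crack the egg into a separate bowl first"],
)

print(egg.open_questions())  # ['preconditions', 'resources', 'local_factors']
```

The empty fields are a prompt for the team, not a score: they show which parts of the critical step have not been thought through yet.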

Risk Management Mindset

Good risk management requires a mindset that includes the following attributes:

  • Expect to be surprised: Our processes are usually underspecified and there is a lot of hidden knowledge. Risk management serves to interrogate the unknowns.
  • Possess a chronic sense of unease: There is no such thing as a perfect process, procedure, training, design, or plan. Past performance is not a guarantee of future success.
  • Bend, not break: Everything is dynamic, especially risk. Quality comes from adaptability.
  • Learn: Learn from what goes well and from mistakes. Build a learning culture.
  • Embrace humility: No one knows everything. Bring in those who know what you do not.
  • Acknowledge differences between work-as-imagined and work-as-done: Work to reduce the differences.
  • Value collaboration: Seek diversity of input.
  • Drive out subjectivity: Understand how opinions are formed and decisions are made.
  • Systems thinking: Performance emerges from complex, interconnected and interdependent systems and their components.

The Role of Monitoring

One cannot control risk, or even successfully identify it, unless a system is able to flexibly monitor both its own performance (what happens inside the system’s boundary) and what happens in the environment (outside the system’s boundary). Monitoring improves the ability to cope with possible risks.

When performing the risk assessment, challenge existing monitoring and ensure that the right indicators are in place. But remember, monitoring itself is a low-effectiveness control.

Ensure that there are leading indicators, which can be used as valid precursors for changes and events that are about to happen.

For each monitoring control, ask yourself the following:

  • Indicator: How have the indicators been defined? (By analysis, by tradition, by industry consensus, by the regulator, by international standards, etc.)
  • Relevance: When was the list created? How often is it revised? On which basis is it revised? Who is responsible for maintaining the list?
  • Type: How many of the indicators are of the ‘leading’ type and how many are of the ‘lagging’ type? Do indicators refer to single or aggregated measurements?
  • Validity: How is the validity of an indicator established (regardless of whether it is leading or lagging)? Do indicators refer to an articulated process model, or just to ‘common sense’?
  • Delay: For lagging indicators, how long is the typical lag? Is it acceptable?
  • Measurement type: What is the nature of the measurements? Qualitative or quantitative? (If quantitative, what kind of scaling is used?)
  • Measurement frequency: How often are the measurements made? (Continuously, regularly, every now and then?)
  • Analysis: What is the delay between measurement and analysis/interpretation? How many of the measurements are directly meaningful and how many require analysis of some kind? How are the results communicated and used?
  • Stability: Are the measured effects transient or permanent?
  • Organization support: Is there a regular inspection scheme or schedule? Is it properly resourced? Where does this measurement fit into the management review?

Key risk indicators come into play here.
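The Type and Delay questions lend themselves to a quick tally. The following is a minimal sketch, not a prescribed method: the `Indicator` record, the indicator names, and the 45-day acceptance limit are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Indicator:
    name: str
    leading: bool                     # True = leading, False = lagging
    lag_days: Optional[float] = None  # typical lag; only meaningful when lagging


def summarize(indicators, max_acceptable_lag_days):
    """Split indicators into leading/lagging and flag lagging ones that are too slow."""
    leading = [i.name for i in indicators if i.leading]
    lagging = [i.name for i in indicators if not i.leading]
    too_slow = [
        i.name
        for i in indicators
        if not i.leading
        and i.lag_days is not None
        and i.lag_days > max_acceptable_lag_days
    ]
    return {"leading": leading, "lagging": lagging, "too_slow": too_slow}


# Hypothetical indicator list for illustration only.
indicators = [
    Indicator("overdue training items", leading=True),
    Indicator("confirmed deviations", leading=False, lag_days=30),
    Indicator("customer complaints", leading=False, lag_days=90),
]

summary = summarize(indicators, max_acceptable_lag_days=45)
print(summary["too_slow"])  # ['customer complaints']
```

A list with no leading indicators, or with lagging indicators slower than the organization can tolerate, is a finding in its own right.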

Hierarchy of Controls

Not every control is the same. This principle applies both to current controls and to planning future controls.

Human Performance and Data Integrity

Gilbert’s Behavior Engineering Model (BEM) presents a concise way to consider both the environmental and the individual influences on a person’s behavior. The model suggests that a person’s environment supports impact to one’s behavior through information, instrumentation, and motivation. Examples include feedback, tools, and financial incentives (respectively), to name a few. The model also suggests that an individual’s behavior is influenced by their knowledge, capacity, and motives. Examples include training/education, physical or emotional limitations, and what drives them (respectively), to name a few. Let’s look at some further examples to better understand the variability of individual behavioral influences to see how they may negatively impact data integrity.

Kip Wolf, “People: The Most Persistent Risk To Data Integrity”

Good article in Pharmaceutical Online last week. It cannot be stated enough, and it is good that folks like Kip keep saying it: to understand data integrity we need to understand behavior (what people do and say) and realize it is a means to an end. It is very easy to focus on behaviors, the observable acts that can be seen and heard by management, auditors, and other stakeholders, but what is more critical is to design systems to drive the behaviors we want. We need to recognize that behavior and its causes are extremely valuable as the signal for improvement efforts to anticipate, prevent, catch, or recover from errors.

We do that by realizing that error-provoking aspects of design, procedures, processes, and human nature exist throughout our organizations, and that people cannot perform better than the organization supporting them.

For each design consideration below, think through the human error considerations and how to manage controls.

Define the Scope of Work

  • Identify the critical steps
  • Consider the possible errors associated with each critical step and the likely consequences
  • Ponder the "worst that could happen"
  • Consider the appropriate human performance tool(s) to use
  • Identify other controls, contingencies, and relevant operating experience

When tasks are identified and prioritized, and resources are properly allocated (e.g., supervision, tools, equipment, work control, engineering support, training), human performance can flourish.

These organizational factors create a unique array of job-site conditions – a good work environment – that sets people up for success. Human error increases when expectations are not set, tasks are not clearly identified, and resources are not available to carry out the job.

The error precursors – conditions that provoke error – are reduced. This includes things such as:

  • Unexpected conditions
  • Workarounds
  • Departures from the routine
  • Unclear standards
  • Need to interpret requirements

Properly managing controls is dependent on the elimination of error precursors that challenge the integrity of controls and allow human error to become consequential.

Apply proactive Risk Management

When risk is properly analyzed we can take appropriate action to mitigate the risks. Include the following criteria in risk assessments:

  • Adverse environmental conditions (e.g., impact of gowning, noise, temperature, etc.)
  • Unclear roles/responsibilities
  • Time pressures
  • High workload
  • Confusing displays or controls

Addressing risk through engineering and administrative controls is a cornerstone of a quality system.

Strong administrative and cultural controls can withstand human error. Controls are weakened when conditions are present that provoke error.

Eliminating error precursors in the workplace reduces the incidence of active errors.

Perform Work

Use error reduction tools as part of all work. Examples include:

  • Self-checking
  • Questioning attitude
  • Stop when unsure
  • Effective communication
  • Procedure use and adherence
  • Peer-checking
  • Second-person verifications
  • Turnovers

Engineering controls can often take the place of some of these; for example, second-person verifications can be replaced by automation.

Appropriate processes and tools should be in place to ensure that organizational processes and values adequately support performance.

Because people err and make mistakes, it is all the more important that controls are implemented and properly maintained.

Feedback and Improvement

Continuous improvement is critical. Topics should include:

  • Surprises or unexpected outcomes
  • Usability and quality of work documents
  • Knowledge and skill shortcomings
  • Minor errors during the activity
  • Unanticipated workplace conditions
  • Adequacy of tools and resources
  • Quality of work planning/scheduling
  • Adequacy of supervision

Errors during work are inevitable. If we strive to understand and address even inconsequential acts we can strengthen controls and make future performance better.

Vulnerabilities with controls can be found and corrected when management decides it is important enough to devote resources to the effort.

The fundamental aim of oversight is to improve resilience to significant events triggered by active errors in the workplace; that is, to minimize the severity of events.

Oversight controls provide opportunities to see what is happening, to identify specific vulnerabilities or performance gaps, to take action to address those vulnerabilities and performance gaps, and to verify that they have been resolved.

Risk Based Data Integrity Assessment

A quick overview. The risk-based approach will utilize three factors: data criticality, existing controls, and level of detection.

When assessing current controls, technical controls (properly implemented) are stronger than operational or organizational controls as they can eliminate the potential for data falsification or human error rather than simply reducing/detecting it. 

For criticality, it helps to build a table based on what the data is used for. For example:

For controls, use a table like the one below. Rank each column and then multiply the numbers together to get a final control ranking.  For example, if a process has Esign (1), no access control (3), and paper archival (2) then the control ranking would be 6 (1 x 3 x 2). 
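The arithmetic is simple enough to sketch in a few lines of Python. The function name `control_ranking` is my own, and the only numbers used are the ones from the example above.

```python
from math import prod


def control_ranking(*column_ranks: int) -> int:
    """Multiply the rank chosen in each column to get the overall control ranking."""
    return prod(column_ranks)


# Esign (1), no access control (3), paper archival (2)
print(control_ranking(1, 3, 2))  # 6
```

Because the ranks are multiplied rather than added, one weak column pulls the whole ranking up quickly.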

Determine detectability using the table below: rank each column and then multiply the numbers together to get a final detectability ranking.

Another way to look at these scores:

Multiply the rankings above to determine a risk ranking and move ahead with mitigations. Mitigations should drive risk as low as possible, though the following table can be used to help determine priority.

Risk Rating | Action | Mitigation
>25 | High Risk: Potential Impact to Patient Safety or Product Quality | Mandatory
12-25 | Moderate Risk: No Impact to Patient Safety or Product Quality but Potential Regulatory Risk | Recommended
<12 | Negligible DI Risk | Not Required
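Putting the pieces together, the lookup can be sketched as below. This assumes the risk ranking is the product of the three factors named in the overview (criticality, control ranking, detectability ranking). The function names are mine, and the criticality and detectability values in the example are made up. Only the thresholds and the control ranking of 6 come from the text.

```python
def risk_ranking(criticality: int, control: int, detectability: int) -> int:
    """Assumed combination rule: multiply the three rankings together."""
    return criticality * control * detectability


def prioritize(score: int) -> tuple[str, str]:
    """Map a risk ranking to (risk level, mitigation) using the table's thresholds."""
    if score > 25:
        return ("High", "Mandatory")
    if score >= 12:
        return ("Moderate", "Recommended")
    return ("Negligible", "Not Required")


# Hypothetical system: criticality 2, control ranking 6, detectability 3.
score = risk_ranking(2, 6, 3)
print(score, prioritize(score))  # 36 ('High', 'Mandatory')
```

Note the boundaries: a score of exactly 25 or exactly 12 falls in the moderate band.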

In the case of long-term risk remediation actions, risk reducing short-term actions shall be implemented to reduce risk and provide an acceptable level of governance until the long-term remediation actions are completed.

Relevant site procedures (e.g., change control, validation policy) should outline the scope of additional testing through the change management process.

Reassessment of the system may be completed following the completion of remediation activities. The reassessment may be done at any time during the remediation process to document the impact of the remediation actions.

Once final remediation is complete, a reassessment of the equipment/system should be completed to demonstrate that the risk rating has been mitigated by the remediation actions taken. Think living risk assessment.

Barriers and root cause analysis

Barriers, or controls, are one of the (not-at-all) secret sauces of root cause analysis.

By understanding barriers, we can understand both why a problem happened and how it can be prevented in the future. An evaluation of current process controls as part of root cause analysis can help determine whether all the current barriers pertaining to the problem you are investigating were present and effective (whether or not they worked).

At its simplest it is just a three-part brainstorm:

Barrier Analysis
  • Barriers that failed: The barrier was in place and operational at the time of the accident, but it failed to prevent the accident.
  • Barriers that were not used: The barrier was available, but workers chose not to use it.
  • Barriers that did not exist: The barrier did not exist at the time of the event. A source of potential corrective and preventive actions (depending on what they are).
Three questions of barrier analysis

The key to this brainstorming session is to try to find all of the failed, unused, or nonexistent barriers. Do not be concerned if you are not certain which category they belong in.
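If you capture the brainstorm electronically, grouping the findings takes only a few lines. This is a sketch under my own assumptions: the `BarrierStatus` names, including an explicit `UNSURE` bucket for barriers you cannot place yet, and the example findings are hypothetical.

```python
from collections import defaultdict
from enum import Enum


class BarrierStatus(Enum):
    FAILED = "failed"
    NOT_USED = "not used"
    DID_NOT_EXIST = "did not exist"
    UNSURE = "unsure"  # assumption: park barriers here until the category is clear


def group_barriers(findings):
    """Group (barrier, status) pairs from the brainstorm by status."""
    grouped = defaultdict(list)
    for barrier, status in findings:
        grouped[status].append(barrier)
    return dict(grouped)


# Hypothetical brainstorm output for illustration only.
findings = [
    ("Second-person verification", BarrierStatus.FAILED),
    ("Checklist", BarrierStatus.NOT_USED),
    ("Barcoding", BarrierStatus.DID_NOT_EXIST),
    ("Adequate supervision", BarrierStatus.UNSURE),
]

grouped = group_barriers(findings)
for status, barriers in grouped.items():
    print(f"{status.value}: {', '.join(barriers)}")
```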

Most forms of barrier analysis look at two types, technical and administrative, and we can further break down administrative into “human” and “organization.”

  • Technical: Choose this if a technical or engineering control exists. Examples: separation among manufacturing or packaging lines, emergency power supply, dedicated equipment, barcoding, keypad controlled doors, separated storage for components, software that prevents a workflow from going further if a field is not completed, redundant designs.
  • Human: Choose this if the control relies on a human reviewer or operator. Examples: training and certifications, use of checklists, verification of critical tasks by a second person.
  • Organization: Choose this if the control involves a transfer of responsibility, for example a document reviewed by both manufacturing and quality. Examples: clear procedures and policies, adequate supervision, adequate workload, periodic process audits.

These barriers are the same as the current controls in a risk assessment, which are key in a wide variety of risk assessment tools.