Measuring Training Effectiveness for Organizational Performance

When designing training we want to make sure four things happen:

  • Training is used correctly as a solution to a performance problem
  • Training has the right content, objectives, and methods
  • Trainees are sent to training for which they have the basic skills, prerequisite skills, and confidence needed to learn
  • Training delivers the expected learning

Training is a useful lever for organizational change and improvement, and we want to make sure the training drives organizational metrics. Like everything else, you need to be able to measure it in order to improve it.

The Kirkpatrick model is a simple and fairly accurate way to measure the effectiveness of adult learning events (i.e., training), and while other methods are introduced periodically, the Kirkpatrick model endures because of its simplicity. The model consists of four levels, each designed to measure a specific element of the training. Created by Donald Kirkpatrick, the model has been in use for over 50 years, refined over multiple decades through application by learning and development professionals around the world, and it remains the most recognized method of evaluating the effectiveness of training programs. It has stood the test of time and became popular because it breaks a complex subject down into manageable levels, and it accommodates any style of training, both informal and formal.

Level 1: Reaction

Kirkpatrick’s first level measures the learners’ reaction to the training. A level 1 evaluation leverages the strong correlation between learning retention and how much the learners enjoyed the time spent and found it valuable. Level 1 evaluations, colloquially called “smile sheets,” should delve deeper than merely whether people liked the course. A good course evaluation will concentrate on three elements: the course content, the physical environment, and the instructor’s presentation and skills.

Level 2: Learning

Level 2 of Kirkpatrick’s model, learning, measures how much of the content attendees learned as a result of the training session. The best way to make this evaluation is with a pre- and posttest. Identical pre- and posttests are essential because the difference between the pre- and posttest scores indicates the amount of learning that took place. Without a pretest, one does not know whether the trainees already knew the material before the session, and unless the questions are the same, one cannot be certain that trainees learned the material in the session.
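As a minimal sketch of how that pre/post difference might be rolled up for a cohort (the record layout, normalized-gain formula, and 80% mastery threshold are illustrative assumptions, not part of the Kirkpatrick model):

```python
# Minimal sketch: Level 2 (Learning) scoring from identical pre- and posttests.
# Field names and the 80% mastery threshold are illustrative assumptions.

def learning_gain(pre: float, post: float, max_score: float = 100.0) -> float:
    """Normalized gain: how much of the available headroom was actually learned."""
    headroom = max_score - pre
    return (post - pre) / headroom if headroom > 0 else 0.0

trainees = [
    {"name": "A", "pre": 40, "post": 85},
    {"name": "B", "pre": 70, "post": 90},
    {"name": "C", "pre": 55, "post": 60},
]

gains = [learning_gain(t["pre"], t["post"]) for t in trainees]
average_gain = sum(gains) / len(gains)
mastery_rate = sum(t["post"] >= 80 for t in trainees) / len(trainees)

print(f"Average normalized gain: {average_gain:.2f}")
print(f"Share of trainees at or above mastery: {mastery_rate:.0%}")
```

A raw post-minus-pre difference works too; normalized gain simply keeps trainees who start with high pretest scores from looking as if they learned nothing.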

Level 3: Behavior

Level 3 measures whether the learning is transferred into practice in the workplace.

Level 4: Results

Level 4 measures the effect of the training on the business environment. Did we meet our objectives?

Evaluation Level: Level 1 – Reaction

Characteristics: Reaction evaluation is how the delegates felt, and their personal reactions to the training or learning experience, for example:
▪ Did the trainees consider the training relevant?
▪ Did they like the venue, equipment, timing, domestics, etc.?
▪ Did the trainees like and enjoy the training?
▪ Was it a good use of their time?
▪ Level of participation
▪ Ease and comfort of experience

Examples:
▪ Feedback forms based on subjective personal reaction to the training experience
▪ Verbal reaction which can be analyzed
▪ Post-training surveys or questionnaires
▪ Online evaluation or grading by delegates
▪ Subsequent verbal or written reports given by delegates to managers back at their jobs
▪ Typically ‘happy sheets’

Evaluation Level: Level 2 – Learning

Characteristics: Learning evaluation is the measurement of the increase in knowledge or intellectual capability from before to after the learning experience:
▪ Did the trainees learn what was intended to be taught?
▪ Did the trainees experience what was intended for them to experience?
▪ What is the extent of advancement or change in the trainees after the training, in the direction or area that was intended?

Examples:
▪ Interview or observation can be used before and after, although this is time-consuming and can be inconsistent
▪ Typically assessments or tests before and after the training
▪ Methods of assessment need to be closely related to the aims of the learning
▪ Reliable, clear scoring and measurements need to be established
▪ Hard-copy, electronic, online, or interview-style assessments are all possible

Evaluation Level: Level 3 – Behavior

Characteristics: Behavior evaluation is the extent to which the trainees applied the learning and changed their behavior, and this can be measured immediately and several months after the training, depending on the situation:
▪ Did the trainees put their learning into effect when back on the job?
▪ Were the relevant skills and knowledge used?
▪ Was there noticeable and measurable change in the activity and performance of the trainees when back in their roles?
▪ Would the trainees be able to transfer their learning to another person? Are they aware of their change in behavior, knowledge, or skill level?
▪ Was the change in behavior and new level of knowledge sustained?

Examples:
▪ Observation and interview over time are required to assess change, the relevance of change, and the sustainability of change
▪ Assessments need to be designed to reduce the subjective judgment of the observer
▪ 360-degree feedback is a useful method and need not be used before training, because respondents can make a judgment as to change after training, and this can be analyzed for groups of respondents and trainees
▪ Online and electronic assessments are more difficult to incorporate – assessments tend to be more successful when integrated within existing management and coaching protocols

Evaluation Level: Level 4 – Results

Characteristics: Results evaluation is the effect on the business or environment resulting from the improved performance of the trainee – it is the acid test. Measures would typically be business or organizational key performance indicators, such as volumes, values, percentages, timescales, return on investment, and other quantifiable aspects of organizational performance, for instance: numbers of complaints, staff turnover, attrition, failures, wastage, non-compliance, quality ratings, achievement of standards and accreditations, growth, and retention.

Examples:
▪ The challenge is to identify which of these measures relate to the trainee’s input and influence, and how; therefore it is important to identify and agree accountability and relevance with the trainee at the start of the training, so they understand what is to be measured
▪ This process overlays normal good management practice – it simply needs linking to the training input
▪ For senior people particularly, annual appraisals and ongoing agreement of key business objectives are integral to measuring business results derived from training
4 Levels of Training Effectiveness

Example in Practice – CAPA

When building a training program, start with the intended behaviors that will drive results. Evaluating our CAPA program, we have a set of key aims, each of which we can apply a measure against.

Behavior: Investigate to find root cause – Measure: % recurring issues
Behavior: Implement actions to eliminate root cause – Measure: Preventive to corrective action ratio
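As a rough sketch of how these two top-level measures might be computed (the record fields and the “same root cause code equals recurrence” rule are assumptions for illustration, not a prescribed data model):

```python
# Illustrative sketch: computing the two top-level CAPA measures.
# Field names and the recurrence-matching rule are assumptions.
from collections import Counter

capas = [
    {"id": "CAPA-001", "root_cause_code": "RC-PROC-01", "action_type": "corrective"},
    {"id": "CAPA-002", "root_cause_code": "RC-TRAIN-02", "action_type": "preventive"},
    {"id": "CAPA-003", "root_cause_code": "RC-PROC-01", "action_type": "corrective"},
    {"id": "CAPA-004", "root_cause_code": "RC-EQUIP-03", "action_type": "preventive"},
]

# % recurring issues: share of CAPAs whose root cause code appears more than once
counts = Counter(c["root_cause_code"] for c in capas)
recurring = sum(1 for c in capas if counts[c["root_cause_code"]] > 1)
pct_recurring = recurring / len(capas)

# Preventive to corrective action ratio
preventive = sum(c["action_type"] == "preventive" for c in capas)
corrective = sum(c["action_type"] == "corrective" for c in capas)
ratio = preventive / corrective if corrective else float("inf")

print(f"% recurring issues: {pct_recurring:.0%}")       # 50%
print(f"Preventive:corrective ratio: {ratio:.2f}")      # 1.00
```

The important design choice is the definition of “recurring”: matching on a root cause code is simple, but matching on product, process step, or failure mode may fit your program better.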

To support each of these top-level measures we define a set of behavior indicators, such as cycle time, right-first-time rate, and so on. To support these indicators, a review rubric is implemented.

Our four levels to measure training effectiveness will now look like this:

Level 1: Reaction – Personal action plan and a happy sheet
Level 2: Learning – Completion of the Rubric on a sample event
Level 3: Behavior – Continued performance and improvement against the Rubric and the key review behavior indicators
Level 4: Results – Improvement in the % of recurring issues and an increase in the preventive to corrective action ratio
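For the Level 2 measure, completing the Rubric on a sample event can be as simple as scored criteria applied to a sample CAPA record. A minimal sketch follows; the criteria, the 0–4 scale, and the passing threshold are illustrative assumptions, not a prescribed rubric:

```python
# Illustrative sketch: scoring a sample CAPA investigation against a review rubric.
# The criteria, the 0-4 scale, and the 75% passing threshold are assumptions.

RUBRIC = {
    "problem_statement_is_specific": 4,
    "root_cause_supported_by_evidence": 4,
    "actions_address_root_cause": 4,
    "effectiveness_check_defined": 4,
}  # criterion -> maximum score

def score_event(scores: dict, passing_pct: float = 0.75):
    """Return the percentage score for one reviewed event and whether it passes."""
    earned = sum(scores.values())
    possible = sum(RUBRIC.values())
    pct = earned / possible
    return pct, pct >= passing_pct

# Level 2 (Learning): a trainee completes the rubric on a sample event after training
sample_review = {
    "problem_statement_is_specific": 4,
    "root_cause_supported_by_evidence": 3,
    "actions_address_root_cause": 3,
    "effectiveness_check_defined": 2,
}
pct, passed = score_event(sample_review)
print(f"Rubric score: {pct:.0%}, passed: {passed}")  # 75%, True
```

For Level 3, the same rubric applied periodically to real events gives the trend line of continued performance.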

This is all about measuring the effectiveness of the transfer of behaviors.

Strong Signals of Transfer Expectations in the Organization vs. Signals that Weaken Transfer Expectations in the Organization

Strong signal: Training participants are required to attend follow-up sessions and other transfer interventions.
What it indicates: Individuals and teams are committed to the change and to obtaining the intended benefits.

Weakening signal: Attending the training is compulsory, but participating in follow-up sessions or other transfer interventions is voluntary or even resisted by the organization.
What it indicates: The key factor for a trainee is attendance, not behavior change.

Strong signal: The training description specifies transfer goals (e.g., “Trainee increases CAPA success by driving down recurrence of root cause”).
What it indicates: The organization has a clear vision of and expectation for what the training should accomplish.

Weakening signal: The training description only roughly outlines training goals (e.g., “Trainee improves their root cause analysis skills”).
What it indicates: The organization has only a vague idea of what the training should accomplish.

Strong signal: Supervisors take time to support transfer (e.g., through pre- and post-training meetings). Transfer support is part of regular agendas.
What it indicates: Transfer is considered important in the organization and is supported by supervisors and managers, all the way to the top.

Weakening signal: Supervisors do not invest in transfer support. Transfer support is not part of the supervisor role.
What it indicates: Transfer is not considered very important in the organization. Managers have more important things to do.

Strong signal: Each training ends with careful planning of individual transfer intentions.
What it indicates: Defining transfer intentions is a central component of the training.

Weakening signal: Transfer planning at the end of the training does not take place, or takes place only sporadically.
What it indicates: Defining transfer intentions is not a part (or not an essential part) of the training.

Good training, and thus good and consistent transfer, builds these expectations into the process. It is why I am such a fan of utilizing a Rubric to drive consistent performance.

MHRA on Good Pharmacovigilance Inspections

The MHRA GPvP inspectorate recently published their latest inspection metrics for the period from April 2019 to March 2020. 

Someday these reports won’t take a year to write. If I took a year writing my annual reports, I would receive an inspection finding from the MHRA.

It is no surprise that the five critical observations are all in risk management. Risk management is also the largest source of major findings, with quality management a close, and growing, second.

There are a lot of observations around the smooth and effective running of the CAPA program; a fair amount on PSMF management; and a handful on procedure, training and oversight.

Looking at the nine major observations due to deficiencies in the management of CAPA, the MHRA reports these problems:

  • Delays to CAPA development
  • CAPA that did not address the root cause and impact analysis for the identified noncompliance
  • Open CAPA which were significantly past their due date
  • CAPA raised from a critical finding at an earlier MHRA inspection that had not been addressed

I’m going to go out on a limb here and say some of these stem from companies thinking non-GMP CAPAs do not require the same level of control and scrutiny. Root Cause Analysis and a good CAPA program are fundamental, no matter where you fall on (or out of) the pharmaceutical regulatory spectrum.

Root Cause Analysis Deficiencies

An appropriate level of root cause analysis should be applied during the investigation of deviations, suspected product defects and other problems. This can be determined using Quality Risk Management principles. In cases where the true root cause(s) of the issue cannot be determined, consideration should be given to identifying the most likely root cause(s) and to addressing those. Where human error is suspected or identified as the cause, this should be justified having taken care to ensure that process, procedural or system based errors or problems have not been overlooked, if present.

Appropriate corrective actions and/or preventative actions (CAPAs) should be identified and taken in response to investigations. The effectiveness of such actions should be monitored and assessed, in line with Quality Risk Management principles.

EU Guidelines for Good Manufacturing Practice for Medicinal Products for Human and Veterinary Use, Chapter 1 Pharmaceutical Quality System, 1.4(xiv)

The MHRA cited 210 companies in 2019 for failure to conduct good root cause analysis and develop appropriate CAPAs. Six of those citations were critical and 100 were major.

My guess is that if I had asked those 210 companies in 2018 how their root cause analysis and CAPAs were doing, 85% would have said “great!” We tend to overestimate our capabilities on the fundamentals (which root cause analysis and CAPA are) and to under-invest in continuous improvement.

Of course, without good benchmarking, it’s really easy to call something good enough when it is not. There can be a tendency to say “Well, we’ve never had a problem here, so we’re good,” when in reality the problem has just never been seen in an inspection or has never gone critical.

The FDA reports fairly similar observations around root cause analysis, as does anyone who shares their metrics in any way. Poor root cause analysis and poor CAPAs are pretty widespread.

This comes up a lot because the quality of CAPAs (and quantity) are considered key indicators of an organization’s health. CAPAs demonstrate that issues are acknowledged, tracked and remediated in an effective manner to eliminate or reduce the risk of a recurrence. The timeliness and robustness of these processes and records indicate whether an organization demonstrates effective planning and has sufficient resources to manage, resolve and correct past issues and prevent future issues.

A good CAPA system covers problem identification (which can be, and usually is, a few different processes), root cause analysis, corrective and preventive actions, CAPA effectiveness, metrics, and governance. It is a house of cards: come up short on one element and the whole structure will fall down around you, often when you can least afford it.

We can’t freeze our systems with superglue. If we are not continually improving, then we are going backwards. There is no steady state when it comes to quality.

Layering metrics

We have these quality systems with lots of levers and interrelated components. And yet we select one or two metrics and realize that even if we meet them, we aren’t really measuring the right things, nor are we driving continuous improvement.

One solution is to create layered metrics, which basically means drilling down into your process and identifying the metrics at each step.

Lots of ways to do this. An easy way to start is to use the 5-why process, a tool most folks are comfortable with.

So, for example, CAPA. It is pretty much agreed that CAPAs should be completed in a timely manner, which makes timely closure a top-level goal. Unfortunately, in this hypothetical example, we are falling short of a 100% on-time closure rate (or whatever target is appropriate in your organization based on maturity).

Why 1: Why was CAPA closure not 100%?
Because CAPA tasks were not closed on time.

Success factor needed for this step: CAPA tasks to be closed by their due date.

Metric for this step: CAPA task closure success rate

Why 2: Why were CAPA tasks not closed on time?
Because individuals did not have appropriate time to complete CAPA tasks.

Metric for this step: Planned versus actual time commitment

Why 3: Why did individuals not have appropriate time to complete CAPA tasks?
Because CAPA task due dates are guessed at.

Metric for this step: CAPA task adherence to target dates based on activity (e.g., if it takes 14 days to revise a document and another 14 days to train, the average document revision task should be 28 days)

Why 4: Why are CAPA task due dates guessed at?
Because appropriate project planning is not completed.

Metric for this step: Adherence to Process Confirmation

Why 5: Why is appropriate project planning not completed?
Because CAPAs are always determined on the last day the deviation is due.

Metric for this step: Adherence to the Root Cause Analysis process
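As a minimal sketch of what a couple of these layered metrics could look like in practice (the task record fields and the standard activity durations are illustrative assumptions):

```python
# Illustrative sketch of layered CAPA metrics from the 5-why breakdown above.
# Field names and the standard-duration table are assumptions for illustration.
from datetime import date

# Assumed standard durations per activity type (e.g., 14 days to revise a document
# plus 14 days to train gives a 28-day document revision task).
STANDARD_DAYS = {"document_revision": 28, "equipment_fix": 45}

tasks = [
    {"capa": "CAPA-001", "activity": "document_revision",
     "opened": date(2020, 1, 6), "due": date(2020, 2, 3), "closed": date(2020, 2, 1)},
    {"capa": "CAPA-002", "activity": "document_revision",
     "opened": date(2020, 1, 10), "due": date(2020, 1, 24), "closed": date(2020, 2, 14)},
    {"capa": "CAPA-003", "activity": "equipment_fix",
     "opened": date(2020, 1, 2), "due": date(2020, 2, 16), "closed": None},
]

# Why 1 metric: CAPA task closure success rate (closed on or before the due date)
on_time = sum(1 for t in tasks if t["closed"] and t["closed"] <= t["due"])
task_success_rate = on_time / len(tasks)

# Why 3 metric: adherence of assigned due dates to the activity-based standard
realistic = sum(
    1 for t in tasks
    if (t["due"] - t["opened"]).days >= STANDARD_DAYS[t["activity"]]
)
target_date_adherence = realistic / len(tasks)

print(f"CAPA task closure success rate: {task_success_rate:.0%}")
print(f"Due dates consistent with activity standards: {target_date_adherence:.0%}")
```

The point is not these particular calculations; it is that each why in the chain gets a measurable signal, so you can see which layer is actually constraining the top-level closure rate.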

I might report on the top-level CAPA closure rate and one or two of these, and keep the others in my process owner toolkit. Maybe we jump right to the last one as what we report on. It depends on what needs to be influenced in my organization, and it will change over time.

It helps to compare this output against the 12 system leverage points.

Donella Meadows’ 12 System Leverage Points

These metrics span from leverage point 3, “goals of the system” (completing CAPA tasks effectively and on time), to 4, “self-organization,” and 5, “rules of the system.” The set also has nice feedback loops based on the process confirmations, so I’d view it as potentially pretty successful. Of course, we would test these, tinker, and basically experiment until we find the right set of metrics that improves our top-level goal.