Understanding Data – A Core Quality Skill

A critical skill for a quality professional (for any professional, really), and a fundamental part of Quality 4.0, is managing data: knowing how to acquire good data, analyze it properly, follow the clues those analyses offer, explore the implications, and present results in a fair, compelling way.

As we build systems, validate computer systems, and create processes, we need to ensure the quality of our data. Think about the data you generate, and continually work to make it better.

I am a big fan of tools like Thomas Redman’s Friday Afternoon Measurement for determining where data has problems.
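
The method, as Redman describes it, is simple: pull roughly the last 100 records, have a small team mark the obvious errors in a handful of critical attributes, and count the records that come through clean. Below is a minimal sketch of that scoring step in Python; the record fields and validity rules are hypothetical stand-ins for whatever “obviously wrong” means in your data.

```python
def is_obvious_error(field, value):
    """Very simple stand-ins for the 'mark what is obviously wrong' step."""
    if value is None or value == "":
        return True                                  # missing entry
    if field == "lot_id":
        return not str(value).startswith("LOT-")     # hypothetical format rule
    if field == "assay_pct":
        return not (0 <= value <= 110)               # physically implausible result
    return False

def friday_afternoon_score(records, critical_fields):
    """Count the records with no obvious errors in any critical field."""
    perfect = 0
    for rec in records:
        if not any(is_obvious_error(f, rec.get(f)) for f in critical_fields):
            perfect += 1
    return perfect

records = [
    {"lot_id": "LOT-0041", "assay_pct": 99.2},
    {"lot_id": "41",       "assay_pct": 98.7},   # bad lot format
    {"lot_id": "LOT-0042", "assay_pct": None},   # missing result
]
print(friday_afternoon_score(records, ["lot_id", "assay_pct"]), "of", len(records))
```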

Build a toolkit for deciding what data stands out; control charts and regression analysis are good starting points, and they will help you understand the data. “Looks Good To Me: Visualizations As Sanity Checks” by Michael Correll is a great overview of how data visualization can help us decide whether the data we are gathering makes sense.
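
As one concrete example, the decision logic of an individuals (XmR) control chart fits in a few lines. This is only a sketch with made-up measurements, not a substitute for proper SPC software, but it shows how a chart decides that a point “stands out.”

```python
import statistics

def xmr_limits(values):
    """Individuals (XmR) chart limits: centre line +/- 2.66 * average moving range."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = statistics.mean(values)
    avg_mr = statistics.mean(moving_ranges)
    return centre - 2.66 * avg_mr, centre, centre + 2.66 * avg_mr

def out_of_control(values):
    """Return (index, value) pairs falling outside the control limits."""
    lcl, _, ucl = xmr_limits(values)
    return [(i, v) for i, v in enumerate(values) if v < lcl or v > ucl]

# Hypothetical in-process measurements with one suspect reading
data = [10.1, 9.8, 10.0, 10.2, 9.9, 10.1, 10.0, 9.9, 10.2, 10.0, 13.5, 10.1]
print(out_of_control(data))   # flags the 13.5 reading for investigation
```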

Then root cause analysis (another core capability) allows us to determine what is truly going wrong with our data.

Throughout all your engagements with data, understand statistical significance: how to quantify whether a result is likely due to chance or to the factors you were measuring.
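
A two-sample t-test is the classic way to put a number on that question. The sketch below uses SciPy and hypothetical assay results from two production lines.

```python
from scipy import stats

# Hypothetical assay results (% label claim) from two production lines
line_a = [99.1, 98.7, 99.4, 99.0, 98.9, 99.2]
line_b = [98.2, 98.5, 98.1, 98.6, 98.3, 98.4]

t_stat, p_value = stats.ttest_ind(line_a, line_b)
print(f"p-value = {p_value:.4f}")
# A small p-value (conventionally < 0.05) says a difference this large would
# be unlikely if the two lines truly performed the same; it does not say the
# difference is large enough to matter in practice.
```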

In the past it was enough to understand a Pareto chart, a histogram, and maybe a basic control chart. Those days are long gone. What quality professionals need to bring to the table today is a deeper understanding of data and how to gather it, analyze it, and determine its relevance. Data integrity is a key concept, and to have integrity, you need to understand data.

Data, and all that jazz

As we all try to figure out just exactly what Industry 4.0 and Quality 4.0 mean, it is not an exaggeration to say “data is your most valuable asset.” Yet we all struggle to actually get a benefit from this data, and data integrity is an area of intense regulatory concern.

To truly have value, our data needs to be properly defined, relevant to the tasks at hand, structured so that it is easy to find and understand, and of high enough quality that it can be trusted. Without that we just have noise.

Apply principles of good master data management and data integrity. Ensure systems are appropriately built and maintained.

Understand why data matters, how to pick the right metrics, and how to ask the right questions of data. Understanding correlation versus causation, so you can decide when to act on an analysis and when not to, is critical.
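
A contrived illustration of the trap: in the sketch below, a hypothetical confounder (ambient temperature) drives both deviation counts and batch duration, so the two correlate strongly even though neither causes the other, and acting on one would not move the other.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical confounder: ambient temperature drives both metrics
temperature = rng.normal(22, 3, 200)
deviations  = 2.0 * temperature + rng.normal(0, 2, 200)
duration    = 1.5 * temperature + rng.normal(0, 2, 200)

r = np.corrcoef(deviations, duration)[0, 1]
print(f"correlation = {r:.2f}")  # strong correlation, yet shortening batch
                                 # duration would not reduce deviations
```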

In the 2013 article “Keep Up with Your Quants,” Thomas Davenport lists six questions that should be asked when evaluating conclusions drawn from data:

1. What was the source of your data?

2. How well do the sample data represent the population?

3. Does your data distribution include outliers? How did they affect the results? (A quick sensitivity check is sketched after this list.)

4. What assumptions are behind your analysis? Might certain conditions render your assumptions and your model invalid?

5. Why did you decide on that particular analytical approach? What alternatives did you consider?

6. How likely is it that the independent variables are actually causing the changes in the dependent variable? Might other analyses establish causality more clearly?
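
On question 3, the cheapest sanity check is to recompute your summary statistic with and without the suspect points and see how far the answer moves. A small sketch with hypothetical dissolution results:

```python
import statistics

# Hypothetical dissolution results (%) with one suspect reading
results = [85.2, 84.9, 85.5, 85.1, 84.8, 85.3, 61.0]

mean_all = statistics.mean(results)
mean_trimmed = statistics.mean(sorted(results)[1:])   # drop the lowest value
print(f"with outlier: {mean_all:.1f}, without: {mean_trimmed:.1f}")
# The single 61.0 reading pulls the mean down by more than 3 points. Whether
# to exclude it must be a documented, justified decision, not a silent one.
```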

Framing data, that is, being able to ask the right questions, is critical to using that data to make decisions. In the past it was adequate for a quality professional to have a familiarity with a few basic tools. Today it is critical to understand basic statistics. As Nate Silver advises in an interview with HBR: “The best training is almost always going to be hands on training,” he says. “Getting your hands dirty with the data set is, I think, far and away better than spending too much time doing reading and so forth.”

Understanding data is a key ability and is necessary to thrive. It is time to truly contemplate the data ecosystem as a system and stop treating it as a specialized area of the organization.

Computer system changes and the GAMP5 framework

Appropriate controls shall be exercised over computer or related systems to assure that changes in master production and control records or other records are instituted only by authorized personnel. Input to and output from the computer or related system of formulas or other records or data shall be checked for accuracy. The degree and frequency of input/output verification shall be based on the complexity and reliability of the computer or related system. A backup file of data entered into the computer or related system shall be maintained except where certain data, such as calculations performed in connection with laboratory analysis, are eliminated by computerization or other automated processes. In such instances a written record of the program shall be maintained along with appropriate validation data. Hard copy or alternative systems, such as duplicates, tapes, or microfilm, designed to assure that backup data are exact and complete and that it is secure from alteration, inadvertent erasures, or loss shall be maintained.

21 CFR 211.68(b)

Kris Kelly over at Advantu got me thinking about GAMP5 today. As a result I went to the FDA’s Inspection Observations page and was quickly reminded that in 2017 one of the top ten most frequent citations was against 211.68(b), the most common observation being “Appropriate controls are not exercised over computers or related systems to assure that changes in master production and control records or other records are instituted only by authorized personnel.”

Similar requirements are found throughout the regulations of all major markets (for example, EU 5.25), and data integrity is a big piece of this pie.

So yes, GAMP5 is probably one of your best tools for computer system validation. But this is also an argument for having one change management system and one change control process.

When building your change management system, remember that a change is both a change to a validated system and a change to a process, and it needs to go through the appropriate rigor on both ends. Companies continue to get into a lot of trouble on this, especially when you add in the impact of master data.

Make sure your IT organization is fully aligned. There’s a tendency at many companies (including mine) to build walls between an ITIL-oriented change process and process changes. This needs to be driven by a risk-based approach that finds the opportunities to tear down those walls. I’m spending a lot of my time finding ways to do this and, to be honest, I worry that there aren’t enough folks on the IT side of the fence willing to help tear it down.

So yes, GAMP5 is a great tool. Maybe one of the best frameworks we have available.


Document Management

Today many companies are going digital: striving to be paperless and reinventing how individuals find information, record data, and make decisions. When undertaking these changes it is often good to go back to basics and make sure we are all on the same page before we proceed.

There are three major types/functions of documents:

  • Functional Documents provide instructions so people can perform tasks and make decisions safely, effectively, compliantly, and consistently. This usually includes things like procedures, process instructions, protocols, methods, and specifications. Many of these need some sort of training decision. Functional documents should have a process to ensure they stay up-to-date, especially in relation to current practices and relevant standards (periodic review).
  • Records provide evidence that actions were taken and decisions were made in keeping with procedures. This includes batch manufacturing records, logbooks, laboratory data sheets, and notebooks. Records are a popular target for electronic alternatives.
  • Reports provide specific information on a particular topic in a formal, standardized way. Reports may include data summaries, findings and actions to be taken.

Often these types are all engaged in a lifecycle. An SOP directs us to write a protocol (two functional documents); we execute the protocol (a record) and then write a report. This fluidity allows us to combine the types.

Throughout these types we need to apply good change management and data integrity practices (ALCOA).

All of these types follow a very similar path for their lifecycle.

[Figure: the document lifecycle]

Everything we do is risk-based. Some questions to ask when developing and improving this system include:

  • What are the risks of writing procedures at a low level of detail versus a high level of detail (how much variability do we allow individuals performing a task)? Both have advantages and disadvantages; it is not a one-size-fits-all approach.
  • What are the risks in verifying (witnessing) non-critical tasks? How do we identify critical tasks?
  • What are the risks in not having evidence that a procedure-defined task was completed?
  • What are the risks in relation to archiving and documentation retrieval?

There is very little difference between paper records and documents and electronic records and documents as far as the above is concerned. Electronic records require the same care around generation, distribution, and maintenance; you are just looking at a different set of safeguards and activities to make it happen.
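
To make one of those electronic safeguards concrete, here is a toy sketch (not a validated design) of a hash-chained audit-trail entry. Each entry records who did what and when, and carries the hash of the previous entry, so silent alteration of earlier records becomes detectable. All names and values here are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone

def make_entry(prev_hash, user, action, value):
    """Build one audit-trail entry and the hash that chains it to the last one."""
    entry = {
        "user": user,                                         # attributable
        "timestamp": datetime.now(timezone.utc).isoformat(),  # contemporaneous
        "action": action,
        "value": value,                                       # original value
        "prev_hash": prev_hash,                               # chain to history
    }
    digest = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()
    return entry, digest

trail, last_hash = [], "0" * 64
for user, action, value in [("mfg_op_7", "record_weight", "12.03 kg"),
                            ("qa_rev_2", "verify_weight", "12.03 kg")]:
    entry, last_hash = make_entry(last_hash, user, action, value)
    trail.append(entry)
print(last_hash)  # changing any earlier entry would break this chain
```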

ALCOA or ALCOA+

My colleague Michelle Eldridge recently shared this video from learnaboutgmp on the differences between ALCOA and ALCOA+. It’s cute, it’s to the point, and it makes a nice primer.

As I’ve mentioned before, the MHRA in its data integrity guidance did take a dig at ALCOA+:

The guidance refers to the acronym ALCOA rather than ‘ALCOA +’. ALCOA being Attributable, Legible, Contemporaneous, Original, and Accurate and the ‘+’ referring to Complete, Consistent, Enduring, and Available. ALCOA was historically regarded as defining the attributes of data quality that are suitable for regulatory purposes. The ‘+’ has been subsequently added to emphasise the requirements. There is no difference in expectations regardless of which acronym is used since data governance measures should ensure that data is complete, consistent, enduring and available throughout the data lifecycle.

Two things should be drawn from this:

  1. Data Integrity is a set of best practices that are still developing, so make sure you are pushing that development and not ignoring it. Much better to be pushing the boundaries of the “c” (the “current” in cGMP) than to end up being surprised.
  2. I actually agree with the MHRA. Complete, consistent, enduring, and available are really just subsets of the others. But, as they also say, the acronym means little; just make sure you are doing it.

Data Integrity, it’s the new quality culture.