Understanding Data – A Core Quality Skill

A critical skill for a quality professional (indeed, for any professional), and a fundamental part of Quality 4.0, is managing data: knowing how to acquire good data, analyze it properly, follow the clues those analyses offer, explore the implications, and present results in a fair, compelling way.

As we build systems, validate computer systems, and create processes, we need to ensure the quality of data. Think about the data you generate, and continually work to make it better.

I am a big fan of tools like Thomas Redman's Friday Afternoon Measurement for finding where data has problems.
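The Friday Afternoon Measurement boils down to a simple tally: a team reviews roughly 100 recent records, marks the ones with obvious errors, and scores the fraction that came through clean. A minimal sketch of that scoring step might look like the following (the record contents and flagged indices here are hypothetical stand-ins for a real team review):

```python
def data_quality_score(records, flagged_errors):
    """Percentage of records with no flagged errors (0-100 scale).

    flagged_errors is a set of record indices the review team marked
    as containing at least one obvious error.
    """
    clean = sum(1 for i in range(len(records)) if i not in flagged_errors)
    return 100.0 * clean / len(records)

records = [{"id": n} for n in range(100)]   # stand-in for 100 real records
flagged = {3, 17, 42, 68}                   # indices the team marked in red
print(data_quality_score(records, flagged))  # -> 96.0
```

The arithmetic is trivial on purpose; the value of the exercise is in the manual review, not the computation.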

Have the tools to decide what data stands out: use control charts and regression analysis to understand the data. "Looks Good To Me: Visualizations As Sanity Checks" by Michael Correll is a great overview of how data visualization can help us decide whether the data we are gathering makes sense.
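To make the control-chart idea concrete, here is a minimal sketch of flagging points that fall outside limits estimated from an in-control baseline period. This is a simplified mean ± 3s rule; a true individuals (I-MR) chart would estimate sigma from the average moving range instead, and the measurements here are made up for illustration:

```python
import statistics

def control_limits(baseline, sigma=3):
    """Estimate lower/upper control limits from an in-control baseline.

    Simplified: uses the sample standard deviation rather than the
    moving-range estimate a true I-MR chart would use.
    """
    mean = statistics.fmean(baseline)
    s = statistics.stdev(baseline)
    return mean - sigma * s, mean + sigma * s

def out_of_control(baseline, new_points, sigma=3):
    """Return the new observations that fall outside the control limits."""
    lcl, ucl = control_limits(baseline, sigma)
    return [x for x in new_points if not (lcl <= x <= ucl)]

baseline = [9.9, 10.1] * 10   # hypothetical in-control measurements
print(out_of_control(baseline, [10.0, 10.1, 10.9, 9.5]))  # -> [10.9, 9.5]
```

The key design point is that limits come from a period you believe is in control; judging new points against limits computed from the same data that contains the anomaly tends to mask it.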

Then root cause analysis (another core capability) allows us to determine what is truly going wrong with our data.

Throughout all your engagements with data, understand statistical significance: how to quantify whether a result is likely due to chance or to the factors you were measuring.
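One way to make "due to chance" tangible is an exact binomial test. The sketch below (a toy example, not tied to any particular process) asks: if outcomes were really a coin flip, how likely is a result at least this extreme? A small p-value means chance alone is an unlikely explanation:

```python
from math import comb

def p_heads_at_least(k, n):
    """Exact one-sided p-value: probability of >= k heads in n fair tosses."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

# Did a process change help, or could 14 good outcomes out of 16 be luck?
p = p_heads_at_least(14, 16)
print(round(p, 4))  # -> 0.0021, i.e. unlikely under chance alone
```

Real analyses would use a test matched to the data (t-test, chi-squared, permutation test), but the underlying question is the same: how surprising is this result under the null hypothesis?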

In the past it was enough to understand a Pareto chart, a histogram, and maybe a basic control chart. Those days are long gone. What quality professionals need to bring to the table today is a deeper understanding of data: how to gather it, analyze it, and determine its relevance. Data integrity is a key concept, and to have integrity, you need to understand data.

Data, and all that jazz

As we all try to figure out just exactly what Industry 4.0 and Quality 4.0 mean, it is not an exaggeration to say "data is your most valuable asset." Yet we all struggle to actually get a benefit from this data, and data integrity is an area of intense regulatory concern.

To truly have value, our data needs to be properly defined, relevant to the tasks at hand, structured so that it is easy to find and understand, and of high enough quality that it can be trusted. Without that, we just have noise.

Apply principles of good master data management and data integrity. Ensure systems are appropriately built and maintained.

Understand why data matters, how to pick the right metrics, and how to ask the right questions of data. Understanding correlation vs. causation, so you can decide when to act on an analysis and when not to, is critical.
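The correlation-vs.-causation trap is easy to demonstrate with a toy simulation: a hidden confounder drives two variables that have no causal link to each other, yet they end up strongly correlated. The variable names below are illustrative only:

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient, computed directly."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rng = random.Random(0)
z = list(range(100))                  # hidden confounder, e.g. temperature
x = [v + rng.gauss(0, 5) for v in z]  # toy "ice cream sales", driven by z
y = [v + rng.gauss(0, 5) for v in z]  # toy "sunburn rate", also driven by z
print(pearson(x, y))  # strongly correlated, yet x does not cause y
```

Acting on the x–y correlation here (say, restricting ice cream to reduce sunburn) would fail, because the lever that actually matters is z. Asking "what could be driving both?" before acting is the habit worth building.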

In the 2013 Harvard Business Review article "Keep Up with Your Quants," Thomas Davenport lists six questions that should be asked to evaluate conclusions obtained from data:

1. What was the source of your data?

2. How well do the sample data represent the population?

3. Does your data distribution include outliers? How did they affect the results?

4. What assumptions are behind your analysis? Might certain conditions render your assumptions and your model invalid?

5. Why did you decide on that particular analytical approach? What alternatives did you consider?

6. How likely is it that the independent variables are actually causing the changes in the dependent variable? Might other analyses establish causality more clearly?

Framing data, being able to ask the right questions, is critical to being able to use that data and make decisions. In the past it was adequate for a quality professional to have a familiarity with a few basic tools. Today it is critical to understand basic statistics. As Nate Silver advises in an interview with HBR: "The best training is almost always going to be hands on training. Getting your hands dirty with the data set is, I think, far and away better than spending too much time doing reading and so forth."

Understanding data is a key ability and is necessary to thrive. It is time to truly contemplate the data ecosystem as a system and stop treating it as a specialized area of the organization.