Pillars of Good Data

One thing we should all agree on is that we need reliable, accurate, and trustworthy data. That is why we strive for the principles of data governance, data quality, and data integrity: three interconnected concepts that work together to create a robust data management framework.

Overarching Framework: Data Governance

Data governance serves as the overarching framework that establishes the policies, procedures, and standards for managing data within an organization. It provides the structure and guidance necessary for effective data management, including:

  • Defining roles and responsibilities for data management
  • Establishing data policies and standards
  • Creating processes for data handling and decision-making
  • Ensuring compliance with regulations and internal policies

Data governance sets the stage for both data quality and data integrity initiatives by providing the necessary organizational structure and guidelines.

Data Quality: Ensuring Fitness for Purpose

Within the data governance framework, data quality focuses on ensuring that data is fit for its intended use. This involves:

  • Assessing data against specific quality dimensions (e.g., accuracy, completeness, consistency, validity, timeliness)
  • Implementing data cleansing and standardization processes
  • Monitoring and measuring data quality metrics
  • Continuously improving data quality through feedback loops and corrective actions

Data quality initiatives are guided by the policies and standards set forth in the data governance framework, ensuring that quality efforts align with organizational goals and requirements.
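
As a rough illustration of what such checks can look like in practice, the sketch below scores a record against a few of the dimensions listed above. The field names, rules, and thresholds are hypothetical, not drawn from any particular standard:

```python
from datetime import datetime, timezone

# Hypothetical quality rules for a batch-record dataset; real rules would
# come from the quality standards defined under the governance framework.
REQUIRED_FIELDS = ["batch_id", "operator", "recorded_at", "ph"]

def completeness(record: dict) -> float:
    """Completeness: fraction of required fields present and non-empty."""
    present = sum(1 for f in REQUIRED_FIELDS if record.get(f) not in (None, ""))
    return present / len(REQUIRED_FIELDS)

def is_valid(record: dict) -> bool:
    """Validity: pH must be numeric and within the physically possible range."""
    try:
        return 0.0 <= float(record["ph"]) <= 14.0
    except (KeyError, TypeError, ValueError):
        return False

def is_timely(record: dict, max_age_days: int = 30) -> bool:
    """Timeliness: the record must have been captured recently enough."""
    recorded = datetime.fromisoformat(record["recorded_at"])
    return (datetime.now(timezone.utc) - recorded).days <= max_age_days

record = {"batch_id": "B-1001", "operator": "jdoe",
          "recorded_at": "2024-05-01T09:30:00+00:00", "ph": 6.8}
print(completeness(record), is_valid(record), is_timely(record))
```

Metrics like these feed the monitoring and feedback loops mentioned above: a drop in the completeness score, for example, is a signal to trace the issue back to its source process.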

Data Integrity: Maintaining Trustworthiness

Data integrity works in tandem with data quality to ensure that data remains accurate, complete, consistent, and reliable throughout its lifecycle. The ALCOA+ principles, widely used in regulated industries, provide a comprehensive framework for ensuring data integrity.

ALCOA+ Principles

Attributable: Ensuring that data can be traced back to its origin and the individual responsible for its creation or modification.

Legible: Maintaining data in a clear, readable format that is easily understandable.

Contemporaneous: Recording data at the time of the event or observation to ensure accuracy and prevent reliance on memory.

Original: Preserving the original record or a certified true copy to maintain data authenticity.

Accurate: Ensuring data correctness and freedom from errors.

Complete: Capturing all necessary information without omissions.

Consistent: Maintaining data coherence across different systems and over time.

Enduring: Preserving data for the required retention period in a format that remains accessible.

Available: Ensuring data is readily accessible when needed for review or inspection.

Additional Data Integrity Measures

Security Measures: Implementing robust security protocols to protect data from unauthorized access, modification, or deletion.

Data Lineage Tracking: Establishing systems to monitor and document data transformations and origins throughout its lifecycle.

Auditability: Ensuring data changes are traceable through comprehensive logging and change management processes.

Data Consistency: Maintaining uniformity of data across various systems and databases.

Data integrity measures are often defined and enforced through data governance policies, while also supporting data quality objectives by preserving the accuracy and reliability of data. By adhering to the ALCOA+ principles and implementing additional integrity measures, organizations can ensure their data remains trustworthy and compliant with regulatory requirements.
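
As a concrete (and deliberately simplified) illustration, the sketch below shows an append-only audit trail whose entries are attributable, contemporaneous, and chained by hashes so that tampering is detectable. The field set and the hash-chaining approach are illustrative assumptions, not requirements drawn verbatim from any regulation:

```python
import hashlib
import json
from datetime import datetime, timezone

class AuditTrail:
    """Append-only log; each entry is chained to the previous one so that
    any later modification or deletion is detectable."""

    def __init__(self):
        self._entries = []

    def record(self, user_id: str, record_id: str, action: str,
               old_value, new_value, reason: str) -> dict:
        entry = {
            "user_id": user_id,      # attributable: unique individual ID
            "record_id": record_id,  # linked to the affected record
            "action": action,        # create / modify / delete
            "old_value": old_value,  # the original entry is preserved
            "new_value": new_value,
            "reason": reason,        # reason for the GxP change or deletion
            # contemporaneous: stamped from a controlled clock at entry time
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "prev_hash": self._entries[-1]["hash"] if self._entries else "0" * 64,
        }
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()).hexdigest()
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; returns False if any entry was tampered with."""
        prev = "0" * 64
        for e in self._entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            if e["prev_hash"] != prev or e["hash"] != hashlib.sha256(
                    json.dumps(body, sort_keys=True).encode()).hexdigest():
                return False
            prev = e["hash"]
        return True

trail = AuditTrail()
trail.record("jdoe", "BR-1001", "modify", 6.8, 7.0,
             "transcription error corrected")
assert trail.verify()
```

In a real system the same guarantees would typically come from the qualified application and database layer rather than hand-rolled hashing, but the entry contents map directly onto the ALCOA+ expectations above.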

Synergy in Action

The collaboration between these three elements can be illustrated through a practical example:

  1. Data Governance Framework: An organization establishes a data governance committee that defines policies for GxP data management, including data quality standards and security requirements.
  2. Data Quality Initiative: Based on the governance policies, the organization implements data quality checks to ensure GxP information is accurate, complete, and up-to-date. This includes:
    • Regular data profiling to identify quality issues
    • Data cleansing processes to correct errors
    • Validation rules to prevent the entry of incorrect data (see the sketch after this list)
  3. Data Integrity Measures: To maintain the trustworthiness of GxP data, the organization:
    • Implements access controls to prevent unauthorized modifications
    • Qualifies systems to meet ALCOA+ requirements
    • Establishes audit trails to track changes to GxP records
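
For the validation rules in step 2, a minimal sketch might look like the following; the rule set and the batch-numbering convention are invented for illustration:

```python
class ValidationError(ValueError):
    """Raised when an entry violates a data quality rule."""

# Hypothetical entry-time rules; real ones would be derived from the
# governance committee's GxP data quality standards.
def validate_entry(entry: dict) -> dict:
    if not str(entry.get("batch_id", "")).startswith("B-"):
        raise ValidationError("batch_id must follow the B-#### convention")
    if not 0.0 <= float(entry.get("ph", "nan")) <= 14.0:
        raise ValidationError("ph missing or outside the possible range")
    return entry  # only validated entries are persisted

validate_entry({"batch_id": "B-1002", "ph": 7.1})      # passes
# validate_entry({"batch_id": "1002", "ph": 7.1})      # raises ValidationError
```

Rejecting bad data at the point of entry is usually far cheaper than cleansing it downstream, which is why validation rules sit alongside profiling and cleansing in the quality initiative.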

By working together, these elements ensure that:

  • GxP data meets quality standards (data quality)
  • The data has a secure and unaltered lineage (data integrity)
  • All processes align with organizational policies and regulatory requirements (data governance)

Continuous Improvement Cycle

The relationship between data governance, quality, and integrity is not static but forms a continuous improvement cycle:

  1. Data governance policies inform data quality and integrity standards.
  2. Data quality assessments and integrity checks provide feedback on the effectiveness of governance policies.
  3. This feedback is used to refine and improve governance policies, which in turn enhance data quality and integrity practices.

This ongoing cycle ensures that an organization’s data management practices evolve to meet changing business needs and technological advancements.

Data governance, data quality, and data integrity work together as a cohesive system to ensure that an organization’s data is not only accurate and reliable but also properly managed, protected, and utilized in alignment with business objectives and regulatory requirements. This integrated approach is essential for organizations seeking to maximize the value of their data assets while minimizing risks associated with poor data management.

A GMP Application Based on ISA S88.01

A great example of data governance in practice is applying ISA S88.01 to enhance batch control processes and improve overall manufacturing operations.

Data Standardization and Structure

ISA S88.01 provides a standardized framework for batch control, including models and terminology that define the physical, procedural, and recipe aspects of batch manufacturing. This standardization directly supports data governance efforts by:

  • Establishing a common language for batch processes across the organization
  • Defining consistent data structures and hierarchies (sketched below)
  • Facilitating clear communication between different departments and systems
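
As a sketch of what those consistent structures look like, the snippet below models the S88.01 procedural hierarchy (procedure → unit procedure → operation → phase). The class and field names are illustrative, not taken from the standard's text:

```python
from dataclasses import dataclass, field

@dataclass
class Phase:
    name: str                      # smallest procedural element, e.g. "Charge water"
    parameters: dict = field(default_factory=dict)

@dataclass
class Operation:
    name: str                      # e.g. "Prepare solution"
    phases: list = field(default_factory=list)

@dataclass
class UnitProcedure:
    name: str                      # runs on a single unit, e.g. "Mixing"
    operations: list = field(default_factory=list)

@dataclass
class Procedure:
    name: str                      # the complete batch procedure
    unit_procedures: list = field(default_factory=list)

# A hypothetical granulation recipe expressed in the S88 hierarchy
granulation = Procedure("Granulation", [
    UnitProcedure("Mixing", [
        Operation("Prepare solution", [
            Phase("Charge water", {"volume_l": 120}),
            Phase("Add binder", {"mass_kg": 4.5}),
        ]),
    ]),
])
```

Because every site describes its batches with the same hierarchy, data collected at the phase level rolls up consistently to operations, unit procedures, and the overall procedure.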

Improved Data Quality

By following the ISA S88.01 standard, organizations can ensure higher data quality throughout the batch manufacturing process:

  • Consistent Data Collection: The standard defines specific data points to be collected at each stage of the batch process, ensuring comprehensive and uniform data capture.
  • Traceability: ISA S88.01 enables detailed tracking of each phase of the batch process, including raw materials used, process parameters, and quality data.
  • Data Integrity: The structured approach helps maintain data integrity by clearly defining data sources, formats, and relationships.

Enhanced Data Management

The ISA S88.01 model supports effective data management practices:

  • Modular Approach: The standard’s modular structure allows for easier management of data related to specific equipment, procedures, or recipes.
  • Scalability: As processes or equipment change, the modular nature of ISA S88.01 facilitates easier updates to data structures and governance policies.
  • Data Lifecycle Management: The standard’s clear delineation of process stages aids in managing data throughout its lifecycle, from creation to archival.

Regulatory Compliance

ISA S88.01 supports data governance efforts related to regulatory compliance:

  • Audit Trails: The standard’s emphasis on traceability aligns with regulatory requirements for maintaining detailed records of batch processes.
  • Consistent Documentation: Standardized terminology and structures facilitate the creation of consistent, compliant documentation.

Decision Support and Analytics

The structured data approach of ISA S88.01 enhances data governance initiatives aimed at improving decision-making:

  • Data Integration: The standard facilitates easier integration of batch data with other enterprise systems, supporting comprehensive analytics.
  • Performance Monitoring: Standardized data structures enable more effective monitoring and comparison of batch processes across different units or sites.

Continuous Improvement

Both data governance and ISA S88.01 support continuous improvement efforts:

  • Process Optimization: The structured data from ISA S88.01 compliant systems can be more easily analyzed to identify areas for process improvement.
  • Knowledge Management: The standard terminology and models facilitate better knowledge sharing and retention within the organization.

By leveraging ISA S88.01 in conjunction with robust data governance practices, organizations can create a powerful framework for managing batch processes, ensuring data quality, and driving operational excellence in manufacturing environments.

The Audit Trail and Data Integrity

Each ALCOA+ requirement translates into specific expectations for the audit trail:

Attributable (Traceable)

  • Each audit trail entry must be attributable to the individual responsible for the direct data input, linking every creation of or change to data with the person who made it. When using a user’s unique ID, this must identify an individual person.
  • Each audit trail must be linked to the relevant record throughout the data life cycle.

Legible

  • The system should be able to print or provide an electronic copy of the audit trail.
  • The audit trail must be available in a meaningful format when viewed in the system or as hardcopy.

Contemporaneous

  • Each audit trail entry must be date- and time-stamped according to a controlled clock that cannot be altered. The time may be based either on central server time or on local time, so long as it is clear in which time zone the entry was performed.

Original

  • The audit trail should retain the dynamic functionalities found in the computerized system, including search functionality to facilitate audit trail review activities.

Accurate

  • Audit trail functionality must be verified to ensure the data written to the audit trail equals the data entered or system generated.
  • Audit trail data must be stored in a secure manner, and users must not have the ability to amend, delete, or switch off the audit trail. Where a system administrator amends or switches off the audit trail, a record of that action must be retained.

Complete

  • The audit trail entries must be automatically captured by the computerized system whenever an electronic record is created, modified, or deleted.
  • Audit trails must, at minimum, record all end-user-initiated processes related to critical data. The following parameters must be included:
    • The identity of the person performing the action.
    • In the case of a change or deletion, the detail of the change or deletion, and a record of the original entry.
    • The reason for any GxP change or deletion.
    • The time and date when the action was performed.

Consistent

  • Audit trails are used to review, detect, report, and address data integrity issues.
  • Audit trail reviewers must have appropriate training, system knowledge and knowledge of the process to perform the audit trail review. The review of the relevant audit trails must be documented.
  • Audit trail discrepancies must be addressed, investigated, and escalated to management and national authorities, as necessary.

Enduring

  • The audit trail must be retained for the same duration as the associated electronic record.

Available

  • The audit trail must be available for review at any time by inspectors and auditors during the required retention period.
  • The audit trail must be accessible in a human readable format.

21CFR Part 11 Requirements

Definition: An audit trail is a secure, computer-generated, time-stamped electronic record that allows for the reconstruction of events related to the creation, modification, and deletion of an electronic record.

Requirements:

  • Availability: Audit trails must be easily accessible for review and copying by the FDA during inspections.
  • Automation: Entries must be automatically captured by the system without manual intervention.
  • Components: Each entry must include a timestamp, user ID, original and new values, and reasons for changes where applicable.
  • Security: Audit trail data must be securely stored and not accessible for editing by users.
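
A minimal sketch of an entry carrying the components listed above might look like this; the type and field names are assumptions made for illustration, not wording from the regulation:

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Any, Optional

@dataclass(frozen=True)          # frozen: an entry cannot be edited once created
class Part11Entry:
    timestamp: str               # secure, computer-generated time stamp
    user_id: str                 # identity of the person performing the action
    original_value: Any          # value before the change (None on creation)
    new_value: Any               # value after the change
    reason: Optional[str]        # required where applicable, e.g. GxP changes

def capture(user_id: str, original: Any, new: Any,
            reason: Optional[str] = None) -> Part11Entry:
    """Stamps the entry automatically, with no manual intervention."""
    return Part11Entry(datetime.now(timezone.utc).isoformat(),
                       user_id, original, new, reason)

print(asdict(capture("jdoe", 6.8, 7.0, "transcription error corrected")))
```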

EMA Annex 11 (Eudralex Volume 4) Requirements

Definition: Audit trails are records of all GMP-relevant changes and deletions, created by the system to ensure traceability and accountability.

Requirements:

  • Risk-Based Approach: Building audit trails into the system for all GMP-relevant changes and deletions should be considered based on a risk assessment.
  • Documentation: The reasons for changes or deletions must be documented.
  • Review: Audit trails must be available, convertible into a generally readable form, and regularly reviewed.
  • Validation: The audit trail functionality must be validated to ensure it captures all necessary data accurately and securely.

Requirements from PIC/S GMP Data Integrity Guidance

Definition: Audit trails are metadata recorded about critical information such as changes or deletions of GMP/GDP relevant data to enable the reconstruction of activities.

Requirements:

  • Review: Critical audit trails related to each operation should be independently reviewed with all other records related to the operation, especially before batch release.
  • Documentation: Significant deviations found during the audit trail review must be fully investigated and documented.
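
To illustrate the review step (purely a sketch; the entry structure and the pre-release gate are assumptions, not PIC/S text), critical audit trail entries for a batch might be gathered and checked for documented, independent review before release:

```python
# Hypothetical pre-release gate: every critical audit trail entry for the
# batch must have a documented review by someone other than the actor.
def ready_for_release(entries: list, batch_id: str) -> bool:
    critical = [e for e in entries
                if e["batch_id"] == batch_id and e["critical"]]
    return all(e.get("reviewed_by") and e["reviewed_by"] != e["user_id"]
               for e in critical)

entries = [
    {"batch_id": "B-1001", "critical": True,
     "user_id": "jdoe", "reviewed_by": "asmith"},
    {"batch_id": "B-1001", "critical": True,
     "user_id": "jdoe", "reviewed_by": None},   # unreviewed change
]
print(ready_for_release(entries, "B-1001"))      # False until reviewed
```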

Attributable within a Process

Attributable is the part of ALCOA that tells us it should be possible to identify the individual or computerized system that performed the recorded task. The need to document who performed the task or function is, in part, to demonstrate that the function was performed by trained and qualified personnel. This applies to changes made to records as well: corrections, deletions, changes, etc.

This means that records should be signed and dated using a unique identifier that is attributable to the author, where the author means the individual who created or recorded the data.

Understanding what role the individual is playing in the task is critical. There are basically six: Executor, Preparer, Checker, Verifier, Reviewer and Approver.

The Six Primary Roles
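
As a small sketch (the enum values and signature fields are illustrative), the six roles listed above can be modeled explicitly so that every signature captures who acted and in which capacity:

```python
from enum import Enum
from datetime import datetime, timezone

class Role(Enum):
    """The six primary roles an individual can play in a recorded task."""
    EXECUTOR = "executor"
    PREPARER = "preparer"
    CHECKER = "checker"
    VERIFIER = "verifier"
    REVIEWER = "reviewer"
    APPROVER = "approver"

def sign(record_id: str, user_id: str, role: Role) -> dict:
    """An attributable signature: unique ID, role, and date of signing."""
    return {"record_id": record_id, "user_id": user_id,
            "role": role.value,
            "signed_at": datetime.now(timezone.utc).isoformat()}

print(sign("BR-1001", "jdoe", Role.APPROVER))
```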

Document Management

Today many companies are going digital, striving to be paperless and reinventing how individuals find information, record data, and make decisions. When undertaking these changes, it is often good to go back to basics and make sure we are all on the same page before proceeding.

There are three major types/functions of documents:

  • Functional Documents provide instructions so people can perform tasks and make decisions safely, effectively, compliantly, and consistently. This usually includes things like procedures, process instructions, protocols, methods, and specifications. Many of these need some sort of training decision. Functional documents should involve a process to ensure they remain up-to-date, especially in relation to current practices and relevant standards (periodic review).
  • Records provide evidence that actions were taken and decisions were made in keeping with procedures. This includes batch manufacturing records, logbooks and laboratory data sheets and notebooks. Records are a popular target for electronic alternatives.
  • Reports provide specific information on a particular topic in a formal, standardized way. Reports may include data summaries, findings, and actions to be taken.

Often these types are all engaged in a lifecycle: an SOP directs us to write a protocol (two documents), we execute the protocol (a record), and then we write a report. This fluidity allows us to combine the types.

Throughout these types we need to apply good change management and data integrity practices (ALCOA).

All of these types follow a very similar path for their lifecycle.

(Figure: the document lifecycle)

Everything we do is risk-based. Some questions to ask when developing and improving this system include:

  • What are the risks of writing procedures at a low level of detail versus a high level of detail (i.e., how much variability do we allow individuals performing a task)? Both have advantages and disadvantages, and it is not a one-size-fits-all approach.
  • What are the risks in verifying (witnessing) non-critical tasks? How do we identify critical tasks?
  • What are the risks in not having evidence that a procedure-defined task was completed?
  • What are the risks in relation to archiving and documentation retrieval?

As far as the above is concerned, there is very little difference between paper records and documents and their electronic counterparts. Electronic records raise the same concerns around generation, distribution, and maintenance; you are simply looking at a different set of safeguards and activities to make it happen.

ALCOA or ALCOA+

My colleague Michelle Eldridge recently shared this video on the differences between ALCOA and ALCOA+ from learnaboutgmp. It’s cute, it’s to the point, and it makes a nice primer.

As I’ve mentioned before, the MHRA, in its data integrity guidance, did take a dig at ALCOA+:

The guidance refers to the acronym ALCOA rather than ‘ALCOA +’. ALCOA being Attributable, Legible, Contemporaneous, Original, and Accurate and the ‘+’ referring to Complete, Consistent, Enduring, and Available. ALCOA was historically regarded as defining the attributes of data quality that are suitable for regulatory purposes. The ‘+’ has been subsequently added to emphasise the requirements. There is no difference in expectations regardless of which acronym is used since data governance measures should ensure that data is complete, consistent, enduring and available throughout the data lifecycle.

Two things should be drawn from this:

  1. Data Integrity is a set of best practices that are still developing, so make sure you are pushing that development and not ignoring it. Much better to be pushing the boundaries of the “c” than to end up being surprised.
  2. I actually agree with the MHRA: complete, consistent, enduring, and available are really just subsets of the others. But, as they also say, the acronym means little; just make sure you are doing it.

Data Integrity, it’s the new quality culture.