Provides a systematic, business-driven approach to measure and evaluate data quality employing data quality dimensions, to ensure fitness for purpose and establish targets and thresholds for quality.
The business owns the data it creates and manages. No organization’s information technology staff can single-handedly improve the quality of its data. Business representatives across the patient demographic lifecycle must be engaged to: determine patient demographic data’s fitness for purpose across the lifecycle; define the level of quality desired; and define the level of quality acceptable.
The data quality assessment process consists of making decisions about the data and acting on those decisions. Only those who create, modify, and delete patient demographic data, across every phase of its lifecycle, can decide:
To take an example of fitness for purpose, an organization may discover, based on data profiling results, that it does not capture a sufficient set of attributes to maximize the efficacy of its record matching algorithm. The data set is not fit for purpose because it is incomplete with respect to the business objective of preventing and reconciling duplicates. Business representatives would need to consider which attributes are most useful to add to constitute the minimum set that is required, for instance, mother’s maiden name, previous address, previous phone number, etc.
An example of a target might be: 100% population of all attributes (each specified) needed for matching. An example of a threshold might be: 95% of first names must contain more than one character (the rationale might be that this could be modified later in the patient lifecycle). An example of a metric might be: the baseline profiling effort revealed that 10% of street addresses did not have a street suffix (RD, ST, BLVD, etc.). This metric, the percentage of records without street suffixes, could be monitored to assess improvement as data profiling, monitoring, and data cleansing activities are conducted over time.
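The threshold and metric examples above can be expressed as simple computations over a record set. The following is a minimal sketch, not a definitive implementation. It assumes records are dictionaries with the hypothetical field names `first_name` and `street_address`, and it uses a short illustrative suffix list rather than an authoritative one. It also assumes the suffix is the last token of the address, so addresses ending in a unit number would need extra handling.

```python
# Hypothetical sample records; field names are assumptions for illustration.
records = [
    {"first_name": "Maria", "street_address": "12 Oak ST"},
    {"first_name": "J", "street_address": "400 Pine RD"},
    {"first_name": "Ana", "street_address": "77 Elm"},
    {"first_name": "Li", "street_address": "9 Lake BLVD"},
]

# Illustrative subset of street suffixes; a real list would be much longer.
STREET_SUFFIXES = {"RD", "ST", "BLVD", "AVE", "DR", "LN"}


def pct(count, total):
    """Return count as a percentage of total (0.0 when total is zero)."""
    return 100.0 * count / total if total else 0.0


def first_name_multichar_pct(rows):
    """Percentage of records whose first name has more than one character."""
    ok = sum(1 for r in rows if len((r.get("first_name") or "").strip()) > 1)
    return pct(ok, len(rows))


def missing_suffix_pct(rows):
    """Percentage of records whose street address lacks a known suffix."""
    def has_suffix(address):
        tokens = (address or "").upper().replace(".", "").split()
        return bool(tokens) and tokens[-1] in STREET_SUFFIXES

    missing = sum(1 for r in rows if not has_suffix(r.get("street_address")))
    return pct(missing, len(rows))


name_score = first_name_multichar_pct(records)
suffix_metric = missing_suffix_pct(records)
meets_threshold = name_score >= 95.0  # the 95% threshold from the example

print(f"first names > 1 char: {name_score:.1f}% (threshold met: {meets_threshold})")
print(f"addresses missing suffix: {suffix_metric:.1f}%")
```

Recomputing the same two numbers after each profiling or cleansing cycle gives the trend that the metric is meant to track.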
The most effective mechanism to assist the business in assessing data quality and establishing useful targets, thresholds, and metrics is the consideration and application of data quality dimensions to each attribute. A “dimension” is a criterion against which data quality is measured. A number of different dimensions of quality can be measured. A sample set often used is presented below:
A data quality assessment is based on the predefined quality expectations and criteria set by stakeholders and approved by governance. It is advisable to start by measuring data quality for a small set of key attributes (i.e., patient demographic data) that support one or more primary business processes. Profiling the data is the recommended first step. For each attribute identified, the organization should convene a working group (e.g., data stewards) representing all relevant stakeholders to determine targets, set thresholds, and define the quality dimensions that are most important.
Once the criteria are determined and the data evaluated, metrics can be developed and published in a scorecard or dashboard format. Assessment results facilitate root cause analysis and are key inputs into the organization's data quality improvement plans. Periodic assessments should be conducted to determine if acceptable thresholds and targets are being met, and metrics should be updated accordingly.
To support these efforts and track improvements over time, it is helpful to conduct an impact analysis of the overall data quality effort, as well as of specific improvements to individual data elements, as part of the assessment process. Categorizing the impacts of poor data quality (such as cost, risk, compliance, and productivity) also assists in prioritizing data cleansing and quality improvement plans.
Effective governance is important to implementing this process (see Data Governance). Assignment of specific responsibilities and data ownership deepens business engagement, which is important because improving data quality is truly a team effort. For example, an organization may decide that the Billing department should own ZIP code because it is critical for mailing patient bills, whereas it is not critical for clinical care delivery. If ZIP codes were found to be missing or inaccurate, Billing would, under the supervision of the data quality coordinator, initiate root cause analysis and sponsor the resulting improvements for remediation and defect prevention.
The data quality assessment process and accompanying mechanisms and metrics provide the following benefits:
When an organization establishes agreement on high-level objectives, such as assuring unique records in patient demographic data stores, it is better able to bring different perspectives into alignment around shared data assets.
Practically every step along the care continuum benefits from unambiguous patient records. However, the priority placed on that uniqueness may vary, as may the level of accuracy required of the attributes used to ensure it. Initial priorities are often driven by the primary purposes for which the data is collected. If the collective needs of all relevant suppliers and consumers are not addressed, there is a risk of negative downstream impacts.
For example, the accuracy of insurance information is very important for billing. Patient registration staff may consider it less important for the purpose of delivering quality care to the patient. However, because registration staff also rely on patient record uniqueness to meet their patient safety objectives, they can be trained to recognize the importance of capturing accurate patient and insurance information.
Example Work Products
The data quality assessment is the application of business-approved data quality requirements to a selected data set. Data quality requirements should be expressed in terms of data quality dimensions and should be aligned with organizational objectives. Targets and thresholds should be established for each dimension. Examples of quantitative documentation of targets and thresholds are illustrated in the following table:
|Dimension|Description|Threshold|Target|
|---|---|---|---|
|Accuracy|Affinity of data with original intent; veracity of the data as compared to an authoritative source; measurement precision.|85%|100%|
|Conformity|Alignment of data with the required standard.|75%|99.9%|
|Uniqueness|Unambiguous records in the data set.|80%|98%|
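The comparison of measured quality against these values can be sketched in a few lines of Python. This is an illustrative sketch only. It assumes that the first percentage in each table row is the threshold (minimum acceptable level) and the second is the target (desired level). The measured scores and the status labels are hypothetical.

```python
# Percentages taken from the table above, read as (threshold, target).
CRITERIA = {
    "Accuracy": {"threshold": 85.0, "target": 100.0},
    "Conformity": {"threshold": 75.0, "target": 99.9},
    "Uniqueness": {"threshold": 80.0, "target": 98.0},
}


def classify(dimension, measured_pct):
    """Compare a measured score with the threshold and target for a dimension."""
    limits = CRITERIA[dimension]
    if measured_pct >= limits["target"]:
        return "target met"
    if measured_pct >= limits["threshold"]:
        return "acceptable"
    return "below threshold"


# Hypothetical measured scores, for illustration only.
measured = {"Accuracy": 91.2, "Conformity": 70.0, "Uniqueness": 98.5}

scorecard = {dim: classify(dim, score) for dim, score in measured.items()}
for dim, status in scorecard.items():
    print(f"{dim}: {measured[dim]}% -> {status}")
```

A structure like `scorecard` is one possible input to a published scorecard or dashboard. Dimensions that fall below threshold are natural candidates for root cause analysis.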
Having a predetermined set of key attributes is essential to keeping the scope of data quality efforts manageable. Governance representatives should agree on the scope of attributes based on priorities that support the organization’s goals; for example, agreeing on a standard set of patient demographic attributes that will improve the ability to match duplicate patient records to support patient safety. While the industry has not reached consensus on the optimal group of attributes, there is a minimum set on which many healthcare organizations agree:
Data quality assessments should be conducted periodically according to an approved frequency, per the data quality assessment policy.
Data quality assessments typically result in the implementation of data quality rules that are informed by business knowledge of the data. These rules govern how data is handled and define required data elements, formats, and timeliness parameters, whether the data arrives through manual entry or automated ingestion from bulk data sources (e.g., .TXT, .CSV, or .XLS files). It is important to specify rules before data migrations, connections to external systems such as Health Information Exchanges, and extractions to repositories for analytics. As assessments are conducted, rules are added and refined until the quality of the data set surpasses the threshold required by the relevant stakeholders and approaches or reaches the targets.
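As an illustration of how such rules might be applied to a bulk source, the sketch below checks rows of a CSV extract against three simple rules. Everything in it is an assumption made for illustration: the column names, the sample data, and the rules themselves (a required last name, an ISO-style birth date, and a five-digit ZIP code). Actual rules would come from the organization's own assessments.

```python
import csv
import io
import re
from datetime import datetime

# Hypothetical bulk extract; column names are assumptions for illustration.
SAMPLE_CSV = """patient_id,last_name,date_of_birth,zip_code
1001,Nguyen,1984-03-12,30301
1002,,1990-07-04,30302
1003,Okafor,07/04/1990,3030
"""


def is_iso_date(value):
    """True when value parses as a year-month-day calendar date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


# Each rule: (rule name, column, predicate returning True when the value passes).
RULES = [
    ("last_name required", "last_name", lambda v: bool(v.strip())),
    ("date_of_birth is YYYY-MM-DD", "date_of_birth", is_iso_date),
    ("zip_code is 5 digits", "zip_code",
     lambda v: re.fullmatch(r"\d{5}", v) is not None),
]


def check_rows(csv_text):
    """Apply every rule to every row; return (patient_id, rule name) failures."""
    failures = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name, column, passes in RULES:
            if not passes(row.get(column) or ""):
                failures.append((row["patient_id"], name))
    return failures


failures = check_rows(SAMPLE_CSV)
for patient_id, rule in failures:
    print(f"record {patient_id} failed: {rule}")
```

Keeping the record identifier with each failure means that any summary figure derived from `failures` can be traced back to the individual records behind it. Adding a rule is a matter of appending one entry to `RULES`.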
It is important that high-level information in data quality assessment reports can be traced back to individual records to ensure that current thresholds and targets are accurately met. The organization should work with vendors to ensure availability and access to data, as well as timely and accurate reports.
Example Work Products
A data quality assessment policy should be developed after creating the organization’s data quality plan (See Data Quality Planning). The policy should provide guidance on the selection of data sets, availability of data, alignment of data across systems, the types of methods and measures by which data quality should be assessed, the frequency and/or event triggers for conducting assessments, and the conditions for ensuring alignment with organizational objectives (i.e., governance sign-off) as well as compliance with the policy.
Example Work Products
1.1 Does the organization engage business representatives to determine fitness for purpose and quality criteria for patient demographic data?
2.1 Are objectives, targets, and thresholds expressed in terms of selected data quality dimensions for patient demographic data attributes?
2.2 Does the organization engage data governance for input on the selection and prioritization of attributes to be assessed?
2.3 Do data quality assessment activities produce new or modified quality rules for patient demographic data?
3.1 Does the organization conduct periodic assessments of the selected data sets in accordance with patient demographic data quality policies?