Software Requirements - Data Requirements

ONC (Office of the National Coordinator for Health Information Technology)

https://www.healthit.gov

Data Requirements Definition

Purpose

Ensures that data produced and consumed satisfies business objectives, is understood by all relevant stakeholders, and meets the needs of the business processes that create and use the data.

Introductory Notes

While most organizations have a comprehensive approach to defining requirements for information system functionality, the corresponding data requirements are often neglected by comparison. Typically, attention is focused on system behavior; for example, “The system shall display a patient’s name history”, “The system shall require that the Social Security Number is entered twice”, or “The system shall display the message ‘check for existing patient’ if the user enters the same name, birth date, and gender as an existing patient record.”

It is not uncommon for an IT project team to quickly design a database during software development, without reference to business terms, data standards (names, metadata, allowed values, ranges, lengths, etc.), or quality rules. Organizations are much better served by ensuring that the selection of, and specifications for, data used to satisfy business objectives are prioritized, validated by stakeholders, and well documented through a repeatable process.

Data requirements definition establishes the process used to identify, prioritize, precisely formulate, and validate the data needed to achieve business objectives. When documenting data requirements, data should be referenced in business language, reusing approved standard business terms if available. If business terms have not yet been standardized and approved for the data within scope, the data requirements process provides the occasion to develop them. For patient demographic data, governance should be engaged in validating data requirements, with representation from supplying and consuming business areas across the lifecycle to ensure that their requirements are met.

Data requirements definition should follow an organized and sequential discovery and decomposition process. Business rules for system behavior should be developed in parallel with the logical design of the destination data store; this method is bi-directional and iterative. Data requirements should be represented in the logical design of the data store and should reflect standardization across projects.

If data in the new data store already exists elsewhere and will migrate, profiling should be performed to ensure that it meets the business expectations and requirements prior to population (See Data Profiling). This may positively impact the design process by surfacing the need for additional quality rules or specifications, and it will improve the percentage of requirements satisfied and reduce the amount of rework for future releases.

It is advised to develop a standard template for data requirements specification for new systems, data store consolidations, data repositories (e.g., Master Patient Index, enterprise data warehouse), and the development of data exchange mechanisms. The data requirements definition process contributes to the creation and validation of business terms and definitions, which link to metadata, data standards, and the business processes which manage and process the data. The template can be as simple as a spreadsheet capturing, for example, the following information:

  • Business term – the data element name in business English, e.g., Street Address;
  • Term definition – the approved definition of the business term, e.g. Birth Date – the date on which a person was born;
  • Originating business process – the process that creates the data, e.g., Patient Registration;
  • Consuming business process(es) – the process or processes that use the data, e.g., Clinical Care, Laboratory, Claims;
  • Modifying business process(es) – the process or processes through which the data can be modified, e.g., Billing;
  • Owner – the name of the individual who has the responsibility for ensuring that the business term is correct and approved; and/or the ability to grant or deny permission for access; and/or the individual who manages the business process that creates the data element – however the responsibility is assigned;
  • Steward – the name of the individual who represents the data element in governance activities, on behalf of the entire organization;
  • Logical name – the business term transformed to the organization’s data design standards, e.g., Street Address 1 Text, Street Address 2 Text;
  • Allowed values – the codes, minimum/maximum ranges, etc. which are acceptable, e.g., M, F, U;
  • Values format – how the values are represented, e.g., MMDDYYYY, 60x (text characters), 999-99-9999 (SSN), 2x (state code), 9-999-999-9999 (phone number), etc.;
  • Originating data source(s) (if acquired) – e.g., Registration Capture System;
  • Source table name – the name of the table within the source, if applicable, e.g., PT_PRFLE (patient profile);
  • Source column name – the name of the column containing the data in the data source, e.g., PT_FRST_NM;
  • Physical name – the name of the term developed for the physical database in which it is or will be stored, applying physical data standards, e.g. ST_ADDR_1_TX; and
  • Quality rule(s) – the automated test or tests that will be applied to the data element upon entry, e.g., First Name must contain more than one character, First Name must contain a single word (no extra components such as suffixes).
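The quality rules in the template above lend themselves to automated checks at data entry. The following is a minimal sketch in Python of how the sample rules (First Name length and word count, the M/F/U allowed values, and the MMDDYYYY values format) could be expressed as validators; the function names and rule logic are illustrative, not part of any standard.

```python
import re

# Hypothetical validators for the sample quality rules in the template above.

def first_name_is_valid(first_name: str) -> bool:
    """First Name must contain more than one character and a single word."""
    return len(first_name) > 1 and len(first_name.split()) == 1

def gender_code_is_allowed(code: str) -> bool:
    """Gender code must be one of the allowed values M, F, U."""
    return code in {"M", "F", "U"}

def birth_date_matches_format(birth_date: str) -> bool:
    """Birth Date must match the MMDDYYYY values format."""
    return bool(re.fullmatch(r"(0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])\d{4}",
                             birth_date))
```

In practice such rules would be attached to each data element in the template and executed upon entry, so that defects are rejected at the point of origination rather than cleansed later.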

It can be observed from the sample list above, which may vary according to the organization, that the data requirements definition process is dependent on, or may become the occasion for, executing many of the data management processes described in this document, supported by corresponding work products: Business Glossary, Metadata Management, Data Governance, Data Lifecycle Management, Data Quality Assessment, Data Cleansing and Improvement, and Provider Management. This is a practical illustration of the synergy of best practices.

The organization should apply the requirements definition process and standard template when considering adding new patient demographic data elements, such as mother’s maiden name, previous address, previous phone number, etc. The effects on existing business processes, matching algorithms, confidence in patient identity, and projected development and maintenance costs should be analyzed and reviewed and approved through governance.

Adopting data requirements from regulatory and industry sources is highly advised for an organization seeking to improve the quality of its patient demographic data. For example, there are a number of healthcare industry efforts underway to advocate adoption of standardized data attributes to improve patient identity integrity. The recommended data sets proposed to improve matching vary; a number of relevant standards are referenced in the “Patient Identification and Matching Final Report,” published by Audacious Inquiry in 2014 for ONC.

Establishing and following sound practices for defining data requirements is critical to minimizing data complexity over time. Effectively implementing this process will yield the following benefits:

  • Ensures that knowledgeable individuals determine what data is needed;
  • Increases the ability to share data across the organization and among organizations;
  • Ensures proactive data quality measures are built into systems and data stores;
  • Strengthens the relationship between data and business processes;
  • Establishes data ownership, stewardship, and lineage; and
  • Enhances the business glossary and builds metadata assets.

Data Lifecycle Management

Purpose

Ensures that the organization understands, inventories, maps, and controls its data, as it is created and modified through business processes throughout the data lifecycle, from creation or acquisition to retirement.

Introductory Notes

Data lifecycle management enables an organization to avoid data risks and supports the discovery and application of needed data quality improvements. It is a particularly important topic when addressing interdependent business processes that share or modify data. The data lifecycle begins with the creation of data at its point of origin through its useful life in the business processes dependent on it, and its eventual retirement, archiving, or destruction. An organization benefits from defining data usage and corresponding dependencies across business processes, for data that is either required by multiple business processes or critical for important business functions.

The classification of lifecycle phases for data assets typically includes the following sequential categories:

  • Business specification (e.g., data requirements, business terms, metadata);
  • Origination (i.e., the point of data creation or acquisition by the organization);
  • Development (e.g., architecture and logical design);
  • Implementation (i.e., physical design, initial population in data store(s));
  • Deployment (i.e., rollout of physical data usage in an operational environment);
  • Operations (e.g., data modifications, data transformations, and integration performance monitoring and maintenance); and
  • Retirement (i.e., retirement, archiving, and destruction).

Data within each major subject area (i.e., broad data groupings such as Organizations, Facilities, Persons) is also classified, traced, and sequenced by its creation, modification, or usage within the primary business processes of the organization. For example, a new patient is registered and provides insurance information, the patient then sees a physician, is subject to treatment and/or laboratory tests, and returns for a follow-up visit; the patient’s insurance is submitted, the insurance payment is received, and the patient is billed.

The corresponding business processes that create data would be the following: demographic and insurance data is captured through the registration process; office visit information, diagnosis and treatment data, and provider notes are captured through the clinical evaluation and diagnosis process; laboratory data is captured when tests are ordered and when lab results data is added; returning patient data is again captured through the registration process; procedures are documented and sent through the insurance claims process; insurance payments are recorded for that patient through the payment receipt and allocation process; and a bill for uncovered charges is sent through the billing process.

All of the above processes create data about a patient; however, central to them all is patient demographic data. Duplicate records caused by a lack of patient identity integrity, most often occurring at the point of origination through the registration process, can affect treatment, testing, insurance claims, and billing. The demographic data about a patient is a critical data set, and its reference and usage throughout the healthcare lifecycle is ubiquitous. Therefore, it is recommended that organizations analyze every process where it is created and updated, both to ensure completeness and accuracy within patient records and to prevent duplicate records.

It is advised to identify dependencies among business processes using patient demographic data at the attribute level, enabling the organization to develop a comprehensive understanding of data interrelationships. If external organizations are involved in capturing or modifying the data, it may be necessary to determine what processes they follow to discover where defects, anomalies, or missing data occur.

The first step in mapping patient demographic data to supplying and consuming business processes is to model each process that produces, modifies, or consumes the data. This can begin as simply as creating a sequential activity list, indicating what the usage is with respect to the data. For example, it may be that under no circumstances would a nurse or provider providing clinical care ever make a change in demographic data. In that case, the usage can be classified as Reference (aka, “Read” access) for that business process. However, it may be that the claims or billing processes occasionally surface the need to correct an inaccurate ZIP code, so that process usage may be classified as Modify.

For any usage other than Reference, the organization can zero in on the activity step within the business process and determine if there is potential for introducing errors. This may lead to improvements in the business process or procedures.
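The classification described above can start as nothing more than a lookup table of business process to usage level. The sketch below uses the process names and access classifications from the examples in the text; the specific assignments (e.g., Registration as the creating process) are illustrative assumptions for one hypothetical organization.

```python
# Illustrative usage classification of patient demographic data per
# business process; assignments are hypothetical.
DEMOGRAPHIC_DATA_USAGE = {
    "Patient Registration": "Create",
    "Clinical Care": "Reference",  # providers read demographics but never change them
    "Laboratory": "Reference",
    "Claims": "Modify",            # e.g., correcting an inaccurate ZIP code
    "Billing": "Modify",
}

def processes_to_review(usage_map: dict) -> list:
    """Any usage other than Reference warrants review for error-introduction risk."""
    return [process for process, usage in usage_map.items()
            if usage != "Reference"]
```

Running `processes_to_review` over the map surfaces exactly the processes where the organization should zero in on the activity steps that can introduce errors.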

If the organization has multiple data stores containing patient demographic data, establishing the source-to-target(s) mapping is another highly useful activity. This consists of the identification of data elements at the point of origin, the identification of other data destinations, and the mapping of the representation in the source to the representation in the target(s). For example, the street address in a patient record as captured in registration may initially be stored as Street Address with a 60 character length limit; when transferred to another system, it may be stored as Patient Address with a 40 character length limit; and when transferred to still another system, it may be stored as Address with a 50 character length limit.
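A side effect of such mismatched length limits is silent truncation as data flows from source to target. The sketch below encodes the street address example as a small mapping table and flags targets that cannot hold the full source value; the system names are hypothetical.

```python
# Source-to-target mapping for the street address example; system and
# column names are hypothetical.
ADDRESS_MAPPINGS = [
    # (system, target column name, max length)
    ("Registration", "Street Address", 60),
    ("System B", "Patient Address", 40),
    ("System C", "Address", 50),
]

def truncation_risks(source_length: int, mappings) -> list:
    """Flag target systems whose length limit is shorter than the source's."""
    return [system for system, _column, limit in mappings
            if limit < source_length]
```

For a 60-character source field, both downstream systems would be flagged, surfacing the need for either a standardized length or an explicit truncation rule agreed through governance.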

Understanding where the data comes from, where it goes, and who can modify it is essential to effective prevention of defects and proactive efforts for data improvement. Over time, the organization is advised to map all business processes involving patient demographic data. Once established, mapping may be reviewed periodically and updated to reflect changes.

The data management function (or role), working with business experts, business process architects, and other stakeholders, often through a data working group (See Governance Management), is typically charged with facilitating the definition and verification of business process to data requirements. The data management function also typically develops and maintains data lifecycle management processes.

When data usage has been mapped to business processes and data has been traced from source to target(s), the organization can realize the following benefits:

  • Identify and reduce process and data bottlenecks;
  • Control redundancy through more accurate identification of duplicate records;
  • Minimize or eliminate unwanted changes to data content;
  • Improve consistency, reliability, and access to needed data;
  • Improve the ability to perform root cause analysis;
  • Trace data lineage across the patient demographic lifecycle; and
  • Improve management of historical data.

Defining Data Requirements

https://it.toolbox.com/blogs/craigborysowich/defining-data-requirements-041008

Define Entities

Together with the users assigned to the project team, prepare a high-level Conceptual Data Model of the entities of interest.  The Conceptual Data Model includes the following:

  • entities and entity descriptions;
  • preliminary attribute lists for each entity; and
  • relationships between entities.

The Conceptual Data Model describes the second highest level of detail in the life cycle.
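The three ingredients of the Conceptual Data Model can be captured in lightweight structures before any database design exists. The sketch below is one minimal way to record entities, preliminary attributes, and relationships; the example entities and attribute names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    """One entity in the Conceptual Data Model: name, description, attributes."""
    name: str
    description: str
    attributes: list = field(default_factory=list)  # preliminary attribute list

@dataclass
class Relationship:
    """A named relationship between two entities."""
    from_entity: str
    to_entity: str
    name: str

# Hypothetical model fragment
patient = Entity("Patient", "A person receiving care", ["First Name", "Birth Date"])
visit = Entity("Visit", "A patient encounter", ["Visit Date"])
has_visits = Relationship("Patient", "Visit", "has")
```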

Define Size and Volumes

For each entity, determine its estimated size, and current and projected volumes, including peak volumes.
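Size and volume estimates typically combine an average row size with current counts and a growth assumption. The following is a rough projection sketch; the compound-growth model and all sample figures are illustrative assumptions, not guidance on real sizing.

```python
def projected_storage_mb(row_bytes: int, current_rows: int,
                         annual_growth_rate: float, years: int) -> float:
    """Estimate storage (MB) after compound annual growth; inputs are illustrative."""
    projected_rows = current_rows * (1 + annual_growth_rate) ** years
    return projected_rows * row_bytes / 1_000_000

# e.g., 500-byte patient records, 200,000 rows today, growing 10% per year
estimate = projected_storage_mb(500, 200_000, 0.10, 5)
```

Peak volumes would be estimated separately (e.g., registrations per hour at the busiest site) since they drive throughput rather than storage.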

Define Data Retention Characteristics

For each entity, determine the retention period and archival requirements.  For historical data, determine the frequency of access and the type of data to be accessed.

Note any known backup or archiving requirements that would contribute significant network traffic.  An example might be the need to implement a distributed backup from a central site.
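Retention characteristics can likewise be recorded per entity in a simple policy table. The sketch below shows one possible shape for such a table with an archival check; every period and entity name is a hypothetical placeholder, not regulatory guidance.

```python
# Illustrative retention characteristics per entity; all values are
# hypothetical placeholders.
RETENTION_POLICY = {
    "Patient": {"retention_years": 10, "archive_after_years": 3,
                "historical_access": "monthly"},
    "Visit": {"retention_years": 7, "archive_after_years": 2,
              "historical_access": "rarely"},
}

def should_archive(entity: str, age_years: int, policy=RETENTION_POLICY) -> bool:
    """An instance older than its archive threshold moves to archival storage."""
    return age_years >= policy[entity]["archive_after_years"]
```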

Define Data Currency Requirements

Define how current the data must be.  For example, an executive information system may only require updates on a monthly or quarterly basis, whereas a banking system may require accurate information to the millisecond.  Data currency requirements may vary within systems for different types of data.

Define Data Security Requirements

Define the data security required for each entity, relative to different user classes.

Define Audit Requirements

Identify and document requirements, at the entity level, for audit trails and controls.

Tips and Hints

As soon as the requirements definition is underway, start to build the Traceability Matrix.

Reposted from blog.csdn.net/wwwpcstarcomcn/article/details/86677476