
Health Datapedia

With its expansive reach, health data covers many subject areas, from analytics and biomedical research to mobile health applications, policy, and the organizations that oversee it. Even for a veteran of the health data sphere, the terminology can be obscure or confusing. The Health Datapedia aims to clarify some of these commonly used terms.

Are we missing a definition? See a term that we did not include here? Let us know, and we will add it!
Last Updated: March 3, 2021



 

A

Accountable Care Organizations (ACOs): A network of doctors, hospitals or other healthcare providers that share financial and medical responsibility for providing coordinated care to patients in hopes of limiting unnecessary spending

AHRQ: Agency for Healthcare Research and Quality

All-Payer Claims Databases (APCDs): Databases created by state mandate that typically include data derived from medical claims, pharmacy claims, eligibility files, provider files, and dental claims from private and public payers. In states without a legislative mandate, there may be voluntary reporting of these data[17]

Anonymize: The process of removing all personal identifying information from data

Application Programming Interface (API): An interface implemented by a software program that enables it to interact with other software. APIs are important in the healthcare space because they allow third-party programmers (or innovators) to adapt existing legacy systems to evolving new systems

Armed Forces Health Longitudinal Technology Application (AHLTA): The current Department of Defense electronic medical records system

Attribution License: A license that requires an original source of licensed material to be cited

Authentication: Determining the identity of a principal

Authorization: Determining the rights of an authenticated principal
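
The difference between authentication and authorization can be illustrated with a minimal Python sketch. The user names, passwords, and permission names below are hypothetical and exist only for illustration; a real system would store salted password hashes and rely on an established identity framework rather than in-memory dictionaries.

```python
import hmac

# Hypothetical credential and permission stores, for illustration only.
CREDENTIALS = {"dr_lee": "correct horse", "clerk_kim": "battery staple"}
PERMISSIONS = {
    "dr_lee": {"read_record", "write_record"},
    "clerk_kim": {"read_schedule"},
}


def authenticate(username, password):
    """Authentication: determine the identity of a principal."""
    expected = CREDENTIALS.get(username)
    # compare_digest avoids leaking information through comparison timing.
    if expected is not None and hmac.compare_digest(
        expected.encode(), password.encode()
    ):
        return username
    return None


def authorize(principal, action):
    """Authorization: determine the rights of an authenticated principal."""
    return principal is not None and action in PERMISSIONS.get(principal, set())
```

In this sketch, authorize is only meaningful once authenticate has returned a principal: identity is established first, and rights are checked second.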

 

B

Big Data: The definition of “big data” is constantly evolving but it is often characterized by three “Vs”—volume, velocity and variety

Biobank: A large repository of tissue samples collected from patients over the course of a research study

Blue Button: A broad initiative across healthcare payers that seeks to empower patients by enabling them to access their own medical information in electronic form

Business Associate: Under HIPAA, a business associate is a person to whom a “covered entity” discloses PHI so that the business associate can carry out, assist with the performance of, or perform on behalf of the covered entity a function or activity. Business associates can include lawyers, auditors, data storage companies that maintain PHI, vendors of PHRs, and subcontractors that create, receive, maintain or transmit PHI on behalf of the business associate

C

CDC: Centers for Disease Control and Prevention

Clinical Data Research Networks (CDRNs): Networks that conduct randomized comparative effectiveness studies using data from clinical practice in a large, defined population

Clinical Decision Support: Providing relevant information to a clinician at the right time and place in order to provide ideal healthcare

Clinical Document Architecture (CDA): A markup standard developed by Health Level 7 (HL7) that defines the structure of clinical documents, including discharge summaries and progress notes. Clinical documents can include images, text and other forms of multimedia. CDA is based on XML

Cloud-based: Technology that allows software to be run and data to be stored on remote servers[1]

Connectivity: The ability of communities to connect to the Internet

CMS: Centers for Medicare and Medicaid Services

Common Rule: A federal rule of ethics that governs biomedical and behavioral research involving human subjects in the U.S.

Comparative effectiveness research: Research that informs clinical decisions by comparing evidence on the benefits, harms and effectiveness of various medical treatments

Covered Entit(ies): Includes a healthcare provider that conducts certain financial and administrative transactions electronically (e.g., billing, eligibility and funds transfers), a health plan or a healthcare clearinghouse. Should an individual, organization or agency meet this definition, it must comply with HIPAA’s privacy and security requirements

D

Data Access Protocol: A system that grants outsiders access to a database without overloading it.

Database rights: A right that prevents third parties from extracting and reusing content from a database. Common in many European jurisdictions.

Data-centric: A focus on specific data relevant to a given task

Data element access services: Services associated with crawling, indexing, security, identity, authentication, authorization and privacy

Data element indexing: A process and infrastructure for locating data elements

Data evangelist: A person who champions the benefits and use of data

Data Minimization: The concept that companies should limit the data they collect and retain, and dispose of it once they no longer need it[16]

Data mining: The computational process of discovering patterns in large data sets

Data provenance: The history of the data’s origin, ownership, use and subsequent modifications

Dataset(s): A collection of related sets of information that is composed of separate elements but can be manipulated as a unit by a computer

Data segmentation: The electronic labeling or tagging of a patient’s health information in a way that allows patients or providers to share portions, but not all, of a patient record

Data warehouse: A system used in computing for reporting and data analysis. Data is integrated from one or more sources to create a central repository of data

De-identified data: Data that lacks a patient’s identifying information, although it remains possible to provide information back to the patient under specified circumstances

Digital breadcrumbs: Small, often unconnected pieces of information that are produced by mobile phones, the Internet and other trackable daily activities such as shopping or commuting[2]

Digital signature: A cryptographic method for ensuring that data cannot be altered without detection by anyone except the person who created the data

E

Electronic Health Record (EHR): An electronic record of a patient’s health-related information. An EHR may contain information from clinical visits, lab and imaging studies or other information pertaining to a patient’s medical history

Electronic Medical Record (EMR): An electronic record of a patient’s medical and clinical record gathered in one provider’s office

Electronic Protected Health Information (ePHI): Information in electronic form that concerns the health status, provision or payment for healthcare and that can be linked back to a specific individual

Encryption: The technology of making data indecipherable except for the person with the corresponding “key”

Enterprise Architecture: Defined in the JASON Report as the way a specific enterprise’s business processes are organized

F

FDA: Food and Drug Administration

Fast Healthcare Interoperability Resources (FHIR): The latest standard developed by the HL7 organization. FHIR defines a set of “resources,” granular clinical concepts that can be maintained independently or aggregated into complex documents, which offers some flexibility in addressing interoperability problems.

G

Genome: Genetic material of an organism

Genotype: Genetic makeup of a specific human being

H

Health Data Enclaves: A “secure environment that allows for remote access to confidential data. The environment is firewalled from outside intrusion, only accessible to authorized users and all information outflow and inflow is controlled and monitored by experienced confidentiality officers.”[3] A health data enclave could not only help aggregate siloed health data but allow stakeholders to utilize powerful analytic tools to assess large health datasets in a secure, HIPAA-compliant setting

Health Information Exchange (HIE): The exchange of electronic health information across organizations within a community, region or hospital system

Health Information Technology (HIT): Technologies that manage and transmit health information for use by providers, payers, consumers and other stakeholders

HHS: Department of Health and Human Services

Health Insurance Portability and Accountability Act (HIPAA): Enacted by Congress in 1996, HIPAA was designed to guarantee the availability and renewability of health insurance coverage as well as limit restrictions on pre-existing conditions. HIPAA also contained tax provisions relating to health insurance and provisions requiring HHS to issue standards that would facilitate the electronic transmission of health information without compromising patient privacy

HIPAA Privacy Rule: Establishes national standards to protect patients’ medical records and other personal health information. The Rule applies to health plans, clearinghouses and healthcare providers that conduct certain healthcare transactions electronically. The Rule also requires that proper safeguards are taken to protect the privacy of personal health information and sets limits and conditions on the use and disclosure of such information without patient authorization. The Rule also enables patients to have rights over their health information including the ability to examine and obtain a copy of their own medical records and to request corrections

HIPAA Security Rule: Establishes national standards to protect an individual’s electronic personal health information that is created, used, received or maintained by a covered entity. The Rule also requires appropriate administrative, physical and technical safeguards to ensure the confidentiality, integrity and security of electronic protected health information. Examples where the Security Rule would apply include electronic storage media, computer memory devices, and media used to exchange information already in electronic storage media (including email and internet)

HITECH Act: Also known as the Health Information Technology for Economic and Clinical Health (HITECH) Act. Enacted as part of the American Recovery and Reinvestment Act, the HITECH Act authorized at least $20 billion for the adoption and use of interoperable EHRs

HL7: A set of international standards developed to enable the transfer of clinical and administrative data between hospital information systems

Human Genome Project (HGP): An international research project dedicated to determining the sequence of chemical base pairs that make up human DNA and mapping all of the genes of the human genome. The Project was declared complete in April 2003

I

ICD-10: The tenth revision of the International Statistical Classification of Diseases and Related Health Problems, a medical classification list by the World Health Organization (WHO). ICD-10 contains, among other things, codes for diseases, symptoms, abnormal findings, complaints and injuries

Information Asset Registers (IARs): Registers set up to capture and organize metadata about the vast quantities of information held by government departments and agencies. A comprehensive IAR includes, among other items, databases, old sets of files, recent electronic files, collections of statistics, and research[4]

Integration engines: An application of a universal exchange language that can assist data exchanges between personal health records and other types of EHRs in a cloud

Internet of Things: The ability of everyday objects to connect to the Internet and to send and receive data

Interoperability: The extent to which systems and devices can exchange and interpret data in such a way that it can be understood by the user[5]

IOM: Institute of Medicine

Institutional Review Board (IRB): A committee that is tasked with reviewing, monitoring and approving research on human subjects, including research on information gathered from a patient’s medical records[6]

J

JASON Report: A report authored by the MITRE Corporation in November 2013 that suggested the current lack of interoperability among EHRs is a substantial impediment to the free exchange of health data and the development of a robust health data infrastructure. The Report also found that the problem of interoperability can only be solved by establishing a comprehensive, transparent and overarching software architecture for health information

K

Key: A piece of data that can unlock and make readable cryptographically protected information

L

Learning Healthcare System: Harnessing patient data and analytics to learn about the best treatment for each patient and feeding that knowledge back to providers and other stakeholders to create cycles of continuous improvement[7]

Legacy system: An old method, technology or computer system that is used instead of available upgraded versions

Limited Data Sets: Protected health information (PHI) that is less regulated by HIPAA because it excludes direct identifiers of a patient with certain exceptions including such information as city, state, or zip code

M

Machine-readable: Data or metadata that is in a format so that it can be read by a computer

Markup Language(s): Designed for the processing, definition and presentation of text. A markup language specifies code within a text file for formatting (including layout and style). XML is a common example of a markup language

Meaningful Use: Using certified EHR technology to improve quality, safety and efficiency in healthcare, engage patients and caregivers, improve care coordination and population and public health, as well as maintain the privacy and security of a patient’s health information[8]

Metadata: Information that characterizes data, including contextual information

Metadata tag: A tag accompanying each piece of data that describes the attributes, provenance and required security protections of that piece of information

mHealth: The practice of medicine and public health that is supported by mobile devices

Middleware: Software that extracts and reformats data elements from existing clinical systems

N

National Patient-Centered Clinical Research Network (PCORNet): Created in December 2013 and funded by PCORI, PCORNet consists of 11 clinical data research networks and 18 patient-powered research networks that are dedicated to improving comparative effectiveness research by integrating data from these various networks and creating networks for conducting clinical outcomes research

NIH: National Institutes of Health

NIH Big Data to Knowledge (BD2K): An initiative by the NIH that is focused on empowering biomedical scientists to capitalize on the big data that is generated by the research community. BD2K is focused on developing methods, standards, tools and software that will improve the use of big data in the biomedical research community by supporting a number of initiatives including research and training in data science

NIST: National Institute of Standards and Technology

O

ONC: Office of the National Coordinator for Health Information Technology

OpenFDA: An initiative by the FDA that provides public APIs. OpenFDA is also exploring the possibility of offering download access to a number of datasets, including adverse events, drug product labeling and recall enforcement reports

Open Data: Data that can be used for any purpose

Open Data Commons: A type of data that is available to all. Such data can include an obligation of acknowledging the source of the data. However, the data may be private, commercial or government controlled[9]

Open Health Data: Publicly available data that can be accessed, downloaded or utilized without further requirements or stipulations of use by the data holder

Open Government Data: Open data that is produced by the government. Generally, it is data that is gathered during the course of business activities and that does not identify individuals or breach commercial sensitivity

Open Standards: Commonly understood to mean technical standards that are free from licensing restrictions. Can also denote standards that are developed in a vendor-neutral manner

P

Patient-Centered Outcomes Research Institute (PCORI): Authorized by the Affordable Care Act, PCORI is an independent nonprofit organization dedicated to funding comparative clinical effectiveness research. PCORI’s main goal is to assist patients, clinicians, payers and policymakers in making better informed decisions about healthcare as well as improving delivery of care and outcomes

Patient-centric: Healthcare organized around the needs and requirements of a patient

Patient-Generated Data: Health-related data that is created, collected, or recorded by a patient and/or their designee to help address a health concern[10]

Patient-powered Research Networks (PPRN): Networks governed and operated by patients and/or caregivers, that are focused on a particular condition and are interested in sharing health information and participating in research

Personal Health Record (PHR): An electronic record of health information that is maintained or managed by the patient

Personal/Proprietary Data: Data that is controlled by an individual, commercial entity or non-government institution that has the desire or legal right to restrict access to and use of the data[11]

Personalization: Tailoring medical care to meet the unique characteristics of an individual patient

Personally Identifiable Information (PII): Any information that can be used to identify, contact or locate an individual either by itself or in combination with other accessible sources

Population Health: “The health outcomes of a group of individuals, including the distribution of such outcomes within the group”[12]

Post-marketing surveillance: A system that identifies adverse events that did not arise during the drug or device approval process

Predictive analytics: A variety of statistical techniques including modeling, machine learning, and data mining that analyze current and historical data to make forecasts about future, unknown events in real-time

Primary care medical home (PCMH): A care model where the patient and primary care physician are the center of a virtual organization which is paid a fee to coordinate the care a patient receives from specialists and other providers

Protected Health Information (PHI): Information concerning the health status, provision or payment for healthcare that can be linked back to a specific individual

Q

Quantified Self: A movement dedicated to acquiring self-knowledge through self-tracking using technology

R

Randomized clinical trials: A clinical trial where participants are randomly assigned to different forms of treatment

Regional Extension Centers (RECs): Established as part of the American Recovery and Reinvestment Act and overseen by the Office of the National Coordinator for Health IT. RECs were established to help critical access hospitals with IT implementation

Regional Health Information Organizations (RHIO): An organization that brings together healthcare stakeholders within a defined geographic area and governs health information exchange among them to improve health and the delivery of care within that community. (Also known as a Health Information Exchange Organization (HIO))

S

Semantics: The clinical or operational meaning of data

Service-oriented architecture: An approach to health IT that uses software policies, practices and frameworks to allow a user to access sets of “services” on another party’s computers and data

Share-alike License: A license that requires users of a work to provide the content under the same or similar conditions as the original.[13]

Social-sharing paradox: Consumers sharing data but expecting others to protect their privacy

Software Architecture: Referred to in the JASON Report as the collective components of a software system that interact in specified ways and across specified interfaces to ensure specified functionality[14]

Standardized health records: Health records that follow a standardized format and can be accessed by all necessary parties

Syndromic surveillance: A type of surveillance using health-related data that precedes diagnosis and signals a sufficient probability of a case or outbreak to warrant a public health response

Syntax: The formatting of data that are exchanged, as well as the details of the exchange protocols, including privacy protection

T

Tab-separated values: A common text file format for sharing tabular data. The format is very simple and machine readable
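
As an illustration of how simple the format is to process, the following minimal Python sketch parses tab-separated text with the standard library's csv module. The column names and values are hypothetical sample data, not drawn from any real dataset.

```python
import csv
import io

# Hypothetical sample data: one header row, then one record per line,
# with columns separated by tab characters.
SAMPLE_TSV = (
    "patient_id\tvisit_date\tsystolic_bp\n"
    "P001\t2021-01-15\t128\n"
    "P002\t2021-01-16\t117\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of dictionaries keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


rows = read_tsv(SAMPLE_TSV)
```

Every value is read as a string, so numeric columns such as systolic_bp would need explicit conversion before analysis.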

Tagged data elements: Data accompanied by metadata describing the attributes and privacy protections of the data

Trust network: A form of data sharing that combines a computer network that keeps track of user permissions for each piece of personal data with a legal contract that specifies what can and cannot be done with the data and the potential ramifications if these permissions are breached[15]

Two-factor authentication: The use of two of the following to determine the identity of a principal: physical credentials (e.g.—smart cards), biometrics (e.g.—fingerprints), or a secret (e.g.—password)
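
The requirement that the two factors come from different categories can be sketched in Python as follows. The credential names and the mapping to categories are hypothetical and chosen only to mirror the three categories in the definition above; the sketch assumes each credential has already been verified elsewhere.

```python
# Hypothetical mapping from credential type to factor category.
FACTOR_CATEGORY = {
    "smart_card": "physical credential",
    "fingerprint": "biometric",
    "password": "secret",
    "pin": "secret",
}


def is_two_factor(verified_credentials):
    """Return True when the verified credentials span at least two distinct categories."""
    categories = {
        FACTOR_CATEGORY[c] for c in verified_credentials if c in FACTOR_CATEGORY
    }
    return len(categories) >= 2
```

Under this sketch, a password combined with a fingerprint qualifies, while a password combined with a PIN does not, because both are secrets.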

U

Universal exchange language: A common language and format in which all electronic health systems can exchange data

Usability: The ease with which clinicians can learn to use EHRs, capture data from clinical encounters and in turn make use of the data to improve the delivery of care

V

Value-based purchasing: The idea that buyers (e.g.—patients) should hold healthcare providers accountable for the cost and quality of care

VistA: An integrated system of software applications that supports patient care at the Veterans Health Administration

W

Web API: An API that is devised to work over the Internet

X

XML: A set of rules for encoding documents in machine readable format

Y

Z

 


[17] Love, D., Custer, W., Miller, P. (2010). All-Payer Claims Databases: State Initiatives to Improve Health Care Transparency. The Commonwealth Fund. http://commonwealthfund.org/~/media/Files/Publications/Issue%20Brief/2010/Sep/1439_Love_allpayer_claims_databases_ib_v2.pdf.
[1] President’s Council of Advisors on Science and Technology. (2010). Realizing the Full Potential of Health Information Technology to Improve Healthcare for Americans: The Path Forward. Washington, DC.
[16] Federal Trade Commission. (2015). Internet of Things: Privacy & Security in a Connected World.
[2] Heitmueller et al., (2014). Developing Public Policy to Advance the Use of Big Data in Health Care. Health Affairs, 33(9), 1523.
[3] Hair, Elizabeth. “Accessing CMS Claims Records: Data Enclaves as a Virtual RDC.” NORC at University of Chicago. PowerPoint presentation. Bethesda, MD.
[4] Dietrich, D., Gray J., McNamara, T., Poikola, A., Pollock, R., Tait, J., & Zijlstra, T., (2014). California Open Health Databook, 50.
[5] “What is Interoperability?,” Resource Library, HIMSS, http://www.himss.org/library/interoperability-standards/what-is
[6] 45 C.F.R. §§ 46.102, 46.107-109 (2010).
[7] “Learning Health System Initiative,” Health Informatics, University of Michigan, http://healthinformatics.umich.edu/lhs, (2013)
[8] “Meaningful Use Definition & Objectives,” EHR Incentives & Certification, Office of the National Coordinator for Health IT, http://www.healthit.gov/providers-professionals/meaningful-use-definition-objectives, (March 18, 2021).
[9] Heitmueller et al., (2014). Developing Public Policy to Advance the Use of Big Data in Health Care. Health Affairs, 33(9), 1525.
[10] Howie, L., Hirsch, B., Locklear, T., & Abernethy, A. (2014). Assessing the Value of Patient-Generated Data to Comparative Effectiveness Research. Health Affairs, 32(8), 1222.
[11] Heitmueller et al., (2014). Developing Public Policy to Advance the Use of Big Data in Health Care. Health Affairs, 33(9), 1524.
[12] Kindig, D., Stoddart, G., (2003). What is Population Health? American Journal of Public Health, 93(3), 380-3.
[13] Dietrich, D., Gray J., McNamara, T., Poikola, A., Pollock, R., Tait, J., & Zijlstra, T., (2014). California Open Health Databook, 51.
[14] Agency for Healthcare Research and Quality. (2014). A Robust Health Data Infrastructure (AHRQ Publication No. 14-0041-EF). Rockville, MD.
[15] Bilbao-Osorio, B., Dutta, S., Lanvin, B., editors. (2014). The Global Information Technology Report 2014: Rewards and Risks of Big Data. http://www3.weforum.org/docs/WEF_GlobalInformationTechnology_Report_2014.pdf.