Data and AI Glossary

The Data Glossary defines essential data-related terms, with definitions sourced from authoritative literature, relevant web resources, and trusted materials. It will be updated regularly to ensure comprehensive coverage of all relevant concepts.

Access Controls: Access Control refers to the process of granting or denying specific requests to 1) obtain and use information and related information processing services and 2) enter specific physical facilities (e.g., federal buildings, military establishments, border crossing entrances). https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.201-3.pdf 

Artificial Intelligence (AI): Artificial Intelligence refers to a branch of computer science devoted to developing data processing systems that perform functions normally associated with human intelligence, such as reasoning, learning, and self-improvement. https://www.state.gov/artificial-intelligence/#:~:text=Artificial%20Intelligence%20and%20Society&text=%E2%80%9CThe%20term%20’artificial%20intelligence’,influencing%20real%20or%20virtual%20environments.%E2%80%9D

Data Asset: Data Asset refers to any entity that is comprised of data. For example, a database is a data asset that is comprised of data records. A data asset may be a system or application output file, database, document, or web page. A data asset also includes a service that may be provided to access data from an application. https://csrc.nist.gov/glossary/term/data_asset

Data Catalog: Data Catalog refers to an organized inventory of data assets in the organization. It uses metadata to help organizations manage their data. It also helps data professionals collect, organize, access, and enrich metadata to support data discovery and governance. https://www.oracle.com/big-data/data-catalog/what-is-a-data-catalog/ 

Data Classification: Data Classification refers to the assignment of a level of sensitivity to data that results in the specification of controls for each level of classification. Levels are assigned according to predefined categories as data are created, amended, enhanced, stored or transmitted. https://www.isaca.org/resources/glossary 
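
As a rough illustration of how an assigned sensitivity level can result in a set of specified controls, the Python sketch below uses hypothetical level names and control lists. None of these names come from the cited source; a real scheme would follow the organization's own categories.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical sensitivity levels, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


# Hypothetical controls specified for each level (illustrative only).
CONTROLS = {
    Sensitivity.PUBLIC: ["integrity checks"],
    Sensitivity.INTERNAL: ["integrity checks", "authenticated access"],
    Sensitivity.CONFIDENTIAL: [
        "integrity checks",
        "authenticated access",
        "encryption at rest",
    ],
}


def required_controls(level: Sensitivity) -> list[str]:
    """Return the controls that the assigned level specifies."""
    return CONTROLS[level]
```

Under these assumptions, calling `required_controls(Sensitivity.INTERNAL)` returns the two controls listed for that level.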

Data Cleansing: Data cleansing refers to the process of identifying and correcting data that are inaccurate, missing, or incomplete. https://www.nnlm.gov/guides/data-glossary/data-cleaning
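
A minimal sketch of the idea in Python, assuming records are dictionaries with hypothetical `name` and `age` fields and an assumed valid age range of 0 to 120. It trims stray whitespace, drops records with no name, and flags implausible ages instead of guessing a replacement value.

```python
def cleanse(records):
    """Return (clean_records, issues) for a list of dicts with 'name' and 'age'."""
    clean, issues = [], []
    for i, rec in enumerate(records):
        # Correct a common inaccuracy: leading or trailing whitespace in names.
        name = (rec.get("name") or "").strip()
        age = rec.get("age")
        if not name:
            # A record without a name is incomplete; report it and skip it.
            issues.append((i, "missing name"))
            continue
        if not isinstance(age, int) or not 0 <= age <= 120:
            # Keep the record but blank the age so it can be followed up.
            issues.append((i, "inaccurate or missing age"))
            age = None
        clean.append({"name": name, "age": age})
    return clean, issues
```

The function returns the issues alongside the cleaned records so that corrections can be reviewed rather than applied silently.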

Data Governance: Data Governance refers to setting direction on data use through prioritization and decision making, and ensuring alignment with agreed-on direction and objectives. https://www.isaca.org/resources/glossary

Data Privacy: Data privacy refers to freedom from intrusion into the private life or affairs of an individual when that intrusion results from undue or illegal gathering and use of data about that individual. https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-188.pdf 

Data Quality: Data Quality refers to the overall characteristic of data reflecting its fitness for use, including accuracy, completeness, timeliness, consistency, accessibility, and validity. https://www.govinfo.gov/content/pkg/PLAW-106publ554/html/PLAW-106publ554.htm 

Data Steward: Data Steward refers to an agency official with statutory or operational authority for specified data and responsibility for establishing the controls for its generation, collection, processing, dissemination, and disposal. https://csrc.nist.gov/glossary/term/information_steward 

De-identification: De-identification refers to any process of removing the association between a set of identifying data and the data subject. https://csrc.nist.gov/glossary/term/de_identification

Equitable Data: Equitable data refers to data that allows for rigorous assessment of the extent to which government programs and policies yield consistently fair, just, and impartial treatment of all individuals. https://www.whitehouse.gov/wp-content/uploads/2022/04/eo13985-vision-for-equitable-data.pdf

Machine Learning (ML): Machine learning (ML) is a field within artificial intelligence that focuses on the ability of computers to learn from provided data without being explicitly programmed for a particular task. https://www.nccoe.nist.gov/ai/adversarial-machine-learning

Metadata: Metadata refers to information describing the characteristics of data, including, for example, structural metadata describing data structures (e.g., data format, syntax, and semantics) and descriptive metadata describing data contents (e.g., information security labels). https://csrc.nist.gov/glossary/term/metadata
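
The distinction between structural and descriptive metadata can be sketched as a small Python data structure. All class names, field names, and sample values below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StructuralMetadata:
    """Describes how the data is structured (format and column types)."""
    data_format: str
    columns: dict[str, str]  # column name -> data type


@dataclass
class DescriptiveMetadata:
    """Describes what the data contains (title, labels, keywords)."""
    title: str
    security_label: str
    keywords: list[str] = field(default_factory=list)


@dataclass
class DatasetMetadata:
    """Bundles both kinds of metadata for one data asset."""
    structural: StructuralMetadata
    descriptive: DescriptiveMetadata


# Hypothetical sample record for a fictional dataset.
example = DatasetMetadata(
    structural=StructuralMetadata(
        data_format="CSV",
        columns={"station_id": "string", "reading": "float"},
    ),
    descriptive=DescriptiveMetadata(
        title="Sample sensor readings",
        security_label="internal",
        keywords=["sensors", "sample"],
    ),
)
```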

Personally Identifiable Information (PII): PII refers to information that can be used to uniquely identify a person within a given context. https://www.dol.gov/general/ppii#:~:text=Further%2C%20PII%20is%20defined%20as,with%20other%20data%20elements%2C%20i.e.%2C

Pseudonymization: Pseudonymization refers to a de-identification technique that replaces an identifier (or identifiers) for a data principal with a pseudonym in order to hide the identity of that data principal. https://csrc.nist.gov/glossary/term/pseudonymization
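
One common way to derive a pseudonym is keyed hashing. The Python sketch below uses HMAC-SHA256 from the standard library; this technique is an assumption made for illustration, not something the cited definition prescribes, and the function name, `p_` prefix, and 16-character truncation are hypothetical choices.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a stable pseudonym derived from a secret key."""
    # The same identifier and key always yield the same pseudonym, so records
    # can still be linked without exposing the original identifier.
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncate for readability (illustrative choice; shorter values raise collision risk).
    return "p_" + digest[:16]
```

Anyone holding the key can recompute the pseudonym for a known identifier, so in this sketch the key must be protected as carefully as the identifiers themselves.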

Risks: Risks refer to the extent to which an entity is threatened by a potential circumstance or event, and are typically a function of: (i) the adverse impacts that would arise if the circumstance or event occurs; and (ii) the likelihood of occurrence. https://csrc.nist.gov/glossary/term/risk
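
The definition leaves the exact function open. As one simple illustration, the Python sketch below multiplies an impact rating by a likelihood rating, each on an assumed 1 to 5 scale. Both the multiplication and the scale are assumptions for this example, not part of the cited definition.

```python
def risk_score(impact: int, likelihood: int) -> int:
    """Combine 1-5 impact and likelihood ratings by multiplication (assumed convention)."""
    for value in (impact, likelihood):
        if not 1 <= value <= 5:
            raise ValueError("ratings must be between 1 and 5")
    return impact * likelihood
```

Under this convention, a high-impact but unlikely event (impact 5, likelihood 1) scores lower than a moderate event that is very likely (impact 3, likelihood 5).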

Remediation: Remediation refers to the neutralization or elimination of a vulnerability or the likelihood of its exploitation. https://csrc.nist.gov/glossary/term/remediation

Sensitivity: Sensitivity refers to the potential harm that could result from unauthorized access, disclosure, modification, or destruction of information. https://csrc.nist.gov/glossary/term/sensitive

Spatial data: Spatial data, also known as geospatial data, refers to information that explicitly describes the location, shape, and relationships of geographic features and phenomena. https://geographicbook.com/types-of-spatial-data/

Transmission: Transmission refers to the movement of information across a communication channel. https://csrc.nist.gov/glossary/term/transmission