Data and AI Glossary
Access Controls: Access Control refers to the process of granting or denying specific requests to 1) obtain and use information and related information processing services and 2) enter specific physical facilities (e.g., federal buildings, military establishments, border crossing entrances). https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.201-3.pdf
Artificial Intelligence (AI): Artificial Intelligence refers to a branch of computer science devoted to developing data processing systems that perform functions normally associated with human intelligence, such as reasoning, learning, and self-improvement. https://www.state.gov/artificial-intelligence/#:~:text=Artificial%20Intelligence%20and%20Society&text=%E2%80%9CThe%20term%20’artificial%20intelligence’,influencing%20real%20or%20virtual%20environments.%E2%80%9D
Data Asset: Data Asset refers to any entity that is comprised of data. For example, a database is a data asset that is comprised of data records. A data asset may be a system or application output file, database, document, or web page. A data asset also includes a service that may be provided to access data from an application. https://csrc.nist.gov/glossary/term/data_asset
Data Catalog: Data Catalog refers to an organized inventory of data assets in the organization. It uses metadata to help organizations manage their data. It also helps data professionals collect, organize, access, and enrich metadata to support data discovery and governance. https://www.oracle.com/big-data/data-catalog/what-is-a-data-catalog/
Data Classification: Data Classification refers to the assignment of a level of sensitivity to data that results in the specification of controls for each level of classification. Levels are assigned according to predefined categories as data are created, amended, enhanced, stored or transmitted. https://www.isaca.org/resources/glossary
Data Cleansing: Data cleansing refers to the process of identifying and correcting data that are inaccurate, missing, or incomplete. https://www.nnlm.gov/guides/data-glossary/data-cleaning
Data Governance: Data Governance refers to setting direction on data use through prioritization and decision making, and ensuring alignment with agreed-on direction and objectives. https://www.isaca.org/resources/glossary
Data Privacy: Data privacy refers to freedom from intrusion into the private life or affairs of an individual when that intrusion results from undue or illegal gathering and use of data about that individual. https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-188.pdf
Data Quality: Data Quality refers to the overall characteristic of data reflecting its fitness for use, including accuracy, completeness, timeliness, consistency, accessibility, and validity. https://www.govinfo.gov/content/pkg/PLAW-106publ554/html/PLAW-106publ554.htm
Data Steward: Data Steward refers to an agency official with statutory or operational authority for specified data and responsibility for establishing the controls for its generation, collection, processing, dissemination, and disposal. https://csrc.nist.gov/glossary/term/information_steward
De-identification: De-identification refers to any process of removing the association between a set of identifying data and the data subject. https://csrc.nist.gov/glossary/term/de_identification
Equitable Data: Equitable data refers to data that allows for rigorous assessment of the extent to which government programs and policies yield consistently fair, just, and impartial treatment of all individuals. https://www.whitehouse.gov/wp-content/uploads/2022/04/eo13985-vision-for-equitable-data.pdf
Machine Learning (ML): Machine learning (ML) is a field within artificial intelligence that focuses on the ability of computers to learn from provided data without being explicitly programmed for a particular task. https://www.nccoe.nist.gov/ai/adversarial-machine-learning
Metadata: Metadata refers to information describing the characteristics of data, including, for example, structural metadata describing data structures (e.g., data format, syntax, and semantics) and descriptive metadata describing data contents (e.g., information security labels). https://csrc.nist.gov/glossary/term/metadata
Personally Identifiable Information (PII): PII refers to information with the purpose of uniquely identifying a person within a given context. https://www.dol.gov/general/ppii#:~:text=Further%2C%20PII%20is%20defined%20as,with%20other%20data%20elements%2C%20i.e.%2C
Pseudonymization: Pseudonymization refers to a de-identification technique that replaces an identifier (or identifiers) for a data principal with a pseudonym in order to hide the identity of that data principal. https://csrc.nist.gov/glossary/term/pseudonymization
Risks: Risks refer to the extent to which an entity is threatened by a potential circumstance or event, and are typically a function of: (i) the adverse impacts that would arise if the circumstance or event occurs; and (ii) the likelihood of occurrence. https://csrc.nist.gov/glossary/term/risk
Remediation: Remediation refers to the neutralization or elimination of a vulnerability or the likelihood of its exploitation. https://csrc.nist.gov/glossary/term/remediation
Sensitivity: Sensitivity refers to the potential harm that could result from unauthorized access, disclosure, modification, or destruction of information. https://csrc.nist.gov/glossary/term/sensitive
Spatial Data: Spatial data, also known as geospatial data, refers to information that explicitly describes the location, shape, and relationships of geographic features and phenomena. https://geographicbook.com/types-of-spatial-data/
Transmission: Transmission refers to the movement of information across a communication channel. https://csrc.nist.gov/glossary/term/transmission
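
Illustrative example (Pseudonymization): The Pseudonymization entry above describes replacing an identifier with a pseudonym to hide the identity of a data principal. The following is a minimal sketch of one possible approach, a keyed hash (HMAC-SHA256) from the Python standard library. It is not drawn from the cited sources. The key value, the "p_" prefix, the 16-character truncation, the function names, and the sample record are all hypothetical choices made for illustration.

```python
import hashlib
import hmac

# Hypothetical secret key for illustration only; a real key would be
# generated securely and stored separately from the data it protects.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable pseudonym for an identifier using HMAC-SHA256."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncated to 16 hex characters for readability (an assumption, not a rule).
    return "p_" + digest[:16]


def pseudonymize_record(record: dict, id_fields: tuple) -> dict:
    """Copy a record, replacing each listed identifier field with a pseudonym."""
    return {
        field: pseudonymize(str(value)) if field in id_fields else value
        for field, value in record.items()
    }


# Hypothetical sample record; only the listed identifier fields are replaced.
record = {"name": "Jane Doe", "email": "jane@example.com", "zip": "20500"}
safe = pseudonymize_record(record, id_fields=("name", "email"))
print(safe)
```

Because the same key and input always yield the same pseudonym, records about one person can still be linked to each other without exposing the original identifier. Fields left unchanged (such as "zip" here) may still contribute to re-identification, so this sketch alone would not amount to full de-identification.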