NIST AI 600-1. Generative AI Risks Governed by Design.
NIST AI 600-1 Overlay
The Generative AI Profile published by NIST in 2024 as a companion to the AI Risk Management Framework (AI 100-1). Twelve GenAI-specific risk categories mapped to actionable controls. Content provenance, training data governance, model security, adversarial resilience, and deployment monitoring. Continuous evidence collection from connected infrastructure. Overlay structure that extends your existing AI RMF posture with generative AI specificity.
NIST AI 600-1 Overlay
Generative AI introduces risks that general-purpose frameworks were not designed to address.
NIST AI 600-1 is the Generative AI Profile: a structured overlay to the AI Risk Management Framework that identifies twelve risk categories unique to generative AI systems. Traditional information security controls address infrastructure, access, and data protection. They do not address confabulation, content provenance, training data contamination, or the emergent behaviors that arise when foundation models generate novel outputs. AI 600-1 fills that gap with risk-specific guidance that maps directly to the Govern, Map, Measure, and Manage functions of the AI RMF. This overlay makes those controls assessable, evidenced, and continuously monitored.
NIST AI 600-1, formally titled the Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile, was published in July 2024 as a companion document to the AI Risk Management Framework (NIST AI 100-1). Where the AI RMF provides a broad, technology-neutral structure for managing risks across all categories of artificial intelligence, AI 600-1 narrows its focus exclusively to generative AI systems: large language models, image generators, code synthesizers, multimodal foundation models, and any system that produces novel content based on learned patterns from training data. The profile does not replace the AI RMF. It extends it. Every action item in AI 600-1 maps to a specific function and category within the AI RMF's four core functions: Govern, Map, Measure, and Manage. Organizations that have already implemented the AI RMF gain a structured extension for their generative AI deployments. Organizations starting fresh gain a focused entry point that connects to the broader framework.
The document emerged from a recognition that generative AI systems introduce a class of risks that prior frameworks did not anticipate. The AI RMF addresses bias, transparency, accountability, and reliability in general terms applicable to classification systems, recommendation engines, and predictive models. Generative AI creates new categories of concern. A language model can fabricate citations that appear authoritative. An image generator can produce synthetic media indistinguishable from photographs. A code synthesizer can reproduce copyrighted code from its training data without attribution. These are not edge cases. They are inherent characteristics of how generative models operate: they learn statistical patterns from massive corpora and produce outputs that recombine those patterns in novel ways. The boundary between learned knowledge and fabricated content is not architecturally defined in most foundation models. AI 600-1 provides the structured risk taxonomy that organizations need to identify, measure, and manage these generative-specific behaviors.
The profile is organized around twelve risk categories, each describing a distinct class of harm that generative AI systems can produce or amplify. For each category, AI 600-1 identifies the risk, describes how it manifests in generative systems, and provides suggested actions mapped to the AI RMF's Govern, Map, Measure, and Manage functions. The suggested actions are not prescriptive technical requirements in the way that NIST 800-53 controls specify exact configurations. They are risk management activities: establishing policies, implementing monitoring, measuring outcomes, and managing identified risks through organizational processes and technical controls. This structure makes AI 600-1 an overlay rather than a standalone framework. It modifies and extends the AI RMF with generative-specific content, the same way a DISA STIG overlays a Security Requirements Guide with product-specific implementation details. The relationship is additive. AI 600-1 does not override anything in the AI RMF. It adds specificity where the base framework's general guidance is insufficient for the unique characteristics of generative systems.
Organizations deploying generative AI systems face a governance gap. Traditional information security frameworks address the infrastructure that hosts the model: access controls, encryption, network segmentation, audit logging, vulnerability management. These controls protect the system from external threats. They do not address the risks that originate from the model itself. A large language model integrated into a customer-facing application can generate false statements with the same confidence and formatting as accurate ones. An image generation system can produce content that infringes on intellectual property without any mechanism to detect or prevent the infringement. A code generation model can suggest implementations containing security vulnerabilities copied from its training data. These risks exist independently of whether the hosting infrastructure is properly secured. An organization can achieve perfect scores on every NIST 800-53 control and still deploy a generative AI system that fabricates medical guidance, generates misleading financial analysis, or produces content that violates regulatory requirements.
The pace of generative AI adoption has outstripped the pace of governance development. Organizations integrate foundation models into production workflows before establishing policies for acceptable use, output validation, or incident response specific to AI-generated content. Engineering teams deploy retrieval-augmented generation pipelines without defining how hallucinated content will be detected and handled. Product teams launch customer-facing features powered by language models without establishing monitoring for harmful, biased, or factually incorrect outputs. When incidents occur, there is no playbook. When regulators ask how AI-generated content is governed, there is no documented framework. When customers ask how the organization ensures the accuracy of AI-produced outputs, there is no evidence of systematic quality controls. The absence of a generative-AI-specific governance structure means that each team makes ad hoc decisions about risk tolerance, output filtering, and quality assurance. Those decisions are inconsistent, undocumented, and invisible to leadership until something fails publicly.
The consequences of ungoverned generative AI deployment are materializing across industries. Organizations face regulatory scrutiny from agencies that are actively developing enforcement positions on AI transparency, accuracy, and fairness. Legal exposure from AI-generated content that infringes intellectual property, disseminates false information, or produces discriminatory outputs is not theoretical; litigation is underway in multiple jurisdictions. Reputational damage from AI systems that produce harmful or embarrassing content spreads at the speed of social media. Supply chain risks emerge when organizations integrate third-party foundation models without understanding their training data provenance, bias characteristics, or update cadence. The cost of retroactive governance, implementing controls after an incident forces the issue, exceeds the cost of proactive governance by orders of magnitude. Incident response without a pre-established framework means every decision is made under pressure, without precedent, and without the organizational muscle memory that comes from having practiced the process. AI 600-1 provides the structured risk taxonomy and suggested actions that transform reactive scrambles into systematic governance.
AI 600-1 defines twelve risk categories that collectively describe the threat surface unique to generative AI systems. CBRN Information addresses the risk that generative models can synthesize or provide access to information about chemical, biological, radiological, and nuclear threats that could lower barriers to harm. Confabulation covers the generation of false content presented as fact: fabricated citations, invented statistics, fictional events described with authoritative confidence. This is distinct from general inaccuracy because the outputs are structurally indistinguishable from truthful content. Data Privacy addresses the risk that models trained on personal data can memorize and reproduce that data in their outputs, creating privacy violations that bypass traditional access controls. Environmental concerns the energy and resource consumption of training and operating large generative models, including the environmental impact of compute infrastructure required for inference at scale.
Human-AI Interaction addresses risks arising from how people engage with generative systems: over-reliance on AI outputs, anthropomorphization of language models, erosion of critical evaluation when AI-generated content is presented alongside human-authored content, and the difficulty users face in distinguishing generated content from verified information. Information Integrity covers the potential for generative AI to produce and amplify misinformation, disinformation, and manipulated media at unprecedented scale and fidelity. Information Security addresses AI-specific attack vectors: prompt injection that manipulates model behavior through crafted inputs, training data poisoning that corrupts model outputs at the source, and adversarial examples that exploit model vulnerabilities to produce targeted incorrect outputs. Intellectual Property concerns the risk that models trained on copyrighted material reproduce protected content in their outputs, creating infringement liability for the deploying organization.
Obscene, Degrading, and Abusive Content addresses the generation of harmful material including hate speech, explicit content, and content that targets or demeans specific groups. Value Chain and Component Integration covers risks introduced through the AI supply chain: third-party models with unknown training data provenance, pre-trained components with undocumented bias characteristics, and dependencies on external services whose behavior changes without notice. Homogenization describes the systemic risk that arises when many organizations deploy the same foundation models, creating correlated failures across industries when a shared model exhibits a flaw, bias, or vulnerability. Each of these twelve categories maps to specific suggested actions within the AI RMF's Govern, Map, Measure, and Manage functions. The mapping is not approximate. Each suggested action in AI 600-1 carries an explicit cross-reference to the AI RMF function and category it extends, creating a traceable chain from generative-specific risk to general AI governance structure.
Content provenance is the ability to determine whether a piece of content was generated by an AI system, which system produced it, when it was generated, and what inputs influenced the output. For generative AI, provenance is a foundational governance requirement. Without it, organizations cannot distinguish AI-generated reports from human-authored analysis. Customers cannot determine whether the information they receive was verified by a human or produced by a model that may confabulate. Regulators cannot assess whether disclosures about AI use are accurate. The challenge is technical: generative models produce outputs that are structurally identical to human-created content. A language model's output is text. An image generator's output is pixels. There is no inherent signal in the output that identifies its origin unless provenance mechanisms are deliberately implemented. AI 600-1 identifies content provenance as a cross-cutting concern that intersects with information integrity, human-AI interaction, and intellectual property risks.
Provenance mechanisms fall into three categories. Metadata-based provenance attaches structured information to AI-generated content at the point of creation: model identifier, generation timestamp, input parameters, confidence scores, and retrieval sources for augmented generation systems. This metadata travels with the content through downstream systems, enabling any consumer to inspect the content's origin. Watermarking embeds statistical signals into generated content that are imperceptible to humans but detectable by verification systems. Text watermarking modifies token selection probabilities to embed a detectable pattern. Image watermarking embeds signals in frequency domains that survive compression, cropping, and format conversion. Watermarking is probabilistic rather than deterministic: it provides a confidence score for AI generation rather than a binary determination. Deepfake detection applies forensic analysis to content suspected of being AI-generated, examining artifacts such as inconsistent lighting in images, unnatural prosody in audio, or statistical patterns in text that differ from human writing distributions.
Implementing content provenance requires organizational commitment beyond technical deployment. Policies must define which AI-generated content requires provenance marking, what metadata must accompany generated outputs, how provenance information is stored and transmitted, and who is responsible for verifying provenance claims. Technical infrastructure must support provenance at scale: every API call to a generative model must capture the inputs, parameters, and outputs with sufficient detail to reconstruct the generation context. Downstream systems must preserve provenance metadata rather than stripping it during processing. Verification endpoints must be available for consumers who want to check whether content carries a valid provenance signature. The governance challenge is that provenance is only as reliable as the weakest link in the content pipeline. If any system in the chain strips metadata, overwrites watermarks, or fails to log generation events, the provenance chain breaks and the content becomes untraceable. AI 600-1 maps these requirements to specific suggested actions within the Govern and Manage functions, establishing provenance as an organizational capability rather than a point feature.
The behavior of a generative AI system is determined by its training data. Every bias, factual error, copyrighted passage, and toxic pattern present in the training corpus has the potential to surface in the model's outputs. Training data governance is therefore upstream of every other risk category in AI 600-1. Confabulation rates correlate with the quality and consistency of training data. Intellectual property risks trace directly to whether the training corpus included copyrighted material. Privacy violations occur when personally identifiable information in training data is memorized and reproduced. Bias in generated outputs reflects bias in the data the model learned from. Addressing these risks at the output layer through filtering and post-processing is necessary but insufficient. Governance must extend to the data itself: what was included, why it was included, what quality controls were applied, and what legal rights the organization holds over the training material.
Data poisoning represents the adversarial dimension of training data risk. An attacker who can influence the training corpus can influence the model's behavior at a fundamental level. Poisoning attacks range from injecting specific trigger phrases that cause the model to produce targeted outputs, to subtle statistical manipulation that shifts the model's distribution toward attacker-chosen behaviors without obvious indicators of compromise. The attack surface is broad: web-scraped training data incorporates content from any publicly accessible source, meaning an attacker who publishes strategically crafted content on indexed websites can influence models that include web data in their training pipeline. Fine-tuning datasets are often smaller and more targeted, making them easier to poison with fewer injected samples. Copyright and licensing risks arise because most large-scale training datasets include material whose licensing terms may not permit use in commercial AI training. Organizations that fine-tune models on proprietary data must verify that the data's licensing terms permit derivative use in AI-generated outputs. Organizations that deploy third-party models must understand the provenance of those models' training data to assess their own liability exposure.
Bias in training corpora is a structural challenge rather than a defect to be fixed. Language corpora reflect the biases present in the text they contain. Image datasets reflect the demographic and cultural composition of their sources. Code repositories reflect the conventions and errors of their contributors. When a generative model learns from biased data, it reproduces and sometimes amplifies those biases in its outputs. The AI 600-1 profile maps training data governance to the Map function of the AI RMF, requiring organizations to document data sources, assess data quality, evaluate bias characteristics, and establish provenance for all training material. This is not a one-time activity performed before model deployment. Training data governance is continuous: fine-tuning datasets change, retrieval-augmented generation sources are updated, and the regulatory landscape governing data use in AI training evolves. Organizations must maintain a living inventory of their AI training data with sufficient metadata to answer regulatory inquiries, respond to intellectual property claims, and trace model behavior to specific data sources when issues arise.
Model security addresses threats that target the AI system itself rather than the infrastructure hosting it. Traditional information security protects the network, the servers, the storage, and the access controls. Model security protects the weights, the inference pipeline, the prompt processing chain, and the behavioral boundaries that define acceptable model outputs. These are fundamentally different attack surfaces. A network firewall does not prevent prompt injection. An intrusion detection system does not detect adversarial examples. Encryption at rest does not prevent model extraction through repeated API queries. AI 600-1 identifies information security as one of its twelve risk categories specifically because generative AI introduces attack vectors that fall outside the scope of conventional security controls. Organizations must layer AI-specific defenses on top of their existing infrastructure security posture.
Prompt injection is the most prevalent attack vector against deployed language models. Direct prompt injection embeds malicious instructions within user input that override the model's system-level instructions. Indirect prompt injection places malicious instructions in content the model retrieves or processes, such as web pages, documents, or database records accessed through retrieval-augmented generation. Both forms exploit the fact that language models process instructions and data in the same input channel, making it architecturally difficult to distinguish between legitimate instructions and injected commands. Model theft occurs when an attacker extracts a model's learned behavior through systematic querying, building a surrogate model that approximates the original without access to the original weights. This threatens both intellectual property (the model represents significant investment in training data, compute, and engineering) and security (the surrogate model can be studied offline to develop more effective adversarial attacks). Inference manipulation uses crafted inputs to produce specific incorrect outputs: causing a content filter to approve harmful content, a classification system to misidentify inputs, or a code generator to produce vulnerable implementations.
Defending against these threats requires a layered approach that mirrors the defense-in-depth principle from traditional security. Input validation inspects and sanitizes prompts before they reach the model, detecting known injection patterns and anomalous input structures. Output filtering examines generated content before it reaches the user, checking for policy violations, harmful content, and indicators of successful injection attacks. Rate limiting and query monitoring detect model extraction attempts by identifying patterns of systematic querying that resemble distillation or extraction campaigns. Behavioral monitoring tracks the model's output distribution over time, detecting statistical shifts that may indicate successful poisoning or manipulation. Adversarial testing subjects the model to known attack techniques during development and periodically during deployment, verifying that defenses remain effective as attack methods evolve. AI 600-1 maps these defensive capabilities to specific suggested actions within the Manage function of the AI RMF, establishing model security as an ongoing operational responsibility rather than a deployment-time checkbox.
Deploying a generative AI system into production creates an ongoing governance obligation that does not end when the model passes its initial evaluation. Foundation models behave differently in production than in testing. Real users submit inputs that test suites did not anticipate. Edge cases that occurred once per million queries in evaluation occur thousands of times daily at production scale. Retrieval-augmented generation systems depend on external data sources whose content changes independently of the model, meaning the system's effective behavior changes even when the model itself remains static. Organizations must establish runtime safeguards that monitor, filter, and constrain model behavior continuously. A model that passed all safety evaluations during development can produce harmful outputs in production when users discover input patterns that bypass the safety training. Runtime monitoring is not optional post-deployment hygiene. It is a core governance requirement identified in AI 600-1's Manage function.
Output filtering applies rule-based and model-based checks to generated content before it reaches the end user. Content classifiers screen for harmful, biased, or policy-violating material. Factual consistency checks compare generated claims against verified knowledge bases. Format validators ensure that structured outputs (code, data, citations) conform to expected schemas. Confidence thresholds reject or flag outputs where the model's own uncertainty signals exceed acceptable levels. These filters operate in the inference pipeline, adding latency and compute cost but preventing categories of harm that cannot be addressed through training alone. Performance monitoring tracks the model's behavior over time using metrics that capture both quality and safety. Output quality metrics measure relevance, accuracy, and coherence. Safety metrics measure the rate of flagged outputs, successful filter interventions, and user-reported issues. Operational metrics track latency, throughput, error rates, and resource consumption. Together, these metrics form a continuous signal that indicates whether the deployed system is operating within its governance boundaries.
GenAI model drift describes the phenomenon where a generative model's effective behavior changes over time. For models that undergo continuous fine-tuning, drift is introduced through new training data. For static models, drift occurs through changes in the deployment context: updated retrieval sources, modified system prompts, new integration patterns, or changes in the user population that shift the distribution of inputs the model receives. Drift is not inherently harmful. It becomes a governance concern when the model's current behavior diverges from the behavior that was evaluated, approved, and documented during the last assessment. Without drift detection, an organization's governance documentation describes a model that no longer exists in its documented form. AI 600-1 addresses deployment monitoring through the Measure function, requiring organizations to establish baselines for model behavior and detect meaningful departures from those baselines. The suggested actions specify that monitoring must be continuous rather than periodic, automated rather than manual, and connected to incident response processes that can escalate and remediate detected anomalies before they produce downstream harm.
Sentinel discovers and monitors the AI infrastructure within your connected estate. It identifies generative AI deployments across your environments: model endpoints, inference services, fine-tuning pipelines, retrieval-augmented generation data stores, and content filtering layers. Discovery is continuous. When a new model endpoint is deployed or an existing service is reconfigured, Sentinel detects the change and maps it to the affected AI 600-1 risk categories. Evidence collection operates at the infrastructure level: API gateway logs capture prompt and response metadata, compute monitoring tracks inference resource consumption, storage auditing records training data access patterns, and network monitoring maps data flows between model components. Each evidence artifact is stored as an immutable record with a SHA-256 integrity hash, timestamp, source system identifier, and full provenance chain.
Rampart maintains the assessment structure for AI 600-1, organizing the twelve risk categories and their suggested actions into an assessable control framework. Each suggested action carries an implementation status, linked evidence artifacts, assigned ownership, and a scoring dimension that tracks both control effectiveness and evidence freshness. Garrison provides the passive inventory of your AI estate: which models are deployed, where they run, what data sources they access, and how they connect to other systems within the authorization boundary. Artificer generates control narratives for each risk category, describing how your organization addresses the suggested actions based on observed infrastructure state rather than aspirational descriptions. When Sentinel detects a gap between your declared governance posture and the observed state of your AI infrastructure, the discrepancy surfaces in Citadel as an actionable finding with linked evidence and remediation guidance.
Vanguard extends its scanning capabilities to AI-specific concerns. Code analysis identifies prompt injection vulnerabilities in application code that processes user input before passing it to model endpoints. Dependency scanning flags AI libraries with known security advisories. Configuration analysis examines model deployment configurations for security misconfigurations: overly permissive API access, missing output filtering, absent rate limiting, and unencrypted model artifact storage. Armory provides infrastructure-as-code modules for deploying AI governance infrastructure: logging pipelines configured for model observability, content filtering services with policy-driven output classification, provenance metadata services that tag AI-generated content at the point of creation, and monitoring dashboards that surface drift indicators and safety metric trends. The modules deploy governance infrastructure that generates evidence from day one. The infrastructure IS the control, and its operational state IS the evidence.
AI 600-1 does not exist in isolation. It is structurally connected to a constellation of AI governance frameworks through explicit and derivable mappings. The most direct relationship is with the AI Risk Management Framework (NIST AI 100-1), which AI 600-1 extends by design. Every suggested action in AI 600-1 carries an explicit cross-reference to the AI RMF function and category it modifies. Organizations that have implemented the AI RMF gain AI 600-1 coverage by extending their existing Govern, Map, Measure, and Manage controls with generative-specific actions. The mapping is deterministic and complete: no AI 600-1 suggested action exists without a parent in the AI RMF structure. This means work done for AI 600-1 simultaneously advances AI RMF posture, and AI RMF implementations provide the foundation on which AI 600-1 overlays its generative-specific requirements.
The Coalition for Secure AI (COSAiS) framework addresses AI security from a threat-modeling perspective, complementing AI 600-1's risk management approach. Where AI 600-1 identifies risk categories and maps them to governance functions, COSAiS catalogs specific threat vectors, attack techniques, and defensive measures for AI systems. The overlap is substantial: prompt injection, model theft, training data poisoning, and adversarial manipulation appear in both frameworks. Cross-mapping between the two allows organizations to satisfy COSAiS threat coverage requirements through AI 600-1 control implementations, and vice versa. The EU AI Act establishes regulatory requirements for AI systems deployed in or affecting the European Union, including mandatory risk assessments, transparency obligations, and conformity assessments for high-risk AI systems. AI 600-1's twelve risk categories align with the EU AI Act's risk classification approach, and many of the suggested actions in AI 600-1 directly satisfy EU AI Act requirements for documentation, monitoring, human oversight, and transparency. Organizations operating in both US and EU jurisdictions can use AI 600-1 as a bridge between NIST and EU regulatory expectations.
The connection to NIST 800-53 completes the governance chain. AI 600-1 extends the AI RMF, which references NIST 800-53 for information security controls that protect AI systems at the infrastructure level. The Program Management (PM), Risk Assessment (RA), and System and Information Integrity (SI) control families in 800-53 provide foundational controls that AI 600-1 builds upon with AI-specific extensions. An organization implementing NIST 800-53 for its information systems gains partial coverage of AI 600-1 requirements through controls already in place: access management, audit logging, incident response, and configuration management all apply to AI infrastructure. AI 600-1 adds the generative-specific layer: confabulation monitoring, content provenance, training data governance, and model behavioral analysis that 800-53 does not address because it predates the generative AI era. Rampart resolves these cross-framework relationships through its derivation chain engine, computing your coverage across AI 600-1, AI RMF, COSAiS, EU AI Act, and NIST 800-53 simultaneously from a single security posture. Work done for any one framework propagates through the mapped relationships to advance every connected framework.
Something is being forged.
The full platform is under active development. Reach out to learn more or get early access.