First Look: AWS Brings NVIDIA Nemotron and OpenAI GPT OSS Models to GovCloud

Capability Overview

AWS has extended Amazon Bedrock’s model catalogue into GovCloud (US) with two open-weight model families: OpenAI’s GPT OSS (120B and 20B parameters) and NVIDIA’s Nemotron 3 family (Nano 9B/12B/30B, Super 120B). Inference runs entirely within the AWS GovCloud isolation boundary on U.S.-citizen-operated infrastructure, satisfying FedRAMP High, DoD SRG Impact Levels 2/4/5, ITAR, and CJIS requirements.

For defenders, the significance is not the compliance posture but the capability level. Putting 120B-parameter models into mission workflows that process intelligence summaries, security logs, acquisition documents, and policy assessments is a qualitative step change in what automated AI pipelines can do, and in what a successful attack against them can achieve.

Attack Surface Analysis

Open-weight supply chain exposure. Unlike proprietary hosted models, open-weight models are distributed as weight files that customers or AWS ingest. Even within a managed service, agencies should scrutinise weight provenance. The GPT OSS and Nemotron weights were trained by organisations outside the U.S. government’s direct oversight. If weights were tampered with before packaging (MITRE ATLAS AML.T0010, supply chain compromise), behavioural backdoors could persist inside the compliance boundary indefinitely.

Prompt injection in high-consequence pipelines. The AWS announcement explicitly names intelligence analysis, security log analysis, and acquisition review as target use cases. These workflows commonly ingest third-party documents, a classic indirect prompt injection surface. A malicious actor who can influence an ingested document (a contractor submitting a bid, an adversary planting content in an analysed source) may be able to redirect model behaviour in ways that alter recommendations or exfiltrate synthesised intelligence through model outputs.
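
As a rough illustration of screening ingested documents for the indirect injection surface described above, the following Python sketch flags text that matches a few well-known injection phrasings before it reaches a model context window. The pattern list, the Finding type, and the function names are all hypothetical and chosen for illustration; regex matching alone is easy to evade and would be only one layer of a real control.

```python
import re
from dataclasses import dataclass
from typing import List

# Illustrative patterns only (an assumption, not a vetted rule set).
# Real payloads are often obfuscated and will not match simple regexes.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.I),
    re.compile(r"disregard\s+(the\s+)?system\s+prompt", re.I),
    re.compile(r"you\s+are\s+now\s+", re.I),
    re.compile(r"do\s+not\s+(flag|report|mention)", re.I),
]


@dataclass
class Finding:
    pattern: str   # the regex that matched
    excerpt: str   # surrounding text, for analyst review


def scan_document(text: str) -> List[Finding]:
    """Return one Finding per pattern match, with a little context."""
    findings = []
    for pattern in SUSPICIOUS_PATTERNS:
        for match in pattern.finditer(text):
            start = max(match.start() - 30, 0)
            end = min(match.end() + 30, len(text))
            findings.append(Finding(pattern.pattern, text[start:end]))
    return findings


def should_quarantine(text: str) -> bool:
    """Hold the document for human review if anything matched."""
    return bool(scan_document(text))
```

A pipeline might call should_quarantine on each extracted document and route hits to an analyst queue rather than silently dropping them, since a match is itself a signal worth investigating.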

Jailbreak risk at elevated impact. Jailbreaking a commercial chatbot carries limited consequence. Jailbreaking a model embedded in a DoD SRG IL4/5 workflow reviewing acquisition documents or planning materials carries mission-level consequence. The compliance boundary protects data residency — it does not protect against adversarial manipulation of model reasoning.

Agentic excessive agency. Multi-document intelligence synthesis and automated compliance checking are explicitly agentic use cases. Without fine-grained output validation and human-in-the-loop checkpoints, models operating on privileged government data with downstream action authority represent a textbook OWASP LLM08 (Excessive Agency) risk.
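
One way to picture a human-in-the-loop checkpoint is a gate that sits between the model's proposed action and its execution. The Python sketch below is a minimal illustration under stated assumptions: the action names, the ProposedAction type, and the two callbacks (execute and request_approval) are hypothetical and stand in for whatever the real workflow uses.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical set of actions that must never run without human sign-off.
HIGH_IMPACT_ACTIONS = {"approve_contract", "close_incident", "update_risk_rating"}


@dataclass
class ProposedAction:
    name: str        # action the model wants to take
    payload: dict    # arguments for the action
    rationale: str   # model-supplied justification shown to the reviewer


def execute_with_gate(
    action: ProposedAction,
    execute: Callable[[ProposedAction], None],
    request_approval: Callable[[ProposedAction], bool],
) -> str:
    """Run low-impact actions directly; require approval for high-impact ones."""
    if action.name in HIGH_IMPACT_ACTIONS:
        if not request_approval(action):
            return "rejected"
    execute(action)
    return "executed"
```

The gate keys on the action name rather than on model output, so an injected instruction cannot talk its way past the check by rewording its rationale.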

DLP bypass via model summarisation. A cleared insider with Bedrock API access can instruct a 120B model to synthesise, reformat, and compress sensitive material in ways that evade keyword-based DLP. The model becomes an unmonitored exfiltration pathway.

Framework Mapping

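AML.T identifiers refer to MITRE ATLAS techniques; LLM identifiers refer to the OWASP Top 10 for LLM Applications (2023 numbering).

Technique	Relevance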
AML.T0051 - Prompt Injection	Adversarial content in ingested government documents
AML.T0054 - Jailbreak	Elevated consequence in classified-adjacent workflows
AML.T0010 - Supply Chain Compromise	Open-weight provenance outside government control
AML.T0057 - Data Leakage	Model outputs as exfiltration vector
LLM01 - Prompt Injection	Direct and indirect injection in agentic pipelines
LLM05 - Supply Chain	Weight integrity and model provenance
LLM08 - Excessive Agency	Agentic workflows with limited human review

Threat Scenarios

Scenario 1: Adversarial bid document. A foreign-affiliated contractor submits an acquisition document containing an embedded indirect prompt injection payload. The automated contract review pipeline on Bedrock processes the document, and the injected instruction overrides the system prompt, causing the model to classify the bid as compliant and suppress risk flags.

Scenario 2: Insider synthesis exfiltration. A cleared employee with legitimate Bedrock API access uses a Nemotron Super 120B model to synthesise classified-adjacent CUI across multiple documents into a compact, reformatted summary. The output evades DLP controls and is exfiltrated via an authorised channel.

Scenario 3: Weight backdoor activation. A supply chain compromise has introduced a latent backdoor in GPT OSS weights prior to GovCloud deployment. The backdoor activates on a specific trigger phrase present in operational planning documents, causing the model to alter risk assessments in subtle but consequential ways.

Defender Checklist

Validate weight provenance: Document chain of custody for all open-weight models in your ATO boundary; require cryptographic hash verification against vendor-published manifests.
Apply input sanitisation: Instrument all document ingestion pipelines with injection-pattern detection before content reaches model context windows.
Scope model permissions minimally: Restrict Bedrock API access to named service roles; audit IAM policies quarterly.
Implement output monitoring: Deploy semantic DLP on model outputs in workflows touching CUI, ITAR, or CJIS data.
Mandate human-in-the-loop for high-impact decisions: Ensure agentic workflows (contract approval, risk assessment, log triage) require human confirmation before downstream action.
Red-team with adversarial documents: Test mission pipelines using documents with embedded indirect injection payloads before operational deployment.
Include model behaviour in IR playbooks: Define what a model compromise looks like operationally and how to detect anomalous output patterns.
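
The hash verification step in the first checklist item can be sketched as follows. This Python example is a minimal illustration that applies only where an agency itself holds the weight files. It assumes a hypothetical manifest format (a JSON object mapping each relative file path to its SHA-256 hex digest); actual vendor manifests may be structured differently, and the manifest itself would need to be obtained and authenticated through a trusted channel.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weight shards fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(weights_dir: Path, manifest_path: Path) -> List[str]:
    """Compare files on disk with the manifest; return a list of problems."""
    # Assumed manifest shape: {"relative/path.bin": "<sha256 hex>", ...}
    expected: Dict[str, str] = json.loads(manifest_path.read_text())
    problems = []
    for rel_name, expected_hash in expected.items():
        candidate = weights_dir / rel_name
        if not candidate.is_file():
            problems.append(f"missing: {rel_name}")
        elif sha256_file(candidate) != expected_hash.lower():
            problems.append(f"hash mismatch: {rel_name}")
    # Files on disk that the manifest does not list are also suspicious.
    for path in sorted(weights_dir.rglob("*")):
        rel = path.relative_to(weights_dir).as_posix()
        if path.is_file() and rel not in expected:
            problems.append(f"unlisted: {rel}")
    return problems
```

An empty return value means every listed file is present and matches. Any entry in the list is a reason to halt deployment and record the discrepancy in the chain-of-custody documentation.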

References

AWS ML Blog — Run NVIDIA Nemotron and OpenAI GPT OSS models on Amazon Bedrock in AWS GovCloud (US)