Amazon Bedrock Prompt Injection Traverses Agent Hierarchies

Overview

Unit 42 researchers at Palo Alto Networks published red-team findings examining Amazon Bedrock’s multi-agent collaboration framework, specifically targeting the attack surface introduced when multiple specialised agents communicate and orchestrate tasks. The research demonstrates a systematic attack chain enabling adversaries to identify agent operating modes, discover collaborator agents, deliver malicious payloads, and ultimately execute unauthorised actions — including extracting system instructions and invoking agent tools with attacker-controlled inputs. No vulnerabilities in Amazon Bedrock itself were identified; the risk stems from the inherent susceptibility of LLMs to prompt injection.

Technical Analysis

Amazon Bedrock Agents supports two primary multi-agent operating modes: Supervisor and Supervisor with Routing. Researchers demonstrated that an attacker can fingerprint which mode is active through probing interactions, then progressively enumerate collaborator agents within the architecture.

Once agent topology is mapped, adversaries can inject malicious instructions through untrusted text inputs — such as user-supplied content processed by any agent in the chain. Because LLMs cannot reliably distinguish developer-defined system prompts from adversarial user input, injected instructions can propagate across agent boundaries. The attack chain exploited this to:

Disclose agent instructions and tool schemas — extracting the meta-prompt and API surface of sub-agents
Invoke tools with attacker-supplied parameters — weaponising legitimate action groups (API calls, external integrations) with malicious inputs
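
Because the model may forward attacker-chosen values to a tool, one common defence is to validate parameters inside the tool handler itself rather than trusting the agent. The following is a minimal sketch of that idea, not code from the Unit 42 research. The tool, its parameter names (order_id, reason) and the patterns are hypothetical and would need to match the real action group schema.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamRule:
    pattern: str        # regex the whole value must match
    max_len: int = 64   # upper bound on value length


# Hypothetical allowlist for one hypothetical refund tool.
REFUND_TOOL_RULES = {
    "order_id": ParamRule(r"[A-Z]{2}-\d{6}"),
    "reason": ParamRule(r"[\w .,-]+", max_len=200),
}


def validate_params(params: dict, rules: dict) -> list:
    """Return a list of violations; an empty list means the call may proceed."""
    errors = []
    # Reject anything the schema does not declare.
    for name in params:
        if name not in rules:
            errors.append(f"unexpected parameter: {name}")
    # Check every declared parameter for presence, type, length and format.
    for name, rule in rules.items():
        value = params.get(name)
        if value is None:
            errors.append(f"missing parameter: {name}")
        elif not isinstance(value, str):
            errors.append(f"{name}: not a string")
        elif len(value) > rule.max_len:
            errors.append(f"{name}: too long")
        elif not re.fullmatch(rule.pattern, value):
            errors.append(f"{name}: disallowed format")
    return errors
```

A handler would call validate_params before performing any side effect and refuse the request if the returned list is non-empty. This limits what an injected instruction can make the tool do, but it does not prevent the injection itself.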

The multi-agent architecture amplifies the risk. A single injection accepted by an orchestrator agent can cascade through its subordinate agents, each with its own tool set and data access, so the blast radius is larger than in a single-agent deployment.
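
To make the blast-radius argument concrete, the sketch below walks a delegation graph and collects every tool reachable from a given entry agent. The topology, agent names and tool names are invented for illustration. The result is a worst-case upper bound that assumes an injected instruction can be relayed along every delegation edge.

```python
from collections import deque

# Hypothetical topology: each agent maps to the collaborators it can delegate to.
COLLABORATORS = {
    "supervisor": ["billing_agent", "support_agent"],
    "billing_agent": ["ledger_agent"],
    "support_agent": [],
    "ledger_agent": [],
}

# Hypothetical tools attached to each agent.
TOOLS = {
    "supervisor": set(),
    "billing_agent": {"issue_refund"},
    "support_agent": {"search_kb"},
    "ledger_agent": {"write_ledger"},
}


def reachable_tools(entry: str, collaborators: dict, tools: dict) -> set:
    """Breadth-first walk from the entry agent, collecting every tool on the way."""
    seen = {entry}
    queue = deque([entry])
    exposed = set()
    while queue:
        agent = queue.popleft()
        exposed |= tools.get(agent, set())
        for child in collaborators.get(agent, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return exposed
```

In this invented topology an injection at the supervisor exposes all three tools, while the same injection at support_agent exposes only search_kb. Running the same walk over a real deployment's agent map is one way to see which entry points carry the most risk.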

Bedrock’s extended agent capabilities — action groups, knowledge bases, memory, and code interpretation — all represent additional lateral movement vectors once an attacker achieves initial prompt injection.

Framework Mapping

MITRE ATLAS: The attack primarily maps to AML.T0051 (LLM Prompt Injection) as the core technique, with AML.T0056 (LLM Meta Prompt Extraction) covering instruction disclosure and AML.T0057 (LLM Data Leakage) covering sensitive schema exfiltration. AML.T0043 (Craft Adversarial Data) describes the payload construction phase.

OWASP LLM Top 10: LLM01 (Prompt Injection) is the primary category. LLM08 (Excessive Agency) is directly relevant because agents with broad tool permissions amplify the impact of a successful injection. LLM06 (Sensitive Information Disclosure) and LLM07 (Insecure Plugin Design) cover instruction extraction and tool abuse respectively. These identifiers follow the 2023 (v1.1) list; the 2025 edition renumbers several entries.

Impact Assessment

Organisations using Amazon Bedrock multi-agent architectures for business-critical workflows — particularly those with action groups connected to internal APIs, databases, or external services — face meaningful risk. A successful attack could result in: exfiltration of proprietary system prompts and tool schemas; unauthorised API calls executed under the agent’s IAM permissions; and potential lateral movement through integrated enterprise systems. The impact scales with the privilege level of agent-associated IAM roles and the sensitivity of connected data sources.
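
Since impact tracks the privileges of the agent's IAM role, a quick review of the attached policy documents for wildcard grants is a reasonable first check. The sketch below is an illustrative helper, not an AWS tool. The example policy, its Sid values and the function ARN are made up. It flags only full wildcards in Allow statements and ignores NotAction, conditions and partial wildcards such as s3:Get*.

```python
def find_broad_statements(policy: dict) -> list:
    """Flag Allow statements that grant wildcard actions or wildcard resources."""
    findings = []
    statements = policy.get("Statement", [])
    # IAM permits a single statement object as well as a list of statements.
    if isinstance(statements, dict):
        statements = [statements]
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        # Action and Resource may each be a string or a list of strings.
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        label = stmt.get("Sid", f"statement[{index}]")
        for action in actions:
            if action == "*" or action.endswith(":*"):
                findings.append(f"{label}: wildcard action {action}")
        if "*" in resources:
            findings.append(f"{label}: wildcard resource")
    return findings


# Hypothetical policy attached to an agent's execution role.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "InvokeRefundHandler",
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": "arn:aws:lambda:us-east-1:111122223333:function:refund-handler",
        },
        {"Sid": "BroadS3", "Effect": "Allow", "Action": ["s3:*"], "Resource": "*"},
    ],
}
```

Calling find_broad_statements(EXAMPLE_POLICY) reports the BroadS3 statement twice, once for the wildcard action and once for the wildcard resource, and leaves the narrowly scoped Lambda statement alone.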

Mitigation & Recommendations

Enable Bedrock Guardrails: Activate the built-in prompt attack filter and apply it at every agent's pre-processing stage. Unit 42 reported that this blocked the attack chains it demonstrated.
Apply least privilege to agent IAM roles: Restrict action group permissions to the minimum necessary, and limit which tools each agent can invoke.
Validate and sanitise inputs at every agent boundary: Do not assume that inputs relayed between agents are trusted; treat every inter-agent message as potentially adversarial.
Audit agent topology exposure: Minimise the information agents return when describing their own capabilities or collaborator structure, which reduces the reconnaissance surface.
Deploy layered AI security controls: Solutions such as Prisma AIRS can provide real-time threat detection and policy enforcement across agent interactions.
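
As an illustration of the first recommendation, the sketch below builds a request for a guardrail whose only content filter is the prompt attack filter. The field names reflect our understanding of the boto3 bedrock client's create_guardrail call and should be checked against the current SDK documentation before use. The guardrail name, description and block messages are placeholders. Attaching the guardrail to each agent is a separate step that is not shown here.

```python
def prompt_attack_guardrail_request(name: str) -> dict:
    """Build keyword arguments for a guardrail that filters prompt attacks on input."""
    return {
        "name": name,
        "description": "Blocks prompt-attack patterns in user input",
        "contentPolicyConfig": {
            "filtersConfig": [
                {
                    "type": "PROMPT_ATTACK",
                    "inputStrength": "HIGH",
                    # The prompt attack filter applies to input only.
                    "outputStrength": "NONE",
                }
            ]
        },
        "blockedInputMessaging": "Request blocked by policy.",
        "blockedOutputsMessaging": "Response blocked by policy.",
    }


def create_guardrail(bedrock_client, name: str) -> str:
    """Create the guardrail with a caller-supplied client and return its identifier."""
    response = bedrock_client.create_guardrail(**prompt_attack_guardrail_request(name))
    return response["guardrailId"]
```

With the AWS SDK installed and credentials configured, the client would typically come from boto3.client("bedrock"). Passing the client in rather than creating it inside the function keeps the request builder easy to review and to test without network access.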

References

Unit 42: When an Attacker Meets a Group of Agents — Amazon Bedrock Multi-Agent Applications