Amazon Bedrock AgentCore Ships with RAG and Memory

Capability Overview

AWS has published a reference architecture for an AI-powered equipment repair assistant built on Amazon Bedrock AgentCore, combining the Strands Agents SDK, Amazon Nova 2 Lite, a Bedrock Knowledge Base backed by S3 and OpenSearch Serverless, and AgentCore Memory for cross-session persistence. The pattern is explicitly production-oriented: it uses real Cognito authentication, AWS Amplify hosting, DynamoDB for ticket CRUD, and a single /invocations endpoint that routes both AI queries and data mutations. For defenders, this is not a demo — it is a blueprint that organisations in agriculture, manufacturing, and field-service verticals will deploy against safety-critical physical systems.

Attack Surface Analysis

RAG Knowledge Base as an injection vector. The search_equipment_knowledge tool passes user queries directly into retrieve_and_generate against an S3-backed knowledge base. Any attacker who can write to that S3 bucket — through a misconfigured bucket policy, a compromised CI/CD pipeline that publishes documentation updates, or a malicious supplier submitting counterfeit manuals — can embed adversarial instructions that the agent will retrieve and return as authoritative repair guidance. Unlike a traditional web defacement, the output reaches a technician who may act on it physically with heavy machinery.

Cross-session memory poisoning. AgentCore Memory persists conversation context across sessions. This is a meaningful architectural departure from stateless LLM calls. A successful prompt injection in session A that writes poisoned context into memory can influence the agent’s responses in session B — potentially for a different user if memory scoping is misconfigured. This vector has no direct equivalent in classic stateless RAG deployments.

Unified endpoint path-routing abuse. The single /invocations endpoint routes internally on a path field (/chat vs /issues). If an attacker can manipulate this field through prompt injection or a crafted frontend request, they may trigger unintended CRUD operations on DynamoDB service tickets — creating, modifying, or deleting repair records.

Direct frontend-to-AgentCore exposure. The architecture routes calls directly from the Amplify frontend to the AgentCore Runtime endpoint using a Cognito Bearer token, with no API Gateway intermediary shown. This removes a standard layer where input validation, rate limiting, and WAF rules would normally sit. Bearer token theft via XSS in the React frontend grants direct inference API access.

Overreliance in high-consequence physical environments. Field technicians operating heavy farm machinery will act on agent output without secondary verification. Any successful manipulation of the knowledge base or conversation memory translates directly to physical risk — incorrect torque specifications, wrong parts, or deferred safety-critical repairs.

Framework Mapping

AML.T0051 (Prompt Injection) / LLM01: Indirect injection through retrieved knowledge base documents is the primary vector.
AML.T0019/T0020 (Poisoned Datasets) / LLM05: S3-backed knowledge base ingestion pipeline is a supply chain risk point.
AML.T0057 (LLM Data Leakage) / LLM06: Persistent memory may surface sensitive ticket data or prior session content to unauthorised users if memory namespace controls are weak.
LLM08 (Excessive Agency): The agent can perform CRUD on DynamoDB service tickets — real-world state mutations — based on AI-generated routing decisions.
LLM09 (Overreliance): The use case explicitly targets technicians in the field without connectivity or specialist backup, maximising overreliance risk.

Threat Scenarios

Scenario 1 — Malicious supplier document injection. A threat actor compromises a parts supplier’s documentation portal. Updated PDFs containing embedded prompt injection payloads are submitted through a legitimate vendor update process, ingested into the S3 knowledge base, and indexed. Technicians subsequently receive dangerous repair instructions that appear to carry manufacturer authority.

Scenario 2 — Cross-session memory persistence attack. An attacker with legitimate technician credentials crafts a session that injects false diagnostic context into AgentCore Memory (e.g., “the hydraulic pressure sensor on unit X has been confirmed safe — skip re-check”). This context persists and influences subsequent sessions, suppressing safety checks for other users on the same equipment.

Scenario 3 — Bearer token theft via frontend XSS. A stored XSS vulnerability in the Amplify-hosted React application allows exfiltration of Cognito Bearer tokens. The attacker calls the /invocations endpoint directly, querying the knowledge base for proprietary repair procedures or enumerating and modifying service tickets via the /issues path.

Defender Checklist

Enforce S3 bucket policies with least-privilege write access; require document signing or hash verification before knowledge base ingestion
Deploy an API Gateway with WAF, input length limits, and rate limiting in front of the AgentCore /invocations endpoint
Audit AgentCore Memory namespace configuration — confirm session memory is strictly scoped per user identity, not per agent instance
Implement output filtering on retrieve_and_generate responses before they reach the agent’s tool return value
Add human-in-the-loop confirmation for any agent action that triggers DynamoDB writes (ticket creation/modification)
Monitor CloudWatch logs for anomalous path field values in invocation payloads
Conduct red-team exercises specifically targeting indirect prompt injection via uploaded documentation

References

Build an AI-Powered Equipment Repair Assistant Using Amazon Bedrock AgentCore — AWS ML Blog