Capability Overview
Bayer AG and Thoughtworks have published a detailed engineering case study on PRINCE (Preclinical Information Center), a cloud-hosted agentic RAG platform designed to answer complex pharmaceutical research questions and draft regulatory documents from decades of preclinical safety study data. The architecture is explicitly multi-agent: a clarification agent interprets user intent, a planner decomposes queries, a researcher agent retrieves data via both vector search and Text-to-SQL, a reflection agent validates data sufficiency, and a writer agent synthesises regulatory-quality outputs. The system is described as production-ready and emphasises transparency, explainability, and human-in-the-loop controls.
For defenders, this case study is significant not as an isolated deployment but as a widely read blueprint. Because it was published on Martin Fowler’s platform, its architectural patterns, and their attack surface, are likely to be replicated across regulated industries including pharma, finance, and defence.
Attack Surface Analysis
The PRINCE architecture’s depth is precisely what makes it consequential from a security standpoint. Each agent-to-agent handoff is a trust boundary. In conventional software, data crossing such a boundary would be validated against a schema. In LLM pipelines, the ‘schema’ is embedded in a natural language system prompt — a control that can be overridden by adversarially crafted content in the data stream itself.
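One way to narrow this gap is to have agents exchange structured messages that ordinary code validates before the next agent sees them. The following is a minimal sketch of that idea in Python. The `ResearchTask` fields, the allowed source names, and the length limit are assumptions made for illustration and are not taken from the PRINCE case study.

```python
from dataclasses import dataclass

# Hypothetical allowlist of retrieval back ends a planner may request.
ALLOWED_SOURCES = {"vector_search", "text_to_sql"}
MAX_QUESTION_CHARS = 500  # assumed limit, chosen only for illustration


@dataclass(frozen=True)
class ResearchTask:
    """Structured handoff from a planner agent to a researcher agent."""
    task_id: str
    source: str
    question: str


def parse_handoff(raw: dict) -> ResearchTask:
    """Validate an untrusted planner output before passing it downstream."""
    expected = {"task_id", "source", "question"}
    if set(raw) != expected:
        mismatched = sorted(map(str, set(raw) ^ expected))
        raise ValueError(f"unexpected or missing fields: {mismatched}")
    if not all(isinstance(raw[key], str) for key in expected):
        raise ValueError("all fields must be strings")
    if raw["source"] not in ALLOWED_SOURCES:
        raise ValueError(f"source not allowed: {raw['source']!r}")
    if not 0 < len(raw["question"]) <= MAX_QUESTION_CHARS:
        raise ValueError("question is empty or too long")
    return ResearchTask(**raw)
```

Schema validation of this kind constrains the shape of a handoff, not the meaning of its free-text fields, so it complements injection screening rather than replacing it.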
Cascading prompt injection is the primary new vector. A malicious actor who can influence any retrieved document — whether via a compromised internal data store or a poisoned external reference — can inject instructions that propagate through the planner, researcher, and writer agents before any human review occurs. Because the reflection agent is tasked with validating sufficiency rather than authenticity, it represents a weak link: it may confirm that injected content satisfies the query without detecting that the content is adversarial.
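A minimal sketch of screening retrieved content before it reaches a downstream agent is shown here. The `RetrievedChunk` type, the pattern list, and the `<document>` delimiters are all hypothetical choices for illustration. A fixed pattern list is easy to evade, so this should be read as a tripwire that routes suspicious content to human review, not as a complete defence.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only; real payloads vary widely and will evade a fixed list.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"\byou are now\b", re.I),
    re.compile(r"\bsystem prompt\b", re.I),
]


@dataclass
class RetrievedChunk:
    doc_id: str
    text: str
    quarantined: bool = False


def screen_chunk(chunk: RetrievedChunk) -> RetrievedChunk:
    """Flag a chunk for human review if it contains instruction-like text."""
    if any(pattern.search(chunk.text) for pattern in SUSPICIOUS_PATTERNS):
        chunk.quarantined = True
    return chunk


def build_context(chunks: list[RetrievedChunk]) -> str:
    """Pass on only unflagged chunks, wrapped in delimiters that mark them as data."""
    parts = []
    for chunk in map(screen_chunk, chunks):
        if not chunk.quarantined:
            parts.append(f"<document id={chunk.doc_id!r}>\n{chunk.text}\n</document>")
    return "\n".join(parts)
```

Applying the same screen at the input of each agent, rather than once at the user interface, addresses the case where the payload arrives through retrieval instead of through the user's question.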
Text-to-SQL is the second major surface: the researcher agent converts natural language into executable database queries. SQL injection is well understood in traditional web security, but it becomes significantly harder to defend against when the query generator is an LLM responding to free-text user input. Prompt manipulation can widen query scope, bypass row-level restrictions that are enforced only in the prompt, or enumerate schema structure, all without triggering signature-based SQL injection detection.
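The sketch below illustrates enforcing scope in the database engine rather than in the prompt. It uses Python's standard `sqlite3` module and its `set_authorizer` hook purely as a stand-in; the table names and the `READABLE_TABLES` policy are hypothetical. SQLite has no row-level security, so this shows table-level denial only. A production deployment would more likely rely on the database's own roles and row-level policies.

```python
import sqlite3

# Hypothetical policy: tables this user's role may read.
READABLE_TABLES = {"tox_studies"}


def authorizer(action, arg1, arg2, db_name, source):
    """Called by SQLite while compiling a statement; anything not allowed is denied."""
    if action == sqlite3.SQLITE_SELECT:
        return sqlite3.SQLITE_OK
    if action == sqlite3.SQLITE_READ and arg1 in READABLE_TABLES:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


def run_generated_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Run LLM-generated SQL; the engine, not the prompt, decides what is allowed."""
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tox_studies (compound TEXT, finding TEXT);
    CREATE TABLE carcinogenicity_studies (compound TEXT, finding TEXT);
    INSERT INTO tox_studies VALUES ('CMP-001', 'no adverse effect');
    INSERT INTO carcinogenicity_studies VALUES ('CMP-002', 'restricted');
    """
)
conn.set_authorizer(authorizer)  # install the policy after the demo data is loaded
```

With the authorizer installed, a generated query against `carcinogenicity_studies` raises `sqlite3.DatabaseError` regardless of how the prompt was manipulated, because the check happens when the statement is compiled.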
Regulatory output integrity is a novel concern. If the writer agent synthesises regulatory-submission-quality documents from poisoned retrieved data, the downstream harm extends beyond data breach — it includes potential submission of fabricated or manipulated safety findings to regulators.
Observability as an exfiltration channel: the article highlights detailed transparency and explainability features including chain-of-thought traces. If these are accessible to insufficiently privileged users or exposed through APIs, they leak internal query patterns, data schema structure, and compound-level safety information.
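One possible mitigation is to filter trace fields by the caller's clearance before any trace leaves the service. The sketch below assumes a hypothetical `AgentTrace` structure and an invented three-level clearance scheme; neither comes from the case study.

```python
from dataclasses import dataclass, asdict

# Hypothetical clearance levels; higher numbers see more of the trace.
FIELD_CLEARANCE = {
    "answer": 0,
    "cited_documents": 1,
    "generated_sql": 2,
    "reasoning_steps": 2,
}


@dataclass
class AgentTrace:
    answer: str
    cited_documents: list
    generated_sql: str
    reasoning_steps: list


def redact_trace(trace: AgentTrace, clearance: int) -> dict:
    """Return only the trace fields the caller's clearance permits (deny by default)."""
    visible = {}
    for name, value in asdict(trace).items():
        required = FIELD_CLEARANCE.get(name)
        # Fields missing from the policy table are never exposed.
        if required is not None and clearance >= required:
            visible[name] = value
    return visible
```

The deny-by-default lookup means that a new trace field added later stays hidden until someone explicitly assigns it a clearance level.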
Framework Mapping
- AML.T0051 (LLM Prompt Injection): Direct applicability across all agent handoffs; highest priority.
- AML.T0057 (LLM Data Leakage): Explainability outputs and retrieved context expose sensitive preclinical data.
- AML.T0043 (Craft Adversarial Data): Poisoned documents in the retrieval corpus can steer agent behaviour.
- AML.T0056 (LLM Meta Prompt Extraction): System prompts encoding compliance rules and agent personas are extraction targets.
- LLM01 (Prompt Injection) and LLM08 (Excessive Agency), using the 2023 OWASP Top 10 for LLM Applications numbering: The writer agent’s ability to produce regulatory documents from LLM-generated content with minimal human checkpointing exemplifies excessive agency in a high-stakes domain.
- LLM06 (Sensitive Information Disclosure): Preclinical safety data is sensitive both commercially and from a regulatory standpoint; retrieval pipelines can surface it to unauthorised actors.
Threat Scenarios
Scenario 1 — Insider regulatory manipulation: A researcher with write access to an internal document store embeds prompt injection payloads in a study report. The PRINCE writer agent incorporates fabricated safety findings into a regulatory draft, which is submitted before human review catches the discrepancy.
Scenario 2 — Competitive intelligence via explainability APIs: An authenticated but low-privilege user queries PRINCE and captures chain-of-thought traces returned by the transparency layer. These traces reveal which compounds are under active safety review, internal codenames, and database schema topology — all useful for competitive intelligence or targeted follow-on attacks.
Scenario 3 — Text-to-SQL scope escalation: An adversary crafts a query that, when translated by the LLM, generates SQL selecting across study types the user is not authorised to access. The query succeeds because row-level security is not enforced at the database layer; access controls are assumed to be handled by the agent’s system prompt.
Defender Checklist
- Apply prompt injection sanitisation at every agent input boundary, not only at the user interface layer
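- Execute Text-to-SQL output under a least-privilege database role with access controls enforced at the database layer, and bind user-supplied values as query parameters wherever possible; never rely solely on LLM-level scope constraints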
- Classify chain-of-thought traces and retrieval context logs as sensitive data; restrict access independently of the main application ACL
- Implement the reflection agent as a separate, sandboxed verification step with its own injection-resistant prompting, not as a downstream LLM call in the same trust chain
- Red-team the writer agent with poisoned retrieval results before regulatory document workflows go live
- Audit human-in-the-loop checkpoints: verify they occur before, not after, external submission or sharing of agent-generated outputs
- Monitor for anomalous SQL query patterns generated by the Text-to-SQL layer using query-level logging independent of the LLM
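The checkpoint-ordering item in the checklist above can be made mechanical by placing a release gate in code, so that an agent-generated draft cannot leave the system without recorded human approval. The following is a minimal sketch under assumed names (`Draft`, `approve`, `release`); a real workflow would also need authenticated reviewer identities and an audit log.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Draft:
    """An agent-generated document awaiting human sign-off."""
    draft_id: str
    body: str
    approvals: list = field(default_factory=list)


def approve(draft: Draft, reviewer: str) -> None:
    """Record a named human approval with a UTC timestamp."""
    draft.approvals.append((reviewer, datetime.now(timezone.utc)))


def release(draft: Draft, required_approvals: int = 1) -> str:
    """Return the body for external sharing only after enough distinct reviewers approve."""
    reviewers = {name for name, _ in draft.approvals}
    if len(reviewers) < required_approvals:
        raise PermissionError(
            f"draft {draft.draft_id} has {len(reviewers)} of {required_approvals} approvals"
        )
    return draft.body
```

Because `release` is the only path that returns the document body, the review step necessarily happens before submission rather than after it.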