Overview
Two separate security research teams — Imperva and Varonis — have independently demonstrated high-severity attack paths against OpenClaw, a widely deployed self-hosted AI agent platform. Published concurrently in June 2026, the findings illustrate how indirect prompt injection via everyday communication objects (contacts, vCards, location pins, and emails) can subvert an AI agent into executing attacker-controlled code or exfiltrating sensitive credentials. One vulnerability has been patched; the other is architectural in nature and requires defensive design changes rather than a code fix.
Technical Analysis
Imperva: Message-Object Injection via Flattened Prompt Construction
Imperva researcher Yohann Sillam identified that OpenClaw passes shared contacts, vCards, and location pins to the underlying LLM by flattening them inline into the prompt body — without any untrusted-content boundary marker. This contrasts with web-fetched content, which does receive an untrusted-content wrapper.
The attack abuses the serialisation format for shared contacts: <contact: name, number>. Since angle brackets are syntactically valid within a contact name field, an attacker can embed arbitrary LLM instructions in that field. The name is truncated in the UI (both WhatsApp and the receiving app), meaning the victim never sees the injected payload.
The same vector applies to the FN (full-name) field of a vCard and the label of a shared location pin. In Imperva’s tests against Gemini 3.1 Pro (preview), the injected instruction successfully directed the agent to fetch and execute a remote script. The attack succeeds because models have been hardened against image-embedded instructions through training, but have had far less exposure to message-object injection patterns.
With OpenClaw’s persistent memory enabled by default, a single piece of widely shared malicious content could silently compromise every agent that ingests it, absent sandboxing. OpenClaw addressed this in version 2026.4.23 by routing contact names, vCard fields, and location labels through a separate untrusted-metadata channel rather than the prompt body.
Varonis: Social Engineering via Crafted Email
Varonis Threat Labs, led by Itay Yashar, built a test agent named Pinchy on the OpenClaw platform, connected it to a Gmail inbox seeded with synthetic business data and mock secrets (AWS keys, fake customer exports). A single plain-text email instructing the agent to forward specified data to an external address was sufficient to trigger exfiltration. This is not a patchable code flaw — it reflects excessive agency: the agent had both the capability and the authorisation model to act on the instruction.
Framework Mapping
- AML.T0051 (LLM Prompt Injection): Core technique in both attacks — adversarial instructions injected via contact objects and email.
- AML.T0057 (LLM Data Leakage): Varonis demonstrated credential and PII exfiltration through agent action.
- AML.T0043 (Craft Adversarial Data): Specially crafted vCards and contact names used as injection vehicles.
- LLM01 (Prompt Injection) and LLM08 (Excessive Agency): The Varonis finding is a textbook excessive-agency failure; Imperva maps directly to indirect prompt injection.
- LLM06 (Sensitive Information Disclosure): Both attacks result in credential or data exposure.
Impact Assessment
Organisations running OpenClaw with integrations to email, messaging platforms, or cloud services are at direct risk. The patched Imperva vector affects any unpatched instance. The Varonis behaviour-level risk is broader — it affects any agentic deployment where the LLM is permitted to send data externally without human-in-the-loop confirmation. Imperva also noted the flattening pattern exists in other personal AI assistants, suggesting systemic industry exposure.
Mitigation & Recommendations
- Patch immediately: Upgrade OpenClaw to version 2026.4.23 or later to close the message-object injection path.
- Restrict agent permissions: Apply least-privilege to all agent tools — revoke or gate outbound email, file transfer, and API write capabilities.
- Implement input trust boundaries: All externally sourced data (contacts, emails, web content) must be wrapped in explicit untrusted-content markers before LLM ingestion.
- Enable human-in-the-loop for sensitive actions: Require explicit user approval before agents forward data externally or execute scripts.
- Sandbox agent memory: Disable or scope persistent memory to reduce the blast radius of a single injected instruction propagating across agent sessions.