Adversa AI: 89% of AI Agents Fail Security Tests

Overview

Adversa AI has published its AI Risk Quadrant for Agent Security report, benchmarking 100 AI agents across ten functional categories against a combined measure of capability and defensive posture. The headline finding is stark: only 11 of the 100 agents evaluated qualify as both ‘capable’ and ‘well-defended’. The remaining 89 are either dangerously capable without adequate defence, which makes them security liabilities, or defended at the cost of usability, which makes them operational non-starters.

The report arrives at a moment when enterprise adoption of autonomous agents is accelerating, driven in part by AI-assisted cyberattacks that are forcing defenders to automate at scale. The irony is that the same urgency pushing organisations toward agents is compounding their exposure.

Technical Analysis

Adversa frames the core structural problem as the ‘lethal trifecta’: the convergence of private data access, exposure to untrusted external content, and the ability to execute outbound actions. According to the report, 98% of tested agents exhibit all three characteristics, because these properties are functionally required for agents to be useful.
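
As a rough illustration, the trifecta can be expressed as a simple conjunction over an agent's declared capabilities. The AgentProfile type, its field names, and the two example agents below are hypothetical and are not taken from the Adversa report; this is a minimal sketch of the idea, not the report's scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical capability summary for one agent (illustrative only)."""
    name: str
    reads_private_data: bool         # e.g. mailboxes, internal documents
    ingests_untrusted_content: bool  # e.g. web pages, inbound email
    can_act_outbound: bool           # e.g. send messages, call external APIs


def has_lethal_trifecta(agent: AgentProfile) -> bool:
    """Return True only when all three risk properties are present together."""
    return (
        agent.reads_private_data
        and agent.ingests_untrusted_content
        and agent.can_act_outbound
    )


# Two made-up agents: one with all three properties, one with only the first.
inbox_assistant = AgentProfile("inbox-assistant", True, True, True)
offline_summariser = AgentProfile("offline-summariser", True, False, False)

for agent in (inbox_assistant, offline_summariser):
    print(agent.name, has_lethal_trifecta(agent))
```

Removing any one of the three properties breaks the conjunction, which is why scoping down data access or outbound actions reduces exposure even when untrusted content cannot be avoided.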

This creates what Adversa terms a ‘power-protection inversion’: the vendors shipping the most capable agents are simultaneously shipping the widest attack surfaces. The report does not attribute this to negligence by a subset of vendors. It describes it as a structural feature of the current agent market that appears consistently across all ten agent categories.

The worst-performing categories are computer agents — designed to make decisions or execute actions on behalf of users — and coding agents. Computer agents are particularly exposed because they require broad contextual input, increasing susceptibility to prompt injection and adversarial content embedded in their operating environment. Coding agents present elevated risk due to their access to execution environments, codebases, and external repositories.

Attack vectors of primary concern include prompt injection via untrusted content in the agent’s context window, insecure handling of tool outputs, and excessive agency granted without adequate guardrails — all directly exploitable without requiring model-level compromise.

Framework Mapping

MITRE ATLAS AML.T0051 (LLM Prompt Injection) and AML.T0054 (LLM Jailbreak): Direct exploitation vectors for agents consuming untrusted content.
MITRE ATLAS AML.T0057 (LLM Data Leakage) and AML.T0040 (ML Model Inference API Access): Relevant where agents have access to sensitive enterprise data.
OWASP Top 10 for LLM Applications LLM08 (Excessive Agency): The conceptual centrepiece of the report’s findings, in that agents are granted too much autonomous capability without compensating controls.
OWASP Top 10 for LLM Applications LLM01 (Prompt Injection) and LLM02 (Insecure Output Handling): Primary technical attack surface for agents consuming external or user-supplied content.

Impact Assessment

The systemic nature of the finding is what elevates this beyond a standard research disclosure. With 98% of agents carrying the lethal trifecta and only 11% meeting a combined security-capability bar, organisations deploying agents at scale are statistically likely to be running vulnerable systems. The risk is highest for security operations, software development, and business process automation use cases where agents operate with elevated privileges and access to sensitive data.

Mitigation & Recommendations

Apply least-privilege principles aggressively: Scope each agent’s data access, tool permissions, and outbound connectivity to the minimum required for its task.
Sandbox computer and coding agents: Isolate execution environments and enforce strict output validation before any action is committed.
Treat untrusted content as an attack vector: Implement content filtering and contextual integrity checks on all data entering an agent’s context window.
Reference the AI Risk Quadrant: Use Adversa’s rankings as a procurement and deployment filter, prioritising vendors in the ‘capable well-defended’ segment.
Establish agent-specific monitoring: Standard endpoint or application telemetry is insufficient — log agent reasoning chains, tool calls, and outbound actions for anomaly detection.
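
The least-privilege and monitoring recommendations above can be sketched together as a gate that every tool call must pass before it runs. Everything named here (AgentPolicy, PolicyViolation, authorize_tool_call, the tool names, and the example hosts) is an assumption made for illustration, not part of the report or of any particular agent framework. It shows the shape of the control rather than a complete implementation.

```python
import json
import logging
from dataclasses import dataclass

logger = logging.getLogger("agent.audit")


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical least-privilege policy for a single agent."""
    allowed_tools: frozenset
    allowed_hosts: frozenset


class PolicyViolation(Exception):
    """Raised when a requested tool call falls outside the policy."""


def authorize_tool_call(policy, tool, target_host=None):
    """Permit a call only if the tool, and any outbound host, is allowlisted.

    Every decision, allowed or blocked, is logged as a JSON record so that
    tool calls and outbound actions can be reviewed for anomalies later.
    """
    allowed = tool in policy.allowed_tools and (
        target_host is None or target_host in policy.allowed_hosts
    )
    logger.info(json.dumps({"tool": tool, "host": target_host, "allowed": allowed}))
    if not allowed:
        raise PolicyViolation(f"blocked: tool={tool!r} host={target_host!r}")


# Illustrative policy: two tools, one permitted outbound destination.
policy = AgentPolicy(
    allowed_tools=frozenset({"search_tickets", "http_get"}),
    allowed_hosts=frozenset({"api.internal.example"}),
)

authorize_tool_call(policy, "http_get", "api.internal.example")  # within policy
try:
    authorize_tool_call(policy, "http_get", "attacker.example")  # host not allowlisted
except PolicyViolation as exc:
    print(exc)
```

A deny-by-default allowlist is used here because anything the policy does not name is refused, so an injected instruction cannot reach a tool or destination the agent was never meant to use. Real deployments would also need sandboxing and output validation, which this sketch does not cover.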

References

Security of 100 AI Agents Tested and Ranked – SecurityWeek
Adversa AI: AI Risk Quadrant for Agent Security Report (June 2026)