
Only 11 of 100 AI Agents Pass Security and Capability Benchmarks

TL;DR HIGH
  • What happened: Only 11 of 100 AI agents tested are both capable and secure, per Adversa AI's benchmark.
  • Who's at risk: Enterprises deploying autonomous AI agents — especially computer and coding agents — face systemic risk due to structural design trade-offs between capability and security.
  • Act now: Audit all deployed AI agents against the lethal trifecta: restrict private data access, untrusted content exposure, and outbound action scope · Prioritise agents ranked in the 'capable well-defended' quadrant and validate vendor security claims independently · Implement least-privilege boundaries and sandboxing for computer and coding agents before granting autonomous operation
Overview

Adversa AI has published its AI Risk Quadrant for Agent Security report, benchmarking 100 AI agents across ten functional categories against a combined measure of capability and defensive posture. The headline finding is stark: only 11 of the 100 agents evaluated qualify as both ‘capable’ and ‘well-defended’. The remaining 89 agents fall into categories defined by either dangerous capability without adequate defence, or defensive posture at the cost of usability — making them security liabilities or operational non-starters.

The report arrives at a moment when enterprise adoption of autonomous agents is accelerating, driven in part by AI-assisted cyberattacks that are forcing defenders to automate at scale. The irony is that the same urgency pushing organisations toward agents is compounding their exposure.

Technical Analysis

Adversa frames the core structural problem as the ‘lethal trifecta’: the convergence of private data access, exposure to untrusted external content, and the ability to execute outbound actions. According to the report, 98% of tested agents exhibit all three characteristics, because these properties are functionally required for agents to be useful.
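As a rough illustration of how the trifecta could be checked against an internal agent inventory, the sketch below flags agents that combine all three properties. The `AgentProfile` record, its field names, and the sample agents are assumptions made for this example; they are not taken from Adversa's report or from any vendor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical inventory record; the field names are illustrative."""
    name: str
    reads_private_data: bool
    ingests_untrusted_content: bool
    can_act_outbound: bool


def trifecta_legs(agent: AgentProfile) -> list[str]:
    """Return the trifecta properties this agent exhibits."""
    checks = [
        ("private data access", agent.reads_private_data),
        ("untrusted content exposure", agent.ingests_untrusted_content),
        ("outbound actions", agent.can_act_outbound),
    ]
    return [label for label, present in checks if present]


def has_lethal_trifecta(agent: AgentProfile) -> bool:
    """True only when all three properties converge in one agent."""
    return len(trifecta_legs(agent)) == 3


# Made-up example inventory, not data from the report.
inventory = [
    AgentProfile("inbox-summariser", True, True, False),
    AgentProfile("coding-assistant", True, True, True),
]
flagged = [a.name for a in inventory if has_lethal_trifecta(a)]
print(flagged)  # prints ['coding-assistant']
```

Removing any one property takes an agent out of the flagged set, which is why an audit of this kind asks which of the three can be dropped or narrowed for each deployment.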

This creates what Adversa terms a ‘power-protection inversion’: the vendors shipping the most capable agents are simultaneously shipping the widest attack surfaces. This is not attributed to negligence by a subset of vendors but described as a structural feature of the current agent market, appearing consistently across all ten agent categories.

The worst-performing categories are computer agents — designed to make decisions or execute actions on behalf of users — and coding agents. Computer agents are particularly exposed because they require broad contextual input, increasing susceptibility to prompt injection and adversarial content embedded in their operating environment. Coding agents present elevated risk due to their access to execution environments, codebases, and external repositories.

Attack vectors of primary concern include prompt injection via untrusted content in the agent’s context window, insecure handling of tool outputs, and excessive agency granted without adequate guardrails — all directly exploitable without requiring model-level compromise.
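One way to limit the reach of these vectors, sketched here under stated assumptions, is to mark a session as tainted once untrusted content enters its context and to require human approval for outbound tools from that point on. The `AgentSession` class and the tool names in `OUTBOUND_TOOLS` are hypothetical. The sketch does not detect prompt injection; it only constrains what an injected instruction could trigger.

```python
from dataclasses import dataclass, field

# Hypothetical names for tools that can send data out of the environment.
OUTBOUND_TOOLS = {"send_email", "http_post", "git_push"}


@dataclass
class AgentSession:
    """Tracks whether untrusted content has entered the context window."""
    tainted: bool = False
    context: list = field(default_factory=list)

    def add_content(self, text: str, trusted: bool) -> None:
        """Append content; any untrusted item taints the whole session."""
        self.context.append(text)
        if not trusted:
            self.tainted = True

    def authorise(self, tool: str, approved_by_human: bool = False) -> bool:
        """Allow non-outbound tools; gate outbound tools once tainted."""
        if tool not in OUTBOUND_TOOLS:
            return True
        if not self.tainted:
            return True
        return approved_by_human


session = AgentSession()
session.add_content("Summarise the attached ticket.", trusted=True)
before = session.authorise("send_email")   # True: nothing untrusted yet
session.add_content("<html>fetched web page</html>", trusted=False)
after = session.authorise("send_email")    # False: needs human approval
```

The taint flag is deliberately coarse: it never resets within a session, trading some usability for a simple rule that is easy to reason about.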

Framework Mapping

  • MITRE ATLAS AML.T0051 (LLM Prompt Injection) and AML.T0054 (LLM Jailbreak): Direct exploitation vectors for agents consuming untrusted content.
  • MITRE ATLAS AML.T0057 (LLM Data Leakage) and AML.T0040 (ML Model Inference API Access): Relevant where agents have access to sensitive enterprise data.
  • OWASP LLM08 (Excessive Agency): The conceptual centrepiece of the report’s findings. Agents are granted too much autonomous capability without compensating controls.
  • OWASP LLM01 (Prompt Injection) and LLM02 (Insecure Output Handling): Primary technical attack surface for agents consuming external or user-supplied content.

Impact Assessment

The systemic nature of the finding is what elevates this beyond a standard research disclosure. With 98% of agents carrying the lethal trifecta and only 11% meeting a combined security-capability bar, organisations deploying agents at scale are statistically likely to be running vulnerable systems. The risk is highest for security operations, software development, and business process automation use cases where agents operate with elevated privileges and access to sensitive data.

Mitigation & Recommendations

  • Apply least-privilege principles aggressively: Scope each agent’s data access, tool permissions, and outbound connectivity to the minimum required for its task.
  • Sandbox computer and coding agents: Isolate execution environments and enforce strict output validation before any action is committed.
  • Treat untrusted content as an attack vector: Implement content filtering and contextual integrity checks on all data entering an agent’s context window.
  • Reference the AI Risk Quadrant: Use Adversa’s rankings as a procurement and deployment filter, prioritising vendors in the ‘capable well-defended’ segment.
  • Establish agent-specific monitoring: Standard endpoint or application telemetry is insufficient — log agent reasoning chains, tool calls, and outbound actions for anomaly detection.
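The least-privilege and monitoring recommendations above can be combined in a single deny-by-default check that records every decision. The following is a minimal sketch, assuming a hypothetical `AgentPolicy` record and made-up tool and host names; a real deployment would enforce this at the tool gateway and ship the events to a log pipeline instead of an in-memory list.

```python
import json
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical per-agent scope: anything not listed is denied."""
    agent_id: str
    allowed_tools: frozenset
    allowed_hosts: frozenset


def check_and_log(policy: AgentPolicy, tool: str,
                  host: Optional[str], log: list) -> bool:
    """Decide whether a tool call is in scope and record the decision."""
    tool_ok = tool in policy.allowed_tools
    host_ok = host is None or host in policy.allowed_hosts
    allowed = tool_ok and host_ok
    # One JSON line per attempted call, including denied attempts,
    # so anomaly detection can see what the agent tried to do.
    log.append(json.dumps({
        "ts": time.time(),
        "agent": policy.agent_id,
        "tool": tool,
        "host": host,
        "decision": "allow" if allowed else "deny",
    }))
    return allowed


policy = AgentPolicy(
    agent_id="triage-bot",
    allowed_tools=frozenset({"search_tickets", "http_get"}),
    allowed_hosts=frozenset({"tickets.example.internal"}),
)
events: list = []
ok = check_and_log(policy, "search_tickets", None, events)
blocked_host = check_and_log(policy, "http_get", "attacker.example", events)
blocked_tool = check_and_log(policy, "delete_repo", None, events)
```

Logging denied calls alongside allowed ones matters here: a burst of out-of-scope attempts is often the first visible sign that an agent is acting on injected instructions.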
