LIVE FEED
FIRST LOOK First Look: MoEngage Acquires Aampe to Deploy Millions of Autonomous AI Marketing Agents // FIRST LOOK First Look: Dragos Launches EmberAI, an OT-Specific AI Security Intelligence Platform // FIRST LOOK First Look: Mistral AI Ships OCR 4 with Structured Document Extraction for RAG Pipelines // HIGH Malicious Pull Requests Compromise AI and Developer Toolchains via CI/CD Flaws // CRITICAL Anthropic's Mythos AI Breached Classified US Government Systems in Hours // FIRST LOOK Cisco and NVIDIA AI Agent Skill Scanners Bypassed by Fake Marketplace Skill // HIGH Legacy Infrastructure Becomes Primary Attack Path into Enterprise AI Agents // HIGH Role Confusion Attack Lets Injected Text Override LLM Safety Controls // FIRST LOOK First Look: OpenAI Launches 'Patch the Planet' Open-Source Vulnerability Remediation … // HIGH AutoJack Vulnerability Chain Enabled Remote Code Execution via AI Agent WebSocket //
FIRST LOOK ATLAS OWASP CRITICAL Active exploitation · Immediate action required RELEVANCE ▲ 9.2

Cisco and NVIDIA AI Agent Skill Scanners Bypassed by Fake Marketplace Skill

ATTACK SURFACE BRIEF CRITICAL ↗ RAPID
  • What shipped: A fake AI agent skill passed Cisco, NVIDIA, and skills.sh scanners and reached 26,000 agents via a post-install URL swap technique.
  • Who's now exposed: Any enterprise or individual deploying AI agents that consume third-party skills from public marketplaces, particularly non-technical users targeted by social advertising.
  • Assess now: Prohibit agent skills that reference external URLs for setup instructions; require all skill logic to be self-contained and re-scanned on any dependency change · Implement continuous runtime monitoring of outbound URLs fetched by agents, alerting on domain changes or new script-delivery patterns post-install · Treat GitHub star counts and marketplace provenance as zero-trust signals; enforce an internal allow-list of approved skills with periodic re-verification
Cisco and NVIDIA AI Agent Skill Scanners Bypassed by Fake Marketplace Skill

Capability Overview

Security firm AIR has published a proof-of-concept demonstrating that a fabricated AI agent skill — brand-landingpage, ostensibly a Google Stitch landing-page builder — passed every skill security scanner currently in production use, including Cisco’s scanner, NVIDIA’s scanner, and all three scanners integrated into skills.sh. The skill was distributed via a legitimate marketplace pull request and amplified through a paid Instagram ad campaign, ultimately reaching an estimated 26,000 agents, including those operating on corporate accounts. The payload was deliberately benign (email address harvesting only), but the research shows the full capability chain for weaponised deployment exists today.

For defenders, this is not a theoretical edge case. Trail of Bits independently achieved the same scanner bypass three weeks prior. This is a reproducible, scalable attack class.

Attack Surface Analysis

The core structural vulnerability is the temporal gap between scan and execution. Existing skill scanners perform static analysis on the submitted package — the SKILL.md and bundled files — at a single point in time. They cannot assess what an externally-referenced URL will serve when an agent fetches it post-install, nor can they detect if that content changes after the skill achieves distribution.

AIR’s technique stacked three compounding weaknesses:

  1. Static-only scanning: Scanners cleared the skill because the submitted package was genuinely clean. The malicious instruction set lived off-package, at an attacker-controlled domain initially mirroring legitimate Google Stitch documentation.
  2. Trust signal manipulation: By contributing to a 36,000-star repository, the skill inherited social proof entirely decoupled from its actual behaviour. Star counts and open-source affiliation are not integrity signals.
  3. Agent context authority: A skill loaded into an agent’s context operates with roughly the authority of a user prompt. Once the URL was swapped to deliver a script, the agent executed it within its own permission boundary — which in enterprise deployments can include file system access, internal API calls, and credential stores.

The practical consequence: an attacker who achieves wide distribution before activating a payload has already won the hardest part. Detection at activation time is too late for agents that have been running for days or weeks.

Framework Mapping

MITRE ATLAS: This maps most directly to AML.T0010 (ML Supply Chain Compromise) — the marketplace pull request is the supply chain insertion point. The post-install URL swap is a form of AML.T0051 (LLM Prompt Injection) delivered through a trusted skill context rather than user input. AML.T0057 (LLM Data Leakage) covers the demonstrated exfiltration outcome.

OWASP LLM Top 10: LLM05 (Supply Chain Vulnerabilities) is the primary mapping. LLM07 (Insecure Plugin Design) applies because skills inherit user-level trust without behavioural sandboxing. LLM08 (Excessive Agency) is relevant wherever agents can execute fetched scripts against live systems.

Threat Scenarios

Scenario 1 — Corporate data exfiltration: A threat actor publishes a skill targeting sales and marketing personas (plausible, given AIR’s own ad targeting). After 30 days of clean operation, the external URL is swapped to instruct the agent to read CRM exports and POST them to an attacker endpoint. The skill has already been approved by IT.

Scenario 2 — Credential harvesting at scale: A skill offering productivity automation fetches a script that instructs the agent to retrieve stored API keys or OAuth tokens from the agent’s accessible environment and exfiltrate them. No malware is installed on the host; the agent itself performs the action.

Scenario 3 — Lateral movement staging: An initial skill payload only establishes a callback beacon. A second-stage script, delivered weeks later, maps internal services reachable from the agent’s network context and prepares pivot points.

Defender Checklist

  • Audit all currently installed third-party agent skills for external URL dependencies in setup or runtime instructions
  • Block or quarantine any skill that fetches instructions, scripts, or documentation from domains not owned by your organisation or a pre-approved vendor
  • Deploy runtime network monitoring on agent processes; alert on new outbound domains appearing after a skill’s initial install date
  • Establish an internal skill allow-list; treat any skill not on it as untrusted regardless of marketplace reputation or star count
  • Re-scan approved skills on a scheduled basis, not just at initial submission
  • Review Anthropic’s published guidance on external URL risks in skills and validate it against your agent deployment configuration
  • Engage your agent platform vendor on whether continuous/dynamic scanning is on their roadmap

References

◉ AI THREAT BRIEFING

Stay ahead of the threat.

Twice-weekly digest of critical AI security developments — every story mapped to MITRE ATLAS and OWASP LLM Top 10. Free.

No spam. Unsubscribe anytime.