<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>GRID THE GREY — AI Threat Intelligence</title><link>https://gridthegrey.com/</link><description>Real-time AI security intelligence — adversarial ML, LLM vulnerabilities, and supply chain threats mapped to MITRE ATLAS and OWASP LLM Top 10.</description><generator>Hugo</generator><language>en-us</language><copyright/><lastBuildDate>Wed, 06 May 2026 09:47:08 +0530</lastBuildDate><atom:link href="https://gridthegrey.com/index.xml" rel="self" type="application/rss+xml"/><item><title>Bleeding Llama Flaw Exposes 300,000 Ollama Servers to Unauthenticated Data Theft</title><link>https://gridthegrey.com/posts/bleeding-llama-flaw-exposes-300000-ollama-servers-to-unauthenticated-data-theft/</link><pubDate>Wed, 06 May 2026 04:16:56 +0000</pubDate><guid>https://gridthegrey.com/posts/bleeding-llama-flaw-exposes-300000-ollama-servers-to-unauthenticated-data-theft/</guid><category>Threat Level: CRITICAL</category><category>LLM Security</category><category>Research</category><category>Industry News</category><category>AML.T0040 - ML Model Inference API Access</category><category>AML.T0057 - LLM Data Leakage</category><category>AML.T0044 - Full ML Model Access</category><category>AML.T0043 - Craft Adversarial Data</category><description>A critical heap out-of-bounds read vulnerability (CVE-2026-7482, CVSS 9.3) in Ollama's GGUF model loader allows unauthenticated remote attackers to exfiltrate sensitive heap memory — including API keys, prompts, and PII — using just three API calls. With approximately 300,000 Ollama instances publicly exposed and no authentication required by default, the attack surface is immediately and broadly exploitable.
The vulnerability has been patched in Ollama version 0.17.1, but unpatched internet-facing deployments remain at critical risk.</description></item><item><title>CrowdStrike Researcher Details AI Jailbreaking and Data Poisoning Techniques</title><link>https://gridthegrey.com/posts/crowdstrike-researcher-details-ai-jailbreaking-and-data-poisoning-techniques/</link><pubDate>Wed, 06 May 2026 04:15:58 +0000</pubDate><guid>https://gridthegrey.com/posts/crowdstrike-researcher-details-ai-jailbreaking-and-data-poisoning-techniques/</guid><category>Threat Level: MEDIUM</category><category>LLM Security</category><category>Jailbreaks</category><category>Adversarial ML</category><category>Data Poisoning</category><category>Research</category><category>Industry News</category><category>AML.T0054 - LLM Jailbreak</category><category>AML.T0051 - LLM Prompt Injection</category><category>AML.T0020 - Poison Training Data</category><category>AML.T0043 - Craft Adversarial Data</category><category>AML.T0015 - Evade ML Model</category><description>Joey Melo, Principal Security Researcher at CrowdStrike, outlines his methodology for AI red teaming, focusing on manipulating LLM guardrails through jailbreaking and data poisoning without altering underlying source code. His work, rooted in competitive AI hacking challenges, translates classical adversarial thinking into the emerging field of machine learning security. 
The profile highlights the growing professionalisation of AI red teaming as organisations seek to harden LLM deployments against real-world manipulation attacks.</description></item><item><title>Mass Scan Reveals Widespread Authentication Failures Across Exposed AI Infrastructure</title><link>https://gridthegrey.com/posts/mass-scan-reveals-widespread-authentication-failures-across-exposed-ai/</link><pubDate>Wed, 06 May 2026 04:15:21 +0000</pubDate><guid>https://gridthegrey.com/posts/mass-scan-reveals-widespread-authentication-failures-across-exposed-ai/</guid><category>Threat Level: HIGH</category><category>LLM Security</category><category>Agentic AI</category><category>Industry News</category><category>Research</category><category>Jailbreaks</category><category>AML.T0040 - ML Model Inference API Access</category><category>AML.T0044 - Full ML Model Access</category><category>AML.T0054 - LLM Jailbreak</category><category>AML.T0057 - LLM Data Leakage</category><category>AML.T0012 - Valid Accounts</category><category>AML.T0047 - ML-Enabled Product or Service</category><description>A scan of over one million exposed AI services found pervasive security failures including absent authentication, leaked API keys, and exposed business logic across self-hosted LLM deployments. Agent management platforms such as Flowise and n8n were discovered internet-exposed without access controls, revealing credential lists and internal workflows. 
The findings indicate systemic misconfiguration risk as enterprises race to self-host AI infrastructure without applying baseline security practices.</description></item><item><title>Backdoored PyTorch Lightning Package Steals Cloud Credentials from AI Developers</title><link>https://gridthegrey.com/posts/backdoored-pytorch-lightning-package-steals-cloud-credentials-from-ai-developers/</link><pubDate>Tue, 05 May 2026 05:36:41 +0000</pubDate><guid>https://gridthegrey.com/posts/backdoored-pytorch-lightning-package-steals-cloud-credentials-from-ai-developers/</guid><category>Threat Level: HIGH</category><category>Supply Chain</category><category>LLM Security</category><category>Industry News</category><category>AML.T0010 - ML Supply Chain Compromise</category><category>AML.T0018 - Backdoor ML Model</category><category>AML.T0012 - Valid Accounts</category><description>A malicious version of PyTorch Lightning (v2.6.3) was published to PyPI, embedding a hidden execution chain that silently downloads a JavaScript runtime and executes a heavily obfuscated credential-stealing payload dubbed 'ShaiWorm'. The attack targeted AI/ML developers who use this popular deep learning framework, exposing cloud credentials, API keys, browser-stored secrets, and GitHub tokens. 
The package has since been reverted to a safe version, but any developer who imported the compromised version should rotate all secrets immediately.</description></item><item><title>Pentagon Deploys Classified AI Across Seven Tech Giants for Warfighter Systems</title><link>https://gridthegrey.com/posts/pentagon-deploys-classified-ai-across-seven-tech-giants-for-warfighter-systems/</link><pubDate>Mon, 04 May 2026 03:28:36 +0000</pubDate><guid>https://gridthegrey.com/posts/pentagon-deploys-classified-ai-across-seven-tech-giants-for-warfighter-systems/</guid><category>Threat Level: HIGH</category><category>Agentic AI</category><category>Supply Chain</category><category>Regulatory</category><category>Industry News</category><category>LLM Security</category><category>AML.T0010 - ML Supply Chain Compromise</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0040 - ML Model Inference API Access</category><category>AML.T0043 - Craft Adversarial Data</category><category>AML.T0057 - LLM Data Leakage</category><description>The US Department of Defense has formalised agreements with seven major technology companies — including Google, Microsoft, OpenAI, and Amazon Web Services — to integrate AI into classified military networks for battlefield decision support. The move raises significant AI security concerns around human oversight, adversarial manipulation of high-stakes AI systems, and supply chain risks introduced by multiple commercial vendors operating within classified environments. 
Notably, Anthropic was excluded following a public dispute over AI safety and ethics in warfare.</description></item><item><title>Cross-Machine AI Agent Relay Tool Expands Attack Surface for Developer Environments</title><link>https://gridthegrey.com/posts/cross-machine-ai-agent-relay-tool-expands-attack-surface-for-developer/</link><pubDate>Sun, 03 May 2026 03:31:51 +0000</pubDate><guid>https://gridthegrey.com/posts/cross-machine-ai-agent-relay-tool-expands-attack-surface-for-developer/</guid><category>Threat Level: MEDIUM</category><category>Agentic AI</category><category>Supply Chain</category><category>LLM Security</category><category>Industry News</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0010 - ML Supply Chain Compromise</category><category>AML.T0051 - LLM Prompt Injection</category><category>AML.T0040 - ML Model Inference API Access</category><description>Loopsy is an open-source tool enabling cross-machine communication between AI coding agents (Claude Code, Cursor, Codex) and mobile devices via a self-hosted Cloudflare Workers relay. While designed for legitimate developer productivity, the architecture introduces significant attack surface: a relay brokering shell access and AI agent commands across machines is a high-value target for interception, hijacking, or supply chain compromise. 
Security teams should assess exposure before deploying such tools in sensitive development environments.</description></item><item><title>Desktop Automation CLI Grants AI Agents Deep OS-Level Control</title><link>https://gridthegrey.com/posts/desktop-automation-cli-grants-ai-agents-deep-os-level-control/</link><pubDate>Sun, 03 May 2026 03:30:02 +0000</pubDate><guid>https://gridthegrey.com/posts/desktop-automation-cli-grants-ai-agents-deep-os-level-control/</guid><category>Threat Level: HIGH</category><category>Agentic AI</category><category>LLM Security</category><category>Prompt Injection</category><category>Research</category><category>AML.T0051 - LLM Prompt Injection</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0057 - LLM Data Leakage</category><category>AML.T0040 - ML Model Inference API Access</category><description>agent-desktop is an open-source Rust CLI tool that exposes full OS accessibility trees to AI agents, enabling programmatic control of any desktop application without screenshots or browser sandboxing. This dramatically expands the attack surface for agentic AI systems, as a compromised or prompt-injected agent could silently manipulate native applications, exfiltrate data, or perform destructive actions across the host OS. 
The tool's deterministic element references and structured JSON output make it trivially scriptable, lowering the barrier for AI-driven desktop abuse.</description></item><item><title>Frontier LLMs Now Autonomously Breach Corporate Networks in AISI Cyber Tests</title><link>https://gridthegrey.com/posts/frontier-llms-now-autonomously-breach-corporate-networks-in-aisi-cyber-tests/</link><pubDate>Sat, 02 May 2026 04:50:23 +0000</pubDate><guid>https://gridthegrey.com/posts/frontier-llms-now-autonomously-breach-corporate-networks-in-aisi-cyber-tests/</guid><category>Threat Level: HIGH</category><category>LLM Security</category><category>Agentic AI</category><category>Research</category><category>Industry News</category><category>Regulatory</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0040 - ML Model Inference API Access</category><category>AML.T0043 - Craft Adversarial Data</category><description>The UK's AI Security Institute (AISI) found that OpenAI's GPT-5.5 matches Anthropic's Mythos Preview on cybersecurity benchmarks, including a 32-step simulated corporate network intrusion. Both models successfully completed the 'The Last Ones' data-extraction simulation — a first for any AI system — suggesting autonomous offensive cyber capability is a general frontier-model property, not a one-vendor breakthrough. 
The findings raise urgent questions about responsible release practices and the pace at which LLMs can independently execute multi-stage attacks.</description></item><item><title>Premature AI Agent Deployments Expose Production Systems to Destructive Actions</title><link>https://gridthegrey.com/posts/premature-ai-agent-deployments-expose-production-systems-to-destructive-actions/</link><pubDate>Sat, 02 May 2026 04:45:09 +0000</pubDate><guid>https://gridthegrey.com/posts/premature-ai-agent-deployments-expose-production-systems-to-destructive-actions/</guid><category>Threat Level: HIGH</category><category>Agentic AI</category><category>LLM Security</category><category>Industry News</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0051 - LLM Prompt Injection</category><category>AML.T0057 - LLM Data Leakage</category><description>Organisations are deploying AI agents into production environments without adequate security testing, resulting in destructive outcomes such as unintended deletion of production databases. The core risk is excessive agency granted to AI systems before trust boundaries and guardrails are established. 
This represents a systemic industry failure to apply basic security principles before integrating autonomous AI tooling into critical infrastructure.</description></item><item><title>Anthropic Launches Claude Security to Close AI-Accelerated Exploit Window</title><link>https://gridthegrey.com/posts/anthropic-launches-claude-security-to-close-ai-accelerated-exploit-window/</link><pubDate>Fri, 01 May 2026 07:06:29 +0000</pubDate><guid>https://gridthegrey.com/posts/anthropic-launches-claude-security-to-close-ai-accelerated-exploit-window/</guid><category>Threat Level: HIGH</category><category>LLM Security</category><category>Agentic AI</category><category>Industry News</category><category>Research</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0040 - ML Model Inference API Access</category><category>AML.T0043 - Craft Adversarial Data</category><description>Anthropic has released Claude Security in public beta, a dedicated vulnerability scanning product aimed at countering the accelerating threat of AI-powered exploitation exemplified by its own Mythos model. The tool integrates directly into Claude Enterprise, scanning repositories for vulnerabilities, providing confidence-rated findings, and generating targeted patches — compressing the security team-to-engineer remediation cycle from days to a single session. 
The launch reflects a broader industry acknowledgment that frontier AI models in adversarial hands are fundamentally shortening time-to-exploit, forcing defenders to adopt equivalent AI-native tooling.</description></item><item><title>CVSS 10 Gemini CLI Flaw Turns CI/CD Pipelines Into RCE Attack Vectors</title><link>https://gridthegrey.com/posts/cvss-10-gemini-cli-flaw-turns-ci-cd-pipelines-into-rce-attack-vectors/</link><pubDate>Fri, 01 May 2026 06:54:32 +0000</pubDate><guid>https://gridthegrey.com/posts/cvss-10-gemini-cli-flaw-turns-ci-cd-pipelines-into-rce-attack-vectors/</guid><category>Threat Level: CRITICAL</category><category>LLM Security</category><category>Agentic AI</category><category>Supply Chain</category><category>Prompt Injection</category><category>AML.T0051 - LLM Prompt Injection</category><category>AML.T0010 - ML Supply Chain Compromise</category><category>AML.T0047 - ML-Enabled Product or Service</category><description>Google has patched a maximum-severity (CVSS 10.0) vulnerability in its Gemini CLI tooling that allowed unauthenticated attackers to achieve remote code execution by planting malicious configuration files in workspace directories automatically trusted by the agent in headless/CI mode. The flaw effectively weaponised CI/CD pipelines as supply chain attack paths, bypassing sandbox protections entirely before they could initialise. 
A secondary issue in '--yolo' mode further enabled prompt injection to trigger unrestricted shell command execution.</description></item><item><title>OpenAI Launches Phishing-Resistant Security Mode for High-Risk ChatGPT Accounts</title><link>https://gridthegrey.com/posts/openai-launches-phishing-resistant-security-mode-for-high-risk-chatgpt-accounts/</link><pubDate>Fri, 01 May 2026 04:42:27 +0000</pubDate><guid>https://gridthegrey.com/posts/openai-launches-phishing-resistant-security-mode-for-high-risk-chatgpt-accounts/</guid><category>Threat Level: MEDIUM</category><category>LLM Security</category><category>Industry News</category><category>Regulatory</category><category>AML.T0012 - Valid Accounts</category><category>AML.T0040 - ML Model Inference API Access</category><category>AML.T0047 - ML-Enabled Product or Service</category><description>OpenAI has introduced Advanced Account Security, an optional hardened authentication mode for ChatGPT and Codex users who face elevated risk of account takeover, including journalists, dissidents, and researchers. The feature enforces passkey or physical security key authentication, eliminates SMS/email recovery routes, and removes OpenAI support team access to recovery options to block social engineering attacks. 
Members of OpenAI's Trusted Access for Cyber programme will be required to enable it or provide equivalent enterprise SSO attestation by June 1.</description></item><item><title>UK AI Security Institute Finds GPT-5.5 Matches Claude Mythos in Cyber Capabilities</title><link>https://gridthegrey.com/posts/uk-ai-security-institute-finds-gpt-5-5-matches-claude-mythos-in-cyber/</link><pubDate>Fri, 01 May 2026 04:37:05 +0000</pubDate><guid>https://gridthegrey.com/posts/uk-ai-security-institute-finds-gpt-5-5-matches-claude-mythos-in-cyber/</guid><category>Threat Level: HIGH</category><category>LLM Security</category><category>Research</category><category>Industry News</category><category>Regulatory</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0040 - ML Model Inference API Access</category><category>AML.T0043 - Craft Adversarial Data</category><description>The UK's AI Security Institute has evaluated OpenAI's GPT-5.5 for offensive cybersecurity capabilities, finding it comparable to Anthropic's Claude Mythos model in identifying security vulnerabilities. Unlike Mythos, GPT-5.5 is generally available, meaning its vulnerability-discovery capabilities are accessible to a broad population including malicious actors.
This raises significant concerns about the proliferation of AI-assisted exploitation tools at scale.</description></item><item><title>AI-Powered Honeypots Expose Blind Spots in Automated Malicious AI Agents</title><link>https://gridthegrey.com/posts/ai-powered-honeypots-expose-blind-spots-in-automated-malicious-ai-agents/</link><pubDate>Thu, 30 Apr 2026 05:34:41 +0000</pubDate><guid>https://gridthegrey.com/posts/ai-powered-honeypots-expose-blind-spots-in-automated-malicious-ai-agents/</guid><category>Threat Level: MEDIUM</category><category>Agentic AI</category><category>LLM Security</category><category>Research</category><category>Prompt Injection</category><category>AML.T0051 - LLM Prompt Injection</category><category>AML.T0043 - Craft Adversarial Data</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0015 - Evade ML Model</category><description>Cisco Talos researcher Martin Lee demonstrates how generative AI can be used to rapidly deploy adaptive honeypot systems that deceive and study AI-driven attack agents. The technique exploits a fundamental weakness in AI agents — their lack of situational awareness — causing them to interact with simulated vulnerable systems as if they were real targets. 
This defensive approach shifts the paradigm from passive detection to active manipulation, giving defenders new insight into automated threat actor methodologies.</description></item><item><title>DPRK Actors Use Claude LLM to Inject Malware Into npm Supply Chain</title><link>https://gridthegrey.com/posts/dprk-actors-use-claude-llm-to-inject-malware-into-npm-supply-chain/</link><pubDate>Thu, 30 Apr 2026 05:33:29 +0000</pubDate><guid>https://gridthegrey.com/posts/dprk-actors-use-claude-llm-to-inject-malware-into-npm-supply-chain/</guid><category>Threat Level: HIGH</category><category>Supply Chain</category><category>Agentic AI</category><category>LLM Security</category><category>Industry News</category><category>AML.T0010 - ML Supply Chain Compromise</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0019 - Publish Poisoned Datasets</category><category>AML.T0057 - LLM Data Leakage</category><description>North Korean threat group Famous Chollima (Shifty Corsair) has weaponised AI-assisted code generation to embed malicious npm packages into autonomous AI agent projects, targeting cryptocurrency wallets. The campaign, dubbed PromptMink, exploited Anthropic's Claude Opus to co-author a malicious dependency commit, demonstrating a novel abuse of LLM coding agents for supply chain infiltration. 
The attack uses a multi-layer dependency structure to evade detection, with second-layer malicious packages swiftly rotated when identified.</description></item><item><title>SQL Injection in LiteLLM Proxy Exposes LLM Provider Keys Within 36 Hours</title><link>https://gridthegrey.com/posts/sql-injection-in-litellm-proxy-exposes-llm-provider-keys-within-36-hours/</link><pubDate>Thu, 30 Apr 2026 05:32:40 +0000</pubDate><guid>https://gridthegrey.com/posts/sql-injection-in-litellm-proxy-exposes-llm-provider-keys-within-36-hours/</guid><category>Threat Level: CRITICAL</category><category>LLM Security</category><category>Supply Chain</category><category>Industry News</category><category>AML.T0012 - Valid Accounts</category><category>AML.T0040 - ML Model Inference API Access</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0010 - ML Supply Chain Compromise</category><category>AML.T0057 - LLM Data Leakage</category><description>A critical SQL injection vulnerability (CVE-2026-42208, CVSS 9.3) in BerriAI's LiteLLM AI gateway was actively exploited within 36 hours of public disclosure, targeting database tables storing upstream LLM provider API keys including OpenAI, Anthropic, and AWS Bedrock credentials. Attackers demonstrated prior knowledge of LiteLLM's internal schema, selectively probing credential and configuration tables while ignoring user and team tables. 
The blast radius extends far beyond a typical web-app SQL injection, as successful extraction equates to cloud-account-level compromise across multiple AI provider accounts.</description></item><item><title>Agentic AI Defense Costs Spiral as Adversarial Attack Volume Surges</title><link>https://gridthegrey.com/posts/agentic-ai-defense-costs-spiral-as-adversarial-attack-volume-surges/</link><pubDate>Wed, 29 Apr 2026 13:33:26 +0000</pubDate><guid>https://gridthegrey.com/posts/agentic-ai-defense-costs-spiral-as-adversarial-attack-volume-surges/</guid><category>Threat Level: MEDIUM</category><category>Agentic AI</category><category>LLM Security</category><category>Industry News</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0040 - ML Model Inference API Access</category><description>Sevii's Cyber Swarm Defense launch highlights a structural tension in enterprise AI security: the token-based cost model of agentic AI defense becomes unpredictable and potentially unsustainable as adversarial attack volume increases. CISOs face a compounding risk where budget exhaustion mid-attack could force a fallback to understaffed human teams. 
The article also references Claude Mythos as a frontier model enabling higher-volume adversarial campaigns, underscoring the asymmetric cost burden between attackers and defenders.</description></item><item><title>FIDO Alliance Launches Standards Push to Secure AI Agent Transactions</title><link>https://gridthegrey.com/posts/fido-alliance-launches-standards-push-to-secure-ai-agent-transactions/</link><pubDate>Wed, 29 Apr 2026 07:16:53 +0000</pubDate><guid>https://gridthegrey.com/posts/fido-alliance-launches-standards-push-to-secure-ai-agent-transactions/</guid><category>Threat Level: HIGH</category><category>Agentic AI</category><category>LLM Security</category><category>Regulatory</category><category>Industry News</category><category>AML.T0051 - LLM Prompt Injection</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0012 - Valid Accounts</category><category>AML.T0057 - LLM Data Leakage</category><description>The FIDO Alliance, backed by Google and Mastercard, is forming working groups to establish cryptographic standards for authenticating AI agent-initiated transactions, addressing risks like agent hijacking, prompt injection, and unauthorised financial actions. The initiative responds to a growing attack surface where agentic AI systems act on behalf of users without adequate authentication frameworks. 
Google's Agent Payments Protocol (AP2) and Mastercard's Verifiable Intent framework are being contributed as open-source foundations for the effort.</description></item><item><title>Pre-Auth SQLi Flaw in LiteLLM Gateway Actively Exploited to Steal AI Credentials</title><link>https://gridthegrey.com/posts/pre-auth-sqli-flaw-in-litellm-gateway-actively-exploited-to-steal-ai-credentials/</link><pubDate>Wed, 29 Apr 2026 07:15:26 +0000</pubDate><guid>https://gridthegrey.com/posts/pre-auth-sqli-flaw-in-litellm-gateway-actively-exploited-to-steal-ai-credentials/</guid><category>Threat Level: CRITICAL</category><category>LLM Security</category><category>Supply Chain</category><category>Industry News</category><category>AML.T0040 - ML Model Inference API Access</category><category>AML.T0012 - Valid Accounts</category><category>AML.T0047 - ML-Enabled Product or Service</category><category>AML.T0057 - LLM Data Leakage</category><category>AML.T0010 - ML Supply Chain Compromise</category><description>A critical unauthenticated SQL injection vulnerability (CVE-2026-42208) in LiteLLM, a widely-used LLM proxy and SDK middleware, is being actively exploited to extract API keys, provider credentials, and configuration secrets from the proxy database. Exploitation began within 36 hours of public disclosure, with attackers demonstrating precise targeting of sensitive tables containing OpenAI, Anthropic, and Bedrock credentials. 
The stolen credentials could enable downstream attacks against AI infrastructure at scale, given LiteLLM's broad adoption across LLM application ecosystems.</description></item><item><title>Welcoming Llama Guard 4 on Hugging Face Hub</title><link>https://gridthegrey.com/posts/welcoming-llama-guard-4-on-hugging-face-hub/</link><pubDate>Tue, 28 Apr 2026 05:53:37 +0000</pubDate><guid>https://gridthegrey.com/posts/welcoming-llama-guard-4-on-hugging-face-hub/</guid><category>Threat Level: LOW</category><category>LLM Security</category><category>Jailbreaks</category><category>Prompt Injection</category><category>Research</category><category>Industry News</category><category>AML.T0054 - LLM Jailbreak</category><category>AML.T0051 - LLM Prompt Injection</category><category>AML.T0043 - Craft Adversarial Data</category><category>AML.T0047 - ML-Enabled Product or Service</category><description>Meta has released Llama Guard 4, a 12B multimodal safety classifier designed to detect and filter unsafe content in both image and text inputs/outputs for production LLM deployments. The model addresses jailbreak attempts and harmful content generation across 14 hazard categories defined by the MLCommons taxonomy. Alongside it, two lightweight Llama Prompt Guard 2 classifiers (86M and 22M parameters) target prompt injection and prompt attack detection.</description></item></channel></rss>