
Bleeding Llama Flaw Exposes 300,000 Ollama Servers to Unauthenticated Data Theft

TL;DR CRITICAL
  • What happened: Unauthenticated heap read bug in Ollama leaks API keys, prompts, and secrets from 300,000 exposed servers.
  • Who's at risk: Any organisation running Ollama as a self-hosted LLM inference engine without a firewall or authentication proxy in front of it. Roughly 300,000 internet-facing instances are immediately exploitable.
  • Act now: Upgrade Ollama to version 0.17.1 immediately · Block public internet access to Ollama's API port via firewall rules · Place an authentication proxy in front of all Ollama deployments and audit exposed API keys for rotation

Overview

A critical vulnerability dubbed Bleeding Llama (CVE-2026-7482, CVSS 9.3) has been disclosed in Ollama, the widely used open-source framework for running large language models locally and in self-hosted environments. Discovered by Cyera, the flaw allows a remote, unauthenticated attacker to read sensitive data from the server’s heap memory — including prompts, chat history, environment variables, API keys, and secrets — and exfiltrate it to an attacker-controlled server. With an estimated 300,000 Ollama instances exposed on the public internet and no authentication enabled by default, the practical blast radius of this vulnerability is immediate and severe.

Technical Analysis

The vulnerability resides in Ollama’s GGUF model loader, the component responsible for ingesting model files in the GGUF format. The flaw is a classic heap out-of-bounds read: an attacker supplies a maliciously crafted GGUF file in which a tensor’s declared offset and size exceed the actual file length. When Ollama processes this file, it reads beyond the allocated heap buffer, accessing adjacent memory regions that may contain live runtime data.
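A loader can defend against this class of bug by validating every tensor's declared extent against the actual file size before performing any read. The sketch below illustrates the principle only; the field names are hypothetical and do not reflect Ollama's actual parser structures:

```python
def validate_tensor_bounds(file_size: int, tensors: list[dict]) -> list[str]:
    """Reject any tensor whose declared byte range falls outside the file.

    Each tensor dict carries a declared 'offset' and 'size' (illustrative
    field names, not Ollama's real GGUF parser types).
    """
    errors = []
    for t in tensors:
        offset, size = t["offset"], t["size"]
        # Guard against negative values and plain overreads alike.
        if offset < 0 or size < 0 or offset + size > file_size:
            errors.append(
                f"tensor {t.get('name', '?')}: declared range "
                f"[{offset}, {offset + size}) exceeds file of {file_size} bytes"
            )
    return errors

# A crafted file declaring a tensor past end-of-file is rejected;
# a tensor that fits inside the file passes.
bad = validate_tensor_bounds(1024, [{"name": "blk.0.attn_q", "offset": 512, "size": 4096}])
ok = validate_tensor_bounds(1024, [{"name": "blk.0.attn_q", "offset": 0, "size": 512}])
```

Without a check of this kind, the parser trusts attacker-controlled metadata, and the subsequent read walks past the heap buffer into adjacent memory.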

The attack chain requires only three unauthenticated API calls:

  1. Upload a crafted GGUF file via Ollama’s model import API.
  2. Trigger processing of the file, causing the out-of-bounds read and capturing heap data into the resulting model blob.
  3. Exfiltrate the blob using Ollama’s built-in model push feature, sending the memory-laced file to an attacker-controlled registry server.

Because Ollama listens on all network interfaces by default and ships without any authentication mechanism, every internet-accessible instance is exploitable without credentials. The memory regions exposed can include:

  • LLM prompt and message history
  • Environment variables (e.g., OPENAI_API_KEY, cloud provider tokens)
  • PHI, PII, and development secrets routed through the inference engine
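Defenders can quickly confirm whether an instance is reachable without credentials by querying the model-list endpoint (`/api/tags` is part of Ollama's public API). The helper below is an illustrative sketch, not a complete scanner:

```python
import json
import urllib.request

def ollama_is_open(host: str, port: int = 11434, timeout: float = 3.0) -> bool:
    """Return True if the Ollama API answers its model-list endpoint
    without any credentials -- i.e. the instance is wide open."""
    url = f"http://{host}:{port}/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.load(resp)
            return resp.status == 200 and "models" in body
    except (OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply from an
        # auth proxy all count as "not openly exposed".
        return False
```

An instance that returns its model list to this probe would also accept the three-call attack chain described above.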

Framework Mapping

Framework   | Technique/Category                        | Rationale
------------|-------------------------------------------|----------
MITRE ATLAS | AML.T0040 – ML Model Inference API Access | Attacker abuses Ollama's unauthenticated API to trigger the vulnerable code path
MITRE ATLAS | AML.T0057 – LLM Data Leakage              | Heap memory containing prompts and secrets is exfiltrated
MITRE ATLAS | AML.T0043 – Craft Adversarial Data        | Maliciously crafted GGUF file is the attack vehicle
OWASP LLM   | LLM06 – Sensitive Information Disclosure  | Primary impact: API keys, PII, and prompts leaked from runtime memory
OWASP LLM   | LLM05 – Supply Chain Vulnerabilities      | GGUF model ingestion pipeline is the exploited trust boundary

Impact Assessment

The vulnerability affects all Ollama deployments prior to version 0.17.1 that are network-accessible without a firewall or authentication layer. The 300,000 figure represents publicly internet-facing instances; enterprise deployments on internal networks without segmentation are also at risk from insider threats or lateral movement. Depending on how Ollama is integrated, exploitation could expose:

  • Enterprise AI workflows: Employee chat history and routed tool outputs
  • Development environments: Hardcoded secrets and dev-time API tokens
  • Healthcare and legal contexts: PHI and PII passed through prompts
  • Multi-tenant platforms: Cross-tenant data leakage if Ollama is shared

Mitigation & Recommendations

  1. Upgrade immediately to Ollama version 0.17.1, which patches CVE-2026-7482.
  2. Restrict network access: Firewall Ollama’s API port (default: 11434) to localhost or trusted internal CIDRs only.
  3. Deploy an authentication proxy (e.g., OAuth2 Proxy, nginx with mTLS) in front of any network-accessible Ollama instance.
  4. Rotate all secrets: Assume any API keys, tokens, or credentials handled by an exposed Ollama instance are compromised.
  5. Audit GGUF ingestion pipelines: Validate model file sources and apply integrity checks before loading third-party GGUF files.
  6. Monitor for anomalous model push activity: Alert on outbound model push calls to unknown registries.
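Recommendation 5 can be enforced with a digest allowlist: hash every GGUF file before it reaches the loader and accept only digests approved out of band. A minimal sketch (how the allowlist is populated is up to your pipeline):

```python
import hashlib

def gguf_digest_allowed(path: str, allowlist: set[str]) -> bool:
    """SHA-256 the model file in streaming chunks and accept it only if
    the digest was pre-approved out of band."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large model files never load fully into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() in allowlist
```

Placing this gate ahead of the model loader means a crafted GGUF file from an untrusted source is rejected before the vulnerable parsing code ever runs.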
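For recommendation 6, a simple log scan can flag push requests whose destination registry is not on an allowlist. The `dest=` field and line format below are hypothetical, not Ollama's actual log schema; adapt the pattern to whatever your access logs or egress proxy record:

```python
import re

# Illustrative allowlist; registry.ollama.ai is Ollama's default registry.
KNOWN_REGISTRIES = {"registry.ollama.ai"}

# Hypothetical log line shape: ... POST "/api/push" ... dest=<host> ...
PUSH_LINE = re.compile(r'POST\s+"?/api/push"?.*?dest=(?P<host>[\w.\-]+)')

def suspicious_pushes(log_lines, known=KNOWN_REGISTRIES):
    """Return log lines recording a model push to a registry outside
    the allowlist -- a possible Bleeding Llama exfiltration attempt."""
    hits = []
    for line in log_lines:
        m = PUSH_LINE.search(line)
        if m and m.group("host") not in known:
            hits.append(line)
    return hits
```

Feeding the flagged lines into an alerting pipeline gives early warning of the exfiltration step of the attack chain, even on instances that have not yet been patched.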
