Capability Overview
Mistral AI has released OCR 4, a focused document intelligence model that goes beyond raw text extraction to return structured output: bounding boxes localising text on the page, typed block classification (titles, tables, equations, signatures), and inline confidence scores for each extracted region. The model supports 170 languages across 10 language groups, ships in a single self-hostable container, and is explicitly positioned as an ingestion component for enterprise RAG pipelines, search toolkits, and domain-specific retrieval systems.
The integration with Mistral’s open-source Search Toolkit, announced concurrently, means OCR 4 output feeds directly into citation generation, retrieval ranking, and evaluation workflows. This tight coupling to downstream AI decision-making is the primary reason defenders need to assess this capability quickly.
Attack Surface Analysis
OCR models have historically been treated as passive utilities. OCR 4’s explicit role as a structured ingestion layer for LLM-backed systems changes this assumption materially.
Adversarial document injection into RAG pipelines is the most significant new vector. A threat actor who can cause a target organisation to process a crafted PDF or image — through email, file upload, or shared document workflows — can embed content that OCR 4 extracts faithfully into the retrieval index. Once indexed, this content influences downstream LLM-generated answers, citations, and summaries without the user ever seeing the original document.
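One way to picture a control at this point is a screening pass between OCR extraction and indexing. The sketch below is illustrative only: the `Chunk` type, the pattern list, and the quarantine behaviour are assumptions made for the example, not part of OCR 4 or the Search Toolkit. Pattern matching of this kind catches only crude injections and would need to sit alongside other controls.

```python
import re
from dataclasses import dataclass

# Hypothetical heuristics for instruction-like text; deliberately small.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"\byou are now\b", re.I),
    re.compile(r"\bsystem prompt\b", re.I),
]


@dataclass
class Chunk:
    """Assumed shape of one OCR-extracted text chunk awaiting indexing."""
    doc_id: str
    text: str


def screen_chunk(chunk: Chunk) -> list[str]:
    """Return the patterns that matched; an empty list means no heuristic hit."""
    return [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(chunk.text)]


def partition_for_indexing(chunks: list[Chunk]) -> tuple[list[Chunk], list[Chunk]]:
    """Split chunks into (indexable, quarantined) based on heuristic hits."""
    indexable, quarantined = [], []
    for chunk in chunks:
        if screen_chunk(chunk):
            quarantined.append(chunk)  # hold back for manual inspection
        else:
            indexable.append(chunk)
    return indexable, quarantined
```

Quarantined chunks are held rather than dropped so that an analyst can inspect the source document and the matched pattern together.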
Confidence score exploitation introduces a subtler risk. Organisations building human-in-the-loop redaction or verification workflows are likely to process high-confidence extractions automatically and escalate only low-confidence ones to a reviewer. An adversarial document can be crafted so that the malicious content is extracted with high confidence while the cues that would prompt scrutiny fall in low-confidence regions. Reviewers then see only the low-confidence regions, and the payload passes without human review.
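A routing rule that does not depend on confidence alone might look like the sketch below. The `Region` type, the 0.0 to 1.0 confidence scale, the 0.9 threshold, and the two patterns are all assumptions for illustration; OCR 4's actual output schema is not described here.

```python
import re
from dataclasses import dataclass


@dataclass
class Region:
    """Assumed shape of one extracted region."""
    text: str
    confidence: float  # assumed to lie in [0.0, 1.0]


# Illustrative sensitive-content patterns; real rules would be far broader.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
IBAN_LIKE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")


def needs_human_review(region: Region, threshold: float = 0.9) -> bool:
    """Escalate when content looks sensitive OR when confidence is low.

    The content check runs first and ignores the confidence score, so a
    high score cannot be used to skip review of sensitive text.
    """
    sensitive = bool(EMAIL.search(region.text) or IBAN_LIKE.search(region.text))
    return sensitive or region.confidence < threshold
```

The design point is the `or`: confidence can add regions to the review queue, but it can never remove one that the content rules have flagged.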
Block-type misclassification attacks exploit the typed block classification feature. If an attacker can reliably cause content to be categorised as a table, equation, or signature rather than body text, they may bypass downstream filtering rules that apply differently to block types — for example, PII redaction logic applied only to text blocks.
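A simple mitigation is to make redaction independent of the block label. The following is a minimal sketch under assumed names: the `Block` type, its `block_type` values, and the single SSN-style pattern are hypothetical and stand in for whatever schema and ruleset a real pipeline uses.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Block:
    """Assumed shape of one typed block in the OCR output."""
    block_type: str  # e.g. "text", "table", "equation", "signature"
    text: str


# Illustrative pattern for US SSN-style identifiers.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_all(blocks: list[Block]) -> list[Block]:
    """Apply the same redaction to every block, ignoring block_type."""
    return [replace(b, text=SSN_LIKE.sub("[REDACTED]", b.text)) for b in blocks]
```

Because `block_type` is never consulted, content that has been steered into a table or signature block receives the same treatment as body text.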
Self-hosted container supply chain risk is elevated by the single-container deployment model. Organisations pulling and running this container internally must verify image provenance, pin digest references, and audit the container’s network exposure. An OCR service sitting on an internal network with broad access to document stores is a high-value pivot point.
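As a small illustration of digest pinning, a deployment pipeline could refuse any image reference that is not pinned to a SHA-256 digest. The sketch below checks only the form of the reference string; the registry and image names in the usage example are made up, and signature verification remains a separate step.

```python
import re

# A digest-pinned reference ends in "@sha256:" followed by 64 lowercase hex chars.
DIGEST_REF = re.compile(r"[^\s@]+@sha256:[0-9a-f]{64}")


def is_digest_pinned(image_ref: str) -> bool:
    """Return True only if the reference is pinned to a SHA-256 digest."""
    return DIGEST_REF.fullmatch(image_ref) is not None
```

For example, `is_digest_pinned("registry.example.com/ocr:latest")` is false because a tag can be repointed, whereas a reference of the form `registry.example.com/ocr@sha256:<64 hex characters>` passes.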
Framework Mapping
| Framework | Technique or category | Relevance |
|---|---|---|
| MITRE ATLAS | AML.T0051 – LLM Prompt Injection | Adversarial text in documents injected into RAG context |
| MITRE ATLAS | AML.T0043 – Craft Adversarial Data | Crafted documents designed to manipulate OCR output |
| MITRE ATLAS | AML.T0057 – LLM Data Leakage | Sensitive document content flowing into improperly secured indexes |
| MITRE ATLAS | AML.T0010 – ML Supply Chain Compromise | Container image tampering for self-hosted deployments |
| OWASP LLM Top 10 (2023) | LLM01 – Prompt Injection | OCR-extracted content reaching LLM prompt context without sanitisation |
| OWASP LLM Top 10 (2023) | LLM06 – Sensitive Information Disclosure | PII and confidential document content entering search indexes |
| OWASP LLM Top 10 (2023) | LLM09 – Overreliance | Automated redaction or citation decisions based solely on confidence scores |
Threat Scenarios
Scenario 1: RAG poisoning via invoice spoofing. A threat actor submits a crafted invoice PDF to a financial services firm’s document processing portal. OCR 4 extracts the embedded adversarial instructions alongside the invoice data. These instructions enter the RAG index and subsequently steer the firm’s internal LLM assistant to surface incorrect financial data or redirect queries.
Scenario 2: Redaction bypass via confidence manipulation. A legal firm uses confidence scores to automate PII redaction, sending only low-confidence regions to a human reviewer. An attacker submits a contract with names formatted to score highly and with adversarial instructions placed in regions engineered to score above the threshold for automated handling. The instructions therefore pass unreviewed into a shared retrieval store.
Scenario 3: Container exfiltration pivot. A self-hosted OCR 4 container, misconfigured with excessive internal network permissions, is compromised through a known container escape vulnerability. The attacker gains access to the document ingestion queue and exfiltrates all documents submitted for OCR processing.
Defender Checklist
- Treat all OCR 4 output as untrusted input — apply the same sanitisation you would to user-supplied text before it enters any LLM prompt or retrieval index
- Implement content security rules that operate independently of OCR confidence scores for sensitive content decisions (PII, legal, financial)
- Pin container image digests; verify signatures before deployment; restrict container network egress to required endpoints only
- Audit what document sources feed your OCR ingestion pipeline — apply strict allowlisting of document origins where possible
- Log all OCR output with document provenance for post-incident analysis; treat the ingestion layer as a security boundary
- Test your RAG pipeline against adversarially crafted documents before production deployment of OCR 4
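
The provenance-logging item in the checklist can be sketched as a record that ties each OCR result back to the exact input bytes. The field names and the `upload://` URI scheme in the usage note are assumptions for illustration, not an established format.

```python
import hashlib
import json
from datetime import datetime, timezone


def provenance_record(source_uri: str, document_bytes: bytes, ocr_text: str) -> dict:
    """Build a log record linking OCR output to the document it came from."""
    return {
        "source_uri": source_uri,
        # Hash of the original file, so the exact input can be identified later.
        "document_sha256": hashlib.sha256(document_bytes).hexdigest(),
        # Hash of the extracted text, so later tampering in the index is detectable.
        "ocr_text_sha256": hashlib.sha256(ocr_text.encode("utf-8")).hexdigest(),
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }


def to_log_line(record: dict) -> str:
    """Serialise the record as one JSON line with stable key order."""
    return json.dumps(record, sort_keys=True)
```

A call such as `to_log_line(provenance_record("upload://portal/123", pdf_bytes, extracted_text))` yields one append-only log line per processed document, which is enough to trace an indexed chunk back to its source during an investigation.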