OCR 4 is Mistral AI's latest document understanding model, designed to deliver structured information rather than simple text extraction. In addition to extracted content, the model provides bounding boxes, block classification, and confidence scores for every page and word, allowing downstream systems to understand both document content and its layout. The release also brings support for 170 languages across 10 language groups while maintaining compatibility with formats including PDF, DOC, PPT, and OpenDocument.
🔑 Key Highlights
- OCR 4 supports 170 languages across 10 language groups
- Model returns text, bounding boxes, and confidence scores
- Self-hosted deployment runs in a single container
- API pricing starts at $4 per 1,000 pages
- Document AI adds structured output through the same API
The company positioned OCR 4 as an ingestion component for enterprise search, Retrieval-Augmented Generation (RAG), and domain-specific retrieval workflows. The model integrates with the public preview of the Mistral Search Toolkit, supplying structured outputs for ingestion, retrieval, and evaluation tasks. Organizations can deploy the compact model within a single container for self-hosted environments, while developers may access it through an API or use Document AI in Mistral Studio for a no-code implementation built on the same engine.
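To illustrate how an ingestion pipeline might use per-block confidence scores, here is a minimal sketch that routes low-confidence blocks to review before indexing. The `Block` class, its field names, the 0-to-1 confidence scale, and the 0.8 threshold are all assumptions made for illustration; they are not Mistral's actual response schema.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for one OCR block; NOT the real API schema."""
    text: str
    block_type: str                          # e.g. "paragraph", "table" (assumed labels)
    confidence: float                        # assumed to lie in [0.0, 1.0]
    bbox: Tuple[float, float, float, float]  # assumed (x0, y0, x1, y1)


def split_by_confidence(
    blocks: List[Block], threshold: float = 0.8
) -> Tuple[List[Block], List[Block]]:
    """Return (accepted, needs_review), preserving the original block order."""
    accepted: List[Block] = []
    needs_review: List[Block] = []
    for block in blocks:
        # Blocks at or above the threshold go straight to indexing.
        if block.confidence >= threshold:
            accepted.append(block)
        else:
            needs_review.append(block)
    return accepted, needs_review


# Made-up sample data, purely for demonstration.
sample = [
    Block("Invoice total: 1,240.00", "paragraph", 0.97, (72.0, 90.0, 300.0, 110.0)),
    Block("Sigmature: J. D0e", "paragraph", 0.41, (72.0, 700.0, 260.0, 720.0)),
]
accepted, needs_review = split_by_confidence(sample)
```

In a real pipeline the threshold would be tuned on the organization's own documents, in line with the company's advice to evaluate on in-house collections.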
Mistral AI reported that independent annotators preferred OCR 4 over every competing OCR and document AI system included in its evaluation, with an average win rate of 72%. The company also stated that OCR 4 achieved the highest overall score on OlmOCRBench, 85.20, and scored 93.07 on OmniDocBench. At the same time, it cautioned that benchmark results should be treated as directional, because scoring methods can penalize correct outputs as a result of annotation errors or differences in formatting, mathematical notation, reading order, and document segmentation.
According to the company, OCR 4 also led its internal multilingual evaluation across eight language groups: English, Western Europe, Eastern Europe, Middle Eastern, Chinese, East Asian, Southeast Asian, and specialized languages. Mistral AI highlighted stronger performance on specialized and low-resource languages, where other systems may lose accuracy. Customer testimonials included claims of lower cost, faster processing, and improved throughput compared with existing document parsing providers, although the company recommended that organizations evaluate the model on their own document collections.
The release also introduces flexible deployment and pricing options. OCR 4 is available through a single API endpoint that always returns extracted text, block types, confidence scores, Markdown formatting, and bounding boxes. Users seeking structured JSON, image annotation, or customized document interpretation can activate Document AI capabilities through additional API parameters without changing the underlying OCR engine. OCR 4 is priced at $4 per 1,000 pages, or $2 per 1,000 pages through the Batch API, while Document AI costs $5 per 1,000 pages. The services are available through Mistral Studio, Amazon SageMaker, and Microsoft Foundry, and will also be available through Snowflake Parse Document.
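For a rough sense of what the quoted rates mean at volume, the sketch below compares the three price points. It assumes simple pro-rata billing with no minimums, rounding rules, or volume discounts, none of which the announcement specifies; the tier names and the `estimate_cost` function are illustrative only.

```python
from decimal import Decimal

# USD per 1,000 pages, as quoted in the article.
PRICE_PER_1000_PAGES = {
    "ocr": Decimal("4"),
    "ocr_batch": Decimal("2"),
    "document_ai": Decimal("5"),
}


def estimate_cost(pages: int, tier: str) -> Decimal:
    """Estimate cost in USD, assuming linear pro-rata billing (an assumption)."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return PRICE_PER_1000_PAGES[tier] * pages / 1000


# Example: a 250,000-page archive at each price point.
for tier in PRICE_PER_1000_PAGES:
    print(f"{tier}: ${estimate_cost(250_000, tier)}")
```

Under these assumptions, a 250,000-page archive would cost about $1,000 through the standard API, $500 through the Batch API, and $1,250 with Document AI.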
📊 What This Means (Our Analysis)
OCR 4 represents a broader shift from document transcription toward structured document understanding. By combining extracted text with layout information, confidence measurements, and content classification, the platform enables organizations to build workflows that rely on richer document context instead of plain text alone, making enterprise search, retrieval, and automation more consistent.
The addition of flexible deployment choices also broadens adoption options for different organizations. A single OCR engine that supports direct extraction, structured outputs, and self-hosted environments allows development teams to choose workflows that fit operational, compliance, and scaling requirements without changing the underlying document processing foundation.
📌 Our Take: Structured document intelligence is increasingly becoming as important as text extraction itself, and OCR 4 reflects that direction through a unified processing approach.