@noctus91: Mistral OCR 4 reading a handwritten Henri Poincaré letter from 1905. Historical manuscripts usually break OCR models. T…
Summary
Mistral AI releases Mistral OCR 4, which can read historical handwritten manuscripts and provides bounding boxes, block classification, and inline confidence scores, with support for 170 languages.
Cached at: 06/24/26, 08:00 AM
Mistral OCR 4 reading a handwritten Henri Poincaré letter from 1905.
Historical manuscripts usually break OCR models. This one held up well.
Solid release from @MistralAI.
Another one
need to test the new @Baidu_Inc ocr model
Maybe, but I’d be surprised if this specific document made it into the training data.
that’s a great idea tbh.
Similar Articles
Mistral OCR 4
Mistral AI releases Mistral OCR 4, a compact document intelligence model that provides bounding boxes, block classification, and inline confidence scores for structured text extraction. It supports 170 languages, runs in a single container for self-hosted deployment, and integrates with the Mistral Search Toolkit for enterprise search and RAG pipelines.
@atomic_chat_hq: Mistral OCR 4 turned a handwritten calculus exam into clean LaTeX! We gave it a photo of a hand-written exam page. The …
Mistral OCR 4 converts handwritten calculus exams into clean LaTeX, accurately reading formulas and accounting for graphs, though it does not redraw them. The model provides structured output with bounding boxes and confidence scores in 170 languages.
@stevibe: Mistral OCR 4 just dropped with bounding boxes (their most-requested feature) so I plugged it into my form-filling test…
Mistral OCR 4 has been released with bounding boxes, a highly requested feature. The user tested it for form filling and found it works well, though not perfectly.
Find the best open-source OCR models in one place at Papers with Code [P]
A curated page on Papers with Code lists top open-source OCR models and benchmarks, highlighting new releases from Baidu (Unlimited OCR) and Mistral (OCR 4), aimed at enabling AI agent use cases like RAG.
@techNmak: A lightweight VLM that beats the giants at OCR. (1.7B parameters, SOTA on OmniDocBench) dots.ocr is a new multilingual…
dots.ocr is a new lightweight 1.7B parameter multilingual vision-language model that achieves state-of-the-art performance on OmniDocBench, outperforming much larger models (72B+) at document parsing and OCR tasks.