@datalabto: Releasing Chandra 2.1 — smaller, faster, and significantly better on the two things hardest for OCR models to get right…
Summary
Datalab is releasing Chandra 2.1, an improved OCR model that is smaller, faster, and significantly better at handling complex tables and multilingual content. It is live on the Datalab API.
Cached at: 06/15/26, 11:08 PM
Releasing Chandra 2.1 — smaller, faster, and significantly better on the two things hardest for OCR models to get right: complex tables and multilingual. Live on the Datalab API today. Details ↓ https://t.co/1bwACqGD0S
Similar Articles
@VikParuchuri: In their OCR 4 launch this week, Mistral shared a significantly lower score for Chandra 2 than you get from our repo or…
Mistral launched OCR 4, but this tweet points out that the launch reported a significantly lower score for Chandra 2 than the model's own repo produces, and that it omitted Infinity Parser (87.6%) from its olmocr comparison.
@techNmak: A lightweight VLM that beats the giants at OCR. (1.7B parameters, SOTA on OmniDocBench) dots.ocr is a new multilingual…
dots.ocr is a new lightweight 1.7B parameter multilingual vision-language model that achieves state-of-the-art performance on OmniDocBench, outperforming much larger models (72B+) at document parsing and OCR tasks.
@PaddlePaddle: PP-OCRv6 Tech Deep Dive Ep.1: In the Era of Large Models, Why Does Lightweight OCR Still Have Irreplaceable Value? — PP…
PP-OCRv6 is a lightweight OCR model (34.5M parameters) that challenges large VLMs with its MetaFormer architecture, offering efficient text detection and recognition across multiple deployment scenarios.
@vanstriendaniel: It's raining OCR models again! @Baidu_Inc's Unlimited-OCR is one of the more interesting. You can try it without much e…
This post shows how to serve Baidu's Unlimited-OCR model as a temporary, OpenAI-compatible endpoint on Hugging Face Jobs, enabling multi-page document parsing with features like table-to-HTML and equation-to-LaTeX extraction.
🚀 PP-OCRv6 is officially released!
PaddleOCR releases PP-OCRv6, a new OCR model series ranging from 1.5M to 34.5M parameters. It offers improved accuracy and faster inference, supports 50 languages, and covers new scenarios such as PCB and CAD drawings, under the Apache 2.0 open-source license.