@AdinaYakup: Unlimited-OCR New OCR from @PaddlePaddle It can parse hundreds of pages in a single pass while maintaining stable speed…

X AI KOLs Following Models

Summary

PaddlePaddle releases Unlimited-OCR, a new OCR model that uses Reference Sliding Window Attention (R-SWA) to keep the KV cache size constant during decoding. The post reports 93% on OmniDocBench and +6% over a baseline that it gives only as a link.


Cached at: 06/22/26, 05:38 PM

Unlimited-OCR 🔥New OCR from @PaddlePaddle

It can parse hundreds of pages in a single pass while maintaining stable speed.

The key idea is R-SWA (Reference Sliding Window Attention), which keeps KV cache constant during decoding.
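
To make the "constant KV cache" claim concrete, here is a minimal toy sketch in Python. It is not the Unlimited-OCR implementation: the post does not describe R-SWA's internals, so the split into a fixed set of "reference" entries plus a sliding window of recent decode steps is an assumption, and `BoundedKVCache`, `simulate_decode`, `num_reference`, and `window_size` are hypothetical names chosen for illustration. The sketch only shows why such a layout keeps the cache size bounded no matter how many tokens are decoded.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class BoundedKVCache:
    """Toy cache: fixed reference entries plus a sliding window of recent entries.

    Assumption: this layout is an illustration only, not the published R-SWA design.
    """
    reference: list        # entries kept for the whole decode (hypothetical)
    window_size: int       # hypothetical number of recent decode steps to keep
    window: deque = field(init=False)

    def __post_init__(self):
        # A deque with maxlen silently drops its oldest item once it is full.
        self.window = deque(maxlen=self.window_size)

    def append(self, kv_entry):
        """Record the key/value entry produced by one decode step."""
        self.window.append(kv_entry)

    def visible(self):
        """Entries the next decode step could attend to."""
        return list(self.reference) + list(self.window)

    def __len__(self):
        return len(self.reference) + len(self.window)


def simulate_decode(num_steps, num_reference=4, window_size=8):
    """Run a fake decode loop and record the cache size after every step."""
    cache = BoundedKVCache(
        reference=[f"ref{i}" for i in range(num_reference)],
        window_size=window_size,
    )
    sizes = []
    for step in range(num_steps):
        cache.append(f"tok{step}")  # stand-in for a real key/value tensor pair
        sizes.append(len(cache))
    return cache, sizes


if __name__ == "__main__":
    cache, sizes = simulate_decode(num_steps=100)
    # The size grows until the window fills, then stays flat.
    print(sizes[0], max(sizes), sizes[-1])
```

In this toy setup the cache never holds more than `num_reference + window_size` entries, whereas a full-attention cache would grow by one entry per decoded token.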

🏆 93% on OmniDocBench 📈 +6% over https://t.co/uuXPUhL22L

Similar Articles

Unlimited OCR Works

Hugging Face Daily Papers

Unlimited OCR introduces Reference Sliding Window Attention to eliminate growing memory consumption in long-sequence OCR tasks, enabling efficient transcription of multiple pages in a single forward pass.

PaddlePaddle/PaddleOCR

GitHub Trending (daily)

PaddleOCR is a powerful, lightweight OCR toolkit that converts PDFs and images into structured data for AI applications, supporting 100+ languages and designed to bridge documents with LLMs.

@GoSailGlobal: Current OCR processes multi-page documents page by page. Every time you turn a page, memory is reset. Today, Baidu quietly open-sourced a model on GitHub and HuggingFace called Unlimited OCR, inspired by how humans copy books: - When copying a book, you don't reread hundreds of pages every time you write a word...

X AI KOLs Timeline

Baidu has open-sourced the Unlimited OCR model, which uses a Reference Sliding Window Attention (R-SWA) mechanism to parse documents with a context of up to 32K in a single pass, removing the need for page-by-page inference.