document-ai

Tag

#document-ai

Operationalizing Document AI: A Microservice Architecture for OCR and LLM Pipelines in Production

arXiv cs.AI · 2026-05-20

This paper presents a microservice architecture for production document AI pipelines that combine classification, OCR, and LLM extraction. It shares design decisions and batch-profiling results showing that OCR, not LLM parsing, dominates latency.

#document-ai

PaddlePaddle/PaddleOCR

GitHub Trending (daily) · 2d ago

PaddleOCR is a lightweight OCR toolkit that converts PDFs and images into structured data for AI applications. It supports 100+ languages and is designed to bridge documents with LLMs.
