document-processing

#document-processing

@jerryjliu0: LiteParse, our OSS document parser, is really good at parsing complex PDF layouts, text, and tables into a clean spatia…

X AI KOLs Following · 2026-04-22

LiteParse is an open-source, heuristic-based PDF parser that quickly converts complex layouts, text, and tables into a clean spatial grid without relying on ML models.

AI-assisted Protocol Information Extraction For Improved Accuracy and Efficiency in Clinical Trial Workflows

arXiv cs.CL · 2026-04-20

Researchers from Banting Health AI present a system that combines generative LLMs with Retrieval-Augmented Generation (RAG) to automate information extraction from clinical trial protocols. It achieves 89% accuracy, compared to 62.6% for standalone LLMs, and AI-assisted workflows complete tasks 40% faster while reducing cognitive demand.

@jerryjliu0: A downside with using VLMs to parse PDFs is guaranteeing that the output text is correct and output in the correct re…

X AI KOLs Following · 2026-04-18

Jerry Liu discusses the challenges of using Vision Language Models (VLMs) for PDF parsing, particularly ensuring that the output text is correct, preserving the proper reading order, and avoiding hallucinations.

Turning contracts into searchable data at OpenAI

OpenAI Blog · 2025-09-29

OpenAI shares how it built an internal contract data agent that automates the extraction and structuring of contract data from various document formats while keeping finance experts in control through a human-in-the-loop review process. The system has reduced contract review time by half and enabled the team to process thousands of contracts monthly without proportional headcount expansion.

SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Papers with Code Trending · 2025-03-14

SmolDocling is a compact, 256M-parameter vision-language model designed for end-to-end multi-modal document conversion. It introduces DocTags, a new universal markup format that captures page elements together with their location, and it competes with models 27 times larger.
