PDFMathTranslate: Scientific Document Translation Preserving Layouts
Summary
This paper introduces PDFMathTranslate, an open-source tool for translating scientific documents while preserving their original layout, leveraging large language models and precise layout detection.
Source: https://huggingface.co/papers/2507.03009
Abstract
PDFMathTranslate enables layout-preserving scientific document translation using large language models and precise layout detection, offering improved precision, flexibility, and efficiency.
Language barriers in scientific documents hinder the diffusion and development of science and technologies. However, prior efforts in translating such documents largely overlooked the information in layouts. To bridge the gap, we introduce PDFMathTranslate, the world’s first open-source software for translating scientific documents while preserving layouts. Leveraging the most recent advances in large language models and precise layout detection, we contribute to the community with key improvements in precision, flexibility, and efficiency. The work has been open-sourced at https://github.com/byaidu/pdfmathtranslate with more than 222k downloads.
Similar Articles
@jerryjliu0: LiteParse, our OSS document parser, is really good at parsing complex PDF layouts, text, and tables into a clean spatia…
LiteParse is an open-source, heuristic-based PDF parser that quickly converts complex layouts, text, and tables into a clean spatial grid without relying on ML models.
@AIExplorerTim: Someone just released a tool that converts PDFs into clean, structured Markdown at speeds up to 100 pages/second. No GPU required. No API costs. No messy parsing. Just raw, usable data. It handles with ease: • Tables → Perfectly ex…
OpenDataLoader is an open-source tool that converts PDFs into structured Markdown and JSON, supporting local processing speeds of up to 100 pages/second without requiring a GPU or incurring API costs, designed specifically for RAG pipelines and PDF accessibility automation.
@tom_doerr: Converts images and PDFs to Markdown without OCR https://github.com/NanoNets/docext
docext is an on-premises toolkit that converts images and PDFs to markdown without OCR, leveraging vision-language models. It also introduces Nanonets-OCR-s, a compact 3B parameter model for efficient image-to-markdown conversion.
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion
SmolDocling is a compact 256M parameter vision-language model designed for end-to-end multi-modal document conversion. It introduces a new universal markup format called DocTags to capture page elements with location, competing with models 27 times larger.
Local manga translator with LLM build-in, written in Rust with llama.cpp integration
Koharu is an open-source Rust-based manga/image translator that combines object detection, visual LLM OCR, layout analysis, and inpainting, with llama.cpp integration supporting Gemma 4 and Qwen3.5 models.