document-extraction

@tom_doerr: Converts images and PDFs to Markdown without OCR https://github.com/NanoNets/docext

X AI KOLs Timeline · 6d ago

docext is an on-premises toolkit that uses vision-language models to convert images and PDFs to Markdown without OCR. It also introduces Nanonets-OCR-s, a compact 3B-parameter model for efficient image-to-Markdown conversion.
