pdf

#pdf

A PDF that changes based on who is reading

Hacker News Top ↗ · 2d ago Cached

This article presents a technique to embed hidden markdown structure inside PDFs using the PDF spec's replacement text property, enabling LLMs to extract clean, structured data while humans see the same visual document.
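As a rough illustration of the mechanism: the PDF specification lets a marked-content sequence carry an `/ActualText` entry, which text extractors may use in place of the glyphs actually drawn. The sketch below builds such a content-stream fragment. The helper names, the font resource `F1`, and the coordinates are assumptions for illustration, not the article's code, and the sketch handles ASCII text only.

```python
def pdf_literal(s: str) -> str:
    """Encode an ASCII string as a PDF literal string, escaping special characters."""
    escaped = (
        s.replace("\\", "\\\\")  # backslash first, so later escapes are not doubled
        .replace("(", "\\(")
        .replace(")", "\\)")
        .replace("\n", "\\n")
    )
    return f"({escaped})"


def span_with_actual_text(visible: str, replacement: str,
                          font: str = "F1", size: int = 12,
                          x: int = 72, y: int = 720) -> str:
    """Return a content-stream fragment that draws `visible` but tags it
    with `replacement` as its /ActualText (hypothetical helper)."""
    return "\n".join([
        f"/Span << /ActualText {pdf_literal(replacement)} >> BDC",  # begin marked content
        "BT",                            # begin text object
        f"/{font} {size} Tf",            # select font and size
        f"{x} {y} Td",                   # move to the text position
        f"{pdf_literal(visible)} Tj",    # draw the glyphs a human reader sees
        "ET",                            # end text object
        "EMC",                           # end marked content
    ])


if __name__ == "__main__":
    print(span_with_actual_text("Results", "# Results"))
```

Whether a given extractor honors `/ActualText` depends on the tool, so this is best read as a sketch of the idea rather than a guaranteed behavior.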


@neural_avb: One of my fav Paper Breakdown feature is to be able to jump to the exact location of the PDF that the LLM gathered it's…

X AI KOLs Timeline ↗ · 4d ago Cached

The tweet highlights a Paper Breakdown feature that allows users to jump to the exact location in a PDF where an LLM gathered information, providing distilled answers with direct paragraph links for single or multi-paper sessions.


Show HN: Extend UI – open-source UI kit for modern document apps

Hacker News Top ↗ · 4d ago Cached

Extend UI is an open-source UI kit for modern document apps, providing viewers for PDF, DOCX, XLSX, and CSV files, along with features like bounding box citations, file upload, and e-signing.


@kushalbyatnal: Introducing Extend UI — open-source components for document agents - 14 components & examples for PDF, DOCX, and XLSX v…

X AI KOLs Timeline ↗ · 5d ago Cached

Extend UI is an open-source library of 14 UI components for document agents, including viewers for PDF, DOCX, XLSX, with features like bounding box citations, file upload, and e-signature. It is MIT licensed and available on the shadcn component registry.


@mnmn94253156337: Let AI make a PPT for you, and it gives you a bunch of divs and a randomly laid out layout. Click to open, and it looks uglier than what you would have done yourself. What's more annoying is Excel—formulas are written incorrectly, formatting is all over the place, and after generation, you have to manually fix everything from start to finish. You might as well do it yourself. MiniMax open-sourced these four doc…

X AI KOLs Timeline ↗ · 2026-06-08 Cached

MiniMax open-sourced four AI document generation skills (PPT, PDF, Excel, Word), usable without an API key, aiming to solve issues like messy formatting and formula errors in AI-generated documents.


@vintcessun: Get It turns PDFs into interactive knowledge graphs and visualization engines, not just another summary tool. It detects concepts one by one, generating 3D/animations/formula visualizations for each keyword, then uses flashcards, quizzes, Feynman teaching and other tools to close the loop and assess depth of understanding. Each concept is scored across four dimensions—memory, understanding, structure, application—and can only go up...

X AI KOLs Timeline ↗ · 2026-06-07 Cached

Get It turns PDFs into interactive knowledge graphs and visualizations, using concept detection and multi-format rendering to help students deeply understand material. It runs locally using the user's own ChatGPT account.


I got tired of AI making stuff up about my PDFs, so I built something that actually cites its sources

Reddit r/artificial ↗ · 2026-06-07

A solo developer built Athena Wisdom, a free tool that answers questions about uploaded PDFs and other documents with explicit source citations, ensuring accuracy and transparency.


@GitHub_Daily: An open-source learning tool discovered on GitHub: Get It, which helps us deeply learn PDF content in multiple ways. Automatically annotates key concepts on PDF files, and can convert them into visual content such as 3D models, animations, formula derivations, etc., while generating a knowledge graph. GitHub…

X AI KOLs Timeline ↗ · 2026-06-07 Cached

Get It is an open-source learning tool that automatically annotates key concepts in PDFs, converts them into visual content such as 3D models and animations, and generates a knowledge graph. It supports study methods such as conversational Q&A and flashcards.


@grgerwcwetwet: Chinese parents recommend bookmarking this GitHub project: ChinaTextbook. Someone has compiled textbooks from primary school to university in China into PDFs, open-sourced and free to download. It's very convenient for finding textbooks, previewing, reviewing, and supplementing materials for children. The coverage is comprehensive: Primary school: Grades 1-6 all subjects, including editions for the 5-4 school system (五四学制). Middle school: Grades 7-9…

X AI KOLs Timeline ↗ · 2026-06-05 Cached

ChinaTextbook is a GitHub open-source project that organizes textbook PDFs from primary school to university for free download, making it convenient for parents and students to access digital textbooks.


@mdancho84: This 277-page PDF unlocks the secrets of Large Language Models. Here's what's inside:

X AI KOLs Timeline ↗ · 2026-06-05 Cached

A 277-page PDF guide revealing insights into Large Language Models, shared via a Twitter thread by Matt Dancho.


@ericzakariasson: cursor in slack can now read documents attached in the thread, including .txt, .log, .json, .zip, .pdf, or .docx files!

X AI KOLs Following ↗ · 2026-06-02 Cached

Cursor can now read documents attached in Slack threads, supporting formats like .txt, .log, .json, .zip, .pdf, and .docx.


@NFTCPS: Attention bookworms! Those tech books gathering dust on your shelf finally have a purpose. A new open-source tool called book-to-skill just blew up on GitHub, racking up over 2700 Stars. Its approach is wild: just drop in a PDF or EPUB, it automatically extracts the table of contents, core concepts, and patterns, and generates a skill with one click. Later, just type "/书名技能" (roughly "/book-title skill") plus your topic, and it will flip through the book for you.

X AI KOLs Timeline ↗ · 2026-06-01 Cached

The open-source tool book-to-skill on GitHub converts PDF/EPUB tech books into Claude Code skills, generating a table of contents, core concepts, and patterns with one click, turning dusty books into a personal on-demand consultant.


PDFs in your workflow are burning around 3x your tokens; save them for free using Microsoft's MarkItDown

Reddit r/AI_Agents ↗ · 2026-05-31

Microsoft's MarkItDown tool converts PDFs to Markdown, which saves tokens and cost when feeding documents to AI models like Claude. It requires caution with scanned PDFs, charts, and complex tables.


@kushalbyatnal: Over 1 billion PDFs are created every day, but your agents still can’t read them reliably. Today we’re releasing Parse …

X AI KOLs Following ↗ · 2026-05-26 Cached

Extend released Parse 2.0, a state-of-the-art document parsing API that achieves top accuracy on real-world documents, outperforming competitors on the open-source RealDoc-Bench benchmark.


@leopardracer: THIS PERSON READ A 134-PAGE BOOK IN 15 MINUTES AND DIDN'T HIGHLIGHT ONCE BUT REMEMBERS ALL OF IT he dropped the PDF and…

X AI KOLs Timeline ↗ · 2026-05-24 Cached

A user describes how someone used an AI-powered tool to read a 134-page book in 15 minutes, generating atomic notes and flashcards without highlighting a single passage. The post frames this as the productivity gap that comes from using proper infrastructure.


@wsl8297: Want to turn ebooks or documents into audiobooks? Many tools sound too robotic or lack subtitle sync, leaving you frustrated. Then I found the open-source project Abogen: it supports ePub, PDF, plain text, etc., one-click conversion to high-quality audio with auto-generated synchronized subtitles. It uses Kokoro voice at its core…

X AI KOLs Timeline ↗ · 2026-05-24 Cached

Abogen is an open-source tool that can convert documents like ePub and PDF into high-quality audio with one click, automatically generating synchronized subtitles. It supports a voice mixer and multiple deployment methods.


How to parse tables from PDFs

Reddit r/AI_Agents ↗ · 2026-05-24

Advice on parsing tables from PDFs: convert the pages to PNGs and run them through Gemini 3.1 Pro with low thinking, which the post claims reaches 95% accuracy. According to the post, other tools such as Extend, Reducto, and Landing perform poorly on this task.


@knowledgefxg: Practical Open-Source Tool Recommendation: pdf-inspector solves a very real problem: not all PDFs need OCR. For example, you throw a PDF at it, and it first determines what type of PDF it is—whether it's a normal text-based version (e.g., exported from Word) or a scanned version (image)…

X AI KOLs Timeline ↗ · 2026-05-22 Cached

pdf-inspector is an open-source Rust library for intelligently classifying PDF types (text or scanned), extracting text, and converting to Markdown, avoiding unnecessary OCR to improve speed and save costs.
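The text-versus-scanned decision can be pictured with a simple heuristic: a page with almost no embedded text that is mostly covered by an image is probably a scan. The Python sketch below is an illustration under that assumption, not pdf-inspector's actual algorithm or API. The `PageStats` fields, the thresholds, and the extra "mixed" label are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageStats:
    text_chars: int        # characters found in the page's embedded text layer
    image_coverage: float  # fraction of the page area covered by images (0.0 to 1.0)


def classify_pdf(pages: List[PageStats],
                 min_chars: int = 50,
                 scan_coverage: float = 0.8) -> str:
    """Label a document "text", "scanned", or "mixed" (assumed thresholds)."""
    if not pages:
        raise ValueError("document has no pages")
    # A page counts as scanned when it has little text and is mostly image.
    scanned = sum(
        1 for p in pages
        if p.text_chars < min_chars and p.image_coverage >= scan_coverage
    )
    if scanned == 0:
        return "text"      # every page has a usable text layer: skip OCR
    if scanned == len(pages):
        return "scanned"   # no page has usable text: OCR the whole file
    return "mixed"         # OCR only the pages flagged as scanned


if __name__ == "__main__":
    print(classify_pdf([PageStats(1200, 0.0), PageStats(0, 1.0)]))
```

Only the pages flagged as scanned would then need OCR, which is where the speed and cost savings described above would come from.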


@FinanceYF5: How to Maximize the Value of Codex 1/ Jason Liu Redefines Knowledge Work with Codex Codex team DX engineer, Instructor founder jason says: Codex is not just about writing code. It has become a hub that crosses tool boundaries to handle slides, PDFs, ...

X AI KOLs Following ↗ · 2026-05-18 Cached

Jason Liu shared how to use Codex as a central hub for knowledge work that spans tools such as slides, PDFs, and spreadsheets.


@VincentLogic: What's the biggest headache in RAG? Not the AI model, it's document parsing! Converting PDF, Word, and PPT to Markdown is a mess, with tables and formulas all over the place... Recently tried MinerU 3.1, it's amazing! One-click conversion, perfect format preservation, auto-identification of tables, formulas, images...

X AI KOLs Timeline ↗ · 2026-05-15 Cached

A recommendation for MinerU 3.1, a document parsing tool that converts PDF, Word, PPT, and other formats to Markdown while preserving formatting. According to the post, it auto-identifies tables, formulas, and images, offers three modes (Pipeline/VLM), and is open-source and commercially usable.
