Releasing Chandra 2.1, an improved OCR model that is smaller, faster, and significantly better at handling complex tables and multilingual content, now live on the Datalab API.
Proposes Structure-Aware RAG (SA-RAG), which uses tables as an intermediate structured representation to reduce noise in retrieval-augmented generation for conversational agents. The approach combines quality-aware metadata generation with two table generation methods and outperforms existing baselines on noisy real-world datasets.