RAG-Anything: All-in-One RAG Framework

Papers with Code Trending Papers

Summary

RAG-Anything is a new open-source framework that enhances multimodal knowledge retrieval by integrating cross-modal relationships and semantic matching, outperforming existing methods on complex benchmarks.

Retrieval-Augmented Generation (RAG) has emerged as a fundamental paradigm for expanding Large Language Models beyond their static training limitations. However, a critical misalignment exists between current RAG capabilities and real-world information environments. Modern knowledge repositories are inherently multimodal, containing rich combinations of textual content, visual elements, structured tables, and mathematical expressions. Yet existing RAG frameworks are limited to textual content, creating fundamental gaps when processing multimodal documents. We present RAG-Anything, a unified framework that enables comprehensive knowledge retrieval across all modalities. Our approach reconceptualizes multimodal content as interconnected knowledge entities rather than isolated data types. The framework introduces dual-graph construction to capture both cross-modal relationships and textual semantics within a unified representation. We develop cross-modal hybrid retrieval that combines structural knowledge navigation with semantic matching. This enables effective reasoning over heterogeneous content where relevant evidence spans multiple modalities. RAG-Anything demonstrates superior performance on challenging multimodal benchmarks, achieving significant improvements over state-of-the-art methods. Performance gains become particularly pronounced on long documents where traditional approaches fail. Our framework establishes a new paradigm for multimodal knowledge access, eliminating the architectural fragmentation that constrains current systems. Our framework is open-sourced at: https://github.com/HKUDS/RAG-Anything.
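
The dual-graph and hybrid-retrieval ideas can be made concrete with a short sketch. The Python below is purely illustrative and is not the RAG-Anything API: the DualGraph class, the entity names, and the bag-of-words embedding are assumptions invented for this example, standing in for the paper's learned representations and richer relation types. It shows the two stages the abstract describes: semantic matching to find seed entities, then structural navigation so cross-modal neighbors (a cited table or figure) are returned alongside the matching text.

# Illustrative sketch only -- NOT the RAG-Anything API.
from collections import defaultdict
from dataclasses import dataclass
import math

@dataclass
class Entity:
    eid: str
    modality: str  # "text", "table", "figure", or "equation"
    content: str

class DualGraph:
    """Two overlaid graphs on one entity set: cross-modal relation edges
    (e.g. paragraph-cites-table) and textual semantic edges."""
    def __init__(self):
        self.entities: dict[str, Entity] = {}
        self.cross_modal: dict[str, set[str]] = defaultdict(set)
        self.semantic: dict[str, set[str]] = defaultdict(set)

    def add(self, e: Entity) -> None:
        self.entities[e.eid] = e

    def link(self, a: str, b: str, kind: str) -> None:
        g = self.cross_modal if kind == "cross_modal" else self.semantic
        g[a].add(b)
        g[b].add(a)

    def neighbors(self, eid: str) -> set[str]:
        # Unified view over both graphs -- the "unified representation".
        return self.cross_modal[eid] | self.semantic[eid]

def embed(text: str) -> dict:
    # Toy term-frequency "embedding"; a real system would use a neural
    # text/vision encoder per modality.
    v = defaultdict(float)
    for tok in text.lower().split():
        v[tok] += 1.0
    return v

def cosine(u: dict, v: dict) -> float:
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def hybrid_retrieve(g: DualGraph, query: str, k: int = 1) -> list:
    # Step 1, semantic matching: rank entities by embedding similarity.
    qv = embed(query)
    scores = {eid: cosine(qv, embed(e.content)) for eid, e in g.entities.items()}
    seeds = sorted(scores, key=scores.get, reverse=True)[:k]
    # Step 2, structural navigation: expand seeds along graph edges so
    # evidence in other modalities (tables, figures) rides along.
    hits = list(seeds)
    for s in seeds:
        hits.extend(n for n in g.neighbors(s) if n not in hits)
    return [g.entities[h] for h in hits]

# Toy corpus: a paragraph plus the table and figure it discusses.
g = DualGraph()
g.add(Entity("p1", "text", "accuracy improves on long documents"))
g.add(Entity("t1", "table", "doc length vs accuracy results"))
g.add(Entity("f1", "figure", "caption: accuracy versus document length"))
g.link("p1", "t1", "cross_modal")
g.link("p1", "f1", "cross_modal")

for e in hybrid_retrieve(g, "accuracy on long documents"):
    print(e.eid, e.modality)  # p1, then its linked table and figure

Even with only a text query, the table and figure are retrieved because they are graph neighbors of the best-matching paragraph; this is the behavior the paper attributes to combining structural knowledge navigation with semantic matching.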

Paper page - RAG-Anything: All-in-One RAG Framework

Source: https://huggingface.co/papers/2510.12323 (published Oct 14, 2025)

Links: arXiv (arxiv.org/abs/2510.12323) · PDF · GitHub repository (19.9k stars)

Get this paper in your agent:

hf papers read 2510.12323

Don’t have the latest CLI? Install it with: curl -LsSf https://hf.co/cli/install.sh | bash

Models, datasets, and Spaces citing this paper: none yet.

Collections including this paper: 37

Similar Articles

HKUDS/RAG-Anything

GitHub Trending (daily)

HKUDS released RAG-Anything, an open-source all-in-one multimodal retrieval-augmented generation framework based on LightRAG.

AgenticRAG: Agentic Retrieval for Enterprise Knowledge Bases

arXiv cs.AI

This paper introduces AgenticRAG, a framework from Microsoft that enhances enterprise knowledge base retrieval by equipping LLMs with tools for iterative search, document navigation, and analysis. It demonstrates significant improvements in recall and factuality over standard RAG pipelines on multiple benchmarks.

LightRAG: Simple and Fast Retrieval-Augmented Generation

Papers with Code Trending

The article introduces LightRAG, an open-source framework that enhances Retrieval-Augmented Generation by integrating graph structures for improved contextual awareness and efficient information retrieval.

Disco-RAG: Discourse-Aware Retrieval-Augmented Generation

arXiv cs.CL

Disco-RAG proposes a discourse-aware retrieval-augmented generation framework that integrates discourse signals through intra-chunk discourse trees and inter-chunk rhetorical graphs to improve knowledge synthesis in LLMs. The method achieves state-of-the-art results on QA and summarization benchmarks without fine-tuning.