ChartWalker: Benchmarking the Cross-Chart RAG Task
Summary
ChartWalker introduces a novel framework for cross-chart retrieval-augmented generation (RAG) using hierarchical knowledge graph construction and structure-aware sampling. It releases a challenging benchmark (ChartWalker-Bench) and an agentic baseline (ChartWalker-Agent), revealing significant performance gaps in current RAG paradigms.
Cached at: 06/24/26, 01:48 PM
Source: https://huggingface.co/papers/2606.23997
Abstract
ChartWalker presents a novel framework for cross-chart retrieval-augmented generation with hierarchical knowledge graph construction and structure-aware sampling for challenging multi-modal analytical tasks.
Cross-Chart Retrieval-Augmented Generation (RAG) is critical for complex multi-modal analytical tasks in scientific, business, and political domains. However, existing benchmarks either focus on tables, which are well-structured and textualized, or generate cross-chart questions by simply extracting key points, which often induces lexical overlap between queries and evidence and yields logically inconsistent reasoning chains. To address this, we introduce ChartWalker, a novel framework for constructing challenging cross-chart RAG tasks. ChartWalker features a hierarchical knowledge graph construction method tailored to charts, which organizes entities and relations by granularity to preserve analytical structure. We then propose a structure-aware sampling algorithm that synthesizes semantically coherent, multi-hop reasoning paths, enabling explicit control over query difficulty and granularity for QA generation. Built with this framework, we release ChartWalker-Bench, a comprehensive benchmark spanning diverse domains and cross-chart query types. Extensive evaluations across major RAG paradigms reveal significant performance gaps, underscoring the benchmark’s difficulty and utility. Furthermore, we provide ChartWalker-Agent, an agentic baseline to facilitate analysis and inspire future system design.
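The abstract does not spell out the graph schema or the sampling procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a toy graph in which each node carries a chart ID and a granularity level (0 = chart, 1 = series, 2 = data point), and it samples a simple path with a chosen number of hops that touches at least two charts. All names here (NODES, EDGES, build_adjacency, sample_path, max_level_jump) are hypothetical and introduced purely for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy graph, not taken from the paper.
# Each node maps to (chart_id, granularity_level):
# level 0 = whole chart, 1 = data series, 2 = single data point.
NODES = {
    "c1:chart": ("chart1", 0),
    "c1:series_gdp": ("chart1", 1),
    "c1:pt_2020": ("chart1", 2),
    "c2:chart": ("chart2", 0),
    "c2:series_gdp": ("chart2", 1),
    "c2:pt_2020": ("chart2", 2),
}

# Undirected edges: within-chart hierarchy plus assumed cross-chart links.
EDGES = [
    ("c1:chart", "c1:series_gdp"),
    ("c1:series_gdp", "c1:pt_2020"),
    ("c2:chart", "c2:series_gdp"),
    ("c2:series_gdp", "c2:pt_2020"),
    ("c1:series_gdp", "c2:series_gdp"),  # cross-chart link, series level
    ("c1:pt_2020", "c2:pt_2020"),        # cross-chart link, point level
]


def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def sample_path(adj, nodes, hops, max_level_jump=1, rng=None, max_tries=1000):
    """Sample a simple path with exactly `hops` edges spanning >= 2 charts.

    `hops` stands in for query difficulty and `max_level_jump` limits how far
    the granularity level may change per step. Both knobs are assumptions.
    Returns a list of node names, or None if no path is found.
    """
    rng = rng or random.Random()
    names = sorted(nodes)
    for _ in range(max_tries):
        path = [rng.choice(names)]
        while len(path) - 1 < hops:
            cur = path[-1]
            candidates = [
                n for n in sorted(adj[cur])
                if n not in path
                and abs(nodes[n][1] - nodes[cur][1]) <= max_level_jump
            ]
            if not candidates:
                break  # dead end, retry from a new start node
            path.append(rng.choice(candidates))
        charts = {nodes[n][0] for n in path}
        if len(path) - 1 == hops and len(charts) >= 2:
            return path
    return None


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    print(sample_path(adjacency, NODES, hops=3, rng=random.Random(0)))
```

In this sketch, raising hops lengthens the reasoning chain a generated question would have to follow, and the cross-chart check rejects paths that stay inside a single chart. A real pipeline would then turn each sampled path into a question and answer pair, a step not shown here.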
Similar Articles
ChartArena: Benchmarking Chart Parsing across Languages, Scenarios, and Formats
ChartArena is a comprehensive bilingual benchmark for chart parsing that evaluates models across eight chart families and three visual scenarios (digital, printed, hand-drawn), using a human-agent annotation pipeline and format-agnostic evaluation. Evaluations of 26 MLLMs reveal that while proprietary models lead overall, open-source models are catching up, and diagrammatic structures and hand-drawn scenarios remain challenging.
How Fine-Grained Should a RAG Benchmark Be? A Hierarchical Framework for Synthetic Question Generation
This paper introduces HieraRAG, a hierarchical framework for determining optimal granularity in RAG benchmarks. It generates 5,872 synthetic QA pairs across three dimensions and finds that ideal granularity varies by dimension, offering a portable procedure for practitioners.
A Unified Framework for Context-Aware and Relation-Aware Graph Retrieval-Augmented Generation
This paper proposes HyGRAG, a hierarchical graph RAG framework that integrates contextual and relational information for multi-hop reasoning, achieving a 9.7% average accuracy improvement over existing methods.
Graph-Augmented Retrieval for Cross-Entity Financial Sentiment Analysis: A Comparative Study
This paper presents a comparative study of Graph-RAG versus standard vector-only RAG for cross-entity financial sentiment analysis, finding statistically significant improvements in entity recall and answer relevancy at modest latency cost.
RAGA: Reading-And-Graph-building-Agent for Autonomous Knowledge Graph Construction and Retrieval-Augmented Generation
RAGA is an LLM-driven autonomous agent that constructs knowledge graphs via a read-search-verify-construct cognitive loop and integrates hybrid symbolic-vector retrieval for retrieval-augmented generation, with experimental gains on scientific QA datasets.