Query-Adaptive Semantic Chunking for Retrieval-Augmented Generation: A Dynamic Strategy with Contextual Window Expansion
Summary
Proposes a query-adaptive semantic chunking method for retrieval-augmented generation that dynamically adjusts chunk boundaries using contextual window expansion to improve retrieval precision.
Source: [https://arxiv.org/abs/2605.22834](https://arxiv.org/abs/2605.22834)
Similar Articles
Adaptive Chunking: Optimizing Chunking-Method Selection for RAG
Introduces Adaptive Chunking, a framework using five intrinsic document metrics to select optimal chunking strategies for RAG, improving answer correctness from 62-64% to 72% and question resolution rate by over 30%.
In-Context Optimization for Retrieval-Augmented Generation: A Gradient-Descent Perspective
This paper studies retrieval-augmented generation as an in-context optimization process, showing that linear self-attention can implement gradient descent on a unified RAG objective. It proposes a lightweight method for frozen RAG LLMs that predicts context-conditioned updates, improving performance across multiple QA benchmarks.
Web Retrieval-Aware Chunking (W-RAC) for Efficient and Cost-Effective Retrieval-Augmented Generation Systems
W-RAC introduces a cost-efficient chunking framework for web document processing in RAG systems that reduces LLM token usage by an order of magnitude through structured content representation and retrieval-aware grouping decisions. The method decouples text extraction from semantic chunk planning, achieving comparable or better retrieval performance than traditional chunking approaches while minimizing hallucination risks.
Chunking German Legal Code
This paper evaluates various chunking strategies for retrieval-augmented generation on German legal code, finding that structure-aligned methods like section-based retrieval outperform more complex approaches.
Evaluation of Chunking Strategies for Effective Text Embedding in Low-Resource Language on Agricultural Documents
This paper evaluates four text chunking strategies for Retrieval-Augmented Generation on Khmer agricultural documents, finding that character-based Recursive chunking with 300 characters yields the best retrieval and relevance performance.