How are you handling aggregation/counting questions in doc-aware agents? RAG keeps failing me here
Summary
A developer discusses the limitations of RAG for aggregation and counting queries over collections of documents, and asks for community advice on alternative approaches like text-to-SQL and intent routing.
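To make the two alternatives named above concrete, here is a minimal sketch of intent routing in front of a SQL path. It is an illustration under stated assumptions, not the poster's setup: the keyword pattern, the `docs` table, its columns, and the helper names `route`, `build_demo_db`, and `count_docs` are all hypothetical. A real system would more likely use a classifier or an LLM call for routing, and generate the SQL rather than hard-code it.

```python
import re
import sqlite3

# Hypothetical keyword pattern hinting at an aggregation/counting intent.
AGG_PATTERN = re.compile(
    r"\b(how many|count|total|average|sum of|number of)\b", re.IGNORECASE
)


def route(question: str) -> str:
    """Return 'sql' for aggregation-style questions, otherwise 'rag'."""
    return "sql" if AGG_PATTERN.search(question) else "rag"


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory metadata table (assumed schema, demo rows only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE docs (id INTEGER PRIMARY KEY, doc_type TEXT, year INTEGER)"
    )
    conn.executemany(
        "INSERT INTO docs (doc_type, year) VALUES (?, ?)",
        [("contract", 2023), ("contract", 2024), ("invoice", 2024)],
    )
    return conn


def count_docs(conn: sqlite3.Connection, doc_type: str) -> int:
    """Count documents of one type with an exact, parameterized query."""
    row = conn.execute(
        "SELECT COUNT(*) FROM docs WHERE doc_type = ?", (doc_type,)
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_demo_db()
    question = "How many contracts do we have?"
    if route(question) == "sql":
        print(count_docs(conn, "contract"))
    else:
        print("fall back to retrieval over chunks")
```

The point of the split is that a count is computed over every row in the table, whereas top-k retrieval only ever sees the handful of chunks it returns, so it cannot reliably answer "how many" over a whole collection.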
Similar Articles
Wrote up the failure modes that kept breaking my RAG system: chunking, stale index, hybrid search, the works
A developer shares the failure modes encountered while debugging a RAG system, including issues with chunking, stale indices, and hybrid search, along with practical fixes like sliding window chunking and contextual retrieval.
Most agent RAG problems I see are retrieval problems, not model problems
The author argues that most agent RAG failures stem from retrieval problems rather than the LLM, specifically chunking errors, missing freshness signals, and reliance on pure vector search, and recommends structural chunking, decay-based ranking, and hybrid BM25+vector search.
Agentic Retrieval-Augmented Generation for Financial Document Question Answering
This paper introduces FinAgent-RAG, an agentic framework for financial document question answering that combines iterative retrieval, Program-of-Thought reasoning, and adaptive resource allocation to improve accuracy and reduce costs.
Most RAG apps in production are confidently wrong and nobody talks about this enough
The article highlights a critical failure mode in production RAG systems where confident but incorrect answers arise from versioning issues and lack of uncertainty mechanisms. It proposes architectural improvements like routing layers, retrieval scoring, and hallucination checks to mitigate these errors.
Adaptive Chunking: Optimizing Chunking-Method Selection for RAG
Introduces Adaptive Chunking, a framework using five intrinsic document metrics to select optimal chunking strategies for RAG, improving answer correctness from 62-64% to 72% and question resolution rate by over 30%.