tool-output

Beyond Document Grounding: Span-Level Hallucination Detection over Code, Tool Output, and Documents

arXiv cs.CL · yesterday

This paper introduces a unified benchmark for span-level hallucination detection in RAG systems that extends beyond natural language to code, tool output, and structured documents. It also presents a fine-tuned Qwen3.5-2B detector that outperforms existing methods on these new domains while remaining competitive on standard NLP benchmarks.
