Tag
This arXiv preprint introduces GRALIS, a unified mathematical framework using Riesz Representation Theory to formalize and compare linear attribution methods like SHAP, LIME, and Integrated Gradients.
TPA proposes a novel method for detecting hallucinations in RAG systems by attributing next-token probabilities to seven distinct sources (Query, RAG Context, Past Token, Self Token, FFN, Final LayerNorm, Initial Embedding) and aggregating by Part-of-Speech tags. The approach achieves state-of-the-art performance across five LLMs including Llama2, Llama3, Mistral, and Qwen.