The New York Times issued a correction after discovering an AI tool generated a false quote attributed to Canadian politician Pierre Poilievre, highlighting the risks of relying on AI for news reporting.
This paper introduces DataDignity, a framework for pinpoint provenance, together with FakeWiki, a benchmark for the task: identifying the specific training-data sources that support an LLM's response. It proposes two methods, ScoringModel and SteerFuse, which improve attribution accuracy over standard retrieval baselines.
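As a rough illustration of the kind of retrieval baseline such attribution methods are compared against (not the paper's ScoringModel or SteerFuse, whose details aren't given here), a minimal sketch that ranks candidate training snippets by lexical similarity to a model response; the corpus and query are invented:

```python
# Minimal retrieval-style provenance baseline: rank candidate training
# snippets by TF-IDF similarity to the model response and return the
# top-k as putative sources. Illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def attribute_sources(response: str, corpus: list[str], k: int = 3):
    """Rank candidate training snippets by similarity to the response."""
    vec = TfidfVectorizer().fit(corpus + [response])
    scores = cosine_similarity(vec.transform([response]),
                               vec.transform(corpus))[0]
    ranked = sorted(enumerate(scores), key=lambda t: -t[1])
    return [(corpus[i], float(s)) for i, s in ranked[:k]]

corpus = [
    "The Eiffel Tower was completed in 1889 for the World's Fair.",
    "Photosynthesis converts light energy into chemical energy.",
    "Gustave Eiffel's firm designed the tower's iron lattice.",
]
print(attribute_sources("The Eiffel Tower opened in 1889.", corpus, k=2))
```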
This paper introduces PrimeFacts, a methodology and resource for extracting fine-grained evidence from fact-checking articles using large language models. The extracted premises improve evidence retrieval and claim verification performance by up to 30% in MRR and 10-20 points in Macro-F1.
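To make the premise-extraction step concrete, a hedged sketch of prompting an LLM to decompose a fact-checking article into atomic premises; the prompt wording, model name, and output parsing are assumptions, not the paper's actual pipeline:

```python
# Sketch of LLM-based premise extraction in the spirit of PrimeFacts:
# decompose a fact-checking article into self-contained evidence premises.
# Prompt, model choice, and parsing are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def extract_premises(article_text: str) -> list[str]:
    prompt = (
        "List the distinct, self-contained factual premises stated in the "
        "fact-checking article below, one per line.\n\n" + article_text
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
    )
    lines = resp.choices[0].message.content.splitlines()
    return [ln.strip("- ").strip() for ln in lines if ln.strip()]
```

Each extracted premise can then serve as a finer-grained retrieval unit than the full fact-checking article.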
This position paper argues that audio misinformation on platforms like podcasts and WhatsApp voice notes is structurally different from text-based misinformation, carrying unique persuasive properties through prosody and conversational dynamics that existing fact-checking pipelines fail to address. The authors call for a rethinking of verification pipelines tailored to the spoken and conversational nature of audio media.
Researchers present the first benchmark for multimodal claim extraction from social media, evaluating state-of-the-art multimodal LLMs and introducing MICE, an intent-aware framework that improves handling of rhetorical intent and contextual cues in combined text-image posts.
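As a loose illustration of what intent-aware extraction might look like with an off-the-shelf multimodal LLM (a generic prompting sketch, not the MICE framework itself; the model and prompt are assumptions):

```python
# Illustrative intent-aware claim extraction from a text+image post using
# a multimodal chat API. Not the MICE implementation; prompt and model
# choice are assumptions.
from openai import OpenAI

client = OpenAI()

def extract_claims(post_text: str, image_url: str) -> str:
    messages = [{
        "role": "user",
        "content": [
            {"type": "text", "text": (
                "First infer the rhetorical intent of this social media "
                "post (assertion, sarcasm, satire, question), then list "
                "only the verifiable factual claims it makes, using both "
                "the text and the image.\n\nPost text: " + post_text)},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }]
    resp = client.chat.completions.create(model="gpt-4o", messages=messages)
    return resp.choices[0].message.content
```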
This paper introduces FRANQ, a method for detecting hallucinations in Retrieval-Augmented Generation (RAG) systems by applying distinct uncertainty quantification techniques to distinguish between factuality and faithfulness to retrieved context. The authors construct a new dataset annotated for both factuality and faithfulness, and demonstrate that FRANQ outperforms existing approaches in detecting factual errors across multiple datasets and LLMs.
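The core idea, applying separate uncertainty signals to faithfulness and to factuality, can be sketched with crude proxies (token grounding in the retrieved context for faithfulness, self-consistency across resampled answers for factuality); FRANQ's actual estimators differ:

```python
# Toy two-signal hallucination check in the spirit of FRANQ: one score for
# faithfulness to the retrieved context, one for factuality; flag when
# either is low. The scorers are crude stand-ins, not FRANQ's estimators.

def faithfulness_score(answer: str, context: str) -> float:
    """Proxy: fraction of answer tokens that appear in the context."""
    tokens, ctx = answer.lower().split(), set(context.lower().split())
    return sum(tok in ctx for tok in tokens) / max(len(tokens), 1)

def factuality_score(answer: str, samples: list[str]) -> float:
    """Proxy: self-consistency, i.e. how often resampled answers agree."""
    target = answer.strip().lower()
    return sum(s.strip().lower() == target for s in samples) / max(len(samples), 1)

def flag_hallucination(answer: str, context: str,
                       samples: list[str], tau: float = 0.5) -> bool:
    return (faithfulness_score(answer, context) < tau
            or factuality_score(answer, samples) < tau)
```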
A user documented a sequence in which Gemini detected a real $280M KelpDAO/AAVE crypto exploit mid-conversation, retracted the finding as a hallucination under user skepticism, then reconfirmed it once mainstream coverage caught up. The episode illustrates how anti-hallucination overcorrection can cause models to retract accurate information.
Google DeepMind introduces Backstory, an experimental AI tool built on Gemini that helps users verify an image's authenticity and context by detecting AI generation, tracing the image's usage history, and identifying digital alterations.