Tag
Fine-tuning small LLMs (3B-7B) with QLoRA on biomedical claim verification achieves higher F1 than GPT-4o and GPT-5 at 44.5x lower cost, and reveals a structural artifact in SciFact. The study demonstrates robust cross-domain transfer when training on structurally sound data.
MoCA-Agent is a market-of-claims code agent that improves financial and numerical reasoning by decomposing questions into atomic claims, which specialist agents then buy and sell. It achieves strong results on multiple benchmarks with a fixed Qwen 3.6-27B backbone.
This paper introduces MAD2, a new benchmark for multimodal claim verification in spoken dialogues, and proposes a calibrated fusion of audio and text models that leverages conversational context to improve verification accuracy.
This paper introduces PrimeFacts, a methodology and resource for extracting fine-grained evidence from fact-checking articles using large language models. The extracted premises improve evidence retrieval and claim verification performance by up to 30% in MRR and 10-20 points in Macro-F1.
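For readers unfamiliar with the two metrics quoted in the PrimeFacts summary, the following is a minimal sketch of how MRR (for evidence retrieval) and Macro-F1 (for claim verification) are commonly computed. It is not the paper's evaluation code: the function names, the input layout, and the toy labels are all hypothetical and chosen only for illustration.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked_ids: Sequence[Hashable], relevant_ids: Set[Hashable]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]) -> float:
    """Average the reciprocal rank over all queries (hypothetical input layout)."""
    scores = [reciprocal_rank(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


def macro_f1(gold: Sequence[Hashable], pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    labels = sorted(set(gold) | set(pred), key=str)
    f1_scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores) if f1_scores else 0.0


# Toy data, invented for illustration only.
example_runs = [(["a", "b", "c"], {"b"}), (["x", "y"], {"x"})]
example_gold = ["SUPPORTS", "SUPPORTS", "REFUTES", "REFUTES"]
example_pred = ["SUPPORTS", "REFUTES", "REFUTES", "REFUTES"]

example_mrr = mean_reciprocal_rank(example_runs)
example_macro_f1 = macro_f1(example_gold, example_pred)
```

Note that the two reported gains are expressed in different units: a percentage change in MRR and a change in Macro-F1 points, so they are not directly comparable.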