This paper investigates how adding web retrieval to LLM agents can degrade safety alignment. It identifies a 'Safe Source Paradox', in which even safety-oriented documents increase harmful compliance, and introduces the AgentREVEAL diagnostic framework and the HarmURLBench benchmark to analyze and evaluate retrieval-induced safety vulnerabilities.
A developer created a small local tool for inspecting retrieval results from search providers like Brave, Serper, Tavily, and Exa before feeding them into a RAG pipeline, checking signals such as source diversity, duplicates, freshness, and SEO/GEO pollution risk.
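As a rough illustration of the kind of checks described, and not the developer's actual tool, the sketch below scores a list of retrieved hits for source diversity, duplicate snippets, and freshness. The `Result` fields, the function names, and the 365-day staleness threshold are all assumptions made for this example; real provider responses use their own schemas, and SEO/GEO pollution scoring is left out because the summary does not say how it is measured.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class Result:
    """One retrieved hit. Field names are hypothetical, not any provider's schema."""
    url: str
    snippet: str
    published: Optional[date] = None


def host(url: str) -> str:
    """Lower-cased hostname with a leading 'www.' removed."""
    h = (urlsplit(url).hostname or "").lower()
    return h[4:] if h.startswith("www.") else h


def source_diversity(results: list[Result]) -> float:
    """Share of distinct hosts among the hits (1.0 means every hit has its own host)."""
    if not results:
        return 0.0
    return len({host(r.url) for r in results}) / len(results)


def duplicate_pairs(results: list[Result]) -> list[tuple[int, int]]:
    """Index pairs whose snippets match after lower-casing and collapsing whitespace."""
    first_seen: dict[str, int] = {}
    pairs: list[tuple[int, int]] = []
    for i, r in enumerate(results):
        key = " ".join(r.snippet.lower().split())
        if key in first_seen:
            pairs.append((first_seen[key], i))
        else:
            first_seen[key] = i
    return pairs


def stale(results: list[Result], today: date, max_age_days: int = 365) -> list[int]:
    """Indexes of dated hits older than max_age_days. Undated hits are skipped."""
    return [
        i
        for i, r in enumerate(results)
        if r.published is not None and (today - r.published).days > max_age_days
    ]


# Made-up sample data for illustration only.
sample = [
    Result("https://www.example.com/a", "Retrieval can  change agent behavior.", date(2025, 1, 10)),
    Result("https://example.com/b", "retrieval can change agent behavior.", date(2023, 1, 10)),
    Result("https://docs.example.org/c", "A different snippet."),
]

print(source_diversity(sample))
print(duplicate_pairs(sample))
print(stale(sample, today=date(2025, 6, 1)))
```

Exact-match comparison after normalization is the simplest possible duplicate check; a tool of this kind might instead use near-duplicate detection such as shingling or embeddings, which this sketch does not attempt.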