deep-research-agents

It Is Trivially Easy to Use Reddit to Manipulate AI Search, Research Suggests

Reddit r/ArtificialInteligence · 2d ago

New research from Cornell University shows that a single snippet of user-generated text as short as 13 words from sites like Reddit or Wikipedia can be used to manipulate AI search tools like ChatGPT and Google AI Search, highlighting a growing vulnerability in AI-powered information retrieval.

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

Hugging Face Daily Papers · 2026-06-01

This paper introduces a claim-centric auditing framework for identifying error spans in deep-research agent trajectories, along with a new benchmark, TELBench, aimed at improving process-level reliability assessment.

DR³-Eval: Towards Realistic and Reproducible Deep Research Evaluation

Hugging Face Daily Papers · 2026-04-16

DR³-Eval is a benchmark for evaluating deep research agents on multimodal, multi-file report generation. It pairs a realistic simulated web environment with a comprehensive evaluation framework that measures information recall, factual accuracy, citation coverage, instruction following, and depth quality.
