browsing


EvoBrowseComp: Benchmarking Search Agents on Evolving Knowledge

Hugging Face Daily Papers · yesterday

EvoBrowseComp is an evolving benchmark of 800 contamination-free questions for evaluating search agents. It uses a three-agent framework to keep the questions temporally fresh and to prevent agents from answering from parametric memory.
