Introduces XBCP (Cross-lingual BrowseComp-Plus), a benchmark for evaluating deep research agents and retrievers in cross-lingual and multilingual settings. Results show significant performance degradation when evidence is in a different language from the query, highlighting both retrieval failures and agent-side difficulty in integrating language-mismatched evidence.