LEAF is a living benchmark for evaluating large language models on event-augmented forecasting tasks, such as estimating the probabilities of future events and forecasting time series. It uses a recursive retrieval-agent system and dual-agent cross-validation to supply relevant auxiliary text, and shows that LLMs can leverage complex events to improve predictive performance.