living-benchmark

LEAF: A Living Benchmark for Event-Augmented Forecasting

arXiv cs.LG · 2026-05-19

LEAF is a living benchmark for evaluating large language models (LLMs) on event-augmented forecasting tasks, such as estimating the probability of future events and forecasting time series. It uses a recursive retrieval agent system and dual-agent cross-validation to supply relevant auxiliary text for each task, and it shows that LLMs can leverage complex events to improve predictive performance.
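
The summary does not describe how dual-agent cross-validation works internally. One plausible reading, sketched here purely as an assumption, is that two independent agents each judge whether a retrieved text is relevant to a forecasting question, and only texts accepted by both are kept as auxiliary context. Everything in the sketch is hypothetical and is not taken from LEAF: the function names, the two toy judges (which stand in for LLM agents), and the sample question and texts.

```python
from typing import Callable, List

# A judge takes (question, candidate_text) and returns True if it accepts the text.
Judge = Callable[[str, str], bool]


def _words(s: str) -> set:
    """Lowercase the text and strip basic punctuation from each word."""
    return {w.strip(".,?!").lower() for w in s.split()}


def keyword_judge(question: str, text: str) -> bool:
    """Hypothetical agent A: accept if the text shares a word of 4+ letters with the question."""
    key_terms = {w for w in _words(question) if len(w) >= 4}
    return bool(key_terms & _words(text))


def digit_judge(question: str, text: str) -> bool:
    """Hypothetical agent B: accept only texts containing a digit (a crude proxy for concrete detail)."""
    return any(ch.isdigit() for ch in text)


def cross_validate(question: str, candidates: List[str],
                   judge_a: Judge, judge_b: Judge) -> List[str]:
    """Keep only the candidates that both judges accept, preserving input order."""
    return [c for c in candidates if judge_a(question, c) and judge_b(question, c)]


# Illustrative data, invented for this sketch.
question = "Will the central bank raise interest rates next quarter?"
candidates = [
    "Central bank minutes hint at a 25 basis point hike.",
    "Local football team wins the cup 3 to 1.",
    "Analysts expect interest rates to stay unchanged.",
]

kept = cross_validate(question, candidates, keyword_judge, digit_judge)
print(kept)
```

Requiring agreement between two judges trades recall for precision: a text that only one agent accepts is dropped, which may suit a benchmark that wants to avoid feeding irrelevant events to the forecaster.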
