Models can predict future events and make money on Polymarket now?
Summary
Researchers at the Max Planck Institute introduced FutureSim, an environment in which AI agents predict real-world future events by replaying historical web data. GPT 5.5 running in Codex achieved near-perfect Brier skill scores on some Polymarket markets, such as Super Bowl LX, outperforming the aggregate human market forecast, though it struggled on others, such as the UK elections and the Grammys.
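For readers unfamiliar with the metric, here is a minimal sketch of how a Brier skill score is typically computed for binary markets. It uses the standard textbook definition (one minus the ratio of the model's Brier score to a reference forecast's Brier score). Treating the market price as the reference forecast is an assumption made here for illustration, as are the function names and the sample numbers; none of them come from FutureSim itself.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(model_probs, reference_probs, outcomes):
    """Standard skill score: 1 - BS_model / BS_reference.

    1.0 means a perfect forecast, 0.0 means no better than the reference,
    and negative values mean worse than the reference.
    """
    bs_model = brier_score(model_probs, outcomes)
    bs_reference = brier_score(reference_probs, outcomes)
    if bs_reference == 0:
        raise ValueError("reference forecast is already perfect; skill is undefined")
    return 1.0 - bs_model / bs_reference


# Hypothetical numbers, for illustration only.
outcomes = [1, 0, 1]            # how three binary markets resolved
model = [0.9, 0.1, 0.8]         # assumed agent forecasts
market = [0.6, 0.4, 0.5]        # assumed market prices used as the reference

print(round(brier_skill_score(model, market, outcomes), 3))
```

A score close to 1 means the forecasts were almost always confident and correct relative to the reference, which is what "near-perfect" would correspond to under this definition.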
Similar Articles
FutureSim: Replaying World Events to Evaluate Adaptive Agents
FutureSim replays chronological world events to benchmark AI agents' long-term predictive abilities, finding that even the best agent achieves only 25% accuracy.
Prediction markets are breaking the news and becoming their own beat
Prediction markets are increasingly influencing news coverage and becoming a subject of journalism in their own right, as platforms like Polymarket gain mainstream attention for forecasting real-world events.
Looking at the data behind prediction markets
An analysis of prediction markets like Polymarket and Kalshi, examining whether their massive trading volume actually produces valuable forecasting information or merely serves as gambling, referencing historical academic support and current data.
kept facing with coding agents was hallucinations, context loss, outdated framework knowledge, and models confidently guessing wrong implementations
Proxima is a local tool that orchestrates multiple AI models (ChatGPT, Claude, Gemini, Perplexity) to collaborate via MCP, API, CLI, and webhooks, addressing coding agent issues like hallucinations and context loss by enabling multi-model workflows on the user's own machine.
Suraj vs The Future | With ChatGPT
A promotional video from OpenAI showcasing how to use ChatGPT to prepare smarter for the future, produced by Early Man Film.