game-playing

#game-playing

@browser_use: Agents playing games online? We asked our v4 agent to play powerline[.]io > Analyzed the game state and objective > Cre…

X AI KOLs Following ↗ · 2026-06-16 Cached

browser_use demonstrates its v4 AI agent autonomously playing the online game powerline.io by analyzing the game state and creating a real-time subagent to compete for first place.

MTG Bench: Testing how well LLMs can play Magic

Hacker News Top ↗ · 2026-06-11 Cached

MTG Bench evaluates how well LLMs can play Magic: The Gathering using an MCP server for library operations, showing both successes and failures in complex game actions.

Claude Code + UI interaction model = CC plays OSU! better then me

Reddit r/LocalLLaMA ↗ · 2026-05-18

Claude Code uses a vision-based UI interaction model to play the game OSU! at 50ms per action, outperforming the human user without relying on an accessibility tree.

@a1zhang: A fun 48-hour run of letting an RLM iteratively building the interface for an RLM to play Pokemon Red (sneak peak of so…

X AI KOLs Following ↗ · 2026-05-15

A 48-hour experiment in which an RLM (Recursive Language Model) iteratively built the interface for another RLM to play Pokemon Red. The run ended with the write_memory tool being used to cheat and beat the game in record time.

@paul_cal: Want to highlight how f'in weird this is. Tell someone in 2020 that a seq2seq model will use cli commands to build a py…

X AI KOLs Following ↗ · 2026-05-10 Cached

@paul_cal highlights how surprising it is that a modern seq2seq model can, zero-shot, use CLI commands to build a Python program that plays Doom with computer vision libraries, without any task-specific training.

Rethinking how we measure AI intelligence

Google DeepMind Blog ↗ · 2025-10-23 Cached

Google DeepMind and Kaggle introduced Kaggle Game Arena, an open-source AI benchmarking platform where large language models compete head-to-head in strategic games to provide dynamic and verifiable evaluation of their capabilities. The platform addresses limitations of traditional benchmarks by offering clear winning conditions and unambiguous performance signals.

The International 2018: Results

OpenAI Blog ↗ · 2018-08-23 Cached

OpenAI Five played two matches against top professional Dota 2 players at The International 2018 and lost both, while still showing competitive gameplay and strategic depth learned through self-play.

OpenAI Five

OpenAI Blog ↗ · 2018-06-25 Cached

OpenAI Five is a reinforcement learning agent that masters Dota 2 through self-play training with curriculum learning and strategic randomization, progressing from random behavior to executing complex human-level strategies.
