Trained transformer-based chess models to play like humans (including thinking time) [P]

Reddit r/MachineLearning 05/13/26, 10:08 PM Models

chess transformers deep-learning human-like rating-conditioning thinking-time open-source

Summary

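A set of transformer-based chess models, one per 100-point rating bucket from ~800 to 2500+, that predict moves, thinking time, and game outcome. The move models have only 9M parameters, yet the author reports better accuracy than MAIA-2 and rough parity with MAIA-3. The thinking-time models are, as far as the author knows, the only attempt so far to train on thinking times in chess.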

I trained a set of deep learning (transformer-based) chess models to play like humans (inspired by MAIA and Grandmaster Chess Without Search). There's a separate model for each 100-point rating bucket from ~800 to 2500+.

I started by training a mid-strength model from scratch on an 8xH100 cluster, then fine-tuned models for the other rating ranges on my local 5090 GPU. The training set was nearly a year of Lichess data, about 1B games in total.

Each rating range actually has 3 models: a move model, a thinking time model, and a white win / draw / black win model.

Despite being quite small (only 9M parameters!) the move models achieve better accuracy than MAIA-2 and are approximately on par with MAIA-3 (see [here](https://github.com/thomasj02/1e4_ai/blob/master/experiments/maia2_benchmark/RESULTS.md) for the MAIA-2 comparison). AFAIK this is the only attempt to train on thinking times in chess, so I don't have a benchmark to compare against for that.

Likely because of the network size, at high ratings the models aren't quite as good as they could be. They see short tactical motifs but can't do deep calculation - a bigger model would probably help here.

The move and win models take into account player ratings and clock times. For instance, a much stronger player under extreme time pressure has a lower win probability, even against a weaker opponent. The models blunder more under time pressure as well.

The data pipeline is C++ via nanobind, then training with PyTorch. Getting this right was actually the thing I spent the most time on. Pre-shuffling the dataset and then reading the shuffled dataset sequentially at training time kept GPU utilization high. Without this, training spent a huge percentage of its time on I/O while the GPU sat idle.

Happy to answer questions about the rating-conditioning, the clock model, or the data pipeline. Code (including training code and model weights) is at [https://github.com/thomasj02/1e4_ai/](https://github.com/thomasj02/1e4_ai/).
A demo is at [https://1e4.ai/](https://1e4.ai/) but all the frontend code is also in the repo if you want to self-host.
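
For readers curious about the pre-shuffle-then-read-sequentially idea from the data pipeline, here is a minimal Python sketch. It is not the repo's C++/nanobind pipeline. The record layout, the field names, and the helpers `write_preshuffled` and `read_batches` are all hypothetical, chosen only to illustrate the technique: shuffle once offline, store fixed-size records back to back, and at training time read the file front to back in large chunks so the disk never has to seek.

```python
import os
import random
import struct
import tempfile

# Hypothetical fixed-size record: (position_id, move_id, rating, clock_seconds).
# The real pipeline's record format is not described in the post.
RECORD = struct.Struct("<IHHH")


def write_preshuffled(records, path, seed=0):
    """Shuffle once, offline, and write fixed-size records back to back."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    with open(path, "wb") as f:
        for rec in shuffled:
            f.write(RECORD.pack(*rec))


def read_batches(path, batch_size):
    """Yield batches by reading the file front to back in large chunks."""
    chunk_bytes = RECORD.size * batch_size
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            # Each chunk holds a whole number of records because the file
            # was written as a multiple of RECORD.size bytes.
            yield list(RECORD.iter_unpack(chunk))


# Toy data standing in for (position, move, rating, clock) training samples.
records = [(i, i % 64, 1500, 300) for i in range(1000)]
path = os.path.join(tempfile.mkdtemp(), "shuffled.bin")
write_preshuffled(records, path)

batches = list(read_batches(path, batch_size=256))
print([len(b) for b in batches])
```

The in-memory shuffle here only works for toy data. At roughly 1B games the shuffle itself would need to happen out of core (for example, by scattering records into buckets on disk and shuffling each bucket), but the payoff is the same: the training loop does only sequential reads, and every batch is already randomized.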

