Trained transformer-based chess models to play like humans (including thinking time) [P]
Summary
Transformer-based chess models trained for rating buckets from 800 to 2500+ predict moves, thinking time, and game outcome. They achieve strong accuracy with only 9M parameters and include a novel thinking-time prediction component.
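As a rough illustration of the rating-bucket idea, the sketch below maps a player rating to a bucket label. The post only gives the range (800 to 2500+), so the 100-point bucket width, the label format, and the name `rating_bucket` are assumptions made for this example, not the author's actual scheme.

```python
from typing import Optional

# Range taken from the summary: buckets start at 800 and end with an open "2500+" bucket.
BUCKET_MIN = 800
BUCKET_TOP = 2500
# ASSUMPTION: the bucket width is not stated in the post; 100 points is a guess.
BUCKET_WIDTH = 100


def rating_bucket(rating: int) -> Optional[str]:
    """Return a hypothetical bucket label for a rating, or None if below the range."""
    if rating < BUCKET_MIN:
        return None  # ratings under 800 fall outside the described range
    if rating >= BUCKET_TOP:
        return "2500+"  # open-ended top bucket
    # Round down to the lower edge of the enclosing bucket.
    low = BUCKET_MIN + ((rating - BUCKET_MIN) // BUCKET_WIDTH) * BUCKET_WIDTH
    return f"{low}-{low + BUCKET_WIDTH - 1}"
```

Under these assumptions a game between, say, 1234-rated players would go to the model for the `1200-1299` bucket, and anything rated 2500 or above would share the single top bucket.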
Similar Articles
Transformers Learn the Mestre-Nagao Heuristic
This paper trains a two-layer transformer encoder to classify rational elliptic curves by rank from Frobenius traces, achieving >99% accuracy. Mechanistic interpretability reveals the model learns the Mestre-Nagao heuristic and concentrates attention on prime positions, demonstrating that transformers can learn number-theoretic algorithms.
Transformers Linearly Represent Highly Structured World Models
This paper demonstrates that transformers trained on Sudoku solving traces build structured world models organized by domain constraints, and identifies a sparse, monosemantic circuit responsible for the naked-single decision rule. The work provides a fully interpretable algorithmic account of transformer reasoning on a combinatorial task.
Transformer Math Explorer [P]
This interactive tool visualizes the mathematical underpinnings of transformer models through dataflow graphs, covering architectures from GPT-2 to Qwen 3.6 and various attention mechanisms.
@NFTCPS: You keep talking about AI, but can't even explain what a Transformer is? There's a repo that goes all out — builds a GPT from scratch without using any high-level libraries. It lays out exactly how Attention, Multi-Head, Feed-Forward, Embedding, Residual connections, and Layer Norm are pieced together. And it's not just the model; the entire pipeline is covered…
A GitHub open-source project that implements the complete GPT training pipeline from scratch, including data preprocessing, pretraining, SFT, and RLHF post-training, all based on native PyTorch. Ideal for developers who want to deeply understand the Transformer architecture.
@viditchess: Chess engines tell you the best move. But grandmasters are human, they don’t always play it. So I built "Kibitz": a hum…
Built 'Kibitz', a human-move predictor for chess broadcasts trained on an RTX 5080, and automated its operation as a business using Hermes, Stripe, and NVIDIA AI Nemotron for a hackathon.