A very important milestone for me in the AI field.

Reddit r/LocalLLaMA Papers

Summary

The author announces the release of their first AI research paper, STAM (Stable Training with Adaptive Momentum), a new deep learning optimizer addressing stability and resource efficiency, and invites feedback from the AI community.

https://preview.redd.it/wmuii8r68i1h1.png?width=1672&format=png&auto=webp&s=ab2a21eb9cc361fb2080ad90ec7207b0e1263419

Three days ago, I officially released my first AI research paper: **STAM (Stable Training with Adaptive Momentum)**.

STAM introduces a new optimizer for deep learning and focuses on:

* improving training stability,
* reducing resource consumption during training,
* and addressing several limitations found in optimizers like Adam, AdamW, and Muon.

The paper explains what makes STAM different, the problems it aims to solve, and includes comparisons with existing optimizers and training results.

The research paper is currently available on SSRN, and it has reached a ranking of around 646K so far.

What matters most to me is not numbers, but having AI engineers, researchers, and specialists read the paper and share honest technical feedback and criticism.

I consider STAM one of the biggest projects I’ve ever worked on, and I plan to continue improving and developing it further.

I would genuinely appreciate hearing opinions from researchers and experienced people in the AI community about the paper, the optimizer design, and the reported results compared to other optimizers.

Research paper: [https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6699059](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6699059)

https://preview.redd.it/va8a5vbb8i1h1.png?width=1254&format=png&auto=webp&s=1f5edf7da0e1d7988b61dd735081e1f3d3a25c15