@WilliamBarrHeld: To train better open models, we need predictable scaling. Delphi is Marin’s first step: we pretrained many small models…
Summary
Marin AI researchers, led by William Barr Held, introduce Delphi, a method that pretrains many small models with a single recipe and extrapolates 300× from them to predict the outcome of a 25B-parameter, 600B-token run with 0.2% error. The goal is predictable scaling for training better open models.
Cached at: 05/11/26, 08:43 PM
To train better open models, we need predictable scaling.
Delphi is Marin’s first step: we pretrained many small models with one recipe, then extrapolated 300× to predict a 25B-param / 600B-token run with just 0.2% error.
Getting there took some work 🧵 https://t.co/HmlVFl11ag
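To make the idea of extrapolating from small runs concrete, here is a minimal sketch of one common approach: fit a power law to (compute, loss) pairs from small runs, then evaluate it at a much larger budget. This is not Delphi's actual procedure, which the post does not describe. The functional form, the function names, and every number below are assumptions for illustration, and the data is synthetic. The 300× factor is treated here as a ratio of training compute, which the post also does not specify.

```python
import math


def fit_power_law(compute, loss):
    """Least-squares fit of log(loss) = log(a) - b * log(compute).

    Returns (a, b) for the assumed form loss = a * compute ** (-b).
    """
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predict_loss(a, b, compute):
    """Evaluate the fitted power law at a given compute budget."""
    return a * compute ** (-b)


def relative_error(predicted, actual):
    """Absolute prediction error as a fraction of the actual value."""
    return abs(predicted - actual) / actual


# Synthetic, made-up "small run" results generated from a known power law.
# These are NOT Marin's measurements; they only exercise the fitting code.
TRUE_A, TRUE_B = 20.0, 0.05
small_compute = [1e17, 3e17, 1e18, 3e18, 1e19]  # hypothetical FLOP budgets
small_loss = [TRUE_A * c ** (-TRUE_B) for c in small_compute]

a, b = fit_power_law(small_compute, small_loss)

# Extrapolate to 300x the largest small-run budget (assumed meaning of 300x).
target_compute = 300 * max(small_compute)
predicted = predict_loss(a, b, target_compute)
actual = TRUE_A * target_compute ** (-TRUE_B)
err = relative_error(predicted, actual)

print(f"fitted a={a:.3f}, b={b:.4f}")
print(f"predicted loss at 300x: {predicted:.4f} (relative error {err:.2e})")
```

Because the synthetic data is noise-free and follows the assumed form exactly, the fit recovers it almost perfectly. Real runs are noisy and need not follow a single power law, so a small error at 300× extrapolation is much harder to achieve in practice.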