@WilliamBarrHeld: To train better open models, we need predictable scaling. Delphi is Marin’s first step: we pretrained many small models…
Summary
Marin AI researchers, led by William Barr Held, introduce Delphi, a methodology that pretrains many small models under a single recipe and fits their results to accurately predict the outcome of a much larger 25B-parameter run. The work aims to establish predictable scaling for more efficient open-source AI model development.
To train better open models, we need predictable scaling.
Delphi is Marin’s first step: we pretrained many small models with one recipe, then extrapolated 300× to predict a 25B-param / 600B-token run with just 0.2% error.
Getting there took some work 🧵 https://t.co/HmlVFl11ag
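The thread doesn't spell out Delphi's functional form or fitting procedure, but the general technique it describes is to fit a scaling law to many small runs and extrapolate it far beyond them. A minimal sketch, assuming a pure power law in compute fit by least squares in log-log space; all numbers below are illustrative, not from the thread:

```python
import numpy as np

# Illustrative (made-up) small-run measurements: training compute in FLOPs
# and final validation loss. None of these numbers come from the thread.
compute = np.array([1e18, 3e18, 1e19, 3e19, 1e20])
loss = np.array([3.45, 3.21, 3.02, 2.88, 2.77])

# A pure power law L(C) = A * C**(-b) is linear in log-log space,
# so an ordinary least-squares fit recovers (log10 A, -b).
slope, intercept = np.polyfit(np.log10(compute), np.log10(loss), deg=1)

# Extrapolate ~300x past the largest fitted run, mirroring the 300x
# extrapolation described in the thread.
target = 300 * compute[-1]
predicted = 10 ** (intercept + slope * np.log10(target))
print(f"predicted loss at {target:.1e} FLOPs: {predicted:.3f}")
```

In practice, fits like this are more trustworthy with an irreducible-loss term and runs spanning several orders of magnitude; the 0.2% error figure refers to Delphi's own methodology, not to this toy fit.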
Similar Articles
Developing an open-source LLM from the ground up, from pretraining to RLHF (PPO/GRPO)
A developer shares progress on training a 7B-parameter open-source LLM from scratch using a DeepSeek-style architecture optimized for low VRAM, with the goal of democratizing AI development and eventually surpassing large proprietary models.
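The post names PPO and GRPO for the RLHF stage. As a minimal sketch of GRPO's core idea (not the developer's actual code): advantages are computed relative to a group of completions sampled for the same prompt, which removes the need for a learned value (critic) network:

```python
import numpy as np

def grpo_advantages(group_rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Group-relative advantages as in GRPO: each sampled completion is
    scored against the mean/std of the other completions for the same
    prompt, so no separate critic network is trained."""
    mean = group_rewards.mean()
    std = group_rewards.std()
    return (group_rewards - mean) / (std + eps)

# Example: rewards for 4 completions sampled for one prompt,
# as scored by a reward model (values are illustrative).
rewards = np.array([0.1, 0.7, 0.4, 0.9])
print(grpo_advantages(rewards))  # above-mean completions get positive advantage
```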
@heyrobinai: THE ENTIRE AI INDUSTRY JUST GOT HUMILIATED a tiny model trained in just a few hours on a single graphics card is planni…
Yann LeCun's team releases LeWorldModel, a tiny 15M-parameter physics model trained on a single GPU in hours that outperforms billion-dollar foundation models in planning speed and physical plausibility, challenging the dominant scaling paradigm.
Introducing improvements to the fine-tuning API and expanding our custom models program
OpenAI introduces improvements to its fine-tuning API with new features including epoch-based checkpoints, comparative playground for model evaluation, third-party integrations, and enhanced dashboard capabilities. The company also expands its custom models program to give developers more control and flexibility in building domain-specific AI solutions.
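For the epoch-based checkpoints feature, a short sketch using the OpenAI Python SDK's checkpoint-listing endpoint; the job ID is a placeholder, and the exact fields shown reflect the publicly documented API rather than anything in the article:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# List the per-epoch checkpoints of a completed fine-tuning job.
# "ftjob-abc123" is a placeholder job ID.
checkpoints = client.fine_tuning.jobs.checkpoints.list("ftjob-abc123")
for ckpt in checkpoints.data:
    print(ckpt.step_number, ckpt.fine_tuned_model_checkpoint)
```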
Model Distillation in the API
OpenAI introduces a Model Distillation offering in its API, enabling developers to use outputs from frontier models like o1-preview and GPT-4o to fine-tune smaller, cost-efficient models like GPT-4o mini through an integrated pipeline including Stored Completions, Evals, and Fine-tuning.
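A hedged sketch of that pipeline with the OpenAI Python SDK: capture teacher outputs as Stored Completions, then fine-tune the student on them. The prompt, file name, and metadata tag are placeholders, and the step of exporting stored completions to a JSONL training file is elided:

```python
from openai import OpenAI

client = OpenAI()

# Step 1: capture teacher (frontier-model) outputs as Stored Completions.
# store=True persists the request/response pair server-side for later reuse;
# the metadata tag is a made-up label for filtering in the dashboard.
teacher_out = client.chat.completions.create(
    model="gpt-4o",
    store=True,
    metadata={"task": "distillation-demo"},
    messages=[{"role": "user", "content": "Summarize the Delphi scaling result."}],
)
print(teacher_out.id)

# Step 2: after exporting the stored completions to a JSONL file
# (done in the dashboard or via the API), fine-tune the smaller student.
training_file = client.files.create(
    file=open("stored_completions.jsonl", "rb"), purpose="fine-tune"
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # documented fine-tunable student snapshot
)
print(job.id)
```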
Scaling How We Build and Test Our Most Advanced AI
The article discusses the growing importance of reliability, security, and user protections as AI models become more capable and personalized.