Microsoft has released technical details of how MAI-Thinking-1 was trained: a base model is first trained on purely human data, three domain expert models are then trained, their capabilities are merged back into the base model via distillation, and reinforcement learning is finally applied so the model can flexibly draw on the different capabilities.
Recommended reading: the MAI-Thinking-1 technical paper, which details almost all the steps to train a SOTA large language model.
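As a rough illustration of the "merge via distillation" step described above, here is a minimal sketch of generic knowledge distillation from several domain experts into one student. It is not taken from the MAI-Thinking-1 paper: the temperature-scaled KL loss is the standard textbook formulation, and the names `distillation_loss`, `multi_expert_loss`, the `experts` mapping, and the per-example domain routing are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Standard soft-label distillation loss: T^2 * KL(teacher || student)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return temperature ** 2 * kl_divergence(p, q)

def multi_expert_loss(batch, experts, student, temperature=2.0):
    """Average distillation loss over a batch of (domain, example) pairs.

    Hypothetical routing: each example is scored against the expert
    for its own domain. `experts` maps a domain name to a callable that
    returns logits; `student` is a callable that returns logits.
    """
    losses = [
        distillation_loss(experts[domain](x), student(x), temperature)
        for domain, x in batch
    ]
    return sum(losses) / len(losses)
```

In this sketch the loss is zero when the student already reproduces the relevant expert's output distribution and grows as the two diverge; a real training loop would minimize it with gradient descent over the student's parameters.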
Microsoft announced MAI-Thinking-1, a flagship reasoning AI model, alongside six other new models at Build 2026, marking a major step for its in-house model development.