Microsoft's new MAI models

Simon Willison's Blog Models

Summary

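Microsoft announced two new text LLMs: MAI-Thinking-1 (a 35B-parameter reasoning model) and MAI-Code-1-Flash (a 5B-parameter code model). Microsoft says MAI-Thinking-1 was trained on enterprise-grade, clean and commercially licensed data without distillation from third-party models, and that MAI-Code-1-Flash was built using clean and appropriately licensed data. It also claims MAI-Thinking-1 is preferred to Sonnet 4.6 in its blind human side-by-side evaluations.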

Cached at: 06/02/26, 11:35 PM

# Microsoft's new MAI models

Source: [https://simonwillison.net/2026/Jun/2/microsofts-new-models/](https://simonwillison.net/2026/Jun/2/microsofts-new-models/)

2nd June 2026

Microsoft [announced two new text LLMs](https://microsoft.ai/news/building-a-hillclimbing-machine-launching-seven-new-mai-models/) this morning - **[MAI-Thinking-1](https://microsoft.ai/news/introducing-mai-thinking-1/)** (reasoning, 35B parameters, available to "select early partners") and **[MAI-Code-1-Flash](https://microsoft.ai/news/introducingmai-code-1-flash/)** (5B parameters, "purpose-built for GitHub Copilot and VS Code to deliver high performance and lower cost [...] rolling out to GitHub Copilot individual users in Visual Studio Code").

I've not been able to try either of them just yet.

It's very interesting to see Microsoft releasing models with such low parameter counts, especially given how expensive larger models are to access right now.

They claim MAI-Thinking-1 "is preferred to Sonnet 4.6 in our blind human side-by-side evaluations", which is impressive for a 35B model seeing as I frequently run models larger than that on my own laptop.

Also [of note](https://microsoft.ai/news/introducing-mai-thinking-1/):

> We trained [MAI-Thinking-1] from the ground up on enterprise grade, clean and commercially licensed data, without distillation from third-party models.

And for [MAI-Code-1-Flash](https://microsoft.ai/news/introducingmai-code-1-flash/) as well:

> It is built end-to-end by Microsoft using clean and appropriately licensed data.

I would *very much* like to learn more about this "appropriately licensed" data! Could these be the first generally useful code-specialist models that didn't train on an unlicensed dump of the web?

Similar Articles

MAI-Thinking-1

Hacker News Top

Microsoft AI introduces MAI-Thinking-1, a 35B-active parameter reasoning model trained from scratch without distillation, achieving strong performance on software engineering and math benchmarks while emphasizing clean data and self-sufficiency.

MAI-Code-1-Flash

Hacker News Top

Microsoft introduces MAI-Code-1-Flash, a coding model optimized for production workflows with fewer tokens and higher accuracy than Claude Haiku 4.5 across multiple benchmarks.