MiniMax M3 - Coding & Agentic Frontier, 1M Context, Multimodal

Reddit r/LocalLLaMA 06/01/26, 01:23 AM Models

open-weight coding agentic multimodal 1m-context sparse-attention

Summary

MiniMax announces M3, an open-weight model that combines frontier coding and agentic performance, a context window of up to 1M tokens, and native multimodality. The company reports top-tier results on coding and agentic benchmarks, with autonomous task decomposition and long-context support.

Original Article

Cached at: 06/01/26, 01:37 AM

# MiniMax M3 - Coding & Agentic Frontier, 1M Context, Multimodal

Source: [https://www.minimax.io/models/text/m3](https://www.minimax.io/models/text/m3)

Coding & Agentic Frontier. 1M-context MSA. Native Multimodality

The first open-weight model with three frontier capabilities.

### Performance Benchmark

M3 achieves top-tier performance on coding and agentic benchmarks, with autonomous task decomposition, tool invocation, and multi-step reasoning capabilities, providing a reliable foundation for AI coding assistants and automated workflows.

Powered by the proprietary MiniMax Sparse Attention (MSA) architecture, M3 API supports up to 1M tokens context window with a guaranteed minimum of 512K tokens. The 1M context is the infrastructure for long-range Agent tasks, long-range Coding, and long-video understanding.

A natively multimodal model. The entire data pipeline was rebuilt to scale pretraining data to 100T+, with multimodal training from step zero achieving deep alignment between textual and visual semantic spaces. Multimodal is a native core capability, not a superficial add-on.

On BrowseComp, M3 scores 83.5, surpassing Opus 4.7 (79.3), demonstrating strong autonomous browsing and information retrieval capabilities.

Until now, only a handful of closed-source models could simultaneously achieve frontier coding capabilities, million-token context, and Multimodal. M3 is the first to bring complete frontier capability to the open world.

![Paper Reproduction: 12-Hour Autonomous ICLR Paper Replication](https://file.cdn.minimax.io/public/ce62e404-de42-4c88-8897-f355eea0df41.png)

### Paper Reproduction: 12-Hour Autonomous ICLR Paper Replication

We tasked M3 with independently reproducing an ICLR 2025 Outstanding Paper, Learning Dynamics of LLM Finetuning. M3 ran continuously for nearly 12 hours, independently producing 18 commits and 23 experimental figures, successfully replicating the core experiments.
Multimodal capabilities parsed charts and formulas from the paper, long context fit paper + code + experiment logs in a single window, and coding + agentic capabilities drove long-horizon execution.

### CUDA Kernel Optimization: 147 Iterations, 9.4× Speedup

FP8 GEMM is one of the most compute-intensive and difficult-to-optimize operations in LLM inference. We asked M3 to optimize this kernel on NVIDIA Hopper GPUs, starting with only a task description and a non-runnable Triton skeleton. Over ~24 hours, M3 completed 147 benchmark submissions and 1,959 tool calls, pushing hardware peak utilization from 7.6% to 71.3%, a 9.4× speedup with zero human intervention.

![CUDA Kernel Optimization: 147 Iterations, 9.4× Speedup](https://file.cdn.minimax.io/public/24346a19-3459-47e1-a5b6-a771951b2ca9.gif)

![PostTrainBench: M3 Training Models on Its Own](https://file.cdn.minimax.io/public/0551c035-b821-4ac4-8f13-31ac05bdc77b.gif)

### PostTrainBench: M3 Training Models on Its Own

We gave M3 four pretrain-only base models and asked it to autonomously complete the full pipeline (data synthesis, training, evaluation, and iteration) within 12 hours, enabling the models to perform math reasoning, code generation, and knowledge QA. The entire process ran without human intervention. M3 scored 37.1, ranking #3 overall, behind only Opus 4.7 (42.4) and GPT-5.5 (39.3), significantly ahead of all other models.

DEVELOPER TOOLS

## Empowering Developer Choice

Outstanding Tool Scaffolding Generalization

#### 01 / Access Method

### Quick API Integration

API versions: M3, with identical results but faster speed. Full automatic Cache support, no configuration needed.

#### 02 / Access Method

### For AI Coding Tools

01 / Subscribe to the Token Plan

The price remains unchanged, while performance has significantly improved. Token Plan users now automatically benefit from M3's enhanced coding and reasoning capabilities.
[Read More](https://platform.minimax.io/subscribe/token-plan)

02 / Open Platform Integration

Supports standard M3, with a context window of up to 1M tokens.

[Read More](https://platform.minimax.io/docs/guides/text-generation)

03 / MiniMax Code Integration

The general Agent platform based on M3 is now fully open. Experience agentic coding, multimodal understanding, and other flagship capabilities without any development required.

[Read More](https://code.minimax.io/)

04 / Open Source and Local Deployment

We are committed to giving back to the community. M3 will soon be fully open-sourced on HuggingFace and GitHub, supporting private cluster deployment and fine-tuning.

[Read More](https://huggingface.co/MiniMaxAI)
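
The headline ratios quoted in the CUDA kernel, BrowseComp, and PostTrainBench sections can be checked with a few lines of arithmetic. The sketch below is only a sanity check on the published numbers, not anything from MiniMax. It assumes that on fixed hardware the speedup equals the ratio of peak-utilization figures, and the helper name `ratio` is illustrative.

```python
# Sanity check of the figures quoted in the announcement.
# All numbers are copied from the text; names here are illustrative only.


def ratio(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return after / before


# CUDA kernel section: peak utilization rose from 7.6% to 71.3%.
# Assumption: on the same GPU, throughput scales with peak utilization,
# so the utilization ratio is read as the speedup.
speedup = ratio(7.6, 71.3)

# BrowseComp: M3 (83.5) versus Opus 4.7 (79.3).
browsecomp_margin = 83.5 - 79.3

# PostTrainBench: gap between M3 (37.1) and the two models ranked above it.
posttrain_gaps = {
    "Opus 4.7": 42.4 - 37.1,
    "GPT-5.5": 39.3 - 37.1,
}

if __name__ == "__main__":
    print(f"speedup: {speedup:.1f}x")
    print(f"BrowseComp margin: {browsecomp_margin:.1f} points")
    for name, gap in posttrain_gaps.items():
        print(f"PostTrainBench gap to {name}: {gap:.1f} points")
```

Under that assumption the utilization ratio is about 9.38, which rounds to the 9.4× stated in the section heading.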

Similar Articles

microsoft/Mage-VL · Hugging Face - An Efficient Codec-Native Streaming Multimodal Foundation Model

Appreciation for Gemma 4 26b A4b

Kimi K3 Architecture Overview and Notes

Free API credits for DeepSeek V4, Kimi K2, and other open-weight LLMs — unified API, closed beta, no card required

@PrajwalTomar_: I gave Grok 4.5 one sentence and it built me a working app before I finished my coffee. Cursor just launched a plan onl…
