@Modular: Our kernel team has been deep in MiniMax M3 all week. The 1M-token context and native multimodality make it a hard mode…

X AI KOLs Following 06/09/26, 06:37 PM Models

mini-max-m3 1m-context multimodal open-weights serving modular kernel

Summary

Modular's kernel team is optimizing serving for MiniMax M3, whose 1M-token context and native multimodality make it hard to serve well. Once the open weights are released in the next few days, the model will be runnable on Modular right away.

Our kernel team has been deep in MiniMax M3 all week. The 1M-token context and native multimodality make it a hard model to serve well, which is exactly the kind of problem we like! When the open weights drop in the next few days, you'll be able to run it on Modular right away. Stay tuned for @MiniMax_AI x Modular.

Original Article

Cached at: 06/10/26, 12:20 AM

Similar Articles

MiniMax promises M3 weights after 1M-context model launch (2 minute read)

TLDR AI

MiniMax released M3, a model with a 1M-token context window and native multimodal input, via API. The company promises open-weight release and a technical report within 10 days.

MiniMax M3 (2 minute read)

TLDR AI

MiniMax introduces M3, the first open-weights model to combine coding, agentic, and multimodal capabilities with up to 1M context via sparse attention.

@PrajwalTomar_: Everyone's sleeping on MiniMax. Again. They just shipped M3. The first open-weights model to combine frontier coding, 1…

X AI KOLs Following

MiniMax released M3, an open-weights model combining frontier coding, 1M context, and native multimodality, offering comparable performance to Opus at a fraction of the cost.

MiniMax teases upcoming M3 model with new sparse attention mechanism and 15.6X long-context response speed boost (12 minute read)

TLDR AI

MiniMax has released a detailed technical report on its M2 series and teased the upcoming M3 model, which uses a novel sparse attention mechanism to achieve up to 15.6× faster decoding at million-token contexts.

@RyanLeeMiniMax: MiniMax-M3 will arrive on HuggingFace as open weights next week!

X AI KOLs Following

MiniMax announced MiniMax-M3, an open-weights model combining frontier coding and agentic capabilities with sparse attention scaling to 1M context, set to arrive on HuggingFace next week.
