@TeksEdge: With MiniMax M3 open source now out, here is what to expect on quants and sizes, including VRAM needed: MiniMax M3 (428…

X AI KOLs Following 06/12/26, 04:17 PM Models

Summary

MiniMax M3, a 428B MoE model with ~23B active parameters, is now open source. It offers ultra-long context (up to 1M) at lower per-token compute than the previous generation. Estimated GGUF quant sizes run from ~110-140 GB (Q2_K) to ~430-450 GB (Q8_0), so local deployment needs a high-VRAM, multi-GPU setup.

Original Article

Cached at: 06/12/26, 07:01 PM

With MiniMax M3 open source now out, here is what to expect on quants and sizes, including VRAM needed:

MiniMax M3 (428B MoE, ~23B active)

GGUF Size Estimates

Q8_0 → ~430-450 GB
Q6_K → ~340-360 GB
Q5_K_M/XL → ~280-310 GB
Q4_K_M/XL → ~220-250 GB (Best balance)
Q3_K_XL → ~170-200 GB
Q2_K → ~110-140 GB (Last resort)
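One way to sanity-check these estimates is to convert each size range back into the bits per weight it implies. The sketch below is a rough check, not an official calculation: it assumes the 428B figure is the exact parameter count, that 1 GB means 10^9 bytes, and that the file holds weights only. The names `SIZE_RANGES_GB` and `implied_bits_per_weight` are made up for illustration.

```python
# Rough sanity check: bits per weight implied by each estimated file size.
# Assumptions (not from the post): 428B is the exact parameter count,
# 1 GB = 1e9 bytes, and the file contains weights only.

TOTAL_PARAMS_B = 428  # billions of parameters, as stated in the post

# Estimated GGUF sizes in GB (low, high), copied from the post.
SIZE_RANGES_GB = {
    "Q8_0": (430, 450),
    "Q6_K": (340, 360),
    "Q5_K_M/XL": (280, 310),
    "Q4_K_M/XL": (220, 250),
    "Q3_K_XL": (170, 200),
    "Q2_K": (110, 140),
}


def implied_bits_per_weight(size_gb, params_b=TOTAL_PARAMS_B):
    """Average bits stored per parameter for a file of size_gb gigabytes."""
    return size_gb * 8 / params_b


def implied_ranges(sizes=SIZE_RANGES_GB):
    """Map each quant name to its (low, high) implied bits per weight."""
    return {
        name: (
            round(implied_bits_per_weight(low), 2),
            round(implied_bits_per_weight(high), 2),
        )
        for name, (low, high) in sizes.items()
    }


if __name__ == "__main__":
    for name, (low, high) in implied_ranges().items():
        print(f"{name}: ~{low}-{high} bits/weight")
```

If an implied figure sits far from the nominal bit width in the quant's name, the size estimate for that quant is worth a second look once real files are published.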

Very efficient due to extreme sparsity: only ~23B of the 428B parameters (about 5%) are active per token!

Practical local runs will need high-VRAM setups (multiple 5090s or better).
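To put "multiple 5090s" into numbers, here is a minimal sketch that divides a quant's file size by per-card VRAM. It assumes 32 GB per RTX 5090 and that the weights split evenly across cards. The `headroom_gb` parameter is a hypothetical knob for KV cache and runtime overhead, which the post does not quantify, so the results are a lower bound rather than a measured requirement.

```python
import math

RTX_5090_VRAM_GB = 32  # VRAM per RTX 5090 card


def min_gpus(size_gb, vram_gb=RTX_5090_VRAM_GB, headroom_gb=0.0):
    """Lower bound on cards needed to hold size_gb of weights fully in VRAM.

    headroom_gb is a hypothetical per-card reserve for KV cache and runtime
    overhead; the default of 0 counts weights only.
    """
    usable_gb = vram_gb - headroom_gb
    if usable_gb <= 0:
        raise ValueError("headroom_gb must be smaller than vram_gb")
    return math.ceil(size_gb / usable_gb)


if __name__ == "__main__":
    # Size ranges in GB taken from the estimates in the post.
    for name, (low, high) in {
        "Q4_K_M/XL": (220, 250),
        "Q2_K": (110, 140),
    }.items():
        print(f"{name}: {min_gpus(low)}-{min_gpus(high)} cards (weights only)")
```

With no headroom, the Q4_K_M/XL range works out to 7 to 8 cards and Q2_K to 4 to 5. Long contexts push the real number higher, while offloading part of the model to system RAM lowers it at the cost of speed.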

ModelScope (@ModelScope2022): MiniMax M3 is now open source! The model combines native multimodal understanding, ultra-long context, and Agent capabilities in one.🚀

New MSA architecture: up to 1M context at 1/20 the per-token compute of the previous gen. 9x faster prefilling, 15x faster decoding, on par

Similar Articles

@stevibe: MiniMax M2.7 is 230B params. Can you actually run it at home? I tested Unsloth's UD-IQ3_XXS (80GB) on 4 different rigs:…

@no_stp_on_snek: Config-I quant of MiniMax-M3 is up on MLX. 2-bit experts, 4-bit attention, 8-bit boundaries + embeddings, f16 router. ~…

@PrajwalTomar_: Okay this is wild. MiniMax just dropped M3, and it might be the most capable open model for building right now. I gave …

MiniMax M3 (2 minute read)

MiniMax teases upcoming M3 model with new sparse attention mechanism and 15.6X long-context response speed boost (12 minute read)
