@stevibe: MiniMax M2.7 is 230B params. Can you actually run it at home? I tested Unsloth's UD-IQ3_XXS (80GB) on 4 different rigs:…

X AI KOLs Following News

Summary

A user tested MiniMax M2.7 (230B parameter model) using Unsloth's UD-IQ3_XXS quantization (80GB) across four different hardware configurations including RTX 4090, RTX 5090, RTX PRO 6000, and DGX setups, reporting token generation speeds and time-to-first-token metrics.

MiniMax M2.7 is 230B params. Can you actually run it at home? I tested Unsloth's UD-IQ3_XXS (80GB) on 4 different rigs: 4x RTX 4090 (96GB): 71.52 tok/s, TTFT 1045ms 4x RTX 5090 (128GB): 120.54 tok/s, TTFT 725ms 1x RTX PRO 6000 (96GB): 118.74 tok/s, TTFT 765ms DGX
Original Article

Similar Articles

MiniMax2.7 @47tg 1200pp

Reddit r/LocalLLaMA

MiniMax2.7 model released with 47 trillion parameters and 1200pp context length.