@0xSero: Minimax-M3 running on 4x RTX Pro 6000s - 800k context - 4x concurrency at 250k - 70-120 tok/s - 2000 tok/s prefill no c…


Summary

Minimax-M3 is demonstrated running in mxfp4 format on 4x RTX Pro 6000 GPUs with an 800k context and 4x concurrency at 250k, reaching 70-120 tok/s generation and 2000 tok/s uncached prefill while using 376 GB of VRAM.


Cached at: 06/15/26, 09:00 AM

Minimax-M3 running on 4x RTX Pro 6000s

  • 800k context
  • 4x concurrency at 250k
  • 70-120 tok/s
  • 2000 tok/s prefill no cache
  • 376gb vram
  • mxfp4

It’s working on improving the audio on one of my videos, and it’s actually doing a good job of researching solutions.

Good model https://t.co/7QcuzrDnEK
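
For a rough sense of what the quoted figures imply, here is a back-of-envelope sketch in Python. It is not from the post: the function names are hypothetical, and it assumes the 376 GB is the total across all four cards and that prefill throughput stays flat at 2000 tok/s regardless of prompt length, which may not hold at long contexts.

```python
# Back-of-envelope arithmetic on the figures quoted in the post.
# Assumptions (not stated in the post): 376 GB is the total across all
# four GPUs, and throughput is constant regardless of context length.

NUM_GPUS = 4
TOTAL_VRAM_GB = 376
PREFILL_TOK_PER_S = 2000
DECODE_TOK_PER_S = (70, 120)  # (slowest, fastest) quoted generation speed


def vram_per_gpu_gb(total_gb: float, num_gpus: int) -> float:
    """Average VRAM used per card if the load is spread evenly."""
    return total_gb / num_gpus


def prefill_seconds(prompt_tokens: int, tok_per_s: float) -> float:
    """Time to ingest an uncached prompt at a fixed prefill rate."""
    return prompt_tokens / tok_per_s


def decode_seconds(output_tokens: int, tok_per_s_range: tuple) -> tuple:
    """(best, worst) time to generate output_tokens at the quoted range."""
    slow, fast = tok_per_s_range
    return output_tokens / fast, output_tokens / slow


if __name__ == "__main__":
    print("VRAM per GPU (GB):", vram_per_gpu_gb(TOTAL_VRAM_GB, NUM_GPUS))
    print("Prefill 250k prompt (s):", prefill_seconds(250_000, PREFILL_TOK_PER_S))
    print("Prefill 800k prompt (s):", prefill_seconds(800_000, PREFILL_TOK_PER_S))
    best, worst = decode_seconds(1_000, DECODE_TOK_PER_S)
    print(f"Generate 1000 tokens (s): {best:.1f} to {worst:.1f}")
```

Under those assumptions, the load works out to 94 GB per card, an uncached 250k-token prompt takes about 125 seconds to prefill, and a full 800k-token prompt takes about 400 seconds.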
