@no_stp_on_snek: mrcr v2 8-needle at 1m, open weights stack, single rented mi300x. longctx directional 0.688 (n=30, mass-val rerun pendi…
Summary
Shares early MRCR v2 (8-needle, 1M-context) scores for an open-weight model stack run on a single rented AMD MI300X: 0.688 directional (n=30) and 0.601 conservative (n=60), against SubQ's published 0.659.
Cached at: 05/09/26, 03:40 AM
mrcr v2 8-needle at 1m, open weights stack, single rented mi300x.
longctx directional 0.688 (n=30, mass-val rerun pending). conservative mass-val 0.601 (n=60). subq published 0.659.
within striking distance, possibly past on multiquery. open vs closed receipts in the article.
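To put "within striking distance" in numbers, here is a minimal sketch that computes the gap between each quoted score and the SubQ figure. The three values come from the post; the function name and dictionary labels are illustrative only. The samples are small (n=30 and n=60) and the post does not report variance, so this compares point estimates and makes no claim about significance.

```python
# Scores quoted in the post (MRCR v2, 8-needle, 1M context).
scores = {
    "longctx directional (n=30)": 0.688,
    "conservative mass-val (n=60)": 0.601,
}
subq_published = 0.659  # closed-model reference figure quoted in the post


def gap_vs_reference(score, reference):
    """Return (absolute gap, relative gap) of score against reference."""
    diff = score - reference
    return round(diff, 3), round(diff / reference, 3)


# Hypothetical helper output: one (absolute, relative) pair per quoted score.
gaps = {name: gap_vs_reference(s, subq_published) for name, s in scores.items()}

for name, (abs_gap, rel_gap) in gaps.items():
    print(f"{name}: {abs_gap:+.3f} absolute, {rel_gap:+.1%} relative")
```

On these point estimates the directional run is about 0.03 above the published figure and the conservative run about 0.06 below it, which is the range the post describes.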
Similar Articles
@no_stp_on_snek: https://x.com/no_stp_on_snek/status/2052833502475833384
An open-source stack using Qwen2.5-32B-Instruct with longctx and vllm-turboquant on a single AMD MI300X achieves competitive results (0.601-0.688) versus SubQ's closed model (0.659) on the MRCR v2 1M-context benchmark, demonstrating open-weights approaches are within striking distance.
@no_stp_on_snek: small update from the long-context experiments: I got MRCR v2 running out to 1M on a single MI300X droplet with an open…
The author reports successful experiments running MRCR v2 with 1M context length on a single MI300X using Qwen2.5-32B and FAISS, achieving competitive scores at low cost.
@svpino: For the first time, I feel open-weight models are impossible to ignore. We are at a point where these models are compet…
Santiago (@svpino) highlights MiniMax-M2.7, a 230B open-weight model that rivals top proprietary models like Opus 4.6 and GPT-5.4, achieving 440+ tokens/s inference on SambaNova at low cost.
@no_stp_on_snek: Tested out MTP for the first time on my llamacpp fork last night with turbo4 sym. GX10 hardware. using MoE model: llmfa…
Tested Multi-Token Prediction on a llamacpp fork with a Qwen-based MoE model, achieving +0.41% PPL improvement over fp16 baseline.
@no_stp_on_snek: In progress
Promoting Atlas Inference, an open-source inference serving tool that achieved 200+ tok/s on a Qwen3.6-35B-A3B benchmark.