@antirez: DS4 running on DGX Spark (GB10 / CUDA), private branch for now. 12 tokens/sec, the memory bandwidth is limited in this …

X AI KOLs Timeline 05/10/26, 07:49 AM News

Summary

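Antirez reports DS4 inference running on the DGX Spark (GB10 / CUDA) from a private branch: about 12 tokens/sec generation, held back by the system's 270GB/sec memory bandwidth, with prefill at ~200 t/s, which he describes as much closer to M3 Max. He plans to release the code when it is more mature and expects it to be merged.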

DS4 running on DGX Spark (GB10 / CUDA), private branch for now. 12 tokens/sec, the memory bandwidth is limited in this system, at 270GB/sec. But prefill is way more aligned to M3 Max at ~200 t/s. I'll release when more mature, but it is almost sure that it will get merged. https://t.co/LVYSDQ4Hnp
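
The relation between the two generation figures in the post (12 tokens/sec and 270GB/sec) can be illustrated with a rough back-of-the-envelope estimate. This is only a sketch and not part of the post: it assumes token generation is purely memory-bandwidth bound and that the full 270GB/sec is actually reached, and the function names are hypothetical.

```python
def max_gb_read_per_token(bandwidth_gb_s: float, tokens_per_s: float) -> float:
    """Upper bound on memory traffic per generated token, in GB.

    Assumption: decoding is limited only by memory bandwidth, so each
    token cannot cost more than bandwidth / throughput in memory reads.
    """
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return bandwidth_gb_s / tokens_per_s


def bandwidth_bound_tokens_per_s(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Best-case tokens/sec if every token needs gb_read_per_token GB of reads."""
    if gb_read_per_token <= 0:
        raise ValueError("gb_read_per_token must be positive")
    return bandwidth_gb_s / gb_read_per_token


if __name__ == "__main__":
    # Figures quoted in the post: 270 GB/sec bandwidth, 12 tokens/sec generation.
    print(max_gb_read_per_token(270.0, 12.0))  # 22.5
```

Under these assumptions, 12 tokens/sec at 270GB/sec corresponds to at most 22.5 GB of memory traffic per generated token. Real systems rarely sustain peak bandwidth, so the actual per-token traffic is likely lower than this bound.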


Cached at: 05/10/26, 10:23 AM


Similar Articles

@antirez: I just pushed a big refactoring of DS4 backends with CUDA support and single direction activation steering. The Metal p…

Dual dgx spark (Asus GX10) MiniMax M2.7 results

@ttasanen: Just fired up DS4 by @antirez on my Mac Studio M3 Ultra 256GB and man, it’s seriously impressive. A clean, purpose-buil…

Gemma 4 26B Hits 600 Tok/s on One RTX 5090

@mitsuhiko: And the ds4 SSD caches are great. This is continuing a session after the server was shut down which was already 63k tok…
