hopper

Here are some tips on hitting nearly 200 tok/s for DeepSeek v4 Flash on Hopper

Reddit r/LocalLLaMA · 2026-06-08

This Reddit post shares tips and benchmarks for reaching nearly 200 tokens per second of inference throughput with DeepSeek V4 Flash using vLLM on a dual GH200 workstation. It highlights a quantized checkpoint from Canada-Quant and tensor-parallelism optimizations.
