1000 tps generation on Qwen3.6 27B with V100s

Reddit r/LocalLLaMA 05/25/26, 04:42 AM Models

qwen v100 inference-benchmark tps concurrent-requests generation-speed

Summary

Achieved 1000 tokens per second of generation on Qwen3.6 27B using V100 GPUs with 128 concurrent requests, and around 80 t/s for a single user.

I wanted to see what the absolute best-case scenario for generation on this setup was, and I was not disappointed. 128 concurrent requests is far removed from what I need, but it’s funny to see a big number. For a single user (batch 1, not 128), generation is around 80 t/s with 3000 t/s prompt processing, no MTP!!
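To put the two figures side by side, here is a rough back-of-the-envelope sketch. It assumes the 1000 t/s figure is the aggregate across all 128 requests and that it is split evenly between them; the post does not say either of those things explicitly. The function names are made up for illustration.

```python
# Rough arithmetic on the numbers quoted in the post.
# Assumption: 1000 t/s is the total across all 128 concurrent requests,
# divided evenly between them. The post does not confirm this.

AGGREGATE_TPS = 1000.0    # generation throughput at 128 concurrent requests
CONCURRENCY = 128         # number of concurrent requests
SINGLE_USER_TPS = 80.0    # generation speed at batch size 1


def per_request_tps(aggregate_tps: float, concurrency: int) -> float:
    """Average tokens/sec each request would see under an even split."""
    return aggregate_tps / concurrency


def batching_gain(aggregate_tps: float, single_user_tps: float) -> float:
    """How many times more total throughput batching gives vs. batch 1."""
    return aggregate_tps / single_user_tps


per_stream = per_request_tps(AGGREGATE_TPS, CONCURRENCY)
gain = batching_gain(AGGREGATE_TPS, SINGLE_USER_TPS)

print(f"per-request speed at {CONCURRENCY} concurrent: {per_stream:.2f} t/s")
print(f"total throughput vs. single user: {gain:.1f}x")
```

Under those assumptions, each of the 128 requests would see under 8 t/s on average, so the big number is a total-throughput figure rather than something a single user would experience.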

Similar Articles

125 tok/s for Qwen3.6 q4xl on 2x 4060ti is insane perf/dollar

Reddit r/LocalLLaMA

A user reports achieving 125 tokens per second running Qwen3.6 q4xl on two RTX 4060 Ti GPUs, highlighting excellent performance per dollar and wondering if further optimization can reach 150 tok/s.

Qwen 3.6 benchmarks on 2x RTX PRO 6000

Reddit r/LocalLLaMA

Benchmarks for Qwen 3.6 27B and 35B models on dual RTX PRO 6000 GPUs using vLLM, showing generation throughput up to 3500 tokens per second.

What speed is everyone getting on Qwen3.6 27b?

Reddit r/LocalLLaMA

User benchmarks Qwen3.6-27B-Q8_0 at ~13 tokens/sec on 3 mixed GPUs with 128k context via llama.cpp, asking if performance is typical.

MI50s Qwen 3.6 27B @52.8 tps TG @1569 tps PP (no MTP, no Quant)

Reddit r/LocalLLaMA

Benchmark results for running Qwen 3.6 27B on AMD MI50 GPUs using a custom vLLM fork, achieving 52.8 tokens/s TG and 1569 tokens/s PP without quantization or MTP, demonstrating usability for agentic tasks on 2018 hardware.

@ItsmeAjayKV: Achievement Unlocked: Running Qwen3.6-27b dense Thanks to the RTX 3090, now I can do this. Running @Alibaba_Qwen Qwen 3…

X AI KOLs Timeline

User benchmarks Qwen3.6-27B on an RTX 3090 using llama.cpp, achieving 35 tok/s generation and 1247 tok/s prompt processing.
