@davideciffa: If you have an Nvidia RTX 4090 --ddtree-budget 36 is the best configuration that buys you 2.5x speed up during decoding…

Summary

A tweet recommending --ddtree-budget 36 as the best setting on an Nvidia RTX 4090, claiming a 2.5x decoding speedup for Qwen3.6_27B and crediting a linked benchmark.

If you have an Nvidia RTX 4090 --ddtree-budget 36 is the best configuration that buys you 2.5x speed up during decoding for Qwen3.6_27B. Thanks for the benchmark https://t.co/bs8xGnAl76 🙌 https://t.co/mO82mEWH7S

Cached at: 05/24/26, 04:35 PM

Similar Articles

Best Settings for 48GB VRAM + Qwen 3.6 27B

Reddit r/LocalLLaMA

A user shares optimized settings for running Qwen3.6 27B (Q8_0) on a dual GPU setup (RTX 4090 + RTX 3090) with llama.cpp, achieving 75-100 t/s and 1500 pp with 250k context.