#31b

Gemma 4 QAT 31B responds better to KV cache quantization too

Reddit r/LocalLLaMA · yesterday

According to the post, the Gemma 4 QAT 31B model also holds up better when its KV cache is quantized, which suggests it can run inference with a smaller cache memory footprint at less cost to output quality.
