quantization-aware-training

#quantization-aware-training

@_philschmid: Weights: https://huggingface.co/collections/google/gemma-4-qat-q4-0… Blog: https://blog.google/innovation-and-ai/techno…

X AI KOLs Following · yesterday

Google released Gemma 4 models with quantization-aware training (QAT) at Q4_0 precision on Hugging Face, offering efficient variants from 5B to 33B parameters.
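To make the Q4_0 storage cost concrete, here is a minimal sketch of block-wise 4-bit quantization. It assumes the GGUF Q4_0 layout of 32 weights per block with one 16-bit scale per block, which works out to 4.5 bits per weight. The rounding scheme is a simplified symmetric one, not the exact ggml routine, and the function names are hypothetical.

```python
import numpy as np

BLOCK = 32  # Q4_0 groups weights into blocks of 32 values


def quantize_q4_blocks(w):
    """Simplified symmetric 4-bit block quantization (illustrative only)."""
    w = np.asarray(w, dtype=np.float32).reshape(-1, BLOCK)
    amax = np.abs(w).max(axis=1, keepdims=True)
    # One scale per block, stored in 16 bits; computed in float32 afterwards.
    scale = (amax / 7.0).astype(np.float16).astype(np.float32)
    safe = np.where(scale == 0, 1.0, scale)  # avoid division by zero
    q = np.clip(np.rint(w / safe), -8, 7).astype(np.int8)
    return q, scale


def dequantize_q4_blocks(q, scale):
    """Map 4-bit codes back to approximate float weights."""
    return q.astype(np.float32) * scale


def bits_per_weight():
    """32 four-bit codes plus one 16-bit scale, averaged over the block."""
    return (BLOCK * 4 + 16) / BLOCK


rng = np.random.default_rng(0)
weights = rng.normal(size=4 * BLOCK).astype(np.float32)
codes, scales = quantize_q4_blocks(weights)
restored = dequantize_q4_blocks(codes, scales)
print(bits_per_weight(), 16 / bits_per_weight())
```

Under these assumptions the weights shrink by roughly 3.6x relative to 16-bit storage. QAT does not change this storage cost; it trains the model so that quality holds up after rounding to such a grid.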


@_philschmid: More Gemma 4! New QAT Gemma 4 checkpoints with similar performance while using ~4x less memory! It comes with a new mob…

X AI KOLs Following · yesterday

New QAT Gemma 4 checkpoints offer similar performance with ~4x less memory, enabling a 1GB memory footprint for Gemma 4 E2B via a new mobile quantization format.


Gemma 4 QAT models: Optimizing compression for mobile and laptop efficiency

Hacker News Top · 4d ago

Google releases Gemma 4 models optimized with Quantization-Aware Training (QAT) to improve efficiency for mobile and laptop deployment, reducing memory footprint to 1GB for the E2B model while preserving quality.


google/gemma-4-12B-it-qat-q4_0-gguf

Hugging Face Models Trending · 5d ago

Google DeepMind releases Gemma 4 models optimized with Quantization-Aware Training (QAT) in multiple formats including GGUF, enabling high quality with reduced memory requirements.


Max-Window Scale Estimation for Near-Lossless HiF8 W8A8 Quantization-Aware Training

arXiv cs.LG · 2026-05-27

This paper systematically studies HiF8 W8A8 quantization-aware training for OpenPangu-Embedded-1B, identifying and addressing failure modes such as amax saturation and catastrophic forgetting, achieving near-lossless performance with a 64-step max-algorithm DTS strategy and a 500-step BF16 warmup.
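As a rough illustration of windowed scale estimation, the sketch below keeps the last 64 observed amax values and derives the scale from their maximum. It assumes that "max-algorithm" means taking the maximum over a sliding window of recent steps; the paper's actual procedure may differ. The class name `MaxWindowScale` and the `fmt_max` parameter are hypothetical, and the default of 448.0 is the largest finite FP8 E4M3 value, used only as a stand-in rather than HiF8's real range.

```python
from collections import deque


class MaxWindowScale:
    """Sliding-window amax tracker (illustrative, not the paper's code)."""

    def __init__(self, window=64, fmt_max=448.0):
        # deque with maxlen drops the oldest amax once the window is full.
        self.history = deque(maxlen=window)
        self.fmt_max = fmt_max  # hypothetical placeholder for the format's max

    def update(self, amax):
        """Record the absolute maximum observed at the current step."""
        self.history.append(float(amax))

    def scale(self):
        """Scale that maps the window's largest amax onto fmt_max."""
        if not self.history:
            return 1.0
        window_max = max(self.history)
        return self.fmt_max / window_max if window_max > 0 else 1.0


est = MaxWindowScale(window=64)
for step_amax in [0.5, 2.0, 1.0]:
    est.update(step_amax)
print(est.scale())
```

Using the window maximum rather than only the latest amax keeps one quiet step from inflating the scale, so a later spike is less likely to saturate, which is the kind of amax saturation the summary mentions.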
