quantized

@outsource_: NEW GLM+ QWEN 18B RUNS ON CONSUMER GPU IT BEATS 35B MoE AT HALF THE VRAM @KyleHessling1 just dropped the healed Qwopus-…

X AI KOLs Timeline ↗ · 2026-04-20 Cached

Qwopus-GLM-18B-GGUF, a new 18B quantized GLM + Qwen merge, reportedly outperforms a 35B MoE model while using half the VRAM and running on consumer GPUs.

@rohanpaul_ai: Gemma 4 (specifically its edge-optimized E2B and E4B variants) running fully offline on an iPhone via apps like Locally…

X AI KOLs Following ↗ · 2026-04-19 Cached

Google’s Gemma 4 E2B/E4B quantized variants now run fully offline on iPhone via apps like Locally AI, leveraging the Apple Neural Engine for on-device inference.

Jiunsong/supergemma4-26b-uncensored-gguf-v2

Hugging Face Models Trending ↗ · 2026-04-11 Cached

SuperGemma4-26B-Uncensored-Fast GGUF v2 is a quantized, locally runnable variant of Google's Gemma-4-26B model optimized for Apple Silicon, offering faster inference and less-censored chat behavior while maintaining practical performance on general tasks.

Jiunsong/supergemma4-26b-uncensored-mlx-4bit-v2

Hugging Face Models Trending ↗ · 2026-04-10 Cached

SuperGemma4-26B-Uncensored-MLX-4bit-v2 is a fine-tuned, 4-bit quantized variant of Google's Gemma 4 26B optimized for Apple Silicon, offering improved performance on code, reasoning, and tool-use tasks while running faster than the stock baseline.
