dense-model

@analogalok: Gemma 4 12B QAT (dense) achieves 1000+ tokens/sec prefill on 8GB VRAM with 120k context Gemma 4 12B QAT (dense), TurboQ…

X AI KOLs Following · 7h ago

Gemma 4 12B QAT (dense) achieves over 1000 tokens per second prefill on an 8GB RTX 4060 with 120k context using TurboQuant, enabling full GPU layer offloading. This represents a 42% increase in prefill speed over previous methods.

@analogalok: my 8 GB VRAM gaming laptop is absolutely going to hate me for this. but I still did it. ran a 31b dense model (Gemma 4 …

X AI KOLs Timeline · yesterday

A user runs the Gemma 4 31B dense model on a gaming laptop with 8 GB of VRAM at about 3 tokens/sec, using llama.cpp with MTP speculative decoding. The run shows that a 31B dense model is feasible on consumer hardware, and the author proposes agentic workflows in which a fast MoE model routes hard tasks to the slower dense model.

@mtschannen: For the past years my research focus was on unifying models and training paradigms across modalities. Today I'm excited…

X AI KOLs Timeline · 2026-06-03

A Google DeepMind researcher announces the release of Gemma 4 12B, a dense, encoder-free model that processes text, image, and audio inputs, continuing work on unifying models across modalities.

@lmstudio: Gemma 4 12B is here! Dense, mid-sized Gemma that fits right on your laptop - released by @google under Apache 2.0 Avail…

X AI KOLs Timeline · 2026-06-03

Google released Gemma 4 12B, a dense mid-sized model that runs on laptops, under Apache 2.0, now available in LM Studio.

Dense vs. MoE gap is shrinking fast with the 3.6-27B release

Reddit r/LocalLLaMA · 2026-04-22

A new 3.6-27B release shows MoE closing the performance gap with dense models, especially in coding tasks and large context windows, though dense still leads overall.

Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model

Simon Willison's Blog · 2026-04-22

Qwen releases Qwen3.6-27B, a 27B dense model claiming flagship-level coding performance surpassing the larger Qwen3.5-397B-A17B MoE, with impressive SVG generation demos.

Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model

Hacker News Top · 2026-04-22

Alibaba releases Qwen3.6-27B, a 27-billion-parameter dense model delivering flagship-level coding performance.
