Stop asking what model to run. There are literally only two.

Reddit r/LocalLLaMA 06/01/26, 10:29 PM News

local-models qwen rant ai-models open-source gpu vram

Summary

A tech enthusiast argues that only two local AI models (Qwen 3.6 35b a3b and Qwen 3.6 27b) are worth running, dismissing smaller models and recommending heavy quantization of larger models.

Can we please ban the daily "I have an RTX 3060, what should I run?" slop threads? It’s not complicated. As of right now, Hugging Face is empty and exactly two local models exist on this entire planet:

* **Qwen 3.6 35b a3b**
* **Qwen 3.6 27b**

That is the entire list. Your specs don’t matter. Your use case doesn’t matter.

Stop coping with your pristine, near-lossless Q8s of tiny 1B models just because they "fit perfectly in your VRAM." You look ridiculous. Grab a heavily brain-damaged, ultra-low quant of the 35B, force-feed it to your GPU, and let your system RAM bleed. A garbage quant of a massive model is a bagillion times better than your precious micro-models anyway. Just cram it in.

And if you’re going to whine that open source is dead because a local model won’t instantly rewrite your entire enterprise codebase? Fine. Give up, pull out your credit card, and go spend your money on Claude Code like the rest of the contrarians.

Can we pin this so everyone can finally shut up and stop posting? Thanks. Now that that has been solved, let’s go touch grass.
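For anyone who wants to check the "just cram it in" math before downloading, here is a rough back-of-the-envelope sketch. It only estimates the size of the weights (parameters × bits per weight) and ignores KV cache, context length, and runtime overhead. The bits-per-weight figures and the 12 GB VRAM number are illustrative assumptions, not measurements of any specific quant file or card.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def spill_gib(size_gib: float, vram_gib: float) -> float:
    """GiB of weights that would not fit in VRAM and land in system RAM."""
    return max(0.0, size_gib - vram_gib)


# Assumption: a 12 GB card, treated as 12 GiB for simplicity.
VRAM_GIB = 12.0

# Assumption: hypothetical effective bits per weight for each quant level.
candidates = {
    "35B, ultra-low quant (~2.5 bpw)": (35, 2.5),
    "27B, mid quant (~4.5 bpw)": (27, 4.5),
    "1B, Q8-style (~8.5 bpw)": (1, 8.5),
}

for name, (params_b, bpw) in candidates.items():
    size = weight_gib(params_b, bpw)
    spill = spill_gib(size, VRAM_GIB)
    print(f"{name}: ~{size:.1f} GiB weights, ~{spill:.1f} GiB spills to system RAM")
```

Under these assumptions the ultra-low quant of the 35B squeezes its weights into VRAM with little room left for context, while the 27B at a mid quant is the one that actually makes your system RAM bleed. Swap in the real file size of whatever quant you download for a better estimate.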

Similar Articles

Building a local AI server for Qwen3 30B with Q8 is this hardware a good fit?

@jtdavies: Coding on small models... My default model for my 4xDGX Spark cluster is @UnslothAI's Qwen3.6-35B-A3B-NVFP4. I get exce…

I tested 9 local models on the same flight sim prompt, all Q8, different Q providers, MLX

Running local models on an M4 with 24GB memory

Qwen 3.6 27B is the sweet spot for local development
