Where we are. In a year, everything has changed. Kimi - Minimax - Qwen - Gemma - GLM

Reddit r/LocalLLaMA 04/21/26, 08:41 AM News

Summary

The author highlights how rapidly local AI capabilities have improved, enabling tasks once exclusive to top-tier cloud models to run on affordable hardware using models like Qwen 27b and Minimax 2.7.

I know benchmarks are questionable, imprecise for individual use cases, and LLMs are often trained to excel... But we're not talking about numbers here. We're talking about a trend. When I was using GPT 4o or Sonnet 3.7, if you had told me I could do all of those things locally so soon, I wouldn't have believed it. Now it's happening, and not just for those with 400GB of VRAM. It's also happening on more affordable hardware. If Qwen 3.6 27b actually comes out soon, I think it will be truly incredible.

True, we're seeing licenses change and a growing need for monetization among open source developers. But it's a really great time. Yesterday, using the odd combo of Qwen 27b + Minimax 2.7 Q4, I completed tasks that I normally couldn't finish without Claude. For those who want GLM 5 Air... rediscover 4.7, which is still very good and smaller.

This is a chart that answers many questions I read here daily.
