Where we are. In a year, everything has changed. Kimi - Minimax - Qwen - Gemma - GLM
Summary
The author highlights how rapidly local AI capabilities have improved, enabling tasks once exclusive to top-tier cloud models to run on affordable hardware using models like Qwen 27b and Minimax 2.7.
Similar Articles
@rohanpaul_ai: Qwen 3.7 Max is super close to the frontier models for coding and agentic abilities. And it’s now available on AI/M…
Qwen 3.7 Max, a new AI model from Qwen, is now available via AI/ML API, showing competitive coding and agentic abilities close to frontier models like GPT-5.4 and Gemini 3.5 Flash. Free promo codes are being offered to try it.
The GPUless Revolution: How Efficient AI Models Are Democratizing Artificial Intelligence
A quiet revolution is making powerful AI models runnable on consumer hardware without expensive GPUs, thanks to breakthroughs in quantization and optimized implementations like llama.cpp's Gemma4 MTP support, democratizing access for hobbyists, small businesses, and edge computing.
@cjzafir: Models that I'm using daily: > Codex 5.5 high (fast) > Deepseek v4 pro via API > Kimi 2.6 via API Models that I am fine…
A user shares a personal list of AI models they use daily (Codex 5.5, Deepseek v4 pro, Kimi 2.6) and for fine-tuning (Qwen 3.5 variants, Gemma4 E4B, GPT-oss 20B), aiming to fine-tune Small Language Models into Expert Language Models.
Gemma 4: Byte for byte, the most capable open models
Google DeepMind introduces Gemma 4, its most capable family of open models to date, designed for advanced reasoning and agentic workflows with high intelligence-per-parameter efficiency across multiple sizes.
Introducing Gemma 3
Google introduces Gemma 3, a collection of lightweight open models (1B, 4B, 12B, 27B) designed to run on single GPUs or TPUs, featuring support for 140+ languages, 128k context window, and multimodal capabilities. The models outperform larger competitors like Llama 3 and DeepSeek-V3 while maintaining efficiency for on-device deployment.