Tag
GLM 5.2 has been released with open weights under MIT license on HuggingFace, available via API and Ollama, featuring competitive benchmarks that trail Opus 4.8 by a point and edge GPT-5.5 by one.
GLM-5.2 is the first open-weights model to exceed 80% on Terminal-Bench, surpassing all other open models and even Gemini, making it a frontier-level model at a fraction of the cost.
Qwable-v1 is an open-weights agentic coding model (35B MoE, 3B active) built by chaining distills from Claude Opus 4.7 reasoning and Claude Fable-5 agentic tool-use traces. It can think in explicit CoT chains and act as a Claude-Code-style agent when prompted.
A user questions why Huawei's Atlas cards are not widely adopted and speculates on China's potential to produce consumer GPUs to challenge Nvidia's monopoly.
A poll on X shows MIT-licensed open weights are losing with 7 hours left and 1,800 votes cast.
A technical overview of the state of local AI models in mid-2026, highlighting how open-weight models have narrowed the gap to frontier models through advances in mixture-of-experts and sparse attention, enabling efficient local inference.
A tweet highlights an excellent WWDC video by Angelos Kath on building local agentic AI with MLX, noting rapid progress in open-weight models and hardware capabilities.
Avataar AI launches Varya, a video generation model optimized for India's scale and cultural context, using distillation from Wan 2.2 to achieve 20x cost reduction and local nuance understanding.
MiniMaxAI announces plans to release open weights for its upcoming M3 model on Friday, following the earlier M2.7 model.
Google released DiffusionGemma, an open-weight text generation model (26B parameters, 4B active) under Apache 2 license, demonstrating high inference speeds via NVIDIA's NIM cloud API.
Modular's kernel team is optimizing serving for MiniMax M3's 1M-token context and native multimodality, with open weights dropping soon for immediate deployment on Modular.
A developer ran DeepSeek-V4-Flash on a Raspberry Pi 5 by streaming model weights from an NVMe SSD, achieving 1.3 tokens/second at 8 watts, demonstrating the feasibility of frontier-adjacent open-weight models on low-cost, offline hardware.
A paper accepted at ICML 2026 introduces predictable hallucination via an information-budget abstention gate, and releases ntkMirror, a training-free open-weight implementation that reduces hallucination by abstaining when information is insufficient, achieving 0.0–0.7% hallucination at ~24% abstention.
Cohere and Cohere Labs released North Mini Code, an open weights 30B-A3B parameter model optimized for code generation, agentic software engineering, and terminal tasks, with strong benchmark results on SWE-Bench and Terminal-Bench.
Omi Health founder fine-tuned NVIDIA's Parakeet TDT 0.6B for medical ASR, releasing open-weights model Omi Med STT v1 that achieves competitive medical-WER while running locally on Mac, CUDA, or CPU.
The article questions why ternary language models like BitNet have not scaled beyond 2B parameters, given their initial promise, and discusses the apparent lack of progress from open-weight AI labs.
A recap of an extraordinary week in open AI, featuring over 25 open-weight model releases across LLMs, image generation, audio/speech, vision, and video/3D, with notable contributions from NVIDIA, Google, and others.
Cohere Labs released North Mini Code, a 30B-parameter (3B active) open-weights model optimized for code generation, agentic software engineering, and terminal tasks, licensed under Apache 2.0.
Google DeepMind releases Gemma 4 models optimized with Quantization-Aware Training (QAT) in multiple formats including GGUF, enabling high quality with reduced memory requirements.
NVIDIA released Nemotron 550B Ultra, a large language model featuring a clean XML-based tool calling interface instead of JSON schemas, with tool results delivered as user messages in XML tags.