open-weight

#open-weight

@rasbt: And another open-weight release. Nemotron 3 Ultra has an ultra impressive capability:efficiency ratio! Design-wise, it …

X AI KOLs Timeline ↗ · 2026-06-04 Cached

Nemotron 3 Ultra is an open-weight release with an impressive capability-to-efficiency ratio, using a Mamba-2-attention hybrid stack and LatentMoE, and is larger than the previous Super variant.


Ideogram 4.0

Product Hunt ↗ · 2026-06-04

Ideogram 4.0 is released as an open-weight model with layout control for generating design-ready images.

Ideogram 4 (GitHub Repo)

Ideogram 4 is an open-weight text-to-image model trained from scratch, featuring structured JSON prompting, best-in-class multilingual text rendering, bounding-box layout controls, color-palette controls, and native 2K resolution output.


@rasbt: It's been a while! 4 nice additions to the open-weight local-LLM-on-consumer-hardware ecosystem:

X AI KOLs Timeline ↗ · 2026-06-03 Cached

Sebastian Raschka highlights four recent additions to the open-weight local LLM ecosystem that can run on consumer hardware.


MiniMax promises M3 weights after 1M-context model launch (2 minute read)

TLDR AI ↗ · 2026-06-03 Cached

MiniMax released M3, a model with a 1M-token context window and native multimodal input, via API. The company promises an open-weight release and a technical report within 10 days.


JetBrains's Mellum 2 (49 minute read)

TLDR AI ↗ · 2026-06-02 Cached

JetBrains releases Mellum 2, a 12B-parameter open-weight Mixture-of-Experts language model specialized in software engineering, with competitive performance in code generation, reasoning, and tool use, available under Apache 2.0.


@_djdumpling: Luke is one of the best people when it comes to RL infra, definitely worth reading!

X AI KOLs Timeline ↗ · 2026-06-01 Cached

Luke J. Huang's new blog post surveys asynchronous reinforcement learning theory and infrastructure across 8 open-weight frontier labs, addressing algorithmic techniques and systems fixes for train-inference mismatch.


MiniMax M3 - Coding & Agentic Frontier, 1M Context, Multimodal

Reddit r/LocalLLaMA ↗ · 2026-06-01 Cached

MiniMax releases M3, an open-weight model with frontier coding and agentic capabilities, a 1M-token context window, and native multimodal support. It posts top benchmark results on coding and agentic tasks, with autonomous task decomposition and long-context support.


@Miles_Brundage: I am not sure I have seen a good analysis of how much distillation reduces this gap - people have very different views …

X AI KOLs Timeline ↗ · 2026-05-30 Cached

Miles Brundage comments on the lack of quantitative analysis of how much distillation narrows the capability gap between open-weight and proprietary AI models, referencing a claim by Epoch AI that open-weight models lag by four months.


@EpochAIResearch: We took another look at the capability gap between open-weight and proprietary models. Since the start of the year, ope…

X AI KOLs Following ↗ · 2026-05-29 Cached

Epoch AI Research analyzed the capability gap between open-weight and proprietary AI models, finding that open-weight models have been trailing the state of the art by approximately four months since the start of the year.


Mellum2 Technical Report

Hugging Face Daily Papers ↗ · 2026-05-29 Cached

Mellum 2 is a 12B-parameter open-weight MoE language model by JetBrains with 2.5B active parameters, specialized in software engineering tasks and optimized for efficient inference on commodity GPUs.


NuExtract3 released: open-weight 4B VLM for Markdown, OCR and structured extraction (self-hostable) [P]

Reddit r/MachineLearning ↗ · 2026-05-22

Numind released NuExtract3, a 4B open-weight vision-language model based on Qwen3.5-4B, designed for document-image-to-Markdown conversion, OCR, and structured data extraction. It is Apache-2.0 licensed and self-hostable, with quantized versions for low-VRAM setups.


Waiting for Qwen 3.7 open weight... The new King has arrived...

Reddit r/LocalLLaMA ↗ · 2026-05-21

The Qwen 3.7 open-weight model has been released, generating significant hype in the AI community as a new top-tier model.


Stable Audio 3.0 (3 minute read)

TLDR AI ↗ · 2026-05-21 Cached

Stability AI released Stable Audio 3.0, an open-weight model family for variable-length audio generation up to six minutes, with support for LoRA fine-tuning and audio inpainting, trained on fully licensed data.


@thepatch_kev: some ai music models are actually made with musicians in mind stable audio 3 is a great example of that. grateful to @z…

X AI KOLs Following ↗ · 2026-05-20 Cached

Stability AI has released Stable Audio 3.0, an open-weight model family for generative audio, designed for artistic experimentation and for integration into DAWs through tools such as gary4juce.


MiroThinker-1.7, an open-weight deep research agent (Qwen3 MoE base) — mini is 30B/3B active, curious what tok/s people get on consumer hardware

Reddit r/LocalLLaMA ↗ · 2026-05-17

MiroThinker-1.7 is an open-weight deep research agent built on Qwen3 MoE, with a mini version (30B total, 3B active) designed for consumer hardware; the team shares benchmarks and seeks feedback on local deployment.


@jerryjliu0: A new set of open-weight models is topping the leaderboard for document understanding INF just released two models: Inf…

X AI KOLs Following ↗ · 2026-05-15 Cached

Infinity releases two open-weight models, Infinity-Parser2-Pro (35B) and Infinity-Parser2-Flash (2B), which top the ParseBench leaderboard for document understanding, leveraging a synthetic data engine and a novel joint RL algorithm.


@svpino: For the first time, I feel open-weight models are impossible to ignore. We are at a point where these models are compet…

X AI KOLs Following ↗ · 2026-05-15

Santiago (@svpino) highlights MiniMax-M2.7, a 230B open-weight model that rivals top proprietary models like Opus 4.6 and GPT-5.4, achieving 440+ tokens/s inference on SambaNova at low cost.


@poolsideai: Poolside is hosting a 2-day model research hackathon in London. Join us to push an open-weight agent model as far as yo…

X AI KOLs Following ↗ · 2026-05-13

Poolside is hosting a 2-day model research hackathon in London, where participants push an open-weight agent model further by applying RL and fine-tuning to Laguna XS.2. Partners include NVIDIA, Prime Intellect, and Hugging Face, and the prize is an NVIDIA DGX Spark.


HEBATRON: A Hebrew-Specialized Open-Weight Mixture-of-Experts Language Model

arXiv cs.CL ↗ · 2026-05-13 Cached

Hebatron is a new open-weight, Hebrew-specialized large language model built on NVIDIA's Nemotron-3 Mixture-of-Experts architecture, achieving strong reasoning performance with efficient inference. It is the first language-specific adaptation of this architecture and supports native long-context processing.
