DeepReinforce releases Ornith-1.0, an open-weight MIT-licensed LLM family built on Gemma 4 and Qwen 3.5, achieving state-of-the-art coding performance among comparable open-source models.
# Ornith-1.0: Self-Scaffolding LLMs for Agentic Coding
Source: [https://simonwillison.net/2026/Jun/29/ornith/](https://simonwillison.net/2026/Jun/29/ornith/)
29th June 2026 \- Link Blog
**[Ornith\-1\.0: Self\-Scaffolding LLMs for Agentic Coding](https://deep-reinforce.com/ornith_1_0.html)**\. This is an interesting new open weights \(MIT licensed\) model, the first model release from DeepReinforce\.
> \[\.\.\.\] with variants including 9B Dense, 31B Dense, 35B MoE, and 397B MoE\. Built on top of pretrained Gemma 4 and Qwen 3\.5, it achieves state\-of\-the\-art performance among open\-source models of comparable size on coding benchmarks\.
As far as I can tell the licenses of those underlying models is compatible with being used in this way \- Gemma 4 is Apache 2\.0 licensed \(and not bound by the janky additional[Gemma Terms of Use](https://ai.google.dev/gemma/terms)that afflicted the previous Gemma models\) and Qwen 3\.5 is Apache 2\.0 licensed as well\.
I've been running the model using LM Studio and the[ornith\-1\.0\-35b\-Q4\_K\_M\.gguf](https://huggingface.co/deepreinforce-ai/Ornith-1.0-35B-GGUF)\(20GB\) GGUF, hooked up to[Pi](https://pi.dev/)\. Initial impressions are very good \- it seems to be able to run the agent harness over many tool calls in a proficient way\.
Here's[a terminal session](https://gisthost.github.io/?35da4d9ce7f0c27124c67655a0dc9e5d)where I asked it to "find the code that decodes the actor cookie" and then "find the code that opens the insert dialog when thebutton is clicked" against a Datasette checkout, which it handled with ease\.
I also had it[draw this pelican](https://gist.github.com/simonw/1869e1bbcafe5bcad0f26351f6a978a6), which came out at 103 tokens/second:

It's a little bit mangled but the pelican is clearly a pelican\.
I couldn't find much information about DeepReinforce themselves\. The earliest paper I could find from the was[CUDA\-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning](https://arxiv.org/abs/2507.14111)from June 2025\.
Deep Reinforce releases Ornith-1.0, a family of open-source self-improving LLMs for agentic coding, spanning 9B to 397B parameters and achieving state-of-the-art performance on benchmarks like SWE-Bench Verified and Terminal-Bench 2.1, surpassing Claude Opus 4.7 and other leading open-source models.
DeepReinforce open-sources Ornith-1.0, a family of self-improving coding models from 9B to 397B parameters, trained on Gemma 4 and Qwen 3.5 foundations, featuring a novel RL approach that learns to generate its own scaffolds.
Ornith-1.0 is a family of open-source, self-improving models for agentic coding, achieving state-of-the-art performance on coding benchmarks via reinforcement learning that jointly optimizes scaffold and solution rollouts.
DeepReinforce releases Ornith-1.0, an MIT-licensed open-source family of agentic coding LLMs including a 397B MoE model that surpasses Claude Opus 4.7 on SWE-Bench and Terminal-Bench, using a novel self-improving training strategy.
deepreinforce-ai releases Ornith-1.0-35B-GGUF, a state-of-the-art open-source coding agent model that uses self-improving reinforcement learning to jointly optimize scaffold and solution generation, achieving SOTA performance on coding benchmarks.