@SlimTradeyBaby: Just read @no_stp_on_snek review of the new Ornith-1.0 35B coder easily one of the best model write-ups I've seen in a …

X AI KOLs Following 06/26/26, 12:17 PM Models

coding ai-model review benchmarks agentic-tasks long-horizon

Summary

A review of the new Ornith-1.0 35B coding model that bypasses public benchmarks and tests it on real agentic tasks, highlighting its strengths in long-horizon coding and coherence, as well as costs like verbosity.

Just read @no_stp_on_snek's review of the new Ornith-1.0 35B coder, easily one of the best model write-ups I've seen in a long time. He cuts straight through the promo hype, skips the easily gamed public benchmarks, and instead runs real h2h tests on held-out agentic tasks. The result is a clear, honest breakdown: Ornith's genuine strengths in long-horizon coding and coherence, plus its real costs (more cautious, more verbose). No hype, no copium, just practical receipts on where it wins and where it trades off. Exactly the kind of grounded analysis this space needs. Great work, Tom. Give the man a follow: always great content and solid git!

Original Article

Cached at: 06/26/26, 04:13 PM

Tom Turney (@no_stp_on_snek): a new 35B coder dropped (Ornith-1.0) and a promo blog says it “crushes” the benchmarks. my first instinct was benchmaxx, public test sets like SWE-Bench and Terminal-Bench are easy to overfit. so i ignored the benchmarks and ran it head-to-head against stock Qwen3.6-35B on my own

Similar Articles

@no_stp_on_snek: a new 35B coder dropped (Ornith-1.0) and a promo blog says it "crushes" the benchmarks. my first instinct was benchmaxx…

@SlimTradeyBaby: Just fired up Ornith 35B Q4 on the 5090 remotely… 2329 prompt / 195 gen tok/s and rock solid at 32k. Quick test only fu…

@no_stp_on_snek: one last thing: the real downside i found testing Ornith-1.0 (the new agentic coder): it over-gates legitimate work. on…

@no_stp_on_snek: verdict up front: it's a "pass" in my book in certain categories, just a narrower one than the 35B. you're buying real …

@SixZzshOtRipZz: I can advocate for this I ran a similar test to see if Ornith would cave on decision making, even attempting to trick i…
