Tag
The author evaluates Ornith-9B against its base model, Qwen3.5-9B, and finds that RL post-training improves token efficiency and sustained coding coherence but sacrifices single-turn judgment and robustness to misleading inputs. This makes the 9B model a narrower upgrade than the 35B version.