gpt-oss

#gpt-oss

Split my agent into a cheap router model and a premium synthesis model, bill dropped about 75%

Reddit r/AI_Agents ↗ · 2026-05-19

A developer splits their AI agent's LLM calls between a cheap router model (GPT-OSS 120B) for tool selection and a premium model (gpt-5.4) for synthesis, cutting costs by ~78% while maintaining output quality.


#gpt-oss

Effort as Ceiling, Not Dial: Reasoning Budget Does Not Modulate Cognitive Cost Alignment Between Humans and Large Reasoning Models

arXiv cs.CL ↗ · 2026-05-19

This paper tests whether varying inference-time reasoning effort affects the alignment between large reasoning models' chain-of-thought lengths and human reaction times. Results show the alignment is invariant to effort perturbations, suggesting it is established during training rather than adjusted at inference time.


#gpt-oss

@populartourist: Qwen3.6 27B and 35B-A3B are amazing models, but nothing reaches the efficiency of GPT-OSS yet. Qwen3.6 35B-A3B is as fa…

X AI KOLs Timeline ↗ · 2026-05-16

A tweet compares the Qwen3.6 27B and 35B-A3B models to GPT-OSS, noting that while the Qwen models are fast, GPT-OSS is more efficient, especially in prefill performance.

