production-traffic

#production-traffic

I stopped trusting model benchmarks and started running my own eval set, here is what changed [D]

Reddit r/MachineLearning · yesterday

The author describes losing faith in public AI model benchmarks because of vendor-created metrics, self-reported parameters, and a lack of independent verification. The post advocates building custom evaluation sets from real production traffic so that model comparisons are more relevant.


#production-traffic

@heyshrutimishra: Most LLM routers are static rules; OrcaRouter is a router that learns. It embeds every prompt, scores it against past p…

X AI KOLs Following · 2026-05-08

OrcaRouter is a learning-based LLM router that routes each prompt to an appropriate model based on quality, cost, speed, and reliability, and that improves over time with production traffic.

