Fable 5 below even Gemini 3.1 on LiveBench

Reddit r/singularity News

Summary

A discussion on LiveBench results showing Fable 5 performing below Gemini 3.1, questioning whether the benchmark is flawed or Anthropic is optimizing for benchmarks.

Is this benchmark broken, or is Anthropic benchmaxing? [LiveBench](https://livebench.ai/#/?highunseenbias=true)

Similar Articles

Fable 5 benchmark with remotion video

Reddit r/singularity

Fable 5 shows overall improvement over Opus 4.8 in video generation benchmarks, but Gemini 3.1 Pro demonstrates more artistic vision despite issues with tool calls and buggy code.

Gemini 3.5 Flash Benchmarks

Reddit r/singularity

Benchmark results for the Gemini 3.5 Flash model are discussed, likely showcasing its performance across various AI tasks.

Claude Fable 5 benchmarks

Reddit r/singularity

Anthropic released benchmarks for Claude Fable 5, a new AI model, showing significant performance improvements.