Tag
A discussion of LiveBench results that show Fable 5 performing below Gemini 3.1, questioning whether the benchmark is flawed or whether Anthropic is optimizing for benchmarks.