model-benchmarks


Qwen3.7 Max scored by Artificial Analysis, 27B/35B waiting room

Reddit r/LocalLLaMA · 2026-05-20

Qwen3.7 Max ranks 5th on Artificial Analysis benchmarks, matching GPT-5.4 and outperforming Gemini 3.5 Flash, while Qwen3.6 27B trails significantly.


@0xLogicrw: Google DeepMind researcher Lun Wang announces departure, and in a long post completely dismisses the current AI evaluation approach. The current evaluation systems are all 'fighting the last war' — they can only passively test capabilities the model already possesses, and have no way to predict what new abilities the next generation of models will suddenly evolve. Compared to data, …

X AI KOLs Timeline · 2026-05-18

Google DeepMind researcher Lun Wang leaves the company and writes a post criticizing the current AI evaluation system, arguing that it lags behind model evolution and cannot predict new capabilities, leaving the industry in a state of 'flying blind'.
