Researchers benchmarked seven frontier models on autoresearch tasks. Fable-5 ranked first overall, but the open model Kimi-K2.7-Code outperformed the others on ML engineering tasks.