Agentic harness for theoretical physics research
Summary
Hugging Face releases 'physics-intern', an agentic framework for theoretical physics research that doubles the performance of Gemini models on the CritPt benchmark and sets a new state of the art, ahead of GPT-5.5 Pro.
Similar Articles
@dlouapre: Meet physics-intern, our agentic framework for theoretical physics. It takes Gemini 3.1 Pro from 17.7% to 31.4% on Crit…
Physics-intern is an agentic framework for theoretical physics that improves Gemini 3.1 Pro's performance on the CritPt benchmark from 17.7% to 31.4%, a new state of the art.
@RoundtableSpace: HUGGING FACE JUST AUTOMATED THEIR ENTIRE POST-TRAINING TEAM WITH AN AGENT. It reads papers, runs GPU experiments, itera…
Hugging Face replaced its post-training team with an autonomous agent that reads papers, runs GPU experiments, and improves models, achieving a 22-point benchmark jump in under 10 hours and beating Codex on HealthBench by 60%.
Gemini API showing agentic Gemini models
Google's Gemini API now exposes agentic models, enabling developers to build autonomous AI agents with enhanced reasoning and action capabilities.
Introducing Gemini 2.0: our new AI model for the agentic era
Google DeepMind introduces Gemini 2.0, a new agentic AI model with native image and audio output, enhanced tool use, and multimodal capabilities designed for the next era of AI agents. Gemini 2.0 Flash is now available to developers with wider availability planned for early 2025.
Accelerating Mathematical and Scientific Discovery with Gemini Deep Think
DeepMind announces Gemini Deep Think's ability to solve professional research problems in mathematics, physics, and computer science, highlighted by a new agent 'Aletheia' that iteratively verifies and revises solutions.