@dlouapre: Meet physics-intern, our agentic framework for theoretical physics. It takes Gemini 3.1 Pro from 17.7% to 31.4% on Crit…

X AI KOLs Following 05/12/26, 03:09 PM

Summary

Physics-intern is an agentic framework for theoretical physics that raises Gemini 3.1 Pro's score on the CritPt benchmark from 17.7% to 31.4%, a new state-of-the-art result.
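For scale, the reported jump can be read as both an absolute and a relative gain. The short calculation below uses only the two scores quoted above; the variable names are illustrative and not taken from the project.

```python
# Scores quoted in the summary (percent on CritPt).
baseline = 17.7       # Gemini 3.1 Pro alone
with_harness = 31.4   # Gemini 3.1 Pro with physics-intern

absolute_gain = with_harness - baseline        # in percentage points
relative_gain = with_harness / baseline - 1.0  # as a fraction of the baseline

print(f"absolute gain: {absolute_gain:.1f} percentage points")
print(f"relative gain: {relative_gain:.0%}")
```

In other words, the harness adds 13.7 percentage points, roughly a 77% improvement over the base model's score.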

Meet physics-intern, our agentic framework for theoretical physics. It takes Gemini 3.1 Pro from 17.7% to 31.4% on CritPt, a new SOTA on one of the hardest benchmarks for LLMs. Theoretical physics is hard for humans and LLMs alike. But physics-intern decomposes problems and dispatches them to a team of specialized agents, solving research-level questions far more effectively than the base model alone.
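The post does not say how physics-intern is implemented, so the following is only a minimal sketch of the general decompose-and-dispatch pattern it describes. Everything here is an assumption made for illustration: the `SubTask` type, the `decompose` and `solve` functions, and the three specialist roles are hypothetical and are not physics-intern's actual API. The agents are stubs that echo their prompt, where a real harness would call a model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubTask:
    kind: str    # which specialist should handle it (hypothetical label)
    prompt: str  # what that specialist is asked to do


# An "agent" is modeled as a function from prompt to answer.
# In a real harness this would wrap an LLM call.
Agent = Callable[[str], str]


def decompose(problem: str) -> List[SubTask]:
    """Hypothetical planner: split a problem into typed sub-tasks.

    A real planner would itself be a model call; this stub returns a fixed plan.
    """
    return [
        SubTask("derivation", f"Set up the governing equations for: {problem}"),
        SubTask("computation", f"Evaluate the resulting expressions for: {problem}"),
        SubTask("verification", f"Check limits and units for: {problem}"),
    ]


def solve(problem: str, agents: Dict[str, Agent]) -> List[str]:
    """Dispatch each sub-task to the agent registered for its kind."""
    results = []
    for task in decompose(problem):
        agent = agents[task.kind]  # raises KeyError if no specialist is registered
        results.append(agent(task.prompt))
    return results


def make_stub(name: str) -> Agent:
    """Build a placeholder agent that tags the prompt with its role name."""
    return lambda prompt: f"[{name}] {prompt}"


agents = {name: make_stub(name) for name in ("derivation", "computation", "verification")}
answers = solve("ground-state energy of a toy model", agents)
for line in answers:
    print(line)
```

Keeping the planner and the specialists behind plain function signatures means each piece can be swapped for a model-backed version without changing the dispatch loop.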

Similar Articles

@lvwerra: We released physics-intern: a simple harness for science problems! It gets models like Gemini 3.1 Pro to go from 17.7 -…

X AI KOLs Following

@lvwerra announces the release of physics-intern, a simple harness that raises the score of reasoning models like Gemini 3.1 Pro on science problems from 17.7% to 31.4%, outperforming GPT 5.5 Pro.

Agentic harness for theoretical physics research

Accelerating Mathematical and Scientific Discovery with Gemini Deep Think

Gemini 3.5: frontier intelligence with action

AlphaEvolve: Gemini-powered coding agent scaling impact across fields
