@dlouapre: Meet physics-intern, our agentic framework for theoretical physics. It takes Gemini 3.1 Pro from 17.7% to 31.4% on Crit…


Summary

Physics-intern is an agentic framework for theoretical physics that raises Gemini 3.1 Pro's score on the CritPt benchmark from 17.7% to 31.4%, a gain of 13.7 percentage points and a new state of the art.

Meet physics-intern, our agentic framework for theoretical physics. It takes Gemini 3.1 Pro from 17.7% to 31.4% on CritPt, a new SOTA on one of the hardest benchmarks for LLMs. Theoretical physics is hard for humans and LLMs alike. But physics-intern decomposes problems and dispatches them to a team of specialized agents, solving research-level questions far more effectively than the base model alone.
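
The decompose-and-dispatch idea described in the post can be illustrated with a toy sketch. This is not physics-intern's actual code or API: the names `SubTask`, `decompose`, `SPECIALISTS`, and `solve`, as well as the three specialist roles, are hypothetical, and the agents are plain string-returning stubs standing in for model calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubTask:
    kind: str    # which specialist should handle this piece (hypothetical labels)
    prompt: str  # the sub-question text


def decompose(problem: str) -> List[SubTask]:
    """Stand-in planner: a real system would ask a model to plan the split."""
    return [
        SubTask("derivation", f"Set up the governing equations for: {problem}"),
        SubTask("computation", f"Evaluate the resulting expressions for: {problem}"),
        SubTask("verification", f"Check limits and units for: {problem}"),
    ]


# Stub specialists (assumed roles); each would wrap a model call in practice.
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "derivation": lambda p: f"[derivation agent] {p}",
    "computation": lambda p: f"[computation agent] {p}",
    "verification": lambda p: f"[verification agent] {p}",
}


def solve(problem: str) -> List[str]:
    """Decompose the problem, then route each sub-task to its specialist."""
    results = []
    for task in decompose(problem):
        agent = SPECIALISTS[task.kind]  # look up the specialist by task kind
        results.append(agent(task.prompt))
    return results


if __name__ == "__main__":
    for line in solve("ground-state energy of a toy Hamiltonian"):
        print(line)
```

Keeping the planner separate from a lookup table of specialists means new roles can be added without touching the routing loop, which is one plausible reason to structure an agent team this way.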

Similar Articles

Agentic harness for theoretical physics research

Reddit r/LocalLLaMA

Hugging Face releases 'physics-intern', an agentic framework for theoretical physics research that doubles the performance of Gemini models on the CritPt benchmark and sets a new state of the art, ahead of GPT-5.5 Pro.

Gemini 3.5: frontier intelligence with action

Google DeepMind Blog

Google announces Gemini 3.5, a new family of AI models focused on agentic workflows and coding, starting with 3.5 Flash which delivers frontier performance at high speed.