@lvwerra: We released physics-intern: a simple harness for science problems! It gets models like Gemini 3.1 Pro to go from 17.7 -…


Summary

Released physics-intern, a simple harness that significantly boosts the performance of reasoning models like Gemini 3.1 Pro on science problems, from 17.7 to 31.4, outperforming GPT 5.5 Pro.


Cached at: 05/21/26, 05:35 PM

We released physics-intern: a simple harness for science problems!

It gets models like Gemini 3.1 Pro to go from 17.7 -> 31.4, thus beating GPT 5.5 Pro.
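To put the jump in perspective, here is a minimal sketch that works out the absolute and relative gain from the two scores quoted above. The function name `score_gain` is made up for illustration and is not part of physics-intern; the only inputs taken from the post are 17.7 and 31.4.

```python
def score_gain(before: float, after: float) -> tuple[float, float, float]:
    """Return (absolute gain, relative gain as a fraction, after/before ratio)."""
    absolute = after - before
    relative = absolute / before
    ratio = after / before
    return absolute, relative, ratio


# Scores quoted in the post: vanilla Gemini 3.1 Pro vs. the same model inside the harness.
absolute, relative, ratio = score_gain(17.7, 31.4)
print(f"absolute gain: {absolute:.1f} points")  # absolute gain: 13.7 points
print(f"relative gain: {relative:.1%}")         # relative gain: 77.4%
print(f"ratio: {ratio:.2f}x")                   # ratio: 1.77x
```

So the harness adds 13.7 points, roughly a 77% relative improvement over the vanilla model.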

The physics-intern harness can wrap any model and, via a dedicated subagent, boost the performance of vanilla reasoning models.

While I think more and more of these harness capability gains will be absorbed into the models (just as prompting tricks disappeared over time), there is a lot to be gained right now by building good scaffolds for those models and integrating tools well.

Interestingly, the one exception we found is that GPT 5.5 Pro actually didn’t benefit from the physics-intern harness!

Read more about it here: https://huggingface.co/spaces/huggingface/physics-intern…

PS: I think the Harness[Model] notation is kind of nice.


physics-intern: an Autonomous Agent for Physics Research - a Hugging Face Space by huggingface

Source: https://huggingface.co/spaces/huggingface/physics-intern

Similar Articles

Agentic harness for theoretical physics research

Reddit r/LocalLLaMA

Hugging Face releases 'physics-intern', an agentic framework for theoretical physics research that doubles the performance of Gemini models on the CritPt benchmark and sets a new state-of-the-art compared to GPT-5.5 Pro.

Advancing science and math with GPT-5.2

OpenAI Blog

OpenAI releases GPT-5.2, featuring GPT-5.2 Pro and GPT-5.2 Thinking variants optimized for scientific and mathematical work. The models achieve state-of-the-art performance on benchmarks like GPQA Diamond (93.2%) and FrontierMath (40.3%), demonstrating improved reasoning capabilities designed to accelerate scientific research across physics, chemistry, biology, and mathematics.

Start building with Gemini 3

Google DeepMind Blog

Google has launched Gemini 3 Pro, a new AI model designed to outperform previous versions in coding, agentic workflows, and multimodal reasoning. The model is available via the Gemini API, Google AI Studio, and the new Google Antigravity development platform.