@ProfBuehlerMIT: For science, AI sovereignty and physics-grounded reasoning are non-negotiable. But how can we teach a small LLM like Ge…

Summary

mistral.rs now natively supports Agent Skills, enabling locally-run small LLMs to perform complex agentic workflows for scientific tasks, with full control over models, data, and execution.

Cached at: 06/18/26, 06:09 AM

For science, AI sovereignty and physics-grounded reasoning are non-negotiable. But how can we teach a small LLM like Gemma-4-E4B physics? One way is to use Agent Skills, but so far this has been limited to closed frontier models. mistral.rs now implements Agent Skills natively. It is the first self-hosted inference engine to do so as part of the local inference substrate, so we can use small models to solve complex scientific and other tasks in a flexible and scalable way.

We are in a period of uncertainty about frontier models: access, pricing, deprecation, abrupt restriction. The good news is that when the entire stack runs locally, you can build AI that is entirely your own. You own the weights, the skills, the execution loop, and the data; all of it runs on your hardware and is reproducible and durable.

Virtually all local inference engines expose a model behind an OpenAI-compatible endpoint and leave everything agentic to an external orchestrator that injects context, manages tools, mounts files, and brokers execution. mistral.rs is natively agentic: it moves that machinery into the server itself, allowing us to build complex agentic workflows and run them locally on open-source models.

With this new feature you can upload Agent Skills bundles to /v1/skills, reference them by identity from Responses API requests, and run them inside a native agentic loop with persistent Python sessions, figure capture, sandboxed shell execution, and file inputs mounted directly into the working session. It is plug-and-play and completely compatible with your existing code and workflow.
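
As a rough sketch of what the upload and the by-identity reference might look like from a client, the Python below packs a minimal skill bundle and prepares the two HTTP requests without sending them. Only the /v1/skills path and the use of the Responses API come from the post. The server address, the /v1/responses path, the zip upload format, the `skills` request field, the model string, and the `beam-deflection` skill are assumptions for illustration, so check the mistral.rs documentation for the real schema. The SKILL.md layout follows the public Agent Skills convention of a Markdown file with `name` and `description` front matter.

```python
import io
import json
import urllib.request
import zipfile

BASE_URL = "http://localhost:1234"  # assumption: address of a local server

# Hypothetical skill: a SKILL.md with front matter plus short instructions.
SKILL_MD = """---
name: beam-deflection
description: Compute the deflection of simple beams from first principles.
---
# Beam deflection

For a cantilever with end load P, use delta = P * L**3 / (3 * E * I).
"""


def build_skill_bundle(name: str, skill_md: str) -> bytes:
    """Pack a one-file skill directory into an in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(f"{name}/SKILL.md", skill_md)
    return buf.getvalue()


def upload_request(bundle: bytes) -> urllib.request.Request:
    """Prepare (but do not send) a POST of the bundle to /v1/skills."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/skills",
        data=bundle,
        headers={"Content-Type": "application/zip"},  # assumed upload format
        method="POST",
    )


def responses_request(model: str, skill_id: str, prompt: str) -> urllib.request.Request:
    """Prepare a Responses API call that names a skill by its identity."""
    payload = {
        "model": model,
        "input": prompt,
        "skills": [{"id": skill_id}],  # hypothetical field name and shape
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/responses",  # assumed path for the Responses API
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


bundle = build_skill_bundle("beam-deflection", SKILL_MD)
upload = upload_request(bundle)
ask = responses_request(
    "gemma-4-e4b",  # assumed model identifier
    "beam-deflection",
    "How far does a 2 m steel cantilever deflect under a 500 N end load?",
)
# With a server running, urllib.request.urlopen(upload) would send the bundle.
```

Building the requests separately from sending them keeps the sketch runnable without a server and makes the assumed payload easy to inspect and correct.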

A model with a native skill substrate can act, observe consequences, and modify what it is able to do. The skill is a retained procedural capability of the system.
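
The act, observe, modify cycle can be illustrated with a toy loop in Python. This is a conceptual sketch only, not how mistral.rs implements it: `SkillRegistry`, `scripted_model`, and the action format are all hypothetical, and a scripted function stands in for the LLM.

```python
from typing import Callable, Dict, List


class SkillRegistry:
    """Hypothetical store of named procedures the agent can invoke."""

    def __init__(self) -> None:
        self._skills: Dict[str, Callable[[float], float]] = {}

    def register(self, name: str, fn: Callable[[float], float]) -> None:
        self._skills[name] = fn

    def names(self) -> List[str]:
        return sorted(self._skills)

    def run(self, name: str, arg: float) -> float:
        return self._skills[name](arg)


def scripted_model(observations: List[str], available: List[str]) -> dict:
    """Stand-in for the LLM: picks the next action from what it has seen."""
    if "kinetic_energy" not in available:
        # Modify: add a capability that later steps can reuse.
        return {"type": "learn", "name": "kinetic_energy",
                "fn": lambda v: 0.5 * 2.0 * v ** 2}  # mass fixed at 2 kg
    if not observations:
        # Act: call the retained skill with a speed of 3 m/s.
        return {"type": "call", "name": "kinetic_energy", "arg": 3.0}
    return {"type": "done", "answer": observations[-1]}


def agent_loop(registry: SkillRegistry, max_steps: int = 5) -> str:
    observations: List[str] = []
    for _ in range(max_steps):
        action = scripted_model(observations, registry.names())
        if action["type"] == "learn":
            registry.register(action["name"], action["fn"])
        elif action["type"] == "call":
            result = registry.run(action["name"], action["arg"])
            # Observe: feed the consequence back into the next decision.
            observations.append(f"{action['name']} = {result} J")
        else:
            return action["answer"]
    return "step limit reached"


registry = SkillRegistry()
answer = agent_loop(registry)
print(answer)
```

Because the registry outlives a single run, a second call to `agent_loop(registry)` skips the learning step, which is the sense in which the skill is retained.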

Attached is a short video of all of it: skills, code execution, and the full agentic loop carried by Gemma-4-E4B, running entirely on my MacBook Pro. You can install and run a server with this capability in two lines in your terminal, with any quantization you need.

Nice work by the @googlegemma team, @OfficialLoganK, @demishassabis, and @ericlbuehler with mistral.rs!
