Pi + Docker Sandbox + llama-server setup guide
Summary
A guide to running the Pi AI agent securely inside a Docker Sandbox, with llama-server running on the host machine for local GPU inference.