Pi + Docker Sandbox + llama-server setup guide

Reddit r/AI_Agents 06/10/26, 01:58 PM Tools

pi docker sandbox llama-server setup-guide local-gpu isolation

Summary

A guide for running the Pi AI agent securely inside a Docker Sandbox, with llama-server running on the host machine for local GPU inference.

Hi everyone, I put together a guide for running Pi securely inside a Docker Sandbox while running llama-server directly on my host machine. The goal was to keep the agent isolated while still running the model on the local GPU at full speed. Hope someone finds this useful! -> Link is in the comments below.
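For anyone wanting to sanity-check this kind of layout, here is a minimal sketch of a readiness probe that could be run from inside the sandbox to confirm the host's llama-server is reachable. It is not taken from the linked guide. It assumes the host is reachable from the container as host.docker.internal (a Docker Desktop convention that may not apply to every sandbox setup), that llama-server listens on port 8080, and that its /health endpoint returns 200 once the model is loaded. The helper names are hypothetical.

```python
import urllib.error
import urllib.request

# Assumptions, not from the original post: the sandbox can resolve the host as
# host.docker.internal, and llama-server listens on port 8080.
DEFAULT_HOST = "host.docker.internal"
DEFAULT_PORT = 8080


def base_url(host=DEFAULT_HOST, port=DEFAULT_PORT):
    """Build the base URL the agent would use to reach llama-server."""
    return f"http://{host}:{port}"


def fetch(url, timeout=5.0):
    """GET a URL and return (status_code, body); status is None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (for example 503).
        return exc.code, exc.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, connection refused, timeout, and similar.
        return None, str(exc)


def server_is_ready(fetcher=fetch, url=None):
    """Return True only if the /health endpoint answers with HTTP 200."""
    status, _body = fetcher((url or base_url()) + "/health")
    return status == 200
```

Calling server_is_ready() from inside the container before starting the agent gives a quick yes/no answer. If it returns False, the usual suspects are the hostname not resolving inside the sandbox or llama-server being bound only to 127.0.0.1 on the host.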


Similar Articles

@leopardracer: https://x.com/leopardracer/status/2055341758523883631

X AI KOLs Timeline

A user shares their experience setting up a dual-GPU local AI lab with RTX 4080 Super and 5060 Ti, running Qwen 3.6 models via llama.cpp and llama-swap to reduce API costs and enable unrestricted experimentation.

Running Qwen3.6-35B-A3B Locally for Coding Agent: My Setup & Working Config

Reddit r/LocalLLaMA

A detailed guide for running the 35B-parameter Qwen3.6 model locally on Apple Silicon with llama.cpp to power the pi coding agent, including optimized configuration flags and sampling parameters.

I built a computer use sandbox framework for codex on headless linux. GPU passthrough, computer use, and sudo access for codex all work. It's the perfect dev sandbox to allow full auto work while minimizing the "rm -rf /" risk

Reddit r/LocalLLaMA

The author built an AI sandbox manager using LXC containers that allows Codex agents to have full sudo access and GPU passthrough on headless Linux while keeping the host system safe from catastrophic errors.

Automated AI researcher running locally with llama.cpp

Reddit r/LocalLLaMA

ml-intern is a harness for AI agents that integrates with Hugging Face's libraries and now supports running local models via llama.cpp or ollama, enabling an automated AI researcher to run 24/7 on a laptop.

We have sub-agents at home

Reddit r/LocalLLaMA

A developer shares a forked sub-agent repository for pi coding agent that works with a single local LLM slot and limited VRAM, using llama.cpp server and quantized models. The post also discusses performance with the Apex Qwen variant using MTP.
