Turning local agents into self-optimizing agents

Reddit r/LocalLLaMA Tools

Summary

A self-optimizing agentic pipeline that improves benchmark performance from ~30% to ~90% on TerminalBench, and can be extended to everyday chats by logging interactions, reflecting with a local model, and injecting lessons into future system prompts.

I was experimenting with a self-optimizing agentic pipeline to climb the benchmark leaderboard (TerminalBench). On a 10-task subset, I got the performance to rise from \~30% → \~90%. That loop worked, so I asked: can the same reflect-and-rewrite step run continuously against everyday chats instead of a benchmark? **How it works** * Every chat with your local LLM goes through a small proxy and is logged. * `autoswarm reflect` has the same local model review those logs, distill concrete lessons, and write them to `skills.yaml`. * Lessons auto-inject into the system prompt of future chats. **Run it (LM Studio path)** 1. Start LM Studio's local server and load a model. 2. ```bash pip install -e . autoswarm doctor # verifies LM Studio is reachable autoswarm start # auto-detects upstream + model, listens on :8080 I'm genuinely fascinated by the idea of self-optimizing agents, and I believe there's **something bigger to uncover there**. That said, this is just a hobby project and I'm still experimenting with it. Would love your feedback! Link: [https://github.com/arteemg/autoswarm](https://github.com/arteemg/autoswarm) I'm actively working on the project, so please [**⭐ the repo**](https://github.com/arteemg/autoswarm/) to stay updated.
Original Article

Similar Articles

Terminal-World: Scaling Terminal-Agent Environments via Agent Skills

arXiv cs.CL

Terminal-World introduces a fully automated pipeline that uses agent skills to synthesize high-quality training data for terminal agents, enabling models to outperform baselines with only 1.2% of the training data. The method co-derives task instructions, environments, and teacher trajectories from skill primitives.