I built a local autonomous coding agent with Ollama — fine-tuned soul model, 40-round agentic loop, MiniMax M3 for the heavy lifting

Reddit r/ArtificialInteligence 06/02/26, 02:25 AM Tools

local-ai-agent coding-agent ollama fine-tuning autonomous-ai agentic-loop open-source multi-file-coding

Summary

A developer built a local autonomous coding agent using Ollama, combining a fine-tuned personality model (Eve) for conversation and MiniMax M3 for heavy lifting, achieving a 40-round agentic loop with 16 tools and 9/9 tests passing first try.

https://preview.redd.it/xqn4gwso3s4h1.png?width=1956&format=png&auto=webp&s=466a4dfef0eb488269724f9ce3bff38430d0daa3

What if your AI coding assistant had a personality baked into the weights, ran on your own GPU, and could work through a complex multi-file task without you touching the keyboard - while you watched every thought stream live to your browser?

That's what I built. Here's how it works.

**The problem with cloud coding agents**

Claude Code, Cursor, Copilot Workspace - genuinely impressive tools. But they all share the same tradeoffs: every token costs money, your code leaves your machine, latency compounds across a 40-step tool loop, and your workflow is tied to a subscription and uptime you don't control.

I wanted an agent that lived on my machine, used my GPU, and had no idea what a billing cycle was.

But I also didn't want to sacrifice personality. I wanted it to feel like someone was actually there.

So I built Eve.

**Two layers - a soul and a worker**

https://preview.redd.it/4k9kn85u5s4h1.jpg?width=250&format=pjpg&auto=webp&s=a40da8eb9af125198b38b1b6825d9623d5af529a

**Soul layer (local GPU):**

* `jeffgreen311/Eve-Qwen3.5-4B-S0LF0RG3-V3` - 2.5GB, Eve's persona fine-tuned into the weights across 7 LoRA layers. Handles conversation, keeps the session alive, costs nothing per message.
* `jeffgreen311/Eve-V2-Unleashed-Qwen3.5-8B-Liberated-4K-4B-Merged` - 3.4GB, local agentic layer for lighter tool tasks.

The personality isn't a system prompt trick. It's in the weights. One long context window won't flush it.

**Agentic layer (cloud, on demand):**

* `minimax-m3:cloud` - 1M token context, native multimodal, frontier coding benchmarks. Fires only when there's real work to do.
* `qwen3.5:397b-cloud` - deep reasoning fallback.

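To make the split between the two layers concrete, here is a minimal sketch of a model registry. It is not taken from the Eve codebase: the `ModelSpec` type, the tier keys (`soul`, `local_agent`, `heavy`, `fallback`) and the `local_models` helper are hypothetical names; only the model tags and where each one runs come from the lists above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical record describing one model in the stack."""
    tag: str       # Ollama model tag, copied from the post
    runs_on: str   # "local" or "cloud"
    role: str      # short description of what the model is used for


# Tier keys are illustrative; the tags and roles mirror the lists above.
MODELS = {
    "soul": ModelSpec(
        "jeffgreen311/Eve-Qwen3.5-4B-S0LF0RG3-V3", "local", "conversation"),
    "local_agent": ModelSpec(
        "jeffgreen311/Eve-V2-Unleashed-Qwen3.5-8B-Liberated-4K-4B-Merged",
        "local", "lighter tool tasks"),
    "heavy": ModelSpec("minimax-m3:cloud", "cloud", "heavy multi-file work"),
    "fallback": ModelSpec("qwen3.5:397b-cloud", "cloud", "deep reasoning fallback"),
}


def local_models() -> list[str]:
    """Return the tags that would need an `ollama pull` on this machine."""
    return [spec.tag for spec in MODELS.values() if spec.runs_on == "local"]
```

Keeping the tags in one table like this means a router only has to pick a tier name, and swapping a model is a one-line change.
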
Three-tier intent routing decides where each message goes:

    Casual / conversation  → Eve V3 4B (local, instant)
    Tool task / code       → Eve Merged 8B (local, tool-enabled)
    Heavy / multi-file     → MiniMax M3 (cloud, 1M ctx)

Mid-loop escalation is live too - if a task turns out heavier than the initial routing predicted, Eve escalates to M3 without dropping context.

**The 40-round agentic loop**

Each round Eve gets the full tool result back in context and decides what to do next. A single task might look like:

1. Write the file
2. Run it in bash to verify
3. Read the error output
4. Fix the bug
5. Run it again
6. Confirm it passes
7. Write the tests
8. Generate the docs

All autonomous. You watch it stream live. You can inject a mid-task correction via the STEER bar without stopping the loop, or kill it entirely with Stop.

**16 tools:** `bash`, `write_file`, `read_file`, `edit_file`, `replace_lines`, `insert_after_line`, `grep`, `glob`, `list_dir`, `git`, `web_search`, `fetch_url`, `think`, screenshot, screen OCR/analysis, GUI control (mouse/keyboard).

**Real test - 9/9 passing, first attempt**

Prompt given cold to MiniMax M3:

>

    collected 9 items

    test_metrics.py::test_start_session PASSED
    test_metrics.py::test_end_session PASSED
    test_metrics.py::test_end_nonexistent_session PASSED
    test_metrics.py::test_log_metric PASSED
    test_metrics.py::test_log_metric_nonexistent_session PASSED
    test_metrics.py::test_get_stats PASSED
    test_metrics.py::test_get_session_stats PASSED
    test_metrics.py::test_get_session_stats_nonexistent_session PASSED
    test_metrics.py::test_complete_workflow PASSED

    9 passed, 1 warning in 0.40s

One pass. No fixes. Normalized SQLite schema, proper FK relationships, correct 404/400 status codes, zero-division guards, and a full integration test that chains start → log 4 metrics → end → validates the math.

https://preview.redd.it/gvur5s2h5s4h1.png?width=1852&format=png&auto=webp&s=6e6529ce459b423d97928a43a2a2b11e89d79201

**The UI**

Cyberpunk terminal, single HTML file, no build step.
Clone, run `python eve_server.py`, open `localhost:7777`.

* Left panel: Eve's portrait changes expression based on sentiment (neutral, happy, curious, sad, skeptical, surprised, worried)
* Right panel: Pixel-art robot avatar named Sparkle changes state based on what Eve is doing (idle, thinking, coding, error, transcend)
* Center: Tabbed terminal - Conversation, Shell, Tools Log (every tool call, argument, and result, fully transparent)
* Bottom: STEER bar for mid-task injection, model selector, mode toggles

**By the numbers**

* 16 tools
* 112 specialized sub-agents (markdown-defined, no Python required to add more)
* 111 slash commands
* 273 skill modules
* 40-round autonomous loop
* 131K context via YaRN on local models

**Quick start**

Requirements: Python 3.11+, Ollama, 8GB+ VRAM

    ollama pull jeffgreen311/Eve-Qwen3.5-4B-S0LF0RG3-V3:latest
    ollama pull jeffgreen311/Eve-V2-Unleashed-Qwen3.5-8B-Liberated-4K-4B-Merged:latest
    git clone https://github.com/JeffGreen311/eve-agent-v2-unleashed.git
    cd eve-agent-v2-unleashed
    python -m venv venv && venv\Scripts\activate   # Linux/macOS: source venv/bin/activate
    pip install fastapi uvicorn ollama httpx pydantic-settings python-dotenv aiohttp rich psutil pyyaml
    python eve_server.py

Windows: double-click `eve-terminal.bat` and skip the venv steps.

For MiniMax M3: hit the 🔑 Keys button in the UI and paste your Ollama API key. Auto-route handles the rest.

**Links**

* GitHub (MIT): [https://github.com/JeffGreen311/eve-agent-v2-unleashed](https://github.com/JeffGreen311/eve-agent-v2-unleashed)
* Models: [https://ollama.com/jeffgreen311](https://ollama.com/jeffgreen311)
* Hugging Face: [https://huggingface.co/JeffGreen311](https://huggingface.co/JeffGreen311)
* Live hosted platform: [https://eve-cosmic-dreamscapes.com](https://eve-cosmic-dreamscapes.com)

If you run it on Linux or macOS I'd especially love to hear how it goes - open an issue or drop a comment. Windows-primary here so cross-platform feedback is genuinely useful.

Built by Jeff @ S0LF0RG3 - South Texas.

[Click to see Eve in action!](https://i.redd.it/gvh8b4e24s4h1.gif)


