Turning every "no thats not what i meant" in chat into actual LoRA training data

Reddit r/LocalLLaMA Tools

Summary

A desktop app that lets users correct model responses in chat and train LoRA adapters locally, closing the feedback loop without manual notebook work.

i kept running local models on my own hardware. they'd say something dumb, i'd sit there going "no thats not what i meant", i'd close the chat, and the model never learned. so i built the correction loop into a desktop app.

starting experiment: hand-wrote 110 correction examples for a tiny helper persona, ran PEFT/LoRA on Qwen3 0.6B. Loss went 4.25 → 0.73, the adapter held identity across ~30 jailbreak prompts and answered domain questions the base model fumbled. small, but the loop works.

scaled the same loop into everyday chat:

**how it works**

* every chat reply has a **Teach** button
* click it, write what the model should have said
* corrections accumulate as jsonl in your profile folder (visible on disk, grep-able)
* when you have enough, hit **Train**: PEFT/LoRA runs against your active base
* adapter goes into a registry, you switch it on in the model picker
* a Training Proof panel shows candidate vs baseline scores side by side, plus which prompts now pass that used to fail (and any that regressed, in red)

so the loop closes inside the chat: find a thing the model gets wrong, correct it once, train, next chat has your correction baked into the weights. no notebook, no `pip install --upgrade peft` at 2am.

**run it**

windows 10/11 64-bit, ~2.8 GB installer (bundles cuda runtime + portable python so the training sidecar just works): tideforge.ai/seels

bring any GGUF. voice mode (whisper STT + piper TTS) + the full teach/train loop are free forever. signed via azure trusted signing. smartscreen will still warn on the first dozen installs because it's a brand new app; there's an in-app walkthrough for the "more info → run anyway" flow.

still very much alpha. things break. if you find a way to crash it i want the log. happy to talk about why tauri over electron, why a separate python sidecar for training, the agent + tool loop, whatever you got.
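
for anyone wondering what the teach log and the Training Proof diff could look like on disk, here's a rough sketch. it is not the app's actual code or schema: the field names (`prompt`, `rejected`, `corrected`) and the function names are made up for illustration, and it assumes eval results are just a dict of prompt → pass/fail. stdlib only, no PEFT involved.

```python
import json
from pathlib import Path


def log_correction(path, prompt, rejected, corrected):
    """Append one Teach correction as a single JSON line (hypothetical schema)."""
    record = {"prompt": prompt, "rejected": rejected, "corrected": corrected}
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


def load_corrections(path):
    """Read every correction back as a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def compare_runs(baseline, candidate):
    """Diff per-prompt pass/fail results (dicts mapping prompt -> bool).

    'fixed' are prompts the baseline failed and the candidate passes;
    'regressed' are prompts the baseline passed and the candidate fails.
    A prompt missing from the candidate results counts as a fail.
    """
    fixed = sorted(
        p for p in baseline if not baseline[p] and candidate.get(p, False)
    )
    regressed = sorted(
        p for p in baseline if baseline[p] and not candidate.get(p, False)
    )
    return {"fixed": fixed, "regressed": regressed}
```

one record per line is what keeps the file grep-able and append-only, and the diff is the part that would feed a "now passes / regressed" view.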
