Built a lightweight Python framework for local LLM roleplay (Ollama/Phi-3) to stop context drift. Looking for feedback.

Reddit r/AI_Agents 06/02/26, 07:54 AM Tools

python framework llm roleplay ollama phi-3 open-source

Summary

A lightweight Python framework for local LLM roleplay using Ollama and Phi-3, featuring context preservation and native streaming to prevent character drift.

Hey everyone, I wanted to share an open-source project I’ve been building. I wanted a clean, lightweight way to handle local AI roleplay using Ollama and Phi-3, but I kept running into issues with character drift, streaming lag, and context management. Instead of dealing with massive, bloated AI agent frameworks, I built a dedicated, lightweight Python framework specifically for this use case. It handles context-locking and LLM streaming natively so the character stays locked in without eating up unnecessary token RAM. **Features built-in:** * **Ollama Integration:** Tailored directly for local models like Phi-3 (Other model are also supported but I use phi-3). * **Native LLM Streaming:** No long wait times for generations. * **Context Preservation:** Keeps the persona intact even over longer multi-turn chats. * **MIT License:** Purely open-source. Because my Reddit account is brand new, the automod will delete my post if I include a direct GitHub hyperlink. If you want to check out the code, look at the files, or help test it, **the direct link is pinned on my Reddit profile bio**, or you can search GitHub for: **tegetgoofficial-bot/ai-roleplay-framework** I’m looking to see if the structure makes sense to other developers here. What do you think of handling local roleplay state this way?

Original Article

Similar Articles

I built an arena where LLMs sword-fight with real physics. You decide which part of the blade is sharp, vote blind, and free OpenRouter models battle for Elo. Llama 3.3 is currently stabbing GPT-OSS in the face.

Reddit r/AI_Agents

A new arena lets LLMs control physics ragdolls in weapon duels where users define weapon damage zones, vote blind, and models battle for Elo. Free models like Llama 3.3 and GPT-OSS compete, with self-hostable infrastructure.

Built a Tauri v2 desktop chat shell for local LLMs — point it at Ollama / llama.cpp / any OpenAI-compatible endpoint, MIT, ~12 MB binary

Reddit r/LocalLLaMA

Built a Tauri v2 desktop chat shell for local LLMs that can connect to Ollama, llama.cpp, or any OpenAI-compatible endpoint. The project is MIT licensed and produces a ~12 MB binary.

I built a local GUI for the TradingAgents framework — works with Ollama

Reddit r/LocalLLaMA

A developer built a local web GUI for the TradingAgents multi-agent LLM stock analysis framework, supporting various LLM providers and adding features like live pipeline visualization, a report reader, and multi-session chat.

A tool I built to generate 3D objects with functional, articulated parts. It's on github, and is mostly LLM-agnostic.

Reddit r/LocalLLaMA

A developer built a pipeline that uses an LLM as a structured code compiler to generate Blender Python code, producing 3D objects with functional, articulated parts instead of monolithic meshes. The tool is open-source and LLM-agnostic.

@badlogicgames: pibot is now running fully local, using parakeet for STT, qwen3-tts for TTS, and Qwen 3.6 as the local multi-modal LLM …

X AI KOLs Following

pibot is now fully local, using Parakeet for STT, Qwen3-tts for TTS, and Qwen 3.6 as the local multimodal LLM via llama.cpp, with Rust/mlx-c based inference engines, achieving zero Python dependencies.

Similar Articles

I built an arena where LLMs sword-fight with real physics. You decide which part of the blade is sharp, vote blind, and free OpenRouter models battle for Elo. Llama 3.3 is currently stabbing GPT-OSS in the face.

Built a Tauri v2 desktop chat shell for local LLMs — point it at Ollama / llama.cpp / any OpenAI-compatible endpoint, MIT, ~12 MB binary

I built a local GUI for the TradingAgents framework — works with Ollama

A tool I built to generate 3D objects with functional, articulated parts. It's on github, and is mostly LLM-agnostic.

@badlogicgames: pibot is now running fully local, using parakeet for STT, qwen3-tts for TTS, and Qwen 3.6 as the local multi-modal LLM …

Submit Feedback