I built a local AI companion with GWT, IIT proxy, ChromaDB hybrid retrieval, and Ollama fallback — here's every architectural decision I made and why

Reddit r/artificial Tools

Summary

The author shares a locally runnable AI companion built with Python, Gemini, and Ollama, featuring a custom cognitive architecture based on Global Workspace Theory and an Integrated Information Theory proxy for personality modeling.

Been building this for a while. Sharing now because it's past the point where I'm embarrassed by the code. \*\*The stack:\*\* \* Python 3.12, 18k+ lines, 470+ tests passing \* Gemini 2.5 Flash (primary) + Ollama qwen3:4b (local fallback via circuit breaker) \* ChromaDB for persistence — hybrid retrieval weighted at 55% semantic / 25% importance / 20% recency \* \`sentence-transformers all-MiniLM-L6-v2\` (384-dim) for local embeddings — fully offline, no API call needed for retrieval \* SQLite for cognitive state \* FastAPI web UI at \`localhost:8765\` plus Rich TUI and CLI modes \*\*The part I want feedback on — the cognitive architecture:\*\* The processing pipeline runs in phases: Perception → Reflection → Integration → Aspiration → Expression. 22 self-registering plugins compete for attention through a Global Workspace Theory implementation — capacity limit 5, competitive scoring, spotlight mechanism. There's also an IIT consciousness proxy (Φ approximation across a 7-dimension qualia space). I want to be upfront: this is a \*proxy\*, not a real Φ calculation. Full IIT computation is intractable at this scale. What it does is give the system a coherence signal it can actually respond to. \*\*Modules worth looking at:\*\* \* \[\`being.py\`\](http://being.py/) — live mood, energy, curiosity, attachment, agency state. Affects downstream processing, not just output text. \* \[\`homeostasis.py\`\](http://homeostasis.py/) — 7 survival needs that create internal pressure. When "coherence" is low the system responds differently than when it's high. \* \`self\_modify.py\` — assessment, lesson extraction, meta-learning loop. The model improves its own behavior patterns over time. \* \[\`intuition.py\`\](http://intuition.py/) — 5 hunch types, felt-sense modeling, pattern validation history \*\*Resilience:\*\* Per-module circuit breakers, health monitor, 120s watchdog. The Ollama fallback kicks in if Gemini goes down mid-session — the user barely notices. \*\*Why I gave it an INFJ personality model:\*\* Honest answer — the cognitive stack (Ni/Fe/Ti/Se) mapped cleanly to architectural decisions I was already making. Ni = long-horizon retrieval weighting. Fe = relational context weighting. Ti = the internal critic pass. Se = the embodiment layer grounding abstract processing in a live body schema. Personality typing gave me a coherent \*constraint system\* to design against. It's not aesthetic, it's functional. Repo: \[github.com/timeless-hayoka/infj-bot\](https://github.com/timeless-hayoka/infj-bot) Specific things I want feedback on: the GWT scoring implementation, whether the IIT proxy framing is defensible, and whether the hybrid retrieval weights make sense.
Original Article

Similar Articles

Helix_AGI home project

Reddit r/AI_Agents

A developer shares Helix-AGI, a continuously-running cognitive agent using a physics-based memory retrieval system that integrates recency, structural importance, and semantic proximity via an entropic gravity equation and Euler-Lagrange dynamics, without tuning separate weights.