Local semantic memory search for OpenClaw agents using Harrier embeddings

Reddit r/openclaw 06/06/26, 10:44 PM Tools

semantic-search openclaw harrier embeddings local-ai agent-memory ollama

Summary

This article presents a practical method to equip OpenClaw agents with local semantic memory search using Microsoft's Harrier embedding model, enabling efficient retrieval of relevant text chunks without external services.

I put together a small repo showing how to give an OpenClaw agent local semantic memory search without sending embeddings to an external service.

The basic idea: run a small local embedding server around Microsoft’s Harrier embedding model, expose an Ollama-compatible API, and point OpenClaw’s memorySearch config at it.

For anyone unfamiliar with Harrier: it’s a local embedding model from Microsoft (microsoft/harrier-oss-v1-0.6b) that produces high-quality text embeddings. In plain English, it turns chunks of text into vectors so your agent can search by meaning instead of only exact keywords.

Why this matters for agent memory: most agent memory systems hit one of two problems:

1. You shove too much memory into the prompt, which burns tokens and makes context messy.
2. You keep memory files small and manual, which becomes hard to maintain once the agent has real history.

Semantic memory search gives you a better middle path. Your long-term memory can stay in normal markdown files: MEMORY.md, daily logs, notes, project files, whatever structure is easiest for a human to read and edit. Then the agent retrieves only the relevant chunks at runtime.

That means:

• Less token waste, because you are not stuffing every durable fact into every prompt.
• Cleaner memory files, because they do not need to be obsessively compressed into one giant context-efficient blob.
• Better recall, because the agent can find conceptually related notes even when the wording does not match exactly.
• Easier debugging, because the source of truth stays plain text instead of disappearing into an opaque vector database.
• Better privacy, because embeddings are computed locally.

The repo includes:

• A small Python embedding server.
• Ollama-compatible /api/embed and /api/embeddings endpoints.
• Example OpenClaw memorySearch config.
• A macOS launchd service template.
• A mock markdown memory corpus.
• Smoke tests and a local query demo.
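For a rough idea of what the Ollama-compatible endpoints involve, here is a minimal sketch. It is not the repo's actual server. It assumes the request and response shapes Ollama uses for these two routes (`input` in and `embeddings` out for /api/embed, `prompt` in and `embedding` out for /api/embeddings). The names `hash_embed`, `handle_embed_request`, `EmbedHandler`, and `serve` are hypothetical, and `hash_embed` is only a stand-in so the example runs without downloading a model; in a real setup that function would wrap Harrier.

```python
import hashlib
import json
import math
from http.server import BaseHTTPRequestHandler, HTTPServer


def hash_embed(text, dim=64):
    """Stand-in embedder (NOT Harrier): hashes tokens into a unit-length vector."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.sha256(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def handle_embed_request(path, payload, embed_fn=hash_embed):
    """Map an Ollama-style request body to a response body (None if unknown path)."""
    if path == "/api/embed":
        # Newer route: "input" is a string or a list of strings.
        texts = payload.get("input", [])
        if isinstance(texts, str):
            texts = [texts]
        return {
            "model": payload.get("model", ""),
            "embeddings": [embed_fn(t) for t in texts],
        }
    if path == "/api/embeddings":
        # Legacy route: one "prompt" string in, one "embedding" vector out.
        return {"embedding": embed_fn(payload.get("prompt", ""))}
    return None


class EmbedHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            self.send_error(400, "request body must be a JSON object")
            return
        result = handle_embed_request(self.path, payload)
        if result is None:
            self.send_error(404, "unknown endpoint")
            return
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=11434):
    """Blocks forever; 11434 is Ollama's default port, chosen here for convenience."""
    HTTPServer((host, port), EmbedHandler).serve_forever()
```

Calling `serve()` starts the listener on localhost only, which keeps the memory text on the machine. Keeping the routing in a plain function like `handle_embed_request` also makes it easy to smoke-test without opening a socket.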
The useful part here is less “new retrieval algorithm” and more “practical wiring.” OpenClaw already knows how to talk to Ollama-style embedding endpoints, so this gives it a local SOTA-ish semantic memory layer without requiring you to run full Ollama or ship private memory to a hosted embedding API.

The pattern has been especially useful for keeping token usage under control while letting memory remain human-manageable. Instead of constantly hand-curating a tiny context block, you can keep richer notes on disk and let retrieval pull the few chunks that actually matter.

Blog post: [https://coltoncoan.com/blog/local-agent-memory-with-openclaw-ollama-and-sentence-transformers/](https://coltoncoan.com/blog/local-agent-memory-with-openclaw-ollama-and-sentence-transformers/)

Repo: [https://promptclickrun.github.io/harrier-openclaw-memory-search](https://promptclickrun.github.io/harrier-openclaw-memory-search)
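To illustrate the retrieval step described above (notes stay in markdown, only the relevant chunks get pulled at runtime), here is a minimal sketch of chunking, embedding, and ranking by cosine similarity. It is an illustration under stated assumptions, not how OpenClaw or the repo implements it. Splitting chunks on blank lines is an assumed rule, and `bag_of_words_embed` is a hypothetical stand-in that only matches exact words; a real embedding model such as Harrier is what makes matching by meaning possible. All function names and the sample memory text are made up for the example.

```python
import math
import re
from collections import Counter


def chunk_markdown(text):
    """Split markdown into chunks on blank lines (a deliberately simple rule)."""
    return [c.strip() for c in re.split(r"\n\s*\n", text) if c.strip()]


def bag_of_words_embed(text):
    """Stand-in for a real embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, chunks, embed_fn=bag_of_words_embed, top_k=2):
    """Return the top_k (score, chunk) pairs, highest similarity first."""
    q = embed_fn(query)
    scored = [(cosine(q, embed_fn(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical memory file contents, for illustration only.
memory = """# MEMORY.md

User prefers dark mode in every editor.

Deploys to production happen on Fridays via the release script.

The staging database password rotates monthly.
"""

chunks = chunk_markdown(memory)
results = search("when do production deploys happen", chunks, top_k=1)
for score, chunk in results:
    print(f"{score:.2f}  {chunk}")
```

Only the top-ranked chunk would go into the prompt, which is where the token savings come from. Swapping `embed_fn` for a call to a local embedding endpoint leaves the rest of the loop unchanged.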
