Show HN: HelixDB – A graph database built on object storage


Summary

HelixDB is a graph-vector database built in Rust for knowledge graphs and AI memory, offering a unified platform that supports graph, vector, KV, document, and relational data models, with tools for easy local and cloud deployment.

Hey HN, it’s been just over a year since we launched HelixDB (https://news.ycombinator.com/item?id=43975423), a project a friend and I started in college. It’s an OLTP graph database built on object storage, with native vector search and full-text search (FTS).

Why graph, vector and FTS? Graph databases provide a natural cognitive model for data, vectors allow for a semantic understanding of the entities and relationships in the graph, and FTS provides more specific filtering. Many AI-driven applications attempt to combine all of these functionalities by stitching together multiple disconnected systems, but even then there’s no native way to perform joins or queries that span all systems. You still need to handle this logic at the application level.

Helix started as a graph DB, but we moved to a hybrid graph/vector approach after attempting to build an AI memory system, which led us down the GraphRAG and HybridRAG rabbit hole, where we would need separate graph and vector databases.

We knew scalability would be a challenge at each stage of our product's development. However, our initial focus this past year was to prove out the product through local deployments, and it was only meant to be run on a single node. Scaling graph DBs remained a difficult and expensive problem we’d have to solve later. Common ways other graph DBs solve scaling are duplicating entire datasets across distributed machines (extremely expensive per node) or sharding the data.

Sharding databases is effective and affordable; however, graph data doesn’t have explicit partitions like relational data does. For example, sharding a relational DB involves splitting up tables.
When it comes to graph DBs, the edges can span across any of the partitions, and hopping across multiple machines when traversing nodes is inefficient and computationally expensive.

Replicating graph DBs for high availability and better throughput drastically increases the operational cost of the DB, and there is still a limit to how far you can scale vertically. The workloads we’re used for require storing a huge amount of data for agents, where only a subset of that data is ever needed at any one time. So rather than having the whole thing in memory, we can store it all in object storage and get the bits we need when they’re needed.

Agents benefit from better context, which comes from more and better data (more relationships etc). By using S3 as the persistence/data layer there is no limit to how big the graph can be or how many relationships you can have, and we can scale to serve throughput and requests by horizontally spinning up nodes and caching relevant subsets of the graph on each node. This way, you get extremely low latency for “hot” data and a p99 of ~100ms for writes and ~50ms for reads from cold storage (S3). Plus you get the benefit of dirt cheap storage.

Workloads that HelixDB is currently supporting:
- Huge amounts of data (TBs) that agents need to search and traverse
- Affordable graph storage for companies where the cost of graph data is a bottleneck
- Consolidating multiple databases, enabling AI agents to have autonomy over companies, helping them become more autonomous
- AI memory
- Company brains

We’re currently working on our own generalised AI memory layer which will use HelixDB under the hood and be completely open-source. Also, we’re finishing up pre-filtering for vector search, which will allow you to pre-filter based on relationships in the graph, metadata, and sub-graphs.
And lastly, GA cloud will be available in the coming weeks.

If you want to run Helix locally (either on-disk or in-memory), you can find more info on our GitHub (https://github.com/HelixDB/helix-db) or via our docs (https://docs.helix-db.com/database/local-development). If you’re interested in getting started with our distributed cloud, please email us [email protected].

Many thanks! Comments and feedback welcome!
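
To make the sharding problem from the post concrete, here is a small illustrative sketch (not HelixDB code) that partitions a toy graph by a naive modulo hash and counts how many edges end up crossing shards. The helpers `shard_of` and `cross_shard_edges` are hypothetical names introduced only for this example, and real systems use smarter partitioners than modulo hashing.

```rust
// Illustrative only: hash-partition a toy graph and count cross-shard edges.
// `shard_of` and `cross_shard_edges` are hypothetical helpers, not HelixDB APIs.

/// Assign a node to a shard with a naive modulo hash.
fn shard_of(node: u64, shards: u64) -> u64 {
    node % shards
}

/// Count edges whose two endpoints land on different shards.
fn cross_shard_edges(edges: &[(u64, u64)], shards: u64) -> usize {
    edges
        .iter()
        .filter(|&&(a, b)| shard_of(a, shards) != shard_of(b, shards))
        .count()
}

fn main() {
    // A ring of 8 nodes: 0-1, 1-2, ..., 7-0.
    let edges: Vec<(u64, u64)> = (0..8u64).map(|i| (i, (i + 1) % 8)).collect();
    for shards in [1u64, 2, 4] {
        let cut = cross_shard_edges(&edges, shards);
        println!("{} shard(s): {} of {} edges cross shards", shards, cut, edges.len());
    }
}
```

With one shard nothing crosses; with two or four shards every edge of this ring crosses a boundary, so any traversal would hop between machines on each step. That is the cost the post is describing when it says graph data has no explicit partitions.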

Cached at: 06/10/26, 05:45 PM

HelixDB/helix-db

Source: https://github.com/HelixDB/helix-db


HelixDB: a graph-vector database for knowledge graphs and AI memory. Built from scratch in Rust.

Launch YC: HelixDB - The Database for Intelligence


HelixDB is a database that makes it easy to build all the components needed for AI applications in a single platform.

You don’t need a separate application DB, relational DB, vector DB, graph DB, or application layers to manage the multiple storage locations. HelixDB gives your agents federated access to company data, for memory, company brains, and applications.

Helix primarily operates with a graph + vector data model, but it also supports KV, documents, and relational data.

Getting Started

1. Install the CLI

The Helix CLI runs and manages local instances and talks to Helix Cloud.

curl -sSL "https://install.helix-db.com" | bash

Already installed? Update to the latest version with helix update.

2. The quickest path — helix chef

helix chef is an interactive, one-shot bootstrapper. It installs the HelixDB query skills and docs MCP, scaffolds a project, starts a local instance, seeds some example data, and writes a HELIX_CHEF_PROMPT.md. If a coding agent is available (Claude Code, Codex, or OpenCode), it can hand off and build a working app — frontend and all — from a one-line description of what you want.

helix chef

That’s it — no flags. Answer “what do you want to build?” and follow the prompts.

3. Manual local setup

If you’d rather wire things up yourself:

  1. Initialize a project. This scaffolds helix.toml, a .helix/ workspace dir, and a ready-to-run examples/request.json.
 mkdir my-helix-app && cd my-helix-app
 helix init
  2. Start a local instance. Runs a background container on port 6969 and waits until it accepts queries.
 helix start dev

⚠️ The default storage mode is in-memory — stopping the instance wipes its data. Use helix start dev --disk to persist data across restarts, or --foreground to stream logs.

  3. Send a query.
 helix query dev --file examples/request.json
  4. Stop the instance when you’re done.
 helix stop dev

Writing queries with the SDKs

Queries are authored with the Rust or TypeScript DSL and sent straight to a running instance as dynamic requests against POST /v1/query — no build or deploy step. Both SDKs produce the same JSON AST. The examples below talk to a local instance on http://localhost:6969 (the default helix start dev port). See the Querying Guide for the full builder catalog and the dynamic-query wire format.

Rust

Install the crate (published as helix-db, imported as helix_db):

cargo init && cargo add helix-db tokio sonic-rs

Define your queries as #[register] functions, then run them directly through the client:

use helix_db::Client;
use helix_db::dsl::prelude::*;

#[register]
pub fn add_user(name: String) {
    write_batch()
        .var_as(
            "user",
            g().add_n("User", vec![("name", name)])
                .value_map(None::<Vec<String>>),
        )
        .returning(["user"])
}

#[register]
pub fn get_user(name: String) {
    read_batch()
        .var_as(
            "user",
            g().n_with_label("User")
                .where_(Predicate::eq("name", name))
                .value_map(None::<Vec<String>>),
        )
        .returning(["user"])
}

#[tokio::main]
async fn main() {
    let client = Client::new(None).unwrap(); // defaults to http://localhost:6969

    // add user
    let new_user = client
        .query::<sonic_rs::Value>()
        .dynamic(add_user("John Doe".to_string()))
        .send()
        .await
        .unwrap();
    println!("new user: {:#}", sonic_rs::to_string_pretty(&new_user).unwrap());

    // get user
    let user = client
        .query::<sonic_rs::Value>()
        .dynamic(get_user("John Doe".to_string()))
        .send()
        .await
        .unwrap();
    println!("user: {:#}", sonic_rs::to_string_pretty(&user).unwrap());
}

TypeScript

Install the package (Node.js 20+):

npm init -y && npm install @helix-db/helix-db

Define your queries as functions, then POST them to the running instance:

import {
  Predicate, PropertyInput, PropertyProjection,
  defineParams, g, param, readBatch, writeBatch,
} from "@helix-db/helix-db";

const addUserParams = defineParams({ name: param.string() });
function addUser(p = addUserParams) {
  return writeBatch()
    .varAs("user",
      g().addN("User", { name: PropertyInput.param("name") })
        .project([PropertyProjection.new("name")]),
    )
    .returning(["user"]);
}

const getUserParams = defineParams({ name: param.string() });
function getUser(p = getUserParams) {
  return readBatch()
    .varAs("user",
      g().nWithLabel("User")
        .where(Predicate.eqParam("name", "name"))
        .project([PropertyProjection.new("name")]),
    )
    .returning(["user"]);
}

const HELIX_URL = "http://localhost:6969/v1/query";

// add user
const newUser = await fetch(HELIX_URL, {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: addUser().toDynamicJson(addUserParams, { name: "John Doe" }),
}).then((r) => r.json());
console.log("new user:", newUser);

// get user
const user = await fetch(HELIX_URL, {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: getUser().toDynamicJson(getUserParams, { name: "John Doe" }),
}).then((r) => r.json());
console.log("user:", user);
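
Because dynamic queries are plain JSON sent to POST /v1/query, it is possible to talk to a local instance without either SDK. The following is a minimal sketch using only the Rust standard library. It assumes that examples/request.json (the file scaffolded by helix init) already contains a valid dynamic-query payload; the payload format itself is not shown here, and `build_post` and `send` are hypothetical helper names for this example only.

```rust
use std::fs;
use std::io::{Read, Write};
use std::net::TcpStream;

/// Build a minimal HTTP/1.1 POST request carrying a JSON body.
fn build_post(host: &str, path: &str, body: &str) -> String {
    format!(
        "POST {} HTTP/1.1\r\nHost: {}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        path,
        host,
        body.len(), // Content-Length is the body size in bytes
        body
    )
}

/// Send the request and return the raw response (status line, headers, body).
fn send(addr: &str, request: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(request.as_bytes())?;
    let mut response = String::new();
    // `Connection: close` lets us simply read until the server closes the socket.
    stream.read_to_string(&mut response)?;
    Ok(response)
}

fn main() {
    // Assumption: the file holds a valid dynamic-query payload.
    let body = match fs::read_to_string("examples/request.json") {
        Ok(b) => b,
        Err(e) => {
            eprintln!("could not read request file: {}", e);
            return;
        }
    };
    let request = build_post("localhost:6969", "/v1/query", &body);
    match send("localhost:6969", &request) {
        Ok(resp) => println!("{}", resp),
        Err(e) => eprintln!("request failed (is `helix start dev` running?): {}", e),
    }
}
```

This prints the unparsed HTTP response, which is enough to check that an instance is reachable. For real applications the SDK clients shown above are the better choice, since they build the payload and decode the response for you.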

HelixDB Cloud

HelixDB Cloud is an object-storage-backed deployment with integrated vector and full-text search, full ACID transactions, a single writer with auto-scaling reader nodes, and high availability (3+ gateways and DB nodes). Cloud clusters use a separate deploy path from local instances:

helix auth login                                  # authenticate
helix workspace switch <workspace>                # select workspace + project
helix project switch <project>
helix init cloud --cluster-id <cluster-id>        # or: helix add cloud --name production --cluster-id <id>
helix sync production                             # pull gateway URL + auth contract into helix.toml
helix query production --file examples/request.json
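
The idea of nodes that keep a "hot" subset of the graph in memory while object storage holds everything else can be pictured as a read-through cache. The sketch below is an assumption-laden illustration, not HelixDB's implementation: `ColdStore` is a hypothetical in-memory stand-in for S3, `ReaderCache` is a hypothetical bounded LRU cache, and the capacity is assumed to be at least 1.

```rust
use std::collections::{HashMap, VecDeque};

/// Stand-in for object storage (e.g. S3); every `get` counts as a slow, cold read.
struct ColdStore {
    objects: HashMap<String, Vec<u8>>,
    cold_reads: usize,
}

impl ColdStore {
    fn get(&mut self, key: &str) -> Option<Vec<u8>> {
        self.cold_reads += 1;
        self.objects.get(key).cloned()
    }
}

/// Bounded read-through cache that evicts the least recently used key.
struct ReaderCache {
    capacity: usize, // assumed >= 1
    hot: HashMap<String, Vec<u8>>,
    order: VecDeque<String>, // front = least recently used
}

impl ReaderCache {
    fn new(capacity: usize) -> Self {
        ReaderCache { capacity, hot: HashMap::new(), order: VecDeque::new() }
    }

    /// Mark `key` as most recently used.
    fn touch(&mut self, key: &str) {
        self.order.retain(|k| k.as_str() != key);
        self.order.push_back(key.to_string());
    }

    /// Serve from the hot set if possible; otherwise fetch from cold storage and cache it.
    fn get(&mut self, store: &mut ColdStore, key: &str) -> Option<Vec<u8>> {
        if let Some(value) = self.hot.get(key).cloned() {
            self.touch(key);
            return Some(value);
        }
        let value = store.get(key)?;
        if self.hot.len() >= self.capacity {
            if let Some(evicted) = self.order.pop_front() {
                self.hot.remove(&evicted);
            }
        }
        self.hot.insert(key.to_string(), value.clone());
        self.touch(key);
        Some(value)
    }
}

fn main() {
    let mut store = ColdStore { objects: HashMap::new(), cold_reads: 0 };
    for name in ["node:a", "node:b", "node:c"] {
        store.objects.insert(name.to_string(), name.as_bytes().to_vec());
    }
    let mut cache = ReaderCache::new(2);
    for key in ["node:a", "node:a", "node:b", "node:c", "node:a"] {
        cache.get(&mut store, key);
    }
    println!("cold reads: {}", store.cold_reads);
}
```

Repeated reads of the same key are served from memory, and only misses reach the slow store. A production system would also need invalidation when the single writer commits changes, which this sketch leaves out.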

Commercial Support

HelixDB Cloud

HelixDB is available as a distributed, high-availability, managed service. If you’re interested in using Helix’s managed service, go to our website to get started or contact us to talk with a founder.


Just Use Helix.
