# HelixDB > HelixDB Enterprise is an object-storage-backed graph database with integrated approximate vector search and BM25 full-text search. Queries are authored with Helix SDK DSLs or dynamic JSON and invoked over HTTP. This file is the full markdown corpus for AI agents — Enterprise architecture, concepts, guarantees, and the complete CLI v2 reference. # Introduction Helix Cloud is an object-storage-backed graph database with integrated vector search and full-text search. It combines a property graph engine with approximate vector search and BM25 full-text search on top of durable object storage, using SSD and in-memory caches for low-latency reads. Helix Cloud is a fundamentally different architecture and database compared to the opensource v1 version of HelixDB. That version used LMDB which was limited to sequential writes and could only handle a relatively small amount of data. Helix Cloud uses a new LSM based storage engine backed by object storage that can handle concurrent writes to the writer node and allows for virtually unlimited data storage. ## HelixDB Cloud System at a Glance A gateway routes all traffic. A single writer serializes mutations for consistency. Readers auto-scale horizontally to handle query load. Object storage is the durable system of record. Caches reduce steady-state latency and accelerate cold starts. ## Key Properties - **Everything on object storage.** Nodes, edges, properties, and vector/text index artifacts persist durably in object storage — no local disk required for correctness. - **Tiered caching.** Separate in-memory and SSD cache paths for graph, vector, and text data keep hot-path reads fast. - **Full ACID transactions.** Every query runs in a serializable snapshot isolation transaction. Concurrent reads and writes do not block each other. - **Dynamic query model.** Queries are authored in a Rust or TypeScript DSL and sent to the runtime as dynamic HTTP requests that carry the query inline. No separate deployment step. ## Next Steps # Local Development For local development, Helix ships a combined `enterprise-dev` image that runs a gateway and database together in a single container. The image supports two storage modes: - **In-memory** — fastest to spin up, but data is lost on container restart. - **On-disk** — backed by an S3-compatible object store such as [MinIO](https://min.io/), so data survives restarts and mirrors the production storage path. In both modes: - If `PATH_TO_QUERIES` is unset, send queries as dynamic `POST /v1/query` requests. - If `PATH_TO_QUERIES` is set, the container preloads your query bundle from the mounted `queries.json` file during startup. ## Quick Start with the CLI The fastest way to get started is the Helix CLI, which wraps the `enterprise-dev` image and handles container lifecycle for you: ```bash # Install the CLI curl -sSL "https://install.helix-db.com" | bash # Scaffold a new project (creates helix.toml and examples/request.json) mkdir my-helix-app && cd my-helix-app helix init local --name dev # Start the local in-memory runtime on port 6969 helix start dev # Or start with persistent local storage backed by MinIO helix start dev --disk # Send a dynamic query helix query dev --file examples/request.json # Stop the instance when you're done helix stop dev ``` For installation details, Helix Cloud authentication, and the full command reference, see the [CLI Getting Started guide](/cli/getting-started). ### AI-assisted query authoring If you develop with a coding agent, install the Helix skills bundle for writing, optimizing, and translating queries. Run it from your project root so the skills can inspect your local schema: ```bash npx skills add HelixDB/skills ``` The agent then loads the relevant skill automatically when you ask it to write a query, tune a slow traversal, or translate a Cypher/Gremlin query. ### Configuring disk mode To make disk mode the default for a local instance, initialize or add it with `--disk`: ```bash helix init local --disk helix add local --name persistent --disk ``` The CLI-managed disk mode starts a MinIO sidecar, creates the `helix-db` bucket, and keeps the MinIO volume across `helix stop`. Use `helix prune ` to delete the persisted local data. If you'd rather manage the containers yourself with Docker Compose, the in-memory and on-disk setups below show the raw image configuration that the CLI disk mode mirrors. ## Installing the SDKs To author queries against your local instance, install the SDK for your language. TypeScript, Rust, and Python can generate bundles; Go v1 posts dynamic requests directly. For the full project layout, generator setup, and authoring walkthrough, see [TypeScript Project Setup](/database/typescript-project-setup), [Rust Project Setup](/database/rust-project-setup), [Go Project Setup](/database/go-project-setup), and [Python Project Setup](/database/python-project-setup). ## In-Memory You can add this service to a local `docker-compose.yml`: ```yaml expandable services: helix: image: ghcr.io/helixdb/enterprise-dev restart: unless-stopped ports: - "6969:8080" environment: PATH_TO_QUERIES: /workspace/queries.json volumes: - ./queries.json:/workspace/queries.json:ro ``` To run without a preloaded bundle, omit `PATH_TO_QUERIES` and the mounted `queries.json` volume; queries still execute as dynamic `POST /v1/query` requests. ## On-Disk For persistent local development, point the `enterprise-dev` image at an S3-compatible object store. The example below brings up MinIO, initializes the `helix-db` bucket, and configures the `enterprise-dev` container to read and write to it. Data persists across container restarts as long as the MinIO volume is preserved. ```yaml expandable services: minio: image: minio/minio:latest command: server /data --console-address ":9001" restart: unless-stopped environment: MINIO_ROOT_USER: minioadmin MINIO_ROOT_PASSWORD: minioadmin volumes: - minio-data:/data minio-init: image: minio/mc:latest restart: "no" depends_on: - minio entrypoint: - /bin/sh - -c - | until mc alias set local http://minio:9000 minioadmin minioadmin; do sleep 1; done mc mb --ignore-existing local/helix-db helix: image: ghcr.io/helixdb/enterprise-dev:v0.1.0 restart: unless-stopped depends_on: minio-init: condition: service_completed_successfully ports: - "6969:8080" environment: PATH_TO_QUERIES: /workspace/queries.json S3_BUCKET: helix-db S3_REGION: us-east-1 DB_PATH: db/ AWS_ACCESS_KEY_ID: minioadmin AWS_SECRET_ACCESS_KEY: minioadmin AWS_ENDPOINT: http://minio:9000 AWS_ALLOW_HTTP: "true" volumes: - ./queries.json:/workspace/queries.json:ro volumes: minio-data: ``` ### Environment variables | Variable | Description | | ----------------------- | ---------------------------------------------------------------------------- | | `S3_BUCKET` | Bucket the database reads and writes to. | | `S3_REGION` | Region passed to the S3 client. Any value works against MinIO. | | `DB_PATH` | Prefix inside the bucket where the database keeps its files. | | `AWS_ACCESS_KEY_ID` | Access key for the object store. | | `AWS_SECRET_ACCESS_KEY` | Secret key for the object store. | | `AWS_ENDPOINT` | Custom endpoint URL — required for MinIO and other non-AWS S3 providers. | | `AWS_ALLOW_HTTP` | Set to `true` when the endpoint is plain HTTP (e.g. local MinIO). | The same variables work against any S3-compatible store — MinIO, LocalStack, Ceph RGW, or AWS S3 itself. Drop `AWS_ENDPOINT` and `AWS_ALLOW_HTTP` to talk to real AWS S3. ## Next Steps # Rust Project Setup Queries for HelixDB are authored in Rust using the [`helix-db`](https://crates.io/crates/helix-db) crate (imported as `helix_db`). This page covers bootstrapping a Rust project that builds your queries into the `queries.json` bundle and lets you send them as dynamic requests. For the traversal model and query patterns themselves, see [Querying](/database/querying) and the [docs.rs reference](https://docs.rs/helix-db/latest/helix_db/). ## Prerequisites - A recent stable Rust toolchain — install via [rustup](https://rustup.rs/) if you don't already have it. - Optional: the [Helix CLI](/cli/getting-started) for running the resulting bundle against a local instance. ## Create a new Rust project `generate_to_path` is invoked from `main`, so a binary crate is the simplest layout: ```bash cargo new helix-queries cd helix-queries ``` You can also add this as a binary inside an existing Cargo workspace if you'd rather keep your queries alongside the rest of your service code. ## Add the dependency ```bash cargo add helix-db ``` Or add it directly to `Cargo.toml`: ```toml [dependencies] helix-db = "2.0.0" ``` The crate is published on crates.io as `helix-db` but its module path is `helix_db`, so all imports look like `use helix_db::...`. ## Set up `src/main.rs` ```rust fn main() { let path = helix_db::generate_to_path("queries.json").expect("generate queries bundle"); println!("generated {}", path.display()); } ``` `generate_to_path` collects every query you've defined in the crate and writes a single `queries.json` bundle to the path you pass. It returns the resolved path so you can log it or hand it to a downstream deploy step. ## Importing the DSL Inside the modules where you author queries, bring the DSL into scope with the prelude: ```rust use helix_db::dsl::prelude::*; ``` The prelude re-exports the common building blocks — `read_batch`, `write_batch`, `g()`, `sub()`, `NodeRef`/`EdgeRef`, `Predicate`/`SourcePredicate`, the projection helpers, and more. See the [docs.rs reference](https://docs.rs/helix-db/latest/helix_db/) for the full surface. ## Setting up your queries Queries are defined as top-level `pub fn` items annotated with `#[register]`. Each function returns either a `WriteBatch` (for mutations) or a `ReadBatch` (for read-only traversals), and `generate_to_path` picks them up automatically when the bundle is built. The function arguments become the query's named parameters at the HTTP edge. For a small project you can keep everything in `src/main.rs` alongside `fn main()`; once you have more than a handful of queries, move them into their own modules. ```rust expandable use helix_db::dsl::prelude::*; // Write query: insert a new User node. #[register] pub fn add_user(userId: String, name: String) -> WriteBatch { // The macro reads parameter *names* at compile time, so the values themselves // go unused in the body; this line silences the unused-variable warning. let _ = (&userId, &name); write_batch() .var_as( "newUser", g().add_n( "User", vec![ ("userId", PropertyInput::param("userId")), ("name", PropertyInput::param("name")), ], ) .project(vec![PropertyProjection::renamed("$id", "id")]), ) .returning(["newUser"]) } // Read query: fetch a single User by id. #[register] pub fn user_by_id(userId: String) -> ReadBatch { let _ = &userId; read_batch() .var_as( "user", g().n_with_label("User") .where_(Predicate::eq_param("userId", "userId")) .project(vec![ PropertyProjection::renamed("$id", "id"), PropertyProjection::new("name"), ]), ) .returning(["user"]) } fn main() { let path = helix_db::generate_to_path("queries.json").expect("generate queries bundle"); println!("generated {}", path.display()); } ``` A few things to know about this snippet: - `#[register]` adds the function to the global bundle, so `generate_to_path` picks it up — no manual list to maintain. - Calling the function with concrete arguments returns a `DynamicQueryRequest` you POST to `/v1/query`; the request's `query_name` is set to the function name (so logs show `add_user`, not `__dynamic__`). See [Querying](/database/querying) for the send path. For the full builder catalog — edges, predicates, vector/text search, sub-traversals, aggregations — see the [docs.rs reference](https://docs.rs/helix-db/latest/helix_db/). ## Generate the bundle ```bash cargo run ``` This produces `queries.json` in the project root. From here you can: - Mount it into the local `enterprise-dev` runtime as `PATH_TO_QUERIES` — see [Local Development](/database/local-development). - Deploy it to HelixDB through the control plane — see [Working with HelixDB](/database/working-with-enterprise). ## Next Steps # TypeScript Project Setup Queries for HelixDB can also be authored in TypeScript using the [`@helix-db/helix-db`](https://www.npmjs.com/package/@helix-db/helix-db) package. The TypeScript DSL builds the same JSON AST as the Rust [`helix-db`](/database/rust-project-setup) crate, so the resulting `queries.json` bundle is interchangeable. Use TypeScript when your service is already Node.js or you want end-to-end type inference on parameters. For the conceptual model and per-builder walkthrough, see the [Querying Guide](/database/querying-guide/overview). ## Prerequisites - Node.js 20 or later. - A package manager (`npm`, `pnpm`, or `yarn`). - Optional: the [Helix CLI](/cli/getting-started) for deploying the resulting bundle. ## Create a new project ```bash mkdir helix-queries cd helix-queries npm init -y npm pkg set type=module ``` `@helix-db/helix-db` is published as an ES module, so the `"type": "module"` field in `package.json` matters. Skip this step if you are adding queries to an existing ESM-configured project. ## Add the dependency ```bash npm install @helix-db/helix-db npm install --save-dev typescript @types/node tsx ``` `tsx` is optional but convenient for running the generator without a separate build step. ## Minimal `tsconfig.json` ```json { "compilerOptions": { "target": "ES2022", "module": "NodeNext", "moduleResolution": "NodeNext", "strict": true, "outDir": "dist", "rootDir": "src" }, "include": ["src/**/*.ts"] } ``` The DSL relies on strict mode for the typestate-tracked traversal builder; turn it on. ## Author your queries Queries are plain functions that return a `ReadBatch` or `WriteBatch`. To bundle them, declare their parameters with `defineParams`, wrap them with `registerRead` / `registerWrite`, and collect them with `defineQueries`. The bundle's `generate(path)` method writes the same `queries.json` shape that the Rust [`generate_to_path`](/database/rust-project-setup#generate-the-bundle) emits. ```ts src/queries.ts NodeRef, Order, Predicate, PropertyInput, PropertyProjection, defineParams, defineQueries, g, param, readBatch, registerRead, registerWrite, writeBatch, } from "@helix-db/helix-db"; // Parameters for the read route. const userByUsernameParams = defineParams({ username: param.string(), }); function userByUsername(p = userByUsernameParams) { return readBatch() .varAs( "user", g() .nWithLabel("User") .where(Predicate.eqParam("username", "username")) .project([ PropertyProjection.renamed("$id", "id"), PropertyProjection.new("username"), PropertyProjection.new("tier"), ]), ) .returning(["user"]); } // Parameters for the write route. const addUserParams = defineParams({ username: param.string(), tier: param.string(), }); function addUser(p = addUserParams) { return writeBatch() .varAs( "newUser", g() .addN("User", { username: PropertyInput.param("username"), tier: PropertyInput.param("tier"), }) .project([PropertyProjection.renamed("$id", "id")]), ) .returning(["newUser"]); } export const queries = defineQueries({ read: { user_by_username: registerRead(userByUsername, userByUsernameParams), }, write: { add_user: registerWrite(addUser, addUserParams), }, }); ``` A few things to know about this snippet: - The keys under `read` and `write` name each query in the bundle and back the typed `queries.call.*` helpers (see below). Names must be unique across reads *and* writes — a duplicate throws `GenerateError` at bundle time. - `PropertyInput.param("name")` feeds a parameter into an `addN` property bag or a `setProperty`; for predicates use `Predicate.eqParam(...)` and friends. ## Add a generator entry point ```ts src/generate.ts await queries.generate("queries.json"); console.log("generated queries.json"); ``` `queries.generate(path)` writes the bundle and resolves to the path it wrote. The file is laid out identically to the Rust bundle so the runtime treats them as equivalent. ## Wire up the script ```json { "scripts": { "generate": "tsx src/generate.ts" } } ``` ```bash npm run generate ``` This produces `queries.json` in the project root. From here you can: - Mount it into the local `enterprise-dev` runtime as `PATH_TO_QUERIES` — see [Local Development](/database/local-development). - Deploy it to HelixDB through the control plane — see [Working with HelixDB](/database/working-with-enterprise). ## Sending registered queries from TypeScript `defineQueries` also exposes a typed `call` map. Each entry takes the query's parameter values (inferred from `defineParams`) and returns a `DynamicQueryRequest` you hand to the client's `.dynamic(...)`: ```ts const client = new Client("https://your-helix.example.com") .withApiKey("hx_secret"); // omit for a local instance const { user } = await client .query<{ user: unknown }>() .dynamic(queries.call.user_by_username({ username: "alice" })) .send(); ``` `call.` is fully typed — an unknown key or wrong parameter type is a compile error — and sets `query_name` to the route key so logs show `user_by_username`, not `__dynamic__`. See [Querying](/database/querying) for the full client send path. ## Next Steps # Go Project Setup Queries for HelixDB can be authored directly in Go with the `github.com/helixdb/helix-db/sdks/go` module. The Go SDK is dynamic-first: write ordinary Go functions that return `helix.Request`, declare parameters inline, and execute with `client.Exec(ctx, request, &out)`. There is no bundle-generation step in the primary Go workflow. For the traversal model and query patterns themselves, see [Querying](/database/querying) and the [Querying Guide](/database/querying-guide/overview). ## Prerequisites - Go 1.22 or later. - Optional: the [Helix CLI](/cli/getting-started) for local development and ad-hoc query testing. ## Create a project ```bash mkdir helix-go-app cd helix-go-app go mod init example.com/helix-go-app ``` ## Add the dependency ```bash go get github.com/helixdb/helix-db/sdks/go ``` Import the SDK as `helix`: ```go ``` ## Write query functions Go queries are normal functions. Use `helix.ReadQuery("name")` or `helix.WriteQuery("name")` to set the dynamic request's `query_name`, then declare runtime parameters inline with methods such as `q.ParamString`, `q.ParamI64`, and `q.ParamDateTime`. ```go package users type UserRow struct { ID int64 `json:"$id"` Name string `json:"name"` TenantID string `json:"tenantId"` } type FindUsersResponse struct { Users []UserRow `json:"users"` } func FindUsers(tenantID string, limit int64) helix.Request { q := helix.ReadQuery("find_users") tenant := q.ParamString("tenant_id", tenantID) maxRows := q.ParamI64("limit", limit) return q. VarAs("users", helix.G(). NWithLabel("User"). Where(helix.PredEq("tenantId", tenant)). Limit(maxRows). ValueMap("$id", "name", "tenantId"), ). Returning("users") } ``` Parameter refs can be passed directly into predicates, stream bounds, and mutation property inputs. The SDK inserts both `parameters` and `parameter_types` into the dynamic envelope. ## Parameterize request-specific values Direct Go values are serialized as literals in the query AST. This is useful for true constants, but it is not a runtime parameter: ```go helix.G().NWhere(helix.SourceEq("id", "foo")) // embeds "foo" in the AST ``` For values that change per request, declare a `q.Param*` value and pass the returned `ParamRef`. That keeps the query shape stable and lets the server reuse cached work across requests: ```go q := helix.ReadQuery("user_by_id") id := q.ParamString("id", userID) return q. VarAs("user", helix.G().NWhere(helix.SourceEq("id", id)).ValueMap("$id", "name")). Returning("user") ``` The same rule applies to traversal predicates such as `helix.PredEq`, bounds such as `Limit`, mutation properties, and search inputs: pass a `ParamRef` for request-specific values, not a direct literal. ## Return explicit variables The names passed to `Returning(...)` define the top-level response object keys and should match your response struct tags: ```go type FindUsersResponse struct { Users []UserRow `json:"users"` } return q.VarAs("users", traversal).Returning("users") ``` Use zero-argument `Returning()` only when the response is intentionally empty, such as an idempotent index-creation request. The SDK serializes it as `"returns":[]`. ## Execute queries ```go package users "context" helix "github.com/helixdb/helix-db/sdks/go" ) func ListUsers(ctx context.Context, client *helix.Client, tenantID string, limit int64) (FindUsersResponse, error) { var out FindUsersResponse err := client.Exec(ctx, FindUsers(tenantID, limit), &out) return out, err } ``` Create the client once and reuse it: ```go client, err := helix.NewClient("https://helix.example.com", helix.WithAPIKey("hx_secret")) if err != nil { return err } var out FindUsersResponse err = client.Exec(ctx, FindUsers("acme", 25), &out) ``` For local development, pass `""` or `"http://localhost:6969"` to `NewClient`. ## Write queries ```go type CreateUserResponse struct { User []UserRow `json:"user"` } func CreateUser(name string, tenantID string) helix.Request { q := helix.WriteQuery("create_user") nameParam := q.ParamString("name", name) tenant := q.ParamString("tenant_id", tenantID) return q. VarAs("user", helix.G().AddN("User", helix.Props{ helix.Prop("name", nameParam), helix.Prop("tenantId", tenant), }), ). Returning("user") } ``` Use execution options when a write must hit the writer node or wait for durability: ```go var created CreateUserResponse err = client.Exec(ctx, CreateUser("Alice", "acme"), &created, helix.WriterOnly(), helix.AwaitDurability(true), ) ``` ## Handle conflicts in application code `Client.Exec` does not retry HTTP `409 Conflict` responses automatically. Retry only when your operation is safe to replay. Remote errors are returned as `*helix.HelixError` with `StatusCode` set, and conflicts wrap `helix.ErrConflict`. Use `helix.IsConflict(err)` or `errors.Is(err, helix.ErrConflict)` instead of parsing the error text. ```go func ExecWithConflictRetry(ctx context.Context, client *helix.Client, build func() helix.Request, out any) error { for attempt := 0; attempt < 3; attempt++ { err := client.Exec(ctx, build(), out) if err == nil || !helix.IsConflict(err) || attempt == 2 { return err } time.Sleep(time.Duration(attempt+1) * 50 * time.Millisecond) } return nil } ``` ## Next Steps # Python Project Setup Queries for HelixDB can be authored directly in Python with the `helix-db` package, imported as `helixdb`. The Python SDK pairs a query-builder DSL with a small sync HTTP client. The API is Pythonic (`read_batch`, `write_batch`, `var_as`, `value_map`) and emits the same dynamic-query JSON AST as the Rust, TypeScript, and Go SDKs. For the traversal model and query patterns themselves, see [Querying](/database/querying) and the [Querying Guide](/database/querying-guide/overview). ## Prerequisites - Python 3.10 or later. - Optional: the [Helix CLI](/cli/getting-started) for local development and ad-hoc query testing. ## Create a project ```bash mkdir helix-python-app cd helix-python-app python3 -m venv .venv source .venv/bin/activate ``` ## Add the dependency When installing from a checkout of the HelixDB monorepo: ```bash pip install -e sdks/python ``` When installing from Git: ```bash pip install 'git+https://github.com/HelixDB/helix-db.git#subdirectory=sdks/python' ``` Import the SDK from `helixdb`: ```python from helixdb import Client, Predicate, define_params, g, param, read_batch, write_batch ``` A compatibility import path, `helix_db`, is also available for codebases that prefer underscore package names. ## Write query functions Python query builders are normal functions returning `ReadBatch` or `WriteBatch`. Use `define_params` for runtime values and pass the returned refs into predicates, limits, property inputs, search inputs, and mutations. ```python from helixdb import Predicate, Projection, define_params, g, param, read_batch find_users_params = define_params({ "tenant_id": param.string(), "limit": param.i64(), }) def find_users(p=find_users_params): return ( read_batch() .var_as( "users", g() .n_with_label("User") .where(Predicate.eq("tenantId", p.tenant_id)) .limit(p.limit) .project([ Projection.property("$id", "id"), Projection.property("name"), Projection.property("tenantId"), ]), ) .returning(["users"]) ) ``` Direct values are serialized as literals in the query AST. That is useful for true constants, but values that change per request should be declared as params so the query shape stays stable and the server can reuse cached work across requests. ## Build dynamic requests A batch becomes a dynamic request with `to_dynamic_request(...)` or `to_dynamic_json(...)`: ```python request = find_users().to_dynamic_request( find_users_params, {"tenant_id": "acme", "limit": 25}, query_name="find_users", ) body = find_users().to_dynamic_json( find_users_params, {"tenant_id": "acme", "limit": 25}, query_name="find_users", ) ``` The request includes `request_type`, `query_name`, `query`, `parameters`, and `parameter_types`, ready to POST to `/v1/query`. ## Execute queries Create the client once and reuse it: ```python from helixdb import Client, HelixError client = Client("http://localhost:6969") try: response = client.query().dynamic(request).send() except HelixError as error: if error.kind == "Remote": raise RuntimeError(error.details) from error raise users = response["users"] ``` For Helix Cloud, pass the cluster URL and API key: ```python client = Client("https://helix.example.com", api_key="hx_secret") ``` Use request-builder options when a write must hit the writer node or wait for durability: ```python created = ( client .query() .writer_only() .should_await_durability(True) .dynamic(create_user_request) .send() ) ``` Warm read-query caches with `warm_only()`: ```python client.query().warm_only().dynamic(read_request).send() ``` Stored routes post to `/v1/query/{name}`: ```python response = client.query().body({"tenant_id": "acme"}).stored("find_users").send() ``` ## Write queries ```python from helixdb import define_params, g, param, write_batch create_user_params = define_params({ "name": param.string(), "tenant_id": param.string(), }) def create_user(p=create_user_params): return ( write_batch() .var_as("user", g().add_n("User", {"name": p.name, "tenantId": p.tenant_id})) .returning(["user"]) ) ``` `read_batch().var_as(...)` rejects write traversals. Use `write_batch()` for any node/edge creation, property update/removal, drop, or index mutation. ## Bundles Python can also generate query bundles: ```python from helixdb import define_queries, register_read, register_write queries = define_queries({ "read": {"find_users": register_read(find_users, find_users_params)}, "write": {"create_user": register_write(create_user, create_user_params)}, }) # Dynamic request with query_name="find_users". request = queries.call.find_users({"tenant_id": "acme", "limit": 25}) # Write queries.json. queries.generate("queries.json") ``` Route names must be unique across read and write routes. Bundles serialize with version `4`, matching the other SDKs. ## Handle conflicts in application code The Python client does not retry HTTP `409 Conflict` responses automatically. Retry only when the operation is safe to replay. Remote errors are raised as `HelixError` with `kind == "Remote"`, `details`, and `status_code` populated. ```python for attempt in range(3): try: return client.query().dynamic(request).send() except HelixError as error: if error.kind != "Remote" or error.status_code != 409 or attempt == 2: raise ``` ## Next Steps # Roadmap ## Currently In Progress - ~~Local docker deployment~~ . - **More intelligent cache warming** (always improving) - ~~Dashboard improvements~~ / - **Supporting non-HA clusters** - **Reliability and availability improvements** (always improving) - **Backups and point in time recovery** - **Expand query plans to improve performance** ## Up Next - **SSO and SAML** - **RBAC and fine-grained api key permissions** - **AWS PrivateLink** Have a feature request? [Let us know on Discord](https://discord.com/invite/2stgMPr5BD) or [email us](mailto:founders@helix-db.com). # Release Notes # Working with HelixDB Cloud Helix Cloud separates deployment-time operations from runtime query execution. This boundary keeps the system easier to scale, secure, and operate. ## End-to-End Workflow ## Authoring Workflow Queries are authored as a single `queries.json` artifact — graph traversals, property filters, vector searches, text searches, and mutations all expressed as JSON. A single query can chain multiple traversals, apply filters, and combine graph, vector, and text operations within one transaction. You can build `queries.json` with the [Rust](/database/rust-project-setup), [TypeScript](/database/typescript-project-setup), or [Python](/database/python-project-setup) DSL. Queries are the only artifact that moves from development to production. There is no schema migration step, no ORM configuration, and no SQL to manage. The data model is implicit in the queries themselves. ## Runtime Workflow At runtime, applications interact with Helix Cloud over HTTP. 1. **Send the query.** An HTTP client POSTs a dynamic request to the gateway at `POST /v1/query` — the JSON query AST plus any required parameters travel in the request body, so there is nothing to deploy ahead of time. 2. **Transaction execution.** The gateway routes the request to the appropriate process. The query runs inside a transaction with serializable snapshot isolation. Reads and writes within the same query see a consistent snapshot. 3. **Result delivery.** Helix returns the query result as a JSON response. Reads are served by horizontally scaled readers. Writes are serialized through the single writer. ## Dynamic Queries Each request carries the query inline, so you never deploy a route ahead of time. - Send `POST /v1/query`. - Include a JSON body shaped like: ```json { "request_type": "read", "query_name": "user_by_username", "query": { "queries": [ { "Query": { "name": "user", "steps": [ { "NWhere": { "Eq": ["username", { "String": "Alice" }] } } ], "condition": null } } ], "returns": ["user"] }, "parameters": {} } ``` - `request_type` must be `read` or `write` so the gateway can route the request correctly. - `query_name` is optional operational metadata for logs and query diagnostics. Use the exact top-level field `query_name`; missing or `null` falls back to `__dynamic__`, and blank strings are rejected. - `query` is the same JSON object that lives under `read_routes.` or `write_routes.` in a generated `queries.json` bundle. - `parameters` is optional and carries the values for the query's named parameters. ## Query Warming Helix supports built-in query warming for read queries. - Send the normal read request to `POST /v1/query` with the same parameters you would use for a real read. - Add `X-Helix-Warm: true` or `X-Helix-Warm: 1`. The SDK clients set this for you via `.warmOnly()` (TypeScript), `.warm_only()` (Rust/Python), or `helix.WarmOnly()` (Go): - Helix executes the query as a query warming request, discards the result body, and returns `204 No Content`. This is useful when you want to pre-populate the per-process caches that a known query touches before live traffic arrives. Important details: - Query warming is only supported for read queries. Warming a write query is rejected. - Helix sends the query warming request to all backends so every node can populate its local caches with the data fetched during query execution. - Query warming does not create a separate query-result cache. It warms the normal storage, vector, and text-search caches that the query touches during execution. ## Separation of Concerns | Responsibility | Deploy-time | Runtime | | -------------------- | ------------- | --------------------------- | | Query authoring | `queries.json` | -- | | Query submission | -- | `helix query` + Gateway | | Query execution | -- | Gateway + Writer/Readers | | Data storage | -- | Object storage | | Caching | -- | SSD + in-memory per process | | Scaling | -- | Auto-scaling readers | ## Local Development For iterating locally without provisioning a full cluster, Helix ships a combined `enterprise-dev` image that runs a gateway and database together in a single container. See [Developing with HelixDB Locally](/database/local-development) for in-memory and on-disk setups. ## Next Steps # Architecture Helix Cloud runs as a single writer node and multiple reader nodes behind a routing gateway. Reads scale horizontally. Writes are serialized through a dedicated writer process to maintain a simple consistency model. Helix Cloud is a fundamentally different architecture and database compared to the opensource v1 version of HelixDB. That version used LMDB which was limited to sequential writes and could only handle a relatively small amount of data. Helix Cloud uses a new LSM based storage engine backed by object storage that can handle concurrent writes to the writer node and allows for virtually unlimited data storage. ## Gateway The gateway is the entry point for all client traffic. It authenticates each request via Bearer token, accepts the inline query payload, and routes it: mutations always go to the writer, read-only queries are distributed across the readers and the writer. For high availability, deploy at least three gateway instances per cluster. Smaller fleets work for non-HA or test workloads but are not recommended for production. ## Writer A single writer process handles all mutations. The writer supports concurrent write transactions through MVCC (multi-version concurrency control), allowing multiple writes to execute in parallel without blocking each other. Serializing the commit path through one process eliminates distributed coordination and simplifies the consistency model. The writer batches mutations for throughput and persists them durably to object storage before acknowledging. The writer also serves read-only queries. It maintains its own SSD and in-memory cache, giving it the most up-to-date view of the data. Reads routed to the writer see committed writes immediately, with no snapshot refresh delay. ## Readers Readers serve all read-only queries. They are stateless with respect to writes and can be added or removed without coordination. Each reader maintains a local SSD and in-memory cache populated from object storage. Reader scaling is automatic. As query load increases, new readers are provisioned. As load decreases, excess readers are removed. This keeps cost proportional to actual query volume. ## Object Storage Object storage is the durable system of record. All graph data, vector indexes, text index artifacts, and metadata persist here. No data lives exclusively on local disk. This means the system can recover from a full cache loss by reading from object storage, and storage capacity is effectively unbounded. ## Cache Hierarchy Each process (writer and readers) maintains local cache tiers for the data and indexes it serves. - **In-memory cache.** Fastest access. Holds the most frequently accessed graph data, vector search state, and hot text-search generations. Bounded by available RAM. - **SSD cache.** Larger capacity, lower cost per byte. Holds warm graph data, vector data, and reusable text-search artifacts. Reads from SSD are significantly faster than reads from object storage. Graph, vector, and text workloads use specialized cache paths so hot working sets do not fully contend with one another. On cold start, caches warm progressively as queries execute; for predictable latency from the first query, caches (including text index generations) can be pre-warmed. ## Read Path 1. A read request arrives at the gateway and is routed to a reader or the writer. 2. The reader resolves a consistent snapshot from object storage metadata. 3. Data is read from the in-memory cache, SSD cache, or object storage (in that order). 4. The query executes against the snapshot and returns results. Cache misses transparently fall through to object storage. The same query produces the same result regardless of cache state; caching affects latency, not correctness. ## Write Path 1. A write request arrives at the gateway and is routed to the writer. 2. The writer executes the mutation within a serializable transaction. Multiple write transactions execute concurrently via MVCC; conflicts are resolved at commit time. 3. The mutation is batched and persisted durably to object storage. 4. Once durable, the write is acknowledged to the client. 5. Readers observe the new data on their next snapshot refresh. ## Next Steps # Guarantees Helix Cloud provides full ACID guarantees for every transaction. ## Atomicity Each query executes as a single atomic transaction. All mutations within a query either commit together or roll back entirely. There are no partial writes. If any step of a multi-traversal query fails, the entire transaction is aborted and no data is modified. ## Consistency The database transitions from one valid state to another on every committed transaction. All constraints (unique IDs, type safety on properties, vector dimension matching, and text index value constraints) are enforced at write time, not retroactively. ## Isolation Transactions run with serializable snapshot isolation. Each transaction reads from a consistent point-in-time snapshot. Graph reads, vector search, and text search all observe that same snapshot. Write transactions also get read-your-writes, and indexed reads participate in normal conflict detection. - Read-only queries see a consistent snapshot and are never blocked by writes. - Write transactions commit through the single writer process, so conflicting writes are detected before commit. ## Durability All committed data is persisted to object storage before the write is acknowledged to the client. Object storage provides the durability guarantee. Local caches (SSD and in-memory) are performance optimizations, not durability mechanisms. A full cache loss does not result in data loss. ## High Availability Helix Cloud can run with fewer than three gateway or database nodes, but those deployments are not high availability and are not recommended for production. For high availability, use at least three database nodes (one writer and two readers) and at least three gateway instances per cluster. This baseline ensures: - **Reader redundancy.** If a reader becomes unavailable, the remaining readers and the writer continue serving read traffic. A replacement reader is provisioned automatically. - **Gateway redundancy.** Gateway instances are stateless and interchangeable. Loss of one gateway has no impact on availability. Traffic is redistributed across the remaining instances. - **Writer recovery.** If the writer becomes unavailable, a new writer process is started and recovers its state from object storage. No committed data is lost. Write availability is restored once the new writer is running. - **Infrastructure redundancy.** All supporting services (load balancers, metadata stores, coordination layers) run with redundant infrastructure sized for high-availability clusters, eliminating single points of failure across the cluster. Object storage itself provides independent durability and availability guarantees. The database cluster can be fully rebuilt from object storage contents. ## Service Level Agreements Custom SLAs covering uptime, latency percentiles, and support response times for enterprise clusters. Contact [founders@helix-db.com](mailto:founders@helix-db.com) to discuss SLA terms for your deployment. ## Consistency Model Helix Cloud executes every query against a committed snapshot. The writer can serve newly committed state immediately. Readers observe new data on their next snapshot refresh. The refresh interval determines the upper bound on read staleness for read-only traffic after a write is acknowledged. # Security ## Authentication All API requests to Helix Cloud are authenticated with a Bearer token. Tokens are managed through the Helix dashboard. Include the token in the `Authorization` header of every request: ``` Authorization: Bearer ``` The SDK clients attach this header for you — set the key once and every request is authenticated, via `withApiKey` (TypeScript), `with_api_key` (Rust), or `WithAPIKey` (Go): ```ts const client = new Client("https://helix.example.com").withApiKey(token); ``` Requests without a valid token are rejected at the gateway before reaching any database node. Token rotation and revocation take immediate effect from the dashboard. ## Encryption All traffic between clients and the gateway is encrypted in transit via TLS. Data at rest in object storage is encrypted using the storage provider's server-side encryption. ## Enterprise features The following are available for enterprise clusters. Contact [founders@helix-db.com](mailto:founders@helix-db.com) to enable them for your deployment. - **Role-based access control.** Scoped API keys with read-only, read-write, or operation-restricted permissions for least-privilege credentials per service or environment. - **SSO / SAML.** Dashboard access through your identity provider (Okta, Azure AD, Google Workspace) with centralized provisioning and deprovisioning. - **Audit logs.** Per-request logging (timestamp, token identity, query name, source IP, response status) for compliance (SOC 2, HIPAA, GDPR) and forensic analysis. - **AWS PrivateLink.** A private endpoint in your VPC that routes to Helix Cloud without traversing the public internet, for network-isolation requirements in regulated environments. # Tradeoffs Every system makes design choices. Helix Cloud optimizes for durable, cost-effective graph, vector, and text workloads with strong transactional guarantees. These choices have implications. ## Excels At | Area | Details | | ------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------- | | **Durable storage** | All data persists in object storage. No risk of data loss from local disk failure. Storage capacity scales independently of compute. | | **Read scalability** | Readers auto-scale horizontally. Doubling readers doubles read throughput with no coordination overhead. | | **Serializable transactions** | Every query runs against a stable snapshot with ACID semantics by default. | | **Mixed graph, vector, and text workloads** | Graph traversals, vector search, and full-text search execute in the same transaction, against the same snapshot. No need to stitch together separate systems for those workloads. | | **Cost efficiency at scale** | Object storage is significantly cheaper per GB than local SSDs or in-memory stores. Large datasets remain affordable. | | **Operational simplicity** | Single writer eliminates distributed consensus. No leader election, no split-brain, no quorum management. | ## Not Optimal For | Area | Details | | ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | **Sub-millisecond reads** | Cache hits are fast, but cold reads require an object storage round trip. Workloads that require guaranteed sub-millisecond latency on every read are better served by in-memory databases. | | **Ultra-low write latency** | Writes incur object storage latency for durability. Write throughput is high, but individual write latency has a floor set by object storage round-trip time. | | **Exhaustive vector recall** | Vector search is approximate (ANN). Applications that require 100% exact nearest neighbor results should use brute-force search on smaller datasets. | ## Design Choices **Object storage as the system of record.** Caches accelerate reads but are not required for correctness. This means cold starts are slower than systems that keep all data on local disk, but durability and cost characteristics are superior. **Single writer.** Serializing all writes through one process avoids distributed coordination at the cost of write throughput being bounded by a single node. In practice, batching and the high throughput of the writer process make this sufficient for most workloads. **Specialized cache paths.** Helix maintains separate cache paths for graph data, vector indexes, and text search artifacts. The tradeoff is less flexibility in cache allocation when only one workload dominates. **Dynamic query model.** Queries travel inline with each request, so there is no deploy step and ad-hoc queries are fully supported. The tradeoff is the small per-request cost of deserializing the query AST on every call. # Limits Current constraints and practical limits. These reflect the current implementation, not fundamental architectural boundaries. ## Data Model | Limit | Value | | ---------------------- | ---------------------------------------------------------------------------------------------- | | Node and edge ID range | 64-bit unsigned integer (max 2^63) | | Property value types | boolean, integer, float, string, bytes, typed primitive arrays, generic arrays, and objects | | Nested structures | Stored object/array values are supported. Dotted-path lookup such as `metadata.externalID` works in scan-time filters, expressions, projections, `values`, filtered `valueMap`, and fallback ordering. Arrays are opaque; there is no array-index path syntax. | | Reserved property keys | `$label` (used for label-based filtering and label-scoped secondary, vector, and text indexes) | ## Vector Indexes | Limit | Value | | ------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------- | | Scope | Node and edge properties | | Supported property types | Numeric array properties (`float32[]`, `float64[]`, `int64[]`) normalized to `float32` for indexing | | Tenant partitioning | Optional by configured tenant property name. Tenant-scoped searches require a tenant value. Unknown tenant partitions return no results. | | Dimension matching | Vectors must exactly match the configured index dimension | | Distance metrics | cosine, euclidean, manhattan | | Search type | Approximate nearest neighbor (ANN) | ## Text Indexes | Limit | Value | | ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Scope | Node and edge properties | | Supported property types | Top-level `String` and `StringArray` properties. Nested object fields are not flattened for BM25. | | Unsupported values | `null` and non-string values are rejected | | Analyzer config | Preset analyzers: `standard`, `standard_stem_en`, `whitespace_lowercase` | | Term positions | Optional | | Tenant partitioning | Optional. Tenant-partitioned text indexes currently require the partition property name to be `tenant_id`, and tenant-scoped searches require a tenant value. | ## Secondary Indexes | Limit | Value | | ---------------- | ---------------------------------------------------------------------------------------------------------- | | Equality indexes | Supported on top-level node and edge properties. Nested dotted paths are scan-only in V1. | | Range indexes | Supported on top-level numeric and string properties. Nested dotted paths are scan-only in V1. | | Encoding | Range indexes use lexicographic string encoding. Values must be encoded consistently for correct ordering. | ## Queries | Limit | Value | | ----------------- | ------------------------------------ | | Query model | Dynamic queries | | Query language | SDK DSLs or dynamic JSON AST | | Transaction scope | One transaction per query invocation | # Data Model Helix Cloud stores data as a labeled property graph. The graph consists of nodes and directed edges, each carrying typed properties. ## Nodes and Edges Nodes and edges are identified by 64-bit unsigned IDs. Edges are directed: each edge has a source node and a target node. Labels are stored as the reserved `$label` property and are used for type-based filtering and label-scoped secondary, vector, and text indexes. Properties are strongly typed. Supported types include boolean, integer, floating-point, string, bytes, typed primitive arrays, generic arrays, and object maps. Object maps may be nested; query property names such as `metadata.externalID` read nested fields with exact-first dotted-path lookup. A stored top-level property literally named `metadata.externalID` wins over the nested `metadata.externalID` path during scans. Nested object fields are queryable in filters and projections, but V1 indexing is top-level only. Keep secondary, text, and vector indexed values as top-level properties. Generic arrays are stored as values and returned as values, but array index path syntax such as `tags.0` is not supported. ## Multigraph Multiple edges between the same pair of nodes are supported. Each edge has a unique ID. The pair `(from, to)` maps to the set of all edge IDs connecting those nodes. This allows modeling relationships like "user A sent message B to user C" and "user A sent message D to user C" as distinct edges with distinct properties. For how this data is filtered and searched, see [Indexing](/database/indexing/secondary). For how it is read and mutated, see [Querying](/database/querying). # Querying Helix Cloud exposes a single query surface over HTTP. Every query — whether a deep graph traversal, a vector search, a bulk write, or all three at once — is expressed in the same composable DSL and executes as one serializable transaction. Queries are authored once in TypeScript, Rust, Go, or Python (or by hand as JSON) and sent to the runtime as **dynamic** requests — the request body carries the query itself, so there is no separate deployment step. The [Querying Guide](/database/querying-guide/overview) walks the DSL end-to-end; this page covers how a dynamic query looks on the wire. ## Installing the SDKs The TypeScript, Rust, Go, and Python examples below are built with the Helix SDKs. Install the one for your language to follow along — all four emit the same dynamic query JSON. For the full project layout and generator setup, see [TypeScript Project Setup](/database/typescript-project-setup), [Rust Project Setup](/database/rust-project-setup), [Go Project Setup](/database/go-project-setup), and [Python Project Setup](/database/python-project-setup). ## Dynamic queries A dynamic query is sent inline: the request body carries the JSON AST for the query plus a small envelope that names the request kind, an optional query name, and any runtime parameters. There is no deployment step — the query travels with the request, so the same DSL that builds it also produces the body you POST. The example below counts every `User` node. The TypeScript, Rust, Go, and Python tabs are builder front-ends that produce the same JSON envelope; pick whichever your codebase already speaks. The JSON tab shows the resulting envelope verbatim — if you want to skip the DSL and POST hand-written JSON, this is the body you send. Sending that envelope is one POST to `/v1/query`. The response is a JSON object keyed by the names from `.returning([...])`. `query_name` is optional metadata for gateway logs and diagnostics. Use the exact field `query_name` (not `name` or `queryName`); it defaults to `__dynamic__` when missing or `null`, and blank strings are rejected. For larger envelopes, write the JSON to a file and use `curl --data-binary @file.json` — the inline `--data-raw` form gets unwieldy past a few steps. For parameters, bundles, and the typed `queries.call.*` helpers, see [Parameters & bundles](/database/querying-guide/parameters-bundles). ## Running queries with the client The SDKs ship a `Client` so you don't have to assemble requests by hand. The client owns the connection details (URL and, on Helix Cloud, the API key) and posts dynamic requests to `/v1/query`. `send()` / `Exec` returns the parsed JSON response on HTTP `200`. Any other status raises a `HelixError` whose `kind` is one of `Network`, `Remote`, `Serialization`, or `InvalidUrl`. In Go, remote errors are returned as `*helix.HelixError` with `StatusCode` populated; HTTP `409 Conflict` also wraps `helix.ErrConflict` and can be checked with `helix.IsConflict(err)`. The SDKs do not retry conflicts automatically, so retry only in application code when the operation is safe to replay. ## Transactions Every query — read or write — executes as a single transaction with serializable snapshot isolation. - **Serializable.** Transactions behave as if they executed one at a time, even when running concurrently. - **Snapshot isolation.** Each transaction reads from a consistent point-in-time snapshot. Reads within a transaction are never affected by concurrent writes. - **Automatic.** Transactions are implicit. Every query invocation is a transaction. There is no manual `BEGIN` / `COMMIT` / `ROLLBACK`. Read-only queries never block writes. Write transactions are serialized through the single writer process for correctness without distributed locking. See [Guarantees](/database/guarantees) for the full consistency and durability contract. ## Next Steps # Querying Guide: Overview This guide is a tutorial walkthrough of the HelixDB query language. Each page builds on the last, starting from the simplest possible read and ending at parameter-bound bundles. Key examples show the **TypeScript**, **Rust**, **Go**, **Python**, and **JSON** forms across the guide, so you can pick a client and follow along or skim across the surfaces to see how they relate. These forms are not different query languages; they are encodings of the same query AST. The TypeScript, Rust, Go, and Python DSLs are builders that emit the JSON shape directly. If you copy the JSON column into a `POST /v1/query` body, it will execute the same query as the DSL above it. ## Installing the SDKs To follow along in TypeScript, Rust, Go, or Python, install the SDK for that language. All four emit the same query AST, so you can switch between them — or use the JSON column directly and skip the SDK entirely. For the full project layout and generator setup, see [TypeScript Project Setup](/database/typescript-project-setup), [Rust Project Setup](/database/rust-project-setup), [Go Project Setup](/database/go-project-setup), and [Python Project Setup](/database/python-project-setup). ## The running example Every page in this guide uses one small social-graph schema, so each new step plugs into a domain you already know. - **Nodes:** `User` (`username`, `tier`, `createdAt`), `Post` (`title`, `body`, `embedding`, `createdAt`), `Tag` (`name`). - **Edges:** `FOLLOWS` (User → User), `AUTHORED` (User → Post), `LIKED` (User → Post), `TAGGED` (Post → Tag). - **Indexes:** a unique-equality index on `User.username`, a vector index on `Post.embedding`, a text index on `Post.body`. Index setup itself is covered on the [Search](/database/querying-guide/search) page; for now assume they exist. ## How a query is shaped Every query — read or write — is a small pipeline: 1. Open a batch with `readBatch()` or `writeBatch()`. 2. Bind one or more named sub-results with `.varAs("name", traversal)`. A traversal always starts from `g()` and chains steps until it produces nodes, edges, or a terminal value. 3. Choose which bound names to surface to the caller with `.returning([...])`. The TypeScript shell: ```ts readBatch() .varAs("name", g().someSource().someStep()) .varAs("other", g().someSource().someStep()) .returning(["name", "other"]); ``` The Rust shell is identical except for naming conventions (`read_batch`, `var_as`, `returning`). Go uses `helix.ReadQuery("name")`, `VarAs`, and variadic `Returning`. The JSON shell is what the batch serializes to: a top-level `{queries: [...], returns: [...]}` object, with each `varAs` producing one `{Query: {name, steps, condition}}` entry. The [Reading nodes](/database/querying-guide/reading-nodes) page covers the shell step by step; this overview just shows the finished product. ## A full first example Fetch a user by `username` and the ten most recent posts they authored. This example takes no parameters; it hard-codes `"alice"` and a limit of `10` so the JSON column is unambiguous. Parameter binding is introduced on the [Filtering](/database/querying-guide/filtering) and [Parameters & bundles](/database/querying-guide/parameters-bundles) pages. Two things to notice: - The `posts` traversal starts from `NodeRef.var("user")` — that's how one named binding feeds into the next. The same name appears as `{"N": {"Var": "user"}}` in the JSON. - The TypeScript, Rust, Go, and Python DSLs surface friendly `nWithLabel` / `n_with_label` / `NWithLabel` alias, but this example uses `nWhere(SourcePredicate.eq("username", "alice"))` directly because the read is anchored on a unique property, not a label scan. The next page covers when to use each anchor. ## Sending the request The JSON above is the actual body for `POST /v1/query`. Below are the same request in multiple transports — TypeScript, Rust, Go, and Python using their SDK clients, and `curl` straight from the shell. The transports are interchangeable; pick whichever fits your environment. The response is a JSON object keyed by the names from `.returning([...])` — here `{ "user": [...], "posts": [...] }`. Each value is the list of rows produced by that named binding. The [Parameters and bundles](/database/querying-guide/parameters-bundles) page shows how to bind parameters and organize queries into a `queries.json` bundle. ## How to read this guide Each page leads with a short intro, then `` blocks in all four forms. TypeScript is the lead tab; Rust mirrors it with snake_case names and trailing underscores for keywords (`where_`, `in_`, `as_`); Go mirrors the same AST via `helix.ReadQuery` / `helix.WriteQuery`; and JSON is the serialized AST you POST to `/v1/query`. Pages can be read in any order. ## Next Steps # Reading nodes and edges Every traversal in the DSL starts with `g()` — a fresh, empty graph cursor — followed by a *source* step that produces nodes or edges. This page covers the source steps: how to anchor on an id, a label, or an indexed property, and the difference between a `SourcePredicate` and a full `Predicate`. ## The batch shell Before any source step, you need a batch to hold the traversal. Reads use `readBatch()`; writes use `writeBatch()`. Inside, every named result is introduced with `varAs("name", traversal)`, and `returning([...])` chooses which names the caller receives. Two things worth noting from the JSON column: - `nWithLabel("User")` is a DSL convenience. On the wire it becomes a label-filtered source: `{"NWhere": {"Eq": ["$label", {"String": "User"}]}}`. The virtual `$label` field is implicit on every node. - `count()` is a *unit* step — it carries no payload — so it serializes as the bare string `"Count"` rather than a `{Count: ...}` object. ## Anchoring on a node id When you already have a node id (returned by an earlier query, or stored alongside your application data), pass it to `g().n(...)` to skip any index lookup at all. `NodeRef` has several useful constructors: - `NodeRef.id(42n)` / `NodeRef.ids([1n, 2n, 3n])` — one or many concrete ids. - `NodeRef.var("name")` — the result of an earlier `varAs` binding. - `NodeRef.param("name")` — a value supplied at request time (covered in [Parameters & bundles](/database/querying-guide/parameters-bundles)). - `NodeRef.all()` — every node in the graph; rarely what you want. Node ids fit in `i64` and can exceed `Number.MAX_SAFE_INTEGER`. Use `BigInt` literals (`42n`) in TypeScript when authoring queries that touch ids directly. ## Anchoring on a label `nWithLabel(label)` selects every node of a given label. Without a follow-up filter this is a full label scan, so reach for it only when you genuinely want every node of that label. ## Anchoring on a unique property The most common starting shape: look up a node by an indexed property. Use `nWhere(SourcePredicate.eq(...))` when there is no label scope, or `nWithLabelWhere(label, SourcePredicate)` when you want both at the source step. ### `SourcePredicate` vs `Predicate` Source steps (`nWhere`, `eWhere`, etc.) accept a `SourcePredicate`, which is a deliberately smaller set: `eq`, `neq`, `gt`, `gte`, `lt`, `lte`, `between`, `hasKey`, `startsWith`, plus `and` / `or`. These are the predicates the storage layer can push down into an index lookup. For richer post-filtering — collection membership, regex-ish contains/endsWith, `isNull`, `not`, parameter-bound comparisons — use a `.where(Predicate.*)` step *after* the source. That's covered in [Filtering](/database/querying-guide/filtering). If you've already built a `SourcePredicate` and need to lift it into a full `Predicate`, every `SourcePredicate` exposes `.toPredicate()`. ## Anchoring on label plus a predicate When the source step is both label-scoped and predicate-filtered, `nWithLabelWhere` collapses the two into one step. It produces the same JSON as `nWhere(And[$label, ...])` but reads more clearly. ## Anchoring on edges Edges have a parallel set of source steps: - `g().e(EdgeRef.id(...))` / `EdgeRef.ids([...])` — by id. - `g().eWithLabel("FOLLOWS")` — by edge label (e.g. every `FOLLOWS` edge in the graph). - `g().eWhere(SourcePredicate.gt("weight", 0.5))` — by indexed edge property. - `g().eWithLabelWhere("FOLLOWS", SourcePredicate.gte("since", "2026-01-01"))` — label + predicate. Edge streams support `.outN()` / `.inN()` / `.otherN()` to walk back to a node — see [Traversals](/database/querying-guide/traversals) for the full edge story. ## Picking the right anchor Order of preference, narrowest first: 1. **Node or edge id** (`g().n(NodeRef.id(...))`). No index lookup at all. 2. **Unique-indexed property** (`g().nWhere(SourcePredicate.eq("username", ...))`). 3. **Equality-indexed property** (same shape, non-unique index). 4. **Label-scoped predicate scan** (`g().nWithLabelWhere(...)`). 5. **Plain label scan** (`g().nWithLabel(...)`). Last resort. If your route accepts an id or other indexed identifier from the request, use it as the anchor — never start from a broad label scan when an indexed identifier is available. [Parameters & bundles](/database/querying-guide/parameters-bundles) shows the parameterized versions of every shape above. ## Next Steps # Traversals A *traversal* moves from one set of nodes or edges to another by following labeled edges. This page covers the node-to-node and node-to-edge steps, multi-hop chaining, deduplication, and naming intermediate frontiers with `as` / `select`. All examples assume the source step has already produced a node or edge stream — see [Reading nodes and edges](/database/querying-guide/reading-nodes) for those. ## Node-to-node steps From a node stream, three steps walk along edges and land on the nodes at the other end: - `.out(label?)` — follow outgoing edges; emit the *destination* nodes. - `.in(label?)` — follow incoming edges; emit the *source* nodes. - `.both(label?)` — both directions; emit the neighbor on the other side. The label argument is optional. Omit it to follow every edge regardless of label; supply it to scope to one edge type. In the JSON the omitted form becomes `{"Out": null}`. To get *followers* — users following Alice — swap `out("FOLLOWS")` for `in("FOLLOWS")`. To get both at once, use `both("FOLLOWS")`. ## Multi-hop traversals Steps chain. Reading a user's tags ("which tags appear on any of Alice's posts") is three hops: `User → AUTHORED → Post → TAGGED → Tag`. Add `.dedup()` because a tag might appear on more than one of Alice's posts. `.dedup()` is a unit step — it serializes as the bare string `"Dedup"`. ## Edge-valued steps Sometimes you want the *edges* themselves, not the nodes they point to. Use the `E` suffix: - `.outE(label?)` — outgoing edges (a *stream of edges*). - `.inE(label?)` — incoming edges. - `.bothE(label?)` — edges in either direction. From an edge stream: - `.outN()` — the source node of each edge. - `.inN()` — the destination node. - `.otherN()` — the *other* node, relative to the node you arrived from. `.otherN()` only makes sense after `.bothE(...)`, where "the other end" depends on the originating node. Edges carry properties too. The example below grabs every `LIKED` edge Alice has written, filters by edge property, and returns the destination posts. Notice that `"InN"`, like `"Dedup"` and `"Count"`, is a unit step encoded as a bare string in the JSON. ## Naming intermediate frontiers When a multi-step traversal needs to refer back to an earlier point — for "friends-of-friends excluding direct friends", or "posts not already liked by the viewer" — bind a name with `.as("frontier")` and reach for it later with `.select("frontier")` (which replaces the current stream with the stored one) or with the set operators `.within("frontier")` / `.without("frontier")`. `.store(name)` is an alias for `.as(name)`; both create a named snapshot of the current stream. A handy rule of thumb: - `.as(name)` / `.store(name)` — snapshot the current stream without changing it. - `.select(name)` — replace the current stream with the snapshot. - `.within(name)` — keep only items that are also in the snapshot (intersection). - `.without(name)` — drop items that are in the snapshot (set difference). ## What's not on this page - Variable-length traversals (`g().n(...).repeat(...)`) belong on the [Advanced patterns](/database/querying-guide/advanced) page along with `union`, `choose`, and `coalesce`. - Filtering inside the traversal (`.has`, `.where`, `.edgeHas`) is covered on [Filtering, ordering, paging](/database/querying-guide/filtering). - Shaping the output (`.valueMap`, `.project`, `.count`, `.exists`) is covered on [Projections](/database/querying-guide/projections). ## Next Steps # Filtering, ordering, paging This page covers the steps that *narrow* a stream rather than walking the graph: property and label filters, the full `Predicate` catalog, ordering, and slicing the result with `limit` / `skip` / `range`. ## `.has`, `.hasLabel`, `.hasKey` The simplest filters are equality checks on a single property. They work on both node and edge streams. - `.has(property, value)` — property equals the given value. - `.hasLabel(label)` — node/edge label equals. - `.hasKey(property)` — property exists (any non-null value). These are convenience wrappers; under the hood they could all be expressed via `.where(...)`. ## `.where(Predicate.*)` For anything richer than equality, use `.where(Predicate.*)`. `Predicate` exposes the full set: | Group | Builders | |---|---| | Comparison | `eq`, `neq`, `gt`, `gte`, `lt`, `lte`, `between` | | String | `startsWith`, `endsWith`, `contains` | | Collection | `isIn`, `isInExpr` | | Property presence | `hasKey`, `isNull`, `isNotNull` | | Logical | `and([...])`, `or([...])`, `not(p)` | | Parameter-bound | `eqParam`, `neqParam`, `gtParam`, `gteParam`, `ltParam`, `lteParam`, `isInParam`, `containsParam` | | Expression | `compare(left, op, right)` — for arithmetic on either side | A typical multi-predicate filter: ### `SourcePredicate` vs `Predicate` Use a `SourcePredicate` at source steps when the filter should participate in index selection; use `.where(Predicate.*)` for anything outside that set or to filter after a traversal step. See [`SourcePredicate` vs `Predicate`](/database/querying-guide/reading-nodes#sourcepredicate-vs-predicate) for the full breakdown. ## Nested property paths Object properties can be filtered with dotted paths. The runtime first checks for an exact top-level property name, then walks nested object fields when the name contains `.`. Dotted paths are scan-only in V1. Pair them with an indexed top-level anchor such as label, tenant, or status when the label is large; the indexed predicate narrows candidate rows and the dotted predicate runs as a residual filter. Arrays are opaque, so `metadata.tags.0` is not supported. ## Parameter-bound predicates To filter by a value the caller supplies at request time, use the `*Param` family. Pass the parameter *name* as a string; the runtime substitutes the value at execution. Full parameter declaration via `defineParams` is covered on [Parameters & bundles](/database/querying-guide/parameters-bundles); this page focuses on the predicate shape. A few wire-format details worth noting: - `Predicate.eqParam(prop, name)` desugars to a generic `Compare` predicate with a `Property` on the left and a `Param` on the right. The same shape applies to `neqParam`, `gtParam`, `gteParam`, `ltParam`, `lteParam` (the `op` field varies). - `Predicate.isInParam("status", "statuses")` becomes `{"IsInExpr": ["status", {"Param": "statuses"}]}` — a `Param`-valued expression, not a `Compare`. - `Predicate.containsParam("body", "needle")` becomes `{"ContainsExpr": ["body", {"Param": "needle"}]}`. ## Filtering edges in the middle of a traversal When you're walking through an edge stream and want to filter on edge properties without leaving the stream, use the same generic filters as node streams: `.has(prop, value)`, `.hasLabel(label)`, `.hasKey(prop)`, and `.where(Predicate.*)`. Edge predicates can read stored edge properties plus the virtual edge fields `$id`, `$label`, `$from`, `$to`, `$distance`, and `$score`. Use `.edgeHas(prop, value)` when the right-hand side must be a `PropertyInput` expression or runtime parameter. ## Ordering `.orderBy(property, Order.Asc | Order.Desc)` sorts by a single property. `.orderByMultiple([[prop, order], ...])` sorts by several at once with explicit direction per key. For multi-key ordering: ```ts g() .nWithLabel("Post") .orderByMultiple([ ["createdAt", Order.Desc], ["title", Order.Asc], ]); ``` ```go helix.G().NWithLabel("Post").OrderByMultiple( helix.Ordering{Property: "createdAt", Order: helix.OrderDesc}, helix.Ordering{Property: "title", Order: helix.OrderAsc}, ) ``` which serializes as: ```json { "OrderByMultiple": [["createdAt", "Desc"], ["title", "Asc"]] } ``` ## Paging: `.limit`, `.skip`, `.range` The three slicing steps all accept either a literal number / `bigint` or a `StreamBound` for parameter-bound bounds. - `.limit(n)` — keep the first `n`. - `.skip(n)` — drop the first `n`. - `.range(start, end)` — keep the half-open `[start, end)` slice. ### Literal bounds ### Parameter-bound limits For request-time pagination, wrap the bound in `StreamBound.expr(Expr.param("..."))`. The JSON variant changes too: `Limit` becomes `LimitBy`, `Skip` becomes `SkipBy`, `Range` becomes `RangeBy`. The `Range` parameter-bound variant looks like: ```json { "RangeBy": [{ "Literal": 0 }, { "Expr": { "Param": "end" } }] } ``` — each bound is independently a `Literal` or an `Expr`, so you can mix-and-match a constant start with a parameterized end. ## Next Steps # Projections Up to this point every example has ended with `.valueMap([...])` to ship a few properties back to the caller. This page covers the full set of terminal projections: how to shape the output, compute derived values, and aggregate the stream. A terminal step (`.count`, `.values`, `.project`, ...) ends the chain and produces a result rather than another stream. TypeScript and Rust block further chaining at compile time; Go reports it at serialization (`Validate`, `MarshalRequest`, or `Client.Exec`). ## Scalar terminals: `.count`, `.exists`, `.id`, `.label` The smallest terminals collapse the whole stream into a single value. The four scalar terminals: | Step | Returns | |---|---| | `.count()` | Number of items in the stream. | | `.exists()` | `true` if the stream is non-empty, `false` otherwise. | | `.id()` | Stream of `$id` values, one per item. | | `.label()` | Stream of `$label` values, one per item. | `.id()` and `.label()` are usually most useful followed by another binding via `.varAs(...)`, or as input to `.project(...)`. ## `.values(...)` and `.valueMap(...)` When you want a few properties from each item: - `.values([prop, ...])` — emits an array of arrays, one inner array per item, with the properties in the requested order. No keys, only positional values. - `.valueMap([prop, ...])` — emits an array of objects keyed by property name. Pass `null` (or omit the argument) for "every property". `valueMap` is the default choice for service-facing routes; `values` is handy when the caller does its own positional decoding. `$id`, `$label`, `$from`, `$to`, `$distance`, and `$score` are **virtual fields**: they are always available even though they are not declared on your schema. `$from`/`$to` appear on edge streams; `$distance`/`$score` appear on ranked vector / text hits. Filtered `values(...)` and `valueMap(...)` also accept dotted paths into object properties, such as `metadata.externalID`. `valueMap(null)` returns stored top-level properties as-is and does not flatten nested objects. When you request a dotted path explicitly, the returned object is keyed by the requested source string unless you rename it with `project(...)`. ```ts g().nWithLabel("User").valueMap(["$id", "metadata.externalID"]); ``` ```rust g().n_with_label("User") .value_map(Some(vec!["$id", "metadata.externalID"])); ``` ```go helix.G().NWithLabel("User").ValueMap("$id", "metadata.externalID") ``` ## `.project(...)` with renames and expressions `project([...])` is the most flexible terminal — it accepts a list of items, each one either: - `PropertyProjection.new(prop)` — emit `prop` under its own name. - `PropertyProjection.renamed(source, alias)` — emit `source` under a different key. - `ExprProjection.new(alias, expr)` — emit a computed expression under `alias`. Use it when you want to flatten ids out from under `$id`, return a different name than the schema's, or compute a derived field. `PropertyProjection` and `Expr.prop(...)` use the same property lookup rules as filters. This means nested object fields can be projected and renamed: ```ts g().nWithLabel("User").project([ PropertyProjection.renamed("metadata.externalID", "external_id"), ]); ``` ```rust g().n_with_label("User").project(vec![ PropertyProjection::renamed("metadata.externalID", "external_id"), ]); ``` ```go helix.G().NWithLabel("User").Project( helix.ProjectPropAs("metadata.externalID", "external_id"), ) ``` Dotted paths are also valid in fallback ordering, for example `.orderBy("metadata.score", Order.Desc)`, but they are not backed by secondary indexes in V1. A typical `project` call mixes plain property pulls with one or two renames: Each entry in `Project` is a bare object with `source` / `alias` (for `PropertyProjection`) or `alias` / `expr` (for `ExprProjection`). There is no enum tag around them — the `Projection` enum is untagged on the wire. ### Expressions for `ExprProjection` `Expr` is a small arithmetic-and-conditionals algebra: - Sources: `Expr.prop(name)`, `Expr.val(value)`, `Expr.id()`, `Expr.timestamp()`, `Expr.datetime()`, `Expr.param(name)`. - Arithmetic: `.add(other)`, `.sub(other)`, `.mul(other)`, `.div(other)`, `.modulo(other)`, `.neg()`. - Branching: `Expr.case([[predicate, expr], ...], elseExpr)`. A small derived field — fall back to `"unknown"` when the `tier` property is null: For time-relative comparisons against "now" (e.g. "rows newer than 30 days"), use `Expr.timestamp()` (server epoch millis) or `Expr.datetime()` (typed datetime) on one side of a `Predicate.compare(left, CompareOp.*, right)`. Arithmetic on `Expr` returns another `Expr`, so you can chain `Expr.timestamp().sub(Expr.val(30 * 86_400 * 1000))` to compute a cut-off server-side. ## Edge property output For edge streams, `.edgeProperties()` emits each edge as an object with all of its properties plus the virtual `$id`, `$label`, `$from`, `$to`, `$distance`, and `$score` fields when those fields are present on the current ranked edge stream. ## Aggregations: `.group`, `.groupCount`, `.aggregateBy` For grouped counts and aggregate statistics: - `.group(prop)` — bucket the stream by `prop`, emit `{groupValue: [items...]}`. - `.groupCount(prop)` — bucket and count, emit `{groupValue: count}`. - `.aggregateBy(AggregateFunction.*, prop)` — apply a reducer to one property: `Count`, `Sum`, `Min`, `Max`, `Mean`. `AggregateFunction` values are serialized as strings: `"Count"`, `"Sum"`, `"Min"`, `"Max"`, `"Mean"`. ## Next Steps # Mutations `readBatch()` only accepts read traversals — the TypeScript type system rejects any mutating step. To create nodes, add edges, or update properties, switch to `writeBatch()`. Inside a `writeBatch`, every `varAs` may chain any read step *or* any write step. The typestate machinery flips into "write" mode the moment a mutation appears, so a write batch can also do plain reads (look up an existing user, then update them). ## Adding nodes `g().addN(label, properties)` inserts a new node and emits a single-item stream — the new node. Property values can be passed as a plain object (most common) or as a list of `[name, PropertyInput]` tuples (matches the Rust `vec![("name", value)]` shape). A few wire-format details: - The `properties` array preserves insertion order — `Map`-style, not `Object`-style. - Each value is wrapped in a `PropertyInput`. A literal becomes `{"Value": {"String": ...}}`; a parameter becomes `{"Expr": {"Param": "..."}}` (covered in [Parameters & bundles](/database/querying-guide/parameters-bundles)). - The result of `addN` is a node stream, so the same chain can keep going — here, `.project([...])` collects the new id. ## Nested object and array properties Node and edge properties can store object and array values. Use nested objects for metadata you want to return or filter by scan-time dotted paths; keep frequently indexed values as top-level properties. Nested property lookup is exact-first. A top-level property literally named `metadata.externalID` is read before walking the nested `metadata` object. Dotted paths only walk object values; arrays are returned as values and do not support path syntax such as `tags.0`. ## Adding edges between bound nodes `addE` joins two nodes by label. The `to` argument is a `NodeRef` — most often `NodeRef.var("...")` pointing at a node bound earlier in the same batch. Every entry in a `writeBatch` shares the same transaction — either all three writes above land, or none do. There is no manual `BEGIN`/`COMMIT`; see [Transactions](/database/querying#transactions) for details. ## Updating properties Two single-property steps update an existing node or edge: - `.setProperty(name, value)` — set or overwrite. - `.removeProperty(name)` — unset. Both expect a node or edge stream (so you anchor first, then call), and both leave the stream unchanged so you can keep chaining or call `.count()` to confirm how many rows were touched. The envelope's `request_type` is `"write"`. Any mutation step anywhere in the AST makes the request a write — `.toDynamicJson` on a `writeBatch` picks the right value automatically. ## Deleting nodes and edges Three drop steps cover the common cases. All require an anchored stream first. - `.drop()` — delete the current nodes (and the edges they participate in). - `.dropEdge(to)` — from a node stream, delete edges going to a specific other node (`NodeRef`). Optionally label-scoped via `.dropEdgeLabeled(to, label)`. - `.dropEdgeById(EdgeRef)` — delete edges by id; the safe choice when the same node pair could have multiple parallel edges (multigraph mode). The same pattern works for `.drop()` to remove a node entirely — `.nWithLabel("Post").where(...).drop()` will remove matching posts plus every edge incident to them. ## Conditional writes with `varAsIf` Sometimes a write should only run when a previous step did or did not produce results. `.varAsIf(name, condition, traversal)` attaches a `BatchCondition` to a step so it only executes when the named variable is non-empty, empty, or has at least N items. The most common pattern is upsert: load the row, update if found, create if not. The full set of conditions: - `BatchCondition.varNotEmpty(name)` — run when `name` produced ≥ 1 result. - `BatchCondition.varEmpty(name)` — run when `name` produced 0 results. - `BatchCondition.varMinSize(name, n)` — run when `name` produced ≥ `n` results. - `BatchCondition.prevNotEmpty()` — same as `varNotEmpty` but for the immediately preceding step, useful when you don't want to name it. ## Next Steps # Vector and text search The DSL exposes two kinds of search source steps that return scored hits instead of a plain label scan: **vector search** (approximate k-NN over a float-array property) and **text search** (BM25 over a string property). Both are available on nodes and edges, both expose a literal form and a parameter-bound `_with` form, and both can be the first step of a longer traversal — once you have hits, you can keep walking the graph. This page assumes the indexes exist. Index creation is covered first, then queries. ## Creating indexes Indexes are graph mutations, so they live in a `writeBatch`. The unified `createIndexIfNotExists(IndexSpec.*)` step covers every variant and is idempotent — safe to run on each deploy. `IndexSpec` covers every shape the runtime supports: | Spec | What it is | |---|---| | `nodeEquality(label, prop)` | Secondary equality index. | | `nodeUniqueEquality(label, prop)` | Equality index with a uniqueness constraint. | | `nodeRange(label, prop)` | Range / ordering index. | | `nodeVector(label, prop, tenant?)` | ANN vector index. | | `nodeText(label, prop, tenant?)` | BM25 text index. | | `edgeEquality` / `edgeRange` / `edgeVector` / `edgeText` | Same shapes for edges. | `tenant?` is the multi-tenancy partition key — see the bottom of this page. ## Vector search `vectorSearchNodes(label, property, queryVector, k, tenantValue?)` returns the top `k` nodes whose `property` is closest to `queryVector`. Hits arrive in ascending distance order with a virtual `$distance` field projected onto each result. `$distance` is only present on the direct hit stream. The instant you step off it with `.out()`, `.in()`, `.both()`, `.outN()`, etc., the score is gone — project it *before* you traverse. ## Following the graph from a hit Vector hits are a normal node stream, so the next step can keep walking. The example below finds posts similar to a query vector, then returns their authors with the post's distance projected as an annotation. The two `varAs` blocks duplicate the search step because each is a fresh traversal binding. If you only want the authors and don't need the hits themselves in the response, drop the first `varAs` and remove `"hits"` from `returning(...)`. ## Text search (BM25) `textSearchNodes(label, property, queryText, k, tenantValue?)` runs BM25 over the named text-indexed property and returns the top `k` matching nodes. The interface mirrors `vectorSearchNodes` exactly, including the `$distance` virtual field (here it represents an inverted relevance score — smaller is more relevant, just like vector distance). ## Parameter-bound search For routes that take the query vector / text and `k` from the request, switch to the `_with` siblings. They accept `PropertyInput.param(...)` for the query and a `StreamBound` (literal or `Expr`-wrapped) for `k`. `textSearchNodesWith(...)` has the identical shape — just replace `PropertyInput.param("query_vector")` with `PropertyInput.param("query_text")` and declare the parameter as `param.string()`. ## Multi-tenancy If the index was created with a `tenantProperty`, every search must supply a matching `tenantValue` (the partition key). Pass it as the fifth argument to `vectorSearchNodes` / `textSearchNodes`, or wrap it in `PropertyInput.param("...")` inside the `_with` variants. A search that omits the tenant against a tenant-scoped index will fail to find any rows even if the data exists. See [Multi-Tenancy](/database/multi-tenancy) for the full partitioning model. ## Next Steps # Advanced patterns This page covers the steps that branch, repeat, or fan-out inside a single traversal: `repeat`, `union`, `choose`, `coalesce`, `optional`, and — at the batch level — `forEachParam`. All five sub-traversal-shaped steps share one helper: `sub()`, which opens a fresh traversal builder that does not start with `g()`. `sub()` is to in-line traversals what `g()` is to top-level ones. It has the same chainable surface (`out`, `where`, `dedup`, etc.) but it is not a complete query on its own — it is always passed into a parent step. ## `repeat`: variable-length traversal `.repeat(RepeatConfig.new(sub()...).times(n))` walks the same sub-traversal `n` times, threading the output of each iteration into the input of the next. By default, only the final frontier is emitted. Add `.emitAll()` to keep every intermediate frontier; add `.emitBefore()` / `.emitAfter()` to keep only one side. A two-hop reachability example: from Alice, who is followed-of-followed-by within three hops? The `max_depth: 100` field in the JSON is a safety ceiling — change it by chaining `.maxDepth(n)` on the `RepeatConfig`. To stop early when a predicate is satisfied instead of after `n` steps, swap `.times(3)` for `.until(predicate)`. ## `union`: combine multiple sub-traversals `.union([sub()..., sub()...])` runs each sub-traversal from the current stream and emits the concatenation. Use it when "the result is either A or B" — e.g. a personal feed that mixes posts you authored *and* posts you liked. ## `choose`: per-item if/else `.choose(predicate, thenSub, elseSub?)` evaluates the predicate per item and routes each item through the matching sub-traversal. The optional `else` arm defaults to "drop the item" — pass `null` (or omit it in Rust) for that. Route paid-tier users through a premium feed edge and everyone else through the default feed, then project the same fields from whichever branch matched: ## `coalesce`: first non-empty branch wins `.coalesce([sub()..., sub()..., sub()...])` tries each sub-traversal in order per item and returns the first one that produces results. Useful for "prefer A, fall back to B". ```ts g() .n(NodeRef.var("user")) .coalesce([ sub().out("PRIMARY_FEED"), sub().out("DEFAULT_FEED"), ]) ``` The JSON form: ```json { "Coalesce": [ { "steps": [{ "Out": "PRIMARY_FEED" }] }, { "steps": [{ "Out": "DEFAULT_FEED" }] } ] } ``` ## `optional`: keep the input even if the sub-traversal is empty `.optional(sub())` runs the sub-traversal and lets each item through unchanged if the sub-traversal produced no result. Think SQL `LEFT JOIN`: you get the original row whether or not the optional walk matched anything. ```ts g().nWithLabel("User").optional(sub().out("POSTED")); ``` ```json { "Optional": { "steps": [{ "Out": "POSTED" }] } } ``` ## Batch-level conditionals: `varAsIf` Inside a `readBatch` / `writeBatch`, `.varAsIf(name, BatchCondition.*, traversal)` makes a whole `varAs` step conditional on the result of an earlier one. This is the upsert pattern from the [Mutations](/database/querying-guide/mutations#conditional-writes-with-varasif) page, but it's just as useful on the read side — fetch related rows only if the primary lookup produced something. ```ts readBatch() .varAs("user", g().nWhere(SourcePredicate.eq("username", "alice"))) .varAsIf( "posts", BatchCondition.varNotEmpty("user"), g().n(NodeRef.var("user")).out("AUTHORED").valueMap(["$id", "title"]), ) .returning(["user", "posts"]); ``` In the JSON, the `condition` field on the conditional `Query` entry carries the constraint: ```json { "Query": { "name": "posts", "steps": [ { "N": { "Var": "user" } }, { "Out": "AUTHORED" }, { "ValueMap": ["$id", "title"] } ], "condition": { "VarNotEmpty": "user" } } } ``` ## Fan-out over array parameters: `forEachParam` `.forEachParam("paramName", body)` iterates over an array-of-object parameter and runs `body` once per element. Inside `body`, each object's keys are addressable as plain `PropertyInput.param(key)` references. This is the standard shape for bulk-insert routes. A few notes specific to `forEachParam`: - The body is a *whole* batch (`readBatch()` / `writeBatch()`), not a sub-traversal. This lets you bind multiple variables per iteration if needed. - Bindings inside the loop body are *per-iteration*. The outer `returning([...])` receives the per-iteration results collected into one array. - The parameter must be typed `{"Array": "Object"}` — an array of plain JSON objects. `defineParams({ users: param.array(param.object(...)) })` produces this shape automatically. - Each loop iteration exposes the current object's top-level fields as scoped parameters. If a field itself is an object, pass it through as a whole property value, for example `PropertyInput.param("metadata")`, then query the stored nested fields later with dotted paths such as `metadata.externalID`. ## Next Steps # Parameters and bundles So far every parameterized example on the prior pages has shown `p = paramsObject` / `Predicate.eqParam("prop", "name")` / `toDynamicJson(params, values, options)` without dwelling on the parameter system itself. This page is the dedicated treatment: how to declare parameters, how to refer to them in traversals, and how to ship a parameterized query to the runtime. ## SDK workflows TypeScript, Rust, and Python support both direct dynamic requests and registered bundle workflows. Go v1 is dynamic-first: write normal Go functions that return `helix.Request`, set the query name with `helix.ReadQuery("name")` or `helix.WriteQuery("name")`, and declare runtime parameters inline on the query builder. ```go func FindUsers(tenantID string, limit int64) helix.Request { q := helix.ReadQuery("find_users") tenant := q.ParamString("tenant_id", tenantID) maxRows := q.ParamI64("limit", limit) return q. VarAs("users", helix.G(). NWithLabel("User"). Where(helix.PredEq("tenantId", tenant)). Limit(maxRows). ValueMap("$id", "name", "tenantId"), ). Returning("users") } var out FindUsersResponse err := client.Exec(ctx, FindUsers("acme", 25), &out) ``` The Go helpers insert both `parameters` and `parameter_types` in the dynamic envelope. Use `q.ParamString`, `q.ParamI64`, `q.ParamF64`, `q.ParamF32`, `q.ParamBool`, `q.ParamDateTime`, `q.ParamValue`, `q.ParamObject`, and `q.ParamArray`. Do not use `.With(...)`, `WithQueryName(...)`, or stored-query bundle APIs in the Go v1 workflow. Use `MarshalRequest(req)` only for tests, parity fixtures, or debugging. In Go, direct values are literals in the inline AST. `helix.SourceEq("id", "foo")` and `helix.PredEq("id", "foo")` embed `"foo"` directly; they do not create a runtime parameter. For values that change per request, declare a `q.Param*` value and pass the returned ref, for example `id := q.ParamString("id", userID)` and `helix.SourceEq("id", id)`. That keeps the request shape stable for server-cache reuse. Python mirrors the TypeScript/Rust direct dynamic and bundle workflow with snake_case helpers: ```python from helixdb import Predicate, define_params, g, param, read_batch params = define_params({"tenant_id": param.string(), "limit": param.i64()}) def find_users(p=params): return ( read_batch() .var_as( "users", g() .n_with_label("User") .where(Predicate.eq("tenantId", p.tenant_id)) .limit(p.limit) .value_map(["$id", "name", "tenantId"]), ) .returning(["users"]) ) request = find_users().to_dynamic_request(params, {"tenant_id": "acme", "limit": 25}, query_name="find_users") ``` ## Declaring parameters with `defineParams` / `define_params` This section applies to TypeScript and Python bundle and dynamic helpers. Rust derives a similar schema from registered function arguments. Go declares the same schema inline through `q.Param*` methods as shown above. `defineParams({...})` in TypeScript and `define_params({...})` in Python take a record of parameter names to `ParamSchema` constructors and return a proxy. Each property of the returned proxy is a `ParamRef` you can pass into builder methods that expect one. ```ts TypeScript const params = defineParams({ tenant_id: param.string(), username: param.string(), limit: param.i64(), query_vector: param.array(param.f32()), created_after: param.dateTime(), }); ``` ```python Python from helixdb import define_params, param params = define_params({ "tenant_id": param.string(), "username": param.string(), "limit": param.i64(), "query_vector": param.array(param.f32()), "created_after": param.date_time(), }) ``` The supported constructors share the same wire types; Python uses snake_case for the datetime helper: | TypeScript | Python | Wire type | Accepted dynamic values | |---|---|---|---| | `param.bool()` | `param.bool()` | `Bool` | `boolean` / `bool` | | `param.i64()` | `param.i64()` | `I64` | safe integer, `bigint`, or Python `int` | | `param.f64()` / `param.f32()` | `param.f64()` / `param.f32()` | `F64` / `F32` | finite number / `float` | | `param.string()` | `param.string()` | `String` | `string` / `str` | | `param.dateTime()` | `param.date_time()` | `DateTime` | `DateTime`, RFC3339 string, or epoch-millis integer; Python also accepts `datetime.datetime` | | `param.bytes()` | `param.bytes()` | `Bytes` | declarable, but dynamic JSON requests reject bytes values | | `param.value()` | `param.value()` | `Value` | any JSON value | | `param.object(inner?)` | `param.object(inner=None)` | `Object` | object values matching the optional inner schema | | `param.array(inner)` | `param.array(inner)` | `{Array: inner}` | arrays/lists matching the inner schema | `param.array(param.f32())` is what produces the `{"Array": "F32"}` shape required for vector parameters. Use `param.array(param.object(...))` for row-style inputs to `forEachParam` / `for_each_param`. ## Referring to parameters inside a traversal There are three ways a parameter shows up inside a query, depending on what kind of value the builder expects: 1. **Bound to a step argument that takes a `ParamRef`** - for `.limit(p.limit)`, `.skip(p.skip)`, etc. The proxy returned by `defineParams` / `define_params` is the value to pass. 2. **Inside a predicate** - TypeScript offers `Predicate.eqParam("prop", "param_name")` and the rest of the `*Param` family for string-keyed parameter refs. Python can pass the `ParamRef` directly, for example `Predicate.eq("tenantId", p.tenant_id)`. 3. **As a property value or expression** - `PropertyInput.param("name")`, `Expr.param("name")`, `NodeRef.param("name")`, and `EdgeRef.param("name")` all take the name as a string. In Python, `p.tenant_id.input()` is a shorthand when a method expects a `PropertyInput`. In practice the proxy is mostly used to keep parameter names in sync - if you rename the field in `defineParams` / `define_params`, every reference through the proxy moves with it. For string-keyed refs (`PropertyInput.param("name")`), the name has to match exactly. A common convention is to assign the proxy to a `p` argument and read both forms from there: ## Shipping a parameterized query A query built with `readBatch()` / `writeBatch()` is sent to the runtime as a dynamic request. There are a few ways to produce that request — straight from a batch, from a registered bundle, or through the typed `call` helpers — but they all serialize to the same dynamic envelope you POST to `/v1/query`. ### 1. From a batch directly The simplest case: build the batch, call `.toDynamicJson(params, values, options)` in TypeScript or `.to_dynamic_json(params, values, query_name=...)` in Python, then POST the resulting body to `/v1/query`. No deploy step. Best for ad-hoc queries, prototyping, and one-off integrations. ```ts const params = defineParams({ username: param.string() }); function userByUsername(p = params) { return readBatch() .varAs( "user", g().nWithLabel("User") .where(Predicate.eqParam("username", "username")) .valueMap(["$id", "username", "tier"]), ) .returning(["user"]); } ``` The dynamic envelope produced by `.toDynamicJson(...)` / `.to_dynamic_json(...)` is the body for `POST /v1/query`. Below are the five equivalent ways to send it: For larger envelopes, write the JSON to a file and use `--data-binary @file.json` — the inline `--data-raw` form works for small bodies but gets unwieldy fast. The three serialization helpers on a batch: | TypeScript | Python | Returns | |---|---|---| | `.toDynamicJson(params?, values?, options?)` | `.to_dynamic_json(params=None, values=None, query_name=...)` | JSON string (full envelope: `request_type`, `query_name`, `query`, `parameters`, `parameter_types`) | | `.toDynamicRequest(params?, values?, options?)` | `.to_dynamic_request(params=None, values=None, query_name=...)` | `DynamicQueryRequest` instance - useful when you want to mutate it before serializing | | `.toDynamicBytes(params?, values?, options?)` | `.to_dynamic_bytes(params=None, values=None, query_name=...)` | UTF-8 bytes of the JSON encoding | Pass `{ queryName: "..." }` as the final TypeScript options argument, or `query_name="..."` in Python, when you want gateway logs and query diagnostics to identify a direct inline request. Direct requests without a name serialize `query_name: null`, which the gateway records as the fallback dynamic name `__dynamic__`. For a parameter-less batch, omit `params` and `values`; pass only the name option if you want to name the request: `.toJsonString()` / `.to_json_string()` returns the *raw batch* (no envelope, no parameters), the form that goes into a bundle - covered below. ### 2. Registered bundles Registered bundles are available in TypeScript, Rust, and Python. Go v1 does not generate or register bundles; keep Go application queries dynamic-first with `client.Exec(ctx, request, &out)`. To keep a set of queries organized and generate a single `queries.json` artifact, wrap each query function with `registerRead` or `registerWrite` and pass them all to `defineQueries`. The bundle's `generate(path)` method writes a single `queries.json` in the exact shape the runtime expects. The resulting bundle has this top-level shape: ```json { "version": 4, "read_routes": { "user_by_username": { "queries": [...], "returns": ["user"] } }, "write_routes": { "add_user": { "queries": [...], "returns": ["newUser"] } }, "read_parameters": { "user_by_username": [{ "name": "username", "ty": "String" }] }, "write_parameters": { "add_user": [ { "name": "username", "ty": "String" }, { "name": "tier", "ty": "String" } ] } } ``` Each `read_routes.` / `write_routes.` is exactly what `.toJsonString()` would have produced for the underlying batch — no envelope. To execute one of these queries, produce its dynamic envelope through the typed `call` helpers below and POST it to `/v1/query`. Route names must be unique across reads *and* writes; duplicates throw `GenerateError` at bundle time. ### 3. Typed call helpers `defineQueries(...)` / `define_queries(...)` also exposes a `call` map. Each entry is a function that takes the route's parameter values and returns a serializable `DynamicQueryRequest`. TypeScript infers the input type from `defineParams`; Python validates unknown keys, missing keys, and wrong value types at call time. The returned object serializes to the same dynamic envelope you would build by hand, so you POST it to `/v1/query`: `queries.call.*` keeps parameter names and types in sync with `defineParams` / `define_params`, so it is the most convenient way to send a registered query - in application code, tests, and scripts alike. The returned dynamic request sets `query_name` to the route key (`user_by_username` in this example) automatically. ## How parameters serialize Inside the AST, parameter references appear as `{"Param": "name"}` wrapped in the spot the value would go: `{"Limit": ... }` becomes `{"LimitBy": {"Param": "n"}}`, `{"Has": ["k", value]}` becomes `{"Where": {"Compare": {"left": {"Property": "k"}, "op": "Eq", "right": {"Param": "k"}}}}`, and so on. The [Filtering](/database/querying-guide/filtering) page tabulates every shape. At envelope position, the actual value is a *bare* JSON value, *not* a tagged `PropertyValue`. The shape `{"Array": "F32"}` you see in the `parameter_types` map is `QueryParamType`, the runtime coercion schema. A complete envelope produced by `toDynamicJson` / `to_dynamic_json`: ```json { "request_type": "read", "query_name": "user_by_username", "query": { "queries": [...], "returns": [...] }, "parameters": { "username": "alice", "limit": 25, "created_after": "2026-04-05T10:34:56.789Z", "labels": { "status": "active" } }, "parameter_types": { "username": "String", "limit": "I64", "created_after": "DateTime", "labels": "Object" } } ``` Nested object and array parameter values are still bare JSON at envelope position. When a parameter is used as a property value, the runtime converts that JSON object into a stored object property: ```ts const params = defineParams({ metadata: param.object(param.value()) }); writeBatch().varAs( "user", g().addN("User", { metadata: PropertyInput.param("metadata") }), ); ``` ```json { "parameters": { "metadata": { "externalID": "crm-42", "preferences": { "locale": "en-US" }, "tags": ["trial", 7] } }, "parameter_types": { "metadata": "Object" } } ``` After storage, query nested object fields with dotted paths such as `metadata.externalID`. The top-level `parameters` map itself does not use tagged `PropertyValue` wrappers. ## Number handling: `bigint` for `i64` JavaScript `number` only has 53 bits of integer precision. The TypeScript DSL accepts a plain `number` wherever it fits in a safe integer range, but for full `i64` (node ids, large counts, epoch nanos) you should pass a `bigint`: ```ts g().n(9223372036854775807n); PropertyValue.i64(9223372036854775807n); ``` Python `int` values are arbitrary precision, so the Python SDK validates the value against the `i64` range and serializes it directly. `stringifyJson`, `.toJsonString()` / `.to_json_string()`, and `serializeQueryBundle` preserve large integers without loss. Plain `JSON.stringify` does not - use the package serializers when your TypeScript payload may contain large integers. ## DateTime handling `DateTime` is the dedicated value type for timestamps. It stores epoch milliseconds and round-trips losslessly through the dynamic envelope as a UTC RFC3339 string. ```ts DateTime.fromMillis(-1).toRfc3339(); // "1969-12-31T23:59:59.999Z" DateTime.parseRfc3339("2026-04-05T12:34:56.789+02:00").toRfc3339(); // "2026-04-05T10:34:56.789Z" (normalized to UTC) ``` When a parameter is declared as `param.dateTime()` in TypeScript or `param.date_time()` in Python, the call helpers accept `DateTime`, an RFC3339 string, or an epoch-millis integer. Python also accepts `datetime.datetime`: All of these values serialize to the same envelope: ```json { "parameters": { "since": "2026-01-01T00:00:00.000Z" }, "parameter_types": { "since": "DateTime" } } ``` ## Bytes parameters `param.bytes()` is declarable in `defineParams` / `define_params`, but the dynamic envelope cannot round-trip binary values faithfully. Attempting to call `toDynamicJson(params, { payload: someBytes })`, `to_dynamic_json(params, {"payload": b"..."})`, or to build a request with a bytes parameter through `queries.call.*` throws an unsupported-bytes dynamic query error. If you need to pass binary data, send it through a binary-aware client (the runtime's RPC interface) rather than the JSON dynamic envelope, which is the bottleneck. ## Generating the bundle from a script The `queries.generate(path)` shortcut is enough for most projects. If you want to inspect or post-process the bundle first, the lower-level helpers are exported too: `deserializeQueryBundle` / `deserialize_query_bundle` validates `version === 4` (the current bundle version); it will reject older or newer bundles so you catch mismatches at load time rather than at deploy time. ## Next Steps # Secondary Indexes Secondary indexes accelerate property-based filtering on the labeled property graph. Two flavors are available, each on both nodes and edges: - **Equality indexes** map `property = value` to the set of matching IDs. Use them for exact-match lookups like "all users with `status = active`." - **Range indexes** provide ordered scans over numeric and string properties. Use them for comparisons like `timestamp > T` or `score >= 4`. Secondary indexes are opt-in. They speed up reads at the cost of additional write overhead and storage. Only index properties that are frequently filtered on. Unindexed properties remain queryable via `.where_(Predicate::...)`, but the route will scan the label rather than seek the index. Indexes address top-level properties only. Nested fields (e.g. `metadata.externalID`) can be filtered with a dotted path but are scan-only in V1 — store frequently indexed metadata as top-level properties. The DSL fragments below are bare traversals. To execute them, wrap each one in a `read_batch()` or `write_batch()` route as shown in [Querying](/database/querying). ## Equality Index — Nodes Declare the index as part of your schema setup write batch: ```rust g().create_index_if_not_exists( IndexSpec::node_equality("User", "status"), ) ``` Insert a node carrying the indexed property: ```rust g().add_n( "User", vec![ ("userId", PropertyInput::from("u-42")), ("status", PropertyInput::from("active")), ], ) ``` Query through the index using the source-predicate form. `n_with_label_where` pushes the predicate down to the index so the route never scans the full label: ```rust g().n_with_label_where( "User", SourcePredicate::eq("status", "active"), ) ``` ## Equality Index — Edges Declare an equality index on an edge property: ```rust g().create_index_if_not_exists( IndexSpec::edge_equality("FOLLOWS", "since_year"), ) ``` Create an edge between two known nodes. `add_e` is called from a node-state traversal — the current node is the edge's source, and the second argument is the target: ```rust g().n(NodeRef::param("source")) .add_e( "FOLLOWS", NodeRef::param("target"), vec![("since_year", PropertyInput::from(2024i64))], ) ``` Query the edge index directly with `e_with_label_where`: ```rust g().e_with_label_where( "FOLLOWS", SourcePredicate::eq("since_year", 2024i64), ) ``` ## Range Index — Nodes Declare a range index on a numeric or string property: ```rust g().create_index_if_not_exists( IndexSpec::node_range("Event", "timestamp"), ) ``` Range indexes use the same insert path as equality indexes; the property values are simply ordered by the index. Query with any of `gt`, `gte`, `lt`, `lte`, or `between`: ```rust g().n_with_label_where( "Event", SourcePredicate::gt("timestamp", 1_700_000_000i64), ) ``` ## Range Index — Edges ```rust g().create_index_if_not_exists( IndexSpec::edge_range("RATED", "score"), ) ``` ```rust g().e_with_label_where( "RATED", SourcePredicate::gte("score", 4i64), ) ``` ## See Also For tenant-scoped equality lookups using the same secondary-index machinery, see [Multi-Tenancy](/database/multi-tenancy). # Vector Indexes Vector indexes enable approximate nearest neighbor (ANN) search on numeric array properties. Helix supports vector indexes on both nodes and edges. Values are normalized to float32 for indexing, and supported distance metrics are cosine, euclidean, and manhattan. The query vector must match the configured index dimension exactly. Vector index properties are top-level properties. Nested object fields are not indexed as vectors; store embeddings directly on the node or edge property you pass to the vector-search builder. Vector search is approximate by design, trading a small amount of recall for significantly faster search over large datasets. The third argument to the index-creation builders is an optional tenant property name; pass `None::<&str>` for a global index, or see [Multi-Tenancy](/database/multi-tenancy) for the partitioned variant. The DSL fragments below are bare traversals. To execute them, wrap each one in a `read_batch()` or `write_batch()` route as shown in [Querying](/database/querying). ## Vector Index — Nodes Declare the index. The second argument is the property that holds the embedding: ```rust g().create_vector_index_nodes("Doc", "embedding", None::<&str>) ``` Insert a node with both the application data and the embedding. `PropertyInput::from(Vec)` works through the generic conversion path from any type that `PropertyValue` accepts: ```rust g().add_n( "Doc", vec![ ("title", PropertyInput::from("Intro to vector search")), ("embedding", PropertyInput::from(vec![0.12f32, 0.85, -0.04])), ], ) ``` kNN search returns the top `k` nearest nodes. Pass `None::` for the tenant argument when the index is not partitioned: ```rust g().vector_search_nodes( "Doc", "embedding", vec![0.12f32, 0.85, -0.04], 10, None::, ) ``` ## Vector Index — Edges Edges can carry embeddings too — useful when the relationship itself is what you want to retrieve by similarity (e.g., citation context, relation-typed embeddings). ```rust g().create_vector_index_edges("CITES", "embedding", None::<&str>) ``` ```rust g().n(NodeRef::param("source")) .add_e( "CITES", NodeRef::param("target"), vec![("embedding", PropertyInput::from(vec![0.12f32, 0.85, -0.04]))], ) ``` ```rust g().vector_search_edges( "CITES", "embedding", vec![0.12f32, 0.85, -0.04], 10, None::, ) ``` ## Parameterized Searches The plain `vector_search_nodes` / `vector_search_edges` builders take a literal `Vec` and a `usize` for `k`. For routes that receive the query vector and limit from request parameters, use the `_with` variants. They accept `PropertyInput::param(...)` for the vector and `Expr::param(...)` for `k`: ```rust g().vector_search_nodes_with( "Doc", "embedding", PropertyInput::param("queryVector"), Expr::param("limit"), None::, ) ``` The same pattern applies to `vector_search_edges_with`. # Text Indexes Text indexes provide BM25 full-text search on node and edge properties. They are stored durably like other indexes rather than rebuilt transiently per query. Indexed values can be `String` or `StringArray`, with configurable analyzer presets and optional term positions. Text index properties are top-level properties. Nested object fields are not flattened into BM25; store searchable text directly on the property you pass to the text-search builder. The third argument to the index-creation builders is an optional tenant property name; pass `None::<&str>` for a global index. Tenant-partitioned text indexes currently require the partition property name to be `tenant_id` — see [Multi-Tenancy](/database/multi-tenancy) for the partitioned variant. The DSL fragments below are bare traversals. To execute them, wrap each one in a `read_batch()` or `write_batch()` route as shown in [Querying](/database/querying). ## Text Index — Nodes Declare the index. The second argument is the text property: ```rust g().create_text_index_nodes("Doc", "body", None::<&str>) ``` Insert a node carrying the indexed text: ```rust g().add_n( "Doc", vec![ ("title", PropertyInput::from("Distributed databases")), ("body", PropertyInput::from("CAP theorem and consensus protocols")), ], ) ``` Run a BM25 search. The third argument is the query string, the fourth is `k`: ```rust g().text_search_nodes( "Doc", "body", "consensus protocols", 10, None::, ) ``` ## Text Index — Edges Edges can be text-indexed when the relationship carries searchable content — for example a reviewer's comment on a paper: ```rust g().create_text_index_edges("Reviewed", "comment", None::<&str>) ``` ```rust g().n(NodeRef::param("source")) .add_e( "Reviewed", NodeRef::param("target"), vec![("comment", PropertyInput::from("Thorough write-up"))], ) ``` ```rust g().text_search_edges( "Reviewed", "comment", "thorough write-up", 10, None::, ) ``` ## Parameterized Searches For routes that receive the query string and limit from request parameters, use the `_with` variants. They accept `PropertyInput::param(...)` for the query text and `Expr::param(...)` for `k`: ```rust g().text_search_nodes_with( "Doc", "body", PropertyInput::param("query"), Expr::param("limit"), None::, ) ``` The same pattern applies to `text_search_edges_with`. # Multi-Tenancy Multi-tenant architectures generally fall into one of three isolation levels. ## Isolation Levels | Level | Description | Isolation | Resource efficiency | | ------------------------ | --------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------- | -------------------------------------------------------------------- | | **Infrastructure-level** | Separate database instance per tenant. Fully independent compute, storage, and networking. | Strongest. No shared resources. | Lowest. Each tenant pays the full cost of an idle cluster. | | **Namespace-level** | Shared infrastructure, separate logical partitions (databases, schemas, or namespaces) per tenant. | Strong. Logical separation with shared compute. | Moderate. Shared compute, but metadata and indexes scale per tenant. | | **Row-level** | Shared infrastructure, shared data structures. Tenants distinguished by a property on every record, enforced at query time. | Application-enforced. Relies on consistent query-time filtering. | Highest. All tenants share indexes, caches, and storage. | ## Row-Level Isolation in Helix Cloud Helix Cloud focuses on row-level isolation, which lets you implement any tenancy model at the application layer without structural constraints on the database. Adding a tenant is a write, not a provisioning event. Assign a tenant identifier as a property on every node and edge, index it with an equality index, and filter every query on it so each request only sees its own tenant's data. This is the same mechanism as any property-based filtering: secondary indexes make the lookup fast, and snapshot isolation keeps concurrent tenants from interfering. The model stays shared infrastructure throughout — one writer fleet, one reader fleet, shared caches and secondary indexes. When search results also need isolating, vector and text search can use [tenant-partitioned indexes](#tenant-partitioned-search-indexes). ### In Practice The pattern is the same regardless of node label: declare an equality index on the tenant property once, attach `tenant_id` to every node and edge that gets written, and filter every read by `tenant_id`. Declare the index once, as part of your schema setup write batch: ```rust g().create_index_if_not_exists( IndexSpec::node_equality("Doc", "tenant_id"), ) ``` Attach `tenant_id` to every node when it is written: ```rust g().add_n( "Doc", vec![ ("tenant_id", PropertyInput::from("acme")), ("title", PropertyInput::from("Acme doc")), ], ) ``` Scope every read by the requesting tenant. Use the source-predicate form so the equality index on `tenant_id` is hit directly: ```rust g().n_with_label_where( "Doc", SourcePredicate::eq("tenant_id", "acme"), ) ``` When the tenant predicate sits mid-traversal — for example after walking edges from a known starting node — use `.where_(Predicate::eq("tenant_id", "acme"))`. The same scope rule applies at every hop: edges that fan out across the graph remain tenant-scoped as long as every step filters on the tenant property. ## What This Provides - **Data isolation.** Queries scoped to a tenant ID never observe another tenant's data. Isolation is enforced by the query layer, not by network or infrastructure boundaries. - **Shared infrastructure.** All tenants share the same writer, readers, caches, and object storage. No per-tenant provisioning, no per-tenant scaling configuration. - **Uniform scaling.** Reader auto-scaling responds to aggregate query load across all tenants. A spike from one tenant benefits from the same scaling that serves all others. - **Index efficiency.** Equality indexes on the tenant property resolve tenant-scoped queries without scanning unrelated data. - **Full flexibility.** The application layer decides how tenancy is modeled, scoped, and enforced. Helix Cloud provides the primitives; the application owns the policy. ## Tenant-Partitioned Search Indexes Tenant-partitioned search indexes are optional. They supplement row-level filtering with separate physical search structures per tenant value. ### Vector Indexes Vector indexes can optionally partition by a configured tenant property name. Helix reads that property from each record and maintains a separate vector index for each distinct property value. For example, if the tenant property name is `tenant_id`, records with different `tenant_id` values land in different vector indexes. This keeps tenant-scoped search working sets smaller and allows vector caches to warm, retain, and evict data at tenant granularity. Tenant-partitioned vector indexes are the preferred way to isolate vector search results when that behavior is required. The DSL surface for tenant-partitioned vector indexes: ```rust // Declare the index. The third argument is the property name Helix // reads from each record to choose the tenant partition. g().create_vector_index_nodes("Doc", "embedding", Some("tenant_id")) // Insert a node with both the tenant property and the embedding. g().add_n( "Doc", vec![ ("tenant_id", PropertyInput::from("acme")), ("title", PropertyInput::from("Acme doc")), ("embedding", PropertyInput::from(vec![1.0f32, 0.0, 0.0])), ], ) // kNN search scoped to a tenant. The tenant value selects the partition; // no additional filtering is needed. g().vector_search_nodes( "Doc", "embedding", vec![1.0f32, 0.0, 0.0], 5, Some(PropertyValue::from("acme")), ) ``` ### Text Indexes Text indexes provide the same model for full-text search. The tenant property still means the property name used to read the partition value from each record. Current text index validation requires that property name to be `tenant_id`, so Helix maintains a separate text index for each distinct `tenant_id` value. This keeps tenant-scoped full-text search working sets smaller and allows local text-search caches to track active tenants more closely. Tenant-partitioned text indexes are the preferred way to isolate full-text search results when that behavior is required. The DSL surface for tenant-partitioned text indexes: ```rust // Declare the text index. Text indexes currently require the partition // property name to be `tenant_id`. g().create_text_index_nodes("Doc", "body", Some("tenant_id")) // Full-text search scoped to a tenant. The tenant value selects the // partition; the search runs only against that tenant's text index. g().text_search_nodes( "Doc", "body", "search query string", 5, Some(PropertyValue::from("acme")), ) ``` ## Considerations - **Noisy neighbors.** Tenants share compute and cache, so a high-volume tenant can affect others' latency. Monitor per-tenant query volume and apply rate limiting on your infrastructure to mitigate. - **Shared search semantics.** Secondary indexes remain shared. Tenant-scoped vector and text searches require an explicit tenant value for the configured tenant property, and the system remains shared infrastructure even when search indexes are partitioned by tenant. ## Stricter Isolation Requirements For workloads that require namespace-level or infrastructure-level tenant isolation (regulatory compliance, data residency, or dedicated-resource SLAs), contact us at [founders@helix-db.com](mailto:founders@helix-db.com) to discuss options. # Getting started with HelixDB CLI ## Agent quickstart If you are starting a new project inside a coding-agent environment, use `helix chef`: ```bash helix chef ``` `helix chef` asks what you want to build, then lets you choose automatic or manual setup. It installs the Helix skills, connects the Helix docs MCP at `https://docs.helix-db.com/mcp`, scaffolds a local project (`helix init local`), starts the `dev` instance, seeds starter data, writes a `HELIX_CHEF_PROMPT.md` build prompt, and launches your coding agent (Claude Code, Codex, or OpenCode) to generate the app from your build intent. ## Local quickstart Edit `examples/request.json` (or write your own JSON files) and run `helix query dev --file …` again to iterate. See the [`helix query`](/cli/command-reference/query) reference for the request shape and validation rules. ## Helix Cloud quickstart To query a Helix Cloud cluster from the CLI: ## Tips ## Next Steps # Local Development Local development runs the prebuilt `ghcr.io/helixdb/enterprise-dev:latest` container and exposes the gateway at `POST /v1/query`. By default, storage is in-memory. Use `--disk` when you want persistent local data backed by a CLI-managed MinIO volume. ## Prerequisites - **Docker** or **Podman** on `PATH`. - The Helix CLI: `curl -sSL "https://install.helix-db.com" | bash`. ## Initial setup For an agent-assisted first app, [`helix chef`](/cli/command-reference/chef) can run this setup end-to-end: it installs Helix skills and the docs MCP, initializes `~/my-first-helix-project`, starts `dev`, seeds starter data, and launches your coding agent to build the app. ```bash helix chef ``` Use the manual flow below when you want to scaffold and run each step yourself. ## Persistent local storage ```bash # One-off disk mode for this run helix start dev --disk # Persist disk mode in helix.toml for a new local instance helix add local --name persistent --disk helix start persistent ``` Disk mode starts a MinIO sidecar, creates the `helix-db` bucket, and stores data in a Helix-managed Docker/Podman volume. `helix stop` removes the containers but keeps the volume. `helix prune ` removes the volume and deletes the persisted local data. ## Iteration loop ```bash # Edit a request file (or write a new one) $EDITOR examples/request.json # Send the request helix query dev --file examples/request.json # Tail container logs in another shell helix logs dev --follow # Stop and restart from a clean state helix restart dev ``` `helix restart` falls back to a fresh `helix start` if the container has been removed. ## Multiple local instances ```bash # Add a second local instance on a different port helix add local --name staging --port 9090 # Run them independently helix start dev helix start staging # See what's running helix status ``` Each instance is isolated by container name and host port. Disk-mode instances also get their own MinIO container, network, and volume. ## Inspecting logs ```bash helix logs dev # one-shot dump from docker/podman logs helix logs dev --follow # stream ``` `--range`, `--start`, and `--end` are Helix Cloud-only and rejected for local instances. ## Cleaning up | Goal | Command | |------|---------| | Stop one instance | `helix stop ` | | Restart one instance | `helix restart ` | | Remove containers, workspace state, and disk-mode volume for one instance | `helix prune ` | | Remove everything Helix-owned, for every local instance | `helix prune --all` (`--yes` in non-TTY) | | Permanently delete an instance from `helix.toml` | `helix delete ` (`--yes` in non-TTY) | [`helix prune`](/cli/command-reference/prune) only touches Helix-managed containers (`helix--` and disk-mode MinIO sidecars), networks, volumes, and the per-instance `.helix/` directory. It never runs a broad `docker/podman system prune`. ## Authoring dynamic queries A request JSON file must contain: - `request_type`: lowercase `"read"` or `"write"`. - `query_name` (optional): top-level operational name for logs and query diagnostics. Missing or `null` falls back to `__dynamic__`. - `query`: the query object — `queries[]` entries such as `{"Query": {"name": "...", "steps": [...], "condition": null}}`, plus a `returns[]` list. - `parameters` (optional): named parameter values. The first step must be a source step (for example `NWhere`) before any terminal step like `Count`. See [`helix query`](/cli/command-reference/query) for the validation rules. ## What next? # Helix Cloud The CLI handles authentication, workspace/project/cluster selection, metadata sync, log retrieval, and dynamic queries against a remote gateway. Cluster provisioning and scaling happen in the Helix control plane, not the CLI. ## Prerequisites - A Helix Cloud account with at least one workspace and Helix Cloud cluster. - A local Helix project (`helix init` in any directory). ## End-to-end flow ## Reference: per-instance auth contract Each `[enterprise.]` block stores the auth contract used by `helix query`: | Field | Default | Purpose | |-------|---------|---------| | `query_auth_header` | `Authorization` | HTTP header name set on every query request. | | `query_auth_env` | `HELIX_API_KEY` | Environment variable the CLI reads to populate the header value. | Override the variable name per-instance by setting `query_auth_env` under `[enterprise.]` and exporting whatever name you choose. ## Common errors For Helix Cloud auth, gateway, and logging errors, see [Troubleshooting](/cli/troubleshooting). ## What next? # Configuration Guide This page documents the configuration files the Helix CLI reads and writes: - `helix.toml` — per-project configuration in your project root. - `~/.helix/config` — user-global state, currently the selected workspace. - `~/.helix/credentials` — Helix Cloud credentials written by `helix auth login`. ## `helix.toml` `helix init` creates `helix.toml` in your project root. Everything the CLI knows about your project lives here. ```toml [project] name = "my-helix-app" [local.dev] port = 6969 image = "ghcr.io/helixdb/enterprise-dev" tag = "latest" # Optional; defaults to in-memory when omitted. # storage = "disk" [enterprise.production] cluster_id = "ec_01HX..." gateway_url = "https://gateway.example.com" ``` At least one `[local.*]` or `[enterprise.*]` instance is required. The CLI rejects an empty `helix.toml`. ### `[project]` | Key | Type | Required | Default | Description | |-----|------|----------|---------|-------------| | `name` | String | Yes | — | Project name. Used in the container name pattern `helix--`. | | `id` | String | No | — | Linked Helix Cloud project ID. Set by [`helix project switch`](/cli/command-reference/project). | | `workspace_id` | String | No | — | Linked Helix Cloud workspace ID. Set by [`helix project switch`](/cli/command-reference/project). | | `container_runtime` | `docker` \| `podman` | No | `docker` | Runtime used to manage local containers. | ### `[local.]` One block per local instance. The instance name is the table key (e.g. `[local.dev]`). | Key | Type | Default | Description | |-----|------|---------|-------------| | `port` | Number | `6969` | Host port published to container port `8080`. | | `image` | String | `ghcr.io/helixdb/enterprise-dev` | Container image used by [`helix start`](/cli/command-reference/start). | | `tag` | String | `latest` | Image tag. | | `storage` | `memory` \| `disk` | `memory` | Local storage mode. `disk` uses a CLI-managed MinIO container and persistent volume. | You can define as many local instances as you want, each on a different port: ```toml [local.dev] port = 6969 [local.staging] port = 9090 storage = "disk" ``` You can also use `helix start --disk` as a one-off disk-mode run without changing `helix.toml`. ### `[enterprise.]` One block per Helix Cloud instance. Most fields here are populated automatically by [`helix sync`](/cli/command-reference/sync). | Key | Type | Required | Default | Description | |-----|------|----------|---------|-------------| | `cluster_id` | String | Yes | — | Helix Cloud cluster ID. List clusters with [`helix cluster list`](/cli/command-reference/cluster). | | `workspace_id` | String | No | — | Workspace the cluster belongs to. | | `project_id` | String | No | — | Linked project ID. | | `gateway_url` | URL | No (required at query time) | — | Runtime gateway base URL. Populated by `helix sync` when available. | | `query_auth_header` | String | No | `Authorization` | HTTP header name used for query auth. | | `query_auth_env` | String | No | `HELIX_API_KEY` | Environment variable read by [`helix query`](/cli/command-reference/query) for the auth header value. | | `availability_mode` | String | No | — | Cluster availability mode, returned by `helix sync`. | | `gateway_node_type` | String | No | — | Gateway node type, returned by `helix sync`. | | `db_node_type` | String | No | — | DB node type, returned by `helix sync`. | If `gateway_url` is missing, [`helix query `](/cli/command-reference/query) fails with: ``` Enterprise gateway URL is not configured for ''. Run 'helix sync ' or set gateway_url in helix.toml. ``` ## `~/.helix/config` User-global configuration written by [`helix workspace switch`](/cli/command-reference/workspace). ```toml workspace_id = "ws_01HX..." ``` This file is independent of any project directory and applies to every Helix Cloud command you run. ## `~/.helix/credentials` Written by [`helix auth login`](/cli/command-reference/auth). Plain key=value format: ``` helix_user_id=usr_01HX... helix_user_key=hlxk_... ``` Remove with `helix auth logout`. Never commit this file. ## Environment variables | Variable | Used by | Description | |----------|---------|-------------| | `HELIX_API_KEY` (default) | [`helix query`](/cli/command-reference/query) | Per-cluster Helix Cloud API key sent in the configured auth header. The variable name is configurable per-instance via `query_auth_env`. | ## Troubleshooting - **Port conflict on `helix start`** — change the port in `[local.] port`, or pass `helix start --port ` to override for a single run. - **`Missing instances`** — `helix.toml` has no `[local.*]` or `[enterprise.*]` blocks. Add one with [`helix add`](/cli/command-reference/add). - **`Authentication required`** — run `helix auth login` for Helix Cloud commands. - **`gateway_url is still missing` after `helix sync`** — the cloud-side configuration has not been published yet. Re-run `helix sync` later, or set `gateway_url` manually under `[enterprise.]`. ## Next Steps # Troubleshooting This guide helps you resolve common issues with the Helix CLI v2. ## Installation issues ### `command not found: helix` The CLI installs to `~/.helix/bin/helix`. If your shell can't find it after running the installer: ```bash # Bash echo 'export PATH="$HOME/.helix/bin:$PATH"' >> ~/.bashrc source ~/.bashrc # Zsh echo 'export PATH="$HOME/.helix/bin:$PATH"' >> ~/.zshrc source ~/.zshrc ``` Verify the binary exists with `ls -la ~/.helix/bin/helix`. ### Updating the CLI ```bash helix update # update to the latest release helix update --force # reinstall even if already on the latest version helix update --v1 # update to v2.3.5 for v1 projects ``` ### Update check fails or hangs on a restricted network On its first run, the CLI checks `https://api.github.com` (once, then cached for 24h) for a newer release. This never fails a command — network errors are ignored — but in sandboxes, CI, or rate-limited/offline environments the check can add a few seconds of latency. Skip it entirely with: ```bash export HELIX_NO_UPDATE_CHECK=1 # also accepts HELIX_DISABLE_UPDATE_CHECK ``` ## Unrecognized commands ### `helix compile` / `helix check` These commands do **not** exist in HelixDB v2. There is no client-side compile or validation step — queries are validated **server-side** when you send them to a running instance. Author a JSON dynamic query and run it: ```bash helix query dev --file examples/request.json ``` See [`helix query`](/cli/command-reference/query) for the request shape. ### `helix deploy` Use [`helix push `](/cli/command-reference/push) to deploy an Enterprise Cloud instance. ### `--path` reported as an unexpected argument `--path` is only valid on [`helix init`](/cli/command-reference/init) (it sets the project directory). Other commands locate the project by walking up from the current directory to find `helix.toml`; `cd` into the project instead. ## Project configuration issues ### `Not in a Helix project directory` ``` Not in a Helix project directory. Run 'helix init' to create one. ``` Either run `helix init` to scaffold a new project, or `cd` into a directory that contains `helix.toml`. ### `helix.toml` validation errors | Error | Cause | Fix | |-------|-------|-----| | `project name is empty in ` | `[project] name` is blank. | Set a non-empty `name`. | | ` has no instances configured` | No `[local.*]` or `[enterprise.*]` blocks. | Run [`helix add local`](/cli/command-reference/add) or `helix add cloud`. | | `instance name is empty in ` | Empty key like `[local.""]`. | Give the instance a name. | | `cluster_id is empty for '' in ` | Missing required Helix Cloud field. | Set `cluster_id` under `[enterprise.]`. | | `Failed to parse helix.toml: invalid type` | Numbers quoted as strings. | Use bare numbers — `port = 6969`, not `port = "6969"`. | ### `Instance '' not found` The instance is not listed in `helix.toml`. Check with `helix status` and add it with [`helix add`](/cli/command-reference/add). ## Docker / Podman issues ### Docker daemon not running ``` Docker is not available. Install/start docker and try again ``` A container runtime must be both **installed and running** — installing Docker is not enough if the daemon is stopped. `helix start` attempts to start it for you, but surfaces this error when it can't. - **macOS:** `open -a Docker`, or `colima start` / `podman machine start`. - **Linux:** `sudo systemctl start docker`. - **Headless / sandbox (no init system):** start the daemon directly, e.g. `sudo dockerd > /tmp/dockerd.log 2>&1 &`, then verify with `docker info`. Many sandboxes ship the Docker binary but do not start `dockerd` for you. - **Podman:** ensure `podman` is on `PATH` and set `[project] container_runtime = "podman"` in `helix.toml`. **Rootless Podman** needs `newuidmap`/`subuid`/`subgid` set up and frequently fails in restricted or rootless containers (namespace/mount errors) — install Docker or use a privileged container there instead. ### Permission denied on Docker socket ``` permission denied while trying to connect to the Docker daemon socket ``` ```bash sudo usermod -aG docker $USER newgrp docker ``` ### Port already in use ``` bind: address already in use ``` ```bash lsof -i :6969 ``` Either stop the conflicting process or override the port for this run: ```bash helix start dev --port 9090 ``` Or change `[local.] port` in `helix.toml` permanently. ### Container name conflict The CLI names containers `helix--`. If a stale container exists, remove it: ```bash docker rm -f helix-my-helix-app-dev ``` `helix start` removes prior containers with the same name automatically, but a foreign container outside the CLI's control may need manual cleanup. ### Local runtime did not become ready in time ``` local Helix did not become ready in time hint: check logs with 'helix logs' or verify port 6969 is reachable ``` The `enterprise-dev` container started but did not accept `POST /v1/query` within ~30 seconds. Tail the container with `helix logs --follow` to investigate. ## Authentication issues ### `Authentication required` ``` Authentication required. Run 'helix auth login' first. ``` Run [`helix auth login`](/cli/command-reference/auth). Credentials are stored in `~/.helix/credentials`. ### Stale credentials ```bash helix auth logout helix auth login ``` ## Helix Cloud query issues ### `gateway_url is not configured` ``` Enterprise gateway URL is not configured for ''. Run 'helix sync ' or set gateway_url in helix.toml. ``` Run [`helix sync `](/cli/command-reference/sync) to fetch the latest metadata. If `helix sync` reports `gateway_url is still missing`, the cluster's gateway hasn't been published yet — set `gateway_url` manually under `[enterprise.]` once you know it. ### `Environment variable HELIX_API_KEY is required` The auth env var named by `query_auth_env` is unset. ```bash export HELIX_API_KEY="..." ``` ### `--warm is only valid for read requests` `helix query --warm` is rejected for write requests. Remove `--warm` or change `request_type` to `"read"`. ### `request_type must be lowercase 'read' or 'write'` `request_type` must be either `"read"` or `"write"`. Capitalized or alternative values are rejected. ### `dynamic query request must include query` The JSON file is missing the top-level `query` object. See [`helix query`](/cli/command-reference/query) for the request shape. ## Logs issues ### `--range, --start, and --end are only supported for Enterprise logs` Range queries are Helix Cloud-only. For local instances, use `helix logs --follow` to tail the container. ### `live Enterprise logs are not supported yet` `helix logs --follow` on a Helix Cloud instance is rejected. Use `--range --start --end ` instead. ## Destructive command issues ### `Refusing to delete '' non-interactively` ```bash helix delete --yes ``` ### `Refusing to prune all instances non-interactively` ```bash helix prune --all --yes ``` [`helix prune`](/cli/command-reference/prune) only removes Helix-owned containers, networks, disk-mode volumes, and per-instance workspace directories — it never runs a broad `docker/podman system prune`. ## Interactive vs scripted use The CLI uses `cliclack` prompts when stdin is a TTY and a required choice is missing — for example, `helix init` with no subcommand, or `helix start` when multiple local instances exist and no instance is passed. In non-interactive contexts (CI, pipes, scripts): - Pass subcommands and required flags explicitly. - Use `--yes` for [`helix delete`](/cli/command-reference/delete) and [`helix prune --all`](/cli/command-reference/prune). - The CLI fails clearly when a required choice cannot be resolved — it never silently picks a default. ## Quick checks before reporting an issue ## Getting help - Documentation: [docs.helix-db.com](https://docs.helix-db.com) - GitHub issues: [github.com/HelixDB/helix-db/issues](https://github.com/HelixDB/helix-db/issues) - Discord: [discord.gg/2stgMPr5BD](https://discord.gg/2stgMPr5BD) - Email: founders@helix-db.com When reporting an issue, include: - `helix --version` - Operating system + container runtime version - The exact command and full error output - A scrubbed `helix.toml` (remove cluster IDs, gateway URLs, auth env names if sensitive) # CLI Command Reference This page lists every command in Helix CLI v2. Click any command for full flag and example details. ## Global Options These options are available for every command: ```bash helix --help, -h Show help for the command helix --version, -V Print the CLI version helix --quiet Suppress output (errors and final result only) helix -v, --verbose Show detailed output with timing information ``` ## Commands | Command | Subcommands | Flags | Scope | Description | |:--------|:------------|:------|:------|:------------| | [`helix init`](/cli/command-reference/init) | `local`, `enterprise` | `-p`/`--path` | Local, Helix Cloud | Initialize a v2 Helix project | | [`helix chef`](/cli/command-reference/chef) | | | Local | Bootstrap a first Helix app with skills, docs MCP, starter queries, seed data, and a launched coding agent | | [`helix add`](/cli/command-reference/add) | `local`, `enterprise` | | Local, Helix Cloud | Add an instance to an existing project | | [`helix start`](/cli/command-reference/start) | | `--foreground`, `--port`, `--disk`, `--persist` | Local | Start a local instance (background by default) | | [`helix stop`](/cli/command-reference/stop) | | | Local | Stop a background local instance | | [`helix restart`](/cli/command-reference/restart) | | | Local | Restart a background local instance | | [`helix status`](/cli/command-reference/status) | | | Local, Helix Cloud | Show instance status | | [`helix logs`](/cli/command-reference/logs) | | `-f`/`--follow`, `-r`/`--range`, `--start`, `--end` | Local, Helix Cloud | View instance logs | | [`helix query`](/cli/command-reference/query) | | `-f`/`--file`, `--json`, `-e`/`--ts`, `--ts-file`, `--warm`, `--host`, `--port`, `--compact` | Local, Helix Cloud | Send a dynamic query (JSON or TypeScript DSL) to `POST /v1/query` | | [`helix auth`](/cli/command-reference/auth) | `login`, `logout`, `create-key` | | Helix Cloud | Manage Helix Cloud authentication | | [`helix workspace`](/cli/command-reference/workspace) | `list`, `show`, `switch` | `--format`, `--id` | Helix Cloud | Manage active workspace selection | | [`helix project`](/cli/command-reference/project) | `list`, `show`, `switch` | `--format`, `--id`, `--workspace-id` | Helix Cloud | Manage linked project selection | | [`helix cluster`](/cli/command-reference/cluster) | `list`, `indexes` | `--format`, `--workspace-id`, `--project-id`, `--cluster-id` | Helix Cloud | List Helix Cloud clusters and their indexes | | [`helix push`](/cli/command-reference/push) | | | Helix Cloud | Deploy a Helix Cloud instance to Helix Cloud | | [`helix sync`](/cli/command-reference/sync) | | `--dry-run`, `-y`/`--yes` | Helix Cloud | Refresh Helix Cloud instance metadata in `helix.toml` | | [`helix prune`](/cli/command-reference/prune) | | `-a`/`--all`, `-y`/`--yes` | Local | Remove Helix-owned local containers, volumes, and workspaces | | [`helix delete`](/cli/command-reference/delete) | | `-y`/`--yes` | Local, Helix Cloud | Delete an instance from `helix.toml` and local state | | [`helix skills`](/cli/command-reference/skills) | `install`, `update`, `list` | `--project` | — | Install, update, and list the Helix agent skills | | [`helix metrics`](/cli/command-reference/metrics) | `full`, `basic`, `off`, `status` | | Local, Helix Cloud | Configure CLI telemetry | | [`helix update`](/cli/command-reference/update) | | `--force`, `--v1` | — | Update the CLI (and refresh installed skills) to the latest version | | [`helix feedback`](/cli/command-reference/feedback) | | | — | Send feedback to the Helix team | ## Next Steps # helix add Add a new instance to an existing Helix project. The new block is appended to `helix.toml`. ## Usage ```bash helix add [SUBCOMMAND] ``` If you omit the subcommand in a TTY, the CLI prompts you to choose between `local` and `enterprise`. ## Available sub-commands | Sub-command | Description | |-------------|-------------| | `local` | Add a local v2 development instance. | | `enterprise` | Add a Helix Cloud instance linked to a cluster. | ### `helix add local` | Flag | Type | Required | Description | Default | |------|------|----------|-------------|---------| | `-n`, `--name` | String | Yes | Local instance name (must be unique in `helix.toml`). | — | | `--port` | Number | No | Host port for the local gateway. | `6969` | | `--disk` | Boolean | No | Persist local data with on-disk storage backed by a CLI-managed MinIO volume. | `false` | Interactive local add prompts for the storage mode and defaults to in-memory. ### `helix add cloud` | Flag | Type | Required | Description | Default | |------|------|----------|-------------|---------| | `-n`, `--name` | String | Yes | Helix Cloud instance name. | — | | `--cluster-id` | String | Yes | Helix Cloud cluster ID. Find one with [`helix cluster list`](/cli/command-reference/cluster). | — | | `--gateway-url` | URL | No | Runtime gateway URL for dynamic queries. If omitted, run [`helix sync`](/cli/command-reference/sync) afterward to fetch it. | — | `helix add cloud` requires Helix Cloud authentication. Run [`helix auth login`](/cli/command-reference/auth) first. ## Examples ```bash # Add a second local instance on a different port helix add local --name staging --port 9090 # Add a local instance that uses persistent on-disk storage by default helix add local --name disk-dev --disk # Add a Helix Cloud instance with a known cluster ID helix add cloud --name production --cluster-id ec_01HX... # Add a Helix Cloud instance and let helix sync fill in gateway/auth metadata helix add cloud --name prod --cluster-id ec_01HX... helix sync prod ``` # helix auth Manage Helix Cloud authentication. Credentials are stored in `~/.helix/credentials`. ## Usage ```bash helix auth ``` ## Available sub-commands | Sub-command | Description | |-------------|-------------| | `login` | Authenticate with Helix Cloud via a browser-based GitHub device flow. | | `logout` | Remove `~/.helix/credentials`. | | `create-key ` | Rotate the API key for a specific Helix Cloud cluster. The new key prints once. | ### `helix auth create-key` | Argument | Description | |----------|-------------| | `CLUSTER` | Cluster ID to rotate. Find one with [`helix cluster list`](/cli/command-reference/cluster). | `helix auth create-key` requires you to already be logged in. The new key replaces any previous keys for that cluster — update `HELIX_API_KEY` (or whatever `query_auth_env` resolves to for the instance) before running [`helix query`](/cli/command-reference/query) again. ## Examples ```bash # Login (opens a browser device flow) helix auth login # Logout (clears ~/.helix/credentials) helix auth logout # Rotate a cluster API key helix auth create-key ec_01HX... ``` # helix chef Bootstrap a first HelixDB app for a coding agent. `helix chef` installs agent context, scaffolds a local project, starts the local database, seeds starter data, writes a build prompt, and launches your coding agent to generate the app. ## Usage ```bash helix chef ``` `helix cook` is an alias for `helix chef`. The command takes no flags — it is fully interactive (and falls back to sensible defaults when run without a TTY). ## Authentication The first interactive run signs you in to Helix Cloud through a GitHub device-code flow. This is used only to upload an optional, anonymized setup snapshot — the build itself does not require it. - **Non-interactive (agents, CI, sandboxes):** when stdin is not a TTY, `helix chef` skips the login automatically and proceeds without it. The snapshot upload is simply skipped. - **Opt out in an interactive shell:** set `HELIX_SKIP_CLOUD_AUTH=1`. - **No-auth alternative:** to scaffold without `chef` (and without any Cloud login), run `helix init local` → `helix start dev` → `helix query dev --file …` by hand. ## What it does 1. Installs the Helix skills (`npx skills add HelixDB/skills`) and docs MCP. 2. Initializes a local project (`helix init local`) with a `dev` instance on port `6969`, then starts it in-memory. 3. Writes `HELIX_CHEF_PROMPT.md` (your build intent, or a Personal CRM default) and, for the default build, starter query files under `examples/` — then seeds the data. 4. Detects an installed coding agent (Claude Code → Codex → OpenCode), asks for a permission mode, and launches it against the build prompt. 5. Opens the generated app at `http://localhost:3000` once the agent's frontend is running. ## Interactive mode Running `helix chef` in a terminal prompts first for: ```text What do you want to build? ``` Leave this blank to use the default Personal CRM build. If you provide an app idea, it is written into `HELIX_CHEF_PROMPT.md` and used to drive the agent. Next, choose a setup mode: | Mode | Behavior | |------|----------| | Automatic setup | Runs every setup step with defaults. | | Manual setup | Lets you choose the project path and confirm each setup step. | ## Examples ```bash # Interactive setup helix chef # Skip the Cloud login (also implicit in non-interactive shells) HELIX_SKIP_CLOUD_AUTH=1 helix chef # Alias helix cook ``` ## Related - [`helix init`](/cli/command-reference/init) — initialize a project without agent setup. - [`helix start`](/cli/command-reference/start) — start a local instance. - [`helix query`](/cli/command-reference/query) — send dynamic JSON requests. # helix cluster List Helix Cloud clusters available to the active workspace or a specific project. ## Usage ```bash helix cluster ``` ## Available sub-commands | Sub-command | Description | |-------------|-------------| | `list` | List Helix Cloud clusters. | | `indexes` (alias `indices`) | List the indexes in a cluster. | ### `helix cluster list` | Flag | Type | Description | Default | |------|------|-------------|---------| | `--workspace-id` | String | List clusters for a specific workspace. | Active workspace from `~/.helix/config` | | `--project-id` | String | Limit results to a specific project. | Linked project from `helix.toml`, if any | | `--format` | `human` \| `json` | Output format. | `human` | ### `helix cluster indexes` List the indexes defined in a cluster. Alias: `helix cluster indices`. | Flag | Type | Description | Default | |------|------|-------------|---------| | `--cluster-id` | String | Cluster to inspect. | The current project's Enterprise instance, from `helix.toml` | | `--format` | `human` \| `json` | Output format. | `human` | ## Examples ```bash # List clusters in the active workspace helix cluster list # List clusters scoped to a specific project helix cluster list --project-id prj_01HX... # List clusters in a non-active workspace helix cluster list --workspace-id ws_01HX... # Emit machine-readable JSON helix cluster list --format json # List indexes for the current project's Enterprise cluster helix cluster indexes # List indexes for a specific cluster as JSON helix cluster indexes --cluster-id ec_01HX... --format json ``` Use the resulting cluster ID with [`helix init cloud`](/cli/command-reference/init), [`helix add cloud`](/cli/command-reference/add), or [`helix auth create-key`](/cli/command-reference/auth). ## Related - [`helix workspace`](/cli/command-reference/workspace) — manage the active workspace - [`helix project`](/cli/command-reference/project) — manage the linked project - [`helix sync`](/cli/command-reference/sync) — refresh Helix Cloud instance metadata # helix delete Delete an instance from `helix.toml` and clean up any local runtime state for it. For local instances, Helix-owned containers, networks, disk-mode volumes, and the per-instance directory under `.helix/` are deleted. For Helix Cloud instances, only the `[enterprise.]` block in `helix.toml` and any local workspace state are removed — the cluster itself is untouched. ## Usage ```bash helix delete [OPTIONS] ``` ## Arguments | Argument | Description | |----------|-------------| | `INSTANCE` | Instance to delete. Required. | ## Available flags | Flag | Type | Description | |------|------|-------------| | `-y`, `--yes` | Boolean | Skip confirmation prompts. Required in non-TTY environments. | ## Confirmation behavior In a TTY, `helix delete` prints a warning and asks for confirmation. In non-TTY environments it refuses to run without `--yes`: ``` Refusing to delete '' non-interactively. Re-run with --yes to confirm. ``` ## Examples ```bash # Delete a local instance (TTY) helix delete staging # Delete non-interactively helix delete staging --yes ``` # helix feedback Send feedback to the Helix team. ## Usage ```bash helix feedback [MESSAGE] ``` ## Arguments | Argument | Description | |----------|-------------| | `MESSAGE` | Feedback message. If omitted, the CLI prompts for one in a TTY. | ## Examples ```bash # Send feedback inline helix feedback "love the new dynamic query flow" # Open an interactive prompt helix feedback ``` # helix init Initialize a v2 Helix project. Creates `helix.toml`, a `.helix/` workspace directory, a `.gitignore` entry, and — for local projects — an `examples/request.json` scaffold. ## Usage ```bash helix init [OPTIONS] [SUBCOMMAND] ``` If you omit the subcommand in a TTY, the CLI prompts you to choose between `local` and `enterprise`. ## Available sub-commands | Sub-command | Description | |-------------|-------------| | `local` | Initialize a local v2 development project. | | `enterprise` | Initialize a Helix Cloud project linked to a cluster. | ## Top-level flags | Flag | Type | Description | Default | |------|------|-------------|---------| | `-p`, `--path` | Path | Project directory to initialize. | Current directory | | `--skills` | Boolean | Install the Helix agent skills + docs MCP (the install you'd otherwise be prompted for). | Prompted in a TTY | | `--no-skills` | Boolean | Skip installing the Helix agent skills + docs MCP. | Prompted in a TTY | ### `helix init local` | Flag | Type | Description | Default | |------|------|-------------|---------| | `-n`, `--name` | String | Local instance name. | `dev` | | `--port` | Number | Host port for the local gateway. The container always listens on `8080` internally. | `6969` | | `--disk` | Boolean | Persist local data with on-disk storage backed by a CLI-managed MinIO volume. | `false` | Interactive local init prompts for the storage mode and defaults to in-memory. ### `helix init cloud` | Flag | Type | Required | Description | Default | |------|------|----------|-------------|---------| | `-n`, `--name` | String | No | Helix Cloud instance name. | `production` | | `--cluster-id` | String | Yes | Helix Cloud cluster ID. Find one with [`helix cluster list`](/cli/command-reference/cluster). | — | | `--gateway-url` | URL | No | Runtime gateway URL for dynamic queries. If omitted, run [`helix sync`](/cli/command-reference/sync) after init to fetch it. | — | `helix init cloud` requires Helix Cloud authentication. Run [`helix auth login`](/cli/command-reference/auth) first. ## What `helix init` produces - `helix.toml` — project configuration with one `[local.]` or `[enterprise.]` instance. - `.helix/` — CLI-managed workspace state (ignored from git). - `.gitignore` — entries appended for `.helix/`, `target/`, and `*.log`. - `examples/request.json` (local only) — a runnable read request you can send with `helix query`: ```json { "request_type": "read", "query": { "queries": [ { "Query": { "name": "node_count", "steps": [ { "NWhere": { "Eq": ["$label", { "String": "User" }] } }, "Count" ], "condition": null } } ], "returns": ["node_count"] }, "parameters": {} } ``` ## Examples ```bash # Interactive init in the current directory (prompts in a TTY) helix init # Initialize a local project in a new directory helix init --path ./my-helix-app local # Local with a custom instance name and port helix init local --name staging --port 9090 # Local with persistent on-disk storage by default helix init local --disk # Helix Cloud project linked to an existing cluster helix init cloud --cluster-id ec_01HX... --gateway-url https://gateway.example.com ``` ## Next steps after `helix init local` ```bash helix start dev helix query dev --file examples/request.json ``` ## Next steps after `helix init cloud` ```bash helix sync helix query --file ``` # helix logs View logs for a local or Helix Cloud instance. - For **local** instances, logs come from `docker logs` / `podman logs`. - For **Helix Cloud** instances, logs are fetched over a time range from Helix Cloud and require `helix auth login`. ## Usage ```bash helix logs [INSTANCE] [OPTIONS] ``` ## Arguments | Argument | Description | |----------|-------------| | `INSTANCE` | Instance name. If omitted in a TTY with multiple instances, the CLI prompts. Defaults to `dev` when present and no prompt is needed. | ## Available flags | Flag | Type | Description | Default | |------|------|-------------|---------| | `-f`, `--follow` | Boolean | Stream live logs. **Local only** — rejected for Helix Cloud instances. | `false` | | `-r`, `--range` | Boolean | Query historical logs from Helix Cloud over a time range. **Helix Cloud only**. | `false` | | `--start` | RFC 3339 timestamp | Start of the range. Requires `--range`. | `--end` minus 1 hour | | `--end` | RFC 3339 timestamp | End of the range. Requires `--range`. | Now (UTC) | `--range`, `--start`, and `--end` are rejected for local instances with: ``` --range, --start, and --end are only supported for Enterprise logs; local logs use docker/podman logs ``` `--follow` against a Helix Cloud instance is rejected with: ``` live Enterprise logs are not supported yet; use --range instead ``` ## Examples ```bash # Dump local container logs once helix logs dev # Stream local logs helix logs dev --follow # Last hour of Helix Cloud logs (defaults: end=now, start=end-1h) helix logs production --range # Fixed Helix Cloud log range helix logs production \ --range \ --start 2026-05-12T10:00:00Z \ --end 2026-05-12T11:00:00Z ``` # helix metrics Configure CLI telemetry and usage metrics collection. Settings are stored in the per-user metrics config. ## Usage ```bash helix metrics ``` ## Available sub-commands | Sub-command | Description | |-------------|-------------| | `full` | Enable full metrics collection (prompts for an email address). | | `basic` | Enable minimal anonymous metrics. | | `off` | Disable all metrics collection. | | `status` | Show current metrics configuration and when it was last updated. | ## `helix metrics status` output `status` prints the metrics level, your user ID (if logged in), and a relative `Last updated` time: | Age | Format | |-----|--------| | 0–4 seconds | `just now` | | 5–59 seconds | `s ago` | | 1–59 minutes | `m ago` | | 1–23 hours | `h ago` | | 1+ days | `d ago` | ## Examples ```bash # Enable full metrics (prompts for email) helix metrics full # Enable anonymous basic metrics helix metrics basic # Disable all metrics helix metrics off # Show current state helix metrics status ``` # helix project Manage the Helix Cloud project that this local project is linked to. Selection is persisted in `helix.toml` (under `[project] workspace_id` and `[project] id`). ## Usage ```bash helix project ``` If you run `helix project` with no subcommand in a TTY, the CLI prompts you to choose a project. ## Available sub-commands | Sub-command | Description | |-------------|-------------| | `list` | List projects in the active workspace. | | `show` | Show the currently linked project. | | `switch ` | Link this directory to a workspace project by name or ID. | ### `helix project list` | Flag | Type | Description | Default | |------|------|-------------|---------| | `--workspace-id` | String | Override the active workspace selection for this command. | Value from `~/.helix/config` | | `--format` | `human` \| `json` | Output format. | `human` | ### `helix project show` | Flag | Type | Description | Default | |------|------|-------------|---------| | `--format` | `human` \| `json` | Output format. | `human` | ### `helix project switch` | Argument | Description | |----------|-------------| | `PROJECT` | Project name or ID. | | Flag | Type | Description | |------|------|-------------| | `--id` | Boolean | Treat `PROJECT` as an ID rather than a name. | ## Examples ```bash # List projects in the active workspace helix project list # Show the linked project helix project show # Link to a project by name helix project switch payments-api # Link to a project by ID helix project switch prj_01HX... --id # List projects in a specific workspace without switching globally helix project list --workspace-id ws_01HX... ``` ## Related - [`helix workspace`](/cli/command-reference/workspace) — manage the active workspace - [`helix cluster`](/cli/command-reference/cluster) — list clusters available in the workspace/project - [`helix sync`](/cli/command-reference/sync) — refresh Helix Cloud instance metadata # helix prune Remove Helix-owned local containers, networks, disk-mode volumes, and per-instance workspace state. `helix prune` only touches resources the CLI manages: - The container named `helix--`. - The disk-mode MinIO sidecar, network, and persistent volume when present. - The per-instance directory under `.helix/`. It never runs a broad `docker system prune` or `podman system prune`. ## Usage ```bash helix prune [INSTANCE] [OPTIONS] ``` ## Arguments | Argument | Description | |----------|-------------| | `INSTANCE` | Local instance to prune. If omitted, the CLI prompts in a TTY or requires `--all`. | ## Available flags | Flag | Type | Description | |------|------|-------------| | `-a`, `--all` | Boolean | Prune every local instance in the project. | | `-y`, `--yes` | Boolean | Skip confirmation prompts. Required with `--all` in non-TTY environments. | ## Confirmation behavior - `helix prune ` removes resources without confirmation, including persisted disk-mode data. - `helix prune --all` prints a warning and asks for confirmation in a TTY. In non-TTY environments, it refuses to run without `--yes`: ``` Refusing to prune all instances non-interactively. Re-run with --yes to confirm. ``` If nothing was found to prune for an instance, the CLI prints `No local runtime resources found for ''` and exits successfully. ## Examples ```bash # Prune a specific instance helix prune dev # Interactive selection (TTY) helix prune # Prune everything Helix-owned for every local instance helix prune --all # Same, non-interactively helix prune --all --yes ``` # helix push Deploy a Helix Cloud instance. `helix push` reads the `[enterprise.]` block in `helix.toml` and deploys it to Helix Cloud, then reports progress until the deploy completes. Requires Helix Cloud authentication. Run [`helix auth login`](/cli/command-reference/auth) first. ## Usage ```bash helix push [INSTANCE] ``` ## Arguments | Argument | Description | |----------|-------------| | `INSTANCE` | Helix Cloud instance name from `helix.toml`. If omitted, the CLI prompts in a TTY, or fails listing the available `[enterprise.*]` instances when non-interactive. | ## Behavior - Resolves the named `[enterprise.]` block and deploys it to Helix Cloud, streaming deploy progress. - Fails on a **local** instance with `Local instance '' uses the v2 runtime. Run 'helix start ' instead.` - On deploy failure, the CLI hints to `check the cluster with 'helix status', and re-authenticate with 'helix auth login' if needed`. ## Examples ```bash # Deploy a named Helix Cloud instance helix push production # Deploy without naming the instance (prompts in a TTY) helix push ``` ## Related - [`helix auth`](/cli/command-reference/auth) — authenticate with Helix Cloud before deploying - [`helix sync`](/cli/command-reference/sync) — refresh gateway/auth metadata after a deploy - [`helix query`](/cli/command-reference/query) — query the deployed instance - [`helix status`](/cli/command-reference/status) — check the deployed cluster's state # helix query Send a dynamic query to `POST /v1/query` on a local or Helix Cloud instance. You can supply the query as a raw JSON request (`--file`/`--json`) or write it in the TypeScript DSL (`-e`/`--ts`/`--ts-file`) and let the CLI build the request for you. ## Usage ```bash helix query [INSTANCE] (--file | --json '' | -e '' | --ts-file ) [OPTIONS] ``` Provide **exactly one** input: a JSON request (`--file` or `--json`) or a TypeScript DSL expression (`-e`/`--ts` or `--ts-file`). The four input flags are mutually exclusive. ## Arguments | Argument | Description | |----------|-------------| | `INSTANCE` | Instance name from `helix.toml`. Defaults to `dev` if omitted. | ## Available flags Exactly one input flag is required (`--file`, `--json`, `-e`/`--ts`, or `--ts-file`). | Flag | Type | Required | Description | Default | |------|------|----------|-------------|---------| | `-f`, `--file` | Path | One input required | JSON request body file. | — | | `--json` | String | One input required | Inline JSON request body. | — | | `-e`, `--ts` | String | One input required | TypeScript DSL expression, evaluated locally to build the request (like `mysql -e`). See [TypeScript DSL input](#typescript-dsl-input). | — | | `--ts-file` | Path | One input required | Path to a `.ts` file containing a TypeScript DSL expression. | — | | `--warm` | Boolean | No | Send the request as a cache-warming call (`X-Helix-Warm: true`). Read requests only. | `false` | | `--host` | String | No | Override the host for local instances. | `localhost` | | `--port` | Number | No | Override the port for local instances. | `[local.] port` (default `6969`) | | `--compact` | Boolean | No | Print compact JSON instead of pretty-printed JSON. | `false` | ## Request body The JSON request must include: - `request_type` — lowercase `"read"` or `"write"`. Anything else is rejected with `request_type must be lowercase 'read' or 'write'`. - `query_name` — optional top-level query name for gateway logs and query diagnostics. Use exactly `query_name`; `name` and `queryName` are not accepted aliases. Missing or `null` falls back to `__dynamic__`. - `query` — the dynamic query object: a `queries` array of `{ "Query": { "name", "steps", "condition" } }` entries plus a `returns` list naming what to return (see the example below). Missing `query` is rejected with `dynamic query request must include query`. - `parameters` — optional object of named parameters. `helix init local` scaffolds `examples/request.json` with a runnable read request: ```json { "request_type": "read", "query_name": "node_count", "query": { "queries": [ { "Query": { "name": "node_count", "steps": [ { "NWhere": { "Eq": ["$label", { "String": "User" }] } }, "Count" ], "condition": null } } ], "returns": ["node_count"] }, "parameters": {} } ``` Every query must begin with a source step (for example `NWhere`) before any terminal aggregator like `Count`. Use `--file` for checked-in or reusable requests, and `--json` for quick one-off requests from a shell. Quote inline JSON so your shell passes it as one argument. ## TypeScript DSL input Instead of hand-writing JSON, you can pass a TypeScript DSL expression and let the CLI build the request — the same way `mysql -e` runs SQL from the shell: ```bash # Inline expression with -e / --ts helix query dev -e 'readBatch().varAs("c", g().nWithLabel("User").count()).returning(["c"])' # Or from a .ts file helix query dev --ts-file queries/count_users.ts ``` How it works: - The expression must evaluate to a `readBatch()` or `writeBatch()` builder. `g`, `readBatch`, `writeBatch`, `defineParams`, and `param` are auto-imported and already in scope — you write only the expression, no imports. - The CLI evaluates it locally with Node using the published [`@helix-db/helix-db`](https://www.npmjs.com/package/@helix-db/helix-db) SDK, calls `.toDynamicJson()`, and posts the resulting dynamic-query JSON to `/v1/query`. The builders are pure (no I/O), so no instance needs to be running to *build* the request — only to run it. There is no separate compile step. - `request_type` is inferred automatically from whether you used `readBatch()` or `writeBatch()`. Requirements: - **Node.js 20+** on `PATH` (npm ships with it). On first use the CLI installs the SDK once into its cache directory; later runs reuse it. - For inline `-e` use, write a **single expression** with **no TypeScript type annotations** (it is evaluated as an expression, not compiled). Put more elaborate queries in a `--ts-file`. If Node is missing, the CLI tells you to install Node 20+ or fall back to `--json`/`--file`. For the full DSL surface, see the [`helix-query-typescript`](https://github.com/HelixDB/skills) skill. ## Validation rules | Rule | Error | |------|-------| | `request_type` is missing | `dynamic query request must include request_type` | | `request_type` is not lowercase `read` or `write` | `request_type must be lowercase 'read' or 'write'` | | `query_name` is blank or whitespace-only | `query_name must be non-empty when provided` | | `name` or `queryName` is used instead of `query_name` | Invalid request body / unknown field | | `query` is missing | `dynamic query request must include query` | | `--warm` is used with a write request | `--warm is only valid for read requests` | | Neither `--file` nor `--json` is provided | Clap usage error | | Both `--file` and `--json` are provided | Clap conflict error | ## Helix Cloud targets For a Helix Cloud instance, `helix query` reads `[enterprise.]` in `helix.toml` and: - Posts to `/v1/query`. - Sends an auth header named by `query_auth_header` (default `Authorization`) with the value read from the environment variable `query_auth_env` (default `HELIX_API_KEY`). The value is read from your shell environment or from a `.env` file in the project root (whichever is set), so you can keep the key out of your shell history. If `gateway_url` is missing, the CLI returns: ``` Enterprise gateway URL is not configured for ''. Run 'helix sync ' or set gateway_url in helix.toml. ``` If the auth env var is missing, the CLI returns `Environment variable is required for Enterprise query auth`. ## Output - A non-empty JSON response is pretty-printed by default; pass `--compact` to print on a single line. - `204 No Content` (returned for `--warm` requests) produces no output and exits successfully. - Non-2xx responses produce `Query failed with HTTP : ` and exit with a non-zero status. ## Examples ```bash # Send the example request to the local 'dev' instance helix query dev --file examples/request.json # Run the same request as a warm-up — populates per-process caches without printing output helix query dev --file examples/request.json --warm # Print compact JSON suitable for piping into jq helix query dev --file examples/request.json --compact | jq '.node_count' # Send an inline JSON request body helix query dev --json '{"request_type":"read","query_name":"ad_hoc_read","query":{"queries":[],"returns":[]},"parameters":{}}' # Build the request from a TypeScript DSL expression instead of JSON helix query dev -e 'readBatch().varAs("c", g().nWithLabel("User").count()).returning(["c"])' # Target a Helix Cloud instance (requires HELIX_API_KEY in env and gateway_url in helix.toml) helix query production --file examples/request.json ``` ## Related - [`helix start`](/cli/command-reference/start) — start a local instance to query against - [`helix sync`](/cli/command-reference/sync) — refresh Helix Cloud gateway/auth metadata # helix restart Restart a background local instance. If the container still exists, it is restarted in place. If the container has been removed (for example after `helix prune`), `helix restart` falls back to a fresh [`helix start`](/cli/command-reference/start). Default local data is in-memory and is wiped by every restart. Disk-mode instances preserve data in their local MinIO volume. ## Usage ```bash helix restart [INSTANCE] ``` ## Arguments | Argument | Description | |----------|-------------| | `INSTANCE` | Local instance to restart. If omitted in a TTY with multiple instances, the CLI prompts. Defaults to `dev` when present. | ## Examples ```bash # Restart the default 'dev' instance helix restart dev # Restart with interactive picker helix restart ``` # helix skills Install, refresh, and inspect the Helix agent skills — the query-authoring skills (`helix-query-typescript`, `helix-query-rust`, `helix-memory-system`, and friends) that coding agents like Claude Code use to write correct HelixDB queries. They are installed with the [`skills`](https://github.com/vercel-labs/skills) CLI under the hood (`npx skills add HelixDB/skills`), so this command requires **Node.js/npm** (`npx`) on your PATH. `helix init` and `helix chef` install these skills for you; `helix skills` is how you manage them afterwards. They install **globally** by default (`~/.agents/skills`, shared across projects) — pass `--project` to operate on the current project instead. ## Usage ```bash helix skills ``` ## Available sub-commands | Sub-command | Description | |-------------|-------------| | `install` | Install the Helix agent skills (interactive). | | `update` | Refresh installed Helix skills to the latest version. | | `list` | List installed agent skills. | ## Available flags | Flag | Type | Description | Default | |------|------|-------------|---------| | `--project` | Boolean | Operate on the current project (`./skills`) instead of globally. | Global | ## Update notifications When skills are installed, the CLI checks once every 24 hours (the same cadence and `HELIX_NO_UPDATE_CHECK` opt-out as the CLI version check) whether the Helix skills are out of date, and prints a one-line notice pointing you at `helix skills update`. It is **notify-only** — it never rewrites skill files on a routine command. Running [`helix update`](/cli/command-reference/update) also refreshes installed skills. ## Examples ```bash # Install the Helix skills globally helix skills install # Refresh installed skills to the latest version helix skills update # Install into the current project instead of globally helix skills install --project # List installed skills helix skills list ``` # helix start Start a local v2 instance. By default the container starts in the background and the CLI waits until the local gateway accepts `POST /v1/query` requests before returning. ## Usage ```bash helix start [INSTANCE] [OPTIONS] ``` ## Arguments | Argument | Description | |----------|-------------| | `INSTANCE` | Local instance name from `helix.toml`. If omitted, defaults to `dev` when present, otherwise prompts in a TTY or fails if non-interactive. | ## Available flags | Flag | Type | Description | Default | |------|------|-------------|---------| | `--foreground` | Boolean | Run attached and stop the container on Ctrl-C. Useful for streaming startup logs. | `false` | | `--port` | Number | Override the host port for this run. The container always listens on `8080` internally. | Value from `[local.] port` (default `6969`) | | `--disk` | Boolean | Use on-disk storage backed by a CLI-managed MinIO container for this run. | `false` | | `--persist` | Boolean | Write the resolved port and storage settings for this run back to `[local.]` in `helix.toml`, so future runs reuse them. | `false` | ## Behavior - Pulls `ghcr.io/helixdb/enterprise-dev:latest` (or the image/tag set in `[local.]`). - Names the container `helix--` and publishes the configured port to container port `8080`. - Uses Docker or Podman based on `[project] container_runtime` in `helix.toml`. - Default storage is in-memory. Passing `--disk` starts a MinIO sidecar, creates the `helix-db` bucket, and runs `enterprise-dev` with S3-compatible storage environment variables. - Background mode (`-d`, `--restart unless-stopped`) waits up to ~30 seconds for `POST /v1/query` readiness and prints the URL and container name when ready. - Foreground mode (`--rm`) streams the container's stdout/stderr until Ctrl-C, then removes the container. With `--disk`, `helix stop` removes the Helix and MinIO containers but keeps the persistent local volume. `helix prune` removes that volume and deletes the persisted local data. ## Examples ```bash # Start the default 'dev' instance in the background helix start # Start a named instance in the background helix start staging # Stream logs in the foreground; Ctrl-C stops the container helix start dev --foreground # Override the host port for this run only helix start dev --port 9090 # Start with persistent local storage for this run helix start dev --disk # Start on a new port and save it to helix.toml for future runs helix start dev --port 9090 --persist ``` ## Related - [`helix stop`](/cli/command-reference/stop) — stop a background instance - [`helix restart`](/cli/command-reference/restart) — restart a background instance - [`helix query`](/cli/command-reference/query) — send a dynamic query to a running instance - [`helix logs`](/cli/command-reference/logs) — view container logs # helix status Show the status of local and Helix Cloud instances configured in `helix.toml`. For local instances, status comes from `docker ps -a` / `podman ps -a` filtered on `helix--` and includes the configured storage mode. For Helix Cloud instances, status reflects the cluster configuration recorded in `helix.toml`. ## Usage ```bash helix status [INSTANCE] ``` ## Arguments | Argument | Description | |----------|-------------| | `INSTANCE` | Show status for a specific instance. If omitted, shows every instance in the project. In a TTY with multiple instances, the CLI may offer a single-instance picker. | ## Examples ```bash # Show every instance helix status # Show one instance helix status dev ``` # helix stop Stop a background local instance and remove its containers. `helix stop` is idempotent — if the instance is not running, the CLI exits successfully without error. For disk-mode instances, `helix stop` removes the Helix and MinIO containers but keeps the persistent local volume. Use [`helix prune`](/cli/command-reference/prune) to delete disk-mode data. ## Usage ```bash helix stop [INSTANCE] ``` ## Arguments | Argument | Description | |----------|-------------| | `INSTANCE` | Local instance to stop. If omitted in a TTY with multiple instances, the CLI prompts. Defaults to `dev` when present. | ## Examples ```bash # Stop the default 'dev' instance helix stop dev # Stop with interactive picker helix stop # Safe to call in scripts even if the instance is already stopped helix stop staging || true ``` ## Related - [`helix start`](/cli/command-reference/start) — start a local instance - [`helix restart`](/cli/command-reference/restart) — restart a local instance - [`helix prune`](/cli/command-reference/prune) — remove all Helix-owned local state for an instance # helix sync Refresh Helix Cloud instance metadata in `helix.toml`. `helix sync` fetches the latest gateway URL, auth contract, and node-type info from Helix Cloud and writes it back to the local `[enterprise.]` block, so [`helix query`](/cli/command-reference/query), [`helix logs`](/cli/command-reference/logs), and other Cloud commands can target the instance. Requires Helix Cloud authentication. Run [`helix auth login`](/cli/command-reference/auth) first. ## Usage ```bash helix sync [INSTANCE] [OPTIONS] ``` ## Arguments | Argument | Description | |----------|-------------| | `INSTANCE` | Helix Cloud instance name to sync. If omitted, every `[enterprise.*]` instance in `helix.toml` is synced. | ## Available flags | Flag | Type | Description | Default | |------|------|-------------|---------| | `--dry-run` | Boolean | Fetch the remote state and print the changes `helix sync` would make, without writing anything. Mutually exclusive with `--yes`. | `false` | | `-y`, `--yes` | Boolean | Skip interactive conflict prompts and apply the reconciliation non-interactively (useful in CI). | `false` | ## What gets updated For each matched Helix Cloud instance, `helix sync` overwrites these fields in `helix.toml` from the cloud-side configuration: - `gateway_url` - `query_auth_header` (defaults to `Authorization`) - `query_auth_env` (defaults to `HELIX_API_KEY`) - `availability_mode` - `gateway_node_type` - `db_node_type` If the cloud response does not include a `gateway_url`, the CLI prints: ``` Enterprise instance '' is synced, but gateway_url is still missing. Set it in helix.toml before using 'helix query '. ``` ## Examples ```bash # Sync a single Helix Cloud instance helix sync production # Sync every Helix Cloud instance defined in helix.toml helix sync # Preview what would change without writing to helix.toml helix sync production --dry-run # Apply without interactive conflict prompts (CI) helix sync production --yes ``` ## Related - [`helix auth`](/cli/command-reference/auth) — authenticate with Helix Cloud - [`helix query`](/cli/command-reference/query) — uses the synced `gateway_url` and auth contract - [`helix cluster`](/cli/command-reference/cluster) — list available clusters # helix update Update the Helix CLI to the latest version. Use `--v1` to update to `v2.3.5`, the last CLI release for v1 projects. If the [Helix agent skills](/cli/command-reference/skills) are installed, `helix update` also refreshes them to the latest version (a failure here degrades to a warning and never fails the CLI update). Separately, when skills are installed the CLI checks once every 24 hours whether they are out of date and prints a notice pointing you at `helix skills update`; set `HELIX_NO_UPDATE_CHECK=1` to opt out of both the CLI and skills checks. ## Usage ```bash helix update [OPTIONS] ``` ## Available flags | Flag | Type | Description | |------|------|-------------| | `--force` | Boolean | Force update even if already on latest version | | `--v1` | Boolean | Update to `v2.3.5`, the last CLI release for v1 projects | ## Examples ```bash # Update to the latest version helix update # Force update helix update --force # Update to the last v1-compatible CLI release helix update --v1 ``` # helix workspace Manage the active Helix Cloud workspace selection stored in `~/.helix/config`. ## Usage ```bash helix workspace ``` If you run `helix workspace` with no subcommand in a TTY, the CLI prompts you to choose a workspace. ## Available sub-commands | Sub-command | Description | |-------------|-------------| | `list` | List workspaces you can access. | | `show` | Show the currently selected workspace. | | `switch ` | Select a workspace by slug or ID. | ### `helix workspace list` | Flag | Type | Description | Default | |------|------|-------------|---------| | `--format` | `human` \| `json` | Output format. | `human` | ### `helix workspace show` | Flag | Type | Description | Default | |------|------|-------------|---------| | `--format` | `human` \| `json` | Output format. | `human` | ### `helix workspace switch` | Argument | Description | |----------|-------------| | `WORKSPACE` | Workspace slug or ID. | | Flag | Type | Description | |------|------|-------------| | `--id` | Boolean | Treat `WORKSPACE` as an ID rather than a slug. | ## Examples ```bash # List accessible workspaces helix workspace list # Show the active workspace helix workspace show # Switch by slug helix workspace switch my-team # Switch by ID helix workspace switch ws_01HX... --id ``` ## Related - [`helix project`](/cli/command-reference/project) — link the local project to a workspace project - [`helix cluster`](/cli/command-reference/cluster) — list Helix Cloud clusters - [`helix auth`](/cli/command-reference/auth) — authenticate with Helix Cloud