Memory for AI agent fleets, behind one HTTP service.
Typed events, vector search, and provenance — multi-tenant over FFS, in one Rust binary.

$ curl -fsSL https://memspine.com/install.sh | sh
Checksum OK. Installed memspine-server and memspine-admin to ~/.local/bin
$ MEMSPINE_BIND=0.0.0.0:7777 memspine-server &
$ curl localhost:7777/health
{"status":"ok","service":"memspine-server","version":"0.0.0"}
$ curl -X POST localhost:7777/v1/memory/event \
    -H "Authorization: Bearer $TENANT" \
    -d '{"kind":"tool_call","embedding":[…],"properties":{"name":"web.fetch"}}'
{"event_id":"c104b6d6-afca-4118-b8bc-6928f15858a4"}
$ curl -X POST localhost:7777/v1/memory/search \
    -H "Authorization: Bearer $TENANT" -d '{"embedding":[…],"k":5}'
{"hits":[{"event_id":"c104b6d6-afca-4118-b8bc-6928f15858a4","distance":0.0}]}

Quickstart

The installer puts memspine-server and memspine-admin in ~/.local/bin — prebuilt for Linux x86_64 and macOS arm64, checksums verified, no root, no dependencies. Start the server and talk to it with curl; in v0 the Bearer token is the tenant id, so there is no control-plane setup:

# embeddings default to 384 dims; pin to 4 for this toy walkthrough
MEMSPINE_BIND=0.0.0.0:7777 MEMSPINE_EMBEDDING_DIM=4 memspine-server

# append, then traverse provenance
curl -X POST localhost:7777/v1/memory/event \
  -H "Authorization: Bearer 00000000-0000-0000-0000-000000000042" \
  -H "Content-Type: application/json" \
  -d '{"kind":"tool_call","embedding":[0.1,0.2,0.3,0.4],
       "properties":{"name":"web.fetch","url":"https://example.com"}}'

curl -X POST localhost:7777/v1/memory/cypher \
  -H "Authorization: Bearer 00000000-0000-0000-0000-000000000042" \
  -H "Content-Type: application/json" \
  -d '{"query":"MATCH (n:Event)-[:SOURCE_FROM]->(m:Event) RETURN count(n)"}'
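
The same walkthrough can be sketched in Python with only the standard library. This is an illustration, not the shipped Python SDK: the helper names (build_post, send, run_walkthrough) are made up here, the search payload shape is taken from the terminal session at the top of the page, and the 4-element embedding assumes the server was started with MEMSPINE_EMBEDDING_DIM=4.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7777"  # matches MEMSPINE_BIND above
TENANT = "00000000-0000-0000-0000-000000000042"  # v0: the Bearer token is the tenant id


def build_post(path, payload, token=TENANT):
    """Build (without sending) a JSON POST for a memspine route."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


append_req = build_post("/v1/memory/event", {
    "kind": "tool_call",
    "embedding": [0.1, 0.2, 0.3, 0.4],  # 4 dims, matching MEMSPINE_EMBEDDING_DIM=4
    "properties": {"name": "web.fetch", "url": "https://example.com"},
})
search_req = build_post("/v1/memory/search", {"embedding": [0.1, 0.2, 0.3, 0.4], "k": 5})
cypher_req = build_post("/v1/memory/cypher", {
    "query": "MATCH (n:Event)-[:SOURCE_FROM]->(m:Event) RETURN count(n)",
})


def run_walkthrough():
    """Needs a running server; prints each decoded response in order."""
    for request in (append_req, search_req, cypher_req):
        print(send(request))
```

Call run_walkthrough() once the server is listening. Building the requests separately from sending them keeps the payloads inspectable without a live server.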

Every event lands in the tenant's own FFS database file — one file per tenant, opened on first request and kept warm in a pool. There is no shared table for a noisy neighbour to sit in.
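
The "opened on first request and kept warm" behaviour can be pictured as a lazily filled map from tenant id to open handle. The sketch below is a rough model under stated assumptions, not memspine's Rust implementation: TenantHandle stands in for a real engine handle, and the data directory and the .ffs file extension are hypothetical.

```python
import threading
from pathlib import Path


class TenantHandle:
    """Stand-in for an open engine handle; a real one would wrap the tenant's database."""

    def __init__(self, path):
        self.path = path


class TenantPool:
    """Opens one handle per tenant on first use and reuses it afterwards."""

    def __init__(self, data_dir):
        self._data_dir = Path(data_dir)
        self._lock = threading.Lock()
        self._open = {}  # tenant id -> TenantHandle

    def get(self, tenant_id):
        with self._lock:
            handle = self._open.get(tenant_id)
            if handle is None:
                # Hypothetical layout: one file per tenant, named after the tenant id.
                handle = TenantHandle(self._data_dir / f"{tenant_id}.ffs")
                self._open[tenant_id] = handle
            return handle


pool = TenantPool("/var/lib/memspine")  # hypothetical data directory
first = pool.get("00000000-0000-0000-0000-000000000042")
again = pool.get("00000000-0000-0000-0000-000000000042")
other = pool.get("00000000-0000-0000-0000-000000000043")
```

Here first and again are the same object, while other points at a different file, which is the isolation property the paragraph above describes.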

API

Authenticate with an Authorization: Bearer <key> header. Each key carries scopes (read, write, admin), and the scope check runs on every handler.

Route                        Scope   Description
GET /health                  -       liveness probe
GET /metrics                 -       Prometheus exposition, per-tenant series
GET /v1/whoami               read    echo the resolved tenant
POST /v1/memory/event        write   append a typed event, with optional embedding
POST /v1/memory/search       read    vector search, optional Cypher predicate
POST /v1/memory/cypher       read    Cypher over events and provenance edges
POST /v1/admin/tenants       admin   create tenant
GET /v1/admin/tenants        admin   list tenants
POST /v1/admin/keys          admin   mint a key (plaintext returned once)
DELETE /v1/admin/keys/<id>   admin   revoke a key
GET /v1/admin/audit          admin   paginate the admin audit log
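
A per-handler scope check over the routes above might look like the following sketch. It makes two assumptions the page does not confirm: scopes are matched exactly (admin does not imply read or write), and the two routes listed without a scope need no key. The names REQUIRED_SCOPE, UNSCOPED, required_scope, and is_allowed are illustrative.

```python
# Scope per route, copied from the route list above.
REQUIRED_SCOPE = {
    ("GET", "/v1/whoami"): "read",
    ("POST", "/v1/memory/event"): "write",
    ("POST", "/v1/memory/search"): "read",
    ("POST", "/v1/memory/cypher"): "read",
    ("POST", "/v1/admin/tenants"): "admin",
    ("GET", "/v1/admin/tenants"): "admin",
    ("POST", "/v1/admin/keys"): "admin",
    ("GET", "/v1/admin/audit"): "admin",
}
# Listed without a scope; assumed here to be reachable without a key.
UNSCOPED = {("GET", "/health"), ("GET", "/metrics")}


def required_scope(method, path):
    """Return the scope a route needs, or None if the route is unknown."""
    if method == "DELETE" and path.startswith("/v1/admin/keys/"):
        return "admin"  # DELETE /v1/admin/keys/<id>
    return REQUIRED_SCOPE.get((method, path))


def is_allowed(method, path, key_scopes):
    """Exact-match scope check; unknown routes are denied."""
    if (method, path) in UNSCOPED:
        return True
    needed = required_scope(method, path)
    return needed is not None and needed in key_scopes
```

For example, a key holding only read may search but not append, and an unknown route is refused regardless of scopes.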

SDKs ship with the server: async Rust, Python, and TypeScript clients built from the same core, plus Idempotency-Key support on writes.
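
Without an SDK, a client can attach the Idempotency-Key header itself. The sketch below assumes the usual meaning of such a header, namely that the server treats a repeated key as the same write, which this page does not spell out. The build_write helper is hypothetical.

```python
import json
import urllib.request
import uuid


def build_write(payload, token, idempotency_key):
    """Build an event append that carries an Idempotency-Key header."""
    return urllib.request.Request(
        "http://localhost:7777/v1/memory/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Idempotency-Key": idempotency_key,
        },
        method="POST",
    )


event = {"kind": "tool_call", "properties": {"name": "web.fetch"}}
token = "00000000-0000-0000-0000-000000000042"

# One key per logical write: generate it once, then reuse it on every retry.
key = str(uuid.uuid4())
first_attempt = build_write(event, token, key)
retry_attempt = build_write(event, token, key)
```

The point of the pattern is that the key is created outside the retry loop, so a retry after a timeout carries the same key as the original attempt.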

Architecture

HTTP        axum + tower-http (tracing, CORS)
auth        Bearer token → tenant, scope check per handler
core        per-tenant engine pool
            ingest · vector search · Cypher · provenance
ffs         graph + vector + columnar, one file per tenant

memspine is deliberately the boring layer. FFS does the hard engine work — pages, WAL, MVCC, HNSW, atomic commits across graph and vector writes. memspine adds what a service needs and an engine shouldn't carry: tenancy, keys and scopes, rate limits, quotas, metrics, and an audit trail.

Operations

Tracing

Every response carries an x-memspine-trace-id header, either caller-supplied or generated, and the same id is attached to every span the request produced. A log aggregator can therefore pivot from one HTTP call to everything it touched.
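
On the client side, propagating a trace id could be sketched as below. It assumes the caller supplies the id in a request header with the same name as the response header, which the page implies but does not state. Both helper names are illustrative.

```python
import uuid
from typing import Mapping, Optional

TRACE_HEADER = "x-memspine-trace-id"


def with_trace_id(headers: Mapping[str, str], trace_id: Optional[str] = None) -> dict:
    """Copy the headers and attach a trace id, generating one if none is given."""
    out = dict(headers)
    out[TRACE_HEADER] = trace_id or uuid.uuid4().hex
    return out


def trace_id_from(response_headers: Mapping[str, str]) -> Optional[str]:
    """Read the trace id back from response headers (names are case-insensitive)."""
    for name, value in response_headers.items():
        if name.lower() == TRACE_HEADER:
            return value
    return None


outbound = with_trace_id({"Authorization": "Bearer <key>"}, trace_id="job-1234")
echoed = trace_id_from({"X-Memspine-Trace-Id": "job-1234"})
```

Logging the value returned by trace_id_from next to the agent's own job id gives the join key for later lookups in the log aggregator.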

Metrics

/metrics exposes Prometheus series per tenant: write rate, query latency, storage growth, vector-index churn.

Rate limits

Per-tenant token buckets keep a runaway agent from taking down its neighbours or silently burning capacity.
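
A per-tenant token bucket can be sketched in a few lines. This is the textbook algorithm, not memspine's code: the rate and capacity values are arbitrary, and the class names are made up for illustration. The clock is injectable so the behaviour can be checked deterministically.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start full
        self._last = clock()

    def try_acquire(self, cost=1.0):
        now = self._clock()
        # Credit tokens for the elapsed time, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class TenantLimiter:
    """Keeps one independent bucket per tenant id."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._buckets = {}

    def allow(self, tenant_id):
        bucket = self._buckets.get(tenant_id)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._capacity, self._clock)
            self._buckets[tenant_id] = bucket
        return bucket.try_acquire()
```

Because each tenant has its own bucket, one tenant exhausting its tokens has no effect on whether another tenant's request is admitted.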

Audit

Admin actions append to an immutable log, paginated over the API.

FAQ

What is agent memory?

The durable record of everything an AI agent does and learns — tool calls, decisions, retrieval traces, derived facts. memspine stores those events as a typed graph with embeddings, so agents recall what happened, find similar past events, and trace where any fact came from.

How is memspine different from a vector database?

A vector database answers one question: what's similar. Agent memory also needs what happened (typed events in order) and where did this come from (provenance edges). memspine keeps all three in one engine instead of bolting a vector store onto a separate event log.

Does it work with any LLM or agent framework?

Yes — it's a plain HTTP API with Bearer-token auth. Anything that can make an HTTP request can append and search memory. SDKs ship for Rust, Python, and TypeScript.

Can I self-host it?

Yes. One command installs a single Rust binary with no dependencies. Each tenant's data lives in its own FFS file on your disk.

How does multi-tenancy work?

Every tenant gets a dedicated database file, per-tenant rate limits, and scoped API keys (read / write / admin). No shared table, so one tenant's load never sits in another's query path.

Status

Early and honest about it. The HTTP surface, scopes, rate limits, admin plane, audit log, vector search, edge traversal, and the FFS storage path work end to end, gated by CI on every push. The v0 executor returns node ids and counts; richer Cypher projections arrive with the FFS columnar read path. Running an agent fleet and want managed memory under it? Write to sd@erp.ai.