Does memspine work with any LLM or agent framework?

Yes. memspine is a plain HTTP API with Bearer-token auth — any agent that can make an HTTP request can append and search memory. SDKs ship for Rust, Python, and TypeScript.

Can I self-host memspine?

Yes. One command installs a single Rust binary with no dependencies:

curl -fsSL https://memspine.com/install.sh | sh

Each tenant's data lives in its own FFS database file on your disk.

memspine — agent memory as a service. Memory for AI agent fleets

What is agent memory?

Agent memory is the durable record of everything an AI agent does and learns: tool calls, decisions, retrieval traces, and derived facts. memspine stores those events as a typed graph with embeddings so agents can recall what happened, find similar past events, and trace where any fact came from.

How is memspine different from a vector database?

A vector database answers one question: what's similar. Agent memory also needs what happened (typed events in order) and where did this come from (provenance edges). memspine stores all three in one engine — vector index, event graph, and Cypher traversal — instead of bolting a vector store onto a separate event log.
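
The three questions above (what is similar, what happened, where did this come from) can be made concrete with a small in-memory sketch. This is not memspine's engine or data model: the Event class, the ToyMemory helper, and the sample data are hypothetical, and only the event kinds and the SOURCE_FROM edge name echo this page.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Event:
    id: int
    kind: str                      # e.g. "tool_call", "decision", "derived_fact"
    embedding: list
    properties: dict = field(default_factory=dict)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyMemory:
    def __init__(self):
        self.events = []           # append order answers "what happened"
        self.source_from = {}      # event id -> ids of the events it came from

    def append(self, kind, embedding, sources=(), **properties):
        event = Event(len(self.events), kind, list(embedding), properties)
        self.events.append(event)
        self.source_from[event.id] = list(sources)
        return event.id

    def similar(self, query, k=1):
        """Answer "what is similar" with a brute-force cosine ranking."""
        ranked = sorted(self.events,
                        key=lambda e: cosine(query, e.embedding),
                        reverse=True)
        return [e.id for e in ranked[:k]]

    def provenance(self, event_id):
        """Answer "where did this come from" by walking SOURCE_FROM edges."""
        seen, stack = [], list(self.source_from[event_id])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.append(current)
                stack.extend(self.source_from[current])
        return sorted(seen)


memory = ToyMemory()
fetch = memory.append("tool_call", [1.0, 0.0, 0.0, 0.0], name="web.fetch")
decision = memory.append("decision", [0.0, 1.0, 0.0, 0.0], sources=[fetch])
fact = memory.append("derived_fact", [0.9, 0.1, 0.0, 0.0], sources=[decision])
```

Here memory.similar([1.0, 0.0, 0.0, 0.0], k=2) ranks the fetch and the derived fact first, while memory.provenance(fact) walks back through the decision to the original tool call. A real engine would replace the brute-force ranking with an index such as HNSW.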

How does multi-tenancy work?

Every tenant gets a dedicated FFS database file, per-tenant rate limits, and scoped API keys (read / write / admin). There is no shared table, so one tenant's load never sits in another's query path.

Quickstart

The installer puts memspine-server and memspine-admin in ~/.local/bin — prebuilt for Linux x86_64 and macOS arm64, checksums verified, no root, no dependencies. Start the server and talk to it with curl; in v0 the Bearer token is the tenant id, so there is no control-plane setup:

# embeddings default to 384 dims; pin to 4 for this toy walkthrough
MEMSPINE_BIND=0.0.0.0:7777 MEMSPINE_EMBEDDING_DIM=4 memspine-server

# append, then traverse provenance
curl -X POST localhost:7777/v1/memory/event \
  -H "Authorization: Bearer 00000000-0000-0000-0000-000000000042" \
  -H "Content-Type: application/json" \
  -d '{"kind":"tool_call","embedding":[0.1,0.2,0.3,0.4],
       "properties":{"name":"web.fetch","url":"https://example.com"}}'

curl -X POST localhost:7777/v1/memory/cypher \
  -H "Authorization: Bearer 00000000-0000-0000-0000-000000000042" \
  -H "Content-Type: application/json" \
  -d '{"query":"MATCH (n:Event)-[:SOURCE_FROM]->(m:Event) RETURN count(n)"}'

Every event lands in the tenant's own FFS database file — one file per tenant, opened on first request and kept warm in a pool. There is no shared table for a noisy neighbour to sit in.
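
For agents that are not shell scripts, the same two calls can be built with nothing but the Python standard library. This is a sketch rather than the Python SDK: build_request is a hypothetical helper, and the base URL assumes the server started in the walkthrough above is listening locally on port 7777. Response bodies are not described on this page, so the sketch stops at constructing the requests.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7777"                 # matches MEMSPINE_BIND in the walkthrough
TENANT = "00000000-0000-0000-0000-000000000042"    # v0: the Bearer token is the tenant id


def build_request(path, body, token=TENANT):
    """Build (but do not send) a JSON POST carrying the Bearer token."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same payloads as the curl commands in the walkthrough.
append_req = build_request("/v1/memory/event", {
    "kind": "tool_call",
    "embedding": [0.1, 0.2, 0.3, 0.4],
    "properties": {"name": "web.fetch", "url": "https://example.com"},
})

cypher_req = build_request("/v1/memory/cypher", {
    "query": "MATCH (n:Event)-[:SOURCE_FROM]->(m:Event) RETURN count(n)",
})

# With a server running, send either request like this:
#   with urllib.request.urlopen(append_req) as resp:
#       print(resp.status, resp.read().decode())
```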

API

Authenticate with an Authorization: Bearer <key> header. Each key carries scopes (read, write, admin), and every handler checks the scope its route requires.

Route	Scope	Description
GET /health	—	liveness probe
GET /metrics	—	Prometheus exposition, per-tenant series
GET /v1/whoami	read	echo the resolved tenant
POST /v1/memory/event	write	append a typed event, with optional embedding
POST /v1/memory/search	read	vector search, optional Cypher predicate
POST /v1/memory/cypher	read	Cypher over events and provenance edges
POST /v1/admin/tenants	admin	create tenant
GET /v1/admin/tenants	admin	list tenants
POST /v1/admin/keys	admin	mint a key — plaintext returned once
DELETE /v1/admin/keys/<id>	admin	revoke a key
GET /v1/admin/audit	admin	paginate the admin audit log
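
The route table can be read as a lookup from (method, path) to a required scope. The sketch below encodes it that way. The function names are hypothetical, and it assumes scopes are independent (an admin key does not automatically get read or write), which this page does not state either way.

```python
# Required scope per route, copied from the table above. None means no key is needed.
ROUTE_SCOPES = {
    ("GET", "/health"): None,
    ("GET", "/metrics"): None,
    ("GET", "/v1/whoami"): "read",
    ("POST", "/v1/memory/event"): "write",
    ("POST", "/v1/memory/search"): "read",
    ("POST", "/v1/memory/cypher"): "read",
    ("POST", "/v1/admin/tenants"): "admin",
    ("GET", "/v1/admin/tenants"): "admin",
    ("POST", "/v1/admin/keys"): "admin",
    ("GET", "/v1/admin/audit"): "admin",
}


def required_scope(method, path):
    # DELETE /v1/admin/keys/<id> carries a path parameter, so match it by prefix.
    if method == "DELETE" and path.startswith("/v1/admin/keys/"):
        return "admin"
    return ROUTE_SCOPES[(method, path)]   # raises KeyError for unknown routes


def is_allowed(key_scopes, method, path):
    """Assumption: scopes are independent; admin does not imply read or write."""
    needed = required_scope(method, path)
    return needed is None or needed in key_scopes
```

Under these assumptions a read-only key can run Cypher queries but cannot append events, and /health stays reachable with no key at all.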

SDKs ship with the server: async Rust, Python, and TypeScript clients built from the same core, plus Idempotency-Key support on writes.
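
Idempotency keys matter when a write times out and the agent cannot tell whether the event landed. A minimal sketch of the client side, using only the standard library: idempotent_write is a hypothetical helper, the key format (a UUID) is an assumption, and how the server answers a repeated key is not described on this page.

```python
import json
import urllib.request
import uuid

TENANT = "00000000-0000-0000-0000-000000000042"   # v0: the Bearer token is the tenant id


def idempotent_write(url, token, body, idempotency_key=None):
    """Build a POST whose Idempotency-Key stays fixed across retries."""
    key = idempotency_key or str(uuid.uuid4())
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Idempotency-Key": key,
        },
        method="POST",
    )
    return key, request


event = {"kind": "decision", "properties": {"choice": "retry"}}
key, first = idempotent_write("http://localhost:7777/v1/memory/event", TENANT, event)

# A retry after a timeout reuses the same key instead of minting a new one.
_, retry = idempotent_write(first.full_url, TENANT, event, idempotency_key=key)
```

The design point is that the key is chosen once per logical write, not once per HTTP attempt.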

Architecture

HTTP        axum + tower-http (tracing, CORS)
auth        Bearer token → tenant, scope check per handler
core        per-tenant engine pool
            ingest · vector search · Cypher · provenance
ffs         graph + vector + columnar, one file per tenant

memspine is deliberately the boring layer. FFS does the hard engine work — pages, WAL, MVCC, HNSW, atomic commits across graph and vector writes. memspine adds what a service needs and an engine shouldn't carry: tenancy, keys and scopes, rate limits, quotas, metrics, and an audit trail.
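
The per-tenant engine pool in the diagram above can be pictured as a cache keyed by tenant id that opens a file on first use. The sketch below illustrates that idea only. EnginePool, the .ffs file naming, and the fake_open stand-in are hypothetical; the real pool is Rust and may evict, size, or lock differently.

```python
import threading
from pathlib import Path


class EnginePool:
    """Open one engine per tenant on first use and keep it cached."""

    def __init__(self, data_dir, open_engine):
        self._data_dir = Path(data_dir)
        self._open_engine = open_engine   # stand-in for opening an FFS database file
        self._engines = {}
        self._lock = threading.Lock()

    def path_for(self, tenant_id):
        # Hypothetical layout: <data_dir>/<tenant_id>.ffs
        return self._data_dir / f"{tenant_id}.ffs"

    def get(self, tenant_id):
        with self._lock:
            engine = self._engines.get(tenant_id)
            if engine is None:
                engine = self._open_engine(self.path_for(tenant_id))
                self._engines[tenant_id] = engine
            return engine


opened = []


def fake_open(path):
    # Records which files were opened; a real opener would return an engine handle.
    opened.append(path.name)
    return {"path": path, "events": []}


pool = EnginePool("/tmp/memspine-data", fake_open)
a1 = pool.get("tenant-a")
a2 = pool.get("tenant-a")   # served from the pool, not reopened
b = pool.get("tenant-b")    # a different tenant gets a different file
```

Because each tenant resolves to its own handle, a query for tenant-a never touches tenant-b's file. That is the isolation property this page describes.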

Operations

Tracing

Every response carries x-memspine-trace-id, either caller-supplied or generated, and the same id is attached to every span the request produced, so a log aggregator can pivot from one HTTP call to everything it touched.
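
The "caller-supplied or generated" rule can be sketched in a few lines. This illustrates the pattern, not the server's code. It assumes the caller supplies the id in a request header of the same name, and the generated format (a UUID hex string) and the span names are hypothetical.

```python
import uuid

TRACE_HEADER = "x-memspine-trace-id"


def resolve_trace_id(request_headers):
    """Reuse the caller's trace id if present, otherwise generate one."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    supplied = lowered.get(TRACE_HEADER, "").strip()
    return supplied or uuid.uuid4().hex


def handle(request_headers):
    trace_id = resolve_trace_id(request_headers)
    # Every span produced while serving the request carries the same id.
    spans = [{"name": name, "trace_id": trace_id}
             for name in ("auth", "engine.query")]
    response_headers = {TRACE_HEADER: trace_id}
    return response_headers, spans
```

An agent that sets its own id per step can then search its logs and the service's logs for one shared value.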

Metrics

/metrics exposes Prometheus series per tenant: write rate, query latency, storage growth, vector-index churn.
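
A quick way to eyeball per-tenant series is to group the exposition text by its tenant label. The sketch below parses a hand-written sample. The metric names and the tenant label name are hypothetical, since this page does not list the actual series, and the parser handles only simple labelled lines with no timestamps or escaped quotes.

```python
import re
from collections import defaultdict

# Hand-written sample in Prometheus text format; metric names are made up.
SAMPLE = '''\
# HELP memspine_writes_total Events appended (hypothetical metric name).
# TYPE memspine_writes_total counter
memspine_writes_total{tenant="a"} 120
memspine_writes_total{tenant="b"} 7
memspine_storage_bytes{tenant="a"} 4096
'''

LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def by_tenant(text):
    """Group labelled samples as {tenant: {metric_name: value}}."""
    grouped = defaultdict(dict)
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:                 # skips comments, blanks, unlabelled samples
            continue
        labels = dict(LABEL.findall(match["labels"]))
        if "tenant" in labels:
            grouped[labels["tenant"]][match["name"]] = float(match["value"])
    return dict(grouped)


per_tenant = by_tenant(SAMPLE)
```

For anything beyond a spot check, a Prometheus server or a full text-format parser is the better tool.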

Rate limits

Per-tenant token buckets keep a runaway agent from taking down its neighbours or silently burning capacity.
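
A token bucket refills at a steady rate up to a burst cap, and each request spends one token. The sketch below keeps one bucket per tenant. The class names, the rate, and the burst size are illustrative values, not memspine's defaults, and the clock is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    def __init__(self, rate_per_sec, burst, now=0.0):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)    # start full so short bursts are allowed
        self.updated = now

    def allow(self, now, cost=1.0):
        # Refill for the elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TenantLimiter:
    """One independent bucket per tenant, created on first request."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec
        self.burst = burst
        self.buckets = {}

    def allow(self, tenant_id, now):
        bucket = self.buckets.get(tenant_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, now)
            self.buckets[tenant_id] = bucket
        return bucket.allow(now)


limiter = TenantLimiter(rate_per_sec=1.0, burst=2)
runaway = [limiter.allow("tenant-a", now=0.0) for _ in range(3)]
neighbour = limiter.allow("tenant-b", now=0.0)
recovered = limiter.allow("tenant-a", now=1.0)
```

The third request from tenant-a is rejected, tenant-b is unaffected, and tenant-a is admitted again once a second has passed. In a live service the now argument would come from a monotonic clock such as time.monotonic().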

Audit

Admin actions append to an immutable log, paginated over the API.
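
Reading a paginated log usually means following a cursor until the server stops returning one. The sketch below shows that loop against an in-memory stand-in. The cursor and limit parameters, the entries and next_cursor fields, and the action names are all hypothetical, because this page does not document the response shape of GET /v1/admin/audit.

```python
def iter_audit(fetch_page, limit=2):
    """Yield audit entries page by page until no next cursor is returned."""
    cursor = None
    while True:
        page = fetch_page(cursor=cursor, limit=limit)
        yield from page["entries"]
        cursor = page.get("next_cursor")
        if cursor is None:
            break


# In-memory stand-in for GET /v1/admin/audit; the real response shape may differ.
LOG = [{"seq": i, "action": action}
       for i, action in enumerate(["tenant.create", "key.mint", "key.revoke"])]


def fake_fetch(cursor, limit):
    start = cursor or 0
    end = start + limit
    next_cursor = end if end < len(LOG) else None
    return {"entries": LOG[start:end], "next_cursor": next_cursor}


actions = [entry["action"] for entry in iter_audit(fake_fetch)]
```

Swapping fake_fetch for a function that performs the real HTTP call leaves the loop unchanged.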

Status

Early and honest about it. The HTTP surface, scopes, rate limits, admin plane, audit log, vector search, edge traversal, and the FFS storage path work end to end, gated by CI on every push. The v0 executor returns node ids and counts; richer Cypher projections arrive with the FFS columnar read path. Running an agent fleet and want managed memory under it? Write to sd@erp.ai.

Memory for AI agent fleets, behind one HTTP service. Typed events, vector search, and provenance — multi-tenant over FFS, in one Rust binary.