Persistent memory for AI agents.
Postgres-backed, schema-first, multi-tenant. One SDK, two transports: Sibyl Cloud or your own database. The substrate that scored #2 on LongMemEval, packaged for production.
Sibyl Memory is the file-based agent memory layer that scored 95.6% on LongMemEval Oracle. Postgres-backed, multi-tenant, no vector database, no embeddings. From a single operator to a million users.
LongMemEval Oracle (ICLR 2025, University of Michigan) is the standard benchmark for long-horizon agent memory. 500 questions across six categories. Our score is public, reproducible, and live on the leaderboard.
| Rank | System | Score | Architecture |
|---|---|---|---|
| 1 | agentmemory V4 | 96.2% | embedding-based |
| 2 | Sibyl Memory | 95.6% | file-based · zero vectors |
| 2 | Chronos (PwC) | 95.6% | embedding-based |
| 4 | Mastra Observational Memory | 94.9% | embedding-based |
| 5 | MemMachine | 93.0% | embedding-based |
| 6 | Hindsight (Vectorize) | 91.4% | vector DB |
| | Mem0 · Zep · Supermemory · Emergence AI · Oracle baseline | all below the top tier | |
We did not optimize for the benchmark. We optimized for production efficiency. The benchmark improvement was a side effect.
The full report (per-category breakdown, runtime cost, ablation on the file-based architecture) is at blog.sibylcap.com/longmemeval-v2.
Persistent state belongs in a schema, not a chatbot's context window.
Sibyl Memory is one product in the lab's surface. For the full catalog including SIBYL Framework, Hermes plugin, Ping protocol, and x402 endpoints, see /products.
Same architecture across every deployment. File-based at the substrate, Postgres-backed in production, multi-tenant by namespace. Different access patterns optimized for different team problems. From a single operator to a million users.
Full-stack memory for one operator. Priorities, journal, entities, scars, relationships, arc. The same shape Sibyl uses to operate herself, packaged for delivery to anyone running long-horizon work through an AI.
Buyers: indie founders, solo researchers, AI-first builders, autonomous agent operators · single-tenant per operator, multi-month operational continuity
Per-user persistent memory at platform scale. Tracks history, extracts patterns, surfaces back via your UI. Each user gets an isolated namespace that grows with their behavior.
Buyers: prediction markets, social platforms, agent platforms · 1K to 1M+ active users per platform
Three more shapes (Conversational Continuity, Agent Reputation, and Org Memory) are in the development pipeline. See Products → In development for the full surface.
The multi-tenant pattern: per-tenant schema namespace, ephemeral agent runtime, the same hierarchical memory shape across every use case. Each request loads exactly the requesting tenant's slice, processes the turn, writes back, exits. Memory persists in Postgres. Agents do not persist at all.
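The load, process, write back, exit cycle can be sketched in a few lines. This is a minimal illustration, not Sibyl Memory's implementation: SQLite stands in for Postgres so the code runs without a server, and the `memory` table, its columns, and `handle_turn` are hypothetical names.

```python
import json
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create a throwaway store. SQLite is a stand-in for Postgres here."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE memory ("
        " tenant_id TEXT NOT NULL,"
        " key TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " PRIMARY KEY (tenant_id, key))"
    )
    return conn


def handle_turn(conn: sqlite3.Connection, tenant_id: str, message: str) -> dict:
    """One ephemeral turn: load the tenant's slice, process, write back, exit."""
    # 1. Load only the requesting tenant's rows.
    rows = conn.execute(
        "SELECT key, value FROM memory WHERE tenant_id = ?", (tenant_id,)
    ).fetchall()
    state = {key: json.loads(value) for key, value in rows}

    # 2. Process the turn. A real agent would call a model here; this
    #    placeholder only appends the message to a journal.
    journal = state.get("journal", [])
    journal.append(message)
    state["journal"] = journal

    # 3. Write back inside one transaction (commit on success).
    with conn:
        conn.execute(
            "INSERT INTO memory (tenant_id, key, value) VALUES (?, ?, ?)"
            " ON CONFLICT (tenant_id, key) DO UPDATE SET value = excluded.value",
            (tenant_id, "journal", json.dumps(journal)),
        )

    # 4. Exit: nothing is kept in process memory between turns.
    return state
```

Calling `handle_turn` twice for the same tenant shows the second call picking up the first call's journal from the store, while a different tenant starts from an empty slice.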
Multi-tenant by tenant_id. Single source of truth per entity is enforced as a UNIQUE constraint at the database level, so a bug cannot create two facts about the same entity. Job queue built on row locking with SKIP LOCKED, plus retries and a dead-letter queue (DLQ). Event fabric via LISTEN/NOTIFY. Append-only audit on every destructive admin action. GDPR-grade delete_user_cascade() built in.
Every row in every table carries tenant_id UUID NOT NULL. Single-source-of-truth is a UNIQUE constraint, not a convention. No application-level access bug can leak across tenants.
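A minimal sketch of what "a constraint, not a convention" means in practice. The `entity_fact` table, its columns, and `insert_fact` are hypothetical, SQLite stands in for Postgres so the example runs anywhere, and `tenant_id` is plain text here rather than a UUID.

```python
import sqlite3

# Hypothetical schema: one fact per (tenant, entity key), enforced by the database.
SCHEMA = """
CREATE TABLE entity_fact (
    tenant_id  TEXT NOT NULL,
    entity_key TEXT NOT NULL,
    fact       TEXT NOT NULL,
    UNIQUE (tenant_id, entity_key)
)
"""


def make_fact_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def insert_fact(conn, tenant_id, entity_key, fact) -> bool:
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        with conn:  # rolls back automatically if the insert fails
            conn.execute(
                "INSERT INTO entity_fact (tenant_id, entity_key, fact)"
                " VALUES (?, ?, ?)",
                (tenant_id, entity_key, fact),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_fact_db()
print(insert_fact(conn, "t1", "user:42:city", "Lisbon"))  # True: first write
print(insert_fact(conn, "t1", "user:42:city", "Porto"))   # False: duplicate rejected
print(insert_fact(conn, "t2", "user:42:city", "Porto"))   # True: other tenant
print(insert_fact(conn, None, "user:7:city", "Faro"))     # False: tenant_id is NOT NULL
```

The application code never checks for duplicates itself. The rejection comes from the database, which is the point of putting the rule in the schema.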
10,000 concurrent users does not mean 10,000 long-running agent processes. Each request spins up an ephemeral agent: load memory, do work, exit. Realistic peak concurrency is 10–50 in-flight requests, not the user count.
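The gap between user count and in-flight requests can be estimated with Little's law (average concurrency = arrival rate × time per request). The workload numbers below are illustrative assumptions, not measurements from Sibyl Memory.

```python
def in_flight_requests(active_users: int,
                       requests_per_user_per_hour: float,
                       seconds_per_request: float) -> float:
    """Little's law: average concurrency = arrival rate x time in system."""
    arrivals_per_second = active_users * requests_per_user_per_hour / 3600
    return arrivals_per_second * seconds_per_request


# Assumed workload: 10,000 users, 6 requests per user per hour, 2 s per request.
print(round(in_flight_requests(10_000, 6, 2.0), 1))  # 33.3
```

Under these assumptions the average lands around 33 in-flight requests, inside the 10–50 range quoted above. Little's law gives an average, so bursts will run higher and capacity should be sized with headroom.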
Schema-led retrieval over Postgres indexes means zero embedding-API cost and zero vector-DB hosting cost. At 100K active users, competitors pay $10K–30K/month in that layer alone. We pay zero.
What buyers and integrators actually ask before committing.
No. Sibyl Memory is file-based at the substrate and Postgres-backed in production. There is no vector database, no embedding pipeline, and no external retrieval service. Schema is imposed at write time, not inferred at read time. That is the architectural choice that produced the 95.6% result on a 4 vCPU / 16GB EC2 instance.
A single Postgres query against an indexed namespace. In practice, p50 sits in the low single-digit milliseconds for hot-tier reads, with no embedding round-trip and no vector-DB hop. Cold-tier reads incur the cost of decompression. Production-grade benchmarks are in the full report at blog.sibylcap.com/longmemeval-v2.
Yes. Sibyl Memory uses standard Postgres 14+ features (JSONB, partial indexes, triggers, materialized views). RDS, Aurora, Neon, Supabase, and self-hosted Postgres all work. The self-host tier is BYOC by design.
Yes. Tenant-scoped cascade delete is a first-class operation. Removing a user namespace removes their messages, memory entries, audit log rows, and any derived state in a single transaction. The tamper-proof audit log, written on every write, keeps a record of the deletion itself for compliance. EU AI Act export is supported.
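A single-transaction cascade delete might look roughly like this. It is a sketch under stated assumptions, not the product's delete_user_cascade(): the table names, the separate `deletion_log` table, and `cascade_delete_tenant` are hypothetical, and SQLite stands in for Postgres.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical tenant-scoped tables that the cascade must clear.
TENANT_TABLES = ("messages", "memory_entries", "derived_state")


def make_tenant_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    for table in TENANT_TABLES:
        conn.execute(
            f"CREATE TABLE {table} (tenant_id TEXT NOT NULL, body TEXT NOT NULL)"
        )
    # Assumption: the deletion record lives in its own append-only table.
    conn.execute(
        "CREATE TABLE deletion_log ("
        " tenant_id TEXT NOT NULL,"
        " deleted_at TEXT NOT NULL,"
        " rows_removed INTEGER NOT NULL)"
    )
    return conn


def cascade_delete_tenant(conn: sqlite3.Connection, tenant_id: str) -> int:
    """Delete every row for one tenant and record the deletion, atomically."""
    removed = 0
    with conn:  # commit on success, roll back everything on any exception
        for table in TENANT_TABLES:
            cur = conn.execute(
                f"DELETE FROM {table} WHERE tenant_id = ?", (tenant_id,)
            )
            removed += cur.rowcount
        conn.execute(
            "INSERT INTO deletion_log (tenant_id, deleted_at, rows_removed)"
            " VALUES (?, ?, ?)",
            (tenant_id, datetime.now(timezone.utc).isoformat(), removed),
        )
    return removed
```

Because the deletes and the log entry share one transaction, a failure partway through leaves both the tenant's data and the log unchanged, so the log never claims a deletion that did not complete.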
Vector approaches infer structure at read time via embedding similarity. That is good for fuzzy semantic recall and bad for everything that needs a precise answer. Sibyl Memory imposes structure at write time via schema, which is good for everything that needs a precise answer and slightly less elegant for unstructured ambient recall. The 95.6% LongMemEval score is the proof point that schema-first does not lose to embedding-first on the questions that matter to autonomous agents.
The LongMemEval Oracle benchmark itself is public (ICLR 2025, University of Michigan). The Sibyl Memory implementation is licensed; the architecture, methodology, and per-category results are documented in the full report at blog.sibylcap.com/longmemeval-v2. For deeper architecture review under NDA, contact [email protected].
Sibyl Labs, LLC was formed in April 2026 to wrap the agentic infrastructure work in a real legal entity. The lab builds memory systems, agentic frameworks, and the supporting tooling that makes long-running autonomous agents possible. Every product we sell is the same architecture our own agent operates on.
The thesis is not complicated. Most agents forget. The ones that remember are built on architecture that scales. We publish the work in public, benchmark it in the open, and ship the substrate so others can build the next generation of agents on something that has already survived production.
Memory is one shape of the work. Frameworks are another. Custom builds for partners are a third. The output is the same: infrastructure for agents that operate, not demo.
For self-host, BYOC, or air-gapped deployments. For research collaboration on memory or agent benchmarks. For bespoke memory or framework integrations sized to a partner's specific architecture.