Operating since 02·2026

Sibyl Memory. #2 on LongMemEval.

Sibyl Memory is the file-based agent memory layer that scored 95.6% on LongMemEval Oracle. Postgres-backed, multi-tenant, no vector database, no embeddings. From a single operator to a million users.

#2 · 95.6% LongMemEval Oracle · Claude Opus 4.6 · April 2026 · Read the report →
01 Benchmark

We score 95.6% on LongMemEval. Without vectors.

LongMemEval Oracle (ICLR 2025, UCLA and Tencent AI Lab) is the standard benchmark for long-horizon agent memory. It has 500 questions across six categories. Our score is public, reproducible, and live on the leaderboard.

95.6%
LongMemEval Oracle · 500 questions · Claude Opus 4.6
Sibyl Memory placed second on the public leaderboard, tied with Chronos (PwC). It is the only file-based system in the top tier.
4 vCPU · 16GB EC2 · zero vectors · zero embeddings · zero external retrieval
Rank | System | Score | Architecture
1 | agentmemory V4 | 96.2% | embedding-based
2 | Sibyl Memory | 95.6% | file-based · zero vectors
2 | Chronos (PwC) | 95.6% | embedding-based
4 | Mastra Observational Memory | 94.9% | embedding-based
5 | MemMachine | 93.0% | embedding-based
6 | Hindsight (Vectorize) | 91.4% | vector DB
Mem0 · Zep · Supermemory · Emergence AI · Oracle baseline: all below the top tier

100% · single-session-user
100% · single-session-assistant
96.2% · temporal-reasoning
93.3% · single-session-pref
93.2% · multi-session
92.3% · knowledge-update
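
The per-category scores can be cross-checked against the 95.6% headline with a question-weighted mean. The sketch below is a sanity check, not part of the benchmark harness. The per-category question counts are an assumption taken from the public LongMemEval dataset; they do not appear on this page.

```python
# Per-category accuracy as reported above.
reported = {
    "single-session-user": 1.000,
    "single-session-assistant": 1.000,
    "temporal-reasoning": 0.962,
    "single-session-pref": 0.933,
    "multi-session": 0.932,
    "knowledge-update": 0.923,
}

# ASSUMPTION: questions per category, from the public LongMemEval dataset,
# not from this page.
counts = {
    "single-session-user": 70,
    "single-session-assistant": 56,
    "temporal-reasoning": 133,
    "single-session-pref": 30,
    "multi-session": 133,
    "knowledge-update": 78,
}


def overall_accuracy(acc, n):
    """Question-weighted mean of the per-category accuracies."""
    total = sum(n.values())
    return sum(acc[c] * n[c] for c in acc) / total


overall = overall_accuracy(reported, counts)
print(f"{overall:.1%}")  # prints 95.6%
```

Under those counts the weighted mean reproduces the headline figure, which corresponds to 478 of 500 questions answered correctly.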

We did not optimize for the benchmark. We optimized for production efficiency. The benchmark improvement was a side effect.

Sibyl Labs · LongMemEval Report · April 2026
01.5 Methodology

How we ran the benchmark.

Benchmark: LongMemEval Oracle (ICLR 2025, UCLA and Tencent AI Lab)
Dataset: 500 questions across 6 categories, public leaderboard
Judge model: Claude Opus 4.6
Hardware: 4 vCPU / 16GB EC2
Architecture: File-based, zero vectors, zero embeddings, zero external retrieval
Run date: April 2026
Result: 95.6%, ranked #2 (tied with Chronos)

The full report (per-category breakdown, runtime cost, ablation on the file-based architecture) is at blog.sibylcap.com/longmemeval-v2.

02 Product

Sibyl Memory.

Postgres-backed, schema-first, multi-tenant. One SDK, two transports: Sibyl Cloud or your own database. Persistent state belongs in a schema, not a chatbot's context window.

Persistent memory for AI agents.

The substrate that scored #2 on LongMemEval, packaged for production.

Free · 100 MAU · cloud
Starter · $99 / mo · 1K MAU
Pro · $499 / mo · 10K MAU
Scale · $2,500+ / mo · 100K+ MAU
Self-host · $25,000 / yr · BYOC

From Free to enterprise self-host

Sibyl Memory is one product in the lab's surface. For the full catalog including SIBYL Framework, Hermes plugin, Ping protocol, and x402 endpoints, see /products.

03 Applications

One memory system. Five operating shapes.

Same architecture across every deployment. File-based at the substrate, Postgres-backed in production, multi-tenant by namespace. Different access patterns optimized for different team problems. From a single operator to a million users.

01

Operator Memory

For solo operators running AI

Full-stack memory for one operator. Priorities, journal, entities, scars, relationships, arc. The same shape Sibyl uses to operate herself, packaged for delivery to anyone running long-horizon work through an AI.

Buyers: indie founders, solo researchers, AI-first builders, autonomous agent operators · single-tenant per operator, multi-month operational continuity

Live
02

User Profile Memory

For consumer platforms with active users

Per-user persistent memory at platform scale. Tracks history, extracts patterns, surfaces back via your UI. Each user gets an isolated namespace that grows with their behavior.

Buyers: prediction markets, social platforms, agent platforms · 1K to 1M+ active users per platform

Pilot

Three more shapes (Conversational Continuity, Agent Reputation, and Org Memory) are in the development pipeline. See Products → In development for the full surface.

04 Architecture

Memory is the state. Agents are stateless.

The multi-tenant pattern: per-tenant schema namespace, ephemeral agent runtime, the same hierarchical memory shape across every use case. Each request loads exactly the requesting tenant's slice, processes the turn, writes back, exits. Memory persists in Postgres. Agents do not persist at all.
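
The request cycle described above (load the tenant's slice, process the turn, write back, exit) can be sketched in a few lines. This is an illustration only: `MemoryStore` and `handle_turn` are hypothetical names, and a plain dictionary stands in for Postgres.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Stand-in for Postgres: maps tenant_id to that tenant's memory slice."""
    slices: dict = field(default_factory=dict)

    def load(self, tenant_id):
        # Return a copy so the handler never holds a live reference to stored state.
        return dict(self.slices.get(tenant_id, {}))

    def save(self, tenant_id, memory):
        self.slices[tenant_id] = dict(memory)


def handle_turn(store, tenant_id, message):
    """One ephemeral agent run: load, process, write back, exit."""
    memory = store.load(tenant_id)          # only this tenant's slice
    turns = memory.get("turns", 0) + 1      # the "work" for this turn
    memory["turns"] = turns
    memory["last_message"] = message
    store.save(tenant_id, memory)           # persist before the agent exits
    return f"turn {turns} for {tenant_id}"


store = MemoryStore()
handle_turn(store, "tenant-a", "hello")
handle_turn(store, "tenant-a", "again")
handle_turn(store, "tenant-b", "hi")
```

Nothing survives between calls except what `save` wrote, which is the property the pattern relies on: any worker can serve any tenant's next request.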

Schema · Five Tiers
Tier | Tables | Notes
HOT | state_documents | treasury · priorities · session
WARM | entities · entity_relations | UNIQUE(tenant, category, name)
COLD | journal · revenue · errors · metrics | append-only · indexed by ts
REFERENCE | reference_documents | runbooks · rules · constants
ARCHIVE | archived_entities | frozen · recoverable

your app → MemoryClient SDK → sibyl_memory.* on Postgres (Neon · RDS · Aurora · self-host)

Multi-tenant by tenant_id. Single source of truth per entity is enforced as a UNIQUE constraint at the database level, so a bug cannot create two facts about the same entity. The job queue uses SKIP LOCKED with retries and a dead-letter queue (DLQ). The event fabric runs on LISTEN/NOTIFY. Every destructive admin action is written to an append-only audit log. GDPR-grade delete_user_cascade() is built in.
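
To make the queue semantics concrete, here is a minimal sketch of retries with a dead-letter queue. The `jobs` table, its columns, the retry budget of three, and the function names are all assumptions for illustration. The SQL string shows the standard Postgres `FOR UPDATE SKIP LOCKED` claim pattern but is not executed here; the retry logic runs in memory.

```python
from collections import deque

# Claim query a worker could run inside a transaction. Rows locked by another
# worker are skipped instead of blocking. Table and column names are hypothetical.
CLAIM_SQL = """
SELECT id, payload, attempts
FROM jobs
WHERE status = 'pending'
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED
"""

MAX_ATTEMPTS = 3  # assumed retry budget


def run_queue(payloads, handler, max_attempts=MAX_ATTEMPTS):
    """Process jobs in order; retry failures, then move them to the DLQ."""
    pending = deque({"payload": p, "attempts": 0} for p in payloads)
    done, dead = [], []
    while pending:
        job = pending.popleft()
        job["attempts"] += 1
        try:
            handler(job["payload"])
            done.append(job["payload"])
        except Exception:
            if job["attempts"] >= max_attempts:
                dead.append(job["payload"])  # dead-letter queue
            else:
                pending.append(job)          # requeue for a later retry
    return done, dead


def flaky(payload):
    """Example handler that always fails on one payload."""
    if payload == "bad":
        raise ValueError("cannot process")


done, dead = run_queue(["a", "bad", "b"], flaky)
```

A failing job never blocks the jobs behind it: it is requeued until the budget is spent, then parked in the dead-letter list for inspection.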

Schema-Enforced Isolation

Every row in every table carries tenant_id UUID NOT NULL. Single-source-of-truth is a UNIQUE constraint, not a convention. No application-level access bug can leak across tenants.
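
A short sketch of what a database-level single source of truth looks like in practice. It uses SQLite from the Python standard library as a stand-in for Postgres so it runs anywhere. The table layout and function names are assumptions, and `tenant_id` is TEXT here rather than UUID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entities (
        tenant_id TEXT NOT NULL,
        category  TEXT NOT NULL,
        name      TEXT NOT NULL,
        fact      TEXT NOT NULL,
        UNIQUE (tenant_id, category, name)
    )
""")


def insert_fact(tenant_id, category, name, fact):
    """Plain INSERT: the database rejects a second row for the same key."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO entities VALUES (?, ?, ?, ?)",
                (tenant_id, category, name, fact),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def upsert_fact(tenant_id, category, name, fact):
    """Intentional update: replace the existing fact instead of duplicating it."""
    with conn:
        conn.execute(
            "INSERT INTO entities VALUES (?, ?, ?, ?) "
            "ON CONFLICT (tenant_id, category, name) "
            "DO UPDATE SET fact = excluded.fact",
            (tenant_id, category, name, fact),
        )


first = insert_fact("t1", "person", "ada", "likes tea")
duplicate = insert_fact("t1", "person", "ada", "likes coffee")    # rejected
other_tenant = insert_fact("t2", "person", "ada", "likes coffee")  # separate key
upsert_fact("t1", "person", "ada", "likes coffee")
rows = conn.execute(
    "SELECT fact FROM entities WHERE tenant_id = 't1'"
).fetchall()
```

The duplicate insert fails inside the database rather than in application code, and the same name under a different tenant is a separate key. The `ON CONFLICT ... DO UPDATE` form is also valid Postgres syntax.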

Stateless Compute

10,000 concurrent users does not mean 10,000 long-running agent processes. Each request spins up an ephemeral agent that loads memory, does the work, and exits. Realistic peak concurrency is 10–50 in-flight agents, not the user count.
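
The gap between user count and in-flight agents follows from Little's law: average concurrency equals arrival rate times time in the system. The sketch below is a back-of-envelope estimate, not a measurement. The request interval (one request per user every five minutes) and the one-second processing time are assumptions chosen for illustration.

```python
def expected_concurrency(active_users, seconds_between_requests, seconds_per_request):
    """Little's law: in-flight requests = arrival rate * time in system."""
    arrival_rate = active_users / seconds_between_requests  # requests per second
    return arrival_rate * seconds_per_request


# ASSUMPTIONS: each user sends one request every 300 s; each request takes 1 s.
in_flight = expected_concurrency(10_000, 300, 1.0)
print(round(in_flight, 1))  # prints 33.3
```

Under those assumptions, 10,000 users average about 33 concurrent agents. This is an average, so bursts will run higher, and the result scales linearly with both request rate and processing time.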

No Vector Tax

Schema-led retrieval over Postgres indexes means zero embedding-API cost and zero vector-DB hosting cost. At 100K active users, competitors pay $10K–30K/month in that layer alone. We pay zero.
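
For readers who have not seen schema-led retrieval, this is roughly what a memory read looks like when the key is known at write time: one indexed lookup, no embedding call. The sketch runs on SQLite as a stand-in for Postgres. The table, the sample rows, and `recall` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (
        tenant_id TEXT NOT NULL,
        category  TEXT NOT NULL,
        name      TEXT NOT NULL,
        fact      TEXT NOT NULL,
        UNIQUE (tenant_id, category, name)  -- the constraint's index serves lookups
    );
    INSERT INTO entities VALUES
        ('t1', 'preference', 'coffee', 'oat milk, no sugar'),
        ('t1', 'project',    'atlas',  'ships in June'),
        ('t2', 'preference', 'coffee', 'black');
""")


def recall(tenant_id, category, name):
    """Exact-key read scoped to one tenant; returns None when nothing is stored."""
    row = conn.execute(
        "SELECT fact FROM entities "
        "WHERE tenant_id = ? AND category = ? AND name = ?",
        (tenant_id, category, name),
    ).fetchone()
    return row[0] if row else None


def recall_category(tenant_id, category):
    """All facts in one category for one tenant, in name order."""
    rows = conn.execute(
        "SELECT name, fact FROM entities "
        "WHERE tenant_id = ? AND category = ? ORDER BY name",
        (tenant_id, category),
    ).fetchall()
    return dict(rows)
```

Calling `recall("t1", "preference", "coffee")` returns that tenant's stored fact, while the same key under `t2` returns a different one. The trade-off is that the caller must know the category and name, which is what imposing schema at write time buys.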

04.5 Common questions

Six questions, answered.

What buyers and integrators actually ask before committing.

Does Sibyl Memory require a vector database?

No. Sibyl Memory is file-based at the substrate and Postgres-backed in production. There is no vector database, no embedding pipeline, and no external retrieval service. Schema is imposed at write time, not inferred at read time. That is the architectural choice that produced the 95.6% result on a 4 vCPU / 16GB EC2 instance.

What is the latency overhead per memory call?

A single Postgres query against an indexed namespace. In practice, p50 sits in the low single-digit milliseconds for hot-tier reads, with no embedding round-trip and no vector-DB hop. Cold-tier reads incur the cost of decompression. Production-grade benchmarks are in the full report at blog.sibylcap.com/longmemeval-v2.

Can it run on RDS, Neon, Aurora, or other managed Postgres?

Yes. Sibyl Memory uses standard Postgres 14+ features (JSONB, partial indexes, triggers, materialized views). RDS, Aurora, Neon, Supabase, and self-hosted Postgres all work. The self-host tier is BYOC by design.

Does it support GDPR cascade delete?

Yes. Tenant-scoped cascade delete is a first-class operation. Removing a user namespace removes their messages, memory entries, audit log rows, and any derived state in a single transaction. A tamper-proof audit log on every write records the deletion itself for compliance. EU AI Act export is supported.
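
One way to get a single-transaction cascade is to let foreign keys do the work. The sketch below is an assumption about shape, not the shipped `delete_user_cascade()`: the tables and columns are hypothetical, and SQLite stands in for Postgres.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only; Postgres enforces FKs by default
conn.executescript("""
    CREATE TABLE users (
        user_id   TEXT PRIMARY KEY,
        tenant_id TEXT NOT NULL
    );
    CREATE TABLE messages (
        id      INTEGER PRIMARY KEY,
        user_id TEXT NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
        body    TEXT NOT NULL
    );
    CREATE TABLE memory_entries (
        id      INTEGER PRIMARY KEY,
        user_id TEXT NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
        fact    TEXT NOT NULL
    );
    CREATE TABLE deletion_audit (
        id         INTEGER PRIMARY KEY,
        tenant_id  TEXT NOT NULL,
        user_id    TEXT NOT NULL,
        deleted_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    INSERT INTO users VALUES ('u1', 't1'), ('u2', 't1');
    INSERT INTO messages (user_id, body) VALUES ('u1', 'hello'), ('u2', 'hi');
    INSERT INTO memory_entries (user_id, fact) VALUES ('u1', 'prefers email');
""")


def delete_user_cascade(tenant_id, user_id):
    """Delete a user and all dependent rows, and record the deletion, atomically."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "DELETE FROM users WHERE tenant_id = ? AND user_id = ?",
            (tenant_id, user_id),
        )
        if cur.rowcount:
            conn.execute(
                "INSERT INTO deletion_audit (tenant_id, user_id) VALUES (?, ?)",
                (tenant_id, user_id),
            )
    return cur.rowcount == 1


def count(table, user_id):
    """Rows in a table belonging to one user (table name is trusted here)."""
    sql = f"SELECT COUNT(*) FROM {table} WHERE user_id = ?"
    return conn.execute(sql, (user_id,)).fetchone()[0]


deleted = delete_user_cascade("t1", "u1")
```

Scoping the DELETE by both `tenant_id` and `user_id` means a call with the wrong tenant removes nothing and writes no audit row.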

How is Sibyl Memory different from vector-DB approaches like Mem0 or Zep?

Vector approaches infer structure at read time via embedding similarity. That is good for fuzzy semantic recall and bad for everything that needs a precise answer. Sibyl Memory imposes structure at write time via schema, which is good for everything that needs a precise answer and slightly less elegant for unstructured ambient recall. The 95.6% LongMemEval score is the proof point that schema-first does not lose to embedding-first on the questions that matter to autonomous agents. Full comparison →

Is the benchmark code public?

The LongMemEval Oracle benchmark itself is public (ICLR 2025, UCLA and Tencent AI Lab). The Sibyl Memory implementation is licensed; the architecture, methodology, and per-category results are documented in the full report at blog.sibylcap.com/longmemeval-v2. For deeper architecture review under NDA, contact [email protected].

05 The Lab

Sibyl Labs is a research and infrastructure lab.

Sibyl Labs, LLC was formed in April 2026 to wrap the agentic infrastructure work in a real legal entity. The lab builds memory systems, agentic frameworks, and the supporting tooling that makes long-running autonomous agents possible. Every product we sell is the same architecture our own agent operates on.

The thesis is not complicated. Most agents forget. The ones that remember are built on architecture that scales. We publish the work in public, benchmark it in the open, and ship the substrate so others can build the next generation of agents on something that has already survived production.

Memory is one shape of the work. Frameworks are another. Custom builds for partners are a third. The output is the same: infrastructure for agents that operate, not demo.

06 Contact

Enterprise. Research. Custom builds.

For self-host, BYOC, or air-gapped deployments. For research collaboration on memory or agent benchmarks. For bespoke memory or framework integrations sized to a partner's specific architecture.