Lex + parse + validate the .zanith source into a runtime graph. Sub-linear at scale because validation reuses tokenization state.
Proof · 01 — The receipts
Real numbers. Sourced. Re-run on every commit.
Every figure on this page comes from a benchmark file in the engine repo, with the source named below it. They re-run on every commit; the page is updated when the numbers move. The disclosures at the bottom are part of the page, not buried.
- 765/765: tests passing
- 22.9ms: schema compile · 1k models
- 2.4µs: SELECT compile
- 88.97KB: ESM bundle
How the engine behaves at a thousand models.
Indexed lookup against the in-memory model registry. Sub-microsecond per call — well below any human-perceptible threshold.
Total memory footprint of the runtime graph at 1k models. Order of magnitude smaller than a comparably-sized generated client on disk.
Every measured operation, all on one page.
- Expression · simple eq: { field: value }
- Expression · AND/OR: { AND: [a, b], OR: [c, d] }
- SELECT · with WHERE: where + projection
- findMany · build + compile: args validate + AST + emit
- JOIN · projection · WHERE: 1 included relation
- GROUP BY + COUNT + SUM: aggregate compile
- INSERT · single row: RETURNING *
- UPSERT · ON CONFLICT DO UPDATE: deduped by unique field
- Bulk INSERT · 10 rows: values list
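To make the two expression rows concrete, here is a minimal sketch of how a filter object such as { field: value } or { AND: [...], OR: [...] } can be lowered to a parameterized SQL fragment. This is not the Zanith compiler: the `Where` type, the `compileWhere` function, and the choice to join sibling keys with AND are all assumptions made for illustration.

```typescript
// Hypothetical sketch: lower a filter object to a parameterized SQL fragment.
// Not the Zanith implementation; names and semantics are assumptions.
type Scalar = string | number | boolean;

interface Where {
  // "AND" / "OR" hold nested filters; any other key is a field equality.
  [key: string]: Scalar | Where[];
}

function compileWhere(
  where: Where,
  params: Scalar[] = [],
): { sql: string; params: Scalar[] } {
  const parts: string[] = [];
  for (const [key, value] of Object.entries(where)) {
    if (Array.isArray(value)) {
      if (key !== "AND" && key !== "OR") {
        throw new Error(`unknown combinator: ${key}`);
      }
      // Branches share one params array so placeholders stay in order.
      const branches = value.map((w) => compileWhere(w, params).sql);
      parts.push(`(${branches.join(` ${key} `)})`);
    } else {
      params.push(value);
      // Quote the identifier; the value travels only as a bound parameter.
      parts.push(`"${key.replace(/"/g, '""')}" = $${params.length}`);
    }
  }
  return { sql: parts.join(" AND "), params };
}

const simple = compileWhere({ active: true });
const nested = compileWhere({
  AND: [{ active: true }, { role: "admin" }],
  OR: [{ plan: "pro" }, { plan: "team" }],
});

console.log(simple.sql, simple.params);
console.log(nested.sql, nested.params);
```

Values never appear in the SQL text, only as numbered placeholders, which is the same shape as the `-- params: [...]` snapshots further down the page.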
765 of 765 passing. Including the one that stayed red for weeks — fixed, not hidden.
$ pnpm test
vitest run · zanith engine
✓ test/compiler/select.test.ts (8)
✓ test/compiler/insert.test.ts (5)
✓ test/compiler/update.test.ts (4)
✓ test/compiler/delete.test.ts (3)
✓ test/expression/expr.test.ts (12)
✓ test/expression/where.test.ts (18)
✓ test/integration/pipeline.test.ts (6)
✓ test/integration/typed-client.test.ts (9)
✓ test/edge-cases/negative.test.ts (22)
✓ test/edge-cases/null-handling.test.ts (8)
✓ test/types/end-to-end.test.ts (13)
✓ test/types/inference.test.ts (7)
✓ test/schema/parser.test.ts (15)
✓ test/schema/validator.test.ts (11)
✓ test/schema/relations.test.ts (9)
✓ test/schema/builder.test.ts (12)
✓ test/benchmark/scale.test.ts (6) 427ms
✓ test/benchmark/execution.test.ts (1) 466ms
✓ test/proof/06-flagship-claims.test.ts (15)
Test Files 74 passed (74)
Tests 765 passed (765)
Duration 2.14s
The promise this box made, kept
The flaky ratio was rewritten. The red test was a real bug.
This box used to disclose 191/192: a benchmark ratio against a near-zero baseline that swung 100× on CPU jitter. We said it would be rewritten — it was, as a smoke check with honest headroom. The deeper find: a recovery test had been red for weeks because the migration runner recorded failures inside an already-aborted transaction, masking the real error. Unmasking it exposed a one-line cleanup bug. Both fixed; the suite is 765 / 765 with nothing left to disclose — until the next thing, which will appear right here.
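A small sketch of why the old ratio was flaky, and what a budget-style smoke check looks like in its place. The function names and every number here are illustrative assumptions; the suite's real threshold is not stated on this page.

```typescript
// Hypothetical sketch: a ratio against a near-zero baseline versus an
// absolute-budget smoke check. All numbers are made up for illustration.

// Ratio check: a tiny denominator turns microseconds of jitter into huge swings.
function ratio(measuredMs: number, baselineMs: number): number {
  return measuredMs / baselineMs;
}

// Smoke check: pass while the measurement stays under a fixed budget.
function withinBudget(measuredMs: number, budgetMs: number): boolean {
  return measuredMs <= budgetMs;
}

// The same 5 ms measurement, with the baseline jittering from 0.1 ms to 0.001 ms.
const quiet = ratio(5, 0.1);
const noisy = ratio(5, 0.001);

// The ratio moved by about two orders of magnitude; the measurement did not move.
const swing = noisy / quiet;
console.log({ quiet, noisy, swing });

// A budget with generous headroom ignores baseline jitter entirely.
const ok = withinBudget(5, 50);
console.log({ ok });
```

The budget form trades sensitivity for stability: it will not catch a small regression, but it also cannot fail because an unrelated baseline happened to run fast.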
05 — Production-grade proof suite
Tests that prove correctness under ugly cases.
Six suites · 37 / 37 passing · all run against a real Postgres database (PostgreSQL 17.8). The stress test compiles a 1,000-model schema with 3,967 directed relation edges. Type assertions are verified under tsc --noEmit, including negative cases via // @ts-expect-error. Crash-resume + recovery checksum tests use real data — every claim below is backed by a test you can re-run.
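For readers unfamiliar with the pattern, the following sketch shows a positive and a negative type assertion side by side. It is dependency-free on purpose: a hand-rolled `Equal` helper stands in for vitest's expectTypeOf, and `User`, `pick`, and `assertType` are hypothetical names, not engine APIs.

```typescript
// Hypothetical sketch of positive and negative type assertions.
// A hand-rolled Equal<> helper stands in for expectTypeOf so the file has
// no dependencies; User, pick and assertType are illustrative names.
type Equal<A, B> =
  (<T>() => T extends A ? 1 : 2) extends (<T>() => T extends B ? 1 : 2)
    ? true
    : false;

// Compiles only when the type argument is exactly `true`.
function assertType<T extends true>(): void {}

interface User {
  id: number;
  email: string;
  name: string | null;
}

// Narrow a row to the selected keys, preserving each field's type.
function pick<T, K extends keyof T>(row: T, keys: readonly K[]): Pick<T, K> {
  const out = {} as Pick<T, K>;
  for (const k of keys) out[k] = row[k];
  return out;
}

const user: User = { id: 1, email: "a@example.com", name: null };
const projected = pick(user, ["id", "name"] as const);

// Positive case: the projection has exactly the selected fields.
assertType<Equal<typeof projected, Pick<User, "id" | "name">>>();

// Negative case: email was not selected, so reading it must not type-check.
// @ts-expect-error email is absent from the projected type
const leaked = projected.email;

console.log(projected, leaked);
```

If the negative case ever starts to type-check, the `@ts-expect-error` directive itself becomes a compile error, which is why a type-checking pass such as `tsc --noEmit` has to be the gate rather than a runner that strips types.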
- CPU: Apple M4 Pro
- RAM: 48 GB
- OS: Darwin arm64 (25.3.0)
- Node: v20.20.0
- Postgres: 17.8 (Homebrew)
- Engine: v0.2
- pg driver: 8.20.0
- vitest: 0.34.6
- Test database: zanith_proof_tests (separate, schema-owned)
- §00 · System + repro report · 9/9 · real DB
  Machine, Node, Postgres, engine + driver versions. Cold/warm + p50/p95/p99 timings for the 1,000-model compile. Generated SQL snapshots for relation-heavy queries.
  test/proof/00-system-report.test.ts
- §01 · Large-schema stress · 5/5
  1,000 models · 3,967 directed relation edges · self-relations · m2m through junction · 5-hop chain · duplicate FKs to the same target (createdBy / updatedBy / approvedBy → User).
  test/proof/01-large-schema.test.ts
- §02 · Relation correctness · 7/7 · real DB
  Each test compiles SQL → executes against a fresh Postgres database with seed rows → asserts on returned data. Covers one-hop, multi-hop, self-relation, duplicate joins, m2m, LEFT/INNER nullability.
  test/proof/02-relation-correctness.test.ts
- §03 · Type-inference proof · 8/8
  expectTypeOf for positive cases, // @ts-expect-error for negatives. Verified under tsc --noEmit (vitest strips types, so the bare run is insufficient; tsc is the gate).
  test/proof/03-type-inference.test.ts
- §04 · Preflight + shadow verify · 6/6 · real DB
  addUnique with duplicates blocked · addForeignKey with orphans blocked · NOT NULL with NULLs blocked · clean data passes · shadow apply + introspect + diff catches a rawSql plan with extra columns.
  test/proof/04-preflight-shadow.test.ts
- §05 · Backfill + recovery · 2/2 · real DB
  10,000-row backfill · simulated crash after 4 batches · resume from checkpoint · 10,000 final · zero NULLs, zero duplicates. archiveColumn round-trips data; tampered archive is rejected with a checksum-mismatch error.
  test/proof/05-backfill-recovery.test.ts
cold 5.69 · mean 4.4 · p50 4.2 · p95 5.52 · p99 6.32 (ms)
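As a reading aid for the line above, this sketch shows one way to derive a cold / mean / p50 / p95 / p99 summary from raw timing samples. The nearest-rank percentile and the choice to exclude the cold run from the warm statistics are assumptions; the suite's exact method is not stated here, and the samples are made up.

```typescript
// Hypothetical sketch: summarize timing samples as cold / mean / p50 / p95 / p99.
// Nearest-rank percentiles are an assumption, not the suite's documented method.
function percentile(sorted: number[], p: number): number {
  // Nearest-rank: smallest value with at least p% of samples at or below it.
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

function summarize(samplesMs: number[]) {
  // The first run pays one-time costs, so it is reported separately.
  const [cold, ...warm] = samplesMs;
  const sorted = [...warm].sort((a, b) => a - b);
  const mean = warm.reduce((sum, x) => sum + x, 0) / warm.length;
  return {
    cold,
    mean,
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
  };
}

// Made-up samples in milliseconds, not measurements from the engine.
const samples = [9, 4, 5, 3, 6, 4, 5, 4, 7, 5, 4];
const summary = summarize(samples);
console.log(summary);
```

With only a handful of warm samples, p95 and p99 collapse onto the slowest run, so tail percentiles are only meaningful when the sample count is large.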
phase 1 commits 2000 rows then throws · phase 2 reads _zanith_migration_checkpoints and resumes · 0 duplicates, 0 misses
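The crash-and-resume flow can be sketched in memory as follows. The real test runs against Postgres and reads _zanith_migration_checkpoints; here a plain variable plays the checkpoint table and an array plays the target column. The batch size of 500 is an assumption chosen so that 4 batches equal the 2000 rows mentioned above.

```typescript
// Hypothetical in-memory sketch of a resumable backfill. A variable stands in
// for the checkpoint table and an array stands in for the column being filled.
const TOTAL = 10_000;
const BATCH = 500; // assumed batch size: 4 batches = 2000 rows

const column: (string | null)[] = new Array(TOTAL).fill(null);
let checkpoint = 0; // next row index to process, advanced once per batch
let writes = 0;     // counts every write so double-processing is detectable

function backfill(crashAfterBatches?: number): void {
  let batches = 0;
  while (checkpoint < TOTAL) {
    if (crashAfterBatches !== undefined && batches === crashAfterBatches) {
      throw new Error("simulated crash");
    }
    const end = Math.min(checkpoint + BATCH, TOTAL);
    for (let i = checkpoint; i < end; i++) {
      column[i] = `value-${i}`;
      writes++;
    }
    // In a database, the batch and the checkpoint advance commit together.
    checkpoint = end;
    batches++;
  }
}

let crashedAt = -1;
try {
  backfill(4); // phase 1: completes 4 batches, then throws
} catch {
  crashedAt = checkpoint;
}
backfill(); // phase 2: resumes from the stored checkpoint

const nulls = column.filter((v) => v === null).length;
console.log({ crashedAt, checkpoint, nulls, writes });
```

The property worth checking is that `writes` equals the row count: fewer means rows were missed, more means a batch was processed twice after the resume.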
- 2,982 · 75% · Cross-domain belongsTo (refA / refB / refC)
- 993 · 25% · Audit FKs → User (createdBy / updatedBy / approvedBy)
- -8 · -0% · Self-relation · m2m · author · chain hops
3,967 directed edges across 1,000 models · avg 4.0 per model · proves the "dense relations" claim
# 1. dedicated test database (idempotent)
psql -h localhost -d postgres \
  -c "DROP DATABASE IF EXISTS zanith_proof_tests" \
  -c "CREATE DATABASE zanith_proof_tests"

# 2. run the proof suite
npx vitest run test/proof/

# 3. (optional) confirm type assertions under tsc, since vitest strips types
npx tsc --noEmit
Generated SQL · paste into psql, they run
SELECT "domain_0"."id", "domain_0"."slug"
FROM "domain_0" AS "domain_0"
WHERE "domain_0"."active" = TRUE
LIMIT $1
-- params: [10]
SELECT "chain0"."id" AS "topId",
"next_5"."label" AS "deepestLabel"
FROM "chain0" AS "chain0"
LEFT JOIN "chain1" AS "next" ON "chain0"."next_id" = "next"."id"
LEFT JOIN "chain2" AS "next_2" ON "next"."next_id" = "next_2"."id"
LEFT JOIN "chain3" AS "next_3" ON "next_2"."next_id" = "next_3"."id"
LEFT JOIN "chain4" AS "next_4" ON "next_3"."next_id" = "next_4"."id"
LEFT JOIN "chain5" AS "next_5" ON "next_4"."next_id" = "next_5"."id"

SELECT "domain_0"."id",
"createdBy"."email" AS "creator",
"updatedBy"."email" AS "updater",
"approvedBy"."email" AS "approver"
FROM "domain_0" AS "domain_0"
LEFT JOIN "users" AS "createdBy" ON "domain_0"."created_by_id" = "createdBy"."id"
LEFT JOIN "users" AS "updatedBy" ON "domain_0"."updated_by_id" = "updatedBy"."id"
LEFT JOIN "users" AS "approvedBy" ON "domain_0"."approved_by_id" = "approvedBy"."id"
LIMIT $1
-- params: [5]

Findings the suite surfaced
The point of these tests is to catch the things benchmarks hide. Two real engine quirks the suite turned up — both flagged, both worked around in the test, neither pretended not to exist:
m2m junction column-name bug
The m2m emit hard-codes junction.fromField / junction.toField as the literal column name in the join's ON clause — it does not route through the junction model's columnName mapping. Workaround in the test: quote junction columns as camelCase. Documented in 02-relation-correctness.test.ts.
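The difference between the two behaviours can be sketched as below. The `JunctionModel` shape, the function names, and the `post_tags` example are hypothetical and chosen only to illustrate the description above; they are not the engine's actual types.

```typescript
// Hypothetical sketch of the quirk: using the field name as the column name
// versus resolving it through a field-to-column mapping first.
interface JunctionModel {
  table: string;
  fromField: string;
  toField: string;
  // Maps a field name to its physical column when the two differ.
  columnName: Record<string, string>;
}

// Quote an SQL identifier, doubling any embedded double quotes.
const q = (id: string): string => `"${id.replace(/"/g, '""')}"`;

// Behaviour as described: the field name is emitted verbatim.
function onClauseLiteral(j: JunctionModel, parentAlias: string): string {
  return `${q(j.table)}.${q(j.fromField)} = ${q(parentAlias)}."id"`;
}

// Routing through the mapping resolves the physical column first.
function onClauseMapped(j: JunctionModel, parentAlias: string): string {
  const column = j.columnName[j.fromField] ?? j.fromField;
  return `${q(j.table)}.${q(column)} = ${q(parentAlias)}."id"`;
}

const postTags: JunctionModel = {
  table: "post_tags",
  fromField: "postId",
  toField: "tagId",
  columnName: { postId: "post_id", tagId: "tag_id" },
};

const literal = onClauseLiteral(postTags, "post");
const mapped = onClauseMapped(postTags, "post");
console.log(literal);
console.log(mapped);
```

When the physical junction columns are themselves created with camelCase names, the literal form happens to match, which is why the test workaround holds.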
Shadow-verify representation gap
Even matching plans (engine-generated diff vs declared schema) report residual createTable + dropTable drift. The introspector and compileSchema aren't producing identical schema graphs. Real drift IS still caught — the rawSql case proves the gate works — but the false-positive on matching plans means verifyOnShadowOrThrow is currently over-strict.
not yet covered (out of scope for this batch)
- Concurrent migrate up advisory-locking proof (two racing processes)
- SQL golden snapshots for window functions, CTEs, JSONB, INSERT ON CONFLICT, DISTINCT ON
- Real-Postgres matrix across 14 / 15 / 16 / 17
- Prisma-to-Zanith schema converter test
05 — What this page doesn't measure
The numbers above are real. Here's what they aren't.
Per the VOICE.md disclaimer rule: every claim made on this page has an accompanying caveat about what the figure does and does not cover. Four gaps, listed plainly so the credibility is shaped by the absence too.
- Gap · 01 · scope · by design
End-to-end query time against a live database
Every µs on this page is engine overhead. Network latency, connection setup, and the database's actual execution time are excluded. If you want end-to-end numbers, the local benchmark setup in §06 will produce them.
- Gap · 02 · in progress
Comparable benchmarks against Prisma, Drizzle, TypeORM
We don't have sourced competitor figures yet. Until we do, the marketing copy uses one consistent hedge: "codegen ORMs at this scale spend minutes regenerating." Specific µs comparisons will land when we can run a matched workload on each.
- Gap · 03 · planned
Long-running memory churn
The 3.4MB figure is the steady-state graph footprint. We haven't yet measured what happens to memory under continuous schema reloads, sustained query throughput, or leak-detection runs lasting hours.
- Gap · 04 · downstream concern
Performance on hot paths inside the DB
The compiler's choice of joins, projections, and parameter shapes affects how the database plans the query. We measure what we emit, not how the planner reacts. The /examples page shows the SQL we generate; production tuning lives downstream.
Reproducible benchmarks
~/zanith/engine $ cd packages/benchmarks
~/zanith/engine/benchmarks $ npm install
# Run the 1000-model massive graph benchmark
~/zanith/engine/benchmarks $ npx tsx run.ts --type massive
✓ Compiling runtime graph...
> Zanith Core Initialized in 59.12ms
> Peak RSS Memory: 14.3MB
"Don't trust this page. Run it yourself."