Phase 0 · Early access open

Every Kalshi trade. Every Polymarket fill. Every Manifold bet.
Reconciled.

A normalized cross-venue historical archive of prediction markets — on a single Postgres schema, with a daily reconciliation log against venue-published volume, and full retention of resolved and pruned markets. Built for quants who want to know exactly what their data does and doesn't cover.

3 venues live · 0.0% Kalshi drift · 105k+ Polymarket trades · 100% resolved retained · daily recon log

Coverage

Three venues. One schema. Honest about every gap.

Most data vendors hide the residual drift. We publish it. Below is the reconciliation state of the Phase-0 dataset — the same numbers a customer sees in the recon log.

Kalshi ● audit-grade
volume drift: 0.0%
markets sampled: 50
trade reconciliation: 100% (50/50)
source: public REST

Manifold ● calibration-grade
perfect reconcile: 30/40
residual drift: ~10% (active markets)
cause: fills[] aggregation
source: public REST + cursor

Polymarket ● on-chain audit-grade
resolution coverage: 50/50 closed markets
standard CTF: Goldsky subgraph
NegRisk wrapped: Polygon RPC direct
data-api cap busted: 3,500 → unlimited

Phase-0 finding worth knowing: Polymarket's public data-api stops serving trades past offset 3,500 per market, so the full history of high-volume markets cannot be recovered from REST alone. We bust that ceiling by reading the Goldsky orderbook subgraph, plus a direct Polygon RPC layer for NegRisk-wrapped markets the subgraph doesn't index. Full write-up →
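To make the ceiling concrete, here is a minimal Python sketch of how a REST crawler might detect that it has been truncated. It is not the production pipeline: `fetch_page` is a hypothetical callable standing in for whatever client issues the data-api request, the page size of 500 is an assumption, and the sketch assumes offsets up to and including 3,500 are still served.

```python
from typing import Callable, List, Tuple

OFFSET_CAP = 3_500  # offset ceiling reported in the Phase-0 finding


def fetch_via_rest(
    fetch_page: Callable[[int, int], List[dict]],  # hypothetical: (offset, limit) -> trades
    page_size: int = 500,                          # assumed page size
) -> Tuple[List[dict], bool]:
    """Page through trades until exhausted or the offset cap is reached.

    Returns (trades, truncated). truncated=True means every page up to the
    cap came back full, so later trades may exist beyond REST's reach.
    """
    trades: List[dict] = []
    offset = 0
    while offset <= OFFSET_CAP:
        page = fetch_page(offset, page_size)
        trades.extend(page)
        if len(page) < page_size:
            return trades, False  # short page: history is complete
        offset += page_size
    return trades, True           # cap reached while pages were still full
```

A market flagged as truncated is one that would need a second source, such as the subgraph or RPC layer described above.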
Schema

One canonical model across venues.

Six tables, every venue mapped to the same shape. Binary, categorical, on-chain, off-chain — all the same query.

-- markets are globally addressable: <venue>:<native_id>
SELECT m.market_id, m.venue_id, m.title,
       m.volume_native, m.volume_unit,
       m.resolved_at, m.closes_at
FROM markets m
WHERE m.venue_id = 'kalshi'
  AND m.resolved_at IS NOT NULL
ORDER BY m.volume_native DESC NULLS LAST  -- keep markets with no volume out of the top 10
LIMIT 10;

SQL on the Postgres canonical schema.
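The `<venue>:<native_id>` convention from the query comment is easy to handle client-side. The following Python sketch splits an id into its two parts; the helper name `split_market_id` and the `KNOWN_VENUES` set are illustrative assumptions and are not part of the published SDK.

```python
from typing import NamedTuple

# Assumed venue identifiers, based on the three venues named on this page.
KNOWN_VENUES = {"kalshi", "polymarket", "manifold"}


class MarketRef(NamedTuple):
    venue_id: str
    native_id: str


def split_market_id(market_id: str) -> MarketRef:
    """Split '<venue>:<native_id>' on the first colon only.

    Native ids may themselves contain colons, so everything after the
    first colon is kept intact.
    """
    venue_id, sep, native_id = market_id.partition(":")
    if not sep or not native_id:
        raise ValueError(f"expected '<venue>:<native_id>', got {market_id!r}")
    if venue_id not in KNOWN_VENUES:
        raise ValueError(f"unknown venue {venue_id!r}")
    return MarketRef(venue_id, native_id)
```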

from predmarket import Predmarket

pm = Predmarket(api_key="YOUR_KEY")

# list up to 100 markets from one venue (here: Polymarket)
markets = pm.markets(venue="polymarket", limit=100)

# full trade history for one market
trades = pm.trades(market_id="polymarket:0xdd224...")

Python SDK — pip install predmarket (early access).

Why us

Survivorship-bias-free. Reconciled. Open about the failure modes.

01 — Survivorship bias

Resolved + pruned markets retained

When a venue closes or archives a market, most scrapers lose it. We keep the last-known snapshot in a deletion ledger so backtests see the world as it actually was.
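As a rough illustration of the deletion-ledger idea, the Python sketch below compares two consecutive venue listings and records any market that disappeared. The function name, the ledger layout, and the `pruned_at` field are assumptions made for the example, not the production schema.

```python
from datetime import datetime, timezone


def update_ledger(previous: dict, current: dict, ledger: dict, seen_at=None) -> dict:
    """Record markets that vanished between two crawls.

    previous/current map market_id -> snapshot dict from consecutive crawls.
    ledger maps market_id -> {"last_snapshot": ..., "pruned_at": ...}.
    An existing ledger entry is never overwritten, so the first observed
    disappearance is the one that is kept.
    """
    seen_at = seen_at or datetime.now(timezone.utc)
    for market_id, snapshot in previous.items():
        if market_id not in current and market_id not in ledger:
            ledger[market_id] = {"last_snapshot": snapshot, "pruned_at": seen_at}
    return ledger
```

A backtest can then read from the union of live markets and ledger entries, so pruned markets stay visible.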

02 — Reconciliation

Daily drift report vs venue volume

Every coverage claim is backed by a row in recon_log. We publish the drift, the threshold, and whether the gate passed. No claim without evidence.
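A row of that kind could be derived roughly as follows. This is a minimal Python sketch: the `ReconRow` fields and the 1% default threshold are assumptions chosen for illustration, and the actual `recon_log` columns may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReconRow:
    market_id: str
    venue_volume: float    # volume published by the venue
    archive_volume: float  # volume summed from archived trades
    drift: float           # relative difference; 0.0 is an exact match
    threshold: float       # maximum drift allowed for the gate to pass
    passed: bool


def reconcile(market_id: str, venue_volume: float, archive_volume: float,
              threshold: float = 0.01) -> ReconRow:
    """Compare archived volume with venue-published volume for one market."""
    if venue_volume == 0:
        # No published volume: only an empty archive counts as a match.
        drift = 0.0 if archive_volume == 0 else float("inf")
    else:
        drift = abs(archive_volume - venue_volume) / abs(venue_volume)
    return ReconRow(market_id, venue_volume, archive_volume,
                    drift, threshold, drift <= threshold)
```

Storing the threshold next to the drift means a reader can check the gate decision from the row alone.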

03 — Transparency

Public docs of every quirk

Manifold sums abs(amount). Polymarket's data-api caps at offset 3,500. NegRisk markets need Polygon RPC. We write the post-mortem you'd have to discover yourself.
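The first quirk matters because a signed sum and an absolute sum diverge as soon as anyone sells. The Python sketch below shows the difference, assuming each bet record carries a signed `amount` field that is negative for sells; the function names are illustrative.

```python
def manifold_volume(bets):
    """Sum absolute bet amounts, so sells (negative amounts) add to volume."""
    return sum(abs(b["amount"]) for b in bets)


def net_flow(bets):
    """Signed sum of the same amounts, which understates traded volume."""
    return sum(b["amount"] for b in bets)


bets = [{"amount": 100}, {"amount": -40}, {"amount": 25}]
print(manifold_volume(bets), net_flow(bets))  # prints: 165 85
```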

Phase 0 · Limited access

Get the sample. 5 markets. ~24 KB. Verified in three readers.

Parquet files for markets, outcomes, and trades — readable by pandas, polars, and duckdb. Free for evaluation. Email us with your use case and we'll send the link plus a sketch of what production access would cost for your shape of work.