JUNE 17, 2026::

The Hard Part of Realtime SQL: Snapshot First, Then Live Changes

KalamDB turns SQL queries into realtime subscriptions. Instead of polling or consuming raw change streams, clients receive a consistent snapshot first, then ordered live updates using sequence ids for replay, reconnects, and offline synchronization.

Most realtime systems can tell you when a table changes.

That's useful, but it's usually not what an application actually needs.

Imagine a support dashboard that shows tickets assigned to the current user:


sql snippetSQL
SELECT id, title, priority, status, updated_atFROM app.ticketsWHERE assignee_id = $current_userORDER BY updated_at DESC;


A common solution is polling. Another is using CDC or a generic change feed.

Both approaches work, but they push a surprising amount of complexity onto the client:

  • Re-fetching data over and over
  • Filtering out rows that don't belong to the current user
  • Handling reconnects
  • Detecting missed updates
  • Keeping local state in sync

KalamDB takes a different approach.

Subscribe once, receive a consistent snapshot, then stream everything that happened after it.

Because KalamDB stores data in user-scoped tables, subscriptions don't require expensive filtering on every update. In many cases, the server only needs to answer a simple question:

Is this row newer than the last sequence id the client has seen?

That makes realtime delivery simpler, more predictable, and easier to scale.


The KalamDB approach

A client opens a WebSocket connection, authenticates, and subscribes to a SQL query:

json snippetJSON
{  "type": "subscribe",  "subscription": {    "id": "tickets",    "sql": "SELECT id, title, status, _seq FROM app.tickets"  }}


The server then:

  1. Captures a snapshot boundary.
  2. Sends the initial result set in batches.
  3. Streams all changes that occur after that boundary.
  4. Allows the client to reconnect from the last sequence id it received.

The client doesn't need to poll or repeatedly execute the same query.


Why sequence ids matter

Every row in KalamDB contains a monotonic sequence id:

_seq

This sequence id is the foundation of synchronization.

In many systems, every update requires evaluating filters, permissions, and subscriptions before deciding whether a client should receive it.

In KalamDB, subscriptions are naturally scoped to user-owned tables and partitions. That means the server can often reduce the problem to:

Is _seq greater than the last sequence id the client received?

That simple comparison provides several benefits:

  • Less filtering
  • Lower CPU overhead
  • Reduced network traffic
  • Straightforward reconnect logic
  • Efficient replay
  • Predictable ordering

Snowflake-based sequence ids

KalamDB's _seq is implemented using a Snowflake-style identifier.

That means it is:

  • Globally sortable
  • Monotonically increasing
  • Fast to compare
  • Safe to generate across multiple nodes

This becomes especially important in distributed deployments.

When KalamDB runs as a cluster using KalamDB Raft, multiple nodes can generate sequence ids while preserving the ordering guarantees required for synchronization and replay.

The implementation details deserve their own post, but for realtime synchronization the important rule is simple:

Higher _seq = newer change

Clients can safely resume from a sequence id regardless of which node originally generated it.


Protocol overview

The WebSocket endpoint is:

GET /v1/ws

A typical session looks like this:

sequenceDiagram
participant Client
participant Server

Client->>Server: authenticate
Server-->>Client: auth_success

Client->>Server: subscribe
Server-->>Client: snapshot batches

Note over Server,Client: snapshot complete

Server-->>Client: live changes

Note over Client,Server: reconnect if needed

Client->>Server: subscribe(from_seq_id)
Server-->>Client: missed changes

The key idea is that the client always knows the latest sequence id it has processed.

That makes recovery straightforward.


Snapshot first, then live changes

The hardest part of any realtime system isn't opening a WebSocket.

It's maintaining consistency.

Consider this timeline:

T1: client subscribes
T2: snapshot starts
T3: row changes
T4: snapshot finishes
T5: live updates begin

Without careful coordination, the client can:

  • Receive duplicate data
  • Miss updates entirely
  • End up with an inconsistent view of the world

KalamDB solves this by recording a snapshot boundary:

snapshot_end_seq = current sequence

The server sends all rows up to that sequence id.

Any changes that occur after the boundary are buffered while the snapshot is being delivered.

Once the snapshot is complete:

snapshot complete
→ replay buffered changes
→ switch to live mode

The client receives a consistent view of the data without gaps or races.

Lock-free delivery

One advantage of this approach is that writes never need to stop while a snapshot is being generated.

New writes continue normally.

Changes that occur after the snapshot boundary are simply buffered and replayed once the snapshot finishes.

That gives us an important guarantee:

No write blocking
No global locks
No missed changes

The snapshot establishes a stable point in time, while sequence ids ensure every change after that point is delivered in order.


Initial loading

Large result sets are delivered in batches:


json snippetJSON
{  "type": "initial_data_batch",  "rows": [    {      "id": 1,      "_seq": 11990    }  ],  "batch_control": {    "status": "loading",    "has_more": true  }}


The client requests additional batches as needed:

json snippetJSON
{  "type": "next_batch"}


This prevents large snapshots from overwhelming the connection.

Because the snapshot boundary has already been established, batches can be delivered independently without risking missed updates.


Live updates

Once the snapshot is complete:


json snippetJSON
{  "batch_control": {    "status": "ready",    "has_more": false  }}


The server switches to live streaming mode.

Example:

json snippetJSON
{  "type": "change",  "rows": [    {      "id": 42,      "_seq": 13001    }  ]}


Since every change includes a sequence id, the client always knows exactly where it is in the stream.


Reconnect and resume

Connections drop.

Browsers refresh.

Mobile networks disappear at the worst possible moment.

Instead of starting over, the client reconnects using the last sequence id it processed:

json snippetJSON
{  "type": "subscribe",  "subscription": {    "id": "tickets",    "sql": "SELECT id, title, status, _seq FROM app.tickets",    "options": {      "from_seq_id": 13001    }  }}


The server sends everything newer than that sequence.

No custom replay logic. No full refresh. No guessing what was missed.

Because replay is built directly into the protocol, reconnect behavior remains deterministic even after long disconnects.


Offline-first applications

Sequence ids become even more useful when clients spend long periods offline.

Consider a mobile application:

Online
→ Receive updates
→ Go offline
→ User continues working
→ Reconnect later

Traditional synchronization systems often rely on:

  • Conflict tracking
  • Timestamp comparisons
  • Custom replay logic
  • Periodic full refreshes

With KalamDB, the client already knows the last sequence id it processed.

When connectivity returns:

Reconnect
→ Send last sequence id
→ Receive everything newer

That's it.

For offline-first applications, _seq effectively becomes a synchronization cursor.

Persist it locally and resume exactly where you left off without downloading the entire dataset again.

This model works particularly well for:

  • Mobile applications
  • Field service software
  • Messaging systems
  • Collaborative tools
  • Local-first experiences

Comparing KalamDB to other solutions

Polling

Polling repeatedly executes the same query:

SELECT ...
SELECT ...
SELECT ...
SELECT ...

The downsides are familiar:

  • Repeated database work
  • Higher latency
  • More bandwidth usage

KalamDB executes the query once and then streams changes as they happen.


PostgreSQL LISTEN/NOTIFY

LISTEN/NOTIFY is great for invalidation:

orders changed

But clients still need to:

  • Re-query data
  • Detect missed updates
  • Maintain local state

KalamDB delivers rows directly and provides sequence-based recovery.


CDC and Change Streams

CDC systems tell you:

Row X changed

The application still has to answer:

Do I care about Row X?

In many architectures that means filtering large streams of events before finding the few that matter.

With KalamDB's user-scoped tables, routing is much simpler. Most updates already belong to a specific user partition, so the server can focus on ordering and delivery rather than evaluating large numbers of filters.


Firebase

Firebase Realtime Database and Firestore provide excellent realtime synchronization and are among the easiest ways to build realtime applications.

Firebase clients subscribe to documents, collections, or query results and automatically receive updates when those results change.

Firebase also includes strong offline support, local caching, conflict resolution, and SDKs that handle synchronization automatically.

However, Firebase is fundamentally document-oriented rather than relational:

  • Data is stored as documents rather than tables and rows
  • Complex relational queries often require denormalization
  • Joins are not a native part of the data model
  • Query flexibility is intentionally constrained for scalability

Firebase's strength is developer productivity around document data.

KalamDB's advantage is that it keeps the relational SQL model intact:

  • SQL queries instead of document queries
  • Relational schemas instead of denormalized document trees
  • Sequence-based replay and recovery
  • Snapshot-first consistency guarantees
  • User-scoped routing optimized for application state

If your application naturally fits a document database, Firebase is often an excellent choice.

If your application already thinks in terms of SQL tables, relationships, and queries, KalamDB avoids the impedance mismatch of reshaping relational data into documents.


Supabase Realtime

Supabase Realtime is built on PostgreSQL logical replication.

Technically, it streams database changes from PostgreSQL's WAL (Write-Ahead Log) and delivers them over WebSockets.

For example, a client can subscribe to:

INSERT on orders
UPDATE on orders
DELETE on orders

Supabase also supports server-side filtering on replication streams and integrates with PostgreSQL Row Level Security (RLS), which significantly reduces the amount of filtering clients need to perform themselves.

That's an important distinction: modern Supabase Realtime is not simply broadcasting every table change to every client.

However, Supabase Realtime is still fundamentally change-stream based.

The primary unit of delivery is:

A database row changed

rather than:

This SQL query result changed

Applications often still need to maintain local query state, apply ordering, merge updates into cached results, and handle how row-level changes affect a particular view.

Another important difference is replay.

KalamDB assigns every change a durable monotonic sequence id (_seq) and allows clients to reconnect from a specific point in the stream:

Give me everything after sequence 13001

This makes replay a first-class protocol feature.

Supabase Realtime does not expose an equivalent durable change identifier that clients can use as a replay cursor across reconnects. Applications typically need to recover state through re-fetching data or rebuilding state from the database rather than replaying from a known change position.

KalamDB focuses on a different abstraction:

Subscribe to state
not
Subscribe to changes

Key advantages of KalamDB include:

  • Snapshot-first subscriptions built into the protocol
  • Durable sequence ids for replay and recovery
  • Consistent snapshot-to-live transitions
  • User-scoped routing with minimal subscription evaluation
  • SQL result synchronization rather than raw row-change delivery
  • Efficient replay using sequence comparisons
  • No need to filter large shared tables by user before replaying changes

Because KalamDB stores data in user-scoped tables, replay is particularly efficient. The server already knows which user's data is being synchronized and can simply stream changes newer than a given sequence id.

Conceptually:

User
└── Table
└── Rows

Replay becomes:

Find user's table
→ Find rows with _seq > last_seen_seq
→ Stream changes

rather than:

Scan shared table
→ Filter by user
→ Apply permissions
→ Determine missed changes
→ Stream results

In other words:

Supabase Realtime excels at exposing PostgreSQL changes to applications.

KalamDB is designed around keeping application state synchronized from the beginning, with replay and recovery built directly into the protocol.


Why user-scoped tables matter

This is one of the biggest architectural differences in KalamDB.

Many realtime systems start with:

One large table
→ filter per user
→ filter per subscription
→ filter per client

KalamDB organizes data around ownership instead.

Conceptually:

User
└── Table
└── Rows

When a row changes, the server already knows who owns it.

Routing becomes:

Find user
→ Compare sequence id
→ Deliver update

instead of:

Scan subscriptions
→ Evaluate filters
→ Check permissions
→ Route update

The same advantage applies to replay.

When a client reconnects, KalamDB doesn't need to scan a large shared table and filter rows by user ownership before determining what was missed. It can directly replay changes from the user's table using sequence ordering.

As the number of users grows, that difference becomes significant.


Developer experience

A subscription can be as simple as:


typescript snippettypescript
const unsubscribe = await client.live(`  SELECT id, body, created_at, _seq  FROM chat.messages`,  (event) => {    console.log(event);  });


The SDK handles:

  • Authentication
  • WebSocket lifecycle
  • Snapshot loading
  • Reconnects
  • Sequence tracking
  • Resume logic

Applications simply react to events.


When to use KalamDB

KalamDB works particularly well for:

  • Chat applications
  • Dashboards
  • Notifications
  • Collaborative tools
  • Mobile apps
  • Offline-first applications
  • User-specific realtime data

It's designed for application state, not high-volume analytics streams.


Design summary


Goal

KalamDB approach

Initial consistency

Snapshot before live updates

Reconnect safety

Resume from sequence id

Low filtering cost

User-scoped tables

Efficient routing

Route by ownership

Efficient replay

Replay from sequence ids

Reduced bandwidth

Send only relevant rows

Simple clients

SDK manages synchronization

Offline sync

Resume from persisted sequence id

Lock-free delivery

Snapshot boundary plus buffered changes

Distributed ordering

Snowflake-based sortable sequence ids

Scalable realtime

Compare sequence ids instead of re-running queries


Conclusion

Most realtime systems focus on broadcasting events.

KalamDB focuses on synchronizing state.

The core idea is simple:

Snapshot
→ Sequence id
→ Live updates
→ Resume from last sequence

The sequence id is more than a counter.

It's a globally sortable Snowflake identifier that enables reconnects, offline synchronization, distributed deployments, efficient replay, and efficient realtime delivery.

Unlike systems that primarily expose change streams, KalamDB treats replay as a first-class capability. Clients always know exactly where they are in the stream and can resume from a durable sequence id without rebuilding state from scratch.

Because data is organized around ownership, the server doesn't spend its time evaluating complex filters for every update. Instead, it can route changes efficiently and rely primarily on sequence ordering. Replay benefits from the same architecture: the server can directly locate a user's data and stream everything newer than the last acknowledged sequence id.

Combined with the snapshot-first design, this creates a lock-free synchronization model where writes continue uninterrupted while clients receive a complete, ordered stream of changes.

The result is a simpler way to think about realtime data:

Subscribe once.
Receive a snapshot.
Stay up to date.
Reconnect from the last sequence id.

That's the foundation of realtime SQL in KalamDB.

Postgres and kalamdb diagram
MAY 24, 20264 MINUTES READJamal Saad

PostgreSQL Fan-Out at Scale with KalamDB

KalamDB brings real-time SQL subscriptions, topic consumers, and per-user storage to PostgreSQL workflows — helping chat apps and AI agents share one live data plane without Redis, Kafka, or custom fan-out layers.

Read post