Most realtime systems can tell you when a table changes.
That's useful, but it's usually not what an application actually needs.
Imagine a support dashboard that shows tickets assigned to the current user:
SELECT id, title, priority, status, updated_atFROM app.ticketsWHERE assignee_id = $current_userORDER BY updated_at DESC;A common solution is polling. Another is using CDC or a generic change feed.
Both approaches work, but they push a surprising amount of complexity onto the client:
- Re-fetching data over and over
- Filtering out rows that don't belong to the current user
- Handling reconnects
- Detecting missed updates
- Keeping local state in sync
KalamDB takes a different approach.
Subscribe once, receive a consistent snapshot, then stream everything that happened after it.
Because KalamDB stores data in user-scoped tables, subscriptions don't require expensive filtering on every update. In many cases, the server only needs to answer a simple question:
Is this row newer than the last sequence id the client has seen?
That makes realtime delivery simpler, more predictable, and easier to scale.
The KalamDB approach
A client opens a WebSocket connection, authenticates, and subscribes to a SQL query:
{ "type": "subscribe", "subscription": { "id": "tickets", "sql": "SELECT id, title, status, _seq FROM app.tickets" }}The server then:
- Captures a snapshot boundary.
- Sends the initial result set in batches.
- Streams all changes that occur after that boundary.
- Allows the client to reconnect from the last sequence id it received.
The client doesn't need to poll or repeatedly execute the same query.
Why sequence ids matter
Every row in KalamDB contains a monotonic sequence id:
_seq
This sequence id is the foundation of synchronization.
In many systems, every update requires evaluating filters, permissions, and subscriptions before deciding whether a client should receive it.
In KalamDB, subscriptions are naturally scoped to user-owned tables and partitions. That means the server can often reduce the problem to:
Is _seq greater than the last sequence id the client received?
That simple comparison provides several benefits:
- Less filtering
- Lower CPU overhead
- Reduced network traffic
- Straightforward reconnect logic
- Efficient replay
- Predictable ordering
Snowflake-based sequence ids
KalamDB's _seq is implemented using a Snowflake-style identifier.
That means it is:
- Globally sortable
- Monotonically increasing
- Fast to compare
- Safe to generate across multiple nodes
This becomes especially important in distributed deployments.
When KalamDB runs as a cluster using KalamDB Raft, multiple nodes can generate sequence ids while preserving the ordering guarantees required for synchronization and replay.
The implementation details deserve their own post, but for realtime synchronization the important rule is simple:
Higher _seq = newer change
Clients can safely resume from a sequence id regardless of which node originally generated it.
Protocol overview
The WebSocket endpoint is:
GET /v1/ws
A typical session looks like this:
sequenceDiagram participant Client participant Server Client->>Server: authenticate Server-->>Client: auth_success Client->>Server: subscribe Server-->>Client: snapshot batches Note over Server,Client: snapshot complete Server-->>Client: live changes Note over Client,Server: reconnect if needed Client->>Server: subscribe(from_seq_id) Server-->>Client: missed changes
The key idea is that the client always knows the latest sequence id it has processed.
That makes recovery straightforward.
Snapshot first, then live changes
The hardest part of any realtime system isn't opening a WebSocket.
It's maintaining consistency.
Consider this timeline:
T1: client subscribesT2: snapshot startsT3: row changesT4: snapshot finishesT5: live updates begin
Without careful coordination, the client can:
- Receive duplicate data
- Miss updates entirely
- End up with an inconsistent view of the world
KalamDB solves this by recording a snapshot boundary:
snapshot_end_seq = current sequence
The server sends all rows up to that sequence id.
Any changes that occur after the boundary are buffered while the snapshot is being delivered.
Once the snapshot is complete:
snapshot complete
→ replay buffered changes
→ switch to live mode
The client receives a consistent view of the data without gaps or races.
Lock-free delivery
One advantage of this approach is that writes never need to stop while a snapshot is being generated.
New writes continue normally.
Changes that occur after the snapshot boundary are simply buffered and replayed once the snapshot finishes.
That gives us an important guarantee:
No write blocking
No global locks
No missed changes
The snapshot establishes a stable point in time, while sequence ids ensure every change after that point is delivered in order.
Initial loading
Large result sets are delivered in batches:
{ "type": "initial_data_batch", "rows": [ { "id": 1, "_seq": 11990 } ], "batch_control": { "status": "loading", "has_more": true }}
The client requests additional batches as needed:
{ "type": "next_batch"}This prevents large snapshots from overwhelming the connection.
Because the snapshot boundary has already been established, batches can be delivered independently without risking missed updates.
Live updates
Once the snapshot is complete:
{ "batch_control": { "status": "ready", "has_more": false }}The server switches to live streaming mode.
Example:
{ "type": "change", "rows": [ { "id": 42, "_seq": 13001 } ]}Since every change includes a sequence id, the client always knows exactly where it is in the stream.
Reconnect and resume
Connections drop.
Browsers refresh.
Mobile networks disappear at the worst possible moment.
Instead of starting over, the client reconnects using the last sequence id it processed:
{ "type": "subscribe", "subscription": { "id": "tickets", "sql": "SELECT id, title, status, _seq FROM app.tickets", "options": { "from_seq_id": 13001 } }}The server sends everything newer than that sequence.
No custom replay logic. No full refresh. No guessing what was missed.
Because replay is built directly into the protocol, reconnect behavior remains deterministic even after long disconnects.
Offline-first applications
Sequence ids become even more useful when clients spend long periods offline.
Consider a mobile application:
Online→ Receive updates→ Go offline→ User continues working→ Reconnect later
Traditional synchronization systems often rely on:
- Conflict tracking
- Timestamp comparisons
- Custom replay logic
- Periodic full refreshes
With KalamDB, the client already knows the last sequence id it processed.
When connectivity returns:
Reconnect
→ Send last sequence id
→ Receive everything newer
That's it.
For offline-first applications, _seq effectively becomes a synchronization cursor.
Persist it locally and resume exactly where you left off without downloading the entire dataset again.
This model works particularly well for:
- Mobile applications
- Field service software
- Messaging systems
- Collaborative tools
- Local-first experiences
Comparing KalamDB to other solutions
Polling
Polling repeatedly executes the same query:
SELECT ...SELECT ...SELECT ...SELECT ...
The downsides are familiar:
- Repeated database work
- Higher latency
- More bandwidth usage
KalamDB executes the query once and then streams changes as they happen.
PostgreSQL LISTEN/NOTIFY
LISTEN/NOTIFY is great for invalidation:
orders changed
But clients still need to:
- Re-query data
- Detect missed updates
- Maintain local state
KalamDB delivers rows directly and provides sequence-based recovery.
CDC and Change Streams
CDC systems tell you:
Row X changed
The application still has to answer:
Do I care about Row X?
In many architectures that means filtering large streams of events before finding the few that matter.
With KalamDB's user-scoped tables, routing is much simpler. Most updates already belong to a specific user partition, so the server can focus on ordering and delivery rather than evaluating large numbers of filters.
Firebase
Firebase Realtime Database and Firestore provide excellent realtime synchronization and are among the easiest ways to build realtime applications.
Firebase clients subscribe to documents, collections, or query results and automatically receive updates when those results change.
Firebase also includes strong offline support, local caching, conflict resolution, and SDKs that handle synchronization automatically.
However, Firebase is fundamentally document-oriented rather than relational:
- Data is stored as documents rather than tables and rows
- Complex relational queries often require denormalization
- Joins are not a native part of the data model
- Query flexibility is intentionally constrained for scalability
Firebase's strength is developer productivity around document data.
KalamDB's advantage is that it keeps the relational SQL model intact:
- SQL queries instead of document queries
- Relational schemas instead of denormalized document trees
- Sequence-based replay and recovery
- Snapshot-first consistency guarantees
- User-scoped routing optimized for application state
If your application naturally fits a document database, Firebase is often an excellent choice.
If your application already thinks in terms of SQL tables, relationships, and queries, KalamDB avoids the impedance mismatch of reshaping relational data into documents.
Supabase Realtime
Supabase Realtime is built on PostgreSQL logical replication.
Technically, it streams database changes from PostgreSQL's WAL (Write-Ahead Log) and delivers them over WebSockets.
For example, a client can subscribe to:
INSERT on orders
UPDATE on orders
DELETE on orders
Supabase also supports server-side filtering on replication streams and integrates with PostgreSQL Row Level Security (RLS), which significantly reduces the amount of filtering clients need to perform themselves.
That's an important distinction: modern Supabase Realtime is not simply broadcasting every table change to every client.
However, Supabase Realtime is still fundamentally change-stream based.
The primary unit of delivery is:
A database row changed
rather than:
This SQL query result changed
Applications often still need to maintain local query state, apply ordering, merge updates into cached results, and handle how row-level changes affect a particular view.
Another important difference is replay.
KalamDB assigns every change a durable monotonic sequence id (_seq) and allows clients to reconnect from a specific point in the stream:
Give me everything after sequence 13001
This makes replay a first-class protocol feature.
Supabase Realtime does not expose an equivalent durable change identifier that clients can use as a replay cursor across reconnects. Applications typically need to recover state through re-fetching data or rebuilding state from the database rather than replaying from a known change position.
KalamDB focuses on a different abstraction:
Subscribe to state
not
Subscribe to changes
Key advantages of KalamDB include:
- Snapshot-first subscriptions built into the protocol
- Durable sequence ids for replay and recovery
- Consistent snapshot-to-live transitions
- User-scoped routing with minimal subscription evaluation
- SQL result synchronization rather than raw row-change delivery
- Efficient replay using sequence comparisons
- No need to filter large shared tables by user before replaying changes
Because KalamDB stores data in user-scoped tables, replay is particularly efficient. The server already knows which user's data is being synchronized and can simply stream changes newer than a given sequence id.
Conceptually:
User └── Table └── Rows
Replay becomes:
Find user's table→ Find rows with _seq > last_seen_seq→ Stream changes
rather than:
Scan shared table→ Filter by user→ Apply permissions→ Determine missed changes→ Stream results
In other words:
Supabase Realtime excels at exposing PostgreSQL changes to applications.
KalamDB is designed around keeping application state synchronized from the beginning, with replay and recovery built directly into the protocol.
Why user-scoped tables matter
This is one of the biggest architectural differences in KalamDB.
Many realtime systems start with:
One large table→ filter per user→ filter per subscription→ filter per client
KalamDB organizes data around ownership instead.
Conceptually:
User └── Table └── Rows
When a row changes, the server already knows who owns it.
Routing becomes:
Find user
→ Compare sequence id
→ Deliver update
instead of:
Scan subscriptions→ Evaluate filters→ Check permissions→ Route update
The same advantage applies to replay.
When a client reconnects, KalamDB doesn't need to scan a large shared table and filter rows by user ownership before determining what was missed. It can directly replay changes from the user's table using sequence ordering.
As the number of users grows, that difference becomes significant.
Developer experience
A subscription can be as simple as:
const unsubscribe = await client.live(` SELECT id, body, created_at, _seq FROM chat.messages`, (event) => { console.log(event); });The SDK handles:
- Authentication
- WebSocket lifecycle
- Snapshot loading
- Reconnects
- Sequence tracking
- Resume logic
Applications simply react to events.
When to use KalamDB
KalamDB works particularly well for:
- Chat applications
- Dashboards
- Notifications
- Collaborative tools
- Mobile apps
- Offline-first applications
- User-specific realtime data
It's designed for application state, not high-volume analytics streams.
Design summary
Goal | KalamDB approach |
|---|---|
Initial consistency | Snapshot before live updates |
Reconnect safety | Resume from sequence id |
Low filtering cost | User-scoped tables |
Efficient routing | Route by ownership |
Efficient replay | Replay from sequence ids |
Reduced bandwidth | Send only relevant rows |
Simple clients | SDK manages synchronization |
Offline sync | Resume from persisted sequence id |
Lock-free delivery | Snapshot boundary plus buffered changes |
Distributed ordering | Snowflake-based sortable sequence ids |
Scalable realtime | Compare sequence ids instead of re-running queries |
Conclusion
Most realtime systems focus on broadcasting events.
KalamDB focuses on synchronizing state.
The core idea is simple:
Snapshot→ Sequence id→ Live updates→ Resume from last sequence
The sequence id is more than a counter.
It's a globally sortable Snowflake identifier that enables reconnects, offline synchronization, distributed deployments, efficient replay, and efficient realtime delivery.
Unlike systems that primarily expose change streams, KalamDB treats replay as a first-class capability. Clients always know exactly where they are in the stream and can resume from a durable sequence id without rebuilding state from scratch.
Because data is organized around ownership, the server doesn't spend its time evaluating complex filters for every update. Instead, it can route changes efficiently and rely primarily on sequence ordering. Replay benefits from the same architecture: the server can directly locate a user's data and stream everything newer than the last acknowledged sequence id.
Combined with the snapshot-first design, this creates a lock-free synchronization model where writes continue uninterrupted while clients receive a complete, ordered stream of changes.
The result is a simpler way to think about realtime data:
Subscribe once.
Receive a snapshot.
Stay up to date.
Reconnect from the last sequence id.
That's the foundation of realtime SQL in KalamDB.
-TLgmZcK1heam1n69RVEGBCFlYidxUF.jpeg%3Fprefix%3Dmedia&w=128&q=75)


