Live Query Architecture

KalamDB is designed from the ground up to support real-time, reactive applications. Unlike traditional databases that require complex polling mechanisms or external message brokers (like Kafka or Redis) to stream changes, KalamDB natively supports Live Queries via WebSockets.

This architecture allows clients to subscribe to changes on USER, SHARED, and STREAM tables and receive instant push notifications whenever data is inserted, updated, or deleted.

How It Works

When a client subscribes to a query (e.g., SELECT * FROM chat.messages WHERE conversation_id = 123), KalamDB does not just execute the query once. It registers a Live Query Subscription in the cluster.

You can subscribe in two common ways:

  • Whole-row subscription: SELECT * FROM chat.changes WHERE ...
  • Projection subscription: SELECT username, status FROM chat.changes WHERE ...

Projection subscriptions are useful when the client needs only a few columns, which reduces payload size and improves UI update performance.
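To illustrate why a projection shrinks the payload, here is a minimal Python sketch. It is not KalamDB code: `project_row` and the sample row are hypothetical names used only to show the effect of sending two columns instead of the whole row.

```python
import json
from typing import Optional


def project_row(row: dict, columns: Optional[list]) -> dict:
    """Return only the requested columns; None stands for SELECT * (whole row)."""
    if columns is None:
        return dict(row)
    return {name: row[name] for name in columns if name in row}


# Hypothetical row with one large column the UI does not need.
row = {"id": 7, "username": "ada", "status": "online", "bio": "x" * 500}

whole = json.dumps(project_row(row, None))
projected = json.dumps(project_row(row, ["username", "status"]))

# The projected payload is much smaller than the whole-row payload.
print(len(whole), len(projected))
```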

Tenant-Aware Routing for Performance

A main reason KalamDB performs well on real-time workloads is its tenant-aware storage architecture.

In traditional databases, a live query often requires the system to scan or index the entire table or maintain complex global replication logs to find relevant changes. In KalamDB, USER and STREAM tables are physically sharded by user_id.

When a live query is registered for a specific user, the system doesn’t need to scan the whole table. It directly monitors the isolated storage partition for that specific user tenant. This drastically reduces I/O overhead and allows the system to scale horizontally to millions of concurrent connections.
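The lookup idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not KalamDB's internal data structure: `UserSubscriptionIndex` and its methods are hypothetical, and subscriptions are modeled as plain predicates over a row.

```python
from collections import defaultdict


class UserSubscriptionIndex:
    """Hypothetical index: subscriptions are grouped by the user_id they belong to."""

    def __init__(self):
        # user_id -> list of (subscription_id, predicate)
        self._by_user = defaultdict(list)

    def register(self, user_id, sub_id, predicate):
        self._by_user[user_id].append((sub_id, predicate))

    def matching(self, user_id, row):
        # Only the subscriptions of the user whose partition changed are evaluated;
        # subscriptions of every other user are never touched.
        candidates = self._by_user.get(user_id, [])
        return [sub_id for sub_id, predicate in candidates if predicate(row)]


index = UserSubscriptionIndex()
index.register("alice", "s1", lambda r: r["conversation_id"] == 123)
index.register("alice", "s2", lambda r: r["conversation_id"] == 456)
index.register("bob", "s3", lambda r: True)

# A write to alice's partition is checked against alice's two subscriptions only.
print(index.matching("alice", {"conversation_id": 123}))
```

The cost of handling a change depends on how many subscriptions that one user holds, not on the total number of subscriptions in the system.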

In clustered mode, the client may be connected to a follower node while writes are accepted by the leader. KalamDB forwards those changes through Raft replication so the follower applies them locally, evaluates matching subscriptions, and pushes updates to the connected client.

Offline-First and Sequence IDs

KalamDB’s live queries are built to support offline-first applications seamlessly.

When subscribing to a table, you don’t just get future events. You can also request historical data to synchronize your local state before the live stream begins. This is powered by Sequence IDs (SeqId), which provide a strict, monotonically increasing order of operations.

When initiating a subscription, you can specify:

  • since_seq: Pull all changes that happened after a specific Sequence ID.
  • fetch_last: Load only the latest N rows.
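The semantics of the two options can be sketched over an in-memory change log. This is a simplified model, not the server implementation: `initial_batch` and the `seq` field name are hypothetical, and the behavior when both options are given together (filter by since_seq first, then keep the last N) is an assumption.

```python
from typing import Optional


def initial_batch(changes, since_seq: Optional[int] = None,
                  fetch_last: Optional[int] = None):
    """Select the initial rows for a subscription.

    `changes` is a list of dicts sorted in ascending order by their 'seq' value.
    """
    selected = changes
    if since_seq is not None:
        # Strictly after the given Sequence ID: the client already has since_seq.
        selected = [c for c in selected if c["seq"] > since_seq]
    if fetch_last is not None:
        # Guard against fetch_last == 0, because selected[-0:] is the whole list.
        selected = selected[-fetch_last:] if fetch_last > 0 else []
    return selected


log = [{"seq": n, "text": f"message {n}"} for n in range(1, 21)]

catch_up = initial_batch(log, since_seq=17)   # seq 18, 19, 20
latest = initial_batch(log, fetch_last=10)    # seq 11 .. 20
```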

Example: Chat Application

Imagine a user opening an old conversation in a chat app. Loading the entire history of the conversation would be slow and consume unnecessary bandwidth.

Instead, the client can subscribe to the conversation and request only the latest 10 messages:

  1. The client sends a subscription request with fetch_last: 10.
  2. KalamDB immediately returns an initial_data_batch containing those 10 messages.
  3. The subscription remains open.
  4. As new messages are sent in that conversation, KalamDB pushes them to the client in real time.

If the user goes offline and reconnects later, the client can resubscribe with since_seq set to the Sequence ID of the last message it received. KalamDB then pushes any messages missed during the offline period before transitioning back to the live stream.
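On the client side, this reconnect flow amounts to remembering the highest Sequence ID seen so far. The sketch below shows that bookkeeping only and opens no real connection. The `ResumableSubscription` class and the `sql` and `seq` field names are hypothetical; only since_seq and fetch_last come from the description above, and the actual wire format may differ.

```python
class ResumableSubscription:
    """Hypothetical client-side helper that tracks the last Sequence ID received."""

    def __init__(self, sql, fetch_last=None):
        self.sql = sql
        self.fetch_last = fetch_last
        self.last_seq = None  # nothing received yet

    def on_change(self, change):
        """Record a pushed change; return False for an already-seen Sequence ID."""
        seq = change["seq"]
        if self.last_seq is not None and seq <= self.last_seq:
            return False
        self.last_seq = seq
        return True

    def subscribe_request(self):
        """Build the next (re)subscription request as a plain dict."""
        request = {"sql": self.sql}
        if self.last_seq is not None:
            # Reconnect: ask only for what was missed while offline.
            request["since_seq"] = self.last_seq
        elif self.fetch_last is not None:
            # First connect: ask for the latest N rows.
            request["fetch_last"] = self.fetch_last
        return request


sub = ResumableSubscription(
    "SELECT * FROM chat.messages WHERE conversation_id = 123", fetch_last=10
)
first_request = sub.subscribe_request()   # carries fetch_last: 10
sub.on_change({"seq": 41, "text": "hi"})
sub.on_change({"seq": 42, "text": "still there?"})
resume_request = sub.subscribe_request()  # carries since_seq: 42
```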

Cluster-Wide Visibility

Live query subscriptions are replicated across the KalamDB cluster using Raft (UserDataCommand). This ensures that no matter which node a client connects to, or which node processes a write operation, the event is correctly routed and pushed to the subscribed client.

Clients can stay connected to follower nodes while writes are processed by the leader. The change stream is still delivered to followers, so followers can fan out updates to their connected clients. This helps increase total concurrent connection capacity because client sessions are distributed across nodes.

Shared Table Subscriptions at High Scale

KalamDB has a dedicated shared-table streaming notification path optimized for high subscriber fan-out.

For SHARED tables, the runtime keeps a dedicated table-level subscription index so it can resolve shared subscribers quickly without user-partition lookup.

When a shared-table change arrives, KalamDB uses a streaming fan-out pipeline:

  1. Pre-compute row JSON once and share it across workers.
  2. Snapshot subscriber handles for the target shared table.
  3. Process subscribers in parallel chunks (chunked dispatch) while still applying per-subscriber filter, projection, flow control, and channel delivery semantics.

Why this enables high subscriber counts:

  • avoids repeating full row conversion for every subscriber
  • uses shared payload memory for lower CPU and allocation overhead
  • parallelizes fan-out while preserving subscription-specific behavior
  • keeps backpressure behavior intact with flow control and non-blocking send paths
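The three pipeline steps can be modeled in Python with a thread pool and bounded queues. This is a rough sketch of the technique, not KalamDB's runtime: `Subscriber`, `fan_out`, and `deliver_chunk` are hypothetical names, and counting a message as dropped when a subscriber's queue is full is an assumption made here to keep the send path non-blocking.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from queue import Full, Queue


class Subscriber:
    def __init__(self, name, predicate, columns=None, capacity=16):
        self.name = name
        self.predicate = predicate              # per-subscriber filter
        self.columns = columns                  # per-subscriber projection, None = whole row
        self.outbox = Queue(maxsize=capacity)   # bounded channel for flow control
        self.dropped = 0


def deliver_chunk(chunk, row, full_payload):
    delivered = 0
    for sub in chunk:
        if not sub.predicate(row):
            continue
        if sub.columns is None:
            payload = full_payload  # reuse the shared, pre-computed JSON
        else:
            payload = json.dumps({k: row[k] for k in sub.columns if k in row})
        try:
            sub.outbox.put_nowait(payload)  # never block on a slow subscriber
            delivered += 1
        except Full:
            sub.dropped += 1  # assumption: a full channel drops the message
    return delivered


def fan_out(row, subscribers, chunk_size=2, workers=4):
    full_payload = json.dumps(row)   # step 1: convert the row to JSON once
    snapshot = list(subscribers)     # step 2: snapshot the subscriber handles
    chunks = [snapshot[i:i + chunk_size] for i in range(0, len(snapshot), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:  # step 3: parallel chunks
        return sum(pool.map(lambda c: deliver_chunk(c, row, full_payload), chunks))
```

A subscriber without a projection receives the same pre-computed string as every other whole-row subscriber, so the full row conversion cost is paid once per change rather than once per subscriber.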

This path is central to KalamDB's real-time capabilities for SHARED tables, where a single change may need to reach very large subscriber groups with low latency.
