Topic Pub/Sub

KalamDB topics are a durable table-change feed. Topic definitions live in system.topics, matching table writes are turned into topic messages inside the write path, and consumer progress is persisted in system.topic_offsets.

If you want an app write to trigger background work without polling a table, topics are the pub/sub layer for that.

A consumer group is just a named cursor in system.topic_offsets. Use a group when a worker should resume where it left off, and omit GROUP when you want a stateless read for debugging or replay.

Why topics exist

Use topics when you need:

normal SQL writes from your app
durable delivery to workers or services
replay from stored offsets
consumer-group progress tracking
a clean boundary between request-time writes and async work

A common pattern is:

your app inserts a row
KalamDB publishes a topic message for that row change
an agent or worker consumes the message
the worker writes results back with SQL

See Getting Started for the smallest SQL-only version of that flow.

How topics work inside KalamDB

CREATE TOPIC stores topic metadata and routes in system.topics.
On startup, KalamDB reloads persisted topics into an in-memory route cache and restores next offsets from stored topic messages.
When a table write happens, the table provider calls the topic publisher inline in the write path.
In cluster mode, only the leader publishes topic messages so the same row change is not published twice.
Each matching row change becomes one topic message.
Messages are appended to the internal RocksDB topic_messages partition with a (topic_id, partition_id, offset) key.
Consumer-group acknowledgements are stored in system.topic_offsets.
While a group is processing a batch, KalamDB keeps an in-memory claim for that (topic, group, partition) range. If it is not acknowledged before the visibility timeout expires, it can be delivered again.
Retention policies are stored per topic and can prune old messages by age or per-partition byte cap without rewriting offsets.

The claim visibility timeout is controlled by topics.visibility_timeout_secs in server.toml. Keep it longer than normal worker processing time, but low enough that crashed workers recover quickly. The default is 60 seconds; local smoke tests and fast worker loops often use 10 seconds.

What a topic message contains

Each message carries:

the topic id
the partition id
the offset inside that partition
the payload bytes
an optional key
the publish timestamp
the producing user id when available
the triggering operation: Insert, Update, or Delete

SQL CONSUME exposes that field as user_id, while the HTTP topic API resolves it to user in JSON responses.

What works well today

durable topic metadata in system.topics
routing by source table and operation
row-local WHERE filters on topic routes
synchronous publish-on-write from table mutations
persisted message log in RocksDB
consumer-group offsets in system.topic_offsets
HTTP consume plus explicit ack
startup restoration of routes and next offsets
topic retention by age or per-partition byte cap
background cleanup after DROP TOPIC or CLEAR TOPIC

Current limits and experimental areas

Use PARTITIONS 1 in production for now.

Current code still has unfinished edges:

multi-partition topics have internal plumbing, but the full user-facing story is incomplete
partition_key_expr is stored in route metadata but is not used by the publisher
SQL CONSUME reads partition 0 only
HTTP consume reads one partition per request
payload = 'full' is the only payload mode you should rely on today
payload = 'diff' is still in development; the parser accepts it, but current publishing code emits the same full-row payload as full
payload = 'key' is still in development; current publishing code serializes row JSON instead of a stable primary-key-only payload shape

Retention is now part of the supported topic surface. You can configure it with CREATE TOPIC ... WITH (...), update it with ALTER TOPIC ... SET RETENTION, and disable it with ALTER TOPIC ... CLEAR RETENTION. See SQL Topics & Consumers for the exact syntax and how retention affects FROM EARLIEST and consumer-group recovery.

Choose the right page

Getting Started: simple SQL-only topic flow for an AI worker.
SQL Topics & Consumers: create topics, attach routes, inspect queue state, and use the SQL consume surface.
HTTP Topic API: explicit consume and ack flow for services and workers.