Topic Pub/Sub
KalamDB topics are a durable table-change feed. Topic definitions live in `system.topics`, matching table writes are turned into topic messages inside the write path, and consumer progress is persisted in `system.topic_offsets`.
If you want an app write to trigger background work without polling a table, topics are the pub/sub layer for that.
A consumer group is just a named cursor in `system.topic_offsets`. Use a group when a worker should resume where it left off, and omit `GROUP` when you want a stateless read for debugging or replay.
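As a sketch of the two read styles (the topic name `invoice_events` and the `LIMIT` clause are illustrative assumptions; only `CONSUME` and `GROUP` are named on this page):

```sql
-- Stateful read: the 'billing-workers' cursor in system.topic_offsets
-- advances on ack, so the next call resumes where this one left off.
CONSUME FROM invoice_events GROUP 'billing-workers' LIMIT 100;

-- Stateless read: no GROUP, so no cursor is persisted. Handy for
-- debugging or replaying messages without disturbing worker progress.
CONSUME FROM invoice_events LIMIT 100;
```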
Why topics exist
Use topics when you need:
- normal SQL writes from your app
- durable delivery to workers or services
- replay from stored offsets
- consumer-group progress tracking
- a clean boundary between request-time writes and async work
A common pattern is:
- your app inserts a row
- KalamDB publishes a topic message for that row change
- an agent or worker consumes the message
- the worker writes results back with SQL
See Getting Started for the smallest SQL-only version of that flow.
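The pattern above can be sketched in SQL. Table and topic names, and the exact `CREATE TOPIC` clause layout, are illustrative assumptions here; the SQL Topics & Consumers page has the real surface.

```sql
-- 1. Define a topic fed by writes to the app table.
CREATE TOPIC invoice_events FROM TABLE invoices PARTITIONS 1;

-- 2. The app writes normally; the write path publishes a message inline.
INSERT INTO invoices (id, amount) VALUES (42, 99.50);

-- 3. A worker consumes under a group cursor and writes results back.
CONSUME FROM invoice_events GROUP 'enricher';
UPDATE invoices SET status = 'processed' WHERE id = 42;
```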
How topics work inside KalamDB
- `CREATE TOPIC` stores topic metadata and routes in `system.topics`.
- On startup, KalamDB reloads persisted topics into an in-memory route cache and restores next offsets from stored topic messages.
- When a table write happens, the table provider calls the topic publisher inline in the write path.
- In cluster mode, only the leader publishes topic messages so the same row change is not published twice.
- Each matching row change becomes one topic message.
- Messages are appended to the internal RocksDB `topic_messages` partition with a `(topic_id, partition_id, offset)` key.
- Consumer-group acknowledgements are stored in `system.topic_offsets`.
- While a group is processing a batch, KalamDB keeps an in-memory claim for that `(topic, group, partition)` range. If the batch is not acknowledged before the visibility timeout expires, it can be delivered again.
The claim visibility timeout is controlled by `topics.visibility_timeout_secs` in `server.toml`. Keep it longer than normal worker processing time, but low enough that crashed workers recover quickly. The default is 60 seconds; local smoke tests and fast worker loops often use 10 seconds.
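As a config sketch (only `topics.visibility_timeout_secs` is documented here; the surrounding layout of `server.toml` is assumed):

```toml
[topics]
# Unacked claims become redeliverable after this many seconds.
# Default is 60; keep it above normal worker processing time.
# Local smoke tests and fast worker loops often drop this to 10.
visibility_timeout_secs = 60
```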
What a topic message contains
Each message carries:
- the topic id
- the partition id
- the offset inside that partition
- the payload bytes
- an optional key
- the publish timestamp
- the producing user id when available
- the triggering operation: `Insert`, `Update`, or `Delete`
SQL `CONSUME` exposes that field as `user_id`, while the HTTP topic API resolves it to `user` in JSON responses.
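Mapped onto an HTTP JSON response, a message might look like the following sketch. Every field name and value here is illustrative except the `user` naming, which is documented above; check the HTTP Topic API page for the real shape.

```json
{
  "topic_id": 7,
  "partition_id": 0,
  "offset": 123,
  "key": "invoice:42",
  "payload": "{\"id\": 42, \"amount\": 99.5}",
  "timestamp": "2025-01-01T00:00:00Z",
  "user": "app_user",
  "operation": "Insert"
}
```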
What works well today
- durable topic metadata in `system.topics`
- routing by source table and operation
- synchronous publish-on-write from table mutations
- persisted message log in RocksDB
- consumer-group offsets in `system.topic_offsets`
- HTTP consume plus explicit ack
- startup restoration of routes and next offsets
- background cleanup after `DROP TOPIC` or `CLEAR TOPIC`
Current limits and experimental areas
Use `PARTITIONS 1` in production for now.
Current code still has unfinished edges:
- multi-partition topics have internal plumbing, but the full user-facing story is incomplete
- `partition_key_expr` is stored in route metadata but is not used by the publisher
- SQL `CONSUME` reads partition `0` only
- HTTP consume reads one partition per request
- route `WHERE` clauses are parsed and stored but are not evaluated in the publisher path today
- `payload = 'full'` is the only payload mode you should rely on today
- `payload = 'diff'` is accepted, but current publishing code emits the same full-row payload as `full`
- `payload = 'key'` currently serializes the row JSON used as the message key; do not assume a primary-key-only payload yet
- topic retention metadata exists, but automatic retention enforcement is still incomplete
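Given those limits, a conservative production topic sticks to one partition, the full-row payload, and no route `WHERE` clause. As a sketch (clause layout is illustrative; only `PARTITIONS 1` and `payload = 'full'` are named on this page):

```sql
-- Single partition, full-row payload, no route filter:
-- avoids every experimental edge listed above.
CREATE TOPIC order_events
  FROM TABLE orders
  PARTITIONS 1
  PAYLOAD = 'full';
```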
Choose the right page
- Getting Started: simple SQL-only topic flow for an AI worker.
- SQL Topics & Consumers: create topics, attach routes, inspect queue state, and use the SQL consume surface.
- HTTP Topic API: explicit consume and ack flow for services and workers.