Backup & Restore
KalamDB provides built-in SQL commands for backing up and restoring the entire database. A backup captures everything in one compressed archive — RocksDB data, Parquet storage files, Raft snapshots, and the server configuration — so you always have a consistent, self-contained snapshot you can restore from.
Role requirement: `BACKUP DATABASE` and `RESTORE DATABASE` require the DBA or System role.
BACKUP DATABASE
Create a full database backup. The entire data directory and server.toml are
compressed into a single .tar.gz archive at the specified path.
```sql
BACKUP DATABASE TO '<backup_path>';
```

The command enqueues a background backup job and returns immediately with a Job ID you can use to monitor progress.
Parameters
| Parameter | Description |
|---|---|
| backup_path | Absolute path for the output archive. Must end in `.tar.gz`. Quotes (single or double) are required. |
Archive contents
| Path inside archive | Description |
|---|---|
| data/rocksdb/ | RocksDB write-path data |
| data/storage/ | Flushed Parquet segment files |
| data/snapshots/ | Raft snapshots |
| data/streams/ | Stream commit log data |
| server.toml | Server configuration (if present) |
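Before trusting an archive for a restore, it can be useful to confirm it actually contains the layout listed above. The following is an unofficial sketch (not a KalamDB tool) that checks only the tar listing, not file contents:

```python
import tarfile

# Top-level entries the documentation says a backup archive contains.
EXPECTED_PREFIXES = (
    "data/rocksdb/",
    "data/storage/",
    "data/snapshots/",
    "data/streams/",
)

def verify_backup_layout(archive_path):
    """Return the documented directories missing from a backup archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        names = set(tar.getnames())
    missing = []
    for prefix in EXPECTED_PREFIXES:
        bare = prefix.rstrip("/")
        if bare not in names and not any(n.startswith(prefix) for n in names):
            missing.append(prefix)
    if "server.toml" not in names:
        # server.toml is only included "if present", so this is informational.
        print("note: archive has no server.toml (it is optional)")
    return missing
```

An empty return value means every documented directory is present.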
Example
```sql
-- Back up the entire database to a timestamped archive
BACKUP DATABASE TO '/backups/kalamdb_20260224.tar.gz';

-- Store to a date-named path
BACKUP DATABASE TO '/var/backups/kalamdb/2026-02-24.tar.gz';
```

Response:

```
Database backup started to '/backups/kalamdb_20260224.tar.gz'. Job ID: BK-0001
```

RESTORE DATABASE
Restore the entire database from a previously created .tar.gz backup archive.
The command stages the restored data under <db_path>_restore_pending/ without
touching the live database. A server restart is required to complete the
restore — on startup, KalamDB detects the pending directory and swaps it in.
```sql
RESTORE DATABASE FROM '<backup_path>';
```

Warning: Once the server restarts after staging, all data written since the backup was taken will be lost. Take a fresh backup of the current state before running a restore if you need a rollback option.
Parameters
| Parameter | Description |
|---|---|
| backup_path | Absolute path to the `.tar.gz` backup archive to restore from. Quotes (single or double) are required. |
Restore lifecycle
1. Issue `RESTORE DATABASE FROM '<path>';` and note the returned Job ID.
2. The job extracts the archive to `<db_path>_restore_pending/`.
3. The job status transitions to `Completed` with the message "Restore staged from '…'. RocksDB restore is pending server restart."
4. Restart the KalamDB server. It detects the pending directory, swaps it in, and starts fresh.
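The startup swap in the last step can be pictured with a small sketch. This is not KalamDB's actual implementation; it only mirrors the documented behavior, and the `_pre_restore` name for the displaced data is hypothetical:

```python
import os
import shutil

def activate_pending_restore(db_path):
    """Illustrative sketch of the documented startup swap.

    If <db_path>_restore_pending/ exists, the live data directory is
    moved aside and the staged directory takes its place.
    """
    pending = db_path + "_restore_pending"
    if not os.path.isdir(pending):
        return False  # nothing staged; normal startup
    displaced = db_path + "_pre_restore"  # hypothetical name for old data
    if os.path.isdir(db_path):
        shutil.move(db_path, displaced)
    shutil.move(pending, db_path)
    return True
```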
Example
```sql
-- Stage a restore from a specific timestamped backup
RESTORE DATABASE FROM '/backups/kalamdb_20260224.tar.gz';
```

Response:

```
Database restore started from '/backups/kalamdb_20260224.tar.gz'. Job ID: RS-0001
```

After the job completes:

```sql
-- Confirm the restore job reached Completed before restarting
SELECT job_id, status, message FROM system.jobs WHERE job_id = 'RS-0001';
```

Then restart the server process to activate the restored data.
Path security rules
KalamDB validates the backup path on both BACKUP and RESTORE to prevent
path traversal and sensitive file access:
- `..` sequences are not allowed (blocks path traversal)
- Null bytes (`\0`) are not allowed
- Writing to `/etc/`, `/root/`, `/var/log/`, or `C:\Windows\` is blocked
- The path must be quoted (bare paths are rejected)
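The rules above can be sketched as a standalone validator. This is an illustration of the documented checks, not KalamDB's actual code; note that the quoting rule is enforced by the SQL parser, so it cannot be checked on the extracted path string:

```python
BLOCKED_PREFIXES = ("/etc/", "/root/", "/var/log/", "C:\\Windows\\")

def validate_backup_path(path):
    """Return None if the path passes the documented rules,
    or a reason string if it would be rejected."""
    if ".." in path:
        return "path traversal ('..') is not allowed"
    if "\x00" in path:
        return "null bytes are not allowed"
    for prefix in BLOCKED_PREFIXES:
        if path.startswith(prefix):
            return f"writing under {prefix} is blocked"
    if not path.endswith(".tar.gz"):
        return "path must end in .tar.gz"
    return None
```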
```sql
-- These will be rejected:
BACKUP DATABASE TO '../../../tmp/evil.tar.gz'; -- path traversal
BACKUP DATABASE TO '/etc/shadow';              -- sensitive directory
BACKUP DATABASE TO /backups/app.tar.gz;        -- unquoted path
```

Async job execution
Backup and restore operations run as background jobs managed by the
UnifiedJobManager. The SQL command returns immediately with a job ID; the
work happens asynchronously so the server remains responsive.
You can monitor job status via the system.jobs table:
```sql
SELECT job_id, status, message
FROM system.jobs
WHERE job_id = 'BK-0001';
```

| status value | Meaning |
|---|---|
| Queued | Job is waiting to start |
| Running | Backup/restore is in progress |
| Completed | Finished successfully |
| Failed | An error occurred; see message for details |
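For scripted workflows (for example, waiting for a backup to finish before copying it off-site), the polling pattern can be sketched as below. `run_sql` is a placeholder for whatever client call executes SQL against the server and returns rows as dicts; KalamDB's actual client API may differ:

```python
import time

def wait_for_job(run_sql, job_id, timeout_s=600, poll_s=2.0):
    """Poll system.jobs until the given job reaches a terminal state.

    Returns the final row; raises TimeoutError if the job is still
    Queued/Running when the timeout expires.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        rows = run_sql(
            "SELECT job_id, status, message FROM system.jobs "
            f"WHERE job_id = '{job_id}';"
        )
        if rows and rows[0]["status"] in ("Completed", "Failed"):
            return rows[0]
        time.sleep(poll_s)
    raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")
```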
Backup strategy
For production deployments, consider:
- **Scheduled backups**: run `BACKUP DATABASE` on a cron or application-level schedule (daily, hourly, etc.)
- **Timestamped archives**: include a date/time in the path so each backup is uniquely named and old ones are not silently overwritten
- **Remote storage**: copy the `.tar.gz` archive to S3, GCS, or Azure Blob Storage after creation for off-site durability
- **Snapshot + backup**: combine `CLUSTER SNAPSHOT` with `BACKUP DATABASE` for a fully consistent cluster-wide state capture
- **Pre-restore snapshot**: take a backup of the current state before running `RESTORE DATABASE` so you can roll back if needed
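For the timestamped-archive strategy, a scheduler only needs to generate a unique path per run. A minimal sketch (the path format and base directory are illustrative; any unique `.tar.gz`-suffixed path works):

```python
from datetime import datetime, timezone

def timestamped_backup_sql(base_dir="/backups"):
    """Build a uniquely named BACKUP DATABASE statement for scheduled runs.

    Uses a UTC timestamp so archives sort chronologically and never
    silently overwrite each other.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    return f"BACKUP DATABASE TO '{base_dir}/kalamdb_{stamp}.tar.gz';"
```

A cron job or application scheduler can pass the returned statement to its SQL client once per interval.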