Backup & Restore

KalamDB provides built-in SQL commands for backing up and restoring the entire database. A backup captures everything in one compressed archive — RocksDB data, Parquet storage files, Raft snapshots, and the server configuration — so you always have a consistent, self-contained snapshot you can restore from.

Role requirement: BACKUP DATABASE and RESTORE DATABASE require the DBA or System role.

BACKUP DATABASE

Create a full database backup. The entire data directory and server.toml are compressed into a single .tar.gz archive at the specified path.


BACKUP DATABASE TO '<backup_path>';

The command enqueues a background backup job and returns immediately with a Job ID you can use to monitor progress.

Parameters

| Parameter | Description |
| --- | --- |
| `backup_path` | Absolute path for the output archive. Must end in `.tar.gz`. Quotes (single or double) are required. |

Archive contents

| Path inside archive | Description |
| --- | --- |
| `data/rocksdb/` | RocksDB write-path data |
| `data/storage/` | Flushed Parquet segment files |
| `data/snapshots/` | Raft snapshots |
| `data/streams/` | Stream commit log data |
| `server.toml` | Server configuration (if present) |

Example


-- Back up the entire database to a timestamped archive
BACKUP DATABASE TO '/backups/kalamdb_20260224.tar.gz';
 
-- Store to a date-named path
BACKUP DATABASE TO '/var/backups/kalamdb/2026-02-24.tar.gz';

Response


Database backup started to '/backups/kalamdb_20260224.tar.gz'. Job ID: BK-0001
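If a script issues the command, it needs the Job ID from the response to check progress later. The sketch below pulls it out of the message text. It assumes the response always contains `Job ID: <id>` as in the example above; `extract_job_id` is a hypothetical helper, not part of KalamDB.

```python
import re

# Assumption: the response text contains "Job ID: <id>", as in the example
# messages on this page. The ID format itself (letters, digits, hyphens) is
# inferred from those examples, not from a specification.
JOB_ID_PATTERN = re.compile(r"Job ID:\s*([\w-]+)")


def extract_job_id(response_message: str) -> str:
    """Return the job ID embedded in a BACKUP/RESTORE response message."""
    match = JOB_ID_PATTERN.search(response_message)
    if match is None:
        raise ValueError(f"no job ID found in response: {response_message!r}")
    return match.group(1)


message = (
    "Database backup started to '/backups/kalamdb_20260224.tar.gz'. "
    "Job ID: BK-0001"
)
print(extract_job_id(message))  # BK-0001
```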

RESTORE DATABASE

Restore the entire database from a previously created .tar.gz backup archive. The command stages the restored data under <db_path>_restore_pending/ without touching the live database. A server restart is required to complete the restore — on startup, KalamDB detects the pending directory and swaps it in.


RESTORE DATABASE FROM '<backup_path>';

Warning: Once the server restarts after staging, all data written since the backup was taken will be lost. Take a fresh backup of the current state before running a restore if you need a rollback option.

Parameters

| Parameter | Description |
| --- | --- |
| `backup_path` | Absolute path to the `.tar.gz` backup archive to restore from. Quotes (single or double) are required. |

Restore lifecycle

1. Run RESTORE DATABASE FROM '<path>'; (the command returns a Job ID).
2. The job extracts the archive to <db_path>_restore_pending/.
3. Job status transitions to Completed with the message "Restore staged from '…'. RocksDB restore is pending server restart."
4. Restart the KalamDB server. On startup it detects the pending directory, swaps it in, and starts with the restored data.

Example


-- Stage a restore from a specific timestamped backup
RESTORE DATABASE FROM '/backups/kalamdb_20260224.tar.gz';

Response


Database restore started from '/backups/kalamdb_20260224.tar.gz'. Job ID: RS-0001

Before restarting, check that the job has completed:


-- Confirm the restore job reached Completed before restarting
SELECT job_id, status, message FROM system.jobs WHERE job_id = 'RS-0001';

Then restart the server process to activate the restored data.

Path security rules

KalamDB validates the backup path on both BACKUP and RESTORE to prevent path traversal and sensitive file access:

- .. sequences are not allowed (blocks path traversal)
- Null bytes (\0) are not allowed
- Writing to /etc/, /root/, /var/log/, or C:\Windows\ is blocked
- The path must be quoted (bare paths are rejected)


-- These will be rejected:
BACKUP DATABASE TO '../../../tmp/evil.tar.gz';   -- path traversal
BACKUP DATABASE TO '/etc/shadow';                -- sensitive directory
BACKUP DATABASE TO /backups/app.tar.gz;          -- unquoted path
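A script that builds backup paths can apply the same rules before sending the statement, so mistakes surface early. The following is a rough client-side approximation of the rules listed above, not KalamDB's actual validator; the server may apply the rules differently or apply additional ones. `check_backup_path` is a hypothetical name.

```python
from typing import List

# Blocked locations as listed in the rules above.
BLOCKED_UNIX_PREFIXES = ("/etc/", "/root/", "/var/log/")
BLOCKED_WINDOWS_PREFIX = "c:\\windows\\"


def check_backup_path(path: str) -> List[str]:
    """Return a list of rule violations; an empty list means no problem found."""
    problems = []
    if ".." in path:
        problems.append("contains '..' (path traversal)")
    if "\0" in path:
        problems.append("contains a null byte")
    # Windows paths are compared case-insensitively (an assumption of this sketch).
    if path.startswith(BLOCKED_UNIX_PREFIXES) or path.lower().startswith(
        BLOCKED_WINDOWS_PREFIX
    ):
        problems.append("targets a blocked system directory")
    if not path.endswith(".tar.gz"):
        problems.append("does not end in .tar.gz")
    return problems


for candidate in (
    "/backups/kalamdb_20260224.tar.gz",
    "../../../tmp/evil.tar.gz",
    "/etc/shadow",
):
    print(candidate, "->", check_backup_path(candidate) or "ok")
```

The quoting rule applies to the SQL statement rather than to the path itself, so it is not checked here.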

Async job execution

Backup and restore operations run as background jobs managed by the UnifiedJobManager. The SQL command returns immediately with a job ID; the work happens asynchronously so the server remains responsive.

You can monitor job status via the system.jobs table:


SELECT job_id, status, message
FROM system.jobs
WHERE job_id = 'BK-0001';

| `status` value | Meaning |
| --- | --- |
| `Queued` | Job is waiting to start |
| `Running` | Backup/restore is in progress |
| `Completed` | Finished successfully |
| `Failed` | An error occurred; see `message` for details |
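Because the command returns before the work is done, automation usually has to poll until the job reaches `Completed` or `Failed`. Here is a minimal polling sketch. It assumes you supply `fetch_status`, a hypothetical callable that runs the `system.jobs` query above with whatever client you use and returns a `(status, message)` pair. The interval and timeout values are arbitrary.

```python
import time
from typing import Callable, Tuple

# Terminal states, taken from the status table above.
TERMINAL_STATUSES = {"Completed", "Failed"}


def wait_for_job(
    fetch_status: Callable[[], Tuple[str, str]],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
) -> Tuple[str, str]:
    """Poll until the job is Completed or Failed, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status, message = fetch_status()
        if status in TERMINAL_STATUSES:
            return status, message
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds}s")
        time.sleep(poll_seconds)


# Demonstration with canned results standing in for real system.jobs queries.
canned = iter([("Queued", ""), ("Running", ""), ("Completed", "Backup finished")])
print(wait_for_job(lambda: next(canned), poll_seconds=0))
```

Passing the query in as a callable keeps the loop independent of any particular client library.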

Backup strategy

For production deployments, consider:

- Scheduled backups: Run BACKUP DATABASE on a cron or application-level schedule (daily, hourly, etc.)
- Timestamped archives: Include a date/time in the path so each backup is uniquely named and old ones are not silently overwritten
- Remote storage: Copy the .tar.gz archive to S3, GCS, or Azure Blob Storage after creation for off-site durability
- Snapshot + backup: Combine CLUSTER SNAPSHOT with BACKUP DATABASE for a fully consistent cluster-wide state capture
- Pre-restore snapshot: Take a backup of the current state before running RESTORE DATABASE so you can roll back if needed
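For scheduled, timestamped backups, a job can generate the statement with a fresh archive name on each run. The sketch below shows one way to do that. The `kalamdb_<UTC timestamp>.tar.gz` naming scheme and the `backup_statement` helper are illustrative choices, not a KalamDB convention.

```python
from datetime import datetime, timezone
from typing import Optional


def backup_statement(backup_dir: str, now: Optional[datetime] = None) -> str:
    """Build a BACKUP DATABASE statement with a UTC-timestamped archive path."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Normalise to UTC so archive names sort chronologically.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = f"{backup_dir.rstrip('/')}/kalamdb_{stamp}.tar.gz"
    if "'" in path:
        # A quote would break out of the SQL string literal.
        raise ValueError("backup path must not contain a single quote")
    return f"BACKUP DATABASE TO '{path}';"


# Fixed timestamp so the example output is reproducible.
fixed = datetime(2026, 2, 24, 3, 0, 0, tzinfo=timezone.utc)
print(backup_statement("/var/backups/kalamdb", fixed))
# BACKUP DATABASE TO '/var/backups/kalamdb/kalamdb_20260224T030000Z.tar.gz';
```

A second-level timestamp keeps hourly or ad hoc backups from colliding, which a date-only name would not.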
