Running KalamDB in Production

This guide walks you through every setting, secret, and network boundary you need to lock down before exposing a KalamDB server to the public internet or an untrusted network.

Work through the checklist from top to bottom. Each section is independently actionable: you can enable a single item and restart the server before moving to the next.

Threat model assumption: attackers can reach your HTTP and cluster RPC ports. Your defenses must not rely on network isolation alone.

Quick Checklist

Copy this into your deploy runbook. Every box must be ticked before you open the port.

Network & TLS

server.host = "127.0.0.1" (or an explicit private interface); avoid 0.0.0.0 unless the server must accept connections on every interface
Terminate TLS at an edge proxy (nginx, Caddy, ALB, Cloudflare) — KalamDB itself serves plain HTTP
HTTP server sits behind a firewall/security-group that only allows the edge proxy
Cluster RPC port (cluster.rpc_addr) is not reachable from the public internet
For multi-node clusters: rpc_tls.enabled = true with require_client_cert = true

Secrets

auth.jwt_secret is at least 32 bytes, random, unique per environment
JWT secret is injected via env var or secret manager, never committed to source
Root/admin password is changed from any default and stored in a secret manager
KALAMDB_ROOT_PASSWORD is unset after seeding and its value is scrubbed from shell history
OAuth/OIDC client secrets are in env vars, not the config file

Auth & RBAC

auth.allow_remote_setup = false after first-run bootstrap
Every human user has a named account, not the built-in root
Service accounts use the service role, never dba/system
Account lockout is enabled (auth.max_failed_attempts, auth.lockout_duration_minutes)
Access token expiry ≤ 1 hour; refresh token expiry ≤ 7 days

Origins & Cookies

security.cors.allowed_origins is an explicit allowlist — no "*"
security.cors.allow_credentials = true only when origins are an explicit allowlist
security.strict_ws_origin_check = true
Auth cookies served over HTTPS: auth.cookie_secure = true, SameSite=Strict, HttpOnly

Abuse Controls

rate_limit.enable_connection_protection = true
rate_limit.max_auth_requests_per_ip_per_sec tuned (start at 10–20)
rate_limit.max_connections_per_ip tuned (start at 100)
security.max_request_body_size tuned to your real upload size
security.max_ws_message_size tuned to your real message size

Observability

Audit logs shipped to a separate append-only sink (e.g., SIEM, S3 with object lock)
Metrics endpoint exposed only to the internal monitoring network
Alerts configured on: failed logins/min, 401/403 rate, new user creation, role changes

1. Bind Addresses & TLS

KalamDB does not terminate TLS for its HTTP API. Run it behind a reverse proxy that provides HTTPS and forwards to KalamDB on localhost.

Recommended topology


[ client ] ──HTTPS──► [ nginx/Caddy/ALB ] ──HTTP(loopback)──► [ kalamdb :8080 ]
                                                              [ kalamdb :9090 cluster mTLS ]

Config


[server]
host = "127.0.0.1"     # loopback only; proxy handles remote clients
port = 8080
 
[cluster]
# Bind cluster RPC to the private cluster network interface only.
rpc_addr = "10.0.1.15:9090"
 
[rpc_tls]
enabled = true
require_client_cert = true
ca_cert = "/etc/kalamdb/tls/cluster-ca.pem"
server_cert = "/etc/kalamdb/tls/node.pem"
server_key = "/etc/kalamdb/tls/node.key"

If any cluster node is reachable over a non-loopback address, KalamDB refuses to start unless rpc_tls.enabled = true. Do not work around this check.
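
A pre-deploy script can catch an unsafe bind address before the server ever starts. The following is a minimal sketch using Python's standard `ipaddress` module; the function names are illustrative and are not part of KalamDB.

```python
import ipaddress

def classify_bind_host(host: str) -> str:
    """Classify an IP literal as 'wildcard', 'loopback', 'private', or 'public'."""
    addr = ipaddress.ip_address(host)
    # Check the wildcard first: 0.0.0.0 also falls inside a reserved range
    # that Python reports as private.
    if addr.is_unspecified:
        return "wildcard"
    if addr.is_loopback:
        return "loopback"
    if addr.is_private:
        return "private"
    return "public"

def bind_is_acceptable(host: str) -> bool:
    """Accept only loopback or private-interface binds, as the checklist asks."""
    return classify_bind_host(host) in {"loopback", "private"}
```

Run it against the values of server.host and the host part of cluster.rpc_addr in your deploy pipeline and fail the deploy on anything else.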

2. JWT Secrets & Token Hygiene

Secret requirements

Length: minimum 32 bytes (64+ recommended)
Source: cryptographic RNG (openssl rand -base64 48), not a passphrase
Rotation: rotate on compromise, personnel change, or at least annually
Scope: unique per environment (dev, staging, prod never share)

Generate and inject via environment:


export KALAMDB_AUTH_JWT_SECRET="$(openssl rand -base64 48)"


[auth]
# Leave unset in the file when you set KALAMDB_AUTH_JWT_SECRET in the env.
# jwt_secret = "..."
access_token_expiry_hours = 1
refresh_token_expiry_hours = 168   # 7 days

KalamDB refuses to start on non-loopback binds if the secret is empty, a known-weak placeholder, or shorter than 32 bytes.
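
If you prefer to generate and sanity-check the secret from a script instead of openssl, the sketch below mirrors the three startup conditions described above. The placeholder list is a made-up stand-in; the list KalamDB actually checks is not given in this guide.

```python
import secrets

MIN_SECRET_BYTES = 32
# Hypothetical examples of weak placeholders, for illustration only.
WEAK_PLACEHOLDERS = {"changeme", "secret", "your-secret-here"}

def generate_jwt_secret(num_bytes: int = 48) -> str:
    """Return a URL-safe secret built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)

def secret_problems(secret: str) -> list:
    """List the reasons a secret would be unfit for a non-loopback bind."""
    if not secret:
        return ["empty"]
    problems = []
    if secret.lower() in WEAK_PLACEHOLDERS:
        problems.append("known-weak placeholder")
    if len(secret.encode("utf-8")) < MIN_SECRET_BYTES:
        problems.append("shorter than 32 bytes")
    return problems
```

An empty result from `secret_problems` means the value passes these local checks; the server's own validation remains the authority.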

Rotation procedure

Generate a new secret.
Deploy it to every node simultaneously.
Restart the service on every node.
Expect all outstanding tokens to fail verification; clients must log in again.

Do not re-use an old secret. There is no “grace period” key ring; rotation is abrupt by design.

Where tokens are exposed

Surface	Carries token?	Notes
`Authorization: Bearer …` header	Yes	Preferred for server-to-server
Auth cookies	Yes	Browser flows; always set `Secure`, `HttpOnly`, `SameSite=Strict`
WebSocket subprotocol	Yes	Sent once during upgrade; never in URL query
Logs	No	Log redaction is enabled by default; keep it enabled
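
For the cookie row, this is roughly what a correctly flagged auth cookie looks like on the wire. The sketch uses Python's standard `http.cookies` module; the cookie name `kalam_access` is an assumption for illustration, not KalamDB's actual cookie name.

```python
from http.cookies import SimpleCookie

def build_auth_cookie(name: str, token: str, max_age_seconds: int) -> str:
    """Return a Set-Cookie header value with Secure, HttpOnly, and SameSite=Strict."""
    cookie = SimpleCookie()
    cookie[name] = token
    morsel = cookie[name]
    morsel["secure"] = True        # only sent over HTTPS
    morsel["httponly"] = True      # not readable from page JavaScript
    morsel["samesite"] = "Strict"  # not attached to cross-site requests
    morsel["path"] = "/"
    morsel["max-age"] = max_age_seconds
    return morsel.OutputString()
```

When you inspect responses from your edge proxy, the Set-Cookie header should carry all three attributes; if any is missing, check that the proxy is not rewriting or stripping cookie flags.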

3. Admin Bootstrap & Setup

On first boot, KalamDB seeds a root user. What happens next depends on your configuration.

Option A — Seeded password (recommended for remote deployments)


export KALAMDB_ROOT_PASSWORD="$(openssl rand -base64 24)"
# 1. Store "$KALAMDB_ROOT_PASSWORD" in your secret manager now; it cannot be recovered later.
# 2. Run the server once so the password is hashed and persisted.
# 3. Unset the env var so the plaintext does not linger in the environment.
unset KALAMDB_ROOT_PASSWORD

Option B — Remote setup endpoint (one-time)

If you cannot set the env var, allow the setup endpoint for the first call only:


[auth]
allow_remote_setup = true   # flip back to false immediately after setup

Set allow_remote_setup = false and restart after you complete setup. Leaving it enabled lets anyone who can reach the endpoint attempt to seed credentials.

Option C — Localhost-only root

If you only administer via an SSH tunnel, leave the password empty. KalamDB will refuse remote logins for root and only accept localhost calls. Create named admin users via CREATE USER for remote access.

4. Password Policy

Minimum controls enforced by KalamDB:

Minimum length (configurable via auth.password_min_length)
Max length 72 bytes (bcrypt limit)
Bcrypt hash with cost 12
Rejected common-password list
Constant-time verification
Generic error messages (no user enumeration)
Account lockout after N failed attempts

Recommended runtime:


[auth]
password_min_length = 12
max_failed_attempts = 5
lockout_duration_minutes = 15

Enforce complexity requirements (e.g., character classes) in your identity provider. Beyond the minimum length and the common-password list, KalamDB deliberately applies no complexity rules, so that long passphrases are not falsely rejected.
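
If you provision accounts from your own tooling, you can pre-check passwords against the same kind of rules before calling CREATE USER. This is a sketch under stated assumptions: the common-password set is a tiny stand-in, and it assumes the minimum length counts characters while the 72-byte limit counts UTF-8 bytes.

```python
MIN_LENGTH = 12        # mirrors auth.password_min_length in the config above
BCRYPT_MAX_BYTES = 72  # bcrypt limit stated in the list above
# Stand-in list for illustration; the real rejected-password list is not shown here.
COMMON_PASSWORDS = {"password1234", "123456789012", "qwertyuiopas"}

def password_rejection_reason(password: str):
    """Return None when the password passes, otherwise a short reason."""
    if len(password) < MIN_LENGTH:
        return "too short"
    if len(password.encode("utf-8")) > BCRYPT_MAX_BYTES:
        return "longer than 72 bytes"
    if password.lower() in COMMON_PASSWORDS:
        return "common password"
    return None
```

Note that a long passphrase made of ordinary words passes, which matches the stated intent of not penalising passphrases.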

5. CORS & WebSocket Origins

CORS for the browser admin UI


[security.cors]
allowed_origins = [
    "https://admin.example.com",
    "https://app.example.com",
]
allow_credentials = true
allowed_methods  = ["GET", "POST", "OPTIONS"]
allowed_headers  = ["Authorization", "Content-Type"]
max_age          = 600

Hard rules:

Never combine allowed_origins = ["*"] with allow_credentials = true. Browsers reject it, and KalamDB will refuse to start with that combination.
Use scheme + host + port — origin matching is exact.
Never include localhost origins in production config.
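
The three hard rules are easy to check mechanically in CI. The sketch below is an illustrative linter, not KalamDB's own startup validation; it takes the two values from the [security.cors] table and reports violations.

```python
from urllib.parse import urlsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}

def cors_config_problems(allowed_origins, allow_credentials):
    """Return a list of human-readable violations of the CORS hard rules."""
    problems = []
    if "*" in allowed_origins and allow_credentials:
        problems.append("wildcard origin combined with allow_credentials")
    for origin in allowed_origins:
        if origin == "*":
            continue
        parts = urlsplit(origin)
        if parts.scheme not in ("http", "https") or not parts.hostname:
            problems.append(f"{origin}: not a scheme://host[:port] origin")
        elif parts.path or parts.query or parts.fragment:
            problems.append(f"{origin}: origins carry no path, query, or fragment")
        elif parts.hostname in LOCAL_HOSTS:
            problems.append(f"{origin}: localhost origin in production config")
    return problems
```

A trailing slash counts as a path, so "https://app.example.com/" is flagged; with exact matching, that entry would never equal the Origin header a browser sends.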

WebSocket origins


[security]
strict_ws_origin_check = true
allowed_ws_origins = ["https://app.example.com"]

strict_ws_origin_check = true rejects connections that omit the Origin header (non-browser tooling must set it explicitly). Leave allowed_ws_origins empty only on loopback dev installs.
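
The decision the server makes during the WebSocket upgrade can be pictured as follows. This is a sketch of the behaviour described above, not KalamDB's code; in particular, treating an empty allowlist as "no restriction" is an assumption.

```python
def ws_origin_accepted(origin_header, allowed_ws_origins, strict=True):
    """Decide whether a WebSocket upgrade should proceed, based on its Origin."""
    if origin_header is None:
        # Strict mode rejects upgrades that omit the Origin header entirely.
        return not strict
    if not allowed_ws_origins:
        # Assumption: an empty allowlist imposes no restriction (dev installs only).
        return True
    # Exact string comparison against the allowlist.
    return origin_header in allowed_ws_origins
```

For non-browser clients this means sending an explicit Origin header that matches one of the configured entries.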

6. Rate Limiting & Connection Protection

Even with auth in place, rate limits protect against credential stuffing, SQL abuse, and connection exhaustion.


[rate_limit]
enable_connection_protection = true
 
# Auth endpoints (login, refresh, setup, WS auth)
max_auth_requests_per_ip_per_sec = 10
 
# Per-IP concurrent connection cap
max_connections_per_ip = 100
 
# Pre-auth request flood protection
max_requests_per_ip_per_sec = 200
 
# Per-user query rate (SQL endpoint)
max_queries_per_sec = 100
 
# WebSocket message flood protection
max_messages_per_sec = 50
 
# Temporary ban window for abusive IPs
ban_duration_seconds = 300
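
To build intuition for what a per-IP, per-second limit does, here is a token-bucket sketch. The guide does not say which algorithm KalamDB uses internally; a token bucket is simply one common way to implement limits of this shape, and the class names are illustrative.

```python
import time

class TokenBucket:
    """Allow up to `rate_per_sec` events per second, with a burst of the same size."""

    def __init__(self, rate_per_sec, clock=time.monotonic):
        self.rate = float(rate_per_sec)
        self.capacity = float(rate_per_sec)
        self.tokens = self.capacity
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

class PerIpLimiter:
    """Keep one bucket per client IP."""

    def __init__(self, rate_per_sec, clock=time.monotonic):
        self.rate_per_sec = rate_per_sec
        self.clock = clock
        self.buckets = {}

    def allow(self, ip: str) -> bool:
        bucket = self.buckets.get(ip)
        if bucket is None:
            bucket = self.buckets[ip] = TokenBucket(self.rate_per_sec, clock=self.clock)
        return bucket.allow()
```

With a rate of 10, a client that fires 12 login attempts in the same instant gets 10 through and 2 rejected, while a different IP is unaffected. That is why the auth limit should sit well below anything a legitimate login flow needs per second.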

If you run behind a reverse proxy, configure security.trusted_proxy_ranges so that rate limits attribute requests to the real client IP via X-Forwarded-For. Do not set this to 0.0.0.0/0 — that lets any caller spoof any IP and bypass the limits.


[security]
trusted_proxy_ranges = ["10.0.0.0/8", "172.16.0.0/12"]

7. RBAC & Least Privilege

KalamDB has four roles. Use the lowest one that works.

Role	Typical use	Can do
`user`	App end-users, SDK clients	Read/write their own namespaces, run DML
`service`	Backend services, pub/sub consumers	DML + topic consume/ack
`dba`	Database administrators	DDL, manage users except `system`
`system`	Reserved for the server itself	Everything (do not create new ones)

Rules of thumb:

Application tokens should be user or service, never dba.
Create one dba per human administrator; audit their usage.
EXECUTE AS USER follows a strict hierarchy (System → DBA → Service → User). A role can impersonate only roles below it, never a peer or a higher role.
System tables (system.*) are read/write restricted to dba and system. This is enforced inside the query planner, including through subqueries, CTEs, and views.
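
The impersonation rule can be expressed as a strict ordering over the four roles. This is an illustrative model of the rule as stated, not KalamDB's planner code.

```python
# Higher rank means more privilege, following System > DBA > Service > User.
ROLE_RANK = {"user": 0, "service": 1, "dba": 2, "system": 3}

def can_execute_as(actor_role: str, target_role: str) -> bool:
    """True only when the actor outranks the target; peers and superiors are refused."""
    return ROLE_RANK[actor_role] > ROLE_RANK[target_role]
```

Under this model a dba can run a statement as a service or user account, but not as another dba.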

On role demotion

When you demote or lock a user, rotate their tokens. KalamDB re-validates the DB role on each token refresh and invalidates tokens whose token_generation is older than the DB row — but already-issued access tokens remain valid until access_token_expiry_hours. Keep access-token TTLs short.
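
A simplified model of the refresh-time check helps explain why short access-token TTLs matter. Only `token_generation` comes from the description above; the `UserRow` type, the `locked` field, and the claim layout are assumptions made for this sketch.

```python
from dataclasses import dataclass

@dataclass
class UserRow:
    role: str
    token_generation: int
    locked: bool = False

def refresh_decision(claims: dict, row: UserRow):
    """Return (True, current_role) to reissue a token, or (False, reason) to refuse."""
    if row.locked:
        return (False, "account locked")
    # A token minted before the row's generation was bumped is stale.
    if claims.get("token_generation", -1) < row.token_generation:
        return (False, "token generation is stale")
    # The new token carries the role from the database, not from the old token.
    return (True, row.role)
```

The check only runs when a client refreshes. An access token issued a minute before a demotion keeps its old role until it expires, which with the recommended settings is at most one hour.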

8. Request Size Guards & File Uploads


[security]
max_request_body_size = 10485760   # 10 MiB; tune to real payload size
max_ws_message_size   = 1048576    # 1 MiB

For file-accepting endpoints (multipart FILE("name") placeholders, exports), KalamDB:

Validates path components against an allowlist (alphanumeric, -, _, .)
Canonicalises paths and rejects symlink escape
Returns 403 to non-owners trying to download another user’s files or exports
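
If you write your own tooling around the exports directory, apply the same two checks. The sketch below shows one way to do component allowlisting plus canonicalisation in Python (3.9 or later for `is_relative_to`); the function name and directory layout are illustrative.

```python
import re
from pathlib import Path

# Allowlist from the list above: alphanumerics, '-', '_', and '.'.
SAFE_COMPONENT = re.compile(r"[A-Za-z0-9._-]+")

def resolve_under(base_dir: str, relative_path: str) -> Path:
    """Resolve relative_path inside base_dir, or raise ValueError."""
    parts = relative_path.split("/")
    for part in parts:
        # '.' and '..' match the allowlist characters, so refuse them explicitly.
        if part in ("", ".", "..") or not SAFE_COMPONENT.fullmatch(part):
            raise ValueError(f"rejected path component: {part!r}")
    base = Path(base_dir).resolve()
    # resolve() follows symlinks, so a link pointing outside base is caught below.
    candidate = base.joinpath(*parts).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError("path escapes the base directory")
    return candidate
```

Both checks are needed: the allowlist stops traversal strings such as "../", and the canonicalisation step stops a symlink planted inside the directory from pointing elsewhere.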

Operators should additionally:

Put an object-storage lifecycle on the exports bucket (e.g., delete after N days)
Scan uploaded files with your antivirus pipeline before re-serving them to users

9. Cluster RPC Hardening

For anything beyond a single-node loopback cluster:


[rpc_tls]
enabled = true
require_client_cert = true
ca_cert = "/etc/kalamdb/tls/cluster-ca.pem"
server_cert = "/etc/kalamdb/tls/node-1.pem"
server_key = "/etc/kalamdb/tls/node-1.key"
 
[cluster]
rpc_addr = "10.0.1.15:9090"   # private interface only

Notes:

The cluster RPC port carries Raft consensus, ForwardSql, Ping, and node-info RPCs. Treat it as no less sensitive than the data port.
ForwardSql additionally re-validates the caller’s Bearer JWT on the receiving node — mTLS is defense-in-depth, not the only layer.
Use a private CA unique to the cluster. Do not reuse your edge-TLS certificate chain for cluster mTLS.
Rotate node certificates independently of JWT secrets.

10. Logging, Audit & Observability

KalamDB emits structured JSON logs. SQL statements are redacted before logging; do not re-enable raw SQL in log formatters.
Enable the audit stream to capture user logins, user creation/modification, role changes, impersonation events, and admin export downloads.
Ship audit logs to an append-only sink (S3 with object lock, SIEM).
Expose Prometheus metrics on the internal network only — never on the public API port.

Suggested alerts

auth_failed_logins_per_minute > 20
auth_403_forbidden_per_minute > 50
user_role_change_events > 0 (human review on every change)
impersonation_events grouped by actor
Cluster RPC connection count deviation > 3σ
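
If your monitoring stack lacks built-in rules for these, the two shapes of alert (a per-minute threshold and a 3σ deviation) can be sketched as follows. The class and function names are illustrative, and the metric names in the list above are not assumed to be exact KalamDB metric identifiers.

```python
from collections import deque
from statistics import mean, pstdev

class PerMinuteCounter:
    """Sliding-window counter that fires when events in the window exceed a threshold."""

    def __init__(self, threshold, window_seconds=60.0):
        self.threshold = threshold
        self.window = window_seconds
        self.events = deque()

    def record(self, timestamp: float) -> bool:
        self.events.append(timestamp)
        # Drop events that have aged out of the window.
        while self.events and self.events[0] <= timestamp - self.window:
            self.events.popleft()
        return len(self.events) > self.threshold

def deviates(history, current, sigmas=3.0) -> bool:
    """True when `current` lies more than `sigmas` standard deviations from the mean."""
    if len(history) < 2:
        return False
    mu = mean(history)
    sd = pstdev(history)
    if sd == 0:
        return current != mu
    return abs(current - mu) > sigmas * sd
```

Feed `PerMinuteCounter(threshold=20)` with failed-login timestamps for the first alert, and pass recent cluster RPC connection counts to `deviates` for the last one.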

11. Incident Response Playbook

When credentials, a token secret, or a node are suspected compromised:

Contain — rotate auth.jwt_secret. All outstanding tokens become invalid immediately.
Lock down — set auth.allow_remote_setup = false, revoke compromised user accounts (ALTER USER … LOCK).
Rotate — regenerate cluster mTLS material, redeploy, force clients to re-authenticate.
Narrow — temporarily lower max_auth_requests_per_ip_per_sec and max_connections_per_ip.
Preserve — snapshot logs, audit stream, and RocksDB backup before further changes.
Review — audit system.users for unexpected role changes, system.jobs for unexpected admin jobs, exports directory for unexpected downloads.
Post-mortem — document the breach path, fix the configuration gap, add a regression check.

12. High-Risk Misconfigurations To Avoid

KalamDB will refuse to start on several of these. The rest are human errors we see most often.

server.host = "0.0.0.0" with no edge proxy and default JWT secret
security.cors.allowed_origins = ["*"] combined with allow_credentials = true
security.strict_ws_origin_check = false on a public WebSocket endpoint
auth.allow_remote_setup = true left on after bootstrap
auth.jwt_secret shared across environments, or shorter than 32 bytes
security.trusted_proxy_ranges containing 0.0.0.0/0 or ::/0
Running a multi-node cluster with rpc_tls.enabled = false
Handing out dba or system role tokens to applications
Storing secrets in server.toml committed to Git
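
Several of these can be caught by a small linter in CI before a config ever reaches a server. The sketch below works on an already-parsed config dictionary and checks only the items that are visible in the file; it is an illustration, not a replacement for KalamDB's startup checks.

```python
import ipaddress

def high_risk_findings(cfg: dict, multi_node: bool = False) -> list:
    """Return descriptions of high-risk settings found in a parsed server.toml."""
    findings = []
    server = cfg.get("server", {})
    auth = cfg.get("auth", {})
    security = cfg.get("security", {})
    cors = security.get("cors", {})

    if server.get("host") == "0.0.0.0":
        findings.append("server.host binds every interface")
    if "*" in cors.get("allowed_origins", []) and cors.get("allow_credentials"):
        findings.append("CORS wildcard combined with allow_credentials")
    if security.get("strict_ws_origin_check") is False:
        findings.append("strict_ws_origin_check is disabled")
    if auth.get("allow_remote_setup"):
        findings.append("allow_remote_setup is still enabled")
    if "jwt_secret" in auth:
        findings.append("auth.jwt_secret is stored in the config file")
    for cidr in security.get("trusted_proxy_ranges", []):
        # A prefix length of 0 matches every address (0.0.0.0/0 or ::/0).
        if ipaddress.ip_network(cidr, strict=False).prefixlen == 0:
            findings.append(f"trusted_proxy_ranges contains {cidr}")
    if multi_node and not cfg.get("rpc_tls", {}).get("enabled"):
        findings.append("multi-node cluster without rpc_tls.enabled")
    return findings
```

On Python 3.11 or later, `tomllib.load` on a file opened in binary mode produces a dictionary of this shape. Fail the pipeline when the returned list is non-empty.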

Baseline Production `server.toml`

Drop-in starting point. Fill in real values, inject secrets via env.


[server]
host = "127.0.0.1"
port = 8080
enable_http2 = true
 
[auth]
# jwt_secret intentionally omitted — supplied via KALAMDB_AUTH_JWT_SECRET
allow_remote_setup = false
cookie_secure = true
access_token_expiry_hours = 1
refresh_token_expiry_hours = 168
password_min_length = 12
max_failed_attempts = 5
lockout_duration_minutes = 15
 
[rate_limit]
enable_connection_protection = true
max_auth_requests_per_ip_per_sec = 10
max_connections_per_ip = 100
max_requests_per_ip_per_sec = 200
max_queries_per_sec = 100
max_messages_per_sec = 50
ban_duration_seconds = 300
 
[security]
max_request_body_size = 10485760
max_ws_message_size = 1048576
strict_ws_origin_check = true
allowed_ws_origins = ["https://app.example.com"]
trusted_proxy_ranges = ["10.0.0.0/8"]
 
[security.cors]
allowed_origins = ["https://admin.example.com", "https://app.example.com"]
allow_credentials = true
allowed_methods = ["GET", "POST", "OPTIONS"]
allowed_headers = ["Authorization", "Content-Type"]
max_age = 600
 
[rpc_tls]
enabled = true
require_client_cert = true
ca_cert = "/etc/kalamdb/tls/cluster-ca.pem"
server_cert = "/etc/kalamdb/tls/node.pem"
server_key = "/etc/kalamdb/tls/node.key"
 
[cluster]
rpc_addr = "10.0.1.15:9090"
