
Running KalamDB in Production

This guide walks you through every setting, secret, and network boundary you need to lock down before exposing a KalamDB server to the public internet or an untrusted network.

Work through the checklist top-to-bottom. Each section is independently actionable — you can turn a single item on and restart the server between steps.

Threat model assumption: attackers can reach your HTTP and cluster RPC ports. Your defenses must not rely on network isolation alone.


Quick Checklist

Copy this into your deploy runbook. Every box must be ticked before you open the port.

Network & TLS

  • server.host = "127.0.0.1" (or one explicit private interface); avoid 0.0.0.0 unless the server must accept connections on every interface
  • Terminate TLS at an edge proxy (nginx, Caddy, ALB, Cloudflare) — KalamDB itself serves plain HTTP
  • HTTP server sits behind a firewall/security-group that only allows the edge proxy
  • Cluster RPC port (cluster.rpc_addr) is not reachable from the public internet
  • For multi-node clusters: rpc_tls.enabled = true with require_client_cert = true

Secrets

  • auth.jwt_secret is at least 32 bytes, random, unique per environment
  • JWT secret is injected via env var or secret manager, never committed to source
  • Root/admin password is changed from any default and stored in a secret manager
  • KALAMDB_ROOT_PASSWORD env var is unset after seeding, and the command that set it is removed from shell history
  • OAuth/OIDC client secrets are in env vars, not the config file

Auth & RBAC

  • auth.allow_remote_setup = false after first-run bootstrap
  • Every human user has a named account, not the built-in root
  • Service accounts use the service role, never dba/system
  • Account lockout is enabled (auth.max_failed_attempts, auth.lockout_duration_minutes)
  • Access token expiry ≤ 1 hour; refresh token expiry ≤ 7 days

Origins & Cookies

  • security.cors.allowed_origins is an explicit allowlist — no "*"
  • security.cors.allow_credentials = true only when origins are an explicit allowlist
  • security.strict_ws_origin_check = true
  • Auth cookies served over HTTPS: auth.cookie_secure = true, SameSite=Strict, HttpOnly

Abuse Controls

  • rate_limit.enable_connection_protection = true
  • rate_limit.max_auth_requests_per_ip_per_sec tuned (start at 10–20)
  • rate_limit.max_connections_per_ip tuned (start at 100)
  • security.max_request_body_size tuned to your real upload size
  • security.max_ws_message_size tuned to your real message size

Observability

  • Audit logs shipped to a separate write-only sink (e.g., SIEM, S3 with object lock)
  • Metrics endpoint exposed only to the internal monitoring network
  • Alerts configured on: failed logins/min, 401/403 rate, new user creation, role changes

1. Bind Addresses & TLS

KalamDB does not terminate TLS itself. Run it behind a reverse proxy that provides HTTPS and forwards to KalamDB on localhost.

    [ client ] ──HTTPS──► [ nginx/Caddy/ALB ] ──HTTP(loopback)──► [ kalamdb :8080 ]
                                                                  [ kalamdb :9090 cluster mTLS ]

Config

    [server]
    host = "127.0.0.1"  # loopback only; proxy handles remote clients
    port = 8080

    [cluster]
    # Bind cluster RPC to the private cluster network interface only.
    rpc_addr = "10.0.1.15:9090"

    [rpc_tls]
    enabled = true
    require_client_cert = true
    ca_cert = "/etc/kalamdb/tls/cluster-ca.pem"
    server_cert = "/etc/kalamdb/tls/node.pem"
    server_key = "/etc/kalamdb/tls/node.key"

If any cluster node binds to a non-loopback address, KalamDB refuses to start unless rpc_tls.enabled = true. Do not work around this check.


2. JWT Secrets & Token Hygiene

Secret requirements

  • Length: minimum 32 bytes (64+ recommended)
  • Source: cryptographic RNG (openssl rand -base64 48), not a passphrase
  • Rotation: rotate on compromise, personnel change, or at least annually
  • Scope: unique per environment (dev, staging, prod never share)

Generate and inject via environment:

    export KALAMDB_AUTH_JWT_SECRET="$(openssl rand -base64 48)"

    [auth]
    # Leave unset in the file when you set KALAMDB_AUTH_JWT_SECRET in the env.
    # jwt_secret = "..."
    access_token_expiry_hours = 1
    refresh_token_expiry_hours = 168  # 7 days

KalamDB refuses to start on non-loopback binds if the secret is empty, a known-weak placeholder, or shorter than 32 bytes.
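
A deploy pipeline can run a similar check before the server is ever started. The sketch below is illustrative only: check_jwt_secret and the placeholder list are hypothetical names, and KalamDB's own startup validation may differ in detail.

```python
# Hypothetical examples of weak placeholders; KalamDB's real list is not documented here.
WEAK_PLACEHOLDERS = {"changeme", "secret", "your-secret-key"}


def check_jwt_secret(secret: str) -> list:
    """Return a list of problems; an empty list means the secret passes."""
    if not secret:
        return ["secret is empty"]
    problems = []
    if secret.lower() in WEAK_PLACEHOLDERS:
        problems.append("secret is a known-weak placeholder")
    # Length is measured in bytes, matching the 32-byte minimum above.
    if len(secret.encode("utf-8")) < 32:
        problems.append("secret is shorter than 32 bytes")
    return problems
```

In a pipeline you might call check_jwt_secret(os.environ.get("KALAMDB_AUTH_JWT_SECRET", "")) and fail the deploy when the returned list is non-empty.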

Rotation procedure

  1. Generate a new secret.
  2. Deploy it to every node simultaneously.
  3. Restart the service on every node.
  4. All outstanding tokens will fail verification — clients must re-login.

Do not re-use an old secret. There is no “grace period” key ring; rotation is abrupt by design.
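
The abrupt cutover follows from how a shared-secret signature works. The sketch below assumes HS256-style HMAC signing, which this guide does not confirm for KalamDB; sign and verify are hypothetical helpers, not KalamDB APIs. It shows that a token signed under the old secret cannot verify under the new one.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used in compact JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign(payload: dict, secret: bytes) -> str:
    """Build a compact header.body.signature token (assumed HS256)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    mac = hmac.new(secret, f"{header}.{body}".encode("ascii"), hashlib.sha256)
    return f"{header}.{body}.{_b64url(mac.digest())}"


def verify(token: str, secret: bytes) -> bool:
    """Check only the signature; expiry and claims are out of scope here."""
    try:
        header, body, signature = token.split(".")
    except ValueError:
        return False
    mac = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256)
    expected = _b64url(mac.digest())
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())


old_secret = b"o" * 32
new_secret = b"n" * 32
token = sign({"sub": "alice"}, old_secret)
print(verify(token, old_secret), verify(token, new_secret))
```

Because verification depends on a single secret, there is no way for a node on the new secret to accept tokens from the old one, which is why every node must switch together.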

Where tokens are exposed

Surface                        | Carries token? | Notes
-------------------------------|----------------|-----------------------------------------------------------
Authorization: Bearer … header | Yes            | Preferred for server-to-server
Auth cookies                   | Yes            | Browser flows; always set Secure, HttpOnly, SameSite=Strict
WebSocket subprotocol          | Yes            | Sent once during upgrade; never in URL query
Logs                           | No             | Log redaction is enabled by default; keep it enabled

3. Admin Bootstrap & Setup

On first boot, KalamDB seeds a root user. What happens next depends on your configuration.

Option A: Root password from an environment variable

    export KALAMDB_ROOT_PASSWORD="$(openssl rand -base64 24)"
    # run the server once so the password is hashed and persisted
    # then store the password in your secret manager and unset the env var
    unset KALAMDB_ROOT_PASSWORD

Option B — Remote setup endpoint (one-time)

If you cannot set the env var, allow the setup endpoint for the first call only:

    [auth]
    allow_remote_setup = true  # flip back to false immediately after setup

Set allow_remote_setup = false and restart after you complete setup. Leaving it enabled lets anyone who can reach the endpoint attempt to seed credentials.

Option C — Localhost-only root

If you only administer via an SSH tunnel, leave the password empty. KalamDB will refuse remote logins for root and only accept localhost calls. Create named admin users via CREATE USER for remote access.


4. Password Policy

Minimum controls enforced by KalamDB:

  • Minimum length (configurable via auth.password_min_length)
  • Max length 72 bytes (bcrypt limit)
  • Bcrypt hash with cost 12
  • Rejected common-password list
  • Constant-time verification
  • Generic error messages (no user enumeration)
  • Account lockout after N failed attempts

Recommended runtime:

    [auth]
    password_min_length = 12
    max_failed_attempts = 5
    lockout_duration_minutes = 15

Push complexity requirements (e.g., character classes) up to your identity provider. Beyond the minimum length and the common-password list, KalamDB deliberately applies no further strength rules, so that long passphrases are not falsely rejected.
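
If you pre-validate passwords in your own signup or provisioning code, the length and common-password rules above can be mirrored roughly as follows. This is a sketch under assumptions: check_password and COMMON_PASSWORDS are hypothetical, the minimum is counted in characters and the maximum in bytes, and hashing is left to the server.

```python
from typing import Optional

# Hypothetical stand-in for a real common-password list.
COMMON_PASSWORDS = frozenset({"password1234", "qwertyuiop12", "123456789012"})


def check_password(password: str, min_length: int = 12) -> Optional[str]:
    """Return a rejection reason, or None when the password is acceptable."""
    if len(password) < min_length:
        return "too short"
    # bcrypt only considers the first 72 bytes, so longer input is refused
    # rather than silently truncated.
    if len(password.encode("utf-8")) > 72:
        return "longer than 72 bytes"
    if password.lower() in COMMON_PASSWORDS:
        return "common password"
    return None
```

Note that multi-byte characters count more than once toward the 72-byte limit, so a 40-character password can still be too long.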


5. CORS & WebSocket Origins

CORS for the browser admin UI

    [security.cors]
    allowed_origins = [
      "https://admin.example.com",
      "https://app.example.com",
    ]
    allow_credentials = true
    allowed_methods = ["GET", "POST", "OPTIONS"]
    allowed_headers = ["Authorization", "Content-Type"]
    max_age = 600

Hard rules:

  • Never combine allowed_origins = ["*"] with allow_credentials = true. Browsers reject it, and KalamDB will refuse to start with that combination.
  • Use scheme + host + port — origin matching is exact.
  • Never include localhost origins in production config.
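
The hard rules above are easy to lint before deployment. The following is a minimal sketch, assuming you have the CORS values available as plain Python values; validate_cors and origin_allowed are hypothetical helpers, not part of KalamDB.

```python
from urllib.parse import urlsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def validate_cors(allowed_origins: list, allow_credentials: bool) -> list:
    """Return human-readable violations of the hard rules; empty means OK."""
    errors = []
    if "*" in allowed_origins and allow_credentials:
        errors.append('"*" cannot be combined with allow_credentials = true')
    for origin in allowed_origins:
        if origin == "*":
            continue
        parts = urlsplit(origin)
        bare = (
            parts.scheme in ("http", "https")
            and parts.netloc
            and not (parts.path or parts.query or parts.fragment)
        )
        if not bare:
            errors.append(f"not a bare scheme://host[:port] origin: {origin}")
        elif parts.hostname in LOCAL_HOSTS:
            errors.append(f"localhost origin in production config: {origin}")
    return errors


def origin_allowed(origin: str, allowed_origins: list) -> bool:
    """Exact string match: scheme, host, and port must all agree."""
    return origin in allowed_origins
```

Exact matching means https://app.example.com and https://app.example.com:8443 are different origins, and a trailing slash is enough to make an allowlist entry never match.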

WebSocket origins

    [security]
    strict_ws_origin_check = true
    allowed_ws_origins = ["https://app.example.com"]

strict_ws_origin_check = true rejects connections that omit the Origin header (non-browser tooling must set it explicitly). Leave allowed_ws_origins empty only on loopback dev installs.


6. Rate Limiting & Connection Protection

Even with auth in place, rate limits protect against credential stuffing, SQL abuse, and connection exhaustion.

    [rate_limit]
    enable_connection_protection = true

    # Auth endpoints (login, refresh, setup, WS auth)
    max_auth_requests_per_ip_per_sec = 10

    # Per-IP concurrent connection cap
    max_connections_per_ip = 100

    # Pre-auth request flood protection
    max_requests_per_ip_per_sec = 200

    # Per-user query rate (SQL endpoint)
    max_queries_per_sec = 100

    # WebSocket message flood protection
    max_messages_per_sec = 50

    # Temporary ban window for abusive IPs
    ban_duration_seconds = 300
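
To build intuition for what a per-second limit such as max_auth_requests_per_ip_per_sec = 10 means in practice, here is a generic token-bucket sketch. This guide does not state which algorithm KalamDB uses internally, so treat the algorithm and the TokenBucket class as assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    rate: float        # tokens added per second
    capacity: float    # maximum burst size
    tokens: float = field(init=False)
    last: float = 0.0  # timestamp of the previous call, in seconds

    def __post_init__(self) -> None:
        self.tokens = self.capacity  # start with a full bucket

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per client IP: 10 requests/second with a burst of 10.
bucket = TokenBucket(rate=10, capacity=10)
burst = [bucket.allow(now=0.0) for _ in range(12)]
print(burst.count(True))  # requests admitted from an instantaneous burst of 12
```

A caller that exceeds the limit is refused until time passes and the bucket refills; a server-side ban window would sit on top of this.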

If you run behind a reverse proxy, configure security.trusted_proxy_ranges so that rate limits attribute requests to the real client IP via X-Forwarded-For. Do not set this to 0.0.0.0/0 — that lets any caller spoof any IP and bypass the limits.

    [security]
    trusted_proxy_ranges = ["10.0.0.0/8", "172.16.0.0/12"]
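
The reason 0.0.0.0/0 is dangerous becomes clear when you look at how a client IP is typically derived from X-Forwarded-For. The sketch below shows one common right-to-left resolution strategy; it is an assumption about the general technique, not KalamDB's actual code, and client_ip is a hypothetical helper.

```python
import ipaddress
from typing import Optional


def client_ip(peer_ip: str, x_forwarded_for: Optional[str], trusted_ranges: list) -> str:
    """Pick the address that rate limits should be attributed to."""
    networks = [ipaddress.ip_network(r) for r in trusted_ranges]

    def trusted(addr) -> bool:
        return any(addr in net for net in networks)

    # Ignore the header entirely unless the direct peer is a trusted proxy.
    if not x_forwarded_for or not trusted(ipaddress.ip_address(peer_ip)):
        return peer_ip

    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Walk from the hop nearest to us; the first untrusted hop is the client.
    for hop in reversed(hops):
        try:
            addr = ipaddress.ip_address(hop)
        except ValueError:
            return peer_ip  # malformed header: fall back to the socket peer
        if not trusted(addr):
            return str(addr)
    # Every hop was trusted: the leftmost entry is all that is left.
    return hops[0] if hops else peer_ip
```

With trusted_ranges = ["0.0.0.0/0"], every peer counts as a trusted proxy, so whatever address a caller writes into X-Forwarded-For is returned as the client IP.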

7. RBAC & Least Privilege

KalamDB has four roles. Use the lowest one that works.

Role    | Typical use                         | Can do
--------|-------------------------------------|-----------------------------------------
user    | App end-users, SDK clients          | Read/write their own namespaces, run DML
service | Backend services, pub/sub consumers | DML + topic consume/ack
dba     | Database administrators             | DDL, manage users except system
system  | Reserved for the server itself      | Everything (do not create new ones)

Rules of thumb:

  • Application tokens should be user or service, never dba.
  • Create one dba per human administrator; audit their usage.
  • EXECUTE AS USER follows a strict hierarchy (System → DBA → Service → User). A role can impersonate only roles below it, never a peer or a higher role.
  • Read and write access to system tables (system.*) is restricted to dba and system. This is enforced inside the query planner, including through subqueries, CTEs, and views.
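
The impersonation rule in the list above reduces to a strict ordering comparison. A minimal sketch, with Role and can_execute_as as hypothetical names chosen for illustration:

```python
from enum import IntEnum


class Role(IntEnum):
    # Higher value means more privilege: System → DBA → Service → User.
    USER = 1
    SERVICE = 2
    DBA = 3
    SYSTEM = 4


def can_execute_as(actor: Role, target: Role) -> bool:
    """An actor may impersonate only strictly lower roles."""
    return actor > target
```

Under this model a dba can act as a service or user account, but not as another dba, and a user can impersonate no one.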

On role demotion

When you demote or lock a user, rotate their tokens. KalamDB re-validates the DB role on each token refresh and invalidates tokens whose token_generation is older than the DB row — but already-issued access tokens remain valid until access_token_expiry_hours. Keep access-token TTLs short.
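
The gap between refresh-time and request-time checks can be modelled as below. This is a simplified sketch under assumptions: the class and function names are hypothetical, signature checks are omitted, and it assumes that demoting or locking a user bumps token_generation in the DB row.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenClaims:
    role: str
    token_generation: int
    expires_at: float  # Unix timestamp


@dataclass
class UserRow:
    role: str
    token_generation: int
    locked: bool


def access_token_accepted(claims: TokenClaims, now: float) -> bool:
    # Only expiry is checked here, so a demoted user's access token
    # keeps working until it expires.
    return now < claims.expires_at


def refresh(claims: TokenClaims, row: UserRow) -> Optional[str]:
    """Return the role for the new token, or None when refresh is refused."""
    if row.locked or claims.token_generation < row.token_generation:
        return None
    return row.role  # the role comes from the DB row, not the old token
```

In this model the worst-case exposure after a demotion is one access-token lifetime, which is the argument for keeping access_token_expiry_hours at 1 or lower.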


8. Request Size Guards & File Uploads

    [security]
    max_request_body_size = 10485760  # 10 MiB; tune to real payload size
    max_ws_message_size = 1048576     # 1 MiB

For file-accepting endpoints (multipart FILE("name") placeholders, exports), KalamDB:

  • Validates path components against an allowlist (alphanumeric, -, _, .)
  • Canonicalises paths and rejects symlink escape
  • Returns 403 to non-owners trying to download another user’s files or exports
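
If your own services handle file names before or after KalamDB does, the first two checks in the list above can be approximated like this. The sketch is an illustration of the general technique, not KalamDB's implementation; safe_resolve is a hypothetical helper and requires Python 3.9 or later.

```python
import re
from pathlib import Path

# Allowlist from the list above: alphanumeric, "-", "_", ".".
_COMPONENT = re.compile(r"[A-Za-z0-9._-]+")


def safe_resolve(base_dir: Path, relative: str) -> Path:
    """Resolve `relative` under `base_dir`, or raise ValueError."""
    parts = relative.split("/")
    for part in parts:
        if part in ("", ".", "..") or not _COMPONENT.fullmatch(part):
            raise ValueError(f"illegal path component: {part!r}")
    base = base_dir.resolve()
    # resolve() follows symlinks, so a link pointing outside the base
    # directory is caught by the containment check below.
    candidate = base.joinpath(*parts).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError("path escapes the base directory")
    return candidate
```

The component check stops ".." traversal and odd characters early, while the canonicalisation step catches escapes that only become visible once symlinks are followed.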

Operators should additionally:

  • Put an object-storage lifecycle on the exports bucket (e.g., delete after N days)
  • Scan uploaded files with your antivirus pipeline before re-serving them to users

9. Cluster RPC Hardening

For anything beyond a single-node loopback cluster:

    [rpc_tls]
    enabled = true
    require_client_cert = true
    ca_cert = "/etc/kalamdb/tls/cluster-ca.pem"
    server_cert = "/etc/kalamdb/tls/node-1.pem"
    server_key = "/etc/kalamdb/tls/node-1.key"

    [cluster]
    rpc_addr = "10.0.1.15:9090"  # private interface only

Notes:

  • The cluster RPC port carries Raft consensus, ForwardSql, Ping, and node-info RPCs. Treat it as being as sensitive as the data port.
  • ForwardSql additionally re-validates the caller’s Bearer JWT on the receiving node — mTLS is defense-in-depth, not the only layer.
  • Use a private CA unique to the cluster. Do not reuse your edge-TLS certificate chain for cluster mTLS.
  • Rotate node certificates independently of JWT secrets.

10. Logging, Audit & Observability

  • KalamDB emits structured JSON logs. SQL statements are redacted before logging; do not re-enable raw SQL in log formatters.
  • Enable the audit stream to capture user logins, user creation/modification, role changes, impersonation events, and admin export downloads.
  • Ship audit logs to an append-only sink (S3 with object lock, SIEM).
  • Expose Prometheus metrics on the internal network only — never on the public API port.

Suggested alerts

  • auth_failed_logins_per_minute > 20
  • auth_403_forbidden_per_minute > 50
  • user_role_change_events > 0 (human review on every change)
  • impersonation_events grouped by actor
  • Cluster RPC connection count deviation > 3σ
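
The last alert in the list above is a statistical threshold rather than a fixed number. One simple way to evaluate it, assuming you keep a window of recent connection counts, is sketched here; deviates is a hypothetical helper and the sample values are made up for illustration.

```python
from statistics import mean, pstdev


def deviates(history: list, current: float, sigmas: float = 3.0) -> bool:
    """True when `current` is more than `sigmas` std devs from the mean."""
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    mu = mean(history)
    sd = pstdev(history)
    if sd == 0:
        return current != mu  # flat baseline: any change is a deviation
    return abs(current - mu) > sigmas * sd


# Made-up per-minute connection counts for a small cluster.
recent = [10, 12, 11, 9, 10, 11, 10, 12]
print(deviates(recent, 11), deviates(recent, 40))
```

Most monitoring systems can express the same rule natively; the point is that the baseline comes from your own traffic, not from a constant.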

11. Incident Response Playbook

When credentials, a token secret, or a node are suspected compromised:

  1. Contain — rotate auth.jwt_secret. All outstanding tokens become invalid immediately.
  2. Lock down — set auth.allow_remote_setup = false, revoke compromised user accounts (ALTER USER … LOCK).
  3. Rotate — regenerate cluster mTLS material, redeploy, force clients to re-authenticate.
  4. Narrow — temporarily lower max_auth_requests_per_ip_per_sec and max_connections_per_ip.
  5. Preserve — snapshot logs, audit stream, and RocksDB backup before further changes.
  6. Review — audit system.users for unexpected role changes, system.jobs for unexpected admin jobs, exports directory for unexpected downloads.
  7. Post-mortem — document the breach path, fix the configuration gap, add a regression check.

12. High-Risk Misconfigurations To Avoid

KalamDB will refuse to start on several of these. The rest are human errors we see most often.

  • server.host = "0.0.0.0" with no edge proxy and default JWT secret
  • security.cors.allowed_origins = ["*"] combined with allow_credentials = true
  • security.strict_ws_origin_check = false on a public WebSocket endpoint
  • auth.allow_remote_setup = true left on after bootstrap
  • auth.jwt_secret shared across environments, or shorter than 32 bytes
  • security.trusted_proxy_ranges containing 0.0.0.0/0 or ::/0
  • Running a multi-node cluster with rpc_tls.enabled = false
  • Handing out dba or system role tokens to applications
  • Storing secrets in server.toml committed to Git
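
Several of the items above can be caught mechanically before a config file ships. The sketch below lints an already-parsed config dictionary; find_misconfigurations is a hypothetical helper, it only flags values that are set explicitly (server defaults are not documented here), and it does not replace KalamDB's own startup checks.

```python
import ipaddress


def find_misconfigurations(cfg: dict) -> list:
    """Return findings for the high-risk settings listed above."""
    findings = []
    auth = cfg.get("auth", {})
    security = cfg.get("security", {})
    cors = security.get("cors", {})

    if "*" in cors.get("allowed_origins", []) and cors.get("allow_credentials"):
        findings.append("CORS wildcard combined with allow_credentials")
    if security.get("strict_ws_origin_check") is False:
        findings.append("strict_ws_origin_check is disabled")
    if auth.get("allow_remote_setup"):
        findings.append("allow_remote_setup is still enabled")

    secret = auth.get("jwt_secret")
    if secret is not None:
        findings.append("jwt_secret is stored in the config file")
        if len(secret.encode("utf-8")) < 32:
            findings.append("jwt_secret is shorter than 32 bytes")

    for cidr in security.get("trusted_proxy_ranges", []):
        try:
            if ipaddress.ip_network(cidr, strict=False).prefixlen == 0:
                findings.append(f"trusted_proxy_ranges trusts everyone: {cidr}")
        except ValueError:
            findings.append(f"trusted_proxy_ranges entry is not a CIDR: {cidr}")

    rpc_addr = cfg.get("cluster", {}).get("rpc_addr")
    if rpc_addr:
        host = rpc_addr.rsplit(":", 1)[0].strip("[]")
        try:
            loopback = ipaddress.ip_address(host).is_loopback
        except ValueError:
            loopback = host == "localhost"
        if not loopback and not cfg.get("rpc_tls", {}).get("enabled"):
            findings.append("non-loopback cluster RPC without rpc_tls")
    return findings
```

On Python 3.11 or later, the dictionary can be produced with tomllib.load() on the opened server.toml; role assignments and secrets committed to Git still need a human or a separate scanner.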

Baseline Production server.toml

Drop-in starting point. Fill in real values, inject secrets via env.

    [server]
    host = "127.0.0.1"
    port = 8080
    enable_http2 = true

    [auth]
    # jwt_secret intentionally omitted; supplied via KALAMDB_AUTH_JWT_SECRET
    allow_remote_setup = false
    cookie_secure = true
    access_token_expiry_hours = 1
    refresh_token_expiry_hours = 168
    password_min_length = 12
    max_failed_attempts = 5
    lockout_duration_minutes = 15

    [rate_limit]
    enable_connection_protection = true
    max_auth_requests_per_ip_per_sec = 10
    max_connections_per_ip = 100
    max_requests_per_ip_per_sec = 200
    max_queries_per_sec = 100
    max_messages_per_sec = 50
    ban_duration_seconds = 300

    [security]
    max_request_body_size = 10485760
    max_ws_message_size = 1048576
    strict_ws_origin_check = true
    allowed_ws_origins = ["https://app.example.com"]
    trusted_proxy_ranges = ["10.0.0.0/8"]

    [security.cors]
    allowed_origins = ["https://admin.example.com", "https://app.example.com"]
    allow_credentials = true
    allowed_methods = ["GET", "POST", "OPTIONS"]
    allowed_headers = ["Authorization", "Content-Type"]
    max_age = 600

    [rpc_tls]
    enabled = true
    require_client_cert = true
    ca_cert = "/etc/kalamdb/tls/cluster-ca.pem"
    server_cert = "/etc/kalamdb/tls/node.pem"
    server_key = "/etc/kalamdb/tls/node.key"

    [cluster]
    rpc_addr = "10.0.1.15:9090"
