Skip to Content
SecurityProduction Checklist

Running KalamDB in Production

This guide walks you through every setting, secret, and network boundary you need to lock down before exposing a KalamDB server to the public internet or an untrusted network.

Work through the checklist top-to-bottom. Each section is independently actionable — you can turn a single item on and restart the server between steps.

Threat model assumption: attackers can reach your HTTP and cluster RPC ports. Your defenses must not rely on network isolation alone.


Quick Checklist

Copy this into your deploy runbook. Every box must be ticked before you open the port.

Network & TLS

  • server.host = "127.0.0.1" (or explicitly a private interface); never 0.0.0.0 unless you need remote access
  • Terminate TLS at an edge proxy (nginx, Caddy, ALB, Cloudflare) — KalamDB itself serves plain HTTP
  • HTTP server sits behind a firewall/security-group that only allows the edge proxy
  • Cluster RPC port (cluster.rpc_addr) is not reachable from the public internet
  • For multi-node clusters: rpc_tls.enabled = true with require_client_cert = true

Secrets

  • auth.jwt_secret is at least 32 bytes, random, unique per environment
  • JWT secret is injected via env var or secret manager, never committed to source
  • Root/admin password is changed from any default and stored in a secret manager
  • KALAMDB_ROOT_PASSWORD env var is cleared from the shell history after seeding
  • OAuth/OIDC client secrets are in env vars, not the config file

Auth & RBAC

  • auth.allow_remote_setup = false after first-run bootstrap
  • Every human user has a named account, not the built-in root
  • Service accounts use the service role, never dba/system
  • Account lockout is enabled (auth.max_failed_attempts, auth.lockout_duration_minutes)
  • Access token expiry ≤ 1 hour; refresh token expiry ≤ 7 days

Origins & Cookies

  • security.cors.allowed_origins is an explicit allowlist — no "*"
  • security.cors.allow_credentials = true only when origins are an explicit allowlist
  • security.strict_ws_origin_check = true
  • Auth cookies served over HTTPS: auth.cookie_secure = true, SameSite=Strict, HttpOnly

Abuse Controls

  • rate_limit.enable_connection_protection = true
  • rate_limit.max_auth_requests_per_ip_per_sec tuned (start at 10–20)
  • rate_limit.max_connections_per_ip tuned (start at 100)
  • security.max_request_body_size tuned to your real upload size
  • security.max_ws_message_size tuned to your real message size

Observability

  • Audit logs shipped to a separate write-only sink (e.g., SIEM, S3 with object lock)
  • Metrics endpoint exposed only to the internal monitoring network
  • Alerts configured on: failed logins/min, 401/403 rate, new user creation, role changes

1. Bind Addresses & TLS

KalamDB does not terminate TLS itself. Run it behind a reverse proxy that provides HTTPS and forwards to KalamDB on localhost.

plaintext snippetplaintext
[ client ] ──HTTPS──► [ nginx/Caddy/ALB ] ──HTTP(loopback)──► [ kalamdb :2900 ]                                                              [ kalamdb :2910 cluster mTLS ]

Config

toml snippetTOML
[server]host = "127.0.0.1"     # loopback only; proxy handles remote clientsport = 2900 [cluster]# Bind cluster RPC to the private cluster network interface only.rpc_addr = "10.0.1.15:2910" [rpc_tls]enabled = truerequire_client_cert = trueca_cert = "/etc/kalamdb/tls/cluster-ca.pem"server_cert = "/etc/kalamdb/tls/node.pem"server_key = "/etc/kalamdb/tls/node.key"

If your cluster spans more than loopback nodes, KalamDB will refuse to start without rpc_tls.enabled = true. Do not work around this check.


2. JWT Secrets & Token Hygiene

Secret requirements

  • Length: minimum 32 bytes (64+ recommended)
  • Source: cryptographic RNG (openssl rand -base64 48), not a passphrase
  • Rotation: rotate on compromise, personnel change, or at least annually
  • Scope: unique per environment (dev, staging, prod never share)

Generate and inject via environment:

bash snippetBASH
export KALAMDB_AUTH_JWT_SECRET="$(openssl rand -base64 48)"
toml snippetTOML
[auth]# Leave unset in the file when you set KALAMDB_AUTH_JWT_SECRET in the env.# jwt_secret = "..."access_token_expiry_hours = 1refresh_token_expiry_hours = 168   # 7 days

KalamDB refuses to start on non-loopback binds if the secret is empty, a known-weak placeholder, or shorter than 32 bytes.

Rotation procedure

  1. Generate a new secret.
  2. Deploy it to every node simultaneously.
  3. Bounce the service.
  4. All outstanding tokens will fail verification — clients must re-login.

Do not re-use an old secret. There is no “grace period” key ring; rotation is abrupt by design.

Where tokens are exposed

SurfaceCarries token?Notes
Authorization: Bearer … headerYesPreferred for server-to-server
Auth cookiesYesBrowser flows; always set Secure, HttpOnly, SameSite=Strict
WebSocket subprotocolYesSent once during upgrade; never in URL query
LogsNoLog redaction is enabled by default; keep it enabled

3. Admin Bootstrap & Setup

On first boot, KalamDB seeds a root user. What happens next depends on your configuration.

bash snippetBASH
export KALAMDB_ROOT_PASSWORD="$(openssl rand -base64 24)"# run the server once so the password is hashed and persisted# then unset the env var and store the password in your secret managerunset KALAMDB_ROOT_PASSWORD

Option B — Remote setup endpoint (one-time)

If you cannot set the env var, allow the setup endpoint for the first call only:

toml snippetTOML
[auth]allow_remote_setup = true   # flip back to false immediately after setup

Set allow_remote_setup = false and restart after you complete setup. Leaving it enabled lets anyone who can reach the endpoint attempt to seed credentials.

Option C — Localhost-only root

If you only administer via an SSH tunnel, leave the password empty. KalamDB will refuse remote logins for root and only accept localhost calls. Create named admin users via CREATE USER for remote access.


4. Password Policy

Minimum controls enforced by KalamDB:

  • Minimum length (configurable via auth.password_min_length)
  • Max length 72 bytes (bcrypt limit)
  • Bcrypt hash with cost 12
  • Rejected common-password list
  • Constant-time verification
  • Generic error messages (no user enumeration)
  • Account lockout after N failed attempts

Recommended runtime:

toml snippetTOML
[auth]password_min_length = 12max_failed_attempts = 5lockout_duration_minutes = 15

Push complexity requirements (e.g., character classes) up to your identity provider — KalamDB deliberately does not reject non-common weak passwords beyond length to avoid false rejections of passphrases.


5. CORS & WebSocket Origins

CORS for the browser admin UI

toml snippetTOML
[security.cors]allowed_origins = [    "https://admin.example.com",    "https://app.example.com",]allow_credentials = trueallowed_methods  = ["GET", "POST", "OPTIONS"]allowed_headers  = ["Authorization", "Content-Type"]max_age          = 600

Hard rules:

  • Never combine allowed_origins = ["*"] with allow_credentials = true. Browsers reject it, and KalamDB will refuse to start with that combination.
  • Use scheme + host + port — origin matching is exact.
  • Never include localhost origins in production config.

WebSocket origins

toml snippetTOML
[security]strict_ws_origin_check = trueallowed_ws_origins = ["https://app.example.com"]

strict_ws_origin_check = true rejects connections that omit the Origin header (non-browser tooling must set it explicitly). Leave empty allowed_ws_origins only on loopback dev installs.


6. Rate Limiting & Connection Protection

Even with auth in place, rate limits protect against credential stuffing, SQL abuse, and connection exhaustion.

toml snippetTOML
[rate_limit]enable_connection_protection = true # Auth endpoints (login, refresh, setup, WS auth)max_auth_requests_per_ip_per_sec = 10 # Per-IP concurrent connection capmax_connections_per_ip = 100 # Pre-auth request flood protectionmax_requests_per_ip_per_sec = 200 # Per-user query rate (SQL endpoint)max_queries_per_sec = 100 # WebSocket message flood protectionmax_messages_per_sec = 50 # Temporary ban window for abusive IPsban_duration_seconds = 300

If you run behind a reverse proxy, configure security.trusted_proxy_ranges so that rate limits attribute requests to the real client IP via X-Forwarded-For. Do not set this to 0.0.0.0/0 — that lets any caller spoof any IP and bypass the limits.

toml snippetTOML
[security]trusted_proxy_ranges = ["10.0.0.0/8", "172.16.0.0/12"]

7. RBAC & Least Privilege

KalamDB has four roles. Use the lowest one that works.

RoleTypical useCan do
userApp end-users, SDK clientsRead/write their own namespaces, run DML
serviceBackend services, pub/sub consumersDML + topic consume/ack
dbaDatabase administratorsDDL, manage users except system
systemReserved for the server itselfEverything (do not create new ones)

Rules of thumb:

  • Application tokens should be user or service, never dba.
  • Create one dba per human administrator; audit their usage.
  • EXECUTE AS '<user_id>' follows the role hierarchy: system can target system/DBA/service/user, DBA can target DBA/service/user, service can target service/user, and regular users can only target themselves.
  • System tables (system.*) are read/write restricted to dba and system. This is enforced inside the query planner, including through subqueries, CTEs, and views.

On role demotion

When you demote or lock a user, rotate their tokens. KalamDB re-validates the DB role on each token refresh and invalidates tokens whose token_generation is older than the DB row — but already-issued access tokens remain valid until access_token_expiry_hours. Keep access-token TTLs short.


8. Request Size Guards & File Uploads

toml snippetTOML
[security]max_request_body_size = 10485760   # 10 MiB; tune to real payload sizemax_ws_message_size   = 1048576    # 1 MiB

For file-accepting endpoints (multipart FILE("name") placeholders, exports), KalamDB:

  • Validates path components against an allowlist (alphanumeric, -, _, .)
  • Canonicalises paths and rejects symlink escape
  • Returns 403 to non-owners trying to download another user’s files or exports

Operators should additionally:

  • Put an object-storage lifecycle on the exports bucket (e.g., delete after N days)
  • Scan uploaded files with your AV/virus pipeline before re-serving them to users

9. Cluster RPC Hardening

For anything beyond a single-node loopback cluster:

toml snippetTOML
[rpc_tls]enabled = truerequire_client_cert = trueca_cert = "/etc/kalamdb/tls/cluster-ca.pem"server_cert = "/etc/kalamdb/tls/node-1.pem"server_key = "/etc/kalamdb/tls/node-1.key" [cluster]rpc_addr = "10.0.1.15:2910"   # private interface only

Notes:

  • The cluster RPC port carries Raft consensus, ForwardSql, Ping, and node-info RPCs. Treat it as equally sensitive to the data port.
  • ForwardSql additionally re-validates the caller’s Bearer JWT on the receiving node — mTLS is defense-in-depth, not the only layer.
  • Use a private CA unique to the cluster. Do not reuse your edge-TLS certificate chain for cluster mTLS.
  • Rotate node certificates independently of JWT secrets.

10. Logging, Audit & Observability

  • KalamDB emits structured JSON logs. SQL statements are redacted before logging; do not re-enable raw SQL in log formatters.
  • Enable the audit stream to capture user logins, user creation/modification, role changes, impersonation events, and admin export downloads.
  • Ship audit logs to an append-only sink (S3 with object lock, SIEM).
  • Expose Prometheus metrics on the internal network only — never on the public API port.

Suggested alerts

  • auth_failed_logins_per_minute > 20
  • auth_403_forbidden_per_minute > 50
  • user_role_change_events > 0 (human review on every change)
  • impersonation_events grouped by actor
  • Cluster RPC connection count deviation > 3σ

11. Incident Response Playbook

When credentials, a token secret, or a node are suspected compromised:

  1. Contain — rotate auth.jwt_secret. All outstanding tokens become invalid immediately.
  2. Lock down — set auth.allow_remote_setup = false, revoke compromised user accounts (ALTER USER … LOCK).
  3. Rotate — regenerate cluster mTLS material, redeploy, force clients to re-authenticate.
  4. Narrow — temporarily lower max_auth_requests_per_ip_per_sec and max_connections_per_ip.
  5. Preserve — snapshot logs, audit stream, and RocksDB backup before further changes.
  6. Review — audit system.users for unexpected role changes, system.jobs for unexpected admin jobs, exports directory for unexpected downloads.
  7. Post-mortem — document the breach path, fix the configuration gap, add a regression check.

12. High-Risk Misconfigurations To Avoid

KalamDB will refuse to start on several of these. The rest are human errors we see most often.

  • server.host = "0.0.0.0" with no edge proxy and default JWT secret
  • security.cors.allowed_origins = ["*"] combined with allow_credentials = true
  • security.strict_ws_origin_check = false on a public WebSocket endpoint
  • auth.allow_remote_setup = true left on after bootstrap
  • auth.jwt_secret shared across environments, or shorter than 32 bytes
  • security.trusted_proxy_ranges containing 0.0.0.0/0 or ::/0
  • Running a multi-node cluster with rpc_tls.enabled = false
  • Handing out dba or system role tokens to applications
  • Storing secrets in server.toml committed to Git

Baseline Production server.toml

Drop-in starting point. Fill in real values, inject secrets via env.

toml snippetTOML
[server]host = "127.0.0.1"port = 2900enable_http2 = true [auth]# jwt_secret intentionally omitted — supplied via KALAMDB_AUTH_JWT_SECRETallow_remote_setup = falsecookie_secure = trueaccess_token_expiry_hours = 1refresh_token_expiry_hours = 168password_min_length = 12max_failed_attempts = 5lockout_duration_minutes = 15 [rate_limit]enable_connection_protection = truemax_auth_requests_per_ip_per_sec = 10max_connections_per_ip = 100max_requests_per_ip_per_sec = 200max_queries_per_sec = 100max_messages_per_sec = 50ban_duration_seconds = 300 [security]max_request_body_size = 10485760max_ws_message_size = 1048576strict_ws_origin_check = trueallowed_ws_origins = ["https://app.example.com"]trusted_proxy_ranges = ["10.0.0.0/8"] [security.cors]allowed_origins = ["https://admin.example.com", "https://app.example.com"]allow_credentials = trueallowed_methods = ["GET", "POST", "OPTIONS"]allowed_headers = ["Authorization", "Content-Type"]max_age = 600 [rpc_tls]enabled = truerequire_client_cert = trueca_cert = "/etc/kalamdb/tls/cluster-ca.pem"server_cert = "/etc/kalamdb/tls/node.pem"server_key = "/etc/kalamdb/tls/node.key" [cluster]rpc_addr = "10.0.1.15:2910"

Last updated on