Architecture Overview

KalamDB combines SQL execution, realtime delivery, and multi-tier storage in one runtime.

It also supports vector search with EMBEDDING(n) columns, cosine indexes, and similarity ranking in SQL for semantic retrieval workflows.
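To make "similarity ranking" concrete, here is a minimal sketch of the cosine scoring that such a ranking relies on. It illustrates the math only and is not KalamDB's SQL syntax or index implementation; the function names, row ids, and sample vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, rows, top_k=3):
    """Return the top_k (row_id, score) pairs, most similar first."""
    scored = [(row_id, cosine_similarity(query, emb)) for row_id, emb in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical 3-dimensional embeddings, as an EMBEDDING(3) column might hold.
rows = [
    ("doc-a", [1.0, 0.0, 0.0]),
    ("doc-b", [0.0, 1.0, 0.0]),
    ("doc-c", [0.7, 0.7, 0.0]),
]
```

Calling `rank_by_similarity([1.0, 0.0, 0.0], rows, top_k=2)` ranks `doc-a` first and `doc-c` second. A cosine index exists to avoid the full scan this sketch performs.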

Use this page as the map. The deeper architecture pages explain one boundary at a time:

  • /docs/server/architecture/table-types
  • /docs/server/architecture/storage-tiers
  • /docs/server/architecture/stream-storage
  • /docs/server/architecture/manifests
  • /docs/server/architecture/live-query
  • /docs/server/architecture/clustering
  • /docs/server/architecture/datatypes
  • /docs/server/architecture/vector-search
  • /docs/server/architecture/file-upload-datatype

Major Runtime Layers

  • kalamdb-api: HTTP + WebSocket surface (/v1/api/*, /v1/ws)
  • kalamdb-core: orchestration, SQL execution flow, authorization checks, job dispatch
  • kalamdb-sql: SQL parser/classifier/extensions (SUBSCRIBE, TOPIC, STORAGE, EXECUTE AS '<user_id>', etc.)
  • kalamdb-store + RocksDB: hot write/read path
  • kalamdb-filestore + Parquet: cold segments and manifests
  • kalamdb-raft: cluster consensus and replication

Runtime Flow

Architecture Reading Map

  • Read /docs/server/architecture/table-types first when you are deciding data ownership or security.
  • Read /docs/server/architecture/storage-tiers and /docs/server/architecture/manifests when you are tuning flush, compaction, or storage templates.
  • Read /docs/server/architecture/stream-storage when you are tuning STREAM table retention, transient UI state, or TTL cleanup behavior.
  • Read /docs/server/architecture/live-query and /docs/server/architecture/clustering when you are scaling realtime connections or multi-node deployments.

Data Directory Layout

Default paths are rooted at storage.data_path (usually ./data):

data/
├── rocksdb/      # hot tier
├── storage/      # cold parquet tier
├── snapshots/    # raft snapshots
└── streams/      # stream table logs
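For tooling that needs to locate each tier on disk (backup scripts, disk-usage checks), the layout above can be sketched as a small path helper. The subdirectory names come from the layout; the function name and dictionary keys are illustrative and not part of KalamDB.

```python
from pathlib import Path

# Subdirectory names are taken from the layout above.
# The dictionary keys are labels invented for this sketch.
TIER_DIRS = {
    "hot": "rocksdb",
    "cold": "storage",
    "raft_snapshots": "snapshots",
    "streams": "streams",
}

def tier_paths(data_path="./data"):
    """Map each tier label to its directory under storage.data_path."""
    root = Path(data_path)
    return {label: root / subdir for label, subdir in TIER_DIRS.items()}
```

For example, `tier_paths("/var/lib/kalamdb")["cold"]` yields `/var/lib/kalamdb/storage`.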

Table-specific cold paths are generated from templates:

  • shared tables: storage.shared_tables_template
  • user tables: storage.user_tables_template
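As a rough illustration of how a path template could expand into a table-specific cold path, the sketch below fills named placeholders in a template string. The placeholder names (`namespace`, `table`, `user_id`) and the template values are assumptions made for this example; consult the storage-tiers page for the actual template syntax and defaults.

```python
def render_cold_path(template, **fields):
    """Fill a path template; raises KeyError if a placeholder has no value."""
    return template.format(**fields)

# Hypothetical template strings. The real defaults and placeholder
# names used by KalamDB may differ.
shared_template = "{namespace}/{table}"
user_template = "{namespace}/{table}/{user_id}"
```

Under these assumptions, `render_cold_path(user_template, namespace="app", table="messages", user_id="u42")` produces `app/messages/u42`, which shows why a user-table template needs one more component than a shared-table template: each user's rows land in a separate directory.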

Query Path (Simplified)

  1. API receives SQL (POST /v1/api/sql) with bearer auth.
  2. SQL is parsed/classified (standard + KalamDB extensions).
  3. Authorization and table-type rules are applied.
  4. Query executes against hot tier, cold tier, or both.
  5. Response returns normalized JSON schema/rows.
  6. Optional change notifications flow to subscriptions/topics.
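Step 1 of the path above can be sketched from the client side. The endpoint path and bearer auth come from this page; the JSON body shape (`{"sql": ...}`), the function name, and the host and port in the usage note are assumptions for illustration.

```python
import json
import urllib.request

def build_sql_request(base_url, token, sql):
    """Build (but do not send) a POST /v1/api/sql request with bearer auth."""
    # Assumed body shape: a JSON object with a single "sql" field.
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/api/sql",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the request to `urllib.request.urlopen` and decode the JSON response, for example `build_sql_request("http://localhost:8080", token, "SELECT 1")` against a hypothetical local server. Steps 2 through 6 happen server-side and are not visible to the client beyond the response body.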

For detailed flush behavior and cold-tier movement, see /docs/server/architecture/storage-tiers. For the file-backed STREAM table path, see /docs/server/architecture/stream-storage.

Technology Stack

Component        Technology               Notes
Language         Rust 1.92+               concurrency + memory safety
Query engine     Apache DataFusion 52.x   SQL planning/execution
Columnar format  Apache Arrow 57.x        in-memory batches
Cold storage     Apache Parquet 57.x      compressed columnar files
Hot storage      RocksDB 0.24             write-heavy low-latency path
API runtime      Actix Web 4.12           HTTP + WebSocket
Auth             bcrypt + JWT             password + token flows
