Skip to Content

PostgreSQL Extension Architecture

The KalamDB PostgreSQL extension currently operates in remote mode: PostgreSQL hosts the extension entrypoints and relation bridge, while the actual table definition and execution live in a running KalamDB server.

Main Components

  1. PostgreSQL 16 session: owns SQL parsing, planning, and the visible PostgreSQL-side table surface.
  2. pg_kalam extension: registers the bridge layer, DDL interception hook, helper functions, and the kalam.user_id session setting.
  3. Foreign server definition: stores the remote connection settings such as host, port, TLS material, timeout, and optional auth header.
  4. KalamDB server: receives gRPC scan and mutation requests plus mirrored SQL DDL.

Read Path

For SELECT statements:

  1. PostgreSQL plans the extension-managed relation scan.
  2. pg_kalam resolves the relation identity and the foreign server connection settings.
  3. The extension reads the current kalam.user_id value, if present.
  4. A remote scan request is sent to KalamDB.
  5. The returned Arrow batches are converted into PostgreSQL tuples.

The extension reads and writes through PostgreSQL relations that were created from CREATE TABLE ... USING kalamdb.

Write Path

For INSERT, UPDATE, and DELETE statements:

  1. PostgreSQL routes the operation through FDW modify callbacks.
  2. pg_kalam resolves the target table and tenant context.
  3. Rows are converted from PostgreSQL datums into KalamDB scalar values, while _seq stays read-only and _userid is treated as an insert-only tenant override.
  4. Mutations are sent to KalamDB over the remote executor.

Single-row inserts are buffered so PostgreSQL transactions can batch work more efficiently. Pending writes are flushed before scans and before transaction completion so read-your-writes behavior stays predictable inside the PostgreSQL session.

Explicit Transaction Flow

When a PostgreSQL session opens an explicit transaction with BEGIN or START TRANSACTION, the extension binds staged KalamDB writes to that pg session until COMMIT or ROLLBACK.

That transaction state is observable through two server views:

  • system.sessions for the live pg bridge session and its current remote transaction metadata.
  • system.transactions for the active explicit transaction itself alongside other active transaction origins.

This makes PostgreSQL transaction behavior visible without materializing extra persisted bookkeeping on the server.

DDL Forwarding Path

When you run CREATE TABLE ... USING kalamdb, the extension intercepts the statement, mirrors the user-column DDL into KalamDB, and injects the visible system columns back into the local PostgreSQL schema. The flow is:

  1. resolve the PostgreSQL schema and table name as the KalamDB namespace and table name,
  2. create the namespace in KalamDB if needed,
  3. create the KalamDB table from the user-defined columns and forwarded table options,
  4. add _seq to every local PostgreSQL table surface and _userid to user tables,
  5. reject explicit declarations of _seq, _userid, and _deleted in the original CREATE TABLE ... USING kalamdb statement.

This is the public authoring flow for the extension. PostgreSQL WITH (...) options are forwarded to KalamDB as the mirrored table options.

Direct Statement Passthrough

The extension also exposes a direct SQL bridge:

SELECT kalam_exec('SELECT * FROM system.tables');

Use kalam_exec(...) when you want to send a statement directly to KalamDB without wrapping it in PostgreSQL syntax first. This is the simplest path for server-native DDL, system queries, or commands that do not need PostgreSQL-side relation mapping.

Namespace and Schema Mapping

KalamDB namespaces map naturally to PostgreSQL schemas.

  • The PostgreSQL schema becomes the KalamDB namespace.
  • The PostgreSQL relation name becomes the KalamDB table name.

That lets you organize PostgreSQL schemas and KalamDB namespaces with the same naming model.

Tenant Context

kalam.user_id is a PostgreSQL session setting registered by the extension.

SET kalam.user_id = 'user-alice'; SELECT kalam_user_id();

The session value is attached to remote requests and is the main way user-scoped reads, updates, and deletes are isolated. On INSERT, a visible _userid column on user tables can override the session value for that inserted row batch.

Data Definition Details

The extension preserves KalamDB-focused DDL details where possible:

  • PRIMARY KEY and DEFAULT SNOWFLAKE_ID() are kept in the KalamDB definition.
  • PostgreSQL keeps a relation surface that matches the forwarded KalamDB table definition closely enough for normal DML while exposing the extension-managed system columns locally.
  • SERIAL, BIGSERIAL, SMALLSERIAL, and GENERATED ... AS IDENTITY are normalized into explicit integer columns with DEFAULT SNOWFLAKE_ID() before the mirrored KalamDB DDL is sent.
  • _seq is exposed as a read-only projected column and is not treated as a writable PostgreSQL application field.
Last updated on