Built-in Functions
KalamDB provides built-in functions for ID generation, user context, and timestamps.
KalamDB query execution also supports DataFusion SQL functions. For advanced query-time function usage, see:
- DataFusion SQL Functions Overview
- DataFusion Scalar Functions
- DataFusion Aggregate Functions
- DataFusion Window Functions
The functions on this page are KalamDB-specific functions you will use most often.
ID Generation
SNOWFLAKE_ID()
Generate a unique 64-bit snowflake ID (time-ordered, globally unique):
Commonly used as the default for primary keys:
UUID_V7()
Generate a UUIDv7 (time-ordered UUID):
ULID()
Generate a ULID (Universally Unique Lexicographically Sortable Identifier):
Context Functions
CURRENT_USER()
Returns the username of the currently authenticated user:
Useful for audit columns:
Date & Time
NOW()
Returns the current timestamp:
Commonly used as a default for timestamp columns:
Vector Search
COSINE_DISTANCE()
Use COSINE_DISTANCE(embedding_column, '[...]') to rank rows by vector similarity.
Typical flow:
- Store embeddings in an
EMBEDDING(n)column. - Create a cosine index with
ALTER TABLE ... CREATE INDEX ... USING COSINE. - Use
ORDER BY COSINE_DISTANCE(...) LIMIT kto fetch nearest matches.
Notes:
- The query vector must match the column dimension.
- Lower distance means a closer match.
- This is the main SQL entry point for vector search in KalamDB today.
JSON Operators and Functions
KalamDB registers DataFusion’s JSON function package in every SQL session. That gives JSON columns the common PostgreSQL-style query syntax most users expect for extraction and existence checks.
Supported PostgreSQL-style operators in KalamDB SQL:
| Syntax | Equivalent helper | Result |
|---|---|---|
json_col -> 'key' or json_col -> 0 | json_get(...) | JSON value |
json_col ->> 'key' or json_col ->> 0 | json_as_text(...) | text |
json_col ? 'key' | json_contains(...) | BOOLEAN |
Available JSON helpers from the registered DataFusion package:
| Function | Use |
|---|---|
json_get | Return a JSON value at a path |
json_as_text | Return a value at a path as text |
json_get_json | Return nested JSON as raw JSON text |
json_get_str | Return a string value |
json_get_int | Return an integer value |
json_get_float | Return a floating-point value |
json_get_bool | Return a boolean value |
json_get_array | Return an array value |
json_contains | Test whether a key or index exists |
json_length | Return the length of an array or object |
json_object_keys | Return the keys of a JSON object as an array |
json_from_scalar | Convert a scalar into a JSON value for composed expressions |
Scope note:
- KalamDB currently plans PostgreSQL-style
->,->>, and?operators directly. - Do not assume full PostgreSQL
jsonboperator parity for operators such as#>,#>>, or@>yet.
Function Summary
| Function | Return Type | Description |
|---|---|---|
SNOWFLAKE_ID() | BIGINT | Time-ordered 64-bit unique ID |
UUID_V7() | TEXT | Time-ordered UUID v7 |
ULID() | TEXT | Lexicographically sortable unique ID |
CURRENT_USER() | TEXT | Authenticated username |
NOW() | TIMESTAMP | Current server timestamp |
COSINE_DISTANCE(vector, query) | DOUBLE | Distance for vector similarity ranking |
DataFusion Function Support
KalamDB supports DataFusion SQL function execution for SELECT/query workloads, including scalar, aggregate, and window functions documented in DataFusion.
If you need advanced expression behavior beyond KalamDB-specific built-ins, use the DataFusion function references linked above as the canonical guide.
For end-to-end schema and index examples, see /docs/server/architecture/vector-search.