Data Types
This page is a practical guide for using data types in SQL when building with KalamDB.
Type Selection Flow
Quick Type Selection
Use this table when deciding a column type:
| You store… | Use | Why |
|---|---|---|
IDs from SNOWFLAKE_ID() | BIGINT | 64-bit integer ID generation |
| Small whole numbers (status, counters) | INT | Compact and fast |
| Large counters | BIGINT | Higher range |
| User-facing text | TEXT | Flexible string storage |
| True/false values | BOOLEAN | Clear semantics |
| Decimal money-like values | DECIMAL(p,s) | Exact precision |
| Timestamps | TIMESTAMP | Time-based queries and ordering |
| Dates only | DATE | Calendar values without time |
| Structured document payload | JSON | Nested object/array data |
| Vector embeddings | EMBEDDING(n) | Semantic retrieval and similarity search |
Common Types in Practice
Integer Types
CREATE TABLE app.orders ( id BIGINT PRIMARY KEY DEFAULT SNOWFLAKE_ID(), user_id BIGINT NOT NULL, item_count INT NOT NULL DEFAULT 1) WITH (TYPE = 'USER');- Prefer
INTfor bounded counts. - Prefer
BIGINTfor IDs and values that can grow large.
Text and Boolean
CREATE TABLE app.users ( id BIGINT PRIMARY KEY DEFAULT SNOWFLAKE_ID(), email TEXT NOT NULL, display_name TEXT, is_active BOOLEAN NOT NULL DEFAULT TRUE) WITH (TYPE = 'USER');- Use
TEXTfor variable-length strings. - Use
BOOLEANinstead of0/1integers for feature flags and state.
Decimal Values
CREATE TABLE billing.invoice_items ( id BIGINT PRIMARY KEY DEFAULT SNOWFLAKE_ID(), sku TEXT NOT NULL, amount DECIMAL(12,2) NOT NULL, tax_rate DECIMAL(5,4) NOT NULL DEFAULT 0.0000) WITH (TYPE = 'SHARED');- Use
DECIMALfor exact arithmetic (pricing, balances, rates). - Pick precision/scale deliberately, e.g.
DECIMAL(12,2).
Time Types
CREATE TABLE chat.messages ( id BIGINT PRIMARY KEY DEFAULT SNOWFLAKE_ID(), user_id BIGINT NOT NULL, content TEXT NOT NULL, created_at TIMESTAMP NOT NULL DEFAULT NOW()) WITH (TYPE = 'USER'); SELECT *FROM chat.messagesWHERE created_at >= NOW() - INTERVAL '7 days'ORDER BY created_at DESC;- Use
TIMESTAMPfor event records and ordering. - Use
DATEwhen time-of-day is not needed.
JSON Payloads
CREATE TABLE app.events ( id BIGINT PRIMARY KEY DEFAULT SNOWFLAKE_ID(), event_name TEXT NOT NULL, payload JSON, created_at TIMESTAMP NOT NULL DEFAULT NOW()) WITH ( TYPE = 'STREAM', TTL_SECONDS = 300);- Use
JSONfor dynamic payloads and schema-flexible metadata. - Keep frequently filtered fields as typed top-level columns.
Embeddings for Vector Search
CREATE TABLE rag.documents_vectors ( id BIGINT PRIMARY KEY, doc_embedding EMBEDDING(384)) WITH (TYPE = 'USER'); ALTER TABLE rag.documents_vectors CREATE INDEX doc_embedding USING COSINE; SELECT idFROM rag.documents_vectorsORDER BY COSINE_DISTANCE(doc_embedding, '[0.12,0.03,0.98]')LIMIT 5;- Use
EMBEDDING(n)for fixed-size vector columns. - Pick
nto match the embedding model you generate in your app or worker. - Index the embedding column before using it for nearest-neighbor retrieval.
Type Conversion
Use explicit casts when moving between compatible types.
SELECT CAST('42' AS INT) AS count_value;SELECT CAST(amount AS DECIMAL(12,2)) AS normalized_amountFROM billing.invoice_items;Prefer explicit conversion over relying on implicit casting in critical queries.
Nullability and Defaults
Use NOT NULL for required fields and DEFAULT for stable inserts.
CREATE TABLE app.sessions ( id BIGINT PRIMARY KEY DEFAULT SNOWFLAKE_ID(), user_id BIGINT NOT NULL, started_at TIMESTAMP NOT NULL DEFAULT NOW(), ended_at TIMESTAMP, is_valid BOOLEAN NOT NULL DEFAULT TRUE) WITH (TYPE = 'USER');Recommended Patterns
- IDs:
BIGINT PRIMARY KEY DEFAULT SNOWFLAKE_ID() - Created timestamps:
TIMESTAMP NOT NULL DEFAULT NOW() - Money/rates:
DECIMAL(not floating point) - Optional user text: nullable
TEXT - Real-time transient events:
STREAM+TTL_SECONDS
Related Docs
Last updated on