storage_engine 1.0.6
storage_engine
A PostgreSQL extension providing two high-performance Table Access Methods designed for analytical and HTAP workloads.
- `colcompress` — column-oriented compressed storage with vectorized execution and parallel scan
- `rowcompress` — row-oriented batch-compressed storage with parallel scan
Both AMs coexist alongside standard heap tables in the same database. All catalog objects are isolated in the engine schema, making the extension safe to install alongside citus_columnar or any other columnar extension (all exported C symbols carry the se_ prefix to avoid linker conflicts).
Lineage:
`storage_engine` is a fork of Hydra Columnar (itself derived from citus_columnar), extended with `rowcompress`, full DELETE/UPDATE support, stripe-level min/max pruning, and a redesigned parallel scan. The MergeTree-style `orderby` option and zone-map pruning are directly inspired by ClickHouse.
Table of Contents
- Quick Start
- colcompress AM
- rowcompress AM
- Management Functions
- Catalog Views
- Installation
- PostgreSQL Version Compatibility
- Known Limitations
- Benchmarks
- Attribution
Quick Start
CREATE EXTENSION storage_engine;
-- Column-oriented analytics table
CREATE TABLE events (
ts timestamptz NOT NULL,
user_id bigint,
event_type text,
value float8
) USING colcompress;
-- Row-oriented compressed table (good for append-heavy workloads)
CREATE TABLE logs (
id bigserial,
logged_at timestamptz NOT NULL,
message text
) USING rowcompress;
-- Insert normally
INSERT INTO events
SELECT now(), g, 'click', random()
FROM generate_series(1, 1000000) g;
-- Query normally — column projection, vectorized execution,
-- and parallel scan are transparent and automatic.
SELECT event_type, count(*), avg(value)
FROM events
WHERE ts > now() - interval '1 day'
GROUP BY 1;
colcompress AM
How It Works
Data is stored column by column on disk. Each column is split into stripes (default 150,000 rows each), and each stripe is further divided into chunk groups (default 10,000 rows). Every chunk records the minimum and maximum value of its column, enabling zone-map pruning at query time.
Table file
├── Stripe 1 (rows 1 – 150,000)
│ ├── Chunk group 0 (rows 1 – 10,000)
│ │ ├── Column A [min, max, compressed values…]
│ │ ├── Column B [min, max, compressed values…]
│ │ └── …
│ └── Chunk group 1 (rows 10,001 – 20,000) …
└── Stripe 2 (rows 150,001 – 300,000) …
A scan only reads the columns referenced by the query, skipping all others entirely. This dramatically reduces I/O for wide tables with selective column access patterns.
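To make the projection benefit concrete, here is a toy Python sketch (column names and payload sizes are invented for illustration; the real on-disk format is the stripe/chunk layout shown above): the table is stored as one compressed blob per column, and a scan decompresses only the columns the query references.

```python
import zlib

# Toy model of column projection: one compressed payload per column.
# A query referencing 2 of the 10 columns decompresses only those 2.
columns = {f"col_{i}": zlib.compress(bytes(1000)) for i in range(10)}

def scan(referenced):
    """Decompress only the referenced columns; return bytes touched."""
    return sum(len(zlib.decompress(columns[name])) for name in referenced)

full = scan(list(columns))            # SELECT * : every column is read
projected = scan(["col_0", "col_3"])  # SELECT col_0, col_3 : two columns
print(projected, full)  # 2000 10000
```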
Column-Level Caching
The AM maintains an in-memory column cache that stores decompressed column chunks across executor iterations. When the same stripe region is accessed more than once (nested loops, repeated plans, self-joins), the decompressed data is served from cache without re-reading or re-decompressing the file.
SET storage_engine.enable_column_cache = on; -- default: on
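A minimal Python sketch of the idea (the class and key layout are invented for illustration; the actual cache lives in backend-local memory and stores decoded column chunks):

```python
import zlib

# Toy column chunk cache: decompressed chunks are kept keyed by
# (column, stripe, chunk group) so repeated reads of the same region
# (nested loops, self-joins) skip decompression entirely.
class ColumnChunkCache:
    def __init__(self):
        self.cache, self.hits, self.misses = {}, 0, 0

    def get_chunk(self, key, compressed_bytes):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        values = zlib.decompress(compressed_bytes)  # stand-in for lz4/zstd
        self.cache[key] = values
        return values

cache = ColumnChunkCache()
chunk = zlib.compress(b"column A, stripe 1, chunk group 0")
cache.get_chunk(("A", 1, 0), chunk)  # first access: decompress and store
cache.get_chunk(("A", 1, 0), chunk)  # second access: served from memory
print(cache.hits, cache.misses)  # 1 1
```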
Vectorized Execution
colcompress ships a vectorized expression evaluation engine that processes WHERE clauses and aggregates in column-oriented batches of up to 10,000 values per call, instead of row-at-a-time evaluation. This maps naturally onto column chunks and eliminates per-row interpreter overhead.
Supported vectorized operations:
| Category | Types |
|---|---|
| Comparison operators (`=`, `<>`, `<`, `<=`, `>`, `>=`) | `int2`, `int4`, `int8`, `float4`, `float8`, `date`, `timestamp`, `timestamptz`, `char`, `bpchar`, `text`, `varchar`, `name`, `bool`, `oid` |
| Aggregates (`count`, `sum`, `avg`, `max`, `min`) | `int2`, `int4`, `int8`, `float8`, `date`, `timestamptz` |
SET storage_engine.enable_vectorization = on; -- default: on
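The difference from row-at-a-time evaluation can be sketched in plain Python (a toy model; the real engine evaluates typed column chunks with specialized comparison loops, not Python lists):

```python
# Toy sketch of batched qual + aggregate evaluation: the predicate runs
# over a whole chunk of up to 10,000 values in one tight loop, producing
# a selection vector that feeds the aggregate, instead of dispatching the
# expression interpreter once per row.
BATCH_SIZE = 10_000

def filtered_sum(values, threshold):
    matched, total = 0, 0
    for start in range(0, len(values), BATCH_SIZE):
        batch = values[start:start + BATCH_SIZE]
        selection = [v > threshold for v in batch]  # one pass: selection vector
        for v, sel in zip(batch, selection):        # aggregate selected rows
            if sel:
                matched += 1
                total += v
    return matched, total

matched, total = filtered_sum(list(range(25_000)), 19_999)
print(matched, total)  # 5000 112497500
```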
Parallel Scan
The AM implements the full PostgreSQL parallel Table AM protocol using Dynamic Shared Memory (DSM). The coordinator divides the stripe range across workers; each worker independently reads and decompresses its assigned stripes, then feeds results back through the Gather node.
Parallel scan stacks on top of vectorized execution — each worker runs its own vectorized evaluation pipeline independently.
SET storage_engine.enable_parallel_execution = on; -- default: on
SET storage_engine.min_parallel_processes = 8; -- minimum workers (default: 8)
-- Standard PostgreSQL parallel knobs also apply:
SET max_parallel_workers_per_gather = 4;
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
Stripe-Level and Chunk-Level Min/Max Pruning
colcompress implements two layers of zone-map pruning using the minimum_value / maximum_value statistics stored per chunk in engine.chunk.
Stripe-level pruning (coarse) — Before reading any data, the scan aggregates the min/max across all chunks of each stripe and tests the resulting stripe-wide range against the query’s WHERE predicates using PostgreSQL’s predicate_refuted_by. Any stripe whose range is provably disjoint from the predicate is skipped entirely — no decompression, no I/O. The number of stripes removed this way is reported in EXPLAIN:
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM events WHERE ts > '2025-01-01';
-- Custom Scan (ColumnarScan) on events ...
-- Engine Stripes Removed by Pruning: 41
-- Engine Stripes Read: 12
Stripe pruning is most effective after engine.colcompress_merge() has established a global sort order, but it also works on any partially sorted data.
Chunk-level pruning (fine) — Within each stripe that survives the coarse filter, the custom scan evaluates each individual chunk group’s min/max range against the same predicates. Chunk groups whose range cannot satisfy the predicate are skipped.
The two layers compose: a query on a large, well-sorted table typically eliminates entire stripes before touching them, then further prunes chunk groups within the remaining stripes, resulting in very small I/O amplification even without an index.
Pruning effectiveness scales directly with data sortedness. Use the orderby option combined with engine.colcompress_merge() to establish global sort order and maximize pruning at both levels.
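The interplay of the two layers can be modeled in a toy Python sketch (assuming a simple BETWEEN predicate; the real engine tests arbitrary quals via predicate_refuted_by):

```python
# Each stripe is a list of per-chunk-group (min, max) pairs, as stored in
# engine.chunk. Predicate: value BETWEEN q_lo AND q_hi.
def prune(stripes, q_lo, q_hi):
    stripes_pruned, chunks_to_read = 0, []
    for stripe_id, chunks in stripes.items():
        s_min = min(lo for lo, _ in chunks)
        s_max = max(hi for _, hi in chunks)
        # Coarse layer: skip the stripe outright if its aggregated range
        # is provably disjoint from the predicate (no I/O at all).
        if s_max < q_lo or s_min > q_hi:
            stripes_pruned += 1
            continue
        # Fine layer: test each surviving chunk group's own range.
        for idx, (c_min, c_max) in enumerate(chunks):
            if not (c_max < q_lo or c_min > q_hi):
                chunks_to_read.append((stripe_id, idx))
    return stripes_pruned, chunks_to_read

# Globally sorted data: stripe ranges do not overlap, so pruning is maximal.
stripes = {
    1: [(0, 99), (100, 199)],
    2: [(200, 299), (300, 399)],
    3: [(400, 499), (500, 599)],
}
pruned, to_read = prune(stripes, 250, 320)
print(pruned, to_read)  # 2 [(2, 0), (2, 1)]
```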
Index-Backed Scan
A custom index scan path allows B-tree and other indexes to drive lookups into a colcompress table, decompressing only the rows matched by the index.
-- Session GUC (applies to all colcompress tables in the session)
SET storage_engine.enable_columnar_index_scan = on; -- default: off
-- Per-table override (persisted in engine.col_options, survives reconnect)
SELECT engine.alter_colcompress_table_set('documents'::regclass, index_scan => true);
The right default depends on the access pattern:
| Workload | Recommendation |
|---|---|
| Analytical (aggregates, range scans, high column count) | Keep off — the sequential vectorized path + chunk pruning wins |
| Document repository (XML, PDF, JSON blobs stored for compression, fetched by primary key or unique identifier) | Set on — you want columnar compression with point-lookup speed, not full-table scan |
Index scan is enabled for a given query if either the session GUC is on or the table’s index_scan option is true. This lets you keep the GUC off globally while enabling it selectively per table:
-- OLAP tables: keep default (index_scan = false)
-- Document table: enable permanently
SELECT engine.alter_colcompress_table_set('adm.documents'::regclass, index_scan => true);
-- Inspect
SELECT table_name, index_scan FROM engine.colcompress_options WHERE table_name = 'adm.documents';
-- table_name | index_scan
-- ---------------+-----------
-- adm.documents | t
For document repositories the combination is compelling: colcompress with zstd can compress large binary/text documents 3–10×, and with the index scan enabled, retrievals like WHERE id = $1 or WHERE document_key = $1 decompress only the matching rows without touching the rest of the stripe.
When to use index_scan = true vs colcompress_merge()
These are two distinct use cases, not competing strategies:
| Use case | Strategy |
|---|---|
| File/document storage (XML, PDF, JSON blobs — fetched by primary key or unique key) | `index_scan = true`. You want columnar compression for storage savings and point-lookup speed without full-stripe decompression. Sort order is irrelevant; every fetch targets a specific row. |
| Analytics (aggregations, date ranges, GROUP BY, pattern scans over millions of rows) | `index_scan = false` + `colcompress_merge()`. Sort the data by the query's filter column (`orderby = 'event_date ASC'`), then merge to produce globally ordered stripes. Stripe-level min/max pruning skips irrelevant stripes entirely before any decompression occurs. |
Mixing both on the same table is possible but not ideal: a B-tree index on the `orderby` column will cause the planner to prefer IndexScan for range queries, disabling stripe pruning (see Known Limitations). If you need occasional point lookups on an analytical table, rely on the session-level GUC `SET storage_engine.enable_columnar_index_scan = on` rather than creating a B-tree index.
DELETE and UPDATE Support
colcompress fully supports DELETE and UPDATE via a row mask stored in engine.row_mask. Each deleted row is marked as a bit in a per-chunk-group bitmask; the scan engine skips masked rows without rewriting the stripe. UPDATE is implemented as a delete-then-insert.
SET storage_engine.enable_dml = on; -- default: on
DELETE FROM events WHERE ts < now() - interval '1 year';
UPDATE events SET value = value * 1.1 WHERE event_type = 'purchase';
Deleted rows are reclaimed during VACUUM, which rewrites affected stripes and clears the row mask.
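The mechanism can be sketched with a toy bitmask in Python (illustrative only; the real masks live in engine.row_mask and are consulted by the scan engine):

```python
# Toy per-chunk-group row mask: DELETE sets a bit instead of rewriting the
# stripe, and the scan filters masked offsets out of decompressed chunks.
class ChunkRowMask:
    def __init__(self, row_count):
        self.mask = bytearray((row_count + 7) // 8)

    def delete(self, offset):
        self.mask[offset // 8] |= 1 << (offset % 8)

    def is_deleted(self, offset):
        return bool(self.mask[offset // 8] & (1 << (offset % 8)))

    def live_rows(self, values):
        return [v for i, v in enumerate(values) if not self.is_deleted(i)]

mask = ChunkRowMask(8)
values = [10, 20, 30, 40, 50, 60, 70, 80]
mask.delete(1)  # DELETE matched rows at offsets 1 and 4 of this chunk group
mask.delete(4)
print(mask.live_rows(values))  # [10, 30, 40, 60, 70, 80]
```

In this model, UPDATE marks the old offset deleted and appends the new row version at the end of the table, matching the delete-then-insert behavior described above.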
ON CONFLICT / Upserts
Standard INSERT … ON CONFLICT is fully supported, including DO NOTHING and DO UPDATE SET …. Requires a unique index on the conflict target column(s).
INSERT INTO events (ts, user_id, event_type, value)
VALUES (now(), 42, 'click', 1.0)
ON CONFLICT (user_id, event_type) DO UPDATE SET value = EXCLUDED.value;
MergeTree-Like Ordering and colcompress_merge
Inspired by ClickHouse’s MergeTree engine, colcompress supports a global sort key per table. When set, every new stripe written to the table is sorted by the key before compression. The engine.colcompress_merge() function rewrites the entire table in a single globally sorted pass, making stripe-level and chunk-level min/max pruning maximally effective across all stripes.
-- Assign a sort key to an existing table
SELECT engine.alter_colcompress_table_set(
'events'::regclass,
orderby => 'ts ASC, user_id ASC'
);
-- After a large INSERT or initial load, compact and globally sort:
SELECT engine.colcompress_merge('events');
-- Now WHERE ts BETWEEN ... skips almost all chunks.
colcompress_merge internally:
1. Copies all live (non-deleted) rows to a temporary heap table
2. Truncates the target table
3. Re-inserts rows in the order defined by orderby, writing fresh globally-ordered stripes
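The effect of those three steps can be sketched in Python (toy rows and a tiny stripe size; the real pass streams through a temporary heap table):

```python
# Toy merge: gather all live rows from the existing (unsorted) stripes,
# sort them once by the orderby key, and rewrite non-overlapping stripes.
STRIPE_ROW_LIMIT = 4  # tiny for illustration; the real default is 150,000

def merge(stripes, sort_key):
    live_rows = [row for stripe in stripes for row in stripe]  # step 1: copy
    live_rows.sort(key=sort_key)                               # order for step 3
    # steps 2-3: truncate, then rewrite as globally sorted stripes
    return [live_rows[i:i + STRIPE_ROW_LIMIT]
            for i in range(0, len(live_rows), STRIPE_ROW_LIMIT)]

# Two stripes with interleaved timestamps from out-of-order inserts
stripes = [[(5,), (1,), (7,), (3,)], [(2,), (8,), (4,), (6,)]]
merged = merge(stripes, sort_key=lambda row: row[0])
print(merged)  # [[(1,), (2,), (3,), (4,)], [(5,), (6,), (7,), (8,)]]
```

After the merge the stripes' min/max ranges no longer overlap, which is exactly what makes stripe-level pruning maximally effective.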
Compression Options
Each chunk is compressed independently. Available algorithms:
| Name | Description |
|---|---|
| `none` | No compression |
| `lz4` | Fast compression/decompression, moderate ratio |
| `zstd` | High ratio, configurable level 1–19 (default level: 3) |
| `pglz` | PostgreSQL’s built-in LZ variant |
SELECT engine.alter_colcompress_table_set('events',
compression => 'zstd',
compression_level => 9);
Configuration GUCs
All parameters can be set per-session or globally in postgresql.conf.
GUC availability: the `storage_engine.*` GUCs are registered when the `storage_engine` shared library is loaded. This happens automatically the first time a `colcompress` or `rowcompress` table is accessed in a session. To make the GUCs available immediately in every session (including before any table access), add the extension to `shared_preload_libraries`:
shared_preload_libraries = 'storage_engine'
GUC names use the `storage_engine.` prefix and do not overlap with citus_columnar (which uses the `columnar.` prefix), so both extensions can be loaded simultaneously without conflict.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `storage_engine.enable_parallel_execution` | bool | `on` | Enable parallel custom scan via DSM |
| `storage_engine.min_parallel_processes` | int | `8` | Minimum worker count to launch for parallel scan |
| `storage_engine.enable_vectorization` | bool | `on` | Enable vectorized WHERE/aggregate evaluation |
| `storage_engine.enable_custom_scan` | bool | `on` | Enable projection and qual-pushdown custom scan |
| `storage_engine.enable_column_cache` | bool | `on` | Enable in-memory column chunk cache |
| `storage_engine.enable_columnar_index_scan` | bool | `off` | Enable index-driven columnar scan path (recommended `on` for document/point-lookup repositories) |
| `storage_engine.enable_dml` | bool | `on` | Allow DELETE and UPDATE on colcompress tables |
| `storage_engine.stripe_row_limit` | int | `150000` | Default rows per stripe (1,000 – 100,000,000) |
| `storage_engine.chunk_group_row_limit` | int | `10000` | Default rows per chunk group (1,000 – 100,000,000) |
| `storage_engine.compression_level` | int | `3` | Default compression level for zstd (1 – 19) |
Per-Table Options
Per-table options override the session GUCs for a specific table:
SELECT engine.alter_colcompress_table_set(
'events'::regclass,
chunk_group_row_limit => 10000, -- NULL = leave unchanged
stripe_row_limit => 150000,
compression => 'zstd',
compression_level => 9,
orderby => 'ts ASC, user_id ASC',
index_scan => false -- true = skip cost penalty for index scans on this table
);
-- Reset individual options to system defaults
SELECT engine.alter_colcompress_table_reset(
'events'::regclass,
compression => true,
compression_level => true,
index_scan => true -- resets to false
);
-- Inspect current options
SELECT * FROM engine.colcompress_options WHERE table_name = 'events';
rowcompress AM
How It Works
rowcompress stores rows in fixed-size batches (default 10,000 rows per batch). Each batch is serialized using heap tuple format and then compressed as a single unit. Batch metadata (file offset, byte size, first row number, row count) is stored in engine.row_batch.
Table file (ColumnarStorage layout)
├── Batch 1: [header | row offsets | compressed heap tuple data]
├── Batch 2: [header | row offsets | compressed heap tuple data]
└── …
This AM suits append-heavy workloads where compression matters but column projection is not needed — event logs, audit trails, time-series with many columns all queried together. Typical storage savings of 2–10× with zstd.
Compared to colcompress:
- Reads full rows (no column projection)
- Lower write latency per row (no columnar transposition)
- No vectorized execution or chunk-level pruning
- Supports parallel reads and index scans
- Supports multiple compression algorithms
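The batch layout above can be sketched in a toy Python model (pickle and zlib stand in for heap tuple serialization and zstd; the metadata fields mirror engine.row_batch):

```python
import pickle
import zlib

BATCH_SIZE = 3  # tiny for illustration; the real default is 10,000

def write_batches(rows):
    """Serialize and compress rows in fixed-size batches; record metadata."""
    file_bytes, metadata = b"", []
    for start in range(0, len(rows), BATCH_SIZE):
        batch = rows[start:start + BATCH_SIZE]
        compressed = zlib.compress(pickle.dumps(batch))  # one unit per batch
        metadata.append({"offset": len(file_bytes),
                         "size": len(compressed),
                         "first_row": start + 1,
                         "row_count": len(batch)})
        file_bytes += compressed
    return file_bytes, metadata

def read_batch(file_bytes, meta):
    """Each batch decompresses independently of all others."""
    raw = file_bytes[meta["offset"]:meta["offset"] + meta["size"]]
    return pickle.loads(zlib.decompress(raw))

rows = [(i, f"message {i}") for i in range(1, 8)]
data, meta = write_batches(rows)
print(len(meta))                  # 3 batches: 3 + 3 + 1 rows
print(read_batch(data, meta[1]))  # rows 4..6, decompressed on their own
```

Because each batch is self-contained, parallel workers can decompress different batches independently, which is what the parallel scan below exploits.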
Parallel Scan
rowcompress implements the PostgreSQL parallel scan protocol via atomic batch claiming. Each parallel worker atomically increments a shared counter to claim the next unprocessed batch, decompresses it, and repeats. There is no coordinator or work distribution step — workers self-schedule in a work-stealing fashion with zero contention on most paths.
-- Standard PostgreSQL parallel knobs apply
SET max_parallel_workers_per_gather = 4;
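The claiming scheme can be sketched with Python threads (a lock stands in for the shared-memory atomic fetch-add; class and function names are invented for illustration):

```python
import threading

# Toy self-scheduling scan: each worker atomically claims the next batch
# number from a shared counter, so no coordinator distributes work and no
# batch is ever processed twice.
class BatchClaimer:
    def __init__(self, batch_count):
        self.next_batch, self.batch_count = 0, batch_count
        self.lock = threading.Lock()  # stand-in for an atomic fetch-add

    def claim(self):
        with self.lock:
            if self.next_batch >= self.batch_count:
                return None  # scan finished
            claimed = self.next_batch
            self.next_batch += 1
            return claimed

def worker(claimer, results):
    while (batch := claimer.claim()) is not None:
        results.append(batch)  # a real worker would decompress + scan here

claimer, results = BatchClaimer(batch_count=10), []
threads = [threading.Thread(target=worker, args=(claimer, results))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # every batch claimed exactly once
```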
Per-Table Options
SELECT engine.alter_rowcompress_table_set(
'logs'::regclass,
batch_size => 10000, -- rows per compressed batch (default: 10000)
compression => 'zstd',
compression_level => 5
);
-- Reset to defaults
SELECT engine.alter_rowcompress_table_reset('logs'::regclass, compression => true);
-- Rewrite all batches with current options (e.g. after changing compression)
SELECT engine.rowcompress_repack('logs');
-- Inspect
SELECT * FROM engine.rowcompress_options WHERE table_name = 'logs';
SELECT * FROM engine.rowcompress_batches LIMIT 10;
Management Functions
| Function | Description |
|---|---|
| `engine.alter_colcompress_table_set(regclass, ...)` | Set one or more options on a colcompress table |
| `engine.alter_colcompress_table_reset(regclass, ...)` | Reset colcompress options to system defaults |
| `engine.colcompress_merge(regclass)` | Rewrite and globally sort a colcompress table by its `orderby` key |
| `engine.colcompress_repack(regclass)` | Alias for `colcompress_merge`; drop-in replacement for pg_repack on colcompress tables |
| `engine.alter_rowcompress_table_set(regclass, ...)` | Set one or more options on a rowcompress table |
| `engine.alter_rowcompress_table_reset(regclass, ...)` | Reset rowcompress options to system defaults |
| `engine.rowcompress_repack(regclass)` | Rewrite all batches of a rowcompress table with current options |
Catalog Views
| View | Description |
|---|---|
| `engine.colcompress_options` | Per-table options for all colcompress tables |
| `engine.colcompress_stripes` | Stripe-level metadata (offset, size, row range) per table |
| `engine.rowcompress_options` | Per-table options for all rowcompress tables |
| `engine.rowcompress_batches` | Batch-level metadata for all rowcompress tables |
All views grant SELECT to PUBLIC.
Installation
Build from source
Requires PostgreSQL server headers and pg_config in PATH.
cd dist/
sudo make -j$(nproc) install
Add to postgresql.conf:
shared_preload_libraries = 'storage_engine'
Loading with citus
If citus or pg_cron is also in shared_preload_libraries, the load order matters:
shared_preload_libraries = 'pg_cron,citus,storage_engine'
citus must appear before storage_engine. PostgreSQL registers planner hooks in load order; citus expects to be the outermost hook in the chain. Reversing the order causes PostgreSQL to refuse to start.
Restart PostgreSQL and load the extension:
CREATE EXTENSION storage_engine;
Docker
docker compose up
psql postgres://postgres:password@127.0.0.1:5432
The engine schema and both AMs are created automatically on CREATE EXTENSION.
Known Limitations
No AFTER ROW triggers (and no pg_repack)
colcompress and rowcompress tables do not support AFTER ROW triggers or foreign keys. This is a fundamental architectural constraint: columnar storage does not maintain heap tuples in a form that row-level trigger machinery can inspect.
This means pg_repack cannot be used — it relies internally on an AFTER ROW trigger to capture concurrent changes during its online copy phase:
ERROR: Foreign keys and AFTER ROW triggers are not supported for columnar tables
DETAIL: Tools such as pg_repack use AFTER ROW triggers internally and cannot be used with colcompress or rowcompress tables.
HINT: Use engine.colcompress_repack(table) as a drop-in replacement for pg_repack on colcompress tables.
Use engine.colcompress_repack() instead:
-- Drop-in replacement for: pg_repack -t mytable
SELECT engine.colcompress_repack('adm.documents'::regclass);
-- Repack all colcompress tables in the database:
DO $$
DECLARE r record;
BEGIN
FOR r IN
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_am a ON c.relam = a.oid
WHERE a.amname = 'colcompress' AND c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC
LOOP
RAISE NOTICE 'Repacking %.%...', r.nspname, r.relname;
PERFORM engine.colcompress_repack(
(quote_ident(r.nspname) || '.' || quote_ident(r.relname))::regclass);
END LOOP;
END;
$$;
Limitation vs. pg_repack: `colcompress_repack` acquires `AccessExclusiveLock` for the duration of the operation (reads and writes are blocked). There is no online/concurrent mode. Schedule during a maintenance window for large tables.
AFTER STATEMENT triggers are supported
Only row-level (FOR EACH ROW) AFTER triggers are blocked. Statement-level (FOR EACH STATEMENT) AFTER triggers work fine.
sort-on-write is disabled when indexes exist
When a colcompress table has both an orderby option and one or more B-tree indexes, the sort-on-write path is automatically disabled during writes. This avoids a TID placeholder problem that would corrupt index entries. The table’s stripes are still read in their natural write order; run engine.colcompress_merge() (or colcompress_repack()) to produce globally sorted stripes after loading data.
B-tree indexes on colcompress disable stripe pruning
When a B-tree index exists on a colcompress column, the PostgreSQL planner may choose IndexScan even for queries that would benefit from a full sequential scan with stripe pruning. IndexScan opens the table with randomAccess=true, which bypasses the stripe pruning code path — causing date-range and similar ordered queries to read all stripes instead of only the relevant one.
For analytical tables: do not create B-tree indexes on columns covered by the orderby key. Use GIN indexes for JSONB and array columns, and rely on stripe pruning for range predicates on the sort key. If point-lookup access is occasionally needed on an analytical colcompress table, keep the per-table index_scan option at false, enable SET storage_engine.enable_columnar_index_scan = on at session level for those lookups, and use engine.colcompress_merge() to keep the data globally sorted.
Examples of problematic and safe index patterns:
-- BAD: B-tree index on the orderby column defeats stripe pruning on range queries
CREATE INDEX ON events_col (event_date); -- do NOT do this
-- GOOD: GIN indexes for JSONB / array columns are fine
CREATE INDEX ON events_col USING gin (metadata jsonb_path_ops);
CREATE INDEX ON events_col USING gin (tags);
-- Ensure the planner does not choose IndexScan
SELECT engine.alter_colcompress_table_set('events_col'::regclass, index_scan => false);
No CLUSTER support
CLUSTER (index-ordered physical rewrite) is not implemented for columnar tables. Use engine.colcompress_merge() with an orderby option to achieve equivalent physical ordering.
No VACUUM FULL / table rewrite
VACUUM FULL triggers a table rewrite that is not implemented for colcompress or rowcompress tables. Use engine.colcompress_repack() / engine.rowcompress_repack() instead, which perform the same compaction without the PostgreSQL rewrite machinery.
No unlogged tables
CREATE UNLOGGED TABLE ... USING colcompress is not supported (ERRCODE_FEATURE_NOT_SUPPORTED). Use the default LOGGED persistence.
No speculative insertion (INSERT … ON CONFLICT on non-unique predicates)
Speculative insertion (INSERT … ON CONFLICT) requires a unique index on the conflict target. Conflict detection on arbitrary predicates or without an index is not supported.
Benchmarks
Benchmark suite: 1 000 000 rows, PostgreSQL 18.3, AMD Ryzen 7 5800H (8-core), 40 GB RAM, shared_buffers=10GB. colcompress configured with lz4 compression and orderby = 'event_date ASC' (globally sorted via colcompress_merge).
Two scenarios are measured:
| Scenario | Settings |
|---|---|
| Serial (storage baseline) | JIT=off, max_parallel_workers_per_gather=0 |
| Parallel (real-world simulation) | JIT=on, max_parallel_workers_per_gather=16 |
Serial results — median of 3 runs
| Query | heap | colcompress | rowcompress | citus_columnar |
|---|---|---|---|---|
| Q1 `count(*)` | 38.6ms | 43.7ms | 305ms | 36.9ms |
| Q2 SUM/AVG numeric+double | 182.3ms | 118.3ms | 356ms | 121.4ms |
| Q3 GROUP BY country (10 vals) | 214.4ms | 162.3ms | 382ms | 141.4ms |
| Q4 GROUP BY event_type + p95 | 538.2ms | 452.5ms | 680ms | 469.9ms |
| Q5 date range 1 month | 21.1ms | 23.5ms | 60.0ms | 21.1ms |
| Q6 JSONB `@>` GIN | 121.7ms | 371.4ms | 322ms | 236.9ms |
| Q7 JSONB key + GROUP BY | 386.2ms | 309.0ms | 537ms | 354.4ms |
| Q8 array `@>` GIN | 61.3ms | 329.2ms | 272ms | 143.8ms |
| Q9 LIKE text scan | 147.0ms | 88.3ms | 333ms | 90.3ms |
| Q10 heavy multi-agg | 1908ms | 1902ms | 2067ms | 1914ms |
Q5 on colcompress achieves heap-equivalent performance (23.5ms vs 21.1ms heap) because stripe pruning skips 6 of 7 stripes — data is physically sorted by event_date via the orderby option. lz4 decompression adds negligible overhead over zstd for this stripe-pruned workload while reducing merge time.
Parallel results — median of 3 runs (JIT on, 16 workers)
| Query | heap | colcompress | rowcompress | citus_columnar |
|---|---|---|---|---|
| Q1 `count(*)` | 17.8ms | 16.3ms | 144ms | 37.0ms |
| Q2 SUM/AVG numeric+double | 50.1ms | 30.9ms | 142ms | 121.7ms |
| Q3 GROUP BY country (10 vals) | 57.6ms | 171ms | 151ms | 138ms |
| Q4 GROUP BY event_type + p95 | 539ms | 329ms | 686ms | 473ms |
| Q5 date range 1 month | 21.2ms | 242ms | 69.5ms | 21.0ms |
| Q6 JSONB `@>` GIN | 84.5ms | 42.8ms | 465ms | 235ms |
| Q7 JSONB key + GROUP BY | 391ms | 87.7ms | 692ms | 349ms |
| Q8 array `@>` GIN | 61.7ms | 33.3ms | 275ms | 147ms |
| Q9 LIKE text scan | 48.7ms | 26.8ms | 140ms | 91.0ms |
| Q10 heavy multi-agg | 1951ms | 691ms | 2085ms | 1958ms |
Note: Q5 shows higher latency for colcompress in parallel mode. Each parallel worker scans its assigned block range independently without a global stripe-pruning pass, so all stripes are read. Stripe pruning is effective only in the single-process sequential scan path.
Reproducing the results
The full benchmark kit is in tests/bench/:
createdb bench_am
psql -d bench_am -f tests/bench/setup.sql
# Serial run
bash tests/bench/run.sh 3
python3 tests/bench/chart.py
# Parallel run
bash tests/bench/run_parallel.sh 3
python3 tests/bench/chart_parallel.py
See tests/README.md for full environment description and step-by-step instructions.
Benchmark results above correspond to version 1.0.4 with `lz4` compression and globally sorted stripes.
PostgreSQL Version Compatibility
| PostgreSQL | Status |
|---|---|
| 13 | Supported |
| 14 | Supported |
| 15 | Supported |
| 16 | Supported |
| 17 | Supported |
| 18 | Supported (current development target) |
Attribution
storage_engine is a fork of Hydra Columnar, which is itself derived from citus_columnar — the columnar extension originally built by Citus Data (now part of Microsoft). The original work is copyright Citus Data / Hydra and licensed under the AGPL-3.0.
Key ideas borrowed or adapted from other projects:
| Inspiration | Feature |
|---|---|
| ClickHouse MergeTree | Per-table orderby sort key; zone-map (min/max) pruning at stripe and chunk level |
| Apache Parquet | Row-group statistics; column projection; dictionary encoding |
| DuckDB | Vectorized expression evaluation for columnar batches |
We are grateful to the Hydra and Citus teams for making their work open source.