storage_engine 1.0.6
storage_engine
A PostgreSQL extension providing two high-performance Table Access Methods designed for analytical and HTAP workloads.
- `colcompress` — column-oriented compressed storage with vectorized execution and parallel scan
- `rowcompress` — row-oriented batch-compressed storage with parallel scan
Both AMs coexist alongside standard heap tables in the same database. All catalog objects are isolated in the engine schema, making the extension safe to install alongside citus_columnar or any other columnar extension (all exported C symbols carry the se_ prefix to avoid linker conflicts).
Lineage:
`storage_engine` is a fork of Hydra Columnar (itself derived from citus_columnar), extended with `rowcompress`, full DELETE/UPDATE support, stripe-level min/max pruning, and a redesigned parallel scan. The MergeTree-style `orderby` option and zone-map pruning are directly inspired by ClickHouse.
Table of Contents
- Quick Start
- colcompress AM
- rowcompress AM
- Management Functions
- Catalog Views
- Installation
- PostgreSQL Version Compatibility
- Known Limitations
- Benchmarks
- Attribution
Quick Start
CREATE EXTENSION storage_engine;
-- Column-oriented analytics table
CREATE TABLE events (
ts timestamptz NOT NULL,
user_id bigint,
event_type text,
value float8
) USING colcompress;
-- Row-oriented compressed table (good for append-heavy workloads)
CREATE TABLE logs (
id bigserial,
logged_at timestamptz NOT NULL,
message text
) USING rowcompress;
-- Insert normally
INSERT INTO events
SELECT now(), g, 'click', random()
FROM generate_series(1, 1000000) g;
-- Query normally — column projection, vectorized execution,
-- and parallel scan are transparent and automatic.
SELECT event_type, count(*), avg(value)
FROM events
WHERE ts > now() - interval '1 day'
GROUP BY 1;
colcompress AM
How It Works
Data is stored column by column on disk. Each column is split into stripes (default 150,000 rows each), and each stripe is further divided into chunk groups (default 10,000 rows). Every chunk records the minimum and maximum value of its column, enabling zone-map pruning at query time.
Table file
├── Stripe 1 (rows 1 – 150,000)
│ ├── Chunk group 0 (rows 1 – 10,000)
│ │ ├── Column A [min, max, compressed values…]
│ │ ├── Column B [min, max, compressed values…]
│ │ └── …
│ └── Chunk group 1 (rows 10,001 – 20,000) …
└── Stripe 2 (rows 150,001 – 300,000) …
A scan only reads the columns referenced by the query, skipping all others entirely. This dramatically reduces I/O for wide tables with selective column access patterns.
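To make the projection benefit concrete, here is a toy Python sketch (column names and payload sizes are invented for illustration; the real on-disk format is the stripe/chunk layout shown above): the table is stored as one compressed blob per column, and a scan decompresses only the columns the query references.

```python
import zlib

# Toy model of column projection: one compressed payload per column.
# A query referencing 2 of the 10 columns decompresses only those 2.
columns = {f"col_{i}": zlib.compress(bytes(1000)) for i in range(10)}

def scan(referenced):
    """Decompress only the referenced columns; return bytes touched."""
    return sum(len(zlib.decompress(columns[name])) for name in referenced)

full = scan(list(columns))            # SELECT * : every column is read
projected = scan(["col_0", "col_3"])  # SELECT col_0, col_3 : two columns
print(projected, full)  # 2000 10000
```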
Column-Level Caching
The AM maintains an in-memory column cache that stores decompressed column chunks across executor iterations. When the same stripe region is accessed more than once (nested loops, repeated plans, self-joins), the decompressed data is served from cache without re-reading or re-decompressing the file.
SET storage_engine.enable_column_cache = on; -- default: on
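A minimal Python sketch of the idea (the class and key layout are invented for illustration; the actual cache lives in backend-local memory and stores decoded column chunks):

```python
import zlib

# Toy column chunk cache: decompressed chunks are kept keyed by
# (column, stripe, chunk group) so repeated reads of the same region
# (nested loops, self-joins) skip decompression entirely.
class ColumnChunkCache:
    def __init__(self):
        self.cache, self.hits, self.misses = {}, 0, 0

    def get_chunk(self, key, compressed_bytes):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        values = zlib.decompress(compressed_bytes)  # stand-in for lz4/zstd
        self.cache[key] = values
        return values

cache = ColumnChunkCache()
chunk = zlib.compress(b"column A, stripe 1, chunk group 0")
cache.get_chunk(("A", 1, 0), chunk)  # first access: decompress and store
cache.get_chunk(("A", 1, 0), chunk)  # second access: served from memory
print(cache.hits, cache.misses)  # 1 1
```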
Vectorized Execution
colcompress ships a vectorized expression evaluation engine that processes WHERE clauses and aggregates in column-oriented batches of up to 10,000 values per call, instead of row-at-a-time evaluation. This maps naturally onto column chunks and eliminates per-row interpreter overhead.
Supported vectorized operations:
| Category | Types |
|---|---|
| Comparison operators (`=`, `<>`, `<`, `<=`, `>`, `>=`) | `int2`, `int4`, `int8`, `float4`, `float8`, `date`, `timestamp`, `timestamptz`, `char`, `bpchar`, `text`, `varchar`, `name`, `bool`, `oid` |
| Aggregates (`count`, `sum`, `avg`, `max`, `min`) | `int2`, `int4`, `int8`, `float8`, `date`, `timestamptz` |
SET storage_engine.enable_vectorization = on; -- default: on
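The difference from row-at-a-time evaluation can be sketched in plain Python (a toy model; the real engine evaluates typed column chunks with specialized comparison loops, not Python lists):

```python
# Toy sketch of batched qual + aggregate evaluation: the predicate runs
# over a whole chunk of up to 10,000 values in one tight loop, producing
# a selection vector that feeds the aggregate, instead of dispatching the
# expression interpreter once per row.
BATCH_SIZE = 10_000

def filtered_sum(values, threshold):
    matched, total = 0, 0
    for start in range(0, len(values), BATCH_SIZE):
        batch = values[start:start + BATCH_SIZE]
        selection = [v > threshold for v in batch]  # one pass: selection vector
        for v, sel in zip(batch, selection):        # aggregate selected rows
            if sel:
                matched += 1
                total += v
    return matched, total

matched, total = filtered_sum(list(range(25_000)), 19_999)
print(matched, total)  # 5000 112497500
```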
Parallel Scan
The AM implements the full PostgreSQL parallel Table AM protocol using Dynamic Shared Memory (DSM). The coordinator divides the stripe range across workers; each worker independently reads and decompresses its assigned stripes, then feeds results back through the Gather node.
Parallel scan stacks on top of vectorized execution — each worker runs its own vectorized evaluation pipeline independently.
SET storage_engine.enable_parallel_execution = on; -- default: on
SET storage_engine.min_parallel_processes = 8; -- minimum workers (default: 8)
-- Standard PostgreSQL parallel knobs also apply:
SET max_parallel_workers_per_gather = 4;
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
Stripe-Level and Chunk-Level Min/Max Pruning
colcompress implements two layers of zone-map pruning using the minimum_value / maximum_value statistics stored per chunk in engine.chunk.
Stripe-level pruning (coarse) — Before reading any data, the scan aggregates the min/max across all chunks of each stripe and tests the resulting stripe-wide range against the query’s WHERE predicates using PostgreSQL’s predicate_refuted_by. Any stripe whose range is provably disjoint from the predicate is skipped entirely — no decompression, no I/O. The number of stripes removed this way is reported in EXPLAIN:
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM events WHERE ts > '2025-01-01';
-- Custom Scan (ColumnarScan) on events ...
-- Engine Stripes Removed by Pruning: 41
-- Engine Stripes Read: 12
Stripe pruning is most effective after engine.colcompress_merge() has established a global sort order, but it also works on any partially sorted data.
Chunk-level pruning (fine) — Within each stripe that survives the coarse filter, the custom scan evaluates each individual chunk group’s min/max range against the same predicates. Chunk groups whose range cannot satisfy the predicate are skipped.
The two layers compose: a query on a large, well-sorted table typically eliminates entire stripes before touching them, then further prunes chunk groups within the remaining stripes, resulting in very small I/O amplification even without an index.
Pruning effectiveness scales directly with data sortedness. Use the orderby option combined with engine.colcompress_merge() to establish global sort order and maximize pruning at both levels.
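The interplay of the two layers can be modeled in a toy Python sketch (assuming a simple BETWEEN predicate; the real engine tests arbitrary quals via predicate_refuted_by):

```python
# Each stripe is a list of per-chunk-group (min, max) pairs, as stored in
# engine.chunk. Predicate: value BETWEEN q_lo AND q_hi.
def prune(stripes, q_lo, q_hi):
    stripes_pruned, chunks_to_read = 0, []
    for stripe_id, chunks in stripes.items():
        s_min = min(lo for lo, _ in chunks)
        s_max = max(hi for _, hi in chunks)
        # Coarse layer: skip the stripe outright if its aggregated range
        # is provably disjoint from the predicate (no I/O at all).
        if s_max < q_lo or s_min > q_hi:
            stripes_pruned += 1
            continue
        # Fine layer: test each surviving chunk group's own range.
        for idx, (c_min, c_max) in enumerate(chunks):
            if not (c_max < q_lo or c_min > q_hi):
                chunks_to_read.append((stripe_id, idx))
    return stripes_pruned, chunks_to_read

# Globally sorted data: stripe ranges do not overlap, so pruning is maximal.
stripes = {
    1: [(0, 99), (100, 199)],
    2: [(200, 299), (300, 399)],
    3: [(400, 499), (500, 599)],
}
pruned, to_read = prune(stripes, 250, 320)
print(pruned, to_read)  # 2 [(2, 0), (2, 1)]
```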
Index-Backed Scan
A custom index scan path allows B-tree and other indexes to drive lookups into a colcompress table, decompressing only the rows matched by the index.
-- Session GUC (applies to all colcompress tables in the session)
SET storage_engine.enable_columnar_index_scan = on; -- default: off
-- Per-table override (persisted in engine.col_options, survives reconnect)
SELECT engine.alter_colcompress_table_set('documents'::regclass, index_scan => true);
The right default depends on the access pattern:
| Workload | Recommendation |
|---|---|
| Analytical (aggregates, range scans, high column count) | Keep off — the sequential vectorized path + chunk pruning wins |
| Document repository (XML, PDF, JSON blobs stored for compression, fetched by primary key or unique identifier) | Set on — you want columnar compression with point-lookup speed, not full-table scan |
Index scan is enabled for a given query if either the session GUC is on or the table’s index_scan option is true. This lets you keep the GUC off globally while enabling it selectively per table:
-- OLAP tables: keep default (index_scan = false)
-- Document table: enable permanently
SELECT engine.alter_colcompress_table_set('adm.documents'::regclass, index_scan => true);
-- Inspect
SELECT table_name, index_scan FROM engine.colcompress_options WHERE table_name = 'adm.documents';
-- table_name | index_scan
-- ---------------+-----------
-- adm.documents | t
For document repositories the combination is compelling: colcompress with zstd can compress large binary/text documents 3–10×, and with the index scan enabled, retrievals like WHERE id = $1 or WHERE document_key = $1 decompress only the matching rows without touching the rest of the stripe.
When to use index_scan = true vs colcompress_merge()
These are two distinct use cases, not competing strategies:
| Use case | Strategy |
|---|---|
| File/document storage (XML, PDF, JSON blobs — fetched by primary key or unique key) | `index_scan = true`. You want columnar compression for storage savings and point-lookup speed without full-stripe decompression. Sort order is irrelevant; every fetch targets a specific row. |
| Analytics (aggregations, date ranges, GROUP BY, pattern scans over millions of rows) | `index_scan = false` + `colcompress_merge()`. Sort the data by the query's filter column (`orderby = 'event_date ASC'`), then merge to produce globally ordered stripes. Stripe-level min/max pruning skips irrelevant stripes entirely before any decompression occurs. |
Mixing both on the same table is possible but not ideal: a B-tree index on the `orderby` column will cause the planner to prefer IndexScan for range queries, disabling stripe pruning (see Known Limitations). If you need occasional point lookups on an analytical table, rely on the session-level GUC `SET storage_engine.enable_columnar_index_scan = on` rather than creating a B-tree index.
DELETE and UPDATE Support
colcompress fully supports DELETE and UPDATE via a row mask stored in engine.row_mask. Each deleted row is marked as a bit in a per-chunk-group bitmask; the scan engine skips masked rows without rewriting the stripe. UPDATE is implemented as a delete-then-insert.
SET storage_engine.enable_dml = on; -- default: on
DELETE FROM events WHERE ts < now() - interval '1 year';
UPDATE events SET value = value * 1.1 WHERE event_type = 'purchase';
Deleted rows are reclaimed during VACUUM, which rewrites affected stripes and clears the row mask.
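The mechanism can be sketched with a toy bitmask in Python (illustrative only; the real masks live in engine.row_mask and are consulted by the scan engine):

```python
# Toy per-chunk-group row mask: DELETE sets a bit instead of rewriting the
# stripe, and the scan filters masked offsets out of decompressed chunks.
class ChunkRowMask:
    def __init__(self, row_count):
        self.mask = bytearray((row_count + 7) // 8)

    def delete(self, offset):
        self.mask[offset // 8] |= 1 << (offset % 8)

    def is_deleted(self, offset):
        return bool(self.mask[offset // 8] & (1 << (offset % 8)))

    def live_rows(self, values):
        return [v for i, v in enumerate(values) if not self.is_deleted(i)]

mask = ChunkRowMask(8)
values = [10, 20, 30, 40, 50, 60, 70, 80]
mask.delete(1)  # DELETE matched rows at offsets 1 and 4 of this chunk group
mask.delete(4)
print(mask.live_rows(values))  # [10, 30, 40, 60, 70, 80]
```

In this model, UPDATE marks the old offset deleted and appends the new row version at the end of the table, matching the delete-then-insert behavior described above.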
ON CONFLICT / Upserts
Standard INSERT … ON CONFLICT is fully supported, including DO NOTHING and DO UPDATE SET …. Requires a unique index on the conflict target column(s).
INSERT INTO events (ts, user_id, event_type, value)
VALUES (now(), 42, 'click', 1.0)
ON CONFLICT (user_id, event_type) DO UPDATE SET value = EXCLUDED.value;
MergeTree-Like Ordering and colcompress_merge
Inspired by ClickHouse’s MergeTree engine, colcompress supports a global sort key per table. When set, every new stripe written to the table is sorted by the key before compression. The engine.colcompress_merge() function rewrites the entire table in a single globally sorted pass, making stripe-level and chunk-level min/max pruning maximally effective across all stripes.
-- Assign a sort key to an existing table
SELECT engine.alter_colcompress_table_set(
'events'::regclass,
orderby => 'ts ASC, user_id ASC'
);
-- After a large INSERT or initial load, compact and globally sort:
SELECT engine.colcompress_merge('events');
-- Now WHERE ts BETWEEN ... skips almost all chunks.
colcompress_merge internally:
1. Copies all live (non-deleted) rows to a temporary heap table
2. Truncates the target table
3. Re-inserts rows in the order defined by orderby, writing fresh globally-ordered stripes
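The effect of those three steps can be sketched in Python (toy rows and a tiny stripe size; the real pass streams through a temporary heap table):

```python
# Toy merge: gather all live rows from the existing (unsorted) stripes,
# sort them once by the orderby key, and rewrite non-overlapping stripes.
STRIPE_ROW_LIMIT = 4  # tiny for illustration; the real default is 150,000

def merge(stripes, sort_key):
    live_rows = [row for stripe in stripes for row in stripe]  # step 1: copy
    live_rows.sort(key=sort_key)                               # order for step 3
    # steps 2-3: truncate, then rewrite as globally sorted stripes
    return [live_rows[i:i + STRIPE_ROW_LIMIT]
            for i in range(0, len(live_rows), STRIPE_ROW_LIMIT)]

# Two stripes with interleaved timestamps from out-of-order inserts
stripes = [[(5,), (1,), (7,), (3,)], [(2,), (8,), (4,), (6,)]]
merged = merge(stripes, sort_key=lambda row: row[0])
print(merged)  # [[(1,), (2,), (3,), (4,)], [(5,), (6,), (7,), (8,)]]
```

After the merge the stripes' min/max ranges no longer overlap, which is exactly what makes stripe-level pruning maximally effective.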
Compression Options
Each chunk is compressed independently. Available algorithms:
| Name | Description |
|---|---|
| `none` | No compression |
| `lz4` | Fast compression/decompression, moderate ratio |
| `zstd` | High ratio, configurable level 1–19 (default level: 3) |
| `pglz` | PostgreSQL’s built-in LZ variant |
SELECT engine.alter_colcompress_table_set('events',
compression => 'zstd',
compression_level => 9);
Configuration GUCs
All parameters can be set per-session or globally in postgresql.conf.
GUC availability: the `storage_engine.*` GUCs are registered when the `storage_engine` shared library is loaded. This happens automatically the first time a `colcompress` or `rowcompress` table is accessed in a session. To make the GUCs available immediately in every session (including before any table access), add the extension to `shared_preload_libraries`:
shared_preload_libraries = 'storage_engine'
GUC names use the `storage_engine.` prefix and do not overlap with citus_columnar (which uses the `columnar.` prefix), so both extensions can be loaded simultaneously without conflict.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `storage_engine.enable_parallel_execution` | bool | `on` | Enable parallel custom scan via DSM |
| `storage_engine.min_parallel_processes` | int | `8` | Minimum worker count to launch for parallel scan |
| `storage_engine.enable_vectorization` | bool | `on` | Enable vectorized WHERE/aggregate evaluation |
| `storage_engine.enable_custom_scan` | bool | `on` | Enable projection and qual-pushdown custom scan |
| `storage_engine.enable_column_cache` | bool | `on` | Enable in-memory column chunk cache |
| `storage_engine.enable_columnar_index_scan` | bool | `off` | Enable index-driven columnar scan path (recommended `on` for document/point-lookup repositories) |
| `storage_engine.enable_dml` | bool | `on` | Allow DELETE and UPDATE on colcompress tables |
| `storage_engine.stripe_row_limit` | int | `150000` | Default rows per stripe (1,000 – 100,000,000) |
| `storage_engine.chunk_group_row_limit` | int | `10000` | Default rows per chunk group (1,000 – 100,000,000) |
| `storage_engine.compression_level` | int | `3` | Default compression level for zstd (1 – 19) |
Per-Table Options
Per-table options override the session GUCs for a specific table:
SELECT engine.alter_colcompress_table_set(
'events'::regclass,
chunk_group_row_limit => 10000, -- NULL = leave unchanged
stripe_row_limit => 150000,
compression => 'zstd',
compression_level => 9,
orderby => 'ts ASC, user_id ASC',
index_scan => false -- true = skip cost penalty for index scans on this table
);
-- Reset individual options to system defaults
SELECT engine.alter_colcompress_table_reset(
'events'::regclass,
compression => true,
compression_level => true,
index_scan => true -- resets to false
);
-- Inspect current options
SELECT * FROM engine.colcompress_options WHERE table_name = 'events';
rowcompress AM
How It Works
rowcompress stores rows in fixed-size batches (default 10,000 rows per batch). Each batch is serialized using heap tuple format and then compressed as a single unit. Batch metadata (file offset, byte size, first row number, row count) is stored in engine.row_batch.
Table file (ColumnarStorage layout)
├── Batch 1: [header | row offsets | compressed heap tuple data]
├── Batch 2: [header | row offsets | compressed heap tuple data]
└── …
This AM suits append-heavy workloads where compression matters but column projection is not needed — event logs, audit trails, time-series with many columns all queried together. Typical storage savings of 2–10× with zstd.
Compared to colcompress:
- Reads full rows (no column projection)
- Lower write latency per row (no columnar transposition)
- No vectorized execution or chunk-level pruning
- Supports parallel reads and index scans
- Supports multiple compression algorithms
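The batch layout above can be sketched in a toy Python model (pickle and zlib stand in for heap tuple serialization and zstd; the metadata fields mirror engine.row_batch):

```python
import pickle
import zlib

BATCH_SIZE = 3  # tiny for illustration; the real default is 10,000

def write_batches(rows):
    """Serialize and compress rows in fixed-size batches; record metadata."""
    file_bytes, metadata = b"", []
    for start in range(0, len(rows), BATCH_SIZE):
        batch = rows[start:start + BATCH_SIZE]
        compressed = zlib.compress(pickle.dumps(batch))  # one unit per batch
        metadata.append({"offset": len(file_bytes),
                         "size": len(compressed),
                         "first_row": start + 1,
                         "row_count": len(batch)})
        file_bytes += compressed
    return file_bytes, metadata

def read_batch(file_bytes, meta):
    """Each batch decompresses independently of all others."""
    raw = file_bytes[meta["offset"]:meta["offset"] + meta["size"]]
    return pickle.loads(zlib.decompress(raw))

rows = [(i, f"message {i}") for i in range(1, 8)]
data, meta = write_batches(rows)
print(len(meta))                  # 3 batches: 3 + 3 + 1 rows
print(read_batch(data, meta[1]))  # rows 4..6, decompressed on their own
```

Because each batch is self-contained, parallel workers can decompress different batches independently, which is what the parallel scan below exploits.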
Parallel Scan
rowcompress implements the PostgreSQL parallel scan protocol via atomic batch claiming. Each parallel worker atomically increments a shared counter to claim the next unprocessed batch, decompresses it, and repeats. There is no coordinator or work distribution step — workers self-schedule in a work-stealing fashion with zero contention on most paths.
-- Standard PostgreSQL parallel knobs apply
SET max_parallel_workers_per_gather = 4;
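The claiming scheme can be sketched with Python threads (a lock stands in for the shared-memory atomic fetch-add; class and function names are invented for illustration):

```python
import threading

# Toy self-scheduling scan: each worker atomically claims the next batch
# number from a shared counter, so no coordinator distributes work and no
# batch is ever processed twice.
class BatchClaimer:
    def __init__(self, batch_count):
        self.next_batch, self.batch_count = 0, batch_count
        self.lock = threading.Lock()  # stand-in for an atomic fetch-add

    def claim(self):
        with self.lock:
            if self.next_batch >= self.batch_count:
                return None  # scan finished
            claimed = self.next_batch
            self.next_batch += 1
            return claimed

def worker(claimer, results):
    while (batch := claimer.claim()) is not None:
        results.append(batch)  # a real worker would decompress + scan here

claimer, results = BatchClaimer(batch_count=10), []
threads = [threading.Thread(target=worker, args=(claimer, results))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # every batch claimed exactly once
```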
Per-Table Options
SELECT engine.alter_rowcompress_table_set(
'logs'::regclass,
batch_size => 10000, -- rows per compressed batch (default: 10000)
compression => 'zstd',
compression_level => 5
);
-- Reset to defaults
SELECT engine.alter_rowcompress_table_reset('logs'::regclass, compression => true);
-- Rewrite all batches with current options (e.g. after changing compression)
SELECT engine.rowcompress_repack('logs');
-- Inspect
SELECT * FROM engine.rowcompress_options WHERE table_name = 'logs';
SELECT * FROM engine.rowcompress_batches LIMIT 10;
Management Functions
| Function | Description |
|---|---|
| `engine.alter_colcompress_table_set(regclass, ...)` | Set one or more options on a colcompress table |
| `engine.alter_colcompress_table_reset(regclass, ...)` | Reset colcompress options to system defaults |
| `engine.colcompress_merge(regclass)` | Rewrite and globally sort a colcompress table by its `orderby` key |
| `engine.colcompress_repack(regclass)` | Alias for `colcompress_merge`; drop-in replacement for pg_repack on colcompress tables |
| `engine.alter_rowcompress_table_set(regclass, ...)` | Set one or more options on a rowcompress table |
| `engine.alter_rowcompress_table_reset(regclass, ...)` | Reset rowcompress options to system defaults |
| `engine.rowcompress_repack(regclass)` | Rewrite all batches of a rowcompress table with current options |
Catalog Views
| View | Description |
|---|---|
| `engine.colcompress_options` | Per-table options for all colcompress tables |
| `engine.colcompress_stripes` | Stripe-level metadata (offset, size, row range) per table |
| `engine.rowcompress_options` | Per-table options for all rowcompress tables |
| `engine.rowcompress_batches` | Batch-level metadata for all rowcompress tables |
All views grant SELECT to PUBLIC.
Installation
Build from source
Requires PostgreSQL server headers and pg_config in PATH.
cd dist/
sudo make -j$(nproc) install
Add to postgresql.conf:
shared_preload_libraries = 'storage_engine'
Loading with citus
If citus or pg_cron is also in shared_preload_libraries, the load order matters:
shared_preload_libraries = 'pg_cron,citus,storage_engine'
citus must appear before storage_engine. PostgreSQL registers planner hooks in load order; citus expects to be the outermost hook in the chain. Reversing the order causes PostgreSQL to refuse to start.
Restart PostgreSQL and load the extension:
CREATE EXTENSION storage_engine;
Docker
docker compose up
psql postgres://postgres:password@127.0.0.1:5432
The engine schema and both AMs are created automatically on CREATE EXTENSION.
Known Limitations
No AFTER ROW triggers (and no pg_repack)
colcompress and rowcompress tables do not support AFTER ROW triggers or foreign keys. This is a fundamental architectural constraint: columnar storage does not maintain heap tuples in a form that row-level trigger machinery can inspect.
This means pg_repack cannot be used — it relies internally on an AFTER ROW trigger to capture concurrent changes during its online copy phase:
ERROR: Foreign keys and AFTER ROW triggers are not supported for columnar tables
DETAIL: Tools such as pg_repack use AFTER ROW triggers internally and cannot be used with colcompress or rowcompress tables.
HINT: Use engine.colcompress_repack(table) as a drop-in replacement for pg_repack on colcompress tables.
Use engine.colcompress_repack() instead:
-- Drop-in replacement for: pg_repack -t mytable
SELECT engine.colcompress_repack('adm.documents'::regclass);
-- Repack all colcompress tables in the database:
DO $$
DECLARE r record;
BEGIN
FOR r IN
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_am a ON c.relam = a.oid
WHERE a.amname = 'colcompress' AND c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC
LOOP
RAISE NOTICE 'Repacking %.%...', r.nspname, r.relname;
PERFORM engine.colcompress_repack(
(quote_ident(r.nspname) || '.' || quote_ident(r.relname))::regclass);
END LOOP;
END;
$$;
Limitation vs. pg_repack: `colcompress_repack` acquires `AccessExclusiveLock` for the duration of the operation (reads and writes are blocked). There is no online/concurrent mode. Schedule during a maintenance window for large tables.
AFTER STATEMENT triggers are supported
Only row-level (FOR EACH ROW) AFTER triggers are blocked. Statement-level (FOR EACH STATEMENT) AFTER triggers work fine.
sort-on-write is disabled when indexes exist
When a colcompress table has both an orderby option and one or more B-tree indexes, the sort-on-write path is automatically disabled during writes. This avoids a TID placeholder problem that would corrupt index entries. The table’s stripes are still read in their natural write order; run engine.colcompress_merge() (or colcompress_repack()) to produce globally sorted stripes after loading data.
B-tree indexes on colcompress disable stripe pruning
When a B-tree index exists on a colcompress column, the PostgreSQL planner may choose IndexScan even for queries that would benefit from a full sequential scan with stripe pruning. IndexScan opens the table with randomAccess=true, which bypasses the stripe pruning code path — causing date-range and similar ordered queries to read all stripes instead of only the relevant one.
For analytical tables: do not create B-tree indexes on columns covered by the orderby key. Use GIN indexes for JSONB and array columns, and rely on stripe pruning for range predicates on the sort key. If point-lookup access is occasionally needed on an analytical colcompress table, keep the per-table index_scan option at false, enable SET storage_engine.enable_columnar_index_scan = on at session level for those lookups, and use engine.colcompress_merge() to keep the data globally sorted.
Examples of problematic and safe index patterns:
-- BAD: B-tree index on the orderby column defeats stripe pruning on range queries
CREATE INDEX ON events_col (event_date); -- do NOT do this
-- GOOD: GIN indexes for JSONB / array columns are fine
CREATE INDEX ON events_col USING gin (metadata jsonb_path_ops);
CREATE INDEX ON events_col USING gin (tags);
-- Ensure the planner does not choose IndexScan
SELECT engine.alter_colcompress_table_set('events_col'::regclass, index_scan => false);
No CLUSTER support
CLUSTER (index-ordered physical rewrite) is not implemented for columnar tables. Use engine.colcompress_merge() with an orderby option to achieve equivalent physical ordering.
No VACUUM FULL / table rewrite
VACUUM FULL triggers a table rewrite that is not implemented for colcompress or rowcompress tables. Use engine.colcompress_repack() / engine.rowcompress_repack() instead, which perform the same compaction without the PostgreSQL rewrite machinery.
No unlogged tables
CREATE UNLOGGED TABLE ... USING colcompress is not supported (ERRCODE_FEATURE_NOT_SUPPORTED). Use the default LOGGED persistence.
No speculative insertion (INSERT … ON CONFLICT on non-unique predicates)
Speculative insertion (INSERT … ON CONFLICT) requires a unique index on the conflict target. Conflict detection on arbitrary predicates or without an index is not supported.
Benchmarks
Benchmark suite: 1 000 000 rows, PostgreSQL 18.3, AMD Ryzen 7 5800H (8-core), 40 GB RAM, shared_buffers=10GB. colcompress configured with lz4 compression and orderby = 'event_date ASC' (globally sorted via colcompress_merge).
Two scenarios are measured:
| Scenario | Settings |
|---|---|
| Serial (storage baseline) | JIT=off, max_parallel_workers_per_gather=0 |
| Parallel (real-world simulation) | JIT=on, max_parallel_workers_per_gather=16 |
Serial results — median of 3 runs
| Query | heap | colcompress | rowcompress | citus_columnar |
|---|---|---|---|---|
| Q1 `count(*)` | 38.6ms | 43.7ms | 305ms | 36.9ms |
| Q2 SUM/AVG numeric+double | 182.3ms | 118.3ms | 356ms | 121.4ms |
| Q3 GROUP BY country (10 vals) | 214.4ms | 162.3ms | 382ms | 141.4ms |
| Q4 GROUP BY event_type + p95 | 538.2ms | 452.5ms | 680ms | 469.9ms |
| Q5 date range 1 month | 21.1ms | 23.5ms | 60.0ms | 21.1ms |
| Q6 JSONB `@>` GIN | 121.7ms | 371.4ms | 322ms | 236.9ms |
| Q7 JSONB key + GROUP BY | 386.2ms | 309.0ms | 537ms | 354.4ms |
| Q8 array `@>` GIN | 61.3ms | 329.2ms | 272ms | 143.8ms |
| Q9 LIKE text scan | 147.0ms | 88.3ms | 333ms | 90.3ms |
| Q10 heavy multi-agg | 1908ms | 1902ms | 2067ms | 1914ms |
Q5 on colcompress achieves heap-equivalent performance (23.5ms vs 21.1ms heap) because stripe pruning skips 6 of 7 stripes — data is physically sorted by event_date via the orderby option. lz4 decompression adds negligible overhead over zstd for this stripe-pruned workload while reducing merge time.
Parallel results — median of 3 runs (JIT on, 16 workers)
| Query | heap | colcompress | rowcompress | citus_columnar |
|---|---|---|---|---|
| Q1 `count(*)` | 17.8ms | 16.3ms | 144ms | 37.0ms |
| Q2 SUM/AVG numeric+double | 50.1ms | 30.9ms | 142ms | 121.7ms |
| Q3 GROUP BY country (10 vals) | 57.6ms | 171ms | 151ms | 138ms |
| Q4 GROUP BY event_type + p95 | 539ms | 329ms | 686ms | 473ms |
| Q5 date range 1 month | 21.2ms | 242ms | 69.5ms | 21.0ms |
| Q6 JSONB `@>` GIN | 84.5ms | 42.8ms | 465ms | 235ms |
| Q7 JSONB key + GROUP BY | 391ms | 87.7ms | 692ms | 349ms |
| Q8 array `@>` GIN | 61.7ms | 33.3ms | 275ms | 147ms |
| Q9 LIKE text scan | 48.7ms | 26.8ms | 140ms | 91.0ms |
| Q10 heavy multi-agg | 1951ms | 691ms | 2085ms | 1958ms |
Note: Q5 shows higher latency for colcompress in parallel mode. Each parallel worker scans its assigned block range independently without a global stripe-pruning pass, so all stripes are read. Stripe pruning is effective only in the single-process sequential scan path.
Reproducing the results
The full benchmark kit is in tests/bench/:
createdb bench_am
psql -d bench_am -f tests/bench/setup.sql
# Serial run
bash tests/bench/run.sh 3
python3 tests/bench/chart.py
# Parallel run
bash tests/bench/run_parallel.sh 3
python3 tests/bench/chart_parallel.py
See tests/README.md for full environment description and step-by-step instructions.
Benchmark results above correspond to version 1.0.4 with `lz4` compression and globally sorted stripes.
PostgreSQL Version Compatibility
| PostgreSQL | Status |
|---|---|
| 13 | Supported |
| 14 | Supported |
| 15 | Supported |
| 16 | Supported |
| 17 | Supported |
| 18 | Supported (current development target) |
Attribution
storage_engine is a fork of Hydra Columnar, which is itself derived from citus_columnar — the columnar extension originally built by Citus Data (now part of Microsoft). The original work is copyright Citus Data / Hydra and licensed under the AGPL-3.0.
Key ideas borrowed or adapted from other projects:
| Inspiration | Feature |
|---|---|
| ClickHouse MergeTree | Per-table orderby sort key; zone-map (min/max) pruning at stripe and chunk level |
| Apache Parquet | Row-group statistics; column projection; dictionary encoding |
| DuckDB | Vectorized expression evaluation for columnar batches |
We are grateful to the Hydra and Citus teams for making their work open source.