storage_engine — Benchmark Replication Guide
- Test Environment
  - postgresql.conf settings used
- Prerequisites
Ubuntu / Debian (example for PostgreSQL 18):

storage_engine — Benchmark Replication Guide

This folder contains everything needed to reproduce the official storage_engine benchmark results independently.

Test Environment

Component	Details
CPU	AMD Ryzen 7 5800H (8 cores / 16 threads @ 3.2–4.4 GHz)
RAM	40 GB DDR4
OS	Ubuntu 24.04 LTS (x86_64)
PostgreSQL	18.3 (Ubuntu 18.3-1.pgdg24.04+1)
storage_engine	1.0
citus	14.0.0 (for `citus_columnar` comparison)
pg_cron	1.6
shared_buffers	10 GB
work_mem	256 MB
Dataset	1 000 000 rows, ~388 MB heap / ~78 MB colcompress / ~106 MB rowcompress / ~48 MB columnar

`postgresql.conf` settings used

shared_preload_libraries = 'pg_cron,citus,storage_engine'
shared_buffers = 10GB
work_mem = 256MB
max_parallel_workers_per_gather = 8   # system default; overridden per test
jit = on                              # system default; overridden per test

Prerequisites

Build and install storage_engine (from the repo root): bash cd dist/ sudo make -j$(nproc) install
Install citus (optional — needed for the citus_columnar column): ```bash

Ubuntu / Debian (example for PostgreSQL 18):

sudo apt install postgresql-18-citus-14.0 ```
Configure shared_preload_libraries in postgresql.conf or postgresql.auto.conf: shared_preload_libraries = 'pg_cron,citus,storage_engine'

Order matters: citus must appear before storage_engine. PostgreSQL will fail to start if storage_engine is listed first.
Restart PostgreSQL: bash sudo systemctl restart postgresql@18-main
Install Python dependencies (for chart generation): bash pip install matplotlib numpy

Step-by-step Replication

1 — Create the database

createdb bench_am

2 — Load schema and data

psql -d bench_am -f tests/bench/setup.sql

This creates four tables (events_heap, events_col, events_row, events_cit) with 1 000 000 rows each and builds the appropriate indexes.

Note about events_cit: The events_cit table uses citus_columnar and is commented out in setup.sql by default. If citus is installed, uncomment the relevant blocks before running the script.

Note about btree indexes on events_col: The setup script intentionally does not create btree indexes on events_col. A btree index causes PostgreSQL’s planner to prefer IndexScan with randomAccess=true, which disables stripe pruning in the columnar engine and makes date-range queries (Q5) approximately 10× slower. Only GIN indexes are created for JSONB/array columns.

3 — Run the serial benchmark (baseline)

cd tests/bench/
bash run.sh 3

Disable JIT and parallelism to isolate raw storage/decompression costs: - SET jit = off - SET max_parallel_workers_per_gather = 0

Results are saved to results_serial.csv.

4 — Generate the serial chart

python3 chart.py
# → benchmark.png
# → benchmark.svg

5 — Run the parallel benchmark (real-world simulation)

bash run_parallel.sh 3

Enables JIT and 16 parallel workers to simulate production multi-core workloads: - SET jit = on - SET max_parallel_workers_per_gather = 16

Results are saved to results_parallel.csv.

6 — Generate the parallel chart

python3 chart_parallel.py
# → benchmark_parallel.png
# → benchmark_parallel.svg

Query Descriptions

Query	SQL pattern	What it measures
Q1	`COUNT(*)`	Full-scan decompression throughput
Q2	`SUM/AVG` on numeric + double	Numeric decompression + aggregation
Q3	`GROUP BY country_code`	Low-cardinality group-by
Q4	`GROUP BY event_type` + `percentile_disc(0.95)`	Ordered-set aggregate
Q5	Date range filter (1 month)	Stripe pruning on sorted columnar data
Q6	`JSONB @> GIN`	GIN index seek + columnar fetch
Q7	`JSONB ->>'key'` + `GROUP BY`	JSONB extraction + aggregation
Q8	`array @> GIN`	Array GIN index + columnar fetch
Q9	`LIKE '/page/1%'`	Text scan
Q10	Heavy multi-aggregate (10 agg functions)	Vectorized aggregate throughput

Key Observations

Serial mode (JIT off, no parallelism)

Q1–Q3, Q9–Q10: colcompress is 1.2–2× faster than heap due to decompression reducing I/O (data fits in fewer cache lines).
Q5 (date range): colcompress achieves ~10× speedup over heap when data is physically sorted by event_date (set via ALTER TABLE events_col SET (orderby = 'event_date ASC')). The columnar engine’s stripe-level chunk filter skips stripes whose min/max date range does not overlap the query predicate, reading only the one relevant stripe out of seven.
Q6, Q8: heap wins — GIN index seeks return scattered row IDs that map to random stripe reads, negating columnar compression benefits.

Parallel mode (JIT on, 16 workers)

Most queries see significant speedups across all AMs due to parallelism.
Q5 shows no stripe-pruning benefit in parallel mode: each parallel worker receives a disjoint block range and scans its assigned blocks without the global stripe-pruning pass, so all stripes are read.
colcompress still wins on aggregate-heavy queries (Q10: ~3× vs heap) due to vectorized execution in parallel workers.

Troubleshooting

EXPLAIN crashes with citus loaded This was a known bug in storage_engine ≤ 1.0 where IsCreateTableAs(NULL) was called with a NULL query_string passed internally by citus, causing a SIGSEGV. Fixed in storage_engine 1.0 (commit: add NULL guard in IsCreateTableAs in engine_planner_hook.c).

Q5 is slow on events_col Ensure: (a) there are no btree indexes on events_col, (b) index_scan=false is set, and © data was inserted in event_date order (as setup.sql does via the ORDER BY event_date clause in the INSERT … SELECT from events_heap).

citus must appear before storage_engine PostgreSQL enforces strict shared_preload_libraries ordering for extensions that register planner hooks. citus registers its hook at startup and expects to be the outermost hook in the chain. Placing storage_engine before citus causes PostgreSQL to refuse to start.

PGXN

PostgreSQL Extension Network

Contents

storage_engine — Benchmark Replication Guide

Test Environment

`postgresql.conf` settings used

Prerequisites

Ubuntu / Debian (example for PostgreSQL 18):

Step-by-step Replication

1 — Create the database

2 — Load schema and data

3 — Run the serial benchmark (baseline)

4 — Generate the serial chart

5 — Run the parallel benchmark (real-world simulation)

6 — Generate the parallel chart

Query Descriptions

Key Observations

Serial mode (JIT off, no parallelism)

Parallel mode (JIT on, 16 workers)

Troubleshooting

PGXN

PostgreSQL Extension Network

Contents

storage_engine — Benchmark Replication Guide

Test Environment

postgresql.conf settings used

Prerequisites

Ubuntu / Debian (example for PostgreSQL 18):

Step-by-step Replication

1 — Create the database

2 — Load schema and data

3 — Run the serial benchmark (baseline)

4 — Generate the serial chart

5 — Run the parallel benchmark (real-world simulation)

6 — Generate the parallel chart

Query Descriptions

Key Observations

Serial mode (JIT off, no parallelism)

Parallel mode (JIT on, 16 workers)

Troubleshooting

`postgresql.conf` settings used