Contents
storage_engine — Benchmark Replication Guide
This folder contains everything needed to reproduce the official storage_engine benchmark results independently.
Test Environment
| Component | Details |
|---|---|
| CPU | AMD Ryzen 7 5800H (8 cores / 16 threads @ 3.2–4.4 GHz) |
| RAM | 40 GB DDR4 |
| OS | Ubuntu 24.04 LTS (x86_64) |
| PostgreSQL | 18.3 (Ubuntu 18.3-1.pgdg24.04+1) |
| storage_engine | 1.0 |
| citus | 14.0.0 (for citus_columnar comparison) |
| pg_cron | 1.6 |
| shared_buffers | 10 GB |
| work_mem | 256 MB |
| Dataset | 1 000 000 rows, ~388 MB heap / ~78 MB colcompress / ~106 MB rowcompress / ~48 MB columnar |
postgresql.conf settings used
shared_preload_libraries = 'pg_cron,citus,storage_engine'
shared_buffers = 10GB
work_mem = 256MB
max_parallel_workers_per_gather = 8 # system default; overridden per test
jit = on # system default; overridden per test
Prerequisites
Build and install storage_engine (from the repo root):
bash cd dist/ sudo make -j$(nproc) installInstall citus (optional — needed for the
citus_columnarcolumn): ```bashUbuntu / Debian (example for PostgreSQL 18):
sudo apt install postgresql-18-citus-14.0 ```
Configure
shared_preload_librariesinpostgresql.conforpostgresql.auto.conf:shared_preload_libraries = 'pg_cron,citus,storage_engine'Order matters: citus must appear before storage_engine. PostgreSQL will fail to start if storage_engine is listed first.
Restart PostgreSQL:
bash sudo systemctl restart postgresql@18-mainInstall Python dependencies (for chart generation):
bash pip install matplotlib numpy
Step-by-step Replication
1 — Create the database
createdb bench_am
2 — Load schema and data
psql -d bench_am -f tests/bench/setup.sql
This creates four tables (events_heap, events_col, events_row, events_cit)
with 1 000 000 rows each and builds the appropriate indexes.
Note about
events_cit: Theevents_cittable usescitus_columnarand is commented out insetup.sqlby default. If citus is installed, uncomment the relevant blocks before running the script.Note about btree indexes on
events_col: The setup script intentionally does not create btree indexes onevents_col. A btree index causes PostgreSQL’s planner to preferIndexScanwithrandomAccess=true, which disables stripe pruning in the columnar engine and makes date-range queries (Q5) approximately 10× slower. Only GIN indexes are created for JSONB/array columns.
3 — Run the serial benchmark (baseline)
cd tests/bench/
bash run.sh 3
Disable JIT and parallelism to isolate raw storage/decompression costs:
- SET jit = off
- SET max_parallel_workers_per_gather = 0
Results are saved to results_serial.csv.
4 — Generate the serial chart
python3 chart.py
# → benchmark.png
# → benchmark.svg
5 — Run the parallel benchmark (real-world simulation)
bash run_parallel.sh 3
Enables JIT and 16 parallel workers to simulate production multi-core workloads:
- SET jit = on
- SET max_parallel_workers_per_gather = 16
Results are saved to results_parallel.csv.
6 — Generate the parallel chart
python3 chart_parallel.py
# → benchmark_parallel.png
# → benchmark_parallel.svg
Query Descriptions
| Query | SQL pattern | What it measures |
|---|---|---|
| Q1 | COUNT(*) |
Full-scan decompression throughput |
| Q2 | SUM/AVG on numeric + double |
Numeric decompression + aggregation |
| Q3 | GROUP BY country_code |
Low-cardinality group-by |
| Q4 | GROUP BY event_type + percentile_disc(0.95) |
Ordered-set aggregate |
| Q5 | Date range filter (1 month) | Stripe pruning on sorted columnar data |
| Q6 | JSONB @> GIN |
GIN index seek + columnar fetch |
| Q7 | JSONB ->>'key' + GROUP BY |
JSONB extraction + aggregation |
| Q8 | array @> GIN |
Array GIN index + columnar fetch |
| Q9 | LIKE '/page/1%' |
Text scan |
| Q10 | Heavy multi-aggregate (10 agg functions) | Vectorized aggregate throughput |
Key Observations
Serial mode (JIT off, no parallelism)
- Q1–Q3, Q9–Q10: colcompress is 1.2–2× faster than heap due to decompression reducing I/O (data fits in fewer cache lines).
- Q5 (date range): colcompress achieves ~10× speedup over heap when data is
physically sorted by
event_date(set viaALTER TABLE events_col SET (orderby = 'event_date ASC')). The columnar engine’s stripe-level chunk filter skips stripes whose min/max date range does not overlap the query predicate, reading only the one relevant stripe out of seven. - Q6, Q8: heap wins — GIN index seeks return scattered row IDs that map to random stripe reads, negating columnar compression benefits.
Parallel mode (JIT on, 16 workers)
- Most queries see significant speedups across all AMs due to parallelism.
- Q5 shows no stripe-pruning benefit in parallel mode: each parallel worker receives a disjoint block range and scans its assigned blocks without the global stripe-pruning pass, so all stripes are read.
- colcompress still wins on aggregate-heavy queries (Q10: ~3× vs heap) due to vectorized execution in parallel workers.
Troubleshooting
EXPLAIN crashes with citus loaded
This was a known bug in storage_engine ≤ 1.0 where IsCreateTableAs(NULL)
was called with a NULL query_string passed internally by citus, causing a
SIGSEGV. Fixed in storage_engine 1.0 (commit: add NULL guard in
IsCreateTableAs in engine_planner_hook.c).
Q5 is slow on events_col
Ensure: (a) there are no btree indexes on events_col, (b) index_scan=false
is set, and © data was inserted in event_date order (as setup.sql does
via the ORDER BY event_date clause in the INSERT … SELECT from
events_heap).
citus must appear before storage_engine
PostgreSQL enforces strict shared_preload_libraries ordering for extensions
that register planner hooks. citus registers its hook at startup and expects to
be the outermost hook in the chain. Placing storage_engine before citus causes
PostgreSQL to refuse to start.