layout: default title: Limitations

Limitations

Supported PK types

The zone map tracks the first two PK columns. Supported types:

Type Zone map support
int2, int4, int8 Full
timestamp, timestamptz Full
date Full
uuid Lossy (first 8 bytes; UUIDs sharing a prefix may not be pruned)
text, varchar Lossy, requires COLLATE "C" (byte order must equal sort order)

Non-C collation text/varchar columns are not tracked. The table still works but queries on those columns will not benefit from scan pruning.


Online compact/merge restrictions

sorted_heap_compact_online and sorted_heap_merge_online are not supported for tables with UUID, text, or varchar primary keys. The lossy int64 hash representation causes collisions during change replay.

Use the offline variants (sorted_heap_compact, sorted_heap_merge) instead. See Online Compact/Merge for Lossy PK Types for the future lossless replay-key contract.


UPDATE behavior

UPDATE does not re-sort tuples. After many updates, the physical order may drift from PK order. Run sorted_heap_compact or sorted_heap_merge periodically on write-heavy tables.

Eager vs lazy mode

By default (eager mode), every UPDATE that widens a zone map entry flushes the meta page to disk. This keeps scan pruning accurate but adds per-UPDATE WAL overhead (~46% of heap throughput for small-column updates).

Set sorted_heap.lazy_update = on to skip per-UPDATE zone map maintenance. The first UPDATE invalidates the zone map on disk; the planner falls back to Index Scan. Compact or merge restores zone map pruning. INSERT always uses eager maintenance regardless of this setting.

The mode is never activated automatically – choose it based on your workload. See the README “UPDATE modes” section for a decision guide.


Zone map validity

  • After compact or rebuild, the zone map is marked valid and scan pruning is active.
  • Single-row INSERTs into pages already covered by the zone map update it in place (pruning stays active).
  • INSERTs into pages beyond zone map coverage invalidate the flag. VACUUM with sorted_heap.vacuum_rebuild_zonemap = on (default) automatically rebuilds it.

Block range pruning

heap_setscanlimits() supports only contiguous block ranges. For non-contiguous distributions (e.g., after many random inserts without compaction), the scan reads intervening pages but skips tuple processing on pages outside the bounds.

Zone maps are not index-only scans

Zone maps store page-level min/max metadata. They let SortedHeapScan skip heap pages, but they do not store row values or tuple identities and cannot return rows without fetching heap tuples. Any future heap-fetch-avoiding path needs either a metadata-only proof contract or a covering value-bearing sidecar.

See Zone-Map-Only Fast Paths for the design boundary.


sorted_hnsw ordered-scan contract

The current planner-integrated sorted_hnsw path is intentionally narrow:

  • it targets base-relation ORDER BY embedding <=> query LIMIT k
  • it is not used when there is no LIMIT
  • it is not used when LIMIT > sorted_hnsw.ef_search
  • it is not used when extra base-table quals or parameterization would make the current Phase 1 scan under-return candidates

For filtered retrieval flows, materialize/filter first or use the GraphRAG helper/wrapper API instead of treating sorted_hnsw as a general filtered ANN index.


Locking

Operation Lock level
sorted_heap_compact AccessExclusiveLock (blocks all access)
sorted_heap_merge AccessExclusiveLock
sorted_heap_compact_online ShareUpdateExclusiveLock during copy; brief AccessExclusiveLock for swap
sorted_heap_merge_online Same as compact_online
sorted_heap_compact_partitions Calls sorted_heap_compact leaf-by-leaf
sorted_heap_merge_partitions Calls sorted_heap_merge leaf-by-leaf

Only one online compact/merge can run on a table at a time. A second concurrent attempt will fail.

Partition parent helpers are operational wrappers, not global transactions: they preflight unsupported leaves and sorted_heap leaves without primary keys, then process concrete leaves one at a time. They reduce temporary disk-space requirements from “whole logical table” to “current leaf rewrite”, but an individual leaf still needs rewrite headroom. If a later leaf fails, earlier leaves are not rolled back by the helper.

Use sorted_heap_partition_maintenance_plan(parent, operation) before large runs to list all blockers and estimate the current leaf rewrite headroom. The estimate is intentionally conservative and relation-size based. The plan also reports each leaf tablespace so operators can connect the estimate to external tablespace/free-space monitoring; PostgreSQL does not provide a portable SQL-level free-byte metric.

The manual smoke make test-partition-lock verifies the expected lock behavior with two sessions and lock_timeout.

See Huge-Table Compaction Operating Model for the detailed rewrite/free-space contract.


Data migration

  • pg_dump / pg_restore: the zone map needs a compact after restore to re-enable scan pruning.
  • pg_upgrade 16 to 18 and 17 to 18: tested and verified. Data files (including zone map) are copied as-is.

Post-restore checklist:

-- Discover tables/leaves needing post-restore maintenance:
SELECT *
FROM sorted_heap_restore_plan();

-- Concrete sorted_heap table:
SELECT sorted_heap_compact('events'::regclass);

-- Partitioned parent with sorted_heap leaves:
SELECT *
FROM sorted_heap_compact_partitions('events_parent'::regclass);

-- Inspect storage state after maintenance:
SELECT *
FROM sorted_heap_partition_status('events_parent'::regclass);

For sorted_hnsw, rebuild the index/sidecar after restore. It stores physical heap TIDs, and pg_restore rewrites tuples with new TIDs.


ALTER TABLE

Most ALTER TABLE operations work correctly:

Operation Zone map impact
ADD COLUMN No impact
DROP COLUMN (non-PK) No impact
RENAME COLUMN No impact (including PK columns)
ALTER TYPE (non-PK) Table rewrite; compact restores zone map
DROP PRIMARY KEY Disables pruning; re-add PK + compact to restore