Plan: Transactionally Updated Views (Immediate IVM)

Plan: Transactionally Updated Views (Immediate IVM)

Date: 2026-02-28 Status: IN PROGRESS (Phase 1 complete, Phase 3 complete, Phase 4 partially complete) Last Updated: 2026-03-03

1. Overview

This plan proposes adding immediate (transactional) incremental view maintenance to pg_trickle — updating stream tables within the same transaction that modifies a base table. Today, all stream table refreshes happen in separate transactions (either on a schedule or manually). This new mode would complement the existing deferred (scheduled) refresh model and enable pg_trickle to serve as a drop-in replacement for pg_ivm users.

Terminology

Term	Definition
Deferred IVM	Views are updated in a separate transaction (current pg_trickle model).
Immediate IVM	Views are updated in the same transaction as the base table DML (pg_ivm model).
IMMV	Incrementally Maintainable Materialized View (pg_ivm term).
Stream Table	pg_trickle’s equivalent of a materialized view with IVM.

Goals

Allow stream tables to be updated in the same transaction as base table modifications, providing read-your-writes consistency.
Provide a pg_ivm compatibility layer so existing pg_ivm users can migrate with minimal changes.
Maintain backward compatibility — the current deferred/scheduled model remains the default.
Support the subset of SQL that pg_ivm supports in immediate mode, with a path to extending coverage using pg_trickle’s broader operator support.

Non-Goals (out of scope for Phase 1)

Replacing the deferred refresh mode (it remains the default).
Supporting immediate IVM for all SQL constructs pg_trickle supports (window functions, UNION/INTERSECT/EXCEPT, recursive CTEs, LATERAL, etc. will start as deferred-only).
Multi-table concurrent write correctness without locking (pg_ivm also uses ExclusiveLock to serialize).

2. Background: How pg_ivm Works

2.1 Architecture

pg_ivm implements immediate IVM entirely through statement-level AFTER triggers with transition tables (the REFERENCING NEW TABLE AS ... OLD TABLE AS ... syntax). This is the key mechanism:

BEFORE trigger (IVM_immediate_before): Fires before each statement. Takes a snapshot of the pre-modification state. Tracks a count of before trigger invocations per view.
AFTER trigger (IVM_immediate_maintenance): Fires after each statement. Collects transition tables (tg_oldtable, tg_newtable) into in-memory tuplestores. When the last AFTER trigger fires (count matches BEFORE count), the actual delta computation and application runs.
Delta computation: The view’s defining query is rewritten by:
- Replacing each base table RTE with a subquery representing its “pre-update” state (using the snapshot from step 1 + the old transition table).
- Computing “old delta” by substituting the modified table’s RTE with the old transition table and executing the rewritten query.
- Computing “new delta” by substituting with the new transition table.
Delta application: The deltas are applied to the IMMV via SPI:
- Old delta → DELETE matching rows from the IMMV.
- New delta → INSERT/UPDATE rows into the IMMV.
- For aggregates: UPDATE existing group rows with incremental arithmetic (e.g., SET sum = mv.sum + delta.sum).
Concurrency: ExclusiveLock is taken on the IMMV during maintenance. In REPEATABLE READ/SERIALIZABLE, errors are raised if conflicts are detected.

2.2 pg_ivm Limitations

Feature	pg_ivm Support	pg_trickle Support
Inner joins	✓	✓
Outer joins	✓ (limited: simple equijoin only)	✓ (full)
DISTINCT	✓	✓
count/sum/avg/min/max	✓	✓ (+ more aggregates)
Window functions	✗	✓
UNION/INTERSECT/EXCEPT	✗	✓
DISTINCT ON	✗	✓ (via rewrite)
GROUPING SETS/CUBE/ROLLUP	✗	✓ (via rewrite)
Recursive CTEs	✗	✓
LATERAL subqueries	✗	✓
Subqueries in FROM	✓ (simple)	✓
EXISTS subqueries	✓ (simple)	✓
Views as base tables	✗	✓ (via view inlining)
Partitioned tables	✗	✓
Cascading (ST→ST)	✗	✓
User-defined aggregates	✗	✓
Mutable functions	✗	✓ (FULL mode)
HAVING	✗	✓
ORDER BY/LIMIT/OFFSET	✗	not applicable

2.3 pg_ivm API Surface

-- Core functions
pgivm.create_immv(immv_name text, view_definition text) → bigint
pgivm.refresh_immv(immv_name text, with_data bool) → bigint
pgivm.get_immv_def(immv regclass) → text

-- Catalog
pgivm.pg_ivm_immv (immvrelid, viewdef, ispopulated, lastivmupdate)

-- Drop via: DROP TABLE immv_name
-- Rename via: ALTER TABLE immv_name RENAME TO new_name

3. Proposed Architecture for pg_trickle

3.1 New Refresh Mode: `IMMEDIATE`

Add a third refresh mode alongside FULL and DIFFERENTIAL:

refresh_mode = 'FULL' | 'DIFFERENTIAL' | 'IMMEDIATE'

When refresh_mode = 'IMMEDIATE': - Statement-level AFTER triggers with transition tables are installed on all base tables (instead of row-level CDC triggers). - The trigger function computes and applies the delta within the same transaction as the base table DML. - No change buffer tables are used (no pgtrickle_changes.*). - No scheduler involvement — updates are synchronous. - The stream table is always up-to-date within the current transaction.

3.2 Trigger Design

Current CDC Triggers (deferred mode)

AFTER INSERT OR UPDATE OR DELETE  (row-level)
  → writes to pgtrickle_changes.changes_<oid>
  → consumed later by scheduler/manual refresh

New IMMEDIATE Triggers

BEFORE INSERT OR UPDATE OR DELETE  (statement-level)
  → IVM_before: capture snapshot, increment before_count

AFTER INSERT OR UPDATE OR DELETE  (statement-level, with transition tables)
  → IVM_after: collect transition tables, when last trigger fires:
    1. Generate delta SQL from operator tree
    2. Apply delta to stream table
    3. Advance metadata

BEFORE TRUNCATE  (statement-level)
  → IVM_before

AFTER TRUNCATE  (statement-level)
  → Truncate or full-refresh the stream table

Implementation Strategy

Option A: PL/pgSQL trigger calling a Rust pg_extern function

The trigger body is a thin PL/pgSQL wrapper that calls a Rust function:

CREATE TRIGGER pgt_ivm_before_ins
  BEFORE INSERT ON source_table
  FOR EACH STATEMENT
  EXECUTE FUNCTION pgtrickle.pgt_ivm_before(<st_oid>, <lock_mode>);

CREATE TRIGGER pgt_ivm_after_ins
  AFTER INSERT ON source_table
  REFERENCING NEW TABLE AS __pgt_newtable
  FOR EACH STATEMENT
  EXECUTE FUNCTION pgtrickle.pgt_ivm_after(<st_oid>, <lock_mode>);

The Rust pgt_ivm_before and pgt_ivm_after functions handle all the logic:

pgt_ivm_before:
- Register snapshot for pre-state visibility.
- Increment per-view before-trigger counter.
- Acquire appropriate lock on the stream table.
pgt_ivm_after:
- Collect transition tables into tuplestores.
- Increment after-trigger counter.
- When before_count == after_count (last trigger for this statement):
  - Generate delta SQL using the existing DVM engine.
  - Apply delta to stream table via SPI.
  - Update catalog metadata (frontier, data_timestamp).

Option B: C-level trigger functions (like pg_ivm)

Write the trigger functions in C (via pgrx unsafe blocks accessing pg_sys::TriggerData) to directly access transition tables without the PL/pgSQL overhead. This is what pg_ivm does.

Recommendation: Option A for Phase 1, Option B as optimization.

Option A is simpler, safer (fewer unsafe blocks), and leverages pgrx’s existing SPI infrastructure. Option B can be pursued later for performance if the PL/pgSQL overhead is measurable. The critical performance factor is the delta SQL execution, not the trigger dispatch overhead.

3.3 Delta Computation Strategy

Approach 1: Reuse Existing DVM Operator Tree (Recommended)

pg_trickle already has a sophisticated DVM engine that generates delta SQL from an operator tree. For immediate mode, we adapt this:

Pre-compute the delta SQL template at create_stream_table time (already done for deferred mode via CachedMergeTemplate).
In the AFTER trigger, instead of reading from change buffer tables, we create Ephemeral Named Relations (ENRs) from the transition tables and rewrite the delta SQL to reference those ENRs.
The DVM’s Scan operator delta would need an alternative code path: instead of SELECT ... FROM pgtrickle_changes.changes_<oid> WHERE lsn BETWEEN $prev AND $new, it would reference the ENR directly: SELECT ... FROM __pgt_newtable / __pgt_oldtable.

This approach reuses 90%+ of the existing DVM engine code.

Approach 2: Direct Query Rewriting (pg_ivm style)

Rewrite the defining query by replacing base table RTEs, similar to what pg_ivm does. This would be a separate code path from the DVM engine.

Not recommended — duplicates logic and doesn’t leverage pg_trickle’s existing operator tree infrastructure.

3.4 Transition Table Access

PostgreSQL’s transition tables (REFERENCING NEW TABLE AS ... OLD TABLE AS ...) are only available in statement-level AFTER triggers. They provide read-only access to the rows affected by the triggering statement.

Key considerations:

Multi-table views: If a view joins tables A and B, and A is modified, we need A’s transition table plus B’s current state. This is exactly how the current DVM delta works — join the delta of one side against the current state of the other.
Self-joins: If a table appears multiple times in the query, the same transition table is used for all occurrences. pg_ivm handles this with rte_paths tracking.
Cascading: If stream table B depends on stream table A, and A’s base table is modified, first A must be updated (in the same trigger), then B must be updated using A’s changes. This requires ordering trigger execution or handling cascades explicitly within the trigger function.

3.5 Concurrency and Locking

Following pg_ivm’s proven approach:

Scenario	Lock on Stream Table	Rationale
Single base table, no agg/distinct, INSERT only	RowExclusiveLock	Safe because inserts don’t conflict
Multiple base tables, or agg/distinct	ExclusiveLock	Prevents concurrent conflicting updates
TRUNCATE	ExclusiveLock	Full replacement

In READ COMMITTED isolation: - The lock ensures serial maintenance of the stream table. - After acquiring the lock, push a fresh snapshot to see concurrent commits.

In REPEATABLE READ / SERIALIZABLE: - If the lock can’t be acquired immediately, raise an error. - If a concurrent transaction already updated the view, raise an error to prevent inconsistency.

3.6 Catalog Changes

New column in `pgt_stream_tables`

ALTER TABLE pgtrickle.pgt_stream_tables
  ADD COLUMN refresh_mode text NOT NULL DEFAULT 'DIFFERENTIAL';
-- Already exists, but add 'IMMEDIATE' as a valid value

The existing refresh_mode column already stores 'FULL' or 'DIFFERENTIAL'. Add 'IMMEDIATE' as a third option.

New catalog table for IMMEDIATE mode metadata

CREATE TABLE pgtrickle.pgt_ivm_state (
    pgt_id         bigint  PRIMARY KEY REFERENCES pgtrickle.pgt_stream_tables(pgt_id),
    lock_mode      text    NOT NULL DEFAULT 'EXCLUSIVE',
    -- 'EXCLUSIVE' or 'ROW_EXCLUSIVE' (auto-determined)
    last_update_xid xid8,   -- pg_ivm compat: last transaction that updated
    is_populated    boolean NOT NULL DEFAULT false
);

Change tracking differences

For IMMEDIATE mode stream tables: - No change buffer tables (pgtrickle_changes.changes_<oid>) are created. - No CDC row-level triggers are created. - Statement-level triggers with transition tables are created instead. - No frontier tracking (there’s no LSN-based change window). - data_timestamp is updated to now() after each trigger-based refresh.

3.7 API Changes

Updated `create_stream_table`

pgtrickle.create_stream_table(
    name          text,
    query         text,
    schedule      text DEFAULT '1m',
    refresh_mode  text DEFAULT 'DIFFERENTIAL',
    initialize    bool DEFAULT true
) → void

When refresh_mode = 'IMMEDIATE': - schedule is ignored (set to NULL internally). - The stream table is populated immediately (full refresh). - Statement-level IVM triggers are created on all base tables. - The query is validated against IMMEDIATE mode restrictions (see §3.8).

Updated `alter_stream_table`

pgtrickle.alter_stream_table(
    name          text,
    schedule      text DEFAULT NULL,
    refresh_mode  text DEFAULT NULL,
    status        text DEFAULT NULL
) → void

Allow switching between 'DIFFERENTIAL' and 'IMMEDIATE': - DIFFERENTIAL → IMMEDIATE: Drop CDC triggers, create IVM triggers, drop change buffer tables, do a full refresh to ensure consistency. - IMMEDIATE → DIFFERENTIAL: Drop IVM triggers, create CDC triggers, create change buffer tables, do a full refresh. - FULL → IMMEDIATE or IMMEDIATE → FULL: Similar migration.

New: `refresh_stream_table` in IMMEDIATE mode

For IMMEDIATE mode tables, refresh_stream_table() performs a full refresh (like pg_ivm’s refresh_immv(name, true)). This is useful to re-sync after disabling/re-enabling immediate mode.

3.8 Query Restrictions for IMMEDIATE Mode

Phase 1 supports the same query subset as pg_ivm, plus pg_trickle’s auto- rewrite passes. Unsupported constructs in IMMEDIATE mode will error at creation time with a clear message suggesting 'DIFFERENTIAL' mode instead.

Supported in Phase 1:

SELECT ... FROM table (simple scan)
JOIN (inner, left, right, full outer — with equijoin restriction lifted in later phases)
WHERE (filter)
GROUP BY with count, sum, avg, min, max
DISTINCT
Simple subqueries in FROM
EXISTS subqueries in WHERE
Simple WITH (non-recursive CTEs)
Views (via existing auto-rewrite view inlining)
DISTINCT ON (via existing auto-rewrite)
GROUPING SETS/CUBE/ROLLUP (via existing auto-rewrite)

Deferred to later phases:

Recursive CTEs (semi-naive evaluation with fixpoint iteration not validated with transition tables)
User-defined aggregates (needs verification of incremental formulas)

Now supported (Phase 3 complete):

Window functions (partition-based recomputation via transition tables)
LATERAL subqueries and functions (row-scoped recomputation)
Scalar subqueries in SELECT (correlated subquery delta via transition tables)
Cascading IMMEDIATE stream tables (ST depending on another IMMEDIATE ST)

4. pg_ivm Compatibility Layer

4.1 Compatibility Functions

Provide a pgivm schema with wrapper functions:

CREATE SCHEMA IF NOT EXISTS pgivm;

-- create_immv: wraps create_stream_table with IMMEDIATE mode
CREATE FUNCTION pgivm.create_immv(
    immv_name text,
    view_definition text
) RETURNS bigint AS $$
DECLARE
    row_count bigint;
BEGIN
    -- Parse immv_name to extract schema if qualified
    PERFORM pgtrickle.create_stream_table(
        name := immv_name,
        query := view_definition,
        schedule := NULL,
        refresh_mode := 'IMMEDIATE',
        initialize := true
    );
    -- Return row count
    EXECUTE format('SELECT count(*) FROM %I', immv_name) INTO row_count;
    RETURN row_count;
END;
$$ LANGUAGE plpgsql;

-- refresh_immv: wraps refresh_stream_table
CREATE FUNCTION pgivm.refresh_immv(
    immv_name text,
    with_data boolean
) RETURNS bigint AS $$
DECLARE
    row_count bigint;
BEGIN
    IF with_data THEN
        PERFORM pgtrickle.refresh_stream_table(immv_name);
        EXECUTE format('SELECT count(*) FROM %I', immv_name) INTO row_count;
        RETURN row_count;
    ELSE
        -- Disable immediate maintenance
        PERFORM pgtrickle.alter_stream_table(
            name := immv_name,
            status := 'SUSPENDED'
        );
        EXECUTE format('TRUNCATE %I', immv_name);
        RETURN 0;
    END IF;
END;
$$ LANGUAGE plpgsql;

-- get_immv_def: wraps existing catalog
CREATE FUNCTION pgivm.get_immv_def(immv regclass) RETURNS text AS $$
    SELECT st.defining_query
    FROM pgtrickle.pgt_stream_tables st
    WHERE st.pgt_relid = immv;
$$ LANGUAGE sql STABLE;

4.2 Compatibility Catalog View

CREATE VIEW pgivm.pg_ivm_immv AS
SELECT
    st.pgt_relid AS immvrelid,
    st.defining_query AS viewdef,
    st.is_populated AS ispopulated,
    NULL::xid8 AS lastivmupdate  -- or read from pgt_ivm_state
FROM pgtrickle.pgt_stream_tables st
WHERE st.refresh_mode = 'IMMEDIATE';

4.3 Drop/Rename Compatibility

pg_ivm uses DROP TABLE and ALTER TABLE RENAME for IMMVs. pg_trickle stream tables are already regular tables, so DROP TABLE would need to be intercepted to also clean up catalog entries and triggers.

Options: 1. object_access_hook (what pg_ivm uses): Intercept DROP TABLE events and clean up IMMV catalog entries. pg_trickle already has DDL event triggers (_on_sql_drop) that do this. 2. FK cascading + event triggers (current pg_trickle approach): The existing _on_sql_drop event trigger already detects when stream table storage tables are dropped and marks them for cleanup.

The existing mechanism should work. We may need to add explicit DROP TABLE support (currently users use pgtrickle.drop_stream_table()).

4.4 Migration Guide for pg_ivm Users

-- pg_ivm:
SELECT pgivm.create_immv('my_view', 'SELECT a, sum(b) FROM t GROUP BY a');

-- pg_trickle (native):
SELECT pgtrickle.create_stream_table(
    'my_view',
    'SELECT a, sum(b) FROM t GROUP BY a',
    refresh_mode := 'IMMEDIATE'
);

-- pg_trickle (compat layer):
SELECT pgivm.create_immv('my_view', 'SELECT a, sum(b) FROM t GROUP BY a');
-- Works identically!

5. Implementation Phases

Phase 1: Core IMMEDIATE Mode (MVP)

Goal: Single-table & multi-table immediate IVM with aggregates and JOIN.

Implementation Status (2026-07-10): Phase 1 is fully implemented. See implementation notes below for deviations from the original design.

Add IMMEDIATE refresh mode to catalog and API ✅ DONE
- RefreshMode::Immediate variant added to dag.rs with is_immediate() and is_scheduled() helpers.
- create_stream_table accepts 'IMMEDIATE', sets schedule to NULL, skips CDC trigger setup, calls ivm::setup_ivm_triggers() instead.
- Catalog CHECK constraint updated: ('FULL', 'DIFFERENTIAL', 'IMMEDIATE').
- TopK + IMMEDIATE combination rejected at creation time.
- Manual refresh_stream_table() for IMMEDIATE STs does a full refresh.
Implement statement-level IVM triggers ✅ DONE
- New src/ivm.rs module (~572 lines) with:
  - setup_ivm_triggers() — creates 8 triggers per source table (4 BEFORE
    - 4 AFTER with transition tables REFERENCING NEW/OLD TABLE AS).
  - cleanup_ivm_triggers() — drops all triggers and PL/pgSQL functions.
  - PL/pgSQL wrapper functions that copy transition tables to temp tables then call Rust pg_extern functions.
- Deviation from plan: Phase 1 uses PL/pgSQL wrappers + temp tables instead of C-level ENR access (Option A from §3.2). This is simpler and avoids unsafe code. ENR optimization deferred to Phase 4.
- Deviation: No in-memory state tracking (IvmTriggerState hash table) in Phase 1. Each trigger invocation independently loads metadata and applies the delta. Simpler and correct, but slightly less efficient for multi-table views where before/after counting would batch work.
Adapt DVM delta computation for transition tables ✅ DONE
- DeltaSource enum added to src/dvm/diff.rs: ChangeBuffer (default) vs TransitionTable { tables: HashMap<u32, TransitionTableNames> }.
- DiffContext gained delta_source field with with_delta_source() builder method.
- diff_scan() in src/dvm/operators/scan.rs refactored to dispatch between diff_scan_change_buffer() (existing) and diff_scan_transition() (new).
- Transition scan: reads from temp tables, computes __pgt_row_id from PK hash, emits UNION ALL of DELETE (old) + INSERT (new).
Delta application via existing explicit DML path ✅ DONE
- pgt_ivm_apply_delta(pgt_id, source_oid, has_new, has_old) — pg_extern function that loads ST metadata, parses defining query, builds DeltaSource::TransitionTable, generates delta SQL via DVM, materializes to temp table, then applies DELETE + INSERT ON CONFLICT.
- Reuses existing DVM engine — Filter, Project, Join, Aggregate operators work unchanged.
Handle TRUNCATE ✅ DONE
- pgt_ivm_handle_truncate(pgt_id) — pg_extern function that truncates the stream table and re-populates from the defining query.
- BEFORE TRUNCATE trigger acquires advisory lock for serialization.
Basic concurrency: Advisory lock on all IMMEDIATE stream tables ✅ DONE
- Uses pg_advisory_xact_lock(st_oid) as default lock. Simple scan chains (Scan → optional Filter → optional Project) use pg_try_advisory_xact_lock(st_oid) instead for lighter concurrency.
- IvmLockMode enum (Exclusive / RowExclusive) with for_query() analysis determines the lock mode at trigger creation time.
- Lock acquired in BEFORE trigger, released at transaction end.
alter_stream_table mode switching ✅ DONE
- Switching between DIFFERENTIAL↔IMMEDIATE and FULL↔IMMEDIATE is fully supported. Tears down old infrastructure (IVM triggers or CDC triggers), sets up new infrastructure, updates catalog, runs full refresh.
- Validates query restrictions when switching TO IMMEDIATE mode.
- Restores a default schedule (‘1m’) when switching FROM IMMEDIATE.
Query restriction validation for IMMEDIATE mode ✅ DONE
- validate_immediate_mode_support() in src/dvm/parser.rs walks the OpTree and rejects RecursiveCte only. Window functions, LATERAL subqueries, LATERAL functions, and scalar subqueries are now allowed (they all bottom out at Scan nodes which already support transition tables). Clear error message suggests using DIFFERENTIAL mode.
- Called at both create_stream_table and alter_stream_table time.
Delta SQL template caching ✅ DONE
- Thread-local IVM_DELTA_CACHE keyed by (pgt_id, source_oid, has_new, has_old). Avoids re-parsing and re-differentiating the defining query on every trigger invocation.
- Cross-session invalidation via shared cache generation counter.
- invalidate_ivm_delta_cache(pgt_id) for explicit invalidation.
Tests ✅ DONE
7 unit tests for transition table scan path in scan.rs.
1 unit test for RefreshMode::Immediate helpers.
29 E2E tests in tests/e2e_ivm_tests.rs: create, INSERT/UPDATE/DELETE propagation, TRUNCATE, DROP cleanup, TopK rejection, manual refresh, mixed operations, mode switching (DIFFERENTIAL↔IMMEDIATE, FULL↔IMMEDIATE), window function creation + propagation, LATERAL join creation + propagation, scalar subquery creation, cascading IMMEDIATE stream tables, concurrent inserts, recursive CTE rejection, aggregate
- join in IMMEDIATE mode, alter mode switching (recursive CTE rejection, window function acceptance).

Phase 2: pg_ivm Compatibility Layer — POSTPONED

Status: Postponed — not needed for core pg_trickle functionality. Will be revisited if there is user demand for pg_ivm migration support.

Goal: Drop-in replacement for pg_ivm users.

Implement pgivm.* wrapper functions (see §4.1).
Implement pgivm.pg_ivm_immv catalog view (see §4.2).
Support DROP TABLE for IMMEDIATE stream tables via event triggers.
Support ALTER TABLE RENAME to update catalog entries.
Automatic index creation on IMMEDIATE stream tables (matching pg_ivm’s behavior: index on GROUP BY columns, DISTINCT columns, or base table PKs).
Add __ivm_count__ hidden column for views with DISTINCT or aggregates (or use existing __pgt_row_id).
Migration documentation and test suite comparing pg_ivm and pg_trickle behavior.

Phase 3: Extended Query Support

Goal: Support more SQL features in IMMEDIATE mode.

Window functions ✅ DONE — partition-based recomputation via diff_window works unchanged with transition tables. Enabled in check_immediate_support(). E2E tests verify creation + INSERT propagation.
UNION/INTERSECT/EXCEPT ✅ DONE (already allowed; validate_immediate_mode_support passes UNION ALL, INTERSECT, EXCEPT through without restriction).
LATERAL subqueries and functions ✅ DONE — diff_lateral_subquery and diff_lateral_function use ctx.diff_node(child) → Scan → diff_scan_transition(). E2E tests verify creation + INSERT propagation.
Cascading IMMEDIATE stream tables ✅ DONE — DML triggers on ST_A fire ST_B’s IVM triggers via nested trigger execution. Temp table names are scoped by OID/pgt_id. E2E test verifies base → ST_A → ST_B propagation.
Optimized locking ✅ DONE (IvmLockMode::for_query() analysis in src/ivm.rs; simple scan chains use pg_try_advisory_xact_lock, others use pg_advisory_xact_lock).
Scalar subqueries in SELECT ✅ DONE — diff_scalar_subquery uses ctx.diff_node() for both child and subquery nodes. E2E test verifies creation.

Phase 4: Performance Optimization

C-level trigger functions (Option B from §3.2) to eliminate PL/pgSQL overhead.
Delta SQL template caching — pre-compile IMMEDIATE mode delta SQL (already done for deferred mode). ✅ DONE (thread-local IVM_DELTA_CACHE in src/ivm.rs; keyed by (pgt_id, source_oid, has_new, has_old) with cross-session invalidation).
Prepared statement reuse — keep SPI prepared statements across trigger invocations within the same transaction.

Aggregate fast-path optimization — for “pure aggregate” queries (no JOINs, no subqueries, single GROUP BY, all aggregate functions invertible), bypass full delta SQL and emit a single parameterized UPDATE target SET sum = sum + $1, count = count + 1 WHERE group_key = $2. This reduces the per-DML cost to a single index-lookup UPDATE.

Invertible aggregate classification:

Aggregate	Invertible	Delta formula
`COUNT(*)`	Yes	+1 (INSERT), -1 (DELETE)
`SUM(expr)`	Yes	+new_val (INSERT), -old_val (DELETE)
`AVG(expr)`	Partial	Maintain (sum, count) pair; derive avg
`MIN(expr)`	No	Removal of minimum requires full scan
`MAX(expr)`	No	Removal of maximum requires full scan
`COUNT(DISTINCT)`	No	Requires set state
`ARRAY_AGG`	No	Requires ordered state
`PERCENTILE_*`	No	Requires sorted state
`BOOL_AND/OR`	No	Removal requires full scan

Only queries where all aggregates are invertible qualify. Add a GROUP BY cardinality guard: if the estimated number of groups exceeds a threshold (e.g. 100K), fall back to standard delta SQL to avoid per-row UPDATE overhead exceeding batch delta cost.

Benchmarking suite — compare pg_trickle IMMEDIATE vs pg_ivm vs deferred refresh on standard workloads.

Prioritized Remaining Work (post Phase 1)

The following items remain from the original plan. Items marked ✅ are done; the rest are ordered by priority. All remaining items are Phase 4 performance optimizations — the feature surface is complete.

Priority	Item	Phase	Status	Complexity
P0	~~`alter_stream_table` mode switching~~	1	✅ Done	—
P0	~~E2E test validation~~	1	✅ Done (29 tests)	—
P1	~~Query restriction validation~~	1	✅ Done	—
P2	~~pg_ivm compatibility layer~~	2	POSTPONED	—
P2	~~Optimized locking (RowExclusiveLock)~~	3	✅ Done	—
P2	~~Concurrent transaction tests~~	1	✅ Done	—
P3	~~`DROP TABLE` interception for IMMEDIATE STs~~	2	POSTPONED	—
P3	~~Cascading IMMEDIATE stream tables~~	3	✅ Done	—
P3	~~Delta SQL template caching~~	4	✅ Done	—
P3	~~Window functions in IMMEDIATE mode~~	3	✅ Done	—
P3	~~LATERAL subqueries in IMMEDIATE mode~~	3	✅ Done	—
P3	~~Scalar subqueries in IMMEDIATE mode~~	3	✅ Done	—
P3	ENR-based transition table access	4	Not started	High (`unsafe` pg_sys ENR APIs)
P3	In-memory state tracking (`IvmTriggerState`)	1	Not started	High (`unsafe` snapshot/xact callbacks)
P4	Aggregate fast-path optimization	4	Not started	Medium (detect invertible aggs, emit UPDATE)
P4	C-level trigger functions	4	Not started	Very High (`unsafe` TriggerData access)
P4	Prepared statement reuse	4	Not started	Medium (`unsafe` SPI_prepare/SPI_keepplan)

Assessment of Remaining Items

ENR-based transition table access (P3): Replace PL/pgSQL wrappers that copy transition tables to temp tables (CREATE TEMP TABLE ... ON COMMIT DROP AS SELECT * FROM __pgt_newtable) with Ephemeral Named Relations registered directly in the SPI executor. Eliminates CREATE/DROP overhead per trigger invocation. Requires unsafe access to pg_sys::EphemeralNamedRelation* APIs. Estimated ~200 lines of unsafe code.

In-memory state tracking (P3): Track per-ST before/after trigger counts to batch delta application when multiple source tables of the same ST are modified in a single statement. Marginal benefit for most use cases (single- table or single-source-modified queries). Requires unsafe snapshot management and RegisterXactCallback for abort cleanup.

Aggregate fast-path (P4): For “pure aggregate” queries (single GROUP BY, all aggregates invertible — COUNT, SUM, AVG), bypass the full DVM delta pipeline and emit a single UPDATE st SET sum = sum + $val WHERE key = $key. Requires invertible-aggregate detection, empty-group handling (delete when count reaches 0), and new-group insertion. Estimated ~300 lines.

C-level trigger functions (P4): Replace PL/pgSQL trigger wrappers with C-level trigger functions that access TriggerData and Tuplestorestate directly. Maximum performance but highest risk. Requires extensive unsafe code and thorough testing.

Prepared statement reuse (P4): Cache SPI prepared statement handles across trigger invocations within the same transaction. Avoids PostgreSQL’s parse → analyze → plan overhead on repeated trigger firings. Requires unsafe SPI_prepare / SPI_keepplan calls (not exposed by pgrx).

6. Technical Deep-Dive

6.1 Trigger Registration

For each base table referenced by an IMMEDIATE stream table, create:

-- BEFORE triggers (one per DML type)
CREATE TRIGGER pgt_ivm_before_ins_{st_oid}
  BEFORE INSERT ON {source_table}
  FOR EACH STATEMENT
  EXECUTE FUNCTION pgtrickle.pgt_ivm_before('{st_oid}', '{lock_mode}');

CREATE TRIGGER pgt_ivm_before_upd_{st_oid}
  BEFORE UPDATE ON {source_table}
  FOR EACH STATEMENT
  EXECUTE FUNCTION pgtrickle.pgt_ivm_before('{st_oid}', '{lock_mode}');

CREATE TRIGGER pgt_ivm_before_del_{st_oid}
  BEFORE DELETE ON {source_table}
  FOR EACH STATEMENT
  EXECUTE FUNCTION pgtrickle.pgt_ivm_before('{st_oid}', '{lock_mode}');

CREATE TRIGGER pgt_ivm_before_trunc_{st_oid}
  BEFORE TRUNCATE ON {source_table}
  FOR EACH STATEMENT
  EXECUTE FUNCTION pgtrickle.pgt_ivm_before('{st_oid}', 'true');

-- AFTER triggers (with transition tables)
CREATE TRIGGER pgt_ivm_after_ins_{st_oid}
  AFTER INSERT ON {source_table}
  REFERENCING NEW TABLE AS __pgt_newtable
  FOR EACH STATEMENT
  EXECUTE FUNCTION pgtrickle.pgt_ivm_after('{st_oid}', '{lock_mode}');

CREATE TRIGGER pgt_ivm_after_upd_{st_oid}
  AFTER UPDATE ON {source_table}
  REFERENCING OLD TABLE AS __pgt_oldtable NEW TABLE AS __pgt_newtable
  FOR EACH STATEMENT
  EXECUTE FUNCTION pgtrickle.pgt_ivm_after('{st_oid}', '{lock_mode}');

CREATE TRIGGER pgt_ivm_after_del_{st_oid}
  AFTER DELETE ON {source_table}
  REFERENCING OLD TABLE AS __pgt_oldtable
  FOR EACH STATEMENT
  EXECUTE FUNCTION pgtrickle.pgt_ivm_after('{st_oid}', '{lock_mode}');

CREATE TRIGGER pgt_ivm_after_trunc_{st_oid}
  AFTER TRUNCATE ON {source_table}
  FOR EACH STATEMENT
  EXECUTE FUNCTION pgtrickle.pgt_ivm_after('{st_oid}', 'true');

6.2 In-Memory State Management

/// Per-session hash table for tracking IVM trigger state.
/// Similar to pg_ivm's mv_trigger_info.
struct IvmTriggerState {
    /// Keyed by stream table OID.
    entries: HashMap<Oid, IvmTriggerEntry>,
}

struct IvmTriggerEntry {
    /// OID of the stream table being maintained.
    st_oid: Oid,
    /// Snapshot taken just before the first modification.
    snapshot: Option<pg_sys::Snapshot>,
    /// Count of BEFORE trigger invocations.
    before_count: u32,
    /// Count of AFTER trigger invocations.
    after_count: u32,
    /// Transition tables collected from AFTER triggers.
    tables: Vec<IvmTriggerTable>,
    /// Whether any old tuples were seen (DELETE/UPDATE).
    has_old: bool,
    /// Whether any new tuples were seen (INSERT/UPDATE).
    has_new: bool,
}

struct IvmTriggerTable {
    /// OID of the modified base table.
    table_oid: Oid,
    /// Tuplestores for deleted/updated-from tuples.
    old_tuplestores: Vec<*mut pg_sys::Tuplestorestate>,
    /// Tuplestores for inserted/updated-to tuples.
    new_tuplestores: Vec<*mut pg_sys::Tuplestorestate>,
}

This state is stored in TopTransactionContext memory and is automatically freed on transaction abort/commit. An AtAbort_IVM callback (registered via RegisterXactCallback) handles cleanup on abort.

6.3 Delta SQL Generation for Transition Tables

The key adaptation is in the Scan operator’s delta generation:

Current (deferred mode): sql -- Delta SQL reads from change buffer table SELECT action, new_col1, new_col2, old_col1, old_col2 FROM pgtrickle_changes.changes_12345 WHERE lsn > $prev_lsn AND lsn <= $new_lsn

New (immediate mode): ```sql – Delta SQL reads from transition table ENRs – For INSERT: SELECT ‘I’ AS action, col1, col2 FROM __pgt_newtable

– For DELETE: SELECT ’D' AS action, col1, col2 FROM __pgt_oldtable

– For UPDATE: SELECT ’D' AS action, col1, col2 FROM pgt_oldtable UNION ALL SELECT ‘I’ AS action, col1, col2 FROM pgt_newtable ```

The transition tables are registered as Ephemeral Named Relations (ENRs) in the query environment, making them accessible from SPI queries.

6.4 Cascading IMMEDIATE Stream Tables

When ST B depends on ST A (both IMMEDIATE), and a base table of A is modified:

A’s AFTER trigger fires, computes delta, updates A.
A’s update fires B’s AFTER trigger (because A is B’s base table).
B’s AFTER trigger computes delta using A’s transition (the rows that were inserted/deleted/updated in A), and updates B.

This requires that: - Stream tables be recognized as valid “base tables” for IMMEDIATE triggers. - The trigger on A fires B’s IVM triggers via the DML it performs on A. - Topological ordering is implicitly handled by trigger nesting.

Risk: Deep cascading chains may exceed PostgreSQL’s trigger nesting limit or cause performance issues. This is deferred to Phase 3.

6.5 Transaction Lifecycle Integration

User issues: INSERT INTO base_table VALUES (...)

PostgreSQL:
  1. Execute INSERT
  2. Fire BEFORE statement triggers
     → pgt_ivm_before: register snapshot, increment before_count
  3. Insert rows (row-level triggers fire if any)
  4. Fire AFTER statement triggers
     → pgt_ivm_after: collect transition table
     → before_count == after_count? Yes.
       → CommandCounterIncrement() [make inserted rows visible]
       → Acquire ExclusiveLock on stream table
       → Generate delta SQL (ENR-based)
       → Execute delta via SPI: DELETE old rows, INSERT/UPDATE new rows
       → Update catalog: data_timestamp, last_update_xid
  5. Return to user — stream table is already updated

7. Comparison with Current Architecture

Aspect	Deferred (current)	Immediate (proposed)
When updates happen	In scheduler or manual refresh (separate transaction)	In same transaction as DML
Change capture	Row-level AFTER triggers → change buffer tables	Statement-level AFTER triggers with transition tables
Delta source	Change buffer tables with LSN ranges	In-memory transition tuplestores (ENRs)
Consistency	Eventually consistent (staleness = schedule interval)	Strongly consistent (read-your-writes)
Write overhead	~5-15% per DML (trigger writes to buffer table)	~10-50%+ per DML (full delta computation inline)
Read overhead	None	None
Concurrency	No locking on stream table during DML	ExclusiveLock on stream table during DML
Batching	Changes accumulate, applied in bulk	Applied per-statement
Query support	Full pg_trickle operator set	Subset (Phase 1: pg_ivm equivalent)
Cascading	Scheduler handles topological order	Trigger nesting handles order
Background worker	Required (scheduler)	Not required

8. Risk Assessment

Risk	Severity	Mitigation
Write performance impact	High	Clearly document tradeoffs. Default remains deferred. Users opt-in.
Lock contention	High	ExclusiveLock may block concurrent readers. Consider AccessShareLock compatibility. Document isolation level implications.
Trigger nesting depth	Medium	Limit cascade depth. PostgreSQL’s `max_stack_depth` is the bound.
Transition table memory	Medium	Large DML may cause memory pressure. Document `work_mem` tuning. pg_ivm has the same limitation.
Unsafe code in trigger functions	Medium	Wrap all `pg_sys::*` calls in safe abstractions. Thorough review of SAFETY comments.
Snapshot management complexity	High	Carefully follow pg_ivm’s proven snapshot lifecycle. Test under all isolation levels.
Query restriction validation	Low	Clear error messages pointing users to DIFFERENTIAL mode.
Compatibility edge cases	Medium	Comprehensive test suite comparing pg_ivm and pg_trickle behavior.

9. Open Questions

Should IMMEDIATE mode support the __pgt_row_id column? pg_ivm doesn’t use content-hash IDs — it matches rows by all columns (for non-aggregate) or by GROUP BY keys (for aggregates). We could either:
- (a) Keep __pgt_row_id for consistency with deferred mode.
- (b) Omit it for IMMEDIATE mode and match by column values (pg_ivm compat). This means IMMEDIATE stream tables have a different schema than DIFFERENTIAL ones.
Should we support mixed mode? E.g., a stream table that is normally IMMEDIATE but can be temporarily switched to DEFERRED for bulk loads (like pg_ivm’s refresh_immv(name, false) + refresh_immv(name, true) pattern).
How to handle ALTER TABLE on base tables of IMMEDIATE stream tables? Currently, DDL event triggers set needs_reinit. For IMMEDIATE mode, should we immediately reinitialize (full refresh + recreate triggers)?
Should the compatibility layer live in the pg_trickle extension or as a separate SQL file? Keeping it in-extension simplifies installation but adds namespace complexity.
Trigger naming conflicts: If a base table already has pg_ivm triggers, what happens? We should detect this and advise the user.

10. Success Criteria

Functional: INSERT/UPDATE/DELETE on a base table immediately updates all IMMEDIATE stream tables within the same transaction. SELECT from the stream table after the DML (in the same transaction) returns up-to-date results.
pg_ivm compatibility: The pgivm.create_immv() wrapper produces behavior indistinguishable from pg_ivm for all supported query types. A pg_ivm test suite ported to pg_trickle should pass.
Performance: Write overhead for IMMEDIATE mode is within 2x of pg_ivm for equivalent queries. Deferred mode is not affected.
Reliability: No data corruption under concurrent transactions. Proper behavior under all isolation levels (READ COMMITTED, REPEATABLE READ, SERIALIZABLE).
Observability: IMMEDIATE stream tables appear in pgtrickle.pgt_status with refresh_mode = 'IMMEDIATE' and accurate data_timestamp.

11. References

pg_ivm source: https://github.com/sraoss/pg_ivm
PostgreSQL transition tables: https://www.postgresql.org/docs/current/trigger-definition.html
DBSP: Budiu et al., “DBSP: Automatic Incremental View Maintenance” (VLDB 2023)
Gupta & Mumick: “Maintenance of Materialized Views: Problems, Techniques, and Applications” (1995)
pg_trickle ADR-001: Triggers as default CDC mechanism
pg_trickle ADR-003: Query differentiation via operator tree
pg_trickle ADR-006: Explicit DML for user triggers

12. ADR Reference

This plan would result in the following new ADR:

ADR-008: Immediate (Transactional) IVM Mode

Field	Value
Status	Proposed
Category	IVM Engine
Date	2026-02-28

Decision: Add an IMMEDIATE refresh mode that uses statement-level AFTER triggers with transition tables to update stream tables in the same transaction as base table DML. This provides read-your-writes consistency and enables pg_ivm compatibility.

Options Considered: 1. Statement-level triggers with transition tables (chosen — matches pg_ivm’s proven approach) 2. Row-level triggers with deferred constraint triggers (rejected — no transition table access, complex) 3. Custom executor hooks (rejected — too invasive, fragile across PG versions) 4. Logical decoding within transaction (rejected — not possible; WAL is only visible after commit)

Key Points: - Complementary to existing deferred mode, not a replacement - Reuses DVM engine with ENR-based delta source - ExclusiveLock for concurrency safety (pg_ivm’s approach) - Phase 1 targets pg_ivm feature parity; later phases extend coverage

PGXN

PostgreSQL Extension Network

Contents