Contents
- Differential Refresh for Stream-Table-to-Stream-Table Dependencies
Differential Refresh for Stream-Table-to-Stream-Table Dependencies
Status: Proposed
Date: 2026-03-25
Design Goal: Maximum performance and differential propagation — FULL refreshes never cascade as FULL to downstream STs.
Related: PLAN_DAG_PERFORMANCE.md · ARCHITECTURE.md · CONFIGURATION.md
1. Problem
When stream table B depends on stream table A, the scheduler is forced to use
RefreshAction::Full every time A’s data changes. This is because there is no
CDC change buffer for stream tables — only for base tables. The scheduler
detects upstream staleness by comparing data_timestamp values, but has no
way to know which rows changed. A DIFFERENTIAL refresh would be a no-op
because the DVM engine expects to read changes from a pgtrickle_changes
buffer table, and none exists for ST sources.
Why this matters:
- A FULL refresh reads the entire upstream table and re-executes the downstream’s defining query against it, regardless of how many rows changed.
- For a 10M-row upstream ST where 100 rows changed, a FULL refresh does ~100,000× more work than necessary.
- In ST-to-ST chains (A → B → C), every hop is FULL. Each ST in the chain re-reads the full output of its predecessor.
- Wide DAGs with ST dependencies at every level compound this waste across all levels.
2. Current Architecture
How Base-Table CDC Works
When a base table is used as a source, pg_trickle creates:
- A change buffer table in the
pgtrickle_changesschema:sql CREATE TABLE pgtrickle_changes.changes_{source_oid} ( change_id BIGSERIAL, lsn PG_LSN NOT NULL, action CHAR(1) NOT NULL, -- 'I' | 'U' | 'D' | 'T' pk_hash BIGINT, changed_cols BIGINT, new_{col1} TYPE, old_{col1} TYPE, ... ); CREATE INDEX idx_changes_{oid}_lsn_pk_cid ON pgtrickle_changes.changes_{oid} (lsn, pk_hash, change_id) INCLUDE (action); - AFTER triggers on INSERT/UPDATE/DELETE that write rows into this buffer with the current LSN.
During a DIFFERENTIAL refresh, the DVM reads the buffer, computes a delta (query-specific differentiation), and applies it via MERGE:
[source table] → [CDC trigger] → [changes_OID buffer]
↓
[DVM delta SQL] ←── reads LSN range ────┘
↓
[MERGE INTO stream_table]
How ST-to-ST Works Today
There is no change buffer. The scheduler detects staleness via timestamp comparison, and forces FULL:
[upstream ST A refreshed] → data_timestamp updated
↓
[scheduler checks B] → A.data_timestamp > B.data_timestamp?
↓ yes
[RefreshAction::Full for B] → re-reads ALL of A's rows
The relevant code path in the scheduler (refresh_single_st()):
let (has_changes, has_stream_table_changes) = upstream_change_state(&st, ...);
let action = if has_changes && has_stream_table_changes {
RefreshAction::Full // ← forced FULL, no alternative
} else {
refresh::determine_refresh_action(&st, has_changes) // can be Differential
};
3. Proposed Solution: Per-ST Change Buffers
Core Idea
After a stream table’s differential (or full) refresh produces a delta and applies it via MERGE, capture the delta into a change buffer table that downstream STs can consume. This makes ST sources look exactly like base-table sources to the DVM pipeline — the same delta SQL, frontier tracking, and cleanup logic all work unchanged.
[upstream ST A refreshed]
↓
[DVM produces delta rows: __pgt_row_id, __pgt_action, cols...]
↓
[MERGE INTO A's storage table] ← existing
[INSERT delta INTO changes_pgt_{A.id}] ← NEW: capture for downstream
↓
[downstream ST B: DIFFERENTIAL refresh]
↓
[DVM reads changes_pgt_{A.id}] ← same DVM pipeline as base tables
[MERGE INTO B's storage table]
[INSERT delta INTO changes_pgt_{B.id}] ← cascade continues
Why This Approach
- Reuses the entire existing DVM/MERGE pipeline. No new query logic, no new delta-computation model. The change buffer schema is identical.
- Frontier tracking works unchanged. Each downstream ST already tracks per-source LSN frontiers. We assign LSNs to ST buffer rows and the existing min-frontier cleanup logic handles multi-consumer scenarios.
- Opt-in per ST. Not every ST needs a change buffer — only those with downstream ST dependents. This avoids storage waste.
4. Detailed Design
4.1 Change Buffer Table for STs
When: Created automatically when a stream table is first used as a source
by another stream table (i.e., when pgt_dependencies receives a
source_type = 'STREAM_TABLE' row). Also during CREATE STREAM TABLE if
the defining query references another ST that doesn’t have a buffer yet.
Naming convention: pgtrickle_changes.changes_pgt_{pgt_id}
Using pgt_id (not the source OID) avoids collision with base-table buffers
(changes_{oid}), and the _pgt_ infix makes the ownership unambiguous.
Schema:
CREATE TABLE pgtrickle_changes.changes_pgt_{pgt_id} (
change_id BIGSERIAL,
lsn PG_LSN NOT NULL, -- assigned at capture time
action CHAR(1) NOT NULL, -- 'I' | 'D' (no 'U' — deltas are I/D pairs)
pk_hash BIGINT, -- __pgt_row_id from the delta
new_{col1} TYPE, ... -- output columns of the ST
);
CREATE INDEX idx_changes_pgt_{pgt_id}_lsn_pk_cid
ON pgtrickle_changes.changes_pgt_{pgt_id} (lsn, pk_hash, change_id)
INCLUDE (action);
Note: No old_* columns are needed. The DVM delta already expresses
changes as (row_id, 'D') / (row_id, 'I') pairs. An UPDATE in the
original data appears as a delete of the old row + insert of the new row.
The downstream DVM scans read new_* columns for ‘I’ actions and ignores
payload columns for ’D' actions (it only needs pk_hash and action).
Partitioning: Follow the same auto-partitioning policy as base-table
buffers (pg_trickle.buffer_partitioning GUC).
4.2 Delta Capture During Refresh
After the DVM computes the delta and before/during MERGE application, capture the delta rows into the ST’s change buffer. The exact insertion point depends on which MERGE execution path is used:
Path 1 — Explicit DML (user-trigger path):
The delta is already materialized into __pgt_delta_{pgt_id} (a temp table).
After MERGE, INSERT into the change buffer from this temp table:
INSERT INTO pgtrickle_changes.changes_pgt_{pgt_id}
(lsn, action, pk_hash, new_col1, new_col2, ...)
SELECT pg_current_wal_lsn(), d.__pgt_action, d.__pgt_row_id,
d.col1, d.col2, ...
FROM __pgt_delta_{pgt_id} d
WHERE d.__pgt_action IN ('I', 'D');
Path 2 — Prepared statement / direct MERGE:
The delta is not yet materialized in memory. Two options:
Option A: Materialize the delta into a temp table first (same pattern as the explicit-DML path), then MERGE from it AND INSERT into the buffer. One extra
CREATE TEMP TABLE ... AS SELECTbefore the MERGE.Option B (preferred): Use
MERGE ... RETURNING, which is available since PostgreSQL 17 and therefore present in the pg_trickle target of PostgreSQL 18. The RETURNING clause can reference source-table alias columns directly, yielding exactly the delta rows needed for the buffer insert — no temp table required:
WITH delta_rows AS (
MERGE INTO "{schema}"."{name}" AS st
USING ({delta_sql}) AS d
ON st.__pgt_row_id = d.__pgt_row_id
WHEN MATCHED AND d.__pgt_action = 'D' THEN DELETE
WHEN MATCHED AND d.__pgt_action = 'I' ... THEN UPDATE SET ...
WHEN NOT MATCHED AND d.__pgt_action = 'I' THEN INSERT ...
RETURNING d.__pgt_action, d.__pgt_row_id, d.col1, d.col2, ...
)
INSERT INTO pgtrickle_changes.changes_pgt_{pgt_id}
(lsn, action, pk_hash, new_col1, new_col2, ...)
SELECT pg_current_wal_lsn(), r.__pgt_action, r.__pgt_row_id,
r.col1, r.col2, ...
FROM delta_rows r;
This atomically applies the MERGE and captures the buffer rows in a single round-trip, with no intermediate materialisation cost.
LSN assignment: Use pg_current_wal_lsn() at capture time. This gives
each captured batch a monotonically increasing LSN that downstream STs can
use for frontier tracking, consistent with how base-table triggers assign
LSNs.
Empty deltas: When the MERGE produces zero changes (no-op refresh), skip the buffer INSERT entirely. The downstream timestamp comparison will correctly detect that nothing changed.
4.3 Frontier & Change Detection
Downstream STs track the upstream ST’s frontier using the same JSONB structure as for base tables:
{
"sources": {
"pgt_{upstream_pgt_id}": { "lsn": "0/1A3B5C" }
}
}
The key uses pgt_{id} instead of a raw OID to distinguish ST sources from
base-table sources in the frontier map.
Change detection in the scheduler (has_stream_table_source_changes())
changes from timestamp comparison to buffer inspection:
// Before (current):
// Compare data_timestamp to detect staleness → force FULL
// After (proposed):
// Check if change buffer has rows beyond our frontier → allow DIFFERENTIAL
fn has_stream_table_source_changes(st: &StreamTableMeta) -> bool {
let deps = StDependency::get_for_st(st.pgt_id).unwrap_or_default();
for dep in &deps {
if dep.source_type != "STREAM_TABLE" { continue; }
let upstream_pgt_id = get_pgt_id_for_relid(dep.source_relid);
let buffer_table = format!("changes_pgt_{}", upstream_pgt_id);
// Check if buffer table exists and has rows beyond our frontier
let prev_lsn = st.frontier_lsn_for_source(upstream_pgt_id);
let has_rows = Spi::get_one::<bool>(&format!(
"SELECT EXISTS(SELECT 1 FROM {schema}.{buffer_table} \
WHERE lsn > '{prev_lsn}'::pg_lsn LIMIT 1)",
)).unwrap_or(Some(false)).unwrap_or(false);
if has_rows { return true; }
}
false
}
This is the same pattern as has_table_source_changes() — exactly aligned.
4.4 Scheduler Policy Change
In refresh_single_st(), the FULL override is currently triggered by:
if has_changes && has_stream_table_changes {
RefreshAction::Full
}
With ST change buffers, this override is removed (or gated behind a
feature flag). When the upstream ST has a change buffer, the downstream ST
uses the normal determine_refresh_action() path, which respects the ST’s
refresh_mode setting. If refresh_mode = Differential, it gets
DIFFERENTIAL.
Fallback: If an upstream ST does not have a change buffer (e.g., it has no downstream ST dependents yet), the current FULL behavior is preserved.
4.5 Cleanup
The existing drain_pending_cleanups() logic already computes the minimum
frontier across all consumers of a given source. Extend it to include ST
change buffers:
- In
drain_pending_cleanups(), includepgtrickle_changes.changes_pgt_{id}tables alongsidechanges_{oid}tables. - The min-frontier query joins on
pgt_dependencies WHERE source_type = 'STREAM_TABLE'to find all downstream STs for a given upstream ST. - DELETE rows where
lsn <= min_frontier(same as base-table cleanup). - If partitioning is enabled, reuse
detach_consumed_partitions().
4.6 Buffer Lifecycle
| Event | Action |
|---|---|
| ST B is created with ST A as source | If changes_pgt_{A.id} doesn’t exist, create it |
| ST B is dropped and was the last consumer of A | Drop changes_pgt_{A.id} (no consumers left) |
| ST A is dropped | Drop changes_pgt_{A.id} |
| ST A’s schema changes (reinit) | Recreate changes_pgt_{A.id} with new column types |
refresh_mode changed to FULL |
Buffer still populated (downstream might be DIFFERENTIAL) |
The buffer is owned by the upstream ST, not the downstream. This is consistent with base-table buffers being owned by the source, not the consumer.
5. DVM Integration
5.1 Scan Operator for ST Sources
The DVM’s scan operator (dvm/operators/scan.rs) currently generates delta
SQL that reads from pgtrickle_changes.changes_{oid}. For ST sources, it
needs to read from pgtrickle_changes.changes_pgt_{pgt_id} instead.
The scan CTE structure is identical:
WITH scan_upstream AS (
-- INSERT / new rows
SELECT c.pk_hash AS __pgt_row_id,
'I'::TEXT AS __pgt_action,
c.new_col1, c.new_col2, ...
FROM pgtrickle_changes.changes_pgt_{upstream_pgt_id} c
WHERE c.lsn > '__PGS_PREV_LSN_pgt_{id}__'::pg_lsn
AND c.lsn <= '__PGS_NEW_LSN_pgt_{id}__'::pg_lsn
AND c.action = 'I'
UNION ALL
-- DELETE rows
SELECT c.pk_hash AS __pgt_row_id,
'D'::TEXT AS __pgt_action,
c.new_col1, c.new_col2, ...
FROM pgtrickle_changes.changes_pgt_{upstream_pgt_id} c
WHERE c.lsn > '__PGS_PREV_LSN_pgt_{id}__'::pg_lsn
AND c.lsn <= '__PGS_NEW_LSN_pgt_{id}__'::pg_lsn
AND c.action = 'D'
)
Key difference from base-table scans:
- Buffer name: changes_pgt_{pgt_id} not changes_{oid}
- LSN placeholder tokens: __PGS_PREV_LSN_pgt_{id}__ (new token format)
- No 'U' action: ST deltas only produce 'I' and 'D'
- No old_* columns: DELETE rows carry new_* columns (the values being
removed) because the DVM delta format uses (__pgt_row_id, 'D', values)
not (__pgt_row_id, 'D', NULL, NULL, ...)
5.2 Frontier Placeholder Resolution
The LSN placeholder resolver (in refresh.rs, around line 2400) needs to
handle the new pgt_ prefix:
// Existing: replace __PGS_PREV_LSN_{oid}__ with actual LSN string
// New: also replace __PGS_PREV_LSN_pgt_{id}__ for ST sources
Alternatively, since pgt_id is always an integer, the resolver can match
on any __PGS_PREV_LSN_\w+__ pattern and look up the corresponding
frontier entry.
6. Implementation Plan
Phase 1: Change Buffer Infrastructure (Foundation)
| Task | File | Description |
|---|---|---|
| 1.1 | cdc.rs |
Add create_st_change_buffer_table(pgt_id, columns) — creates changes_pgt_{id} with schema from the ST’s output columns |
| 1.2 | cdc.rs |
Add drop_st_change_buffer_table(pgt_id) |
| 1.3 | cdc.rs |
Add has_st_change_buffer(pgt_id) -> bool — checks table existence |
| 1.4 | api.rs |
In create_stream_table_impl(), after dependency insertion, check if any source has source_type = 'STREAM_TABLE' and create the upstream ST’s buffer if it doesn’t exist |
| 1.5 | api.rs |
In drop_stream_table_impl(), check if dropped ST is the last consumer; if so, drop the upstream’s buffer |
| 1.6 | catalog.rs |
Add count_downstream_st_consumers(pgt_id) -> i64 |
Phase 2: Delta Capture
| Task | File | Description |
|---|---|---|
| 2.1 | refresh.rs |
Add capture_delta_to_st_buffer(pgt_id, delta_temp_table, columns) — INSERTs delta rows into changes_pgt_{id} with pg_current_wal_lsn() (used by the explicit-DML path which already has a temp table) |
| 2.2 | refresh.rs |
In execute_differential_refresh(), for the prepared-statement and direct-MERGE paths: wrap the MERGE in a CTE using MERGE ... RETURNING to atomically capture delta rows into the buffer (PostgreSQL 17+ feature, available on the pg_trickle target of PG 18) |
| 2.3 | refresh.rs |
In the explicit-DML path, call capture_delta_to_st_buffer() after the DELETE/UPDATE/INSERT sequence, reading from the already-materialized __pgt_delta_{pgt_id} temp table |
| 2.4 | refresh.rs |
In execute_full_refresh(), after the full rewrite, compute and capture the full diff (new state minus old state) into the buffer — see Section 7 |
Phase 3: DVM Scan for ST Sources
| Task | File | Description |
|---|---|---|
| 3.1 | dvm/operators/scan.rs |
Extend scan operator: when source is a STREAM_TABLE, read from changes_pgt_{id} instead of changes_{oid}, using pgt_-prefixed LSN placeholders |
| 3.2 | dvm/mod.rs |
Pass source_type into delta generation so the scan operator knows which buffer to reference |
| 3.3 | refresh.rs |
Extend LSN placeholder resolver to handle __PGS_PREV_LSN_pgt_{id}__ tokens |
| 3.4 | refresh.rs |
Include ST source buffers in frontier computation |
Phase 4: Scheduler Integration
| Task | File | Description |
|---|---|---|
| 4.1 | scheduler.rs |
Modify has_stream_table_source_changes(): check buffer for rows beyond frontier instead of comparing data_timestamp |
| 4.2 | scheduler.rs |
In refresh_single_st(), remove the has_stream_table_changes → Full override when the upstream has a change buffer |
| 4.3 | scheduler.rs |
Add fallback: if upstream has no buffer, retain the existing FULL + timestamp behavior |
Phase 5: Cleanup & Lifecycle
| Task | File | Description |
|---|---|---|
| 5.1 | refresh.rs |
Extend drain_pending_cleanups() to include changes_pgt_{id} tables, using the same min-frontier logic |
| 5.2 | refresh.rs |
Extend cleanup_change_buffers_by_frontier() for ST sources |
| 5.3 | hooks.rs |
Handle ST DDL changes: if upstream ST’s schema changes, drop + recreate the buffer with new column types |
| 5.4 | api.rs |
Handle ALTER STREAM TABLE ... SET QUERY: recreate buffer if columns change |
Phase 6: Testing
| Test | Tier | Description |
|---|---|---|
| 6.1 | Unit | create_st_change_buffer_table / drop_st_change_buffer_table |
| 6.2 | Unit | DVM scan operator generates correct SQL for ST sources |
| 6.3 | Integration | ST A → ST B chain: verify B uses DIFFERENTIAL when A has a buffer |
| 6.4 | Integration | Mixed sources: ST B depends on table T and ST A; verify correct action selection per-source |
| 6.5 | E2E | 3-level chain (table → ST₁ → ST₂ → ST₃): INSERT into base table, verify all levels update differentially |
| 6.6 | E2E | DROP ST₁ (middle of chain): verify ST₂ falls back to FULL |
| 6.7 | E2E | Buffer cleanup: verify consumed rows are drained at min frontier |
| 6.8 | E2E | Schema change propagation: ALTER source column type, verify buffer is recreated |
7. Task 2.4 Deep Dive: FULL Refresh Delta Capture
FULL refresh is the trickiest case. When an ST does a FULL refresh (truncate + rewrite), we still need to capture the diff between the old and new state so downstream STs can consume it differentially. Two approaches:
Chosen Approach: Pre/Post Diff
Decision: Option A is implemented from day one. The design goal is that FULL refreshes on an upstream ST never force downstream STs to also do FULL. The ~2× FULL refresh cost is an accepted and bounded tradeoff: it is O(N) with the upstream table size, exactly the same complexity class as the FULL refresh itself, and it eliminates unbounded FULL cascading through deep chains.
Before the FULL refresh, snapshot the current __pgt_row_id set. After the
refresh, compare:
-- Capture pre-state
CREATE TEMP TABLE __pgt_pre_{pgt_id} ON COMMIT DROP AS
SELECT __pgt_row_id, col1, col2, ... FROM "{schema}"."{name}";
-- ... FULL refresh runs (TRUNCATE + re-INSERT) ...
-- Compute diff
-- Deleted rows: in pre but not in post
INSERT INTO pgtrickle_changes.changes_pgt_{pgt_id} (lsn, action, pk_hash, ...)
SELECT pg_current_wal_lsn(), 'D', pre.__pgt_row_id, pre.col1, ...
FROM __pgt_pre_{pgt_id} pre
LEFT JOIN "{schema}"."{name}" post ON pre.__pgt_row_id = post.__pgt_row_id
WHERE post.__pgt_row_id IS NULL;
-- Inserted rows: in post but not in pre
INSERT INTO pgtrickle_changes.changes_pgt_{pgt_id} (lsn, action, pk_hash, ...)
SELECT pg_current_wal_lsn(), 'I', post.__pgt_row_id, post.col1, ...
FROM "{schema}"."{name}" post
LEFT JOIN __pgt_pre_{pgt_id} pre ON post.__pgt_row_id = pre.__pgt_row_id
WHERE pre.__pgt_row_id IS NULL;
-- Changed rows: same row_id but different content
INSERT INTO pgtrickle_changes.changes_pgt_{pgt_id} (lsn, action, pk_hash, ...)
SELECT pg_current_wal_lsn(), 'D', pre.__pgt_row_id, pre.col1, ...
FROM __pgt_pre_{pgt_id} pre
JOIN "{schema}"."{name}" post ON pre.__pgt_row_id = post.__pgt_row_id
WHERE (pre.col1, pre.col2, ...) IS DISTINCT FROM (post.col1, post.col2, ...);
INSERT INTO pgtrickle_changes.changes_pgt_{pgt_id} (lsn, action, pk_hash, ...)
SELECT pg_current_wal_lsn(), 'I', post.__pgt_row_id, post.col1, ...
FROM "{schema}"."{name}" post
JOIN __pgt_pre_{pgt_id} pre ON post.__pgt_row_id = pre.__pgt_row_id
WHERE (pre.col1, pre.col2, ...) IS DISTINCT FROM (post.col1, post.col2, ...);
Cost: Two table scans (pre-state + diff). For large STs this is meaningful overhead, but it’s O(N) with the table size — the same as the FULL refresh itself. The constant factor roughly doubles the refresh time. This cost is accepted: a FULL refresh is already expensive, and protecting all downstream STs from a cascading FULL is worth the constant overhead.
8. Performance Impact
Storage Overhead
Each ST with downstream dependents gets a change buffer table. Buffer size scales with delta size per refresh, not total ST size:
| Delta size per tick | Buffer accumulation (1 consumer) | With 5 consumers |
|---|---|---|
| 100 rows | 100 rows (cleaned next tick) | 500 rows worst case |
| 10,000 rows | 10,000 rows | 50,000 rows worst case |
In practice, cleanup runs at the start of each downstream refresh, so buffer tables stay small (one tick’s worth of deltas per consumer lag).
Write Overhead
Each ST refresh that has a change buffer adds one INSERT per delta row. For a 100-row delta this is ~1 ms. For a 10,000-row delta, ~10–50 ms. This is constant overhead per refresh, amortized across all downstream consumers.
Downstream Refresh Speedup
| Upstream ST size | Delta size | FULL refresh | DIFFERENTIAL refresh | Speedup |
|---|---|---|---|---|
| 100K rows | 100 rows | ~200 ms | ~5 ms | 40× |
| 1M rows | 1,000 rows | ~2 s | ~20 ms | 100× |
| 10M rows | 100 rows | ~20 s | ~5 ms | 4,000× |
Net Latency Impact (DAG Propagation)
Using the model from PLAN_DAG_PERFORMANCE.md for a wide DAG (N=500, D=10) in CALCULATED mode with $T_r^{\text{full}} = 100\text{ms}$ and $T_r^{\text{diff}} = 5\text{ms}$:
| Mode | Before (all FULL) | After (all DIFFERENTIAL) |
|---|---|---|
| CALCULATED | $1\text{s} + 500 \times 100\text{ms} = 51\text{s}$ | $1\text{s} + 500 \times 5\text{ms} = 3.5\text{s}$ |
| Parallel C=16 | ~8 s | ~8 s (dominated by poll overhead) |
The sequential CALCULATED mode benefits enormously: 51s → 3.5s for the same DAG. Parallel mode doesn’t benefit as much because the 200 ms poll interval already dominates at low $T_r$.
9. Migration & Compatibility
- Always-on. ST change buffers are created unconditionally whenever a
stream table is used as a source by another stream table. There is no
opt-out GUC — the feature is always active. Deployments that want FULL
propagation can set
refresh_mode = Fullon downstream STs explicitly. - Upgrade path: On extension upgrade, automatically scan
pgt_dependenciesfor all existingsource_type = 'STREAM_TABLE'rows and create the missingchanges_pgt_{id}buffers. Existing ST-to-ST dependencies immediately gain differential propagation without any user action. The upgrade step runs inside the extension update transaction and is idempotent (usesCREATE TABLE IF NOT EXISTS).
10. Risks & Mitigations
| Risk | Mitigation |
|---|---|
| Buffer tables increase catalog bloat | Only created for STs with downstream dependents; dropped when last consumer is removed |
| Large deltas cause buffer table bloat | Cleanup runs at min-frontier; buffer_partitioning GUC enables partition-based cleanup |
pg_current_wal_lsn() not advancing on read-only workloads |
Dismissed. The INSERT into changes_pgt_{id} is itself a WAL write, which always advances pg_current_wal_lsn(). A valid LSN is always available at capture time. |
| Pre/post diff doubles the FULL refresh time | Accepted tradeoff. The cost is O(N) — same complexity class as the FULL refresh itself. It eliminates unbounded FULL cascading through downstream chains. |
| No opt-out for ST change buffers | By design: always-on maximises differential propagation. Downstream STs can still be set to refresh_mode = Full individually if needed. |
| DVM scan template cache invalidation | Existing get_delta_sql_template() cache must be invalidated when a source switches from TABLE to STREAM_TABLE source type (this shouldn’t happen in practice) |
| Schema changes on upstream ST | Handled by dropping/recreating the buffer; downstream STs already handle source schema changes via needs_reinit |
11. Alternatives Considered
Alternative 1: Embed delta in frontier JSONB
Serialize delta rows into the frontier column instead of a separate table. Avoids new tables but is impractical for deltas > ~100 rows due to JSONB size and serialization overhead. Rejected.
Alternative 2: Use AFTER triggers on ST storage tables
Instead of capturing the delta in refresh.rs, install CDC-style triggers on
the stream table’s storage table to capture changes into a buffer. This would
mirror the base-table approach exactly.
Problem: pg_trickle already suppresses user triggers during MERGE
(ALTER TABLE ... DISABLE TRIGGER USER), and the internal CDC triggers would
need careful handling to not fire during manual DML vs. scheduled refresh.
The complexity is higher than the proposed approach with no benefit — we
already have the delta in memory during the refresh. Rejected.
Alternative 3: WAL-based ST CDC
Use logical decoding on the ST’s storage table to capture changes. This would be fully transparent and handle manual DML too. However, it requires replication slots, publications, and the full WAL-decoding machinery — massive implementation effort for marginal benefit over the proposed approach. Deferred to a possible future enhancement.