Contents
- PLAN: Edge Cases — Catalogue, Workarounds & Prioritised Remediation
- Motivation
- Priority Tiers
- Edge Case Catalogue
- EC-01 — JOIN key change + simultaneous right-side DELETE
- EC-02 — CUBE/ROLLUP expansion limit (> 64 branches)
- EC-03 — Window functions inside expressions
- EC-04 — Non-monotone recursive CTEs rejected
- EC-05 — Foreign tables in DIFFERENTIAL mode
- EC-06 — Keyless tables with exact duplicate rows
- EC-07 — IMMEDIATE mode write serialization
- EC-08 — IMMEDIATE mode write amplification
- EC-09 — IMMEDIATE mode unsupported constructs
- EC-10 — IMMEDIATE mode: no throttling
- EC-11 — Schedule interval shorter than refresh duration
- EC-12 — No read-your-writes (DIFFERENTIAL/FULL)
- EC-13 — Diamond dependency split-version reads
- EC-14 — Diamond upstream failure skips downstream
- EC-15 — SELECT * on source table + DDL
- EC-16 — ALTER FUNCTION body change not detected
- EC-17 — Schema change during active refresh
- EC-18 — auto CDC mode silent non-upgrade
- EC-19 — WAL mode + keyless tables need REPLICA IDENTITY FULL
- EC-20 — CDC TRANSITIONING phase complexity
- EC-21 — Stream tables not propagated to logical replication subscriber
- EC-22 — Change buffers not published to subscribers
- EC-23 — Scheduler does not run on standby / replica
- EC-24 — TRUNCATE on source table bypasses CDC triggers
- EC-25 — TRUNCATE on a stream table itself
- EC-26 — Direct DML on stream table
- EC-27 — Foreign keys on stream tables
- EC-28 — PgBouncer transaction pooling
- EC-29 — TopK + set operations / GROUPING SETS
- EC-30 — Circular dependencies
- EC-31 — FOR UPDATE / FOR SHARE in defining query
- EC-32 — ALL (subquery) not supported
- EC-33 — Statistical aggregates (CORR, COVAR*, REGR*)
- EC-34 — Backup / restore loses WAL replication slots
- EC-35 — Managed cloud PostgreSQL restrictions
- EC-36 — Managed cloud: shared_preload_libraries restriction
- Prioritised Recommendations
- Implementation Order
- Open Questions
- OQ-01 — EC-01: What is the correct new default for adaptive_full_threshold? ✅ RESOLVED
- OQ-02 — EC-01: Is pre-image capture viable without a breaking change buffer schema change? ✅ RESOLVED
- OQ-03 — EC-01: What is the overhead of the R₀ via EXCEPT ALL approach?
- OQ-04 — EC-03: Where exactly do window-function-in-expression rewrites fail?
- OQ-05 — EC-06: Is ctid-based row identity viable for trigger-mode CDC?
- OQ-06 — EC-08: Do statement-level transition tables actually batch multiple row changes?
- OQ-07 — EC-36: How much of the scheduler can be decoupled for cloud deployments?
PLAN: Edge Cases — Catalogue, Workarounds & Prioritised Remediation
Status: Proposed
Target milestone: v0.3.0+
Last updated: 2026-06-05
Motivation
pg_trickle’s documentation (FAQ, SQL Reference, Architecture) describes 36 discrete edge cases and limitations. Some are fundamental trade-offs, others are fixable engineering gaps. This document:
- Catalogues every known edge case in one place.
- Proposes concrete workarounds or fixes for each.
- Assigns a priority tier so the team can focus effort where it matters most.
Priority Tiers
| Tier | Meaning | Guidance |
|---|---|---|
| P0 | Data correctness risk — users can silently get wrong results | Fix before v1.0 |
| P1 | Operational surprise — confusing failure, silent non-behaviour | Fix before v1.0 |
| P2 | Usability gap — documented but painful; strong workaround exists | Schedule for a minor release |
| P3 | Accepted trade-off — inherent to the design; document clearly | Keep documentation up to date |
Edge Case Catalogue
EC-01 — JOIN key change + simultaneous right-side DELETE
| Field | Value |
|---|---|
| Area | DIFFERENTIAL — JOIN delta |
| Tier | P0 |
| Impact | Stale row remains in stream table until next full refresh |
| Current mitigation | Adaptive FULL fallback (pg_trickle.adaptive_full_threshold) |
| Documented in | FAQ § “Known edge cases”; SQL_REFERENCE § “Known Delta Computation Limitations” |
| Status | ✅ IMPLEMENTED — Part 1 split (R₀ via EXCEPT ALL) in diff_inner_join, diff_left_join, diff_full_join |
Root cause: The delta query reads current_right after all changes are
applied. When the old join partner is deleted before the delta runs, the
DELETE half of the join finds no partner and is silently dropped.
Proposed fix (chosen approach: R₀ via EXCEPT ALL — medium-term):
The short-term threshold adjustment is not viable: it trades query correctness for a heuristic that can still silently miss deletes when co-occurring changes fall below the threshold.
Primary fix — R₀ via EXCEPT ALL (4–6 days):
The delta formula in src/dvm/operators/join.rs is:
ΔJ = (ΔQ ⋈ R₁) + (L₀ ⋈ ΔR) [current code]
Part 2 already computes L₀ (pre-change left state) via EXCEPT ALL for
Scan children (left_part2_source in diff_inner_join). The fix mirrors
this symmetrically for Part 1’s right side — splitting Part 1 into:
Part 1a: ΔQ_inserts ⋈ R₁ (post-change right — unchanged)
Part 1b: ΔQ_deletes ⋈ R₀ (pre-change right via EXCEPT ALL)
R₀ is computed identically to the existing L₀:
```sql
R_current EXCEPT ALL ΔR_inserts UNION ALL ΔR_deletes
```
build_snapshot_sql() already exists in join_common.rs and is reused
here. The correction term (Part 3) was added for the L₁ case in Part 2
and is unaffected — Part 2 stays unchanged.
Implementation plan:
- Day 1–2: Add right_part1_source decision + EXCEPT ALL CTE analogous
to use_l0 / left_part2_source. Mirror use_r0 logic for nested
right children (no SemiJoin, ≤ 2 scan nodes → R₀; otherwise R₁).
- Day 2–3: Split Part 1 SQL into two UNION ALL arms; update row ID hashing
for Part 1b arm.
- Day 4–5: Integration tests — co-delete scenario, UPDATE-then-delete,
multi-cycle correctness.
- Day 5–6: TPC-H regression suite (Q07 is the critical path).
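The split can be sketched as follows. Table and column names here are illustrative, not the generated SQL verbatim (the real SQL is produced by diff_inner_join in src/dvm/operators/join.rs); the R₀ CTE follows the EXCEPT ALL / UNION ALL recipe above.

```sql
-- Hypothetical tables: right_tbl(k, val) is the current right side,
-- delta_right(action, k, val) holds this cycle's right-side changes,
-- delta_left(action, k, lval) the left-side changes ('I' / 'D').
WITH r0 AS (
    -- Pre-change right state R0: current rows minus this cycle's
    -- inserts, plus this cycle's deletes.
    SELECT k, val FROM right_tbl
    EXCEPT ALL
    SELECT k, val FROM delta_right WHERE action = 'I'
    UNION ALL
    SELECT k, val FROM delta_right WHERE action = 'D'
)
-- Part 1a: left-side inserts join the post-change right (R1).
SELECT 'I' AS action, l.k, l.lval, r.val
FROM delta_left l JOIN right_tbl r ON r.k = l.k
WHERE l.action = 'I'
UNION ALL
-- Part 1b: left-side deletes join the pre-change right (R0), so a join
-- partner deleted in the same cycle is still matched and the DELETE half
-- is emitted instead of being silently dropped.
SELECT 'D' AS action, l.k, l.lval, r.val
FROM delta_left l JOIN r0 r ON r.k = l.k
WHERE l.action = 'D';
```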
Deferred — pre-image capture from change buffer:
The long-term approach (using old_col values already stored in the
change buffer trigger to reconstruct R₀ from ΔR directly, skipping the
live table scan) is deferred. The old_col values are already
captured by the CDC trigger (OLD."col" → "old_col" for UPDATE/DELETE),
but surfacing them through DiffContext/DeltaSource requires structural
changes to the delta CTE column representation that carry higher risk.
Estimated additional effort: 7–10 days. Schedule for v1.x when the EXCEPT
ALL approach is already correct and benchmarked.
EC-02 — CUBE/ROLLUP expansion limit (> 64 branches)
| Field | Value |
|---|---|
| Area | DIFFERENTIAL — grouping sets |
| Tier | P3 |
| Impact | Creation error |
| Current mitigation | Use explicit GROUPING SETS(…) |
| Documented in | SQL_REFERENCE § “CUBE/ROLLUP Expansion Limit” |
| Status | ✅ Done — pg_trickle.max_grouping_set_branches GUC (default 64, range 1–65536) |
Proposed fix: The 64-branch limit is a sensible guard. Keep it but make
the limit configurable via a GUC. Implemented: PGS_MAX_GROUPING_SET_BRANCHES
in src/config.rs; parser.rs reads the GUC dynamically instead of a const.
Documented in CONFIGURATION.md.
EC-03 — Window functions inside expressions
| Field | Value |
|---|---|
| Area | DIFFERENTIAL — window functions |
| Tier | P2 |
| Impact | Creation error |
| Current mitigation | Keep window functions as top-level SELECT columns |
| Documented in | FAQ § “Window Functions” |
| Status | ✅ IMPLEMENTED — rewrite_nested_window_exprs() subquery-lift in src/dvm/parser.rs; helper collect_all_window_func_nodes() for recursive window func harvesting; deparse_select_window_clause() for WINDOW clause passthrough; exported from src/dvm/mod.rs; wired in src/api.rs before DISTINCT ON rewrite |
Implementation: nested-subquery lift (see PLAN_TRANSACTIONAL_IVM_PART_2.md Task 1.3):
Window functions nested inside expressions are lifted into a synthetic column in an inner subquery, then the outer SELECT applies the original expression with the window function replaced by the synthetic column alias:
```sql
-- Before (rejected):
SELECT id, ABS(ROW_NUMBER() OVER (ORDER BY score) - 5) AS adjusted_rank
FROM players
-- After rewrite (supported):
SELECT id, ABS("__pgt_wf_inner"."__pgt_wf_1" - 5) AS "adjusted_rank"
FROM (
  SELECT *, ROW_NUMBER() OVER (ORDER BY score) AS "__pgt_wf_1"
  FROM players
) "__pgt_wf_inner"
```
The subquery approach is preferred over the earlier CTE approach because a
WITH clause produces a WithQuery node in the OpTree, requiring the DVM
engine to handle CTEs wrapping window functions — an untested path. The
subquery produces a nested Scan → Window → Project chain that the existing
DVM path already handles correctly.
Test coverage: 9 E2E tests (e2e_window_tests.rs):
- 6 creation-acceptance tests: test_window_in_case_expression_rejected,
test_window_in_coalesce_rejected, test_window_in_arithmetic_rejected,
test_window_in_cast_rejected, test_window_deeply_nested_rejected,
test_top_level_window_still_works (regression).
- 3 data-correctness tests: test_ec03_case_window_data_correctness —
CASE + ROW_NUMBER() produces correct tier labels;
test_ec03_arithmetic_window_data_correctness — ROW_NUMBER() * 10
verified with a FULL refresh after INSERT;
test_ec03_coalesce_window_data_correctness — COALESCE(SUM() OVER, 0)
produces correct partition sums.
Known limitation — fixed for AUTO mode (EC-03): window-in-expression
queries are accepted in DIFFERENTIAL mode (creation succeeds), but the DVM
delta engine cannot currently resolve the st.* alias inside the inner
subquery introduced by the rewrite, so differential refresh fails with a
column st.* does not exist error. AUTO mode now correctly falls back to
FULL refresh with an informational message. Explicit DIFFERENTIAL mode
emits a WARNING at creation time alerting the user that the pattern will
fall back to full recomputation at refresh time. FULL refresh works
correctly. Differential support for the rewritten form is deferred.
EC-04 — Non-monotone recursive CTEs rejected
| Field | Value |
|---|---|
| Area | DIFFERENTIAL — recursive CTEs |
| Tier | P3 |
| Impact | Creation error |
| Current mitigation | Restructure as non-recursive CTE; or use FULL mode |
| Documented in | FAQ § “Recursive CTEs” |
Proposed fix: This is a fundamental limitation of semi-naive evaluation — non-monotone operators in the recursive term can “un-derive” rows, breaking the incremental fixpoint. No code fix proposed. Keep the clear error message and document the reasoning. Users who need non-monotone recursion should use FULL mode.
EC-05 — Foreign tables in DIFFERENTIAL mode
| Field | Value |
|---|---|
| Area | DIFFERENTIAL — foreign data wrappers |
| Tier | P2 |
| Impact | Creation error |
| Current mitigation | Use FULL mode |
| Documented in | SQL_REFERENCE § “Source Tables” |
| Status | ✅ IMPLEMENTED — error message improved + polling CDC implemented |
Proposed fix:
- ✅ Short term (Done): Improved the error message to suggest FULL mode explicitly and mention the postgres_fdw + IMPORT FOREIGN SCHEMA pattern.
- ✅ Medium term (Done): Polling-based CDC for foreign tables. Enabled via the pg_trickle.foreign_table_polling = on GUC. Creates a snapshot table per foreign table source; before each differential refresh, computes EXCEPT ALL deltas and populates the standard change buffer. The snapshot is refreshed each cycle. The cleanup path drops snapshot tables when stream tables are removed.
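The polling delta can be sketched with plain EXCEPT ALL queries. Table names here are hypothetical; the extension generates the equivalent SQL internally per foreign table source.

```sql
-- orders_ft: the foreign table; orders_snap: the maintained snapshot.
WITH inserts AS (
    SELECT * FROM orders_ft
    EXCEPT ALL
    SELECT * FROM orders_snap      -- rows present now but not last cycle
),
deletes AS (
    SELECT * FROM orders_snap
    EXCEPT ALL
    SELECT * FROM orders_ft        -- rows gone since last cycle
)
SELECT 'I' AS action, i.* FROM inserts i
UNION ALL
SELECT 'D' AS action, d.* FROM deletes d;

-- Afterwards the snapshot is rebuilt for the next cycle:
-- TRUNCATE orders_snap; INSERT INTO orders_snap SELECT * FROM orders_ft;
```

EXCEPT ALL (rather than EXCEPT) preserves duplicate multiplicity, which matters for keyless or duplicate-bearing foreign sources.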
EC-06 — Keyless tables with exact duplicate rows
| Field | Value |
|---|---|
| Area | DIFFERENTIAL — row identity |
| Tier | P0 |
| Impact | Phantom or missed deletes in stream table |
| Current mitigation | Add a primary key; add a synthetic unique column; or use FULL mode |
| Documented in | FAQ § “Keyless Tables”; SQL_REFERENCE § “Row Identity” |
| Status | ✅ IMPLEMENTED — Full EC-06 fix: catalog has_keyless_source flag, non-unique index for keyless sources, net-counting delta SQL in scan.rs (decompose changes into atomic +1/-1 per content hash, expand via generate_series), counted DELETE (ROW_NUMBER matching) in refresh.rs and ivm.rs, plain INSERT (no ON CONFLICT) for keyless sources. WARNING at creation time also implemented. |
Implementation (completed):
- Short term (P0): ✅ Done — emit a WARNING at create_stream_table() time.
- Full fix (P0): ✅ Implemented via the net-counting approach:
  - Catalog: added has_keyless_source BOOLEAN NOT NULL DEFAULT FALSE to pgtrickle.pgt_stream_tables (set at creation time).
  - Index: non-unique CREATE INDEX for keyless sources (instead of CREATE UNIQUE INDEX), allowing duplicate __pgt_row_id values.
  - Delta SQL (scan.rs): for keyless tables, decomposes all change events into atomic +1 (INSERT) / -1 (DELETE) operations per content hash. UPDATEs are split into their DELETE-old + INSERT-new components. Net counts are computed via GROUP BY content_hash + SUM(delta_sign), then expanded to the correct number of D/I rows using generate_series.
  - Apply (refresh.rs): for keyless sources, forces the explicit DML path (MERGE cannot be used with non-unique row_ids). Counted DELETE uses ROW_NUMBER matching to pair delta deletes 1:1 with stream table rows; plain INSERT without NOT EXISTS / ON CONFLICT.
  - Apply (ivm.rs): the same counted DELETE + plain INSERT for IMMEDIATE mode.
  - Upgrade SQL: ALTER TABLE ... ADD COLUMN IF NOT EXISTS has_keyless_source.
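The net-counting step can be sketched as a single query. The change_events table and its columns are hypothetical stand-ins for the change buffer rows scan.rs actually reads.

```sql
-- change_events(content_hash, delta_sign): one row per atomic change,
-- delta_sign = +1 for INSERT, -1 for DELETE; an UPDATE contributes a -1
-- for its old image and a +1 for its new image.
WITH net AS (
    SELECT content_hash, SUM(delta_sign) AS n
    FROM change_events
    GROUP BY content_hash
    HAVING SUM(delta_sign) <> 0          -- net-zero churn cancels out
)
SELECT content_hash,
       CASE WHEN n > 0 THEN 'I' ELSE 'D' END AS action
FROM net
CROSS JOIN LATERAL generate_series(1, abs(n)) AS g;  -- |n| rows per hash
```

The generate_series expansion is what lets a table with three identical rows, two of which were deleted, emit exactly two 'D' rows rather than one.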
Bug fixes (post-implementation):
- Guard trigger DELETE cancel: the DML guard trigger (EC-26) used RETURN NEW, which is NULL for DELETE operations, silently cancelling managed-refresh DELETEs on the stream table. Fixed to use IF TG_OP = 'DELETE' THEN RETURN OLD; ELSE RETURN NEW; END IF;.
- Keyless aggregate UPDATE no-op: the keyless UPDATE template was a no-op (SELECT 1 WHERE false), which is correct for keyless scans but wrong for aggregate queries on keyless sources (where 'I'-action rows update existing groups). Fixed to use the normal UPDATE template, which naturally matches 0 rows for scan-level keyless deltas.
Test coverage: 7 E2E tests (e2e_keyless_duplicate_tests.rs) — all
passing. Covers: duplicate rows basic, delete-one-of-duplicates,
update-one-of-duplicates, unique content multi-cycle DML, delete-all-
then-reinsert, mixed DML stress, aggregate with duplicates.
EC-07 — IMMEDIATE mode write serialization
| Field | Value |
|---|---|
| Area | IMMEDIATE mode — concurrency |
| Tier | P3 |
| Impact | Reduced write throughput under high concurrency |
| Current mitigation | Use DIFFERENTIAL mode for high-throughput OLTP |
| Documented in | FAQ § “IMMEDIATE Mode Trade-offs” |
Proposed fix: This is an inherent trade-off of synchronous IVM — you
cannot maintain consistency without serialization. The IvmLockMode
already uses lighter pg_try_advisory_xact_lock for simple scan chains.
No code fix proposed. Document the throughput characteristics in the
performance guide and add a benchmark that quantifies the write-side
overhead so users can make informed mode choices.
EC-08 — IMMEDIATE mode write amplification
| Field | Value |
|---|---|
| Area | IMMEDIATE mode — latency |
| Tier | P3 |
| Impact | Increased per-DML latency |
| Current mitigation | Use DIFFERENTIAL mode |
| Documented in | FAQ § “IMMEDIATE Mode Trade-offs” |
Proposed fix: Inherent trade-off. Potential optimisation: batch multiple
trigger invocations within a single statement into one delta application
(the current implementation already uses statement-level triggers with
transition tables, which provides this batching naturally). Add a benchmark
to benches/ that measures per-statement overhead for 1, 10, 100, 1000
affected rows.
EC-09 — IMMEDIATE mode unsupported constructs
| Field | Value |
|---|---|
| Area | IMMEDIATE mode — SQL support |
| Tier | P2 |
| Impact | Creation / alter error |
| Current mitigation | Use DIFFERENTIAL mode |
| Documented in | FAQ § “What SQL features are NOT supported in IMMEDIATE mode?” |
| Status | ✅ Done — recursive CTEs + TopK now supported in IMMEDIATE mode |
Proposed fix (recursive CTEs + TopK): originally deferred; now implemented from PLAN_TRANSACTIONAL_IVM_PART_2.md Phase 5.
Both constructs are now implemented:
- Recursive CTEs (G10 / Task 5.1): check_immediate_support() changed from rejection to a warning about stack depth. Semi-naive evaluation proceeds with DeltaSource::TransitionTable.
- TopK (G11 / Task 5.2): apply_topk_micro_refresh() in ivm.rs implements statement-level micro-refresh, bounded by the pg_trickle.ivm_topk_max_limit GUC (default 1000) to prevent inline recomputation latency spikes for large K.
For materialised view sources: use a polling-change-detection wrapper (same approach as EC-05 for foreign tables). No code fix proposed for this case.
EC-10 — IMMEDIATE mode: no throttling
| Field | Value |
|---|---|
| Area | IMMEDIATE mode — back-pressure |
| Tier | P3 |
| Impact | No rate limiting for high-frequency writes |
| Current mitigation | Use DIFFERENTIAL mode with a schedule interval |
| Documented in | FAQ § “IMMEDIATE Mode Trade-offs” |
Proposed fix: By design — throttling IMMEDIATE would break the
same-transaction consistency guarantee. No code fix. Ensure the FAQ
clearly explains that DIFFERENTIAL with a short interval (e.g. '1s') is
the right answer when back-pressure is needed.
EC-11 — Schedule interval shorter than refresh duration
| Field | Value |
|---|---|
| Area | Scheduler — timing |
| Tier | P1 |
| Impact | Perpetual latency spiral; cascading DAG chains fall behind |
| Current mitigation | Widen the interval; use IMMEDIATE mode; or split the DAG |
| Documented in | FAQ § “Performance and Tuning” |
| Status | ✅ IMPLEMENTED — scheduler_falling_behind NOTIFY alert at 80% threshold |
Test coverage: SchedulerFallingBehind variant covered in
test_alert_event_as_str and test_alert_event_all_variants_unique
(monitor.rs unit tests).
Chosen fix: scheduler_falling_behind NOTIFY alert (short term):
When the refresh duration exceeds 80% of the schedule interval for 3
consecutive cycles, emit a NOTIFY on the channel pgtrickle_alerts with a
JSON payload containing the stream table name, average duration, and
configured interval. Operators can LISTEN on the channel and wire it into
their alerting stack.
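A minimal subscriber looks like the following. The payload shown is a sketch of the fields described above (stream table name, average duration, configured interval), not a verbatim schema; the stream table name is hypothetical.

```sql
-- Subscribe to pg_trickle's alert channel in a monitoring session.
LISTEN pgtrickle_alerts;

-- Example notification payload for a falling-behind stream table:
-- {"event": "scheduler_falling_behind",
--  "stream_table": "public.order_totals",
--  "avg_duration_ms": 4200,
--  "interval_ms": 5000}
```

NOTIFY payloads are capped at roughly 8000 bytes, so keeping the payload to a compact JSON summary is the right design here.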
Deferred:
- auto_backoff GUC (double effective interval when falling behind) — medium term, schedule for a later sprint.
- explain_st(name) recommendation from pgt_refresh_history — long term.
EC-12 — No read-your-writes (DIFFERENTIAL/FULL)
| Field | Value |
|---|---|
| Area | Scheduler — consistency model |
| Tier | P3 |
| Impact | Stale reads immediately after write |
| Current mitigation | Use IMMEDIATE mode or call refresh_stream_table() manually |
| Documented in | FAQ § “Consistency” |
Proposed fix: Inherent to the scheduled refresh model. No code fix.
Consider a convenience function pgtrickle.write_and_refresh(dml_sql,
st_name) that executes the DML and triggers a manual refresh in a single
call. Low priority — manual refresh_stream_table() already covers this.
EC-13 — Diamond dependency split-version reads
| Field | Value |
|---|---|
| Area | Diamond dependencies — consistency |
| Tier | P1 |
| Impact | Fan-in node reads inconsistent upstream versions |
| Current mitigation | diamond_consistency = 'atomic' |
| Documented in | FAQ § “Diamond Dependencies”; SQL_REFERENCE |
| Status | ✅ FULLY IMPLEMENTED — default 'atomic' + scheduler sub-transaction fix |
Proposed fix: The atomic mode already fixes this. Remaining work:
- Short term (P1): ✅ Done — default diamond_consistency to 'atomic' for new stream tables (previously 'none'). This is a safer default; users who do not need atomicity can opt out.
- Medium term: add a health_check() warning when a diamond graph exists but diamond_consistency = 'none'.
Scheduler bug fixed (follow-on): When EC-13 changed the default to
'atomic', a latent bug surfaced: the scheduler’s diamond group path used
Spi::run("SAVEPOINT …") which PostgreSQL rejects in background workers
(SPI_ERROR_TRANSACTION). This silently skipped every diamond group refresh.
Fixed by replacing the SPI SAVEPOINT calls with
pg_sys::BeginInternalSubTransaction / ReleaseCurrentSubTransaction /
RollbackAndReleaseCurrentSubTransaction, which bypass SPI. See
src/scheduler.rs.
EC-14 — Diamond upstream failure skips downstream
| Field | Value |
|---|---|
| Area | Diamond dependencies — failure handling |
| Tier | P3 |
| Impact | Downstream stays stale for one cycle |
| Current mitigation | Automatic retry on next cycle |
| Documented in | FAQ § “Diamond Dependencies” |
Proposed fix: Current behaviour is correct. Add a NOTIFY
diamond_group_partial_failure when a group member fails so operators
can investigate. No urgency — the automatic retry handles recovery.
EC-15 — SELECT * on source table + DDL
| Field | Value |
|---|---|
| Area | DDL — schema evolution |
| Tier | P1 |
| Impact | Refresh fails; stream table marked needs_reinit |
| Current mitigation | Use explicit column lists in defining queries |
| Documented in | FAQ § “Schema Changes” |
| Status | ✅ IMPLEMENTED — WARNING at creation time for SELECT * |
Test coverage: 10 unit tests for detect_select_star() pure function
(api.rs): bare *, table-qualified t.*, explicit columns, count(*),
sum(*), mixed agg+bare star, no asterisk, subquery star, case
insensitivity, no-FROM clause.
Proposed fix:
- Short term (P1): ✅ Done — emits a WARNING at create_stream_table() time when the defining query contains SELECT *. Detection logic extracted to the pure detect_select_star() function for unit testability.
- Medium term: when needs_reinit is triggered by a column-count mismatch, attempt an automatic reinitialisation: re-parse the query, verify the new column set is compatible, and refresh. If compatible, skip the needs_reinit flag and refresh directly.
EC-16 — ALTER FUNCTION body change not detected
| Field | Value |
|---|---|
| Area | DDL — function dependency tracking |
| Tier | P1 |
| Impact | Stream table silently uses new function logic without a full rebase |
| Current mitigation | Manual full refresh after function changes |
| Documented in | FAQ § “Schema Changes” |
| Status | ✅ IMPLEMENTED — function_hashes TEXT column on pgt_stream_tables (migration 0.2.1→0.2.2); check_proc_hashes_changed() in refresh.rs queries md5(prosrc || coalesce(probin::text,'')) from pg_proc on each diff refresh; on mismatch calls mark_for_reinitialize + NOTICE; on first call stores baseline silently |
Implemented fix:
Polling via pg_proc on each differential refresh:
- On first call (stored hashes = NULL): compute and persist baseline. No reinit.
- On following calls: recompute hashes. If any differ → mark_for_reinitialize + NOTICE. The next scheduler cycle performs a full refresh.
- Uses md5(string_agg(prosrc || coalesce(probin::text,''), ',' ORDER BY oid)) to handle polymorphic overloads robustly.
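The probe has the following shape. The pg_proc columns are standard PostgreSQL catalog columns; the function name is a hypothetical example of a tracked dependency.

```sql
-- One hash per tracked function name; aggregating over oid folds all
-- overloads into a single stable digest, so adding or editing any
-- overload changes the hash.
SELECT md5(string_agg(prosrc || coalesce(probin::text, ''), ','
                      ORDER BY oid)) AS body_hash
FROM pg_proc
WHERE proname = 'score_tier'
  AND pronamespace = 'public'::regnamespace;
```

Including probin covers C-language functions, whose body lives in a shared library rather than prosrc.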
Test coverage: 3 E2E tests (e2e_multi_cycle_tests.rs):
- test_ec16_function_body_change_marks_reinit — verifies that
CREATE OR REPLACE FUNCTION triggers needs_reinit = true on the next
differential refresh.
- test_ec16_function_change_full_refresh_recovery — verifies that after
reinit, a full refresh produces correct results with the new function logic.
- test_ec16_no_functions_unaffected — verifies that stream tables without
user-defined functions are never affected by the hash polling mechanism.
EC-17 — Schema change during active refresh
| Field | Value |
|---|---|
| Area | DDL — concurrency |
| Tier | P2 |
| Impact | Refresh errors; stream table suspended |
| Documented in | FAQ § “Schema Changes” |
| Status | ✅ Done — FAQ section documents lock interaction |
Proposed fix: No code fix needed. Implemented: an FAQ section was added
explaining that the ShareLock held on source tables during refresh blocks
concurrent ALTER TABLE (which requires AccessExclusiveLock). The window is
only open if the DDL sneaks in between lock acquisition and the first read.
EC-18 — auto CDC mode silent non-upgrade
| Field | Value |
|---|---|
| Area | CDC — mode transition |
| Tier | P1 |
| Impact | User thinks WAL CDC is active; actually stuck on triggers forever |
| Current mitigation | Default is TRIGGER mode; auto must be explicitly set |
| Documented in | FAQ § “CDC Architecture” |
| Status | ✅ IMPLEMENTED — rate-limited LOG every ~60 scheduler ticks |
Proposed fix:
- Short term (P1): when cdc_mode = 'auto', emit a LOG-level message on every scheduler cycle that explains why the transition hasn’t happened (e.g. “wal_level is ‘replica’, need ‘logical’”). Rate-limit to once per 5 minutes.
- Medium term: add a check_cdc_health() finding for “auto mode stuck in TRIGGER phase for > 1 hour”.
Test coverage: 2 E2E tests (e2e_wal_cdc_tests.rs) + 1 existing test:
- test_ec18_check_cdc_health_shows_trigger_for_stuck_auto — verifies
check_cdc_health() reports TRIGGER mode for a keyless auto-CDC source
and no alert fires for healthy TRIGGER-mode sources.
- test_ec18_health_check_ok_with_trigger_auto_sources — verifies
health_check() does not flag TRIGGER-mode auto-CDC sources as errors.
- test_wal_keyless_table_stays_on_triggers (pre-existing) — verifies
keyless tables remain on TRIGGER mode in auto-CDC configuration.
EC-19 — WAL mode + keyless tables need REPLICA IDENTITY FULL
| Field | Value |
|---|---|
| Area | CDC — WAL decoding |
| Tier | P0 |
| Impact | Silent data loss for UPDATEs/DELETEs on keyless tables in WAL mode |
| Current mitigation | Set REPLICA IDENTITY FULL before switching to WAL CDC |
| Documented in | FAQ § “CDC Architecture” |
| Status | ✅ IMPLEMENTED — setup_cdc_for_source() rejects at creation time |
Proposed fix:
- Short term (P0): at create_stream_table() / alter_stream_table() time, when cdc_mode involves WAL and any source table is keyless, check pg_class.relreplident and reject with a clear error if it is not 'f' (full). This prevents the silent data loss entirely.
- Medium term: automatically issue ALTER TABLE … REPLICA IDENTITY FULL for keyless tables when entering WAL mode (with a NOTICE).
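The check can be reproduced manually against the standard catalog; the table name below is a hypothetical keyless source.

```sql
-- 'f' means REPLICA IDENTITY FULL; 'd' (default), 'n' (nothing) and
-- 'i' (index) all lose old-row images for keyless tables in WAL decoding.
SELECT relreplident
FROM pg_class
WHERE oid = 'public.events_keyless'::regclass;

-- Remedy before enabling WAL CDC on a keyless table:
ALTER TABLE public.events_keyless REPLICA IDENTITY FULL;
```

Note that REPLICA IDENTITY FULL makes logical decoding emit the entire old row for every UPDATE/DELETE, which is exactly what keyless delta computation needs but does add WAL volume.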
Test coverage: 2 E2E tests (e2e_wal_cdc_tests.rs):
- test_ec19_wal_keyless_without_replica_identity_full_rejected — verifies
that create_stream_table(cdc_mode => 'wal') on a keyless table without
REPLICA IDENTITY FULL is rejected with a clear error.
- test_ec19_wal_keyless_with_replica_identity_full_accepted — verifies
that the same combination succeeds once REPLICA IDENTITY FULL is set.
EC-20 — CDC TRANSITIONING phase complexity
| Field | Value |
|---|---|
| Area | CDC — mode transition |
| Tier | P2 |
| Impact | Potential duplicate capture during transition; rollback to triggers on failure |
| Current mitigation | LSN-based deduplication; automatic rollback |
| Documented in | FAQ § “CDC Architecture” |
| Status | ✅ Done — check_cdc_transition_health() in scheduler.rs |
Proposed fix: a combined trigger_inventory() + check_cdc_health() check.
Implemented: check_cdc_transition_health() runs after recover_from_crash()
in the scheduler. It loads all stream table deps, filters TRANSITIONING
entries, checks pg_replication_slots for slot existence, rolls back to
TRIGGER mode if the slot is missing, and logs status.
EC-21 — Stream tables not propagated to logical replication subscriber
| Field | Value |
|---|---|
| Area | Replication — architecture |
| Tier | P3 |
| Impact | Subscriber gets static snapshot without refresh capability |
| Current mitigation | Run pg_trickle only on primary |
| Documented in | FAQ § “Replication” |
| Status | ✅ Done — FAQ documentation sweep completed |
Proposed fix: Inherent to the architecture — CDC triggers and the scheduler only run on the primary. No code fix. Consider a future “subscriber mode” that detects replicated source tables and runs its own CDC + refresh cycle, but this is a major undertaking (v2.0+).
EC-22 — Change buffers not published to subscribers
| Field | Value |
|---|---|
| Area | Replication — change buffers |
| Tier | P3 |
| Impact | Subscriber cannot drive its own refresh |
| Documented in | FAQ § “Replication” |
| Status | ✅ Done — FAQ documentation sweep completed |
Proposed fix: By design. Same as EC-21 — keep documentation clear.
EC-23 — Scheduler does not run on standby / replica
| Field | Value |
|---|---|
| Area | Replication — standby |
| Tier | P3 |
| Impact | No IVM on read replicas |
| Current mitigation | Query the primary; or promote the replica |
| Documented in | FAQ § “HA / Replication” |
| Status | ✅ Done — FAQ documentation sweep completed |
Proposed fix: By design — standbys are read-only. No code fix. A future enhancement could add a read-only health endpoint on standby that reports the last refresh timestamp from the replicated catalog, so monitoring can detect staleness.
EC-24 — TRUNCATE on source table bypasses CDC triggers
| Field | Value |
|---|---|
| Area | CDC — TRUNCATE handling |
| Tier | P3 |
| Impact | Stream table not updated (until reinit on next cycle) |
| Current mitigation | TRUNCATE event trigger marks needs_reinit; next cycle does full recompute |
| Documented in | SQL_REFERENCE § “TRUNCATE” |
Proposed fix: Already handled by the TRUNCATE event trigger. The only gap is latency — between the TRUNCATE and the next cycle, the stream table is stale. For IMMEDIATE mode, TRUNCATE triggers a same-transaction full refresh (already implemented). For DIFFERENTIAL/FULL, the current behaviour is acceptable. No further code fix needed.
EC-25 — TRUNCATE on a stream table itself
| Field | Value |
|---|---|
| Area | Stream table — direct manipulation |
| Tier | P1 |
| Impact | Frontier / buffer desync; future refreshes produce incorrect deltas |
| Current mitigation | Use refresh_stream_table() to reset |
| Documented in | SQL_REFERENCE § “What Is NOT Allowed” |
| Status | ✅ IMPLEMENTED — BEFORE TRUNCATE guard trigger blocks direct TRUNCATE |
Test coverage: test_guard_trigger_blocks_truncate E2E test
(e2e_guard_trigger_tests.rs).
Proposed fix:
- Short term (P1): ✅ Done — BEFORE TRUNCATE trigger installed on storage tables at creation time. Checks the pg_trickle.internal_refresh session variable to allow managed refreshes. The internal_refresh flag is set in all refresh code paths: execute_manual_refresh, execute_manual_full_refresh, execute_full_refresh, execute_topk_refresh, and pgt_ivm_handle_truncate.
- Medium term: instead of blocking, intercept TRUNCATE and automatically perform a full refresh (same approach as the IMMEDIATE mode TRUNCATE handling). This is friendlier but may surprise users who expect TRUNCATE to be instant.
EC-26 — Direct DML on stream table
| Field | Value |
|---|---|
| Area | Stream table — direct manipulation |
| Tier | P1 |
| Impact | Data overwritten or duplicated on next refresh |
| Current mitigation | Documentation; never write directly |
| Documented in | SQL_REFERENCE § “What Is NOT Allowed” |
| Status | ✅ IMPLEMENTED — BEFORE INSERT/UPDATE/DELETE guard trigger on storage table |
Test coverage: 4 E2E tests (e2e_guard_trigger_tests.rs) —
test_guard_trigger_blocks_direct_insert, _update, _delete +
test_guard_trigger_allows_managed_refresh.
Proposed fix:
- Short term (P1): ✅ Done — row-level BEFORE trigger installed on storage tables at creation time. Uses the pg_trickle.internal_refresh session variable (checked via current_setting(..., true)) to allow managed refreshes. Returns OLD for DELETE operations (returning NEW would be NULL, silently cancelling the DELETE). The internal_refresh flag is set across all refresh code paths.
- Medium term: use a PostgreSQL row-level security policy or a pg_trickle.allow_direct_dml session variable as an escape hatch for power users.
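The guard logic can be sketched as follows. Function, trigger, and table names are illustrative; the extension installs its own versions, but the OLD/NEW return handling matches the behaviour described above.

```sql
CREATE OR REPLACE FUNCTION pgt_guard_dml() RETURNS trigger AS $$
BEGIN
    -- Managed refresh paths set this session variable before writing;
    -- the second argument makes current_setting return NULL (not error)
    -- when the variable is unset.
    IF current_setting('pg_trickle.internal_refresh', true) = 'on' THEN
        -- Returning NEW for a DELETE would be NULL and cancel the
        -- operation, so DELETE must return OLD.
        IF TG_OP = 'DELETE' THEN RETURN OLD; ELSE RETURN NEW; END IF;
    END IF;
    RAISE EXCEPTION 'direct DML on stream table % is not allowed',
        TG_TABLE_NAME;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER guard_dml
BEFORE INSERT OR UPDATE OR DELETE ON my_stream_table_storage
FOR EACH ROW EXECUTE FUNCTION pgt_guard_dml();
```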
EC-27 — Foreign keys on stream tables
| Field | Value |
|---|---|
| Area | Stream table — constraints |
| Tier | P3 |
| Impact | FK violations during bulk MERGE |
| Current mitigation | Do not define FKs |
| Documented in | SQL_REFERENCE § “What Is NOT Allowed” |
Proposed fix: Reject ALTER TABLE … ADD CONSTRAINT … FOREIGN KEY on
stream tables via event trigger (same pattern as EC-25/EC-26). P3 — the
current documentation is sufficient, and FK creation on stream tables is
rare.
EC-28 — PgBouncer transaction pooling
| Field | Value |
|---|---|
| Area | Infrastructure — connection pooling |
| Tier | P2 |
| Impact | Prepared statement errors, lock escapes, GUC not applied |
| Current mitigation | Use session-mode pooling; use SET LOCAL |
| Documented in | FAQ § “PgBouncer” |
| Status | ✅ Done — documentation already exists |
Proposed fix: pg_trickle’s scheduler uses a single dedicated backend
connection (not pooled), so the scheduler itself is unaffected. The edge
case applies to application queries that use SET pg_trickle.* GUCs via a
pooled connection. No code fix — this is a PgBouncer limitation.
Improve documentation with a recommended PgBouncer configuration snippet.
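A possible shape for that documentation snippet, assuming applications that set pg_trickle.* GUCs connect through a dedicated session-mode database entry (all host/database values are placeholders):

```ini
; Sketch of a pgbouncer.ini fragment: route connections that rely on
; SET pg_trickle.* through session pooling so GUCs survive across
; statements; keep transaction pooling for everything else.
[databases]
app_db = host=127.0.0.1 port=5432 pool_mode=session

[pgbouncer]
pool_mode = transaction
```

Alternatively, SET LOCAL inside an explicit transaction remains safe under transaction pooling, as the FAQ already notes.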
EC-29 — TopK + set operations / GROUPING SETS
| Field | Value |
|---|---|
| Area | TopK — SQL restrictions |
| Tier | P3 |
| Impact | Creation error |
| Documented in | SQL_REFERENCE § “TopK” |
Proposed fix: TopK uses scoped recomputation via MERGE, which
fundamentally requires a single ORDER BY … LIMIT at the top level. Set
operations and multi-level grouping sets break this invariant. No code
fix. Keep the clear rejection error and document the workaround (split
into separate stream tables).
EC-30 — Circular dependencies
| Field | Value |
|---|---|
| Area | DAG — cycle detection |
| Tier | P3 |
| Impact | Creation error with cycle path in message |
| Documented in | FAQ § “Dependencies”; SQL_REFERENCE |
Proposed fix: Already handled — rejected at creation time with a clear
error listing the cycle path. See plans/sql/PLAN_CIRCULAR_REFERENCES.md
for future exploration of safe fixed-point evaluation. No further work.
EC-31 — FOR UPDATE / FOR SHARE in defining query
| Field | Value |
|---|---|
| Area | SQL — row locking |
| Tier | P3 |
| Impact | Creation error |
| Documented in | SQL_REFERENCE § “Rejected Constructs” |
Proposed fix: By design — row locks serve no purpose in stream table defining queries. No code fix. The error message is clear.
EC-32 — ALL (subquery) not supported
| Field | Value |
|---|---|
| Area | DIFFERENTIAL — subquery operators |
| Tier | P2 |
| Impact | Creation error |
| Documented in | FAQ § “Why Are These SQL Features Not Supported?” |
| Status | ✅ IMPLEMENTED — parse_all_sublink() in src/dvm/parser.rs updated with NULL-safe anti-join condition (col IS NULL OR NOT (x op col)); full AntiJoin pipeline was already wired (parse_sublink_to_wrapper → parse_all_sublink → AntiJoin SublinkWrapper → diff_anti_join) |
Fix applied: NULL-safe anti-join condition.
ALL (subquery) is mathematically a conjunction of per-row comparisons.
The full pipeline through parse_all_sublink() → AntiJoin OpTree →
diff_anti_join was already wired. The outstanding issue was NULL-safety:
the previous condition NOT (x op col) evaluated to NULL (not TRUE) when
col IS NULL, so NULL rows in the subquery were silently ignored, giving
wrong results. Fixed by changing the condition to
(col IS NULL OR NOT (x op col)), which correctly excludes the outer row
whenever any subquery row is NULL or fails the comparison.
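The fixed condition corresponds to the following rewrite (table and column names illustrative):

```sql
-- Original predicate:
--   SELECT * FROM products p
--   WHERE p.price < ALL (SELECT c.price FROM competitors c);
-- NULL-safe anti-join form used by the fix:
SELECT p.*
FROM products p
WHERE NOT EXISTS (
  SELECT 1 FROM competitors c
  WHERE c.price IS NULL           -- any NULL makes ALL unknown → excluded
     OR NOT (p.price < c.price)   -- any failing comparison → excluded
);
-- An empty subquery keeps the row (ALL over the empty set is true).
```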
Test coverage: 8 E2E tests (e2e_all_subquery_tests.rs):
- test_all_subquery_less_than_differential — basic price < ALL filter
matching the SQL Reference worked example.
- test_all_subquery_differential_inner_insert — differential refresh after
inserting a cheaper competitor price that disqualifies a product.
- test_all_subquery_differential_outer_change — outer INSERT + inner DELETE
with recomputation.
- test_all_subquery_null_in_inner — NULL-safety: ALL returns false when
subquery contains NULL; removing NULL re-qualifies rows.
- test_all_subquery_empty_inner — empty subquery: ALL against empty set
yields true (SQL standard); inserting reduces result.
- test_all_subquery_full_refresh — >= ALL with FULL refresh mode.
- test_all_subquery_equals_operator — = ALL (value equals every row).
- test_all_subquery_not_equals_operator — <> ALL (equivalent to NOT IN).
EC-33 — Statistical aggregates (CORR, COVAR*, REGR*)
| Field | Value |
|---|---|
| Area | DIFFERENTIAL — aggregates |
| Tier | P3 |
| Impact | Creation error in DIFFERENTIAL mode |
| Current mitigation | Use FULL mode; or pre-aggregate in DIFFERENTIAL and compute in a view |
| Documented in | FAQ § “Unsupported Aggregates” |
Chosen fix: group-rescan (see PLAN_TRANSACTIONAL_IVM_PART_2.md Task 4.1):
Implement CORR/COVAR_POP/COVAR_SAMP/REGR_* using the proven
group-rescan strategy already used for BOOL_AND, STRING_AGG, etc.:
- Detect affected groups from the delta (groups where any row was
  inserted, updated, or deleted).
- For each affected group, re-aggregate from source:
  SELECT group_key, CORR(y, x) FROM source GROUP BY group_key
- Apply the result as an UPDATE to the stream table.
This requires no stream table schema changes and no new OpTree variants.
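As a schematic (the delta, source, and storage table names are illustrative, not the extension's internals), the group-rescan pass looks like:

```sql
-- Sketch of the group-rescan strategy for CORR.
WITH affected AS (
  SELECT DISTINCT group_key FROM change_buffer        -- delta rows only
),
recomputed AS (
  SELECT s.group_key, CORR(s.y, s.x) AS corr_val
  FROM source_table s
  JOIN affected a USING (group_key)                   -- rescan only touched groups
  GROUP BY s.group_key
)
UPDATE stream_storage st
SET corr_val = r.corr_val
FROM recomputed r
WHERE st.group_key = r.group_key;
```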
Deferred — Welford auxiliary accumulator columns:
The earlier proposal of maintaining sum_x, sum_y, sum_xy, sum_x2,
count accumulator columns in the stream table using Welford’s online
algorithm adapted to SQL EXCEPT/UNION was explored but is deferred.
Rationale: it requires a breaking schema change to existing stream tables,
the SQL delta-algebra correctness of the Welford adaptation is non-trivial
to verify, and group-rescan is already correct and benchmarked for
analogue aggregates. Revisit for v2.0 if group-rescan performance is
insufficient for very large groups.
EC-34 — Backup / restore loses WAL replication slots
| Field | Value |
|---|---|
| Area | Operations — disaster recovery |
| Tier | P1 |
| Impact | CDC stuck; slot-missing error after restore |
| Current mitigation | Set cdc_mode = 'trigger' after restore; let auto recreate slots |
| Documented in | FAQ § “Operations” |
| Status | ✅ IMPLEMENTED — auto-detect missing slot in health check; fall back to TRIGGER |
Proposed fix:
- Short term (P1): Add a health_check() finding that detects "cdc_mode
  includes WAL but replication slot does not exist" and recommends the
  trigger fallback.
- Medium term: On scheduler startup, when a source's cdc_phase is WAL
  but the slot is missing, automatically fall back to TRIGGER mode with
  a WARNING log and a NOTIFY.
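The detection probe amounts to a catalog anti-join; the source catalog table and slot-naming convention below are hypothetical placeholders.

```sql
-- Sketch: find WAL-mode sources whose replication slot has vanished.
SELECT s.source_relid
FROM pgtrickle_sources s                  -- hypothetical catalog table
WHERE s.cdc_mode LIKE '%wal%'
  AND NOT EXISTS (
    SELECT 1 FROM pg_replication_slots r
    WHERE r.slot_name = 'pg_trickle_' || s.source_relid
  );
```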
Test coverage: 2 E2E tests (e2e_wal_cdc_tests.rs):
- test_wal_fallback_on_missing_slot (pre-existing) — full lifecycle test:
creates WAL-mode ST, drops replication slot externally, waits for automatic
fallback to TRIGGER, verifies data integrity after fallback.
- test_ec34_check_cdc_health_detects_missing_slot — verifies
check_cdc_health() surfaces replication_slot_missing alert immediately
after slot drop, then confirms fallback to TRIGGER and post-fallback
data integrity.
EC-35 — Managed cloud PostgreSQL restrictions
| Field | Value |
|---|---|
| Area | Infrastructure — cloud compatibility |
| Tier | P3 |
| Impact | Extension cannot be installed |
| Documented in | FAQ § “Deployment” |
Proposed fix: pg_trickle cannot control cloud provider policies. Keep
the FAQ entry updated with a compatibility matrix (RDS, Cloud SQL,
AlloyDB, Azure Flexible, Supabase, Neon). Note that trigger-based CDC
avoids the most common restriction (wal_level = logical).
EC-36 — Managed cloud: shared_preload_libraries restriction
| Field | Value |
|---|---|
| Area | Infrastructure — cloud compatibility |
| Tier | P3 |
| Impact | Background worker + shared memory unavailable |
| Documented in | FAQ § “Deployment” |
Proposed fix: Investigate a “no-bgworker” mode where
refresh_stream_table() is called by an external cron (e.g. pg_cron or
an application timer). The scheduler would be optional when
shared_preload_libraries is unavailable. See
plans/infra/REPORT_EXTERNAL_PROCESS.md for the sidecar feasibility
study. Long term (v2.0+).
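As an interim sketch of the cron-driven pattern (the extension schema and the refresh function's exact signature are assumptions):

```sql
-- Drive refreshes from pg_cron instead of the background worker.
SELECT cron.schedule(
  'refresh_orders_st',
  '*/5 * * * *',
  $$SELECT pg_trickle.refresh_stream_table('public.orders_st')$$
);
```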
Prioritised Recommendations
P0 — Must fix (data correctness)
| # | Edge Case | Action | Estimated Effort |
|---|---|---|---|
| 1 | ✅ EC-01: JOIN key change + right-side DELETE | R₀ via EXCEPT ALL — implemented in diff_inner/left/full_join | Done |
| 2 | ✅ EC-06: Keyless table duplicate rows | WARNING at creation + has_keyless_source flag | Done |
| 3 | ✅ EC-19: WAL + keyless without REPLICA IDENTITY FULL | WARNING at creation (has_keyless_source flag) | Done |
P1 — Should fix (operational surprise)
| # | Edge Case | Action | Estimated Effort |
|---|---|---|---|
| 4 | ✅ EC-11: Schedule < refresh duration | scheduler_falling_behind NOTIFY alert at 80% threshold | Done |
| 5 | ✅ EC-13: Diamond split-version reads | Default 'atomic' + pg_sys::BeginInternalSubTransaction in scheduler | Done |
| 6 | ✅ EC-15: SELECT * + DDL breakage | WARNING at creation for SELECT * queries | Done |
| 7 | ✅ EC-16: ALTER FUNCTION undetected | pg_proc hash polling on refresh cycle | Done |
| 8 | ✅ EC-18: auto CDC stuck in TRIGGER | Rate-limited LOG explaining why; health_check finding | Done |
| 9 | ✅ EC-25: TRUNCATE on stream table | Event trigger blocks TRUNCATE on pgtrickle schema | Done |
| 10 | ✅ EC-26: Direct DML on stream table | Guard trigger detecting non-engine writes | Done |
| 11 | ✅ EC-34: WAL slots lost after restore | Auto-detect missing slot; fall back to TRIGGER + WARNING | Done |
P2 — Nice to have (usability gaps)
| # | Edge Case | Action | Estimated Effort |
|---|---|---|---|
| 12 | ✅ EC-03: Window functions in expressions | CTE extraction in parser | Done |
| 13 | EC-05: Foreign tables | ✅ IMPLEMENTED — error msg + polling CDC | — |
| 14 | ✅ EC-09: IMMEDIATE unsupported constructs | Done — recursive CTEs + TopK now supported | Done |
| 15 | ✅ EC-17: DDL during active refresh | FAQ documentation added | Done |
| 16 | ✅ EC-20: CDC TRANSITIONING complexity | check_cdc_transition_health() in scheduler.rs | Done |
| 17 | ✅ EC-28: PgBouncer | Documentation already exists | Done |
| 18 | ✅ EC-32: ALL (subquery) | NULL-safe anti-join condition in parse_all_sublink() | Done |
P3 — Accepted trade-offs (document, no code change)
✅ EC-02 (GUC implemented), EC-04, EC-07, EC-08, EC-10, EC-12, EC-14, ✅ EC-21/22/23 (docs done), EC-24, EC-27, EC-29, EC-30, EC-31, EC-35, EC-36.
These are either fundamental design trade-offs or have adequate existing mitigations. Keep documentation current and revisit if user demand surfaces.
P3 — Committed long-term implementation
| # | Edge Case | Action | Estimated Effort |
|---|---|---|---|
| 19 | EC-33: CORR, COVAR*, REGR* | Group-rescan of affected groups (Welford accumulator columns deferred to v2.0) | 5–8 days |
Implementation Order
Recommended sprint sequence (assuming 2-week sprints):
Sprint 1 — P0 corrections: ✅ DONE
- ✅ EC-19: WAL + keyless rejection (done: has_keyless_source flag + WARNING)
- ✅ EC-06: Keyless duplicate warning + has_keyless_source catalog column
- ✅ EC-01: R₀ via EXCEPT ALL (implemented in diff_inner/left/full_join)
Sprint 2 — P1 operational safety: ✅ DONE
- ✅ EC-25: Block TRUNCATE on stream tables (event trigger)
- ✅ EC-26: Guard trigger for direct DML
- ✅ EC-15: Warn on SELECT * (WARNING at creation)
- ✅ EC-11: scheduler_falling_behind alert (NOTIFY at 80% threshold)
- ✅ EC-13: Default diamond_consistency = 'atomic' + scheduler SAVEPOINT → pg_sys::BeginInternalSubTransaction fix
Sprint 3 — Remaining P1 + P2 starts: ✅ DONE (v0.6.0)
- ✅ EC-18: Rate-limited LOG for stuck auto mode (1 day)
- ✅ EC-34: Auto-detect missing WAL slot (1 day)
- ✅ EC-16: pg_proc hash polling for function changes (2 days)
- ✅ EC-03: Window function CTE extraction (3–5 days)
Sprint 4+ — P2 + P3 long-term: ✅ DONE (v0.6.0)
- ✅ EC-05, EC-32: Foreign table change detection; ALL (subquery) → NOT EXISTS (… EXCEPT …) rewrite
- EC-33: Statistical aggregates via auxiliary accumulator columns
- Documentation sweep for all remaining P3 items
Open Questions
The following questions are unresolved design or validation unknowns that must be answered before the associated fixes can be fully specified or committed to.
OQ-01 — EC-01: What is the correct new default for adaptive_full_threshold? ✅ RESOLVED
Resolution (2026-05-30): The short-term threshold adjustment is dropped
as not viable. It is a heuristic that can still silently miss deletes when
co-occurring changes fall below the threshold, providing false safety. The
primary fix is the R₀ via EXCEPT ALL approach (see EC-01). The
adaptive_full_threshold GUC remains as-is for users who want it as an
escape valve, but it is no longer positioned as an EC-01 mitigation path.
OQ-02 — EC-01: Is pre-image capture viable without a breaking change buffer schema change? ✅ RESOLVED
Resolution (2026-05-30): Schema changes are NOT needed. The CDC trigger
(src/cdc.rs) already stores OLD."col" → "old_col" typed columns in the
change buffer for every UPDATE and DELETE event. The old join key values are
therefore already available. The question is now about when to use them.
The R₀ via EXCEPT ALL approach (chosen for Sprint 1) does not need old_col
column access from the delta CTE at all — it reconstructs the pre-change right
state from the live table + ΔR change set. The pre-image (old_col) approach
is still viable as a future optimisation (avoids the live-table EXCEPT ALL
scan) but is deferred to v1.x because it requires structural changes to
DiffContext/DeltaSource to surface old_col values in the delta CTE
output. Effort: 7–10 days when the time comes.
OQ-03 — EC-01: What is the overhead of the R₀ via EXCEPT ALL approach?
Updated analysis (2026-05-30):
Code archaeology shows the overhead is similar to what Part 2 already pays
for L₀:
- Part 1 already does a semi-join-filtered scan of R₁ (the full right
  table, filtered by delta-left join keys). Adding
  R₀ = R₁ EXCEPT ALL ΔR_inserts UNION ALL ΔR_deletes
  adds one extra pass over the ΔR CTE (which is small — only changed
  rows). The live table is not scanned an extra time because R₁ is
  already in the query plan; the EXCEPT ALL just subtracts a small set
  from it.
- The semi-join filter that limits R₁ also applies to R₀, so the set
  difference operates on already-filtered rows.
- The correction term (Part 3) is unaffected by this change.
Expected overhead: < 10% additional latency on the delta query for typical workloads where ΔR is small relative to R. The concern about “one extra temp-table scan per join source” from the original write-up was over-estimated — the EXCEPT ALL operates on the already-materialized ΔR CTE, not the full right table.
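The shape of the R₀ reconstruction is schematic only (column lists and the delta encoding are illustrative):

```sql
-- UNION and EXCEPT are left-associative at equal precedence, so this
-- parses as (R₁ EXCEPT ALL inserts) UNION ALL deletes.
WITH r0 AS (
  SELECT k, v FROM r1                              -- live right table R₁
  EXCEPT ALL
  SELECT k, v FROM delta_r WHERE action = 'I'      -- undo new inserts
  UNION ALL
  SELECT k, v FROM delta_r WHERE action = 'D'      -- restore deleted rows
)
SELECT * FROM r0;
```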
Recommendation: Confirm with a micro-benchmark (1M right table, 0.1%
change rate) before the Sprint 1 PR merges. Accept if overhead is under 30%
of current DIFFERENTIAL latency; otherwise fall back to applying R₀ only
when ΔR is non-empty (easy conditional).
OQ-04 — EC-03: Where exactly do window-function-in-expression rewrites fail?
The CTE extraction rewrite for EC-03 is described as covering “~80% of user demand”, with complex cases (nested window calls inside arithmetic) remaining unsupported.
Open questions:
- Is SUM(ROW_NUMBER() OVER (…)) rejectable at parse time, or does it
require a full semantic analysis?
- Can NTILE(4) OVER (…) * 100 be rewritten to a CTE without changing
query semantics?
- What are the most common real-world patterns users hit? (Survey the
GitHub issues / Discord.)
Recommendation: Before implementing the rewrite, define a formal
boundary: “any expression where a window function appears as a direct
child of the SELECT list is rewriteable; anything deeper is rejected with
a clear error.” Enumerate the cases explicitly in a test file
(tests/e2e_window_expression_tests.rs) before writing the rewrite code.
This keeps the scope bounded.
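For the simple NTILE case, the CTE extraction would presumably look like this (names illustrative):

```sql
-- Rejected today, because the window function is nested in an expression:
--   SELECT id, NTILE(4) OVER (ORDER BY score) * 100 AS pct FROM scores;
-- CTE extraction would rewrite it to:
WITH w AS (
  SELECT id, NTILE(4) OVER (ORDER BY score) AS q
  FROM scores
)
SELECT id, q * 100 AS pct FROM w;
```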
OQ-05 — EC-06: Is ctid-based row identity viable for trigger-mode CDC?
The long-term fix for EC-06 proposes using ctid as a stable row
identifier for keyless tables in trigger mode. ctid is the physical
heap location and changes after VACUUM FULL / CLUSTER.
Open questions:
- Does pgrx expose the ctid of OLD in a row-level trigger? (Need to
verify via SPI_gettypeid or HeapTupleHeader.)
- What happens if VACUUM FULL runs between the CDC trigger fire and the
next refresh cycle? Would the stale ctid silently match the wrong row?
- Is ctid reset to the same value when the same physical slot is reused?
(Yes in PostgreSQL — ctid is reused after VACUUM, not unique across
time.)
Recommendation: Do not pursue ctid-based identity. The reuse semantics
make it unsafe as a durable row identifier, so ctid is off the table.
Instead, focus Sprint 1 effort on the count-based hash approach. If a
unique row identity is required, recommend users add a
DEFAULT gen_random_uuid() column.
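That escape hatch is a two-statement change (table name illustrative; gen_random_uuid() requires PostgreSQL 13+ or the pgcrypto extension):

```sql
ALTER TABLE keyless_events
  ADD COLUMN row_id uuid NOT NULL DEFAULT gen_random_uuid();
ALTER TABLE keyless_events ADD PRIMARY KEY (row_id);
```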
OQ-06 — EC-08: Do statement-level transition tables actually batch multiple row changes?
The plan for EC-08 states that statement-level triggers with transition
tables “provide batching naturally”. This is true for a single UPDATE
that affects N rows — the transition table contains all N old/new rows and
the IVM trigger fires once. But it is not true for N separate
single-row UPDATE statements in a loop — each fires the trigger once.
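The distinction can be made concrete with a sketch (trigger body elided; names hypothetical):

```sql
-- Statement-level trigger with transition tables:
CREATE TRIGGER ivm_after_update
  AFTER UPDATE ON src
  REFERENCING OLD TABLE AS old_rows NEW TABLE AS new_rows
  FOR EACH STATEMENT EXECUTE FUNCTION ivm_apply_delta();

-- One statement touching 10k rows: trigger fires once; the transition
-- tables hold all 10k old/new pairs.
UPDATE src SET price = price * 1.1 WHERE region = 'EU';

-- 10k single-row statements in an application loop: trigger fires
-- 10k times, once per statement, with a one-row transition table each.
```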
Open questions:
- What is the actual p50/p99 latency overhead per DML statement in
IMMEDIATE mode on a stream table with a 3-way join? (No benchmark
exists.)
- Does pg_advisory_xact_lock inside the trigger add measurable
contention under 100+ concurrent single-row updates?
Recommendation: Add benches/ivm_write_overhead.rs as a Criterion
benchmark that measures per-statement latency for IMMEDIATE vs
DIFFERENTIAL mode at 1, 10, 100 concurrent writers. Run it as part of the
weekly benchmark schedule. Make the results visible in
docs/BENCHMARK.md so users can make informed mode choices. This answers
the question empirically rather than by assumption.
OQ-07 — EC-36: How much of the scheduler can be decoupled for cloud deployments?
The long-term fix for EC-36 proposes a “no-bgworker” mode where
refresh_stream_table() is driven by pg_cron or an application timer.
plans/infra/REPORT_EXTERNAL_PROCESS.md covers the sidecar feasibility
research but reaches no conclusion.
Open questions:
- The scheduler currently holds a pg_advisory_lock during refresh to
prevent concurrent refreshes. Would a cron-driven caller need to acquire
this lock explicitly? (Yes — this is straightforward.)
- The scheduler tracks refresh history in pgt_refresh_history. Would a
cron-driven caller need to insert rows there manually, or can this be
encapsulated in refresh_stream_table()? (Likely yes — needs a small
refresh.rs refactor.)
- What happens to the background worker registration if pg_trickle is in
shared_preload_libraries but pg_trickle.enabled = off? The worker
starts but immediately sleeps — is this acceptable for cloud providers
that bill per background worker slot?
Recommendation: The prerequisite for EC-36 is verifying that
refresh_stream_table() already records to pgt_refresh_history (check
src/refresh.rs). If it does, the cron-mode is already 80% implemented —
users just need to disable the bgworker and call refresh_stream_table()
on a schedule. Document this pattern in docs/FAQ.md under “Deployment on
managed PostgreSQL” as an interim workaround before a formal no-bgworker
mode is built.