Plan: Native PostgreSQL Syntax for Stream Tables

Date: 2026-02-25 Status: PROPOSED Last Updated: 2026-02-25


Executive Summary

This plan proposes a tiered syntax strategy for creating and managing stream tables in pg_trickle, evolving from the current function-call API toward a more native-feeling DDL experience without forking PostgreSQL.

Tier Syntax Status Requires
Tier 1 SELECT pgtrickle.create_stream_table(...) Existing Nothing extra
Tier 1.5 CALL pgtrickle.create_stream_table(...) New (trivial) PG 11+
Tier 2 CREATE MATERIALIZED VIEW ... WITH (pgtrickle.stream = true) AS ... New shared_preload_libraries

All tiers produce identical results — a stream table registered in pgtrickle.pgt_stream_tables with CDC, DAG, and scheduling fully configured. Tier 1 remains the stable, always-available interface. Tier 2 provides native-feeling DDL syntax for users who prefer standard SQL idioms.

Estimated effort: ~1,200–1,800 lines of Rust (hook + tests) for Tier 2.


Motivation

Problem

The current API requires function-call syntax:

SELECT pgtrickle.create_stream_table(
    'order_totals',
    'SELECT region, SUM(amount) FROM orders GROUP BY region',
    '1m',
    'DIFFERENTIAL'
);

While functional, this has several UX shortcomings:

  1. Unfamiliar idiom — PostgreSQL users expect DDL operations (CREATE, DROP, ALTER) for persistent database objects. SELECT function() reads as a query, not a definition.

  2. Query-as-string — The defining query is embedded as a text literal, losing IDE support (syntax highlighting, auto-complete, error detection).

  3. No psql \d integration — Stream tables appear as plain tables in \dt, with no indication they are maintained views.

  4. pg_dump gap — Function calls are not captured by pg_dump. Stream table definitions survive only via the catalog tables, requiring custom backup/restore tooling.

  5. Discoverability — New users searching for “create materialized view postgres” won’t find pg_trickle’s function-based approach.

Goal

Provide a syntax that: - Feels like native PostgreSQL DDL - Preserves the query as SQL (not a string literal) - Works with existing PostgreSQL tooling where possible - Does NOT require forking PostgreSQL - Maintains full backward compatibility with the function API


Constraints

Hard Constraints

  1. PostgreSQL’s parser is not extensible. There is no parser hook. The raw parser (gram.y) is compiled into the server binary. Extensions cannot add new keywords (STREAM) or grammar productions (CREATE STREAM TABLE). This rules out true CREATE STREAM TABLE syntax.

  2. shared_preload_libraries is already required for the background scheduler worker and shared memory. The ProcessUtility_hook falls within the same requirement — no additional deployment burden.

  3. Hook chaining — Other extensions (TimescaleDB, Citus, pg_stat_statements) may also set ProcessUtility_hook. Our hook MUST chain to the previous hook to avoid breaking other extensions.

  4. Backward compatibility — The existing function API (pgtrickle.create_stream_table()) must continue to work unchanged. The new syntax is an additional path, not a replacement.

Soft Constraints

  1. pg_dump compatibility — Desirable but not expected to be perfect. A companion pgtrickle.dump_stream_tables() function is acceptable.

  2. Minimal surface area — The hook should intercept only the DDL commands it needs and pass everything else through without modification.


Approach Evaluation

Approach A — Parser Fork (Rejected)

Concept: Fork PostgreSQL, add CREATE STREAM TABLE as new grammar in gram.y, ship a modified PostgreSQL binary.

Why rejected: - Must maintain a full PostgreSQL fork, rebasing on every major release - Cannot use CREATE EXTENSION — must replace the user’s entire PostgreSQL - pg_dump / pg_restore / psql all need modifications - User adoption barrier is extreme (replace your database engine) - Maintenance cost: gram.y changes significantly between PG majors

Viability: Only for a PostgreSQL distribution product (like YugabyteDB, Greenplum). Not viable for a loadable extension.

Approach B — Table Access Method (Rejected)

Concept: Register a custom table AM (stream_heap), intercept CREATE TABLE ... USING stream_heap WITH (query = '...', schedule = '1m').

Why rejected: - Table AM requires implementing 60+ callbacks (scan, insert, delete, vacuum, etc.) - CREATE TABLE ... USING requires explicit column definitions — stream tables derive columns from the query, creating redundancy - No AS SELECT support with USING clause in PG grammar - Storing the defining query in WITH reloptions is hacky and has length limits - pg_dump would dump CREATE TABLE ... USING stream_heap but not the associated metadata (query, schedule) - Extreme implementation complexity for marginal UX improvement

Approach C — Foreign Data Wrapper (Rejected)

Concept: Register a custom FDW, use CREATE FOREIGN TABLE ... SERVER pgtrickle_server OPTIONS (query '...', schedule '1m').

Why rejected: - Foreign tables cannot have indexes — stream tables need __pgt_row_id unique index - Foreign tables cannot have triggers — breaks user-trigger support - No MVCC snapshot isolation guarantees - “Foreign table” implies external data, confusing the mental model - EXPLAIN shows “Foreign Scan” instead of “Seq Scan”

Approach D — ProcessUtility_hook + CREATE MATERIALIZED VIEW (Recommended)

Concept: Intercept CREATE MATERIALIZED VIEW ... WITH (pgtrickle.stream = true) AS SELECT ... via ProcessUtility_hook. When the custom option is detected, route to create_stream_table_impl() instead of standard matview creation.

Why recommended: - Proven pattern — TimescaleDB uses this exact approach for continuous aggregates (WITH (timescaledb.continuous)), one of the most widely deployed PostgreSQL extensions - Native feel — CREATE MATERIALIZED VIEW is the closest standard DDL to what a stream table is conceptually (a materialized derived dataset) - Query is SQL — The AS SELECT ... clause preserves the query as native SQL, enabling IDE syntax highlighting and auto-complete - shared_preload_libraries already required — No additional deployment burden - Column derivation — CREATE MATERIALIZED VIEW ... AS SELECT naturally derives column types from the query, exactly like stream tables - Low risk — The hook only fires for matviews with our specific option; all other DDL passes through untouched - DROP / REFRESH integration — Can intercept DROP MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW for matching stream tables

Trade-offs: - ~1,200–1,800 lines of hook code - Must track ProcessUtility_hook signature changes across PG versions (changed in PG14, PG15; stable since) - pg_dump will dump CREATE MATERIALIZED VIEW DDL but our hook must be active during restore for it to take effect - Users may be confused that stream tables appear as “materialized views” in some tools

Approach E — CALL Procedure Syntax (Complementary)

Concept: Expose create_stream_table as a PostgreSQL procedure (not just a function), enabling CALL pgtrickle.create_stream_table(...).

Why complementary: - Trivial to implement (add #[pg_extern] with procedure semantics or create a wrapper procedure) - CALL reads as a command/action rather than a query - No shared_preload_libraries requirement - Does not solve the query-as-string problem - Quick win that can ship independently


Recommended Strategy: Tiered Syntax

Tier 1: Function API (Existing — No Changes)

-- Always available, no special requirements
SELECT pgtrickle.create_stream_table(
    'order_totals',
    'SELECT region, SUM(amount) FROM orders GROUP BY region',
    '1m', 'DIFFERENTIAL'
);

SELECT pgtrickle.drop_stream_table('order_totals');
SELECT pgtrickle.refresh_stream_table('order_totals');
SELECT pgtrickle.alter_stream_table('order_totals', schedule => '5m');

Tier 1.5: CALL Procedure Syntax (New — Trivial)

-- Same API but reads as a command, not a query
CALL pgtrickle.create_stream_table(
    'order_totals',
    'SELECT region, SUM(amount) FROM orders GROUP BY region',
    '1m', 'DIFFERENTIAL'
);

CALL pgtrickle.drop_stream_table('order_totals');
CALL pgtrickle.refresh_stream_table('order_totals');

Tier 2: Native DDL Syntax (New — ProcessUtility_hook)

-- Create a stream table using familiar matview DDL
CREATE MATERIALIZED VIEW order_totals
WITH (
    pgtrickle.stream   = true,
    pgtrickle.schedule = '1m',
    pgtrickle.mode     = 'DIFFERENTIAL'
)
AS SELECT region, SUM(amount) FROM orders GROUP BY region
WITH NO DATA;

-- Or initialize immediately (WITH DATA is the default):
CREATE MATERIALIZED VIEW order_totals
WITH (pgtrickle.stream = true)
AS SELECT region, SUM(amount) FROM orders GROUP BY region;

-- Refresh
REFRESH MATERIALIZED VIEW order_totals;

-- Drop
DROP MATERIALIZED VIEW order_totals;
DROP MATERIALIZED VIEW IF EXISTS order_totals CASCADE;

-- Alter (still a function call — no standard ALTER MATERIALIZED VIEW
-- for custom options):
SELECT pgtrickle.alter_stream_table('order_totals', schedule => '5m');

Detailed Design — Tier 2: ProcessUtility_hook

User-Facing Syntax

CREATE

CREATE MATERIALIZED VIEW [IF NOT EXISTS] [schema.]name
WITH (
    pgtrickle.stream   = true,         -- required: marks this as a stream table
    pgtrickle.schedule = '1m',         -- optional: default '1m'
    pgtrickle.mode     = 'DIFFERENTIAL' -- optional: default 'DIFFERENTIAL'
)
AS select_query
[WITH DATA | WITH NO DATA];

Option semantics:

Option Type Default Description
pgtrickle.stream bool — Required. Must be true to trigger stream table creation.
pgtrickle.schedule text '1m' Duration string or cron expression. Set to 'CALCULATED' for NULL/downstream schedule.
pgtrickle.mode text 'DIFFERENTIAL' 'FULL' or 'DIFFERENTIAL'.
  • WITH DATA (default) → calls create_stream_table_impl() with initialize = true
  • WITH NO DATA → calls create_stream_table_impl() with initialize = false

DROP

DROP MATERIALIZED VIEW [IF EXISTS] [schema.]name [CASCADE | RESTRICT];

The hook checks whether the target is a registered stream table (by OID lookup in pgtrickle.pgt_stream_tables). If so, it routes to drop_stream_table_impl() instead of standard matview drop. If not, it passes through to the standard handler.

CASCADE behavior: If a stream table depends on another stream table being dropped, CASCADE propagates via pg_trickle’s dependency tracking (same as drop_stream_table_impl() already handles).

REFRESH

REFRESH MATERIALIZED VIEW [CONCURRENTLY] [schema.]name;

The hook checks whether the target is a stream table. If so, it routes to refresh_stream_table_impl(). The CONCURRENTLY keyword is ignored (stream table refreshes are always concurrent by design — readers see the previous version until the refresh completes).

ALTER

ALTER MATERIALIZED VIEW ... SET (option = value) does exist in the grammar (it parses as an AlterTableStmt; see “ALTER Interception” below), but intercepting it cleanly for custom options is deferred. For now, alter operations remain as function calls:

SELECT pgtrickle.alter_stream_table('order_totals', schedule => '5m');
SELECT pgtrickle.alter_stream_table('order_totals', status => 'SUSPENDED');

Intercepting ALTER MATERIALIZED VIEW ... SET (reloption) is possible, since it arrives as an AlterTableStmt with AT_SetRelOptions, and can be added in a later iteration.

Hook Registration

// src/hooks.rs — extended

use std::ffi::c_char;

static mut PREV_PROCESS_UTILITY_HOOK: pg_sys::ProcessUtility_hook_type = None;

/// Register ProcessUtility_hook for native DDL syntax support.
///
/// Called from _PG_init() when loaded via shared_preload_libraries.
/// Chains with any previously installed hook (e.g., pg_stat_statements,
/// TimescaleDB).
pub fn register_process_utility_hook() {
    unsafe {
        PREV_PROCESS_UTILITY_HOOK = pg_sys::ProcessUtility_hook;
        pg_sys::ProcessUtility_hook = Some(pg_trickle_process_utility);
    }
}

In _PG_init() (src/lib.rs):

if in_shared_preload {
    shmem::init_shared_memory();
    scheduler::register_scheduler_worker();

    // NEW: Register ProcessUtility_hook for native DDL syntax
    hooks::register_process_utility_hook();

    log!("pg_trickle: initialized (shared_preload_libraries)");
}

CREATE Interception

When ProcessUtility_hook fires with a CreateTableAsStmt node where objtype == OBJECT_MATVIEW:

pg_trickle_process_utility()
  ├── Is it CreateTableAsStmt with OBJECT_MATVIEW?
  │     ├── No → pass through to prev hook / standard_ProcessUtility
  │     └── Yes → extract reloptions from IntoClause
  │           ├── Has pgtrickle.stream = true?
  │           │     ├── No → pass through (normal matview)
  │           │     └── Yes → handle_create_stream_table_ddl()
  │           │           ├── Extract query string from the parse tree
  │           │           ├── Extract pgtrickle.schedule, pgtrickle.mode
  │           │           ├── Determine schema.name from IntoClause->rel
  │           │           ├── Determine WITH DATA / WITH NO DATA
  │           │           ├── Call create_stream_table_impl()
  │           │           └── Set QueryCompletion tag
  │           └── Has only non-pgtrickle options?
  │                 └── Pass through (normal matview with custom storage params)
  └── Is it something else?
        └── Check other interceptions (DROP, REFRESH)
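The dispatch tree above can be modeled in plain Rust. This is an illustrative sketch only: `Stmt` and `Route` are stand-ins for the pg_sys node-tag checks (CreateTableAsStmt with objtype == OBJECT_MATVIEW, DropStmt, RefreshMatViewStmt) and for the handler calls the real hook would make.

```rust
// Pure-Rust model of the ProcessUtility dispatch logic. All names here are
// illustrative; the real hook pattern-matches pg_sys nodes and calls the
// handle_*_stream_table_ddl() functions instead of returning an enum.
#[derive(Debug)]
enum Stmt<'a> {
    CreateMatView { options: Vec<(&'a str, &'a str)> },
    DropMatView { name: &'a str },
    RefreshMatView { name: &'a str },
    Other,
}

#[derive(Debug, PartialEq)]
enum Route {
    HandleCreate,
    HandleDrop,
    HandleRefresh,
    PassThrough,
}

fn dispatch(stmt: &Stmt<'_>, is_stream_table: impl Fn(&str) -> bool) -> Route {
    match stmt {
        // Only intercept matview creation when pgtrickle.stream = true is set.
        Stmt::CreateMatView { options } => {
            if options
                .iter()
                .any(|(k, v)| *k == "pgtrickle.stream" && *v == "true")
            {
                Route::HandleCreate
            } else {
                Route::PassThrough // normal matview, possibly with other storage params
            }
        }
        // DROP/REFRESH are ours only if the target is a registered stream table.
        Stmt::DropMatView { name } if is_stream_table(name) => Route::HandleDrop,
        Stmt::RefreshMatView { name } if is_stream_table(name) => Route::HandleRefresh,
        _ => Route::PassThrough,
    }
}
```

Everything that does not match falls through to the previous hook or standard_ProcessUtility, which is the "pass through" branch in each tree.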

Key implementation detail: The query text for the defining query must be extracted from the CreateTableAsStmt->query node. Since this is an analyzed Query node (not raw SQL text), we need to either:

  1. Deparse from the parse tree — Reconstruct SQL from the statement’s query subtree. Note that pg_sys::nodeToString() serializes PostgreSQL’s internal node-tree format, not SQL; producing actual SQL requires ruleutils-style deparse machinery, which core does not export for this case.
  2. Extract from the raw query string — The queryString parameter to the hook contains the full original SQL. Parse it to extract the AS SELECT ... portion.
  3. Use pg_get_viewdef-style deparsing — Not applicable since the matview doesn’t exist yet.

Approach (2) is simpler and preserves the user’s original formatting. The AS keyword position can be found by walking the query string past the WITH (...) clause.

Alternatively, and more robustly: deparse the SelectStmt subtree back into a canonical SQL string. This avoids fragile string parsing and handles edge cases (comments, dollar-quoting, etc.), but PostgreSQL core does not export a query deparser for raw parse trees, so this route means vendoring or reimplementing one.
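To make the string-walking variant concrete, here is a deliberately naive pure-Rust sketch (a hypothetical helper, not part of the codebase). It is ASCII-only and ignores comments, quoted identifiers containing WITH, and dollar-quoting — exactly the edge cases that make deparsing the more robust option.

```rust
// Naive sketch: skip the balanced WITH (...) options clause in the raw
// statement text, take everything after the next AS keyword, and strip a
// trailing WITH [NO] DATA clause. Illustrative only.
fn extract_defining_query(sql: &str) -> Option<String> {
    let upper = sql.to_ascii_uppercase(); // same byte offsets as `sql`
    // Find the options clause and its matching close paren.
    let with_pos = upper.find("WITH")?;
    let open = sql[with_pos..].find('(')? + with_pos;
    let mut close = open;
    let mut depth = 0usize;
    for (i, ch) in sql[open..].char_indices() {
        match ch {
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth == 0 {
                    close = open + i;
                    break;
                }
            }
            _ => {}
        }
    }
    // The defining query starts after the next AS keyword.
    let as_rel = upper[close..].find(" AS ")?;
    let mut query = sql[close + as_rel + 4..].trim().to_string();
    // Strip a trailing WITH [NO] DATA clause, if present.
    for suffix in ["WITH NO DATA", "WITH DATA"] {
        let qu = query.to_ascii_uppercase();
        if let Some(p) = qu.rfind(suffix) {
            if qu[p..].trim_end_matches(';').trim() == suffix {
                query.truncate(p);
            }
        }
    }
    Some(query.trim().trim_end_matches(';').trim().to_string())
}
```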

DROP Interception

When ProcessUtility_hook fires with a DropStmt node where removeType == OBJECT_MATVIEW:

pg_trickle_process_utility()
  ├── Is it DropStmt with OBJECT_MATVIEW?
  │     └── Yes → for each object in the drop list:
  │           ├── Resolve name to OID
  │           ├── Is OID in pgtrickle.pgt_stream_tables?
  │           │     ├── No → pass through (normal matview drop)
  │           │     └── Yes → handle_drop_stream_table_ddl()
  │           │           ├── Call drop_stream_table_impl()
  │           │           └── Set QueryCompletion tag
  │           └── Mixed list? (some STs, some normal matviews)
  │                 └── Split: handle STs ourselves, pass rest through
  └── Not a matview drop → pass through

IF EXISTS handling: If the name doesn’t resolve to an OID and IF EXISTS is set in the DropStmt, emit a NOTICE and continue (matching standard PG behavior).
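The mixed-list split can be sketched in plain Rust (illustrative: `is_stream_table` stands in for the OID lookup against pgtrickle.pgt_stream_tables, and names stand in for resolved relations):

```rust
// Partition the objects named by one DROP MATERIALIZED VIEW statement into
// registered stream tables (handled by pg_trickle) and ordinary matviews
// (passed through to the standard handler).
fn split_drop_list<'a>(
    names: &[&'a str],
    is_stream_table: impl Fn(&str) -> bool,
) -> (Vec<&'a str>, Vec<&'a str>) {
    names.iter().copied().partition(|n| is_stream_table(n))
}
```

In the real hook, the first partition goes to handle_drop_stream_table_ddl() and the second is re-assembled into a DropStmt for the standard handler.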

REFRESH Interception

When ProcessUtility_hook fires with a RefreshMatViewStmt node:

pg_trickle_process_utility()
  ├── Is it RefreshMatViewStmt?
  │     └── Yes → resolve relation name to OID
  │           ├── Is OID in pgtrickle.pgt_stream_tables?
  │           │     ├── No → pass through (normal matview refresh)
  │           │     └── Yes → handle_refresh_stream_table_ddl()
  │           │           ├── Call refresh_stream_table_impl()
  │           │           └── Set QueryCompletion tag
  │           └── OID not found → pass through (let standard handler error)
  └── Not a refresh → pass through

CONCURRENTLY: The RefreshMatViewStmt has a concurrent field. For stream tables, we log a DEBUG1 message that concurrency is implicit and proceed with the normal refresh.

ALTER Interception

ALTER MATERIALIZED VIEW name SET (option = value) is handled as an AlterTableStmt with objtype == OBJECT_MATVIEW and subcommand AT_SetRelOptions. We could intercept this to support:

ALTER MATERIALIZED VIEW order_totals SET (pgtrickle.schedule = '5m');

However, this interception is complex (must parse subcommands, handle mixed option sets) and provides marginal UX benefit since ALTER is less frequent than CREATE. Deferred to a future iteration.

psql Tab-Completion Compatibility

Since stream tables are created via CREATE MATERIALIZED VIEW syntax, users will reasonably expect them in psql’s \dm (list materialized views) output. However, because we intercept creation and create a regular table (not an actual matview), they appear in \dt (regular tables) instead.

Mitigation options:

  1. Actually create a matview — After our custom processing, also create a thin matview entry in pg_class with relkind = 'm'. This is fragile and conflicts with the storage table.

  2. Create a view alias — Create a regular view with the same name that reads from the storage table. The storage table gets an internal name like _pgtrickle_store_<name>.

  3. Accept the \dt listing — Stream tables appear as regular tables in \dt, which is technically accurate (they ARE regular heap tables). The pgtrickle.stream_tables_info view provides the authoritative listing.

  4. Register a custom psql command — Not possible without modifying psql.

Recommendation: Option 3 (accept \dt listing) for the initial implementation. The storage table IS a regular table, and \dt is correct. Add COMMENT ON TABLE with a descriptive marker so users can identify stream tables:

COMMENT ON TABLE schema.name IS 'pgtrickle: stream table (schedule=1m, mode=DIFFERENTIAL)';

pg_dump / pg_restore Strategy

This is the most significant challenge with the hook-based approach.

Problem

pg_dump dumps objects based on their relkind in pg_class. Since stream tables are stored as regular tables (relkind = 'r'), pg_dump will emit CREATE TABLE DDL with the full column list — not the CREATE MATERIALIZED VIEW ... WITH (pgtrickle.stream = true) AS SELECT ... that our hook expects.

The defining query, schedule, and other metadata live in pgtrickle.pgt_stream_tables, which pg_dump dumps as data rows provided the table is registered via pg_extension_config_dump() — pg_dump skips extension member tables otherwise.

Strategy: Companion Dump/Restore Functions

Provide explicit dump and restore functions:

-- Generate SQL to recreate all stream tables
SELECT pgtrickle.generate_dump();

-- Output:
-- SELECT pgtrickle.create_stream_table('order_totals',
--     'SELECT region, SUM(amount) FROM orders GROUP BY region',
--     '1m', 'DIFFERENTIAL', false);
-- SELECT pgtrickle.create_stream_table('daily_stats', ...);

-- Restore: run after pg_restore has created the extension and base tables
-- The function reads pgtrickle.pgt_stream_tables (restored by pg_dump)
-- and recreates the storage tables + CDC + triggers
SELECT pgtrickle.restore_stream_tables();
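As a sketch of what generate_dump() could emit per catalog row (the signature and quoting helper are assumptions, not the actual implementation — real code would read rows via SPI and should use PostgreSQL’s quote_literal):

```rust
// Hypothetical rendering of one pgtrickle.pgt_stream_tables row back into a
// Tier 1 call. Minimal SQL string-literal quoting: double embedded quotes.
fn render_dump_entry(name: &str, query: &str, schedule: &str, mode: &str) -> String {
    let q = |s: &str| format!("'{}'", s.replace('\'', "''"));
    format!(
        "SELECT pgtrickle.create_stream_table({}, {}, {}, {}, false);",
        q(name),
        q(query),
        q(schedule),
        q(mode)
    )
}
```

The final `false` (initialize) matches the dump output above: restored tables are rebuilt empty and refreshed from base data afterward.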

Strategy: Event Trigger on Extension Load

Register an event trigger that fires on CREATE EXTENSION pg_trickle and checks if the catalog tables contain orphaned entries (tables exist in catalog but storage tables are missing). If so, automatically recreate them.

pg_dump --section Behavior

pg_dump section What it dumps pg_trickle impact
pre-data Schemas, extensions, types CREATE EXTENSION pg_trickle → creates catalog tables
data Table contents Catalog table rows (INSERT INTO pgtrickle.pgt_stream_tables) + storage table data
post-data Indexes, triggers, constraints Storage table indexes, CDC triggers (but these are extension-managed)

Recommended restore workflow:

# 1. Restore schema + extension
pg_restore --section=pre-data dump.sql

# 2. Restore data (includes catalog entries + storage table data)
pg_restore --section=data dump.sql

# 3. Rebuild stream tables from catalog (recreates CDC, triggers, indexes)
psql -c "SELECT pgtrickle.restore_stream_tables();"

# 4. Restore remaining post-data
pg_restore --section=post-data dump.sql

Error Handling

Hook errors must be handled carefully to avoid crashing the backend:

  1. Never panic!() in the hook — Use pg_guard + Result<> patterns
  2. Failed interception → pass through — If our hook encounters an error during stream table detection (e.g., SPI failure querying catalog), log a warning and pass through to the standard handler
  3. Stream table creation errors — Report via ereport(ERROR, ...) as usual. The transaction rolls back, leaving no artifacts.
  4. Unknown options — If pgtrickle.stream = true is present but other pgtrickle.* options are unrecognized, raise an error listing valid options
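Rule 4 can be sketched as a pure function (illustrative: option keys are shown without the "pgtrickle." prefix, defaults follow the option table above, and the real hook would read DefElem nodes rather than string pairs):

```rust
// Parse pgtrickle.* options into values-with-defaults, or an error that
// names the valid options. Names and defaults mirror the design above.
#[derive(Debug, PartialEq)]
struct StreamOptions {
    schedule: String,
    mode: String,
}

fn parse_pgtrickle_options(opts: &[(&str, &str)]) -> Result<StreamOptions, String> {
    let mut parsed = StreamOptions {
        schedule: "1m".into(),       // documented default
        mode: "DIFFERENTIAL".into(), // documented default
    };
    for (name, value) in opts {
        match *name {
            "stream" => {} // presence already verified by the dispatcher
            "schedule" => parsed.schedule = value.to_string(),
            "mode" => parsed.mode = value.to_string(),
            other => {
                return Err(format!(
                    "unrecognized pgtrickle option \"{}\"; valid options are \
                     pgtrickle.stream, pgtrickle.schedule, pgtrickle.mode",
                    other
                ));
            }
        }
    }
    Ok(parsed)
}
```

In the hook this error would be surfaced via ereport(ERROR, ...), rolling back the transaction as described in point 3.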

Detailed Design — Tier 1.5: CALL Procedure Syntax

Implementation

PostgreSQL 11+ supports CALL procedure(...) for procedures. Currently pg_trickle exposes functions (callable via SELECT). To support CALL syntax:

Option A: Wrapper Procedures (SQL layer)

CREATE PROCEDURE pgtrickle.create_stream_table(
    name          text,
    query         text,
    schedule      text DEFAULT '1m',
    refresh_mode  text DEFAULT 'DIFFERENTIAL',
    initialize    bool DEFAULT true
)
LANGUAGE sql
AS $$
    SELECT pgtrickle.create_stream_table(name, query, schedule, refresh_mode, initialize);
$$;

This approach is straightforward but creates naming conflicts (function and procedure with same name and signature). PostgreSQL resolves this based on context (SELECT calls the function, CALL calls the procedure), but some tools may be confused.

Option B: Distinct Procedure Names

CALL pgtrickle.stream_table_create('order_totals', 'SELECT ...', '1m');
CALL pgtrickle.stream_table_drop('order_totals');
CALL pgtrickle.stream_table_refresh('order_totals');

Option C: pgrx #[pg_extern] with Procedure Support

pgrx 0.17.x does not yet natively generate CREATE PROCEDURE. The wrapper approach (Option A) via extension_sql!() is the most practical.

Recommendation: Option A — wrapper procedures with the same name. Add via extension_sql!() in lib.rs. This preserves the existing function API and adds CALL as an alias.

Effort

~30 lines of SQL in extension_sql!(). Minimal Rust changes.


Implementation Phases

Phase 1: Tier 1.5 — CALL Syntax (1–2 days)

Scope: Add CREATE PROCEDURE wrappers for create_stream_table, drop_stream_table, refresh_stream_table, and alter_stream_table.

Files changed: - src/lib.rs — New extension_sql!() block with procedure definitions

Tests: - E2E test: create/refresh/drop via CALL syntax - Verify SELECT and CALL both work for same operations

Phase 2: Hook Infrastructure (3–5 days)

Scope: Register ProcessUtility_hook in _PG_init(), implement the dispatch logic, and handle passthrough for non-pgtrickle DDL.

Files changed: - src/hooks.rs — Add register_process_utility_hook(), pg_trickle_process_utility() dispatch function, hook chaining - src/lib.rs — Call hooks::register_process_utility_hook() in _PG_init()

Tests: - Verify hook registration doesn’t break existing DDL - Verify non-pgtrickle matviews still work - Verify other extension hooks still chain correctly

Phase 3: CREATE Interception (5–7 days)

Scope: Intercept CREATE MATERIALIZED VIEW ... WITH (pgtrickle.stream = true) and route to create_stream_table_impl().

Key tasks: 1. Parse CreateTableAsStmt node to extract reloptions 2. Detect pgtrickle.stream = true in the option list 3. Extract schema, name, query text from the parse tree 4. Extract pgtrickle.schedule and pgtrickle.mode options 5. Determine WITH DATA / WITH NO DATA 6. Call create_stream_table_impl() with extracted parameters 7. Set QueryCompletion tag to "CREATE MATERIALIZED VIEW" 8. Add COMMENT ON TABLE with stream table marker

Files changed: - src/hooks.rs — handle_create_stream_table_ddl() - src/api.rs — Ensure create_stream_table_impl() is pub(crate)

Tests: - E2E: Create stream table via matview syntax - E2E: Verify options parsing (schedule, mode, with data/no data) - E2E: Verify normal matviews are unaffected - E2E: Verify error handling (invalid options, bad queries) - E2E: Verify the resulting table matches function-API-created tables

Phase 4: DROP + REFRESH Interception (3–4 days)

Scope: Intercept DROP MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW for registered stream tables.

Files changed: - src/hooks.rs — handle_drop_stream_table_ddl(), handle_refresh_stream_table_ddl()

Tests: - E2E: Drop stream table via DROP MATERIALIZED VIEW - E2E: Refresh via REFRESH MATERIALIZED VIEW - E2E: IF EXISTS handling - E2E: CASCADE handling - E2E: Verify normal matviews are unaffected

Phase 5: pg_dump / Restore Support (In Progress)

Status: In progress. generate_dump() and restore_stream_tables() are fully implemented; the event-trigger behavior is next.

Scope: Implement pgtrickle.generate_dump() and pgtrickle.restore_stream_tables() functions.

Files changed: - src/api.rs — New functions generate_dump(), restore_stream_tables()

Tests: - Integration: Round-trip dump → restore → verify stream tables work - E2E: Verify restore from pg_dump output

Phase 6: Documentation + Polish (2–3 days)

Scope: Update SQL_REFERENCE.md, GETTING_STARTED.md, FAQ.md. Add migration guide for existing users.

Files changed: - docs/SQL_REFERENCE.md — Document Tier 2 syntax - docs/GETTING_STARTED.md — Show native syntax in examples - docs/FAQ.md — Add “Why CREATE MATERIALIZED VIEW?” Q&A - CHANGELOG.md — Feature announcement

Total estimated effort: 17–25 days


Module Layout Changes

src/
├── hooks.rs          # Extended: ProcessUtility_hook registration + dispatch
│   ├── (existing)    # DDL event triggers (_on_ddl_end, _on_sql_drop)
│   ├── NEW           # register_process_utility_hook()
│   ├── NEW           # pg_trickle_process_utility() — main dispatch
│   ├── NEW           # handle_create_stream_table_ddl()
│   ├── NEW           # handle_drop_stream_table_ddl()
│   ├── NEW           # handle_refresh_stream_table_ddl()
│   └── NEW           # extract_pgtrickle_options() — reloption parser
├── api.rs            # Unchanged (impl functions already exist)
├── lib.rs            # Extended: hook registration in _PG_init(), CALL procs
└── (all other files unchanged)

ADR: Native Syntax Approach

ADR-012: Native DDL Syntax via ProcessUtility_hook

Field Value
Status Proposed
Date 2026-02-25
Deciders pg_trickle core team
Category API & Schema Design

Context

Users expect DDL syntax (CREATE, DROP, ALTER) for persistent database objects. The function-call API works but feels foreign to PostgreSQL users accustomed to standard DDL. Additionally, embedding the defining query as a string literal loses IDE support.

Decision

Implement a tiered syntax strategy: - Tier 1 (existing): Function API — always available, no special requirements - Tier 1.5 (new): CALL procedure wrappers — trivial addition - Tier 2 (new): CREATE MATERIALIZED VIEW ... WITH (pgtrickle.stream = true) via ProcessUtility_hook — native-feeling DDL when shared_preload_libraries is configured

Options Considered

Option Verdict Reason
Parser fork Rejected Requires PostgreSQL fork, not an extension
Table AM Rejected 60+ callbacks, column derivation unsupported
FDW Rejected No indexes, no triggers, wrong semantics
ProcessUtility_hook + matview Selected Proven by TimescaleDB, native feel
COMMENT abuse Rejected Fragile, poor UX

Consequences

Positive: - Users get native-feeling DDL syntax - Defining query is SQL (not a string literal) - Follows established pattern (TimescaleDB continuous aggregates) - No additional deployment burden (shared_preload_libraries already required)

Negative: - ~1,200–1,800 lines of new hook code - Must track ProcessUtility_hook signature across PG versions - pg_dump requires custom companion functions - Stream tables appear as regular tables in \dt, not \dm - Potential confusion: users may expect full matview semantics (e.g., REFRESH ... CONCURRENTLY with different behavior)


Testing Strategy

Unit Tests (src/hooks.rs #[cfg(test)])

  • Option parsing: extract pgtrickle.stream, pgtrickle.schedule, pgtrickle.mode from DefElem lists
  • Schema/name parsing from RangeVar
  • WITH DATA / WITH NO DATA detection

E2E Tests (tests/e2e_native_syntax_tests.rs)

Test Description
test_create_via_matview_syntax Basic CREATE MATERIALIZED VIEW with pgtrickle.stream
test_create_with_all_options schedule + mode + WITH NO DATA
test_create_default_options Only pgtrickle.stream = true, all defaults
test_create_schema_qualified CREATE MATERIALIZED VIEW myschema.my_st ...
test_create_if_not_exists IF NOT EXISTS handling
test_drop_via_matview_syntax DROP MATERIALIZED VIEW on a stream table
test_drop_if_exists DROP MATERIALIZED VIEW IF EXISTS
test_drop_cascade CASCADE propagation
test_refresh_via_matview_syntax REFRESH MATERIALIZED VIEW on a stream table
test_refresh_concurrently_ignored CONCURRENTLY keyword logged but ignored
test_normal_matview_passthrough Regular matview without pgtrickle.stream works normally
test_normal_drop_passthrough DROP on non-stream matview works normally
test_mixed_create_function_and_ddl Create via function, drop via DDL and vice versa
test_invalid_pgtrickle_options Unknown option → error
test_call_syntax CALL pgtrickle.create_stream_table(…)
test_hook_chaining Verify pg_stat_statements still works alongside hook
test_dump_restore_roundtrip generate_dump() → restore → verify

Property Tests

  • For any stream table created via Tier 1 (function), the resulting catalog entries and storage table should be identical to one created via Tier 2 (DDL) with the same parameters.

Risks and Mitigations

Risk Likelihood Impact Mitigation
ProcessUtility_hook signature changes in PG 19+ Medium Medium Conditional compilation with #[cfg(any(...))] for PG version. Hook signature has been stable since PG 15.
Other extension hooks conflict Low High Always chain prev_ProcessUtility_hook. Test with pg_stat_statements. Document hook ordering requirements.
pg_dump produces incorrect restore SQL High Medium Provide pgtrickle.generate_dump() and pgtrickle.restore_stream_tables(). Document recommended backup workflow.
User confusion: “why is my matview in \dt not \dm?” Medium Low Documentation, COMMENT ON TABLE marker, FAQ entry.
Hook code introduces crashes Low High Extensive #[pg_guard] usage, never unwrap() in hook path, graceful passthrough on errors.
Deparse of defining query loses fidelity Medium Medium Use raw query string extraction (approach 2) rather than deparsing from parse tree. Preserve user’s original SQL.
Extension load order matters Low Medium Document that pg_trickle should be listed after pg_stat_statements in shared_preload_libraries (or before — test both).

Comparison with Prior Art

Feature pg_trickle (proposed) TimescaleDB pg_ivm Citus
Creation syntax CREATE MATVIEW WITH (pgtrickle.stream) CREATE MATVIEW WITH (timescaledb.continuous) SELECT create_immv(...) SELECT create_distributed_table(...)
Function API Yes (primary) Yes (policies) Yes (primary) Yes (primary)
ProcessUtility_hook Yes Yes (extensive) No Yes (extensive)
pg_dump support Custom functions Built-in (complex) No Built-in
Storage type Regular table Hypertable Regular table Distributed table
REFRESH integration REFRESH MATERIALIZED VIEW REFRESH MATERIALIZED VIEW SELECT refresh_immv(...) N/A
DROP integration DROP MATERIALIZED VIEW DROP MATERIALIZED VIEW DROP TABLE SELECT undistribute_table(...)

References

  1. Custom SQL Syntax Research Report — Comprehensive analysis of all PostgreSQL extension syntax mechanisms
  2. TimescaleDB process_utility.c — github.com/timescale/timescaledb
  3. Citus multi_utility.c — github.com/citusdata/citus
  4. pg_ivm createas.c — github.com/sraoss/pg_ivm
  5. PostgreSQL ProcessUtility_hook — src/backend/tcop/utility.c in PostgreSQL source
  6. PostgreSQL Table AM API — postgresql.org/docs/18/tableam.html
  7. pgrx Hooks — pgrx documentation on ProcessUtility_hook in Rust
  8. ADR Records — ADR-001 (Triggers), ADR-003 (DVM Engine)