Contents
Foreign Table Sources
This tutorial shows how to use a postgres_fdw foreign table as a source for
a stream table. Foreign tables let you aggregate data from remote PostgreSQL
databases into a local stream table that refreshes automatically.
Background
PostgreSQL’s Foreign Data Wrapper
(postgres_fdw) lets you define tables that transparently query a remote
database. pg_trickle can use these foreign tables as stream table sources,
but with different change-detection semantics than regular tables.
Key difference: Foreign tables cannot use trigger-based or WAL-based CDC. Changes are detected either by re-scanning the entire remote table (FULL refresh) or by comparing a local snapshot to the remote data (polling-based CDC).
Step 1 — Set Up the Foreign Server
-- Enable the foreign data wrapper extension
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
-- Create a connection to the remote database
CREATE SERVER warehouse_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'warehouse.example.com', dbname 'analytics', port '5432');
-- Map the current user to a remote user
CREATE USER MAPPING FOR CURRENT_USER
SERVER warehouse_db
OPTIONS (user 'readonly_user', password 'secret');
Step 2 — Define the Foreign Table
CREATE FOREIGN TABLE remote_orders (
id INT,
customer_id INT,
amount NUMERIC(12,2),
region TEXT,
created_at TIMESTAMP
) SERVER warehouse_db
OPTIONS (schema_name 'public', table_name 'orders');
Alternatively, import an entire remote schema:
IMPORT FOREIGN SCHEMA public
LIMIT TO (orders, customers)
FROM SERVER warehouse_db
INTO public;
Step 3 — Create a Stream Table with FULL Refresh
The simplest approach uses FULL refresh mode — pg_trickle re-executes the query against the remote table on every refresh cycle:
SELECT pgtrickle.create_stream_table(
name => 'orders_by_region',
query => $$
SELECT
region,
COUNT(*) AS order_count,
SUM(amount) AS total_revenue,
AVG(amount) AS avg_order_value
FROM remote_orders
GROUP BY region
$$,
schedule => '5m',
refresh_mode => 'FULL'
);
pg_trickle will emit an informational message:
INFO: pg_trickle: source table remote_orders is a foreign table.
Foreign tables cannot use trigger-based or WAL-based CDC —
only FULL refresh mode or polling-based change detection is supported.
How FULL refresh works with foreign tables:
- Every 5 minutes, pg_trickle executes the defining query.
- The query is sent to the remote database via
postgres_fdw. - The complete result set replaces the stream table contents.
- This is equivalent to a
MATERIALIZED VIEWrefresh, but automated.
Step 4 — Polling-Based CDC (Optional)
If the remote table is large and changes are small, FULL refresh becomes expensive because it transfers the entire result set every cycle. Polling-based CDC provides a more efficient alternative:
-- Enable polling globally (or per-session)
SET pg_trickle.foreign_table_polling = on;
-- Now create with DIFFERENTIAL mode — pg_trickle will use polling
SELECT pgtrickle.create_stream_table(
name => 'orders_by_region_polling',
query => $$
SELECT
region,
COUNT(*) AS order_count,
SUM(amount) AS total_revenue,
AVG(amount) AS avg_order_value
FROM remote_orders
GROUP BY region
$$,
schedule => '5m',
refresh_mode => 'FULL'
);
How polling works:
- On the first refresh, pg_trickle creates a local snapshot table that mirrors the remote table’s data.
- On subsequent refreshes, it fetches the current remote data and computes
an
EXCEPT ALLdifference against the snapshot. - Only the changed rows are written to the change buffer and processed through the incremental delta pipeline.
- The snapshot table is updated to reflect the new remote state.
- When the stream table is dropped, the snapshot table is cleaned up.
Trade-offs:
| Aspect | FULL Refresh | Polling CDC |
|---|---|---|
| Network transfer | Full result set every cycle | Full remote scan, but only diffs applied |
| Local storage | Stream table only | Stream table + snapshot table |
| Best for | Small remote tables | Large remote tables with small change rates |
| GUC required | No | pg_trickle.foreign_table_polling = on |
Step 5 — Verify and Monitor
-- Check stream table status
SELECT * FROM pgtrickle.pgt_status('orders_by_region');
-- Check CDC health (will show foreign table constraints)
SELECT * FROM pgtrickle.check_cdc_health();
-- View refresh history
SELECT * FROM pgtrickle.get_refresh_history('orders_by_region', 5);
-- Monitor staleness
SELECT * FROM pgtrickle.get_staleness('orders_by_region');
Worked Example — Remote Inventory Dashboard
This example aggregates inventory data from a remote warehouse database into a local dashboard table:
-- Remote table definition
CREATE FOREIGN TABLE remote_inventory (
sku TEXT,
warehouse TEXT,
quantity INT,
updated_at TIMESTAMP
) SERVER warehouse_db
OPTIONS (schema_name 'inventory', table_name 'stock_levels');
-- Dashboard: inventory summary by warehouse
SELECT pgtrickle.create_stream_table(
name => 'inventory_dashboard',
query => $$
SELECT
warehouse,
COUNT(DISTINCT sku) AS unique_products,
SUM(quantity) AS total_units,
MIN(updated_at) AS oldest_update,
MAX(updated_at) AS newest_update
FROM remote_inventory
GROUP BY warehouse
$$,
schedule => '10m',
refresh_mode => 'FULL'
);
After the first refresh:
SELECT * FROM inventory_dashboard;
warehouse | unique_products | total_units | oldest_update | newest_update
-----------+-----------------+-------------+---------------------+---------------------
east | 142 | 23500 | 2026-03-14 08:00:00 | 2026-03-14 09:15:00
west | 98 | 15200 | 2026-03-14 07:30:00 | 2026-03-14 09:10:00
central | 215 | 41000 | 2026-03-14 06:00:00 | 2026-03-14 09:20:00
Constraints and Caveats
| Constraint | Details |
|---|---|
| No trigger CDC | Foreign tables don’t support PostgreSQL row-level triggers. |
| No WAL CDC | Foreign tables don’t generate local WAL entries. |
| Network latency | Each refresh cycle queries the remote database. Schedule accordingly. |
| Remote availability | If the remote database is down, the refresh will fail (logged in pgt_refresh_history). The stream table retains its last successful data. |
| Authentication | CREATE USER MAPPING credentials must remain valid. Use .pgpass or environment variables in production. |
| Snapshot storage | Polling CDC creates a snapshot table sized proportionally to the remote table. Monitor disk usage. |
FAQ
Q: Why does my foreign table stream table only work in FULL mode?
Foreign tables cannot install row-level triggers (the mechanism pg_trickle uses
for trigger-based CDC) and don’t generate local WAL records (used by WAL-based
CDC). FULL refresh works because it simply re-executes the remote query.
Enable pg_trickle.foreign_table_polling if you need differential-style
change detection.
Q: Can I mix foreign and local tables in the same defining query?
Yes. If your query joins a foreign table with a local table, pg_trickle uses trigger/WAL CDC for the local table and FULL-rescan or polling for the foreign table. The refresh mode must be FULL unless polling is enabled for the foreign table sources.
Q: What happens if the remote database is temporarily unavailable?
The refresh attempt fails, is logged in pgt_refresh_history with status
FAILED, and the consecutive_errors counter increments. The stream table
retains its last successful data. When the remote database recovers, the next
scheduled refresh succeeds and the error counter resets.