Extensions
- pg_igraph 1.1.0
- High-performance graph traversal engine with JSON parameters and multi-graph support
Documentation
- CHANGELOG
- Changelog
- README_PGXN
- pg_igraph - High-Performance Graph Traversal Engine
README
Contents
- pg_igraph π
- π₯ Performance That Speaks
- β‘ Quick Start
- π― Why pg_igraph?
- π Version 1.1 Features
- ποΈ Architecture Highlights
- π― Perfect For
- π Benchmarks
- π§ API Reference
- π Installation
- βοΈ Configuration
- π External Integration
- π€ Contributing
- π Documentation
- ποΈ How It Works
- π License
- π Support the Project
pg_igraph π
High-Performance Graph Traversal Engine for PostgreSQL
Transform your PostgreSQL into a high-performance graph database. No external systems needed.
π₯ Performance That Speaks
| Operation | Dataset | pg_igraph | Recursive CTE | Improvement |
|---|---|---|---|---|
| BFS Traversal | 335K nodes | 227ms | 47,000ms | π 207x |
| Shortest Path | 10K nodes | 49ms | 8,500ms | π 173x |
| Multi-hop Query | 50K nodes | 156ms | 12,000ms | π 77x |
β‘ Quick Start
# Install
git clone https://github.com/ineron/pg_igraph
cd pg_igraph && make && sudo make install
# Enable
psql -c "CREATE EXTENSION pg_igraph;"
-- Find friends within 3 degrees
SELECT igraph_query('
MATCH (u:User)-[:FOLLOWS*1..3]->(friend)
WHERE u.id = 42
RETURN friend
');
-- Shortest path between users
SELECT igraph_query('PATH FROM 100 TO 500 VIA FOLLOWS');
-- π NEW in v1.1: Multiple graphs with table prefixes
SELECT graph_add_node('User', 'social_network');
-- π NEW in v1.1: JSON parameters with enhanced WHERE clauses
SELECT igraph_query('network',
'MATCH (u:User)-[:FOLLOWS]->(f)
WHERE f.influence > &data.threshold AND u.id != &data.exclude_id
RETURN f.name',
'{"data":{"threshold":100, "exclude_id":42}}'
);
-- π NEW in v1.1: All comparison operators supported
SELECT igraph_query('',
'MATCH (source:User)-[:follows]->(target:User)
WHERE source.id >= &data.min_id AND source.id <= &data.max_id
RETURN source, target',
'{"data":{"min_id":1, "max_id":100}}'
);
π― Why pg_igraph?
β Native PostgreSQL Integration
- No external graph databases
- Leverage existing PostgreSQL infrastructure
- ACID compliance and backup strategies work
β Adaptive Performance
- Automatically switches between SPI and C hash maps
- Smart thresholds based on graph topology
- Zero SQL overhead for large traversals
β Rich Query Language
- Cypher-like syntax integrated into SQL
- Full AST with flex/bison parser
- Support for complex pattern matching
β External System Integration
- REF system for external object references
- Custom resolver functions for business logic
- Seamless integration with existing applications
π Version 1.1 Features
Enhanced WHERE Clauses
- JSON Parameters:
&data.fieldsyntax for dynamic queries - All Comparison Operators:
=,>,<,>=,<=,!= - Flexible Filtering: No longer limited to
WHERE src.id = X
Clean API Response
- Pure Data: Returns arrays
[{...}]or empty objects{} - No Status Fields: Removed unnecessary
"status": "ok" - Library-Ready: Perfect for integration into larger systems
Multi-Graph Support
- Table Prefixes: Support multiple graph instances
- Schema Separation:
"schema.prefix_"format - Independent Graphs: Isolated data and operations
ποΈ Architecture Highlights
- Dual-Direction Storage: Forward and reverse edges for optimal access
- Hash Partitioning: 8-64 partitions for parallel processing
- Covering Indexes: Three indexes per partition for maximum performance
- Adaptive Algorithms: Phase 1 (SPI) β Phase 2 (C hash maps)
π― Perfect For
- π’ Social Networks: Friend recommendations, influence analysis
- π E-commerce: Product recommendations, customer journeys
- π¨ Enterprise: Org charts, workflow dependencies
- π Geographic: Route optimization, network topology
- π Security: Access control, fraud detection
π Benchmarks
# Run comprehensive benchmarks
./benchmark.sh --scale large
See detailed benchmark results for more performance data.
π§ API Reference
Core Functions
-- Graph management
graph_add_node(label, table_prefix DEFAULT '') β BIGINT
graph_add_edge(from_id, to_id, relationship, table_prefix DEFAULT '') β VOID
-- Traversal
graph_traverse(start, rel, direction, max_depth) β SETOF BIGINT
graph_shortest_path(start, end, rel) β BIGINT[]
-- Properties
graph_set_property(node_id, prop_name, type, value, table_prefix DEFAULT '') β VOID
graph_get_property(node_id, prop_name, table_prefix DEFAULT '') β TEXT
-- External integration
graph_resolve_ref(uuid, type, resolver_func) β JSONB
Query Language
-- Pattern matching with JSON parameters
MATCH (n:Label)-[:REL*1..5]->(m)
WHERE n.prop > &data.threshold AND m.id != &data.exclude
RETURN m
-- Path finding
PATH FROM &data.start_id TO &data.end_id VIA &data.relationship
-- Node creation with references
CREATE (n:Type REF External = &data.external_uuid)
π Installation
Prerequisites
# Install PostgreSQL development headers
sudo yum install postgresql14-devel
# Install flex and bison for query parser
sudo yum install flex bison
# Install pg_ilib dependency
git clone https://github.com/ineron/pg_ilib
cd pg_ilib && make && sudo make install
Build and Install
git clone https://github.com/ineron/pg_igraph
cd pg_igraph
make
sudo make install
Database Setup
-- Enable extensions (order matters!)
CREATE EXTENSION pg_ilib; -- Required dependency
CREATE EXTENSION pg_igraph;
-- Initialize graph schema
./init_graph.sh
βοΈ Configuration
Partitioning Strategy
# .env configuration
GRAPH_PARTITIONS=16 # Recommended for 10M-100M edges
# Scaling guidelines:
# 8 partitions: up to 10M edges
# 16 partitions: 10M-100M edges
# 32 partitions: 100M-500M edges
# 64 partitions: 500M+ edges
Performance Tuning
# Recommended PostgreSQL settings
shared_buffers = 4GB
effective_cache_size = 12GB
random_page_cost = 1.1 # For SSD storage
work_mem = 256MB
π External Integration
REF System
-- Create nodes with external references
CREATE (order:Order REF User = &data.user_uuid);
-- Resolve references through external systems
SELECT graph_resolve_ref(uuid, 'User', 'external_resolver_function');
Custom Resolvers
-- Custom resolver function for CRM integration
CREATE FUNCTION crm_user_resolver(
ref_data BYTEA,
fields TEXT[] DEFAULT NULL
) RETURNS JSONB
AS $$
-- Your business logic here
-- Unpack ref_data, query external systems, return enriched data
$$;
π€ Contributing
We welcome contributions! See docs/CONTRIBUTING.md for guidelines.
Development Setup
git clone https://github.com/ineron/pg_igraph
cd pg_igraph
make clean && make
./install.sh
π Documentation
- Installation Guide
- Performance Tuning
- Query Language Reference
- External Integration Guide
- API Documentation
- Version History
ποΈ How It Works
Adaptive Execution Strategy
pg_igraph automatically switches between execution modes: - Phase 1: SPI-based execution for small frontiers - Phase 2: In-memory C hash maps for large-scale traversals - Smart Thresholds: Automatically optimizes based on frontier size (default: 200 nodes)
Storage Architecture
- Dual-direction edges: Both forward and reverse stored explicitly
- Hash-partitioned tables: 8/16/32/64 partitions for parallel access
- Covering Indexes: Three indexes per partition for optimal patterns
Why Not Recursive CTEs?
PostgreSQLβs recursive executor materializes intermediate results and cannot maintain a visited set across iterations. On a 335K-node tree, recursive CTEs take 47 seconds vs pg_igraphβs 227ms.
π License
MIT License - see LICENSE file for details.
π Support the Project
If pg_igraph helps your project, please: - β Star this repository - π Report issues and bugs - π Contribute to documentation - π‘ Share use cases and success stories