explanation 0.3.0
This extension adds a new function, explanation(), to your database. Pass it
a string that executes a query and the function runs EXPLAIN on the query
and returns the results as a table. Each node in the plan is represented by a
single row, and child nodes refer to the unique identifier of their parents.
The results, that is, are organized into a proximity tree.
Synopsis
Plan a simple query:
SELECT node_type, strategy, actual_startup_time, actual_total_time
  FROM explanation(
       query    := $$ SELECT * FROM pg_class WHERE relname = 'users' $$,
       analyzed := true
  );
Output:
 node_type  │ strategy │ actual_startup_time │ actual_total_time 
────────────┼──────────┼─────────────────────┼───────────────────
 Index Scan │          │ 00:00:00.000017     │ 00:00:00.000017
Usage
To use the explanation() function, simply pass a string you'd like to have
EXPLAINed:
SELECT * FROM explanation(:query);
If you'd like the output of EXPLAIN ANALYZE, pass true as the second
argument:
SELECT * FROM explanation(:query, true);
Or via the analyzed parameter:
SELECT * FROM explanation(query := :query, anayzed := true);
The function returns a relation with each node of the plan as a single row.
The first row will be the outermost node, and any other rows represent the
child nodes. The structure of the relation is the same as this CREATE TABLE
statement, which you can use to actually insert values:
CREATE TABLE plans (
    planned_at              TIMESTAMPTZ,
    node_id                 TEXT PRIMARY KEY,
    parent_id               TEXT REFERENCES plans(node_id),
    node_type               TEXT NOT NULL,
    total_runtime           INTERVAL,
    strategy                TEXT,
    operation               TEXT,
    startup_cost            FLOAT,
    total_cost              FLOAT,
    plan_rows               FLOAT,
    plan_width              INTEGER,
    actual_startup_time     INTERVAL,
    actual_total_time       INTERVAL,
    actual_rows             FLOAT,
    actual_loops            FLOAT,
    parent_relationship     TEXT,
    sort_key                TEXT[],
    sort_method             TEXT[],
    sort_space_used         BIGINT,
    sort_space_type         TEXT,
    join_type               TEXT,
    join_filter             TEXT,
    hash_cond               TEXT,
    relation_name           TEXT,
    alias                   TEXT,
    scan_direction          TEXT,
    index_name              TEXT,
    index_cond              TEXT,
    recheck_cond            TEXT,
    tid_cond                TEXT,
    merge_cond              TEXT,
    subplan_name            TEXT,
    function_name           TEXT,
    function_call           TEXT,
    filter                  TEXT,
    one_time_filter         TEXT,
    command                 TEXT,
    shared_hit_blocks       BIGINT,
    shared_read_blocks      BIGINT,
    shared_written_blocks   BIGINT,
    local_hit_blocks        BIGINT,
    local_read_blocks       BIGINT,
    local_written_blocks    BIGINT,
    temp_read_blocks        BIGINT,
    temp_written_blocks     BIGINT,
    output                  TEXT[],
    hash_buckets            BIGINT,
    hash_batches            BIGINT,
    original_hash_batches   BIGINT,
    peak_memory_usage       BIGINT,
    schema                  TEXT,
    cte_name                TEXT,       
    triggers                trigger_plan[]
);
Insert values like so:
INSERT INTO plans SELECT * FROM explanation(
    query    := $$ SELECT * FROM pg_class WHERE relname = 'users' $$,
    analyzed := true
);
Some notes on the columns:
- The - planned_atcolumn is just- NOW(). Convenient for when the output is stored in a table and you'd like to refer back to earlier plans when comparing changes to queries over time.
- The - node_idcolumn contains an MD5 hash created just before a node is parsed, from the concatenation of the server PID and the current time:- md5( pg_backend_pid() || clock_timestamp() )- As such it should be adequately unique on a single server. The - parent_idwill be- NULLfor the outer plan. For example, here's the output of the first three columns of a query with nine plan nodes:- node_id │ parent_id │ node_type ─────────────────────────────────┼──────────────────────────────────┼──────────────── 029dde3a3c872f0c960f03d2ecfaf5ee | | Aggregate 3e4c4968cee7653037613c234a953be1 | 029dde3a3c872f0c960f03d2ecfaf5ee | Sort dd3d1b1fb6c70be827075e01b306250c | 3e4c4968cee7653037613c234a953be1 | Nested Loop 037a8fe70739ed1be6a3006d0ab80c82 | dd3d1b1fb6c70be827075e01b306250c | Hash Join 2c4e922dc19ce9f01a3bf08fbd76b041 | 037a8fe70739ed1be6a3006d0ab80c82 | Seq Scan 709b2febd8e560dd8830f4c7277c3758 | 037a8fe70739ed1be6a3006d0ab80c82 | Hash 9dd89be09ea07a1000a21cbfc09121c7 | 709b2febd8e560dd8830f4c7277c3758 | Seq Scan 8dc3d35ab978f6c6e46f7927e7b86d21 | dd3d1b1fb6c70be827075e01b306250c | Index Scan 3d7c72f13ae7571da70f434b5bc9e0af | 029dde3a3c872f0c960f03d2ecfaf5ee | Function Scan
- The - total_runtimecolumn sums the runtime of the entire query.
- The - node_typecolumn may have one of the following values (read from- src/backend/commands/explain.c), although other values may be added by plugins and the like:- Aggregate
- Append
- Bitmap Heap Scan
- Bitmap Index Scan
- BitmapAnd
- BitmapOr
- CTE Scan
- Foreign Scan
- Function Scan
- Group
- Hash
- Hash Join
- Index Scan
- Limit
- LockRows
- Materialize
- Merge Append
- Merge Join
- ModifyTable
- Nested Loop
- Recursive Union
- Result
- Seq Scan
- SetOp
- Sort
- Subquery Scan
- Tid Scan
- Unique
- Values Scan
- WindowAgg
- WorkTable Scan
 
- The - triggerscolumn also applies only to the outer-most plan, and provides an array of- trigger_planrecords for the that were called. The columns of the composite- trigger_plantype are:- trigger_nameTEXT
- constraint_nameTEXT
- relationTEXT
- timeINTERVAL
- callsFLOAT
 - You can turn them into a full table expression by selecting them from the - planstable described above like so:- SELECT (a.b).trigger_name, (a.b).relation, (a.b).relation, (a.b).time, (a.b).calls FROM (SELECT unnest(triggers) FROM plans) AS a(b);
All other columns are derived directly from the XML output of EXPLAIN.
Please see "Using
EXPLAIN"
for further reading on using EXPLAIN.
Specifying Columns
The column values are created by executing xpath() queries against the XML
EXPLAIN format. There's a lot of data, so for big queries with lots of
nodes, all those calculations can be quite expensive. For ad hoc analyses this
isn't a big deal, and for slow queries most of the overhead is likely to be
taken up if you analyze. However, if you need to process a lot of queries with
this function, and you don't need all of the data, tell it the data you do
want by passing an array listing the columns you're interested in, like so:
SELECT node_type, strategy, actual_startup_time, actual_total_time
  FROM explanation(
       query    := $$ SELECT * FROM pg_class WHERE relname = 'users' $$,
       analyzed := true,
       columns  := ARRAY['node_type', 'total_runtime', 'strategy', 'total_cost']
  );
With this execution, only the node_id (which is always calculated),
node_type, total_runtime, strategy, and total_cost columns will
contain values. All others will be NULL.
Example
Say you had a table full of queries extracted from a query log, and you'd like to analyze the sequence, index, function, and tid scans executed against a set of partitions. You might do something like this to generate that data:
CREATE TABLE partition_query_stats (
    statement        TEXT     NOT NULL,
    runtime          INTERVAL NOT NULL,
    index_scan_count INT      NOT NULL DEFAULT 0,
    seq_scan_count   INT      NOT NULL DEFAULT 0,
    scan_time        INTERVAL NOT NULL DEFAULT '0 secs'
);
CREATE OR REPLACE FUNCTION analyze_partition_queries(
    partition_regex TEXT
) RETURNS VOID LANGUAGE plpgsql AS $$
DECLARE
    query TEXT;
BEGIN
    FOR query in SELECT query FROM logged_queries LOOP
        INSERT INTO partition_query_stats (
               statement, runtime, index_scan_count, seq_scan_count,
              scan_time
        )
        SELECT query,
               MAX(total_runtime),
               COUNT( CASE WHEN node_type ~* '.*Index Scan' THEN 1 ELSE NULL END ),
               COUNT( CASE WHEN node_type IN ('Seq Scan', 'Function Scan', 'Tid Scan') THEN 1 ELSE NULL END ),
               SUM(actual_total_time )
         FROM explanation(
              query    := query,
              analyzed := TRUE, 
              columns  := ARRAY['total_runtime', 'node_type', 'actual_total_time', 'relation_name']
         )
        WHERE relation_name ~* partition_regex
          AND (node_type LIKE '% Scan' OR node_type = 'Append')
        GROUP BY statement;
    END LOOP;
END;
$$;
SELECT analyze_partition_queries('at_call_log_.+');
The scan data would then be in the query_partition_stats table for further
examination and analysis.
Author
David E. Wheeler, PostgreSQL Experts, Inc..
Copyright and License
Copyright (c) 2010-2011, Marchex.
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the Marchex nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
     