Contents
pg_qualstats
This software is EXPERIMENTAL and therefore NOT production ready. Use at your own risk.
pg_qualstats is a PostgreSQL extension keeping statistics on predicates found
in WHERE
statements and JOIN
clauses.
Most of the code is a blatant rip-off of pg_stat_statements.
The extension works by looking for known patterns in queries. Currently, this includes:
Binary OpExpr where at least one side is a column from a table. Whenever possible, the predicate will be swaped so that CONST OP VAR expressions are turned into VAR COMMUTED_OP CONST. AND and OR expression members are counted as separate entries. Ex: WHERE column1 = 2, WHERE column1 = column2, WHERE 3 = column3
ScalarArrayOpExpr where the left side is a VAR, and the right side is an array constant. Those will be counted one time per element in the array. Ex: WHERE column1 IN (2, 3) will be counted as 2 occurences for the (column1, '=') operator pair
This project is sponsored by Dalibo
Installation
- Compatible with PostgreSQL 9.3 and 9.4
- Needs postgresql header files
- sudo make install
- Add pg_qualstats to the shared preload libraries:
shared_preload_libraries = 'pg_qualstats'
Configuration
The following GUCs can be configured, in postgresql.conf:
- pg_qualstats.max: the maximum number of statements tracked (defaults to 1000)
Usage
- Create the extension in any database:
CREATE EXTENSION pg_qualstats;
Functions
The extension defines the following functions:
pg_qualstats: returns the counts for every qualifier, identified by the expression hash. This hash identifies each expression.
- userid: oid of the user who executed the query
- dbid: oid of the database in which the query has been executed
- lrelid, lattnum: oid of the relation and attribute number of the VAR on the left hand side, if any
- rrelid, rattnum: oid of the relation and attribute number of the VAR on the right hand side, if any
- parenthash: hash of the parent "AND" expression, if any. This is useful for identifying predicates which are used together
- nodehash: the predicate hash. Everything (down to constants) is used to compute this hash
- count: the total number of occurences of this predicate
- queryid: if pg_stats_statements is installed, the queryid identifying this query
- constvalue: a string representation of the right-hand side constant, if any, truncated to 80 bytes.
Example:
ro=# select * from pg_qualstats;
userid | dbid | lrelid | lattnum | opno | rrelid | rattnum | parenthash | nodehash | count | queryid | constvalue
--------+-------+--------+---------+------+--------+---------+-------------+------------+-------+---------+-------------
10 | 16546 | 1262 | 1 | 93 | | | 1167468204 | -312474735 | 1 | | 12::integer
10 | 16546 | | | 607 | 1262 | -2 | -1449854762 | 1327480291 | 1 | |
- pg_qualstats_reset: reset the internal counters and forget about every encountered qual.
Views
In addition to that, the extension defines some views on top of the pg_qualstats function:
pg_qualstats: filters calls to pg_qualstats() by the current database.
pg_qualstats_pretty: performs the appropriate joins to display a readable form for every attribute from the pg_qualstats view
Example:
ro=# select * from pg_qualstats_pretty;
left_schema | left_table | left_column | operator | right_schema | right_table | right_column | count
-------------+------------------+-------------+--------------+--------------+-------------+--------------+-------
public | pgbench_accounts | aid | pg_catalog.= | | | | 20
public | pgbench_tellers | tid | pg_catalog.= | | | | 10
public | pgbench_branches | bid | pg_catalog.= | | | | 10
pg_qualstats_all: sums the counts for each attribute / operator pair, regardless of its position as an operand (LEFT or RIGHT), grouping together attributes used in AND clauses.
Example:
ro=# select * from pg_qualstats_all; relid | attnums | opno | parenthash | count -------+---------+------+-------------+------- 74150 | {1,3} | 96 | -1878264478 | 2 74153 | {1} | 96 | 0 | 10 74156 | {1} | 96 | 0 | 20 74159 | {1} | 96 | 0 | 10
pg_qualstats_indexes: looks up those attributes for which an index doesn't exist with the attribute in first position.
Example:
ro=# select * from pg_qualstats_indexes; relid | attnames | possible_types | count ------------------+-----------------------------+----------------+------- pgbench_accounts | {filler} | {btree,hash} | 5 pgbench_accounts | {bid} | {btree,hash} | 2 pgbench_accounts | {bid,filler} | {btree,hash} | 8 (9 rows)
Todo
- Test, and ensure it doesnt crash
- Add pg_qualstats_foreignkeys for suggesting FKs (frequently joined together columns)
- Normalize queries to eliminate constants
- Function or example in docs on how to use pg_qualstats with pg_stats_statements.