postgresql/contrib/pg_stat_statements
Michael Paquier 3db72ebcbe Generate code for query jumbling through gen_node_support.pl
This commit changes the query jumbling code in queryjumblefuncs.c to be
generated automatically based on the information of the nodes in the
headers of src/include/nodes/ by using gen_node_support.pl.  This
approach offers many advantages:
- Support for query jumbling for all the utility statements, based on the
state of their parsed Nodes and not only their query string.  This will
greatly ease the switch to normalize the information of some DDLs, like
SET or CALL for example (this is left unchanged and should be part of a
separate discussion).  With this feature, the number of entries stored
for utilities in pg_stat_statements is reduced (for example now
"CHECKPOINT" and "checkpoint" mean the same thing with the same query
ID).
- Documentation of query jumbling directly in the structure definition
of the nodes.  Since this code was first introduced in pg_stat_statements
and then moved to core, the documented reasons behind the choices of what
should be included in the jumble are rather sparse.  Note that some
explanation is added for the most relevant parts, as a start.
- Overall code reduction and more consistency with the other parts
generating read, write and copy depending on the nodes.
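
As an illustration of the first point above, here is a minimal sketch of
why an ID computed from the parsed node is insensitive to how the
statement was typed, while an ID computed from the query string is not.
Everything in it is hypothetical: the toy_* names, the djb2-style hash
and the one-keyword "parser" are stand-ins chosen for brevity, not the
code generated by gen_node_support.pl.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the tag of a parsed utility statement. */
typedef enum ToyNodeTag
{
    TOY_UNKNOWN_STMT = 0,
    TOY_CHECKPOINT_STMT = 1
} ToyNodeTag;

/* djb2-style hash, used only as a placeholder for the real hash function. */
static uint64_t
toy_hash(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t    h = 5381;

    for (size_t i = 0; i < len; i++)
        h = h * 33 + p[i];
    return h;
}

/* Case-insensitive comparison of two keywords. */
static int
toy_keyword_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0')
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Toy "parser": maps any spelling of CHECKPOINT to the same node tag. */
static ToyNodeTag
toy_parse(const char *query)
{
    assert(query != NULL);
    return toy_keyword_equal(query, "checkpoint")
        ? TOY_CHECKPOINT_STMT : TOY_UNKNOWN_STMT;
}

/* Previous approach for utilities: hash the query text as typed. */
static uint64_t
toy_id_from_text(const char *query)
{
    return toy_hash(query, strlen(query));
}

/* New approach: hash the state of the parsed node (here, only its tag). */
static uint64_t
toy_id_from_node(ToyNodeTag tag)
{
    int         t = (int) tag;

    return toy_hash(&t, sizeof(t));
}
```

With these helpers, toy_id_from_text() gives different values for
"CHECKPOINT" and "checkpoint", while toy_id_from_node(toy_parse(...))
gives the same value for both, which is the effect described for
pg_stat_statements entries.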

The query jumbling is controlled by a couple of new node attributes,
documented in nodes/nodes.h:
- custom_query_jumble, to mark a Node as having a custom
implementation.
- no_query_jumble, to ignore a Node entirely.
- query_jumble_ignore, to ignore a field in a Node.
- query_jumble_location, to mark a location in a Node, for
normalization.  This can apply only to int fields, with "location" in
their name (only Const as of this commit).
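
The attributes listed above can be sketched as follows.  This is a
self-contained toy, not the actual node definitions: ToyConst,
ToyPlanOnly, ToyJumbleState and the toy_* functions are hypothetical,
and toy_jumble_const() is a hand-written guess at the kind of function
gen_node_support.pl would generate from such annotations.  The empty
pg_node_attr() macro is redefined locally so that the sketch compiles on
its own; the attributes are only meaningful to the generator script.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Expands to nothing for the compiler; redefined here for the sketch. */
#define pg_node_attr(...)

/* Hypothetical node, loosely modeled on a constant in a query. */
typedef struct ToyConst
{
    int         type_oid;       /* no attribute: included in the jumble */
    int         typmod pg_node_attr(query_jumble_ignore);   /* skipped */
    long        value pg_node_attr(query_jumble_ignore);    /* skipped */
    /* recorded for normalization, not added to the jumble */
    int         location pg_node_attr(query_jumble_location);
} ToyConst;

/* Hypothetical node that the jumbling would skip entirely. */
typedef struct ToyPlanOnly
{
    pg_node_attr(no_query_jumble)

    int         cost;
} ToyPlanOnly;

/* Toy jumble state: accumulated bytes plus recorded constant locations. */
typedef struct ToyJumbleState
{
    unsigned char buf[64];
    size_t      len;
    int         locations[8];
    size_t      nlocations;
} ToyJumbleState;

/* Append raw bytes of a field to the jumble buffer. */
static void
toy_append(ToyJumbleState *st, const void *data, size_t n)
{
    assert(st->len + n <= sizeof(st->buf));
    memcpy(st->buf + st->len, data, n);
    st->len += n;
}

/* Roughly what a generated function for ToyConst could look like. */
static void
toy_jumble_const(ToyJumbleState *st, const ToyConst *node)
{
    /* type_oid has no attribute, so it contributes to the jumble. */
    toy_append(st, &node->type_oid, sizeof(node->type_oid));

    /* typmod and value are marked query_jumble_ignore: nothing to do. */

    /* location is marked query_jumble_location: remember it only. */
    assert(st->nlocations < sizeof(st->locations) / sizeof(st->locations[0]));
    st->locations[st->nlocations++] = node->location;
}
```

In this toy, two ToyConst values that differ only in value and location
produce identical jumble bytes, while their locations stay available so
the constants could later be replaced in the normalized query text.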

There should be no compatibility impact on pg_stat_statements, as the
new code applies the jumbling to the same fields for each node (its
regression tests have no modification, for one).

Benchmarks of the query jumbling between HEAD and this commit for
SELECT and DMLs showed that this new code does not cause a
performance regression, with computation times close for both methods.
For utility queries, the new method is slower than the previous method
of calculating a hash of the query string, though we are talking about
extra ns-level changes based on what I measured, which is unnoticeable
even for OLTP workloads as a query ID is calculated once per query
post-parse analysis.

Author: Michael Paquier
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/Y5BHOUhX3zTH/ig6@paquier.xyz
2023-01-31 15:24:05 +09:00
expected Generate code for query jumbling through gen_node_support.pl 2023-01-31 15:24:05 +09:00
sql Generate code for query jumbling through gen_node_support.pl 2023-01-31 15:24:05 +09:00
.gitignore pg_stat_statements: Add .gitignore file for tests 2016-11-13 08:24:43 -05:00
Makefile pg_stat_statements: Track I/O timing for temporary file blocks 2022-04-08 13:12:07 +09:00
meson.build meson: Add two missing regress tests 2023-01-17 13:49:09 -08:00
pg_stat_statements--1.0--1.1.sql Fix typo in update scripts for some contrib modules. 2013-07-19 04:13:01 +09:00
pg_stat_statements--1.1--1.2.sql Keep pg_stat_statements' query texts in a file, not in shared memory. 2014-01-27 15:37:54 -05:00
pg_stat_statements--1.2--1.3.sql Add stats for min, max, mean, stddev times to pg_stat_statements. 2015-03-27 15:43:22 -04:00
pg_stat_statements--1.3--1.4.sql Update pg_stat_statements extension for parallel query. 2016-06-10 10:42:01 -04:00
pg_stat_statements--1.4--1.5.sql Default monitoring roles 2017-03-30 14:18:53 -04:00
pg_stat_statements--1.4.sql Update pg_stat_statements extension for parallel query. 2016-06-10 10:42:01 -04:00
pg_stat_statements--1.5--1.6.sql Revoke pg_stat_statements_reset() permissions 2018-09-25 09:55:44 +09:00
pg_stat_statements--1.6--1.7.sql Extend pg_stat_statements_reset to reset statistics specific to a 2019-01-11 08:50:09 +05:30
pg_stat_statements--1.7--1.8.sql Change the display of WAL usage statistics in Explain. 2020-05-05 08:00:53 +05:30
pg_stat_statements--1.8--1.9.sql Merge v1.10 of pg_stat_statements into v1.9 2021-04-08 15:15:17 +02:00
pg_stat_statements--1.9--1.10.sql Add JIT counters to pg_stat_statements 2022-04-08 13:52:16 +02:00
pg_stat_statements.c Move queryjumble.c code to src/backend/nodes/ 2023-01-21 11:48:37 +09:00
pg_stat_statements.conf Allow compute_query_id to be set to 'auto' and make it default 2021-05-15 14:13:09 -04:00
pg_stat_statements.control pg_stat_statements: Track I/O timing for temporary file blocks 2022-04-08 13:12:07 +09:00