<!-- doc/src/sgml/planstats.sgml -->

<chapter id="planner-stats-details">

 <title>How the Planner Uses Statistics</title>

 <para>
  This chapter builds on the material covered in <xref
  linkend="using-explain"/> and <xref linkend="planner-stats"/> to show some
  additional details about how the planner uses the
  system statistics to estimate the number of rows each part of a query might
  return. This is a significant part of the planning process,
  providing much of the raw material for cost calculation.
 </para>

 <para>
  The intent of this chapter is not to document the code in detail,
  but to present an overview of how it works.
  This will perhaps ease the learning curve for someone who subsequently
  wishes to read the code.
 </para>

 <sect1 id="row-estimation-examples">

  <title>Row Estimation Examples</title>

  <indexterm zone="row-estimation-examples">
   <primary>row estimation</primary>
   <secondary>planner</secondary>
  </indexterm>

  <para>
   The examples shown below use tables in the <productname>PostgreSQL</productname>
   regression test database.
   The outputs shown are taken from version 8.3.
   The behavior of earlier (or later) versions might vary.
   Note also that since <command>ANALYZE</command> uses random sampling
   while producing statistics, the results will change slightly after
   any new <command>ANALYZE</command>.
  </para>

  <para>
   Let's start with a very simple query:

<programlisting>
EXPLAIN SELECT * FROM tenk1;

                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..458.00 rows=10000 width=244)
</programlisting>

   How the planner determines the cardinality of <structname>tenk1</structname>
   is covered in <xref linkend="planner-stats"/>, but is repeated here for
   completeness. The number of pages and rows is looked up in
   <structname>pg_class</structname>:

<programlisting>
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';

 relpages | reltuples
----------+-----------
      358 |     10000
</programlisting>

   These numbers are current as of the last <command>VACUUM</command> or
   <command>ANALYZE</command> on the table.  The planner then fetches the
   actual current number of pages in the table (this is a cheap operation,
   not requiring a table scan).  If that is different from
   <structfield>relpages</structfield> then
   <structfield>reltuples</structfield> is scaled accordingly to
   arrive at a current number-of-rows estimate.  In the example above, the value of
   <structfield>relpages</structfield> is up-to-date so the rows estimate is
   the same as <structfield>reltuples</structfield>.
  </para>
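The scaling rule just described can be sketched in a few lines of Python. This is a simplified illustration of the planner's behavior, not the actual C code (which lives in <filename>src/backend/optimizer/util/plancat.c</filename>):

```python
def estimate_rows(reltuples: float, relpages: int, current_pages: int) -> int:
    """Scale the stored reltuples by the ratio of actual to recorded pages.

    Simplified model of the planner's row estimate for a bare table scan.
    """
    if relpages == 0:
        # avoid division by zero for a table that was empty at last ANALYZE
        return int(reltuples)
    return round(reltuples * current_pages / relpages)

# relpages is up to date, so the estimate equals reltuples
print(estimate_rows(10000, 358, 358))   # 10000
# if the table had doubled in size since the last ANALYZE:
print(estimate_rows(10000, 358, 716))   # 20000
```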

  <para>
   Let's move on to an example with a range condition in its
   <literal>WHERE</literal> clause:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000;

                                   QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------
 Bitmap Heap Scan on tenk1  (cost=24.06..394.64 rows=1007 width=244)
   Recheck Cond: (unique1 &lt; 1000)
   ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..23.80 rows=1007 width=0)
         Index Cond: (unique1 &lt; 1000)
</programlisting>

   The planner examines the <literal>WHERE</literal> clause condition
   and looks up the selectivity function for the operator
   <literal>&lt;</literal> in <structname>pg_operator</structname>.
   This is held in the column <structfield>oprrest</structfield>,
   and the entry in this case is <function>scalarltsel</function>.
   The <function>scalarltsel</function> function retrieves the histogram for
   <structfield>unique1</structfield> from
   <structname>pg_statistic</structname>.  For manual queries it is more
   convenient to look in the simpler <structname>pg_stats</structname>
   view:

<programlisting>
SELECT histogram_bounds FROM pg_stats
WHERE tablename='tenk1' AND attname='unique1';

                   histogram_bounds
------------------------------------------------------
 {0,993,1997,3050,4040,5036,5957,7057,8029,9016,9995}
</programlisting>

   Next the fraction of the histogram occupied by <quote>&lt; 1000</quote>
   is worked out.  This is the selectivity.  The histogram divides the range
   into equal frequency buckets, so all we have to do is locate the bucket
   that our value is in and count <emphasis>part</emphasis> of it and
   <emphasis>all</emphasis> of the ones before.  The value 1000 is clearly in
   the second bucket (993&ndash;1997).  Assuming a linear distribution of
   values inside each bucket, we can calculate the selectivity as:

<programlisting>
selectivity = (1 + (1000 - bucket[2].min)/(bucket[2].max - bucket[2].min))/num_buckets
            = (1 + (1000 - 993)/(1997 - 993))/10
            = 0.100697
</programlisting>

   that is, one whole bucket plus a linear fraction of the second, divided by
   the number of buckets.  The estimated number of rows can now be calculated as
   the product of the selectivity and the cardinality of
   <structname>tenk1</structname>:

<programlisting>
rows = rel_cardinality * selectivity
     = 10000 * 0.100697
     = 1007  (rounding off)
</programlisting>
  </para>
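The bucket arithmetic above can be captured in a short Python sketch. It reproduces the same equal-frequency-bucket interpolation, but is only an illustration, not the real <function>scalarltsel</function> implementation:

```python
def histogram_selectivity(bounds, value):
    """Fraction of rows estimated to be below `value`.

    Counts all buckets entirely below the value plus a linear fraction of
    the bucket containing it, assuming values are spread linearly within
    each equal-frequency bucket.
    """
    num_buckets = len(bounds) - 1
    if value < bounds[0]:
        return 0.0
    if value >= bounds[-1]:
        return 1.0
    for i in range(num_buckets):
        if bounds[i] <= value < bounds[i + 1]:
            fraction = (value - bounds[i]) / (bounds[i + 1] - bounds[i])
            return (i + fraction) / num_buckets

bounds = [0, 993, 1997, 3050, 4040, 5036, 5957, 7057, 8029, 9016, 9995]
sel = histogram_selectivity(bounds, 1000)
print(round(sel, 6))        # 0.100697
print(round(10000 * sel))   # 1007
```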

  <para>
   Next let's consider an example with an equality condition in its
   <literal>WHERE</literal> clause:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'CRAAAA';

                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..483.00 rows=30 width=244)
   Filter: (stringu1 = 'CRAAAA'::name)
</programlisting>

   Again the planner examines the <literal>WHERE</literal> clause condition
   and looks up the selectivity function for <literal>=</literal>, which is
   <function>eqsel</function>.  For equality estimation the histogram is
   not useful; instead the list of <firstterm>most
   common values</firstterm> (<acronym>MCV</acronym>s) is used to determine the
   selectivity.  Let's have a look at the MCVs, with some additional columns
   that will be useful later:

<programlisting>
SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM pg_stats
WHERE tablename='tenk1' AND attname='stringu1';

null_frac         | 0
n_distinct        | 676
most_common_vals  | {EJAAAA,BBAAAA,CRAAAA,FCAAAA,FEAAAA,GSAAAA,&zwsp;JOAAAA,MCAAAA,NAAAAA,WGAAAA}
most_common_freqs | {0.00333333,0.003,0.003,0.003,0.003,0.003,&zwsp;0.003,0.003,0.003,0.003}
</programlisting>

   Since <literal>CRAAAA</literal> appears in the list of MCVs, the selectivity is
   merely the corresponding entry in the list of most common frequencies
   (<acronym>MCF</acronym>s):

<programlisting>
selectivity = mcf[3]
            = 0.003
</programlisting>

   As before, the estimated number of rows is just the product of this with the
   cardinality of <structname>tenk1</structname>:

<programlisting>
rows = 10000 * 0.003
     = 30
</programlisting>
  </para>
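As a sketch, the MCV lookup amounts to no more than indexing into the statistics arrays. This is illustrative Python with the values taken from the <structname>pg_stats</structname> output above, not the real <function>eqsel</function> code:

```python
def eq_selectivity_mcv(mcvs, mcfs, value):
    """If the constant is an MCV, its recorded frequency is the selectivity."""
    return mcfs[mcvs.index(value)]

mcvs = ['EJAAAA', 'BBAAAA', 'CRAAAA', 'FCAAAA', 'FEAAAA',
        'GSAAAA', 'JOAAAA', 'MCAAAA', 'NAAAAA', 'WGAAAA']
mcfs = [0.00333333] + [0.003] * 9

sel = eq_selectivity_mcv(mcvs, mcfs, 'CRAAAA')
print(sel)                  # 0.003
print(round(10000 * sel))   # 30
```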

  <para>
   Now consider the same query, but with a constant that is not in the
   <acronym>MCV</acronym> list:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'xxx';

                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..483.00 rows=15 width=244)
   Filter: (stringu1 = 'xxx'::name)
</programlisting>

   This is quite a different problem: how to estimate the selectivity when the
   value is <emphasis>not</emphasis> in the <acronym>MCV</acronym> list.
   The approach is to use the fact that the value is not in the list,
   combined with the knowledge of the frequencies for all of the
   <acronym>MCV</acronym>s:

<programlisting>
selectivity = (1 - sum(mcf))/(num_distinct - num_mcv)
            = (1 - (0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003 +
                    0.003 + 0.003 + 0.003 + 0.003))/(676 - 10)
            = 0.0014559
</programlisting>

   That is, add up all the frequencies for the <acronym>MCV</acronym>s and
   subtract them from one, then
   divide by the number of <emphasis>other</emphasis> distinct values.
   This amounts to assuming that the fraction of the column that is not any
   of the MCVs is evenly distributed among all the other distinct values.
   Notice that there are no null values so we don't have to worry about those
   (otherwise we'd subtract the null fraction from the numerator as well).
   The estimated number of rows is then calculated as usual:

<programlisting>
rows = 10000 * 0.0014559
     = 15  (rounding off)
</programlisting>
  </para>
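The same arithmetic as a Python sketch, again using the numbers from the <structname>pg_stats</structname> output (illustrative only):

```python
def eq_selectivity_non_mcv(mcfs, n_distinct, null_frac=0.0):
    """Selectivity for an equality constant that is not an MCV.

    The non-MCV, non-null fraction of the column is assumed to be spread
    evenly over all the remaining distinct values.
    """
    return (1.0 - sum(mcfs) - null_frac) / (n_distinct - len(mcfs))

mcfs = [0.00333333] + [0.003] * 9
sel = eq_selectivity_non_mcv(mcfs, 676)
print(round(sel, 7))        # 0.0014560 (doc shows 0.0014559, truncated)
print(round(10000 * sel))   # 15
```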

  <para>
   The previous example with <literal>unique1 &lt; 1000</literal> was an
   oversimplification of what <function>scalarltsel</function> really does;
   now that we have seen an example of the use of MCVs, we can fill in some
   more detail.  The example was correct as far as it went, because since
   <structfield>unique1</structfield> is a unique column it has no MCVs (obviously, no
   value is any more common than any other value).  For a non-unique
   column, there will normally be both a histogram and an MCV list, and
   <emphasis>the histogram does not include the portion of the column
   population represented by the MCVs</emphasis>.  We do things this way because
   it allows more precise estimation.  In this situation
   <function>scalarltsel</function> directly applies the condition (e.g.,
   <quote>&lt; 1000</quote>) to each value of the MCV list, and adds up the
   frequencies of the MCVs for which the condition is true.  This gives
   an exact estimate of the selectivity within the portion of the table
   that is MCVs.  The histogram is then used in the same way as above
   to estimate the selectivity in the portion of the table that is not
   MCVs, and then the two numbers are combined to estimate the overall
   selectivity.  For example, consider

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE stringu1 &lt; 'IAAAAA';

                         QUERY PLAN
------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..483.00 rows=3077 width=244)
   Filter: (stringu1 &lt; 'IAAAAA'::name)
</programlisting>

   We already saw the MCV information for <structfield>stringu1</structfield>,
   and here is its histogram:

<programlisting>
SELECT histogram_bounds FROM pg_stats
WHERE tablename='tenk1' AND attname='stringu1';

                                histogram_bounds
-------------------------------------------------------------------&zwsp;-------------
 {AAAAAA,CQAAAA,FRAAAA,IBAAAA,KRAAAA,NFAAAA,PSAAAA,SGAAAA,VAAAAA,&zwsp;XLAAAA,ZZAAAA}
</programlisting>

   Checking the MCV list, we find that the condition <literal>stringu1 &lt;
   'IAAAAA'</literal> is satisfied by the first six entries and not the last four,
   so the selectivity within the MCV part of the population is

<programlisting>
selectivity = sum(relevant mcfs)
            = 0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003
            = 0.01833333
</programlisting>

   Summing all the MCFs also tells us that the total fraction of the
   population represented by MCVs is 0.03033333, and therefore the
   fraction represented by the histogram is 0.96966667 (again, there
   are no nulls, else we'd have to exclude them here).  We can see
   that the value <literal>IAAAAA</literal> falls nearly at the end of the
   third histogram bucket.  Using some rather cheesy assumptions
   about the frequency of different characters, the planner arrives
   at the estimate 0.298387 for the portion of the histogram population
   that is less than <literal>IAAAAA</literal>.  We then combine the estimates
   for the MCV and non-MCV populations:

<programlisting>
selectivity = mcv_selectivity + histogram_selectivity * histogram_fraction
            = 0.01833333 + 0.298387 * 0.96966667
            = 0.307669

rows        = 10000 * 0.307669
            = 3077  (rounding off)
</programlisting>

   In this particular example, the correction from the MCV list is fairly
   small, because the column distribution is actually quite flat (the
   statistics showing these particular values as being more common than
   others are mostly due to sampling error).  In a more typical case where
   some values are significantly more common than others, this complicated
   process gives a useful improvement in accuracy because the selectivity
   for the most common values is found exactly.
  </para>
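The combination step can be reproduced numerically in Python. This is illustrative only; the 0.298387 within-histogram estimate is simply taken from the text above rather than recomputed:

```python
# MCV frequencies from the pg_stats output for stringu1
mcfs = [0.00333333] + [0.003] * 9

# first six MCV entries satisfy stringu1 < 'IAAAAA'
mcv_selectivity = sum(mcfs[:6])

# the histogram covers everything that is neither an MCV nor NULL
histogram_fraction = 1.0 - sum(mcfs)     # no nulls in this column
histogram_selectivity = 0.298387         # planner's estimate within the histogram

selectivity = mcv_selectivity + histogram_selectivity * histogram_fraction
print(round(selectivity, 6))             # 0.307669
print(round(10000 * selectivity))        # 3077
```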

  <para>
   Now let's consider a case with more than one
   condition in the <literal>WHERE</literal> clause:

<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000 AND stringu1 = 'xxx';

                                   QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------
 Bitmap Heap Scan on tenk1  (cost=23.80..396.91 rows=1 width=244)
   Recheck Cond: (unique1 &lt; 1000)
   Filter: (stringu1 = 'xxx'::name)
   ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..23.80 rows=1007 width=0)
         Index Cond: (unique1 &lt; 1000)
</programlisting>

   The planner assumes that the two conditions are independent, so that
   the individual selectivities of the clauses can be multiplied together:

<programlisting>
selectivity = selectivity(unique1 &lt; 1000) * selectivity(stringu1 = 'xxx')
            = 0.100697 * 0.0014559
            = 0.0001466

rows        = 10000 * 0.0001466
            = 1  (rounding off)
</programlisting>

   Notice that the number of rows estimated to be returned from the bitmap
   index scan reflects only the condition used with the index; this is
   important since it affects the cost estimate for the subsequent heap
   fetches.
  </para>
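The independence assumption is just a multiplication, which we can verify against the plan's row count (illustrative Python using the selectivities derived earlier):

```python
# per-clause selectivities from the previous examples
sel_range = 0.100697    # unique1 < 1000, from the histogram
sel_eq = 0.0014559      # stringu1 = 'xxx', from the MCV statistics

# assuming the clauses are independent, multiply them
selectivity = sel_range * sel_eq
print(round(selectivity, 7))       # 0.0001466
print(round(10000 * selectivity))  # 1
```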

  <para>
   Finally we will examine a query that involves a join:

<programlisting>
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 &lt; 50 AND t1.unique2 = t2.unique2;

                                      QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------------
 Nested Loop  (cost=4.64..456.23 rows=50 width=488)
   ->  Bitmap Heap Scan on tenk1 t1  (cost=4.64..142.17 rows=50 width=244)
         Recheck Cond: (unique1 &lt; 50)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.63 rows=50 width=0)
               Index Cond: (unique1 &lt; 50)
   ->  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.00..6.27 rows=1 width=244)
         Index Cond: (unique2 = t1.unique2)
</programlisting>

   The restriction on <structname>tenk1</structname>,
   <literal>unique1 &lt; 50</literal>,
   is evaluated before the nested-loop join.
   This is handled analogously to the previous range example.  This time the
   value 50 falls into the first bucket of the
   <structfield>unique1</structfield> histogram:

<programlisting>
selectivity = (0 + (50 - bucket[1].min)/(bucket[1].max - bucket[1].min))/num_buckets
            = (0 + (50 - 0)/(993 - 0))/10
            = 0.005035

rows        = 10000 * 0.005035
            = 50  (rounding off)
</programlisting>

   The restriction for the join is <literal>t2.unique2 = t1.unique2</literal>.
   The operator is just
   our familiar <literal>=</literal>, however the selectivity function is
   obtained from the <structfield>oprjoin</structfield> column of
   <structname>pg_operator</structname>, and is <function>eqjoinsel</function>.
   <function>eqjoinsel</function> looks up the statistical information for both
   <structname>tenk2</structname> and <structname>tenk1</structname>:

<programlisting>
SELECT tablename, null_frac, n_distinct, most_common_vals FROM pg_stats
WHERE tablename IN ('tenk1', 'tenk2') AND attname='unique2';

 tablename | null_frac | n_distinct | most_common_vals
-----------+-----------+------------+------------------
 tenk1     |         0 |         -1 |
 tenk2     |         0 |         -1 |
</programlisting>

   In this case there is no <acronym>MCV</acronym> information for
   <structfield>unique2</structfield> because all the values appear to be
   unique, so we use an algorithm that relies only on the number of
   distinct values for both relations together with their null fractions:

<programlisting>
selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2)
            = (1 - 0) * (1 - 0) / max(10000, 10000)
            = 0.0001
</programlisting>

   That is, subtract the null fraction from one for each of the relations,
   and divide by the maximum of the numbers of distinct values.
   The number of rows
   that the join is likely to emit is calculated as the cardinality of the
   Cartesian product of the two inputs, multiplied by the
   selectivity:

<programlisting>
rows = (outer_cardinality * inner_cardinality) * selectivity
     = (50 * 10000) * 0.0001
     = 50
</programlisting>
  </para>
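The no-MCV join estimate can be sketched as follows. This is illustrative Python mirroring the formula above, not the real <function>eqjoinsel</function> code:

```python
def eqjoinsel_no_mcv(null_frac1, null_frac2, n_distinct1, n_distinct2):
    """Join selectivity when neither side has MCV statistics.

    Taking the minimum of 1/n_distinct is the same as dividing by the
    maximum of the distinct counts, as in the worked example.
    """
    return (1 - null_frac1) * (1 - null_frac2) * min(1 / n_distinct1, 1 / n_distinct2)

sel = eqjoinsel_no_mcv(0, 0, 10000, 10000)
# rows = outer_cardinality * inner_cardinality * selectivity
rows = round(50 * 10000 * sel)
print(sel)    # 0.0001
print(rows)   # 50
```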

  <para>
   Had there been MCV lists for the two columns,
   <function>eqjoinsel</function> would have used direct comparison of the MCV
   lists to determine the join selectivity within the part of the column
   populations represented by the MCVs.  The estimate for the remainder of the
   populations follows the same approach shown here.
  </para>

  <para>
   Notice that we showed <literal>inner_cardinality</literal> as 10000, that is,
   the unmodified size of <structname>tenk2</structname>.  It might appear from
   inspection of the <command>EXPLAIN</command> output that the estimate of
   join rows comes from 50 * 1, that is, the number of outer rows times
   the estimated number of rows obtained by each inner index scan on
   <structname>tenk2</structname>.  But this is not the case: the join relation size
   is estimated before any particular join plan has been considered.  If
   everything is working well then the two ways of estimating the join
   size will produce about the same answer, but due to round-off error and
   other factors they sometimes diverge significantly.
  </para>

  <para>
   For those interested in further details, estimation of the size of
   a table (before any <literal>WHERE</literal> clauses) is done in
   <filename>src/backend/optimizer/util/plancat.c</filename>.  The generic
   logic for clause selectivities is in
   <filename>src/backend/optimizer/path/clausesel.c</filename>.  The
   operator-specific selectivity functions are mostly found
   in <filename>src/backend/utils/adt/selfuncs.c</filename>.
  </para>

 </sect1>

 <sect1 id="multivariate-statistics-examples">

  <title>Multivariate Statistics Examples</title>

  <indexterm>
   <primary>row estimation</primary>
   <secondary>multivariate</secondary>
  </indexterm>

  <sect2 id="functional-dependencies">
   <title>Functional Dependencies</title>

   <para>
    Multivariate correlation can be demonstrated with a very simple data set
    &mdash; a table with two columns, both containing the same values:

<programlisting>
CREATE TABLE t (a INT, b INT);
INSERT INTO t SELECT i % 100, i % 100 FROM generate_series(1, 10000) s(i);
ANALYZE t;
</programlisting>

    As explained in <xref linkend="planner-stats"/>, the planner can determine
    cardinality of <structname>t</structname> using the number of pages and
    rows obtained from <structname>pg_class</structname>:

<programlisting>
SELECT relpages, reltuples FROM pg_class WHERE relname = 't';

 relpages | reltuples
----------+-----------
       45 |     10000
</programlisting>

    The data distribution is very simple; there are only 100 distinct values
    in each column, uniformly distributed.
   </para>

   <para>
    The following example shows the result of estimating a <literal>WHERE</literal>
    condition on the <structfield>a</structfield> column:

<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1;
                                 QUERY PLAN
-------------------------------------------------------------------&zwsp;------------
 Seq Scan on t  (cost=0.00..170.00 rows=100 width=8) (actual rows=100 loops=1)
   Filter: (a = 1)
   Rows Removed by Filter: 9900
</programlisting>

    The planner examines the condition and determines the selectivity
    of this clause to be 1%.  By comparing this estimate and the actual
    number of rows, we see that the estimate is very accurate
    (in fact exact, as the table is very small).  Changing the
    <literal>WHERE</literal> condition to use the <structfield>b</structfield> column, an
    identical plan is generated.  But observe what happens if we apply the same
    condition on both columns, combining them with <literal>AND</literal>:

<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
                                QUERY PLAN
-------------------------------------------------------------------&zwsp;----------
 Seq Scan on t  (cost=0.00..195.00 rows=1 width=8) (actual rows=100 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9900
</programlisting>

    The planner estimates the selectivity for each condition individually,
    arriving at the same 1% estimates as above.  Then it assumes that the
    conditions are independent, and so it multiplies their selectivities,
    producing a final selectivity estimate of just 0.01%.
    This is a significant underestimate, as the actual number of rows
    matching the conditions (100) is two orders of magnitude higher.
   </para>
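The size of the underestimate follows directly from the arithmetic (illustrative Python restating the numbers above):

```python
# Without multivariate statistics the planner multiplies the two 1% selectivities.
n_rows = 10000
sel_a = sel_b = 100 / n_rows          # each condition alone matches 100 of 10000 rows

estimated = round(n_rows * sel_a * sel_b)
actual = 100                          # a and b are perfectly correlated in this table
print(estimated, actual)              # 1 100
```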
|
|
|
|
|
|
|
|
<para>
|
    This problem can be fixed by creating a statistics object that
    directs <command>ANALYZE</command> to calculate functional-dependency
    multivariate statistics on the two columns:

<programlisting>
CREATE STATISTICS stts (dependencies) ON a, b FROM t;

ANALYZE t;

EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
                                   QUERY PLAN
-------------------------------------------------------------------&zwsp;------------
 Seq Scan on t  (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9900
</programlisting>
   </para>
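   <para>
    The arithmetic behind this estimate can be sketched outside the server.
    The following Python fragment is only a simplified rendering of the rule
    applied by <function>dependencies_clauselist_selectivity</function>, not
    the server's code: with a dependency of degree <literal>f</literal>
    between the columns, the combined selectivity is taken as
    <literal>P(a=1) * (f + (1-f) * P(b=1))</literal>, which for a perfect
    dependency (<literal>f = 1.0</literal>) collapses to the selectivity of
    the first clause alone:

<programlisting>
def dependent_sel(p_a, p_b, degree):
    # Combined selectivity of "a = x AND b = y" under a functional
    # dependency of the given degree (1.0 means b is fully implied by a).
    return p_a * (degree + (1.0 - degree) * p_b)

rows = 10000
# Per-column selectivities from the example: each value is 1% of each column.
print(round(rows * dependent_sel(0.01, 0.01, 1.0)))  # 100 rows, as in the plan
print(round(rows * dependent_sel(0.01, 0.01, 0.0)))  # 1 row: the naive product
</programlisting>
   </para>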
</sect2>

  <sect2 id="multivariate-ndistinct-counts">
   <title>Multivariate N-Distinct Counts</title>

   <para>
    A similar problem occurs with estimation of the cardinality of sets of
    multiple columns, such as the number of groups that would be generated by
    a <command>GROUP BY</command> clause.  When <command>GROUP BY</command>
    lists a single column, the n-distinct estimate (which is visible as the
    estimated number of rows returned by the HashAggregate node) is very
    accurate:
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a;
                                       QUERY PLAN
-------------------------------------------------------------------&zwsp;----------------------
 HashAggregate  (cost=195.00..196.00 rows=100 width=12) (actual rows=100 loops=1)
   Group Key: a
   ->  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=4) (actual rows=10000 loops=1)
</programlisting>
    But without multivariate statistics, the estimate for the number of
    groups in a query with two columns in <command>GROUP BY</command>, as
    in the following example, is off by an order of magnitude:
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
                                          QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------------------
 HashAggregate  (cost=220.00..230.00 rows=1000 width=16) (actual rows=100 loops=1)
   Group Key: a, b
   ->  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=8) (actual rows=10000 loops=1)
</programlisting>
    By redefining the statistics object to include n-distinct counts for the
    two columns, the estimate is much improved:
<programlisting>
DROP STATISTICS stts;

CREATE STATISTICS stts (dependencies, ndistinct) ON a, b FROM t;

ANALYZE t;

EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
                                       QUERY PLAN
-------------------------------------------------------------------&zwsp;-------------------------
 HashAggregate  (cost=220.00..221.00 rows=100 width=16) (actual rows=100 loops=1)
   Group Key: a, b
   ->  Seq Scan on t  (cost=0.00..145.00 rows=10000 width=8) (actual rows=10000 loops=1)
</programlisting>
   </para>
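   <para>
    The following Python fragment is only a rough sketch of this behavior
    (the real logic lives in <function>estimate_num_groups</function> in
    <filename>src/backend/utils/adt/selfuncs.c</filename> and applies
    additional corrections): without a multivariate count, the planner falls
    back to combining the per-column n-distinct estimates, clamped to the
    table size, whereas a matching <literal>ndistinct</literal> statistics
    object lets it use the stored group count directly:

<programlisting>
def estimated_groups(per_column_ndistinct, table_rows, multivariate=None):
    # With a matching ndistinct statistics object, use the stored count.
    if multivariate is not None:
        return min(multivariate, table_rows)
    # Fallback: product of the per-column distinct counts, clamped to the
    # number of rows.  (The server applies further corrections on top of
    # this, which is why the example plan shows 1000 rather than 10000.)
    est = 1
    for nd in per_column_ndistinct:
        est *= nd
    return min(est, table_rows)

print(estimated_groups([100, 100], 10000))                    # naive product
print(estimated_groups([100, 100], 10000, multivariate=100))  # with stats: 100
</programlisting>
   </para>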
</sect2>

  <sect2 id="mcv-lists">
   <title>MCV Lists</title>

   <para>
    As explained in <xref linkend="functional-dependencies"/>, functional
    dependencies are a very cheap and efficient type of statistics, but their
    main limitation is their global nature (only tracking dependencies at
    the column level, not between individual column values).
   </para>

   <para>
    This section introduces a multivariate variant of <acronym>MCV</acronym>
    (most-common values) lists, a straightforward extension of the per-column
    statistics described in <xref linkend="row-estimation-examples"/>.  These
    statistics address the limitation by storing individual values, but they
    are naturally more expensive, in terms of building the statistics in
    <command>ANALYZE</command>, storing them, and using them during planning.
   </para>

   <para>
    Let's look at the query from <xref linkend="functional-dependencies"/>
    again, but this time with an <acronym>MCV</acronym> list created on the
    same set of columns (be sure to drop the functional dependencies, to
    make sure the planner uses the newly created statistics):

<programlisting>
DROP STATISTICS stts;

CREATE STATISTICS stts2 (mcv) ON a, b FROM t;

ANALYZE t;

EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
                                   QUERY PLAN
-------------------------------------------------------------------&zwsp;------------
 Seq Scan on t  (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1)
   Filter: ((a = 1) AND (b = 1))
   Rows Removed by Filter: 9900
</programlisting>

    The estimate is as accurate as with the functional dependencies, mostly
    thanks to the table being fairly small and having a simple distribution
    with a low number of distinct values.  Before looking at the second query,
    which was not handled by functional dependencies particularly well,
    let's inspect the <acronym>MCV</acronym> list a bit.
   </para>

   <para>
    The <acronym>MCV</acronym> list can be inspected using the
    <function>pg_mcv_list_items</function> set-returning function:

<programlisting>
SELECT m.* FROM pg_statistic_ext JOIN pg_statistic_ext_data ON (oid = stxoid),
                pg_mcv_list_items(stxdmcv) m WHERE stxname = 'stts2';
 index |  values  | nulls | frequency | base_frequency
-------+----------+-------+-----------+----------------
     0 | {0, 0}   | {f,f} |      0.01 |         0.0001
     1 | {1, 1}   | {f,f} |      0.01 |         0.0001
   ...
    49 | {49, 49} | {f,f} |      0.01 |         0.0001
    50 | {50, 50} | {f,f} |      0.01 |         0.0001
   ...
    97 | {97, 97} | {f,f} |      0.01 |         0.0001
    98 | {98, 98} | {f,f} |      0.01 |         0.0001
    99 | {99, 99} | {f,f} |      0.01 |         0.0001
(100 rows)
</programlisting>

    This confirms there are 100 distinct combinations in the two columns, and
    all of them are about equally likely (1% frequency for each one).  The
    base frequency is the frequency computed from per-column statistics, as if
    there were no multi-column statistics.  Had there been any null values in
    either of the columns, this would be identified in the
    <structfield>nulls</structfield> column.
   </para>
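   <para>
    The base frequency is, in effect, just the product of the per-column
    frequencies, as this back-of-the-envelope Python check illustrates (each
    value occurs in 1% of the rows of each column in this table):

<programlisting>
freq_a = 0.01  # per-column frequency of some value in column a
freq_b = 0.01  # per-column frequency of some value in column b

# Assuming the columns were independent, the combined frequency would be
# the product of the per-column frequencies.
base_frequency = freq_a * freq_b
print(round(base_frequency, 6))
</programlisting>
   </para>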

   <para>
    When estimating the selectivity, the planner applies all the conditions
    to the items in the <acronym>MCV</acronym> list, and then sums the
    frequencies of the matching ones.  See
    <function>mcv_clauselist_selectivity</function>
    in <filename>src/backend/statistics/mcv.c</filename> for details.
   </para>
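   <para>
    As a simplified Python sketch (the real
    <function>mcv_clauselist_selectivity</function> also accounts for rows
    not covered by the list), the estimation amounts to filtering the list
    with the query's clauses and summing the surviving frequencies:

<programlisting>
# Toy MCV list mirroring the output above: 100 items {i, i},
# each with frequency 0.01.
mcv = [((i, i), 0.01) for i in range(100)]

def mcv_selectivity(items, predicate):
    # Sum the frequencies of the MCV items satisfying every clause.
    return sum(freq for values, freq in items if predicate(*values))

rows = 10000
sel_match = mcv_selectivity(mcv, lambda a, b: a == 1 and b == 1)
sel_none  = mcv_selectivity(mcv, lambda a, b: a == 1 and b == 10)
print(round(rows * sel_match))  # 100: the combination exists in the list
print(round(rows * sel_none))   # 0: incompatible combination of values
</programlisting>
   </para>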

   <para>
    Compared to functional dependencies, <acronym>MCV</acronym> lists have two
    major advantages.  Firstly, the list stores actual values, making it
    possible to decide which combinations of values are compatible:

<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 10;
                                 QUERY PLAN
-------------------------------------------------------------------&zwsp;--------
 Seq Scan on t  (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1)
   Filter: ((a = 1) AND (b = 10))
   Rows Removed by Filter: 10000
</programlisting>

    Secondly, <acronym>MCV</acronym> lists handle a wider range of clause
    types, not just equality clauses like functional dependencies.  For
    example, consider the following range query for the same table:

<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a &lt;= 49 AND b &gt; 49;
                                 QUERY PLAN
-------------------------------------------------------------------&zwsp;--------
 Seq Scan on t  (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1)
   Filter: ((a &lt;= 49) AND (b &gt; 49))
   Rows Removed by Filter: 10000
</programlisting>
   </para>

  </sect2>

 </sect1>

 <sect1 id="planner-stats-security">
  <title>Planner Statistics and Security</title>

  <para>
   Access to the table <structname>pg_statistic</structname> is restricted to
   superusers, so that ordinary users cannot learn about the contents of the
   tables of other users from it.  Some selectivity estimation functions will
   use a user-provided operator (either the operator appearing in the query or
   a related operator) to analyze the stored statistics.  For example, in order
   to determine whether a stored most common value is applicable, the
   selectivity estimator will have to run the appropriate <literal>=</literal>
   operator to compare the constant in the query to the stored value.
   Thus the data in <structname>pg_statistic</structname> is potentially
   passed to user-defined operators.  An appropriately crafted operator can
   intentionally leak the passed operands (for example, by logging them
   or writing them to a different table), or accidentally leak them by showing
   their values in error messages, in either case possibly exposing data from
   <structname>pg_statistic</structname> to a user who should not be able to
   see it.
  </para>

  <para>
   In order to prevent this, the following applies to all built-in selectivity
   estimation functions.  When planning a query, in order to be able to use
   stored statistics, the current user must either
   have <literal>SELECT</literal> privilege on the table or the involved
   columns, or the operator used must be <literal>LEAKPROOF</literal> (more
   accurately, the function that the operator is based on).  If not, then the
   selectivity estimator will behave as if no statistics are available, and
   the planner will proceed with default or fall-back assumptions.
  </para>

  <para>
   If a user does not have the required privilege on the table or columns,
   then in many cases the query will ultimately receive a permission-denied
   error, in which case this mechanism is invisible in practice.  But if the
   user is reading from a security-barrier view, then the planner might wish
   to check the statistics of an underlying table that is otherwise
   inaccessible to the user.  In that case, the operator should be leak-proof
   or the statistics will not be used.  There is no direct feedback about
   that, except that the plan might be suboptimal.  If one suspects that this
   is the case, one could try running the query as a more privileged user,
   to see if a different plan results.
  </para>

  <para>
   This restriction applies only to cases where the planner would need to
   execute a user-defined operator on one or more values
   from <structname>pg_statistic</structname>.  Thus the planner is permitted
   to use generic statistical information, such as the fraction of null values
   or the number of distinct values in a column, regardless of access
   privileges.
  </para>

  <para>
   Selectivity estimation functions contained in third-party extensions that
   potentially operate on statistics with user-defined operators should follow
   the same security rules.  Consult the <productname>PostgreSQL</productname>
   source code for guidance.
  </para>
 </sect1>

</chapter>