Go to file
Tom Lane 95bee941be Fix misestimation of n_distinct for a nearly-unique column with many nulls.
If ANALYZE found no repeated non-null entries in its sample, it set the
column's stadistinct value to -1.0, intending to indicate that the entries
are all distinct.  But what this value actually means is that the number
of distinct values is 100% of the table's rowcount, and thus it was
overestimating the number of distinct values by however many nulls there
are.  This could lead to very poor selectivity estimates, as for example
in a recent report from Andreas Joseph Krogh.  We should discount the
stadistinct value by whatever we've estimated the nulls fraction to be.
(That is what will happen if we choose to use a negative stadistinct for
a column that does have repeated entries, so this code path was just
inconsistent.)

In addition to fixing the stadistinct entries stored by several different
ANALYZE code paths, adjust the logic where get_variable_numdistinct()
forces an "all distinct" estimate on the basis of finding a relevant unique
index.  Unique indexes don't reject nulls, so there's no reason to assume
that the null fraction doesn't apply.

Back-patch to all supported branches.  Back-patching is a bit of a judgment
call, but this problem seems to affect only a few users (else we'd have
identified it long ago), and it's bad enough when it does happen that
destabilizing plan choices in a worse direction seems unlikely.

Patch by me, with documentation wording suggested by Dean Rasheed

Report: <VisenaEmail.26.df42f82acae38a58.156463942b8@tc7-visena>
Discussion: <16143.1470350371@sss.pgh.pa.us>
2016-08-07 18:52:02 -04:00
config Update config.guess and config.sub 2016-05-06 14:02:44 -04:00
contrib Don't propagate a null subtransaction snapshot up to parent transaction. 2016-08-07 13:15:55 -04:00
doc Fix misestimation of n_distinct for a nearly-unique column with many nulls. 2016-08-07 18:52:02 -04:00
src Fix misestimation of n_distinct for a nearly-unique column with many nulls. 2016-08-07 18:52:02 -04:00
.dir-locals.el emacs: Set indent-tabs-mode in perl-mode 2015-04-12 23:53:23 -04:00
.gitattributes Fix whitespace and remove obsolete gitattributes entry 2016-03-13 16:03:13 -04:00
.gitignore Add .gitignore entries for AIX-specific intermediate build artifacts. 2015-07-08 20:44:22 -04:00
aclocal.m4 Replace our hacked version of ax_pthread.m4 with latest upstream version. 2015-07-08 20:36:06 +03:00
configure Stamp 9.6beta3. 2016-07-18 16:54:26 -04:00
configure.in Stamp 9.6beta3. 2016-07-18 16:54:26 -04:00
COPYRIGHT Update copyright for 2016 2016-01-02 13:33:40 -05:00
GNUmakefile.in Fix distclean/maintainer-clean targets to remove top-level tmp_install dir. 2015-05-13 18:48:05 -04:00
HISTORY Improve text of stub HISTORY file. 2014-02-12 18:16:17 -05:00
Makefile Allow make check in PL directories 2011-02-15 06:52:12 +02:00
README Don't generate plain-text HISTORY and src/test/regress/README anymore. 2014-02-10 20:48:04 -05:00
README.git Don't generate plain-text HISTORY and src/test/regress/README anymore. 2014-02-10 20:48:04 -05:00

PostgreSQL Database Management System
=====================================

This directory contains the source code distribution of the PostgreSQL
database management system.

PostgreSQL is an advanced object-relational database management system
that supports an extended subset of the SQL standard, including
transactions, foreign keys, subqueries, triggers, user-defined types
and functions.  This distribution also contains C language bindings.

PostgreSQL has many language interfaces, many of which are listed here:

	http://www.postgresql.org/download

See the file INSTALL for instructions on how to build and install
PostgreSQL.  That file also lists supported operating systems and
hardware platforms and contains information regarding any other
software packages that are required to build or run the PostgreSQL
system.  Copyright and license information can be found in the
file COPYRIGHT.  A comprehensive documentation set is included in this
distribution; it can be read as described in the installation
instructions.

The latest version of this software may be obtained at
http://www.postgresql.org/download/.  For more information look at our
web site located at http://www.postgresql.org/.