postgresql/src/include
Tom Lane 0c882e52a8 Improve ineq_histogram_selectivity's behavior for non-default orderings.
ineq_histogram_selectivity() can be invoked in situations where the
ordering we care about is not that of the column's histogram.  We could
be considering some other collation, or even more drastically, the
query operator might not agree at all with what was used to construct
the histogram.  (We'll get here for anything using scalarineqsel-based
estimators, so that's quite likely to happen for extension operators.)

Up to now we just ignored this issue and assumed we were dealing with
an operator/collation whose sort order exactly matches the histogram,
possibly resulting in junk estimates if the binary search gets confused.
It's past time to improve that, since the use of non-default collations
is increasing.  What we can do is verify that the given operator and
collation match what's recorded in pg_statistic, and use the existing
code only if so.  When they don't match, instead execute the operator
against each histogram entry, and take the fraction of successes as our
selectivity estimate.  This gives an estimate that is probably good to
about 1/histogram_size, with no assumptions about ordering.  (The quality
of the estimate is likely to degrade near the ends of the value range,
since the two orderings probably don't agree on what is an extremal value;
but this is surely going to be more reliable than what we did before.)
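
The fallback described above can be illustrated with a minimal standalone
sketch. This is not the PostgreSQL implementation: the names
brute_force_selectivity, op_fn, and ci_less are hypothetical, strings stand
in for Datums, and returning -1.0 for an empty histogram is just this
sketch's convention for "no estimate available". It only shows the idea of
running the operator against every histogram entry and taking the fraction
of successes.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical predicate type: nonzero if "entry OP constval" holds. */
typedef int (*op_fn) (const char *entry, const char *constval);

/*
 * Estimate selectivity as the fraction of histogram entries that satisfy
 * the operator.  Makes no assumption about how the histogram is ordered.
 * Returns -1.0 when there is no histogram (a convention of this sketch).
 */
double
brute_force_selectivity(const char *const *hist, size_t nhist,
						op_fn op, const char *constval)
{
	size_t		nmatch = 0;

	if (nhist == 0)
		return -1.0;

	for (size_t i = 0; i < nhist; i++)
	{
		if (op(hist[i], constval))
			nmatch++;
	}
	return (double) nmatch / (double) nhist;
}

/*
 * Example operator: case-insensitive "less than".  Its ordering differs
 * from plain byte ordering, so a binary search over a byte-ordered
 * histogram could not be trusted with it.
 */
int
ci_less(const char *a, const char *b)
{
	while (*a && *b)
	{
		int			ca = tolower((unsigned char) *a);
		int			cb = tolower((unsigned char) *b);

		if (ca != cb)
			return ca < cb;
		a++;
		b++;
	}
	/* a is less only if it is a proper prefix of b */
	return *a == '\0' && *b != '\0';
}
```

For example, with a byte-ordered histogram {"Apple", "Cherry", "banana",
"date"} and the constant "cat", ci_less succeeds for "Apple" and "banana"
only, so the estimate is 2/4. The result can only move in steps of
1/nhist, which is the resolution limit the commit message refers to.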

At some point we might further improve matters by storing more than one
histogram calculated according to different orderings.  But this code
would still be good fallback logic when no matches exist, so that is
not an argument for not doing this.

While here, also improve get_variable_range() to deal more honestly
with non-default collations.

This isn't back-patchable, because it requires adding another argument
to ineq_histogram_selectivity, and because it might have significant
impact on the estimation results for extension operators relying on
scalarineqsel --- mostly for the better, one hopes, but in any case
destabilizing plan choices in back branches is best avoided.

Per investigation of a report from James Lucas.

Discussion: https://postgr.es/m/CAAFmbbOvfi=wMM=3qRsPunBSLb8BFREno2oOzSBS=mzfLPKABw@mail.gmail.com
2020-06-05 16:55:27 -04:00
access Fix some comments in xlogreader.h 2020-05-28 16:40:07 +09:00
bootstrap Update copyrights for 2020 2020-01-01 12:21:45 -05:00
catalog Make pg_stat_wal_receiver consistent with the WAL receiver's shmem info 2020-05-17 09:22:07 +09:00
commands Rename SLRU structures and associated LWLocks. 2020-05-15 14:28:25 -04:00
common Initial pgindent and pgperltidy run for v13. 2020-05-14 13:06:50 -04:00
datatype Update copyrights for 2020 2020-01-01 12:21:45 -05:00
executor Initial pgindent and pgperltidy run for v13. 2020-05-14 13:06:50 -04:00
fe_utils Reduce size of backend scanner's tables. 2020-01-13 15:04:31 -05:00
foreign Update copyrights for 2020 2020-01-01 12:21:45 -05:00
jit jit: Reference expression step functions via llvmjit_types. 2020-02-06 22:29:14 -08:00
lib Initial pgindent and pgperltidy run for v13. 2020-05-14 13:06:50 -04:00
libpq Initial pgindent and pgperltidy run for v13. 2020-05-14 13:06:50 -04:00
mb Allow Unicode escapes in any server encoding, not only UTF-8. 2020-03-06 14:17:43 -05:00
nodes Initial pgindent and pgperltidy run for v13. 2020-05-14 13:06:50 -04:00
optimizer Support FETCH FIRST WITH TIES 2020-04-07 16:22:13 -04:00
parser Revert 0f5ca02f53 2020-04-08 11:37:27 +03:00
partitioning Allow partitionwise joins in more cases. 2020-04-08 10:25:00 +09:00
port Initial pgindent and pgperltidy run for v13. 2020-05-14 13:06:50 -04:00
portability Update copyrights for 2020 2020-01-01 12:21:45 -05:00
postmaster Trigger autovacuum based on number of INSERTs 2020-03-28 19:20:12 +13:00
regex Assume that we have <wchar.h>. 2020-02-21 14:30:47 -05:00
replication Don't call palloc() while holding a spinlock, either. 2020-06-03 12:36:23 -04:00
rewrite Update copyrights for 2020 2020-01-01 12:21:45 -05:00
snowball Update copyrights for 2020 2020-01-01 12:21:45 -05:00
statistics Initial pgindent and pgperltidy run for v13. 2020-05-14 13:06:50 -04:00
storage Drop the redundant "Lock" suffix from LWLock wait event names. 2020-05-15 19:55:56 -04:00
tcop Allow the planner-related functions and hook to accept the query string. 2020-03-30 13:51:05 +09:00
tsearch Assume that we have <wchar.h>. 2020-02-21 14:30:47 -05:00
utils Improve ineq_histogram_selectivity's behavior for non-default orderings. 2020-06-05 16:55:27 -04:00
.gitignore Refactor dlopen() support 2018-09-06 11:33:04 +02:00
Makefile Get rid of jsonpath_gram.h and jsonpath_scanner.h 2019-03-20 11:13:34 +03:00
c.h Enable Unix-domain sockets support on Windows 2020-03-28 15:01:01 +01:00
fmgr.h Fix minor violations of FunctionCallInvoke usage protocol. 2020-04-21 14:23:53 -04:00
funcapi.h Avoid holding a directory FD open across assorted SRF calls. 2020-03-16 21:05:52 -04:00
getaddrinfo.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
getopt_long.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
miscadmin.h Add unlikely() to CHECK_FOR_INTERRUPTS() 2020-06-05 16:49:25 -04:00
pg_config.h.in Enable Unix-domain sockets support on Windows 2020-03-28 15:01:01 +01:00
pg_config_ext.h.in Autoconfiscate selection of 64-bit int type for 64-bit large object API. 2012-10-07 21:52:43 -04:00
pg_config_manual.h Remove ACLDEBUG #define and associated code. 2020-04-23 15:38:04 -04:00
pg_getopt.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
pg_trace.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
pgstat.h Mop-up for wait event naming issues. 2020-05-16 21:00:11 -04:00
pgtar.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
pgtime.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
port.h Initial pgindent and pgperltidy run for v13. 2020-05-14 13:06:50 -04:00
postgres.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
postgres_ext.h Phase 2 of pgindent updates. 2017-06-21 15:19:25 -04:00
postgres_fe.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
rusagestub.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
windowapi.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00