postgresql/src/backend/utils/adt
Tom Lane 3c93a60f60 Add some more defenses against silly estimates to gincostestimate().
A report from Andy Colson showed that gincostestimate() was not being
nearly paranoid enough about whether to believe the statistics it finds in
the index metapage.  The problem is that the metapage stats (other than the
pending-pages count) are only updated by VACUUM, and in the worst case
could still reflect the index's original empty state even when it has grown
to many entries.  We attempted to deal with that by scaling up the stats to
match the current index size, but if nEntries is zero then scaling it up
still gives zero.  Moreover, the proportion of pages that are entry pages
vs. data pages vs. pending pages is unlikely to be estimated very well by
scaling if the index is now orders of magnitude larger than before.

We can improve matters by expanding the use of the rule-of-thumb estimates
I introduced in commit 7fb008c5ee59b040: if the index has grown by more
than a cutoff amount (here set at 4X growth) since VACUUM, then use the
rule-of-thumb numbers instead of scaling.  This might not be exactly right
but it seems much less likely to produce insane estimates.

I also improved both the scaling estimate and the rule-of-thumb estimate
to account for numPendingPages, since it's reasonable to expect that that
is accurate in any case, and certainly pages that are in the pending list
are not either entry or data pages.
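
As a rough illustration of the decision described above (not the actual selfuncs.c code), the choice between scaling the VACUUM-time stats and falling back to rule-of-thumb numbers might look like the sketch below. The struct names, the function name, and the 90%-entry-pages and 100-entries-per-page constants are assumptions made for this example; only the 4X cutoff and the subtraction of pending pages come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the stats VACUUM last wrote to the metapage. */
typedef struct MetaStats
{
    double total_pages;
    double entry_pages;
    double data_pages;
    double entries;
} MetaStats;

/* Hypothetical result of the estimation step. */
typedef struct Estimate
{
    double entry_pages;
    double data_pages;
    double entries;
    bool   used_rule_of_thumb;
} Estimate;

#define GROWTH_CUTOFF            4.0    /* from the commit message */
#define ASSUMED_ENTRY_FRACTION   0.90   /* assumption for this sketch */
#define ASSUMED_ENTRIES_PER_PAGE 100.0  /* assumption for this sketch */

static double
min_double(double a, double b)
{
    return (a < b) ? a : b;
}

Estimate
estimate_index_shape(const MetaStats *meta, double cur_pages,
                     double pending_pages)
{
    Estimate est;
    double   usable;

    assert(cur_pages >= 0 && pending_pages >= 0);

    /* A pending count larger than the whole index cannot be right. */
    if (pending_pages > cur_pages)
        pending_pages = 0;

    /* Pending-list pages are neither entry pages nor data pages. */
    usable = cur_pages - pending_pages;

    if (meta->total_pages > 0 &&
        meta->total_pages <= cur_pages &&
        cur_pages <= GROWTH_CUTOFF * meta->total_pages &&
        meta->entry_pages > 0 && meta->entries > 0)
    {
        /* Stats look plausible and growth is modest: scale them up. */
        double scale = cur_pages / meta->total_pages;

        est.entry_pages = min_double(meta->entry_pages * scale, usable);
        est.data_pages = min_double(meta->data_pages * scale,
                                    usable - est.entry_pages);
        est.entries = meta->entries * scale;
        est.used_rule_of_thumb = false;
    }
    else
    {
        /* Stats are empty, stale, or too old to trust: use fixed ratios. */
        est.entry_pages = usable * ASSUMED_ENTRY_FRACTION;
        est.data_pages = usable - est.entry_pages;
        est.entries = est.entry_pages * ASSUMED_ENTRIES_PER_PAGE;
        est.used_rule_of_thumb = true;
    }
    return est;
}
```

With a never-vacuumed index (all metapage stats zero) the sketch takes the rule-of-thumb branch, so the entry count can no longer come out as zero however large the index has grown.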

As a somewhat separate issue, adjust the estimation equations that are
concerned with extra fetches for partial-match searches.  These equations
suppose that a fraction partialEntries / numEntries of the entry and data
pages will be visited as a consequence of a partial-match search.  Now,
it's physically impossible for that fraction to exceed one, but our
estimate of partialEntries is mostly bunk, and our estimate of numEntries
isn't exactly gospel either, so we could arrive at a silly value.  In the
example presented by Andy we were coming out with a value of 100, leading
to insane cost estimates.  Clamp the fraction to one to avoid that.
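
The clamp described above can be sketched as follows. This is a minimal illustration, not the code in selfuncs.c: the function names and the handling of a non-positive numEntries are assumptions, and the page-count formula is simplified to a single multiplication.

```c
/*
 * Fraction of entry and data pages assumed to be visited by a
 * partial-match search. Physically it cannot exceed one, so clamp it
 * even when the inputs are unreliable.
 */
double
partial_match_fraction(double partial_entries, double num_entries)
{
    double fraction;

    /* Assumption for this sketch: with no usable entry count, assume
     * the worst case of visiting everything. */
    if (num_entries <= 0.0)
        return 1.0;

    fraction = partial_entries / num_entries;
    if (fraction > 1.0)
        fraction = 1.0;
    if (fraction < 0.0)
        fraction = 0.0;
    return fraction;
}

/* Extra pages charged to the scan for partial matches (simplified). */
double
partial_match_extra_pages(double partial_entries, double num_entries,
                          double entry_pages, double data_pages)
{
    return (entry_pages + data_pages) *
        partial_match_fraction(partial_entries, num_entries);
}
```

With inputs such as partialEntries = 100000 and numEntries = 1000, the unclamped ratio would be 100 and the scan would be charged for a hundred times more pages than the index contains; after clamping, the charge is capped at the full set of entry and data pages.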

Like the previous patch, back-patch to all supported branches; this
problem can be demonstrated in one form or another in all of them.
2016-01-01 13:42:21 -05:00
Makefile Remove some more dead Alpha-specific code. 2015-11-02 19:37:51 -05:00
acl.c pgindent run for 9.5 2015-05-23 21:35:49 -04:00
array_expanded.c Support "expanded" objects, particularly arrays, for better performance. 2015-05-14 12:08:49 -04:00
array_selfuncs.c Collection of typo fixes. 2015-05-20 16:56:22 +03:00
array_typanalyze.c Collection of typo fixes. 2015-05-20 16:56:22 +03:00
array_userfuncs.c Message improvements 2015-11-16 21:39:23 -05:00
arrayfuncs.c Remove unnecessary escaping in C character literals 2015-12-22 22:43:46 -05:00
arrayutils.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
ascii.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
bool.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
cash.c Collection of typo fixes. 2015-05-20 16:56:22 +03:00
char.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
date.c Fix incorrect translation of minus-infinity datetimes for json/jsonb. 2015-10-20 11:07:04 -07:00
datetime.c Move DTK_ISODOW DTK_DOW and DTK_DOY to be type UNITS rather than 2015-09-06 03:35:56 +01:00
datum.c Fix problems with ParamListInfo serialization mechanism. 2015-11-02 18:11:29 -05:00
dbsize.c pg_size_pretty: Format negative values similar to positive ones. 2015-11-06 11:03:02 -05:00
domains.c Use the typcache to cache constraints for domain types. 2015-03-01 14:06:55 -05:00
encode.c Message improvements 2015-11-16 21:39:23 -05:00
enum.c Remove enum-related special cases for catalog scans. 2015-04-29 15:48:44 -04:00
expandeddatum.c Support "expanded" objects, particularly arrays, for better performance. 2015-05-14 12:08:49 -04:00
float.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
format_type.c Allow postgres_fdw to ship extension funcs/operators for remote execution. 2015-11-03 18:42:18 -05:00
formatting.c to_number(): allow 'V' to divide by 10^(the number of digits) 2015-10-05 21:03:38 -04:00
genfile.c Add missing_ok option to the SQL functions for reading files. 2015-06-28 21:35:46 +03:00
geo_ops.c Add missing CHECK_FOR_INTERRUPTS in lseg_inside_poly 2015-12-14 16:44:40 -03:00
geo_selfuncs.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
inet_cidr_ntop.c pgindent run for 9.4 2014-05-06 12:12:18 -04:00
inet_net_pton.c Run pgindent on 9.2 source tree in preparation for first 9.3 2012-06-10 15:20:04 -04:00
int.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
int8.c Define integer limits independently from the system definitions. 2015-04-02 17:43:35 +02:00
json.c Remove unnecessary escaping in C character literals 2015-12-22 22:43:46 -05:00
jsonb.c Improve some messages 2015-12-10 22:05:27 -05:00
jsonb_gin.c Fix erroneous hash calculations in gin_extract_jsonb_path(). 2015-11-05 18:15:48 -05:00
jsonb_op.c Use JsonbIteratorToken consistently in automatic variable declarations. 2015-10-11 23:53:35 -04:00
jsonb_util.c Use JsonbIteratorToken consistently in automatic variable declarations. 2015-10-11 23:53:35 -04:00
jsonfuncs.c Improve some messages 2015-12-10 22:05:27 -05:00
levenshtein.c pgindent run for 9.5 2015-05-23 21:35:49 -04:00
like.c Add recursion depth protection to LIKE matching. 2015-10-02 15:00:51 -04:00
like_match.c Add recursion depth protection to LIKE matching. 2015-10-02 15:00:51 -04:00
lockfuncs.c pgindent run for 9.5 2015-05-23 21:35:49 -04:00
mac.c Allow input format xxxx-xxxx-xxxx for macaddr type 2014-10-21 16:16:39 -04:00
misc.c Message style improvements 2015-10-28 20:38:36 -04:00
nabstime.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
name.c Add new OID alias type regrole 2015-05-09 13:06:49 -04:00
network.c Add geometry/range functions to support BRIN inclusion 2015-05-05 15:22:24 -03:00
network_gist.c pgindent run for 9.5 2015-05-23 21:35:49 -04:00
network_selfuncs.c Provide real selectivity estimators for inet/cidr operators. 2015-04-01 17:11:21 -04:00
numeric.c Improve div_var_fast(), mostly by making comments better. 2015-11-25 16:05:57 -05:00
numutils.c Define integer limits independently from the system definitions. 2015-04-02 17:43:35 +02:00
oid.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
oracle_compat.c Remove spurious semicolons. 2015-03-31 15:12:27 +03:00
orderedsetaggs.c Extend abbreviated key infrastructure to datum tuplesorts. 2015-05-13 14:36:26 -04:00
pg_locale.c Revoke support for strxfrm() that write past the specified array length. 2015-07-08 20:44:21 -04:00
pg_lsn.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
pg_upgrade_support.c pgindent run for 9.5 2015-05-23 21:35:49 -04:00
pgstatfuncs.c pgindent run for 9.5 2015-05-23 21:35:49 -04:00
pseudotypes.c Redesign tablesample method API, and do extensive code review. 2015-07-25 14:39:00 -04:00
quote.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
rangetypes.c Remove unnecessary escaping in C character literals 2015-12-22 22:43:46 -05:00
rangetypes_gist.c Move strategy numbers to include/access/stratnum.h 2015-05-15 17:03:16 -03:00
rangetypes_selfuncs.c Collection of typo fixes. 2015-05-20 16:56:22 +03:00
rangetypes_spgist.c pgindent run for 9.5 2015-05-23 21:35:49 -04:00
rangetypes_typanalyze.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
regexp.c pgindent run for 9.5 2015-05-23 21:35:49 -04:00
regproc.c Fix misc typos. 2015-09-05 11:35:49 +03:00
ri_triggers.c ALTER TABLE .. FORCE ROW LEVEL SECURITY 2015-10-04 21:05:08 -04:00
rowtypes.c Remove unnecessary escaping in C character literals 2015-12-22 22:43:46 -05:00
ruleutils.c Allow omitting one or both boundaries in an array slice specifier. 2015-12-22 21:05:29 -05:00
selfuncs.c Add some more defenses against silly estimates to gincostestimate(). 2016-01-01 13:42:21 -05:00
tid.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
timestamp.c Fix incorrect translation of minus-infinity datetimes for json/jsonb. 2015-10-20 11:07:04 -07:00
trigfuncs.c Use FLEXIBLE_ARRAY_MEMBER for HeapTupleHeaderData.t_bits[]. 2015-02-21 15:13:06 -05:00
tsginidx.c Move strategy numbers to include/access/stratnum.h 2015-05-15 17:03:16 -03:00
tsgistidx.c Reorganize our CRC source files again. 2015-04-14 17:03:42 +03:00
tsquery.c Reorganize our CRC source files again. 2015-04-14 17:03:42 +03:00
tsquery_cleanup.c Prevent stack overflow in query-type functions. 2015-10-05 10:06:30 -04:00
tsquery_gist.c Move strategy numbers to include/access/stratnum.h 2015-05-15 17:03:16 -03:00
tsquery_op.c pgindent run for 9.5 2015-05-23 21:35:49 -04:00
tsquery_rewrite.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
tsquery_util.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
tsrank.c Centralize definition of integer limits. 2015-03-25 22:39:42 +01:00
tsvector.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
tsvector_op.c Add header forgotten in 213335c145 2015-09-18 14:32:09 +03:00
tsvector_parser.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
txid.c Fix a number of places that produced XX000 errors in the regression tests. 2015-08-02 23:49:19 -04:00
uuid.c Remove unnecessary cast in previous commit. 2015-11-06 12:17:31 -05:00
varbit.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
varchar.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
varlena.c Remove unnecessary escaping in C character literals 2015-12-22 22:43:46 -05:00
version.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
windowfuncs.c Update copyright for 2015 2015-01-06 11:43:47 -05:00
xid.c Add "xid <> xid" and "xid <> int4" operators. 2015-11-07 16:40:15 -05:00
xml.c Use appendStringInfoString/Char et al where appropriate. 2015-07-02 12:36:03 +03:00