Commit Graph

411 Commits

Author SHA1 Message Date
Tom Lane 5e09280057 Make pg_statistic and related code account more honestly for collations.
When we first put in collations support, we basically punted on teaching
pg_statistic, ANALYZE, and the planner selectivity functions about that.
They've just used DEFAULT_COLLATION_OID independently of the actual
collation of the data.  It's time to improve that, so:

* Add columns to pg_statistic that record the specific collation associated
with each statistics slot.

* Teach ANALYZE to use the column's actual collation when comparing values
for statistical purposes, and record this in the appropriate slot.  (Note
that type-specific typanalyze functions are now expected to fill
stats->stacoll with the appropriate collation, too.)

* Teach assorted selectivity functions to use the actual collation of
the stats they are looking at, instead of just assuming it's
DEFAULT_COLLATION_OID.

This should give noticeably better results in selectivity estimates for
columns with nondefault collations, at least for query clauses that use
that same collation (which would be the default behavior in most cases).
It's still true that comparisons with explicit COLLATE clauses different
from the stored data's collation won't be well-estimated, but that's no
worse than before.  Also, this patch does make the first step towards
doing better with that, which is that it's now theoretically possible to
collect stats for a collation other than the column's own collation.

Patch by me; thanks to Peter Eisentraut for review.

Discussion: https://postgr.es/m/14706.1544630227@sss.pgh.pa.us
2018-12-14 12:52:49 -05:00
Tom Lane a314c34079 Clamp semijoin selectivity to be not more than inner-join selectivity.
We should never estimate the output of a semijoin to be more rows than
we estimate for an inner join with the same input rels and join condition;
it's obviously impossible for that to happen.  However, given the
relatively poor quality of our semijoin selectivity estimates ---
particularly, but not only, in cases where we punt and return a default
estimate --- we did often deliver such estimates.  To improve matters,
calculate both estimates inside eqjoinsel() and take the smaller one.

The bulk of this patch is just mechanical refactoring to avoid repetitive
information lookup when we call both eqjoinsel_semi and eqjoinsel_inner.
The actual new behavior is just

	selec = Min(selec, inner_rel->rows * selec_inner);

which looks a bit odd but is correct because of our different definitions
for inner and semi join selectivity.

There is one ensuing plan change in the regression tests, but it looks
reasonable enough (and checking the actual row counts shows that the
estimate moved closer to reality, not further away).

Per bug #15160 from Alexey Ermakov.  Although this is arguably a bug fix,
I won't risk destabilizing plan choices in stable branches by
back-patching.

Tom Lane, reviewed by Melanie Plageman

Discussion: https://postgr.es/m/152395805004.19366.3107109716821067806@wrigleys.postgresql.org
2018-11-23 12:48:49 -05:00
Andres Freund 578b229718 Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.

This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row.  Neither pg_dump nor COPY included the contents of the
oid column by default.

The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing, but upcoming work aiming to make
table storage pluggable, would have required expanding and duplicating
that "specialness" significantly.

WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.

Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
  WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
  issue a warning when dumping one (and ignore the oid column).
- restoring an pg_dump archive with pg_restore will warn when
  restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
  OIDS, they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
  plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.

The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.

The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such.  This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.

The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used), only oids assigned later by oids will be above
FirstBootstrapObjectId. As the oid column now is a normal column the
special bootstrap syntax for oids has been removed.

Oids are not automatically assigned during insertion anymore, all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for
the new pg_nextoid() function can be used (which only works on catalog
tables).

The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.

While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get merge this
now. It's painful to maintain externally, too complicated to commit
after the code code freeze, and a dependency of a number of other
patches.

Catversion bump, for obvious reasons.

Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-20 16:00:17 -08:00
Andres Freund 1a0586de36 Introduce notion of different types of slots (without implementing them).
Upcoming work intends to allow pluggable ways to introduce new ways of
storing table data. Accessing those table access methods from the
executor requires TupleTableSlots to be carry tuples in the native
format of such storage methods; otherwise there'll be a significant
conversion overhead.

Different access methods will require different data to store tuples
efficiently (just like virtual, minimal, heap already require fields
in TupleTableSlot). To allow that without requiring additional pointer
indirections, we want to have different structs (embedding
TupleTableSlot) for different types of slots.  Thus different types of
slots are needed, which requires adapting creators of slots.

The slot that most efficiently can represent a type of tuple in an
executor node will often depend on the type of slot a child node
uses. Therefore we need to track the type of slot is returned by
nodes, so parent slots can create slots based on that.

Relatedly, JIT compilation of tuple deforming needs to know which type
of slot a certain expression refers to, so it can create an
appropriate deforming function for the type of tuple in the slot.

But not all nodes will only return one type of slot, e.g. an append
node will potentially return different types of slots for each of its
subplans.

Therefore add function that allows to query the type of a node's
result slot, and whether it'll always be the same type (whether it's
fixed). This can be queried using ExecGetResultSlotOps().

The scan, result, inner, outer type of slots are automatically
inferred from ExecInitScanTupleSlot(), ExecInitResultSlot(),
left/right subtrees respectively. If that's not correct for a node,
that can be overwritten using new fields in PlanState.

This commit does not introduce the actually abstracted implementation
of different kind of TupleTableSlots, that will be left for a followup
commit.  The different types of slots introduced will, for now, still
use the same backing implementation.

While this already partially invalidates the big comment in
tuptable.h, it seems to make more sense to update it later, when the
different TupleTableSlot implementations actually exist.

Author: Ashutosh Bapat and Andres Freund, with changes by Amit Khandekar
Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
2018-11-15 22:00:30 -08:00
Andres Freund cda6a8d01d Remove deprecated abstime, reltime, tinterval datatypes.
These types have been deprecated for a *long* time.

Catversion bump, for obvious reasons.

Author: Andres Freund
Discussion:
    https://postgr.es/m/20181009192237.34wjp3nmw7oynmmr@alap3.anarazel.de
    https://postgr.es/m/20171213080506.cwjkpcz3bkk6yz2u@alap3.anarazel.de
    https://postgr.es/m/25615.1513115237@sss.pgh.pa.us
2018-10-11 11:59:15 -07:00
Andres Freund 29c94e03c7 Split ExecStoreTuple into ExecStoreHeapTuple and ExecStoreBufferHeapTuple.
Upcoming changes introduce further types of tuple table slots, in
preparation of making table storage pluggable. New storage methods
will have different representation of tuples, therefore the slot
accessor should refer explicitly to heap tuples.

Instead of just renaming the functions, split it into one function
that accepts heap tuples not residing in buffers, and one accepting
ones in buffers.  Previously one function was used for both, but that
was a bit awkward already, and splitting will allow us to represent
slot types for tuples in buffers and normal memory separately.

This is split out from the patch introducing abstract slots, as this
largely consists out of mechanical changes.

Author: Ashutosh Bapat
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de
2018-09-25 16:27:48 -07:00
Alvaro Herrera f2c587067a Rethink how to get float.h in old Windows API for isnan/isinf
We include <float.h> in every place that needs isnan(), because MSVC
used to require it.  However, since MSVC 2013 that's no longer necessary
(cf. commit cec8394b5c), so we can retire the inclusion to a
version-specific stanza in win32_port.h, where it doesn't need to
pollute random .c files.  The header is of course still needed in a few
places for other reasons.

I (Álvaro) removed float.h from a few more files than in Emre's original
patch.  This doesn't break the build in my system, but we'll see what
the buildfarm has to say about it all.

Author: Emre Hasegeli
Discussion: https://postgr.es/m/CAE2gYzyc0+5uG+Cd9-BSL7NKC8LSHLNg1Aq2=8ubjnUwut4_iw@mail.gmail.com
2018-07-11 09:11:48 -04:00
Teodor Sigaev c266ed31a8 Cleanup covering infrastructure
- Explicitly forbids opclass, collation and indoptions (like DESC/ASC etc) for
  including columns. Throw an error if user points that.
- Truncated storage arrays for such attributes to store only key atrributes,
  added assertion checks.
- Do not check opfamily and collation for including columns in
  CompareIndexInfo()

Discussion: https://www.postgresql.org/message-id/5ee72852-3c4e-ee35-e2ed-c1d053d45c08@sigaev.ru
2018-04-12 16:37:22 +03:00
Teodor Sigaev 8224de4f42 Indexes with INCLUDE columns and their support in B-tree
This patch introduces INCLUDE clause to index definition.  This clause
specifies a list of columns which will be included as a non-key part in
the index.  The INCLUDE columns exist solely to allow more queries to
benefit from index-only scans.  Also, such columns don't need to have
appropriate operator classes.  Expressions are not supported as INCLUDE
columns since they cannot be used in index-only scans.

Index access methods supporting INCLUDE are indicated by amcaninclude flag
in IndexAmRoutine.  For now, only B-tree indexes support INCLUDE clause.

In B-tree indexes INCLUDE columns are truncated from pivot index tuples
(tuples located in non-leaf pages and high keys).  Therefore, B-tree indexes
now might have variable number of attributes.  This patch also provides
generic facility to support that: pivot tuples contain number of their
attributes in t_tid.ip_posid.  Free 13th bit of t_info is used for indicating
that.  This facility will simplify further support of index suffix truncation.
The changes of above are backward-compatible, pg_upgrade doesn't need special
handling of B-tree indexes for that.

Bump catalog version

Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me
Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes,
			 David Rowley, Alexander Korotkov
Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru
2018-04-07 23:00:39 +03:00
Teodor Sigaev 710d90da1f Add prefix operator for TEXT type.
The prefix operator along with SP-GiST indexes can be used as an alternative
for LIKE 'word%' commands  and it doesn't have a limitation of string/prefix
length as B-Tree has.

Bump catalog version

Author: Ildus Kurbangaliev with some editorization by me
Review by: Arthur Zakirov, Alexander Korotkov, and me
Discussion: https://www.postgresql.org/message-id/flat/20180202180327.222b04b3@wp.localdomain
2018-04-03 19:46:45 +03:00
Tom Lane 58d9acc18d Fix assorted issues in convert_to_scalar().
If convert_to_scalar is passed a pair of datatypes it can't cope with,
its former behavior was just to elog(ERROR).  While this is OK so far as
the core code is concerned, there's extension code that would like to use
scalarltsel/scalargtsel/etc as selectivity estimators for operators that
work on non-core datatypes, and this behavior is a show-stopper for that
use-case.  If we simply allow convert_to_scalar to return FALSE instead of
outright failing, then the main logic of scalarltsel/scalargtsel will work
fine for any operator that behaves like a scalar inequality comparison.
The lack of conversion capability will mean that we can't estimate to
better than histogram-bin-width precision, since the code will effectively
assume that the comparison constant falls at the middle of its bin.  But
that's still a lot better than nothing.  (Someday we should provide a way
for extension code to supply a custom version of convert_to_scalar, but
today is not that day.)

While poking at this issue, we noted that the existing code for handling
type bytea in convert_to_scalar is several bricks shy of a load.
It assumes without checking that if the comparison value is type bytea,
the bounds values are too; in the worst case this could lead to a crash.
It also fails to detoast the input values, so that the comparison result is
complete garbage if any input is toasted out-of-line, compressed, or even
just short-header.  I'm not sure how often such cases actually occur ---
the bounds values, at least, are probably safe since they are elements of
an array and hence can't be toasted.  But that doesn't make this code OK.

Back-patch to all supported branches, partly because author requested that,
but mostly because of the bytea bugs.  The change in API for the exposed
routine convert_network_to_scalar() is theoretically a back-patch hazard,
but it seems pretty unlikely that any third-party code is calling that
function directly.

Tomas Vondra, with some adjustments by me

Discussion: https://postgr.es/m/b68441b6-d18f-13ab-b43b-9a72188a4e02@2ndquadrant.com
2018-03-03 20:31:35 -05:00
Bruce Momjian 9d4649ca49 Update copyright for 2018
Backpatch-through: certain files through 9.3
2018-01-02 23:30:12 -05:00
Tom Lane 7ca25b7de6 Fix neqjoinsel's behavior for semi/anti join cases.
Previously, this function estimated the selectivity as 1 minus eqjoinsel()
for the negator equality operator, regardless of join type (I think there
was an expectation that eqjoinsel would handle the join type).  But
actually this is completely wrong for semijoin cases: the fraction of the
LHS that has a non-matching row is not one minus the fraction of the LHS
that has a matching row.  In reality a semijoin with <> will nearly always
succeed: it can only fail when the RHS is empty, or it contains a single
distinct value that is equal to the particular LHS value, or the LHS value
is null.  The only one of those things we should have much confidence in
estimating is the fraction of LHS values that are null, so let's just take
the selectivity as 1 minus outer nullfrac.

Per coding convention, antijoin should be estimated the same as semijoin.

Arguably this is a bug fix, but in view of the lack of field complaints
and the risk of destabilizing plans, no back-patch.

Thomas Munro, reviewed by Ashutosh Bapat

Discussion: https://postgr.es/m/CAEepm=270ze2hVxWkJw-5eKzc3AB4C9KpH3L2kih75R5pdSogg@mail.gmail.com
2017-11-29 22:00:37 -05:00
Tom Lane df3a66e282 Improve planner's handling of set-returning functions in grouping columns.
Improve query_is_distinct_for() to accept SRFs in the targetlist when
we can prove distinctness from a DISTINCT clause.  In that case the
de-duplication will surely happen after SRF expansion, so the proof
still works.  Continue to punt in the case where we'd try to prove
distinctness from GROUP BY (or, in the future, source relations).
To do that, we'd have to determine whether the SRFs were in the
grouping columns or elsewhere in the tlist, and it still doesn't
seem worth the trouble.  But this trivial change allows us to
recognize that "SELECT DISTINCT unnest(foo) FROM ..." produces
unique-ified output, which seems worth having.

Also, fix estimate_num_groups() to consider the possibility of SRFs in
the grouping columns.  Its failure to do so was masked before v10 because
grouping_planner() scaled up plan rowcount estimates by the estimated SRF
multiplier after performing grouping.  That doesn't happen anymore, which
is more correct, but it means we need an adjustment in the estimate for
the number of groups.  Failure to do this leads to an underestimate for
the number of output rows of subqueries like "SELECT DISTINCT unnest(foo)"
compared to what 9.6 and earlier estimated, thus breaking plan choices
in some cases.

Per report from Dmitry Shalashov.  Back-patch to v10 to avoid degraded
plan choices compared to previous releases.

Discussion: https://postgr.es/m/CAKPeCUGAeHgoh5O=SvcQxREVkoX7UdeJUMj1F5=aBNvoTa+O8w@mail.gmail.com
2017-11-25 11:48:09 -05:00
Peter Eisentraut 2eb4a831e5 Change TRUE/FALSE to true/false
The lower case spellings are C and C++ standard and are used in most
parts of the PostgreSQL sources.  The upper case spellings are only used
in some files/modules.  So standardize on the standard spellings.

The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so
those are left as is when using those APIs.

In code comments, we use the lower-case spelling for the C concepts and
keep the upper-case spelling for the SQL concepts.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2017-11-08 11:37:28 -05:00
Tom Lane c12d570fa1 Support arrays over domains.
Allowing arrays with a domain type as their element type was left un-done
in the original domain patch, but not for any very good reason.  This
omission leads to such surprising results as array_agg() not working on
a domain column, because the parser can't identify a suitable output type
for the polymorphic aggregate.

In order to fix this, first clean up the APIs of coerce_to_domain() and
some internal functions in parse_coerce.c so that we consistently pass
around a CoercionContext along with CoercionForm.  Previously, we sometimes
passed an "isExplicit" boolean flag instead, which is strictly less
information; and coerce_to_domain() didn't even get that, but instead had
to reverse-engineer isExplicit from CoercionForm.  That's contrary to the
documentation in primnodes.h that says that CoercionForm only affects
display and not semantics.  I don't think this change fixes any live bugs,
but it makes things more consistent.  The main reason for doing it though
is that now build_coercion_expression() receives ccontext, which it needs
in order to be able to recursively invoke coerce_to_target_type().

Next, reimplement ArrayCoerceExpr so that the node does not directly know
any details of what has to be done to the individual array elements while
performing the array coercion.  Instead, the per-element processing is
represented by a sub-expression whose input is a source array element and
whose output is a target array element.  This simplifies life in
parse_coerce.c, because it can build that sub-expression by a recursive
invocation of coerce_to_target_type().  The executor now handles the
per-element processing as a compiled expression instead of hard-wired code.
The main advantage of this is that we can use a single ArrayCoerceExpr to
handle as many as three successive steps per element: base type conversion,
typmod coercion, and domain constraint checking.  The old code used two
stacked ArrayCoerceExprs to handle type + typmod coercion, which was pretty
inefficient, and adding yet another array deconstruction to do domain
constraint checking seemed very unappetizing.

In the case where we just need a single, very simple coercion function,
doing this straightforwardly leads to a noticeable increase in the
per-array-element runtime cost.  Hence, add an additional shortcut evalfunc
in execExprInterp.c that skips unnecessary overhead for that specific form
of expression.  The runtime speed of simple cases is within 1% or so of
where it was before, while cases that previously required two levels of
array processing are significantly faster.

Finally, create an implicit array type for every domain type, as we do for
base types, enums, etc.  Everything except the array-coercion case seems
to just work without further effort.

Tom Lane, reviewed by Andrew Dunstan

Discussion: https://postgr.es/m/9852.1499791473@sss.pgh.pa.us
2017-09-30 13:40:56 -04:00
Tom Lane 7d08ce286c Distinguish selectivity of < from <= and > from >=.
Historically, the selectivity functions have simply not distinguished
< from <=, or > from >=, arguing that the fraction of the population that
satisfies the "=" aspect can be considered to be vanishingly small, if the
comparison value isn't any of the most-common-values for the variable.
(If it is, the code path that executes the operator against each MCV will
take care of things properly.)  But that isn't really true unless we're
dealing with a continuum of variable values, and in practice we seldom are.
If "x = const" would estimate a nonzero number of rows for a given const
value, then it follows that we ought to estimate different numbers of rows
for "x < const" and "x <= const", even if the const is not one of the MCVs.
Handling this more honestly makes a significant difference in edge cases,
such as the estimate for a tight range (x BETWEEN y AND z where y and z
are close together).

Hence, split scalarltsel into scalarltsel/scalarlesel, and similarly
split scalargtsel into scalargtsel/scalargesel.  Adjust <= and >=
operator definitions to reference the new selectivity functions.
Improve the core ineq_histogram_selectivity() function to make a
correction for equality.  (Along the way, I learned quite a bit about
exactly why that function gives good answers, which I tried to memorialize
in improved comments.)

The corresponding join selectivity functions were, and remain, just stubs.
But I chose to split them similarly, to avoid confusion and to prevent the
need for doing this exercise again if someone ever makes them less stubby.

In passing, change ineq_histogram_selectivity's clamp for extreme
probability estimates so that it varies depending on the histogram
size, instead of being hardwired at 0.0001.  With the default histogram
size of 100 entries, you still get the old clamp value, but bigger
histograms should allow us to put more faith in edge values.

Tom Lane, reviewed by Aleksander Alekseev and Kuntal Ghosh

Discussion: https://postgr.es/m/12232.1499140410@sss.pgh.pa.us
2017-09-13 11:12:39 -04:00
Tom Lane 3ca930fc39 Improve performance of get_actual_variable_range with recently-dead tuples.
In commit fccebe421, we hacked get_actual_variable_range() to scan the
index with SnapshotDirty, so that if there are many uncommitted tuples
at the end of the index range, it wouldn't laboriously scan through all
of them looking for a live value to return.  However, that didn't fix it
for the case of many recently-dead tuples at the end of the index;
SnapshotDirty recognizes those as committed dead and so we're back to
the same problem.

To improve the situation, invent a "SnapshotNonVacuumable" snapshot type
and use that instead.  The reason this helps is that, if the snapshot
rejects a given index entry, we know that the indexscan will mark that
index entry as killed.  This means the next get_actual_variable_range()
scan will proceed past that entry without visiting the heap, making the
scan a lot faster.  We may end up accepting a recently-dead tuple as
being the estimated extremal value, but that doesn't seem much worse than
the compromise we made before to accept not-yet-committed extremal values.

The cost of the scan is still proportional to the number of dead index
entries at the end of the range, so in the interval after a mass delete
but before VACUUM's cleaned up the mess, it's still possible for
get_actual_variable_range() to take a noticeable amount of time, if you've
got enough such dead entries.  But the constant factor is much much better
than before, since all we need to do with each index entry is test its
"killed" bit.

We chose to back-patch commit fccebe421 at the time, but I'm hesitant to
do so here, because this form of the problem seems to affect many fewer
people.  Also, even when it happens, it's less bad than the case fixed
by commit fccebe421 because we don't get the contention effects from
expensive TransactionIdIsInProgress tests.

Dmitriy Sarafannikov, reviewed by Andrey Borodin

Discussion: https://postgr.es/m/05C72CF7-B5F6-4DB9-8A09-5AC897653113@yandex.ru
2017-09-07 19:41:51 -04:00
Tom Lane 2b74303637 Make the planner assume that the entries in a VALUES list are distinct.
Previously, if we had to estimate the number of distinct values in a
VALUES column, we fell back on the default behavior used whenever we lack
statistics, which effectively is that there are Min(# of entries, 200)
distinct values.  This can be very badly off with a large VALUES list,
as noted by Jeff Janes.

We could consider actually running an ANALYZE-like scan on the VALUES,
but that seems unduly expensive, and anyway it could not deliver reliable
info if the entries are not all constants.  What seems like a better choice
is to assume that the values are all distinct.  This will sometimes be just
as wrong as the old code, but it seems more likely to be more nearly right
in many common cases.  Also, it is more consistent with what happens in
some related cases, for example WHERE x = ANY(ARRAY[1,2,3,...,n]) and
WHERE x = ANY(VALUES (1),(2),(3),...,(n)) now are estimated similarly.

This was discussed some time ago, but consensus was it'd be better
to slip it in at the start of a development cycle not near the end.
(It should've gone into v10, really, but I forgot about it.)

Discussion: https://postgr.es/m/CAMkU=1xHkyPa8VQgGcCNg3RMFFvVxUdOpus1gKcFuvVi0w6Acg@mail.gmail.com
2017-08-16 15:37:20 -04:00
Tom Lane 4867d7f62f Avoid out-of-memory in a hash join with many duplicate inner keys.
The executor is capable of splitting buckets during a hash join if
too much memory is being used by a small number of buckets.  However,
this only helps if a bucket's population is actually divisible; if
all the hash keys are alike, the tuples still end up in the same
new bucket.  This can result in an OOM failure if there are enough
inner keys with identical hash values.  The planner's cost estimates
will bias it against choosing a hash join in such situations, but not
by so much that it will never do so.  To mitigate the OOM hazard,
explicitly estimate the hash bucket space needed by just the inner
side's most common value, and if that would exceed work_mem then
add disable_cost to the hash cost estimate.

This approach doesn't account for the possibility that two or more
common values would share the same hash value.  On the other hand,
work_mem is normally a fairly conservative bound, so that eating
two or more times that much space is probably not going to kill us.

If we have no stats about the inner side, ignore this consideration.
There was some discussion of making a conservative assumption, but that
would effectively result in disabling hash join whenever we lack stats,
which seems like an overreaction given how seldom the problem manifests
in the field.

Per a complaint from David Hinkle.  Although this could be viewed
as a bug fix, the lack of similar complaints weighs against back-
patching; indeed we waited for v11 because it seemed already rather
late in the v10 cycle to be making plan choice changes like this one.

Discussion: https://postgr.es/m/32013.1487271761@sss.pgh.pa.us
2017-08-15 14:05:53 -04:00
Tom Lane 780b3a4c43 Manually un-break a few URLs that pgindent used to insist on splitting.
These will no longer get re-split by pgindent runs, so it's worth cleaning
them up now.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 16:02:08 -04:00
Tom Lane 382ceffdf7 Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.

By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis.  However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent.  That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.

This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:35:54 -04:00
Tom Lane c7b8998ebb Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.

Commit e3860ffa4d wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code.  The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there.  BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs.  So the
net result is that in about half the cases, such comments are placed
one tab stop left of before.  This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.

Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:19:25 -04:00
Tom Lane e3860ffa4d Initial pgindent run with pg_bsd_indent version 2.0.
The new indent version includes numerous fixes thanks to Piotr Stefaniak.
The main changes visible in this commit are:

* Nicer formatting of function-pointer declarations.
* No longer unexpectedly removes spaces in expressions using casts,
  sizeof, or offsetof.
* No longer wants to add a space in "struct structname *varname", as
  well as some similar cases for const- or volatile-qualified pointers.
* Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely.
* Fixes bug where comments following declarations were sometimes placed
  with no space separating them from the code.
* Fixes some odd decisions for comments following case labels.
* Fixes some cases where comments following code were indented to less
  than the expected column 33.

On the less good side, it now tends to put more whitespace around typedef
names that are not listed in typedefs.list.  This might encourage us to
put more effort into typedef name collection; it's not really a bug in
indent itself.

There are more changes coming after this round, having to do with comment
indentation and alignment of lines appearing within parentheses.  I wanted
to limit the size of the diffs to something that could be reviewed without
one's eyes completely glazing over, so it seemed better to split up the
changes as much as practical.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 14:39:04 -04:00
Robert Haas b08df9cab7 Teach predtest.c about CHECK clauses to fix partitioning bugs.
In a CHECK clause, a null result means true, whereas in a WHERE clause
it means false.  predtest.c provided different functions depending on
which set of semantics applied to the predicate being proved, but had
no option to control what a null meant in the clauses provided as
axioms.  Add one.

Use that in the partitioning code when figuring out whether the
validation scan on a new partition can be skipped.  Rip out the
old logic that attempted (not very successfully) to compensate
for the absence of the necessary support in predtest.c.

Ashutosh Bapat and Robert Haas, reviewed by Amit Langote and
incorporating feedback from Tom Lane.

Discussion: http://postgr.es/m/CAFjFpReT_kq_uwU_B8aWDxR7jNGE=P0iELycdq5oupi=xSQTOw@mail.gmail.com
2017-06-14 13:13:11 -04:00
Tom Lane 5936d25f81 Remove dead variables.
Commit 512c7356b left a couple of variables unused except for being set.
My compiler didn't whine about this, but some buildfarm members did.
2017-06-03 20:35:52 -04:00
Tom Lane 512c7356b6 Fix <> and pattern-NOT-match estimators to handle nulls correctly.
These estimators returned 1 minus the corresponding equality/match
estimate, which is incorrect: we need to subtract off the fraction
of nulls in the column, since those are neither equal nor not equal
to the comparison value.  The error only becomes obvious if the
nullfrac is large, but it could be very bad in a mostly-nulls
column, as reported in bug #14676 from Marko Tiikkaja.

To fix the <> case, refactor eqsel() and neqsel() to call a common
support routine, which can be made to account for nullfrac correctly.
The pattern-match cases were already factored that way, and it was
simply an oversight that patternsel() wasn't subtracting off nullfrac.

neqjoinsel() has a similar problem, but since we're elsewhere discussing
changing its behavior entirely, I left it alone for now.

This is a very longstanding bug, but I'm hesitant to back-patch a fix for
it.  Given the lack of prior complaints, such cases must not come up often,
so it's probably not worth the risk of destabilizing plans in stable
branches.

Discussion: https://postgr.es/m/20170529153847.4275.95416@wrigleys.postgresql.org
2017-06-03 14:36:25 -04:00
Bruce Momjian a6fd7b7a5f Post-PG 10 beta1 pgindent run
perltidy run not included.
2017-05-17 16:31:56 -04:00
Tom Lane f04c9a6146 Standardize terminology for pg_statistic_ext entries.
Consistently refer to such an entry as a "statistics object", not just
"statistics" or "extended statistics".  Previously we had a mismash of
terms, accompanied by utter confusion as to whether the term was
singular or plural.  That's not only grating (at least to the ear of
a native English speaker) but could be outright misleading, eg in error
messages that seemed to be referring to multiple objects where only one
could be meant.

This commit fixes the code and a lot of comments (though I may have
missed a few).  I also renamed two new SQL functions,
pg_get_statisticsextdef -> pg_get_statisticsobjdef
pg_statistic_ext_is_visible -> pg_statistics_obj_is_visible
to conform better with this terminology.

I have not touched the SGML docs other than fixing those function
names; the docs certainly need work but it seems like a separable task.

Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us
2017-05-14 10:55:01 -04:00
Tom Lane 9aab83fc50 Redesign get_attstatsslot()/free_attstatsslot() for more safety and speed.
The mess cleaned up in commit da0759600 is clear evidence that it's a
bug hazard to expect the caller of get_attstatsslot()/free_attstatsslot()
to provide the correct type OID for the array elements in the slot.
Moreover, we weren't even getting any performance benefit from that,
since get_attstatsslot() was extracting the real type OID from the array
anyway.  So we ought to get rid of that requirement; indeed, it would
make more sense for get_attstatsslot() to pass back the type OID it found,
in case the caller isn't sure what to expect, which is likely in binary-
compatible-operator cases.

Another problem with the current implementation is that if the stats array
element type is pass-by-reference, we incur a palloc/memcpy/pfree cycle
for each element.  That seemed acceptable when the code was written because
we were targeting O(10) array sizes --- but these days, stats arrays are
almost always bigger than that, sometimes much bigger.  We can save a
significant number of cycles by doing one palloc/memcpy/pfree of the whole
array.  Indeed, in the now-probably-common case where the array is toasted,
that happens anyway so this method is basically free.  (Note: although the
catcache code will inline any out-of-line toasted values, it doesn't
decompress them.  At the other end of the size range, it doesn't expand
short-header datums either.  In either case, DatumGetArrayTypeP would have
to make a copy.  We do end up using an extra array copy step if the element
type is pass-by-value and the array length is neither small enough for a
short header nor large enough to have suffered compression.  But that
seems like a very acceptable price for winning in pass-by-ref cases.)

Hence, redesign to take these insights into account.  While at it,
convert to an API in which we fill a struct rather than passing a bunch
of pointers to individual output arguments.  That will make it less
painful if we ever want further expansion of what get_attstatsslot can
pass back.

It's certainly arguable that this is new development and not something to
push post-feature-freeze.  However, I view it as primarily bug-proofing
and therefore something that's better to have sooner not later.  Since
we aren't quite at beta phase yet, let's put it in.

Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us
2017-05-13 15:14:39 -04:00
Peter Eisentraut e2d4ef8de8 Add security checks to selectivity estimation functions
Some selectivity estimation functions run user-supplied operators over
data obtained from pg_statistic without security checks, which allows
those operators to leak pg_statistic data without having privileges on
the underlying tables.  Fix by checking that one of the following is
satisfied: (1) the user has table or column privileges on the table
underlying the pg_statistic data, or (2) the function implementing the
user-supplied operator is leak-proof.  If neither is satisfied, planning
will proceed as if there are no statistics available.

At least one of these is satisfied in most cases in practice.  The only
situations that are negatively impacted are user-defined or
not-leak-proof operators on a security-barrier view.

Reported-by: Robert Haas <robertmhaas@gmail.com>
Author: Peter Eisentraut <peter_e@gmx.net>
Author: Tom Lane <tgl@sss.pgh.pa.us>

Security: CVE-2017-7484
2017-05-08 09:26:32 -04:00
Tom Lane 8f0530f580 Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type.  For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))".  Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.

As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.

Patch by me, after an idea of Andrew Gierth's.

Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 13:51:53 -04:00
Simon Riggs ac2b095088 Reset API of clause_selectivity()
Discussion: https://postgr.es/m/CAKJS1f9yurJQW9pdnzL+rmOtsp2vOytkpXKGnMFJEO-qz5O5eA@mail.gmail.com
2017-04-06 19:10:51 -04:00
Alvaro Herrera 7e534adcdc Fix BRIN cost estimation
The original code was overly optimistic about the cost of scanning a
BRIN index, leading to BRIN indexes being selected when they'd be a
worse choice than some other index.  This complete rewrite should be
more accurate.

Author: David Rowley, based on an earlier patch by Emre Hasegeli
Reviewed-by: Emre Hasegeli
Discussion: https://postgr.es/m/CAKJS1f9n-Wapop5Xz1dtGdpdqmzeGqQK4sV2MK-zZugfC14Xtw@mail.gmail.com
2017-04-06 17:51:53 -03:00
Simon Riggs 2686ee1b7c Collect and use multi-column dependency stats
Follow on patch in the multi-variate statistics patch series.

CREATE STATISTICS s1 WITH (dependencies) ON (a, b) FROM t;
ANALYZE;
will collect dependency stats on (a, b) and then use the measured
dependency in subsequent query planning.

Commit 7b504eb282 added
CREATE STATISTICS with n-distinct coefficients. These are now
specified using the mutually exclusive option WITH (ndistinct).

Author: Tomas Vondra, David Rowley
Reviewed-by: Kyotaro HORIGUCHI, Álvaro Herrera, Dean Rasheed, Robert Haas
and many other comments and contributions
Discussion: https://postgr.es/m/56f40b20-c464-fad2-ff39-06b668fac47c@2ndquadrant.com
2017-04-05 18:00:42 -04:00
Robert Haas 7a39b5e4d1 Abstract logic to allow for multiple kinds of child rels.
Currently, the only type of child relation is an "other member rel",
which is the child of a baserel, but in the future joins and even
upper relations may have child rels.  To facilitate that, introduce
macros that test to test for particular RelOptKind values, and use
them in various places where they help to clarify the sense of a test.
(For example, a test may allow RELOPT_OTHER_MEMBER_REL either because
it intends to allow child rels, or because it intends to allow simple
rels.)

Also, remove find_childrel_top_parent, which will not work for a
child rel that is not a baserel.  Instead, add a new RelOptInfo
member top_parent_relids to track the same kind of information in a
more generic manner.

Ashutosh Bapat, slightly tweaked by me.  Review and testing of the
patch set from which this was taken by Rajkumar Raghuwanshi and Rafia
Sabih.

Discussion: http://postgr.es/m/CA+TgmoagTnF2yqR3PT2rv=om=wJiZ4-A+ATwdnriTGku1CLYxA@mail.gmail.com
2017-04-03 22:41:31 -04:00
Alvaro Herrera 1f171a1803 Fix thinko in estimate_num_groups
The code for the reworked n-distinct estimation on commit 7b504eb282 was
written differently in a previous version of the patch, prior to commit;
on rewriting it, we missed updating an initializer.  This caused the
code to (mistakenly) apply a fudge factor even in the case where a
single value is applied, leading to incorrect results.

This means that the 'relvarcount' variable name is now wrong.  Add a
comment to try and make the situation clearer, and remove an incorrect
comment I added.

Problem noticed, and code patch, by Tomas Vondra.  Additional commentary
by Álvaro.
2017-03-27 13:14:23 -03:00
Alvaro Herrera 7b504eb282 Implement multivariate n-distinct coefficients
Add support for explicitly declared statistic objects (CREATE
STATISTICS), allowing collection of statistics on more complex
combinations that individual table columns.  Companion commands DROP
STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are
added too.  All this DDL has been designed so that more statistic types
can be added later on, such as multivariate most-common-values and
multivariate histograms between columns of a single table, leaving room
for permitting columns on multiple tables, too, as well as expressions.

This commit only adds support for collection of n-distinct coefficient
on user-specified sets of columns in a single table.  This is useful to
estimate number of distinct groups in GROUP BY and DISTINCT clauses;
estimation errors there can cause over-allocation of memory in hashed
aggregates, for instance, so it's a worthwhile problem to solve.  A new
special pseudo-type pg_ndistinct is used.

(num-distinct estimation was deemed sufficiently useful by itself that
this is worthwhile even if no further statistic types are added
immediately; so much so that another version of essentially the same
functionality was submitted by Kyotaro Horiguchi:
https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp
though this commit does not use that code.)

Author: Tomas Vondra.  Some code rework by Álvaro.
Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes,
    Ideriha Takeshi
Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz
    https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 14:06:10 -03:00
Peter Eisentraut eccfef81e1 ICU support
Add a column collprovider to pg_collation that determines which library
provides the collation data.  The existing choices are default and libc,
and this adds an icu choice, which uses the ICU4C library.

The pg_locale_t type is changed to a union that contains the
provider-specific locale handles.  Users of locale information are
changed to look into that struct for the appropriate handle to use.

Also add a collversion column that records the version of the collation
when it is created, and check at run time whether it is still the same.
This detects potentially incompatible library upgrades that can corrupt
indexes and other structures.  This is currently only supported by
ICU-provided collations.

initdb initializes the default collation set as before from the `locale
-a` output but also adds all available ICU locales with a "-x-icu"
appended.

Currently, ICU-provided collations can only be explicitly named
collations.  The global database locales are still always libc-provided.

ICU support is enabled by configure --with-icu.

Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
2017-03-23 15:28:48 -04:00
Stephen Frost c7a9fa399d Add support for EUI-64 MAC addresses as macaddr8
This adds in support for EUI-64 MAC addresses by adding a new data type
called 'macaddr8' (using our usual convention of indicating the number
of bytes stored).

This was largely a copy-and-paste from the macaddr data type, with
appropriate adjustments for having 8 bytes instead of 6 and adding
support for converting a provided EUI-48 (6 byte format) to the EUI-64
format.  Conversion from EUI-48 to EUI-64 inserts FFFE as the 4th and
5th bytes but does not perform the IPv6 modified EUI-64 action of
flipping the 7th bit, but we add a function to perform that specific
action for the user as it may be commonly done by users who wish to
calculate their IPv6 address based on their network prefix and 48-bit
MAC address.

Author: Haribabu Kommi, with a good bit of rework of macaddr8_in by me.
Reviewed by: Vitaly Burovoy, Kuntal Ghosh

Discussion: https://postgr.es/m/CAJrrPGcUi8ZH+KkK+=TctNQ+EfkeCEHtMU_yo1mvX8hsk_ghNQ@mail.gmail.com
2017-03-15 11:16:25 -04:00
Peter Eisentraut f97a028d8e Spelling fixes in code comments
From: Josh Soref <jsoref@gmail.com>
2017-03-14 12:58:39 -04:00
Noah Misch 3a0d473192 Use wrappers of PG_DETOAST_DATUM_PACKED() more.
This makes almost all core code follow the policy introduced in the
previous commit.  Specific decisions:

- Text search support functions with char* and length arguments, such as
  prsstart and lexize, may receive unaligned strings.  I doubt
  maintainers of non-core text search code will notice.

- Use plain VARDATA() on values detoasted or synthesized earlier in the
  same function.  Use VARDATA_ANY() on varlenas sourced outside the
  function, even if they happen to always have four-byte headers.  As an
  exception, retain the universal practice of using VARDATA() on return
  values of SendFunctionCall().

- Retain PG_GETARG_BYTEA_P() in pageinspect.  (Page images are too large
  for a one-byte header, so this misses no optimization.)  Sites that do
  not call get_page_from_raw() typically need the four-byte alignment.

- For now, do not change btree_gist.  Its use of four-byte headers in
  memory is partly entangled with storage of 4-byte headers inside
  GBT_VARKEY, on disk.

- For now, do not change gtrgm_consistent() or gtrgm_distance().  They
  incorporate the varlena header into a cache, and there are multiple
  credible implementation strategies to consider.
2017-03-12 19:35:34 -04:00
Tom Lane 86dbbf20d8 Put back <float.h> in a few files that need it for _isnan().
Further fallout from commit c29aff959: there are some files that need
<float.h>, and were getting it from datatype/timestamp.h, but it was not
apparent in my (tgl's) testing because the requirement for <float.h>
exists only on certain Windows toolchains.

Report and patch by David Rowley.

Discussion: https://postgr.es/m/CAKJS1f-BHceaFzZScFapDV48gUVM2CAOBfhkgffdqXzFb+kwew@mail.gmail.com
2017-03-08 15:38:34 -05:00
Tom Lane b9d092c962 Remove now-dead code for !HAVE_INT64_TIMESTAMP.
This is a basically mechanical removal of #ifdef HAVE_INT64_TIMESTAMP
tests and the negative-case controlled code.

Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us
2017-02-23 14:04:43 -05:00
Peter Eisentraut 38d103763d Make more use of castNode() 2017-02-21 11:59:09 -05:00
Robert Haas 5262f7a4fc Add optimizer and executor support for parallel index scans.
In combination with 569174f1be, which
taught the btree AM how to perform parallel index scans, this allows
parallel index scan plans on btree indexes.  This infrastructure
should be general enough to support parallel index scans for other
index AMs as well, if someone updates them to support parallel
scans.

Amit Kapila, reviewed and tested by Anastasia Lubennikova, Tushar
Ahuja, and Haribabu Kommi, and me.
2017-02-15 13:53:24 -05:00
Peter Eisentraut f21a563d25 Move some things from builtins.h to new header files
This avoids that builtins.h has to include additional header files.
2017-01-20 20:29:53 -05:00
Bruce Momjian 1d25779284 Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
Tom Lane ca5f885020 Improve eqjoinsel_semi's behavior for small inner relations with no stats.
If we don't have any MCV statistics for the inner relation, and we don't
trust its numdistinct estimate either, eqjoinsel_semi falls back to a very
conservative estimate (that 50% of the outer rows have matches).  This is
particularly problematic if the inner relation is completely empty, since
then even an explicit ANALYZE won't produce any pg_statistic entries,
so there's no way to budge the planner off the bad estimate.

We'd produce a better estimate in such cases if we used the nd2/nd1
selectivity heuristic, so an easy fix is to treat the nd2 estimate as
non-default if we derive it from clamping to the inner rel's rowcount
estimate.  This won't fix every related case (mainly because the rowcount
estimate might be larger than DEFAULT_NUM_DISTINCT), but it seems like a
sane extension of the existing logic, so let's apply the change in HEAD
and see if anyone complains.  Per bug #14438 from Nikolay Nikitin.

Report: https://postgr.es/m/20161128182113.6527.58926@wrigleys.postgresql.org
Discussion: https://postgr.es/m/31089.1480384713@sss.pgh.pa.us
2016-11-29 18:00:56 -05:00
Tom Lane 95bee941be Fix misestimation of n_distinct for a nearly-unique column with many nulls.
If ANALYZE found no repeated non-null entries in its sample, it set the
column's stadistinct value to -1.0, intending to indicate that the entries
are all distinct.  But what this value actually means is that the number
of distinct values is 100% of the table's rowcount, and thus it was
overestimating the number of distinct values by however many nulls there
are.  This could lead to very poor selectivity estimates, as for example
in a recent report from Andreas Joseph Krogh.  We should discount the
stadistinct value by whatever we've estimated the nulls fraction to be.
(That is what will happen if we choose to use a negative stadistinct for
a column that does have repeated entries, so this code path was just
inconsistent.)

In addition to fixing the stadistinct entries stored by several different
ANALYZE code paths, adjust the logic where get_variable_numdistinct()
forces an "all distinct" estimate on the basis of finding a relevant unique
index.  Unique indexes don't reject nulls, so there's no reason to assume
that the null fraction doesn't apply.

Back-patch to all supported branches.  Back-patching is a bit of a judgment
call, but this problem seems to affect only a few users (else we'd have
identified it long ago), and it's bad enough when it does happen that
destabilizing plan choices in a worse direction seems unlikely.

Patch by me, with documentation wording suggested by Dean Rasheed

Report: <VisenaEmail.26.df42f82acae38a58.156463942b8@tc7-visena>
Discussion: <16143.1470350371@sss.pgh.pa.us>
2016-08-07 18:52:02 -04:00