Commit Graph

2226 Commits

Author SHA1 Message Date
Tom Lane 57cd2b6e6d Fix create_scan_plan's handling of sortgrouprefs for physical tlists.
We should only run apply_pathtarget_labeling_to_tlist if CP_LABEL_TLIST
was specified, because only in that case has use_physical_tlist checked
that the labeling will succeed; otherwise we may get an "ORDER/GROUP BY
expression not found in targetlist" error.  (This subsumes the previous
test about gating_clauses, because we reset "flags" to zero earlier
if there are gating clauses to apply.)

The only known case in which a failure can occur is with a ProjectSet
path directly atop a table scan path, although it seems likely that there
are other cases or will be such in future.  This means that the failure
is currently only visible in the v10 branch: 9.6 didn't have ProjectSet,
while in v11 and HEAD, apply_scanjoin_target_to_paths for some weird
reason is using create_projection_path not apply_projection_to_path,
masking the problem because there's a ProjectionPath in between.

Nonetheless this code is clearly wrong on its own terms, so back-patch
to 9.6 where this logic was introduced.

Per report from Regina Obe.

Discussion: https://postgr.es/m/001501d40f88$75186950$5f493bf0$@pcorp.us
2018-07-11 15:25:28 -04:00
Tom Lane ff4f889164 Fix bugs with degenerate window ORDER BY clauses in GROUPS/RANGE mode.
nodeWindowAgg.c failed to cope with the possibility that no ordering
columns are defined in the window frame for GROUPS mode or RANGE OFFSET
mode, leading to assertion failures or odd errors, as reported by Masahiko
Sawada and Lukas Eder.  In RANGE OFFSET mode, an ordering column is really
required, so add an Assert about that.  In GROUPS mode, the code would
work, except that the node initialization code wasn't in sync with the
execution code about when to set up tuplestore read pointers and spare
slots.  Fix the latter for consistency's sake (even though I think the
changes described below make the out-of-sync cases unreachable for now).

Per SQL spec, a single ordering column is required for RANGE OFFSET mode,
and at least one ordering column is required for GROUPS mode.  The parser
enforced the former but not the latter; add a check for that.

We were able to reach the no-ordering-column cases even with fully spec
compliant queries, though, because the planner would drop partitioning
and ordering columns from the generated plan if they were redundant with
earlier columns according to the redundant-pathkey logic, for instance
"PARTITION BY x ORDER BY y" in the presence of a "WHERE x=y" qual.
While in principle that's an optimization that could save some pointless
comparisons at runtime, it seems unlikely to be meaningful in the real
world.  I think this behavior was not so much an intentional optimization
as a side-effect of an ancient decision to construct the plan node's
ordering-column info by reverse-engineering the PathKeys of the input
path.  If we give up redundant-column removal then it takes very little
code to generate the plan node info directly from the WindowClause,
ensuring that we have the expected number of ordering columns in all
cases.  (If anyone does complain about this, the planner could perhaps
be taught to remove redundant columns only when it's safe to do so,
ie *not* in RANGE OFFSET mode.  But I doubt anyone ever will.)

With these changes, the WindowAggPath.winpathkeys field is not used for
anything anymore, so remove it.

The test cases added here are not actually very interesting given the
removal of the redundant-column-removal logic, but they would represent
important corner cases if anyone ever tries to put that back.

Tom Lane and Masahiko Sawada.  Back-patch to v11 where RANGE OFFSET
and GROUPS modes were added.

Discussion: https://postgr.es/m/CAD21AoDrWqycq-w_+Bx1cjc+YUhZ11XTj9rfxNiNDojjBx8Fjw@mail.gmail.com
Discussion: https://postgr.es/m/153086788677.17476.8002640580496698831@wrigleys.postgresql.org
2018-07-11 12:07:20 -04:00
Alvaro Herrera f2c587067a Rethink how to get float.h in old Windows API for isnan/isinf
We include <float.h> in every place that needs isnan(), because MSVC
used to require it.  However, since MSVC 2013 that's no longer necessary
(cf. commit cec8394b5c), so we can retire the inclusion to a
version-specific stanza in win32_port.h, where it doesn't need to
pollute random .c files.  The header is of course still needed in a few
places for other reasons.

I (Álvaro) removed float.h from a few more files than in Emre's original
patch.  This doesn't break the build in my system, but we'll see what
the buildfarm has to say about it all.

Author: Emre Hasegeli
Discussion: https://postgr.es/m/CAE2gYzyc0+5uG+Cd9-BSL7NKC8LSHLNg1Aq2=8ubjnUwut4_iw@mail.gmail.com
2018-07-11 09:11:48 -04:00
Peter Eisentraut bcbd940806 Remove dynamic_shared_memory_type=none
PostgreSQL nowadays offers some kind of dynamic shared memory feature on
all supported platforms.  Having the choice of "none" prevents us from
relying on DSM in core features.  So this patch removes the choice of
"none".

Author: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
2018-07-10 18:35:24 +02:00
Michael Paquier fc057b2b8f Remove dead code for temporary relations in partition planning
Since recent commit 1c7c317c, temporary relations cannot be mixed with
permanent relations within the same partition tree, and the same counts
for temporary relations created by other sessions, which the planner
simply discarded.  Instead be paranoid and issue an error, as those
should be blocked at definition time, at least for now.

At the same time, a test case is added to stress what has been moved
when expand_partitioned_rtentry gets called recursively but bumps on a
partitioned relation with no partitions which should be handled the same
way as the non-inheritance case.  This code may be reworked in a close
future, and covering this code path will limit surprises.

Reported-by: David Rowley
Author: David Rowley
Reviewed-by: Amit Langote, Robert Haas, Michael Paquier
Discussion: https://postgr.es/m/CAKJS1f_HyV1txn_4XSdH5EOhBMYaCwsXyAj6bHXk9gOu4JKsbw@mail.gmail.com
2018-07-04 10:37:40 +09:00
Alvaro Herrera 7d872c91a3 Allow direct lookups of AppendRelInfo by child relid
find_appinfos_by_relids had quite a large overhead when the number of
items in the append_rel_list was high, as it had to trawl through the
append_rel_list looking for AppendRelInfos belonging to the given
childrelids.  Since there can only be a single AppendRelInfo for each
child rel, it seems much better to store an array in PlannerInfo which
indexes these by child relid, making the function O(1) rather than O(N).
This function was only called once inside the planner, so just replace
that call with a lookup to the new array.  find_childrel_appendrelinfo
is now unused and thus removed.

This fixes a planner performance regression new to v11 reported by
Thomas Reiss.

Author: David Rowley
Reported-by: Thomas Reiss
Reviewed-by: Ashutosh Bapat
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/94dd7a4b-5e50-0712-911d-2278e055c622@dalibo.com
2018-06-26 10:35:26 -04:00
Robert Haas c6f28af5d7 Avoid generating bogus paths with partitionwise aggregate.
Previously, if some or all partitions had no partially aggregated path,
we would still try to generate a partially aggregated path for the
parent, leading to assertion failures or wrong answers.

Report by Rajkumar Raghuwanshi.  Patch by Jeevan Chalke, reviewed
by Ashutosh Bapat.  A few changes by me.

Discussion: http://postgr.es/m/CAKcux6=q4+Mw8gOOX16ef6ZMFp9Cve7KWFstUsrDa4GiFaXGUQ@mail.gmail.com
2018-06-22 09:20:19 -04:00
Amit Kapila 98d476a965 Improve coding pattern in Parallel Append code.
The create_append_path code didn't consider that list_concat will
modify it's first argument leading to inconsistent traversal of
resulting list.  In practice, it won't lead to any user-visible bug
but changing it for making the code behave consistently.

Reported-by: Tom Lane
Author: Tom Lane
Reviewed-by: Amit Khandekar and Amit Kapila
Discussion: https://postgr.es/m/32365.1528994120@sss.pgh.pa.us
2018-06-22 08:43:36 +05:30
Tom Lane 07e5a21352 Fix mishandling of sortgroupref labels while splitting SRF targetlists.
split_pathtarget_at_srfs() neglected to worry about sortgroupref labels
in the intermediate PathTargets it constructs.  I think we'd supposed
that their labeling didn't matter, but it does at least for the case that
GroupAggregate/GatherMerge nodes appear immediately under the ProjectSet
step(s).  This results in "ERROR: ORDER/GROUP BY expression not found in
targetlist" during create_plan(), as reported by Rajkumar Raghuwanshi.

To fix, make this logic track the sortgroupref labeling of expressions,
not just their contents.  This also restores the pre-v10 behavior that
separate GROUP BY expressions will be kept distinct even if they are
textually equal().

Discussion: https://postgr.es/m/CAKcux6=1_Ye9kx8YLBPmJs_xE72PPc6vNi5q2AOHowMaCWjJ2w@mail.gmail.com
2018-06-21 10:58:42 -04:00
Amit Kapila 403318b71f Don't consider parallel append for parallel unsafe paths.
Commit ab72716778 allowed Parallel Append paths to be generated for a
relation that is not parallel safe.  Prevent that from happening.

Initial analysis by Tom Lane.

Reported-by: Rajkumar Raghuwanshi
Author: Amit Kapila and Rajkumar Raghuwanshi
Reviewed-by: Amit Khandekar and Robert Haas
Discussion:https://postgr.es/m/CAKcux6=tPJ6nJ08r__nU_pmLQiC0xY15Fn0HvG1Cprsjdd9s_Q@mail.gmail.com
2018-06-20 07:51:42 +05:30
Tom Lane 4e23236403 Improve commentary about run-time partition pruning data structures.
No code changes except for a couple of new Asserts.

David Rowley and Tom Lane

Discussion: https://postgr.es/m/CAKJS1f-6GODRNgEtdPxCnAPme2h2hTztB6LmtfdmcYAAOE0kQg@mail.gmail.com
2018-06-11 17:35:53 -04:00
Tom Lane 939449de0e Relocate partition pruning structs to a saner place.
These struct definitions were originally dropped into primnodes.h,
which is a poor choice since that's mainly intended for primitive
expression node types; these are not in that category.  What they
are is auxiliary info in Plan trees, so move them to plannodes.h.

For consistency, also relocate some related code that was apparently
placed with the aid of a dartboard.

There's no interesting code changes in this commit, just reshuffling.

David Rowley and Tom Lane

Discussion: https://postgr.es/m/CAFj8pRBjrufA3ocDm8o4LPGNye9Y+pm1b9kCwode4X04CULG3g@mail.gmail.com
2018-06-10 16:30:14 -04:00
Magnus Hagander 848b1f3e35 Fix typo in README
Author: Daniel Gustafsson <daniel@yesql.se>
2018-06-07 14:40:38 +02:00
Heikki Linnakangas a0b37684ba Fix typo in comment. 2018-05-22 11:18:16 +03:00
Tom Lane a11b3bd37f Fix misprocessing of equivalence classes involving record_eq().
canonicalize_ec_expression() is supposed to agree with coerce_type() as to
whether a RelabelType should be inserted to make a subexpression be valid
input for the operators of a given opclass.  However, it did the wrong
thing with named-composite-type inputs to record_eq(): it put in a
RelabelType to RECORDOID, which the parser doesn't.  In some cases this was
harmless because all code paths involving a particular equivalence class
did the same thing, but in other cases this would result in failing to
recognize a composite-type expression as being a member of an equivalence
class that it actually is a member of.  The most obvious bad effect was to
fail to recognize that an index on a composite column could provide the
sort order needed for a mergejoin on that column, as reported by Teodor
Sigaev.  I think there might be other, subtler, cases that result in
misoptimization.  It also seems possible that an unwanted RelabelType
would sometimes get into an emitted plan --- but because record_eq and
friends don't examine the declared type of their input expressions, that
would not create any visible problems.

To fix, just treat RECORDOID as if it were a polymorphic type, which in
some sense it is.  We might want to consider formalizing that a bit more
someday, but for the moment this seems to be the only place where an
IsPolymorphicType() test ought to include RECORDOID as well.

This has been broken for a long time, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/a6b22369-e3bf-4d49-f59d-0c41d3551e81@sigaev.ru
2018-05-16 13:46:23 -04:00
Robert Haas 7fc7dac1a7 Pass the correct PlannerInfo to PlanForeignModify/PlanDirectModify.
Previously, we passed the toplevel PlannerInfo, but we actually want
to pass the relevant subroot.  One problem with passing the toplevel
PlannerInfo is that the FDW which wants to push down an UPDATE or
DELETE against a join won't find the relevant joinrel there.
As of commit 1bc0100d27, postgres_fdw
tries to do exactly this and can be made to fail an assertion as a
result.

It's possible that this should be regarded as a bug fix and
back-patched to earlier releases, but for lack of a test case that
fails in earlier releases, no back-patch for now.

Etsuro Fujita, reviewed by Amit Langote.

Discussion: http://postgr.es/m/5AF43E02.30000@lab.ntt.co.jp
2018-05-16 11:32:38 -04:00
Tom Lane bdf46af748 Post-feature-freeze pgindent run.
Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us
2018-04-26 14:47:16 -04:00
Robert Haas dc1057fcd8 Prevent generation of bogus subquery scan paths.
Commit 0927d2f46d didn't check that
consider_parallel was set for the target relation or account for
the possibility that required_outer might be non-empty.

To prevent future bugs of this ilk, add some assertions to
add_partial_path and do a bit of future-proofing of the code
recently added to recurse_set_operations.

Report by Andreas Seltenreich.  Patch by Jeevan Chalke.  Review
by Amit Kapila and by me.

Discussion: http://postgr.es/m/CAM2+6=U+9otsyF2fYB8x_2TBeHTR90itarqW=qAEjN-kHaC7kw@mail.gmail.com
2018-04-25 15:25:55 -04:00
Alvaro Herrera 055fb8d33d Add GUC enable_partition_pruning
This controls both plan-time and execution-time new-style partition
pruning.  While finer-grain control is possible (maybe using an enum GUC
instead of boolean), there doesn't seem to be much need for that.

This new parameter controls partition pruning for all queries:
trivially, SELECT queries that affect partitioned tables are naturally
under its control since they are using the new technology.  However,
while UPDATE/DELETE queries do not use the new code, we make the new GUC
control their behavior also (stealing control from
constraint_exclusion), because it is more natural, and it leads to a
more natural transition to the future in which those queries will also
use the new pruning code.

Constraint exclusion still controls pruning for regular inheritance
situations (those not involving partitioned tables).

Author: David Rowley
Review: Amit Langote, Ashutosh Bapat, Justin Pryzby, David G. Johnston
Discussion: https://postgr.es/m/CAKJS1f_0HwsxJG9m+nzU+CizxSdGtfe6iF_ykPYBiYft302DCw@mail.gmail.com
2018-04-23 17:57:43 -03:00
Tom Lane ec38dcd363 Tweak a couple of planner APIs to save recalculating join relids.
Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se
2018-04-20 16:00:47 -04:00
Tom Lane c792c7db41 Change more places to be less trusting of RestrictInfo.is_pushed_down.
On further reflection, commit e5d83995e didn't go far enough: pretty much
everywhere in the planner that examines a clause's is_pushed_down flag
ought to be changed to use the more complicated behavior where we also
check the clause's required_relids.  Otherwise we could make incorrect
decisions about whether, say, a clause is safe to use as a hash clause.

Some (many?) of these places are safe as-is, either because they are
never reached while considering a parameterized path, or because there
are additional checks that would reject a pushed-down clause anyway.
However, it seems smarter to just code them all the same way rather
than rely on easily-broken reasoning of that sort.

In support of that, invent a new macro RINFO_IS_PUSHED_DOWN that should
be used in place of direct tests on the is_pushed_down flag.

Like the previous patch, back-patch to all supported branches.

Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se
2018-04-20 15:19:16 -04:00
Tom Lane e5d83995e9 Fix incorrect handling of join clauses pushed into parameterized paths.
In some cases a clause attached to an outer join can be pushed down into
the outer join's RHS even though the clause is not degenerate --- this
can happen if we choose to make a parameterized path for the RHS.  If
the clause ends up attached to a lower outer join, we'd misclassify it
as being a "join filter" not a plain "filter" condition at that node,
leading to wrong query results.

To fix, teach extract_actual_join_clauses to examine each join clause's
required_relids, not just its is_pushed_down flag.  (The latter now
seems vestigial, or at least in need of rethinking, but we won't do
anything so invasive as redefining it in a bug-fix patch.)

This has been wrong since we introduced parameterized paths in 9.2,
though it's evidently hard to hit given the lack of previous reports.
The test case used here involves a lateral function call, and I think
that a lateral reference may be required to get the planner to select
a broken plan; though I wouldn't swear to that.  In any case, even if
LATERAL is needed to trigger the bug, it still affects all supported
branches, so back-patch to all.

Per report from Andreas Karlsson.  Thanks to Andrew Gierth for
preliminary investigation.

Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se
2018-04-19 15:49:30 -04:00
Alvaro Herrera da6f3e45dd Reorganize partitioning code
There's been a massive addition of partitioning code in PostgreSQL 11,
with little oversight on its placement, resulting in a
catalog/partition.c with poorly defined boundaries and responsibilities.
This commit tries to set a couple of distinct modules to separate things
a little bit.  There are no code changes here, only code movement.

There are three new files:
  src/backend/utils/cache/partcache.c
  src/include/partitioning/partdefs.h
  src/include/utils/partcache.h

The previous arrangement of #including catalog/partition.h almost
everywhere is no more.

Authors: Amit Langote and Álvaro Herrera
Discussion: https://postgr.es/m/98e8d509-790a-128c-be7f-e48a5b2d8d97@lab.ntt.co.jp
	https://postgr.es/m/11aa0c50-316b-18bb-722d-c23814f39059@lab.ntt.co.jp
	https://postgr.es/m/143ed9a4-6038-76d4-9a55-502035815e68@lab.ntt.co.jp
	https://postgr.es/m/20180413193503.nynq7bnmgh6vs5vm@alvherre.pgsql
2018-04-14 21:12:14 -03:00
Peter Eisentraut a8677e3ff6 Support named and default arguments in CALL
We need to call expand_function_arguments() to expand named and default
arguments.

In PL/pgSQL, we also need to deal with named and default INOUT arguments
when receiving the output values into variables.

Author: Pavel Stehule <pavel.stehule@gmail.com>
2018-04-14 09:13:53 -04:00
Teodor Sigaev c266ed31a8 Cleanup covering infrastructure
- Explicitly forbids opclass, collation and indoptions (like DESC/ASC etc) for
  including columns. Throw an error if user points that.
- Truncated storage arrays for such attributes to store only key atrributes,
  added assertion checks.
- Do not check opfamily and collation for including columns in
  CompareIndexInfo()

Discussion: https://www.postgresql.org/message-id/5ee72852-3c4e-ee35-e2ed-c1d053d45c08@sigaev.ru
2018-04-12 16:37:22 +03:00
Simon Riggs 08ea7a2291 Revert MERGE patch
This reverts commits d204ef6377,
83454e3c2b and a few more commits thereafter
(complete list at the end) related to MERGE feature.

While the feature was fully functional, with sufficient test coverage and
necessary documentation, it was felt that some parts of the executor and
parse-analyzer can use a different design and it wasn't possible to do that in
the available time. So it was decided to revert the patch for PG11 and retry
again in the future.

Thanks again to all reviewers and bug reporters.

List of commits reverted, in reverse chronological order:

 f1464c5380 Improve parse representation for MERGE
 ddb4158579 MERGE syntax diagram correction
 530e69e59b Allow cpluspluscheck to pass by renaming variable
 01b88b4df5 MERGE minor errata
 3af7b2b0d4 MERGE fix variable warning in non-assert builds
 a5d86181ec MERGE INSERT allows only one VALUES clause
 4b2d44031f MERGE post-commit review
 4923550c20 Tab completion for MERGE
 aa3faa3c7a WITH support in MERGE
 83454e3c2b New files for MERGE
 d204ef6377 MERGE SQL Command following SQL:2016

Author: Pavan Deolasee
Reviewed-by: Michael Paquier
2018-04-12 11:22:56 +01:00
Tom Lane cefa387153 Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.
Traditionally, include/catalog/pg_foo.h contains extern declarations
for functions in backend/catalog/pg_foo.c, in addition to its function
as the authoritative definition of the pg_foo catalog's rowtype.
In some cases, we'd been forced to split out those extern declarations
into separate pg_foo_fn.h headers so that the catalog definitions
could be #include'd by frontend code.  That problem is gone as of
commit 9c0a0de4c, so let's undo the splits to make things less
confusing.

Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us
2018-04-08 14:35:29 -04:00
Teodor Sigaev 02f3e558f2 match_clause_to_index should check only key columns
Alexander Korotkov per gripe from Tom Lane noticed on valgrind-enabled
buildfarm members
2018-04-08 19:58:15 +03:00
Alvaro Herrera 499be013de Support partition pruning at execution time
Existing partition pruning is only able to work at plan time, for query
quals that appear in the parsed query.  This is good but limiting, as
there can be parameters that appear later that can be usefully used to
further prune partitions.

This commit adds support for pruning subnodes of Append which cannot
possibly contain any matching tuples, during execution, by evaluating
Params to determine the minimum set of subnodes that can possibly match.
We support more than just simple Params in WHERE clauses. Support
additionally includes:

1. Parameterized Nested Loop Joins: The parameter from the outer side of the
   join can be used to determine the minimum set of inner side partitions to
   scan.

2. Initplans: Once an initplan has been executed we can then determine which
   partitions match the value from the initplan.

Partition pruning is performed in two ways.  When Params external to the plan
are found to match the partition key we attempt to prune away unneeded Append
subplans during the initialization of the executor.  This allows us to bypass
the initialization of non-matching subplans meaning they won't appear in the
EXPLAIN or EXPLAIN ANALYZE output.

For parameters whose value is only known during the actual execution
then the pruning of these subplans must wait.  Subplans which are
eliminated during this stage of pruning are still visible in the EXPLAIN
output.  In order to determine if pruning has actually taken place, the
EXPLAIN ANALYZE must be viewed.  If a certain Append subplan was never
executed due to the elimination of the partition then the execution
timing area will state "(never executed)".  Whereas, if, for example in
the case of parameterized nested loops, the number of loops stated in
the EXPLAIN ANALYZE output for certain subplans may appear lower than
others due to the subplan having been scanned fewer times.  This is due
to the list of matching subnodes having to be evaluated whenever a
parameter which was found to match the partition key changes.

This commit required some additional infrastructure that permits the
building of a data structure which is able to perform the translation of
the matching partition IDs, as returned by get_matching_partitions, into
the list index of a subpaths list, as exist in node types such as
Append, MergeAppend and ModifyTable.  This allows us to translate a list
of clauses into a Bitmapset of all the subpath indexes which must be
included to satisfy the clause list.

Author: David Rowley, based on an earlier effort by Beena Emerson
Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi,
Jesper Pedersen
Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com
2018-04-07 17:54:39 -03:00
Teodor Sigaev 8224de4f42 Indexes with INCLUDE columns and their support in B-tree
This patch introduces INCLUDE clause to index definition.  This clause
specifies a list of columns which will be included as a non-key part in
the index.  The INCLUDE columns exist solely to allow more queries to
benefit from index-only scans.  Also, such columns don't need to have
appropriate operator classes.  Expressions are not supported as INCLUDE
columns since they cannot be used in index-only scans.

Index access methods supporting INCLUDE are indicated by amcaninclude flag
in IndexAmRoutine.  For now, only B-tree indexes support INCLUDE clause.

In B-tree indexes INCLUDE columns are truncated from pivot index tuples
(tuples located in non-leaf pages and high keys).  Therefore, B-tree indexes
now might have variable number of attributes.  This patch also provides
generic facility to support that: pivot tuples contain number of their
attributes in t_tid.ip_posid.  Free 13th bit of t_info is used for indicating
that.  This facility will simplify further support of index suffix truncation.
The changes of above are backward-compatible, pg_upgrade doesn't need special
handling of B-tree indexes for that.

Bump catalog version

Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me
Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes,
			 David Rowley, Alexander Korotkov
Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru
2018-04-07 23:00:39 +03:00
Alvaro Herrera 9fdb675fc5 Faster partition pruning
Add a new module backend/partitioning/partprune.c, implementing a more
sophisticated algorithm for partition pruning.  The new module uses each
partition's "boundinfo" for pruning instead of constraint exclusion,
based on an idea proposed by Robert Haas of a "pruning program": a list
of steps generated from the query quals which are run iteratively to
obtain a list of partitions that must be scanned in order to satisfy
those quals.

At present, this targets planner-time partition pruning, but there exist
further patches to apply partition pruning at execution time as well.

This commit also moves some definitions from include/catalog/partition.h
to a new file include/partitioning/partbounds.h, in an attempt to
rationalize partitioning related code.

Authors: Amit Langote, David Rowley, Dilip Kumar
Reviewers: Robert Haas, Kyotaro Horiguchi, Ashutosh Bapat, Jesper Pedersen.
Discussion: https://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp
2018-04-06 16:44:05 -03:00
Simon Riggs 4b2d44031f MERGE post-commit review
Review comments from Andres Freund

* Consolidate code into AfterTriggerGetTransitionTable()
* Rename nodeMerge.c to execMerge.c
* Rename nodeMerge.h to execMerge.h
* Move MERGE handling in ExecInitModifyTable()
  into a execMerge.c ExecInitMerge()
* Move mt_merge_subcommands flags into execMerge.h
* Rename opt_and_condition to opt_merge_when_and_condition
* Wordsmith various comments

Author: Pavan Deolasee
Reviewer: Simon Riggs
2018-04-05 09:54:07 +01:00
Simon Riggs d204ef6377 MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows
a task that would other require multiple PL statements.
e.g.

MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
  UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
  INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
  DO NOTHING;

MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.

MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.

MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.

Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.

This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.

Various issues reported via sqlsmith by Andreas Seltenreich

Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs

Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 09:28:16 +01:00
Simon Riggs 7cf8a5c302 Revert "Modified files for MERGE"
This reverts commit 354f13855e.
2018-04-02 21:34:15 +01:00
Simon Riggs 354f13855e Modified files for MERGE 2018-04-02 21:12:47 +01:00
Robert Haas 7e0d64c7a5 postgres_fdw: Push down partition-wise aggregation.
Since commit 7012b132d0, postgres_fdw
has been able to push down the toplevel aggregation operation to the
remote server.  Commit e2f1eb0ee3 made
it possible to break down the toplevel aggregation into one
aggregate per partition.  This commit lets postgres_fdw push down
aggregation in that case just as it does at the top level.

In order to make this work, this commit adds an additional argument
to the GetForeignUpperPaths FDW API.  A matching argument is added
to the signature for create_upper_paths_hook.  Third-party code using
either of these will need to be updated.

Also adjust create_foreignscan_plan() so that it picks up the correct
set of relids in this case.

Jeevan Chalke, reviewed by Ashutosh Bapat and by me and with some
adjustments by me.  The larger patch series of which this patch is a
part was also reviewed and tested by Antonin Houska, Rajkumar
Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal
Legrand, and Rafia Sabih.

Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com
Discussion: http://postgr.es/m/CAM2+6=XPWujjmj5zUaBTGDoB38CemwcPmjkRy0qOcsQj_V+2sQ@mail.gmail.com
2018-04-02 10:51:50 -04:00
Tom Lane 0b11a674fb Fix a boatload of typos in C comments.
Justin Pryzby

Discussion: https://postgr.es/m/20180331105640.GK28454@telsasoft.com
2018-04-01 15:01:28 -04:00
Robert Haas 96030f9a48 Don't call IS_DUMMY_REL() when cheapest_total_path might be junk.
Unlike the previous coding, this might result in a Gather per Append
subplan when the target list is parallel-restricted, but such a plan
is probably worth considering in that case, since a single Gather
on top of the entire Append is impossible.

Per Andres Freund and the buildfarm.

Discussion: http://postgr.es/m/20180330050351.bmxx4cdtz67czjda@alap3.anarazel.de
2018-03-30 11:40:41 -04:00
Robert Haas c1de1a3a8b Remove 'target' from GroupPathExtraData.
It's not needed.

Jeevan Chalke

Discussion: http://postgr.es/m/CAM2+6=XPWujjmj5zUaBTGDoB38CemwcPmjkRy0qOcsQj_V+2sQ@mail.gmail.com
2018-03-29 16:17:18 -04:00
Robert Haas 11cf92f6e2 Rewrite the code that applies scan/join targets to paths.
If the toplevel scan/join target list is parallel-safe, postpone
generating Gather (or Gather Merge) paths until after the toplevel has
been adjusted to return it.  This (correctly) makes queries with
expensive functions in the target list more likely to choose a
parallel plan, since the cost of the plan now reflects the fact that
the evaluation will happen in the workers rather than the leader.
The original complaint about this problem was from Jeff Janes.

If the toplevel scan/join relation is partitioned, recursively apply
the changes to all partitions.  This sometimes allows us to get rid of
Result nodes, because Append is not projection-capable but its
children may be.  It also cleans up what appears to be incorrect SRF
handling from commit e2f1eb0ee30d144628ab523432320f174a2c8966: the old
code had no knowledge of SRFs for child scan/join rels.

Because we now use create_projection_path() in some cases where we
formerly used apply_projection_to_path(), this changes the ordering
of columns in some queries generated by postgres_fdw.  Update
regression outputs accordingly.

Patch by me, reviewed by Amit Kapila and by Ashutosh Bapat.  Other
fixes for this problem (substantially different from this version)
were reviewed by Dilip Kumar, Amit Khandekar, and Marina Polyakova.

Discussion: http://postgr.es/m/CAMkU=1ycXNipvhWuweUVpKuyu6SpNjF=yHWu4c4US5JgVGxtZQ@mail.gmail.com
2018-03-29 15:49:31 -04:00
Robert Haas 3f90ec8597 Postpone generate_gather_paths for topmost scan/join rel.
Don't call generate_gather_paths for the topmost scan/join relation
when it is initially populated with paths.  Instead, do the work in
grouping_planner.  By itself, this gains nothing; in fact it loses
slightly because we end up calling set_cheapest() for the topmost
scan/join rel twice rather than once.  However, it paves the way for
a future commit which will postpone generate_gather_paths for the
topmost scan/join relation even further, allowing more accurate
costing of parallel paths.

Amit Kapila and Robert Haas.  Earlier versions of this patch (which
different substantially) were reviewed by Dilip Kumar, Amit
Khandekar, Marina Polyakova, and Ashutosh Bapat.
2018-03-29 15:40:40 -04:00
Robert Haas d7c19e62a8 Teach create_projection_plan to omit projection where possible.
We sometimes insert a ProjectionPath into a plan tree when projection
is not strictly required. The existing code already arranges to avoid
emitting a Result node when the ProjectionPath's subpath can perform
the projection itself, but previously it didn't consider the
possibility that the parent node might not actually require the
projection to be performed at all.

Skipping projection when it's not required can not only avoid Result
nodes that aren't needed, but also avoid losing the "physical tlist"
optimization unneccessarily.

Patch by me, reviewed by Amit Kapila.

Discussion: http://postgr.es/m/CA+TgmoakT5gmahbPWGqrR2nAdFOMAOnOXYoWHRdVfGWs34t6_A@mail.gmail.com
2018-03-29 15:37:48 -04:00
Andres Freund 9370462e9a Add inlining support to LLVM JIT provider.
This provides infrastructure to allow JITed code to inline code
implemented in C. This e.g. can be postgres internal functions or
extension code.

This already speeds up long running queries, by allowing the LLVM
optimizer to optimize across function boundaries. The optimization
potential currently doesn't reach its full potential because LLVM
cannot optimize the FunctionCallInfoData argument fully away, because
it's allocated on the heap rather than the stack. Fixing that is
beyond what's realistic for v11.

To be able to do that, use CLANG to convert C code to LLVM bitcode,
and have LLVM build a summary for it. That bitcode can then be used to
to inline functions at runtime. For that the bitcode needs to be
installed. Postgres bitcode goes into $pkglibdir/bitcode/postgres,
extensions go into equivalent directories.  PGXS has been modified so
that happens automatically if postgres has been compiled with LLVM
support.

Currently this isn't the fastest inline implementation, modules are
reloaded from disk during inlining. That's to work around an apparent
LLVM bug, triggering an apparently spurious error in LLVM assertion
enabled builds.  Once that is resolved we can remove the superfluous
read from disk.

Docs will follow in a later commit containing docs for the whole JIT
feature.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-28 13:19:08 -07:00
Andrew Dunstan 16828d5c02 Fast ALTER TABLE ADD COLUMN with a non-NULL default
Currently adding a column to a table with a non-NULL default results in
a rewrite of the table. For large tables this can be both expensive and
disruptive. This patch removes the need for the rewrite as long as the
default value is not volatile. The default expression is evaluated at
the time of the ALTER TABLE and the result stored in a new column
(attmissingval) in pg_attribute, and a new column (atthasmissing) is set
to true. Any existing row when fetched will be supplied with the
attmissingval. New rows will have the supplied value or the default and
so will never need the attmissingval.

Any time the table is rewritten all the atthasmissing and attmissingval
settings for the attributes are cleared, as they are no longer needed.

The most visible code change from this is in heap_attisnull, which
acquires a third TupleDesc argument, allowing it to detect a missing
value if there is one. In many cases where it is known that there will
not be any (e.g.  catalog relations) NULL can be passed for this
argument.

Andrew Dunstan, heavily modified from an original patch from Serge
Rielau.
Reviewed by Tom Lane, Andres Freund, Tomas Vondra and David Rowley.

Discussion: https://postgr.es/m/31e2e921-7002-4c27-59f5-51f08404c858@2ndQuadrant.com
2018-03-28 10:43:52 +10:30
Andres Freund 32af96b2b1 JIT tuple deforming in LLVM JIT provider.
Performing JIT compilation for deforming gains performance benefits
over unJITed deforming from compile-time knowledge of the tuple
descriptor. Fixed column widths, NOT NULLness, etc can be taken
advantage of.

Right now the JITed deforming is only used when deforming tuples as
part of expression evaluation (and obviously only if the descriptor is
known). It's likely to be beneficial in other cases, too.

By default tuple deforming is JITed whenever an expression is JIT
compiled. There's a separate boolean GUC controlling it, but that's
expected to be primarily useful for development and benchmarking.

Docs will follow in a later commit containing docs for the whole JIT
feature.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-26 12:57:19 -07:00
Alvaro Herrera 1b89c2188b Fix typo 2018-03-26 09:56:41 -03:00
Andres Freund 2a0faed9d7 Add expression compilation support to LLVM JIT provider.
In addition to the interpretation of expressions (which back
evaluation of WHERE clauses, target list projection, aggregates
transition values etc) support compiling expressions to native code,
using the infrastructure added in earlier commits.

To avoid duplicating a lot of code, only support emitting code for
cases that are likely to be performance critical. For expression steps
that aren't deemed that, use the existing interpreter.

The generated code isn't great - some architectural changes are
required to address that. But this already yields a significant
speedup for some analytics queries, particularly with WHERE clauses
filtering a lot, or computing multiple aggregates.

Author: Andres Freund
Tested-By: Thomas Munro
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de

Disable JITing for VALUES() nodes.

VALUES() nodes are only ever executed once. This is primarily helpful
for debugging, when forcing JITing even for cheap queries.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-22 14:45:59 -07:00
Robert Haas 88ba0ae2aa Consider Parallel Append of partial paths for UNION [ALL].
Without this patch, we can implement a UNION or UNION ALL as an
Append where Gather appears beneath one or more of the Append
branches, but this lets us put the Gather node on top, with
a partial path for each relation underneath.

There is considerably more work that could be done to improve
planning in this area, but that will probably need to wait
for a future release.

Patch by me, reviewed and tested by Ashutosh Bapat and Rajkumar
Raghuwanshi.

Discussion: http://postgr.es/m/CA+TgmoaLRAOqHmMZx=ESM3VDEPceg+-XXZsRXQ8GtFJO_zbMSw@mail.gmail.com
2018-03-22 16:09:28 -04:00
Andres Freund cc415a56d0 Basic planner and executor integration for JIT.
This adds simple cost based plan time decision about whether JIT
should be performed. jit_above_cost, jit_optimize_above_cost are
compared with the total cost of a plan, and if the cost is above them
JIT is performed / optimization is performed respectively.

For that PlannedStmt and EState have a jitFlags (es_jit_flags) field
that stores information about what JIT operations should be performed.

EState now also has a new es_jit field, which can store a
JitContext. When there are no errors the context is released in
standard_ExecutorEnd().

It is likely that the default values for jit_[optimize_]above_cost
will need to be adapted further, but in my test these values seem to
work reasonably.

Author: Andres Freund, with feedback by Peter Eisentraut
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-22 11:51:58 -07:00
Robert Haas e2f1eb0ee3 Implement partition-wise grouping/aggregation.
If the partition keys of input relation are part of the GROUP BY
clause, all the rows belonging to a given group come from a single
partition.  This allows aggregation/grouping over a partitioned
relation to be broken down * into aggregation/grouping on each
partition.  This should be no worse, and often better, than the normal
approach.

If the GROUP BY clause does not contain all the partition keys, we can
still perform partial aggregation for each partition and then finalize
aggregation after appending the partial results.  This is less certain
to be a win, but it's still useful.

Jeevan Chalke, Ashutosh Bapat, Robert Haas.  The larger patch series
of which this patch is a part was also reviewed and tested by Antonin
Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin
Knizhnik, Pascal Legrand, and Rafia Sabih.

Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com
2018-03-22 12:49:48 -04:00
Tom Lane 0f0deb7194 Improve predtest.c's handling of cases with NULL-constant inputs.
Currently, if operator_predicate_proof() is given an operator clause like
"something op NULL", it just throws up its hands and reports it can't prove
anything.  But we can often do better than that, if the operator is strict,
because then we know that the clause returns NULL overall.  Depending on
whether we're trying to prove or refute something, and whether we need
weak or strong semantics for NULL, this may be enough to prove the
implication, especially when we rely on the standard rule that "false
implies anything".  In particular, this lets us do something useful with
questions like "does X IN (1,3,5,NULL) imply X <= 5?"  The null entry
in the IN list can effectively be ignored for this purpose, but the
proof rules were not previously smart enough to deduce that.

This patch is by me, but it owes something to previous work by
Amit Langote to try to solve problems of the form mentioned.
Thanks also to Emre Hasegeli and Ashutosh Bapat for review.

Discussion: https://postgr.es/m/3bad48fc-f257-c445-feeb-8a2b2fb622ba@lab.ntt.co.jp
2018-03-21 18:30:46 -04:00
Andrew Gierth d2d79887ea Repair crash with unsortable grouping sets.
If there were multiple grouping sets, none of them empty, all of which
were unsortable, then an oversight in consider_groupingsets_paths led
to a null pointer dereference. Fix, and add a regression test for this
case.

Per report from Dang Minh Huong, though I didn't use their patch.

Backpatch to 10.x where hashed grouping sets were added.
2018-03-21 11:39:28 +00:00
Robert Haas 94150513ec Don't pass the grouping target around unnecessarily.
Since commit 4f15e5d09d made grouped_rel
set reltarget, a variety of other functions can just get it from
grouped_rel instead of having to pass it around explicitly.  Simplify
accordingly.

Patch by me, reviewed by Ashutosh Bapat.

Discussion: http://postgr.es/m/CA+TgmoZ+ZJTVad-=vEq393N99KTooxv9k7M+z73qnTAqkb49BQ@mail.gmail.com
2018-03-20 11:37:43 -04:00
Robert Haas b5996c2791 Determine grouping strategies in create_grouping_paths.
Partition-wise aggregate will call create_ordinary_grouping_paths
multiple times and we don't want to redo this work every time; have
the caller do it instead and pass the details down.

Patch by me, reviewed by Ashutosh Bapat.

Discussion: http://postgr.es/m/CA+TgmoY7VYYn9a7YHj1nJL6zj6BkHmt4K-un9LRmXkyqRZyynA@mail.gmail.com
2018-03-20 11:31:06 -04:00
Robert Haas 4f15e5d09d Defer creation of partially-grouped relation until it's needed.
This avoids unnecessarily creating a RelOptInfo for which we have no
actual need.  This idea is from Ashutosh Bapat, who wrote a very
different patch to accomplish a similar goal.  It will be more
important if and when we get partition-wise aggregate, since then
there could be many partially grouped relations all of which could
potentially be unnecessary.  In passing, this sets the grouping
relation's reltarget, which wasn't done previously but makes things
simpler for this refactoring.

Along the way, adjust things so that add_paths_to_partial_grouping_rel,
now renamed create_partial_grouping_paths, does not perform the Gather
or Gather Merge steps to generate non-partial paths from partial
paths; have the caller do it instead.  This is again for the
convenience of partition-wise aggregate, which wants to inject
additional partial paths are created and before we decide which ones
to Gather/Gather Merge.  This might seem like a separate change, but
it's actually pretty closely entangled; I couldn't really see much
value in separating it and having to change some things twice.

Patch by me, reviewed by Ashutosh Bapat.

Discussion: http://postgr.es/m/CA+TgmoZ+ZJTVad-=vEq393N99KTooxv9k7M+z73qnTAqkb49BQ@mail.gmail.com
2018-03-20 11:18:04 -04:00
Robert Haas c596fadbfe Generate a separate upper relation for each stage of setop planning.
Commit 3fc6e2d7f5 made setop planning
stages return paths rather than plans, but all such paths were loosely
associated with a single RelOptInfo, and only the final path was added
to the RelOptInfo.  Even at the time, it was foreseen that this should
be changed, because there is otherwise no good way for a single stage
of setop planning to return multiple paths.  With this patch, each
stage of set operation planning now creates a separate RelOptInfo;
these are distinguished by using appropriate relid sets.  Note that
this patch does nothing whatsoever about actually returning multiple
paths for the same set operation; it just makes it possible for a
future patch to do so.

Along the way, adjust things so that create_upper_paths_hook is called
for each of these new RelOptInfos rather than just once, since that
might be useful to extensions using that hook.  It might be a good
to provide an FDW API here as well, but I didn't try to do that for
now.

Patch by me, reviewed and tested by Ashutosh Bapat and Rajkumar
Raghuwanshi.

Discussion: http://postgr.es/m/CA+TgmoaLRAOqHmMZx=ESM3VDEPceg+-XXZsRXQ8GtFJO_zbMSw@mail.gmail.com
2018-03-19 11:55:38 -04:00
Robert Haas 49525c4630 Rewrite recurse_union_children to iterate, rather than recurse.
Also, rename it to plan_union_chidren, so the old name wasn't
very descriptive.  This results in a small net reduction in code,
seems at least to me to be easier to understand, and saves
space on the process stack.

Patch by me, reviewed and tested by Ashutosh Bapat and Rajkumar
Raghuwanshi.

Discussion: http://postgr.es/m/CA+TgmoaLRAOqHmMZx=ESM3VDEPceg+-XXZsRXQ8GtFJO_zbMSw@mail.gmail.com
2018-03-19 11:54:56 -04:00
Tom Lane 877cdf11ea Mop-up for letting VOID-returning SQL functions end with a SELECT.
Part of the intent in commit fd1a421fe was to allow SQL functions that are
declared to return VOID to contain anything, including an unrelated final
SELECT, the same as SQL-language procedures can.  However, the planner's
inlining logic didn't get that memo.  Fix it, and add some regression tests
covering this area, since evidently we had none.

In passing, clean up some typos in comments in create_function_3.sql,
and get rid of its none-too-safe assumption that DROP CASCADE notice
output is immutably ordered.

Per report from Prabhat Sahu.

Discussion: https://postgr.es/m/CANEvxPqxAj6nNHVcaXxpTeEFPmh24Whu+23emgjiuKrhJSct0A@mail.gmail.com
2018-03-16 12:48:13 -04:00
Robert Haas 1466bcfa4a Split create_grouping_paths into degenerate and non-degenerate cases.
There's no functional change here, or at least I hope there isn't,
just code rearrangement.  The rearrangement is motivated by
partition-wise aggregate, which doesn't need to consider the
degenerate case but wants to reuse the logic for the ordinary case.

Based loosely on a patch from Ashutosh Bapat and Jeevan Chalke, but I
whacked it around pretty heavily. The larger patch series of which
this patch is a part was also reviewed and tested by Antonin Houska,
Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik,
Pascal Legrand, Rafia Sabih, and me.

Discussion: http://postgr.es/m/CAFjFpRewpqCmVkwvq6qrRjmbMDpN0CZvRRzjd8UvncczA3Oz1Q@mail.gmail.com
2018-03-15 14:43:58 -04:00
Robert Haas 648a6c7bd8 Pass additional arguments to a couple of grouping-related functions.
get_number_of_groups() and make_partial_grouping_target() currently
fish information directly out of the PlannerInfo; in the former case,
the target list, and in the latter case, the HAVING qual.  This works
fine if there's only one grouping relation, but if the pending patch
for partition-wise aggregate gets committed, we'll have multiple
grouping relations and must therefore use appropriately translated
versions of these values for each one.  To make that simpler, pass the
values to be used as arguments.

Jeevan Chalke.  The larger patch series of which this patch is a part
was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi,
David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, Rafia
Sabih, and me.

Discussion: http://postgr.es/m/CAM2+6=UqFnFUypOvLdm5TgC+2M=-E0Q7_LOh0VDFFzmk2BBPzQ@mail.gmail.com
Discussion: http://postgr.es/m/CAM2+6=W+L=C4yBqMrgrfTfNtbtmr4T53-hZhwbA2kvbZ9VMrrw@mail.gmail.com
2018-03-15 11:33:52 -04:00
Stephen Frost 6b960aae90 Fix function-header comments in planner.c
In b5635948ab, a couple of function header comments weren't changed, or
weren't changed correctly, to reflect the arguments being passed into
the functions.  Specifically, get_number_of_groups() had the wrong
argument name in the commit and create_grouping_paths() wasn't
updated even though the arguments had been changed.

The issue with create_grouping_paths() was noticed by Ashutosh Bapat,
while I discovered the issue with get_number_of_groups() by looking to
see if there were any similar issues from that commit.

Discussion: https://postgr.es/m/CAFjFpRcbp4702jcp387PExt3fNCt62QJN8++DQGwBhsW6wRHWA@mail.gmail.com
2018-03-14 13:51:15 -04:00
Stephen Frost 1f7b8967ef Fix typo in add_paths_to_append_rel()
The comment should have been referring to the number of workers, not the
number of paths.

Author: Ashutosh Bapat
Discussion: https://postgr.es/m/CAFjFpRcbp4702jcp387PExt3fNCt62QJN8++DQGwBhsW6wRHWA@mail.gmail.com
2018-03-14 13:51:14 -04:00
Robert Haas 0927d2f46d Let Parallel Append over simple UNION ALL have partial subpaths.
A simple UNION ALL gets flattened into an appendrel of subquery
RTEs, but up until now it's been impossible for the appendrel to use
the partial paths for the subqueries, so we can implement the
appendrel as a Parallel Append but only one with non-partial paths
as children.

There are three separate obstacles to removing that limitation.
First, when planning a subquery, propagate any partial paths to the
final_rel so that they are potentially visible to outer query levels
(but not if they have initPlans attached, because that wouldn't be
safe).  Second, after planning a subquery, propagate any partial paths
for the final_rel to the subquery RTE in the outer query level in the
same way we do for non-partial paths.  Third, teach finalize_plan() to
account for the possibility that the fake parameter we use for rescan
signalling when the plan contains a Gather (Merge) node may be
propagated from an outer query level.

Patch by me, reviewed and tested by Amit Khandekar, Rajkumar
Raghuwanshi, and Ashutosh Bapat.  Test cases based on examples by
Rajkumar Raghuwanshi.

Discussion: http://postgr.es/m/CA+Tgmoa6L9A1nNCk3aTDVZLZ4KkHDn1+tm7mFyFvP+uQPS7bAg@mail.gmail.com
2018-03-13 16:34:08 -04:00
Tom Lane 4a4e2442a7 Fix improper uses of canonicalize_qual().
One of the things canonicalize_qual() does is to remove constant-NULL
subexpressions of top-level AND/OR clauses.  It does that on the assumption
that what it's given is a top-level WHERE clause, so that NULL can be
treated like FALSE.  Although this is documented down inside a subroutine
of canonicalize_qual(), it wasn't mentioned in the documentation of that
function itself, and some callers hadn't gotten that memo.

Notably, commit d007a9505 caused get_relation_constraints() to apply
canonicalize_qual() to CHECK constraints.  That allowed constraint
exclusion to misoptimize situations in which a CHECK constraint had a
provably-NULL subclause, as seen in the regression test case added here,
in which a child table that should be scanned is not.  (Although this
thinko is ancient, the test case doesn't fail before 9.2, for reasons
I've not bothered to track down in detail.  There may be related cases
that do fail before that.)

More recently, commit f0e44751d added an independent bug by applying
canonicalize_qual() to index expressions, which is even sillier since
those might not even be boolean.  If they are, though, I think this
could lead to making incorrect index entries for affected index
expressions in v10.  I haven't attempted to prove that though.

To fix, add an "is_check" parameter to canonicalize_qual() to specify
whether it should assume WHERE or CHECK semantics, and make it perform
NULL-elimination accordingly.  Adjust the callers to apply the right
semantics, or remove the call entirely in cases where it's not known
that the expression has one or the other semantics.  I also removed
the call in some cases involving partition expressions, where it should
be a no-op because such expressions should be canonical already ...
and was a no-op, independently of whether it could in principle have
done something, because it was being handed the qual in implicit-AND
format which isn't what it expects.  In HEAD, add an Assert to catch
that type of mistake in future.

This represents an API break for external callers of canonicalize_qual().
While that's intentional in HEAD to make such callers think about which
case applies to them, it seems like something we probably wouldn't be
thanked for in released branches.  Hence, in released branches, the
extra parameter is added to a new function canonicalize_qual_ext(),
and canonicalize_qual() is a wrapper that retains its old behavior.

Patch by me with suggestions from Dean Rasheed.  Back-patch to all
supported branches.

Discussion: https://postgr.es/m/24475.1520635069@sss.pgh.pa.us
2018-03-11 18:10:42 -04:00
Tom Lane 5748f3a0aa Improve predtest.c's internal docs, and enhance its functionality a bit.
Commit b08df9cab left things rather poorly documented as far as the
exact semantics of "clause_is_check" mode went.  Also, that mode did
not really work correctly for predicate_refuted_by; although given the
lack of specification as to what it should do, as well as the lack
of any actual use-case, that's perhaps not surprising.

Rename "clause_is_check" to "weak" proof mode, and provide specifications
for what it should do.  I defined weak refutation as meaning "truth of A
implies non-truth of B", which makes it possible to use the mode in the
part of relation_excluded_by_constraints that checks for mutually
contradictory WHERE clauses.  Fix up several places that did things wrong
for that definition.  (As far as I can see, these errors would only lead
to failure-to-prove, not incorrect claims of proof, making them not
serious bugs even aside from the fact that v10 contains no use of this
mode.  So there seems no need for back-patching.)

In addition, teach predicate_refuted_by_recurse that it can use
predicate_implied_by_recurse after all when processing a strong NOT-clause,
so long as it asks for the correct proof strength.  This is an optimization
that could have been included in commit b08df9cab, but wasn't.

Also, simplify and generalize the logic that checks for whether nullness of
the argument of IS [NOT] NULL would force overall nullness of the predicate
or clause.  (This results in a change in the partition_prune test's output,
as it is now able to prune an all-nulls partition that it did not recognize
before.)

In passing, in PartConstraintImpliedByRelConstraint, remove bogus
conversion of the constraint list to explicit-AND form and then right back
again; that accomplished nothing except forcing a useless extra level of
recursion inside predicate_implied_by.

Discussion: https://postgr.es/m/5983.1520487191@sss.pgh.pa.us
2018-03-09 16:58:26 -05:00
Robert Haas 960df2a971 Correctly assess parallel-safety of tlists when SRFs are used.
Since commit 69f4b9c85f, the existing
code was no longer assessing the parallel-safety of the real tlist
for each upper rel, but rather the first of possibly several tlists
created by split_pathtarget_at_srfs().  Repair.

Even though this is clearly wrong, it's not clear that it has any
user-visible consequences at the moment, so no back-patch for now.  If
we discover later that it does have user-visible consequences, we
might need to back-patch this to v10.

Patch by me, per a report from Rajkumar Raghuwanshi.

Discussion: http://postgr.es/m/CA+Tgmoaob_Strkg4Dcx=VyxnyXtrmkV=ofj=pX7gH9hSre-g0Q@mail.gmail.com
2018-03-08 14:25:31 -05:00
Peter Eisentraut 5b804cc168 Fix costing of parallel hash joins.
Commit 1804284042 established that single-batch
parallel-aware hash joins could create one large shared hash table using the
combined work_mem budget of all participants.  The costing accidentally
assumed that parallel-oblivious hash joins could also do that.  The
documentation for initial_cost_hashjoin() also failed to mention the new
argument.  Repair.

Author: Thomas Munro
Reported-By: Antonin Houska
Reviewed-By: Antonin Houska
Discussion: https://postgr.es/m/12441.1513935950%40localhost
2018-03-06 21:54:37 -05:00
Peter Eisentraut fd1a421fe6 Add prokind column, replacing proisagg and proiswindow
The new column distinguishes normal functions, procedures, aggregates,
and window functions.  This replaces the existing columns proisagg and
proiswindow, and replaces the convention that procedures are indicated
by prorettype == 0.  Also change prorettype to be VOIDOID for procedures.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
2018-03-02 13:48:33 -05:00
Tom Lane b5febc1d12 Fix IOS planning when only some index columns can return an attribute.
Since 9.5, it's possible that some but not all columns of an index
support returning the indexed value for index-only scans.  If the
same indexed column appears in index columns that behave both ways,
check_index_only() supposed that it'd be OK to do an index-only scan
testing that column; but that fails if we have to recheck the indexed
condition on one of the columns that doesn't support this.

In principle we could make this work by remapping the recheck expressions
to pull the value from a column that does support returning the indexed
value.  But such cases are so weird and rare that, at least for now,
it doesn't seem worth the trouble.  Instead, just teach check_index_only
that a value is returnable only if all the index columns containing it
are returnable, rather than any of them.

Per report from David Pereiro Lagares.  Back-patch to 9.5 where the
possibility of this situation appeared.

Kyotaro Horiguchi

Discussion: https://postgr.es/m/1516210494.1798.16.camel@nlpgo.com
2018-03-01 15:35:03 -05:00
Robert Haas 2af28e6033 For partitionwise join, match on partcollation, not parttypcoll.
The previous code considered two tables to have the partition scheme
if the underlying columns had the same collation, but what we
actually need to compare is not the collations associated with the
column but the collation used for partitioning.  Fix that.

Robert Haas and Amit Langote

Discussion: http://postgr.es/m/0f95f924-0efa-4cf5-eb5f-9a3d1bc3c33d@lab.ntt.co.jp
2018-02-28 12:16:09 -05:00
Robert Haas 5e6a63c0d1 Minor cleanup of code related to partially_grouped_rel.
Jeevan Chalke

Discussion: http://postgr.es/m/CAM2+6=X9kxQoL2ZqZ00E6asBt9z+rfyWbOmhXJ0+8fPAyMZ9Jg@mail.gmail.com
2018-02-27 13:23:50 -05:00
Robert Haas 3bfe957761 Fix logic error in add_paths_to_partial_grouping_rel.
Commit 3bf05e096b sometimes uses the
cheapest_partial_path variable in this function to mean the cheapest
one from the input rel and at other times the cheapest one from the
partially grouped rel, but it never resets it, so we can end up with
bad plans, leading to "ERROR: Aggref found in non-Agg plan node".

Jeevan Chalke, per a report from Andreas Joseph Krogh and a separate
off-list report from Rajkumar Raghuwanshi

Discussion: http://postgr.es/m/CAM2+6=X9kxQoL2ZqZ00E6asBt9z+rfyWbOmhXJ0+8fPAyMZ9Jg@mail.gmail.com
2018-02-27 13:23:50 -05:00
Robert Haas 3bf05e096b Add a new upper planner relation for partially-aggregated results.
Up until now, we've abused grouped_rel->partial_pathlist as a place to
store partial paths that have been partially aggregate, but that's
really not correct, because a partial path for a relation is supposed
to be one which produces the correct results with the addition of only
a Gather or Gather Merge node, and these paths also require a Finalize
Aggregate step.  Instead, add a new partially_group_rel which can hold
either partial paths (which need to be gathered and then have
aggregation finalized) or non-partial paths (which only need to have
aggregation finalized).  This allows us to reuse generate_gather_paths
for partially_grouped_rel instead of writing new code, so that this
patch actually basically no net new code while making things cleaner,
simplifying things for pending patches for partition-wise aggregate.

Robert Haas and Jeevan Chalke.  The larger patch series of which this
patch is a part was also reviewed and tested by Antonin Houska,
Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik,
Pascal Legrand, Rafia Sabih, and me.

Discussion: http://postgr.es/m/CA+TgmobrzFYS3+U8a_BCy3-hOvh5UyJbC18rEcYehxhpw5=ETA@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoZyQEjdBNuoG9-wC5GQ5GrO4544Myo13dVptvx+uLg9uQ@mail.gmail.com
2018-02-26 09:32:32 -05:00
Tom Lane 9afd513df0 Fix planner failures with overlapping mergejoin clauses in an outer join.
Given overlapping or partially redundant join clauses, for example
	t1 JOIN t2 ON t1.a = t2.x AND t1.b = t2.x
the planner's EquivalenceClass machinery will ordinarily refactor the
clauses as "t1.a = t1.b AND t1.a = t2.x", so that join processing doesn't
see multiple references to the same EquivalenceClass in a list of join
equality clauses.  However, if the join is outer, it's incorrect to derive
a restriction clause on the outer side from the join conditions, so the
clause refactoring does not happen and we end up with overlapping join
conditions.  The code that attempted to deal with such cases had several
subtle bugs, which could result in "left and right pathkeys do not match in
mergejoin" or "outer pathkeys do not match mergeclauses" planner errors,
if the selected join plan type was a mergejoin.  (It does not appear that
any actually incorrect plan could have been emitted.)

The core of the problem really was failure to recognize that the outer and
inner relations' pathkeys have different relationships to the mergeclause
list.  A join's mergeclause list is constructed by reference to the outer
pathkeys, so it will always be ordered the same as the outer pathkeys, but
this cannot be presumed true for the inner pathkeys.  If the inner sides of
the mergeclauses contain multiple references to the same EquivalenceClass
({t2.x} in the above example) then a simplistic rendering of the required
inner sort order is like "ORDER BY t2.x, t2.x", but the pathkey machinery
recognizes that the second sort column is redundant and throws it away.
The mergejoin planning code failed to account for that behavior properly.
One error was to try to generate cut-down versions of the mergeclause list
from cut-down versions of the inner pathkeys in the same way as the initial
construction of the mergeclause list from the outer pathkeys was done; this
could lead to choosing a mergeclause list that fails to match the outer
pathkeys.  The other problem was that the pathkey cross-checking code in
create_mergejoin_plan treated the inner and outer pathkey lists
identically, whereas actually the expectations for them must be different.
That led to false "pathkeys do not match" failures in some cases, and in
principle could have led to failure to detect bogus plans in other cases,
though there is no indication that such bogus plans could be generated.

Reported by Alexander Kuzmenkov, who also reviewed this patch.  This has
been broken for years (back to around 8.3 according to my testing), so
back-patch to all supported branches.

Discussion: https://postgr.es/m/5dad9160-4632-0e47-e120-8e2082000c01@postgrespro.ru
2018-02-23 13:47:33 -05:00
Robert Haas 7d8ac9814b Charge cpu_tuple_cost * 0.5 for Append and MergeAppend nodes.
Previously, Append didn't charge anything at all, and MergeAppend
charged only cpu_operator_cost, about half the value used here.  This
change might make MergeAppend plans slightly more likely to be chosen
than before, since this commit increases the assumed cost for Append
-- with default values -- by 0.005 per tuple but MergeAppend by only
0.0025 per tuple.  Since the comparisons required by MergeAppend are
costed separately, it's not clear why MergeAppend needs to be
otherwise more expensive than Append, so hopefully this is OK.

Prior to partition-wise join, it didn't really matter whether or not
an Append node had any cost of its own, because every plan had to use
the same number of Append or MergeAppend nodes and in the same places.
Only the relative cost of Append vs. MergeAppend made a difference.
Now, however, it is possible to avoid some of the Append nodes using a
partition-wise join, so it's worth making an effort.  Pending patches
for partition-wise aggregate care too, because an Append of Aggregate
nodes will incur the Append overhead fewer times than an Aggregate
over an Append.  Although in most cases this change will favor the use
of partition-wise techniques, it does the opposite when the join
cardinality is greater than the sum of the input cardinalities.  Since
this situation arises in an existing regression test, I [rhaas]
adjusted it to keep the overall plan shape approximately the same.

Jeevan Chalke, per a suggestion from David Rowley.  Reviewed by
Ashutosh Bapat.  Some changes by me.  The larger patch series of which
this patch is a part was also reviewed and tested by Antonin Houska,
Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik,
Pascal Legrand, Rafia Sabih, and me.

Discussion: http://postgr.es/m/CAKJS1f9UXdk6ZYyqbJnjFO9a9hyHKGW7B=ZRh-rxy9qxfPA5Gw@mail.gmail.com
2018-02-21 23:09:27 -05:00
Peter Eisentraut 2fb1abaeb0 Rename enable_partition_wise_join to enable_partitionwise_join
Discussion: https://www.postgresql.org/message-id/flat/ad24e4f4-6481-066e-e3fb-6ef4a3121882%402ndquadrant.com
2018-02-16 10:33:59 -05:00
Robert Haas 88ef48c1cc Fix parallel index builds for dynamic_shared_memory_type=none.
The previous code failed to realize that this setting effectively
disables parallelism, and would crash if it decided to attempt
parallelism anyway.  Instead, treat it as a disabling condition.

Kyotaro Horiguchi, who also reported the issue. Reviewed by Michael
Paquier and Peter Geoghegan.

Discussion: http://postgr.es/m/20180209.170635.256350357.horiguchi.kyotaro@lab.ntt.co.jp
2018-02-12 12:55:12 -05:00
Tom Lane 0a459cec96 Support all SQL:2011 options for window frame clauses.
This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING"
frame boundaries in window functions.  We'd punted on that back in the
original patch to add window functions, because it was not clear how to
do it in a reasonably data-type-extensible fashion.  That problem is
resolved here by adding the ability for btree operator classes to provide
an "in_range" support function that defines how to add or subtract the
RANGE offset value.  Factoring it this way also allows the operator class
to avoid overflow problems near the ends of the datatype's range, if it
wishes to expend effort on that.  (In the committed patch, the integer
opclasses handle that issue, but it did not seem worth the trouble to
avoid overflow failures for datetime types.)

The patch includes in_range support for the integer_ops opfamily
(int2/int4/int8) as well as the standard datetime types.  Support for
other numeric types has been requested, but that seems like suitable
material for a follow-on patch.

In addition, the patch adds GROUPS mode which counts the offset in
ORDER-BY peer groups rather than rows, and it adds the frame_exclusion
options specified by SQL:2011.  As far as I can see, we are now fully
up to spec on window framing options.

Existing behaviors remain unchanged, except that I changed the errcode
for a couple of existing error reports to meet the SQL spec's expectation
that negative "offset" values should be reported as SQLSTATE 22013.

Internally and in relevant parts of the documentation, we now consistently
use the terminology "offset PRECEDING/FOLLOWING" rather than "value
PRECEDING/FOLLOWING", since the term "value" is confusingly vague.

Oliver Ford, reviewed and whacked around some by me

Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
2018-02-07 00:06:56 -05:00
Robert Haas f069c91a57 Fix possible crash in partition-wise join.
The previous code assumed that we'd always succeed in creating
child-joins for a joinrel for which partition-wise join was considered,
but that's not guaranteed, at least in the case where dummy rels
are involved.

Ashutosh Bapat, with some wordsmithing by me.

Discussion: http://postgr.es/m/CAFjFpRf8=uyMYYfeTBjWDMs1tR5t--FgOe2vKZPULxxdYQ4RNw@mail.gmail.com
2018-02-05 17:31:57 -05:00
Robert Haas 9da0cc3528 Support parallel btree index builds.
To make this work, tuplesort.c and logtape.c must also support
parallelism, so this patch adds that infrastructure and then applies
it to the particular case of parallel btree index builds.  Testing
to date shows that this can often be 2-3x faster than a serial
index build.

The model for deciding how many workers to use is fairly primitive
at present, but it's better than not having the feature.  We can
refine it as we get more experience.

Peter Geoghegan with some help from Rushabh Lathia.  While Heikki
Linnakangas is not an author of this patch, he wrote other patches
without which this feature would not have been possible, and
therefore the release notes should possibly credit him as an author
of this feature.  Reviewed by Claudio Freire, Heikki Linnakangas,
Thomas Munro, Tels, Amit Kapila, me.

Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com
Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com
2018-02-02 13:32:44 -05:00
Tom Lane 35a528062c Add stack-overflow guards in set-operation planning.
create_plan_recurse lacked any stack depth check.  This is not per
our normal coding rules, but I'd supposed it was safe because earlier
planner processing is more complex and presumably should eat more
stack.  But bug #15033 from Andrew Grossman shows this isn't true,
at least not for queries having the form of a many-thousand-way
INTERSECT stack.

Further testing showed that recurse_set_operations is also capable
of being crashed in this way, since it likewise will recurse to the
bottom of a parsetree before calling any support functions that
might themselves contain any stack checks.  However, its stack
consumption is only perhaps a third of create_plan_recurse's.

It's possible that this particular problem with create_plan_recurse can
only manifest in 9.6 and later, since before that we didn't build a Path
tree for set operations.  But having seen this example, I now have no
faith in the proposition that create_plan_recurse doesn't need a stack
check, so back-patch to all supported branches.

Discussion: https://postgr.es/m/20180127050845.28812.58244@wrigleys.postgresql.org
2018-01-28 13:39:07 -05:00
Robert Haas 9fd8b7d632 Factor some code out of create_grouping_paths.
This is preparatory refactoring to prepare the way for partition-wise
aggregate, which will reuse the new subroutines for child grouping
rels.  It also does not seem like a bad idea on general principle,
as the function was getting pretty long.

Jeevan Chalke.  The larger patch series of which this patch is a part
was reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi,
Ashutosh Bapat, David Rowley, Dilip Kumar, Konstantin Knizhnik,
Pascal Legrand, and me.  Some cosmetic changes by me.

Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com
2018-01-26 15:03:12 -05:00
Alvaro Herrera 05fb5d6619 Ignore partitioned indexes where appropriate
get_relation_info() was too optimistic about opening indexes in
partitioned tables, which would raise errors when any queries were
planned on such tables.  Fix by ignoring any indexes of the partitioned
kind.

CLUSTER (and ALTER TABLE CLUSTER ON) had a similar problem.  Fix by
disallowing these commands in partitioned tables.

Fallout from 8b08f7d482.
2018-01-25 16:12:15 -03:00
Tom Lane bb94ce4d26 Teach reparameterize_path() to handle AppendPaths.
If we're inside a lateral subquery, there may be no unparameterized paths
for a particular child relation of an appendrel, in which case we *must*
be able to create similarly-parameterized paths for each other child
relation, else the planner will fail with "could not devise a query plan
for the given query".  This means that there are situations where we'd
better be able to reparameterize at least one path for each child.

This calls into question the assumption in reparameterize_path() that
it can just punt if it feels like it.  However, the only case that is
known broken right now is where the child is itself an appendrel so that
all its paths are AppendPaths.  (I think possibly I disregarded that in
the original coding on the theory that nested appendrels would get folded
together --- but that only happens *after* reparameterize_path(), so it's
not excused from handling a child AppendPath.)  Given that this code's been
like this since 9.3 when LATERAL was introduced, it seems likely we'd have
heard of other cases by now if there were a larger problem.

Per report from Elvis Pranskevichus.  Back-patch to 9.3.

Discussion: https://postgr.es/m/5981018.zdth1YWmNy@hammer.magicstack.net
2018-01-23 16:50:34 -05:00
Robert Haas 2f17844104 Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint.  In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one.  This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented.  (There is a pending patch to improve the
situation further, but it needs more review.)

Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me.  A few final revisions by me.

Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 15:33:06 -05:00
Bruce Momjian f033462d8f Reorder C includes
Reorder header files in joinrels.c and pathnode.c in alphabetical order,
removing unnecessary ones.

Author: Etsuro Fujita
2018-01-17 18:10:05 -05:00
Tom Lane 680d540502 Avoid unnecessary failure in SELECT concurrent with ALTER NO INHERIT.
If a query against an inheritance tree runs concurrently with an ALTER
TABLE that's disinheriting one of the tree members, it's possible to get
a "could not find inherited attribute" error because after obtaining lock
on the removed member, make_inh_translation_list sees that its columns
have attinhcount=0 and decides they aren't the columns it's looking for.

An ideal fix, perhaps, would avoid including such a just-removed member
table in the query at all; but there seems no way to accomplish that
without adding expensive catalog rechecks or creating a likelihood of
deadlocks.  Instead, let's just drop the check on attinhcount.  In this
way, a query that's included a just-disinherited child will still
succeed, which is not a completely unreasonable behavior.

This problem has existed for a long time, so back-patch to all supported
branches.  Also add an isolation test verifying related behaviors.

Patch by me; the new isolation test is based on Kyotaro Horiguchi's work.

Discussion: https://postgr.es/m/20170626.174612.23936762.horiguchi.kyotaro@lab.ntt.co.jp
2018-01-12 15:46:37 -05:00
Tom Lane 90947674fc Fix incorrect handling of subquery pullup in the presence of grouping sets.
If we flatten a subquery whose target list contains constants or
expressions, when those output columns are used in GROUPING SET columns,
the planner was capable of doing the wrong thing by merging a pulled-up
expression into the surrounding expression during const-simplification.
Then the late processing that attempts to match subexpressions to grouping
sets would fail to match those subexpressions to grouping sets, with the
effect that they'd not go to null when expected.

To fix, wrap such subquery outputs in PlaceHolderVars, ensuring that
they preserve their separate identity throughout the planner's expression
processing.  This is a bit of a band-aid, because the wrapper defeats
const-simplification even in places where it would be safe to allow.
But a nicer fix would likely be too invasive to back-patch, and the
consequences of the missed optimizations probably aren't large in most
cases.

Back-patch to 9.5 where grouping sets were introduced.

Heikki Linnakangas, with small mods and better test cases by me;
additional review by Andrew Gierth

Discussion: https://postgr.es/m/7dbdcf5c-b5a6-ef89-4958-da212fe10176@iki.fi
2018-01-12 12:24:50 -05:00
Bruce Momjian bdb70c12b3 C comment: fix "the the" mentions in C comments
Reported-by: Christoph Dreis

Discussion: https://postgr.es/m/007e01d3519e$2734ca10$759e5e30$@freenet.de

Author: Christoph Dreis
2018-01-11 21:50:21 -05:00
Peter Eisentraut 9e945f8626 Fix Latin spelling
"c.f." should be "cf.".
2018-01-11 08:32:01 -05:00
Robert Haas 2fd58096f0 Add missing "return" statement to accumulate_append_subpath.
Without this, Parallel Append can end up with extra children.

Report by Rajkumar Raghuwanshi.  Fix by Amit Khandekar.  Brown
paper bag bug by me.

Discussion: http://postgr.es/m/CAKcux6mBF-NiddyEe9LwymoUC5+wh8bQJ=uk2gGkOE+L8cv=LA@mail.gmail.com
2018-01-10 11:21:20 -05:00
Tom Lane 624e440a47 Improve the heuristic for ordering child paths of a parallel append.
Commit ab7271677 introduced code that attempts to order the child
scans of a Parallel Append node in a way that will minimize execution
time, based on total cost and startup cost.  However, it failed to
think hard about what to do when estimated costs are exactly equal;
a case that's particularly likely to occur when comparing on startup
cost.  In such a case the ordering of the child paths would be left
to the whims of qsort, an algorithm that isn't even stable.

We can improve matters by applying the rule used elsewhere in the
planner: if total costs are equal, sort on startup cost, and
vice versa.  When both cost estimates are exactly equal, rather
than letting qsort do something unpredictable, sort based on the
child paths' relids, which should typically result in sorting in
inheritance order.  (The latter provision requires inventing a
qsort-style comparator for bitmapsets, but maybe we'll have use
for that for other reasons in future.)

This results in a few plan changes in the select_parallel test,
but those all look more reasonable than before, when the actual
underlying cost numbers are taken into account.

Discussion: https://postgr.es/m/4944.1515446989@sss.pgh.pa.us
2018-01-09 13:07:52 -05:00
Robert Haas 63008b19ee Fix comment.
RELATION_IS_OTHER_TEMP is tested in the caller, not here.

Discussion: http://postgr.es/m/5A5438E4.3090709@lab.ntt.co.jp
2018-01-09 09:40:31 -05:00
Robert Haas c759395617 Code review for Parallel Append.
- Remove unnecessary #include mistakenly added in execnodes.h.
- Fix mistake in comment in choose_next_subplan_for_leader.
- Adjust row estimates in cost_append for a possibly-different
  parallel divisor.
- Clamp row estimates in cost_append after operations that may
  not produce integers.

Amit Kapila, with cosmetic adjustments by me.

Discussion: http://postgr.es/m/CAA4eK1+qcbeai3coPpRW=GFCzFeLUsuY4T-AKHqMjxpEGZBPQg@mail.gmail.com
2018-01-04 07:56:09 -05:00
Tom Lane 3decd150a2 Teach eval_const_expressions() to handle some more cases.
Add some infrastructure (mostly macros) to make it easier to write
typical cases for constant-expression simplification.  Add simplification
processing for ArrayRef, RowExpr, and ScalarArrayOpExpr node types,
which formerly went unsimplified even if all their inputs were constants.
Also teach it to simplify FieldSelect from a composite constant.
Make use of the new infrastructure to reduce the amount of code needed
for the existing ArrayExpr and ArrayCoerceExpr cases.

One existing test case changes output as a result of the fact that
RowExpr can now be folded to a constant.  All the new code is exercised
by existing test cases according to gcov, so I feel no need to add
additional tests.

Tom Lane, reviewed by Dmitry Dolgov

Discussion: https://postgr.es/m/3be3b82c-e29c-b674-2163-bf47d98817b1@iki.fi
2018-01-03 12:35:09 -05:00
Bruce Momjian 9d4649ca49 Update copyright for 2018
Backpatch-through: certain files through 9.3
2018-01-02 23:30:12 -05:00
Tom Lane c4c2885cbb Fix UNION/INTERSECT/EXCEPT over no columns.
Since 9.4, we've allowed the syntax "select union select" and variants
of that.  However, the planner wasn't expecting a no-column set operation
and ended up treating the set operation as if it were UNION ALL.

Turns out it's trivial to fix in v10 and later; we just need to be careful
about not generating a Sort node with no sort keys.  However, since a weird
corner case like this is never going to be exercised by developers, we'd
better have thorough regression tests if we want to consider it supported.

Per report from Victor Yegorov.

Discussion: https://postgr.es/m/CAGnEbojGJrRSOgJwNGM7JSJZpVAf8xXcVPbVrGdhbVEHZ-BUMw@mail.gmail.com
2017-12-22 12:08:06 -05:00
Tom Lane 6719b238e8 Rearrange execution of PARAM_EXTERN Params for plpgsql's benefit.
This patch does three interrelated things:

* Create a new expression execution step type EEOP_PARAM_CALLBACK
and add the infrastructure needed for add-on modules to generate that.
As discussed, the best control mechanism for that seems to be to add
another hook function to ParamListInfo, which will be called by
ExecInitExpr if it's supplied and a PARAM_EXTERN Param is found.
For stand-alone expressions, we add a new entry point to allow the
ParamListInfo to be specified directly, since it can't be retrieved
from the parent plan node's EState.

* Redesign the API for the ParamListInfo paramFetch hook so that the
ParamExternData array can be entirely virtual.  This also lets us get rid
of ParamListInfo.paramMask, instead leaving it to the paramFetch hook to
decide which param IDs should be accessible or not.  plpgsql_param_fetch
was already doing the identical masking check, so having callers do it too
seemed redundant.  While I was at it, I added a "speculative" flag to
paramFetch that the planner can specify as TRUE to avoid unwanted failures.
This solves an ancient problem for plpgsql that it couldn't provide values
of non-DTYPE_VAR variables to the planner for fear of triggering premature
"record not assigned yet" or "field not found" errors during planning.

* Rework plpgsql to get rid of the need for "unshared" parameter lists,
by dint of turning the single ParamListInfo per estate into a nearly
read-only data structure that doesn't instantiate any per-variable data.
Instead, the paramFetch hook controls access to per-variable data and can
make the right decisions on the fly, replacing the cases that we used to
need multiple ParamListInfos for.  This might perhaps have been a
performance loss on its own, but by using a paramCompile hook we can
bypass plpgsql_param_fetch entirely during normal query execution.
(It's now only called when, eg, we copy the ParamListInfo into a cursor
portal.  copyParamList() or SerializeParamList() effectively instantiate
the virtual parameter array as a simple physical array without a
paramFetch hook, which is what we want in those cases.)  This allows
reverting most of commit 6c82d8d1f, though I kept the cosmetic
code-consolidation aspects of that (eg the assign_simple_var function).

Performance testing shows this to be at worst a break-even change,
and it can provide wins ranging up to 20% in test cases involving
accesses to fields of "record" variables.  The fact that values of
such variables can now be exposed to the planner might produce wins
in some situations, too, but I've not pursued that angle.

In passing, remove the "parent" pointer from the arguments to
ExecInitExprRec and related functions, instead storing that pointer in a
transient field in ExprState.  The ParamListInfo pointer for a stand-alone
expression is handled the same way; we'd otherwise have had to add
yet another recursively-passed-down argument in expression compilation.

Discussion: https://postgr.es/m/32589.1513706441@sss.pgh.pa.us
2017-12-21 12:57:45 -05:00
Andres Freund 1804284042 Add parallel-aware hash joins.
Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel
Hash Join with Parallel Hash.  While hash joins could already appear in
parallel queries, they were previously always parallel-oblivious and had a
partial subplan only on the outer side, meaning that the work of the inner
subplan was duplicated in every worker.

After this commit, the planner will consider using a partial subplan on the
inner side too, using the Parallel Hash node to divide the work over the
available CPU cores and combine its results in shared memory.  If the join
needs to be split into multiple batches in order to respect work_mem, then
workers process different batches as much as possible and then work together
on the remaining batches.

The advantages of a parallel-aware hash join over a parallel-oblivious hash
join used in a parallel query are that it:

 * avoids wasting memory on duplicated hash tables
 * avoids wasting disk space on duplicated batch files
 * divides the work of building the hash table over the CPUs

One disadvantage is that there is some communication between the participating
CPUs which might outweigh the benefits of parallelism in the case of small
hash tables.  This is avoided by the planner's existing reluctance to supply
partial plans for small scans, but it may be necessary to estimate
synchronization costs in future if that situation changes.  Another is that
outer batch 0 must be written to disk if multiple batches are required.

A potential future advantage of parallel-aware hash joins is that right and
full outer joins could be supported, since there is a single set of matched
bits for each hashtable, but that is not yet implemented.

A new GUC enable_parallel_hash is defined to control the feature, defaulting
to on.

Author: Thomas Munro
Reviewed-By: Andres Freund, Robert Haas
Tested-By: Rafia Sabih, Prabhat Sahu
Discussion:
    https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
    https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com
2017-12-21 00:43:41 -08:00
Robert Haas 38fc54703e Re-fix wrong costing of Sort under Gather Merge.
Commit dc02c7bca4 changed this call
to create_sort_path() to take -1 rather than limit_tuples because,
at that time, there was no way for a Sort beneath a Gather Merge
to become a top-N sort.

Later, commit 3452dc5240 provided
a way for a Sort beneath a Gather Merge to become a top-N sort,
but failed to revert the previous commit in the process.  Do that.

Report and analysis by Jeff Janes; patch by Thomas Munro; review by
Amit Kapila and by me.

Discussion: http://postgr.es/m/CAEepm=1BWtC34vUroA0Uqjw02MaqdUrW+d6WD85_k8SLyPiKHQ@mail.gmail.com
2017-12-19 10:42:17 -05:00
Robert Haas d329dc2ea4 Remove bug from OPTIMIZER_DEBUG code for partition-wise join.
Etsuro Fujita, reviewed by Ashutosh Bapat

Discussion: http://postgr.es/m/5A2A60E6.6000008@lab.ntt.co.jp
2017-12-12 10:52:15 -05:00
Robert Haas ab72716778 Support Parallel Append plan nodes.
When we create an Append node, we can spread out the workers over the
subplans instead of piling on to each subplan one at a time, which
should typically be a bit more efficient, both because the startup
cost of any plan executed entirely by one worker is paid only once and
also because of reduced contention.  We can also construct Append
plans using a mix of partial and non-partial subplans, which may allow
for parallelism in places that otherwise couldn't support it.
Unfortunately, this patch doesn't handle the important case of
parallelizing UNION ALL by running each branch in a separate worker;
the executor infrastructure is added here, but more planner work is
needed.

Amit Khandekar, Robert Haas, Amul Sul, reviewed and tested by
Ashutosh Bapat, Amit Langote, Rafia Sabih, Amit Kapila, and
Rajkumar Raghuwanshi.

Discussion: http://postgr.es/m/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1+S+vRuUQ@mail.gmail.com
2017-12-05 17:28:39 -05:00
Robert Haas 1cbc17aaca Try to exclude partitioned tables in toto.
Ashutosh Bapat, reviewed by Jeevan Chalke.  Comment by me.

Discussion: http://postgr.es/m/CAFjFpRcuRaydz88CY_aQekmuvmN2A9ax5z0k=ppT+s8KS8xMRA@mail.gmail.com
2017-12-01 10:59:09 -05:00
Peter Eisentraut e4128ee767 SQL procedures
This adds a new object type "procedure" that is similar to a function
but does not have a return type and is invoked by the new CALL statement
instead of SELECT or similar.  This implementation is aligned with the
SQL standard and compatible with or similar to other SQL implementations.

This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well
as ALTER/DROP ROUTINE that can refer to either a function or a
procedure (or an aggregate function, as an extension to SQL).  There is
also support for procedures in various utility commands such as COMMENT
and GRANT, as well as support in pg_dump and psql.  Support for defining
procedures is available in all the languages supplied by the core
distribution.

While this commit is mainly syntax sugar around existing functionality,
future features will rely on having procedures as a separate object
type.

Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
2017-11-30 11:03:20 -05:00
Robert Haas 1761653bbb Make create_unique_path manage memory like mark_dummy_rel.
Put the unique path in the same context as the owning RelOptInfo, rather
than the toplevel planner context.  This is how this function worked
originally, but commit f41803bb39
changed it without explanation.  mark_dummy_rel adopted the older (or
newer?) technique in commit eca75a12a2,
which also featured a much better explanation of why it is correct.
So, switch back to that technique here, with the same explanation
given there.

Although this fixes a possible memory leak when GEQO is in use, the
leak is minor and probably nobody cares, so no back-patch.

Ashutosh Bapat, reviewed by Tom Lane and by me

Discussion: http://postgr.es/m/CAFjFpRcXkHHrXyD9BCvkgGJV4TnHG2SWJ0PhJfrDu3NAcQvh7g@mail.gmail.com
2017-11-30 09:50:10 -05:00
Robert Haas eaedf0df71 Update typedefs.list and re-run pgindent
Discussion: http://postgr.es/m/CA+TgmoaA9=1RWKtBWpDaj+sF3Stgc8sHgf5z=KGtbjwPLQVDMA@mail.gmail.com
2017-11-29 09:24:24 -05:00
Tom Lane 801386af62 Clarify old comment about qual_is_pushdown_safe's handling of subplans.
This comment glossed over the difference between initplans and subplans,
but they are indeed different for our purposes here.
2017-11-28 23:32:17 -05:00
Tom Lane 9a785ad573 Fix creation of resjunk tlist entries for inherited mixed UPDATE/DELETE.
rewriteTargetListUD's processing is dependent on the relkind of the query's
target table.  That was fine at the time it was made to act that way, even
for queries on inheritance trees, because all tables in an inheritance tree
would necessarily be plain tables.  However, the 9.5 feature addition
allowing some members of an inheritance tree to be foreign tables broke the
assumption that rewriteTargetListUD's output tlist could be applied to all
child tables with nothing more than column-number mapping.  This led to
visible failures if foreign child tables had row-level triggers, and would
also break in cases where child tables belonged to FDWs that used methods
other than CTID for row identification.

To fix, delay running rewriteTargetListUD until after the planner has
expanded inheritance, so that it is applied separately to the (already
mapped) tlist for each child table.  We can conveniently call it from
preprocess_targetlist.  Refactor associated code slightly to avoid the
need to heap_open the target relation multiple times during
preprocess_targetlist.  (The APIs remain a bit ugly, particularly around
the point of which steps scribble on parse->targetList and which don't.
But avoiding such scribbling would require a change in FDW callback APIs,
which is more pain than it's worth.)

Also fix ExecModifyTable to ensure that "tupleid" is reset to NULL when
we transition from rows providing a CTID to rows that don't.  (That's
really an independent bug, but it manifests in much the same cases.)

Add a regression test checking one manifestation of this problem, which
was that row-level triggers on a foreign child table did not work right.

Back-patch to 9.5 where the problem was introduced.

Etsuro Fujita, reviewed by Ildus Kurbangaliev and Ashutosh Bapat

Discussion: https://postgr.es/m/20170514150525.0346ba72@postgrespro.ru
2017-11-27 17:54:07 -05:00
Tom Lane df3a66e282 Improve planner's handling of set-returning functions in grouping columns.
Improve query_is_distinct_for() to accept SRFs in the targetlist when
we can prove distinctness from a DISTINCT clause.  In that case the
de-duplication will surely happen after SRF expansion, so the proof
still works.  Continue to punt in the case where we'd try to prove
distinctness from GROUP BY (or, in the future, source relations).
To do that, we'd have to determine whether the SRFs were in the
grouping columns or elsewhere in the tlist, and it still doesn't
seem worth the trouble.  But this trivial change allows us to
recognize that "SELECT DISTINCT unnest(foo) FROM ..." produces
unique-ified output, which seems worth having.

Also, fix estimate_num_groups() to consider the possibility of SRFs in
the grouping columns.  Its failure to do so was masked before v10 because
grouping_planner() scaled up plan rowcount estimates by the estimated SRF
multiplier after performing grouping.  That doesn't happen anymore, which
is more correct, but it means we need an adjustment in the estimate for
the number of groups.  Failure to do this leads to an underestimate for
the number of output rows of subqueries like "SELECT DISTINCT unnest(foo)"
compared to what 9.6 and earlier estimated, thus breaking plan choices
in some cases.

Per report from Dmitry Shalashov.  Back-patch to v10 to avoid degraded
plan choices compared to previous releases.

Discussion: https://postgr.es/m/CAKPeCUGAeHgoh5O=SvcQxREVkoX7UdeJUMj1F5=aBNvoTa+O8w@mail.gmail.com
2017-11-25 11:48:09 -05:00
Robert Haas e89a71fb44 Pass InitPlan values to workers via Gather (Merge).
If a PARAM_EXEC parameter is used below a Gather (Merge) but the InitPlan
that computes it is attached to or above the Gather (Merge), force the
value to be computed before starting parallelism and pass it down to all
workers.  This allows us to use parallelism in cases where it previously
would have had to be rejected as unsafe.  We do - in this case - lose the
optimization that the value is only computed if it's actually used.  An
alternative strategy would be to have the first worker that needs the value
compute it, but one downside of that approach is that we'd then need to
select a parallel-safe path to compute the parameter value; it couldn't for
example contain a Gather (Merge) node.  At some point in the future, we
might want to consider both approaches.

Independent of that consideration, there is a great deal more work that
could be done to make more kinds of PARAM_EXEC parameters parallel-safe.
This infrastructure could be used to allow a Gather (Merge) on the inner
side of a nested loop (although that's not a very appealing plan) and
cases where the InitPlan is attached below the Gather (Merge) could be
addressed as well using various techniques.  But this is a good start.

Amit Kapila, reviewed and revised by me.  Reviewing and testing from
Kuntal Ghosh, Haribabu Kommi, and Tushar Ahuja.

Discussion: http://postgr.es/m/CAA4eK1LV0Y1AUV4cUCdC+sYOx0Z0-8NAJ2Pd9=UKsbQ5Sr7+JQ@mail.gmail.com
2017-11-16 12:06:14 -05:00
Robert Haas e5253fdc4f Add parallel_leader_participation GUC.
Sometimes, for testing, it's useful to have the leader do nothing but
read tuples from workers; and it's possible that could work out better
even in production.

Thomas Munro, reviewed by Amit Kapila and by me.  A few final tweaks
by me.

Discussion: http://postgr.es/m/CAEepm=2U++Lp3bNTv2Bv_kkr5NE2pOyHhxU=G0YTa4ZhSYhHiw@mail.gmail.com
2017-11-15 08:23:18 -05:00
Robert Haas 44ae64c388 Push target list evaluation through Gather Merge.
We already do this for Gather, but it got overlooked for Gather Merge.

Amit Kapila, with review and minor revisions by Rushabh Lathia
and by me.

Discussion: http://postgr.es/m/CAA4eK1KUC5Uyu7qaifxrjpHxbSeoQh3yzwN3bThnJsmJcZ-qtA@mail.gmail.com
2017-11-13 16:37:42 -05:00
Robert Haas e64861c79b Track in the plan the types associated with PARAM_EXEC parameters.
Up until now, we only tracked the number of parameters, which was
sufficient to allocate an array of Datums of the appropriate size,
but not sufficient to, for example, know how to serialize a Datum
stored in one of those slots.  An upcoming patch wants to do that,
so add this tracking to make it possible.

Patch by me, reviewed by Tom Lane and Amit Kapila.

Discussion: http://postgr.es/m/CA+TgmoYqpxDKn8koHdW8BEKk8FMUL0=e8m2Qe=M+r0UBjr3tuQ@mail.gmail.com
2017-11-13 15:24:12 -05:00
Robert Haas 5edc63bda6 Account for the effect of lossy pages when costing bitmap scans.
Dilip Kumar, reviewed by Alexander Kumenkov, Amul Sul, and me.
Some final adjustments by me.

Discussion: http://postgr.es/m/CAFiTN-sYtqUOXQ4SpuhTv0Z9gD0si3YxZGv_PQAAMX8qbOotcg@mail.gmail.com
2017-11-10 16:50:50 -05:00
Robert Haas b9941d3468 Fix incorrect comment.
Etsuro Fujita

Discussion: http://postgr.es/m/5A05728E.4050009@lab.ntt.co.jp
2017-11-10 10:55:09 -05:00
Robert Haas 1aba8e651a Add hash partitioning.
Hash partitioning is useful when you want to partition a growing data
set evenly.  This can be useful to keep table sizes reasonable, which
makes maintenance operations such as VACUUM faster, or to enable
partition-wise join.

At present, we still depend on constraint exclusion for partitioning
pruning, and the shape of the partition constraints for hash
partitioning is such that that doesn't work.  Work is underway to fix
that, which should both improve performance and make partitioning
pruning work with hash partitioning.

Amul Sul, reviewed and tested by Dilip Kumar, Ashutosh Bapat, Yugo
Nagata, Rajkumar Raghuwanshi, Jesper Pedersen, and by me.  A few
final tweaks also by me.

Discussion: http://postgr.es/m/CAAJ_b96fhpJAP=ALbETmeLk1Uni_GFZD938zgenhF49qgDTjaQ@mail.gmail.com
2017-11-09 18:07:44 -05:00
Peter Eisentraut 2eb4a831e5 Change TRUE/FALSE to true/false
The lower case spellings are C and C++ standard and are used in most
parts of the PostgreSQL sources.  The upper case spellings are only used
in some files/modules.  So standardize on the standard spellings.

The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so
those are left as is when using those APIs.

In code comments, we use the lower-case spelling for the C concepts and
keep the upper-case spelling for the SQL concepts.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2017-11-08 11:37:28 -05:00
Tom Lane 7b6c075471 Teach planner to account for HAVING quals in aggregation plan nodes.
For some reason, we have never accounted for either the evaluation cost
or the selectivity of filter conditions attached to Agg and Group nodes
(which, in practice, are always conditions from a HAVING clause).

Applying our regular selectivity logic to post-grouping conditions is a
bit bogus, but it's surely better than taking the selectivity as 1.0.
Perhaps someday the extended-statistics mechanism can be taught to provide
statistics that would help us in getting non-default estimates here.

Per a gripe from Benjamin Coutu.  This is surely a bug fix, but I'm
hesitant to back-patch because of the prospect of destabilizing existing
plan choices.  Given that it took us this long to notice the bug, it's
probably not hurting too many people in the field.

Discussion: https://postgr.es/m/20968.1509486337@sss.pgh.pa.us
2017-11-02 11:24:12 -04:00
Tom Lane 7c70996ebf Allow bitmap scans to operate as index-only scans when possible.
If we don't have to return any columns from heap tuples, and there's
no need to recheck qual conditions, and the heap page is all-visible,
then we can skip fetching the heap page altogether.

Skip prefetching pages too, when possible, on the assumption that the
recheck flag will remain the same from one page to the next.  While that
assumption is hardly bulletproof, it seems like a good bet most of the
time, and better than prefetching pages we don't need.

This commit installs the executor infrastructure, but doesn't change
any planner cost estimates, thus possibly causing bitmap scans to
not be chosen in cases where this change renders them the best choice.
I (tgl) am not entirely convinced that we need to account for this
behavior in the planner, because I think typically the bitmap scan would
get chosen anyway if it's the best bet.  In any case the submitted patch
took way too many shortcuts, resulting in too many clearly-bad choices,
to be committable.

Alexander Kuzmenkov, reviewed by Alexey Chernyshov, and whacked around
rather heavily by me.

Discussion: https://postgr.es/m/239a8955-c0fc-f506-026d-c837e86c827b@postgrespro.ru
2017-11-01 17:38:20 -04:00
Robert Haas cf7ab13bfb Fix code related to partitioning schemes for dropped columns.
The entry in appinfo->translated_vars can be NULL; if so, we must avoid
dereferencing it.

Ashutosh Bapat

Discussion: http://postgr.es/m/CAFjFpReL7+1ien=-21rhjpO3bV7aAm1rQ8XgLVk2csFagSzpZQ@mail.gmail.com
2017-10-31 14:43:05 +05:30
Robert Haas 24fd674a1a Fix grammar.
Etsuro Fujita

Discussion: http://postgr.es/m/cc7767b6-6a1b-74a2-8b3c-48b8e64c12ed@lab.ntt.co.jp
2017-10-28 11:14:23 +02:00
Robert Haas 682ce911f8 Allow parallel query for prepared statements with generic plans.
This was always intended to work, but due to an oversight in
max_parallel_hazard_walker, it didn't.  In testing, we missed the
fact that it was only working for custom plans, where the parameter
value has been substituted for the parameter itself early enough
that everything worked.  In a generic plan, the Param node survives
and must be treated as parallel-safe.  SerializeParamList provides
for the transmission of parameter values to workers.

Amit Kapila with help from Kuntal Ghosh.  Some changes by me.

Discussion: http://postgr.es/m/CAA4eK1+_BuZrmVCeua5Eqnm4Co9DAXdM5HPAOE2J19ePbR912Q@mail.gmail.com
2017-10-27 22:22:39 +02:00
Tom Lane 37a795a60b Support domains over composite types.
This is the last major omission in our domains feature: you can now
make a domain over anything that's not a pseudotype.

The major complication from an implementation standpoint is that places
that might be creating tuples of a domain type now need to be prepared
to apply domain_check().  It seems better that unprepared code fail
with an error like "<type> is not composite" than that it silently fail
to apply domain constraints.  Therefore, relevant infrastructure like
get_func_result_type() and lookup_rowtype_tupdesc() has been adjusted
to treat domain-over-composite as a distinct case that unprepared code
won't recognize, rather than just transparently treating it the same
as plain composite.  This isn't a 100% solution to the possibility of
overlooked domain checks, but it catches most places.

In passing, improve typcache.c's support for domains (it can now cache
the identity of a domain's base type), and rewrite the argument handling
logic in jsonfuncs.c's populate_record[set]_worker to reduce duplicative
per-call lookups.

I believe this is code-complete so far as the core and contrib code go.
The PLs need varying amounts of work, which will be tackled in followup
patches.

Discussion: https://postgr.es/m/4206.1499798337@sss.pgh.pa.us
2017-10-26 13:47:45 -04:00
Tom Lane 08f1e1f0a4 Make setrefs.c match by ressortgroupref even for plain Vars.
Previously, we skipped using search_indexed_tlist_for_sortgroupref()
if the tlist expression being sought in the child plan node was merely
a Var.  This is purely an optimization, based on the theory that
search_indexed_tlist_for_var() is faster, and one copy of a Var should
be as good as another.  However, the GROUPING SETS patch broke the
latter assumption: grouping columns containing the "same" Var can
sometimes have different outputs, as shown in the test case added here.
So do it the hard way whenever a ressortgroupref marking exists.

(If this seems like a bottleneck, we could imagine building a tlist index
data structure for ressortgroupref values, as we do for Vars.  But I'll
let that idea go until there's some evidence it's worthwhile.)

Back-patch to 9.6.  The problem also exists in 9.5 where GROUPING SETS
came in, but this patch is insufficient to resolve the problem in 9.5:
there is some obscure dependency on the upper-planner-pathification
work that happened in 9.6.  Given that this is such a weird corner case,
and no end users have complained about it, it doesn't seem worth the work
to develop a fix for 9.5.

Patch by me, per a report from Heikki Linnakangas.  (This does not fix
Heikki's original complaint, just the follow-on one.)

Discussion: https://postgr.es/m/aefc657e-edb2-64d5-6df1-a0828f6e9104@iki.fi
2017-10-26 12:17:40 -04:00
Tom Lane 896eb5efbd In the planner, delete joinaliasvars lists after we're done with them.
Although joinaliasvars lists coming out of the parser are quite simple,
those lists can contain arbitrarily complex expressions after subquery
pullup.  We do not perform expression preprocessing on them, meaning that
expressions in those lists will not meet the expectations of later phases
of the planner (for example, that they do not contain SubLinks).  This had
been thought pretty harmless, since we don't intentionally touch those
lists in later phases --- but Andreas Seltenreich found a case in which
adjust_appendrel_attrs() could recurse into a joinaliasvars list and then
die on its assertion that it never sees a SubLink.  We considered a couple
of localized fixes to prevent that specific case from looking at the
joinaliasvars lists, but really this seems like a generic hazard for all
expression processing in the planner.  Therefore, probably the best answer
is to delete the joinaliasvars lists from the parsetree at the end of
expression preprocessing, so that there are no reachable expressions that
haven't been through preprocessing.

The case Andreas found seems to be harmless in non-Assert builds, and so
far there are no field reports suggesting that there are user-visible
effects in other cases.  I considered back-patching this anyway, but
it turns out that Andreas' test doesn't fail at all in 9.4-9.6, because
in those versions adjust_appendrel_attrs contains code (added in commit
842faa714 and removed again in commit 215b43cdc) to process SubLinks
rather than complain about them.  Barring discovery of another path by
which unprocessed joinaliasvars lists can cause trouble, the most
prudent compromise seems to be to patch this into v10 but not further.

Patch by me, with thanks to Amit Langote for initial investigation
and review.

Discussion: https://postgr.es/m/87r2tvt9f1.fsf@ansel.ydns.eu
2017-10-24 18:42:47 -04:00
Magnus Hagander 752871b6de Fix typos
David Rowley
2017-10-19 13:58:30 +02:00
Robert Haas 6393613b6a Fix possible crash with Parallel Bitmap Heap Scan.
If a Parallel Bitmap Heap scan's chain of leftmost descendents
includes a BitmapOr whose first child is a BitmapAnd, the prior coding
would mistakenly create a non-shared TIDBitmap and then try to perform
shared iteration.

Report by Tomas Vondra.  Patch by Dilip Kumar.

Discussion: http://postgr.es/m/50e89684-8ad9-dead-8767-c9545bafd3b6@2ndquadrant.com
2017-10-13 15:02:45 -04:00
Tom Lane 8ec5429e2f Reduce "X = X" to "X IS NOT NULL", if it's easy to do so.
If the operator is a strict btree equality operator, and X isn't volatile,
then the clause must yield true for any non-null value of X, or null if X
is null.  At top level of a WHERE clause, we can ignore the distinction
between false and null results, so it's valid to simplify the clause to
"X IS NOT NULL".  This is a useful improvement mainly because we'll get
a far better selectivity estimate in most cases.

Because such cases seldom arise in well-written queries, it is unappetizing
to expend a lot of planner cycles looking for them ... but it turns out
that there's a place we can shoehorn this in practically for free, because
equivclass.c already has to detect and reject candidate equivalences of the
form X = X.  That doesn't catch every place that it would be valid to
simplify to X IS NOT NULL, but it catches the typical case.  Working harder
doesn't seem justified.

Patch by me, reviewed by Petr Jelinek

Discussion: https://postgr.es/m/CAMjNa7cC4X9YR-vAJS-jSYCajhRDvJQnN7m2sLH1wLh-_Z2bsw@mail.gmail.com
2017-10-08 12:23:32 -04:00
Robert Haas 45866c7550 Copy information from the relcache instead of pointing to it.
We have the relations continuously locked, but not open, so relcache
pointers are not guaranteed to be stable.  Per buildfarm member
prion.

Ashutosh Bapat.  I fixed a typo.

Discussion: http://postgr.es/m/CAFjFpRcRBqoKLZSNmRsjKr81uEP=ennvqSQaXVCCBTXvJ2rW+Q@mail.gmail.com
2017-10-06 15:28:07 -04:00
Robert Haas f49842d1ee Basic partition-wise join functionality.
Instead of joining two partitioned tables in their entirety we can, if
it is an equi-join on the partition keys, join the matching partitions
individually.  This involves teaching the planner about "other join"
rels, which are related to regular join rels in the same way that
other member rels are related to baserels.  This can use significantly
more CPU time and memory than regular join planning, because there may
now be a set of "other" rels not only for every base relation but also
for every join relation.  In most practical cases, this probably
shouldn't be a problem, because (1) it's probably unusual to join many
tables each with many partitions using the partition keys for all
joins and (2) if you do that scenario then you probably have a big
enough machine to handle the increased memory cost of planning and (3)
the resulting plan is highly likely to be better, so what you spend in
planning you'll make up on the execution side.  All the same, for now,
turn this feature off by default.

Currently, we can only perform joins between two tables whose
partitioning schemes are absolutely identical.  It would be nice to
cope with other scenarios, such as extra partitions on one side or the
other with no match on the other side, but that will have to wait for
a future patch.

Ashutosh Bapat, reviewed and tested by Rajkumar Raghuwanshi, Amit
Langote, Rafia Sabih, Thomas Munro, Dilip Kumar, Antonin Houska, Amit
Khandekar, and by me.  A few final adjustments by me.

Discussion: http://postgr.es/m/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@mail.gmail.com
Discussion: http://postgr.es/m/CAFjFpRcitjfrULr5jfuKWRPsGUX0LQ0k8-yG0Qw2+1LBGNpMdw@mail.gmail.com
2017-10-06 11:11:10 -04:00
Robert Haas e9baa5e9fa Allow DML commands that create tables to use parallel query.
Haribabu Kommi, reviewed by Dilip Kumar and Rafia Sabih.  Various
cosmetic changes by me to explain why this appears to be safe but
allowing inserts in parallel mode in general wouldn't be.  Also, I
removed the REFRESH MATERIALIZED VIEW case from Haribabu's patch,
since I'm not convinced that case is OK, and hacked on the
documentation somewhat.

Discussion: http://postgr.es/m/CAJrrPGdo5bak6qnPWe8Kpi8g_jfQEs-G4SYmG9y+OFaw2-dPvA@mail.gmail.com
2017-10-05 11:40:48 -04:00
Tom Lane c12d570fa1 Support arrays over domains.
Allowing arrays with a domain type as their element type was left un-done
in the original domain patch, but not for any very good reason.  This
omission leads to such surprising results as array_agg() not working on
a domain column, because the parser can't identify a suitable output type
for the polymorphic aggregate.

In order to fix this, first clean up the APIs of coerce_to_domain() and
some internal functions in parse_coerce.c so that we consistently pass
around a CoercionContext along with CoercionForm.  Previously, we sometimes
passed an "isExplicit" boolean flag instead, which is strictly less
information; and coerce_to_domain() didn't even get that, but instead had
to reverse-engineer isExplicit from CoercionForm.  That's contrary to the
documentation in primnodes.h that says that CoercionForm only affects
display and not semantics.  I don't think this change fixes any live bugs,
but it makes things more consistent.  The main reason for doing it though
is that now build_coercion_expression() receives ccontext, which it needs
in order to be able to recursively invoke coerce_to_target_type().

Next, reimplement ArrayCoerceExpr so that the node does not directly know
any details of what has to be done to the individual array elements while
performing the array coercion.  Instead, the per-element processing is
represented by a sub-expression whose input is a source array element and
whose output is a target array element.  This simplifies life in
parse_coerce.c, because it can build that sub-expression by a recursive
invocation of coerce_to_target_type().  The executor now handles the
per-element processing as a compiled expression instead of hard-wired code.
The main advantage of this is that we can use a single ArrayCoerceExpr to
handle as many as three successive steps per element: base type conversion,
typmod coercion, and domain constraint checking.  The old code used two
stacked ArrayCoerceExprs to handle type + typmod coercion, which was pretty
inefficient, and adding yet another array deconstruction to do domain
constraint checking seemed very unappetizing.

In the case where we just need a single, very simple coercion function,
doing this straightforwardly leads to a noticeable increase in the
per-array-element runtime cost.  Hence, add an additional shortcut evalfunc
in execExprInterp.c that skips unnecessary overhead for that specific form
of expression.  The runtime speed of simple cases is within 1% or so of
where it was before, while cases that previously required two levels of
array processing are significantly faster.

Finally, create an implicit array type for every domain type, as we do for
base types, enums, etc.  Everything except the array-coercion case seems
to just work without further effort.

Tom Lane, reviewed by Andrew Dunstan

Discussion: https://postgr.es/m/9852.1499791473@sss.pgh.pa.us
2017-09-30 13:40:56 -04:00
Andrew Dunstan 28ae524bbf Quieten warnings about unused variables
These variables are only ever written to in assertion-enabled builds,
and the latest Microsoft compilers complain about such variables in
non-assertion-enabled builds.

Apparently they don't worry so much about variables that are written to
but not read from, so most of our PG_USED_FOR_ASSERTS_ONLY variables
don't cause the problem.

Discussion: https://postgr.es/m/7800.1505950322@sss.pgh.pa.us
2017-09-21 08:41:14 -04:00
Robert Haas 9140cf8269 Associate partitioning information with each RelOptInfo.
This is not used for anything yet, but it is necessary infrastructure
for partition-wise join and for partition pruning without constraint
exclusion.

Ashutosh Bapat, reviewed by Amit Langote and with quite a few changes,
mostly cosmetic, by me.  Additional review and testing of this patch
series by Antonin Houska, Amit Khandekar, Rafia Sabih, Rajkumar
Raghuwanshi, Thomas Munro, and Dilip Kumar.

Discussion: http://postgr.es/m/CAFjFpRfneFG3H+F6BaiXemMrKF+FY-POpx3Ocy+RiH3yBmXSNw@mail.gmail.com
2017-09-20 23:39:13 -04:00
Robert Haas 57eebca03a Fix create_lateral_join_info to handle dead relations properly.
Commit 0a480502b0 broke it.

Report by Andreas Seltenreich.  Fix by Ashutosh Bapat.

Discussion: http://postgr.es/m/874ls2vrnx.fsf@ansel.ydns.eu
2017-09-20 10:20:10 -04:00
Robert Haas 7f3a3312ab Fix typo.
Thomas Munro

Discussion: http://postgr.es/m/CAEepm=2j-HAgnBUrAazwS0ry7Z_ihk+d7g+Ye3u99+6WbiGt_Q@mail.gmail.com
2017-09-20 10:07:53 -04:00
Tom Lane 6f44fe7f12 Allow rel_is_distinct_for() to look through RelabelType below OpExpr.
This lets it do the right thing for, eg, varchar columns.
Back-patch to 9.5 where this logic appeared.

David Rowley, per report from Kim Rose Carlsen

Discussion: https://postgr.es/m/VI1PR05MB17091F9A9876528055D6A827C76D0@VI1PR05MB1709.eurprd05.prod.outlook.com
2017-09-17 15:28:51 -04:00
Robert Haas 0a480502b0 Expand partitioned table RTEs level by level, without flattening.
Flattening the partitioning hierarchy at this stage makes various
desirable optimizations difficult.  The original use case for this
patch was partition-wise join, which wants to match up the partitions
in one partitioning hierarchy with those in another such hierarchy.
However, it now seems that it will also be useful in making partition
pruning work using the PartitionDesc rather than constraint exclusion,
because with a flattened expansion, we have no easy way to figure out
which PartitionDescs apply to which leaf tables in a multi-level
partition hierarchy.

As it turns out, we end up creating both rte->inh and !rte->inh RTEs
for each intermediate partitioned table, just as we previously did for
the root table.  This seems unnecessary since the partitioned tables
have no storage and are not scanned.  We might want to go back and
rejigger things so that no partitioned tables (including the parent)
need !rte->inh RTEs, but that seems to require some adjustments not
related to the core purpose of this patch.

Ashutosh Bapat, reviewed by me and by Amit Langote.  Some final
adjustments by me.

Discussion: http://postgr.es/m/CAFjFpRd=1venqLL7oGU=C1dEkuvk2DJgvF+7uKbnPHaum1mvHQ@mail.gmail.com
2017-09-14 15:41:08 -04:00
Robert Haas 77b6b5e9ce Make RelationGetPartitionDispatchInfo expand depth-first.
With this change, the order of leaf partitions as returned by
RelationGetPartitionDispatchInfo should now be the same as the
order used by expand_inherited_rtentry.  This will make it simpler
for future patches to match up the partition dispatch information
with the planner data structures.  The new code is also, in my
opinion anyway, simpler and easier to understand.

Amit Langote, reviewed by Amit Khandekar.  I also reviewed and
made a few cosmetic revisions.

Discussion: http://postgr.es/m/d98d4761-5071-1762-501e-0e15047c714b@lab.ntt.co.jp
2017-09-14 12:28:50 -04:00
Robert Haas 1555566d9e Set partitioned_rels appropriately when UNION ALL is used.
In most cases, this omission won't matter, because the appropriate
locks will have been acquired during parse/plan or by AcquireExecutorLocks.
But it's a bug all the same.

Report by Ashutosh Bapat.  Patch by me, reviewed by Amit Langote.

Discussion: http://postgr.es/m/CAFjFpRdHb_ZnoDTuBXqrudWXh3H1ibLkr6nHsCFT96fSK4DXtA@mail.gmail.com
2017-09-14 11:00:39 -04:00
Tom Lane 7d08ce286c Distinguish selectivity of < from <= and > from >=.
Historically, the selectivity functions have simply not distinguished
< from <=, or > from >=, arguing that the fraction of the population that
satisfies the "=" aspect can be considered to be vanishingly small, if the
comparison value isn't any of the most-common-values for the variable.
(If it is, the code path that executes the operator against each MCV will
take care of things properly.)  But that isn't really true unless we're
dealing with a continuum of variable values, and in practice we seldom are.
If "x = const" would estimate a nonzero number of rows for a given const
value, then it follows that we ought to estimate different numbers of rows
for "x < const" and "x <= const", even if the const is not one of the MCVs.
Handling this more honestly makes a significant difference in edge cases,
such as the estimate for a tight range (x BETWEEN y AND z where y and z
are close together).

Hence, split scalarltsel into scalarltsel/scalarlesel, and similarly
split scalargtsel into scalargtsel/scalargesel.  Adjust <= and >=
operator definitions to reference the new selectivity functions.
Improve the core ineq_histogram_selectivity() function to make a
correction for equality.  (Along the way, I learned quite a bit about
exactly why that function gives good answers, which I tried to memorialize
in improved comments.)

The corresponding join selectivity functions were, and remain, just stubs.
But I chose to split them similarly, to avoid confusion and to prevent the
need for doing this exercise again if someone ever makes them less stubby.

In passing, change ineq_histogram_selectivity's clamp for extreme
probability estimates so that it varies depending on the histogram
size, instead of being hardwired at 0.0001.  With the default histogram
size of 100 entries, you still get the old clamp value, but bigger
histograms should allow us to put more faith in edge values.

Tom Lane, reviewed by Aleksander Alekseev and Kuntal Ghosh

Discussion: https://postgr.es/m/12232.1499140410@sss.pgh.pa.us
2017-09-13 11:12:39 -04:00
Tom Lane 8689e38263 Clean up handling of dropped columns in NAMEDTUPLESTORE RTEs.
The NAMEDTUPLESTORE patch piggybacked on the infrastructure for
TABLEFUNC/VALUES/CTE RTEs, none of which can ever have dropped columns,
so the possibility was ignored most places.  Fix that, including adding a
specification to parsenodes.h about what it's supposed to look like.

In passing, clean up assorted comments that hadn't been maintained
properly by said patch.

Per bug #14799 from Philippe Beaudoin.  Back-patch to v10.

Discussion: https://postgr.es/m/20170906120005.25630.84360@wrigleys.postgresql.org
2017-09-06 10:41:05 -04:00
Tom Lane 6e427aa4e5 Use lfirst_node() and linitial_node() where appropriate in planner.c.
There's no particular reason to target this module for the first
wholesale application of these macros; but we gotta start somewhere.

Ashutosh Bapat and Jeevan Chalke

Discussion: https://postgr.es/m/CAFjFpRcNr3r=u0ni=7A4GD9NnHQVq+dkFafzqo2rS6zy=dt1eg@mail.gmail.com
2017-09-05 15:57:48 -04:00
Robert Haas 30833ba154 Expand partitioned tables in PartDesc order.
Previously, we expanded the inheritance hierarchy in the order in
which find_all_inheritors had locked the tables, but that turns out
to block quite a bit of useful optimization.  For example, a
partition-wise join can't count on two tables with matching bounds
to get expanded in the same order.

Where possible, this change results in expanding partitioned tables in
*bound* order.  Bound order isn't well-defined for a list-partitioned
table with a null-accepting partition or for a list-partitioned table
where the bounds for a single partition are interleaved with other
partitions.  However, when expansion in bound order is possible, it
opens up further opportunities for optimization, such as
strength-reducing MergeAppend to Append when the expansion order
matches the desired sort order.

Patch by me, with cosmetic revisions by Ashutosh Bapat.

Discussion: http://postgr.es/m/CA+TgmoZrKj7kEzcMSum3aXV4eyvvbh9WD=c6m=002WMheDyE3A@mail.gmail.com
2017-08-31 15:50:18 -04:00
Tom Lane 7df2c1f8da Force rescanning of parallel-aware scan nodes below a Gather[Merge].
The ExecReScan machinery contains various optimizations for postponing
or skipping rescans of plan subtrees; for example a HashAgg node may
conclude that it can re-use the table it built before, instead of
re-reading its input subtree.  But that is wrong if the input contains
a parallel-aware table scan node, since the portion of the table scanned
by the leader process is likely to vary from one rescan to the next.
This explains the timing-dependent buildfarm failures we saw after
commit a2b70c89c.

The established mechanism for showing that a plan node's output is
potentially variable is to mark it as depending on some runtime Param.
Hence, to fix this, invent a dummy Param (one that has a PARAM_EXEC
parameter number, but carries no actual value) associated with each Gather
or GatherMerge node, mark parallel-aware nodes below that node as dependent
on that Param, and arrange for ExecReScanGather[Merge] to flag that Param
as changed whenever the Gather[Merge] node is rescanned.

This solution breaks an undocumented assumption made by the parallel
executor logic, namely that all rescans of nodes below a Gather[Merge]
will happen synchronously during the ReScan of the top node itself.
But that's fundamentally contrary to the design of the ExecReScan code,
and so was doomed to fail someday anyway (even if you want to argue
that the bug being fixed here wasn't a failure of that assumption).
A follow-on patch will address that issue.  In the meantime, the worst
that's expected to happen is that given very bad timing luck, the leader
might have to do all the work during a rescan, because workers think
they have nothing to do, if they are able to start up before the eventual
ReScan of the leader's parallel-aware table scan node has reset the
shared scan state.

Although this problem exists in 9.6, there does not seem to be any way
for it to manifest there.  Without GatherMerge, it seems that a plan tree
that has a rescan-short-circuiting node below Gather will always also
have one above it that will short-circuit in the same cases, preventing
the Gather from being rescanned.  Hence we won't take the risk of
back-patching this change into 9.6.  But v10 needs it.

Discussion: https://postgr.es/m/CAA4eK1JkByysFJNh9M349u_nNjqETuEnY_y1VUc_kJiU0bxtaQ@mail.gmail.com
2017-08-30 09:29:55 -04:00
Andres Freund 2cd7084524 Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n).
This is a mechanical change in preparation for a later commit that
will change the layout of TupleDesc.  Introducing a macro to abstract
the details of where attributes are stored will allow us to change
that in separate step and revise it in future.

Author: Thomas Munro, editorialized by Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
2017-08-20 11:19:07 -07:00
Robert Haas 1e56883a52 Attempt to clarify comments related to force_parallel_mode.
Per discussion with Tom Lane.

Discussion: http://postgr.es/m/28589.1502902172@sss.pgh.pa.us
2017-08-17 14:09:14 -04:00
Tom Lane 963af96920 Add missing "static" marker.
Per pademelon.
2017-08-17 11:17:39 -04:00
Tom Lane 4867d7f62f Avoid out-of-memory in a hash join with many duplicate inner keys.
The executor is capable of splitting buckets during a hash join if
too much memory is being used by a small number of buckets.  However,
this only helps if a bucket's population is actually divisible; if
all the hash keys are alike, the tuples still end up in the same
new bucket.  This can result in an OOM failure if there are enough
inner keys with identical hash values.  The planner's cost estimates
will bias it against choosing a hash join in such situations, but not
by so much that it will never do so.  To mitigate the OOM hazard,
explicitly estimate the hash bucket space needed by just the inner
side's most common value, and if that would exceed work_mem then
add disable_cost to the hash cost estimate.

This approach doesn't account for the possibility that two or more
common values would share the same hash value.  On the other hand,
work_mem is normally a fairly conservative bound, so that eating
two or more times that much space is probably not going to kill us.

If we have no stats about the inner side, ignore this consideration.
There was some discussion of making a conservative assumption, but that
would effectively result in disabling hash join whenever we lack stats,
which seems like an overreaction given how seldom the problem manifests
in the field.

Per a complaint from David Hinkle.  Although this could be viewed
as a bug fix, the lack of similar complaints weighs against back-
patching; indeed we waited for v11 because it seemed already rather
late in the v10 cycle to be making plan choice changes like this one.

Discussion: https://postgr.es/m/32013.1487271761@sss.pgh.pa.us
2017-08-15 14:05:53 -04:00
Robert Haas e139f1953f Assorted preparatory refactoring for partition-wise join.
Instead of duplicating the logic to search for a matching
ParamPathInfo in multiple places, factor it out into a separate
function.

Pass only the relevant bits of the PartitionKey to
partition_bounds_equal instead of the whole thing, because
partition-wise join will want to call this without having a
PartitionKey available.

Adjust allow_star_schema_join and calc_nestloop_required_outer
to take relevant Relids rather than the entire Path, because
partition-wise join will want to call it with the top-parent
relids to determine whether a child join is allowable.

Ashutosh Bapat.  Review and testing of the larger patch set of which
this is a part by Amit Langote, Rajkumar Raghuwanshi, Rafia Sabih,
Thomas Munro, Dilip Kumar, and me.

Discussion: http://postgr.es/m/CA+TgmobQK80vtXjAsPZWWXd7c8u13G86gmuLupN+uUJjA+i4nA@mail.gmail.com
2017-08-15 12:30:38 -04:00
Tom Lane 00418c6124 Simplify plpgsql's check for simple expressions.
plpgsql wants to recognize expressions that it can execute directly
via ExecEvalExpr() instead of going through the full SPI machinery.
Originally the test for this consisted of recursively groveling through
the post-planning expression tree to see if it contained only nodes that
plpgsql recognized as safe.  That was a major maintenance headache, since
it required updating plpgsql every time we added any kind of expression
node.  It was also kind of expensive, so over time we added various
pre-planning checks to try to short-circuit having to do that.
Robert Haas pointed out that as of the SRF-processing changes in v10,
particularly the addition of Query.hasTargetSRFs, there really isn't
any reason to make the recursive scan at all: the initial checks cover
everything we really care about.  We do have to make sure that those
checks agree with what inline_function() considers, so that inlining
of a function that formerly wasn't inlined can't cause an expression
considered simple to become non-simple.

Hence, delete the recursive function exec_simple_check_node(), and tweak
those other tests to more exactly agree with inline_function().  Adjust
some comments and function naming to match.

Discussion: https://postgr.es/m/CA+TgmoZGZpwdEV2FQWaVxA_qZXsQE1DAS5Fu8fwxXDNvfndiUQ@mail.gmail.com
2017-08-15 12:28:39 -04:00
Robert Haas 480f1f4329 Teach adjust_appendrel_attrs(_multilevel) to do multiple translations.
Currently, child relations are always base relations, so when we
translate parent relids to child relids, we only need to translate
a singler relid.  However, the proposed partition-wise join feature
will create child joins, which will mean we need to translate a set
of parent relids to the corresponding child relids.  This is
preliminary refactoring to make that possible.

Ashutosh Bapat.  Review and testing of the larger patch set of which
this is a part by Amit Langote, Rajkumar Raghuwanshi, Rafia Sabih,
Thomas Munro, Dilip Kumar, and me.  Some adjustments, mostly
cosmetic, by me.

Discussion: http://postgr.es/m/CA+TgmobQK80vtXjAsPZWWXd7c8u13G86gmuLupN+uUJjA+i4nA@mail.gmail.com
2017-08-15 10:49:06 -04:00
Robert Haas d57929afc7 Avoid unnecessary single-child Append nodes.
Before commit d3cc37f1d8, an inheritance parent
whose only children were temp tables of other sessions would end up
as a simple scan of the parent; but with that commit, we end up with
an Append node, per a report from Ashutosh Bapat.  Tweak the logic
so that we go back to the old way, and update the function header
comment for partitioning while we're at it.

Ashutosh Bapat, reviewed by Amit Langote and adjusted by me.

Discussion: http://postgr.es/m/CAFjFpReWJr1yTkHU=OqiMBmcYCMoSW3VPR39RBuQ_ovwDFBT5Q@mail.gmail.com
2017-08-15 09:16:33 -04:00
Tom Lane 21d304dfed Final pgindent + perltidy run for v10. 2017-08-14 17:29:33 -04:00
Robert Haas 7086be6e36 When WCOs are present, disable direct foreign table modification.
If the user modifies a view that has CHECK OPTIONs and this gets
translated into a modification to an underlying relation which happens
to be a foreign table, the check options should be enforced.  In the
normal code path, that was happening properly, but it was not working
properly for "direct" modification because the whole operation gets
pushed to the remote side in that case and we never have an option to
enforce the constraint against individual tuples.  Fix by disabling
direct modification when there is a need to enforce CHECK OPTIONs.

Etsuro Fujita, reviewed by Kyotaro Horiguchi and by me.

Discussion: http://postgr.es/m/f8a48f54-6f02-9c8a-5250-9791603171ee@lab.ntt.co.jp
2017-07-24 15:57:24 -04:00
Tom Lane 278cb43411 Be more consistent about errors for opfamily member lookup failures.
Add error checks in some places that were calling get_opfamily_member
or get_opfamily_proc and just assuming that the call could never fail.
Also, standardize the wording for such errors in some other places.

None of these errors are expected in normal use, hence they're just
elog not ereport.  But they may be handy for diagnosing omissions in
custom opclasses.

Rushabh Lathia found the oversight in RelationBuildPartitionKey();
I found the others by grepping for all callers of these functions.

Discussion: https://postgr.es/m/CAGPqQf2R9Nk8htpv0FFi+FP776EwMyGuORpc9zYkZKC8sFQE3g@mail.gmail.com
2017-07-24 11:23:27 -04:00
Tom Lane decb08ebdf Code review for NextValueExpr expression node type.
Add missing infrastructure for this node type, notably in ruleutils.c where
its lack could demonstrably cause EXPLAIN to fail.  Add outfuncs/readfuncs
support.  (outfuncs support is useful today for debugging purposes.  The
readfuncs support may never be needed, since at present it would only
matter for parallel query and NextValueExpr should never appear in a
parallelizable query; but it seems like a bad idea to have a primnode type
that isn't fully supported here.)  Teach planner infrastructure that
NextValueExpr is a volatile, parallel-unsafe, non-leaky expression node
with cost cpu_operator_cost.  Given its limited scope of usage, there
*might* be no live bug today from the lack of that knowledge, but it's
certainly going to bite us on the rear someday.  Teach pg_stat_statements
about the new node type, too.

While at it, also teach cost_qual_eval() that MinMaxExpr, SQLValueFunction,
XmlExpr, and CoerceToDomain should be charged as cpu_operator_cost.
Failing to do this for SQLValueFunction was an oversight in my commit
0bb51aa96.  The others are longer-standing oversights, but no time like the
present to fix them.  (In principle, CoerceToDomain could have cost much
higher than this, but it doesn't presently seem worth trying to examine the
domain's constraints here.)

Modify execExprInterp.c to execute NextValueExpr as an out-of-line
function; it seems quite unlikely to me that it's worth insisting that
it be inlined in all expression eval methods.  Besides, providing the
out-of-line function doesn't stop anyone from inlining if they want to.

Adjust some places where NextValueExpr support had been inserted with the
aid of a dartboard rather than keeping it in the same order as elsewhere.

Discussion: https://postgr.es/m/23862.1499981661@sss.pgh.pa.us
2017-07-14 15:25:43 -04:00
Robert Haas 6af9f1bd4b Document partitioned_rels in create_modifytable_path header comment.
Etsuro Fujita, slightly adjusted by me.

Discussion: http://postgr.es/m/e87c4a6d-23d7-5e7c-e8db-44ed418eb5d1@lab.ntt.co.jp
2017-06-22 13:52:50 -04:00
Robert Haas 1300276042 Update comment to account for table partitioning.
Ashutosh Bapat and Amit Langote

Discussion: http://postgr.es/m/CAFjFpRcG_NaAv6cDHD-9VfGdvB8maAtSfB=fTQr5+kxP2_sXzg@mail.gmail.com
2017-06-22 10:53:37 -04:00
Tom Lane 382ceffdf7 Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.

By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis.  However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent.  That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.

This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:35:54 -04:00
Tom Lane c7b8998ebb Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.

Commit e3860ffa4d wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code.  The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there.  BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs.  So the
net result is that in about half the cases, such comments are placed
one tab stop left of before.  This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.

Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:19:25 -04:00
Tom Lane e3860ffa4d Initial pgindent run with pg_bsd_indent version 2.0.
The new indent version includes numerous fixes thanks to Piotr Stefaniak.
The main changes visible in this commit are:

* Nicer formatting of function-pointer declarations.
* No longer unexpectedly removes spaces in expressions using casts,
  sizeof, or offsetof.
* No longer wants to add a space in "struct structname *varname", as
  well as some similar cases for const- or volatile-qualified pointers.
* Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely.
* Fixes bug where comments following declarations were sometimes placed
  with no space separating them from the code.
* Fixes some odd decisions for comments following case labels.
* Fixes some cases where comments following code were indented to less
  than the expected column 33.

On the less good side, it now tends to put more whitespace around typedef
names that are not listed in typedefs.list.  This might encourage us to
put more effort into typedef name collection; it's not really a bug in
indent itself.

There are more changes coming after this round, having to do with comment
indentation and alignment of lines appearing within parentheses.  I wanted
to limit the size of the diffs to something that could be reviewed without
one's eyes completely glazing over, so it seemed better to split up the
changes as much as practical.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 14:39:04 -04:00
Tom Lane 9ef2dbefc7 Final pgindent run with old pg_bsd_indent (version 1.3).
This is just to have a clean basis for comparison with the results of
the new version (which will indeed end up reverting some of these
changes...)

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 14:09:24 -04:00
Tom Lane d8e6b84bd2 Avoid regressions in foreign-key-based selectivity estimates.
David Rowley found that the "use the smallest per-column selectivity"
heuristic applied in some cases by get_foreign_key_join_selectivity()
was badly off if the FK columns are independent, producing estimates
much worse than we got before that code was added in 9.6.

One case where that heuristic was used was for LEFT and FULL outer joins
with the referenced rel on the outside of the join.  But we should not
really need to special-case those here.  eqjoinsel() never has had such a
special case; the correction is applied by calc_joinrel_size_estimate()
instead.  Let's just estimate such cases like inner joins and rely on that
later adjustment.  (I think there was something of a thinko here, in that
the comments seem to be thinking about the selectivity as defined for
semi/anti joins; but that shouldn't apply to left/full joins.)  Add a
regression test exercising such a case to show that this is sane in
at least some cases.

The other case where we used that heuristic was for SEMI/ANTI outer joins,
either if the referenced rel was on the outside, or if it was on the inside
but was part of a join within the RHS.  In either case, the FK doesn't give
us a lot of traction towards estimating the selectivity.  To ensure that
we don't have regressions from what happened before 9.6, let's punt by
ignoring the FK in such cases and applying the traditional selectivity
calculation.  (We might be able to improve on that later, but for now
I just want to be sure it's not worse than 9.5.)

Report and patch by David Rowley, simplified a bit by me.  Back-patch
to 9.6 where this code was added.

Discussion: https://postgr.es/m/CAKJS1f8NO8oCDcxrteohG6O72uU1saEVT9qX=R8pENr5QWerXw@mail.gmail.com
2017-06-19 15:33:41 -04:00
Robert Haas b08df9cab7 Teach predtest.c about CHECK clauses to fix partitioning bugs.
In a CHECK clause, a null result means true, whereas in a WHERE clause
it means false.  predtest.c provided different functions depending on
which set of semantics applied to the predicate being proved, but had
no option to control what a null meant in the clauses provided as
axioms.  Add one.

Use that in the partitioning code when figuring out whether the
validation scan on a new partition can be skipped.  Rip out the
old logic that attempted (not very successfully) to compensate
for the absence of the necessary support in predtest.c.

Ashutosh Bapat and Robert Haas, reviewed by Amit Langote and
incorporating feedback from Tom Lane.

Discussion: http://postgr.es/m/CAFjFpReT_kq_uwU_B8aWDxR7jNGE=P0iELycdq5oupi=xSQTOw@mail.gmail.com
2017-06-14 13:13:11 -04:00
Tom Lane 9db7d47f90 #ifdef out assorted unused GEQO code.
I'd always assumed that backend/optimizer/geqo/'s remarkably poor
showing on code coverage metrics was because we weren't exercising
it much in the regression tests.  But it turns out that a good chunk
of the problem is that there's a bunch of code that is physically
unreachable (because the calls to it are #ifdef'd out in geqo_main.c)
but is being built anyway.  Making the called code have #if guards
similar to the calling code saves a couple of kilobytes of executable
size and should make the coverage numbers more reflective of reality.

It's arguable that we should just delete all the unused recombination
mechanisms altogether, but I didn't feel a need to go that far today.
2017-06-04 13:34:05 -04:00
Tom Lane 23886581b5 Fix old corner-case logic error in final_cost_nestloop().
When costing a nestloop with stop-at-first-inner-match semantics, and a
non-indexscan inner path, final_cost_nestloop() wants to charge the full
scan cost of the inner rel at least once, with additional scans charged
at inner_rescan_run_cost which might be less.  However the logic for
doing this effectively assumed that outer_matched_rows is at least 1.
If it's zero, which is not unlikely for a small outer rel, we ended up
charging inner_run_cost plus N times inner_rescan_run_cost, as much as
double the correct charge for an outer rel with only one row that
we're betting won't be matched.  (Unless the inner rel is materialized,
in which case it has very small inner_rescan_run_cost and the cost
is not so far off what it should have been.)

The upshot of this was that the planner had a tendency to select plans
that failed to make effective use of the stop-at-first-inner-match
semantics, and that might have Materialize nodes in them even when the
predicted number of executions of the Materialize subplan was only 1.
This was not so obvious before commit 9c7f5229a, because the case only
arose in connection with semi/anti joins where there's not freedom to
reverse the join order.  But with the addition of unique-inner joins,
it could result in some fairly bad planning choices, as reported by
Teodor Sigaev.  Indeed, some of the test cases added by that commit
have plans that look dubious on closer inspection, and are changed
by this patch.

Fix the logic to ensure that we don't charge for too many inner scans.
I chose to adjust it so that the full-freight scan cost is associated
with an unmatched outer row if possible, not a matched one, since that
seems like a better model of what would happen at runtime.

This is a longstanding bug, but given the lesser impact in back branches,
and the lack of field complaints, I won't risk a back-patch.

Discussion: https://postgr.es/m/CAKJS1f-LzkUsFxdJ_-Luy38orQ+AdEXM5o+vANR+-pHAWPSecg@mail.gmail.com
2017-06-03 13:48:15 -04:00
Robert Haas b522759508 Copy partitioned_rels lists to avoid shared substructure.
Otherwise, set_plan_refs() can get applied to the same list
multiple times through different references, leading to chaos.

Amit Langote, Dilip Kumar, and Robert Haas, reviewed by Ashutosh
Bapat.  Original report by Sveinn Sveinsson.

Discussion: http://postgr.es/m/20170517141151.1435.79890@wrigleys.postgresql.org
2017-05-19 15:26:05 -04:00
Bruce Momjian a6fd7b7a5f Post-PG 10 beta1 pgindent run
perltidy run not included.
2017-05-17 16:31:56 -04:00
Tom Lane f674743487 Remove no-longer-needed fields of Hash plan nodes.
skewColType/skewColTypmod are no longer used in the wake of commit
9aab83fc5, and seem unlikely to be wanted in future, so let's drop 'em.

Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us
2017-05-14 11:07:40 -04:00
Tom Lane f04c9a6146 Standardize terminology for pg_statistic_ext entries.
Consistently refer to such an entry as a "statistics object", not just
"statistics" or "extended statistics".  Previously we had a mismash of
terms, accompanied by utter confusion as to whether the term was
singular or plural.  That's not only grating (at least to the ear of
a native English speaker) but could be outright misleading, eg in error
messages that seemed to be referring to multiple objects where only one
could be meant.

This commit fixes the code and a lot of comments (though I may have
missed a few).  I also renamed two new SQL functions,
pg_get_statisticsextdef -> pg_get_statisticsobjdef
pg_statistic_ext_is_visible -> pg_statistics_obj_is_visible
to conform better with this terminology.

I have not touched the SGML docs other than fixing those function
names; the docs certainly need work but it seems like a separable task.

Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us
2017-05-14 10:55:01 -04:00
Tom Lane 92a43e4857 Reduce semijoins with unique inner relations to plain inner joins.
If the inner relation can be proven unique, that is it can have no more
than one matching row for any row of the outer query, then we might as
well implement the semijoin as a plain inner join, allowing substantially
more freedom to the planner.  This is a form of outer join strength
reduction, but it can't be implemented in reduce_outer_joins() because
we don't have enough info about the individual relations at that stage.
Instead do it much like remove_useless_joins(): once we've built base
relations, we can make another pass over the SpecialJoinInfo list and
get rid of any entries representing reducible semijoins.

This is essentially a followon to the inner-unique patch (commit 9c7f5229a)
and makes use of the proof machinery that that patch created.  We need only
minor refactoring of innerrel_is_unique's API to support this usage.

Per performance complaint from Teodor Sigaev.

Discussion: https://postgr.es/m/f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru
2017-05-01 14:53:42 -04:00
Tom Lane 2057a58d16 Fix mis-optimization of semijoins with more than one LHS relation.
The inner-unique patch (commit 9c7f5229a) supposed that if we're
considering a JOIN_UNIQUE_INNER join path, we can always set inner_unique
for the join, because the inner path produced by create_unique_path should
be unique relative to the outer relation.  However, that's true only if
we're considering joining to the whole outer relation --- otherwise we may
be applying only some of the join quals, and so the inner path might be
non-unique from the perspective of this join.  Adjust the test to only
believe that we can set inner_unique if we have the whole semijoin LHS on
the outer side.

There is more that can be done in this area, but this commit is only
intended to provide the minimal fix needed to get correct plans.

Per report from Teodor Sigaev.  Thanks to David Rowley for preliminary
investigation.

Discussion: https://postgr.es/m/f994fc98-389f-4a46-d1bc-c42e05cb43ed@sigaev.ru
2017-05-01 14:39:11 -04:00
Robert Haas e180c8aa8c Fire per-statement triggers on partitioned tables.
Even though no actual tuples are ever inserted into a partitioned
table (the actual tuples are in the partitions, not the partitioned
table itself), we still need to have a ResultRelInfo for the
partitioned table, or per-statement triggers won't get fired.

Amit Langote, per a report from Rajkumar Raghuwanshi.  Reviewed by me.

Discussion: http://postgr.es/m/CAKcux6%3DwYospCRY2J4XEFuVy0L41S%3Dfic7rmkbsU-GXhhSbmBg%40mail.gmail.com
2017-05-01 08:23:01 -04:00
Tom Lane 39151781c8 Fix testing of parallel-safety of SubPlans.
is_parallel_safe() supposed that the only relevant property of a SubPlan
was the parallel safety of the referenced subplan tree.  This is wrong:
the testexpr or args subtrees might contain parallel-unsafe stuff, as
demonstrated by the test case added here.  However, just recursing into the
subtrees fails in a different way: we'll typically find PARAM_EXEC Params
representing the subplan's output columns in the testexpr.  The previous
coding supposed that any Param must be treated as parallel-restricted, so
that a naive attempt at fixing this disabled parallel pushdown of SubPlans
altogether.  We must instead determine, for any visited Param, whether it
is one that would be computed by a surrounding SubPlan node; if so, it's
safe to push down along with the SubPlan node.

We might later be able to extend this logic to cope with Params used for
correlated subplans and other cases; but that's a task for v11 or beyond.

Tom Lane and Amit Kapila

Discussion: https://postgr.es/m/7064.1492022469@sss.pgh.pa.us
2017-04-18 15:43:56 -04:00
Alvaro Herrera ee6922112e Rename columns in new pg_statistic_ext catalog
The new catalog reused a column prefix "sta" from pg_statistic, but this
is undesirable, so change the catalog to use prefix "stx" instead.
Also, rename the column that lists enabled statistic kinds as "stxkind"
rather than "enabled".

Discussion: https://postgr.es/m/CAKJS1f_2t5jhSN7huYRFH3w3rrHfG2QU7hiUHsu-Vdjd1rYT3w@mail.gmail.com
2017-04-17 18:34:29 -03:00
Tom Lane 76799fc89d Always build a custom plan node's targetlist from the path's pathtarget.
We were applying the use_physical_tlist optimization to all relation
scan plans, even those implemented by custom scan providers.  However,
that's a bad idea for a couple of reasons.  The custom provider might
be unable to provide columns that it hadn't expected to be asked for
(for example, the custom scan might depend on an index-only scan).
Even more to the point, there's no good reason to suppose that this
"optimization" is a win for a custom scan; whatever the custom provider
is doing is likely not based on simply returning physical heap tuples.
(As a counterexample, if the custom scan is an interface to a column store,
demanding all columns would be a huge loss.)  If it is a win, the custom
provider could make that decision for itself and insert a suitable
pathtarget into the path, anyway.

Per discussion with Dmitry Ivanov.  Back-patch to 9.5 where custom scan
support was introduced.  The argument that the custom provider can adjust
the behavior by changing the pathtarget only applies to 9.6+, but on
balance it seems more likely that use_physical_tlist will hurt custom
scans than help them.

Discussion: https://postgr.es/m/e29ddd30-8ef9-4da5-a50b-2bb7b8c7198d@postgrespro.ru
2017-04-17 15:29:15 -04:00
Tom Lane 003d80f3df Mark finished Plan nodes with parallel_safe flags.
We'd managed to avoid doing this so far, but it seems pretty obvious
that it would be forced on us some day, and this is much the cleanest
way of approaching the open problem that parallel-unsafe subplans are
being transmitted to parallel workers.  Anyway there's no space cost
due to alignment considerations, and the time cost is pretty minimal
since we're just copying the flag from the corresponding Path node.
(At least in most cases ... some of the klugier spots in createplan.c
have to work a bit harder.)

In principle we could perhaps get rid of SubPlan.parallel_safe,
but I thought it better to keep that in case there are reasons to
consider a SubPlan unsafe even when its child plan is parallel-safe.

This patch doesn't actually do anything with the new flags, but
I thought I'd commit it separately anyway.

Note: although this touches outfuncs/readfuncs, there's no need for
a catversion bump because Plan trees aren't stored on disk.

Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de
2017-04-12 15:13:34 -04:00
Tom Lane 8f0530f580 Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type.  For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))".  Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.

As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.

Patch by me, after an idea of Andrew Gierth's.

Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 13:51:53 -04:00
Tom Lane eef8c0069e Clean up bugs in clause_selectivity() cleanup.
Commit ac2b09508 was not terribly carefully reviewed.  Band-aid it to
not fail on non-RestrictInfo input, per report from Andreas Seltenreich.
Also make it do something more reasonable with variable-free clauses,
and improve nearby comments.

Discussion: https://postgr.es/m/87inmf5rdx.fsf@credativ.de
2017-04-08 16:38:03 -04:00
Tom Lane 9c7f5229ad Optimize joins when the inner relation can be proven unique.
If there can certainly be no more than one matching inner row for a given
outer row, then the executor can move on to the next outer row as soon as
it's found one match; there's no need to continue scanning the inner
relation for this outer row.  This saves useless scanning in nestloop
and hash joins.  In merge joins, it offers the opportunity to skip
mark/restore processing, because we know we have not advanced past the
first possible match for the next outer row.

Of course, the devil is in the details: the proof of uniqueness must
depend only on joinquals (not otherquals), and if we want to skip
mergejoin mark/restore then it must depend only on merge clauses.
To avoid adding more planning overhead than absolutely necessary,
the present patch errs in the conservative direction: there are cases
where inner_unique or skip_mark_restore processing could be used, but
it will not do so because it's not sure that the uniqueness proof
depended only on "safe" clauses.  This could be improved later.

David Rowley, reviewed and rather heavily editorialized on by me

Discussion: https://postgr.es/m/CAApHDvqF6Sw-TK98bW48TdtFJ+3a7D2mFyZ7++=D-RyPsL76gw@mail.gmail.com
2017-04-07 22:20:13 -04:00
Tom Lane 89deca582a Fix planner error (or assert trap) with nested set operations.
As reported by Sean Johnston in bug #14614, since 9.6 the planner can fail
due to trying to look up the referent of a Var with varno 0.  This happens
because we generate such Vars in generate_append_tlist, for lack of any
better way to describe the output of a SetOp node.  In typical situations
nothing really cares about that, but given nested set-operation queries
we will call estimate_num_groups on the output of the subquery, and that
wants to know what a Var actually refers to.  That logic used to look at
subquery->targetList, but in commit 3fc6e2d7f I'd switched it to look at
subroot->processed_tlist, ie the actual output of the subquery plan not the
parser's idea of the result.  It seemed like a good idea at the time :-(.
As a band-aid fix, change it back.

Really we ought to have an honest way of naming the outputs of SetOp steps,
which suggests that it'd be a good idea for the parser to emit an RTE
corresponding to each one.  But that's a task for another day, and it
certainly wouldn't yield a back-patchable fix.

Report: https://postgr.es/m/20170407115808.25934.51866@wrigleys.postgresql.org
2017-04-07 12:18:38 -04:00
Simon Riggs ac2b095088 Reset API of clause_selectivity()
Discussion: https://postgr.es/m/CAKJS1f9yurJQW9pdnzL+rmOtsp2vOytkpXKGnMFJEO-qz5O5eA@mail.gmail.com
2017-04-06 19:10:51 -04:00
Alvaro Herrera b1fc51a36e Comment fixes for extended statistics
Clean up some code comments in new extended statistics code, from
7b504eb282.
2017-04-06 12:28:50 -03:00
Simon Riggs 2686ee1b7c Collect and use multi-column dependency stats
Follow on patch in the multi-variate statistics patch series.

CREATE STATISTICS s1 WITH (dependencies) ON (a, b) FROM t;
ANALYZE;
will collect dependency stats on (a, b) and then use the measured
dependency in subsequent query planning.

Commit 7b504eb282 added
CREATE STATISTICS with n-distinct coefficients. These are now
specified using the mutually exclusive option WITH (ndistinct).

Author: Tomas Vondra, David Rowley
Reviewed-by: Kyotaro HORIGUCHI, Álvaro Herrera, Dean Rasheed, Robert Haas
and many other comments and contributions
Discussion: https://postgr.es/m/56f40b20-c464-fad2-ff39-06b668fac47c@2ndquadrant.com
2017-04-05 18:00:42 -04:00
Robert Haas 7a39b5e4d1 Abstract logic to allow for multiple kinds of child rels.
Currently, the only type of child relation is an "other member rel",
which is the child of a baserel, but in the future joins and even
upper relations may have child rels.  To facilitate that, introduce
macros that test to test for particular RelOptKind values, and use
them in various places where they help to clarify the sense of a test.
(For example, a test may allow RELOPT_OTHER_MEMBER_REL either because
it intends to allow child rels, or because it intends to allow simple
rels.)

Also, remove find_childrel_top_parent, which will not work for a
child rel that is not a baserel.  Instead, add a new RelOptInfo
member top_parent_relids to track the same kind of information in a
more generic manner.

Ashutosh Bapat, slightly tweaked by me.  Review and testing of the
patch set from which this was taken by Rajkumar Raghuwanshi and Rafia
Sabih.

Discussion: http://postgr.es/m/CA+TgmoagTnF2yqR3PT2rv=om=wJiZ4-A+ATwdnriTGku1CLYxA@mail.gmail.com
2017-04-03 22:41:31 -04:00
Andrew Gierth f578093526 Try and silence spurious Coverity warning.
gset_data (aka gd) in planner.c is always non-null if and only if
parse->groupingSets is non-null, but Coverity doesn't know that and
complains.  Feed it an assertion to see if that keeps it happy.
2017-04-03 23:30:24 +01:00
Kevin Grittner 18ce3a4ab2 Add infrastructure to support EphemeralNamedRelation references.
A QueryEnvironment concept is added, which allows new types of
objects to be passed into queries from parsing on through
execution.  At this point, the only thing implemented is a
collection of EphemeralNamedRelation objects -- relations which
can be referenced by name in queries, but do not exist in the
catalogs.  The only type of ENR implemented is NamedTuplestore, but
provision is made to add more types fairly easily.

An ENR can carry its own TupleDesc or reference a relation in the
catalogs by relid.

Although these features can be used without SPI, convenience
functions are added to SPI so that ENRs can easily be used by code
run through SPI.

The initial use of all this is going to be transition tables in
AFTER triggers, but that will be added to each PL as a separate
commit.

An incidental effect of this patch is to produce a more informative
error message if an attempt is made to modify the contents of a CTE
from a referencing DML statement.  No tests previously covered that
possibility, so one is added.

Kevin Grittner and Thomas Munro
Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro
with valuable comments and suggestions from many others
2017-03-31 23:17:18 -05:00
Robert Haas 7d8f6986b8 Fix parallel query so it doesn't spoil row estimates above Gather.
Commit 45be99f8cd removed GatherPath's
num_workers field, but this is entirely bogus.  Normally, a path's
parallel_workers flag is supposed to indicate the number of workers
that it wants, and should be 0 for a non-partial path.  In that
commit, I mistakenly thought that GatherPath could also use that field
to indicate the number of workers that it would try to start, but
that's disastrous, because then it can propagate up to higher nodes in
the plan tree, which will then get incorrect rowcounts because the
parallel_workers flag is involved in computing those values.  Repair
by putting the separate field back.

Report by Tomas Vondra.  Patch by me, reviewed by Amit Kapila.

Discussion: http://postgr.es/m/f91b4a44-f739-04bd-c4b6-f135bd643669@2ndquadrant.com
2017-03-31 21:01:20 -04:00
Peter Eisentraut 4cb824699e Cast result of copyObject() to correct type
copyObject() is declared to return void *, which allows easily assigning
the result independent of the input, but it loses all type checking.

If the compiler supports typeof or something similar, cast the result to
the input type.  This creates a greater amount of type safety.  In some
cases, where the result is assigned to a generic type such as Node * or
Expr *, new casts are now necessary, but in general casts are now
unnecessary in the normal case and indicate that something unusual is
happening.

Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
2017-03-28 21:59:23 -04:00
Andrew Gierth b5635948ab Support hashed aggregation with grouping sets.
This extends the Aggregate node with two new features: HashAggregate
can now run multiple hashtables concurrently, and a new strategy
MixedAggregate populates hashtables while doing sorted grouping.

The planner will now attempt to save as many sorts as possible when
planning grouping sets queries, while not exceeding work_mem for the
estimated combined sizes of all hashtables used.  No SQL-level changes
are required.  There should be no user-visible impact other than the
new EXPLAIN output and possible changes to result ordering when ORDER
BY was not used (which affected a few regression tests).  The
enable_hashagg option is respected.

Author: Andrew Gierth
Reviewers: Mark Dilger, Andres Freund
Discussion: https://postgr.es/m/87vatszyhj.fsf@news-spur.riddles.org.uk
2017-03-27 04:20:54 +01:00
Tom Lane 244dd95ce9 Update some obsolete comments.
Fix a few stray references to expression eval functions that don't
exist anymore or don't take the same input representation they used to.
2017-03-26 11:36:46 -04:00
Andres Freund b8d7f053c5 Faster expression evaluation and targetlist projection.
This replaces the old, recursive tree-walk based evaluation, with
non-recursive, opcode dispatch based, expression evaluation.
Projection is now implemented as part of expression evaluation.

This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.

The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
  function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
  out operation metadata sequentially; including the avoidance of
  nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
  constant re-checks at evaluation time

Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.

The new framework allows for significant future optimizations. E.g.:
- basic infrastructure for to later reduce the per executor-startup
  overhead of expression evaluation, by caching state in prepared
  statements.  That'd be helpful in OLTPish scenarios where
  initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
  work has already been made.
- optimizing the interpreter. Similarly a number of proposals have
  been made here too.

The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
  initialization, whereas previously they were done during
  execution. In edge cases this can lead to errors being raised that
  previously wouldn't have been, e.g. a NULL array being coerced to a
  different array type previously didn't perform checks.
- The set of domain constraints to be checked, is now evaluated once
  during expression initialization, previously it was re-built
  every time a domain check was evaluated. For normal queries this
  doesn't change much, but e.g. for plpgsql functions, which caches
  ExprStates, the old set could stick around longer.  The behavior
  around might still change.

Author: Andres Freund, with significant changes by Tom Lane,
	changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-25 14:52:06 -07:00
Alvaro Herrera 7b504eb282 Implement multivariate n-distinct coefficients
Add support for explicitly declared statistic objects (CREATE
STATISTICS), allowing collection of statistics on more complex
combinations that individual table columns.  Companion commands DROP
STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are
added too.  All this DDL has been designed so that more statistic types
can be added later on, such as multivariate most-common-values and
multivariate histograms between columns of a single table, leaving room
for permitting columns on multiple tables, too, as well as expressions.

This commit only adds support for collection of n-distinct coefficient
on user-specified sets of columns in a single table.  This is useful to
estimate number of distinct groups in GROUP BY and DISTINCT clauses;
estimation errors there can cause over-allocation of memory in hashed
aggregates, for instance, so it's a worthwhile problem to solve.  A new
special pseudo-type pg_ndistinct is used.

(num-distinct estimation was deemed sufficiently useful by itself that
this is worthwhile even if no further statistic types are added
immediately; so much so that another version of essentially the same
functionality was submitted by Kyotaro Horiguchi:
https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp
though this commit does not use that code.)

Author: Tomas Vondra.  Some code rework by Álvaro.
Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes,
    Ideriha Takeshi
Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz
    https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
2017-03-24 14:06:10 -03:00
Robert Haas dc02c7bca4 Fix wrong costing of Sort under Gather Merge.
There's no mechanism for such a sort to become a top-N sort, so we
should pass -1 rather than limit_tuples to cost_sort().

Rushabh Lathia, per a report from Mithun Cy

Discussion: http://postgr.es/m/CAGPqQf1akRcSgC9=6iwx=sEPap9UvPpHJLzg8_N+OuHdb6fL+g@mail.gmail.com
2017-03-22 14:45:14 -04:00
Robert Haas d3cc37f1d8 Don't scan partitioned tables.
Partitioned tables do not contain any data; only their unpartitioned
descendents need to be scanned.  However, the partitioned tables still
need to be locked, even though they're not scanned.  To make that
work, Append and MergeAppend relations now need to carry a list of
(unscanned) partitioned relations that must be locked, and InitPlan
must lock all partitioned result relations.

Aside from the obvious advantage of avoiding some work at execution
time, this has two other advantages.  First, it may improve the
planner's decision-making in some cases since the empty relation
might throw things off.  Second, it paves the way to getting rid of
the storage for partitioned tables altogether.

Amit Langote, reviewed by me.

Discussion: http://postgr.es/m/6837c359-45c4-8044-34d1-736756335a15@lab.ntt.co.jp
2017-03-21 09:48:04 -04:00
Robert Haas 1ea60ad602 Fix failure to use clamp_row_est() for parallel joins.
Commit 0c2070cefa neglected to use
clamp_row_est() where it should have done so.

Patch by me.  Report by Amit Kapila.

Discussion: http://postgr.es/m/CAA4eK1KPm8RYa1Kun3ZmQj9pb723b-EFN70j47Pid1vn3ByquA@mail.gmail.com
2017-03-15 12:28:54 -04:00
Robert Haas c44c47a773 Some preliminary refactoring towards partitionwise join.
Partitionwise join proposes add a concept of child join relations,
which will have the same relationship with join relations as "other
member" relations do with base relations.  These relations will need
some but not all of the handling that we currently have for join
relations, and some but not all of the handling that we currently have
for appendrels, since they are a mix of the two.  Refactor a little
bit so that the necessary bits of logic are exposed as separate
functions.

Ashutosh Bapat, reviewed and tested by Rajkumar Raghuwanshi and
by me.

Discussion: http://postgr.es/m/CAFjFpRfqotRR6cM3sooBHMHEVdkFfAZ6PyYg4GRZsoMuW08HjQ@mail.gmail.com
2017-03-14 19:25:47 -04:00
Robert Haas 2609e91fcf Fix regression in parallel planning against inheritance tables.
Commit 51ee6f3160 accidentally changed
the behavior around inheritance hierarchies; before, we always
considered parallel paths even for very small inheritance children,
because otherwise an inheritance hierarchy with even one small child
wouldn't be eligible for parallelism.  That exception was inadverently
removed; put it back.

In passing, also adjust the degree-of-parallelism comptuation for
index-only scans not to consider the number of heap pages fetched.
Otherwise, we'll avoid parallel index-only scans on tables that are
mostly all-visible, which isn't especially logical.

Robert Haas and Amit Kapila, per a report from Ashutosh Sharma.

Discussion: http://postgr.es/m/CAE9k0PmgSoOHRd60SHu09aRVTHRSs8s6pmyhJKWHxWw9C_x+XA@mail.gmail.com
2017-03-14 14:33:14 -04:00
Peter Eisentraut a47b38c9ee Spelling fixes
From: Josh Soref <jsoref@gmail.com>
2017-03-14 12:58:39 -04:00