Commit Graph

3059 Commits

Author SHA1 Message Date
Tom Lane 8f72a57048 Fix format_type() to restore its old behavior.
Commit a26116c6c accidentally changed the behavior of the SQL format_type()
function while refactoring.  For the reasons explained in that function's
comment, a NULL typemod argument should behave differently from a -1
argument.  Since we've managed to break this, add a regression test
memorializing the intended behavior.

In passing, be consistent about the type of the "flags" parameter.

Noted by Rushabh Lathia, though I revised the patch some more.

Discussion: https://postgr.es/m/CAGPqQf3RB2q-d2Awp_-x-Ur6aOxTUwnApt-vm-iTtceZxYnePg@mail.gmail.com
2018-03-01 11:37:46 -05:00
Robert Haas ce1663cdcd Fix assertion failure when Parallel Append is run serially.
Parallel-aware plan nodes must be prepared to run without parallelism
if it's not possible at execution time for whatever reason.  Commit
ab72716778, which introduced Parallel
Append, overlooked this.

Rajkumar Raghuwanshi reported this problem, and I included his test
case in this patch.  The code changes are by me.

Discussion: http://postgr.es/m/CAKcux6=WqkUudLg1GLZZ7fc5ScWC1+Y9qD=pAHeqy32WoeJQvw@mail.gmail.com
2018-02-28 10:58:27 -05:00
Tom Lane e98a4de7d2 Use the correct tuplestore read pointer in a NamedTuplestoreScan.
Tom Kazimiers reported that transition tables don't work correctly when
they are scanned by more than one executor node.  That's because commit
18ce3a4ab allocated separate read pointers for each executor node, as it
must, but failed to make them active at the appropriate times.  Repair.

Thomas Munro

Discussion: https://postgr.es/m/20180224034748.bixarv6632vbxgeb%40dewberry.localdomain
2018-02-27 15:56:51 -05:00
Tom Lane c40e20a83c Revert renaming of int44in/int44out.
This seemed like a good idea in commit be42eb9d6, but it causes more
trouble than it's worth for cross-branch upgrade testing.

Discussion: https://postgr.es/m/11927.1519756619@sss.pgh.pa.us
2018-02-27 15:15:35 -05:00
Tom Lane 25b692568f Prevent dangling-pointer access when update trigger returns old tuple.
A before-update row trigger may choose to return the "new" or "old" tuple
unmodified.  ExecBRUpdateTriggers failed to consider the second
possibility, and would proceed to free the "old" tuple even if it was the
one returned, leading to subsequent access to already-deallocated memory.
In debug builds this reliably leads to an "invalid memory alloc request
size" failure; in production builds it might accidentally work, but data
corruption is also possible.

This is a very old bug.  There are probably a couple of reasons it hasn't
been noticed up to now.  It would be more usual to return NULL if one
wanted to suppress the update action; returning "old" is significantly less
efficient since the update will occur anyway.  Also, none of the standard
PLs would ever cause this because they all returned freshly-manufactured
tuples even if they were just copying "old".  But commit 4b93f5799 changed
that for plpgsql, making it possible to see the bug with a plpgsql trigger.
Still, this is certainly legal behavior for a trigger function, so it's
ExecBRUpdateTriggers's fault not plpgsql's.

It seems worth creating a test case that exercises returning "old" directly
with a C-language trigger; testing this through plpgsql seems unreliable
because its behavior might change again.

Report and fix by Rushabh Lathia; regression test case by me.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAGPqQf1P4pjiNPrMof=P_16E-DFjt457j+nH2ex3=nBTew7tXw@mail.gmail.com
2018-02-27 13:28:02 -05:00
Tom Lane be42eb9d62 Improve regression test coverage of regress.c.
It's a bit silly to have test functions that aren't tested, so test
them.

In passing, rename int44in/int44out to city_budget_in/_out so that they
match how the regression tests use them.  Also, fix city_budget_out
so that it emits the format city_budget_in expects to read; otherwise
we'd have dump/reload failures when testing pg_dump against the
regression database.  (We avoided that in the past only because no
data of type city_budget was actually stored anywhere.)

Discussion: https://postgr.es/m/29322.1519701006@sss.pgh.pa.us
2018-02-27 12:13:14 -05:00
Tom Lane db3af9feb1 Remove unused functions in regress.c.
This patch removes five functions that presumably were once used in the
regression tests, but haven't been so used in many years.  Nonetheless
we've been wasting maintenance effort on them (e.g., by converting them
to V1 function protocol).  I see no reason to think that reviving them
would add any useful test coverage, so drop 'em.

In passing, mark regress_lseg_construct static, since it's not called
from outside this file.

Discussion: https://postgr.es/m/29322.1519701006@sss.pgh.pa.us
2018-02-27 11:11:25 -05:00
Tom Lane 3d2aed664e Avoid using unsafe search_path settings during dump and restore.
Historically, pg_dump has "set search_path = foo, pg_catalog" when
dumping an object in schema "foo", and has also caused that setting
to be used while restoring the object.  This is problematic because
functions and operators in schema "foo" could capture references meant
to refer to pg_catalog entries, both in the queries issued by pg_dump
and those issued during the subsequent restore run.  That could
result in dump/restore misbehavior, or in privilege escalation if a
nefarious user installs trojan-horse functions or operators.

This patch changes pg_dump so that it does not change the search_path
dynamically.  The emitted restore script sets the search_path to what
was used at dump time, and then leaves it alone thereafter.  Created
objects are placed in the correct schema, regardless of the active
search_path, by dint of schema-qualifying their names in the CREATE
commands, as well as in subsequent ALTER and ALTER-like commands.

Since this change requires a change in the behavior of pg_restore
when processing an archive file made according to this new convention,
bump the archive file version number; old versions of pg_restore will
therefore refuse to process files made with new versions of pg_dump.

Security: CVE-2018-1058
2018-02-26 10:18:21 -05:00
Tom Lane 8b29e88cdc Add window RANGE support for float4, float8, numeric.
Commit 0a459cec9 left this for later, but since time's running out,
I went ahead and took care of it.  There are more data types that
somebody might someday want RANGE support for, but this is enough
to satisfy all expectations of the SQL standard, which just says that
"numeric, datetime, and interval" types should have RANGE support.
2018-02-24 13:23:38 -05:00
Tom Lane 9afd513df0 Fix planner failures with overlapping mergejoin clauses in an outer join.
Given overlapping or partially redundant join clauses, for example
	t1 JOIN t2 ON t1.a = t2.x AND t1.b = t2.x
the planner's EquivalenceClass machinery will ordinarily refactor the
clauses as "t1.a = t1.b AND t1.a = t2.x", so that join processing doesn't
see multiple references to the same EquivalenceClass in a list of join
equality clauses.  However, if the join is outer, it's incorrect to derive
a restriction clause on the outer side from the join conditions, so the
clause refactoring does not happen and we end up with overlapping join
conditions.  The code that attempted to deal with such cases had several
subtle bugs, which could result in "left and right pathkeys do not match in
mergejoin" or "outer pathkeys do not match mergeclauses" planner errors,
if the selected join plan type was a mergejoin.  (It does not appear that
any actually incorrect plan could have been emitted.)

The core of the problem really was failure to recognize that the outer and
inner relations' pathkeys have different relationships to the mergeclause
list.  A join's mergeclause list is constructed by reference to the outer
pathkeys, so it will always be ordered the same as the outer pathkeys, but
this cannot be presumed true for the inner pathkeys.  If the inner sides of
the mergeclauses contain multiple references to the same EquivalenceClass
({t2.x} in the above example) then a simplistic rendering of the required
inner sort order is like "ORDER BY t2.x, t2.x", but the pathkey machinery
recognizes that the second sort column is redundant and throws it away.
The mergejoin planning code failed to account for that behavior properly.
One error was to try to generate cut-down versions of the mergeclause list
from cut-down versions of the inner pathkeys in the same way as the initial
construction of the mergeclause list from the outer pathkeys was done; this
could lead to choosing a mergeclause list that fails to match the outer
pathkeys.  The other problem was that the pathkey cross-checking code in
create_mergejoin_plan treated the inner and outer pathkey lists
identically, whereas actually the expectations for them must be different.
That led to false "pathkeys do not match" failures in some cases, and in
principle could have led to failure to detect bogus plans in other cases,
though there is no indication that such bogus plans could be generated.

Reported by Alexander Kuzmenkov, who also reviewed this patch.  This has
been broken for years (back to around 8.3 according to my testing), so
back-patch to all supported branches.

Discussion: https://postgr.es/m/5dad9160-4632-0e47-e120-8e2082000c01@postgrespro.ru
2018-02-23 13:47:33 -05:00
Peter Eisentraut 76b6aa41f4 Support parameters in CALL
To support parameters in CALL, move the parse analysis of the procedure
and arguments into the global transformation phase, so that the parser
hooks can be applied.  And then at execution time pass the parameters
from ProcessUtility on to ExecuteCallStmt.
2018-02-22 21:36:48 -05:00
Peter Eisentraut 10cfce34c0 Add user-callable SHA-2 functions
Add the user-callable functions sha224, sha256, sha384, sha512.  We
already had these in the C code to support SCRAM, but there was no test
coverage outside of the SCRAM tests.  Adding these as user-callable
functions allows writing some tests.  Also, we have a user-callable md5
function but no more modern alternative, which led to wide use of md5 as
a general-purpose hash function, which leads to occasional complaints
about using md5.

Also mark the existing md5 functions as leak-proof.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
2018-02-22 11:34:53 -05:00
Robert Haas 9a5c4f58f3 Try to stabilize EXPLAIN output in partition_check test.
Commit 7d8ac9814b adjusted these
tests in the hope of preserving the plan shape, but I failed to
notice that the three partitions were, on my local machine, choosing
two different plan shapes.  This is probably related to the fact
that all three tables have exactly the same row count.  Try to
improve the situation by making pht1_e about half as large as
the other two.

Per Tom Lane and the buildfarm.

Discussion: http://postgr.es/m/25380.1519277713@sss.pgh.pa.us
2018-02-22 08:51:00 -05:00
Robert Haas 7d8ac9814b Charge cpu_tuple_cost * 0.5 for Append and MergeAppend nodes.
Previously, Append didn't charge anything at all, and MergeAppend
charged only cpu_operator_cost, about half the value used here.  This
change might make MergeAppend plans slightly more likely to be chosen
than before, since this commit increases the assumed cost for Append
-- with default values -- by 0.005 per tuple but MergeAppend by only
0.0025 per tuple.  Since the comparisons required by MergeAppend are
costed separately, it's not clear why MergeAppend needs to be
otherwise more expensive than Append, so hopefully this is OK.

Prior to partition-wise join, it didn't really matter whether or not
an Append node had any cost of its own, because every plan had to use
the same number of Append or MergeAppend nodes and in the same places.
Only the relative cost of Append vs. MergeAppend made a difference.
Now, however, it is possible to avoid some of the Append nodes using a
partition-wise join, so it's worth making an effort.  Pending patches
for partition-wise aggregate care too, because an Append of Aggregate
nodes will incur the Append overhead fewer times than an Aggregate
over an Append.  Although in most cases this change will favor the use
of partition-wise techniques, it does the opposite when the join
cardinality is greater than the sum of the input cardinalities.  Since
this situation arises in an existing regression test, I [rhaas]
adjusted it to keep the overall plan shape approximately the same.

Jeevan Chalke, per a suggestion from David Rowley.  Reviewed by
Ashutosh Bapat.  Some changes by me.  The larger patch series of which
this patch is a part was also reviewed and tested by Antonin Houska,
Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik,
Pascal Legrand, Rafia Sabih, and me.

Discussion: http://postgr.es/m/CAKJS1f9UXdk6ZYyqbJnjFO9a9hyHKGW7B=ZRh-rxy9qxfPA5Gw@mail.gmail.com
2018-02-21 23:09:27 -05:00
Peter Eisentraut c2ff42c6c1 Error message improvement 2018-02-20 17:58:27 -05:00
Alvaro Herrera 4108a28d3a Fix expected output 2018-02-19 17:56:43 -03:00
Alvaro Herrera eb7ed3f306 Allow UNIQUE indexes on partitioned tables
If we restrict unique constraints on partitioned tables so that they
must always include the partition key, then our standard approach to
unique indexes already works --- each unique key is forced to exist
within a single partition, so enforcing the unique restriction in each
index individually is enough to have it enforced globally.  Therefore we
can implement unique indexes on partitions by simply removing a few
restrictions (and adding others.)

Discussion: https://postgr.es/m/20171222212921.hi6hg6pem2w2t36z@alvherre.pgsql
Discussion: https://postgr.es/m/20171229230607.3iib6b62fn3uaf47@alvherre.pgsql
Reviewed-by: Simon Riggs, Jesper Pedersen, Peter Eisentraut, Jaime
	Casanova, Amit Langote
2018-02-19 17:40:00 -03:00
Alvaro Herrera cef60043dd Mention trigger name in trigger test
This makes it more explicit exactly what is going on, for further
proposed behavior changes.

Discussion: https://postgr.es/m/20180214212624.hm7of76flesodamf@alvherre.pgsql
2018-02-17 13:18:34 -03:00
Peter Eisentraut 2fb1abaeb0 Rename enable_partition_wise_join to enable_partitionwise_join
Discussion: https://www.postgresql.org/message-id/flat/ad24e4f4-6481-066e-e3fb-6ef4a3121882%402ndquadrant.com
2018-02-16 10:33:59 -05:00
Tom Lane f9263006d8 Support CONSTANT/NOT NULL/initial value for plpgsql composite variables.
These features were never implemented previously for composite or record
variables ... not that the documentation admitted it, so there's no doc
updates here.

This also fixes some issues concerning enforcing DOMAIN NOT NULL
constraints against plpgsql variables, although I'm not sure that
that topic is completely dealt with.

I created a new plpgsql test file for these features, and moved the
one relevant existing test case into that file.

Tom Lane, reviewed by Daniel Gustafsson

Discussion: https://postgr.es/m/18362.1514605650@sss.pgh.pa.us
2018-02-13 22:15:08 -05:00
Tom Lane 4b93f57999 Make plpgsql use its DTYPE_REC code paths for composite-type variables.
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible.  In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session.  And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple.  A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.

Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type.  DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.

To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90.  A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests.  In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.

In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type.  This allows very cheap detection of the need to
refresh tupdesc-dependent data.  This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.

In passing, allow plpgsql functions to accept as well as return type
"record".  There was no good reason for the old restriction, and it
was out of step with most of the other PLs.

Tom Lane, reviewed by Pavel Stehule

Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
2018-02-13 18:52:21 -05:00
Peter Eisentraut 7a32ac8a66 Add procedure support to pg_get_functiondef
This also makes procedures work in psql's \ef and \sf commands.

Reported-by: Pavel Stehule <pavel.stehule@gmail.com>
2018-02-13 15:13:44 -05:00
Peter Eisentraut 7cd56f218d Add tests for pg_get_functiondef 2018-02-13 15:13:44 -05:00
Peter Eisentraut a7b8f0661d Fix typo 2018-02-13 15:13:44 -05:00
Tom Lane d02d4a6d4f Avoid premature free of pass-by-reference CALL arguments.
Prematurely freeing the EState used to evaluate CALL arguments led, in some
cases, to passing dangling pointers to the procedure.  This was masked in
trivial cases because the argument pointers would point to Const nodes in
the original expression tree, and in some other cases because the result
value would end up in the standalone ExprContext rather than in memory
belonging to the EState --- but that wasn't exactly high quality
programming either, because the standalone ExprContext was never
explicitly freed, breaking assorted API contracts.

In addition, using a separate EState for each argument was just silly.

So let's use just one EState, and one ExprContext, and make the latter
belong to the former rather than be standalone, and clean up the EState
(and hence the ExprContext) post-call.

While at it, improve the function's commentary a bit.

Discussion: https://postgr.es/m/29173.1518282748@sss.pgh.pa.us
2018-02-10 13:37:12 -05:00
Tom Lane 65b1d76785 Fix oversight in CALL argument handling, and do some minor cleanup.
CALL statements cannot support sub-SELECTs in the arguments of the called
procedure, since they just use ExecEvalExpr to evaluate such arguments.
Teach transformSubLink() to reject the case, as it already does for other
contexts in which subqueries are not supported.

In passing, s/EXPR_KIND_CALL/EXPR_KIND_CALL_ARGUMENT/ to make that enum
symbol line up more closely with the phrasing of the error messages it is
associated with.  And fix someone's weak grasp of English grammar in the
preceding EXPR_KIND_PARTITION_EXPRESSION addition.  Also update an
incorrect comment in resolve_unique_index_expr (possibly it was correct
when written, but nowadays transformExpr definitely does reject SRFs here).

Per report from Pavel Stehule --- but this resolves only one of the bugs
he mentions.

Discussion: https://postgr.es/m/CAFj8pRDxOwPPzpA8i+AQeDQFj7bhVw-dR2==rfWZ3zMGkm568Q@mail.gmail.com
2018-02-10 13:05:14 -05:00
Peter Eisentraut 7c44b75a2a Make new triggers tests more robust
Add explicit collation on the trigger name to avoid locale dependencies.
Also restrict the tables selected, to avoid interference from
concurrently running tests.
2018-02-07 14:57:19 -05:00
Peter Eisentraut 32ff269117 Add more information_schema columns
- table_constraints.enforced
- triggers.action_order
- triggers.action_reference_old_table
- triggers.action_reference_new_table

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2018-02-07 10:08:02 -05:00
Tom Lane 0a459cec96 Support all SQL:2011 options for window frame clauses.
This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING"
frame boundaries in window functions.  We'd punted on that back in the
original patch to add window functions, because it was not clear how to
do it in a reasonably data-type-extensible fashion.  That problem is
resolved here by adding the ability for btree operator classes to provide
an "in_range" support function that defines how to add or subtract the
RANGE offset value.  Factoring it this way also allows the operator class
to avoid overflow problems near the ends of the datatype's range, if it
wishes to expend effort on that.  (In the committed patch, the integer
opclasses handle that issue, but it did not seem worth the trouble to
avoid overflow failures for datetime types.)

The patch includes in_range support for the integer_ops opfamily
(int2/int4/int8) as well as the standard datetime types.  Support for
other numeric types has been requested, but that seems like suitable
material for a follow-on patch.

In addition, the patch adds GROUPS mode which counts the offset in
ORDER-BY peer groups rather than rows, and it adds the frame_exclusion
options specified by SQL:2011.  As far as I can see, we are now fully
up to spec on window framing options.

Existing behaviors remain unchanged, except that I changed the errcode
for a couple of existing error reports to meet the SQL spec's expectation
that negative "offset" values should be reported as SQLSTATE 22013.

Internally and in relevant parts of the documentation, we now consistently
use the terminology "offset PRECEDING/FOLLOWING" rather than "value
PRECEDING/FOLLOWING", since the term "value" is confusingly vague.

Oliver Ford, reviewed and whacked around some by me

Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
2018-02-07 00:06:56 -05:00
Robert Haas f069c91a57 Fix possible crash in partition-wise join.
The previous code assumed that we'd always succeed in creating
child-joins for a joinrel for which partition-wise join was considered,
but that's not guaranteed, at least in the case where dummy rels
are involved.

Ashutosh Bapat, with some wordsmithing by me.

Discussion: http://postgr.es/m/CAFjFpRf8=uyMYYfeTBjWDMs1tR5t--FgOe2vKZPULxxdYQ4RNw@mail.gmail.com
2018-02-05 17:31:57 -05:00
Tom Lane 3492a0af0b Fix RelationBuildPartitionKey's processing of partition key expressions.
Failure to advance the list pointer while reading partition expressions
from a list results in invoking an input function with inappropriate data,
possibly leading to crashes or, with carefully crafted input, disclosure
of arbitrary backend memory.

Bug discovered independently by Álvaro Herrera and David Rowley.
This patch is by Álvaro but owes something to David's proposed fix.
Back-patch to v10 where the issue was introduced.

Security: CVE-2018-1052
2018-02-05 10:37:30 -05:00
Peter Eisentraut 533c5d8bdd Fix application of identity values in some cases
Investigation of 2d2d06b7e2 revealed that
identity values were not applied in some further cases, including
logical replication subscribers, VALUES RTEs, and ALTER TABLE ... ADD
COLUMN.  To fix all that, apply the identity column expression in
build_column_default() instead of repeating the same logic at each call
site.

For ALTER TABLE ... ADD COLUMN ... IDENTITY, the previous coding
completely ignored that existing rows for the new column should have
values filled in from the identity sequence.  The coding using
build_column_default() fails for this because the sequence ownership
isn't registered until after ALTER TABLE, and we can't do it before
because we don't have the column in the catalog yet.  So we specially
remember in ColumnDef the sequence name that we decided on and build a
custom NextValueExpr using that.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2018-02-02 14:39:10 -05:00
Robert Haas 22757960bb Fix typo: colums -> columns.
Along the way, also fix code indentation.

Alexander Lakhin, reviewed by Michael Paquier

Discussion: http://postgr.es/m/45c44aa7-7cfa-7f3b-83fd-d8300677fdda@gmail.com
2018-01-31 16:45:37 -05:00
Robert Haas 3ccdc6f9a5 Fix list partition constraints for partition keys of array type.
The old code generated always generated a constraint of the form
col = ANY(ARRAY[val1, val2, ...]), but that's invalid when col is an
array type.  Instead, generate col = val when there's only one value,
col = val1 OR col = val2 OR ... when there are multiple values and
col is of array type, and the old form when there are multiple values
and col is not of an array type.

As a side benefit, this makes constraint exclusion able to prune
a list partition declared to accept a single Boolean value, which
didn't work before.

Amit Langote, reviewed by Etsuro Fujita

Discussion: http://postgr.es/m/97267195-e235-89d1-a41a-c110198dfce9@lab.ntt.co.jp
2018-01-31 15:43:11 -05:00
Andres Freund c068f87723 Improve bit perturbation in TupleHashTableHash.
The changes in b81b5a96f4 did not fully
address the issue, because the bit-mixing of the IV into the final
hash-key didn't prevent clustering in the input-data survive in the
output data.

This didn't cause a lot of problems because of the additional growth
conditions added d4c62a6b62. But as we
want to rein those in due to explosive growth in some edges, this
needs to be fixed.

Author: Andres Freund
Discussion: https://postgr.es/m/20171127185700.1470.20362@wrigleys.postgresql.org
Backpatch: 10, where simplehash was introduced
2018-01-29 11:24:57 -08:00
Tom Lane fb8697b31a Avoid unnecessary use of pg_strcasecmp for already-downcased identifiers.
We have a lot of code in which option names, which from the user's
viewpoint are logically keywords, are passed through the grammar as plain
identifiers, and then matched to string literals during command execution.
This approach avoids making words into lexer keywords unnecessarily.  Some
places matched these strings using plain strcmp, some using pg_strcasecmp.
But the latter should be unnecessary since identifiers would have been
downcased on their way through the parser.  Aside from any efficiency
concerns (probably not a big factor), the lack of consistency in this area
creates a hazard of subtle bugs due to different places coming to different
conclusions about whether two option names are the same or different.
Hence, standardize on using strcmp() to match any option names that are
expected to have been fed through the parser.

This does create a user-visible behavioral change, which is that while
formerly all of these would work:
	alter table foo set (fillfactor = 50);
	alter table foo set (FillFactor = 50);
	alter table foo set ("fillfactor" = 50);
	alter table foo set ("FillFactor" = 50);
now the last case will fail because that double-quoted identifier is
different from the others.  However, none of our documentation says that
you can use a quoted identifier in such contexts at all, and we should
discourage doing so since it would break if we ever decide to parse such
constructs as true lexer keywords rather than poor man's substitutes.
So this shouldn't create a significant compatibility issue for users.

Daniel Gustafsson, reviewed by Michael Paquier, small changes by me

Discussion: https://postgr.es/m/29405B24-564E-476B-98C0-677A29805B84@yesql.se
2018-01-26 18:25:14 -05:00
Alvaro Herrera 05fb5d6619 Ignore partitioned indexes where appropriate
get_relation_info() was too optimistic about opening indexes in
partitioned tables, which would raise errors when any queries were
planned on such tables.  Fix by ignoring any indexes of the partitioned
kind.

CLUSTER (and ALTER TABLE CLUSTER ON) had a similar problem.  Fix by
disallowing these commands in partitioned tables.

Fallout from 8b08f7d482.
2018-01-25 16:12:15 -03:00
Peter Eisentraut a61116da8b Add tests for record_image_eq and record_image_cmp
record_image_eq was covered a bit by the materialized view code that it
is meant to support, but record_image_cmp was not tested at all.

While we're here, add more tests to record_eq and record_cmp as well,
for symmetry.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2018-01-24 13:23:57 -05:00
Tom Lane bb94ce4d26 Teach reparameterize_path() to handle AppendPaths.
If we're inside a lateral subquery, there may be no unparameterized paths
for a particular child relation of an appendrel, in which case we *must*
be able to create similarly-parameterized paths for each other child
relation, else the planner will fail with "could not devise a query plan
for the given query".  This means that there are situations where we'd
better be able to reparameterize at least one path for each child.

This calls into question the assumption in reparameterize_path() that
it can just punt if it feels like it.  However, the only case that is
known broken right now is where the child is itself an appendrel so that
all its paths are AppendPaths.  (I think possibly I disregarded that in
the original coding on the theory that nested appendrels would get folded
together --- but that only happens *after* reparameterize_path(), so it's
not excused from handling a child AppendPath.)  Given that this code's been
like this since 9.3 when LATERAL was introduced, it seems likely we'd have
heard of other cases by now if there were a larger problem.

Per report from Elvis Pranskevichus.  Back-patch to 9.3.

Discussion: https://postgr.es/m/5981018.zdth1YWmNy@hammer.magicstack.net
2018-01-23 16:50:34 -05:00
Robert Haas 2f17844104 Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint.  In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one.  This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented.  (There is a pending patch to improve the
situation further, but it needs more review.)

Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me.  A few final revisions by me.

Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 15:33:06 -05:00
Peter Eisentraut 8b9e9644dc Replace AclObjectKind with ObjectType
AclObjectKind was basically just another enumeration for object types,
and we already have a preferred one for that.  It's only used in
aclcheck_error.  By using ObjectType instead, we can also give some more
precise error messages, for example "index" instead of "relation".

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2018-01-19 14:01:15 -05:00
Alvaro Herrera 189d0ff588 Fix regression tests for better stability
Per buildfarm
2018-01-19 12:31:34 -03:00
Alvaro Herrera 8b08f7d482 Local partitioned indexes
When CREATE INDEX is run on a partitioned table, create catalog entries
for an index on the partitioned table (which is just a placeholder since
the table proper has no data of its own), and recurse to create actual
indexes on the existing partitions; create them in future partitions
also.

As a convenience gadget, if the new index definition matches some
existing index in partitions, these are picked up and used instead of
creating new ones.  Whichever way these indexes come about, they become
attached to the index on the parent table and are dropped alongside it,
and cannot be dropped on isolation unless they are detached first.

To support pg_dump'ing these indexes, add commands
    CREATE INDEX ON ONLY <table>
(which creates the index on the parent partitioned table, without
recursing) and
    ALTER INDEX ATTACH PARTITION
(which is used after the indexes have been created individually on each
partition, to attach them to the parent index).  These reconstruct prior
database state exactly.

Reviewed-by: (in alphabetical order) Peter Eisentraut, Robert Haas, Amit
	Langote, Jesper Pedersen, Simon Riggs, David Rowley
Discussion: https://postgr.es/m/20171113170646.gzweigyrgg6pwsg4@alvherre.pgsql
2018-01-19 11:49:22 -03:00
Peter Eisentraut 77216cae47 Add tests for session_replication_role
This was hardly tested at all.  The trigger case was lightly tested by
the logical replication tests, but rules and event triggers were not
tested at all.
2018-01-18 11:24:07 -05:00
Tom Lane 90947674fc Fix incorrect handling of subquery pullup in the presence of grouping sets.
If we flatten a subquery whose target list contains constants or
expressions, when those output columns are used in GROUPING SET columns,
the planner was capable of doing the wrong thing by merging a pulled-up
expression into the surrounding expression during const-simplification.
Then the late processing that attempts to match subexpressions to grouping
sets would fail to match those subexpressions to grouping sets, with the
effect that they'd not go to null when expected.

To fix, wrap such subquery outputs in PlaceHolderVars, ensuring that
they preserve their separate identity throughout the planner's expression
processing.  This is a bit of a band-aid, because the wrapper defeats
const-simplification even in places where it would be safe to allow.
But a nicer fix would likely be too invasive to back-patch, and the
consequences of the missed optimizations probably aren't large in most
cases.

Back-patch to 9.5 where grouping sets were introduced.

Heikki Linnakangas, with small mods and better test cases by me;
additional review by Andrew Gierth

Discussion: https://postgr.es/m/7dbdcf5c-b5a6-ef89-4958-da212fe10176@iki.fi
2018-01-12 12:24:50 -05:00
Bruce Momjian bdb70c12b3 C comment: fix "the the" mentions in C comments
Reported-by: Christoph Dreis

Discussion: https://postgr.es/m/007e01d3519e$2734ca10$759e5e30$@freenet.de

Author: Christoph Dreis
2018-01-11 21:50:21 -05:00
Peter Eisentraut 5115854170 Add tests for PL/pgSQL returning unnamed portals as refcursor
Existing tests only covered returning explicitly named portals as
refcursor.  The unnamed cursor case was recently broken without a test
failing.
2018-01-10 16:39:13 -05:00
Andrew Dunstan 11b623dd0a Implement TZH and TZM timestamp format patterns
These are compatible with Oracle and required for the datetime template
language for jsonpath in an upcoming patch.

Nikita Glukhov and Andrew Dunstan, reviewed by Pavel Stehule.
2018-01-09 14:25:05 -05:00
Tom Lane 624e440a47 Improve the heuristic for ordering child paths of a parallel append.
Commit ab7271677 introduced code that attempts to order the child
scans of a Parallel Append node in a way that will minimize execution
time, based on total cost and startup cost.  However, it failed to
think hard about what to do when estimated costs are exactly equal;
a case that's particularly likely to occur when comparing on startup
cost.  In such a case the ordering of the child paths would be left
to the whims of qsort, an algorithm that isn't even stable.

We can improve matters by applying the rule used elsewhere in the
planner: if total costs are equal, sort on startup cost, and
vice versa.  When both cost estimates are exactly equal, rather
than letting qsort do something unpredictable, sort based on the
child paths' relids, which should typically result in sorting in
inheritance order.  (The latter provision requires inventing a
qsort-style comparator for bitmapsets, but maybe we'll have use
for that for other reasons in future.)

This results in a few plan changes in the select_parallel test,
but those all look more reasonable than before, when the actual
underlying cost numbers are taken into account.

Discussion: https://postgr.es/m/4944.1515446989@sss.pgh.pa.us
2018-01-09 13:07:52 -05:00
Tom Lane 934c7986f4 Tweak parallel hash join test case in hopes of improving stability.
This seems to make things better on gaur, let's see what the rest
of the buildfarm thinks.

Thomas Munro

Discussion: https://postgr.es/m/CAEepm=1uuT8iJxMEsR=jL+3zEi87DB2v0+0H9o_rUXXCZPZT3A@mail.gmail.com
2018-01-04 01:06:58 -05:00
Tom Lane 3decd150a2 Teach eval_const_expressions() to handle some more cases.
Add some infrastructure (mostly macros) to make it easier to write
typical cases for constant-expression simplification.  Add simplification
processing for ArrayRef, RowExpr, and ScalarArrayOpExpr node types,
which formerly went unsimplified even if all their inputs were constants.
Also teach it to simplify FieldSelect from a composite constant.
Make use of the new infrastructure to reduce the amount of code needed
for the existing ArrayExpr and ArrayCoerceExpr cases.

One existing test case changes output as a result of the fact that
RowExpr can now be folded to a constant.  All the new code is exercised
by existing test cases according to gcov, so I feel no need to add
additional tests.

Tom Lane, reviewed by Dmitry Dolgov

Discussion: https://postgr.es/m/3be3b82c-e29c-b674-2163-bf47d98817b1@iki.fi
2018-01-03 12:35:09 -05:00
Tom Lane dd2243f2ad Improve regression tests' code coverage for plpgsql control structures.
I noticed that our code coverage report showed considerable deficiency
in test coverage for PL/pgSQL control statements.  Notably, both
exec_stmt_block and most of the loop control statements had very poor
coverage of handling of return/exit/continue result codes from their
child statements; and exec_stmt_fori was seriously lacking in feature
coverage, having no test that exercised its BY or REVERSE features,
nor verification that its overflow defenses work.

Now that we have some infrastructure for plpgsql-specific test scripts,
the natural thing to do is make a new script rather than further extend
plpgsql.sql.  So I created a new script plpgsql_control.sql with the
charter to test plpgsql control structures, and moved a few existing
tests there because they fell entirely under that charter.  I then
added new test cases that exercise the bits of code complained of above.

Of the five kinds of loop statements, only exec_stmt_while's result code
handling is fully exercised by these tests.  That would be a deficiency
as things stand, but a follow-on commit will merge the loop statements'
result code handling into one implementation.  So testing each usage of
that implementation separately seems redundant.

In passing, also add a couple test cases to plpgsql.sql to more fully
exercise plpgsql's code related to expanded arrays --- I had thought
that area was sufficiently covered already, but the coverage report
showed a couple of un-executed code paths.

Discussion: https://postgr.es/m/26314.1514670401@sss.pgh.pa.us
2017-12-31 17:04:11 -05:00
Teodor Sigaev ff963b393c Add polygon opclass for SP-GiST
Polygon opclass uses compress method feature of SP-GiST added earlier. For now
it's a single operator class which uses this feature. SP-GiST actually indexes
a bounding boxes of input polygons, so part of supported operations are lossy.
Opclass uses most methods of corresponding opclass over boxes of SP-GiST and
treats bounding boxes as point in 4D-space.

Bump catalog version.

Authors: Nikita Glukhov, Alexander Korotkov with minor editorization by me
Reviewed-By: all authors + Darafei Praliaskouski
Discussion: https://www.postgresql.org/message-id/flat/54907069.1030506@sigaev.ru
2017-12-25 18:59:38 +03:00
Tom Lane c4c2885cbb Fix UNION/INTERSECT/EXCEPT over no columns.
Since 9.4, we've allowed the syntax "select union select" and variants
of that.  However, the planner wasn't expecting a no-column set operation
and ended up treating the set operation as if it were UNION ALL.

Turns out it's trivial to fix in v10 and later; we just need to be careful
about not generating a Sort node with no sort keys.  However, since a weird
corner case like this is never going to be exercised by developers, we'd
better have thorough regression tests if we want to consider it supported.

Per report from Victor Yegorov.

Discussion: https://postgr.es/m/CAGnEbojGJrRSOgJwNGM7JSJZpVAf8xXcVPbVrGdhbVEHZ-BUMw@mail.gmail.com
2017-12-22 12:08:06 -05:00
Andres Freund 1804284042 Add parallel-aware hash joins.
Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel
Hash Join with Parallel Hash.  While hash joins could already appear in
parallel queries, they were previously always parallel-oblivious and had a
partial subplan only on the outer side, meaning that the work of the inner
subplan was duplicated in every worker.

After this commit, the planner will consider using a partial subplan on the
inner side too, using the Parallel Hash node to divide the work over the
available CPU cores and combine its results in shared memory.  If the join
needs to be split into multiple batches in order to respect work_mem, then
workers process different batches as much as possible and then work together
on the remaining batches.

The advantages of a parallel-aware hash join over a parallel-oblivious hash
join used in a parallel query are that it:

 * avoids wasting memory on duplicated hash tables
 * avoids wasting disk space on duplicated batch files
 * divides the work of building the hash table over the CPUs

One disadvantage is that there is some communication between the participating
CPUs which might outweigh the benefits of parallelism in the case of small
hash tables.  This is avoided by the planner's existing reluctance to supply
partial plans for small scans, but it may be necessary to estimate
synchronization costs in future if that situation changes.  Another is that
outer batch 0 must be written to disk if multiple batches are required.

A potential future advantage of parallel-aware hash joins is that right and
full outer joins could be supported, since there is a single set of matched
bits for each hashtable, but that is not yet implemented.

A new GUC enable_parallel_hash is defined to control the feature, defaulting
to on.

Author: Thomas Munro
Reviewed-By: Andres Freund, Robert Haas
Tested-By: Rafia Sabih, Prabhat Sahu
Discussion:
    https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
    https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com
2017-12-21 00:43:41 -08:00
Robert Haas 7d3583ad9a Test instrumentation of Hash nodes with parallel query.
Commit 8526bcb2df fixed bugs related
to both Sort and Hash, but only added a test case for Sort.  This
adds a test case for Hash to match.

Thomas Munro

Discussion: http://postgr.es/m/CAEepm=2-LRnfwUBZDqQt+XAcd0af_ykNyyVvP3h1uB1AQ=e-eA@mail.gmail.com
2017-12-19 15:29:08 -05:00
Robert Haas 8526bcb2df Try again to fix accumulation of parallel worker instrumentation.
When a Gather or Gather Merge node is started and stopped multiple
times, accumulate instrumentation data only once, at the end, instead
of after each execution, to avoid recording inflated totals.

Commit 778e78ae9f, the previous attempt
at a fix, instead reset the state after every execution, which worked
for the general instrumentation data but had problems for the additional
instrumentation specific to Sort and Hash nodes.

Report by hubert depesz lubaczewski.  Analysis and fix by Amit Kapila,
following a design proposal from Thomas Munro, with a comment tweak
by me.

Discussion: http://postgr.es/m/20171127175631.GA405@depesz.com
2017-12-19 12:21:56 -05:00
Robert Haas 1d6fb35ad6 Revert "Fix accumulation of parallel worker instrumentation."
This reverts commit 2c09a5c12a.  Per
further discussion, that doesn't seem to be the best possible fix.

Discussion: http://postgr.es/m/CAA4eK1LW2aFKzY3=vwvc=t-juzPPVWP2uT1bpx_MeyEqnM+p8g@mail.gmail.com
2017-12-13 15:19:28 -05:00
Peter Eisentraut 632b03da31 Start a separate test suite for plpgsql
The plpgsql.sql test file in the main regression tests is now by far the
largest after numeric_big, making editing and managing the test cases
very cumbersome.  The other PLs have their own test suites split up into
smaller files by topic.  It would be nice to have that for plpgsql as
well.  So, to get that started, set up test infrastructure in
src/pl/plpgsql/src/ and split out the recently added procedure test
cases into a new file there.  That file now mirrors the test cases added
to the other PLs, making managing those matching tests a bit easier too.

msvc build system changes with help from Michael Paquier
2017-12-13 11:02:29 -05:00
Peter Eisentraut 3d8874224f Fix crash when using CALL on an aggregate
Author: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
Reported-by: Rushabh Lathia <rushabh.lathia@gmail.com>
2017-12-13 10:37:48 -05:00
Tom Lane 9edc97b712 Stabilize output of new regression test case.
The test added by commit 390d58135 turns out to have different output
in CLOBBER_CACHE_ALWAYS builds: there's an extra CONTEXT line in the
error message as a result of detecting the error at a different place.
Possibly we should do something to make that more consistent.  But as
a stopgap measure to make the buildfarm green again, adjust the test
to suppress CONTEXT entirely.  We can revert this if we do something
in the backend to eliminate the inconsistency.

Discussion: https://postgr.es/m/31545.1512924904@sss.pgh.pa.us
2017-12-10 12:44:03 -05:00
Tom Lane 390d58135b Fix plpgsql to reinitialize record variables at block re-entry.
If one exits and re-enters a DECLARE ... BEGIN ... END block within a
single execution of a plpgsql function, perhaps due to a surrounding loop,
the declared variables are supposed to get re-initialized to null (or
whatever their initializer is).  But this failed to happen for variables
of type "record", because while exec_stmt_block() expected such variables
to be included in the block's initvarnos list, plpgsql_add_initdatums()
only adds DTYPE_VAR variables to that list.  This bug appears to have
been there since the aboriginal addition of plpgsql to our tree.

Fix by teaching plpgsql_add_initdatums() to include DTYPE_REC variables
as well.  (We don't need to consider other DTYPEs because they don't
represent separately-stored values.)  I failed to resist the temptation
to make some nearby cosmetic adjustments, too.

No back-patch, because there have not been field complaints, and it
seems possible that somewhere out there someone has code depending
on the incorrect behavior.  In any case this change would have no
impact on correctly-written code.

Discussion: https://postgr.es/m/22994.1512800671@sss.pgh.pa.us
2017-12-09 12:03:04 -05:00
Magnus Hagander ce1468d02b Fix regression test output
Missed this in the last commit.
2017-12-09 13:45:06 +01:00
Peter Eisentraut 005ac298b1 Prohibit identity columns on typed tables and partitions
Those cases currently crash and supporting them is more work then
originally thought, so we'll just prohibit these scenarios for now.

Author: Michael Paquier <michael.paquier@gmail.com>
Reviewed-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Reported-by: Мансур Галиев <gomer94@yandex.ru>
Bug: #14866
2017-12-08 12:13:04 -05:00
Peter Eisentraut 2d2d06b7e2 Apply identity sequence values on COPY
A COPY into a table should apply identity sequence values just like it
does for ordinary defaults.  This was previously forgotten, leading to
null values being inserted, which in turn would fail because identity
columns have not-null constraints.

Author: Michael Paquier <michael.paquier@gmail.com>
Reported-by: Steven Winfield <steven.winfield@cantabcapital.com>
Bug: #14952
2017-12-08 09:18:18 -05:00
Tom Lane 979a36c389 Adjust regression test cases added by commit ab7271677.
I suppose it is a copy-and-paste error that this test doesn't actually
test the "Parallel Append with both partial and non-partial subplans"
case (EXPLAIN alone surely doesn't qualify as a test of executor
behavior).  Fix that.

Also, add cosmetic aliases to make it possible to tell apart these
otherwise-identical test cases in log_statement output.
2017-12-05 22:40:43 -05:00
Robert Haas ab72716778 Support Parallel Append plan nodes.
When we create an Append node, we can spread out the workers over the
subplans instead of piling on to each subplan one at a time, which
should typically be a bit more efficient, both because the startup
cost of any plan executed entirely by one worker is paid only once and
also because of reduced contention.  We can also construct Append
plans using a mix of partial and non-partial subplans, which may allow
for parallelism in places that otherwise couldn't support it.
Unfortunately, this patch doesn't handle the important case of
parallelizing UNION ALL by running each branch in a separate worker;
the executor infrastructure is added here, but more planner work is
needed.

Amit Khandekar, Robert Haas, Amul Sul, reviewed and tested by
Ashutosh Bapat, Amit Langote, Rafia Sabih, Amit Kapila, and
Rajkumar Raghuwanshi.

Discussion: http://postgr.es/m/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1+S+vRuUQ@mail.gmail.com
2017-12-05 17:28:39 -05:00
Robert Haas 2c09a5c12a Fix accumulation of parallel worker instrumentation.
When a Gather or Gather Merge node is started and stopped multiple
times, the old code wouldn't reset the shared state between executions,
potentially resulting in dramatically inflated instrumentation data
for nodes beneath it.  (The per-worker instrumentation ended up OK,
I think, but the overall totals were inflated.)

Report by hubert depesz lubaczewski.  Analysis and fix by Amit Kapila,
reviewed and tweaked a bit by me.

Discussion: http://postgr.es/m/20171127175631.GA405@depesz.com
2017-12-05 14:35:33 -05:00
Andres Freund 5bcf389ecf Fix EXPLAIN ANALYZE of hash join when the leader doesn't participate.
If a hash join appears in a parallel query, there may be no hash table
available for explain.c to inspect even though a hash table may have
been built in other processes.  This could happen either because
parallel_leader_participation was set to off or because the leader
happened to hit the end of the outer relation immediately (even though
the complete relation is not empty) and decided not to build the hash
table.

Commit bf11e7ee introduced a way for workers to exchange
instrumentation via the DSM segment for Sort nodes even though they
are not parallel-aware.  This commit does the same for Hash nodes, so
that explain.c has a way to find instrumentation data from an
arbitrary participant that actually built the hash table.

Author: Thomas Munro
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAEepm%3D3DUQC2-z252N55eOcZBer6DPdM%3DFzrxH9dZc5vYLsjaA%40mail.gmail.com
2017-12-05 10:55:56 -08:00
Robert Haas 87c37e3291 Re-allow INSERT .. ON CONFLICT DO NOTHING on partitioned tables.
Commit 8355a011a0 was reverted in
f05230752d, but this attempt is
hopefully better-considered: we now pass the correct value to
ExecOpenIndices, which should avoid the crash that we hit before.

Amit Langote, reviewed by Simon Riggs and by me.  Some final
editing by me.

Discussion: http://postgr.es/m/7ff1e8ec-dc39-96b1-7f47-ff5965dceeac@lab.ntt.co.jp
2017-12-01 12:53:21 -05:00
Robert Haas 59c8078744 Fix uninitialized memory reference.
Without this, when partdesc->nparts == 0, we end up calling
ExecBuildSlotPartitionKeyDescription without initializing values
and isnull.

Reported by Coverity via Michael Paquier.  Patch by Michael Paquier,
reviewed and revised by Amit Langote.

Discussion: http://postgr.es/m/CAB7nPqQ3mwkdMoPY-ocgTpPnjd8TKOadMxdTtMLvEzF8480Zfg@mail.gmail.com
2017-12-01 10:05:00 -05:00
Peter Eisentraut e4128ee767 SQL procedures
This adds a new object type "procedure" that is similar to a function
but does not have a return type and is invoked by the new CALL statement
instead of SELECT or similar.  This implementation is aligned with the
SQL standard and compatible with or similar to other SQL implementations.

This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well
as ALTER/DROP ROUTINE that can refer to either a function or a
procedure (or an aggregate function, as an extension to SQL).  There is
also support for procedures in various utility commands such as COMMENT
and GRANT, as well as support in pg_dump and psql.  Support for defining
procedures is available in all the languages supplied by the core
distribution.

While this commit is mainly syntax sugar around existing functionality,
future features will rely on having procedures as a separate object
type.

Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
2017-11-30 11:03:20 -05:00
Tom Lane 7ca25b7de6 Fix neqjoinsel's behavior for semi/anti join cases.
Previously, this function estimated the selectivity as 1 minus eqjoinsel()
for the negator equality operator, regardless of join type (I think there
was an expectation that eqjoinsel would handle the join type).  But
actually this is completely wrong for semijoin cases: the fraction of the
LHS that has a non-matching row is not one minus the fraction of the LHS
that has a matching row.  In reality a semijoin with <> will nearly always
succeed: it can only fail when the RHS is empty, or it contains a single
distinct value that is equal to the particular LHS value, or the LHS value
is null.  The only one of those things we should have much confidence in
estimating is the fraction of LHS values that are null, so let's just take
the selectivity as 1 minus outer nullfrac.

Per coding convention, antijoin should be estimated the same as semijoin.

Arguably this is a bug fix, but in view of the lack of field complaints
and the risk of destabilizing plans, no back-patch.

Thomas Munro, reviewed by Ashutosh Bapat

Discussion: https://postgr.es/m/CAEepm=270ze2hVxWkJw-5eKzc3AB4C9KpH3L2kih75R5pdSogg@mail.gmail.com
2017-11-29 22:00:37 -05:00
Andres Freund fa330f9adf Add some regression tests that exercise hash join code.
Although hash joins are already tested by many queries, these tests
systematically cover the four different states we can reach as part of
the strategy for respecting work_mem.

Author: Thomas Munro
Reviewed-By: Andres Freund
2017-11-29 16:06:50 -08:00
Robert Haas 8d4e70a63b Add extensive tests for partition pruning.
Currently, partition pruning happens via constraint exclusion, but
there are pending places to replace that with a different and
hopefully faster mechanism.  To be sure that we don't change behavior
without realizing it, add extensive test coverage.

Note that not all of these behaviors are optimal; in some cases,
partitions are not pruned even though it would be safe to do so.
These tests therefore serve to memorialize the current state rather
than the ideal state.  Patches that improve things can update the test
results as appropriate.

Amit Langote, adjusted by me.  Review and testing of the larger patch
set of which this is a part by Ashutosh Bapat, David Rowley, Dilip
Kumar, Jesper Pedersen, Rajkumar Raghuwanshi, Beena Emerson, Amul Sul,
and Kyotaro Horiguchi.

Discussion: http://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp
2017-11-29 15:25:29 -05:00
Robert Haas 2d7950f222 If a range-partitioned table has no default partition, reject null keys.
Commit 4e5fe9ad19 introduced this
problem.  Also add a test so it doesn't get broken again.

Report by Rushabh Lathia.  Fix by Amit Langote.  Reviewed by Rushabh
Lathia and Amul Sul.  Tweaked by me.

Discussion: http://postgr.es/m/CAGPqQf0Y1iJyk4QJBdMf=pS9i6Q0JUMM_h5-qkR3OMJ-e04PyA@mail.gmail.com
2017-11-28 14:11:16 -05:00
Robert Haas 7b88d63a91 Add null test to partition constraint for default range partitions.
Non-default range partitions have a constraint which include null
tests, and both default and non-default list partitions also have a
constraint which includes null tests, but for some reason this was
missed for default range partitions.  This could cause the partition
constraint to evaluate to false for rows that were (correctly) routed
to that partition by insert tuple routing, which could in turn
cause constraint exclusion to prune the default partition in cases
where it should not.

Amit Langote, reviewed by Kyotaro Horiguchi

Discussion: http://postgr.es/m/ba7aaeb1-4399-220e-70b4-62eade1522d0@lab.ntt.co.jp
2017-11-28 10:51:01 -05:00
Tom Lane 9b63c13f0a Repair failure with SubPlans in multi-row VALUES lists.
When nodeValuesscan.c was written, it was impossible to have a SubPlan in
VALUES --- any sub-SELECT there would have to be uncorrelated and thereby
would produce an InitPlan instead.  We therefore took a shortcut in the
logic that throws away a ValuesScan's per-row expression evaluation data
structures.  This was broken by the introduction of LATERAL however; a
sub-SELECT containing a lateral reference produces a correlated SubPlan.

The cleanest fix for this would be to give up the optimization of
discarding the expression eval state.  But that still seems pretty
unappetizing for long VALUES lists.  It seems to work to just prevent
the subexpressions from hooking into the ValuesScan node's subPlan
list, so let's do that and see how well it works.  (If this breaks,
due to additional connections between the subexpressions and the outer
query structures, we might consider compromises like throwing away data
only for VALUES rows not containing SubPlans.)

Per bug #14924 from Christian Duta.  Back-patch to 9.3 where LATERAL
was introduced.

Discussion: https://postgr.es/m/20171124120836.1463.5310@wrigleys.postgresql.org
2017-11-25 14:15:48 -05:00
Tom Lane e842791b0f Fix unstable regression test added by commits 59b71c6fe et al.
The query didn't really have a preferred index, leading to platform-
specific choices of which one to use.  Adjust it to make sure tenk1_hundred
is always chosen.

Per buildfarm.
2017-11-24 00:29:20 -05:00
Andres Freund 59b71c6fe6 Fix handling of NULLs returned by aggregate combine functions.
When strict aggregate combine functions, used in multi-stage/parallel
aggregation, returned NULL, we didn't check for that, invoking the
combine function with NULL the next round, despite it being strict.

The equivalent code invoking normal transition functions has a check
for that situation, which did not get copied in a7de3dc5c3. Fix the
bug by adding the equivalent check.

Based on a quick look I could not find any strict combine functions in
core actually returning NULL, and it doesn't seem very likely external
users have done so. So this isn't likely to have caused issues in
practice.

Add tests verifying transition / combine functions returning NULL is
tested.

Reported-By: Andres Freund
Author: Andres Freund
Discussion: https://postgr.es/m/20171121033642.7xvmjqrl4jdaaat3@alap3.anarazel.de
Backpatch: 9.6, where parallel aggregation was introduced
2017-11-23 17:15:27 -08:00
Simon Riggs 3bae43ca4d Sort default partition to bottom of psql \d+
Minor patch to change sort order only

Author: Ashutosh Bapat
Reviewed-by:  Álvaro Herrera, Simon Riggs
2017-11-23 05:17:47 +11:00
Simon Riggs 05b6ec39d7 Show partition info from psql \d+
Author: Amit Langote, Ashutosh Bapat
Reviewed-by:  Álvaro Herrera, Simon Riggs
2017-11-23 05:10:39 +11:00
Robert Haas f3b0897a12 Fix multiple problems with satisfies_hash_partition.
Fix the function header comment to describe the actual behavior.
Check that table OID, modulus, and remainder arguments are not NULL
before accessing them.  Check that the modulus and remainder are
sensible.  If the table OID doesn't exist, return NULL instead of
emitting an internal error, similar to what we do elsewhere.  Check
that the actual argument types match, or at least are binary coercible
to, the expected argument types.  Correctly handle invocation of this
function using the VARIADIC syntax.  Add regression tests.

Robert Haas and Amul Sul, per a report by Andreas Seltenreich and
subsequent followup investigation.

Discussion: http://postgr.es/m/871sl4sdrv.fsf@ansel.ydns.eu
2017-11-21 13:06:32 -05:00
Simon Riggs 56f3468622 Reduce test variability for toast_tuple_target test 2017-11-20 12:09:40 +11:00
Simon Riggs c2513365a0 Parameter toast_tuple_target controls TOAST for new rows
Specifies the point at which we try to move long column values
into TOAST tables.

No effect on existing rows.

Discussion: https://postgr.es/m/CANP8+jKsVmw6CX6YP9z7zqkTzcKV1+Uzr3XjKcZW=2Ya00OyQQ@mail.gmail.com

Author: Simon Riggs <simon@2ndQudrant.com>
Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndQuadrant.com>
2017-11-20 09:50:10 +11:00
Tom Lane 63ca86318d Fix quoted-substring handling in format parsing for to_char/to_number/etc.
This code evidently intended to treat backslash as an escape character
within double-quoted substrings, but it was sufficiently confused that
cases like ..."foo\\"... did not work right: the second backslash
managed to quote the double-quote after it, despite being quoted itself.
Rewrite to get that right, while preserving the existing behavior
outside double-quoted substrings, which is that backslash isn't special
except in the combination \".

Comparing to Oracle, it seems that their version of to_char() for
timestamps allows literal alphanumerics only within double quotes, while
non-alphanumerics are allowed outside quotes; backslashes aren't special
anywhere; there is no way at all to emit a literal double quote.
(Bizarrely, their to_char() for numbers is different; it doesn't allow
literal text at all AFAICT.)  The fact that they don't treat backslash
as special justifies our existing behavior for backslash outside double
quotes.  I considered making backslash inside double quotes act the same
way (ie, special only if before "), which in a green field would be a
more consistent behavior.  But that would likely break more existing SQL
code than what this patch does.

Add some test cases illustrating this behavior.  (Only the last new
case actually changes behavior in this commit.)

Little of this behavior was documented, either, so fix that.

Discussion: https://postgr.es/m/3626.1510949486@sss.pgh.pa.us
2017-11-18 12:16:37 -05:00
Tom Lane e87d4965bd Prevent to_number() from losing data when template doesn't match exactly.
Non-data template patterns would consume characters whether or not those
characters were what the pattern expected, for example
	SELECT TO_NUMBER('1234', '9,999');
produced 134 because the '2' got eaten by the comma pattern.  This seems
undesirable, not least because it doesn't happen in Oracle.  For the ','
and 'G' template patterns, we can fix this by consuming characters only
if they match what the pattern would output.  For non-data patterns such
as 'L' and 'TH', it seems impractical to tighten things up to the point of
consuming only exact matches to what the pattern would output; but we can
improve matters quite a lot by redefining the behavior as "consume only
characters that aren't digits, signs, decimal point, or comma".

Also, fix it so that the behavior is to consume the number of *characters*
the pattern would output, not the number of *bytes*.  The old coding would
do surprising things with non-ASCII currency symbols, for example.  (It
would be good to apply that rule for literal text as well, but this commit
only fixes it for non-data patterns.)

Oliver Ford, reviewed by Thomas Munro and Nathan Wagner, and whacked around
a bit more by me

Discussion: https://postgr.es/m/CAGMVOdvpbMqPf9XWNzOwBpzJfErkydr_fEGhmuDGa015z97mwg@mail.gmail.com
2017-11-17 12:04:13 -05:00
Robert Haas be92769e4e Set proargmodes for satisfies_hash_partition.
It appears that proargmodes should always be set for variadic
functions, but satifies_hash_partition had it as NULL.  In addition to
fixing the problem, add a regression test to guard against future
mistakes of this type.
2017-11-17 11:53:00 -05:00
Robert Haas e89a71fb44 Pass InitPlan values to workers via Gather (Merge).
If a PARAM_EXEC parameter is used below a Gather (Merge) but the InitPlan
that computes it is attached to or above the Gather (Merge), force the
value to be computed before starting parallelism and pass it down to all
workers.  This allows us to use parallelism in cases where it previously
would have had to be rejected as unsafe.  We do - in this case - lose the
optimization that the value is only computed if it's actually used.  An
alternative strategy would be to have the first worker that needs the value
compute it, but one downside of that approach is that we'd then need to
select a parallel-safe path to compute the parameter value; it couldn't for
example contain a Gather (Merge) node.  At some point in the future, we
might want to consider both approaches.

Independent of that consideration, there is a great deal more work that
could be done to make more kinds of PARAM_EXEC parameters parallel-safe.
This infrastructure could be used to allow a Gather (Merge) on the inner
side of a nested loop (although that's not a very appealing plan) and
cases where the InitPlan is attached below the Gather (Merge) could be
addressed as well using various techniques.  But this is a good start.

Amit Kapila, reviewed and revised by me.  Reviewing and testing from
Kuntal Ghosh, Haribabu Kommi, and Tushar Ahuja.

Discussion: http://postgr.es/m/CAA4eK1LV0Y1AUV4cUCdC+sYOx0Z0-8NAJ2Pd9=UKsbQ5Sr7+JQ@mail.gmail.com
2017-11-16 12:06:14 -05:00
Robert Haas e5253fdc4f Add parallel_leader_participation GUC.
Sometimes, for testing, it's useful to have the leader do nothing but
read tuples from workers; and it's possible that could work out better
even in production.

Thomas Munro, reviewed by Amit Kapila and by me.  A few final tweaks
by me.

Discussion: http://postgr.es/m/CAEepm=2U++Lp3bNTv2Bv_kkr5NE2pOyHhxU=G0YTa4ZhSYhHiw@mail.gmail.com
2017-11-15 08:23:18 -05:00
Tom Lane 7518049980 Prevent int128 from requiring more than MAXALIGN alignment.
Our initial work with int128 neglected alignment considerations, an
oversight that came back to bite us in bug #14897 from Vincent Lachenal.
It is unsurprising that int128 might have a 16-byte alignment requirement;
what's slightly more surprising is that even notoriously lax Intel chips
sometimes enforce that.

Raising MAXALIGN seems out of the question: the costs in wasted disk and
memory space would be significant, and there would also be an on-disk
compatibility break.  Nor does it seem very practical to try to allow some
data structures to have more-than-MAXALIGN alignment requirement, as we'd
have to push knowledge of that throughout various code that copies data
structures around.

The only way out of the box is to make type int128 conform to the system's
alignment assumptions.  Fortunately, gcc supports that via its
__attribute__(aligned()) pragma; and since we don't currently support
int128 on non-gcc-workalike compilers, we shouldn't be losing any platform
support this way.

Although we could have just done pg_attribute_aligned(MAXIMUM_ALIGNOF) and
called it a day, I did a little bit of extra work to make the code more
portable than that: it will also support int128 on compilers without
__attribute__(aligned()), if the native alignment of their 128-bit-int
type is no more than that of int64.

Add a regression test case that exercises the one known instance of the
problem, in parallel aggregation over a bigint column.

This will need to be back-patched, along with the preparatory commit
91aec93e6.  But let's see what the buildfarm makes of it first.

Discussion: https://postgr.es/m/20171110185747.31519.28038@wrigleys.postgresql.org
2017-11-14 15:03:55 -05:00
Robert Haas 44ae64c388 Push target list evaluation through Gather Merge.
We already do this for Gather, but it got overlooked for Gather Merge.

Amit Kapila, with review and minor revisions by Rushabh Lathia
and by me.

Discussion: http://postgr.es/m/CAA4eK1KUC5Uyu7qaifxrjpHxbSeoQh3yzwN3bThnJsmJcZ-qtA@mail.gmail.com
2017-11-13 16:37:42 -05:00
Noah Misch 4b865aee25 Fix previous commit's test, for non-UTF8 databases with non-XML builds.
To ensure stable output, catch one more configuration-specific error.
Back-patch to 9.3, like the commit that added the test.
2017-11-11 13:07:46 -08:00
Noah Misch 2918fcedbf Ignore XML declaration in xpath_internal(), for UTF8 databases.
When a value contained an XML declaration naming some other encoding,
this function interpreted UTF8 bytes as the named encoding, yielding
mojibake.  xml_parse() already has similar logic.  This would be
necessary but not sufficient for non-UTF8 databases, so preserve
behavior there until the xpath facility can support such databases
comprehensively.  Back-patch to 9.3 (all supported versions).

Pavel Stehule and Noah Misch

Discussion: https://postgr.es/m/CAFj8pRC-dM=tT=QkGi+Achkm+gwPmjyOayGuUfXVumCxkDgYWg@mail.gmail.com
2017-11-11 11:10:53 -08:00
Robert Haas 1aba8e651a Add hash partitioning.
Hash partitioning is useful when you want to partition a growing data
set evenly.  This can be useful to keep table sizes reasonable, which
makes maintenance operations such as VACUUM faster, or to enable
partition-wise join.

At present, we still depend on constraint exclusion for partitioning
pruning, and the shape of the partition constraints for hash
partitioning is such that that doesn't work.  Work is underway to fix
that, which should both improve performance and make partitioning
pruning work with hash partitioning.

Amul Sul, reviewed and tested by Dilip Kumar, Ashutosh Bapat, Yugo
Nagata, Rajkumar Raghuwanshi, Jesper Pedersen, and by me.  A few
final tweaks also by me.

Discussion: http://postgr.es/m/CAAJ_b96fhpJAP=ALbETmeLk1Uni_GFZD938zgenhF49qgDTjaQ@mail.gmail.com
2017-11-09 18:07:44 -05:00
Tom Lane 5ecc0d738e Restrict lo_import()/lo_export() via SQL permissions not hard-wired checks.
While it's generally unwise to give permissions on these functions to
anyone but a superuser, we've been moving away from hard-wired permission
checks inside functions in favor of using the SQL permission system to
control access.  Bring lo_import() and lo_export() into compliance with
that approach.

In particular, this removes the manual configuration option
ALLOW_DANGEROUS_LO_FUNCTIONS.  That dates back to 1999 (commit 4cd4a54c8);
it's unlikely anyone has used it in many years.  Moreover, if you really
want such behavior, now you can get it with GRANT ... TO PUBLIC instead.

Michael Paquier

Discussion: https://postgr.es/m/CAB7nPqRHmNOYbETnc_2EjsuzSM00Z+BWKv9sy6tnvSd5gWT_JA@mail.gmail.com
2017-11-09 12:36:58 -05:00
Tom Lane b574228715 Add tests for json{b}_populate_recordset() crash case.
The problem reported as CVE-2017-15098 was already resolved in HEAD by
commit 37a795a60, but let's add the relevant test cases anyway.

Michael Paquier and Tom Lane, per a report from David Rowley.

Security: CVE-2017-15098
2017-11-06 10:29:37 -05:00
Dean Rasheed 87b2ebd352 Always require SELECT permission for ON CONFLICT DO UPDATE.
The update path of an INSERT ... ON CONFLICT DO UPDATE requires SELECT
permission on the columns of the arbiter index, but it failed to check
for that in the case of an arbiter specified by constraint name.

In addition, for a table with row level security enabled, it failed to
check updated rows against the table's SELECT policies when the update
path was taken (regardless of how the arbiter index was specified).

Backpatch to 9.5 where ON CONFLICT DO UPDATE and RLS were introduced.

Security: CVE-2017-15099
2017-11-06 09:19:22 +00:00
Tom Lane 7c70996ebf Allow bitmap scans to operate as index-only scans when possible.
If we don't have to return any columns from heap tuples, and there's
no need to recheck qual conditions, and the heap page is all-visible,
then we can skip fetching the heap page altogether.

Skip prefetching pages too, when possible, on the assumption that the
recheck flag will remain the same from one page to the next.  While that
assumption is hardly bulletproof, it seems like a good bet most of the
time, and better than prefetching pages we don't need.

This commit installs the executor infrastructure, but doesn't change
any planner cost estimates, thus possibly causing bitmap scans to
not be chosen in cases where this change renders them the best choice.
I (tgl) am not entirely convinced that we need to account for this
behavior in the planner, because I think typically the bitmap scan would
get chosen anyway if it's the best bet.  In any case the submitted patch
took way too many shortcuts, resulting in too many clearly-bad choices,
to be committable.

Alexander Kuzmenkov, reviewed by Alexey Chernyshov, and whacked around
rather heavily by me.

Discussion: https://postgr.es/m/239a8955-c0fc-f506-026d-c837e86c827b@postgrespro.ru
2017-11-01 17:38:20 -04:00
Tom Lane af20e2d728 Fix ALTER TABLE code to update domain constraints when needed.
It's possible for dropping a column, or altering its type, to require
changes in domain CHECK constraint expressions; but the code was
previously only expecting to find dependent table CHECK constraints.
Make the necessary adjustments.

This is a fairly old oversight, but it's a lot easier to encounter
the problem in the context of domains over composite types than it
was before.  Given the lack of field complaints, I'm not going to
bother with a back-patch, though I'd be willing to reconsider that
decision if someone does complain.

Patch by me, reviewed by Michael Paquier

Discussion: https://postgr.es/m/30656.1509128130@sss.pgh.pa.us
2017-11-01 13:32:23 -04:00