Commit Graph

294 Commits

Author SHA1 Message Date
Simon Riggs f1464c5380 Improve parse representation for MERGE
Separation of parser data structures from executor, as
requested by Tom Lane. Further improvements possible.

While there, implement error for multiple VALUES clauses via parser
to allow line number of error, as requested by Andres Freund.

Author: Pavan Deolasee

Discussion: https://www.postgresql.org/message-id/CABOikdPpqjectFchg0FyTOpsGXyPoqwgC==OLKWuxgBOsrDDZw@mail.gmail.com
2018-04-06 09:38:59 +01:00
Simon Riggs d204ef6377 MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows
a task that would other require multiple PL statements.
e.g.

MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
  UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
  INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
  DO NOTHING;

MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.

MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.

MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.

Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.

This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.

Various issues reported via sqlsmith by Andreas Seltenreich

Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs

Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 09:28:16 +01:00
Simon Riggs 7cf8a5c302 Revert "Modified files for MERGE"
This reverts commit 354f13855e.
2018-04-02 21:34:15 +01:00
Simon Riggs 354f13855e Modified files for MERGE 2018-04-02 21:12:47 +01:00
Tom Lane 0b11a674fb Fix a boatload of typos in C comments.
Justin Pryzby

Discussion: https://postgr.es/m/20180331105640.GK28454@telsasoft.com
2018-04-01 15:01:28 -04:00
Andrew Dunstan 16828d5c02 Fast ALTER TABLE ADD COLUMN with a non-NULL default
Currently adding a column to a table with a non-NULL default results in
a rewrite of the table. For large tables this can be both expensive and
disruptive. This patch removes the need for the rewrite as long as the
default value is not volatile. The default expression is evaluated at
the time of the ALTER TABLE and the result stored in a new column
(attmissingval) in pg_attribute, and a new column (atthasmissing) is set
to true. Any existing row when fetched will be supplied with the
attmissingval. New rows will have the supplied value or the default and
so will never need the attmissingval.

Any time the table is rewritten all the atthasmissing and attmissingval
settings for the attributes are cleared, as they are no longer needed.

The most visible code change from this is in heap_attisnull, which
acquires a third TupleDesc argument, allowing it to detect a missing
value if there is one. In many cases where it is known that there will
not be any (e.g.  catalog relations) NULL can be passed for this
argument.

Andrew Dunstan, heavily modified from an original patch from Serge
Rielau.
Reviewed by Tom Lane, Andres Freund, Tomas Vondra and David Rowley.

Discussion: https://postgr.es/m/31e2e921-7002-4c27-59f5-51f08404c858@2ndQuadrant.com
2018-03-28 10:43:52 +10:30
Peter Eisentraut 533c5d8bdd Fix application of identity values in some cases
Investigation of 2d2d06b7e2 revealed that
identity values were not applied in some further cases, including
logical replication subscribers, VALUES RTEs, and ALTER TABLE ... ADD
COLUMN.  To fix all that, apply the identity column expression in
build_column_default() instead of repeating the same logic at each call
site.

For ALTER TABLE ... ADD COLUMN ... IDENTITY, the previous coding
completely ignored that existing rows for the new column should have
values filled in from the identity sequence.  The coding using
build_column_default() fails for this because the sequence ownership
isn't registered until after ALTER TABLE, and we can't do it before
because we don't have the column in the catalog yet.  So we specially
remember in ColumnDef the sequence name that we decided on and build a
custom NextValueExpr using that.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2018-02-02 14:39:10 -05:00
Bruce Momjian 9d4649ca49 Update copyright for 2018
Backpatch-through: certain files through 9.3
2018-01-02 23:30:12 -05:00
Tom Lane 9a785ad573 Fix creation of resjunk tlist entries for inherited mixed UPDATE/DELETE.
rewriteTargetListUD's processing is dependent on the relkind of the query's
target table.  That was fine at the time it was made to act that way, even
for queries on inheritance trees, because all tables in an inheritance tree
would necessarily be plain tables.  However, the 9.5 feature addition
allowing some members of an inheritance tree to be foreign tables broke the
assumption that rewriteTargetListUD's output tlist could be applied to all
child tables with nothing more than column-number mapping.  This led to
visible failures if foreign child tables had row-level triggers, and would
also break in cases where child tables belonged to FDWs that used methods
other than CTID for row identification.

To fix, delay running rewriteTargetListUD until after the planner has
expanded inheritance, so that it is applied separately to the (already
mapped) tlist for each child table.  We can conveniently call it from
preprocess_targetlist.  Refactor associated code slightly to avoid the
need to heap_open the target relation multiple times during
preprocess_targetlist.  (The APIs remain a bit ugly, particularly around
the point of which steps scribble on parse->targetList and which don't.
But avoiding such scribbling would require a change in FDW callback APIs,
which is more pain than it's worth.)

Also fix ExecModifyTable to ensure that "tupleid" is reset to NULL when
we transition from rows providing a CTID to rows that don't.  (That's
really an independent bug, but it manifests in much the same cases.)

Add a regression test checking one manifestation of this problem, which
was that row-level triggers on a foreign child table did not work right.

Back-patch to 9.5 where the problem was introduced.

Etsuro Fujita, reviewed by Ildus Kurbangaliev and Ashutosh Bapat

Discussion: https://postgr.es/m/20170514150525.0346ba72@postgrespro.ru
2017-11-27 17:54:07 -05:00
Peter Eisentraut 2eb4a831e5 Change TRUE/FALSE to true/false
The lower case spellings are C and C++ standard and are used in most
parts of the PostgreSQL sources.  The upper case spellings are only used
in some files/modules.  So standardize on the standard spellings.

The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so
those are left as is when using those APIs.

In code comments, we use the lower-case spelling for the C concepts and
keep the upper-case spelling for the SQL concepts.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2017-11-08 11:37:28 -05:00
Robert Haas cff440d368 pg_stat_statements: Widen query IDs from 32 bits to 64 bits.
This takes advantage of the infrastructure introduced by commit
81c5e46c49 to greatly reduce the
likelihood that two different queries will end up with the same query
ID.  It's still possible, of course, but whereas before it the chances
of a collision reached 25% around 50,000 queries, it will now take
more than 3 billion queries.

Backward incompatibility: Because the type exposed at the SQL level is
int8, users may now see negative query IDs in the pg_stat_statements
view (and also, query IDs more than 4 billion, which was the old
limit).

Patch by me, reviewed by Michael Paquier and Peter Geoghegan.

Discussion: http://postgr.es/m/CA+TgmobG_Kp4cBKFmsznUAaM1GWW6hhRNiZC0KjRMOOeYnz5Yw@mail.gmail.com
2017-10-11 19:52:46 -04:00
Tom Lane c12d570fa1 Support arrays over domains.
Allowing arrays with a domain type as their element type was left un-done
in the original domain patch, but not for any very good reason.  This
omission leads to such surprising results as array_agg() not working on
a domain column, because the parser can't identify a suitable output type
for the polymorphic aggregate.

In order to fix this, first clean up the APIs of coerce_to_domain() and
some internal functions in parse_coerce.c so that we consistently pass
around a CoercionContext along with CoercionForm.  Previously, we sometimes
passed an "isExplicit" boolean flag instead, which is strictly less
information; and coerce_to_domain() didn't even get that, but instead had
to reverse-engineer isExplicit from CoercionForm.  That's contrary to the
documentation in primnodes.h that says that CoercionForm only affects
display and not semantics.  I don't think this change fixes any live bugs,
but it makes things more consistent.  The main reason for doing it though
is that now build_coercion_expression() receives ccontext, which it needs
in order to be able to recursively invoke coerce_to_target_type().

Next, reimplement ArrayCoerceExpr so that the node does not directly know
any details of what has to be done to the individual array elements while
performing the array coercion.  Instead, the per-element processing is
represented by a sub-expression whose input is a source array element and
whose output is a target array element.  This simplifies life in
parse_coerce.c, because it can build that sub-expression by a recursive
invocation of coerce_to_target_type().  The executor now handles the
per-element processing as a compiled expression instead of hard-wired code.
The main advantage of this is that we can use a single ArrayCoerceExpr to
handle as many as three successive steps per element: base type conversion,
typmod coercion, and domain constraint checking.  The old code used two
stacked ArrayCoerceExprs to handle type + typmod coercion, which was pretty
inefficient, and adding yet another array deconstruction to do domain
constraint checking seemed very unappetizing.

In the case where we just need a single, very simple coercion function,
doing this straightforwardly leads to a noticeable increase in the
per-array-element runtime cost.  Hence, add an additional shortcut evalfunc
in execExprInterp.c that skips unnecessary overhead for that specific form
of expression.  The runtime speed of simple cases is within 1% or so of
where it was before, while cases that previously required two levels of
array processing are significantly faster.

Finally, create an implicit array type for every domain type, as we do for
base types, enums, etc.  Everything except the array-coercion case seems
to just work without further effort.

Tom Lane, reviewed by Andrew Dunstan

Discussion: https://postgr.es/m/9852.1499791473@sss.pgh.pa.us
2017-09-30 13:40:56 -04:00
Andres Freund 2cd7084524 Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n).
This is a mechanical change in preparation for a later commit that
will change the layout of TupleDesc.  Introducing a macro to abstract
the details of where attributes are stored will allow us to change
that in separate step and revise it in future.

Author: Thomas Munro, editorialized by Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
2017-08-20 11:19:07 -07:00
Tom Lane b1cb32fb62 Fix multiple assignments to a column of a domain type.
We allow INSERT and UPDATE commands to assign to the same column more than
once, as long as the assignments are to subfields or elements rather than
the whole column.  However, this failed when the target column was a domain
over array rather than plain array.  Fix by teaching process_matched_tle()
to look through CoerceToDomain nodes, and add relevant test cases.

Also add a group of test cases exercising domains over array of composite.
It's doubtless accidental that CREATE DOMAIN allows this case while not
allowing straight domain over composite; but it does, so we'd better make
sure we don't break it.  (I could not find any documentation mentioning
either side of that, so no doc changes.)

It's been like this for a long time, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/4206.1499798337@sss.pgh.pa.us
2017-07-11 16:48:59 -04:00
Peter Eisentraut ef74e03ef6 Fix UPDATE of GENERATED ALWAYS identity columns
The bug would previously prevent the update of any column in a table
with identity columns, rather than just the actual identity column.

Reported-by: zam6ak@gmail.com
Bug: #14718
2017-06-30 23:44:17 -04:00
Tom Lane 382ceffdf7 Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.

By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis.  However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent.  That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.

This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:35:54 -04:00
Tom Lane c7b8998ebb Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.

Commit e3860ffa4d wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code.  The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there.  BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs.  So the
net result is that in about half the cases, such comments are placed
one tab stop left of before.  This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.

Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:19:25 -04:00
Tom Lane e3860ffa4d Initial pgindent run with pg_bsd_indent version 2.0.
The new indent version includes numerous fixes thanks to Piotr Stefaniak.
The main changes visible in this commit are:

* Nicer formatting of function-pointer declarations.
* No longer unexpectedly removes spaces in expressions using casts,
  sizeof, or offsetof.
* No longer wants to add a space in "struct structname *varname", as
  well as some similar cases for const- or volatile-qualified pointers.
* Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely.
* Fixes bug where comments following declarations were sometimes placed
  with no space separating them from the code.
* Fixes some odd decisions for comments following case labels.
* Fixes some cases where comments following code were indented to less
  than the expected column 33.

On the less good side, it now tends to put more whitespace around typedef
names that are not listed in typedefs.list.  This might encourage us to
put more effort into typedef name collection; it's not really a bug in
indent itself.

There are more changes coming after this round, having to do with comment
indentation and alignment of lines appearing within parentheses.  I wanted
to limit the size of the diffs to something that could be reviewed without
one's eyes completely glazing over, so it seemed better to split up the
changes as much as practical.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 14:39:04 -04:00
Tom Lane 651902deb1 Re-run pgindent.
This is just to have a clean base state for testing of Piotr Stefaniak's
latest version of FreeBSD indent.  I fixed up a couple of places where
pgindent would have changed format not-nicely.  perltidy not included.

Discussion: https://postgr.es/m/VI1PR03MB119959F4B65F000CA7CD9F6BF2CC0@VI1PR03MB1199.eurprd03.prod.outlook.com
2017-06-13 13:05:59 -04:00
Dean Rasheed b6263cd851 Teach relation_is_updatable() about partitioned tables.
Table partitioning, introduced in commit f0e44751d7, added a new
relkind - RELKIND_PARTITIONED_TABLE. Update relation_is_updatable() to
handle it. Specifically, partitioned tables and simple views built on
top of them are updatable.

This affects the SQL-callable functions pg_relation_is_updatable() and
pg_column_is_updatable(), and the views information_schema.views and
information_schema.columns.

Dean Rasheed, reviewed by Ashutosh Bapat.

Discussion: https://postgr.es/m/CAEZATCXnbiFkMXgF4Ez1pmM2c-tS1z33bSq7OGbw7QQhHov%2B6Q%40mail.gmail.com
2017-06-13 17:30:36 +01:00
Joe Conway 4f7a95be2c Apply RLS policies to partitioned tables.
The new partitioned table capability added a new relkind, namely
RELKIND_PARTITIONED_TABLE. Update fireRIRrules() to apply RLS
policies on RELKIND_PARTITIONED_TABLE as it does RELKIND_RELATION.

In addition, add RLS regression test coverage for partitioned tables.

Issue raised by Fakhroutdinov Evgenievich and patch by Mike Palmiotto.
Regression test editorializing by me.

Discussion: https://postgr.es/m/flat/20170601065959.1486.69906@wrigleys.postgresql.org
2017-06-11 08:51:18 -07:00
Bruce Momjian a6fd7b7a5f Post-PG 10 beta1 pgindent run
perltidy run not included.
2017-05-17 16:31:56 -04:00
Tom Lane 8f0530f580 Improve castNode notation by introducing list-extraction-specific variants.
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type.  For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))".  Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.

As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.

Patch by me, after an idea of Andrew Gierth's.

Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10 13:51:53 -04:00
Peter Eisentraut 3217327053 Identity columns
This is the SQL standard-conforming variant of PostgreSQL's serial
columns.  It fixes a few usability issues that serial columns have:

- CREATE TABLE / LIKE copies default but refers to same sequence
- cannot add/drop serialness with ALTER TABLE
- dropping default does not drop sequence
- need to grant separate privileges to sequence
- other slight weirdnesses because serial is some kind of special macro

Reviewed-by: Vitaly Burovoy <vitaly.burovoy@gmail.com>
2017-04-06 08:41:37 -04:00
Peter Eisentraut 4cb824699e Cast result of copyObject() to correct type
copyObject() is declared to return void *, which allows easily assigning
the result independent of the input, but it loses all type checking.

If the compiler supports typeof or something similar, cast the result to
the input type.  This creates a greater amount of type safety.  In some
cases, where the result is assigned to a generic type such as Node * or
Expr *, new casts are now necessary, but in general casts are now
unnecessary in the normal case and indicate that something unusual is
happening.

Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
2017-03-28 21:59:23 -04:00
Alvaro Herrera fcec6caafa Support XMLTABLE query expression
XMLTABLE is defined by the SQL/XML standard as a feature that allows
turning XML-formatted data into relational form, so that it can be used
as a <table primary> in the FROM clause of a query.

This new construct provides significant simplicity and performance
benefit for XML data processing; what in a client-side custom
implementation was reported to take 20 minutes can be executed in 400ms
using XMLTABLE.  (The same functionality was said to take 10 seconds
using nested PostgreSQL XPath function calls, and 5 seconds using
XMLReader under PL/Python).

The implemented syntax deviates slightly from what the standard
requires.  First, the standard indicates that the PASSING clause is
optional and that multiple XML input documents may be given to it; we
make it mandatory and accept a single document only.  Second, we don't
currently support a default namespace to be specified.

This implementation relies on a new executor node based on a hardcoded
method table.  (Because the grammar is fixed, there is no extensibility
in the current approach; further constructs can be implemented on top of
this such as JSON_TABLE, but they require changes to core code.)

Author: Pavel Stehule, Álvaro Herrera
Extensively reviewed by: Craig Ringer
Discussion: https://postgr.es/m/CAFj8pRAgfzMD-LoSmnMGybD0WsEznLHWap8DO79+-GTRAPR4qA@mail.gmail.com
2017-03-08 12:40:26 -03:00
Peter Eisentraut 38d103763d Make more use of castNode() 2017-02-21 11:59:09 -05:00
Robert Haas 587cda35ca Fix things so that updatable views work with partitioned tables.
Previously, ExecInitModifyTable was missing handling for WITH CHECK
OPTION, and view_query_is_auto_updatable was missing handling for
RELKIND_PARTITIONED_TABLE.

Amit Langote, reviewed by me.
2017-01-24 15:46:50 -05:00
Bruce Momjian 1d25779284 Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
Robert Haas f0e44751d7 Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own.  The children are called
partitions and contain all of the actual data.  Each partition has an
implicit partitioning constraint.  Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed.  Partitions
can't have extra columns and may not allow nulls unless the parent
does.  Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.

Currently, tables can be range-partitioned or list-partitioned.  List
partitioning is limited to a single column, but range partitioning can
involve multiple columns.  A partitioning "column" can be an
expression.

Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations.  The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.

Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others.  Minor revisions by me.
2016-12-07 13:17:55 -05:00
Tom Lane 24aef33804 Cleanup of rewriter and planner handling of Query.hasRowSecurity flag.
Be sure to pull up the subquery's hasRowSecurity flag when flattening a
subquery in pull_up_simple_subquery().  This isn't a bug today because
we don't look at the hasRowSecurity flag during planning, but it could
easily be a bug tomorrow.

Likewise, make rewriteRuleAction() pull up the hasRowSecurity flag when
absorbing RTEs from a rule action.  This isn't a bug either, for the
opposite reason: the flag should never be set yet.  But again, it seems
like good future proofing.

Add a comment explaining why rewriteTargetView() should *not* set
hasRowSecurity when adding stuff to securityQuals.

Improve some nearby comments about securityQuals processing, and document
that field more completely in parsenodes.h.

Patch by me, analysis by Dean Rasheed.

Discussion: <CAEZATCXZ8tb2DV6f=bkhsMV6u_gRcZ0CZBw2J-qU84RxSukZog@mail.gmail.com>
2016-11-10 16:16:33 -05:00
Tom Lane a4c35ea1c2 Improve parser's and planner's handling of set-returning functions.
Teach the parser to reject misplaced set-returning functions during parse
analysis using p_expr_kind, in much the same way as we do for aggregates
and window functions (cf commit eaccfded9).  While this isn't complete
(it misses nesting-based restrictions), it's much better than the previous
error reporting for such cases, and it allows elimination of assorted
ad-hoc expression_returns_set() error checks.  We could add nesting checks
later if it seems important to catch all cases at parse time.

There is one case the parser will now throw error for although previous
versions allowed it, which is SRFs in the tlist of an UPDATE.  That never
behaved sensibly (since it's ill-defined which generated row should be
used to perform the update) and it's hard to see why it should not be
treated as an error.  It's a release-note-worthy change though.

Also, add a new Query field hasTargetSRFs reporting whether there are
any SRFs in the targetlist (including GROUP BY/ORDER BY expressions).
The parser can now set that basically for free during parse analysis,
and we can use it in a number of places to avoid expression_returns_set
searches.  (There will be more such checks soon.)  In some places, this
allows decontorting the logic since it's no longer expensive to check for
SRFs in the tlist --- so I made the checks parallel to the handling of
hasAggs/hasWindowFuncs wherever it seemed appropriate.

catversion bump because adding a Query field changes stored rules.

Andres Freund and Tom Lane

Discussion: <24639.1473782855@sss.pgh.pa.us>
2016-09-13 13:54:24 -04:00
Bruce Momjian ee94300446 Update copyright for 2016
Backpatch certain files through 9.1
2016-01-02 13:33:40 -05:00
Tom Lane fd19525756 Put back one copyObject() in rewriteTargetView().
Commit 6f8cb1e234 tried to centralize rewriteTargetView's copying
of a target view's Query struct.  However, it ignored the fact that the
jointree->quals field was used twice.  This only accidentally failed to
fail immediately because the same ChangeVarNodes mutation is applied in
both cases, so that we end up with logically identical expression trees
for both uses (and, as the code stands, the second ChangeVarNodes call
actually does nothing).  However, we end up linking *physically*
identical expression trees into both an RTE's securityQuals list and
the WithCheckOption list.  That's pretty dangerous, mainly because
prepsecurity.c is utterly cavalier about further munging such structures
without copying them first.

There may be no live bug in HEAD as a consequence of the fact that we apply
preprocess_expression in between here and prepsecurity.c, and that will
make a copy of the tree anyway.  Or it may just be that the regression
tests happen to not trip over it.  (I noticed this only because things
fell over pretty badly when I tried to relocate the planner's call of
expand_security_quals to before expression preprocessing.)  In any case
it's very fragile because if anyone tried to make the securityQuals and
WithCheckOption trees diverge before we reach preprocess_expression, it
would not work.  The fact that the current code will preprocess
securityQuals and WithCheckOptions lists at completely different times in
different query levels does nothing to increase my trust that that can't
happen.

In view of the fact that 9.5.0 is almost upon us and the aforesaid commit
has seen exactly zero field testing, the prudent course is to make an extra
copy of the quals so that the behavior is not different from what has been
in the field during beta.
2015-12-29 16:45:47 -05:00
Stephen Frost 6f8cb1e234 Make viewquery a copy in rewriteTargetView()
Rather than expect the Query returned by get_view_query() to be
read-only and then copy bits and pieces of it out, simply copy the
entire structure when we get it.  This addresses an issue where
AcquireRewriteLocks, which is called by acquireLocksOnSubLinks(),
scribbles on the parsetree passed in, which was actually an entry
in relcache, leading to segfaults with certain view definitions.
This also future-proofs us a bit for anyone adding more code to this
path.

The acquireLocksOnSubLinks() was added in commit c3e0ddd40.

Back-patch to 9.3 as that commit was.
2015-12-21 10:34:14 -05:00
Stephen Frost 22eaf35c1d RLS refactoring
This refactors rewrite/rowsecurity.c to simplify the handling of the
default deny case (reducing the number of places where we check for and
add the default deny policy from three to one) by splitting up the
retrival of the policies from the application of them.

This also allowed us to do away with the policy_id field.  A policy_name
field was added for WithCheckOption policies and is used in error
reporting, when available.

Patch by Dean Rasheed, with various mostly cosmetic changes by me.

Back-patch to 9.5 where RLS was introduced to avoid unnecessary
differences, since we're still in alpha, per discussion with Robert.
2015-09-15 15:49:31 -04:00
Stephen Frost c3e0ddd403 Lock all relations referred to in updatable views
Even views considered "simple" enough to be automatically updatable may
have mulitple relations involved (eg: in a where clause).  We need to
make sure and lock those relations when rewriting the query.

Back-patch to 9.3 where updatable views were added.

Pointed out by Andres, patch thanks to Dean Rasheed.
2015-09-08 17:02:49 -04:00
Stephen Frost 2ba9e2b778 Ensure locks are acquired on RLS-added relations
During fireRIRrules(), get_row_security_policies can add to
securityQuals and withCheckOptions.  Make sure to lock any relations
added at that point and before firing RIR rules on those expressions.

Back-patch to 9.5 where RLS was added.
2015-08-28 11:39:37 -04:00
Andres Freund c0f0d8097b Clarify what some historic terms in rewriteHandler.c mean.
Discussion: 20150827131352.GF2435@awork2.anarazel.de
2015-08-28 16:27:58 +02:00
Tom Lane dd7a8f66ed Redesign tablesample method API, and do extensive code review.
The original implementation of TABLESAMPLE modeled the tablesample method
API on index access methods, which wasn't a good choice because, without
specialized DDL commands, there's no way to build an extension that can
implement a TSM.  (Raw inserts into system catalogs are not an acceptable
thing to do, because we can't undo them during DROP EXTENSION, nor will
pg_upgrade behave sanely.)  Instead adopt an API more like procedural
language handlers or foreign data wrappers, wherein the only SQL-level
support object needed is a single handler function identified by having
a special return type.  This lets us get rid of the supporting catalog
altogether, so that no custom DDL support is needed for the feature.

Adjust the API so that it can support non-constant tablesample arguments
(the original coding assumed we could evaluate the argument expressions at
ExecInitSampleScan time, which is undesirable even if it weren't outright
unsafe), and discourage sampling methods from looking at invisible tuples.
Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
within and across queries, as required by the SQL standard, and deal more
honestly with methods that can't support that requirement.

Make a full code-review pass over the tablesample additions, and fix
assorted bugs, omissions, infelicities, and cosmetic issues (such as
failure to put the added code stanzas in a consistent ordering).
Improve EXPLAIN's output of tablesample plans, too.

Back-patch to 9.5 so that we don't have to support the original API
in production.
2015-07-25 14:39:00 -04:00
Bruce Momjian 807b9e0dff pgindent run for 9.5 2015-05-23 21:35:49 -04:00
Andres Freund f3d3118532 Support GROUPING SETS, CUBE and ROLLUP.
This SQL standard functionality allows to aggregate data by different
GROUP BY clauses at once. Each grouping set returns rows with columns
grouped by in other sets set to NULL.

This could previously be achieved by doing each grouping as a separate
query, conjoined by UNION ALLs. Besides being considerably more concise,
grouping sets will in many cases be faster, requiring only one scan over
the underlying data.

The current implementation of grouping sets only supports using sorting
for input. Individual sets that share a sort order are computed in one
pass. If there are sets that don't share a sort order, additional sort &
aggregation steps are performed. These additional passes are sourced by
the previous sort step; thus avoiding repeated scans of the source data.

The code is structured in a way that adding support for purely using
hash aggregation or a mix of hashing and sorting is possible. Sorting
was chosen to be supported first, as it is the most generic method of
implementation.

Instead of, as in an earlier versions of the patch, representing the
chain of sort and aggregation steps as full blown planner and executor
nodes, all but the first sort are performed inside the aggregation node
itself. This avoids the need to do some unusual gymnastics to handle
having to return aggregated and non-aggregated tuples from underlying
nodes, as well as having to shut down underlying nodes early to limit
memory usage.  The optimizer still builds Sort/Agg node to describe each
phase, but they're not part of the plan tree, but instead additional
data for the aggregation node. They're a convenient and preexisting way
to describe aggregation and sorting.  The first (and possibly only) sort
step is still performed as a separate execution step. That retains
similarity with existing group by plans, makes rescans fairly simple,
avoids very deep plans (leading to slow explains) and easily allows to
avoid the sorting step if the underlying data is sorted by other means.

A somewhat ugly side of this patch is having to deal with a grammar
ambiguity between the new CUBE keyword and the cube extension/functions
named cube (and rollup). To avoid breaking existing deployments of the
cube extension it has not been renamed, neither has cube been made a
reserved keyword. Instead precedence hacking is used to make GROUP BY
cube(..) refer to the CUBE grouping sets feature, and not the function
cube(). To actually group by a function cube(), unlikely as that might
be, the function name has to be quoted.

Needs a catversion bump because stored rules may change.

Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas
    Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:46:31 +02:00
Simon Riggs f6d208d6e5 TABLESAMPLE, SQL Standard and extensible
Add a TABLESAMPLE clause to SELECT statements that allows
user to specify random BERNOULLI sampling or block level
SYSTEM sampling. Implementation allows for extensible
sampling functions to be written, using a standard API.
Basic version follows SQLStandard exactly. Usable
concrete use cases for the sampling API follow in later
commits.

Petr Jelinek

Reviewed by Michael Paquier and Simon Riggs
2015-05-15 14:37:10 -04:00
Andres Freund 168d5805e4 Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows to specify an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using a
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint.  DO NOTHING avoids the
constraint violation, without touching the pre-existing row.  DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed.  The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.

This feature is often referred to as upsert.

This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert.  If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made.  If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.

To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.

Bumps catversion as stored rules change.

Author: Peter Geoghegan, with significant contributions from Heikki
    Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
    Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:43:10 +02:00
Andres Freund 2c8f4836db Represent columns requiring insert and update privileges indentently.
Previously, relation range table entries used a single Bitmapset field
representing which columns required either UPDATE or INSERT privileges,
despite the fact that INSERT and UPDATE privileges are separately
cataloged, and may be independently held.  As statements so far required
either insert or update privileges but never both, that was
sufficient. The required permission could be inferred from the top level
statement run.

The upcoming INSERT ... ON CONFLICT UPDATE feature needs to
independently check for both privileges in one statement though, so that
is not sufficient anymore.

Bumps catversion as stored rules change.

Author: Peter Geoghegan
Reviewed-By: Andres Freund
2015-05-08 00:20:46 +02:00
Stephen Frost e89bd02f58 Perform RLS WITH CHECK before constraints, etc
The RLS capability is built on top of the WITH CHECK OPTION
system which was added for auto-updatable views, however, unlike
WCOs on views (which are mandated by the SQL spec to not fire until
after all other constraints and checks are done), it makes much more
sense for RLS checks to happen earlier than constraint and uniqueness
checks.

This patch reworks the structure which holds the WCOs a bit to be
explicitly either VIEW or RLS checks and the RLS-related checks are
done prior to the constraint and uniqueness checks.  This also allows
better error reporting as we are now reporting when a violation is due
to a WITH CHECK OPTION and when it's due to an RLS policy violation,
which was independently noted by Craig Ringer as being confusing.

The documentation is also updated to include a paragraph about when RLS
WITH CHECK handling is performed, as there have been a number of
questions regarding that and the documentation was previously silent on
the matter.

Author: Dean Rasheed, with some kabitzing and comment changes by me.
2015-04-24 20:34:26 -04:00
Stephen Frost 0bf22e0c8b RLS fixes, new hooks, and new test module
In prepend_row_security_policies(), defaultDeny was always true, so if
there were any hook policies, the RLS policies on the table would just
get discarded.  Fixed to start off with defaultDeny as false and then
properly set later if we detect that only the default deny policy exists
for the internal policies.

The infinite recursion detection in fireRIRrules() didn't properly
manage the activeRIRs list in the case of WCOs, so it would incorrectly
report infinite recusion if the same relation with RLS appeared more
than once in the rtable, for example "UPDATE t ... FROM t ...".

Further, the RLS expansion code in fireRIRrules() was handling RLS in
the main loop through the rtable, which lead to RTEs being visited twice
if they contained sublink subqueries, which
prepend_row_security_policies() attempted to handle by exiting early if
the RTE already had securityQuals.  That doesn't work, however, since
if the query involved a security barrier view on top of a table with
RLS, the RTE would already have securityQuals (from the view) by the
time fireRIRrules() was invoked, and so the table's RLS policies would
be ignored.  This is fixed in fireRIRrules() by handling RLS in a
separate loop at the end, after dealing with any other sublink
subqueries, thus ensuring that each RTE is only visited once for RLS
expansion.

The inheritance planner code didn't correctly handle non-target
relations with RLS, which would get turned into subqueries during
planning. Thus an update of the form "UPDATE t1 ... FROM t2 ..." where
t1 has inheritance and t2 has RLS quals would fail.  Fix by making sure
to copy in and update the securityQuals when they exist for non-target
relations.

process_policies() was adding WCOs to non-target relations, which is
unnecessary, and could lead to a lot of wasted time in the rewriter and
the planner. Fix by only adding WCO policies when working on the result
relation.  Also in process_policies, we should be copying the USING
policies to the WITH CHECK policies on a per-policy basis, fix by moving
the copying up into the per-policy loop.

Lastly, as noted by Dean, we were simply adding policies returned by the
hook provided to the list of quals being AND'd, meaning that they would
actually restrict records returned and there was no option to have
internal policies and hook-based policies work together permissively (as
all internal policies currently work).  Instead, explicitly add support
for both permissive and restrictive policies by having a hook for each
and combining the results appropriately.  To ensure this is all done
correctly, add a new test module (test_rls_hooks) to test the various
combinations of internal, permissive, and restrictive hook policies.

Largely from Dean Rasheed (thanks!):

CAEZATCVmFUfUOwwhnBTcgi6AquyjQ0-1fyKd0T3xBWJvn+xsFA@mail.gmail.com

Author: Dean Rasheed, though I added the new hooks and test module.
2015-04-22 12:01:06 -04:00
Tom Lane e983c4d1aa Rationalize the APIs of array element/slice access functions.
The four functions array_ref, array_set, array_get_slice, array_set_slice
have traditionally declared their array inputs and results as being of type
"ArrayType *".  This is a lie, and has been since Berkeley days, because
they actually also support "fixed-length array" types such as "name" and
"point"; not to mention that the inputs could be toasted.  These values
should be declared Datum instead to avoid confusion.  The current coding
already risks possible misoptimization by compilers, and it'll get worse
when "expanded" array representations become a valid alternative.

However, there's a fair amount of code using array_ref and array_set with
arrays that *are* known to be ArrayType structures, and there might be more
such places in third-party code.  Rather than cluttering those call sites
with PointerGetDatum/DatumGetArrayTypeP cruft, what I did was to rename the
existing functions to array_get_element/array_set_element, fix their
signatures, then reincarnate array_ref/array_set as backwards compatibility
wrappers.

array_get_slice/array_set_slice have no such constituency in the core code,
and probably not in third-party code either, so I just changed their APIs.
2015-02-16 12:23:58 -05:00
Bruce Momjian 4baaf863ec Update copyright for 2015
Backpatch certain files through 9.0
2015-01-06 11:43:47 -05:00
Tom Lane f4e031c662 Add bms_next_member(), and use it where appropriate.
This patch adds a way of iterating through the members of a bitmapset
nondestructively, unlike the old way with bms_first_member().  While
bms_next_member() is very slightly slower than bms_first_member()
(at least for typical-size bitmapsets), eliminating the need to palloc
and pfree a temporary copy of the target bitmapset is a significant win.
So this method should be preferred in all cases where a temporary copy
would be necessary.

Tom Lane, with suggestions from Dean Rasheed and David Rowley
2014-11-28 13:37:25 -05:00