Commit Graph

444 Commits

Author SHA1 Message Date
Tom Lane 8004953b5a Speed up ruleutils' name de-duplication code, and fix overlength-name case.
Since commit 11e131854f, ruleutils.c has
attempted to ensure that each RTE in a query or plan tree has a unique
alias name.  However, the code that was added for this could be quite slow,
even as bad as O(N^3) if N identical RTE names must be replaced, as noted
by Jeff Janes.  Improve matters by building a transient hash table within
set_rtable_names.  The hash table in itself reduces the cost of detecting a
duplicate from O(N) to O(1), and we can save another factor of N by storing
the number of de-duplicated names already created for each entry, so that
we don't have to re-try names already created.  This way is probably a bit
slower overall for small range tables, but almost by definition, such cases
should not be a performance problem.
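
For illustration, a query in which the same table appears twice; the
de-duplicated alias shows up in EXPLAIN and view output roughly like this
(table name hypothetical, output sketched rather than verbatim):

    EXPLAIN SELECT * FROM t WHERE x > (SELECT avg(x) FROM t);
    -- Seq Scan on t
    --   Filter: ((x)::numeric > $0)
    --   InitPlan 1 (returns $0)
    --     ->  Aggregate
    --           ->  Seq Scan on t t_1   <- duplicate RTE name renamed to "t_1"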

In principle the same problem applies to the column-name-de-duplication
code; but in practice that seems to be less of a problem, first because
N is limited since we don't support extremely wide tables, and second
because duplicate column names within an RTE are fairly rare, so that in
practice the cost is more like O(N^2) not O(N^3).  It would be very much
messier to fix the column-name code, so for now I've left that alone.

An independent problem in the same area was that the de-duplication code
paid no attention to the identifier length limit, and would happily produce
identifiers that were longer than NAMEDATALEN and wouldn't be unique after
truncation to NAMEDATALEN.  This could result in dump/reload failures, or
perhaps even views that silently behaved differently than before.  We can
fix that by shortening the base name as needed.  Fix it for both the
relation and column name cases.

In passing, check for interrupts in set_rtable_names, just in case it's
still slow enough to be an issue.

Back-patch to 9.3 where this code was introduced.
2015-11-16 13:45:17 -05:00
Tom Lane 7745bc352a Fix ruleutils.c's dumping of whole-row Vars in ROW() and VALUES() contexts.
Normally ruleutils prints a whole-row Var as "foo.*".  We already knew that
that doesn't work at top level of a SELECT list, because the parser would
treat the "*" as a directive to expand the reference into separate columns,
not a whole-row Var.  However, Joshua Yanovski points out in bug #13776
that the same thing happens at top level of a ROW() construct; and some
nosing around in the parser shows that the same is true in VALUES().
Hence, apply the same workaround already devised for the SELECT-list case,
namely to add a forced cast to the appropriate rowtype in these cases.
(The alternative of just printing "foo" was rejected because it is
difficult to avoid ambiguity against plain columns named "foo".)
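
For illustration (table and view names hypothetical):

    CREATE TABLE foo (a int, b int);
    CREATE VIEW v AS SELECT ROW(foo) AS r FROM foo;
    -- The ROW() element is a whole-row Var.  Printed as "ROW(foo.*)" it would
    -- re-parse as ROW(a, b); the deparser now emits a cast to the rowtype,
    -- along the lines of ROW(foo.*::foo), so the definition round-trips.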

Back-patch to all supported branches.
2015-11-15 14:41:09 -05:00
Tom Lane dd7a8f66ed Redesign tablesample method API, and do extensive code review.
The original implementation of TABLESAMPLE modeled the tablesample method
API on index access methods, which wasn't a good choice because, without
specialized DDL commands, there's no way to build an extension that can
implement a TSM.  (Raw inserts into system catalogs are not an acceptable
thing to do, because we can't undo them during DROP EXTENSION, nor will
pg_upgrade behave sanely.)  Instead adopt an API more like procedural
language handlers or foreign data wrappers, wherein the only SQL-level
support object needed is a single handler function identified by having
a special return type.  This lets us get rid of the supporting catalog
altogether, so that no custom DDL support is needed for the feature.

Adjust the API so that it can support non-constant tablesample arguments
(the original coding assumed we could evaluate the argument expressions at
ExecInitSampleScan time, which is undesirable even if it weren't outright
unsafe), and discourage sampling methods from looking at invisible tuples.
Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
within and across queries, as required by the SQL standard, and deal more
honestly with methods that can't support that requirement.
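
For example (table name hypothetical), the same seed should now yield the
same sample on unchanged data, both when the query is re-executed and when
the sample is re-read within one query:

    SELECT count(*) FROM big_tbl TABLESAMPLE BERNOULLI (5) REPEATABLE (42);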

Make a full code-review pass over the tablesample additions, and fix
assorted bugs, omissions, infelicities, and cosmetic issues (such as
failure to put the added code stanzas in a consistent ordering).
Improve EXPLAIN's output of tablesample plans, too.

Back-patch to 9.5 so that we don't have to support the original API
in production.
2015-07-25 14:39:00 -04:00
Heikki Linnakangas f92d6a540a Use appendStringInfoString/Char et al where appropriate.
Patch by David Rowley. Backpatch to 9.5, as some of the calls were new in
9.5, and keeping the code in sync with master makes future backpatching
easier.
2015-07-02 12:36:03 +03:00
Tom Lane 0a52d378b0 Avoid passing NULL to memcmp() in lookups of zero-argument functions.
A few places assumed they could pass NULL for the argtypes array when
looking up functions known to have zero arguments.  At first glance
it seems that this should be safe enough, since memcmp() is surely not
allowed to fetch any bytes if its count argument is zero.  However,
close reading of the C standard says that such calls have undefined
behavior, so we'd probably best avoid it.

Since the number of places doing this is quite small, and some other
places looking up zero-argument functions were already passing dummy
arrays, let's standardize on the latter solution rather than hacking
the function lookup code to avoid calling memcmp() in these cases.
I also added Asserts to catch any future violations of the new rule.

Given the utter lack of any evidence that this actually causes any
problems in the field, I don't feel a need to back-patch this change.

Per report from Piotr Stefaniak, though this is not his patch.
2015-06-27 17:47:39 -04:00
Tom Lane f46edf479e Fix pg_get_functiondef() to print a function's LEAKPROOF property.
Seems to have been an oversight in the original leakproofness patch.
Per report and patch from Jeevan Chalke.

In passing, prettify some awkward leakproof-related code in AlterFunction.
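
A minimal illustration (the function is hypothetical):

    CREATE FUNCTION f(int) RETURNS int LANGUAGE sql LEAKPROOF AS 'SELECT $1';
    SELECT pg_get_functiondef('f(int)'::regprocedure);
    -- the printed CREATE OR REPLACE FUNCTION text now includes LEAKPROOF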
2015-05-28 11:24:37 -04:00
Tom Lane 2aa0476dc3 Manual cleanup of pgindent results.
Fix some places where pgindent did silly stuff, often because project
style wasn't followed to begin with.  (I've not touched the atomics
headers, though.)
2015-05-24 15:04:10 -04:00
Bruce Momjian 807b9e0dff pgindent run for 9.5 2015-05-23 21:35:49 -04:00
Andres Freund 284bef2977 Fix yet another bug in ON CONFLICT rule deparsing.
Expand testing of rule deparsing a good bit; it's evidently needed.

Author: Peter Geoghegan, Andres Freund
Discussion: CAM3SWZQmXxZhQC32QVEOTYfNXJBJ_Q2SDENL7BV14Cq-zL0FLg@mail.gmail.com
2015-05-23 02:16:24 +02:00
Andres Freund 9bc77c4519 Various fixes around ON CONFLICT for rule deparsing.
Neither the deparsing of the new alias for INSERT's target table, nor
that of the inference clause, was supported.  Also fix a typo in an
error message.
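
For illustration, a rule whose action exercises both constructs (names
hypothetical; assume v has columns k and n, and t has a unique constraint
on k):

    CREATE RULE r AS ON INSERT TO v DO INSTEAD
        INSERT INTO t AS tgt (k, n) VALUES (NEW.k, NEW.n)
        ON CONFLICT (k) DO UPDATE SET n = excluded.n;
    -- pg_get_ruledef() should now reproduce both the "AS tgt" alias and
    -- the "(k)" inference clause.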

Add regression tests to test those code paths.

Author: Peter Geoghegan
2015-05-19 23:18:57 +02:00
Andres Freund f3d3118532 Support GROUPING SETS, CUBE and ROLLUP.
This SQL-standard functionality allows data to be aggregated by several
different GROUP BY clauses at once.  Each grouping set returns rows in
which the columns grouped by only in other sets are set to NULL.

This could previously be achieved by doing each grouping as a separate
query, conjoined by UNION ALLs. Besides being considerably more concise,
grouping sets will in many cases be faster, requiring only one scan over
the underlying data.
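
For example (table hypothetical):

    SELECT brand, size, sum(sales)
      FROM items_sold
     GROUP BY GROUPING SETS ((brand), (size), ());
    -- Equivalent to the UNION ALL of three separate GROUP BY queries, but
    -- computed in one pass; in each output row the columns not grouped by
    -- in that set are NULL.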

The current implementation of grouping sets only supports using sorting
for input. Individual sets that share a sort order are computed in one
pass. If there are sets that don't share a sort order, additional sort &
aggregation steps are performed. These additional passes are sourced by
the previous sort step; thus avoiding repeated scans of the source data.

The code is structured in a way that adding support for purely using
hash aggregation or a mix of hashing and sorting is possible. Sorting
was chosen to be supported first, as it is the most generic method of
implementation.

Instead of, as in earlier versions of the patch, representing the
chain of sort and aggregation steps as full blown planner and executor
nodes, all but the first sort are performed inside the aggregation node
itself. This avoids the need to do some unusual gymnastics to handle
having to return aggregated and non-aggregated tuples from underlying
nodes, as well as having to shut down underlying nodes early to limit
memory usage.  The optimizer still builds Sort/Agg nodes to describe each
phase, but they're not part of the plan tree; instead they are additional
data for the aggregation node.  They're a convenient and preexisting way
to describe aggregation and sorting.  The first (and possibly only) sort
step is still performed as a separate execution step. That retains
similarity with existing group by plans, makes rescans fairly simple,
avoids very deep plans (leading to slow explains) and easily allows to
avoid the sorting step if the underlying data is sorted by other means.

A somewhat ugly side of this patch is having to deal with a grammar
ambiguity between the new CUBE keyword and the cube extension/functions
named cube (and rollup). To avoid breaking existing deployments of the
cube extension, it has not been renamed, nor has cube been made a
reserved keyword. Instead, precedence hacking is used to make GROUP BY
cube(..) refer to the CUBE grouping sets feature, and not the function
cube(). To actually group by a function cube(), unlikely as that might
be, the function name has to be quoted.
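
That is (assuming the cube extension is installed and a, b are of a
suitable type):

    GROUP BY CUBE (a, b)        -- the grouping-sets construct
    GROUP BY "cube"(a, b)       -- a call to the extension's cube() function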

Needs a catversion bump because stored rules may change.

Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas
    Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:46:31 +02:00
Simon Riggs f6d208d6e5 TABLESAMPLE, SQL Standard and extensible
Add a TABLESAMPLE clause to SELECT statements that allows the
user to specify random BERNOULLI sampling or block-level
SYSTEM sampling. The implementation allows extensible
sampling functions to be written, using a standard API.
The basic version follows the SQL standard exactly. Usable
concrete use cases for the sampling API follow in later
commits.
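
For example (table hypothetical):

    SELECT * FROM orders TABLESAMPLE SYSTEM (1);       -- ~1% of the blocks
    SELECT * FROM orders TABLESAMPLE BERNOULLI (10)    -- ~10% of the rows,
        REPEATABLE (200);                              --   with a fixed seed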

Petr Jelinek

Reviewed by Michael Paquier and Simon Riggs
2015-05-15 14:37:10 -04:00
Peter Eisentraut d02f16470f Replace some appendStringInfo* calls with more appropriate variants
Author: David Rowley <dgrowleyml@gmail.com>
2015-05-11 20:38:55 -04:00
Tom Lane 1a8a4e5cde Code review for foreign/custom join pushdown patch.
Commit e7cb7ee145 included some design
decisions that seem pretty questionable to me, and there was quite a lot
of stuff not to like about the documentation and comments.  Clean up
as follows:

* Consider foreign joins only between foreign tables on the same server,
rather than between any two foreign tables with the same underlying FDW
handler function.  In most if not all cases, the FDW would simply have had
to apply the same-server restriction itself (far more expensively, both for
lack of caching and because it would be repeated for each combination of
input sub-joins), or else risk nasty bugs.  Anyone who's really intent on
doing something outside this restriction can always use the
set_join_pathlist_hook.

* Rename fdw_ps_tlist/custom_ps_tlist to fdw_scan_tlist/custom_scan_tlist
to better reflect what they're for, and allow these custom scan tlists
to be used even for base relations.

* Change make_foreignscan() API to include passing the fdw_scan_tlist
value, since the FDW is required to set that.  Backwards compatibility
doesn't seem like an adequate reason to expect FDWs to set it in some
ad-hoc extra step, and anyway existing FDWs can just pass NIL.

* Change the API of path-generating subroutines of add_paths_to_joinrel,
and in particular that of GetForeignJoinPaths and set_join_pathlist_hook,
so that various less-used parameters are passed in a struct rather than
as separate parameter-list entries.  The objective here is to reduce the
probability that future additions to those parameter lists will result in
source-level API breaks for users of these hooks.  It's possible that this
is even a small win for the core code, since most CPU architectures can't
pass more than half a dozen parameters efficiently anyway.  I kept root,
joinrel, outerrel, innerrel, and jointype as separate parameters to reduce
code churn in joinpath.c --- in particular, putting jointype into the
struct would have been problematic because of the subroutines' habit of
changing their local copies of that variable.

* Avoid ad-hocery in ExecAssignScanProjectionInfo.  It was probably all
right for it to know about IndexOnlyScan, but if the list is to grow
we should refactor the knowledge out to the callers.

* Restore nodeForeignscan.c's previous use of the relcache to avoid
extra GetFdwRoutine lookups for base-relation scans.

* Lots of cleanup of documentation and missed comments.  Re-order some
code additions into more logical places.
2015-05-10 14:36:36 -04:00
Andres Freund 168d5805e4 Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows specifying an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using an
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint.  DO NOTHING avoids the
constraint violation, without touching the pre-existing row.  DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed.  The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.

This feature is often referred to as upsert.

This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert.  If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made.  If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.

To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
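
Putting the pieces together (table and column names hypothetical; assume a
unique constraint on key):

    INSERT INTO counters AS c (key, hits) VALUES ('home', 1)
    ON CONFLICT (key) DO UPDATE
        SET hits = c.hits + EXCLUDED.hits
        WHERE c.hits < 1000000;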

Bumps catversion as stored rules change.

Author: Peter Geoghegan, with significant contributions from Heikki
    Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
    Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:43:10 +02:00
Magnus Hagander d678bde655 Fix indentation that could mask a future bug
Michael Paquier, spotted using Coverity
2015-05-07 11:41:26 +02:00
Robert Haas e044a44949 Deparse named arguments to use the new => operator instead of :=
Tom Lane pointed out that this wasn't done, and asked whether that was
intentional.  Subsequent discussion was in favor of making the change,
so here we go.
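
For instance, a view over a call to a hypothetical function with named
parameters:

    CREATE VIEW v AS SELECT make_uri(host => 'example.org', port => 8080);
    -- pg_get_viewdef('v') now prints the call as
    -- make_uri(host => 'example.org', port => 8080)
    -- rather than the older make_uri(host := 'example.org', port := 8080).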
2015-05-01 09:37:10 -04:00
Robert Haas e7cb7ee145 Allow FDWs and custom scan providers to replace joins with scans.
Foreign data wrappers can use this capability for so-called "join
pushdown"; that is, instead of executing two separate foreign scans
and then joining the results locally, they can generate a path which
performs the join on the remote server and then is scanned locally.
This commit does not extend postgres_fdw to take advantage of this
capability; it just provides the infrastructure.

Custom scan providers can use this in a similar way.  Previously,
it was only possible for a custom scan provider to scan a single
relation.  Now, it can scan an entire join tree, provided of course
that it knows how to produce the same results that the join would
have produced if executed normally.

KaiGai Kohei, reviewed by Shigeru Hanada, Ashutosh Bapat, and me.
2015-05-01 08:50:35 -04:00
Peter Eisentraut cac7658205 Add transforms feature
This provides a mechanism for specifying conversions between SQL data
types and procedural languages.  As examples, there are transforms
for hstore and ltree for PL/Perl and PL/Python.
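
The general shape of the new DDL (the handler function names below are
illustrative, not necessarily the ones shipped with the extensions):

    CREATE TRANSFORM FOR hstore LANGUAGE plpythonu (
        FROM SQL WITH FUNCTION hstore_to_plpython(internal),
        TO SQL WITH FUNCTION plpython_to_hstore(internal)
    );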

reviews by Pavel Stěhule and Andres Freund
2015-04-26 10:33:14 -04:00
Tom Lane 542320c2bd Be more careful about printing constants in ruleutils.c.
The previous coding in get_const_expr() tried to avoid quoting integer,
float, and numeric literals if at all possible.  While that looks nice,
it means that dumped expressions might re-parse to something that's
semantically equivalent but not the exact same parsetree; for example
a FLOAT8 constant would re-parse as a NUMERIC constant with a cast to
FLOAT8.  Though the result would be the same after constant-folding,
this is problematic in certain contexts.  In particular, Jeff Davis
pointed out that this could cause unexpected failures in ALTER INHERIT
operations because of child tables having not-exactly-equivalent CHECK
expressions.  Therefore, favor correctness over legibility and dump
such constants in quotes except in the limited cases where they'll
be interpreted as the same type even without any casting.
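
For example (table hypothetical):

    CREATE TABLE t (f8 float8);
    CREATE VIEW v AS SELECT * FROM t WHERE f8 > 1.5;
    -- The float8 constant is now deparsed along the lines of
    -- '1.5'::double precision rather than the bare literal 1.5, which would
    -- re-parse as a numeric constant plus a cast, i.e. a different parse tree.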

This results in assorted small changes in the regression test outputs,
and will affect display of user-defined views and rules similarly.
The odds of that causing problems in the field seem non-negligible;
given the lack of previous complaints, it seems best not to change
this in the back branches.
2015-03-30 14:59:49 -04:00
Tom Lane 7b8b8a4331 Improve representation of PlanRowMark.
This patch fixes two inadequacies of the PlanRowMark representation.

First, that the original LockingClauseStrength isn't stored (and cannot be
inferred for foreign tables, which always get ROW_MARK_COPY).  Since some
PlanRowMarks are created out of whole cloth and don't actually have an
ancestral RowMarkClause, this requires adding a dummy LCS_NONE value to
enum LockingClauseStrength, which is fairly annoying but the alternatives
seem worse.  This fix allows getting rid of the use of get_parse_rowmark()
in FDWs (as per the discussion around commits 462bd95705 and
8ec8760fc8), and it simplifies some things elsewhere.

Second, that the representation assumed that all child tables in an
inheritance hierarchy would use the same RowMarkType.  That's true today
but will soon not be true.  We add an "allMarkTypes" field that identifies
the union of mark types used in all a parent table's children, and use
that where appropriate (currently, only in preprocess_targetlist()).

In passing fix a couple of minor infelicities left over from the SKIP
LOCKED patch, notably that _outPlanRowMark still thought waitPolicy
is a bool.

Catversion bump is required because the numeric values of enum
LockingClauseStrength can appear in on-disk rules.

Extracted from a much larger patch to support foreign table inheritance;
it seemed worth breaking this out, since it's a separable concern.

Shigeru Hanada and Etsuro Fujita, somewhat modified by me
2015-03-15 18:41:47 -04:00
Tom Lane e9f1c01b71 Fix dumping of views that are just VALUES(...) but have column aliases.
The "simple" path for printing VALUES clauses doesn't work if we need
to attach nondefault column aliases, because there's no place to do that
in the minimal VALUES() syntax.  So modify get_simple_values_rte() to
detect nondefault aliases and treat that as a non-simple case.  This
further exposes that the "non-simple" path never actually worked;
it didn't produce valid syntax.  Fix that too.  Per bug #12789 from
Curtis McEnroe, and analysis by Andrew Gierth.
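
For example:

    CREATE VIEW v (a, b) AS VALUES (1, 'one'), (2, 'two');
    -- The nondefault column aliases a, b cannot be expressed in the bare
    -- VALUES (...) form, so pg_get_viewdef() now falls back to the
    -- (repaired) SELECT ... FROM (VALUES ...) style of output, approximately.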

Back-patch to all supported branches.  Before 9.3, this also requires
back-patching the part of commit 092d7ded29
that created get_simple_values_rte() to begin with; inserting the extra
test into the old factorization of that logic would've been too messy.
2015-02-25 12:01:12 -05:00
Tom Lane a5cd70dcbc Improve performance of EXPLAIN with large range tables.
As of 9.3, ruleutils.c goes to some lengths to ensure that table and column
aliases used in its output are unique.  Of course this takes more time than
was required before, which in itself isn't fatal.  However, EXPLAIN was set
up so that recalculation of the unique aliases was repeated for each
subexpression printed in a plan.  That results in O(N^2) time and memory
consumption for large plan trees, which did not happen in older branches.

Fortunately, the expensive work is the same across a whole plan tree,
so there is no need to repeat it; we can do most of the initialization
just once per query and re-use it for each subexpression.  This buys
back most (not all) of the performance loss since 9.2.

We need an extra ExplainState field to hold the precalculated deparse
context.  That's no problem in HEAD, but in the back branches, expanding
sizeof(ExplainState) seems risky because third-party extensions might
have local variables of that struct type.  So, in 9.4 and 9.3, introduce
an auxiliary struct to keep sizeof(ExplainState) the same.  We should
refactor the APIs to avoid such local variables in future, but that's
material for a separate HEAD-only commit.

Per gripe from Alexey Bashtanov.  Back-patch to 9.3 where the issue
was introduced.
2015-01-15 13:18:12 -05:00
Bruce Momjian 4baaf863ec Update copyright for 2015
Backpatch certain files through 9.0
2015-01-06 11:43:47 -05:00
Tom Lane 447770404c Rearrange CustomScan API.
Make it work more like FDW plans do: instead of assuming that there are
expressions in a CustomScan plan node that the core code doesn't know
about, insist that all subexpressions that need planner attention be in
a "custom_exprs" list in the Plan representation.  (Of course, the
custom plugin can break the list apart again at executor initialization.)
This lets us revert the parts of the patch that exposed setrefs.c and
subselect.c processing to the outside world.

Also revert the GetSpecialCustomVar stuff in ruleutils.c; that concept
may work in future, but it's far from fully baked right now.
2014-11-21 18:21:46 -05:00
Robert Haas 0b03e5951b Introduce custom path and scan providers.
This allows extension modules to define their own methods for
scanning a relation, and get the core code to use them.  It's
unclear as yet how much use this capability will find, but we
won't find out if we never commit it.

KaiGai Kohei, reviewed at various times and in various levels
of detail by Shigeru Hanada, Tom Lane, Andres Freund, Álvaro
Herrera, and myself.
2014-11-07 17:34:36 -05:00
Alvaro Herrera 7b1c2a0f20 Split builtins.h to a new header ruleutils.h
The new header contains many prototypes for functions in ruleutils.c
that are not exposed to the SQL level.

Reviewed by Andres Freund and Michael Paquier.
2014-10-08 18:10:47 -03:00
Alvaro Herrera df630b0dd5 Implement SKIP LOCKED for row-level locks
This clause changes the behavior of SELECT locking clauses in the
presence of locked rows: instead of causing a process to block waiting
for the locks held by other processes (or raise an error, with NOWAIT),
SKIP LOCKED makes the new reader skip over such rows.  While this is not
appropriate behavior for general purposes, there are some cases in which
it is useful, such as queue-like tables.
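
For example, a queue consumer that takes the next unclaimed job without
blocking on rows locked by other workers (table hypothetical):

    SELECT id, payload
      FROM jobs
     WHERE status = 'pending'
     ORDER BY id
     LIMIT 1
       FOR UPDATE SKIP LOCKED;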

Catalog version bumped because this patch changes the representation of
stored rules.

Reviewed by Craig Ringer (based on a previous attempt at an
implementation by Simon Riggs, who also provided input on the syntax
used in the current patch), David Rowley, and Álvaro Herrera.

Author: Thomas Munro
2014-10-07 17:23:34 -03:00
Tom Lane 9b35ddce93 Partial fix for dropped columns in functions returning composite.
When a view has a function-returning-composite in FROM, and there are
some dropped columns in the underlying composite type, ruleutils.c
printed junk in the column alias list for the reconstructed FROM entry.
Before 9.3, this was prevented by doing get_rte_attribute_is_dropped
tests while printing the column alias list; but that solution is not
currently available to us for reasons I'll explain below.  Instead,
check for empty-string entries in the alias list, which can only exist
if that column position had been dropped at the time the view was made.
(The parser fills in empty strings to preserve the invariant that the
aliases correspond to physical column positions.)
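
For illustration, the kind of case this fixes (names hypothetical):

    CREATE TABLE src (a int, b int, c int);
    ALTER TABLE src DROP COLUMN b;
    CREATE FUNCTION f() RETURNS SETOF src LANGUAGE sql AS 'SELECT * FROM src';
    CREATE VIEW v AS SELECT * FROM f();
    -- pg_get_viewdef('v') previously printed junk for the dropped column's
    -- slot in the FROM-entry column alias list; the empty-string check now
    -- skips it.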

While this is sufficient to handle the case of columns dropped before
the view was made, we have still got issues with columns dropped after
the view was made.  In particular, the view could contain Vars that
explicitly reference such columns!  The dependency machinery really
ought to refuse the column drop attempt in such cases, as it would do
when trying to drop a table column that's explicitly referenced in
views.  However, we currently neglect to store dependencies on columns
of composite types, and fixing that is likely to be too big to be
back-patchable (not to mention that existing views in existing databases
would not have the needed pg_depend entries anyway).  So I'll leave that
for a separate patch.

Pre-9.3, ruleutils would print such Vars normally (with their original
column names) even though it suppressed their entries in the RTE's
column alias list.  This is certainly bogus, since the printed view
definition would fail to reload, but at least it didn't crash.  However,
as of 9.3 the printed column alias list is tightly tied to the names
printed for Vars; so we can't treat columns as dropped for one purpose
and not dropped for the other.  This is why we can't just put back the
get_rte_attribute_is_dropped test: it results in an assertion failure
if the view in fact contains any Vars referencing the dropped column.
Once we've got dependencies preventing such cases, we'll probably want
to do it that way instead of relying on the empty-string test used here.

This fix turned up a very ancient bug in outfuncs/readfuncs, namely
that T_String nodes containing empty strings were not dumped/reloaded
correctly: the node was printed as "<>" which is read as a string
value of <>.  Since (per SQL) we disallow empty-string identifiers,
such nodes don't occur normally, which is why we'd not noticed.
(Such nodes aren't used for literal constants, just identifiers.)

Per report from Marc Schablewski.  Back-patch to 9.3 which is where
the rule printing behavior changed.  The dangling-variable case is
broken all the way back, but that's not what his complaint is about.
2014-07-19 14:28:52 -04:00
Tom Lane 8f889b1083 Implement UPDATE tab SET (col1,col2,...) = (SELECT ...), ...
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated.  While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
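
For example (tables hypothetical):

    UPDATE accounts a
       SET (balance, updated_at) =
           (SELECT p.balance, now() FROM pending p WHERE p.acct_id = a.id)
     WHERE EXISTS (SELECT 1 FROM pending p WHERE p.acct_id = a.id);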

The standard actually says that the source for a multi-column assignment
could be any row-valued expression.  The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too.  However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs.  For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
2014-06-18 13:22:34 -04:00
Bruce Momjian 0a78320057 pgindent run for 9.4
This includes removing tabs after periods in C comments, which was
applied to back branches, so this change should not affect backpatching.
2014-05-06 12:12:18 -04:00
Tom Lane 91e16b9806 Fix yet another corner case in dumping rules/views with USING clauses.
ruleutils.c tries to cope with additions/deletions/renamings of columns in
tables referenced by views, by means of adding machine-generated aliases to
the printed form of a view when needed to preserve the original semantics.
A recent blog post by Marko Tiikkaja pointed out a case I'd missed though:
if one input of a join with USING is itself a join, there is nothing to
stop the user from adding a column of the same name as the USING column to
whichever side of the sub-join didn't provide the USING column.  And then
there'll be an error when the view is re-parsed, since now the sub-join
exposes two columns matching the USING specification.  We were catching a
lot of related cases, but not this one, so add some logic to cope with it.
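
For illustration (tables hypothetical):

    CREATE TABLE a (id int);
    CREATE TABLE b (id int);
    CREATE TABLE c (y int);
    CREATE VIEW v AS SELECT * FROM (a JOIN c ON true) JOIN b USING (id);
    ALTER TABLE c ADD COLUMN id int;
    -- The sub-join now exposes two candidate "id" columns; without the added
    -- aliasing logic the deparsed view definition would fail to re-parse.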

Back-patch to 9.3, which is the first release that makes any serious
attempt to cope with such cases (cf commit 2ffa740be and follow-ons).
2014-05-01 20:22:37 -04:00
Tom Lane 0bff398761 Check for interrupts and stack overflow during rule/view dumps.
Since ruleutils.c recurses, it could be driven to stack overflow by
deeply nested constructs.  Very large queries might also take long
enough to deparse that a check for interrupts seems like a good idea.
Stick appropriate tests into a couple of key places.

Noted by Greg Stark.  Back-patch to all supported branches.
2014-04-30 13:46:13 -04:00
Tom Lane 41de93c53a Reduce indentation/parenthesization of set operations in rule/view dumps.
A query such as "SELECT x UNION SELECT y UNION SELECT z UNION ..."
produces a left-deep nested parse tree, which we formerly showed in its
full nested glory and with all the possible parentheses.  This does little
for readability, though, and long UNION lists resulting in excessive
indentation are common.  Instead, let's omit parentheses and indent all
the subqueries at the same level in such cases.

This patch skips indentation/parenthesization whenever the lefthand input
of a SetOperationStmt is another SetOperationStmt of the same kind and
ALL/DISTINCT property.  We could teach the code the exact syntactic
precedence of set operations and thereby avoid parenthesization in some
more cases, but it's not clear that that'd be a readability win: it seems
better to parenthesize if the set operation changes.  (As an example,
if there's one UNION in a long list of UNION ALL, it now stands out like
a sore thumb, which seems like a good thing.)

Back-patch to 9.3.  This completes our response to a complaint from Greg
Stark that since commit 62e666400d there's a performance problem in pg_dump
for views containing long UNION sequences (or other types of deeply nested
constructs).  The previous commit 0601cb54da
handles the general problem, but this one makes the specific case of UNION
lists look a lot nicer.
2014-04-30 13:26:26 -04:00
Tom Lane 0601cb54da Limit overall indentation in rule/view dumps.
Continuing to indent no matter how deeply nested we get doesn't really
do anything for readability; what's worse, it results in O(N^2) total
whitespace, which can become a performance and memory-consumption issue.

To address this, once we get past 40 characters of indentation, reduce
the indentation step distance 4x, and also limit the maximum indentation
by reducing it modulo 40.  This latter choice is a bit weird at first
glance, but it seems to preserve readability better than a simple cap
would do.

Back-patch to 9.3, because since commit 62e666400d the performance issue
is a hazard for pg_dump.

Greg Stark and Tom Lane
2014-04-30 12:48:12 -04:00
Tom Lane d166eed302 Fix indentation of JOIN clauses in rule/view dumps.
The code attempted to outdent JOIN clauses further left than the parent
FROM keyword, which was odd in any case, and led to inconsistent formatting
since in simple cases the clauses couldn't be moved any further left than
that.  And it left a permanent decrement of the indentation level, causing
subsequent lines to be much further left than they should be (again, this
couldn't be seen in simple cases for lack of indentation to give up).

After a little experimentation I chose to make it indent JOIN keywords
two spaces from the parent FROM, which is one space more than the join's
lefthand input in cases where that appears on a different line from FROM.

Back-patch to 9.3.  This is a purely cosmetic change, and the bug is quite
old, so that may seem arbitrary; but we are going to be making some other
changes to the indentation behavior in both HEAD and 9.3, so it seems
reasonable to include this in 9.3 too.  I committed this one first because
its effects are more visible in the regression test results as they
currently stand than they will be later.
2014-04-30 12:01:19 -04:00
Simon Riggs e5550d5fec Reduce lock levels of some ALTER TABLE cmds
VALIDATE CONSTRAINT

CLUSTER ON
SET WITHOUT CLUSTER

ALTER COLUMN SET STATISTICS
ALTER COLUMN SET ()
ALTER COLUMN RESET ()

All other sub-commands use AccessExclusiveLock

Simon Riggs and Noah Misch

Reviews by Robert Haas and Andres Freund
2014-04-06 11:13:43 -04:00
Tom Lane c7b3539599 Fix non-equivalence of VARIADIC and non-VARIADIC function call formats.
For variadic functions (other than VARIADIC ANY), the syntaxes foo(x,y,...)
and foo(VARIADIC ARRAY[x,y,...]) should be considered equivalent, since the
former is converted to the latter at parse time.  They have indeed been
equivalent, in all releases before 9.3.  However, commit 75b39e790 made an
ill-considered decision to record which syntax had been used in FuncExpr
nodes, and then to make equal() test that in checking node equality ---
which caused the syntaxes to not be seen as equivalent by the planner.
This is the underlying cause of bug #9817 from Dmitry Ryabov.

It might seem that a quick fix would be to make equal() disregard
FuncExpr.funcvariadic, but the same commit made that untenable, because
the field actually *is* semantically significant for some VARIADIC ANY
functions.  This patch instead adopts the approach of redefining
funcvariadic (and aggvariadic, in HEAD) as meaning that the last argument
is a variadic array, whether it got that way by parser intervention or was
supplied explicitly by the user.  Therefore the value will always be true
for non-ANY variadic functions, restoring the principle of equivalence.
(However, the planner will continue to consider use of VARIADIC as a
meaningful difference for VARIADIC ANY functions, even though some such
functions might disregard it.)
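
For a non-ANY variadic (the function here is hypothetical), these two calls
are again treated as equivalent, e.g. for matching an expression index:

    CREATE FUNCTION total(VARIADIC nums int[]) RETURNS bigint
        LANGUAGE sql AS 'SELECT sum(n) FROM unnest(nums) AS t(n)';
    SELECT total(1, 2, 3);
    SELECT total(VARIADIC ARRAY[1, 2, 3]);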

In HEAD, this change lets us simplify the decompilation logic in
ruleutils.c, since the funcvariadic/aggvariadic flag tells directly whether
to print VARIADIC.  However, in 9.3 we have to continue to cope with
existing stored rules/views that might contain the previous definition.
Fortunately, this just means no change in ruleutils.c, since its existing
behavior effectively ignores funcvariadic for all cases other than VARIADIC
ANY functions.

In HEAD, bump catversion to reflect the fact that FuncExpr.funcvariadic
changed meanings; this is sort of pro forma, since I don't believe any
built-in views are affected.

Unfortunately, this patch doesn't magically fix everything for affected
9.3 users.  After installing 9.3.5, they might need to recreate their
rules/views/indexes containing variadic function calls in order to get
everything consistent with the new definition.  As in the cited bug,
the symptom of a problem would be failure to use a nominally matching
index that has a variadic function call in its definition.  We'll need
to mention this in the 9.3.5 release notes.
2014-04-03 22:02:24 -04:00
Tom Lane 7c31874945 Avoid getting more than AccessShareLock when deparsing a query.
In make_ruledef and get_query_def, we have long used AcquireRewriteLocks
to ensure that the querytree we are about to deparse is up-to-date and
the schemas of the underlying relations aren't changing.  However, that
function thinks the query is about to be executed, so it acquires locks
that are stronger than necessary for the purpose of deparsing.  Thus for
example, if pg_dump asks to deparse a rule that includes "INSERT INTO t",
we'd acquire RowExclusiveLock on t.  That results in interference with
concurrent transactions that might for example ask for ShareLock on t.
Since pg_dump is documented as being purely read-only, this is unexpected.
(Worse, it used to actually be read-only; this behavior dates back only
to 8.1, cf commit ba4200246.)

Fix this by adding a parameter to AcquireRewriteLocks to tell it whether
we want the "real" execution locks or only AccessShareLock.

Report, diagnosis, and patch by Dean Rasheed.  Back-patch to all supported
branches.
2014-03-06 19:31:05 -05:00
Bruce Momjian 7e04792a1c Update copyright for 2014
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
2014-01-07 16:05:30 -05:00
Tom Lane 8d65da1f01 Support ordered-set (WITHIN GROUP) aggregates.
This patch introduces generic support for ordered-set and hypothetical-set
aggregate functions, as well as implementations of the instances defined in
SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(),
percent_rank(), cume_dist()).  We also added mode() though it is not in the
spec, as well as versions of percentile_cont() and percentile_disc() that
can compute multiple percentile values in one pass over the data.
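
For example (table hypothetical):

    SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY height) AS median_height,
           mode() WITHIN GROUP (ORDER BY eye_color) AS modal_eye_color,
           percentile_disc(ARRAY[0.25, 0.75]) WITHIN GROUP (ORDER BY height)
      FROM people;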

Unlike the original submission, this patch puts full control of the sorting
process in the hands of the aggregate's support functions.  To allow the
support functions to find out how they're supposed to sort, a new API
function AggGetAggref() is added to nodeAgg.c.  This allows retrieval of
the aggregate call's Aggref node, which may have other uses beyond the
immediate need.  There is also support for ordered-set aggregates to
install cleanup callback functions, so that they can be sure that
infrastructure such as tuplesort objects gets cleaned up.

In passing, make some fixes in the recently-added support for variadic
aggregates, and make some editorial adjustments in the recent FILTER
additions for aggregates.  Also, simplify use of IsBinaryCoercible() by
allowing it to succeed whenever the target type is ANY or ANYELEMENT.
It was inconsistent that it dealt with other polymorphic target types
but not these.

Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing,
and rather heavily editorialized upon by Tom Lane
2013-12-23 16:11:35 -05:00
Noah Misch 53685d7981 Rename TABLE() to ROWS FROM().
SQL-standard TABLE() is a subset of UNNEST(); they deal with arrays and
other collection types.  This feature, however, deals with set-returning
functions.  Use a different syntax for this feature to keep open the
possibility of implementing the standard TABLE().
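
For example:

    SELECT * FROM ROWS FROM (generate_series(1, 3),
                             generate_series(11, 12)) AS t(a, b);
    -- rows: (1,11), (2,12), (3,NULL)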
2013-12-10 09:34:37 -05:00
Peter Eisentraut 85ed91ee7d Implement information_schema.parameters.parameter_default column
Reviewed-by: Ali Dar <ali.munir.dar@gmail.com>
Reviewed-by: Amit Khandekar <amit.khandekar@enterprisedb.com>
Reviewed-by: Rodolfo Campero <rodolfo.campero@anachronics.com>
2013-11-26 23:21:35 -05:00
Peter Eisentraut b7212c9726 Fix thinko in SPI_execute_plan() calls
Two call sites were apparently thinking that the last argument of
SPI_execute_plan() is the number of query parameters, but it is actually
the row limit.  Change the calls to 0, since we don't care about the
limit there.  The previous code didn't break anything, but it was still
wrong.
2013-11-23 09:34:57 -05:00
Peter Eisentraut 4053189d59 Avoid potential buffer overflow crash
A pointer to a C string was treated as a pointer to a "name" datum and
passed to SPI_execute_plan().  This pointer would then end up being
passed through datumCopy(), which would try to copy the entire 64 bytes
of name data, thus running past the end of the C string.  Fix by
converting the string to a proper name structure.

Found by LLVM AddressSanitizer.
2013-11-23 07:25:37 -05:00
Tom Lane 784e762e88 Support multi-argument UNNEST(), and TABLE() syntax for multiple functions.
This patch adds the ability to write TABLE( function1(), function2(), ...)
as a single FROM-clause entry.  The result is the concatenation of the
first row from each function, followed by the second row from each
function, etc; with NULLs inserted if any function produces fewer rows than
others.  This is believed to be a much more useful behavior than what
Postgres currently does with multiple SRFs in a SELECT list.

This syntax also provides a reasonable way to combine use of column
definition lists with WITH ORDINALITY: put the column definition list
inside TABLE(), where it's clear that it doesn't control the ordinality
column as well.

Also implement SQL-compliant multiple-argument UNNEST(), by turning
UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).
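
For example:

    SELECT * FROM unnest(ARRAY[1, 2, 3], ARRAY['a', 'b']) AS t(n, s);
    -- rows: (1,'a'), (2,'b'), (3,NULL)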

The SQL standard specifies TABLE() with only a single function, not
multiple functions, and it seems to require an implicit UNNEST() which is
not what this patch does.  There may be something wrong with that reading
of the spec, though, because if it's right then the spec's TABLE() is just
a pointless alternative spelling of UNNEST().  After further review of
that, we might choose to adopt a different syntax for what this patch does,
but in any case this functionality seems clearly worthwhile.

Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and
significantly revised by me
2013-11-21 19:37:20 -05:00
Tom Lane 0b7e660d6c Fix ruleutils pretty-printing to not generate trailing whitespace.
The pretty-printing logic in ruleutils.c operates by inserting a newline
and some indentation whitespace into strings that are already valid SQL.
This naturally results in leaving some trailing whitespace before the
newline in many cases; which can be annoying when processing the output
with other tools, as complained of by Joe Abbate.  We can fix that in
a pretty localized fashion by deleting any trailing whitespace before
we append a pretty-printing newline.  In addition, we have to modify the
code inserted by commit 2f582f76b1 so that
we also delete trailing whitespace when transposing items from temporary
buffers into the main result string, when a temporary item starts with a
newline.

This results in rather voluminous changes to the regression test results,
but it's easily verified that they are only removal of trailing whitespace.

Back-patch to 9.3, because the aforementioned commit resulted in many
more cases of trailing whitespace than had occurred in earlier branches.
2013-11-11 13:36:38 -05:00
Tom Lane bb45c64041 Support default arguments and named-argument notation for window functions.
These things didn't work because the planner omitted to do the necessary
preprocessing of a WindowFunc's argument list.  Add the few dozen lines
of code needed to handle that.

Although this sounds like a feature addition, it's really a bug fix because
the default-argument case was likely to crash previously, due to lack of
checking of the number of supplied arguments in the built-in window
functions.  It's not a security issue because there's no way for a
non-superuser to create a window function definition with defaults that
refers to a built-in C function, but nonetheless people might be annoyed
that it crashes rather than producing a useful error message.  So
back-patch as far as the patch applies easily, which turns out to be 9.2.
I'll put a band-aid in earlier versions as a separate patch.

(Note that these features still don't work for aggregates, and fixing that
case will be harder since we represent aggregate arg lists as target lists
not bare expression lists.  There's no crash risk though because CREATE
AGGREGATE doesn't accept defaults, and we reject named-argument notation
when parsing an aggregate call.)
2013-11-06 13:33:09 -05:00
Robert Haas cacbdd7810 Use appendStringInfoString instead of appendStringInfo where possible.
This shaves a few cycles, and generally seems like good programming
practice.

David Rowley
2013-10-31 10:55:59 -04:00
Kevin Grittner 277607d600 Eliminate pg_rewrite.ev_attr column and related dead code.
Commit 95ef6a3448 removed the
ability to create rules on an individual column as of 7.3, but
left some residual code which has since been useless.  This cleans
up that dead code without any change in behavior other than
dropping the useless column from the catalog.
2013-09-05 14:03:43 -05:00