Commit Graph

343 Commits

Author SHA1 Message Date
Tom Lane c12d570fa1 Support arrays over domains.
Allowing arrays with a domain type as their element type was left un-done
in the original domain patch, but not for any very good reason.  This
omission leads to such surprising results as array_agg() not working on
a domain column, because the parser can't identify a suitable output type
for the polymorphic aggregate.

In order to fix this, first clean up the APIs of coerce_to_domain() and
some internal functions in parse_coerce.c so that we consistently pass
around a CoercionContext along with CoercionForm.  Previously, we sometimes
passed an "isExplicit" boolean flag instead, which is strictly less
information; and coerce_to_domain() didn't even get that, but instead had
to reverse-engineer isExplicit from CoercionForm.  That's contrary to the
documentation in primnodes.h that says that CoercionForm only affects
display and not semantics.  I don't think this change fixes any live bugs,
but it makes things more consistent.  The main reason for doing it though
is that now build_coercion_expression() receives ccontext, which it needs
in order to be able to recursively invoke coerce_to_target_type().

Next, reimplement ArrayCoerceExpr so that the node does not directly know
any details of what has to be done to the individual array elements while
performing the array coercion.  Instead, the per-element processing is
represented by a sub-expression whose input is a source array element and
whose output is a target array element.  This simplifies life in
parse_coerce.c, because it can build that sub-expression by a recursive
invocation of coerce_to_target_type().  The executor now handles the
per-element processing as a compiled expression instead of hard-wired code.
The main advantage of this is that we can use a single ArrayCoerceExpr to
handle as many as three successive steps per element: base type conversion,
typmod coercion, and domain constraint checking.  The old code used two
stacked ArrayCoerceExprs to handle type + typmod coercion, which was pretty
inefficient, and adding yet another array deconstruction to do domain
constraint checking seemed very unappetizing.

In the case where we just need a single, very simple coercion function,
doing this straightforwardly leads to a noticeable increase in the
per-array-element runtime cost.  Hence, add an additional shortcut evalfunc
in execExprInterp.c that skips unnecessary overhead for that specific form
of expression.  The runtime speed of simple cases is within 1% or so of
where it was before, while cases that previously required two levels of
array processing are significantly faster.

Finally, create an implicit array type for every domain type, as we do for
base types, enums, etc.  Everything except the array-coercion case seems
to just work without further effort.

Tom Lane, reviewed by Andrew Dunstan

Discussion: https://postgr.es/m/9852.1499791473@sss.pgh.pa.us
2017-09-30 13:40:56 -04:00
Robert Haas 6f6b99d133 Allow a partitioned table to have a default partition.
Any tuples that don't route to any other partition will route to the
default partition.

Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert
Haas, with review and testing at various stages by (at least) Rushabh
Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven
Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.

Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
2017-09-08 17:28:04 -04:00
Tom Lane 7df2c1f8da Force rescanning of parallel-aware scan nodes below a Gather[Merge].
The ExecReScan machinery contains various optimizations for postponing
or skipping rescans of plan subtrees; for example a HashAgg node may
conclude that it can re-use the table it built before, instead of
re-reading its input subtree.  But that is wrong if the input contains
a parallel-aware table scan node, since the portion of the table scanned
by the leader process is likely to vary from one rescan to the next.
This explains the timing-dependent buildfarm failures we saw after
commit a2b70c89c.

The established mechanism for showing that a plan node's output is
potentially variable is to mark it as depending on some runtime Param.
Hence, to fix this, invent a dummy Param (one that has a PARAM_EXEC
parameter number, but carries no actual value) associated with each Gather
or GatherMerge node, mark parallel-aware nodes below that node as dependent
on that Param, and arrange for ExecReScanGather[Merge] to flag that Param
as changed whenever the Gather[Merge] node is rescanned.

This solution breaks an undocumented assumption made by the parallel
executor logic, namely that all rescans of nodes below a Gather[Merge]
will happen synchronously during the ReScan of the top node itself.
But that's fundamentally contrary to the design of the ExecReScan code,
and so was doomed to fail someday anyway (even if you want to argue
that the bug being fixed here wasn't a failure of that assumption).
A follow-on patch will address that issue.  In the meantime, the worst
that's expected to happen is that given very bad timing luck, the leader
might have to do all the work during a rescan, because workers think
they have nothing to do, if they are able to start up before the eventual
ReScan of the leader's parallel-aware table scan node has reset the
shared scan state.

Although this problem exists in 9.6, there does not seem to be any way
for it to manifest there.  Without GatherMerge, it seems that a plan tree
that has a rescan-short-circuiting node below Gather will always also
have one above it that will short-circuit in the same cases, preventing
the Gather from being rescanned.  Hence we won't take the risk of
back-patching this change into 9.6.  But v10 needs it.

Discussion: https://postgr.es/m/CAA4eK1JkByysFJNh9M349u_nNjqETuEnY_y1VUc_kJiU0bxtaQ@mail.gmail.com
2017-08-30 09:29:55 -04:00
Dean Rasheed d363d42bb9 Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition bounds.
Previously, UNBOUNDED meant no lower bound when used in the FROM list,
and no upper bound when used in the TO list, which was OK for
single-column range partitioning, but problematic with multiple
columns. For example, an upper bound of (10.0, UNBOUNDED) would not be
collocated with a lower bound of (10.0, UNBOUNDED), thus making it
difficult or impossible to define contiguous multi-column range
partitions in some cases.

Fix this by using MINVALUE and MAXVALUE instead of UNBOUNDED to
represent a partition column that is unbounded below or above
respectively. This syntax removes any ambiguity, and ensures that if
one partition's lower bound equals another partition's upper bound,
then the partitions are contiguous.

Also drop the constraint prohibiting finite values after an unbounded
column, and just document the fact that any values after MINVALUE or
MAXVALUE are ignored. Previously it was necessary to repeat UNBOUNDED
multiple times, which was needlessly verbose.

Note: Forces a post-PG 10 beta2 initdb.

Report by Amul Sul, original patch by Amit Langote with some
additional hacking by me.

Discussion: https://postgr.es/m/CAAJ_b947mowpLdxL3jo3YLKngRjrq9+Ej4ymduQTfYR+8=YAYQ@mail.gmail.com
2017-07-21 09:20:47 +01:00
Tom Lane decb08ebdf Code review for NextValueExpr expression node type.
Add missing infrastructure for this node type, notably in ruleutils.c where
its lack could demonstrably cause EXPLAIN to fail.  Add outfuncs/readfuncs
support.  (outfuncs support is useful today for debugging purposes.  The
readfuncs support may never be needed, since at present it would only
matter for parallel query and NextValueExpr should never appear in a
parallelizable query; but it seems like a bad idea to have a primnode type
that isn't fully supported here.)  Teach planner infrastructure that
NextValueExpr is a volatile, parallel-unsafe, non-leaky expression node
with cost cpu_operator_cost.  Given its limited scope of usage, there
*might* be no live bug today from the lack of that knowledge, but it's
certainly going to bite us on the rear someday.  Teach pg_stat_statements
about the new node type, too.

While at it, also teach cost_qual_eval() that MinMaxExpr, SQLValueFunction,
XmlExpr, and CoerceToDomain should be charged as cpu_operator_cost.
Failing to do this for SQLValueFunction was an oversight in my commit
0bb51aa96.  The others are longer-standing oversights, but no time like the
present to fix them.  (In principle, CoerceToDomain could have cost much
higher than this, but it doesn't presently seem worth trying to examine the
domain's constraints here.)

Modify execExprInterp.c to execute NextValueExpr as an out-of-line
function; it seems quite unlikely to me that it's worth insisting that
it be inlined in all expression eval methods.  Besides, providing the
out-of-line function doesn't stop anyone from inlining if they want to.

Adjust some places where NextValueExpr support had been inserted with the
aid of a dartboard rather than keeping it in the same order as elsewhere.

Discussion: https://postgr.es/m/23862.1499981661@sss.pgh.pa.us
2017-07-14 15:25:43 -04:00
Peter Eisentraut d80e73f229 Fix output of char node fields
WRITE_CHAR_FIELD() didn't do any escaping, so that for example a zero
byte would cause the whole output string to be truncated.  To fix, pass
the char through outToken(), so it is escaped like a string.  Adjust the
reading side to handle this.
2017-07-05 10:51:54 -04:00
Tom Lane c7b8998ebb Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.

Commit e3860ffa4d wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code.  The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there.  BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs.  So the
net result is that in about half the cases, such comments are placed
one tab stop left of before.  This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.

Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:19:25 -04:00
Noah Misch 39ac55918f Reconcile nodes/*funcs.c with PostgreSQL 10 work.
The _equalTableFunc() omission of coltypmods has semantic significance,
but I did not track down resulting user-visible bugs, if any.  The other
changes are cosmetic only, affecting order.  catversion bump due to
readfuncs.c field order change.
2017-06-16 00:16:11 -07:00
Robert Haas f32d57fd70 Fix problems related to RangeTblEntry members enrname and enrtuples.
Commit 18ce3a4ab2 failed to update
the comments in parsenodes.h for the new members, and made only
incomplete updates to src/backend/nodes

Thomas Munro, per a report from Noah Misch.

Discussion: http://postgr.es/m/20170611062525.GA1628882@rfd.leadboat.com
2017-06-14 16:19:46 -04:00
Tom Lane 80f583ffe9 Fix omission of locations in outfuncs/readfuncs partitioning node support.
We could have limped along without this for v10, which was my intention
when I annotated the bug in commit 76a3df6e5.  But consensus is that it's
better to fix it now and take the cost of a post-beta1 initdb (which is
needed because these node types are stored in pg_class.relpartbound).

Since we're forcing initdb anyway, take the opportunity to make the node
type identification strings match the node struct names, instead of being
randomly different from them.

Discussion: https://postgr.es/m/E1dFBEX-0004wt-8t@gemulon.postgresql.org
2017-05-30 11:32:41 -04:00
Tom Lane 76a3df6e5e Code review focused on new node types added by partitioning support.
Fix failure to check that we got a plain Const from const-simplification of
a coercion request.  This is the cause of bug #14666 from Tian Bing: there
is an int4 to money cast, but it's only stable not immutable (because of
dependence on lc_monetary), resulting in a FuncExpr that the code was
miserably unequipped to deal with, or indeed even to notice that it was
failing to deal with.  Add test cases around this coercion behavior.

In view of the above, sprinkle the code liberally with castNode() macros,
in hope of catching the next such bug a bit sooner.  Also, change some
functions that were randomly declared to take Node* to take more specific
pointer types.  And change some struct fields that were declared Node*
but could be given more specific types, allowing removal of assorted
explicit casts.

Place PARTITION_MAX_KEYS check a bit closer to the code it's protecting.
Likewise check only-one-key-for-list-partitioning restriction in a less
random place.

Avoid not-per-project-style usages like !strcmp(...).

Fix assorted failures to avoid scribbling on the input of parse
transformation.  I'm not sure how necessary this is, but it's entirely
silly for these functions to be expending cycles to avoid that and not
getting it right.

Add guards against partitioning on system columns.

Put backend/nodes/ support code into an order that matches handling
of these node types elsewhere.

Annotate the fact that somebody added location fields to PartitionBoundSpec
and PartitionRangeDatum but forgot to handle them in
outfuncs.c/readfuncs.c.  This is fairly harmless for production purposes
(since readfuncs.c would just substitute -1 anyway) but it's still bogus.
It's not worth forcing a post-beta1 initdb just to fix this, but if we
have another reason to force initdb before 10.0, we should go back and
clean this up.

Contrariwise, somebody added location fields to PartitionElem and
PartitionSpec but forgot to teach exprLocation() about them.

Consolidate duplicative code in transformPartitionBound().

Improve a couple of error messages.

Improve assorted commentary.

Re-pgindent the files touched by this patch; this affects a few comment
blocks that must have been added quite recently.

Report: https://postgr.es/m/20170524024550.29935.14396@wrigleys.postgresql.org
2017-05-28 23:20:28 -04:00
Tom Lane f674743487 Remove no-longer-needed fields of Hash plan nodes.
skewColType/skewColTypmod are no longer used in the wake of commit
9aab83fc5, and seem unlikely to be wanted in future, so let's drop 'em.

Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us
2017-05-14 11:07:40 -04:00
Robert Haas e180c8aa8c Fire per-statement triggers on partitioned tables.
Even though no actual tuples are ever inserted into a partitioned
table (the actual tuples are in the partitions, not the partitioned
table itself), we still need to have a ResultRelInfo for the
partitioned table, or per-statement triggers won't get fired.

Amit Langote, per a report from Rajkumar Raghuwanshi.  Reviewed by me.

Discussion: http://postgr.es/m/CAKcux6%3DwYospCRY2J4XEFuVy0L41S%3Dfic7rmkbsU-GXhhSbmBg%40mail.gmail.com
2017-05-01 08:23:01 -04:00
Tom Lane 003d80f3df Mark finished Plan nodes with parallel_safe flags.
We'd managed to avoid doing this so far, but it seems pretty obvious
that it would be forced on us some day, and this is much the cleanest
way of approaching the open problem that parallel-unsafe subplans are
being transmitted to parallel workers.  Anyway there's no space cost
due to alignment considerations, and the time cost is pretty minimal
since we're just copying the flag from the corresponding Path node.
(At least in most cases ... some of the klugier spots in createplan.c
have to work a bit harder.)

In principle we could perhaps get rid of SubPlan.parallel_safe,
but I thought it better to keep that in case there are reasons to
consider a SubPlan unsafe even when its child plan is parallel-safe.

This patch doesn't actually do anything with the new flags, but
I thought I'd commit it separately anyway.

Note: although this touches outfuncs/readfuncs, there's no need for
a catversion bump because Plan trees aren't stored on disk.

Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de
2017-04-12 15:13:34 -04:00
Tom Lane 9c7f5229ad Optimize joins when the inner relation can be proven unique.
If there can certainly be no more than one matching inner row for a given
outer row, then the executor can move on to the next outer row as soon as
it's found one match; there's no need to continue scanning the inner
relation for this outer row.  This saves useless scanning in nestloop
and hash joins.  In merge joins, it offers the opportunity to skip
mark/restore processing, because we know we have not advanced past the
first possible match for the next outer row.

Of course, the devil is in the details: the proof of uniqueness must
depend only on joinquals (not otherquals), and if we want to skip
mergejoin mark/restore then it must depend only on merge clauses.
To avoid adding more planning overhead than absolutely necessary,
the present patch errs in the conservative direction: there are cases
where inner_unique or skip_mark_restore processing could be used, but
it will not do so because it's not sure that the uniqueness proof
depended only on "safe" clauses.  This could be improved later.

David Rowley, reviewed and rather heavily editorialized on by me

Discussion: https://postgr.es/m/CAApHDvqF6Sw-TK98bW48TdtFJ+3a7D2mFyZ7++=D-RyPsL76gw@mail.gmail.com
2017-04-07 22:20:13 -04:00
Peter Eisentraut 3217327053 Identity columns
This is the SQL standard-conforming variant of PostgreSQL's serial
columns.  It fixes a few usability issues that serial columns have:

- CREATE TABLE / LIKE copies default but refers to same sequence
- cannot add/drop serialness with ALTER TABLE
- dropping default does not drop sequence
- need to grant separate privileges to sequence
- other slight weirdnesses because serial is some kind of special macro

Reviewed-by: Vitaly Burovoy <vitaly.burovoy@gmail.com>
2017-04-06 08:41:37 -04:00
Kevin Grittner 18ce3a4ab2 Add infrastructure to support EphemeralNamedRelation references.
A QueryEnvironment concept is added, which allows new types of
objects to be passed into queries from parsing on through
execution.  At this point, the only thing implemented is a
collection of EphemeralNamedRelation objects -- relations which
can be referenced by name in queries, but do not exist in the
catalogs.  The only type of ENR implemented is NamedTuplestore, but
provision is made to add more types fairly easily.

An ENR can carry its own TupleDesc or reference a relation in the
catalogs by relid.

Although these features can be used without SPI, convenience
functions are added to SPI so that ENRs can easily be used by code
run through SPI.

The initial use of all this is going to be transition tables in
AFTER triggers, but that will be added to each PL as a separate
commit.

An incidental effect of this patch is to produce a more informative
error message if an attempt is made to modify the contents of a CTE
from a referencing DML statement.  No tests previously covered that
possibility, so one is added.

Kevin Grittner and Thomas Munro
Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro
with valuable comments and suggestions from many others
2017-03-31 23:17:18 -05:00
Robert Haas d3cc37f1d8 Don't scan partitioned tables.
Partitioned tables do not contain any data; only their unpartitioned
descendents need to be scanned.  However, the partitioned tables still
need to be locked, even though they're not scanned.  To make that
work, Append and MergeAppend relations now need to carry a list of
(unscanned) partitioned relations that must be locked, and InitPlan
must lock all partitioned result relations.

Aside from the obvious advantage of avoiding some work at execution
time, this has two other advantages.  First, it may improve the
planner's decision-making in some cases since the empty relation
might throw things off.  Second, it paves the way to getting rid of
the storage for partitioned tables altogether.

Amit Langote, reviewed by me.

Discussion: http://postgr.es/m/6837c359-45c4-8044-34d1-736756335a15@lab.ntt.co.jp
2017-03-21 09:48:04 -04:00
Robert Haas 355d3993c5 Add a Gather Merge executor node.
Like Gather, we spawn multiple workers and run the same plan in each
one; however, Gather Merge is used when each worker produces the same
output ordering and we want to preserve that output ordering while
merging together the streams of tuples from various workers.  (In a
way, Gather Merge is like a hybrid of Gather and MergeAppend.)

This works out to a win if it saves us from having to perform an
expensive Sort.  In cases where only a small amount of data would need
to be sorted, it may actually be faster to use a regular Gather node
and then sort the results afterward, because Gather Merge sometimes
needs to wait synchronously for tuples whereas a pure Gather generally
doesn't.  But if this avoids an expensive sort then it's a win.

Rushabh Lathia, reviewed and tested by Amit Kapila, Thomas Munro,
and Neha Sharma, and reviewed and revised by me.

Discussion: http://postgr.es/m/CAGPqQf09oPX-cQRpBKS0Gq49Z+m6KBxgxd_p9gX8CKk_d75HoQ@mail.gmail.com
2017-03-09 07:49:29 -05:00
Robert Haas f35742ccb7 Support parallel bitmap heap scans.
The index is scanned by a single process, but then all cooperating
processes can iterate jointly over the resulting set of heap blocks.
In the future, we might also want to support using a parallel bitmap
index scan to set up for a parallel bitmap heap scan, but that's a
job for another day.

Dilip Kumar, with some corrections and cosmetic changes by me.  The
larger patch set of which this is a part has been reviewed and tested
by (at least) Andres Freund, Amit Khandekar, Tushar Ahuja, Rafia
Sabih, Haribabu Kommi, Thomas Munro, and me.

Discussion: http://postgr.es/m/CAFiTN-uc4=0WxRGfCzs-xfkMYcSEWUC-Fon6thkJGjkh9i=13A@mail.gmail.com
2017-03-08 12:05:43 -05:00
Alvaro Herrera fcec6caafa Support XMLTABLE query expression
XMLTABLE is defined by the SQL/XML standard as a feature that allows
turning XML-formatted data into relational form, so that it can be used
as a <table primary> in the FROM clause of a query.

This new construct provides significant simplicity and performance
benefit for XML data processing; what in a client-side custom
implementation was reported to take 20 minutes can be executed in 400ms
using XMLTABLE.  (The same functionality was said to take 10 seconds
using nested PostgreSQL XPath function calls, and 5 seconds using
XMLReader under PL/Python).

The implemented syntax deviates slightly from what the standard
requires.  First, the standard indicates that the PASSING clause is
optional and that multiple XML input documents may be given to it; we
make it mandatory and accept a single document only.  Second, we don't
currently support a default namespace to be specified.

This implementation relies on a new executor node based on a hardcoded
method table.  (Because the grammar is fixed, there is no extensibility
in the current approach; further constructs can be implemented on top of
this such as JSON_TABLE, but they require changes to core code.)

Author: Pavel Stehule, Álvaro Herrera
Extensively reviewed by: Craig Ringer
Discussion: https://postgr.es/m/CAFj8pRAgfzMD-LoSmnMGybD0WsEznLHWap8DO79+-GTRAPR4qA@mail.gmail.com
2017-03-08 12:40:26 -03:00
Peter Eisentraut 788af6f854 Move atooid() definition to a central place 2017-03-01 11:55:28 -05:00
Robert Haas 5e6d8d2bbb Allow parallel workers to execute subplans.
This doesn't do anything to make Param nodes anything other than
parallel-restricted, so this only helps with uncorrelated subplans,
and it's not necessarily very cheap because each worker will run the
subplan separately (just as a Hash Join will build a separate copy of
the hash table in each participating process), but it's a first step
toward supporting cases that are more likely to help in practice, and
is occasionally useful on its own.

Amit Kapila, reviewed and tested by Rafia Sabih, Dilip Kumar, and
me.

Discussion: http://postgr.es/m/CAA4eK1+e8Z45D2n+rnDMDYsVEb5iW7jqaCH_tvPMYau=1Rru9w@mail.gmail.com
2017-02-14 18:16:03 -05:00
Andres Freund 69f4b9c85f Move targetlist SRF handling from expression evaluation to new executor node.
Evaluation of set returning functions (SRFs_ in the targetlist (like SELECT
generate_series(1,5)) so far was done in the expression evaluation (i.e.
ExecEvalExpr()) and projection (i.e. ExecProject/ExecTargetList) code.

This meant that most executor nodes performing projection, and most
expression evaluation functions, had to deal with the possibility that an
evaluated expression could return a set of return values.

That's bad because it leads to repeated code in a lot of places. It also,
and that's my (Andres's) motivation, made it a lot harder to implement a
more efficient way of doing expression evaluation.

To fix this, introduce a new executor node (ProjectSet) that can evaluate
targetlists containing one or more SRFs. To avoid the complexity of the old
way of handling nested expressions returning sets (e.g. having to pass up
ExprDoneCond, and dealing with arguments to functions returning sets etc.),
those SRFs can only be at the top level of the node's targetlist.  The
planner makes sure (via split_pathtarget_at_srfs()) that SRF evaluation is
only necessary in ProjectSet nodes and that SRFs are only present at the
top level of the node's targetlist. If there are nested SRFs the planner
creates multiple stacked ProjectSet nodes.  The ProjectSet nodes always get
input from an underlying node.

We also discussed and prototyped evaluating targetlist SRFs using ROWS
FROM(), but that turned out to be more complicated than we'd hoped.

While moving SRF evaluation to ProjectSet would allow to retain the old
"least common multiple" behavior when multiple SRFs are present in one
targetlist (i.e.  continue returning rows until all SRFs are at the end of
their input at the same time), we decided to instead only return rows till
all SRFs are exhausted, returning NULL for already exhausted ones.  We
deemed the previous behavior to be too confusing, unexpected and actually
not particularly useful.

As a side effect, the previously prohibited case of multiple set returning
arguments to a function, is now allowed. Not because it's particularly
desirable, but because it ends up working and there seems to be no argument
for adding code to prohibit it.

Currently the behavior for COALESCE and CASE containing SRFs has changed,
returning multiple rows from the expression, even when the SRF containing
"arm" of the expression is not evaluated. That's because the SRFs are
evaluated in a separate ProjectSet node.  As that's quite confusing, we're
likely to instead prohibit SRFs in those places.  But that's still being
discussed, and the code would reside in places not touched here, so that's
a task for later.

There's a lot of, now superfluous, code dealing with set return expressions
around. But as the changes to get rid of those are verbose largely boring,
it seems better for readability to keep the cleanup as a separate commit.

Author: Tom Lane and Andres Freund
Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
2017-01-18 13:40:27 -08:00
Tom Lane ab1f0c8225 Change representation of statement lists, and add statement location info.
This patch makes several changes that improve the consistency of
representation of lists of statements.  It's always been the case
that the output of parse analysis is a list of Query nodes, whatever
the types of the individual statements in the list.  This patch brings
similar consistency to the outputs of raw parsing and planning steps:

* The output of raw parsing is now always a list of RawStmt nodes;
the statement-type-dependent nodes are one level down from that.

* The output of pg_plan_queries() is now always a list of PlannedStmt
nodes, even for utility statements.  In the case of a utility statement,
"planning" just consists of wrapping a CMD_UTILITY PlannedStmt around
the utility node.  This list representation is now used in Portal and
CachedPlan plan lists, replacing the former convention of intermixing
PlannedStmts with bare utility-statement nodes.

Now, every list of statements has a consistent head-node type depending
on how far along it is in processing.  This allows changing many places
that formerly used generic "Node *" pointers to use a more specific
pointer type, thus reducing the number of IsA() tests and casts needed,
as well as improving code clarity.

Also, the post-parse-analysis representation of DECLARE CURSOR is changed
so that it looks more like EXPLAIN, PREPARE, etc.  That is, the contained
SELECT remains a child of the DeclareCursorStmt rather than getting flipped
around to be the other way.  It's now true for both Query and PlannedStmt
that utilityStmt is non-null if and only if commandType is CMD_UTILITY.
That allows simplifying a lot of places that were testing both fields.
(I think some of those were just defensive programming, but in many places,
it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.)

Because PlannedStmt carries a canSetTag field, we're also able to get rid
of some ad-hoc rules about how to reconstruct canSetTag for a bare utility
statement; specifically, the assumption that a utility is canSetTag if and
only if it's the only one in its list.  While I see no near-term need for
relaxing that restriction, it's nice to get rid of the ad-hocery.

The API of ProcessUtility() is changed so that what it's passed is the
wrapper PlannedStmt not just the bare utility statement.  This will affect
all users of ProcessUtility_hook, but the changes are pretty trivial; see
the affected contrib modules for examples of the minimum change needed.
(Most compilers should give pointer-type-mismatch warnings for uncorrected
code.)

There's also a change in the API of ExplainOneQuery_hook, to pass through
cursorOptions instead of expecting hook functions to know what to pick.
This is needed because of the DECLARE CURSOR changes, but really should
have been done in 9.6; it's unlikely that any extant hook functions
know about using CURSOR_OPT_PARALLEL_OK.

Finally, teach gram.y to save statement boundary locations in RawStmt
nodes, and pass those through to Query and PlannedStmt nodes.  This allows
more intelligent handling of cases where a source query string contains
multiple statements.  This patch doesn't actually do anything with the
information, but a follow-on patch will.  (Passing this information through
cleanly is the true motivation for these changes; while I think this is all
good cleanup, it's unlikely we'd have bothered without this end goal.)

catversion bump because addition of location fields to struct Query
affects stored rules.

This patch is by me, but it owes a good deal to Fabien Coelho who did
a lot of preliminary work on the problem, and also reviewed the patch.

Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-14 16:02:35 -05:00
Bruce Momjian 1d25779284 Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
Tom Lane fe591f8bf6 Replace enum InhOption with simple boolean.
Now that it has only INH_NO and INH_YES values, it's just weird that
it's not a plain bool, so make it that way.

Also rename RangeVar.inhOpt to "inh", to be like RangeTblEntry.inh.
My recollection is that we gave it a different name specifically because
it had a different representation than the derived bool value, but it
no longer does.  And this is a good forcing function to be sure we
catch any places that are affected by the change.

Bump catversion because of possible effect on stored RangeVar nodes.
I'm not exactly convinced that we ever store RangeVar on disk, but
we have a readfuncs function for it, so be cautious.  (If we do do so,
then commit e13486eba was in error not to bump catversion.)

Follow-on to commit e13486eba.

Discussion: http://postgr.es/m/CA+TgmoYe+EG7LdYX6pkcNxr4ygkP4+A=jm9o-CPXyOvRiCNwaQ@mail.gmail.com
2016-12-23 13:35:18 -05:00
Tom Lane 0b78106cd4 Fix reporting of column typmods for multi-row VALUES constructs.
expandRTE() and get_rte_attribute_type() reported the exprType() and
exprTypmod() values of the expressions in the first row of the VALUES as
being the column type/typmod returned by the VALUES RTE.  That's fine for
the data type, since we coerce all expressions in a column to have the same
common type.  But we don't coerce them to have a common typmod, so it was
possible for rows after the first one to return values that violate the
claimed column typmod.  This leads to the incorrect result seen in bug
#14448 from Hassan Mahmood, as well as some other corner-case misbehaviors.

The desired behavior is the same as we use in other type-unification
cases: report the common typmod if there is one, but otherwise return -1
indicating no particular constraint.  It's cheap for transformValuesClause
to determine the common typmod while transforming a multi-row VALUES, but
it'd be less cheap for expandRTE() and get_rte_attribute_type() to
re-determine that info every time they're asked --- possibly a lot less
cheap, if the VALUES has many rows.  Therefore, the best fix is to record
the common typmods explicitly in a list in the VALUES RTE, as we were
already doing for column collations.  This looks quite a bit like what
we're doing for CTE RTEs, so we can save a little bit of space and code by
unifying the representation for those two RTE types.  They both now share
coltypes/coltypmods/colcollations fields.  (At some point it might seem
desirable to populate those fields for all RTE types; but right now it
looks like constructing them for other RTE types would add more code and
cycles than it would save.)

The RTE change requires a catversion bump, so this fix is only usable
in HEAD.  If we fix this at all in the back branches, the patch will
need to look quite different.

Report: https://postgr.es/m/20161205143037.4377.60754@wrigleys.postgresql.org
Discussion: https://postgr.es/m/27429.1480968538@sss.pgh.pa.us
2016-12-08 11:40:02 -05:00
Robert Haas f0e44751d7 Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own.  The children are called
partitions and contain all of the actual data.  Each partition has an
implicit partitioning constraint.  Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed.  Partitions
can't have extra columns and may not allow nulls unless the parent
does.  Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.

Currently, tables can be range-partitioned or list-partitioned.  List
partitioning is limited to a single column, but range partitioning can
involve multiple columns.  A partitioning "column" can be an
expression.

Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations.  The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.

Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others.  Minor revisions by me.
2016-12-07 13:17:55 -05:00
Tom Lane a4c35ea1c2 Improve parser's and planner's handling of set-returning functions.
Teach the parser to reject misplaced set-returning functions during parse
analysis using p_expr_kind, in much the same way as we do for aggregates
and window functions (cf commit eaccfded9).  While this isn't complete
(it misses nesting-based restrictions), it's much better than the previous
error reporting for such cases, and it allows elimination of assorted
ad-hoc expression_returns_set() error checks.  We could add nesting checks
later if it seems important to catch all cases at parse time.

There is one case the parser will now throw error for although previous
versions allowed it, which is SRFs in the tlist of an UPDATE.  That never
behaved sensibly (since it's ill-defined which generated row should be
used to perform the update) and it's hard to see why it should not be
treated as an error.  It's a release-note-worthy change though.

Also, add a new Query field hasTargetSRFs reporting whether there are
any SRFs in the targetlist (including GROUP BY/ORDER BY expressions).
The parser can now set that basically for free during parse analysis,
and we can use it in a number of places to avoid expression_returns_set
searches.  (There will be more such checks soon.)  In some places, this
allows decontorting the logic since it's no longer expensive to check for
SRFs in the tlist --- so I made the checks parallel to the handling of
hasAggs/hasWindowFuncs wherever it seemed appropriate.

catversion bump because adding a Query field changes stored rules.

Andres Freund and Tom Lane

Discussion: <24639.1473782855@sss.pgh.pa.us>
2016-09-13 13:54:24 -04:00
Peter Eisentraut 49eb0fd097 Add location field to DefElem
Add a location field to the DefElem struct, used to parse many utility
commands.  Update various error messages to supply error position
information.

To propogate the error position information in a more systematic way,
create a ParseState in standard_ProcessUtility() and pass that to
interested functions implementing the utility commands.  This seems
better than passing the query string and then reassembling a parse state
ad hoc, which violates the encapsulation of the ParseState type.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
2016-09-06 12:00:00 -04:00
Tom Lane 2c00fad286 Fix improper repetition of previous results from a hashed aggregate.
ExecReScanAgg's check for whether it could re-use a previously calculated
hashtable neglected the possibility that the Agg node might reference
PARAM_EXEC Params that are not referenced by its input plan node.  That's
okay if the Params are in upper tlist or qual expressions; but if one
appears in aggregate input expressions, then the hashtable contents need
to be recomputed when the Param's value changes.

To avoid unnecessary performance degradation in the case of a Param that
isn't within an aggregate input, add logic to the planner to determine
which Params are within aggregate inputs.  This requires a new field in
struct Agg, but fortunately we never write plans to disk, so this isn't
an initdb-forcing change.

Per report from Jeevan Chalke.  This has been broken since forever,
so back-patch to all supported branches.

Andrew Gierth, with minor adjustments by me

Report: <CAM2+6=VY8ykfLT5Q8vb9B6EbeBk-NGuLbT6seaQ+Fq4zXvrDcA@mail.gmail.com>
2016-08-24 14:38:12 -04:00
Tom Lane 0bb51aa967 Improve parsetree representation of special functions such as CURRENT_DATE.
We implement a dozen or so parameterless functions that the SQL standard
defines special syntax for.  Up to now, that was done by converting them
into more or less ad-hoc constructs such as "'now'::text::date".  That's
messy for multiple reasons: it exposes what should be implementation
details to users, and performance is worse than it needs to be in several
cases.  To improve matters, invent a new expression node type
SQLValueFunction that can represent any of these parameterless functions.

Bump catversion because this changes stored parsetrees for rules.

Discussion: <30058.1463091294@sss.pgh.pa.us>
2016-08-16 20:33:01 -04:00
Tom Lane 45639a0525 Avoid invalidating all foreign-join cached plans when user mappings change.
We must not push down a foreign join when the foreign tables involved
should be accessed under different user mappings.  Previously we tried
to enforce that rule literally during planning, but that meant that the
resulting plans were dependent on the current contents of the
pg_user_mapping catalog, and we had to blow away all cached plans
containing any remote join when anything at all changed in pg_user_mapping.
This could have been improved somewhat, but the fact that a syscache inval
callback has very limited info about what changed made it hard to do better
within that design.  Instead, let's change the planner to not consider user
mappings per se, but to allow a foreign join if both RTEs have the same
checkAsUser value.  If they do, then they necessarily will use the same
user mapping at runtime, and we don't need to know specifically which one
that is.  Post-plan-time changes in pg_user_mapping no longer require any
plan invalidation.

This rule does give up some optimization ability, to wit where two foreign
table references come from views with different owners or one's from a view
and one's directly in the query, but nonetheless the same user mapping
would have applied.  We'll sacrifice the first case, but to not regress
more than we have to in the second case, allow a foreign join involving
both zero and nonzero checkAsUser values if the nonzero one is the same as
the prevailing effective userID.  In that case, mark the plan as only
runnable by that userID.

The plancache code already had a notion of plans being userID-specific,
in order to support RLS.  It was a little confused though, in particular
lacking clarity of thought as to whether it was the rewritten query or just
the finished plan that's dependent on the userID.  Rearrange that code so
that it's clearer what depends on which, and so that the same logic applies
to both RLS-injected role dependency and foreign-join-injected role
dependency.

Note that this patch doesn't remove the other issue mentioned in the
original complaint, which is that while we'll reliably stop using a foreign
join if it's disallowed in a new context, we might fail to start using a
foreign join if it's now allowed, but we previously created a generic
cached plan that didn't use one.  It was agreed that the chance of winning
that way was not high enough to justify the much larger number of plan
invalidations that would have to occur if we tried to cause it to happen.

In passing, clean up randomly-varying spelling of EXPLAIN commands in
postgres_fdw.sql, and fix a COSTS ON example that had been allowed to
leak into the committed tests.

This reverts most of commits fbe5a3fb7 and 5d4171d1c, which were the
previous attempt at ensuring we wouldn't push down foreign joins that
span permissions contexts.

Etsuro Fujita and Tom Lane

Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>
2016-07-15 17:23:02 -04:00
Robert Haas 3edcdbcc4d Fix typo in comment.
Amit Langote
2016-07-07 16:13:37 -04:00
Tom Lane 19e972d558 Rethink node-level representation of partial-aggregation modes.
The original coding had three separate booleans representing partial
aggregation behavior, which was confusing, unreadable, and error-prone,
not least because the booleans weren't always listed in the same order.
It was also inadequate for the allegedly-desirable future extension to
support intermediate partial aggregation, because we'd need separate
markers for serialization and deserialization in such a case.

Merge these bools into an enum "AggSplit" to provide symbolic names for
the supported operating modes (and document what those are).  By assigning
the values of the enum constants carefully, we can treat AggSplit values
as options bitmasks so that tests of what to do aren't noticeably more
expensive than before.

While at it, get rid of Aggref.aggoutputtype.  That's not needed since
commit 59a3795c2 got rid of setrefs.c's special-purpose Aggref comparison
code, and it likewise seemed more confusing than helpful.

Assorted comment cleanup as well (there's still more that I want to do
in that line).

catversion bump for change in Aggref node contents.  Should be the last
one for partial-aggregation changes.

Discussion: <29309.1466699160@sss.pgh.pa.us>
2016-06-26 14:33:38 -04:00
Tom Lane 915b703e16 Fix handling of argument and result datatypes for partial aggregation.
When doing partial aggregation, the args list of the upper (combining)
Aggref node is replaced by a Var representing the output of the partial
aggregation steps, which has either the aggregate's transition data type
or a serialized representation of that.  However, nodeAgg.c blindly
continued to use the args list as an indication of the user-level argument
types.  This broke resolution of polymorphic transition datatypes at
executor startup (though it accidentally failed to fail for the ANYARRAY
case, which is likely the only one anyone had tested).  Moreover, the
constructed FuncExpr passed to the finalfunc contained completely wrong
information, which would have led to bogus answers or crashes for any case
where the finalfunc examined that information (which is only likely to be
with polymorphic aggregates using a non-polymorphic transition type).

As an independent bug, apply_partialaggref_adjustment neglected to resolve
a polymorphic transition datatype before assigning it as the output type
of the lower-level Aggref node.  This again accidentally failed to fail
for ANYARRAY but would be unlikely to work in other cases.

To fix the first problem, record the user-level argument types in a
separate OID-list field of Aggref, and look to that rather than the args
list when asking what the argument types were.  (It turns out to be
convenient to include any "direct" arguments in this list too, although
those are not currently subject to being overwritten.)

Rather than adding yet another resolve_aggregate_transtype() call to fix
the second problem, add an aggtranstype field to Aggref, and store the
resolved transition type OID there when the planner first computes it.
(By doing this in the planner and not the parser, we can allow the
aggregate's transition type to change from time to time, although no DDL
support yet exists for that.)  This saves nothing of consequence for
simple non-polymorphic aggregates, but for polymorphic transition types
we save a catalog lookup during executor startup as well as several
planner lookups that are new in 9.6 due to parallel query planning.

In passing, fix an error that was introduced into count_agg_clauses_walker
some time ago: it was applying exprTypmod() to something that wasn't an
expression node at all, but a TargetEntry.  exprTypmod silently returned
-1 so that there was not an obvious failure, but this broke the intended
sensitivity of aggregate space consumption estimates to the typmod of
varchar and similar data types.  This part needs to be back-patched.

Catversion bump due to change of stored Aggref nodes.

Discussion: <8229.1466109074@sss.pgh.pa.us>
2016-06-17 21:44:37 -04:00
Robert Haas 4bc424b968 pgindent run for 9.6 2016-06-09 18:02:36 -04:00
Robert Haas 59eb551279 Fix EXPLAIN VERBOSE output for parallel aggregate.
The way that PartialAggregate and FinalizeAggregate plan nodes were
displaying output columns before was bogus.  Now, FinalizeAggregate
produces the same outputs as an Aggregate would have produced, while
PartialAggregate produces each of those outputs prefixed by the word
PARTIAL.

Discussion: 12585.1460737650@sss.pgh.pa.us

Patch by me, reviewed by David Rowley.
2016-04-27 07:37:40 -04:00
Andres Freund c1ddd2361f Expose more out/readfuncs support functions.
Previously bcac23d exposed a subset of support functions, namely the
ones Kaigai found useful. In
20160304193704.elq773pyg5fyl3mi@alap3.anarazel.de I mentioned that
there's some functions missing to use the facility in an external
project.

To avoid having to add functions piecemeal, add all the functions which
are used to define READ_* and WRITE_* macros; users of the extensible
node functionality are likely to need these. Additionally expose
outDatum(), which doesn't have it's own WRITE_ macro, as it needs
information from the embedding struct.

Discussion: 20160304193704.elq773pyg5fyl3mi@alap3.anarazel.de
2016-04-08 14:26:36 -07:00
Robert Haas 5fe5a2cee9 Allow aggregate transition states to be serialized and deserialized.
This is necessary infrastructure for supporting parallel aggregation
for aggregates whose transition type is "internal".  Such values
can't be passed between cooperating processes, because they are
just pointers.

David Rowley, reviewed by Tomas Vondra and by me.
2016-03-29 15:04:05 -04:00
Robert Haas f9143d102f Rework custom scans to work more like the new extensible node stuff.
Per discussion, the new extensible node framework is thought to be
better designed than the custom path/scan/scanstate stuff we added
in PostgreSQL 9.5.  Rework the latter to be more like the former.

This is not backward-compatible, but we generally don't promise that
for C APIs, and there probably aren't many people using this yet
anyway.

KaiGai Kohei, reviewed by Petr Jelinek and me.  Some further
cosmetic changes by me.
2016-03-29 11:28:04 -04:00
Robert Haas e06a38965b Support parallel aggregation.
Parallel workers can now partially aggregate the data and pass the
transition values back to the leader, which can combine the partial
results to produce the final answer.

David Rowley, based on earlier work by Haribabu Kommi.  Reviewed by
Álvaro Herrera, Tomas Vondra, Amit Kapila, James Sewell, and me.
2016-03-21 09:30:18 -04:00
Tom Lane 07aed46a6b Fix missed update in _readForeignScan().
Blatant fail in 0bf3ae88af.
Caught by buildfarm member mandrill.
2016-03-19 01:20:34 -04:00
Robert Haas 0bf3ae88af Directly modify foreign tables.
postgres_fdw can now sent an UPDATE or DELETE statement directly to
the foreign server in simple cases, rather than sending a SELECT FOR
UPDATE statement and then updating or deleting rows one-by-one.

Etsuro Fujita, reviewed by Rushabh Lathia, Shigeru Hanada, Kyotaro
Horiguchi, Albe Laurenz, Thom Brown, and me.
2016-03-18 13:55:52 -04:00
Tom Lane 3fc6e2d7f5 Make the upper part of the planner work by generating and comparing Paths.
I've been saying we needed to do this for more than five years, and here it
finally is.  This patch removes the ever-growing tangle of spaghetti logic
that grouping_planner() used to use to try to identify the best plan for
post-scan/join query steps.  Now, there is (nearly) independent
consideration of each execution step, and entirely separate construction of
Paths to represent each of the possible ways to do that step.  We choose
the best Path or set of Paths using the same add_path() logic that's been
used inside query_planner() for years.

In addition, this patch removes the old restriction that subquery_planner()
could return only a single Plan.  It now returns a RelOptInfo containing a
set of Paths, just as query_planner() does, and the parent query level can
use each of those Paths as the basis of a SubqueryScanPath at its level.
This allows finding some optimizations that we missed before, wherein a
subquery was capable of returning presorted data and thereby avoiding a
sort in the parent level, making the overall cost cheaper even though
delivering sorted output was not the cheapest plan for the subquery in
isolation.  (A couple of regression test outputs change in consequence of
that.  However, there is very little change in visible planner behavior
overall, because the point of this patch is not to get immediate planning
benefits but to create the infrastructure for future improvements.)

There is a great deal left to do here.  This patch unblocks a lot of
planner work that was basically impractical in the old code structure,
such as allowing FDWs to implement remote aggregation, or rewriting
plan_set_operations() to allow consideration of multiple implementation
orders for set operations.  (The latter will likely require a full
rewrite of plan_set_operations(); what I've done here is only to fix it
to return Paths not Plans.)  I have also left unfinished some localized
refactoring in createplan.c and planner.c, because it was not necessary
to get this patch to a working state.

Thanks to Robert Haas, David Rowley, and Amit Kapila for review.
2016-03-07 15:58:22 -05:00
Robert Haas bcac23de73 Introduce extensible node types.
An extensible node is always tagged T_Extensible, but the extnodename
field identifies it more specifically; it may also include arbitrary
private data.  Extensible nodes can be copied, tested for equality,
serialized, and deserialized, but the core system doesn't know
anything about them otherwise.  Some extensions may find it useful to
include these nodes in fdw_private or custom_private lists in lieu of
arm-wrestling their data into a format that the core code can
understand.

Along the way, so as not to burden the authors of such extensible
node types too much, expose the functions for writing serialized
tokens, and for serializing and deserializing bitmapsets.

KaiGai Kohei, per a design suggested by me.  Reviewed by Andres Freund
and by me, and further edited by me.
2016-02-12 09:38:11 -05:00
Robert Haas 7c944bd903 Introduce a new GUC force_parallel_mode for testing purposes.
When force_parallel_mode = true, we enable the parallel mode restrictions
for all queries for which this is believed to be safe.  For the subset of
those queries believed to be safe to run entirely within a worker, we spin
up a worker and run the query there instead of running it in the
original process.  When force_parallel_mode = regress, make additional
changes to allow the regression tests to run cleanly even though parallel
workers have been injected under the hood.

Taken together, this facilitates both better user testing and better
regression testing of the parallelism code.

Robert Haas, with help from Amit Kapila and Rushabh Lathia.
2016-02-07 11:41:33 -05:00
Robert Haas fbe5a3fb73 Only try to push down foreign joins if the user mapping OIDs match.
Previously, the foreign join pushdown infrastructure left the question
of security entirely up to individual FDWs, but it would be easy for
a foreign data wrapper to inadvertently open up subtle security holes
that way.  So, make it the core code's job to determine which user
mapping OID is relevant, and don't attempt join pushdown unless it's
the same for all relevant relations.

Per a suggestion from Tom Lane.  Shigeru Hanada and Ashutosh Bapat,
reviewed by Etsuro Fujita and KaiGai Kohei, with some further
changes by me.
2016-01-28 14:05:36 -05:00
Robert Haas a7de3dc5c3 Support multi-stage aggregation.
Aggregate nodes now have two new modes: a "partial" mode where they
output the unfinalized transition state, and a "finalize" mode where
they accept unfinalized transition states rather than individual
values as input.

These new modes are not used anywhere yet, but they will be necessary
for parallel aggregation.  The infrastructure also figures to be
useful for cases where we want to aggregate local data and remote
data via the FDW interface, and want to bring back partial aggregates
from the remote side that can then be combined with locally generated
partial aggregates to produce the final value.  It may also be useful
even when neither FDWs nor parallelism are in play, as explained in
the comments in nodeAgg.c.

David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki
Linnakangas, Haribabu Kommi, and me.
2016-01-20 13:46:50 -05:00