Minimally fix partial aggregation for aggregates that don't have one argument.

For partial aggregation combine steps,
AggStatePerTrans->numTransInputs was set to the transition function's
number of inputs, rather than the combine function's number of
inputs (always 1).

That lead to partial aggregates with strict combine functions to
wrongly check for NOT NULL input as required by strictness. When the
aggregate wasn't exactly passed one argument, the strictness check was
either omitted (in the 0 args case) or too many arguments were
checked. In the latter case we'd read beyond the end of
FunctionCallInfoData->args (only in master).

AggStatePerTrans->numTransInputs actually has been wrong since since
9.6, where partial aggregates were added. But it turns out to not be
an active problem in 9.6 and 10, because numTransInputs wasn't used at
all for combine functions: Before c253b722f6 there simply was no NULL
check for the input to strict trans functions, and after that the
check was simply hardcoded for the right offset in fcinfo, as it's
done by code specific to combine functions.

In bf6c614a2f (11) the strictness check was generalized, with common
code doing the strictness checks for both plain and combine transition
functions, based on numTransInputs. For combine functions this lead to
not emitting an expression step to check for strict input in the 0
arguments case, and in the > 1 arguments case, we'd check too many
arguments.Due to the fact that the relevant fcinfo->isnull[2..] was
always zero-initialized (more or less by accident, by being part of
the AggStatePerTrans struct, which is palloc0'ed), there was no
observable damage in the latter case before a9c35cf85c, we just
checked too many array elements.

Due to the changes in a9c35cf85c, > 1 argument bug became visible,
because these days fcinfo is a) dynamically allocated without being
zeroed b) exactly the length required for the number of specified
arguments (hardcoded to 2 in this case).

This commit only contains a fairly minimal fix, setting numTransInputs
to a hardcoded 1 when building a pertrans for a combine function. It
seems likely that we'll want to clean this up further (e.g. the
arguments build_pertrans_for_aggref() aren't particularly meaningful
for combine functions). But the wrap date for 12 beta1 is coming up
fast, so it seems good to have a minimal fix in place.

Backpatch to 11. While AggStatePerTrans->numTransInputs was set
wrongly before that, the value was not used for combine functions.

Reported-By: Rajkumar Raghuwanshi
Diagnosed-By: Kyotaro Horiguchi, Jeevan Chalke, Andres Freund, David Rowley
Author: David Rowley, Kyotaro Horiguchi, Andres Freund
Discussion: https://postgr.es/m/CAKcux6=uZEyWyLw0N7HtR9OBc-sWEFeByEZC7t-KDf15FKxVew@mail.gmail.com
This commit is contained in:
Andres Freund 2019-05-19 18:01:06 -07:00
parent 03310dbea9
commit 2657283256
3 changed files with 43 additions and 23 deletions

View File

@ -2911,12 +2911,6 @@ build_pertrans_for_aggref(AggStatePerTrans pertrans,
pertrans->aggtranstype = aggtranstype;
/* Detect how many arguments to pass to the transfn */
if (AGGKIND_IS_ORDERED_SET(aggref->aggkind))
pertrans->numTransInputs = numInputs;
else
pertrans->numTransInputs = numArguments;
/*
* When combining states, we have no use at all for the aggregate
* function's transfn. Instead we use the combinefn. In this case, the
@ -2926,6 +2920,17 @@ build_pertrans_for_aggref(AggStatePerTrans pertrans,
if (DO_AGGSPLIT_COMBINE(aggstate->aggsplit))
{
Expr *combinefnexpr;
size_t numTransArgs;
/*
* When combining there's only one input, the to-be-combined added
* transition value from below (this node's transition value is
* counted separately).
*/
pertrans->numTransInputs = 1;
/* account for the current transition state */
numTransArgs = pertrans->numTransInputs + 1;
build_aggregate_combinefn_expr(aggtranstype,
aggref->inputcollid,
@ -2938,7 +2943,7 @@ build_pertrans_for_aggref(AggStatePerTrans pertrans,
(FunctionCallInfo) palloc(SizeForFunctionCallInfo(2));
InitFunctionCallInfoData(*pertrans->transfn_fcinfo,
&pertrans->transfn,
2,
numTransArgs,
pertrans->aggCollation,
(void *) aggstate, NULL);
@ -2956,7 +2961,16 @@ build_pertrans_for_aggref(AggStatePerTrans pertrans,
else
{
Expr *transfnexpr;
size_t numInputs = pertrans->numTransInputs + 1;
size_t numTransArgs;
/* Detect how many arguments to pass to the transfn */
if (AGGKIND_IS_ORDERED_SET(aggref->aggkind))
pertrans->numTransInputs = numInputs;
else
pertrans->numTransInputs = numArguments;
/* account for the current transition state */
numTransArgs = pertrans->numTransInputs + 1;
/*
* Set up infrastructure for calling the transfn. Note that invtrans
@ -2976,10 +2990,10 @@ build_pertrans_for_aggref(AggStatePerTrans pertrans,
fmgr_info_set_expr((Node *) transfnexpr, &pertrans->transfn);
pertrans->transfn_fcinfo =
(FunctionCallInfo) palloc(SizeForFunctionCallInfo(numInputs));
(FunctionCallInfo) palloc(SizeForFunctionCallInfo(numTransArgs));
InitFunctionCallInfoData(*pertrans->transfn_fcinfo,
&pertrans->transfn,
numInputs,
numTransArgs,
pertrans->aggCollation,
(void *) aggstate, NULL);

View File

@ -2204,21 +2204,26 @@ SET max_parallel_workers_per_gather = 4;
SET enable_indexonlyscan = off;
-- variance(int4) covers numeric_poly_combine
-- sum(int8) covers int8_avg_combine
EXPLAIN (COSTS OFF)
SELECT variance(unique1::int4), sum(unique1::int8) FROM tenk1;
QUERY PLAN
----------------------------------------------
-- regr_count(float8, float8) covers int8inc_float8_float8 and aggregates with > 1 arg
EXPLAIN (COSTS OFF, VERBOSE)
SELECT variance(unique1::int4), sum(unique1::int8), regr_count(unique1::float8, unique1::float8) FROM tenk1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate
Output: variance(unique1), sum((unique1)::bigint), regr_count((unique1)::double precision, (unique1)::double precision)
-> Gather
Output: (PARTIAL variance(unique1)), (PARTIAL sum((unique1)::bigint)), (PARTIAL regr_count((unique1)::double precision, (unique1)::double precision))
Workers Planned: 4
-> Partial Aggregate
-> Parallel Seq Scan on tenk1
(5 rows)
Output: PARTIAL variance(unique1), PARTIAL sum((unique1)::bigint), PARTIAL regr_count((unique1)::double precision, (unique1)::double precision)
-> Parallel Seq Scan on public.tenk1
Output: unique1, unique2, two, four, ten, twenty, hundred, thousand, twothousand, fivethous, tenthous, odd, even, stringu1, stringu2, string4
(9 rows)
SELECT variance(unique1::int4), sum(unique1::int8) FROM tenk1;
variance | sum
----------------------+----------
8334166.666666666667 | 49995000
SELECT variance(unique1::int4), sum(unique1::int8), regr_count(unique1::float8, unique1::float8) FROM tenk1;
variance | sum | regr_count
----------------------+----------+------------
8334166.666666666667 | 49995000 | 10000
(1 row)
ROLLBACK;

View File

@ -963,10 +963,11 @@ SET enable_indexonlyscan = off;
-- variance(int4) covers numeric_poly_combine
-- sum(int8) covers int8_avg_combine
EXPLAIN (COSTS OFF)
SELECT variance(unique1::int4), sum(unique1::int8) FROM tenk1;
-- regr_count(float8, float8) covers int8inc_float8_float8 and aggregates with > 1 arg
EXPLAIN (COSTS OFF, VERBOSE)
SELECT variance(unique1::int4), sum(unique1::int8), regr_count(unique1::float8, unique1::float8) FROM tenk1;
SELECT variance(unique1::int4), sum(unique1::int8) FROM tenk1;
SELECT variance(unique1::int4), sum(unique1::int8), regr_count(unique1::float8, unique1::float8) FROM tenk1;
ROLLBACK;