2003-04-09 01:20:04 +02:00
/*-------------------------------------------------------------------------
 *
 * array_userfuncs.c
 *	  Misc user-visible array support functions
 *
2021-01-02 19:06:25 +01:00
 * Copyright (c) 2003-2021, PostgreSQL Global Development Group
2003-04-09 01:20:04 +02:00
 *
 * IDENTIFICATION
2010-09-20 22:08:53 +02:00
 *	  src/backend/utils/adt/array_userfuncs.c
2003-04-09 01:20:04 +02:00
 *
 *-------------------------------------------------------------------------
 */
#include "postgres.h"

2015-03-18 20:01:34 +01:00
#include "catalog/pg_type.h"
2017-12-13 01:32:31 +01:00
#include "common/int.h"
2003-04-09 01:20:04 +02:00
#include "utils/array.h"
2003-07-27 06:53:12 +02:00
#include "utils/builtins.h"
2003-04-09 01:20:04 +02:00
#include "utils/lsyscache.h"
2015-03-18 20:01:34 +01:00
#include "utils/typcache.h"


2015-03-30 21:13:21 +02:00
static Datum array_position_common(FunctionCallInfo fcinfo);
2003-04-09 01:20:04 +02:00

2005-11-17 23:14:56 +01:00

2015-02-19 02:53:14 +01:00
/*
 * fetch_array_arg_replace_nulls
 *
Support "expanded" objects, particularly arrays, for better performance.

This patch introduces the ability for complex datatypes to have an
in-memory representation that is different from their on-disk format.
On-disk formats are typically optimized for minimal size, and in any case
they can't contain pointers, so they are often not well-suited for
computation. Now a datatype can invent an "expanded" in-memory format
that is better suited for its operations, and then pass that around among
the C functions that operate on the datatype. There are also provisions
(rudimentary as yet) to allow an expanded object to be modified in-place
under suitable conditions, so that operations like assignment to an element
of an array need not involve copying the entire array.

The initial application for this feature is arrays, but it is not hard
to foresee using it for other container types like JSON, XML and hstore.
I have hopes that it will be useful to PostGIS as well.

In this initial implementation, a few heuristics have been hard-wired
into plpgsql to improve performance for arrays that are stored in
plpgsql variables. We would like to generalize those hacks so that
other datatypes can obtain similar improvements, but figuring out some
appropriate APIs is left as a task for future work. (The heuristics
themselves are probably not optimal yet, either, as they sometimes
force expansion of arrays that would be better left alone.)

Preliminary performance testing shows impressive speed gains for plpgsql
functions that do element-by-element access or update of large arrays.
There are other cases that get a little slower, as a result of added array
format conversions; but we can hope to improve anything that's annoyingly
bad. In any case most applications should see a net win.

Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
 * Fetch an array-valued argument in expanded form; if it's null, construct an
 * empty array value of the proper data type. Also cache basic element type
 * information in fn_extra.
 *
 * Caution: if the input is a read/write pointer, this returns the input
 * argument; so callers must be sure that their changes are "safe", that is
 * they cannot leave the array in a corrupt state.
Improve speed of aggregates that use array_append as transition function.

In the previous coding, if an aggregate's transition function returned an
expanded array, nodeAgg.c and nodeWindowAgg.c would always copy it and thus
force it into the flat representation. This led to ping-ponging between
flat and expanded formats, which costs a lot. For an aggregate using
array_append as transition function, I measured about a 15X slowdown
compared to the pre-9.5 code, when working on simple int[] arrays.
Of course, the old code was already O(N^2) in this usage due to copying
flat arrays all the time, but it wasn't quite this inefficient.

To fix, teach nodeAgg.c and nodeWindowAgg.c to allow expanded transition
values without copying, so long as the transition function takes care to
return the transition value already properly parented under the aggcontext.
That puts a bit of extra responsibility on the transition function, but
doing it this way allows us to not need any extra logic in the fast path
of advance_transition_function (ie, with a pass-by-value transition value,
or with a modified-in-place pass-by-reference value). We already know
that that's a hot spot so I'm loath to add any cycles at all there. Also,
while only array_append currently knows how to follow this convention,
this solution allows other transition functions to opt-in without needing
to have a whitelist in the core aggregation code.

(The reason we would need a whitelist is that currently, if you pass a
R/W expanded-object pointer to an arbitrary function, it's allowed to do
anything with it including deleting it; that breaks the core agg code's
assumption that it should free discarded values. Returning a value under
aggcontext is the transition function's signal that it knows it is an
aggregate transition function and will play nice. Possibly the API rules
for expanded objects should be refined, but that would not be a
back-patchable change.)

With this fix, an aggregate using array_append is no longer O(N^2), so it's
much faster than pre-9.5 code rather than much slower. It's still a bit
slower than the bespoke infrastructure for array_agg, but the differential
seems to be only about 10%-20% rather than orders of magnitude.

Discussion: <6315.1477677885@sss.pgh.pa.us>
2016-10-30 17:27:41 +01:00
 *
 * If we're being called as an aggregate function, make sure any newly-made
 * expanded array is allocated in the aggregate state context, so as to save
 * copying operations.
2015-02-19 02:53:14 +01:00
 */
2015-05-14 18:08:40 +02:00
static ExpandedArrayHeader *
2015-02-19 02:53:14 +01:00
fetch_array_arg_replace_nulls(FunctionCallInfo fcinfo, int argno)
{
2015-05-14 18:08:40 +02:00
	ExpandedArrayHeader *eah;
2015-02-25 20:19:13 +01:00
	Oid			element_type;
2015-02-19 02:53:14 +01:00
	ArrayMetaState *my_extra;
2016-10-30 17:27:41 +01:00
	MemoryContext resultcxt;
2015-02-19 02:53:14 +01:00

2015-05-14 18:08:40 +02:00
	/* If first time through, create datatype cache struct */
	my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
	if (my_extra == NULL)
	{
		my_extra = (ArrayMetaState *)
			MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
							   sizeof(ArrayMetaState));
		my_extra->element_type = InvalidOid;
		fcinfo->flinfo->fn_extra = my_extra;
	}

2016-10-30 17:27:41 +01:00
	/* Figure out which context we want the result in */
	if (!AggCheckCallContext(fcinfo, &resultcxt))
		resultcxt = CurrentMemoryContext;

2015-05-14 18:08:40 +02:00
	/* Now collect the array value */
2015-02-25 20:19:13 +01:00
	if (!PG_ARGISNULL(argno))
	{
2016-10-30 17:27:41 +01:00
		MemoryContext oldcxt = MemoryContextSwitchTo(resultcxt);

2015-05-14 18:08:40 +02:00
		eah = PG_GETARG_EXPANDED_ARRAYX(argno, my_extra);
2016-10-30 17:27:41 +01:00
		MemoryContextSwitchTo(oldcxt);
2015-02-25 20:19:13 +01:00
	}
	else
2015-02-19 02:53:14 +01:00
	{
2015-02-25 20:19:13 +01:00
		/* We have to look up the array type and element type */
2015-02-19 02:53:14 +01:00
		Oid			arr_typeid = get_fn_expr_argtype(fcinfo->flinfo, argno);

		if (!OidIsValid(arr_typeid))
			ereport(ERROR,
					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
					 errmsg("could not determine input data type")));
		element_type = get_element_type(arr_typeid);
		if (!OidIsValid(element_type))
			ereport(ERROR,
					(errcode(ERRCODE_DATATYPE_MISMATCH),
					 errmsg("input data type is not an array")));

2015-05-14 18:08:40 +02:00
		eah = construct_empty_expanded_array(element_type,
2016-10-30 17:27:41 +01:00
											 resultcxt,
2015-05-14 18:08:40 +02:00
|
|
|
my_extra);
|
2015-02-19 02:53:14 +01:00
|
|
|
}
|
|
|
|
|
2015-05-14 18:08:40 +02:00
|
|
|
return eah;
|
2015-02-19 02:53:14 +01:00
|
|
|
}
|
|
|
|
|
2003-04-09 01:20:04 +02:00
|
|
|
/*-----------------------------------------------------------------------------
|
2015-02-19 02:53:14 +01:00
|
|
|
* array_append :
|
|
|
|
* push an element onto the end of a one-dimensional array
|
2003-04-09 01:20:04 +02:00
|
|
|
*----------------------------------------------------------------------------
|
|
|
|
*/
|
|
|
|
Datum
|
2015-02-19 02:53:14 +01:00
|
|
|
array_append(PG_FUNCTION_ARGS)
|
2003-04-09 01:20:04 +02:00
|
|
|
{
|
2015-05-14 18:08:40 +02:00
|
|
|
ExpandedArrayHeader *eah;
|
2003-04-09 01:20:04 +02:00
|
|
|
Datum newelem;
|
2005-11-17 23:14:56 +01:00
|
|
|
bool isNull;
|
2015-05-14 18:08:40 +02:00
|
|
|
Datum result;
|
2003-04-09 01:20:04 +02:00
|
|
|
int *dimv,
|
|
|
|
*lb;
|
|
|
|
int indx;
|
2003-06-27 02:33:26 +02:00
|
|
|
ArrayMetaState *my_extra;
|
2003-04-09 01:20:04 +02:00
|
|
|
|
2015-05-14 18:08:40 +02:00
|
|
|
eah = fetch_array_arg_replace_nulls(fcinfo, 0);
|
2015-02-19 02:53:14 +01:00
|
|
|
isNull = PG_ARGISNULL(1);
|
|
|
|
if (isNull)
|
|
|
|
newelem = (Datum) 0;
|
|
|
|
else
|
|
|
|
newelem = PG_GETARG_DATUM(1);
|
2003-04-09 01:20:04 +02:00
|
|
|
|
2015-05-14 18:08:40 +02:00
|
|
|
if (eah->ndims == 1)
|
2003-04-09 01:20:04 +02:00
|
|
|
{
|
2015-02-19 02:53:14 +01:00
|
|
|
/* append newelem */
|
2015-05-14 18:08:40 +02:00
|
|
|
lb = eah->lbound;
|
|
|
|
dimv = eah->dims;
|
2015-02-19 02:53:14 +01:00
|
|
|
|
2017-12-13 01:32:31 +01:00
|
|
|
/* index of added elem is at lb[0] + (dimv[0] - 1) + 1 */
|
|
|
|
if (pg_add_s32_overflow(lb[0], dimv[0], &indx))
|
2015-02-19 02:53:14 +01:00
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
|
|
|
|
errmsg("integer out of range")));
|
2003-04-09 01:20:04 +02:00
|
|
|
}
|
2015-05-14 18:08:40 +02:00
|
|
|
else if (eah->ndims == 0)
|
2015-02-19 02:53:14 +01:00
|
|
|
indx = 1;
|
2003-04-09 01:20:04 +02:00
|
|
|
else
|
2003-07-27 06:53:12 +02:00
|
|
|
ereport(ERROR,
|
2015-02-19 02:53:14 +01:00
|
|
|
(errcode(ERRCODE_DATA_EXCEPTION),
|
|
|
|
errmsg("argument must be empty or one-dimensional array")));
|
|
|
|
|
|
|
|
/* Perform element insertion */
|
|
|
|
my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
|
|
|
|
|
2015-05-14 18:08:40 +02:00
|
|
|
result = array_set_element(EOHPGetRWDatum(&eah->hdr),
|
|
|
|
1, &indx, newelem, isNull,
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
-1, my_extra->typlen, my_extra->typbyval, my_extra->typalign);
|
2015-02-19 02:53:14 +01:00
|
|
|
|
2015-05-14 18:08:40 +02:00
|
|
|
PG_RETURN_DATUM(result);
|
2015-02-19 02:53:14 +01:00
|
|
|
}
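The subscript arithmetic in array_append can be tried outside the server. What follows is a minimal standalone sketch, not PostgreSQL code: `add_s32_overflow` is a hypothetical stand-in for `pg_add_s32_overflow` from common/int.h (written here by widening to 64 bits), and `append_index` and `append_index_demo` are illustrative names. It mirrors the three cases handled above: a one-dimensional array, an empty array, and anything else, which array_append rejects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for pg_add_s32_overflow(): stores a + b in *result
 * and returns false, or returns true if the sum does not fit in int32.
 */
static bool
add_s32_overflow(int32_t a, int32_t b, int32_t *result)
{
	int64_t		sum = (int64_t) a + (int64_t) b;

	if (sum > INT32_MAX || sum < INT32_MIN)
		return true;
	*result = (int32_t) sum;
	return false;
}

/*
 * Illustrative helper: compute the subscript at which an appended element
 * lands.  Returns false in the cases where array_append raises an error.
 */
static bool
append_index(int ndims, int32_t lbound, int32_t dim, int32_t *indx)
{
	if (ndims == 1)
		return !add_s32_overflow(lbound, dim, indx);	/* lb + (dim - 1) + 1 */
	if (ndims == 0)
	{
		*indx = 1;				/* empty array: first element is [1] */
		return true;
	}
	return false;				/* multidimensional input is rejected */
}

static void
append_index_demo(void)
{
	int32_t		indx;

	/* three elements, lower bound 1: new element goes at subscript 4 */
	assert(append_index(1, 1, 3, &indx) && indx == 4);
	/* empty array (zero dimensions): new element goes at subscript 1 */
	assert(append_index(0, 0, 0, &indx) && indx == 1);
	/* lower bound INT32_MAX with one element: subscript would overflow */
	assert(!append_index(1, INT32_MAX, 1, &indx));
	/* two-dimensional input is rejected */
	assert(!append_index(2, 1, 3, &indx));
}
```

The widening approach is chosen only to keep the sketch portable; the real helper may use compiler builtins where available.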
|
|
|
|
|
|
|
|
/*-----------------------------------------------------------------------------
|
|
|
|
* array_prepend :
|
|
|
|
* push an element onto the front of a one-dimensional array
|
|
|
|
*----------------------------------------------------------------------------
|
|
|
|
*/
|
|
|
|
Datum
|
|
|
|
array_prepend(PG_FUNCTION_ARGS)
|
|
|
|
{
|
2015-05-14 18:08:40 +02:00
|
|
|
ExpandedArrayHeader *eah;
|
2015-02-19 02:53:14 +01:00
|
|
|
Datum newelem;
|
|
|
|
bool isNull;
|
2015-05-14 18:08:40 +02:00
|
|
|
Datum result;
|
2015-02-20 06:47:28 +01:00
|
|
|
int *lb;
|
2015-02-19 02:53:14 +01:00
|
|
|
int indx;
|
2015-05-14 18:08:40 +02:00
|
|
|
int lb0;
|
2015-02-19 02:53:14 +01:00
|
|
|
ArrayMetaState *my_extra;
|
2003-04-09 01:20:04 +02:00
|
|
|
|
2015-02-19 02:53:14 +01:00
|
|
|
isNull = PG_ARGISNULL(0);
|
|
|
|
if (isNull)
|
|
|
|
newelem = (Datum) 0;
|
|
|
|
else
|
|
|
|
newelem = PG_GETARG_DATUM(0);
|
2015-05-14 18:08:40 +02:00
|
|
|
eah = fetch_array_arg_replace_nulls(fcinfo, 1);
|
2005-11-17 23:14:56 +01:00
|
|
|
|
2015-05-14 18:08:40 +02:00
|
|
|
if (eah->ndims == 1)
|
2003-06-25 01:14:49 +02:00
|
|
|
{
|
2015-02-19 02:53:14 +01:00
|
|
|
/* prepend newelem */
|
2015-05-14 18:08:40 +02:00
|
|
|
lb = eah->lbound;
|
|
|
|
lb0 = lb[0];
|
2003-06-27 02:33:26 +02:00
|
|
|
|
2017-12-13 01:32:31 +01:00
|
|
|
if (pg_sub_s32_overflow(lb0, 1, &indx))
|
2015-02-19 02:53:14 +01:00
|
|
|
ereport(ERROR,
|
|
|
|
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
|
|
|
|
errmsg("integer out of range")));
|
2003-06-25 01:14:49 +02:00
|
|
|
}
|
Support "expanded" objects, particularly arrays, for better performance.
This patch introduces the ability for complex datatypes to have an
in-memory representation that is different from their on-disk format.
On-disk formats are typically optimized for minimal size, and in any case
they can't contain pointers, so they are often not well-suited for
computation. Now a datatype can invent an "expanded" in-memory format
that is better suited for its operations, and then pass that around among
the C functions that operate on the datatype. There are also provisions
(rudimentary as yet) to allow an expanded object to be modified in-place
under suitable conditions, so that operations like assignment to an element
of an array need not involve copying the entire array.
The initial application for this feature is arrays, but it is not hard
to foresee using it for other container types like JSON, XML and hstore.
I have hopes that it will be useful to PostGIS as well.
In this initial implementation, a few heuristics have been hard-wired
into plpgsql to improve performance for arrays that are stored in
plpgsql variables. We would like to generalize those hacks so that
other datatypes can obtain similar improvements, but figuring out some
appropriate APIs is left as a task for future work. (The heuristics
themselves are probably not optimal yet, either, as they sometimes
force expansion of arrays that would be better left alone.)
Preliminary performance testing shows impressive speed gains for plpgsql
functions that do element-by-element access or update of large arrays.
There are other cases that get a little slower, as a result of added array
format conversions; but we can hope to improve anything that's annoyingly
bad. In any case most applications should see a net win.
Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
|
|
|
else if (eah->ndims == 0)
|
|
|
|
{
|
2003-06-27 02:33:26 +02:00
|
|
|
indx = 1;
|
Support "expanded" objects, particularly arrays, for better performance.
This patch introduces the ability for complex datatypes to have an
in-memory representation that is different from their on-disk format.
On-disk formats are typically optimized for minimal size, and in any case
they can't contain pointers, so they are often not well-suited for
computation. Now a datatype can invent an "expanded" in-memory format
that is better suited for its operations, and then pass that around among
the C functions that operate on the datatype. There are also provisions
(rudimentary as yet) to allow an expanded object to be modified in-place
under suitable conditions, so that operations like assignment to an element
of an array need not involve copying the entire array.
The initial application for this feature is arrays, but it is not hard
to foresee using it for other container types like JSON, XML and hstore.
I have hopes that it will be useful to PostGIS as well.
In this initial implementation, a few heuristics have been hard-wired
into plpgsql to improve performance for arrays that are stored in
plpgsql variables. We would like to generalize those hacks so that
other datatypes can obtain similar improvements, but figuring out some
appropriate APIs is left as a task for future work. (The heuristics
themselves are probably not optimal yet, either, as they sometimes
force expansion of arrays that would be better left alone.)
Preliminary performance testing shows impressive speed gains for plpgsql
functions that do element-by-element access or update of large arrays.
There are other cases that get a little slower, as a result of added array
format conversions; but we can hope to improve anything that's annoyingly
bad. In any case most applications should see a net win.
Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
|
|
|
lb0 = 1;
|
|
|
|
}
|
2003-06-25 01:14:49 +02:00
|
|
|
else
|
2003-07-27 06:53:12 +02:00
|
|
|
ereport(ERROR,
|
2003-09-15 22:03:37 +02:00
|
|
|
(errcode(ERRCODE_DATA_EXCEPTION),
|
2005-10-15 04:49:52 +02:00
|
|
|
errmsg("argument must be empty or one-dimensional array")));
|
2003-06-27 02:33:26 +02:00
|
|
|
|
2015-02-19 02:53:14 +01:00
|
|
|
/* Perform element insertion */
|
2003-06-27 02:33:26 +02:00
|
|
|
my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
|
2003-06-25 23:30:34 +02:00
|
|
|
|
Support "expanded" objects, particularly arrays, for better performance.
This patch introduces the ability for complex datatypes to have an
in-memory representation that is different from their on-disk format.
On-disk formats are typically optimized for minimal size, and in any case
they can't contain pointers, so they are often not well-suited for
computation. Now a datatype can invent an "expanded" in-memory format
that is better suited for its operations, and then pass that around among
the C functions that operate on the datatype. There are also provisions
(rudimentary as yet) to allow an expanded object to be modified in-place
under suitable conditions, so that operations like assignment to an element
of an array need not involve copying the entire array.
The initial application for this feature is arrays, but it is not hard
to foresee using it for other container types like JSON, XML and hstore.
I have hopes that it will be useful to PostGIS as well.
In this initial implementation, a few heuristics have been hard-wired
into plpgsql to improve performance for arrays that are stored in
plpgsql variables. We would like to generalize those hacks so that
other datatypes can obtain similar improvements, but figuring out some
appropriate APIs is left as a task for future work. (The heuristics
themselves are probably not optimal yet, either, as they sometimes
force expansion of arrays that would be better left alone.)
Preliminary performance testing shows impressive speed gains for plpgsql
functions that do element-by-element access or update of large arrays.
There are other cases that get a little slower, as a result of added array
format conversions; but we can hope to improve anything that's annoyingly
bad. In any case most applications should see a net win.
Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
|
|
|
result = array_set_element(EOHPGetRWDatum(&eah->hdr),
|
|
|
|
1, &indx, newelem, isNull,
|
Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 21:35:54 +02:00
|
|
|
-1, my_extra->typlen, my_extra->typbyval, my_extra->typalign);
|
2003-04-09 01:20:04 +02:00
|
|
|
|
2015-02-19 02:53:14 +01:00
|
|
|
/* Readjust result's LB to match the input's, as expected for prepend */
|
Support "expanded" objects, particularly arrays, for better performance.
This patch introduces the ability for complex datatypes to have an
in-memory representation that is different from their on-disk format.
On-disk formats are typically optimized for minimal size, and in any case
they can't contain pointers, so they are often not well-suited for
computation. Now a datatype can invent an "expanded" in-memory format
that is better suited for its operations, and then pass that around among
the C functions that operate on the datatype. There are also provisions
(rudimentary as yet) to allow an expanded object to be modified in-place
under suitable conditions, so that operations like assignment to an element
of an array need not involve copying the entire array.
The initial application for this feature is arrays, but it is not hard
to foresee using it for other container types like JSON, XML and hstore.
I have hopes that it will be useful to PostGIS as well.
In this initial implementation, a few heuristics have been hard-wired
into plpgsql to improve performance for arrays that are stored in
plpgsql variables. We would like to generalize those hacks so that
other datatypes can obtain similar improvements, but figuring out some
appropriate APIs is left as a task for future work. (The heuristics
themselves are probably not optimal yet, either, as they sometimes
force expansion of arrays that would be better left alone.)
Preliminary performance testing shows impressive speed gains for plpgsql
functions that do element-by-element access or update of large arrays.
There are other cases that get a little slower, as a result of added array
format conversions; but we can hope to improve anything that's annoyingly
bad. In any case most applications should see a net win.
Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
|
|
|
Assert(result == EOHPGetRWDatum(&eah->hdr));
|
|
|
|
if (eah->ndims == 1)
|
|
|
|
{
|
|
|
|
/* This is ok whether we've deconstructed or not */
|
|
|
|
eah->lbound[0] = lb0;
|
|
|
|
}
|
2005-11-19 02:50:08 +01:00
|
|
|
|
Support "expanded" objects, particularly arrays, for better performance.
This patch introduces the ability for complex datatypes to have an
in-memory representation that is different from their on-disk format.
On-disk formats are typically optimized for minimal size, and in any case
they can't contain pointers, so they are often not well-suited for
computation. Now a datatype can invent an "expanded" in-memory format
that is better suited for its operations, and then pass that around among
the C functions that operate on the datatype. There are also provisions
(rudimentary as yet) to allow an expanded object to be modified in-place
under suitable conditions, so that operations like assignment to an element
of an array need not involve copying the entire array.
The initial application for this feature is arrays, but it is not hard
to foresee using it for other container types like JSON, XML and hstore.
I have hopes that it will be useful to PostGIS as well.
In this initial implementation, a few heuristics have been hard-wired
into plpgsql to improve performance for arrays that are stored in
plpgsql variables. We would like to generalize those hacks so that
other datatypes can obtain similar improvements, but figuring out some
appropriate APIs is left as a task for future work. (The heuristics
themselves are probably not optimal yet, either, as they sometimes
force expansion of arrays that would be better left alone.)
Preliminary performance testing shows impressive speed gains for plpgsql
functions that do element-by-element access or update of large arrays.
There are other cases that get a little slower, as a result of added array
format conversions; but we can hope to improve anything that's annoyingly
bad. In any case most applications should see a net win.
Tom Lane, reviewed by Andres Freund
2015-05-14 18:08:40 +02:00
|
|
|
PG_RETURN_DATUM(result);
|
2003-04-09 01:20:04 +02:00
|
|
|
}

/*-----------------------------------------------------------------------------
 * array_cat :
 *		concatenate two nD arrays to form an nD array, or
 *		push an (n-1)D array onto the end of an nD array
 *----------------------------------------------------------------------------
 */
Datum
array_cat(PG_FUNCTION_ARGS)
{
	ArrayType  *v1,
			   *v2;
	ArrayType  *result;
	int		   *dims,
			   *lbs,
				ndims,
				nitems,
				ndatabytes,
				nbytes;
	int		   *dims1,
			   *lbs1,
				ndims1,
				nitems1,
				ndatabytes1;
	int		   *dims2,
			   *lbs2,
				ndims2,
				nitems2,
				ndatabytes2;
	int			i;
	char	   *dat1,
			   *dat2;
	bits8	   *bitmap1,
			   *bitmap2;
	Oid			element_type;
	Oid			element_type1;
	Oid			element_type2;
	int32		dataoffset;

	/* Concatenating a null array is a no-op, just return the other input */
	if (PG_ARGISNULL(0))
	{
		if (PG_ARGISNULL(1))
			PG_RETURN_NULL();
		result = PG_GETARG_ARRAYTYPE_P(1);
		PG_RETURN_ARRAYTYPE_P(result);
	}
	if (PG_ARGISNULL(1))
	{
		result = PG_GETARG_ARRAYTYPE_P(0);
		PG_RETURN_ARRAYTYPE_P(result);
	}

	v1 = PG_GETARG_ARRAYTYPE_P(0);
	v2 = PG_GETARG_ARRAYTYPE_P(1);

	element_type1 = ARR_ELEMTYPE(v1);
	element_type2 = ARR_ELEMTYPE(v2);

	/* Check we have matching element types */
	if (element_type1 != element_type2)
		ereport(ERROR,
				(errcode(ERRCODE_DATATYPE_MISMATCH),
				 errmsg("cannot concatenate incompatible arrays"),
				 errdetail("Arrays with element types %s and %s are not "
						   "compatible for concatenation.",
						   format_type_be(element_type1),
						   format_type_be(element_type2))));

	/* OK, use it */
	element_type = element_type1;

	/*----------
	 * We must have one of the following combinations of inputs:
	 * 1) one empty array, and one non-empty array
	 * 2) both arrays empty
	 * 3) two arrays with ndims1 == ndims2
	 * 4) ndims1 == ndims2 - 1
	 * 5) ndims1 == ndims2 + 1
	 *----------
	 */
	ndims1 = ARR_NDIM(v1);
	ndims2 = ARR_NDIM(v2);

	/*
	 * short circuit - if one input array is empty, and the other is not, we
	 * return the non-empty one as the result
	 *
	 * if both are empty, return the first one
	 */
	if (ndims1 == 0 && ndims2 > 0)
		PG_RETURN_ARRAYTYPE_P(v2);

	if (ndims2 == 0)
		PG_RETURN_ARRAYTYPE_P(v1);

	/* the rest fall under rule 3, 4, or 5 */
	if (ndims1 != ndims2 &&
		ndims1 != ndims2 - 1 &&
		ndims1 != ndims2 + 1)
		ereport(ERROR,
				(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
				 errmsg("cannot concatenate incompatible arrays"),
				 errdetail("Arrays of %d and %d dimensions are not "
						   "compatible for concatenation.",
						   ndims1, ndims2)));

	/* get argument array details */
	lbs1 = ARR_LBOUND(v1);
	lbs2 = ARR_LBOUND(v2);
	dims1 = ARR_DIMS(v1);
	dims2 = ARR_DIMS(v2);
	dat1 = ARR_DATA_PTR(v1);
	dat2 = ARR_DATA_PTR(v2);
	bitmap1 = ARR_NULLBITMAP(v1);
	bitmap2 = ARR_NULLBITMAP(v2);
	nitems1 = ArrayGetNItems(ndims1, dims1);
	nitems2 = ArrayGetNItems(ndims2, dims2);
	ndatabytes1 = ARR_SIZE(v1) - ARR_DATA_OFFSET(v1);
	ndatabytes2 = ARR_SIZE(v2) - ARR_DATA_OFFSET(v2);

	if (ndims1 == ndims2)
	{
		/*
		 * resulting array is made up of the elements (possibly arrays
		 * themselves) of the input argument arrays
		 */
		ndims = ndims1;
		dims = (int *) palloc(ndims * sizeof(int));
		lbs = (int *) palloc(ndims * sizeof(int));

		dims[0] = dims1[0] + dims2[0];
		lbs[0] = lbs1[0];

		for (i = 1; i < ndims; i++)
		{
			if (dims1[i] != dims2[i] || lbs1[i] != lbs2[i])
				ereport(ERROR,
						(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
						 errmsg("cannot concatenate incompatible arrays"),
						 errdetail("Arrays with differing element dimensions are "
								   "not compatible for concatenation.")));

			dims[i] = dims1[i];
			lbs[i] = lbs1[i];
		}
	}
	else if (ndims1 == ndims2 - 1)
	{
		/*
		 * resulting array has the second argument as the outer array, with
		 * the first argument inserted at the front of the outer dimension
		 */
		ndims = ndims2;
		dims = (int *) palloc(ndims * sizeof(int));
		lbs = (int *) palloc(ndims * sizeof(int));
		memcpy(dims, dims2, ndims * sizeof(int));
		memcpy(lbs, lbs2, ndims * sizeof(int));

		/* increment number of elements in outer array */
		dims[0] += 1;

		/* make sure the added element matches our existing elements */
		for (i = 0; i < ndims1; i++)
		{
			if (dims1[i] != dims[i + 1] || lbs1[i] != lbs[i + 1])
				ereport(ERROR,
						(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
						 errmsg("cannot concatenate incompatible arrays"),
						 errdetail("Arrays with differing dimensions are not "
								   "compatible for concatenation.")));
		}
	}
	else
	{
		/*
		 * (ndims1 == ndims2 + 1)
		 *
		 * resulting array has the first argument as the outer array, with the
		 * second argument appended to the end of the outer dimension
		 */
		ndims = ndims1;
		dims = (int *) palloc(ndims * sizeof(int));
		lbs = (int *) palloc(ndims * sizeof(int));
		memcpy(dims, dims1, ndims * sizeof(int));
		memcpy(lbs, lbs1, ndims * sizeof(int));

		/* increment number of elements in outer array */
		dims[0] += 1;

		/* make sure the added element matches our existing elements */
		for (i = 0; i < ndims2; i++)
		{
			if (dims2[i] != dims[i + 1] || lbs2[i] != lbs[i + 1])
				ereport(ERROR,
						(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
						 errmsg("cannot concatenate incompatible arrays"),
						 errdetail("Arrays with differing dimensions are not "
								   "compatible for concatenation.")));
		}
	}

	/* Do this mainly for overflow checking */
	nitems = ArrayGetNItems(ndims, dims);

	/* build the result array */
	ndatabytes = ndatabytes1 + ndatabytes2;
	if (ARR_HASNULL(v1) || ARR_HASNULL(v2))
	{
		dataoffset = ARR_OVERHEAD_WITHNULLS(ndims, nitems);
		nbytes = ndatabytes + dataoffset;
	}
	else
	{
		dataoffset = 0;			/* marker for no null bitmap */
		nbytes = ndatabytes + ARR_OVERHEAD_NONULLS(ndims);
	}
	result = (ArrayType *) palloc0(nbytes);
	SET_VARSIZE(result, nbytes);
	result->ndim = ndims;
	result->dataoffset = dataoffset;
	result->elemtype = element_type;
	memcpy(ARR_DIMS(result), dims, ndims * sizeof(int));
	memcpy(ARR_LBOUND(result), lbs, ndims * sizeof(int));
	/* data area is arg1 then arg2 */
	memcpy(ARR_DATA_PTR(result), dat1, ndatabytes1);
	memcpy(ARR_DATA_PTR(result) + ndatabytes1, dat2, ndatabytes2);
	/* handle the null bitmap if needed */
	if (ARR_HASNULL(result))
	{
		array_bitmap_copy(ARR_NULLBITMAP(result), 0,
						  bitmap1, 0,
						  nitems1);
		array_bitmap_copy(ARR_NULLBITMAP(result), nitems1,
						  bitmap2, 0,
						  nitems2);
	}

	PG_RETURN_ARRAYTYPE_P(result);
}
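
The dimension rules that array_cat enforces (cases 3, 4 and 5 in its comment) can be sketched in isolation. The function below is a hypothetical helper, `cat_result_dims`, which is not part of PostgreSQL. It computes only the result dimensions for two non-empty inputs and deliberately ignores lower bounds, element types, null bitmaps and the empty-array short circuit, all of which the real function also handles. The caller is assumed to supply a `dims` buffer with room for the larger of the two dimension counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical sketch: compute the dimensions of the concatenation of
 * two non-empty arrays.  Returns false when the inputs are incompatible.
 */
static bool
cat_result_dims(int ndims1, const int *dims1,
				int ndims2, const int *dims2,
				int *ndims, int *dims)
{
	int			i;

	assert(ndims1 >= 1 && ndims2 >= 1);

	if (ndims1 == ndims2)
	{
		/* same rank: outer lengths add, inner dimensions must match */
		*ndims = ndims1;
		dims[0] = dims1[0] + dims2[0];
		for (i = 1; i < ndims1; i++)
		{
			if (dims1[i] != dims2[i])
				return false;
			dims[i] = dims1[i];
		}
		return true;
	}
	if (ndims1 == ndims2 - 1)
	{
		/* first argument becomes one more outer element of the second */
		*ndims = ndims2;
		memcpy(dims, dims2, ndims2 * sizeof(int));
		dims[0] += 1;
		for (i = 0; i < ndims1; i++)
			if (dims1[i] != dims[i + 1])
				return false;
		return true;
	}
	if (ndims1 == ndims2 + 1)
	{
		/* second argument becomes one more outer element of the first */
		*ndims = ndims1;
		memcpy(dims, dims1, ndims1 * sizeof(int));
		dims[0] += 1;
		for (i = 0; i < ndims2; i++)
			if (dims2[i] != dims[i + 1])
				return false;
		return true;
	}
	/* ranks differ by more than one */
	return false;
}
```

For example, under these assumptions a 2x3 array concatenated with a 4x3 array would yield dimensions 6x3, and a 3-element 1-D array pushed onto a 2x3 array would yield 3x3.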


/*
 * ARRAY_AGG(anynonarray) aggregate function
 */
Datum
array_agg_transfn(PG_FUNCTION_ARGS)
{
	Oid			arg1_typeid = get_fn_expr_argtype(fcinfo->flinfo, 1);
	MemoryContext aggcontext;
	ArrayBuildState *state;
	Datum		elem;

	if (arg1_typeid == InvalidOid)
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
				 errmsg("could not determine input data type")));

	/*
	 * Note: we do not need a run-time check about whether arg1_typeid is a
	 * valid array element type, because the parser would have verified that
	 * while resolving the input/result types of this polymorphic aggregate.
	 */

	if (!AggCheckCallContext(fcinfo, &aggcontext))
	{
		/* cannot be called directly because of internal-type argument */
		elog(ERROR, "array_agg_transfn called in non-aggregate context");
	}

	if (PG_ARGISNULL(0))
		state = initArrayResult(arg1_typeid, aggcontext, false);
	else
		state = (ArrayBuildState *) PG_GETARG_POINTER(0);

	elem = PG_ARGISNULL(1) ? (Datum) 0 : PG_GETARG_DATUM(1);

	state = accumArrayResult(state,
							 elem,
							 PG_ARGISNULL(1),
							 arg1_typeid,
							 aggcontext);

	/*
	 * The transition type for array_agg() is declared to be "internal", which
	 * is a pass-by-value type the same size as a pointer. So we can safely
	 * pass the ArrayBuildState pointer through nodeAgg.c's machinations.
	 */
	PG_RETURN_POINTER(state);
}

Datum
array_agg_finalfn(PG_FUNCTION_ARGS)
{
	Datum		result;
	ArrayBuildState *state;
	int			dims[1];
	int			lbs[1];

	/* cannot be called directly because of internal-type argument */
	Assert(AggCheckCallContext(fcinfo, NULL));

	state = PG_ARGISNULL(0) ? NULL : (ArrayBuildState *) PG_GETARG_POINTER(0);

	if (state == NULL)
		PG_RETURN_NULL();		/* returns null iff no input values */

	dims[0] = state->nelems;
	lbs[0] = 1;

	/*
	 * Make the result. We cannot release the ArrayBuildState because
	 * sometimes aggregate final functions are re-executed. Rather, it is
	 * nodeAgg.c's responsibility to reset the aggcontext when it's safe to do
	 * so.
	 */
	result = makeMdArrayResult(state, 1, dims, lbs,
							   CurrentMemoryContext,
							   false);

	PG_RETURN_DATUM(result);
}

/*
 * ARRAY_AGG(anyarray) aggregate function
 */
Datum
array_agg_array_transfn(PG_FUNCTION_ARGS)
{
	Oid			arg1_typeid = get_fn_expr_argtype(fcinfo->flinfo, 1);
	MemoryContext aggcontext;
	ArrayBuildStateArr *state;

	if (arg1_typeid == InvalidOid)
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
				 errmsg("could not determine input data type")));

	/*
	 * Note: we do not need a run-time check about whether arg1_typeid is a
	 * valid array type, because the parser would have verified that while
	 * resolving the input/result types of this polymorphic aggregate.
	 */

	if (!AggCheckCallContext(fcinfo, &aggcontext))
	{
		/* cannot be called directly because of internal-type argument */
		elog(ERROR, "array_agg_array_transfn called in non-aggregate context");
	}


	if (PG_ARGISNULL(0))
		state = initArrayResultArr(arg1_typeid, InvalidOid, aggcontext, false);
	else
		state = (ArrayBuildStateArr *) PG_GETARG_POINTER(0);

	state = accumArrayResultArr(state,
								PG_GETARG_DATUM(1),
								PG_ARGISNULL(1),
								arg1_typeid,
								aggcontext);

	/*
	 * The transition type for array_agg() is declared to be "internal", which
	 * is a pass-by-value type the same size as a pointer. So we can safely
	 * pass the ArrayBuildStateArr pointer through nodeAgg.c's machinations.
	 */
	PG_RETURN_POINTER(state);
}

Datum
array_agg_array_finalfn(PG_FUNCTION_ARGS)
{
	Datum		result;
	ArrayBuildStateArr *state;

	/* cannot be called directly because of internal-type argument */
	Assert(AggCheckCallContext(fcinfo, NULL));

	state = PG_ARGISNULL(0) ? NULL : (ArrayBuildStateArr *) PG_GETARG_POINTER(0);

	if (state == NULL)
		PG_RETURN_NULL();		/* returns null iff no input values */

	/*
	 * Make the result. We cannot release the ArrayBuildStateArr because
	 * sometimes aggregate final functions are re-executed. Rather, it is
	 * nodeAgg.c's responsibility to reset the aggcontext when it's safe to do
	 * so.
	 */
	result = makeArrayResultArr(state, CurrentMemoryContext, false);

	PG_RETURN_DATUM(result);
}

/*-----------------------------------------------------------------------------
 * array_position, array_position_start :
 *			return the offset of a value in an array.
 *
 * IS NOT DISTINCT FROM semantics are used for comparisons. Return NULL when
 * the value is not found.
 *-----------------------------------------------------------------------------
 */
Datum
array_position(PG_FUNCTION_ARGS)
{
	return array_position_common(fcinfo);
}

Datum
array_position_start(PG_FUNCTION_ARGS)
{
	return array_position_common(fcinfo);
}

/*
 * array_position_common
 *		Common code for array_position and array_position_start
 *
 * These are separate wrappers for the sake of opr_sanity regression test.
 * They are not strict so we have to test for null inputs explicitly.
 */
static Datum
array_position_common(FunctionCallInfo fcinfo)
{
	ArrayType  *array;
	Oid			collation = PG_GET_COLLATION();
	Oid			element_type;
	Datum		searched_element,
				value;
	bool		isnull;
	int			position,
				position_min;
	bool		found = false;
	TypeCacheEntry *typentry;
	ArrayMetaState *my_extra;
	bool		null_search;
	ArrayIterator array_iterator;

	if (PG_ARGISNULL(0))
		PG_RETURN_NULL();

	array = PG_GETARG_ARRAYTYPE_P(0);
	element_type = ARR_ELEMTYPE(array);

	/*
	 * We refuse to search for elements in multi-dimensional arrays, since we
	 * have no good way to report the element's location in the array.
	 */
	if (ARR_NDIM(array) > 1)
		ereport(ERROR,
				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				 errmsg("searching for elements in multidimensional arrays is not supported")));

	if (PG_ARGISNULL(1))
	{
		/* fast return when the array doesn't have nulls */
		if (!array_contains_nulls(array))
			PG_RETURN_NULL();
		searched_element = (Datum) 0;
		null_search = true;
	}
	else
	{
		searched_element = PG_GETARG_DATUM(1);
		null_search = false;
	}

	position = (ARR_LBOUND(array))[0] - 1;

	/* figure out where to start */
	if (PG_NARGS() == 3)
	{
		if (PG_ARGISNULL(2))
			ereport(ERROR,
					(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
					 errmsg("initial position must not be null")));

		position_min = PG_GETARG_INT32(2);
	}
	else
		position_min = (ARR_LBOUND(array))[0];

	/*
	 * We arrange to look up type info for array_create_iterator only once per
	 * series of calls, assuming the element type doesn't change underneath
	 * us.
	 */
	my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
	if (my_extra == NULL)
	{
		fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
													  sizeof(ArrayMetaState));
		my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
		my_extra->element_type = ~element_type;
	}

	if (my_extra->element_type != element_type)
	{
		get_typlenbyvalalign(element_type,
							 &my_extra->typlen,
							 &my_extra->typbyval,
							 &my_extra->typalign);

		typentry = lookup_type_cache(element_type, TYPECACHE_EQ_OPR_FINFO);

		if (!OidIsValid(typentry->eq_opr_finfo.fn_oid))
			ereport(ERROR,
					(errcode(ERRCODE_UNDEFINED_FUNCTION),
					 errmsg("could not identify an equality operator for type %s",
							format_type_be(element_type))));

		my_extra->element_type = element_type;
		fmgr_info_cxt(typentry->eq_opr_finfo.fn_oid, &my_extra->proc,
					  fcinfo->flinfo->fn_mcxt);
	}

	/* Examine each array element until we find a match. */
	array_iterator = array_create_iterator(array, 0, my_extra);
	while (array_iterate(array_iterator, &value, &isnull))
	{
		position++;

		/* skip initial elements if caller requested so */
		if (position < position_min)
			continue;

		/*
		 * Can't look at the array element's value if it's null; but if we
		 * search for null, we have a hit and are done.
		 */
		if (isnull || null_search)
		{
			if (isnull && null_search)
			{
				found = true;
				break;
			}
			else
				continue;
		}

		/* not nulls, so run the operator */
		if (DatumGetBool(FunctionCall2Coll(&my_extra->proc, collation,
										   searched_element, value)))
		{
			found = true;
			break;
		}
	}

	array_free_iterator(array_iterator);

	/* Avoid leaking memory when handed toasted input */
	PG_FREE_IF_COPY(array, 0);

	if (!found)
		PG_RETURN_NULL();

	PG_RETURN_INT32(position);
}
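
/*
 * Illustrative sketch only, not backend code: the search loop above, restated
 * on a plain int array with a parallel null-flag array. The function name and
 * all parameters are hypothetical; integer == stands in for the type's
 * equality operator, and a bool return stands in for "SQL NULL when not
 * found". It is meant to show how the lower bound, the starting position and
 * the IS NOT DISTINCT FROM rule interact.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true and stores the 'lbound'-based position of the first match in
 * *position_out; returns false when nothing matches at or after
 * position_min. With null_search set, only a null element matches.
 */
bool
sketch_array_position(const int *values, const bool *nulls, size_t nelems,
					  int lbound, int position_min,
					  int searched, bool null_search, int *position_out)
{
	int			position = lbound - 1;
	size_t		i;

	for (i = 0; i < nelems; i++)
	{
		position++;

		/* skip elements before the requested starting position */
		if (position < position_min)
			continue;

		/* a null matches only a null search; never compare null values */
		if (nulls[i] || null_search)
		{
			if (nulls[i] && null_search)
			{
				*position_out = position;
				return true;
			}
			continue;
		}

		if (values[i] == searched)
		{
			*position_out = position;
			return true;
		}
	}

	return false;
}
```

/*
 * Because positions are counted from the array's lower bound, an array whose
 * lower bound is 5 reports its first element at position 5, not 1.
 */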

/*-----------------------------------------------------------------------------
 * array_positions :
 *			return an array of positions of a value in an array.
 *
 * IS NOT DISTINCT FROM semantics are used for comparisons. Returns NULL when
 * the input array is NULL. When the value is not found in the array, returns
 * an empty array.
 *
 * This is not strict so we have to test for null inputs explicitly.
 *-----------------------------------------------------------------------------
 */
Datum
array_positions(PG_FUNCTION_ARGS)
{
	ArrayType  *array;
	Oid			collation = PG_GET_COLLATION();
	Oid			element_type;
	Datum		searched_element,
				value;
	bool		isnull;
	int			position;
	TypeCacheEntry *typentry;
	ArrayMetaState *my_extra;
	bool		null_search;
	ArrayIterator array_iterator;
	ArrayBuildState *astate = NULL;

	if (PG_ARGISNULL(0))
		PG_RETURN_NULL();

	array = PG_GETARG_ARRAYTYPE_P(0);
	element_type = ARR_ELEMTYPE(array);

	position = (ARR_LBOUND(array))[0] - 1;

	/*
	 * We refuse to search for elements in multi-dimensional arrays, since we
	 * have no good way to report the element's location in the array.
	 */
	if (ARR_NDIM(array) > 1)
		ereport(ERROR,
				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				 errmsg("searching for elements in multidimensional arrays is not supported")));

	astate = initArrayResult(INT4OID, CurrentMemoryContext, false);

	if (PG_ARGISNULL(1))
	{
		/* fast return when the array doesn't have nulls */
		if (!array_contains_nulls(array))
			PG_RETURN_DATUM(makeArrayResult(astate, CurrentMemoryContext));
		searched_element = (Datum) 0;
		null_search = true;
	}
	else
	{
		searched_element = PG_GETARG_DATUM(1);
		null_search = false;
	}

	/*
	 * We arrange to look up type info for array_create_iterator only once per
	 * series of calls, assuming the element type doesn't change underneath
	 * us.
	 */
	my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
	if (my_extra == NULL)
	{
		fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
													  sizeof(ArrayMetaState));
		my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
		my_extra->element_type = ~element_type;
	}

	if (my_extra->element_type != element_type)
	{
		get_typlenbyvalalign(element_type,
							 &my_extra->typlen,
							 &my_extra->typbyval,
							 &my_extra->typalign);

		typentry = lookup_type_cache(element_type, TYPECACHE_EQ_OPR_FINFO);

		if (!OidIsValid(typentry->eq_opr_finfo.fn_oid))
			ereport(ERROR,
					(errcode(ERRCODE_UNDEFINED_FUNCTION),
					 errmsg("could not identify an equality operator for type %s",
							format_type_be(element_type))));

		my_extra->element_type = element_type;
		fmgr_info_cxt(typentry->eq_opr_finfo.fn_oid, &my_extra->proc,
					  fcinfo->flinfo->fn_mcxt);
	}

	/*
	 * Accumulate each array position iff the element matches the given
	 * element.
	 */
	array_iterator = array_create_iterator(array, 0, my_extra);
	while (array_iterate(array_iterator, &value, &isnull))
	{
		position += 1;

		/*
		 * Can't look at the array element's value if it's null; but if we
		 * search for null, we have a hit.
		 */
		if (isnull || null_search)
		{
			if (isnull && null_search)
				astate =
					accumArrayResult(astate, Int32GetDatum(position), false,
									 INT4OID, CurrentMemoryContext);

			continue;
		}

		/* not nulls, so run the operator */
		if (DatumGetBool(FunctionCall2Coll(&my_extra->proc, collation,
										   searched_element, value)))
			astate =
				accumArrayResult(astate, Int32GetDatum(position), false,
								 INT4OID, CurrentMemoryContext);
	}

	array_free_iterator(array_iterator);

	/* Avoid leaking memory when handed toasted input */
	PG_FREE_IF_COPY(array, 0);

	PG_RETURN_DATUM(makeArrayResult(astate, CurrentMemoryContext));
}
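
/*
 * Illustrative sketch only, not backend code: the accumulation loop above on
 * a plain int array with a parallel null-flag array. The function name and
 * parameters are hypothetical; a caller-supplied buffer stands in for the
 * ArrayBuildState, and integer == stands in for the equality operator. A
 * return value of 0 corresponds to the empty-array result.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Writes the 'lbound'-based position of every matching element into
 * positions_out (which must have room for nelems entries) and returns the
 * number of matches. With null_search set, only null elements match.
 */
size_t
sketch_array_positions(const int *values, const bool *nulls, size_t nelems,
					   int lbound, int searched, bool null_search,
					   int *positions_out)
{
	int			position = lbound - 1;
	size_t		nfound = 0;
	size_t		i;

	for (i = 0; i < nelems; i++)
	{
		bool		match;

		position += 1;

		/* never compare a null value; a null matches only a null search */
		if (nulls[i] || null_search)
			match = nulls[i] && null_search;
		else
			match = (values[i] == searched);

		if (match)
			positions_out[nfound++] = position;
	}

	return nfound;
}
```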