Design problems were discovered in the handling of composite types and
record types that would cause some relevant versions not to be recorded.
Misgivings were also expressed about the use of the pg_depend catalog
for this purpose. We're out of time for this release so we'll revert
and try again.
Commits reverted:
1bf946bd: Doc: Document known problem with Windows collation versions.
cf002008: Remove no-longer-relevant test case.
ef387bed: Fix bogus collation-version-recording logic.
0fb0a050: Hide internal error for pg_collation_actual_version(<bad OID>).
ff942057: Suppress "warning: variable 'collcollate' set but not used".
d50e3b1f: Fix assertion in collation version lookup.
f24b1569: Rethink extraction of collation dependencies.
257836a7: Track collation versions for indexes.
cd6f479e: Add pg_depend.refobjversion.
7d1297df: Remove pg_collation.collversion.
Discussion: https://postgr.es/m/CA%2BhUKGLhj5t1fcjqAu8iD9B3ixJtsTNqyCCD4V0aTO9kAKAjjA%40mail.gmail.com
websearch_to_tsquery() splits text in quotes into tokens and connects them with
phrase operator on its own. However, that leads to surprising results when the
token contains no words.
For instance, websearch_to_tsquery('"aaa: bbb"') is 'aaa <2> bbb', because
it is equivalent of to_tsquery(E'aaa <-> \':\' <-> bbb'). But
websearch_to_tsquery('"aaa: bbb"') has to be 'aaa <-> bbb' in order to match
to_tsvector('aaa: bbb').
Since 0c4f355c6a, we anyway connect lexemes of complex tokens with phrase
operators. Thus, let's just websearch_to_tsquery() parse text in quotes as
a single token. Therefore, websearch_to_tsquery() should process the quoted
text in the same way phraseto_tsquery() does. This solution is what we exactly
need and also simplifies the code.
This commit is an incompatible change, so we don't backpatch it.
Reported-by: Valentin Gatien-Baron
Discussion: https://postgr.es/m/CA%2B0DEqiZs7gdOd4ikmg%3D0UWG%2BSwWOLxPsk_JW-sx9WNOyrb0KQ%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Tom Lane, Zhihong Yu
Previously, we used to use the array of size max_replication_slots to
store stats for replication slots. But that had two problems in the cases
where a message for dropping a slot gets lost: 1) the stats for the new
slot are not recorded if the array is full and 2) writing beyond the end
of the array if the user reduces the max_replication_slots.
This commit uses HTAB for replication slot statistics, resolving both
problems. Now, pgstat_vacuum_stat() search for all the dead replication
slots in stats hashtable and tell the collector to remove them. To avoid
showing the stats for the already-dropped slots, pg_stat_replication_slots
view searches slot stats by the slot name taken from pg_replication_slots.
Also, we send a message for creating a slot at slot creation, initializing
the stats. This reduces the possibility that the stats are accumulated
into the old slot stats when a message for dropping a slot gets lost.
Reported-by: Andres Freund
Author: Sawada Masahiko, test case by Vignesh C
Reviewed-by: Amit Kapila, Vignesh C, Dilip Kumar
Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de
These functions shouldn't receive null arguments: multirange_constructor0()
doesn't have any arguments while multirange_constructor2() has a single array
argument, which is never null.
But mark them strict anyway for the sake of uniformity.
Also, make checks for null arguments use elog() instead of ereport() as these
errors should normally be never thrown. And adjust corresponding comments.
Catversion is bumped.
Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/0f783a96-8d67-9e71-996b-f34a7352eeef%40enterprisedb.com
Previously, it was pg_stat_activity.queryid to match the
pg_stat_statements queryid column. This is an adjustment to patch
4f0b0966c8. This also adjusts some of the internal function calls to
match. Catversion bumped.
Reported-by: Álvaro Herrera, Julien Rouhaud
Discussion: https://postgr.es/m/20210408032704.GA7498@alvherre.pgsql
This adds the statistics about total transactions count and total
transaction data logically sent to the decoding output plugin from
ReorderBuffer. Users can query the pg_stat_replication_slots view to check
these stats.
Suggested-by: Andres Freund
Author: Vignesh C and Amit Kapila
Reviewed-by: Sawada Masahiko, Amit Kapila
Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de
This will make it consistent with the other usage of slotname in the code.
In the passing, change pgstat_report_replslot signature to use a structure
rather than multiple parameters.
Reported-by: Andres Freund
Author: Vignesh C
Reviewed-by: Sawada Masahiko, Amit Kapila
Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de
With the Oracle Developer Studio 12.6 compiler, #line directives alter
the current source file location for purposes of #include "..."
directives. Hence, a VPATH build failed with 'cannot find include file:
"specscanner.c"'. With two exceptions, parser-containing directories
already add "-I. -I$(srcdir)"; eliminate the exceptions. Back-patch to
9.6 (all supported versions).
Using Roman numbers (via "RM" or "rm") for a conversion to calculate a
number of months has never considered the case of negative numbers,
where a conversion could easily cause out-of-bound memory accesses. The
conversions in themselves were not completely consistent either, as
specifying 12 would result in NULL, but it should mean XII.
This commit reworks the conversion calculation to have a more
consistent behavior:
- If the number of months and years is 0, return NULL.
- If the number of months is positive, return the exact month number.
- If the number of months is negative, do a backward calculation, with
-1 meaning December, -2 November, etc.
Reported-by: Theodor Arsenij Larionov-Trichkin
Author: Julien Rouhaud
Discussion: https://postgr.es/m/16953-f255a18f8c51f1d5@postgresql.org
backpatch-through: 9.6
Coverity complained about possible overflow in expressions like
intresult = tm->tm_sec * 1000000 + fsec;
on the grounds that the multiplication would happen in 32-bit
arithmetic before widening to the int64 result. I think these
are all false positives because of the limited possible range of
tm_sec; but nonetheless it seems silly to spell it like that when
nearby lines have the identical computation written with a 64-bit
constant.
... or more accurately, with an LL constant, which is not project
style. Make all of these use INT64CONST(), as we do elsewhere.
This is all new code from a2da77cdb, so no need for back-patch.
Commit c9c41c7a33 used two different
naming patterns. Standardize on the majority pattern, which was the
only pattern in the last reviewed version of that commit.
Comment fixes are applied on HEAD, and documentation improvements are
applied on back-branches where needed.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20210408164008.GJ6592@telsasoft.com
Backpatch-through: 9.6
Ignore parallel workers in pg_stat_statements
Oversight in 4f0b0966c8 which exposed queryid in parallel workers.
Counters are aggregated by the main backend process so parallel workers
would report duplicated activity, and could also report activity for the
wrong entry as they are only aware of the top level queryid.
Fix thinko in pg_stat_get_activity when retrieving the queryid.
Remove unnecessary call to pgstat_report_queryid().
Reported-by: Amit Kapila, Andres Freund, Thomas Munro
Discussion: https://postgr.es/m/20210408051735.lfbdzun5zdlax5gd@alap3.anarazel.dep634GTSOqnDW86Owrn6qDAVosC5dJjXjp7BMfc5Gz1Q@mail.gmail.com
Author: Julien Rouhaud
This adds support for writing CREATE FUNCTION and CREATE PROCEDURE
statements for language SQL with a function body that conforms to the
SQL standard and is portable to other implementations.
Instead of the PostgreSQL-specific AS $$ string literal $$ syntax,
this allows writing out the SQL statements making up the body
unquoted, either as a single statement:
CREATE FUNCTION add(a integer, b integer) RETURNS integer
LANGUAGE SQL
RETURN a + b;
or as a block
CREATE PROCEDURE insert_data(a integer, b integer)
LANGUAGE SQL
BEGIN ATOMIC
INSERT INTO tbl VALUES (a);
INSERT INTO tbl VALUES (b);
END;
The function body is parsed at function definition time and stored as
expression nodes in a new pg_proc column prosqlbody. So at run time,
no further parsing is required.
However, this form does not support polymorphic arguments, because
there is no more parse analysis done at call time.
Dependencies between the function and the objects it uses are fully
tracked.
A new RETURN statement is introduced. This can only be used inside
function bodies. Internally, it is treated much like a SELECT
statement.
psql needs some new intelligence to keep track of function body
boundaries so that it doesn't send off statements when it sees
semicolons that are inside a function body.
Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com
Use the in-core query id computation for pg_stat_activity,
log_line_prefix, and EXPLAIN VERBOSE.
Similar to other fields in pg_stat_activity, only the queryid from the
top level statements are exposed, and if the backends status isn't
active then the queryid from the last executed statements is displayed.
Add a %Q placeholder to include the queryid in log_line_prefix, which
will also only expose top level statements.
For EXPLAIN VERBOSE, if a query identifier has been computed, either by
enabling compute_query_id or using a third-party module, display it.
Bump catalog version.
Discussion: https://postgr.es/m/20210407125726.tkvjdbw76hxnpwfi@nol
Author: Julien Rouhaud
Reviewed-by: Alvaro Herrera, Nitin Jadhav, Zhihong Yu
The previous implementation of EXTRACT mapped internally to
date_part(), which returned type double precision (since it was
implemented long before the numeric type existed). This can lead to
imprecise output in some cases, so returning numeric would be
preferrable. Changing the return type of an existing function is a
bit risky, so instead we do the following: We implement a new set of
functions, which are now called "extract", in parallel to the existing
date_part functions. They work the same way internally but use
numeric instead of float8. The EXTRACT construct is now mapped by the
parser to these new extract functions. That way, dumps of views
etc. from old versions (which would use date_part) continue to work
unchanged, but new uses will map to the new extract functions.
Additionally, the reverse compilation of EXTRACT now reproduces the
original syntax, using the new mechanism introduced in
40c24bfef9.
The following minor changes of behavior result from the new
implementation:
- The column name from an isolated EXTRACT call is now "extract"
instead of "date_part".
- Extract from date now rejects inappropriate field names such as
HOUR. It was previously mapped internally to extract from
timestamp, so it would silently accept everything appropriate for
timestamp.
- Return values when extracting fields with possibly fractional
values, such as second and epoch, now have the full scale that the
value has internally (so, for example, '1.000000' instead of just
'1').
Reported-by: Petr Fedorov <petr.fedorov@phystech.edu>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
Commit 3e98c0bafb added pg_backend_memory_contexts view to display
the memory contexts of the backend process. However its target process
is limited to the backend that is accessing to the view. So this is
not so convenient when investigating the local memory bloat of other
backend process. To improve this situation, this commit adds
pg_log_backend_memory_contexts() function that requests to log
the memory contexts of the specified backend process.
This information can be also collected by calling
MemoryContextStats(TopMemoryContext) via a debugger. But
this technique cannot be used in some environments because no debugger
is available there. So, pg_log_backend_memory_contexts() allows us to
see the memory contexts of specified backend more easily.
Only superusers are allowed to request to log the memory contexts
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages and which can lead to denial of service.
On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its memory contexts at LOG_SERVER_ONLY level,
so that these memory contexts will appear in the server log but not
be sent to the client. It logs one message per memory context.
Because if it buffers all memory contexts into StringInfo to log them
as one message, which may require the buffer to be enlarged very much
and lead to OOM error since there can be a large number of memory
contexts in a backend.
When a backend process is consuming huge memory, logging all its
memory contexts might overrun available disk space. To prevent this,
now this patch limits the number of child contexts to log per parent
to 100. As with MemoryContextStats(), it supposes that practical cases
where the log gets long will typically be huge numbers of siblings
under the same parent context; while the additional debugging value
from seeing details about individual siblings beyond 100 will not be large.
There was another proposed patch to add the function to return
the memory contexts of specified backend as the result sets,
instead of logging them, in the discussion. However that patch is
not included in this commit because it had several issues to address.
Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra,
Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion.
Bump catalog version.
Author: Atsushi Torikoshi
Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao
Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com
spg_box_quad_leaf_consistent unconditionally returned the leaf
datum as leafValue, even though in its usage for poly_ops that
value is of completely the wrong type.
In versions before 12, that was harmless because the core code did
nothing with leafValue in non-index-only scans ... but since commit
2a6368343, if we were doing a KNN-style scan, spgNewHeapItem would
unconditionally try to copy the value using the wrong datatype
parameters. Said copying is a waste of time and space if we're not
going to return the data, but it accidentally failed to fail until
I fixed the datatype confusion in ac9099fc1.
Hence, change spgNewHeapItem to not copy the datum unless we're
actually going to return it later. This saves cycles and dodges
the question of whether lossy opclasses are returning the right
type. Also change spg_box_quad_leaf_consistent to not return
data that might be of the wrong type, as insurance against
somebody introducing a similar bug into the core code in future.
It seems like a good idea to back-patch these two changes into
v12 and v13, although I'm afraid to change spgNewHeapItem's
mistaken idea of which datatype to use in those branches.
Per buildfarm results from ac9099fc1.
Discussion: https://postgr.es/m/3728741.1617381471@sss.pgh.pa.us
Through various refactorings over time, the extract(timezone_minute
from time with time zone) and extract(timezone_minute from timestamp
with time zone) implementations ended up with two different but
equally nonsensical formulas by using SECS_PER_MINUTE and
MINS_PER_HOUR interchangeably. Since those two are of course both the
same number, the formulas do work, but for readability, fix them to be
semantically correct.
According to the comments, when an invalid or dropped column oid is passed
to has_column_privilege(), the intention has always been to return NULL.
However, when the caller had table level privilege the invalid/missing
column was never discovered, because table permissions were checked first.
Fix that by introducing extended versions of pg_attribute_acl(check|mask)
and pg_class_acl(check|mask) which take a new argument, is_missing. When
is_missing is NULL, the old behavior is preserved. But when is_missing is
passed by the caller, no ERROR is thrown for dropped or missing
columns/relations, and is_missing is flipped to true. This in turn allows
has_column_privilege to check for column privileges first, providing the
desired semantics.
Not backpatched since it is a user visible behavioral change with no previous
complaints, and the fix is a bit on the invasive side.
Author: Joe Conway
Reviewed-By: Tom Lane
Reported by: Ian Barwick
Discussion: https://postgr.es/m/flat/9b5f4311-157b-4164-7fe7-077b4fe8ed84%40joeconway.com
This patch makes two closely related sets of changes:
1. For UPDATE, the subplan of the ModifyTable node now only delivers
the new values of the changed columns (i.e., the expressions computed
in the query's SET clause) plus row identity information such as CTID.
ModifyTable must re-fetch the original tuple to merge in the old
values of any unchanged columns. The core advantage of this is that
the changed columns are uniform across all tables of an inherited or
partitioned target relation, whereas the other columns might not be.
A secondary advantage, when the UPDATE involves joins, is that less
data needs to pass through the plan tree. The disadvantage of course
is an extra fetch of each tuple to be updated. However, that seems to
be very nearly free in context; even worst-case tests don't show it to
add more than a couple percent to the total query cost. At some point
it might be interesting to combine the re-fetch with the tuple access
that ModifyTable must do anyway to mark the old tuple dead; but that
would require a good deal of refactoring and it seems it wouldn't buy
all that much, so this patch doesn't attempt it.
2. For inherited UPDATE/DELETE, instead of generating a separate
subplan for each target relation, we now generate a single subplan
that is just exactly like a SELECT's plan, then stick ModifyTable
on top of that. To let ModifyTable know which target relation a
given incoming row refers to, a tableoid junk column is added to
the row identity information. This gets rid of the horrid hack
that was inheritance_planner(), eliminating O(N^2) planning cost
and memory consumption in cases where there were many unprunable
target relations.
Point 2 of course requires point 1, so that there is a uniform
definition of the non-junk columns to be returned by the subplan.
We can't insist on uniform definition of the row identity junk
columns however, if we want to keep the ability to have both
plain and foreign tables in a partitioning hierarchy. Since
it wouldn't scale very far to have every child table have its
own row identity column, this patch includes provisions to merge
similar row identity columns into one column of the subplan result.
In particular, we can merge the whole-row Vars typically used as
row identity by FDWs into one column by pretending they are type
RECORD. (It's still okay for the actual composite Datums to be
labeled with the table's rowtype OID, though.)
There is more that can be done to file down residual inefficiencies
in this patch, but it seems to be committable now.
FDW authors should note several API changes:
* The argument list for AddForeignUpdateTargets() has changed, and so
has the method it must use for adding junk columns to the query. Call
add_row_identity_var() instead of manipulating the parse tree directly.
You might want to reconsider exactly what you're adding, too.
* PlanDirectModify() must now work a little harder to find the
ForeignScan plan node; if the foreign table is part of a partitioning
hierarchy then the ForeignScan might not be the direct child of
ModifyTable. See postgres_fdw for sample code.
* To check whether a relation is a target relation, it's no
longer sufficient to compare its relid to root->parse->resultRelation.
Instead, check it against all_result_relids or leaf_result_relids,
as appropriate.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
This allows something like
SELECT ... FROM t1 JOIN t2 USING (a, b, c) AS x
where x has the columns a, b, c and unlike a regular alias it does not
hide the range variables of the tables being joined t1 and t2.
Per SQL:2016 feature F404 "Range variable for common column names".
Reviewed-by: Vik Fearing <vik.fearing@2ndquadrant.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/454638cf-d563-ab76-a585-2564428062af@2ndquadrant.com
Here we add a new output parameter to estimate_num_groups() to allow it to
inform the caller of additional, possibly useful information about the
estimation.
The new output parameter is a struct that currently contains just a single
field with a set of flags. This was done rather than having the flags as
an output parameter to allow future fields to be added without having to
change the signature of the function at a later date when we want to pass
back further information that might not be suitable to store in the flags
field.
It seems reasonable that one day in the future that the planner would want
to know more about the estimation. For example, how many individual sets
of statistics was the estimation generated from? The planner may want to
take that into account if we ever want to consider risks as well as costs
when generating plans.
For now, there's only 1 flag we set in the flags field. This is to
indicate if the estimation fell back on using the hard-coded constants in
any part of the estimation. Callers may like to change their behavior if
this is set, and this gives them the ability to do so. Callers may pass
the flag pointer as NULL if they have no interest in obtaining any
additional information about the estimate.
We're not adding any actual usages of these flags here. Some follow-up
commits will make use of this feature. Additionally, we're also not
making any changes to add support for clauselist_selectivity() and
clauselist_selectivity_ext(). However, if this is required in the future
then the same struct being added here should be fine to use as a new
output argument for those functions too.
Author: David Rowley
Discussion: https://postgr.es/m/CAApHDvqQqpk=1W-G_ds7A9CsXX3BggWj_7okinzkLVhDubQzjA@mail.gmail.com
Allow defining extended statistics on expressions, not just just on
simple column references. With this commit, expressions are supported
by all existing extended statistics kinds, improving the same types of
estimates. A simple example may look like this:
CREATE TABLE t (a int);
CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t;
ANALYZE t;
The collected statistics are useful e.g. to estimate queries with those
expressions in WHERE or GROUP BY clauses:
SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0;
SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20);
This introduces new internal statistics kind 'e' (expressions) which is
built automatically when the statistics object definition includes any
expressions. This represents single-expression statistics, as if there
was an expression index (but without the index maintenance overhead).
The statistics is stored in pg_statistics_ext_data as an array of
composite types, which is possible thanks to 79f6a942bd.
CREATE STATISTICS allows building statistics on a single expression, in
which case in which case it's not possible to specify statistics kinds.
A new system view pg_stats_ext_exprs can be used to display expression
statistics, similarly to pg_stats and pg_stats_ext views.
ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it
treats indexes, i.e. it drops and recreates the statistics. This means
all statistics are reset, and we no longer try to preserve at least the
functional dependencies. This should not be a major issue in practice,
as the functional dependencies actually rely on per-column statistics,
which were always reset anyway.
Author: Tomas Vondra
Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu
Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com
When estimating the number of groups using extended statistics, the code
was discarding information about system attributes. This led to strange
situation that
SELECT 1 FROM t GROUP BY ctid;
could have produced higher estimate (equal to pg_class.reltuples) than
SELECT 1 FROM t GROUP BY a, b, ctid;
with extended statistics on (a,b). Fixed by retaining information about
the system attribute.
Backpatch all the way to 10, where extended statistics were introduced.
Author: Tomas Vondra
Backpatch-through: 10
Membership consists, implicitly, of the current database owner. Expect
use in template databases. Once pg_database_owner has rights within a
template, each owner of a database instantiated from that template will
exercise those rights.
Reviewed by John Naylor.
Discussion: https://postgr.es/m/20201228043148.GA1053024@rfd.leadboat.com
Allow a partition be detached from its partitioned table without
blocking concurrent queries, by running in two transactions and only
requiring ShareUpdateExclusive in the partitioned table.
Because it runs in two transactions, it cannot be used in a transaction
block. This is the main reason to use dedicated syntax: so that users
can choose to use the original mode if they need it. But also, it
doesn't work when a default partition exists (because an exclusive lock
would still need to be obtained on it, in order to change its partition
constraint.)
In case the second transaction is cancelled or a crash occurs, there's
ALTER TABLE .. DETACH PARTITION .. FINALIZE, which executes the final
steps.
The main trick to make this work is the addition of column
pg_inherits.inhdetachpending, initially false; can only be set true in
the first part of this command. Once that is committed, concurrent
transactions that use a PartitionDirectory will include or ignore
partitions so marked: in optimizer they are ignored if the row is marked
committed for the snapshot; in executor they are always included. As a
result, and because of the way PartitionDirectory caches partition
descriptors, queries that were planned before the detach will see the
rows in the detached partition and queries that are planned after the
detach, won't.
A CHECK constraint is created that duplicates the partition constraint.
This is probably not strictly necessary, and some users will prefer to
remove it afterwards, but if the partition is re-attached to a
partitioned table, the constraint needn't be rechecked.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200803234854.GA24158@alvherre.pgsql
Similar to date_trunc, but allows binning by an arbitrary interval
rather than just full units.
Author: John Naylor <john.naylor@enterprisedb.com>
Reviewed-by: David Fetter <david@fetter.org>
Reviewed-by: Isaac Morland <isaac.morland@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Artur Zakirov <zaartur@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CACPNZCt4buQFRgy6DyjuZS-2aPDpccRkrJBmgUfwYc1KiaXYxg@mail.gmail.com
This function for bit and bytea counts the set bits in the bit or byte
string. Internally, we use the existing popcount functionality.
For the name, after some discussion, we settled on bit_count, which
also exists with this meaning in MySQL, Java, and Python.
Author: David Fetter <david@fetter.org>
Discussion: https://www.postgresql.org/message-id/flat/20201230105535.GJ13234@fetter.org
There is now a per-column COMPRESSION option which can be set to pglz
(the default, and the only option in up until now) or lz4. Or, if you
like, you can set the new default_toast_compression GUC to lz4, and
then that will be the default for new table columns for which no value
is specified. We don't have lz4 support in the PostgreSQL code, so
to use lz4 compression, PostgreSQL must be built --with-lz4.
In general, TOAST compression means compression of individual column
values, not the whole tuple, and those values can either be compressed
inline within the tuple or compressed and then stored externally in
the TOAST table, so those properties also apply to this feature.
Prior to this commit, a TOAST pointer has two unused bits as part of
the va_extsize field, and a compessed datum has two unused bits as
part of the va_rawsize field. These bits are unused because the length
of a varlena is limited to 1GB; we now use them to indicate the
compression type that was used. This means we only have bit space for
2 more built-in compresison types, but we could work around that
problem, if necessary, by introducing a new vartag_external value for
any further types we end up wanting to add. Hopefully, it won't be
too important to offer a wide selection of algorithms here, since
each one we add not only takes more coding but also adds a build
dependency for every packager. Nevertheless, it seems worth doing
at least this much, because LZ4 gets better compression than PGLZ
with less CPU usage.
It's possible for LZ4-compressed datums to leak into composite type
values stored on disk, just as it is for PGLZ. It's also possible for
LZ4-compressed attributes to be copied into a different table via SQL
commands such as CREATE TABLE AS or INSERT .. SELECT. It would be
expensive to force such values to be decompressed, so PostgreSQL has
never done so. For the same reasons, we also don't force recompression
of already-compressed values even if the target table prefers a
different compression method than was used for the source data. These
architectural decisions are perhaps arguable but revisiting them is
well beyond the scope of what seemed possible to do as part of this
project. However, it's relatively cheap to recompress as part of
VACUUM FULL or CLUSTER, so this commit adjusts those commands to do
so, if the configured compression method of the table happens not to
match what was used for some column value stored therein.
Dilip Kumar. The original patches on which this work was based were
written by Ildus Kurbangaliev, and those were patches were based on
even earlier work by Nikita Glukhov, but the design has since changed
very substantially, since allow a potentially large number of
compression methods that could be added and dropped on a running
system proved too problematic given some of the architectural issues
mentioned above; the choice of which specific compression method to
add first is now different; and a lot of the code has been heavily
refactored. More recently, Justin Przyby helped quite a bit with
testing and reviewing and this version also includes some code
contributions from him. Other design input and review from Tomas
Vondra, Álvaro Herrera, Andres Freund, Oleg Bartunov, Alexander
Korotkov, and me.
Discussion: http://postgr.es/m/20170907194236.4cefce96%40wp.localdomain
Discussion: http://postgr.es/m/CAFiTN-uUpX3ck%3DK0mLEk-G_kUQY%3DSNOTeqdaNRR9FMdQrHKebw%40mail.gmail.com
With grouping sets, it's possible that some of the grouping sets are
duplicate. This is especially common with CUBE and ROLLUP clauses. For
example GROUP BY CUBE (a,b), CUBE (b,c) is equivalent to
GROUP BY GROUPING SETS (
(a, b, c),
(a, b, c),
(a, b, c),
(a, b),
(a, b),
(a, b),
(a),
(a),
(a),
(c, a),
(c, a),
(c, a),
(c),
(b, c),
(b),
()
)
Some of the grouping sets are calculated multiple times, which is mostly
unnecessary. This commit implements a new GROUP BY DISTINCT feature, as
defined in the SQL standard, which eliminates the duplicate sets.
Author: Vik Fearing
Reviewed-by: Erik Rijkers, Georgios Kokolatos, Tomas Vondra
Discussion: https://postgr.es/m/bf3805a8-d7d1-ae61-fece-761b7ff41ecc@postgresfriends.org
pg_read_file() is the function that's in core, pg_file_read() is in
adminpack. But when using pg_file_read() in adminpack it calls the *C*
level function pg_read_file() in core, which probably threw the original
author off. But the error hint should be about the SQL function.
Reported-By: Sergei Kornilov
Backpatch-through: 11
Discussion: https://postgr.es/m/373021616060475@mail.yandex.ru
When a foreign key constraint is applied to a partitioned table, each
leaf partition inherits a similar FK constraint. We were processing all
of those constraints independently, meaning that in large partitioning
trees we'd build up large collections of cached FK-checking query plans.
However, in all cases but one, the generated queries are actually
identical for all members of the inheritance tree (because, in most
cases, the query only mentions the topmost table of the other side of
the FK relationship). So we can share a single cached plan among all
the partitions, saving memory, not to mention time to build and maintain
the cached plans.
Keisuke Kuroda and Amit Langote
Discussion: https://postgr.es/m/cab4b85d-9292-967d-adf2-be0d803c3e23@nttcom.co.jp_1
This commit adds new GUC track_wal_io_timing. When this is enabled,
the total amounts of time XLogWrite writes and issue_xlog_fsync syncs
WAL data to disk are counted in pg_stat_wal. This information would be
useful to check how much WAL write and sync affect the performance.
Enabling track_wal_io_timing will make the server query the operating
system for the current time every time WAL is written or synced,
which may cause significant overhead on some platforms. To avoid such
additional overhead in the server with track_io_timing enabled,
this commit introduces track_wal_io_timing as a separate parameter from
track_io_timing.
Note that WAL write and sync activity by walreceiver has not been tracked yet.
This commit makes the server also track the numbers of times XLogWrite
writes and issue_xlog_fsync syncs WAL data to disk, in pg_stat_wal,
regardless of the setting of track_wal_io_timing. This counters can be
used to calculate the WAL write and sync time per request, for example.
Bump PGSTAT_FILE_FORMAT_ID.
Bump catalog version.
Author: Masahiro Ikeda
Reviewed-By: Japin Li, Hayato Kuroda, Masahiko Sawada, David Johnston, Fujii Masao
Discussion: https://postgr.es/m/0509ad67b585a5b86a83d445dfa75392@oss.nttdata.com
PostgreSQL disabled compression as of e3bdb2d and the documentation
recommends against using it since. Additionally, SSL compression has
been disabled in OpenSSL since version 1.1.0, and was disabled in many
distributions long before that. The most recent TLS version, TLSv1.3,
disallows compression at the protocol level.
This commit removes the feature itself, removing support for the libpq
parameter sslcompression (parameter still listed for compatibility
reasons with existing connection strings, just ignored), and removes
the equivalent field in pg_stat_ssl and de facto PgBackendSSLStatus.
Note that, on top of removing the ability to activate compression by
configuration, compression is actively disabled in both frontend and
backend to avoid overrides from local configurations.
A TAP test is added for deprecated SSL parameters to check after
backwards compatibility.
Bump catalog version.
Author: Daniel Gustafsson
Reviewed-by: Peter Eisentraut, Magnus Hagander, Michael Paquier
Discussion: https://postgr.es/m/7E384D48-11C5-441B-9EC3-F7DB1F8518F6@yesql.se
Point to the specific line where the error was detected; the
previous code tended to include several preceding lines as well.
Avoid re-scanning the entire input to recompute which line that
was. Simplify the logic a bit. Add test cases.
Simon Riggs and Hamid Akhtar, reviewed by Daniel Gustafsson and myself
Discussion: https://postgr.es/m/CANbhV-EPBnXm3MF_TTWBwwqgn1a1Ghmep9VHfqmNBQ8BT0f+_g@mail.gmail.com
The code paths for three different OSes finished up with three different
ways of excluding C[.xxx] and POSIX from consideration. Merge them.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com
Instead of an unsightly internal "cache lookup failed" message, just
return NULL for bad OIDs, as is the convention for other similar things.
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com
Commit 2f2007fbb did this partially, but there were two remaining
warts. checkcondition_gin handled some uncertain cases by setting
the out-of-band recheck flag, some by returning TS_MAYBE, and some
by doing both. Meanwhile, TS_execute arbitrarily converted a
TS_MAYBE result to TS_YES. Thus, if checkcondition_gin chose to
only return TS_MAYBE, the outcome would be TS_YES with no recheck
flag, potentially resulting in wrong query outputs.
The case where this'd happen is if there were GIN_MAYBE entries
in the indexscan results passed to gin_tsquery_[tri]consistent,
which so far as I can see would only happen if the tidbitmap used
to accumulate indexscan results grew large enough to become lossy.
I initially thought of fixing this by ensuring we always set the
recheck flag as well as returning TS_MAYBE in uncertain cases.
But that errs in the other direction, potentially forcing rechecks
of rows that provably match the query (since the recheck flag
remains set even if TS_execute later finds that the answer must be
TS_YES). Instead, let's get rid of the out-of-band recheck flag
altogether and rely on returning TS_MAYBE. This requires exporting
a version of TS_execute that will actually return the full ternary
result of the evaluation ... but we likely should have done that
to start with.
Unfortunately it doesn't seem practical to add a regression test case
that covers this: the amount of data needed to cause the GIN bitmap to
become lossy results in a longer runtime than I think we want to have
in the tests. (I'm wondering about allowing smaller work_mem settings
to ameliorate that, but it'd be a matter for a separate patch.)
Per bug #16865 from Dimitri Nüscheler. Back-patch to v13 where
the faulty commit came in.
Discussion: https://postgr.es/m/16865-4ffdc3e682e6d75b@postgresql.org
This commit adds new column "waitstart" into pg_locks view. This column
reports the time when the server process started waiting for the lock
if the lock is not held. This information is useful, for example, when
examining the amount of time to wait on a lock by subtracting
"waitstart" in pg_locks from the current time, and identify the lock
that the processes are waiting for very long.
This feature uses the current time obtained for the deadlock timeout
timer as "waitstart" (i.e., the time when this process started waiting
for the lock). Since getting the current time newly can cause overhead,
we reuse the already-obtained time to avoid that overhead.
Note that "waitstart" is updated without holding the lock table's
partition lock, to avoid the overhead by additional lock acquisition.
This can cause "waitstart" in pg_locks to become NULL for a very short
period of time after the wait started even though "granted" is false.
This is OK in practice because we can assume that users are likely to
look at "waitstart" when waiting for the lock for a long time.
The first attempt of this patch (commit 3b733fcd04) caused the buildfarm
member "rorqual" (built with --disable-atomics --disable-spinlocks) to report
the failure of the regression test. It was reverted by commit 890d2182a2.
The cause of this failure was that the atomic variable for "waitstart"
in the dummy process entry created at the end of prepare transaction was
not initialized. This second attempt fixes that issue.
Bump catalog version.
Author: Atsushi Torikoshi
Reviewed-by: Ian Lawrence Barwick, Robert Haas, Justin Pryzby, Fujii Masao
Discussion: https://postgr.es/m/a96013dc51cdc56b2a2b84fa8a16a993@oss.nttdata.com
With its current design, a careless use of pg_cryptohash_final() could
would result in an out-of-bound write in memory as the size of the
destination buffer to store the result digest is not known to the
cryptohash internals, without the caller knowing about that. This
commit adds a new argument to pg_cryptohash_final() to allow such sanity
checks, and implements such defenses.
The internals of SCRAM for HMAC could be tightened a bit more, but as
everything is based on SCRAM_KEY_LEN with uses particular to this code
there is no need to complicate its interface more than necessary, and
this comes back to the refactoring of HMAC in core. Except that, this
minimizes the uses of the existing DIGEST_LENGTH variables, relying
instead on sizeof() for the result sizes. In ossp-uuid, this also makes
the code more defensive, as it already relied on dce_uuid_t being at
least the size of a MD5 digest.
This is in philosophy similar to cfc40d3 for base64.c and aef8948 for
hex.c.
Reported-by: Ranier Vilela
Author: Michael Paquier, Ranier Vilela
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/CAEudQAoqEGmcff3J4sTSV-R_16Monuz-UpJFbf_dnVH=APr02Q@mail.gmail.com
Given a regex pattern with a very long fixed prefix (approaching 500
characters), the result of pow(FIXED_CHAR_SEL, fixed_prefix_len) can
underflow to zero. Typically the preceding selectivity calculation
would have underflowed as well, so that we compute 0/0 and get NaN.
In released branches this leads to an assertion failure later on.
That doesn't happen in HEAD, for reasons I've not explored yet,
but it's surely still a bug.
To fix, just skip the division when the pow() result is zero, so
that we'll (most likely) return a zero selectivity estimate. In
the edge cases where "sel" didn't yet underflow, perhaps this
isn't desirable, but I'm not sure that the case is worth spending
a lot of effort on. The results of regex_selectivity_sub() are
barely worth the electrons they're written on anyway :-(
Per report from Alexander Lakhin. Back-patch to all supported versions.
Discussion: https://postgr.es/m/6de0a0c3-ada9-cd0c-3e4e-2fa9964b41e3@gmail.com
Explicitly testing for INT_MIN and INT_MAX isn't particularly good
style; it's tedious and may draw useless compiler warnings on
machines where int and long are the same width. We invented
strtoint() precisely for this usage, so use that instead.
While here, remove gratuitous variations in the way the tests for
did-strtoint-succeed were spelled. Also, avoid attempting to
negate INT_MIN; that would probably work given that the result
is implicitly cast to uint32, but I think it's nominally undefined
behavior.
Per gripe from Ranier Vilela, though this isn't his proposed patch.
Discussion: https://postgr.es/m/CAEudQAqge3QfzoBRhe59QrB_5g+NmQUj2QpzqZ9Nc7QepXGAEw@mail.gmail.com
This commit adds new column "waitstart" into pg_locks view. This column
reports the time when the server process started waiting for the lock
if the lock is not held. This information is useful, for example, when
examining the amount of time to wait on a lock by subtracting
"waitstart" in pg_locks from the current time, and identify the lock
that the processes are waiting for very long.
This feature uses the current time obtained for the deadlock timeout
timer as "waitstart" (i.e., the time when this process started waiting
for the lock). Since getting the current time newly can cause overhead,
we reuse the already-obtained time to avoid that overhead.
Note that "waitstart" is updated without holding the lock table's
partition lock, to avoid the overhead by additional lock acquisition.
This can cause "waitstart" in pg_locks to become NULL for a very short
period of time after the wait started even though "granted" is false.
This is OK in practice because we can assume that users are likely to
look at "waitstart" when waiting for the lock for a long time.
Bump catalog version.
Author: Atsushi Torikoshi
Reviewed-by: Ian Lawrence Barwick, Robert Haas, Justin Pryzby, Fujii Masao
Discussion: https://postgr.es/m/a96013dc51cdc56b2a2b84fa8a16a993@oss.nttdata.com
This follows in the spirit of commit dfb75e478, which created primary
key and uniqueness constraints to improve the visibility of constraints
imposed on the system catalogs. While our catalogs contain many
foreign-key-like relationships, they don't quite follow SQL semantics,
in that the convention for an omitted reference is to write zero not
NULL. Plus, we have some cases in which there are arrays each of whose
elements is supposed to be an FK reference; SQL has no way to model that.
So we can't create actual foreign key constraints to describe the
situation. Nonetheless, we can collect and use knowledge about these
relationships.
This patch therefore adds annotations to the catalog header files to
declare foreign-key relationships. (The BKI_LOOKUP annotations cover
simple cases, but we weren't previously distinguishing which such
columns are allowed to contain zeroes; we also need new markings for
multi-column FK references.) Then, Catalog.pm and genbki.pl are
taught to collect this information into a table in a new generated
header "system_fk_info.h". The only user of that at the moment is
a new SQL function pg_get_catalog_foreign_keys(), which exposes the
table to SQL. The oidjoins regression test is rewritten to use
pg_get_catalog_foreign_keys() to find out which columns to check.
Aside from removing the need for manual maintenance of that test
script, this allows it to cover numerous relationships that were not
checked by the old implementation based on findoidjoins. (As of this
commit, 217 relationships are checked by the test, versus 181 before.)
Discussion: https://postgr.es/m/3240355.1612129197@sss.pgh.pa.us
This adds the SQL standard feature that adds the SEARCH and CYCLE
clauses to recursive queries to be able to do produce breadth- or
depth-first search orders and detect cycles. These clauses can be
rewritten into queries using existing syntax, and that is what this
patch does in the rewriter.
Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/db80ceee-6f97-9b4a-8ee8-3ba0c58e5be2@2ndquadrant.com
On machines where sizeof(Datum) > sizeof(Oid) (that is, any 64-bit
platform), the previous coding would compute a misaligned
workspace->index pointer if nupper is odd. Architectures where
misaligned access is a hard no-no would then fail. This appears
to explain why thorntail is unhappy but other buildfarm members
are not.
The same code pattern was repeated four times when compiling a SHA-2
hash. This refactoring has the advantage to issue a compilation warning
if a new value is added to pg_cryptohash_type, so as anybody doing an
addition in this area would need to consider if support for a new SQL
function is needed or not.
Author: Sehrope Sarkuni, Michael Paquier
Discussion: https://postgr.es/m/YA7DvLRn2xnTgsMc@paquier.xyz
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins. This could result in a malformed plan. The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.
The right value to use is the join level we actually intend to evaluate
the PHV at. We can get that from the ph_eval_at field of the associated
PlaceHolderInfo. However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level. (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.) Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.
The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos. We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter. (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)
Back-patch to v12. The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.
Per report from Stephan Springl.
Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
This add counters for number of sessions, the different kind of session
termination types, and timers for how much time is spent in active vs
idle in a database to pg_stat_database.
Internally this also renames the parameter "force" to disconnect. This
was the only use-case for the parameter before, so repurposing it to
this mroe narrow usecase makes things cleaner than inventing something
new.
Author: Laurenz Albe
Reviewed-By: Magnus Hagander, Soumyadeep Chakraborty, Masahiro Ikeda
Discussion: https://postgr.es/m/b07e1f9953701b90c66ed368656f2aef40cac4fb.camel@cybertec.at
This commit addresses some issues with c3826f83 that moved the hex
decoding routine to src/common/:
- The decoding function lacked overflow checks, so when used for
security-related features it was an open door to out-of-bound writes if
not carefully used that could remain undetected. Like the base64
routines already in src/common/ used by SCRAM, this routine is reworked
to check for overflows by having the size of the destination buffer
passed as argument, with overflows checked before doing any writes.
- The encoding routine was missing. This is moved to src/common/ and
it gains the same overflow checks as the decoding part.
On failure, the hex routines of src/common/ issue an error as per the
discussion done to make them usable by frontend tools, but not by shared
libraries. Note that this is why ECPG is left out of this commit, and
it still includes a duplicated logic doing hex encoding and decoding.
While on it, this commit uses better variable names for the source and
destination buffers in the existing escape and base64 routines in
encode.c and it makes them more robust to overflow detection. The
previous core code issued a FATAL after doing out-of-bound writes if
going through the SQL functions, which would be enough to detect
problems when working on changes that impacted this area of the
code. Instead, an error is issued before doing an out-of-bound write.
The hex routines were being directly called for bytea conversions and
backup manifests without such sanity checks. The current calls happen
to not have any problems, but careless uses of such APIs could easily
lead to CVE-class bugs.
Author: Bruce Momjian, Michael Paquier
Reviewed-by: Sehrope Sarkuni
Discussion: https://postgr.es/m/20201231003557.GB22199@momjian.us
Forced cache invalidation (CLOBBER_CACHE_ALWAYS) has been impractical
to use for testing in PostgreSQL because it's so slow and because it's
toggled on/off only at build time. It is helpful when hunting bugs in
any code that uses the sycache/relcache because causes cache
invalidations to be injected whenever it would be possible for an
invalidation to occur, whether or not one was really pending.
Address this by providing run-time control over cache clobber
behaviour using the new debug_invalidate_system_caches_always GUC.
Support is not compiled in at all unless assertions are enabled or
CLOBBER_CACHE_ENABLED is explicitly defined at compile time. It
defaults to 0 if compiled in, so it has negligible effect on assert
build performance by default.
When support is compiled in, test code can now set
debug_invalidate_system_caches_always=1 locally to a backend to test
specific queries, functions, extensions, etc. Or tests can toggle it
globally for a specific test case while retaining normal performance
during test setup and teardown.
For backwards compatibility with existing test harnesses and scripts,
debug_invalidate_system_caches_always defaults to 1 if
CLOBBER_CACHE_ALWAYS is defined, and to 3 if CLOBBER_CACHE_RECURSIVE
is defined.
CLOBBER_CACHE_ENABLED is now visible in pg_config_manual.h, as is the
related RECOVER_RELATION_BUILD_MEMORY setting for the relcache.
Author: Craig Ringer <craig.ringer@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CAMsr+YF=+ctXBZj3ywmvKNUjWpxmuTuUKuv-rgbHGX5i5pLstQ@mail.gmail.com
In power_var_int(), the computation of the number of significant
digits to use in the computation used log(Abs(exp)), which isn't safe
because Abs(exp) returns INT_MIN when exp is INT_MIN. Use fabs()
instead of Abs(), so that the exponent is cast to a double before the
absolute value is taken.
Back-patch to 9.6, where this was introduced (by 7d9a4737c2).
Discussion: https://postgr.es/m/CAEZATCVd6pMkz=BrZEgBKyqqJrt2xghr=fNc8+Z=5xC6cgWrWA@mail.gmail.com
If the substring start index and length overflow when added together,
substring() misbehaved, either throwing a bogus "negative substring
length" error on a case that should succeed, or failing to complain that
a negative length is negative (and instead returning the whole string,
in most cases). Unsurprisingly, the text, bytea, and bit variants of
the function all had this issue. Rearrange the logic to ensure that
negative lengths are always rejected, and add an overflow check to
handle the other case.
Also install similar guards into detoast_attr_slice() (nee
heap_tuple_untoast_attr_slice()), since it's far from clear that
no other code paths leading to that function could pass it values
that would overflow.
Patch by myself and Pavel Stehule, per bug #16804 from Rafi Shamim.
Back-patch to v11. While these bugs are old, the common/int.h
infrastructure for overflow-detecting arithmetic didn't exist before
commit 4d6ad3125, and it doesn't seem like these misbehaviors are bad
enough to justify developing a standalone fix for the older branches.
Discussion: https://postgr.es/m/16804-f4eeeb6c11ba71d4@postgresql.org
Performance issues with the preceding patch to re-implement array
element assignment within pl/pgsql led me to realize that the read/write
parameter mechanism is misdesigned. Instead of requiring the assignment
source expression to be such that *all* its references to the target
variable could be passed as R/W, we really want to identify *one*
reference to the target variable to be passed as R/W, allowing any other
ones to be passed read/only as they would be by default. As long as the
R/W reference is a direct argument to the top-level (hence last to be
executed) function in the expression, there is no harm in R/O references
being passed to other lower parts of the expression. Nor is there any
use-case for more than one argument of the top-level function being R/W.
Hence, rewrite that logic to identify one single Param that references
the target variable, and make only that Param pass a read/write
reference, not any other Params referencing the target variable.
Discussion: https://postgr.es/m/4165684.1607707277@sss.pgh.pa.us
This commit preserves the logic of multirange_in() but makes it more clear
what's going on. Also, this commit fixes the compiler warning spotted by the
buildfarm.
Reported-by: Tom Lane
Discussion: https://postgr.es/m/2246043.1609290699%40sss.pgh.pa.us
Since at least 2001 we've used putenv() and avoided setenv(), on the
grounds that the latter was unportable and not in POSIX. However,
POSIX added it that same year, and by now the situation has reversed:
setenv() is probably more portable than putenv(), since POSIX now
treats the latter as not being a core function. And setenv() has
cleaner semantics too. So, let's reverse that old policy.
This commit adds a simple src/port/ implementation of setenv() for
any stragglers (we have one in the buildfarm, but I'd not be surprised
if that code is never used in the field). More importantly, extend
win32env.c to also support setenv(). Then, replace usages of putenv()
with setenv(), and get rid of some ad-hoc implementations of setenv()
wannabees.
Also, adjust our src/port/ implementation of unsetenv() to follow the
POSIX spec that it returns an error indicator, rather than returning
void as per the ancient BSD convention. I don't feel a need to make
all the call sites check for errors, but the portability stub ought
to match real-world practice.
Discussion: https://postgr.es/m/2065122.1609212051@sss.pgh.pa.us
Attempt to get selectivity estimation for @> (anymultirange, anyrange) operator
caused an error in buildfarm, because this operator was missed in switch()
of calc_hist_selectivity(). Fix that and also make regression tests reliably
check that selectivity estimation for (multi)ranges doesn't fall. Previously,
whether we test selectivity estimation for (multi)ranges depended on
whether autovacuum managed to gather concurrently to the test.
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/X%2BwmgjRItuvHNBeV%40paquier.xyz
6df7a9698b has introduced a set of operators between ranges and multiranges.
Existing GiST indexes for ranges could easily support majority of them.
This commit adds support for new operators to the existing range GiST indexes.
New operators resides the same strategy numbers as existing ones. Appropriate
check function is determined using the subtype.
Catversion is bumped.
There is a set of *_internal() functions exposed in
include/utils/multirangetypes.h. This commit improves the signatures of these
functions in two ways.
* Add const qualifies where applicable.
* Replace multirange typecache argument with range typecache argument.
Multirange typecache was used solely to find the range typecache. At the
same time, range typecache is easier for the caller to find.
We have operators for checking if the multirange contains a range but don't
have the opposite. This commit improves completeness of the operator set by
adding two new operators: @> (anyrange,anymultirange) and
<@(anymultirange,anyrange).
Catversion is bumped.
Two functions multirange_range_overlaps_bsearch_comparison() and
multirange_range_contains_bsearch_comparison() contain bugs of returning -1
instead of 1. This commit fixes these bugs and adds corresponding regression
tests.
This makes existing sessions reflect "ALTER ROLE ... [NO]INHERIT" as
quickly as they have been reflecting "GRANT role_name". Back-patch to
9.5 (all supported versions).
Reviewed by Nathan Bossart.
Discussion: https://postgr.es/m/20201221095028.GB3777719@rfd.leadboat.com
The jsonb || jsonb operator arbitrarily rejected certain combinations
of scalar and non-scalar inputs, while being willing to concatenate
other combinations. This was of course quite undocumented. Rather
than trying to document it, let's just remove the restriction,
creating a uniform rule that unless we are handling an object-to-object
concatenation, non-array inputs are converted to one-element arrays,
resulting in an array-to-array concatenation. (This does not change
the behavior for any case that didn't throw an error before.)
Per complaint from Joel Jacobson. Back-patch to all supported branches.
Discussion: https://postgr.es/m/163099.1608312033@sss.pgh.pa.us
6df7a9698b introduces multirange types, whose typanalyze function shares
infrastructure with range types typanalyze function. Since 6df7a9698b,
information about type gathered by statistics is filled from typcache.
But typalign is mistakenly always set to double. This commit fixes this
oversight.
Multiranges are basically sorted arrays of non-overlapping ranges with
set-theoretic operations defined over them.
Since v14, each range type automatically gets a corresponding multirange
datatype. There are both manual and automatic mechanisms for naming multirange
types. Once can specify multirange type name using multirange_type_name
attribute in CREATE TYPE. Otherwise, a multirange type name is generated
automatically. If the range type name contains "range" then we change that to
"multirange". Otherwise, we add "_multirange" to the end.
Implementation of multiranges comes with a space-efficient internal
representation format, which evades extra paddings and duplicated storage of
oids. Altogether this format allows fetching a particular range by its index
in O(n).
Statistic gathering and selectivity estimation are implemented for multiranges.
For this purpose, stored multirange is approximated as union range without gaps.
This field will likely need improvements in the future.
Catversion is bumped.
Discussion: https://postgr.es/m/CALNJ-vSUpQ_Y%3DjXvTxt1VYFztaBSsWVXeF1y6gTYQ4bOiWDLgQ%40mail.gmail.com
Discussion: https://postgr.es/m/a0b8026459d1e6167933be2104a6174e7d40d0ab.camel%40j-davis.com#fe7218c83b08068bfffb0c5293eceda0
Author: Paul Jungwirth, revised by me
Reviewed-by: David Fetter, Corey Huinker, Jeff Davis, Pavel Stehule
Reviewed-by: Alvaro Herrera, Tom Lane, Isaac Morland, David G. Johnston
Reviewed-by: Zhihong Yu, Alexander Korotkov
Invent a new flag bit HASH_STRINGS to specify C-string hashing, which
was formerly the default; and add assertions insisting that exactly
one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set.
This is in hopes of preventing recurrences of the type of oversight
fixed in commit a1b8aa1e4 (i.e., mistakenly omitting HASH_BLOBS).
Also, when HASH_STRINGS is specified, insist that the keysize be
more than 8 bytes. This is a heuristic, but it should catch
accidental use of HASH_STRINGS for integer or pointer keys.
(Nearly all existing use-cases set the keysize to NAMEDATALEN or
more, so there's little reason to think this restriction should
be problematic.)
Tweak hash_create() to insist that the HASH_ELEM flag be set, and
remove the defaults it had for keysize and entrysize. Since those
defaults were undocumented and basically useless, no callers
omitted HASH_ELEM anyway.
Also, remove memset's zeroing the HASHCTL parameter struct from
those callers that had one. This has never been really necessary,
and while it wasn't a bad coding convention it was confusing that
some callers did it and some did not. We might as well save a few
cycles by standardizing on "not".
Also improve the documentation for hash_create().
In passing, improve reinit.c's usage of a hash table by storing
the key as a binary Oid rather than a string; and, since that's
a temporary hash table, allocate it in CurrentMemoryContext for
neatness.
Discussion: https://postgr.es/m/590625.1607878171@sss.pgh.pa.us
This patch generalizes the subscripting infrastructure so that any
data type can be subscripted, if it provides a handler function to
define what that means. Traditional variable-length (varlena) arrays
all use array_subscript_handler(), while the existing fixed-length
types that support subscripting use raw_array_subscript_handler().
It's expected that other types that want to use subscripting notation
will define their own handlers. (This patch provides no such new
features, though; it only lays the foundation for them.)
To do this, move the parser's semantic processing of subscripts
(including coercion to whatever data type is required) into a
method callback supplied by the handler. On the execution side,
replace the ExecEvalSubscriptingRef* layer of functions with direct
calls to callback-supplied execution routines. (Thus, essentially
no new run-time overhead should be caused by this patch. Indeed,
there is room to remove some overhead by supplying specialized
execution routines. This patch does a little bit in that line,
but more could be done.)
Additional work is required here and there to remove formerly
hard-wired assumptions about the result type, collation, etc
of a SubscriptingRef expression node; and to remove assumptions
that the subscript values must be integers.
One useful side-effect of this is that we now have a less squishy
mechanism for identifying whether a data type is a "true" array:
instead of wiring in weird rules about typlen, we can look to see
if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER. For this
to be bulletproof, we have to forbid user-defined types from using
that handler directly; but there seems no good reason for them to
do so.
This patch also removes assumptions that the number of subscripts
is limited to MAXDIM (6), or indeed has any hard-wired limit.
That limit still applies to types handled by array_subscript_handler
or raw_array_subscript_handler, but to discourage other dependencies
on this constant, I've moved it from c.h to utils/array.h.
Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov,
Peter Eisentraut, Pavel Stehule
Discussion: https://postgr.es/m/CA+q6zcVDuGBv=M0FqBYX8DPebS3F_0KQ6OVFobGJPM507_SZ_w@mail.gmail.com
Discussion: https://postgr.es/m/CA+q6zcVovR+XY4mfk-7oNk-rF91gH0PebnNfuUjuuDsyHjOcVA@mail.gmail.com
It was still using a scan of pg_depend instead of using the conindid
column that has been added since.
Since it is now just a catalog lookup wrapper and not related to
pg_depend, move from pg_depend.c to lsyscache.c.
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/4688d55c-9a2e-9a5a-d166-5f24fe0bf8db%40enterprisedb.com
87ae969 has created two new files called cryptohash{_openssl}.c in
src/common/, whose names overlap with the existing backend file called
cryptohashes.c dedicated to the SQL wrappers for SHA2 and MD5. This
file is renamed to cryptohashfuncs.c to be more consistent with the
surroundings and reduce the confusion with the new cryptohash interface
of src/common/.
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/X8hHhaQgbMbW+aGU@paquier.xyz
Commit 6b466bf5f2 allowed pg_stat_statements to track the number of
WAL records, full page images and bytes that each statement generated.
Similarly this commit allows us to track the cluster-wide WAL statistics
counters.
New columns wal_records, wal_fpi and wal_bytes are added into the
pg_stat_wal view, and reports the total number of WAL records,
full page images and bytes generated in the , respectively.
Author: Masahiro Ikeda
Reviewed-by: Amit Kapila, Movead Li, Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/35ef960128b90bfae3b3fdf60a3a860f@oss.nttdata.com
Two new routines to allocate a hash context and to free it are created,
as these become necessary for the goal behind this refactoring: switch
the all cryptohash implementations for OpenSSL to use EVP (for FIPS and
also because upstream does not recommend the use of low-level cryptohash
functions for 20 years). Note that OpenSSL hides the internals of
cryptohash contexts since 1.1.0, so it is necessary to leave the
allocation to OpenSSL itself, explaining the need for those two new
routines. This part is going to require more work to properly track
hash contexts with resource owners, but this not introduced here.
Still, this refactoring makes the move possible.
This reduces the number of routines for all SHA2 implementations from
twelve (SHA{224,256,386,512} with init, update and final calls) to five
(create, free, init, update and final calls) by incorporating the hash
type directly into the hash context data.
The new cryptohash routines are moved to a new file, called cryptohash.c
for the fallback implementations, with SHA2 specifics becoming a part
internal to src/common/. OpenSSL specifics are part of
cryptohash_openssl.c. This infrastructure is usable for more hash
types, like MD5 or HMAC.
Any code paths using the internal SHA2 routines are adapted to report
correctly errors, which are most of the changes of this commit. The
zones mostly impacted are checksum manifests, libpq and SCRAM.
Note that e21cbb4 was a first attempt to switch SHA2 to EVP, but it
lacked the refactoring needed for libpq, as done here.
This patch has been tested on Linux and Windows, with and without
OpenSSL, and down to 1.0.1, the oldest version supported on HEAD.
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20200924025314.GE7405@paquier.xyz
currtid() and currtid2() are an undocumented set of functions whose sole
known user is the Postgres ODBC driver, able to retrieve the latest TID
version for a tuple given by the caller of those functions.
As used by Postgres ODBC, currtid() is a shortcut able to retrieve the
last TID loaded into a backend by passing an OID of 0 (magic value)
after a tuple insertion. This is removed in this commit, as it became
obsolete after the driver began using "RETURNING ctid" with inserts, a
clause supported since Postgres 8.2 (using RETURNING is better for
performance anyway as it reduces the number of round-trips to the
backend).
currtid2() is still used by the driver, so this remains around for now.
Note that this function is kept in its original shape for backward
compatibility reasons.
Per discussion with many people, including Andres Freund, Peter
Eisentraut, Álvaro Herrera, Hiroshi Inoue, Tom Lane and myself.
Bump catalog version.
Discussion: https://postgr.es/m/20200603021448.GB89559@paquier.xyz
Since we're assuming IEEE floats these days, there seems little
reason not to do this. It has the advantage that when the slope is
computed as infinite due to the presence of Inf coordinates, we get
saner behavior than before from line_construct(), and thence also
in some dependent operations such as finding the closest point.
Also fix line_construct() to special-case slope zero. The previous
coding got the right answer in most cases, but it could compute
C as NaN when the point has Inf coordinates.
Discussion: https://postgr.es/m/CAGf+fX70rWFOk5cd00uMfa__0yP+vtQg5ck7c2Onb-Yczp0URA@mail.gmail.com
"FPeq(infinity, infinity)" returned false, on account of getting NaN
when it subtracts the two inputs. Fix that by adding a separate
check for exact equality.
FPle() and FPge() similarly got the wrong answer for two like-signed
infinities. In those cases, we can just rearrange the comparisons
to avoid potentially subtracting infinities.
While the sibling functions FPne() etc accidentally gave the right
answers even with the internal NaN results, it seems best to make
similar adjustments to them to avoid depending on this.
FPeq() has to be converted to an inline function to avoid double
evaluations of its arguments, and I did the same for the others
just for consistency.
In passing, make the handling of NaN cases in line_eq() and
point_eq_point() simpler and easier to reason about, and perhaps
faster.
This results in just one visible regression test change: slope()
now gives DBL_MAX for two inputs of (inf,1e300), which is consistent
with what it does for (1e300,inf), so that seems like a bug fix.
Discussion: https://postgr.es/m/CAGf+fX70rWFOk5cd00uMfa__0yP+vtQg5ck7c2Onb-Yczp0URA@mail.gmail.com
Add hash functions for the record type as well as a hash operator
family and operator class for the record type. This enables all the
hash functionality for the record type such as hash-based plans for
UNION/INTERSECT/EXCEPT DISTINCT, recursive queries using UNION
DISTINCT, hash joins, and hash partitioning.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/38eccd35-4e2d-6767-1b3c-dada1eac3124%402ndquadrant.com
Buildfarm members that lack both HAVE_LOCALE_T and USE_ICU have been
complaining about pg_newlocale_from_collation's collcollate variable.
This is evidently fallout from commit 7d1297df0, which removed the
only usage outside those two #ifdef'd code paths. Mark the variable
pg_attribute_unused(), like its sibling collctype, which has been that
way for a long time.
This provides a handy way to get, say, the last field of the string.
Use of a negative index in this way has precedent in the nearby
left() and right() functions.
The implementation scans the string twice when N < -1, but it seems
likely that N = -1 will be the huge majority of actual use cases,
so I'm not really excited about adding complexity to avoid that.
Nikhil Benesch, reviewed by Jacob Champion; cosmetic tweakage by me
Discussion: https://postgr.es/m/cbb7f861-6162-3a51-9823-97bc3aa0b638@gmail.com
Introduce TimestampDifferenceMilliseconds() to simplify callers
that would rather have the difference in milliseconds, instead of
the select()-oriented seconds-and-microseconds format. This gets
rid of at least one integer division per call, and it eliminates
some apparently-easy-to-mess-up arithmetic.
Two of these call sites were in fact wrong:
* pg_prewarm's autoprewarm_main() forgot to multiply the seconds
by 1000, thus ending up with a delay 1000X shorter than intended.
That doesn't quite make it a busy-wait, but close.
* postgres_fdw's pgfdw_get_cleanup_result() thought it needed to compute
microseconds not milliseconds, thus ending up with a delay 1000X longer
than intended. Somebody along the way had noticed this problem but
misdiagnosed the cause, and imposed an ad-hoc 60-second limit rather
than fixing the units. This was relatively harmless in context, because
we don't care that much about exactly how long this delay is; still,
it's wrong.
There are a few more callers of TimestampDifference() that don't
have a direct need for seconds-and-microseconds, but can't use
TimestampDifferenceMilliseconds() either because they do need
microsecond precision or because they might possibly deal with
intervals long enough to overflow 32-bit milliseconds. It might be
worth inventing another API to improve that, but that seems outside
the scope of this patch; so those callers are untouched here.
Given the fact that we are fixing some bugs, and the likelihood
that future patches might want to back-patch code that uses this
new API, back-patch to all supported branches.
Alexey Kondratov and Tom Lane
Discussion: https://postgr.es/m/3b1c053a21c07c1ed5e00be3b2b855ef@postgrespro.ru
Commit 257836a7 included an assertion that a version lookup routine is
not trying to look up "C" or "POSIX", but that case is reachable with
the user-facing SQL function pg_collation_actual_version(). Remove the
assertion.
Move the system catalog index declarations from catalog/indexing.h to
the respective parent tables' catalog/pg_*.h files. The original
reason for having it split was that the old genbki system produced the
output in the order of the catalog files it read, so all the indexing
stuff needed to come separately. But this is no longer the case, and
keeping it together makes more sense.
Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/c7cc82d6-f976-75d6-2e3e-b03d2cab26bb@2ndquadrant.com
The SQL spec calls out nonstandard syntax for certain function calls,
for example substring() with numeric position info is supposed to be
spelled "SUBSTRING(string FROM start FOR count)". We accept many
of these things, but up to now would not print them in the same format,
instead simplifying down to "substring"(string, start, count).
That's long annoyed me because it creates an interoperability
problem: we're gratuitously injecting Postgres-specific syntax into
what might otherwise be a perfectly spec-compliant view definition.
However, the real reason for addressing it right now is to support
a planned change in the semantics of EXTRACT() a/k/a date_part().
When we switch that to returning numeric, we'll have the parser
translate EXTRACT() to some new function name (might as well be
"extract" if you ask me) and then teach ruleutils.c to reverse-list
that per SQL spec. In this way existing calls to date_part() will
continue to have the old semantics.
To implement this, invent a new CoercionForm value COERCE_SQL_SYNTAX,
and make the parser insert that rather than COERCE_EXPLICIT_CALL when
the input has SQL-spec decoration. (But if the input has the form of
a plain function call, continue to mark it COERCE_EXPLICIT_CALL, even
if it's calling one of these functions.) Then ruleutils.c recognizes
COERCE_SQL_SYNTAX as a cue to emit SQL call syntax. It can know
which decoration to emit using hard-wired knowledge about the
functions that could be called this way. (While this solution isn't
extensible without manual additions, neither is the grammar, so this
doesn't seem unmaintainable.) Notice that this solution will
reverse-list a function call with SQL decoration only if it was
entered that way; so dump-and-reload will not by itself produce any
changes in the appearance of views.
This requires adding a CoercionForm field to struct FuncCall.
(I couldn't resist the temptation to rearrange that struct's
field order a tad while I was at it.) FuncCall doesn't appear
in stored rules, so that change isn't a reason for a catversion
bump, but I did one anyway because the new enum value for
CoercionForm fields could confuse old backend code.
Possible future work:
* Perhaps CoercionForm should now be renamed to DisplayForm,
or something like that, to reflect its more general meaning.
This'd require touching a couple hundred places, so it's not
clear it's worth the code churn.
* The SQLValueFunction node type, which was invented partly for
the same goal of improving SQL-compatibility of view output,
could perhaps be replaced with regular function calls marked
with COERCE_SQL_SYNTAX. It's unclear if this would be a net
code savings, however.
Discussion: https://postgr.es/m/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
hash_array_extended() needs to pass PG_GET_COLLATION() to the hash
function of the element type. Otherwise, the hash function of a
collation-aware data type such as text will error out, since the
introduction of nondeterministic collation made hash functions require
a collation, too.
The consequence of this is that before this change, hash partitioning
using an array over text in the partition key would not work.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/32c1fdae-95c6-5dc6-058a-a90330a3b621%40enterprisedb.com
Accept that we can't get versions for such locale names for now. Users
will need to specify the newer language tag format to enable the
collation versioning feature. It's not clear that we can do automatic
conversion from the old style to the new style reliably enough for this
purpose.
Unfortunately, this means that collation versioning probably won't work
for the default collation unless you provide something like en-US at
initdb or CREATE DATABASE time (though, for reasons not yet understood,
it does seem to work on some systems). It'd be nice to find a better
solution, or document this quirk if we settle on it, but this should
unbreak the 3 failing build farm animals in the meantime.
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
make_ruledef() and make_viewdef() were coded to cope with possible
null-ness of these columns, but they've been marked BKI_FORCE_NOT_NULL
for some time. So there's not really any need to do more than what
we do for the other columns of pg_rewrite, i.e. just Assert that
we got non-null results.
(There is a school of thought that says Asserts aren't the thing
to do to check for corrupt data, but surely here is not the place
to start if we want such a policy.)
Also, remove long-dead-if-indeed-it-ever-wasn't-dead handling of
an empty actions list in make_ruledef(). That's an error case
and should be treated as such. (DO INSTEAD NOTHING is represented
by a CMD_NOTHING Query, not an empty list; cf transformRuleStmt.)
Kyotaro Horiguchi, some changes by me
Discussion: https://postgr.es/m/CAEudQApoA=tMTic6xEPYP_hsNZ8XtToVThK_0x7D_aFQYowq3w@mail.gmail.com
Traditionally, the names of fmgroids.h macros for pg_proc OIDs
have been constructed from the prosrc field. But sometimes the
same C function underlies multiple pg_proc entries, forcing us
to make an arbitrary choice of which OID to reference; the other
entries are then not namable via fmgroids.h. Moreover, we could
not have macros at all for pg_proc entries that aren't for
C-coded functions.
Instead, use the proname field, and append the proargtypes field
(replacing inter-argument spaces with underscores) if proname is
not unique. Special-casing unique entries such as F_OIDEQ removes
the need to change a lot of code. Indeed, I can only find two
places in the tree that need to be adjusted; while this changes
quite a few existing entries in fmgroids.h, few of them are
referenced from C code.
With this patch, all entries in pg_proc.dat have macros in fmgroids.h.
Discussion: https://postgr.es/m/472274.1604258384@sss.pgh.pa.us
Some places were using PG_GETARG_UINT32 where PG_GETARG_TRANSACTIONID
would be more appropriate. (Of course, they are the same internally,
so there is no externally visible effect.) To do that, export
PG_GETARG_TRANSACTIONID outside of xid.c. We also export
PG_RETURN_TRANSACTIONID for symmetry, even though there are currently
no external users.
Author: Ashutosh Bapat <ashutosh.bapat@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/d8f6bdd536df403b9b33816e9f7e0b9d@G08CNEXMBPEKD05.g08.fujitsu.local
Record the current version of dependent collations in pg_depend when
creating or rebuilding an index. When accessing the index later, warn
that the index may be corrupted if the current version doesn't match.
Thanks to Douglas Doole, Peter Eisentraut, Christoph Berg, Laurenz Albe,
Michael Paquier, Robert Haas, Tom Lane and others for very helpful
discussion.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> (earlier versions)
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
This model couldn't be extended to cover the default collation, and
didn't have any information about the affected database objects when the
version changed. Remove, in preparation for a follow-up commit that
will add a new mechanism.
Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
This adds the statistics about transactions streamed to the decoding
output plugin from ReorderBuffer. Users can query the
pg_stat_replication_slots view to check these stats and call
pg_stat_reset_replication_slot to reset the stats of a particular slot.
Users can pass NULL in pg_stat_reset_replication_slot to reset stats of
all the slots.
Commit 9868167500 has added the basic infrastructure to capture the stats
of slot and this commit extends the statistics collector to track
additional information about slots.
Bump the catversion as we have added new columns in the catalog entry.
Author: Ajin Cherian and Amit Kapila
Reviewed-by: Sawada Masahiko and Dilip Kumar
Discussion: https://postgr.es/m/CAA4eK1+chpEomLzgSoky-D31qev19AmECNiEAietPQUGEFhtVA@mail.gmail.com
This makes use of CheckBuffer() introduced in c780a7a, adding a SQL
wrapper able to do checks for all the pages of a relation. By default,
all the fork types of a relation are checked, and it is possible to
check only a given relation fork. Note that if the relation given in
input has no physical storage or is temporary, then no errors are
generated, allowing full-database checks when coupled with a simple scan
of pg_class for example. This is not limited to clusters with data
checksums enabled, as clusters without data checksums can still apply
checks on pages using the page headers or for the case of a page full of
zeros.
This function returns a set of tuples consisting of:
- The physical file where a broken page has been detected (without the
segment number as that can be AM-dependent, which can be guessed from
the block number for heap). A relative path from PGPATH is used.
- The block number of the broken page.
By default, only superusers have an access to this function but
execution rights can be granted to other users.
The feature introduced here is still minimal, and more improvements
could be done, like:
- Addition of a start and end block number to run checks on a range
of blocks, which would apply only if one fork type is checked.
- Addition of some progress reporting.
- Throttling, with configuration parameters in function input or
potentially some cost-based GUCs.
Regression tests are added for positive cases in the main regression
test suite, and TAP tests are added for cases involving the emulation of
page corruptions.
Bump catalog version.
Author: Julien Rouhaud, Michael Paquier
Reviewed-by: Masahiko Sawada, Justin Pryzby
Discussion: https://postgr.es/m/CAOBaU_aVvMjQn=ge5qPiJOPMmOj5=ii3st5Q0Y+WuLML5sR17w@mail.gmail.com
Change the attribute 'name' to 'slot_name' in pg_stat_replication_slots
view to make it clear and that way we will be consistent with the other
places like pg_stat_wal_receiver view where we display the same attribute.
In the passing, fix the typo in one of the macros in the related code.
Bump the catversion as we have modified the name in the catalog as well.
Reported-by: Noriyoshi Shinoda
Author: Noriyoshi Shinoda
Reviewed-by: Sawada Masahiko and Amit Kapila
Discussion: https://postgr.es/m/CA+fd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg@mail.gmail.com
A number of places were using appendStringInfo() when they could have been
using appendStringInfoString() instead. While there's no functionality
change there, it's just more efficient to use appendStringInfoString()
when no formatting is required. Likewise for some
appendStringInfoString() calls which were just appending a single char.
We can just use appendStringInfoChar() for that.
Additionally, many places were using appendPQExpBuffer() when they could
have used appendPQExpBufferStr(). Change those too.
Patch by Zhijie Hou, but further searching by me found significantly more
places that deserved the same treatment.
Author: Zhijie Hou, David Rowley
Discussion: https://postgr.es/m/cb172cf4361e4c7ba7167429070979d4@G08CNEXMBPEKD05.g08.fujitsu.local
Multiply before dividing, not the reverse, so that cases that should
produce exact results do produce exact results. (width_bucket_float8
got this right already.) Even when the result is inexact, this avoids
making it more inexact, since only the division step introduces any
imprecision.
While at it, fix compute_bucket() to not uselessly repeat the sign
check already done by its caller, and avoid duplicating the
multiply/divide steps by adjusting variable usage.
Per complaint from Martin Visser. Although this seems like a bug fix,
I'm hesitant to risk changing width_bucket()'s results in stable
branches, so no back-patch.
Discussion: https://postgr.es/m/6FA5117D-6AED-4656-8FEF-B74AC18FAD85@brytlyt.com
While the calculation is not well-defined if the bounds arguments are
infinite, there is a perfectly sane outcome if the test operand is
infinite: it's just like any other value that's before the first bucket
or after the last one. width_bucket_float8() got this right, but
I was too hasty about the case when adding infinities to numerics
(commit a57d312a7), so that width_bucket_numeric() just rejected it.
Fix that, and sync the relevant error message strings.
No back-patch needed, since infinities-in-numeric haven't shipped yet.
Discussion: https://postgr.es/m/2465409.1602170063@sss.pgh.pa.us
This adds the statistics about transactions spilled to disk from
ReorderBuffer. Users can query the pg_stat_replication_slots view to check
these stats and call pg_stat_reset_replication_slot to reset the stats of
a particular slot. Users can pass NULL in pg_stat_reset_replication_slot
to reset stats of all the slots.
This commit extends the statistics collector to track this information
about slots.
Author: Sawada Masahiko and Amit Kapila
Reviewed-by: Amit Kapila and Dilip Kumar
Discussion: https://postgr.es/m/CA+fd4k5_pPAYRTDrO2PbtTOe0eHQpBvuqmCr8ic39uTNmR49Eg@mail.gmail.com
The date-vs-timestamp, date-vs-timestamptz, and timestamp-vs-timestamptz
comparators all worked by promoting the first type to the second and
then doing a simple same-type comparison. This works fine, except
when the conversion result is out of range, in which case we throw an
entirely avoidable error. The sources of such failures are
(a) type date can represent dates much farther in the future than
the timestamp types can;
(b) timezone rotation might cause a just-in-range timestamp value to
become a just-out-of-range timestamptz value.
Up to now we just ignored these corner-case issues, but now we have
an actual user complaint (bug #16657 from Huss EL-Sheikh), so let's
do something about it.
It turns out that commit 52ad1e659 already built all the necessary
infrastructure to support error-free comparisons, but neglected to
actually use it in the main-line code paths. Fix that, do a little
bit of code style review, and remove the now-duplicate logic in
jsonpath_exec.c.
Back-patch to v13 where 52ad1e659 came in. We could take this back
further by back-patching said infrastructure, but given the small
number of complaints so far, I don't feel a great need to.
Discussion: https://postgr.es/m/16657-cde2f876d8cc7971@postgresql.org
This view shows the statistics about WAL activity. Currently it has only
two columns: wal_buffers_full and stats_reset. wal_buffers_full column
indicates the number of times WAL data was written to the disk because
WAL buffers got full. This information is useful when tuning wal_buffers.
stats_reset column indicates the time at which these statistics were
last reset.
pg_stat_wal view is also the basic infrastructure to expose other
various statistics about WAL activity later.
Bump PGSTAT_FILE_FORMAT_ID due to the change in pgstat format.
Bump catalog version.
Author: Masahiro Ikeda
Reviewed-by: Takayuki Tsunakawa, Kyotaro Horiguchi, Amit Kapila, Fujii Masao
Discussion: https://postgr.es/m/188bd3f2d2233cf97753b5ced02bb050@oss.nttdata.com
Previously, a conversion such as
to_date('-44-02-01','YYYY-MM-DD')
would result in '0045-02-01 BC', as the code attempted to interpret
the negative year as BC, but failed to apply the correction needed
for our internal handling of BC years. Fix the off-by-one problem.
Also, arrange for the combination of a negative year and an
explicit "BC" marker to cancel out and produce AD. This is how
the negative-century case works, so it seems sane to do likewise.
Continue to read "year 0000" as 1 BC. Oracle would throw an error,
but we've accepted that case for a long time so I'm hesitant to
change it in a back-patch.
Per bug #16419 from Saeed Hubaishan. Back-patch to all supported
branches.
Dar Alathar-Yemen and Tom Lane
Discussion: https://postgr.es/m/16419-d8d9db0a7553f01b@postgresql.org
Previously we threw an error. But make_date already allowed the case,
so it is inconsistent as well as unhelpful for make_timestamp not to.
Both functions continue to reject year zero.
Code and test fixes by Peter Eisentraut, doc changes by me
Discussion: https://postgr.es/m/13c0992c-f15a-a0ca-d839-91d3efd965d9@2ndquadrant.com
The SQL standard doesn't require jsonpath .datetime() method to support the
ISO 8601 format. But our to_json[b]() functions convert timestamps to text in
the ISO 8601 format in the sake of compatibility with javascript. So, we add
support of the ISO 8601 to the jsonpath .datetime() in the sake compatibility
with to_json[b]().
The standard mode of datetime parsing currently supports just template patterns
and separators in the format string. In order to implement ISO 8601, we have to
add support of the format string double quotes to the standard parsing mode.
Discussion: https://postgr.es/m/94321be0-cc96-1a81-b6df-796f437f7c66%40postgrespro.ru
Author: Nikita Glukhov, revised by me
Backpatch-through: 13
bffe1bd684 has introduced jsonpath .datetime() method, but default formats
for time and timestamp contain excess space between time and timezone. This
commit removes this excess space making behavior of .datetime() method
standard-compliant.
Discussion: https://postgr.es/m/94321be0-cc96-1a81-b6df-796f437f7c66%40postgrespro.ru
Author: Nikita Glukhov
Backpatch-through: 13
We have a dozen or so places that need to iterate over all but the
first cell of a List. Prior to v13 this was typically written as
for_each_cell(lc, lnext(list_head(list)))
Commit 1cff1b95a changed these to
for_each_cell(lc, list, list_second_cell(list))
This patch introduces a new macro for_each_from() which expresses
the start point as a list index, allowing these to be written as
for_each_from(lc, list, 1)
This is marginally more efficient, since ForEachState.i can be
initialized directly instead of backing into it from a ListCell
address. It also seems clearer and less typo-prone.
Some of the remaining uses of for_each_cell() look like they could
profitably be changed to for_each_from(), but here I confined myself
to changing uses of list_second_cell().
Also, fix for_each_cell_setup() and for_both_cell_setup() to
const-ify their arguments; that's a simple oversight in 1cff1b95a.
Back-patch into v13, on the grounds that (1) the const-ification
is a minor bug fix, and (2) it's better for back-patching purposes
if we only have two ways to write these loops rather than three.
In HEAD, also remove list_third_cell() and list_fourth_cell(),
which were also introduced in 1cff1b95a, and are unused as of
cc99baa43. It seems unlikely that any third-party code would
have started to use them already; anyone who has can be directed
to list_nth_cell instead.
Discussion: https://postgr.es/m/CAApHDvpo1zj9KhEpU2cCRZfSM3Q6XGdhzuAS2v79PH7WJBkYVA@mail.gmail.com
SQL operations such as CURRENT_DATE, CURRENT_TIME, LOCALTIME, and
conversion of "now" in a datetime input string have to obtain the
transaction start timestamp ("now()") as a broken-down struct pg_tm.
This is a remarkably expensive conversion, and since now() does not
change intra-transaction, it doesn't really need to be done more than
once per transaction. Introducing a simple cache provides visible
speedups in queries that compute these values many times, for example
insertion of many rows that use a default value of CURRENT_DATE.
Peter Smith, with a bit of kibitzing by me
Discussion: https://postgr.es/m/CAHut+Pu89TWjq530V2gY5O6SWi=OEJMQ_VHMt8bdZB_9JFna5A@mail.gmail.com
When commit bd3daddaf introduced AlternativeSubPlans, I had some
ambitions towards allowing the choice of subplan to change during
execution. That has not happened, or even been thought about, in the
ensuing twelve years; so it seems like a failed experiment. So let's
rip that out and resolve the choice of subplan at the end of planning
(in setrefs.c) rather than during executor startup. This has a number
of positive benefits:
* Removal of a few hundred lines of executor code, since
AlternativeSubPlans need no longer be supported there.
* Removal of executor-startup overhead (particularly, initialization
of subplans that won't be used).
* Removal of incidental costs of having a larger plan tree, such as
tree-scanning and copying costs in the plancache; not to mention
setrefs.c's own costs of processing the discarded subplans.
* EXPLAIN no longer has to print a weird (and undocumented)
representation of an AlternativeSubPlan choice; it sees only the
subplan actually used. This should mean less confusion for users.
* Since setrefs.c knows which subexpression of a plan node it's
working on at any instant, it's possible to adjust the estimated
number of executions of the subplan based on that. For example,
we should usually estimate more executions of a qual expression
than a targetlist expression. The implementation used here is
pretty simplistic, because we don't want to expend a lot of cycles
on the issue; but it's better than ignoring the point entirely,
as the executor had to.
That last point might possibly result in shifting the choice
between hashed and non-hashed EXISTS subplans in a few cases,
but in general this patch isn't meant to change planner choices.
Since we're doing the resolution so late, it's really impossible
to change any plan choices outside the AlternativeSubPlan itself.
Patch by me; thanks to David Rowley for review.
Discussion: https://postgr.es/m/1992952.1592785225@sss.pgh.pa.us
Up to now, if you tried to omit "AS" before a column label in a SELECT
list, it would only work if the column label was an IDENT, that is not
any known keyword. This is rather unfriendly considering that we have
so many keywords and are constantly growing more. In the wake of commit
1ed6b8956 it's possible to improve matters quite a bit.
We'd originally tried to make this work by having some of the existing
keyword categories be allowed without AS, but that didn't work too well,
because each category contains a few special cases that don't work
without AS. Instead, invent an entirely orthogonal keyword property
"can be bare column label", and mark all keywords that way for which
we don't get shift/reduce errors by doing so.
It turns out that of our 450 current keywords, all but 39 can be made
bare column labels, improving the situation by over 90%. This number
might move around a little depending on future grammar work, but it's
a pretty nice improvement.
Mark Dilger, based on work by myself and Robert Haas;
review by John Naylor
Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
This feature has been a thorn in our sides for a long time, causing
many grammatical ambiguity problems. It doesn't seem worth the
pain to continue to support it, so remove it.
There are some follow-on improvements we can make in the grammar,
but this commit only removes the bare minimum number of productions,
plus assorted backend support code.
Note that pg_dump and psql continue to have full support, since
they may be used against older servers. However, pg_dump warns
about postfix operators. There is also a check in pg_upgrade.
Documentation-wise, I (tgl) largely removed the "left unary"
terminology in favor of saying "prefix operator", which is
a more standard and IMO less confusing term.
I included a catversion bump, although no initial catalog data
changes here, to mark the boundary at which oprkind = 'r'
stopped being valid in pg_operator.
Mark Dilger, based on work by myself and Robert Haas;
review by John Naylor
Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
In the particular case of GRANTED BY, this is specified in the SQL
standard. Since in PostgreSQL, CURRENT_ROLE is equivalent to
CURRENT_USER, and CURRENT_USER is already supported here, adding
CURRENT_ROLE is trivial. The other cases are PostgreSQL extensions,
but for the same reason it also makes sense there.
Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: Asif Rehman <asifr.rehman@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/f2feac44-b4c5-f38f-3699-2851d6a76dc9%402ndquadrant.com
Introduced in 0aa8f7640.
MSVC warned about performing 32-bit bit shifting when it appeared like we
might like a 64-bit result. We did, but it just so happened that none of
the calls to this function could have caused the 32-bit shift to overflow.
Here we just cast the constant to int64 to make the compiler happy.
Discussion: https://postgr.es/m/CAApHDvofA_vsrpC13mq_hZyuye5B-ssKEaer04OouXYCO5-uXQ@mail.gmail.com