Commit Graph

20553 Commits

Author SHA1 Message Date
Tom Lane e41955faf0 Fix bugs in gin_fuzzy_search_limit processing.
entryGetItem()'s three code paths each contained bugs associated
with filtering the entries for gin_fuzzy_search_limit.

The posting-tree path failed to advance "advancePast" after having
decided to filter an item.  If we ran out of items on the current
page and needed to advance to the next, what would actually happen
is that entryLoadMoreItems() would re-load the same page.  Eventually,
the random dropItem() test would accept one of the same items it'd
previously rejected, and we'd move on --- but it could take awhile
with small gin_fuzzy_search_limit.  To add insult to injury, this
case would inevitably cause entryLoadMoreItems() to decide it needed
to re-descend from the root, making things even slower.

The posting-list path failed to implement gin_fuzzy_search_limit
filtering at all, so that all entries in the posting list would
be returned.

The bitmap-result path used a "gotitem" variable that it failed to
update in the one place where it'd actually make a difference, ie
at the one "continue" statement.  I think this was unreachable in
practice, because if we'd looped around then it shouldn't be the
case that the entries on the new page are before advancePast.
Still, the "gotitem" variable was contributing nothing to either
clarity or correctness, so get rid of it.

Refactor all three loops so that the termination conditions are
more alike and less unreadable.

The code coverage report showed that we had no coverage at all for
the re-descend-from-root code path in entryLoadMoreItems(), which
seems like a very bad thing, so add a test case that exercises it.
We also had exactly no coverage for gin_fuzzy_search_limit, so add a
simplistic test case that at least hits those code paths a little bit.

Back-patch to all supported branches.

Adé Heyward and Tom Lane

Discussion: https://postgr.es/m/CAEknJCdS-dE1Heddptm7ay2xTbSeADbkaQ8bU2AXRCVC2LdtKQ@mail.gmail.com
2020-04-03 13:15:45 -04:00
Tom Lane 6dd9f35779 Fix bogus CALLED_AS_TRIGGER() defenses.
contrib/lo's lo_manage() thought it could use
trigdata->tg_trigger->tgname in its error message about
not being called as a trigger.  That naturally led to a core dump.

unique_key_recheck() figured it could Assert that fcinfo->context
is a TriggerData node in advance of having checked that it's
being called as a trigger.  That's harmless in production builds,
and perhaps not that easy to reach in any case, but it's logically
wrong.

The first of these per bug #16340 from William Crowell;
the second from manual inspection of other CALLED_AS_TRIGGER
call sites.

Back-patch the lo.c change to all supported branches, the
other to v10 where the thinko crept in.

Discussion: https://postgr.es/m/16340-591c7449dc7c8c47@postgresql.org
2020-04-03 11:24:56 -04:00
Fujii Masao 19db23bcbd Revert "Include information on buffer usage during planning phase, in EXPLAIN output."
This reverts commit ed7a509571.

Per buildfarm member prion.
2020-04-03 12:20:42 +09:00
Fujii Masao 18808f8c89 Add wait events for recovery conflicts.
This commit introduces new wait events RecoveryConflictSnapshot and
RecoveryConflictTablespace. The former is reported while waiting for
recovery conflict resolution on a vacuum cleanup. The latter is reported
while waiting for recovery conflict resolution on dropping tablespace.

Also this commit changes the code so that the wait event Lock is reported
while waiting in ResolveRecoveryConflictWithVirtualXIDs() for recovery
conflict resolution on a lock. Basically the wait event Lock is reported
during that wait, but previously was not reported only when that wait
happened in ResolveRecoveryConflictWithVirtualXIDs().

Author: Masahiko Sawada
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CA+fd4k4mXWTwfQLS3RPwGr4xnfAEs1ysFfgYHvmmoUgv6Zxvmg@mail.gmail.com
2020-04-03 12:15:56 +09:00
Fujii Masao ed7a509571 Include information on buffer usage during planning phase, in EXPLAIN output.
When BUFFERS option is enabled, EXPLAIN command includes the information
on buffer usage during each plan node, in its output. In addition to that,
this commit makes EXPLAIN command include also the information on
buffer usage during planning phase, in its output. This feature makes it
easier to discern the cases where lots of buffer access happen during
planning.

Author: Julien Rouhaud, slightly revised by Fujii Masao
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/16109-26a1a88651e90608@postgresql.org
2020-04-03 11:27:09 +09:00
Tom Lane 0b34e7d307 Improve user control over truncation of logged bind-parameter values.
This patch replaces the boolean GUC log_parameters_on_error introduced
by commit ba79cb5dc with an integer log_parameter_max_length_on_error,
adding the ability to specify how many bytes to trim each logged
parameter value to.  (The previous coding hard-wired that choice at
64 bytes.)

In addition, add a new parameter log_parameter_max_length that provides
similar control over truncation of query parameters that are logged in
response to statement-logging options, as opposed to errors.  Previous
releases always logged such parameters in full, possibly causing log
bloat.

For backwards compatibility with prior releases,
log_parameter_max_length defaults to -1 (log in full), while
log_parameter_max_length_on_error defaults to 0 (no logging).

Per discussion, log_parameter_max_length is SUSET since the DBA should
control routine logging behavior, but log_parameter_max_length_on_error
is USERSET because it also affects errcontext data sent back to the
client.

Alexey Bashtanov, editorialized a little by me

Discussion: https://postgr.es/m/b10493cc-a399-a03a-67c7-068f2791ee50@imap.cc
2020-04-02 15:04:51 -04:00
Peter Eisentraut 2991ac5fc9 Add SQL functions for Unicode normalization
This adds SQL expressions NORMALIZE() and IS NORMALIZED to convert and
check Unicode normal forms, per SQL standard.

To support fast IS NORMALIZED tests, we pull in a new data file
DerivedNormalizationProps.txt from Unicode and build a lookup table
from that, using techniques similar to ones already used for other
Unicode data.  make update-unicode will keep it up to date.  We only
build and use these tables for the NFC and NFKC forms, because they
are too big for NFD and NFKD and the improvement is not significant
enough there.

Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/flat/c1909f27-c269-2ed9-12f8-3ab72c8caf7a@2ndquadrant.com
2020-04-02 08:56:27 +02:00
Peter Eisentraut c6e0edad46 Add some comments to some SQL features
Otherwise, it could be confusing to a reader that some of these
well-publicized features are simply listed as unsupported without
further explanation.
2020-04-02 07:52:20 +02:00
Thomas Munro 37b3794dfc Add maintenance_io_concurrency to postgresql.conf.sample.
New GUC from commit fc34b0d9.
2020-04-02 16:50:36 +13:00
Amit Kapila 3a5e22138a Allow parallel vacuum to accumulate buffer usage.
Commit 40d964ec99 allowed vacuum command to process indexes in parallel but
forgot to accumulate the buffer usage stats of parallel workers.  This
allows leader backend to accumulate buffer usage stats of all the parallel
workers.

Reported-by: Julien Rouhaud
Author: Sawada Masahiko
Reviewed-by: Dilip Kumar, Amit Kapila and Julien Rouhaud
Discussion: https://postgr.es/m/20200328151721.GB12854@nol
2020-04-02 08:04:58 +05:30
Tomas Vondra 28cac71bd3 Collect statistics about SLRU caches
There's a number of SLRU caches used to access important data like clog,
commit timestamps, multixact, asynchronous notifications, etc. Until now
we had no easy way to monitor these shared caches, compute hit ratios,
number of reads/writes etc.

This commit extends the statistics collector to track this information
for a predefined list of SLRUs, and also introduces a new system view
pg_stat_slru displaying the data.

The list of built-in SLRUs is fixed, but additional SLRUs may be defined
in extensions. Unfortunately, there's no suitable registry of SLRUs, so
this patch simply defines a fixed list of SLRUs with entries for the
built-in ones and one entry for all additional SLRUs. Extensions adding
their own SLRU are fairly rare, so this seems acceptable.

This patch only allows monitoring of SLRUs, not tuning. The SLRU sizes
are still fixed (hard-coded in the code) and it's not entirely clear
which of the SLRUs might need a GUC to tune size. In a way, allowing us
to determine that is one of the goals of this patch.

Bump catversion as the patch introduces new functions and system view.

Author: Tomas Vondra
Reviewed-by: Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/flat/20200119143707.gyinppnigokesjok@development
2020-04-02 02:34:21 +02:00
Tom Lane 501b018799 Check equality semantics for unique indexes on partitioned tables.
We require the partition key to be a subset of the set of columns
being made unique, so that physically-separate indexes on the different
partitions are sufficient to enforce the uniqueness constraint.

The existing code checked that the listed columns appear, but did not
inquire into the index semantics, which is a serious oversight given
that different index opclasses might enforce completely different
notions of uniqueness.

Ideally, perhaps, we'd just match the partition key opfamily to the
index opfamily.  But hash partitioning uses hash opfamilies which we
can't directly match to btree opfamilies.  Hence, look up the equality
operator in each family, and accept if it's the same operator.  This
should be okay in a fairly general sense, since the equality operator
ought to precisely represent the opfamily's notion of uniqueness.

A remaining weak spot is that we don't have a cross-index-AM notion of
which opfamily member is "equality".  But we know which one to use for
hash and btree AMs, and those are the only two that are relevant here
at present.  (Any non-core AMs that know how to enforce equality are
out of luck, for now.)

Back-patch to v11 where this feature was introduced.

Guancheng Luo, revised a bit by me

Discussion: https://postgr.es/m/D9C3CEF7-04E8-47A1-8300-CA1DCD5ED40D@gmail.com
2020-04-01 14:49:49 -04:00
Tom Lane a80818605e Improve selectivity estimation for assorted match-style operators.
Quite a few matching operators such as JSONB's @> used "contsel" and
"contjoinsel" as their selectivity estimators.  That was a bad idea,
because (a) contsel is only a stub, yielding a fixed default estimate,
and (b) that default is 0.001, meaning we estimate these operators as
five times more selective than equality, which is surely pretty silly.

There's a good model for improving this in ltree's ltreeparentsel():
for any "var OP constant" query, we can try applying the operator
to all of the column's MCV and histogram values, taking the latter
as being a random sample of the non-MCV values.  That code is
actually 100% generic, except for the question of exactly what
default selectivity ought to be plugged in when we don't have stats.

Hence, migrate the guts of ltreeparentsel() into the core code, provide
wrappers "matchingsel" and "matchingjoinsel" with a more-appropriate
default estimate, and use those for the non-geometric operators that
formerly used contsel (mostly JSONB containment operators and tsquery
matching).

Also apply this code to some match-like operators in hstore, ltree, and
pg_trgm, including the former users of ltreeparentsel as well as ones
that improperly used contsel.  Since commit 911e70207 just created new
versions of those extensions that we haven't released yet, we can sneak
this change into those new versions instead of having to create an
additional generation of update scripts.

Patch by me, reviewed by Alexey Bashtanov

Discussion: https://postgr.es/m/12237.1582833074@sss.pgh.pa.us
2020-04-01 10:32:33 -04:00
Peter Eisentraut d8653f4687 Refactor code to look up local replication tuple
This unifies some duplicate code.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://www.postgresql.org/message-id/CA+HiwqFjYE5anArxvkjr37AQMd52L-LZtz9Ld2QrLQ3YfcYhTw@mail.gmail.com
2020-04-01 15:34:41 +02:00
Amit Kapila 2401d93718 Fix coverity complaint about commit 40d964ec99.
The coverity complained that dividing integer expressions and then
converting the integer quotient to type "double" would lose fractional
part.  Typecasting one of the arguments of expression with double should
fix the report.

Author: Mahendra Singh Thalor
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/20200329224818.6phnhv7o2q2rfovf@alap3.anarazel.de
2020-04-01 09:28:13 +05:30
Peter Geoghegan 7dbe290da4 Add CREATE INDEX deduplication assertions.
Add two assertions that verify the assumptions about posting list tuple
space accounting and suffix truncation made within nbtsort.c.
2020-03-31 14:38:39 -07:00
Tom Lane fe3036527a Fix race condition in statext_store().
Must hold some lock on the pg_statistic_ext_data catalog *before*
we look up the tuple we aim to replace.  Otherwise a concurrent
VACUUM FULL or similar operation could move it to a different TID,
leaving us trying to replace the wrong tuple.

Back-patch to v12 where this got broken.

Credit goes to Dean Rasheed; I'm just doing the clerical work.

Discussion: https://postgr.es/m/CAEZATCU0zHMDiQV0g8P2U+YSP9C1idUPrn79DajsbonwkN0xvQ@mail.gmail.com
2020-03-31 17:06:22 -04:00
Fujii Masao b0236508d3 Improve the message logged when recovery is paused.
When recovery target is reached and recovery is paused because of
recovery_target_action=pause, executing pg_wal_replay_resume() causes
the standby to promote, i.e., the recovery to end. So, in this case,
the previous message "Execute pg_wal_replay_resume() to continue"
logged was confusing because pg_wal_replay_resume() doesn't cause
the recovery to continue.

This commit improves the message logged when recovery is paused,
and the proper message is output based on what (pg_wal_replay_pause
or recovery_target_action) causes recovery to be paused.

Author: Sergei Kornilov, revised by Fujii Masao
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/19168211580382043@myt5-b646bde4b8f3.qloud-c.yandex.net
2020-04-01 03:35:13 +09:00
Tom Lane 82e8018522 Teach pg_ls_dir_files() to ignore ENOENT failures from stat().
Buildfarm experience shows that this function can fail with ENOENT
if some other process unlinks a file between when we read the directory
entry and when we try to stat() it.  The problem is old but we had
not noticed it until 085b6b667 added regression test coverage.

To fix, just ignore ENOENT failures.  There is one other case that
this might hide: a symlink that points to nowhere.  That seems okay
though, at least better than erroring.

Back-patch to v10 where this function was added, since the regression
test cases were too.

Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
2020-03-31 12:57:55 -04:00
Alexander Korotkov 02a5786df2 Improve error reporting in opclasscmds.c
This commit improves error reporting introduced by 911e702077.  It puts
argument of errmsg() to the single line for easier grepping source for error
text.  Also it improves wording of errhint().
2020-03-31 17:51:57 +03:00
Magnus Hagander 087d3d0583 Fix assorted typos
Author: Daniel Gustafsson <daniel@yesql.se>
2020-03-31 16:00:06 +02:00
Peter Eisentraut de3bbfcc96 Fix INSERT OVERRIDING USER VALUE behavior
The original implementation disallowed using OVERRIDING USER VALUE on
identity columns defined as GENERATED ALWAYS, which is not per
standard.  So allow that now.

Expand documentation and tests around this.

Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Discussion: https://www.postgresql.org/message-id/flat/CAEZATCVrh2ufCwmzzM%3Dk_OfuLhTTPBJCdFkimst2kry4oHepuQ%40mail.gmail.com
2020-03-31 08:50:39 +02:00
Michael Paquier 616ae3d2b0 Move routine definitions of xlogarchive.c to a new header file
The definitions of the routines defined in xlogarchive.c have been part
of xlog_internal.h which is included by several frontend tools, but all
those routines are only called by the backend.  More cleanup could be
done within xlog_internal.h, but that's already a nice cut.

This will help a follow-up patch for pg_rewind where handling of
restore_command is added for frontends.

Author: Alexey Kondratov, Michael Paquier
Reviewed-by: Álvaro Herrera, Alexander Korotkov
Discussion: https://postgr.es/m/a3acff50-5a0d-9a2c-b3b2-ee36168955c1@postgrespro.ru
2020-03-31 15:33:04 +09:00
Peter Eisentraut fc8c3bdde2 Update SQL features
Set T653 to supported.  This has always been possible.
2020-03-31 08:25:03 +02:00
Amit Kapila ef75140fe7 Avoid calls to RelationGetRelationName() and RelationGetNamespace() in
vacuum code.

After commit b61d161c14, during vacuum, we cache the information of
relation name and relation namespace in local structure LVRelStats so that
we can use it in an error callback function.  We can use the cached
information to avoid the calls to RelationGetRelationName(),
RelationGetNamespace() and get_namespace_name().  This is mainly for the
consistent in vacuum code path but it will avoid the extra syscache lookup
we do in get_namespace_name().

Author: Justin Pryzby
Reviewed-by: Amit Kapila
Discussion: https://www.postgresql.org/message-id/20191120210600.GC30362@telsasoft.com
2020-03-31 09:34:49 +05:30
Peter Geoghegan f01157e2ac Further simplify nbtree high key truncation.
Commit 7c2dbc69 reorganized _bt_truncate() in a way that enables a
further simplification that I (pgeoghegan) missed:  Since we mark the
tuple that is returned to the caller as a pivot tuple before the point
where its heap TID is set as of 7c2dbc69, it is possible to use the high
level BTreeTupleGetHeapTID() inline function to get an item pointer.  Do
it that way now.  This approach is clearer and more maintainable.
2020-03-30 17:34:12 -07:00
Michael Paquier dd9ac7d5d8 Revert "Skip redundant anti-wraparound vacuums"
This reverts commit 2aa6e33, that added a fast path to skip
anti-wraparound and non-aggressive autovacuum jobs (these have no sense
as anti-wraparound implies aggressive).  With a cluster using a high
amount of relations with a portion of them being heavily updated, this
could cause autovacuum to lock down, with autovacuum workers attempting
repeatedly those jobs on the same relations for the same database, that
just kept being skipped.  This lock down can be solved with a manual
VACUUM FREEZE.

Justin King has reported one environment where the issue happened, and
Julien Rouhaud and I have been able to reproduce it in a second
environment.  With a very aggressive autovacuum_freeze_max_age,
triggering those jobs with pgbench is a matter of minutes, and hitting
the lock down is a lot harder (my local tests failed to do that).

Note that anti-wraparound and non-aggressive jobs can only be triggered
on a subset of shared catalogs:
- pg_auth_members
- pg_authid
- pg_database
- pg_replication_origin
- pg_shseclabel
- pg_subscription
- pg_tablespace
While the lock down was possible down to v12, the root cause of those
jobs is a much older issue, which needs more analysis.

Bonus thanks to Andres Freund for the discussion.

Reported-by: Justin King
Discussion: https://postgr.es/m/CAE39h22zPLrkH17GrkDgAYL3kbjvySYD1io+rtnAUFnaJJVS4g@mail.gmail.com
Backpatch-through: 12
2020-03-31 08:27:47 +09:00
Peter Geoghegan 7c2dbc691c Refactor nbtree high key truncation.
Simplify _bt_truncate(), the routine that generates truncated leaf page
high keys.  Remove a micro-optimization that avoided a second palloc0()
call (this was used when a heap TID was needed in the final pivot tuple,
though only when the index happened to not be an INCLUDE index).

Removing this dubious micro-optimization allows _bt_truncate() to use
the index_truncate_tuple() indextuple.c utility routine in all cases.
This was already the common case.

This commit is a HEAD-only follow up to bugfix commit 4b42a899.
2020-03-30 15:52:39 -07:00
Andres Freund d4b34f60c5 Deduplicate PageIsNew() check in lazy_scan_heap().
The recheck isn't needed anymore, as RelationGetBufferForTuple() now
extends the relation with RBM_ZERO_AND_LOCK. Previously we needed to
handle the fact that relation extension extended the relation and then
separately acquired a lock on the page - while expecting that the page
is empty.

Reported-By: Ranier Vilela
Discussion: https://postgr.es/m/CAEudQArA_=J0D5T258xsCY6Xtf6wiH4b=QDPDgVS+WZUN10WDw@mail.gmail.com
2020-03-30 13:56:40 -07:00
Alexander Korotkov 364bdd0b41 Fix missing SP-GiST support in 911e702077
911e702077 misses setting of amoptsprocnum for SP-GiST.  This commit fixes
that.
2020-03-30 23:45:03 +03:00
Alexander Korotkov 851b14b0c6 Remove rudiments of supporting procnum == 0 from 911e702077
Early versions of opclass options patch uses zero support procedure as opclass
options procedure.  This commit removes rudiments of it, which were committed
in 911e702077.  Also, it implements correct handling of amoptsprocnum == 0.
2020-03-30 23:43:25 +03:00
Peter Geoghegan 4b42a89938 Consistently truncate non-key suffix columns.
INCLUDE indexes failed to have their non-key attributes physically
truncated away in certain rare cases.  This led to physically larger
pivot tuples that contained useless non-key attribute values.  The
impact on users should be negligible, but this is still clearly a
regression (Postgres 11 supports INCLUDE indexes, and yet was not
affected).

The bug appeared in commit dd299df8, which introduced "true" suffix
truncation of key attributes.

Discussion: https://postgr.es/m/CAH2-Wz=E8pkV9ivRSFHtv812H5ckf8s1-yhx61_WrJbKccGcrQ@mail.gmail.com
Backpatch: 12-, where "true" suffix truncation was introduced.
2020-03-30 12:03:59 -07:00
Alexander Korotkov 911e702077 Implement operator class parameters
PostgreSQL provides set of template index access methods, where opclasses have
much freedom in the semantics of indexing.  These index AMs are GiST, GIN,
SP-GiST and BRIN.  There opclasses define representation of keys, operations on
them and supported search strategies.  So, it's natural that opclasses may be
faced some tradeoffs, which require user-side decision.  This commit implements
opclass parameters allowing users to set some values, which tell opclass how to
index the particular dataset.

This commit doesn't introduce new storage in system catalog.  Instead it uses
pg_attribute.attoptions, which is used for table column storage options but
unused for index attributes.

In order to evade changing signature of each opclass support function, we
implement unified way to pass options to opclass support functions.  Options
are set to fn_expr as the constant bytea expression.  It's possible due to the
fact that opclass support functions are executed outside of expressions, so
fn_expr is unused for them.

This commit comes with some examples of opclass options usage.  We parametrize
signature length in GiST.  That applies to multiple opclasses: tsvector_ops,
gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and
gist_hstore_ops.  Also we parametrize maximum number of integer ranges for
gist__int_ops.  However, the main future usage of this feature is expected
to be json, where users would be able to specify which way to index particular
json parts.

Catversion is bumped.

Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru
Author: Nikita Glukhov, revised by me
Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
2020-03-30 19:17:23 +03:00
Fujii Masao 64638ccba3 Report waiting via PS while recovery is waiting for buffer pin in hot standby.
Previously while the startup process was waiting for the recovery conflict
with snapshot, tablespace or lock to be resolved, waiting was reported in
PS display, but not in the case of recovery conflict with buffer pin.
This commit makes the startup process in hot standby report waiting via PS
while waiting for the conflicts with other backends holding buffer pins to
be resolved.

Author: Masahiko Sawada
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CA+fd4k4mXWTwfQLS3RPwGr4xnfAEs1ysFfgYHvmmoUgv6Zxvmg@mail.gmail.com
2020-03-30 17:35:03 +09:00
Peter Eisentraut 246f136e76 Improve handling of parameter differences in physical replication
When certain parameters are changed on a physical replication primary,
this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL
record.  The standby then checks whether its own settings are at least
as big as the ones on the primary.  If not, the standby shuts down
with a fatal error.

The correspondence of settings between primary and standby is required
because those settings influence certain shared memory sizings that
are required for processing WAL records that the primary might send.
For example, if the primary sends a prepared transaction, the standby
must have had max_prepared_transaction set appropriately or it won't
be able to process those WAL records.

However, fatally shutting down the standby immediately upon receipt of
the parameter change record might be a bit of an overreaction.  The
resources related to those settings are not required immediately at
that point, and might never be required if the activity on the primary
does not exhaust all those resources.  If we just let the standby roll
on with recovery, it will eventually produce an appropriate error when
those resources are used.

So this patch relaxes this a bit.  Upon receipt of
XLOG_PARAMETER_CHANGE, we still check the settings but only issue a
warning and set a global flag if there is a problem.  Then when we
actually hit the resource issue and the flag was set, we issue another
warning message with relevant information.  At that point we pause
recovery, so a hot standby remains usable.  We also repeat the last
warning message once a minute so it is harder to miss or ignore.

Reviewed-by: Sergei Kornilov <sk@zsrv.org>
Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/4ad69a4c-cc9b-0dfe-0352-8b1b0cd36c7b@2ndquadrant.com
2020-03-30 09:53:45 +02:00
Peter Eisentraut a01e1b8b9d Add new part SQL/MDA to information_schema.sql_parts 2020-03-30 08:55:55 +02:00
Fujii Masao 6aba63ef3e Allow the planner-related functions and hook to accept the query string.
This commit adds query_string argument into the planner-related functions
and hook and allows us to pass the query string to them.

Currently there is no user of the query string passed. But the upcoming patch
for the planning counters will add the planning hook function into
pg_stat_statements and the function will need the query string. So this change
will be necessary for that patch.

Also this change is useful for some extensions that want to use the query
string in their planner hook function.

Author: Pascal Legrand, Julien Rouhaud
Reviewed-by: Yoshikazu Imai, Tom Lane, Fujii Masao
Discussion: https://postgr.es/m/CAOBaU_bU1m3_XF5qKYtSj1ua4dxd=FWDyh2SH4rSJAUUfsGmAQ@mail.gmail.com
Discussion: https://postgr.es/m/1583789487074-0.post@n3.nabble.com
2020-03-30 13:51:05 +09:00
Fujii Masao 4a539a25eb Expose BufferUsageAccumDiff().
Previously pg_stat_statements calculated the difference of buffer counters
by its own code even while BufferUsageAccumDiff() had the same code.
This commit expose BufferUsageAccumDiff() and makes pg_stat_statements
use it for the calculation, in order to simply the code.

This change also would be useful for the upcoming patch for the planning
counters in pg_stat_statements because the patch will add one more code
for the calculation of difference of buffer counters and that can easily be
done by using BufferUsageAccumDiff().

Author: Julien Rouhaud
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/bdfee4e0-a304-2498-8da5-3cb52c0a193e@oss.nttdata.com
2020-03-30 12:15:26 +09:00
Amit Kapila b61d161c14 Introduce vacuum errcontext to display additional information.
The additional information displayed will be block number for error
occurring while processing heap and index name for error occurring
while processing the index.

This will help us in diagnosing the problems that occur during a vacuum.
For ex. due to corruption (either caused by bad hardware or by some bug)
if we get some error while vacuuming, it can help us identify the block
in heap and or additional index information.

It sets up an error context callback to display additional information
with the error.  During different phases of vacuum (heap scan, heap
vacuum, index vacuum, index clean up, heap truncate), we update the error
context callback to display appropriate information.  We can extend it to
a bit more granular level like adding the phases for FSM operations or for
prefetching the blocks while truncating. However, I felt that it requires
adding many more error callback function calls and can make the code a bit
complex, so left those for now.

Author: Justin Pryzby, with few changes by Amit Kapila
Reviewed-by: Alvaro Herrera, Amit Kapila, Andres Freund, Michael Paquier
and Sawada Masahiko
Discussion: https://www.postgresql.org/message-id/20191120210600.GC30362@telsasoft.com
2020-03-30 07:33:38 +05:30
Peter Eisentraut b79911dc8c Update SQL features
Change F181 to supported.  It requires that an embedded C program can
be split across multiple files, which ECPG easily supports.
2020-03-29 08:56:41 +02:00
Peter Geoghegan a7b9d24e4e Make deduplication use number of key attributes.
Use IndexRelationGetNumberOfKeyAttributes() rather than
IndexRelationGetNumberOfAttributes() when determining whether or not two
index tuples are suitable for merging together into a single posting
list tuple.  This is a little bit tidier.  It brings affected code in
nbtdedup.c a little closer to similar, related code in nbtsplitloc.c.
2020-03-28 20:25:03 -07:00
Andres Freund 42750b08d9 Ensure snapshot is registered within ScanPgRelation().
In 9.4 I added support to use a historical snapshot in
ScanPgRelation(), while adding logical decoding. Unfortunately a
conflict with the concurrent removal of SnapshotNow was incorrectly
resolved, leading to an unregistered snapshot being used.

It is not correct to use an unregistered (or non-active) snapshot for
anything non-trivial, because catalog invalidations can cause the
snapshot to be invalidated.

Luckily it seems unlikely to actively cause problems in practice, as
ScanPgRelation() requires that we already have a lock on the relation,
we only look for a single row, and we don't appear to rely on the
result's tid to be correct. It however is clearly wrong and potential
negative consequences would likely be hard to find. So it seems worth
backpatching the fix, even without a concrete hazard.

Discussion: https://postgr.es/m/20200229052459.wzhqnbhrriezg4v2@alap3.anarazel.de
Backpatch: 9.5-
2020-03-28 12:26:46 -07:00
Jeff Davis 7351bfeda3 Fix costing for disk-based hash aggregation.
Report and suggestions from Richard Guo and Tomas Vondra.

Discussion: https://postgr.es/m/CAMbWs4_W8fYbAn8KxgidAaZHON_Oo08OYn9ze=7remJymLqo5g@mail.gmail.com
2020-03-28 12:07:49 -07:00
Dean Rasheed 4083f445c0 Improve the performance and accuracy of numeric sqrt() and ln().
Instead of using Newton's method to compute numeric square roots, use
the Karatsuba square root algorithm, which performs better for numbers
of all sizes. In practice, this is 3-5 times faster for inputs with
just a few digits and up to around 10 times faster for larger inputs.

Also, the new algorithm guarantees that the final digit of the result
is correctly rounded, since it computes an integer square root with
truncation, containing at least 1 extra decimal digit before rounding.
The former algorithm would occasionally round the wrong way because
it rounded both the intermediate and final results.

In addition, arrange for sqrt_var() to explicitly support negative
rscale values (rounding before the decimal point). This allows the
argument reduction phase of ln_var() to be optimised for large inputs,
since it only needs to compute square roots with a few more digits
than the final ln() result, rather than computing all the digits
before the decimal point. For very large inputs, this can be many
thousands of times faster.

In passing, optimise div_var_fast() in a couple of places where it was
doing unnecessary work.

Patch be me, reviewed by Tom Lane and Tels.

Discussion: https://postgr.es/m/CAEZATCV1A7+jD3P30Zu31KjaxeSEyOn3v9d6tYegpxcq3cQu-g@mail.gmail.com
2020-03-28 14:37:53 +00:00
Dean Rasheed 87779aa474 Prevent functional dependency estimates from exceeding column estimates.
Formerly we applied a functional dependency "a => b with dependency
degree f" using the formula

  P(a,b) = P(a) * [f + (1-f)*P(b)]

This leads to the possibility that the combined selectivity P(a,b)
could exceed P(b), which is not ideal. The addition of support for IN
and OR clauses (commits 8f321bd16c and ccaa3569f5) would seem to make
this more likely, since the user-supplied values in such clauses are
not necessarily compatible with the functional dependency.

Mitigate this by using the formula

  P(a,b) = f * Min(P(a), P(b)) + (1-f) * P(a) * P(b)

instead, which guarantees that the combined selectivity is less than
each column's individual selectivity. Logically, this is modifies the
part of the formula that accounts for dependent rows to handle cases
where P(a) > P(b), whilst not changing the second term which accounts
for independent rows.

Additionally, this refactors the way that functional dependencies are
applied, so now dependencies_clauselist_selectivity() estimates both
the implying clauses and the implied clauses for each functional
dependency (formerly only the implied clauses were estimated), and now
all clauses for each attribute are taken into account (formerly only
one clause for each implied attribute was estimated). This removes the
previously built-in assumption that only equality clauses will be
seen, which is no longer true, and opens up the possibility of
applying functional dependencies to more general clauses.

Patch by me, reviewed by Tomas Vondra.

Discussion: https://postgr.es/m/CAEZATCXaNFZyOhR4XXAfkvj1tibRBEjje6ZbXwqWUB_tqbH%3Drw%40mail.gmail.com
Discussion: https://postgr.es/m/20200318002946.6dvblukm3cfmgir2%40development
2020-03-28 12:48:34 +00:00
Peter Eisentraut 145cb16d3b Cleanup in SQL features files
Feature C011 was still listed in sql_feature_packages.txt but had been
removed from sql_features.txt, so also remove from the former.
2020-03-28 08:46:18 +01:00
David Rowley b07642dbcd Trigger autovacuum based on number of INSERTs
Traditionally autovacuum has only ever invoked a worker based on the
estimated number of dead tuples in a table and for anti-wraparound
purposes. For the latter, with certain classes of tables such as
insert-only tables, anti-wraparound vacuums could be the first vacuum that
the table ever receives. This could often lead to autovacuum workers being
busy for extended periods of time due to having to potentially freeze
every page in the table. This could be particularly bad for very large
tables. New clusters, or recently pg_restored clusters could suffer even
more as many large tables may have the same relfrozenxid, which could
result in large numbers of tables requiring an anti-wraparound vacuum all
at once.

Here we aim to reduce the work required by anti-wraparound and aggressive
vacuums in general, by triggering autovacuum when the table has received
enough INSERTs. This is controlled by adding two new GUCs and reloptions;
autovacuum_vacuum_insert_threshold and
autovacuum_vacuum_insert_scale_factor. These work exactly the same as the
existing scale factor and threshold controls, only base themselves off the
number of inserts since the last vacuum, rather than the number of dead
tuples. New controls were added rather than reusing the existing
controls, to allow these new vacuums to be tuned independently and perhaps
even completely disabled altogether, which can be done by setting
autovacuum_vacuum_insert_threshold to -1.

We make no attempt to skip index cleanup operations on these vacuums as
they may trigger for an insert-mostly table which continually doesn't have
enough dead tuples to trigger an autovacuum for the purpose of removing
those dead tuples. If we were to skip cleaning the indexes in this case,
then it is possible for the index(es) to become bloated over time.

There are additional benefits to triggering autovacuums based on inserts,
as tables which never contain enough dead tuples to trigger an autovacuum
are now more likely to receive a vacuum, which can mark more of the table
as "allvisible" and encourage the query planner to make use of Index Only
Scans.

Currently, we still obey vacuum_freeze_min_age when triggering these new
autovacuums based on INSERTs. For large insert-only tables, it may be
beneficial to lower the table's autovacuum_freeze_min_age so that tuples
are eligible to be frozen sooner. Here we've opted not to zero that for
these types of vacuums, since the table may just be insert-mostly and we
may otherwise freeze tuples that are still destined to be updated or
removed in the near future.

There was some debate to what exactly the new scale factor and threshold
should default to. For now, these are set to 0.2 and 1000, respectively.
There may be some motivation to adjust these before the release.

Author: Laurenz Albe, Darafei Praliaskouski
Reviewed-by: Alvaro Herrera, Masahiko Sawada, Chris Travers, Andres Freund, Justin Pryzby
Discussion: https://postgr.es/m/CAC8Q8t%2Bj36G_bLF%3D%2B0iMo6jGNWnLnWb1tujXuJr-%2Bx8ZCCTqoQ%40mail.gmail.com
2020-03-28 19:20:12 +13:00
Peter Geoghegan 9945ad6e90 Justify nbtree page split locking in code comment.
Delaying unlocking the right child page until after the point that the
left child's parent page has been refound is no longer truly necessary.
Commit 40dae7ec made nbtree tolerant of interrupted page splits.  VACUUM
was taught to avoid deleting a page that happens to be the right half of
an incomplete split.  As long as page splits don't unlock the left child
page until the end of the second/final phase, it should be safe to
unlock the right child page earlier (at the end of the first phase).

It probably isn't actually useful to release the right child's lock
earlier like this (it probably won't improve performance).  Even still,
pointing out that it ought to be safe to do so should make it easier to
understand the overall design.
2020-03-27 16:44:52 -07:00
Alvaro Herrera 1e6148032e
Allow walreceiver configuration to change on reload
The parameters primary_conninfo, primary_slot_name and
wal_receiver_create_temp_slot can now be changed with a simple "reload"
signal, no longer requiring a server restart.  This is achieved by
signalling the walreceiver process to terminate and having it start
again with the new values.

Thanks to Andres Freund, Kyotaro Horiguchi, Fujii Masao for discussion.

Author: Sergei Kornilov <sk@zsrv.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/19513901543181143@sas1-19a94364928d.qloud-c.yandex.net
2020-03-27 19:51:37 -03:00
Alvaro Herrera 092c6936de
Set wal_receiver_create_temp_slot PGC_POSTMASTER
Commit 3297308278 gave walreceiver the ability to create and use a
temporary replication slot, and made it controllable by a GUC (enabled
by default) that can be changed with SIGHUP.  That's useful but has two
problems: one, it's possible to cause the origin server to fill its disk
if the slot doesn't advance in time; and also there's a disconnect
between state passed down via the startup process and GUCs that
walreceiver reads directly.

We handle the first problem by setting the option to disabled by
default.  If the user enables it, its on their head to make sure that
disk doesn't fill up.

We handle the second problem by passing the flag via startup rather than
having walreceiver acquire it directly, and making it PGC_POSTMASTER
(which ensures a walreceiver always has the fresh value).  A future
commit can relax this (to PGC_SIGHUP again) by having the startup
process signal walreceiver to shutdown whenever the value changes.

Author: Sergei Kornilov <sk@zsrv.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200122055510.GH174860@paquier.xyz
2020-03-27 16:20:33 -03:00
Tom Lane fbc7a71608 Rearrange validity checks for plpgsql "simple" expressions.
Buildfarm experience shows what probably should've occurred to me before:
if a cache flush occurs partway through building a generic plan, then
the plansource may have is_valid = false even though the plan is valid.
We need to accept this case, use the generated plan, and then try to
replan the next time.  We can't try to replan immediately, because that
would produce an infinite loop in CLOBBER_CACHE_ALWAYS builds; moreover
it's really overkill.  (We can assume that the plan is valid, it's just
possibly a bit stale.  Note that the pre-existing code behaved this way,
and the non-simple-expression code paths do too.)  Conversely, not using
the generated plan would drop us into the not-a-simple-expression code
path, which is bad for performance and would also cause regression-test
failures due to visibly different error-reporting behavior.

Hence, refactor the validity-check functions so that the initial check
and recheck cases can react differently to plansource->is_valid.
This makes their usage a bit simpler, too.

Discussion: https://postgr.es/m/7072.1585332104@sss.pgh.pa.us
2020-03-27 14:47:34 -04:00
Peter Eisentraut 8d1b9648c5 Update SQL features
Change F311 to supported.  This was already accomplished when
subfeature F311-04 (WITH CHECK OPTION) was added, but the top-level
feature wasn't updated at the time.
2020-03-27 08:36:08 +01:00
Tom Lane 8f59f6b9c0 Improve performance of "simple expressions" in PL/pgSQL.
For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's
management overhead exceeds the cost of evaluating the expression.
This patch substantially improves that situation, providing roughly
2X speedup for such trivial expressions.

First, add infrastructure in the plancache to allow fast re-validation
of cached plans that contain no table access, and hence need no locks.
Teach plpgsql to use this infrastructure for expressions that it's
already deemed "simple" (which in particular will never contain table
references).

The fast path still requires checking that search_path hasn't changed,
so provide a fast path for OverrideSearchPathMatchesCurrent by
counting changes that have occurred to the active search path in the
current session.  This is simplistic but seems enough for now, seeing
that PushOverrideSearchPath is not used in any performance-critical
cases.

Second, manage the refcounts on simple expressions' cached plans using
a transaction-lifespan resource owner, so that we only need to take
and release an expression's refcount once per transaction not once per
expression evaluation.  The management of this resource owner exactly
parallels the existing management of plpgsql's simple-expression EState.

Add some regression tests covering this area, in particular verifying
that expression caching doesn't break semantics for search_path changes.

Patch by me, but it owes something to previous work by Amit Langote,
who recognized that getting rid of plancache-related overhead would
be a useful thing to do here.  Also thanks to Andres Freund for review.

Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 18:58:57 -04:00
Magnus Hagander eff5b245df Document that pg_checksums exists in checksums README
Author: Daniel Gustafsson <daniel@yesql.se>
2020-03-26 15:05:54 +01:00
Peter Eisentraut 49bf81536e Drop slot's LWLock before returning from SaveSlotToPath()
When SaveSlotToPath() is called with elevel=LOG, the early exits didn't
release the slot's io_in_progress_lock.

This could result in a walsender being stuck on the lock forever.  A
possible way to get into this situation is if the offending code paths
are triggered in a low disk space situation.

Author: Pavan Deolasee <pavan.deolasee@2ndquadrant.com>
Reported-by: Craig Ringer <craig@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/56a138c5-de61-f553-7e8f-6789296de785%402ndquadrant.com
2020-03-26 13:29:20 +01:00
Andrew Dunstan 896fcdb230 Provide a TLS init hook
The default hook function sets the default password callback function.
In order to allow preloaded libraries to have an opportunity to override
the default, TLS initialization if now delayed slightly until after
shared preloaded libraries have been loaded.

A test module is provided which contains a trivial example that decodes
an obfuscated password for an SSL certificate.

Author: Andrew Dunstan
Reviewed By: Andreas Karlsson, Asaba Takanori
Discussion: https://postgr.es/m/04116472-818b-5859-1d74-3d995aab2252@2ndQuadrant.com
2020-03-25 17:13:17 -04:00
Tom Lane bda6dedbea Go back to returning int from ereport auxiliary functions.
This reverts the parts of commit 17a28b0364
that changed ereport's auxiliary functions from returning dummy integer
values to returning void.  It turns out that a minority of compilers
complain (not entirely unreasonably) about constructs such as

	(condition) ? errdetail(...) : 0

if errdetail() returns void rather than int.  We could update those
call sites to say "(void) 0" perhaps, but the expectation for this
patch set was that ereport callers would not have to change anything.
And this aspect of the patch set was already the most invasive and
least compelling part of it, so let's just drop it.

Per buildfarm.

Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
2020-03-25 11:57:36 -04:00
Peter Eisentraut e8b1774fc2 Update SQL features
The name of E182 was changed in SQL:2011.

Also, we can change it to supported because all it requires is one
embedded language to be supported, which we do.
2020-03-25 08:46:41 +01:00
Thomas Munro 352f6f2df6 Add collation versions for Windows.
On Vista and later, use GetNLSVersionEx() to request collation version
information.

Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJvqup3s%2BJowVTcacZADO6dOhfdBmvOPHLS3KXUJu41Jw%40mail.gmail.com
2020-03-25 16:04:32 +13:00
Thomas Munro 382a821907 Allow NULL version for individual collations.
Remove the documented restriction that collation providers must either
return NULL for all collations or non-NULL for all collations.

Use NULL for glibc collations like "C.UTF-8", which might otherwise lead
future proposed commits to force unnecessary index rebuilds.

Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://postgr.es/m/CA%2BhUKGJvqup3s%2BJowVTcacZADO6dOhfdBmvOPHLS3KXUJu41Jw%40mail.gmail.com
2020-03-25 15:53:24 +13:00
Jeff Davis dd8e19132a Consider disk-based hash aggregation to implement DISTINCT.
Correct oversight in 1f39bce0. If enable_hashagg_disk=true, we should
consider hash aggregation for DISTINCT when applicable.
2020-03-24 18:30:04 -07:00
Jeff Davis 3649133b14 Avoid allocating unnecessary zero-sized array.
If there are no aggregates, there is no need to allocate an array of
zero AggStatePerGroupData elements.
2020-03-24 18:30:04 -07:00
Peter Geoghegan b150a76793 Fix nbtree deduplication README commentary.
Descriptions of some aspects of how deduplication works were unclear in
a couple of places.
2020-03-24 14:58:27 -07:00
Andres Freund 112b006fe7 logical decoding: Remove TODO about unnecessary optimization.
Measurements show, and intuition agrees, that there's currently no
known cases where adding a fastpath to avoid allocating / ordering a
heap for a single transaction is worthwhile.

Author: Dilip Kumar
Discussion: https://postgr.es/m/CAFiTN-sp701wvzvnLQJGk7JDqrFM8f--97-ihbwkU8qvn=p8nw@mail.gmail.com
2020-03-24 12:15:03 -07:00
Peter Eisentraut f15ace7935 Fix compiler warning on Cygwin
bf68b79e50 introduced an unused variable
compiler warning on Cygwin.
2020-03-24 19:31:02 +01:00
Tom Lane 17a28b0364 Improve the internal implementation of ereport().
Change all the auxiliary error-reporting routines to return void,
now that we no longer need to pretend they are passing something
useful to errfinish().  While this probably doesn't save anything
significant at the machine-code level, it allows detection of some
additional types of mistakes.

Pass the error location details (__FILE__, __LINE__, PG_FUNCNAME_MACRO)
to errfinish not errstart.  This shaves a few cycles off the case where
errstart decides we're not going to emit anything.

Re-implement elog() as a trivial wrapper around ereport(), removing
the separate support infrastructure it used to have.  Aside from
getting rid of some now-surplus code, this means that elog() now
really does have exactly the same semantics as ereport(), in particular
that it can skip evaluation work if the message is not to be emitted.

Andres Freund and Tom Lane

Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
2020-03-24 12:08:48 -04:00
Tom Lane e3a87b4991 Re-implement the ereport() macro using __VA_ARGS__.
Now that we require C99, we can depend on __VA_ARGS__ to work, and
revising ereport() to use it has several significant benefits:

* The extra parentheses around the auxiliary function calls are now
optional.  Aside from being a bit less ugly, this removes a common
gotcha for new contributors, because in some cases the compiler errors
you got from forgetting them were unintelligible.

* The auxiliary function calls are now evaluated as a comma expression
list rather than as extra arguments to errfinish().  This means that
compilers can be expected to warn about no-op expressions in the list,
allowing detection of several other common mistakes such as forgetting
to add errmsg(...) when converting an elog() call to ereport().

* Unlike the situation with extra function arguments, comma expressions
are guaranteed to be evaluated left-to-right, so this removes platform
dependency in the order of the auxiliary function calls.  While that
dependency hasn't caused us big problems in the past, this change does
allow dropping some rather shaky assumptions around errcontext() domain
handling.

There's no intention to make wholesale changes of existing ereport
calls, but as proof-of-concept this patch removes the extra parens
from a couple of calls in postgres.c.

While new code can be written either way, code intended to be
back-patched will need to use extra parens for awhile yet.  It seems
worth back-patching this change into v12, so as to reduce the window
where we have to be careful about that by one year.  Hence, this patch
is careful to preserve ABI compatibility; a followup HEAD-only patch
will make some additional simplifications.

Andres Freund and Tom Lane

Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
2020-03-24 11:49:00 -04:00
Peter Eisentraut cef27ae01a Fix compiler warning
A variable was unused in non-assert builds.  Simplify the code to
avoid the issue.

Reported-by: Erik Rijkers <er@xs4all.nl>
2020-03-24 16:02:01 +01:00
Peter Eisentraut 97ee604d9b Some refactoring of logical/worker.c
This moves the main operations of apply_handle_{insert|update|delete},
that of inserting, updating, deleting a tuple into/from a given
relation, into corresponding
apply_handle_{insert|update|delete}_internal functions.  This allows
performing those operations on relations that are not directly the
targets of replication, which is something a later patch will use for
targeting partitioned tables.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-03-24 15:00:54 +01:00
Andres Freund cedffbdb8b Report wait event for cost-based vacuum delay.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20200321040750.GD13662@telsasoft.com
2020-03-23 22:53:22 -07:00
Fujii Masao 496ee647ec Prefer standby promotion over recovery pause.
Previously if a promotion was triggered while recovery was paused,
the paused state continued. Also recovery could be paused by executing
pg_wal_replay_pause() even while a promotion was ongoing. That is,
recovery pause had higher priority over a standby promotion.
But this behavior was not desirable because most users basically wanted
the recovery to complete as soon as possible and the server to become
the master when they requested a promotion.

This commit changes recovery so that it prefers a promotion over
recovery pause. That is, if a promotion is triggered while recovery
is paused, the paused state ends and a promotion continues. Also
this commit makes recovery pause functions like pg_wal_replay_pause()
throw an error if they are executed while a promotion is ongoing.

Internally, this commit adds new internal function PromoteIsTriggered()
that returns true if a promotion is triggered. Since the name of
this function and the existing function IsPromoteTriggered() are
confusingly similar, the commit changes the name of IsPromoteTriggered()
to IsPromoteSignaled, as more appropriate name.

Author: Fujii Masao
Reviewed-by: Atsushi Torikoshi, Sergei Kornilov
Discussion: https://postgr.es/m/00c194b2-dbbb-2e8a-5b39-13f14048ef0a@oss.nttdata.com
2020-03-24 12:46:48 +09:00
Michael Paquier e09ad07b21 Move routine building restore_command to src/common/
restore_command has only been used until now by the backend, but there
is a pending patch for pg_rewind to make use of that in the frontend.

Author: Alexey Kondratov
Reviewed-by: Andrey Borodin, Andres Freund, Alvaro Herrera, Alexander
Korotkov, Michael Paquier
Discussion: https://postgr.es/m/a3acff50-5a0d-9a2c-b3b2-ee36168955c1@postgrespro.ru
2020-03-24 12:13:36 +09:00
Fujii Masao b8e20d6dab Add wait events for WAL archive and recovery pause.
This commit introduces new wait events BackupWaitWalArchive and
RecoveryPause. The former is reported while waiting for the WAL files
required for the backup to be successfully archived. The latter is
reported while waiting for recovery in pause state to be resumed.

Author: Fujii Masao
Reviewed-by: Michael Paquier, Atsushi Torikoshi, Robert Haas
Discussion: https://postgr.es/m/f0651f8c-9c96-9f29-0ff9-80414a15308a@oss.nttdata.com
2020-03-24 11:12:21 +09:00
Fujii Masao 67e0adfb3f Report NULL as total backup size if it's not estimated.
Previously 0 was reported in pg_stat_progress_basebackup.total_backup
if the total backup size was not estimated. Per discussion, our consensus
is that NULL is better choise as the value in total_backup in that case.
So this commit makes pg_stat_progress_basebackup view report NULL
in total_backup column if the estimation is disabled.

Bump catversion.

Author: Fujii Masao
Reviewed-by: Amit Langote, Magnus Hagander, Alvaro Herrera
Discussion: https://postgr.es/m/CABUevExnhOD89zBDuPvfAAh243RzNpwCPEWNLtMYpKHMB8gbAQ@mail.gmail.com
2020-03-24 10:43:41 +09:00
Jeff Davis 64fe602279 Fixes for Disk-based Hash Aggregation.
Justin Pryzby raised a couple issues with commit 1f39bce0. Fixed.

Also, tweak the way the size of a hash entry is estimated and the
number of buckets is estimated when calling BuildTupleHashTableExt().

Discussion: https://www.postgresql.org/message-id/20200319064222.GR26184@telsasoft.com
2020-03-23 15:43:07 -07:00
Amit Kapila 33753ac9d7 Add object names to partition integrity violations.
All errors of SQLSTATE class 23 should include the name of an object
associated with the error in separate fields of the error report message.
We do this so that applications need not try to extract them from the
possibly-localized human-readable text of the message.

Reported-by: Chris Bandy
Author: Chris Bandy
Reviewed-by: Amit Kapila and Amit Langote
Discussion: https://postgr.es/m/0aa113a3-3c7f-db48-bcd8-f9290b2269ae@gmail.com
2020-03-23 08:09:15 +05:30
Michael Paquier 79dfa8afb2 Add bound checks for ssl_min_protocol_version and ssl_max_protocol_version
Mixing incorrect bounds in the SSL context leads to confusing error
messages generated by OpenSSL which are hard to act on.  New range
checks are added when both min/max parameters are loaded in the context
of a SSL reload to improve the error reporting.  Note that this does not
make use of the GUC hook machinery contrary to 41aadee, as there is no
way to ensure a consistent range check (except if there is a way one day
to define range types for GUC parameters?).  Hence, this patch applies
only to OpenSSL, and uses a logic similar to other parameters to trigger
an error when reloading the SSL context in a session.

Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20200114035420.GE1515@paquier.xyz
2020-03-23 11:01:41 +09:00
Noah Misch de9396326e Revert "Skip WAL for new relfilenodes, under wal_level=minimal."
This reverts commit cb2fd7eac2.  Per
numerous buildfarm members, it was incompatible with parallel query, and
a test case assumed LP64.  Back-patch to 9.5 (all supported versions).

Discussion: https://postgr.es/m/20200321224920.GB1763544@rfd.leadboat.com
2020-03-22 09:24:09 -07:00
Noah Misch cb2fd7eac2 Skip WAL for new relfilenodes, under wal_level=minimal.
Until now, only selected bulk operations (e.g. COPY) did this.  If a
given relfilenode received both a WAL-skipping COPY and a WAL-logged
operation (e.g. INSERT), recovery could lose tuples from the COPY.  See
src/backend/access/transam/README section "Skipping WAL for New
RelFileNode" for the new coding rules.  Maintainers of table access
methods should examine that section.

To maintain data durability, just before commit, we choose between an
fsync of the relfilenode and copying its contents to WAL.  A new GUC,
wal_skip_threshold, guides that choice.  If this change slows a workload
that creates small, permanent relfilenodes under wal_level=minimal, try
adjusting wal_skip_threshold.  Users setting a timeout on COMMIT may
need to adjust that timeout, and log_min_duration_statement analysis
will reflect time consumption moving to COMMIT from commands like COPY.

Internally, this requires a reliable determination of whether
RollbackAndReleaseCurrentSubTransaction() would unlink a relation's
current relfilenode.  Introduce rd_firstRelfilenodeSubid.  Amend the
specification of rd_createSubid such that the field is zero when a new
rel has an old rd_node.  Make relcache.c retain entries for certain
dropped relations until end of transaction.

Back-patch to 9.5 (all supported versions).  This introduces a new WAL
record type, XLOG_GIST_ASSIGN_LSN, without bumping XLOG_PAGE_MAGIC.  As
always, update standby systems before master systems.  This changes
sizeof(RelationData) and sizeof(IndexStmt), breaking binary
compatibility for affected extensions.  (The most recent commit to
affect the same class of extensions was
089e4d405d0f3b94c74a2c6a54357a84a681754b.)

Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert
Haas.  Heikki Linnakangas and Michael Paquier implemented earlier
designs that materially clarified the problem.  Reviewed, in earlier
designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane,
Fujii Masao, and Simon Riggs.  Reported by Martijn van Oosterhout.

Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
2020-03-21 09:38:26 -07:00
Noah Misch d3e572855b In log_newpage_range(), heed forkNum and page_std arguments.
The function assumed forkNum=MAIN_FORKNUM and page_std=true, ignoring
the actual arguments.  Existing callers passed exactly those values, so
there's no live bug.  Back-patch to v12, where the function first
appeared, because another fix needs this.

Discussion: https://postgr.es/m/20191118045434.GA1173436@rfd.leadboat.com
2020-03-21 09:38:26 -07:00
Noah Misch e629a01f69 During heap rebuild, lock any TOAST index until end of transaction.
swap_relation_files() calls toast_get_valid_index() to find and lock
this index, just before swapping with the rebuilt TOAST index.  The
latter function releases the lock before returning.  Potential for
mischief is low; a concurrent session can issue ALTER INDEX ... SET
(fillfactor = ...), which is not alarming.  Nonetheless, changing
pg_class.relfilenode without a lock is unconventional.  Back-patch to
9.5 (all supported versions), because another fix needs this.

Discussion: https://postgr.es/m/20191226001521.GA1772687@rfd.leadboat.com
2020-03-21 09:38:26 -07:00
Noah Misch d60ef94d76 Fix cosmetic blemishes involving rd_createSubid.
Remove an obsolete comment from AtEOXact_cleanup().  Restore formatting
of a comment in struct RelationData, mangled by the pgindent run in
commit 9af4159fce.  Back-patch to 9.5 (all
supported versions), because another fix stacks on this.
2020-03-21 09:38:26 -07:00
Amit Kapila 3ba59ccc89 Allow page lock to conflict among parallel group members.
This is required as it is no safer for two related processes to perform
clean up in gin indexes at a time than for unrelated processes to do the
same.  After acquiring page locks, we can acquire relation extension lock
but reverse never happens which means these will also not participate in
deadlock.  So, avoid checking wait edges from this lock.

Currently, the parallel mode is strictly read-only, but after this patch
we have the infrastructure to allow parallel inserts and parallel copy.

Author: Dilip Kumar, Amit Kapila
Reviewed-by: Amit Kapila, Kuntal Ghosh and Sawada Masahiko
Discussion: https://postgr.es/m/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com
2020-03-21 08:48:06 +05:30
Amit Kapila 85f6b49c2c Allow relation extension lock to conflict among parallel group members.
This is required as it is no safer for two related processes to extend the
same relation at a time than for unrelated processes to do the same.  We
don't acquire a heavyweight lock on any other object after relation
extension lock which means such a lock can never participate in the
deadlock cycle.  So, avoid checking wait edges from this lock.

This provides an infrastructure to allow parallel operations like insert,
copy, etc. which were earlier not possible as parallel group members won't
conflict for relation extension lock.

Author: Dilip Kumar, Amit Kapila
Reviewed-by: Amit Kapila, Kuntal Ghosh and Sawada Masahiko
Discussion: https://postgr.es/m/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com
2020-03-20 08:20:56 +05:30
Peter Geoghegan b27e1b3418 nbtree: Remove obsolete _bt_pgaddtup() comments.
Remove comments that are a throw back to a time when nbtree cared about
write-ordering dependencies.  The comments are similar to those removed
by commit 9ee7414e, among others.
2020-03-19 14:56:56 -07:00
Jeff Davis 2fd6a44ad5 Revert "Specialize MemoryContextMemAllocated()."
This reverts commit e00912e11a.
2020-03-19 12:21:50 -07:00
Tom Lane 24e2885ee3 Introduce "anycompatible" family of polymorphic types.
This patch adds the pseudo-types anycompatible, anycompatiblearray,
anycompatiblenonarray, and anycompatiblerange.  They work much like
anyelement, anyarray, anynonarray, and anyrange respectively, except
that the actual input values need not match precisely in type.
Instead, if we can find a common supertype (using the same rules
as for UNION/CASE type resolution), then the parser automatically
promotes the input values to that type.  For example,
"myfunc(anycompatible, anycompatible)" can match a call with one
integer and one bigint argument, with the integer automatically
promoted to bigint.  With anyelement in the definition, the user
would have had to cast the integer explicitly.

The new types also provide a second, independent set of type variables
for function matching; thus with "myfunc(anyelement, anyelement,
anycompatible) returns anycompatible" the first two arguments are
constrained to be the same type, but the third can be some other
type, and the result has the type of the third argument.  The need
for more than one set of type variables was foreseen back when we
first invented the polymorphic types, but we never did anything
about it.

Pavel Stehule, revised a bit by me

Discussion: https://postgr.es/m/CAFj8pRDna7VqNi8gR+Tt2Ktmz0cq5G93guc3Sbn_NVPLdXAkqA@mail.gmail.com
2020-03-19 11:43:11 -04:00
Peter Eisentraut c314c147c0 Prepare to support non-tables in publications
This by itself doesn't change any functionality but prepares the way
for having relations other than base tables in publications.

Make arrangements for COPY handling the initial table sync.  For
non-tables we have to use COPY (SELECT ...) instead of directly
copying from the table, but then we have to take care to omit
generated columns from the column list.

Also, remove a hardcoded reference to relkind = 'r' and rely on the
publisher to send only what it can actually publish, which will be
correct even in future cross-version scenarios.

Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-03-19 08:25:07 +01:00
Fujii Masao 1d253bae57 Rename the recovery-related wait events.
This commit renames RecoveryWalAll and RecoveryWalStream wait events to
RecoveryWalStream and RecoveryRetrieveRetryInterval, respectively,
in order to make the names and what they are more consistent. For example,
previously RecoveryWalAll was reported as a wait event while the recovery
was waiting for WAL from a stream, and which was confusing because the name
was very different from the situation where the wait actually could happen.

The names of macro variables for those wait events also are renamed
accordingly.

This commit also changes the category of RecoveryRetrieveRetryInterval to
Timeout from Activity because the wait event is reported while waiting based
on wal_retrieve_retry_interval.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Atsushi Torikoshi
Discussion: https://postgr.es/m/124997ee-096a-5d09-d8da-2c7a57d0816e@oss.nttdata.com
2020-03-19 15:32:55 +09:00
Amit Kapila 72e78d831a Add assert to ensure that page locks don't participate in deadlock cycle.
Assert that we don't acquire any other heavyweight lock while holding the
page lock except for relation extension.  However, these locks are never
taken in reverse order which implies that page locks will never
participate in the deadlock cycle.

Similar to relation extension, page locks are also held for a short
duration, so imposing such a restriction won't hurt.

Author: Dilip Kumar, with few changes by Amit Kapila
Reviewed-by: Amit Kapila, Kuntal Ghosh and Sawada Masahiko
Discussion: https://postgr.es/m/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com
2020-03-19 08:11:45 +05:30
Peter Geoghegan 6312c08a29 nbtree: Use raw PageAddItem() for retail inserts.
Only internal page splits need to call _bt_pgaddtup() instead of
PageAddItem(), and only for data items, one of which will end up at the
first offset (or first offset after the high key offset) on the new
right page.  This data item alone will need to be truncated in
_bt_pgaddtup().

Since there is no reason why retail inserts ever need to truncate the
incoming item, use a raw PageAddItem() call there instead.  Even
_bt_split() uses raw PageAddItem() calls for left page and right page
high keys.  Clearly the _bt_pgaddtup() shim function wasn't really
encapsulating anything.  _bt_pgaddtup() should now be thought of as a
_bt_split() helper function.

Note that the assertions from commit d1e241c2 verify that retail inserts
never insert an item at an internal page's negative infinity offset.
This invariant could only ever be violated as a result of a basic logic
error in nbtinsert.c.
2020-03-18 18:17:37 -07:00
Michael Paquier d41202f36e Fix comment related to concurrent index swapping in index.c
A comment about switching indisvalid of the new and old indexes swapped
in REINDEX CONCURRENTLY got this backwards.

Issue introduced by 5dc92b8, the original commit of REINDEX
CONCURRENTLY.

Author: Julien Rouhaud
Discussion: https://postgr.es/m/20200318143340.GA46897@nol
Backpatch-through: 12
2020-03-19 09:51:33 +09:00
Jeff Davis 1f39bce021 Disk-based Hash Aggregation.
While performing hash aggregation, track memory usage when adding new
groups to a hash table. If the memory usage exceeds work_mem, enter
"spill mode".

In spill mode, new groups are not created in the hash table(s), but
existing groups continue to be advanced if input tuples match. Tuples
that would cause a new group to be created are instead spilled to a
logical tape to be processed later.

The tuples are spilled in a partitioned fashion. When all tuples from
the outer plan are processed (either by advancing the group or
spilling the tuple), finalize and emit the groups from the hash
table. Then, create new batches of work from the spilled partitions,
and select one of the saved batches and process it (possibly spilling
recursively).

Author: Jeff Davis
Reviewed-by: Tomas Vondra, Adam Lee, Justin Pryzby, Taylor Vesely, Melanie Plageman
Discussion: https://postgr.es/m/507ac540ec7c20136364b5272acbcd4574aa76ef.camel@j-davis.com
2020-03-18 15:42:02 -07:00
Jeff Davis e00912e11a Specialize MemoryContextMemAllocated().
An AllocSet doubles the size of allocated blocks (up to maxBlockSize),
which means that the current block can represent half of the total
allocated space for the memory context. But the free space in the
current block may never have been touched, so don't count the
untouched memory as allocated for the purposes of
MemoryContextMemAllocated().

Discussion: https://postgr.es/m/ec63d70b668818255486a83ffadc3aec492c1f57.camel@j-davis.com
2020-03-18 15:39:14 -07:00
Alvaro Herrera 487e9861d0
Enable BEFORE row-level triggers for partitioned tables
... with the limitation that the tuple must remain in the same
partition.

Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/20200227165158.GA2071@alvherre.pgsql
2020-03-18 18:58:05 -03:00
Peter Geoghegan b029395f5e Refactor nbtree fastpath optimization.
Commit 2b272734, which added the fastpath rightmost leaf page cache
insert optimization, added code to _bt_doinsert() to handle using and
invalidating the backend local block cache.  It doesn't seem like a good
place to handle these low level details, though.  _bt_doinsert() is
supposed to be a high level function -- it is the main entry point to
nbtinsert.c.

Restructure the code by placing handling of the rightmost block cache at
the start of a new _bt_search() shim function, _bt_search_insert().  The
new function is called from _bt_doinsert(), which uses it as a
_bt_search() variant that conveniently accepts its BTInsertState state
as an argument.  _bt_doinsert() no longer needs to directly consider the
fastpath optimization.

Discussion: https://postgr.es/m/CAH2-Wzk59cxKJRd=rfbyub6-V4yWRjsOYRkUNHBLT1P1GdtCQQ@mail.gmail.com
2020-03-18 14:42:49 -07:00
Peter Eisentraut a2b1faa0f2 Implement type regcollation
This will be helpful for a following commit and it's also just
generally useful, like the other reg* types.

Author: Julien Rouhaud
Reviewed-by: Thomas Munro and Michael Paquier
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
2020-03-18 21:21:00 +01:00
Tomas Vondra ccaa3569f5 Recognize some OR clauses as compatible with functional dependencies
Since commit 8f321bd16c functional dependencies can handle IN clauses,
which however introduced a possible (and surprising) inconsistency,
because IN clauses may be expressed as an OR clause, which are still
considered incompatible. For example

  a IN (1, 2, 3)

may be rewritten as

  (a = 1 OR a = 2 OR a = 3)

The IN clause will work fine with functional dependencies, but the OR
clause will force the estimation to fall back to plain per-column
estimates, possibly introducing significant estimation errors.

This commit recognizes OR clauses equivalent to an IN clause (when all
arugments are compatible and reference the same attribute) as a special
case, compatible with functional dependencies. This allows applying
functional dependencies, just like for IN clauses.

This does not eliminate the difference in estimating the clause itself,
i.e. IN clause and OR clause still use different formulas. It would be
possible to change that (for these special OR clauses), but that's not
really about extended statistics - it was always like this. Moreover the
errors are usually much smaller compared to ignoring dependencies.

Author: Tomas Vondra
Reviewed-by: Dean Rasheed
Discussion: https://www.postgresql.org/message-id/flat/13902317.Eha0YfKkKy%40pierred-pdoc
2020-03-18 16:41:49 +01:00
Tomas Vondra 6f72dbc48b Fix wording of several extended stats comments
Reported-by: Thomas Munro
Discussion: https://www.postgresql.org/message-id/flat/20200113230008.g67iyk4cs3xbnjju@development
2020-03-18 13:40:13 +01:00
Amit Kapila b4f140869f Add missing errcode() in a few ereport calls.
This will allow to specifying SQLSTATE error code for the errors in the
missing places.

Reported-by: Sawada Masahiko
Author: Sawada Masahiko
Backpatch-through: 9.5
Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
2020-03-18 09:27:14 +05:30
Michael Paquier fdeeb524b4 Fix typo in indexcmds.c
Introduced by 61d7c7b.

Backpatch-through: 12
2020-03-18 11:13:12 +09:00
Amit Kapila 15ef6ff4b9 Assert that we don't acquire a heavyweight lock on another object after
relation extension lock.

The only exception to the rule is that we can try to acquire the same
relation extension lock more than once.  This is allowed as we are not
creating any new lock for this case.  This restriction implies that the
relation extension lock won't ever participate in the deadlock cycle
because we can never wait for any other heavyweight lock after acquiring
this lock.

Such a restriction is okay for relation extension locks as unlike other
heavyweight locks these are not held till the transaction end.  These are
taken for a short duration to extend a particular relation and then
released.

Author: Dilip Kumar, with few changes by Amit Kapila
Reviewed-by: Amit Kapila, Kuntal Ghosh and Sawada Masahiko
Discussion: https://postgr.es/m/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com
2020-03-18 07:20:17 +05:30
Peter Geoghegan b897b3aae6 nbtree: Remove useless local variables.
Copying block and offset numbers to local variables in _bt_insertonpg()
made the code less readable.  Remove the variables.  There is already
code that conditionally calls BufferGetBlockNumber() in the same block,
so consistently do it that way instead.

Calling BufferGetBlockNumber() is very cheap, but we might as well avoid
it when it isn't truly necessary.  It isn't truly necessary for
_bt_insertonpg() to call BufferGetBlockNumber() in almost all cases.

Spotted while working on a patch that refactors the fastpath rightmost
leaf page cache optimization, which was added by commit 2b272734.
2020-03-17 18:39:26 -07:00
Thomas Munro 9b8aa09293 Don't use EV_CLEAR for kqueue events.
For the semantics to match the epoll implementation, we need a socket to
continue to appear readable/writable if you wait multiple times without
doing I/O in between (in Linux terminology: level-triggered rather than
edge-triggered).  This distinction will be important for later commits.
Similar to commit 3b790d256f for Windows.

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJAC4Oqao%3DqforhNey20J8CiG2R%3DoBPqvfR0vOJrFysGw%40mail.gmail.com
2020-03-18 13:05:09 +13:00
Thomas Munro 7bc84a1f30 Fix kqueue support under debugger on macOS.
While running under a debugger, macOS's getppid() can return the
debugger's PID.  That could cause a backend to exit because it falsely
believed that the postmaster had died, since commit 815c2f09.

Continue to use getppid() as a fast postmaster check after adding the
postmaster's PID to a kqueue, to close a PID-reuse race, but double
check that it actually exited by trying to read the pipe.  The new check
isn't reached in the common case.

Reported-by: Alexander Korotkov <a.korotkov@postgrespro.ru>
Discussion: https://postgr.es/m/CA%2BhUKGKhAxJ8V8RVwCo6zJaeVrdOG1kFBHGZOOjf6DzW_omeMA%40mail.gmail.com
2020-03-18 13:05:07 +13:00
Tom Lane e6c178b5b7 Refactor our checks for valid function and aggregate signatures.
pg_proc.c and pg_aggregate.c had near-duplicate copies of the logic
to decide whether a function or aggregate's signature is legal.
This seems like a bad thing even without the problem that the
upcoming "anycompatible" patch would roughly double the complexity
of that logic.  Hence, refactor so that the rules are localized
in new subroutines supplied by parse_coerce.c.  (One could quibble
about just where to add that code, but putting it beside
enforce_generic_type_consistency seems not totally unreasonable.)

The fact that the rules are about to change would mandate some
changes in the wording of the associated error messages in any case.
I ended up spelling things out in a fairly literal fashion in the
errdetail messages, eg "A result of type anyelement requires at
least one input of type anyelement, anyarray, anynonarray, anyenum,
or anyrange."  Perhaps this is overkill, but once there's more than
one subgroup of polymorphic types, people might get confused by
more-abstract messages.

Discussion: https://postgr.es/m/24137.1584139352@sss.pgh.pa.us
2020-03-17 19:36:41 -04:00
Tom Lane 77ec5affbc Adjust handling of an ANYARRAY actual input for an ANYARRAY argument.
Ordinarily it's impossible for an actual input of a function to have
declared type ANYARRAY, since we'd resolve that to a concrete array
type before doing argument type resolution for the function.  But an
exception arises for functions applied to certain columns of pg_statistic
or pg_stats, since we abuse the "anyarray" pseudotype by using it to
declare those columns.  So parse_coerce.c has to deal with the case.

Previously we allowed an ANYARRAY actual input to match an ANYARRAY
polymorphic argument, but only if no other argument or result was
declared ANYELEMENT.  When that logic was written, those were the only
two polymorphic types, and I fear nobody thought carefully about how it
ought to extend to the other ones that came along later.  But actually
it was wrong even then, because if a function has two ANYARRAY
arguments, it should be able to expect that they have identical element
types, and we'd not be able to ensure that.

The correct generalization is that we can match an ANYARRAY actual input
to an ANYARRAY polymorphic argument only if no other argument or result
is of any polymorphic type, so that no promises are being made about
element type compatibility.  check_generic_type_consistency can't test
that condition, but it seems better anyway to accept such matches there
and then throw an error if needed in enforce_generic_type_consistency.
That way we can produce a specific error message rather than an
unintuitive "function does not exist" complaint.  (There's some risk
perhaps of getting new ambiguous-function complaints, but I think that
any set of functions for which that could happen would be ambiguous
against ordinary array columns as well.)  While we're at it, we can
improve the message that's produced in cases that the code did already
object to, as shown in the regression test changes.

Also, remove a similar test that got cargo-culted in for ANYRANGE;
there are no catalog columns of type ANYRANGE, and I hope we never
create any, so that's not needed.  (It was incomplete anyway.)

While here, update some comments and rearrange the code a bit in
preparation for upcoming additions of more polymorphic types.

In practical situations I believe this is just exchanging one error
message for another, hopefully better, one.  So it doesn't seem
needful to back-patch, even though the mistake is ancient.

Discussion: https://postgr.es/m/21569.1584314271@sss.pgh.pa.us
2020-03-17 18:29:14 -04:00
Alvaro Herrera 5d0c2d5eba
Remove logical_read_local_xlog_page
It devolved into a content-less wrapper over read_local_xlog_page, with
nothing to add, plus it's easily confused with walsender's
logical_read_xlog_page.  There doesn't seem to be any reason for it to
stay.

src/include/replication/logicalfuncs.h becomes empty, so remove it too.
The prototypes it initially had were absorbed by generated fmgrprotos.h.

Discussion: https://postgr.es/m/20191115214102.GA15616@alvherre.pgsql
2020-03-17 18:18:01 -03:00
Alvaro Herrera bcd1c36300
Fix consistency issues with replication slot copy
Commit 9f06d79ef831's replication slot copying failed to
properly reserve the WAL that the slot is expecting to see
during DecodingContextFindStartpoint (to set the confirmed_flush
LSN), so concurrent activity could remove that WAL and cause the
copy process to error out.  But it doesn't actually *need* that
WAL anyway: instead of running decode to find confirmed_flush, it
can be copied from the source slot. Fix this by rearranging things
to avoid DecodingContextFindStartpoint() (leaving the target slot's
confirmed_flush_lsn to invalid), and set that up afterwards by copying
from the target slot's value.

Also ensure the source slot's confirmed_flush_lsn is valid.

Reported-by: Arseny Sher
Author: Masahiko Sawada, Arseny Sher
Discussion: https://postgr.es/m/871rr3ohbo.fsf@ars-thinkpad
2020-03-17 16:13:18 -03:00
Tom Lane 9d9784c840 Remove bogus assertion about polymorphic SQL function result.
It is possible to reach check_sql_fn_retval() with an unresolved
polymorphic rettype, resulting in an assertion failure as demonstrated
by one of the added test cases.  However, the code following that
throws what seems an acceptable error message, so just remove the
Assert and adjust commentary.

While here, I thought it'd be a good idea to provide some parallel
tests of SQL-function and PL/pgSQL-function polymorphism behavior.
Some of these cases are perhaps duplicative of tests elsewhere,
but we hadn't any organized coverage of the topic AFAICS.

Although that assertion's been wrong all along, it won't have any
effect in production builds, so I'm not bothering to back-patch.

Discussion: https://postgr.es/m/21569.1584314271@sss.pgh.pa.us
2020-03-17 14:54:46 -04:00
Fujii Masao 1429d3f767 Fix comment in xlog.c.
This commit fixes the comment about SharedHotStandbyActive variable.
The comment was apparently copy-and-pasted.

Author: Atsushi Torikoshi
Discussion: https://postgr.es/m/CACZ0uYEjpqZB9wN2Rwc_RMvDybyYqdbkPuDr1NyxJg4f9yGfMw@mail.gmail.com
2020-03-17 12:06:22 +09:00
Tom Lane 41b45576d5 Remove useless pfree()s at the ends of various ValuePerCall SRFs.
We don't need to manually clean up allocations in a SRF's
multi_call_memory_ctx, because the SRF_RETURN_DONE infrastructure
takes care of that (and also ensures that it will happen even if the
function never gets a final call, which simple manual cleanup cannot
do).

Hence, the code removed by this patch is a waste of code and cycles.
Worse, it gives the impression that cleaning up manually is a thing,
which can lead to more serious errors such as those fixed in
commits 085b6b667 and b4570d33a.  So we should get rid of it.

These are not quite actual bugs though, so I couldn't muster the
enthusiasm to back-patch.  Fix in HEAD only.

Justin Pryzby

Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
2020-03-16 21:36:53 -04:00
Tom Lane b4570d33aa Avoid holding a directory FD open across assorted SRF calls.
This extends the fixes made in commit 085b6b667 to other SRFs with the
same bug, namely pg_logdir_ls(), pgrowlocks(), pg_timezone_names(),
pg_ls_dir(), and pg_tablespace_databases().

Also adjust various comments and documentation to warn against
expecting to clean up resources during a ValuePerCall SRF's final
call.

Back-patch to all supported branches, since these functions were
all born broken.

Justin Pryzby, with cosmetic tweaks by me

Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
2020-03-16 21:05:52 -04:00
Peter Geoghegan 113758155c nbtree: Fix obsolete _bt_search() comment.
Oversight in commit d2086b08b0.
2020-03-16 15:51:06 -07:00
Peter Geoghegan 013c1f6af6 nbtree: Pass down MAXALIGN()'d itemsz for new item.
Refactor nbtinsert.c so that the final itemsz of each new non-pivot
tuple (the MAXALIGN()'d size) is determined once.  Most of the functions
used by leaf page inserts used the insertstate.itemsz value already.
This commit makes everything use insertstate.itemsz as standard
practice.  The goal is to decouple tuple size from "effective" tuple
size.  Making this distinction isn't truly necessary right now, but that
might change in the future.

Also explain why we consistently apply MAXALIGN() to get an effective
index tuple size.  This was rather unclear, in part because it isn't
actually strictly necessary right now.
2020-03-16 12:00:10 -07:00
Thomas Munro fc34b0d9de Introduce a maintenance_io_concurrency setting.
Introduce a GUC and a tablespace option to control I/O prefetching, much
like effective_io_concurrency, but for work that is done on behalf of
many client sessions.

Use the new setting in heapam.c instead of the hard-coded formula
effective_io_concurrency + 10 introduced by commit 558a9165e0.  Go with
a default value of 10 for now, because it's a round number pretty close
to the value used for that existing case.

Discussion: https://postgr.es/m/CA%2BhUKGJUw08dPs_3EUcdO6M90GnjofPYrWp4YSLaBkgYwS-AqA%40mail.gmail.com
2020-03-16 17:14:26 +13:00
Thomas Munro b09ff53667 Simplify the effective_io_concurrency setting.
The effective_io_concurrency GUC and equivalent tablespace option were
previously passed through a formula based on a theory about RAID
spindles and probabilities, to arrive at the number of pages to prefetch
in bitmap heap scans.  Tomas Vondra, Andres Freund and others argued
that it was anachronistic and hard to justify, and commit 558a9165e0
already started down the path of bypassing it in new code.  We agreed to
drop that logic and use the value directly.

For the default setting of 1, there is no change in effect.  Higher
settings can be converted from the old meaning to the new with:

  select round(sum(OLD / n::float)) from generate_series(1, OLD) s(n);

We might want to consider renaming the GUC before the next release given
the change in meaning, but it's not clear that many users had set it
very carefully anyway.  That decision is deferred for now.

Discussion: https://postgr.es/m/CA%2BhUKGJUw08dPs_3EUcdO6M90GnjofPYrWp4YSLaBkgYwS-AqA%40mail.gmail.com
2020-03-16 17:14:26 +13:00
Peter Geoghegan f207bb0b8f nbtree: Reorder nbtinsert.c prototypes.
Relocate _bt_newroot() prototype, so that the order that prototypes
appear in matches the order that the functions are defined in.
2020-03-15 20:53:12 -07:00
Peter Eisentraut 70a7b4776b Add backend type to csvlog and optionally log_line_prefix
The backend type, which corresponds to what
pg_stat_activity.backend_type shows, is added as a column to the
csvlog and can optionally be added to log_line_prefix using the new %b
placeholder.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://www.postgresql.org/message-id/flat/c65e5196-4f04-4ead-9353-6088c19615a3@2ndquadrant.com
2020-03-15 11:20:21 +01:00
Tom Lane 87c9c2571c Rearrange pseudotypes.c to get rid of duplicative code.
Commit a5954de10 replaced a lot of manually-coded stub I/O routines
with code generated by macros.  That was a good idea but it didn't
go far enough, because there were still manually-coded stub input
routines for types that had live output routines.  Refactor the
macro so that we can generate just a stub input routine at need.

Also create similar macros to generate stub binary I/O routines,
since we have some of those now.  The only stub functions that remain
hand-coded are shell_in() and shell_out(), which need to be separate
because they use different error messages.

While here, rearrange the commentary to discuss each type not each
function.  This provides a better way to explain the *why* of which
types need which support, rather than just duplicatively annotating
the functions.

Discussion: https://postgr.es/m/24137.1584139352@sss.pgh.pa.us
2020-03-14 15:31:44 -04:00
Tom Lane 4dbcb3f844 Restructure polymorphic-type resolution in funcapi.c.
resolve_polymorphic_tupdesc() and resolve_polymorphic_argtypes() failed to
cover the case of having to resolve anyarray given only an anyrange input.
The bug was masked if anyelement was also used (as either input or
output), which probably helps account for our not having noticed.

While looking at this I noticed that resolve_generic_type() would produce
the wrong answer if asked to make that same resolution.  ISTM that
resolve_generic_type() is confusingly defined and overly complex, so
rather than fix it, let's just make funcapi.c do the specific lookups
it requires for itself.

With this change, resolve_generic_type() is not used anywhere, so remove
it in HEAD.  In the back branches, leave it alone (complete with bug)
just in case any external code is using it.

While we're here, make some other refactoring adjustments in funcapi.c
with an eye to upcoming future expansion of the set of polymorphic types:

* Simplify quick-exit tests by adding an overall have_polymorphic_result
flag.  This is about a wash now but will be a win when there are more
flags.

* Reduce duplication of code between resolve_polymorphic_tupdesc() and
resolve_polymorphic_argtypes().

* Don't bother to validate correct matching of anynonarray or anyenum;
the parser should have done that, and even if it didn't, just doing
"return false" here would lead to a very confusing, off-point error
message.  (Really, "return false" in these two functions should only
occur if the call_expr isn't supplied or we can't obtain data type
info from it.)

* For the same reason, throw an elog rather than "return false" if
we fail to resolve a polymorphic type.

The bug's been there since we added anyrange, so back-patch to
all supported branches.

Discussion: https://postgr.es/m/6093.1584202130@sss.pgh.pa.us
2020-03-14 14:42:22 -04:00
Tomas Vondra e83daa7e33 Use multi-variate MCV lists to estimate ScalarArrayOpExpr
Commit 8f321bd16c added support for estimating ScalarArrayOpExpr clauses
(IN/ANY) clauses using functional dependencies. There's no good reason
not to support estimation of these clauses using multi-variate MCV lists
too, so this commits implements that. That makes the behavior consistent
and MCV lists can estimate all variants (ANY/ALL, inequalities, ...).

Author: Tomas Vondra
Review: Dean Rasheed
Discussion: https://www.postgresql.org/message-id/flat/13902317.Eha0YfKkKy%40pierred-pdoc
2020-03-14 16:13:00 +01:00
Tomas Vondra 8f321bd16c Use functional dependencies to estimate ScalarArrayOpExpr
Until now functional dependencies supported only simple equality clauses
and clauses that can be trivially translated to equalities. This commit
allows estimation of some ScalarArrayOpExpr (IN/ANY) clauses.

For IN clauses we can do this thanks to using operator with equality
semantics, which means an IN clause

    WHERE c IN (1, 2, ..., N)

can be translated to

    WHERE (c = 1 OR c = 2 OR ... OR c = N)

IN clauses are now considered compatible with functional dependencies,
and rely on the same assumption of consistency of queries with data
(which is an assumption we already used for simple equality clauses).
This applies also to ALL clauses with an equality operator, which can be
considered equivalent to IN clause.

ALL clauses are still considered incompatible, although there's some
discussion about maybe relaxing this in the future.

Author: Pierre Ducroquet
Reviewed-by: Tomas Vondra, Dean Rasheed
Discussion: https://www.postgresql.org/message-id/flat/13902317.Eha0YfKkKy%40pierred-pdoc
2020-03-14 16:12:41 +01:00
Peter Eisentraut d90bd24391 Remove am_syslogger global variable
Use the new MyBackendType instead.  More similar changes for other "am
something" variables are possible.  This one was just particularly
simple.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c65e5196-4f04-4ead-9353-6088c19615a3@2ndquadrant.com
2020-03-13 14:01:15 +01:00
Peter Eisentraut 8e8a0becb3 Unify several ways to tracking backend type
Add a new global variable MyBackendType that uses the same BackendType
enum that was previously only used by the stats collector.  That way
several duplicate ways of checking what type a particular process is
can be simplified.  Since it's no longer just for stats, move to
miscinit.c and rename existing functions to match the expanded
purpose.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c65e5196-4f04-4ead-9353-6088c19615a3@2ndquadrant.com
2020-03-13 14:01:10 +01:00
Peter Eisentraut 1cc9c2412c Preserve replica identity index across ALTER TABLE rewrite
If an index was explicitly set as replica identity index, this setting
was lost when a table was rewritten by ALTER TABLE.  Because this
setting is part of pg_index but actually controlled by ALTER
TABLE (not part of CREATE INDEX, say), we have to do some extra work
to restore it.

Based-on-patch-by: Quan Zongliang <quanzongliang@gmail.com>
Reviewed-by: Euler Taveira <euler.taveira@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c70fcab2-4866-0d9f-1d01-e75e189db342@gmail.com
2020-03-13 11:57:06 +01:00
Tom Lane 085b6b6679 Avoid holding a directory FD open across pg_ls_dir_files() calls.
This coding technique is undesirable because (a) it leaks the FD for
the rest of the transaction if the SRF is not run to completion, and
(b) allocated FDs are a scarce resource, but multiple interleaved
uses of the relevant functions could eat many such FDs.

In v11 and later, a query such as "SELECT pg_ls_waldir() LIMIT 1"
yields a warning about the leaked FD, and the only reason there's
no warning in earlier branches is that fd.c didn't whine about such
leaks before commit 9cb7db3f0.  Even disregarding the warning, it
wouldn't be too hard to run a backend out of FDs with careless use
of these SQL functions.

Hence, rewrite the function so that it reads the directory within
a single call, returning the results as a tuplestore rather than
via value-per-call mode.

There are half a dozen other built-in SRFs with similar problems,
but let's fix this one to start with, just to see if the buildfarm
finds anything wrong with the code.

In passing, fix bogus error report for stat() failure: it was
whining about the directory when it should be fingering the
individual file.  Doubtless a copy-and-paste error.

Back-patch to v10 where this function was added.

Justin Pryzby, with cosmetic tweaks and test cases by me

Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
2020-03-11 15:27:59 -04:00
Peter Eisentraut bf68b79e50 Refactor ps_status.c API
The init_ps_display() arguments were mostly lies by now, so to match
typical usage, just use one argument and let the caller assemble it
from multiple sources if necessary.  The only user of the additional
arguments is BackendInitialize(), which was already doing string
assembly on the caller side anyway.

Remove the second argument of set_ps_display() ("force") and just
handle that in init_ps_display() internally.

BackendInitialize() also used to set the initial status as
"authentication", but that was very far from where authentication
actually happened.  So now it's set to "initializing" and then
"authentication" just before the actual call to
ClientAuthentication().

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c65e5196-4f04-4ead-9353-6088c19615a3@2ndquadrant.com
2020-03-11 16:38:31 +01:00
Alvaro Herrera 899a04f5ed
Avoid duplicates in ALTER ... DEPENDS ON EXTENSION
If the command is attempted for an extension that the object already
depends on, silently do nothing.

In particular, this means that if a database containing multiple such
entries is dumped, the restore will silently do the right thing and
record just the first one.  (At least, in a world where pg_dump does
dump such entries -- which it doesn't currently, but it will.)

Backpatch to 9.6, where this kind of dependency was introduced.

Reviewed-by: Ibrar Ahmed, Tom Lane (offlist)
Discussion: https://postgr.es/m/20200217225333.GA30974@alvherre.pgsql
2020-03-11 11:04:59 -03:00
Peter Eisentraut 1c91838181 Clean up order in miscinit.c a bit
The code around InitPostmasterChild() from commit 31c453165b somehow
ended up in the middle of a block of code related to "User ID state".
Move it into its own block instead.
2020-03-11 13:51:55 +01:00
Peter Eisentraut aaa3aeddee Remove HAVE_WORKING_LINK
Previously, hard links were not used on Windows and Cygwin, but they
support them just fine in currently supported OS versions, so we can
use them there as well.

Since all supported platforms now support hard links, we can remove
the alternative code paths.

Rename durable_link_or_rename() to durable_rename_excl() to make the
purpose more clear without referencing the implementation details.

Discussion: https://www.postgresql.org/message-id/flat/72fff73f-dc9c-4ef4-83e8-d2e60c98df48%402ndquadrant.com
2020-03-11 11:23:04 +01:00
Peter Geoghegan 39eabec904 nbtree: Move fastpath NULL descent stack assertion.
Commit 074251db added an assertion that verified the fastpath/rightmost
page insert optimization's assumption about free space: There should
always be enough free space on the page to insert the new item without
splitting the page.  Otherwise, we end up using the "concurrent root
page split" phony/fake stack path in _bt_insert_parent().  This does not
lead to incorrect behavior, but it is likely to be far slower than
simply using the regular _bt_search() path.  The assertion catches
serious performance bugs that would probably take a long time to detect
any other way.

It seems much more natural to make this assertion just before the point
that we generate a fake/phony descent stack.  Move the assert there.
This also makes _bt_insertonpg() a bit more readable.
2020-03-10 17:25:47 -07:00
Tom Lane c8e8b2f9df Marginal comments and docs cleanup.
Fix up some imprecise comments and poor markup from ba79cb5dc.  Also try
to convert the documentation of log_min_duration_sample and friends into
passable English.
2020-03-10 17:34:09 -04:00
Peter Geoghegan d1e241c226 nbtree: Demote minus infinity "can't happen" error.
Only a very basic logic bug in a _bt_insertonpg() caller could lead to a
violation of this invariant.  Besides, any newitemoff used for an
internal page is sanitized using other "can't happen" errors in
_bt_getstackbuf() or its callers, before _bt_insertonpg() even gets
called.

Also, move the error/assertion from the insert-without-split path of
_bt_insertonpg() to the top of the same function.  There is no reason
why this invariant only applies to insertions that happen to not result
in a page split; cover every insertion.  The assertion naturally belongs
next to the existing generic assertions that document relatively
high-level invariants for the item being inserted.
2020-03-10 14:15:41 -07:00
Tom Lane cacef17223 Ensure that CREATE TABLE LIKE copies any NO INHERIT constraint property.
Since the documentation about LIKE doesn't say that a copied constraint
has properties different from the original, it seems that ignoring
a NO INHERIT property doesn't meet the principle of least surprise.
So make it copy that.

(Note, however, that we still don't copy a NOT VALID property;
CREATE TABLE offers no way to do that, plus it seems pointless.)

Arguably this is a bug fix; but no back-patch, as it seems barely
possible somebody is depending on the current behavior.

Ildar Musin and Chris Travers; reviewed by Amit Langote and myself

Discussion: https://postgr.es/m/CAONYFtMC6C+3AWCVp7Yd8H87Zn0GxG1_iQG6_bQKbaqYZY0=-g@mail.gmail.com
2020-03-10 14:54:00 -04:00
Tom Lane d01f03a495 Preserve integer and float values accurately in (de)serialize_deflist.
Previously, this code just smashed all types of DefElem values to
strings, cavalierly reasoning that nobody would care.  But in point of
fact, most of the defGetFoo functions do distinguish among different
input syntaxes; for instance defGetBoolean will accept 1 as an integer
but not "1" as a string.  This led to CREATE/ALTER TEXT SEARCH
DICTIONARY accepting 0 and 1 as values for boolean dictionary
properties, only to have the dictionary fail at runtime.

We can upgrade this behavior by teaching serialize_deflist that it
does not need to quote T_Integer or T_Float nodes' values on output,
and then teaching deserialize_deflist to restore unquoted integer or
float values as the appropriate node type.  This should not break
anything using pg_ts_dict.dictinitoption, since that field is just
defined as being something valid to include in CREATE TEXT SEARCH
DICTIONARY.

deserialize_deflist is also used to parse the options arguments
for the ts_headline family of functions, but so far as I can see
this won't cause any problems there either: the only consumer of
that output is prsd_headline which always uses defGetString.
(Really that's a bad idea, but I won't risk changing it here.)

This is surely a bug fix, but given the lack of field complaints
I don't think it's necessary to back-patch.

Discussion: https://postgr.es/m/CAMkU=1xRcs_BUPzR0+V3WndaCAv0E_m3h6aUEJ8NF-sY1nnHsw@mail.gmail.com
2020-03-10 12:30:02 -04:00
Alvaro Herrera 40b3e2c201
Split out CreateCast into src/backend/catalog/pg_cast.c
This catalog-handling code was previously together with the rest of
CastCreate() in src/backend/commands/functioncmds.c.  A future patch
will need a way to add casts internally, so this will be useful to have
separate.

Also, move the nearby get_cast_oid() function from functioncmds.c to
lsyscache.c, which seems a more natural place for it.

Author: Paul Jungwirth, minor edits by Álvaro
Discussion: https://postgr.es/m/20200309210003.GA19992@alvherre.pgsql
2020-03-10 11:28:23 -03:00
Peter Eisentraut 3c173a53a8 Remove utils/acl.h from catalog/objectaddress.h
The need for this was removed by
8b9e9644dc.

A number of files now need to include utils/acl.h or
parser/parse_node.h explicitly where they previously got it indirectly
somehow.

Since parser/parse_node.h already includes nodes/parsenodes.h, the
latter is then removed where the former was added.  Also, remove
nodes/pg_list.h from objectaddress.h, since that's included via
nodes/parsenodes.h.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/7601e258-26b2-8481-36d0-dc9dca6f28f1%402ndquadrant.com
2020-03-10 10:27:00 +01:00
Peter Eisentraut 17b9e7f9fe Support adding partitioned tables to publication
When a partitioned table is added to a publication, changes of all of
its partitions (current or future) are published via that publication.

This change only affects which tables a publication considers as its
members.  The receiving side still sees the data coming from the
individual leaf partitions.  So existing restrictions that partition
hierarchies can only be replicated one-to-one are not changed by this.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-03-10 09:09:32 +01:00
Michael Paquier 61d7c7bce3 Prevent reindex of invalid indexes on TOAST tables
Such indexes can only be duplicated leftovers of a previously failed
REINDEX CONCURRENTLY command, and a valid equivalent is guaranteed to
exist.  As toast indexes can only be dropped if invalid, reindexing
these would lead to useless duplicated indexes that can't be dropped
anymore, except if the parent relation is dropped.

Thanks to Justin Pryzby for reminding that this problem was reported
long ago during the review of the original patch of REINDEX
CONCURRENTLY, but the issue was never addressed.

Reported-by: Sergei Kornilov, Justin Pryzby
Author: Julien Rouhaud
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net
Discussion: https://postgr.es/m/20200216190835.GA21832@telsasoft.com
Backpatch-through: 12
2020-03-10 15:38:17 +09:00
Fujii Masao 71e0d0a737 Tidy up XLogSource code in xlog.c.
This commit replaces 0 used as an initial value of XLogSource variable,
with XLOG_FROM_ANY. Also this commit changes those variable so that
XLogSource instead of int is used as the type for them. These changes
are for code readability and debugger-friendliness.

Author: Kyotaro Horiguchi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20200227.124830.2197604521555566121.horikyota.ntt@gmail.com
2020-03-10 09:41:44 +09:00
Jeff Davis 24d85952a5 Introduce LogicalTapeSetExtend().
Increases the number of tapes in a logical tape set. This will be
important for disk-based hash aggregation, because the maximum number
of tapes is not known ahead of time.

While discussing this change, it was observed to regress the
performance of Sort for at least one test case. The performance
regression was because some versions of GCC switch to an inlined
version of memcpy() in LogicalTapeWrite() after this change. No
performance regression for clang was observed.

Because the regression is due to an arbitrary decision by the
compiler, I decided it shouldn't hold up this change. If it needs to
be fixed, we can find a workaround.

Author: Adam Lee, Jeff Davis
Discussion: https://postgr.es/m/e54bfec11c59689890f277722aaaabd05f78e22c.camel%40j-davis.com
2020-03-09 10:40:02 -07:00
Fujii Masao 17d3fcdc39 Fix bug that causes to report waiting in PS display twice, in hot standby.
Previously "waiting" could appear twice via PS in case of lock conflict
in hot standby mode. Specifically this issue happend when the delay
in WAL application determined by max_standby_archive_delay and
max_standby_streaming_delay had passed but it took more than 500 msec
to cancel all the conflicting transactions. Especially we can observe this
easily by setting those delay parameters to -1.

The cause of this issue was that WaitOnLock() and
ResolveRecoveryConflictWithVirtualXIDs() added "waiting" to
the process title in that case. This commit prevents
ResolveRecoveryConflictWithVirtualXIDs() from reporting waiting
in case of lock conflict, to fix the bug.

Back-patch to all back branches.

Author: Masahiko Sawada
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CA+fd4k4mXWTwfQLS3RPwGr4xnfAEs1ysFfgYHvmmoUgv6Zxvmg@mail.gmail.com
2020-03-10 00:14:43 +09:00
Peter Eisentraut 71d60e2aa0 Add tg_updatedcols to TriggerData
This allows a trigger function to determine for an UPDATE trigger
which columns were actually updated.  This allows some optimizations
in generic trigger functions such as lo_manage and
tsvector_update_trigger.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/11c5f156-67a9-0fb5-8200-2a8018eb2e0c@2ndquadrant.com
2020-03-09 09:34:55 +01:00
Peter Eisentraut 8f152b6c50 Code simplification
Initialize TriggerData to 0 for the whole struct together, instead of
each field separately.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/11c5f156-67a9-0fb5-8200-2a8018eb2e0c@2ndquadrant.com
2020-03-09 09:34:55 +01:00
Fujii Masao ef34ab42a8 Avoid assertion failure with targeted recovery in standby mode.
At the end of recovery, standby mode is turned off to re-fetch the last
valid record from archive or pg_wal. Previously, if recovery target was
reached and standby mode was turned off while the current WAL source
was stream, recovery could try to retrieve WAL file containing the last
valid record unexpectedly from stream even though not in standby mode.
This caused an assertion failure. That is, the assertion test confirms that
WAL file should not be retrieved from stream if standby mode is not true.

This commit moves back the current WAL source to archive if it's stream
even though not in standby mode, to avoid that assertion failure.

This issue doesn't cause the server to crash when built with assertion
disabled. In this case, the attempt to retrieve WAL file from stream not
in standby mode just fails. And then recovery tries to retrieve WAL file
from archive or pg_wal.

Back-patch to all supported branches.

Author: Kyotaro Horiguchi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20200227.124830.2197604521555566121.horikyota.ntt@gmail.com
2020-03-09 15:32:32 +09:00
Fujii Masao d9249441ef Mark ssl_passphrase_command as GUC_SUPERUSER_ONLY.
This commit changes the GUC ssl_passphrase_command so that
it's examinable by only superuser and a member of pg_read_all_settings.
Per discussion, we determined to do this because the parameter may
contain a sensitive informtaion like a passphrase itself.

Author: Insung Moon
Reviewed-by: Keisuke Kuroda
Discussion: https://postgr.es/m/CAEMmqBuHVGayc+QkYKgx3gWSdqwTAQGw+0DYn3WhcX-eNa2ntA@mail.gmail.com
2020-03-09 11:41:31 +09:00
Tom Lane ea7dace2aa Simplify/speed up assertion cross-check in ginCompressPostingList().
I noticed while testing some other stuff that the CHECK_ENCODING_ROUNDTRIP
logic in ginCompressPostingList could account for over 50% of the runtime
of an INSERT with a GIN index.  While that's not relevant to production
performance, it's still kind of annoying in a debug build.  Replacing
the loop around short memcmp's with one long memcmp works just as well
and is significantly faster, at least on my machine.
2020-03-07 13:31:17 -05:00
Peter Eisentraut 8d7def5c27 Fix typo 2020-03-07 08:51:59 +01:00
Tom Lane a6525588b7 Allow Unicode escapes in any server encoding, not only UTF-8.
SQL includes provisions for numeric Unicode escapes in string
literals and identifiers.  Previously we only accepted those
if they represented ASCII characters or the server encoding
was UTF-8, making the conversion to internal form trivial.
This patch adjusts things so that we'll call the appropriate
encoding conversion function in less-trivial cases, allowing
the escape sequence to be accepted so long as it corresponds
to some character available in the server encoding.

This also applies to processing of Unicode escapes in JSONB.
However, the old restriction still applies to client-side
JSON processing, since that hasn't got access to the server's
encoding conversion infrastructure.

This patch includes some lexer infrastructure that simplifies
throwing errors with error cursors pointing into the middle of
a string (or other complex token).  For the moment I only used
it for errors relating to Unicode escapes, but we might later
expand the usage to some other cases.

Patch by me, reviewed by John Naylor.

Discussion: https://postgr.es/m/2393.1578958316@sss.pgh.pa.us
2020-03-06 14:17:43 -05:00
Tom Lane fe30e7ebfa Allow ALTER TYPE to change some properties of a base type.
Specifically, this patch allows ALTER TYPE to:
* Change the default TOAST strategy for a toastable base type;
* Promote a non-toastable type to toastable;
* Add/remove binary I/O functions for a type;
* Add/remove typmod I/O functions for a type;
* Add/remove a custom ANALYZE statistics functions for a type.

The first of these can be done by the type's owner; all the others
require superuser privilege since misuse could cause problems.

The main motivation for this patch is to allow extensions to
upgrade the feature sets of their data types, so the set of
alterable properties is biased towards that use-case.  However
it's also true that changing some other properties would be
a lot harder, as they get baked into physical storage and/or
stored expressions that depend on the type.

Along the way, refactor GenerateTypeDependencies() to make it easier
to call, refactor DefineType's volatility checks so they can be shared
by AlterType, and teach typcache.c that it might have to reload data
from the type's pg_type row, a scenario it never handled before.
Also rearrange alter_type.sgml a bit for clarity (put the
composite-type operations together).

Tomas Vondra and Tom Lane

Discussion: https://postgr.es/m/20200228004440.b23ein4qvmxnlpht@development
2020-03-06 12:19:29 -05:00
Tom Lane bb03010b9f Remove the "opaque" pseudo-type and associated compatibility hacks.
A long time ago, it was necessary to declare datatype I/O functions,
triggers, and language handler support functions in a very type-unsafe
way involving a single pseudo-type "opaque".  We got rid of those
conventions in 7.3, but there was still support in various places to
automatically convert such functions to the modern declaration style,
to be able to transparently re-load dumps from pre-7.3 servers.
It seems unnecessary to continue to support that anymore, so take out
the hacks; whereupon the "opaque" pseudo-type itself is no longer
needed and can be dropped.

This is part of a group of patches removing various server-side kluges
for transparently upgrading pre-8.0 dump files.  Since we've had few
complaints about dropping pg_dump's support for dumping from pre-8.0
servers (commit 64f3524e2), it seems okay to now remove these kluges.

Discussion: https://postgr.es/m/4110.1583255415@sss.pgh.pa.us
2020-03-05 15:48:56 -05:00
Tom Lane 84eca14bc4 Remove ancient hacks to ignore certain opclass names in CREATE INDEX.
Twenty years ago, we removed certain operator classes in favor of
letting indexes over their data types be built with some other
binary-compatible, more standard opclass.  As a hack to allow existing
index definitions to be dumped and reloaded, we made CREATE INDEX ignore
the removed opclass names, so that such indexes would fall back to the
new default opclass for their data types.  This was never intended to
be a long-lived thing; it carries the obvious risk of breaking some
future developer's attempt to re-use those old opclass names.  Since
all of the cases in question are for opclasses that were removed
before PG 8.0, it seems okay to get rid of these hacks now.

This is part of a group of patches removing various server-side kluges
for transparently upgrading pre-8.0 dump files.  Since we've had few
complaints about dropping pg_dump's support for dumping from pre-8.0
servers (commit 64f3524e2), it seems okay to now remove these kluges.

Discussion: https://postgr.es/m/3685.1583422389@sss.pgh.pa.us
2020-03-05 15:36:06 -05:00
Tom Lane e58a599752 Remove ancient support for upgrading pre-7.3 foreign key constraints.
Before 7.3, foreign key constraints had no explicit catalog
representation, so that what pg_dump produced for them was (usually)
a set of three CREATE CONSTRAINT TRIGGER commands.  Commit a2899ebdc
and some follow-on fixes added an ugly hack in CreateTrigger() to
recognize that pattern and reconstruct the foreign key definition.
However, we've never had any test coverage for that code, so that it's
legitimate to wonder if it still works; and having to maintain it in
the face of upcoming trigger-related patches seems rather pointless.
Let's decree that its time has passed, and drop it.

This is part of a group of patches removing various server-side kluges
for transparently upgrading pre-8.0 dump files.  Since we've had few
complaints about dropping pg_dump's support for dumping from pre-8.0
servers (commit 64f3524e2), it seems okay to now remove these kluges.

Daniel Gustafsson

Discussion: https://postgr.es/m/805874E2-999C-4CDA-856F-1AFBD9DFE933@yesql.se
2020-03-05 15:25:45 -05:00
Alvaro Herrera a77315fdf2
Remove RangeIOData->typiofunc
We used to carry the I/O function OID in RangeIOData, but it's not used
for anything.  Since the struct is not exposed to the world anyway, we
can simplify it a bit.  Also, rename the FmgrInfo member to match
the accompanying 'typioparam' and put them in a more sensible order.

Reviewed by Tom Lane and Paul Jungwirth.

Discussion: https://postgr.es/m/20200304215711.GA8732@alvherre.pgsql
2020-03-05 11:35:02 -03:00
Michael Paquier fbcf087112 Fix more issues with dependency handling at swap phase of REINDEX CONCURRENTLY
When canceling a REINDEX CONCURRENTLY operation after swapping is done,
a drop of the parent table would leave behind old indexes.  This is a
consequence of 68ac9cf, which fixed the case of pg_depend bloat when
repeating REINDEX CONCURRENTLY on the same relation.

In order to take care of the problem without breaking the previous fix,
this uses a different strategy, possible even with the exiting set of
routines to handle dependency changes.  The dependencies of/on the
new index are additionally switched to the old one, allowing an old
invalid index remaining around because of a cancellation or a failure to
use the dependency links of the concurrently-created index.  This
ensures that dropping any objects the old invalid index depends on also
drops the old index automatically.

Reported-by: Julien Rouhaud
Author: Michael Paquier
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/20200227080735.l32fqcauy73lon7o@nol
Backpatch-through: 12
2020-03-05 12:50:15 +09:00
Jeff Davis c954d49046 Extend ExecBuildAggTrans() to support a NULL pointer check.
Optionally push a step to check for a NULL pointer to the pergroup
state.

This will be important for disk-based hash aggregation in combination
with grouping sets. When memory limits are reached, a given tuple may
find its per-group state for some grouping sets but not others. For
the former, it advances the per-group state as normal; for the latter,
it skips evaluation and the calling code will have to spill the tuple
and reprocess it in a later batch.

Add the NULL check as a separate expression step because in some
common cases it's not needed.

Discussion: https://postgr.es/m/20200221202212.ssb2qpmdgrnx52sj%40alap3.anarazel.de
2020-03-04 17:29:18 -08:00
Tom Lane 3ed2005ff5 Introduce macros for typalign and typstorage constants.
Our usual practice for "poor man's enum" catalog columns is to define
macros for the possible values and use those, not literal constants,
in C code.  But for some reason lost in the mists of time, this was
never done for typalign/attalign or typstorage/attstorage.  It's never
too late to make it better though, so let's do that.

The reason I got interested in this right now is the need to duplicate
some uses of the TYPSTORAGE constants in an upcoming ALTER TYPE patch.
But in general, this sort of change aids greppability and readability,
so it's a good idea even without any specific motivation.

I may have missed a few places that could be converted, and it's even
more likely that pending patches will re-introduce some hard-coded
references.  But that's not fatal --- there's no expectation that
we'd actually change any of these values.  We can clean up stragglers
over time.

Discussion: https://postgr.es/m/16457.1583189537@sss.pgh.pa.us
2020-03-04 10:34:25 -05:00
Tom Lane d677550493 Allow to_date/to_timestamp to recognize non-English month/day names.
to_char() has long allowed the TM (translation mode) prefix to
specify output of translated month or day names; but that prefix
had no effect in input format strings.  Now it does.  to_date()
and to_timestamp() will now recognize the same month or day names
that to_char() would output for the same format code.  Matching is
case-insensitive (per the active collation's notion of what that
means), just as it has always been for English month/day names
without the TM prefix.

(As per the discussion thread, there are lots of cases that this
feature will not handle, such as alternate day names.  But being
able to accept what to_char() will output seems useful enough.)

In passing, fix some shaky English and violations of message
style guidelines in jsonpath errors for the .datetime() method,
which depends on this code.

Juan José Santamaría Flecha, reviewed and modified by me,
with other commentary from Alvaro Herrera, Tomas Vondra,
Arthur Zakirov, Peter Eisentraut, Mark Dilger.

Discussion: https://postgr.es/m/CAC+AXB3u1jTngJcoC1nAHBf=M3v-jrEfo86UFtCqCjzbWS9QhA@mail.gmail.com
2020-03-03 11:06:47 -05:00
Peter Geoghegan 1e07f5e0a1 Remove overzealous _bt_split() assertions.
_bt_split() is passed NULL as its insertion scankey for internal page
splits.  Two recently added Assert() statements failed to consider this,
leading to a crash with pg_upgrade'd BREE_VERSION < 4 indexes.  Remove
the assertions.

The assertions in question were added by commit 0d861bbb, which added
nbtree deduplication.  It would be possible to fix the assertions
directly instead, but they weren't adding much anyway.
2020-03-02 21:40:11 -08:00
Michael Paquier 0b48f1335d Fix assertion failure with ALTER TABLE ATTACH PARTITION and indexes
Using ALTER TABLE ATTACH PARTITION causes an assertion failure when
attempting to work on a partitioned index, because partitioned indexes
cannot have partition bounds.

The grammar of ALTER TABLE ATTACH PARTITION requires partition bounds,
but not ALTER INDEX, so mixing ALTER TABLE with partitioned indexes is
confusing.  Hence, on HEAD, prevent ALTER TABLE to attach a partition if
the relation involved is a partitioned index.  On back-branches, as
applications may rely on the existing behavior, just remove the
culprit assertion.

Reported-by: Alexander Lakhin
Author: Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/16276-5cd1dcc8fb8be7b5@postgresql.org
Backpatch-through: 11
2020-03-03 13:55:41 +09:00
Fujii Masao e65497df8f Report progress of streaming base backup.
This commit adds pg_stat_progress_basebackup view that reports
the progress while an application like pg_basebackup is taking
a base backup. This uses the progress reporting infrastructure
added by c16dc1aca5, adding support for streaming base backup.

Bump catversion.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Amit Langote, Sergei Kornilov
Discussion: https://postgr.es/m/9ed8b801-8215-1f3d-62d7-65bff53f6e94@oss.nttdata.com
2020-03-03 12:03:43 +09:00
Michael Paquier d79fb88ac7 Preserve pg_index.indisclustered across REINDEX CONCURRENTLY
If the flag value is lost, a CLUSTER query following REINDEX
CONCURRENTLY could fail.  Non-concurrent REINDEX is already handling
this case consistently.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20200229024202.GH29456@telsasoft.com
Backpatch-through: 12
2020-03-03 10:12:28 +09:00
Alvaro Herrera 2f9661311b
Represent command completion tags as structs
The backend was using strings to represent command tags and doing string
comparisons in multiple places, but that's slow and unhelpful.  Create a
new command list with a supporting structure to use instead; this is
stored in a tag-list-file that can be tailored to specific purposes with
a caller-definable C macro, similar to what we do for WAL resource
managers.  The first first such uses are a new CommandTag enum and a
CommandTagBehavior struct.

Replace numerous occurrences of char *completionTag with a
QueryCompletion struct so that the code no longer stores information
about completed queries in a cstring.  Only at the last moment, in
EndCommand(), does this get converted to a string.

EventTriggerCacheItem no longer holds an array of palloc’d tag strings
in sorted order, but rather just a Bitmapset over the CommandTags.

Author: Mark Dilger, with unsolicited help from Álvaro Herrera
Reviewed-by: John Naylor, Tom Lane
Discussion: https://postgr.es/m/981A9DB4-3F0C-4DA5-88AD-CB9CFF4D6CAD@enterprisedb.com
2020-03-02 18:19:51 -03:00
Peter Geoghegan 77b88bd5dc Add assertions to _bt_update_posting().
Copy some assertions from _bt_form_posting() to its sibling function,
_bt_update_posting().

Discussion: https://postgr.es/m/CAH2-WzkPR8KMwkL0ap976kmXwBCeukTeHz6fB-U__wvuP1S9Zg@mail.gmail.com
2020-03-02 08:07:16 -08:00
Michael Paquier 12c5cad76f Handle logical decoding in multi-insert for catalog tuples
The code path for multi-insert decoding is not stressed yet for
catalogs (a future patch may introduce this capability), so no
back-patch is needed.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/9690D72F-5C4F-4016-9572-6D16684E1D87@yesql.se
2020-03-02 10:00:37 +09:00
Peter Geoghegan 84ec9b231a Remove dead code from _bt_update_posting().
Discussion: https://postgr.es/m/CAH2-WzmAufHiOku6AGiFD=81VQs5nYJ1L2YkhW1t+BH4CMsgRw@mail.gmail.com
2020-03-01 12:11:26 -08:00
Dean Rasheed 43a899f41f Fix corner-case loss of precision in numeric ln().
When deciding on the local rscale to use for the Taylor series
expansion, ln_var() neglected to account for the fact that the result
is subsequently multiplied by a factor of 2^(nsqrt+1), where nsqrt is
the number of square root operations performed in the range reduction
step, which can be as high as 22 for very large inputs. This could
result in a loss of precision, particularly when combined with large
rscale values, for which a large number of Taylor series terms is
required (up to around 400).

Fix by computing a few extra digits in the Taylor series, based on the
weight of the multiplicative factor log10(2^(nsqrt+1)). It remains to
be proven whether or not the other 8 extra digits used for the Taylor
series is appropriate, but this at least deals with the obvious
oversight of failing to account for the effects of the final
multiplication.

Per report from Justin AnyhowStep. Reviewed by Tom Lane.

Discussion: https://postgr.es/m/16280-279f299d9c06e56f@postgresql.org
2020-03-01 14:49:25 +00:00
Tom Lane 58c47ccfff Correctly re-use hash tables in buildSubPlanHash().
Commit 356687bd8 omitted to remove leftover code for destroying
a hashed subplan's hash tables, with the result that the tables
were always rebuilt not reused; this leads to severe memory
leakage if a hashed subplan is re-executed enough times.
Moreover, the code for reusing the hashnulls table had a typo
that would have made it do the wrong thing if it were reached.

Looking at the code coverage report shows severe under-coverage
of the potential callers of ResetTupleHashTable, so add some test
cases that exercise them.

Andreas Karlsson and Tom Lane, per reports from Ranier Vilela
and Justin Pryzby.

Backpatch to v11, as the faulty commit was.

Discussion: https://postgr.es/m/edb62547-c453-c35b-3ed6-a069e4d6b937@proxel.se
Discussion: https://postgr.es/m/CAEudQAo=DCebm1RXtig9OH+QivpS97sMkikt0A9qHmMUs+g6ZA@mail.gmail.com
Discussion: https://postgr.es/m/20200210032547.GA1412@telsasoft.com
2020-02-29 13:48:09 -05:00
Tom Lane 6afc8aefd3 Remove obsolete comment.
Noted while studying subplan hash issue.
2020-02-29 13:23:12 -05:00
Tom Lane 80d76be51c Avoid failure if autovacuum tries to access a just-dropped temp namespace.
Such an access became possible when commit 246a6c8f7 added more
aggressive cleanup of orphaned temp relations by autovacuum.
Since autovacuum's snapshot might be slightly stale, it could
attempt to access an already-dropped temp namespace, resulting in
an assertion failure or null-pointer dereference.  (In practice,
since we don't drop temp namespaces automatically but merely
recycle them, this situation could only arise if a superuser does
a manual drop of a temp namespace.  Still, that should be allowed.)

The core of the bug, IMO, is that isTempNamespaceInUse and its callers
failed to think hard about whether to treat "temp namespace isn't there"
differently from "temp namespace isn't in use".  In hopes of forestalling
future mistakes of the same ilk, replace that function with a new one
checkTempNamespaceStatus, which makes the same tests but returns a
three-way enum rather than just a bool.  isTempNamespaceInUse is gone
entirely in HEAD; but just in case some external code is relying on it,
keep it in the back branches, as a bug-compatible wrapper around the
new function.

Per report originally from Prabhat Kumar Sahu, investigated by Mahendra
Singh and Michael Paquier; the final form of the patch is my fault.
This replaces the failed fix attempt in a052f6cbb.

Backpatch as far as v11, as 246a6c8f7 was.

Discussion: https://postgr.es/m/CAKYtNAr9Zq=1-ww4etHo-VCC-k120YxZy5OS01VkaLPaDbv2tg@mail.gmail.com
2020-02-28 20:28:34 -05:00
Jeff Davis 32bb4535a0 Fix commit c11cb17d.
I neglected to update copyfuncs/outfuncs/readfuncs.

Discussion: https://postgr.es/m/12491.1582833409%40sss.pgh.pa.us
2020-02-28 09:35:11 -08:00
Alvaro Herrera db989184cd
Add comments on avoid reuse of parse-time snapshot
Apparently, reusing the parse-time query snapshot for later steps
(execution) is a frequently considered optimization ... but it doesn't
work, for reasons discovered in thread [1].  Adding some comments about
why it doesn't really work can relieve some future hackers from wasting
time reimplementing it again.

[1] https://postgr.es/m/flat/5075D8DF.6050500@fuzzy.cz

Author: Michail Nikolaev
Discussion: https://postgr.es/m/CANtu0ogp6cTvMJObXP8n=k+JtqxY1iT9UV5MbGCpjjPa5crCiw@mail.gmail.com
2020-02-28 13:25:26 -03:00
Peter Eisentraut 1933ae629e Add PostgreSQL home page to --help output
Per emerging standard in GNU programs and elsewhere.  Autoconf already
has support for specifying a home page, so we can just that.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
2020-02-28 13:12:21 +01:00
Peter Eisentraut 864934131e Refer to bug report address by symbol rather than hardcoding
Use the PACKAGE_BUGREPORT macro that is created by Autoconf for
referring to the bug reporting address rather than hardcoding it
everywhere.  This makes it easier to change the address and it reduces
translation work.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
2020-02-28 13:12:21 +01:00
Jeff Davis c11cb17dc5 Save calculated transitionSpace in Agg node.
This will be useful in the upcoming Hash Aggregation work to improve
estimates for hash table sizing.

Discussion: https://postgr.es/m/37091115219dd522fd9ed67333ee8ed1b7e09443.camel%40j-davis.com
2020-02-27 11:20:56 -08:00
Alvaro Herrera b9b408c487
Record parents of triggers
This let us get rid of a recently introduced ugly hack (commit
1fa846f1c9).

Author: Álvaro Herrera
Reviewed-by: Amit Langote, Tom Lane
Discussion: https://postgr.es/m/20200217215641.GA29784@alvherre.pgsql
2020-02-27 13:23:33 -03:00
Robert Haas 05d8449e73 Move src/backend/utils/hash/hashfn.c to src/common
This also involves renaming src/include/utils/hashutils.h, which
becomes src/include/common/hashfn.h. Perhaps an argument can be
made for keeping the hashutils.h name, but it seemed more
consistent to make it match the name of the file, and also more
descriptive of what is actually going on here.

Patch by me, reviewed by Suraj Kharage and Mark Dilger. Off-list
advice on how not to break the Windows build from Davinder Singh
and Amit Kapila.

Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-27 09:25:41 +05:30
Peter Geoghegan 2c0797da2c Silence another compiler warning in nbtinsert.c.
Per complaint from Álvaro Herrera.
2020-02-26 15:15:45 -08:00
Tom Lane a477bfc1df Suppress unnecessary RelabelType nodes in more cases.
eval_const_expressions sometimes produced RelabelType nodes that
were useless because they just relabeled an expression to the same
exposed type it already had.  This is worth avoiding because it can
cause two equivalent expressions to not be equal(), preventing
recognition of useful optimizations.  In the test case added here,
an unpatched planner fails to notice that the "sqli = constant" clause
renders a sort step unnecessary, because one code path produces an
extra RelabelType and another doesn't.

Fix by ensuring that eval_const_expressions_mutator's T_RelabelType
case will not add in an unnecessary RelabelType.  Also save some
code by sharing a subroutine with the effectively-equivalent cases
for CollateExpr and CoerceToDomain.  (CollateExpr had no bug, and
I think that the case couldn't arise with CoerceToDomain, but
it seems prudent to do the same check for all three cases.)

Back-patch to v12.  In principle this has been wrong all along,
but I haven't seen a case where it causes visible misbehavior
before v12, so refrain from changing stable branches unnecessarily.

Per investigation of a report from Eric Gillum.

Discussion: https://postgr.es/m/CAMmjdmvAZsUEskHYj=KT9sTukVVCiCSoe_PBKOXsncFeAUDPCQ@mail.gmail.com
2020-02-26 18:14:12 -05:00
Peter Geoghegan 2d8a6fad18 Silence compiler warning in nbtinsert.c.
Per buildfarm member longfin.
2020-02-26 13:17:36 -08:00
Peter Geoghegan 0d861bbb70 Add deduplication to nbtree.
Deduplication reduces the storage overhead of duplicates in indexes that
use the standard nbtree index access method.  The deduplication process
is applied lazily, after the point where opportunistic deletion of
LP_DEAD-marked index tuples occurs.  Deduplication is only applied at
the point where a leaf page split would otherwise be required.  New
posting list tuples are formed by merging together existing duplicate
tuples.  The physical representation of the items on an nbtree leaf page
is made more space efficient by deduplication, but the logical contents
of the page are not changed.  Even unique indexes make use of
deduplication as a way of controlling bloat from duplicates whose TIDs
point to different versions of the same logical table row.

The lazy approach taken by nbtree has significant advantages over a GIN
style eager approach.  Most individual inserts of index tuples have
exactly the same overhead as before.  The extra overhead of
deduplication is amortized across insertions, just like the overhead of
page splits.  The key space of indexes works in the same way as it has
since commit dd299df8 (the commit that made heap TID a tiebreaker
column).

Testing has shown that nbtree deduplication can generally make indexes
with about 10 or 15 tuples for each distinct key value about 2.5X - 4X
smaller, even with single column integer indexes (e.g., an index on a
referencing column that accompanies a foreign key).  The final size of
single column nbtree indexes comes close to the final size of a similar
contrib/btree_gin index, at least in cases where GIN's posting list
compression isn't very effective.  This can significantly improve
transaction throughput, and significantly reduce the cost of vacuuming
indexes.

A new index storage parameter (deduplicate_items) controls the use of
deduplication.  The default setting is 'on', so all new B-Tree indexes
automatically use deduplication where possible.  This decision will be
reviewed at the end of the Postgres 13 beta period.

There is a regression of approximately 2% of transaction throughput with
synthetic workloads that consist of append-only inserts into a table
with several non-unique indexes, where all indexes have few or no
repeated values.  The underlying issue is that cycles are wasted on
unsuccessful attempts at deduplicating items in non-unique indexes.
There doesn't seem to be a way around it short of disabling
deduplication entirely.  Note that deduplication of items in unique
indexes is fairly well targeted in general, which avoids the problem
there (we can use a special heuristic to trigger deduplication passes in
unique indexes, since we're specifically targeting "version bloat").

Bump XLOG_PAGE_MAGIC because xl_btree_vacuum changed.

No bump in BTREE_VERSION, since the representation of posting list
tuples works in a way that's backwards compatible with version 4 indexes
(i.e. indexes built on PostgreSQL 12).  However, users must still
REINDEX a pg_upgrade'd index to use deduplication, regardless of the
Postgres version they've upgraded from.  This is the only way to set the
new nbtree metapage flag indicating that deduplication is generally
safe.

Author: Anastasia Lubennikova, Peter Geoghegan
Reviewed-By: Peter Geoghegan, Heikki Linnakangas
Discussion:
    https://postgr.es/m/55E4051B.7020209@postgrespro.ru
    https://postgr.es/m/4ab6e2db-bcee-f4cf-0916-3a06e6ccbb55@postgrespro.ru
2020-02-26 13:05:30 -08:00
Peter Geoghegan 612a1ab767 Add equalimage B-Tree support functions.
Invent the concept of a B-Tree equalimage ("equality implies image
equality") support function, registered as support function 4.  This
indicates whether it is safe (or not safe) to apply optimizations that
assume that any two datums considered equal by an operator class's order
method must be interchangeable without any loss of semantic information.
This is static information about an operator class and a collation.

Register an equalimage routine for almost all of the existing B-Tree
opclasses.  We only need two trivial routines for all of the opclasses
that are included with the core distribution.  There is one routine for
opclasses that index non-collatable types (which returns 'true'
unconditionally), plus another routine for collatable types (which
returns 'true' when the collation is a deterministic collation).

This patch is infrastructure for an upcoming patch that adds B-Tree
deduplication.

Author: Peter Geoghegan, Anastasia Lubennikova
Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
2020-02-26 11:28:25 -08:00
Andres Freund 2742c45080 expression eval: Reduce number of steps for agg transition invocations.
Do so by combining the various steps that are part of aggregate
transition function invocation into one larger step. As some of the
current steps are only necessary for some aggregates, have one variant
of the aggregate transition step for each possible combination.

To avoid further manual copies of code in the different transition
step implementations, move most of the code into helper functions
marked as "always inline".

The benefit of this change is an increase in performance when
aggregating lots of rows. This comes in part due to the reduced number
of indirect jumps due to the reduced number of steps, and in part by
reducing redundant setup code across steps. This mainly benefits
interpreted execution, but the code generated by JIT is also improved
a bit.

As a nice side-effect it also ends up making the code a bit simpler.

A small additional optimization is removing the need to set
aggstate->curaggcontext before calling ExecAggInitGroup, choosing to
instead passign curaggcontext as an argument. It was, in contrast to
other aggregate related functions, only needed to fetch a memory
context to copy the transition value into.

Author: Andres Freund
Discussion:
   https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
   https://postgr.es/m/5c371df7cee903e8cd4c685f90c6c72086d3a2dc.camel@j-davis.com
2020-02-24 15:09:09 -08:00
Michael Paquier 7d672b76bf Issue properly WAL record for CID of first catalog tuple in multi-insert
Multi-insert for heap is not yet used actively for catalogs, but the
code to support this case is in place for logical decoding.  The
existing code forgot to issue a XLOG_HEAP2_NEW_CID record for the first
tuple inserted, leading to failures when attempting to use multiple
inserts for catalogs at decoding time.  This commit fixes the problem by
WAL-logging the needed CID.

This is not an active bug, so no back-patch is done.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/E0D4CC67-A1CF-4DF4-991D-B3AC2EB5FAE9@yesql.se
2020-02-25 07:55:22 +09:00
Tom Lane 3d475515a1 Account explicitly for long-lived FDs that are allocated outside fd.c.
The comments in fd.c have long claimed that all file allocations should
go through that module, but in reality that's not always practical.
fd.c doesn't supply APIs for invoking some FD-producing syscalls like
pipe() or epoll_create(); and the APIs it does supply for non-virtual
FDs are mostly insistent on releasing those FDs at transaction end;
and in some cases the actual open() call is in code that can't be made
to use fd.c, such as libpq.

This has led to a situation where, in a modern server, there are likely
to be seven or so long-lived FDs per backend process that are not known
to fd.c.  Since NUM_RESERVED_FDS is only 10, that meant we had *very*
few spare FDs if max_files_per_process is >= the system ulimit and
fd.c had opened all the files it thought it safely could.  The
contrib/postgres_fdw regression test, in particular, could easily be
made to fall over by running it under a restrictive ulimit.

To improve matters, invent functions Acquire/Reserve/ReleaseExternalFD
that allow outside callers to tell fd.c that they have or want to allocate
a FD that's not directly managed by fd.c.  Add calls to track all the
fixed FDs in a standard backend session, so that we are honestly
guaranteeing that NUM_RESERVED_FDS FDs remain unused below the EMFILE
limit in a backend's idle state.  The coding rules for these functions say
that there's no need to call them in code that just allocates one FD over
a fairly short interval; we can dip into NUM_RESERVED_FDS for such cases.
That means that there aren't all that many places where we need to worry.
But postgres_fdw and dblink must use this facility to account for
long-lived FDs consumed by libpq connections.  There may be other places
where it's worth doing such accounting, too, but this seems like enough
to solve the immediate problem.

Internally to fd.c, "external" FDs are limited to max_safe_fds/3 FDs.
(Callers can choose to ignore this limit, but of course it's unwise
to do so except for fixed file allocations.)  I also reduced the limit
on "allocated" files to max_safe_fds/3 FDs (it had been max_safe_fds/2).
Conceivably a smarter rule could be used here --- but in practice,
on reasonable systems, max_safe_fds should be large enough that this
isn't much of an issue, so KISS for now.  To avoid possible regression
in the number of external or allocated files that can be opened,
increase FD_MINFREE and the lower limit on max_files_per_process a
little bit; we now insist that the effective "ulimit -n" be at least 64.

This seems like pretty clearly a bug fix, but in view of the lack of
field complaints, I'll refrain from risking a back-patch.

Discussion: https://postgr.es/m/E1izCmM-0005pV-Co@gemulon.postgresql.org
2020-02-24 17:28:33 -05:00
Robert Haas a91e2fa941 Adapt hashfn.c and hashutils.h for frontend use.
hash_any() and its various variants are defined to return Datum,
which is a backend-only concept, but the underlying functions
actually want to return uint32 and uint64, and only return Datum
because it's convenient for callers who are using them to
implement a hash function for some SQL datatype.

However, changing these functions to return uint32 and uint64
seems like it might lead to programming errors or back-patching
difficulties, both because they are widely used and because
failure to use UInt{32,64}GetDatum() might not provoke a
compilation error. Instead, rename the existing functions as
well as changing the return type, and add static inline wrappers
for those callers that need the previous behavior.

Although this commit adapts hashutils.h and hashfn.c so that they
can be compiled as frontend code, it does not actually do
anything that would cause them to be so compiled. That is left
for another commit.

Patch by me, reviewed by Suraj Kharage and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-24 17:27:15 +05:30
Robert Haas 9341c783cc Put all the prototypes for hashfn.c into the same header file.
Previously, some of the prototypes for functions in hashfn.c were
in utils/hashutils.h and others were in utils/hsearch.h, but that
is confusing and has no particular benefit.

Patch by me, reviewed by Suraj Kharage and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-24 17:22:45 +05:30
Robert Haas 07b95c3d83 Move bitmap_hash and bitmap_match to bitmapset.c.
The closely-related function bms_hash_value is already defined in that
file, and this change means that hashfn.c no longer needs to depend on
nodes/bitmapset.h. That gets us closer to allowing use of the hash
functions in hashfn.c in frontend code.

Patch by me, reviewed by Suraj Kharage and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-24 17:17:43 +05:30
Michael Paquier bf883b211e Add prefix checks in exclude lists for pg_rewind, pg_checksums and base backups
An instance of PostgreSQL crashing with a bad timing could leave behind
temporary pg_internal.init files, potentially causing failures when
verifying checksums.  As the same exclusion lists are used between
pg_rewind, pg_checksums and basebackup.c, all those tools are extended
with prefix checks to keep everything in sync, with dedicated checks
added for pg_internal.init.

Backpatch down to 11, where pg_checksums (pg_verify_checksums in 11) and
checksum verification for base backups have been introduced.

Reported-by: Michael Banck
Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/62031974fd8e941dd8351fbc8c7eff60d59c5338.camel@credativ.de
Backpatch-through: 11
2020-02-24 18:13:25 +09:00
Peter Eisentraut 79c2385915 Factor out InitControlFile() from BootStrapXLOG()
Right now this only makes BootStrapXLOG() a bit more manageable, but
in the future there may be external callers.

Discussion: https://www.postgresql.org/message-id/e8f86ba5-48f1-a80a-7f1d-b76bcb9c5c47@2ndquadrant.com
2020-02-22 12:09:27 +01:00
Peter Eisentraut 9745f93afc Reformat code comment
Discussion: https://www.postgresql.org/message-id/e8f86ba5-48f1-a80a-7f1d-b76bcb9c5c47@2ndquadrant.com
2020-02-22 12:09:27 +01:00
Tom Lane 97cf1fa4ed Assume that we have <wchar.h>.
Windows has this, and so do all other live platforms according to the
buildfarm; it's been required by POSIX since SUSv2.  So remove the
configure probe and tests of HAVE_WCHAR_H.

This is part of a series of commits to get rid of no-longer-relevant
configure checks and dead src/port/ code.  I'm committing them separately
to make it easier to back out individual changes if they prove less
portable than I expect.

Discussion: https://postgr.es/m/15379.1582221614@sss.pgh.pa.us
2020-02-21 14:30:47 -05:00
Tom Lane 481c8e9232 Assume that we have utime() and <utime.h>.
These are required by POSIX since SUSv2, and no live platforms fail
to provide them.  On Windows, utime() exists and we bring our own
<utime.h>, so we're good there too.  So remove the configure probes
and ad-hoc substitute code.  We don't need to check for utimes()
anymore either, since that was only used as a substitute.

In passing, make the Windows build include <sys/utime.h> only where
we need it, not everywhere.

This is part of a series of commits to get rid of no-longer-relevant
configure checks and dead src/port/ code.  I'm committing them separately
to make it easier to back out individual changes if they prove less
portable than I expect.

Discussion: https://postgr.es/m/15379.1582221614@sss.pgh.pa.us
2020-02-21 14:30:47 -05:00
Tom Lane abe41f453a Assume that we have cbrt().
Windows has this, and so do all other live platforms according to the
buildfarm, so remove the configure probe and float.c's substitute code.

This is part of a series of commits to get rid of no-longer-relevant
configure checks and dead src/port/ code.  I'm committing them separately
to make it easier to back out individual changes if they prove less
portable than I expect.

Discussion: https://postgr.es/m/15379.1582221614@sss.pgh.pa.us
2020-02-21 14:30:47 -05:00
Jeff Davis b7fabe80df Fixup for nodeAgg.c refactor.
Commit 5b618e1f made an unintended behavior change.
2020-02-21 09:34:20 -08:00
Etsuro Fujita 032f9ae012 Avoid redundant checks in partition_bounds_copy().
Previously, partition_bounds_copy() checked whether the strategy for the
given partition bounds was hash or not, and then determined the number of
elements in the datums in the datums array for the partition bounds, on
each iteration of the loop for copying the datums array, but there is no
need to do that.  Perform the checks only once before the loop iteration.

Author: Etsuro Fujita
Reported-by: Amit Langote and Julien Rouhaud
Discussion: https://postgr.es/m/CAPmGK14Rvxrm8DHWvCjdoks6nwZuHBPvMnWZ6rkEx2KhFeEoPQ@mail.gmail.com
2020-02-21 20:00:45 +09:00
Etsuro Fujita 53b01acd46 Remove extra word from comment. 2020-02-20 19:15:00 +09:00
Tom Lane 70a7732007 Remove support for upgrading extensions from "unpackaged" state.
Andres Freund pointed out that allowing non-superusers to run
"CREATE EXTENSION ... FROM unpackaged" has security risks, since
the unpackaged-to-1.0 scripts don't try to verify that the existing
objects they're modifying are what they expect.  Just attaching such
objects to an extension doesn't seem too dangerous, but some of them
do more than that.

We could have resolved this, perhaps, by still requiring superuser
privilege to use the FROM option.  However, it's fair to ask just what
we're accomplishing by continuing to lug the unpackaged-to-1.0 scripts
forward.  None of them have received any real testing since 9.1 days,
so they may not even work anymore (even assuming that one could still
load the previous "loose" object definitions into a v13 database).
And an installation that's trying to go from pre-9.1 to v13 or later
in one jump is going to have worse compatibility problems than whether
there's a trivial way to convert their contrib modules into extension
style.

Hence, let's just drop both those scripts and the core-code support
for "CREATE EXTENSION ... FROM".

Discussion: https://postgr.es/m/20200213233015.r6rnubcvl4egdh5r@alap3.anarazel.de
2020-02-19 16:59:14 -05:00
Jeff Davis 5b618e1f48 Minor refactor of nodeAgg.c.
* Separate calculation of hash value from the lookup.
  * Split build_hash_table() into two functions.
  * Change lookup_hash_entry() to return AggStatePerGroup. That's all
    the caller needed, anyway.

These changes are to support the upcoming Disk-based Hash Aggregation
work.

Discussion: https://postgr.es/m/31f5ab871a3ad5a1a91a7a797651f20e77ac7ce3.camel%40j-davis.com
2020-02-19 10:39:11 -08:00