Commit Graph

20553 Commits

Author SHA1 Message Date
Tom Lane fbc7a71608 Rearrange validity checks for plpgsql "simple" expressions.
Buildfarm experience shows what probably should've occurred to me before:
if a cache flush occurs partway through building a generic plan, then
the plansource may have is_valid = false even though the plan is valid.
We need to accept this case, use the generated plan, and then try to
replan the next time.  We can't try to replan immediately, because that
would produce an infinite loop in CLOBBER_CACHE_ALWAYS builds; moreover
it's really overkill.  (We can assume that the plan is valid, it's just
possibly a bit stale.  Note that the pre-existing code behaved this way,
and the non-simple-expression code paths do too.)  Conversely, not using
the generated plan would drop us into the not-a-simple-expression code
path, which is bad for performance and would also cause regression-test
failures due to visibly different error-reporting behavior.

Hence, refactor the validity-check functions so that the initial check
and recheck cases can react differently to plansource->is_valid.
This makes their usage a bit simpler, too.

Discussion: https://postgr.es/m/7072.1585332104@sss.pgh.pa.us
2020-03-27 14:47:34 -04:00
Peter Eisentraut 8d1b9648c5 Update SQL features
Change F311 to supported.  This was already accomplished when
subfeature F311-04 (WITH CHECK OPTION) was added, but the top-level
feature wasn't updated at the time.
2020-03-27 08:36:08 +01:00
Tom Lane 8f59f6b9c0 Improve performance of "simple expressions" in PL/pgSQL.
For relatively simple expressions (say, "x + 1" or "x > 0"), plpgsql's
management overhead exceeds the cost of evaluating the expression.
This patch substantially improves that situation, providing roughly
2X speedup for such trivial expressions.

First, add infrastructure in the plancache to allow fast re-validation
of cached plans that contain no table access, and hence need no locks.
Teach plpgsql to use this infrastructure for expressions that it's
already deemed "simple" (which in particular will never contain table
references).

The fast path still requires checking that search_path hasn't changed,
so provide a fast path for OverrideSearchPathMatchesCurrent by
counting changes that have occurred to the active search path in the
current session.  This is simplistic but seems enough for now, seeing
that PushOverrideSearchPath is not used in any performance-critical
cases.

Second, manage the refcounts on simple expressions' cached plans using
a transaction-lifespan resource owner, so that we only need to take
and release an expression's refcount once per transaction not once per
expression evaluation.  The management of this resource owner exactly
parallels the existing management of plpgsql's simple-expression EState.

Add some regression tests covering this area, in particular verifying
that expression caching doesn't break semantics for search_path changes.

Patch by me, but it owes something to previous work by Amit Langote,
who recognized that getting rid of plancache-related overhead would
be a useful thing to do here.  Also thanks to Andres Freund for review.

Discussion: https://postgr.es/m/CAFj8pRDRVfLdAxsWeVLzCAbkLFZhW549K+67tpOc-faC8uH8zw@mail.gmail.com
2020-03-26 18:58:57 -04:00
Magnus Hagander eff5b245df Document that pg_checksums exists in checksums README
Author: Daniel Gustafsson <daniel@yesql.se>
2020-03-26 15:05:54 +01:00
Peter Eisentraut 49bf81536e Drop slot's LWLock before returning from SaveSlotToPath()
When SaveSlotToPath() is called with elevel=LOG, the early exits didn't
release the slot's io_in_progress_lock.

This could result in a walsender being stuck on the lock forever.  A
possible way to get into this situation is if the offending code paths
are triggered in a low disk space situation.

Author: Pavan Deolasee <pavan.deolasee@2ndquadrant.com>
Reported-by: Craig Ringer <craig@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/56a138c5-de61-f553-7e8f-6789296de785%402ndquadrant.com
2020-03-26 13:29:20 +01:00
Andrew Dunstan 896fcdb230 Provide a TLS init hook
The default hook function sets the default password callback function.
In order to allow preloaded libraries to have an opportunity to override
the default, TLS initialization if now delayed slightly until after
shared preloaded libraries have been loaded.

A test module is provided which contains a trivial example that decodes
an obfuscated password for an SSL certificate.

Author: Andrew Dunstan
Reviewed By: Andreas Karlsson, Asaba Takanori
Discussion: https://postgr.es/m/04116472-818b-5859-1d74-3d995aab2252@2ndQuadrant.com
2020-03-25 17:13:17 -04:00
Tom Lane bda6dedbea Go back to returning int from ereport auxiliary functions.
This reverts the parts of commit 17a28b0364
that changed ereport's auxiliary functions from returning dummy integer
values to returning void.  It turns out that a minority of compilers
complain (not entirely unreasonably) about constructs such as

	(condition) ? errdetail(...) : 0

if errdetail() returns void rather than int.  We could update those
call sites to say "(void) 0" perhaps, but the expectation for this
patch set was that ereport callers would not have to change anything.
And this aspect of the patch set was already the most invasive and
least compelling part of it, so let's just drop it.

Per buildfarm.

Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
2020-03-25 11:57:36 -04:00
Peter Eisentraut e8b1774fc2 Update SQL features
The name of E182 was changed in SQL:2011.

Also, we can change it to supported because all it requires is one
embedded language to be supported, which we do.
2020-03-25 08:46:41 +01:00
Thomas Munro 352f6f2df6 Add collation versions for Windows.
On Vista and later, use GetNLSVersionEx() to request collation version
information.

Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJvqup3s%2BJowVTcacZADO6dOhfdBmvOPHLS3KXUJu41Jw%40mail.gmail.com
2020-03-25 16:04:32 +13:00
Thomas Munro 382a821907 Allow NULL version for individual collations.
Remove the documented restriction that collation providers must either
return NULL for all collations or non-NULL for all collations.

Use NULL for glibc collations like "C.UTF-8", which might otherwise lead
future proposed commits to force unnecessary index rebuilds.

Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://postgr.es/m/CA%2BhUKGJvqup3s%2BJowVTcacZADO6dOhfdBmvOPHLS3KXUJu41Jw%40mail.gmail.com
2020-03-25 15:53:24 +13:00
Jeff Davis dd8e19132a Consider disk-based hash aggregation to implement DISTINCT.
Correct oversight in 1f39bce0. If enable_hashagg_disk=true, we should
consider hash aggregation for DISTINCT when applicable.
2020-03-24 18:30:04 -07:00
Jeff Davis 3649133b14 Avoid allocating unnecessary zero-sized array.
If there are no aggregates, there is no need to allocate an array of
zero AggStatePerGroupData elements.
2020-03-24 18:30:04 -07:00
Peter Geoghegan b150a76793 Fix nbtree deduplication README commentary.
Descriptions of some aspects of how deduplication works were unclear in
a couple of places.
2020-03-24 14:58:27 -07:00
Andres Freund 112b006fe7 logical decoding: Remove TODO about unnecessary optimization.
Measurements show, and intuition agrees, that there's currently no
known cases where adding a fastpath to avoid allocating / ordering a
heap for a single transaction is worthwhile.

Author: Dilip Kumar
Discussion: https://postgr.es/m/CAFiTN-sp701wvzvnLQJGk7JDqrFM8f--97-ihbwkU8qvn=p8nw@mail.gmail.com
2020-03-24 12:15:03 -07:00
Peter Eisentraut f15ace7935 Fix compiler warning on Cygwin
bf68b79e50 introduced an unused variable
compiler warning on Cygwin.
2020-03-24 19:31:02 +01:00
Tom Lane 17a28b0364 Improve the internal implementation of ereport().
Change all the auxiliary error-reporting routines to return void,
now that we no longer need to pretend they are passing something
useful to errfinish().  While this probably doesn't save anything
significant at the machine-code level, it allows detection of some
additional types of mistakes.

Pass the error location details (__FILE__, __LINE__, PG_FUNCNAME_MACRO)
to errfinish not errstart.  This shaves a few cycles off the case where
errstart decides we're not going to emit anything.

Re-implement elog() as a trivial wrapper around ereport(), removing
the separate support infrastructure it used to have.  Aside from
getting rid of some now-surplus code, this means that elog() now
really does have exactly the same semantics as ereport(), in particular
that it can skip evaluation work if the message is not to be emitted.

Andres Freund and Tom Lane

Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
2020-03-24 12:08:48 -04:00
Tom Lane e3a87b4991 Re-implement the ereport() macro using __VA_ARGS__.
Now that we require C99, we can depend on __VA_ARGS__ to work, and
revising ereport() to use it has several significant benefits:

* The extra parentheses around the auxiliary function calls are now
optional.  Aside from being a bit less ugly, this removes a common
gotcha for new contributors, because in some cases the compiler errors
you got from forgetting them were unintelligible.

* The auxiliary function calls are now evaluated as a comma expression
list rather than as extra arguments to errfinish().  This means that
compilers can be expected to warn about no-op expressions in the list,
allowing detection of several other common mistakes such as forgetting
to add errmsg(...) when converting an elog() call to ereport().

* Unlike the situation with extra function arguments, comma expressions
are guaranteed to be evaluated left-to-right, so this removes platform
dependency in the order of the auxiliary function calls.  While that
dependency hasn't caused us big problems in the past, this change does
allow dropping some rather shaky assumptions around errcontext() domain
handling.

There's no intention to make wholesale changes of existing ereport
calls, but as proof-of-concept this patch removes the extra parens
from a couple of calls in postgres.c.

While new code can be written either way, code intended to be
back-patched will need to use extra parens for awhile yet.  It seems
worth back-patching this change into v12, so as to reduce the window
where we have to be careful about that by one year.  Hence, this patch
is careful to preserve ABI compatibility; a followup HEAD-only patch
will make some additional simplifications.

Andres Freund and Tom Lane

Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
2020-03-24 11:49:00 -04:00
Peter Eisentraut cef27ae01a Fix compiler warning
A variable was unused in non-assert builds.  Simplify the code to
avoid the issue.

Reported-by: Erik Rijkers <er@xs4all.nl>
2020-03-24 16:02:01 +01:00
Peter Eisentraut 97ee604d9b Some refactoring of logical/worker.c
This moves the main operations of apply_handle_{insert|update|delete},
that of inserting, updating, deleting a tuple into/from a given
relation, into corresponding
apply_handle_{insert|update|delete}_internal functions.  This allows
performing those operations on relations that are not directly the
targets of replication, which is something a later patch will use for
targeting partitioned tables.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-03-24 15:00:54 +01:00
Andres Freund cedffbdb8b Report wait event for cost-based vacuum delay.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20200321040750.GD13662@telsasoft.com
2020-03-23 22:53:22 -07:00
Fujii Masao 496ee647ec Prefer standby promotion over recovery pause.
Previously if a promotion was triggered while recovery was paused,
the paused state continued. Also recovery could be paused by executing
pg_wal_replay_pause() even while a promotion was ongoing. That is,
recovery pause had higher priority over a standby promotion.
But this behavior was not desirable because most users basically wanted
the recovery to complete as soon as possible and the server to become
the master when they requested a promotion.

This commit changes recovery so that it prefers a promotion over
recovery pause. That is, if a promotion is triggered while recovery
is paused, the paused state ends and a promotion continues. Also
this commit makes recovery pause functions like pg_wal_replay_pause()
throw an error if they are executed while a promotion is ongoing.

Internally, this commit adds new internal function PromoteIsTriggered()
that returns true if a promotion is triggered. Since the name of
this function and the existing function IsPromoteTriggered() are
confusingly similar, the commit changes the name of IsPromoteTriggered()
to IsPromoteSignaled, as more appropriate name.

Author: Fujii Masao
Reviewed-by: Atsushi Torikoshi, Sergei Kornilov
Discussion: https://postgr.es/m/00c194b2-dbbb-2e8a-5b39-13f14048ef0a@oss.nttdata.com
2020-03-24 12:46:48 +09:00
Michael Paquier e09ad07b21 Move routine building restore_command to src/common/
restore_command has only been used until now by the backend, but there
is a pending patch for pg_rewind to make use of that in the frontend.

Author: Alexey Kondratov
Reviewed-by: Andrey Borodin, Andres Freund, Alvaro Herrera, Alexander
Korotkov, Michael Paquier
Discussion: https://postgr.es/m/a3acff50-5a0d-9a2c-b3b2-ee36168955c1@postgrespro.ru
2020-03-24 12:13:36 +09:00
Fujii Masao b8e20d6dab Add wait events for WAL archive and recovery pause.
This commit introduces new wait events BackupWaitWalArchive and
RecoveryPause. The former is reported while waiting for the WAL files
required for the backup to be successfully archived. The latter is
reported while waiting for recovery in pause state to be resumed.

Author: Fujii Masao
Reviewed-by: Michael Paquier, Atsushi Torikoshi, Robert Haas
Discussion: https://postgr.es/m/f0651f8c-9c96-9f29-0ff9-80414a15308a@oss.nttdata.com
2020-03-24 11:12:21 +09:00
Fujii Masao 67e0adfb3f Report NULL as total backup size if it's not estimated.
Previously 0 was reported in pg_stat_progress_basebackup.total_backup
if the total backup size was not estimated. Per discussion, our consensus
is that NULL is better choise as the value in total_backup in that case.
So this commit makes pg_stat_progress_basebackup view report NULL
in total_backup column if the estimation is disabled.

Bump catversion.

Author: Fujii Masao
Reviewed-by: Amit Langote, Magnus Hagander, Alvaro Herrera
Discussion: https://postgr.es/m/CABUevExnhOD89zBDuPvfAAh243RzNpwCPEWNLtMYpKHMB8gbAQ@mail.gmail.com
2020-03-24 10:43:41 +09:00
Jeff Davis 64fe602279 Fixes for Disk-based Hash Aggregation.
Justin Pryzby raised a couple issues with commit 1f39bce0. Fixed.

Also, tweak the way the size of a hash entry is estimated and the
number of buckets is estimated when calling BuildTupleHashTableExt().

Discussion: https://www.postgresql.org/message-id/20200319064222.GR26184@telsasoft.com
2020-03-23 15:43:07 -07:00
Amit Kapila 33753ac9d7 Add object names to partition integrity violations.
All errors of SQLSTATE class 23 should include the name of an object
associated with the error in separate fields of the error report message.
We do this so that applications need not try to extract them from the
possibly-localized human-readable text of the message.

Reported-by: Chris Bandy
Author: Chris Bandy
Reviewed-by: Amit Kapila and Amit Langote
Discussion: https://postgr.es/m/0aa113a3-3c7f-db48-bcd8-f9290b2269ae@gmail.com
2020-03-23 08:09:15 +05:30
Michael Paquier 79dfa8afb2 Add bound checks for ssl_min_protocol_version and ssl_max_protocol_version
Mixing incorrect bounds in the SSL context leads to confusing error
messages generated by OpenSSL which are hard to act on.  New range
checks are added when both min/max parameters are loaded in the context
of a SSL reload to improve the error reporting.  Note that this does not
make use of the GUC hook machinery contrary to 41aadee, as there is no
way to ensure a consistent range check (except if there is a way one day
to define range types for GUC parameters?).  Hence, this patch applies
only to OpenSSL, and uses a logic similar to other parameters to trigger
an error when reloading the SSL context in a session.

Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20200114035420.GE1515@paquier.xyz
2020-03-23 11:01:41 +09:00
Noah Misch de9396326e Revert "Skip WAL for new relfilenodes, under wal_level=minimal."
This reverts commit cb2fd7eac2.  Per
numerous buildfarm members, it was incompatible with parallel query, and
a test case assumed LP64.  Back-patch to 9.5 (all supported versions).

Discussion: https://postgr.es/m/20200321224920.GB1763544@rfd.leadboat.com
2020-03-22 09:24:09 -07:00
Noah Misch cb2fd7eac2 Skip WAL for new relfilenodes, under wal_level=minimal.
Until now, only selected bulk operations (e.g. COPY) did this.  If a
given relfilenode received both a WAL-skipping COPY and a WAL-logged
operation (e.g. INSERT), recovery could lose tuples from the COPY.  See
src/backend/access/transam/README section "Skipping WAL for New
RelFileNode" for the new coding rules.  Maintainers of table access
methods should examine that section.

To maintain data durability, just before commit, we choose between an
fsync of the relfilenode and copying its contents to WAL.  A new GUC,
wal_skip_threshold, guides that choice.  If this change slows a workload
that creates small, permanent relfilenodes under wal_level=minimal, try
adjusting wal_skip_threshold.  Users setting a timeout on COMMIT may
need to adjust that timeout, and log_min_duration_statement analysis
will reflect time consumption moving to COMMIT from commands like COPY.

Internally, this requires a reliable determination of whether
RollbackAndReleaseCurrentSubTransaction() would unlink a relation's
current relfilenode.  Introduce rd_firstRelfilenodeSubid.  Amend the
specification of rd_createSubid such that the field is zero when a new
rel has an old rd_node.  Make relcache.c retain entries for certain
dropped relations until end of transaction.

Back-patch to 9.5 (all supported versions).  This introduces a new WAL
record type, XLOG_GIST_ASSIGN_LSN, without bumping XLOG_PAGE_MAGIC.  As
always, update standby systems before master systems.  This changes
sizeof(RelationData) and sizeof(IndexStmt), breaking binary
compatibility for affected extensions.  (The most recent commit to
affect the same class of extensions was
089e4d405d0f3b94c74a2c6a54357a84a681754b.)

Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert
Haas.  Heikki Linnakangas and Michael Paquier implemented earlier
designs that materially clarified the problem.  Reviewed, in earlier
designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane,
Fujii Masao, and Simon Riggs.  Reported by Martijn van Oosterhout.

Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
2020-03-21 09:38:26 -07:00
Noah Misch d3e572855b In log_newpage_range(), heed forkNum and page_std arguments.
The function assumed forkNum=MAIN_FORKNUM and page_std=true, ignoring
the actual arguments.  Existing callers passed exactly those values, so
there's no live bug.  Back-patch to v12, where the function first
appeared, because another fix needs this.

Discussion: https://postgr.es/m/20191118045434.GA1173436@rfd.leadboat.com
2020-03-21 09:38:26 -07:00
Noah Misch e629a01f69 During heap rebuild, lock any TOAST index until end of transaction.
swap_relation_files() calls toast_get_valid_index() to find and lock
this index, just before swapping with the rebuilt TOAST index.  The
latter function releases the lock before returning.  Potential for
mischief is low; a concurrent session can issue ALTER INDEX ... SET
(fillfactor = ...), which is not alarming.  Nonetheless, changing
pg_class.relfilenode without a lock is unconventional.  Back-patch to
9.5 (all supported versions), because another fix needs this.

Discussion: https://postgr.es/m/20191226001521.GA1772687@rfd.leadboat.com
2020-03-21 09:38:26 -07:00
Noah Misch d60ef94d76 Fix cosmetic blemishes involving rd_createSubid.
Remove an obsolete comment from AtEOXact_cleanup().  Restore formatting
of a comment in struct RelationData, mangled by the pgindent run in
commit 9af4159fce.  Back-patch to 9.5 (all
supported versions), because another fix stacks on this.
2020-03-21 09:38:26 -07:00
Amit Kapila 3ba59ccc89 Allow page lock to conflict among parallel group members.
This is required as it is no safer for two related processes to perform
clean up in gin indexes at a time than for unrelated processes to do the
same.  After acquiring page locks, we can acquire relation extension lock
but reverse never happens which means these will also not participate in
deadlock.  So, avoid checking wait edges from this lock.

Currently, the parallel mode is strictly read-only, but after this patch
we have the infrastructure to allow parallel inserts and parallel copy.

Author: Dilip Kumar, Amit Kapila
Reviewed-by: Amit Kapila, Kuntal Ghosh and Sawada Masahiko
Discussion: https://postgr.es/m/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com
2020-03-21 08:48:06 +05:30
Amit Kapila 85f6b49c2c Allow relation extension lock to conflict among parallel group members.
This is required as it is no safer for two related processes to extend the
same relation at a time than for unrelated processes to do the same.  We
don't acquire a heavyweight lock on any other object after relation
extension lock which means such a lock can never participate in the
deadlock cycle.  So, avoid checking wait edges from this lock.

This provides an infrastructure to allow parallel operations like insert,
copy, etc. which were earlier not possible as parallel group members won't
conflict for relation extension lock.

Author: Dilip Kumar, Amit Kapila
Reviewed-by: Amit Kapila, Kuntal Ghosh and Sawada Masahiko
Discussion: https://postgr.es/m/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com
2020-03-20 08:20:56 +05:30
Peter Geoghegan b27e1b3418 nbtree: Remove obsolete _bt_pgaddtup() comments.
Remove comments that are a throw back to a time when nbtree cared about
write-ordering dependencies.  The comments are similar to those removed
by commit 9ee7414e, among others.
2020-03-19 14:56:56 -07:00
Jeff Davis 2fd6a44ad5 Revert "Specialize MemoryContextMemAllocated()."
This reverts commit e00912e11a.
2020-03-19 12:21:50 -07:00
Tom Lane 24e2885ee3 Introduce "anycompatible" family of polymorphic types.
This patch adds the pseudo-types anycompatible, anycompatiblearray,
anycompatiblenonarray, and anycompatiblerange.  They work much like
anyelement, anyarray, anynonarray, and anyrange respectively, except
that the actual input values need not match precisely in type.
Instead, if we can find a common supertype (using the same rules
as for UNION/CASE type resolution), then the parser automatically
promotes the input values to that type.  For example,
"myfunc(anycompatible, anycompatible)" can match a call with one
integer and one bigint argument, with the integer automatically
promoted to bigint.  With anyelement in the definition, the user
would have had to cast the integer explicitly.

The new types also provide a second, independent set of type variables
for function matching; thus with "myfunc(anyelement, anyelement,
anycompatible) returns anycompatible" the first two arguments are
constrained to be the same type, but the third can be some other
type, and the result has the type of the third argument.  The need
for more than one set of type variables was foreseen back when we
first invented the polymorphic types, but we never did anything
about it.

Pavel Stehule, revised a bit by me

Discussion: https://postgr.es/m/CAFj8pRDna7VqNi8gR+Tt2Ktmz0cq5G93guc3Sbn_NVPLdXAkqA@mail.gmail.com
2020-03-19 11:43:11 -04:00
Peter Eisentraut c314c147c0 Prepare to support non-tables in publications
This by itself doesn't change any functionality but prepares the way
for having relations other than base tables in publications.

Make arrangements for COPY handling the initial table sync.  For
non-tables we have to use COPY (SELECT ...) instead of directly
copying from the table, but then we have to take care to omit
generated columns from the column list.

Also, remove a hardcoded reference to relkind = 'r' and rely on the
publisher to send only what it can actually publish, which will be
correct even in future cross-version scenarios.

Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-03-19 08:25:07 +01:00
Fujii Masao 1d253bae57 Rename the recovery-related wait events.
This commit renames RecoveryWalAll and RecoveryWalStream wait events to
RecoveryWalStream and RecoveryRetrieveRetryInterval, respectively,
in order to make the names and what they are more consistent. For example,
previously RecoveryWalAll was reported as a wait event while the recovery
was waiting for WAL from a stream, and which was confusing because the name
was very different from the situation where the wait actually could happen.

The names of macro variables for those wait events also are renamed
accordingly.

This commit also changes the category of RecoveryRetrieveRetryInterval to
Timeout from Activity because the wait event is reported while waiting based
on wal_retrieve_retry_interval.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Atsushi Torikoshi
Discussion: https://postgr.es/m/124997ee-096a-5d09-d8da-2c7a57d0816e@oss.nttdata.com
2020-03-19 15:32:55 +09:00
Amit Kapila 72e78d831a Add assert to ensure that page locks don't participate in deadlock cycle.
Assert that we don't acquire any other heavyweight lock while holding the
page lock except for relation extension.  However, these locks are never
taken in reverse order which implies that page locks will never
participate in the deadlock cycle.

Similar to relation extension, page locks are also held for a short
duration, so imposing such a restriction won't hurt.

Author: Dilip Kumar, with few changes by Amit Kapila
Reviewed-by: Amit Kapila, Kuntal Ghosh and Sawada Masahiko
Discussion: https://postgr.es/m/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com
2020-03-19 08:11:45 +05:30
Peter Geoghegan 6312c08a29 nbtree: Use raw PageAddItem() for retail inserts.
Only internal page splits need to call _bt_pgaddtup() instead of
PageAddItem(), and only for data items, one of which will end up at the
first offset (or first offset after the high key offset) on the new
right page.  This data item alone will need to be truncated in
_bt_pgaddtup().

Since there is no reason why retail inserts ever need to truncate the
incoming item, use a raw PageAddItem() call there instead.  Even
_bt_split() uses raw PageAddItem() calls for left page and right page
high keys.  Clearly the _bt_pgaddtup() shim function wasn't really
encapsulating anything.  _bt_pgaddtup() should now be thought of as a
_bt_split() helper function.

Note that the assertions from commit d1e241c2 verify that retail inserts
never insert an item at an internal page's negative infinity offset.
This invariant could only ever be violated as a result of a basic logic
error in nbtinsert.c.
2020-03-18 18:17:37 -07:00
Michael Paquier d41202f36e Fix comment related to concurrent index swapping in index.c
A comment about switching indisvalid of the new and old indexes swapped
in REINDEX CONCURRENTLY got this backwards.

Issue introduced by 5dc92b8, the original commit of REINDEX
CONCURRENTLY.

Author: Julien Rouhaud
Discussion: https://postgr.es/m/20200318143340.GA46897@nol
Backpatch-through: 12
2020-03-19 09:51:33 +09:00
Jeff Davis 1f39bce021 Disk-based Hash Aggregation.
While performing hash aggregation, track memory usage when adding new
groups to a hash table. If the memory usage exceeds work_mem, enter
"spill mode".

In spill mode, new groups are not created in the hash table(s), but
existing groups continue to be advanced if input tuples match. Tuples
that would cause a new group to be created are instead spilled to a
logical tape to be processed later.

The tuples are spilled in a partitioned fashion. When all tuples from
the outer plan are processed (either by advancing the group or
spilling the tuple), finalize and emit the groups from the hash
table. Then, create new batches of work from the spilled partitions,
and select one of the saved batches and process it (possibly spilling
recursively).

Author: Jeff Davis
Reviewed-by: Tomas Vondra, Adam Lee, Justin Pryzby, Taylor Vesely, Melanie Plageman
Discussion: https://postgr.es/m/507ac540ec7c20136364b5272acbcd4574aa76ef.camel@j-davis.com
2020-03-18 15:42:02 -07:00
Jeff Davis e00912e11a Specialize MemoryContextMemAllocated().
An AllocSet doubles the size of allocated blocks (up to maxBlockSize),
which means that the current block can represent half of the total
allocated space for the memory context. But the free space in the
current block may never have been touched, so don't count the
untouched memory as allocated for the purposes of
MemoryContextMemAllocated().

Discussion: https://postgr.es/m/ec63d70b668818255486a83ffadc3aec492c1f57.camel@j-davis.com
2020-03-18 15:39:14 -07:00
Alvaro Herrera 487e9861d0
Enable BEFORE row-level triggers for partitioned tables
... with the limitation that the tuple must remain in the same
partition.

Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/20200227165158.GA2071@alvherre.pgsql
2020-03-18 18:58:05 -03:00
Peter Geoghegan b029395f5e Refactor nbtree fastpath optimization.
Commit 2b272734, which added the fastpath rightmost leaf page cache
insert optimization, added code to _bt_doinsert() to handle using and
invalidating the backend local block cache.  It doesn't seem like a good
place to handle these low level details, though.  _bt_doinsert() is
supposed to be a high level function -- it is the main entry point to
nbtinsert.c.

Restructure the code by placing handling of the rightmost block cache at
the start of a new _bt_search() shim function, _bt_search_insert().  The
new function is called from _bt_doinsert(), which uses it as a
_bt_search() variant that conveniently accepts its BTInsertState state
as an argument.  _bt_doinsert() no longer needs to directly consider the
fastpath optimization.

Discussion: https://postgr.es/m/CAH2-Wzk59cxKJRd=rfbyub6-V4yWRjsOYRkUNHBLT1P1GdtCQQ@mail.gmail.com
2020-03-18 14:42:49 -07:00
Peter Eisentraut a2b1faa0f2 Implement type regcollation
This will be helpful for a following commit and it's also just
generally useful, like the other reg* types.

Author: Julien Rouhaud
Reviewed-by: Thomas Munro and Michael Paquier
Discussion: https://postgr.es/m/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
2020-03-18 21:21:00 +01:00
Tomas Vondra ccaa3569f5 Recognize some OR clauses as compatible with functional dependencies
Since commit 8f321bd16c functional dependencies can handle IN clauses,
which however introduced a possible (and surprising) inconsistency,
because IN clauses may be expressed as an OR clause, which are still
considered incompatible. For example

  a IN (1, 2, 3)

may be rewritten as

  (a = 1 OR a = 2 OR a = 3)

The IN clause will work fine with functional dependencies, but the OR
clause will force the estimation to fall back to plain per-column
estimates, possibly introducing significant estimation errors.

This commit recognizes OR clauses equivalent to an IN clause (when all
arugments are compatible and reference the same attribute) as a special
case, compatible with functional dependencies. This allows applying
functional dependencies, just like for IN clauses.

This does not eliminate the difference in estimating the clause itself,
i.e. IN clause and OR clause still use different formulas. It would be
possible to change that (for these special OR clauses), but that's not
really about extended statistics - it was always like this. Moreover the
errors are usually much smaller compared to ignoring dependencies.

Author: Tomas Vondra
Reviewed-by: Dean Rasheed
Discussion: https://www.postgresql.org/message-id/flat/13902317.Eha0YfKkKy%40pierred-pdoc
2020-03-18 16:41:49 +01:00
Tomas Vondra 6f72dbc48b Fix wording of several extended stats comments
Reported-by: Thomas Munro
Discussion: https://www.postgresql.org/message-id/flat/20200113230008.g67iyk4cs3xbnjju@development
2020-03-18 13:40:13 +01:00
Amit Kapila b4f140869f Add missing errcode() in a few ereport calls.
This will allow to specifying SQLSTATE error code for the errors in the
missing places.

Reported-by: Sawada Masahiko
Author: Sawada Masahiko
Backpatch-through: 9.5
Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
2020-03-18 09:27:14 +05:30
Michael Paquier fdeeb524b4 Fix typo in indexcmds.c
Introduced by 61d7c7b.

Backpatch-through: 12
2020-03-18 11:13:12 +09:00
Amit Kapila 15ef6ff4b9 Assert that we don't acquire a heavyweight lock on another object after
relation extension lock.

The only exception to the rule is that we can try to acquire the same
relation extension lock more than once.  This is allowed as we are not
creating any new lock for this case.  This restriction implies that the
relation extension lock won't ever participate in the deadlock cycle
because we can never wait for any other heavyweight lock after acquiring
this lock.

Such a restriction is okay for relation extension locks as unlike other
heavyweight locks these are not held till the transaction end.  These are
taken for a short duration to extend a particular relation and then
released.

Author: Dilip Kumar, with few changes by Amit Kapila
Reviewed-by: Amit Kapila, Kuntal Ghosh and Sawada Masahiko
Discussion: https://postgr.es/m/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com
2020-03-18 07:20:17 +05:30
Peter Geoghegan b897b3aae6 nbtree: Remove useless local variables.
Copying block and offset numbers to local variables in _bt_insertonpg()
made the code less readable.  Remove the variables.  There is already
code that conditionally calls BufferGetBlockNumber() in the same block,
so consistently do it that way instead.

Calling BufferGetBlockNumber() is very cheap, but we might as well avoid
it when it isn't truly necessary.  It isn't truly necessary for
_bt_insertonpg() to call BufferGetBlockNumber() in almost all cases.

Spotted while working on a patch that refactors the fastpath rightmost
leaf page cache optimization, which was added by commit 2b272734.
2020-03-17 18:39:26 -07:00
Thomas Munro 9b8aa09293 Don't use EV_CLEAR for kqueue events.
For the semantics to match the epoll implementation, we need a socket to
continue to appear readable/writable if you wait multiple times without
doing I/O in between (in Linux terminology: level-triggered rather than
edge-triggered).  This distinction will be important for later commits.
Similar to commit 3b790d256f for Windows.

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJAC4Oqao%3DqforhNey20J8CiG2R%3DoBPqvfR0vOJrFysGw%40mail.gmail.com
2020-03-18 13:05:09 +13:00
Thomas Munro 7bc84a1f30 Fix kqueue support under debugger on macOS.
While running under a debugger, macOS's getppid() can return the
debugger's PID.  That could cause a backend to exit because it falsely
believed that the postmaster had died, since commit 815c2f09.

Continue to use getppid() as a fast postmaster check after adding the
postmaster's PID to a kqueue, to close a PID-reuse race, but double
check that it actually exited by trying to read the pipe.  The new check
isn't reached in the common case.

Reported-by: Alexander Korotkov <a.korotkov@postgrespro.ru>
Discussion: https://postgr.es/m/CA%2BhUKGKhAxJ8V8RVwCo6zJaeVrdOG1kFBHGZOOjf6DzW_omeMA%40mail.gmail.com
2020-03-18 13:05:07 +13:00
Tom Lane e6c178b5b7 Refactor our checks for valid function and aggregate signatures.
pg_proc.c and pg_aggregate.c had near-duplicate copies of the logic
to decide whether a function or aggregate's signature is legal.
This seems like a bad thing even without the problem that the
upcoming "anycompatible" patch would roughly double the complexity
of that logic.  Hence, refactor so that the rules are localized
in new subroutines supplied by parse_coerce.c.  (One could quibble
about just where to add that code, but putting it beside
enforce_generic_type_consistency seems not totally unreasonable.)

The fact that the rules are about to change would mandate some
changes in the wording of the associated error messages in any case.
I ended up spelling things out in a fairly literal fashion in the
errdetail messages, eg "A result of type anyelement requires at
least one input of type anyelement, anyarray, anynonarray, anyenum,
or anyrange."  Perhaps this is overkill, but once there's more than
one subgroup of polymorphic types, people might get confused by
more-abstract messages.

Discussion: https://postgr.es/m/24137.1584139352@sss.pgh.pa.us
2020-03-17 19:36:41 -04:00
Tom Lane 77ec5affbc Adjust handling of an ANYARRAY actual input for an ANYARRAY argument.
Ordinarily it's impossible for an actual input of a function to have
declared type ANYARRAY, since we'd resolve that to a concrete array
type before doing argument type resolution for the function.  But an
exception arises for functions applied to certain columns of pg_statistic
or pg_stats, since we abuse the "anyarray" pseudotype by using it to
declare those columns.  So parse_coerce.c has to deal with the case.

Previously we allowed an ANYARRAY actual input to match an ANYARRAY
polymorphic argument, but only if no other argument or result was
declared ANYELEMENT.  When that logic was written, those were the only
two polymorphic types, and I fear nobody thought carefully about how it
ought to extend to the other ones that came along later.  But actually
it was wrong even then, because if a function has two ANYARRAY
arguments, it should be able to expect that they have identical element
types, and we'd not be able to ensure that.

The correct generalization is that we can match an ANYARRAY actual input
to an ANYARRAY polymorphic argument only if no other argument or result
is of any polymorphic type, so that no promises are being made about
element type compatibility.  check_generic_type_consistency can't test
that condition, but it seems better anyway to accept such matches there
and then throw an error if needed in enforce_generic_type_consistency.
That way we can produce a specific error message rather than an
unintuitive "function does not exist" complaint.  (There's some risk
perhaps of getting new ambiguous-function complaints, but I think that
any set of functions for which that could happen would be ambiguous
against ordinary array columns as well.)  While we're at it, we can
improve the message that's produced in cases that the code did already
object to, as shown in the regression test changes.

Also, remove a similar test that got cargo-culted in for ANYRANGE;
there are no catalog columns of type ANYRANGE, and I hope we never
create any, so that's not needed.  (It was incomplete anyway.)

While here, update some comments and rearrange the code a bit in
preparation for upcoming additions of more polymorphic types.

In practical situations I believe this is just exchanging one error
message for another, hopefully better, one.  So it doesn't seem
needful to back-patch, even though the mistake is ancient.

Discussion: https://postgr.es/m/21569.1584314271@sss.pgh.pa.us
2020-03-17 18:29:14 -04:00
Alvaro Herrera 5d0c2d5eba
Remove logical_read_local_xlog_page
It devolved into a content-less wrapper over read_local_xlog_page, with
nothing to add, plus it's easily confused with walsender's
logical_read_xlog_page.  There doesn't seem to be any reason for it to
stay.

src/include/replication/logicalfuncs.h becomes empty, so remove it too.
The prototypes it initially had were absorbed by generated fmgrprotos.h.

Discussion: https://postgr.es/m/20191115214102.GA15616@alvherre.pgsql
2020-03-17 18:18:01 -03:00
Alvaro Herrera bcd1c36300
Fix consistency issues with replication slot copy
Commit 9f06d79ef831's replication slot copying failed to
properly reserve the WAL that the slot is expecting to see
during DecodingContextFindStartpoint (to set the confirmed_flush
LSN), so concurrent activity could remove that WAL and cause the
copy process to error out.  But it doesn't actually *need* that
WAL anyway: instead of running decode to find confirmed_flush, it
can be copied from the source slot. Fix this by rearranging things
to avoid DecodingContextFindStartpoint() (leaving the target slot's
confirmed_flush_lsn to invalid), and set that up afterwards by copying
from the target slot's value.

Also ensure the source slot's confirmed_flush_lsn is valid.

Reported-by: Arseny Sher
Author: Masahiko Sawada, Arseny Sher
Discussion: https://postgr.es/m/871rr3ohbo.fsf@ars-thinkpad
2020-03-17 16:13:18 -03:00
Tom Lane 9d9784c840 Remove bogus assertion about polymorphic SQL function result.
It is possible to reach check_sql_fn_retval() with an unresolved
polymorphic rettype, resulting in an assertion failure as demonstrated
by one of the added test cases.  However, the code following that
throws what seems an acceptable error message, so just remove the
Assert and adjust commentary.

While here, I thought it'd be a good idea to provide some parallel
tests of SQL-function and PL/pgSQL-function polymorphism behavior.
Some of these cases are perhaps duplicative of tests elsewhere,
but we hadn't any organized coverage of the topic AFAICS.

Although that assertion's been wrong all along, it won't have any
effect in production builds, so I'm not bothering to back-patch.

Discussion: https://postgr.es/m/21569.1584314271@sss.pgh.pa.us
2020-03-17 14:54:46 -04:00
Fujii Masao 1429d3f767 Fix comment in xlog.c.
This commit fixes the comment about SharedHotStandbyActive variable.
The comment was apparently copy-and-pasted.

Author: Atsushi Torikoshi
Discussion: https://postgr.es/m/CACZ0uYEjpqZB9wN2Rwc_RMvDybyYqdbkPuDr1NyxJg4f9yGfMw@mail.gmail.com
2020-03-17 12:06:22 +09:00
Tom Lane 41b45576d5 Remove useless pfree()s at the ends of various ValuePerCall SRFs.
We don't need to manually clean up allocations in a SRF's
multi_call_memory_ctx, because the SRF_RETURN_DONE infrastructure
takes care of that (and also ensures that it will happen even if the
function never gets a final call, which simple manual cleanup cannot
do).

Hence, the code removed by this patch is a waste of code and cycles.
Worse, it gives the impression that cleaning up manually is a thing,
which can lead to more serious errors such as those fixed in
commits 085b6b667 and b4570d33a.  So we should get rid of it.

These are not quite actual bugs though, so I couldn't muster the
enthusiasm to back-patch.  Fix in HEAD only.

Justin Pryzby

Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
2020-03-16 21:36:53 -04:00
Tom Lane b4570d33aa Avoid holding a directory FD open across assorted SRF calls.
This extends the fixes made in commit 085b6b667 to other SRFs with the
same bug, namely pg_logdir_ls(), pgrowlocks(), pg_timezone_names(),
pg_ls_dir(), and pg_tablespace_databases().

Also adjust various comments and documentation to warn against
expecting to clean up resources during a ValuePerCall SRF's final
call.

Back-patch to all supported branches, since these functions were
all born broken.

Justin Pryzby, with cosmetic tweaks by me

Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
2020-03-16 21:05:52 -04:00
Peter Geoghegan 113758155c nbtree: Fix obsolete _bt_search() comment.
Oversight in commit d2086b08b0.
2020-03-16 15:51:06 -07:00
Peter Geoghegan 013c1f6af6 nbtree: Pass down MAXALIGN()'d itemsz for new item.
Refactor nbtinsert.c so that the final itemsz of each new non-pivot
tuple (the MAXALIGN()'d size) is determined once.  Most of the functions
used by leaf page inserts used the insertstate.itemsz value already.
This commit makes everything use insertstate.itemsz as standard
practice.  The goal is to decouple tuple size from "effective" tuple
size.  Making this distinction isn't truly necessary right now, but that
might change in the future.

Also explain why we consistently apply MAXALIGN() to get an effective
index tuple size.  This was rather unclear, in part because it isn't
actually strictly necessary right now.
2020-03-16 12:00:10 -07:00
Thomas Munro fc34b0d9de Introduce a maintenance_io_concurrency setting.
Introduce a GUC and a tablespace option to control I/O prefetching, much
like effective_io_concurrency, but for work that is done on behalf of
many client sessions.

Use the new setting in heapam.c instead of the hard-coded formula
effective_io_concurrency + 10 introduced by commit 558a9165e0.  Go with
a default value of 10 for now, because it's a round number pretty close
to the value used for that existing case.

Discussion: https://postgr.es/m/CA%2BhUKGJUw08dPs_3EUcdO6M90GnjofPYrWp4YSLaBkgYwS-AqA%40mail.gmail.com
2020-03-16 17:14:26 +13:00
Thomas Munro b09ff53667 Simplify the effective_io_concurrency setting.
The effective_io_concurrency GUC and equivalent tablespace option were
previously passed through a formula based on a theory about RAID
spindles and probabilities, to arrive at the number of pages to prefetch
in bitmap heap scans.  Tomas Vondra, Andres Freund and others argued
that it was anachronistic and hard to justify, and commit 558a9165e0
already started down the path of bypassing it in new code.  We agreed to
drop that logic and use the value directly.

For the default setting of 1, there is no change in effect.  Higher
settings can be converted from the old meaning to the new with:

  select round(sum(OLD / n::float)) from generate_series(1, OLD) s(n);

We might want to consider renaming the GUC before the next release given
the change in meaning, but it's not clear that many users had set it
very carefully anyway.  That decision is deferred for now.

Discussion: https://postgr.es/m/CA%2BhUKGJUw08dPs_3EUcdO6M90GnjofPYrWp4YSLaBkgYwS-AqA%40mail.gmail.com
2020-03-16 17:14:26 +13:00
Peter Geoghegan f207bb0b8f nbtree: Reorder nbtinsert.c prototypes.
Relocate _bt_newroot() prototype, so that the order that prototypes
appear in matches the order that the functions are defined in.
2020-03-15 20:53:12 -07:00
Peter Eisentraut 70a7b4776b Add backend type to csvlog and optionally log_line_prefix
The backend type, which corresponds to what
pg_stat_activity.backend_type shows, is added as a column to the
csvlog and can optionally be added to log_line_prefix using the new %b
placeholder.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://www.postgresql.org/message-id/flat/c65e5196-4f04-4ead-9353-6088c19615a3@2ndquadrant.com
2020-03-15 11:20:21 +01:00
Tom Lane 87c9c2571c Rearrange pseudotypes.c to get rid of duplicative code.
Commit a5954de10 replaced a lot of manually-coded stub I/O routines
with code generated by macros.  That was a good idea but it didn't
go far enough, because there were still manually-coded stub input
routines for types that had live output routines.  Refactor the
macro so that we can generate just a stub input routine at need.

Also create similar macros to generate stub binary I/O routines,
since we have some of those now.  The only stub functions that remain
hand-coded are shell_in() and shell_out(), which need to be separate
because they use different error messages.

While here, rearrange the commentary to discuss each type not each
function.  This provides a better way to explain the *why* of which
types need which support, rather than just duplicatively annotating
the functions.

Discussion: https://postgr.es/m/24137.1584139352@sss.pgh.pa.us
2020-03-14 15:31:44 -04:00
Tom Lane 4dbcb3f844 Restructure polymorphic-type resolution in funcapi.c.
resolve_polymorphic_tupdesc() and resolve_polymorphic_argtypes() failed to
cover the case of having to resolve anyarray given only an anyrange input.
The bug was masked if anyelement was also used (as either input or
output), which probably helps account for our not having noticed.

While looking at this I noticed that resolve_generic_type() would produce
the wrong answer if asked to make that same resolution.  ISTM that
resolve_generic_type() is confusingly defined and overly complex, so
rather than fix it, let's just make funcapi.c do the specific lookups
it requires for itself.

With this change, resolve_generic_type() is not used anywhere, so remove
it in HEAD.  In the back branches, leave it alone (complete with bug)
just in case any external code is using it.

While we're here, make some other refactoring adjustments in funcapi.c
with an eye to upcoming future expansion of the set of polymorphic types:

* Simplify quick-exit tests by adding an overall have_polymorphic_result
flag.  This is about a wash now but will be a win when there are more
flags.

* Reduce duplication of code between resolve_polymorphic_tupdesc() and
resolve_polymorphic_argtypes().

* Don't bother to validate correct matching of anynonarray or anyenum;
the parser should have done that, and even if it didn't, just doing
"return false" here would lead to a very confusing, off-point error
message.  (Really, "return false" in these two functions should only
occur if the call_expr isn't supplied or we can't obtain data type
info from it.)

* For the same reason, throw an elog rather than "return false" if
we fail to resolve a polymorphic type.

The bug's been there since we added anyrange, so back-patch to
all supported branches.

Discussion: https://postgr.es/m/6093.1584202130@sss.pgh.pa.us
2020-03-14 14:42:22 -04:00
Tomas Vondra e83daa7e33 Use multi-variate MCV lists to estimate ScalarArrayOpExpr
Commit 8f321bd16c added support for estimating ScalarArrayOpExpr clauses
(IN/ANY) clauses using functional dependencies. There's no good reason
not to support estimation of these clauses using multi-variate MCV lists
too, so this commits implements that. That makes the behavior consistent
and MCV lists can estimate all variants (ANY/ALL, inequalities, ...).

Author: Tomas Vondra
Review: Dean Rasheed
Discussion: https://www.postgresql.org/message-id/flat/13902317.Eha0YfKkKy%40pierred-pdoc
2020-03-14 16:13:00 +01:00
Tomas Vondra 8f321bd16c Use functional dependencies to estimate ScalarArrayOpExpr
Until now functional dependencies supported only simple equality clauses
and clauses that can be trivially translated to equalities. This commit
allows estimation of some ScalarArrayOpExpr (IN/ANY) clauses.

For IN clauses we can do this thanks to using operator with equality
semantics, which means an IN clause

    WHERE c IN (1, 2, ..., N)

can be translated to

    WHERE (c = 1 OR c = 2 OR ... OR c = N)

IN clauses are now considered compatible with functional dependencies,
and rely on the same assumption of consistency of queries with data
(which is an assumption we already used for simple equality clauses).
This applies also to ALL clauses with an equality operator, which can be
considered equivalent to IN clause.

ALL clauses are still considered incompatible, although there's some
discussion about maybe relaxing this in the future.

Author: Pierre Ducroquet
Reviewed-by: Tomas Vondra, Dean Rasheed
Discussion: https://www.postgresql.org/message-id/flat/13902317.Eha0YfKkKy%40pierred-pdoc
2020-03-14 16:12:41 +01:00
Peter Eisentraut d90bd24391 Remove am_syslogger global variable
Use the new MyBackendType instead.  More similar changes for other "am
something" variables are possible.  This one was just particularly
simple.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c65e5196-4f04-4ead-9353-6088c19615a3@2ndquadrant.com
2020-03-13 14:01:15 +01:00
Peter Eisentraut 8e8a0becb3 Unify several ways to tracking backend type
Add a new global variable MyBackendType that uses the same BackendType
enum that was previously only used by the stats collector.  That way
several duplicate ways of checking what type a particular process is
can be simplified.  Since it's no longer just for stats, move to
miscinit.c and rename existing functions to match the expanded
purpose.

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c65e5196-4f04-4ead-9353-6088c19615a3@2ndquadrant.com
2020-03-13 14:01:10 +01:00
Peter Eisentraut 1cc9c2412c Preserve replica identity index across ALTER TABLE rewrite
If an index was explicitly set as replica identity index, this setting
was lost when a table was rewritten by ALTER TABLE.  Because this
setting is part of pg_index but actually controlled by ALTER
TABLE (not part of CREATE INDEX, say), we have to do some extra work
to restore it.

Based-on-patch-by: Quan Zongliang <quanzongliang@gmail.com>
Reviewed-by: Euler Taveira <euler.taveira@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c70fcab2-4866-0d9f-1d01-e75e189db342@gmail.com
2020-03-13 11:57:06 +01:00
Tom Lane 085b6b6679 Avoid holding a directory FD open across pg_ls_dir_files() calls.
This coding technique is undesirable because (a) it leaks the FD for
the rest of the transaction if the SRF is not run to completion, and
(b) allocated FDs are a scarce resource, but multiple interleaved
uses of the relevant functions could eat many such FDs.

In v11 and later, a query such as "SELECT pg_ls_waldir() LIMIT 1"
yields a warning about the leaked FD, and the only reason there's
no warning in earlier branches is that fd.c didn't whine about such
leaks before commit 9cb7db3f0.  Even disregarding the warning, it
wouldn't be too hard to run a backend out of FDs with careless use
of these SQL functions.

Hence, rewrite the function so that it reads the directory within
a single call, returning the results as a tuplestore rather than
via value-per-call mode.

There are half a dozen other built-in SRFs with similar problems,
but let's fix this one to start with, just to see if the buildfarm
finds anything wrong with the code.

In passing, fix bogus error report for stat() failure: it was
whining about the directory when it should be fingering the
individual file.  Doubtless a copy-and-paste error.

Back-patch to v10 where this function was added.

Justin Pryzby, with cosmetic tweaks and test cases by me

Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
2020-03-11 15:27:59 -04:00
Peter Eisentraut bf68b79e50 Refactor ps_status.c API
The init_ps_display() arguments were mostly lies by now, so to match
typical usage, just use one argument and let the caller assemble it
from multiple sources if necessary.  The only user of the additional
arguments is BackendInitialize(), which was already doing string
assembly on the caller side anyway.

Remove the second argument of set_ps_display() ("force") and just
handle that in init_ps_display() internally.

BackendInitialize() also used to set the initial status as
"authentication", but that was very far from where authentication
actually happened.  So now it's set to "initializing" and then
"authentication" just before the actual call to
ClientAuthentication().

Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c65e5196-4f04-4ead-9353-6088c19615a3@2ndquadrant.com
2020-03-11 16:38:31 +01:00
Alvaro Herrera 899a04f5ed
Avoid duplicates in ALTER ... DEPENDS ON EXTENSION
If the command is attempted for an extension that the object already
depends on, silently do nothing.

In particular, this means that if a database containing multiple such
entries is dumped, the restore will silently do the right thing and
record just the first one.  (At least, in a world where pg_dump does
dump such entries -- which it doesn't currently, but it will.)

Backpatch to 9.6, where this kind of dependency was introduced.

Reviewed-by: Ibrar Ahmed, Tom Lane (offlist)
Discussion: https://postgr.es/m/20200217225333.GA30974@alvherre.pgsql
2020-03-11 11:04:59 -03:00
Peter Eisentraut 1c91838181 Clean up order in miscinit.c a bit
The code around InitPostmasterChild() from commit 31c453165b somehow
ended up in the middle of a block of code related to "User ID state".
Move it into its own block instead.
2020-03-11 13:51:55 +01:00
Peter Eisentraut aaa3aeddee Remove HAVE_WORKING_LINK
Previously, hard links were not used on Windows and Cygwin, but they
support them just fine in currently supported OS versions, so we can
use them there as well.

Since all supported platforms now support hard links, we can remove
the alternative code paths.

Rename durable_link_or_rename() to durable_rename_excl() to make the
purpose more clear without referencing the implementation details.

Discussion: https://www.postgresql.org/message-id/flat/72fff73f-dc9c-4ef4-83e8-d2e60c98df48%402ndquadrant.com
2020-03-11 11:23:04 +01:00
Peter Geoghegan 39eabec904 nbtree: Move fastpath NULL descent stack assertion.
Commit 074251db added an assertion that verified the fastpath/rightmost
page insert optimization's assumption about free space: There should
always be enough free space on the page to insert the new item without
splitting the page.  Otherwise, we end up using the "concurrent root
page split" phony/fake stack path in _bt_insert_parent().  This does not
lead to incorrect behavior, but it is likely to be far slower than
simply using the regular _bt_search() path.  The assertion catches
serious performance bugs that would probably take a long time to detect
any other way.

It seems much more natural to make this assertion just before the point
that we generate a fake/phony descent stack.  Move the assert there.
This also makes _bt_insertonpg() a bit more readable.
2020-03-10 17:25:47 -07:00
Tom Lane c8e8b2f9df Marginal comments and docs cleanup.
Fix up some imprecise comments and poor markup from ba79cb5dc.  Also try
to convert the documentation of log_min_duration_sample and friends into
passable English.
2020-03-10 17:34:09 -04:00
Peter Geoghegan d1e241c226 nbtree: Demote minus infinity "can't happen" error.
Only a very basic logic bug in a _bt_insertonpg() caller could lead to a
violation of this invariant.  Besides, any newitemoff used for an
internal page is sanitized using other "can't happen" errors in
_bt_getstackbuf() or its callers, before _bt_insertonpg() even gets
called.

Also, move the error/assertion from the insert-without-split path of
_bt_insertonpg() to the top of the same function.  There is no reason
why this invariant only applies to insertions that happen to not result
in a page split; cover every insertion.  The assertion naturally belongs
next to the existing generic assertions that document relatively
high-level invariants for the item being inserted.
2020-03-10 14:15:41 -07:00
Tom Lane cacef17223 Ensure that CREATE TABLE LIKE copies any NO INHERIT constraint property.
Since the documentation about LIKE doesn't say that a copied constraint
has properties different from the original, it seems that ignoring
a NO INHERIT property doesn't meet the principle of least surprise.
So make it copy that.

(Note, however, that we still don't copy a NOT VALID property;
CREATE TABLE offers no way to do that, plus it seems pointless.)

Arguably this is a bug fix; but no back-patch, as it seems barely
possible somebody is depending on the current behavior.

Ildar Musin and Chris Travers; reviewed by Amit Langote and myself

Discussion: https://postgr.es/m/CAONYFtMC6C+3AWCVp7Yd8H87Zn0GxG1_iQG6_bQKbaqYZY0=-g@mail.gmail.com
2020-03-10 14:54:00 -04:00
Tom Lane d01f03a495 Preserve integer and float values accurately in (de)serialize_deflist.
Previously, this code just smashed all types of DefElem values to
strings, cavalierly reasoning that nobody would care.  But in point of
fact, most of the defGetFoo functions do distinguish among different
input syntaxes; for instance defGetBoolean will accept 1 as an integer
but not "1" as a string.  This led to CREATE/ALTER TEXT SEARCH
DICTIONARY accepting 0 and 1 as values for boolean dictionary
properties, only to have the dictionary fail at runtime.

We can upgrade this behavior by teaching serialize_deflist that it
does not need to quote T_Integer or T_Float nodes' values on output,
and then teaching deserialize_deflist to restore unquoted integer or
float values as the appropriate node type.  This should not break
anything using pg_ts_dict.dictinitoption, since that field is just
defined as being something valid to include in CREATE TEXT SEARCH
DICTIONARY.

deserialize_deflist is also used to parse the options arguments
for the ts_headline family of functions, but so far as I can see
this won't cause any problems there either: the only consumer of
that output is prsd_headline which always uses defGetString.
(Really that's a bad idea, but I won't risk changing it here.)

This is surely a bug fix, but given the lack of field complaints
I don't think it's necessary to back-patch.

Discussion: https://postgr.es/m/CAMkU=1xRcs_BUPzR0+V3WndaCAv0E_m3h6aUEJ8NF-sY1nnHsw@mail.gmail.com
2020-03-10 12:30:02 -04:00
Alvaro Herrera 40b3e2c201
Split out CreateCast into src/backend/catalog/pg_cast.c
This catalog-handling code was previously together with the rest of
CastCreate() in src/backend/commands/functioncmds.c.  A future patch
will need a way to add casts internally, so this will be useful to have
separate.

Also, move the nearby get_cast_oid() function from functioncmds.c to
lsyscache.c, which seems a more natural place for it.

Author: Paul Jungwirth, minor edits by Álvaro
Discussion: https://postgr.es/m/20200309210003.GA19992@alvherre.pgsql
2020-03-10 11:28:23 -03:00
Peter Eisentraut 3c173a53a8 Remove utils/acl.h from catalog/objectaddress.h
The need for this was removed by
8b9e9644dc.

A number of files now need to include utils/acl.h or
parser/parse_node.h explicitly where they previously got it indirectly
somehow.

Since parser/parse_node.h already includes nodes/parsenodes.h, the
latter is then removed where the former was added.  Also, remove
nodes/pg_list.h from objectaddress.h, since that's included via
nodes/parsenodes.h.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/7601e258-26b2-8481-36d0-dc9dca6f28f1%402ndquadrant.com
2020-03-10 10:27:00 +01:00
Peter Eisentraut 17b9e7f9fe Support adding partitioned tables to publication
When a partitioned table is added to a publication, changes of all of
its partitions (current or future) are published via that publication.

This change only affects which tables a publication considers as its
members.  The receiving side still sees the data coming from the
individual leaf partitions.  So existing restrictions that partition
hierarchies can only be replicated one-to-one are not changed by this.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-03-10 09:09:32 +01:00
Michael Paquier 61d7c7bce3 Prevent reindex of invalid indexes on TOAST tables
Such indexes can only be duplicated leftovers of a previously failed
REINDEX CONCURRENTLY command, and a valid equivalent is guaranteed to
exist.  As toast indexes can only be dropped if invalid, reindexing
these would lead to useless duplicated indexes that can't be dropped
anymore, except if the parent relation is dropped.

Thanks to Justin Pryzby for reminding that this problem was reported
long ago during the review of the original patch of REINDEX
CONCURRENTLY, but the issue was never addressed.

Reported-by: Sergei Kornilov, Justin Pryzby
Author: Julien Rouhaud
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net
Discussion: https://postgr.es/m/20200216190835.GA21832@telsasoft.com
Backpatch-through: 12
2020-03-10 15:38:17 +09:00
Fujii Masao 71e0d0a737 Tidy up XLogSource code in xlog.c.
This commit replaces 0 used as an initial value of XLogSource variable,
with XLOG_FROM_ANY. Also this commit changes those variable so that
XLogSource instead of int is used as the type for them. These changes
are for code readability and debugger-friendliness.

Author: Kyotaro Horiguchi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20200227.124830.2197604521555566121.horikyota.ntt@gmail.com
2020-03-10 09:41:44 +09:00
Jeff Davis 24d85952a5 Introduce LogicalTapeSetExtend().
Increases the number of tapes in a logical tape set. This will be
important for disk-based hash aggregation, because the maximum number
of tapes is not known ahead of time.

While discussing this change, it was observed to regress the
performance of Sort for at least one test case. The performance
regression was because some versions of GCC switch to an inlined
version of memcpy() in LogicalTapeWrite() after this change. No
performance regression for clang was observed.

Because the regression is due to an arbitrary decision by the
compiler, I decided it shouldn't hold up this change. If it needs to
be fixed, we can find a workaround.

Author: Adam Lee, Jeff Davis
Discussion: https://postgr.es/m/e54bfec11c59689890f277722aaaabd05f78e22c.camel%40j-davis.com
2020-03-09 10:40:02 -07:00
Fujii Masao 17d3fcdc39 Fix bug that causes to report waiting in PS display twice, in hot standby.
Previously "waiting" could appear twice via PS in case of lock conflict
in hot standby mode. Specifically this issue happend when the delay
in WAL application determined by max_standby_archive_delay and
max_standby_streaming_delay had passed but it took more than 500 msec
to cancel all the conflicting transactions. Especially we can observe this
easily by setting those delay parameters to -1.

The cause of this issue was that WaitOnLock() and
ResolveRecoveryConflictWithVirtualXIDs() added "waiting" to
the process title in that case. This commit prevents
ResolveRecoveryConflictWithVirtualXIDs() from reporting waiting
in case of lock conflict, to fix the bug.

Back-patch to all back branches.

Author: Masahiko Sawada
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CA+fd4k4mXWTwfQLS3RPwGr4xnfAEs1ysFfgYHvmmoUgv6Zxvmg@mail.gmail.com
2020-03-10 00:14:43 +09:00
Peter Eisentraut 71d60e2aa0 Add tg_updatedcols to TriggerData
This allows a trigger function to determine for an UPDATE trigger
which columns were actually updated.  This allows some optimizations
in generic trigger functions such as lo_manage and
tsvector_update_trigger.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/11c5f156-67a9-0fb5-8200-2a8018eb2e0c@2ndquadrant.com
2020-03-09 09:34:55 +01:00
Peter Eisentraut 8f152b6c50 Code simplification
Initialize TriggerData to 0 for the whole struct together, instead of
each field separately.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/11c5f156-67a9-0fb5-8200-2a8018eb2e0c@2ndquadrant.com
2020-03-09 09:34:55 +01:00
Fujii Masao ef34ab42a8 Avoid assertion failure with targeted recovery in standby mode.
At the end of recovery, standby mode is turned off to re-fetch the last
valid record from archive or pg_wal. Previously, if recovery target was
reached and standby mode was turned off while the current WAL source
was stream, recovery could try to retrieve WAL file containing the last
valid record unexpectedly from stream even though not in standby mode.
This caused an assertion failure. That is, the assertion test confirms that
WAL file should not be retrieved from stream if standby mode is not true.

This commit moves back the current WAL source to archive if it's stream
even though not in standby mode, to avoid that assertion failure.

This issue doesn't cause the server to crash when built with assertion
disabled. In this case, the attempt to retrieve WAL file from stream not
in standby mode just fails. And then recovery tries to retrieve WAL file
from archive or pg_wal.

Back-patch to all supported branches.

Author: Kyotaro Horiguchi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20200227.124830.2197604521555566121.horikyota.ntt@gmail.com
2020-03-09 15:32:32 +09:00
Fujii Masao d9249441ef Mark ssl_passphrase_command as GUC_SUPERUSER_ONLY.
This commit changes the GUC ssl_passphrase_command so that
it's examinable by only superuser and a member of pg_read_all_settings.
Per discussion, we determined to do this because the parameter may
contain a sensitive informtaion like a passphrase itself.

Author: Insung Moon
Reviewed-by: Keisuke Kuroda
Discussion: https://postgr.es/m/CAEMmqBuHVGayc+QkYKgx3gWSdqwTAQGw+0DYn3WhcX-eNa2ntA@mail.gmail.com
2020-03-09 11:41:31 +09:00
Tom Lane ea7dace2aa Simplify/speed up assertion cross-check in ginCompressPostingList().
I noticed while testing some other stuff that the CHECK_ENCODING_ROUNDTRIP
logic in ginCompressPostingList could account for over 50% of the runtime
of an INSERT with a GIN index.  While that's not relevant to production
performance, it's still kind of annoying in a debug build.  Replacing
the loop around short memcmp's with one long memcmp works just as well
and is significantly faster, at least on my machine.
2020-03-07 13:31:17 -05:00
Peter Eisentraut 8d7def5c27 Fix typo 2020-03-07 08:51:59 +01:00
Tom Lane a6525588b7 Allow Unicode escapes in any server encoding, not only UTF-8.
SQL includes provisions for numeric Unicode escapes in string
literals and identifiers.  Previously we only accepted those
if they represented ASCII characters or the server encoding
was UTF-8, making the conversion to internal form trivial.
This patch adjusts things so that we'll call the appropriate
encoding conversion function in less-trivial cases, allowing
the escape sequence to be accepted so long as it corresponds
to some character available in the server encoding.

This also applies to processing of Unicode escapes in JSONB.
However, the old restriction still applies to client-side
JSON processing, since that hasn't got access to the server's
encoding conversion infrastructure.

This patch includes some lexer infrastructure that simplifies
throwing errors with error cursors pointing into the middle of
a string (or other complex token).  For the moment I only used
it for errors relating to Unicode escapes, but we might later
expand the usage to some other cases.

Patch by me, reviewed by John Naylor.

Discussion: https://postgr.es/m/2393.1578958316@sss.pgh.pa.us
2020-03-06 14:17:43 -05:00
Tom Lane fe30e7ebfa Allow ALTER TYPE to change some properties of a base type.
Specifically, this patch allows ALTER TYPE to:
* Change the default TOAST strategy for a toastable base type;
* Promote a non-toastable type to toastable;
* Add/remove binary I/O functions for a type;
* Add/remove typmod I/O functions for a type;
* Add/remove a custom ANALYZE statistics functions for a type.

The first of these can be done by the type's owner; all the others
require superuser privilege since misuse could cause problems.

The main motivation for this patch is to allow extensions to
upgrade the feature sets of their data types, so the set of
alterable properties is biased towards that use-case.  However
it's also true that changing some other properties would be
a lot harder, as they get baked into physical storage and/or
stored expressions that depend on the type.

Along the way, refactor GenerateTypeDependencies() to make it easier
to call, refactor DefineType's volatility checks so they can be shared
by AlterType, and teach typcache.c that it might have to reload data
from the type's pg_type row, a scenario it never handled before.
Also rearrange alter_type.sgml a bit for clarity (put the
composite-type operations together).

Tomas Vondra and Tom Lane

Discussion: https://postgr.es/m/20200228004440.b23ein4qvmxnlpht@development
2020-03-06 12:19:29 -05:00
Tom Lane bb03010b9f Remove the "opaque" pseudo-type and associated compatibility hacks.
A long time ago, it was necessary to declare datatype I/O functions,
triggers, and language handler support functions in a very type-unsafe
way involving a single pseudo-type "opaque".  We got rid of those
conventions in 7.3, but there was still support in various places to
automatically convert such functions to the modern declaration style,
to be able to transparently re-load dumps from pre-7.3 servers.
It seems unnecessary to continue to support that anymore, so take out
the hacks; whereupon the "opaque" pseudo-type itself is no longer
needed and can be dropped.

This is part of a group of patches removing various server-side kluges
for transparently upgrading pre-8.0 dump files.  Since we've had few
complaints about dropping pg_dump's support for dumping from pre-8.0
servers (commit 64f3524e2), it seems okay to now remove these kluges.

Discussion: https://postgr.es/m/4110.1583255415@sss.pgh.pa.us
2020-03-05 15:48:56 -05:00
Tom Lane 84eca14bc4 Remove ancient hacks to ignore certain opclass names in CREATE INDEX.
Twenty years ago, we removed certain operator classes in favor of
letting indexes over their data types be built with some other
binary-compatible, more standard opclass.  As a hack to allow existing
index definitions to be dumped and reloaded, we made CREATE INDEX ignore
the removed opclass names, so that such indexes would fall back to the
new default opclass for their data types.  This was never intended to
be a long-lived thing; it carries the obvious risk of breaking some
future developer's attempt to re-use those old opclass names.  Since
all of the cases in question are for opclasses that were removed
before PG 8.0, it seems okay to get rid of these hacks now.

This is part of a group of patches removing various server-side kluges
for transparently upgrading pre-8.0 dump files.  Since we've had few
complaints about dropping pg_dump's support for dumping from pre-8.0
servers (commit 64f3524e2), it seems okay to now remove these kluges.

Discussion: https://postgr.es/m/3685.1583422389@sss.pgh.pa.us
2020-03-05 15:36:06 -05:00
Tom Lane e58a599752 Remove ancient support for upgrading pre-7.3 foreign key constraints.
Before 7.3, foreign key constraints had no explicit catalog
representation, so that what pg_dump produced for them was (usually)
a set of three CREATE CONSTRAINT TRIGGER commands.  Commit a2899ebdc
and some follow-on fixes added an ugly hack in CreateTrigger() to
recognize that pattern and reconstruct the foreign key definition.
However, we've never had any test coverage for that code, so that it's
legitimate to wonder if it still works; and having to maintain it in
the face of upcoming trigger-related patches seems rather pointless.
Let's decree that its time has passed, and drop it.

This is part of a group of patches removing various server-side kluges
for transparently upgrading pre-8.0 dump files.  Since we've had few
complaints about dropping pg_dump's support for dumping from pre-8.0
servers (commit 64f3524e2), it seems okay to now remove these kluges.

Daniel Gustafsson

Discussion: https://postgr.es/m/805874E2-999C-4CDA-856F-1AFBD9DFE933@yesql.se
2020-03-05 15:25:45 -05:00
Alvaro Herrera a77315fdf2
Remove RangeIOData->typiofunc
We used to carry the I/O function OID in RangeIOData, but it's not used
for anything.  Since the struct is not exposed to the world anyway, we
can simplify it a bit.  Also, rename the FmgrInfo member to match
the accompanying 'typioparam' and put them in a more sensible order.

Reviewed by Tom Lane and Paul Jungwirth.

Discussion: https://postgr.es/m/20200304215711.GA8732@alvherre.pgsql
2020-03-05 11:35:02 -03:00
Michael Paquier fbcf087112 Fix more issues with dependency handling at swap phase of REINDEX CONCURRENTLY
When canceling a REINDEX CONCURRENTLY operation after swapping is done,
a drop of the parent table would leave behind old indexes.  This is a
consequence of 68ac9cf, which fixed the case of pg_depend bloat when
repeating REINDEX CONCURRENTLY on the same relation.

In order to take care of the problem without breaking the previous fix,
this uses a different strategy, possible even with the exiting set of
routines to handle dependency changes.  The dependencies of/on the
new index are additionally switched to the old one, allowing an old
invalid index remaining around because of a cancellation or a failure to
use the dependency links of the concurrently-created index.  This
ensures that dropping any objects the old invalid index depends on also
drops the old index automatically.

Reported-by: Julien Rouhaud
Author: Michael Paquier
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/20200227080735.l32fqcauy73lon7o@nol
Backpatch-through: 12
2020-03-05 12:50:15 +09:00
Jeff Davis c954d49046 Extend ExecBuildAggTrans() to support a NULL pointer check.
Optionally push a step to check for a NULL pointer to the pergroup
state.

This will be important for disk-based hash aggregation in combination
with grouping sets. When memory limits are reached, a given tuple may
find its per-group state for some grouping sets but not others. For
the former, it advances the per-group state as normal; for the latter,
it skips evaluation and the calling code will have to spill the tuple
and reprocess it in a later batch.

Add the NULL check as a separate expression step because in some
common cases it's not needed.

Discussion: https://postgr.es/m/20200221202212.ssb2qpmdgrnx52sj%40alap3.anarazel.de
2020-03-04 17:29:18 -08:00
Tom Lane 3ed2005ff5 Introduce macros for typalign and typstorage constants.
Our usual practice for "poor man's enum" catalog columns is to define
macros for the possible values and use those, not literal constants,
in C code.  But for some reason lost in the mists of time, this was
never done for typalign/attalign or typstorage/attstorage.  It's never
too late to make it better though, so let's do that.

The reason I got interested in this right now is the need to duplicate
some uses of the TYPSTORAGE constants in an upcoming ALTER TYPE patch.
But in general, this sort of change aids greppability and readability,
so it's a good idea even without any specific motivation.

I may have missed a few places that could be converted, and it's even
more likely that pending patches will re-introduce some hard-coded
references.  But that's not fatal --- there's no expectation that
we'd actually change any of these values.  We can clean up stragglers
over time.

Discussion: https://postgr.es/m/16457.1583189537@sss.pgh.pa.us
2020-03-04 10:34:25 -05:00
Tom Lane d677550493 Allow to_date/to_timestamp to recognize non-English month/day names.
to_char() has long allowed the TM (translation mode) prefix to
specify output of translated month or day names; but that prefix
had no effect in input format strings.  Now it does.  to_date()
and to_timestamp() will now recognize the same month or day names
that to_char() would output for the same format code.  Matching is
case-insensitive (per the active collation's notion of what that
means), just as it has always been for English month/day names
without the TM prefix.

(As per the discussion thread, there are lots of cases that this
feature will not handle, such as alternate day names.  But being
able to accept what to_char() will output seems useful enough.)

In passing, fix some shaky English and violations of message
style guidelines in jsonpath errors for the .datetime() method,
which depends on this code.

Juan José Santamaría Flecha, reviewed and modified by me,
with other commentary from Alvaro Herrera, Tomas Vondra,
Arthur Zakirov, Peter Eisentraut, Mark Dilger.

Discussion: https://postgr.es/m/CAC+AXB3u1jTngJcoC1nAHBf=M3v-jrEfo86UFtCqCjzbWS9QhA@mail.gmail.com
2020-03-03 11:06:47 -05:00
Peter Geoghegan 1e07f5e0a1 Remove overzealous _bt_split() assertions.
_bt_split() is passed NULL as its insertion scankey for internal page
splits.  Two recently added Assert() statements failed to consider this,
leading to a crash with pg_upgrade'd BREE_VERSION < 4 indexes.  Remove
the assertions.

The assertions in question were added by commit 0d861bbb, which added
nbtree deduplication.  It would be possible to fix the assertions
directly instead, but they weren't adding much anyway.
2020-03-02 21:40:11 -08:00
Michael Paquier 0b48f1335d Fix assertion failure with ALTER TABLE ATTACH PARTITION and indexes
Using ALTER TABLE ATTACH PARTITION causes an assertion failure when
attempting to work on a partitioned index, because partitioned indexes
cannot have partition bounds.

The grammar of ALTER TABLE ATTACH PARTITION requires partition bounds,
but not ALTER INDEX, so mixing ALTER TABLE with partitioned indexes is
confusing.  Hence, on HEAD, prevent ALTER TABLE to attach a partition if
the relation involved is a partitioned index.  On back-branches, as
applications may rely on the existing behavior, just remove the
culprit assertion.

Reported-by: Alexander Lakhin
Author: Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/16276-5cd1dcc8fb8be7b5@postgresql.org
Backpatch-through: 11
2020-03-03 13:55:41 +09:00
Fujii Masao e65497df8f Report progress of streaming base backup.
This commit adds pg_stat_progress_basebackup view that reports
the progress while an application like pg_basebackup is taking
a base backup. This uses the progress reporting infrastructure
added by c16dc1aca5, adding support for streaming base backup.

Bump catversion.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Amit Langote, Sergei Kornilov
Discussion: https://postgr.es/m/9ed8b801-8215-1f3d-62d7-65bff53f6e94@oss.nttdata.com
2020-03-03 12:03:43 +09:00
Michael Paquier d79fb88ac7 Preserve pg_index.indisclustered across REINDEX CONCURRENTLY
If the flag value is lost, a CLUSTER query following REINDEX
CONCURRENTLY could fail.  Non-concurrent REINDEX is already handling
this case consistently.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20200229024202.GH29456@telsasoft.com
Backpatch-through: 12
2020-03-03 10:12:28 +09:00
Alvaro Herrera 2f9661311b
Represent command completion tags as structs
The backend was using strings to represent command tags and doing string
comparisons in multiple places, but that's slow and unhelpful.  Create a
new command list with a supporting structure to use instead; this is
stored in a tag-list-file that can be tailored to specific purposes with
a caller-definable C macro, similar to what we do for WAL resource
managers.  The first first such uses are a new CommandTag enum and a
CommandTagBehavior struct.

Replace numerous occurrences of char *completionTag with a
QueryCompletion struct so that the code no longer stores information
about completed queries in a cstring.  Only at the last moment, in
EndCommand(), does this get converted to a string.

EventTriggerCacheItem no longer holds an array of palloc’d tag strings
in sorted order, but rather just a Bitmapset over the CommandTags.

Author: Mark Dilger, with unsolicited help from Álvaro Herrera
Reviewed-by: John Naylor, Tom Lane
Discussion: https://postgr.es/m/981A9DB4-3F0C-4DA5-88AD-CB9CFF4D6CAD@enterprisedb.com
2020-03-02 18:19:51 -03:00
Peter Geoghegan 77b88bd5dc Add assertions to _bt_update_posting().
Copy some assertions from _bt_form_posting() to its sibling function,
_bt_update_posting().

Discussion: https://postgr.es/m/CAH2-WzkPR8KMwkL0ap976kmXwBCeukTeHz6fB-U__wvuP1S9Zg@mail.gmail.com
2020-03-02 08:07:16 -08:00
Michael Paquier 12c5cad76f Handle logical decoding in multi-insert for catalog tuples
The code path for multi-insert decoding is not stressed yet for
catalogs (a future patch may introduce this capability), so no
back-patch is needed.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/9690D72F-5C4F-4016-9572-6D16684E1D87@yesql.se
2020-03-02 10:00:37 +09:00
Peter Geoghegan 84ec9b231a Remove dead code from _bt_update_posting().
Discussion: https://postgr.es/m/CAH2-WzmAufHiOku6AGiFD=81VQs5nYJ1L2YkhW1t+BH4CMsgRw@mail.gmail.com
2020-03-01 12:11:26 -08:00
Dean Rasheed 43a899f41f Fix corner-case loss of precision in numeric ln().
When deciding on the local rscale to use for the Taylor series
expansion, ln_var() neglected to account for the fact that the result
is subsequently multiplied by a factor of 2^(nsqrt+1), where nsqrt is
the number of square root operations performed in the range reduction
step, which can be as high as 22 for very large inputs. This could
result in a loss of precision, particularly when combined with large
rscale values, for which a large number of Taylor series terms is
required (up to around 400).

Fix by computing a few extra digits in the Taylor series, based on the
weight of the multiplicative factor log10(2^(nsqrt+1)). It remains to
be proven whether or not the other 8 extra digits used for the Taylor
series is appropriate, but this at least deals with the obvious
oversight of failing to account for the effects of the final
multiplication.

Per report from Justin AnyhowStep. Reviewed by Tom Lane.

Discussion: https://postgr.es/m/16280-279f299d9c06e56f@postgresql.org
2020-03-01 14:49:25 +00:00
Tom Lane 58c47ccfff Correctly re-use hash tables in buildSubPlanHash().
Commit 356687bd8 omitted to remove leftover code for destroying
a hashed subplan's hash tables, with the result that the tables
were always rebuilt not reused; this leads to severe memory
leakage if a hashed subplan is re-executed enough times.
Moreover, the code for reusing the hashnulls table had a typo
that would have made it do the wrong thing if it were reached.

Looking at the code coverage report shows severe under-coverage
of the potential callers of ResetTupleHashTable, so add some test
cases that exercise them.

Andreas Karlsson and Tom Lane, per reports from Ranier Vilela
and Justin Pryzby.

Backpatch to v11, as the faulty commit was.

Discussion: https://postgr.es/m/edb62547-c453-c35b-3ed6-a069e4d6b937@proxel.se
Discussion: https://postgr.es/m/CAEudQAo=DCebm1RXtig9OH+QivpS97sMkikt0A9qHmMUs+g6ZA@mail.gmail.com
Discussion: https://postgr.es/m/20200210032547.GA1412@telsasoft.com
2020-02-29 13:48:09 -05:00
Tom Lane 6afc8aefd3 Remove obsolete comment.
Noted while studying subplan hash issue.
2020-02-29 13:23:12 -05:00
Tom Lane 80d76be51c Avoid failure if autovacuum tries to access a just-dropped temp namespace.
Such an access became possible when commit 246a6c8f7 added more
aggressive cleanup of orphaned temp relations by autovacuum.
Since autovacuum's snapshot might be slightly stale, it could
attempt to access an already-dropped temp namespace, resulting in
an assertion failure or null-pointer dereference.  (In practice,
since we don't drop temp namespaces automatically but merely
recycle them, this situation could only arise if a superuser does
a manual drop of a temp namespace.  Still, that should be allowed.)

The core of the bug, IMO, is that isTempNamespaceInUse and its callers
failed to think hard about whether to treat "temp namespace isn't there"
differently from "temp namespace isn't in use".  In hopes of forestalling
future mistakes of the same ilk, replace that function with a new one
checkTempNamespaceStatus, which makes the same tests but returns a
three-way enum rather than just a bool.  isTempNamespaceInUse is gone
entirely in HEAD; but just in case some external code is relying on it,
keep it in the back branches, as a bug-compatible wrapper around the
new function.

Per report originally from Prabhat Kumar Sahu, investigated by Mahendra
Singh and Michael Paquier; the final form of the patch is my fault.
This replaces the failed fix attempt in a052f6cbb.

Backpatch as far as v11, as 246a6c8f7 was.

Discussion: https://postgr.es/m/CAKYtNAr9Zq=1-ww4etHo-VCC-k120YxZy5OS01VkaLPaDbv2tg@mail.gmail.com
2020-02-28 20:28:34 -05:00
Jeff Davis 32bb4535a0 Fix commit c11cb17d.
I neglected to update copyfuncs/outfuncs/readfuncs.

Discussion: https://postgr.es/m/12491.1582833409%40sss.pgh.pa.us
2020-02-28 09:35:11 -08:00
Alvaro Herrera db989184cd
Add comments on avoid reuse of parse-time snapshot
Apparently, reusing the parse-time query snapshot for later steps
(execution) is a frequently considered optimization ... but it doesn't
work, for reasons discovered in thread [1].  Adding some comments about
why it doesn't really work can relieve some future hackers from wasting
time reimplementing it again.

[1] https://postgr.es/m/flat/5075D8DF.6050500@fuzzy.cz

Author: Michail Nikolaev
Discussion: https://postgr.es/m/CANtu0ogp6cTvMJObXP8n=k+JtqxY1iT9UV5MbGCpjjPa5crCiw@mail.gmail.com
2020-02-28 13:25:26 -03:00
Peter Eisentraut 1933ae629e Add PostgreSQL home page to --help output
Per emerging standard in GNU programs and elsewhere.  Autoconf already
has support for specifying a home page, so we can just that.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
2020-02-28 13:12:21 +01:00
Peter Eisentraut 864934131e Refer to bug report address by symbol rather than hardcoding
Use the PACKAGE_BUGREPORT macro that is created by Autoconf for
referring to the bug reporting address rather than hardcoding it
everywhere.  This makes it easier to change the address and it reduces
translation work.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
2020-02-28 13:12:21 +01:00
Jeff Davis c11cb17dc5 Save calculated transitionSpace in Agg node.
This will be useful in the upcoming Hash Aggregation work to improve
estimates for hash table sizing.

Discussion: https://postgr.es/m/37091115219dd522fd9ed67333ee8ed1b7e09443.camel%40j-davis.com
2020-02-27 11:20:56 -08:00
Alvaro Herrera b9b408c487
Record parents of triggers
This let us get rid of a recently introduced ugly hack (commit
1fa846f1c9).

Author: Álvaro Herrera
Reviewed-by: Amit Langote, Tom Lane
Discussion: https://postgr.es/m/20200217215641.GA29784@alvherre.pgsql
2020-02-27 13:23:33 -03:00
Robert Haas 05d8449e73 Move src/backend/utils/hash/hashfn.c to src/common
This also involves renaming src/include/utils/hashutils.h, which
becomes src/include/common/hashfn.h. Perhaps an argument can be
made for keeping the hashutils.h name, but it seemed more
consistent to make it match the name of the file, and also more
descriptive of what is actually going on here.

Patch by me, reviewed by Suraj Kharage and Mark Dilger. Off-list
advice on how not to break the Windows build from Davinder Singh
and Amit Kapila.

Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-27 09:25:41 +05:30
Peter Geoghegan 2c0797da2c Silence another compiler warning in nbtinsert.c.
Per complaint from Álvaro Herrera.
2020-02-26 15:15:45 -08:00
Tom Lane a477bfc1df Suppress unnecessary RelabelType nodes in more cases.
eval_const_expressions sometimes produced RelabelType nodes that
were useless because they just relabeled an expression to the same
exposed type it already had.  This is worth avoiding because it can
cause two equivalent expressions to not be equal(), preventing
recognition of useful optimizations.  In the test case added here,
an unpatched planner fails to notice that the "sqli = constant" clause
renders a sort step unnecessary, because one code path produces an
extra RelabelType and another doesn't.

Fix by ensuring that eval_const_expressions_mutator's T_RelabelType
case will not add in an unnecessary RelabelType.  Also save some
code by sharing a subroutine with the effectively-equivalent cases
for CollateExpr and CoerceToDomain.  (CollateExpr had no bug, and
I think that the case couldn't arise with CoerceToDomain, but
it seems prudent to do the same check for all three cases.)

Back-patch to v12.  In principle this has been wrong all along,
but I haven't seen a case where it causes visible misbehavior
before v12, so refrain from changing stable branches unnecessarily.

Per investigation of a report from Eric Gillum.

Discussion: https://postgr.es/m/CAMmjdmvAZsUEskHYj=KT9sTukVVCiCSoe_PBKOXsncFeAUDPCQ@mail.gmail.com
2020-02-26 18:14:12 -05:00
Peter Geoghegan 2d8a6fad18 Silence compiler warning in nbtinsert.c.
Per buildfarm member longfin.
2020-02-26 13:17:36 -08:00
Peter Geoghegan 0d861bbb70 Add deduplication to nbtree.
Deduplication reduces the storage overhead of duplicates in indexes that
use the standard nbtree index access method.  The deduplication process
is applied lazily, after the point where opportunistic deletion of
LP_DEAD-marked index tuples occurs.  Deduplication is only applied at
the point where a leaf page split would otherwise be required.  New
posting list tuples are formed by merging together existing duplicate
tuples.  The physical representation of the items on an nbtree leaf page
is made more space efficient by deduplication, but the logical contents
of the page are not changed.  Even unique indexes make use of
deduplication as a way of controlling bloat from duplicates whose TIDs
point to different versions of the same logical table row.

The lazy approach taken by nbtree has significant advantages over a GIN
style eager approach.  Most individual inserts of index tuples have
exactly the same overhead as before.  The extra overhead of
deduplication is amortized across insertions, just like the overhead of
page splits.  The key space of indexes works in the same way as it has
since commit dd299df8 (the commit that made heap TID a tiebreaker
column).

Testing has shown that nbtree deduplication can generally make indexes
with about 10 or 15 tuples for each distinct key value about 2.5X - 4X
smaller, even with single column integer indexes (e.g., an index on a
referencing column that accompanies a foreign key).  The final size of
single column nbtree indexes comes close to the final size of a similar
contrib/btree_gin index, at least in cases where GIN's posting list
compression isn't very effective.  This can significantly improve
transaction throughput, and significantly reduce the cost of vacuuming
indexes.

A new index storage parameter (deduplicate_items) controls the use of
deduplication.  The default setting is 'on', so all new B-Tree indexes
automatically use deduplication where possible.  This decision will be
reviewed at the end of the Postgres 13 beta period.

There is a regression of approximately 2% of transaction throughput with
synthetic workloads that consist of append-only inserts into a table
with several non-unique indexes, where all indexes have few or no
repeated values.  The underlying issue is that cycles are wasted on
unsuccessful attempts at deduplicating items in non-unique indexes.
There doesn't seem to be a way around it short of disabling
deduplication entirely.  Note that deduplication of items in unique
indexes is fairly well targeted in general, which avoids the problem
there (we can use a special heuristic to trigger deduplication passes in
unique indexes, since we're specifically targeting "version bloat").

Bump XLOG_PAGE_MAGIC because xl_btree_vacuum changed.

No bump in BTREE_VERSION, since the representation of posting list
tuples works in a way that's backwards compatible with version 4 indexes
(i.e. indexes built on PostgreSQL 12).  However, users must still
REINDEX a pg_upgrade'd index to use deduplication, regardless of the
Postgres version they've upgraded from.  This is the only way to set the
new nbtree metapage flag indicating that deduplication is generally
safe.

Author: Anastasia Lubennikova, Peter Geoghegan
Reviewed-By: Peter Geoghegan, Heikki Linnakangas
Discussion:
    https://postgr.es/m/55E4051B.7020209@postgrespro.ru
    https://postgr.es/m/4ab6e2db-bcee-f4cf-0916-3a06e6ccbb55@postgrespro.ru
2020-02-26 13:05:30 -08:00
Peter Geoghegan 612a1ab767 Add equalimage B-Tree support functions.
Invent the concept of a B-Tree equalimage ("equality implies image
equality") support function, registered as support function 4.  This
indicates whether it is safe (or not safe) to apply optimizations that
assume that any two datums considered equal by an operator class's order
method must be interchangeable without any loss of semantic information.
This is static information about an operator class and a collation.

Register an equalimage routine for almost all of the existing B-Tree
opclasses.  We only need two trivial routines for all of the opclasses
that are included with the core distribution.  There is one routine for
opclasses that index non-collatable types (which returns 'true'
unconditionally), plus another routine for collatable types (which
returns 'true' when the collation is a deterministic collation).

This patch is infrastructure for an upcoming patch that adds B-Tree
deduplication.

Author: Peter Geoghegan, Anastasia Lubennikova
Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
2020-02-26 11:28:25 -08:00
Andres Freund 2742c45080 expression eval: Reduce number of steps for agg transition invocations.
Do so by combining the various steps that are part of aggregate
transition function invocation into one larger step. As some of the
current steps are only necessary for some aggregates, have one variant
of the aggregate transition step for each possible combination.

To avoid further manual copies of code in the different transition
step implementations, move most of the code into helper functions
marked as "always inline".

The benefit of this change is an increase in performance when
aggregating lots of rows. This comes in part due to the reduced number
of indirect jumps due to the reduced number of steps, and in part by
reducing redundant setup code across steps. This mainly benefits
interpreted execution, but the code generated by JIT is also improved
a bit.

As a nice side-effect it also ends up making the code a bit simpler.

A small additional optimization is removing the need to set
aggstate->curaggcontext before calling ExecAggInitGroup, choosing to
instead passign curaggcontext as an argument. It was, in contrast to
other aggregate related functions, only needed to fetch a memory
context to copy the transition value into.

Author: Andres Freund
Discussion:
   https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
   https://postgr.es/m/5c371df7cee903e8cd4c685f90c6c72086d3a2dc.camel@j-davis.com
2020-02-24 15:09:09 -08:00
Michael Paquier 7d672b76bf Issue properly WAL record for CID of first catalog tuple in multi-insert
Multi-insert for heap is not yet used actively for catalogs, but the
code to support this case is in place for logical decoding.  The
existing code forgot to issue a XLOG_HEAP2_NEW_CID record for the first
tuple inserted, leading to failures when attempting to use multiple
inserts for catalogs at decoding time.  This commit fixes the problem by
WAL-logging the needed CID.

This is not an active bug, so no back-patch is done.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/E0D4CC67-A1CF-4DF4-991D-B3AC2EB5FAE9@yesql.se
2020-02-25 07:55:22 +09:00
Tom Lane 3d475515a1 Account explicitly for long-lived FDs that are allocated outside fd.c.
The comments in fd.c have long claimed that all file allocations should
go through that module, but in reality that's not always practical.
fd.c doesn't supply APIs for invoking some FD-producing syscalls like
pipe() or epoll_create(); and the APIs it does supply for non-virtual
FDs are mostly insistent on releasing those FDs at transaction end;
and in some cases the actual open() call is in code that can't be made
to use fd.c, such as libpq.

This has led to a situation where, in a modern server, there are likely
to be seven or so long-lived FDs per backend process that are not known
to fd.c.  Since NUM_RESERVED_FDS is only 10, that meant we had *very*
few spare FDs if max_files_per_process is >= the system ulimit and
fd.c had opened all the files it thought it safely could.  The
contrib/postgres_fdw regression test, in particular, could easily be
made to fall over by running it under a restrictive ulimit.

To improve matters, invent functions Acquire/Reserve/ReleaseExternalFD
that allow outside callers to tell fd.c that they have or want to allocate
a FD that's not directly managed by fd.c.  Add calls to track all the
fixed FDs in a standard backend session, so that we are honestly
guaranteeing that NUM_RESERVED_FDS FDs remain unused below the EMFILE
limit in a backend's idle state.  The coding rules for these functions say
that there's no need to call them in code that just allocates one FD over
a fairly short interval; we can dip into NUM_RESERVED_FDS for such cases.
That means that there aren't all that many places where we need to worry.
But postgres_fdw and dblink must use this facility to account for
long-lived FDs consumed by libpq connections.  There may be other places
where it's worth doing such accounting, too, but this seems like enough
to solve the immediate problem.

Internally to fd.c, "external" FDs are limited to max_safe_fds/3 FDs.
(Callers can choose to ignore this limit, but of course it's unwise
to do so except for fixed file allocations.)  I also reduced the limit
on "allocated" files to max_safe_fds/3 FDs (it had been max_safe_fds/2).
Conceivably a smarter rule could be used here --- but in practice,
on reasonable systems, max_safe_fds should be large enough that this
isn't much of an issue, so KISS for now.  To avoid possible regression
in the number of external or allocated files that can be opened,
increase FD_MINFREE and the lower limit on max_files_per_process a
little bit; we now insist that the effective "ulimit -n" be at least 64.

This seems like pretty clearly a bug fix, but in view of the lack of
field complaints, I'll refrain from risking a back-patch.

Discussion: https://postgr.es/m/E1izCmM-0005pV-Co@gemulon.postgresql.org
2020-02-24 17:28:33 -05:00
Robert Haas a91e2fa941 Adapt hashfn.c and hashutils.h for frontend use.
hash_any() and its various variants are defined to return Datum,
which is a backend-only concept, but the underlying functions
actually want to return uint32 and uint64, and only return Datum
because it's convenient for callers who are using them to
implement a hash function for some SQL datatype.

However, changing these functions to return uint32 and uint64
seems like it might lead to programming errors or back-patching
difficulties, both because they are widely used and because
failure to use UInt{32,64}GetDatum() might not provoke a
compilation error. Instead, rename the existing functions as
well as changing the return type, and add static inline wrappers
for those callers that need the previous behavior.

Although this commit adapts hashutils.h and hashfn.c so that they
can be compiled as frontend code, it does not actually do
anything that would cause them to be so compiled. That is left
for another commit.

Patch by me, reviewed by Suraj Kharage and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-24 17:27:15 +05:30
Robert Haas 9341c783cc Put all the prototypes for hashfn.c into the same header file.
Previously, some of the prototypes for functions in hashfn.c were
in utils/hashutils.h and others were in utils/hsearch.h, but that
is confusing and has no particular benefit.

Patch by me, reviewed by Suraj Kharage and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-24 17:22:45 +05:30
Robert Haas 07b95c3d83 Move bitmap_hash and bitmap_match to bitmapset.c.
The closely-related function bms_hash_value is already defined in that
file, and this change means that hashfn.c no longer needs to depend on
nodes/bitmapset.h. That gets us closer to allowing use of the hash
functions in hashfn.c in frontend code.

Patch by me, reviewed by Suraj Kharage and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-24 17:17:43 +05:30
Michael Paquier bf883b211e Add prefix checks in exclude lists for pg_rewind, pg_checksums and base backups
An instance of PostgreSQL crashing with a bad timing could leave behind
temporary pg_internal.init files, potentially causing failures when
verifying checksums.  As the same exclusion lists are used between
pg_rewind, pg_checksums and basebackup.c, all those tools are extended
with prefix checks to keep everything in sync, with dedicated checks
added for pg_internal.init.

Backpatch down to 11, where pg_checksums (pg_verify_checksums in 11) and
checksum verification for base backups have been introduced.

Reported-by: Michael Banck
Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/62031974fd8e941dd8351fbc8c7eff60d59c5338.camel@credativ.de
Backpatch-through: 11
2020-02-24 18:13:25 +09:00
Peter Eisentraut 79c2385915 Factor out InitControlFile() from BootStrapXLOG()
Right now this only makes BootStrapXLOG() a bit more manageable, but
in the future there may be external callers.

Discussion: https://www.postgresql.org/message-id/e8f86ba5-48f1-a80a-7f1d-b76bcb9c5c47@2ndquadrant.com
2020-02-22 12:09:27 +01:00
Peter Eisentraut 9745f93afc Reformat code comment
Discussion: https://www.postgresql.org/message-id/e8f86ba5-48f1-a80a-7f1d-b76bcb9c5c47@2ndquadrant.com
2020-02-22 12:09:27 +01:00
Tom Lane 97cf1fa4ed Assume that we have <wchar.h>.
Windows has this, and so do all other live platforms according to the
buildfarm; it's been required by POSIX since SUSv2.  So remove the
configure probe and tests of HAVE_WCHAR_H.

This is part of a series of commits to get rid of no-longer-relevant
configure checks and dead src/port/ code.  I'm committing them separately
to make it easier to back out individual changes if they prove less
portable than I expect.

Discussion: https://postgr.es/m/15379.1582221614@sss.pgh.pa.us
2020-02-21 14:30:47 -05:00
Tom Lane 481c8e9232 Assume that we have utime() and <utime.h>.
These are required by POSIX since SUSv2, and no live platforms fail
to provide them.  On Windows, utime() exists and we bring our own
<utime.h>, so we're good there too.  So remove the configure probes
and ad-hoc substitute code.  We don't need to check for utimes()
anymore either, since that was only used as a substitute.

In passing, make the Windows build include <sys/utime.h> only where
we need it, not everywhere.

This is part of a series of commits to get rid of no-longer-relevant
configure checks and dead src/port/ code.  I'm committing them separately
to make it easier to back out individual changes if they prove less
portable than I expect.

Discussion: https://postgr.es/m/15379.1582221614@sss.pgh.pa.us
2020-02-21 14:30:47 -05:00
Tom Lane abe41f453a Assume that we have cbrt().
Windows has this, and so do all other live platforms according to the
buildfarm, so remove the configure probe and float.c's substitute code.

This is part of a series of commits to get rid of no-longer-relevant
configure checks and dead src/port/ code.  I'm committing them separately
to make it easier to back out individual changes if they prove less
portable than I expect.

Discussion: https://postgr.es/m/15379.1582221614@sss.pgh.pa.us
2020-02-21 14:30:47 -05:00
Jeff Davis b7fabe80df Fixup for nodeAgg.c refactor.
Commit 5b618e1f made an unintended behavior change.
2020-02-21 09:34:20 -08:00
Etsuro Fujita 032f9ae012 Avoid redundant checks in partition_bounds_copy().
Previously, partition_bounds_copy() checked whether the strategy for the
given partition bounds was hash or not, and then determined the number of
elements in the datums in the datums array for the partition bounds, on
each iteration of the loop for copying the datums array, but there is no
need to do that.  Perform the checks only once before the loop iteration.

Author: Etsuro Fujita
Reported-by: Amit Langote and Julien Rouhaud
Discussion: https://postgr.es/m/CAPmGK14Rvxrm8DHWvCjdoks6nwZuHBPvMnWZ6rkEx2KhFeEoPQ@mail.gmail.com
2020-02-21 20:00:45 +09:00
Etsuro Fujita 53b01acd46 Remove extra word from comment. 2020-02-20 19:15:00 +09:00
Tom Lane 70a7732007 Remove support for upgrading extensions from "unpackaged" state.
Andres Freund pointed out that allowing non-superusers to run
"CREATE EXTENSION ... FROM unpackaged" has security risks, since
the unpackaged-to-1.0 scripts don't try to verify that the existing
objects they're modifying are what they expect.  Just attaching such
objects to an extension doesn't seem too dangerous, but some of them
do more than that.

We could have resolved this, perhaps, by still requiring superuser
privilege to use the FROM option.  However, it's fair to ask just what
we're accomplishing by continuing to lug the unpackaged-to-1.0 scripts
forward.  None of them have received any real testing since 9.1 days,
so they may not even work anymore (even assuming that one could still
load the previous "loose" object definitions into a v13 database).
And an installation that's trying to go from pre-9.1 to v13 or later
in one jump is going to have worse compatibility problems than whether
there's a trivial way to convert their contrib modules into extension
style.

Hence, let's just drop both those scripts and the core-code support
for "CREATE EXTENSION ... FROM".

Discussion: https://postgr.es/m/20200213233015.r6rnubcvl4egdh5r@alap3.anarazel.de
2020-02-19 16:59:14 -05:00
Jeff Davis 5b618e1f48 Minor refactor of nodeAgg.c.
* Separate calculation of hash value from the lookup.
  * Split build_hash_table() into two functions.
  * Change lookup_hash_entry() to return AggStatePerGroup. That's all
    the caller needed, anyway.

These changes are to support the upcoming Disk-based Hash Aggregation
work.

Discussion: https://postgr.es/m/31f5ab871a3ad5a1a91a7a797651f20e77ac7ce3.camel%40j-davis.com
2020-02-19 10:39:11 -08:00
Jeff Davis 8021985d79 logtape.c: allocate read buffer even for an empty tape.
Prior to this commit, the read buffer was allocated at the time the tape
was rewound; but as an optimization, would not be allocated at all if
the tape was empty.

That optimization meant that it was valid to have a rewound tape with
the buffer set to NULL, but only if a number of conditions were met
and only if the API was used properly. After 7fdd919a refactored the
code to support lazily-allocating the buffer, Coverity started
complaining.

The optimization for empty tapes doesn't seem important, so just
allocate the buffer whether the tape has any data or not.

Discussion: https://postgr.es/m/20351.1581868306%40sss.pgh.pa.us
2020-02-19 10:04:17 -08:00
Fujii Masao 0074919794 Fix mesurement of elapsed time during truncating heap in VACUUM.
VACUUM may truncate heap in several batches. The activity report
is logged for each batch, and contains the number of pages in the table
before and after the truncation, and also the elapsed time during
the truncation. Previously the elapsed time reported in each batch was
the total elapsed time since starting the truncation until finishing
each batch. For example, if the truncation was processed dividing into
three batches, the second batch reported the accumulated time elapsed
during both first and second batches. This is strange and confusing
because the number of pages in the table reported together is not
total. Instead, each batch should report the time elapsed during
only that batch.

The cause of this issue was that the resource usage snapshot was
initialized only at the beginning of the truncation and was never
reset later. This commit fixes the issue by changing VACUUM so that
the resource usage snapshot is reset at each batch.

Back-patch to all supported branches.

Reported-by: Tatsuhito Kasahara
Author: Tatsuhito Kasahara
Reviewed-by: Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/CAP0=ZVJsf=NvQuy+QXQZ7B=ZVLoDV_JzsVC1FRsF1G18i3zMGg@mail.gmail.com
2020-02-19 20:37:26 +09:00
Michael Paquier e2e02191e2 Clean up some code, comments and docs referring to Windows 2000 and older
This fixes and updates a couple of comments related to outdated Windows
versions.  Particularly, src/common/exec.c had a fallback implementation
to read a file's line from a pipe because stdin/stdout/stderr does not
exist in Windows 2000 that is removed to simplify src/common/ as there
are unlikely versions of Postgres running on such platforms.

Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi, Juan José Santamaría Flecha
Discussion: https://postgr.es/m/20191219021526.GC4202@paquier.xyz
2020-02-19 13:20:33 +09:00
Amit Kapila e3ff789acf Stop demanding that top xact must be seen before subxact in decoding.
Manifested as

ERROR:  subtransaction logged without previous top-level txn record

this check forbids legit behaviours like
 - First xl_xact_assignment record is beyond reading, i.e. earlier
   restart_lsn.
 - After restart_lsn there is some change of a subxact.
 - After that, there is second xl_xact_assignment (for another subxact)
   revealing the relationship between top and first subxact.

Such a transaction won't be streamed anyway because we hadn't seen it in
full.  Saying for sure whether xact of some record encountered after
the snapshot was deserialized can be streamed or not requires to know
whether it wrote something before deserialization point --if yes, it
hasn't been seen in full and can't be decoded. Snapshot doesn't have such
info, so there is no easy way to relax the check.

Reported-by: Hsu, John
Diagnosed-by: Arseny Sher
Author: Arseny Sher, Amit Kapila
Reviewed-by: Amit Kapila, Dilip Kumar
Backpatch-through: 9.5
Discussion: https://postgr.es/m/AB5978B2-1772-4FEE-A245-74C91704ECB0@amazon.com
2020-02-19 08:15:49 +05:30
Peter Geoghegan fe9b92854e Remove obsolete _bt_compare() comment.
btbuild() has nothing to say about how NULL values compare in nbtree.
Besides, there are _bt_compare() header comments that describe how NULL
values are handled.
2020-02-18 16:07:16 -08:00
Fujii Masao b7e51b350c Make inherited LOCK TABLE perform access permission checks on parent table only.
Previously, LOCK TABLE command through a parent table checked
the permissions on not only the parent table but also the children
tables inherited from it. This was a bug and inherited queries should
perform access permission checks on the parent table only. This
commit fixes LOCK TABLE so that it does not check the permission
on children tables.

This patch is applied only in the master branch. We decided not to
back-patch because it's not hard to imagine that there are some
applications expecting the old behavior and the change breaks their
security.

Author: Amit Langote
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CAHGQGwE+GauyG7POtRfRwwthAGwTjPQYdFHR97+LzA4RHGnJxA@mail.gmail.com
2020-02-18 13:13:15 +09:00
Michael Paquier 958f9fb98d Remove duplicated words in comments
Author: Daniel Gustafsson
Reviewed-by: Vik Fearing
Discussion: https://postgr.es/m/EBC3BFEB-664C-4063-81ED-29F1227DB012@yesql.se
2020-02-18 12:20:55 +09:00
Peter Eisentraut c6679e4fca Optimize update of tables with generated columns
When updating a table row with generated columns, only recompute those
generated columns whose base columns have changed in this update and
keep the rest unchanged.  This can result in a significant performance
benefit.  The required information was already kept in
RangeTblEntry.extraUpdatedCols; we just have to make use of it.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b05e781a-fa16-6b52-6738-761181204567@2ndquadrant.com
2020-02-17 15:20:58 +01:00
Peter Eisentraut ad3ae64770 Fill in extraUpdatedCols in logical replication
The extraUpdatedCols field of the target RTE records which generated
columns are affected by an update.  This is used in a variety of
places, including per-column triggers and foreign data wrappers.  When
an update was initiated by a logical replication subscription, this
field was not filled in, so such an update would not affect generated
columns in a way that is consistent with normal updates.  To fix,
factor out some code from analyze.c to fill in extraUpdatedCols in the
logical replication worker as well.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b05e781a-fa16-6b52-6738-761181204567@2ndquadrant.com
2020-02-17 15:20:57 +01:00
Fujii Masao f4ae722141 Add description about GSSOpenServer wait event into document.
This commit also updates wait event enum into alphabetical order.
Previously the enum entry for GSSOpenServer was added out-of-order.

Back-patch to v12 where commit b0b39f72b9 introduced
GSSOpenServer wait event. In v12, the commit doesn't include
the update of wait event enum, not to break ABI.

Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/949931aa-4ed4-d867-a7b5-de9c02b2292b@oss.nttdata.com
2020-02-17 16:16:08 +09:00
Tom Lane faade5d4c6 Update obsolete comment.
Noted by Justin Pryzby, though I chose to just rip out the stale text,
as it's in no way relevant to this particular function.

Discussion: https://postgr.es/m/20200212182337.GZ1412@telsasoft.com
2020-02-15 15:22:40 -05:00
Tom Lane 9d1ec5a8e1 Clarify coding in Catalog::AddDefaultValues.
Make it a bit shorter and better-commented; no functional change.

Alvaro Herrera and Tom Lane

Discussion: https://postgr.es/m/20200212182337.GZ1412@telsasoft.com
2020-02-15 15:13:44 -05:00
Tom Lane 86ff085e83 Don't require pg_class.dat to contain correct relnatts values.
Practically everybody who's ever added a column to one of the bootstrap
catalogs has been burnt by the need to update the relnatts field in the
initial pg_class data to match.  Now that we use Perl scripts to
generate postgres.bki, we can have the machines take care of that,
by filling the field during genbki.pl.

While at it, use the BKI_DEFAULTS mechanism to eliminate repetitive
specifications of other column values in pg_class.dat, too.  They
weren't particularly a maintenance problem, but this way is prettier
(certainly the spotty previous usage of BKI_DEFAULTS wasn't pretty).

No catversion bump needed, since this doesn't actually change the
contents of postgres.bki.

Per gripe from Justin Pryzby, though this is quite different from
his originally proposed solution.

Amit Langote, John Naylor, Tom Lane

Discussion: https://postgr.es/m/20200212182337.GZ1412@telsasoft.com
2020-02-15 14:57:27 -05:00
Jeff Davis 7fdd919ae7 Logical Tape Set: lazily allocate read buffer.
The write buffer was already lazily-allocated, so this is more
symmetric. It also means that a freshly-rewound tape (whether for
reading or writing) is not consuming memory for the buffer.

Discussion: https://postgr.es/m/97c46a59c27f3c38e486ca170fcbc618d97ab049.camel%40j-davis.com
2020-02-13 10:44:25 -08:00
Tom Lane 607f8ce74d Avoid a performance regression in float overflow/underflow detection.
Commit 6bf0bc842 replaced float.c's CHECKFLOATVAL() macro with static
inline subroutines, but that wasn't too well thought out.  In the original
coding, the unlikely condition (isinf(result) or result == 0) was checked
first, and the inf_is_valid or zero_is_valid condition only afterwards.
The inline-subroutine coding caused that to be swapped around, which is
pretty horrid for performance because (a) in common cases the is_valid
condition is twice as expensive to evaluate (e.g., requiring two isinf()
calls not one) and (b) in common cases the is_valid condition is false,
requiring us to perform the unlikely-condition check anyway.  Net result
is that one isinf() call becomes two or three, resulting in visible
performance loss as reported by Keisuke Kuroda.

The original fix proposal was to revert the replacement of the macro,
but on second thought, that macro was just a bad idea from the beginning:
if anything it's a net negative for readability of the code.  So instead,
let's just open-code all the overflow/underflow tests, being careful to
test the unlikely condition first (and mark it unlikely() to help the
compiler get the point).

Also, rather than having N copies of the actual ereport() calls, collapse
those into out-of-line error subroutines to save some code space.  This
does mean that the error file/line numbers won't be very helpful for
figuring out where the issue really is --- but we'd already burned that
bridge by putting the ereports into static inlines.

In HEAD, check_float[48]_val() are gone altogether.  In v12, leave them
present in float.h but unused in the core code, just in case some
extension is depending on them.

Emre Hasegeli, with some kibitzing from me and Andres Freund

Discussion: https://postgr.es/m/CANDwggLe1Gc1OrRqvPfGE=kM9K0FSfia0hbeFCEmwabhLz95AA@mail.gmail.com
2020-02-13 13:37:43 -05:00
Tom Lane 0973f5602c Remove long-dead comments.
These should've been dropped by a8bb8eb58, but evidently were
missed.  Not important enough to back-patch.
2020-02-12 14:33:49 -05:00
Thomas Munro 701a51fd4e Use pg_pwrite() in more places.
This removes some lseek() system calls.

Author: Thomas Munro
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CA%2BhUKGJ%2BoHhnvqjn3%3DHro7xu-YDR8FPr0FL6LF35kHRX%3D_bUzg%40mail.gmail.com
2020-02-11 17:50:22 +13:00
Jeff Davis 11de6c903d Change signature of TupleHashTableHash().
Commit 4eaea3db introduced TupleHashTableHash(), but the signature
didn't match the other exposed functions. Separate it into internal
and external versions. The external version hides the details behind
an API more consistent with the other external functions, and the
internal version is still suitable for simplehash.
2020-02-10 10:20:10 -08:00
Alvaro Herrera b048f558dd Fix priv checks for ALTER <object> DEPENDS ON EXTENSION
Marking an object as dependant on an extension did not have any
privilege check whatsoever; this allowed any user to mark objects as
droppable by anyone able to DROP EXTENSION, which could be used to cause
system-wide havoc.  Disallow by checking that the calling user owns the
mentioned object.

(No constraints are placed on the extension.)

Security: CVE-2020-1720
Reported-by: Tom Lane
Discussion: 31605.1566429043@sss.pgh.pa.us
2020-02-10 11:47:09 -03:00
Amit Kapila 3dfba9fdf5 Fix typos.
Reported-by: Justin Pryzby
Author: Justin Pryzby
Discussion: https://postgr.es/m/20200206021432.GA24549@telsasoft.com
2020-02-10 09:31:18 +05:30
Tom Lane 4093ff5737 Store the deletion horizon XID for a deleted GIN page on the right page.
Commit b10714080 moved the GinPageSetDeleteXid() call to a spot where
the "page" variable was pointing to the wrong page, causing the XID
to be inserted on a page that's not being deleted, thus allowing later
GinPageIsRecyclable tests to recycle the deleted page too soon.

It might be a good idea to stop using the single "page" variable for
multiple purposes in this function.  But for the moment I just moved
the GinPageSetDeleteXid() call down beside the GinPageSetDeleted()
call, which seems like a more logical place for it anyway.

Back-patch to v11, as the faulty patch was.  (Fortunately, the bug
hasn't made it into any release yet.)

Discussion: https://postgr.es/m/21620.1581098806@sss.pgh.pa.us
2020-02-09 12:02:57 -05:00
Alvaro Herrera 55173d2e66 Fix failure to create FKs correctly in partitions
On a multi-level partioned table, when adding a partition not directly
connected to the root table, foreign key constraints referencing the
root were not cloned to the new partition, leading to the FK being
possibly inadvertently violated later on.

This was caused by fuzzy thinking in CloneFkReferenced (commit
f56f8f8da6): it was skipping constraints marked as having parents on
the theory that cloning those would create duplicates; but that's only
correct for the top level of the partitioning hierarchy.  For levels
below that one, such constraints must still be considered and only
skipped if later on we see that we'd create duplicates.  Apparently, I
(Álvaro) wrote the comments right but the code implemented something
slightly different.

Author: Jehan-Guillaume de Rorthais
Discussion: https://postgr.es/m/20200206004948.238352db@firost
2020-02-07 18:27:18 -03:00
Alvaro Herrera 9710d3d4a8 Fix TRUNCATE .. CASCADE on partitions
When running TRUNCATE CASCADE on a child of a partitioned table
referenced by another partitioned table, the truncate was not applied to
partitions of the referencing table; this could leave rows violating the
constraint in the referencing partitioned table.  Repair by walking the
pg_constraint chain all the way up to the topmost referencing table.

Note: any partitioned tables containing FKs that reference other
partitioned tables should be checked for possible violating rows, if
TRUNCATE has occurred in partitions of the referenced table.

Reported-by: Christophe Courtois
Author: Jehan-Guillaume de Rorthais
Discussion: https://postgr.es/m/20200204183906.115f693e@firost
2020-02-07 17:09:36 -03:00
Fujii Masao cb5b28613d Fix bug in Tid scan.
Commit 147e3722f7 changed Tid scan so that it calls table_beginscan()
and uses the scan option for seq scan. This change caused two issues.

(1) The change caused Tid scan to take a predicate lock on the entire
       relation in serializable transaction even when relation-level
       lock is not necessary. This could lead to an unexpected
       serialization error.

(2) The change caused Tid scan to increment the number of seq_scan
       in pg_stat_*_tables views even though it's not seq scan. This
       could confuse the users.

This commit adds the scan option for Tid scan and makes Tid scan
use it, to avoid those issues.

Back-patch to v12, where the bug was introduced.

Author: Tatsuhito Kasahara
Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/CAP0=ZVKy+gTbFmB6X_UW0pP3WaeJ-fkUWHoD-pExS=at3CY76g@mail.gmail.com
2020-02-07 22:06:31 +09:00
Andres Freund b059d2f456 jit: Reference expression step functions via llvmjit_types.
The main benefit of doing so is that this allows llvm to ensure that
types match - previously that'd only be detected by a crash within the
called function. There were a number of cases where we passed a
superfluous parameter...

To avoid needing to add all the functions to llvmjit.{c,h}, instead
get them from the llvm module for llvmjit_types.c. Also use that for
the functions from llvmjit_types already in llvmjit.h.

Author: Soumyadeep Chakraborty and Andres Freund
Discussion: https://postgr.es/m/CADwEdooww3wZv-sXSfatzFRwMuwa186LyTwkBfwEW6NjtooBPA@mail.gmail.com
2020-02-06 22:29:14 -08:00
Jeff Davis 4eaea3db15 Introduce TupleHashTableHash() and LookupTupleHashEntryHash().
Expose two new entry points: one for only calculating the hash value
of a tuple, and another for looking up a hash entry when the hash
value is already known. This will be useful for disk-based Hash
Aggregation to avoid recomputing the hash value for the same tuple
after saving and restoring it from disk.

Discussion: https://postgr.es/m/37091115219dd522fd9ed67333ee8ed1b7e09443.camel%40j-davis.com
2020-02-06 20:34:01 -08:00
Andres Freund e6f86f8dd9 jit: Remove redundancies in expression evaluation code generation.
This merges the code emission for a number of opcodes by handling the
behavioural difference more locally. This reduces code, and also
improves the generated code a bit in some cases, by removing redundant
constants.

Author: Andres Freund
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
2020-02-06 20:01:23 -08:00
Andres Freund 8c2769405f jit: Reference functions by name in IOCOERCE steps.
Previously we used constant function pointer addresses, which prevents
inlining and other related optimizations.

Author: Andres Freund
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
2020-02-06 19:54:43 -08:00
Andres Freund 1fdb7f9789 expression eval: Don't redundantly keep track of AggState.
It's already tracked via ExprState->parent, so we don't need to also
include it in ExprEvalStep. When that code originally was written
ExprState->parent didn't exist, but it since has been introduced in
6719b238e8.

Author: Andres Freund
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
2020-02-06 19:54:43 -08:00
Andres Freund 1ec7679f1b expression eval, jit: Minor code cleanups.
This mostly consists of using C99 style for loops, moving variables
into narrower scopes, and a smattering of other minor improvements.
Done separately to make it easier to review patches with actual
functional changes.

Author: Andres Freund
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
2020-02-06 19:54:15 -08:00
Michael Paquier 5ac4e9a12c Fix typo in proc.c
Author: Julien Rouhaud
Discussion: https://postgr.es/m/20200206082333.GA95343@nol
2020-02-07 12:41:10 +09:00
Michael Paquier 414c2fd1e1 Revert "Add GUC checks for ssl_min_protocol_version and ssl_max_protocol_version"
This reverts commit 41aadee, as the GUC checks could run on older values
with the new values used, and result in incorrect errors if both
parameters are changed at the same time.

Per complaint from Tom Lane.

Discussion: https://postgr.es/m/27574.1581015893@sss.pgh.pa.us
Backpatch-through: 12
2020-02-07 08:10:40 +09:00
Peter Eisentraut fc7a5e9eaa Ensure relcache consistency around generated columns
In certain transient states, it's possible that a table has attributes
with attgenerated set but no default expressions in pg_attrdef yet.
In that case, the old code path would not set
relation->rd_att->constr->has_generated_stored, unless
relation->rd_att->constr was also populated for some other reason.
There was probably no practical impact, but it's better to keep this
consistent.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/20200115181105.ad6ab6dlgyww3lb6%40alap3.anarazel.de
2020-02-06 21:25:01 +01:00
Jeff Davis 7d4395d0a1 Refactor hash_agg_entry_size().
Consolidate the calculations for hash table size estimation. This will
help with upcoming Hash Aggregation work that will add additional call
sites.
2020-02-06 11:49:56 -08:00
Jeff Davis c02fdc9223 Logical Tape Set: use min heap for freelist.
Previously, the freelist of blocks was tracked as an
occasionally-sorted array. A min heap is more resilient to larger
freelists or more frequent changes between reading and writing.

Discussion: https://postgr.es/m/97c46a59c27f3c38e486ca170fcbc618d97ab049.camel%40j-davis.com
2020-02-06 10:09:45 -08:00
Amit Kapila cac8ce4a73 Fix typo.
Reported-by: Amit Langote
Author: Amit Langote
Backpatch-through: 9.6, where it was introduced
Discussion: https://postgr.es/m/CA+HiwqFNADeukaaGRmTqANbed9Fd81gLi08AWe_F86_942Gspw@mail.gmail.com
2020-02-06 15:57:02 +05:30
Fujii Masao 3ccc66dac6 Fix bug in LWLock statistics mechanism.
Previously PostgreSQL built with -DLWLOCK_STATS could report
more than one LWLock statistics entries for the same backend
process and the same LWLock. This is strange and only one
statistics should be output in that case, instead.

The cause of this issue is that the key variable used for
LWLock stats hash table was not fully initialized. The key
consists of two fields and they were initialized. But
the following 4 bytes allocated in the key variable for
the alignment was not initialized. So even if the same key
was specified, hash_search(HASH_ENTER) could not find
the existing entry for that key and created new one.

This commit fixes this issue by initializing the key
variable with zero. As the side effect of this commit,
the volume of LWLock statistics output would be reduced
very much.

Back-patch to v10, where commit 3761fe3c20 introduced the issue.

Author: Fujii Masao
Reviewed-by: Julien Rouhaud, Kyotaro Horiguchi
Discussion: https://postgr.es/m/26359edb-798a-568f-d93a-6aafac49752d@oss.nttdata.com
2020-02-06 14:43:21 +09:00
Michael Paquier b025f32e0b Add leader_pid to pg_stat_activity
This new field tracks the PID of the group leader used with parallel
query.  For parallel workers and the leader, the value is set to the
PID of the group leader.  So, for the group leader, the value is the
same as its own PID.  Note that this reflects what PGPROC stores in
shared memory, so as leader_pid is NULL if a backend has never been
involved in parallel query.  If the backend is using parallel query or
has used it at least once, the value is set until the backend exits.

Author: Julien Rouhaud
Reviewed-by: Sergei Kornilov, Guillaume Lelarge, Michael Paquier, Tomas
Vondra
Discussion: https://postgr.es/m/CAOBaU_Yy5bt0vTPZ2_LUM6cUcGeqmYNoJ8-Rgto+c2+w3defYA@mail.gmail.com
2020-02-06 09:18:06 +09:00
Andrew Gierth bf6cc19e34 Force tuple conversion when the source has missing attributes.
Tuple conversion incorrectly concluded that no conversion was needed
as long as all the attributes lined up. But if the source tuple has a
missing attribute (from addition of a column with default), then the
destination tupdesc might not reflect the same default. The typical
symptom was that the affected columns would be unexpectedly NULL.

Repair by always forcing conversion if the source has missing
attributes, which will be filled in by the deform operation. (In
theory we could optimize for when the destination has the same
default, but that seemed overkill.)

Backpatch to 11 where missing attributes were added.

Per bug #16242.

Vik Fearing (discovery, code, testing) and me (analysis, testcase).

Discussion: https://postgr.es/m/16242-d1c9fca28445966b@postgresql.org
2020-02-05 20:21:20 +00:00
Alvaro Herrera 15d13e8291 Make vacuum buffer counters 64 bits wide
Using 32 bit counters means they can now realistically wrap around when
vacuuming extremely large tables.  Because they're signed integers,
stats printed by vacuum look very odd when they do.

We'd love to backpatch this, but refrain because the variables are
exported and could cause third-party code to break.

Reviewed-by: Julien Rouhaud, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/20200131205926.GA16367@alvherre.pgsql
2020-02-05 16:59:29 -03:00
Thomas Munro 815c2f0972 Add kqueue(2) support to the WaitEventSet API.
Use kevent(2) to wait for events on the BSD family of operating
systems and macOS.  This is similar to the epoll(2) support added
for Linux by commit 98a64d0bd.

Author: Thomas Munro
Reviewed-by: Andres Freund, Marko Tiikkaja, Tom Lane
Tested-by: Mateusz Guzik, Matteo Beccati, Keith Fiske, Heikki Linnakangas, Michael Paquier, Peter Eisentraut, Rui DeSousa, Tom Lane, Mark Wong
Discussion: https://postgr.es/m/CAEepm%3D37oF84-iXDTQ9MrGjENwVGds%2B5zTr38ca73kWR7ez_tA%40mail.gmail.com
2020-02-05 17:35:57 +13:00
Thomas Munro d9fe702a2c Handle lack of DSM slots in parallel btree build, take 2.
Commit 74618e77 added a new check intended to fix a bug, but put
it in the wrong place so that parallel btree build was always
disabled.  Do the check after we've actually tried to create
a DSM segment.  Back-patch to 11, like the earlier commit.

Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-WzmDABkJzrNnvf%2BOULK-_A_j9gkYg_Dz-H62jzNv4eKQTw%40mail.gmail.com
2020-02-05 12:27:00 +13:00
Tom Lane 7d91b604d9 Fix handling of "Subplans Removed" field in EXPLAIN output.
Commit 499be013d added this field in a rather poorly-thought-through
manner, with the result being that rather than being a field of the
Append or MergeAppend plan node as intended (and as it seems to be,
in text format), it was actually an element of the "Plans" subgroup.
At least in JSON format, that's flat out invalid syntax, because
"Plans" is an array not an object.

While it's not hard to move the generation of the field so that it
appears where it's supposed to, this does result in a visible change
in field order in text format, in cases where a Append or MergeAppend
plan node has any InitPlans attached.  That's slightly annoying to
do in stable branches; but the alternative of continuing to emit
broken non-text formats seems worse.

Also, since the set of fields emitted is not supposed to be
data-dependent in non-text formats, make sure that "Subplans Removed"
appears in Append and MergeAppend nodes even when it's zero, in those
formats.  (The previous coding made it look like it could appear in
some other node types such as BitmapAnd, but we don't actually support
runtime pruning there, so don't emit it in those cases.)

Per bug #16171 from Mahadevan Ramachandran.  Fix by Daniel Gustafsson
and Tom Lane, reviewed by Hamid Akhtar.  Back-patch to v11 where this
code came in.

Discussion: https://postgr.es/m/16171-b72259ab75505fa2@postgresql.org
2020-02-04 13:07:13 -05:00
Alvaro Herrera 1c7a0b387d Add missing break out seqscan loop in logical replication
When replica identity is FULL (an admittedly unusual case), the loop
that searches for tuples in execReplication.c didn't stop scanning the
table when once a matching tuple was found.  Add the missing 'break'.

Note slight behavior change: we now return the first matching tuple
rather than the last one.  They are supposed to be indistinguishable
anyway, so this shouldn't matter.

Author: Konstantin Knizhnik
Discussion: https://postgr.es/m/379743f6-ae91-b866-f7a2-5624e6d2b0a4@postgrespro.ru
2020-02-03 18:59:12 -03:00
Michael Paquier f1f10a1ba9 Add declaration-level assertions for compile-time checks
Those new assertions can be used at file scope, outside of any function
for compilation checks.  This commit provides implementations for C and
C++, and fallback implementations.

Author: Peter Smith
Reviewed-by: Andres Freund, Kyotaro Horiguchi, Dagfinn Ilmari Mannsåker,
Michael Paquier
Discussion: https://postgr.es/m/201DD0641B056142AC8C6645EC1B5F62014B8E8030@SYD1217
2020-02-03 14:48:42 +09:00
Andrew Gierth 1fd687a035 Optimizations for integer to decimal output.
Using a lookup table of digit pairs reduces the number of divisions
needed, and calculating the length upfront saves some work; these
ideas are taken from the code previously committed for floats.

David Fetter, reviewed by Kyotaro Horiguchi, Tels, and me.

Discussion: https://postgr.es/m/20190924052620.GP31596%40fetter.org
2020-02-01 21:57:14 +00:00
Thomas Munro 93745f1e01 Fix memory leak on DSM slot exhaustion.
If we attempt to create a DSM segment when no slots are available,
we should return the memory to the operating system.  Previously
we did that if the DSM_CREATE_NULL_IF_MAXSEGMENTS flag was
passed in, but we didn't do it if an error was raised.  Repair.

Back-patch to 9.4, where DSM segments arrived.

Author: Thomas Munro
Reviewed-by: Robert Haas
Reported-by: Julian Backes
Discussion: https://postgr.es/m/CA%2BhUKGKAAoEw-R4om0d2YM4eqT1eGEi6%3DQot-3ceDR-SLiWVDw%40mail.gmail.com
2020-02-01 14:29:13 +13:00
Tom Lane 870ad6a59b Fix not-quite-right string comparison in parse_jsonb_index_flags().
This code would accept "strinX", where X is any 1-byte character,
as meaning "string".  Clearly it wasn't meant to do that.

No back-patch, since this doesn't affect correct queries and
there's some tiny chance we'd break somebody's incorrect query
in a minor release.

Report and patch by Dominik Czarnota.

Discussion: https://postgr.es/m/CABEVAa1dU0mDCAfaT8WF2adVXTDsLVJy_izotg6ze_hh-cn8qQ@mail.gmail.com
2020-01-31 17:26:40 -05:00
Tom Lane 74b35eb468 Fix CheckAttributeType's handling of collations for ranges.
Commit fc7695891 changed CheckAttributeType to recurse into ranges,
but made it pass down the wrong collation (always InvalidOid, since
ranges as such have no collation).  This would result in guaranteed
failure when considering a range type whose subtype is collatable.

Embarrassingly, we lack any regression tests that would expose such
a problem (but fortunately, somebody noticed before we shipped this
bug in any release).

Fix it to pass down the range's subtype collation property instead,
and add some regression test cases to exercise collatable-subtype
ranges a bit more.  Back-patch to all supported branches, as the
previous patch was.

Report and patch by Julien Rouhaud, test cases tweaked by me

Discussion: https://postgr.es/m/CAOBaU_aBWqNweiGUFX0guzBKkcfJ8mnnyyGC_KBQmO12Mj5f_A@mail.gmail.com
2020-01-31 17:03:55 -05:00
Peter Eisentraut 7c23bfd25c Sprinkle some const decorations
This might help clarify the API a bit.
2020-01-31 12:52:08 +01:00
Thomas Munro ef02fb15a3 Report time spent in posix_fallocate() as a wait event.
When allocating DSM segments with posix_fallocate() on Linux (see commit
899bd785), report this activity as a wait event exactly as we would if
we were using file-backed DSM rather than shm_open()-backed DSM.

Author: Thomas Munro
Discussion: https://postgr.es/m/CA%2BhUKGKCSh4GARZrJrQZwqs5SYp0xDMRr9Bvb%2BHQzJKvRgL6ZA%40mail.gmail.com
2020-01-31 17:29:41 +13:00
Thomas Munro d061ea21fc Adjust DSM and DSA slot usage constants.
When running a lot of large parallel queries concurrently, or a plan with
a lot of separate Gather nodes, it is possible to run out of DSM slots.
There are better solutions to these problems requiring architectural
redesign work, but for now, let's adjust the constants so that it's more
difficult to hit the limit.

1.  Previously, a DSA area would create up to four segments at each size
before doubling the size.  After this commit, it will create only two at
each size, so it ramps up faster and therefore needs fewer slots.

2.  Previously, the total limit on DSM slots allowed for 2 per connection.
Switch to 5 per connection.

Also remove an obsolete nearby comment.

Author: Thomas Munro
Reviewed-by: Robert Haas, Andres Freund
Discussion: https://postre.es/m/CA%2BhUKGL6H2BpGbiF7Lj6QiTjTGyTLW_vLR%3DSn2tEBeTcYXiMKw%40mail.gmail.com
2020-01-31 17:29:38 +13:00
Thomas Munro 74618e77b4 Handle lack of DSM slots in parallel btree build.
If no DSM slots are available, a ParallelContext can still be
created, but its seg pointer is NULL.  Teach parallel btree build
to cope with that by falling back to a regular non-parallel build,
to avoid crashing with a segmentation fault.

Back-patch to 11, where parallel CREATE INDEX landed.

Reported-by: Nicola Contu
Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CA%2BhUKGJgJEBnkuODBVomyK3MWFvDBbMVj%3Dgdt6DnRPU-5sQ6UQ%40mail.gmail.com
2020-01-31 10:25:34 +13:00
Alvaro Herrera c9d2977519 Clean up newlines following left parentheses
We used to strategically place newlines after some function call left
parentheses to make pgindent move the argument list a few chars to the
left, so that the whole line would fit under 80 chars.  However,
pgindent no longer does that, so the newlines just made the code
vertically longer for no reason.  Remove those newlines, and reflow some
of those lines for some extra naturality.

Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20200129200401.GA6303@alvherre.pgsql
2020-01-30 13:42:14 -03:00
Alvaro Herrera 4e89c79a52 Remove excess parens in ereport() calls
Cosmetic cleanup, not worth backpatching.

Discussion: https://postgr.es/m/20200129200401.GA6303@alvherre.pgsql
Reviewed-by: Tom Lane, Michael Paquier
2020-01-30 13:32:04 -03:00
Fujii Masao e6f1e560e4 Make inherited TRUNCATE perform access permission checks on parent table only.
Previously, TRUNCATE command through a parent table checked the
permissions on not only the parent table but also the children tables
inherited from it. This was a bug and inherited queries should perform
access permission checks on the parent table only. This commit fixes
that bug.

Back-patch to all supported branches.

Author: Amit Langote
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CAHGQGwFHdSvifhJE+-GSNqUHSfbiKxaeQQ7HGcYz6SC2n_oDcg@mail.gmail.com
2020-01-31 00:42:06 +09:00
Michael Paquier b0afdcad21 Fix slot data persistency when advancing physical replication slots
Advancing a physical replication slot with pg_replication_slot_advance()
did not mark the slot as dirty if any advancing was done, preventing the
follow-up checkpoint to flush the slot data to disk.  This caused the
advancing to be lost even on clean restarts.  This does not happen for
logical slots as any advancing marked the slot as dirty.  Per
discussion, the original feature has been implemented so as in the event
of a crash the slot may move backwards to a past LSN.  This property is
kept and more documentation is added about that.

This commit adds some new TAP tests to check the persistency of physical
and logical slots after advancing across clean restarts.

Author: Alexey Kondratov, Michael Paquier
Reviewed-by: Andres Freund, Kyotaro Horiguchi, Craig Ringer
Discussion: https://postgr.es/m/059cc53a-8b14-653a-a24d-5f867503b0ee@postgrespro.ru
Backpatch-through: 11
2020-01-30 11:14:02 +09:00
Tom Lane 50fc694e43 Invent "trusted" extensions, and remove the pg_pltemplate catalog.
This patch creates a new extension property, "trusted".  An extension
that's marked that way in its control file can be installed by a
non-superuser who has the CREATE privilege on the current database,
even if the extension contains objects that normally would have to be
created by a superuser.  The objects within the extension will (by
default) be owned by the bootstrap superuser, but the extension itself
will be owned by the calling user.  This allows replicating the old
behavior around trusted procedural languages, without all the
special-case logic in CREATE LANGUAGE.  We have, however, chosen to
loosen the rules slightly: formerly, only a database owner could take
advantage of the special case that allowed installation of a trusted
language, but now anyone who has CREATE privilege can do so.

Having done that, we can delete the pg_pltemplate catalog, moving the
knowledge it contained into the extension script files for the various
PLs.  This ends up being no change at all for the in-core PLs, but it is
a large step forward for external PLs: they can now have the same ease
of installation as core PLs do.  The old "trusted PL" behavior was only
available to PLs that had entries in pg_pltemplate, but now any
extension can be marked trusted if appropriate.

This also removes one of the stumbling blocks for our Python 2 -> 3
migration, since the association of "plpythonu" with Python 2 is no
longer hard-wired into pg_pltemplate's initial contents.  Exactly where
we go from here on that front remains to be settled, but one problem
is fixed.

Patch by me, reviewed by Peter Eisentraut, Stephen Frost, and others.

Discussion: https://postgr.es/m/5889.1566415762@sss.pgh.pa.us
2020-01-29 18:42:43 -05:00
Robert Haas beb4699091 Move jsonapi.c and jsonapi.h to src/common.
To make this work, (1) makeJsonLexContextCstringLen now takes the
encoding to be used as an argument; (2) check_stack_depth() is made to
do nothing in frontend code, and (3) elog(ERROR, ...) is changed to
pg_log_fatal + exit in frontend code.

Mark Dilger, reviewed and slightly revised by me.

Discussion: http://postgr.es/m/CA+TgmoYfOXhd27MUDGioVh6QtpD0C1K-f6ObSA10AWiHBAL5bA@mail.gmail.com
2020-01-29 10:22:51 -05:00
Peter Eisentraut dc788668bb Fail if recovery target is not reached
Before, if a recovery target is configured, but the archive ended
before the target was reached, recovery would end and the server would
promote without further notice.  That was deemed to be pretty wrong.
With this change, if the recovery target is not reached, it is a fatal
error.

Based-on-patch-by: Leif Gunnar Erlandsen <leif@lako.no>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/993736dd3f1713ec1f63fc3b653839f5@lako.no
2020-01-29 15:58:14 +01:00
Tom Lane 01d9676a53 Fix dangling pointer in EvalPlanQual machinery.
EvalPlanQualStart() supposed that it could re-use the relsubs_rowmark
and relsubs_done arrays from a prior instantiation.  But since they are
allocated in the es_query_cxt of the recheckestate, that's just wrong;
EvalPlanQualEnd() will blow away that storage.  Therefore we were using
storage that could have been reallocated to something else, causing all
sorts of havoc.

I think this was modeled on the old code's handling of es_epqTupleSlot,
but since the code was anyway clearing the arrays at re-use, there's
clearly no expectation of importing any outside state.  So it's just
a dubious savings of a couple of pallocs, which is negligible compared
to setting up a new planstate tree.  Therefore, just allocate the
arrays always.  (I moved the allocations slightly for readability.)

In principle this bug could cause a problem whenever EPQ rechecks are
needed in more than one target table of a ModifyTable plan node.
In practice it seems not quite so easy to trigger as that; I couldn't
readily duplicate a crash with a partitioned target table, for instance.
That's probably down to incidental choices about when to free or
reallocate stuff.  The added isolation test case does seem to reliably
show an assertion failure, though.

Per report from Oleksii Kliukin.  Back-patch to v12 where the bug was
introduced (evidently by commit 3fb307bc4).

Discussion: https://postgr.es/m/EEF05F66-2871-4786-992B-5F45C92FEE2E@hintbits.com
2020-01-28 17:26:37 -05:00
Heikki Linnakangas 30012a04a6 Fix randAccess setting in ReadRecord()
Commit 38a957316d got this backwards.

Author: Kyotaro Horiguchi
Discussion: https://www.postgresql.org/message-id/20200128.194408.2260703306774646445.horikyota.ntt@gmail.com
2020-01-28 12:55:30 +02:00
Thomas Munro 11da6bccd1 Fix compile error on HP C.
Per build farm animal anole, after commit 6f38d4dac3.
2020-01-28 20:30:40 +13:00
Thomas Munro 78aaa0e823 Don't reset latch in ConditionVariablePrepareToSleep().
It's not OK to do that without calling CHECK_FOR_INTERRUPTS().
Let the next wait loop deal with it, following the usual pattern.

One consequence of this bug was that a SIGTERM delivered in a very
narrow timing window could leave a parallel worker process waiting
forever for a condition variable that will never be signaled, after
an error was raised in other process.

The code is a bit different in the stable branches due to commit
1321509f, making problems less likely there.  No back-patch for now,
but we may finish up deciding to make a similar change after more
discussion.

Author: Thomas Munro
Reviewed-by: Shawn Debnath
Reported-by: Tomas Vondra
Discussion: https://postgr.es/m/CA%2BhUKGJOm8zZHjVA8svoNT3tHY0XdqmaC_kHitmgXDQM49m1dA%40mail.gmail.com
2020-01-28 15:28:36 +13:00
Amit Kapila 05f18c6b6b Added relation name in error messages for constraint checks.
This gives more information to the user about the error and it makes such
messages consistent with the other similar messages in the code.

Reported-by: Simon Riggs
Author: Mahendra Singh and Simon Riggs
Reviewed-by: Beena Emerson and Amit Kapila
Discussion: https://postgr.es/m/CANP8+j+7YUvQvGxTrCiw77R23enMJ7DFmyA3buR+fa2pKs4XhA@mail.gmail.com
2020-01-28 07:48:10 +05:30
Michael Paquier ff8ca5fadd Add connection parameters to control SSL protocol min/max in libpq
These two new parameters, named sslminprotocolversion and
sslmaxprotocolversion, allow to respectively control the minimum and the
maximum version of the SSL protocol used for the SSL connection attempt.
The default setting is to allow any version for both the minimum and the
maximum bounds, causing libpq to rely on the bounds set by the backend
when negotiating the protocol to use for an SSL connection.  The bounds
are checked when the values are set at the earliest stage possible as
this makes the checks independent of any SSL implementation.

Author: Daniel Gustafsson
Reviewed-by: Michael Paquier, Cary Huang
Discussion: https://postgr.es/m/4F246AE3-A7AE-471E-BD3D-C799D3748E03@yesql.se
2020-01-28 10:40:48 +09:00
Thomas Munro 6f38d4dac3 Remove dependency on HeapTuple from predicate locking functions.
The following changes make the predicate locking functions more
generic and suitable for use by future access methods:

- PredicateLockTuple() is renamed to PredicateLockTID().  It takes
  ItemPointer and inserting transaction ID instead of HeapTuple.

- CheckForSerializableConflictIn() takes blocknum instead of buffer.

- CheckForSerializableConflictOut() no longer takes HeapTuple or buffer.

Author: Ashwin Agrawal
Reviewed-by: Andres Freund, Kuntal Ghosh, Thomas Munro
Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
2020-01-28 13:13:04 +13:00
Tom Lane 4589c6a2a3 Apply project best practices to switches over enum values.
In the wake of 1f3a02173, assorted buildfarm members were warning about
"control reaches end of non-void function" or the like.  Do what we've
done elsewhere: in place of a "default" switch case that will prevent
the compiler from warning about unhandled enum values, put a catchall
elog() after the switch.  And return a dummy value to satisfy compilers
that don't know elog() doesn't return.
2020-01-27 18:46:30 -05:00
Robert Haas 73ce2a03f3 Move some code from jsonapi.c to jsonfuncs.c.
Specifically, move those functions that depend on ereport()
from jsonapi.c to jsonfuncs.c, in preparation for allowing
jsonapi.c to be used from frontend code.

A few cases where elog(ERROR, ...) is used for can't-happen
conditions are left alone; we can handle those in some other
way in frontend code.

Reviewed by Mark Dilger and Andrew Dunstan.

Discussion: http://postgr.es/m/CA+TgmoYfOXhd27MUDGioVh6QtpD0C1K-f6ObSA10AWiHBAL5bA@mail.gmail.com
2020-01-27 11:22:13 -05:00
Robert Haas 1f3a021730 Adjust pg_parse_json() so that it does not directly ereport().
Instead, it now returns a value indicating either success or the
type of error which occurred. The old behavior is still available
by calling pg_parse_json_or_ereport(). If the new interface is
used, an error can be thrown by passing the return value of
pg_parse_json() to json_ereport_error().

pg_parse_json() can still elog() in can't-happen cases, but it
seems like that issue is best handled separately.

Adjust json_lex() and json_count_array_elements() to return an
error code, too.

This is all in preparation for making the backend's json parser
available to frontend code.

Reviewed and/or tested by Mark Dilger and Andrew Dunstan.

Discussion: http://postgr.es/m/CA+TgmoYfOXhd27MUDGioVh6QtpD0C1K-f6ObSA10AWiHBAL5bA@mail.gmail.com
2020-01-27 11:04:51 -05:00
Thomas Munro 3e4818e9dd Avoid unnecessary shm writes in Parallel Hash Join.
Currently, Parallel Hash Join cannot be used for full/right joins,
so there is no point in setting the match flag.  It turns out that
the cache coherence traffic generated by those writes slows down
large systems running many-core joins, so let's stop doing that.
In future, if we need to use match bits in parallel joins, we might
want to consider setting them only if not already set.

Back-patch to 11, where Parallel Hash Join arrived.

Reported-by: Deng, Gang
Discussion: https://postgr.es/m/0F44E799048C4849BAE4B91012DB910462E9897A%40SHSMSX103.ccr.corp.intel.com
2020-01-27 15:07:03 +13:00
Michael Paquier 10a525230f Fix some memory leaks and improve restricted token handling on Windows
The leaks have been detected by a Coverity run on Windows.  No backpatch
is done as the leaks are minor.

While on it, make restricted token creation more consistent in its error
handling by logging an error instead of a warning if missing
advapi32.dll, which was missing in the NT4 days.  Any modern platform
should have this DLL around.  Now, if the library is not there, an error
is still reported back to the caller, and nothing is done do there is no
behavior change done in this commit.

Author: Ranier Vilela
Discussion: https://postgr.es/m/CAEudQApa9MG0foPkgPX87fipk=vhnF2Xfg+CfUyR08h4R7Mywg@mail.gmail.com
2020-01-27 11:02:05 +09:00
Tom Lane 3ec20c7091 Fix EXPLAIN (SETTINGS) to follow policy about when to print empty fields.
In non-TEXT output formats, the "Settings" field should appear when
requested, even if it would be empty.

Also, get rid of the premature optimization of counting all the
GUC_EXPLAIN variables at startup.  Since there was no provision for
adjusting that count later, all it'd take would be some extension marking
a parameter as GUC_EXPLAIN to risk an assertion failure or memory stomp.
We could make get_explain_guc_options() count those variables on-the-fly,
or dynamically resize its array ... but TBH I do not think that making a
transient array of pointers a bit smaller is worth any extra complication,
especially when you consider all the other transient space EXPLAIN eats.
So just allocate that array at the max possible size.

In HEAD, also add some regression test coverage for this feature.

Because of the memory-stomp hazard, back-patch to v12 where this
feature was added.

Discussion: https://postgr.es/m/19416.1580069629@sss.pgh.pa.us
2020-01-26 16:32:19 -05:00
Thomas Munro f37ff03478 Refactor confusing code in _mdfd_openseg().
As reported independently by a couple of people, _mdfd_openseg() is coded in a
way that seems to imply that the segments could be opened in an order that
isn't strictly sequential.  Even if that were true, it's also using the wrong
comparison.  It's not an active bug, since the condition is always true anyway,
but it's confusing, so replace it with an assertion.

Author: Thomas Munro
Reviewed-by: Andres Freund, Kyotaro Horiguchi, Noah Misch
Discussion: https://postgr.es/m/CA%2BhUKG%2BNBw%2BuSzxF1os-SO6gUuw%3DcqO5DAybk6KnHKzgGvxhxA%40mail.gmail.com
Discussion: https://postgr.es/m/20191222091930.GA1280238%40rfd.leadboat.com
2020-01-27 09:12:56 +13:00
Heikki Linnakangas 38a957316d Refactor XLogReadRecord(), adding XLogBeginRead() function.
The signature of XLogReadRecord() required the caller to pass the starting
WAL position as argument, or InvalidXLogRecPtr to continue reading at the
end of previous record. That's slightly awkward to the callers, as most
of them don't want to randomly jump around in the WAL stream, but start
reading at one position and then read everything from that point onwards.
Remove the 'RecPtr' argument and add a new function XLogBeginRead() to
specify the starting position instead. That's more convenient for the
callers. Also, xlogreader holds state that is reset when you change the
starting position, so having a separate function for doing that feels like
a more natural fit.

This changes XLogFindNextRecord() function so that it doesn't reset the
xlogreader's state to what it was before the call anymore. Instead, it
positions the xlogreader to the found record, like XLogBeginRead().

Reviewed-by: Kyotaro Horiguchi, Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/5382a7a3-debe-be31-c860-cb810c08f366%40iki.fi
2020-01-26 11:39:00 +02:00
Tom Lane 1001368497 Clean up EXPLAIN's handling of per-worker details.
Previously, it was possible for EXPLAIN ANALYZE of a parallel query
to produce several different "Workers" fields for a single plan node,
because different portions of explain.c independently generated
per-worker data and wrapped that output in separate fields.  This
is pretty bogus, especially for the structured output formats: even
if it's not technically illegal, most programs would have a hard time
dealing with such data.

To improve matters, add infrastructure that allows redirecting
per-worker values into a side data structure, and then collect that
data into a single "Workers" field after we've finished running all
the relevant code for a given plan node.

There are a few visible side-effects:

* In text format, instead of something like

  Sort Method: external merge  Disk: 4920kB
  Worker 0:  Sort Method: external merge  Disk: 5880kB
  Worker 1:  Sort Method: external merge  Disk: 5920kB
  Buffers: shared hit=682 read=10188, temp read=1415 written=2101
  Worker 0:  actual time=130.058..130.324 rows=1324 loops=1
    Buffers: shared hit=337 read=3489, temp read=505 written=739
  Worker 1:  actual time=130.273..130.512 rows=1297 loops=1
    Buffers: shared hit=345 read=3507, temp read=505 written=744

you get

  Sort Method: external merge  Disk: 4920kB
  Buffers: shared hit=682 read=10188, temp read=1415 written=2101
  Worker 0:  actual time=130.058..130.324 rows=1324 loops=1
    Sort Method: external merge  Disk: 5880kB
    Buffers: shared hit=337 read=3489, temp read=505 written=739
  Worker 1:  actual time=130.273..130.512 rows=1297 loops=1
    Sort Method: external merge  Disk: 5920kB
    Buffers: shared hit=345 read=3507, temp read=505 written=744

* When JIT is enabled, any relevant per-worker JIT stats are attached
to the child node of the Gather or Gather Merge node, which is where
the other per-worker output has always been.  Previously, that info
was attached directly to a Gather node, or missed entirely for Gather
Merge.

* A query's summary JIT data no longer includes a bogus
"Worker Number: -1" field.

A notable code-level change is that indenting for lines of text-format
output should now be handled by calling "ExplainIndentText(es)",
instead of hard-wiring how much space to emit.  This seems a good deal
cleaner anyway.

This patch also adds a new "explain.sql" regression test script that's
dedicated to testing EXPLAIN.  There is more that can be done in that
line, certainly, but for now it just adds some coverage of the XML and
YAML output formats, which had been completely untested.

Although this is surely a bug fix, it's not clear that people would
be happy with rearranging EXPLAIN output in a minor release, so apply
to HEAD only.

Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's;
reviewed by Georgios Kokolatos

Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-25 18:16:42 -05:00
Dean Rasheed 13661ddd7e Add functions gcd() and lcm() for integer and numeric types.
These compute the greatest common divisor and least common multiple of
a pair of numbers using the Euclidean algorithm.

Vik Fearing, reviewed by Fabien Coelho.

Discussion: https://postgr.es/m/adbd3e0b-e3f1-5bbc-21db-03caf1cef0f7@2ndquadrant.com
2020-01-25 14:00:59 +00:00
Robert Haas 530609aa42 Remove jsonapi.c's lex_accept().
At first glance, this function seems useful, but it actually increases
the amount of code required rather than decreasing it. Inline the
logic into the callers instead; most callers don't use the 'lexeme'
argument for anything and as a result considerable simplification is
possible.

Along the way, fix the header comment for the nearby function
lex_expect(), which mislabeled it as lex_accept().

Patch by me, reviewed by David Steele, Mark Dilger, and Andrew
Dunstan.

Discussion: http://postgr.es/m/CA+TgmoYfOXhd27MUDGioVh6QtpD0C1K-f6ObSA10AWiHBAL5bA@mail.gmail.com
2020-01-24 10:29:52 -08:00
Robert Haas 11b5e3e35d Split JSON lexer/parser from 'json' data type support.
Keep the code that pertains to the 'json' data type in json.c, but
move the lexing and parsing code to a new file jsonapi.c, a name
I chose because the corresponding prototypes are in jsonapi.h.

This seems like a logical division, because the JSON lexer and parser
are also used by the 'jsonb' data type, but the SQL-callable functions
in json.c are a separate thing. Also, the new jsonapi.c file needs to
include far fewer header files than json.c, which seems like a good
sign that this is an appropriate place to insert an abstraction
boundary. I took the opportunity to remove a few apparently-unneeded
includes from json.c at the same time.

Patch by me, reviewed by David Steele, Mark Dilger, and Andrew
Dunstan. The previous commit was, too, but I forgot to note it
in the commit message.

Discussion: http://postgr.es/m/CA+TgmoYfOXhd27MUDGioVh6QtpD0C1K-f6ObSA10AWiHBAL5bA@mail.gmail.com
2020-01-24 10:17:43 -08:00
Robert Haas ce0425b162 Adjust src/include/utils/jsonapi.h so it's not backend-only.
The major change here is that we no longer include jsonb.h into
jsonapi.h. The reason that was necessary is that jsonapi.h included
several prototypes functions in jsonfuncs.c that depend on the Jsonb
type. Move those prototypes to a new header, jsonfuncs.h, and include
it where needed.

The other change is that JsonEncodeDateTime is now declared in
json.h rather than jsonapi.h.

Taken together, these steps eliminate all dependencies of jsonapi.h
on backend-only data types and header files, so that it can
potentially be included in frontend code.
2020-01-24 09:58:37 -08:00
Fujii Masao d694e0bb79 Add pg_file_sync() to adminpack extension.
This function allows us to fsync the specified file or directory.
It's useful, for example, when we want to sync the file that
pg_file_write() writes out or that COPY TO exports the data into,
for durability.

Author: Fujii Masao
Reviewed-By: Julien Rouhaud, Arthur Zakirov, Michael Paquier, Atsushi Torikoshi
Discussion: https://www.postgresql.org/message-id/CAHGQGwGY8uzZ_k8dHRoW1zDcy1Z7=5GQ+So4ZkVy2u=nLsk=hA@mail.gmail.com
2020-01-24 20:42:52 +09:00
Tom Lane 9a3a75cb81 Fix an oversight in commit 4c70098ff.
I had supposed that the from_char_seq_search() call sites were
all passing the constant arrays you'd expect them to pass ...
but on looking closer, the one for DY format was passing the
days[] array not days_short[].  This accidentally worked because
the day abbreviations in English are all the same as the first
three letters of the full day names.  However, once we took out
the "maximum comparison length" logic, it stopped working.

As penance for that oversight, add regression test cases covering
this, as well as every other switch case in DCH_from_char() that
was not reached according to the code coverage report.

Also, fold the DCH_RM and DCH_rm cases into one --- now that
seq_search is case independent, there's no need to pass different
comparison arrays for those cases.

Back-patch, as the previous commit was.
2020-01-23 16:15:32 -05:00
Tom Lane 4c70098ffa Clean up formatting.c's logic for matching constant strings.
seq_search(), which is used to match input substrings to constants
such as month and day names, had a lot of bizarre and unnecessary
behaviors.  It was mostly possible to avert our eyes from that before,
but we don't want to duplicate those behaviors in the upcoming patch
to allow recognition of non-English month and day names.  So it's time
to clean this up.  In particular:

* seq_search scribbled on the input string, which is a pretty dangerous
thing to do, especially in the badly underdocumented way it was done here.
Fortunately the input string is a temporary copy, but that was being made
three subroutine levels away, making it something easy to break
accidentally.  The behavior is externally visible nonetheless, in the form
of odd case-folding in error reports about unrecognized month/day names.
The scribbling is evidently being done to save a few calls to pg_tolower,
but that's such a cheap function (at least for ASCII data) that it's
pretty pointless to worry about.  In HEAD I switched it to be
pg_ascii_tolower to ensure it is cheap in all cases; but there are corner
cases in Turkish where this'd change behavior, so leave it as pg_tolower
in the back branches.

* seq_search insisted on knowing the case form (all-upper, all-lower,
or initcap) of the constant strings, so that it didn't have to case-fold
them to perform case-insensitive comparisons.  This likewise seems like
excessive micro-optimization, given that pg_tolower is certainly very
cheap for ASCII data.  It seems unsafe to assume that we know the case
form that will come out of pg_locale.c for localized month/day names, so
it's better just to define the comparison rule as "downcase all strings
before comparing".  (The choice between downcasing and upcasing is
arbitrary so far as English is concerned, but it might not be in other
locales, so follow citext's lead here.)

* seq_search also had a parameter that'd cause it to report a match
after a maximum number of characters, even if the constant string were
longer than that.  This was not actually used because no caller passed
a value small enough to cut off a comparison.  Replicating that behavior
for localized month/day names seems expensive as well as useless, so
let's get rid of that too.

* from_char_seq_search used the maximum-length parameter to truncate
the input string in error reports about not finding a matching name.
This leads to rather confusing reports in many cases.  Worse, it is
outright dangerous if the input string isn't all-ASCII, because we
risk truncating the string in the middle of a multibyte character.
That'd lead either to delivering an illegible error message to the
client, or to encoding-conversion failures that obscure the actual
data problem.  Get rid of that in favor of truncating at whitespace
if any (a suggestion due to Alvaro Herrera).

In addition to fixing these things, I const-ified the input string
pointers of DCH_from_char and its subroutines, to make sure there
aren't any other scribbling-on-input problems.

The risk of generating a badly-encoded error message seems like
enough of a bug to justify back-patching, so patch all supported
branches.

Discussion: https://postgr.es/m/29432.1579731087@sss.pgh.pa.us
2020-01-23 13:42:09 -05:00
Michael Paquier f942dfb952 Clarify some comments in vacuumlazy.c
Author: Justin Pryzby
Discussion: https://postgr.es/m/20200113004542.GA26045@telsasoft.com
2020-01-23 15:56:56 +09:00
Fujii Masao 41c184bc64 Add GUC ignore_invalid_pages.
Detection of WAL records having references to invalid pages
during recovery causes PostgreSQL to raise a PANIC-level error,
aborting the recovery. Setting ignore_invalid_pages to on causes
the system to ignore those WAL records (but still report a warning),
and continue recovery. This behavior may cause crashes, data loss,
propagate or hide corruption, or other serious problems.
However, it may allow you to get past the PANIC-level error,
to finish the recovery, and to cause the server to start up.

Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://www.postgresql.org/message-id/CAHGQGwHCK6f77yeZD4MHOnN+PaTf6XiJfEB+Ce7SksSHjeAWtg@mail.gmail.com
2020-01-22 11:56:34 +09:00
Amit Kapila 79a3efb84d Fix the computation of max dead tuples during the vacuum.
In commit 40d964ec99, we changed the way memory is allocated for dead
tuples but forgot to update the place where we compute the maximum
number of dead tuples.  This could lead to invalid memory requests.

Reported-by: Andres Freund
Diagnosed-by: Andres Freund
Author: Masahiko Sawada
Reviewed-by: Amit Kapila and Dilip Kumar
Discussion: https://postgr.es/m/20200121060020.e3cr7s7fj5rw4lok@alap3.anarazel.de
2020-01-22 07:43:51 +05:30
Michael Paquier a904abe2e2 Fix concurrent indexing operations with temporary tables
Attempting to use CREATE INDEX, DROP INDEX or REINDEX with CONCURRENTLY
on a temporary relation with ON COMMIT actions triggered unexpected
errors because those operations use multiple transactions internally to
complete their work.  Here is for example one confusing error when using
ON COMMIT DELETE ROWS:
ERROR:  index "foo" already contains data

Issues related to temporary relations and concurrent indexing are fixed
in this commit by enforcing the non-concurrent path to be taken for
temporary relations even if using CONCURRENTLY, transparently to the
user.  Using a non-concurrent path does not matter in practice as locks
cannot be taken on a temporary relation by a session different than the
one owning the relation, and the non-concurrent operation is more
effective.

The problem exists with REINDEX since v12 with the introduction of
CONCURRENTLY, and with CREATE/DROP INDEX since CONCURRENTLY exists for
those commands.  In all supported versions, this caused only confusing
error messages to be generated.  Note that with REINDEX, it was also
possible to issue a REINDEX CONCURRENTLY for a temporary relation owned
by a different session, leading to a server crash.

The idea to enforce transparently the non-concurrent code path for
temporary relations comes originally from Andres Freund.

Reported-by: Manuel Rigger
Author: Michael Paquier, Heikki Linnakangas
Reviewed-by: Andres Freund, Álvaro Herrera, Heikki Linnakangas
Discussion: https://postgr.es/m/CA+u7OA6gP7YAeCguyseusYcc=uR8+ypjCcgDDCTzjQ+k6S9ksQ@mail.gmail.com
Backpatch-through: 9.4
2020-01-22 09:49:18 +09:00
Tom Lane 9b9c5f279e Clarify behavior of adding and altering a column in same ALTER command.
The behavior of something like

ALTER TABLE transactions
  ADD COLUMN status varchar(30) DEFAULT 'old',
  ALTER COLUMN status SET default 'current';

is to fill existing table rows with 'old', not 'current'.  That's
intentional and desirable for a couple of reasons:

* It makes the behavior the same whether you merge the sub-commands
into one ALTER command or give them separately;

* If we applied the new default while filling the table, there would
be no way to get the existing behavior in one SQL command.

The same reasoning applies in cases that add a column and then
manipulate its GENERATED/IDENTITY status in a second sub-command,
since the generation expression is really just a kind of default.
However, that wasn't very obvious (at least not to me; earlier in
the referenced discussion thread I'd thought it was a bug to be
fixed).  And it certainly wasn't documented.

Hence, add documentation, code comments, and a test case to clarify
that this behavior is all intentional.

In passing, adjust ATExecAddColumn's defaults-related relkind check
so that it matches up exactly with ATRewriteTables, instead of being
effectively (though not literally) the negated inverse condition.
The reasoning can be explained a lot more concisely that way, too
(not to mention that the comment now matches the code, which it
did not before).

Discussion: https://postgr.es/m/10365.1558909428@sss.pgh.pa.us
2020-01-21 16:17:21 -05:00
Andres Freund affdde2e15 Fix edge case leading to agg transitions skipping ExecAggTransReparent() calls.
The code checking whether an aggregate transition value needs to be
reparented into the current context has always only compared the
transition return value with the previous transition value by datum,
i.e. without regard for NULLness.  This normally works, because when
the transition function returns NULL (via fcinfo->isnull), it'll
return a value that won't be the same as its input value.

But there's no hard requirement that that's the case. And it turns
out, it's possible to hit this case (see discussion or reproducers),
leading to a non-null transition value not being reparented, followed
by a crash caused by that.

Instead of adding another comparison of NULLness, instead have
ExecAggTransReparent() ensure that pergroup->transValue ends up as 0
when the new transition value is NULL. That avoids having to add an
additional branch to the much more common cases of the transition
function returning the old transition value (which is a pointer in
this case), and when the new value is different, but not NULL.

In branches since 69c3936a14, also deduplicate the reparenting code
between the expression evaluation based transitions, and the path for
ordered aggregates.

Reported-By: Teodor Sigaev, Nikita Glukhov
Author: Andres Freund
Discussion: https://postgr.es/m/bd34e930-cfec-ea9b-3827-a8bc50891393@sigaev.ru
Backpatch: 9.4-, this issue has existed since at least 7.4
2020-01-20 23:26:51 -08:00
Tom Lane 31f403e95f Further tweaking of jsonb_set_lax().
Some buildfarm members were still warning about this, because in
9c679a08f I'd missed decorating one of the ereport() code paths
with a dummy return.

Also, adjust the error messages to be more in line with project
style guide.
2020-01-20 14:26:56 -05:00
Heikki Linnakangas 4c87010981 Fix crash in BRIN inclusion op functions, due to missing datum copy.
The BRIN add_value() and union() functions need to make a longer-lived
copy of the argument, if they want to store it in the BrinValues struct
also passed as argument. The functions for the "inclusion operator
classes" used with box, range and inet types didn't take into account
that the union helper function might return its argument as is, without
making a copy. Check for that case, and make a copy if necessary. That
case arises at least with the range_union() function, when one of the
arguments is an 'empty' range:

CREATE TABLE brintest (n numrange);
CREATE INDEX brinidx ON brintest USING brin (n);
INSERT INTO brintest VALUES ('empty');
INSERT INTO brintest VALUES (numrange(0, 2^1000::numeric));
INSERT INTO brintest VALUES ('(-1, 0)');

SELECT brin_desummarize_range('brinidx', 0);
SELECT brin_summarize_range('brinidx', 0);

Backpatch down to 9.5, where BRIN was introduced.

Discussion: https://www.postgresql.org/message-id/e6e1d6eb-0a67-36aa-e779-bcca59167c14%40iki.fi
Reviewed-by: Emre Hasegeli, Tom Lane, Alvaro Herrera
2020-01-20 10:36:35 +02:00
Amit Kapila 40d964ec99 Allow vacuum command to process indexes in parallel.
This feature allows the vacuum to leverage multiple CPUs in order to
process indexes.  This enables us to perform index vacuuming and index
cleanup with background workers.  This adds a PARALLEL option to VACUUM
command where the user can specify the number of workers that can be used
to perform the command which is limited by the number of indexes on a
table.  Specifying zero as a number of workers will disable parallelism.
This option can't be used with the FULL option.

Each index is processed by at most one vacuum process.  Therefore parallel
vacuum can be used when the table has at least two indexes.

The parallel degree is either specified by the user or determined based on
the number of indexes that the table has, and further limited by
max_parallel_maintenance_workers.  The index can participate in parallel
vacuum iff it's size is greater than min_parallel_index_scan_size.

Author: Masahiko Sawada and Amit Kapila
Reviewed-by: Dilip Kumar, Amit Kapila, Robert Haas, Tomas Vondra,
Mahendra Singh and Sergei Kornilov
Tested-by: Mahendra Singh and Prabhat Sahu
Discussion:
https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com
https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0+w@mail.gmail.com
2020-01-20 07:57:49 +05:30
Tom Lane 9c679a08f0 Silence minor compiler warnings.
Ensure that ClassifyUtilityCommandAsReadOnly() has defined behavior
even if TransactionStmt.kind has a value that's not one of the
declared values for its enum.

Suppress warnings from compilers that don't know that elog(ERROR)
doesn't return, in ClassifyUtilityCommandAsReadOnly() and
jsonb_set_lax().

Per Coverity and buildfarm.
2020-01-19 16:04:36 -05:00
Heikki Linnakangas 7aaefadaac Remove separate files for the initial contents of pg_(sh)description
This data was only in separate files because it was the most convenient
way to handle it with a shell script. Now that we use a general-purpose
programming language, it's easy to assemble the data into the same format
as the rest of the catalogs and output it into postgres.bki. This allows
removal of some special-purpose code from initdb.c.

Discussion: https://www.postgresql.org/message-id/CACPNZCtVFtjHre6hg9dput0qRPp39pzuyA2A6BT8wdgrRy%2BQdA%40mail.gmail.com
Author: John Naylor
2020-01-19 13:54:58 +02:00
Michael Paquier 41aadeeb12 Add GUC checks for ssl_min_protocol_version and ssl_max_protocol_version
Mixing incorrect bounds set in the SSL context leads to confusing error
messages generated by OpenSSL which are hard to act on.  New checks are
added within the GUC machinery to improve the user experience as they
apply to any SSL implementation, not only OpenSSL, and doing the checks
beforehand avoids the creation of a SSL during a reload (or startup)
which we know will never be used anyway.

Backpatch down to 12, as those parameters have been introduced by
e73e67c.

Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20200114035420.GE1515@paquier.xyz
Backpatch-through: 12
2020-01-18 12:32:43 +09:00
Alexander Korotkov 4b754d6c16 Avoid full scan of GIN indexes when possible
The strategy of GIN index scan is driven by opclass-specific extract_query
method.  This method that needed search mode is GIN_SEARCH_MODE_ALL.  This
mode means that matching tuple may contain none of extracted entries.  Simple
example is '!term' tsquery, which doesn't need any term to exist in matching
tsvector.

In order to handle such scan key GIN calculates virtual entry, which contains
all TIDs of all entries of attribute.  In fact this is full scan of index
attribute.  And typically this is very slow, but allows to handle some queries
correctly in GIN.  However, current algorithm calculate such virtual entry for
each GIN_SEARCH_MODE_ALL scan key even if they are multiple for the same
attribute.  This is clearly not optimal.

This commit improves the situation by introduction of "exclude only" scan keys.
Such scan keys are not capable to return set of matching TIDs.  Instead, they
are capable only to filter TIDs produced by normal scan keys.  Therefore,
each attribute should contain at least one normal scan key, while rest of them
may be "exclude only" if search mode is GIN_SEARCH_MODE_ALL.

The same optimization might be applied to the whole scan, not per-attribute.
But that leads to NULL values elimination problem.  There is trade-off between
multiple possible ways to do this.  We probably want to do this later using
some cost-based decision algorithm.

Discussion: https://postgr.es/m/CAOBaU_YGP5-BEt5Cc0%3DzMve92vocPzD%2BXiZgiZs1kjY0cj%3DXBg%40mail.gmail.com
Author: Nikita Glukhov, Alexander Korotkov, Tom Lane, Julien Rouhaud
Reviewed-by: Julien Rouhaud, Tomas Vondra, Tom Lane
2020-01-18 01:11:39 +03:00
Tom Lane 41c6f9db25 Repair more failures with SubPlans in multi-row VALUES lists.
Commit 9b63c13f0 turns out to have been fundamentally misguided:
the parent node's subPlan list is by no means the only way in which
a child SubPlan node can be hooked into the outer execution state.
As shown in bug #16213 from Matt Jibson, we can also get short-lived
tuple table slots added to the outer es_tupleTable list.  At this point
I have little faith that there aren't other possible connections as
well; the long time it took to notice this problem shows that this
isn't a heavily-exercised situation.

Therefore, revert that fix, returning to the coding that passed a
NULL parent plan pointer down to the transiently-built subexpressions.
That gives us a pretty good guarantee that they won't hook into the
outer executor state in any way.  But then we need some other solution
to make SubPlans work.  Adopt the solution speculated about in the
previous commit's log message: do expression initialization at plan
startup for just those VALUES rows containing SubPlans, abandoning the
goal of reclaiming memory intra-query for those rows.  In practice it
seems unlikely that queries containing a vast number of VALUES rows
would be using SubPlans in them, so this should not give up much.

(BTW, this test case also refutes my claim in connection with the prior
commit that the issue only arises with use of LATERAL.  That was just
wrong: some variants of SubLink always produce SubPlans.)

As with previous patch, back-patch to all supported branches.

Discussion: https://postgr.es/m/16213-871ac3bc208ecf23@postgresql.org
2020-01-17 16:17:31 -05:00
Alvaro Herrera 15cac3a523 Set ReorderBufferTXN->final_lsn more eagerly
... specifically, set it incrementally as each individual change is
spilled down to disk.  This way, it is set correctly when the
transaction disappears without trace, ie. without leaving an XACT_ABORT
wal record.  (This happens when the server crashes midway through a
transaction.)

Failing to have final_lsn prevents ReorderBufferRestoreCleanup() from
working, since it needs the final_lsn in order to know the endpoint of
its iteration through spilled files.

Commit df9f682c7b already tried to fix the problem, but it didn't set
the final_lsn in all cases.  Revert that, since it's no longer needed.

Author: Vignesh C
Reviewed-by: Amit Kapila, Dilip Kumar
Discussion: https://postgr.es/m/CALDaNm2CLk+K9JDwjYST0sPbGg5AQdvhUt0jbKyX_HdAE0jk3A@mail.gmail.com
2020-01-17 18:00:39 -03:00
Tomas Vondra 543852fd8b Allocate freechunks bitmap as part of SlabContext
The bitmap used by SlabCheck to cross-check free chunks in a block used
to be allocated for each SlabCheck call, and was never freed. The memory
leak could be fixed by simply adding a pfree call, but it's actually a
bad idea to do any allocations in SlabCheck at all as it assumes the
state of the memory management as a whole is sane.

So instead we allocate the bitmap as part of SlabContext, which means
we don't need to do any allocations in SlabCheck and the bitmap goes
away together with the SlabContext.

Backpatch to 10, where the Slab context was introduced.

Author: Tomas Vondra
Reported-by: Andres Freund
Reviewed-by: Tom Lane
Backpatch-through: 10
Discussion: https://www.postgresql.org/message-id/20200116044119.g45f7pmgz4jmodxj%40alap3.anarazel.de
2020-01-17 15:29:11 +01:00
Andrew Dunstan a83586b554 Add a non-strict version of jsonb_set
jsonb_set_lax() is the same as jsonb_set, except that it takes and extra
argument that specifies what to do if the value argument is NULL. The
default is 'use_json_null'. Other possibilities are 'raise_exception',
'return_target' and 'delete_key', all these behaviours having been
suggested as reasonable by various users.

Discussion: https://postgr.es/m/375873e2-c957-3a8d-64f9-26c43c2b16e7@2ndQuadrant.com

Reviewed by: Pavel Stehule
2020-01-17 11:52:39 +10:30