Commit Graph

18654 Commits

Author SHA1 Message Date
Peter Eisentraut 1e5e4efd02 Error position support for partition specifications
Add support for error position reporting for partition specifications.

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
2018-08-30 08:20:23 +02:00
Peter Eisentraut a4a232b1e7 Error position support for defaults and check constraints
Add support for error position reporting for the expressions contained
in defaults and check constraint definitions.  This currently works only
for CREATE TABLE, not ALTER TABLE, because the latter is not set up to
pass around the original query string.

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
2018-08-30 08:20:23 +02:00
Michael Paquier 55875b6d2a Stop bgworkers during fast shutdown with postmaster in startup phase
When a postmaster gets into its phase PM_STARTUP, it would start
background workers using BgWorkerStart_PostmasterStart mode immediately,
which would cause problems for a fast shutdown as the postmaster forgets
to send SIGTERM to already-started background workers.  With smart and
immediate shutdowns, this correctly happened, and fast shutdown is the
only mode missing the shot.

Author: Alexander Kukushkin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAFh8B=mvnD8+DZUfzpi50DoaDfZRDfd7S=gwj5vU9GYn8UvHkA@mail.gmail.com
Backpatch-through: 9.5
2018-08-29 17:10:02 -07:00
Andrew Gierth c8ea87e4bd Avoid quadratic slowdown in regexp match/split functions.
regexp_matches, regexp_split_to_table and regexp_split_to_array all
work by compiling a list of match positions as character offsets (NOT
byte positions) in the source string.

Formerly, they then used text_substr to extract the matched text; but
in a multi-byte encoding, that counts the characters in the string,
and the characters needed to reach the starting byte position, on
every call. Accordingly, the performance degraded as the product of
the input string length and the number of match positions, such that
splitting a string of a few hundred kbytes could take many minutes.

Repair by keeping the wide-character copy of the input string
available (only in the case where encoding_max_length is not 1) after
performing the match operation, and extracting substrings from that
instead. This reduces the complexity to being linear in the number of
result bytes, discounting the actual regexp match itself (which is not
affected by this patch).

In passing, remove cleanup using retail pfree() which was obsoleted by
commit ff428cded (Feb 2008) which made cleanup of SRF multi-call
contexts automatic. Also increase (to ~134 million) the maximum number
of matches and provide an error message when it is reached.

Backpatch all the way because this has been wrong forever.

Analysis and patch by me; review by Kaiting Chen.

Discussion: https://postgr.es/m/87pnyn55qh.fsf@news-spur.riddles.org.uk

see also https://postgr.es/m/87lg996g4r.fsf@news-spur.riddles.org.uk
2018-08-28 12:17:33 +01:00
Peter Eisentraut 7a3b7bbfde Fix snapshot leak warning for some procedures
The problem arises with the combination of CALL with output parameters
and doing a COMMIT inside the procedure.  When a CALL has output
parameters, the portal uses the strategy PORTAL_UTIL_SELECT instead of
PORTAL_MULTI_QUERY.  Using PORTAL_UTIL_SELECT causes the portal's
snapshot to be registered with the current resource
owner (portal->holdSnapshot); see
9ee1cf04ab for the reason.

Normally, PortalDrop() unregisters the snapshot.  If not, then
ResourceOwnerRelease() will print a warning about a snapshot leak on
transaction commit.  A transaction commit normally drops all
portals (PreCommit_Portals()), except the active portal.  So in case of
the active portal, we need to manually release the snapshot to avoid the
warning.

Reported-by: Prabhat Sahu <prabhat.sahu@enterprisedb.com>
Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org>
2018-08-27 22:16:15 +02:00
Michael Paquier a556549d7e Improve VACUUM and ANALYZE by avoiding early lock queue
A caller of VACUUM can perform early lookup obtention which can cause
other sessions to block on the request done, causing potentially DOS
attacks as even a non-privileged user can attempt a vacuum fill of a
critical catalog table to block even all incoming connection attempts.

Contrary to TRUNCATE, a client could attempt a system-wide VACUUM after
building the list of relations to VACUUM, which can cause vacuum_rel()
or analyze_rel() to try to lock the relation but the operation would
just block.  When the client specifies a list of relations and the
relation needs to be skipped, ownership checks are done when building
the list of relations to work on, preventing a later lock attempt.

vacuum_rel() already had the sanity checks needed, except that those
were applied too late.  This commit refactors the code so as relation
skips are checked beforehand, making it safer to avoid too early locks,
for both manual VACUUM with and without a list of relations specified.

An isolation test is added emulating the fact that early locks do not
happen anymore, issuing a WARNING message earlier if the user calling
VACUUM is not a relation owner.

When a partitioned table is listed in a manual VACUUM or ANALYZE
command, its full list of partitions is fetched, all partitions get
added to the list to work on, and then each one of them is processed one
by one, with ownership checks happening at the later phase of
vacuum_rel() or analyze_rel().  Trying to do early ownership checks for
each partition is proving to be tedious as this would result in deadlock
risks with lock upgrades, and skipping all partitions if the listed
partitioned table is not owned would result in a behavior change
compared to how Postgres 10 has implemented vacuum for partitioned
tables.  The original problem reported related to early lock queue for
critical relations is fixed anyway, so priority is given to avoiding a
backward-incompatible behavior.

Reported-by: Lloyd Albin, Jeremy Schneider
Author: Michael Paquier
Reviewed by: Nathan Bossart, Kyotaro Horiguchi
Discussion: https://postgr.es/m/152512087100.19803.12733865831237526317@wrigleys.postgresql.org
Discussion: https://postgr.es/m/20180812222142.GA6097@paquier.xyz
2018-08-27 09:11:12 +09:00
Thomas Munro 18e586741b Fix typos.
Author: David Rowley
Discussion: https://postgr.es/m/CAKJS1f8du35u5DprpykWvgNEScxapbWYJdHq%2Bz06Wj3Y2KFPbw%40mail.gmail.com
2018-08-27 09:32:59 +12:00
Tom Lane bff84a547d Make syslogger more robust against failures in opening CSV log files.
The previous coding figured it'd be good enough to postpone opening
the first CSV log file until we got a message we needed to write there.
This is unsafe, though, because if the open fails we end up in infinite
recursion trying to report the failure.  Instead make the CSV log file
management code look as nearly as possible like the longstanding logic
for the stderr log file.  In particular, open it immediately at postmaster
startup (if enabled), or when we get a SIGHUP in which we find that
log_destination has been changed to enable CSV logging.

It seems OK to fail if a postmaster-start-time open attempt fails, as
we've long done for the stderr log file.  But we can't die if we fail
to open a CSV log file during SIGHUP, so we're still left with a problem.
In that case, write any output meant for the CSV log file to the stderr
log file.  (This will also cover race-condition cases in which backends
send CSV log data before or after we have the CSV log file open.)

This patch also fixes an ancient oversight that, if CSV logging was
turned off during a SIGHUP, we never actually closed the last CSV
log file.

In passing, remember to reset whereToSendOutput = DestNone during syslogger
start, since (unlike all other postmaster children) it's forked before the
postmaster has done that.  This made for a platform-dependent difference
in error reporting behavior between the syslogger and other children:
except on Windows, it'd report problems to the original postmaster stderr
as well as the normal error log file(s).  It's barely possible that that
was intentional at some point; but it doesn't seem likely to be desirable
in production, and the platform dependency definitely isn't desirable.

Per report from Alexander Kukushkin.  It's been like this for a long time,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/CAFh8B==iLUD_gqC-dAENS0V+kVrCeGiKujtKqSQ7++S-caaChw@mail.gmail.com
2018-08-26 14:21:55 -04:00
Jeff Davis ba9d35b8eb Reconsider new file extension in commit 91f26d5f.
Andres and Tom objected to the choice of the ".tmp"
extension. Changing to Andres's suggestion of ".spill".

Discussion: https://postgr.es/m/88092095-3348-49D8-8746-EB574B1D30EA%40anarazel.de
2018-08-25 22:52:46 -07:00
Jeff Davis 91f26d5fe4 Change extension of spilled ReorderBufferChange data to ".tmp".
The previous extension, ".snap", was chosen for historical reasons and
became confusing.

Discussion: https://postgr.es/m/CAMp0ubd_P8vBGx8=MfDXQJZxHA5D_Zarw5cCkDxJ_63+pWRJ9w@mail.gmail.com
2018-08-25 09:36:09 -07:00
Andres Freund cb92520563 LLVMJIT: LLVMGetHostCPUFeatures now is upstream, use LLMV version if available.
Noticed thanks to buildfarm animal seawasp.

Author: Andres Freund
Backpatch: v11-, where LLVM based JIT compliation was introduced.
2018-08-24 10:21:38 -07:00
Tom Lane b0c5da615e Suppress uninitialized-variable warning in new SCRAM code.
While we generally don't sweat too much about "may be used uninitialized"
warnings from older compilers, I noticed that there's a fair number of
buildfarm animals that are producing such a warning *only* for this
variable.  So it seems worth silencing.
2018-08-24 10:51:10 -04:00
Andres Freund 143290efd0 Introduce minimal C99 usage to verify compiler support.
This just converts a few for loops in postgres.c to declare variables
in the loop initializer, and uses designated initializers in smgr.c's
definition of smgr callbacks.

Author: Andres Freund
Discussion: https://postgr.es/m/97d4b165-192d-3605-749c-f614a0c4e783@2ndquadrant.com
2018-08-23 18:36:07 -07:00
Andres Freund 88ebd62fcc Deduplicate code between slot_getallattrs() and slot_getsomeattrs().
Code in slot_getallattrs() is the same as if slot_getsomeattrs() is
called with number of attributes specified in the tuple
descriptor. Implement it that way instead of duplicating the code
between those two functions.

This is part of a patchseries abstracting TupleTableSlots so they can
store arbitrary forms of tuples, but is a nice enough cleanup on its
own.

Author: Ashutosh Bapat
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de
2018-08-23 16:58:53 -07:00
Andrew Gierth a40631a920 Fix lexing of standard multi-character operators in edge cases.
Commits c6b3c939b (which fixed the precedence of >=, <=, <> operators)
and 865f14a2d (which added support for the standard => notation for
named arguments) created a class of lexer tokens which look like
multi-character operators but which have their own token IDs distinct
from Op. However, longest-match rules meant that following any of
these tokens with another operator character, as in (1<>-1), would
cause them to be incorrectly returned as Op.

The error here isn't immediately obvious, because the parser would
usually still find the correct operator via the Op token, but there
were more subtle problems:

1. If immediately followed by a comment or +-, >= <= <> would be given
   the old precedence of Op rather than the correct new precedence;

2. If followed by a comment, != would be returned as Op rather than as
   NOT_EQUAL, causing it not to be found at all;

3. If followed by a comment or +-, the => token for named arguments
   would be lexed as Op, causing the argument to be mis-parsed as a
   simple expression, usually causing an error.

Fix by explicitly checking for the operators in the {operator} code
block in addition to all the existing special cases there.

Backpatch to 9.5 where the problem was introduced.

Analysis and patch by me; review by Tom Lane.
Discussion: https://postgr.es/m/87va851ppl.fsf@news-spur.riddles.org.uk
2018-08-23 21:42:40 +01:00
Andrew Gierth d4a63f8297 Reduce an unnecessary O(N^3) loop in lexer.
The lexer's handling of operators contained an O(N^3) hazard when
dealing with long strings of + or - characters; it seems hard to
prevent this case from being O(N^2), but the additional N multiplier
was not needed.

Backpatch all the way since this has been there since 7.x, and it
presents at least a mild hazard in that trying to do Bind, PREPARE or
EXPLAIN on a hostile query could take excessive time (without
honouring cancels or timeouts) even if the query was never executed.
2018-08-23 21:42:40 +01:00
Peter Eisentraut 0a63f996e0 Change PROCEDURE to FUNCTION in CREATE TRIGGER syntax
Since procedures are now a different thing from functions, change the
CREATE TRIGGER and CREATE EVENT TRIGGER syntax to use FUNCTION in the
clause that specifies the function.  PROCEDURE is still accepted for
compatibility.

pg_dump and ruleutils.c output is not changed yet, because that would
require a change in information_schema.sql and thus a catversion change.

Reported-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Jonathan S. Katz <jonathan.katz@excoventures.com>
2018-08-22 14:44:49 +02:00
Peter Eisentraut d12782898e Change PROCEDURE to FUNCTION in CREATE OPERATOR syntax
Since procedures are now a different thing from functions, change the
CREATE OPERATOR syntax to use FUNCTION in the clause that specifies the
function.  PROCEDURE is still accepted for compatibility.

Reported-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Jonathan S. Katz <jonathan.katz@excoventures.com>
2018-08-22 14:44:49 +02:00
Peter Eisentraut b19495772e doc: Update uses of the word "procedure"
Historically, the term procedure was used as a synonym for function in
Postgres/PostgreSQL.  Now we have procedures as separate objects from
functions, so we need to clean up the documentation to not mix those
terms.

In particular, mentions of "trigger procedures" are changed to "trigger
functions", and access method "support procedures" are changed to
"support functions".  (The latter already used FUNCTION in the SQL
syntax anyway.)  Also, the terminology in the SPI chapter has been
cleaned up.

A few tests, examples, and code comments are also adjusted to be
consistent with documentation changes, but not everything.

Reported-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Jonathan S. Katz <jonathan.katz@excoventures.com>
2018-08-22 14:44:49 +02:00
Thomas Munro af63926cf5 Wrap long line in postgresql.conf.sample.
Per complaint from Michael Paquier.
2018-08-22 21:34:50 +12:00
Thomas Munro f9fe269ca2 Provide plan_cache_mode options in postgresql.conf.sample.
Author: David Rowley
Discussion: https://postgr.es/m/CAKJS1f8YkwojSTSg8YjNYCLCXzx0fR7wBR3Gf%2BrA9_52eoPZKg%40mail.gmail.com
2018-08-22 18:19:39 +12:00
Alvaro Herrera 55a0154efd Fix typo 2018-08-21 17:16:10 -03:00
Alvaro Herrera 083651da17 fix typo 2018-08-21 17:03:35 -03:00
Michael Paquier 72be8c29a1 Fix set of NLS translation issues
While monitoring the code, a couple of issues related to string
translation has showed up:
- Some routines for auto-updatable views return an error string, which
sometimes missed the shot.  A comment regarding string translation is
added for each routine to help with future features.
- GSSAPI authentication missed two translations.
- vacuumdb handles non-translated strings.
- GetConfigOptionByNum should translate strings.  This part is not
back-patched as after a minor upgrade this could be surprising for
users.

Reported-by: Kyotaro Horiguchi
Author: Kyotaro Horiguchi
Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20180810.152131.31921918.horiguchi.kyotaro@lab.ntt.co.jp
Backpatch-through: 9.3
2018-08-21 15:17:13 +09:00
Michael Paquier d8c83800c3 Fix typo in description of enable_parallel_hash
Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20180821.115841.93250330.horiguchi.kyotaro@lab.ntt.co.jp
2018-08-21 12:13:16 +09:00
Michael Paquier 1339fcc896 Clarify comment about assignment and reset of temp namespace ID in MyProc
The new wording comes from Álvaro, which I modified a bit.

Reported-by: Andres Freund, Álvaro Herrera
Author: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/20180809165047.GK13638@paquier.xyz
Backpatch-through: 11
2018-08-21 08:32:18 +09:00
Peter Eisentraut d83423db86 Improve error messages for CREATE OR REPLACE PROCEDURE
Change the hint to recommend DROP PROCEDURE instead of FUNCTION.  Also
make the error message when changing the return type more specific to
the case of procedures.

Reported-by: Jeremy Evans <code@jeremyevans.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
2018-08-18 22:05:43 +02:00
Peter Eisentraut e4597ee65d InsertPgAttributeTuple() to set attcacheoff
InsertPgAttributeTuple() is the interface between in-memory tuple
descriptors and on-disk pg_attribute, so it makes sense to give it the
job of resetting attcacheoff.  This avoids having all the callers having
to do so.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
2018-08-17 22:08:21 +02:00
Andrew Gierth 520acab171 Set scan direction appropriately for SubPlans (bug #15336)
When executing a SubPlan in an expression, the EState's direction
field was left alone, resulting in an attempt to execute the subplan
backwards if it was encountered during a backwards scan of a cursor.
Also, though much less likely, it was possible to reach the execution
of an InitPlan while in backwards-scan state.

Repair by saving/restoring estate->es_direction and forcing forward
scan mode in the relevant places.

Backpatch all the way, since this has been broken since 8.3 (prior to
commit c7ff7663e, SubPlans had their own EStates rather than sharing
the parent plan's, so there was no confusion over scan direction).

Per bug #15336 reported by Vladimir Baranoff; analysis and patch by
me, review by Tom Lane.

Discussion: https://postgr.es/m/153449812167.1304.1741624125628126322@wrigleys.postgresql.org
2018-08-17 15:44:13 +01:00
Tomas Vondra c4c3400885 Use the built-in float datatypes to implement geometric types
This patch makes the geometric operators and functions use the exported
function of the float4/float8 datatypes.  The main reason of doing so is
to check for underflow and overflow, and to handle NaNs consciously.

The float datatypes consider NaNs values to be equal and greater than
all non-NaN values.  This change considers NaNs equal only for equality
operators.  The placement operators, contains, overlaps, left/right of
etc. continue to return false when NaNs are involved.  We don't need
to worry about them being considered greater than any-NaN because there
aren't any basic comparison operators like less/greater than for the
geometric datatypes.

The changes may be summarised as:

* Check for underflow, overflow and division by zero
* Consider NaN values to be equal
* Return NULL when the distance is NaN for all closest point operators
* Favour not-NaN over NaN where it makes sense

The patch also replaces all occurrences of "double" as "float8".  They
are the same, but were used inconsistently in the same file.

Author: Emre Hasegeli
Reviewed-by: Kyotaro Horiguchi, Tomas Vondra

Discussion: https://www.postgresql.org/message-id/CAE2gYzxF7-5djV6-cEvqQu-fNsnt%3DEqbOURx7ZDg%2BVv6ZMTWbg%40mail.gmail.com
2018-08-16 19:56:11 +02:00
Tomas Vondra a082aed072 Remove remaining GEODEBUG references from geo_ops.c
Commit a7dc63d904 removed most of the
GEODEBUG occurrences, but there were a couple remaining. So remove
them too, to get rid of the macro entirely.

Author: Emre Hasegeli

Discussion: https://www.postgresql.org/message-id/CAE2gYzxF7-5djV6-cEvqQu-fNsnt%3DEqbOURx7ZDg%2BVv6ZMTWbg%40mail.gmail.com
2018-08-16 19:55:43 +02:00
Tom Lane e1d19c902e Require a C99-compliant snprintf(), and remove related workarounds.
Since our substitute snprintf now returns a C99-compliant result,
there's no need anymore to have complicated code to cope with pre-C99
behavior.  We can just make configure substitute snprintf.c if it finds
that the system snprintf() is pre-C99.  (Note: I do not believe that
there are any platforms where this test will trigger that weren't
already being rejected due to our other C99-ish feature requirements for
snprintf.  But let's add the check for paranoia's sake.)  Then, simplify
the call sites that had logic to cope with the pre-C99 definition.

I also dropped some stuff that was being paranoid about the possibility
of snprintf overrunning the given buffer.  The only reports we've ever
heard of that being a problem were for Solaris 7, which is long dead,
and we've sure not heard any reports of these assertions triggering in
a long time.  So let's drop that complexity too.

Likewise, drop some code that wasn't trusting snprintf to set errno
when it returns -1.  That would be not-per-spec, and again there's
no real reason to believe it is a live issue, especially not for
snprintfs that pass all of configure's feature checks.

Discussion: https://postgr.es/m/17245.1534289329@sss.pgh.pa.us
2018-08-16 13:01:09 -04:00
Alvaro Herrera 1eb9221585 Fix executor prune failure when plan already pruned
In a multi-layer partitioning setup, if at plan time all the
sub-partitions are pruned but the intermediate one remains, the executor
later throws a spurious error that there's nothing to prune.  That is
correct, but there's no reason to throw an error.  Therefore, don't.

Reported-by: Andreas Seltenreich <seltenreich@gmx.de>
Author: David Rowley <david.rowley@2ndquadrant.com>
Discussion: https://postgr.es/m/87in4h98i0.fsf@ansel.ydns.eu
2018-08-16 12:53:43 -03:00
Tomas Vondra fa73b377ee Close the file descriptor in ApplyLogicalMappingFile
The function was forgetting to close the file descriptor, resulting
in failures like this:

  ERROR:  53000: exceeded maxAllocatedDescs (492) while trying to open
  file "pg_logical/mappings/map-4000-4eb-1_60DE1E08-5376b5-537c6b"
  LOCATION:  OpenTransientFile, fd.c:2161

Simply close the file at the end, and backpatch to 9.4 (where logical
decoding was introduced). While at it, fix a nearby typo.

Discussion: https://www.postgresql.org/message-id/flat/738a590a-2ce5-9394-2bef-7b1caad89b37%402ndquadrant.com
2018-08-16 16:49:57 +02:00
Michael Paquier 3593579bcd Update comment in header of errcodes.txt
This file mentions all the files generated from it, but missed that
errcodes-list.sgml is no more, while errcodes-table.sgml is.

Author: Noriyoshi Shinoda
Discussion: https://postgr.es/m/TU4PR8401MB0430855D6B971E49EB55F328EE3E0@TU4PR8401MB0430.NAMPRD84.PROD.OUTLOOK.COM
2018-08-16 09:47:59 +02:00
Thomas Munro ca1e64feba Improve comment in GetNewObjectId().
The previous comment gave the impression that skipping OIDs before
FirstNormalObjectId was merely an optimization to avoid likely collisions.
In fact other parts of the system have been relying on this threshold to
detect system-created objects since commit 8e18d04d4d, so adjust the
wording.

Author: Thomas Munro
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAEepm%3D33JASACeOayr_W3%3DCSjy2jiPxM-k89axu0akFbHdjnjA%40mail.gmail.com
2018-08-16 17:17:30 +12:00
Alvaro Herrera ab7dbd681c Update FSM on WAL replay of page all-visible/frozen
We aren't very strict about keeping FSM up to date on WAL replay,
because per-page freespace values aren't critical in replicas (can't
write to heap in a replica; and if the replica is promoted, the values
would be updated by VACUUM anyway).  However, VACUUM since 9.6 can skip
processing pages marked all-visible or all-frozen, and if such pages are
recorded in FSM with wrong values, those values are blindly propagated
to FSM's upper layers by VACUUM's FreeSpaceMapVacuum.  (This rationale
assumes that crashes are not very frequent, because those would cause
outdated FSM to occur in the primary.)

Even when the FSM is outdated in standby, things are not too bad
normally, because, most per-page FSM values will be zero (other than
those propagated with the base-backup that created the standby); only
once the remaining free space is less than 0.2*BLCKSZ the per-page value
is maintained by WAL replay of heap ins/upd/del.  However, if
wal_log_hints=on causes complete FSM pages to be propagated to a standby
via full-page images, many too-optimistic per-page values can end up
being registered in the standby.

Incorrect per-page values aren't critical in most cases, since an
inserter that is given a page that doesn't actually contain the claimed
free space will update FSM with the correct value, and retry until it
finds a usable page.  However, if there are many such updates to do, an
inserter can spend a long time doing them before a usable page is found;
in a heavily trafficked insert-only table with many concurrent inserters
this has been observed to cause several second stalls, causing visible
application malfunction.

To fix this problem, it seems sufficient to have heap_xlog_visible
(replay of setting all-visible and all-frozen VM bits for a heap page)
update the FSM value for the page being processed.  This fixes the
per-page counters together with making the page skippable to vacuum, so
when vacuum does FreeSpaceMapVacuum, the values propagated to FSM upper
layers are the correct ones, avoiding the problem.

While at it, apply the same fix to heap_xlog_clean (replay of tuple
removal by HOT pruning and vacuum).  This makes any space freed by the
cleaning available earlier than the next vacuum in the promoted replica.

Backpatch to 9.6, where this problem was diagnosed on an insert-only
table with all-frozen pages, which were introduced as a concept in that
release.  Theoretically it could apply with all-visible pages to older
branches, but there's been no report of that and it doesn't backpatch
cleanly anyway.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20180802172857.5skoexsilnjvgruk@alvherre.pgsql
2018-08-15 18:09:29 -03:00
Tom Lane cc4f6b7786 Clean up assorted misuses of snprintf()'s result value.
Fix a small number of places that were testing the result of snprintf()
but doing so incorrectly.  The right test for buffer overrun, per C99,
is "result >= bufsize" not "result > bufsize".  Some places were also
checking for failure with "result == -1", but the standard only says
that a negative value is delivered on failure.

(Note that this only makes these places correct if snprintf() delivers
C99-compliant results.  But at least now these places are consistent
with all the other places where we assume that.)

Also, make psql_start_test() and isolation_start_test() check for
buffer overrun while constructing their shell commands.  There seems
like a higher risk of overrun, with more severe consequences, here
than there is for the individual file paths that are made elsewhere
in the same functions, so this seemed like a worthwhile change.

Also fix guc.c's do_serialize() to initialize errno = 0 before
calling vsnprintf.  In principle, this should be unnecessary because
vsnprintf should have set errno if it returns a failure indication ...
but the other two places this coding pattern is cribbed from don't
assume that, so let's be consistent.

These errors are all very old, so back-patch as appropriate.  I think
that only the shell command overrun cases are even theoretically
reachable in practice, but there's not much point in erroneous error
checks.

Discussion: https://postgr.es/m/17245.1534289329@sss.pgh.pa.us
2018-08-15 16:29:31 -04:00
Peter Eisentraut b68ff3ea67 Remove obsolete netbsd dynloader code
dlopen() has been documented since NetBSD 1.1 (1995).
2018-08-13 23:21:01 +02:00
Peter Eisentraut 29351a06af Remove obsolete openbsd dynloader code
dlopen() has been documented since OpenBSD 2.0 (1996).
2018-08-13 23:21:01 +02:00
Peter Eisentraut 2db1905fdd Remove obsolete freebsd dynloader code
dlopen() has been documented since FreeBSD 3.0 (1989).
2018-08-13 23:21:01 +02:00
Peter Eisentraut 351855fc4e Remove obsolete linux dynloader code
This has been obsolete probably since the late 1990s.
2018-08-13 23:21:01 +02:00
Peter Eisentraut b5d29299cc Remove obsolete darwin dynloader code
not needed since macOS 10.3 (2003)
2018-08-13 23:21:01 +02:00
Peter Eisentraut 3ebdd21b79 Remove obsolete comment
The sequence name is no longer stored in the sequence relation, since
1753b1b027.
2018-08-13 21:07:31 +02:00
Michael Paquier 246a6c8f7b Make autovacuum more aggressive to remove orphaned temp tables
Commit dafa084, added in 10, made the removal of temporary orphaned
tables more aggressive.  This commit makes an extra step into the
aggressiveness by adding a flag in each backend's MyProc which tracks
down any temporary namespace currently in use.  The flag is set when the
namespace gets created and can be reset if the temporary namespace has
been created in a transaction or sub-transaction which is aborted.  The
flag value assignment is assumed to be atomic, so this can be done in a
lock-less fashion like other flags already present in PGPROC like
databaseId or backendId, still the fact that the temporary namespace and
table created are still locked until the transaction creating those
commits acts as a barrier for other backends.

This new flag gets used by autovacuum to discard more aggressively
orphaned tables by additionally checking for the database a backend is
connected to as well as its temporary namespace in-use, removing
orphaned temporary relations even if a backend reuses the same slot as
one which created temporary relations in a past session.

The base idea of this patch comes from Robert Haas, has been written in
its first version by Tsunakawa Takayuki, then heavily reviewed by me.

Author: Tsunakawa Takayuki
Reviewed-by: Michael Paquier, Kyotaro Horiguchi, Andres Freund
Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8A4DC6@G01JPEXMBYT05
Backpatch: 11-, as PGPROC gains a new flag and we don't want silent ABI
breakages on already released versions.
2018-08-13 11:49:04 +02:00
Amit Kapila 4f9a97e417 Adjust comment atop ExecShutdownNode.
After commits a315b967cc and b805b63ac2, part of the comment atop
ExecShutdownNode is redundant.  Adjust it.

Author: Amit Kapila
Backpatch-through: 10 where both the mentioned commits are present.
Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info
2018-08-13 10:04:39 +05:30
Amit Kapila 2cd0acfdad Prohibit shutting down resources if there is a possibility of back up.
Currently, we release the asynchronous resources as soon as it is evident
that no more rows will be needed e.g. when a Limit is filled.  This can be
problematic especially for custom and foreign scans where we can scan
backward.  Fix that by disallowing the shutting down of resources in such
cases.

Reported-by: Robert Haas
Analysed-by: Robert Haas and Amit Kapila
Author: Amit Kapila
Reviewed-by: Robert Haas
Backpatch-through: 9.6 where this code was introduced
Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info
2018-08-13 08:22:18 +05:30
Andrew Gierth 07172d5aff Avoid query-lifetime memory leaks in XMLTABLE (bug #15321)
Multiple calls to XMLTABLE in a query (e.g. laterally applying it to a
table with an xml column, an important use-case) were leaking large
amounts of memory into the per-query context, blowing up memory usage.

Repair by reorganizing memory context usage in nodeTableFuncscan; use
the usual per-tuple context for row-by-row evaluations instead of
perValueCxt, and use the explicitly created context -- renamed from
perValueCxt to perTableCxt -- for arguments and state for each
individual table-generation operation.

Backpatch to PG10 where this code was introduced.

Original report by IRC user begriffs; analysis and patch by me.
Reviewed by Tom Lane and Pavel Stehule.

Discussion: https://postgr.es/m/153394403528.10284.7530399040974170549@wrigleys.postgresql.org
2018-08-13 01:59:45 +01:00
Tom Lane 4a2994f055 Fix wrong order of operations in inheritance_planner.
When considering a partitioning parent rel, we should stop processing that
subroot as soon as we've done adjust_appendrel_attrs and any securityQuals
updates.  The rest of this is unnecessary, and indeed adding duplicate
subquery RTEs to the subroot is *wrong*.  As the code stood, the children
of that partition ended up with two sets of copied subquery RTEs, confusing
matters greatly.  Even more hilarity ensued if all of the children got
excluded by constraint exclusion, so that the extra RTEs didn't make it
back into the parent rtable.

Per fuzz testing by Andreas Seltenreich.  Back-patch to v11 where this
got broken (by commit 0a480502b, it looks like).

Discussion: https://postgr.es/m/87va8g7vq0.fsf@ansel.ydns.eu
2018-08-11 15:53:20 -04:00
Andrew Dunstan 5c047fd709 Revert changes in execMain.c from commit 16828d5c02
These changes were put in at some stage of the development process, but
are unnecessary and should not have made it into the final patch. Mea
culpa.

Per gripe from Andreas Freund

Backpatch to REL_11_STABLE
2018-08-10 16:09:25 -04:00
Peter Geoghegan 4974d7f87e Handle parallel index builds on mapped relations.
Commit 9da0cc3528, which introduced parallel CREATE INDEX, failed to
propagate relmapper.c backend local cache state to parallel worker
processes.  This could result in parallel index builds against mapped
catalog relations where the leader process (participating as a worker)
scans the new, pristine relfilenode, while worker processes scan the
obsolescent relfilenode.  When this happened, the final index structure
was typically not consistent with the owning table's structure.  The
final index structure could contain entries formed from both heap
relfilenodes.  Only rebuilds on mapped catalog relations that occur as
part of a VACUUM FULL or CLUSTER could become corrupt in practice, since
their mapped relation relfilenode swap is what allows the inconsistency
to arise.

On master, fix the problem by propagating the required relmapper.c
backend state as part of standard parallel initialization (Cf. commit
29d58fd3).  On v11, simply disallow builds against mapped catalog
relations by deeming them parallel unsafe.

Author: Peter Geoghegan
Reported-By: "death lock"
Reviewed-By: Tom Lane, Amit Kapila
Bug: #15309
Discussion: https://postgr.es/m/153329671686.1405.18298309097348420351@wrigleys.postgresql.org
Backpatch: 11-, where parallel CREATE INDEX was introduced.
2018-08-10 13:01:34 -07:00
Michael Paquier f841ceb26d Improve TRUNCATE by avoiding early lock queue
A caller of TRUNCATE could previously queue for an access exclusive lock
on a relation it may not have permission to truncate, potentially
interfering with users authorized to work on it.  This can be very
intrusive depending on the lock attempted to be taken.  For example,
pg_authid could be blocked, preventing any authentication attempt to
happen on a PostgreSQL instance.

This commit fixes the case of TRUNCATE so as RangeVarGetRelidExtended is
used with a callback doing the necessary ACL checks at an earlier stage,
avoiding lock queuing issues, so as an immediate failure happens for
unprivileged users instead of waiting on a lock that would not be
taken.

This is rather similar to the type of work done in cbe24a6 for CLUSTER,
and the code of TRUNCATE is this time refactored so as there is no
user-facing changes.  As the commit for CLUSTER, no back-patch is done.

Reported-by: Lloyd Albin, Jeremy Schneider
Author: Michael Paquier
Reviewed by: Nathan Bossart, Kyotaro Horiguchi
Discussion: https://postgr.es/m/152512087100.19803.12733865831237526317@wrigleys.postgresql.org
Discussion: https://postgr.es/m/20180806165816.GA19883@paquier.xyz
2018-08-10 18:26:59 +02:00
Alexander Korotkov 2b13702d5c Fix typo in SP-GiST error message
Error message didn't match the actual check.  Fix that.  Compression of leaf
SP-GiST values was introduced in 11.  So, backpatch.

Discussion: https://postgr.es/m/20180810.100742.15469435.horiguchi.kyotaro%40lab.ntt.co.jp
Author: Kyotaro Horiguchi
Backpatch-through: 11
2018-08-10 17:28:48 +03:00
Heikki Linnakangas 31380bc7c2 Spell "partitionwise" consistently.
I'm not sure which spelling is better, "partitionwise" or "partition-wise",
but everywhere else we spell it "partitionwise", so be consistent.

Tatsuro Yamada reported the one in README, I found the other one with grep.

Discussion: https://www.postgresql.org/message-id/d25ebf36-5a6d-8b2c-1ff3-d6f022a56000@lab.ntt.co.jp
2018-08-09 10:43:18 +03:00
Michael Paquier 661dd23950 Restrict access to reindex of shared catalogs for non-privileged users
A database owner running a database-level REINDEX has the possibility to
also do the operation on shared system catalogs without being an owner
of them, which allows him to block resources it should not have access
to.  The same goes for a schema owner.  For example, PostgreSQL would go
unresponsive and even block authentication if a lock is waited for
pg_authid.  This commit makes sure that a user running a REINDEX SYSTEM,
DATABASE or SCHEMA only works on the following relations:
- The user is a superuser
- The user is the table owner
- The user is the database/schema owner, only if the relation worked on
is not shared.

Robert has worded most the documentation changes, and I have coded the
core part.

Reported-by: Lloyd Albin, Jeremy Schneider
Author: Michael Paquier, Robert Haas
Reviewed by: Nathan Bossart, Kyotaro Horiguchi
Discussion: https://postgr.es/m/152512087100.19803.12733865831237526317@wrigleys.postgresql.org
Discussion: https://postgr.es/m/20180805211059.GA2185@paquier.xyz
Backpatch-through: 11- as the current behavior has been around for a
very long time and could be disruptive for already released branches.
2018-08-09 09:40:15 +02:00
Tom Lane 59ef49d26d Remove bogus Assert in make_partitionedrel_pruneinfo().
This Assert thought that a given rel couldn't be both leaf and
non-leaf, but it turns out that in some unusual plan trees
that's wrong, so remove it.

The lack of testing for cases like that is quite concerning ---
there is little reason for confidence that there aren't other
bugs in the area.  But developing a stable test case seems
rather difficult, and in any case we don't need this Assert.

David Rowley

Discussion: https://postgr.es/m/CAJGNTeOkdk=UVuMugmKL7M=owgt4nNr1wjxMg1F+mHsXyLCzFA@mail.gmail.com
2018-08-08 20:02:32 -04:00
Heikki Linnakangas 8e19a82640 Don't run atexit callbacks in quickdie signal handlers.
exit() is not async-signal safe. Even if the libc implementation is, 3rd
party libraries might have installed unsafe atexit() callbacks. After
receiving SIGQUIT, we really just want to exit as quickly as possible, so
we don't really want to run the atexit() callbacks anyway.

The original report by Jimmy Yih was a self-deadlock in startup_die().
However, this patch doesn't address that scenario; the signal handling
while waiting for the startup packet is more complicated. But at least this
alleviates similar problems in the SIGQUIT handlers, like that reported
by Asim R P later in the same thread.

Backpatch to 9.3 (all supported versions).

Discussion: https://www.postgresql.org/message-id/CAOMx_OAuRUHiAuCg2YgicZLzPVv5d9_H4KrL_OFsFP%3DVPekigA%40mail.gmail.com
2018-08-08 19:10:32 +03:00
Tom Lane 11e22e486d Match RelOptInfos by relids not pointer equality.
Commit 1c2cb2744 added some code that tried to detect whether two
RelOptInfos were the "same" rel by pointer comparison; but it turns
out that inheritance_planner breaks that, through its shenanigans
with copying some relations forward into new subproblems.  Compare
relid sets instead.  Add a regression test case to exercise this
area.

Problem reported by Rushabh Lathia; diagnosis and fix by Amit Langote,
modified a bit by me.

Discussion: https://postgr.es/m/CAGPqQf3anJGj65bqAQ9edDr8gF7qig6_avRgwMT9MsZ19COUPw@mail.gmail.com
2018-08-08 11:44:50 -04:00
Tom Lane 9b7c56d6cb Don't record FDW user mappings as members of extensions.
CreateUserMapping has a recordDependencyOnCurrentExtension call that's
been there since extensions were introduced (very possibly my fault).
However, there's no support anywhere else for user mappings as members
of extensions, nor are they listed as a possible member object type in
the documentation.  Nor does it really seem like a good idea for user
mappings to belong to extensions when roles don't.  Hence, remove the
bogus call.

(As we saw in bug #15310, the lack of any pg_dump support for this case
ensures that any such membership record would silently disappear during
pg_upgrade.  So there's probably no need for us to do anything else
about cleaning up after this mistake.)

Discussion: https://postgr.es/m/27952.1533667213@sss.pgh.pa.us
2018-08-07 16:32:50 -04:00
Tom Lane 41db97399d Fix incorrect initialization of BackendActivityBuffer.
Since commit c8e8b5a6e, this has been zeroed out using the wrong length.
In practice the length would always be too small, leading to not zeroing
the whole buffer rather than clobbering additional memory; and that's
pretty harmless, both because shmem would likely start out as zeroes
and because we'd reinitialize any given entry before use.  Still,
it's bogus, so fix it.

Reported by Petru-Florin Mihancea (bug #15312)

Discussion: https://postgr.es/m/153363913073.1303.6518849192351268091@wrigleys.postgresql.org
2018-08-07 16:00:44 -04:00
Heikki Linnakangas 77291139c7 Remove support for tls-unique channel binding.
There are some problems with the tls-unique channel binding type. It's not
supported by all SSL libraries, and strictly speaking it's not defined for
TLS 1.3 at all, even though at least in OpenSSL, the functions used for it
still seem to work with TLS 1.3 connections. And since we had no
mechanism to negotiate what channel binding type to use, there would be
awkward interoperability issues if a server only supported some channel
binding types. tls-server-end-point seems feasible to support with any SSL
library, so let's just stick to that.

This removes the scram_channel_binding libpq option altogether, since there
is now only one supported channel binding type.

This also removes all the channel binding tests from the SSL test suite.
They were really just testing the scram_channel_binding option, which
is now gone. Channel binding is used if both client and server support it,
so it is used in the existing tests. It would be good to have some tests
specifically for channel binding, to make sure it really is used, and the
different combinations of a client and a server that support or doesn't
support it. The current set of settings we have make it hard to write such
tests, but I did test those things manually, by disabling
HAVE_BE_TLS_GET_CERTIFICATE_HASH and/or
HAVE_PGTLS_GET_PEER_CERTIFICATE_HASH.

I also removed the SCRAM_CHANNEL_BINDING_TLS_END_POINT constant. This is a
matter of taste, but IMO it's more readable to just use the
"tls-server-end-point" string.

Refactor the checks on whether the SSL library supports the functions
needed for tls-server-end-point channel binding. Now the server won't
advertise, and the client won't choose, the SCRAM-SHA-256-PLUS variant, if
compiled with an OpenSSL version too old to support it.

In the passing, add some sanity checks to check that the chosen SASL
mechanism, SCRAM-SHA-256 or SCRAM-SHA-256-PLUS, matches whether the SCRAM
exchange used channel binding or not. For example, if the client selects
the non-channel-binding variant SCRAM-SHA-256, but in the SCRAM message
uses channel binding anyway. It's harmless from a security point of view,
I believe, and I'm not sure if there are some other conditions that would
cause the connection to fail, but it seems better to be strict about these
things and check explicitly.

Discussion: https://www.postgresql.org/message-id/ec787074-2305-c6f4-86aa-6902f98485a4%40iki.fi
2018-08-05 13:44:21 +03:00
Tom Lane b8a1247a34 Fix INSERT ON CONFLICT UPDATE through a view that isn't just SELECT *.
When expanding an updatable view that is an INSERT's target, the rewriter
failed to rewrite Vars in the ON CONFLICT UPDATE clause.  This accidentally
worked if the view was just "SELECT * FROM ...", as the transformation
would be a no-op in that case.  With more complicated view targetlists,
this omission would often lead to "attribute ... has the wrong type" errors
or even crashes, as reported by Mario De Frutos Dieguez.

Fix by adding code to rewriteTargetView to fix up the data structure
correctly.  The easiest way to update the exclRelTlist list is to rebuild
it from scratch looking at the new target relation, so factor the code
for that out of transformOnConflictClause to make it sharable.

In passing, avoid duplicate permissions checks against the EXCLUDED
pseudo-relation, and prevent useless view expansion of that relation's
dummy RTE.  The latter is only known to happen (after this patch) in cases
where the query would fail later due to not having any INSTEAD OF triggers
for the view.  But by exactly that token, it would create an unintended
and very poorly tested state of the query data structure, so it seems like
a good idea to prevent it from happening at all.

This has been broken since ON CONFLICT was introduced, so back-patch
to 9.5.

Dean Rasheed, based on an earlier patch by Amit Langote;
comment-kibitzing and back-patching by me

Discussion: https://postgr.es/m/CAFYwGJ0xfzy8jaK80hVN2eUWr6huce0RU8AgU04MGD00igqkTg@mail.gmail.com
2018-08-04 19:38:58 -04:00
Michael Paquier 5a23c74b63 Reset properly errno before calling write()
6cb3372 enforces errno to ENOSPC when less bytes than what is expected
have been written when it is unset, though it forgot to properly reset
errno before doing a system call to write(), causing errno to
potentially come from a previous system call.

Reported-by: Tom Lane
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/31797.1533326676@sss.pgh.pa.us
2018-08-05 05:31:18 +09:00
Peter Geoghegan b3f919da07 Add table relcache invalidation to index builds.
It's necessary to make sure that owning tables have a relcache
invalidation prior to advancing the command counter to make
newly-entered catalog tuples for the index visible.  inval.c must be
able to maintain the consistency of the local caches in the event of
transaction abort.  There is usually only a problem when CREATE INDEX
transactions abort, since there is a generic invalidation once we reach
index_update_stats().

This bug is of long standing.  Problems were made much more likely by
the addition of parallel CREATE INDEX (commit 9da0cc3528), but it is
strongly suspected that similar problems can be triggered without
involving plan_create_index_workers().  (plan_create_index_workers()
triggers a relcache build or rebuild, which previously only happened in
rare edge cases.)

Author: Peter Geoghegan
Reported-By: Luca Ferrari
Diagnosed-By: Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAKoxK+5fVodiCtMsXKV_1YAKXbzwSfp7DgDqUmcUAzeAhf=HEQ@mail.gmail.com
Backpatch: 9.3-
2018-08-03 15:11:31 -07:00
Amit Kapila 85c9d3475e Fix buffer usage stats for parallel nodes.
The buffer usage stats is accounted only for the execution phase of the
node.  For Gather and Gather Merge nodes, such stats are accumulated at
the time of shutdown of workers which is done after execution of node due
to which we missed to account them for such nodes.  Fix it by treating
nodes as running while we shut down them.

We can also miss accounting for a Limit node when Gather or Gather Merge
is beneath it, because it can finish the execution before shutting down
such nodes.  So we allow a Limit node to shut down the resources before it
completes the execution.

In the passing fix the gather node code to allow workers to shut down as
soon as we find that all the tuples from the workers have been retrieved.
The original code use to do that, but is accidently removed by commit
01edb5c7fc.

Reported-by: Adrien Nayrat
Author: Amit Kapila and Robert Haas
Reviewed-by: Robert Haas and Andres Freund
Backpatch-through: 9.6 where this code was introduced
Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info
2018-08-03 11:02:02 +05:30
Amit Kapila ccc84a956b Match the buffer usage tracking for leader and worker backends.
In the leader backend, we don't track the buffer usage for ExecutorStart
phase whereas in worker backend we track it for ExecutorStart phase as
well.  This leads to different value for buffer usage stats for the
parallel and non-parallel query.  Change the code so that worker backend
also starts tracking buffer usage after ExecutorStart.

Author: Amit Kapila and Robert Haas
Reviewed-by: Robert Haas and Andres Freund
Backpatch-through: 9.6 where this code was introduced
Discussion: https://postgr.es/m/86137f17-1dfb-42f9-7421-82fd786b04a1@anayrat.info
2018-08-03 09:11:37 +05:30
Tom Lane 1c2cb2744b Fix run-time partition pruning for appends with multiple source rels.
The previous coding here supposed that if run-time partitioning applied to
a particular Append/MergeAppend plan, then all child plans of that node
must be members of a single partitioning hierarchy.  This is totally wrong,
since an Append could be formed from a UNION ALL: we could have multiple
hierarchies sharing the same Append, or child plans that aren't part of any
hierarchy.

To fix, restructure the related plan-time and execution-time data
structures so that we can have a separate list or array for each
partitioning hierarchy.  Also track subplans that are not part of any
hierarchy, and make sure they don't get pruned.

Per reports from Phil Florent and others.  Back-patch to v11, since
the bug originated there.

David Rowley, with a lot of cosmetic adjustments by me; thanks also
to Amit Langote for review.

Discussion: https://postgr.es/m/HE1PR03MB17068BB27404C90B5B788BCABA7B0@HE1PR03MB1706.eurprd03.prod.outlook.com
2018-08-01 19:42:52 -04:00
Alvaro Herrera c40489e449 Fix logical replication slot initialization
This was broken in commit 9c7d06d606, which inadvertently gave the
wrong value to fast_forward in one StartupDecodingContext call.  Fix by
flipping the value.  Add a test for the obvious error, namely trying to
initialize a replication slot with an nonexistent output plugin.

While at it, move the CreateDecodingContext call earlier, so that any
errors are reported before sending the CopyBoth message.

Author: Dave Cramer <davecramer@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CADK3HHLVkeRe1v4P02-5hj55H3_yJg3AEtpXyEY5T3wuzO2jSg@mail.gmail.com
2018-08-01 17:47:15 -04:00
Alvaro Herrera 91bc213d90 Fix unnoticed variable shadowing in previous commit
Per buildfarm.
2018-08-01 17:04:57 -04:00
Alvaro Herrera 1c9bb02d8e Fix per-tuple memory leak in partition tuple routing
Some operations were being done in a longer-lived memory context,
causing intra-query leaks.  It's not noticeable unless you're doing a
large COPY, but if you are, it eats enough memory to cause a problem.

Co-authored-by: Kohei KaiGai <kaigai@heterodb.com>
Co-authored-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAOP8fzYtVFWZADq4c=KoTAqgDrHWfng+AnEPEZccyxqxPVbbWQ@mail.gmail.com
2018-08-01 16:29:15 -04:00
Peter Eisentraut 0d5f05cde0 Allow multi-inserts during COPY into a partitioned table
CopyFrom allows multi-inserts to be used for non-partitioned tables, but
this was disabled for partitioned tables.  The reason for this appeared
to be that the tuple may not belong to the same partition as the
previous tuple did.  Not allowing multi-inserts here greatly slowed down
imports into partitioned tables.  These could take twice as long as a
copy to an equivalent non-partitioned table.  It seems wise to do
something about this, so this change allows the multi-inserts by
flushing the so-far inserted tuples to the partition when the next tuple
does not belong to the same partition, or when the buffer fills.  This
improves performance when the next tuple in the stream commonly belongs
to the same partition as the previous tuple.

In cases where the target partition changes on every tuple, using
multi-inserts slightly slows the performance.  To get around this we
track the average size of the batches that have been inserted and
adaptively enable or disable multi-inserts based on the size of the
batch.  Some testing was done and the regression only seems to exist
when the average size of the insert batch is close to 1, so let's just
enable multi-inserts when the average size is at least 1.3.  More
performance testing might reveal a better number for, this, but since
the slowdown was only 1-2% it does not seem critical enough to spend too
much time calculating it.  In any case it may depend on other factors
rather than just the size of the batch.

Allowing multi-inserts for partitions required a bit of work around the
per-tuple memory contexts as we must flush the tuples when the next
tuple does not belong the same partition.  In which case there is no
good time to reset the per-tuple context, as we've already built the new
tuple by this time.  In order to work around this we maintain two
per-tuple contexts and just switch between them every time the partition
changes and reset the old one.  This does mean that the first of each
batch of tuples is not allocated in the same memory context as the
others, but that does not matter since we only reset the context once
the previous batch has been inserted.

Author: David Rowley <david.rowley@2ndquadrant.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
2018-08-01 10:23:09 +02:00
Tom Lane f3eb76b399 Further fixes for quoted-list GUC values in pg_dump and ruleutils.c.
Commits 742869946 et al turn out to be a couple bricks shy of a load.
We were dumping the stored values of GUC_LIST_QUOTE variables as they
appear in proconfig or setconfig catalog columns.  However, although that
quoting rule looks a lot like SQL-identifier double quotes, there are two
critical differences: empty strings ("") are legal, and depending on which
variable you're considering, values longer than NAMEDATALEN might be valid
too.  So the current technique fails altogether on empty-string list
entries (as reported by Steven Winfield in bug #15248) and it also risks
truncating file pathnames during dump/reload of GUC values that are lists
of pathnames.

To fix, split the stored value without any downcasing or truncation,
and then emit each element as a SQL string literal.

This is a tad annoying, because we now have three copies of the
comma-separated-string splitting logic in varlena.c as well as a fourth
one in dumputils.c.  (Not to mention the randomly-different-from-those
splitting logic in libpq...)  I looked at unifying these, but it would
be rather a mess unless we're willing to tweak the API definitions of
SplitIdentifierString, SplitDirectoriesString, or both.  That might be
worth doing in future; but it seems pretty unsafe for a back-patched
bug fix, so for now accept the duplication.

Back-patch to all supported branches, as the previous fix was.

Discussion: https://postgr.es/m/7585.1529435872@sss.pgh.pa.us
2018-07-31 13:00:14 -04:00
Tom Lane 6574f19127 Remove dead code left behind by 1b6801051. 2018-07-30 19:11:02 -04:00
Alvaro Herrera d25d45e4d9 Verify range bounds to bms_add_range when necessary
Now that the bms_add_range boundary protections are gone, some
alternative ones are needed in a few places.

Author: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Discussion: https://postgr.es/m/3437ccf8-a144-55ff-1e2f-fc16b437823b@lab.ntt.co.jp
2018-07-30 18:45:39 -04:00
Alvaro Herrera 1b68010518 Change bms_add_range to be a no-op for empty ranges
In commit 84940644de, bms_add_range was added with an API to fail with
an error if an empty range was specified.  This seems arbitrary and
unhelpful, so turn that case into a no-op instead.  Callers that require
further verification on the arguments or result can apply them by
themselves.

This fixes the bug that partition pruning throws an API error for a case
involving the default partition of a default partition, as in the
included test case.

Reported-by: Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/16590.1532622503@sss.pgh.pa.us
2018-07-30 18:44:33 -04:00
Alvaro Herrera 4f10e7ea7b Set ActiveSnapshot when logically replaying inserts
Input functions for the inserted tuples may require a snapshot, when
they are replayed by native logical replication.  An example is a domain
with a constraint using a SQL-language function, which prior to this
commit failed to apply on the subscriber side.

Reported-by: Mai Peng <maily.peng@webedia-group.com>
Co-authored-by: Minh-Quan TRAN <qtran@itscaro.me>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/4EB4BD78-BFC3-4D04-B8DA-D53DF7160354@webedia-group.com
Discussion: https://postgr.es/m/153211336163.1404.11721804383024050689@wrigleys.postgresql.org
2018-07-30 16:30:07 -04:00
Peter Eisentraut 98efa76fe3 Add ssl_library preset parameter
This allows querying the SSL implementation used on the server side.
It's analogous to using PQsslAttribute(conn, "library") in libpq.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
2018-07-30 13:46:27 +02:00
Tomas Vondra ab87b8fedc Mark variable used only in assertion with PG_USED_FOR_ASSERTS_ONLY
Perpendicular lines always intersect, so the line_interpt_line() return
value in line_closept_point() was used only in an assertion, triggering
compiler warnings in non-assert builds.
2018-07-29 23:08:00 +02:00
Tomas Vondra 74294c7301 Restore handling of -0 in the C field of lines in line_construct().
Commit a7dc63d904 inadvertedly removed this bit originally introduced
by 43fe90f66a, causing regression test failures on some platforms,
due to producing {1,-1,-0} instead of {1,-1,0}.
2018-07-29 21:11:05 +02:00
Michael Paquier 9f7ba88aa4 Fix two oversights from 9ebe0572 which refactored cluster_rel
The recheck option became a no-op as ClusterOption failed to set proper
values for each element.  There was a second code path where local
options got overwritten.

Both issues have been spotted by Coverity.
2018-07-29 22:00:42 +09:00
Noah Misch e09144e6ce Document security implications of qualified names.
Commit 5770172cb0 documented secure schema
usage, and that advice suffices for using unqualified names securely.
Document, in typeconv-func primarily, the additional issues that arise
with qualified names.  Back-patch to 9.3 (all supported versions).

Reviewed by Jonathan S. Katz.

Discussion: https://postgr.es/m/20180721012446.GA1840594@rfd.leadboat.com
2018-07-28 20:08:01 -07:00
Tomas Vondra 6bf0bc842b Provide separate header file for built-in float types
Some data types under adt/ have separate header files, but most simple
ones do not, and their public functions are defined in builtins.h.  As
the patches improving geometric types will require making additional
functions public, this seems like a good opportunity to create a header
for floats types.

Commit 1acf757255 made _cmp functions public to solve NaN issues locally
for GiST indexes.  This patch reworks it in favour of a more widely
applicable API.  The API uses inline functions, as they are easier to
use compared to macros, and avoid double-evaluation hazards.

Author: Emre Hasegeli
Reviewed-by: Kyotaro Horiguchi

Discussion: https://www.postgresql.org/message-id/CAE2gYzxF7-5djV6-cEvqQu-fNsnt%3DEqbOURx7ZDg%2BVv6ZMTWbg%40mail.gmail.com
2018-07-29 03:30:48 +02:00
Tomas Vondra a7dc63d904 Refactor geometric functions and operators
The primary goal of this patch is to eliminate duplicate code and share
code between different geometric data types more often, to prepare the
ground for additional patches.  Until now the code reuse was limited,
probably because the simpler types (line and point) were implemented
after the more complex ones.

The changes are quite extensive and can be summarised as:

* Eliminate SQL-level function calls.
* Re-use more functions to implement others.
* Unify internal function names and signatures.
* Remove private functions from geo_decls.h.
* Replace should-not-happen checks with assertions.
* Add comments describe for various functions.
* Remove some unreachable code.
* Define delimiter symbols of line datatype like the other ones.
* Remove the GEODEBUG macro and printf() calls.
* Unify code style of a few oddly formatted lines.

While the goal was to cause minimal user-visible changes, it was not
possible to keep the original behavior in all cases - for example when
handling NaN values, or when reusing code makes the functions return
consistent results.

Author: Emre Hasegeli
Reviewed-by: Kyotaro Horiguchi, me

Discussion: https://www.postgresql.org/message-id/CAE2gYzxF7-5djV6-cEvqQu-fNsnt%3DEqbOURx7ZDg%2BVv6ZMTWbg%40mail.gmail.com
2018-07-29 02:36:29 +02:00
Alexander Korotkov d2086b08b0 Reduce path length for locking leaf B-tree pages during insertion
In our B-tree implementation appropriate leaf page for new tuple
insertion is acquired using _bt_search() function.  This function always
returns leaf page locked in shared mode.  In order to obtain exclusive
lock, caller have to relock the page.

This commit makes _bt_search() function lock leaf page immediately in
exclusive mode when needed.  That removes unnecessary relock and, in
turn reduces lock contention for B-tree leaf pages.  Our experiments
on multi-core systems showed acceleration up to 4.5 times in corner
case.

Discussion: https://postgr.es/m/CAPpHfduAMDFMNYTCN7VMBsFg_hsf0GqiqXnt%2BbSeaJworwFoig%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Yoshikazu Imai, Simon Riggs, Peter Geoghegan
2018-07-28 00:31:40 +03:00
Alvaro Herrera 8a9b72c3ea Fix grammar in README.tuplock
Author: Brad DeJong
Discussion: https://postgr.es/m/CAJnrtnxrA4FqZi0Z6kGPQKMiZkWv2xxgSDQ+hv1jDrf8WCKjjw@mail.gmail.com
2018-07-27 10:56:30 -04:00
Robert Haas 3e32109049 Use key and partdesc from PartitionDispatch where possible.
Instead of repeatedly fishing the data out of the relcache entry,
let's use the version that we cached in the PartitionDispatch.  We
could alternatively rip out the PartitionDispatch fields altogether,
but it doesn't make much sense to have them and not use them; before
this patch, partdesc was set but altogether unused.  Amit Langote and
I both thought using them was a litle better than removing them, so
this patch takes that approach.

Discussion: http://postgr.es/m/CA+TgmobFnxcaW-Co-XO8=yhJ5pJXoNkCj6Z7jm9Mwj9FGv-D7w@mail.gmail.com
2018-07-27 09:40:52 -04:00
Amit Kapila 8ce29bb4f0 Fix the buffer release order for parallel index scans.
During parallel index scans, if the current page to be read is deleted, we
skip it and try to get the next page for a scan without releasing the buffer
lock on the current page.  To get the next page, sometimes it needs to wait
for another process to complete its scan and advance it to the next page.
Now, it is quite possible that the master backend has errored out before
advancing the scan and issued a termination signal for all workers.  The
workers failed to notice the termination request during wait because the
interrupts are held due to buffer lock on the previous page.  This lead to
all workers being stuck.

The fix is to release the buffer lock on current page before trying to get
the next page.  We are already doing same in backward scans, but missed
it for forward scans.

Reported-by: Victor Yegorov
Bug: 15290
Diagnosed-by: Thomas Munro and Amit Kapila
Author: Amit Kapila
Reviewed-by: Thomas Munro
Tested-By: Thomas Munro and Victor Yegorov
Backpatch-through: 10 where parallel index scans were introduced
Discussion: https://postgr.es/m/153228422922.1395.1746424054206154747@wrigleys.postgresql.org
2018-07-27 10:53:00 +05:30
Tom Lane 662d12aea1 Avoid crash in eval_const_expressions if a Param's type changes.
Since commit 6719b238e it's been possible for the values of plpgsql
record field variables to be exposed to the planner as Params.
(Before that, plpgsql never supplied values for such variables during
planning, so that the problematic code wasn't reached.)  Other places
that touch potentially-type-mutable Params either cope gracefully or
do runtime-test-and-ereport checks that the type is what they expect.
But eval_const_expressions() just had an Assert, meaning that it either
failed the assertion or risked crashes due to using an incompatible
value.

In this case, rather than throwing an ereport immediately, we can just
not perform a const-substitution in case of a mismatch.  This seems
important for the same reason that the Param fetch was speculative:
we might not actually reach this part of the expression at runtime.

Test case will follow in a separate commit.

Patch by me, pursuant to bug report from Andrew Gierth.
Back-patch to v11 where the previous commit appeared.

Discussion: https://postgr.es/m/87wotkfju1.fsf@news-spur.riddles.org.uk
2018-07-26 16:08:45 -04:00
Andres Freund 3acc4acd9b LLVMJIT: Release JIT context after running ExprContext shutdown callbacks.
Due to inlining it previously was possible that an ExprContext's
shutdown callback pointed to a JITed function. As the JIT context
previously was shut down before the shutdown callbacks were called,
that could lead to segfaults.  Fix the ordering.

Reported-By: Dmitry Dolgov
Author: Andres Freund
Discussion: https://postgr.es/m/CA+q6zcWO7CeAJtHBxgcHn_hj+PenM=tvG0RJ93X1uEJ86+76Ug@mail.gmail.com
Backpatch: 11-, where JIT compilation was added
2018-07-25 16:31:49 -07:00
Andres Freund bcafa263ec LLVMJIT: Check for 'noinline' attribute in recursively inlined functions.
Previously the attribute was only checked for external functions
inlined, not "static" functions that had to be inlined as
dependencies.

This isn't really a bug, but makes debugging a bit harder. The new
behaviour also makes more sense. Therefore backpatch.

Author: Andres Freund
Backpatch: 11-, where JIT compilation was added
2018-07-25 16:23:59 -07:00
Thomas Munro 2d30675952 Pad semaphores to avoid false sharing.
In a USE_UNNAMED_SEMAPHORES build, the default on Linux and FreeBSD
since commit ecb0d20a, we have an array of sem_t objects.  This
turned out to reduce performance compared to the previous default
USE_SYSV_SEMAPHORES on an 8 socket system.  Testing showed that the
lost performance could be regained by padding the array elements so
that they have their own cache lines.  This matches what we do for
similar hot arrays (see LWLockPadded, WALInsertLockPadded).

Back-patch to 10, where unnamed semaphores were adopted as the default
semaphore interface on those operating systems.

Author: Thomas Munro
Reviewed-by: Andres Freund
Reported-by: Mithun Cy
Tested-by: Mithun Cy, Tom Lane, Thomas Munro
Discussion: https://postgr.es/m/CAD__OugYDM3O%2BdyZnnZSbJprSfsGFJcQ1R%3De59T3hcLmDug4_w%40mail.gmail.com
2018-07-25 11:00:29 +12:00
Andres Freund b2bb3dc0e0 Defend against some potential spurious compiler warnings in 86eaf208e.
Author: David Rowley
Discussion: https://postgr.es/m/CAKJS1f-AbCFeFU92GZZYqNOVRnPtUwczSYmR2NHCyf9uHUnNiw@mail.gmail.com
2018-07-24 10:10:22 -07:00
Michael Paquier 9ebe0572ce Refactor cluster_rel() to handle more options
This extends cluster_rel() in such a way that more options can be added
in the future, which will reduce the amount of chunk code for an
upcoming SKIP_LOCKED aimed for VACUUM.  As VACUUM FULL is a different
flavor of CLUSTER, we want to make that extensible to ease integration.

This only reworks the API and its callers, without providing anything
user-facing.  Two options are present now: verbose mode and relation
recheck when doing the cluster command work across multiple
transactions.  This could be used as well as a base to extend the
grammar of CLUSTER later on.

Author: Michael Paquier
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/20180723031058.GE2854@paquier.xyz
2018-07-24 11:37:32 +09:00
Michael Paquier d9fadbf131 Fix calculation for WAL segment recycling and removal
Commit 4b0d28de06 has removed the prior checkpoint and related
facilities but has left WAL recycling based on the LSN of the prior
checkpoint, which causes incorrect calculations for WAL removal and
recycling for max_wal_size and min_wal_size.  This commit changes things
so as the base calculation point is the last checkpoint generated.

Reported-by: Kyotaro Horiguchi
Author: Kyotaro Horiguchi
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/20180723.135748.42558387.horiguchi.kyotaro@lab.ntt.co.jp
Backpatch: 11-, where the prior checkpoint has been removed.
2018-07-24 10:32:56 +09:00
Thomas Munro 1bc180cd2a Use setproctitle_fast() to update the ps status, if available.
FreeBSD has introduced a faster variant of setproctitle().  Use it,
where available.

Author: Thomas Munro
Discussion: https://postgr.es/m/CAEepm=1wKMTi81uodJ=1KbJAz5WedOg=cr8ewEXrUFeaxWEgww@mail.gmail.com
2018-07-24 13:09:22 +12:00
Andres Freund e9a9843e13 LLVMJIT: Adapt to API changes in gdb and perf support.
During the work of upstreaming my previous patches for gdb and perf
support the API changed. Adapt.  Normally this wouldn't necessarily be
something to backpatch, but the previous API wasn't upstream, and at
least the gdb support is quite useful for debugging.

Author: Andres Freund
Backpatch: 11, where LLVM based JIT support was added.
2018-07-22 21:13:34 -07:00
Andres Freund a38b833a7c LLVMJIT: Fix LLVM build for LLVM > 7.
The location of LLVMAddPromoteMemoryToRegisterPass moved.

Author: Andres Freund
Backpatch: 11, where LLVM based JIT support was added.
2018-07-22 21:13:02 -07:00
Andres Freund 1307bc3d45 Reset context at the tail end of JITed EEOP_AGG_PLAIN_TRANS.
While no negative consequences are currently known, it's clearly wrong
to not reset the context in one of the branches.

Reported-By: Dmitry Dolgov
Author: Dmitry Dolgov
Discussion: https://postgr.es/m/CAGPqQf165-=+Drw3Voim7M5EjHT1zwPF9BQRjLFQzCzYnNZEiQ@mail.gmail.com
Backpatch: 11-, where JIT compilation support was added
2018-07-22 20:31:22 -07:00
Michael Paquier e41d0a1090 Add proper errcodes to new error messages for read() failures
Those would use the default ERRCODE_INTERNAL_ERROR, but for foreseeable
failures an errcode ought to be set, ERRCODE_DATA_CORRUPTED making the
most sense here.

While on the way, fix one errcode_for_file_access missing in origin.c
since the code has been created, and remove one assignment of errno to 0
before calling read(), as this was around to fit with what was present
before 811b6e36 where errno would not be set when not enough bytes are
read.  I have noticed the first one, and Tom has pinged me about the
second one.

Author: Michael Paquier
Reported-by: Tom Lane
Discussion: https://postgr.es/m/27265.1531925836@sss.pgh.pa.us
2018-07-23 09:37:36 +09:00
Michael Paquier 56df07bb9e Make more consistent some error messages for file-related operations
Some error messages which report something about a file operation use
as well context which is already provided within the path being worked
on, making things rather duplicated.  This creates more work for
translators, and does not actually bring clarity.

More could be done, however in a lot of cases the context used is
actually useful, still that patch gets down things with a good cut.

Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi, Tom Lane
Discussion: https://postgr.es/m/20180718044711.GA8565@paquier.xyz
2018-07-23 09:19:12 +09:00