Commit Graph

30810 Commits

Author SHA1 Message Date
Tom Lane 737639017c Fix bogus size calculation in strlist_to_textarray().
It's making an array of Datum, not an array of text *.  The mistake
is harmless since those are currently the same size, but it's still
wrong.
2017-09-23 15:01:59 -04:00
Tom Lane 335f3d04e4 Improve memory management in autovacuum.c.
Invoke vacuum(), as well as "work item" processing, in the PortalContext
that do_autovacuum() has manufactured, which will be reset before each
such invocation.  This ensures cleanup of any memory leaked by these
operations.  It also avoids the rather dangerous practice of calling
vacuum() in a context that vacuum() itself will destroy while it runs.
There's no known live bug there, but it's not hard to imagine introducing
one if we leave it like this.

Tom Lane, reviewed by Michael Paquier and Alvaro Herrera

Discussion: https://postgr.es/m/13849.1506114543@sss.pgh.pa.us
2017-09-23 13:28:16 -04:00
Tom Lane ad51c6fb57 Remove pgbench "progress" test pending solution of its timing issues.
Buildfarm member skink shows that this is even more flaky than
I thought.  There are probably some actual pgbench bugs here
as well as a timing dependency.  But we can't have stuff this
unstable in the buildfarm, it obscures other issues.
2017-09-23 13:02:30 -04:00
Tom Lane 01c7d3ef85 Ten-second timeout in 013_crash_restart.pl is not enough, let's try 60.
Per buildfarm member topminnow.
2017-09-23 12:56:31 -04:00
Peter Eisentraut 0c5803b450 Refactor new file permission handling
The file handling functions from fd.c were called with a diverse mix of
notations for the file permissions when they were opening new files.
Almost all files created by the server should have the same permissions
set.  So change the API so that e.g. OpenTransientFile() automatically
uses the standard permissions set, and OpenTransientFilePerm() is a new
function that takes an explicit permissions set for the few cases where
it is needed.  This also saves an unnecessary argument for call sites
that are just opening an existing file.

While we're reviewing these APIs, get rid of the FileName typedef and
use the standard const char * for the file name and mode_t for the file
mode.  This makes these functions match other file handling functions
and removes an unnecessary layer of mysteriousness.  We can also get rid
of a few casts that way.

Author: David Steele <david@pgmasters.net>
2017-09-23 10:16:18 -04:00
Alvaro Herrera 404ba54e8f Test BRIN autosummarization
There was no coverage for this code.

Reported-by: Nikolay Shaplov, Tom Lane
Discussion: https://postgr.es/m/2700647.XEouBYNZic@x200m
	https://postgr.es/m/13849.1506114543@sss.pgh.pa.us
2017-09-23 14:15:06 +02:00
Peter Eisentraut aa6b7b72d9 Fix saving and restoring umask
In two cases, we set a different umask for some piece of code and
restore it afterwards.  But if the contained code errors out, the umask
is not restored.  So add TRY/CATCH blocks to fix that.
2017-09-22 17:10:36 -04:00
Peter Eisentraut 58ffe141eb Revert "Add basic TAP test setup for pg_upgrade"
This reverts commit f41e56c76e.

The build farm client would run the pg_upgrade tests twice, once as part
of the existing pg_upgrade check run and once as part of picking up all
TAP tests by looking for "t" directories.  Since the pg_upgrade tests
are pretty slow, we will need a better solution or possibly a build farm
client change before we can proceed with this.
2017-09-22 16:46:56 -04:00
Andres Freund 791961f59b Add inline murmurhash32(uint32) function.
The function already existed in tidbitmap.c but more users requiring
fast hashing of 32bit ints are coming up.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-09-22 13:38:42 -07:00
Andres Freund 8d926029e8 Expand expected output for recovery test even further.
I'd assumed that the backend being killed should be able to get out an
error message - but it turns out it's not guaranteed that it's not
still sending a ready-for-query.  Really need to do something about
getting these error message to the client.

Reported-By: Thomas Munro, Tom Lane
Discussion:
	https://postgr.es/m/CAEepm=0TE90nded+bNthP45_PEvGAAr=3gxhHJObL4xmOLtX0w@mail.gmail.com
	https://postgr.es/m/14968.1506101414@sss.pgh.pa.us
2017-09-22 11:35:26 -07:00
Andres Freund f9583e86b4 Fix s/intidb/initdb/ typo.
Reported-By: Michael Paquier
Discussion: https://postgr.es/m/CAB7nPqTfaKAYZ4wuUM-W8kc4VnXrxX1=5-a9i==VoUPTMFpsgg@mail.gmail.com
2017-09-22 11:35:26 -07:00
Robert Haas 6a2fa09c0c For wal_consistency_checking, mask page checksum as well as page LSN.
If the LSN is different, the checksum will be different, too.

Ashwin Agrawal, reviewed by Michael Paquier and Kuntal Ghosh

Discussion: http://postgr.es/m/CALfoeis5iqrAU-+JAN+ZzXkpPr7+-0OAGv7QUHwFn=-wDy4o4Q@mail.gmail.com
2017-09-22 14:28:22 -04:00
Robert Haas 7c75ef5715 hash: Implement page-at-a-time scan.
Commit 09cb5c0e7d added a similar
optimization to btree back in 2006, but nobody bothered to implement
the same thing for hash indexes, probably because they weren't
WAL-logged and had lots of other performance problems as well.  As
with the corresponding btree case, this eliminates the problem of
potentially needing to refind our position within the page, and cuts
down on pin/unpin traffic as well.

Ashutosh Sharma, reviewed by Alexander Korotkov, Jesper Pedersen,
Amit Kapila, and me.  Some final edits to comments and README by
me.

Discussion: http://postgr.es/m/CAE9k0Pm3KTx93K8_5j6VMzG4h5F+SyknxUwXrN-zqSZ9X8ZS3w@mail.gmail.com
2017-09-22 13:56:27 -04:00
Tom Lane 0f574a7afb Allow up to 3 "-P 1" reports per thread in pgbench run of 2 seconds.
There seems to be some considerable imprecision in the timing of -P
progress reports.  Nominally each thread ought to produce 2 reports
in this test, but about 10% of the time we only get one, and 1% of
the time we get three, as per buildfarm results so far.  Pending
further investigation, treat the last case as a "pass".  (I, tgl,
am suspicious that this still might not be lax enough, now that it's
obvious that the behavior is load-dependent; but there's not yet
buildfarm evidence to confirm that suspicion.)

Fabien Coelho

Discussion: https://postgr.es/m/26654.1505232433@sss.pgh.pa.us
2017-09-22 12:59:44 -04:00
Tom Lane ed87e19807 Mop-up for commit 85feb77aa0.
Adjust commentary in regc_pg_locale.c to remove mention of the possibility
of not having <wctype.h> functions, since we no longer consider that.

Eliminate duplicate code in wparser_def.c by generalizing the p_iswhat
macro to take a parameter saying what to return for non-ASCII chars
in C locale.  (That's not really a consequence of the
USE_WIDE_UPPER_LOWER-ectomy, but I noticed it while doing that.)
2017-09-22 11:35:12 -04:00
Tom Lane 85feb77aa0 Assume wcstombs(), towlower(), and sibling functions are always present.
These functions are required by SUS v2, which is our minimum baseline
for Unix platforms, and are present on all interesting Windows versions
as well.  Even our oldest buildfarm members have them.  Thus, we were not
testing the "!USE_WIDE_UPPER_LOWER" code paths, which explains why the bug
fixed in commit e6023ee7f escaped detection.  Per discussion, there seems
to be no more real-world value in maintaining this option.  Hence, remove
the configure-time tests for wcstombs() and towlower(), remove the
USE_WIDE_UPPER_LOWER symbol, and remove all the !USE_WIDE_UPPER_LOWER code.
There's not actually all that much of the latter, but simplifying the #if
nests is a win in itself.

Discussion: https://postgr.es/m/20170921052928.GA188913@rfd.leadboat.com
2017-09-22 11:00:58 -04:00
Peter Eisentraut e6023ee7fa Fix build with !USE_WIDE_UPPER_LOWER
The placement of the ifdef blocks in formatting.c was pretty bogus, so
the code failed to compile if USE_WIDE_UPPER_LOWER was not defined.

Reported-by: Peter Geoghegan <pg@bowt.ie>
Reported-by: Noah Misch <noah@leadboat.com>
2017-09-22 09:26:38 -04:00
Tom Lane 47f849a3c9 Sync our copy of the timezone library with IANA tzcode master.
This patch absorbs a few unreleased fixes in the IANA code.
It corresponds to commit 2d8b944c1cec0808ac4f7a9ee1a463c28f9cd00a
in https://github.com/eggert/tz.  Non-cosmetic changes include:

TZDEFRULESTRING is updated to match current US DST practice,
rather than what it was over ten years ago.  This only matters
for interpretation of POSIX-style zone names (e.g., "EST5EDT"),
and only if the timezone database doesn't include either an exact
match for the zone name or a "posixrules" entry.  The latter
should not be true in any current Postgres installation, but
this could possibly matter when using --with-system-tzdata.

Get rid of a nonportable use of "++var" on a bool var.
This is part of a larger fix that eliminates some vestigial
support for consecutive leap seconds, and adds checks to
the "zic" compiler that the data files do not specify that.

Remove a couple of ancient compatibility hacks.  The IANA
crew think these are obsolete, and I tend to agree.  But
perhaps our buildfarm will think different.

Back-patch to all supported branches, in line with our policy
that all branches should be using current IANA code.  Before v10,
this includes application of current pgindent rules, to avoid
whitespace problems in future back-patches.

Discussion: https://postgr.es/m/E1dsWhf-0000pT-F9@gemulon.postgresql.org
2017-09-22 00:04:29 -04:00
Tom Lane a890432a87 Revert "Fix bool/int type confusion"
This reverts commit 0ec2e908ba.
We'll use the upstream (IANA) fix instead.
2017-09-22 00:04:29 -04:00
Andrew Dunstan d57c7a7c50 Provide a test for variable existence in psql
"\if :{?variable_name}" will be translated to "\if TRUE" if the variable
exists and "\if FALSE" otherwise. Thus it will be possible to execute code
conditionally on the existence of the variable, regardless of its value.

Fabien Coelho, with some review by Robins Tharakan and some light text
editing by me.

Discussion: https://postgr.es/m/alpine.DEB.2.20.1708260835520.3627@lancre
2017-09-21 19:02:23 -04:00
Tom Lane 7148050105 Give a better error for duplicate entries in VACUUM/ANALYZE column list.
Previously, the code didn't think about this case and would just try to
analyze such a column twice.  That would fail at the point of inserting
the second version of the pg_statistic row, with obscure error messsages
like "duplicate key value violates unique constraint" or "tuple already
updated by self", depending on context and PG version.  We could allow
the case by ignoring duplicate column specifications, but it seems better
to reject it explicitly.

The bogus error messages seem like arguably a bug, so back-patch to
all supported versions.

Nathan Bossart, per a report from Michael Paquier, and whacked
around a bit by me.

Discussion: https://postgr.es/m/E061A8E3-5E3D-494D-94F0-E8A9B312BBFC@amazon.com
2017-09-21 18:13:11 -04:00
Andrew Dunstan 28ae524bbf Quieten warnings about unused variables
These variables are only ever written to in assertion-enabled builds,
and the latest Microsoft compilers complain about such variables in
non-assertion-enabled builds.

Apparently they don't worry so much about variables that are written to
but not read from, so most of our PG_USED_FOR_ASSERTS_ONLY variables
don't cause the problem.

Discussion: https://postgr.es/m/7800.1505950322@sss.pgh.pa.us
2017-09-21 08:41:14 -04:00
Robert Haas 9140cf8269 Associate partitioning information with each RelOptInfo.
This is not used for anything yet, but it is necessary infrastructure
for partition-wise join and for partition pruning without constraint
exclusion.

Ashutosh Bapat, reviewed by Amit Langote and with quite a few changes,
mostly cosmetic, by me.  Additional review and testing of this patch
series by Antonin Houska, Amit Khandekar, Rafia Sabih, Rajkumar
Raghuwanshi, Thomas Munro, and Dilip Kumar.

Discussion: http://postgr.es/m/CAFjFpRfneFG3H+F6BaiXemMrKF+FY-POpx3Ocy+RiH3yBmXSNw@mail.gmail.com
2017-09-20 23:39:13 -04:00
Tom Lane 7b86c2ac95 Improve dubious memory management in pg_newlocale_from_collation().
pg_newlocale_from_collation() used malloc() and strdup() directly,
which is generally not per backend coding style, and it didn't bother
to check for failure results, but would just SIGSEGV instead.  Also,
if one of the numerous error checks in the middle of the function
failed, the already-allocated memory would be leaked permanently.
Admittedly, it's not a lot of memory, but it could build up if this
function were called repeatedly for a bad collation.

The first two problems are easily cured by palloc'ing in TopMemoryContext
instead of calling libc directly.  We can fairly easily dodge the leakage
problem for the struct pg_locale_struct by filling in a temporary variable
and allocating permanent storage only once we reach the bottom of the
function.  It's harder to get rid of the potential leakage for ICU's copy
of the collcollate string, but at least that's only allocated after most
of the error checks; so live with that aspect.

Back-patch to v10 where this code came in, with one or another of the
ICU patches.
2017-09-20 13:52:36 -04:00
Tom Lane 4939488af9 Fix instability in subscription regression test.
005_encoding.pl neglected to wait for the subscriber's initial
synchronization to happen.  While we have not seen this fail in
the buildfarm, it's pretty easy to demonstrate there's an issue
by hacking logicalrep_worker_launch() to fail most of the time.

Michael Paquier

Discussion: https://postgr.es/m/27032.1505749806@sss.pgh.pa.us
2017-09-20 11:28:34 -04:00
Robert Haas 57eebca03a Fix create_lateral_join_info to handle dead relations properly.
Commit 0a480502b0 broke it.

Report by Andreas Seltenreich.  Fix by Ashutosh Bapat.

Discussion: http://postgr.es/m/874ls2vrnx.fsf@ansel.ydns.eu
2017-09-20 10:20:10 -04:00
Robert Haas 7f3a3312ab Fix typo.
Thomas Munro

Discussion: http://postgr.es/m/CAEepm=2j-HAgnBUrAazwS0ry7Z_ihk+d7g+Ye3u99+6WbiGt_Q@mail.gmail.com
2017-09-20 10:07:53 -04:00
Peter Eisentraut d42294fc00 Fix compiler warning
from gcc-7 -Wformat-truncation (via -Wall)
2017-09-20 09:03:04 -04:00
Peter Eisentraut be87b70b61 Sync process names between ps and pg_stat_activity
Remove gratuitous differences in the process names shown in
pg_stat_activity.backend_type and the ps output.

Reviewed-by: Takayuki Tsunakawa <tsunakawa.takay@jp.fujitsu.com>
2017-09-20 08:59:03 -04:00
Andres Freund fc49e24fa6 Make WAL segment size configurable at initdb time.
For performance reasons a larger segment size than the default 16MB
can be useful. A larger segment size has two main benefits: Firstly,
in setups using archiving, it makes it easier to write scripts that
can keep up with higher amounts of WAL, secondly, the WAL has to be
written and synced to disk less frequently.

But at the same time large segment size are disadvantageous for
smaller databases. So far the segment size had to be configured at
compile time, often making it unrealistic to choose one fitting to a
particularly load. Therefore change it to a initdb time setting.

This includes a breaking changes to the xlogreader.h API, which now
requires the current segment size to be configured.  For that and
similar reasons a number of binaries had to be taught how to recognize
the current segment size.

Author: Beena Emerson, editorialized by Andres Freund
Reviewed-By: Andres Freund, David Steele, Kuntal Ghosh, Michael
    Paquier, Peter Eisentraut, Robert Hass, Tushar Ahuja
Discussion: https://postgr.es/m/CAOG9ApEAcQ--1ieKbhFzXSQPw_YLmepaa4hNdnY5+ZULpt81Mw@mail.gmail.com
2017-09-19 22:03:48 -07:00
Andres Freund 5ada1fcd0c Accept that server might not be able to send error in crash recovery test.
As it turns out we can't rely that the script's monitoring session is
terminated with a proper error by the server, because the session
might be terminated while already trying to send data.

Also improve robustness and error reporting facilities of the test,
developed while debugging this issue.

Discussion: https://postgr.es/m/20170920020038.kllxgilo7xzwmtto@alap3.anarazel.de
2017-09-19 21:37:24 -07:00
Tom Lane 2d484f9b05 Remove no-op GiST support functions in the core GiST opclasses.
The preceding patch allowed us to remove useless GiST support functions.
This patch actually does that for all the no-op cases in the core GiST
code.  This buys us whatever performance gain is to be had, and more
importantly exercises the preceding patch.

There remain no-op functions in the contrib GiST opclasses, but those
will take more work to remove.

Discussion: https://postgr.es/m/CAJEAwVELVx9gYscpE=Be6iJxvdW5unZ_LkcAaVNSeOwvdwtD=A@mail.gmail.com
2017-09-19 23:32:59 -04:00
Tom Lane d3a4f89d8a Allow no-op GiST support functions to be omitted.
There are common use-cases in which the compress and/or decompress
functions can be omitted, with the result being that we make no
data transformation when storing or retrieving index values.
Previously, you had to provide a no-op function anyway, but this
patch allows such opclass support functions to be omitted.

Furthermore, if the compress function is omitted, then the core code
knows that the stored representation is the same as the original data.
This means we can allow index-only scans without requiring a fetch
function to be provided either.  Previously you had to provide a
no-op fetch function if you wanted IOS to work.

This reportedly provides a small performance benefit in such cases,
but IMO the real reason for doing it is just to reduce the amount of
useless boilerplate code that has to be written for GiST opclasses.

Andrey Borodin, reviewed by Dmitriy Sarafannikov

Discussion: https://postgr.es/m/CAJEAwVELVx9gYscpE=Be6iJxvdW5unZ_LkcAaVNSeOwvdwtD=A@mail.gmail.com
2017-09-19 23:32:59 -04:00
Andres Freund 896537f078 s/NULL byte/NUL byte/ in comment refering to C string terminator.
Reported-By: Robert Haas
Discussion: https://postgr.es/m/CA+Tgmoa+YBvWgFST2NVoeXjVSohEpK=vqnVCsoCkhTVVxfLcVQ@mail.gmail.com
2017-09-19 16:41:07 -07:00
Peter Eisentraut f41e56c76e Add basic TAP test setup for pg_upgrade
The plan is to convert the current pg_upgrade test to the TAP
framework.  This commit just puts a basic TAP test in place so that we
can see how the build farm behaves, since the build farm client has some
special knowledge of the pg_upgrade tests.

Author: Michael Paquier <michael.paquier@gmail.com>
2017-09-19 18:51:25 -04:00
Andres Freund 71edbb6f66 Avoid use of non-portable strnlen() in pgstat_clip_activity().
The use of strnlen rather than strlen was just paranoia. Instead of
giving up on the paranoia, just implement the safeguard
differently. And add a comment explaining why we're careful.

Author: Andres Freund
Discussion: https://postgr.es/m/E1duOkJ-0001Mc-U5@gemulon.postgresql.org
2017-09-19 14:25:47 -07:00
Andres Freund 54b6cd589a Speedup pgstat_report_activity by moving mb-aware truncation to read side.
Previously multi-byte aware truncation was done on every
pgstat_report_activity() call - proving to be a bottleneck for
workloads with long query strings that execute quickly.

Instead move the truncation to the read side, which commonly is
executed far less frequently. That's possible because all server
encodings allow to determine the length of a multi-byte string from
the first byte.

Rename PgBackendStatus.st_activity to st_activity_raw so existing
extension users of the field break - their code has to be adjusted to
use pgstat_clip_activity().

Author: Andres Freund
Tested-By: Khuntal Ghosh
Reviewed-By: Robert Haas, Tom Lane
Discussion: https://postgr.es/m/20170912071948.pa7igbpkkkviecpz@alap3.anarazel.de
2017-09-19 12:51:14 -07:00
Tom Lane ed22fb8b00 Cache datatype-output-function lookup info across calls of concat().
Testing indicates this can save a third to a half of the runtime
of the function.

Pavel Stehule, reviewed by Alexander Kuzmenkov

Discussion: https://postgr.es/m/CAFj8pRAT62pRgjoHbgTfJUc2uLmeQ4saUj+yVJAEZUiMwNCmdg@mail.gmail.com
2017-09-19 15:09:38 -04:00
Andres Freund 1910353675 Make new crash restart test a bit more robust.
Add timeouts in case psql doesn't deliver the expected output, and try
to cause the monitoring psql to be fully connected to a backend.  This
isn't necessarily everything needed, but at least the timeouts should
reduce the pain for buildfarm owners.

Author: Andres Freund
Reported-By: Tom Lane, BF animals prairiedog and calliphoridae
Discussion: https://postgr.es/m/E1du6ZT-00043I-91@gemulon.postgresql.org
2017-09-19 10:39:52 -07:00
Andres Freund f8e5f156b3 Rearm statement_timeout after each executed query.
Previously statement_timeout, in the extended protocol, affected all
messages till a Sync message.  For clients that pipeline/batch query
execution that's problematic.

Instead disable timeout after each Execute message, and enable, if
necessary, the timer in start_xact_command(). As that's done only for
Execute and not Parse / Bind, pipelining the latter two could still
cause undesirable timeouts. But a survey of protocol implementations
shows that all drivers issue Sync messages when preparing, and adding
timeout rearming to both is fairly expensive for the common parse /
bind / execute sequence.

Author: Tatsuo Ishii, editorialized by Andres Freund
Reviewed-By: Takayuki Tsunakawa, Andres Freund
Discussion: https://postgr.es/m/20170222.115044.1665674502985097185.t-ishii@sraoss.co.jp
2017-09-18 19:36:44 -07:00
Andres Freund 0fb9e4ace5 Fix uninitialized variable in dshash.c.
A bugfix for commit 8c0d7bafad.  The code
would have crashed if hashtable->size_log2 ever had the same value as
hashtable->control->size_log2 by coincidence.

Per Valgrind.

Author: Thomas Munro
Reported-By: Tomas Vondra
Discussion: https://postgr.es/m/e72fb33c-4f31-f276-e972-263d9b59554d%402ndquadrant.com
2017-09-18 17:43:37 -07:00
Andres Freund a1924a4ea2 Add test for postmaster crash restarts.
Given that I managed to break this...  We probably should extend the
tests to also cover other sub-processes dying, but that's something
for later.

Author: Andres Freund
Discussion: https://postgr.es/m/20170917080752.rcmihzfmgbeuqjk2@alap3.anarazel.de
2017-09-18 17:25:53 -07:00
Andres Freund ec9e05b3c3 Fix crash restart bug introduced in 8356753c21.
The bug was caused by not re-reading the control file during crash
recovery restarts, which lead to an attempt to pfree() shared memory
contents. The fix is to re-read the control file, which seems good
anyway.

It's unclear as of this moment, whether we want to keep the
refactoring introduced in the commit referenced above, or come up with
an alternative approach. But fixing the bug in the mean time seems
like a good idea regardless.

A followup commit will introduce regression test coverage for crash
restarts.

Reported-By: Tom Lane
Discussion: https://postgr.es/m/14134.1505572349@sss.pgh.pa.us
2017-09-18 17:25:49 -07:00
Tom Lane eb5c404b17 Minor code-cleanliness improvements for btree.
Make the btree page-flags test macros (P_ISLEAF and friends) return clean
boolean values, rather than values that might not fit in a bool.  Use them
in a few places that were randomly referencing the flag bits directly.

In passing, change access/nbtree/'s only direct use of BUFFER_LOCK_SHARE to
BT_READ.  (Some think we should go the other way, but as long as we have
BT_READ/BT_WRITE, let's use them consistently.)

Masahiko Sawada, reviewed by Doug Doole

Discussion: https://postgr.es/m/CAD21AoBmWPeN=WBB5Jvyz_Nt3rmW1ebUyAnk3ZbJP3RMXALJog@mail.gmail.com
2017-09-18 16:36:28 -04:00
Tom Lane 66917bfaa7 Make ExplainOpenGroup and ExplainCloseGroup public.
Extensions with custom plan nodes might like to use these in their
EXPLAIN output.

Hadi Moshayedi

Discussion: https://postgr.es/m/CA+_kT_dU-rHCN0u6pjA6bN5CZniMfD=-wVqPY4QLrKUY_uJq5w@mail.gmail.com
2017-09-18 16:01:16 -04:00
Tom Lane 4bd1994650 Make DatumGetFoo/PG_GETARG_FOO/PG_RETURN_FOO macro names more consistent.
By project convention, these names should include "P" when dealing with a
pointer type; that is, if the result of a GETARG macro is of type FOO *,
it should be called PG_GETARG_FOO_P not just PG_GETARG_FOO.  Some newer
types such as JSONB and ranges had not followed the convention, and a
number of contrib modules hadn't gotten that memo either.  Rename the
offending macros to improve consistency.

In passing, fix a few places that thought PG_DETOAST_DATUM() returns
a Datum; it does not, it returns "struct varlena *".  Applying
DatumGetPointer to that happens not to cause any bad effects today,
but it's formally wrong.  Also, adjust an ltree macro that was designed
without any thought for what pgindent would do with it.

This is all cosmetic and shouldn't have any impact on generated code.

Mark Dilger, some further tweaks by me

Discussion: https://postgr.es/m/EA5676F4-766F-4F38-8348-ECC7DB427C6A@gmail.com
2017-09-18 15:21:23 -04:00
Tom Lane 3e1683d37e Fix, or at least ameliorate, bugs in logicalrep_worker_launch().
If we failed to get a background worker slot, the code just walked
away from the logicalrep-worker slot it already had, leaving that
looking like the worker is still starting up.  This led to an indefinite
hang in subscription startup, as reported by Thomas Munro.  We must
release the slot on failure.

Also fix a thinko: we must capture the worker slot's generation before
releasing LogicalRepWorkerLock the first time, else testing to see if
it's changed is pretty meaningless.

BTW, the CHECK_FOR_INTERRUPTS() in WaitForReplicationWorkerAttach is a
ticking time bomb, even without considering the possibility of elog(ERROR)
in one of the other functions it calls.  Really, this entire business needs
a redesign with some actual thought about error recovery.  But for now
I'm just band-aiding the case observed in testing.

Back-patch to v10 where this code was added.

Discussion: https://postgr.es/m/CAEepm=2bP3TBMFBArP6o20AZaRduWjMnjCjt22hSdnA-EvrtCw@mail.gmail.com
2017-09-18 11:39:55 -04:00
Peter Eisentraut 8edacab209 Fix DROP SUBSCRIPTION hang
When ALTER SUBSCRIPTION DISABLE is run in the same transaction before
DROP SUBSCRIPTION, the latter will hang because workers will still be
running, not having seen the DISABLE committed, and DROP SUBSCRIPTION
will wait until the workers have vacated the replication origin slots.

Previously, DROP SUBSCRIPTION killed the logical replication workers
immediately only if it was going to drop the replication slot, otherwise
it scheduled the worker killing for the end of the transaction, as a
result of 7e174fa793.  This, however,
causes the present problem.  To fix, kill the workers immediately in all
cases.  This covers all cases: A subscription that doesn't have a
replication slot must be disabled.  It was either disabled in the same
transaction, or it was already disabled before the current transaction,
but then there shouldn't be any workers left and this won't make a
difference.

Reported-by: Arseny Sher <a.sher@postgrespro.ru>
Discussion: https://www.postgresql.org/message-id/flat/87mv6av84w.fsf%40ars-thinkpad
2017-09-17 22:00:23 -04:00
Tom Lane 6f44fe7f12 Allow rel_is_distinct_for() to look through RelabelType below OpExpr.
This lets it do the right thing for, eg, varchar columns.
Back-patch to 9.5 where this logic appeared.

David Rowley, per report from Kim Rose Carlsen

Discussion: https://postgr.es/m/VI1PR05MB17091F9A9876528055D6A827C76D0@VI1PR05MB1709.eurprd05.prod.outlook.com
2017-09-17 15:28:51 -04:00
Tom Lane 27c6619e9c Fix possible dangling pointer dereference in trigger.c.
AfterTriggerEndQuery correctly notes that the query_stack could get
repalloc'd during a trigger firing, but it nonetheless passes the address
of a query_stack entry to afterTriggerInvokeEvents, so that if such a
repalloc occurs, afterTriggerInvokeEvents is already working with an
obsolete dangling pointer while it scans the rest of the events.  Oops.
The only code at risk is its "delete_ok" cleanup code, so we can
prevent unsafe behavior by passing delete_ok = false instead of true.

However, that could have a significant performance penalty, because the
point of passing delete_ok = true is to not have to re-scan possibly
a large number of dead trigger events on the next time through the loop.
There's more than one way to skin that cat, though.  What we can do is
delete all the "chunks" in the event list except the last one, since
we know all events in them must be dead.  Deleting the chunks is work
we'd have had to do later in AfterTriggerEndQuery anyway, and it ends
up saving rescanning of just about the same events we'd have gotten
rid of with delete_ok = true.

In v10 and HEAD, we also have to be careful to mop up any per-table
after_trig_events pointers that would become dangling.  This is slightly
annoying, but I don't think that normal use-cases will traverse this code
path often enough for it to be a performance problem.

It's pretty hard to hit this in practice because of the unlikelihood
of the query_stack getting resized at just the wrong time.  Nonetheless,
it's definitely a live bug of ancient standing, so back-patch to all
supported branches.

Discussion: https://postgr.es/m/2891.1505419542@sss.pgh.pa.us
2017-09-17 14:50:01 -04:00