Commit Graph

3716 Commits

Author SHA1 Message Date
Heikki Linnakangas
f50e888990 Fix permission checks on constraint violation errors on partitions.
If a cross-partition UPDATE violates a constraint on the target partition,
and the columns in the new partition are in different physical order than
in the parent, the error message can reveal columns that the user does not
have SELECT permission on. A similar bug was fixed earlier in commit
804b6b6db4.

The cause of the bug is that the callers of the
ExecBuildSlotValueDescription() function got confused when constructing
the list of modified columns. If the tuple was routed from a parent, we
converted the tuple to the parent's format, but the list of modified
columns was grabbed directly from the child's RTE entry.

ExecUpdateLockMode() had a similar issue. That lead to confusion on which
columns are key columns, leading to wrong tuple lock being taken on tables
referenced by foreign keys, when a row is updated with INSERT ON CONFLICT
UPDATE. A new isolation test is added for that corner case.

With this patch, the ri_RangeTableIndex field is no longer set for
partitions that don't have an entry in the range table. Previously, it was
set to the RTE entry of the parent relation, but that was confusing.

NOTE: This modifies the ResultRelInfo struct, replacing the
ri_PartitionRoot field with ri_RootResultRelInfo. That's a bit risky to
backpatch, because it breaks any extensions accessing the field. The
change that ri_RangeTableIndex is not set for partitions could potentially
break extensions, too. The ResultRelInfos are visible to FDWs at least,
and this patch required small changes to postgres_fdw. Nevertheless, this
seem like the least bad option. I don't think these fields widely used in
extensions; I don't think there are FDWs out there that uses the FDW
"direct update" API, other than postgres_fdw. If there is, you will get a
compilation error, so hopefully it is caught quickly.

Backpatch to 11, where support for both cross-partition UPDATEs, and unique
indexes on partitioned tables, were added.

Reviewed-by: Amit Langote
Security: CVE-2021-3393
2021-02-08 11:01:55 +02:00
Tom Lane
7428469586 Fix ancient memory leak in contrib/auto_explain.
The ExecutorEnd hook is invoked in a context that could be quite
long-lived, not the executor's own per-query context as I think
we were sort of assuming.  Thus, any cruft generated while producing
the EXPLAIN output could accumulate over multiple queries.  This can
result in spectacular leakage if log_nested_statements is on, and
even without that I'm surprised nobody complained before.

To fix, just switch into the executor's context so that anything we
allocate will be released when standard_ExecutorEnd frees the executor
state.  We might as well nuke the code's retail pfree of the explain
output string, too; that's laughably inadequate to the need.

Japin Li, per report from Jeff Janes.  This bug is old, so
back-patch to all supported branches.

Discussion: https://postgr.es/m/CAMkU=1wCVtbeRn0s9gt12KwQ7PLXovbpM8eg25SYocKW3BT4hg@mail.gmail.com
2021-02-02 13:49:08 -05:00
Alvaro Herrera
fdf9d00540
Report the true database name on connection errors
When reporting connection errors, we might show a database name in the
message that's not the one we actually tried to connect to, if the
database was taken from libpq defaults instead of from user parameters.
Fix such error messages to use PQdb(), which reports the correct name.

(But, per commit 2930c05634, make sure not to try to print NULL.)

Apply to branches 9.5 through 13.  Branch master has already been
changed differently by commit 58cd8dca3d.

Reported-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA+TgmobssJ6rS22dspWnu-oDxXevGmhMD8VcRBjmj-b9UDqRjw@mail.gmail.com
2021-01-26 16:42:13 -03:00
Tom Lane
4641b2a30f Fix broken ruleutils support for function TRANSFORM clauses.
I chanced to notice that this dumped core due to a faulty Assert.
To add insult to injury, the output has been misformatted since v11.
Obviously we need some regression testing here.

Discussion: https://postgr.es/m/d1cc628c-3953-4209-957b-29427acc38c8@www.fastmail.com
2021-01-25 13:03:11 -05:00
Tom Lane
1cce024fd2 Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.
Previously, pull_varnos() took the relids of a PlaceHolderVar as being
equal to the relids in its contents, but that fails to account for the
possibility that we have to postpone evaluation of the PHV due to outer
joins.  This could result in a malformed plan.  The known cases end up
triggering the "failed to assign all NestLoopParams to plan nodes"
sanity check in createplan.c, but other symptoms may be possible.

The right value to use is the join level we actually intend to evaluate
the PHV at.  We can get that from the ph_eval_at field of the associated
PlaceHolderInfo.  However, there are some places that call pull_varnos()
before the PlaceHolderInfos have been created; in that case, fall back
to the conservative assumption that the PHV will be evaluated at its
syntactic level.  (In principle this might result in missing some legal
optimization, but I'm not aware of any cases where it's an issue in
practice.)  Things are also a bit ticklish for calls occurring during
deconstruct_jointree(), but AFAICS the ph_eval_at fields should have
reached their final values by the time we need them.

The main problem in making this work is that pull_varnos() has no
way to get at the PlaceHolderInfos.  We can fix that easily, if a
bit tediously, in HEAD by passing it the planner "root" pointer.
In the back branches that'd cause an unacceptable API/ABI break for
extensions, so leave the existing entry points alone and add new ones
with the additional parameter.  (If an old entry point is called and
encounters a PHV, it'll fall back to using the syntactic level,
again possibly missing some valid optimization.)

Back-patch to v12.  The computation is surely also wrong before that,
but it appears that we cannot reach a bad plan thanks to join order
restrictions imposed on the subquery that the PlaceHolderVar came from.
The error only became reachable when commit 4be058fe9 allowed trivial
subqueries to be collapsed out completely, eliminating their join order
restrictions.

Per report from Stephan Springl.

Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us
2021-01-21 15:37:23 -05:00
Tom Lane
53a24faaa4 Disable vacuum page skipping in selected test cases.
By default VACUUM will skip pages that it can't immediately get
exclusive access to, which means that even activities as harmless
and unpredictable as checkpoint buffer writes might prevent a page
from being processed.  Ordinarily this is no big deal, but we have
a small number of test cases that examine the results of VACUUM's
processing and therefore will fail if the page of interest is skipped.
This seems to be the explanation for some rare buildfarm failures.
To fix, add the DISABLE_PAGE_SKIPPING option to the VACUUM commands
in tests where this could be an issue.

In passing, remove a duplicated query in pageinspect/sql/page.sql.

Back-patch as necessary (some of these cases are as old as v10).

Discussion: https://postgr.es/m/413923.1611006484@sss.pgh.pa.us
2021-01-20 11:49:29 -05:00
Fujii Masao
e792ca4aca postgres_fdw: Fix connection leak.
In postgres_fdw, the cached connections to foreign servers will not be
closed until the local session exits if the user mappings or foreign servers
that those connections depend on are dropped. Those connections can be
leaked.

To fix that connection leak issue, after a change to a pg_foreign_server
or pg_user_mapping catalog entry, this commit makes postgres_fdw close
the connections depending on that entry immediately if current
transaction has not used those connections yet. Otherwise, mark those
connections as invalid and then close them at the end of current transaction,
since they cannot be closed in the midst of the transaction using them.
Closed connections will be remade at the next opportunity if necessary.

Back-patch to all supported branches.

Author: Bharath Rupireddy
Reviewed-by: Zhihong Yu, Zhijie Hou, Fujii Masao
Discussion: https://postgr.es/m/CALj2ACVNcGH_6qLY-4_tXz8JLvA+4yeBThRfxMz7Oxbk1aHcpQ@mail.gmail.com
2020-12-28 19:59:00 +09:00
Tom Lane
3d8068edce Fix race condition between shutdown and unstarted background workers.
If a database shutdown (smart or fast) is commanded between the time
some process decides to request a new background worker and the time
that the postmaster can launch that worker, then nothing happens
because the postmaster won't launch any bgworkers once it's exited
PM_RUN state.  This is fine ... unless the requesting process is
waiting for that worker to finish (or even for it to start); in that
case the requestor is stuck, and only manual intervention will get us
to the point of being able to shut down.

To fix, cancel pending requests for workers when the postmaster sends
shutdown (SIGTERM) signals, and similarly cancel any new requests that
arrive after that point.  (We can optimize things slightly by only
doing the cancellation for workers that have waiters.)  To fit within
the existing bgworker APIs, the "cancel" is made to look like the
worker was started and immediately stopped, causing deregistration of
the bgworker entry.  Waiting processes would have to deal with
premature worker exit anyway, so this should introduce no bugs that
weren't there before.  We do have a side effect that registration
records for restartable bgworkers might disappear when theoretically
they should have remained in place; but since we're shutting down,
that shouldn't matter.

Back-patch to v10.  There might be value in putting this into 9.6
as well, but the management of bgworkers is a bit different there
(notably see 8ff518699) and I'm not convinced it's worth the effort
to validate the patch for that branch.

Discussion: https://postgr.es/m/661570.1608673226@sss.pgh.pa.us
2020-12-24 17:00:43 -05:00
Tom Lane
f581e53836 Improve autoprewarm's handling of early-shutdown scenarios.
Bad things happen if the DBA issues "pg_ctl stop -m fast" before
autoprewarm finishes loading its list of blocks to prewarm.
The current worker process successfully terminates early, but
(if this wasn't the last database with blocks to prewarm) the
leader process will just try to launch another worker for the
next database.  Since the postmaster is now in PM_WAIT_BACKENDS
state, it ignores the launch request, and the leader just sits
until it's killed manually.

This is mostly the fault of our half-baked design for launching
background workers, but a proper fix for that is likely to be
too invasive to be back-patchable.  To ameliorate the situation,
fix apw_load_buffers() to check whether SIGTERM has arrived
just before trying to launch another worker.  That leaves us with
only a very narrow window in each worker launch where SIGTERM
could occur between the launch request and successful worker start.

Another issue is that if the leader process does manage to exit,
it unconditionally rewrites autoprewarm.blocks with only the
blocks currently in shared buffers, thus forgetting any blocks
that we hadn't reached yet while prewarming.  This seems quite
unhelpful, since the next database start will then not have the
expected prewarming benefit.  Fix it to not modify the file if
we shut down before the initial load attempt is complete.

Per bug #16785 from John Thompson.  Back-patch to v11 where
the autoprewarm code was introduced.

Discussion: https://postgr.es/m/16785-c0207d8c67fb5f25@postgresql.org
2020-12-22 13:23:49 -05:00
Michael Paquier
d7ecba937b pgcrypto: Detect errors with EVP calls from OpenSSL
The following routines are called within pgcrypto when handling digests
but there were no checks for failures:
- EVP_MD_CTX_size (can fail with -1 as of 3.0.0)
- EVP_MD_CTX_block_size (can fail with -1 as of 3.0.0)
- EVP_DigestInit_ex
- EVP_DigestUpdate
- EVP_DigestFinal_ex

A set of elog(ERROR) is added by this commit to detect such failures,
that should never happen except in the event of a processing failure
internal to OpenSSL.

Note that it would be possible to use ERR_reason_error_string() to get
more context about such errors, but these refer mainly to the internals
of OpenSSL, so it is not really obvious how useful that would be.  This
is left out for simplicity.

Per report from Coverity.  Thanks to Tom Lane for the discussion.

Backpatch-through: 9.5
2020-12-08 15:22:43 +09:00
Heikki Linnakangas
ee42f78910 Remove leftover comments, left behind by removal of WITH OIDS.
Author: Amit Langote
Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGaRoF3XrhPW-Y7P%2BG7bKo84Z_h%3DkQHvMh-80%3Dav3wmOw%40mail.gmail.com
2020-11-30 10:29:26 +02:00
Andrew Gierth
7f69ed4aeb pg_trgm: fix crash in 2-item picksplit
Whether from size overflow in gistSplit or from secondary splits,
picksplit is (rarely) called with exactly two items to split.

Formerly, due to special-case handling of the last item, this would
lead to access to an uninitialized cache entry; prior to PG 13 this
might have been harmless or at worst led to an incorrect union datum,
but in 13 onwards it can cause a backend crash from using an
uninitialized pointer.

Repair by removing the special case, which was deemed not to have been
appropriate anyway. Backpatch all the way, because this bug has
existed since pg_trgm was added.

Per report on IRC from user "ftzdomino". Analysis and testing by me,
patch from Alexander Korotkov.

Discussion: https://postgr.es/m/87k0usfdxg.fsf@news-spur.riddles.org.uk
2020-11-12 14:56:58 +00:00
Tom Lane
171c457cd0 Fix and simplify some usages of TimestampDifference().
Introduce TimestampDifferenceMilliseconds() to simplify callers
that would rather have the difference in milliseconds, instead of
the select()-oriented seconds-and-microseconds format.  This gets
rid of at least one integer division per call, and it eliminates
some apparently-easy-to-mess-up arithmetic.

Two of these call sites were in fact wrong:

* pg_prewarm's autoprewarm_main() forgot to multiply the seconds
by 1000, thus ending up with a delay 1000X shorter than intended.
That doesn't quite make it a busy-wait, but close.

* postgres_fdw's pgfdw_get_cleanup_result() thought it needed to compute
microseconds not milliseconds, thus ending up with a delay 1000X longer
than intended.  Somebody along the way had noticed this problem but
misdiagnosed the cause, and imposed an ad-hoc 60-second limit rather
than fixing the units.  This was relatively harmless in context, because
we don't care that much about exactly how long this delay is; still,
it's wrong.

There are a few more callers of TimestampDifference() that don't
have a direct need for seconds-and-microseconds, but can't use
TimestampDifferenceMilliseconds() either because they do need
microsecond precision or because they might possibly deal with
intervals long enough to overflow 32-bit milliseconds.  It might be
worth inventing another API to improve that, but that seems outside
the scope of this patch; so those callers are untouched here.

Given the fact that we are fixing some bugs, and the likelihood
that future patches might want to back-patch code that uses this
new API, back-patch to all supported branches.

Alexey Kondratov and Tom Lane

Discussion: https://postgr.es/m/3b1c053a21c07c1ed5e00be3b2b855ef@postgrespro.ru
2020-11-10 22:51:55 -05:00
Noah Misch
ac8f6243cb In security-restricted operations, block enqueue of at-commit user code.
Specifically, this blocks DECLARE ... WITH HOLD and firing of deferred
triggers within index expressions and materialized view queries.  An
attacker having permission to create non-temp objects in at least one
schema could execute arbitrary SQL functions under the identity of the
bootstrap superuser.  One can work around the vulnerability by disabling
autovacuum and not manually running ANALYZE, CLUSTER, REINDEX, CREATE
INDEX, VACUUM FULL, or REFRESH MATERIALIZED VIEW.  (Don't restore from
pg_dump, since it runs some of those commands.)  Plain VACUUM (without
FULL) is safe, and all commands are fine when a trusted user owns the
target object.  Performance may degrade quickly under this workaround,
however.  Back-patch to 9.5 (all supported versions).

Reviewed by Robert Haas.  Reported by Etienne Stalmans.

Security: CVE-2020-25695
2020-11-09 07:32:12 -08:00
Michael Paquier
57bdf29dd5 Fix potential memory leak in pgcrypto
When allocating a EVP context, it would have been possible to leak some
memory allocated directly by OpenSSL, that PostgreSQL lost track of if
the initialization of the context allocated failed.  The cleanup can be
done with EVP_MD_CTX_destroy().

Note that EVP APIs exist since OpenSSL 0.9.7 and we have in the tree
equivalent implementations for older versions since ce9b75d (code
removed with 9b7cd59a as of 10~).  However, in 9.5 and 9.6, the existing
code makes use of EVP_MD_CTX_destroy() and EVP_MD_CTX_create() without
an equivalent implementation when building the tree with OpenSSL 0.9.6
or older, meaning that this code is in reality broken with such versions
since it got introduced in e2838c5.  As we have heard no complains about
that, it does not seem worth bothering with in 9.5 and 9.6, so I have
left that out for simplicity.

Author: Michael Paquier
Discussion: https://postgr.es/m/20201015072212.GC2305@paquier.xyz
Backpatch-through: 9.5
2020-10-19 09:37:55 +09:00
Tom Lane
7004ce7589 Add missing error check in pgcrypto/crypt-md5.c.
In theory, the second px_find_digest call in px_crypt_md5 could fail
even though the first one succeeded, since resource allocation is
required.  Don't skip testing for a failure.  (If one did happen,
the likely result would be a crash rather than clean recovery from
an OOM failure.)

The code's been like this all along, so back-patch to all supported
branches.

Daniel Gustafsson

Discussion: https://postgr.es/m/AA8D6FE9-4AB2-41B4-98CB-AE64BA668C03@yesql.se
2020-10-16 11:59:31 -04:00
Tom Lane
3ba9670847 Make contrib modules' installation scripts more secure.
Hostile objects located within the installation-time search_path could
capture references in an extension's installation or upgrade script.
If the extension is being installed with superuser privileges, this
opens the door to privilege escalation.  While such hazards have existed
all along, their urgency increases with the v13 "trusted extensions"
feature, because that lets a non-superuser control the installation path
for a superuser-privileged script.  Therefore, make a number of changes
to make such situations more secure:

* Tweak the construction of the installation-time search_path to ensure
that references to objects in pg_catalog can't be subverted; and
explicitly add pg_temp to the end of the path to prevent attacks using
temporary objects.

* Disable check_function_bodies within installation/upgrade scripts,
so that any security gaps in SQL-language or PL-language function bodies
cannot create a risk of unwanted installation-time code execution.

* Adjust lookup of type input/receive functions and join estimator
functions to complain if there are multiple candidate functions.  This
prevents capture of references to functions whose signature is not the
first one checked; and it's arguably more user-friendly anyway.

* Modify various contrib upgrade scripts to ensure that catalog
modification queries are executed with secure search paths.  (These
are in-place modifications with no extension version changes, since
it is the update process itself that is at issue, not the end result.)

Extensions that depend on other extensions cannot be made fully secure
by these methods alone; therefore, revert the "trusted" marking that
commit eb67623c9 applied to earthdistance and hstore_plperl, pending
some better solution to that set of issues.

Also add documentation around these issues, to help extension authors
write secure installation scripts.

Patch by me, following an observation by Andres Freund; thanks
to Noah Misch for review.

Security: CVE-2020-14350
2020-08-10 10:44:42 -04:00
Peter Geoghegan
16c977906e Restore lost amcheck TOAST test coverage.
Commit eba77534 fixed an amcheck false positive bug involving
inconsistencies in TOAST input state between table and index.  A test
case was added that verified that such an inconsistency didn't result in
a spurious corruption related error.

Test coverage from the test was accidentally lost by commit 501e41dd,
which propagated ALTER TABLE ...  SET STORAGE attstorage state to
indexes.  This broke the test because the test specifically relied on
attstorage not being propagated.  This artificially forced there to be
index tuples whose datums were equivalent to the datums in the heap
without the datums actually being bitwise equal.

Fix this by updating pg_attribute directly instead.  Commit 501e41dd
made similar changes to a test_decoding TOAST-related test case which
made the same assumption, but overlooked the amcheck test case.

Backpatch: 11-, just like commit eba77534 (and commit 501e41dd).
2020-07-31 15:34:25 -07:00
Michael Paquier
5bd087eb5d Fix corner case with 16kB-long decompression in pgcrypto, take 2
A compressed stream may end with an empty packet.  In this case
decompression finishes before reading the empty packet and the
remaining stream packet causes a failure in reading the following
data.  This commit makes sure to consume such extra data, avoiding a
failure when decompression the data.  This corner case was reproducible
easily with a data length of 16kB, and existed since e94dd6a.  A cheap
regression test is added to cover this case based on a random,
incompressible string.

The first attempt of this patch has allowed to find an older failure
within the compression logic of pgcrypto, fixed by b9b6105.  This
involved SLES 15 with z390 where a custom flavor of libz gets used.
Bonus thanks to Mark Wong for providing access to the specific
environment.

Reported-by: Frank Gagnepain
Author: Kyotaro Horiguchi, Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/16476-692ef7b84e5fb893@postgresql.org
Backpatch-through: 9.5
2020-07-27 15:58:59 +09:00
Tom Lane
3d4a778152 Fix ancient violation of zlib's API spec.
contrib/pgcrypto mishandled the case where deflate() does not consume
all of the offered input on the first try.  It reset the next_in pointer
to the start of the input instead of leaving it alone, causing the wrong
data to be fed to the next deflate() call.

This has been broken since pgcrypto was committed.  The reason for the
lack of complaints seems to be that it's fairly hard to get stock zlib
to not consume all the input, so long as the output buffer is big enough
(which it normally would be in pgcrypto's usage; AFAICT the input is
always going to be packetized into packets no larger than ZIP_OUT_BUF).
However, IBM's zlibNX implementation for AIX evidently will do it
in some cases.

I did not add a test case for this, because I couldn't find one that
would fail with stock zlib.  When we put back the test case for
bug #16476, that will cover the zlibNX situation well enough.

While here, write deflate()'s second argument as Z_NO_FLUSH per its
API spec, instead of hard-wiring the value zero.

Per buildfarm results and subsequent investigation.

Discussion: https://postgr.es/m/16476-692ef7b84e5fb893@postgresql.org
2020-07-23 17:20:02 -04:00
Michael Paquier
e30a63f258 Revert "Fix corner case with PGP decompression in pgcrypto"
This reverts commit 9e10898, after finding out that buildfarm members
running SLES 15 on z390 complain on the compression and decompression
logic of the new test: pipistrelles, barbthroat and steamerduck.

Those hosts are visibly using hardware-specific changes to improve zlib
performance, requiring more investigation.

Thanks to Tom Lane for the discussion.

Discussion: https://postgr.es/m/20200722093749.GA2564@paquier.xyz
Backpatch-through: 9.5
2020-07-23 08:29:18 +09:00
Michael Paquier
bba2e66aec Fix corner case with PGP decompression in pgcrypto
A compressed stream may end with an empty packet, and PGP decompression
finished before reading this empty packet in the remaining stream.  This
caused a failure in pgcrypto, handling this case as corrupted data.
This commit makes sure to consume such extra data, avoiding a failure
when decompression the entire stream.  This corner case was reproducible
with a data length of 16kB, and existed since its introduction in
e94dd6a.  A cheap regression test is added to cover this case.

Thanks to Jeff Janes for the extra investigation.

Reported-by: Frank Gagnepain
Author: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/16476-692ef7b84e5fb893@postgresql.org
Backpatch-through: 9.5
2020-07-22 14:52:46 +09:00
Joe Conway
015e899a7a Read until EOF vice stat-reported size in read_binary_file
read_binary_file(), used by SQL functions pg_read_file() and friends,
uses stat to determine file length to read, when not passed an explicit
length as an argument. This is problematic, for example, if the file
being read is a virtual file with a stat-reported length of zero.
Arrange to read until EOF, or StringInfo data string lenth limit, is
reached instead.

Original complaint and patch by me, with significant review, corrections,
advice, and code optimizations by Tom Lane. Backpatched to v11. Prior to
that only paths relative to the data and log dirs were allowed for files,
so no "zero length" files were reachable anyway.

Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/flat/969b8d82-5bb2-5fa8-4eb1-f0e685c5d736%40joeconway.com
Backpatch-through: 11
2020-07-04 06:28:44 -04:00
Joe Conway
e8eb485954 Initialize dblink remoteConn struct in all cases
Two of the members of rconn were left uninitialized. When
dblink_open() is called without an outer transaction it
handles the initialization for us, but with an outer
transaction it does not. Arrange for initialization
in all cases. Backpatch to all supported versions.

Reported-by: Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/flat/9bd0744f-5f04-c778-c5b3-809efe9c30c7%40joeconway.com#c545909a41664991aca60c4d70a10ce7
2020-05-28 13:44:59 -04:00
Alexander Korotkov
ae1f9b0a9b Fix amcheck for page checks concurrent to replay of btree page deletion
amcheck expects at least hikey to always exist on leaf page even if it is
deleted page.  But replica reinitializes page during replay of page deletion,
causing deleted page to have no items.  Thus, replay of page deletion can
cause an error in concurrent amcheck run.

This commit relaxes amcheck expectation making it tolerate deleted page with
no items.

Reported-by: Konstantin Knizhnik
Discussion: https://postgr.es/m/CAPpHfdt_OTyQpXaPJcWzV2N-LNeNJseNB-K_A66qG%3DL518VTFw%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Peter Geoghegan
Backpatch-through: 11
2020-05-14 12:46:08 +03:00
Peter Eisentraut
bf7233ee4a Propagate ALTER TABLE ... SET STORAGE to indexes
When creating a new index, the attstorage setting of the table column
is copied to regular (non-expression) index columns.  But a later
ALTER TABLE ... SET STORAGE is not propagated to indexes, thus
creating an inconsistent and undumpable state.

Discussion: https://www.postgresql.org/message-id/flat/9765d72b-37c0-06f5-e349-2a580aafd989%402ndquadrant.com
2020-05-08 09:18:15 +02:00
Tom Lane
c08da32f13 Get rid of trailing semicolons in C macro definitions.
Writing a trailing semicolon in a macro is almost never the right thing,
because you almost always want to write a semicolon after each macro
call instead.  (Even if there was some reason to prefer not to, pgindent
would probably make a hash of code formatted that way; so within PG the
rule should basically be "don't do it".)  Thus, if we have a semi inside
the macro, the compiler sees "something;;".  Much of the time the extra
empty statement is harmless, but it could lead to mysterious syntax
errors at call sites.  In perhaps an overabundance of neatnik-ism, let's
run around and get rid of the excess semicolons whereever possible.

The only thing worse than a mysterious syntax error is a mysterious
syntax error that only happens in the back branches; therefore,
backpatch these changes where relevant, which is most of them because
most of these mistakes are old.  (The lack of reported problems shows
that this is largely a hypothetical issue, but still, it could bite
us in some future patch.)

John Naylor and Tom Lane

Discussion: https://postgr.es/m/CACPNZCs0qWTqJ2QUSGJ07B7uvAvzMb-KbG2q+oo+J3tsWN5cqw@mail.gmail.com
2020-05-01 17:28:00 -04:00
Tom Lane
687e566b90 Fix cache reference leak in contrib/sepgsql.
fixup_whole_row_references() did the wrong thing with a dropped column,
resulting in a commit-time warning about a cache reference leak.

I (tgl) added a test case exercising this, but back-patched the test
only as far as v10; the patch didn't apply cleanly to 9.6 and it
didn't seem worth the trouble to adapt it.  The bug is pretty old
though, so apply the code change all the way back.

Michael Luo, with cosmetic improvements by me

Discussion: https://postgr.es/m/BYAPR08MB5606D1453D7F50E2AF4D2FD29AD80@BYAPR08MB5606.namprd08.prod.outlook.com
2020-04-16 14:45:54 -04:00
Tom Lane
d56657c35d Fix bogus CALLED_AS_TRIGGER() defenses.
contrib/lo's lo_manage() thought it could use
trigdata->tg_trigger->tgname in its error message about
not being called as a trigger.  That naturally led to a core dump.

unique_key_recheck() figured it could Assert that fcinfo->context
is a TriggerData node in advance of having checked that it's
being called as a trigger.  That's harmless in production builds,
and perhaps not that easy to reach in any case, but it's logically
wrong.

The first of these per bug #16340 from William Crowell;
the second from manual inspection of other CALLED_AS_TRIGGER
call sites.

Back-patch the lo.c change to all supported branches, the
other to v10 where the thinko crept in.

Discussion: https://postgr.es/m/16340-591c7449dc7c8c47@postgresql.org
2020-04-03 11:24:56 -04:00
Tom Lane
94c9152dc8 Back-patch addition of stack overflow and interrupt checks for lquery.
Experimentation shows that it's not hard at all to drive the
old implementation of "ltree ~ lquery" match to stack overflow,
so throw in a check_stack_depth() call, as I just did in HEAD.

I wasn't able to make it take a long time, because all the
pathological cases I tried hit stack overflow first; but
I bet there are some others that do take a long time, so add
CHECK_FOR_INTERRUPTS() too.

Discussion: https://postgr.es/m/CAP_rww=waX2Oo6q+MbMSiZ9ktdj6eaJj0cQzNu=Ry2cCDij5fw@mail.gmail.com
2020-03-31 11:37:44 -04:00
Tom Lane
2bb6bdbe5d Protect against overflow of ltree.numlevel and lquery.numlevel.
These uint16 fields could be overflowed by excessively long input,
producing strange results.  Complain for invalid input.

Likewise check for out-of-range values of the repeat counts in lquery.
(We don't try too hard on that one, notably not bothering to detect
if atoi's result has overflowed.)

Also detect length overflow in ltree_concat.

In passing, be more consistent about whether "syntax error" messages
include the type name.  Also, clarify the documentation about what
the size limit is.

This has been broken for a long time, so back-patch to all supported
branches.

Nikita Glukhov, reviewed by Benjie Gillam and Tomas Vondra

Discussion: https://postgr.es/m/CAP_rww=waX2Oo6q+MbMSiZ9ktdj6eaJj0cQzNu=Ry2cCDij5fw@mail.gmail.com
2020-03-28 17:09:51 -04:00
Noah Misch
63631ee64f Revert "Skip WAL for new relfilenodes, under wal_level=minimal."
This reverts commit cb2fd7eac2.  Per
numerous buildfarm members, it was incompatible with parallel query, and
a test case assumed LP64.  Back-patch to 9.5 (all supported versions).

Discussion: https://postgr.es/m/20200321224920.GB1763544@rfd.leadboat.com
2020-03-22 09:24:13 -07:00
Noah Misch
e4b0a02ef8 Skip WAL for new relfilenodes, under wal_level=minimal.
Until now, only selected bulk operations (e.g. COPY) did this.  If a
given relfilenode received both a WAL-skipping COPY and a WAL-logged
operation (e.g. INSERT), recovery could lose tuples from the COPY.  See
src/backend/access/transam/README section "Skipping WAL for New
RelFileNode" for the new coding rules.  Maintainers of table access
methods should examine that section.

To maintain data durability, just before commit, we choose between an
fsync of the relfilenode and copying its contents to WAL.  A new GUC,
wal_skip_threshold, guides that choice.  If this change slows a workload
that creates small, permanent relfilenodes under wal_level=minimal, try
adjusting wal_skip_threshold.  Users setting a timeout on COMMIT may
need to adjust that timeout, and log_min_duration_statement analysis
will reflect time consumption moving to COMMIT from commands like COPY.

Internally, this requires a reliable determination of whether
RollbackAndReleaseCurrentSubTransaction() would unlink a relation's
current relfilenode.  Introduce rd_firstRelfilenodeSubid.  Amend the
specification of rd_createSubid such that the field is zero when a new
rel has an old rd_node.  Make relcache.c retain entries for certain
dropped relations until end of transaction.

Back-patch to 9.5 (all supported versions).  This introduces a new WAL
record type, XLOG_GIST_ASSIGN_LSN, without bumping XLOG_PAGE_MAGIC.  As
always, update standby systems before master systems.  This changes
sizeof(RelationData) and sizeof(IndexStmt), breaking binary
compatibility for affected extensions.  (The most recent commit to
affect the same class of extensions was
089e4d405d0f3b94c74a2c6a54357a84a681754b.)

Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert
Haas.  Heikki Linnakangas and Michael Paquier implemented earlier
designs that materially clarified the problem.  Reviewed, in earlier
designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane,
Fujii Masao, and Simon Riggs.  Reported by Martijn van Oosterhout.

Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org
2020-03-21 09:38:30 -07:00
Amit Kapila
e37824136f Add missing errcode() in a few ereport calls.
This will allow to specifying SQLSTATE error code for the errors in the
missing places.

Reported-by: Sawada Masahiko
Author: Sawada Masahiko
Backpatch-through: 9.5
Discussion: https://postgr.es/m/CA+fd4k6N8EjNvZpM8nme+y+05mz-SM8Z_BgkixzkA34R+ej0Kw@mail.gmail.com
2020-03-18 09:33:01 +05:30
Tom Lane
2a89455aad Avoid holding a directory FD open across assorted SRF calls.
This extends the fixes made in commit 085b6b667 to other SRFs with the
same bug, namely pg_logdir_ls(), pgrowlocks(), pg_timezone_names(),
pg_ls_dir(), and pg_tablespace_databases().

Also adjust various comments and documentation to warn against
expecting to clean up resources during a ValuePerCall SRF's final
call.

Back-patch to all supported branches, since these functions were
all born broken.

Justin Pryzby, with cosmetic tweaks by me

Discussion: https://postgr.es/m/20200308173103.GC1357@telsasoft.com
2020-03-16 21:05:53 -04:00
Peter Geoghegan
393b449f1a Paper over bt_metap() oldest_xact bug in backbranches.
The data types that contrib/pageinspect's bt_metap() function were
declared to return as OUT arguments were wrong in some cases.  In
particular, the oldest_xact column (a TransactionId/xid field) was
declared integer/int4 within the pageinspect extension's sql file.  This
led to errors when an oldest_xact value that exceeded 2^31-1 was
encountered.

We cannot fix the declaration on Postgres 11 or 12.  All we can do is
ameliorate the problem.  Use "%d" instead of "%u" to format the output
of the oldest_xact value.  This makes the C code match the declaration,
suppressing unhelpful error messages that might otherwise make
bt_metap() totally unusable.  A bogus negative oldest_xact value will be
displayed instead of raising an error.

This commit addresses the same issue as master branch commit 691e8b2e18,
which actually fixed the problem.  Backpatch to the 11 and 12 branches
only, since they are the only branches (other than master) that have
oldest_xact.  All of the other problematic columns already display bogus
output for out of range values.

Reported-By: Victor Yegorov
Bug: #16285
Discussion: https://postgr.es/m/20200309223557.aip5n6ewln4ixbbi@alap3.anarazel.de
Backpatch: 11 and 12 only
2020-03-11 14:15:02 -07:00
Amit Kapila
59112f2355 Stop demanding that top xact must be seen before subxact in decoding.
Manifested as

ERROR:  subtransaction logged without previous top-level txn record

this check forbids legit behaviours like
 - First xl_xact_assignment record is beyond reading, i.e. earlier
   restart_lsn.
 - After restart_lsn there is some change of a subxact.
 - After that, there is second xl_xact_assignment (for another subxact)
   revealing the relationship between top and first subxact.

Such a transaction won't be streamed anyway because we hadn't seen it in
full.  Saying for sure whether xact of some record encountered after
the snapshot was deserialized can be streamed or not requires to know
whether it wrote something before deserialization point --if yes, it
hasn't been seen in full and can't be decoded. Snapshot doesn't have such
info, so there is no easy way to relax the check.

Reported-by: Hsu, John
Diagnosed-by: Arseny Sher
Author: Arseny Sher, Amit Kapila
Reviewed-by: Amit Kapila, Dilip Kumar
Backpatch-through: 9.5
Discussion: https://postgr.es/m/AB5978B2-1772-4FEE-A245-74C91704ECB0@amazon.com
2020-02-19 08:27:15 +05:30
Tom Lane
a4484a6489 In jsonb_plpython.c, suppress warning message from gcc 10.
Very recent gcc complains that PLyObject_ToJsonbValue could return
a pointer to a local variable.  I think it's wrong; but the coding
is fragile enough, and the savings of one palloc() minimal enough,
that it seems better to just do a palloc() all the time.  (My other
idea of tweaking the if-condition doesn't suppress the warning.)

Back-patch to v11 where this code was introduced.

Discussion: https://postgr.es/m/21547.1580170366@sss.pgh.pa.us
2020-01-30 18:26:13 -05:00
Tom Lane
7294f99a0b In postgres_fdw, don't try to ship MULTIEXPR updates to remote server.
In a statement like "UPDATE remote_tab SET (x,y) = (SELECT ...)",
we'd conclude that the statement could be directly executed remotely,
because the sub-SELECT is in a resjunk tlist item that's not examined
for shippability.  Currently that ends up crashing if the sub-SELECT
contains any remote Vars.  Prevent the crash by deeming MULTIEXEC
Params to be unshippable.

This is a bit of a brute-force solution, since if the sub-SELECT
*doesn't* contain any remote Vars, the current execution technology
would work; but that's not a terribly common use-case for this syntax,
I think.  In any case, we generally don't try to ship sub-SELECTs, so
it won't surprise anybody that this doesn't end up as a remote direct
update.  I'd be inclined to see if that general limitation can be fixed
before worrying about this case further.

Per report from Lukáš Sobotka.

Back-patch to 9.6.  9.5 had MULTIEXPR, but we didn't try to perform
remote direct updates then, so the case didn't arise anyway.

Discussion: https://postgr.es/m/CAJif3k+iA_ekBB5Zw2hDBaE1wtiQa4LH4_JUXrrMGwTrH0J01Q@mail.gmail.com
2020-01-26 14:31:08 -05:00
Joe Conway
b5e7569ddd Disallow null category in crosstab_hash
While building a hash map of categories in load_categories_hash,
resulting category names have not thus far been checked to ensure
they are not null. Prior to pg12 null category names worked to the
extent that they did not crash on some platforms. This is because
those system libraries have an snprintf which can deal with being
passed a null pointer argument for a string. But even in those cases
null categories did nothing useful. And on some platforms it crashed.
As of pg12, our own version of snprintf gets called, and it does
not deal with null pointer arguments at all, and crashes consistently.

Fix that by disallowing null categories. They never worked usefully,
and no one has ever asked for them to work previously. Back-patch to
all supported branches.

Reported-By: Ireneusz Pluta
Discussion: https://postgr.es/m/16176-7489719b05e4303c@postgresql.org
2019-12-23 13:33:34 -05:00
Tom Lane
e8f60e6fe2 libpq should expose GSS-related parameters even when not implemented.
We realized years ago that it's better for libpq to accept all
connection parameters syntactically, even if some are ignored or
restricted due to lack of the feature in a particular build.
However, that lesson from the SSL support was for some reason never
applied to the GSSAPI support.  This is causing various buildfarm
members to have problems with a test case added by commit 6136e94dc,
and it's just a bad idea from a user-experience standpoint anyway,
so fix it.

While at it, fix some places where parameter-related infrastructure
was added with the aid of a dartboard, or perhaps with the aid of
the anti-pattern "add new stuff at the end".  It should be safe
to rearrange the contents of struct pg_conn even in released
branches, since that's private to libpq (and we'd have to move
some fields in some builds to fix this, anyway).

Back-patch to all supported branches.

Discussion: https://postgr.es/m/11297.1576868677@sss.pgh.pa.us
2019-12-20 15:34:07 -05:00
Etsuro Fujita
547e454cbc Fix handling of multiple AFTER ROW triggers on a foreign table.
AfterTriggerExecute() retrieves a fresh tuple or pair of tuples from a
tuplestore and then stores the tuple(s) in the passed-in slot(s) if
AFTER_TRIGGER_FDW_FETCH, while it uses the most-recently-retrieved
tuple(s) stored in the slot(s) if AFTER_TRIGGER_FDW_REUSE.  This was
done correctly before 12, but commit ff11e7f4b broke it by mistakenly
clearing the tuple(s) stored in the slot(s) in that function, leading to
an assertion failure as reported in bug #16139 from Alexander Lakhin.

Also, fix some other issues with the aforementioned commit in passing:

* For tg_newslot, which is a slot added to the TriggerData struct by the
  commit to store new updated tuples, it didn't ensure the slot was NULL
  if there was no such tuple.
* The commit failed to update the documentation about the trigger
  interface.

Author: Etsuro Fujita
Backpatch-through: 12
Discussion: https://postgr.es/m/16139-94f9ccf0db6119ec%40postgresql.org
2019-12-10 18:00:31 +09:00
Tomas Vondra
a8a8c6b296 Ensure maxlen is at leat 1 in dict_int
The dict_int text search dictionary template accepts maxlen parameter,
which is then used to cap the length of input strings. The value was
not properly checked, and the code simply does

    txt[d->maxlen] = '\0';

to insert a terminator, leading to segfaults with negative values.

This commit simply rejects values less than 1. The issue was there since
dct_int was introduced in 9.3, so backpatch all the way back to 9.4
which is the oldest supported version.

Reported-by: cili
Discussion: https://postgr.es/m/16144-a36a5bef7657047d@postgresql.org
Backpatch-through: 9.4
2019-12-03 18:40:48 +01:00
Etsuro Fujita
0cc31cc32d postgres_fdw: Fix error message for PREPARE TRANSACTION.
Currently, postgres_fdw does not support preparing a remote transaction
for two-phase commit even in the case where the remote transaction is
read-only, but the old error message appeared to imply that that was not
supported only if the remote transaction modified remote tables.  Change
the message so as to include the case where the remote transaction is
read-only.

Also fix a comment above the message.

Also add a note about the lack of supporting PREPARE TRANSACTION to the
postgres_fdw documentation.

Reported-by: Gilles Darold
Author: Gilles Darold and Etsuro Fujita
Reviewed-by: Michael Paquier and Kyotaro Horiguchi
Backpatch-through: 9.4
Discussion: https://postgr.es/m/08600ed3-3084-be70-65ba-279ab19618a5%40darold.net
2019-11-08 17:00:31 +09:00
Michael Paquier
9fd9af97fb Update test output of sepgsql for ALTER TABLE COLUMN DROP
1df5875 has changed the way dependencies are dropped for this command
with inheritance trees, which impacts sepgsql.  This just updates the
regression test output to take care of the failures and adapt to the new
code.

Reported by buildfarm member rhinoceros.

Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/20191013101331.GC1434@paquier.xyz
Backpatch-through: 12
2019-10-14 08:58:47 +09:00
Michael Paquier
707f38e38e Fix failure with lock mode used for custom relation options
In-core relation options can use a custom lock mode since 47167b7, that
has lowered the lock available for some autovacuum parameters.  However
it forgot to consider custom relation options.  This causes failures
with ALTER TABLE SET when changing a custom relation option, as its lock
is not defined.  The existing APIs to define a custom reloption does not
allow to define a custom lock mode, so enforce its initialization to
AccessExclusiveMode which should be safe enough in all cases.  An
upcoming patch will extend the existing APIs to allow a custom lock mode
to be defined.

The problem can be reproduced with bloom indexes, so add a test there.

Reported-by: Nikolay Sharplov
Analyzed-by: Thomas Munro, Michael Paquier
Author: Michael Paquier
Reviewed-by: Kuntal Ghosh
Discussion: https://postgr.es/m/20190920013831.GD1844@paquier.xyz
Backpatch-through: 9.6
2019-09-25 10:08:26 +09:00
Peter Eisentraut
516a4c116c Message style fixes 2019-09-23 13:37:33 +02:00
Peter Geoghegan
a05fa2c0e7 amcheck: Skip unlogged relations during recovery.
contrib/amcheck failed to consider the possibility that unlogged
relations will not have any main relation fork files when running in hot
standby mode.  This led to low-level "can't happen" errors that complain
about the absence of a relfilenode file.

To fix, simply skip verification of unlogged index relations during
recovery.  In passing, add a direct check for the presence of a main
fork just before verification proper begins, so that we cleanly verify
the presence of the main relation fork file.

Author: Andrey Borodin, Peter Geoghegan
Reported-By: Andrey Borodin
Diagnosed-By: Andrey Borodin
Discussion: https://postgr.es/m/DA9B33AC-53CB-4643-96D4-7A0BBC037FA1@yandex-team.ru
Backpatch: 10-, where amcheck was introduced.
2019-08-12 15:21:30 -07:00
Tom Lane
2f76f41829 Fix intarray's GiST opclasses to not fail for empty arrays with <@.
contrib/intarray considers "arraycol <@ constant-array" to be indexable,
but its GiST opclass code fails to reliably find index entries for empty
array values (which of course should trivially match such queries).
This is because the test condition to see whether we should descend
through a non-leaf node is wrong.

Unfortunately, empty array entries could be anywhere in the index,
as these index opclasses are currently designed.  So there's no way
to fix this except by lobotomizing <@ indexscans to scan the whole
index ... which is what this patch does.  That's pretty unfortunate:
the performance is now actually worse than a seqscan, in most cases.
We'd be better off to remove <@ from the GiST opclasses entirely,
and perhaps a future non-back-patchable patch will do so.

In the meantime, applications whose performance is adversely impacted
have a couple of options.  They could switch to a GIN index, which
doesn't have this bug, or they could replace "arraycol <@ constant-array"
with "arraycol <@ constant-array AND arraycol && constant-array".
That will provide about the same performance as before, and it will find
all non-empty subsets of the given constant-array, which is all that
could reliably be expected of the query before.

While at it, add some more regression test cases to improve code
coverage of contrib/intarray.

In passing, adjust resize_intArrayType so that when it's returning an
empty array, it uses construct_empty_array for that rather than
cowboy hacking on the input array.  While the hack produces an array
that looks valid for most purposes, it isn't bitwise equal to empty
arrays produced by other code paths, which could have subtle odd
effects.  I don't think this code path is performance-critical
enough to justify such shortcuts.  (Back-patch this part only as far
as v11; before commit 01783ac36 we were not careful about this in
other intarray code paths either.)

Back-patch the <@ fixes to all supported versions, since this was
broken from day one.

Patch by me; thanks to Alexander Korotkov for review.

Discussion: https://postgr.es/m/458.1565114141@sss.pgh.pa.us
2019-08-06 18:04:51 -04:00
Tom Lane
df521ab795 Fix handling of "undef" in contrib/jsonb_plperl.
Perl has multiple internal representations of "undef", and just
testing for SvTYPE(x) == SVt_NULL doesn't recognize all of them,
leading to "cannot transform this Perl type to jsonb" errors.
Use the approved test SvOK() instead.

Report and patch by Ivan Panchenko.  Back-patch to v11 where
this module was added.

Discussion: https://postgr.es/m/1564783533.324795401@f193.i.mail.ru
2019-08-04 14:05:35 -04:00