Commit Graph

42000 Commits

Author SHA1 Message Date
Noah Misch
5eecc1cafd Document security implications of qualified names.
Commit 5770172cb0 documented secure schema
usage, and that advice suffices for using unqualified names securely.
Document, in typeconv-func primarily, the additional issues that arise
with qualified names.  Back-patch to 9.3 (all supported versions).

Reviewed by Jonathan S. Katz.

Discussion: https://postgr.es/m/20180721012446.GA1840594@rfd.leadboat.com
2018-07-28 20:08:29 -07:00
Bruce Momjian
03131abe9b pgtest: run clean, build, and check stages separately
This allows for cleaner error reporting.

Backpatch-through: 9.5
2018-07-28 15:34:06 -04:00
Bruce Momjian
a326ca75b3 pg_upgrade: check for clean server shutdowns
Previously pg_upgrade checked for the pid file and started/stopped the
server to force a clean shutdown.  However, "pg_ctl -m immediate"
removes the pid file but doesn't do a clean shutdown, so check
pg_controldata for a clean shutdown too.

Diagnosed-by: Vimalraj A

Discussion: https://postgr.es/m/CAFKBAK5e4Q-oTUuPPJ56EU_d2Rzodq6GWKS3ncAk3xo7hAsOZg@mail.gmail.com

Backpatch-through: 9.3
2018-07-28 15:01:55 -04:00
Bruce Momjian
5f2c5890e9 pgtest: grab possible warnings from install.log
Since PG 9.5, 'make check' records the build output in install.log, so
look in there for warnings too.

Backpatch-through: 9.5
2018-07-28 11:35:52 -04:00
Andres Freund
2217fd9e97 doc: Fix reference to "decoder" to instead be the correct "output plugin".
Author: Jonathan Katz
Discussion: https://postgr.es/m/DD02DD86-5989-4BFD-8712-468541F68383@postgresql.org
Backpatch: 9.4-, where logical decoding was added
2018-07-24 10:51:17 -07:00
Tom Lane
2cc81f95e6 Further portability hacking in pg_upgrade's test script.
I blew the dust off a Bourne shell (file date 1996, yea verily) and
tried to run test.sh with it.  It mostly worked, but I found that the
temp-directory creation code introduced by commit be76a6d39 was not
compatible, for a couple of reasons: this shell thinks "set -e" should
force an exit if a command within backticks fails, and it also thinks code
within braces should be executed by a sub-shell, meaning that variable
settings don't propagate back up to the parent shell.  In view of Victor
Wagner's report that Solaris is still using pre-POSIX shells, seems like
we oughta make this case work.  It's not like the code is any less
idiomatic this way; the prior coding technique appeared nowhere else.

(There is a remaining bash-ism here, which is that $RANDOM doesn't do
what the code hopes in non-bash shells.  But the use of $$ elsewhere in
that path should be enough to ensure uniqueness and some amount of
randomness, so I think it's okay as-is.)

Back-patch to all supported branches, as the previous commit was.

Discussion: https://postgr.es/m/20180720153820.69e9ae6c@fafnir.local.vm
2018-07-21 15:40:51 -04:00
Alexander Korotkov
44b550e0d9 Fix handling of empty uncompressed posting list pages in GIN
PostgreSQL 9.4 introduces posting list compression in GIN.  This feature
supports online upgrade, so that after pg_upgrade uncompressed posting
lists are compressed on-the-fly.  Underlying code appears to always
expect at least one item on uncompressed posting list page.  But there
could be completely empty pages, because VACUUM never deletes leftmost
and rightmost pages from posting trees.  This commit fixes that.

Reported-by: Sivasubramanian Ramasubramanian
Discussion: https://postgr.es/m/1531867212836.63354%40amazon.com
Author: Sivasubramanian Ramasubramanian, Alexander Korotkov
Backpatch-through: 9.4
2018-07-19 21:19:19 +03:00
Heikki Linnakangas
8b1d2685b8 Rephrase a few comments for clarity.
I was confused by what "intended to be parallel serially" meant, until
Robert Haas and David G. Johnston explained it. Rephrase the comment to
make it more clear, using David's suggested wording.

Discussion: https://www.postgresql.org/message-id/1fec9022-41e8-e484-70ce-2179b08c2092%40iki.fi
2018-07-19 16:09:41 +03:00
Michael Paquier
164ed1532f Fix print of Path nodes when using OPTIMIZER_DEBUG
GatherMergePath (introduced in 10) and CustomPath (introduced in 9.5)
have gone missing.  The order of the Path nodes was inconsistent with
what is listed in nodes.h, so make the order consistent at the same time
to ease future checks and additions.

Author: Sawada Masahiko
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAD21AoBQMLoc=ohH-oocuAPsELrmk8_EsRJjOyR8FQLZkbE0wA@mail.gmail.com
2018-07-19 09:55:25 +09:00
Heikki Linnakangas
75459bc427 Fix misc typos, mostly in comments.
A collection of typos I happened to spot while reading code, as well as
grepping for common mistakes.

Backpatch to all supported versions, as applicable, to avoid conflicts
when backporting other commits in the future.
2018-07-18 16:20:18 +03:00
Tom Lane
330cad2c4b Fix crash in contrib/ltree's lca() function for empty input array.
lca_inner() wasn't prepared for the possibility of getting no inputs.
Fix that, and make some cosmetic improvements to the code while at it.

Also, I thought the documentation of this function as returning the
"longest common prefix" of the paths was entirely misleading; it really
returns a path one shorter than the longest common prefix, for the typical
definition of "prefix".  Don't use that term in the docs, and adjust the
examples to clarify what really happens.

This has been broken since its beginning, so back-patch to all supported
branches.

Per report from Hailong Li.  Thanks to Pierre Ducroquet for diagnosing
and for the initial patch, though I whacked it around some and added
test cases.

Discussion: https://postgr.es/m/5b0d8e4f-f2a3-1305-d612-e00e35a7be66@qunar.com
2018-07-13 18:45:30 -04:00
Tom Lane
073ffefd80 Fix inadequate buffer locking in FSM and VM page re-initialization.
When reading an existing FSM or VM page that was found to be corrupt by the
buffer manager, the code applied PageInit() to reinitialize the page, but
did so without any locking.  There is thus a hazard that two backends might
concurrently do PageInit, which in itself would still be OK, but the slower
one might then zero over subsequent data changes applied by the faster one.
Even that is unlikely to be fatal; but it's not desirable, so add locking
to prevent it.

This does not add any locking overhead in the normal code path where the
page is OK.  It's not immediately obvious that that's safe, but I believe
it is, for reasons explained in the added comments.

Problem noted by R P Asim.  It's been like this for a long time, so
back-patch to all supported branches.

Discussion: https://postgr.es/m/CANXE4Te4G0TGq6cr0-TvwP0H4BNiK_-hB5gHe8mF+nz0mcYfMQ@mail.gmail.com
2018-07-13 11:53:14 -04:00
Bruce Momjian
9348020cc0 docs: Remove "New" description of the libpqxx interface
Backpatch-through: 9.3
2018-07-13 11:16:55 -04:00
Tom Lane
5593280e20 Doc: minor improvement in pl/pgsql FETCH/MOVE documentation.
Explain that you can use any integer expression for the "count" in
pl/pgsql's versions of FETCH/MOVE, unlike the SQL versions which only
allow a constant.

Remove the duplicate version of this para under MOVE.  I don't see
a good reason to maintain two identical paras when we just said that
MOVE works exactly like FETCH.

Per Pavel Stehule, though I didn't use his text.

Discussion: https://postgr.es/m/CAFj8pRAcvSXcNdUGx43bOK1e3NNPbQny7neoTLN42af+8MYWEA@mail.gmail.com
2018-07-12 12:28:43 -04:00
Michael Paquier
d5eb1fe0dc Make logical WAL sender report streaming state appropriately
WAL senders sending logically-decoded data fail to properly report in
"streaming" state when starting up, hence as long as one extra record is
not replayed, such WAL senders would remain in a "catchup" state, which
is inconsistent with the physical cousin.

This can be easily reproduced by for example using pg_recvlogical and
restarting the upstream server.  The TAP tests have been slightly
modified to detect the failure and strengthened so as future tests also
make sure that a node is in streaming state when waiting for its
catchup.

Backpatch down to 9.4 where this code has been introduced.

Reported-by: Sawada Masahiko
Author: Simon Riggs, Sawada Masahiko
Reviewed-by: Petr Jelinek, Michael Paquier, Vaishnavi Prabakaran
Discussion: https://postgr.es/m/CAD21AoB2ZbCCqOx=bgKMcLrAvs1V0ZMqzs7wBTuDySezTGtMZA@mail.gmail.com
2018-07-12 10:20:14 +09:00
Tom Lane
4b8860e2dd Fix create_scan_plan's handling of sortgrouprefs for physical tlists.
We should only run apply_pathtarget_labeling_to_tlist if CP_LABEL_TLIST
was specified, because only in that case has use_physical_tlist checked
that the labeling will succeed; otherwise we may get an "ORDER/GROUP BY
expression not found in targetlist" error.  (This subsumes the previous
test about gating_clauses, because we reset "flags" to zero earlier
if there are gating clauses to apply.)

The only known case in which a failure can occur is with a ProjectSet
path directly atop a table scan path, although it seems likely that there
are other cases or will be such in future.  This means that the failure
is currently only visible in the v10 branch: 9.6 didn't have ProjectSet,
while in v11 and HEAD, apply_scanjoin_target_to_paths for some weird
reason is using create_projection_path not apply_projection_to_path,
masking the problem because there's a ProjectionPath in between.

Nonetheless this code is clearly wrong on its own terms, so back-patch
to 9.6 where this logic was introduced.

Per report from Regina Obe.

Discussion: https://postgr.es/m/001501d40f88$75186950$5f493bf0$@pcorp.us
2018-07-11 15:25:29 -04:00
Peter Eisentraut
d93c05635f Fix typos 2018-07-10 11:16:04 +02:00
Peter Eisentraut
8caf262097 Fix typo 2018-07-10 11:09:53 +02:00
Tom Lane
568995be67 Avoid emitting a bogus WAL record when recycling an all-zero btree page.
Commit fafa374f2 caused _bt_getbuf() to possibly emit a WAL record for
a page that it was about to recycle.  However, it failed to distinguish
all-zero pages from dead pages, which is important because only the
latter have valid btpo.xact values, or indeed any special space at all.
Recycling an all-zero page with XLogStandbyInfoActive() enabled therefore
led to an Assert failure, or to emission of a WAL record containing a
bogus cutoff XID, which might lead to unnecessary query cancellations
on hot standby servers.

Per reports from Antonin Houska and 自己.  Amit Kapila was first to
propose this fix, and Robert Haas, myself, and Kyotaro Horiguchi
reviewed it at various times.

This is an old bug, so back-patch to all supported branches.

Discussion: https://postgr.es/m/2628.1474272158@localhost
Discussion: https://postgr.es/m/48875502.f4a0.1635f0c27b0.Coremail.zoulx1982@163.com
2018-07-09 19:26:19 -04:00
Tom Lane
ccc286da1d Prevent accidental linking of system-supplied copies of libpq.so etc.
Back-patch commit dddfc4cb2, which broke LDFLAGS and related Makefile
variables into two parts, one for within-build-tree library references and
one for external libraries, to ensure that the order of -L flags has all
of the former before all of the latter.  This turns out to fix a problem
recently noted on buildfarm member peripatus, that we attempted to
incorporate code from libpgport.a into a shared library.  That will fail on
platforms that are sticky about putting non-PIC code into shared libraries.
(It's quite surprising we hadn't seen such failures before, since the code
in question has been like that for a long time.)

I think that peripatus' problem could have been fixed with just a subset
of this patch; but since the previous issue of accidentally linking to the
wrong copy of a Postgres shlib seems likely to bite people in the field,
let's just back-patch the whole change.  Now that commit dddfc4cb2 has
survived some beta testing, I'm less afraid to back-patch it than I was
at the time.

This also fixes undesired inclusion of "-DFRONTEND" in pg_config's CPPFLAGS
output (in 9.6 and up) and undesired inclusion of "-L../../src/common" in
its LDFLAGS output (in all supported branches).

Back-patch to v10 and older branches; this is already in v11.

Discussion: https://postgr.es/m/20180704234304.bq2dxispefl65odz@ler-imac.local
2018-07-09 17:23:31 -04:00
Michael Paquier
619dea4678 Rework order of end-of-recovery actions to delay timeline history write
A critical failure in some of the end-of-recovery actions before the
end-of-recovery record is written can cause PostgreSQL to react
inconsistently with the rest of the cluster in the event of a crash
before the final record is written.  Two such failures are for example
an error while processing a two-phase state files or when operating on
recovery.conf.  With this commit, the failures are still considered
FATAL, but the write of the timeline history file is delayed as much as
possible so as the window between the moment the file is written and the
end-of-recovery record is generated gets minimized. This way, in the
event of a crash or a failure, the new timeline decided at promotion
will not seem taken by other nodes in the cluster.  It is not really
possible to reduce to zero this window, hence one could still see
failures if a crash happens between the history file write and the
end-of-recovery record, so any future code should be careful when
adding new end-of-recovery actions.  The original report from Magnus
Hagander mentioned a renamed recovery.conf as original end-of-recovery
failure which caused a timeline to be seen as taken but the subsequent
processing on the now-missing recovery.conf cause the startup process to
issue stop on FATAL, which at follow-up startup made the system
inconsistent because of on-disk changes which already happened.

Processing of two-phase state files still needs some work as corrupted
entries are simply ignored now.  This is left as a future item and this
commit fixes the original complain.

Reported-by: Magnus Hagander
Author: Heikki Linnakangas
Reviewed-by: Alexander Korotkov, Michael Paquier, David Steele
Discussion: https://postgr.es/m/CABUevEz09XY2EevA2dLjPCY-C5UO4Hq=XxmXLmF6ipNFecbShQ@mail.gmail.com
2018-07-09 10:26:41 +09:00
Michael Paquier
741ad15f3b Add note in pg_rewind documentation about read-only files
When performing pg_rewind, the presence of a read-only file which is not
accessible for writes will cause a failure while processing.  This can
cause the control file of the target data folder to be truncated,
causing it to not be reusable with a successive run.

Also, when pg_rewind fails mid-flight, there is likely no way to be able
to recover the target data folder anyway, in which case a new base
backup is the best option.  A note is added in the documentation as
well about.

Reported-by: Christian H.
Author: Michael Paquier
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/20180104200633.17004.16377%40wrigleys.postgresql.org
2018-07-07 08:10:47 +09:00
Alvaro Herrera
95b6a40915 logical decoding: beware of an unset specinsert change
Coverity complains that there is no protection in the code (at least in
non-assertion-enabled builds) against speculative insertion failing to
follow the expected protocol.  Add an elog(ERROR) for the case.
2018-07-05 17:42:37 -04:00
Alvaro Herrera
301b2a1aad Reduce cost of test_decoding's new oldest_xmin test
Change a whole-database VACUUM into doing just pg_attribute, which is
the portion that verifies what we want it to do.  The original
formulation wastes a lot of CPU time, which leads the test to fail when
runtime exceeds isolationtester timeout when it's super-slow, such as
under CLOBBER_CACHE_ALWAYS.  Per buildfarm member friarbird.

It turns out that the previous shape of the test doesn't always detect
the condition it is supposed to detect (on unpatched reorderbuffer
code): the reason is that there is a good chance of encountering a
xl_running_xacts record (logged every 15 seconds) before the checkpoint
-- and because we advance the xmin when we receive that WAL record, and
we *don't* advance the xmin twice consecutively without receiving a
client message in between, that means the xmin is not advanced enough
for the tuple to be pruned from pg_attribute by VACUUM.  So the test
would spuriously pass.

The reason this test deficiency wasn't detected earlier is that HOT
pruning removes the tuple anyway, even if vacuum leaves it in place, so
the test correctly fails (detecting the coding mistake), but for the
wrong reason.

To fix this mess, run the s0_get_changes step twice before vacuum
instead of once: this seems to cause the xmin to be advanced reliably,
wreaking havoc with more certainty.

Author: Arseny Sher
Discussion: https://postgr.es/m/87h8lkuxoa.fsf@ars-thinkpad
2018-07-05 16:54:52 -04:00
Michael Paquier
8d68ee6f31 Prevent references to invalid relation pages after fresh promotion
If a standby crashes after promotion before having completed its first
post-recovery checkpoint, then the minimal recovery point which marks
the LSN position where the cluster is able to reach consistency may be
set to a position older than the first end-of-recovery checkpoint while
all the WAL available should be replayed.  This leads to the instance
thinking that it contains inconsistent pages, causing a PANIC and a hard
instance crash even if all the WAL available has not been replayed for
certain sets of records replayed.  When in crash recovery,
minRecoveryPoint is expected to always be set to InvalidXLogRecPtr,
which forces the recovery to replay all the WAL available, so this
commit makes sure that the local copy of minRecoveryPoint from the
control file is initialized properly and stays as it is while crash
recovery is performed.  Once switching to archive recovery or if crash
recovery finishes, then the local copy minRecoveryPoint can be safely
updated.

Pavan Deolasee has reported and diagnosed the failure in the first
place, and the base fix idea to rely on the local copy of
minRecoveryPoint comes from Kyotaro Horiguchi, which has been expanded
into a full-fledged patch by me.  The test included in this commit has
been written by Álvaro Herrera and Pavan Deolasee, which I have modified
to make it faster and more reliable with sleep phases.

Backpatch down to all supported versions where the bug appears, aka 9.3
which is where the end-of-recovery checkpoint is not run by the startup
process anymore.  The test gets easily supported down to 10, still it
has been tested on all branches.

Reported-by: Pavan Deolasee
Diagnosed-by: Pavan Deolasee
Reviewed-by: Pavan Deolasee, Kyotaro Horiguchi
Author: Michael Paquier, Kyotaro Horiguchi, Pavan Deolasee, Álvaro
Herrera
Discussion: https://postgr.es/m/CABOikdPOewjNL=05K5CbNMxnNtXnQjhTx2F--4p4ruorCjukbA@mail.gmail.com
2018-07-05 10:47:19 +09:00
Andres Freund
2adadf018b Check for interrupts inside the nbtree page deletion code.
When deleting pages the nbtree code has to walk through siblings of a
tree node. When those sibling links are corrupted that can lead to
endless loops - which are currently not interruptible.  This is
especially problematic if autovacuum is repeatedly blocked on such
indexes, as it can be hard to get out of that situation without
resorting to single user mode.

Thus add interrupt checks to appropriate places in such
loops. Unfortunately in one of the cases it's it's not easy to do so.

Between 9.3 and 9.4 the page deletion (and page split) code changed
significantly. Before it was significantly less robust against
interruptions. Therefore don't backpatch to 9.3.

Author: Andres Freund
Discussion: https://postgr.es/m/20180627191629.wkunw2qbibnvlz53@alap3.anarazel.de
Backpatch: 9.4-
2018-07-04 14:58:26 -07:00
Fujii Masao
7da22d866d Improve the performance of relation deletes during recovery.
When multiple relations are deleted at the same transaction,
the files of those relations are deleted by one call to smgrdounlinkall(),
which leads to scan whole shared_buffers only one time. OTOH,
previously, during recovery, smgrdounlink() (not smgrdounlinkall()) was
called for each file to delete, which led to scan shared_buffers
multiple times. Obviously this could cause to increase the WAL replay
time very much especially when shared_buffers was huge.

To alleviate this situation, this commit changes the recovery so that
it also calls smgrdounlinkall() only one time to delete multiple
relation files.

This is just fix for oversight of commit 279628a0a7, not new feature.
So, per discussion on pgsql-hackers, we concluded to backpatch this
to all supported versions.

Author: Fujii Masao
Reviewed-by: Michael Paquier, Andres Freund, Thomas Munro, Kyotaro Horiguchi, Takayuki Tsunakawa
Discussion: https://postgr.es/m/CAHGQGwHVQkdfDqtvGVkty+19cQakAydXn1etGND3X0PHbZ3+6w@mail.gmail.com
2018-07-05 02:27:05 +09:00
Peter Eisentraut
d7e5805399 Fix libpq example programs
When these programs call pg_catalog.set_config, they need to check for
PGRES_TUPLES_OK instead of PGRES_COMMAND_OK.  Fix for
5770172cb0.

Reported-by: Ideriha, Takeshi <ideriha.takeshi@jp.fujitsu.com>
2018-07-01 14:09:11 +02:00
Michael Paquier
7e0786e4d9 Replace search.cpan.org with metacpan.org
search.cpan.org has been EOL'd, with metacpan.org being the official
replacement to which URLs now redirect.  Update links to match the new
URL. Also update links to CPAN to use https as it will redirect from
http.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/B74C0219-6BA9-46E1-A524-5B9E8CD3BDB3@yesql.se
2018-06-29 22:17:54 +09:00
Alvaro Herrera
da10d6a8a9 Fix "base" snapshot handling in logical decoding
Two closely related bugs are fixed.  First, xmin of logical slots was
advanced too early.  During xl_running_xacts processing, xmin of the
slot was set to the oldest running xid in the record, but that's wrong:
actually, snapshots which will be used for not-yet-replayed transactions
might consider older txns as running too, so we need to keep xmin back
for them.  The problem wasn't noticed earlier because DDL which allows
to delete tuple (set xmax) while some another not-yet-committed
transaction looks at it is pretty rare, if not unique: e.g. all forms of
ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock
conflicting with any inserts. The included test case (test_decoding's
oldest_xmin) uses ALTER of a composite type, which doesn't have such
interlocking.

To deal with this, we must be able to quickly retrieve oldest xmin
(oldest running xid among all assigned snapshots) from ReorderBuffer. To
fix, add another list of ReorderBufferTXNs to the reorderbuffer, where
transactions are sorted by base-snapshot-LSN.  This is slightly
different from the existing (sorted by first-LSN) list, because a
transaction can have an earlier LSN but a later Xmin, if its first
record does not obtain an xmin (eg. xl_xact_assignment).  Note this new
list doesn't fully replace the existing txn list: we still need that one
to prevent WAL recycling.

The second issue concerns SnapBuilder snapshots and subtransactions.
SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a
transaction that is known to be a subtxn, which is good in the common
case that the top-level transaction already has one (no point in doing
so), but a bug otherwise.  To fix, arrange to transfer the snapshot from
the subtxn to its top-level txn as soon as the kinship gets known.
test_decoding's snapshot_transfer verifies this.

Also, fix a minor memory leak: refcount of toplevel's old base snapshot
was not decremented when the snapshot is transferred from child.

Liberally sprinkle code comments, and rewrite a few existing ones.  This
part is my (Álvaro's) contribution to this commit, as I had to write all
those comments in order to understand the existing code and Arseny's
patch.

Reported-by: Arseny Sher <a.sher@postgrespro.ru>
Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad
2018-06-26 16:38:34 -04:00
Fujii Masao
d06d92cd2d Fix documentation bug related to backup history file.
The backup history file has been no longer necessary for recovery
since the version 9.0. It's now basically just for informational purpose.
But previously the documentations still described that a recovery
requests the backup history file to proceed. The commit fixes this
documentation bug.

Back-patch to all supported versions.

Author: Yugo Nagata
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20180626174752.0ce505e3.nagata@sraoss.co.jp
2018-06-27 00:51:07 +09:00
Thomas Munro
b5b973ef0b Add PGTYPESchar_free() to avoid cross-module problems on Windows.
On Windows, it is sometimes important for corresponding malloc() and
free() calls to be made from the same DLL, since some build options can
result in multiple allocators being active at the same time.  For that
reason we already provided PQfreemem().  This commit adds a similar
function for freeing string results allocated by the pgtypes library.

Author: Takayuki Tsunakawa
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8AD5D6%40G01JPEXMBYT05
2018-06-26 20:54:09 +12:00
Thomas Munro
35750a38b7 Move RecoveryLockList into a hash table.
Standbys frequently need to release all locks held by a given xid.
Instead of searching one big list linearly, let's create one list
per xid and put them in a hash table, so we can find what we need
in O(1) time.

Earlier analysis and a prototype were done by David Rowley, though
this isn't his patch.

Back-patch all the way.

Author: Thomas Munro
Diagnosed-by: David Rowley, Andres Freund
Reviewed-by: Andres Freund, Tom Lane, Robert Haas
Discussion: https://postgr.es/m/CAEepm%3D1mL0KiQ2KJ4yuPpLGX94a4Ns_W6TL4EGRouxWibu56pA%40mail.gmail.com
Discussion: https://postgr.es/m/CAKJS1f9vJ841HY%3DwonnLVbfkTWGYWdPN72VMxnArcGCjF3SywA%40mail.gmail.com
2018-06-26 18:44:31 +12:00
Michael Paquier
7fdf56b0a7 Address set of issues with errno handling
System calls mixed up in error code paths are causing two issues which
several code paths have not correctly handled:
1) For write() calls, sometimes the system may return less bytes than
what has been written without errno being set.  Some paths were careful
enough to consider that case, and assumed that errno should be set to
ENOSPC, other calls missed that.
2) errno generated by a system call is overwritten by other system calls
which may succeed once an error code path is taken, causing what is
reported to the user to be incorrect.

This patch uses the brute-force approach of correcting all those code
paths.  Some refactoring could happen in the future, but this is let as
future work, which is not targeted for back-branches anyway.

Author: Michael Paquier
Reviewed-by: Ashutosh Sharma
Discussion: https://postgr.es/m/20180622061535.GD5215@paquier.xyz
2018-06-25 11:20:50 +09:00
Bruce Momjian
037768cf75 doc: adjust order of NUMERIC arguments to match syntax
Specifically, mention precision before scale

Reported-by: claytonjsalem@gmail.com

Discussion: https://postgr.es/m/152967566691.1268.1062965601465200209@wrigleys.postgresql.org

Backpatch-through: 9.3
2018-06-24 18:07:00 -04:00
Bruce Momjian
ae36661661 doc: show how interval's 3 unit buckets behave using EXTRACT()
This clarifies when justify_days() and justify_hours() are useful.
Paragraph moved too.

Reported-by: vodevsh@gmail.com

Discussion: https://postgr.es/m/152698651482.26744.15456677499485530703@wrigleys.postgresql.org

Backpatch-through: 9.3
2018-06-23 23:32:41 -04:00
Magnus Hagander
39686cd7f0 Fix typo
Reported using the website comment form
2018-06-20 16:07:30 +02:00
Bruce Momjian
4832c4c191 doc: explain use of json_populate_record{set}()
The set-returning nature of these functions make their use unclear. The
modified paragraph was added in PG 9.4.

Reported-by: yshaladi@denodo.com

Discussion:  https://postgr.es/m/152571684246.9460.18059951267371255159@wrigleys.postgresql.org

Backpatch-through: 9.4
2018-06-19 13:43:40 -04:00
Tom Lane
1bebfb9b63 Fix contrib/hstore_plperl to look through scalar refs.
Bring this transform function into sync with the policy established
by commit 3a382983d.

Also, fix it to make sure that what it drills down to is indeed a
hash, and not some other kind of Perl SV.  Previously, the test
cases added here provoked crashes.

Because of the crash hazard, back-patch to 9.5 where this module
was introduced.

Discussion: https://postgr.es/m/28336.1528393969@sss.pgh.pa.us
2018-06-18 15:55:06 -04:00
Michael Paquier
5860b22c45 Prevent hard failures of standbys caused by recycled WAL segments
When a standby's WAL receiver stops reading WAL from a WAL stream, it
writes data to the current WAL segment without having priorily zero'ed
the page currently written to, which can cause the WAL reader to read
junk data from a past recycled segment and then it would try to get a
record from it.  While sanity checks in place provide most of the
protection needed, in some rare circumstances, with chances increasing
when a record header crosses a page boundary, then the startup process
could fail violently on an allocation failure, as follows:
FATAL:  invalid memory alloc request size XXX

This is confusing for the user and also unhelpful as this requires in
the worst case a manual restart of the instance, impacting potentially
the availability of the cluster, and this also makes WAL data look like
it is in a corrupted state.

The chances of seeing failures are higher if the connection between the
standby and its root node is unstable, causing WAL pages to be written
in the middle.  A couple of approaches have been discussed, like
zero-ing  new WAL pages within the WAL receiver itself but this has the
disadvantage of impacting performance of any existing instances as this
breaks the sequential writes done by the WAL receiver.  This commit
deals with the problem with a more simple approach, which has no
performance impact without reducing the detection of the problem: if a
record is found with a length higher than 1GB for backends, then do not
try any allocation and report a soft failure which will force the
standby to retry reading WAL.  It could be possible that the allocation
call passes and that an unnecessary amount of memory is allocated,
however follow-up checks on records would just fail, making this
allocation short-lived anyway.

This patch owes a great deal to Tsunakawa Takayuki for reporting the
failure first, and then discussing a couple of potential approaches to
the problem.

Backpatch down to 9.5, which is where palloc_extended has been
introduced.

Reported-by: Tsunakawa Takayuki
Reviewed-by: Tsunakawa Takayuki
Author: Michael Paquier
Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8B57AD@G01JPEXMBYT05
2018-06-18 10:43:59 +09:00
Tom Lane
119290be6f Use -Wno-format-truncation and -Wno-stringop-truncation, if available.
gcc 8 has started emitting some warnings that are largely useless for
our purposes, particularly since they complain about code following
the project-standard coding convention that path names are assumed
to be shorter than MAXPGPATH.  Even if we make the effort to remove
that assumption in some future release, the changes wouldn't get
back-patched.  Hence, just suppress these warnings, on compilers that
have these switches.

Backpatch to all supported branches.

Discussion: https://postgr.es/m/1524563856.26306.9.camel@gunduz.org
2018-06-16 15:34:07 -04:00
Tom Lane
58065f9eed Avoid unnecessary use of strncpy in a couple of places in ecpg.
Use of strncpy with a length limit based on the source, rather than
the destination, is non-idiomatic and draws warnings from gcc 8.
Replace with memcpy, which does exactly the same thing in these cases,
but with less chance for confusion.

Backpatch to all supported branches.

Discussion: https://postgr.es/m/21789.1529170195@sss.pgh.pa.us
2018-06-16 14:58:31 -04:00
Tom Lane
8870e2978f Use snprintf not sprintf in pg_waldump's timestamptz_to_str.
This could only cause an issue if strftime returned a ridiculously
long timezone name, which seems unlikely; and it wouldn't qualify
as a security problem even then, since pg_waldump (nee pg_xlogdump)
is a debug tool not part of the server.  But gcc 8 has started issuing
warnings about it, so let's use snprintf and be safe.

Backpatch to 9.3 where this code was added.

Discussion: https://postgr.es/m/21789.1529170195@sss.pgh.pa.us
2018-06-16 14:45:47 -04:00
Alvaro Herrera
e951f66dea Fail BRIN control functions during recovery explicitly
They already fail anyway, but prior to this patch they raise an ugly
error message about a lock that cannot be acquired.  This just improves
the message.

Author: Masahiko Sawada
Reported-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoBZau4g4_NUf3BKNd=CdYK+xaPdtJCzvOC1TxGdTiJx_Q@mail.gmail.com
Reviewed-by: Kuntal Ghosh, Alexander Korotkov, Simon Riggs, Michaël Paquier, Álvaro Herrera
2018-06-14 12:51:32 -04:00
Alexander Korotkov
c7699d9acf Documentation improvement for pg_trgm
Documentation of word_similarity() and strict_word_similarity() functions
contains some vague wordings which could confuse users.  This patch makes
those wordings more clear.  word_similarity() was introduced in PostgreSQL 9.6,
and corresponding part of documentation needs to be backpatched.

Author: Bruce Momjian, Alexander Korotkov
Discussion: https://postgr.es/m/20180526165648.GB12510%40momjian.us
Backpatch: 9.6, where word_similarity() was introduced
2018-06-13 18:31:18 +03:00
Andres Freund
6a46aba1cd Fix bugs in vacuum of shared rels, by keeping their relcache entries current.
When vacuum processes a relation it uses the corresponding relcache
entry's relfrozenxid / relminmxid as a cutoff for when to remove
tuples etc. Unfortunately for nailed relations (i.e. critical system
catalogs) bugs could frequently lead to the corresponding relcache
entry being stale.

This set of bugs could cause actual data corruption as vacuum would
potentially not remove the correct row versions, potentially reviving
them at a later point.  After 699bf7d05c some corruptions in this vein
were prevented, but the additional error checks could also trigger
spuriously. Examples of such errors are:
  ERROR: found xmin ... from before relfrozenxid ...
and
  ERROR: found multixact ... from before relminmxid ...
To be caused by this bug the errors have to occur on system catalog
tables.

The two bugs are:

1) Invalidations for nailed relations were ignored, based on the
   theory that the relcache entry for such tables doesn't
   change. Which is largely true, except for fields like relfrozenxid
   etc.  This means that changes to relations vacuumed in other
   sessions weren't picked up by already existing sessions.  Luckily
   autovacuum doesn't have particularly longrunning sessions.

2) For shared *and* nailed relations, the shared relcache init file
   was never invalidated while running.  That means that for such
   tables (e.g. pg_authid, pg_database) it's not just already existing
   sessions that are affected, but even new connections are as well.
   That explains why the reports usually were about pg_authid et. al.

To fix 1), revalidate the rd_rel portion of a relcache entry when
invalid. This implies a bit of extra complexity to deal with
bootstrapping, but it's not too bad.  The fix for 2) is simpler,
simply always remove both the shared and local init files.

Author: Andres Freund
Reviewed-By: Alvaro Herrera
Discussion:
    https://postgr.es/m/20180525203736.crkbg36muzxrjj5e@alap3.anarazel.de
    https://postgr.es/m/CAMa1XUhKSJd98JW4o9StWPrfS=11bPgG+_GDMxe25TvUY4Sugg@mail.gmail.com
    https://postgr.es/m/CAKMFJucqbuoDRfxPDX39WhA3vJyxweRg_zDVXzncr6+5wOguWA@mail.gmail.com
    https://postgr.es/m/CAGewt-ujGpMLQ09gXcUFMZaZsGJC98VXHEFbF-tpPB0fB13K+A@mail.gmail.com
Backpatch: 9.3-
2018-06-12 11:13:21 -07:00
Michael Paquier
256d43128d Fix grammar in documentation related to checkpoint_flush_after
Reported-by: Christopher Jones
2018-06-11 09:58:02 +09:00
Michael Paquier
5bb08c7000 Fix grammar in REVOKE documentation
Reported-by: Erwin Brandstetter
2018-06-10 22:49:07 +09:00
Alvaro Herrera
1a552256f6 Fix function code in error report
This bug causes a lseek() failure to be reported as a "could not open"
failure in the error message, muddling bug reports.  I introduced this
copy-and-pasteo in commit 78e1220104.

Noticed while reviewing code for bug report #15221, from lily liang.  In
version 10 the affected function is only used by multixact.c and
commit_ts, and only in corner-case circumstances, neither of which are
involved in the reported bug (a pg_subtrans failure.)

Author: Álvaro Herrera
2018-06-06 14:47:57 -04:00
Bruce Momjian
be565fa89d doc: mark 'replaceable' parameter for backup program listing
Reported-by: Liudmila Mantrova

Discussion: https://postgr.es/m/f3e2c0f5-5266-d626-58d7-b77e1b29d870@postgrespro.ru

Author: Liudmila Mantrova

Backpatch-through: 9.3
2018-05-28 14:19:45 -04:00