Commit Graph

47254 Commits

Author SHA1 Message Date
Thomas Munro 098344be66 Fix copy-and-paste mistakes in documentation.
Reported-by: Vik Fearing
2019-05-08 21:16:36 +12:00
Etsuro Fujita b7434dc007 Add missing periods to comments. 2019-05-08 16:49:09 +09:00
Heikki Linnakangas 256be1050c Remove leftover reference to old "flat file" mechanism in a comment.
The flat file mechanism was removed in PostgreSQL 9.0.
2019-05-08 09:32:34 +03:00
Peter Geoghegan d65b5ccad6 Correct obsolete nbtsort.c minimum key comment.
It is no longer possible under any circumstances for nbtree code to
reconstruct a strict lower bound key (parent page's pivot tuple key) for
a right sibling page by retrieving the first item in the right sibling
page.
2019-05-07 21:42:12 -07:00
Alexander Korotkov e5f9786317 Add jsonpath_encoding_1.out changes missed in 29ceacc3f9
Reported-by: Tom Lane
Discussion: https://postgr.es/m/14305.1557268259%40sss.pgh.pa.us
2019-05-08 01:55:31 +03:00
Alexander Korotkov 53ae0b16d6 Remove word "singleton" out of jsonpath docs
Word "singleton" is hard for user understanding, especially taking into account
there is only one place it's used in the docs and there is even no definition.
Use more evident wording instead.

Discussion: https://postgr.es/m/23737.1556550645%40sss.pgh.pa.us
2019-05-08 01:02:59 +03:00
Alexander Korotkov 29ceacc3f9 Improve error reporting in jsonpath
This commit contains multiple improvements to error reporting in jsonpath
including but not limited to getting rid of following things:

 * definition of error messages in macros,
 * errdetail() when valueable information could fit to errmsg(),
 * word "singleton" which is not properly explained anywhere,
 * line breaks in error messages.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/14890.1555523005%40sss.pgh.pa.us
Author: Alexander Korotkov
Reviewed-by: Tom Lane
2019-05-08 01:02:59 +03:00
Fujii Masao b84dbc8eb8 Add TRUNCATE parameter to VACUUM.
This commit adds new parameter to VACUUM command, TRUNCATE,
which specifies that VACUUM should attempt to truncate off
any empty pages at the end of the table and allow the disk space
for the truncated pages to be returned to the operating system.

This parameter, if specified, overrides the vacuum_truncate
reloption. If neither the reloption nor the VACUUM option is
used, the default is true, as before.

Author: Fujii Masao
Reviewed-by: Julien Rouhaud, Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoD+qtrSDL=GSma4Wd3kLYLeRC0hPna-YAdkDeV4z156vg@mail.gmail.com
2019-05-08 02:10:33 +09:00
Magnus Hagander 98719af6c2 Fix typos and clarify a comment
Author: Daniel Gustafsson <daniel@yesql.se>
2019-05-07 18:26:09 +02:00
Tom Lane 8d0ddccec6 Avoid "invalid memory alloc request size" while reading pg_stat_activity.
On a 64-bit machine, if you set track_activity_query_size and
max_connections such that their product exceeds 1GB, shared memory
setup will still succeed (given enough RAM), but attempts to read
pg_stat_activity fail with "invalid memory alloc request size".
Work around that by using MemoryContextAllocHuge to allocate the
local copy of the activity strings.  Using the "huge" API costs us
nothing extra in normal cases, and it seems better than throwing
an error and/or explaining to people why they can't do this.

This situation seems insanely profligate today, but who knows what
people will consider normal in ten or twenty years?  So let's fix it
in HEAD but not worry about a back-patch.

Per report from James Tomson.

Discussion: https://postgr.es/m/1CFDCCD6-B268-48D8-85C8-400D2790B2C3@pushd.com
2019-05-07 11:41:37 -04:00
Peter Eisentraut b753bc0c84 doc: Generate keywords table automatically
The SQL keywords table in the documentation had until now been
generated by some ad hoc scripting outside the source tree once for
each major release.  This changes it to an automated process.

We have the PostgreSQL keywords available in a parseable format in
parser/kwlist.h.  For the relevant SQL standard versions, keep the
keyword lists in new text files.  A new script
generate-keywords-table.pl pulls it all together and produces a
DocBook table.

The final output in the documentation should be identical after this
change.

Discussion: https://www.postgresql.org/message-id/flat/07daeadd-8c82-0d95-5e19-e350502cb749%402ndquadrant.com
2019-05-07 15:29:39 +02:00
Amit Kapila 7db0cde6b5 Revert "Avoid the creation of the free space map for small heap relations".
This feature was using a process local map to track the first few blocks
in the relation.  The map was reset each time we get the block with enough
freespace.  It was discussed that it would be better to track this map on
a per-relation basis in relcache and then invalidate the same whenever
vacuum frees up some space in the page or when FSM is created.  The new
design would be better both in terms of API design and performance.

List of commits reverted, in reverse chronological order:

06c8a5090e  Improve code comments in b0eaa4c51b.
13e8643bfc  During pg_upgrade, conditionally skip transfer of FSMs.
6f918159a9  Add more tests for FSM.
9c32e4c350  Clear the local map when not used.
29d108cdec  Update the documentation for FSM behavior..
08ecdfe7e5  Make FSM test portable.
b0eaa4c51b  Avoid creation of the free space map for small heap relations.

Discussion: https://postgr.es/m/20190416180452.3pm6uegx54iitbt5@alap3.anarazel.de
2019-05-07 09:30:24 +05:30
Michael Paquier af82f95abb Remove some code related to 7.3 and older servers from tools of src/bin/
This code was broken as of 582edc3, and is most likely not used anymore.
Note that pg_dump supports servers down to 8.0, and psql has code to
support servers down to 7.4.

Author: Julien Rouhaud
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAOBaU_Y5y=zo3+2gf+2NJC1pvMYPcbRXoQaPXx=U7+C8Qh4CzQ@mail.gmail.com
2019-05-07 09:39:39 +09:00
Bruce Momjian bdf595adbc docs: fist draft version of the PG 12 release notes
Still needs text markup, links, word wrap, and indenting.
2019-05-06 19:02:34 -04:00
Alvaro Herrera a1ec7402e9 Revert "Make pg_dump emit ATTACH PARTITION instead of PARTITION OF"
... and fallout (from branches 10, 11 and master).  The change was
ill-considered, and it broke a few normal use cases; since we don't have
time to fix it, we'll try again after this week's minor releases.

Reported-by: Rushabh Lathia
Discussion: https://postgr.es/m/CAGPqQf0iQV=PPOv2Btog9J9AwOQp6HmuVd6SbGTR_v3Zp2XT1w@mail.gmail.com
2019-05-06 12:23:49 -04:00
Michael Paquier 91248608a6 Add tests for error message generation in partition tuple routing
This adds extra tests for the error message generated for partition
tuple routing in the executor, using more than three levels of
partitioning including partitioned tables with no partitions.  These
tests have been added to fix CVE-2019-10129 on REL_11_STABLE.  HEAD has
no active bugs in this area, but it lacked coverage.

Author: Michael Paquier
Reviewed-by: Noah Misch
Security: CVE-2019-10129
2019-05-06 21:44:24 +09:00
Dean Rasheed a0905056fd Use checkAsUser for selectivity estimator checks, if it's set.
In examine_variable() and examine_simple_variable(), when checking the
user's table and column privileges to determine whether to grant
access to the pg_statistic data, use checkAsUser for the privilege
checks, if it's set. This will be the case if we're accessing the
table via a view, to indicate that we should perform privilege checks
as the view owner rather than the current user.

This change makes this planner check consistent with the check in the
executor, so the planner will be able to make use of statistics if the
table is accessible via the view. This fixes a performance regression
introduced by commit e2d4ef8de8, which affects queries against
non-security barrier views in the case where the user doesn't have
privileges on the underlying table, but the view owner does.

Note that it continues to provide the same safeguards controlling
access to pg_statistic for direct table access (in which case
checkAsUser won't be set) and for security barrier views, because of
the nearby checks on rte->security_barrier and rte->securityQuals.

Back-patch to all supported branches because e2d4ef8de8 was.

Dean Rasheed, reviewed by Jonathan Katz and Stephen Frost.
2019-05-06 11:54:32 +01:00
Dean Rasheed 1aebfbea83 Fix security checks for selectivity estimation functions with RLS.
In commit e2d4ef8de8, security checks were added to prevent
user-supplied operators from running over data from pg_statistic
unless the user has table or column privileges on the table, or the
operator is leakproof. For a table with RLS, however, checking for
table or column privileges is insufficient, since that does not
guarantee that the user has permission to view all of the column's
data.

Fix this by also checking for securityQuals on the RTE, and insisting
that the operator be leakproof if there are any. Thus the
leakproofness check will only be skipped if there are no securityQuals
and the user has table or column privileges on the table -- i.e., only
if we know that the user has access to all the data in the column.

Back-patch to 9.5 where RLS was added.

Dean Rasheed, reviewed by Jonathan Katz and Stephen Frost.

Security: CVE-2019-10130
2019-05-06 11:38:43 +01:00
Tom Lane bd5e8b627b Bring pg_nextoid()'s error messages into line with message style guide.
Noticed while reviewing nearby code.  Given all the disclaimers about
this not being meant as user-facing code, I wonder whether we should
make these non-translatable?  But in any case there's little excuse
for them not to be good English.
2019-05-05 17:06:53 -04:00
Tom Lane 9691aa72e2 Fix style violations in syscache lookups.
Project style is to check the success of SearchSysCacheN and friends
by applying HeapTupleIsValid to the result.  A tiny minority of calls
creatively did it differently.  Bring them into line with the rest.

This is just cosmetic, since HeapTupleIsValid is indeed just a null
check at the moment ... but that may not be true forever, and in any
case it puts a mental burden on readers who may wonder why these
call sites are not like the rest.

Back-patch to v11 just to keep the branches in sync.  (The bulk of these
errors seem to have originated in v11 or v12, though a few are old.)

Per searching to see if anyplace else had made the same error
repaired in 62148c352.
2019-05-05 13:10:07 -04:00
Tom Lane 62148c3520 Add check for syscache lookup failure in update_relispartition().
Omitted in commit 05b38c7e6 (though it looks like the original blame
belongs to 9e9befac4).  A failure is admittedly unlikely, but if it
did happen, SIGSEGV is not the approved method of reporting it.

Per Coverity.  Back-patch to v11 where the broken code originated.
2019-05-05 12:44:32 -04:00
Michael Paquier 84e4570da9 Fix set of issues with memory-allocation system calls in frontend code
Like the backend, the frontend has wrappers on top of malloc() and such
whose use is recommended.  Particularly, it is possible to do memory
allocation without issuing an error.  Some binaries missed the use of
those wrappers, so let's fix the gap for consistency.

This also fixes two latent bugs:
- In pg_dump/pg_dumpall when parsing an ACL item, on an out-of-memory
error for strdup(), the code considered the failure as a ACL parsing
problem instead of an actual OOM.
- In pg_waldump, an OOM when building the target directory string would
cause a crash.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/gY0y9xenfoBPc-Tufsr2Zg-MmkrJslm0Tw_CMg4p_j58-k_PXNC0klMdkKQkg61BkXC9_uWo-DcUzfxnHqpkpoR5jjVZrPHqKYikcHIiONhg=@yesql.se
2019-05-04 16:32:19 +09:00
Noah Misch 34ff542a71 MSVC: Build ~35% faster by calling dumpbin just once per directory.
Peifeng Qiu

Discussion: https://postgr.es/m/CABmtVJiKXQjast0dQD-8KAtfm8XmyYxo-4Dc7+M+fBr8JRTqkw@mail.gmail.com
2019-05-03 21:56:47 -07:00
Noah Misch 726cc4242a Suppress compiler warning in non-SSL, non-assert builds.
Jeff Janes, reviewed by Michael Paquier.

Discussion: https://postgr.es/m/CAMkU=1x8taZfsbPkv_MsWbTtzibW_yQHXoMhF_DTtm=z2hVHDg@mail.gmail.com
2019-05-03 21:56:46 -07:00
Peter Geoghegan 7b37f4b02e Correct more obsolete nbtree page split comments.
Commit 3f342839 corrected obsolete comments about buffer locks at the
main _bt_insert_parent() call site, but missed similar obsolete comments
above _bt_insert_parent() itself.  Both sets of comments were rendered
obsolete by commit 40dae7ec53, which made the nbtree page split
algorithm more robust.  Fix the comments that were missed the first time
around now.

In passing, refine a related _bt_insert_parent() comment about
re-finding the parent page to insert new downlink.
2019-05-03 13:34:45 -07:00
Tom Lane 8309eae499 Doc: remove obsolete comment about per-branch documentation.
I should have removed this in a0b762626, but forgot.
2019-05-03 12:32:06 -04:00
Tom Lane f884dca495 Remove RelationSetIndexList().
In the wake of commit f912d7dec, RelationSetIndexList isn't used any
more.  It was always a horrid wart, so getting rid of it is very nice.
We can also convert rd_indexvalid back to a plain boolean.

Discussion: https://postgr.es/m/28926.1556664156@sss.pgh.pa.us
2019-05-03 10:26:14 -04:00
Tom Lane f912d7dec2 Fix reindexing of pg_class indexes some more.
Commits 3dbb317d3 et al failed under CLOBBER_CACHE_ALWAYS testing.
Investigation showed that to reindex pg_class_oid_index, we must
suppress accesses to the index (via SetReindexProcessing) before we call
RelationSetNewRelfilenode, or at least before we do CommandCounterIncrement
therein; otherwise, relcache reloads happening within the CCI may try to
fetch pg_class rows using the index's new relfilenode value, which is as
yet an empty file.

Of course, the point of 3dbb317d3 was that that ordering didn't work
either, because then RelationSetNewRelfilenode's own update of the index's
pg_class row cannot access the index, should it need to.

There are various ways we might have got around that, but Andres Freund
came up with a brilliant solution: for a mapped index, we can really just
skip the pg_class update altogether.  The only fields it was actually
changing were relpages etc, but it was just setting them to zeroes which
is useless make-work.  (Correct new values will be installed at the end
of index build.)  All pg_class indexes are mapped and probably always will
be, so this eliminates the problem by removing work rather than adding it,
always a pleasant outcome.  Having taught RelationSetNewRelfilenode to do
it that way, we can revert the code reordering in reindex_index.  (But
I left the moved setup code where it was; there seems no reason why it
has to run without use of the old index.  If you're trying to fix a
busted pg_class index, you'll have had to disable system index use
altogether to get this far.)

Moreover, this means we don't need RelationSetIndexList at all, because
reindex_relation's hacking to make "REINDEX TABLE pg_class" work is
likewise now unnecessary.  We'll leave that code in place in the back
branches, but a follow-on patch will remove it in HEAD.

In passing, do some minor cleanup for commit 5c1560606 (in HEAD only),
notably removing a duplicate newrnode assignment.

Patch by me, using a core idea due to Andres Freund.  Back-patch to all
supported branches, as 3dbb317d3 was.

Discussion: https://postgr.es/m/28926.1556664156@sss.pgh.pa.us
2019-05-02 19:11:28 -04:00
Alvaro Herrera 2bf372a4ae heap_prepare_freeze_tuple: Simplify coding
Commit d2599ecfcc introduced some contorted, confused code around:
readers would think that it's possible for HeapTupleHeaderGetXmin return
a non-frozen value for some frozen tuples, which would be disastrous.
There's no actual bug, but it seems better to make it clearer.

Per gripe from Tom Lane and Andres Freund.
Discussion: https://postgr.es/m/30116.1555430496@sss.pgh.pa.us
2019-05-02 16:14:08 -04:00
Peter Geoghegan 6dd86c269d Fix nbtsort.c's page space accounting.
Commit dd299df818, which made heap TID a tiebreaker nbtree index
column, introduced new rules on page space management to make suffix
truncation safe.  In general, suffix truncation needs to have a small
amount of extra space available on the new left page when splitting a
leaf page.  This is needed in case it turns out that truncation cannot
even "truncate away the heap TID column", resulting in a
larger-than-firstright leaf high key with an explicit heap TID
representation.

Despite all this, CREATE INDEX/nbtsort.c did not account for the
possible need for extra heap TID space on leaf pages when deciding
whether or not a new item could fit on current page.  This could lead to
"failed to add item to the index page" errors when CREATE
INDEX/nbtsort.c tried to finish off a leaf page that lacked space for a
larger-than-firstright leaf high key (it only had space for firstright
tuple, which was just short of what was needed following "truncation").

Several conditions needed to be met all at once for CREATE INDEX to
fail.  The problem was in the hard limit on what will fit on a page,
which tends to be masked by the soft fillfactor-wise limit.  The easiest
way to recreate the problem seems to be a CREATE INDEX on a low
cardinality text column, with tuples that are of non-uniform width,
using a fillfactor of 100.

To fix, bring nbtsort.c in line with nbtsplitloc.c, which already
pessimistically assumes that all leaf page splits will have high keys
that have a heap TID appended.

Reported-By: Andreas Joseph Krogh
Discussion: https://postgr.es/m/VisenaEmail.c5.3ee7fe277d514162.16a6d785bea@tc7-visena
2019-05-02 12:33:35 -07:00
Robert Haas dd69597988 Fix some problems with VACUUM (INDEX_CLEANUP FALSE).
The new nleft_dead_tuples and nleft_dead_itemids fields are confusing
and do not seem like the correct way forward.  One of them is tested
via an assertion that can fail, as it has already done on buildfarm
member topminnow.  Remove the assertion and the fields.

Change the logic for the case where a tuple is not initially pruned
by heap_page_prune but later diagnosed HEAPTUPLE_DEAD by
HeapTupleSatisfiesVacuum.  Previously, tupgone = true was set in
that case, which leads to treating the tuple as one that will be
removed.  In a normal vacuum, that's OK, because we'll remove
index entries for it and then the second heap pass will remove the
tuple itself, but when index cleanup is disabled, those things
don't happen, so we must instead treat it as a recently-dead
tuple that we have voluntarily chosen to keep.

Report and analysis by Tom Lane.  This patch loosely based on one
from Masahiko Sawada, but I changed most of it.
2019-05-02 10:07:13 -04:00
Bruce Momjian 26950273dc doc: clarify behavior of pg_upgrade's clone mode
Be more precise about the benefits of using clone mode.
2019-05-01 09:09:28 -04:00
Magnus Hagander 659e53498c Fix union for pgstat message types
The message type for temp files and for checksum failures were missing
from the union. Due to the coding style used there was no compiler error
when this happend. So change the code to actively use the union thereby
producing a compiler error if the same mistake happens again, suggested
by Tom Lane.

Author: Julien Rouhaud
Reported-By: Tomas Vondra
Discussion: https://postgr.es/m/20190430163328.zd4rrlnbvgaqlcdz@development
2019-05-01 12:30:44 +02:00
Andres Freund 809c9b48f4 Run catalog reindexing test from 3dbb317d32 serially, to avoid deadlocks.
The tests turn out to cause deadlocks in some circumstances. Fairly
reproducibly so with -DRELCACHE_FORCE_RELEASE
-DCATCACHE_FORCE_RELEASE.  Some of the deadlocks may be hard to fix
without disproportionate measures, but others probably should be fixed
- but not in 12.

We discussed removing the new tests until we can fix the issues
underlying the deadlocks, but results from buildfarm animal
markhor (which runs with CLOBBER_CACHE_ALWAYS) indicates that there
might be a more severe, as of yet undiagnosed, issue (including on
stable branches) with reindexing catalogs. The failure is:
ERROR: could not read block 0 in file "base/16384/28025": read only 0 of 8192 bytes
Therefore it seems advisable to keep the tests.

It's not certain that running the tests in isolation removes the risk
of deadlocks. It's possible that additional locks are needed to
protect against a concurrent auto-analyze or such.

Per discussion with Tom Lane.

Discussion: https://postgr.es/m/28926.1556664156@sss.pgh.pa.us
Backpatch: 9.4-, like 3dbb317d3
2019-04-30 17:45:32 -07:00
Andres Freund 4b40d40b30 Fix unused variable compiler warning in !debug builds.
Introduced in 3dbb317d3.  Fix by using the new local variable in more
places.

Reported-By: Bruce Momjian (off-list)
Backpatch: 9.4-, like 3dbb317d3
2019-04-30 17:45:32 -07:00
Andres Freund b06a354e38 docs: Fix small copy & paste mistake.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20190418005115.r4mat75wvlski3ij@alap3.anarazel.de
2019-04-30 16:20:07 -07:00
Andres Freund a0b5bb6e02 Improve comment spelling and style in llvmjit_deform.c.
Author: Justin Pryzby
Discussion:
    https://postgr.es/m/20190408141828.GE10080@telsasoft.com
    https://postgr.es/m/20181127184133.GM10913@telsasoft.com
2019-04-30 16:20:07 -07:00
Andres Freund 3a48005b00 Improve code inferring length of bitmap for JITed tuple deforming.
While discussing comment improvements (see next commit) by Justin
Pryzby, Tom complained about a few details of the logic to infer the
length of the NULL bitmap when building the JITed tuple deforming
function. That bitmap allows to avoid checking the tuple header's
natts, a check which often causes a pipeline stall

Improvements:
a) As long as missing columns aren't taken into account, we can
   continue to infer the length of the NULL bitmap from NOT NULL
   columns following it. Previously we stopped at the first missing
   column.  It's unlikely to matter much in practice, but the
   alternative would have been to document why we stop.
b) For robustness reasons it seems better to also check against
   attisdropped - RemoveAttributeById() sets attnotnull to false, but
   an additional check is trivial.
c) Improve related comments

Discussion: https://postgr.es/m/20637.1555957068@sss.pgh.pa.us
Backpatch: -
2019-04-30 16:20:07 -07:00
Tom Lane e03ff73969 Clean up handling of constraint_exclusion and enable_partition_pruning.
The interaction of these parameters was a bit confused/confusing,
and in fact v11 entirely misses the opportunity to apply partition
constraints when a partition is accessed directly (rather than
indirectly from its parent).

In HEAD, establish the principle that enable_partition_pruning controls
partition pruning and nothing else.  When accessing a partition via its
parent, we do partition pruning (if enabled by enable_partition_pruning)
and then there is no need to consider partition constraints in the
constraint_exclusion logic.  When accessing a partition directly, its
partition constraints are applied by the constraint_exclusion logic,
only if constraint_exclusion = on.

In v11, we can't have such a clean division of these GUCs' effects,
partly because we don't want to break compatibility too much in a
released branch, and partly because the clean coding requires
inheritance_planner to have applied partition pruning to a partitioned
target table, which it doesn't in v11.  However, we can tweak things
enough to cover the missed case, which seems like a good idea since
it's potentially a performance regression from v10.  This patch keeps
v11's previous behavior in which enable_partition_pruning overrides
constraint_exclusion for an inherited target table, though.

In HEAD, also teach relation_excluded_by_constraints that it's okay to use
inheritable constraints when trying to prune a traditional inheritance
tree.  This might not be thought worthy of effort given that that feature
is semi-deprecated now, but we have enough infrastructure that it only
takes a couple more lines of code to do it correctly.

Amit Langote and Tom Lane

Discussion: https://postgr.es/m/9813f079-f16b-61c8-9ab7-4363cab28d80@lab.ntt.co.jp
Discussion: https://postgr.es/m/29069.1555970894@sss.pgh.pa.us
2019-04-30 15:03:50 -04:00
Bruce Momjian ad23adc5a1 doc: improve PG 12 to_timestamp()/to_date() wording 2019-04-30 14:06:57 -04:00
Bruce Momjian 3454738625 doc: move "only" to a more appropriate place in the sentence 2019-04-30 13:44:31 -04:00
Alvaro Herrera 9f8b717a80 Message style fixes 2019-04-30 10:33:37 -04:00
Alvaro Herrera 9a83afecb7 Widen tuple counter variables from long to int64
Mistake in ab0dfc961b6a; progress reporting would have wrapped around
for indexes created with more than 2^31 tuples.

Reported-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-Wz=WbNxc5ob5NJ9yqo2RMJ0q4HXDS30GVCobeCvC9A1L9A@mail.gmail.com
2019-04-30 10:27:38 -04:00
Andres Freund 3dbb317d32 Fix potential assertion failure when reindexing a pg_class index.
When reindexing individual indexes on pg_class it was possible to
either trigger an assertion failure:
TRAP: FailedAssertion("!(!ReindexIsProcessingIndex(((index)->rd_id)))

That's because reindex_index() called SetReindexProcessing() - which
enables an asserts ensuring no index insertions happen into the index
- before calling RelationSetNewRelfilenode(). That not correct for
indexes on pg_class, because RelationSetNewRelfilenode() updates the
relevant pg_class row, which needs to update the indexes.

The are two reasons this wasn't noticed earlier. Firstly the bug
doesn't trigger when reindexing all of pg_class, as reindex_relation
has code "hiding" all yet-to-be-reindexed indexes. Secondly, the bug
only triggers when the the update to pg_class doesn't turn out to be a
HOT update - otherwise there's no index insertion to trigger the
bug. Most of the time there's enough space, making this bug hard to
trigger.

To fix, move RelationSetNewRelfilenode() to before the
SetReindexProcessing() (and, together with some other code, to outside
of the PG_TRY()).

To make sure the error checking intended by SetReindexProcessing() is
more robust, modify CatalogIndexInsert() to check
ReindexIsProcessingIndex() even when the update is a HOT update.

Also add a few regression tests for REINDEXing of system catalogs.

The last two improvements would have prevented some of the issues
fixed in 5c1560606d from being introduced in the first place.

Reported-By: Michael Paquier
Diagnosed-By: Tom Lane and Andres Freund
Author: Andres Freund
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/20190418011430.GA19133@paquier.xyz
Backpatch: 9.4-, the bug is present in all branches
2019-04-29 19:42:08 -07:00
Andres Freund 5c1560606d Fix several recently introduced issues around handling new relation forks.
Most of these stem from d25f519107 "tableam: relation creation, VACUUM
FULL/CLUSTER, SET TABLESPACE.".

1) To pass data to the relation_set_new_filenode()
   RelationSetNewRelfilenode() was made to update RelationData.rd_rel
   directly. That's not OK however, as it makes the relcache entries
   temporarily inconsistent. Which among other scenarios is a problem
   if a REINDEX targets an index on pg_class - the
   CatalogTupleUpdate() in RelationSetNewRelfilenode().  Presumably
   that was introduced because other places in the code do so - while
   those aren't "good practice" they don't appear to be actively
   buggy (e.g. because system tables may not be targeted).

   I (Andres) should have caught this while reviewing and signficantly
   evolving the code in that commit, mea culpa.

   Fix that by instead passing in the new RelFileNode as separate
   argument to relation_set_new_filenode() and rely on the relcache to
   update the catalog entry. Also revert that the
   RelationMapUpdateMap() call was changed to immediate, and undo some
   other more unnecessary changes.

2) Document that the relation_set_new_filenode cannot rely on the
   whole relcache entry to be valid. It might be worthwhile to
   refactor the code to never have to rely on that, but given the way
   heap_create() is currently coded, that'd be a large change.

3) ATExecSetTableSpace() shouldn't do FlushRelationBuffers() itself. A
   table AM might not use shared buffers at all. Move to
   index_copy_data() and heapam_relation_copy_data().

4) heapam_relation_set_new_filenode() previously sometimes accessed
   rel->rd_rel->relpersistence rather than the `persistence`
   argument. Code movement mistake.

5) Previously heapam_relation_set_new_filenode() re-opened the smgr
   relation to create the init for, if necesary. Instead have
   RelationCreateStorage() return the SMgrRelation and use it to
   create the init fork.

6) Add a note about the danger of modifying the relcache directly to
   ATExecSetTableSpace() - it's currently not a bug because there's a
   check ERRORing for catalog tables.

Regression tests and assertion improvements that together trigger the
bug described in 1) will be added in a later commit, as there is a
related bug on all branches.

Reported-By: Michael Paquier
Diagnosed-By: Tom Lane and Andres Freund
Author: Andres Freund
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/20190418011430.GA19133@paquier.xyz
2019-04-29 19:28:05 -07:00
Peter Geoghegan 9ee7414ed0 Remove obsolete _bt_insert_parent() comment.
Remove a comment that refers to a coding practice that was fully removed
by commit a8b8f4db, which introduced MarkBufferDirty().  It looks like
the comment was even obsolete before then, since it concerns
write-ordering dependencies with synchronous buffer writes.
2019-04-29 14:14:38 -07:00
Tom Lane a1a789eb5a In walreceiver, don't try to do ereport() in a signal handler.
This is quite unsafe, even for the case of ereport(FATAL) where we won't
return control to the interrupted code, and despite this code's use of
a flag to restrict the areas where we'd try to do it.  It's possible
for example that we interrupt malloc or free while that's holding a lock
that's meant to protect against cross-thread interference.  Then, any
attempt to do malloc or free within ereport() will result in a deadlock,
preventing the walreceiver process from exiting in response to SIGTERM.
We hypothesize that this explains some hard-to-reproduce failures seen
in the buildfarm.

Hence, get rid of the immediate-exit code in WalRcvShutdownHandler,
as well as the logic associated with WalRcvImmediateInterruptOK.
Instead, we need to take care that potentially-blocking operations
in the walreceiver's data transmission logic (libpqwalreceiver.c)
will respond reasonably promptly to the process's latch becoming
set and then call ProcessWalRcvInterrupts.  Much of the needed code
for that was already present in libpqwalreceiver.c.  I refactored
things a bit so that all the uses of PQgetResult use latch-aware
waiting, but didn't need to do much more.

These changes should be enough to ensure that libpqwalreceiver.c
will respond promptly to SIGTERM whenever it's waiting to receive
data.  In principle, it could block for a long time while waiting
to send data too, and this patch does nothing to guard against that.
I think that that hazard is mostly theoretical though: such blocking
should occur only if we fill the kernel's data transmission buffers,
and we don't generally send enough data to make that happen without
waiting for input.  If we find out that the hazard isn't just
theoretical, we could fix it by using PQsetnonblocking, but that
would require more ticklish changes than I care to make now.

This is a bug fix, but it seems like too big a change to push into
the back branches without much more testing than there's time for
right now.  Perhaps we'll back-patch once we have more confidence
in the change.

Patch by me; thanks to Thomas Munro for review.

Discussion: https://postgr.es/m/20190416070119.GK2673@paquier.xyz
2019-04-29 12:26:07 -04:00
Michael Paquier 9c592896d9 Fix some typos
Author: Daniel Gustafsson
Discussion: https://postgr.es/m/42kEeWei6VxLGh12QbR08hiI5Pm-c3XgbK7qj393PSttEhVbnnQoFXHKzXjPRZLUpndWAfHIuZuUqGZBzyXadmEUCSqm9xphWur_I8vESMA=@yesql.se
2019-04-29 23:52:42 +09:00
Alvaro Herrera ffbce803e6 Message fixes 2019-04-29 10:05:45 -04:00
Peter Eisentraut cd3e27464c Fix potential catalog corruption with temporary identity columns
If a temporary table with an identity column and ON COMMIT DROP is
created in a single-statement transaction (not useful, but allowed),
it would leave the catalog corrupted.  We need to add a
CommandCounterIncrement() so that PreCommit_on_commit_actions() sees
the created dependency between table and sequence and can clean it
up.

The analogous and more useful case of doing this in a transaction
block already runs some CommandCounterIncrement() before it gets to
the on-commit cleanup, so it wasn't a problem in practical use.

Several locations for placing the new CommandCounterIncrement() call
were discussed.  This patch places it at the end of
standard_ProcessUtility().  That would also help if other commands
were to create catalog entries that some on-commit action would like
to see.

Bug: #15631
Reported-by: Serge Latyntsev <dnsl48@gmail.com>
Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
2019-04-29 08:49:03 +02:00