Commit Graph

21008 Commits

Author SHA1 Message Date
Peter Eisentraut c005eb00e7 Standardize the printf format for st_size
Existing code used various inconsistent ways to printf struct stat's
st_size member.  The type of that is off_t, which is in most cases a
signed 64-bit integer, so use the long long int format for it.
2020-09-24 21:04:21 +02:00
Robert Haas f5ea92e8d6 Expose oldSnapshotControl definition via new header.
This makes it possible for code outside snapmgr.c to examine the
contents of this data structure. This commit does not add any code
which actually does so; a subsequent commit will make that change.

Patch by me, reviewed by Thomas Munro, Dilip Kumar, Hamid Akhtar.

Discussion: http://postgr.es/m/CA+TgmoY=aqf0zjTD+3dUWYkgMiNDegDLFjo+6ze=Wtpik+3XqA@mail.gmail.com
2020-09-24 13:33:09 -04:00
Tom Lane 83b61319a1 Improve behavior of tsearch_readline(), and remove t_readline().
Commit fbeb9da22, which added the tsearch_readline APIs, left
t_readline() in place as a compatibility measure.  But that function
has been unused and deprecated for twelve years now, so that seems
like enough time to remove it.  Doing so, and merging t_readline's
code into tsearch_readline, aids in making several useful
improvements:

* The hard-wired 4K limit on line length in tsearch data files is
removed, by using a StringInfo buffer instead of a fixed-size buffer.

* We can buy back the per-line palloc/pfree added by 3ea7e9550
in the common case where encoding conversion is not required.

* We no longer need a separate pg_verify_mbstr call, as that
functionality was folded into encoding conversion some time ago.

(We could have done some of this stuff while keeping t_readline as a
separate API, but there seems little point, since there's no reason
for anyone to still be using t_readline directly.)

Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
2020-09-23 20:26:58 -04:00
Thomas Munro aca74843e4 Fix missing fsync of SLRU directories.
Harmonize behavior by moving reponsibility for fsyncing directories down
into slru.c.  In 10 and later, only the multixact directories were
missed (see commit 1b02be21), and in older branches all SLRUs were
missed.

Back-patch to all supported releases.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGLtsTUOScnNoSMZ-2ZLv%2BwGh01J6kAo_DM8mTRq1sKdSQ%40mail.gmail.com
2020-09-24 10:39:52 +12:00
Tom Lane 6b2c4e59d0 Improve error cursor positions for problems with partition bounds.
We failed to pass down the query string to check_new_partition_bound,
so that its attempts to provide error cursor positions were for naught;
one must have the query string to get parser_errposition to do anything.
Adjust its API to require a ParseState to be passed down.

Also, improve the logic inside check_new_partition_bound so that the
cursor points at the partition bound for the specific column causing
the issue, when one can be identified.

That part is also for naught if we can't determine the query position of
the column with the problem.  Improve transformPartitionBoundValue so
that it makes sure that const-simplified partition expressions will be
properly labeled with positions.  In passing, skip calling evaluate_expr
if the value is already a Const, which is surely the most common case.

Alexandra Wang, Ashwin Agrawal, Amit Langote; reviewed by Ashutosh Bapat

Discussion: https://postgr.es/m/CACiyaSopZoqssfMzgHk6fAkp01cL6vnqBdmTw2C5_KJaFR_aMg@mail.gmail.com
Discussion: https://postgr.es/m/CAJV4CdrZ5mKuaEsRSbLf2URQ3h6iMtKD=hik8MaF5WwdmC9uZw@mail.gmail.com
2020-09-23 18:04:53 -04:00
Tom Lane 3ea7e9550e Avoid possible dangling-pointer access in tsearch_readline_callback.
tsearch_readline() saves the string pointer it returns to the caller
for possible use in the associated error context callback.  However,
the caller will usually pfree that string sometime before it next
calls tsearch_readline(), so that there is a window where an ereport
will try to print an already-freed string.

The built-in users of tsearch_readline() happen to all do that pfree
at the bottoms of their loops, so that the window is effectively
empty for them.  However, this is not documented as a requirement,
and contrib/dict_xsyn doesn't do it like that, so it seems likely
that third-party dictionaries might have live bugs here.

The practical consequences of this seem pretty limited in any case,
since production builds wouldn't clobber the freed string immediately,
besides which you'd not expect syntax errors in dictionary files
being used in production.  Still, it's clearly a bug waiting to bite
somebody.

Fix by pstrdup'ing the string to be saved for the error callback,
and then pfree'ing it next time through.  It's been like this for
a long time, so back-patch to all supported branches.

Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se
2020-09-23 11:36:13 -04:00
Thomas Munro 733fa9aa51 Allow WaitLatch() to be used without a latch.
Due to flaws in commit 3347c982ba, using WaitLatch() without
WL_LATCH_SET could cause an assertion failure or crash.  Repair.

While here, also add a check that the latch we're switching to belongs
to this backend, when changing from one latch to another.

Discussion: https://postgr.es/m/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com
2020-09-23 15:17:30 +12:00
Tom Lane ce90f075f0 Improve the error message for an inappropriate column definition list.
The existing message about "a column definition list is only allowed for
functions returning "record"" could be given in some cases where it was
fairly confusing; in particular, a function with multiple OUT parameters
*does* return record according to pg_proc.  Break it down into a couple
more cases to deliver a more on-point complaint.  Per complaint from
Bruce Momjian.

Discussion: https://postgr.es/m/798909.1600562993@sss.pgh.pa.us
2020-09-22 10:49:11 -04:00
Tom Lane f859c2ffa0 Fix a few more generator scripts to produce pgindent-clean output.
This completes the project of making all our derived files be
pgindent-clean (or else explicitly excluded from indentation),
so that no surprises result when running pgindent in a built-out
development tree.

Discussion: https://postgr.es/m/79ed5348-be7a-b647-dd40-742207186a22@2ndquadrant.com
2020-09-21 13:58:26 -04:00
Tom Lane 9436041ed8 Copy editing: fix a bunch of misspellings and poor wording.
99% of this is docs, but also a couple of comments.  No code changes.

Justin Pryzby

Discussion: https://postgr.es/m/20200919175804.GE30557@telsasoft.com
2020-09-21 12:43:42 -04:00
Peter Eisentraut 80fc96eceb Standardize order of use strict and use warnings in Perl code
The standard order in PostgreSQL and other code is use strict first,
but some code was uselessly inconsistent about this.
2020-09-21 17:04:36 +02:00
Heikki Linnakangas c47a240fe6 Fix checksum calculation in the new sorting GiST build.
Since we're bypassing the buffer manager, we need to call
PageSetChecksumInplace() directly. As reported by Justin Pryzby.

In the passing, add RelationOpenSmgr() calls before all smgrwrite() and
smgrextend() calls. Tom added one before the first smgrextend() call in
commit c2bb287025, which seems to be enough, but let's play it safe and
do it before each one. That's how it's done in the similar code in
nbtsort.c, too.

Discussion: https://www.postgresql.org/message-id/20200920224446.GF30557@telsasoft.com
2020-09-21 14:50:07 +03:00
Tom Lane c2bb287025 Fix new GIST build code to work under CLOBBER_CACHE_ALWAYS.
Can't say if this fixes *all* cases, but at least we get through
the "point" regression test now, which hyrax's last run did not.

Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2020-09-19%2021%3A27%3A23
2020-09-20 17:08:49 -04:00
Peter Eisentraut 3d13867a2c Fix whitespace 2020-09-20 14:42:54 +02:00
Tom Lane 28a61fc6c5 Remove precedence hacks no longer needed without postfix operators.
It's no longer necessary to assign explicit precedences to GENERATED,
NULL_P, PRESERVE, or STRIP_P.

Actually, we don't need to assign precedence to IDENT either; that was
really just there to govern the behavior of target_el's "a_expr IDENT"
production, which no longer ends with that terminal.  However, it seems
like a good idea to continue to do so, because it provides a reference
point for a precedence level that we can assign to other unreserved
keywords that lack a natural precedence level.

Research by Peter Eisentraut and John Naylor; comment rewrite by me.

Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
2020-09-19 15:11:26 -04:00
Thomas Munro ff28809feb Code review for dynahash change.
Commit be0a6666 left behind a comment about the order of some tests that
didn't make sense without the expensive division, and in fact we might
as well change the order to one that fails more cheaply most of the time
as a micro-optimization.  Also, remove the "+ 1" applied to max_bucket,
to drop an instruction and match the original behavior.  Per review
from Tom Lane.

Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
2020-09-19 15:53:06 +12:00
Thomas Munro be0a666665 Remove large fill factor support from dynahash.c.
Since ancient times we have had support for a fill factor (maximum load
factor) to be set for a dynahash hash table, but:

1.  It was an integer, whereas for in-memory hash tables interesting
load factor targets are probably somewhere near the 0.75-1.0 range.

2.  It was implemented in a way that performed an expensive division
operation that regularly showed up in profiles.

3.  We are not aware of anyone ever having used a non-default value.

Therefore, remove support, effectively fixing it at 1.

Author: Jakub Wartak <Jakub.Wartak@tomtom.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
2020-09-19 11:40:39 +12:00
Tom Lane 06a7c3154f Allow most keywords to be used as column labels without requiring AS.
Up to now, if you tried to omit "AS" before a column label in a SELECT
list, it would only work if the column label was an IDENT, that is not
any known keyword.  This is rather unfriendly considering that we have
so many keywords and are constantly growing more.  In the wake of commit
1ed6b8956 it's possible to improve matters quite a bit.

We'd originally tried to make this work by having some of the existing
keyword categories be allowed without AS, but that didn't work too well,
because each category contains a few special cases that don't work
without AS.  Instead, invent an entirely orthogonal keyword property
"can be bare column label", and mark all keywords that way for which
we don't get shift/reduce errors by doing so.

It turns out that of our 450 current keywords, all but 39 can be made
bare column labels, improving the situation by over 90%.  This number
might move around a little depending on future grammar work, but it's
a pretty nice improvement.

Mark Dilger, based on work by myself and Robert Haas;
review by John Naylor

Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
2020-09-18 16:46:36 -04:00
Amit Kapila 24fb35e111 Update file header comments for logical/relation.c.
Author: Amit Langote
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CA+HiwqE20oZoix13JyCeALpTf_SmjarZWtBFe5sND6zz+iupAw@mail.gmail.com
2020-09-18 10:14:30 +05:30
Amit Kapila 0d32511eca Fix comments in heapam.c.
After commits 85f6b49c2c and 3ba59ccc89, we can allow parallel inserts
which was earlier not possible as parallel group members won't conflict
for relation extension and page lock.  In those commits, we forgot to
update comments at few places.

Author: Amit Kapila
Reviewed-by: Robert Haas and Dilip Kumar
Backpatch-through: 13
Discussion: https://postgr.es/m/CAFiTN-tMrQh5FFMPx5aWJ+1gi1H6JxktEhq5mDwCHgnEO5oBkA@mail.gmail.com
2020-09-18 09:50:44 +05:30
Tom Lane 1ed6b89563 Remove support for postfix (right-unary) operators.
This feature has been a thorn in our sides for a long time, causing
many grammatical ambiguity problems.  It doesn't seem worth the
pain to continue to support it, so remove it.

There are some follow-on improvements we can make in the grammar,
but this commit only removes the bare minimum number of productions,
plus assorted backend support code.

Note that pg_dump and psql continue to have full support, since
they may be used against older servers.  However, pg_dump warns
about postfix operators.  There is also a check in pg_upgrade.

Documentation-wise, I (tgl) largely removed the "left unary"
terminology in favor of saying "prefix operator", which is
a more standard and IMO less confusing term.

I included a catversion bump, although no initial catalog data
changes here, to mark the boundary at which oprkind = 'r'
stopped being valid in pg_operator.

Mark Dilger, based on work by myself and Robert Haas;
review by John Naylor

Discussion: https://postgr.es/m/38ca86db-42ab-9b48-2902-337a0d6b8311@2ndquadrant.com
2020-09-17 19:38:05 -04:00
Amit Kapila b7f2dd959a Update parallel BTree scan state when the scan keys can't be satisfied.
For parallel btree scan to work for array of scan keys, it should reach
BTPARALLEL_DONE state once for every distinct combination of array keys.
This is required to ensure that the parallel workers don't try to seize
blocks at the same time for different scan keys. We missed to update this
state when we discovered that the scan keys can't be satisfied.

Author: James Hunter
Reviewed-by: Amit Kapila
Tested-by: Justin Pryzby
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/4248CABC-25E3-4809-B4D0-128E1BAABC3C@amazon.com
2020-09-17 16:11:48 +05:30
Peter Eisentraut 45b9805706 Allow CURRENT_ROLE where CURRENT_USER is accepted
In the particular case of GRANTED BY, this is specified in the SQL
standard.  Since in PostgreSQL, CURRENT_ROLE is equivalent to
CURRENT_USER, and CURRENT_USER is already supported here, adding
CURRENT_ROLE is trivial.  The other cases are PostgreSQL extensions,
but for the same reason it also makes sense there.

Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: Asif Rehman <asifr.rehman@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/f2feac44-b4c5-f38f-3699-2851d6a76dc9%402ndquadrant.com
2020-09-17 11:40:08 +02:00
Heikki Linnakangas 16fa9b2b30 Add support for building GiST index by sorting.
This adds a new optional support function to the GiST access method:
sortsupport. If it is defined, the GiST index is built by sorting all data
to the order defined by the sortsupport's comparator function, and packing
the tuples in that order to GiST pages. This is similar to how B-tree
index build works, and is much faster than inserting the tuples one by
one. The resulting index is smaller too, because the pages are packed more
tightly, upto 'fillfactor'. The normal build method works by splitting
pages, which tends to lead to more wasted space.

The quality of the resulting index depends on how good the opclass-defined
sort order is. A good order preserves locality of the input data.

As the first user of this facility, add 'sortsupport' function to the
point_ops opclass. It sorts the points in Z-order (aka Morton Code), by
interleaving the bits of the X and Y coordinates.

Author: Andrey Borodin
Reviewed-by: Pavel Borisov, Thomas Munro
Discussion: https://www.postgresql.org/message-id/1A36620E-CAD8-4267-9067-FB31385E7C0D%40yandex-team.ru
2020-09-17 11:33:40 +03:00
Tom Lane babef40c9a Teach walsender to update its process title for replication commands.
Because the code path taken for SQL commands executed in a walsender
will update the process title, we pretty much have to update the
title for replication commands as well.  Otherwise, the title shows
"idle" for the rest of a logical walsender's lifetime once it's
executed any SQL command.

Playing with this, I confirm that a walsender now typically spends
most of its life reporting
	walsender postgres [local] START_REPLICATION
Considering this in isolation, it might be better to have it say
	walsender postgres [local] sending replication data
However, consistency with the other cases seems to be a stronger
argument.

In passing, remove duplicative pgstat_report_activity call.

Discussion: https://postgr.es/m/880181.1600026471@sss.pgh.pa.us
2020-09-16 21:06:50 -04:00
Alvaro Herrera 07082b08cc
Fix bogus completion tag usage in walsender
Since commit fd5942c18f (2012, 9.3-era), walsender has been sending
completion tags for certain replication commands twice -- and they're
not even consistent.  Apparently neither libpq nor JDBC have a problem
with it, but it's not kosher.  Fix by remove the EndCommand() call in
the common code path for them all, and inserting specific calls to
EndReplicationCommand() specifically in those places where it's needed.

EndReplicationCommand() is a new simple function to send the completion
tag for replication commands.  Do this instead of sending a generic
SELECT completion tag for them all, which was also pretty bogus (if
innocuous).  While at it, change StartReplication() to use
EndReplicationCommand() instead of pg_puttextmessage().

In commit 2f9661311b, I failed to realize that replication commands
are not close-enough kin of regular SQL commands, so the
DROP_REPLICATION_SLOT tag I added is undeserved and a type pun.  Take it
out.

Backpatch to 13, where the latter commit appeared.  The duplicate tag
has been sent since 9.3, but since nothing is broken, it doesn't seem
worth fixing.

Per complaints from Tom Lane.

Discussion: https://postgr.es/m/1347966.1600195735@sss.pgh.pa.us
2020-09-16 21:16:25 -03:00
Tom Lane 44fc6e259b Centralize setup of SIGQUIT handling for postmaster child processes.
We decided that the policy established in commit 7634bd4f6 for
the bgwriter, checkpointer, walwriter, and walreceiver processes,
namely that they should accept SIGQUIT at all times, really ought
to apply uniformly to all postmaster children.  Therefore, get
rid of the duplicative and inconsistent per-process code for
establishing that signal handler and removing SIGQUIT from BlockSig.
Instead, make InitPostmasterChild do it.

The handler set up by InitPostmasterChild is SignalHandlerForCrashExit,
which just summarily does _exit(2).  In interactive backends, we
almost immediately replace that with quickdie, since we would prefer
to try to tell the client that we're dying.  However, this patch is
changing the behavior of autovacuum (both launcher and workers), as
well as walsenders.  Those processes formerly also used quickdie,
but AFAICS that was just mindless copy-and-paste: they don't have
any interactive client that's likely to benefit from being told this.

The stats collector continues to be an outlier, in that it thinks
SIGQUIT means normal exit.  That should probably be changed for
consistency, but there's another patch set where that's being
dealt with, so I didn't do so here.

Discussion: https://postgr.es/m/644875.1599933441@sss.pgh.pa.us
2020-09-16 16:04:36 -04:00
Tom Lane 2000b6c10a Don't fetch partition check expression during InitResultRelInfo.
Since there is only one place that actually needs the partition check
expression, namely ExecPartitionCheck, it's better to fetch it from
the relcache there.  In this way we will never fetch it at all if
the query never has use for it, and we still fetch it just once when
we do need it.

The reason for taking an interest in this is that if the relcache
doesn't already have the check expression cached, fetching it
requires obtaining AccessShareLock on the partition root.  That
means that operations that look like they should only touch the
partition itself will also take a lock on the root.  In particular
we observed that TRUNCATE on a partition may take a lock on the
partition's root, contributing to a deadlock situation in parallel
pg_restore.

As written, this patch does have a small cost, which is that we
are microscopically reducing efficiency for the case where a partition
has an empty check expression.  ExecPartitionCheck will be called,
and will go through the motions of setting up and checking an empty
qual, where before it would not have been called at all.  We could
avoid that by adding a separate boolean flag to track whether there
is a partition expression to test.  However, this case only arises
for a default partition with no siblings, which surely is not an
interesting case in practice.  Hence adding complexity for it
does not seem like a good trade-off.

Amit Langote, per a suggestion by me

Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com
2020-09-16 14:28:18 -04:00
Tom Lane e5fac1cb19 Avoid unnecessary recursion to child tables in ALTER TABLE SET NOT NULL.
If a partitioned table's column is already marked NOT NULL, there is
no need to examine its partitions, because we can rely on previous
DDL to have enforced that the child columns are NOT NULL as well.
(Unfortunately, the same cannot be said for traditional inheritance,
so for now we have to restrict the optimization to partitioned tables.)
Hence, we may skip recursing to child tables in this situation.

The reason this case is worth worrying about is that when pg_dump dumps
a partitioned table having a primary key, it will include the requisite
NOT NULL markings in the CREATE TABLE commands, and then add the
primary key as a separate step.  The primary key addition generates a
SET NOT NULL as a subcommand, just to be sure.  So the situation where
a SET NOT NULL is redundant does arise in the real world.

Skipping the recursion does more than just save a few cycles: it means
that a command such as "ALTER TABLE ONLY partition_parent ADD PRIMARY
KEY" will take locks only on the partition parent table, not on the
partitions.  It turns out that parallel pg_restore is effectively
assuming that that's true, and has little choice but to do so because
the dependencies listed for such a TOC entry don't include the
partitions.  pg_restore could thus issue this ALTER while data restores
on the partitions are still in progress.  Taking unnecessary locks on
the partitions not only hurts concurrency, but can lead to actual
deadlock failures, as reported by Domagoj Smoljanovic.

(A contributing factor in the deadlock is that TRUNCATE on a child
partition wants a non-exclusive lock on the parent.  This seems
likewise unnecessary, but the fix for it is more invasive so we
won't consider back-patching it.  Fortunately, getting rid of one
of these two poor behaviors is enough to remove the deadlock.)

Although support for partitioned primary keys came in with v11,
this patch is dependent on the SET NOT NULL refactoring done by
commit f4a3fdfbd, so we can only patch back to v12.

Patch by me; thanks to Alvaro Herrera and Amit Langote for review.

Discussion: https://postgr.es/m/VI1PR03MB31670CA1BD9625C3A8C5DD05EB230@VI1PR03MB3167.eurprd03.prod.outlook.com
2020-09-16 13:38:26 -04:00
Tom Lane 3d65b0593c Fix bogus cache-invalidation logic in logical replication worker.
The code recorded cache invalidation events by zeroing the "localreloid"
field of affected cache entries.  However, it's possible for an inval
event to occur even while we have the entry open and locked.  So an
ill-timed inval could result in "cache lookup failed for relation 0"
errors, if the worker's code tried to use the cleared field.  We can
fix that by creating a separate bool field to record whether the entry
needs to be revalidated.  (In the back branches, cram the bool into
what had been padding space, to avoid an ABI break in the somewhat
unlikely event that any extension is looking at this struct.)

Also, rearrange the logic in logicalrep_rel_open so that it
does the right thing in cases where table_open would fail.
We should retry the lookup by name in that case, but we didn't.

The real-world impact of this is probably small.  In the first place,
the error conditions are very low probability, and in the second place,
the worker would just exit and get restarted.  We only noticed because
in a CLOBBER_CACHE_ALWAYS build, the failure can occur repeatedly,
preventing the worker from making progress.  Nonetheless, it's clearly
a bug, and it impedes a useful type of testing; so back-patch to v10
where this code was introduced.

Discussion: https://postgr.es/m/1032727.1600096803@sss.pgh.pa.us
2020-09-16 12:07:31 -04:00
Jeff Davis c8aeaf3ab3 Change LogicalTapeSetBlocks() to use nBlocksWritten.
Previously, it was based on nBlocksAllocated to account for tapes with
open write buffers that may not have made it to the BufFile yet.

That was unnecessary, because callers do not need to get the number of
blocks while a tape has an open write buffer; and it also conflicted
with the preallocation logic added for HashAgg.

Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/ce5af05900fdbd0e9185747825a7423c48501964.camel@j-davis.com
Backpatch-through: 13
2020-09-15 21:42:25 -07:00
Jeff Davis 3bd35d4f51 HashAgg: release write buffers sooner by rewinding tape.
This was an oversight. The purpose of 7fdd919ae7 was to avoid keeping
tape buffers around unnecessisarily, but HashAgg didn't rewind early
enough.

Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/1fb1151c2cddf8747d14e0532da283c3f97e2685.camel@j-davis.com
Backpatch-through: 13
2020-09-15 21:18:22 -07:00
Amit Kapila 69bd60672a Fix initialization of RelationSyncEntry for streaming transactions.
In commit 464824323e, for each RelationSyncEntry we maintained the list
of xids (streamed_txns) for which we have already sent the schema. This
helps us to track when to send the schema to the downstream node for
replication of streaming transactions. Before this list got initialized,
we were processing invalidation messages which access this list and led
to an assertion failure.

In passing, clean up the nearby code:

* Initialize the list of xids with NIL instead of NULL which is our usual
coding practice.
* Remove the MemoryContext switch for creating a RelationSyncEntry in dynahash.

Diagnosed-by: Amit Kapila and Tom Lane
Author: Amit Kapila
Reviewed-by: Tom Lane and Dilip Kumar
Discussion: https://postgr.es/m/904373.1600033123@sss.pgh.pa.us
2020-09-16 07:45:44 +05:30
David Rowley 19c60ad69a Optimize compactify_tuples function
This function could often be seen in profiles of vacuum and could often
be a significant bottleneck during recovery. The problem was that a qsort
was performed in order to sort an array of item pointers in reverse offset
order so that we could use that to safely move tuples up to the end of the
page without overwriting the memory of yet-to-be-moved tuples. i.e. we
used to compact the page starting at the back of the page and move towards
the front. The qsort that this required could be expensive for pages with
a large number of tuples.

In this commit, we take another approach to tuple compactification.

Now, instead of sorting the remaining item pointers array we first check
if the array is presorted and only memmove() the tuples that need to be
moved. This presorted check can be done very cheaply in the calling
functions when the array is being populated. This presorted case is very
fast.

When the item pointer array is not presorted we must copy tuples that need
to be moved into a temp buffer before copying them back into the page
again. This differs from what we used to do here as we're now copying the
tuples back into the page in reverse line pointer order. Previously we
left the existing order alone.  Reordering the tuples results in an
increased likelihood of hitting the pre-sorted case the next time around.
Any newly added tuple which consumes a new line pointer will also maintain
the correct sort order of tuples in the page which will also result in the
presorted case being hit the next time.  Only consuming an unused line
pointer can cause the order of tuples to go out again, but that will be
corrected next time the function is called for the page.

Benchmarks have shown that the non-presorted case is at least equally as
fast as the original qsort method even when the page just has a few
tuples. As the number of tuples becomes larger the new method maintains
its performance whereas the original qsort method became much slower when
the number of tuples on the page became large.

Author: David Rowley
Reviewed-by: Thomas Munro
Tested-by: Jakub Wartak
Discussion: https://postgr.es/m/CA+hUKGKMQFVpjr106gRhwk6R-nXv0qOcTreZuQzxgpHESAL6dw@mail.gmail.com
2020-09-16 13:22:20 +12:00
Alvaro Herrera ced138e8cb
Fix use-after-free bug with event triggers in an extension script
ALTER TABLE commands in an extension script are added to an event
trigger command list; but starting with commit b5810de3f4 they do so in
a memory context that's too short-lived, so when execution ends and time
comes to use the entries, they've already been freed.

(This would also be a problem with ALTER TABLE commands in a
multi-command query string, but these serendipitously end in
PortalContext -- which probably explains why it took so long for this to
be reported.)

Fix by using the memory context specifically set for that, instead.

Backpatch to 13, where the aforementioned commit appeared.

Reported-by: Philippe Beaudoin
Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://postgr.es/m/20200902193715.6e0269d4@firost
2020-09-15 21:03:14 -03:00
David Rowley 10a5b35a00 Report resource usage at the end of recovery
Reporting this has been rather useful in some recent recovery speedup
work.  It also seems like something that will be useful to the average DBA
too.

Author: David Rowley
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/CAApHDvqYVORiZxq2xPvP6_ndmmsTkvr6jSYv4UTNaFa5i1kd%3DQ%40mail.gmail.com
2020-09-16 11:25:46 +12:00
David Rowley 62e221e1c0 Allow incremental sorts for windowing functions
This expands on the work done in d2d8a229b and allows incremental sort
to be considered during create_window_paths().

Author: David Rowley
Reviewed-by: Daniel Gustafsson, Tomas Vondra
Discussion: https://postgr.es/m/CAApHDvoOHobiA2x13NtWnWLcTXYj9ddpCkv9PnAJQBMegYf_xw%40mail.gmail.com
2020-09-15 23:44:45 +12:00
David Rowley fe4f36bcde Fix compiler warning
Introduced in 0aa8f7640.

MSVC warned about performing 32-bit bit shifting when it appeared like we
might like a 64-bit result.  We did, but it just so happened that none of
the calls to this function could have caused the 32-bit shift to overflow.

Here we just cast the constant to int64 to make the compiler happy.

Discussion: https://postgr.es/m/CAApHDvofA_vsrpC13mq_hZyuye5B-ssKEaer04OouXYCO5-uXQ@mail.gmail.com
2020-09-15 15:07:57 +12:00
Tom Lane f560209c6e Make walsenders show their replication commands in pg_stat_activity.
A walsender process that has executed a SQL command left the text of
that command in pg_stat_activity.query indefinitely, which is quite
confusing if it's in RUNNING state but not doing that query.  An easy
and useful fix is to treat replication commands as if they were SQL
queries, and show them in pg_stat_activity according to the same rules
as for regular queries.  While we're at it, it seems also sensible to
set debug_query_string, allowing error logging and debugging to see
the replication command.

While here, clean up assorted silliness in exec_replication_command:

* The SQLCmd path failed to restore CurrentMemoryContext to the caller's
value, and failed to delete the temp context created in this routine.
It's only through great good fortune that these oversights did not
result in long-term memory leaks or other problems.  It seems cleaner
to code SQLCmd as a separate early-exit path, so do it like that.

* Remove useless duplicate call of SnapBuildClearExportedSnapshot().

* replication_scanner_finish() was never called.

None of those things are significant enough to merit a backpatch,
so this is for HEAD only.

Discussion: https://postgr.es/m/880181.1600026471@sss.pgh.pa.us
2020-09-14 12:35:00 -04:00
Fujii Masao 95233011a0 Fix typos.
Author: Naoki Nakamichi
Discussion: https://postgr.es/m/b6919d145af00295a8e86ce4d034b7cd@oss.nttdata.com
2020-09-14 14:16:07 +09:00
Michael Paquier 83158f74d3 Make index_set_state_flags() transactional
3c84046 is the original commit that introduced index_set_state_flags(),
where the presence of SnapshotNow made necessary the use of an in-place
update.  SnapshotNow has been removed in 813fb03, so there is no actual
reasons to not make this operation transactional.

Note that while making the operation more robust, using a transactional
operation in this routine was not strictly necessary as there was no use
case for it yet.  However, some future features are going to need a
transactional behavior, like support for CREATE/DROP INDEX CONCURRENTLY
with partitioned tables, where indexes in a partition tree need to have
all their pg_index.indis* flags updated in the same transaction to make
the operation stable to the end-user by keeping partition trees
consistent, even with a failure mid-flight.

REINDEX CONCURRENTLY uses already transactional updates when swapping
the old and new indexes, making this change more consistent with the
index-swapping logic.

Author: Michael Paquier
Reviewed-by: Anastasia Lubennikova
Discussion: https://postgr.es/m/20200903080440.GA8559@paquier.xyz
2020-09-14 13:56:41 +09:00
Peter Eisentraut 3e0242b24c Message fixes and style improvements 2020-09-14 06:42:30 +02:00
Tom Lane 19f5a37b9f Use the properly transformed RangeVar for expandTableLikeClause().
transformCreateStmt() adjusts the transformed statement's RangeVar
to specify the target schema explicitly, for the express reason
of making sure that auxiliary statements derived by parse
transformation operate on the right table.  But the refactoring
I did in commit 502898192 got this wrong and passed the untransformed
RangeVar to expandTableLikeClause().  This could lead to assertion
failures or weird misbehavior if the wrong table was accessed.

Per report from Alexander Lakhin.  Like the previous patch, back-patch
to all supported branches.

Discussion: https://postgr.es/m/05051f9d-b32b-cb35-6735-0e9f2ab86b5f@gmail.com
2020-09-13 12:51:21 -04:00
Amit Kapila 03c7f1f37a Fix inconsistency in determining the timestamp of the db statfile.
We use the timestamp of the global statfile if we are not able to
determine it for a particular database in case the entry for that database
doesn't exist. However, we were using it even when the statfile is
corrupt.

As there is no user reported issue and it is not clear if there is any
impact of this on actual application so decided not to backpatch.

Reported-by: Amit Kapila
Author: Amit Kapila
Reviewed-by: Sawada Masahiko, Magnus Hagander and Alvaro Herrera
Discussion: https://postgr.es/m/CAA4eK1J3oTJKyVq6v7K4d3jD+vtnruG9fHRib6UuWWsrwAR6Aw@mail.gmail.com
2020-09-12 08:02:54 +05:30
Amit Kapila ddd5f6d260 Remove unused function declaration in logicalproto.h.
In the passing, fix a typo in pgoutput.c.

Reported-by: Tomas Vondra
Author: Tomas Vondra
Reviewed-by: Dilip Kumar
Discussion: https://postgr.es/m/20200909084353.pncuclpbwlr7vylh@development
2020-09-12 07:47:53 +05:30
Jeff Davis 0758964963 logtape.c: do not preallocate for tapes when sorting
The preallocation logic is only useful for HashAgg, so disable it when
sorting.

Also, adjust an out-of-date comment.

Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-Wzn_o7tE2+hRVvwSFghRb75AJ5g-nqGzDUqLYMexjOAe=g@mail.gmail.com
Backpatch-through: 13
2020-09-11 17:10:02 -07:00
Tom Lane 7634bd4f6d Accept SIGQUIT during error recovery in auxiliary processes.
The bgwriter, checkpointer, walwriter, and walreceiver processes
claimed to allow SIGQUIT "at all times".  In reality SIGQUIT
would get re-blocked during error recovery, because we didn't
update the actual signal mask immediately, so sigsetjmp() would
save and reinstate a mask that includes SIGQUIT.

This appears to be simply a coding oversight.  There's never a
good reason to hold off SIGQUIT in these processes, because it's
going to just call _exit(2) which should be safe enough, especially
since the postmaster is going to tear down shared memory afterwards.
Hence, stick in PG_SETMASK() calls to install the modified BlockSig
mask immediately.

Also try to improve the comments around sigsetjmp blocks.  Most of
them were just referencing postgres.c, which is misleading because
actually postgres.c manages the signals differently.

No back-patch, since there's no evidence that this is causing any
problems in the field.

Discussion: https://postgr.es/m/CALDaNm1d1hHPZUg3xU4XjtWBOLCrA+-2cJcLpw-cePZ=GgDVfA@mail.gmail.com
2020-09-11 16:01:36 -04:00
Tom Lane 10095ca634 Log a message when resorting to SIGKILL during shutdown/crash recovery.
Currently, no useful trace is left in the logs when the postmaster
is forced to use SIGKILL to shut down children that failed to respond
to SIGQUIT.  Some questions were raised about how often that scenario
happens in the buildfarm, so let's add a LOG-level message showing
that it happened.

Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us
2020-09-11 12:24:46 -04:00
Tom Lane 6693a96b32 Don't run atexit callbacks during signal exits from ProcessStartupPacket.
Although 58c6feccf fixed the case for SIGQUIT, we were still calling
proc_exit() from signal handlers for SIGTERM and timeout failures in
ProcessStartupPacket.  Fortunately, at the point where that code runs,
we haven't yet connected to shared memory in any meaningful way, so
there is nothing we need to undo in shared memory.  This means it
should be safe to use _exit(1) here, ie, not run any atexit handlers
but also inform the postmaster that it's not a crash exit.

To make sure nobody breaks the "nothing to undo" expectation, add
a cross-check that no on-shmem-exit or before-shmem-exit handlers
have been registered yet when we finish using these signal handlers.

This change is simple enough that maybe it could be back-patched,
but I won't risk that right now.

Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us
2020-09-11 12:20:16 -04:00
Alvaro Herrera 6a68a233ce
Update copyright year
Thinko in 40b3e2c201.

Reported-by: "Wang, Shenhao" <wangsh.fnst@cn.fujitsu.com>
Discussion: https://postgr.es/m/ed98706b82694b57a8c0d339a10732aa@G08CNEXMBPEKD06.g08.fujitsu.local
2020-09-11 12:55:13 -03:00
Alvaro Herrera 9f1cf97bb5
Print WAL logical message contents in pg_waldump
This helps debuggability when looking at WAL streams containing logical
messages.

Author: Ashutosh Bapat <ashutosh.bapat@2ndquadrant.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAExHW5sWx49rKmXbg5H1Xc1t+nRv9PaYKQmgw82HPt6vWDVmDg@mail.gmail.com
2020-09-10 19:37:02 -03:00
Tom Lane 58c6feccfa Use _exit(2) for SIGQUIT during ProcessStartupPacket, too.
Bring the signal handling for startup-packet collection into line
with the policy established in commits bedadc732 and 8e19a8264,
namely don't risk running atexit callbacks when handling SIGQUIT.

Ideally, we'd not do so for SIGTERM or timeout interrupts either,
but that change seems a bit too risky for the back branches.
For now, just improve the comments in this area to describe the risk.

Also relocate where BackendInitialize re-disables these interrupts,
to minimize the code span where they're active.  This doesn't buy
a whole lot of safety, but it can't hurt.

In passing, rename startup_die() to remove confusion about whether
it is for the startup process.

Like the previous commits, back-patch to all supported branches.

Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us
2020-09-10 12:06:37 -04:00
Etsuro Fujita 3857f98f14 Clean up some code and comments in partbounds.c.
Do some minor cleanup for commit c8434d64c: 1) remove a useless
assignment (in normal builds) and 2) improve comments a little.

Back-patch to v13 where the aforementioned commit went in.

Author: Etsuro Fujita
Reviewed-by: Alvaro Herrera
Discussion: https://postgr.es/m/CAPmGK16yCd2R4=bQ4g8N2dT9TtA5ZU+qNmJ3LPc_nypbNy4_2A@mail.gmail.com
2020-09-10 18:00:00 +09:00
Michael Paquier aad546bd0a doc: Fix some grammar and inconsistencies
Some comments are fixed while on it.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20200818171702.GK17022@telsasoft.com
Backpatch-through: 9.6
2020-09-10 15:50:19 +09:00
Noah Misch fe4d022c8e Fix rd_firstRelfilenodeSubid for nailed relations, in parallel workers.
Move applicable code out of RelationBuildDesc(), which nailed relations
bypass.  Non-assert builds experienced no known problems.  Back-patch to
v13, where commit c6b92041d3 introduced
rd_firstRelfilenodeSubid.

Kyotaro Horiguchi.  Reported by Justin Pryzby.

Discussion: https://postgr.es/m/20200907023737.GA7158@telsasoft.com
2020-09-09 18:50:24 -07:00
Tom Lane bedadc7322 Make archiver's SIGQUIT handler exit via _exit().
Commit 8e19a8264 changed the SIGQUIT handlers of almost all server
processes not to run atexit callbacks.  The archiver process was
skipped, perhaps because it's not connected to shared memory; but
it's just as true here that running atexit callbacks in a signal
handler is unsafe.  So let's make it work like the rest.

In HEAD and v13, we can use the common SignalHandlerForCrashExit
handler.  Before that, just tweak pgarch_exit to use _exit(2)
explicitly.

Like the previous commit, back-patch to all supported branches.

Kyotaro Horiguchi, back-patching by me

Discussion: https://postgr.es/m/1850884.1599601164@sss.pgh.pa.us
2020-09-09 15:32:45 -04:00
Peter Eisentraut 0aa8f76408 Expose internal function for converting int64 to numeric
Existing callers had to take complicated detours via
DirectFunctionCall1().  This simplifies a lot of code.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu
2020-09-09 20:16:28 +02:00
Tom Lane f3e1e66196 Minor fixes in docs and error messages.
Alexander Lakhin

Discussion: https://postgr.es/m/ce7debdd-c943-d7a7-9b41-687107b27831@gmail.com
2020-09-09 11:53:39 -04:00
Alvaro Herrera f481d28232
Check default partitions constraints while descending
Partitioning tuple route code assumes that the partition chosen while
descending the partition hierarchy is always the correct one.  This is
true except when the partition is the default partition and another
partition has been added concurrently: the partition constraint changes
and we don't recheck it.  This can lead to tuples mistakenly being added
to the default partition that should have been rejected.

Fix by rechecking the default partition constraint while descending the
hierarchy.

An isolation test based on the reproduction steps described by Hao Wu
(with tweaks for extra coverage) is included.

Backpatch to 12, where this bug came in with 898e5e3290.

Reported by: Hao Wu <hawu@vmware.com>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CA+HiwqFqBmcSSap4sFnCBUEL_VfOMmEKaQ3gwUhyfa4c7J_-nA@mail.gmail.com
Discussion: https://postgr.es/m/DM5PR0501MB3910E97A9EDFB4C775CF3D75A42F0@DM5PR0501MB3910.namprd05.prod.outlook.com
2020-09-08 19:35:15 -03:00
Tom Lane c9ae5cbb88 Install an error check into cancel_before_shmem_exit().
Historically, cancel_before_shmem_exit() just silently did nothing
if the specified callback wasn't the top-of-stack.  The folly of
ignoring this case was exposed by the bugs fixed in 303640199 and
bab150045, so let's make it throw elog(ERROR) instead.

There is a decent argument to be made that PG_ENSURE_ERROR_CLEANUP
should use some separate infrastructure, so it wouldn't break if
something inside the guarded code decides to register a new
before_shmem_exit callback.  However, a survey of the surviving
uses of before_shmem_exit() and PG_ENSURE_ERROR_CLEANUP doesn't
show any plausible conflicts of that sort today, so for now we'll
forgo the extra complexity.  (It will almost certainly become
necessary if anyone ever wants to wrap PG_ENSURE_ERROR_CLEANUP
around arbitrary user-defined actions, though.)

No backpatch, since this is developer support not a production issue.

Bharath Rupireddy, per advice from Andres Freund, Robert Haas, and myself

Discussion: https://postgr.es/m/CALj2ACWk7j4F2v2fxxYfrroOF=AdFNPr1WsV+AGtHAFQOqm_pw@mail.gmail.com
2020-09-08 15:54:25 -04:00
Andres Freund 5871f09c98 Fix autovacuum cancellation.
The problem is caused by me (Andres) having ProcSleep() look at the
wrong PGPROC entry in 5788e258bb.

Unfortunately it seems hard to write a reliable test for autovacuum
cancellations. Perhaps somebody will come up with a good approach, but
it seems worth fixing the issue even without a test.

Reported-By: Jeff Janes <jeff.janes@gmail.com>
Author: Jeff Janes <jeff.janes@gmail.com>
Discussion: https://postgr.es/m/CAMkU=1wH2aUy+wDRDz+5RZALdcUnEofV1t9PzXS_gBJO9vZZ0Q@mail.gmail.com
2020-09-08 11:25:34 -07:00
Tom Lane 3438c988fd Use plain memset() in numeric.c, not MemSet and friends.
This essentially reverts a micro-optimization I made years ago,
as part of the much larger commit d72f6c750.  It's doubtful
that there was any hard evidence for it being helpful even then,
and the case is even more dubious now that modern compilers
are so much smarter about inlining memset().

The proximate reason for undoing it is to get rid of the type punning
inherent in MemSet, for fear that that may cause problems now that
we're applying additional optimization switches to numeric.c.
At the very least this'll silence some warnings from a few old
buildfarm animals.

(It's probably past time for another look at whether MemSet is still
worth anything at all, but I do not propose to tackle that question
right now.)

Discussion: https://postgr.es/m/CAJ3gD9evtA_vBo+WMYMyT-u=keHX7-r8p2w7OSRfXf42LTwCZQ@mail.gmail.com
2020-09-08 11:47:37 -04:00
Peter Eisentraut 728d4bc16b Use <unnamed> for name of unnamed portal's memory context
Otherwise just printing an empty string makes the memory context debug
output slightly confusing.

Discussion: https://www.postgresql.org/message-id/flat/ccb353ef-89ff-09b3-8046-1d2514624b9c%402ndquadrant.com
2020-09-08 17:15:00 +02:00
Michael Paquier a6642b3ae0 Add support for partitioned tables and indexes in REINDEX
Until now, REINDEX was not able to work with partitioned tables and
indexes, forcing users to reindex partitions one by one.  This extends
REINDEX INDEX and REINDEX TABLE so as they can accept a partitioned
index and table in input, respectively, to reindex all the partitions
assigned to them with physical storage (foreign tables, partitioned
tables and indexes are then discarded).

This shares some logic with schema and database REINDEX as each
partition gets processed in its own transaction after building a list of
relations to work on.  This choice has the advantage to minimize the
number of invalid indexes to one partition with REINDEX CONCURRENTLY in
the event a cancellation or failure in-flight, as the only indexes
handled at once in a single REINDEX CONCURRENTLY loop are the ones from
the partition being working on.

Isolation tests are added to emulate some cases I bumped into while
developing this feature, particularly with the concurrent drop of a
leaf partition reindexed.  However, this is rather limited as LOCK would
cause REINDEX to block in the first transaction building the list of
partitions.

Per its multi-transaction nature, this new flavor cannot run in a
transaction block, similarly to REINDEX SCHEMA, SYSTEM and DATABASE.

Author: Justin Pryzby, Michael Paquier
Reviewed-by: Anastasia Lubennikova
Discussion: https://postgr.es/m/db12e897-73ff-467e-94cb-4af03705435f.adger.lj@alibaba-inc.com
2020-09-08 10:09:22 +09:00
Jeff Davis a547e68675 Adjust cost model for HashAgg that spills to disk.
Tomas Vondra observed that the IO behavior for HashAgg tends to be
worse than for Sort. Penalize HashAgg IO costs accordingly.

Also, account for the CPU effort of spilling the tuples and reading
them back.

Discussion: https://postgr.es/m/20200906212112.nzoy5ytrzjjodpfh@development
Reviewed-by: Tomas Vondra
Backpatch-through: 13
2020-09-07 13:31:59 -07:00
Tom Lane 53367e6c62 Clarify comments in enforce_generic_type_consistency().
Some of the pre-existing comments were vague about whether they
referred to all polymorphic types or only the old-style ones.

Also be more consistent about using the "family 1" vs "family 2"
terminology.

Himanshu Upadhyaya and Tom Lane

Discussion: https://postgr.es/m/CAPF61jBUg9XoMPNuLpoZ+h6UZ2VxKdNt3rQL1xw1GOBwjWzAXQ@mail.gmail.com
2020-09-07 14:52:33 -04:00
Tom Lane 9c79e646c6 Frob numeric.c loop so that clang will auto-vectorize it too.
Experimentation shows that clang will auto-vectorize the critical
multiplication loop if the termination condition is written "i2 < limit"
rather than "i2 <= limit".  This seems unbelievably stupid, but I've
reproduced it on both clang 9.0.1 (RHEL8) and 11.0.3 (macOS Catalina).
gcc doesn't care, so tweak the code to do it that way.

Discussion: https://postgr.es/m/CAJ3gD9evtA_vBo+WMYMyT-u=keHX7-r8p2w7OSRfXf42LTwCZQ@mail.gmail.com
2020-09-07 12:03:04 -04:00
Thomas Munro 861c6e7c8e Skip unnecessary stat() calls in walkdir().
Some kernels can tell us the type of a "dirent", so we can avoid a call
to stat() or lstat() in many cases.  Define a new function
get_dirent_type() to contain that logic, for use by the backend and
frontend versions of walkdir(), and perhaps other callers in future.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Juan José Santamaría Flecha <juanjo.santamaria@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2BFzxupGGN4GpUdbzZN%2Btn6FQPHo8w0Q%2BAPH5Wz8RG%2Bww%40mail.gmail.com
2020-09-07 18:28:06 +12:00
Tom Lane 8870917623 Apply auto-vectorization to the inner loop of numeric multiplication.
Compile numeric.c with -ftree-vectorize where available, and adjust
the innermost loop of mul_var() so that it is amenable to being
auto-vectorized.  (Mainly, that involves making it process the arrays
left-to-right not right-to-left.)

Applying -ftree-vectorize actually makes numeric.o smaller, at least
with my compiler (gcc 8.3.1 on x86_64), and it's a little faster too.
Independently of that, fixing the inner loop to be vectorizable also
makes things a bit faster.  But doing both is a huge win for
multiplications with lots of digits.  For me, the numeric regression
test is the same speed to within measurement noise, but numeric_big
is a full 45% faster.

We also looked into applying -funroll-loops, but that makes numeric.o
bloat quite a bit, and the additional speed improvement is very
marginal.

Amit Khandekar, reviewed and edited a little by me

Discussion: https://postgr.es/m/CAJ3gD9evtA_vBo+WMYMyT-u=keHX7-r8p2w7OSRfXf42LTwCZQ@mail.gmail.com
2020-09-06 21:40:39 -04:00
Tom Lane 695de5d1ed Split Makefile symbol CFLAGS_VECTOR into two symbols.
Replace CFLAGS_VECTOR with CFLAGS_UNROLL_LOOPS and CFLAGS_VECTORIZE,
allowing us to distinguish whether we want to apply -funroll-loops,
-ftree-vectorize, or both to a particular source file.  Up to now
the only consumer of the symbol has been checksum.c which wants
both, so that there was no need to distinguish; but that's about
to change.

Amit Khandekar, reviewed and edited a little by me

Discussion: https://postgr.es/m/CAJ3gD9evtA_vBo+WMYMyT-u=keHX7-r8p2w7OSRfXf42LTwCZQ@mail.gmail.com
2020-09-06 21:28:16 -04:00
Tom Lane 8e3c58e6e4 Refactor pg_get_line() to expose an alternative StringInfo-based API.
Letting the caller provide a StringInfo to read into is helpful when
the caller needs to merge lines or otherwise modify the data after
it's been read.  Notably, now the code added by commit 8f8154a50
can use pg_get_line_append() instead of having its own copy of that
logic.  A follow-on commit will also make use of this.

Also, since StringInfo buffers are a minimum of 1KB long, blindly
using pg_get_line() in a loop can eat a lot more memory than one would
expect.  I discovered for instance that commit e0f05cd5b caused initdb
to consume circa 10MB to read postgres.bki, even though that's under
1MB worth of data.  A less memory-hungry alternative is to re-use the
same StringInfo for all lines and pg_strdup the results.

Discussion: https://postgr.es/m/1315832.1599345736@sss.pgh.pa.us
2020-09-06 14:13:19 -04:00
Magnus Hagander 2a093355aa Fix typo in comment
Author: Hou, Zhijie
2020-09-06 19:26:55 +02:00
Tom Lane 19ad7e1d7b Fix misleading error message about inconsistent moving-aggregate types.
We reported the wrong types when complaining that an aggregate's
moving-aggregate implementation is inconsistent with its regular
implementation.

This was wrong since the feature was introduced, so back-patch
to all supported branches.

Jeff Janes

Discussion: https://postgr.es/m/CAMkU=1x808LH=LPhZp9mNSP0Xd1xDqEd+XeGcvEe48dfE6xV=A@mail.gmail.com
2020-09-06 12:55:13 -04:00
Tom Lane e0f05cd5ba Improve some ancient, crufty code in bootstrap + initdb.
At some point back in the last century, somebody felt that reading
all of pg_type twice was cheaper, or at least easier, than using
repalloc() to resize the Typ[] array dynamically.  That seems like an
entirely wacko proposition, so rewrite the code to do it the other
way.  (To add insult to injury, there were two not-quite-identical
copies of said code.)

initdb.c's readfile() function had the same disease of preferring
to do double the I/O to avoid resizing its output array.  Here,
we can make things easier by using the just-invented pg_get_line()
function to handle reading individual lines without a predetermined
notion of how long they are.

On my machine, it's difficult to detect any net change in the
overall runtime of initdb from these changes; but they should
help on slower buildfarm machines (especially since a buildfarm
cycle involves a lot of initdb's these days).

My attention was drawn to these places by scan-build complaints,
but on inspection they needed a lot more work than just suppressing
dead stores :-(
2020-09-05 16:20:04 -04:00
Tom Lane a5cc4dab6d Yet more elimination of dead stores and useless initializations.
I'm not sure what tool Ranier was using, but the ones I contributed
were found by using a newer version of scan-build than I tried before.

Ranier Vilela and Tom Lane

Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com
2020-09-05 13:17:32 -04:00
Michael Paquier 8febfd1855 Switch to multi-inserts when registering dependencies for many code paths
This commit improves the dependency registrations by taking advantage of
the preliminary work done in 63110c62, to group together the insertion
of dependencies of the same type to pg_depend.  With the current layer
of routines available, and as only dependencies of the same type can be
grouped, there are code paths still doing more than one multi-insert
when it is necessary to register dependencies of multiple types
(constraint and index creation are two cases doing that).

While on it, this refactors some of the code to use ObjectAddressSet()
when manipulating object addresses.

Author: Daniel Gustafsson, Michael Paquier
Reviewed-by: Andres Freund, Álvaro Herrera
Discussion: https://postgr.es/m/20200807061619.GA23955@paquier.xyz
2020-09-05 21:33:53 +09:00
Peter Eisentraut 556cbdfce4 Fix typo in comment 2020-09-05 11:32:20 +02:00
Michael Paquier 63110c6264 Use multi-inserts for pg_depend
This is a follow-up of the work done in e3931d01.  This case is a bit
different than pg_attribute and pg_shdepend: the maximum number of items
to insert is known in advance, but there is no need to handle pinned
dependencies.  Hence, the base allocation for slots is done based on the
number of items and the maximum allowed with a cap at 64kB.  Slots are
initialized once used to minimize the overhead of the operation.

The insertions can be done for dependencies of the same type.  More
could be done by grouping the insertion of multiple dependency types in
a single batch.  This is left as future work.

Some of the multi-insert logic is also simplified for pg_shdepend, as
per the feedback discussed for this specific patch.  This also moves to
indexing.h the variable capping the maximum amount of data that can be
used at once for a multi-insert, instead of having separate definitions
for pg_attribute, pg_depend and pg_shdepend.

Author: Daniel Gustafsson, Michael Paquier
Reviewed-by: Andres Freund, Álvaro Herrera
Discussion: https://postgr.es/m/20200807061619.GA23955@paquier.xyz
2020-09-05 13:52:47 +09:00
Tom Lane c8746f999e Fix over-eager ping'ing in logical replication receiver.
Commit 3f60f690f only partially fixed the broken-status-tracking
issue in LogicalRepApplyLoop: we need ping_sent to have the same
lifetime as last_recv_timestamp.  The effects are much less serious
than what that commit fixed, though.  AFAICS this would just lead to
extra ping requests being sent, once per second until the sender
responds.  Still, it's a bug, so backpatch to v10 as before.

Discussion: https://postgr.es/m/959627.1599248476@sss.pgh.pa.us
2020-09-04 20:33:36 -04:00
Tom Lane 9a851039aa Remove still more useless assignments.
Fix some more things scan-build pointed to as dead stores.  In some of
these cases, rearranging the code a little leads to more readable
code IMO.  It's all cosmetic, though.

Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com
2020-09-04 20:33:36 -04:00
Jeff Davis 0852006a94 Fix bogus MaxAllocSize check in logtape.c.
Reported-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-Wz=NZPZc3-fkdmvu=w2itx0PiB-G6QpxHXZOjuvFAzPdZw@mail.gmail.com
Backpatch-through: 13
2020-09-04 12:09:52 -07:00
Alvaro Herrera f43e295f68
Report expected contrecord length on mismatch
When reading a WAL record fails to find continuation record(s) of the
proper length, report what it expects, for clarity.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20200903212152.GA15319@alvherre.pgsql
2020-09-04 14:58:32 -04:00
Tom Lane 38a2d70329 Remove some more useless assignments.
Found with clang's scan-build tool.  It also whines about a lot of
other dead stores that we should *not* change IMO, either as a matter
of style or future-proofing.  But these places seem like clear
oversights.

Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com
2020-09-04 14:32:19 -04:00
Amit Kapila ac15b499f7 Fix inline marking introduced in commit 464824323e.
Forgot to add inline marking in changes_filename() declaration. In the passing, add
inline marking for a similar function subxact_filename().

Reported-By: Nathan Bossart
Discussion: https://postgr.es/m/E98FBE8F-B878-480D-A728-A60C6EED3047@amazon.com
2020-09-04 11:25:16 +05:30
Bruce Momjian e36e936e0e remove redundant initializations
Reported-by: Ranier Vilela

Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com

Author: Ranier Vilela

Backpatch-through: master
2020-09-03 22:57:35 -04:00
Michael Paquier 844c05abc3 Remove variable "concurrent" from ReindexStmt
This node already handles multiple options using a bitmask, so having a
separate boolean flag is not necessary.  This simplifies the code a bit
with less arguments to give to the reindex routines, by replacing the
boolean with an equivalent bitmask value.

Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/20200902110326.GA14963@paquier.xyz
2020-09-04 10:43:32 +09:00
Tom Lane 67a472d71c Remove arbitrary restrictions on password length.
This patch started out with the goal of harmonizing various arbitrary
limits on password length, but after awhile a better idea emerged:
let's just get rid of those fixed limits.

recv_password_packet() has an arbitrary limit on the packet size,
which we don't really need, so just drop it.  (Note that this doesn't
really affect anything for MD5 or SCRAM password verification, since
those will hash the user's password to something shorter anyway.
It does matter for auth methods that require a cleartext password.)

Likewise remove the arbitrary error condition in pg_saslprep().

The remaining limits are mostly in client-side code that prompts
for passwords.  To improve those, refactor simple_prompt() so that
it allocates its own result buffer that can be made as big as
necessary.  Actually, it proves best to make a separate routine
pg_get_line() that has essentially the semantics of fgets(), except
that it allocates a suitable result buffer and hence will never
return a truncated line.  (pg_get_line has a lot of potential
applications to replace randomly-sized fgets buffers elsewhere,
but I'll leave that for another patch.)

I built pg_get_line() atop stringinfo.c, which requires moving
that code to src/common/; but that seems fine since it was a poor
fit for src/port/ anyway.

This patch is mostly mine, but it owes a good deal to Nathan Bossart
who pressed for a solution to the password length problem and
created a predecessor patch.  Also thanks to Peter Eisentraut and
Stephen Frost for ideas and discussion.

Discussion: https://postgr.es/m/09512C4F-8CB9-4021-B455-EF4C4F0D55A0@amazon.com
2020-09-03 20:09:18 -04:00
Tom Lane be4b0c0077 Avoid lockup of a parallel worker when reporting a long error message.
Because sigsetjmp() will restore the initial state with signals blocked,
the code path in bgworker.c for reporting an error and exiting would
execute that way.  Usually this is fairly harmless; but if a parallel
worker had an error message exceeding the shared-memory communication
buffer size (16K) it would lock up, because it would wait for a
resume-sending signal from its parallel leader which it would never
detect.

To fix, just unblock signals at the appropriate point.

This can be shown to fail back to 9.6.  The lack of parallel query
infrastructure makes it difficult to provide a simple test case for
9.5; but I'm pretty sure the issue exists in some form there as well,
so apply the code change there too.

Vignesh C, reviewed by Bharath Rupireddy, Robert Haas, and myself

Discussion: https://postgr.es/m/CALDaNm1d1hHPZUg3xU4XjtWBOLCrA+-2cJcLpw-cePZ=GgDVfA@mail.gmail.com
2020-09-03 16:52:09 -04:00
Tom Lane 8f8154a503 Allow records to span multiple lines in pg_hba.conf and pg_ident.conf.
A backslash at the end of a line now causes the next line to be appended
to the current one (effectively, the backslash and newline are discarded).
This allows long HBA entries to be created without legibility problems.

While we're here, get rid of the former hard-wired length limit on
pg_hba.conf lines, by using an expansible StringInfo buffer instead
of a fixed-size local variable.

Since the same code is used to read the ident map file, these changes
apply there as well.

Fabien Coelho, reviewed by Justin Pryzby and David Zhang

Discussion: https://postgr.es/m/alpine.DEB.2.21.2003251906140.15243@pseudo
2020-09-03 12:16:48 -04:00
Amit Kapila 464824323e Add support for streaming to built-in logical replication.
To add support for streaming of in-progress transactions into the
built-in logical replication, we need to do three things:

* Extend the logical replication protocol, so identify in-progress
transactions, and allow adding additional bits of information (e.g.
XID of subtransactions).

* Modify the output plugin (pgoutput) to implement the new stream
API callbacks, by leveraging the extended replication protocol.

* Modify the replication apply worker, to properly handle streamed
in-progress transaction by spilling the data to disk and then
replaying them on commit.

We however must explicitly disable streaming replication during
replication slot creation, even if the plugin supports it. We
don't need to replicate the changes accumulated during this phase,
and moreover we don't have a replication connection open so we
don't have where to send the data anyway.

Author: Tomas Vondra, Dilip Kumar and Amit Kapila
Reviewed-by: Amit Kapila, Kuntal Ghosh and Ajin Cherian
Tested-by: Neha Sharma, Mahendra Singh Thalor and Ajin Cherian
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2020-09-03 07:54:07 +05:30
Tom Lane 66f1630680 Add string_to_table() function.
This splits a string at occurrences of a delimiter.  It is exactly like
string_to_array() except for producing a set of values instead of an
array of values.  Thus, the relationship of these two functions is
the same as between regexp_split_to_table() and regexp_split_to_array().

Although the same results could be had from unnest(string_to_array()),
this is somewhat faster than that, and anyway it seems reasonable to
have it for symmetry with the regexp functions.

Pavel Stehule, reviewed by Peter Smith

Discussion: https://postgr.es/m/CAFj8pRD8HOpjq2TqeTBhSo_QkzjLOhXzGCpKJ4nCs7Y9SQkuPw@mail.gmail.com
2020-09-02 18:23:56 -04:00
Fujii Masao be9788e998 Avoid unnecessary acquisition of SyncRepLock in transaction commit time.
In SyncRepWaitForLSN() routine called in transaction commit time,
SyncRepLock is necessary to atomically both check the shared
sync_standbys_defined flag and operate the sync replication wait-queue.
On the other hand, when the flag is false, the lock is not necessary
because the wait-queue is not touched. But due to the changes by
commit 48c9f49265, previously the lock was taken whatever the flag was.
This could cause unnecessary performance overhead in every transaction
commit time. Therefore this commit avoids that unnecessary aquisition
of SyncRepLock.

Author: Fujii Masao
Reviewed-by: Asim Praveen, Masahiko Sawada,
Discussion: https://postgr.es/m/20200406050332.nsscfqjzk2d57zyx@alap3.anarazel.de
2020-09-02 10:55:55 +09:00
Michael Paquier 1d65416661 Improve handling of dropped relations for REINDEX DATABASE/SCHEMA/SYSTEM
When multiple relations are reindexed, a scan of pg_class is done first
to build the list of relations to work on.  However the REINDEX logic
has never checked if a relation listed still exists when beginning the
work on it, causing for example sudden cache lookup failures.

This commit adds safeguards against dropped relations for REINDEX,
similarly to VACUUM or CLUSTER where we try to open the relation,
ignoring it if it is missing.  A new option is added to the REINDEX
routines to control if a missed relation is OK to ignore or not.

An isolation test, based on REINDEX SCHEMA, is added for the concurrent
and non-concurrent cases.

Author: Michael Paquier
Reviewed-by: Anastasia Lubennikova
Discussion: https://postgr.es/m/20200813043805.GE11663@paquier.xyz
2020-09-02 09:08:12 +09:00
Tom Lane a7212be8b9 Set cutoff xmin more aggressively when vacuuming a temporary table.
Since other sessions aren't allowed to look into a temporary table
of our own session, we do not need to worry about the global xmin
horizon when setting the vacuum XID cutoff.  Indeed, if we're not
inside a transaction block, we may set oldestXmin to be the next
XID, because there cannot be any in-doubt tuples in a temp table,
nor any tuples that are dead but still visible to some snapshot of
our transaction.  (VACUUM, of course, is never inside a transaction
block; but we need to test that because CLUSTER shares the same code.)

This approach allows us to always clean out a temp table completely
during VACUUM, independently of concurrent activity.  Aside from
being useful in its own right, that simplifies building reproducible
test cases.

Discussion: https://postgr.es/m/3490536.1598629609@sss.pgh.pa.us
2020-09-01 18:40:43 -04:00
Alvaro Herrera afc7e0ad55
Raise error on concurrent drop of partitioned index
We were already raising an error for DROP INDEX CONCURRENTLY on a
partitioned table, albeit a different and confusing one:
  ERROR:  DROP INDEX CONCURRENTLY must be first action in transaction

Change that to throw a more comprehensible error:
  ERROR:  cannot drop partitioned index \"%s\" concurrently

Michael Paquier authored the test case for indexes on temporary
partitioned tables.

Backpatch to 11, where indexes on partitioned tables were added.

Reported-by: Jan Mussler <jan.mussler@zalando.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/16594-d2956ca909585067@postgresql.org
2020-09-01 13:40:43 -04:00
Amit Kapila 4ab77697f6 Fix the SharedFileSetUnregister API.
Commit 808e13b282 introduced a few APIs to extend the existing Buffile
interface. In SharedFileSetDeleteOnProcExit, it tries to delete the list
element while traversing the list with 'foreach' construct which makes the
behavior of list traversal unpredictable.

Author: Amit Kapila
Reviewed-by: Dilip Kumar
Tested-by: Dilip Kumar and Neha Sharma
Discussion: https://postgr.es/m/CAA4eK1JhLatVcQ2OvwA_3s0ih6Hx9+kZbq107cXVsSWWukH7vA@mail.gmail.com
2020-09-01 08:11:39 +05:30
Tom Lane 3d351d916b Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.
Historically, we've considered the state with relpages and reltuples
both zero as indicating that we do not know the table's tuple density.
This is problematic because it's impossible to distinguish "never yet
vacuumed" from "vacuumed and seen to be empty".  In particular, a user
cannot use VACUUM or ANALYZE to override the planner's normal heuristic
that an empty table should not be believed to be empty because it is
probably about to get populated.  That heuristic is a good safety
measure, so I don't care to abandon it, but there should be a way to
override it if the table is indeed intended to stay empty.

Hence, represent the initial state of ignorance by setting reltuples
to -1 (relpages is still set to zero), and apply the minimum-ten-pages
heuristic only when reltuples is still -1.  If the table is empty,
VACUUM or ANALYZE (but not CREATE INDEX) will override that to
reltuples = relpages = 0, and then we'll plan on that basis.

This requires a bunch of fiddly little changes, but we can get rid of
some ugly kluges that were formerly needed to maintain the old definition.

One notable point is that FDWs' GetForeignRelSize methods will see
baserel->tuples = -1 when no ANALYZE has been done on the foreign table.
That seems like a net improvement, since those methods were formerly
also in the dark about what baserel->tuples = 0 really meant.  Still,
it is an API change.

I bumped catversion because code predating this change would get confused
by seeing reltuples = -1.

Discussion: https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
2020-08-30 12:21:51 -04:00
Michael Paquier 9511fb37ac Reset indisreplident for an invalid index in DROP INDEX CONCURRENTLY
A failure when dropping concurrently an index used in a replica identity
could leave in pg_index an index marked as !indisvalid and
indisreplident.  Reindexing this index would switch back indisvalid to
true, and if the replica identity of the parent relation was switched to
use a different index, it would be possible to finish with more than one
index marked as indisreplident.  If that were to happen, this could mess
up with the relation cache as an incorrect index could be used for the
replica identity.

Indexes marked as invalid are discarded as candidates for the replica
identity, as of RelationGetIndexList(), so similarly to what is done
with indisclustered, resetting indisreplident when the index is marked
as invalid keeps things consistent.  REINDEX CONCURRENTLY's swapping
already resets the flag for the old index, while the new index inherits
the value of the old index to-be-dropped, so only DROP INDEX was an
issue.

Even if this is a bug, the sequence able to reproduce a problem requires
a failure while running DROP INDEX CONCURRENTLY, something unlikely
going to happen in the field, so no backpatch is done.

Author: Michael Paquier
Reviewed-by: Dmitry Dolgov
Discussion: https://postgr.es/m/20200827025721.GN2017@paquier.xyz
2020-08-30 14:14:34 +09:00
Tom Lane 10564ee02c Fix code for re-finding scan position in a multicolumn GIN index.
collectMatchBitmap() needs to re-find the index tuple it was previously
looking at, after transiently dropping lock on the index page it's on.
The tuple should still exist and be at its prior position or somewhere
to the right of that, since ginvacuum never removes tuples but
concurrent insertions could add one.  However, there was a thinko in
that logic, to the effect of expecting any inserted tuples to have the
same index "attnum" as what we'd been scanning.  Since there's no
physical separation of tuples with different attnums, it's not terribly
hard to devise scenarios where this fails, leading to transient "lost
saved point in index" errors.  (While I've duplicated this with manual
testing, it seems impossible to make a reproducible test case with our
available testing technology.)

Fix by just continuing the scan when the attnum doesn't match.

While here, improve the error message used if we do fail, so that it
matches the wording used in btree for a similar case.

collectMatchBitmap()'s posting-tree code path was previously not
exercised at all by our regression tests.  While I can't make
a regression test that exhibits the bug, I can at least improve
the code coverage here, so do that.  The test case I made for this
is an extension of one added by 4b754d6c1, so it only works in
HEAD and v13; didn't seem worth trying hard to back-patch it.

Per bug #16595 from Jesse Kinkead.  This has been broken since
multicolumn capability was added to GIN (commit 27cb66fdf),
so back-patch to all supported branches.

Discussion: https://postgr.es/m/16595-633118be8eef9ce2@postgresql.org
2020-08-27 17:36:13 -04:00
Michael Paquier 77c7267c37 Fix comment in procarray.c
The description of GlobalVisDataRels was missing, GlobalVisCatalogRels
being mentioned instead.

Author: Jim Nasby
Discussion: https://postgr.es/m/8e06c883-2858-1fd4-07c5-560c28b08dcd@amazon.com
2020-08-27 16:40:34 +09:00