Commit Graph

867 Commits

Author SHA1 Message Date
Peter Eisentraut 3c173a53a8 Remove utils/acl.h from catalog/objectaddress.h
The need for this was removed by
8b9e9644dc.

A number of files now need to include utils/acl.h or
parser/parse_node.h explicitly where they previously got it indirectly
somehow.

Since parser/parse_node.h already includes nodes/parsenodes.h, the
latter is then removed where the former was added.  Also, remove
nodes/pg_list.h from objectaddress.h, since that's included via
nodes/parsenodes.h.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/7601e258-26b2-8481-36d0-dc9dca6f28f1%402ndquadrant.com
2020-03-10 10:27:00 +01:00
Peter Eisentraut 17b9e7f9fe Support adding partitioned tables to publication
When a partitioned table is added to a publication, changes of all of
its partitions (current or future) are published via that publication.

This change only affects which tables a publication considers as its
members.  The receiving side still sees the data coming from the
individual leaf partitions.  So existing restrictions that partition
hierarchies can only be replicated one-to-one are not changed by this.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-03-10 09:09:32 +01:00
Tom Lane 3ed2005ff5 Introduce macros for typalign and typstorage constants.
Our usual practice for "poor man's enum" catalog columns is to define
macros for the possible values and use those, not literal constants,
in C code.  But for some reason lost in the mists of time, this was
never done for typalign/attalign or typstorage/attstorage.  It's never
too late to make it better though, so let's do that.

The reason I got interested in this right now is the need to duplicate
some uses of the TYPSTORAGE constants in an upcoming ALTER TYPE patch.
But in general, this sort of change aids greppability and readability,
so it's a good idea even without any specific motivation.

I may have missed a few places that could be converted, and it's even
more likely that pending patches will re-introduce some hard-coded
references.  But that's not fatal --- there's no expectation that
we'd actually change any of these values.  We can clean up stragglers
over time.

Discussion: https://postgr.es/m/16457.1583189537@sss.pgh.pa.us
2020-03-04 10:34:25 -05:00
Fujii Masao e65497df8f Report progress of streaming base backup.
This commit adds pg_stat_progress_basebackup view that reports
the progress while an application like pg_basebackup is taking
a base backup. This uses the progress reporting infrastructure
added by c16dc1aca5, adding support for streaming base backup.

Bump catversion.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Amit Langote, Sergei Kornilov
Discussion: https://postgr.es/m/9ed8b801-8215-1f3d-62d7-65bff53f6e94@oss.nttdata.com
2020-03-03 12:03:43 +09:00
Alvaro Herrera 2f9661311b
Represent command completion tags as structs
The backend was using strings to represent command tags and doing string
comparisons in multiple places, but that's slow and unhelpful.  Create a
new command list with a supporting structure to use instead; this is
stored in a tag-list-file that can be tailored to specific purposes with
a caller-definable C macro, similar to what we do for WAL resource
managers.  The first first such uses are a new CommandTag enum and a
CommandTagBehavior struct.

Replace numerous occurrences of char *completionTag with a
QueryCompletion struct so that the code no longer stores information
about completed queries in a cstring.  Only at the last moment, in
EndCommand(), does this get converted to a string.

EventTriggerCacheItem no longer holds an array of palloc’d tag strings
in sorted order, but rather just a Bitmapset over the CommandTags.

Author: Mark Dilger, with unsolicited help from Álvaro Herrera
Reviewed-by: John Naylor, Tom Lane
Discussion: https://postgr.es/m/981A9DB4-3F0C-4DA5-88AD-CB9CFF4D6CAD@enterprisedb.com
2020-03-02 18:19:51 -03:00
Michael Paquier 12c5cad76f Handle logical decoding in multi-insert for catalog tuples
The code path for multi-insert decoding is not stressed yet for
catalogs (a future patch may introduce this capability), so no
back-patch is needed.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/9690D72F-5C4F-4016-9572-6D16684E1D87@yesql.se
2020-03-02 10:00:37 +09:00
Michael Paquier bf883b211e Add prefix checks in exclude lists for pg_rewind, pg_checksums and base backups
An instance of PostgreSQL crashing with a bad timing could leave behind
temporary pg_internal.init files, potentially causing failures when
verifying checksums.  As the same exclusion lists are used between
pg_rewind, pg_checksums and basebackup.c, all those tools are extended
with prefix checks to keep everything in sync, with dedicated checks
added for pg_internal.init.

Backpatch down to 11, where pg_checksums (pg_verify_checksums in 11) and
checksum verification for base backups have been introduced.

Reported-by: Michael Banck
Author: Michael Paquier
Reviewed-by: Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/62031974fd8e941dd8351fbc8c7eff60d59c5338.camel@credativ.de
Backpatch-through: 11
2020-02-24 18:13:25 +09:00
Amit Kapila e3ff789acf Stop demanding that top xact must be seen before subxact in decoding.
Manifested as

ERROR:  subtransaction logged without previous top-level txn record

this check forbids legit behaviours like
 - First xl_xact_assignment record is beyond reading, i.e. earlier
   restart_lsn.
 - After restart_lsn there is some change of a subxact.
 - After that, there is second xl_xact_assignment (for another subxact)
   revealing the relationship between top and first subxact.

Such a transaction won't be streamed anyway because we hadn't seen it in
full.  Saying for sure whether xact of some record encountered after
the snapshot was deserialized can be streamed or not requires to know
whether it wrote something before deserialization point --if yes, it
hasn't been seen in full and can't be decoded. Snapshot doesn't have such
info, so there is no easy way to relax the check.

Reported-by: Hsu, John
Diagnosed-by: Arseny Sher
Author: Arseny Sher, Amit Kapila
Reviewed-by: Amit Kapila, Dilip Kumar
Backpatch-through: 9.5
Discussion: https://postgr.es/m/AB5978B2-1772-4FEE-A245-74C91704ECB0@amazon.com
2020-02-19 08:15:49 +05:30
Peter Eisentraut ad3ae64770 Fill in extraUpdatedCols in logical replication
The extraUpdatedCols field of the target RTE records which generated
columns are affected by an update.  This is used in a variety of
places, including per-column triggers and foreign data wrappers.  When
an update was initiated by a logical replication subscription, this
field was not filled in, so such an update would not affect generated
columns in a way that is consistent with normal updates.  To fix,
factor out some code from analyze.c to fill in extraUpdatedCols in the
logical replication worker as well.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/b05e781a-fa16-6b52-6738-761181204567@2ndquadrant.com
2020-02-17 15:20:57 +01:00
Thomas Munro 701a51fd4e Use pg_pwrite() in more places.
This removes some lseek() system calls.

Author: Thomas Munro
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CA%2BhUKGJ%2BoHhnvqjn3%3DHro7xu-YDR8FPr0FL6LF35kHRX%3D_bUzg%40mail.gmail.com
2020-02-11 17:50:22 +13:00
Alvaro Herrera c9d2977519 Clean up newlines following left parentheses
We used to strategically place newlines after some function call left
parentheses to make pgindent move the argument list a few chars to the
left, so that the whole line would fit under 80 chars.  However,
pgindent no longer does that, so the newlines just made the code
vertically longer for no reason.  Remove those newlines, and reflow some
of those lines for some extra naturality.

Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20200129200401.GA6303@alvherre.pgsql
2020-01-30 13:42:14 -03:00
Alvaro Herrera 4e89c79a52 Remove excess parens in ereport() calls
Cosmetic cleanup, not worth backpatching.

Discussion: https://postgr.es/m/20200129200401.GA6303@alvherre.pgsql
Reviewed-by: Tom Lane, Michael Paquier
2020-01-30 13:32:04 -03:00
Michael Paquier b0afdcad21 Fix slot data persistency when advancing physical replication slots
Advancing a physical replication slot with pg_replication_slot_advance()
did not mark the slot as dirty if any advancing was done, preventing the
follow-up checkpoint to flush the slot data to disk.  This caused the
advancing to be lost even on clean restarts.  This does not happen for
logical slots as any advancing marked the slot as dirty.  Per
discussion, the original feature has been implemented so as in the event
of a crash the slot may move backwards to a past LSN.  This property is
kept and more documentation is added about that.

This commit adds some new TAP tests to check the persistency of physical
and logical slots after advancing across clean restarts.

Author: Alexey Kondratov, Michael Paquier
Reviewed-by: Andres Freund, Kyotaro Horiguchi, Craig Ringer
Discussion: https://postgr.es/m/059cc53a-8b14-653a-a24d-5f867503b0ee@postgrespro.ru
Backpatch-through: 11
2020-01-30 11:14:02 +09:00
Heikki Linnakangas 38a957316d Refactor XLogReadRecord(), adding XLogBeginRead() function.
The signature of XLogReadRecord() required the caller to pass the starting
WAL position as argument, or InvalidXLogRecPtr to continue reading at the
end of previous record. That's slightly awkward to the callers, as most
of them don't want to randomly jump around in the WAL stream, but start
reading at one position and then read everything from that point onwards.
Remove the 'RecPtr' argument and add a new function XLogBeginRead() to
specify the starting position instead. That's more convenient for the
callers. Also, xlogreader holds state that is reset when you change the
starting position, so having a separate function for doing that feels like
a more natural fit.

This changes XLogFindNextRecord() function so that it doesn't reset the
xlogreader's state to what it was before the call anymore. Instead, it
positions the xlogreader to the found record, like XLogBeginRead().

Reviewed-by: Kyotaro Horiguchi, Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/5382a7a3-debe-be31-c860-cb810c08f366%40iki.fi
2020-01-26 11:39:00 +02:00
Alvaro Herrera 15cac3a523 Set ReorderBufferTXN->final_lsn more eagerly
... specifically, set it incrementally as each individual change is
spilled down to disk.  This way, it is set correctly when the
transaction disappears without trace, ie. without leaving an XACT_ABORT
wal record.  (This happens when the server crashes midway through a
transaction.)

Failing to have final_lsn prevents ReorderBufferRestoreCleanup() from
working, since it needs the final_lsn in order to know the endpoint of
its iteration through spilled files.

Commit df9f682c7b already tried to fix the problem, but it didn't set
the final_lsn in all cases.  Revert that, since it's no longer needed.

Author: Vignesh C
Reviewed-by: Amit Kapila, Dilip Kumar
Discussion: https://postgr.es/m/CALDaNm2CLk+K9JDwjYST0sPbGg5AQdvhUt0jbKyX_HdAE0jk3A@mail.gmail.com
2020-01-17 18:00:39 -03:00
Peter Eisentraut fe233366f2 Fix compiler warning about format on Windows
On 64-bit Windows, pid_t is long long int, so a %d format isn't
enough.
2020-01-14 23:59:18 +01:00
Peter Eisentraut 3297308278 walreceiver uses a temporary replication slot by default
If no permanent replication slot is configured using
primary_slot_name, the walreceiver now creates and uses a temporary
replication slot.  A new setting wal_receiver_create_temp_slot can be
used to disable this behavior, for example, if the remote instance is
out of replication slots.

Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/CA%2Bfd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V%2BnqZA%40mail.gmail.com
2020-01-14 14:40:41 +01:00
Peter Eisentraut ee4ac46c8e Expose PQbackendPID() through walreceiver API
This will be used by a subsequent patch.

Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/CA%2Bfd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V%2BnqZA%40mail.gmail.com
2020-01-14 14:40:41 +01:00
Peter Eisentraut 259bbe1778 Fix base backup with database OIDs larger than INT32_MAX
The use of pg_atoi() for parsing a string into an Oid fails for values
larger than INT32_MAX, since OIDs are unsigned.  Instead, use
atooid().  While this has less error checking, the contents of the
data directory are expected to be trustworthy, so we don't need to go
out of our way to do full error checking.

Discussion: https://www.postgresql.org/message-id/flat/dea47fc8-6c89-a2b1-07e3-754ff1ab094b%402ndquadrant.com
2020-01-13 13:41:12 +01:00
Michael Paquier 1088729e84 Remove incorrect assertion for INSERT in logical replication's publisher
On the publisher, it was assumed that an INSERT change cannot happen for
a relation with no replica identity.  However this is true only for a
change that needs references to old rows, aka UPDATE or DELETE, so
trying to use logical replication with a relation that has no replica
identity led to an assertion failure in the publisher when issuing an
INSERT.  This commit removes the incorrect assertion, and adds more
regression tests to provide coverage for relations without replica
identity.

Reported-by: Neha Sharma
Author: Dilip Kumar, Michael Paquier
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CANiYTQsL1Hb8_Km08qd32svrqNumXLJeoGo014O7VZymgOhZEA@mail.gmail.com
Backpatch-through: 10
2020-01-12 22:43:45 +09:00
Peter Eisentraut c67a55da4e Make lsn argument of walrcv_create_slot() optional
Some callers are not using it, so it's wasteful to have to specify it.

Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/CA+fd4k4BcYrYucNfTnK-CQX3+jsG+PRPEhHAUSo-W4P0Lec57A@mail.gmail.com
2020-01-11 09:07:14 +01:00
Alvaro Herrera a7b6ab5db1 Clean up representation of flags in struct ReorderBufferTXN
This simplifies addition of further flags.

Author: Nikhil Sontakke
Discussion: https://postgr.es/m/CAMGcDxeViP+R-OL7QhzUV9eKCVjURobuY1Zijik4Ay_Ddwo4Cg@mail.gmail.com
2020-01-10 17:46:57 -03:00
Tom Lane e369f37086 Reduce the number of GetFlushRecPtr() calls done by walsenders.
Since the WAL flush position only moves forward, it's safe to cache
its previous value within each walsender process, and update from
shared memory only once we've caught up to the previously-seen value.
When there are many active walsenders, this makes for a very significant
reduction in the amount of contention on the XLogCtl->info_lck spinlock.

This patch also adjusts the logic so that we update our idea of the
flush position after processing a WAL record, rather than beforehand.
This may cause us to realize we're not caught up when the preceding
coding would've thought that we were, but that seems all to the good;
it may avoid a useless sleep-and-wakeup cycle.

Back-patch to v12.  The contention problem exists in prior branches,
but it's much less severe (due to inefficiencies elsewhere) so there
seems no need to take any risk of back-patching further.

Pierre Ducroquet, reviewed by Julien Rouhaud

Discussion: https://postgr.es/m/2931018.Vxl9zapr77@pierred-pdoc
2020-01-06 16:42:20 -05:00
Peter Eisentraut b9c130a1fd Have logical replication subscriber fire column triggers
The logical replication apply worker did not fire per-column update
triggers because the updatedCols bitmap in the RTE was not populated.
This fixes that.

Reviewed-by: Euler Taveira <euler@timbira.com.br>
Discussion: https://www.postgresql.org/message-id/flat/21673e2d-597c-6afe-637e-e8b10425b240%402ndquadrant.com
2020-01-06 08:40:00 +01:00
Tom Lane 5815696bc6 Make parser rely more heavily on the ParseNamespaceItem data structure.
When I added the ParseNamespaceItem data structure (in commit 5ebaaa494),
it wasn't very tightly integrated into the parser's APIs.  In the wake of
adding p_rtindex to that struct (commit b541e9acc), there is a good reason
to make more use of it: by passing around ParseNamespaceItem pointers
instead of bare RTE pointers, we can get rid of various messy methods for
passing back or deducing the rangetable index of an RTE during parsing.
Hence, refactor the addRangeTableEntryXXX functions to build and return
a ParseNamespaceItem struct, not just the RTE proper; and replace
addRTEtoQuery with addNSItemToQuery, which is passed a ParseNamespaceItem
rather than building one internally.

Also, add per-column data (a ParseNamespaceColumn array) to each
ParseNamespaceItem.  These arrays are built during addRangeTableEntryXXX,
where we have column type data at hand so that it's nearly free to fill
the data structure.  Later, when we need to build Vars referencing RTEs,
we can use the ParseNamespaceColumn info to avoid the rather expensive
operations done in get_rte_attribute_type() or expandRTE().
get_rte_attribute_type() is indeed dead code now, so I've removed it.
This makes for a useful improvement in parse analysis speed, around 20%
in one moderately-complex test query.

The ParseNamespaceColumn structs also include Var identity information
(varno/varattno).  That info isn't actually being used in this patch,
except that p_varno == 0 is a handy test for a dropped column.
A follow-on patch will make more use of it.

Discussion: https://postgr.es/m/2461.1577764221@sss.pgh.pa.us
2020-01-02 11:29:01 -05:00
Amit Kapila d207038053 Fix running out of file descriptors for spill files.
Currently while decoding changes, if the number of changes exceeds a
certain threshold, we spill those to disk.  And this happens for each
(sub)transaction.  Now, while reading all these files, we don't close them
until we read all the files.  While reading these files, if the number of
such files exceeds the maximum number of file descriptors, the operation
errors out.

Use PathNameOpenFile interface to open these files as that internally has
the mechanism to release kernel FDs as needed to get us under the
max_safe_fds limit.

Reported-by: Amit Khandekar
Author: Amit Khandekar
Reviewed-by: Amit Kapila
Backpatch-through: 9.4
Discussion: https://postgr.es/m/CAJ3gD9c-sECEn79zXw4yBnBdOttacoE-6gAyP0oy60nfs_sabQ@mail.gmail.com
2020-01-02 11:41:04 +05:30
Bruce Momjian 7559d8ebfa Update copyrights for 2020
Backpatch-through: update all files in master, backpatch legal files through 9.4
2020-01-01 12:21:45 -05:00
Michael Paquier 7854e07f25 Revert "Rename files and headers related to index AM"
This follows multiple complains from Peter Geoghegan, Andres Freund and
Alvaro Herrera that this issue ought to be dug more before actually
happening, if it happens.

Discussion: https://postgr.es/m/20191226144606.GA5659@alvherre.pgsql
2019-12-27 08:09:00 +09:00
Michael Paquier 044b319cd7 Fix some comments related to logical repslot advancing
confirmed_flush is part of a replication slot's information, but not
confirmed_lsn.

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20191226.175919.17237335658671970.horikyota.ntt@gmail.com
Backpatch-through: 11
2019-12-26 22:26:09 +09:00
Michael Paquier 8ce3aa9b59 Rename files and headers related to index AM
The following renaming is done so as source files related to index
access methods are more consistent with table access methods (the
original names used for index AMs ware too generic, and could be
confused as including features related to table AMs):
- amapi.h -> indexam.h.
- amapi.c -> indexamapi.c.  Here we have an equivalent with
backend/access/table/tableamapi.c.
- amvalidate.c -> indexamvalidate.c.
- amvalidate.h -> indexamvalidate.h.
- genam.c -> indexgenam.c.
- genam.h -> indexgenam.h.

This has been discussed during the development of v12 when table AM was
worked on, but the renaming never happened.

Author: Michael Paquier
Reviewed-by: Fabien Coelho, Julien Rouhaud
Discussion: https://postgr.es/m/20191223053434.GF34339@paquier.xyz
2019-12-25 10:23:39 +09:00
Alvaro Herrera c4dcd9144b Avoid splitting C string literals with \-newline
Using \ is unnecessary and ugly, so remove that.  While at it, stitch
the literals back into a single line: we've long discouraged splitting
error message literals even when they go past the 80 chars line limit,
to improve greppability.

Leave contrib/tablefunc alone.

Discussion: https://postgr.es/m/20191223195156.GA12271@alvherre.pgsql
2019-12-24 12:44:12 -03:00
Robert Haas 16a4e4aecd Extend the ProcSignal mechanism to support barriers.
A new function EmitProcSignalBarrier() can be used to emit a global
barrier which all backends that participate in the ProcSignal
mechanism must absorb, and a new function WaitForProcSignalBarrier()
can be used to wait until all relevant backends have in fact
absorbed the barrier.

This can be used to coordinate global state changes, such as turning
checksums on while the system is running.

There's no real client of this mechanism yet, although two are
proposed, but an enum has to have at least one element, so this
includes a placeholder type (PROCSIGNAL_BARRIER_PLACEHOLDER) which
should be replaced by the first real client of this mechanism to
get committed.

Andres Freund and Robert Haas, reviewed by Daniel Gustafsson and,
in earlier versions, by Magnus Hagander.

Discussion: http://postgr.es/m/CA+TgmoZwDk=BguVDVa+qdA6SBKef=PKbaKDQALTC_9qoz1mJqg@mail.gmail.com
2019-12-19 14:56:20 -05:00
Robert Haas 303640199d Fix minor problems with non-exclusive backup cleanup.
The previous coding imagined that it could call before_shmem_exit()
when a non-exclusive backup began and then remove the previously-added
handler by calling cancel_before_shmem_exit() when that backup
ended. However, this only works provided that nothing else in the
system has registered a before_shmem_exit() hook in the interim,
because cancel_before_shmem_exit() is documented to remove a callback
only if it is the latest callback registered. It also only works
if nothing can ERROR out between the time that sessionBackupState
is reset and the time that cancel_before_shmem_exit(), which doesn't
seem to be strictly true.

To fix, leave the handler installed for the lifetime of the session,
arrange to install it just once, and teach it to quietly do nothing if
there isn't a non-exclusive backup in process.

This is a bug, but for now I'm not going to back-patch, because the
consequences are minor. It's possible to cause a spurious warning
to be generated, but that doesn't really matter. It's also possible
to trigger an assertion failure, but production builds shouldn't
have assertions enabled.

Patch by me, reviewed by Kyotaro Horiguchi, Michael Paquier (who
preferred a different approach, but got outvoted), Fujii Masao,
and Tom Lane, and with comments by various others.

Discussion: http://postgr.es/m/CA+TgmobMjnyBfNhGTKQEDbqXYE3_rXWpc4CM63fhyerNCes3mA@mail.gmail.com
2019-12-19 09:06:54 -05:00
Michael Paquier e1551f96e6 Refactor attribute mappings used in logical tuple conversion
Tuple conversion support in tupconvert.c is able to convert rowtypes
between two relations, inner and outer, which are logically equivalent
but have a different ordering or even dropped columns (used mainly for
inheritance tree and partitions).  This makes use of attribute mappings,
which are simple arrays made of AttrNumber elements with a length
matching the number of attributes of the outer relation.  The length of
the attribute mapping has been treated as completely independent of the
mapping itself until now, making it easy to pass down an incorrect
mapping length.

This commit refactors the code related to attribute mappings and moves
it into an independent facility called attmap.c, extracted from
tupconvert.c.  This merges the attribute mapping with its length,
avoiding to try to guess what is the length of a mapping to use as this
is computed once, when the map is built.

This will avoid mistakes like what has been fixed in dc816e58, which has
used an incorrect mapping length by matching it with the number of
attributes of an inner relation (a child partition) instead of an outer
relation (a partitioned table).

Author: Michael Paquier
Reviewed-by: Amit Langote
Discussion: https://postgr.es/m/20191121042556.GD153437@paquier.xyz
2019-12-18 16:23:02 +09:00
Amit Kapila 04c8a69c0c Fix subscriber invalid memory access on DDL.
This patch allows building the local relmap cache for a subscribed
relation after processing pending invalidation messages and potential
relcache updates.  Without this, the attributes in the local cache don't
tally with the updated relcache entry leading to invalid memory access.

Reported-by Jehan-Guillaume de Rorthais
Author: Jehan-Guillaume de Rorthais and Vignesh C
Reviewed-by: Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/20191025175929.7e90dbf5@firost
2019-12-18 07:49:18 +05:30
Robert Haas 7dbfea3c45 Partially deduplicate interrupt handling for background processes.
Where possible, share signal handler code and main loop interrupt
checking. This saves quite a bit of code and should simplify
maintenance, too.

This commit intends not to change the way anything works, even
though that might allow more code to be unified. It does unify
a bunch of individual variables into a ShutdownRequestPending
flag that has is now used by a bunch of different process types,
though.

Patch by me, reviewed by Andres Freund and Daniel Gustafsson.

Discussion: http://postgr.es/m/CA+TgmoZwDk=BguVDVa+qdA6SBKef=PKbaKDQALTC_9qoz1mJqg@mail.gmail.com
2019-12-17 13:14:28 -05:00
Robert Haas 1e53fe0e70 Use PostgresSigHupHandler in more places.
There seems to be no reason for every background process to have
its own flag indicating that a config-file reload is needed.
Instead, let's just use ConfigFilePending for that purpose
everywhere.

Patch by me, reviewed by Andres Freund and Daniel Gustafsson.

Discussion: http://postgr.es/m/CA+TgmoZwDk=BguVDVa+qdA6SBKef=PKbaKDQALTC_9qoz1mJqg@mail.gmail.com
2019-12-17 13:03:57 -05:00
Michael Paquier 9989d37d1c Remove XLogFileNameP() from the tree
XLogFileNameP() is a wrapper routine able to build a palloc'd string for
a WAL segment name, which is used for error string generation.  There
were several code paths where it gets called in a critical section,
where memory allocation is not allowed.  This results in triggering
an assertion failure instead of generating the wanted error message.

Another, more annoying, problem is that if the allocation to generate
the WAL segment name fails on OOM, then the failure would be escalated
to a PANIC.

This removes the routine and all its callers are replaced with a logic
using a fixed-size buffer.  This way, all the existing mistakes are
fixed and future ones are prevented.

Author: Masahiko Sawada
Reviewed-by: Michael Paquier, Álvaro Herrera
Discussion: https://postgr.es/m/CA+fd4k5gC9H4uoWMLg9K_QfNrnkkdEw+-AFveob9YX7z8JnKTA@mail.gmail.com
2019-12-03 15:06:04 +09:00
Alvaro Herrera 3974c4a724 Remove useless "return;" lines
Discussion: https://postgr.es/m/20191128144653.GA27883@alvherre.pgsql
2019-11-28 16:48:37 -03:00
Robert Haas 0d3c3aae33 Use procsignal_sigusr1_handler for auxiliary processes.
AuxiliaryProcessMain does ProcSignalInit, so one might expect that
auxiliary processes would need to respond to SendProcSignal, but none
of the auxiliary processes do that. Change them to use
procsignal_sigusr1_handler instead of their own private handlers so
that they do. Besides seeming more correct, this is also less code. It
shouldn't make any functional difference right now because, as far as
we know, there are no current cases where SendProcSignal targets an
auxiliary process, but there are plans to change that in the future.

Andres Freund

Discussion: http://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
2019-11-25 16:16:27 -05:00
Alvaro Herrera 0dc8ead463 Refactor WAL file-reading code into WALRead()
XLogReader, walsender and pg_waldump all had their own routines to read
data from WAL files to memory, with slightly different approaches
according to the particular conditions of each environment.  There's a
lot of commonality, so we can refactor that into a single routine
WALRead in XLogReader, and move the differences to a separate (simpler)
callback that just opens the next WAL-segment.  This results in a
clearer (ahem) code flow.

The error reporting needs are covered by filling in a new error-info
struct, WALReadError, and it's the caller's responsibility to act on it.
The backend has WALReadRaiseError() to do so.

We no longer ever need to seek in this interface; switch to using
pg_pread().

Author: Antonin Houska, with contributions from Álvaro Herrera
Reviewed-by: Michaël Paquier, Kyotaro Horiguchi
Discussion: https://postgr.es/m/14984.1554998742@spoje.net
2019-11-25 15:04:54 -03:00
Tom Lane 5883f5fe27 Fix unportable printf format introduced in commit 9290ad198.
"%ld" is not an acceptable format spec for int64 variables, though
it accidentally works on most non-Windows 64-bit platforms.  Follow
the lead of commit 6a1cd8b92, and use "%lld" with an explicit cast
to long long.  Per buildfarm.
2019-11-25 10:48:36 -05:00
Tom Lane 4d9ceb0018 Fix bogus tuple-slot management in logical replication UPDATE handling.
slot_modify_cstrings seriously abused the TupleTableSlot API by relying
on a slot's underlying data to stay valid across ExecClearTuple.  Since
this abuse was also quite undocumented, it's little surprise that the
case got broken during the v12 slot rewrites.  As reported in bug #16129
from Ondřej Jirman, this could lead to crashes or data corruption when
a logical replication subscriber processes a row update.  Problems would
only arise if the subscriber's table contained columns of pass-by-ref
types that were not being copied from the publisher.

Fix by explicitly copying the datum/isnull arrays from the source slot
that the old row was in already.  This ends up being about the same
thing that happened pre-v12, but hopefully in a less opaque and
fragile way.

We might've caught the problem sooner if there were any test cases
dealing with updates involving non-replicated or dropped columns.
Now there are.

Back-patch to v10 where this code came in.  Even though the failure
does not manifest before v12, IMO this code is too fragile to leave
as-is.  In any case we certainly want the additional test coverage.

Patch by me; thanks to Tomas Vondra for initial investigation.

Discussion: https://postgr.es/m/16129-a0c0f48e71741e5f@postgresql.org
2019-11-22 11:31:19 -05:00
Amit Kapila 9290ad198b Track statistics for spilling of changes from ReorderBuffer.
This adds the statistics about transactions spilled to disk from
ReorderBuffer.  Users can query the pg_stat_replication view to check
these stats.

Author: Tomas Vondra, with bug-fixes and minor changes by Dilip Kumar
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2019-11-21 08:06:51 +05:30
Amit Kapila cec2edfa78 Add logical_decoding_work_mem to limit ReorderBuffer memory usage.
Instead of deciding to serialize a transaction merely based on the
number of changes in that xact (toplevel or subxact), this makes
the decisions based on amount of memory consumed by the changes.

The memory limit is defined by a new logical_decoding_work_mem GUC,
so for example we can do this

    SET logical_decoding_work_mem = '128kB'

to reduce the memory usage of walsenders or set the higher value to
reduce disk writes. The minimum value is 64kB.

When adding a change to a transaction, we account for the size in
two places. Firstly, in the ReorderBuffer, which is then used to
decide if we reached the total memory limit. And secondly in the
transaction the change belongs to, so that we can pick the largest
transaction to evict (and serialize to disk).

We still use max_changes_in_memory when loading changes serialized
to disk. The trouble is we can't use the memory limit directly as
there might be multiple subxact serialized, we need to read all of
them but we don't know how many are there (and which subxact to
read first).

We do not serialize the ReorderBufferTXN entries, so if there is a
transaction with many subxacts, most memory may be in this type of
objects. Those records are not included in the memory accounting.

We also do not account for INTERNAL_TUPLECID changes, which are
kept in a separate list and not evicted from memory. Transactions
with many CTID changes may consume significant amounts of memory,
but we can't really do much about that.

The current eviction algorithm is very simple - the transaction is
picked merely by size, while it might be useful to also consider age
(LSN) of the changes for example. With the new Generational memory
allocator, evicting the oldest changes would make it more likely
the memory gets actually pfreed.

The logical_decoding_work_mem can be set in postgresql.conf, in which
case it serves as the default for all publishers on that instance.

Author: Tomas Vondra, with changes by Dilip Kumar and Amit Kapila
Reviewed-by: Dilip Kumar and Amit Kapila
Tested-By: Vignesh C
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2019-11-19 07:32:36 +05:30
Amit Kapila 14aec03502 Make the order of the header file includes consistent in backend modules.
Similar to commits 7e735035f2 and dddf4cdc33, this commit makes the order
of header file inclusion consistent for backend modules.

In the passing, removed a couple of duplicate inclusions.

Author: Vignesh C
Reviewed-by: Kuntal Ghosh and Amit Kapila
Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
2019-11-12 08:30:16 +05:30
Peter Eisentraut 1c60e40ad5 Fix negative bitmapset member not allowed error in logical replication
This happens when we add a replica identity column on a subscriber
that does not yet exist on the publisher, according to the mapping
maintained by the subscriber.  Code that checks whether the target
relation on the subscriber is updatable would check the replica
identity attribute bitmap with a column number -1, which would result
in an error.  To fix, skip such columns in the bitmap lookup and
consider the relation not updatable.  The result is consistent with
the rule that the replica identity columns on the subscriber must be a
subset of those on the publisher, since if the column doesn't exist on
the publisher, the column set on the subscriber can't be a subset.

Reported-by: Tim Clarke <tim.clarke@minerva.info>
Analyzed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://www.postgresql.org/message-id/flat/a9139c29-7ddd-973b-aa7f-71fed9c38d75%40minerva.info
2019-11-09 08:35:44 +01:00
Peter Eisentraut 3dcffb381c Fix gratuitous error message variation 2019-11-08 18:37:17 +01:00
Peter Eisentraut d40abd5fcf Fix memory allocation mistake
The previous code was allocating more memory than necessary because
the formula used the wrong data type.

Reported-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://www.postgresql.org/message-id/20191105172918.3e32a446@firost
2019-11-06 14:20:29 +01:00
Michael Paquier 5f6b1eb0cf Fix timestamp of sent message for write context in logical decoding
When sending data for logical decoding using the streaming replication
protocol via a WAL sender, the timestamp of the sent write message is
allocated at the beginning of the message when preparing for the write,
and actually computed when the write message is ready to be sent.

The timestamp was getting computed after sending the message.  This
impacts anything using logical decoding, causing for example logical
replication to report mostly NULL for last_msg_send_time in
pg_stat_subscription.

This commit makes sure that the timestamp is computed before sending the
message.  This is wrong since 5a991ef, so backpatch down to 9.4.

Author: Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1z=WMn8jt7iEdC5sYNaPgAgOASb_OW5JYv-vMdYaJSL-w@mail.gmail.com
Backpatch-through: 9.4
2019-11-06 16:12:21 +09:00