postgresql/src/backend/commands
Tom Lane 118e99c3d7 Fix low-probability loss of NOTIFY messages due to XID wraparound.
Up to now async.c has used TransactionIdIsInProgress() to detect whether
a notify message's source transaction is still running.  However, that
function has a quick-exit path that reports that XIDs before RecentXmin
are no longer running.  If a listening backend is doing nothing but
listening, and not running any queries, there is nothing that will advance
its value of RecentXmin.  Once 2 billion transactions elapse, the
RecentXmin check causes active transactions to be reported as not running.
If they aren't committed yet according to CLOG, async.c decides they
aborted and discards their messages.  The timing for that is a bit tight
but it can happen when multiple backends are sending notifies concurrently.
The net symptom therefore is that a sufficiently-long-surviving
listen-only backend starts to miss some fraction of NOTIFY traffic,
but only under heavy load.
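
The wraparound effect described above can be sketched in a few lines of C. This is a simplified model, not the code in async.c or procarray.c: the names `Xid`, `xid_precedes`, and `quick_exit_says_not_running` are hypothetical stand-ins, and the special-casing of permanent XIDs is omitted. It only illustrates that a 32-bit circular comparison against a stale cached xmin classifies an XID roughly 2^31 ahead of it as "older", and hence as no longer running.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit transaction ID. */
typedef uint32_t Xid;

/*
 * Circular ("modulo 2^32") comparison: true if a is logically older than b.
 * Assumption: permanent/special XIDs are ignored in this sketch.
 */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Model of the quick-exit test: any XID that precedes the cached xmin is
 * assumed to have finished.  If cached_recent_xmin is never refreshed,
 * XIDs assigned more than 2^31 transactions later wrap around and
 * satisfy this test even though they may still be running.
 */
static bool
quick_exit_says_not_running(Xid xid, Xid cached_recent_xmin)
{
    return xid_precedes(xid, cached_recent_xmin);
}
```

With a cached xmin of 1000, an XID of 1100 is not caught by the quick exit, whereas an XID just past 1000 + 2^31 is, which corresponds to the failure mode in the commit message.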

The only function that updates RecentXmin is GetSnapshotData().
A brute-force fix would therefore be to take a snapshot before
processing incoming notify messages.  But that would add cycles,
as well as contention for the ProcArrayLock.  We can be smarter:
having taken the snapshot, let's use that to check for running
XIDs, and not call TransactionIdIsInProgress() at all.  In this
way we reduce the number of ProcArrayLock acquisitions from one
per message to one per notify interrupt; that's the same under
light load but should be a benefit under heavy load.  Light testing
says that this change is a wash performance-wise for normal loads.
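
A minimal sketch of the "take one snapshot, then test each XID against it" idea follows. It is an illustration under stated assumptions, not the patch itself: `MiniSnapshot`, `xid_running_in_snapshot`, and `count_still_running` are hypothetical names, the snapshot is a plain struct rather than PostgreSQL's snapshot data, and subtransactions are ignored. The point is that once the snapshot exists, checking each message needs no further lock acquisition.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;           /* hypothetical stand-in for a transaction ID */

/* Hypothetical, simplified snapshot: taken once per notify interrupt. */
typedef struct MiniSnapshot
{
    Xid         xmin;           /* XIDs older than this had all finished */
    Xid         xmax;           /* XIDs at or beyond this are treated as running */
    const Xid  *xip;            /* XIDs in [xmin, xmax) that were in progress */
    size_t      xcnt;           /* number of entries in xip */
} MiniSnapshot;

/* Circular comparison: true if a is logically older than b. */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/* Decide from the snapshot alone whether xid was still running. */
static bool
xid_running_in_snapshot(Xid xid, const MiniSnapshot *snap)
{
    if (xid_precedes(xid, snap->xmin))
        return false;           /* finished before the snapshot was taken */
    if (!xid_precedes(xid, snap->xmax))
        return true;            /* started after, or too recent to know */
    for (size_t i = 0; i < snap->xcnt; i++)
    {
        if (snap->xip[i] == xid)
            return true;        /* listed as in progress */
    }
    return false;
}

/* Check a batch of message XIDs against one snapshot; no per-message locking. */
static size_t
count_still_running(const Xid *msg_xids, size_t n, const MiniSnapshot *snap)
{
    size_t      running = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (xid_running_in_snapshot(msg_xids[i], snap))
            running++;
    }
    return running;
}
```

Because the snapshot is fresh each time the queue is processed, its xmin cannot lag behind by billions of transactions, so the wraparound misclassification does not arise in this model.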

I looked around for other callers of TransactionIdIsInProgress()
that might be at similar risk, and didn't find any; all of them
are inside transactions that presumably have already taken a
snapshot.

Problem report and diagnosis by Marko Tiikkaja, patch by me.
Back-patch to all supported branches, since it's been like this
since 9.0.

Discussion: https://postgr.es/m/20170926182935.14128.65278@wrigleys.postgresql.org
2017-10-11 14:28:33 -04:00
aggregatecmds.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
alter.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
amcmds.c Fix typos in comments. 2017-02-06 11:33:58 +02:00
analyze.c Improve comments in vacuum_rel() and analyze_rel(). 2017-10-05 10:47:47 -04:00
async.c Fix low-probability loss of NOTIFY messages due to XID wraparound. 2017-10-11 14:28:33 -04:00
cluster.c Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n). 2017-08-20 11:19:07 -07:00
collationcmds.c Don't install ICU collation keyword variants 2017-08-21 19:21:07 -04:00
comment.c Allow COMMENT ON COLUMN with partitioned tables 2017-04-18 10:42:10 +01:00
constraint.c Allow index AMs to cache data across aminsert calls within a SQL command. 2017-02-09 11:52:12 -05:00
conversioncmds.c Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
copy.c Replace most usages of ntoh[ls] and hton[sl] with pg_bswap.h. 2017-10-01 15:36:14 -07:00
createas.c Allow DML commands that create tables to use parallel query. 2017-10-05 11:40:48 -04:00
dbcommands.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
define.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
discard.c Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
dropcmds.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
event_trigger.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
explain.c Allow DML commands that create tables to use parallel query. 2017-10-05 11:40:48 -04:00
extension.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
foreigncmds.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
functioncmds.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
indexcmds.c Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n). 2017-08-20 11:19:07 -07:00
lockcmds.c Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
Makefile Implement multivariate n-distinct coefficients 2017-03-24 14:06:10 -03:00
matview.c Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n). 2017-08-20 11:19:07 -07:00
opclasscmds.c Introduce 64-bit hash functions with a 64-bit seed. 2017-08-31 22:21:21 -04:00
operatorcmds.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
policy.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
portalcmds.c Reduce excessive dereferencing of function pointers 2017-09-07 13:56:09 -04:00
prepare.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
proclang.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
publicationcmds.c Message style fixes 2017-09-11 11:21:27 -04:00
schemacmds.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
seclabel.c Reduce excessive dereferencing of function pointers 2017-09-07 13:56:09 -04:00
sequence.c For wal_consistency_checking, mask page checksum as well as page LSN. 2017-09-22 14:28:22 -04:00
statscmds.c Message style fixes 2017-09-11 11:21:27 -04:00
subscriptioncmds.c ... and the very same bug in publicationListToArray(). 2017-09-23 15:16:48 -04:00
tablecmds.c On attach, consider skipping validation of subpartitions individually. 2017-10-05 13:06:46 -04:00
tablespace.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
trigger.c Fix possible dangling pointer dereference in trigger.c. 2017-09-17 14:50:01 -04:00
tsearchcmds.c Phase 3 of pgindent updates. 2017-06-21 15:35:54 -04:00
typecmds.c Support arrays over domains. 2017-09-30 13:40:56 -04:00
user.c Don't allow logging in with empty password. 2017-08-07 17:03:42 +03:00
vacuum.c Improve comments in vacuum_rel() and analyze_rel(). 2017-10-05 10:47:47 -04:00
vacuumlazy.c Fix freezing of a dead HOT-updated tuple 2017-09-28 16:44:01 +02:00
variable.c Remove uses of "slave" in replication contexts 2017-08-10 22:55:41 -04:00
view.c Change tupledesc->attrs[n] to TupleDescAttr(tupledesc, n). 2017-08-20 11:19:07 -07:00