From c6a389792e08980e395055ac219303cf4357ba33 Mon Sep 17 00:00:00 2001
From: Robert Haas
Date: Fri, 20 Jan 2017 15:55:45 -0500
Subject: [PATCH] Avoid useless respawning of the autovacuum launcher at high speed.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When (1) autovacuum = off and (2) there's at least one database with
an XID age greater than autovacuum_freeze_max_age and (3) all tables
in that database that need vacuuming are already being processed by a
worker and (4) the autovacuum launcher is started, a kind of infinite
loop occurs. The launcher starts a worker and immediately exits. The
worker, finding no work to do, immediately starts the launcher,
supposedly so that the next database can be processed. But because
datfrozenxid for that database hasn't been advanced yet, the new
worker gets put right back into the same database as the old one,
where it once again starts the launcher and exits. High-speed ping
pong ensues.

There are several possible ways to break the cycle; this seems like
the safest one.

Amit Khandekar (code) and Robert Haas (comments), reviewed by Álvaro
Herrera.

Discussion: http://postgr.es/m/CAJ3gD9eWejf72HKquKSzax0r+epS=nAbQKNnykkMA0E8c+rMDg@mail.gmail.com
---
 src/backend/postmaster/autovacuum.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 251b9fe372..51821d3f55 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1898,6 +1898,8 @@ do_autovacuum(void)
 	ScanKeyData key;
 	TupleDesc	pg_class_desc;
 	int			effective_multixact_freeze_max_age;
+	bool		did_vacuum = false;
+	bool		found_concurrent_worker = false;
 
 	/*
 	 * StartTransactionCommand and CommitTransactionCommand will automatically
@@ -2307,6 +2309,7 @@ do_autovacuum(void)
 				if (worker->wi_tableoid == relid)
 				{
 					skipit = true;
+					found_concurrent_worker = true;
 					break;
 				}
 			}
@@ -2433,6 +2436,8 @@ do_autovacuum(void)
 		}
 		PG_END_TRY();
 
+		did_vacuum = true;
+
 		/* the PGXACT flags are reset at the next end of transaction */
 
 		/* be tidy */
@@ -2470,8 +2475,25 @@ deleted:
 	/*
 	 * Update pg_database.datfrozenxid, and truncate pg_clog if possible. We
 	 * only need to do this once, not after each table.
+	 *
+	 * Even if we didn't vacuum anything, it may still be important to do
+	 * this, because one indirect effect of vac_update_datfrozenxid() is to
+	 * update ShmemVariableCache->xidVacLimit. That might need to be done
+	 * even if we haven't vacuumed anything, because relations with older
+	 * relfrozenxid values or other databases with older datfrozenxid values
+	 * might have been dropped, allowing xidVacLimit to advance.
+	 *
+	 * However, it's also important not to do this blindly in all cases,
+	 * because when autovacuum=off this will restart the autovacuum launcher.
+	 * If we're not careful, an infinite loop can result, where workers find
+	 * no work to do and restart the launcher, which starts another worker in
+	 * the same database that finds no work to do. To prevent that, we skip
+	 * this if (1) we found no work to do and (2) we skipped at least one
+	 * table due to concurrent autovacuum activity. In that case, the other
+	 * worker has already done it, or will do so when it finishes.
 	 */
-	vac_update_datfrozenxid();
+	if (did_vacuum || !found_concurrent_worker)
+		vac_update_datfrozenxid();
 
 	/* Finally close out the last transaction. */
 	CommitTransactionCommand();
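
The guard added in the last hunk can be read as a small truth table over the two new flags. The following is a minimal standalone sketch of that decision, not PostgreSQL code: should_update_datfrozenxid() and count_launcher_restarts() are hypothetical names, and the model assumes that with autovacuum=off every call to vac_update_datfrozenxid() leads to one launcher restart, which is a simplification of the behavior described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the condition added by the patch:
 * skip the update only when this worker vacuumed nothing AND skipped
 * at least one table because another worker was already on it.
 */
static bool
should_update_datfrozenxid(bool did_vacuum, bool found_concurrent_worker)
{
	return did_vacuum || !found_concurrent_worker;
}

/*
 * Toy model (an assumption, not the real launcher logic): run `passes`
 * consecutive worker passes that find no work because a concurrent
 * worker holds every table, and count how many times the launcher
 * would be restarted.  With guard_enabled = false the update is
 * unconditional, as it was before the patch.
 */
static int
count_launcher_restarts(int passes, bool guard_enabled)
{
	int			restarts = 0;

	assert(passes >= 0);

	for (int i = 0; i < passes; i++)
	{
		bool		did_vacuum = false;
		bool		found_concurrent_worker = true;
		bool		update;

		update = guard_enabled
			? should_update_datfrozenxid(did_vacuum, found_concurrent_worker)
			: true;

		if (update)
			restarts++;		/* models the restart triggered by the update */
	}
	return restarts;
}
```

In this model, idle passes restart the launcher every time when the guard is disabled and never when it is enabled, while a worker that did vacuum something, or that found no concurrent worker, still performs the update.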