Windows), arrange for each postmaster child process to be its own process
group leader, and deliver signals SIGINT, SIGTERM, SIGQUIT to the whole
process group not only the direct child process. This provides saner behavior
for archive and recovery scripts; in particular, it's possible to shut down a
warm-standby recovery server using "pg_ctl stop -m immediate", since delivery
of SIGQUIT to the startup subprocess will result in killing the waiting
recovery_command. Also, this makes Query Cancel and statement_timeout apply
to scripts being run from backends via system(). (There is no support in the
core backend for that, but it's widely done using untrusted PLs.) Per gripe
from Stephen Harris and subsequent discussion.
any no-longer-needed segments; just truncate them to zero bytes and leave
the files in place for possible future re-use. This avoids problems when
the segments are re-used due to relation growth shortly after truncation.
Before, the bgwriter, and possibly other backends, could still be holding
open file references to the old segment files, and would write dirty blocks
into those files where they'd disappear from the view of other processes.
Back-patch as far as 8.0. I believe the 7.x branches are not vulnerable,
because they had no bgwriter, and "blind" writes by other backends would
always be done via freshly-opened file references.
in PITR scenarios. We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases. Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId. Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done. Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database. initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs. Heikki Linnakangas, Simon Riggs, and Tom Lane.
modules; the first try was not usable in EXEC_BACKEND builds (e.g.,
Windows). Instead, just provide some entry points to increase the
allocation requests during postmaster start, and provide a dedicated
LWLock that can be used to synchronize allocation operations performed
by backends. Per discussion with Marc Munro.
to performance. (A wholesale effort to get rid of strncpy should be
undertaken sometime, but not during beta.) This commit also fixes dynahash.c
to correctly truncate overlength string keys for hashtables, so that its
callers don't have to anymore.
wrong answer, as has been seen to occur with a buggy Linux kernel. Not
really our bug, but it's a simple test in a seldom-used control path,
so might as well have a defense.
even when a single relation requires more than max_fsm_pages pages. Also,
make VACUUM emit a warning in this case, since it likely means that VACUUM
FULL or other drastic corrective measure is needed. Per reports from Jeff
Frost and others of unexpected changes in the claimed max_fsm_pages need.
contrib functionality. Along the way, remove the USER_LOCKS configuration
symbol, since it no longer makes any sense to try to compile that out.
No user documentation yet ... mmoncure has promised to write some.
Thanks to Abhijit Menon-Sen for creating a first draft to work from.
after an error during VACUUM. We have a PG_TRY block anyway around the only
call sites, so just reset it in the CATCH clause instead of having
AtEOXact_Buffers blindly do it during xact end. I think the old code was
actively wrong for the case of a failure during ANALYZE inside a
subtransaction --- the flag wouldn't get cleared until main transaction end.
Probably not worth back-patching though.
PGPROC array into snapshots, and use this information to avoid visits
to pg_subtrans in HeapTupleSatisfiesSnapshot. This appears to solve
the pg_subtrans-related context swap storm problem that's been reported
by several people for 8.1. While at it, modify GetSnapshotData to not
take an exclusive lock on ProcArrayLock, as closer analysis shows that
shared lock is always sufficient.
Itagaki Takahiro and Tom Lane
locks that would conflict with a specified lock request, without
actually trying to get that lock. Use this instead of the former ad hoc
method of doing the first wait step in CREATE INDEX CONCURRENTLY.
Fixes problem with undetected deadlock and in many cases will allow the
index creation to proceed sooner than it otherwise could've. Per
discussion with Greg Stark.
that ps_status provides by appending 'waiting' to the PS display. This
completes the project of making it feasible to turn off process title
updates and instead rely on pg_stat_activity. Per my suggestion a few
weeks ago.
the rel, it's easy to get rid of the narrow race-condition window that
used to exist in VACUUM and CLUSTER. Did some minor code-beautification
work in the same area, too.
(table or index) before trying to open its relcache entry. This fixes
race conditions in which someone else commits a change to the relation's
catalog entries while we are in process of doing relcache load. Problems
of that ilk have been reported sporadically for years, but it was not
really practical to fix until recently --- for instance, the recent
addition of WAL-log support for in-place updates helped.
Along the way, remove pg_am.amconcurrent: all AMs are now expected to support
concurrent update.
vacuums. This allows a OLTP-like system with big tables to continue
regular vacuuming on small-but-frequently-updated tables while the
big tables are being vacuumed.
Original patch from Hannu Krossing, rewritten by Tom Lane and updated
by me.
BufferAlloc tries to insert a new mapping entry before deleting the old one
for a buffer, we have a transient need for more than NBuffers entries ---
one more in 8.1, and as many as NUM_BUFFER_PARTITIONS more in CVS HEAD.
In theory this could lead to an "out of shared memory" failure if shmem
had already been completely claimed by the time the extra entries were
needed.
to the low-order bits of the entry hash value. Also make some incidental
cleanups in the dynahash API, such as not exporting the hash header
structs to the world.
changing semantics too much. statement_timestamp is now set immediately
upon receipt of a client command message, and the various places that used
to do their own gettimeofday() calls to mark command startup are referenced
to that instead. I have also made stats_command_string use that same
value for pg_stat_activity.query_start for both the command itself and
its eventual replacement by <IDLE> or <idle in transaction>. There was
some debate about that, but no argument that seemed convincing enough to
justify an extra gettimeofday() call.
current commands; instead, store current-status information in shared
memory. This substantially reduces the overhead of stats_command_string
and also ensures that pg_stat_activity is fully up to date at all times.
Per my recent proposal.
always has been, because it's not got any .globl declaration! We've
been relying on the solaris_sparc.s code instead. Rip it out.
(Not back-patched, since this is just cosmetic cleanup.)
into a single mostly-physical-order scan of the index. This requires some
ticklish interlocking considerations, but should create no material
performance impact on normal index operations (at least given the
already-committed changes to make scans work a page at a time). VACUUM
itself should get significantly faster in any index that's degenerated to a
very nonlinear page order. Also, we save one pass over the index entirely,
except in the case where there were no deletions to do and so only one pass
happened anyway.
Original patch by Heikki Linnakangas, rework by Tom Lane.
The former approach used ExclusiveLock on pg_database, which being a
cluster-wide lock meant only one of these operations could proceed at
a time; worse, it also blocked all incoming connections in ReverifyMyDatabase.
Now that we have LockSharedObject(), we can use locks of different types
applied to databases considered as objects. This allows much more
flexible management of the interlocking: two CREATE DATABASEs need not
block each other, and need not block connections except to the template
database being used. Similarly DROP DATABASE doesn't block unrelated
operations. The locking used in flatfiles.c is also much narrower in
scope than before. Per recent proposal.
set to the large object context ("fscxt"), as this is inevitably a source of
transaction-duration memory leaks. Not sure why we'd not noticed it before;
maybe people weren't touching a whole lot of LOs in the same transaction
before the 8.1 pg_dump changes. Per report from Wayne Conrad.
Backpatched as far as 8.1, but the problem doubtless goes all the way back.
I'm disinclined to spend the time to try to verify that the older branches
would still work if patched, seeing that this code was significantly modified
for 8.0 and again for 8.1, and that we don't have any trouble reports before
8.1. (Maybe the leaks were smaller before?)