Commit Graph

184 Commits

Author SHA1 Message Date
Tom Lane c9d8edc906 Repair bufmgr deadlock problem reported by Michael Wildpaner. Must take
share lock on a buffer being written out before releasing BufMgrLock in
the BufferAlloc code path; if we do it later we might block on someone
who's re-pinned the buffer.  I believe this is only an issue for BufferAlloc
and not the other places that call FlushBuffer.  BufferSync must continue
to do it the old way since it may well be trying to write buffers that
other backends have pinned; but it should not be holding any conflicting
locks.  FlushRelationBuffers is okay since it's got exclusive lock at the
relation level.
2005-01-03 18:49:41 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Neil Conway 4acc97d7e4 Assert that BufferIsPinned() in IncrBufferRefCount(), rather than using
a home-brewed combination of assertions that boiled down to the same
thing.
2004-11-24 02:56:17 +00:00
Tom Lane 4347cc2392 Allow background writing to be shut down by setting limit values to zero.
This does not disable the bgwriter process: it still has to wake up often
enough to collect fsync requests from backends in a timely fashion.  But
it responds to the recent gripe about not being able to prevent the disk
from being spun up constantly.
2004-10-17 22:01:51 +00:00
Tom Lane fdd13f1568 Give the ResourceOwner mechanism full responsibility for releasing buffer
pins at end of transaction, and reduce AtEOXact_Buffers to an Assert
cross-check that this was done correctly.  When not USE_ASSERT_CHECKING,
AtEOXact_Buffers is a complete no-op.  This gets rid of an O(NBuffers)
bottleneck during transaction commit/abort, which recent testing has shown
becomes significant above a few tens of thousands of shared buffers.
2004-10-16 18:57:26 +00:00
Tom Lane 1c2de47746 Remove BufferLocks[] array in favor of a single pointer to the buffer
(if any) currently waited for by LockBufferForCleanup(), which is all
that we were using it for anymore.  Saves some space and eliminates
proportional-to-NBuffers slowdown in UnlockBuffers().
2004-10-16 18:05:07 +00:00
Tom Lane 9ffc8ed58b Repair possible failure to update hint bits back to disk, per
http://archives.postgresql.org/pgsql-hackers/2004-10/msg00464.php.
This fix is intended to be permanent: it moves the responsibility for
calling SetBufferCommitInfoNeedsSave() into the tqual.c routines,
eliminating the requirement for callers to test whether t_infomask changed.
Also, tighten validity checking on buffer IDs in bufmgr.c --- several
routines were paranoid about out-of-range shared buffer numbers but not
about out-of-range local ones, which seems a tad pointless.
2004-10-15 22:40:29 +00:00
Tom Lane eb917c1a21 I can't see any good reason for DropRelFileNodeBuffers to be issuing
FATAL when it detects a nonzero reference count.  Reduce to ERROR.
2004-09-06 17:31:32 +00:00
Tom Lane a421b4e850 FlushRelationBuffers was also being a bit cavalier about whether the
relation is already opened by smgr.
2004-08-31 16:13:06 +00:00
Bruce Momjian b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane fe548629c5 Invent ResourceOwner mechanism as per my recent proposal, and use it to
keep track of portal-related resources separately from transaction-related
resources.  This allows cursors to work in a somewhat sane fashion with
nested transactions.  For now, cursor behavior is non-subtransactional,
that is a cursor's state does not roll back if you abort a subtransaction
that fetched from the cursor.  We might want to change that later.
2004-07-17 03:32:14 +00:00
Tom Lane 573a71a5da Nested transactions. There is still much left to do, especially on the
performance front, but with feature freeze upon us I think it's time to
drive a stake in the ground and say that this will be in 7.5.

Alvaro Herrera, with some help from Tom Lane.
2004-07-01 00:52:04 +00:00
Tom Lane 2467394ee1 Tablespaces. Alternate database locations are dead, long live tablespaces.
There are various things left to do: contrib dbsize and oid2name modules
need work, and so does the documentation.  Also someone should think about
COMMENT ON TABLESPACE and maybe RENAME TABLESPACE.  Also initlocation is
dead, it just doesn't know it yet.

Gavin Sherry and Tom Lane.
2004-06-18 06:14:31 +00:00
Tom Lane e6cba71503 Add some code to Assert that when we release pin on a buffer, we are
not holding the buffer's cntx_lock or io_in_progress_lock.  A recent
report from Litao Wu makes me wonder whether it is ever possible for
us to drop a buffer and forget to release its cntx_lock.  The Assert
does not fire in the regression tests, but that proves little ...
2004-06-11 16:43:24 +00:00
Tom Lane 91d20ff7aa Additional mop-up for sync-to-fsync changes: avoid issuing fsyncs for
temp tables, and avoid WAL-logging truncations of temp tables.  Do issue
fsync on truncated files (not sure this is necessary but it seems like
a good idea).
2004-05-31 20:31:33 +00:00
Tom Lane e674707968 Minor code rationalization: FlushRelationBuffers just returns void,
rather than an error code, and does elog(ERROR) not elog(WARNING)
when it detects a problem.  All callers were simply elog(ERROR)'ing on
failure return anyway, and I find it hard to envision a caller that would
not, so we may as well simplify the callers and produce the more useful
error message directly.
2004-05-31 19:24:05 +00:00
Tom Lane 9b178555fc Per previous discussions, get rid of use of sync(2) in favor of
explicitly fsync'ing every (non-temp) file we have written since the
last checkpoint.  In the vast majority of cases, the burden of the
fsyncs should fall on the bgwriter process not on backends.  (To this
end, we assume that an fsync issued by the bgwriter will force out
blocks written to the same file by other processes using other file
descriptors.  Anyone have a problem with that?)  This makes the world
safe for WIN32, which ain't even got sync(2), and really makes the world
safe for Unixen as well, because sync(2) never had the semantics we need:
it offers no way to wait for the requested I/O to finish.

Along the way, fix a bug I recently introduced in xlog recovery:
file truncation replay failed to clear bufmgr buffers for the dropped
blocks, which could result in 'PANIC:  heap_delete_redo: no block'
later on in xlog replay.
2004-05-31 03:48:10 +00:00
Tom Lane 076a055acf Separate out bgwriter code into a logically separate module, rather
than being random pieces of other files.  Give bgwriter responsibility
for all checkpoint activity (other than a post-recovery checkpoint);
so this child process absorbs the functionality of the former transient
checkpoint and shutdown subprocesses.  While at it, create an actual
include file for postmaster.c, which for some reason never had its own
file before.
2004-05-29 22:48:23 +00:00
Tom Lane 4af3421161 Get rid of rd_nblocks field in relcache entries. Turns out this was
costing us lots more to maintain than it was worth.  On shared tables
it was of exactly zero benefit because we couldn't trust it to be
up to date.  On temp tables it sometimes saved an lseek, but not often
enough to be worth getting excited about.  And the real problem was that
we forced an lseek on every relcache flush in order to update the field.
So all in all it seems best to lose the complexity.
2004-05-08 19:09:25 +00:00
Neil Conway 0370951347 Tiny assorted fixes: correct a typo in a comment in vacuumlazy.c, remove
some unused #include directives from bufmgr.c, and clarify comments in
bufmgr.h and buf.h
2004-04-25 23:50:58 +00:00
Neil Conway 139abc2896 Make LocalRefCount and PrivateRefCount arrays of int32, rather than long.
This saves a small amount of per-backend memory for LP64 machines.
2004-04-22 07:21:55 +00:00
Tom Lane 95a03e9cdf Another round of code cleanup on bufmgr. Use BM_VALID flag to keep track
of whether we have successfully read data into a buffer; this makes the
error behavior a bit more transparent (IMHO anyway), and also makes it
work correctly for local buffers which don't use Start/TerminateBufferIO.
Collapse three separate functions for writing a shared buffer into one.
This overlaps a bit with cleanups that Neil proposed awhile back, but
seems not to have committed yet.
2004-04-21 18:06:30 +00:00
Tom Lane 011c3e62e7 Code review for ARC patch. Eliminate static variables, improve handling
of VACUUM cases so that VACUUM requests don't affect the ARC state at all,
avoid corner case where BufferSync would uselessly rewrite a buffer that
no longer contains the page that was to be flushed.  Make some minor
other cleanups in and around the bufmgr as well, such as moving PinBuffer
and UnpinBuffer into bufmgr.c where they really belong.
2004-04-19 23:27:17 +00:00
Tom Lane da99cce7cd Avoid delaying postmaster shutdown by up to 10 seconds on platforms
where signals do not terminate sleep() delays.
2004-02-12 20:07:26 +00:00
Jan Wieck fc65a3e1fd Fixed bug where FlushRelationBuffers() did call StrategyInvalidateBuffer()
for already empty buffers because their buffer tag was not cleard out
when the buffers have been invalidated before.

Also removed the misnamed BM_FREE bufhdr flag and replaced the checks,
which effectively ask if the buffer is unpinned, with checks against the
refcount field.

Jan
2004-02-12 15:06:56 +00:00
Tom Lane 58f337a343 Centralize implementation of delay code by creating a pg_usleep()
subroutine in src/port/pgsleep.c.  Remove platform dependencies from
miscadmin.h and put them in port.h where they belong.  Extend recent
vacuum cost-based-delay patch to apply to VACUUM FULL, ANALYZE, and
non-btree index vacuuming.

By the way, where is the documentation for the cost-based-delay patch?
2004-02-10 03:42:45 +00:00
Tom Lane 87bd956385 Restructure smgr API as per recent proposal. smgr no longer depends on
the relcache, and so the notion of 'blind write' is gone.  This should
improve efficiency in bgwriter and background checkpoint processes.
Internal restructuring in md.c to remove the not-very-useful array of
MdfdVec objects --- might as well just use pointers.
Also remove the long-dead 'persistent main memory' storage manager (mm.c),
since it seems quite unlikely to ever get resurrected.
2004-02-10 01:55:27 +00:00
Jan Wieck f425b605f4 Cost based vacuum delay feature.
Jan
2004-02-06 19:36:18 +00:00
Jan Wieck 8d09e25693 Backing out the background writer sync() option.
Jan
2004-02-04 01:24:53 +00:00
Bruce Momjian 5ee2ae2049 Remove sleep() and use single PG_SLEEP call for Win32 signal handling
and consistency.

Change PG_USLEEP to use SleepEx() for signal interuptability.
2004-01-30 15:57:04 +00:00
Jan Wieck d77b63b17c Added GUC variable bgwriter_flush_method controlling the action
done by the background writer between writing dirty blocks and
napping.

    none (default)   no action
	sync             bgwriter calls smgrsync() causing a sync(2)

A global sync() is only good on dedicated database servers, so
more flush methods should be added in the future.

Jan
2004-01-24 20:00:46 +00:00
Bruce Momjian 38081fd000 Change PG_DELAY from msec to usec and use it consistenly rather than
select().   Add Win32 Sleep() for delay.
2004-01-09 21:08:50 +00:00
Neil Conway 192ad63bd7 More janitorial work: remove the explicit casting of NULL literals to a
pointer type when it is not necessary to do so.

For future reference, casting NULL to a pointer type is only necessary
when (a) invoking a function AND either (b) the function has no prototype
OR (c) the function is a varargs function.
2004-01-07 18:56:30 +00:00
Tom Lane 16cc9dff4f bufmgr.c failed to compile on Darwin, because it didn't include
<sys/time.h> where struct timeval is defined.
2003-12-20 22:18:02 +00:00
Neil Conway fef0c8345a I posted some bufmgr cleanup a few weeks ago, but it conflicted with
some concurrent changes Jan was making to the bufmgr. Here's an
updated version of the patch -- it should apply cleanly to CVS
HEAD and passes the regression tests.

This patch makes the following changes:

     - remove the UnlockAndReleaseBuffer() and UnlockAndWriteBuffer()
       macros, and replace uses of them with calls to the appropriate
       functions.

     - remove a bunch of #ifdef BMTRACE code: it is ugly & broken
       (i.e. it doesn't compile)

     - make BufferReplace() return a bool, not an int

     - cleanup some logic in bufmgr.c; should be functionality
       equivalent to the previous code, just cleaner now

     - remove the BM_PRIVATE flag as it is unused

     - improve a few comments, etc.
2003-12-14 00:34:47 +00:00
Tom Lane 0902ece5b9 Force zero_damaged_pages to be effectively ON during recovery from WAL,
since there is no need to worry about damaged pages when we are going to
overwrite them anyway from the WAL.  Per recent discussion.
2003-12-01 16:53:19 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Tom Lane 0a97cb37fc Remove unused variable. 2003-11-21 17:41:31 +00:00
Jan Wieck cfeca62148 Background writer process
This first part of the background writer does no syncing at all.
It's only purpose is to keep the LRU heads clean so that regular
backends seldom to never have to call write().

Jan
2003-11-19 15:55:08 +00:00
Jan Wieck 6b86d62b00 2nd try for the ARC strategy.
I added a couple more Assertions while tracking down the exact
cause of the former bug.

All 93 regression tests pass now.

Jan
2003-11-13 14:57:15 +00:00
Jan Wieck 923e994d79 ARC strategy backed out ... sorry
Jan
2003-11-13 05:34:58 +00:00
Jan Wieck 48adc0b34b Replacement of the buffer replacement strategy with an ARC
algorithm adopted for PostgreSQL.

Jan
2003-11-13 00:40:02 +00:00
Peter Eisentraut feb4f44d29 Message editing: remove gratuitous variations in message wording, standardize
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
2003-09-25 06:58:07 +00:00
Tom Lane ffafacc1f6 Repair potential deadlock created by recent changes to recycle btree
index pages: when _bt_getbuf asks the FSM for a free index page, it is
possible (and, in some cases, even moderately likely) that the answer
will be the same page that _bt_split is trying to split.  _bt_getbuf
already knew that the returned page might not be free, but it wasn't
prepared for the possibility that even trying to lock the page could
be problematic.  Fix by doing a conditional rather than unconditional
grab of the page lock.
2003-08-10 19:48:08 +00:00
Bruce Momjian f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane cfa191f3b8 Error message editing in backend/storage. 2003-07-24 22:04:15 +00:00
Tom Lane a4e775a263 Make use of new error context stack mechanism to allow random errors
detected during buffer dump to be labeled with the buffer location.
For example, if a page LSN is clobbered, we now produce something like
ERROR:  XLogFlush: request 2C000000/8468EC8 is not satisfied --- flushed only
to 0/8468EF0
CONTEXT:  writing block 0 of relation 428946/566240
whereas before there was no convenient way to find out which page had
been trashed.
2003-05-10 19:04:30 +00:00
Tom Lane fd42262836 Add code to apply some simple sanity checks to the header fields of a
page when it's read in, per pghackers discussion around 17-Feb.  Add a
GUC variable zero_damaged_pages that causes the response to be a WARNING
followed by zeroing the page, rather than the normal ERROR; this is per
Hiroshi's suggestion that there needs to be a way to get at the data
in the rest of the table.
2003-03-28 20:17:13 +00:00