Commit Graph

534 Commits

Author SHA1 Message Date
Tom Lane a8d1075f27 Add a time-of-preparation column to the pg_prepared_xacts view, per an
old suggestion by Oliver Jowett.  Also, add a transaction column to the
pg_locks view to show the xid of each transaction holding or awaiting
locks; this allows prepared transactions to be properly associated with
the locks they own.  There was already a column named 'transaction',
and I chose to rename it to 'transactionid' --- since this column is
new in the current devel cycle there should be no backwards compatibility
issue to worry about.
2005-06-18 19:33:42 +00:00
Tom Lane 66b098492e Dept. of second thoughts: regular COMMIT deletes deletable files before
releasing locks, so COMMIT PREPARED should too.
2005-06-18 05:21:09 +00:00
Tom Lane d0a89683a3 Two-phase commit. Original patch by Heikki Linnakangas, with additional
hacking by Alvaro Herrera and Tom Lane.
2005-06-17 22:32:51 +00:00
Bruce Momjian f4d907ca85 Remove old *.backup files when we do pg_stop_backup(). This
prevents a large number of *.backup files from existing in pg_xlog/
2005-06-15 01:36:08 +00:00
Teodor Sigaev 37c839365c WAL for GiST. It work for online backup and so on, but on
recovery after crash (power loss etc) it may say that it can't restore
index and index should be reindexed.

Some refactoring code.
2005-06-14 11:45:14 +00:00
Bruce Momjian 51746c4549 Free buffer allocated via malloc (process is short-lived, but fix it anyway). 2005-06-09 22:36:27 +00:00
Tom Lane f5b2f60bd1 Change WAL-logging scheme for multixacts to be more like regular
transaction IDs, rather than like subtrans; in particular, the information
now survives a database restart.  Per previous discussion, this is
essential for PITR log shipping and for 2PC.
2005-06-08 15:50:28 +00:00
Tom Lane ee7ac7b11e Modify XLogInsert API to make callers specify whether pages to be backed
up have the standard layout with unused space between pd_lower and pd_upper.
When this is set, XLogInsert will omit the unused space without bothering
to scan it to see if it's zero.  That saves time in XLogInsert, and also
allows reversion of my earlier patch to make PageRepairFragmentation et al
explicitly re-zero freed space.  Per suggestion by Heikki Linnakangas.
2005-06-06 20:22:58 +00:00
Tom Lane 4c8495a1f2 Remove the mostly-stubbed-out-anyway support routines for WAL UNDO.
That code is never going to be used in the foreseeable future, and
where it's more than a stub it's making the redo routines harder to
read.
2005-06-06 17:01:25 +00:00
Tom Lane 21fda22ec4 Change CRCs in WAL records from 64bit to 32bit for performance reasons.
Instead of a separate CRC on each backup block, include backup blocks
in their parent WAL record's CRC; this is important to ensure that the
backup block really goes with the WAL record, ie there was not a page
tear right at the start of the backup block.  Implement a simple form
of compression of backup blocks: drop any run of zeroes starting at
pd_lower, so as not to store the unused 'hole' that commonly exists in
PG heap and index pages.  Tweak PageRepairFragmentation and related
routines to ensure they keep the unused space zeroed, so that the above
compression method remains effective.  All per recent discussions.
2005-06-02 05:55:29 +00:00
Tom Lane a91fa39028 Add test to WAL replay to verify that xl_prev points back to the previous
WAL record; this is necessary to be sure we recognize stale WAL records
when a WAL page was only partially written during a system crash.
2005-05-31 19:10:28 +00:00
Tom Lane e92a88272e Modify hash_search() API to prevent future occurrences of the error
spotted by Qingqing Zhou.  The HASH_ENTER action now automatically
fails with elog(ERROR) on out-of-memory --- which incidentally lets
us eliminate duplicate error checks in quite a bunch of places.  If
you really need the old return-NULL-on-out-of-memory behavior, you
can ask for HASH_ENTER_NULL.  But there is now an Assert in that path
checking that you aren't hoping to get that behavior in a palloc-based
hash table.
Along the way, remove the old HASH_FIND_SAVE/HASH_REMOVE_SAVED actions,
which were not being used anywhere anymore, and were surely too ugly
and unsafe to want to see revived again.
2005-05-29 04:23:07 +00:00
Bruce Momjian 6dc7760ac3 Add support for wal_fsync_writethrough for Darwin, and restructure the
code to better handle writethrough.

Chris Campbell
2005-05-20 14:53:26 +00:00
Tom Lane ff0c143a3b Make a comment pgindent-proof, per suggestion from Alvaro. 2005-05-19 23:58:51 +00:00
Tom Lane ee3b71f6bc Split the shared-memory array of PGPROC pointers out of the sinval
communication structure, and make it its own module with its own lock.
This should reduce contention at least a little, and it definitely makes
the code seem cleaner.  Per my recent proposal.
2005-05-19 21:35:48 +00:00
Neil Conway c891e05f26 Cleanup GiST header files. Since GiST extensions are often written as
external projects, we should be careful about what parts of the GiST
API are considered implementation details, and which are part of the
public API. Therefore, I've moved internal-only declarations into
gist_private.h -- future backward-incompatible changes to gist.h should
be made with care, to avoid needlessly breaking external GiST extensions.

Also did some related header cleanup: remove some unnecessary #includes
from gist.h, and remove some unused definitions: isAttByVal(), _gistdump(),
and GISTNStrategies.
2005-05-17 03:34:18 +00:00
Bruce Momjian 35e1651508 Back out check for unreferenced files.
Heikki Linnakangas
2005-05-10 22:27:30 +00:00
Tom Lane 4cd4ed0cc2 Fix case in which a debug printout would print already-pfreed data. 2005-05-07 18:14:25 +00:00
Tom Lane 126eaef651 Clean up MultiXactIdExpand's API by separating out the case where we
are creating a new MultiXactId from two regular XIDs.  The original
coding was unnecessarily complicated and didn't save any code anyway.
2005-05-03 19:42:41 +00:00
Bruce Momjian 76668e6eb4 Check the file system on postmaster startup and report any unreferenced
files in the server log.

Heikki Linnakangas
2005-05-02 18:26:54 +00:00
Tom Lane bedb78d386 Implement sharable row-level locks, and use them for foreign key references
to eliminate unnecessary deadlocks.  This commit adds SELECT ... FOR SHARE
paralleling SELECT ... FOR UPDATE.  The implementation uses a new SLRU
data structure (managed much like pg_subtrans) to represent multiple-
transaction-ID sets.  When more than one transaction is holding a shared
lock on a particular row, we create a MultiXactId representing that set
of transactions and store its ID in the row's XMAX.  This scheme allows
an effectively unlimited number of row locks, just as we did before,
while not costing any extra overhead except when a shared lock actually
has to be shared.   Still TODO: use the regular lock manager to control
the grant order when multiple backends are waiting for a row lock.

Alvaro Herrera and Tom Lane.
2005-04-28 21:47:18 +00:00
Tom Lane 19d127548c Add comment about checkpoint panic behavior during shutdown, per
suggestion from Qingqing Zhou.
2005-04-23 18:49:54 +00:00
Bruce Momjian 1a6ad669fb Fix comment typo. 2005-04-17 03:04:29 +00:00
Tom Lane 5f0a974ea9 Reduce PANIC to ERROR in several xlog routines that are used in both
critical and noncritical contexts (an example of noncritical being
post-checkpoint removal of dead xlog segments).  In the critical cases
the CRIT_SECTION mechanism will cause ERROR to be promoted to PANIC
anyway, and in the noncritical cases we shouldn't let an error take
down the entire database.  Arguably there should be *no* explicit
PANIC errors in this module, only more START/END_CRIT_SECTION calls,
but I didn't go that far.  (Yet.)
2005-04-15 22:19:48 +00:00
Tom Lane 61b861421b Modify MoveOfflineLogs/InstallXLogFileSegment to avoid O(N^2) behavior
when recycling a large number of xlog segments during checkpoint.
The former behavior searched from the same start point each time,
requiring O(checkpoint_segments^2) stat() calls to relocate all the
segments.  Instead keep track of where we stopped last time through.
2005-04-15 18:48:10 +00:00
Tom Lane 2193a856a2 Simplify initdb-time assignment of OIDs as I proposed yesterday, and
avoid encroaching on the 'user' range of OIDs by allowing automatic
OID assignment to use values below 16k until we reach normal operation.

initdb not forced since this doesn't make any incompatible change;
however a lot of stuff will have different OIDs after your next initdb.
2005-04-13 18:54:57 +00:00
Tom Lane c3294f1cbf Fix interaction between materializing holdable cursors and firing
deferred triggers: either one can create more work for the other,
so we have to loop till it's all gone.  Per example from andrew@supernews.
Add a regression test to help spot trouble in this area in future.
2005-04-11 19:51:16 +00:00
Tom Lane 8c85a34a3b Officially decouple FUNC_MAX_ARGS from INDEX_MAX_KEYS, and set the
former to 100 by default.  Clean up some of the less necessary
dependencies on FUNC_MAX_ARGS; however, the biggie (FunctionCallInfoData)
remains.
2005-03-29 03:01:32 +00:00
Tom Lane 119191609c Remove dead push/pop rollback code. Vadim once planned to implement
transaction rollback via UNDO but I think that's highly unlikely to
happen, so we may as well remove the stubs.  (Someday we ought to
rip out the stub xxx_undo routines, too.)  Per Alvaro.
2005-03-28 01:50:34 +00:00
Bruce Momjian b1f57d88f5 Change Win32 O_SYNC method to O_DSYNC because that is what the method
currently does.  This is now the default Win32 wal sync method because
we perfer o_datasync to fsync.

Also, change Win32 fsync to a new wal sync method called
fsync_writethrough because that is the behavior of _commit, which is
what is used for fsync on Win32.

Backpatch to 8.0.X.
2005-03-24 04:36:20 +00:00
Tom Lane 4aefe75553 Remove some no-longer-needed kluges for bootstrapping, in particular
the AMI_OVERRIDE flag.  The fact that TransactionLogFetch treats
BootstrapTransactionId as always committed is sufficient to make
bootstrap work, and getting rid of extra tests in heavily used code
paths seems like a win.  The files produced by initdb are demonstrably
the same after this change.
2005-02-20 21:46:50 +00:00
Tom Lane 60b2444cc3 Add code to prevent transaction ID wraparound by enforcing a safe limit
in GetNewTransactionId().  Since the limit value has to be computed
before we run any real transactions, this requires adding code to database
startup to scan pg_database and determine the oldest datfrozenxid.
This can conveniently be combined with the first stage of an attack on
the problem that the 'flat file' copies of pg_shadow and pg_group are
not properly updated during WAL recovery.  The code I've added to
startup resides in a new file src/backend/utils/init/flatfiles.c, and
it is responsible for rewriting the flat files as well as initializing
the XID wraparound limit value.  This will eventually allow us to get
rid of GetRawDatabaseInfo too, but we'll need an initdb so we can add
a trigger to pg_database.
2005-02-20 02:22:07 +00:00
Bruce Momjian 7c44e57331 Move plpgsql DEBUG from DEBUG2 to DEBUG1 because it is a user-requested
DEBUG.

Fix a few places where DEBUG1 crept in that should have been DEBUG2.
2005-02-12 23:53:42 +00:00
Tom Lane 0ce4d56924 Phase 1 of fix for 'SMgrRelation hashtable corrupted' problem. This
is the minimum required fix.  I want to look next at taking advantage of
it by simplifying the message semantics in the shared inval message queue,
but that part can be held over for 8.1 if it turns out too ugly.
2005-01-10 20:02:24 +00:00
Bruce Momjian 2daed8c5b3 Update copyrights that were missed. 2005-01-01 05:43:09 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Tom Lane bfa5f30481 Awhile back I added some code to StartupCLOG() to forcibly zero out
the remainder of the current clog page during system startup.  While
this was a good idea, it turns out the code fails if nextXid is
exactly at a page boundary, because we won't have created the "current"
clog page yet in that case.  Since the page will be correctly zeroed
when we execute the first transaction on it, the solution is just to
do nothing when exactly at a page boundary.  Per trouble report from
Dave Hartwig.
2004-12-22 18:45:49 +00:00
Tom Lane ff5a354ece Fix is-it-time-for-a-checkpoint logic so that checkpoint_segments can
usefully be larger than 255.  Per gripe from Simon Riggs.
2004-12-17 00:10:36 +00:00
Tom Lane 37d693033d Minor adjustment of message style. 2004-11-17 16:26:59 +00:00
Neil Conway b25d23e1e6 Don't allow pg_start_backup() to be invoked if archive_command has not
been defined. Patch from Gavin Sherry, editorializing by Neil Conway.
2004-11-17 02:22:54 +00:00
Peter Eisentraut 0ed3c7665e Small message clarifications 2004-11-05 17:11:34 +00:00
Tom Lane 88868d4fbc Change COMMIT back to the old behavior of emitting command tag COMMIT,
not ROLLBACK, for the case of COMMIT outside a transaction block.
Alvaro Herrera
2004-10-30 20:44:43 +00:00
Tom Lane 23f264d125 Rearrange order of pre-commit operations: must close cursors before doing
ON COMMIT actions.  Per bug report from Michael Guerin.
2004-10-29 22:19:53 +00:00
Tom Lane ee69be44d5 Add DEBUG1-level logging of checkpoint start and end. Also, reduce the
'recycled log files' and 'removed log files' messages from DEBUG1 to
DEBUG2, replacing them with a count of files added/removed/recycled in
the checkpoint end message, as per suggestion from Simon Riggs.
2004-10-29 00:16:08 +00:00
Tom Lane fdd13f1568 Give the ResourceOwner mechanism full responsibility for releasing buffer
pins at end of transaction, and reduce AtEOXact_Buffers to an Assert
cross-check that this was done correctly.  When not USE_ASSERT_CHECKING,
AtEOXact_Buffers is a complete no-op.  This gets rid of an O(NBuffers)
bottleneck during transaction commit/abort, which recent testing has shown
becomes significant above a few tens of thousands of shared buffers.
2004-10-16 18:57:26 +00:00
Bruce Momjian 5c267325ec Add 'int' cast for getpid() because some Solaris releases return long
for getpid().
2004-10-14 20:23:46 +00:00
Peter Eisentraut 0fd37839d9 Message style revisions 2004-10-12 21:54:45 +00:00
Bruce Momjian 67608a393b Make getpid() use %d consistently for printing. 2004-10-09 02:46:42 +00:00
Bruce Momjian a5d7ba773d Adjust comments previously moved to column 1 by pgident. 2004-10-07 15:21:58 +00:00
Tom Lane 4c77cbb272 PortalRun must guard against the possibility that the portal it's
running contains VACUUM or a similar command that will internally start
and commit transactions.  In such a case, the original caller values of
CurrentMemoryContext and CurrentResourceOwner will point to objects that
will be destroyed by the internal commit.  We must restore these pointers
to point to the newly-manufactured transaction context and resource owner,
rather than possibly pointing to deleted memory.
Also tweak xact.c so that AbortTransaction and AbortSubTransaction
forcibly restore a sane value for CurrentResourceOwner, much as they
have always done for CurrentMemoryContext.  I'm not certain this is
necessary but I'm feeling paranoid today.
Responds to Sean Chittenden's bug report of 4-Oct.
2004-10-04 21:52:15 +00:00
Tom Lane 257cccbe5e Add some marginal tweaks to eliminate memory leakages associated with
subtransactions.  Trivial subxacts (such as a plpgsql exception block
containing no database access) now demonstrably leak zero bytes.
2004-09-16 20:17:49 +00:00
Tom Lane 86fff990b2 RecentXmin is too recent to use as the cutoff point for accessing
pg_subtrans --- what we need is the oldest xmin of any snapshot in use
in the current top transaction.  Introduce a new variable TransactionXmin
to play this role.  Fixes intermittent regression failure reported by
Neil Conway.
2004-09-16 18:35:23 +00:00
Tom Lane 8f9f198603 Restructure subtransaction handling to reduce resource consumption,
as per recent discussions.  Invent SubTransactionIds that are managed like
CommandIds (ie, counter is reset at start of each top transaction), and
use these instead of TransactionIds to keep track of subtransaction status
in those modules that need it.  This means that a subtransaction does not
need an XID unless it actually inserts/modifies rows in the database.
Accordingly, don't assign it an XID nor take a lock on the XID until it
tries to do that.  This saves a lot of overhead for subtransactions that
are only used for error recovery (eg plpgsql exceptions).  Also, arrange
to release a subtransaction's XID lock as soon as the subtransaction
exits, in both the commit and abort cases.  This avoids holding many
unique locks after a long series of subtransactions.  The price is some
additional overhead in XactLockTableWait, but that seems acceptable.
Finally, restructure the state machine in xact.c to have a more orthogonal
set of states for subtransactions.
2004-09-16 16:58:44 +00:00
Tom Lane b2c4071299 Redesign query-snapshot timing so that volatile functions in READ COMMITTED
mode see a fresh snapshot for each command in the function, rather than
using the latest interactive command's snapshot.  Also, suppress fresh
snapshots as well as CommandCounterIncrement inside STABLE and IMMUTABLE
functions, instead using the snapshot taken for the most closely nested
regular query.  (This behavior is only sane for read-only functions, so
the patch also enforces that such functions contain only SELECT commands.)
As per my proposal of 6-Sep-2004; I note that I floated essentially the
same proposal on 19-Jun-2002, but that discussion tailed off without any
action.  Since 8.0 seems like the right place to be taking possibly
nontrivial backwards compatibility hits, let's get it done now.
2004-09-13 20:10:13 +00:00
Tom Lane b339d1fff6 Fire non-deferred AFTER triggers immediately upon query completion,
rather than when returning to the idle loop.  This makes no particular
difference for interactively-issued queries, but it makes a big difference
for queries issued within functions: trigger execution now occurs before
the calling function is allowed to proceed.  This responds to numerous
complaints about nonintuitive behavior of foreign key checking, such as
http://archives.postgresql.org/pgsql-bugs/2004-09/msg00020.php, and
appears to be required by the SQL99 spec.
Also take the opportunity to simplify the data structures used for the
pending-trigger list, rename them for more clarity, and squeeze out a
bit of space.
2004-09-10 18:40:09 +00:00
Tom Lane 23645f0582 Fix incorrect ordering of smgr cleanup relative to buffer pin cleanup
during transaction abort.  Add a regression test case to catch related
mistakes in future.  Alvaro Herrera and Tom Lane.
2004-09-06 17:56:33 +00:00
Tom Lane e32bba202d Downgrade LOG messages to DEBUG1 for normal recycling of xlog, clog,
subtrans segments.  Per Greg Mullane and Chris K-L.
2004-09-06 03:04:27 +00:00
Tom Lane 64cb889106 Ensure that the remainder of the current pg_clog page is zeroed during
startup, just to be sure that there's no leftover junk there.
2004-08-30 19:00:42 +00:00
Tom Lane 9cf4eaa00b Fix failure to advance nextXID beyond subtransactions whose XIDs appear
only within COMMIT or ABORT records.
2004-08-30 19:00:03 +00:00
Bruce Momjian 15d3f9f6b7 Another pgindent run with lib typedefs added. 2004-08-30 02:54:42 +00:00
Tom Lane 50742aed68 Add WAL logging for CREATE/DROP DATABASE and CREATE/DROP TABLESPACE.
Fix TablespaceCreateDbspace() to be able to create a dummy directory
in place of a dropped tablespace's symlink.  This eliminates the open
problem of a PANIC during WAL replay when a replayed action attempts
to touch a file in a since-deleted tablespace.  It also makes for a
significant improvement in the usability of PITR replay.
2004-08-29 21:08:48 +00:00
Tom Lane 0ffe11abd3 Widen xl_len field of XLogRecord header to 32 bits, so that we'll have
a more tolerable limit on the number of subtransactions or deleted files
in COMMIT and ABORT records.  Buy back the extra space by eliminating the
xl_xact_prev field, which isn't being used for anything and is rather
unlikely ever to be used for anything.
This does not force initdb, but you do need to do pg_resetxlog if you
want to upgrade an existing 8.0 installation without initdb.
2004-08-29 16:34:48 +00:00
Bruce Momjian b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane f78ecbf20e Now that TransactionIdDidAbort doesn't think it should try to modify
pg_clog, there's no reason to do abort marking of subtransactions in a
nonintuitive order.
2004-08-28 22:04:12 +00:00
Tom Lane 7531d2fd85 Add missing Assert to make TransactionIdDidAbort more consistent with
TransactionIdDidCommit.
2004-08-28 21:58:59 +00:00
Tom Lane 1c72d0dec1 Fix relcache to account properly for subtransaction status of 'new'
relcache entries.  Also, change TransactionIdIsCurrentTransactionId()
so that if consulted during transaction abort, it will not say that
the aborted xact is still current.  (It would be better to ensure that
it's never called at all during abort, but I'm not sure we can easily
guarantee that.)  In combination, these fix a crash we have seen
occasionally during parallel regression tests of 8.0.
2004-08-28 20:31:44 +00:00
Tom Lane f444dafab0 Can't truncate pg_subtrans during a recovery checkpoint --- subtrans
module isn't fully initialized yet.
2004-08-28 18:18:03 +00:00
Tom Lane fe455ee1d4 Revise ResourceOwner code to avoid accumulating ResourceOwner objects
for every command executed within a transaction.  For long transactions
this was a significant memory leak.  Instead, we can delete a portal's
or subtransaction's ResourceOwner immediately, if we physically transfer
the information about its locks up to the parent owner.  This does not
fully solve the leak problem; we need to do something about counting
multiple acquisitions of the same lock in order to fix it.  But it's a
necessary step along the way.
2004-08-25 18:43:43 +00:00
Tom Lane 4dbb880d3c Rearrange pg_subtrans handling as per recent discussion. pg_subtrans
updates are no longer WAL-logged nor even fsync'd; we do not need to,
since after a crash no old pg_subtrans data is needed again.  We truncate
pg_subtrans to RecentGlobalXmin at each checkpoint.  slru.c's API is
refactored a little bit to separate out the necessary decisions.
2004-08-23 23:22:45 +00:00
Tom Lane f009c316ba Tweak code so that pg_subtrans is never consulted for XIDs older than
RecentXmin (== MyProc->xmin).  This ensures that it will be safe to
truncate pg_subtrans at RecentGlobalXmin, which should largely eliminate
any fear of bloat.  Along the way, eliminate SubTransXidsHaveCommonAncestor,
which isn't really needed and could not give a trustworthy result anyway
under the lookback restriction.
In an unrelated but nearby change, #ifdef out GetUndoRecPtr, which has
been dead code since 2001 and seems unlikely to ever be resurrected.
2004-08-22 02:41:58 +00:00
Bruce Momjian 10249abfa1 Cleanup Win32 COPY handling, and move archive examples to SGML. 2004-08-12 19:03:44 +00:00
Bruce Momjian 43ea65a0dc Add mention of "WIN32" COPY. 2004-08-12 18:34:45 +00:00
Bruce Momjian 6525b42b10 Add make_native_path() because Win32 COPY is an internal CMD.EXE command
and doesn't process forward slashes in the same way as external
commands.  Quoting the first argument to COPY does not convert forward
to backward slashes, but COPY does properly process quoted forward
slashes in the second argument.

Win32 COPY works with quoted forward slashes in the first argument only if the
current directory is the same as the directory of the first argument.
2004-08-12 18:32:52 +00:00
Tom Lane 3fdf649f4f Fix failure to guarantee that a checkpoint will write out pg_clog updates
for transaction commits that occurred just before the checkpoint.  This is
an EXTREMELY serious bug --- kudos to Satoshi Okada for creating a
reproducible test case to prove its existence.
2004-08-11 04:07:16 +00:00
Tom Lane 35f539b481 When expanding %p in archive_command or restore_command, translate
slashes to backslashes #ifdef WIN32.  This is to cope with the fact
that Windows seems exceedingly unfriendly to slashes in shell commands,
as per recent discussion.
2004-08-09 16:26:06 +00:00
Tom Lane 7dca975c5d Add a comment about why we always replay backup blocks from WAL. 2004-08-08 03:22:08 +00:00
Tom Lane fcbc438727 Label CVS tip as 8.0devel instead of 7.5devel. Adjust various comments
and documentation to reference 8.0 instead of 7.5.
2004-08-04 21:34:35 +00:00
Tom Lane b387d16f96 Make use of backup label/history files to control recovery properly. 2004-08-04 16:25:02 +00:00
Tom Lane 58c41712d5 Add functions pg_start_backup, pg_stop_backup to create backup label
and history files as per recent discussion.  While at it, remove
pg_terminate_backend, since we have decided we do not have time during
this release cycle to address the reliability concerns it creates.
Split the 'Miscellaneous Functions' documentation section into
'System Information Functions' and 'System Administration Functions',
which hopefully will draw the eyes of those looking for such things.
2004-08-03 20:32:36 +00:00
Tom Lane a83c45c4c6 Fix misplacement of savepointLevel test, per report from Chris K-L. 2004-08-03 15:57:26 +00:00
Tom Lane 410b1dfb88 Update the in-code documentation about the transaction system. Move it
into a README file instead of being in xact.c's header comment.
Alvaro Herrera.
2004-08-01 20:57:59 +00:00
Tom Lane 5cc380f9a3 Error message style adjustments, per Alvaro Herrera. 2004-08-01 17:45:43 +00:00
Tom Lane efcaf1e868 Some mop-up work for savepoints (nested transactions). Store a small
number of active subtransaction XIDs in each backend's PGPROC entry,
and use this to avoid expensive probes into pg_subtrans during
TransactionIdIsInProgress.  Extend EOXactCallback API to allow add-on
modules to get control at subxact start/end.  (This is deliberately
not compatible with the former API, since any uses of that API probably
need manual review anyway.)  Add basic reference documentation for
SAVEPOINT and related commands.  Minor other cleanups to check off some
of the open issues for subtransactions.
Alvaro Herrera and Tom Lane.
2004-08-01 17:32:22 +00:00
Tom Lane beda4814c1 plpgsql does exceptions.
There are still some things that need refinement; in particular I fear
that the recognized set of error condition names probably has little in
common with what Oracle recognizes.  But it's a start.
2004-07-31 07:39:21 +00:00
Tom Lane 1bf3d61504 Fix subtransaction behavior for large objects, temp namespace, files,
password/group files.  Also allow read-only subtransactions of a read-write
parent, but not vice versa.  These are the reasonably noncontroversial
parts of Alvaro's recent mop-up patch, plus further work on large objects
to minimize use of the TopTransactionResourceOwner.
2004-07-28 14:23:31 +00:00
Tom Lane cc813fc2b8 Replace nested-BEGIN syntax for subtransactions with spec-compliant
SAVEPOINT/RELEASE/ROLLBACK-TO syntax.  (Alvaro)
Cause COMMIT of a failed transaction to report ROLLBACK instead of
COMMIT in its command tag.  (Tom)
Fix a few loose ends in the nested-transactions stuff.
2004-07-27 05:11:48 +00:00
Tom Lane acd907bfcc Add cross-check that current timeline of pg_control is an ancestor of
recovery_target_timeline --- otherwise there is no path from the backup
to the requested timeline.  This check was foreseen in the original
discussion but I forgot to implement it.
2004-07-22 21:09:37 +00:00
Tom Lane 3dba9cb694 Add a check on file size as an additional safety check that a WAL file
recovered from archive is not corrupt.  It's not much but it will catch
one common problem, viz out-of-disk-space.
Also, force a WAL recovery scan when recovery.conf is present, even if
pg_control shows a clean shutdown.  This allows recovery with a tar backup
that was taken with the postmaster shut down, as per complaint from
Mark Kirkwood.
2004-07-22 20:18:40 +00:00
Tom Lane 2042b3428d Invent WAL timelines, as per recent discussion, to make point-in-time
recovery more manageable.  Also, undo recent change to add FILE_HEADER
and WASTED_SPACE records to XLOG; instead make the XLOG page header
variable-size with extra fields in the first page of an XLOG file.
This should fix the boundary-case bugs observed by Mark Kirkwood.
initdb forced due to change of XLOG representation.
2004-07-21 22:31:26 +00:00
Tom Lane 9c7a765f02 Remove unportable use of strptime() to parse recovery target time spec.
Instead use our own abstimein code, which is more flexible anyway.
2004-07-19 14:34:39 +00:00
Tom Lane 66ec2db728 XLOG file archiving and point-in-time recovery. There are still some
loose ends and a glaring lack of documentation, but it basically works.

Simon Riggs with some editorialization by Tom Lane.
2004-07-19 02:47:16 +00:00
Tom Lane fe548629c5 Invent ResourceOwner mechanism as per my recent proposal, and use it to
keep track of portal-related resources separately from transaction-related
resources.  This allows cursors to work in a somewhat sane fashion with
nested transactions.  For now, cursor behavior is non-subtransactional,
that is a cursor's state does not roll back if you abort a subtransaction
that fetched from the cursor.  We might want to change that later.
2004-07-17 03:32:14 +00:00
Tom Lane f5c798ee82 Fix no-longer-correct bit-pushing in TransactionIdSetStatus, per Alvaro. 2004-07-03 02:55:56 +00:00
Tom Lane b6197fe069 Further review of xact.c state machine for nested transactions. Fix
problems with starting subtransactions inside already-failed transactions.
Clean up some comments.
2004-07-01 20:11:03 +00:00
Tom Lane 573a71a5da Nested transactions. There is still much left to do, especially on the
performance front, but with feature freeze upon us I think it's time to
drive a stake in the ground and say that this will be in 7.5.

Alvaro Herrera, with some help from Tom Lane.
2004-07-01 00:52:04 +00:00
Tom Lane 2467394ee1 Tablespaces. Alternate database locations are dead, long live tablespaces.
There are various things left to do: contrib dbsize and oid2name modules
need work, and so does the documentation.  Also someone should think about
COMMENT ON TABLESPACE and maybe RENAME TABLESPACE.  Also initlocation is
dead, it just doesn't know it yet.

Gavin Sherry and Tom Lane.
2004-06-18 06:14:31 +00:00
Tom Lane 921d749bd4 Adjust our timezone library to use pg_time_t (typedef'd as int64) in
place of time_t, as per prior discussion.  The behavior does not change
on machines without a 64-bit-int type, but on machines with one, which
is most, we are rid of the bizarre boundary behavior at the edges of
the 32-bit-time_t range (1901 and 2038).  The system will now treat
times over the full supported timestamp range as being in your local
time zone.  It may seem a little bizarre to consider that times in
4000 BC are PST or EST, but this is surely at least as reasonable as
propagating Gregorian calendar rules back that far.

I did not modify the format of the zic timezone database files, which
means that for the moment the system will not know about daylight-savings
periods outside the range 1901-2038.  Given the way the files are set up,
it's not a simple decision like 'widen to 64 bits'; we have to actually
think about the range of years that need to be supported.  We should
probably inquire what the plans of the upstream zic people are before
making any decisions of our own.
2004-06-03 02:08:07 +00:00
Tom Lane 9b178555fc Per previous discussions, get rid of use of sync(2) in favor of
explicitly fsync'ing every (non-temp) file we have written since the
last checkpoint.  In the vast majority of cases, the burden of the
fsyncs should fall on the bgwriter process not on backends.  (To this
end, we assume that an fsync issued by the bgwriter will force out
blocks written to the same file by other processes using other file
descriptors.  Anyone have a problem with that?)  This makes the world
safe for WIN32, which ain't even got sync(2), and really makes the world
safe for Unixen as well, because sync(2) never had the semantics we need:
it offers no way to wait for the requested I/O to finish.

Along the way, fix a bug I recently introduced in xlog recovery:
file truncation replay failed to clear bufmgr buffers for the dropped
blocks, which could result in 'PANIC:  heap_delete_redo: no block'
later on in xlog replay.
2004-05-31 03:48:10 +00:00
Tom Lane 076a055acf Separate out bgwriter code into a logically separate module, rather
than being random pieces of other files.  Give bgwriter responsibility
for all checkpoint activity (other than a post-recovery checkpoint);
so this child process absorbs the functionality of the former transient
checkpoint and shutdown subprocesses.  While at it, create an actual
include file for postmaster.c, which for some reason never had its own
file before.
2004-05-29 22:48:23 +00:00
Tom Lane 1a321f26d8 Code review for EXEC_BACKEND changes. Reduce the number of #ifdefs by
about a third, make it work on non-Windows platforms again.  (But perhaps
I broke the WIN32 code, since I have no way to test that.)  Fold all the
paths that fork postmaster child processes to go through the single
routine SubPostmasterMain, which takes care of resurrecting the state that
would normally be inherited from the postmaster (including GUC variables).
Clean up some places where there's no particularly good reason for the
EXEC and non-EXEC cases to work differently.  Take care of one or two
FIXMEs that remained in the code.
2004-05-28 05:13:32 +00:00
Tom Lane 16974ee910 Get rid of the former rather baroque mechanism for propagating the values
of ThisStartUpID and RedoRecPtr into new backends.  It's a lot easier just
to make them all grab the values out of shared memory during startup.
This helps to decouple the postmaster from checkpoint execution, which I
need since I'm intending to let the bgwriter do it instead, and it also
fixes a bug in the Win32 port: ThisStartUpID wasn't getting propagated at
all AFAICS.  (Doesn't give me a lot of faith in the amount of testing that
port has gotten.)
2004-05-27 17:12:57 +00:00
Tom Lane 4d86ae4260 For multi-table ANALYZE, use per-table transactions when possible
(ie, when not inside a transaction block), so that we can avoid holding
locks longer than necessary.  Per trouble report from Philip Warner.
2004-05-22 23:14:38 +00:00
Tom Lane e6319d1d28 Put back #include <sys/time.h> in files that seem to need it on Linux. 2004-05-21 16:08:47 +00:00
Tom Lane 63bd0db121 Integrate src/timezone library for all platforms. There is more we can
and should do now that we control our own destiny for timezone handling,
but this commit gets the bulk of the picayune diffs in place.
Magnus Hagander and Tom Lane.
2004-05-21 05:08:06 +00:00
Tom Lane 0bd61548ab Solve the 'Turkish problem' with undesirable locale behavior for case
conversion of basic ASCII letters.  Remove all uses of strcasecmp and
strncasecmp in favor of new functions pg_strcasecmp and pg_strncasecmp;
remove most but not all direct uses of toupper and tolower in favor of
pg_toupper and pg_tolower.  These functions use the same notions of
case folding already developed for identifier case conversion.  I left
the straight locale-based folding in place for situations where we are
just manipulating user data and not trying to match it to built-in
strings --- for example, the SQL upper() function is still locale
dependent.  Perhaps this will prove not to be what's wanted, but at
the moment we can initdb and pass regression tests in Turkish locale.
2004-05-07 00:24:59 +00:00
Bruce Momjian 31338352bd * Most changes are to fix warnings issued when compiling win32
* removed a few redundant defines
* get_user_name safe under win32
* rationalized pipe read EOF for win32 (UPDATED PATCH USED)
* changed all backend instances of sleep() to pg_usleep

    - except for the SLEEP_ON_ASSERT in assert.c, as it would exceed a
32-bit long [Note to patcher: If a SLEEP_ON_ASSERT of 2000 seconds is
acceptable, please replace with pg_usleep(2000000000L)]

I added a comment to that part of the code:

    /*
     *  It would be nice to use pg_usleep() here, but only does 2000 sec
     *  or 33 minutes, which seems too short.
     */
    sleep(1000000);

Claudio Natoli
2004-04-19 17:42:59 +00:00
Bruce Momjian 823ac7c2b4 This is a cleanup patch for access/transam/xact.c. It only removes some
#ifdef NOT_USED code, and adds a new TBLOCK state which signals the fact
that StartTransaction() has been executed.

Alvaro Herrera
2004-04-05 03:11:39 +00:00
Bruce Momjian 6367ed4382 Increase xlog str_time() static string variable, per Korean User's Group. 2004-03-22 04:16:57 +00:00
Tom Lane 7a57a67278 Replace opendir/closedir calls throughout the backend with AllocateDir
and FreeDir routines modeled on the existing AllocateFile/FreeFile.
Like the latter, these routines will avoid failing on EMFILE/ENFILE
conditions whenever possible, and will prevent leakage of directory
descriptors if an elog() occurs while one is open.
Also, reduce PANIC to ERROR in MoveOfflineLogs() --- this is not
critical code and there is no reason to force a DB restart on failure.
All per recent trouble report from Olivier Hubaut.
2004-02-23 23:03:10 +00:00
Bruce Momjian 1f17316a3d Here is an updated version of the win32 readdir patch.
1) Now puts in exactly the same change as the current-cvs mingw code
does. (see
http://cvs.sourceforge.net/viewcvs.py/mingw/runtime/mingwex/dirent.c?r1=
1.3&r2=1.4, second part of the patch).

2) Updates both xlog.c and slru.c in backend/access/transam/

3) Also updates pg_resetxlog, which also uses readdir() and checks the
errno value after the loop.

Magnus Hagander
2004-02-17 03:45:17 +00:00
Tom Lane c3c09be34b Commit the reasonably uncontroversial parts of J.R. Nield's PITR patch, to
wit: Add a header record to each WAL segment file so that it can be reliably
identified.  Avoid splitting WAL records across segment files (this is not
strictly necessary, but makes it simpler to incorporate the header records).
Make WAL entries for file creation, deletion, and truncation (as foreseen but
never implemented by Vadim).  Also, add support for making XLOG_SEG_SIZE
configurable at compile time, similarly to BLCKSZ.  Fix a couple bugs I
introduced in WAL replay during recent smgr API changes.  initdb is forced
due to changes in pg_control contents.
2004-02-11 22:55:26 +00:00
Tom Lane 58f337a343 Centralize implementation of delay code by creating a pg_usleep()
subroutine in src/port/pgsleep.c.  Remove platform dependencies from
miscadmin.h and put them in port.h where they belong.  Extend recent
vacuum cost-based-delay patch to apply to VACUUM FULL, ANALYZE, and
non-btree index vacuuming.

By the way, where is the documentation for the cost-based-delay patch?
2004-02-10 03:42:45 +00:00
Tom Lane 87bd956385 Restructure smgr API as per recent proposal. smgr no longer depends on
the relcache, and so the notion of 'blind write' is gone.  This should
improve efficiency in bgwriter and background checkpoint processes.
Internal restructuring in md.c to remove the not-very-useful array of
MdfdVec objects --- might as well just use pointers.
Also remove the long-dead 'persistent main memory' storage manager (mm.c),
since it seems quite unlikely to ever get resurrected.
2004-02-10 01:55:27 +00:00
Tom Lane 2f0d43b251 Review uses of IsUnderPostmaster, change some tests to look at
whereToSendOutput instead because they are really inquiring about
the correct client communication protocol.  Update some comments.
This is pointing towards supporting regular FE/BE client protocol
in a standalone backend, per discussion a month or so back.
2004-01-28 21:02:40 +00:00
Bruce Momjian f4921e5ca3 Attached is a patch that fixes some trivial typos and alignment. Please
apply.

Alvaro Herrera
2004-01-26 22:51:56 +00:00
Tom Lane c77f363384 Ensure that close() and fclose() are checked for errors, at least in
cases involving writes.  Per recent discussion about the possibility
of close-time failures on some filesystems.  There is a TODO item for
this, too.
2004-01-26 22:35:32 +00:00
Tom Lane be11fa26e3 Repair incorrect order of operations in GetNewTransactionId(). We must
complete ExtendCLOG() before advancing nextXid, so that if that routine
fails, the next incoming transaction will try it again.  Per trouble
report from Christopher Kings-Lynne.
2004-01-26 19:15:59 +00:00
Tom Lane 9bd681a522 Repair problem identified by Olivier Prenant: ALTER DATABASE SET search_path
should not be too eager to reject paths involving unknown schemas, since
it can't really tell whether the schemas exist in the target database.
(Also, when reading pg_dumpall output, it could be that the schemas
don't exist yet, but eventually will.)  ALTER USER SET has a similar issue.
So, reduce the normal ERROR to a NOTICE when checking search_path values
for these commands.  Supporting this requires changing the API for GUC
assign_hook functions, which causes the patch to touch a lot of places,
but the changes are conceptually trivial.
2004-01-19 19:04:40 +00:00
Bruce Momjian 38081fd000 Change PG_DELAY from msec to usec and use it consistenly rather than
select().   Add Win32 Sleep() for delay.
2004-01-09 21:08:50 +00:00
Neil Conway 192ad63bd7 More janitorial work: remove the explicit casting of NULL literals to a
pointer type when it is not necessary to do so.

For future reference, casting NULL to a pointer type is only necessary
when (a) invoking a function AND either (b) the function has no prototype
OR (c) the function is a varargs function.
2004-01-07 18:56:30 +00:00
Tom Lane 06288d4e22 Suppress compiler warning (xlog_outrec is unused if not WAL_DEBUG). 2004-01-06 22:22:37 +00:00
Neil Conway bc028beb16 Make the 'wal_debug' GUC variable a boolean (rather than an integer), and
hide it behind #ifdef WAL_DEBUG blocks.
2004-01-06 17:26:23 +00:00
Bruce Momjian d75b2ec4eb This patch is the next step towards (re)allowing fork/exec.
Claudio Natoli
2003-12-20 17:31:21 +00:00
Neil Conway fef0c8345a I posted some bufmgr cleanup a few weeks ago, but it conflicted with
some concurrent changes Jan was making to the bufmgr. Here's an
updated version of the patch -- it should apply cleanly to CVS
HEAD and passes the regression tests.

This patch makes the following changes:

     - remove the UnlockAndReleaseBuffer() and UnlockAndWriteBuffer()
       macros, and replace uses of them with calls to the appropriate
       functions.

     - remove a bunch of #ifdef BMTRACE code: it is ugly & broken
       (i.e. it doesn't compile)

     - make BufferReplace() return a bool, not an int

     - cleanup some logic in bufmgr.c; should be functionality
       equivalent to the previous code, just cleaner now

     - remove the BM_PRIVATE flag as it is unused

     - improve a few comments, etc.
2003-12-14 00:34:47 +00:00
Peter Eisentraut 2afacfc403 This patch properly sets the prototype for the on_shmem_exit and
on_proc_exit functions, and adjust all other related code to use
the proper types too.

by Kurt Roeckx
2003-12-12 18:45:10 +00:00
Joe Conway e2605c8311 Add a warning to AtEOXact_SPI() to catch cases where the current
transaction has been committed without SPI_finish() being called
first. Per recent discussion here:
http://archives.postgresql.org/pgsql-patches/2003-11/msg00286.php
2003-12-02 19:26:47 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Tom Lane 90b2202975 Fix bad interaction between NOTIFY processing and V3 extended query
protocol, per report from Igor Shevchenko.  NOTIFY thought it could
do its thing if transaction blockState is TBLOCK_DEFAULT, but in
reality it had better check the low-level transaction state is
TRANS_DEFAULT as well.  Formerly it was not possible to wait for the
client in a state where the first is true and the second is not ...
but now we can have such a state.  Minor cleanup in StartTransaction()
as well.
2003-10-16 16:50:41 +00:00
Tom Lane 8934790052 Add a mechanism to let dynamically loaded modules register post-commit/
post-abort cleanup hooks.  I'm surprised that we have not needed this
already, but I need it now to fix a plpgsql problem, and the usefulness
for other dynamically loaded modules seems obvious.
2003-09-28 23:26:20 +00:00
Tom Lane 4f7a2fa0c3 Fix typo in message. 2003-09-27 18:16:35 +00:00
Peter Eisentraut d84b6ef56b Various message fixes, among those fixes for the previous round of fixes 2003-09-26 15:27:37 +00:00
Peter Eisentraut feb4f44d29 Message editing: remove gratuitous variations in message wording, standardize
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
2003-09-25 06:58:07 +00:00
Tom Lane a56a016ceb Repair some REINDEX problems per recent discussions. The relcache is
now able to cope with assigning new relfilenode values to nailed-in-cache
indexes, so they can be reindexed using the fully crash-safe method.  This
leaves only shared system indexes as special cases.  Remove the 'index
deactivation' code, since it provides no useful protection in the shared-
index case.  Require reindexing of shared indexes to be done in standalone
mode, but remove other restrictions on REINDEX.  -P (IgnoreSystemIndexes)
now prevents using indexes for lookups, but does not disable index updates.
It is therefore safe to allow from PGOPTIONS.  Upshot: reindexing system catalogs
can be done without a standalone backend for all cases except
shared catalogs.
2003-09-24 18:54:02 +00:00
Bruce Momjian 46785776c4 Another pgindent run with updated typedefs. 2003-08-08 21:42:59 +00:00
Tom Lane 870886affe Suppress unused-variable warnings when building without Asserts. 2003-08-08 14:39:45 +00:00
Tom Lane 2f9c859ea1 Fix some copyright notices that weren't updated. Improve copyright tool
so it won't miss 'em again.
2003-08-04 23:59:41 +00:00
Bruce Momjian f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane 81b5c8a136 A visit from the message-style police ... 2003-07-28 00:09:16 +00:00
Tom Lane ec7aa4b515 Error message editing in backend/access. 2003-07-21 20:29:40 +00:00
Tom Lane fa3bd4dbd0 Error message editing: finish up undone task of reporting the problem
xid when we fail to access pg_clog.
2003-07-19 21:37:37 +00:00
Tom Lane 8cf63ba920 Repair boundary-case bug introduced by patch of two months ago that
fixed incorrect initial setting of StartUpID.  The logic in XLogWrite()
expects that Write->curridx is advanced to the next page as soon as
LogwrtResult points to the end of the current page, but StartupXLOG()
failed to make that happen when the old WAL ended exactly on a page
boundary.  Per trouble report from Hannu Krosing.
2003-07-17 16:45:04 +00:00
Tom Lane 0c985ab5a8 Add comment pointing out that XLByteToPrevSeg macro is not broken. 2003-06-26 18:23:07 +00:00
Bruce Momjian 0abe7431c6 This patch extracts page buffer pooling and the simple
least-recently-used strategy from clog.c into slru.c.  It doesn't
change any visible behaviour and passes all regression tests plus a
TruncateCLOG test done manually.

Apart from refactoring I made a little change to SlruRecentlyUsed,
formerly ClogRecentlyUsed:  It now skips incrementing lru_counts, if
slotno is already the LRU slot, thus saving a few CPU cycles.  To make
this work, lru_counts are initialised to 1 in SimpleLruInit.

SimpleLru will be used by pg_subtrans (part of the nested transactions
project), so the main purpose of this patch is to avoid future code
duplication.

Manfred Koizar
2003-06-11 22:37:46 +00:00
Tom Lane 39e98d9563 Repair sometimes-incorrect computation of StartUpID after a crash, per
example from Rao Kumar.  This is a very corner corner-case, requiring
a minimum of three closely-spaced database crashes and an unlucky
positioning of the second recovery's checkpoint record before you'd notice
any problem.  But the consequences are dire enough that it's a must-fix.
2003-05-22 14:39:28 +00:00
Tom Lane f85f43dfb5 Backend support for autocommit removed, per recent discussions. The
only remnant of this failed experiment is that the server will take
SET AUTOCOMMIT TO ON.  Still TODO: provide some client-side autocommit
logic in libpq.
2003-05-14 03:26:03 +00:00
Tom Lane 30f609484d Add binary I/O routines for a bunch more datatypes. Still a few to go,
but that was enough tedium for one day.  Along the way, move the few
support routines for types xid and cid into a more logical place.
2003-05-12 23:08:52 +00:00
Tom Lane 8d86a96068 Adjust CreateCheckpoint so that buffer dumping activities and cleanup of
dead xlog segments are not considered part of a critical section.  It is
not necessary to force a database-wide panic if we get a failure in these
operations.  Per recent trouble reports.
2003-05-10 18:01:31 +00:00
Bruce Momjian a7fd03e1de Handle clog structure in shared memory in exec() case, for Win32. 2003-05-03 03:52:07 +00:00