Commit Graph

870 Commits

Author SHA1 Message Date
Tom Lane
a52e4408b9 Add a debug logging message when a resource manager rejects an attempted
restart point.  Per suggestion from Simon Riggs.
2007-08-28 23:17:47 +00:00
Tom Lane
647fd9a108 Fix two bugs induced in VACUUM FULL by async-commit patch.
First, we cannot assume that XLogAsyncCommitFlush guarantees hint bits will be
settable, because clog.c's inexact LSN bookkeeping results in windows where a
previously flushed transaction is considered unhintable because it shares an
LSN slot with a later unflushed transaction.  But repair_frag requires
XMIN_COMMITTED to be correct so that it can distinguish tuples moved by the
current vacuum.  Since not being able to set the bit is an uncommon corner
case, the most practical way of dealing with it seems to be to abandon
shrinking (ie, don't invoke repair_frag) when we find a non-dead tuple whose
XMIN_COMMITTED bit couldn't be set.

Second, it is possible for the same reason that a RECENTLY_DEAD tuple does not
get its XMAX_COMMITTED bit set during scan_heap.  But by the time repair_frag
examines the tuple it might be possible to set the bit.  We therefore must
take buffer content lock when calling HeapTupleSatisfiesVacuum a second time,
else we can get an Assert failure in SetBufferCommitInfoNeedsSave.  This
latter bug is latent in existing releases, but I think it cannot actually
occur without async commit, since the first HeapTupleSatisfiesVacuum call
should always have set the bit.  So I'm not going to back-patch it.

In passing, reduce the existing "cannot shrink relation" messages from NOTICE
to LOG level.  The new message must be no higher than LOG if we don't want
unpredictable regression test failures, and consistency seems like a good
idea.  Also arrange that only one such message is reported per VACUUM FULL;
in typical scenarios you could get spammed with many such messages, which
seems a bit useless.
2007-08-13 19:08:26 +00:00
Tom Lane
bdd6b62245 Switch over to using the src/timezone functions for formatting timestamps
displayed in the postmaster log.  This avoids Windows-specific problems with
localized time zone names that are in the wrong encoding, and generally seems
like a good idea to forestall other potential platform-dependent issues.
To preserve the existing behavior that all backends will log in the same time
zone, create a new GUC variable log_timezone that can only be changed on a
system-wide basis, and reference log-related calculations to that zone instead
of the TimeZone variable.

This fixes the issue reported by Hiroshi Saito that timestamps printed by
xlog.c startup could be improperly localized on Windows.  We still need a
simpler patch for that problem in the back branches, however.
2007-08-04 01:26:54 +00:00
Tom Lane
4a78cdeb6b Support an optional asynchronous commit mode, in which we don't flush WAL
before reporting a transaction committed.  Data consistency is still
guaranteed (unlike setting fsync = off), but a crash may lose the effects
of the last few transactions.  Patch by Simon, some editorialization by Tom.
2007-08-01 22:45:09 +00:00
Tom Lane
ad4295728e Create a new dedicated Postgres process, "wal writer", which exists to write
and fsync WAL at convenient intervals.  For the moment it just tries to
offload this work from backends, but soon it will be responsible for
guaranteeing a maximum delay before asynchronously-committed transactions
will be flushed to disk.

This is a portion of Simon Riggs' async-commit patch, committed to CVS
separately because a background WAL writer seems like it might be a good idea
independently of the async-commit feature.  I rebased walwriter.c on
bgwriter.c because it seemed like a more appropriate way of handling signals;
while the startup/shutdown logic in postmaster.c is more like autovac because
we want walwriter to quit before we start the shutdown checkpoint.
2007-07-24 04:54:09 +00:00
Tom Lane
9fc25c0511 Improve logging of checkpoints. Patch by Greg Smith, worked over
by Heikki and a little bit by me.
2007-06-30 19:12:02 +00:00
Tom Lane
867e2c91a0 Implement "distributed" checkpoints in which the checkpoint I/O is spread
over a fairly long period of time, rather than being spat out in a burst.
This happens only for background checkpoints carried out by the bgwriter;
other cases, such as a shutdown checkpoint, are still done at full speed.

Remove the "all buffers" scan in the bgwriter, and associated stats
infrastructure, since this seems no longer very useful when the checkpoint
itself is properly throttled.

Original patch by Itagaki Takahiro, reworked by Heikki Linnakangas,
and some minor API editorialization by me.
2007-06-28 00:02:40 +00:00
Tom Lane
6d6d14b6d5 Redefine IsTransactionState() to only return true for TRANS_INPROGRESS state,
which is the only state in which it's safe to initiate database queries.
It turns out that all but two of the callers thought that's what it meant;
and the other two were using it as a proxy for "will GetTopTransactionId()
return a nonzero XID"?  Since it was in fact an unreliable guide to that,
make those two just invoke GetTopTransactionId() always, then deal with a
zero result if they get one.
2007-06-07 21:45:59 +00:00
Peter Eisentraut
7ce9b3683e Make some messages more consistent 2007-05-31 15:13:06 +00:00
Peter Eisentraut
71fb7b9014 Downgrade some low-level startup messages to DEBUG1. 2007-05-31 07:36:12 +00:00
Tom Lane
fa0e318f94 Fix overly-strict sanity check in BeginInternalSubTransaction that made it
fail when used in a deferred trigger.  Bug goes back to 8.0; no doubt the
reason it hadn't been noticed is that we've been discouraging use of
user-defined constraint triggers.  Per report from Frank van Vugt.
2007-05-30 21:01:39 +00:00
Tom Lane
d526575f89 Make large sequential scans and VACUUMs work in a limited-size "ring" of
buffers, rather than blowing out the whole shared-buffer arena.  Aside from
avoiding cache spoliation, this fixes the problem that VACUUM formerly tended
to cause a WAL flush for every page it modified, because we had it hacked to
use only a single buffer.  Those flushes will now occur only once per
ring-ful.  The exact ring size, and the threshold for seqscans to switch into
the ring usage pattern, remain under debate; but the infrastructure seems
done.  The key bit of infrastructure is a new optional BufferAccessStrategy
object that can be passed to ReadBuffer operations; this replaces the former
StrategyHintVacuum API.

This patch also changes the buffer usage-count methodology a bit: we now
advance usage_count when first pinning a buffer, rather than when last
unpinning it.  To preserve the behavior that a buffer's lifetime starts to
decrease when it's released, the clock sweep code is modified to not decrement
usage_count of pinned buffers.

Work not done in this commit: teach GiST and GIN indexes to use the vacuum
BufferAccessStrategy for vacuum-driven fetches.

Original patch by Simon, reworked by Heikki and again by Tom.
2007-05-30 20:12:03 +00:00
Tom Lane
77947c51c0 Fix up pgstats counting of live and dead tuples to recognize that committed
and aborted transactions have different effects; also teach it not to assume
that prepared transactions are always committed.

Along the way, simplify the pgstats API by tying counting directly to
Relations; I cannot detect any redeeming social value in having stats
pointers in HeapScanDesc and IndexScanDesc structures.  And fix a few
corner cases in which counts might be missed because the relation's
pgstat_info pointer hadn't been set.
2007-05-27 03:50:39 +00:00
Tom Lane
a8d539f124 To support external compression of archived WAL data, add a flag bit to
WAL records that shows whether it is safe to remove full-page images
(ie, whether or not an on-line backup was in progress when the WAL entry
was made).  Also make provision for an XLOG_NOOP record type that can be
used to fill in the extra space when decompressing the data for restore.

This is the portion of Koichi Suzuki's "full page writes" patch that
has to go into the core database.  The remainder of that work is two
external compression and decompression programs, which for the time being
will undergo separate development on pgfoundry.  Per discussion.

Also, twiddle the handling of BTREE_SPLIT records to ensure it'll be
possible to compress them (the previous coding caused essential info
to be omitted).  The other commonly-used record types seem OK already,
with the possible exception of GIN and GIST WAL records, which I don't
understand well enough to opine on.
2007-05-20 21:08:19 +00:00
Tom Lane
8c3cc86e7b During WAL recovery, when reading a page that we intend to overwrite completely
from the WAL data, don't bother to physically read it; just have bufmgr.c
return a zeroed-out buffer instead.  This speeds recovery significantly,
and also avoids unnecessary failures when a page-to-be-overwritten has corrupt
page headers on disk.  This replaces a former kluge that accomplished the
latter by pretending zero_damaged_pages was always ON during WAL recovery;
which was OK when the kluge was put in, but is unsafe when restoring a WAL
log that was written with full_page_writes off.

Heikki Linnakangas
2007-05-02 23:18:03 +00:00
Tom Lane
c432061963 Change the timestamps recorded in transaction commit/abort xlog records
from time_t to TimestampTz representation.  This provides full gettimeofday()
resolution of the timestamps, which might be useful when attempting to
do point-in-time recovery --- previously it was not possible to specify
the stop point with sub-second resolution.  But mostly this is to get
rid of TimestampTz-to-time_t conversion overhead during commit.  Per my
proposal of a day or two back.
2007-04-30 21:01:53 +00:00
Tom Lane
957d08c81f Implement rate-limiting logic on how often backends will attempt to send
messages to the stats collector.  This avoids the problem that enabling
stats_row_level for autovacuum has a significant overhead for short
read-only transactions, as noted by Arjen van der Meijden.  We can avoid
an extra gettimeofday call by piggybacking on the one done for WAL-logging
xact commit or abort (although that doesn't help read-only transactions,
since they don't WAL-log anything).

In my proposal for this, I noted that we could change the WAL log entries
for commit/abort to record full TimestampTz precision, instead of only
time_t as at present.  That's not done in this patch, but will be committed
separately.
2007-04-30 03:23:49 +00:00
Tom Lane
a2e923a652 Fix dynahash.c to suppress hash bucket splits while a hash_seq_search() scan
is in progress on the same hashtable.  This seems the least invasive way to
fix the recently-recognized problem that a split could cause the scan to
visit entries twice or (with much lower probability) miss them entirely.
The only field-reported problem caused by this is the "failed to re-find
shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky,
which was caused by multiply visited entries.  However, it seems certain
that mdsync() is vulnerable to missing required fsync's due to missed
entries, and I am fearful that RelationCacheInitializePhase2() might be at
risk as well.  Because of that and the generalized hazard presented by this
bug, back-patch all the supported branches.

Along the way, fix pg_prepared_statement() and pg_cursor() to not assume
that the hashtables they are examining will stay static between calls.
This is risky regardless of the newly noted dynahash problem, because
hash_seq_search() has never promised to cope with deletion of table entries
other than the just-returned one.  There may be no bug here because the only
supported way to call these functions is via ExecMakeTableFunctionResult()
which will cycle them to completion before doing anything very interesting,
but it seems best to get rid of the assumption.  This affects 8.2 and HEAD
only, since those functions weren't there earlier.
2007-04-26 23:24:46 +00:00
Tom Lane
9c9b619473 Remove the CheckpointStartLock in favor of having backends show whether they
are in their commit critical sections via flags in the ProcArray.  Checkpoint
can watch the ProcArray to determine when it's safe to proceed.  This is
a considerably better solution to the original problem of race conditions
between checkpoint and transaction commit: it speeds up commit, since there's
one less lock to fool with, and it prevents the problem of checkpoint being
delayed indefinitely when there's a constant flow of commits.  Heikki, with
some kibitzing from Tom.
2007-04-03 16:34:36 +00:00
Tom Lane
b3005276eb Decouple the values of TOAST_TUPLE_THRESHOLD and TOAST_MAX_CHUNK_SIZE.
Add the latter to the values checked in pg_control, since it can't be changed
without invalidating toast table content.  This commit in itself shouldn't
change any behavior, but it lays some necessary groundwork for experimentation
with these toast-control numbers.

Note: while TOAST_TUPLE_THRESHOLD can now be changed without initdb, some
thought still needs to be given to needs_toast_table() in toasting.c before
unleashing random changes.
2007-04-03 04:14:26 +00:00
Tom Lane
4f896dac17 Arrange for PreventTransactionChain to reject commands submitted as part
of a multi-statement simple-Query message.  This bug goes all the way
back, but unfortunately is not nearly so easy to fix in existing releases;
it is only the recent ProcessUtility API change that makes it fixable in
HEAD.  Per report from William Garrison.
2007-03-22 19:55:04 +00:00
Peter Eisentraut
f4ee82e3d3 Reverted waiting for further fixes:
Make configuration parameters fall back to their default values when they
are removed from the configuration file.

Joachim Wieland
2007-03-13 14:32:25 +00:00
Tom Lane
b9527e9840 First phase of plan-invalidation project: create a plan cache management
module and teach PREPARE and protocol-level prepared statements to use it.
In service of this, rearrange utility-statement processing so that parse
analysis does not assume table schemas can't change before execution for
utility statements (necessary because we don't attempt to re-acquire locks
for utility statements when reusing a stored plan).  This requires some
refactoring of the ProcessUtility API, but it ends up cleaner anyway,
for instance we can get rid of the QueryContext global.

Still to do: fix up SPI and related code to use the plan cache; I'm tempted to
try to make SQL functions use it too.  Also, there are at least some aspects
of system state that we want to ensure remain the same during a replan as in
the original processing; search_path certainly ought to behave that way for
instance, and perhaps there are others.
2007-03-13 00:33:44 +00:00
Peter Eisentraut
f84308f195 Make configuration parameters fall back to their default values when they
are removed from the configuration file.

Joachim Wieland
2007-03-12 22:09:28 +00:00
Bruce Momjian
ae35867a39 Remove undo information from pg_controldata --- never used.
Florian G. Pflug
2007-03-03 20:02:27 +00:00
Alvaro Herrera
1820650934 Restructure autovacuum in two processes: a dummy process, which runs
continuously, and requests vacuum runs of "autovacuum workers" to postmaster.
The workers do the actual vacuum work.  This allows for future improvements,
like allowing multiple autovacuum jobs running in parallel.

For now, the code keeps the original behavior of having a single autovac
process at any time by sleeping until the previous worker has finished.
2007-02-15 23:23:23 +00:00
Bruce Momjian
a9eb53969a Move fsync method macro defines into /include/access/xlogdefs.h so they
can be used by src/tools/fsync/test_fsync.c.
2007-02-14 05:00:40 +00:00
Tom Lane
caf2b64a75 Disallow committing a prepared transaction unless we are in the same database
it was executed in.  Someday it might be nice to allow cross-DB commits, but
work would be needed in NOTIFY and perhaps other places.  Per Heikki.
2007-02-13 19:39:42 +00:00
Tom Lane
c398300330 Combine cmin and cmax fields of HeapTupleHeaders into a single field, by
keeping private state in each backend that has inserted and deleted the same
tuple during its current top-level transaction.  This is sufficient since
there is no need to be able to determine the cmin/cmax from any other
transaction.  This gets us back down to 23-byte headers, removing a penalty
paid in 8.0 to support subtransactions.  Patch by Heikki Linnakangas, with
minor revisions by moi, following a design hashed out awhile back on the
pghackers list.
2007-02-09 03:35:35 +00:00
Peter Eisentraut
086c189456 Normalize fgets() calls to use sizeof() for calculating the buffer size
where possible, and fix some sites that apparently thought that fgets()
will overwrite the buffer by one byte.

Also add some strlcpy() to eliminate some weird memory handling.
2007-02-08 11:10:27 +00:00
Tom Lane
aec4cf1c8c Add a function pg_stat_clear_snapshot() that discards any statistics snapshot
already collected in the current transaction; this allows plpgsql functions to
watch for stats updates even though they are confined to a single transaction.
Use this instead of the previous kluge involving pg_stat_file() to wait for
the stats collector to update in the stats regression test.  Internally,
decouple storage of stats snapshots from transaction boundaries; they'll
now stick around until someone calls pgstat_clear_snapshot --- which xact.c
still does at transaction end, to maintain the previous behavior.  This makes
the logic a lot cleaner, at the price of a couple dozen cycles per transaction
exit.
2007-02-07 23:11:30 +00:00
Tom Lane
78d1216160 Remove the xlog-centric "database system is ready" message and replace it with
"database system is ready to accept connections", which is issued by the
postmaster when it really is ready to accept connections.  Per proposal from
Markus Schiltknecht and subsequent discussion.
2007-02-07 16:44:48 +00:00
Bruce Momjian
8b4ff8b6a1 Wording cleanup for error messages. Also change can't -> cannot.
Standard English uses "may", "can", and "might" in different ways:

        may - permission, "You may borrow my rake."

        can - ability, "I can lift that log."

        might - possibility, "It might rain today."

Unfortunately, in conversational English, their use is often mixed, as
in, "You may use this variable to do X", when in fact, "can" is a better
choice.  Similarly, "It may crash" is better stated, "It might crash".
2007-02-01 19:10:30 +00:00
Alvaro Herrera
eb63cc3da8 Arrange for autovacuum to be killed when another operation wants to be alone
accessing it, like DROP DATABASE.  This allows the regression tests to pass
with autovacuum enabled, which open the gates for finally enabling autovacuum
by default.
2007-01-16 13:28:57 +00:00
Bruce Momjian
29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Tom Lane
0cb91ccba9 Remove the logId/logSeg fields from pg_control, because they are not needed
in normal operation, and we can avoid rewriting pg_control at every log
segment switch if we don't insist that these values be valid.  Reducing
the number of pg_control updates is a good idea for both performance and
reliability.  It does make pg_resetxlog's life a bit harder, but that seems
a good tradeoff; and anyway the change to pg_resetxlog amounts to automating
something people formerly needed to do by hand, namely look at the existing
pg_xlog files to make sure the new WAL start point was past them.

In passing, change the wording of xlog.c's "database system was interrupted"
messages: describe the pg_control timestamp as "last known up at" rather than
implying it is the exact time of service interruption.  With this change the
timestamp will generally be the time of the last checkpoint, which could be
many minutes before the failure; and we've already seen indications that
people tend to misinterpret the old wording.

initdb forced due to change in pg_control layout.  Simon Riggs and Tom Lane
2006-12-08 19:50:53 +00:00
Neil Conway
886a02d1cb Add a txn_start column to pg_stat_activity. This makes it easier to
identify long-running transactions. Since we already need to record
the transaction-start time (e.g. for now()), we don't need any
additional system calls to report this information.

Catversion bumped, initdb required.
2006-12-06 18:06:48 +00:00
Tom Lane
5f60086e10 Minor adjustments to make failures in startup/shutdown behave more cleanly.
StartupXLOG and ShutdownXLOG no longer need to be critical sections, because
in all contexts where they are invoked, elog(ERROR) would be translated to
elog(FATAL) anyway.  (One change in bgwriter.c is needed to make this true:
set ExitOnAnyError before trying to exit.  This is a good fix anyway since
the existing code would have gone into an infinite loop on elog(ERROR) during
shutdown.)  That avoids a misleading report of PANIC during semi-orderly
failures.  Modify the postmaster to include the startup process in the set of
processes that get SIGTERM when a fast shutdown is requested, and also fix it
to not try to restart the bgwriter if the bgwriter fails while trying to write
the shutdown checkpoint.  Net result is that "pg_ctl stop -m fast" does
something reasonable for a system in warm standby mode, and so should Unix
system shutdown (ie, universal SIGTERM).  Per gripe from Stephen Harris and
some corner-case testing of my own.
2006-11-30 18:29:12 +00:00
Tom Lane
395249ecbe Several changes to reduce the probability of running out of memory during
AbortTransaction, which would lead to recursion and eventual PANIC exit
as illustrated in recent report from Jeff Davis.  First, in xact.c create
a special dedicated memory context for AbortTransaction to run in.  This
solves the problem as long as AbortTransaction doesn't need more than 32K
(or whatever other size we create the context with).  But in corner cases
it might.  Second, in trigger.c arrange to keep pending after-trigger event
records in separate contexts that can be freed near the beginning of
AbortTransaction, rather than having them persist until CleanupTransaction
as before.  Third, in portalmem.c arrange to free executor state data
earlier as well.  These two changes should result in backing off the
out-of-memory condition before AbortTransaction needs any significant
amount of memory, at least in typical cases such as memory overrun due
to too many trigger events or too big an executor hash table.  And all
the same for subtransaction abort too, of course.
2006-11-23 01:14:59 +00:00
Tom Lane
3ad0728c81 On systems that have setsid(2) (which should be just about everything except
Windows), arrange for each postmaster child process to be its own process
group leader, and deliver signals SIGINT, SIGTERM, SIGQUIT to the whole
process group not only the direct child process.  This provides saner behavior
for archive and recovery scripts; in particular, it's possible to shut down a
warm-standby recovery server using "pg_ctl stop -m immediate", since delivery
of SIGQUIT to the startup subprocess will result in killing the waiting
recovery_command.  Also, this makes Query Cancel and statement_timeout apply
to scripts being run from backends via system().  (There is no support in the
core backend for that, but it's widely done using untrusted PLs.)  Per gripe
from Stephen Harris and subsequent discussion.
2006-11-21 20:59:53 +00:00
Tom Lane
4f335a3d7f Repair two related errors in heap_lock_tuple: it was failing to recognize
cases where we already hold the desired lock "indirectly", either via
membership in a MultiXact or because the lock was originally taken by a
different subtransaction of the current transaction.  These cases must be
accounted for to avoid needless deadlocks and/or inappropriate replacement of
an exclusive lock with a shared lock.  Per report from Clarence Gardner and
subsequent investigation.
2006-11-17 18:00:15 +00:00
Peter Eisentraut
e138b80996 String fix 2006-11-16 14:28:41 +00:00
Tom Lane
792d6edd5b Clean up some misleading references to %p being a full path, per Simon. 2006-11-10 22:32:20 +00:00
Tom Lane
dcbdf9b1d4 Change Windows rename and unlink substitutes so that they time out after
30 seconds instead of retrying forever.  Also modify xlog.c so that if
it fails to rename an old xlog segment up to a future slot, it will
unlink the segment instead.  Per discussion of bug #2712, in which it
became apparent that Windows can handle unlinking a file that's being
held open, but not renaming it.
2006-11-08 20:12:05 +00:00
Tom Lane
48188e1621 Fix recently-understood problems with handling of XID freezing, particularly
in PITR scenarios.  We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases.  Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId.  Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done.  Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database.  initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs.  Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-11-05 22:42:10 +00:00
Tom Lane
1e758d5263 Add some code to CREATE DATABASE to check for pre-existing subdirectories
that conflict with the OID that we want to use for the new database.
This avoids the risk of trying to remove files that maybe we shouldn't
remove.  Per gripe from Jon Lapham and subsequent discussion of 27-Sep.
2006-10-18 22:44:12 +00:00
Peter Eisentraut
b9b4f10b5b Message style improvements 2006-10-06 17:14:01 +00:00
Bruce Momjian
f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Bruce Momjian
45c8ed96b9 Make some sentences consistent with similar ones.
Euler Taveira de Oliveira
2006-10-03 21:21:36 +00:00
Alvaro Herrera
4650c4fdb9 Degrade the transaction-id wraparound point message from LOG to DEBUG1, per
discussion.

Patch from Simon Riggs.
2006-09-26 17:21:39 +00:00
Tom Lane
8fad2e3ff4 Arrange for GetSnapshotData to copy live-subtransaction XIDs from the
PGPROC array into snapshots, and use this information to avoid visits
to pg_subtrans in HeapTupleSatisfiesSnapshot.  This appears to solve
the pg_subtrans-related context swap storm problem that's been reported
by several people for 8.1.  While at it, modify GetSnapshotData to not
take an exclusive lock on ProcArrayLock, as closer analysis shows that
shared lock is always sufficient.
Itagaki Takahiro and Tom Lane
2006-09-03 15:59:39 +00:00
Tom Lane
ca1fd0ea5b Move xact.c's partial support for Lists of TransactionIds into pg_list.h.
Needed because lock.c is now going to use the same type of list.
2006-08-27 19:11:46 +00:00
Tom Lane
35af5422f6 Make the server track an 'XID epoch', that is, maintain higher-order bits
of the transaction ID counter.  Nothing is done with the epoch except to
store it in checkpoint records, but this provides a foundation with which
add-on code can pretend that XIDs never wrap around.  This is a severely
trimmed and rewritten version of the xxid patch submitted by Marko Kreen.
Per discussion, the epoch counter seems the only part of xxid that really
needs to be in the core server.
2006-08-21 16:16:31 +00:00
Tom Lane
e8ea9e9587 Implement archive_timeout feature to force xlog file switches to occur no more
than N seconds apart.  This allows a simple, if not very high performance,
means of guaranteeing that a PITR archive is no more than N seconds behind
real time.  Also make pg_current_xlog_location return the WAL Write pointer,
add pg_current_xlog_insert_location to return the Insert pointer, and fix
pg_xlogfile_name_offset to return its results as a two-element record instead
of a smashed-together string, as per recent discussion.

Simon Riggs
2006-08-17 23:04:10 +00:00
Tom Lane
e002836913 Make recovery from WAL be restartable, by executing a checkpoint-like
operation every so often.  This improves the usefulness of PITR log
shipping for hot standby: formerly, if the standby server crashed, it
was necessary to restart it from the last base backup and replay all
the WAL since then.  Now it will only need to reread about the same
amount of WAL as the master server would.  The behavior might also
come in handy during a long PITR replay sequence.  Simon Riggs,
with some editorialization by Tom Lane.
2006-08-07 16:57:57 +00:00
Tom Lane
704ddaaa09 Add support for forcing a switch to a new xlog file; cause such a switch
to happen automatically during pg_stop_backup().  Add some functions for
interrogating the current xlog insertion point and for easily extracting
WAL filenames from the hex WAL locations displayed by pg_stop_backup
and friends.  Simon Riggs with some editorialization by Tom Lane.
2006-08-06 03:53:44 +00:00
Alvaro Herrera
92c2ecc130 Modify snapshot definition so that lazy vacuums are ignored by other
vacuums.  This allows a OLTP-like system with big tables to continue
regular vacuuming on small-but-frequently-updated tables while the
big tables are being vacuumed.

Original patch from Hannu Krossing, rewritten by Tom Lane and updated
by me.
2006-07-30 02:07:18 +00:00
Peter Eisentraut
e9b4969062 DTrace support, with a small initial set of probes
by Robert Lor
2006-07-24 16:32:45 +00:00
Tom Lane
9dc842f083 Don't try to truncate multixact SLRU files in checkpoints done during xlog
recovery.  In the first place, it doesn't work because slru's
latest_page_number isn't set up yet (this is why we've been hearing reports
of strange "apparent wraparound" log messages during crash recovery, but
only from people who'd managed to advance their next-mxact counters some
considerable distance from 0).  In the second place, it seems a bit unwise
to be throwing away data during crash recovery anwyway.  This latter
consideration convinces me to just disable truncation during recovery,
rather than computing latest_page_number and pushing ahead.
2006-07-20 00:46:42 +00:00
Bruce Momjian
e0522505bd Remove 576 references of include files that were not needed. 2006-07-14 14:52:27 +00:00
Bruce Momjian
a22d76d96a Allow include files to compile own their own.
Strip unused include files out unused include files, and add needed
includes to C files.

The next step is to remove unused include files in C files.
2006-07-13 16:49:20 +00:00
Bruce Momjian
0ff3461bcc Alphabetically order reference to include files, "N" - "S". 2006-07-11 17:26:59 +00:00
Bruce Momjian
3a534ade39 Alphabetically order reference to include files, "G" - "M". 2006-07-11 17:04:13 +00:00
Alvaro Herrera
d4cef0aa2a Improve vacuum code to track minimum Xids per table instead of per database.
To this end, add a couple of columns to pg_class, relminxid and relvacuumxid,
based on which we calculate the pg_database columns after each vacuum.

We now force all databases to be vacuumed, even template ones.  A backend
noticing too old a database (meaning pg_database.datminxid is in danger of
falling behind Xid wraparound) will signal the postmaster, which in turn will
start an autovacuum iteration to process the offending database.  In principle
this is only there to cope with frozen (non-connectable) databases without
forcing users to set them to connectable, but it could force regular user
database to go through a database-wide vacuum at any time.  Maybe we should
warn users about this somehow.  Of course the real solution will be to use
autovacuum all the time ;-)

There are some additional improvements we could have in this area: for example
the vacuum code could be smarter about not updating pg_database for each table
when called by autovacuum, and do it only once the whole autovacuum iteration
is done.

I updated the system catalogs documentation, but I didn't modify the
maintenance section.  Also having some regression tests for this would be nice
but it's not really a very straightforward thing to do.

Catalog version bumped due to system catalog changes.
2006-07-10 16:20:52 +00:00
Tom Lane
b7b78d24f7 Code review for FILLFACTOR patch. Change WITH grammar as per earlier
discussion (including making def_arg allow reserved words), add missed
opt_definition for UNIQUE case.  Put the reloptions support code in a less
random place (I chose to make a new file access/common/reloptions.c).
Eliminate header inclusion creep.  Make the index options functions safely
user-callable (seems like client apps might like to be able to test validity
of options before trying to make an index).  Reduce overhead for normal case
with no options by allowing rd_options to be NULL.  Fix some unmaintainably
klugy code, including getting rid of Natts_pg_class_fixed at long last.
Some stylistic cleanup too, and pay attention to keeping comments in sync
with code.

Documentation still needs work, though I did fix the omissions in
catalogs.sgml and indexam.sgml.
2006-07-03 22:45:41 +00:00
Bruce Momjian
277807bd9e Add FILLFACTOR to CREATE INDEX.
ITAGAKI Takahiro
2006-07-02 02:23:23 +00:00
Tom Lane
3c71244b74 Put #ifdef NOT_USED around posix_fadvise call. We may want to resurrect
this someday, but right now it seems that posix_fadvise is immature to
the point of being broken on many platforms ... and we don't have any
benchmark evidence proving it's worth spending time on.
2006-06-27 18:59:17 +00:00
Tom Lane
3a04f53e7f pg_stop_backup was calling XLogArchiveNotify() twice for the newly created
backup history file.  Bug introduced by the 8.1 change to make pg_stop_backup
delete older history files.  Per report from Masao Fujii.
2006-06-22 20:42:57 +00:00
Tom Lane
27c3e3de09 Remove redundant gettimeofday() calls to the extent practical without
changing semantics too much.  statement_timestamp is now set immediately
upon receipt of a client command message, and the various places that used
to do their own gettimeofday() calls to mark command startup are referenced
to that instead.  I have also made stats_command_string use that same
value for pg_stat_activity.query_start for both the command itself and
its eventual replacement by <IDLE> or <idle in transaction>.  There was
some debate about that, but no argument that seemed convincing enough to
justify an extra gettimeofday() call.
2006-06-20 22:52:00 +00:00
Tom Lane
1e8ae13640 Don't try to call posix_fadvise() unless <fcntl.h> supplies a declaration
for it.  Hopefully will fix core dump evidenced by some buildfarm members
since fadvise patch went in.  The actual definition of the function is not
ABI-compatible with compiler's default assumption in the absence of any
declaration, so it's clearly unsafe to try to call it without seeing a
declaration.
2006-06-18 18:30:21 +00:00
Bruce Momjian
40bc06fa16 Test for POSIX_FADV_DONTNEED to use posix_fadvise(). 2006-06-16 04:11:48 +00:00
Bruce Momjian
94a5c4a01b Use posix_fadvise() to avoid kernel caching of WAL contents on WAL file
close.

ITAGAKI Takahiro
2006-06-15 19:15:00 +00:00
Teodor Sigaev
8a3631f8d8 GIN: Generalized Inverted iNdex.
text[], int4[], Tsearch2 support for GIN.
2006-05-02 11:28:56 +00:00
Bruce Momjian
e6004f0151 Add statement_timestamp(), clock_timestamp(), and
transaction_timestamp() (just like now()).

Also update statement_timeout() to mention it is statement arrival time
that is measured.

Catalog version updated.
2006-04-25 00:25:22 +00:00
Tom Lane
eac825aa68 Ensure that we validate the page header of the first page of a WAL file
whenever we start to read within that file.  The first page carries
extra identification information that really ought to be checked, but
as the code stood, this was only checked when we switched sequentially
into a new WAL file, or if by chance the starting checkpoint record was
within the first page.  This patch ensures that we will detect bogus
'long header' information before we start replaying the WAL sequence.
2006-04-20 04:07:38 +00:00
Tom Lane
0a87394956 Fix the torn-page hazard for PITR base backups by forcing full page writes
to occur between pg_start_backup() and pg_stop_backup(), even if the GUC
setting full_page_writes is OFF.  Per discussion, doing this in combination
with the already-existing checkpoint during pg_start_backup() should ensure
safety against partial page updates being included in the backup.  We do
not have to force full page writes to occur during normal PITR operation,
as I had first feared.
2006-04-17 18:55:05 +00:00
Tom Lane
defe93463c Make the world safe for full_page_writes. Allow XLOG records that try to
update no-longer-existing pages to fall through as no-ops, but make a note
of each page number referenced by such records.  If we don't see a later
XLOG entry dropping the table or truncating away the page, complain at
the end of XLOG replay.  Since this fixes the known failure mode for
full_page_writes = off, revert my previous band-aid patch that disabled
that GUC variable.
2006-04-14 20:27:24 +00:00
Tom Lane
09b5271ebd Add a field to the first page of each WAL file to indicate the
XLOG_BLCKSZ.  This ought to help in preventing configuration mismatch
problems if anyone tries to ship PITR files between servers compiled
with different XLOG_BLCKSZ settings.  Simon Riggs
2006-04-05 03:34:05 +00:00
Tom Lane
e6140d9052 Don't use BLCKSZ for the physical length of the pg_control file, but
instead a dedicated symbol.  This probably makes no functional difference
for likely values of BLCKSZ, but it makes the intent clearer.
Simon Riggs, minor editorialization by Tom Lane.
2006-04-04 22:39:59 +00:00
Tom Lane
eaef111396 Define a separately configurable XLOG_BLCKSZ symbol for the page size
used within WAL files.  Historically this was the same as the data file
BLCKSZ, but there's no necessary connection, and it's possible that
performance gains might ensue from reducing XLOG_BLCKSZ.  In any case
distinguishing two symbols should improve code clarity.  This commit
does not actually change the page size, only provide the infrastructure
to make it possible to do so.  initdb forced because of addition of a
field to pg_control.
Mark Wong, with some help from Simon Riggs and Tom Lane.
2006-04-03 23:35:05 +00:00
Tom Lane
a8b8f4db23 Clean up WAL/buffer interactions as per my recent proposal. Get rid of the
misleadingly-named WriteBuffer routine, and instead require routines that
change buffer pages to call MarkBufferDirty (which does exactly what it says).
We also require that they do so before calling XLogInsert; this takes care of
the synchronization requirement documented in SyncOneBuffer.  Note that
because bufmgr takes the buffer content lock (in shared mode) while writing
out any buffer, it doesn't matter whether MarkBufferDirty is executed before
the buffer content change is complete, so long as the content change is
completed before releasing exclusive lock on the buffer.  So it's OK to set
the dirtybit before we fill in the LSN.
This eliminates the former kluge of needing to set the dirtybit in LockBuffer.
Aside from making the code more transparent, we can also add some new
debugging assertions, in particular that the caller of MarkBufferDirty must
hold the buffer content lock, not merely a pin.
2006-03-31 23:32:07 +00:00
Tom Lane
6d61cdec07 Clean up and document the API for XLogOpenRelation and XLogReadBuffer.
This commit doesn't make much functional change, but it does eliminate some
duplicated code --- for instance, PageIsNew tests are now done inside
XLogReadBuffer rather than by each caller.
The GIST xlog code still needs a lot of love, but I'll worry about that
separately.
2006-03-29 21:17:39 +00:00
Tom Lane
0a971e2f20 Disable full_page_writes, because turning it off risks causing crash-recovery
failures even when the hardware and OS did nothing wrong.  Per recent analysis
of a problem report from Alex Bahdushka.

For the moment I've just diked out the test of the parameter, rather than
removing the GUC infrastructure and documentation, in case we conclude that
there's something salvageable there.  There seems no chance of it being
resurrected in the 8.1 branch though.
2006-03-28 22:01:16 +00:00
Tom Lane
0a20207060 Arrange to emit a description of the current XLOG record as error context
when an error occurs during xlog replay.  Also, replace the former risky
'write into a fixed-size buffer with no overflow detection' API for XLOG
record description routines; use an expansible StringInfo instead.  (The
latter accounts for most of the patch bulk.)

Qingqing Zhou
2006-03-24 04:32:13 +00:00
Bruce Momjian
f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Neil Conway
8e5a10d46c This patch makes the error message strings throughout the backend
more compliant with the error message style guide. In particular,
errdetail should begin with a capital letter and end with a period,
whereas errmsg should not. I also fixed a few related issues in
passing, such as fixing the repeated misspelling of "lexeme" in
contrib/tsearch2 (per Tom's suggestion).
2006-03-01 06:30:32 +00:00
Tom Lane
c89a0dd3bb Repair longstanding bug in slru/clog logic: it is possible for two backends
to try to create a log segment file concurrently, but the code erroneously
specified O_EXCL to open(), resulting in a needless failure.  Before 7.4,
it was even a PANIC condition :-(.  Correct code is actually simpler than
what we had, because we can just say O_CREAT to start with and not need a
second open() call.  I believe this accounts for several recent reports of
hard-to-reproduce "could not create file ...: File exists" errors in both
pg_clog and pg_subtrans.
2006-01-21 04:38:21 +00:00
Neil Conway
fb627b76cc Cosmetic code cleanup: fix a bunch of places that used "return (expr);"
rather than "return expr;" -- the latter style is used in most of the
tree. I kept the parentheses when they were necessary or useful because
the return expression was complex.
2006-01-11 08:43:13 +00:00
Tom Lane
195f164228 Get rid of the SpinLockAcquire/SpinLockAcquire_NoHoldoff distinction
in favor of having just one set of macros that don't do HOLD/RESUME_INTERRUPTS
(hence, these correspond to the old SpinLockAcquire_NoHoldoff case).
Given our coding rules for spinlock use, there is no reason to allow
CHECK_FOR_INTERRUPTS to be done while holding a spinlock, and also there
is no situation where ImmediateInterruptOK will be true while holding a
spinlock.  Therefore doing HOLD/RESUME_INTERRUPTS while taking/releasing a
spinlock is just a waste of cycles.  Qingqing Zhou and Tom Lane.
2005-12-29 18:08:05 +00:00
Tom Lane
ab51bbaa06 Arrange to set the LC_XXX environment variables to match our locale
setup.  This protects against undesired changes in locale behavior
if someone carelessly does setlocale(LC_ALL, "") (and we know who
you are, perl guys).
2005-12-28 23:22:51 +00:00
Tom Lane
ec0baf949e Divide the lock manager's shared state into 'partitions', so as to
reduce contention for the former single LockMgrLock.  Per my recent
proposal.  I set it up for 16 partitions, but on a pgbench test this
gives only a marginal further improvement over 4 partitions --- we need
to test more scenarios to choose the number of partitions.
2005-12-11 21:02:18 +00:00
Tom Lane
887a7c61f6 Get rid of slru.c's hardwired insistence on a fixed number of slots per
SLRU area.  The number of slots is still a compile-time constant (someday
we might want to change that), but at least it's a different constant for
each SLRU area.  Increase number of subtrans buffers to 32 based on
experimentation with a heavily subtrans-bashing test case, and increase
number of multixact member buffers to 16, since it's obviously silly for
it not to be at least twice the number of multixact offset buffers.
2005-12-06 23:08:34 +00:00
Tom Lane
a615acf555 Arrange for read-only accesses to SLRU page buffers to take only a shared
lock, not exclusive, if the desired page is already in memory.  This can
be demonstrated to be a significant win on the pg_subtrans cache when there
is a large window of open transactions.  It should be useful for pg_clog
as well.  I didn't try to make GetMultiXactIdMembers() use the code, as
that would have taken some restructuring, and what with the local cache
for multixact contents it probably wouldn't really make a difference.
Per my recent proposal.
2005-12-06 18:10:06 +00:00
Bruce Momjian
436a2956d8 Re-run pgindent, fixing a problem where comment lines after a blank
comment line where output as too long, and update typedefs for /lib
directory.  Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).

Backpatch to 8.1.X.
2005-11-22 18:17:34 +00:00
Tom Lane
2a8d3d83ef R-tree is dead ... long live GiST. 2005-11-07 17:36:47 +00:00
Tom Lane
18691d8ee3 Clean up representation of SLRU page state. This is the cleaner fix
for the SLRU race condition that I posted a few days ago, but we decided
not to use in 8.1 and older branches.
2005-11-05 21:19:47 +00:00
Tom Lane
99d48695d4 Fix longstanding race condition in transaction log management: there was a
very narrow window in which SimpleLruReadPage or SimpleLruWritePage could
think that I/O was needed when it wasn't (and indeed the buffer had already
been assigned to another page).  This would result in an Assert failure if
Asserts were enabled, and probably in silent data corruption if not.
Reported independently by Jim Nasby and Robert Creager.

I intend a more extensive fix when 8.2 development starts, but this is a
reasonably low-impact patch for the existing branches.
2005-11-03 00:23:36 +00:00
Peter Eisentraut
07bb9f086b Message corrections 2005-10-29 00:31:52 +00:00
Tom Lane
a037926295 Reorder code so that we don't have to hold a critical section while
reserving SLRU space for a new MultiXact.  The original coding would have
treated out-of-disk-space as a PANIC condition, which is unnecessary.
2005-10-28 19:00:19 +00:00
Tom Lane
1986ca5ce5 Fix race condition in multixact code: it's possible to try to read a
multixact's starting offset before the offset has been stored into the
SLRU file.  A simple fix would be to hold the MultiXactGenLock until the
offset has been stored, but that looks like a big concurrency hit.  Instead
rely on knowledge that unset offsets will be zero, and loop when we see
a zero.  This requires a little extra hacking to ensure that zero is never
a valid value for the offset.  Problem reported by Matteo Beccati, fix
ideas from Martijn van Oosterhout, Alvaro Herrera, and Tom Lane.
2005-10-28 17:27:29 +00:00
Tom Lane
6d6c3722fb Make code for selecting default WAL sync method less confusing. 2005-10-22 20:27:17 +00:00
Bruce Momjian
1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Bruce Momjian
1d028537a2 This makes the error messages for PREPARE TRANSACTION, COMMIT PREPARED
etc. match the docs, which talk about "transaction identifier" not
"gid" or "global transaction identifier".

Steve Woodcock
2005-10-13 22:55:55 +00:00
Bruce Momjian
84cc9a4bb3 Back out this because of fear of changing error strings:
This makes the error messages for PREPARE TRANSACTION, COMMIT PREPARED
etc. match the docs, which talk about "transaction identifier" not
"gid" or "global transaction identifier".

Steve Woodcock
2005-10-13 17:57:57 +00:00
Bruce Momjian
90c22c9206 This makes the error messages for PREPARE TRANSACTION, COMMIT PREPARED
etc. match the docs, which talk about "transaction identifier" not
"gid" or "global transaction identifier".

Steve Woodcock
2005-10-13 17:57:17 +00:00
Tom Lane
64eea6c21d Expand pg_control information so that we can verify that the database
was created on a machine with alignment rules and floating-point format
similar to the current machine.  Per recent discussion, this seems like
a good idea with the increasing prevalence of 32/64 bit environments.
2005-10-03 00:28:43 +00:00
Tom Lane
037709e0b3 Reduce default value of max_prepared_transactions from 50 to 5. This
saves nearly 700kB in the default shared memory segment size, which seems
worthwhile, and it is a feature that many users won't use anyway.  Per
Heikki's argument, there is no point in a compromise value --- those who
are using 2PC at all will probably want it at least equal to max_connections.
But we can't set it to zero by default without breaking the prepared_xacts
regression test.
2005-08-29 21:38:18 +00:00
Tom Lane
9052537325 Rewrite gather-write patch into something less obviously bolted on
after the fact.  Fix bug with incorrect test for whether we are at end
of logfile segment.  Arrange for writes triggered by XLogInsert's
is-cache-more-than-half-full test to synchronize with the cache boundaries,
so that in long transactions we tend to write alternating halves of the
cache rather than randomly chosen portions of it; this saves one more
write syscall per cache load.
2005-08-22 23:59:04 +00:00
Bruce Momjian
8ad3965a11 Improve xid wraparound message (the server isn't really shut down, just
not accepting queries).

         errmsg("database is not accepting queries to avoid
	 wraparound data loss in database \"%s\"",
         errhint("Stop the postmaster and use a standalone
	 backend to VACUUM database \"%s\".",
2005-08-22 16:59:47 +00:00
Tom Lane
d0096a41fa Fix some inconsistent choices of datatypes in xlog.c. Make buffer
indexes all be int, rather than variously int, uint16 and uint32;
add some casts where necessary to support large buffer arrays.
2005-08-22 00:41:28 +00:00
Tom Lane
f39f6b500f Seems that the childXids list would be better based on Oid lists than
integer lists.
2005-08-20 23:45:08 +00:00
Tom Lane
0007490e09 Convert the arithmetic for shared memory size calculation from 'int'
to 'Size' (that is, size_t), and install overflow detection checks in it.
This allows us to remove the former arbitrary restrictions on NBuffers
etc.  It won't make any difference in a 32-bit machine, but in a 64-bit
machine you could theoretically have terabytes of shared buffers.
(How efficiently we could manage 'em remains to be seen.)  Similarly,
num_temp_buffers, work_mem, and maintenance_work_mem can be set above
2Gb on a 64-bit machine.  Original patch from Koichi Suzuki, additional
work by moi.
2005-08-20 23:26:37 +00:00
Tatsuo Ishii
ba2fc7eb4b Make GetMultiXactIdMembers() a public function. 2005-08-20 01:29:27 +00:00
Tom Lane
f8d0a82bf9 Avoid an Assert failure if OuterUserId hasn't been set yet during
AbortTransaction.  This can happen if a backend's InitPostgres transaction
fails (eg, because the given username is invalid).  Per Alvaro.
2005-08-17 22:14:34 +00:00
Tom Lane
721e53785d Solve the problem of OID collisions by probing for duplicate OIDs
whenever we generate a new OID.  This prevents occasional duplicate-OID
errors that can otherwise occur once the OID counter has wrapped around.
Duplicate relfilenode values are also checked for when creating new
physical files.  Per my recent proposal.
2005-08-12 01:36:05 +00:00
Tom Lane
d90c531188 Autovacuum loose end mop-up. Provide autovacuum-specific vacuum cost
delay and limit, both as global GUCs and as table-specific entries in
pg_autovacuum.  stats_reset_on_server_start is now OFF by default,
but a reset is forced if we did WAL replay.  XID-wrap vacuums do not
ANALYZE, but do FREEZE if it's a template database.  Alvaro Herrera
2005-08-11 21:11:50 +00:00
Tom Lane
4568e0f791 Modify AtEOXact_CatCache and AtEOXact_RelationCache to assume that the
ResourceOwner mechanism already released all reference counts for the
cache entries; therefore, we do not need to scan the catcache or relcache
at transaction end, unless we want to do it as a debugging crosscheck.
Do the crosscheck only in Assert mode.  This is the same logic we had
previously installed in AtEOXact_Buffers to avoid overhead with large
numbers of shared buffers.  I thought it'd be a good idea to do it here
too, in view of Kari Lavikka's recent report showing a real-world case
where AtEOXact_CatCache is taking a significant fraction of runtime.
2005-08-08 19:17:23 +00:00
Tom Lane
2a4fad1a0e Add NOWAIT option to SELECT FOR UPDATE/SHARE.
Original patch by Hans-Juergen Schoenig, revisions by Karel Zak
and Tom Lane.
2005-08-01 20:31:16 +00:00
Tom Lane
d42cf5a42a Add per-user and per-database connection limit options.
This patch also includes preliminary update of pg_dumpall for roles.
Petr Jelinek, with review by Bruce Momjian and Tom Lane.
2005-07-31 17:19:22 +00:00
Bruce Momjian
5b0bfec414 Fix compile for no O_SYNC, but introduced with O_DIRECT. 2005-07-30 14:15:44 +00:00
Tom Lane
5d5f1a79e6 Clean up a number of autovacuum loose ends. Make the stats collector
track shared relations in a separate hashtable, so that operations done
from different databases are counted correctly.  Add proper support for
anti-XID-wraparound vacuuming, even in databases that are never connected
to and so have no stats entries.  Miscellaneous other bug fixes.
Alvaro Herrera, some additional fixes by Tom Lane.
2005-07-29 19:30:09 +00:00
Bruce Momjian
c6b1724c67 Update O_DIRECT comment. 2005-07-29 03:25:53 +00:00
Bruce Momjian
c34bb00581 Use O_DIRECT if available when using O_SYNC for wal_sync_method.
Also, write multiple WAL buffers out in one write() operation.

ITAGAKI Takahiro

---------------------------------------------------------------------------

> If we disable writeback-cache and use open_sync, the per-page writing
> behavior in WAL module will show up as bad result. O_DIRECT is similar
> to O_DSYNC (at least on linux), so that the benefit of it will disappear
> behind the slow disk revolution.
>
> In the current source, WAL is written as:
>     for (i = 0; i < N; i++) { write(&buffers[i], BLCKSZ); }
> Is this intentional? Can we rewrite it as follows?
>    write(&buffers[0], N * BLCKSZ);
>
> In order to achieve it, I wrote a 'gather-write' patch (xlog.gw.diff).
> Aside from this, I'll also send the fixed direct io patch (xlog.dio.diff).
> These two patches are independent, so they can be applied either or both.
>
>
> I tested them on my machine and the results as follows. It shows that
> direct-io and gather-write is the best choice when writeback-cache is off.
> Are these two patches worth trying if they are used together?
>
>
>             | writeback | fsync= | fdata | open_ | fsync_ | open_
> patch       | cache     |  false |  sync |  sync | direct | direct
> ------------+-----------+--------+-------+-------+--------+---------
> direct io   | off       |  124.2 | 105.7 |  48.3 |   48.3 |  48.2
> direct io   | on        |  129.1 | 112.3 | 114.1 |  142.9 | 144.5
> gather-write| off       |  124.3 | 108.7 | 105.4 |  (N/A) | (N/A)
> both        | off       |  131.5 | 115.5 | 114.4 |  145.4 | 145.2
>
> - 20runs * pgbench -s 100 -c 50 -t 200
>    - with tuning (wal_buffers=64, commit_delay=500, checkpoint_segments=8)
> - using 2 ATA disks:
>    - hda(reiserfs) includes system and wal.
>    - hdc(jfs) includes database files. writeback-cache is always on.
>
> ---
> ITAGAKI Takahiro
2005-07-29 03:22:33 +00:00
Tom Lane
e5d6b91220 Add SET ROLE. This is a partial commit of Stephen Frost's recent patch;
I'm still working on the has_role function and information_schema changes.
2005-07-25 22:12:34 +00:00
Bruce Momjian
9af9d674c6 Remove unintended code addition. 2005-07-23 15:31:16 +00:00
Bruce Momjian
4098c8867d Macro alignment cleanup. 2005-07-23 15:29:47 +00:00
Tom Lane
f2bf2d2dc5 Fix a couple of bogus comments, per Alvaro. 2005-07-13 22:46:09 +00:00
Tom Lane
d7207cfc6b Even though I'd like to see full_page_writes go away before 8.1,
a minimum requirement is that it not completely break the system
meanwhile.  Put the test in the right place.
2005-07-08 04:07:26 +00:00
Bruce Momjian
326a7a0788 Add GUC full_page_writes to control writing full pages to WAL. 2005-07-05 23:18:10 +00:00
Tom Lane
eb5949d190 Arrange for the postmaster (and standalone backends, initdb, etc) to
chdir into PGDATA and subsequently use relative paths instead of absolute
paths to access all files under PGDATA.  This seems to give a small
performance improvement, and it should make the system more robust
against naive DBAs doing things like moving a database directory that
has a live postmaster in it.  Per recent discussion.
2005-07-04 04:51:52 +00:00
Tom Lane
401de9c8be Improve the checkpoint signaling mechanism so that the bgwriter can tell
the difference between checkpoints forced due to WAL segment consumption
and checkpoints forced for other reasons (such as CREATE DATABASE).  Avoid
generating 'checkpoints are occurring too frequently' messages when the
checkpoint wasn't caused by WAL segment consumption.  Per gripe from
Chris K-L.
2005-06-30 00:00:52 +00:00
Tom Lane
b5f7cff84f Clean up the rather historically encumbered interface to now() and
current time: provide a GetCurrentTimestamp() function that returns
current time in the form of a TimestampTz, instead of separate time_t
and microseconds fields.  This is what all the callers really want
anyway, and it eliminates low-level dependencies on AbsoluteTime,
which is a deprecated datatype that will have to disappear eventually.
2005-06-29 22:51:57 +00:00
Tom Lane
7762619e95 Replace pg_shadow and pg_group by new role-capable catalogs pg_authid
and pg_auth_members.  There are still many loose ends to finish in this
patch (no documentation, no regression tests, no pg_dump support for
instance).  But I'm going to commit it now anyway so that Alvaro can
make some progress on shared dependencies.  The catalog changes should
be pretty much done.
2005-06-28 05:09:14 +00:00
Tom Lane
fc654583ab Need #include <time.h> on some platforms. 2005-06-19 22:34:56 +00:00
Tom Lane
3f749924f8 Simplify uses of readdir() by creating a function ReadDir() that
includes error checking and an appropriate ereport(ERROR) message.
This gets rid of rather tedious and error-prone manipulation of errno,
as well as a Windows-specific bug workaround, at more than a dozen
call sites.  After an idea in a recent patch by Heikki Linnakangas.
2005-06-19 21:34:03 +00:00
Tom Lane
e26b0abda3 Arrange to fsync two-phase-commit state files only during checkpoints;
given reasonably short lifespans for prepared transactions, this should
mean that only a small minority of state files ever need to be fsynced
at all.  Per discussion with Heikki Linnakangas.
2005-06-19 20:00:39 +00:00
Tom Lane
a8d1075f27 Add a time-of-preparation column to the pg_prepared_xacts view, per an
old suggestion by Oliver Jowett.  Also, add a transaction column to the
pg_locks view to show the xid of each transaction holding or awaiting
locks; this allows prepared transactions to be properly associated with
the locks they own.  There was already a column named 'transaction',
and I chose to rename it to 'transactionid' --- since this column is
new in the current devel cycle there should be no backwards compatibility
issue to worry about.
2005-06-18 19:33:42 +00:00
Tom Lane
66b098492e Dept. of second thoughts: regular COMMIT deletes deletable files before
releasing locks, so COMMIT PREPARED should too.
2005-06-18 05:21:09 +00:00
Tom Lane
d0a89683a3 Two-phase commit. Original patch by Heikki Linnakangas, with additional
hacking by Alvaro Herrera and Tom Lane.
2005-06-17 22:32:51 +00:00
Bruce Momjian
f4d907ca85 Remove old *.backup files when we do pg_stop_backup(). This
prevents a large number of *.backup files from existing in pg_xlog/
2005-06-15 01:36:08 +00:00
Teodor Sigaev
37c839365c WAL for GiST. It work for online backup and so on, but on
recovery after crash (power loss etc) it may say that it can't restore
index and index should be reindexed.

Some refactoring code.
2005-06-14 11:45:14 +00:00
Bruce Momjian
51746c4549 Free buffer allocated via malloc (process is short-lived, but fix it anyway). 2005-06-09 22:36:27 +00:00
Tom Lane
f5b2f60bd1 Change WAL-logging scheme for multixacts to be more like regular
transaction IDs, rather than like subtrans; in particular, the information
now survives a database restart.  Per previous discussion, this is
essential for PITR log shipping and for 2PC.
2005-06-08 15:50:28 +00:00
Tom Lane
ee7ac7b11e Modify XLogInsert API to make callers specify whether pages to be backed
up have the standard layout with unused space between pd_lower and pd_upper.
When this is set, XLogInsert will omit the unused space without bothering
to scan it to see if it's zero.  That saves time in XLogInsert, and also
allows reversion of my earlier patch to make PageRepairFragmentation et al
explicitly re-zero freed space.  Per suggestion by Heikki Linnakangas.
2005-06-06 20:22:58 +00:00
Tom Lane
4c8495a1f2 Remove the mostly-stubbed-out-anyway support routines for WAL UNDO.
That code is never going to be used in the foreseeable future, and
where it's more than a stub it's making the redo routines harder to
read.
2005-06-06 17:01:25 +00:00
Tom Lane
21fda22ec4 Change CRCs in WAL records from 64bit to 32bit for performance reasons.
Instead of a separate CRC on each backup block, include backup blocks
in their parent WAL record's CRC; this is important to ensure that the
backup block really goes with the WAL record, ie there was not a page
tear right at the start of the backup block.  Implement a simple form
of compression of backup blocks: drop any run of zeroes starting at
pd_lower, so as not to store the unused 'hole' that commonly exists in
PG heap and index pages.  Tweak PageRepairFragmentation and related
routines to ensure they keep the unused space zeroed, so that the above
compression method remains effective.  All per recent discussions.
2005-06-02 05:55:29 +00:00
Tom Lane
a91fa39028 Add test to WAL replay to verify that xl_prev points back to the previous
WAL record; this is necessary to be sure we recognize stale WAL records
when a WAL page was only partially written during a system crash.
2005-05-31 19:10:28 +00:00
Tom Lane
e92a88272e Modify hash_search() API to prevent future occurrences of the error
spotted by Qingqing Zhou.  The HASH_ENTER action now automatically
fails with elog(ERROR) on out-of-memory --- which incidentally lets
us eliminate duplicate error checks in quite a bunch of places.  If
you really need the old return-NULL-on-out-of-memory behavior, you
can ask for HASH_ENTER_NULL.  But there is now an Assert in that path
checking that you aren't hoping to get that behavior in a palloc-based
hash table.
Along the way, remove the old HASH_FIND_SAVE/HASH_REMOVE_SAVED actions,
which were not being used anywhere anymore, and were surely too ugly
and unsafe to want to see revived again.
2005-05-29 04:23:07 +00:00
Bruce Momjian
6dc7760ac3 Add support for wal_fsync_writethrough for Darwin, and restructure the
code to better handle writethrough.

Chris Campbell
2005-05-20 14:53:26 +00:00
Tom Lane
ff0c143a3b Make a comment pgindent-proof, per suggestion from Alvaro. 2005-05-19 23:58:51 +00:00
Tom Lane
ee3b71f6bc Split the shared-memory array of PGPROC pointers out of the sinval
communication structure, and make it its own module with its own lock.
This should reduce contention at least a little, and it definitely makes
the code seem cleaner.  Per my recent proposal.
2005-05-19 21:35:48 +00:00
Neil Conway
c891e05f26 Cleanup GiST header files. Since GiST extensions are often written as
external projects, we should be careful about what parts of the GiST
API are considered implementation details, and which are part of the
public API. Therefore, I've moved internal-only declarations into
gist_private.h -- future backward-incompatible changes to gist.h should
be made with care, to avoid needlessly breaking external GiST extensions.

Also did some related header cleanup: remove some unnecessary #includes
from gist.h, and remove some unused definitions: isAttByVal(), _gistdump(),
and GISTNStrategies.
2005-05-17 03:34:18 +00:00
Bruce Momjian
35e1651508 Back out check for unreferenced files.
Heikki Linnakangas
2005-05-10 22:27:30 +00:00
Tom Lane
4cd4ed0cc2 Fix case in which a debug printout would print already-pfreed data. 2005-05-07 18:14:25 +00:00
Tom Lane
126eaef651 Clean up MultiXactIdExpand's API by separating out the case where we
are creating a new MultiXactId from two regular XIDs.  The original
coding was unnecessarily complicated and didn't save any code anyway.
2005-05-03 19:42:41 +00:00
Bruce Momjian
76668e6eb4 Check the file system on postmaster startup and report any unreferenced
files in the server log.

Heikki Linnakangas
2005-05-02 18:26:54 +00:00
Tom Lane
bedb78d386 Implement sharable row-level locks, and use them for foreign key references
to eliminate unnecessary deadlocks.  This commit adds SELECT ... FOR SHARE
paralleling SELECT ... FOR UPDATE.  The implementation uses a new SLRU
data structure (managed much like pg_subtrans) to represent multiple-
transaction-ID sets.  When more than one transaction is holding a shared
lock on a particular row, we create a MultiXactId representing that set
of transactions and store its ID in the row's XMAX.  This scheme allows
an effectively unlimited number of row locks, just as we did before,
while not costing any extra overhead except when a shared lock actually
has to be shared.   Still TODO: use the regular lock manager to control
the grant order when multiple backends are waiting for a row lock.

Alvaro Herrera and Tom Lane.
2005-04-28 21:47:18 +00:00
Tom Lane
19d127548c Add comment about checkpoint panic behavior during shutdown, per
suggestion from Qingqing Zhou.
2005-04-23 18:49:54 +00:00
Bruce Momjian
1a6ad669fb Fix comment typo. 2005-04-17 03:04:29 +00:00
Tom Lane
5f0a974ea9 Reduce PANIC to ERROR in several xlog routines that are used in both
critical and noncritical contexts (an example of noncritical being
post-checkpoint removal of dead xlog segments).  In the critical cases
the CRIT_SECTION mechanism will cause ERROR to be promoted to PANIC
anyway, and in the noncritical cases we shouldn't let an error take
down the entire database.  Arguably there should be *no* explicit
PANIC errors in this module, only more START/END_CRIT_SECTION calls,
but I didn't go that far.  (Yet.)
2005-04-15 22:19:48 +00:00
Tom Lane
61b861421b Modify MoveOfflineLogs/InstallXLogFileSegment to avoid O(N^2) behavior
when recycling a large number of xlog segments during checkpoint.
The former behavior searched from the same start point each time,
requiring O(checkpoint_segments^2) stat() calls to relocate all the
segments.  Instead keep track of where we stopped last time through.
2005-04-15 18:48:10 +00:00
Tom Lane
2193a856a2 Simplify initdb-time assignment of OIDs as I proposed yesterday, and
avoid encroaching on the 'user' range of OIDs by allowing automatic
OID assignment to use values below 16k until we reach normal operation.

initdb not forced since this doesn't make any incompatible change;
however a lot of stuff will have different OIDs after your next initdb.
2005-04-13 18:54:57 +00:00
Tom Lane
c3294f1cbf Fix interaction between materializing holdable cursors and firing
deferred triggers: either one can create more work for the other,
so we have to loop till it's all gone.  Per example from andrew@supernews.
Add a regression test to help spot trouble in this area in future.
2005-04-11 19:51:16 +00:00
Tom Lane
8c85a34a3b Officially decouple FUNC_MAX_ARGS from INDEX_MAX_KEYS, and set the
former to 100 by default.  Clean up some of the less necessary
dependencies on FUNC_MAX_ARGS; however, the biggie (FunctionCallInfoData)
remains.
2005-03-29 03:01:32 +00:00
Tom Lane
119191609c Remove dead push/pop rollback code. Vadim once planned to implement
transaction rollback via UNDO but I think that's highly unlikely to
happen, so we may as well remove the stubs.  (Someday we ought to
rip out the stub xxx_undo routines, too.)  Per Alvaro.
2005-03-28 01:50:34 +00:00
Bruce Momjian
b1f57d88f5 Change Win32 O_SYNC method to O_DSYNC because that is what the method
currently does.  This is now the default Win32 wal sync method because
we perfer o_datasync to fsync.

Also, change Win32 fsync to a new wal sync method called
fsync_writethrough because that is the behavior of _commit, which is
what is used for fsync on Win32.

Backpatch to 8.0.X.
2005-03-24 04:36:20 +00:00
Tom Lane
4aefe75553 Remove some no-longer-needed kluges for bootstrapping, in particular
the AMI_OVERRIDE flag.  The fact that TransactionLogFetch treats
BootstrapTransactionId as always committed is sufficient to make
bootstrap work, and getting rid of extra tests in heavily used code
paths seems like a win.  The files produced by initdb are demonstrably
the same after this change.
2005-02-20 21:46:50 +00:00
Tom Lane
60b2444cc3 Add code to prevent transaction ID wraparound by enforcing a safe limit
in GetNewTransactionId().  Since the limit value has to be computed
before we run any real transactions, this requires adding code to database
startup to scan pg_database and determine the oldest datfrozenxid.
This can conveniently be combined with the first stage of an attack on
the problem that the 'flat file' copies of pg_shadow and pg_group are
not properly updated during WAL recovery.  The code I've added to
startup resides in a new file src/backend/utils/init/flatfiles.c, and
it is responsible for rewriting the flat files as well as initializing
the XID wraparound limit value.  This will eventually allow us to get
rid of GetRawDatabaseInfo too, but we'll need an initdb so we can add
a trigger to pg_database.
2005-02-20 02:22:07 +00:00
Bruce Momjian
7c44e57331 Move plpgsql DEBUG from DEBUG2 to DEBUG1 because it is a user-requested
DEBUG.

Fix a few places where DEBUG1 crept in that should have been DEBUG2.
2005-02-12 23:53:42 +00:00
Tom Lane
0ce4d56924 Phase 1 of fix for 'SMgrRelation hashtable corrupted' problem. This
is the minimum required fix.  I want to look next at taking advantage of
it by simplifying the message semantics in the shared inval message queue,
but that part can be held over for 8.1 if it turns out too ugly.
2005-01-10 20:02:24 +00:00
Bruce Momjian
2daed8c5b3 Update copyrights that were missed. 2005-01-01 05:43:09 +00:00
PostgreSQL Daemon
2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Tom Lane
bfa5f30481 Awhile back I added some code to StartupCLOG() to forcibly zero out
the remainder of the current clog page during system startup.  While
this was a good idea, it turns out the code fails if nextXid is
exactly at a page boundary, because we won't have created the "current"
clog page yet in that case.  Since the page will be correctly zeroed
when we execute the first transaction on it, the solution is just to
do nothing when exactly at a page boundary.  Per trouble report from
Dave Hartwig.
2004-12-22 18:45:49 +00:00
Tom Lane
ff5a354ece Fix is-it-time-for-a-checkpoint logic so that checkpoint_segments can
usefully be larger than 255.  Per gripe from Simon Riggs.
2004-12-17 00:10:36 +00:00
Tom Lane
37d693033d Minor adjustment of message style. 2004-11-17 16:26:59 +00:00
Neil Conway
b25d23e1e6 Don't allow pg_start_backup() to be invoked if archive_command has not
been defined. Patch from Gavin Sherry, editorializing by Neil Conway.
2004-11-17 02:22:54 +00:00
Peter Eisentraut
0ed3c7665e Small message clarifications 2004-11-05 17:11:34 +00:00
Tom Lane
88868d4fbc Change COMMIT back to the old behavior of emitting command tag COMMIT,
not ROLLBACK, for the case of COMMIT outside a transaction block.
Alvaro Herrera
2004-10-30 20:44:43 +00:00
Tom Lane
23f264d125 Rearrange order of pre-commit operations: must close cursors before doing
ON COMMIT actions.  Per bug report from Michael Guerin.
2004-10-29 22:19:53 +00:00
Tom Lane
ee69be44d5 Add DEBUG1-level logging of checkpoint start and end. Also, reduce the
'recycled log files' and 'removed log files' messages from DEBUG1 to
DEBUG2, replacing them with a count of files added/removed/recycled in
the checkpoint end message, as per suggestion from Simon Riggs.
2004-10-29 00:16:08 +00:00
Tom Lane
fdd13f1568 Give the ResourceOwner mechanism full responsibility for releasing buffer
pins at end of transaction, and reduce AtEOXact_Buffers to an Assert
cross-check that this was done correctly.  When not USE_ASSERT_CHECKING,
AtEOXact_Buffers is a complete no-op.  This gets rid of an O(NBuffers)
bottleneck during transaction commit/abort, which recent testing has shown
becomes significant above a few tens of thousands of shared buffers.
2004-10-16 18:57:26 +00:00
Bruce Momjian
5c267325ec Add 'int' cast for getpid() because some Solaris releases return long
for getpid().
2004-10-14 20:23:46 +00:00
Peter Eisentraut
0fd37839d9 Message style revisions 2004-10-12 21:54:45 +00:00
Bruce Momjian
67608a393b Make getpid() use %d consistently for printing. 2004-10-09 02:46:42 +00:00
Bruce Momjian
a5d7ba773d Adjust comments previously moved to column 1 by pgident. 2004-10-07 15:21:58 +00:00
Tom Lane
4c77cbb272 PortalRun must guard against the possibility that the portal it's
running contains VACUUM or a similar command that will internally start
and commit transactions.  In such a case, the original caller values of
CurrentMemoryContext and CurrentResourceOwner will point to objects that
will be destroyed by the internal commit.  We must restore these pointers
to point to the newly-manufactured transaction context and resource owner,
rather than possibly pointing to deleted memory.
Also tweak xact.c so that AbortTransaction and AbortSubTransaction
forcibly restore a sane value for CurrentResourceOwner, much as they
have always done for CurrentMemoryContext.  I'm not certain this is
necessary but I'm feeling paranoid today.
Responds to Sean Chittenden's bug report of 4-Oct.
2004-10-04 21:52:15 +00:00
Tom Lane
257cccbe5e Add some marginal tweaks to eliminate memory leakages associated with
subtransactions.  Trivial subxacts (such as a plpgsql exception block
containing no database access) now demonstrably leak zero bytes.
2004-09-16 20:17:49 +00:00
Tom Lane
86fff990b2 RecentXmin is too recent to use as the cutoff point for accessing
pg_subtrans --- what we need is the oldest xmin of any snapshot in use
in the current top transaction.  Introduce a new variable TransactionXmin
to play this role.  Fixes intermittent regression failure reported by
Neil Conway.
2004-09-16 18:35:23 +00:00
Tom Lane
8f9f198603 Restructure subtransaction handling to reduce resource consumption,
as per recent discussions.  Invent SubTransactionIds that are managed like
CommandIds (ie, counter is reset at start of each top transaction), and
use these instead of TransactionIds to keep track of subtransaction status
in those modules that need it.  This means that a subtransaction does not
need an XID unless it actually inserts/modifies rows in the database.
Accordingly, don't assign it an XID nor take a lock on the XID until it
tries to do that.  This saves a lot of overhead for subtransactions that
are only used for error recovery (eg plpgsql exceptions).  Also, arrange
to release a subtransaction's XID lock as soon as the subtransaction
exits, in both the commit and abort cases.  This avoids holding many
unique locks after a long series of subtransactions.  The price is some
additional overhead in XactLockTableWait, but that seems acceptable.
Finally, restructure the state machine in xact.c to have a more orthogonal
set of states for subtransactions.
2004-09-16 16:58:44 +00:00
Tom Lane
b2c4071299 Redesign query-snapshot timing so that volatile functions in READ COMMITTED
mode see a fresh snapshot for each command in the function, rather than
using the latest interactive command's snapshot.  Also, suppress fresh
snapshots as well as CommandCounterIncrement inside STABLE and IMMUTABLE
functions, instead using the snapshot taken for the most closely nested
regular query.  (This behavior is only sane for read-only functions, so
the patch also enforces that such functions contain only SELECT commands.)
As per my proposal of 6-Sep-2004; I note that I floated essentially the
same proposal on 19-Jun-2002, but that discussion tailed off without any
action.  Since 8.0 seems like the right place to be taking possibly
nontrivial backwards compatibility hits, let's get it done now.
2004-09-13 20:10:13 +00:00
Tom Lane
b339d1fff6 Fire non-deferred AFTER triggers immediately upon query completion,
rather than when returning to the idle loop.  This makes no particular
difference for interactively-issued queries, but it makes a big difference
for queries issued within functions: trigger execution now occurs before
the calling function is allowed to proceed.  This responds to numerous
complaints about nonintuitive behavior of foreign key checking, such as
http://archives.postgresql.org/pgsql-bugs/2004-09/msg00020.php, and
appears to be required by the SQL99 spec.
Also take the opportunity to simplify the data structures used for the
pending-trigger list, rename them for more clarity, and squeeze out a
bit of space.
2004-09-10 18:40:09 +00:00
Tom Lane
23645f0582 Fix incorrect ordering of smgr cleanup relative to buffer pin cleanup
during transaction abort.  Add a regression test case to catch related
mistakes in future.  Alvaro Herrera and Tom Lane.
2004-09-06 17:56:33 +00:00
Tom Lane
e32bba202d Downgrade LOG messages to DEBUG1 for normal recycling of xlog, clog,
subtrans segments.  Per Greg Mullane and Chris K-L.
2004-09-06 03:04:27 +00:00
Tom Lane
64cb889106 Ensure that the remainder of the current pg_clog page is zeroed during
startup, just to be sure that there's no leftover junk there.
2004-08-30 19:00:42 +00:00
Tom Lane
9cf4eaa00b Fix failure to advance nextXID beyond subtransactions whose XIDs appear
only within COMMIT or ABORT records.
2004-08-30 19:00:03 +00:00
Bruce Momjian
15d3f9f6b7 Another pgindent run with lib typedefs added. 2004-08-30 02:54:42 +00:00
Tom Lane
50742aed68 Add WAL logging for CREATE/DROP DATABASE and CREATE/DROP TABLESPACE.
Fix TablespaceCreateDbspace() to be able to create a dummy directory
in place of a dropped tablespace's symlink.  This eliminates the open
problem of a PANIC during WAL replay when a replayed action attempts
to touch a file in a since-deleted tablespace.  It also makes for a
significant improvement in the usability of PITR replay.
2004-08-29 21:08:48 +00:00
Tom Lane
0ffe11abd3 Widen xl_len field of XLogRecord header to 32 bits, so that we'll have
a more tolerable limit on the number of subtransactions or deleted files
in COMMIT and ABORT records.  Buy back the extra space by eliminating the
xl_xact_prev field, which isn't being used for anything and is rather
unlikely ever to be used for anything.
This does not force initdb, but you do need to do pg_resetxlog if you
want to upgrade an existing 8.0 installation without initdb.
2004-08-29 16:34:48 +00:00
Bruce Momjian
b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian
da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane
f78ecbf20e Now that TransactionIdDidAbort doesn't think it should try to modify
pg_clog, there's no reason to do abort marking of subtransactions in a
nonintuitive order.
2004-08-28 22:04:12 +00:00
Tom Lane
7531d2fd85 Add missing Assert to make TransactionIdDidAbort more consistent with
TransactionIdDidCommit.
2004-08-28 21:58:59 +00:00
Tom Lane
1c72d0dec1 Fix relcache to account properly for subtransaction status of 'new'
relcache entries.  Also, change TransactionIdIsCurrentTransactionId()
so that if consulted during transaction abort, it will not say that
the aborted xact is still current.  (It would be better to ensure that
it's never called at all during abort, but I'm not sure we can easily
guarantee that.)  In combination, these fix a crash we have seen
occasionally during parallel regression tests of 8.0.
2004-08-28 20:31:44 +00:00
Tom Lane
f444dafab0 Can't truncate pg_subtrans during a recovery checkpoint --- subtrans
module isn't fully initialized yet.
2004-08-28 18:18:03 +00:00
Tom Lane
fe455ee1d4 Revise ResourceOwner code to avoid accumulating ResourceOwner objects
for every command executed within a transaction.  For long transactions
this was a significant memory leak.  Instead, we can delete a portal's
or subtransaction's ResourceOwner immediately, if we physically transfer
the information about its locks up to the parent owner.  This does not
fully solve the leak problem; we need to do something about counting
multiple acquisitions of the same lock in order to fix it.  But it's a
necessary step along the way.
2004-08-25 18:43:43 +00:00
Tom Lane
4dbb880d3c Rearrange pg_subtrans handling as per recent discussion. pg_subtrans
updates are no longer WAL-logged nor even fsync'd; we do not need to,
since after a crash no old pg_subtrans data is needed again.  We truncate
pg_subtrans to RecentGlobalXmin at each checkpoint.  slru.c's API is
refactored a little bit to separate out the necessary decisions.
2004-08-23 23:22:45 +00:00
Tom Lane
f009c316ba Tweak code so that pg_subtrans is never consulted for XIDs older than
RecentXmin (== MyProc->xmin).  This ensures that it will be safe to
truncate pg_subtrans at RecentGlobalXmin, which should largely eliminate
any fear of bloat.  Along the way, eliminate SubTransXidsHaveCommonAncestor,
which isn't really needed and could not give a trustworthy result anyway
under the lookback restriction.
In an unrelated but nearby change, #ifdef out GetUndoRecPtr, which has
been dead code since 2001 and seems unlikely to ever be resurrected.
2004-08-22 02:41:58 +00:00
Bruce Momjian
10249abfa1 Cleanup Win32 COPY handling, and move archive examples to SGML. 2004-08-12 19:03:44 +00:00
Bruce Momjian
43ea65a0dc Add mention of "WIN32" COPY. 2004-08-12 18:34:45 +00:00
Bruce Momjian
6525b42b10 Add make_native_path() because Win32 COPY is an internal CMD.EXE command
and doesn't process forward slashes in the same way as external
commands.  Quoting the first argument to COPY does not convert forward
to backward slashes, but COPY does properly process quoted forward
slashes in the second argument.

Win32 COPY works with quoted forward slashes in the first argument only if the
current directory is the same as the directory of the first argument.
2004-08-12 18:32:52 +00:00
Tom Lane
3fdf649f4f Fix failure to guarantee that a checkpoint will write out pg_clog updates
for transaction commits that occurred just before the checkpoint.  This is
an EXTREMELY serious bug --- kudos to Satoshi Okada for creating a
reproducible test case to prove its existence.
2004-08-11 04:07:16 +00:00
Tom Lane
35f539b481 When expanding %p in archive_command or restore_command, translate
slashes to backslashes #ifdef WIN32.  This is to cope with the fact
that Windows seems exceedingly unfriendly to slashes in shell commands,
as per recent discussion.
2004-08-09 16:26:06 +00:00
Tom Lane
7dca975c5d Add a comment about why we always replay backup blocks from WAL. 2004-08-08 03:22:08 +00:00
Tom Lane
fcbc438727 Label CVS tip as 8.0devel instead of 7.5devel. Adjust various comments
and documentation to reference 8.0 instead of 7.5.
2004-08-04 21:34:35 +00:00
Tom Lane
b387d16f96 Make use of backup label/history files to control recovery properly. 2004-08-04 16:25:02 +00:00
Tom Lane
58c41712d5 Add functions pg_start_backup, pg_stop_backup to create backup label
and history files as per recent discussion.  While at it, remove
pg_terminate_backend, since we have decided we do not have time during
this release cycle to address the reliability concerns it creates.
Split the 'Miscellaneous Functions' documentation section into
'System Information Functions' and 'System Administration Functions',
which hopefully will draw the eyes of those looking for such things.
2004-08-03 20:32:36 +00:00
Tom Lane
a83c45c4c6 Fix misplacement of savepointLevel test, per report from Chris K-L. 2004-08-03 15:57:26 +00:00
Tom Lane
410b1dfb88 Update the in-code documentation about the transaction system. Move it
into a README file instead of being in xact.c's header comment.
Alvaro Herrera.
2004-08-01 20:57:59 +00:00
Tom Lane
5cc380f9a3 Error message style adjustments, per Alvaro Herrera. 2004-08-01 17:45:43 +00:00
Tom Lane
efcaf1e868 Some mop-up work for savepoints (nested transactions). Store a small
number of active subtransaction XIDs in each backend's PGPROC entry,
and use this to avoid expensive probes into pg_subtrans during
TransactionIdIsInProgress.  Extend EOXactCallback API to allow add-on
modules to get control at subxact start/end.  (This is deliberately
not compatible with the former API, since any uses of that API probably
need manual review anyway.)  Add basic reference documentation for
SAVEPOINT and related commands.  Minor other cleanups to check off some
of the open issues for subtransactions.
Alvaro Herrera and Tom Lane.
2004-08-01 17:32:22 +00:00
Tom Lane
beda4814c1 plpgsql does exceptions.
There are still some things that need refinement; in particular I fear
that the recognized set of error condition names probably has little in
common with what Oracle recognizes.  But it's a start.
2004-07-31 07:39:21 +00:00
Tom Lane
1bf3d61504 Fix subtransaction behavior for large objects, temp namespace, files,
password/group files.  Also allow read-only subtransactions of a read-write
parent, but not vice versa.  These are the reasonably noncontroversial
parts of Alvaro's recent mop-up patch, plus further work on large objects
to minimize use of the TopTransactionResourceOwner.
2004-07-28 14:23:31 +00:00
Tom Lane
cc813fc2b8 Replace nested-BEGIN syntax for subtransactions with spec-compliant
SAVEPOINT/RELEASE/ROLLBACK-TO syntax.  (Alvaro)
Cause COMMIT of a failed transaction to report ROLLBACK instead of
COMMIT in its command tag.  (Tom)
Fix a few loose ends in the nested-transactions stuff.
2004-07-27 05:11:48 +00:00
Tom Lane
acd907bfcc Add cross-check that current timeline of pg_control is an ancestor of
recovery_target_timeline --- otherwise there is no path from the backup
to the requested timeline.  This check was foreseen in the original
discussion but I forgot to implement it.
2004-07-22 21:09:37 +00:00
Tom Lane
3dba9cb694 Add a check on file size as an additional safety check that a WAL file
recovered from archive is not corrupt.  It's not much but it will catch
one common problem, viz out-of-disk-space.
Also, force a WAL recovery scan when recovery.conf is present, even if
pg_control shows a clean shutdown.  This allows recovery with a tar backup
that was taken with the postmaster shut down, as per complaint from
Mark Kirkwood.
2004-07-22 20:18:40 +00:00
Tom Lane
2042b3428d Invent WAL timelines, as per recent discussion, to make point-in-time
recovery more manageable.  Also, undo recent change to add FILE_HEADER
and WASTED_SPACE records to XLOG; instead make the XLOG page header
variable-size with extra fields in the first page of an XLOG file.
This should fix the boundary-case bugs observed by Mark Kirkwood.
initdb forced due to change of XLOG representation.
2004-07-21 22:31:26 +00:00
Tom Lane
9c7a765f02 Remove unportable use of strptime() to parse recovery target time spec.
Instead use our own abstimein code, which is more flexible anyway.
2004-07-19 14:34:39 +00:00
Tom Lane
66ec2db728 XLOG file archiving and point-in-time recovery. There are still some
loose ends and a glaring lack of documentation, but it basically works.

Simon Riggs with some editorialization by Tom Lane.
2004-07-19 02:47:16 +00:00
Tom Lane
fe548629c5 Invent ResourceOwner mechanism as per my recent proposal, and use it to
keep track of portal-related resources separately from transaction-related
resources.  This allows cursors to work in a somewhat sane fashion with
nested transactions.  For now, cursor behavior is non-subtransactional,
that is a cursor's state does not roll back if you abort a subtransaction
that fetched from the cursor.  We might want to change that later.
2004-07-17 03:32:14 +00:00
Tom Lane
f5c798ee82 Fix no-longer-correct bit-pushing in TransactionIdSetStatus, per Alvaro. 2004-07-03 02:55:56 +00:00
Tom Lane
b6197fe069 Further review of xact.c state machine for nested transactions. Fix
problems with starting subtransactions inside already-failed transactions.
Clean up some comments.
2004-07-01 20:11:03 +00:00
Tom Lane
573a71a5da Nested transactions. There is still much left to do, especially on the
performance front, but with feature freeze upon us I think it's time to
drive a stake in the ground and say that this will be in 7.5.

Alvaro Herrera, with some help from Tom Lane.
2004-07-01 00:52:04 +00:00
Tom Lane
2467394ee1 Tablespaces. Alternate database locations are dead, long live tablespaces.
There are various things left to do: contrib dbsize and oid2name modules
need work, and so does the documentation.  Also someone should think about
COMMENT ON TABLESPACE and maybe RENAME TABLESPACE.  Also initlocation is
dead, it just doesn't know it yet.

Gavin Sherry and Tom Lane.
2004-06-18 06:14:31 +00:00
Tom Lane
921d749bd4 Adjust our timezone library to use pg_time_t (typedef'd as int64) in
place of time_t, as per prior discussion.  The behavior does not change
on machines without a 64-bit-int type, but on machines with one, which
is most, we are rid of the bizarre boundary behavior at the edges of
the 32-bit-time_t range (1901 and 2038).  The system will now treat
times over the full supported timestamp range as being in your local
time zone.  It may seem a little bizarre to consider that times in
4000 BC are PST or EST, but this is surely at least as reasonable as
propagating Gregorian calendar rules back that far.

I did not modify the format of the zic timezone database files, which
means that for the moment the system will not know about daylight-savings
periods outside the range 1901-2038.  Given the way the files are set up,
it's not a simple decision like 'widen to 64 bits'; we have to actually
think about the range of years that need to be supported.  We should
probably inquire what the plans of the upstream zic people are before
making any decisions of our own.
2004-06-03 02:08:07 +00:00
Tom Lane
9b178555fc Per previous discussions, get rid of use of sync(2) in favor of
explicitly fsync'ing every (non-temp) file we have written since the
last checkpoint.  In the vast majority of cases, the burden of the
fsyncs should fall on the bgwriter process not on backends.  (To this
end, we assume that an fsync issued by the bgwriter will force out
blocks written to the same file by other processes using other file
descriptors.  Anyone have a problem with that?)  This makes the world
safe for WIN32, which ain't even got sync(2), and really makes the world
safe for Unixen as well, because sync(2) never had the semantics we need:
it offers no way to wait for the requested I/O to finish.

Along the way, fix a bug I recently introduced in xlog recovery:
file truncation replay failed to clear bufmgr buffers for the dropped
blocks, which could result in 'PANIC:  heap_delete_redo: no block'
later on in xlog replay.
2004-05-31 03:48:10 +00:00
Tom Lane
076a055acf Separate out bgwriter code into a logically separate module, rather
than being random pieces of other files.  Give bgwriter responsibility
for all checkpoint activity (other than a post-recovery checkpoint);
so this child process absorbs the functionality of the former transient
checkpoint and shutdown subprocesses.  While at it, create an actual
include file for postmaster.c, which for some reason never had its own
file before.
2004-05-29 22:48:23 +00:00
Tom Lane
1a321f26d8 Code review for EXEC_BACKEND changes. Reduce the number of #ifdefs by
about a third, make it work on non-Windows platforms again.  (But perhaps
I broke the WIN32 code, since I have no way to test that.)  Fold all the
paths that fork postmaster child processes to go through the single
routine SubPostmasterMain, which takes care of resurrecting the state that
would normally be inherited from the postmaster (including GUC variables).
Clean up some places where there's no particularly good reason for the
EXEC and non-EXEC cases to work differently.  Take care of one or two
FIXMEs that remained in the code.
2004-05-28 05:13:32 +00:00
Tom Lane
16974ee910 Get rid of the former rather baroque mechanism for propagating the values
of ThisStartUpID and RedoRecPtr into new backends.  It's a lot easier just
to make them all grab the values out of shared memory during startup.
This helps to decouple the postmaster from checkpoint execution, which I
need since I'm intending to let the bgwriter do it instead, and it also
fixes a bug in the Win32 port: ThisStartUpID wasn't getting propagated at
all AFAICS.  (Doesn't give me a lot of faith in the amount of testing that
port has gotten.)
2004-05-27 17:12:57 +00:00
Tom Lane
4d86ae4260 For multi-table ANALYZE, use per-table transactions when possible
(ie, when not inside a transaction block), so that we can avoid holding
locks longer than necessary.  Per trouble report from Philip Warner.
2004-05-22 23:14:38 +00:00
Tom Lane
e6319d1d28 Put back #include <sys/time.h> in files that seem to need it on Linux. 2004-05-21 16:08:47 +00:00
Tom Lane
63bd0db121 Integrate src/timezone library for all platforms. There is more we can
and should do now that we control our own destiny for timezone handling,
but this commit gets the bulk of the picayune diffs in place.
Magnus Hagander and Tom Lane.
2004-05-21 05:08:06 +00:00
Tom Lane
0bd61548ab Solve the 'Turkish problem' with undesirable locale behavior for case
conversion of basic ASCII letters.  Remove all uses of strcasecmp and
strncasecmp in favor of new functions pg_strcasecmp and pg_strncasecmp;
remove most but not all direct uses of toupper and tolower in favor of
pg_toupper and pg_tolower.  These functions use the same notions of
case folding already developed for identifier case conversion.  I left
the straight locale-based folding in place for situations where we are
just manipulating user data and not trying to match it to built-in
strings --- for example, the SQL upper() function is still locale
dependent.  Perhaps this will prove not to be what's wanted, but at
the moment we can initdb and pass regression tests in Turkish locale.
2004-05-07 00:24:59 +00:00
Bruce Momjian
31338352bd * Most changes are to fix warnings issued when compiling win32
* removed a few redundant defines
* get_user_name safe under win32
* rationalized pipe read EOF for win32 (UPDATED PATCH USED)
* changed all backend instances of sleep() to pg_usleep

    - except for the SLEEP_ON_ASSERT in assert.c, as it would exceed a
32-bit long [Note to patcher: If a SLEEP_ON_ASSERT of 2000 seconds is
acceptable, please replace with pg_usleep(2000000000L)]

I added a comment to that part of the code:

    /*
     *  It would be nice to use pg_usleep() here, but only does 2000 sec
     *  or 33 minutes, which seems too short.
     */
    sleep(1000000);

Claudio Natoli
2004-04-19 17:42:59 +00:00
Bruce Momjian
823ac7c2b4 This is a cleanup patch for access/transam/xact.c. It only removes some
#ifdef NOT_USED code, and adds a new TBLOCK state which signals the fact
that StartTransaction() has been executed.

Alvaro Herrera
2004-04-05 03:11:39 +00:00
Bruce Momjian
6367ed4382 Increase xlog str_time() static string variable, per Korean User's Group. 2004-03-22 04:16:57 +00:00
Tom Lane
7a57a67278 Replace opendir/closedir calls throughout the backend with AllocateDir
and FreeDir routines modeled on the existing AllocateFile/FreeFile.
Like the latter, these routines will avoid failing on EMFILE/ENFILE
conditions whenever possible, and will prevent leakage of directory
descriptors if an elog() occurs while one is open.
Also, reduce PANIC to ERROR in MoveOfflineLogs() --- this is not
critical code and there is no reason to force a DB restart on failure.
All per recent trouble report from Olivier Hubaut.
2004-02-23 23:03:10 +00:00
Bruce Momjian
1f17316a3d Here is an updated version of the win32 readdir patch.
1) Now puts in exactly the same change as the current-cvs mingw code
does. (see
http://cvs.sourceforge.net/viewcvs.py/mingw/runtime/mingwex/dirent.c?r1=
1.3&r2=1.4, second part of the patch).

2) Updates both xlog.c and slru.c in backend/access/transam/

3) Also updates pg_resetxlog, which also uses readdir() and checks the
errno value after the loop.

Magnus Hagander
2004-02-17 03:45:17 +00:00
Tom Lane
c3c09be34b Commit the reasonably uncontroversial parts of J.R. Nield's PITR patch, to
wit: Add a header record to each WAL segment file so that it can be reliably
identified.  Avoid splitting WAL records across segment files (this is not
strictly necessary, but makes it simpler to incorporate the header records).
Make WAL entries for file creation, deletion, and truncation (as foreseen but
never implemented by Vadim).  Also, add support for making XLOG_SEG_SIZE
configurable at compile time, similarly to BLCKSZ.  Fix a couple bugs I
introduced in WAL replay during recent smgr API changes.  initdb is forced
due to changes in pg_control contents.
2004-02-11 22:55:26 +00:00
Tom Lane
58f337a343 Centralize implementation of delay code by creating a pg_usleep()
subroutine in src/port/pgsleep.c.  Remove platform dependencies from
miscadmin.h and put them in port.h where they belong.  Extend recent
vacuum cost-based-delay patch to apply to VACUUM FULL, ANALYZE, and
non-btree index vacuuming.

By the way, where is the documentation for the cost-based-delay patch?
2004-02-10 03:42:45 +00:00
Tom Lane
87bd956385 Restructure smgr API as per recent proposal. smgr no longer depends on
the relcache, and so the notion of 'blind write' is gone.  This should
improve efficiency in bgwriter and background checkpoint processes.
Internal restructuring in md.c to remove the not-very-useful array of
MdfdVec objects --- might as well just use pointers.
Also remove the long-dead 'persistent main memory' storage manager (mm.c),
since it seems quite unlikely to ever get resurrected.
2004-02-10 01:55:27 +00:00
Tom Lane
2f0d43b251 Review uses of IsUnderPostmaster, change some tests to look at
whereToSendOutput instead because they are really inquiring about
the correct client communication protocol.  Update some comments.
This is pointing towards supporting regular FE/BE client protocol
in a standalone backend, per discussion a month or so back.
2004-01-28 21:02:40 +00:00
Bruce Momjian
f4921e5ca3 Attached is a patch that fixes some trivial typos and alignment. Please
apply.

Alvaro Herrera
2004-01-26 22:51:56 +00:00
Tom Lane
c77f363384 Ensure that close() and fclose() are checked for errors, at least in
cases involving writes.  Per recent discussion about the possibility
of close-time failures on some filesystems.  There is a TODO item for
this, too.
2004-01-26 22:35:32 +00:00
Tom Lane
be11fa26e3 Repair incorrect order of operations in GetNewTransactionId(). We must
complete ExtendCLOG() before advancing nextXid, so that if that routine
fails, the next incoming transaction will try it again.  Per trouble
report from Christopher Kings-Lynne.
2004-01-26 19:15:59 +00:00
Tom Lane
9bd681a522 Repair problem identified by Olivier Prenant: ALTER DATABASE SET search_path
should not be too eager to reject paths involving unknown schemas, since
it can't really tell whether the schemas exist in the target database.
(Also, when reading pg_dumpall output, it could be that the schemas
don't exist yet, but eventually will.)  ALTER USER SET has a similar issue.
So, reduce the normal ERROR to a NOTICE when checking search_path values
for these commands.  Supporting this requires changing the API for GUC
assign_hook functions, which causes the patch to touch a lot of places,
but the changes are conceptually trivial.
2004-01-19 19:04:40 +00:00
Bruce Momjian
38081fd000 Change PG_DELAY from msec to usec and use it consistenly rather than
select().   Add Win32 Sleep() for delay.
2004-01-09 21:08:50 +00:00
Neil Conway
192ad63bd7 More janitorial work: remove the explicit casting of NULL literals to a
pointer type when it is not necessary to do so.

For future reference, casting NULL to a pointer type is only necessary
when (a) invoking a function AND either (b) the function has no prototype
OR (c) the function is a varargs function.
2004-01-07 18:56:30 +00:00
Tom Lane
06288d4e22 Suppress compiler warning (xlog_outrec is unused if not WAL_DEBUG). 2004-01-06 22:22:37 +00:00
Neil Conway
bc028beb16 Make the 'wal_debug' GUC variable a boolean (rather than an integer), and
hide it behind #ifdef WAL_DEBUG blocks.
2004-01-06 17:26:23 +00:00
Bruce Momjian
d75b2ec4eb This patch is the next step towards (re)allowing fork/exec.
Claudio Natoli
2003-12-20 17:31:21 +00:00
Neil Conway
fef0c8345a I posted some bufmgr cleanup a few weeks ago, but it conflicted with
some concurrent changes Jan was making to the bufmgr. Here's an
updated version of the patch -- it should apply cleanly to CVS
HEAD and passes the regression tests.

This patch makes the following changes:

     - remove the UnlockAndReleaseBuffer() and UnlockAndWriteBuffer()
       macros, and replace uses of them with calls to the appropriate
       functions.

     - remove a bunch of #ifdef BMTRACE code: it is ugly & broken
       (i.e. it doesn't compile)

     - make BufferReplace() return a bool, not an int

     - cleanup some logic in bufmgr.c; should be functionality
       equivalent to the previous code, just cleaner now

     - remove the BM_PRIVATE flag as it is unused

     - improve a few comments, etc.
2003-12-14 00:34:47 +00:00
Peter Eisentraut
2afacfc403 This patch properly sets the prototype for the on_shmem_exit and
on_proc_exit functions, and adjust all other related code to use
the proper types too.

by Kurt Roeckx
2003-12-12 18:45:10 +00:00
Joe Conway
e2605c8311 Add a warning to AtEOXact_SPI() to catch cases where the current
transaction has been committed without SPI_finish() being called
first. Per recent discussion here:
http://archives.postgresql.org/pgsql-patches/2003-11/msg00286.php
2003-12-02 19:26:47 +00:00
PostgreSQL Daemon
969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Tom Lane
90b2202975 Fix bad interaction between NOTIFY processing and V3 extended query
protocol, per report from Igor Shevchenko.  NOTIFY thought it could
do its thing if transaction blockState is TBLOCK_DEFAULT, but in
reality it had better check the low-level transaction state is
TRANS_DEFAULT as well.  Formerly it was not possible to wait for the
client in a state where the first is true and the second is not ...
but now we can have such a state.  Minor cleanup in StartTransaction()
as well.
2003-10-16 16:50:41 +00:00
Tom Lane
8934790052 Add a mechanism to let dynamically loaded modules register post-commit/
post-abort cleanup hooks.  I'm surprised that we have not needed this
already, but I need it now to fix a plpgsql problem, and the usefulness
for other dynamically loaded modules seems obvious.
2003-09-28 23:26:20 +00:00
Tom Lane
4f7a2fa0c3 Fix typo in message. 2003-09-27 18:16:35 +00:00
Peter Eisentraut
d84b6ef56b Various message fixes, among those fixes for the previous round of fixes 2003-09-26 15:27:37 +00:00
Peter Eisentraut
feb4f44d29 Message editing: remove gratuitous variations in message wording, standardize
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
2003-09-25 06:58:07 +00:00
Tom Lane
a56a016ceb Repair some REINDEX problems per recent discussions. The relcache is
now able to cope with assigning new relfilenode values to nailed-in-cache
indexes, so they can be reindexed using the fully crash-safe method.  This
leaves only shared system indexes as special cases.  Remove the 'index
deactivation' code, since it provides no useful protection in the shared-
index case.  Require reindexing of shared indexes to be done in standalone
mode, but remove other restrictions on REINDEX.  -P (IgnoreSystemIndexes)
now prevents using indexes for lookups, but does not disable index updates.
It is therefore safe to allow from PGOPTIONS.  Upshot: reindexing system catalogs
can be done without a standalone backend for all cases except
shared catalogs.
2003-09-24 18:54:02 +00:00
Bruce Momjian
46785776c4 Another pgindent run with updated typedefs. 2003-08-08 21:42:59 +00:00
Tom Lane
870886affe Suppress unused-variable warnings when building without Asserts. 2003-08-08 14:39:45 +00:00
Tom Lane
2f9c859ea1 Fix some copyright notices that weren't updated. Improve copyright tool
so it won't miss 'em again.
2003-08-04 23:59:41 +00:00
Bruce Momjian
f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian
089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane
81b5c8a136 A visit from the message-style police ... 2003-07-28 00:09:16 +00:00
Tom Lane
ec7aa4b515 Error message editing in backend/access. 2003-07-21 20:29:40 +00:00
Tom Lane
fa3bd4dbd0 Error message editing: finish up undone task of reporting the problem
xid when we fail to access pg_clog.
2003-07-19 21:37:37 +00:00
Tom Lane
8cf63ba920 Repair boundary-case bug introduced by patch of two months ago that
fixed incorrect initial setting of StartUpID.  The logic in XLogWrite()
expects that Write->curridx is advanced to the next page as soon as
LogwrtResult points to the end of the current page, but StartupXLOG()
failed to make that happen when the old WAL ended exactly on a page
boundary.  Per trouble report from Hannu Krosing.
2003-07-17 16:45:04 +00:00
Tom Lane
0c985ab5a8 Add comment pointing out that XLByteToPrevSeg macro is not broken. 2003-06-26 18:23:07 +00:00
Bruce Momjian
0abe7431c6 This patch extracts page buffer pooling and the simple
least-recently-used strategy from clog.c into slru.c.  It doesn't
change any visible behaviour and passes all regression tests plus a
TruncateCLOG test done manually.

Apart from refactoring I made a little change to SlruRecentlyUsed,
formerly ClogRecentlyUsed:  It now skips incrementing lru_counts, if
slotno is already the LRU slot, thus saving a few CPU cycles.  To make
this work, lru_counts are initialised to 1 in SimpleLruInit.

SimpleLru will be used by pg_subtrans (part of the nested transactions
project), so the main purpose of this patch is to avoid future code
duplication.

Manfred Koizar
2003-06-11 22:37:46 +00:00
Tom Lane
39e98d9563 Repair sometimes-incorrect computation of StartUpID after a crash, per
example from Rao Kumar.  This is a very corner corner-case, requiring
a minimum of three closely-spaced database crashes and an unlucky
positioning of the second recovery's checkpoint record before you'd notice
any problem.  But the consequences are dire enough that it's a must-fix.
2003-05-22 14:39:28 +00:00
Tom Lane
f85f43dfb5 Backend support for autocommit removed, per recent discussions. The
only remnant of this failed experiment is that the server will take
SET AUTOCOMMIT TO ON.  Still TODO: provide some client-side autocommit
logic in libpq.
2003-05-14 03:26:03 +00:00
Tom Lane
30f609484d Add binary I/O routines for a bunch more datatypes. Still a few to go,
but that was enough tedium for one day.  Along the way, move the few
support routines for types xid and cid into a more logical place.
2003-05-12 23:08:52 +00:00
Tom Lane
8d86a96068 Adjust CreateCheckpoint so that buffer dumping activities and cleanup of
dead xlog segments are not considered part of a critical section.  It is
not necessary to force a database-wide panic if we get a failure in these
operations.  Per recent trouble reports.
2003-05-10 18:01:31 +00:00
Bruce Momjian
a7fd03e1de Handle clog structure in shared memory in exec() case, for Win32. 2003-05-03 03:52:07 +00:00
Bruce Momjian
a2e038fbee Back out last commit --- wrong patch. 2003-05-02 21:59:31 +00:00
Bruce Momjian
fb1f7ccec5 Dump/read non-default GUC values for use by exec'ed backends, for Win32. 2003-05-02 21:52:42 +00:00
Tom Lane
de28dc9a04 Portal and memory management infrastructure for extended query protocol.
Both plannable queries and utility commands are now always executed
within Portals, which have been revamped so that they can handle the
load (they used to be good only for single SELECT queries).  Restructure
code to push command-completion-tag selection logic out of postgres.c,
so that it won't have to be duplicated between simple and extended queries.
initdb forced due to addition of a field to Query nodes.
2003-05-02 20:54:36 +00:00
Tom Lane
4db9689d1a Add transaction status field to ReadyForQuery messages, and make room
for tableID/columnID in RowDescription.  (The latter isn't really
implemented yet though --- the backend always sends zeroes, and libpq
just throws away the data.)
2003-04-26 20:23:00 +00:00
Tom Lane
9cbaf72177 In the continuing saga of FE/BE protocol revisions, add reporting of
initial values and runtime changes in selected parameters.  This gets
rid of the need for an initial 'select pg_client_encoding()' query in
libpq, bringing us back to one message transmitted in each direction
for a standard connection startup.  To allow server version to be sent
using the same GUC mechanism that handles other parameters, invent the
concept of a never-settable GUC parameter: you can 'show server_version'
but it's not settable by any GUC input source.  Create 'lc_collate' and
'lc_ctype' never-settable parameters so that people can find out these
settings without need for pg_controldata.  (These side ideas were all
discussed some time ago in pgsql-hackers, but not yet implemented.)
2003-04-25 19:45:10 +00:00
Bruce Momjian
4d4953fc41 Make Win32 tests to match existing Cygwin tests, where appropriate. 2003-04-18 01:03:42 +00:00
Tom Lane
0851e12244 Reorganize clog's error reporting so that PANIC on clog I/O error can
be reduced to a plain ERROR.  Should make it at least a little less
painful to deal with data-corruption problems.
2003-04-14 17:31:33 +00:00
Bruce Momjian
54f7338fa1 This patch implements holdable cursors, following the proposal
(materialization into a tuple store) discussed on pgsql-hackers earlier.
I've updated the documentation and the regression tests.

Notes on the implementation:

- I needed to change the tuple store API slightly -- it assumes that it
won't be used to hold data across transaction boundaries, so the temp
files that it uses for on-disk storage are automatically reclaimed at
end-of-transaction. I added a flag to tuplestore_begin_heap() to control
this behavior. Is changing the tuple store API in this fashion OK?

- in order to store executor results in a tuple store, I added a new
CommandDest. This works well for the most part, with one exception: the
current DestFunction API doesn't provide enough information to allow the
Executor to store results into an arbitrary tuple store (where the
particular tuple store to use is chosen by the call site of
ExecutorRun). To workaround this, I've temporarily hacked up a solution
that works, but is not ideal: since the receiveTuple DestFunction is
passed the portal name, we can use that to lookup the Portal data
structure for the cursor and then use that to get at the tuple store the
Portal is using. This unnecessarily ties the Portal code with the
tupleReceiver code, but it works...

The proper fix for this is probably to change the DestFunction API --
Tom suggested passing the full QueryDesc to the receiveTuple function.
In that case, callers of ExecutorRun could "subclass" QueryDesc to add
any additional fields that their particular CommandDest needed to get
access to. This approach would work, but I'd like to think about it for
a little bit longer before deciding which route to go. In the mean time,
the code works fine, so I don't think a fix is urgent.

- (semi-related) I added a NO SCROLL keyword to DECLARE CURSOR, and
adjusted the behavior of SCROLL in accordance with the discussion on
-hackers.

- (unrelated) Cleaned up some SGML markup in sql.sgml, copy.sgml

Neil Conway
2003-03-27 16:51:29 +00:00
Bruce Momjian
9a9719e482 Allow error query to start transaction in autocommit off mode. 2003-03-21 04:33:15 +00:00
Bruce Momjian
c90354bad0 Remove unneeded dash blocks around function start comments. 2003-03-14 22:40:31 +00:00
Tom Lane
e4704001ea This patch fixes a bunch of spelling mistakes in comments throughout the
PostgreSQL source code.

Neil Conway
2003-03-10 22:28:22 +00:00
Tom Lane
70508ba7ae Make btree index structure adjustments and WAL logging changes needed to
support btree compaction, as per proposal of a few days ago.  btree index
pages no longer store parent links, instead they have a level indicator
(counting up from zero for leaf pages).  The FixBTree recovery logic is
removed, and replaced by code that detects missing parent-level insertions
during WAL replay.  Also, generate appropriate WAL entries when updating
btree metapage and when building a btree index from scratch.  I believe
btree indexes are now completely WAL-legal for the first time.
initdb forced due to index and WAL changes.
2003-02-21 00:06:22 +00:00
Tom Lane
80727ce14f Use stat(2) to probe for existing xlog segments in InstallXLogFileSegment,
rather than actually opening the files.  This eliminates some corner cases
where the file indeed exists but open() fails for another reason, such
as being out of file descriptors.  The net reliability gain is probably
tiny, since xlog.c is full of other file open calls that will elog(PANIC)
if they fail for any reason; but this specific failure mode has been
observed in the field, so we may as well fix it.
2003-01-25 03:06:04 +00:00
Peter Eisentraut
b65cd56240 Read-only transactions, as defined in SQL. 2003-01-10 22:03:30 +00:00
Bruce Momjian
1b7f3cc02d This patch implements FOR EACH STATEMENT triggers, per my email to
-hackers a couple days ago.

Notes/caveats:

        - added regression tests for the new functionality, all
          regression tests pass on my machine

        - added pg_dump support

        - updated PL/PgSQL to support per-statement triggers; didn't
          look at the other procedural languages.

        - there's (even) more code duplication in trigger.c than there
          was previously. Any suggestions on how to refactor the
          ExecXXXTriggers() functions to reuse more code would be
          welcome -- I took a brief look at it, but couldn't see an
          easy way to do it (there are several subtly-different
          versions of the code in question)

        - updated the documentation. I also took the liberty of
          removing a big chunk of duplicated syntax documentation in
          the Programmer's Guide on triggers, and moving that
          information to the CREATE TRIGGER reference page.

        - I also included some spelling fixes and similar small
          cleanups I noticed while making the changes. If you'd like
          me to split those into a separate patch, let me know.

Neil Conway
2002-11-23 03:59:09 +00:00
Tom Lane
17ac74797a Put back error test for DECLARE CURSOR outside a transaction block ...
but do it correctly now.
2002-11-18 01:17:39 +00:00
Bruce Momjian
63e9734542 Update xact.c comments for clarity. 2002-11-13 03:12:05 +00:00
Tom Lane
f9b5b41ef9 Code review for ON COMMIT patch. Make the actual on-commit action happen
before commit, not after :-( --- the original coding is not only unsafe
if an error occurs while it's processing, but it generates an invalid
sequence of WAL entries.  Resurrect 7.2 logic for deleting items when
no longer needed.  Use an enum instead of random macros.  Editorialize
on names used for routines and constants.  Teach backend/nodes routines
about new field in CreateTable struct.  Add a regression test.
2002-11-11 22:19:25 +00:00
Bruce Momjian
ebb531836a Add code to handle [ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP }]
for temp tables.

Gavin Sherry
2002-11-09 23:56:39 +00:00
Bruce Momjian
bea4792125 This patch removes a bunch of superfluous #include directives: if
postgres.h or c.h includes a system header (such as stdio.h or
stdlib.h), there's no need to specifically include it in any of the .c
files in the backend.

Neil Conway
2002-11-08 20:23:57 +00:00
Tom Lane
f6e0130b5b Clean up a few fprintf(stderr)'s that should be elog's. 2002-11-02 15:54:13 +00:00
Tom Lane
30963fc200 Perform transaction cleanup operations in a less ad-hoc, more
principled order; in particular ensure that all shared resources
are released before we release transaction locks.  The code used
to release locks before buffer pins, which might explain an ancient
note I have about a bufmgr assertion failure I'd seen once several
years ago, and been unable to reproduce since.  (Theory: someone
trying to drop a relation might be able to reach FlushRelationBuffers
before the last user of the relation had gotten around to dropping
his buffer pins.)
2002-10-22 22:44:36 +00:00
Tom Lane
200b151615 Fix places that were using IsTransactionBlock() as an (inadequate) check
that they'd get to commit immediately on finishing.  There's now a
centralized routine PreventTransactionChain() that implements the
necessary tests.
2002-10-21 22:06:20 +00:00
Tom Lane
e16f04cf72 Make CREATE/ALTER/DROP USER/GROUP transaction-safe, or at least pretty
nearly so, by postponing write of flat password file until transaction
commit.
2002-10-21 19:46:45 +00:00
Tom Lane
4e9b159484 Change order of operations during XLogFlush so that we try to include
in our write/flush operation any WAL entries that got queued while we
were waiting to get the WALWriteLock.  This improves throughput when
transactions are small enough that several can be committed per WAL
write (ie, per disk revolution).
2002-10-07 17:04:30 +00:00
Tom Lane
b2ab1e6bc9 Ensure that before truncating CLOG, we force a checkpoint even if no
recent WAL activity has occurred.  Without this, it's possible that a
later crash might leave tuples on disk with un-updated commit status
bits.
2002-09-26 22:58:34 +00:00
Bruce Momjian
e50f52a074 pgindent run. 2002-09-04 20:31:48 +00:00
Bruce Momjian
97ac103289 Remove sys/types.h in files that include postgres.h, and hence c.h,
because c.h has sys/types.h.
2002-09-02 02:47:07 +00:00
Bruce Momjian
d64e6392fb Remove code that suggested increasing wal_files. 2002-09-01 01:58:42 +00:00
Tom Lane
26993b2918 AUTOCOMMIT mode is now an available backend GUC variable; setting it
to false provides more SQL-spec-compliant behavior than we had before.
I am not sure that setting it false is actually a good idea yet; there
is a lot of client-side code that will probably be broken by turning
autocommit off.  But it's a start.

Loosely based on a patch by David Van Wie.
2002-08-30 22:18:07 +00:00
Bruce Momjian
63653f7ffa Complete TODO item:
* Remove wal_files postgresql.conf option because WAL files are
	  now recycled
2002-08-30 16:50:50 +00:00
Bruce Momjian
d04e9137c9 Reverse out XLogDir/-X write-ahead log handling, per discussion.
Original patch from Thomas.
2002-08-17 15:12:07 +00:00
Tom Lane
5df307c778 Restructure local-buffer handling per recent pghackers discussion.
The local buffer manager is no longer used for newly-created relations
(unless they are TEMP); a new non-TEMP relation goes through the shared
bufmgr and thus will participate normally in checkpoints.  But TEMP relations
use the local buffer manager throughout their lifespan.  Also, operations
in TEMP relations are not logged in WAL, thus improving performance.
Since it's no longer necessary to fsync relations as they move out of the
local buffers into shared buffers, quite a lot of smgr.c/md.c/fd.c code
is no longer needed and has been removed: there's no concept of a dirty
relation anymore in md.c/fd.c, and we never fsync anything but WAL.
Still TODO: improve local buffer management algorithms so that it would
be reasonable to increase NLocBuffer.
2002-08-06 02:36:35 +00:00
Thomas G. Lockhart
ac1a3dcf24 Fix compilation problem with assert checking enabled for recent xlog
location feature.
2002-08-05 01:24:16 +00:00
Thomas G. Lockhart
c755f6027f Implement WAL log location control using "-X" or PGXLOG. 2002-08-04 06:53:10 +00:00
Tom Lane
22c64f1834 When compiling with --enable-cassert, check for reference count leaks
in the relcache.  It's rather silly that we have reference count leak
checks in bufmgr and in catcache, but not in relcache which will normally
have many fewer entries.  Chris K-L would have caught at least one bug
in his recent DROP patch if he'd had this.
2002-08-02 22:36:05 +00:00
Bruce Momjian
d84fe82230 Update copyright to 2002. 2002-06-20 20:29:54 +00:00
Bruce Momjian
918e864f14 Remove some pre-WAL relics:
SharedBufferChanged
  BufferRelidLastDirtied
  BufferTagLastDirtied
  BufferDirtiedByMe

Manfred Koizar
2002-06-15 19:55:38 +00:00
Bruce Momjian
3c35face41 This patch wraps all accesses to t_xmin, t_cmin, t_xmax, and t_cmax in
HeapTupleHeaderData in setter and getter macros called
HeapTupleHeaderGetXmin, HeapTupleHeaderSetXmin etc.

It also introduces a "virtual" field xvac by defining
HeapTupleHeaderGetXvac and HeapTupleHeaderSetXvac.  Xvac is used by
VACUUM, in fact it is stored in t_cmin.

Manfred Koizar
2002-06-15 19:54:24 +00:00
Jan Wieck
469cb65aca Katherine Ward wrote:
> Changes to avoid collisions with WIN32 & MFC names...
> 1.  Renamed:
>       a.  PROC => PGPROC
>       b.  GetUserName() => GetUserNameFromId()
>       c.  GetCurrentTime() => GetCurrentDateTime()
>       d.  IGNORE => IGNORE_DTF in include/utils/datetime.h & utils/adt/datetim
>
> 2.  Added _P to some lex/yacc tokens:
>       CONST, CHAR, DELETE, FLOAT, GROUP, IN, OUT

Jan
2002-06-11 13:40:53 +00:00
Bruce Momjian
2f297a2fcf The attached patch fixes a problem with InstallXLogFileSegment()'s use
of link() under Cygwin:

    http://archives.postgresql.org/pgsql-cygwin/2002-04/msg00072.php

Note that it appears that BeOS and Netware also have the above or
similar problem.

I have only verified that PostgreSQL builds under Cygwin with this
patch.
Since I cannot reproduce the problem, I cannot verify that the proposed
patch solves it.  Nevertheless, both Barry Pederson and David P.
Caldwell
attest that this patch solves the problem.  See the following for
details:

    http://archives.postgresql.org/pgsql-cygwin/2002-05/msg00043.php
    http://archives.postgresql.org/pgsql-cygwin/2002-05/msg00040.php

Jason Tishler
2002-06-07 21:47:45 +00:00
Tom Lane
4d567013cf Remove AMI_OVERRIDE tests from tqual.c routines; they aren't necessary
and just slow down normal operations (only fractionally, but a cycle saved
is a cycle earned).  Improve documentation of AMI_OVERRIDE behavior.
2002-05-25 20:00:12 +00:00
Tom Lane
a2597ef179 Modify sequence state storage to eliminate dangling-pointer problem
exemplified by bug #671.  Moving the storage to relcache turned out to
be a bad idea because relcache might decide to discard the info.  Instead,
open and close the relcache entry on each sequence operation, and use
a record of the current XID to discover whether we already hold
AccessShareLock on the sequence.
2002-05-22 21:40:55 +00:00
Tom Lane
959e61e917 Remove global variable scanCommandId in favor of storing a command ID
in snapshots, per my proposal of a few days ago.  Also, tweak heapam.c
routines (heap_insert, heap_update, heap_delete, heap_mark4update) to
be passed the command ID to use, instead of doing GetCurrentCommandID.
For catalog updates they'll still get passed current command ID, but
for updates generated from the main executor they'll get passed the
command ID saved in the snapshot the query is using.  This should fix
some corner cases associated with functions and triggers that advance
current command ID while an outer query is still in progress.
2002-05-21 22:05:55 +00:00
Tom Lane
940f772a29 Support temporary setting of search path during CREATE SCHEMA; this
allows the example in the CREATE SCHEMA ref page to actually work now.
Also, clean up when the transaction that initially creates a temp-table
namespace is later aborted.  Simplify internal representation of search
path by folding special cases into the main list.
2002-05-17 20:53:33 +00:00
Tom Lane
f0811a74b3 Merge the last few variable.c configuration variables into the generic
GUC support.  It's now possible to set datestyle, timezone, and
client_encoding from postgresql.conf and per-database or per-user
settings.  Also, implement rollback of SET commands that occur in a
transaction that later fails.  Create a SET LOCAL var = value syntax
that sets the variable only for the duration of the current transaction.
All per previous discussions in pghackers.
2002-05-17 01:19:19 +00:00
Peter Eisentraut
1944bff1d6 Make initdb print a message about which locale it is about to use.
Re-add warning if the locale prevents LIKE-optimization.  Done within
initdb now.
2002-05-09 13:30:24 +00:00
Bruce Momjian
d37134085b xlog.c: If possible please add the following patch to better support NetWare.
Ulrich Neumann
2002-04-24 01:54:43 +00:00
Thomas G. Lockhart
f56e8fec31 Add fields in the control file to check for whether the backend was
compiled for integer date/time storage and to check the length of
 storage for the locale fields in the same data structure.
Slightly reword some of the error messages to be more accurate on
 possible recovery options (e.g. recompile *or* re-initdb).
Bump version number on this file.
2002-04-21 19:08:02 +00:00
Peter Eisentraut
867901db9e Locale support is on by default. The choice of locale is done in initdb
and/or with GUC variables.
2002-04-03 05:39:33 +00:00
Tom Lane
838fe25a95 Create a new GUC variable search_path to control the namespace search
path.  The default behavior if no per-user schemas are created is that
all users share a 'public' namespace, thus providing behavior backwards
compatible with 7.2 and earlier releases.  Probably the semantics and
default setting will need to be fine-tuned, but this is a start.
2002-04-01 03:34:27 +00:00
Tom Lane
3114102521 Reimplement temp tables using schemas. The temp table map is history;
temp table entries in pg_class have the names the user would expect.
2002-03-31 06:26:32 +00:00
Tom Lane
01747692fe Repair two problems with WAL logging of sequence nextvalI() ops, as
per recent pghackers discussion: force a new WAL record at first nextval
after a checkpoint, and ensure that xlog is flushed to disk if a nextval
record is the only thing emitted by a transaction.
2002-03-15 19:20:36 +00:00
Bruce Momjian
92288a1cf9 Change made to elog:
o  Change all current CVS messages of NOTICE to WARNING.  We were going
to do this just before 7.3 beta but it has to be done now, as you will
see below.

o Change current INFO messages that should be controlled by
client_min_messages to NOTICE.

o Force remaining INFO messages, like from EXPLAIN, VACUUM VERBOSE, etc.
to always go to the client.

o Remove INFO from the client_min_messages options and add NOTICE.

Seems we do need three non-ERROR elog levels to handle the various
behaviors we need for these messages.

Regression passed.
2002-03-06 06:10:59 +00:00
Bruce Momjian
a033daf566 Commit to match discussed elog() changes. Only update is that LOG is
now just below FATAL in server_min_messages.  Added more text to
highlight ordering difference between it and client_min_messages.

---------------------------------------------------------------------------

REALLYFATAL => PANIC
STOP => PANIC
New INFO level the prints to client by default
New LOG level the prints to server log by default
Cause VACUUM information to print only to the client
NOTICE => INFO where purely information messages are sent
DEBUG => LOG for purely server status messages
DEBUG removed, kept as backward compatible
DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1 added
DebugLvl removed in favor of new DEBUG[1-5] symbols
New server_min_messages GUC parameter with values:
        DEBUG[5-1], INFO, NOTICE, ERROR, LOG, FATAL, PANIC
New client_min_messages GUC parameter with values:
        DEBUG[5-1], LOG, INFO, NOTICE, ERROR, FATAL, PANIC
Server startup now logged with LOG instead of DEBUG
Remove debug_level GUC parameter
elog() numbers now start at 10
Add test to print error message if older elog() values are passed to elog()
Bootstrap mode now has a -d that requires an argument, like postmaster
2002-03-02 21:39:36 +00:00
Bruce Momjian
c448847378 Add better error text:
elog(LOG, "XLogWrite: new log file created - "
                     "consider increasing 'wal_files' in postgresql.conf.");
2002-02-18 05:44:45 +00:00
Tom Lane
2004337785 Reduce severity of 'XLogFlush: request is not satisfied' error condition,
per my proposal of a couple days ago.  This will eliminate the unable-
to-restart-database class of problem that we have seen reported half a
dozen times with 7.1.*.
2002-01-14 17:55:57 +00:00
Tom Lane
d3fc362ec2 Ensure that all direct uses of spinlock-protected data structures use
'volatile' pointers to access those structures, so that optimizing
compilers will not decide to move the structure accesses outside of the
spinlock-acquire-to-spinlock-release sequence.  There are no known bugs
in these uses at present, but based on bad experience with lwlock.c,
it seems prudent to ensure that we protect these other uses too.
Per pghackers discussion around 12-Dec.  (Note: it should not be
necessary to worry about structures protected by LWLocks, since the
LWLock acquire and release operations are not inline macros.)
2001-12-28 18:16:43 +00:00
Tom Lane
aed0c29f7e Fix mispeling ... 2001-12-23 07:25:39 +00:00
Tom Lane
9aa2e7da51 Temporarily dike out GetUndoRecPtr() in checkpoint generation, since we
do not use the undo pointer anyway.  This is a quick-hack solution for
the three-way deadlock condition discussed in pghackers 17-Dec-01.
Need to find a better way of doing it.
2001-12-19 19:42:51 +00:00
Bruce Momjian
ea08e6cd55 New pgindent run with fixes suggested by Tom. Patch manually reviewed,
initdb/regression tests pass.
2001-11-05 17:46:40 +00:00
Tom Lane
fb5f1b2c13 Merge three existing ways of signaling postmaster from child processes,
so that only one signal number is used not three.  Flags in shared
memory tell the reason(s) for the current signal.  This method is
extensible to handle more signal reasons without chewing up even more
signal numbers, but the immediate reason is to keep pg_pwd reloads
separate from SIGHUP processing in the postmaster.
Also clean up some problems in the postmaster with delayed response to
checkpoint status changes --- basically, it wouldn't schedule a checkpoint
if it wasn't getting connection requests on a regular basis.
2001-11-04 19:55:31 +00:00
Tom Lane
bdea97ea95 Add missing #include. 2001-11-01 06:17:01 +00:00
Bruce Momjian
6783b2372e Another pgindent run. Fixes enum indenting, and improves #endif
spacing.  Also adds space for one-line comments.
2001-10-28 06:26:15 +00:00