Commit Graph

386 Commits

Author SHA1 Message Date
Tom Lane d1e027221d Replace the pg_listener-based LISTEN/NOTIFY mechanism with an in-memory queue.
In addition, add support for a "payload" string to be passed along with
each notify event.

This implementation should be significantly more efficient than the old one,
and is also more compatible with Hot Standby usage.  There is not yet any
facility for HS slaves to receive notifications generated on the master,
although such a thing is possible in future.

Joachim Wieland, reviewed by Jeff Davis; also hacked on by me.
2010-02-16 22:34:57 +00:00
Simon Riggs dd428c79a4 Fix relcache init file invalidation during Hot Standby for the case
where a database has a non-default tablespaceid. Pass thru MyDatabaseId
and MyDatabaseTableSpace to allow file path to be re-created in
standby and correct invalidation to take place in all cases.
Update and rework xact_commit_desc() debug messages.
Bug report from Tom by code inspection. Fix by me.
2010-02-13 16:15:48 +00:00
Tom Lane 0a469c8769 Remove old-style VACUUM FULL (which was known for a little while as
VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity.
Per discussion, the use case for this method of vacuuming is no longer large
enough to justify maintaining it; not to mention that we don't wish to invest
the work that would be needed to make it play nicely with Hot Standby.

Aside from the code directly related to old-style VACUUM FULL, this commit
removes support for certain WAL record types that could only be generated
within VACUUM FULL, redirect-pointer removal in heap_page_prune, and
nontransactional generation of cache invalidation sinval messages (the last
being the sticking point for Hot Standby).

We still have to retain all code that copes with finding HEAP_MOVED_OFF and
HEAP_MOVED_IN flag bits on existing tuples.  This can't be removed as long
as we want to support in-place update from pre-9.0 databases.
2010-02-08 04:33:55 +00:00
Tom Lane b9b8831ad6 Create a "relation mapping" infrastructure to support changing the relfilenodes
of shared or nailed system catalogs.  This has two key benefits:

* The new CLUSTER-based VACUUM FULL can be applied safely to all catalogs.

* We no longer have to use an unsafe reindex-in-place approach for reindexing
  shared catalogs.

CLUSTER on nailed catalogs now works too, although I left it disabled on
shared catalogs because the resulting pg_index.indisclustered update would
only be visible in one database.

Since reindexing shared system catalogs is now fully transactional and
crash-safe, the former special cases in REINDEX behavior have been removed;
shared catalogs are treated the same as non-shared.

This commit does not do anything about the recently-discussed problem of
deadlocks between VACUUM FULL/CLUSTER on a system catalog and other
concurrent queries; will address that in a separate patch.  As a stopgap,
parallel_schedule has been tweaked to run vacuum.sql by itself, to avoid
such failures during the regression tests.
2010-02-07 20:48:13 +00:00
Tom Lane 875353b99f Fix assorted core dumps and Assert failures that could occur during
AbortTransaction or AbortSubTransaction, when trying to clean up after an
error that prevented (sub)transaction start from completing:
* access to TopTransactionResourceOwner that might not exist
* assert failure in AtEOXact_GUC, if AtStart_GUC not called yet
* assert failure or core dump in AfterTriggerEndSubXact, if
  AfterTriggerBeginSubXact not called yet

Per testing by injecting elog(ERROR) at successive steps in StartTransaction
and StartSubTransaction.  It's not clear whether all of these cases could
really occur in the field, but at least one of them is easily exposed by
simple stress testing, as per my accidental discovery yesterday.
2010-01-24 21:49:17 +00:00
Simon Riggs a8ce974cdd Teach standby conflict resolution to use SIGUSR1
Conflict reason is passed through directly to the backend, so we can
take decisions about the effect of the conflict based upon the local
state. No specific changes, as yet, though this prepares for later work.
CancelVirtualTransaction() sends signals while holding ProcArrayLock.
Introduce errdetail_abort() to give message detail explaining that the
abort was caused by conflict processing. Remove CONFLICT_MODE states
in favour of using PROCSIG_RECOVERY_CONFLICT states directly, for clarity.
2010-01-16 10:05:59 +00:00
Simon Riggs 42edbd16fb During Hot Standby, set DatabasePath correctly during relcache init file
deletion, so that we attempt to unlink the correct filepath. unlink()
errors are ignorable there, so lack of a DatabasePath initialization step
did not cause visible problems until a related bug showed up on Solaris.

Code refactored from xact_redo_commit() to
ProcessCommittedInvalidationMessages() in inval.c. Recovery may replay
shared invalidation messages for many databases, so we cannot
SetDatabasePath() once as we do in normal backends. Read the databaseid
from the shared invalidation messages, then set DatabasePath
temporarily before calling RelationCacheInitFileInvalidate().

Problem report by Robert Treat, analysis and fix by me.
2010-01-09 16:49:27 +00:00
Bruce Momjian 0239800893 Update copyright for the year 2010. 2010-01-02 16:58:17 +00:00
Simon Riggs efc16ea520 Allow read only connections during recovery, known as Hot Standby.
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.

New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.

This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.

Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.

Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
2009-12-19 01:32:45 +00:00
Tom Lane 62aba76568 Prevent indirect security attacks via changing session-local state within
an allegedly immutable index function.  It was previously recognized that
we had to prevent such a function from executing SET/RESET ROLE/SESSION
AUTHORIZATION, or it could trivially obtain the privileges of the session
user.  However, since there is in general no privilege checking for changes
of session-local state, it is also possible for such a function to change
settings in a way that might subvert later operations in the same session.
Examples include changing search_path to cause an unexpected function to
be called, or replacing an existing prepared statement with another one
that will execute a function of the attacker's choosing.

The present patch secures VACUUM, ANALYZE, and CREATE INDEX/REINDEX against
these threats, which are the same places previously deemed to need protection
against the SET ROLE issue.  GUC changes are still allowed, since there are
many useful cases for that, but we prevent security problems by forcing a
rollback of any GUC change after completing the operation.  Other cases are
handled by throwing an error if any change is attempted; these include temp
table creation, closing a cursor, and creating or deleting a prepared
statement.  (In 7.4, the infrastructure to roll back GUC changes doesn't
exist, so we settle for rejecting changes of "search_path" in these contexts.)

Original report and patch by Gurjeet Singh, additional analysis by
Tom Lane.

Security: CVE-2009-4136
2009-12-09 21:57:51 +00:00
Heikki Linnakangas cd87b6f8a5 Fix an old bug in multixact and two-phase commit. Prepared transactions can
be part of multixacts, so allocate a slot for each prepared transaction in
the "oldest member" array in multixact.c. On PREPARE TRANSACTION, transfer
the oldest member value from the current backends slot to the prepared xact
slot. Also save and recover the value from the 2pc state file.

The symptom of the bug was that after a transaction prepared, a shared lock
still held by the prepared transaction was sometimes ignored by other
transactions.

Fix back to 8.1, where both 2PC and multixact were introduced.
2009-11-23 09:58:36 +00:00
Alvaro Herrera a8bb8eb583 Remove flatfiles.c, which is now obsolete.
Recent commits have removed the various uses it was supporting.  It was a
performance bottleneck, according to bug report #4919 by Lauris Ulmanis; seems
it slowed down user creation after a billion users.
2009-09-01 02:54:52 +00:00
Bruce Momjian d747140279 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list
provided by Andrew.
2009-06-11 14:49:15 +00:00
Tom Lane 23543c732b Rewrite xml.c's memory management (yet again). Give up on the idea of
redirecting libxml's allocations into a Postgres context.  Instead, just let
it use malloc directly, and add PG_TRY blocks as needed to be sure we release
libxml data structures in error recovery code paths.  This is ugly but seems
much more likely to play nicely with third-party uses of libxml, as seen in
recent trouble reports about using Perl XML facilities in pl/perl and bug
#4774 about contrib/xml2.

I left the code for allocation redirection in place, but it's only
built/used if you #define USE_LIBXMLCONTEXT.  This is because I found it
useful to corral libxml's allocations in a palloc context when hunting
for libxml memory leaks, and we're surely going to have more of those
in the future with this type of approach.  But we don't want it turned on
in a normal build because it breaks exactly what we need to fix.

I have not re-indented most of the code sections that are now wrapped
by PG_TRY(); that's for ease of review.  pg_indent will fix it.

This is a pre-existing bug in 8.3, but I don't dare back-patch this change
until it's gotten a reasonable amount of field testing.
2009-05-13 20:27:17 +00:00
Heikki Linnakangas b2a667b9ee Add a new option to RestoreBkpBlocks() to indicate if a cleanup lock should
be used instead of the normal exclusive lock, and make WAL redo functions
responsible for calling RestoreBkpBlocks(). They know better what kind of a
lock they need.

At the moment, this just moves things around with no functional change, but
makes the hot standby patch that's under review cleaner.
2009-01-20 18:59:37 +00:00
Bruce Momjian 511db38ace Update copyright for 2009. 2009-01-01 17:24:05 +00:00
Alvaro Herrera 7b640b0345 Fix a couple of snapshot management bugs in the new ResourceOwner world:
non-writable large objects need to have their snapshots registered on the
transaction resowner, not the current portal's, because it must persist until
the large object is closed (which the portal does not).  Also, ensure that the
serializable snapshot is recorded by the transaction resource owner too, even
when a subtransaction has changed the current resource owner before
serializable is taken.

Per bug reports from Pavan Deolasee.
2008-12-04 14:51:02 +00:00
Heikki Linnakangas 3396000684 Rethink the way FSM truncation works. Instead of WAL-logging FSM
truncations in FSM code, call FreeSpaceMapTruncateRel from smgr_redo. To
make that cleaner from modularity point of view, move the WAL-logging one
level up to RelationTruncate, and move RelationTruncate and all the
related WAL-logging to new src/backend/catalog/storage.c file. Introduce
new RelationCreateStorage and RelationDropStorage functions that are used
instead of calling smgrcreate/smgrscheduleunlink directly. Move the
pending rel deletion stuff from smgrcreate/smgrscheduleunlink to the new
functions. This leaves smgr.c as a thin wrapper around md.c; all the
transactional stuff is now in storage.c.

This will make it easier to add new forks with similar truncation logic,
like the visibility map.
2008-11-19 10:34:52 +00:00
Tom Lane cad3a26a95 Fix sloppy omission of now-required #include's. 2008-11-11 14:17:02 +00:00
Heikki Linnakangas 7e8b0b9ab1 Change error messages to print the physical path, like
"base/11517/3767_fsm", instead of symbolic names like "1663/11517/3767/1",
per Alvaro's suggestion. I didn't change the messages in the higher-level
index, heap and FSM routines, though, where the fork is implicit.
2008-11-11 13:19:16 +00:00
Alvaro Herrera 06da3c570f Rework subtransaction commit protocol for hot standby.
This patch eliminates the marking of subtransactions as SUBCOMMITTED in pg_clog
during their commit; instead they remain in-progress until main transaction
commit.  At main transaction commit, the commit protocol is atomic-by-page
instead of one transaction at a time.  To avoid a race condition with some
subtransactions appearing committed before others in the case where they span
more than one pg_clog page, we conserve the logic that marks them subcommitted
before marking the parent committed.

Simon Riggs with minor help from me
2008-10-20 19:18:18 +00:00
Heikki Linnakangas 3f0e808c4a Introduce the concept of relation forks. An smgr relation can now consist
of multiple forks, and each fork can be created and grown separately.

The bulk of this patch is about changing the smgr API to include an extra
ForkNumber argument in every smgr function. Also, smgrscheduleunlink and
smgrdounlink no longer implicitly call smgrclose, because other forks might
still exist after unlinking one. The callers of those functions have been
modified to call smgrclose instead.

This patch in itself doesn't have any user-visible effect, but provides the
infrastructure needed for upcoming patches. The additional forks envisioned
are a rewritten FSM implementation that doesn't rely on a fixed-size shared
memory block, and a visibility map to allow skipping portions of a table in
VACUUM that have no dead tuples.
2008-08-11 11:05:11 +00:00
Alvaro Herrera 5da9da71c4 Improve snapshot manager by keeping explicit track of snapshots.
There are two ways to track a snapshot: there's the "registered" list, which
is used for arbitrary long-lived snapshots; and there's the "active stack",
which is used for the snapshot that is considered "active" at any time.
This also allows users of snapshots to stop worrying about snapshot memory
allocation and freeing, and about using PG_TRY blocks around ActiveSnapshot
assignment.  This is all done automatically now.

As a consequence, this allows us to reset MyProc->xmin when there are no
more snapshots registered in the current backend, reducing the impact that
long-running transactions have on VACUUM.
2008-05-12 20:02:02 +00:00
Alvaro Herrera f8c4d7db60 Restructure some header files a bit, in particular heapam.h, by removing some
unnecessary #include lines in it.  Also, move some tuple routine prototypes and
macros to htup.h, which allows removal of heapam.h inclusion from some .c
files.

For this to work, a new header file access/sysattr.h needed to be created,
initially containing attribute numbers of system columns, for pg_dump usage.

While at it, make contrib ltree, intarray and hstore header files more
consistent with our header style.
2008-05-12 00:00:54 +00:00
Alvaro Herrera 78f02ca1f5 Rename snapmgmt.c/h to snapmgr.c/h, for consistency with other files.
Per complaint from Tom Lane.
2008-03-26 18:48:59 +00:00
Alvaro Herrera d43b085d57 Separate snapshot management code from tuple visibility code, create a
snapmgmt.c file for the former.  The header files have also been reorganized
in three parts: the most basic snapshot definitions are now in a new file
snapshot.h, and the also new snapmgmt.h keeps the definitions for snapmgmt.c.
tqual.h has been reduced to the bare minimum.

This patch is just a first step towards managing live snapshots within a
transaction; there is no functionality change.

Per my proposal to pgsql-patches on 20080318191940.GB27458@alvh.no-ip.org and
subsequent discussion.
2008-03-26 16:20:48 +00:00
Peter Eisentraut a7b7b07af3 Enable probes to work with Mac OS X Leopard and other OSes that will
support DTrace in the future.

Switch from using DTRACE_PROBEn macros to the dynamically generated macros.
Use "dtrace -h" to create a header file that contains the dynamically
generated macros to be used in the source code instead of the DTRACE_PROBEn
macros.  A dummy header file is generated for builds without DTrace support.

Author: Robert Lor <Robert.Lor@sun.com>
2008-03-17 19:44:41 +00:00
Tom Lane 32846f8152 Fix TransactionIdIsCurrentTransactionId() to use binary search instead of
linear search when checking child-transaction XIDs.  This makes for an
important speedup in transactions that have large numbers of children,
as in a recent example from Craig Ringer.  We can also get rid of an
ugly kluge that represented lists of TransactionIds as lists of OIDs.

Heikki Linnakangas
2008-03-17 02:18:55 +00:00
Tom Lane 7d6e6e2e97 Fix PREPARE TRANSACTION to reject the case where the transaction has dropped a
temporary table; we can't support that because there's no way to clean up the
source backend's internal state if the eventual COMMIT PREPARED is done by
another backend.  This was checked correctly in 8.1 but I broke it in 8.2 :-(.
Patch by Heikki Linnakangas, original trouble report by John Smith.
2008-03-04 19:54:06 +00:00
Tom Lane ac12412ede Revise memory management for libxml calls. Instead of keeping libxml's data
in whichever context happens to be current during a call of an xml.c function,
use a dedicated context that will not go away until we explicitly delete it
(which we do at transaction end or subtransaction abort).  This makes recovery
after an error much simpler --- we don't have to individually delete the data
structures created by libxml.  Also, we need to initialize and cleanup libxml
only once per transaction (if there's no error) instead of once per function
call, so it should be a bit faster.  We'll need to keep an eye out for
intra-transaction memory leaks, though.  Alvaro and Tom.
2008-01-15 18:57:00 +00:00
Tom Lane eedb068c0a Make standard maintenance operations (including VACUUM, ANALYZE, REINDEX,
and CLUSTER) execute as the table owner rather than the calling user, using
the same privilege-switching mechanism already used for SECURITY DEFINER
functions.  The purpose of this change is to ensure that user-defined
functions used in index definitions cannot acquire the privileges of a
superuser account that is performing routine maintenance.  While a function
used in an index is supposed to be IMMUTABLE and thus not able to do anything
very interesting, there are several easy ways around that restriction; and
even if we could plug them all, there would remain a risk of reading sensitive
information and broadcasting it through a covert channel such as CPU usage.

To prevent bypassing this security measure, execution of SET SESSION
AUTHORIZATION and SET ROLE is now forbidden within a SECURITY DEFINER context.

Thanks to Itagaki Takahiro for reporting this vulnerability.

Security: CVE-2007-6600
2008-01-03 21:23:15 +00:00
Bruce Momjian 9098ab9e32 Update copyrights in source tree to 2008. 2008-01-01 19:46:01 +00:00
Tom Lane 895a94de6d Avoid incrementing the CommandCounter when CommandCounterIncrement is called
but no database changes have been made since the last CommandCounterIncrement.
This should result in a significant improvement in the number of "commands"
that can typically be performed within a transaction before hitting the 2^32
CommandId size limit.  In particular this buys back (and more) the possible
adverse consequences of my previous patch to fix plan caching behavior.

The implementation requires tracking whether the current CommandCounter
value has been "used" to mark any tuples.  CommandCounter values stored into
snapshots are presumed not to be used for this purpose.  This requires some
small executor changes, since the executor used to conflate the curcid of
the snapshot it was using with the command ID to mark output tuples with.
Separating these concepts allows some small simplifications in executor APIs.

Something for the TODO list: look into having CommandCounterIncrement not do
AcceptInvalidationMessages.  It seems fairly bogus to be doing it there,
but exactly where to do it instead isn't clear, and I'm disinclined to mess
with asynchronous behavior during late beta.
2007-11-30 21:22:54 +00:00
Bruce Momjian fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Bruce Momjian 82748bc253 Reduce error level of ROLLBACK outside a transaction from WARNING to
NOTICE.
2007-11-10 14:36:44 +00:00
Tom Lane ef4d38c86c Rename recently-added pg_stat_activity column from txn_start to xact_start,
for consistency with other column names such as in pg_stat_database.
2007-09-11 03:28:05 +00:00
Tom Lane 6bd4f401b0 Replace the former method of determining snapshot xmax --- to wit, calling
ReadNewTransactionId from GetSnapshotData --- with a "latestCompletedXid"
variable that is updated during transaction commit or abort.  Since
latestCompletedXid is written only in places that had to lock ProcArrayLock
exclusively anyway, and is read only in places that had to lock ProcArrayLock
shared anyway, it adds no new locking requirements to the system despite being
cluster-wide.  Moreover, removing ReadNewTransactionId from snapshot
acquisition eliminates the need to take both XidGenLock and ProcArrayLock at
the same time.  Since XidGenLock is sometimes held across I/O this can be a
significant win.  Some preliminary benchmarking suggested that this patch has
no effect on average throughput but can significantly improve the worst-case
transaction times seen in pgbench.  Concept by Florian Pflug, implementation
by Tom Lane.
2007-09-08 20:31:15 +00:00
Tom Lane 0a51e7073c Don't take ProcArrayLock while exiting a transaction that has no XID; there is
no need for serialization against snapshot-taking because the xact doesn't
affect anyone else's snapshot anyway.  Per discussion.  Also, move various
info about the interlocking of transactions and snapshots out of code comments
and into a hopefully-more-cohesive discussion in access/transam/README.

Also, remove a couple of now-obsolete comments about having to force some WAL
to be written to persuade RecordTransactionCommit to do its thing.
2007-09-07 20:59:26 +00:00
Tom Lane 295e63983d Implement lazy XID allocation: transactions that do not modify any database
rows will normally never obtain an XID at all.  We already did things this way
for subtransactions, but this patch extends the concept to top-level
transactions.  In applications where there are lots of short read-only
transactions, this should improve performance noticeably; not so much from
removal of the actual XID-assignments, as from reduction of overhead that's
driven by the rate of XID consumption.  We add a concept of a "virtual
transaction ID" so that active transactions can be uniquely identified even
if they don't have a regular XID.  This is a much lighter-weight concept:
uniqueness of VXIDs is only guaranteed over the short term, and no on-disk
record is made about them.

Florian Pflug, with some editorialization by Tom.
2007-09-05 18:10:48 +00:00
Tom Lane 2abae34a2e Implement function-local GUC parameter settings, as per recent discussion.
There are still some loose ends: I didn't do anything about the SET FROM
CURRENT idea yet, and it's not real clear whether we are happy with the
interaction of SET LOCAL with function-local settings.  The documentation
is a bit spartan, too.
2007-09-03 00:39:26 +00:00
Tom Lane 4a78cdeb6b Support an optional asynchronous commit mode, in which we don't flush WAL
before reporting a transaction committed.  Data consistency is still
guaranteed (unlike setting fsync = off), but a crash may lose the effects
of the last few transactions.  Patch by Simon, some editorialization by Tom.
2007-08-01 22:45:09 +00:00
Tom Lane 6d6d14b6d5 Redefine IsTransactionState() to only return true for TRANS_INPROGRESS state,
which is the only state in which it's safe to initiate database queries.
It turns out that all but two of the callers thought that's what it meant;
and the other two were using it as a proxy for "will GetTopTransactionId()
return a nonzero XID"?  Since it was in fact an unreliable guide to that,
make those two just invoke GetTopTransactionId() always, then deal with a
zero result if they get one.
2007-06-07 21:45:59 +00:00
Tom Lane fa0e318f94 Fix overly-strict sanity check in BeginInternalSubTransaction that made it
fail when used in a deferred trigger.  Bug goes back to 8.0; no doubt the
reason it hadn't been noticed is that we've been discouraging use of
user-defined constraint triggers.  Per report from Frank van Vugt.
2007-05-30 21:01:39 +00:00
Tom Lane 77947c51c0 Fix up pgstats counting of live and dead tuples to recognize that committed
and aborted transactions have different effects; also teach it not to assume
that prepared transactions are always committed.

Along the way, simplify the pgstats API by tying counting directly to
Relations; I cannot detect any redeeming social value in having stats
pointers in HeapScanDesc and IndexScanDesc structures.  And fix a few
corner cases in which counts might be missed because the relation's
pgstat_info pointer hadn't been set.
2007-05-27 03:50:39 +00:00
Tom Lane c432061963 Change the timestamps recorded in transaction commit/abort xlog records
from time_t to TimestampTz representation.  This provides full gettimeofday()
resolution of the timestamps, which might be useful when attempting to
do point-in-time recovery --- previously it was not possible to specify
the stop point with sub-second resolution.  But mostly this is to get
rid of TimestampTz-to-time_t conversion overhead during commit.  Per my
proposal of a day or two back.
2007-04-30 21:01:53 +00:00
Tom Lane 957d08c81f Implement rate-limiting logic on how often backends will attempt to send
messages to the stats collector.  This avoids the problem that enabling
stats_row_level for autovacuum has a significant overhead for short
read-only transactions, as noted by Arjen van der Meijden.  We can avoid
an extra gettimeofday call by piggybacking on the one done for WAL-logging
xact commit or abort (although that doesn't help read-only transactions,
since they don't WAL-log anything).

In my proposal for this, I noted that we could change the WAL log entries
for commit/abort to record full TimestampTz precision, instead of only
time_t as at present.  That's not done in this patch, but will be committed
separately.
2007-04-30 03:23:49 +00:00
Tom Lane a2e923a652 Fix dynahash.c to suppress hash bucket splits while a hash_seq_search() scan
is in progress on the same hashtable.  This seems the least invasive way to
fix the recently-recognized problem that a split could cause the scan to
visit entries twice or (with much lower probability) miss them entirely.
The only field-reported problem caused by this is the "failed to re-find
shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky,
which was caused by multiply visited entries.  However, it seems certain
that mdsync() is vulnerable to missing required fsync's due to missed
entries, and I am fearful that RelationCacheInitializePhase2() might be at
risk as well.  Because of that and the generalized hazard presented by this
bug, back-patch all the supported branches.

Along the way, fix pg_prepared_statement() and pg_cursor() to not assume
that the hashtables they are examining will stay static between calls.
This is risky regardless of the newly noted dynahash problem, because
hash_seq_search() has never promised to cope with deletion of table entries
other than the just-returned one.  There may be no bug here because the only
supported way to call these functions is via ExecMakeTableFunctionResult()
which will cycle them to completion before doing anything very interesting,
but it seems best to get rid of the assumption.  This affects 8.2 and HEAD
only, since those functions weren't there earlier.
2007-04-26 23:24:46 +00:00
Tom Lane 9c9b619473 Remove the CheckpointStartLock in favor of having backends show whether they
are in their commit critical sections via flags in the ProcArray.  Checkpoint
can watch the ProcArray to determine when it's safe to proceed.  This is
a considerably better solution to the original problem of race conditions
between checkpoint and transaction commit: it speeds up commit, since there's
one less lock to fool with, and it prevents the problem of checkpoint being
delayed indefinitely when there's a constant flow of commits.  Heikki, with
some kibitzing from Tom.
2007-04-03 16:34:36 +00:00
Tom Lane 4f896dac17 Arrange for PreventTransactionChain to reject commands submitted as part
of a multi-statement simple-Query message.  This bug goes all the way
back, but unfortunately is not nearly so easy to fix in existing releases;
it is only the recent ProcessUtility API change that makes it fixable in
HEAD.  Per report from William Garrison.
2007-03-22 19:55:04 +00:00
Peter Eisentraut f4ee82e3d3 Reverted waiting for further fixes:
Make configuration parameters fall back to their default values when they
are removed from the configuration file.

Joachim Wieland
2007-03-13 14:32:25 +00:00
Tom Lane b9527e9840 First phase of plan-invalidation project: create a plan cache management
module and teach PREPARE and protocol-level prepared statements to use it.
In service of this, rearrange utility-statement processing so that parse
analysis does not assume table schemas can't change before execution for
utility statements (necessary because we don't attempt to re-acquire locks
for utility statements when reusing a stored plan).  This requires some
refactoring of the ProcessUtility API, but it ends up cleaner anyway,
for instance we can get rid of the QueryContext global.

Still to do: fix up SPI and related code to use the plan cache; I'm tempted to
try to make SQL functions use it too.  Also, there are at least some aspects
of system state that we want to ensure remain the same during a replan as in
the original processing; search_path certainly ought to behave that way for
instance, and perhaps there are others.
2007-03-13 00:33:44 +00:00
Peter Eisentraut f84308f195 Make configuration parameters fall back to their default values when they
are removed from the configuration file.

Joachim Wieland
2007-03-12 22:09:28 +00:00
Tom Lane c398300330 Combine cmin and cmax fields of HeapTupleHeaders into a single field, by
keeping private state in each backend that has inserted and deleted the same
tuple during its current top-level transaction.  This is sufficient since
there is no need to be able to determine the cmin/cmax from any other
transaction.  This gets us back down to 23-byte headers, removing a penalty
paid in 8.0 to support subtransactions.  Patch by Heikki Linnakangas, with
minor revisions by moi, following a design hashed out awhile back on the
pghackers list.
2007-02-09 03:35:35 +00:00
Tom Lane aec4cf1c8c Add a function pg_stat_clear_snapshot() that discards any statistics snapshot
already collected in the current transaction; this allows plpgsql functions to
watch for stats updates even though they are confined to a single transaction.
Use this instead of the previous kluge involving pg_stat_file() to wait for
the stats collector to update in the stats regression test.  Internally,
decouple storage of stats snapshots from transaction boundaries; they'll
now stick around until someone calls pgstat_clear_snapshot --- which xact.c
still does at transaction end, to maintain the previous behavior.  This makes
the logic a lot cleaner, at the price of a couple dozen cycles per transaction
exit.
2007-02-07 23:11:30 +00:00
Bruce Momjian 8b4ff8b6a1 Wording cleanup for error messages. Also change can't -> cannot.
Standard English uses "may", "can", and "might" in different ways:

        may - permission, "You may borrow my rake."

        can - ability, "I can lift that log."

        might - possibility, "It might rain today."

Unfortunately, in conversational English, their use is often mixed, as
in, "You may use this variable to do X", when in fact, "can" is a better
choice.  Similarly, "It may crash" is better stated, "It might crash".
2007-02-01 19:10:30 +00:00
Bruce Momjian 29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Neil Conway 886a02d1cb Add a txn_start column to pg_stat_activity. This makes it easier to
identify long-running transactions. Since we already need to record
the transaction-start time (e.g. for now()), we don't need any
additional system calls to report this information.

Catversion bumped, initdb required.
2006-12-06 18:06:48 +00:00
Tom Lane 395249ecbe Several changes to reduce the probability of running out of memory during
AbortTransaction, which would lead to recursion and eventual PANIC exit
as illustrated in recent report from Jeff Davis.  First, in xact.c create
a special dedicated memory context for AbortTransaction to run in.  This
solves the problem as long as AbortTransaction doesn't need more than 32K
(or whatever other size we create the context with).  But in corner cases
it might.  Second, in trigger.c arrange to keep pending after-trigger event
records in separate contexts that can be freed near the beginning of
AbortTransaction, rather than having them persist until CleanupTransaction
as before.  Third, in portalmem.c arrange to free executor state data
earlier as well.  These two changes should result in backing off the
out-of-memory condition before AbortTransaction needs any significant
amount of memory, at least in typical cases such as memory overrun due
to too many trigger events or too big an executor hash table.  And all
the same for subtransaction abort too, of course.
2006-11-23 01:14:59 +00:00
Tom Lane 48188e1621 Fix recently-understood problems with handling of XID freezing, particularly
in PITR scenarios.  We now WAL-log the replacement of old XIDs with
FrozenTransactionId, so that such replacement is guaranteed to propagate to
PITR slave databases.  Also, rather than relying on hint-bit updates to be
preserved, pg_clog is not truncated until all instances of an XID are known to
have been replaced by FrozenTransactionId.  Add new GUC variables and
pg_autovacuum columns to allow management of the freezing policy, so that
users can trade off the size of pg_clog against the amount of freezing work
done.  Revise the already-existing code that forces autovacuum of tables
approaching the wraparound point to make it more bulletproof; also, revise the
autovacuum logic so that anti-wraparound vacuuming is done per-table rather
than per-database.  initdb forced because of changes in pg_class, pg_database,
and pg_autovacuum catalogs.  Heikki Linnakangas, Simon Riggs, and Tom Lane.
2006-11-05 22:42:10 +00:00
Bruce Momjian f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Tom Lane ca1fd0ea5b Move xact.c's partial support for Lists of TransactionIds into pg_list.h.
Needed because lock.c is now going to use the same type of list.
2006-08-27 19:11:46 +00:00
Alvaro Herrera 92c2ecc130 Modify snapshot definition so that lazy vacuums are ignored by other
vacuums.  This allows a OLTP-like system with big tables to continue
regular vacuuming on small-but-frequently-updated tables while the
big tables are being vacuumed.

Original patch from Hannu Krossing, rewritten by Tom Lane and updated
by me.
2006-07-30 02:07:18 +00:00
Peter Eisentraut e9b4969062 DTrace support, with a small initial set of probes
by Robert Lor
2006-07-24 16:32:45 +00:00
Bruce Momjian e0522505bd Remove 576 references of include files that were not needed. 2006-07-14 14:52:27 +00:00
Bruce Momjian a22d76d96a Allow include files to compile own their own.
Strip unused include files out unused include files, and add needed
includes to C files.

The next step is to remove unused include files in C files.
2006-07-13 16:49:20 +00:00
Tom Lane 27c3e3de09 Remove redundant gettimeofday() calls to the extent practical without
changing semantics too much.  statement_timestamp is now set immediately
upon receipt of a client command message, and the various places that used
to do their own gettimeofday() calls to mark command startup are referenced
to that instead.  I have also made stats_command_string use that same
value for pg_stat_activity.query_start for both the command itself and
its eventual replacement by <IDLE> or <idle in transaction>.  There was
some debate about that, but no argument that seemed convincing enough to
justify an extra gettimeofday() call.
2006-06-20 22:52:00 +00:00
Bruce Momjian e6004f0151 Add statement_timestamp(), clock_timestamp(), and
transaction_timestamp() (just like now()).

Also update statement_timeout() to mention it is statement arrival time
that is measured.

Catalog version updated.
2006-04-25 00:25:22 +00:00
Tom Lane 6d61cdec07 Clean up and document the API for XLogOpenRelation and XLogReadBuffer.
This commit doesn't make much functional change, but it does eliminate some
duplicated code --- for instance, PageIsNew tests are now done inside
XLogReadBuffer rather than by each caller.
The GIST xlog code still needs a lot of love, but I'll worry about that
separately.
2006-03-29 21:17:39 +00:00
Tom Lane 0a20207060 Arrange to emit a description of the current XLOG record as error context
when an error occurs during xlog replay.  Also, replace the former risky
'write into a fixed-size buffer with no overflow detection' API for XLOG
record description routines; use an expansible StringInfo instead.  (The
latter accounts for most of the patch bulk.)

Qingqing Zhou
2006-03-24 04:32:13 +00:00
Bruce Momjian f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Bruce Momjian 436a2956d8 Re-run pgindent, fixing a problem where comment lines after a blank
comment line where output as too long, and update typedefs for /lib
directory.  Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).

Backpatch to 8.1.X.
2005-11-22 18:17:34 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane f39f6b500f Seems that the childXids list would be better based on Oid lists than
integer lists.
2005-08-20 23:45:08 +00:00
Tom Lane f8d0a82bf9 Avoid an Assert failure if OuterUserId hasn't been set yet during
AbortTransaction.  This can happen if a backend's InitPostgres transaction
fails (eg, because the given username is invalid).  Per Alvaro.
2005-08-17 22:14:34 +00:00
Tom Lane 4568e0f791 Modify AtEOXact_CatCache and AtEOXact_RelationCache to assume that the
ResourceOwner mechanism already released all reference counts for the
cache entries; therefore, we do not need to scan the catcache or relcache
at transaction end, unless we want to do it as a debugging crosscheck.
Do the crosscheck only in Assert mode.  This is the same logic we had
previously installed in AtEOXact_Buffers to avoid overhead with large
numbers of shared buffers.  I thought it'd be a good idea to do it here
too, in view of Kari Lavikka's recent report showing a real-world case
where AtEOXact_CatCache is taking a significant fraction of runtime.
2005-08-08 19:17:23 +00:00
Tom Lane e5d6b91220 Add SET ROLE. This is a partial commit of Stephen Frost's recent patch;
I'm still working on the has_role function and information_schema changes.
2005-07-25 22:12:34 +00:00
Tom Lane f2bf2d2dc5 Fix a couple of bogus comments, per Alvaro. 2005-07-13 22:46:09 +00:00
Tom Lane b5f7cff84f Clean up the rather historically encumbered interface to now() and
current time: provide a GetCurrentTimestamp() function that returns
current time in the form of a TimestampTz, instead of separate time_t
and microseconds fields.  This is what all the callers really want
anyway, and it eliminates low-level dependencies on AbsoluteTime,
which is a deprecated datatype that will have to disappear eventually.
2005-06-29 22:51:57 +00:00
Tom Lane 7762619e95 Replace pg_shadow and pg_group by new role-capable catalogs pg_authid
and pg_auth_members.  There are still many loose ends to finish in this
patch (no documentation, no regression tests, no pg_dump support for
instance).  But I'm going to commit it now anyway so that Alvaro can
make some progress on shared dependencies.  The catalog changes should
be pretty much done.
2005-06-28 05:09:14 +00:00
Tom Lane e26b0abda3 Arrange to fsync two-phase-commit state files only during checkpoints;
given reasonably short lifespans for prepared transactions, this should
mean that only a small minority of state files ever need to be fsynced
at all.  Per discussion with Heikki Linnakangas.
2005-06-19 20:00:39 +00:00
Tom Lane a8d1075f27 Add a time-of-preparation column to the pg_prepared_xacts view, per an
old suggestion by Oliver Jowett.  Also, add a transaction column to the
pg_locks view to show the xid of each transaction holding or awaiting
locks; this allows prepared transactions to be properly associated with
the locks they own.  There was already a column named 'transaction',
and I chose to rename it to 'transactionid' --- since this column is
new in the current devel cycle there should be no backwards compatibility
issue to worry about.
2005-06-18 19:33:42 +00:00
Tom Lane d0a89683a3 Two-phase commit. Original patch by Heikki Linnakangas, with additional
hacking by Alvaro Herrera and Tom Lane.
2005-06-17 22:32:51 +00:00
Tom Lane ee7ac7b11e Modify XLogInsert API to make callers specify whether pages to be backed
up have the standard layout with unused space between pd_lower and pd_upper.
When this is set, XLogInsert will omit the unused space without bothering
to scan it to see if it's zero.  That saves time in XLogInsert, and also
allows reversion of my earlier patch to make PageRepairFragmentation et al
explicitly re-zero freed space.  Per suggestion by Heikki Linnakangas.
2005-06-06 20:22:58 +00:00
Tom Lane 4c8495a1f2 Remove the mostly-stubbed-out-anyway support routines for WAL UNDO.
That code is never going to be used in the foreseeable future, and
where it's more than a stub it's making the redo routines harder to
read.
2005-06-06 17:01:25 +00:00
Tom Lane ff0c143a3b Make a comment pgindent-proof, per suggestion from Alvaro. 2005-05-19 23:58:51 +00:00
Tom Lane ee3b71f6bc Split the shared-memory array of PGPROC pointers out of the sinval
communication structure, and make it its own module with its own lock.
This should reduce contention at least a little, and it definitely makes
the code seem cleaner.  Per my recent proposal.
2005-05-19 21:35:48 +00:00
Tom Lane bedb78d386 Implement sharable row-level locks, and use them for foreign key references
to eliminate unnecessary deadlocks.  This commit adds SELECT ... FOR SHARE
paralleling SELECT ... FOR UPDATE.  The implementation uses a new SLRU
data structure (managed much like pg_subtrans) to represent multiple-
transaction-ID sets.  When more than one transaction is holding a shared
lock on a particular row, we create a MultiXactId representing that set
of transactions and store its ID in the row's XMAX.  This scheme allows
an effectively unlimited number of row locks, just as we did before,
while not costing any extra overhead except when a shared lock actually
has to be shared.   Still TODO: use the regular lock manager to control
the grant order when multiple backends are waiting for a row lock.

Alvaro Herrera and Tom Lane.
2005-04-28 21:47:18 +00:00
Tom Lane c3294f1cbf Fix interaction between materializing holdable cursors and firing
deferred triggers: either one can create more work for the other,
so we have to loop till it's all gone.  Per example from andrew@supernews.
Add a regression test to help spot trouble in this area in future.
2005-04-11 19:51:16 +00:00
Tom Lane 119191609c Remove dead push/pop rollback code. Vadim once planned to implement
transaction rollback via UNDO but I think that's highly unlikely to
happen, so we may as well remove the stubs.  (Someday we ought to
rip out the stub xxx_undo routines, too.)  Per Alvaro.
2005-03-28 01:50:34 +00:00
Tom Lane 4aefe75553 Remove some no-longer-needed kluges for bootstrapping, in particular
the AMI_OVERRIDE flag.  The fact that TransactionLogFetch treats
BootstrapTransactionId as always committed is sufficient to make
bootstrap work, and getting rid of extra tests in heavily used code
paths seems like a win.  The files produced by initdb are demonstrably
the same after this change.
2005-02-20 21:46:50 +00:00
Tom Lane 60b2444cc3 Add code to prevent transaction ID wraparound by enforcing a safe limit
in GetNewTransactionId().  Since the limit value has to be computed
before we run any real transactions, this requires adding code to database
startup to scan pg_database and determine the oldest datfrozenxid.
This can conveniently be combined with the first stage of an attack on
the problem that the 'flat file' copies of pg_shadow and pg_group are
not properly updated during WAL recovery.  The code I've added to
startup resides in a new file src/backend/utils/init/flatfiles.c, and
it is responsible for rewriting the flat files as well as initializing
the XID wraparound limit value.  This will eventually allow us to get
rid of GetRawDatabaseInfo too, but we'll need an initdb so we can add
a trigger to pg_database.
2005-02-20 02:22:07 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Tom Lane 88868d4fbc Change COMMIT back to the old behavior of emitting command tag COMMIT,
not ROLLBACK, for the case of COMMIT outside a transaction block.
Alvaro Herrera
2004-10-30 20:44:43 +00:00
Tom Lane 23f264d125 Rearrange order of pre-commit operations: must close cursors before doing
ON COMMIT actions.  Per bug report from Michael Guerin.
2004-10-29 22:19:53 +00:00
Tom Lane fdd13f1568 Give the ResourceOwner mechanism full responsibility for releasing buffer
pins at end of transaction, and reduce AtEOXact_Buffers to an Assert
cross-check that this was done correctly.  When not USE_ASSERT_CHECKING,
AtEOXact_Buffers is a complete no-op.  This gets rid of an O(NBuffers)
bottleneck during transaction commit/abort, which recent testing has shown
becomes significant above a few tens of thousands of shared buffers.
2004-10-16 18:57:26 +00:00
Tom Lane 4c77cbb272 PortalRun must guard against the possibility that the portal it's
running contains VACUUM or a similar command that will internally start
and commit transactions.  In such a case, the original caller values of
CurrentMemoryContext and CurrentResourceOwner will point to objects that
will be destroyed by the internal commit.  We must restore these pointers
to point to the newly-manufactured transaction context and resource owner,
rather than possibly pointing to deleted memory.
Also tweak xact.c so that AbortTransaction and AbortSubTransaction
forcibly restore a sane value for CurrentResourceOwner, much as they
have always done for CurrentMemoryContext.  I'm not certain this is
necessary but I'm feeling paranoid today.
Responds to Sean Chittenden's bug report of 4-Oct.
2004-10-04 21:52:15 +00:00
Tom Lane 257cccbe5e Add some marginal tweaks to eliminate memory leakages associated with
subtransactions.  Trivial subxacts (such as a plpgsql exception block
containing no database access) now demonstrably leak zero bytes.
2004-09-16 20:17:49 +00:00
Tom Lane 8f9f198603 Restructure subtransaction handling to reduce resource consumption,
as per recent discussions.  Invent SubTransactionIds that are managed like
CommandIds (ie, counter is reset at start of each top transaction), and
use these instead of TransactionIds to keep track of subtransaction status
in those modules that need it.  This means that a subtransaction does not
need an XID unless it actually inserts/modifies rows in the database.
Accordingly, don't assign it an XID nor take a lock on the XID until it
tries to do that.  This saves a lot of overhead for subtransactions that
are only used for error recovery (eg plpgsql exceptions).  Also, arrange
to release a subtransaction's XID lock as soon as the subtransaction
exits, in both the commit and abort cases.  This avoids holding many
unique locks after a long series of subtransactions.  The price is some
additional overhead in XactLockTableWait, but that seems acceptable.
Finally, restructure the state machine in xact.c to have a more orthogonal
set of states for subtransactions.
2004-09-16 16:58:44 +00:00
Tom Lane b2c4071299 Redesign query-snapshot timing so that volatile functions in READ COMMITTED
mode see a fresh snapshot for each command in the function, rather than
using the latest interactive command's snapshot.  Also, suppress fresh
snapshots as well as CommandCounterIncrement inside STABLE and IMMUTABLE
functions, instead using the snapshot taken for the most closely nested
regular query.  (This behavior is only sane for read-only functions, so
the patch also enforces that such functions contain only SELECT commands.)
As per my proposal of 6-Sep-2004; I note that I floated essentially the
same proposal on 19-Jun-2002, but that discussion tailed off without any
action.  Since 8.0 seems like the right place to be taking possibly
nontrivial backwards compatibility hits, let's get it done now.
2004-09-13 20:10:13 +00:00
Tom Lane b339d1fff6 Fire non-deferred AFTER triggers immediately upon query completion,
rather than when returning to the idle loop.  This makes no particular
difference for interactively-issued queries, but it makes a big difference
for queries issued within functions: trigger execution now occurs before
the calling function is allowed to proceed.  This responds to numerous
complaints about nonintuitive behavior of foreign key checking, such as
http://archives.postgresql.org/pgsql-bugs/2004-09/msg00020.php, and
appears to be required by the SQL99 spec.
Also take the opportunity to simplify the data structures used for the
pending-trigger list, rename them for more clarity, and squeeze out a
bit of space.
2004-09-10 18:40:09 +00:00
Tom Lane 23645f0582 Fix incorrect ordering of smgr cleanup relative to buffer pin cleanup
during transaction abort.  Add a regression test case to catch related
mistakes in future.  Alvaro Herrera and Tom Lane.
2004-09-06 17:56:33 +00:00
Tom Lane 9cf4eaa00b Fix failure to advance nextXID beyond subtransactions whose XIDs appear
only within COMMIT or ABORT records.
2004-08-30 19:00:03 +00:00
Bruce Momjian 15d3f9f6b7 Another pgindent run with lib typedefs added. 2004-08-30 02:54:42 +00:00
Bruce Momjian b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane f78ecbf20e Now that TransactionIdDidAbort doesn't think it should try to modify
pg_clog, there's no reason to do abort marking of subtransactions in a
nonintuitive order.
2004-08-28 22:04:12 +00:00
Tom Lane 1c72d0dec1 Fix relcache to account properly for subtransaction status of 'new'
relcache entries.  Also, change TransactionIdIsCurrentTransactionId()
so that if consulted during transaction abort, it will not say that
the aborted xact is still current.  (It would be better to ensure that
it's never called at all during abort, but I'm not sure we can easily
guarantee that.)  In combination, these fix a crash we have seen
occasionally during parallel regression tests of 8.0.
2004-08-28 20:31:44 +00:00
Tom Lane fe455ee1d4 Revise ResourceOwner code to avoid accumulating ResourceOwner objects
for every command executed within a transaction.  For long transactions
this was a significant memory leak.  Instead, we can delete a portal's
or subtransaction's ResourceOwner immediately, if we physically transfer
the information about its locks up to the parent owner.  This does not
fully solve the leak problem; we need to do something about counting
multiple acquisitions of the same lock in order to fix it.  But it's a
necessary step along the way.
2004-08-25 18:43:43 +00:00
Tom Lane 3fdf649f4f Fix failure to guarantee that a checkpoint will write out pg_clog updates
for transaction commits that occurred just before the checkpoint.  This is
an EXTREMELY serious bug --- kudos to Satoshi Okada for creating a
reproducible test case to prove its existence.
2004-08-11 04:07:16 +00:00
Tom Lane a83c45c4c6 Fix misplacement of savepointLevel test, per report from Chris K-L. 2004-08-03 15:57:26 +00:00
Tom Lane 410b1dfb88 Update the in-code documentation about the transaction system. Move it
into a README file instead of being in xact.c's header comment.
Alvaro Herrera.
2004-08-01 20:57:59 +00:00
Tom Lane efcaf1e868 Some mop-up work for savepoints (nested transactions). Store a small
number of active subtransaction XIDs in each backend's PGPROC entry,
and use this to avoid expensive probes into pg_subtrans during
TransactionIdIsInProgress.  Extend EOXactCallback API to allow add-on
modules to get control at subxact start/end.  (This is deliberately
not compatible with the former API, since any uses of that API probably
need manual review anyway.)  Add basic reference documentation for
SAVEPOINT and related commands.  Minor other cleanups to check off some
of the open issues for subtransactions.
Alvaro Herrera and Tom Lane.
2004-08-01 17:32:22 +00:00
Tom Lane beda4814c1 plpgsql does exceptions.
There are still some things that need refinement; in particular I fear
that the recognized set of error condition names probably has little in
common with what Oracle recognizes.  But it's a start.
2004-07-31 07:39:21 +00:00
Tom Lane 1bf3d61504 Fix subtransaction behavior for large objects, temp namespace, files,
password/group files.  Also allow read-only subtransactions of a read-write
parent, but not vice versa.  These are the reasonably noncontroversial
parts of Alvaro's recent mop-up patch, plus further work on large objects
to minimize use of the TopTransactionResourceOwner.
2004-07-28 14:23:31 +00:00
Tom Lane cc813fc2b8 Replace nested-BEGIN syntax for subtransactions with spec-compliant
SAVEPOINT/RELEASE/ROLLBACK-TO syntax.  (Alvaro)
Cause COMMIT of a failed transaction to report ROLLBACK instead of
COMMIT in its command tag.  (Tom)
Fix a few loose ends in the nested-transactions stuff.
2004-07-27 05:11:48 +00:00
Tom Lane fe548629c5 Invent ResourceOwner mechanism as per my recent proposal, and use it to
keep track of portal-related resources separately from transaction-related
resources.  This allows cursors to work in a somewhat sane fashion with
nested transactions.  For now, cursor behavior is non-subtransactional,
that is a cursor's state does not roll back if you abort a subtransaction
that fetched from the cursor.  We might want to change that later.
2004-07-17 03:32:14 +00:00
Tom Lane b6197fe069 Further review of xact.c state machine for nested transactions. Fix
problems with starting subtransactions inside already-failed transactions.
Clean up some comments.
2004-07-01 20:11:03 +00:00
Tom Lane 573a71a5da Nested transactions. There is still much left to do, especially on the
performance front, but with feature freeze upon us I think it's time to
drive a stake in the ground and say that this will be in 7.5.

Alvaro Herrera, with some help from Tom Lane.
2004-07-01 00:52:04 +00:00
Tom Lane 921d749bd4 Adjust our timezone library to use pg_time_t (typedef'd as int64) in
place of time_t, as per prior discussion.  The behavior does not change
on machines without a 64-bit-int type, but on machines with one, which
is most, we are rid of the bizarre boundary behavior at the edges of
the 32-bit-time_t range (1901 and 2038).  The system will now treat
times over the full supported timestamp range as being in your local
time zone.  It may seem a little bizarre to consider that times in
4000 BC are PST or EST, but this is surely at least as reasonable as
propagating Gregorian calendar rules back that far.

I did not modify the format of the zic timezone database files, which
means that for the moment the system will not know about daylight-savings
periods outside the range 1901-2038.  Given the way the files are set up,
it's not a simple decision like 'widen to 64 bits'; we have to actually
think about the range of years that need to be supported.  We should
probably inquire what the plans of the upstream zic people are before
making any decisions of our own.
2004-06-03 02:08:07 +00:00
Tom Lane 4d86ae4260 For multi-table ANALYZE, use per-table transactions when possible
(ie, when not inside a transaction block), so that we can avoid holding
locks longer than necessary.  Per trouble report from Philip Warner.
2004-05-22 23:14:38 +00:00
Tom Lane 63bd0db121 Integrate src/timezone library for all platforms. There is more we can
and should do now that we control our own destiny for timezone handling,
but this commit gets the bulk of the picayune diffs in place.
Magnus Hagander and Tom Lane.
2004-05-21 05:08:06 +00:00
Bruce Momjian 823ac7c2b4 This is a cleanup patch for access/transam/xact.c. It only removes some
#ifdef NOT_USED code, and adds a new TBLOCK state which signals the fact
that StartTransaction() has been executed.

Alvaro Herrera
2004-04-05 03:11:39 +00:00
Tom Lane c3c09be34b Commit the reasonably uncontroversial parts of J.R. Nield's PITR patch, to
wit: Add a header record to each WAL segment file so that it can be reliably
identified.  Avoid splitting WAL records across segment files (this is not
strictly necessary, but makes it simpler to incorporate the header records).
Make WAL entries for file creation, deletion, and truncation (as foreseen but
never implemented by Vadim).  Also, add support for making XLOG_SEG_SIZE
configurable at compile time, similarly to BLCKSZ.  Fix a couple bugs I
introduced in WAL replay during recent smgr API changes.  initdb is forced
due to changes in pg_control contents.
2004-02-11 22:55:26 +00:00
Tom Lane 58f337a343 Centralize implementation of delay code by creating a pg_usleep()
subroutine in src/port/pgsleep.c.  Remove platform dependencies from
miscadmin.h and put them in port.h where they belong.  Extend recent
vacuum cost-based-delay patch to apply to VACUUM FULL, ANALYZE, and
non-btree index vacuuming.

By the way, where is the documentation for the cost-based-delay patch?
2004-02-10 03:42:45 +00:00
Tom Lane 87bd956385 Restructure smgr API as per recent proposal. smgr no longer depends on
the relcache, and so the notion of 'blind write' is gone.  This should
improve efficiency in bgwriter and background checkpoint processes.
Internal restructuring in md.c to remove the not-very-useful array of
MdfdVec objects --- might as well just use pointers.
Also remove the long-dead 'persistent main memory' storage manager (mm.c),
since it seems quite unlikely to ever get resurrected.
2004-02-10 01:55:27 +00:00
Bruce Momjian f4921e5ca3 Attached is a patch that fixes some trivial typos and alignment. Please
apply.

Alvaro Herrera
2004-01-26 22:51:56 +00:00
Bruce Momjian 38081fd000 Change PG_DELAY from msec to usec and use it consistenly rather than
select().   Add Win32 Sleep() for delay.
2004-01-09 21:08:50 +00:00
Neil Conway 192ad63bd7 More janitorial work: remove the explicit casting of NULL literals to a
pointer type when it is not necessary to do so.

For future reference, casting NULL to a pointer type is only necessary
when (a) invoking a function AND either (b) the function has no prototype
OR (c) the function is a varargs function.
2004-01-07 18:56:30 +00:00
Joe Conway e2605c8311 Add a warning to AtEOXact_SPI() to catch cases where the current
transaction has been committed without SPI_finish() being called
first. Per recent discussion here:
http://archives.postgresql.org/pgsql-patches/2003-11/msg00286.php
2003-12-02 19:26:47 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Tom Lane 90b2202975 Fix bad interaction between NOTIFY processing and V3 extended query
protocol, per report from Igor Shevchenko.  NOTIFY thought it could
do its thing if transaction blockState is TBLOCK_DEFAULT, but in
reality it had better check the low-level transaction state is
TRANS_DEFAULT as well.  Formerly it was not possible to wait for the
client in a state where the first is true and the second is not ...
but now we can have such a state.  Minor cleanup in StartTransaction()
as well.
2003-10-16 16:50:41 +00:00
Tom Lane 8934790052 Add a mechanism to let dynamically loaded modules register post-commit/
post-abort cleanup hooks.  I'm surprised that we have not needed this
already, but I need it now to fix a plpgsql problem, and the usefulness
for other dynamically loaded modules seems obvious.
2003-09-28 23:26:20 +00:00
Peter Eisentraut feb4f44d29 Message editing: remove gratuitous variations in message wording, standardize
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
2003-09-25 06:58:07 +00:00
Tom Lane a56a016ceb Repair some REINDEX problems per recent discussions. The relcache is
now able to cope with assigning new relfilenode values to nailed-in-cache
indexes, so they can be reindexed using the fully crash-safe method.  This
leaves only shared system indexes as special cases.  Remove the 'index
deactivation' code, since it provides no useful protection in the shared-
index case.  Require reindexing of shared indexes to be done in standalone
mode, but remove other restrictions on REINDEX.  -P (IgnoreSystemIndexes)
now prevents using indexes for lookups, but does not disable index updates.
It is therefore safe to allow from PGOPTIONS.  Upshot: reindexing system catalogs
can be done without a standalone backend for all cases except
shared catalogs.
2003-09-24 18:54:02 +00:00
Bruce Momjian 46785776c4 Another pgindent run with updated typedefs. 2003-08-08 21:42:59 +00:00
Bruce Momjian f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane ec7aa4b515 Error message editing in backend/access. 2003-07-21 20:29:40 +00:00
Tom Lane f85f43dfb5 Backend support for autocommit removed, per recent discussions. The
only remnant of this failed experiment is that the server will take
SET AUTOCOMMIT TO ON.  Still TODO: provide some client-side autocommit
logic in libpq.
2003-05-14 03:26:03 +00:00
Tom Lane de28dc9a04 Portal and memory management infrastructure for extended query protocol.
Both plannable queries and utility commands are now always executed
within Portals, which have been revamped so that they can handle the
load (they used to be good only for single SELECT queries).  Restructure
code to push command-completion-tag selection logic out of postgres.c,
so that it won't have to be duplicated between simple and extended queries.
initdb forced due to addition of a field to Query nodes.
2003-05-02 20:54:36 +00:00
Tom Lane 4db9689d1a Add transaction status field to ReadyForQuery messages, and make room
for tableID/columnID in RowDescription.  (The latter isn't really
implemented yet though --- the backend always sends zeroes, and libpq
just throws away the data.)
2003-04-26 20:23:00 +00:00
Bruce Momjian 54f7338fa1 This patch implements holdable cursors, following the proposal
(materialization into a tuple store) discussed on pgsql-hackers earlier.
I've updated the documentation and the regression tests.

Notes on the implementation:

- I needed to change the tuple store API slightly -- it assumes that it
won't be used to hold data across transaction boundaries, so the temp
files that it uses for on-disk storage are automatically reclaimed at
end-of-transaction. I added a flag to tuplestore_begin_heap() to control
this behavior. Is changing the tuple store API in this fashion OK?

- in order to store executor results in a tuple store, I added a new
CommandDest. This works well for the most part, with one exception: the
current DestFunction API doesn't provide enough information to allow the
Executor to store results into an arbitrary tuple store (where the
particular tuple store to use is chosen by the call site of
ExecutorRun). To workaround this, I've temporarily hacked up a solution
that works, but is not ideal: since the receiveTuple DestFunction is
passed the portal name, we can use that to lookup the Portal data
structure for the cursor and then use that to get at the tuple store the
Portal is using. This unnecessarily ties the Portal code with the
tupleReceiver code, but it works...

The proper fix for this is probably to change the DestFunction API --
Tom suggested passing the full QueryDesc to the receiveTuple function.
In that case, callers of ExecutorRun could "subclass" QueryDesc to add
any additional fields that their particular CommandDest needed to get
access to. This approach would work, but I'd like to think about it for
a little bit longer before deciding which route to go. In the mean time,
the code works fine, so I don't think a fix is urgent.

- (semi-related) I added a NO SCROLL keyword to DECLARE CURSOR, and
adjusted the behavior of SCROLL in accordance with the discussion on
-hackers.

- (unrelated) Cleaned up some SGML markup in sql.sgml, copy.sgml

Neil Conway
2003-03-27 16:51:29 +00:00
Bruce Momjian 9a9719e482 Allow error query to start transaction in autocommit off mode. 2003-03-21 04:33:15 +00:00
Bruce Momjian c90354bad0 Remove unneeded dash blocks around function start comments. 2003-03-14 22:40:31 +00:00
Tom Lane e4704001ea This patch fixes a bunch of spelling mistakes in comments throughout the
PostgreSQL source code.

Neil Conway
2003-03-10 22:28:22 +00:00
Peter Eisentraut b65cd56240 Read-only transactions, as defined in SQL. 2003-01-10 22:03:30 +00:00
Bruce Momjian 1b7f3cc02d This patch implements FOR EACH STATEMENT triggers, per my email to
-hackers a couple days ago.

Notes/caveats:

        - added regression tests for the new functionality, all
          regression tests pass on my machine

        - added pg_dump support

        - updated PL/PgSQL to support per-statement triggers; didn't
          look at the other procedural languages.

        - there's (even) more code duplication in trigger.c than there
          was previously. Any suggestions on how to refactor the
          ExecXXXTriggers() functions to reuse more code would be
          welcome -- I took a brief look at it, but couldn't see an
          easy way to do it (there are several subtly-different
          versions of the code in question)

        - updated the documentation. I also took the liberty of
          removing a big chunk of duplicated syntax documentation in
          the Programmer's Guide on triggers, and moving that
          information to the CREATE TRIGGER reference page.

        - I also included some spelling fixes and similar small
          cleanups I noticed while making the changes. If you'd like
          me to split those into a separate patch, let me know.

Neil Conway
2002-11-23 03:59:09 +00:00
Tom Lane 17ac74797a Put back error test for DECLARE CURSOR outside a transaction block ...
but do it correctly now.
2002-11-18 01:17:39 +00:00
Bruce Momjian 63e9734542 Update xact.c comments for clarity. 2002-11-13 03:12:05 +00:00
Tom Lane f9b5b41ef9 Code review for ON COMMIT patch. Make the actual on-commit action happen
before commit, not after :-( --- the original coding is not only unsafe
if an error occurs while it's processing, but it generates an invalid
sequence of WAL entries.  Resurrect 7.2 logic for deleting items when
no longer needed.  Use an enum instead of random macros.  Editorialize
on names used for routines and constants.  Teach backend/nodes routines
about new field in CreateTable struct.  Add a regression test.
2002-11-11 22:19:25 +00:00