Commit Graph

954 Commits

Author SHA1 Message Date
Tom Lane 19d127548c Add comment about checkpoint panic behavior during shutdown, per
suggestion from Qingqing Zhou.
2005-04-23 18:49:54 +00:00
Tom Lane 3842892492 Recent changes got the sense of the notnull bit backwards in the 2.0
protocol output routines.  Mea culpa :-(.  Per report from Kris Jurka.
2005-04-23 17:45:35 +00:00
Bruce Momjian 1a6ad669fb Fix comment typo. 2005-04-17 03:04:29 +00:00
Tom Lane 5f0a974ea9 Reduce PANIC to ERROR in several xlog routines that are used in both
critical and noncritical contexts (an example of noncritical being
post-checkpoint removal of dead xlog segments).  In the critical cases
the CRIT_SECTION mechanism will cause ERROR to be promoted to PANIC
anyway, and in the noncritical cases we shouldn't let an error take
down the entire database.  Arguably there should be *no* explicit
PANIC errors in this module, only more START/END_CRIT_SECTION calls,
but I didn't go that far.  (Yet.)
2005-04-15 22:19:48 +00:00
Tom Lane 61b861421b Modify MoveOfflineLogs/InstallXLogFileSegment to avoid O(N^2) behavior
when recycling a large number of xlog segments during checkpoint.
The former behavior searched from the same start point each time,
requiring O(checkpoint_segments^2) stat() calls to relocate all the
segments.  Instead keep track of where we stopped last time through.
2005-04-15 18:48:10 +00:00
Tom Lane 8e14408028 Make equalTupleDescs() compare attlen/attbyval/attalign rather than
assuming comparison of atttypid is sufficient.  In a dropped column
atttypid will be 0, and we'd better check the physical-storage data
to make sure the tupdescs are physically compatible.
I do not believe there is a real risk before 8.0, since before that
we only used this routine to compare successive states of the tupdesc
for a particular relation.  But 8.0's typcache.c might be comparing
arbitrary tupdescs so we'd better play it safer.
2005-04-14 22:34:48 +00:00
Tom Lane 162bd08b3f Completion of project to use fixed OIDs for all system catalogs and
indexes.  Replace all heap_openr and index_openr calls by heap_open
and index_open.  Remove runtime lookups of catalog OID numbers in
various places.  Remove relcache's support for looking up system
catalogs by name.  Bulky but mostly very boring patch ...
2005-04-14 20:03:27 +00:00
Tom Lane 2193a856a2 Simplify initdb-time assignment of OIDs as I proposed yesterday, and
avoid encroaching on the 'user' range of OIDs by allowing automatic
OID assignment to use values below 16k until we reach normal operation.

initdb not forced since this doesn't make any incompatible change;
however a lot of stuff will have different OIDs after your next initdb.
2005-04-13 18:54:57 +00:00
Tom Lane c3294f1cbf Fix interaction between materializing holdable cursors and firing
deferred triggers: either one can create more work for the other,
so we have to loop till it's all gone.  Per example from andrew@supernews.
Add a regression test to help spot trouble in this area in future.
2005-04-11 19:51:16 +00:00
Tom Lane ad161bcc8a Merge Resdom nodes into TargetEntry nodes to simplify code and save a
few palloc's.  I also chose to eliminate the restype and restypmod fields
entirely, since they are redundant with information stored in the node's
contained expression; re-examining the expression at need seems simpler
and more reliable than trying to keep restype/restypmod up to date.

initdb forced due to change in contents of stored rules.
2005-04-06 16:34:07 +00:00
Tom Lane 47888fe842 First phase of OUT-parameters project. We can now define and use SQL
functions with OUT parameters.  The various PLs still need work, as does
pg_dump.  Rudimentary docs and regression tests included.
2005-03-31 22:46:33 +00:00
Tom Lane 8c85a34a3b Officially decouple FUNC_MAX_ARGS from INDEX_MAX_KEYS, and set the
former to 100 by default.  Clean up some of the less necessary
dependencies on FUNC_MAX_ARGS; however, the biggie (FunctionCallInfoData)
remains.
2005-03-29 03:01:32 +00:00
Tom Lane 70c9763d48 Convert oidvector and int2vector into variable-length arrays. This
change saves a great deal of space in pg_proc and its primary index,
and it eliminates the former requirement that INDEX_MAX_KEYS and
FUNC_MAX_ARGS have the same value.  INDEX_MAX_KEYS is still embedded
in the on-disk representation (because it affects index tuple header
size), but FUNC_MAX_ARGS is not.  I believe it would now be possible
to increase FUNC_MAX_ARGS at little cost, but haven't experimented yet.
There are still a lot of vestigial references to FUNC_MAX_ARGS, which
I will clean up in a separate pass.  However, getting rid of it
altogether would require changing the FunctionCallInfoData struct,
and I'm not sure I want to buy into that.
2005-03-29 00:17:27 +00:00
Tom Lane 119191609c Remove dead push/pop rollback code. Vadim once planned to implement
transaction rollback via UNDO but I think that's highly unlikely to
happen, so we may as well remove the stubs.  (Someday we ought to
rip out the stub xxx_undo routines, too.)  Per Alvaro.
2005-03-28 01:50:34 +00:00
Tom Lane bf3dbb5881 First steps towards index scans with heap access decoupled from index
access: define new index access method functions 'amgetmulti' that can
fetch multiple TIDs per call.  (The functions exist but are totally
untested as yet.)  Since I was modifying pg_am anyway, remove the
no-longer-needed 'rel' parameter from amcostestimate functions, and
also remove the vestigial amowner column that was creating useless
work for Alvaro's shared-object-dependencies project.
Initdb forced due to changes in pg_am.
2005-03-27 23:53:05 +00:00
Tom Lane 617dd33b6e Eliminate duplicate hasnulls bit testing in index tuple access, and
clean up itup.h a little bit.
2005-03-27 18:38:27 +00:00
Bruce Momjian b1f57d88f5 Change Win32 O_SYNC method to O_DSYNC because that is what the method
currently does.  This is now the default Win32 wal sync method because
we perfer o_datasync to fsync.

Also, change Win32 fsync to a new wal sync method called
fsync_writethrough because that is the behavior of _commit, which is
what is used for fsync on Win32.

Backpatch to 8.0.X.
2005-03-24 04:36:20 +00:00
Tom Lane 94e03330cb Create a routine PageIndexMultiDelete() that replaces a loop around
PageIndexTupleDelete() with a single pass of compactification ---
logic mostly lifted from PageRepairFragmentation.  I noticed while
profiling that a VACUUM that's cleaning up a whole lot of deleted
tuples would spend as much as a third of its CPU time in
PageIndexTupleDelete; not too surprising considering the loop method
was roughly O(N^2) in the number of tuples involved.
2005-03-22 06:17:03 +00:00
Tom Lane ee4ddac137 Convert index-related tuple handling routines from char 'n'/' ' to bool
convention for isnull flags.  Also, remove the useless InsertIndexResult
return struct from index AM aminsert calls --- there is no reason for
the caller to know where in the index the tuple was inserted, and we
were wasting a palloc cycle per insert to deliver this uninteresting
value (plus nontrivial complexity in some AMs).
I forced initdb because of the change in the signature of the aminsert
routines, even though nothing really looks at those pg_proc entries...
2005-03-21 01:24:04 +00:00
Neil Conway fe7015f5e8 Change the return value of HeapTupleSatisfiesUpdate() to be an enum,
rather than an integer, and fix the associated fallout. From Alvaro
Herrera.
2005-03-20 23:40:34 +00:00
Tom Lane 354049c709 Remove unnecessary calls of FlushRelationBuffers: there is no need
to write out data that we are about to tell the filesystem to drop.
smgr_internal_unlink already had a DropRelFileNodeBuffers call to
get rid of dead buffers without a write after it's no longer possible
to roll back the deleting transaction.  Adding a similar call in
smgrtruncate simplifies callers and makes the overall division of
labor clearer.  This patch removes the former behavior that VACUUM
would write all dirty buffers of a relation unconditionally.
2005-03-20 22:00:54 +00:00
Tom Lane f97aebd162 Revise TupleTableSlot code to avoid unnecessary construction and disassembly
of tuples when passing data up through multiple plan nodes.  A slot can now
hold either a normal "physical" HeapTuple, or a "virtual" tuple consisting
of Datum/isnull arrays.  Upper plan levels can usually just copy the Datum
arrays, avoiding heap_formtuple() and possible subsequent nocachegetattr()
calls to extract the data again.  This work extends Atsushi Ogawa's earlier
patch, which provided the key idea of adding Datum arrays to TupleTableSlots.
(I believe however that something like this was foreseen way back in Berkeley
days --- see the old comment on ExecProject.)  A test case involving many
levels of join of fairly wide tables (about 80 columns altogether) showed
about 3x overall speedup, though simple queries will probably not be
helped very much.

I have also duplicated some code in heaptuple.c in order to provide versions
of heap_formtuple and friends that use "bool" arrays to indicate null
attributes, instead of the old convention of "char" arrays containing either
'n' or ' '.  This provides a better match to the convention used by
ExecEvalExpr.  While I have not made a concerted effort to get rid of uses
of the old routines, I think they should be deprecated and eventually removed.
2005-03-16 21:38:10 +00:00
Tom Lane a9b05bdc83 Avoid O(N^2) overhead in repeated nocachegetattr calls when columns of
a tuple are being accessed via ExecEvalVar and the attcacheoff shortcut
isn't usable (due to nulls and/or varlena columns).  To do this, cache
Datums extracted from a tuple in the associated TupleTableSlot.
Also some code cleanup in and around the TupleTable handling.
Atsushi Ogawa with some kibitzing by Tom Lane.
2005-03-14 04:41:13 +00:00
Tom Lane a52b4fb131 Adjust creation/destruction of TupleDesc data structure to reduce the
number of palloc calls.  This has a salutory impact on plpgsql operations
with record variables (which create and destroy tupdescs constantly)
and probably helps a bit in some other cases too.
2005-03-07 04:42:17 +00:00
Tom Lane 4aefe75553 Remove some no-longer-needed kluges for bootstrapping, in particular
the AMI_OVERRIDE flag.  The fact that TransactionLogFetch treats
BootstrapTransactionId as always committed is sufficient to make
bootstrap work, and getting rid of extra tests in heavily used code
paths seems like a win.  The files produced by initdb are demonstrably
the same after this change.
2005-02-20 21:46:50 +00:00
Tom Lane 60b2444cc3 Add code to prevent transaction ID wraparound by enforcing a safe limit
in GetNewTransactionId().  Since the limit value has to be computed
before we run any real transactions, this requires adding code to database
startup to scan pg_database and determine the oldest datfrozenxid.
This can conveniently be combined with the first stage of an attack on
the problem that the 'flat file' copies of pg_shadow and pg_group are
not properly updated during WAL recovery.  The code I've added to
startup resides in a new file src/backend/utils/init/flatfiles.c, and
it is responsible for rewriting the flat files as well as initializing
the XID wraparound limit value.  This will eventually allow us to get
rid of GetRawDatabaseInfo too, but we'll need an initdb so we can add
a trigger to pg_database.
2005-02-20 02:22:07 +00:00
Bruce Momjian 7c44e57331 Move plpgsql DEBUG from DEBUG2 to DEBUG1 because it is a user-requested
DEBUG.

Fix a few places where DEBUG1 crept in that should have been DEBUG2.
2005-02-12 23:53:42 +00:00
Tom Lane 12179c99b1 Marginal hack to merge adjacent ReleaseBuffer/ReadBuffer calls into
ReleaseAndReadBuffer during GIST index searches.  We already did this
in btree and rtree, might as well do it here too.
2005-02-05 19:38:58 +00:00
Neil Conway a885ecd6ef Change heap_modifytuple() to require a TupleDesc rather than a
Relation. Patch from Alvaro Herrera, minor editorializing by
Neil Conway.
2005-01-27 23:24:11 +00:00
Tom Lane 0ffe9f7946 Fix memory leak in rtdosplit, per report from Clive Page. 2005-01-24 02:47:26 +00:00
Neil Conway b4297c177c This patch makes some improvements to the rtree index implementation:
(1) Keep a pin on the scan's current buffer and mark buffer. This
avoids the need to do a ReadBuffer() for each tuple produced by the
scan. Since ReadBuffer() is expensive, this is a significant win.

(2) Convert a ReleaseBuffer(); ReadBuffer() pair into
ReleaseAndReadBuffer(). Surely not a huge win, but it saves a lock
acquire/release...

(3) Remove a bunch of duplicated code in rtget.c; make rtnext() handle
both the "initial result" and "subsequent result" cases.

(4) Add support for index tuple killing

(5) Remove rtscancache(): it is dead code, for the same reason that
gistscancache() is dead code (an index scan ought not be invoked with
NoMovementScanDirection).

The end result is about a 10% improvement in rtree index scan perf,
according to contrib/rtree_gist/bench.
2005-01-18 23:25:55 +00:00
Tom Lane 0ce4d56924 Phase 1 of fix for 'SMgrRelation hashtable corrupted' problem. This
is the minimum required fix.  I want to look next at taking advantage of
it by simplifying the message semantics in the shared inval message queue,
but that part can be held over for 8.1 if it turns out too ugly.
2005-01-10 20:02:24 +00:00
Bruce Momjian 2daed8c5b3 Update copyrights that were missed. 2005-01-01 05:43:09 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Tom Lane bfa5f30481 Awhile back I added some code to StartupCLOG() to forcibly zero out
the remainder of the current clog page during system startup.  While
this was a good idea, it turns out the code fails if nextXid is
exactly at a page boundary, because we won't have created the "current"
clog page yet in that case.  Since the page will be correctly zeroed
when we execute the first transaction on it, the solution is just to
do nothing when exactly at a page boundary.  Per trouble report from
Dave Hartwig.
2004-12-22 18:45:49 +00:00
Tom Lane ff5a354ece Fix is-it-time-for-a-checkpoint logic so that checkpoint_segments can
usefully be larger than 255.  Per gripe from Simon Riggs.
2004-12-17 00:10:36 +00:00
Tom Lane c3d6c7d8f9 Calculation of keys_are_unique flag was wrong for cases involving
redundant cross-datatype comparisons.  Per example from Merlin Moncure.
2004-12-15 19:16:39 +00:00
Tom Lane 5374d097de Change planner to use the current true disk file size as its estimate of
a relation's number of blocks, rather than the possibly-obsolete value
in pg_class.relpages.  Scale the value in pg_class.reltuples correspondingly
to arrive at a hopefully more accurate number of rows.  When pg_class
contains 0/0, estimate a tuple width from the column datatypes and divide
that into current file size to estimate number of rows.  This improved
methodology allows us to jettison the ancient hacks that put bogus default
values into pg_class when a table is first created.  Also, per a suggestion
from Simon, make VACUUM (but not VACUUM FULL or ANALYZE) adjust the value
it puts into pg_class.reltuples to try to represent the mean tuple density
instead of the minimal density that actually prevails just after VACUUM.
These changes alter the plans selected for certain regression tests, so
update the expected files accordingly.  (I removed join_1.out because
it's not clear if it still applies; we can add back any variant versions
as they are shown to be needed.)
2004-12-01 19:00:56 +00:00
Tom Lane 37d693033d Minor adjustment of message style. 2004-11-17 16:26:59 +00:00
Neil Conway 5d1dd2bc55 Micro-optimization of markpos() and restrpos() in btree and hash indexes.
Rather than using ReadBuffer() to increment the reference count on an
already-pinned buffer, we should use IncrBufferRefCount() as it is
faster and does not require acquiring the BufMgrLock.
2004-11-17 03:13:38 +00:00
Neil Conway b25d23e1e6 Don't allow pg_start_backup() to be invoked if archive_command has not
been defined. Patch from Gavin Sherry, editorializing by Neil Conway.
2004-11-17 02:22:54 +00:00
Neil Conway a236dd9536 There is no need for ReadBuffer() call sites to check that the returned
buffer is valid, as ReadBuffer() will elog on error. Most of the call
sites of ReadBuffer() got this right, but this patch fixes those call
sites that did not.
2004-11-14 02:04:14 +00:00
Neil Conway 4d0f669f3c Remove obsolete comment from btbuild() and hashbuild(): we no longer use
a global variable to control building indexes.
2004-11-11 00:32:50 +00:00
Peter Eisentraut 0ed3c7665e Small message clarifications 2004-11-05 17:11:34 +00:00
Tom Lane 88868d4fbc Change COMMIT back to the old behavior of emitting command tag COMMIT,
not ROLLBACK, for the case of COMMIT outside a transaction block.
Alvaro Herrera
2004-10-30 20:44:43 +00:00
Tom Lane 23f264d125 Rearrange order of pre-commit operations: must close cursors before doing
ON COMMIT actions.  Per bug report from Michael Guerin.
2004-10-29 22:19:53 +00:00
Tom Lane ee69be44d5 Add DEBUG1-level logging of checkpoint start and end. Also, reduce the
'recycled log files' and 'removed log files' messages from DEBUG1 to
DEBUG2, replacing them with a count of files added/removed/recycled in
the checkpoint end message, as per suggestion from Simon Riggs.
2004-10-29 00:16:08 +00:00
Tom Lane 83cd2d8b0f Make heap_fetch API more consistent by having the buffer remain pinned
in all cases when keep_buf = true.  This allows ANALYZE's inner loop to
use heap_release_fetch, which saves multiple buffer lookups for the same
page and avoids overestimation of cost by the vacuum cost mechanism.
2004-10-26 16:05:03 +00:00
Tom Lane fb22b32095 Allow functions returning void or cstring to appear in FROM clause,
to make life cushy for the JDBC driver.  Centralize the decision-making
that affects this by inventing a get_type_func_class() function, rather
than adding special cases in half a dozen places.
2004-10-20 16:04:50 +00:00
Tom Lane fdd13f1568 Give the ResourceOwner mechanism full responsibility for releasing buffer
pins at end of transaction, and reduce AtEOXact_Buffers to an Assert
cross-check that this was done correctly.  When not USE_ASSERT_CHECKING,
AtEOXact_Buffers is a complete no-op.  This gets rid of an O(NBuffers)
bottleneck during transaction commit/abort, which recent testing has shown
becomes significant above a few tens of thousands of shared buffers.
2004-10-16 18:57:26 +00:00