Commit Graph

226 Commits

Author SHA1 Message Date
Tom Lane
f905d65ee3 Rewrite of planner statistics-gathering code. ANALYZE is now available as
a separate statement (though it can still be invoked as part of VACUUM, too).
pg_statistic redesigned to be more flexible about what statistics are
stored.  ANALYZE now collects a list of several of the most common values,
not just one, plus a histogram (not just the min and max values).  Random
sampling is used to make the process reasonably fast even on very large
tables.  The number of values and histogram bins collected is now
user-settable via an ALTER TABLE command.

There is more still to do; the new stats are not being used everywhere
they could be in the planner.  But the remaining changes for this project
should be localized, and the behavior is already better than before.

A not-very-related change is that sorting now makes use of btree comparison
routines if it can find one, rather than invoking '<' twice.
2001-05-07 00:43:27 +00:00
Tom Lane
571dbe4606 Improve comments for xlog item size #defines. 2001-03-25 22:40:58 +00:00
Bruce Momjian
0686d49da0 Remove dashes in comments that don't need them, rewrap with pgindent. 2001-03-22 06:16:21 +00:00
Bruce Momjian
9e1552607a pgindent run. Make it all clean. 2001-03-22 04:01:46 +00:00
Tom Lane
af6e88a9cf Remove NEXTXID xlog record type to avoid three-way deadlock risk.
NEXTXID isn't really necessary, per previous discussion in pghackers,
but I mulishy insisted we should put it in anyway.  Mea culpa.
2001-03-18 20:18:59 +00:00
Tom Lane
9d645fd84c Support syncing WAL log to disk using either fsync(), fdatasync(),
O_SYNC, or O_DSYNC (as available on a given platform).  Add GUC parameter
to control sync method.
Also, add defense to XLogWrite to prevent it from going nuts if passed
a target write position that's past the end of the buffers so far filled
by XLogInsert.
2001-03-16 05:44:33 +00:00
Tom Lane
1b87e24c4a Change xlog page-header format to include StartUpID. Use the SUI to
detect case that next page in log came from an older run than the prior
page.  This avoids the necessity to re-zero the log after recovery from
a crash, which is good because we need not risk destroying valuable log
information.
This forces another initdb since yesterday :-(.  Need to get that log
reset utility done...
2001-03-13 20:32:37 +00:00
Tom Lane
4d14fe0048 XLOG (and related) changes:
* Store two past checkpoint locations, not just one, in pg_control.
  On startup, we fall back to the older checkpoint if the newer one
  is unreadable.  Also, a physical copy of the newest checkpoint record
  is kept in pg_control for possible use in disaster recovery (ie,
  complete loss of pg_xlog).  Also add a version number for pg_control
  itself.  Remove archdir from pg_control; it ought to be a GUC
  parameter, not a special case (not that it's implemented yet anyway).

* Suppress successive checkpoint records when nothing has been entered
  in the WAL log since the last one.  This is not so much to avoid I/O
  as to make it actually useful to keep track of the last two
  checkpoints.  If the things are right next to each other then there's
  not a lot of redundancy gained...

* Change CRC scheme to a true 64-bit CRC, not a pair of 32-bit CRCs
  on alternate bytes.  Polynomial borrowed from ECMA DLT1 standard.

* Fix XLOG record length handling so that it will work at BLCKSZ = 32k.

* Change XID allocation to work more like OID allocation.  (This is of
  dubious necessity, but I think it's a good idea anyway.)

* Fix a number of minor bugs, such as off-by-one logic for XLOG file
  wraparound at the 4 gig mark.

* Add documentation and clean up some coding infelicities; move file
  format declarations out to include files where planned contrib
  utilities can get at them.

* Checkpoint will now occur every CHECKPOINT_SEGMENTS log segments or
  every CHECKPOINT_TIMEOUT seconds, whichever comes first.  It is also
  possible to force a checkpoint by sending SIGUSR1 to the postmaster
  (undocumented feature...)

* Defend against kill -9 postmaster by storing shmem block's key and ID
  in postmaster.pid lockfile, and checking at startup to ensure that no
  processes are still connected to old shmem block (if it still exists).

* Switch backends to accept SIGQUIT rather than SIGUSR1 for emergency
  stop, for symmetry with postmaster and xlog utilities.  Clean up signal
  handling in bootstrap.c so that xlog utilities launched by postmaster
  will react to signals better.

* Standalone bootstrap now grabs lockfile in target directory, as added
  insurance against running it in parallel with live postmaster.
2001-03-13 01:17:06 +00:00
Tom Lane
9c9936587c Implement COMMIT_SIBLINGS parameter to allow pre-commit delay to occur
only if at least N other backends currently have open transactions.  This
is not a great deal of intelligence about whether a delay might be
profitable ... but it beats no intelligence at all.  Note that the default
COMMIT_DELAY is still zero --- this new code does nothing unless that
setting is changed.
Also, mark ENABLEFSYNC as a system-wide setting.  It's no longer safe to
allow that to be set per-backend, since we may be relying on some other
backend's fsync to have synced the WAL log.
2001-02-26 00:50:08 +00:00
Bruce Momjian
82fc51e0b3 More comment improvements. 2001-02-22 23:02:33 +00:00
Bruce Momjian
4f6c49fef0 Clean up index/btree comments/macros, as approved. 2001-02-22 21:48:49 +00:00
Bruce Momjian
15903a1ed4 Comment improvements. 2001-02-21 19:07:04 +00:00
Tom Lane
059e361481 Although we can't support out-of-line TOAST storage in indexes (yet),
compressed storage works perfectly well.  Might as well have a coherent
strategy for applying it, rather than the haphazard store-what-you-get
approach that was in the code before.  The strategy I've set up here is
to attempt compression of any compressible index value exceeding
BLCKSZ/16, or about 500 bytes by default.
2001-02-15 20:57:01 +00:00
Vadim B. Mikheev
66decbfb08 Macro for btree runtime fix. 2001-02-07 23:34:18 +00:00
Bruce Momjian
623bf843d2 Change Copyright from PostgreSQL, Inc to PostgreSQL Global Development Group. 2001-01-24 19:43:33 +00:00
Tom Lane
786f1a59cd Fix all the places that called heap_update() and heap_delete() without
bothering to check the return value --- which meant that in case the
update or delete failed because of a concurrent update, you'd not find
out about it, except by observing later that the transaction produced
the wrong outcome.  There are now subroutines simple_heap_update and
simple_heap_delete that should be used anyplace that you're not prepared
to do the full nine yards of coping with concurrent updates.  In
practice, that seems to mean absolutely everywhere but the executor,
because *noplace* else was checking.
2001-01-23 04:32:23 +00:00
Tom Lane
36839c1927 Restructure backend SIGINT/SIGTERM handling so that 'die' interrupts
are treated more like 'cancel' interrupts: the signal handler sets a
flag that is examined at well-defined spots, rather than trying to cope
with an interrupt that might happen anywhere.  See pghackers discussion
of 1/12/01.
2001-01-14 05:08:17 +00:00
Tom Lane
6162432de9 Add more critical-section calls: all code sections that hold spinlocks
are now critical sections, so as to ensure die() won't interrupt us while
we are munging shared-memory data structures.  Avoid insecure intermediate
states in some code that proc_exit will call, like palloc/pfree.  Rename
START/END_CRIT_CODE to START/END_CRIT_SECTION, since that seems to be
what people tend to call them anyway, and make them be called with () like
a function call, in hopes of not confusing pg_indent.
I doubt that this is sufficient to make SIGTERM safe anywhere; there's
just too much code that could get invoked during proc_exit().
2001-01-12 21:54:01 +00:00
Vadim B. Mikheev
3e059b3802 1. WAL needs in zero-ed content of newly initialized page.
2. Log record for PageRepaireFragmentation now keeps array
   of !LP_USED offnums to redo cleanup properly.
2000-12-30 15:19:57 +00:00
Vadim B. Mikheev
7ceeeb662f New WAL version - CRC and data blocks backup. 2000-12-28 13:00:29 +00:00
Tom Lane
8609d4abf2 Fix portability problems recently exposed by regression tests on Alphas.
1. Distinguish cases where a Datum representing a tuple datatype is an OID
from cases where it is a pointer to TupleTableSlot, and make sure we use
the right typlen in each case.
2. Make fetchatt() and related code support 8-byte by-value datatypes on
machines where Datum is 8 bytes.  Centralize knowledge of the available
by-value datatype sizes in two macros in tupmacs.h, so that this will be
easier if we ever have to do it again.
2000-12-27 23:59:14 +00:00
Tom Lane
a626b78c89 Clean up backend-exit-time cleanup behavior. Use on_shmem_exit callbacks
to ensure that we have released buffer refcounts and so forth, rather than
putting ad-hoc operations before (some of the calls to) proc_exit.  Add
commentary to discourage future hackers from repeating that mistake.
2000-12-18 00:44:50 +00:00
Vadim B. Mikheev
65b362fae1 Disable elog(ERROR|FATAL) in signal handlers in
critical sections of code.
2000-12-03 10:27:29 +00:00
Tom Lane
217d1566bf Make tuple receive/print routines TOAST-aware. Formerly, printtup would
leak memory when printing a toasted attribute, and printtup_internal
didn't work at all...
2000-12-01 22:10:31 +00:00
Tom Lane
1f5cc8c78a Remove VARLENA_FIXED_SIZE hack, which is irreversibly broken now that
both MULTIBYTE and TOAST prevent char(n) from being truly fixed-size.
Simplify and speed up fastgetattr() and index_getattr() macros by
eliminating special cases for attnum=1.  It's just as fast to handle
the first attribute by presetting its attcacheoff to zero; so do that
instead when loading the tupledesc in relcache.c.
2000-11-30 18:38:47 +00:00
Vadim B. Mikheev
81c8c244b2 No more #ifdef XLOG. 2000-11-30 08:46:26 +00:00
Vadim B. Mikheev
741510521c XLOG stuff for sequences.
CommitDelay in guc.c
2000-11-30 01:47:33 +00:00
Tom Lane
bbea3643a3 Store current LC_COLLATE and LC_CTYPE settings in pg_control during initdb;
re-adopt these settings at every postmaster or standalone-backend startup.
This should fix problems with indexes becoming corrupt due to failure to
provide consistent locale environment for postmaster at all times.  Also,
refuse to start up a non-locale-enabled compilation in a database originally
initdb'd with a non-C locale.  Suppress LIKE index optimization if locale
is not "C" or "POSIX" (are there any other locales where it's safe?).
Issue NOTICE during initdb if selected locale disables LIKE optimization.
2000-11-25 20:33:54 +00:00
Peter Eisentraut
a70e74b060 Put external declarations into header files. 2000-11-21 21:16:06 +00:00
Tom Lane
21e1e6643c Minor cleanup of tableOid-related coding. 2000-11-14 21:04:32 +00:00
Tom Lane
b0d243e420 Clean up comments. 2000-11-14 20:47:34 +00:00
Vadim B. Mikheev
f0e37a8531 New CHECKPOINT command.
Auto removing of offline log files and creating new file
at checkpoint time.
2000-11-05 22:50:21 +00:00
Vadim B. Mikheev
b98ba2a04c pg_variable is not used in WAL version now. 2000-11-03 11:39:36 +00:00
Tom Lane
7a64100164 Fix insufficiently-parenthesized macro definitions.
No known bug here, but...
2000-11-02 23:11:03 +00:00
Vadim B. Mikheev
5b0740d3fc WAL 2000-10-28 16:21:00 +00:00
Vadim B. Mikheev
157ff4e108 WAL utils defs 2000-10-25 00:49:14 +00:00
Tom Lane
7ab8384543 Create empty file so that CVS sources compile (Vadim can fill in real
definition later...)
2000-10-24 18:05:14 +00:00
Vadim B. Mikheev
db2faa943a WAL misc 2000-10-24 09:56:23 +00:00
Vadim B. Mikheev
a7fcadd10a WAL 2000-10-21 15:43:36 +00:00
Vadim B. Mikheev
b58c0411ba redo/undo support functions and cleanups. 2000-10-20 11:01:21 +00:00
Vadim B. Mikheev
deee783052 WAL 2000-10-13 12:05:22 +00:00
Vadim B. Mikheev
25a26a7ab8 WAL 2000-10-13 02:03:02 +00:00
Vadim B. Mikheev
5800c6b9aa Btree WAL logging. 2000-10-04 00:04:43 +00:00
Vadim B. Mikheev
8e95321476 Btree WAL records. 2000-09-12 06:07:52 +00:00
Vadim B. Mikheev
f2bfe8a24c Heap redo/undo (except for tuple moving used by vacuum). 2000-09-07 09:58:38 +00:00
Hiroshi Inoue
b0d5036c7c CREATE btree INDEX takes dead tuples into account when old transactions
are running.
2000-08-10 02:33:20 +00:00
Tom Lane
0224177400 TOAST mop-up work: update comments for tuple-size-related symbols such
as MaxHeapAttributeNumber.  Increase MaxAttrSize to something more
reasonable (given what it's used for, namely checking char(n) declarations,
I didn't make it the full 1G that it could theoretically be --- 10Mb
seemed a more reasonable number).  Improve calculation of MaxTupleSize.
2000-08-07 20:16:13 +00:00
Tom Lane
dd8ad64118 Fix tuptoaster bugs induced by making bytea toastable. Durn thing was
trying to toast tuples inserted into toast tables!  Fix is two-pronged:
first, ensure all columns of a toast table are marked attstorage='p',
and second, alter the target chunk size so that it's less than the
threshold for trying to toast a tuple.  (Code tried to do that but the
expression was wrong.)  A few cosmetic cleanups in tuptoaster too.
NOTE: initdb forced due to change in toaster chunk-size.
2000-08-04 04:16:17 +00:00
Tom Lane
61aca818c4 Modify heap_open()/heap_openr() API per pghackers discussion of 11 July.
These two routines will now ALWAYS elog() on failure, whether you ask for
a lock or not.  If you really want to get a NULL return on failure, call
the new routines heap_open_nofail()/heap_openr_nofail().  By my count there
are only about three places that actually want that behavior.  There were
rather more than three places that were missing the check they needed to
make under the old convention :-(.
2000-08-03 19:19:38 +00:00
Tom Lane
ad7b47aa02 Fix sloppy macro coding (not enough parentheses). 2000-07-28 01:04:40 +00:00