was created on a machine with alignment rules and floating-point format
similar to the current machine. Per recent discussion, this seems like
a good idea with the increasing prevalence of 32/64 bit environments.
transaction IDs, rather than like subtrans; in particular, the information
now survives a database restart. Per previous discussion, this is
essential for PITR log shipping and for 2PC.
Instead of a separate CRC on each backup block, include backup blocks
in their parent WAL record's CRC; this is important to ensure that the
backup block really goes with the WAL record, ie there was not a page
tear right at the start of the backup block. Implement a simple form
of compression of backup blocks: drop any run of zeroes starting at
pd_lower, so as not to store the unused 'hole' that commonly exists in
PG heap and index pages. Tweak PageRepairFragmentation and related
routines to ensure they keep the unused space zeroed, so that the above
compression method remains effective. All per recent discussions.
to eliminate unnecessary deadlocks. This commit adds SELECT ... FOR SHARE
paralleling SELECT ... FOR UPDATE. The implementation uses a new SLRU
data structure (managed much like pg_subtrans) to represent multiple-
transaction-ID sets. When more than one transaction is holding a shared
lock on a particular row, we create a MultiXactId representing that set
of transactions and store its ID in the row's XMAX. This scheme allows
an effectively unlimited number of row locks, just as we did before,
while not costing any extra overhead except when a shared lock actually
has to be shared. Still TODO: use the regular lock manager to control
the grant order when multiple backends are waiting for a row lock.
Alvaro Herrera and Tom Lane.
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
recovery more manageable. Also, undo recent change to add FILE_HEADER
and WASTED_SPACE records to XLOG; instead make the XLOG page header
variable-size with extra fields in the first page of an XLOG file.
This should fix the boundary-case bugs observed by Mark Kirkwood.
initdb forced due to change of XLOG representation.
place of time_t, as per prior discussion. The behavior does not change
on machines without a 64-bit-int type, but on machines with one, which
is most, we are rid of the bizarre boundary behavior at the edges of
the 32-bit-time_t range (1901 and 2038). The system will now treat
times over the full supported timestamp range as being in your local
time zone. It may seem a little bizarre to consider that times in
4000 BC are PST or EST, but this is surely at least as reasonable as
propagating Gregorian calendar rules back that far.
I did not modify the format of the zic timezone database files, which
means that for the moment the system will not know about daylight-savings
periods outside the range 1901-2038. Given the way the files are set up,
it's not a simple decision like 'widen to 64 bits'; we have to actually
think about the range of years that need to be supported. We should
probably inquire what the plans of the upstream zic people are before
making any decisions of our own.
and should do now that we control our own destiny for timezone handling,
but this commit gets the bulk of the picayune diffs in place.
Magnus Hagander and Tom Lane.
wit: Add a header record to each WAL segment file so that it can be reliably
identified. Avoid splitting WAL records across segment files (this is not
strictly necessary, but makes it simpler to incorporate the header records).
Make WAL entries for file creation, deletion, and truncation (as foreseen but
never implemented by Vadim). Also, add support for making XLOG_SEG_SIZE
configurable at compile time, similarly to BLCKSZ. Fix a couple bugs I
introduced in WAL replay during recent smgr API changes. initdb is forced
due to changes in pg_control contents.
Use "--enable-integer-datetimes" in configuration to use this rather
than the original float8 storage. I would recommend the integer-based
storage for any platform on which it is available. We perhaps should
make this the default for the production release.
Change timezone(timestamptz) results to return timestamp rather than
a character string. Formerly, we didn't have a way to represent
timestamps with an explicit time zone other than freezing the info into
a string. Now, we can reasonably omit the explicit time zone from the
result and return a timestamp with values appropriate for the specified
time zone. Much cleaner, and if you need the time zone in the result
you can put it into a character string pretty easily anyway.
Allow fractional seconds in date/time types even for dates prior to 1BC.
Limit timestamp data types to 6 decimal places of precision. Just right
for a micro-second storage of int8 date/time types, and reduces the
number of places ad-hoc rounding was occuring for the float8-based types.
Use lookup tables for precision/rounding calculations for timestamp and
interval types. Formerly used pow() to calculate the desired value but
with a more limited range there is no reason to not type in a lookup
table. Should be *much* better performance, though formerly there were
some optimizations to help minimize the number of times pow() was called.
Define a HAVE_INT64_TIMESTAMP variable. Based on the configure option
"--enable-integer-datetimes" and the existing internal INT64_IS_BUSTED.
Add explicit date/interval operators and functions for addition and
subtraction. Formerly relied on implicit type promotion from date to
timestamp with time zone.
Change timezone conversion functions for the timetz type from "timetz()"
to "timezone()". This is consistant with other time zone coersion
functions for other types.
Bump the catalog version to 200204201.
Fix up regression tests to reflect changes in fractional seconds
representation for date/times in BC eras.
All regression tests pass on my Linux box.
* Store two past checkpoint locations, not just one, in pg_control.
On startup, we fall back to the older checkpoint if the newer one
is unreadable. Also, a physical copy of the newest checkpoint record
is kept in pg_control for possible use in disaster recovery (ie,
complete loss of pg_xlog). Also add a version number for pg_control
itself. Remove archdir from pg_control; it ought to be a GUC
parameter, not a special case (not that it's implemented yet anyway).
* Suppress successive checkpoint records when nothing has been entered
in the WAL log since the last one. This is not so much to avoid I/O
as to make it actually useful to keep track of the last two
checkpoints. If the things are right next to each other then there's
not a lot of redundancy gained...
* Change CRC scheme to a true 64-bit CRC, not a pair of 32-bit CRCs
on alternate bytes. Polynomial borrowed from ECMA DLT1 standard.
* Fix XLOG record length handling so that it will work at BLCKSZ = 32k.
* Change XID allocation to work more like OID allocation. (This is of
dubious necessity, but I think it's a good idea anyway.)
* Fix a number of minor bugs, such as off-by-one logic for XLOG file
wraparound at the 4 gig mark.
* Add documentation and clean up some coding infelicities; move file
format declarations out to include files where planned contrib
utilities can get at them.
* Checkpoint will now occur every CHECKPOINT_SEGMENTS log segments or
every CHECKPOINT_TIMEOUT seconds, whichever comes first. It is also
possible to force a checkpoint by sending SIGUSR1 to the postmaster
(undocumented feature...)
* Defend against kill -9 postmaster by storing shmem block's key and ID
in postmaster.pid lockfile, and checking at startup to ensure that no
processes are still connected to old shmem block (if it still exists).
* Switch backends to accept SIGQUIT rather than SIGUSR1 for emergency
stop, for symmetry with postmaster and xlog utilities. Clean up signal
handling in bootstrap.c so that xlog utilities launched by postmaster
will react to signals better.
* Standalone bootstrap now grabs lockfile in target directory, as added
insurance against running it in parallel with live postmaster.