macros patch :-(. Results from both baiji and mastodon imply that MSVC
fails to perceive offsetof(PageHeaderData, pd_linp[0]) as a constant
expression in some contexts where offsetof(PageHeaderData, pd_linp) works
fine. Sloth, thy name is Micro.
thereby forestalling any problems with alignment of the data structure placed
there. Since SizeOfPageHeaderData is maxalign'd anyway in 8.3 and HEAD, this
does not actually change anything right now, but it is foreseeable that the
header size will change again someday. I had to fix a couple of places that
were assuming that the content offset is just SizeOfPageHeaderData rather than
MAXALIGN(SizeOfPageHeaderData). Per discussion of Zdenek's page-macros patch.
more logical that way, and also it reduces the amount of unnecessary includes
in bufpage.h, which is widely used.
Zdenek Kotala.
My previous patch to bufpage.h should also have credited him as author, but I
forgot (sorry about that).
unnecessary #include lines in it. Also, move some tuple routine prototypes and
macros to htup.h, which allows removal of heapam.h inclusion from some .c
files.
For this to work, a new header file access/sysattr.h needed to be created,
initially containing attribute numbers of system columns, for pg_dump usage.
While at it, make contrib ltree, intarray and hstore header files more
consistent with our header style.
unpruned XMAX in its header. At the cost of 4 bytes per page, this keeps us
from performing heap_page_prune when there's no chance of pruning anything.
Seems to be necessary per Heikki's preliminary performance testing.
columns, and the new version can be stored on the same heap page, we no longer
generate extra index entries for the new version. Instead, index searches
follow the HOT-chain links to ensure they find the correct tuple version.
In addition, this patch introduces the ability to "prune" dead tuples on a
per-page basis, without having to do a complete VACUUM pass to recover space.
VACUUM is still needed to clean up dead index entries, however.
Pavan Deolasee, with help from a bunch of other people.
than two independent bits (one of which was never used in heap pages anyway,
or at least hadn't been in a very long time). This gives us flexibility to
add the HOT notions of redirected and dead item pointers without requiring
anything so klugy as magic values of lp_off and lp_len. The state values
are chosen so that for the states currently in use (pre-HOT) there is no
change in the physical representation.
this, add a 16-bit "flags" field to page headers by stealing some bits from
pd_tli. We use one flag bit as a hint to indicate whether there are any
unused line pointers; the remaining 15 are available for future use.
This is a cut-down form of an idea proposed by Hiroki Kataoka in July 2005.
At the time it was rejected because the original patch increased the size of
page headers and it wasn't clear that the benefit outweighed the distributed
cost. The flag-bit approach gets most of the benefit without requiring an
increase in the page header size.
Heikki Linnakangas and Tom Lane
I refactored findsplitloc and checksplitloc so that the division of
labor is more clear IMO. I pushed all the space calculation inside the
loop to checksplitloc.
I also fixed the off by 4 in free space calculation caused by
PageGetFreeSpace subtracting sizeof(ItemIdData), even though it was
harmless, because it was distracting and I felt it might come back to
bite us in the future if we change the page layout or alignments.
There's now a new function PageGetExactFreeSpace that doesn't do the
subtraction.
findsplitloc now tries the "just the new item to right page" split as
well. If people don't like the refactoring, I can write a patch to just
add that.
Heikki Linnakangas
keeping private state in each backend that has inserted and deleted the same
tuple during its current top-level transaction. This is sufficient since
there is no need to be able to determine the cmin/cmax from any other
transaction. This gets us back down to 23-byte headers, removing a penalty
paid in 8.0 to support subtransactions. Patch by Heikki Linnakangas, with
minor revisions by moi, following a design hashed out awhile back on the
pghackers list.
to eliminate unnecessary deadlocks. This commit adds SELECT ... FOR SHARE
paralleling SELECT ... FOR UPDATE. The implementation uses a new SLRU
data structure (managed much like pg_subtrans) to represent multiple-
transaction-ID sets. When more than one transaction is holding a shared
lock on a particular row, we create a MultiXactId representing that set
of transactions and store its ID in the row's XMAX. This scheme allows
an effectively unlimited number of row locks, just as we did before,
while not costing any extra overhead except when a shared lock actually
has to be shared. Still TODO: use the regular lock manager to control
the grant order when multiple backends are waiting for a row lock.
Alvaro Herrera and Tom Lane.
PageIndexTupleDelete() with a single pass of compactification ---
logic mostly lifted from PageRepairFragmentation. I noticed while
profiling that a VACUUM that's cleaning up a whole lot of deleted
tuples would spend as much as a third of its CPU time in
PageIndexTupleDelete; not too surprising considering the loop method
was roughly O(N^2) in the number of tuples involved.
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
recovery more manageable. Also, undo recent change to add FILE_HEADER
and WASTED_SPACE records to XLOG; instead make the XLOG page header
variable-size with extra fields in the first page of an XLOG file.
This should fix the boundary-case bugs observed by Mark Kirkwood.
initdb forced due to change of XLOG representation.
performance front, but with feature freeze upon us I think it's time to
drive a stake in the ground and say that this will be in 7.5.
Alvaro Herrera, with some help from Tom Lane.
page when it's read in, per pghackers discussion around 17-Feb. Add a
GUC variable zero_damaged_pages that causes the response to be a WARNING
followed by zeroing the page, rather than the normal ERROR; this is per
Hiroshi's suggestion that there needs to be a way to get at the data
in the rest of the table.
(overlaying low byte of page size) and add HEAP_HASOID bit to t_infomask,
per earlier discussion. Simplify scheme for overlaying fields in tuple
header (no need for cmax to live in more than one place). Don't try to
clear infomask status bits in tqual.c --- not safe to do it there. Don't
try to force output table of a SELECT INTO to have OIDs, either. Get rid
of unnecessarily complex three-state scheme for TupleDesc.tdhasoids, which
has already caused one recent failure. Improve documentation.
>usefulness.
>
>> [...] should I post a patch that puts pagesize directly into
>> PageHeaderData?
>
>If you're so inclined. Given that pd_opaque is hidden in those macros,
>there wouldn't be much of any gain in readability either, so I haven't
>worried about changing the declaration.
Thanks for the clarification. Here is the patch. Not much gain, but at
least it saves the next junior hacker from scratching his head ...
Manfred Koizar
all places, where pd_linp is accessed. Also introduce new macros
SizeOfPageHeaderData and BTMaxItemSize. This is just source code
cosmetic, no behaviour changed.
Manfred Koizar
to prevent spreading of corruption when page header pointers are bad.
Merge PageZero into PageInit, since it was never used separately, and
remove separate memset calls used at most other PageInit call points.
Remove IndexPageCleanup, which wasn't used at all.
buffer manager with 'pg_clog', a specialized access method modeled
on pg_xlog. This simplifies startup (don't need to play games to
open pg_log; among other things, OverrideTransactionSystem goes away),
should improve performance a little, and opens the door to recycling
commit log space by removing no-longer-needed segments of the commit
log. Actual recycling is not there yet, but I felt I should commit
this part separately since it'd still be useful if we chose not to
do transaction ID wraparound.