Commit Graph

235 Commits

Author SHA1 Message Date
Tom Lane bf3dbb5881 First steps towards index scans with heap access decoupled from index
access: define new index access method functions 'amgetmulti' that can
fetch multiple TIDs per call.  (The functions exist but are totally
untested as yet.)  Since I was modifying pg_am anyway, remove the
no-longer-needed 'rel' parameter from amcostestimate functions, and
also remove the vestigial amowner column that was creating useless
work for Alvaro's shared-object-dependencies project.
Initdb forced due to changes in pg_am.
2005-03-27 23:53:05 +00:00
Tom Lane ee4ddac137 Convert index-related tuple handling routines from char 'n'/' ' to bool
convention for isnull flags.  Also, remove the useless InsertIndexResult
return struct from index AM aminsert calls --- there is no reason for
the caller to know where in the index the tuple was inserted, and we
were wasting a palloc cycle per insert to deliver this uninteresting
value (plus nontrivial complexity in some AMs).
I forced initdb because of the change in the signature of the aminsert
routines, even though nothing really looks at those pg_proc entries...
2005-03-21 01:24:04 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Bruce Momjian b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane fe548629c5 Invent ResourceOwner mechanism as per my recent proposal, and use it to
keep track of portal-related resources separately from transaction-related
resources.  This allows cursors to work in a somewhat sane fashion with
nested transactions.  For now, cursor behavior is non-subtransactional,
that is a cursor's state does not roll back if you abort a subtransaction
that fetched from the cursor.  We might want to change that later.
2004-07-17 03:32:14 +00:00
Tom Lane 94d4d240bb Rename XLOG_BTREE_NEWPAGE xlog record type into XLOG_HEAP_NEWPAGE, and
shift support code into heapam.c accordingly.  This is in service of
soon-to-be-committed ALTER TABLE SET TABLESPACE code that will want to
use this same record type for both heaps and indexes.

Theoretically I should have forced initdb for this, but in practice there
is no change in xlog contents because CVS tip will never really emit this
record type anyhow...
2004-07-11 18:01:45 +00:00
Tom Lane 2095206de1 Adjust btree index build to not use shared buffers, thereby avoiding the
locking conflict against concurrent CHECKPOINT that was discussed a few
weeks ago.  Also, if not using WAL archiving (which is always true ATM
but won't be if PITR makes it into this release), there's no need to
WAL-log the index build process; it's sufficient to force-fsync the
completed index before commit.  This seems to gain about a factor of 2
in my tests, which is consistent with writing half as much data.  I did
not try it with WAL on a separate drive though --- probably the gain would
be a lot less in that scenario.
2004-06-02 17:28:18 +00:00
Tom Lane 37fa3b6c89 Tweak indexscan and seqscan code to arrange that steps from one page to
the next are handled by ReleaseAndReadBuffer rather than separate
ReleaseBuffer and ReadBuffer calls.  This cuts the number of acquisitions
of the BufMgrLock by a factor of 2 (possibly more, if an indexscan happens
to pull successive rows from the same heap page).  Unfortunately this
doesn't seem enough to get us out of the recently discussed context-switch
storm problem, but it's surely worth doing anyway.
2004-04-21 18:24:26 +00:00
Tom Lane 391c3811a2 Rename SortMem and VacuumMem to work_mem and maintenance_work_mem.
Make btree index creation and initial validation of foreign-key constraints
use maintenance_work_mem rather than work_mem as their memory limit.
Add some code to guc.c to allow these variables to be referenced by their
old names in SHOW and SET commands, for backwards compatibility.
2004-02-03 17:34:04 +00:00
Tom Lane 569659ae16 Improve btree's initial-positioning-strategy code so that we never need
to step more than one entry after descending the search tree to arrive at
the correct place to start the scan.  This can improve the behavior
substantially when there are many entries equal to the chosen boundary
value.  Per suggestion from Dmitry Tkach, 14-Jul-03.
2003-12-21 01:23:06 +00:00
PostgreSQL Daemon 55b113257c make sure the $Id tags are converted to $PostgreSQL as well ... 2003-11-29 22:41:33 +00:00
Tom Lane fa5c8a055a Cross-data-type comparisons are now indexable by btrees, pursuant to my
pghackers proposal of 8-Nov.  All the existing cross-type comparison
operators (int2/int4/int8 and float4/float8) have appropriate support.
The original proposal of storing the right-hand-side datatype as part of
the primary key for pg_amop and pg_amproc got modified a bit in the event;
it is easier to store zero as the 'default' case and only store a nonzero
when the operator is actually cross-type.  Along the way, remove the
long-since-defunct bigbox_ops operator class.
2003-11-12 21:15:59 +00:00
Tom Lane c1d62bfd00 Add operator strategy and comparison-value datatype fields to ScanKey.
Remove the 'strategy map' code, which was a large amount of mechanism
that no longer had any use except reverse-mapping from procedure OID to
strategy number.  Passing the strategy number to the index AM in the
first place is simpler and faster.
This is a preliminary step in planned support for cross-datatype index
operations.  I'm committing it now since the ScanKeyEntryInitialize()
API change touches quite a lot of files, and I want to commit those
changes before the tree drifts under me.
2003-11-09 21:30:38 +00:00
Tom Lane e33f205a94 Adjust btree index build procedure so that the btree metapage looks
invalid (has the wrong magic number) until the build is entirely
complete.  This turns out to cost no additional writes in the normal
case, since we were rewriting the metapage at the end of the process
anyway.  In normal scenarios there's no real gain in security, because
a failed index build would roll back the transaction leaving an unused
index file, but for rebuilding shared system indexes this seems to add
some useful protection.
2003-09-29 23:40:26 +00:00
Bruce Momjian 46785776c4 Another pgindent run with updated typedefs. 2003-08-08 21:42:59 +00:00
Bruce Momjian f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane 3bbd6af37c Adjust btbulkdelete logic so that only one WAL record is issued while
deleting multiple index entries on a single index page.  This makes for
a very substantial reduction in the amount of WAL traffic during a
large delete operation.
2003-02-23 22:43:09 +00:00
Tom Lane 88dc31e3f2 First cut at recycling space in btree indexes. Still some rough edges
to fix, but it seems to basically work...
2003-02-23 06:17:13 +00:00
Tom Lane 799bc58dc7 More infrastructure for btree compaction project. Tree-traversal code
now knows what to do upon hitting a dead page (in theory anyway, it's
untested...).  Add a post-VACUUM-cleanup entry point for index AMs, to
provide a place for dead-page scavenging to happen.
Also, fix oversight that broke btpo_prev links in temporary indexes.
initdb forced due to additions in pg_am.
2003-02-22 00:45:05 +00:00
Tom Lane 70508ba7ae Make btree index structure adjustments and WAL logging changes needed to
support btree compaction, as per proposal of a few days ago.  btree index
pages no longer store parent links, instead they have a level indicator
(counting up from zero for leaf pages).  The FixBTree recovery logic is
removed, and replaced by code that detects missing parent-level insertions
during WAL replay.  Also, generate appropriate WAL entries when updating
btree metapage and when building a btree index from scratch.  I believe
btree indexes are now completely WAL-legal for the first time.
initdb forced due to index and WAL changes.
2003-02-21 00:06:22 +00:00
Bruce Momjian 33f1687879 There already was a macro PageGetItemId; this is now used in (almost)
all places, where pd_linp is accessed.  Also introduce new macros
SizeOfPageHeaderData and BTMaxItemSize. This is just source code
cosmetic, no behaviour changed.

Manfred Koizar
2002-07-02 05:48:44 +00:00
Bruce Momjian d84fe82230 Update copyright to 2002. 2002-06-20 20:29:54 +00:00
Tom Lane 3f4d488022 Mark index entries "killed" when they are no longer visible to any
transaction, so as to avoid returning them out of the index AM.  Saves
repeated heap_fetch operations on frequently-updated rows.  Also detect
queries on unique keys (equality to all columns of a unique index), and
don't bother continuing scan once we have found first match.

Killing is implemented in the btree and hash AMs, but not yet in rtree
or gist, because there isn't an equally convenient place to do it in
those AMs (the outer amgetnext routine can't do it without re-pinning
the index page).

Did some small cleanup on APIs of HeapTupleSatisfies, heap_fetch, and
index_insert to make this a little easier.
2002-05-24 18:57:57 +00:00
Tom Lane 44fbe20d62 Restructure indexscan API (index_beginscan, index_getnext) per
yesterday's proposal to pghackers.  Also remove unnecessary parameters
to heap_beginscan, heap_rescan.  I modified pg_proc.h to reflect the
new numbers of parameters for the AM interface routines, but did not
force an initdb because nothing actually looks at those fields.
2002-05-20 23:51:44 +00:00
Bruce Momjian ea08e6cd55 New pgindent run with fixes suggested by Tom. Patch manually reviewed,
initdb/regression tests pass.
2001-11-05 17:46:40 +00:00
Bruce Momjian 6783b2372e Another pgindent run. Fixes enum indenting, and improves #endif
spacing.  Also adds space for one-line comments.
2001-10-28 06:26:15 +00:00
Bruce Momjian b81844b173 pgindent run on all C files. Java run to follow. initdb/regression
tests pass.
2001-10-25 05:50:21 +00:00
Tom Lane c8076f09d2 Restructure index AM interface for index building and index tuple deletion,
per previous discussion on pghackers.  Most of the duplicate code in
different AMs' ambuild routines has been moved out to a common routine
in index.c; this means that all index types now do the right things about
inserting recently-dead tuples, etc.  (I also removed support for EXTEND
INDEX in the ambuild routines, since that's about to go away anyway, and
it cluttered the code a lot.)  The retail indextuple deletion routines have
been replaced by a "bulk delete" routine in which the indexscan is inside
the access method.  I haven't pushed this change as far as it should go yet,
but it should allow considerable simplification of the internal bookkeeping
for deletions.  Also, add flag columns to pg_am to eliminate various
hardcoded tests on AM OIDs, and remove unused pg_am columns.

Fix rtree and gist index types to not attempt to store NULLs; before this,
gist usually crashed, while rtree managed not to crash but computed wacko
bounding boxes for NULL entries (which might have had something to do with
the performance problems we've heard about occasionally).

Add AtEOXact routines to hash, rtree, and gist, all of which have static
state that needs to be reset after an error.  We discovered this need long
ago for btree, but missed the other guys.

Oh, one more thing: concurrent VACUUM is now the default.
2001-07-15 22:48:19 +00:00
Bruce Momjian 9e1552607a pgindent run. Make it all clean. 2001-03-22 04:01:46 +00:00
Bruce Momjian 82fc51e0b3 More comment improvements. 2001-02-22 23:02:33 +00:00
Bruce Momjian 4f6c49fef0 Clean up index/btree comments/macros, as approved. 2001-02-22 21:48:49 +00:00
Bruce Momjian 15903a1ed4 Comment improvements. 2001-02-21 19:07:04 +00:00
Vadim B. Mikheev 66decbfb08 Macro for btree runtime fix. 2001-02-07 23:34:18 +00:00
Bruce Momjian 623bf843d2 Change Copyright from PostgreSQL, Inc to PostgreSQL Global Development Group. 2001-01-24 19:43:33 +00:00
Vadim B. Mikheev 7ceeeb662f New WAL version - CRC and data blocks backup. 2000-12-28 13:00:29 +00:00
Vadim B. Mikheev 81c8c244b2 No more #ifdef XLOG. 2000-11-30 08:46:26 +00:00
Peter Eisentraut a70e74b060 Put external declarations into header files. 2000-11-21 21:16:06 +00:00
Vadim B. Mikheev a7fcadd10a WAL 2000-10-21 15:43:36 +00:00
Vadim B. Mikheev deee783052 WAL 2000-10-13 12:05:22 +00:00
Vadim B. Mikheev 25a26a7ab8 WAL 2000-10-13 02:03:02 +00:00
Vadim B. Mikheev 5800c6b9aa Btree WAL logging. 2000-10-04 00:04:43 +00:00
Vadim B. Mikheev 8e95321476 Btree WAL records. 2000-09-12 06:07:52 +00:00
Hiroshi Inoue b0d5036c7c CREATE btree INDEX takes dead tuples into account when old transactions
are running.
2000-08-10 02:33:20 +00:00
Tom Lane 916b2321ad Clean up and document btree code for ordering keys. Neat stuff,
actually, but who could understand it with no comments?  Fix bug
while at it: _bt_orderkeys would try to invoke comparisons on
NULL inputs, given the right sort of redundant quals.
2000-07-25 04:47:59 +00:00
Tom Lane 9e85183bfc Major overhaul of btree index code. Eliminate special BTP_CHAIN logic for
duplicate keys by letting search go to the left rather than right when an
equal key is seen at an upper tree level.  Fix poor choice of page split
point (leading to insertion failures) that was forced by chaining logic.
Don't store leftmost key in non-leaf pages, since it's not necessary.
Don't create root page until something is first stored in the index, so an
unused index is now 8K not 16K.  (Doesn't seem to be as easy to get rid of
the metadata page, unfortunately.)  Massive cleanup of unreadable code,
fix poor, obsolete, and just plain wrong documentation and comments.
See src/backend/access/nbtree/README for the gory details.
2000-07-21 06:42:39 +00:00
Bruce Momjian df43800fc8 Clean up #include's. 2000-06-15 03:33:12 +00:00
Tom Lane f2d1205322 Another batch of fmgr updates. I think I have gotten all old-style
functions that take pass-by-value datatypes.  Should be ready for
port testing ...
2000-06-13 07:35:40 +00:00
Bruce Momjian 20ad43b576 Mark functions as static and ifdef NOT_USED as appropriate. 2000-06-08 22:38:00 +00:00
Bruce Momjian 52f77df613 Ye-old pgindent run. Same 4-space tabs. 2000-04-12 17:17:23 +00:00
Tom Lane 8cb624262a Replace inefficient _bt_invokestrat calls with direct calls to the
appropriate btree three-way comparison routine.  Not clear why the
three-way comparison routines were being used in some paths and not
others in btree --- incomplete changes by someone long ago, maybe?
Anyway, this makes for a nice speedup in CREATE INDEX.
2000-02-18 06:32:39 +00:00
Bruce Momjian 5c25d60244 Add:
* Portions Copyright (c) 1996-2000, PostgreSQL, Inc

to all files copyright Regents of Berkeley.  Man, that's a lot of files.
2000-01-26 05:58:53 +00:00
Tom Lane 26c48b5e8c Final stage of psort reconstruction work: replace psort.c with
a generalized module 'tuplesort.c' that can sort either HeapTuples or
IndexTuples, and is not tied to execution of a Sort node.  Clean up
memory leakages in sorting, and replace nbtsort.c's private implementation
of mergesorting with calls to tuplesort.c.
1999-10-17 22:15:09 +00:00
Tom Lane 4488b69b4c Fix nbtree's failure to clear BTScans list during xact abort.
Also, move responsibility for calling vc_abort into main xact.c list of
things-to-call-at-abort.  What in the world was it doing down inside of
TransactionIdAbort()?
1999-08-08 20:12:52 +00:00
Bruce Momjian 773088809d More cleanup 1999-07-16 17:07:40 +00:00
Bruce Momjian a9591ce66a Change #include's to use <> and "" as appropriate. 1999-07-15 23:04:24 +00:00
Bruce Momjian 4b2c2850bf Clean up #include in /include directory. Add scripts for checking includes. 1999-07-15 15:21:54 +00:00
Bruce Momjian 4eadfe8754 Make 0x007f -> (unsigned)0x7f to make pgindent happy. 1999-05-25 22:04:56 +00:00
Vadim B. Mikheev 519cd36d06 Get rid of page-level locking in btree-s.
BT_READ/BT_WRITE are BUFFER_LOCK_SHARE/BUFFER_LOCK_EXCLUSIVE now.
Also get rid of #define BT_VERSION_1 - we use version 1 as default
for near two years now.
1999-05-25 18:31:28 +00:00
Bruce Momjian 07842084fe pgindent run over code. 1999-05-25 16:15:34 +00:00
Vadim B. Mikheev fdf6be80f9 1. Vacuum is updated for MVCC.
2. Much faster btree tuples deletion in the case when first on page
   index tuple is deleted (no movement to the left page(s)).
3. Remember blkno of new root page in BTPageOpaque of
   left/right siblings when root page is splitted.
1999-03-28 20:32:42 +00:00
Bruce Momjian 6724a50787 Change my-function-name-- to my_function_name, and optimizer renames. 1999-02-13 23:22:53 +00:00
Bruce Momjian fa1a8d6a97 OK, folks, here is the pgindent output. 1998-09-01 04:40:42 +00:00
Vadim B. Mikheev f73fc6eb29 Fix scan adjustment. 1998-07-30 05:05:05 +00:00
Bruce Momjian a32450a585 pgindent run before 6.3 release, with Thomas' requested changes. 1998-02-26 04:46:47 +00:00
Bruce Momjian 7229513943 Fix prototypes so they don't look like function definitions. 1998-01-24 22:50:57 +00:00
Bruce Momjian 59f6a57e59 Used modified version of indent that understands over 100 typedefs. 1997-09-08 21:56:23 +00:00
Bruce Momjian 075cede748 Add typdefs to pgindent run. 1997-09-08 20:59:27 +00:00
Bruce Momjian 319dbfa736 Another PGINDENT run that changes variable indenting and case label indenting. Also static variable indenting. 1997-09-08 02:41:22 +00:00
Bruce Momjian 1ccd423235 Massive commit to run PGINDENT on all *.c and *.h files. 1997-09-07 05:04:48 +00:00
Bruce Momjian 1d8bbfd2e7 Make functions static where possible, enclose unused functions in #ifdef NOT_USED. 1997-08-19 21:40:56 +00:00
Vadim B. Mikheev 5cf55737a4 Added: new BTP_CHAIN flag (if hikey == firstkey then it's not
last page in chain of duplicates).
1997-05-30 18:40:02 +00:00
Vadim B. Mikheev afd9295786 BTREE_VERSION_1: using bti_itup->t_tid as unique identifier for a given
index tuple (logical position within A LEVEL). bti_oid & bti_dummy
taken off from BTItemData.
1997-04-16 01:21:59 +00:00
Vadim B. Mikheev 427a87911d New func _bt_checkkeys() added to let caller know number of keys
for which checking was TRUE.
1997-03-24 08:04:51 +00:00
Marc G. Fournier d146305065 Patches for Vadim's multikey indexing... 1997-03-18 18:41:37 +00:00
Vadim B. Mikheev 36058981a4 Added: UNIQUE feature to bulkload code. 1997-02-22 10:08:27 +00:00
Bruce Momjian a17b01f320 Update btree patches that were missed. 1997-02-18 17:14:25 +00:00
Bruce Momjian d38767fcb5 Add prototypes and remove unused variables from btree Fastbuild patch. 1997-02-14 22:47:36 +00:00
Marc G. Fournier 5d9f146c64 What looks like some *major* improvements to btree indexing...
Patches from: aoki@CS.Berkeley.EDU (Paul M. Aoki)

i gave jolly my btree bulkload code a long, long time ago but never
gave him a bunch of my bugfixes.  here's a diff against the 6.0
baseline.

for some reason, this code has slowed down somewhat relative to the
insertion-build code on very small tables.  don't know why -- it used
to be within about 10%.  anyway, here are some (highly unscientific!)
timings on a dec 3000/300 for synthetic tables with 10k, 100k and
1000k tuples (basically, 1mb, 10mb and 100mb heaps).  'c' means
clustered (pre-sorted) inputs and 'u' means unclustered (randomly
ordered) inputs.  the 10k table basically fits in the buffer pool, but
the 100k and 1000k tables don't.  as you can see, insertion build is
fine if you've sorted your heaps on your index key or if your heap
fits in core, but is absolutely horrible on unordered data (yes,
that's 7.5 hours to index 100mb of data...) because of the zillions of
random i/os.

if it doesn't work for you for whatever reason, you can always turn it
back off by flipping the FastBuild flag in nbtree.c.  i don't have
time to maintain it.

good luck!

baseline code:

time psql -c 'create index c10 on k10 using btree (c int4_ops)' bttest
real   8.6
time psql -c 'create index u10 on k10 using btree (b int4_ops)' bttest
real   9.1
time psql -c 'create index c100 on k100 using btree (c int4_ops)' bttest
real   59.2
time psql -c 'create index u100 on k100 using btree (b int4_ops)' bttest
real   652.4
time psql -c 'create index c1000 on k1000 using btree (c int4_ops)' bttest
real   636.1
time psql -c 'create index u1000 on k1000 using btree (b int4_ops)' bttest
real   26772.9

bulkloading code:

time psql -c 'create index c10 on k10 using btree (c int4_ops)' bttest
real   11.3
time psql -c 'create index u10 on k10 using btree (b int4_ops)' bttest
real   10.4
time psql -c 'create index c100 on k100 using btree (c int4_ops)' bttest
real   59.5
time psql -c 'create index u100 on k100 using btree (b int4_ops)' bttest
real   63.5
time psql -c 'create index c1000 on k1000 using btree (c int4_ops)' bttest
real   636.9
time psql -c 'create index u1000 on k1000 using btree (b int4_ops)' bttest
real   701.0
1997-02-12 05:04:52 +00:00
Vadim B. Mikheev c7990b35f7 index_insert has now HeapRelation as last param (for
unique index implementation).
1997-01-10 09:36:34 +00:00
Marc G. Fournier 07a65b2255 Commit of a *MAJOR* patch from Dan McGuirk <djm@indirect.com>
Changes:

        * Unique index capability works using the syntax 'create unique
          index'.

        * Duplicate OID's in the system tables are removed.  I put
          little scripts called 'duplicate_oids' and 'find_oid' in
          include/catalog that help to find and remove duplicate OID's.
          I also moved 'unused_oids' from backend/catalog to
          include/catalog, since it has to be in the same directory
          as the include files in order to work.

        * The backend tries converting the name of a function or aggregate
          to all lowercase if the original name given doesn't work (mostly
          for compatibility with ODBC).

        * You can 'SELECT NULL' to your heart's content.

        * I put my _bt_updateitem fix in instead, which uses
          _bt_insertonpg so that even if the new key is so big that
          the page has to be split, everything still works.

        * All literal references to system catalog OID's have been
          replaced with references to define'd constants from the catalog
          header files.

        * I added a couple of node copy functions.  I think this was a
          preliminary attempt to get rules to work.
1996-11-13 20:56:15 +00:00
Marc G. Fournier 6608278ea4 these ones have their dependencies cleaned up 1996-11-05 10:37:16 +00:00
Marc G. Fournier f36b2560a4 Major code cleanups from D'arcy (-Wall -Werror) 1996-10-23 07:42:13 +00:00
Marc G. Fournier 5a8820efcd Moved from backend/access to include/access 1996-08-27 21:50:29 +00:00