Commit Graph

3614 Commits

Author SHA1 Message Date
Bruce Momjian
1f54d43075 Add GUC variables to control keep-alive times for idle, interval, and
count.

Oliver Jowett
2005-07-30 15:17:26 +00:00
Tom Lane
5d5f1a79e6 Clean up a number of autovacuum loose ends. Make the stats collector
track shared relations in a separate hashtable, so that operations done
from different databases are counted correctly.  Add proper support for
anti-XID-wraparound vacuuming, even in databases that are never connected
to and so have no stats entries.  Miscellaneous other bug fixes.
Alvaro Herrera, some additional fixes by Tom Lane.
2005-07-29 19:30:09 +00:00
Bruce Momjian
2ea44d1ada Update catversion for dbsize changes. 2005-07-29 15:04:22 +00:00
Bruce Momjian
358a897fa1 Move dbsize functions into the backend. New functions:
pg_tablespace_size
	pg_database_size
	pg_relation_size
	pg_complete_relation_size
	pg_size_pretty

Remove /contrib/dbsize.

Dave Page
2005-07-29 14:47:04 +00:00
Tom Lane
a4ca842319 Fix a bunch of bad interactions between partial indexes and the new
planning logic for bitmap indexscans.  Partial indexes create corner
cases in which a scan might be done with no explicit index qual conditions,
and the code wasn't handling those cases nicely.  Also be a little
tenser about eliminating redundant clauses in the generated plan.
Per report from Dmitry Karasik.
2005-07-28 20:26:22 +00:00
Neil Conway
a4c75ece82 Fix a few macro definitions to ensure that unary minus is enclosed in
parentheses. This avoids possible operator precedence problems, and
is consistent with most of the macro definitions in the tree.
2005-07-27 12:44:10 +00:00
Neil Conway
b98b75eb3b Remove MMCacheLock -- it is no longer used. Per ITAGAKI Takahiro. 2005-07-27 08:05:36 +00:00
Tom Lane
af019fb9ae Add a role property 'rolinherit' which, when false, denotes that the role
doesn't automatically inherit the privileges of roles it is a member of;
for such a role, membership in another role can be exploited only by doing
explicit SET ROLE.  The default inherit setting is TRUE, so by default
the behavior doesn't change, but creating a user with NOINHERIT gives closer
adherence to our current reading of SQL99.  Documentation still lacking,
and I think the information schema needs another look.
2005-07-26 16:38:29 +00:00
Tom Lane
f9fd176461 Add pg_has_role() family of privilege inquiry functions modeled after the
existing ones for object privileges.  Update the information_schema for
roles --- pg_has_role() makes this a whole lot easier, removing the need
for most of the explicit joins with pg_user.  The views should be a tad
faster now, too.  Stephen Frost and Tom Lane.
2005-07-26 00:04:19 +00:00
Tom Lane
e5d6b91220 Add SET ROLE. This is a partial commit of Stephen Frost's recent patch;
I'm still working on the has_role function and information_schema changes.
2005-07-25 22:12:34 +00:00
Tom Lane
d007a95055 Simple constraint exclusion. For now, only child tables of inheritance
scans are candidates for exclusion; this should be fixed eventually.
Simon Riggs, with some help from Tom Lane.
2005-07-23 21:05:48 +00:00
Bruce Momjian
3dbbbbf8e9 Andrew pointed out that the current fix didn't handle dates that were
near daylight savings time boudaries.  This handles it properly, e.g.

        test=> select '2005-04-03 04:00:00'::timestamp at time zone
        'America/Los_Angeles';
                timezone
        ------------------------
         2005-04-03 07:00:00-04
        (1 row)
2005-07-23 14:25:34 +00:00
Bruce Momjian
75e5aba7fe Update date/time comments. 2005-07-22 05:08:26 +00:00
Bruce Momjian
d5f1e08c0c Code spacing improvement, particularly *tm spacing. 2005-07-22 03:46:34 +00:00
Bruce Momjian
e9c44bd382 More comment update of time macros. 2005-07-21 20:37:21 +00:00
Bruce Momjian
e6b72d6af6 Update DAYS_PER_MONTH comment.
Add SECS_PER_YEAR and MINS_PER_HOUR macros.
2005-07-21 18:06:13 +00:00
Bruce Momjian
a0407f508a Add comment about void* use in MemSet. 2005-07-21 15:16:30 +00:00
Bruce Momjian
aa0f6e8d06 Add comment marking non-exact time conversion macros. 2005-07-21 04:48:42 +00:00
Bruce Momjian
a536b2dd80 Add time/date macros for code clarity:
#define DAYS_PER_YEAR   365.25
	#define MONTHS_PER_YEAR 12
	#define DAYS_PER_MONTH  30
	#define HOURS_PER_DAY   24
2005-07-21 03:56:25 +00:00
Bruce Momjian
ddc038cad2 Update catalog version for INTERVAL day addition. 2005-07-20 17:24:39 +00:00
Bruce Momjian
db05f4a7eb Add 'day' field to INTERVAL so 1 day interval can be distinguished from
24 hours. This is very helpful for daylight savings time:

	select '2005-05-03 00:00:00 EST'::timestamp with time zone + '24 hours';
	      ?column?
	----------------------
	2005-05-04 01:00:00-04

	select '2005-05-03 00:00:00 EST'::timestamp with time zone + '1 day';
	      ?column?
	----------------------
	2005-05-04 01:00:00-04

Michael Glaesemann
2005-07-20 16:42:32 +00:00
Tom Lane
ac43da8466 MemSet() must not cast its pointer argument to int32* until after it has
checked that the pointer is actually word-aligned.  Casting a non-aligned
pointer to int32* is technically illegal per the C spec, and some recent
versions of gcc actually generate bad code for the memset() when given
such a pointer.  Per report from Andrew Morrow.
2005-07-18 15:53:28 +00:00
Tom Lane
aa1110624c Adjust permissions checking for ALTER OWNER commands: instead of
requiring superuserness always, allow an owner to reassign ownership
to any role he is a member of, if that role would have the right to
create a similar object.  These three requirements essentially state
that the would-be alterer has enough privilege to DROP the existing
object and then re-CREATE it as the new role; so we might as well
let him do it in one step.  The ALTER TABLESPACE case is a bit
squirrely, but the whole concept of non-superuser tablespace owners
is pretty dubious anyway.  Stephen Frost, code review by Tom Lane.
2005-07-14 21:46:30 +00:00
Tom Lane
29094193f5 Integrate autovacuum functionality into the backend. There's still a
few loose ends to be dealt with, but it seems to work.  Alvaro Herrera,
based on the contrib code by Matthew O'Connor.
2005-07-14 05:13:45 +00:00
Tom Lane
d78397d301 Change typreceive function API so that receive functions get the same
optional arguments as text input functions, ie, typioparam OID and
atttypmod.  Make all the datatypes that use typmod enforce it the same
way in typreceive as they do in typinput.  This fixes a problem with
failure to enforce length restrictions during COPY FROM BINARY.
2005-07-10 21:14:00 +00:00
Bruce Momjian
75a64eeb4b I made the patch that implements regexp_replace again.
The specification of this function is as follows.

regexp_replace(source text, pattern text, replacement text, [flags
text])
returns text

Replace string that matches to regular expression in source text to
replacement text.

 - pattern is regular expression pattern.
 - replacement is replace string that can use '\1'-'\9', and '\&'.
    '\1'-'\9': back reference to the n'th subexpression.
    '\&'     : entire matched string.
 - flags can use the following values:
    g: global (replace all)
    i: ignore case
    When the flags is not specified, case sensitive, replace the first
    instance only.

Atsushi Ogawa
2005-07-10 04:54:33 +00:00
Neil Conway
40ffa1a14c Remove some dead code for handling XLOG_DBASE_CREATE_OLD and
XLOG_DBASE_DROP_OLD WAL records -- these records are no longer created in
current sources. Adjust numbering of XLOG_DBASE_CREATE and XLOG_DBASE_DROP
and bump the catversion. Patch from Gavin Sherry, adjusted by Neil Conway.
2005-07-08 04:12:27 +00:00
Tom Lane
59d1b3d99e Track dependencies on shared objects (which is to say, roles; we already
have adequate mechanisms for tracking the contents of databases and
tablespaces).  This solves the longstanding problem that you can drop a
user who still owns objects and/or has access permissions.
Alvaro Herrera, with some kibitzing from Tom Lane.
2005-07-07 20:40:02 +00:00
Bruce Momjian
970bb03c3c Complete zic patch backout by removing NO_PGPORT workaround. 2005-07-06 21:40:09 +00:00
Bruce Momjian
a923602855 Add pg_column_size() to return storage size of a column, including
possible compression.

Mark Kirkwood
2005-07-06 19:02:54 +00:00
Bruce Momjian
7e33fae3c1 Add NO_PGPORT defines to fix win32/cygwin builds for new target platform
build of zic.
2005-07-05 17:24:30 +00:00
Bruce Momjian
4f979e8bac Restructure zic #define fprintf checks to use a NO_PGPORT macro instead. 2005-07-04 19:54:51 +00:00
Tom Lane
eb5949d190 Arrange for the postmaster (and standalone backends, initdb, etc) to
chdir into PGDATA and subsequently use relative paths instead of absolute
paths to access all files under PGDATA.  This seems to give a small
performance improvement, and it should make the system more robust
against naive DBAs doing things like moving a database directory that
has a live postmaster in it.  Per recent discussion.
2005-07-04 04:51:52 +00:00
Tom Lane
cc5e80b8d1 Teach planner about some cases where a restriction clause can be
propagated inside an outer join.  In particular, given
LEFT JOIN ON (A = B) WHERE A = constant, we cannot conclude that
B = constant at the top level (B might be null instead), but we
can nonetheless put a restriction B = constant into the quals for
B's relation, since no inner-side rows not meeting that condition
can contribute to the final result.  Similarly, given
FULL JOIN USING (J) WHERE J = constant, we can't directly conclude
that either input J variable = constant, but it's OK to push such
quals into each input rel.  Per recent gripe from Kim Bisgaard.
Along the way, remove 'valid_everywhere' flag from RestrictInfo,
as on closer analysis it was not being used for anything, and was
defined backwards anyway.
2005-07-02 23:00:42 +00:00
Bruce Momjian
74b49a8129 Add E'' to internally created SQL strings that contain backslashes.
Improve code clarity by using macros for E'' processing.
2005-07-02 17:01:59 +00:00
Tom Lane
e7e1694295 Migrate rtree_gist functionality into the core system, and add some
basic regression tests for GiST to the standard regression tests.
I took the opportunity to add an rtree-equivalent gist opclass for
circles; the contrib version only covered boxes and polygons, but
indexing circles is very handy for distance searches.
2005-07-01 19:19:05 +00:00
Peter Eisentraut
875efad481 Update to autoconf 2.59 as well as updates of related scripts 2005-07-01 18:17:31 +00:00
Teodor Sigaev
898a7bd13b Bug fixes for GiST crash recovery.
- add forgotten check of lsn for insert completion
- remove level of pages: hard to check in recovery
- some cleanups
2005-06-30 17:52:14 +00:00
Tom Lane
401de9c8be Improve the checkpoint signaling mechanism so that the bgwriter can tell
the difference between checkpoints forced due to WAL segment consumption
and checkpoints forced for other reasons (such as CREATE DATABASE).  Avoid
generating 'checkpoints are occurring too frequently' messages when the
checkpoint wasn't caused by WAL segment consumption.  Per gripe from
Chris K-L.
2005-06-30 00:00:52 +00:00
Tom Lane
b5f7cff84f Clean up the rather historically encumbered interface to now() and
current time: provide a GetCurrentTimestamp() function that returns
current time in the form of a TimestampTz, instead of separate time_t
and microseconds fields.  This is what all the callers really want
anyway, and it eliminates low-level dependencies on AbsoluteTime,
which is a deprecated datatype that will have to disappear eventually.
2005-06-29 22:51:57 +00:00
Tom Lane
c33d575899 More cleanup on roles patch. Allow admin option to be inherited through
role memberships; make superuser/createrole distinction do something
useful; fix some locking and CommandCounterIncrement issues; prevent
creation of loops in the membership graph.
2005-06-29 20:34:15 +00:00
Tom Lane
0eaa36a16a Bring syntax of role-related commands into SQL compliance. To avoid
syntactic conflicts, both privilege and role GRANT/REVOKE commands have
to use the same production for scanning the list of tokens that might
eventually turn out to be privileges or role names.  So, change the
existing GRANT/REVOKE code to expect a list of strings not pre-reduced
AclMode values.  Fix a couple other minor issues while at it, such as
InitializeAcl function name conflicting with a Windows system function.
2005-06-28 19:51:26 +00:00
Tom Lane
7762619e95 Replace pg_shadow and pg_group by new role-capable catalogs pg_authid
and pg_auth_members.  There are still many loose ends to finish in this
patch (no documentation, no regression tests, no pg_dump support for
instance).  But I'm going to commit it now anyway so that Alvaro can
make some progress on shared dependencies.  The catalog changes should
be pretty much done.
2005-06-28 05:09:14 +00:00
Teodor Sigaev
e8cab5fe49 Concurrency for GiST
- full concurrency for insert/update/select/vacuum:
        - select and vacuum never locks more than one page simultaneously
        - select (gettuple) hasn't any lock across it's calls
        - insert never locks more than two page simultaneously:
                - during search of leaf to insert it locks only one page
                  simultaneously
                - while walk upward to the root it locked only parent (may be
                  non-direct parent) and child. One of them X-lock, another may
                  be S- or X-lock
- 'vacuum full' locks index
- improve gistgetmulti
- simplify XLOG records

Fix bug in index_beginscan_internal: LockRelation may clean
  rd_aminfo structure, so move GET_REL_PROCEDURE after LockRelation
2005-06-27 12:45:23 +00:00
Neil Conway
a159ad3048 Remove support for Kerberos V4. It seems no one is using this, it has
some security issues, and upstream has declared it "dead". Patch from
Magnus Hagander, minor editorialization from Neil Conway.
2005-06-27 02:04:26 +00:00
Tom Lane
06ae88a82e Tweak dynahash.c to not allocate so many entries at once when dealing
with a table that has a small predicted size.  Avoids wasting several
hundred K on the timezone hash table, which is likely to have only one
or a few entries, but the entries use up 10Kb apiece ...
2005-06-26 23:32:34 +00:00
Tom Lane
943b396245 Add Oracle-compatible GREATEST and LEAST functions. Pavel Stehule 2005-06-26 22:05:42 +00:00
Tom Lane
d395aecffa Code review for escape-strings patch. Sync psql and plpgsql lexers
with main, avoid using a SQL-defined SQLSTATE for what is most definitely
not a SQL-compatible error condition, fix documentation omissions,
adhere to message style guidelines, don't use two GUC_REPORT variables
when one is sufficient.  Nothing done about pg_dump issues.
2005-06-26 19:16:07 +00:00
Bruce Momjian
bb3cce4ec9 Add E'' syntax so eventually normal strings can treat backslashes
literally.

Add GUC variables:

        "escape_string_warning" - warn about backslashes in non-E strings
        "escape_string_syntax" - supports E'' syntax?
        "standard_compliant_strings" - treats backslashes literally in ''

Update code to use E'' when escapes are used.
2005-06-26 03:04:37 +00:00
Tom Lane
c96375a39b Fix a couple of items that should be declared Oid not int. Purely
cosmetic at the moment, but someday Oid might be 64 bits ...
2005-06-25 23:58:58 +00:00
Tom Lane
b90f8f20f0 Extend r-tree operator classes to handle Y-direction tests equivalent
to the existing X-direction tests.  An rtree class now includes 4 actual
2-D tests, 4 1-D X-direction tests, and 4 1-D Y-direction tests.
This involved adding four new Y-direction test operators for each of
box and polygon; I followed the PostGIS project's lead as to the names
of these operators.
NON BACKWARDS COMPATIBLE CHANGE: the poly_overleft (&<) and poly_overright
(&>) operators now have semantics comparable to box_overleft and box_overright.
This is necessary to make r-tree indexes work correctly on polygons.
Also, I changed circle_left and circle_right to agree with box_left and
box_right --- formerly they allowed the boundaries to touch.  This isn't
actually essential given the lack of any r-tree opclass for circles, but
it seems best to sync all the definitions while we are at it.
2005-06-24 20:53:34 +00:00
Tom Lane
9a09248edd Fix rtree and contrib/rtree_gist search behavior for the 1-D box and
polygon operators (<<, &<, >>, &>).  Per ideas originally put forward
by andrew@supernews and later rediscovered by moi.  This patch just
fixes the existing opclasses, and does not add any new behavior as I
proposed earlier; that can be sorted out later.  In principle this
could be back-patched, since it changes only search behavior and not
system catalog entries nor rtree index contents.  I'm not currently
planning to do that, though, since I think it could use more testing.
2005-06-24 00:18:52 +00:00
Tom Lane
4cc7a93d22 Make REINDEX DATABASE do what one would expect, namely reindex all indexes
in the database.  The old behavior (reindex system catalogs only) is now
available as REINDEX SYSTEM.  I did not add the complementary REINDEX USER
case since there did not seem to be consensus for this, but it would be
trivial to add later.  Per recent discussions.
2005-06-22 21:14:31 +00:00
Tom Lane
e98edb5555 Fix the mechanism for reporting the original table OID and column number
of columns of a query result so that it can "see through" cursors and
prepared statements.  Per gripe a couple months back from John DeSoi.
2005-06-22 17:45:46 +00:00
Tom Lane
6f7fc0bade Cause initdb to create a third standard database "postgres", which
unlike template0 and template1 does not have any special status in
terms of backend functionality.  However, all external utilities such
as createuser and createdb now connect to "postgres" instead of
template1, and the documentation is changed to encourage people to use
"postgres" instead of template1 as a play area.  This should fix some
longstanding gotchas involving unexpected propagation of database
objects by createdb (when you used template1 without understanding
the implications), as well as ameliorating the problem that CREATE
DATABASE is unhappy if anyone else is connected to template1.
Patch by Dave Page, minor editing by Tom Lane.  All per recent
pghackers discussions.
2005-06-21 04:02:34 +00:00
Tom Lane
b95ae32b41 Avoid WAL-logging individual tuple insertions during CREATE TABLE AS
(a/k/a SELECT INTO).  Instead, flush and fsync the whole relation before
committing.  We do still need the WAL log when PITR is active, however.
Simon Riggs and Tom Lane.
2005-06-20 18:37:02 +00:00
Teodor Sigaev
1bfdd1a893 fix founded hole in recovery after crash, add vacuum_delay_point() 2005-06-20 15:22:38 +00:00
Teodor Sigaev
d544ec8bbd 1. full functional WAL for GiST
2. improve vacuum for gist
   - use FSM
   - full vacuum:
      - reforms parent tuple if it's needed
        ( tuples was deleted on child page or parent tuple remains invalid
          after crash recovery )
      - truncate index file if possible
3. fixes bugs and mistakes
2005-06-20 10:29:37 +00:00
Tom Lane
3f749924f8 Simplify uses of readdir() by creating a function ReadDir() that
includes error checking and an appropriate ereport(ERROR) message.
This gets rid of rather tedious and error-prone manipulation of errno,
as well as a Windows-specific bug workaround, at more than a dozen
call sites.  After an idea in a recent patch by Heikki Linnakangas.
2005-06-19 21:34:03 +00:00
Tom Lane
e26b0abda3 Arrange to fsync two-phase-commit state files only during checkpoints;
given reasonably short lifespans for prepared transactions, this should
mean that only a small minority of state files ever need to be fsynced
at all.  Per discussion with Heikki Linnakangas.
2005-06-19 20:00:39 +00:00
Tom Lane
6a6f2d91d4 When using C-string lookup keys in a dynahash.c hash table, use strncpy()
not memcpy() to copy the offered key into the hash table during HASH_ENTER.
This avoids possible core dump if the passed key is located very near the
end of memory.  Per report from Stefan Kaltenbrunner.
2005-06-18 20:51:30 +00:00
Tom Lane
a8d1075f27 Add a time-of-preparation column to the pg_prepared_xacts view, per an
old suggestion by Oliver Jowett.  Also, add a transaction column to the
pg_locks view to show the xid of each transaction holding or awaiting
locks; this allows prepared transactions to be properly associated with
the locks they own.  There was already a column named 'transaction',
and I chose to rename it to 'transactionid' --- since this column is
new in the current devel cycle there should be no backwards compatibility
issue to worry about.
2005-06-18 19:33:42 +00:00
Tom Lane
d0a89683a3 Two-phase commit. Original patch by Heikki Linnakangas, with additional
hacking by Alvaro Herrera and Tom Lane.
2005-06-17 22:32:51 +00:00
Bruce Momjian
26cbccd52c Add fsync() define for Win32 to cover cases other than wal_sync_method
where we need fsync().
2005-06-16 17:53:54 +00:00
Bruce Momjian
2becf48483 Update catalog version for recent function additions. 2005-06-15 12:56:35 +00:00
Neil Conway
c119c5bd49 Change the implementation of hash join to attempt to avoid unnecessary
work if either of the join relations are empty. The logic is:

(1) if the inner relation's startup cost is less than the outer
    relation's startup cost and this is not an outer join, read
    a single tuple from the inner relation via ExecHash()
      - if NULL, we're done

(2) read a single tuple from the outer relation
      - if NULL, we're done

(3) build the hash table on the inner relation
      - if hash table is empty and this is not an outer join,
        we're done

(4) otherwise, do hash join as usual

The implementation uses the new MultiExecProcNode API, per a
suggestion from Tom: invoking ExecHash() now produces the first
tuple from the Hash node's child node, whereas MultiExecHash()
builds the hash table.

I had to put in a bit of a kludge to get the row count returned
for EXPLAIN ANALYZE to be correct: since ExecHash() is invoked to
return a tuple, and then MultiExecHash() is invoked, we would
return one too many tuples to EXPLAIN ANALYZE. I hacked around
this by just manually detecting this situation and subtracting 1
from the EXPLAIN ANALYZE row count.
2005-06-15 07:27:44 +00:00
Bruce Momjian
0851a6fbc7 This patch makes it possible to use the full set of timezones when doing
"AT TIME ZONE", and not just the shorlist previously available. For
example:

SELECT CURRENT_TIMESTAMP AT TIME ZONE 'Europe/London';

works fine now. It will also obey whatever DST rules were in effect at
just that date, which the previous implementation did not.

It also supports the AT TIME ZONE on the timetz datatype. The whole
handling of DST is a bit bogus there, so I chose to make it use whatever
DST rules are in effect at the time of executig the query. not sure if
anybody is actuallyi *using* timetz though, it seems pretty
unpredictable just because of this...

Magnus Hagander
2005-06-15 00:34:11 +00:00
Bruce Momjian
5955945828 Support 3 and 4-byte unicode characters.
John Hansen
2005-06-15 00:15:08 +00:00
Tom Lane
8563ccae2c Simplify shared-memory lock data structures as per recent discussion:
it is sufficient to track whether a backend holds a lock or not, and
store information about transaction vs. session locks only in the
inside-the-backend LocalLockTable.  Since there can now be but one
PROCLOCK per lock per backend, LockCountMyLocks() is no longer needed,
thus eliminating some O(N^2) behavior when a backend holds many locks.
Also simplify the LockAcquire/LockRelease API by passing just a
'sessionLock' boolean instead of a transaction ID.  The previous API
was designed with the idea that per-transaction lock holding would be
important for subtransactions, but now that we have subtransactions we
know that this is unwanted.  While at it, add an 'isTempObject' parameter
to LockAcquire to indicate whether the lock is being taken on a temp
table.  This is not used just yet, but will be needed shortly for
two-phase commit.
2005-06-14 22:15:33 +00:00
Bruce Momjian
f5835b4b8d Add pg_postmaster_start_time() function.
Euler Taveira de Oliveira
Matthias Schmidt
2005-06-14 21:04:42 +00:00
Bruce Momjian
954f6bcffe Add GUC krb_server_hostname so the server hostname can be specified as
part of service principal.  If not set, any service principal matching
an entry in the keytab can be used.

NEW KERBEROS MATCHING BEHAVIOR FOR 8.1.

Todd Kover
2005-06-14 17:43:14 +00:00
Teodor Sigaev
37c839365c WAL for GiST. It work for online backup and so on, but on
recovery after crash (power loss etc) it may say that it can't restore
index and index should be reindexed.

Some refactoring code.
2005-06-14 11:45:14 +00:00
Tom Lane
c186c93148 Change the planner to allow indexscan qualification clauses to use
nonconsecutive columns of a multicolumn index, as per discussion around
mid-May (pghackers thread "Best way to scan on-disk bitmaps").  This
turns out to require only minimal changes in btree, and so far as I can
see none at all in GiST.  btcostestimate did need some work, but its
original assumption that index selectivity == heap selectivity was
quite bogus even before this.
2005-06-13 23:14:49 +00:00
Tom Lane
a2fb7b8a1f Adjust lo_open() so that specifying INV_READ without INV_WRITE creates
a descriptor that uses the current transaction snapshot, rather than
SnapshotNow as it did before (and still does if INV_WRITE is set).
This means pg_dump will now dump a consistent snapshot of large object
contents, as it never could do before.  Also, add a lo_create() function
that is similar to lo_creat() but allows the desired OID of the large
object to be specified.  This will simplify pg_restore considerably
(but I'll fix that in a separate commit).
2005-06-13 02:26:53 +00:00
Tom Lane
2f1210629c Separate predicate-testing code out of indxpath.c, making it a module
in its own right.  As proposed by Simon Riggs, but with some editorializing
of my own.
2005-06-10 22:25:37 +00:00
Neil Conway
d46bc444ac Implement two new special variables in PL/PgSQL: SQLSTATE and SQLERRM.
These contain the SQLSTATE and error message of the current exception,
respectively. They are scope-local variables that are only defined
in exception handlers (so attempting to reference them outside an
exception handler is an error). Update the regression tests and the
documentation.

Also, do some minor related cleanup: export an unpack_sql_state()
function from the backend and use it to unpack a SQLSTATE into a
string, and add a free_var() function to pl_exec.c

Original patch from Pavel Stehule, review by Neil Conway.
2005-06-10 16:23:11 +00:00
Tom Lane
a87ee007ed Quick hack to allow the outer query's tuple_fraction to be passed down
to a subquery if the outer query is simple enough that the LIMIT can
be reflected directly to the subquery.  This didn't use to be very
interesting, because a subquery that couldn't have been flattened into
the upper query was usually not going to be very responsive to
tuple_fraction anyway.  But with new code that allows UNION ALL subqueries
to pay attention to tuple_fraction, this is useful to do.  In particular
this lets the optimization occur when the UNION ALL is directly inside
a view.
2005-06-10 03:32:25 +00:00
Tom Lane
3b167a4099 If a LIMIT is applied to a UNION ALL query, plan each UNION arm as
if the limit were directly applied to it.  This does not actually
add a LIMIT plan node to the generated subqueries --- that would be
useless overhead --- but it does cause the planner to prefer fast-
start plans when the limit is small.  After an idea from Phil Endecott.
2005-06-10 02:21:05 +00:00
Tom Lane
532ca3083d Avoid bare 'struct Node;' declaration --- provokes annoying warnings
on some compilers.
2005-06-09 18:44:05 +00:00
Bruce Momjian
4d0e7b4aac Please find attached a patch (diff -c against cvs HEAD) to add a
function that accepts a double precision argument assumed to be a Unix
epoch timestamp and returns timestamp with time zone, and accompanying
documentation.

Usage:

test=# select to_timestamp(200120400);
       to_timestamp
------------------------
  1976-05-05 14:00:00+09
(1 row)

Michael Glaesemann
2005-06-09 16:35:09 +00:00
Tom Lane
a31ad27fc5 Simplify the planner's join clause management by storing join clauses
of a relation in a flat 'joininfo' list.  The former arrangement grouped
the join clauses according to the set of unjoined relids used in each;
however, profiling on test cases involving lots of joins proves that
that data structure is a net loss.  It takes more time to group the
join clauses together than is saved by avoiding duplicate tests later.
It doesn't help any that there are usually not more than one or two
clauses per group ...
2005-06-09 04:19:00 +00:00
Tom Lane
e3a33a9a9f Marginal hack to avoid spending a lot of time in find_join_rel during
large planning problems: when the list of join rels gets too long, make
an auxiliary hash table that hashes on the identifying Bitmapset.
2005-06-08 23:02:05 +00:00
Tom Lane
77c168a836 Remove grammar productions for prefix and postfix % and ^ operators,
as well as the existing pg_catalog entries for prefix and postfix %.
These have never been documented, though they did appear in one old
regression test.  This avoids surprising behavior in cases like
"SELECT -25 % -10".  Per recent discussion.
Note: although there is a catalog change here, I did not force initdb
since there's no harm in leaving the inaccessible entries in one's
copy of pg_operator.
2005-06-08 21:15:29 +00:00
Tom Lane
f5b2f60bd1 Change WAL-logging scheme for multixacts to be more like regular
transaction IDs, rather than like subtrans; in particular, the information
now survives a database restart.  Per previous discussion, this is
essential for PITR log shipping and for 2PC.
2005-06-08 15:50:28 +00:00
Neil Conway
657c098e41 Add a function lastval(), which returns the value returned by the
last nextval() or setval() performed by the current session. Update the
docs, add regression tests, and bump the catalog version. Patch from
Dennis Björklund, various improvements by Neil Conway.
2005-06-07 07:08:35 +00:00
Tom Lane
ee7ac7b11e Modify XLogInsert API to make callers specify whether pages to be backed
up have the standard layout with unused space between pd_lower and pd_upper.
When this is set, XLogInsert will omit the unused space without bothering
to scan it to see if it's zero.  That saves time in XLogInsert, and also
allows reversion of my earlier patch to make PageRepairFragmentation et al
explicitly re-zero freed space.  Per suggestion by Heikki Linnakangas.
2005-06-06 20:22:58 +00:00
Tom Lane
4c8495a1f2 Remove the mostly-stubbed-out-anyway support routines for WAL UNDO.
That code is never going to be used in the foreseeable future, and
where it's more than a stub it's making the redo routines harder to
read.
2005-06-06 17:01:25 +00:00
Tom Lane
9a586fe0c5 Nab some low-hanging fruit: replace the planner's base_rel_list and
other_rel_list with a single array indexed by rangetable index.
This reduces find_base_rel from O(N) to O(1) without any real penalty.
While find_base_rel isn't one of the major bottlenecks in any profile
I've seen so far, it was starting to creep up on the radar screen
for complex queries --- so might as well fix it.
2005-06-06 04:13:36 +00:00
Tom Lane
9ab4d98168 Remove planner's private fields from Query struct, and put them into
a new PlannerInfo struct, which is passed around instead of the bare
Query in all the planning code.  This commit is essentially just a
code-beautification exercise, but it does open the door to making
larger changes to the planner data structures without having to muck
with the widely-known Query struct.
2005-06-05 22:32:58 +00:00
Tom Lane
a4996a8953 Replace the parser's namespace tree (which formerly had the same
representation as the jointree) with two lists of RTEs, one showing
the RTEs accessible by qualified names, and the other showing the RTEs
accessible by unqualified names.  I think this is conceptually simpler
than what we did before, and it's sure a whole lot easier to search.
This seems to eliminate the parse-time bottleneck for deeply nested
JOIN structures that was exhibited by phil@vodafone.
2005-06-05 00:38:11 +00:00
Bruce Momjian
72c53ac3a7 Allow kerberos name and username case sensitivity to be specified from
postgresql.conf.

---------------------------------------------------------------------------


Here's an updated version of the patch, with the following changes:

1) No longer uses "service name" as "application version". It's instead
hardcoded as "postgres". It could be argued that this part should be
backpatched to 8.0, but it doesn't make a big difference until you can
start changing it with GUC / connection parameters. This change only
affects kerberos 5, not 4.

2) Now downcases kerberos usernames when the client is running on win32.

3) Adds guc option for "krb_caseins_users" to make the server ignore
case mismatch which is required by some KDCs such as Active Directory.
Off by default, per discussion with Tom. This change only affects
kerberos 5, not 4.

4) Updated so it doesn't conflict with the rendevouz/bonjour patch
already in ;-)

Magnus Hagander
2005-06-04 20:42:43 +00:00
Tom Lane
e18e8f8735 Change expandRTE() and ResolveNew() back to taking just the single
RTE of interest, rather than the whole rangetable list.  This makes
the API more understandable and avoids duplicate RTE lookups.  This
patch reverts no-longer-needed portions of my patch of 2004-08-19.
2005-06-04 19:19:42 +00:00
Tom Lane
ba42002461 Revise handling of dropped columns in JOIN alias lists to avoid a
performance problem pointed out by phil@vodafone: to wit, we were
spending O(N^2) time to check dropped-ness in an N-deep join tree,
even in the case where the tree was freshly constructed and couldn't
possibly mention any dropped columns.  Instead of recursing in
get_rte_attribute_is_dropped(), change the data structure definition:
the joinaliasvars list of a JOIN RTE must have a NULL Const instead
of a Var at any position that references a now-dropped column.  This
costs nothing during normal parse-rewrite-plan path, and instead we
have a linear-time update to make when loading a stored rule that
might contain now-dropped columns.  While at it, move the responsibility
for acquring locks on relations referenced by rules into this separate
function (which I therefore chose to call AcquireRewriteLocks).
This saves effort --- namely, duplicated lock grabs in parser and rewriter
--- in the normal path at a cost of one extra non-locked heap_open()
in the stored-rule path; seems a good tradeoff.  A fringe benefit is
that it is now *much* clearer that we acquire lock on relations referenced
in rules before we make any rewriter decisions based on their properties.
(I don't know of any bug of that ilk, but it wasn't exactly clear before.)
2005-06-03 23:05:30 +00:00
Tom Lane
b5ebef7c41 Push enable/disable of notify and catchup interrupts all the way down
to just around the bare recv() call that gets a command from the client.
The former placement in PostgresMain was unsafe because the intermediate
processing layers (especially SSL) use facilities such as malloc that are
not necessarily re-entrant.  Per report from counterstorm.com.
2005-06-02 21:03:25 +00:00
Tom Lane
21fda22ec4 Change CRCs in WAL records from 64bit to 32bit for performance reasons.
Instead of a separate CRC on each backup block, include backup blocks
in their parent WAL record's CRC; this is important to ensure that the
backup block really goes with the WAL record, ie there was not a page
tear right at the start of the backup block.  Implement a simple form
of compression of backup blocks: drop any run of zeroes starting at
pd_lower, so as not to store the unused 'hole' that commonly exists in
PG heap and index pages.  Tweak PageRepairFragmentation and related
routines to ensure they keep the unused space zeroed, so that the above
compression method remains effective.  All per recent discussions.
2005-06-02 05:55:29 +00:00
Tom Lane
83b72ee286 ParseComplexProjection should make use of expandRecordVariable so that
it can handle cases like (foo.x).y where foo is a subquery and x is
a function-returning-RECORD RTE in that subquery.
2005-05-31 01:03:23 +00:00
Tom Lane
978129f28e Document get_call_result_type() and friends; mark TypeGetTupleDesc()
and RelationNameGetTupleDesc() as deprecated; remove uses of the
latter in the contrib library.  Along the way, clean up crosstab()
code and documentation a little.
2005-05-30 23:09:07 +00:00
Bruce Momjian
25146d3c29 Add support for NUMERIC ^ NUMERIC based on power(numeric, numeric). 2005-05-30 20:59:17 +00:00
Neil Conway
adfeef55cb When enqueueing after-row triggers for updates of a table with a foreign
key, compare the new and old row versions. If the foreign key column has
not changed, we needn't enqueue the trigger, since the update cannot
violate the foreign key. This optimization was previously applied in the
RI trigger function, but it is more efficient to avoid firing the trigger
altogether. Per recent discussion on pgsql-hackers.

Also add a regression test for some unintuitive foreign key behavior, and
refactor some code that deals with the OIDs of the various RI trigger
functions.
2005-05-30 07:20:59 +00:00
Neil Conway
f99b75b0a0 Create separate ON INSERT and ON UPDATE triggers on tables with foreign
keys, rather than a single trigger for both events. This should not change
functionality, but it is more consistent: previously, there were trigger
functions for both "check_insert" and "check_update", but the former was
used for both events.

Bump catalog version number (not strictly necessary, but best to be
cautious).
2005-05-30 06:52:38 +00:00
Tom Lane
cfd9be939e Change the UNKNOWN type to have an internal representation matching
cstring, rather than text, so as to eliminate useless conversions
inside the parser.  Per recent discussion.
2005-05-30 01:20:50 +00:00
Tom Lane
140b078d2a Improve LockAcquire API per my recent proposal. All error conditions
are now reported via elog, eliminating the need to test the result code
at most call sites.  Make it possible for the caller to distinguish a
freshly acquired lock from one already held in the current transaction.
Use that capability to avoid redundant AcceptInvalidationMessages() calls
in LockRelation().
2005-05-29 22:45:02 +00:00
Tom Lane
d66daabec9 Remove typeidIsValid() checks in can_coerce_type(). These checks
were pretty expensive and I believe the case they were put in to
defend against can no longer arise, now that we have dependency checks
to prevent deletion of a type entry that is still referenced.  Certainly
the example given in the CVS log entry can't happen anymore.
Since this was the only use of typeidIsValid(), remove the routine too.
2005-05-29 18:24:14 +00:00
Tom Lane
e92a88272e Modify hash_search() API to prevent future occurrences of the error
spotted by Qingqing Zhou.  The HASH_ENTER action now automatically
fails with elog(ERROR) on out-of-memory --- which incidentally lets
us eliminate duplicate error checks in quite a bunch of places.  If
you really need the old return-NULL-on-out-of-memory behavior, you
can ask for HASH_ENTER_NULL.  But there is now an Assert in that path
checking that you aren't hoping to get that behavior in a palloc-based
hash table.
Along the way, remove the old HASH_FIND_SAVE/HASH_REMOVE_SAVED actions,
which were not being used anywhere anymore, and were surely too ugly
and unsafe to want to see revived again.
2005-05-29 04:23:07 +00:00
Tom Lane
32e8fc4a28 Arrange to cache fmgr lookup information for an index's access method
routines in the index's relcache entry, instead of doing a fresh fmgr_info
on every index access.  We were already doing this for the index's opclass
support functions; not sure why we didn't think to do it for the AM
functions too.  This supersedes the former method of caching (only)
amgettuple in indexscan scan descriptors; it's an improvement because the
function lookup can be amortized across multiple statements instead of
being repeated for each statement.  Even though lookup for builtin
functions is pretty cheap, this seems to drop a percent or two off some
simple benchmarks.
2005-05-27 23:31:21 +00:00
Neil Conway
a4374f9070 Remove second argument from textToQualifiedNameList(), as it is no longer
used. From Jaime Casanova.
2005-05-27 00:57:49 +00:00
Neil Conway
63e0d612f5 Adjust datetime parsing to be more robust. We now pass the length of the
working buffer into ParseDateTime() and reject too-long input there,
rather than checking the length of the input string before calling
ParseDateTime(). The old method was bogus because ParseDateTime() can use
a variable amount of working space, depending on the content of the
input string (e.g. how many fields need to be NUL terminated). This fixes
a minor stack overrun -- I don't _think_ it's exploitable, although I
won't claim to be an expert.

Along the way, fix a bug reported by Mark Dilger: the working buffer
allocated by interval_in() was too short, which resulted in rejecting
some perfectly valid interval input values. I added a regression test for
this fix.
2005-05-26 02:04:14 +00:00
Bruce Momjian
b492c3accc Add parentheses to macros when args are used in computations. Without
them, the executation behavior could be unexpected.
2005-05-25 21:40:43 +00:00
Bruce Momjian
f534820d4d Put parentheses around use of macro arguments in FMODULO and TMODULO. 2005-05-24 04:03:01 +00:00
Bruce Momjian
4550c1e519 More macro cleanups for date/time. 2005-05-23 21:54:02 +00:00
Bruce Momjian
5ebaae801c Add datetime macros for constants, for clarity:
#define SECS_PER_DAY  86400
#define USECS_PER_DAY INT64CONST(86400000000)
#define USECS_PER_HOUR    INT64CONST(3600000000)
#define USECS_PER_MINUTE INT64CONST(60000000)
#define USECS_PER_SEC INT64CONST(1000000)
2005-05-23 18:56:55 +00:00
Tom Lane
e2159f3842 Teach the planner to remove SubqueryScan nodes from the plan if they
aren't doing anything useful (ie, neither selection nor projection).
Also, extend to SubqueryScan the hacks already in place to avoid
unnecessary ExecProject calls when the result would just be the same
tuple the subquery already delivered.  This saves some overhead in
UNION and other set operations, as well as avoiding overhead for
unflatten-able subqueries.  Per example from Sokolov Yura.
2005-05-22 22:30:20 +00:00
Bruce Momjian
6dc7760ac3 Add support for wal_fsync_writethrough for Darwin, and restructure the
code to better handle writethrough.

Chris Campbell
2005-05-20 14:53:26 +00:00
Neil Conway
f3567eeaf2 Implement md5(bytea), update regression tests and documentation. Patch
from Abhijit Menon-Sen, minor editorialization from Neil Conway. Also,
improve md5(text) to allocate a constant-sized buffer on the stack
rather than via palloc.

Catalog version bumped.
2005-05-20 01:29:56 +00:00
Tom Lane
191b13aaca Factor out lock cleanup code that is needed in several places in lock.c.
Also, remove the rather useless return value of LockReleaseAll.  Change
response to detection of corruption in the shared lock tables to PANIC,
since that is the only way of cleaning up fully.
Originally an idea of Heikki Linnakangas, variously hacked on by
Alvaro Herrera and Tom Lane.
2005-05-19 23:30:18 +00:00
Tom Lane
ee3b71f6bc Split the shared-memory array of PGPROC pointers out of the sinval
communication structure, and make it its own module with its own lock.
This should reduce contention at least a little, and it definitely makes
the code seem cleaner.  Per my recent proposal.
2005-05-19 21:35:48 +00:00
Tom Lane
a9c4c9cd52 Extend the pg_locks system view so that it can fully display all lock
types, as per recent discussion.
2005-05-17 21:46:11 +00:00
Neil Conway
c891e05f26 Cleanup GiST header files. Since GiST extensions are often written as
external projects, we should be careful about what parts of the GiST
API are considered implementation details, and which are part of the
public API. Therefore, I've moved internal-only declarations into
gist_private.h -- future backward-incompatible changes to gist.h should
be made with care, to avoid needlessly breaking external GiST extensions.

Also did some related header cleanup: remove some unnecessary #includes
from gist.h, and remove some unused definitions: isAttByVal(), _gistdump(),
and GISTNStrategies.
2005-05-17 03:34:18 +00:00
Neil Conway
eda6dd32d1 GiST improvements:
- make sure we always invoke user-supplied GiST methods in a short-lived
  memory context. This means the backend isn't exposed to any memory leaks
  that be in those methods (in fact, it is probably a net loss for most
  GiST methods to bother manually freeing memory now). This also means
  we can do away with a lot of ugly manual memory management in the
  GiST code itself.

- keep the current page of a GiST index scan pinned, rather than doing a
  ReadBuffer() for each tuple produced by the scan. Since ReadBuffer() is
  expensive, this is a perf. win

- implement dead tuple killing for GiST indexes (which is easy to do, now
  that we keep a pin on the current scan page). Now all the builtin indexes
  implement dead tuple killing.

- cleanup a lot of ugly code in GiST
2005-05-17 00:59:30 +00:00
Bruce Momjian
c9a382b2ed Rename Rendezvous to Bonjour to match OS/X renaming. 2005-05-15 00:26:19 +00:00
Tom Lane
fabef3044a Minor refactoring to eliminate duplicate code and make startup a
tad faster.
2005-05-14 21:29:23 +00:00
Tom Lane
184e7a73a5 Revise nodeMergejoin in light of example provided by Guillaume Smet.
When one side of the join has a NULL, we don't want to uselessly try
to match it against every remaining tuple of the other side.  While
at it, rewrite the comparison machinery to avoid multiple evaluations
of the left and right input expressions and to use a btree comparator
where available, instead of double operator calls.  Also revise the
state machine to eliminate redundant comparisons and hopefully make it
more readable too.
2005-05-13 21:20:16 +00:00
Neil Conway
3140437495 This patch refactors away some duplicated code in the index AM build
methods: they all invoke UpdateStats() since they have computed the
number of heap tuples, so I created a function in catalog/index.c that
each AM now calls.
2005-05-11 06:24:55 +00:00
Neil Conway
48f8eadffb This patch reduces the size of the message header used by statistics
collector messages, per recent discussion on pgsql-patches. This
actually required quite a few changes -- for example,
"databaseid != InvalidOid" was used to check whether a slot in the
backend entry table was initialized, but that no longer works since
the slot might be initialized prior to receiving the BESTART message
which contains the database id. We now use procpid > 0 to indicate
that a slot is non-empty.

Other changes:

- various comment improvements and cleanups
- there's no need to zero-out the entire activity buffer in
  pgstat_add_backend(), we can just set activity[0] to '\0'.
- remove the counting of the # of connections to a database; this
  was not used anywhere

One change in behavior I wasn't sure about: previously, the code
would create a hash table entry for a database as soon as any message
was received whose header referenced that database. Now, we only
create hash table entries as needed (so for example BESTART won't
create a database hash table entry, since it doesn't need to
access anything in the per-db hash table). It would be easy enough
to retain the old behavior, but AFAICS it is not required.
2005-05-11 01:41:41 +00:00
Neil Conway
f38e413b20 Code cleanup: in C89, there is no point casting the first argument to
memset() or MemSet() to a char *. For one, memset()'s first argument is
a void *, and further void * can be implicitly coerced to/from any other
pointer type.
2005-05-11 01:26:02 +00:00
Bruce Momjian
35e1651508 Back out check for unreferenced files.
Heikki Linnakangas
2005-05-10 22:27:30 +00:00
Bruce Momjian
a4dde3bff3 Report index name on CLUSTER failure. Also, suggest ALTER TABLE
WITHOUT CLUSTER for cluster failure of a single table in a full db
cluster.
2005-05-10 13:16:26 +00:00
Neil Conway
4744c1a0a1 Complete the following TODO items:
* Add session start time to pg_stat_activity
* Add the client IP address and port to pg_stat_activity

Original patch from Magnus Hagander, code review by Neil Conway. Catalog
version bumped. This patch sends the client IP address and port number in
every statistics message; that's not ideal, but will be fixed up shortly.
2005-05-09 11:31:34 +00:00
Tom Lane
278bd0cc22 For some reason access/tupmacs.h has been #including utils/memutils.h,
which is neither needed by nor related to that header.  Remove the bogus
inclusion and instead include the header in those C files that actually
need it.  Also fix unnecessary inclusions and bad inclusion order in
tsearch2 files.
2005-05-06 17:24:55 +00:00
Tom Lane
db70a31294 Adjust nodeBitmapIndexscan to keep the target index opened from plan
startup to end, rather than re-opening it in each MultiExecBitmapIndexScan
call.  I had foolishly thought that opening/closing wouldn't be much
more expensive than a rescan call, but that was sheer brain fade.

This seems to fix about half of the performance lossage reported by
Sergey Koposov.  I'm still not sure where the other half went.
2005-05-05 03:37:23 +00:00
Tom Lane
126eaef651 Clean up MultiXactIdExpand's API by separating out the case where we
are creating a new MultiXactId from two regular XIDs.  The original
coding was unnecessarily complicated and didn't save any code anyway.
2005-05-03 19:42:41 +00:00
Bruce Momjian
76668e6eb4 Check the file system on postmaster startup and report any unreferenced
files in the server log.

Heikki Linnakangas
2005-05-02 18:26:54 +00:00
Neil Conway
f478856c7f Change SPI functions to use a `long' when specifying the number of tuples
to produce when running the executor. This is consistent with the internal
executor APIs (such as ExecutorRun), which also use a long for this purpose.
It also allows FETCH_ALL to be passed -- since FETCH_ALL is defined as
LONG_MAX, this wouldn't have worked on platforms where int and long are of
different sizes. Per report from Tzahi Fadida.
2005-05-02 00:37:07 +00:00
Tom Lane
6c412f0605 Change CREATE TYPE to require datatype output and send functions to have
only one argument.  (Per recent discussion, the option to accept multiple
arguments is pretty useless for user-defined types, and would be a likely
source of security holes if it was used.)  Simplify call sites of
output/send functions to not bother passing more than one argument.
2005-05-01 18:56:19 +00:00
Tom Lane
7f8d2fe31c Change catalog entries for record_out and record_send to show only one
argument, since that's all they are using now.  Adjust type_sanity
regression test so that it will complain if anyone tries to define
multiple-argument output functions in future.
2005-04-30 20:31:39 +00:00
Tom Lane
93b2477278 Use the standard lock manager to establish priority order when there
is contention for a tuple-level lock.  This solves the problem of a
would-be exclusive locker being starved out by an indefinite succession
of share-lockers.  Per recent discussion with Alvaro.
2005-04-30 19:03:33 +00:00
Tom Lane
3a694bb0a1 Restructure LOCKTAG as per discussions of a couple months ago.
Essentially, we shoehorn in a lockable-object-type field by taking
a byte away from the lockmethodid, which can surely fit in one byte
instead of two.  This allows less artificial definitions of all the
other fields of LOCKTAG; we can get rid of the special pg_xactlock
pseudo-relation, and also support locks on individual tuples and
general database objects (including shared objects).  None of those
possibilities are actually exploited just yet, however.

I removed pg_xactlock from pg_class, but did not force initdb for
that change.  At this point, relkind 's' (SPECIAL) is unused and
could be removed entirely.
2005-04-29 22:28:24 +00:00
Tom Lane
bedb78d386 Implement sharable row-level locks, and use them for foreign key references
to eliminate unnecessary deadlocks.  This commit adds SELECT ... FOR SHARE
paralleling SELECT ... FOR UPDATE.  The implementation uses a new SLRU
data structure (managed much like pg_subtrans) to represent multiple-
transaction-ID sets.  When more than one transaction is holding a shared
lock on a particular row, we create a MultiXactId representing that set
of transactions and store its ID in the row's XMAX.  This scheme allows
an effectively unlimited number of row locks, just as we did before,
while not costing any extra overhead except when a shared lock actually
has to be shared.   Still TODO: use the regular lock manager to control
the grant order when multiple backends are waiting for a row lock.

Alvaro Herrera and Tom Lane.
2005-04-28 21:47:18 +00:00
Tom Lane
79a1b00226 Replace slightly klugy create_bitmap_restriction() function with a
more efficient routine in restrictinfo.c (which can make use of
make_restrictinfo_internal).
2005-04-25 02:14:48 +00:00
Tom Lane
5b05185262 Remove support for OR'd indexscans internal to a single IndexScan plan
node, as this behavior is now better done as a bitmap OR indexscan.
This allows considerable simplification in nodeIndexscan.c itself as
well as several planner modules concerned with indexscan plan generation.
Also we can improve the sharing of code between regular and bitmap
indexscans, since they are now working with nigh-identical Plan nodes.
2005-04-25 01:30:14 +00:00
Tom Lane
186655e9a5 Adjust nodeBitmapIndexscan.c to not keep the index open across calls,
but just to open and close it during MultiExecBitmapIndexScan.  This
avoids acquiring duplicate resources (eg, multiple locks on the same
relation) in a tree with many bitmap scans.  Also, don't bother to
lock the parent heap at all here, since we must be underneath a
BitmapHeapScan node that will be holding a suitable lock.
2005-04-24 18:16:38 +00:00
Tom Lane
bc843d3960 First cut at planner support for bitmap index scans. Lots to do yet,
but the code is basically working.  Along the way, rewrite the entire
approach to processing OR index conditions, and make it work in join
cases for the first time ever.  orindxpath.c is now basically obsolete,
but I left it in for the time being to allow easy comparison testing
against the old implementation.
2005-04-22 21:58:32 +00:00
Tom Lane
14c7fba3f7 Rethink original decision to use AND/OR Expr nodes to represent bitmap
logic operations during planning.  Seems cleaner to create two new Path
node types, instead --- this avoids duplication of cost-estimation code.
Also, create an enable_bitmapscan GUC parameter to control use of bitmap
plans.
2005-04-21 19:18:13 +00:00
Tom Lane
e6f7edb9d5 Install some slightly realistic cost estimation for bitmap index scans. 2005-04-21 02:28:02 +00:00
Tom Lane
9d64632144 Minor performance improvement: avoid unnecessary creation/unioning of
bitmaps for multiple indexscans.  Instead just let each indexscan add
TIDs directly into the BitmapOr node's result bitmap.
2005-04-20 15:48:36 +00:00
Tom Lane
4a8c5d0375 Create executor and planner-backend support for decoupled heap and index
scans, using in-memory tuple ID bitmaps as the intermediary.  The planner
frontend (path creation and cost estimation) is not there yet, so none
of this code can be executed.  I have tested it using some hacked planner
code that is far too ugly to see the light of day, however.  Committing
now so that the bulk of the infrastructure changes go in before the tree
drifts under me.
2005-04-19 22:35:18 +00:00
Bruce Momjian
aa8bdab272 Attached patch gets rid of the global timezone in the following steps:
* Changes the APIs to the timezone functions to take a pg_tz pointer as
an argument, representing the timezone to use for the selected
operation.

* Adds a global_timezone variable that represents the current timezone
in the backend as set by SET TIMEZONE (or guc, or env, etc).

* Implements a hash-table cache of loaded tables, so we don't have to
read and parse the TZ file everytime we change a timezone. While not
necesasry now (we don't change timezones very often), I beleive this
will be necessary (or at least good) when "multiple timezones in the
same query" is eventually implemented. And code-wise, this was the time
to do it.


There are no user-visible changes at this time. Implementing the
"multiple zones in one query" is a later step...

This also gets rid of some of the cruft needed to "back out a timezone
change", since we previously couldn't check a timezone unless it was
activated first.

Passes regression tests on win32, linux (slackware 10) and solaris x86.

Magnus Hagander
2005-04-19 03:13:59 +00:00
Tom Lane
db30652135 Initial implementation of lossy-tuple-bitmap data structures.
Not connected to anything useful yet ...
2005-04-17 22:24:02 +00:00
Tom Lane
d8b1bf4791 Create a new 'MultiExecProcNode' call API for plan nodes that don't
return just a single tuple at a time.  Currently the only such node
type is Hash, but I expect we will soon have indexscans that can return
tuple bitmaps.  A side benefit is that EXPLAIN ANALYZE now shows the
correct tuple count for a Hash node.
2005-04-16 20:07:35 +00:00
Tom Lane
055467d504 Marginal hack to use a specialized hash function for dynahash hashtables
whose keys are OIDs.  The only one that looks particularly performance
critical is the relcache hashtable, but as long as we've got the function
we may as well use it wherever it's applicable.
2005-04-14 20:32:43 +00:00