be supported for all datatypes. Add CREATE AGGREGATE and pg_dump support
too. Add specialized min/max aggregates for bpchar, instead of depending
on text's min/max, because otherwise the possible use of bpchar indexes
cannot be recognized.
initdb forced because of catalog changes.
into indexscans on matching indexes. For the moment, it only handles
int4 and text datatypes; next step is to add a column to pg_aggregate
so that all MIN/MAX aggregates can be handled. Per my recent proposal.
deferred triggers: either one can create more work for the other,
so we have to loop till it's all gone. Per example from andrew@supernews.
Add a regression test to help spot trouble in this area in future.
while completing execution of the cursor's query. Otherwise we get wrong
answers or even crashes from non-volatile functions called by the query.
Per report from andrew@supernews.
decides whether to use hashed grouping instead of sort-plus-uniq
grouping. The function needs an annoyingly large number of parameters,
but this still seems like a win for legibility, since it removes over
a hundred lines from grouping_planner (which is still too big :-().
the long-term plan for this behavior for quite some time, but it is only
possible now that DELETE has a USING clause so that the user can join
other tables in a DELETE statement without relying on this behavior.
in UPDATE. We also now issue a NOTICE if a query has _any_ implicit
range table entries -- in the past, we would only warn about implicit
RTEs in SELECTs with at least one explicit RTE.
As a result of the warning change, 25 of the regression tests had to
be updated. I also took the opportunity to remove some bogus whitespace
differences between some of the float4 and float8 variants. I believe
I have correctly updated all the platform-specific variants, but let
me know if that's not the case.
Original patch for DELETE ... USING from Euler Taveira de Oliveira,
reworked by Neil Conway.
functions. This patch optimizes int2_sum(), int4_sum(), float4_accum()
and float8_accum() to avoid needing to copy the transition function's
state for each input tuple of the aggregate. In an extreme case
(e.g. SELECT sum(int2_col) FROM table where table has a single column),
it improves performance by about 20%. For more complex queries or tables
with wider rows, the relative performance improvement will not be as
significant.
ExecProcNode() with a NULL value, so the test couldn't do anything
for us except maybe mask bugs. Removing it probably doesn't save
anything much either, but then again this is a hot-spot routine.
few palloc's. I also chose to eliminate the restype and restypmod fields
entirely, since they are redundant with information stored in the node's
contained expression; re-examining the expression at need seems simpler
and more reliable than trying to keep restype/restypmod up to date.
initdb forced due to change in contents of stored rules.
performance hack Tom introduced recently. This means we can avoid
copying the transition array for each input tuple if these functions
are invoked as aggregate transition functions.
To test the performance improvement, I created a 1 million row table
with a single int4 column. Without the patch, SELECT avg(col) FROM
table took about 4.2 seconds (after the data was cached); with the
patch, it took about 3.2 seconds. Naturally, the performance
improvement for a less trivial query (or a table with wider rows)
would be relatively smaller.
cases with binary-compatible relabeling. My first try was implicitly
assuming that all operators scalarineqsel is used for have binary-
compatible datatypes on both sides ... which is very wrong of course.
Per report from Michael Fuhr.
exit. Without this, operations triggered during backend exit (such as
temp table deletions) won't be counted ... which given heavy usage of
temp tables can lead to pg_autovacuum falling way behind on the need
to vacuum pg_class and pg_attribute. Per reports from Steve Crawford
and others.
old comment in the code claimed that this was necessary. Since it is not
actually necessary any more, it is clearer to remove the comment and
just return NULL instead -- the return value of ExecHash() is not used.
proposal for OUT parameter support. The columns don't actually *do*
anything yet, they are just left NULLs. But I thought I'd commit this
part separately as a fairly pure example of the tasks needed when adding
a column to pg_proc or one of the other core system tables.
implement any new feature, it just pushes the 'not implemented' error
message deeper into the backend. I also tweaked the grammar to accept
Oracle-ish parameter syntax (parameter name first), as well as the
SQL99 standard syntax (parameter mode first), since it was easy and
people will doubtless try to use both anyway.
change saves a great deal of space in pg_proc and its primary index,
and it eliminates the former requirement that INDEX_MAX_KEYS and
FUNC_MAX_ARGS have the same value. INDEX_MAX_KEYS is still embedded
in the on-disk representation (because it affects index tuple header
size), but FUNC_MAX_ARGS is not. I believe it would now be possible
to increase FUNC_MAX_ARGS at little cost, but haven't experimented yet.
There are still a lot of vestigial references to FUNC_MAX_ARGS, which
I will clean up in a separate pass. However, getting rid of it
altogether would require changing the FunctionCallInfoData struct,
and I'm not sure I want to buy into that.
transaction rollback via UNDO but I think that's highly unlikely to
happen, so we may as well remove the stubs. (Someday we ought to
rip out the stub xxx_undo routines, too.) Per Alvaro.
really ought to run before canonicalize_qual, because it can now produce
forms that canonicalize_qual knows how to improve (eg, NOT clauses).
Also, because eval_const_expressions already knows about flattening
nested ANDs and ORs into N-argument form, the initial flatten_andors
pass in canonicalize_qual is now completely redundant and can be
removed. This doesn't save a whole lot of code, but the time and
palloc traffic eliminated is a useful gain on large expression trees.
access: define new index access method functions 'amgetmulti' that can
fetch multiple TIDs per call. (The functions exist but are totally
untested as yet.) Since I was modifying pg_am anyway, remove the
no-longer-needed 'rel' parameter from amcostestimate functions, and
also remove the vestigial amowner column that was creating useless
work for Alvaro's shared-object-dependencies project.
Initdb forced due to changes in pg_am.
that is 'x = true' becomes 'x' and 'x = false' becomes 'NOT x'. This isn't
all that amazingly useful in itself, but it ensures that we will recognize
the different forms as being logically equivalent when checking partial
index predicates. Per example from Patrick Clery.
structs. There are many places in the planner where we were passing
both a rel and an index to subroutines, and now need only pass the
index struct. Notationally simpler, and perhaps a tad faster.
for boolean indexes. Previously we would only use such an index with
WHERE clauses like 'indexkey = true' or 'indexkey = false'. The new
code transforms the cases 'indexkey', 'NOT indexkey', 'indexkey IS TRUE',
and 'indexkey IS FALSE' into one of these. While this is only marginally
useful in itself, I intend soon to change constant-expression simplification
so that 'foo = true' and 'foo = false' are reduced to just 'foo' and
'NOT foo' ... which would lose the ability to use boolean indexes for
such queries at all, if the indexscan machinery couldn't make the
reverse transformation.
binary-compatible relabeling of one or both operands. examine_variable
should avoid stripping RelabelType from non-variable expressions, so that
they will continue to have the correct type; and convert_to_scalar should
just use that type and ignore the other input type. This isn't perfect
but it beats failing entirely. Per example from Michael Fuhr.
when a zero-month interval is given. Per discussion with Karel.
Also, some desultory const-labeling of constant tables. More could be
done along that line.
actual number of unremoved tuples as pg_class.reltuples. The idea of
trying to estimate a steady state condition still seems attractive, but
this particular implementation crashed and burned ...
executing a statement that fires triggers. Formerly this time was
included in "Total runtime" but not otherwise accounted for.
As a side benefit, we avoid re-opening relations when firing non-deferred
AFTER triggers, because the trigger code can re-use the main executor's
ResultRelInfo data structure.
when open references remain during normal cleanup of a resource owner.
This restores the system's ability to warn about leaks to what it was
before 8.0. Not really a user-level bug, but helpful for development.
overly strong lock on pg_depend, and it wasn't closing the rel when done.
The latter bug was masked by the ResourceOwner code, which is something
that should be changed.
never-yet-vacuumed relation. This restores the pre-8.0 behavior of
avoiding seqscans during initial data loading, while still allowing
reasonable optimization after a table has been vacuumed. Several
regression test cases revert to 7.4-like behavior, which is probably
a good sign. Per gripes from Keith Browne and others.
* Touch the socket and lock file at least every hour, to
* ensure that they are not removed by overzealous /tmp-cleaning
* tasks. Set to 58 minutes so a cleaner never sees the
* file as an hour old.
currently does. This is now the default Win32 wal sync method because
we perfer o_datasync to fsync.
Also, change Win32 fsync to a new wal sync method called
fsync_writethrough because that is the behavior of _commit, which is
what is used for fsync on Win32.
Backpatch to 8.0.X.
ExclusiveLock rather than AccessExclusiveLock. This will allow concurrent
SELECT queries to proceed on the table. Per discussion with Andrew at
SuperNews.
explicit paths, so that the log can be replayed in a data directory
with a different absolute path than the original had. To avoid forcing
initdb in the 8.0 branch, continue to accept the old WAL log record
types; they will never again be generated however, and the code can be
dropped after the next forced initdb. Per report from Oleg Bartunov.
We still need to think about what it really means to WAL-log CREATE
TABLESPACE commands: we more or less have to put the absolute path
into those, but how to replay in a different context??
PageIndexTupleDelete() with a single pass of compactification ---
logic mostly lifted from PageRepairFragmentation. I noticed while
profiling that a VACUUM that's cleaning up a whole lot of deleted
tuples would spend as much as a third of its CPU time in
PageIndexTupleDelete; not too surprising considering the loop method
was roughly O(N^2) in the number of tuples involved.
up-to-speed logic; in particular this will cause it to quote names that
match keywords. Remove unnecessary multibyte cruft from quote_literal
(all backend-internal encodings are 8-bit-safe).
convention for isnull flags. Also, remove the useless InsertIndexResult
return struct from index AM aminsert calls --- there is no reason for
the caller to know where in the index the tuple was inserted, and we
were wasting a palloc cycle per insert to deliver this uninteresting
value (plus nontrivial complexity in some AMs).
I forced initdb because of the change in the signature of the aminsert
routines, even though nothing really looks at those pg_proc entries...
to write out data that we are about to tell the filesystem to drop.
smgr_internal_unlink already had a DropRelFileNodeBuffers call to
get rid of dead buffers without a write after it's no longer possible
to roll back the deleting transaction. Adding a similar call in
smgrtruncate simplifies callers and makes the overall division of
labor clearer. This patch removes the former behavior that VACUUM
would write all dirty buffers of a relation unconditionally.
is still alive. This improves our odds of not getting fooled by an
unrelated process when checking a stale lock file. Other checks already
in place, plus one newly added in checkDataDir(), ensure that we cannot
attempt to usurp the place of a postmaster belonging to a different userid,
so there is no need to error out. Add comments indicating the importance
of these other checks.
grouping_planner() to preprocess_targetlist(), according to a comment
in grouping_planner(). I think the refactoring makes sense, and moves
some extraneous details out of grouping_planner().
of tuples when passing data up through multiple plan nodes. A slot can now
hold either a normal "physical" HeapTuple, or a "virtual" tuple consisting
of Datum/isnull arrays. Upper plan levels can usually just copy the Datum
arrays, avoiding heap_formtuple() and possible subsequent nocachegetattr()
calls to extract the data again. This work extends Atsushi Ogawa's earlier
patch, which provided the key idea of adding Datum arrays to TupleTableSlots.
(I believe however that something like this was foreseen way back in Berkeley
days --- see the old comment on ExecProject.) A test case involving many
levels of join of fairly wide tables (about 80 columns altogether) showed
about 3x overall speedup, though simple queries will probably not be
helped very much.
I have also duplicated some code in heaptuple.c in order to provide versions
of heap_formtuple and friends that use "bool" arrays to indicate null
attributes, instead of the old convention of "char" arrays containing either
'n' or ' '. This provides a better match to the convention used by
ExecEvalExpr. While I have not made a concerted effort to get rid of uses
of the old routines, I think they should be deprecated and eventually removed.
locale is C.
Backpatch to 8.0.X because some operating systems were throwing errors
for such operations, rather than ignoring the locale when it was C.
a tuple are being accessed via ExecEvalVar and the attcacheoff shortcut
isn't usable (due to nulls and/or varlena columns). To do this, cache
Datums extracted from a tuple in the associated TupleTableSlot.
Also some code cleanup in and around the TupleTable handling.
Atsushi Ogawa with some kibitzing by Tom Lane.
whether or not it is a security definer. Changing a function's strictness
is required by SQL2003, and the other capabilities make sense. Also, allow
an optional RESTRICT noise word to be specified, for SQL conformance.
Some trivial regression tests added and the documentation has been
updated.
database's datallowconn and datfrozenxid to the current transaction ID
instead of copying the source database's values. This is OK because we
assume the source DB contains no normal transaction IDs whatsoever.
This keeps VACUUM from immediately starting to complain about unvacuumed
databases in the situation where we are more than 2 billion transactions
out from the XID stamp of template0. Per discussion with Milen Radev
(although his complaint turned out to be due to something else, but the
problem is real anyway).
can tell whether it is being used as an aggregate or not. This allows
such a function to avoid re-pallocing a pass-by-reference transition
value; normally it would be unsafe for a function to scribble on an input,
but in the aggregate case it's safe to reuse the old transition value.
Make int8inc() do this. This gets a useful improvement in the speed of
COUNT(*), at least on narrow tables (it seems to be swamped by I/O when
the table rows are wide). Per a discussion in early December with
Neil Conway. I also fixed int_aggregate.c to check this, thereby
turning it into something approaching a supportable technique instead
of being a crude hack.
elog if the former has trouble writing its file. Code review for
Magnus' patch to redirect stderr to syslog on Windows (Bruce's version
seems right, but did some minor prettification).
Backpatch both changes to 8.0 branch.
Document use of macros for pg_printf functions.
Bump major versions of all interfaces to handle movement of get_progname
from libpq to libpgport in 8.0, and probably other libpgport changes in 8.1.
Formerly, if such a clause contained no aggregate functions we mistakenly
treated it as equivalent to WHERE. Per spec it must cause the query to
be treated as a grouped query of a single group, the same as appearance
of aggregate functions would do. Also, the HAVING filter must execute
after aggregate function computation even if it itself contains no
aggregate functions.
before we can invoke fork() -- flush stdio buffers, save and restore the
profiling timer on Linux with LINUX_PROFILE, and handle BeOS stuff. This
patch moves that code into a single function, fork_process(), instead of
duplicating it at the various callsites of fork().
This patch doesn't address the EXEC_BACKEND case; there is room for
further cleanup there.
number of palloc calls. This has a salutory impact on plpgsql operations
with record variables (which create and destroy tupdescs constantly)
and probably helps a bit in some other cases too.
Too much space is allocated for tablespace file path, I guess the
directory name used to be "pg_tablespaces" instead of "pg_tblspc" at
some point.
Heikki Linnakangas