the timestamp types. Turns out this doesn't even reduce the available
range of dates, since the restriction to dates that work for Julian-date
arithmetic is much tighter than the int32 range anyway. Per a longstanding
TODO item.
depth-first search order. Upon close reading of SQL:2008, it seems that the
spec's SEARCH DEPTH FIRST and SEARCH BREADTH FIRST options do not actually
guarantee any particular result order: what they do is provide a constructed
column that the user can then sort on in the outer query. So this is actually
just as much functionality ...
pseudo-type record[] to represent arrays of possibly-anonymous composite
types. Since composite datums carry their own type identification, no
extra knowledge is needed at the array level.
The main reason for doing this right now is that it is necessary to support
the general case of detection of cycles in recursive queries: if you need to
compare more than one column to detect a cycle, you need to compare a ROW()
to an array built from ROW()s, at least if you want to do it as the spec
suggests. Add some documentation and regression tests concerning the cycle
detection issue.
RecursiveUnion to which it refers. It turns out that we can just postpone the
relevant initialization steps until the first exec call for the node, by which
time the ancestor node must surely be initialized. Per report from Greg Stark.
the isTrigger state explicitly, not rely on nonzero-ness of trigrelOid
to indicate trigger-hood, because trigrelOid will be left zero when compiling
for validation. The (useless) function hash entry built by the validator
was able to match an ordinary non-trigger call later in the same session,
thereby bypassing the check that is supposed to prevent such a call.
Per report from Alvaro.
It might be worth suppressing the useless hash entry altogether, but
that's a bigger change than I want to consider back-patching.
Back-patch to 8.0. 7.4 doesn't have the problem because it doesn't
have validation mode.
to process any pending unlinks for the source database.
Before, if you dropped a relation in the template database just before
CREATE DATABASE, and a checkpoint happened during copydir(), the checkpoint
might delete a file that we're just about to copy, causing lstat() in
copydir() to fail with ENOENT.
Backpatch to 8.3, where the pending unlinks were introduced.
Per report by Matthew Wakeling and analysis by Tom Lane.
of referencing a WITH item that's not yet in scope according to the SQL
spec's semantics. This seems to be an easy error to make, and the bare
"relation doesn't exist" message doesn't lead one's mind in the correct
direction to fix it.
implementation uses an in-memory hash table, so it will poop out for very
large recursive results ... but the performance characteristics of a
sort-based implementation would be pretty unpleasant too.
use the old relfilenode in the new tablespace. There might be another relation
in the new tablespace with the same relfilenode, so we must generate a fresh
relfilenode in the new tablespace.
The 8.3 patch to let deleted relation files linger as zero-length files until
the next checkpoint made this more obvious: moving a relation from one table
space another, and then back again, caused a collision with the lingering
file.
Back-patch to 8.1. The issue is present in 8.0 as well, but it doesn't seem
worth fixing there, because we didn't have protection from OID collisions
after OID wraparound before 8.1.
Report by Guillaume Lelarge.
supplies an expression that can't be coerced to the target column type.
The code previously attempted to point at the target column name, which
doesn't work at all in an INSERT with omitted column name list, and is
also not remarkably helpful when the problem is buried somewhere in a
long INSERT-multi-VALUES command. Make it point at the failed expression
instead.
get_name_for_var_field didn't have enough context to interpret a reference to
a CTE query's output. Fixing this requires separate hacks for the regular
deparse case (pg_get_ruledef) and for the EXPLAIN case, since the available
context information is quite different. It's pretty nearly parallel to the
existing code for SUBQUERY RTEs, though. Also, add code to make sure we
qualify a relation name that matches a CTE name; else the CTE will mistakenly
capture the reference when reloading the rule.
In passing, fix a pre-existing problem with get_name_for_var_field not working
on variables in targetlists of SubqueryScan plan nodes. Although latent all
along, this wasn't a problem until we made EXPLAIN VERBOSE try to print
targetlists. To do this, refactor the deparse_context_for_plan API so that
the special case for SubqueryScan is all on ruleutils.c's side.
the column alias names of the RTE referenced by the Var to the RowExpr.
This is needed to allow ruleutils.c to correctly deparse FieldSelect nodes
referencing such a construct. Per my recent bug report.
Adding a field to RowExpr forces initdb (because of stored rules changes)
so this solution is not back-patchable; which is unfortunate because 8.2
and 8.3 have this issue. But it only affects EXPLAIN for some pretty odd
corner cases, so we can probably live without a solution for the back
branches.
relation forks. While the file names are not visible to users, for those
that do peek into the data directory, it's nice to have more descriptive
names. Per Greg Stark's suggestion.
well as regular tables. Per discussion, this seems necessary to meet the
principle of least astonishment.
In passing, simplify the error messages in warnAutoRange(). Now that we
have parser error position info for these errors, it doesn't seem very
useful to word the error message differently depending on whether we are
inside a sub-select or not.
machine produces zero (rather than the more usual minimum-possible-integer)
for the only possible overflow case. This has been seen to occur for at least
some word widths on some hardware, and it's cheap enough to check for
everywhere. Per Peter's analysis of buildfarm reports.
This could be back-patched, but in the absence of any gripes from the field
I doubt it's worth the trouble.
recursive CTE that we're still in progress of analyzing. Add a similar guard
to the similar code in expandRecordVariable(), and tweak regression tests to
cover this case. Per report from Dickson S. Guedes.
There are some unimplemented aspects: recursive queries must use UNION ALL
(should allow UNION too), and we don't have SEARCH or CYCLE clauses.
These might or might not get done for 8.4, but even without them it's a
pretty useful feature.
There are also a couple of small loose ends and definitional quibbles,
which I'll send a memo about to pgsql-hackers shortly. But let's land
the patch now so we can get on with other development.
Yoshiyuki Asaba, with lots of help from Tatsuo Ishii and Tom Lane
name of a fork ('main' or 'fsm', at the moment) to pg_relation_size() to
get the size of a specific fork. Defaults to 'main', if none given.
While we're at it, modify pg_relation_size to take a regclass as argument,
instead of separate variants taking oid and name. This change is
transparent to typical use where the table name is passed as a string
literal, like pg_relation_size('table'), but will break queries like
pg_relation_size(namecol), where namecol is of type name. text-type input
still works, and using a non-schema-qualified table name is not very
reliable anyway, so this is unlikely to break anyone's queries in practice.
This facility replaces the former mark/restore support but is otherwise
upward-compatible with previous uses. It's expected to be needed for
single evaluation of CTEs and also for window functions, so I'm committing
it separately instead of waiting for either one of those patches to be
finished. Per discussion with Greg Stark and Hitoshi Harada.
Note: I removed nodeFunctionscan's mark/restore support, instead of bothering
to update it for this change, because it was dead code anyway.
free space information is stored in a dedicated FSM relation fork, with each
relation (except for hash indexes; they don't use FSM).
This eliminates the max_fsm_relations and max_fsm_pages GUC options; remove any
trace of them from the backend, initdb, and documentation.
Rewrite contrib/pg_freespacemap to match the new FSM implementation. Also
introduce a new variant of the get_raw_page(regclass, int4, int4) function in
contrib/pageinspect that let's you to return pages from any relation fork, and
a new fsm_page_contents() function to inspect the new FSM pages.
applied to expression indexes, not to plain relations. The original coding
in btcostestimate conflated the two cases, but it's not hard to use
get_relation_stats_hook instead when we're looking to the underlying relation.
(ie, has nothing to quote), rather than silently ignoring the character as has
been our historical behavior. This is required by SQL spec and should help
reduce the sort of user confusion seen in bug #4436. Per discussion.
This is not so much a bug fix as a definitional change, and it could break
existing applications; so not back-patched. It might deserve being mentioned
as an incompatibility in the 8.4 release notes.
element types. Since the backend doesn't actually pay attention to the array
type's delimiter, this has no functional effect, but it seems better for the
catalog entries to be consistent. Per gripe from Greg Mullane and subsequent
discussion.
a SubLink expression into a rule query. We missed cases where the original
query contained a sub-SELECT in a function in FROM, a multi-row VALUES list,
or a RETURNING list. Per bug #4434 from Dean Rasheed and subsequent
investigation.
Back-patch to 8.1; older releases don't have the issue because they didn't
try to be smart about setting hasSubLinks only when needed.
ctype are now more like encoding, stored in new datcollate and datctype
columns in pg_database.
This is a stripped-down version of Radek Strnad's patch, with further
changes by me.
that presence of the password in the conninfo string must be checked *before*
risking a connection attempt, there is no point in checking it afterwards.
This makes the specification of PQconnectionUsedPassword() a bit simpler
and perhaps more generally useful, too.
conninfo string *before* trying to connect to the remote server, not after.
As pointed out by Marko Kreen, in certain not-very-plausible situations
this could result in sending a password from the postgres user's .pgpass file,
or other places that non-superusers shouldn't have access to, to an
untrustworthy remote server. The cleanest fix seems to be to expose libpq's
conninfo-string-parsing code so that dblink can check for a password option
without duplicating the parsing logic.
Joe Conway, with a little cleanup by Tom Lane
instead of listing all the columns returned by the underlying function.
initdb not forced since this patch doesn't actually change anything about
the stored form of the views. It just means there's one less place to change
if someone wants to add columns to them.
sequence of operations that libpq goes through while creating a PGresult.
Also, remove ill-considered "const" decoration on parameters passed to
event procedures.
guarantees about whether event procedures will receive DESTROY events.
They no longer need to defend themselves against getting a DESTROY
without a successful prior CREATE.
Andrew Chernow
bison output. Without these, make can sometimes be tempted to invoke its
built-in rules using lex and yacc, which can fail if those commands are not
available.
This was a main cause for the NLS web site breakage.
there are FD_XACT_TEMPORARY files to clean up at transaction end.
Per performance profiling results on AWeber's huge systems.
Patch by me after an idea suggested by Simon Riggs.
interpreted as expected (the sign should affect months too), and get rid of
hard-wired assumption that unmarked signed values must be hours (if integers)
or seconds (if floats). The former was just a bug in my previous patch,
while the latter may have made sense at one time but seems illogical now
that we support determination of the units from typmod information.
Ron Mayer and myself.
forestalls potential overflow when the same table (or other object, but
usually tables) is accessed by very many successive queries within a single
transaction. Per report from Michael Milligan.
Back-patch to 8.0, which is as far back as the patch conveniently applies.
There have been no reports of overflow in pre-8.3 releases, but clearly the
risk existed all along. (Michael's report suggests that 8.3 may consume lock
counts faster than prior releases, but with no test case to look at it's hard
to be sure about that. Widening the counts seems a good future-proofing
measure in any event.)
we regenerate the SQL query text not merely the plan derived from it. This
is needed to handle contingencies such as renaming of a table or column
used in an FK. Pre-8.3, such cases worked despite the lack of replanning
(because the cached plan needn't actually change), so this is a regression.
Per bug #4417 from Benjamin Bihler.
value. This means that hash index lookups are always lossy and have to be
rechecked when the heap is visited; however, the gain in index compactness
outweighs this when the indexed values are wide. Also, we only need to
perform datatype comparisons when the hash codes match exactly, rather than
for every entry in the hash bucket; so it could also win for datatypes that
have expensive comparison functions. A small additional win is gained by
keeping hash index pages sorted by hash code and using binary search to reduce
the number of index tuples we have to look at.
Xiao Meng
This commit also incorporates Zdenek Kotala's patch to isolate hash metapages
and hash bitmaps a bit better from the page header datastructures.
each connection. This makes it possible to catch errors in the pg_hba
file when it's being reloaded, instead of silently reloading a broken
file and failing only when a user tries to connect.
This patch also makes the "sameuser" argument to ident authentication
optional.
btree. We can't easily tell whether clauses generated from the equivalence
class could be used with such an index, so just assume that they might be.
This bit of over-optimization prevented use of non-btree indexes for nestloop
inner indexscans, in any case where the join uses an equality operator that
is also a btree operator --- which in particular is typically true for hash
indexes. Noted while trying to test the current hash index patch.
and the literal syntax INTERVAL 'string' ... SECOND(n), as required by the
SQL standard. Our old syntax put (n) directly after INTERVAL, which was
a mistake, but will still be accepted for backward compatibility as well
as symmetry with the TIMESTAMP cases.
Change intervaltypmodout to show it in the spec's way, too. (This could
potentially affect clients, if there are any that analyze the typmod of an
INTERVAL in any detail.)
Also fix interval input to handle 'min:sec.frac' properly; I had overlooked
this case in my previous patch.
Document the use of the interval fields qualifier, which up to now we had
never mentioned in the docs. (I think the omission was intentional because
it didn't work per spec; but it does now, or at least close enough to be
credible.)
GetOldestXmin() instead of RecentGlobalXmin; this is safer because we do not
depend on the latter being correctly set elsewhere, and while it is more
expensive, this code path is not performance-critical. This is a real
risk for autovacuum, because it can execute whole cycles without doing
a single vacuum, which would mean that RecentGlobalXmin would stay at its
initialization value, FirstNormalTransactionId, causing a bogus value to be
inserted in pg_database. This bug could explain some recent reports of
failure to truncate pg_clog.
At the same time, change the initialization of RecentGlobalXmin to
InvalidTransactionId, and ensure that it's set to something else whenever
it's going to be used. Using it as FirstNormalTransactionId in HOT page
pruning could incur in data loss. InitPostgres takes care of setting it
to a valid value, but the extra checks are there to prevent "special"
backends from behaving in unusual ways.
Per Tom Lane's detailed problem dissection in 29544.1221061979@sss.pgh.pa.us
a lot closer than it was before). To do this, tweak coerce_type() to pass
through the typmod information when invoking interval_in() on an UNKNOWN
constant; then fix DecodeInterval to pay attention to the typmod when deciding
how to interpret a units-less integer value. I changed one or two other
details as well. I believe the code now reacts as expected by spec for all
the literal syntaxes that are specifically enumerated in the spec. There
are corner cases involving strings that don't exactly match the set of fields
called out by the typmod, for which we might want to tweak the behavior some
more; but I think this is an area of user friendliness rather than spec
compliance. There remain some non-compliant details about the SQL syntax
(as opposed to what's inside the literal string); but at least we'll throw
error rather than silently doing the wrong thing in those cases.
when user-defined functions used in a plan are modified. Also invalidate
plans when schemas, operators, or operator classes are modified; but for these
cases we just invalidate everything rather than tracking exact dependencies,
since these types of objects seldom change in a production database.
Tom Lane; loosely based on a patch by Martin Pihlak.
for pg_stop_backup. First, it is possible that the history file name is not
alphabetically later than the last WAL file name, so we should explicitly
check that both have been archived. Second, the previous coding would wait
forever if a checkpoint had managed to remove the WAL file before we look for
it.
Simon Riggs, plus some code cleanup by me.
referenced tables are dumped before the referencing tables. This avoids
failures when the data is loaded with the FK constraints already active.
If no such ordering is possible because of circular or self-referential
constraints, print a NOTICE to warn the user about it.
searching instead of naive matching. In the worst case this has the same
O(M*N) complexity as the naive method, but the worst case is hard to hit,
and the average case is very fast, especially with longer patterns.
David Rowley
for editing if no function name is specified. This seems a much cleaner way
to offer that functionality than the original patch had. In passing,
de-clutter the error displays that are given for a bogus function-name
argument, and standardize on "$function$" as the default delimiter for the
function body. (The original coding would use the shortest possible
dollar-quote delimiter, which seems to create unnecessarily high risk of
later conflicts with the user-modified function body.)
In support of that, create a backend function pg_get_functiondef().
The psql command is functional but maybe a bit rough around the edges...
Abhijit Menon-Sen
inserting a materialize node above an inner-side sort node, when the sort is
expected to spill to disk. (The materialize protects the sort from having
to support mark/restore, allowing it to do its final merge pass on-the-fly.)
We neglected to teach cost_mergejoin about that hack, so it was failing to
include the materialize's costs in the estimated cost of the mergejoin.
The materialize's costs are generally going to be pretty negligible in
comparison to the sort's, so this is only a small error and probably not
worth back-patching; but it's still wrong.
In the similar case where a materialize is inserted to protect an inner-side
node that can't do mark/restore at all, it's still true that the materialize
should not spill to disk, and so we should cost it cheaply rather than
expensively.
Noted while thinking about a question from Tom Raney.
during parsing. Formerly the parser's stack was allocated with malloc
and so wouldn't be reclaimed; this patch makes it use palloc instead,
so that flushing the current context will reclaim the memory. Per
Marko Kreen.
whenever possible, as per bug report from Oleg Serov. While at it, reorder
the operations in the RECORD case to avoid possible palloc failure while the
variable update is only partly complete.
Back-patch as far as 8.1. Although the code of the particular function is
similar in 8.0, 8.0's support for composite fields in rows is sufficiently
broken elsewhere that it doesn't seem worth fixing this.
output that is seen when a checkpoint occurs at just the right time during
the test. Per my report of 2008-08-31.
This could be back-patched but I'm not sure it's worth the trouble.
command id is the cmin, when it can in fact be a combo cid. That made rows
incorrectly invisible to a transaction where a tuple was deleted by multiple
aborted subtransactions.
Report and patch Karl Schnaitter. Back-patch to 8.3, where combo cids was
introduced.
SELECT foo.*) so that it cannot be confused with a quoted identifier "*".
Instead create a separate node type A_Star to represent this notation.
Per pgsql-hackers discussion of 2007-Sep-27.
syntax to avoid a useless store into a global variable. Per experimentation,
this works better than my original thought of trying to push the code into
an out-of-line subroutine.
clear to me why I'd not seen this message before --- on F-9 it seems to
only happen if Asserts are disabled, which ought to be irrelevant.
Maybe that affects a decision whether to inline get_ten(), which would
be needed to expose the warning condition to the compiler? Anyway,
the fix is clear.
most node types used in expression trees (both before and after parse
analysis). This allows us to place an error cursor in many situations
where we formerly could not, because the information wasn't available
beyond the very first level of parse analysis. There's a fair amount
of work still to be done to persuade individual ereport() calls to actually
include an error location, but this gets the initdb-forcing part of the
work out of the way; and the situation is already markedly better than
before for complaints about unimplementable implicit casts, such as
CASE and UNION constructs with incompatible alternative data types.
Per my proposal of a few days ago.
when its input is constant and the element coercion function is immutable
(or nonexistent, ie, binary-coercible case). This is an oversight in the
8.3 implementation of ArrayCoerceExpr, and its result is that certain cases
involving IN or NOT IN with constants don't get optimized as they should be.
Per experimentation with an example from Ow Mun Heng.
into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside
the backend. There's probably more that should be done along this line,
but this is a start anyway.
checks in ExecIndexBuildScanKeys() that were inadequate anyway: it's better
to verify the correct varno on an expected index key, not just reject OUTER
and INNER.
This makes the entire current contents of nodeFuncs.c dead code. I'll be
replacing it with some other stuff later, as per recent proposal.
at once and ItemPointers are collected in memory.
Remove tuple's killing by killtuple() if tuple was moved to another
page - it could produce unaceptable overhead.
Backpatch up to 8.1 because the bug was introduced by GiST's concurrency support.
1. -i option should run vacuum analyze only on pgbench tables, not *all*
tables in database.
2. pre-run cleanup step was DELETE FROM HISTORY then VACUUM HISTORY.
This is just a slow version of TRUNCATE HISTORY.
Simon Riggs
subqueries into the same thing you'd have gotten from IN (except always with
unknownEqFalse = true, so as to get the proper semantics for an EXISTS).
I believe this fixes the last case within CVS HEAD in which an EXISTS could
give worse performance than an equivalent IN subquery.
The tricky part of this is that if the upper query probes the EXISTS for only
a few rows, the hashing implementation can actually be worse than the default,
and therefore we need to make a cost-based decision about which way to use.
But at the time when the planner generates plans for subqueries, it doesn't
really know how many times the subquery will be executed. The least invasive
solution seems to be to generate both plans and postpone the choice until
execution. Therefore, in a query that has been optimized this way, EXPLAIN
will show two subplans for the EXISTS, of which only one will actually get
executed.
There is a lot more that could be done based on this infrastructure: in
particular it's interesting to consider switching to the hash plan if we start
out using the non-hashed plan but find a lot more upper rows going by than we
expected. I have therefore left some minor inefficiencies in place, such as
initializing both subplans even though we will currently only use one.
to be used for SubLinks that are underneath a top-level OR clause. Just as at
the very top level of WHERE, it's not necessary to be accurate about whether
the sublink returns FALSE or NULL, because either result has the same impact
on whether the WHERE will succeed.
Per Microsoft knowledge base article Q201213, early versions of
Windows fail when we do this. Later versions of Windows appear
to have a higher limit than 64Kb, but do still fail on large
sends, so we unconditionally limit it for all versions.
Patch from Tom Lane.
debug_print_plan to appear at LOG message level, not DEBUG1 as historically.
Make debug_pretty_print default to on. Also, cause plans generated via
EXPLAIN to be subject to debug_print_plan. This is all to make
debug_print_plan a reasonably comfortable substitute for the former behavior
of EXPLAIN VERBOSE.
eval_const_expressions will generally throw away anything that's ANDed with
constant FALSE, what we're left with given an example like
select * from tenk1 a where (unique1,0) in (select unique2,1 from tenk1 b);
is a cartesian product computation, which is really not acceptable.
This is a regression in CVS HEAD compared to previous releases, which were
able to notice the impossible join condition in this case --- though not in
some related cases that are also improved by this patch, such as
select * from tenk1 a left join tenk1 b on (a.unique1=b.unique2 and 0=1);
Fix by skipping evaluation of the appropriate side of the outer join in
cases where it's demonstrably unnecessary.
that we're considering pulling up. I hadn't wanted to think through whether
that could work during the first pass at this stuff. However, on closer
inspection it seems to be safe enough.
level of a JOIN/ON clause, not only at top level of WHERE. (However, we
can't do this in an outer join's ON clause, unless the ANY/EXISTS refers
only to the nullable side of the outer join, so that it can effectively
be pushed down into the nullable side.) Per request from Kevin Grittner.
In passing, fix a bug in the initial implementation of EXISTS pullup:
it would Assert if the EXIST's WHERE clause used a join alias variable.
Since we haven't yet flattened join aliases when this transformation
happens, it's necessary to include join relids in the computed set of
RHS relids.
returns NULL instead of a PGresult. The former coding would fail, which
is OK, but it neglected to give you the PQerrorMessage that might tell
you why. In the oldest branches, there was another problem: it'd sometimes
report PQerrorMessage from the wrong connection.
and anti joins. To do this, pass the SpecialJoinInfo struct for the current
join as an additional optional argument to operator join selectivity
estimation functions. This allows the estimator to tell not only what kind
of join is being formed, but which variable is on which side of the join;
a requirement long recognized but not dealt with till now. This also leaves
the door open for future improvements in the estimators, such as accounting
for the null-insertion effects of lower outer joins. I didn't do anything
about that in the current patch but the information is in principle deducible
from what's passed.
The patch also clarifies the definition of join selectivity for semi/anti
joins: it's the fraction of the left input that has (at least one) match
in the right input. This allows getting rid of some very fuzzy thinking
that I had committed in the original 7.4-era IN-optimization patch.
There's probably room to estimate this better than the present patch does,
but at least we know what to estimate.
Since I had to touch CREATE OPERATOR anyway to allow a variant signature
for join estimator functions, I took the opportunity to add a couple of
additional checks that were missing, per my recent message to -hackers:
* Check that estimator functions return float8;
* Require execute permission at the time of CREATE OPERATOR on the
operator's function as well as the estimator functions;
* Require ownership of any pre-existing operator that's modified by
the command.
I also moved the lookup of the functions out of OperatorCreate() and
into operatorcmds.c, since that seemed more consistent with most of
the other catalog object creation processes, eg CREATE TYPE.
match in antijoin mode, we should advance to next outer tuple not next inner.
We know we don't want to return this outer tuple, and there is no point in
advancing over matching inner tuples now, because we'd just have to do it
again if the next outer tuple has the same merge key. This makes a noticeable
difference if there are lots of duplicate keys in both inputs.
Similarly, after finding a match in semijoin mode, arrange to advance to
the next outer tuple after returning the current match; or immediately,
if it fails the extra quals. The rationale is the same. (This is a
performance bug in existing releases; perhaps worth back-patching? The
planner tries to avoid using mergejoin with lots of duplicates, so it may
not be a big issue in practice.)
Nestloop and hash got this right to start with, but I made some cosmetic
adjustments there to make the corresponding bits of logic look more similar.
variable stats_temp_directory, instead of requiring the admin to
mount/symlink the pg_stat_tmp directory manually.
For now the config variable is PGC_POSTMASTER. Room for further improvment
that would allow it to be changed on-the-fly.
parent, not only those with RangeTblRefs. We need them in ExecCheckRTPerms.
Report by Brendan O'Shea. Back-patch to 8.2, where pull_up_simple_union_all
was introduced.
the old JOIN_IN code, but antijoins are new functionality.) Teach the planner
to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti
joins respectively. Also, LEFT JOINs with suitable upper-level IS NULL
filters are recognized as being anti joins. Unify the InClauseInfo and
OuterJoinInfo infrastructure into "SpecialJoinInfo". With that change,
it becomes possible to associate a SpecialJoinInfo with every join attempt,
which permits some cleanup of join selectivity estimation. That needs to be
taken much further than this patch does, but the next step is to change the
API for oprjoin selectivity functions, which seems like material for a
separate patch. So for the moment the output size estimates for semi and
especially anti joins are quite bogus.
main tables.
This requires vacuum() to accept processing a toast table standalone, so
there's a user-visible change in that it's now possible (for a superuser) to
execute "VACUUM pg_toast.pg_toast_XXX".
of multiple forks, and each fork can be created and grown separately.
The bulk of this patch is about changing the smgr API to include an extra
ForkNumber argument in every smgr function. Also, smgrscheduleunlink and
smgrdounlink no longer implicitly call smgrclose, because other forks might
still exist after unlinking one. The callers of those functions have been
modified to call smgrclose instead.
This patch in itself doesn't have any user-visible effect, but provides the
infrastructure needed for upcoming patches. The additional forks envisioned
are a rewritten FSM implementation that doesn't rely on a fixed-size shared
memory block, and a visibility map to allow skipping portions of a table in
VACUUM that have no dead tuples.
REINDEX DATABASE including same) is done before a session has done any other
update on pg_class, the pg_class relcache entry was left with an incorrect
setting of rd_indexattr, because the indexed-attributes set would be first
demanded at a time when we'd forced a partial list of indexes into the
pg_class entry, and it would remain cached after that. This could result
in incorrect decisions about HOT-update safety later in the same session.
In practice, since only pg_class_relname_nsp_index would be missed out,
only ALTER TABLE RENAME and ALTER TABLE SET SCHEMA could trigger a problem.
Per report and test case from Ondrej Jirman.
INSERT or UPDATE will match the target table's current rowtype. In pre-8.3
releases inconsistency can arise with stale cached plans, as reported by
Merlin Moncure. (We patched the equivalent hazard on the SELECT side in Feb
2007; I'm not sure why we thought there was no risk on the insertion side.)
In 8.3 and HEAD this problem should be impossible due to plan cache
invalidation management, but it seems prudent to make the check anyway.
Back-patch as far as 8.0. 7.x versions lack ALTER COLUMN TYPE, so there
seems no way to abuse a stale plan comparably.
hashtable entries for tuples that are found only in the second input: they
can never contribute to the output. Furthermore, this implies that the
planner should endeavor to put first the smaller (in number of groups) input
relation for an INTERSECT. Implement that, and upgrade prepunion's estimation
of the number of rows returned by setops so that there's some amount of sanity
in the estimate of which one is smaller.
This completes my project of improving usage of hashing for duplicate
elimination (aggregate functions with DISTINCT remain undone, but that's
for some other day).
As with the previous patches, this means we can INTERSECT/EXCEPT on datatypes
that can hash but not sort, and it means that INTERSECT/EXCEPT without ORDER
BY are no longer certain to produce sorted output.
but seem like a separate patch since most of the remaining work is on the
executor side.) I took the opportunity to push selection of the grouping
operators for set operations into the parser where it belongs. Otherwise this
is just a small exercise in making prepunion.c consider both alternatives.
As with the recent DISTINCT patch, this means we can UNION on datatypes that
can hash but not sort, and it means that UNION without ORDER BY is no longer
certain to produce sorted output.
would work, but in fact it didn't return the same rows when moving backwards
as when moving forwards. This would have no visible effect in a DISTINCT
query (at least assuming the column datatypes use a strong definition of
equality), but it gave entirely wrong answers for DISTINCT ON queries.
sure that DISTINCT ON does what it's supposed to, ie, sort by the full ORDER
BY list before unique-ifying. The error seems masked in simple cases by the
fact that query_planner won't return query pathkeys that only partially match
the requested sort order, but I wouldn't want to bet that it couldn't be
exposed in some way or other.
PageHeaderIsValid when we zero the buffer instead of reading the page in.
The actual performance improvement is probably marginal since this function
isn't very heavily used, but a cycle saved is a cycle earned.
Zdenek Kotala
This allows the use of a ramdrive (either through mount or symlink) for
the temporary file that's written every half second, which should
reduce I/O.
On server shutdown/startup, the file is written to the old location in
the global directory, to preserve data across restarts.
Bump catversion since the $PGDATA directory layout changed.
as methods for implementing the DISTINCT step. This eliminates the former
performance gap between DISTINCT and GROUP BY, and also makes it possible
to do SELECT DISTINCT on datatypes that only support hashing not sorting.
SELECT DISTINCT ON is still always implemented by sorting; it would take
executor changes to support hashing that, and it's not clear it's worth
the trouble.
This is a release-note-worthy incompatibility from previous PG versions,
since SELECT DISTINCT can no longer be counted on to deliver sorted output
without explicitly saying ORDER BY. (Anyone who can't cope with that
can consider turning off enable_hashagg.)
Several regression test queries needed to have ORDER BY added to preserve
stable output order. I fixed the ones that manifested here, but there
might be some other cases that show up on other platforms.
or target database is being accessed by other users, it tells you whether
the "other users" are live sessions or uncommitted prepared transactions.
(Indeed, it tells you exactly how many of each, but that's mostly just
because it was easy to do so.) This should help forestall the gotcha of
not realizing that a prepared transaction is what's blocking the command.
Per discussion.
sorting. The infrastructure for this was all in place already; it's only
necessary to fix the planner to not assume that sorting is always an available
option.
read when the --temp-config argument is bad. Noted while wondering why
buildfarm member dungbeetle is failing ... this isn't why, but it is why
the error report isn't very helpful ...
as per my recent proposal:
1. Fold SortClause and GroupClause into a single node type SortGroupClause.
We were already relying on them to be struct-equivalent, so using two node
tags wasn't accomplishing much except to get in the way of comparing items
with equal().
2. Add an "eqop" field to SortGroupClause to carry the associated equality
operator. This is cheap for the parser to get at the same time it's looking
up the sort operator, and storing it eliminates the need for repeated
not-so-cheap lookups during planning. In future this will also let us
represent GROUP/DISTINCT operations on datatypes that have hash opclasses
but no btree opclasses (ie, they have equality but no natural sort order).
The previous representation simply didn't work for that, since its only
indicator of comparison semantics was a sort operator.
3. Add a hasDistinctOn boolean to struct Query to explicitly record whether
the distinctClause came from DISTINCT or DISTINCT ON. This allows removing
some complicated and not 100% bulletproof code that attempted to figure
that out from the distinctClause alone.
This patch doesn't in itself create any new capability, but it's necessary
infrastructure for future attempts to use hash-based grouping for DISTINCT
and UNION/INTERSECT/EXCEPT.
method is grouped together in a reasonably similar way, keeping the "global
shared functions" together in their own section as well. Makes it a lot easier
to find your way around the code.
to represent DISTINCT or DISTINCT ON. This gets rid of a longstanding
annoyance that a view or rule using SELECT DISTINCT will be dumped out
with an overspecified ORDER BY list, and is one small step along the way
to decoupling DISTINCT and ORDER BY enough so that hash-based implementation
of DISTINCT will be possible. In passing, improve transformDistinctClause
so that it doesn't reject duplicate DISTINCT ON items, as was reported by
Steve Midgley a couple weeks ago.
or domains). This was already effectively required because you had to own
the I/O functions, and the I/O functions pretty much have to be written in
C since we don't let PL functions take or return cstring. But given the
possible security consequences of a malicious type definition, it seems
prudent to enforce superuser requirement directly. Per recent discussion.
of the STRING type category, thereby opening up the mechanism for user-defined
types. This is mainly for the benefit of citext, though; there aren't likely
to be a lot of types that are all general-purpose character strings.
Per discussion with David Wheeler.
only type categories in which the previous coding made *every* type
preferred; so there is no change in effective behavior, because the function
resolution rules only do something different when faced with a choice
between preferred and non-preferred types in the same category. It just
seems safer and less surprising to have CREATE TYPE default to non-preferred
status ...
with system catalog lookups, as was foreseen to be necessary almost since
their creation. Instead put the information into two new pg_type columns,
typcategory and typispreferred. Add support for setting these when
creating a user-defined base type.
The category column is just a "char" (i.e. a poor man's enum), allowing
a crude form of user extensibility of the category list: just use an
otherwise-unused character. This seems sufficient for foreseen uses,
but we could upgrade to having an actual category catalog someday, if
there proves to be a huge demand for custom type categories.
In this patch I have attempted to hew exactly to the behavior of the
previous hardwired logic, except for introducing new type categories for
arrays, composites, and enums. In particular the default preferred state
for user-defined types remains TRUE. That seems worth revisiting, but it
should be done as a separate patch from introducing the infrastructure.
Likewise, any adjustment of the standard set of categories should be done
separately.
filter to be used when INSERT or SELECT INTO has a plan that returns raw
disk tuples. The virtual-tuple-slot optimizations that were put in place
awhile ago mean that ExecInsert has to do ExecMaterializeSlot, and that
already copies the tuple if it's raw (and does so more efficiently than
a junk filter, too). So get rid of that logic. This in turn means that
we can throw away ExecMayReturnRawTuples, which wasn't used for any other
purpose, and was always a kluge anyway.
In passing, move a couple of SELECT-INTO-specific fields out of EState
and into the private state of the SELECT INTO DestReceiver, as was foreseen
in an old comment there. Also make intorel_receive use ExecMaterializeSlot
not ExecCopySlotTuple, for consistency with ExecInsert and to possibly save
a tuple copy step in some cases.
default_reloptions(). The previous coding was really a bug because pg_atoi()
will always throw elog on bad input data, whereas default_reloptions is not
supposed to complain about bad input unless its validate parameter is true.
Right now you could only expose the problem by hand-modifying
pg_class.reloptions into an invalid state, so it doesn't seem worth
back-patching; but we should get it right in HEAD because there might be other
situations in future. Noted while studying GIN fast-update patch.
and bogus documentation (dimension arrays are int[] not anyarray). Also the
errhint() messages seem to be really errdetail(), since there is nothing
heuristic about them. Some other trivial cosmetic improvements.
the postgres.bki file during build, because we want that file to be entirely
platform- and configuration-independent; else it can't safely be put into
/usr/share on multiarch machines. We can do the substitution during initdb,
instead. FLOAT4PASSBYVAL and FLOAT8PASSBYVAL are new breakage as of 8.4,
while the NAMEDATALEN hazard has been there all along but I guess no one
tripped over it. Noticed while trying to build "universal" OS X binaries.
a portal are never NULL, but reliably provide the source text of the query.
It turns out that there was only one place that was really taking a short-cut,
which was the 'EXECUTE' utility statement. That doesn't seem like a
sufficiently critical performance hotspot to justify not offering a guarantee
of validity of the portal source text. Fix it to copy the source text over
from the cached plan. Add Asserts in the places that set up cached plans and
portals to reject null source strings, and simplify a bunch of places that
formerly needed to guard against nulls.
There may be a few places that cons up statements for execution without
having any source text at all; I found one such in ConvertTriggerToFK().
It seems sufficient to inject a phony source string in such a case,
for instance
ProcessUtility((Node *) atstmt,
"(generated ALTER TABLE ADD FOREIGN KEY command)",
NULL, false, None_Receiver, NULL);
We should take a second look at the usage of debug_query_string,
particularly the recently added current_query() SQL function.
ITAGAKI Takahiro and Tom Lane
rewrite. When called from SIInsertDataEntries, SICleanupQueue releases
the write lock if it has to issue a kill() to signal some laggard backend.
That still seems like a good idea --- but it's possible that by the time
we get the lock back, there are no longer enough free message slots to
satisfy SIInsertDataEntries' requirement. Must recheck, and repeat the
whole SICleanupQueue process if not. Noted while reading code.
need to deconstruct proargmodes for each pg_proc entry inspected by
FuncnameGetCandidates(). Fixes function lookup performance regression
caused by yesterday's variadic-functions patch.
In passing, make pg_proc.probin be NULL, rather than a dummy value '-',
in cases where it is not actually used for the particular type of function.
This should buy back some of the space cost of the extra column.
so long as all the trailing arguments are of the same (non-array) type.
The function receives them as a single array argument (which is why they
have to all be the same type).
It might be useful to extend this facility to aggregates, but this patch
doesn't do that.
This patch imposes a noticeable slowdown on function lookup --- a follow-on
patch will fix that by adding a redundant column to pg_proc.
Pavel Stehule
macros patch :-(. Results from both baiji and mastodon imply that MSVC
fails to perceive offsetof(PageHeaderData, pd_linp[0]) as a constant
expression in some contexts where offsetof(PageHeaderData, pd_linp) works
fine. Sloth, thy name is Micro.
on the most common individual lexemes in place of the mostly-useless default
behavior of counting duplicate tsvectors. Future work: create selectivity
estimation functions that actually do something with these stats.
(Some other things we ought to look at doing: using the Lossy Counting
algorithm in compute_minimal_stats, and using the element-counting idea for
stats on regular arrays.)
Jan Urbanski
thereby forestalling any problems with alignment of the data structure placed
there. Since SizeOfPageHeaderData is maxalign'd anyway in 8.3 and HEAD, this
does not actually change anything right now, but it is foreseeable that the
header size will change again someday. I had to fix a couple of places that
were assuming that the content offset is just SizeOfPageHeaderData rather than
MAXALIGN(SizeOfPageHeaderData). Per discussion of Zdenek's page-macros patch.
SizeOfPageHeaderData instead of sizeof(PageHeaderData) in places where that
makes the code clearer, and avoid casting between Page and PageHeader where
possible. Zdenek Kotala, with some additional cleanup by Heikki Linnakangas.
I did not apply the parts of the proposed patch that would have resulted in
slightly changing the on-disk format of hash indexes; it seems to me that's
not a win as long as there's any chance of having in-place upgrade for 8.4.
warnings. Clean up various unneeded cruft that was left behind after
creating those routines. Introduce some convenience functions str_tolower_z
etc to eliminate tedious and error-prone double arguments in formatting.c.
(Currently there seems no need to export the latter, but maybe reconsider
this later.)
Document return type of cast functions.
Also change documentation to prefer the term "binary coercible" in its
present sense instead of the previous term "binary compatible".
CopySnapshot, per Neil Conway. Also add a comment about the assumption in
GetSnapshotData that the argument is statically allocated.
Also, fix some more typos in comments in snapmgr.c.
wal_segment_size to make those configuration parameters available to clients,
in the same way that block_size was previously exposed. Bernd Helmle, with
comments from Abhijit Menon-Sen and some further tweaking by me.
the current query level that aren't in fact output parameters of the current
initPlans. (This means, for example, output parameters of regular subplans.)
To make this work correctly for output parameters coming from sibling
initplans requires rejiggering the API of SS_finalize_plan just a bit:
we need the siblings to be visible to it, rather than hidden as
SS_make_initplan_from_plan had been doing. This is really part of my response
to bug #4290, but I concluded this part probably shouldn't be back-patched,
since all that it's doing is to make a debugging cross-check tighter.
bug #4290. The fundamental bug is that masking extParam by outer_params,
as finalize_plan had been doing, caused us to lose the information that
an initPlan depended on the output of a sibling initPlan. On reflection
the best thing to do seemed to be not to try to adjust outer_params for
this case but get rid of it entirely. The only thing it was really doing
for us was to filter out param IDs associated with SubPlan nodes, and that
can be done (with greater accuracy) while processing individual SubPlan
nodes in finalize_primnode. This approach was vindicated by the discovery
that the masking method was hiding a second bug: SS_finalize_plan failed to
remove extParam bits for initPlan output params that were referenced in the
main plan tree (it only got rid of those referenced by other initPlans).
It's not clear that this caused any real problems, given the limited use
of extParam by the executor, but it's certainly not what was intended.
I originally thought that there was also a problem with needing to include
indirect dependencies on external params in initPlans' param sets, but it
turns out that the executor handles this correctly so long as the depended-on
initPlan is earlier in the initPlans list than the one using its output.
That seems a bit of a fragile assumption, but it is true at the moment,
so I just documented it in some code comments rather than making what would
be rather invasive changes to remove the assumption.
Back-patch to 8.1. Previous versions don't have the case of initPlans
referring to other initPlans' outputs, so while the existing logic is still
questionable for them, there are not any known bugs to be fixed. So I'll
refrain from changing them for now.
1024 to improve performance when sending large elog messages. Also add a
comment about why we use that number.
Since this represents an externally visible behavior change, and might
possibly result in portability issues, it seems best not to back-patch it.
log message at newlines cost O(N^2) for very long messages with few or no
newlines. For messages in the megabyte range this became the dominant cost.
Per gripe from Achilleas Mantzios.
Patch all the way back, since this is a safe change with no portability
risks. I am also thinking of increasing PG_SYSLOG_LIMIT, but that should
be done separately.
results always contribute two groups, regardless of the expression contents.
This is very substantially more accurate than the regular heuristic for
certain boolean tests like "col IS NULL". Per gripe from Sam Mason.
Back-patch to all supported releases, since the behavior of
estimate_num_groups() hasn't changed all that much since 7.4.
the timezone argument as a timezone abbreviation, and only try it as a full
timezone name if that fails. The zic database has four zones (CET, EET, MET,
WET) that are full daylight-savings zones and yet have names that are the
same as their abbreviations for standard time, resulting in ambiguity.
In the timestamp input functions we resolve the ambiguity by preferring the
abbreviation, and AT TIME ZONE should work the same way. (No functionality
is lost because the zic database also has other names for these zones, eg
Europe/Zurich.) Per gripe from Jaromir Talir.
Backpatch to 8.1. Older releases did not have the issue because AT TIME ZONE
only accepted abbreviations not zone names. (Thus, this patch also arguably
fixes a compatibility botch introduced at 8.1: in ambiguous cases we now
behave the same as 8.0 did.)
variable that has units. Per report from Stefan Kaltenbrunner.
Backport to 8.2. I also backported my patch of 2007-06-21 that prevented
comparable overflows on the input side, since that now seems to have enough
field track record to be back-patched safely. That patch included addition
of hints listing the available unit names, which I did not bother to strip
out of it --- this will make a little more work for the translators, but
they can copy the translation from 8.3, and anyway an untranslated hint
is better than no hint.
output for CREATE FUNCTION. This makes it easier to read especially if the
function body is long.
Original idea and patch by Greg Sabino Mullane, though this is a stripped
down version of that.
of different types than the underlying column. The capability isn't yet
used for anything, but will be required by upcoming patch to analyze
tsvector columns.
Jan Urbanski
one for client-side, restoring the previous behaviour with different
sort order for the 'log' level. Also, remove redundant list of available
options, since the enum code will output it automatically.
timezone setting in the current year and for 100 years back, rather than
always examining years 1904-2004. The original coding would have problems
distinguishing zones whose behavior diverged only after 2004; which is a
situation we will surely face sometime, if it's not out there already.
In passing, also prevent selection of the dummy "Factory" timezone, even
if that's exactly what the system is using. Reporting time as GMT seems
better than that.
backend. If so, send a LOG message to the postmaster log, and if the table
is beyond the vacuum-for-wraparound horizon, forcibly drop it. Per recent
discussions. Perhaps we ought to back-patch this, but it probably needs
to age a bit in HEAD first.
As the buffer could now be a lot larger than before, and copying it could
thus be a lot more expensive than before, use strcpy instead of memcpy to
copy the query string, as was already suggested in comments. Also, only copy
the PgBackendStatus struct and string if the slot is in use.
Patch by Thomas Lee, with some changes by me.
space is tracked via GetMemoryChunkSpace, there's really no advantage
to duplicating datumCopy's innards here. This is one bit of my toast
indirection patch that should go in anyway.
of any lower outer join, even if it also references the non-nullable side and
so could not get pushed below the outer join anyway. We need this in case
the clause is an OR clause: if it doesn't get marked outerjoin_delayed,
create_or_index_quals() could pull an indexable restriction for the nullable
side out of it, leading to wrong results as demonstrated by today's bug
report from toruvinn. (See added regression test case for an example.)
In principle this has been wrong for quite a while. In practice I don't
think any branch before 8.3 can really show the failure, because
create_or_index_quals() will only pull out indexable conditions, and before
8.3 those were always strict. So though we might have improperly generated
null-extended rows in the outer join, they'd get discarded from the result
anyway. The gating factor that makes the failure visible is that 8.3
considers "col IS NULL" to be indexable. Hence I'm not going to risk
back-patching further than 8.3.
taking the maximum of any child rel's width, we should weight the widths
proportionally to the number of rows expected from each child. In hindsight
this is obviously correct because row width is really a proxy for the total
physical size of the relation. Per discussion with Scott Carey (bug #4264).
to suppress zero-padding of "name" entries in indexes.
The alignment change is unlikely to save any space, but it is really needed
anyway to make the world safe for our widespread practice of passing plain
old C strings to functions that are declared as taking Name. In the previous
coding, the C compiler was entitled to assume that a Name pointer was
word-aligned; but we were failing to guarantee that. I think the reason
we'd not seen failures is that usually the only thing that gets done with
such a pointer is strcmp(), which is hard to optimize in a way that exploits
word-alignment. Still, some enterprising compiler guy will probably think
of a way eventually, or we might change our code in a way that exposes
more-obvious optimization opportunities.
The padding change is accomplished in one-liner fashion by declaring the
"name" index opclasses to use storage type "cstring" in pg_opclass.h.
Normally btree and hash don't allow a nondefault storage type, because they
don't have any provisions for converting the input datum to another type.
However, because name and cstring are effectively the same thing except for
padding, no conversion is needed --- we only need index_form_tuple() to treat
the datum as being cstring not name, and this is sufficient. This seems to
make for about a one-third reduction in the typical sizes of system catalog
indexes that involve "name" columns, of which we have many.
These two changes are only weakly related, but the alignment change makes
me feel safer that the padding change won't introduce problems, so I'm
committing them together.
in pg_proc. Also make it not emit duplicate extern declarations, and make it
a bit more bulletproof in some other small ways. Likewise fix the equally
hard-wired, and utterly undocumented, knowledge in the MSVC build scripts.
For testing purposes and perhaps other uses in future, pull out that portion
of the MSVC scripts into a standalone perl script equivalent to
Gen_fmgrtab.sh, and make it generate actually identical output, rather than
just more-or-less-the-same output.
Motivated by looking at Pavel's variadic function patch. Whether or not
that gets accepted, we can be sure that pg_proc's column set will change
again in the future; it's time to not have to deal with this gotcha.
read and written without a lock. The value itself is atomic, sure, but on
processors with weak memory ordering it's possible for a reader to see the
value change before it sees the associated message written into the buffer
array. Fix by introducing a spinlock that's used just to read and write
maxMsgNum. (We could do this with less overhead if we recognized a concept
of "memory access barrier"; is it worth introducing such a thing? At the
moment probably not --- I can't measure any clear slowdown from adding the
spinlock, so this solution is probably fine.) Per buildfarm results.
unnecessary cache resets. The major changes are:
* When the queue overflows, we only issue a cache reset to the specific
backend or backends that still haven't read the oldest message, rather
than resetting everyone as in the original coding.
* When we observe backend(s) falling well behind, we signal SIGUSR1
to only one backend, the one that is furthest behind and doesn't already
have a signal outstanding for it. When it finishes catching up, it will
in turn signal SIGUSR1 to the next-furthest-back guy, if there is one that
is far enough behind to justify a signal. The PMSIGNAL_WAKEN_CHILDREN
mechanism is removed.
* We don't attempt to clean out dead messages after every message-receipt
operation; rather, we do it on the insertion side, and only when the queue
fullness passes certain thresholds.
* Split SInvalLock into SInvalReadLock and SInvalWriteLock so that readers
don't block writers nor vice versa (except during the infrequent queue
cleanout operations).
* Transfer multiple sinval messages for each acquisition of a read or
write lock.
corresponding struct definitions. This allows other headers to avoid including
certain highly-loaded headers such as rel.h and relscan.h, instead using just
relcache.h, heapam.h or genam.h, which are more lightweight and thus cause less
unnecessary dependencies.
by installing an error context subroutine that will provide the file name
and line number for all errors detected while reading a config file.
Some of the reader routines were already doing that in an ad-hoc way for
errors detected directly in the reader, but it didn't help for problems
detected in subroutines, such as encoding violations.
Back-patch to 8.3 because 8.3 is where people will be trying to debug
configuration files.
int2-and-int8 implementations of the basic arithmetic operators +, -, *, /.
This doesn't really add any new functionality, but it avoids "operator is not
unique" failures that formerly occurred in these cases because the parser
couldn't decide whether to promote the int2 to int4 or int8. We could
alternatively have removed the existing cross-type operators, but
experimentation shows that the cost of an additional type coercion expression
node is noticeable compared to such cheap operators; so let's not give up any
performance here. On the other hand, I removed the int2-and-int4 modulo (%)
operators since they didn't seem as important from a performance standpoint.
Per a complaint last January from ykhuang.
that it depends on for replan-forcing purposes. We need to consider plain OID
constants too, because eval_const_expressions folds a RelabelType atop a Const
to just a Const. This change could result in OID values that aren't really
for tables getting added to the dependency list, but the worst-case
consequence would be occasional useless replans. Per report from Gabriele
Messineo.
1. Directly reading interp->result is deprecated in Tcl 8.0 and later;
you're supposed to use Tcl_GetStringResult. This code finally broke with
Tcl 8.5, because Tcl_GetVar can now have side-effects on interp->result even
though it preserves the logical state of the result. (There's arguably a
Tcl issue here, because Tcl_GetVar could invalidate the pointer result of a
just-preceding Tcl_GetStringResult, but I doubt the Tcl guys will see it as
a bug.)
2. We were being sloppy about the encoding of the result: some places would
push database-encoding data into the Tcl result, which should not happen,
and we were assuming that any error result coming back from Tcl was in the
database encoding, which is not a good assumption.
3. There were a lot of calls of Tcl_SetResult that uselessly specified
TCL_VOLATILE for constant strings. This is only a minor performance issue,
but I fixed it in passing since I had to look at all the calls anyway.
#2 is a live bug regardless of which Tcl version you are interested in,
so back-patch even to branches that are unlikely to be used with Tcl 8.5.
I went back as far as 8.0, which is as far as the patch applied easily;
7.4 was using a different error processing scheme that has got its own
problems :-(
is necessary to avoid deadlock against ordinary queries, but we'd broken it
with recent changes that made the DROP machinery lock the index before
arriving at index_drop. Per intermittent buildfarm failures.
grammar allows ALTER TABLE/INDEX/SEQUENCE/VIEW interchangeably for all
subforms of those commands, and then we sort out what's really legal
at execution time. This allows the ALTER SEQUENCE/VIEW reference pages
to fully document all the ALTER forms available for sequences and views
respectively, and eliminates a longstanding cause of confusion for users.
The net effect is that the following forms are allowed that weren't before:
ALTER SEQUENCE OWNER TO
ALTER VIEW ALTER COLUMN SET/DROP DEFAULT
ALTER VIEW OWNER TO
ALTER VIEW SET SCHEMA
(There's no actual functionality gain here, but formerly you had to say
ALTER TABLE instead.)
Interestingly, the grammar tables actually get smaller, probably because
there are fewer special cases to keep track of.
I did not disallow using ALTER TABLE for these operations. Perhaps we
should, but there's a backwards-compatibility issue if we do; in fact
it would break existing pg_dump scripts. I did however tighten up
ALTER SEQUENCE and ALTER VIEW to reject non-sequences and non-views
in the new cases as well as a couple of cases where they didn't before.
The patch doesn't change pg_dump to use the new syntaxes, either.
objects are specified, we drop them all in a single performMultipleDeletions
call. This makes the RESTRICT/CASCADE checks more relaxed: it's not counted
as a cascade if one of the later objects has a dependency on an earlier one.
NOTICE messages about such cases go away, too.
In passing, fix the permissions check for DROP CONVERSION, which for some
reason was never made role-aware, and omitted the namespace-owner exemption
too.
Alex Hunsaker, with further fiddling by me.
the problem happened in. These are all supposedly can't-happen cases, but
when they do happen it's useful to know where.
Back-patch to 8.3, but not further because the patch doesn't apply cleanly
further back. Given the lack of response to my proposal of this, there
doesn't seem to be enough interest to justify much back-porting effort.
forks. XLogOpenRelation() and the associated light-weight relation cache in
xlogutils.c is gone, and XLogReadBuffer() now takes a RelFileNode as argument,
instead of Relation.
For functions that still need a Relation struct during WAL replay, there's a
new function called CreateFakeRelcacheEntry() that returns a fake entry like
XLogOpenRelation() used to.
devised for pg_shdepend, namely the individual dependencies are reported as
DETAIL lines rather than coming out as separate NOTICEs. The client-side
report is capped at 100 lines, but the server log always gets a full report.
CacheInvalidateRelcache() crashes if called in WAL recovery, because the
invalidation infrastructure hasn't been initialized yet.
Back-patch to 8.2, where the bug was introduced.
Basically just reuse the same text that psql emitted as part of
its startup banner in prior versions, and make some whitespace
more consistent with the conventions in other psql command output.
algorithm, replacing the original intention of a one-pass search, which
had been hacked up over time to be partially two-pass in hopes of handling
various corner cases better. It still wasn't quite there, especially as
regards emitting unwanted NOTICE messages. More importantly, this approach
lets us fix a number of open bugs concerning concurrent DROP scenarios,
because we can take locks during the first pass and avoid traversing to
dependent objects that were just deleted by someone else.
There is more that can be done here, but I'll go ahead and commit the
base patch before working on the options.
more logical that way, and also it reduces the amount of unnecessary includes
in bufpage.h, which is widely used.
Zdenek Kotala.
My previous patch to bufpage.h should also have credited him as author, but I
forgot (sorry about that).
patches that dealt with object ownership. It wasn't updating pg_shdepend
nor adjusting the aggregate's ACL. In 8.2 and up, fix this permanently
by making it use AlterFunctionOwner_oid. In 8.1, the function code wasn't
factored that way, so just copy and paste.
This is needed because :: casting binds more tightly than minus, so for
example -1::integer is not the same as (-1)::integer, and there are cases
where the difference is important. In particular this caused a failure
in SELECT DISTINCT ... ORDER BY ... where expressions that should have
matched were seen as different by the parser; but I suspect that there
could be other cases where failure to parenthesize leads to subtler
semantic differences in reloaded rules. Per report from Alexandr Popov.
doesn't work, and the real reason why not is it's unclear where the path
is relative to (initdb's CWD, or the data directory?). We could make an
arbitrary decision, but it seems best to make the user be unambiguous.
Per gripe from Devrim.
the PARAM_FLAG_CONST flag on the parameters that are passed into the portal,
while the former's behavior is unchanged. This should only affect the case
where the portal is executing an EXPLAIN; it will cause the generated plan to
look more like what would be generated if the portal were actually executing
the command being explained. Per gripe from Pavel.
"make all", and then reference them there during the actual tests. This
makes the handling of these files more parallel to that of regress.so,
and in particular simplifies use of the regression tests outside the
original build tree. The PGDG and Red Hat RPMs have been doing this via
patches for a very long time. Inclusion of the change in core was requested
by Jørgen Austvik of Sun, and I can't see any reason not to.
I attempted to fix the MSVC scripts for this too, but they may need
further tweaking ...
we are on a 64-bit machine (ie, size_t is wider than int) and someone passes
in a query string that approaches or exceeds INT_MAX bytes. Also, just for
paranoia's sake, guard against similar overflows in sizing the input buffer.
The backend will not in the foreseeable future be prepared to send or receive
strings exceeding 1GB, so I didn't take the more invasive step of switching
all the buffer index variables from int to size_t; though someday we might
want to do that.
I have a suspicion that this is not the only such bug in libpq, but this
fix is enough to take care of the crash reported by Francisco Reyes.
This is required on Windows due to the special locale
handling for UTF8 that doesn't change the full environment.
Fixes crash with translated error messages per bugs 4180
and 4196.
Tom Lane
the associated datatype as their equality member. This means that these
opclasses can now support plain equality comparisons along with LIKE tests,
thus avoiding the need for an extra index in some applications. This
optimization was not possible when the pattern opclasses were first introduced,
because we didn't insist that text equality meant bitwise equality; but we
do now, so there is no semantic difference between regular and pattern
equality operators.
I removed the name_pattern_ops opclass altogether, since it's really useless:
name's regular comparisons are just strcmp() and are unlikely to become
something different. Instead teach indxpath.c that btree name_ops can be
used for LIKE whether or not the locale is C. This might lead to a useful
speedup in LIKE queries on the system catalogs in non-C locales.
The ~=~ and ~<>~ operators are gone altogether. (It would have been nice to
keep them for backward compatibility's sake, but since the pg_amop structure
doesn't allow multiple equality operators per opclass, there's no way.)
A not-immediately-obvious incompatibility is that the sort order within
bpchar_pattern_ops indexes changes --- it had been identical to plain
strcmp, but is now trailing-blank-insensitive. This will impact
in-place upgrades, if those ever happen.
Per discussions a couple months ago.
called before, not after, calling the assign_hook if any. This is because
push_old_value might fail (due to palloc out-of-memory), and in that case
there would be no stack entry to tell transaction abort to undo the GUC
assignment. Of course the actual assignment to the GUC variable hasn't
happened yet --- but the assign_hook might have altered subsidiary state.
Without a stack entry we won't call it again to make it undo such actions.
So this is necessary to make the world safe for assign_hooks with side
effects. Per a discussion a couple weeks ago with Magnus.
Back-patch to 8.0. 7.x did not have the problem because it did not have
allocatable stacks of GUC values.
cases. Recent buildfarm experience shows that it is sometimes possible
to execute several SQL commands in less time than the granularity of
Windows' not-very-high-resolution gettimeofday(), leading to a failure
because the tests expect the value of now() to change and it doesn't.
Also, it was recognized some time ago that the same area of the tests
could fail if local midnight passes between the insertion and the checking
of the values for 'yesterday', 'tomorrow', etc. Clean all this up per
ideas from myself and Greg Stark.
There remains a window for failure if the transaction block is entered
exactly at local midnight (so that 'now' and 'today' have the same value),
but that seems low-probability enough to live with.
Since the point of this change is mostly to eliminate buildfarm noise,
back-patch to all versions we are still actively testing.
Windows, for better performance.
Per suggestion from Andrew Chernow, but not his patch since the underlying
code was changed to deal with return values.