errors when tables are concurrently dropped. To do this we must take lock
on each relation before we check its privileges. The old code was trying
to do that the other way around, which is a bit pointless when there are lots
of other commands that lock relations before checking privileges. I did keep
it checking each relation's privilege before locking the next relation, which
is a detail that ALTER TABLE isn't too picky about.
ability to lock relations as they scan pg_inherits, and to ignore any
relations that have disappeared by the time we get lock on them. This
makes uses of these functions safe against concurrent DROP operations
on child tables: we will effectively ignore any just-dropped child,
rather than possibly throwing an error as in recent bug report from
Thomas Johansson (and similar past complaints). The behavior should
not change otherwise, since the code was acquiring those same locks
anyway, just a little bit later.
An exception is LockTableCommand(), which is still behaving unsafely;
but that seems to require some more discussion before we change it.
find_inheritance_children() and find_all_inheritors(). I got annoyed that
these are buried inside the planner but mostly used elsewhere. So, create
a new file catalog/pg_inherits.c and put them there, along with a couple
of other functions that search pg_inherits.
The code that modifies pg_inherits is (still) in tablecmds.c --- it's
kind of entangled with unrelated code that modifies pg_depend and other
stuff, so pulling it out seemed like a bigger change than I wanted to make
right now. But this file provides a natural home for it if anyone ever
gets around to that.
This commit just moves code around; it doesn't change anything, except
I succumbed to the temptation to make a couple of trivial optimizations
in typeInheritsFrom().
of AND/OR clause branches that predtest.c would attempt to deal with. As
noted in bug #4721, that change disabled proof attempts for sizes of problems
that people are actually expecting it to work for. The original complaint
it was trying to solve was O(N^2) behavior for long IN-lists, so let's try
applying the limit to just ScalarArrayOpExprs rather than everything.
Another case of "foolish consistency" I fear.
Back-patch to 8.2, same as the previous patch was.
predicate_refuted_by: if either top-level input is a single-element list,
reduce it to its lone member before proceeding. This avoids
a useless level of AND-recursion within the recursive proof routines.
It's worth doing because, for example, if the clause is a 100-element
list and the predicate is a 1-element list then we'd otherwise strip
the predicate's list structure 100 times as we iterate through the clause.
It's only needed at top level because there won't be any trivial ANDs below
that --- this situation is an artifact of the decision to represent even
single-item conditions as Lists in the "implicit AND" format, and that format
is only used at the top level of any predicate or restriction condition.
in its CREATE DATABASE commands only for databases that have settings
different from the installation defaults. This is a low-tech method of
avoiding unnecessary platform dependencies in dump files. Eventually we ought
to have a platform-independent way of specifying LC_COLLATE and LC_CTYPE, but
that's not going to happen for 8.4, and this patch at least avoids the issue
for people who aren't setting up per-database locales. ENCODING doesn't have
the platform dependency problem, but it seems consistent to make it act the
same as the locale settings.
joins a bit better, ie, understand the differing cost functions for matched
and unmatched outer tuples. There is more that could be done in cost_hashjoin
but this already helps a great deal. Per discussions with Robert Haas.
a toast table to be built, even if the sum-of-column-widths calculation
indicates one isn't needed. This is needed by pg_migrator because if the
old table has a toast table, we have to migrate over the toast table since
it might contain some live data, even though subsequent column drops could
mean that no recently-added rows could require toasting.
restrictions specified for semijoins in optimizer/README, to wit that
you can't reassociate outer joins into or out of the RHS of a semijoin.
Per report from Heikki.
you can end up with an unrecoverable backup if you start a new base backup
right after finishing archive recovery. In that scenario, the redo pointer of
the checkpoint that pg_start_backup() writes points to the XLOG segment where
the timeline-changing end-of-archive-recovery checkpoint is. The beginning
of that segment contains pages with the old timeline ID, and we don't accept
that in recovery unless we find a history file covering the old timeline ID.
If you omit pg_xlog from the base backup and clear the archive directory
before starting the backup, there will be no such history file available.
The bug is present in all versions since PITR was introduced in 8.0, but I'm
back-patching only back to 8.2. Earlier versions didn't have XLOG switch
records, making this fix unfeasible. Given the lack of reports until now,
it doesn't seem worthwhile to spend more effort to fix 8.0 and 8.1.
Per report and suggestion by Mikael Krantz
can be pushed to the top of the join tree, we update both the relids and
qualscope variables to keep them in sync. This prevents a possible later
failure of an Assert clause, and affects nothing else since qualscope isn't
used later except for that Assert. At the moment the Assert shouldn't be
reachable when we've pushed the qual up; but this is cheap insurance, and
it's more sensible anyway in terms of the overall logic of the routine.
Per analysis of a bug report from Stefan Huehner.
I'm not back-patching this since it's just future-proofing; but if anyone
gets tempted to change check_outerjoin_delay again in the back branches,
this might be needed.
must be used for the new database, except when copying from template0.
This is the same rule that we now enforce for locale settings, and it has
the same motivation: databases other than template0 might contain data that
would be invalid according to a different setting. This represents another
step in a continuing process of locking down ways in which encoding violations
could occur inside the backend. Per discussion of a few days ago.
In passing, fix pre-existing breakage of mbregress.sh, and fix up a couple
of ereport() calls in dbcommands.c that failed to specify sqlstate codes.
will still be performed if something in a backend process calls exit()
directly, instead of going through proc_exit() as we prefer. This is a second
response to the issue that we might load third-party code that doesn't know it
should not call exit(). Such a call will now cause a reasonably graceful
backend shutdown, if possible. (Of course, if the reason for the exit() call
is out-of-memory or some such, we might not be able to recover, but at least
we will try.)
a backend has done exit(0) or exit(1) without having disengaged itself
from shared memory. We are at risk for this whenever third-party code is
loaded into a backend, since such code might not know it's supposed to go
through proc_exit() instead. Also, it is reported that under Windows
there are ways to externally kill a process that cause the status code
returned to the postmaster to be indistinguishable from a voluntary exit
(thank you, Microsoft). If this does happen then the system is probably
hosed --- for instance, the dead session might still be holding locks.
So the best recovery method is to treat this like a backend crash.
The dead man switch is armed for a particular child process when it
acquires a regular PGPROC, and disarmed when the PGPROC is released;
these should be the first and last touches of shared memory resources
in a backend, or close enough anyway. This choice means there is no
coverage for auxiliary processes, but I doubt we need that, since they
shouldn't be executing any user-provided code anyway.
This patch also improves the management of the EXEC_BACKEND
ShmemBackendArray array a bit, by reducing search costs.
Although this problem is of long standing, the lack of field complaints
seems to mean it's not critical enough to risk back-patching; at least
not till we get some more testing of this mechanism.
points where we step right or left to the next page. This should ensure
reasonable response time to a query cancel request during an unsuccessful
index scan, as seen in recent gripe from Marc Cousin. It's a bit trickier
than it might seem at first glance, because CHECK_FOR_INTERRUPTS() is a no-op
if executed while holding a buffer lock. So we have to do it just at the
point where we've dropped one page lock and not yet acquired the next.
Remove CHECK_FOR_INTERRUPTS calls at the top level of btgetbitmap and
hashgetbitmap, since they're pointless given the added checks.
I think that GIST is okay already --- at least, there's a CHECK_FOR_INTERRUPTS
at a plausible-looking place in gistnext(). I don't claim to know GIN well
enough to try to poke it for this, if indeed it has a problem at all.
This is a pre-existing issue, but in view of the lack of prior complaints
I'm not going to risk back-patching.
ANALYZE's total sample. The original coding is at risk of overflow for
statistics targets exceeding about 2675; this was not a problem before
8.4 but it is now. Per bug #4793 from Dennis Noordsij.
it fails because the shared memory segment already exists. This
means it can take up to 10 seconds before it reports the error
if it *does* exist, but hopefully it will make the system capable
of restarting even when the server is under high load.
to make sure that the error code is reset, as a precaution in
case the API doesn't properly reset it on success. This could
be necessary, since we check the error value even if the function
doesn't fail for specific success cases.
error message if the installation directory layout is messed up (or at least,
something more useful than the behavior exhibited in bug #4787). During
postmaster startup, check that get_pkglib_path resolves as a readable
directory; and if ParseTzFile() fails to open the expected timezone
abbreviation file, check the possibility that the directory is missing rather
than just the specified file. In case of either failure, issue a hint
suggesting that the installation is broken. These two checks cover the lib/
and share/ trees of a full installation, which should take care of most
scenarios where a sysadmin decides to get cute.
never a BEGIN block. This is required for Oracle compatibility and is
also plainly stated to be the behavior by our original documentation
(up until 8.1, in which the docs were adjusted to match the code's behavior;
but actually the old docs said the correct thing and the code was wrong).
Not back-patched because this introduces an incompatibility that could
break working applications. Requires release note.
failed to consider the possibility that it would get T_SCALAR, T_RECORD,
or T_ROW instead because the word happens to match a plpgsql variable name.
In particular, give "duplicate declaration" rather than generic "syntax error"
if the same identifier is declared twice in the same block, as per my recent
complaint. Also behave more sanely when decl_aliasitem or proc_condition or
opt_lblname is coincidentally not T_WORD. Refactor the related productions a
bit to reduce duplication.
This is a longstanding bug, but it doesn't seem critical enough to
back-patch.
part that rounds up to exactly 1.0 second. The previous coding rejected input
like "00:12:57.9999999999999999999999999999", with the exact number of nines
needed to cause failure varying depending on float-timestamp option and
possibly on platform. Obviously this should round up to the next integral
second, if we don't have enough precision to distinguish the value from that.
Per bug #4789 from Robert Kruus.
In passing, fix a missed check for fractional seconds in one copy of the
"is it greater than 24:00:00" code.
Broken all the way back, so patch all the way back.
keyword lists in gram.y and kwlist.h. It checks that all lists are in
alphabetical order, and that all keywords present in gram.y are listed
in kwlist.h in the right category, and that all keywords in kwlist.h are
also in gram.y. What's still missing is to check that all keywords
defined with "%token <keyword>" in gram.y are present in one of the
keyword lists in gram.y.
PlaceHolderVar nodes in join quals appearing in or below the lowest
outer join that could null the subquery being pulled up. This improves
the planner's ability to recognize constant join quals, and probably
helps with detection of common sort keys (equivalence classes) as well.
aggregate function. By definition, such a sub-SELECT cannot reference any
variables of query levels between itself and the aggregate's semantic level
(else the aggregate would've been assigned to that lower level instead).
So the correct, most efficient implementation is to treat the sub-SELECT as
being a sub-select of that outer query level, not the level the aggregate
syntactically appears in. Not doing so also confuses the heck out of our
parameter-passing logic, as illustrated in bug report from Daniel Grace.
Fortunately, we were already copying the whole Aggref expression up to the
outer query level, so all that's needed is to delay SS_process_sublinks
processing of the sub-SELECT until control returns to the outer level.
This has been broken since we introduced spec-compliant treatment of
outer aggregates in 7.4; so patch all the way back.
"verify-ca" and "verify-full".
Since "prefer" remains the default, this will make certificate validation
off by default, which should lead to less upgrade issues.
any negative or positive number, not just -1 or 1. Fix comment on
varstr_cmp and citext test case accordingly.
As pointed out by Zdenek Kotala, and buildfarm member gothic moth.
documentation warnings against setting it nonzero unless active use of
prepared transactions is intended and a suitable transaction manager has been
installed. This should help to prevent the type of scenario we've seen
several times now where a prepared transaction is forgotten and eventually
causes severe maintenance problems (or even anti-wraparound shutdown).
The only real reason we had the default be nonzero in the first place was to
support regression testing of the feature. To still be able to do that,
tweak pg_regress to force a nonzero value during "make check". Since we
cannot force a nonzero value in "make installcheck", add a variant regression
test "expected" file that shows the results that will be obtained when
max_prepared_transactions is zero.
Also, extend the HINT messages for transaction wraparound warnings to mention
the possibility that old prepared transactions are causing the problem.
All per today's discussion.
using the system functions all the time. (These files are now just copies
of the osf.* files.) The homebrew functions were not getting used anyway
on AIX versions that have dlopen(), that is 4.3 and up, so they are not
needed on any AIX that is even remotely supported by the vendor anymore.
We'd have probably left them here anyway, except some questions were
raised about the copyright.
fact that this is breaking the MSVC build, it's probably not really a good
idea to expand the dependencies of gram.h any further than the core parser;
for instance the value of SCONST might depend on which bison version you'd
built with. Better to expose an additional call point in parser.c, so
move what I had put into pl_funcs.c into parser.c. Also PGDLLIMPORT'ify
the reference to standard_conforming_strings, per buildfarm results.
Stefan Kaltenbrunner. The most reasonable behavior (at least for the near
term) seems to be to ignore the PlaceHolderVar and examine its argument
instead. In support of this, change the API of pull_var_clause() to allow
callers to request recursion into PlaceHolderVars. Currently
estimate_num_groups() is the only customer for that behavior, but where
there's one there may be others.
more nearly matching the core SQL scanner. The user-visible effects are:
* Block comments (slash-star comments) now nest, as per SQL spec.
* In standard_conforming_strings mode, backslash as the last character of a
non-E string literal is now correctly taken as an ordinary character;
formerly it was misinterpreted as escaping the ending quote. (Since the
string also had to pass through the core scanner, this invariably led
to syntax errors.)
* Formerly, backslashes in the format string of RAISE were always treated as
quoting the next character, regardless of mode. Now, they are ordinary
characters with standard_conforming_strings on, while with it off, they
introduce the same set of escapes as in the core SQL scanner. Also,
escape_string_warning is now effective for RAISE format strings. These
changes make RAISE format strings work just like any other string literal.
This is implemented by copying and pasting a lot of logic from the core
scanner. It would be a good idea to look into getting rid of plpgsql's
scanner entirely in favor of using the core scanner. However, that involves
more change than I can justify making during beta --- in particular, the core
scanner would have to become re-entrant.
In passing, remove the kluge that made the plpgsql scanner emit T_FUNCTION or
T_TRIGGER as a made-up first token. That presumably had some value once upon
a time, but now it's just useless complication for both the scanner and the
grammar.
constants through full joins, as in
select * from tenk1 a full join tenk1 b using (unique1)
where unique1 = 42;
which should generate a fairly cheap plan where we apply the constraint
unique1 = 42 in each relation scan. This had been broken by my patch of
2008-06-27, which is now reverted in favor of a more invasive but hopefully
less incorrect approach. That patch was meant to prevent incorrect extraction
of OR'd indexclauses from OR conditions above an outer join. To do that
correctly we need more information than the outerjoin_delay flag can provide,
so add a nullable_relids field to RestrictInfo that records exactly which
relations are nulled by outer joins that are underneath a particular qual
clause. A side benefit is that we can make the test in create_or_index_quals
more specific: it is now smart enough to extract an OR'd indexclause into the
outer side of an outer join, even though it must not do so in the inner side.
The old coding couldn't distinguish these cases so it could not do either.
select u&42 from table-with-a-u-column;
Also fix missing SET_YYLLOC() in the {dolqfailed} production that I suppose
this was based on. The latter is a pre-existing bug, but the only effect
is to misplace the error cursor by one token, so probably not worth
backpatching.
If a currently running item needs an exclusive lock on any item that the candidate items needs
any sort of lock on, or vice versa, then the candidate item is not allowed to run now, and
must wait till later.
tablespaces in an order that has some chance of working.
Per a complaint from Kevin Bailey.
This is a pre-existing bug, but given the lack of prior complaints I'm
not sure it's worth back-patching. In most cases failure of the DROP
commands wouldn't be that important anyway.
In passing, fix syntax errors in dumpCreateDB()'s queries for old servers;
these were apparently introduced in recent binary_upgrade patch.
how this ought to behave for multi-dimensional arrays. Per discussion,
not having it at all seems better than having it with what might prove
to be the wrong behavior. We can always add it later when we have consensus
on the correct behavior.
by my patch of 2007-01-28 to use per-subtransaction ExprContexts/EStates:
since we re-prepared any expression tree when the current subtransaction ID
changed, we'd accumulate more and more leaked expression state trees in the
outermost subtransaction if the same function was executed at multiple levels
of subtransaction nesting. To fix, go back to the previous scheme where
there was only one EState per transaction for simple plpgsql expressions.
We really only need an ExprContext per subtransaction, not a whole EState,
so it's possible to keep prepared expression state trees in the one EState
throughout the transaction. This should be more efficient as well as not
leaking memory for cases involving lots of subtransactions.
The added regression test is the case that inspired the 2007-01-28 patch in
the first place, just to make sure we didn't go backwards. The current
memory leak complaint is unfortunately hard to test for in the regression
test framework, though manual testing shows it's fixed.
Although this is a pre-existing bug, I'm not back-patching because I'd like to
see this method get some field testing first. Consider back-patching if it
gets through 8.4beta unscathed.
cstring from the output of \df. Now that the default behavior is to
exclude all system functions, the de-cluttering rationale for this behavior
seems pretty weak; and it was always quite confusing/unhelpful if you were
actually looking for I/O functions. (Not to mention if you were looking
for encoding converters or other cases that might take or return cstring.)
already did that on Windows, but it's needed on other platforms too when
LC_CTYPE=C. With other locales, we enforce (or trust) that the codeset of
the locale matches the server encoding so we don't need to bind it
explicitly. It should do no harm in that case either, but I don't have
full faith in the PG encoding -> OS codeset mapping table yet. Per recent
discussion on pgsql-hackers.
the checkpoint in immediate or lazy mode. This is to address complaints
that pg_start_backup() takes a long time even when there's no need to minimize
its I/O consumption.
alias for array_length(v,1). The efficiency gain here is doubtless
negligible --- what I'm interested in is making sure that if we have
second thoughts about the definition, we will not have to force a
post-beta initdb to change the implementation.
of discovery, rather than reverse order. This doesn't matter functionally
(I suppose the previous coding dates from the time when lcons was markedly
cheaper than lappend). However now that EXPLAIN is labeling subplans with
IDs that are based on order of creation, this may help produce a slightly
less surprising printout.
are individually labeled, rather than just grouped under an "InitPlan"
or "SubPlan" heading. This in turn makes it possible for decompilation of
a subplan reference to usefully identify which subplan it's referencing.
I also made InitPlans identify which parameter symbol(s) they compute,
so that references to those parameters elsewhere in the plan tree can
be connected to the initplan that will be executed. Per a gripe from
Robert Haas about EXPLAIN output of a WITH query being inadequate,
plus some longstanding pet peeves of my own.
are using our own ports of getopt or getopt_long, those will define
the variable for themselves; and if not, we don't need these, because
we never touch the variable anyway.
provides optreset. Current mastodon results prove that in fact it
does not; it was only because getopt.c defined the variable anyway
that things failed to fall over.
of adding optional namespace and action fields to DefElem. Having three
node types that do essentially the same thing bloats the code and leads
to errors of confusion, such as in yesterday's bug report from Khee Chin.
when we are waiting for old snapshots to go away during a concurrent index
build. In particular, this rule lets us avoid waiting for
idle-in-transaction sessions.
This logic could be improved further if we had some way to wake up when
the session we are currently waiting for goes idle-in-transaction. However
that would be a significantly more complex/invasive patch, so it'll have to
wait for some other day.
Simon Riggs, with some improvements by Tom.
interval_eq() considers equal. I'm not sure how that fundamental requirement
escaped us through multiple revisions of this hash function, but there it is;
it's been wrong since interval_hash was first written for PG 7.1.
Per bug #4748 from Roman Kononov.
Backpatch to all supported releases.
This patch changes the contents of hash indexes for interval columns. That's
no particular problem for PG 8.4, since we've broken on-disk compatibility
of hash indexes already; but it will require a migration warning note in
the next minor releases of all existing branches: "if you have any hash
indexes on columns of type interval, REINDEX them after updating".
To implement this without almost duplicating the reloption table, treat
relopt_kind as a bitmask instead of an integer value. This decreases the
range of allowed values, but it's not clear that there's need for that much
values anyway.
This patch also makes heap_reloptions explicitly a no-op for relation kinds
other than heap and TOAST tables.
Patch by ITAGAKI Takahiro with minor edits from me. (In particular I removed
the bit about adding relation kind to an error message, which I intend to
commit separately.)
Windows without that, but we shouldn't put bad examples where people might
copy them. Also, reformat slightly to improve the odds that pgindent
won't go nuts on this.
try to protect an already-existing buffer from being evicted. This was
left as an open issue when the posix_fadvise patch was committed. I'm
not sure there's any evidence to justify more work in this area, but we
should have some record about it in the source code.
for its arguments. Also add a regression test, since someone apparently
changed every single plpython test case to use only named parameters; else
we'd have noticed this sooner.
Euler Taveira de Oliveira, per a report from Alvaro
This method will not catch all different ways since the locale
handling in NTFS doesn't provide an easy way to do that, but it
will hopefully solve the most common cases causing startup
problems when the backend is found in the system PATH.
Attempts to fix bug #4694.
for simple Var targetlist entries all the time, even when there are other
entries that are not simple Vars. Also, ensure that we prefetch attributes
(with slot_getsomeattrs) for all Vars in the targetlist, even those buried
within expressions. In combination these changes seem to significantly
reduce the runtime for cases where tlists are mostly but not exclusively
Vars. Per my proposal of yesterday.
conversion functions. This allows transaction rollback to revert to a
previous client_encoding setting without doing fresh catalog lookups.
I believe that this explains and fixes the recent report of "failed to commit
client_encoding" failures.
This bug is present in 8.3.x, but it doesn't seem prudent to back-patch
the fix, at least not till it's had some time for field testing in HEAD.
In passing, remove SetDefaultClientEncoding(), which was used nowhere.
we failed to assign, even in "can't happen" cases. Motivated by wondering
what's going on in a recent trouble report where "failed to commit" did
happen.
casting effort whenever the input value was NULL. However this prevents
application of not-null domain constraints in the cases that use this
function, as illustrated in bug #4741. Since this function isn't meant
for use in performance-critical paths anyway, this certainly seems like
another case of "premature optimization is the root of all evil".
Back-patch as far as 8.2; older versions made no effort to enforce
domain constraints here anyway.
temp relations; this is no more expensive than before, now that we have
pg_class.relistemp. Insert tests into bufmgr.c to prevent attempting
to fetch pages from nonlocal temp relations. This provides a low-level
defense against bugs-of-omission allowing temp pages to be loaded into shared
buffers, as in the contrib/pgstattuple problem reported by Stuart Bishop.
While at it, tweak a bunch of places to use new relcache tests (instead of
expensive probes into pg_namespace) to detect local or nonlocal temp tables.
relations (including a temp table's indexes and toast table/index), and
false for normal relations. For ease of checking, this commit just adds
the column and fills it correctly --- revising the relation access machinery
to use it will come separately.
at the same instant as a new backend is spawned. Since CountActiveBackends()
doesn't hold ProcArrayLock, it needs to be prepared for the case that a
pointer at the end of the proc array is still NULL even though numProcs says
it should be valid, since it doesn't hold ProcArrayLock. Backpatch to 8.1.
8.0 and earlier had this right, but it was broken in the split of PGPROC and
sinval shared memory arrays.
Per report and proposal by Marko Kreen.
TupleTableSlots. We have functions for retrieving a minimal tuple from a slot
after storing a regular tuple in it, or vice versa; but these were implemented
by converting the internal storage from one format to the other. The problem
with that is it invalidates any pass-by-reference Datums that were already
fetched from the slot, since they'll be pointing into the just-freed version
of the tuple. The known problem cases involve fetching both a whole-row
variable and a pass-by-reference value from a slot that is fed from a
tuplestore or tuplesort object. The added regression tests illustrate some
simple cases, but there may be other failure scenarios traceable to the same
bug. Note that the added tests probably only fail on unpatched code if it's
built with --enable-cassert; otherwise the bug leads to fetching from freed
memory, which will not have been overwritten without additional conditions.
Fix by allowing a slot to contain both formats simultaneously; which turns out
not to complicate the logic much at all, if anything it seems less contorted
than before.
Back-patch to 8.2, where minimal tuples were introduced.
mode while callers hold pointers to in-memory tuples. I reported this for
the case of nodeWindowAgg's primary scan tuple, but inspection of the code
shows that all of the calls in nodeWindowAgg and nodeCtescan are at risk.
For the moment, fix it with a rather brute-force approach of copying
whenever one of the at-risk callers requests a tuple. Later we might
think of some sort of reference-count approach to reduce tuple copying.
In the backend, I changed only a handful of exemplary or important-looking
instances to make use of the plural support; there is probably more work
there. For the rest of the source, this should cover all relevant cases.
"physical tlist" optimization on the outer relation (ie, force a projection
step to occur in its scan). This avoids storing useless column values when
the outer relation's tuples are written to temporary batch files.
Modified version of a patch by Michael Henderson and Ramon Lawrence.
method to pass extra data to the consistent() and comparePartial() methods.
This is the core infrastructure needed to support the soon-to-appear
contrib/btree_gin module. The APIs are still upward compatible with the
definitions used in 8.3 and before, although *not* with the previous 8.4devel
function definitions.
catversion bump for changes in pg_proc entries (although these are just
cosmetic, since GIN doesn't actually look at the function signature before
calling it...)
Teodor Sigaev and Oleg Bartunov
them from degrading badly when the input is sorted or nearly so. In this
scenario the tree is unbalanced to the point of becoming a mere linked list,
so insertions become O(N^2). The easiest and most safely back-patchable
solution is to stop growing the tree sooner, ie limit the growth of N. We
might later consider a rebalancing tree algorithm, but it's not clear that
the benefit would be worth the cost and complexity. Per report from Sergey
Burladyan and an earlier complaint from Heikki.
Back-patch to 8.2; older versions didn't have GIN indexes.
multiple index entries in a holding area before adding them to the main index
structure. This helps because bulk insert is (usually) significantly faster
than retail insert for GIN.
This patch also removes GIN support for amgettuple-style index scans. The
API defined for amgettuple is difficult to support with fastupdate, and
the previously committed partial-match feature didn't really work with
it either. We might eventually figure a way to put back amgettuple
support, but it won't happen for 8.4.
catversion bumped because of change in GIN's pg_am entry, and because
the format of GIN indexes changed on-disk (there's a metapage now,
and possibly a pending list).
Teodor Sigaev
probes --- the BUFFER_READ_DONE probe provides the same information and more
besides. Expand the LOCK_WAIT_START/DONE probe arguments so that there's
actually some chance of telling what is being waited for. Update and
clean up the documentation.
DTrace probes, so that ordinary reads can be distinguished from relation
extension operations. Move buffer_read_start probe to before the
smgrnblocks() call that's needed in the isExtend case, since really that step
should be charged as part of the time needed for the extension operation.
(This makes it slightly harder to match the read_start with the associated
read_done, since now you can't match them on blockNumber, but it should still
be possible since isExtend operations on the same relation can never be
interleaved.) Per recent discussion.
In passing, add the page identity (forkNum/blockNum) to the parameters of the
buffer_flush_start/buffer_flush_done probes, which were unaccountably lacking
the info.
is still available, but you must now write the long equivalent --inserts
or --column-inserts. This change is made to eliminate confusion with the
use of -d to specify a database name in most other Postgres client programs.
Original patch by Greg Mullane, modified per subsequent discussion.
noise words for the last twelve years, for compatibility with Berkeley-era
output formatting of the special INVALID values for those datatypes.
Considering that the datatypes themselves have been deprecated for awhile,
this is taking backwards compatibility a little far. Per gripe from Josh
Berkus.
distribution, by creating a special fast path for the (first few) most common
values of the outer relation. Tuples having hashvalues matching the MCVs
are effectively forced to be in the first batch, so that we never write
them out to the batch temp files.
Bryce Cutt and Ramon Lawrence, with some editorialization by me.
the cause of the "could not write to log file: Bad file descriptor"
errors reported at
http://archives.postgresql.org//pgsql-general/2008-06/msg00193.php
Backpatch to 8.3, the race condition was introduced by the CSV logging
patch.
Analysis and patch by Gurjeet Singh.
to date, as per bug #4702 and subsequent discussion. In particular, make it
work for years specified using AD/BC or CC fields, and fix the test for "no
year specified" so that it doesn't trigger inappropriately for 1 BC (which it
was doing even in code paths that had nothing to do with to_timestamp). I
also did some minor code beautification in the non-ISO-day-number code path.
This area has been busted all along, but because the code has been rewritten
repeatedly, it would be considerable trouble to back-patch. It's such a
corner case that it doesn't seem worth the effort.
make it include the time for the possible smgropen() call, but that
results in a null pointer dereference :-(.
An alternative solution would be to fetch the buffer tag instead of
looking at *reln, but I'll just put it back as it was for the moment.
BTW, this indicates that DTrace probes evaluate their arguments even
when nominally inactive. What was that about "zero cost", again?
format codes are misapplied to a numeric argument. (The code still produces
a pretty bogus error message in such cases, but I'll settle for stopping the
crash for now.) Per bug #4700 from Sergey Burladyan.
Problem exists in all supported branches, so patch all the way back.
In HEAD, also clean up some ugly coding in the nearby cache management
code.
some bufmgr probes, take out redundant and memory-leak-inducing path arguments
to smgr__md__read__done and smgr__md__write__done, fix bogus attempt to
recalculate space used in sort__done, clean up formatting in places where
I'm not sure pgindent will do a nice job by itself.
are not an alphabetic character although they are not word-breakers too.
So, treat them as part of word.
Per off-list discussion with Dibyendra Hyoju <dibyendra@gmail.com> and
and Bal Krishna Bal <balkrishna7bal@gmail.com> about Nepali language and
Devanagari alphabet.
running pg_restore, which might run in parallel).
Only reopen archive file when we really need to read from it, in parallel code. Otherwise,
close it immediately in a worker, if possible.
exact-match pattern (no wildcard) can be index-optimized in some cases where a
prefix-match pattern cannot; specifically, since the required index clause is
simple equality, it works for regular text/varchar indexes even when the
locale is not C. I'm not sure how often this case really comes up, but since
it requires hardly any additional work to handle it, we might as well get it
right. Motivated by a discussion on the JDBC list.
for consistency with the (relatively) recent addition of typmod to SubLink.
An example of why it's a good idea is to be seen in the recent "failed to
locate grouping columns" bug, which wouldn't have happened if a SubPlan
exposed the same typmod info as the SubLink it was derived from.
This could be back-patched, since it doesn't affect any on-disk data format,
but for the moment it doesn't seem necessary to do so.
by the planning process. This prevents the "failed to locate grouping columns"
error recently reported by Dickson Guedes. That happens because planning
replaces SubLinks by SubPlans in the subquery's targetlist, and exprTypmod()
is smarter about the former than the latter, causing the apparent type of
the subquery's output columns to change. This seems to be a deficiency we
should fix in exprTypmod(), but that will be a much more invasive patch
with possible side-effects elsewhere, so I'll do that only in HEAD.
Back-patch to 8.3. Arguably the lack of a copying step is broken/dangerous
all the way back, but in the absence of known problems I'll refrain from
making the older branches pay the extra cost. (The reason this particular
symptom didn't appear before is that exprTypmod() wasn't smart about SubLinks
either, until 8.3.)