sets the items, and serializes the value back (rather than adding an
arbitrary number of XML preambles as before).
The libxml memory management via palloc had to be disabled because it
crashes when libxml tries to access memory that was helpfully freed
earlier by PostgreSQL. This needs further thought.
database privileges from a pre-8.2 server. This ensures that the reloaded
database will maintain the same behavior it had in the previous installation,
ie, everybody has connect privilege. Per gripe from L Bayuk.
an optarg). Add some comments noting that code in three different files has
to be kept in sync. Fix erroneous description of -S switch (it sets work_mem
not silent_mode), and do some light copy-editing elsewhere in postgres-ref.
form '^(foo)$'. Before, these could never be optimized into indexscans.
The recent changes to make psql and pg_dump generate such patterns (for \d
commands and -t and related switches, respectively) therefore represented
a big performance hit for people with large pg_class catalogs, as seen in
recent gripe from Erik Jones. While at it, be more paranoid about
case-sensitivity checking in multibyte encodings, and fix some other
corner cases in which a regex might be interpreted too liberally.
having md.c return a success/failure boolean to smgr.c, which was just going
to elog anyway, let md.c issue the elog messages itself. This allows better
error reporting, particularly in cases such as "short read" or "short write"
which Peter was complaining of. Also, remove the kluge of allowing mdread()
to return zeroes from a read-beyond-EOF: this is now an error condition
except when InRecovery or zero_damaged_pages = true. (Hash indexes used to
require that behavior, but no more.) Also, enforce that mdwrite() is to be
used for rewriting existing blocks while mdextend() is to be used for
extending the relation EOF. This restriction lets us get rid of the old
ad-hoc defense against creating huge files by an accidental reference to
a bogus block number: we'll only create new segments in mdextend() not
mdwrite() or mdread(). (Again, when InRecovery we allow it anyway, since
we need to allow updates of blocks that were later truncated away.)
Also, clean up the original makeshift patch for bug #2737: move the
responsibility for padding relation segments to full length into md.c.
The purpose is to allow autovacuum-esq conditional vacuuming and
clustering using SQL to discover the required stats.
No documentation updates required. Catalog version updated.
Glen Parker
valid result from a computation if one of the input values was infinity.
The previous code assumed an operation that returned infinity was an
overflow.
Handle underflow/overflow consistently, and add checks for aggregate
overflow.
Consistently prevent Inf/Nan from being cast to integer data types.
Fix INT_MIN % -1 to prevent overflow.
Update regression results for new error text.
Per report from Roman Kononov.
pg_opclass during LookupOpclassInfo(), I'd turned pg_opclass_oid_index
into a critical system index. However the problem could only manifest
during a backend's first attempt to load opclass data, and then only
if it had successfully loaded pg_internal.init and subsequently received
a relcache flush; which made it impossible to reproduce in sequential
tests and darn hard even in parallel tests. Memo to self: when
exercising cache flush scenarios, must disable LookupOpclassInfo's
internal cache too.
or contradictory keys even in cross-data-type scenarios. This is another
benefit of the opfamily rewrite: we can find the needed comparison
operators now.
the 8.1 SQL function definition for it. Per report from Rajesh Kumar Mallah,
such a DBA error doesn't seem at all improbable, and the cost of checking for
it is not very high compared to the cost of running this function. (It would
have been better to change the C name of the function so it wouldn't be called
by the old SQL definition, but it's too late for that now in the 8.2 branch.)
of increasing size, instead of one at a time. This reduces the memory
management overhead when num_temp_buffers is large: in the previous coding
we would actually waste 50% of the space used for temp buffers, because aset.c
would round the individual requests up to 16K. Problem noted while studying
a performance issue reported by Steven Flatt.
Back-patch as far as 8.1 --- older versions used few enough local buffers
that the issue isn't significant for them.
has a small maxBlockSize: the maximum request size that we will treat as a
"chunk" needs to be limited to fit in maxBlockSize. Otherwise we will round
up the request size to the next power of 2, wasting space, which is a bit
pointless if we aren't going to make the blocks big enough to fit additional
stuff in them. The example motivating this is local buffer management, which
makes repeated allocations of 8K (one BLCKSZ buffer) in TopMemoryContext,
which has maxBlockSize = 8K because for the most part allocations there are
small. This leads to each local buffer actually eating 16K of space, which
adds up when there are thousands of them. I intend to change localbuf.c to
aggregate its requests, which will prevent this particular misbehavior, but
it seems likely that similar scenarios could arise elsewhere, so fixing the
core problem seems wise as well.
PQdsplen()) normally, instead of replacing them by \uXXXX sequences.
Assume that they in fact occupy zero screen space for formatting purposes.
Per gripe from Michael Fuhr and ensuing discussion.
involving HashAggregate over SubqueryScan (this is the known case, there
may well be more). The bug is only latent in releases before 8.2 since they
didn't try to access tupletable slots' descriptors during ExecDropTupleTable.
The least bogus fix seems to be to make subqueries share the parent query's
memory context, so that tupdescs they create will have the same lifespan as
those of the parent query. There are comments in the code envisioning going
even further by not having a separate child EState at all, but that will
require rethinking executor access to range tables, which I don't want to
tackle right now. Per bug report from Jean-Pierre Pelletier.
ps_TupFromTlist in plan nodes that make use of it. This was being done
correctly in join nodes and Result nodes but not in any relation-scan nodes.
Bug would lead to bogus results if a set-returning function appeared in the
targetlist of a subquery that could be rescanned after partial execution,
for example a subquery within EXISTS(). Bug has been around forever :-(
... surprising it wasn't reported before.
were marked canSetTag. While it's certainly correct to return the result
of the last one that is marked canSetTag, it's less clear what to do when
none of them are. Since plpgsql will complain if zero is returned, the
8.2.0 behavior isn't good. I've fixed it to restore the prior behavior of
returning the physically last query's result code when there are no
canSetTag queries.
Use a TRY block instead of (inadequate) ad-hoc coding to ensure that
libxml is cleaned up after a failure. Report the intended SQLCODE
instead of defaulting to XX000. Avoid risking use of a dangling
pointer by keeping the persistent error buffer in TopMemoryContext.
Be less trusting that error messages don't contain %.
This patch doesn't do anything about changing the way the messages
are put together --- this is just about mechanism.
bletcherous and unsafe manipulation of global encoding setting.
Clean up libxml reporting mechanism a bit (it still looks like a
dangling-pointer crash waiting to happen, though, not to mention
being far less than sane from a localization standpoint).
the XmlExpr code in various lists, use a representation that has some hope
of reverse-listing correctly (though it's still a de-escaping function
shy of correctness), generally try to make it look more like Postgres
coding conventions.
cases. Operator classes now exist within "operator families". While most
families are equivalent to a single class, related classes can be grouped
into one family to represent the fact that they are semantically compatible.
Cross-type operators are now naturally adjunct parts of a family, without
having to wedge them into a particular opclass as we had done originally.
This commit restructures the catalogs and cleans up enough of the fallout so
that everything still works at least as well as before, but most of the work
needed to actually improve the planner's behavior will come later. Also,
there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way
to create a new family right now is to allow CREATE OPERATOR CLASS to make
one by default. I owe some more documentation work, too. But that can all
be done in smaller pieces once this infrastructure is in place.
operator strategy numbers, ie, GiST and GIN. This is almost cosmetic
enough to not need a catversion bump, but since the opr_sanity regression
test has to change in sync with the catalog entry, I figured I'd better
do one.
are all in new-in-8.2 logic associated with indexability of ScalarArrayOpExpr
(IN-clauses) or amortization of indexscan costs across repeated indexscans
on the inside of a nestloop. In particular:
Fix some logic errors in the estimation for multiple scans induced by a
ScalarArrayOpExpr indexqual.
Include a small cost component in bitmap index scans to reflect the costs of
manipulating the bitmap itself; this is mainly to prevent a bitmap scan from
appearing to have the same cost as a plain indexscan for fetching a single
tuple.
Also add a per-index-scan-startup CPU cost component; while prior releases
were clearly too pessimistic about the cost of repeated indexscans, the
original 8.2 coding allowed the cost of an indexscan to effectively go to zero
if repeated often enough, which is overly optimistic.
Pay some attention to index correlation when estimating costs for a nestloop
inner indexscan: this is significant when the plan fetches multiple heap
tuples per iteration, since high correlation means those tuples are probably
on the same or adjacent heap pages.
joinclause doesn't use any outer-side vars) requires a "bushy" plan to be
created. The normal heuristic to avoid joins with no joinclause has to be
overridden in that case. Problem is new in 8.2; before that we forced the
outer join order anyway. Per example from Teodor.
representing externally-supplied values, since the APIs that carry such
values only specify type not typmod. However, for PARAM_SUBLINK Params
it is handy to carry the typmod of the sublink's output column. This
is a much cleaner solution for the recently reported 'could not find
pathkey item to sort' and 'failed to find unique expression in subplan
tlist' bugs than my original 8.2-compatible patch. Besides, someday we
might want to support typmods for external parameters ...
in normal operation, and we can avoid rewriting pg_control at every log
segment switch if we don't insist that these values be valid. Reducing
the number of pg_control updates is a good idea for both performance and
reliability. It does make pg_resetxlog's life a bit harder, but that seems
a good tradeoff; and anyway the change to pg_resetxlog amounts to automating
something people formerly needed to do by hand, namely look at the existing
pg_xlog files to make sure the new WAL start point was past them.
In passing, change the wording of xlog.c's "database system was interrupted"
messages: describe the pg_control timestamp as "last known up at" rather than
implying it is the exact time of service interruption. With this change the
timestamp will generally be the time of the last checkpoint, which could be
many minutes before the failure; and we've already seen indications that
people tend to misinterpret the old wording.
initdb forced due to change in pg_control layout. Simon Riggs and Tom Lane
release it in a subtransaction abort, but this neglects possibility that
someone outside SPI already did. Fix is for spi.c to forget about a tuptable
as soon as it's handed it back to the caller.
Per bug #2817 from Michael Andreen.
rearrangeable outer joins and the WHERE clause is non-strict and mentions
only nullable-side relations. New bug in 8.2, caused by new logic to allow
rearranging outer joins. Per bug #2807 from Ross Cohen; thanks to Jeff
Davis for producing a usable test case.
a sublink's test expression have the correct vartypmod, rather than defaulting
to -1. There's at least one place where this is important because we're
expecting these Vars to be exactly equal() to those appearing in the subplan
itself. This is a pretty klugy solution --- it would likely be cleaner to
change Param nodes to include a typmod field --- but we can't do that in the
already-released 8.2 branch.
Per bug report from Hubert Fongarnand.
identify long-running transactions. Since we already need to record
the transaction-start time (e.g. for now()), we don't need any
additional system calls to report this information.
Catversion bumped, initdb required.
capitalize the strings like sentences. Remove unnecessarily
specific descriptions of the units used by GUC variables, since
we now allow any reasonable unit to be specified.
FormatMessage() (This should have been in 8.2.0, patched to 8.2.X and
HEAD):
I think this problem to be complex....
http://archives.postgresql.org/pgsql-hackers/2006-11/msg00042.php
FormatMessage of windows cannot consider the encoding of the database.
However, I should try the solution now. It is necessary to clear the
problem.
Multi character-code exists together in message and log. It doesn't
consider
the data base encoding that the user intended....
The user in multi-byte country can try this.
http://inet.winpg.jp/~saito/pg_bug/MessageCheck.c
That is, it is likely to become it in this manner.(Japanese)
http://inet.winpg.jp/~saito/pg_bug/FormatMessage998.png
Hiroshi Saito
by name on each and every row processed. Profiling suggests this may
buy a percent or two for simple UPDATE scenarios, which isn't huge,
but when it's so easy to get ...
by the change to make limit values int8 instead of int4. (Specifically, you
can do DatumGetInt32 safely on a null value, but not DatumGetInt64.) Per
bug #2803 from Greg Johnson.
should allow delete-pending files to actually go away, and thereby work
around the various complaints we've seen about 'permission denied'
errors in such cases. Should be reasonably harmless in any case...
StartupXLOG and ShutdownXLOG no longer need to be critical sections, because
in all contexts where they are invoked, elog(ERROR) would be translated to
elog(FATAL) anyway. (One change in bgwriter.c is needed to make this true:
set ExitOnAnyError before trying to exit. This is a good fix anyway since
the existing code would have gone into an infinite loop on elog(ERROR) during
shutdown.) That avoids a misleading report of PANIC during semi-orderly
failures. Modify the postmaster to include the startup process in the set of
processes that get SIGTERM when a fast shutdown is requested, and also fix it
to not try to restart the bgwriter if the bgwriter fails while trying to write
the shutdown checkpoint. Net result is that "pg_ctl stop -m fast" does
something reasonable for a system in warm standby mode, and so should Unix
system shutdown (ie, universal SIGTERM). Per gripe from Stephen Harris and
some corner-case testing of my own.
remove page on next level linked from next inner page, ginScanToDelete()
wrongly sets parent page. Bug reveals when many item pointers from index
was deleted ( several hundred thousands).
Bug is discovered by hubert depesz lubaczewski <depesz@gmail.com>
Suppose, we need rc2 before release...
result now depends on the lc_messages setting, as noted by Bruce.
Also, mark to_number() and the numeric-type variants of to_char() as stable,
because their results depend on lc_numeric; this is a longstanding oversight.
Also, mark to_date() and to_char(interval) as stable; although these appear
not to depend on any GUC variables as of CVS HEAD, that seems a property
unlikely to survive future improvements. It seems best to mark all the
formatting functions stable and be done with it.
catversion not bumped, because this does not seem critical enough to force
a post-RC1 initdb, and anyway we cannot do so in the back branches.
(in particular, causing the ReadyForQuery message to be eaten) before
returning from do_copy. The only known consequence of failing to do so is
that get_prompt might show a wrong result for the %x transaction status
escape, as reported by Bernd Helmle; but it's possible there are other issues.
Back-patch as far as 7.4, the oldest version supporting %x.
vacuum/analyze timestamp columns at the end, rather than at a random
spot in the middle as in the original patch. This was deemed more usable
as well as less likely to break existing application code. initdb forced
accordingly. In passing, remove former kluge for initializing
pg_stat_file()'s pg_proc entry --- bootstrap mode was fixed recently
so that this can be done without any hacks, but I overlooked this usage.
HeapTuple that is no longer allocated as a single palloc() block; if
used carelessly, this might result in a subsequent memory leak after
heap_freetuple().
AbortTransaction, which would lead to recursion and eventual PANIC exit
as illustrated in recent report from Jeff Davis. First, in xact.c create
a special dedicated memory context for AbortTransaction to run in. This
solves the problem as long as AbortTransaction doesn't need more than 32K
(or whatever other size we create the context with). But in corner cases
it might. Second, in trigger.c arrange to keep pending after-trigger event
records in separate contexts that can be freed near the beginning of
AbortTransaction, rather than having them persist until CleanupTransaction
as before. Third, in portalmem.c arrange to free executor state data
earlier as well. These two changes should result in backing off the
out-of-memory condition before AbortTransaction needs any significant
amount of memory, at least in typical cases such as memory overrun due
to too many trigger events or too big an executor hash table. And all
the same for subtransaction abort too, of course.
zic's Europe/London, rather than Europe/Dublin as before. This seems
a less surprising choice, particularly with respect to dates before
1948. Original suggestion was to translate to straight GMT, but this
seems wrong given that these zones *are* DST-aware. Per offlist
discussion with Magnus.
in the middle of executing a SPI query. This doesn't entirely fix the
problem of memory leakage in plpgsql exception handling, but it should
get rid of the lion's share of leakage.
because on that platform strftime produces localized zone names in varying
encodings. Even though it's only in a comment, this can cause encoding
errors when reloading the dump script. Per suggestion from Andreas
Seltenreich. Also, suppress %Z on Windows in the %s escape of
log_line_prefix ... not sure why this one is different from the other two,
but it shouldn't be.
python 2.5. This involves fixing several violations of the published
spec for creating PyTypeObjects, and adding another regression test
expected output for yet another variation of error message spelling.
Windows), arrange for each postmaster child process to be its own process
group leader, and deliver signals SIGINT, SIGTERM, SIGQUIT to the whole
process group not only the direct child process. This provides saner behavior
for archive and recovery scripts; in particular, it's possible to shut down a
warm-standby recovery server using "pg_ctl stop -m immediate", since delivery
of SIGQUIT to the startup subprocess will result in killing the waiting
recovery_command. Also, this makes Query Cancel and statement_timeout apply
to scripts being run from backends via system(). (There is no support in the
core backend for that, but it's widely done using untrusted PLs.) Per gripe
from Stephen Harris and subsequent discussion.
Typo in the changes to plperl - uses wrong dir, and had a missing slash.
Also fixes error checking for xsubpp - it was broken in a way that hid
the problem above when run more than once (which is the normal case when
developing).
Negotiation failure is only likely to happen if one side or the other is
misconfigured, eg. bad client certificate. I'm not 100% convinced that
a retry is really the best thing, hence not back-patching this fix for now.
Per gripe from Sergio Cinos.
promoted to FATAL) end in exit(1) not exit(0). Then change the postmaster to
allow exit(1) without a system-wide panic, but not for the startup subprocess
or the bgwriter. There were a couple of places that were using exit(1) to
deliberately force a system-wide panic; adjust these to be exit(2) instead.
This fixes the problem noted back in July that if the startup process exits
with elog(ERROR), the postmaster would think everything is hunky-dory and
proceed to start up. Alternative solutions such as trying to run the entire
startup process as a critical section seem less clean, primarily because of
the fact that a fair amount of startup code is shared by all postmaster
children in the EXEC_BACKEND case. We'd need an ugly special case somewhere
near the head of main.c to make it work if it's the child process's
responsibility to determine what happens; and what's the point when the
postmaster already treats different children differently?
* New versions of OpenSSL come with proper debug versions, and use
suffixed names on the LIBs for that. Adapts library handling to deal
with that.
* Fixes error where it incorrectly enabled Kerberos based on NLS
configuration instead of Kerberos configuration
* Specifies path of perl in config, instead of using current one.
Required when using a 64-bit perl normally, but want to build pl/perl
against 32-bit one (required)
* Fix so pgevent generates win32ver.rc automatically
Magnus Hagander
any no-longer-needed segments; just truncate them to zero bytes and leave
the files in place for possible future re-use. This avoids problems when
the segments are re-used due to relation growth shortly after truncation.
Before, the bgwriter, and possibly other backends, could still be holding
open file references to the old segment files, and would write dirty blocks
into those files where they'd disappear from the view of other processes.
Back-patch as far as 8.0. I believe the 7.x branches are not vulnerable,
because they had no bgwriter, and "blind" writes by other backends would
always be done via freshly-opened file references.
preference for filling pages out-of-order tends to confuse the sanity checks
in md.c, as per report from Balazs Nagy in bug #2737. The fix is to ensure
that the smgr-level code always has the same idea of the logical EOF as the
hash index code does, by using ReadBuffer(P_NEW) where we are adding a single
page to the end of the index, and using smgrextend() to reserve a large batch
of pages when creating a new splitpoint. The patch is a bit ugly because it
avoids making any changes in md.c, which seems the most prudent approach for a
backpatchable beta-period fix. After 8.3 development opens, I'll take a look
at a cleaner but more invasive patch, in particular getting rid of the now
unnecessary hack to allow reading beyond EOF in mdread().
Backpatch as far as 7.4. The bug likely exists in 7.3 as well, but because
of the magnitude of the 7.3-to-7.4 changes in hash, the later-version patch
doesn't even begin to apply. Given the other known bugs in the 7.3-era hash
code, it does not seem worth trying to develop a separate patch for 7.3.
cases where we already hold the desired lock "indirectly", either via
membership in a MultiXact or because the lock was originally taken by a
different subtransaction of the current transaction. These cases must be
accounted for to avoid needless deadlocks and/or inappropriate replacement of
an exclusive lock with a shared lock. Per report from Clarence Gardner and
subsequent investigation.
of an index on a serial column, rather than the name of the associated
sequence. Fallout from recent changes in dependency setup for serials.
Per bug #2732 from Basil Evseenko.
added to information_schema (per a SQL2003 addition). The original coding
failed if a referenced column participated in more than one pg_constraint
entry. Also, it did not work if an FK relied directly on a unique index
without any constraint syntactic sugar --- this case is outside the SQL spec,
but PG has always supported it, so it's reasonable for our information_schema
to handle it too. Per bug#2750 from Stephen Haberman.
Although this patch changes the initial catalog contents, I didn't force
initdb. Any beta3 testers who need the fix can install it via CREATE OR
REPLACE VIEW, so forcing them to initdb seems an unnecessary imposition.
accurately: we have to distinguish the effects of the join's own ON
clauses from the effects of pushed-down clauses. Failing to do so
was a quick hack long ago, but it's time to be smarter. Per example
from Thomas H.
30 seconds instead of retrying forever. Also modify xlog.c so that if
it fails to rename an old xlog segment up to a future slot, it will
unlink the segment instead. Per discussion of bug #2712, in which it
became apparent that Windows can handle unlinking a file that's being
held open, but not renaming it.
The former coding relied on the actual allocated size of the last block,
which made it behave strangely if the first allocation in a context was
larger than ALLOC_CHUNK_LIMIT: subsequent allocations would be referenced
to that and not to the intended series of block sizes. Noted while
studying a memory wastage gripe from Tatsuo.
more space is needed, instead of incrementing by a fixed amount; the old
method wastes lots of space and time when the ultimate size is large.
Per gripe from Tatsuo.