condition: when there are multiple possible index paths involving
ScalarArrayOpExprs, they are logically to be ANDed together not ORed.
This thinko was a direct consequence of trying to put the processing
inside generate_bitmap_or_paths(), which I now see was a bit too cute.
So pull it out and make the callers do it separately (there are only two
that need it anyway). Partially responds to bug #2441 from Arjen van der Meijden.
There are some additional infelicities exposed by his example, but they
are also in 8.1.x, while this mistake is not.
sections now isn't nested. All user-defined functions now is
called outside critsections. Small improvements in WAL
protocol.
TODO: improve XLOG replay
always has been, because it's not got any .globl declaration! We've
been relying on the solaris_sparc.s code instead. Rip it out.
(Not back-patched, since this is just cosmetic cleanup.)
throw warnings for 100%-SQL-standard constructs, clean up some minor
infelicities, try to un-break ecpg to the best of my ability. (It's not clear
how ecpg is going to find out the setting of standard_conforming_strings,
though.) I think pg_dump still needs work, too.
(relpages/reltuples). To do this, create formal support in heapam.c for
"overwrite" tuple updates (including xlog replay capability) and use that
instead of the ad-hoc overwrites we'd been using in VACUUM and CREATE INDEX.
Take the responsibility for updating stats during CREATE INDEX out of the
individual index AMs, and do it where it belongs, in catalog/index.c. Aside
from being more modular, this avoids having to update the same tuple twice in
some paths through CREATE INDEX. It's probably not measurably faster, but
for sure it's a lot cleaner than before.
into a single mostly-physical-order scan of the index. This requires some
ticklish interlocking considerations, but should create no material
performance impact on normal index operations (at least given the
already-committed changes to make scans work a page at a time). VACUUM
itself should get significantly faster in any index that's degenerated to a
very nonlinear page order. Also, we save one pass over the index entirely,
except in the case where there were no deletions to do and so only one pass
happened anyway.
Original patch by Heikki Linnakangas, rework by Tom Lane.
btgettuple and btgetmulti). This eliminates the problem of "re-finding" the
exact stopping point, since the stopping point is effectively always a page
boundary, and index items are never moved across pre-existing page boundaries.
A small penalty is that the keys_are_unique optimization is effectively
disabled (and, therefore, is removed in this patch), causing us to apply
_bt_checkkeys() to at least one more tuple than necessary when looking up a
unique key. However, the advantages for non-unique cases seem great enough to
accept this tradeoff. Aside from simplifying and (sometimes) speeding up the
indexscan code, this will allow us to reimplement btbulkdelete as a largely
sequential scan instead of index-order traversal, thereby significantly
reducing the cost of VACUUM. Those changes will come in a separate patch.
Original patch by Heikki Linnakangas, rework by Tom Lane.
it's not necessary to have three separate calls anymore. This patch also
fixes things so we don't try to read pg_internal.init until after we've
obtained lock on the target database; which was fairly harmless, but it's
certainly cleaner this way.
The former approach used ExclusiveLock on pg_database, which being a
cluster-wide lock meant only one of these operations could proceed at
a time; worse, it also blocked all incoming connections in ReverifyMyDatabase.
Now that we have LockSharedObject(), we can use locks of different types
applied to databases considered as objects. This allows much more
flexible management of the interlocking: two CREATE DATABASEs need not
block each other, and need not block connections except to the template
database being used. Similarly DROP DATABASE doesn't block unrelated
operations. The locking used in flatfiles.c is also much narrower in
scope than before. Per recent proposal.
in various places that were previously doing ad hoc pg_database searches.
This may speed up database-related privilege checks a little bit, but
the main motivation is to eliminate the performance reason for having
ReverifyMyDatabase do such a lot of stuff (viz, avoiding repeat scans
of pg_database during backend startup). The locking reason for having
that routine is about to go away, and it'd be good to have the option
to break it up.
initPlan sets a parameter for another. This could not (I think) happen before
8.1, but it's possible now because the initPlans generated by MIN/MAX
optimization might themselves use initPlans. We attach those initPlans as
siblings of the MIN/MAX ones, not children, to avoid duplicate computation
when multiple MIN/MAX aggregates are present; so this leads to the case of an
initPlan needing the result of a sibling initPlan, which is not possible with
ordinary query nesting. Hadn't been noticed because in most contexts having
too much stuff listed in extParam is fairly harmless. Fixes "plan should not
reference subplan's variable" bug reported by Catalin Pitis.
This formulation requires every AM to provide amvacuumcleanup, unlike before,
but it's surely a whole lot cleaner. Also, add an 'amstorage' column to
pg_am so that we can get rid of hardwired knowledge in DefineOpClass().
the union of its child relations as well. This might have been a good idea
when it was originally coded, but it's a fatally bad idea when inheritance is
being used for partitioning. It's better to have no stats at all than
completely misleading stats. Per report from Mark Liberman.
The bug arguably exists all the way back, but I've only patched HEAD and 8.1
because we weren't particularly trying to support partitioning before 8.1.
Eventually we ought to look at deriving union statistics instead of just
punting, but for now the drop kick looks good.
input datatypes given, and use this before trying OpernameGetCandidates.
This is faster than the old method when there's an exact match, and it
does not seem materially slower when there's not. And it definitely
makes some of the callers cleaner, because they didn't really want to
know about a list of candidates anyway. Per discussion with Atsushi Ogawa.
support both FOR UPDATE and FOR SHARE in one command, as well as both
NOWAIT and normal WAIT behavior. The more general code is actually
simpler and cleaner.
MIN/MAX not be converted to use an index if the query WHERE clause contains
any volatile functions or subplans.
I had originally feared that the conversion might alter the behavior of such a
query with respect to a volatile function. Well, so it might, but only in the
sense that the function would get evaluated at a subset of the table rows
rather than all of them --- and we have never made any such guarantee anyway.
(For instance, we don't refuse to use an index for an ordinary non-aggregate
query when one of the non-indexable filter conditions contains a volatile
function.)
The prohibition against subplans was because of worry that that case wasn't
adequately tested, which it wasn't, but it turns out to be possible to make
8.1 fail anyway:
regression=# select o.ten, (select max(unique2) from tenk1 i where ten = o.ten
or ten = (select f1 from int4_tbl limit 1)) from tenk1 o;
ERROR: direct correlated subquery unsupported as initplan
This is due to bogus code in SS_make_initplan_from_plan (it's an initplan,
ergo it can't have any parParams). Having fixed that, we might as well allow
subplans as well as initplans.
be exported on Linux and Darwin. We already did this on Windows but
that's not enough, as evidenced by the fact that libecpg had an unexpected
dependency on one such symbol. We should try to do it on more platforms.
Fix ecpg's oversight, and bump libpq's major .so version number to reflect
the unwanted but nonetheless real ABI break.
cases. This was not needed in the existing uses within selfuncs.c, but if
we're gonna export it for general use, the extra generality seems helpful.
Motivated by looking at ltree example.
> >> >> > 1) named parameters additionally to args[]
> >> >> > 2) return composite-types from plpython as dictionary
> >> >> > 3) return result-set from plpython as list, iterator or generator
1) named parameters additionally to args[]
2) return composite-types from plpython as dictionary
3) return result-set from plpython as list, iterator or generator
Hannu Krosing
Sven Suursoho
In the SSL code in libpq it does some processing with DH parameters:
SSL_CTX_set_tmp_dh_callback()
This function is marked as server use only[1], the client always uses
the DH parameters in the server, so all the code in the client dealing
with the DH parameters is useless. This patch removes it.
It's not clear why the code was added in the first place, it's been
there almost since the beginning[2]. At the time there was a suggestion
of merging the front-end and backend SSL code, but looking at the
changes since, that seems unlikely.
As a further example, the s_server program allows you to specify DH
params, but s_client doesn't. In the GnuTLS documentation under
gnutls_dh_params_generate2() it says[3]:
Also note that the DH parameters are only useful to servers. Since
clients use the parameters sent by the server, it's of no use to call
this in client side.
shutdown, or when requested by a backend:
It changes so the file is only written once every 5 minutes (changeable
of course, I just picked something) instead of once every half second.
It's still written when the stats collector shuts down, just as before.
And it is now also written on backend request. A backend requests a
rewrite by simply sending a special stats message. It operates on the
assumption that the backends aren't actually going to read the
statistics file very often, compared to how frequent it's written today.
Magnus Hagander
set to the large object context ("fscxt"), as this is inevitably a source of
transaction-duration memory leaks. Not sure why we'd not noticed it before;
maybe people weren't touching a whole lot of LOs in the same transaction
before the 8.1 pg_dump changes. Per report from Wayne Conrad.
Backpatched as far as 8.1, but the problem doubtless goes all the way back.
I'm disinclined to spend the time to try to verify that the older branches
would still work if patched, seeing that this code was significantly modified
for 8.0 and again for 8.1, and that we don't have any trouble reports before
8.1. (Maybe the leaks were smaller before?)
thereby saving a visit to the metapage in most index searches/updates.
This wouldn't actually save any I/O (since in the old regime the metapage
generally stayed in cache anyway), but it does provide a useful decrease
in bufmgr traffic in high-contention scenarios. Per my recent proposal.
implied by the predicate of a partial index being used to scan a table.
However, this optimization is unsafe in an UPDATE, DELETE, or SELECT FOR
UPDATE query, because the quals need to be rechecked by EvalPlanQual if
there's an update conflict. Per example from Jean-Samuel Reynaud.
transaction_timestamp() (just like now()).
Also update statement_timeout() to mention it is statement arrival time
that is measured.
Catalog version updated.
not named ones, and replace linear searches of the list with array indexing.
The named-parameter support has been dead code for many years anyway,
and recent profiling suggests that the searching was costing a noticeable
amount of performance for complex queries.
to track the number of LWLock acquisitions and the number of times we
block waiting for an LWLock, on a per-process basis. After having needed
this twice in the past few months, seems like it should go into CVS.
of rejecting palloc(0). Also, tweak like_selectivity() to avoid assuming
the presented pattern is nonempty; although that assumption is valid,
it doesn't really help much, and the new coding is more correct anyway
since it properly handles redundant wildcards. In combination these
changes should eliminate a Coverity warning noted by Martijn.
whenever we start to read within that file. The first page carries
extra identification information that really ought to be checked, but
as the code stood, this was only checked when we switched sequentially
into a new WAL file, or if by chance the starting checkpoint record was
within the first page. This patch ensures that we will detect bogus
'long header' information before we start replaying the WAL sequence.
symbol for PPC64 hardware. I hadn't known that Apple supported PPC64 at
all, but darn if there aren't 64-bit variant libraries in OS X as well
as support in their gcc.
no evidence that any currently-supported platform needs this, and good
reason to think that any platform that did need it couldn't use the static
libraries anyway --- libpq, at least, has circular references. Removing
the code shuts up tsort warnings about the circular references on some
platforms.
8.0), and add as suggestion to use log_min_error_statement for this
purpose. I also fixed the code so the first EXECUTE has it's prepare,
rather than the last which is what was in the current code. Also remove
"protocol" prefix for SQL EXECUTE output because it is not accurate.
Backpatch to 8.1.X.
to occur between pg_start_backup() and pg_stop_backup(), even if the GUC
setting full_page_writes is OFF. Per discussion, doing this in combination
with the already-existing checkpoint during pg_start_backup() should ensure
safety against partial page updates being included in the backup. We do
not have to force full page writes to occur during normal PITR operation,
as I had first feared.
CREATE AGGREGATE aggname (input_type) (parameter_list)
along with the old syntax where the input type was named in the parameter
list. This fits more naturally with the way that the aggregate is identified
in DROP AGGREGATE and other utility commands; furthermore it has a natural
extension to handle multiple-input aggregates, where the basetype-parameter
method would get ugly. In fact, this commit fixes the grammar and all the
utility commands to support multiple-input aggregates; but DefineAggregate
rejects it because the executor isn't fixed yet.
I didn't do anything about treating agg(*) as a zero-input aggregate instead
of artificially making it a one-input aggregate, but that should be considered
in combination with supporting multi-input aggregates.
update no-longer-existing pages to fall through as no-ops, but make a note
of each page number referenced by such records. If we don't see a later
XLOG entry dropping the table or truncating away the page, complain at
the end of XLOG replay. Since this fixes the known failure mode for
full_page_writes = off, revert my previous band-aid patch that disabled
that GUC variable.
If a process abandons a wait in LockBufferForCleanup (in practice,
only happens if someone cancels a VACUUM) just before someone else
sends it a signal indicating the buffer is available, it was possible
for the wakeup to remain in the process' semaphore, causing misbehavior
next time the process waited for an lmgr lock. Rather than try to
prevent the race condition directly, it seems best to make the lock
manager robust against leftover wakeups, by having it repeat waiting
on the semaphore if the lock has not actually been granted or denied
yet.
alternatives ("|" symbol). The original coding allowed the added ^ and $
constraints to be absorbed into the first and last alternatives, producing
a pattern that would match more than it should. Per report from Eric Noriega.
I also changed the pattern to add an ARE director ("***:"), ensuring that
SIMILAR TO patterns do not change behavior if regex_flavor is changed. This
is necessary to make the non-capturing parentheses work, and seems like a
good idea on general principles.
Back-patched as far as 7.4. 7.3 also has the bug, but a fix seems impractical
because that version's regex engine doesn't have non-capturing parens.
upper-level insertion completes a previously-seen split, we cannot simply grab
the downlink block number out of the buffer, because the buffer could contain
a later state of the page --- or perhaps the page doesn't even exist at all
any more, due to relation truncation. These possibilities have been masked up
to now because the use of full_page_writes effectively ensured that no xlog
replay routine ever actually saw a page state newer than its own change.
Since we're deprecating full_page_writes in 8.1.*, there's no need to fix this
in existing release branches, but we need a fix in HEAD if we want to have any
hope of re-allowing full_page_writes. Accordingly, adjust the contents of
btree WAL records so that we can always get the downlink block number from the
WAL record rather than having to depend on buffer contents. Per report from
Kevin Grittner and Peter Brant.
Improve a few comments in related code while at it.
# Need to recomple any libpgport object files because we need these
# object files to use the same compile flags as libpq. If we used
# the object files from libpgport, this would not be true on all
# platforms.
had a bad side-effect: it stopped finding plans that involved BitmapAnd
combinations of indexscans using both join and non-join conditions. Instead,
make choose_bitmap_and more aggressive about detecting redundancies between
BitmapOr subplans.
at least one join condition as an indexqual. Before bitmap indexscans, this
oversight didn't really cost much except for redundantly considering the
same join paths twice; but as of 8.1 it could result in silly bitmap scans
that would do the same BitmapOr twice and then BitmapAnd these together :-(
when trying to locate the referent of a RECORD variable. This fixes the
'record type has not been registered' failure reported by Stefan
Kaltenbrunner about a month ago. A side effect of the way I chose to
fix it is that most variable references in join conditions will now be
properly labeled with the variable's source table name, instead of the
not-too-helpful 'outer' or 'inner' we used to use.
identically named user and group: we merge these into a single entity
with LOGIN permission. Also, add ORDER BY commands to ensure consistent
dump ordering, for ease of comparing outputs from different installations.
output, ie, no OR immediately below an OR. Otherwise we get Asserts or
wrong answers for cases such as
select * from tenk1 a, tenk1 b
where (a.ten = b.ten and (a.unique1 = 100 or a.unique1 = 101))
or (a.hundred = b.hundred and a.unique1 = 42);
Per report from Rafael Martinez Guerrero.
Per recent discussion, this seems to be making the stats less accurate
rather than more so, particularly on Windows where PID values may be
reused very quickly. Patch by Peter Brant.
generated text files. Fix build of that file, too.
Put the text files in the right place during make dist, so there are no
extra manual steps required anymore.
that apply the necessary domain constraint checks immediately. This fixes
cases where domain constraints went unchecked for statement parameters,
PL function local variables and results, etc. We can also eliminate existing
special cases for domains in places that had gotten it right, eg COPY.
Also, allow domains over domains (base of a domain is another domain type).
This almost worked before, but was disallowed because the original patch
hadn't gotten it quite right.
files of the same languages. That way, similar or equal translations in
different programs are automatically propagated and the life of translators
becomes a little bit easier.
XLOG_BLCKSZ. This ought to help in preventing configuration mismatch
problems if anyone tries to ship PITR files between servers compiled
with different XLOG_BLCKSZ settings. Simon Riggs
instead a dedicated symbol. This probably makes no functional difference
for likely values of BLCKSZ, but it makes the intent clearer.
Simon Riggs, minor editorialization by Tom Lane.
functions are not strict, they will be called (passing a NULL first parameter)
during any attempt to input a NULL value of their datatype. Currently, all
our input functions are strict and so this commit does not change any
behavior. However, this will make it possible to build domain input functions
that centralize checking of domain constraints, thereby closing numerous holes
in our domain support, as per previous discussion.
While at it, I took the opportunity to introduce convenience functions
InputFunctionCall, OutputFunctionCall, etc to use in code that calls I/O
functions. This eliminates a lot of grotty-looking casts, but the main
motivation is to make it easier to grep for these places if we ever need
to touch them again.
used within WAL files. Historically this was the same as the data file
BLCKSZ, but there's no necessary connection, and it's possible that
performance gains might ensue from reducing XLOG_BLCKSZ. In any case
distinguishing two symbols should improve code clarity. This commit
does not actually change the page size, only provide the infrastructure
to make it possible to do so. initdb forced because of addition of a
field to pg_control.
Mark Wong, with some help from Simon Riggs and Tom Lane.
to fix regressions introduced in the recent patch adding additional
\connect options. This is based on work by Volkan YAZICI, although
this version of the patch doesn't bear much resemblance to Volkan's
version.
\connect takes 4 optional arguments: database name, user name, host
name, and port number. If any of those parameters are omitted or
specified as "-", the value of that parameter from the previous
connection is used instead; if there is no previous connection,
the libpq default is used. Note that this behavior makes it
impossible to reuse the libpq defaults without quitting psql and
restarting it; I don't really see the use case for needing to do
that.
whitespace issues nearby.
DROP OWNED BY is actually a bit kludgy, but it seems better to do it this way
rather than duplicating the words_after_create list just to add a single
element.
incrementally by successive inserts rather than by sorting the data.
We were only using the slow path during bootstrap, apparently because
when first written it failed during bootstrap --- but it works fine now
AFAICT. Removing it saves a hundred or so lines of code and produces
noticeably (~10%) smaller initial states of the system catalog indexes.
While that won't make much difference for heavily-modified catalogs,
for the more static ones there may be a useful long-term performance
improvement.
misleadingly-named WriteBuffer routine, and instead require routines that
change buffer pages to call MarkBufferDirty (which does exactly what it says).
We also require that they do so before calling XLogInsert; this takes care of
the synchronization requirement documented in SyncOneBuffer. Note that
because bufmgr takes the buffer content lock (in shared mode) while writing
out any buffer, it doesn't matter whether MarkBufferDirty is executed before
the buffer content change is complete, so long as the content change is
completed before releasing exclusive lock on the buffer. So it's OK to set
the dirtybit before we fill in the LSN.
This eliminates the former kluge of needing to set the dirtybit in LockBuffer.
Aside from making the code more transparent, we can also add some new
debugging assertions, in particular that the caller of MarkBufferDirty must
hold the buffer content lock, not merely a pin.
torn-page problems. This introduces some issues of its own, mainly
that there are now some critical sections of unreasonably broad scope,
but it's a step forward anyway. Further cleanup will require some
code refactoring that I'd prefer to get Oleg and Teodor involved in.
startup or recovery process. Since such a process isn't a real backend,
pgstat.c gets confused. This accounts for recent reports of strange
"invalid server process ID -1" log messages during crash recovery.
There isn't any point in attempting to make the report, since we'll discard
stats in such scenarios anyhow.
This commit doesn't make much functional change, but it does eliminate some
duplicated code --- for instance, PageIsNew tests are now done inside
XLogReadBuffer rather than by each caller.
The GIST xlog code still needs a lot of love, but I'll worry about that
separately.
have symlinks (ie, Windows). Although it'll never be called on to do anything
useful during normal operation on such a platform, it's still needed to
re-create dropped directories during WAL replay.
failures even when the hardware and OS did nothing wrong. Per recent analysis
of a problem report from Alex Bahdushka.
For the moment I've just diked out the test of the parameter, rather than
removing the GUC infrastructure and documentation, in case we conclude that
there's something salvageable there. There seems no chance of it being
resurrected in the 8.1 branch though.
passed extend = true whenever we are reading a page we intend to reinitialize
completely, even if we think the page "should exist". This is because it
might indeed not exist, if the relation got truncated sometime after the
current xlog record was made and before the crash we're trying to recover
from. These two thinkos appear to explain both of the old bug reports
discussed here:
http://archives.postgresql.org/pgsql-hackers/2005-05/msg01369.php
tuples as needed "to keep VACUUM from complaining", but actually there is
a more compelling reason to do it: failure to do so violates MVCC semantics.
This is because a pre-existing serializable transaction might try to use
the index after we finish (re)building it, and it might fail to find tuples
it should be able to see. We got this mostly right, but not in the case
of partial indexes: the code mistakenly discarded recently-dead tuples for
partial indexes. Fix that, and adjust the comments.
when an error occurs during xlog replay. Also, replace the former risky
'write into a fixed-size buffer with no overflow detection' API for XLOG
record description routines; use an expansible StringInfo instead. (The
latter accounts for most of the patch bulk.)
Qingqing Zhou
command or expression, rather than one copy for each textual occurrence as
it did before. This might result in some small performance improvement,
but the compelling reason to do it is that not doing so can result in
unexpected grouping failures because the main SQL parser won't see different
parameter numbers as equivalent. Add a regression test for the failure case.
Per report from Robert Davidson.
the logic it contained to switch to insertion sort for near-sorted input was
in fact a big loss, because it could fairly easily be fooled into applying
insertion sort to large subfiles that weren't all that well ordered. Remove
that, and instead add a simple check for already-perfectly-sorted input, as
per suggestion from Dann Corbit. This adds at worst O(N*lgN) overhead, and
usually far less, while sometimes allowing a subfile sort to finish in O(N)
time. Preliminary testing says this is an improvement over the basic
Bentley & McIlroy code for many nonrandom inputs, and it costs almost
nothing when the input is random.
> 1) Fix the problems with the \s command.
> When the saveHistory is executed by the \s command we must not do the
> conversion \n -> \x01 (per
> http://archives.postgresql.org/pgsql-hackers/2006-03/msg00317.php )
>
> 2) Fix the handling of Ctrl+C
>
> Now when you do
> wsdb=# select 'your long query here '
> wsdb-#
> and press afterwards the CtrlC the line "select 'your long query here
'"
> will be in the history
>
> (partly per
> http://archives.postgresql.org/pgsql-hackers/2006-03/msg00297.php )
>
> 3) Fix the handling of commands with not closed brackets, quotes,
double
> quotes. (now those commands are not splitted in parts...)
>
> 4) Fix the behaviour when SINGLELINE mode is used. (before it was
almost
> broken ;(
Sergey E. Koposov
byte-swapping on the port number which causes the call to fail on Intel
Macs.
This patch uses htons() instead of htonl() and fixes this bug.
Ashley Clark
2005-05-13. When we find that a new inner tuple can't possibly match any
outer tuple (because it contains a NULL), we can't immediately skip the
tuple when we are in NEXTINNER state. Doing so can lead to emitting
multiple copies of the tuple in FillInner mode, because we may rescan the
tuple after returning to a previous marked tuple. Instead, proceed to
NEXTOUTER state the same as we used to do. After we've found that there's
no need to return to the marked position, we can go to SKIPINNER_ADVANCE
state instead of SKIP_TEST when the inner tuple is unmatchable; this
preserves the performance improvement. Per bug report from Bruce.
I also made a couple of cosmetic code rearrangements and added a regression
test for the problem.
The original coding stored the raw parser output (ColumnDef and TypeName
nodes) which was ugly, bulky, and wrong because it failed to create any
dependency on the referenced datatype --- and in fact would not track type
renamings and suchlike. Instead store a list of column type OIDs in the
RTE.
Also fix up general failure of recordDependencyOnExpr to do anything sane
about recording dependencies on datatypes. While there are many cases where
there will be an indirect dependency (eg if an operator returns a datatype,
the dependency on the operator is enough), we do have to record the datatype
as a separate dependency in examples like CoerceToDomain.
initdb forced because of change of stored rules.
during parse analysis, not only errors detected in the flex/bison stages.
This is per my earlier proposal. This commit includes all the basic
infrastructure, but locations are only tracked and reported for errors
involving column references, function calls, and operators. More could
be done later but this seems like a good set to start with. I've also
moved the ReportSyntaxErrorPosition logic out of psql and into libpq,
which should make it available to more people --- even within psql this
is an improvement because warnings weren't handled by ReportSyntaxErrorPosition.
similar constants if they were not previously defined. All these
constants must be defined by limits.h according to C89, so we can
safely assume they are present.
case where we run low on array slots before we run low on memory is much
more probable than I had thought, and so it's important to treat each
tape fairly in that case. To fix this, track per-tape slot allocations
just like we track per-tape space allocation. Also, in the FINALMERGE
code path avoid scanning all the input tapes when we really only need to
read from one. This should fix poor behavior with very large work_mem
as exhibited by Stefan Kaltenbrunner.
I didn't do anything about putting an upper bound on the number of tapes,
but maybe we should still consider that.
var_samp(), stddev_pop(), and stddev_samp(). var_samp() and stddev_samp()
are just renamings of the historical Postgres aggregates variance() and
stddev() -- the latter names have been kept for backward compatibility.
This patch includes updates for the documentation and regression tests.
The catversion has been bumped.
NB: SQL2003 requires that DISTINCT not be specified for any of these
aggregates. Per discussion on -patches, I have NOT implemented this
restriction: if the user asks for stddev(DISTINCT x), presumably they
know what they are doing.
performance issue during regular merge passes not only the 'final merge'
case. The original design contemplated that there would never be more
than about one free block per 'tape', hence no need for an efficient
method of keeping the free blocks sorted. But given the later addition
of merge preread behavior in tuplesort.c, there is likely to be about
work_mem worth of free blocks, which is not so small ... and for that
matter the number of tapes isn't necessarily small anymore either. So
we'd better get rid of the assumption entirely. Instead, I'm assuming
that the usage pattern will involve alternation between merge preread
and writing of a new run. This makes it reasonable to just add blocks
to the list without sorting during successive ltsReleaseBlock calls,
and then do a qsort() when we start getting ltsGetFreeBlock() calls.
Experimentation seems to confirm that there aren't many qsort calls
relative to the number of ltsReleaseBlock/ltsGetFreeBlock calls.
we are doing the final merge pass on-the-fly, and not writing the data
back onto a 'tape', the number of free blocks in the tape set will become
large, leading to a lot of time wasted in ltsReleaseBlock(). There is
really no need to track the free blocks anymore in this state, so add a
simple shutoff switch. Per report from Stefan Kaltenbrunner.
(respectively) to rename yylex and related symbols. Some were doing
it this way already, while others used not-too-reliable sed hacks in
the Makefiles. It's all nice and consistent now.
not likely ever to be implemented seeing it's been removed from SQL2003.
This allows getting rid of the 'filter' version of yylex() that we had in
parser.c, which should save at least a few microseconds in parsing.
- new function justify_interval(interval)
- modified function justify_hours(interval)
- modified function justify_days(interval)
These functions are defined to meet the requirements as discussed in
this thread. Specifically:
- justify_hours makes certain the sign bit on the hours
matches the sign bit on the days. It only checks the
sign bit on the days, and not the months, when
determining if the hours should be positive or negative.
After the call, -24 < hours < 24.
- justify_days makes certain the sign bit on the days
matches the sign bit on the months. It's behavior does
not depend on the hours, nor does it modify the hours.
After the call, -30 < days < 30.
- justify_interval makes sure the sign bits on all three
fields months, days, and hours are all the same. After
the call, -24 < hours < 24 AND -30 < days < 30.
Mark Dilger
> I've now tested this patch at home w/ 8.2HEAD and it seems to fix the
> bug. I plan on testing it under 8.1.2 at work tommorow with
> mod_auth_krb5, etc, and expect it'll work there. Assuming all goes
> well and unless someone objects I'll forward the patch to -patches.
> It'd be great to have this fixed as it'll allow us to use Kerberos to
> authenticate to phppgadmin and other web-based tools which use
> Postgres.
While playing with this patch under 8.1.2 at home I discovered a
mistake in how I manually applied one of the hunks to fe-auth.c.
Basically, the base code had changed and so the patch needed to be
modified slightly. This is because the code no longer either has a
freeable pointer under 'name' or has 'name' as NULL.
The attached patch correctly frees the string from pg_krb5_authname
(where it had been strdup'd) if and only if pg_krb5_authname returned
a string (as opposed to falling through and having name be set using
name = pw->name;). Also added a comment to this effect.
Backpatch to 8.1.X.
Stephen Frost
columns of the grouping clause to avoid redundant sorts. The optimizer
is not currently capable of doing this, so this patch implements a
simple hack in the analysis phase (transformGroupClause): if any
subset of the GROUP BY clause matches a prefix of the ORDER BY list,
that prefix is moved to the front of the GROUP BY clause. This
shouldn't change the semantics of the query, and allows a redundant
sort to be avoided for queries like "GROUP BY a, b ORDER BY b".
In particular, ensure that enlargement of the memtuples[] array doesn't
fall foul of MaxAllocSize when work_mem is very large, and don't bother
enlarging it if that would force an immediate switch into 'tape' mode anyway.
we'll go over to disk-based sort if we reach that limit.
This fixes Stefan Kaltenbrunner's observation that sorting can suffer an
'invalid memory alloc request size' failure when sort_mem is set large
enough. It's unfortunately not so easy to fix in 8.1 ...
> type
Wouldn't it be better to use the UINT64CONST macro? I realize this
file is Windows-only, but we do worry about more than one compiler
on that platform.
Kris Jurka
/dev/tty, but it isn't a device file and doesn't work as expected.
This fixes a known bug where psql does not prompt for a password on some
Win32 systems.
Backpatch to 8.1.X.
Robert Kinberg
relations are still checked for permissions etc as soon as they are
opened. The original form of the patch could hold exclusive lock for a
long time on relations that the user doesn't even have permissions to
access, let alone truncate.
checkpoint in the bgwriter. This forestalls overflow of the fsync request
queue, which is not fatal but causes considerable performance degradation
when it occurs (because backends then have to do their own fsyncs). Per
patch from Itagaki Takahiro, modified a little bit by me.
a need for it back in the neolithic era, but it's certainly dead code in
any PG release we would recognize as such. Since it forces an additional
network round trip to the backend, getting rid of it should provide some
small performance improvement for large-object-using clients.
then modified within the same transaction. The code was using a linked list
of active PLpgSQL_expr structs, which was OK when it was written because
plpgsql never released any parse data structures for the life of the backend.
But since Neil fixed plpgsql's memory management, elements of the linked list
could be freed, leading to crash when the list is chased. Per report and test
case from Kris Jurka.
make use of the recently added ability to create a shell type explicitly.
I also put in place some infrastructure to allow dump/no dump decisions
to be made separately for each database object, rather than the former
hardwired 'dump if in a dumpable schema' policy. This was needed anyway
for shell types so now seemed a convenient time to do it. The flexibility
isn't exposed to the user yet, but is ready for future extensions.
are unnecessarily allocated on the heap rather than the stack. If the
StringInfo doesn't outlive the stack frame in which it is created,
there is no need to allocate it on the heap via makeStringInfo() --
stack allocation is faster. While it's not a big deal unless the
code is in a critical path, I don't see a reason not to save a few
cycles -- using stack allocation is not less readable.
I also cleaned up a bit of code along the way: moved variable
declarations into a more tightly-enclosing scope where possible,
fixed some pointless copying of strings in dblink, etc.
more compliant with the error message style guide. In particular,
errdetail should begin with a capital letter and end with a period,
whereas errmsg should not. I also fixed a few related issues in
passing, such as fixing the repeated misspelling of "lexeme" in
contrib/tsearch2 (per Tom's suggestion).
creation of a shell type. This allows a less hacky way of dealing with
the mutual dependency between a datatype and its I/O functions: make a
shell type, then make the functions, then define the datatype fully.
We should fix pg_dump to handle things this way, but this commit just deals
with the backend.
Martijn van Oosterhout, with some corrections by Tom Lane.
(I didn't use his patch, however). A void-returning PL/Python function
must return None (from Python), which is translated into a void datum
(and *not* NULL) for Postgres. I also added some regression tests for
this functionality.