Commit Graph

12171 Commits

Author SHA1 Message Date
Tom Lane 269c5dd2f4 Fix window functions that sort by expressions involving aggregates.
In commit c1d9579dd8, I changed things so
that the output of the Agg node that feeds the window functions would not
list any ungrouped Vars directly.  Formerly, for example, the Agg tlist
might have included both "x" and "sum(x)", which is not really valid if
"x" isn't a grouping column.  If we then had a window function ordering on
something like "sum(x) + 1", prepare_sort_from_pathkeys would find no exact
match for this in the Agg tlist, and would conclude that it must recompute
the expression.  But it would break the expression down to just the Var
"x", which it would find in the tlist, and then rebuild the ORDER BY
expression using a reference to the subplan's "x" output.  Now, after the
above-referenced changes, "x" isn't in the Agg tlist if it's not a grouping
column, so that prepare_sort_from_pathkeys fails with "could not find
pathkey item to sort", as reported by Bricklen Anderson.

The fix is to not break down Aggrefs into their component parts, but just
treat them as irreducible expressions to be sought in the subplan tlist.
This is definitely OK for the use with respect to window functions in
grouping_planner, since it just built the tlist being used on the same
basis.  AFAICT it is safe for other uses too; most of the other call sites
couldn't encounter Aggrefs anyway.
2011-09-26 23:48:39 -04:00
Tom Lane 57eb009092 Allow snapshot references to still work during transaction abort.
In REPEATABLE READ (nee SERIALIZABLE) mode, an attempt to do
GetTransactionSnapshot() between AbortTransaction and CleanupTransaction
failed, because GetTransactionSnapshot would recompute the transaction
snapshot (which is already wrong, given the isolation mode) and then
re-register it in the TopTransactionResourceOwner, leading to an Assert
because the TopTransactionResourceOwner should be empty of resources after
AbortTransaction.  This is the root cause of bug #6218 from Yamamoto
Takashi.  While changing plancache.c to avoid requesting a snapshot when
handling a ROLLBACK masks the problem, I think this is really a snapmgr.c
bug: it's lower-level than the resource manager mechanism and should not be
shutting itself down before we unwind resource manager resources.  However,
just postponing the release of the transaction snapshot until cleanup time
didn't work because of the circular dependency with
TopTransactionResourceOwner.  Fix by managing the internal reference to
that snapshot manually instead of depending on TopTransactionResourceOwner.
This saves a few cycles as well as making the module layering more
straightforward.  predicate.c's dependencies on TopTransactionResourceOwner
go away too.

I think this is a longstanding bug, but there's no evidence that it's more
than a latent bug, so it doesn't seem worth any risk of back-patching.
2011-09-26 22:25:28 -04:00
Robert Haas 821fd903f9 Update obsolete comments.
This was partially fixed by 57fdb2b0d8,
back in 2005, but it missed a couple of spots.

YAMAMOTO Takashi
2011-09-26 13:12:22 -04:00
Tom Lane 21fb95da46 Use a fresh copy of query_list when making a second plan in GetCachedPlan.
The code path that tried a generic plan, didn't like it, and then made a
custom plan was mistakenly passing the same copy of the query_list to the
planner both times.  This doesn't work too well for nontrivial queries,
since the planner tends to scribble on its input.  Diagnosis and fix by
Yamamoto Takashi.
2011-09-26 12:44:17 -04:00
Tom Lane d5aa7a9fe6 Avoid unnecessary snapshot-acquisitions in BuildCachedPlan.
I had copied-and-pasted a claim that we couldn't reach this point when
dealing with utility statements, but that was a leftover from when the
caller was required to supply a plan to start with.  We now will go
through here at least once when handling a utility statement, so it
seems worth a check to see whether a snapshot is actually needed.
(Note that analyze_requires_snapshot is quite a cheap test.)

Per suggestion from Yamamoto Takashi.  I don't think I believe that this
resolves his reported assertion failure; but it's worth changing anyway,
just to save a cycle or two.
2011-09-25 17:34:20 -04:00
Tom Lane 7741dd6590 Recognize self-contradictory restriction clauses for non-table relations.
The constraint exclusion feature checks for contradictions among scan
restriction clauses, as well as contradictions between those clauses and a
table's CHECK constraints.  The first aspect of this testing can be useful
for non-table relations (such as subqueries or functions-in-FROM), but the
feature was coded with only the CHECK case in mind so we were applying it
only to plain-table RTEs.  Move the relation_excluded_by_constraints call
so that it is applied to all RTEs not just plain tables.  With the default
setting of constraint_exclusion this results in no extra work, but with
constraint_exclusion = ON we will detect optimizations that we missed
before (at the cost of more planner cycles than we expended before).

Per a gripe from Gunnlaugur Þór Briem.  Experimentation with
his example also showed we were not being very bright about the case where
constraint exclusion is proven within a subquery within UNION ALL, so tweak
the code to allow set_append_rel_pathlist to recognize such cases.
2011-09-24 19:33:16 -04:00
Robert Haas 0c8eda6258 Memory barrier support for PostgreSQL.
This is not actually used anywhere yet, but it gets the basic
infrastructure in place.  It is fairly likely that there are bugs, and
support for some important platforms may be missing, so we'll need to
refine this as we go along.
2011-09-23 17:52:43 -04:00
Tom Lane f197272365 Make EXPLAIN ANALYZE report the numbers of rows rejected by filter steps.
This provides information about the numbers of tuples that were visited
but not returned by table scans, as well as the numbers of join tuples
that were considered and discarded within a join plan node.

There is still some discussion going on about the best way to report counts
for outer-join situations, but I think most of what's in the patch would
not change if we revise that, so I'm going to go ahead and commit it as-is.

Documentation changes to follow (they weren't in the submitted patch
either).

Marko Tiikkaja, reviewed by Marc Cousin, somewhat revised by Tom
2011-09-22 11:30:11 -04:00
Robert Haas 4893552e21 Fix another bit of unlogged-table-induced breakage.
Per bug #6205, reported by Abel Abraham Camarillo Ojeda.  This isn't a
particularly elegant fix, but I'm trying to minimize the chances of
causing yet another round of breakage.

Adjust regression tests to exercise this case.
2011-09-21 10:48:31 -04:00
Tom Lane 2562dcea81 Suppress "unused function" warning when not HAVE_LOCALE_T.
Forgot to consider this case ...
2011-09-20 17:47:21 -04:00
Tom Lane 37d4fd2b9d Improve reporting of newlocale() failures in CREATE COLLATION.
The standardized errno code for "no such locale" failures is ENOENT, which
we were just reporting at face value, viz "No such file or directory".
Per gripe from Thom Brown, this might confuse users, so add an errdetail
message to clarify what it means.  Also, report newlocale() failures as
ERRCODE_INVALID_PARAMETER_VALUE rather than using
errcode_for_file_access(), since newlocale()'s errno values aren't
necessarily tied directly to file access failures.
2011-09-20 13:23:40 -04:00
Tom Lane c4ae968633 Fix Assert failure in new plancache code.
The regression tests were failing with CLOBBER_CACHE_ALWAYS enabled,
as reported by buildfarm member jaguar.  There was an Assert in
BuildCachedPlan that asserted that the CachedPlanSource hadn't been
invalidated since we called RevalidateCachedQuery, which in theory can't
happen because we are holding locks on all the relevant database objects.
However, CLOBBER_CACHE_ALWAYS generates a false positive by making an
invalidation happen anyway; and on reflection, that could also occur as a
result of a badly-timed sinval reset due to queue overflow.  We could just
remove the Assert and forge ahead with the not-really-stale querytree, but
it seems safer to do another RevalidateCachedQuery call just to make real
sure everything's OK.
2011-09-17 01:47:33 -04:00
Tom Lane 99b5454167 Remove debug logging for pgstat wait timeout.
This reverts commit 79b2ee20c8, which proved
to not be very informative; it looks like the "pgstat wait timeout"
warnings in the buildfarm are just a symptom of running on heavily loaded
machines, and there isn't any weird mechanism causing them to appear.

To try to reduce the frequency of buildfarm failures from this effect,
increase PGSTAT_MAX_WAIT_TIME from 5 seconds to 10.

Also, arrange to not send a fresh inquiry message every single time through
the loop, as that seems more likely to cause problems (by swamping the
collector) than fix them.  We'll now send an inquiry the first time through
the delay loop, and every 640 msec thereafter.
2011-09-16 18:25:27 -04:00
Tom Lane 9d306c66e6 Avoid unnecessary page-level SSI lock check in heap_insert().
As observed by Heikki, we need not conflict on heap page locks during an
insert; heap page locks are only aggregated tuple locks, they don't imply
locking "gaps" as index page locks do.  So we can avoid some unnecessary
conflicts, and also do the SSI check while not holding exclusive lock on
the target buffer.

Kevin Grittner, reviewed by Jeff Davis.  Back-patch to 9.1.
2011-09-16 14:47:20 -04:00
Tom Lane 0a6cc28500 gistendscan() forgot to free so->giststate.
This oversight led to a massive memory leak --- upwards of 10KB per tuple
--- during creation-time verification of an exclusion constraint based on a
GIST index.  In most other scenarios it'd just be a leak of 10KB that would
be recovered at end of query, so not too significant; though perhaps the
leak would be noticeable in a situation where a GIST index was being used
in a nestloop inner indexscan.  In any case, it's a real leak of long
standing, so patch all supported branches.  Per report from Harald Fuchs.
2011-09-16 04:27:49 -04:00
Tom Lane e6faf910d7 Redesign the plancache mechanism for more flexibility and efficiency.
Rewrite plancache.c so that a "cached plan" (which is rather a misnomer
at this point) can support generation of custom, parameter-value-dependent
plans, and can make an intelligent choice between using custom plans and
the traditional generic-plan approach.  The specific choice algorithm
implemented here can probably be improved in future, but this commit is
all about getting the mechanism in place, not the policy.

In addition, restructure the API to greatly reduce the amount of extraneous
data copying needed.  The main compromise needed to make that possible was
to split the initial creation of a CachedPlanSource into two steps.  It's
worth noting in particular that SPI_saveplan is now deprecated in favor of
SPI_keepplan, which accomplishes the same end result with zero data
copying, and no need to then spend even more cycles throwing away the
original SPIPlan.  The risk of long-term memory leaks while manipulating
SPIPlans has also been greatly reduced.  Most of this improvement is based
on use of the recently-added MemoryContextSetParent primitive.
2011-09-16 00:43:52 -04:00
Alvaro Herrera 86822df9b5 Split walsender.h in public/private headers
This dramatically cuts short the number of headers the public one brings
into whatever includes it.
2011-09-13 21:42:49 -03:00
Tom Lane 6693c9a5ed deflist_to_tuplestore dumped core on an option with no value.
Make it return NULL for the option_value, instead.

Per report from Frank van Vugt.  Back-patch to 8.4 where this code was
added.
2011-09-13 11:36:49 -04:00
Heikki Linnakangas 8caf6132c7 In the final emptying phase of the new GiST buffering build, set the
queuedForEmptying flag correctly on buffer when adding it to the queue.
Also, don't add buffer to the queue if it's there already. These were
harmless oversights; failing to set the flag just means that a buffer might
get added to the queue twice if more tuples are added to it (although that
can't actually happen at this point because all the upper buffers have
already been emptied), and having the same buffer twice in the emptying
queue is harmless. But better be tidy.
2011-09-12 13:06:06 +03:00
Tom Lane b0025bd957 Invent a new memory context primitive, MemoryContextSetParent.
This function will be useful for altering the lifespan of a context after
creation (for example, by creating it under a transient context and later
reparenting it to belong to a long-lived context).  It costs almost no new
code, since we can refactor what was there.  Per my proposal of yesterday.
2011-09-11 16:29:42 -04:00
Peter Eisentraut 1b81c2fe6e Remove many -Wcast-qual warnings
This addresses only those cases that are easy to fix by adding or
moving a const qualifier or removing an unnecessary cast.  There are
many more complicated cases remaining.
2011-09-11 21:54:32 +03:00
Tom Lane ca4af308c3 Simplify handling of the timezone GUC by making initdb choose the default.
We were doing some amazingly complicated things in order to avoid running
the very expensive identify_system_timezone() procedure during GUC
initialization.  But there is an obvious fix for that, which is to do it
once during initdb and have initdb install the system-specific default into
postgresql.conf, as it already does for most other GUC variables that need
system-environment-dependent defaults.  This means that the timezone (and
log_timezone) settings no longer have any magic behavior in the server.
Per discussion.
2011-09-09 17:59:11 -04:00
Tom Lane a7801b62f2 Move Timestamp/Interval typedefs and basic macros into datatype/timestamp.h.
As per my recent proposal, this refactors things so that these typedefs and
macros are available in a header that can be included in frontend-ish code.
I also changed various headers that were undesirably including
utils/timestamp.h to include datatype/timestamp.h instead.  Unsurprisingly,
this showed that half the system was getting utils/timestamp.h by way of
xlog.h.

No actual code changes here, just header refactoring.
2011-09-09 13:23:41 -04:00
Tom Lane d63de337f3 round() is not portable. Use rint(). 2011-09-08 16:38:24 -04:00
Alvaro Herrera 295e7dc929 Tweak string for uniformity 2011-09-08 16:39:58 -03:00
Heikki Linnakangas 5edb24a898 Buffering GiST index build algorithm.
When building a GiST index that doesn't fit in cache, buffers are attached
to some internal nodes in the index. This speeds up the build by avoiding
random I/O that would otherwise be needed to traverse all the way down the
tree to the find right leaf page for tuple.

Alexander Korotkov
2011-09-08 17:51:23 +03:00
Tom Lane f0bedf3e45 Fix corner case bug in numeric to_char().
Trailing-zero stripping applied by the FM specifier could strip zeroes
to the left of the decimal point, for a format with no digit positions
after the decimal point (such as "FM999.").

Reported and diagnosed by Marti Raudsepp, though I didn't use his patch.
2011-09-07 17:07:20 -04:00
Tom Lane 99155aaa33 Fix typo in error message.
Per Euler Taveira de Oliveira.
2011-09-07 13:29:26 -04:00
Tom Lane a7d9203cc4 Fix get_name_for_var_field() to deal with RECORD Params.
With 9.1's use of Params to pass down values from NestLoop join nodes
to their inner plans, it is possible for a Param to have type RECORD, in
which case the set of fields comprising the value isn't determinable by
inspection of the Param alone.  However, just as with a Var of type RECORD,
we can find out what we need to know if we can locate the expression that
the Param represents.  We already knew how to do this in get_parameter(),
but I'd overlooked the need to be able to cope in get_name_for_var_field(),
which led to EXPLAIN failing with "record type has not been registered".

To fix, refactor the search code in get_parameter() so it can be used by
both functions.

Per report from Marti Raudsepp.
2011-09-07 13:01:36 -04:00
Bruce Momjian 029dfdf115 Fix to_date() and to_timestamp() to handle year masks of length < 4 so
they wrap toward year 2020, rather than the inconsistent behavior we had
before.
2011-09-07 09:47:51 -04:00
Simon Riggs df383b03e6 Partially revoke attempt to improve performance with many savepoints.
Maintain difference between subtransaction release and commit introduced
by earlier patch.
2011-09-07 12:11:26 +01:00
Simon Riggs dde70cc313 Emit cascaded standby message on shutdown only when appropriate.
Adds additional test for active walsenders and closes a race
condition for when we failover when a new walsender was connecting.

Reported and fixed bu Fujii Masao. Review by Heikki Linnakangas
2011-09-07 09:09:47 +01:00
Tom Lane db10f01baa Improve comment about handling of temp tables in shared-inval code. 2011-09-06 17:06:54 -04:00
Peter Eisentraut e6d800981e Correct ancient logic mistake in assertion
Found by gcc -Wlogical-op
2011-09-06 23:05:02 +03:00
Tom Lane 623f77e9d1 Avoid possibly accessing off the end of memory in SJIS2004 conversion.
The code in shift_jis_20042euc_jis_2004() would fetch two bytes even when
only one remained in the string.  Since conversion functions aren't
supposed to assume null-terminated input, this poses a small risk of
fetching past the end of memory and incurring SIGSEGV.  No such crash has
been identified in the field, but we've certainly seen the equivalent
happen in other code paths, so patch this one all the way back.

Report and patch by Noah Misch.
2011-09-06 14:50:28 -04:00
Tom Lane 780a342c90 Avoid possibly accessing off the end of memory in examine_attribute().
Since the last couple of columns of pg_type are often NULL,
sizeof(FormData_pg_type) can be an overestimate of the actual size of the
tuple data part.  Therefore memcpy'ing that much out of the catalog cache,
as analyze.c was doing, poses a small risk of copying past the end of
memory and incurring SIGSEGV.  No such crash has been identified in the
field, but we've certainly seen the equivalent happen in other code paths,
so patch this one all the way back.

Per valgrind testing by Noah Misch, though this is not his proposed patch.
I chose to use SearchSysCacheCopy1 rather than inventing special-purpose
infrastructure for copying only the minimal part of a pg_type tuple.
2011-09-06 14:37:22 -04:00
Bruce Momjian f458c90bff Add C comment about why we send cache invalidation messages for
session-local objects.
2011-09-05 22:09:02 -04:00
Alvaro Herrera 56a9ed92b6 Adjust translator comment format to xgettext expectations 2011-09-05 19:04:30 -03:00
Alvaro Herrera b64f18c583 Mark some untranslatable messages with errmsg_internal 2011-09-05 17:48:07 -03:00
Peter Eisentraut a2a5ce6826 Improve "invalid byte sequence for encoding" message
It used to say

ERROR:  invalid byte sequence for encoding "UTF8": 0xdb24

Change this to

ERROR:  invalid byte sequence for encoding "UTF8": 0xdb 0x24

to make it clear that this is a byte sequence and not a code point.

Also fix the adjacent "character has no equivalent" message that has
the same issue.
2011-09-05 23:38:27 +03:00
Tom Lane 4c2777d0b7 Change get_variable_numdistinct's API to flag default estimates explicitly.
Formerly, callers tested for DEFAULT_NUM_DISTINCT, which had the problem
that a perfectly solid estimate might be mistaken for a content-free
default.
2011-09-04 15:41:49 -04:00
Tom Lane 1cb108efb0 Dig down into sub-selects to look for column statistics.
If a sub-select's output column is a simple Var, recursively look for
statistics applying to that Var, and use them if available.  The need for
this was foreseen ages ago, but we didn't have enough infrastructure to do
it with reasonable speed until just now.

We punt and stick with default estimates if the subquery uses set
operations, GROUP BY, or DISTINCT, since those operations would change the
underlying column statistics (particularly, the relative frequencies of
different values) beyond recognition.  This means that the types of
sub-selects for which this improvement applies are fairly limited, since
most subqueries satisfying those restrictions would have gotten flattened
into the parent query anyway.  But it does help for some cases, such as
subqueries with ORDER BY or LIMIT.
2011-09-04 15:13:46 -04:00
Tom Lane 698df3350d Can't print PlannerGlobal's subroots list in outfuncs.
Since the subroots will surely link back to the same glob struct, this
necessarily leads to infinite recursion.  Doh.  Found while trying to
debug some other code.
2011-09-04 14:43:52 -04:00
Tom Lane 1609797c25 Clean up the #include mess a little.
walsender.h should depend on xlog.h, not vice versa.  (Actually, the
inclusion was circular until a couple hours ago, which was even sillier;
but Bruce broke it in the expedient rather than logically correct
direction.)  Because of that poor decision, plus blind application of
pgrminclude, we had a situation where half the system was depending on
xlog.h to include such unrelated stuff as array.h and guc.h.  Clean up
the header inclusion, and manually revert a lot of what pgrminclude had
done so things build again.

This episode reinforces my feeling that pgrminclude should not be run
without adult supervision.  Inclusion changes in header files in particular
need to be reviewed with great care.  More generally, it'd be good if we
had a clearer notion of module layering to dictate which headers can sanely
include which others ... but that's a big task for another day.
2011-09-04 01:13:16 -04:00
Tom Lane b3aaf9081a Rearrange planner to save the whole PlannerInfo (subroot) for a subquery.
Formerly, set_subquery_pathlist and other creators of plans for subqueries
saved only the rangetable and rowMarks lists from the lower-level
PlannerInfo.  But there's no reason not to remember the whole PlannerInfo,
and indeed this turns out to simplify matters in a number of places.

The immediate reason for doing this was so that the subroot will still be
accessible when we're trying to extract column statistics out of an
already-planned subquery.  But now that I've done it, it seems like a good
code-beautification effort in its own right.

I also chose to get rid of the transient subrtable and subrowmark fields in
SubqueryScan nodes, in favor of having setrefs.c look up the subquery's
RelOptInfo.  That required changing all the APIs in setrefs.c to pass
PlannerInfo not PlannerGlobal, which was a large but quite mechanical
transformation.

One side-effect not foreseen at the beginning is that this finally broke
inheritance_planner's assumption that replanning the same subquery RTE N
times would necessarily give interchangeable results each time.  That
assumption was always pretty risky, but now we really have to make a
separate RTE for each instance so that there's a place to carry the
separate subroots.
2011-09-03 15:36:24 -04:00
Peter Eisentraut 42ad992fdc Add archive_command example 2011-09-03 01:29:09 +03:00
Peter Eisentraut f1e4f3d44f Whitespace adjustment for consistency in the file 2011-09-03 01:28:05 +03:00
Tom Lane 5b562644fe Teach ANALYZE to clear pg_class.relhassubclass when appropriate.
In the past, relhassubclass always remained true if a relation had ever had
child relations, even if the last subclass was long gone.  While this had
only marginal performance implications in most cases, it was annoying, and
I'm now considering some planner changes that would raise the cost of a
false positive.  It was previously impractical to fix this because of race
condition concerns.  However, given the recent change that made tablecmds.c
take ShareExclusiveLock on relations that are gaining a child (commit
fbcf4b92aa), we can now allow ANALYZE to
clear the flag when it's no longer relevant.  There is no additional
locking cost to do so, since ANALYZE takes ShareExclusiveLock anyway.
2011-09-02 14:29:31 -04:00
Bruce Momjian 10af3ab2b2 Add C comment about needed include. 2011-09-01 12:53:45 -04:00
Tom Lane e5b012b788 Put back improperly removed #include. 2011-09-01 11:57:46 -04:00