Commit Graph

297 Commits

Author SHA1 Message Date
Bruce Momjian 0a78320057 pgindent run for 9.4
This includes removing tabs after periods in C comments, which was
applied to back branches, so this change should not effect backpatching.
2014-05-06 12:12:18 -04:00
Bruce Momjian 7e04792a1c Update copyright for 2014
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
2014-01-07 16:05:30 -05:00
Tom Lane f7fbf4b0be Remove dead code now that orindxpath.c is history.
We don't need make_restrictinfo_from_bitmapqual() anymore at all.
generate_bitmap_or_paths() doesn't need to be exported, and we can
drop its rather klugy restriction_only flag.
2013-12-30 12:50:31 -05:00
Tom Lane f343a880d5 Extract restriction OR clauses whether or not they are indexable.
It's possible to extract a restriction OR clause from a join clause that
has the form of an OR-of-ANDs, if each sub-AND includes a clause that
mentions only one specific relation.  While PG has been aware of that idea
for many years, the code previously only did it if it could extract an
indexable OR clause.  On reflection, though, that seems a silly limitation:
adding a restriction clause can be a win by reducing the number of rows
that have to be filtered at the join step, even if we have to test the
clause as a plain filter clause during the scan.  This should be especially
useful for foreign tables, where the change can cut the number of rows that
have to be retrieved from the foreign server; but testing shows it can win
even on local tables.  Per a suggestion from Robert Haas.

As a heuristic, I made the code accept an extracted restriction clause
if its estimated selectivity is less than 0.9, which will probably result
in accepting extracted clauses just about always.  We might need to tweak
that later based on experience.

Since the code no longer has even a weak connection to Path creation,
remove orindxpath.c and create a new file optimizer/util/orclauses.c.

There's some additional janitorial cleanup of now-dead code that needs
to happen, but it seems like that's a fit subject for a separate commit.
2013-12-30 12:24:37 -05:00
Tom Lane 9e7e29c75a Fix planner problems with LATERAL references in PlaceHolderVars.
The planner largely failed to consider the possibility that a
PlaceHolderVar's expression might contain a lateral reference to a Var
coming from somewhere outside the PHV's syntactic scope.  We had a previous
report of a problem in this area, which I tried to fix in a quick-hack way
in commit 4da6439bd8, but Antonin Houska
pointed out that there were still some problems, and investigation turned
up other issues.  This patch largely reverts that commit in favor of a more
thoroughly thought-through solution.  The new theory is that a PHV's
ph_eval_at level cannot be higher than its original syntactic level.  If it
contains lateral references, those don't change the ph_eval_at level, but
rather they create a lateral-reference requirement for the ph_eval_at join
relation.  The code in joinpath.c needs to handle that.

Another issue is that createplan.c wasn't handling nested PlaceHolderVars
properly.

In passing, push knowledge of lateral-reference checks for join clauses
into join_clause_is_movable_to.  This is mainly so that FDWs don't need
to deal with it.

This patch doesn't fix the original join-qual-placement problem reported by
Jeremy Evans (and indeed, one of the new regression test cases shows the
wrong answer because of that).  But the PlaceHolderVar problems need to be
fixed before that issue can be addressed, so committing this separately
seems reasonable.
2013-08-17 20:22:37 -04:00
Bruce Momjian 9af4159fce pgindent run for release 9.3
This is the first run of the Perl-based pgindent script.  Also update
pgindent instructions.
2013-05-29 16:58:43 -04:00
Tom Lane 9cbc4b80dd Redo postgres_fdw's planner code so it can handle parameterized paths.
I wasn't going to ship this without having at least some example of how
to do that.  This version isn't terribly bright; in particular it won't
consider any combinations of multiple join clauses.  Given the cost of
executing a remote EXPLAIN, I'm not sure we want to be very aggressive
about doing that, anyway.

In support of this, refactor generate_implied_equalities_for_indexcol
so that it can be used to extract equivalence clauses that aren't
necessarily tied to an index.
2013-03-21 19:44:32 -04:00
Bruce Momjian bd61a623ac Update copyrights for 2013
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
2013-01-01 17:15:01 -05:00
Tom Lane 1746ba9256 Improve check_partial_indexes() to consider join clauses in proof attempts.
Traditionally check_partial_indexes() has only looked at restriction
clauses while trying to prove partial indexes usable in queries.  However,
join clauses can also be used in some cases; mainly, that a strict operator
on "x" proves an "x IS NOT NULL" index predicate, even if the operator is
in a join clause rather than a restriction clause.  Adding this code fixes
a regression in 9.2, because previously we would take join clauses into
account when considering whether a partial index could be used in a
nestloop inner indexscan path.  9.2 doesn't handle nestloop inner
indexscans in the same way, and this consideration was overlooked in the
rewrite.  Moving the work to check_partial_indexes() is a better solution
anyway, since the proof applies whether or not we actually use the index
in that particular way, and we don't have to do it over again for each
possible outer relation.  Per report from Dave Cramer.
2012-11-15 19:29:05 -05:00
Tom Lane 19e36477b3 Limit the number of rel sets considered in consider_index_join_outer_rels.
In bug #7626, Brian Dunavant exposes a performance problem created by
commit 3b8968f25232ad09001bf35ab4cc59f5a501193e: that commit attempted to
consider *all* possible combinations of indexable join clauses, but if said
clauses join to enough different relations, there's an exponential increase
in the number of outer-relation sets considered.

In Brian's example, all the clauses come from the same equivalence class,
which means it's redundant to use more than one of them in an indexscan
anyway.  So we can prevent the problem in this class of cases (which is
probably the majority of real examples) by rejecting combinations that
would only serve to add a known-redundant clause.

But that still leaves us exposed to exponential growth of planning time
when the query has a lot of non-equivalence join clauses that are usable
with the same index.  I chose to prevent such cases by setting an upper
limit on the number of relation sets considered, equal to ten times the
number of index clauses considered so far.  (This sliding limit still
allows new relsets to be added on as we move to additional index columns,
which is probably more important than considering even more combinations of
clauses for the previous column.)  This should keep the amount of work done
roughly linear rather than exponential in the apparent query complexity.
This part of the fix is pretty ad-hoc; but without a clearer idea of
real-world cases for which this would result in markedly inferior plans,
it's hard to see how to do better.
2012-11-01 14:08:42 -04:00
Tom Lane 807a40c551 Fix planning of btree index scans using ScalarArrayOpExpr quals.
In commit 9e8da0f757, I improved btree
to handle ScalarArrayOpExpr quals natively, so that constructs like
"indexedcol IN (list)" could be supported by index-only scans.  Using
such a qual results in multiple scans of the index, under-the-hood.
I went to some lengths to ensure that this still produces rows in index
order ... but I failed to recognize that if a higher-order index column
is lacking an equality constraint, rescans can produce out-of-order
data from that column.  Tweak the planner to not expect sorted output
in that case.  Per trouble report from Robert McGehee.
2012-09-18 12:20:34 -04:00
Tom Lane 3b8968f252 Rethink heuristics for choosing index quals for parameterized paths.
Some experimentation with examples similar to bug #7539 has convinced me
that indxpath.c's original implementation of parameterized-path generation
was several bricks shy of a load.  In general, if we are relying on a
particular outer rel or set of outer rels for a parameterized path, the
path should use every indexable join clause that's available from that rel
or rels.  Any join clauses that get left out of the indexqual will end up
getting applied as plain filter quals (qpquals), and that's generally a
significant loser compared to having the index AM enforce them.  (This is
particularly true with btree, which can skip the index scan entirely if
it can see that the given indexquals are mutually contradictory.)  The
original heuristics failed to ensure this, though, and were overly
complicated anyway.  Rewrite to make the code explicitly identify each
useful set of outer rels and then select all applicable join clauses for
each one.  The one plan that changes in the regression tests is in fact
for the better according to the planner's cost estimates.

(Note: this is not a correctness issue but just a matter of plan quality.
I don't yet know what is going on in bug #7539, but I don't expect this
change to fix that.)
2012-09-16 17:58:09 -04:00
Tom Lane 77387f0ac8 Suppress creation of backwardly-indexed paths for LATERAL join clauses.
Given a query such as

SELECT * FROM foo JOIN LATERAL (SELECT foo.var1) ss(x) ON ss.x = foo.var2

the existence of the join clause "ss.x = foo.var2" encourages indxpath.c to
build a parameterized path for foo using any index available for foo.var2.
This is completely useless activity, though, since foo has got to be on the
outside not the inside of any nestloop join with ss.  It's reasonably
inexpensive to add tests that prevent creation of such paths, so let's do
that.
2012-08-30 14:33:00 -04:00
Tom Lane 9ff79b9d4e Fix up planner infrastructure to support LATERAL properly.
This patch takes care of a number of problems having to do with failure
to choose valid join orders and incorrect handling of lateral references
pulled up from subqueries.  Notable changes:

* Add a LateralJoinInfo data structure similar to SpecialJoinInfo, to
represent join ordering constraints created by lateral references.
(I first considered extending the SpecialJoinInfo structure, but the
semantics are different enough that a separate data structure seems
better.)  Extend join_is_legal() and related functions to prevent trying
to form unworkable joins, and to ensure that we will consider joins that
satisfy lateral references even if the joins would be clauseless.

* Fill in the infrastructure needed for the last few types of relation scan
paths to support parameterization.  We'd have wanted this eventually
anyway, but it is necessary now because a relation that gets pulled up out
of a UNION ALL subquery may acquire a reltargetlist containing lateral
references, meaning that its paths *have* to be parameterized whether or
not we have any code that can push join quals down into the scan.

* Compute data about lateral references early in query_planner(), and save
in RelOptInfo nodes, to avoid repetitive calculations later.

* Assorted corner-case bug fixes.

There's probably still some bugs left, but this is a lot closer to being
real than it was before.
2012-08-26 22:50:23 -04:00
Tom Lane f5983923d8 Allow create_index_paths() to consider multiple join bitmapscan paths.
In the initial cut at the "parameterized paths" feature, I'd simplified
create_index_paths() to the point where it would only generate a single
parameterized bitmap path per relation.  Experimentation with an example
supplied by Josh Berkus convinces me that that's not good enough: we really
need to consider a bitmap path for each possible outer relation.  Otherwise
we have regressions relative to pre-9.2 versions, in which the planner
picks a plain indexscan where it should have used a bitmap scan in queries
involving three or more tables.  Indeed, after fixing this, several queries
in the regression tests show improved plans as a result of using bitmap not
plain indexscans.
2012-08-16 13:03:54 -04:00
Tom Lane 00dac6000d Refactor pattern_fixed_prefix() to avoid dealing in incomplete patterns.
Previously, pattern_fixed_prefix() was defined to return whatever fixed
prefix it could extract from the pattern, plus the "rest" of the pattern.
That definition was sensible for LIKE patterns, but not so much for
regexes, where reconstituting a valid pattern minus the prefix could be
quite tricky (certainly the existing code wasn't doing that correctly).
Since the only thing that callers ever did with the "rest" of the pattern
was to pass it to like_selectivity() or regex_selectivity(), let's cut out
the middle-man and just have pattern_fixed_prefix's subroutines do this
directly.  Then pattern_fixed_prefix can return a simple selectivity
number, and the question of how to cope with partial patterns is removed
from its API specification.

While at it, adjust the API spec so that callers who don't actually care
about the pattern's selectivity (which is a lot of them) can pass NULL for
the selectivity pointer to skip doing the work of computing a selectivity
estimate.

This patch is only an API refactoring that doesn't actually change any
processing, other than allowing a little bit of useless work to be skipped.
However, it's necessary infrastructure for my upcoming fix to regex prefix
extraction, because after that change there won't be any simple way to
identify the "rest" of the regex, not even to the low level of fidelity
needed by regex_selectivity.  We can cope with that if regex_fixed_prefix
and regex_selectivity communicate directly, but not if we have to work
within the old API.  Hence, back-patch to all active branches.
2012-07-09 23:22:55 -04:00
Bruce Momjian 927d61eeff Run pgindent on 9.2 source tree in preparation for first 9.3
commit-fest.
2012-06-10 15:20:04 -04:00
Tom Lane 7c85aa39fc Fix oversight in recent parameterized-path patch.
bitmap_scan_cost_est() has to be able to cope with a BitmapOrPath, but
I'd taken a shortcut that didn't work for that case.  Noted by Heikki.
Add some regression tests since this area is evidently under-covered.
2012-04-26 14:17:44 -04:00
Tom Lane 5b7b5518d0 Revise parameterized-path mechanism to fix assorted issues.
This patch adjusts the treatment of parameterized paths so that all paths
with the same parameterization (same set of required outer rels) for the
same relation will have the same rowcount estimate.  We cache the rowcount
estimates to ensure that property, and hopefully save a few cycles too.
Doing this makes it practical for add_path_precheck to operate without
a rowcount estimate: it need only assume that paths with different
parameterizations never dominate each other, which is close enough to
true anyway for coarse filtering, because normally a more-parameterized
path should yield fewer rows thanks to having more join clauses to apply.

In add_path, we do the full nine yards of comparing rowcount estimates
along with everything else, so that we can discard parameterized paths that
don't actually have an advantage.  This fixes some issues I'd found with
add_path rejecting parameterized paths on the grounds that they were more
expensive than not-parameterized ones, even though they yielded many fewer
rows and hence would be cheaper once subsequent joining was considered.

To make the same-rowcounts assumption valid, we have to require that any
parameterized path enforce *all* join clauses that could be obtained from
the particular set of outer rels, even if not all of them are useful for
indexing.  This is required at both base scans and joins.  It's a good
thing anyway since the net impact is that join quals are checked at the
lowest practical level in the join tree.  Hence, discard the original
rather ad-hoc mechanism for choosing parameterization joinquals, and build
a better one that has a more principled rule for when clauses can be moved.
The original rule was actually buggy anyway for lack of knowledge about
which relations are part of an outer join's outer side; getting this right
requires adding an outer_relids field to RestrictInfo.
2012-04-19 15:53:47 -04:00
Tom Lane b67ad046e6 Improve commentary in match_pathkeys_to_index().
For a little while there I thought match_pathkeys_to_index() was broken
because it wasn't trying to match index columns to pathkeys in order.
Actually that's correct, because GiST can support ordering operators
on any random collection of index columns, but it sure needs a comment.
2012-03-16 14:07:21 -04:00
Tom Lane dd4134ea56 Revisit handling of UNION ALL subqueries with non-Var output columns.
In commit 57664ed25e I tried to fix a bug
reported by Teodor Sigaev by making non-simple-Var output columns distinct
(by wrapping their expressions with dummy PlaceHolderVar nodes).  This did
not work too well.  Commit b28ffd0fcc fixed
some ensuing problems with matching to child indexes, but per a recent
report from Claus Stadler, constraint exclusion of UNION ALL subqueries was
still broken, because constant-simplification didn't handle the injected
PlaceHolderVars well either.  On reflection, the original patch was quite
misguided: there is no reason to expect that EquivalenceClass child members
will be distinct.  So instead of trying to make them so, we should ensure
that we can cope with the situation when they're not.

Accordingly, this patch reverts the code changes in the above-mentioned
commits (though the regression test cases they added stay).  Instead, I've
added assorted defenses to make sure that duplicate EC child members don't
cause any problems.  Teodor's original problem ("MergeAppend child's
targetlist doesn't match MergeAppend") is addressed more directly by
revising prepare_sort_from_pathkeys to let the parent MergeAppend's sort
list guide creation of each child's sort list.

In passing, get rid of add_sort_column; as far as I can tell, testing for
duplicate sort keys at this stage is dead code.  Certainly it doesn't
trigger often enough to be worth expending cycles on in ordinary queries.
And keeping the test would've greatly complicated the new logic in
prepare_sort_from_pathkeys, because comparing pathkey list entries against
a previous output array requires that we not skip any entries in the list.

Back-patch to 9.1, like the previous patches.  The only known issue in
this area that wasn't caused by the ill-advised previous patches was the
MergeAppend planning failure, which of course is not relevant before 9.1.
It's possible that we need some of the new defenses against duplicate child
EC entries in older branches, but until there's some clear evidence of that
I'm going to refrain from back-patching further.
2012-03-16 13:11:55 -04:00
Tom Lane 0140a11b9b Fix thinko in new match_join_clauses_to_index() logic.
We don't need to constrain the other side of an indexable join clause to
not be below an outer join; an example here is

SELECT FROM t1 LEFT JOIN t2 ON t1.a = t2.b LEFT JOIN t3 ON t2.c = t3.d;

We can consider an inner indexscan on t3.d using c = d as indexqual, even
though t2.c is potentially nulled by a previous outer join.  The comparable
logic in orindxpath.c has always worked that way, but I was being overly
cautious here.
2012-02-28 18:10:40 -05:00
Robert Haas 0ed7445d73 Allow spgist's text_ops to handle pattern-matching operators.
This was presumably intended to work this way all along, but a few key
bits of indxpath.c didn't get the memo.

Robert Haas and Tom Lane
2012-02-02 13:10:56 -05:00
Tom Lane b28ffd0fcc Fix pushing of index-expression qualifications through UNION ALL.
In commit 57664ed25e, I made the planner
wrap non-simple-variable outputs of appendrel children (IOW, child SELECTs
of UNION ALL subqueries) inside PlaceHolderVars, in order to solve some
issues with EquivalenceClass processing.  However, this means that any
upper-level WHERE clauses mentioning such outputs will now contain
PlaceHolderVars after they're pushed down into the appendrel child,
and that prevents indxpath.c from recognizing that they could be matched
to index expressions.  To fix, add explicit stripping of PlaceHolderVars
from index operands, same as we have long done for RelabelType nodes.
Add a regression test covering both this and the plain-UNION case (which
is a totally different code path, but should also be able to do it).

Per bug #6416 from Matteo Beccati.  Back-patch to 9.1, same as the
previous change.
2012-01-29 16:31:23 -05:00
Tom Lane e2fa76d80b Use parameterized paths to generate inner indexscans more flexibly.
This patch fixes the planner so that it can generate nestloop-with-
inner-indexscan plans even with one or more levels of joining between
the indexscan and the nestloop join that is supplying the parameter.
The executor was fixed to handle such cases some time ago, but the
planner was not ready.  This should improve our plans in many situations
where join ordering restrictions formerly forced complete table scans.

There is probably a fair amount of tuning work yet to be done, because
of various heuristics that have been added to limit the number of
parameterized paths considered.  However, we are not going to find out
what needs to be adjusted until the code gets some real-world use, so
it's time to get it in there where it can be tested easily.

Note API change for index AM amcostestimate functions.  I'm not aware of
any non-core index AMs, but if there are any, they will need minor
adjustments.
2012-01-27 19:26:38 -05:00
Bruce Momjian e126958c2e Update copyright notices for year 2012. 2012-01-01 18:01:58 -05:00
Tom Lane 472d3935a2 Rethink representation of index clauses' mapping to index columns.
In commit e2c2c2e8b1 I made use of nested
list structures to show which clauses went with which index columns, but
on reflection that's a data structure that only an old-line Lisp hacker
could love.  Worse, it adds unnecessary complication to the many places
that don't much care which clauses go with which index columns.  Revert
to the previous arrangement of flat lists of clauses, and instead add a
parallel integer list of column numbers.  The places that care about the
pairing can chase both lists with forboth(), while the places that don't
care just examine one list the same as before.

The only real downside to this is that there are now two more lists that
need to be passed to amcostestimate functions in case they care about
column matching (which btcostestimate does, so not passing the info is not
an option).  Rather than deal with 11-argument amcostestimate functions,
pass just the IndexPath and expect the functions to extract fields from it.
That gets us down to 7 arguments which is better than 11, and it seems
more future-proof against likely additions to the information we keep
about an index path.
2011-12-24 19:03:21 -05:00
Tom Lane e2c2c2e8b1 Improve planner's handling of duplicated index column expressions.
It's potentially useful for an index to repeat the same indexable column
or expression in multiple index columns, if the columns have different
opclasses.  (If they share opclasses too, the duplicate column is pretty
useless, but nonetheless we've allowed such cases since 9.0.)  However,
the planner failed to cope with this, because createplan.c was relying on
simple equal() matching to figure out which index column each index qual
is intended for.  We do have that information available upstream in
indxpath.c, though, so the fix is to not flatten the multi-level indexquals
list when putting it into an IndexPath.  Then we can rely on the sublist
structure to identify target index columns in createplan.c.  There's a
similar issue for index ORDER BYs (the KNNGIST feature), so introduce a
multi-level-list representation for that too.  This adds a bit more
representational overhead, but we might more or less buy that back by not
having to search for matching index columns anymore in createplan.c;
likewise btcostestimate saves some cycles.

Per bug #6351 from Christian Rudolph.  Likely symptoms include the "btree
index keys must be ordered by attribute" failure shown there, as well as
"operator MMMM is not a member of opfamily NNNN".

Although this is a pre-existing problem that can be demonstrated in 9.0 and
9.1, I'm not going to back-patch it, because the API changes in the planner
seem likely to break things such as index plugins.  The corner cases where
this matters seem too narrow to justify possibly breaking things in a minor
release.
2011-12-23 18:45:14 -05:00
Tom Lane 3695a55513 Replace simple constant pg_am.amcanreturn with an AM support function.
The need for this was debated when we put in the index-only-scan feature,
but at the time we had no near-term expectation of having AMs that could
support such scans for only some indexes; so we kept it simple.  However,
the SP-GiST AM forces the issue, so let's fix it.

This patch only installs the new API; no behavior actually changes.
2011-12-18 15:50:37 -05:00
Tom Lane 3e4b3465b6 Improve planner's ability to recognize cases where an IN's RHS is unique.
If the right-hand side of a semijoin is unique, then we can treat it like a
normal join (or another way to say that is: we don't need to explicitly
unique-ify the data before doing it as a normal join).  We were recognizing
such cases when the RHS was a sub-query with appropriate DISTINCT or GROUP
BY decoration, but there's another way: if the RHS is a plain relation with
unique indexes, we can check if any of the indexes prove the output is
unique.  Most of the infrastructure for that was there already in the join
removal code, though I had to rearrange it a bit.  Per reflection about a
recent example in pgsql-performance.
2011-10-26 17:52:29 -04:00
Tom Lane 0f39d5050d Don't trust deferred-unique indexes for join removal.
The uniqueness condition might fail to hold intra-transaction, and assuming
it does can give incorrect query results.  Per report from Marti Raudsepp,
though this is not his proposed patch.

Back-patch to 9.0, where both these features were introduced.  In the
released branches, add the new IndexOptInfo field to the end of the struct,
to try to minimize ABI breakage for third-party code that may be examining
that struct.
2011-10-23 00:43:39 -04:00
Tom Lane 9e8da0f757 Teach btree to handle ScalarArrayOpExpr quals natively.
This allows "indexedcol op ANY(ARRAY[...])" conditions to be used in plain
indexscans, and particularly in index-only scans.
2011-10-16 15:39:24 -04:00
Tom Lane 600d3206d1 Consider index-only scans even when there is no matching qual or ORDER BY.
By popular demand.
2011-10-11 15:00:30 -04:00
Tom Lane a0185461dd Rearrange the implementation of index-only scans.
This commit changes index-only scans so that data is read directly from the
index tuple without first generating a faux heap tuple.  The only immediate
benefit is that indexes on system columns (such as OID) can be used in
index-only scans, but this is necessary infrastructure if we are ever to
support index-only scans on expression indexes.  The executor is now ready
for that, though the planner still needs substantial work to recognize
the possibility.

To do this, Vars in index-only plan nodes have to refer to index columns
not heap columns.  I introduced a new special varno, INDEX_VAR, to mark
such Vars to avoid confusion.  (In passing, this commit renames the two
existing special varnos to OUTER_VAR and INNER_VAR.)  This allows
ruleutils.c to handle them with logic similar to what we use for subplan
reference Vars.

Since index-only scans are now fundamentally different from regular
indexscans so far as their expression subtrees are concerned, I also chose
to change them to have their own plan node type (and hence, their own
executor source file).
2011-10-11 14:21:30 -04:00
Tom Lane a2822fb933 Support index-only scans using the visibility map to avoid heap fetches.
When a btree index contains all columns required by the query, and the
visibility map shows that all tuples on a target heap page are
visible-to-all, we don't need to fetch that heap page.  This patch depends
on the previous patches that made the visibility map reliable.

There's a fair amount left to do here, notably trying to figure out a less
chintzy way of estimating the cost of an index-only scan, but the core
functionality seems ready to commit.

Robert Haas and Ibrar Ahmed, with some previous work by Heikki Linnakangas.
2011-10-07 20:14:13 -04:00
Tom Lane cb37c29106 Fix index matching for operators with mixed collatable/noncollatable inputs.
If an indexable operator for a non-collatable indexed datatype has a
collatable right-hand input type, any OpExpr for it will be marked with a
nonzero inputcollid (since having one collatable input is sufficient to
make that happen).  However, an index on a non-collatable column certainly
doesn't have any collation.  This caused us to fail to match such operators
to their indexes, because indxpath.c required an exact match of index
collation and clause collation.  It seems correct to allow a match when the
index is collation-less regardless of the clause's inputcollid: an operator
with both noncollatable and collatable inputs could perhaps depend on the
collation of the collatable input, but it could hardly expect the index for
the noncollatable input to have that same collation.

Per bug #6232 from Pierre Ducroquet.  His example is specifically about
"hstore ? text" but the problem seems quite generic.
2011-09-29 00:43:42 -04:00
Tom Lane d64713df7e Pass collations to functions in FunctionCallInfoData, not FmgrInfo.
Since collation is effectively an argument, not a property of the function,
FmgrInfo is really the wrong place for it; and this becomes critical in
cases where a cached FmgrInfo is used for varying purposes that might need
different collation settings.  Fix by passing it in FunctionCallInfoData
instead.  In particular this allows a clean fix for bug #5970 (record_cmp
not working).  This requires touching a bit more code than the original
method, but nobody ever thought that collations would not be an invasive
patch...
2011-04-12 19:19:24 -04:00
Tom Lane 3c381a55b0 Teach pattern_fixed_prefix() about collations.
This is necessary, not optional, now that ILIKE and regexes are collation
aware --- else we might derive a wrong comparison constant for index
optimized pattern matches.
2011-04-11 12:28:28 -04:00
Bruce Momjian bf50caf105 pgindent run before PG 9.1 beta 1. 2011-04-10 11:42:00 -04:00
Tom Lane 69f1d5fe14 Clean up minor collation issues in indxpath.c.
Get rid of bogus collation test in match_special_index_operator (even for
ILIKE, the pattern match operator's collation doesn't matter here, and even
if it did the test was testing the wrong thing).
Fix broken looping logic in expand_indexqual_rowcompare.
Add collation check in match_clause_to_ordering_op.
Make naming and argument ordering more consistent; improve comments.
2011-04-08 19:19:17 -04:00
Tom Lane bfa4440ca5 Pass collation to makeConst() instead of looking it up internally.
In nearly all cases, the caller already knows the correct collation, and
in a number of places, the value the caller has handy is more correct than
the default for the type would be.  (In particular, this patch makes it
significantly less likely that eval_const_expressions will result in
changing the exposed collation of an expression.)  So an internal lookup
is both expensive and wrong.
2011-03-25 20:10:42 -04:00
Tom Lane 8df08c8489 Reimplement planner's handling of MIN/MAX aggregate optimization (again).
Instead of playing cute games with pathkeys, just build a direct
representation of the intended sub-select, and feed it through
query_planner to get a Path for the index access.  This is a bit slower
than 9.1's previous method, since we'll duplicate most of the overhead of
query_planner; but since the whole optimization only applies to rather
simple single-table queries, that probably won't be much of a problem in
practice.  The advantage is that we get to do the right thing when there's
a partial index that needs the implicit IS NOT NULL clause to be usable.
Also, although this makes planagg.c be a bit more closely tied to the
ordering of operations in grouping_planner, we can get rid of some coupling
to lower-level parts of the planner.  Per complaint from Marti Raudsepp.
2011-03-22 00:34:31 -04:00
Tom Lane b310b6e31c Revise collation derivation method and expression-tree representation.
All expression nodes now have an explicit output-collation field, unless
they are known to only return a noncollatable data type (such as boolean
or record).  Also, nodes that can invoke collation-aware functions store
a separate field that is the collation value to pass to the function.
This avoids confusion that arises when a function has collatable inputs
and noncollatable output type, or vice versa.

Also, replace the parser's on-the-fly collation assignment method with
a post-pass over the completed expression tree.  This allows us to use
a more complex (and hopefully more nearly spec-compliant) assignment
rule without paying for it in extra storage in every expression node.

Fix assorted bugs in the planner's handling of collations by making
collation one of the defining properties of an EquivalenceClass and
by converting CollateExprs into discardable RelabelType nodes during
expression preprocessing.
2011-03-19 20:30:08 -04:00
Peter Eisentraut 414c5a2ea6 Per-column collation support
This adds collation support for columns and domains, a COLLATE clause
to override it per expression, and B-tree index support.

Peter Eisentraut
reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
2011-02-08 23:04:18 +02:00
Tom Lane 9d1ac2f5fa Tweak create_index_paths()'s test for whether to consider a bitmap scan.
Per my note of a couple days ago, create_index_paths would refuse to
consider any path at all for GIN indexes if the selectivity estimate came
out as 1.0; not even if you tried to force it with enable_seqscan.  While
this isn't really a bad outcome in practice, it could be annoying for
testing purposes.  Adjust the test for "is this path only useful for
sorting" so that it doesn't fire on paths with nil pathkeys, which will
include all GIN paths.
2011-01-11 12:13:02 -05:00
Bruce Momjian 5d950e3b0c Stamp copyrights for year 2011. 2011-01-01 13:18:15 -05:00
Tom Lane d583f10b7e Create core infrastructure for KNNGIST.
This is a heavily revised version of builtin_knngist_core-0.9.  The
ordering operators are no longer mixed in with actual quals, which would
have confused not only humans but significant parts of the planner.
Instead, ordering operators are carried separately throughout planning and
execution.

Since the API for ambeginscan and amrescan functions had to be changed
anyway, this commit takes the opportunity to rationalize that a bit.
RelationGetIndexScan no longer forces a premature index_rescan call;
instead, callers of index_beginscan must call index_rescan too.  Aside from
making the AM-side initialization logic a bit less peculiar, this has the
advantage that we do not make a useless extra am_rescan call when there are
runtime key values.  AMs formerly could not assume that the key values
passed to amrescan were actually valid; now they can.

Teodor Sigaev and Tom Lane
2010-12-02 20:51:37 -05:00
Tom Lane c0b5fac701 Simplify and speed up mapping of index opfamilies to pathkeys.
Formerly we looked up the operators associated with each index (caching
them in relcache) and then the planner looked up the btree opfamily
containing such operators in order to build the btree-centric pathkey
representation that describes the index's sort order.  This is quite
pointless for btree indexes: we might as well just use the index's opfamily
information directly.  That saves syscache lookup cycles during planning,
and furthermore allows us to eliminate the relcache's caching of operators
altogether, which may help in reducing backend startup time.

I added code to plancat.c to perform the same type of double lookup
on-the-fly if it's ever faced with a non-btree amcanorder index AM.
If such a thing actually becomes interesting for production, we should
replace that logic with some more-direct method for identifying the
corresponding btree opfamily; but it's not worth spending effort on now.

There is considerably more to do pursuant to my recent proposal to get rid
of sort-operator-based representations of sort orderings, but this patch
grabs some of the low-hanging fruit.  I'll look at the remainder of that
work after the current commitfest.
2010-11-29 12:30:43 -05:00
Tom Lane 89a368418c Further cleanup of indxpath logic related to IndexOptInfo.opfamily array.
We no longer need the terminating zero entry in opfamily[], so get rid of
it.  Also replace assorted ad-hoc looping logic with simple for and foreach
constructs.  This code is now noticeably more readable than it was an hour
ago; credit to Robert for seeing that it could be simplified.
2010-11-20 15:07:16 -05:00
Robert Haas 99bc012d51 Minor cleanup of indxpath.c.
Eliminate some superfluous notational complexity around
match_clause_to_indexcol(), and rip out the DoneMatchingIndexKeys
crock.
2010-11-20 13:57:27 -05:00