Up to now, EXPLAIN has contented itself with printing the sort expressions
in a Sort or Merge Append plan node. This patch improves that by
annotating the sort keys with COLLATE, DESC, USING, and/or NULLS FIRST/LAST
whenever nondefault sort ordering options are used. The output is now a
reasonably close approximation of an ORDER BY clause equivalent to the
plan's ordering.
Marius Timmer, Lukas Kreft, and Arne Scheffer; reviewed by Mike Blackwell.
Some additional hacking by me.
The collate.linux.utf8 test covers some of the same territory, but
isn't portable and so probably does not get run often, or on
non-Linux platforms. If this approach turns out to be sufficiently
portable, we may want to look at trimming the redundant tests out
of that file to avoid duplication.
Robins Tharakan, reviewed by Michael Paquier and Fabien Coelho,
with further changes and cleanup by me.
ORDER BY expressions were being treated the same as regular aggregate
arguments for purposes of collation determination, but really they should
not affect the aggregate's collation at all; only collations of the
aggregate's regular arguments should affect it.
In many cases this mistake would lead to incorrectly throwing a "collation
conflict" error; but in some cases the corrected code will silently assign
a different collation to the aggregate than before, for example
agg(foo ORDER BY bar COLLATE "x")
which will now use foo's collation rather than "x" for the aggregate.
Given this risk and the lack of field complaints about the issue, it
doesn't seem prudent to back-patch.
In passing, rearrange code in assign_collations_walker so that we don't
need multiple copies of the standard logic for computing collation of a
node with children. (Previously, CaseExpr duplicated the standard logic,
and we would have needed a third copy for Aggref without this change.)
Andrew Gierth and David Fetter
Because coerce_type recurses into the argument of a CollateExpr,
coerce_to_target_type's longstanding code for detecting whether coerce_type
had actually done anything (to wit, returned a different node than it
passed in) was broken in 9.1. This resulted in unexpected failures in
hide_coercion_node; which was not the latter's fault, since it's critical
that we never call it on anything that wasn't inserted by coerce_type.
(Else we might decide to "hide" a user-written function call.)
Fix by removing and replacing the CollateExpr in coerce_to_target_type
itself. This is all pretty ugly but I don't immediately see a way to make
it nicer.
Per report from Jean-Yves F. Barbier.
Per spec we ought to apply select_common_collation() across the expressions
in each column of the VALUES table. The original coding was just taking
the first row and assuming it was representative.
This patch adds a field to struct RangeTblEntry to carry the resolved
collations, so initdb is forced for changes in stored rule representation.
If the referencing and referenced columns have different collations,
the parser will be unable to resolve which collation to use unless it's
helped out in this way. The effects are sometimes masked, if we end up
using a non-collation-sensitive plan; but if we do use a mergejoin
we'll see a failure, as recently noted by Robert Haas.
The SQL spec states that the referenced column's collation should be used
to resolve RI checks, so that's what we do. Note however that we currently
don't append a COLLATE clause when writing a query that examines only the
referencing column. If we ever support collations that have varying
notions of equality, that will have to be changed. For the moment, though,
it's preferable to leave it off so that we can use a normal index on the
referencing column.
Remove crude hack that tried to propagate collation through a
function-returning-record, ie, from the function's arguments to individual
fields selected from its result record. That is just plain inconsistent,
because the function result is composite and cannot have a collation;
and there's no hope of making this kind of action-at-a-distance work
consistently. Adjust regression test cases that expected this to happen.
Meanwhile, the behavior of casting to a domain with a declared collation
stays the same as it was, since that seemed to be the consensus.
Ensure that COLLATE at the top level of an index expression is treated the
same as a grammatically separate COLLATE. Fix bogus reverse-parsing logic
in pg_get_indexdef.
This restores a parse error that was thrown (though only in the ORDER BY
case) by the original collation patch. I had removed it in my recent
revisions because it was thrown at a place where collations now haven't
been computed yet; but I thought of another way to handle it.
Throwing the error at parse time, rather than leaving it to be done at
runtime, is good because a syntax error pointer is helpful for localizing
the problem. We can reasonably assume that the comparison function for a
collatable datatype will complain if it doesn't have a collation to use.
Now the planner might choose to implement GROUP or DISTINCT via hashing,
in which case no runtime error would actually occur, but it seems better
to throw error consistently rather than let the error depend on what the
planner chooses to do. Another possible objection is that the user might
specify a nondefault sort operator that doesn't care about collation
... but that's surely an uncommon usage, and it wouldn't hurt him to throw
in a COLLATE clause anyway. This change also makes the ORDER BY/GROUP
BY/DISTINCT case more consistent with the UNION/INTERSECT/EXCEPT case,
which was already coded to throw this error even though the same objections
could be raised there.