SELECT DISTINCT ON (expr [, expr ...]) targetlist ...
and there is a check to make sure that the user didn't specify an ORDER BY
that's incompatible with the DISTINCT operation.
Reimplement nodeUnique and nodeGroup to use the proper datatype-specific
equality function for each column being compared --- they used to do
bitwise comparisons or convert the data to text strings and strcmp().
(To add insult to injury, they'd look up the conversion functions once
for each tuple...) Parse/plan representation of DISTINCT is now a list
of SortClause nodes.
initdb forced by querytree change...
pghackers discussion of 5-Jan-2000. The amopselect and amopnpages
estimators are gone, and in their place is a per-AM amcostestimate
procedure (linked to from pg_am, not pg_amop).
subselects can only appear on the righthand side of a binary operator.
That's still true for quantified predicates like x = ANY (SELECT ...),
but a subselect that delivers a single result can now appear anywhere
in an expression. This is implemented by changing EXPR_SUBLINK sublinks
to represent just the (SELECT ...) expression, without any 'left hand
side' or combining operator --- so they're now more like EXISTS_SUBLINK.
To handle the case of '(x, y, z) = (SELECT ...)', I added a new sublink
type MULTIEXPR_SUBLINK, which acts just like EXPR_SUBLINK used to.
But the grammar will only generate one for a multiple-left-hand-side
row expression.
mentioned in FROM but not elsewhere in the query: such tables should be
joined over anyway. Aside from being more standards-compliant, this allows
removal of some very ugly hacks for COUNT(*) processing. Also, allow
HAVING clause without aggregate functions, since SQL does. Clean up
CREATE RULE statement-list syntax the same way Bruce just fixed the
main stmtmulti production.
CAUTION: addition of a field to RangeTblEntry nodes breaks stored rules;
you will have to initdb if you have any rules.
Frankpitt, plus some improvements from yours truly. The simplifier depends
on the proiscachable field of pg_proc to tell it whether a function is
safe to pre-evaluate --- things like nextval() are not, for example.
Update pg_proc.h to contain reasonable cacheability information; as of
6.5.* hardly any functions were marked cacheable. I may have erred too
far in the other direction; see recent mail to pghackers for more info.
This update does not force an initdb, exactly, but you won't see much
benefit from the simplifier until you do one.
additional argument specifying the kind of lock to acquire/release (or
'NoLock' to do no lock processing). Ensure that all relations are locked
with some appropriate lock level before being examined --- this ensures
that relevant shared-inval messages have been processed and should prevent
problems caused by concurrent VACUUM. Fix several bugs having to do with
mismatched increment/decrement of relation ref count and mismatched
heap_open/close (which amounts to the same thing). A bogus ref count on
a relation doesn't matter much *unless* a SI Inval message happens to
arrive at the wrong time, which is probably why we got away with this
sloppiness for so long. Repair missing grab of AccessExclusiveLock in
DROP TABLE, ALTER/RENAME TABLE, etc, as noted by Hiroshi.
Recommend 'make clean all' after pulling this update; I modified the
Relation struct layout slightly.
Will post further discussion to pghackers list shortly.
conditions. There are some pretty bogus heuristics in prepqual.c that
try to decide whether to output CNF or DNF format; they need to be replaced,
likely. Right now the code is probably too willing to choose DNF form,
which might hurt performance in some cases that used to work OK.
But at least we have a foundation to build on.
Most parts of the planner should ignore, or indeed never even see, uplevel
Vars because they will be or have been replaced by Params. There were a
couple of places that got it wrong though, probably my fault from recent
changes...
documented intepretation of the lefthand and oper fields. Fix a number of
obscure problems while at it --- for example, the old code failed if the parser
decided to insert a type-coercion function just below the operator of a
SubLink.
CAUTION: this will break stored rules that contain subplans. You may
need to initdb.
and fix_opids processing to a single recursive pass over the plan tree
executed at the very tail end of planning, rather than haphazardly here
and there at different places. Now that tlist Vars do not get modified
until the very end, it's possible to get rid of the klugy var_equal and
match_varid partial-matching routines, and just use plain equal()
throughout the optimizer. This is a step towards allowing merge and
hash joins to be done on expressions instead of only Vars ...
sort order down into planner, instead of handling it only at the very top
level of the planner. This fixes many things. An explicit sort is now
avoided if there is a cheaper alternative (typically an indexscan) not
only for ORDER BY, but also for the internal sort of GROUP BY. It works
even when there is no other reason (such as a WHERE condition) to consider
the indexscan. It works for indexes on functions. It works for indexes
on functions, backwards. It's just so cool...
CAUTION: I have changed the representation of SortClause nodes, therefore
THIS UPDATE BREAKS STORED RULES. You will need to initdb.
above a Sort or Materialize node. As far as I can tell, the only place
that actually needed that was set_tlist_references, which was being lazy
about checking to see if it had a noname node to fix or not...
store all ordering information in pathkeys lists (which are now lists of
lists of PathKeyItem nodes, not just lists of lists of vars). This was
a big win --- the code is smaller and IMHO more understandable than it
was, even though it handles more cases. I believe the node changes will
not force an initdb for anyone; planner nodes don't show up in stored
rules.
commuted (ie, the index var appears on the right). These are now handled
the same way as merge and hash join quals that need to be commuted: the
actual reversing of the clause only happens if we actually choose the path
and generate a plan from it. Furthermore, the clause is only reversed in
the 'indexqual' field of the plan, not in the 'indxqualorig' field. This
allows the clause to still be recognized and removed from qpquals of upper
level join plans. Also, simplify and generalize match_clause_to_indexkey;
now it recognizes binary-compatible indexes for join as well as restriction
clauses.
work under a wider range of scenarios than it did --- it formerly did not
handle a multi-pass inner scan, nor cases in which the inner scan's
indxqualorig or non-index qual contained outer var references. I am not
sure that these limitations could be hit in the existing optimizer, but
they need to be fixed for future expansion.
> >
> > was implemented by Jan Wieck.
> > His work is for ascending order cases.
> >
> > Here is a patch to prevent sorting also in descending
> > order cases.
> > Because I had already changed _bt_first() to position
> > backward correctly before v6.5,this patch would work.
> >
Hiroshi Inoue
Inoue@tpf.co.jp
multi-scan indexscan plans; it tried to use the same table-to-index
attribute mapping for all the scans, even if they used different indexes.
It would klugily work as long as OR indexquals never used multikey indexes,
but that's not likely to hold up much longer...
so that Case works in WHERE join clauses. Temporary patch --- this routine
is one of many that ought to be changed to use centralized expression-tree-
walking logic.
optimizer rather than parser. This has many advantages, such as not
getting fooled by chance uses of operator names ~ and ~~ (the operators
are identified by OID now), and not creating useless comparison operations
in contexts where the comparisons will not actually be used as indexquals.
The new code also recognizes exact-match LIKE and regex patterns, and
produces an = indexqual instead of >= and <=.
This change does NOT fix the problem with non-ASCII locales: the code
still doesn't know how to generate an upper bound indexqual for non-ASCII
collation order. But it's no worse than before, just the same deficiency
in a different place...
Also, dike out loc_restrictinfo fields in Plan nodes. These were doing
nothing useful in the absence of 'expensive functions' optimization,
and they took a considerable amount of processing to fill in.
identified by Hiroshi (incorrect cost attributed to OR clauses
after multiple passes through set_rest_selec()). I think the code
was trying to allow selectivities of OR subclauses to be passed in
from outside, but noplace was actually passing any useful data, and
set_rest_selec() was passing wrong data.
Restructure representation of "indexqual" in IndexPath nodes so that
it is the same as for indxqual in completed IndexScan nodes: namely,
a toplevel list with an entry for each pass of the index scan, having
sublists that are implicitly-ANDed index qual conditions for that pass.
You don't want to know what the old representation was :-(
Improve documentation of OR-clause indexscan functions.
Remove useless 'notclause' field from RestrictInfo nodes. (This might
force an initdb for anyone who has stored rules containing RestrictInfos,
but I do not think that RestrictInfo ever appears in completed plans.)
tlist and qual are NULL. It ought to handle these the same as the cases
where tlist contains only constant expressions, ie, be willing to generate
a Result-node plan. This did not use to matter, but it does now because
union_planner will flatten the tlist when aggregates are present. Thus,
'select count(1) from table' now causes query_planner to be given a null
tlist, and to duplicate 6.4's behavior we need it to give back a Result
plan rather than refusing the query. 6.4 was arguably doing the Wrong
Thing for this query, but I'm not going to open a semantics issue right
before 6.5 release ... can revisit that problem later.
returned NULL, which it will do in some cases where an elog(ERROR) would
probably be more appropriate. For the moment, generate a not-very-
informative error message rather than proceeding to certain coredump.
Probably ought to think about making query_planner elog instead of
returning NULL, but this is at least a safe change for now.
lists are now plain old garden-variety Lists, allocated with palloc,
rather than specialized expansible-array data allocated with malloc.
This substantially simplifies their handling and eliminates several
sources of memory leakage.
Several basic types of erroneous queries (syntax error, attempt to
insert a duplicate key into a unique index) now demonstrably leak
zero bytes per query.
change functionality, but makes the code more ANSI C'ish.
My AIX xlc compiler barfs on all of these. Can someone please
review and apply to current.
<<port.patch>>
Thanks
Andreas
_copyResult didn't copy subPlan structure completely. _copyAgg is still
busted, apparently because of changes from EXCEPT/INTERSECT patch
(get_agg_tlist_references is no longer sufficient to find all aggregates).
No time to look at that tonight, however.
so remove them from MergeJoin node. Hack together a partial
solution for commuted mergejoin operators --- yesterday
a mergejoin int4 = int8 would crash if the planner decided to
commute it, today it works. The planner's representation of
mergejoins really needs a rewrite though.
Also, further testing of mergejoin ops in opr_sanity regress test.
Ok. I made patches replacing all of "#if FALSE" or "#if 0" to "#ifdef
NOT_USED" for current. I have tested these patches in that the
postgres binaries are identical.
INTERSECT and EXCEPT is available for postgresql-v6.4!
The patch against v6.4 is included at the end of the current text
(in uuencoded form!)
I also included the text of my Master's Thesis. (a postscript
version). I hope that you find something of it useful and would be
happy if parts of it find their way into the PostgreSQL documentation
project (If so, tell me, then I send the sources of the document!)
The contents of the document are:
-) The first chapter might be of less interest as it gives only an
overview on SQL.
-) The second chapter gives a description on much of PostgreSQL's
features (like user defined types etc. and how to use these features)
-) The third chapter starts with an overview of PostgreSQL's internal
structure with focus on the stages a query has to pass (i.e. parser,
planner/optimizer, executor). Then a detailed description of the
implementation of the Having clause and the Intersect/Except logic is
given.
Originally I worked on v6.3.2 but never found time enough to prepare
and post a patch. Now I applied the changes to v6.4 to get Intersect
and Except working with the new version. Chapter 3 of my documentation
deals with the changes against v6.3.2, so keep that in mind when
comparing the parts of the code printed there with the patched sources
of v6.4.
Here are some remarks on the patch. There are some things that have
still to be done but at the moment I don't have time to do them
myself. (I'm doing my military service at the moment) Sorry for that
:-(
-) I used a rewrite technique for the implementation of the Except/Intersect
logic which rewrites the query to a semantically equivalent query before
it is handed to the rewrite system (for views, rules etc.), planner,
executor etc.
-) In v6.3.2 the types of the attributes of two select statements
connected by the UNION keyword had to match 100%. In v6.4 the types
only need to be familiar (i.e. int and float can be mixed). Since this
feature did not exist when I worked on Intersect/Except it
does not work correctly for Except/Intersect queries WHEN USED IN
COMBINATION WITH UNIONS! (i.e. sometimes the wrong type is used for the
resulting table. This is because until now the types of the attributes of
the first select statement have been used for the resulting table.
When Intersects and/or Excepts are used in combination with Unions it
might happen, that the first select statement of the original query
appears at another position in the query which will be executed. The reason
for this is the technique used for the implementation of
Except/Intersect which does a query rewrite!)
NOTE: It is NOT broken for pure UNION queries and pure INTERSECT/EXCEPT
queries!!!
-) I had to add the field intersect_clause to some data structures
but did not find time to implement printfuncs for the new field.
This does NOT break the debug modes but when an Except/Intersect
is used the query debug output will be the already rewritten query.
-) Massive changes to the grammar rules for SELECT and INSERT statements
have been necessary (see comments in gram.y and documentation for
deatails) in order to be able to use mixed queries like
(SELECT ... UNION (SELECT ... EXCEPT SELECT)) INTERSECT SELECT...;
-) When using UNION/EXCEPT/INTERSECT you will get:
NOTICE: equal: "Don't know if nodes of type xxx are equal".
I did not have time to add comparsion support for all the needed nodes,
but the default behaviour of the function equal met my requirements.
I did not dare to supress this message!
That's the reason why the regression test for union will fail: These
messages are also included in the union.out file!
-) Somebody of you changed the union_planner() function for v6.4
(I copied the targetlist to new_tlist and that was removed and
replaced by a cleanup of the original targetlist). These chnages
violated some having queries executed against views so I changed
it back again. I did not have time to examine the differences between the
two versions but now it works :-)
If you want to find out, try the file queries/view_having.sql on
both versions and compare the results . Two queries won't produce a
correct result with your version.
regards
Stefan
[AC_MSG_RESULT(yes) AC_DEFINE(HAVE_LONG_INT_64)],
this line produces something like:
echo "$ac_t""yes" 1>&6 cat >> confdefs.h <<\EOF
and would append garbage "yes cat" to confdefs.h. Of course the
result confdefs.h is not syntactically correct therefore following
tests using confdefs.h would all fail. To avoid the problem, we
could switch the order of AC_MSG_RESULT and AC_DEFINE (see attached
patch). This happend on my LinuxPPC box.
Tatsuo Ishii t-ishii@sra.co.jp
Everyone using an [NOT] EXISTS subquery will have noticed that
already.
The bug is in "subselect.c" in the function "SS_process_sublinks()".
Here the whole function as it *SHOULD BE*:
Stephan
now. Here some tested features, (examples included in the patch):
1.1) Subselects in the having clause 1.2) Double nested subselects
1.3) Subselects used in the where clause and in the having clause
simultaneously 1.4) Union Selects using having 1.5) Indexes
on the base relations are used correctly 1.6) Unallowed Queries
are prevented (e.g. qualifications in the
having clause that belong to the where clause) 1.7) Insert
into as select
2) Queries using the having clause on view relations also work
but there are some restrictions:
2.1) Create View as Select ... Having ...; using base tables in
the select 2.1.1) The Query rewrite system:
2.1.2) Why are only simple queries allowed against a view from 2.1)
? 2.2) Select ... from testview1, testview2, ... having...; 3) Bug
in ExecMergeJoin ??
Regards Stefan
Attached you'll find a (big) patch that fixes make dep and make
depend in all Makefiles where I found it to be appropriate.
It also removes the dependency in Makefile.global for NAMEDATALEN
and OIDNAMELEN by making backend/catalog/genbki.sh and bin/initdb/initdb.sh
a little smarter.
This no longer requires initdb.sh that is turned into initdb with
a sed script when installing Postgres, hence initdb.sh should be
renamed to initdb (after the patch has been applied :-) )
This patch is against the 6.3 sources, as it took a while to
complete.
Please review and apply,
Cheers,
Jeroen van Vianen
sequential scans! (I think it will also work with hash, index, etc
but I did not check it out! I made some High level changes which
should work for all access methods, but maybe I'm wrong. Please
let me know.)
Now it is possible to make queries like:
select s.sname, max(p.pid), min(p.pid) from part p, supplier s
where s.sid=p.sid group by s.sname having max(pid)=6 and min(pid)=1
or avg(pid)=4;
Having does not work yet for queries that contain a subselect
statement in the Having clause, I'll try to fix this in the next
days.
If there are some bugs, please let me know, I'll start to read the
mailinglists now!
Now here is the patch against the original 6.3 version (no snapshot!!):
Stefan
of some global variables to support subselects and calls union_planner().
Calls to SS_replace_correlation_vars() and SS_process_sublinks() in
query_planner() before planning.
Get rid of #ifdef INDEXSCAN_PATCH in createplan.c.
Pass List* of Aggregs into executor, and create needed array there.
No longer need to double-processs Aggregs with second copy in Query.
Fix crash when doing:
select sum(x+1) from test where 1 > 0;
Makefile.global.
End result, if all goes well, should allow for much easier porting, since
there will no longer be a concept of a "port". Most, if not everything,
*should* be determined by configure, or by the compiler itself. Still
work to be done though :)
nestloop's join clauses doesn't work in some cases:
* 1. fix_indxqual_references may change varattno-s in
* inner_indxqual;
* 2. clauses may be commuted
Subject: [HACKERS] linux/alpha patches
These patches lay the groundwork for a Linux/Alpha port. The port doesn't
actually work unless you tweak the linker to put all the pointers in the
first 32 bits of the address space, but it's at least a start. It
implements the test-and-set instruction in Alpha assembly, and also fixes
a lot of pointer-to-integer conversions, which is probably good anyway.
The problem is that the function arguments are not considered as possible key
candidates for index scan and so only a sequential scan is possible inside
the body of a function. I have therefore made some patches to the optimizer
so that indices are now used also by functions. I have also moved the plan
debug message from pg_eval to pg_plan so that it is printed also for plans
genereated for function execution. I had also to add an index rescan to the
executor because it ignored the parameters set in the execution state, they
were flagged as runtime variables in ExecInitIndexScan but then never used
by the executor so that the scan were always done with any key=1. Very odd.
This means that an index rescan is now done twice for each function execution
which uses an index, the first time when the index scan is initialized and
the second when the actual function arguments are finally available for the
execution. I don't know what is the cost of an double index scan but I
suppose it is anyway less than the cost of a full sequential scan, at leat
for large tables. This is my patch, you must also add -DINDEXSCAN_PATCH in
Makefile.global to enable the changes.
Submitted by: Massimo Dal Zotto <dz@cs.unitn.it>