Update some obsolete info about GEQO.

This commit is contained in:
Tom Lane 2000-12-16 22:44:47 +00:00
parent a238cb5a8a
commit 0c2629efaa

View File

@ -1,5 +1,5 @@
<!-- <!--
$Header: /cvsroot/pgsql/doc/src/sgml/geqo.sgml,v 1.13 2000/09/29 20:21:33 petere Exp $ $Header: /cvsroot/pgsql/doc/src/sgml/geqo.sgml,v 1.14 2000/12/16 22:44:47 tgl Exp $
Genetic Optimizer Genetic Optimizer
--> -->
@ -49,7 +49,7 @@ Genetic Optimizer
grows exponentially with the number of <command>join</command>s included in it. Further grows exponentially with the number of <command>join</command>s included in it. Further
optimization effort is caused by the support of a variety of optimization effort is caused by the support of a variety of
<firstterm>join methods</firstterm> <firstterm>join methods</firstterm>
(e.g., nested loop, index scan, merge join in <productname>Postgres</productname>) to (e.g., nested loop, hash join, merge join in <productname>Postgres</productname>) to
process individual <command>join</command>s and a diversity of process individual <command>join</command>s and a diversity of
<firstterm>indices</firstterm> (e.g., r-tree, <firstterm>indices</firstterm> (e.g., r-tree,
b-tree, hash in <productname>Postgres</productname>) as access paths for relations. b-tree, hash in <productname>Postgres</productname>) as access paths for relations.
@ -57,8 +57,8 @@ Genetic Optimizer
<para> <para>
The current <productname>Postgres</productname> optimizer The current <productname>Postgres</productname> optimizer
implementation performs a <firstterm>near- implementation performs a <firstterm>near-exhaustive search</firstterm>
exhaustive search</firstterm> over the space of alternative strategies. This query over the space of alternative strategies. This query
optimization technique is inadequate to support database application optimization technique is inadequate to support database application
domains that involve the need for extensive queries, such as artificial domains that involve the need for extensive queries, such as artificial
intelligence. intelligence.
@ -74,8 +74,8 @@ Genetic Optimizer
</para> </para>
<para> <para>
Performance difficulties within exploring the space of possible query Performance difficulties in exploring the space of possible query
plans arose the demand for a new optimization technique being developed. plans created the demand for a new optimization technique being developed.
</para> </para>
<para> <para>
@ -88,10 +88,11 @@ Genetic Optimizer
<title>Genetic Algorithms (<acronym>GA</acronym>)</title> <title>Genetic Algorithms (<acronym>GA</acronym>)</title>
<para> <para>
The <acronym>GA</acronym> is a heuristic optimization method which operates through The <acronym>GA</acronym> is a heuristic optimization method which
operates through
determined, randomized search. The set of possible solutions for the determined, randomized search. The set of possible solutions for the
optimization problem is considered as a optimization problem is considered as a
<firstterm>erm>popula</firstterm>erm> of <firstterm>individuals</firstterm>. <firstterm>population</firstterm> of <firstterm>individuals</firstterm>.
The degree of adaption of an individual to its environment is specified The degree of adaption of an individual to its environment is specified
by its <firstterm>fitness</firstterm>. by its <firstterm>fitness</firstterm>.
</para> </para>
@ -167,7 +168,8 @@ P''(t) generation of descendants at a time t
</programlisting> </programlisting>
is encoded by the integer string '4-1-3-2', is encoded by the integer string '4-1-3-2',
which means, first join relation '4' and '1', then '3', and which means, first join relation '4' and '1', then '3', and
then '2', where 1, 2, 3, 4 are relids in <productname>Postgres</productname>. then '2', where 1, 2, 3, 4 are relids within the
<productname>Postgres</productname> optimizer.
</para> </para>
<para> <para>
@ -192,9 +194,10 @@ P''(t) generation of descendants at a time t
<listitem> <listitem>
<para> <para>
Usage of <firstterm>edge recombination crossover</firstterm> which is especially suited Usage of <firstterm>edge recombination crossover</firstterm> which is
especially suited
to keep edge losses low for the solution of the to keep edge losses low for the solution of the
<acronym>cro</acronym>cronym> by means of a <acronym>GA</acronym>; <acronym>TSP</acronym> by means of a <acronym>GA</acronym>;
</para> </para>
</listitem> </listitem>
@ -208,40 +211,19 @@ P''(t) generation of descendants at a time t
</para> </para>
<para> <para>
The <acronym>GEQO</acronym> module gives the following benefits to The <acronym>GEQO</acronym> module allows
the <productname>Postgres</productname> DBMS the <productname>Postgres</productname> query optimizer to
compared to the <productname>Postgres</productname> query optimizer implementation: support large <command>join</command> queries effectively through
non-exhaustive search.
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>
Handling of large <command>join</command> queries through non-exhaustive search;
</para>
</listitem>
<listitem>
<para>
Improved cost size approximation of query plans since no longer
plan merging is needed (the <acronym>GEQO</acronym> module evaluates the cost for a
query plan as an individual).
</para>
</listitem>
</itemizedlist>
</para> </para>
</sect1> <sect2 id="geqo-future">
<sect1 id="geqo-future">
<title>Future Implementation Tasks for <title>Future Implementation Tasks for
<productname>PostgreSQL</> <acronym>GEQO</acronym></title> <productname>PostgreSQL</> <acronym>GEQO</acronym></title>
<sect2>
<title>Basic Improvements</title>
<sect3>
<title>Improve genetic algorithm parameter settings</title>
<para> <para>
Work is still needed to improve the genetic algorithm parameter
settings.
In file <filename>backend/optimizer/geqo/geqo_params.c</filename>, routines In file <filename>backend/optimizer/geqo/geqo_params.c</filename>, routines
<function>gimme_pool_size</function> and <function>gimme_number_generations</function>, <function>gimme_pool_size</function> and <function>gimme_number_generations</function>,
we have to find a compromise for the parameter settings we have to find a compromise for the parameter settings
@ -259,38 +241,9 @@ P''(t) generation of descendants at a time t
</listitem> </listitem>
</itemizedlist> </itemizedlist>
</para> </para>
</sect3>
<sect3>
<title>Find better solution for integer overflow</title>
<para>
In file <filename>backend/optimizer/geqo/geqo_eval.c</filename>, routine
<function>geqo_joinrel_size</function>,
the present hack for MAXINT overflow is to set the <productname>Postgres</productname> integer
value of <structfield>rel->size</structfield> to its logarithm.
Modifications of <structname>Rel</structname> in <filename>backend/nodes/relation.h</filename> will
surely have severe impacts on the whole <productname>Postgres</productname> implementation.
</para>
</sect3>
<sect3>
<title>Find solution for exhausted memory</title>
<para>
Memory exhaustion may occur with more than 10 relations involved in a query.
In file <filename>backend/optimizer/geqo/geqo_eval.c</filename>, routine
<function>gimme_tree</function> is recursively called.
Maybe I forgot something to be freed correctly, but I dunno what.
Of course the <structname>rel</structname> data structure of the
<command>join</command> keeps growing and
growing the more relations are packed into it.
Suggestions are welcome :-(
</para>
</sect3>
</sect2> </sect2>
<bibliography id="geqo-biblio"> <bibliography id="geqo-biblio">
<title> <title>
References References