This patch makes a few incremental improvements to geqo.sgml and
arch-dev.sgml.

Neil Conway
parent 04e401f97f
commit a17b53753e
|
@ -1,5 +1,5 @@
|
||||||
<!--
|
<!--
|
||||||
$Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tgl Exp $
|
$Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.22 2003/09/29 18:18:35 momjian Exp $
|
||||||
-->
|
-->
|
||||||
|
|
||||||
<chapter id="overview">
|
<chapter id="overview">
|
||||||
|
@ -25,7 +25,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
|
||||||
very extensive. Rather, this chapter is intended to help the reader
|
very extensive. Rather, this chapter is intended to help the reader
|
||||||
understand the general sequence of operations that occur within the
|
understand the general sequence of operations that occur within the
|
||||||
backend from the point at which a query is received, to the point
|
backend from the point at which a query is received, to the point
|
||||||
when the results are returned to the client.
|
at which the results are returned to the client.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<sect1 id="query-path">
|
<sect1 id="query-path">
|
||||||
|
@ -79,7 +79,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
|
||||||
<step>
|
<step>
|
||||||
<para>
|
<para>
|
||||||
The <firstterm>planner/optimizer</firstterm> takes
|
The <firstterm>planner/optimizer</firstterm> takes
|
||||||
the (rewritten) querytree and creates a
|
the (rewritten) query tree and creates a
|
||||||
<firstterm>query plan</firstterm> that will be the input to the
|
<firstterm>query plan</firstterm> that will be the input to the
|
||||||
<firstterm>executor</firstterm>.
|
<firstterm>executor</firstterm>.
|
||||||
</para>
|
</para>
|
||||||
|
@ -183,12 +183,12 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
|
||||||
<title>Parser</title>
|
<title>Parser</title>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
The parser has to check the query string (which arrives as
|
The parser has to check the query string (which arrives as plain
|
||||||
plain ASCII text) for valid syntax. If the syntax is correct a
|
ASCII text) for valid syntax. If the syntax is correct a
|
||||||
<firstterm>parse tree</firstterm> is built up and handed back otherwise an error is
|
<firstterm>parse tree</firstterm> is built up and handed back;
|
||||||
returned. For the implementation the well known Unix
|
otherwise an error is returned. The parser and lexer are
|
||||||
tools <application>lex</application> and <application>yacc</application>
|
implemented using the well-known Unix tools <application>yacc</>
|
||||||
are used.
|
and <application>lex</>.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
|
@ -201,23 +201,22 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
The parser is defined in the file <filename>gram.y</filename> and consists of a
|
The parser is defined in the file <filename>gram.y</filename> and
|
||||||
set of <firstterm>grammar rules</firstterm> and <firstterm>actions</firstterm>
|
consists of a set of <firstterm>grammar rules</firstterm> and
|
||||||
that are executed
|
<firstterm>actions</firstterm> that are executed whenever a rule
|
||||||
whenever a rule is fired. The code of the actions (which
|
is fired. The code of the actions (which is actually C code) is
|
||||||
is actually C-code) is used to build up the parse tree.
|
used to build up the parse tree.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
The file <filename>scan.l</filename> is transformed to
|
The file <filename>scan.l</filename> is transformed to the C
|
||||||
the C-source file <filename>scan.c</filename>
|
source file <filename>scan.c</filename> using the program
|
||||||
using the program <application>lex</application>
|
<application>lex</application> and <filename>gram.y</filename> is
|
||||||
and <filename>gram.y</filename> is transformed to
|
transformed to <filename>gram.c</filename> using
|
||||||
<filename>gram.c</filename> using <application>yacc</application>.
|
<application>yacc</application>. After these transformations
|
||||||
After these transformations have taken
|
have taken place a normal C compiler can be used to create the
|
||||||
place a normal C-compiler can be used to create the
|
parser. Never make any changes to the generated C files as they
|
||||||
parser. Never make any changes to the generated C-files as they will
|
will be overwritten the next time <application>lex</application>
|
||||||
be overwritten the next time <application>lex</application>
|
|
||||||
or <application>yacc</application> is called.
|
or <application>yacc</application> is called.
|
||||||
|
|
||||||
<note>
|
<note>
|
||||||
|
@ -334,15 +333,27 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
|
||||||
<title>Planner/Optimizer</title>
|
<title>Planner/Optimizer</title>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
The task of the <firstterm>planner/optimizer</firstterm> is to create an optimal
|
The task of the <firstterm>planner/optimizer</firstterm> is to
|
||||||
execution plan. It first considers all possible ways of
|
create an optimal execution plan. A given SQL query (and hence, a
|
||||||
<firstterm>scanning</firstterm> and <firstterm>joining</firstterm>
|
query tree) can be actually executed in a wide variety of
|
||||||
the relations that appear in a
|
different ways, each of which will produce the same set of
|
||||||
query. All the created paths lead to the same result and it's the
|
results. If it is computationally feasible, the query optimizer
|
||||||
task of the optimizer to estimate the cost of executing each path and
|
will examine each of these possible execution plans, ultimately
|
||||||
find out which one is the cheapest.
|
selecting the execution plan that will run the fastest.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
|
<note>
|
||||||
|
<para>
|
||||||
|
In some situations, examining each possible way in which a query
|
||||||
|
may be executed would take an excessive amount of time and memory
|
||||||
|
space. In particular, this occurs when executing queries
|
||||||
|
involving large numbers of join operations. In order to determine
|
||||||
|
a reasonable (not optimal) query plan in a reasonable amount of
|
||||||
|
time, <productname>PostgreSQL</productname> uses a <xref
|
||||||
|
linkend="geqo" endterm="geqo-title">.
|
||||||
|
</para>
|
||||||
|
</note>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
After the cheapest path is determined, a <firstterm>plan tree</>
|
After the cheapest path is determined, a <firstterm>plan tree</>
|
||||||
is built to pass to the executor. This represents the desired
|
is built to pass to the executor. This represents the desired
|
||||||
|
@ -373,7 +384,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
|
||||||
After all feasible plans have been found for scanning single relations,
|
After all feasible plans have been found for scanning single relations,
|
||||||
plans for joining relations are created. The planner/optimizer
|
plans for joining relations are created. The planner/optimizer
|
||||||
preferentially considers joins between any two relations for which there
|
preferentially considers joins between any two relations for which there
|
||||||
exist a corresponding join clause in the WHERE qualification (i.e. for
|
exist a corresponding join clause in the <literal>WHERE</literal> qualification (i.e. for
|
||||||
which a restriction like <literal>where rel1.attr1=rel2.attr2</literal>
|
which a restriction like <literal>where rel1.attr1=rel2.attr2</literal>
|
||||||
exists). Join pairs with no join clause are considered only when there
|
exists). Join pairs with no join clause are considered only when there
|
||||||
is no other choice, that is, a particular relation has no available
|
is no other choice, that is, a particular relation has no available
|
||||||
|
@ -416,17 +427,19 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
The finished plan tree consists of sequential or index scans of the
|
The finished plan tree consists of sequential or index scans of
|
||||||
base relations, plus nestloop, merge, or hash join nodes as needed,
|
the base relations, plus nestloop, merge, or hash join nodes as
|
||||||
plus any auxiliary steps needed, such as sort nodes or aggregate-function
|
needed, plus any auxiliary steps needed, such as sort nodes or
|
||||||
calculation nodes. Most of these plan node types have the additional
|
aggregate-function calculation nodes. Most of these plan node
|
||||||
ability to do <firstterm>selection</> (discarding rows that do
|
types have the additional ability to do <firstterm>selection</>
|
||||||
not meet a specified boolean condition) and <firstterm>projection</>
|
(discarding rows that do not meet a specified boolean condition)
|
||||||
(computation of a derived column set based on given column values,
|
and <firstterm>projection</> (computation of a derived column set
|
||||||
that is, evaluation of scalar expressions where needed). One of
|
based on given column values, that is, evaluation of scalar
|
||||||
the responsibilities of the planner is to attach selection conditions
|
expressions where needed). One of the responsibilities of the
|
||||||
from the WHERE clause and computation of required output expressions
|
planner is to attach selection conditions from the
|
||||||
to the most appropriate nodes of the plan tree.
|
<literal>WHERE</literal> clause and computation of required
|
||||||
|
output expressions to the most appropriate nodes of the plan
|
||||||
|
tree.
|
||||||
</para>
|
</para>
|
||||||
</sect2>
|
</sect2>
|
||||||
</sect1>
|
</sect1>
|
||||||
|
|
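The new planner paragraph in arch-dev.sgml describes examining each possible execution plan and selecting the one expected to run fastest. A minimal Python sketch of that idea follows. It assumes a toy cost model (the sum of estimated intermediate result sizes, with one fixed join selectivity) and considers left-deep join orders only. `plan_cost`, `cheapest_order`, and the row counts are hypothetical illustrations, not PostgreSQL code.

```python
from itertools import permutations


def plan_cost(order, rows, selectivity=0.01):
    """Toy cost of a left-deep join order: sum of intermediate result sizes.

    `rows` maps a relation name to its estimated row count; `selectivity`
    is an assumed fixed fraction of row pairs that survive each join.
    """
    size = rows[order[0]]
    cost = 0.0
    for rel in order[1:]:
        size = size * rows[rel] * selectivity  # estimated rows after this join
        cost += size
    return cost


def cheapest_order(rows):
    """Exhaustively try every join order and return the cheapest one."""
    return min(permutations(rows), key=lambda order: plan_cost(order, rows))


# Hypothetical relation sizes, chosen only for illustration.
rows = {"a": 1000, "b": 10, "c": 100}
best = cheapest_order(rows)
print(best, plan_cost(best, rows))
```

Under this toy model the large relation is joined last, because joining the two small relations first keeps the intermediate result small. The exhaustive loop over `permutations` is also what becomes infeasible as the number of relations grows, which is the situation the added note hands over to the genetic query optimizer.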
|
@ -1,5 +1,5 @@
|
||||||
<!--
|
<!--
|
||||||
$Header: /cvsroot/pgsql/doc/src/sgml/geqo.sgml,v 1.23 2002/01/20 22:19:56 petere Exp $
|
$Header: /cvsroot/pgsql/doc/src/sgml/geqo.sgml,v 1.24 2003/09/29 18:18:35 momjian Exp $
|
||||||
Genetic Optimizer
|
Genetic Optimizer
|
||||||
-->
|
-->
|
||||||
|
|
||||||
|
@ -28,7 +28,7 @@ Genetic Optimizer
|
||||||
<date>1997-10-02</date>
|
<date>1997-10-02</date>
|
||||||
</docinfo>
|
</docinfo>
|
||||||
|
|
||||||
<title>Genetic Query Optimization</title>
|
<title id="geqo-title">Genetic Query Optimizer</title>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
<note>
|
<note>
|
||||||
|
@ -44,24 +44,29 @@ Genetic Optimizer
|
||||||
<title>Query Handling as a Complex Optimization Problem</title>
|
<title>Query Handling as a Complex Optimization Problem</title>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
Among all relational operators the most difficult one to process and
|
Among all relational operators the most difficult one to process
|
||||||
optimize is the <firstterm>join</firstterm>. The number of alternative plans to answer a query
|
and optimize is the <firstterm>join</firstterm>. The number of
|
||||||
grows exponentially with the number of joins included in it. Further
|
alternative plans to answer a query grows exponentially with the
|
||||||
optimization effort is caused by the support of a variety of
|
number of joins included in it. Further optimization effort is
|
||||||
<firstterm>join methods</firstterm>
|
caused by the support of a variety of <firstterm>join
|
||||||
(e.g., nested loop, hash join, merge join in <productname>PostgreSQL</productname>) to
|
methods</firstterm> (e.g., nested loop, hash join, merge join in
|
||||||
process individual joins and a diversity of
|
<productname>PostgreSQL</productname>) to process individual joins
|
||||||
<firstterm>indexes</firstterm> (e.g., R-tree,
|
and a diversity of <firstterm>indexes</firstterm> (e.g., R-tree,
|
||||||
B-tree, hash in <productname>PostgreSQL</productname>) as access paths for relations.
|
B-tree, hash in <productname>PostgreSQL</productname>) as access
|
||||||
|
paths for relations.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
The current <productname>PostgreSQL</productname> optimizer
|
The current <productname>PostgreSQL</productname> optimizer
|
||||||
implementation performs a <firstterm>near-exhaustive search</firstterm>
|
implementation performs a <firstterm>near-exhaustive
|
||||||
over the space of alternative strategies. This query
|
search</firstterm> over the space of alternative strategies. This
|
||||||
optimization technique is inadequate to support database application
|
algorithm, first introduced in the <quote>System R</quote>
|
||||||
domains that involve the need for extensive queries, such as artificial
|
database, produces a near-optimal join order, but can take an
|
||||||
intelligence.
|
enormous amount of time and memory space when the number of joins
|
||||||
|
in the query grows large. This makes the ordinary
|
||||||
|
<productname>PostgreSQL</productname> query optimizer
|
||||||
|
inappropriate for database application domains that involve the
|
||||||
|
need for extensive queries, such as artificial intelligence.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
|
@ -75,12 +80,14 @@ Genetic Optimizer
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
Performance difficulties in exploring the space of possible query
|
Performance difficulties in exploring the space of possible query
|
||||||
plans created the demand for a new optimization technique being developed.
|
plans created the demand for a new optimization technique to be developed.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
In the following we propose the implementation of a <firstterm>Genetic Algorithm</firstterm>
|
In the following we describe the implementation of a
|
||||||
as an option for the database query optimization problem.
|
<firstterm>Genetic Algorithm</firstterm> to solve the join
|
||||||
|
ordering problem in a manner that is efficient for queries
|
||||||
|
involving large numbers of joins.
|
||||||
</para>
|
</para>
|
||||||
</sect1>
|
</sect1>
|
||||||
|
|
||||||
|
@ -208,10 +215,10 @@ Genetic Optimizer
|
||||||
|
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
Usage of <firstterm>edge recombination crossover</firstterm> which is
|
Usage of <firstterm>edge recombination crossover</firstterm>
|
||||||
especially suited
|
which is especially suited to keep edge losses low for the
|
||||||
to keep edge losses low for the solution of the
|
solution of the <acronym>TSP</acronym> by means of a
|
||||||
<acronym>TSP</acronym> by means of a <acronym>GA</acronym>;
|
<acronym>GA</acronym>;
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
|
|
||||||
|
|
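The geqo.sgml text above states that the number of alternative plans grows rapidly with the number of joins. A rough Python sketch of that growth, under simplifying assumptions: it counts join orders only, ignoring the choice of join method and index, and does not exclude orders that would require a cross product. The function names are hypothetical.

```python
from math import factorial


def left_deep_orders(n):
    """Number of left-deep join orders for n relations: n!."""
    return factorial(n)


def bushy_trees(n):
    """Number of ordered bushy join trees for n relations: (2n-2)! / (n-1)!."""
    return factorial(2 * n - 2) // factorial(n - 1)


for n in (2, 4, 8, 12):
    print(n, left_deep_orders(n), bushy_trees(n))
```

Even the left-deep count alone exceeds 479 million orders at 12 relations, which illustrates why a near-exhaustive search becomes impractical for queries with many joins.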
|
@ -1,3 +1,7 @@
|
||||||
|
<!--
|
||||||
|
$Header: /cvsroot/pgsql/doc/src/sgml/gist.sgml,v 1.12 2003/09/29 18:18:35 momjian Exp $
|
||||||
|
-->
|
||||||
|
|
||||||
<Chapter Id="gist">
|
<Chapter Id="gist">
|
||||||
<DocInfo>
|
<DocInfo>
|
||||||
<AuthorGroup>
|
<AuthorGroup>
|
||||||
|
|
|
@ -1,3 +1,7 @@
|
||||||
|
<!--
|
||||||
|
$Header: /cvsroot/pgsql/doc/src/sgml/install-win32.sgml,v 1.12 2003/09/29 18:18:35 momjian Exp $
|
||||||
|
-->
|
||||||
|
|
||||||
<chapter id="install-win32">
|
<chapter id="install-win32">
|
||||||
<title>Installation on <productname>Windows</productname></title>
|
<title>Installation on <productname>Windows</productname></title>
|
||||||
|
|
||||||
|
|
|
@ -1,3 +1,7 @@
|
||||||
|
<!--
|
||||||
|
$Header: /cvsroot/pgsql/doc/src/sgml/Attic/libpgtcl.sgml,v 1.38 2003/09/29 18:18:35 momjian Exp $
|
||||||
|
-->
|
||||||
|
|
||||||
<chapter id="pgtcl">
|
<chapter id="pgtcl">
|
||||||
<title><application>pgtcl</application> - Tcl Binding Library</title>
|
<title><application>pgtcl</application> - Tcl Binding Library</title>
|
||||||
|
|
||||||
|
|
|
@ -1,3 +1,7 @@
|
||||||
|
<!--
|
||||||
|
$Header: /cvsroot/pgsql/doc/src/sgml/Attic/page.sgml,v 1.14 2003/09/29 18:18:35 momjian Exp $
|
||||||
|
-->
|
||||||
|
|
||||||
<chapter id="page">
|
<chapter id="page">
|
||||||
|
|
||||||
<title>Page Files</title>
|
<title>Page Files</title>
|
||||||
|
|