This patch makes a few incremental improvements to geqo.sgml and
arch-dev.sgml

Neil Conway
2003-09-29 18:18:35 +00:00 · a17b53753e
parent 04e401f97f
commit a17b53753e
6 changed files with 101 additions and 65 deletions
--- a/doc/src/sgml/arch-dev.sgml
+++ b/doc/src/sgml/arch-dev.sgml
@@ -1,5 +1,5 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tgl Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.22 2003/09/29 18:18:35 momjian Exp $
 -->
 <chapter id="overview">
@@ -25,7 +25,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
   very extensive. Rather, this chapter is intended to help the reader
   understand the general sequence of operations that occur within the
   backend from the point at which a query is received, to the point
-   when the results are returned to the client.
+   at which the results are returned to the client.
  </para>
  <sect1 id="query-path">
@@ -79,7 +79,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
    <step>
     <para>
      The <firstterm>planner/optimizer</firstterm> takes
-      the (rewritten) querytree and creates a 
+      the (rewritten) query tree and creates a 
      <firstterm>query plan</firstterm> that will be the input to the
      <firstterm>executor</firstterm>.
     </para>
@@ -183,12 +183,12 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
    <title>Parser</title>
    <para>
-     The parser has to check the query string (which arrives as
-     plain ASCII text) for valid syntax. If the syntax is correct a
-     <firstterm>parse tree</firstterm> is built up and handed back otherwise an error is
-     returned. For the implementation the well known Unix
-     tools <application>lex</application> and <application>yacc</application>
-     are used.
+     The parser has to check the query string (which arrives as plain
+     ASCII text) for valid syntax. If the syntax is correct a
+     <firstterm>parse tree</firstterm> is built up and handed back;
+     otherwise an error is returned. The parser and lexer are
+     implemented using the well-known Unix tools <application>yacc</>
+     and <application>lex</>.
    </para>
    <para>
@@ -201,23 +201,22 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
    </para>
    <para>
-     The parser is defined in the file <filename>gram.y</filename> and consists of a
-     set of <firstterm>grammar rules</firstterm> and <firstterm>actions</firstterm>
-     that are executed
-     whenever a rule is fired. The code of the actions (which
-     is actually C-code) is used to build up the parse tree.
+     The parser is defined in the file <filename>gram.y</filename> and
+     consists of a set of <firstterm>grammar rules</firstterm> and
+     <firstterm>actions</firstterm> that are executed whenever a rule
+     is fired. The code of the actions (which is actually C code) is
+     used to build up the parse tree.
    </para>
    <para>
-     The file <filename>scan.l</filename> is transformed to
-     the C-source file <filename>scan.c</filename>
-     using the program <application>lex</application>
-     and <filename>gram.y</filename> is transformed to
-     <filename>gram.c</filename> using <application>yacc</application>.
-     After these transformations have taken
-     place a normal C-compiler can be used to create the
-     parser. Never make any changes to the generated C-files as they will
-     be overwritten the next time <application>lex</application>
+     The file <filename>scan.l</filename> is transformed to the C
+     source file <filename>scan.c</filename> using the program
+     <application>lex</application> and <filename>gram.y</filename> is
+     transformed to <filename>gram.c</filename> using
+     <application>yacc</application>.  After these transformations
+     have taken place a normal C compiler can be used to create the
+     parser. Never make any changes to the generated C files as they
+     will be overwritten the next time <application>lex</application>
     or <application>yacc</application> is called.
     <note>
@@ -334,15 +333,27 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
   <title>Planner/Optimizer</title>
   <para>
-    The task of the <firstterm>planner/optimizer</firstterm> is to create an optimal
-    execution plan. It first considers all possible ways of
-    <firstterm>scanning</firstterm> and <firstterm>joining</firstterm>
-    the relations that appear in a
-    query. All the created paths lead to the same result and it's the
-    task of the optimizer to estimate the cost of executing each path and
-    find out which one is the cheapest.
+    The task of the <firstterm>planner/optimizer</firstterm> is to
+    create an optimal execution plan. A given SQL query (and hence, a
+    query tree) can be actually executed in a wide variety of
+    different ways, each of which will produce the same set of
+    results.  If it is computationally feasible, the query optimizer
+    will examine each of these possible execution plans, ultimately
+    selecting the execution plan that will run the fastest.
   </para>
+   <note>
+    <para>
+     In some situations, examining each possible way in which a query
+     may be executed would take an excessive amount of time and memory
+     space. In particular, this occurs when executing queries
+     involving large numbers of join operations. In order to determine
+     a reasonable (not optimal) query plan in a reasonable amount of
+     time, <productname>PostgreSQL</productname> uses a <xref
+     linkend="geqo" endterm="geqo-title">.
+    </para>
+   </note>
   <para>
    After the cheapest path is determined, a <firstterm>plan tree</>
    is built to pass to the executor.  This represents the desired
@@ -373,7 +384,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
     After all feasible plans have been found for scanning single relations,
     plans for joining relations are created. The planner/optimizer
     preferentially considers joins between any two relations for which there
-     exist a corresponding join clause in the WHERE qualification (i.e. for
+     exist a corresponding join clause in the <literal>WHERE</literal> qualification (i.e. for
     which a restriction like <literal>where rel1.attr1=rel2.attr2</literal>
     exists). Join pairs with no join clause are considered only when there
     is no other choice, that is, a particular relation has no available
@@ -416,17 +427,19 @@ $Header: /cvsroot/pgsql/doc/src/sgml/arch-dev.sgml,v 2.21 2003/06/22 16:16:44 tg
    </para>
    <para>
-     The finished plan tree consists of sequential or index scans of the
-     base relations, plus nestloop, merge, or hash join nodes as needed,
-     plus any auxiliary steps needed, such as sort nodes or aggregate-function
-     calculation nodes.  Most of these plan node types have the additional
-     ability to do <firstterm>selection</> (discarding rows that do
-     not meet a specified boolean condition) and <firstterm>projection</>
-     (computation of a derived column set based on given column values,
-     that is, evaluation of scalar expressions where needed).  One of
-     the responsibilities of the planner is to attach selection conditions
-     from the WHERE clause and computation of required output expressions
-     to the most appropriate nodes of the plan tree.
+     The finished plan tree consists of sequential or index scans of
+     the base relations, plus nestloop, merge, or hash join nodes as
+     needed, plus any auxiliary steps needed, such as sort nodes or
+     aggregate-function calculation nodes.  Most of these plan node
+     types have the additional ability to do <firstterm>selection</>
+     (discarding rows that do not meet a specified boolean condition)
+     and <firstterm>projection</> (computation of a derived column set
+     based on given column values, that is, evaluation of scalar
+     expressions where needed).  One of the responsibilities of the
+     planner is to attach selection conditions from the
+     <literal>WHERE</literal> clause and computation of required
+     output expressions to the most appropriate nodes of the plan
+     tree.
    </para>
   </sect2>
  </sect1>
--- a/doc/src/sgml/geqo.sgml
+++ b/doc/src/sgml/geqo.sgml
@@ -1,5 +1,5 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/geqo.sgml,v 1.23 2002/01/20 22:19:56 petere Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/geqo.sgml,v 1.24 2003/09/29 18:18:35 momjian Exp $
 Genetic Optimizer
 -->
@@ -28,7 +28,7 @@ Genetic Optimizer
   <date>1997-10-02</date>
  </docinfo>
-  <title>Genetic Query Optimization</title>
+  <title id="geqo-title">Genetic Query Optimizer</title>
  <para>
   <note>
@@ -44,24 +44,29 @@ Genetic Optimizer
   <title>Query Handling as a Complex Optimization Problem</title>
   <para>
-    Among all relational operators the most difficult one to process and
-    optimize is the <firstterm>join</firstterm>. The number of alternative plans to answer a query
-    grows exponentially with the number of joins included in it. Further
-    optimization effort is caused by the support of a variety of
-    <firstterm>join methods</firstterm>
-    (e.g., nested loop, hash join, merge join in <productname>PostgreSQL</productname>) to
-    process individual joins and a diversity of
-    <firstterm>indexes</firstterm> (e.g., R-tree,
-    B-tree, hash in <productname>PostgreSQL</productname>) as access paths for relations.
+    Among all relational operators the most difficult one to process
+    and optimize is the <firstterm>join</firstterm>. The number of
+    alternative plans to answer a query grows exponentially with the
+    number of joins included in it. Further optimization effort is
+    caused by the support of a variety of <firstterm>join
+    methods</firstterm> (e.g., nested loop, hash join, merge join in
+    <productname>PostgreSQL</productname>) to process individual joins
+    and a diversity of <firstterm>indexes</firstterm> (e.g., R-tree,
+    B-tree, hash in <productname>PostgreSQL</productname>) as access
+    paths for relations.
   </para>
   <para>
    The current <productname>PostgreSQL</productname> optimizer
-    implementation performs a <firstterm>near-exhaustive search</firstterm>
-    over the space of alternative strategies. This query 
-    optimization technique is inadequate to support database application
-    domains that involve the need for extensive queries, such as artificial
-    intelligence.
+    implementation performs a <firstterm>near-exhaustive
+    search</firstterm> over the space of alternative strategies. This
+    algorithm, first introduced in the <quote>System R</quote>
+    database, produces a near-optimal join order, but can take an
+    enormous amount of time and memory space when the number of joins
+    in the query grows large. This makes the ordinary
+    <productname>PostgreSQL</productname> query optimizer
+    inappropriate for database application domains that involve the
+    need for extensive queries, such as artificial intelligence.
   </para>
   <para>
@@ -75,12 +80,14 @@ Genetic Optimizer
   <para>
    Performance difficulties in exploring the space of possible query
-    plans created the demand for a new optimization technique being developed.
+    plans created the demand for a new optimization technique to be developed.
   </para>
   <para>
-    In the following we propose the implementation of a <firstterm>Genetic Algorithm</firstterm>
-    as an option for the database query optimization problem.
+    In the following we describe the implementation of a
+    <firstterm>Genetic Algorithm</firstterm> to solve the join
+    ordering problem in a manner that is efficient for queries
+    involving large numbers of joins.
   </para>
  </sect1>
@@ -208,10 +215,10 @@ Genetic Optimizer
     <listitem>
      <para>
-       Usage of <firstterm>edge recombination crossover</firstterm> which is
-       especially suited
-       to keep edge losses low for the solution of the
-       <acronym>TSP</acronym> by means of a <acronym>GA</acronym>;
+       Usage of <firstterm>edge recombination crossover</firstterm>
+       which is especially suited to keep edge losses low for the
+       solution of the <acronym>TSP</acronym> by means of a
+       <acronym>GA</acronym>;
      </para>
     </listitem>
--- a/doc/src/sgml/gist.sgml
+++ b/doc/src/sgml/gist.sgml
@@ -1,3 +1,7 @@
+<!--
+$Header: /cvsroot/pgsql/doc/src/sgml/gist.sgml,v 1.12 2003/09/29 18:18:35 momjian Exp $
+-->
 <Chapter Id="gist">
 <DocInfo>
 <AuthorGroup>
--- a/doc/src/sgml/install-win32.sgml
+++ b/doc/src/sgml/install-win32.sgml
@@ -1,3 +1,7 @@
+<!--
+$Header: /cvsroot/pgsql/doc/src/sgml/install-win32.sgml,v 1.12 2003/09/29 18:18:35 momjian Exp $
+-->
 <chapter id="install-win32">
 <title>Installation on <productname>Windows</productname></title>
--- a/doc/src/sgml/libpgtcl.sgml
+++ b/doc/src/sgml/libpgtcl.sgml
@@ -1,3 +1,7 @@
+<!--
+$Header: /cvsroot/pgsql/doc/src/sgml/Attic/libpgtcl.sgml,v 1.38 2003/09/29 18:18:35 momjian Exp $
+-->
 <chapter id="pgtcl">
 <title><application>pgtcl</application> - Tcl Binding Library</title>
--- a/doc/src/sgml/page.sgml
+++ b/doc/src/sgml/page.sgml
@@ -1,3 +1,7 @@
+<!--
+$Header: /cvsroot/pgsql/doc/src/sgml/Attic/page.sgml,v 1.14 2003/09/29 18:18:35 momjian Exp $
+-->
 <chapter id="page">
 <title>Page Files</title>