2001-05-07 02:43:27 +02:00
|
|
|
<!--
|
2001-11-18 23:17:30 +01:00
|
|
|
$Header: /cvsroot/pgsql/doc/src/sgml/ref/analyze.sgml,v 1.4 2001/11/18 22:17:30 tgl Exp $
|
2001-05-07 02:43:27 +02:00
|
|
|
Postgres documentation
|
|
|
|
-->
|
|
|
|
|
|
|
|
<refentry id="SQL-ANALYZE">
|
|
|
|
<refmeta>
|
|
|
|
<refentrytitle id="sql-analyze-title">
|
|
|
|
ANALYZE
|
|
|
|
</refentrytitle>
|
|
|
|
<refmiscinfo>SQL - Language Statements</refmiscinfo>
|
|
|
|
</refmeta>
|
|
|
|
<refnamediv>
|
|
|
|
<refname>
|
|
|
|
ANALYZE
|
|
|
|
</refname>
|
|
|
|
<refpurpose>
|
2001-09-03 14:57:50 +02:00
|
|
|
collect statistics about a database
|
2001-05-07 02:43:27 +02:00
|
|
|
</refpurpose>
|
|
|
|
</refnamediv>
|
|
|
|
<refsynopsisdiv>
|
|
|
|
<refsynopsisdivinfo>
|
|
|
|
<date>2001-05-04</date>
|
|
|
|
</refsynopsisdivinfo>
|
|
|
|
<synopsis>
|
|
|
|
ANALYZE [ VERBOSE ] [ <replaceable class="PARAMETER">table</replaceable> [ (<replaceable class="PARAMETER">column</replaceable> [, ...] ) ] ]
|
|
|
|
</synopsis>
|
|
|
|
|
|
|
|
<refsect2 id="R2-SQL-ANALYZE-1">
|
|
|
|
<refsect2info>
|
|
|
|
<date>2001-05-04</date>
|
|
|
|
</refsect2info>
|
|
|
|
<title>
|
|
|
|
Inputs
|
|
|
|
</title>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry>
|
|
|
|
<term>VERBOSE</term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
Enables display of progress messages.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
|
|
<term><replaceable class="PARAMETER">table</replaceable></term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
The name of a specific table to analyze. Defaults to all tables.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
|
|
<term><replaceable class="PARAMETER">column</replaceable></term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
The name of a specific column to analyze. Defaults to all columns.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
|
|
</para>
|
|
|
|
</refsect2>
|
|
|
|
|
|
|
|
<refsect2 id="R2-SQL-ANALYZE-2">
|
|
|
|
<refsect2info>
|
|
|
|
<date>2001-05-04</date>
|
|
|
|
</refsect2info>
|
|
|
|
<title>
|
|
|
|
Outputs
|
|
|
|
</title>
|
|
|
|
<para>
|
|
|
|
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry>
|
|
|
|
<term><computeroutput>
|
|
|
|
<returnvalue>ANALYZE</returnvalue>
|
|
|
|
</computeroutput></term>
|
|
|
|
<listitem>
|
|
|
|
<para>
|
|
|
|
The command is complete.
|
|
|
|
</para>
|
|
|
|
</listitem>
|
|
|
|
</varlistentry>
|
|
|
|
|
|
|
|
</variablelist>
|
|
|
|
</para>
|
|
|
|
</refsect2>
|
|
|
|
</refsynopsisdiv>
|
|
|
|
|
|
|
|
<refsect1 id="R1-SQL-ANALYZE-1">
|
|
|
|
<refsect1info>
|
|
|
|
<date>2001-05-04</date>
|
|
|
|
</refsect1info>
|
|
|
|
<title>
|
|
|
|
Description
|
|
|
|
</title>
|
|
|
|
<para>
|
|
|
|
<command>ANALYZE</command> collects statistics about the contents of
|
|
|
|
<productname>Postgres</productname> tables, and stores the results in
|
|
|
|
the system table <literal>pg_statistic</literal>. Subsequently,
|
|
|
|
the query planner uses the statistics to help determine the most efficient
|
|
|
|
execution plans for queries.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
With no parameter, <command>ANALYZE</command> examines every table in the
|
|
|
|
current database. With a parameter, <command>ANALYZE</command> examines
|
|
|
|
only that table. It is further possible to give a list of column names,
|
|
|
|
in which case only the statistics for those columns are updated.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<refsect2 id="R2-SQL-ANALYZE-3">
|
|
|
|
<refsect2info>
|
|
|
|
<date>2001-05-04</date>
|
|
|
|
</refsect2info>
|
|
|
|
<title>
|
|
|
|
Notes
|
|
|
|
</title>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
It is a good idea to run <command>ANALYZE</command> periodically, or
|
|
|
|
just after making major changes in the contents of a table. Accurate
|
|
|
|
statistics will help the planner to choose the most appropriate query
|
|
|
|
plan, and thereby improve the speed of query processing. A common
|
2001-11-18 23:17:30 +01:00
|
|
|
strategy is to run <xref linkend="sql-vacuum" endterm="sql-vacuum-title">
|
|
|
|
and <command>ANALYZE</command> once a day during a low-usage time of day.
|
2001-05-07 02:43:27 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
2001-11-18 23:17:30 +01:00
|
|
|
Unlike <command>VACUUM FULL</command>,
|
2001-05-07 02:43:27 +02:00
|
|
|
<command>ANALYZE</command> requires
|
|
|
|
only a read lock on the target table, so it can run in parallel with
|
|
|
|
other activity on the table.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
For large tables, <command>ANALYZE</command> takes a random sample of the
|
|
|
|
table contents, rather than examining every row. This allows even very
|
|
|
|
large tables to be analyzed in a small amount of time. Note however
|
|
|
|
that the statistics are only approximate, and will change slightly each
|
|
|
|
time <command>ANALYZE</command> is run, even if the actual table contents
|
|
|
|
did not change. This may result in small changes in the planner's
|
|
|
|
estimated costs shown by <command>EXPLAIN</command>.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The collected statistics usually include a list of some of the most common
|
|
|
|
values in each column and a histogram showing the approximate data
|
|
|
|
distribution in each column. One or both of these may be omitted if
|
|
|
|
<command>ANALYZE</command> deems them uninteresting (for example, in
|
|
|
|
a unique-key column, there are no common values) or if the column
|
2001-10-16 03:13:44 +02:00
|
|
|
datatype does not support the appropriate operators. There is more
|
|
|
|
information about the statistics in the <citetitle>User's
|
|
|
|
Guide</citetitle>.
|
2001-05-07 02:43:27 +02:00
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The extent of analysis can be controlled by adjusting the per-column
|
|
|
|
statistics target with <command>ALTER TABLE ALTER COLUMN SET
|
|
|
|
STATISTICS</command> (see
|
|
|
|
<xref linkend="sql-altertable" endterm="sql-altertable-title">). The
|
|
|
|
target value sets the maximum number of entries in the most-common-value
|
|
|
|
list and the maximum number of bins in the histogram. The default
|
|
|
|
target value is 10, but this can be adjusted up or down to trade off
|
|
|
|
accuracy of planner estimates against the time taken for
|
|
|
|
<command>ANALYZE</command> and the
|
|
|
|
amount of space occupied in <literal>pg_statistic</literal>.
|
|
|
|
In particular, setting the statistics target to zero disables collection of
|
|
|
|
statistics for that column. It may be useful to do that for columns that
|
|
|
|
are never used as part of the WHERE, GROUP BY, or ORDER BY clauses of
|
|
|
|
queries, since the planner will have no use for statistics on such columns.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The largest statistics target among the columns being analyzed determines
|
|
|
|
the number of table rows sampled to prepare the statistics. Increasing
|
|
|
|
the target causes a proportional increase in the time and space needed
|
|
|
|
to do <command>ANALYZE</command>.
|
|
|
|
</para>
|
|
|
|
|
|
|
|
</refsect2>
|
|
|
|
</refsect1>
|
|
|
|
|
|
|
|
<refsect1 id="R1-SQL-ANALYZE-3">
|
|
|
|
<title>
|
|
|
|
Compatibility
|
|
|
|
</title>
|
|
|
|
|
|
|
|
<refsect2 id="R2-SQL-ANALYZE-4">
|
|
|
|
<refsect2info>
|
|
|
|
<date>2001-05-04</date>
|
|
|
|
</refsect2info>
|
|
|
|
<title>
|
|
|
|
SQL92
|
|
|
|
</title>
|
|
|
|
<para>
|
|
|
|
There is no <command>ANALYZE</command> statement in <acronym>SQL92</acronym>.
|
|
|
|
</para>
|
|
|
|
</refsect2>
|
|
|
|
</refsect1>
|
|
|
|
</refentry>
|
|
|
|
|
|
|
|
<!-- Keep this comment at the end of the file
|
|
|
|
Local variables:
|
|
|
|
mode: sgml
|
|
|
|
sgml-omittag:nil
|
|
|
|
sgml-shorttag:t
|
|
|
|
sgml-minimize-attributes:nil
|
|
|
|
sgml-always-quote-attributes:t
|
|
|
|
sgml-indent-step:1
|
|
|
|
sgml-indent-data:t
|
|
|
|
sgml-parent-document:nil
|
|
|
|
sgml-default-dtd-file:"../reference.ced"
|
|
|
|
sgml-exposed-tags:nil
|
|
|
|
sgml-local-catalogs:"/usr/lib/sgml/catalog"
|
|
|
|
sgml-local-ecat-files:nil
|
|
|
|
End:
|
|
|
|
-->
|