diff --git a/doc/src/sgml/parallel.sgml b/doc/src/sgml/parallel.sgml
index 479e24a1dc..13479d7e5e 100644
--- a/doc/src/sgml/parallel.sgml
+++ b/doc/src/sgml/parallel.sgml
@@ -8,11 +8,11 @@
- PostgreSQL can devise query plans which can leverage
+ PostgreSQL can devise query plans that can leverage
  multiple CPUs in order to answer queries faster. This feature is known as parallel query. Many queries cannot benefit from parallel query, either due to limitations of the current implementation or because there is no
- imaginable query plan which is any faster than the serial query plan.
+ imaginable query plan that is any faster than the serial query plan.
  However, for queries that can benefit, the speedup from parallel query is often very significant. Many queries can run more than twice as fast when using parallel query, and some queries can run four times faster or
@@ -27,7 +27,7 @@
  When the optimizer determines that parallel query is the fastest execution
- strategy for a particular query, it will create a query plan which includes
+ strategy for a particular query, it will create a query plan that includes
  a Gather or Gather Merge node. Here is a simple example:
@@ -59,7 +59,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  Using EXPLAIN, you can see the number of workers chosen by the planner. When the Gather node is reached
- during query execution, the process which is implementing the user's
+ during query execution, the process that is implementing the user's
  session will request a number of background worker processes equal to the number of workers chosen by the planner. The number of background workers that
@@ -79,7 +79,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
- Every background worker process which is successfully started for a given
+ Every background worker process that is successfully started for a given
  parallel query will execute the parallel portion of the plan.
  The leader will also execute that portion of the plan, but it has an additional responsibility: it must also read all of the tuples generated by the
@@ -88,7 +88,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  worker, speeding up query execution. Conversely, when the parallel portion of the plan generates a large number of tuples, the leader may be almost entirely occupied with reading the tuples generated by the workers and
- performing any further processing steps which are required by plan nodes
+ performing any further processing steps that are required by plan nodes
  above the level of the Gather node or Gather Merge node. In such cases, the leader will do very little of the work of executing the parallel portion of the plan.
@@ -109,7 +109,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  When Can Parallel Query Be Used?
- There are several settings which can cause the query planner not to
+ There are several settings that can cause the query planner not to
  generate a parallel query plan under any circumstances. In order for any parallel query plans whatsoever to be generated, the following settings must be configured as indicated.
@@ -119,7 +119,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  must be set to a
- value which is greater than zero. This is a special case of the more
+ value that is greater than zero. This is a special case of the more
  general principle that no more workers should be used than the number configured via max_parallel_workers_per_gather.
@@ -144,8 +144,8 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  The query writes any data or locks any database rows. If a query contains a data-modifying operation either at the top level or within a CTE, no parallel plans for that query will be generated.
  As an
- exception, the following commands which create a new table and populate
- it can use a parallel plan for the underlying SELECT
+ exception, the following commands, which create a new table and populate
+ it, can use a parallel plan for the underlying SELECT
  part of the query:
@@ -255,7 +255,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  than normal but would produce incorrect results. Instead, the parallel portion of the plan must be what is known internally to the query optimizer as a partial plan; that is, it must be constructed
- so that each process which executes the plan will generate only a
+ so that each process that executes the plan will generate only a
  subset of the output rows in such a way that each required output row is guaranteed to be generated by exactly one of the cooperating processes. Generally, this means that the scan on the driving table of the query
@@ -365,11 +365,11 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  Because the Finalize Aggregate node runs on the leader
- process, queries which produce a relatively large number of groups in
+ process, queries that produce a relatively large number of groups in
  comparison to the number of input rows will appear less favorable to the query planner. For example, in the worst-case scenario the number of groups seen by the Finalize Aggregate node could be as many as
- the number of input rows which were seen by all worker processes in the
+ the number of input rows that were seen by all worker processes in the
  Partial Aggregate stage. For such cases, there is clearly going to be no performance benefit to using parallel aggregation. The query planner takes this into account during the planning process and is
@@ -425,7 +425,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  involve appending multiple results sets can therefore achieve coarse-grained parallelism even when efficient partial plans are not available.
  For example, consider a query against a partitioned table
- which can only be implemented efficiently by using an index that does
+ that can only be implemented efficiently by using an index that does
  not support parallel scans. The planner might choose a Parallel Append of regular Index Scan plans; each individual index scan would have to be executed to completion by a single
@@ -446,7 +446,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  If a query that is expected to do so does not produce a parallel plan, you can try reducing or . Of course, this plan may turn
- out to be slower than the serial plan which the planner preferred, but
+ out to be slower than the serial plan that the planner preferred, but
  this will not always be the case. If you don't get a parallel plan even with very small values of these settings (e.g., after setting them both to zero), there may be some reason why the query planner is
@@ -473,15 +473,15 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  The planner classifies operations involved in a query as either parallel safe, parallel restricted,
- or parallel unsafe. A parallel safe operation is one which
+ or parallel unsafe. A parallel safe operation is one that
  does not conflict with the use of parallel query. A parallel restricted
- operation is one which cannot be performed in a parallel worker, but which
+ operation is one that cannot be performed in a parallel worker, but that
  can be performed in the leader while parallel query is in use. Therefore, parallel restricted operations can never occur below a Gather
- or Gather Merge node, but can occur elsewhere in a plan which
- contains such a node. A parallel unsafe operation is one which cannot
+ or Gather Merge node, but can occur elsewhere in a plan that
+ contains such a node. A parallel unsafe operation is one that cannot
  be performed while parallel query is in use, not even in the leader.
- When a query contains anything which is parallel unsafe, parallel query
+ When a query contains anything that is parallel unsafe, parallel query
  is completely disabled for that query.
@@ -505,7 +505,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  Scans of foreign tables, unless the foreign data wrapper has
- an IsForeignScanParallelSafe API which indicates otherwise.
+ an IsForeignScanParallelSafe API that indicates otherwise.
@@ -517,7 +517,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
- Plan nodes which reference a correlated SubPlan.
+ Plan nodes that reference a correlated SubPlan.
@@ -528,7 +528,7 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  The planner cannot automatically determine whether a user-defined function or aggregate is parallel safe, parallel restricted, or parallel
- unsafe, because this would require predicting every operation which the
+ unsafe, because this would require predicting every operation that the
  function could possibly perform. In general, this is equivalent to the Halting Problem and therefore impossible. Even for simple functions where it could conceivably be done, we do not try, since this would be expensive
@@ -546,11 +546,11 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
  Functions and aggregates must be marked PARALLEL UNSAFE if they write to the database, access sequences, change the transaction state
- even temporarily (e.g., a PL/pgSQL function which establishes an
+ even temporarily (e.g., a PL/pgSQL function that establishes an
  EXCEPTION block to catch errors), or make persistent changes to settings. Similarly, functions must be marked PARALLEL RESTRICTED if they access temporary tables, client connection state,
- cursors, prepared statements, or miscellaneous backend-local state which
+ cursors, prepared statements, or miscellaneous backend-local state that
  the system cannot synchronize across workers.
  For example, setseed and random are parallel restricted for this last reason.
@@ -568,10 +568,10 @@ EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
- If a function executed within a parallel worker acquires locks which are
+ If a function executed within a parallel worker acquires locks that are
  not held by the leader, for example by querying a table not referenced in the query, those locks will be released at worker exit, not end of
- transaction. If you write a function which does this, and this behavior
+ transaction. If you write a function that does this, and this behavior
  difference is important to you, mark such functions as PARALLEL RESTRICTED to ensure that they execute only in the leader.