postgresql/src/test/regress/expected/explain.out

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

642 lines
28 KiB
Plaintext
Raw Normal View History

Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
--
-- EXPLAIN
--
-- There are many test cases elsewhere that use EXPLAIN as a vehicle for
-- checking something else (usually planner behavior). This file is
-- concerned with testing EXPLAIN in its own right.
--
-- To produce stable regression test output, it's usually necessary to
-- ignore details such as exact costs or row counts. These filter
-- functions replace changeable output details with fixed strings.
create function explain_filter(text) returns setof text
language plpgsql as
$$
declare
ln text;
begin
for ln in execute $1
loop
-- Replace any numeric word with just 'N'
ln := regexp_replace(ln, '-?\m\d+\M', 'N', 'g');
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
-- In sort output, the above won't match units-suffixed numbers
ln := regexp_replace(ln, '\m\d+kB', 'NkB', 'g');
-- Ignore text-mode buffers output because it varies depending
-- on the system state
CONTINUE WHEN (ln ~ ' +Buffers: .*');
-- Ignore text-mode "Planning:" line because whether it's output
-- varies depending on the system state
CONTINUE WHEN (ln = 'Planning:');
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
return next ln;
end loop;
end;
$$;
-- To produce valid JSON output, replace numbers with "0" or "0.0" not "N"
create function explain_filter_to_json(text) returns jsonb
language plpgsql as
$$
declare
data text := '';
ln text;
begin
for ln in execute $1
loop
-- Replace any numeric word with just '0'
ln := regexp_replace(ln, '\m\d+\M', '0', 'g');
data := data || ln;
end loop;
return data::jsonb;
end;
$$;
-- Disable JIT, or we'll get different output on machines where that's been
-- forced on
set jit = off;
-- Similarly, disable track_io_timing, to avoid output differences when
-- enabled.
set track_io_timing = off;
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
-- Simple cases
select explain_filter('explain select * from int8_tbl i8');
explain_filter
---------------------------------------------------------
Seq Scan on int8_tbl i8 (cost=N.N..N.N rows=N width=N)
(1 row)
select explain_filter('explain (analyze) select * from int8_tbl i8');
explain_filter
-----------------------------------------------------------------------------------------------
Seq Scan on int8_tbl i8 (cost=N.N..N.N rows=N width=N) (actual time=N.N..N.N rows=N loops=N)
Planning Time: N.N ms
Execution Time: N.N ms
(3 rows)
select explain_filter('explain (analyze, verbose) select * from int8_tbl i8');
explain_filter
------------------------------------------------------------------------------------------------------
Seq Scan on public.int8_tbl i8 (cost=N.N..N.N rows=N width=N) (actual time=N.N..N.N rows=N loops=N)
Output: q1, q2
Planning Time: N.N ms
Execution Time: N.N ms
(4 rows)
select explain_filter('explain (analyze, buffers, format text) select * from int8_tbl i8');
explain_filter
-----------------------------------------------------------------------------------------------
Seq Scan on int8_tbl i8 (cost=N.N..N.N rows=N width=N) (actual time=N.N..N.N rows=N loops=N)
Planning Time: N.N ms
Execution Time: N.N ms
(3 rows)
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
select explain_filter('explain (analyze, buffers, format xml) select * from int8_tbl i8');
explain_filter
--------------------------------------------------------
<explain xmlns="http://www.postgresql.org/N/explain"> +
<Query> +
<Plan> +
<Node-Type>Seq Scan</Node-Type> +
<Parallel-Aware>false</Parallel-Aware> +
Add support for asynchronous execution. This implements asynchronous execution, which runs multiple parts of a non-parallel-aware Append concurrently rather than serially to improve performance when possible. Currently, the only node type that can be run concurrently is a ForeignScan that is an immediate child of such an Append. In the case where such ForeignScans access data on different remote servers, this would run those ForeignScans concurrently, and overlap the remote operations to be performed simultaneously, so it'll improve the performance especially when the operations involve time-consuming ones such as remote join and remote aggregation. We may extend this to other node types such as joins or aggregates over ForeignScans in the future. This also adds the support for postgres_fdw, which is enabled by the table-level/server-level option "async_capable". The default is false. Robert Haas, Kyotaro Horiguchi, Thomas Munro, and myself. This commit is mostly based on the patch proposed by Robert Haas, but also uses stuff from the patch proposed by Kyotaro Horiguchi and from the patch proposed by Thomas Munro. Reviewed by Kyotaro Horiguchi, Konstantin Knizhnik, Andrey Lepikhov, Movead Li, Thomas Munro, Justin Pryzby, and others. Discussion: https://postgr.es/m/CA%2BTgmoaXQEt4tZ03FtQhnzeDEMzBck%2BLrni0UWHVVgOTnA6C1w%40mail.gmail.com Discussion: https://postgr.es/m/CA%2BhUKGLBRyu0rHrDCMC4%3DRn3252gogyp1SjOgG8SEKKZv%3DFwfQ%40mail.gmail.com Discussion: https://postgr.es/m/20200228.170650.667613673625155850.horikyota.ntt%40gmail.com
2021-03-31 11:45:00 +02:00
<Async-Capable>false</Async-Capable> +
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
<Relation-Name>int8_tbl</Relation-Name> +
<Alias>i8</Alias> +
<Startup-Cost>N.N</Startup-Cost> +
<Total-Cost>N.N</Total-Cost> +
<Plan-Rows>N</Plan-Rows> +
<Plan-Width>N</Plan-Width> +
<Actual-Startup-Time>N.N</Actual-Startup-Time> +
<Actual-Total-Time>N.N</Actual-Total-Time> +
<Actual-Rows>N</Actual-Rows> +
<Actual-Loops>N</Actual-Loops> +
<Shared-Hit-Blocks>N</Shared-Hit-Blocks> +
<Shared-Read-Blocks>N</Shared-Read-Blocks> +
<Shared-Dirtied-Blocks>N</Shared-Dirtied-Blocks>+
<Shared-Written-Blocks>N</Shared-Written-Blocks>+
<Local-Hit-Blocks>N</Local-Hit-Blocks> +
<Local-Read-Blocks>N</Local-Read-Blocks> +
<Local-Dirtied-Blocks>N</Local-Dirtied-Blocks> +
<Local-Written-Blocks>N</Local-Written-Blocks> +
<Temp-Read-Blocks>N</Temp-Read-Blocks> +
<Temp-Written-Blocks>N</Temp-Written-Blocks> +
</Plan> +
<Planning> +
<Shared-Hit-Blocks>N</Shared-Hit-Blocks> +
<Shared-Read-Blocks>N</Shared-Read-Blocks> +
<Shared-Dirtied-Blocks>N</Shared-Dirtied-Blocks>+
<Shared-Written-Blocks>N</Shared-Written-Blocks>+
<Local-Hit-Blocks>N</Local-Hit-Blocks> +
<Local-Read-Blocks>N</Local-Read-Blocks> +
<Local-Dirtied-Blocks>N</Local-Dirtied-Blocks> +
<Local-Written-Blocks>N</Local-Written-Blocks> +
<Temp-Read-Blocks>N</Temp-Read-Blocks> +
<Temp-Written-Blocks>N</Temp-Written-Blocks> +
</Planning> +
<Planning-Time>N.N</Planning-Time> +
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
<Triggers> +
</Triggers> +
<Execution-Time>N.N</Execution-Time> +
</Query> +
</explain>
(1 row)
select explain_filter('explain (analyze, buffers, format yaml) select * from int8_tbl i8');
explain_filter
-------------------------------
- Plan: +
Node Type: "Seq Scan" +
Parallel Aware: false +
Add support for asynchronous execution. This implements asynchronous execution, which runs multiple parts of a non-parallel-aware Append concurrently rather than serially to improve performance when possible. Currently, the only node type that can be run concurrently is a ForeignScan that is an immediate child of such an Append. In the case where such ForeignScans access data on different remote servers, this would run those ForeignScans concurrently, and overlap the remote operations to be performed simultaneously, so it'll improve the performance especially when the operations involve time-consuming ones such as remote join and remote aggregation. We may extend this to other node types such as joins or aggregates over ForeignScans in the future. This also adds the support for postgres_fdw, which is enabled by the table-level/server-level option "async_capable". The default is false. Robert Haas, Kyotaro Horiguchi, Thomas Munro, and myself. This commit is mostly based on the patch proposed by Robert Haas, but also uses stuff from the patch proposed by Kyotaro Horiguchi and from the patch proposed by Thomas Munro. Reviewed by Kyotaro Horiguchi, Konstantin Knizhnik, Andrey Lepikhov, Movead Li, Thomas Munro, Justin Pryzby, and others. Discussion: https://postgr.es/m/CA%2BTgmoaXQEt4tZ03FtQhnzeDEMzBck%2BLrni0UWHVVgOTnA6C1w%40mail.gmail.com Discussion: https://postgr.es/m/CA%2BhUKGLBRyu0rHrDCMC4%3DRn3252gogyp1SjOgG8SEKKZv%3DFwfQ%40mail.gmail.com Discussion: https://postgr.es/m/20200228.170650.667613673625155850.horikyota.ntt%40gmail.com
2021-03-31 11:45:00 +02:00
Async Capable: false +
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
Relation Name: "int8_tbl"+
Alias: "i8" +
Startup Cost: N.N +
Total Cost: N.N +
Plan Rows: N +
Plan Width: N +
Actual Startup Time: N.N +
Actual Total Time: N.N +
Actual Rows: N +
Actual Loops: N +
Shared Hit Blocks: N +
Shared Read Blocks: N +
Shared Dirtied Blocks: N +
Shared Written Blocks: N +
Local Hit Blocks: N +
Local Read Blocks: N +
Local Dirtied Blocks: N +
Local Written Blocks: N +
Temp Read Blocks: N +
Temp Written Blocks: N +
Planning: +
Shared Hit Blocks: N +
Shared Read Blocks: N +
Shared Dirtied Blocks: N +
Shared Written Blocks: N +
Local Hit Blocks: N +
Local Read Blocks: N +
Local Dirtied Blocks: N +
Local Written Blocks: N +
Temp Read Blocks: N +
Temp Written Blocks: N +
Planning Time: N.N +
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
Triggers: +
Execution Time: N.N
(1 row)
select explain_filter('explain (buffers, format text) select * from int8_tbl i8');
explain_filter
---------------------------------------------------------
Seq Scan on int8_tbl i8 (cost=N.N..N.N rows=N width=N)
(1 row)
select explain_filter('explain (buffers, format json) select * from int8_tbl i8');
explain_filter
------------------------------------
[ +
{ +
"Plan": { +
"Node Type": "Seq Scan", +
"Parallel Aware": false, +
Add support for asynchronous execution. This implements asynchronous execution, which runs multiple parts of a non-parallel-aware Append concurrently rather than serially to improve performance when possible. Currently, the only node type that can be run concurrently is a ForeignScan that is an immediate child of such an Append. In the case where such ForeignScans access data on different remote servers, this would run those ForeignScans concurrently, and overlap the remote operations to be performed simultaneously, so it'll improve the performance especially when the operations involve time-consuming ones such as remote join and remote aggregation. We may extend this to other node types such as joins or aggregates over ForeignScans in the future. This also adds the support for postgres_fdw, which is enabled by the table-level/server-level option "async_capable". The default is false. Robert Haas, Kyotaro Horiguchi, Thomas Munro, and myself. This commit is mostly based on the patch proposed by Robert Haas, but also uses stuff from the patch proposed by Kyotaro Horiguchi and from the patch proposed by Thomas Munro. Reviewed by Kyotaro Horiguchi, Konstantin Knizhnik, Andrey Lepikhov, Movead Li, Thomas Munro, Justin Pryzby, and others. Discussion: https://postgr.es/m/CA%2BTgmoaXQEt4tZ03FtQhnzeDEMzBck%2BLrni0UWHVVgOTnA6C1w%40mail.gmail.com Discussion: https://postgr.es/m/CA%2BhUKGLBRyu0rHrDCMC4%3DRn3252gogyp1SjOgG8SEKKZv%3DFwfQ%40mail.gmail.com Discussion: https://postgr.es/m/20200228.170650.667613673625155850.horikyota.ntt%40gmail.com
2021-03-31 11:45:00 +02:00
"Async Capable": false, +
"Relation Name": "int8_tbl",+
"Alias": "i8", +
"Startup Cost": N.N, +
"Total Cost": N.N, +
"Plan Rows": N, +
"Plan Width": N, +
"Shared Hit Blocks": N, +
"Shared Read Blocks": N, +
"Shared Dirtied Blocks": N, +
"Shared Written Blocks": N, +
"Local Hit Blocks": N, +
"Local Read Blocks": N, +
"Local Dirtied Blocks": N, +
"Local Written Blocks": N, +
"Temp Read Blocks": N, +
"Temp Written Blocks": N +
}, +
"Planning": { +
"Shared Hit Blocks": N, +
"Shared Read Blocks": N, +
"Shared Dirtied Blocks": N, +
"Shared Written Blocks": N, +
"Local Hit Blocks": N, +
"Local Read Blocks": N, +
"Local Dirtied Blocks": N, +
"Local Written Blocks": N, +
"Temp Read Blocks": N, +
"Temp Written Blocks": N +
} +
} +
]
(1 row)
-- Check output including I/O timings. These fields are conditional
-- but always set in JSON format, so check them only in this case.
set track_io_timing = on;
select explain_filter('explain (analyze, buffers, format json) select * from int8_tbl i8');
explain_filter
-------------------------------------
[ +
{ +
"Plan": { +
"Node Type": "Seq Scan", +
"Parallel Aware": false, +
"Async Capable": false, +
"Relation Name": "int8_tbl", +
"Alias": "i8", +
"Startup Cost": N.N, +
"Total Cost": N.N, +
"Plan Rows": N, +
"Plan Width": N, +
"Actual Startup Time": N.N, +
"Actual Total Time": N.N, +
"Actual Rows": N, +
"Actual Loops": N, +
"Shared Hit Blocks": N, +
"Shared Read Blocks": N, +
"Shared Dirtied Blocks": N, +
"Shared Written Blocks": N, +
"Local Hit Blocks": N, +
"Local Read Blocks": N, +
"Local Dirtied Blocks": N, +
"Local Written Blocks": N, +
"Temp Read Blocks": N, +
"Temp Written Blocks": N, +
"Shared I/O Read Time": N.N, +
"Shared I/O Write Time": N.N,+
"Local I/O Read Time": N.N, +
"Local I/O Write Time": N.N, +
"Temp I/O Read Time": N.N, +
"Temp I/O Write Time": N.N +
}, +
"Planning": { +
"Shared Hit Blocks": N, +
"Shared Read Blocks": N, +
"Shared Dirtied Blocks": N, +
"Shared Written Blocks": N, +
"Local Hit Blocks": N, +
"Local Read Blocks": N, +
"Local Dirtied Blocks": N, +
"Local Written Blocks": N, +
"Temp Read Blocks": N, +
"Temp Written Blocks": N, +
"Shared I/O Read Time": N.N, +
"Shared I/O Write Time": N.N,+
"Local I/O Read Time": N.N, +
"Local I/O Write Time": N.N, +
"Temp I/O Read Time": N.N, +
"Temp I/O Write Time": N.N +
}, +
"Planning Time": N.N, +
"Triggers": [ +
], +
"Execution Time": N.N +
} +
]
(1 row)
set track_io_timing = off;
-- SETTINGS option
-- We have to ignore other settings that might be imposed by the environment,
-- so printing the whole Settings field unfortunately won't do.
begin;
set local plan_cache_mode = force_generic_plan;
select true as "OK"
from explain_filter('explain (settings) select * from int8_tbl i8') ln
where ln ~ '^ *Settings: .*plan_cache_mode = ''force_generic_plan''';
OK
----
t
(1 row)
select explain_filter_to_json('explain (settings, format json) select * from int8_tbl i8') #> '{0,Settings,plan_cache_mode}';
?column?
----------------------
"force_generic_plan"
(1 row)
rollback;
-- GENERIC_PLAN option
select explain_filter('explain (generic_plan) select unique1 from tenk1 where thousand = $1');
explain_filter
---------------------------------------------------------------------------------
Bitmap Heap Scan on tenk1 (cost=N.N..N.N rows=N width=N)
Recheck Cond: (thousand = $N)
-> Bitmap Index Scan on tenk1_thous_tenthous (cost=N.N..N.N rows=N width=N)
Index Cond: (thousand = $N)
(4 rows)
-- should fail
select explain_filter('explain (analyze, generic_plan) select unique1 from tenk1 where thousand = $1');
ERROR: EXPLAIN options ANALYZE and GENERIC_PLAN cannot be used together
CONTEXT: PL/pgSQL function explain_filter(text) line 5 at FOR over EXECUTE statement
-- MEMORY option
select explain_filter('explain (memory) select * from int8_tbl i8');
explain_filter
---------------------------------------------------------
Seq Scan on int8_tbl i8 (cost=N.N..N.N rows=N width=N)
Memory: used=N bytes allocated=N bytes
(2 rows)
select explain_filter('explain (memory, analyze) select * from int8_tbl i8');
explain_filter
-----------------------------------------------------------------------------------------------
Seq Scan on int8_tbl i8 (cost=N.N..N.N rows=N width=N) (actual time=N.N..N.N rows=N loops=N)
Memory: used=N bytes allocated=N bytes
Planning Time: N.N ms
Execution Time: N.N ms
(4 rows)
select explain_filter('explain (memory, summary, format yaml) select * from int8_tbl i8');
explain_filter
-------------------------------
- Plan: +
Node Type: "Seq Scan" +
Parallel Aware: false +
Async Capable: false +
Relation Name: "int8_tbl"+
Alias: "i8" +
Startup Cost: N.N +
Total Cost: N.N +
Plan Rows: N +
Plan Width: N +
Planning: +
Memory Used: N +
Memory Allocated: N +
Planning Time: N.N
(1 row)
select explain_filter('explain (memory, analyze, format json) select * from int8_tbl i8');
explain_filter
------------------------------------
[ +
{ +
"Plan": { +
"Node Type": "Seq Scan", +
"Parallel Aware": false, +
"Async Capable": false, +
"Relation Name": "int8_tbl",+
"Alias": "i8", +
"Startup Cost": N.N, +
"Total Cost": N.N, +
"Plan Rows": N, +
"Plan Width": N, +
"Actual Startup Time": N.N, +
"Actual Total Time": N.N, +
"Actual Rows": N, +
"Actual Loops": N +
}, +
"Planning": { +
"Memory Used": N, +
"Memory Allocated": N +
}, +
"Planning Time": N.N, +
"Triggers": [ +
], +
"Execution Time": N.N +
} +
]
(1 row)
prepare int8_query as select * from int8_tbl i8;
select explain_filter('explain (memory) execute int8_query');
explain_filter
---------------------------------------------------------
Seq Scan on int8_tbl i8 (cost=N.N..N.N rows=N width=N)
Memory: used=N bytes allocated=N bytes
(2 rows)
-- Test EXPLAIN (GENERIC_PLAN) with partition pruning
-- partitions should be pruned at plan time, based on constants,
-- but there should be no pruning based on parameter placeholders
create table gen_part (
key1 integer not null,
key2 integer not null
) partition by list (key1);
create table gen_part_1
partition of gen_part for values in (1)
partition by range (key2);
create table gen_part_1_1
partition of gen_part_1 for values from (1) to (2);
create table gen_part_1_2
partition of gen_part_1 for values from (2) to (3);
create table gen_part_2
partition of gen_part for values in (2);
-- should scan gen_part_1_1 and gen_part_1_2, but not gen_part_2
select explain_filter('explain (generic_plan) select key1, key2 from gen_part where key1 = 1 and key2 = $1');
explain_filter
---------------------------------------------------------------------------
Append (cost=N.N..N.N rows=N width=N)
-> Seq Scan on gen_part_1_1 gen_part_1 (cost=N.N..N.N rows=N width=N)
Filter: ((key1 = N) AND (key2 = $N))
-> Seq Scan on gen_part_1_2 gen_part_2 (cost=N.N..N.N rows=N width=N)
Filter: ((key1 = N) AND (key2 = $N))
(5 rows)
drop table gen_part;
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
--
-- Test production of per-worker data
--
-- Unfortunately, because we don't know how many worker processes we'll
-- actually get (maybe none at all), we can't examine the "Workers" output
-- in any detail. We can check that it parses correctly as JSON, and then
-- remove it from the displayed results.
begin;
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
-- encourage use of parallel plans
set parallel_setup_cost=0;
set parallel_tuple_cost=0;
set min_parallel_table_scan_size=0;
set max_parallel_workers_per_gather=4;
select jsonb_pretty(
explain_filter_to_json('explain (analyze, verbose, buffers, format json)
select * from tenk1 order by tenthous')
-- remove "Workers" node of the Seq Scan plan node
#- '{0,Plan,Plans,0,Plans,0,Workers}'
-- remove "Workers" node of the Sort plan node
#- '{0,Plan,Plans,0,Workers}'
-- Also remove its sort-type fields, as those aren't 100% stable
#- '{0,Plan,Plans,0,Sort Method}'
#- '{0,Plan,Plans,0,Sort Space Type}'
);
jsonb_pretty
-------------------------------------------------------------
[ +
{ +
"Plan": { +
"Plans": [ +
{ +
"Plans": [ +
{ +
"Alias": "tenk1", +
"Output": [ +
"unique1", +
"unique2", +
"two", +
"four", +
"ten", +
"twenty", +
"hundred", +
"thousand", +
"twothousand", +
"fivethous", +
"tenthous", +
"odd", +
"even", +
"stringu1", +
"stringu2", +
"string4" +
], +
"Schema": "public", +
"Node Type": "Seq Scan", +
"Plan Rows": 0, +
"Plan Width": 0, +
"Total Cost": 0.0, +
"Actual Rows": 0, +
"Actual Loops": 0, +
"Startup Cost": 0.0, +
Add support for asynchronous execution. This implements asynchronous execution, which runs multiple parts of a non-parallel-aware Append concurrently rather than serially to improve performance when possible. Currently, the only node type that can be run concurrently is a ForeignScan that is an immediate child of such an Append. In the case where such ForeignScans access data on different remote servers, this would run those ForeignScans concurrently, and overlap the remote operations to be performed simultaneously, so it'll improve the performance especially when the operations involve time-consuming ones such as remote join and remote aggregation. We may extend this to other node types such as joins or aggregates over ForeignScans in the future. This also adds the support for postgres_fdw, which is enabled by the table-level/server-level option "async_capable". The default is false. Robert Haas, Kyotaro Horiguchi, Thomas Munro, and myself. This commit is mostly based on the patch proposed by Robert Haas, but also uses stuff from the patch proposed by Kyotaro Horiguchi and from the patch proposed by Thomas Munro. Reviewed by Kyotaro Horiguchi, Konstantin Knizhnik, Andrey Lepikhov, Movead Li, Thomas Munro, Justin Pryzby, and others. Discussion: https://postgr.es/m/CA%2BTgmoaXQEt4tZ03FtQhnzeDEMzBck%2BLrni0UWHVVgOTnA6C1w%40mail.gmail.com Discussion: https://postgr.es/m/CA%2BhUKGLBRyu0rHrDCMC4%3DRn3252gogyp1SjOgG8SEKKZv%3DFwfQ%40mail.gmail.com Discussion: https://postgr.es/m/20200228.170650.667613673625155850.horikyota.ntt%40gmail.com
2021-03-31 11:45:00 +02:00
"Async Capable": false, +
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
"Relation Name": "tenk1", +
"Parallel Aware": true, +
"Local Hit Blocks": 0, +
"Temp Read Blocks": 0, +
"Actual Total Time": 0.0, +
"Local Read Blocks": 0, +
"Shared Hit Blocks": 0, +
"Shared Read Blocks": 0, +
"Actual Startup Time": 0.0, +
"Parent Relationship": "Outer",+
"Temp Written Blocks": 0, +
"Local Dirtied Blocks": 0, +
"Local Written Blocks": 0, +
"Shared Dirtied Blocks": 0, +
"Shared Written Blocks": 0 +
} +
], +
"Output": [ +
"unique1", +
"unique2", +
"two", +
"four", +
"ten", +
"twenty", +
"hundred", +
"thousand", +
"twothousand", +
"fivethous", +
"tenthous", +
"odd", +
"even", +
"stringu1", +
"stringu2", +
"string4" +
], +
"Sort Key": [ +
"tenk1.tenthous" +
], +
"Node Type": "Sort", +
"Plan Rows": 0, +
"Plan Width": 0, +
"Total Cost": 0.0, +
"Actual Rows": 0, +
"Actual Loops": 0, +
"Startup Cost": 0.0, +
Add support for asynchronous execution. This implements asynchronous execution, which runs multiple parts of a non-parallel-aware Append concurrently rather than serially to improve performance when possible. Currently, the only node type that can be run concurrently is a ForeignScan that is an immediate child of such an Append. In the case where such ForeignScans access data on different remote servers, this would run those ForeignScans concurrently, and overlap the remote operations to be performed simultaneously, so it'll improve the performance especially when the operations involve time-consuming ones such as remote join and remote aggregation. We may extend this to other node types such as joins or aggregates over ForeignScans in the future. This also adds the support for postgres_fdw, which is enabled by the table-level/server-level option "async_capable". The default is false. Robert Haas, Kyotaro Horiguchi, Thomas Munro, and myself. This commit is mostly based on the patch proposed by Robert Haas, but also uses stuff from the patch proposed by Kyotaro Horiguchi and from the patch proposed by Thomas Munro. Reviewed by Kyotaro Horiguchi, Konstantin Knizhnik, Andrey Lepikhov, Movead Li, Thomas Munro, Justin Pryzby, and others. Discussion: https://postgr.es/m/CA%2BTgmoaXQEt4tZ03FtQhnzeDEMzBck%2BLrni0UWHVVgOTnA6C1w%40mail.gmail.com Discussion: https://postgr.es/m/CA%2BhUKGLBRyu0rHrDCMC4%3DRn3252gogyp1SjOgG8SEKKZv%3DFwfQ%40mail.gmail.com Discussion: https://postgr.es/m/20200228.170650.667613673625155850.horikyota.ntt%40gmail.com
2021-03-31 11:45:00 +02:00
"Async Capable": false, +
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
"Parallel Aware": false, +
"Sort Space Used": 0, +
"Local Hit Blocks": 0, +
"Temp Read Blocks": 0, +
"Actual Total Time": 0.0, +
"Local Read Blocks": 0, +
"Shared Hit Blocks": 0, +
"Shared Read Blocks": 0, +
"Actual Startup Time": 0.0, +
"Parent Relationship": "Outer", +
"Temp Written Blocks": 0, +
"Local Dirtied Blocks": 0, +
"Local Written Blocks": 0, +
"Shared Dirtied Blocks": 0, +
"Shared Written Blocks": 0 +
} +
], +
"Output": [ +
"unique1", +
"unique2", +
"two", +
"four", +
"ten", +
"twenty", +
"hundred", +
"thousand", +
"twothousand", +
"fivethous", +
"tenthous", +
"odd", +
"even", +
"stringu1", +
"stringu2", +
"string4" +
], +
"Node Type": "Gather Merge", +
"Plan Rows": 0, +
"Plan Width": 0, +
"Total Cost": 0.0, +
"Actual Rows": 0, +
"Actual Loops": 0, +
"Startup Cost": 0.0, +
Add support for asynchronous execution. This implements asynchronous execution, which runs multiple parts of a non-parallel-aware Append concurrently rather than serially to improve performance when possible. Currently, the only node type that can be run concurrently is a ForeignScan that is an immediate child of such an Append. In the case where such ForeignScans access data on different remote servers, this would run those ForeignScans concurrently, and overlap the remote operations to be performed simultaneously, so it'll improve the performance especially when the operations involve time-consuming ones such as remote join and remote aggregation. We may extend this to other node types such as joins or aggregates over ForeignScans in the future. This also adds the support for postgres_fdw, which is enabled by the table-level/server-level option "async_capable". The default is false. Robert Haas, Kyotaro Horiguchi, Thomas Munro, and myself. This commit is mostly based on the patch proposed by Robert Haas, but also uses stuff from the patch proposed by Kyotaro Horiguchi and from the patch proposed by Thomas Munro. Reviewed by Kyotaro Horiguchi, Konstantin Knizhnik, Andrey Lepikhov, Movead Li, Thomas Munro, Justin Pryzby, and others. Discussion: https://postgr.es/m/CA%2BTgmoaXQEt4tZ03FtQhnzeDEMzBck%2BLrni0UWHVVgOTnA6C1w%40mail.gmail.com Discussion: https://postgr.es/m/CA%2BhUKGLBRyu0rHrDCMC4%3DRn3252gogyp1SjOgG8SEKKZv%3DFwfQ%40mail.gmail.com Discussion: https://postgr.es/m/20200228.170650.667613673625155850.horikyota.ntt%40gmail.com
2021-03-31 11:45:00 +02:00
"Async Capable": false, +
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
"Parallel Aware": false, +
"Workers Planned": 0, +
"Local Hit Blocks": 0, +
"Temp Read Blocks": 0, +
"Workers Launched": 0, +
"Actual Total Time": 0.0, +
"Local Read Blocks": 0, +
"Shared Hit Blocks": 0, +
"Shared Read Blocks": 0, +
"Actual Startup Time": 0.0, +
"Temp Written Blocks": 0, +
"Local Dirtied Blocks": 0, +
"Local Written Blocks": 0, +
"Shared Dirtied Blocks": 0, +
"Shared Written Blocks": 0 +
}, +
"Planning": { +
"Local Hit Blocks": 0, +
"Temp Read Blocks": 0, +
"Local Read Blocks": 0, +
"Shared Hit Blocks": 0, +
"Shared Read Blocks": 0, +
"Temp Written Blocks": 0, +
"Local Dirtied Blocks": 0, +
"Local Written Blocks": 0, +
"Shared Dirtied Blocks": 0, +
"Shared Written Blocks": 0 +
}, +
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
"Triggers": [ +
], +
"Planning Time": 0.0, +
Clean up EXPLAIN's handling of per-worker details. Previously, it was possible for EXPLAIN ANALYZE of a parallel query to produce several different "Workers" fields for a single plan node, because different portions of explain.c independently generated per-worker data and wrapped that output in separate fields. This is pretty bogus, especially for the structured output formats: even if it's not technically illegal, most programs would have a hard time dealing with such data. To improve matters, add infrastructure that allows redirecting per-worker values into a side data structure, and then collect that data into a single "Workers" field after we've finished running all the relevant code for a given plan node. There are a few visible side-effects: * In text format, instead of something like Sort Method: external merge Disk: 4920kB Worker 0: Sort Method: external merge Disk: 5880kB Worker 1: Sort Method: external merge Disk: 5920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Buffers: shared hit=345 read=3507, temp read=505 written=744 you get Sort Method: external merge Disk: 4920kB Buffers: shared hit=682 read=10188, temp read=1415 written=2101 Worker 0: actual time=130.058..130.324 rows=1324 loops=1 Sort Method: external merge Disk: 5880kB Buffers: shared hit=337 read=3489, temp read=505 written=739 Worker 1: actual time=130.273..130.512 rows=1297 loops=1 Sort Method: external merge Disk: 5920kB Buffers: shared hit=345 read=3507, temp read=505 written=744 * When JIT is enabled, any relevant per-worker JIT stats are attached to the child node of the Gather or Gather Merge node, which is where the other per-worker output has always been. Previously, that info was attached directly to a Gather node, or missed entirely for Gather Merge. * A query's summary JIT data no longer includes a bogus "Worker Number: -1" field. A notable code-level change is that indenting for lines of text-format output should now be handled by calling "ExplainIndentText(es)", instead of hard-wiring how much space to emit. This seems a good deal cleaner anyway. This patch also adds a new "explain.sql" regression test script that's dedicated to testing EXPLAIN. There is more that can be done in that line, certainly, but for now it just adds some coverage of the XML and YAML output formats, which had been completely untested. Although this is surely a bug fix, it's not clear that people would be happy with rearranging EXPLAIN output in a minor release, so apply to HEAD only. Maciek Sakrejda and Tom Lane, based on an idea of Andres Freund's; reviewed by Georgios Kokolatos Discussion: https://postgr.es/m/CAOtHd0AvAA8CLB9Xz0wnxu1U=zJCKrr1r4QwwXi_kcQsHDVU=Q@mail.gmail.com
2020-01-26 00:16:42 +01:00
"Execution Time": 0.0 +
} +
]
(1 row)
rollback;
-- Test display of temporary objects
create temp table t1(f1 float8);
create function pg_temp.mysin(float8) returns float8 language plpgsql
as 'begin return sin($1); end';
select explain_filter('explain (verbose) select * from t1 where pg_temp.mysin(f1) < 0.5');
explain_filter
------------------------------------------------------------
Seq Scan on pg_temp.t1 (cost=N.N..N.N rows=N width=N)
Output: f1
Filter: (pg_temp.mysin(t1.f1) < 'N.N'::double precision)
(3 rows)
-- Test compute_query_id
set compute_query_id = on;
select explain_filter('explain (verbose) select * from int8_tbl i8');
explain_filter
----------------------------------------------------------------
Seq Scan on public.int8_tbl i8 (cost=N.N..N.N rows=N width=N)
Output: q1, q2
Query Identifier: N
(3 rows)