Allow pgbench to exit immediately when any client is aborted.

Previously when client was aborted due to some error during
benchmarking, other clients continued their run until certain number
of transactions specified -t was reached or the time specified by -T
was expired. At the end, the results are printed with caution: "Run
was aborted; the above results are incomplete" shows.

New option "--exit-on-abort" allows pgbench to exit immediately in
this case so that users could quickly fix the cause of the failure and
try again another round of benchmarking.

Author: Yugo Nagata
Reviewed-by: Fabien COELHO, Tatsuo Ishii
Discussion: https://postgr.es/m/flat/20230804130325.df32e60879c38c92bca64207%40sraoss.co.jp
This commit is contained in:
Tatsuo Ishii 2023-08-30 10:03:31 +09:00
parent 8bf7db0285
commit 3c662643c4
3 changed files with 83 additions and 4 deletions

View File

@ -768,6 +768,24 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
</listitem>
</varlistentry>
<varlistentry id="pgbench-option-exit-on-abort">
<term><option>--exit-on-abort</option></term>
<listitem>
<para>
Exit immediately when any client is aborted due to some error. Without
this option, even when a client is aborted, other clients could
continue their run as specified by <option>-t</option>
or <option>-T</option> option, and <application>pgbench</application>
will print an incomplete results in this case.
</para>
<para>
Note that serialization failures or deadlock failures do not abort the
client, so they are not affected by this option.
See <xref linkend="failures-and-retries"/> for more information.
</para>
</listitem>
</varlistentry>
<varlistentry id="pgbench-option-failures-detailed">
<term><option>--failures-detailed</option></term>
<listitem>
@ -985,7 +1003,8 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
benchmark such as initial connection failures also exit with status 1.
Errors during the run such as database errors or problems in the script
will result in exit status 2. In the latter case,
<application>pgbench</application> will print partial results.
<application>pgbench</application> will print partial results if
<option>--exit-on-abort</option> option is not specified.
</para>
</refsect1>
@ -2801,14 +2820,17 @@ statement latencies in milliseconds, failures and retries:
start a connection to the database server / the socket for connecting
the client to the database server has become invalid). In such cases
all clients of this thread stop while other threads continue to work.
However, <option>--exit-on-abort</option> is specified, all of the
threads stop immediately in this case.
</para>
</listitem>
<listitem>
<para>
Direct client errors. They lead to immediate exit from
<application>pgbench</application> with the corresponding error message
only in the case of an internal <application>pgbench</application>
error (which are supposed to never occur...). Otherwise in the worst
in the case of an internal <application>pgbench</application>
error (which are supposed to never occur...) or when
<option>--exit-on-abort</option> is specified. Otherwise in the worst
case they only lead to the abortion of the failed client while other
clients continue their run (but some client errors are handled without
an abortion of the client and reported separately, see below). Later in

View File

@ -765,6 +765,8 @@ static int64 total_weight = 0;
static bool verbose_errors = false; /* print verbose messages of all errors */
static bool exit_on_abort = false; /* exit when any client is aborted */
/* Builtin test scripts */
typedef struct BuiltinScript
{
@ -911,6 +913,7 @@ usage(void)
" -T, --time=NUM duration of benchmark test in seconds\n"
" -v, --vacuum-all vacuum all four standard tables before tests\n"
" --aggregate-interval=NUM aggregate data over NUM seconds\n"
" --exit-on-abort exit when any client is aborted\n"
" --failures-detailed report the failures grouped by basic types\n"
" --log-prefix=PREFIX prefix for transaction time log file\n"
" (default: \"pgbench_log\")\n"
@ -6617,6 +6620,7 @@ main(int argc, char **argv)
{"failures-detailed", no_argument, NULL, 13},
{"max-tries", required_argument, NULL, 14},
{"verbose-errors", no_argument, NULL, 15},
{"exit-on-abort", no_argument, NULL, 16},
{NULL, 0, NULL, 0}
};
@ -6950,6 +6954,10 @@ main(int argc, char **argv)
benchmarking_option_set = true;
verbose_errors = true;
break;
case 16: /* exit-on-abort */
benchmarking_option_set = true;
exit_on_abort = true;
break;
default:
/* getopt_long already emitted a complaint */
pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@ -7558,11 +7566,18 @@ threadRun(void *arg)
advanceConnectionState(thread, st, &aggs);
/*
* If --exit-on-abort is used, the program is going to exit
* when any client is aborted.
*/
if (exit_on_abort && st->state == CSTATE_ABORTED)
goto done;
/*
* If advanceConnectionState changed client to finished state,
* that's one fewer client that remains.
*/
if (st->state == CSTATE_FINISHED || st->state == CSTATE_ABORTED)
else if (st->state == CSTATE_FINISHED ||
st->state == CSTATE_ABORTED)
remains--;
}
@ -7595,6 +7610,22 @@ threadRun(void *arg)
}
done:
if (exit_on_abort)
{
/*
* Abort if any client is not finished, meaning some error occurred.
*/
for (int i = 0; i < nstate; i++)
{
if (state[i].state != CSTATE_FINISHED)
{
pg_log_error("Run was aborted due to an error in thread %d",
thread->tid);
exit(2);
}
}
}
disconnect_all(state, nstate);
if (thread->logfile)

View File

@ -1475,6 +1475,32 @@ SELECT pg_advisory_unlock_all();
# Clean up
$node->safe_psql('postgres', 'DROP TABLE first_client_table, xy;');
# Test --exit-on-abort
$node->safe_psql('postgres',
'CREATE TABLE counter(i int); '.
'INSERT INTO counter VALUES (0);'
);
$node->pgbench(
'-t 10 -c 2 -j 2 --exit-on-abort',
2,
[],
[
qr{division by zero},
qr{Run was aborted due to an error in thread}
],
'test --exit-on-abort',
{
'001_exit_on_abort' => q{
update counter set i = i+1 returning i \gset
\if :i = 5
\set y 1/0
\endif
}
});
# Clean up
$node->safe_psql('postgres', 'DROP TABLE counter;');
# done
$node->safe_psql('postgres', 'DROP TABLESPACE regress_pgbench_tap_1_ts');