Change aggregated log format of pgbench.

Commit 4a39f87acd changed the aggregated log format. Problem is, now
the explanatory paragraph for the log line in the document is too
long. Also the log format included more optional columns, and it's
harder to parse the log lines.  This commit tries to solve the
problems.

- There's no optional log columns anymore. If a column is not
  meaningful with provided pgbench option, it will be presented as 0.

- Reorder the log columns so that it's easier to parse them.

- Adjust explanatory paragraph for the log line in the doc.

Discussion: https://postgr.es/m/flat/202203280757.3tu4ovs3petm%40alvherre.pgsql
This commit is contained in:
Tatsuo Ishii 2022-04-06 09:55:58 +09:00
parent e37ad5fa4d
commit 17a856d08b
2 changed files with 82 additions and 41 deletions

View File

@ -2401,7 +2401,9 @@ END;
format is used for the log files:
<synopsis>
<replaceable>interval_start</replaceable> <replaceable>num_transactions</replaceable> <replaceable>sum_latency</replaceable> <replaceable>sum_latency_2</replaceable> <replaceable>min_latency</replaceable> <replaceable>max_latency</replaceable> { <replaceable>failures</replaceable> | <replaceable>serialization_failures</replaceable> <replaceable>deadlock_failures</replaceable> } <optional> <replaceable>sum_lag</replaceable> <replaceable>sum_lag_2</replaceable> <replaceable>min_lag</replaceable> <replaceable>max_lag</replaceable> <optional> <replaceable>skipped</replaceable> </optional> </optional> <optional> <replaceable>retried</replaceable> <replaceable>retries</replaceable> </optional>
<replaceable>interval_start</replaceable> <replaceable>num_transactions</replaceable> <replaceable>sum_latency</replaceable> <replaceable>sum_latency_2</replaceable> <replaceable>min_latency</replaceable> <replaceable>max_latency</replaceable>
<replaceable>sum_lag</replaceable> <replaceable>sum_lag_2</replaceable> <replaceable>min_lag</replaceable> <replaceable>max_lag</replaceable> <replaceable>skipped</replaceable>
<replaceable>retried</replaceable> <replaceable>retries</replaceable> <replaceable>failures</replaceable> <replaceable>serialization_failures</replaceable> <replaceable>deadlock_failures</replaceable>
</synopsis>
where
@ -2417,41 +2419,55 @@ END;
and
<replaceable>max_latency</replaceable> is the maximum latency within the interval,
<replaceable>failures</replaceable> is the number of transactions that ended
with a failed SQL command within the interval. If you use option
<option>--failures-detailed</option>, instead of the sum of all failed
transactions you will get more detailed statistics for the failed
transactions grouped by the following types:
<replaceable>serialization_failures</replaceable> is the number of
transactions that got a serialization error and were not retried after this,
<replaceable>deadlock_failures</replaceable> is the number of transactions
that got a deadlock error and were not retried after this.
with a failed SQL command within the interval.
</para>
<para>
The next fields,
<replaceable>sum_lag</replaceable>, <replaceable>sum_lag_2</replaceable>, <replaceable>min_lag</replaceable>,
and <replaceable>max_lag</replaceable>, are only present if the <option>--rate</option>
option is used.
and <replaceable>max_lag</replaceable>, only meaningful if the <option>--rate</option>
option is used. Otherwise, they are all 0.0.
They provide statistics about the time each transaction had to wait for the
previous one to finish, i.e., the difference between each transaction's
scheduled start time and the time it actually started.
The next field, <replaceable>skipped</replaceable>,
is only present if the <option>--latency-limit</option> option is used, too.
is only meaningful if the <option>--latency-limit</option> option is used, too. Otherwise it is 0.
It counts the number of transactions skipped because they would have
started too late.
The <replaceable>retried</replaceable> and <replaceable>retries</replaceable>
fields are present only if the <option>--max-tries</option> option is not
equal to 1. They report the number of retried transactions and the sum of all
retries after serialization or deadlock errors within the interval.
Each transaction is counted in the interval when it was committed.
</para>
<para>
The <replaceable>retried</replaceable>
and <replaceable>retries</replaceable> fields are only meaningful if
the <option>--max-tries</option> option is not equal to 1. Otherwise they
are 0. They report the number of retried transactions and the sum of all
retries after serialization or deadlock errors within the interval. Each
transaction is counted in the interval when it was committed.
</para>
<para>
<replaceable>failures</replaceable> is the sum of all failed transactions.
If <option>--failures-detailed</option> is specified, instead of the sum of
all failed transactions you will get more detailed statistics for the
failed transactions grouped by the following types:
<replaceable>serialization_failures</replaceable> is the number of
transactions that got a serialization error and were not retried after this,
<replaceable>deadlock_failures</replaceable> is the number of transactions
that got a deadlock error and were not retried after this.
If <option>--failures-detailed</option> is not
specified, <replaceable>serialization_failures</replaceable>
and <replaceable>deadlock_failures</replaceable> are always 0.
</para>
<para>
Here is some example output:
Here is some example output with following options:
<screen>
1345828501 5601 1542744 483552416 61 2573 0
1345828503 7884 1979812 565806736 60 1479 0
1345828505 7208 1979422 567277552 59 1391 0
1345828507 7685 1980268 569784714 60 1398 0
1345828509 7073 1979779 573489941 236 1411 0
</screen></para>
pgbench --aggregate-interval=10 --time=20 --client=10 --log --rate=1000
--latency-limit=10 --failures-detailed --max-tries=10 test
</screen>
<screen>
1649114136 5815 27552565 177846919143 1078 21716 2756787 7264696105 0 9661 0 7854 31472 4022 4022 0
1649114146 5958 28460110 182785513108 1083 20391 2539395 6411761497 0 7268 0 8127 32595 4101 4101 0
</screen>
</para>
<para>
Notice that while the plain (unaggregated) log file shows which script

View File

@ -4494,6 +4494,17 @@ doLog(TState *thread, CState *st,
while ((next = agg->start_time + agg_interval * INT64CONST(1000000)) <= now)
{
double lag_sum = 0.0;
double lag_sum2 = 0.0;
double lag_min = 0.0;
double lag_max = 0.0;
int64 skipped = 0;
int64 serialization_failures = 0;
int64 deadlock_failures = 0;
int64 serialization_or_deadlock_failures = 0;
int64 retried = 0;
int64 retries = 0;
/* print aggregated report to logfile */
fprintf(logfile, INT64_FORMAT " " INT64_FORMAT " %.0f %.0f %.0f %.0f",
agg->start_time / 1000000, /* seconds since Unix epoch */
@ -4503,27 +4514,41 @@ doLog(TState *thread, CState *st,
agg->latency.min,
agg->latency.max);
if (failures_detailed)
fprintf(logfile, " " INT64_FORMAT " " INT64_FORMAT,
agg->serialization_failures,
agg->deadlock_failures);
else
fprintf(logfile, " " INT64_FORMAT, getFailures(agg));
if (throttle_delay)
{
fprintf(logfile, " %.0f %.0f %.0f %.0f",
agg->lag.sum,
agg->lag.sum2,
agg->lag.min,
agg->lag.max);
if (latency_limit)
fprintf(logfile, " " INT64_FORMAT, agg->skipped);
lag_sum = agg->lag.sum;
lag_sum2 = agg->lag.sum2;
lag_min = agg->lag.min;
lag_max = agg->lag.max;
}
fprintf(logfile, " %.0f %.0f %.0f %.0f",
lag_sum,
lag_sum2,
lag_min,
lag_max);
if (latency_limit)
skipped = agg->skipped;
fprintf(logfile, " " INT64_FORMAT, skipped);
if (max_tries != 1)
fprintf(logfile, " " INT64_FORMAT " " INT64_FORMAT,
agg->retried,
agg->retries);
{
retried = agg->retried;
retries = agg->retries;
}
fprintf(logfile, " " INT64_FORMAT " " INT64_FORMAT, retried, retries);
if (failures_detailed)
{
serialization_failures = agg->serialization_failures;
deadlock_failures = agg->deadlock_failures;
}
serialization_or_deadlock_failures = serialization_failures + deadlock_failures;
fprintf(logfile, " " INT64_FORMAT " " INT64_FORMAT " " INT64_FORMAT,
serialization_or_deadlock_failures,
serialization_failures,
deadlock_failures);
fputc('\n', logfile);
/* reset data and move to next interval */