<!-- doc/src/sgml/pgtesttiming.sgml -->

<refentry id="pgtesttiming">
 <refmeta>
  <refentrytitle><application>pg_test_timing</application></refentrytitle>
  <manvolnum>1</manvolnum>
  <refmiscinfo>Application</refmiscinfo>
 </refmeta>

 <refnamediv>
  <refname>pg_test_timing</refname>
  <refpurpose>measure timing overhead</refpurpose>
 </refnamediv>

 <indexterm zone="pgtesttiming">
  <primary>pg_test_timing</primary>
 </indexterm>

 <refsynopsisdiv>
  <cmdsynopsis>
   <command>pg_test_timing</command>
   <arg rep="repeat"><replaceable>option</replaceable></arg>
  </cmdsynopsis>
 </refsynopsisdiv>
 <refsect1>
  <title>Description</title>

  <para>
   <application>pg_test_timing</application> is a tool to measure the timing
   overhead on your system and confirm that the system time never moves
   backwards.  Systems that are slow to collect timing data can give less
   accurate <command>EXPLAIN ANALYZE</command> results.
  </para>
 </refsect1>
 <refsect1>
  <title>Options</title>

  <para>
   <application>pg_test_timing</application> accepts the following
   command-line options:

   <variablelist>
    <varlistentry>
     <term><option>-d <replaceable class="parameter">duration</replaceable></option></term>
     <term><option>--duration=<replaceable class="parameter">duration</replaceable></option></term>
     <listitem>
      <para>
       Specifies the test duration, in seconds.  Longer durations
       give slightly better accuracy, and are more likely to discover
       problems with the system clock moving backwards.  The default
       test duration is 3 seconds.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><option>-V</option></term>
     <term><option>--version</option></term>
     <listitem>
      <para>
       Print the <application>pg_test_timing</application> version and exit.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><option>-?</option></term>
     <term><option>--help</option></term>
     <listitem>
      <para>
       Show help about <application>pg_test_timing</application> command line
       arguments, and exit.
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>
 </refsect1>
 <refsect1>
  <title>Usage</title>

 <refsect2>
  <title>Interpreting results</title>

  <para>
   Good results will show most (>90%) individual timing calls take less than
   one microsecond.  Average per loop overhead will be even lower, below 100
   nanoseconds.  This example from an Intel i7-860 system using a TSC clock
   source shows excellent performance:

<screen>
Testing timing overhead for 3 seconds.
Per loop time including overhead: 35.96 nsec
Histogram of timing durations:
&lt; usec:      count   percent
     16:          2  0.00000%
      8:         13  0.00002%
      4:        126  0.00015%
      2:    2999652  3.59518%
      1:   80435604 96.40465%
</screen>
  </para>
  <para>
   Note that different units are used for the per loop time than the
   histogram.  The loop can have resolution within a few nanoseconds (nsec),
   while the individual timing calls can only resolve down to one microsecond
   (usec).
  </para>
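The measurement loop behind these numbers is straightforward. <application>pg_test_timing</application> itself is a C program; the following is only an illustrative Python sketch (not its actual implementation) of the same idea: read the clock in a tight loop, verify it never moves backwards, and bucket each call-to-call delta into the power-of-two microsecond ranges shown in the histogram above.

```python
import time

def timing_histogram(duration_s=0.1):
    # Read the clock in a tight loop, check it never moves backwards,
    # and bucket each delta into power-of-two microsecond ranges,
    # mirroring the "< usec" histogram format above.
    start = time.perf_counter_ns()
    prev = start
    end = start + int(duration_s * 1e9)
    loops = 0
    buckets = {}  # power-of-two upper bound in usec -> count
    while prev < end:
        cur = time.perf_counter_ns()
        if cur < prev:  # the backwards-clock check pg_test_timing performs
            raise RuntimeError("system time moved backwards")
        bound = 1
        while (cur - prev) / 1000 >= bound:
            bound *= 2
        buckets[bound] = buckets.get(bound, 0) + 1
        loops += 1
        prev = cur
    print(f"Per loop time including overhead: {(prev - start) / loops:.2f} nsec")
    for bound in sorted(buckets, reverse=True):
        print(f"< {bound:2d} usec: {buckets[bound]:10d} {100 * buckets[bound] / loops:9.5f}%")
    return loops, buckets

timing_histogram()
```

The Python interpreter adds far more per-call overhead than the C loop, so the absolute numbers will be much worse than the examples here; only the shape of the histogram is comparable.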
 </refsect2>

 <refsect2>
  <title>Measuring executor timing overhead</title>
  <para>
   When the query executor is running a statement using
   <command>EXPLAIN ANALYZE</command>, individual operations are timed in
   addition to the overall summary being shown.  The overhead of your system
   can be checked by counting rows with the <application>psql</application>
   program:

<screen>
CREATE TABLE t AS SELECT * FROM generate_series(1,100000);
\timing
SELECT COUNT(*) FROM t;
EXPLAIN ANALYZE SELECT COUNT(*) FROM t;
</screen>
  </para>

  <para>
   The i7-860 system measured runs the count query in 9.8 ms while
   the <command>EXPLAIN ANALYZE</command> version takes 16.6 ms, each
   processing just over 100,000 rows.  That 6.8 ms difference means the timing
   overhead per row is 68 ns, about twice what pg_test_timing estimated it
   would be.  Even that relatively small amount of overhead is making the fully
   timed count statement take almost 70% longer.  On more substantial queries,
   the timing overhead would be less problematic.
  </para>
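The arithmetic behind those figures can be spelled out explicitly (numbers transcribed from the example above; the doubling relative to the per-call estimate is presumably because timing a row requires two clock reads, start and stop):

```python
# Overhead figures from the example above: a plain count at 9.8 ms
# versus EXPLAIN ANALYZE at 16.6 ms, over 100,000 rows.
plain_ms = 9.8
timed_ms = 16.6
rows = 100000

diff_ms = timed_ms - plain_ms                     # 6.8 ms of added overhead
overhead_ns_per_row = diff_ms * 1_000_000 / rows  # ms -> ns, per row
slowdown = diff_ms / plain_ms

print(f"{overhead_ns_per_row:.0f} ns per row")  # 68 ns per row
print(f"{slowdown:.0%} longer")                 # 69% -- "almost 70% longer"
```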
 </refsect2>

 <refsect2>
  <title>Changing time sources</title>

  <para>
   On some newer Linux systems, it's possible to change the clock source used
   to collect timing data at any time.  A second example shows the slowdown
   possible from switching to the slower acpi_pm time source, on the same
   system used for the fast results above:

<screen>
# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm
# echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource
# pg_test_timing
Per loop time including overhead: 722.92 nsec
Histogram of timing durations:
&lt; usec:      count   percent
     16:          3  0.00007%
      8:        563  0.01357%
      4:       3241  0.07810%
      2:    2990371 72.05956%
      1:    1155682 27.84870%
</screen>
  </para>
  <para>
   In this configuration, the sample <command>EXPLAIN ANALYZE</command> above
   takes 115.9 ms.  That's 1061 nsec of timing overhead, again a small multiple
   of what's measured directly by this utility.  That much timing overhead
   means the actual query itself is only taking a tiny fraction of the
   accounted-for time; most of it is being consumed in overhead instead.  In
   this configuration, any <command>EXPLAIN ANALYZE</command> totals involving
   many timed operations would be inflated significantly by timing overhead.
  </para>
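The same arithmetic as before, applied to the acpi_pm case, shows just how dominant the overhead becomes (values transcribed from the examples above):

```python
# Plain count query from the earlier example versus EXPLAIN ANALYZE
# with the slower acpi_pm clock source, over 100,000 rows.
plain_ms = 9.8
timed_ms = 115.9
rows = 100000

diff_ms = timed_ms - plain_ms
overhead_ns_per_row = diff_ms * 1_000_000 / rows
overhead_fraction = diff_ms / timed_ms

print(f"{overhead_ns_per_row:.0f} ns per row")                  # 1061 ns
print(f"{overhead_fraction:.0%} of the timed run is overhead")  # 92%
```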
  <para>
   FreeBSD also allows changing the time source on the fly, and it logs
   information about the timer selected during boot:

<screen>
dmesg | grep "Timecounter"
sysctl kern.timecounter.hardware=TSC
</screen>
  </para>
  <para>
   Other systems may only allow setting the time source on boot.  On older
   Linux systems the "clock" kernel setting is the only way to make this sort
   of change.  And even on some more recent ones, the only option you'll see
   for a clock source is "jiffies".  Jiffies are the older Linux software clock
   implementation, which can have good resolution when it's backed by fast
   enough timing hardware, as in this example:

<screen>
$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
jiffies
$ dmesg | grep time.c
time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
time.c: Detected 2400.153 MHz processor.
$ pg_test_timing
Testing timing overhead for 3 seconds.
Per timing duration including loop overhead: 97.75 ns
Histogram of timing durations:
&lt; usec:      count   percent
     32:          1  0.00000%
     16:          1  0.00000%
      8:         22  0.00007%
      4:       3010  0.00981%
      2:    2993204  9.75277%
      1:   27694571 90.23734%
</screen></para>
 </refsect2>
 <refsect2>
  <title>Clock hardware and timing accuracy</title>

  <para>
   Collecting accurate timing information is normally done on computers using
   hardware clocks with various levels of accuracy.  With some hardware, the
   operating system can pass the system clock time almost directly to
   programs.  A system clock can also be derived from a chip that simply
   provides timing interrupts, periodic ticks at some known time interval.  In
   either case, operating system kernels provide a clock source that hides
   these details.  But the accuracy of that clock source and how quickly it can
   return results varies based on the underlying hardware.
  </para>
  <para>
   Inaccurate time keeping can result in system instability.  Test any change
   to the clock source very carefully.  Operating system defaults are sometimes
   made to favor reliability over best accuracy.  And if you are using a
   virtual machine, look into the recommended time sources compatible with it.
   Virtual hardware faces additional difficulties when emulating timers, and
   there are often per-operating-system settings suggested by vendors.
  </para>
  <para>
   The Time Stamp Counter (TSC) clock source is the most accurate one available
   on current generation CPUs.  It's the preferred way to track the system time
   when it's supported by the operating system and the TSC clock is
   reliable.  There are several ways that TSC can fail to provide an accurate
   timing source, making it unreliable.  Older systems can have a TSC clock
   that varies based on the CPU temperature, making it unusable for timing.
   Trying to use TSC on some older multicore CPUs can give a reported time
   that's inconsistent among multiple cores.  This can result in the time going
   backwards, a problem this program checks for.  And even the newest systems
   can fail to provide accurate TSC timing with very aggressive power saving
   configurations.
  </para>
  <para>
   Newer operating systems may check for the known TSC problems and switch to a
   slower, more stable clock source when they are seen.  If your system
   supports TSC time but doesn't default to that, it may be disabled for a good
   reason.  And some operating systems may not detect all the possible problems
   correctly, or will allow using TSC even in situations where it's known to be
   inaccurate.
  </para>
  <para>
   The High Precision Event Timer (HPET) is the preferred timer on systems
   where it's available and TSC is not accurate.  The timer chip itself is
   programmable to allow up to 100 nanosecond resolution, but you may not see
   that much accuracy in your system clock.
  </para>
  <para>
   Advanced Configuration and Power Interface (ACPI) provides a Power
   Management (PM) Timer, which Linux refers to as the acpi_pm.  The clock
   derived from acpi_pm will at best provide 300 nanosecond resolution.
  </para>
  <para>
   Timers used on older PC hardware include the 8254 Programmable Interval
   Timer (PIT), the real-time clock (RTC), the Advanced Programmable Interrupt
   Controller (APIC) timer, and the Cyclone timer.  These timers aim for
   millisecond resolution.
  </para>
 </refsect2>
 </refsect1>

 <refsect1>
  <title>Author</title>

  <para>
   Ants Aasma <email>ants.aasma@eesti.ee</email>
  </para>
 </refsect1>

 <refsect1>
  <title>See Also</title>

  <simplelist type="inline">
   <member><xref linkend="sql-explain"></member>
  </simplelist>
 </refsect1>
</refentry>