postgresql/src/tools/pgindent
Andres Freund 141fd1b66c Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
   more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
   entries - instead store them as a Datum array. This also makes it
   possible to get rid of building dummy tuples for negative & list
   entries, and of a hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so important entries are more
   likely to be on one cacheline.
4) Allowing the compiler to specialize critical SearchCatCache for a
   specific number of attributes lets it unroll loops and avoid
   other nkeys dependent initialization.
5) Only initializing the ScanKey when necessary, i.e. catcache misses,
   greatly reduces unnecessary CPU cache misses.
6) Split off the cache-miss case from the hash lookup, reducing stack
   allocations etc. in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
   piece.

This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.

I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
  shows up in profiles. Unfortunately it's not easy to do so safely as
  an entry's memory location can change at various times, which
  doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
  but the win isn't big and the code for it is ugly, because the
  tuples have to be freed as well.
- add more proper functions, rather than macros for
  SearchSysCacheCopyN etc., but right now they don't show up in
  profiles.

The reason the macro wrappers for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside.  That might be a good idea
anyway, but it's for another day.

Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 14:22:41 -07:00
README Remove entab and associated detritus. 2017-06-21 15:46:39 -04:00
exclude_file_patterns Teach pgindent to skip files generated by bison or flex automatically. 2017-06-16 23:14:40 -04:00
perltidyrc Remove whitespace from end of lines 2013-05-30 21:05:07 -04:00
pgindent Final pgindent + perltidy run for v10. 2017-08-14 17:29:33 -04:00
pgindent.man Remove entab and associated detritus. 2017-06-21 15:46:39 -04:00
pgperltidy Simplify the process of perltidy'ing our Perl files. 2016-08-15 11:32:09 -04:00
typedefs.list Improve sys/catcache performance. 2017-10-13 14:22:41 -07:00

README

pgindent'ing the PostgreSQL source tree
=======================================

We run this process at least once in each development cycle,
to maintain uniform layout style in our C and Perl code.

You might find this blog post interesting:
http://adpgtech.blogspot.com/2015/05/running-pgindent-on-non-core-code-or.html


PREREQUISITES:

1) Install pg_bsd_indent in your PATH.  Fetch its source code with
   git clone https://git.postgresql.org/git/pg_bsd_indent.git
   then follow the directions in README.pg_bsd_indent therein.

2) Install perltidy.  Please be sure it is v20090616 (older and newer
   versions make different formatting choices, and we want consistency).  See
   https://sourceforge.net/projects/perltidy/files/perltidy/perltidy-20090616/

DOING THE INDENT RUN:

1) Change directory to the top of the source tree.

2) Download the latest typedef file from the buildfarm:

	wget -O src/tools/pgindent/typedefs.list https://buildfarm.postgresql.org/cgi-bin/typedefs.pl

   (See https://www.pgbuildfarm.org/cgi-bin/typedefs.pl?show_list for a full
   list of typedef files, if you want to indent some back branch.)

3) Run pgindent on the C files:

	src/tools/pgindent/pgindent

   If any files generate errors, restore their original versions with
   "git checkout", and see below for cleanup ideas.

4) Indent the Perl code using perltidy:

	src/tools/pgindent/pgperltidy

   If you want to use some perltidy version that's not in your PATH,
   first set the PERLTIDY environment variable to point to it.
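
Steps 2 through 4 can be chained in a small wrapper, sketched here.  The
"run" helper and the DRY_RUN variable are inventions of this example, not
pgindent features; with DRY_RUN set, the commands are printed instead of
executed.  The script assumes it is started from the top of the source tree.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN is set (hypothetical knob).
run() {
	if [ -n "${DRY_RUN:-}" ]; then
		printf '%s\n' "$*"
	else
		"$@"
	fi
}

indent_run() {
	# Step 2: fetch the latest typedef list from the buildfarm.
	run wget -O src/tools/pgindent/typedefs.list \
		https://buildfarm.postgresql.org/cgi-bin/typedefs.pl || return 1
	# Step 3: indent the C files.
	run src/tools/pgindent/pgindent || return 1
	# Step 4: indent the Perl code.
	run src/tools/pgindent/pgperltidy || return 1
}
```

Calling "DRY_RUN=1 indent_run" shows what would be executed; calling
"indent_run" alone performs the run and stops at the first failing step.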

VALIDATION:

1) Check for any newly-created files using "git status"; there shouldn't
   be any.  (pgindent leaves *.BAK files behind if it has trouble, while
   perltidy leaves *.LOG files behind.)

2) Do a full test build:

	make -s clean
	make -s all	# look for unexpected warnings, and errors of course
	make check-world

   Your configure switches should include at least --enable-tap-tests
   or else much of the Perl code won't get exercised.
   The ecpg regression tests may well fail due to pgindent's updates of
   header files that get copied into ecpg output; if so, adjust the
   expected-files to match.

3) If you have the patience, it's worth eyeballing the "git diff" output
   for any egregiously ugly changes.  See below for cleanup ideas.
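
As a complement to the "git status" check in validation step 1, leftover
files can also be searched for by name.  This is only a sketch: the function
name is made up, and it assumes the tree has no legitimately tracked files
ending in .BAK or .LOG.

```shell
#!/bin/sh
# Print any *.BAK or *.LOG files beneath a directory (default: ".").
find_leftovers() {
	find "${1:-.}" -type f \( -name '*.BAK' -o -name '*.LOG' \) -print
}
```

No output from "find_leftovers" means neither pgindent nor perltidy left
anything behind.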


When you're done, "git commit" everything including the typedefs.list file
you used.


---------------------------------------------------------------------------

Cleaning up in case of failure or ugly output
---------------------------------------------

If you don't like the results for any particular file, "git checkout"
that file to undo the changes, patch the file as needed, then repeat
the indent process.

pgindent will reflow any comment block that's not at the left margin.
If this messes up manual formatting that ought to be preserved, protect
the comment block with some dashes:

	/*----------
	 * Text here will not be touched by pgindent.
	 *----------
	 */

Odd spacing around typedef names might indicate an incomplete typedefs list.

pgindent can get confused by #if sequences that look correct to the compiler
but have mismatched braces/parentheses when considered as a whole.  Usually
that looks pretty unreadable to humans too, so best practice is to rearrange
the #if tests to avoid it.
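
To illustrate the #if problem, the sketch below counts braces across all
branches at once, which is roughly how a formatter that does not evaluate
the preprocessor sees the file.  The C fragments and all names in them are
hypothetical, and the counter is deliberately crude: it does not understand
strings, character literals, or comments.

```shell
#!/bin/sh
# Print the number of '{' minus the number of '}' in the file named by $1.
brace_balance() {
	opens=$(( $(tr -cd '{' < "$1" | wc -c) ))
	closes=$(( $(tr -cd '}' < "$1" | wc -c) ))
	echo $((opens - closes))
}

workdir=$(mktemp -d)

# Each branch opens a brace, but a single brace closes both.
cat > "$workdir/confusing.c" <<'EOF'
#ifdef WIN32
	if (use_threads)
	{
#else
	if (use_fork)
	{
#endif
		start_worker();
	}
EOF

# Rearranged so that only the condition varies between branches.
cat > "$workdir/rearranged.c" <<'EOF'
#ifdef WIN32
	start = use_threads;
#else
	start = use_fork;
#endif
	if (start)
	{
		start_worker();
	}
EOF

brace_balance "$workdir/confusing.c"	# prints 1
brace_balance "$workdir/rearranged.c"	# prints 0
```

A nonzero balance does not prove that pgindent will misbehave, but it is a
cheap hint about which #if sequences are worth rearranging.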

Sometimes, if pgindent or perltidy produces odd-looking output, it's because
of minor bugs like extra commas.  Don't hesitate to clean that up while
you're at it.

---------------------------------------------------------------------------

BSD indent
----------

We have standardized on FreeBSD's indent, and renamed it pg_bsd_indent.
pg_bsd_indent does differ slightly from FreeBSD's version, mostly in
being more easily portable to non-BSD platforms.  You can obtain it from
https://git.postgresql.org/git/pg_bsd_indent.git

GNU indent, version 2.2.6, has several problems, and is not recommended.
These bugs become pretty major when you are indenting more than 500k lines
of code.  If you don't believe me, take a directory and make a copy.  Run
pgindent on the copy using GNU indent, and do a diff -r.  You will see what
I mean.  GNU indent does some things better, but it also mangles some code.
For details, see:

	http://archives.postgresql.org/pgsql-hackers/2003-10/msg00374.php
	http://archives.postgresql.org/pgsql-hackers/2011-04/msg01436.php

---------------------------------------------------------------------------

Which files are processed
-------------------------

The pgindent run processes (nearly) all PostgreSQL *.c and *.h files,
but we currently exclude *.y and *.l files, as well as *.c and *.h files
derived from *.y and *.l files.  Additional exceptions are listed
in exclude_file_patterns:

src/include/storage/s_lock.h and src/include/port/atomics/ are excluded
because they contain assembly code that pgindent tends to mess up.

src/backend/utils/fmgrtab.c is excluded because it confuses pgindent
and it's a derived file anyway.

src/interfaces/ecpg/test/expected/ is excluded to avoid breaking the ecpg
regression tests, since what ecpg generates is not necessarily formatted
as pgindent would do it.  (Note that we do not exclude ecpg's header files
from the run; some of them get copied verbatim into ecpg's output, meaning
that the expected files may need to be updated to match.)

src/include/snowball/libstemmer/ and src/backend/snowball/libstemmer/
are excluded because those files are imported from an external project,
not maintained locally, and are machine-generated anyway.  Likewise for
plperl/ppport.h.
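
The effect of an exclusion list can be approximated with grep.  This sketch
assumes that each line of the patterns file is a regular expression matched
against the file path and that the file has no blank lines; grep's extended
regular expressions are not identical to Perl's, so treat it as an
illustration and not as a description of how pgindent itself applies
exclude_file_patterns.  The function name is made up.

```shell
#!/bin/sh
# Print the paths read from stdin that match none of the regular
# expressions listed, one per line, in the file named by $1.
filter_excluded() {
	grep -v -E -f "$1"
}
```

For example, "find src -name '*.[ch]' | filter_excluded patterns.txt"
prints the candidate files that survive the exclusions in a hypothetical
patterns.txt.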


The perltidy run processes all *.pl and *.pm files, plus a few
executable Perl scripts that are not named that way.  See the "find"
rules in pgperltidy for details.
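
The name-based part of that selection can be sketched as follows.  This is
not a copy of the rules in pgperltidy: it covers only the *.pl and *.pm
files, not the executable scripts without such a suffix, and the function
name is made up.

```shell
#!/bin/sh
# Print all *.pl and *.pm files beneath a directory (default: "."), sorted.
list_perl_files() {
	find "${1:-.}" -type f \( -name '*.pl' -o -name '*.pm' \) -print | sort
}
```

Running "list_perl_files src" from the top of the tree gives a quick
preview of the Perl files a perltidy run would touch by name.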