Commit Graph

1575 Commits

Author SHA1 Message Date
Andrew Dunstan 99d5a3ffb9 Fix use of config-specific libraries for Windows OpenSSL
Commit 614350a3 allowed for different builds of OpenSSL libraries on
Windows, but ignored the fact that the alternative builds don't have
config-specific libraries. This patch fixes the Solution file to ask for
the correct libraries.

per offline discussions with Leonardo Cecchi and Marco Nenciarini.

Backpatch to all live branches.
2018-01-03 15:36:54 -05:00
Bruce Momjian 9d4649ca49 Update copyright for 2018
Backpatch-through: certain files through 9.3
2018-01-02 23:30:12 -05:00
Andres Freund 1804284042 Add parallel-aware hash joins.
Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel
Hash Join with Parallel Hash.  While hash joins could already appear in
parallel queries, they were previously always parallel-oblivious and had a
partial subplan only on the outer side, meaning that the work of the inner
subplan was duplicated in every worker.

After this commit, the planner will consider using a partial subplan on the
inner side too, using the Parallel Hash node to divide the work over the
available CPU cores and combine its results in shared memory.  If the join
needs to be split into multiple batches in order to respect work_mem, then
workers process different batches as much as possible and then work together
on the remaining batches.

The advantages of a parallel-aware hash join over a parallel-oblivious hash
join used in a parallel query are that it:

 * avoids wasting memory on duplicated hash tables
 * avoids wasting disk space on duplicated batch files
 * divides the work of building the hash table over the CPUs

One disadvantage is that there is some communication between the participating
CPUs which might outweigh the benefits of parallelism in the case of small
hash tables.  This is avoided by the planner's existing reluctance to supply
partial plans for small scans, but it may be necessary to estimate
synchronization costs in future if that situation changes.  Another is that
outer batch 0 must be written to disk if multiple batches are required.

A potential future advantage of parallel-aware hash joins is that right and
full outer joins could be supported, since there is a single set of matched
bits for each hashtable, but that is not yet implemented.

A new GUC enable_parallel_hash is defined to control the feature, defaulting
to on.
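
For illustration only (the table and column names here are hypothetical,
and the plan shape depends on table sizes and settings), the feature can
be observed like this:

SET enable_parallel_hash = on;
EXPLAIN (COSTS OFF)
SELECT count(*) FROM orders o JOIN lineitems l ON l.order_id = o.id;
-- A qualifying plan shows a Parallel Hash Join node with a Parallel Hash
-- node on its inner side.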

Author: Thomas Munro
Reviewed-By: Andres Freund, Robert Haas
Tested-By: Rafia Sabih, Prabhat Sahu
Discussion:
    https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
    https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com
2017-12-21 00:43:41 -08:00
Andres Freund ab9e0e718a Add shared tuplestores.
SharedTuplestore allows multiple participants to write into it and
then read the tuples back from it in parallel.  Each reader receives
partial results.

For now it always uses disk files, but other buffering policies and
other kinds of scans (ie each reader receives complete results) may be
useful in future.

The upcoming parallel hash join feature will use this facility.

Author: Thomas Munro
Reviewed-By: Peter Geoghegan, Andres Freund, Robert Haas
Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
2017-12-18 14:23:19 -08:00
Peter Eisentraut 632b03da31 Start a separate test suite for plpgsql
The plpgsql.sql test file in the main regression tests is now by far the
largest after numeric_big, making editing and managing the test cases
very cumbersome.  The other PLs have their own test suites split up into
smaller files by topic.  It would be nice to have that for plpgsql as
well.  So, to get that started, set up test infrastructure in
src/pl/plpgsql/src/ and split out the recently added procedure test
cases into a new file there.  That file now mirrors the test cases added
to the other PLs, making managing those matching tests a bit easier too.

msvc build system changes with help from Michael Paquier
2017-12-13 11:02:29 -05:00
Noah Misch 7e0c574ee2 MSVC 2012+: Permit linking to 32-bit, MinGW-built libraries.
Notably, this permits linking to the 32-bit Perl binaries advertised on
perl.org, namely Strawberry Perl and ActivePerl.  This has a side effect
of permitting linking to binaries built with obsolete MSVC versions.

By default, MSVC 2012 and later require a "safe exception handler table"
in each binary.  MinGW-built, 32-bit DLLs lack the relevant exception
handler metadata, so linking to them failed with error LNK2026.  Restore
the semantics of MSVC 2010, which omits the table from a given binary if
some linker input lacks metadata.  This has no effect on 64-bit builds
or on MSVC 2010 and earlier.  Back-patch to 9.3 (all supported
versions).

Reported by Victor Wagner.

Discussion: https://postgr.es/m/20160326154321.7754ab8f@wagner.wagner.home
2017-12-09 00:58:55 -08:00
Noah Misch 65a00f3035 MSVC: Test whether 32-bit Perl needs -D_USE_32BIT_TIME_T.
Commits 5a5c2feca3 and
b5178c5d08 introduced support for modern
MSVC-built, 32-bit Perl, but they broke use of MinGW-built, 32-bit Perl
distributions like Strawberry Perl and modern ActivePerl.  Perl has no
robust means to report whether it expects a -D_USE_32BIT_TIME_T ABI, so
test this.  Back-patch to 9.3 (all supported versions).

The chief alternative was a heuristic of adding -D_USE_32BIT_TIME_T when
$Config{gccversion} is nonempty.  That banks on every gcc-built Perl
using the same ABI.  gcc could change its default ABI the way MSVC once
did, and one could build Perl with gcc and the non-default ABI.

The GNU make build system could benefit from a similar test, without
which it does not support MSVC-built Perl.  For now, just add a comment.
Most users taking the special step of building Perl with MSVC probably
build PostgreSQL with MSVC.

Discussion: https://postgr.es/m/20171130041441.GA3161526@rfd.leadboat.com
2017-12-08 18:06:05 -08:00
Andres Freund dc6c4c9dc2 Add infrastructure for sharing temporary files between backends.
SharedFileSet allows temporary files to be created by one backend and
then exported for read-only access by other backends, with clean-up
managed by reference counting associated with a DSM segment.  This
includes changes to fd.c and buffile.c to support the new kind of
temporary file.

This will be used by an upcoming patch adding support for parallel
hash joins.

Author: Thomas Munro
Reviewed-By: Peter Geoghegan, Andres Freund, Robert Haas, Rushabh Lathia
Discussion:
    https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
    https://postgr.es/m/CAH2-WznJ_UgLux=_jTgCQ4yFz0iBntudsNKa1we3kN1BAG=88w@mail.gmail.com
2017-12-01 16:30:56 -08:00
Robert Haas eaedf0df71 Update typedefs.list and re-run pgindent
Discussion: http://postgr.es/m/CA+TgmoaA9=1RWKtBWpDaj+sF3Stgc8sHgf5z=KGtbjwPLQVDMA@mail.gmail.com
2017-11-29 09:24:24 -05:00
Magnus Hagander d5f965c257 Fix typo in comment
Andreas Karlsson
2017-11-27 09:24:14 +01:00
Tom Lane 6d4ae6a8e7 Update MSVC build process for new timezone data.
Missed this dependency in commits 7cce222c9 et al.
2017-11-25 18:15:22 -05:00
Noah Misch 84c4313c6f Support linking with MinGW-built Perl.
This is necessary for ActivePerl 5.18 onwards and for Strawberry Perl.
It is not sufficient for 32-bit builds with newer Visual Studio; these
fail with error LNK2026.  Back-patch to 9.3 (all supported versions).

Reported by Victor Wagner.

Discussion: https://postgr.es/m/20160326154321.7754ab8f@wagner.wagner.home
2017-11-23 20:22:04 -08:00
Andres Freund 7082e614c0 Provide DSM segment to ExecXXXInitializeWorker functions.
Previously, executor nodes running in parallel worker processes didn't
have access to the dsm_segment object used for parallel execution.  In
order to support resource management based on DSM segment lifetime,
they need that.  So create a ParallelWorkerContext object to hold it
and pass it to all InitializeWorker functions.

Author: Thomas Munro
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
2017-11-16 17:39:18 -08:00
Andrew Dunstan 98d54bb779 Back out the session_start and session_end hooks feature.
It's become apparent during testing that there are problems with at
least the testing regime. I don't think we should have it without a
working test regime, and the difficulties might indicate implementation
problems anyway, so I'm backing out the whole thing until that's sorted
out.

This reverts commits 7459484 9989f92 cd8ce3a
2017-11-16 11:35:02 -05:00
Andrew Dunstan 745948422c Disable installcheck tests for test_session_hooks
The module requires a preloaded library and the defect can't be cured by
a LOAD instruction in the test script. To achieve this we override the
installcheck target in the module's Makefile, and exclude the module in
vcregress.pl.

Along the way, revert commit 9989f92aab.
2017-11-15 17:49:04 -05:00
Noah Misch 9363b8b79b MSVC: Rebuild spiexceptions.h when out of date.
Also, add a warning to catch future instances of naming a nonexistent
file as a prerequisite.  Back-patch to 9.3 (all supported versions).
2017-11-12 18:43:32 -08:00
Robert Haas 1aba8e651a Add hash partitioning.
Hash partitioning is useful when you want to partition a growing data
set evenly.  This can be useful to keep table sizes reasonable, which
makes maintenance operations such as VACUUM faster, or to enable
partition-wise join.
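
As a hedged sketch (table and column names are hypothetical), a
hash-partitioned table is declared like this:

CREATE TABLE measurements (city_id int, logdate date, peaktemp int)
    PARTITION BY HASH (city_id);
CREATE TABLE measurements_p0 PARTITION OF measurements
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
-- ...and similarly for remainders 1 through 3 to complete the set.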

At present, we still depend on constraint exclusion for partition
pruning, and the shape of the partition constraints for hash
partitioning is such that that doesn't work.  Work is underway to fix
that, which should both improve performance and make partition
pruning work with hash partitioning.

Amul Sul, reviewed and tested by Dilip Kumar, Ashutosh Bapat, Yugo
Nagata, Rajkumar Raghuwanshi, Jesper Pedersen, and by me.  A few
final tweaks also by me.

Discussion: http://postgr.es/m/CAAJ_b96fhpJAP=ALbETmeLk1Uni_GFZD938zgenhF49qgDTjaQ@mail.gmail.com
2017-11-09 18:07:44 -05:00
Andrew Dunstan 74d2c0dbfd Improve gendef.pl diagnostic on failure to open sym file
There have been numerous buildfarm failures but the diagnostic is
currently silent about the reason for failure to open the file. Let's
see if we can get to the bottom of it.

Backpatch to all live branches.
2017-10-26 10:01:02 -04:00
Andres Freund 141fd1b66c Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
   more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
   entries - instead store them as a Datum array. This also allows us
   to get rid of having to build dummy tuples for negative & list
   entries, and of a hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so important entries are more
   likely to be on one cacheline.
4) Allowing the compiler to specialize critical SearchCatCache for a
   specific number of attributes allows it to unroll loops and avoid
   other nkeys-dependent initialization.
5) Only initializing the ScanKey when necessary, i.e. on catcache
   misses, greatly reduces unnecessary CPU cache misses.
6) Splitting the cache-miss case off from the hash lookup, reducing
   stack allocations etc. in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
   piece.

This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.

I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
  shows up in profiles. Unfortunately it's not easy to do so safely as
  an entry's memory location can change at various times, which
  doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
  but the win isn't big and the code for it is ugly, because the
  tuples have to be freed as well.
- add more proper functions, rather than macros for
  SearchSysCacheCopyN etc., but right now they don't show up in
  profiles.

The reason the macro wrappers for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside.  That might be a good idea
anyway, but it's for another day.

Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 14:22:41 -07:00
Tom Lane ef73a8162a Enforce our convention about max number of parallel regression tests.
We have a very old rule that parallel_schedule should have no more
than twenty tests in any one parallel group, so as to provide a
bound on the number of concurrently running processes needed to
pass the tests.  But people keep forgetting the rule, so let's add
a few lines of code to check it.

Discussion: https://postgr.es/m/a37e9c57-22d4-1b82-1270-4501cd2e984e@2ndquadrant.com
2017-10-07 17:20:09 -04:00
Andres Freund 15334ad19a Attempt to adapt windows build for 212e6f34d5.
Per buildfarm animal baiji.
2017-10-04 09:32:02 -07:00
Tom Lane 4736d74479 Adjust git_changelog for new-style release tags.
It wasn't on board with REL_n_n format.
2017-10-04 00:45:15 -04:00
Andrew Dunstan f2ab3898f3 Support building with Visual Studio 2017
Haribabu Kommi, reviewed by Takeshi Ideriha and Christian Ullrich

Backpatch to 9.6
2017-09-25 08:03:05 -04:00
Andres Freund fc49e24fa6 Make WAL segment size configurable at initdb time.
For performance reasons a larger segment size than the default 16MB
can be useful. A larger segment size has two main benefits: Firstly,
in setups using archiving, it makes it easier to write scripts that
can keep up with higher amounts of WAL, secondly, the WAL has to be
written and synced to disk less frequently.

But at the same time large segment sizes are disadvantageous for
smaller databases. So far the segment size had to be configured at
compile time, often making it unrealistic to choose one fitting a
particular load.  Therefore change it to an initdb-time setting.
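
The chosen size is visible at the SQL level; a minimal sketch (the output
shown is illustrative):

SHOW wal_segment_size;
-- 16MB on a default cluster; a cluster initialized with a larger
-- segment size reports that value instead.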

This includes a breaking change to the xlogreader.h API, which now
requires the current segment size to be configured.  For that and
similar reasons a number of binaries had to be taught how to recognize
the current segment size.

Author: Beena Emerson, editorialized by Andres Freund
Reviewed-By: Andres Freund, David Steele, Kuntal Ghosh, Michael
    Paquier, Peter Eisentraut, Robert Haas, Tushar Ahuja
Discussion: https://postgr.es/m/CAOG9ApEAcQ--1ieKbhFzXSQPw_YLmepaa4hNdnY5+ZULpt81Mw@mail.gmail.com
2017-09-19 22:03:48 -07:00
Andres Freund cc5f81366c Add support for coordinating record typmods among parallel workers.
Tuples can have type RECORDOID and a typmod number that identifies a blessed
TupleDesc in a backend-private cache.  To support the sharing of such tuples
through shared memory and temporary files, provide a typmod registry in
shared memory.

To achieve that, introduce per-session DSM segments, created on demand when a
backend first runs a parallel query.  The per-session DSM segment has a
table-of-contents just like the per-query DSM segment, and initially the
contents are a shared record typmod registry and a DSA area to provide the
space it needs to grow.

State relating to the current session is accessed via a Session object
reached through the global variable CurrentSession; this may require
significant redesign further down the road as we figure out what else
needs to be shared or remodelled.

Author: Thomas Munro
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
2017-09-14 19:59:21 -07:00
Tom Lane e451901804 Fix macro-redefinition warning on MSVC.
In commit 9d6b160d7, I tweaked pg_config.h.win32 to use
"#define HAVE_LONG_LONG_INT_64 1" rather than defining it as empty,
for consistency with what happens in an autoconf'd build.
But Solution.pm injects another definition of that macro into
ecpg_config.h, leading to justifiable (though harmless) compiler whining.
Make that one consistent too.  Back-patch, like the previous patch.

Discussion: https://postgr.es/m/CAEepm=1dWsXROuSbRg8PbKLh0S=8Ou-V8sr05DxmJOF5chBxqQ@mail.gmail.com
2017-09-03 11:01:08 -04:00
Andres Freund 8c0d7bafad Hash tables backed by DSA shared memory.
Add general purpose chaining hash tables for DSA memory.  Unlike
DynaHash in shared memory mode, these hash tables can grow as
required, and cope with being mapped into different addresses in
different backends.

There is a wide range of potential users for such a hash table, though
it's very likely the interface will need to evolve as we come to
understand the needs of different kinds of users.  E.g. support for
iterators and incremental resizing is planned for later commits and
the details of the callback signatures are likely to change.

Author: Thomas Munro
Reviewed-By: John Gorman, Andres Freund, Dilip Kumar, Robert Haas
Discussion:
	https://postgr.es/m/CAEepm=3d8o8XdVwYT6O=bHKsKAM2pu2D6sV1S_=4d+jStVCE7w@mail.gmail.com
	https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
2017-08-22 22:43:07 -07:00
Robert Haas 79ccd7cbd5 pg_prewarm: Add automatic prewarm feature.
Periodically while the server is running, and at shutdown, write out a
list of blocks in shared buffers.  When the server reaches consistency
-- unfortunately, we can't do it before that point without breaking
things -- reload those blocks into any still-unused shared buffers.
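
A hedged usage sketch (the configuration and function names are those of
the feature as released; treat them as assumptions here):

-- postgresql.conf:
--   shared_preload_libraries = 'pg_prewarm'
--   pg_prewarm.autoprewarm = on
CREATE EXTENSION pg_prewarm;
SELECT autoprewarm_dump_now();  -- write the block list immediately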

Mithun Cy and Robert Haas, reviewed and tested by Beena Emerson,
Amit Kapila, Jim Nasby, and Rafia Sabih.

Discussion: http://postgr.es/m/CAD__OugubOs1Vy7kgF6xTjmEqTR4CrGAv8w+ZbaY_+MZeitukw@mail.gmail.com
2017-08-21 14:17:39 -04:00
Tom Lane b5178c5d08 Further tweaks to compiler flags for PL/Perl on Windows.
It now emerges that we can only rely on Perl to tell us we must use
-D_USE_32BIT_TIME_T if it's Perl 5.13.4 or later.  For older versions,
revert to our previous practice of assuming we need that symbol in
all 32-bit Windows builds.  This is not ideal, but inquiring into
which compiler version Perl was built with seems far too fragile.
In any case, we had not previously had complaints about these old
Perl versions, so let's assume this is Good Enough.  (It's still
better than the situation ante commit 5a5c2feca, in that at least
the effects are confined to PL/Perl rather than the whole PG build.)

Back-patch to all supported versions, like 5a5c2feca and predecessors.

Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com
2017-08-17 13:13:47 -04:00
Tom Lane 9f14dc393b Stamp HEAD as 11devel.
Note that we no longer require any manual adjustments to shared-library
minor version numbers, cf commit a3bce17ef.  So this should be everything.
2017-08-14 18:08:30 -04:00
Tom Lane 21d304dfed Final pgindent + perltidy run for v10. 2017-08-14 17:29:33 -04:00
Tom Lane 5a5c2feca3 Absorb -D_USE_32BIT_TIME_T switch from Perl, if relevant.
Commit 3c163a7fc's original choice to ignore all #define symbols whose
names begin with underscore turns out to be too simplistic.  On Windows,
some Perl installations are built with -D_USE_32BIT_TIME_T, and we must
absorb that or we get the wrong result for sizeof(PerlInterpreter).

This effectively re-reverts commit ef58b87df, which injected that symbol
in a hacky way, making it apply to all of Postgres not just PL/Perl.
More significantly, it did so on *all* 32-bit Windows builds, even when
the Perl build to be used did not select this option; so that it fails
to work properly with some newer Perl builds.

By making this change, we would be introducing an ABI break in 32-bit
Windows builds; but fortunately we have not used type time_t in any
exported Postgres APIs in a long time.  So it should be OK, both for
PL/Perl itself and for third-party extensions, if an extension library
is built with a different _USE_32BIT_TIME_T setting than the core code.

Patch by me, based on research by Ashutosh Sharma and Robert Haas.
Back-patch to all supported branches, as commit 3c163a7fc was.

Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com
2017-08-14 11:48:59 -04:00
Tom Lane 655727d93b Update RELEASE_CHANGES' example of branch name format.
We're planning to put an underscore before the major version number in
branch names for v10 and later.  Make sure the recipe in RELEASE_CHANGES
reflects that.

In passing, add a reminder to consider doing pgindent right before
the branch.

Discussion: https://postgr.es/m/E1dAkjZ-0003MG-0U@gemulon.postgresql.org
2017-08-06 23:26:09 -04:00
Tom Lane 3c163a7fc7 PL/Perl portability fix: absorb relevant -D switches from Perl.
The Perl documentation is very clear that stuff calling libperl should
be built with the compiler switches shown by Perl's $Config{ccflags}.
We'd been ignoring that up to now, and mostly getting away with it,
but recent Perl versions contain ABI compatibility cross-checks that
fail on some builds because of this omission.  In particular the
sizeof(PerlInterpreter) can come out different due to some fields being
added or removed; which means we have a live ABI hazard that we'd better
fix rather than continuing to sweep it under the rug.

However, it still seems like a bad idea to just absorb $Config{ccflags}
verbatim.  In some environments Perl was built with a different compiler
that doesn't even use the same switch syntax.  -D switch syntax is pretty
universal though, and absorbing Perl's -D switches really ought to be
enough to fix the problem.

Furthermore, Perl likes to inject stuff like -D_LARGEFILE_SOURCE and
-D_FILE_OFFSET_BITS=64 into $Config{ccflags}, which affect libc ABIs on
platforms where they're relevant.  Adopting those seems dangerous too.
It's unclear whether a build wherein Perl and Postgres have different ideas
of sizeof(off_t) etc would work, or whether anyone would care about making
it work.  But it's dead certain that having different stdio ABIs in
core Postgres and PL/Perl will not work; we've seen that movie before.
Therefore, let's also ignore -D switches for symbols beginning with
underscore.  The symbols that we actually need to import should be the ones
mentioned in perl.h's PL_bincompat_options stanza, and none of those start
with underscore, so this seems likely to work.  (If it turns out not to
work everywhere, we could consider intersecting the symbols mentioned in
PL_bincompat_options with the -D switches.  But that will be much more
complicated, so let's try this way first.)

This will need to be back-patched, but first let's see what the
buildfarm makes of it.

Ashutosh Sharma, some adjustments by me

Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com
2017-07-28 14:25:28 -04:00
Noah Misch bbbd9121e6 MSVC: Finish clean.bat build artifact coverage.
With this, "git clean -dnx" is clear after a "clean dist" following a
build.  Preserve sql_help.h in non-dist cleans, like the Makefile does.
2017-07-24 00:13:23 -07:00
Noah Misch 71ad8000da MSVC: Accept tcl86.lib in addition to tcl86t.lib.
ActiveTcl8.6.4.1.299124-win32-x86_64-threaded.exe ships just tcl86.lib.
Back-patch to 9.2, like the commit recognizing tcl86t.lib.
2017-07-23 23:53:27 -07:00
Dean Rasheed d363d42bb9 Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition bounds.
Previously, UNBOUNDED meant no lower bound when used in the FROM list,
and no upper bound when used in the TO list, which was OK for
single-column range partitioning, but problematic with multiple
columns. For example, an upper bound of (10.0, UNBOUNDED) would not be
collocated with a lower bound of (10.0, UNBOUNDED), thus making it
difficult or impossible to define contiguous multi-column range
partitions in some cases.

Fix this by using MINVALUE and MAXVALUE instead of UNBOUNDED to
represent a partition column that is unbounded below or above
respectively. This syntax removes any ambiguity, and ensures that if
one partition's lower bound equals another partition's upper bound,
then the partitions are contiguous.
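
A minimal sketch of the new syntax (table and column names are
hypothetical):

CREATE TABLE readings (a numeric, b numeric) PARTITION BY RANGE (a, b);
CREATE TABLE readings_low PARTITION OF readings
    FOR VALUES FROM (MINVALUE, MINVALUE) TO (10.0, MINVALUE);
CREATE TABLE readings_high PARTITION OF readings
    FOR VALUES FROM (10.0, MINVALUE) TO (MAXVALUE, MAXVALUE);
-- The shared bound (10.0, MINVALUE) makes the two partitions contiguous.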

Also drop the constraint prohibiting finite values after an unbounded
column, and just document the fact that any values after MINVALUE or
MAXVALUE are ignored. Previously it was necessary to repeat UNBOUNDED
multiple times, which was needlessly verbose.

Note: Forces a post-PG 10 beta2 initdb.

Report by Amul Sul, original patch by Amit Langote with some
additional hacking by me.

Discussion: https://postgr.es/m/CAAJ_b947mowpLdxL3jo3YLKngRjrq9+Ej4ymduQTfYR+8=YAYQ@mail.gmail.com
2017-07-21 09:20:47 +01:00
Noah Misch 2f7f45a64b MSVC: Don't link libpgcommon into pgcrypto.
Doing so was useful in 273c458a2b but
became obsolete when 818fd4a67d caused
postgres.exe to provide the relevant symbols.  No other loadable module
links to libpgcommon directly.
2017-07-16 23:13:58 -07:00
Andrew Dunstan deb0129a22 fix typo 2017-07-16 12:01:13 -04:00
Andrew Dunstan fd2487e49f Fix vcregress.pl PROVE_FLAGS bug in commit 93b7d9731f
This change didn't adjust the publicly visible taptest function, causing
buildfarm failures on bowerbird.

Backpatch to 9.4 like previous change.
2017-07-16 11:24:29 -04:00
Tom Lane c95275fc20 Fix broken link-command-line ordering for libpgfeutils.
In the frontend Makefiles that pull in libpgfeutils, we'd generally
done it like this:

LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)

That method is badly broken, as seen in bug #14742 from Chris Ruprecht.
The -L flag for src/fe_utils ends up being placed after whatever random
-L flags are in LDFLAGS already.  That puts us at risk of pulling in
libpgfeutils.a from some previous installation rather than the freshly
built one in src/fe_utils.  Also, the lack of an "override" is hazardous
if someone tries to specify some LDFLAGS on the make command line.

The correct way to do it is like this:

override LDFLAGS := -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(LDFLAGS)

so that libpgfeutils, along with libpq, libpgport, and libpgcommon, are
guaranteed to be pulled in from the build tree and not from any referenced
system directory, because their -L flags will appear first.

In some places we'd been even lazier and done it like this:

LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils -lpq

which is subtly wrong in an additional way: on platforms where we can't
restrict the symbols exported by libpq.so, it allows libpgfeutils to
latch onto libpgport and libpgcommon symbols from libpq.so, rather than
directly from those static libraries as intended.  This carries hazards
like those explained in the comments for the libpq_pgport macro.

In addition to fixing the broken libpgfeutils usages, I tried to
standardize on using $(libpq_pgport) like so:

override LDFLAGS := $(libpq_pgport) $(LDFLAGS)

even where libpgfeutils is not in the picture.  This makes no difference
right now but will hopefully discourage future mistakes of the same ilk.
And it's more like the way we handle CPPFLAGS in libpq-using Makefiles.

In passing, just for consistency, make pgbench include PTHREAD_LIBS the
same way everyplace else does, ie just after LIBS rather than in some
random place in the command line.  This might have practical effect if
there are -L switches in that macro on some platform.

It looks to me like the MSVC build scripts are not affected by this
error, but someone more familiar with them than I might want to double
check.

Back-patch to 9.6 where libpgfeutils was introduced.  In 9.6, the hazard
this error creates is that a reinstallation might link to the prior
installation's copy of libpgfeutils.a and thereby fail to absorb a
minor-version bug fix.

Discussion: https://postgr.es/m/20170714125106.9231.13772@wrigleys.postgresql.org
2017-07-14 12:26:53 -04:00
Noah Misch 3381898f98 MSVC: Repair libpq.rc generator.
It generates an empty file, so libpq.dll advertises no version
information.  Commit facde2a98f
mistranslated "print O;" in this one place.
2017-07-09 00:43:17 -07:00
Tom Lane 1ae8536545 Ooops, WIN32 code in pg_ctl.c still needs PQExpBuffer.
Per buildfarm.
2017-06-28 18:00:16 -04:00
Tom Lane f13ea95f9e Change pg_ctl to detect server-ready by watching status in postmaster.pid.
Traditionally, "pg_ctl start -w" has waited for the server to become
ready to accept connections by attempting a connection once per second.
That has the major problem that connection issues (for instance, a
kernel packet filter blocking traffic) can't be reliably told apart
from server startup issues, and the minor problem that if server startup
isn't quick, we accumulate "the database system is starting up" spam
in the server log.  We've hacked around many of the possible connection
issues, but it resulted in ugly and complicated code in pg_ctl.c.

In commit c61559ec3, I changed the probe rate to every tenth of a second.
That prompted Jeff Janes to complain that the log-spam problem had become
much worse.  In the ensuing discussion, Andres Freund pointed out that
we could dispense with connection attempts altogether if the postmaster
were changed to report its status in postmaster.pid, which "pg_ctl start"
already relies on being able to read.  This patch implements that, teaching
postmaster.c to report a status string into the pidfile at the same
state-change points already identified as being of interest for systemd
status reporting (cf commit 7d17e683f).  pg_ctl no longer needs to link
with libpq at all; all its functions now depend on reading server files.

In support of this, teach AddToDataDirLockFile() to allow addition of
postmaster.pid lines in not-necessarily-sequential order.  This is needed
on Windows where the SHMEM_KEY line will never be written at all.  We still
have the restriction that we don't want to truncate the pidfile; document
the reasons for that a bit better.

Also, fix the pg_ctl TAP tests so they'll notice if "start -w" mode
is broken --- before, they'd just wait out the sixty seconds until
the loop gives up, and then report success anyway.  (Yes, I found that
out the hard way.)

While at it, arrange for pg_ctl to not need to #include miscadmin.h;
as a rather low-level backend header, requiring that to be compilable
client-side is pretty dubious.  This requires moving the #define's
associated with the pidfile into a new header file, and moving
PG_BACKEND_VERSIONSTR someplace else.  For lack of a clearly better
"someplace else", I put it into port.h, beside the declaration of
find_other_exec(), since most users of that macro are passing the value to
find_other_exec().  (initdb still depends on miscadmin.h, but at least
pg_ctl and pg_upgrade no longer do.)

In passing, fix main.c so that PG_BACKEND_VERSIONSTR actually defines the
output of "postgres -V", which remarkably it had never done before.

Discussion: https://postgr.es/m/CAMkU=1xJW8e+CTotojOMBd-yzUvD0e_JZu2xHo=MnuZ4__m7Pg@mail.gmail.com
2017-06-28 17:31:32 -04:00
Tom Lane 81f056c725 Remove entab and associated detritus.
We don't need this anymore, because pg_bsd_indent has been taught to
follow the same tab-vs-space rules that entab used to enforce.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:46:39 -04:00
Tom Lane 382ceffdf7 Phase 3 of pgindent updates.
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.

By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis.  However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent.  That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.

This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:35:54 -04:00
Tom Lane c7b8998ebb Phase 2 of pgindent updates.
Change pg_bsd_indent to follow upstream rules for placement of comments
to the right of code, and remove pgindent hack that caused comments
following #endif to not obey the general rule.

Commit e3860ffa4d wasn't actually using
the published version of pg_bsd_indent, but a hacked-up version that
tried to minimize the amount of movement of comments to the right of
code.  The situation of interest is where such a comment has to be
moved to the right of its default placement at column 33 because there's
code there.  BSD indent has always moved right in units of tab stops
in such cases --- but in the previous incarnation, indent was working
in 8-space tab stops, while now it knows we use 4-space tabs.  So the
net result is that in about half the cases, such comments are placed
one tab stop left of before.  This is better all around: it leaves
more room on the line for comment text, and it means that in such
cases the comment uniformly starts at the next 4-space tab stop after
the code, rather than sometimes one and sometimes two tabs after.

Also, ensure that comments following #endif are indented the same
as comments following other preprocessor commands such as #else.
That inconsistency turns out to have been self-inflicted damage
from a poorly-thought-through post-indent "fixup" in pgindent.

This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 15:19:25 -04:00
Tom Lane e3860ffa4d Initial pgindent run with pg_bsd_indent version 2.0.
The new indent version includes numerous fixes thanks to Piotr Stefaniak.
The main changes visible in this commit are:

* Nicer formatting of function-pointer declarations.
* No longer unexpectedly removes spaces in expressions using casts,
  sizeof, or offsetof.
* No longer wants to add a space in "struct structname *varname", as
  well as some similar cases for const- or volatile-qualified pointers.
* Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely.
* Fixes bug where comments following declarations were sometimes placed
  with no space separating them from the code.
* Fixes some odd decisions for comments following case labels.
* Fixes some cases where comments following code were indented to less
  than the expected column 33.

On the less good side, it now tends to put more whitespace around typedef
names that are not listed in typedefs.list.  This might encourage us to
put more effort into typedef name collection; it's not really a bug in
indent itself.

There are more changes coming after this round, having to do with comment
indentation and alignment of lines appearing within parentheses.  I wanted
to limit the size of the diffs to something that could be reviewed without
one's eyes completely glazing over, so it seemed better to split up the
changes as much as practical.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 14:39:04 -04:00
Tom Lane 8ff6d4ec78 Adjust pgindent script to use pg_bsd_indent 2.0.
Update version-checking code and list of switches.  Delete obsolete
quasi-support for using GNU indent.  Remove a lot of no-longer-needed
workarounds for bugs of the old version, and improve comments for
the hacks that remain.  Update run_build() subroutine to fetch the
pg_bsd_indent code from the newly established git repo for it.

In passing, fix pgindent to not overwrite files that require no changes;
this makes it a bit more friendly to run on a built tree.

Adjust relevant documentation.

Remove indent.bsd.patch; it's not relevant anymore (and was obsolete
long ago anyway).  Likewise remove pgcppindent, since we're no longer
in the business of shipping C++ code.

Piotr Stefaniak is responsible for most of the algorithmic changes
to the pgindent script; I did the rest.

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 14:26:21 -04:00
Tom Lane 9ef2dbefc7 Final pgindent run with old pg_bsd_indent (version 1.3).
This is just to have a clean basis for comparison with the results of
the new version (which will indeed end up reverting some of these
changes...)

Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
2017-06-21 14:09:24 -04:00
Tom Lane cea258b63d Teach pgindent to skip files generated by bison or flex automatically.
If a .c or .h file corresponds to a .y or .l file, skip indenting it.
There's no point in reindenting derived files, and these files tend to
confuse pgindent.  (Which probably indicates a bug in BSD indent, but
I can't get excited about trying to fix it.)

For the same reasons, add src/backend/utils/fmgrtab.c to the set of
files excluded by src/tools/pgindent/exclude_file_patterns.

The point of doing this is that it makes it safe to run pgindent over
the tree without doing "make maintainer-clean" first.  While these are
not the only derived .c/.h files in the tree, they are the only ones
pgindent fails on.  Removing that prerequisite step results in one less
way to mess up a pgindent run, and it's necessary if we ever hope to get
to the ease of running pgindent via "make indent".
2017-06-16 23:14:40 -04:00
Peter Eisentraut ae1aa28eb6 Use correct ICU path for Windows 32 vs. 64 bit
Author: Ashutosh Sharma <ashu.coek88@gmail.com>
2017-06-13 09:13:32 -04:00
Peter Eisentraut 03c396080d Add MSVC build system support for ICU
Author: Ashutosh Sharma <ashu.coek88@gmail.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2017-06-12 11:05:20 -04:00
Andrew Dunstan 93b7d9731f Take PROVE_FLAGS from the command line but not the environment
This reverts commit 56b6ef893f and instead
makes vcregress.pl parse out PROVE_FLAGS from a command line argument
when doing a TAP test, thus making it consistent with the makefile
treatment.

Discussion: https://postgr.es/m/c26a7416-2fb9-34ab-7991-618c922f896e%402ndquadrant.com

Backpatch to 9.4 like previous patch.
2017-06-10 10:19:06 -04:00
Andrew Dunstan 2e02136fe6 Fix thinko in previous openssl change 2017-06-05 20:38:46 -04:00
Andrew Dunstan 614350a3ab Find openssl lib files in right directory for MSVC
Some openssl builds put their lib files in a VC subdirectory, others do
not. Cater for both cases.

Backpatch to all live branches.

From an offline discussion with Leonardo Cecchi.
2017-06-05 14:24:42 -04:00
Magnus Hagander f61bd73993 Update URLs in pgindent source and README
Website and buildfarm are https, not http, and the ftp protocol will be
shut down shortly.
2017-05-23 13:58:11 -04:00
Bruce Momjian ce55481032 Post-PG 10 beta1 pgperltidy run 2017-05-17 19:01:23 -04:00
Bruce Momjian a6fd7b7a5f Post-PG 10 beta1 pgindent run
perltidy run not included.
2017-05-17 16:31:56 -04:00
Bruce Momjian 8a94332478 Update typedefs list in prep. for post-PG10 beta1 pgindent run 2017-05-17 15:52:16 -04:00
Bruce Momjian df238b43d7 Add download URL for perltidy version v20090616 2017-05-17 15:29:37 -04:00
Tom Lane e3f67a5a17 Update oidjoins regression test for v10. 2017-05-15 14:04:11 -04:00
Andrew Dunstan 56b6ef893f Honor PROVE_FLAGS environment setting
On MSVC builds and on back branches that means removing the hardcoded
--verbose setting. On master for Unix that means removing the empty
setting in the global Makefile so that the value can be acquired from
the environment as well as from the make arguments.

Backpatch to 9.4 where we introduced TAP tests
2017-05-12 11:11:49 -04:00
Andrew Dunstan b757e01f62 Add libxml2 include path for MSVC builds
On Unix this path is detected via the use of xml2-config, but that's not
available on Windows. This means that users building with libxml2 will
no longer need to move things around from the standard libxml2
installation for MSVC builds.

Backpatch to all live branches.
2017-05-12 10:21:13 -04:00
Bruce Momjian c4c493fd35 pgindent: use HTTP instead of FTP to retrieve pg_bsd_indent src
FTP support will be removed from ftp.postgresql.org in months, but http
still works.  Typedefs already used http.
2017-05-09 09:28:44 -04:00
Tom Lane d4e59c5521 Install the "posixrules" timezone link in MSVC builds.
Somehow, we'd missed ever doing this.  The consequences aren't too
severe: basically, the timezone library would fall back on its hardwired
notion of the DST transition dates to use for a POSIX-style zone name,
rather than obeying US/Eastern which is the intended behavior.  The net
effect would only be to obey current US DST law further back than it
ought to apply; so it's not real surprising that nobody noticed.

David Rowley, per report from Amit Kapila

Discussion: https://postgr.es/m/CAA4eK1LC7CaNhRAQ__C3ht1JVrPzaAXXhEJRnR5L6bfYHiLmWw@mail.gmail.com
2017-05-07 11:57:41 -04:00
Alvaro Herrera 14722c69f9 Allow MSVC to build with Tcl 8.6.
Commit eaba54c20c added support for Tcl 8.6 for configure-supported
platforms after verifying that pltcl works without further changes, but
the MSVC tooling wasn't updated accordingly.  Update MSVC to match,
restructuring the code to avoid duplicating the logic for every Tcl
version supported.

Backpatch to all live branches, like eaba54c20c.  In 9.4 and previous,
change the patch to use backslashes rather than forward, as in the rest
of the file.

Reported by Paresh More, who also tested the patch I provided.
Discussion: https://postgr.es/m/CAAgiCNGVw3ssBtSi3ZNstrz5k00ax=UV+_ZEHUeW_LMSGL2sew@mail.gmail.com
2017-05-05 12:38:29 -03:00
Magnus Hagander 28d1c8ccc8 Build pgoutput.dll in MSVC build
Without this, logical replication obviously does not work on Windows

MauMau, with clean.bat additions from me per note from Michael Paquier
2017-05-05 12:08:48 +02:00
Andrew Dunstan 9a0d2008c3 Fix perl thinko in commit fed6df486d
Report and fix from Vaishnavi Prabakaran

Backpatch to 9.4 like original.
2017-05-02 08:20:11 -04:00
Andrew Dunstan fed6df486d Allow vcregress.pl to run an arbitrary TAP test set
Currently, provision is made only for running the bin checks in a single
step. Now these tests can be run individually, as well as tests in other
locations (e.g. src/test/recovery).

Also provide for suppressing unnecessary temp installs by setting the
NO_TEMP_INSTALL environment variable just as the Makefiles do.

Backpatch to 9.4.
2017-05-01 10:57:51 -04:00
Bruce Momjian 06fc54cd43 docs: update major release instructions 2017-04-13 10:19:12 -04:00
Bruce Momjian e1c86a5576 git_changelog: improve comment 2017-04-13 09:13:43 -04:00
Bruce Momjian 854854019a git_changelog: improve instructions for finding branch commits
Specifically, use '--summary' with 'git show'.
2017-04-12 15:40:37 -04:00
Tom Lane 587d62d856 Remove bogus redefinition of _MSC_VER.
Commit a4777f355 was a shade too mechanical: we don't want to override
MSVC's own definition of _MSC_VER, as that breaks tests on its numerical
value.  Per buildfarm.
2017-04-11 15:32:33 -04:00
Magnus Hagander a4777f3556 Remove symbol WIN32_ONLY_COMPILER
This used to mean "Visual C++ except in those parts where Borland C++
was supported where it meant one of those". Now that we don't support
Borland C++ anymore, simplify by using _MSC_VER which is the normal way
to detect Visual C++.
2017-04-11 15:22:21 +02:00
Heikki Linnakangas 60f11b87a2 Use SASLprep to normalize passwords for SCRAM authentication.
An important step of SASLprep normalization is to convert the string to
Unicode normalization form NFKC. Unicode normalization requires a fairly
large table of character decompositions, which is generated from data
published by the Unicode consortium. The script to generate the table is
put in src/common/unicode, as well as test code for the normalization.
A pre-generated version of the tables is included in src/include/common,
so you don't need the code in src/common/unicode to build PostgreSQL, only
if you wish to modify the normalization tables.

The SASLprep implementation depends on the UTF-8 functions from
src/backend/utils/mb/wchar.c. So to use it, you must also compile and link
that. That doesn't change anything for the current users of these
functions, the backend and libpq, as they both already link with wchar.o.
It would be good to move those functions into a separate file in
src/common, but I'll leave that for another day.

No documentation changes included, because there are no details on the
SCRAM mechanism in the docs anyway. An overview on that in the protocol
specification would probably be good, even though SCRAM is documented in
detail in RFC5802. I'll write that as a separate patch. An important thing
to mention there is that we apply SASLprep even on invalid UTF-8 strings,
to support other encodings.

Patch by Michael Paquier and me.

Discussion: https://www.postgresql.org/message-id/CAB7nPqSByyEmAVLtEf1KxTRh=PWNKiWKEKQR=e1yGehz=wbymQ@mail.gmail.com
2017-04-07 14:56:05 +03:00
Tom Lane df1a699e5b Fix integer-overflow problems in interval comparison.
When using integer timestamps, the interval-comparison functions tried
to compute the overall magnitude of an interval as an int64 number of
microseconds.  As reported by Frazer McLean, this overflows for intervals
exceeding about 296000 years, which is bad since we nominally allow
intervals many times larger than that.  That results in wrong comparison
results, and possibly in corrupted btree indexes for columns containing
such large interval values.
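
For example (values chosen only to illustrate the overflow region):

SELECT interval '300000 years' > interval '1 day';
-- Expected result: true.  With the overflow, the comparison could
-- report false instead.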

To fix, compute the magnitude as int128 instead.  Although some compilers
have native support for int128 calculations, many don't, so create our
own support functions that can do 128-bit addition and multiplication
if the compiler support isn't there.  These support functions are designed
with an eye to allowing the int128 code paths in numeric.c to be rewritten
for use on all platforms, although this patch doesn't do that, or even
provide all the int128 primitives that will be needed for it.

Back-patch as far as 9.4.  Earlier releases did not guard against overflow
of interval values at all (commit 146604ec4 fixed that), so it seems not
very exciting to worry about overly-large intervals for them.

Before 9.6, we did not assume that unreferenced "static inline" functions
would not draw compiler warnings, so omit functions not directly referenced
by timestamp.c, the only present consumer of int128.h.  (We could have
omitted these functions in HEAD too, but since they were written and
debugged on the way to the present patch, and they look likely to be needed
by numeric.c, let's keep them in HEAD.)  I did not bother to try to prevent
such warnings in a --disable-integer-datetimes build, though.

Before 9.5, configure will never define HAVE_INT128, so the part of
int128.h that exploits a native int128 implementation is dead code in the
9.4 branch.  I didn't bother to remove it, thinking that keeping the file
looking similar in different branches is more useful.

In HEAD only, add a simple test harness for int128.h in src/tools/.

In back branches, this does not change the float-timestamps code path.
That's not subject to the same kind of overflow risk, since it computes
the interval magnitude as float8.  (No doubt, when this code was originally
written, overflow was disregarded for exactly that reason.)  There is a
precision hazard instead :-(, but we'll avert our eyes from that question,
since no complaints have been reported and that code's deprecated anyway.

Kyotaro Horiguchi and Tom Lane

Discussion: https://postgr.es/m/1490104629.422698.918452336.26FA96B7@webmail.messagingengine.com
2017-04-05 23:51:27 -04:00
Tom Lane cd6baed781 Remove reinvention of stringify macro.
We already have CppAsString2, there's no need for the MSVC support to
re-invent a macro to do that (and especially not to inject it in as
ugly a way as this).

Discussion: https://postgr.es/m/CADkLM=c+hm2rc0tkKgC-ZgrLttHT2KkfppE+BC-=i-xj+7V-TQ@mail.gmail.com
2017-04-02 19:19:16 -04:00
Peter Eisentraut 4d33a7f2e7 Fix Perl code which had broken the Windows build
The previous change wanted to avoid modifying $_ in grep, but the code
just made the change in a local variable and then lost it.  Rewrite the
code using a separate map and grep, which is clearer anyway.

Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
2017-03-28 09:00:59 -04:00
Peter Eisentraut facde2a98f Clean up Perl code according to perlcritic
Fix all perlcritic warnings of severity level 5, except in
src/backend/utils/Gen_dummy_probes.pl, which is automatically generated.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
2017-03-27 08:18:22 -04:00
Andres Freund b8d7f053c5 Faster expression evaluation and targetlist projection.
This replaces the old, recursive tree-walk based evaluation, with
non-recursive, opcode dispatch based, expression evaluation.
Projection is now implemented as part of expression evaluation.

This both leads to significant performance improvements, and makes
future just-in-time compilation of expressions easier.

The speed gains primarily come from:
- non-recursive implementation reduces stack usage / overhead
- simple sub-expressions are implemented with a single jump, without
  function calls
- sharing some state between different sub-expressions
- reduced amount of indirect/hard to predict memory accesses by laying
  out operation metadata sequentially; including the avoidance of
  nearly all of the previously used linked lists
- more code has been moved to expression initialization, avoiding
  constant re-checks at evaluation time

Future just-in-time compilation (JIT) has become easier, as
demonstrated by released patches intended to be merged in a later
release, for primarily two reasons: Firstly, due to a stricter split
between expression initialization and evaluation, less code has to be
handled by the JIT. Secondly, due to the non-recursive nature of the
generated "instructions", less performance-critical code-paths can
easily be shared between interpreted and compiled evaluation.

The new framework allows for significant future optimizations. E.g.:
- basic infrastructure to later reduce the per executor-startup
  overhead of expression evaluation, by caching state in prepared
  statements.  That'd be helpful in OLTPish scenarios where
  initialization overhead is measurable.
- optimizing the generated "code". A number of proposals for potential
  work has already been made.
- optimizing the interpreter. Similarly a number of proposals have
  been made here too.

The move of logic into the expression initialization step leads to some
backward-incompatible changes:
- Function permission checks are now done during expression
  initialization, whereas previously they were done during
  execution. In edge cases this can lead to errors being raised that
  previously wouldn't have been, e.g. a NULL array being coerced to a
  different array type previously didn't perform checks.
- The set of domain constraints to be checked is now evaluated once
  during expression initialization, previously it was re-built
  every time a domain check was evaluated. For normal queries this
  doesn't change much, but e.g. for plpgsql functions, which caches
  ExprStates, the old set could stick around longer.  The behavior
  around this might still change.

Author: Andres Freund, with significant changes by Tom Lane,
	changes by Heikki Linnakangas
Reviewed-By: Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/20161206034955.bh33paeralxbtluv@alap3.anarazel.de
2017-03-25 14:52:06 -07:00
Peter Eisentraut 50c956add8 Remove createlang and droplang
They have been deprecated since PostgreSQL 9.1.
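
The SQL-level equivalents remain available, e.g.:

CREATE EXTENSION plperl;   -- replaces "createlang plperl"
DROP EXTENSION plperl;     -- replaces "droplang plperl"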

Reviewed-by: Magnus Hagander <magnus@hagander.net>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
2017-03-23 14:16:45 -04:00
Tom Lane 9c2635e26f Fix hard-coded relkind constants in assorted other files.
Although it's reasonable to expect that most of these constants will
never change, that does not make it good programming style to hard-code
the value rather than using the RELKIND_FOO macros.

I think I've now gotten all the hard-coded references in C code.
Unfortunately there's no equally convenient way to parameterize
SQL files ...

Discussion: https://postgr.es/m/11145.1488931324@sss.pgh.pa.us
2017-03-09 23:36:52 -05:00
Andres Freund 3717dc149e Add amcheck extension to contrib.
This is the beginning of a collection of SQL-callable functions to
verify the integrity of data files.  For now it only contains code to
verify B-Tree indexes.

This adds two SQL-callable functions, validating B-Tree consistency to
a varying degree.  Check the extensive docs for details.
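
A minimal sketch of invoking the checks (the index name is hypothetical):

CREATE EXTENSION amcheck;
SELECT bt_index_check('some_btree_index'::regclass);
SELECT bt_index_parent_check('some_btree_index'::regclass);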

The goal is to later extend the coverage of the module to further
access methods, possibly including the heap.  Once checks for
additional access methods exist, we'll likely add some "dispatch"
functions that cover multiple access methods.

Author: Peter Geoghegan, editorialized by Andres Freund
Reviewed-By: Andres Freund, Tomas Vondra, Thomas Munro,
   Anastasia Lubennikova, Robert Haas, Amit Langote
Discussion: CAM3SWZQzLMhMwmBqjzK+pRKXrNUZ4w90wYMUWfkeV8mZ3Debvw@mail.gmail.com
2017-03-09 16:33:02 -08:00
Robert Haas 355d3993c5 Add a Gather Merge executor node.
Like Gather, we spawn multiple workers and run the same plan in each
one; however, Gather Merge is used when each worker produces the same
output ordering and we want to preserve that output ordering while
merging together the streams of tuples from various workers.  (In a
way, Gather Merge is like a hybrid of Gather and MergeAppend.)
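
A hedged illustration (table and column names are hypothetical, and the
planner only chooses this shape when it looks cheapest):

EXPLAIN (COSTS OFF)
SELECT * FROM events ORDER BY created_at;
-- A qualifying plan places a Gather Merge node above workers that each
-- produce their share of the rows in the requested order.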

This works out to a win if it saves us from having to perform an
expensive Sort.  In cases where only a small amount of data would need
to be sorted, it may actually be faster to use a regular Gather node
and then sort the results afterward, because Gather Merge sometimes
needs to wait synchronously for tuples whereas a pure Gather generally
doesn't.  But if this avoids an expensive sort then it's a win.

Rushabh Lathia, reviewed and tested by Amit Kapila, Thomas Munro,
and Neha Sharma, and reviewed and revised by me.

Discussion: http://postgr.es/m/CAGPqQf09oPX-cQRpBKS0Gq49Z+m6KBxgxd_p9gX8CKk_d75HoQ@mail.gmail.com
2017-03-09 07:49:29 -05:00
Tom Lane 03cf221934 Clean up test_ifaddrs a bit.
We customarily #include <netinet/in.h> before <arpa/inet.h>; according
to our git history (cf commit 527f8babc) there used to be platform(s)
where <arpa/inet.h> didn't compile otherwise.  That's probably not
really an issue anymore, but since test_ifaddrs.c is the one and only
place in our code that's not following that rule, bring it into line.
Also remove #include <sys/socket.h>, as that's duplicative given that
libpq/ifaddr.h does so (via pqcomm.h).

In passing, add a .gitignore file so nobody accidentally commits the
test_ifaddrs executable, as I nearly did.

I see no particular need to back-patch this, as it's just neatnik-ism
considering we don't build test_ifaddrs by default, or even document
it anywhere.
2017-03-07 12:06:07 -05:00
Heikki Linnakangas 818fd4a67d Support SCRAM-SHA-256 authentication (RFC 5802 and 7677).
This introduces a new generic SASL authentication method, similar to the
GSS and SSPI methods. The server first tells the client which SASL
authentication mechanism to use, and then the mechanism-specific SASL
messages are exchanged in AuthenticationSASLcontinue and PasswordMessage
messages. Only SCRAM-SHA-256 is supported at the moment, but this allows
adding more SASL mechanisms in the future, without changing the overall
protocol.
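
A hedged configuration sketch (the parameter value is that of the
released form of this feature; the role name and pg_hba.conf line are
illustrative):

SET password_encryption = 'scram-sha-256';
CREATE ROLE app_user LOGIN PASSWORD 'secret';  -- stored as a SCRAM verifier
-- pg_hba.conf then selects the method per connection, e.g.:
--   host  all  all  0.0.0.0/0  scram-sha-256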

Support for channel binding, aka SCRAM-SHA-256-PLUS is left for later.

The SASLPrep algorithm, for pre-processing the password, is not yet
implemented. That could cause trouble, if you use a password with
non-ASCII characters, and a client library that does implement SASLprep.
That will hopefully be added later.

Authorization identities, as specified in the SCRAM-SHA-256 specification,
are ignored. SET SESSION AUTHORIZATION provides more or less the same
functionality, anyway.

If a user doesn't exist, perform a "mock" authentication, by constructing
an authentic-looking challenge on the fly. The challenge is derived from
a new system-wide random value, "mock authentication nonce", which is
created at initdb, and stored in the control file. We go through these
motions, in order to not give away the information on whether the user
exists, to unauthenticated users.

Bumps PG_CONTROL_VERSION, because of the new field in control file.

Patch by Michael Paquier and Heikki Linnakangas, reviewed at different
stages by Robert Haas, Stephen Frost, David Steele, Aleksander Alekseev,
and many others.

Discussion: https://www.postgresql.org/message-id/CAB7nPqRbR3GmFYdedCAhzukfKrgBLTLtMvENOmPrVWREsZkF8g%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAB7nPqSMXU35g%3DW9X74HVeQp0uvgJxvYOuA4A-A3M%2B0wfEBv-w%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/55192AFE.6080106@iki.fi
2017-03-07 14:25:40 +02:00
Heikki Linnakangas 273c458a2b Refactor SHA2 functions and move them to src/common/.
This way both frontend and backends can use them. The functions are taken
from pgcrypto, which now fetches the source files it needs from
src/common/.

A new interface is designed for the SHA2 functions, which allows linking
to either OpenSSL or the in-core stuff taken from KAME as needed.

Michael Paquier, reviewed by Robert Haas.

Discussion: https://www.postgresql.org/message-id/CAB7nPqTGKuTM5jiZriHrNaQeVqp5e_iT3X4BFLWY_HyHxLvySQ%40mail.gmail.com
2017-03-07 14:23:49 +02:00
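The refactored code exposes the usual init/update/final pattern. A hedged
usage sketch; the type and function names below are recalled from
src/common/sha2.h and should be verified there before relying on them:

    /* Sketch of computing a SHA-256 digest with the shared interface.
     * Backend code includes postgres.h; frontend code would use
     * postgres_fe.h instead. */
    #include "postgres.h"
    #include "common/sha2.h"

    static void
    digest_of(const char *msg, size_t len, uint8 out[PG_SHA256_DIGEST_LENGTH])
    {
        pg_sha256_ctx ctx;

        pg_sha256_init(&ctx);                             /* set up context */
        pg_sha256_update(&ctx, (const uint8 *) msg, len); /* feed the data  */
        pg_sha256_final(&ctx, out);                       /* 32-byte digest */
    }
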
Peter Eisentraut 550214a4ef Add operator_with_argtypes grammar rule
This makes the handling of operators similar to that of functions and
aggregates.

Rename node FuncWithArgs to ObjectWithArgs, to reflect the expanded use.

Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2017-03-06 13:31:47 -05:00
Andres Freund 7e3aa03b41 Reduce size of common allocation header.
The new slab allocator needs different per-allocation information than
the classical aset.c.  The definition in 58b25e981 wasn't sufficiently
careful on 32 bit platforms with 8 byte alignment, leading to buildfarm
failures.  That's not entirely easy to fix by just adjusting the
definition.

As slab.c doesn't actually need the size part(s) of the common header,
all chunks are equally sized after all, it seems better to instead
reduce the header to the part needed by all allocators, namely which
context an allocation belongs to. That has the advantage of reducing
the overhead of slab allocations, and also allows for more flexibility
in future allocators.

To avoid spreading the logic about accessing a chunk's context around,
centralize it in GetMemoryChunkContext(), which allows us to delete a
good number of lines.

A followup commit will revise the mmgr/README portion about
StandardChunkHeader, and more.

Author: Andres Freund
Discussion: https://postgr.es/m/20170228074420.aazv4iw6k562mnxg@alap3.anarazel.de
2017-02-28 19:42:44 -08:00
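Conceptually, the reduced header boils down to storing the owning context
directly in front of the address handed back to callers, so it can always be
recovered from the pointer alone. A simplified sketch with a hypothetical
struct, not the real definitions:

    #include <stddef.h>

    typedef struct MemoryContextData MemoryContextData;    /* opaque here */

    typedef struct ChunkHeader
    {
        MemoryContextData *context;     /* which context owns this chunk */
    } ChunkHeader;

    /* what a pfree()-style routine does to find the owning context */
    static MemoryContextData *
    chunk_context(void *ptr)
    {
        ChunkHeader *hdr = (ChunkHeader *) ((char *) ptr - sizeof(ChunkHeader));

        return hdr->context;
    }
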
Andres Freund 58b25e9810 Add "Slab" MemoryContext implementation for efficient equal-sized allocations.
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes.  There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.

These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.

While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.

This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly.  It
does so by allocating evenly sized blocks via malloc(), and carving
them into chunks that can be used for allocations.  In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.

A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.

There likely are further potential uses of this allocator besides
reorderbuffer.c.

There are potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.

Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
2017-02-27 03:41:44 -08:00
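A toy sketch of the allocation policy described above, not slab.c itself:
equal-sized chunks are carved out of malloc'ed blocks, and new requests are
served from the fullest block that still has a free chunk, so lightly used
blocks can drain and be returned to the OS:

    #include <stddef.h>

    typedef struct SlabBlock
    {
        struct SlabBlock *next;
        int     nfree;                  /* free chunks left in this block */
        void   *freelist;               /* chain of free chunks           */
        /* chunk storage follows in the real thing */
    } SlabBlock;

    static SlabBlock *
    pick_block(SlabBlock *blocks)
    {
        SlabBlock *best = NULL;

        for (SlabBlock *b = blocks; b != NULL; b = b->next)
        {
            if (b->nfree == 0)
                continue;               /* completely full, skip          */
            if (best == NULL || b->nfree < best->nfree)
                best = b;               /* fullest non-full block wins    */
        }
        return best;                    /* NULL => allocate a fresh block */
    }
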
Tom Lane b6aa17e0ae De-support floating-point timestamps.
Per discussion, the time has come to do this.  The handwriting has been
on the wall at least since 9.0 that this would happen someday, whenever
it got to be too much of a burden to support the float-timestamp option.
The triggering factor now is the discovery that there are multiple bugs
in the code that attempts to implement use of integer timestamps in the
replication protocol even when the server is built for float timestamps.
The internal float timestamps leak into the protocol fields in places.
While we could fix the identified bugs, there's a very high risk of
introducing more.  Trying to build a wall that would positively prevent
mixing integer and float timestamps is more complexity than we want to
undertake to maintain a long-deprecated option.  The fact that these
bugs weren't found through testing also indicates a lack of interest
in float timestamps.

This commit disables configure's --disable-integer-datetimes switch
(it'll still accept --enable-integer-datetimes, though), removes direct
references to USE_INTEGER_DATETIMES, and removes discussion of float
timestamps from the user documentation.  A considerable amount of code is
rendered dead by this, but removing that will occur as separate mop-up.

Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us
2017-02-23 11:40:20 -05:00
Tom Lane 5b3a2ca850 Dept of second thoughts: rename new perl script.
It didn't take long at all for me to become irritated that the original
choice of name for this script resulted in "warning" showing up in several
places in build logs, because I tend to grep for that.  Change the script
name to avoid that.
2017-02-19 16:41:51 -05:00
Tom Lane 65d508fd4d Suppress "unused variable" warnings with older versions of flex.
Versions of flex before 2.5.36 might generate code that results in an
"unused variable" warning, when using %option reentrant.  Historically
we've worked around that by specifying -Wno-error, but that's an
unsatisfying solution.  The official "fix" for this was just to insert a
dummy reference to the variable, so write a small perl script that edits
the generated C code similarly.

The MSVC side of this is untested, but the buildfarm should soon reveal
if I broke that.

Discussion: https://postgr.es/m/25456.1487437842@sss.pgh.pa.us
2017-02-19 13:04:30 -05:00
Robert Haas 569174f1be btree: Support parallel index scans.
This isn't exposed to the optimizer or the executor yet; we'll add
support for those things in a separate patch.  But this puts the
basic mechanism in place: several processes can attach to a parallel
btree index scan, and each one will get a subset of the tuples that
would have been produced by a non-parallel scan.  Each index page
becomes the responsibility of a single worker, which then returns
all of the TIDs on that page.

Rahila Syed, Amit Kapila, Robert Haas, reviewed and tested by
Anastasia Lubennikova, Tushar Ahuja, and Haribabu Kommi.
2017-02-15 07:41:14 -05:00
Robert Haas 7ada2d31f4 Remove contrib/tsearch2.
This module was intended to ease migrations of applications that used
the pre-8.3 version of text search to the in-core version introduced
in that release.  However, since all pre-8.3 releases of the database
have been out of support for more than 5 years at this point, we
expect that few people are depending on it at this point.  If some
people still need it, nothing prevents it from being maintained as a
separate extension, outside of core.

Discussion: http://postgr.es/m/CA+Tgmob5R8aDHiFRTQsSJbT1oreKg2FOSBrC=2f4tqEH3dOMAg@mail.gmail.com
2017-02-13 11:06:11 -05:00
Robert Haas 85c11324ca Rename user-facing tools with "xlog" in the name to say "wal".
This means pg_receivexlog becomes pg_receivewal, pg_resetxlog
becomes pg_resetwal, and pg_xlogdump becomes pg_waldump.
2017-02-09 16:23:46 -05:00
Robert Haas 7b4ac19982 Extend index AM API for parallel index scans.
This patch doesn't actually make any index AM parallel-aware, but it
provides the necessary functions at the AM layer to do so.

Rahila Syed, Amit Kapila, Robert Haas
2017-01-24 16:42:58 -05:00
Peter Eisentraut 352a24a1f9 Generate fmgr prototypes automatically
Gen_fmgrtab.pl creates a new file fmgrprotos.h, which contains
prototypes for all functions registered in pg_proc.h.  This avoids
having to manually maintain these prototypes across a random variety of
header files.  It also automatically enforces a correct function
signature, and since there are warnings about missing prototypes, it
will detect functions that are defined but not registered in
pg_proc.h (or otherwise used).

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
2017-01-17 14:06:07 -05:00
Peter Eisentraut 05cd12ed5b pg_ctl: Change default to wait for all actions
The different actions in pg_ctl had different defaults for -w and -W,
mostly for historical reasons.  Most users will want the -w behavior, so
make that the default.

Remove the -w option in most example and test code, to avoid confusion
and reduce verbosity.  pg_upgrade is not touched, so it can continue to
work with older installations.

Reviewed-by: Beena Emerson <memissemerson@gmail.com>
Reviewed-by: Ryan Murphy <ryanfmurphy@gmail.com>
2017-01-14 09:15:08 -05:00
Peter Eisentraut e574f15d62 Updates to reflect that pg_ctl stop -m fast is the default
Various example and test code used -m fast explicitly, but since it's
the default, this can be omitted now or should be replaced by a better
example.

pg_upgrade is not touched, so it can continue to operate with older
installations.
2017-01-13 21:25:36 -05:00
Peter Eisentraut 933b46644c Use 'use strict' in all Perl programs 2017-01-05 12:34:48 -05:00
Bruce Momjian 1d25779284 Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
Noah Misch cc07e06b1e MSVC: Position MSBFLAGS after flags it might override.
Christian Ullrich
2016-12-18 18:12:23 -05:00
Robert Haas acddbe221b Update typedefs.list
So developers can more easily run pgindent locally
2016-12-13 10:51:32 -05:00
Robert Haas f0e44751d7 Implement table partitioning.
Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own.  The children are called
partitions and contain all of the actual data.  Each partition has an
implicit partitioning constraint.  Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed.  Partitions
can't have extra columns and may not allow nulls unless the parent
does.  Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.

Currently, tables can be range-partitioned or list-partitioned.  List
partitioning is limited to a single column, but range partitioning can
involve multiple columns.  A partitioning "column" can be an
expression.

Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations.  The tuple routing which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.

Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others.  Minor revisions by me.
2016-12-07 13:17:55 -05:00
Heikki Linnakangas fe0a0b5993 Replace PostmasterRandom() with a stronger source, second attempt.
This adds a new routine, pg_strong_random() for generating random bytes,
for use in both frontend and backend. At the moment, it's only used in
the backend, but the upcoming SCRAM authentication patches need strong
random numbers in libpq as well.

pg_strong_random() is based on, and replaces, the existing implementation
in pgcrypto. It can acquire strong random numbers from a number of sources,
depending on what's available:

- OpenSSL RAND_bytes(), if built with OpenSSL
- On Windows, the native cryptographic functions are used
- /dev/urandom

Unlike the current pgcrypto function, the source is chosen by configure.
That makes it easier to test different implementations, and ensures that
we don't accidentally fall back to a less secure implementation, if the
primary source fails. All of those methods are quite reliable; it would be
pretty surprising for them to fail, so we'd rather find out by failing
hard.

If no strong random source is available, we fall back to using erand48(),
seeded from current timestamp, like PostmasterRandom() was. That isn't
cryptographically secure, but allows us to still work on platforms that
don't have any of the above stronger sources. Because it's not very secure,
the built-in implementation is only used if explicitly requested with
--disable-strong-random.

This replaces the more complicated Fortuna algorithm we used to have in
pgcrypto, which is unfortunate, but all modern platforms have /dev/urandom,
so it doesn't seem worth the maintenance effort to keep that. pgcrypto
functions that require strong random numbers will be disabled with
--disable-strong-random.

Original patch by Magnus Hagander, tons of further work by Michael Paquier
and me.

Discussion: https://www.postgresql.org/message-id/CAB7nPqRy3krN8quR9XujMVVHYtXJ0_60nqgVc6oUk8ygyVkZsA@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAB7nPqRWkNYRRPJA7-cF+LfroYV10pvjdz6GNvxk-Eee9FypKA@mail.gmail.com
2016-12-05 13:42:59 +02:00
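A sketch of just the /dev/urandom source mentioned above; when built with
OpenSSL or on Windows the stronger native facilities are used instead, and
real code reports errors in more detail:

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdbool.h>
    #include <stddef.h>

    static bool
    random_from_urandom(void *buf, size_t len)
    {
        int     fd = open("/dev/urandom", O_RDONLY);
        size_t  done = 0;

        if (fd < 0)
            return false;

        while (done < len)
        {
            ssize_t n = read(fd, (char *) buf + done, len - done);

            if (n <= 0)
            {
                close(fd);
                return false;           /* caller fails hard, by design */
            }
            done += (size_t) n;
        }
        close(fd);
        return true;
    }
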
Robert Haas 13df76a537 Introduce dynamic shared memory areas.
Programmers discovered decades ago that it was useful to have a simple
interface for allocating and freeing memory, which is why malloc() and
free() were invented.  Unfortunately, those handy tools don't work
with dynamic shared memory segments because those are specific to
PostgreSQL and are not necessarily mapped at the same address in every
cooperating process.  So invent our own allocator instead.  This makes
it possible for processes cooperating as part of parallel query
execution to allocate and free chunks of memory without having to
reserve them prior to the start of execution.  It could also be used
for longer lived objects; for example, we could consider storing data
for pg_stat_statements or the stats collector in shared memory using
these interfaces, rather than writing them to files.  Basically,
anything that needs shared memory but can't predict in advance how
much it's going to need might find this useful.

Thomas Munro and Robert Haas.  The original code (of mine) on which
Thomas based his work was actually designed to be a new backend-local
memory allocator for PostgreSQL, but that hasn't gone anywhere - or
not yet, anyway.  Thomas took that work and performed major
refactoring and extensive modifications to make it work with dynamic
shared memory, including the addition of appropriate locking.

Discussion: CA+TgmobkeWptGwiNa+SGFWsTLzTzD-CeLz0KcE-y6LFgoUus4A@mail.gmail.com
Discussion: CAEepm=1z5WLuNoJ80PaCvz6EtG9dN0j-KuHcHtU6QEfcPP5-qA@mail.gmail.com
2016-12-02 12:34:36 -05:00
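The central trick is that the same segment may be mapped at a different
address in every cooperating process, so allocations are handed around as
offsets rather than raw pointers and converted locally on use. A conceptual
sketch with hypothetical names, not the actual API of this commit:

    #include <stdint.h>

    typedef uint64_t shared_ptr;        /* offset into the shared area */

    typedef struct LocalMapping
    {
        char   *base;                   /* where this process mapped it */
    } LocalMapping;

    static void *
    to_local(LocalMapping *map, shared_ptr sp)
    {
        return map->base + sp;          /* same offset, per-process base */
    }

    static shared_ptr
    to_shared(LocalMapping *map, void *p)
    {
        return (shared_ptr) ((char *) p - map->base);
    }
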
Robert Haas 13e14a78ea Management of free memory pages.
This is intended as infrastructure for a full-fledged allocator for
dynamic shared memory.  The interface looks a bit like a real
allocator, but only supports allocating and freeing memory in
multiples of the 4kB page size.  Further, to free memory, you must
know the size of the span you wish to free, in pages.  While these
limitations make it unsuitable as an allocator in and of itself, it still serves
as very useful scaffolding for a full-fledged allocator.

Robert Haas and Thomas Munro.  This code is mostly the same as my 2014
submission, but Thomas fixed quite a few bugs and made some changes to
the interface.

Discussion: CA+TgmobkeWptGwiNa+SGFWsTLzTzD-CeLz0KcE-y6LFgoUus4A@mail.gmail.com
Discussion: CAEepm=1z5WLuNoJ80PaCvz6EtG9dN0j-KuHcHtU6QEfcPP5-qA@mail.gmail.com
2016-12-02 12:03:30 -05:00
Noah Misch 650b967076 Change qr/foo$/m to qr/foo\n/m, for Perl 5.8.8.
In each case, absence of a trailing newline would itself constitute a
PostgreSQL bug.  Therefore, this slightly enhances the changed tests.
This works around a bug that last appeared in Perl 5.8.8, fixing
src/test/modules/test_pg_dump when run against that version.  Commit
e7293e3271 worked around the bug, but the
subsequent addition of test_pg_dump introduced affected code.  As that
commit had shown, slight increases in pattern complexity can suppress
the bug.  This commit edits qr/foo$/m patterns too complex to encounter
the bug today, for style consistency and robustness against unrelated
pattern changes.  Back-patch to 9.6, where test_pg_dump was introduced.

As of this writing, a fresh MSYS installation includes an affected Perl
5.8.8.  The Perl 5.8.8 in Red Hat Enterprise Linux 5.11 carries a patch
that renders it unaffected, but the Perl 5.8.5 of Red Hat Enterprise
Linux 4.4 is affected.
2016-11-07 20:27:30 -05:00
Heikki Linnakangas faae1c918e Revert "Replace PostmasterRandom() with a stronger way of generating randomness."
This reverts commit 9e083fd468. That was a
few bricks shy of a load:

* Query cancel stopped working
* Buildfarm member pademelon stopped working, because the box doesn't have
  /dev/urandom nor /dev/random.

This clearly needs some more discussion, and a quite different patch, so
revert for now.
2016-10-18 16:28:23 +03:00
Heikki Linnakangas 9e083fd468 Replace PostmasterRandom() with a stronger way of generating randomness.
This adds a new routine, pg_strong_random() for generating random bytes,
for use in both frontend and backend. At the moment, it's only used in
the backend, but the upcoming SCRAM authentication patches need strong
random numbers in libpq as well.

pg_strong_random() is based on, and replaces, the existing implementation
in pgcrypto. It can acquire strong random numbers from a number of sources,
depending on what's available:
- OpenSSL RAND_bytes(), if built with OpenSSL
- On Windows, the native cryptographic functions are used
- /dev/urandom
- /dev/random

Original patch by Magnus Hagander, with further work by Michael Paquier
and me.

Discussion: <CAB7nPqRy3krN8quR9XujMVVHYtXJ0_60nqgVc6oUk8ygyVkZsA@mail.gmail.com>
2016-10-17 11:52:50 +03:00
Andres Freund 5dfc198146 Use more efficient hashtable for execGrouping.c to speed up hash aggregation.
The more efficient hashtable speeds up hash-aggregations with more than
a few hundred groups significantly. Improvements of over 120% have been
measured.

Due to the different hash table, queries whose results are not fully
determined (e.g. GROUP BY without ORDER BY) may change their result
order.

The conversion is largely straight-forward, except that, due to the
static element types of simplehash.h type hashes, the additional data
some users store in elements (e.g. the per-group working data for hash
aggregates) is now stored in TupleHashEntryData->additional.  The
meaning of BuildTupleHashTable's entrysize (renamed to additionalsize)
has been changed to only be about the additionally stored size.  That
size is only used for the initial sizing of the hash-table.

Reviewed-By: Tomas Vondra
Discussion: <20160727004333.r3e2k2y6fvk2ntup@alap3.anarazel.de>
2016-10-14 17:22:51 -07:00
Andres Freund b30d3ea824 Add a macro templatized hashtable.
dynahash.c hash tables aren't quite fast enough for some
use-cases. There are several reasons for lacking performance:
- the use of chaining for collision handling makes them cache
  inefficient; that's especially an issue when the tables get bigger.
- as the element sizes for dynahash are only determined at runtime,
  offset computations are somewhat expensive
- hash and element comparisons are indirect function calls, causing
  unnecessary pipeline stalls
- its two-level structure has some benefits (somewhat natural
  partitioning), but increases the number of indirections

To fix several of these, the hash tables have to be adjusted to the
individual use-case at compile-time. C unfortunately doesn't provide a
good way to do compile-time code generation (like e.g. C++'s templates for
all their weaknesses do).  Thus the somewhat ugly approach taken here is
to allow for code generation using a macro-templatized header file,
which generates functions and types based on a prefix and other
parameters.

Later patches use this infrastructure to use such hash tables for
tidbitmap.c (bitmap scans) and execGrouping.c (hash aggregation,
...). In queries where these use up a large fraction of the time, this
has been measured to lead to performance improvements of over 100%.

There are other cases where this could be useful (e.g. catcache.c).

The hash table design chosen is a variant of linear open-addressing. The
biggest disadvantage of simple linear addressing schemes is highly
variable lookup times due to clustering, and deletions leaving a lot of
tombstones around.  To address these issues a variant of "robin hood"
hashing is employed.  Robin hood hashing optimizes chaining lengths by
moving elements close to their optimal bucket ("rich" elements), out of
the way if a to-be-inserted element is further away from its optimal
position (i.e. it's "poor").  While that can make insertions slower, the
average lookup performance is a lot better, and higher fill factors can
be used in a still performant manner.  To avoid tombstones - which
normally solve the issue that a deleted node's presence is relevant to
determine whether a lookup needs to continue looking or is done -
buckets following a deleted element are shifted backwards, unless
they're empty or already at their optimal position.

There are further possible improvements that can be made to this
implementation. Amongst others:
- Use distance as a termination criteria during searches. This is
  generally a good idea, but I've been able to see the overhead of
  distance calculations in some cases.
- Consider combining the 'empty' status into the hashvalue, and enforce
  storing the hashvalue. That could, in some cases, increase memory
  density and remove a few instructions.
- Experiment further with the, very conservatively chosen, fillfactor.
- Make maximum size of hashtable configurable, to allow storing very
  very large tables. That'd require 64bit hash values to be more common
  than now, though.
- some smaller memcpy calls could be optimized to copy larger chunks
But since the new implementation is already considerably faster than
dynahash, it seems sensible to start using it.

Reviewed-By: Tomas Vondra
Discussion: <20160727004333.r3e2k2y6fvk2ntup@alap3.anarazel.de>
2016-10-14 16:07:38 -07:00
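A toy illustration of the robin hood insertion policy described above, over
a fixed-size open-addressing table; simplehash.h itself is generated from
macros and also handles growth, deletion and lookup:

    #include <stdint.h>
    #include <stdbool.h>

    #define TABLE_SIZE 8                /* power of two for cheap masking */

    typedef struct Slot
    {
        bool     used;
        uint32_t hash;
        int      value;
    } Slot;

    static uint32_t
    optimal_bucket(uint32_t hash)
    {
        return hash & (TABLE_SIZE - 1);
    }

    /* assumes the table never becomes full; the real table grows first */
    static void
    rh_insert(Slot table[TABLE_SIZE], uint32_t hash, int value)
    {
        uint32_t pos = optimal_bucket(hash);
        uint32_t dist = 0;              /* distance from our optimal slot */

        for (;;)
        {
            if (!table[pos].used)
            {
                table[pos].used = true;
                table[pos].hash = hash;
                table[pos].value = value;
                return;
            }

            /* how far is the current occupant from *its* optimal slot? */
            uint32_t occ_dist =
                (pos - optimal_bucket(table[pos].hash)) & (TABLE_SIZE - 1);

            if (occ_dist < dist)
            {
                /* occupant is "richer": take its slot, keep inserting it */
                Slot    tmp = table[pos];

                table[pos].hash = hash;
                table[pos].value = value;
                hash = tmp.hash;
                value = tmp.value;
                dist = occ_dist;
            }

            pos = (pos + 1) & (TABLE_SIZE - 1);
            dist++;
        }
    }
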
Tom Lane eda04886c1 Avoid direct cross-module links in hstore_plperl and ltree_plpython, too.
Just turning the crank on the project started in commit d51924be8.
These cases turn out to be exact subsets of the boilerplate needed
for hstore_plpython.

Discussion: <2652.1475512158@sss.pgh.pa.us>
2016-10-04 17:49:07 -04:00
Tom Lane d51924be88 Convert contrib/hstore_plpython to not use direct linking to other modules.
Previously, on most platforms, we allowed hstore_plpython's references
to hstore and plpython to be unresolved symbols at link time, trusting
the dynamic linker to resolve them when the module is loaded.  This
has a number of problems, the worst being that the dynamic linker
does not know where the references come from and can do nothing but
fail if those other modules haven't been loaded.  We've more or less
gotten away with that for the limited use-case of datatype transform
modules, but even there, it requires some awkward hacks, most recently
commit 83c249200.

Instead, let's not treat these references as linker-resolvable at all,
but use function pointers that are manually filled in by the module's
_PG_init function.  There are few enough contact points that this
doesn't seem unmaintainable, at least for these use-cases.  (Note that
the same technique wouldn't work at all for decoupling from libpython
itself, but fortunately that's just a standard shared library and can
be linked to normally.)

This is an initial patch that just converts hstore_plpython.  If the
buildfarm doesn't find any fatal problems, I'll work on the other
transform modules soon.

Tom Lane, per an idea of Andres Freund's.

Discussion: <2652.1475512158@sss.pgh.pa.us>
2016-10-03 22:27:11 -04:00
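The general technique, sketched with POSIX dlsym() standing in for whatever
symbol-lookup helper the host program provides; the names below are
hypothetical, not the actual hstore_plpython contact points:

    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>

    /* pointer to a routine implemented by some other loadable module */
    static int (*other_module_func) (int) = NULL;

    void
    my_module_init(void)
    {
        /* RTLD_DEFAULT searches the modules that are already loaded */
        other_module_func =
            (int (*)(int)) dlsym(RTLD_DEFAULT, "other_module_func");

        if (other_module_func == NULL)
            fprintf(stderr, "required module is not loaded yet\n");
    }
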
Peter Eisentraut bf5bb2e85b Move fsync routines of initdb into src/common/
The intention is to use those in other utilities such as pg_basebackup
and pg_receivexlog.

From: Michael Paquier <michael.paquier@gmail.com>
2016-09-29 12:00:00 -04:00
Tom Lane da6c4f6ca8 Refer to OS X as "macOS", except for the port name which is still "darwin".
We weren't terribly consistent about whether to call Apple's OS "OS X"
or "Mac OS X", and the former is probably confusing to people who aren't
Apple users.  Now that Apple has rebranded it "macOS", follow their lead
to establish a consistent naming pattern.  Also, avoid the use of the
ancient project name "Darwin", except as the port code name which does not
seem desirable to change.  (In short, this patch touches documentation and
comments, but no actual code.)

I didn't touch contrib/start-scripts/osx/, either.  I suspect those are
obsolete and due for a rewrite, anyway.

I dithered about whether to apply this edit to old release notes, but
those were responsible for quite a lot of the inconsistencies, so I ended
up changing them too.  Anyway, Apple's being ahistorical about this,
so why shouldn't we be?
2016-09-25 15:40:57 -04:00
Heikki Linnakangas 3c2d5d6600 Improve error message on MSVC if perl*.lib is not found.
John Harvey, reviewed by Michael Paquier

Discussion: <CABcP5fjEjgOsh097cWnQrsK9yCswo4DZxp-V47DKCH-MxY9Gig@mail.gmail.com>
2016-09-23 14:21:59 +03:00
Robert Haas 8614b39bca MSVC: Include pg_recvlogical in client-only install.
MauMau, reviewed by Michael Paquier
2016-09-19 14:25:57 -04:00
Tom Lane 28e5e5648c Fix and simplify MSVC build's handling of xml/xslt/uuid dependencies.
Solution.pm mistakenly believed that the xml option requires the xslt
option, when actually the dependency is the other way around; and it
believed that libxml requires libiconv, which is not necessarily so,
so we shouldn't enforce it here.  Fix the option cross-checking logic.

Also, since AddProject already takes care of adding libxml and libxslt
include and library dependencies to every project, there's no need
for the custom code that did that in mkvcbuild.  While at it, let's
handle the similar dependencies for uuid in a similar fashion.

Given the lack of field complaints about these overly strict build
dependency requirements, there seems no need for a back-patch.

Michael Paquier

Discussion: <CAB7nPqR0+gpu3mRQvFjf-V-bMxmiSJ6NpTg9_WzVDL+a31cV2g@mail.gmail.com>
2016-09-11 12:46:55 -04:00
Noah Misch d299eb41df MSVC: Pass any user-set MSBFLAGS to MSBuild and VCBUILD.
This is particularly useful to pass /m, to perform a parallel build.

Christian Ullrich, reviewed by Michael Paquier.
2016-09-08 01:42:09 -04:00
Noah Misch 976a9bbd02 MSVC: Place gendef.pl temporary file in the target directory.
Until now, it used the current working directory.  This makes it safe
for simultaneous invocations of gendef.pl, with different target
directories, to run from a single current working directory, such as
$(top_srcdir).  The MSVC build system will soon rely on this.

Christian Ullrich, reviewed by Michael Paquier.
2016-09-08 01:40:53 -04:00
Tom Lane da6ea70c32 Remove vestigial references to "zic" in favor of "IANA database".
Commit b2cbced9e instituted a policy of referring to the timezone database
as the "IANA timezone database" in our user-facing documentation.
Propagate that wording into a couple of places that were still using "zic"
to refer to the database, which is definitely not right (zic is the
compilation tool, not the data).

Back-patch, not because this is very important in itself, but because
we routinely cherry-pick updates to the tznames files and I don't want
to risk future merge failures.
2016-09-04 19:42:08 -04:00
Heikki Linnakangas ec136d19b2 Move code shared between libpq and backend from backend/libpq/ to common/.
When building libpq, ip.c and md5.c were symlinked or copied from
src/backend/libpq into src/interfaces/libpq, but now that we have a
directory specifically for routines that are shared between the server and
client binaries, src/common/, move them there.

Some routines in ip.c were only used in the backend. Keep those in
src/backend/libpq, but rename to ifaddr.c to avoid confusion with the file
that's now in common.

Fix the comment in src/common/Makefile to reflect how libpq actually links
those files.

There are two more files that libpq symlinks directly from src/backend:
encnames.c and wchar.c. I don't feel compelled to move those right now,
though.

Patch by Michael Paquier, with some changes by me.

Discussion: <69938195-9c76-8523-0af8-eb718ea5b36e@iki.fi>
2016-09-02 13:49:59 +03:00
Tom Lane 04164deb7c initdb now needs to reference libpq include files in MSVC builds.
Fallout from commit a00c58314.  Per buildfarm.
2016-08-20 16:53:25 -04:00
Tom Lane a3bce17ef1 Automate the maintenance of SO_MINOR_VERSION for our shared libraries.
Up to now we've manually adjusted these numbers in several different
Makefiles at the start of each development cycle.  While that's not
much work, it's easily forgotten, so let's get rid of it by setting
the SO_MINOR_VERSION values directly from $(MAJORVERSION).

In the case of libpq, this dev cycle's value of SO_MINOR_VERSION happens
to be "10" anyway, so this switch is transparent.  For ecpg's shared
libraries, this will result in skipping one or two minor version numbers
between v9.6 and v10, which seems like no big problem; and it was a bit
inconsistent that they didn't have equal minor version numbers anyway.

Discussion: <21969.1471287988@sss.pgh.pa.us>
2016-08-16 13:58:54 -04:00
Tom Lane a7b5573d66 Remove separate version numbering for ecpg preprocessor.
Once upon a time, it made sense for the ecpg preprocessor to have its
own version number, because it used a manually-maintained grammar that
wasn't always in sync with the core grammar.  But those days are
thankfully long gone, leaving only a maintenance nuisance behind.
Let's use the PG v10 version numbering changeover as an excuse to get
rid of the ecpg version number and just have ecpg identify itself by
PG_VERSION.  From the user's standpoint, ecpg will go from "4.12" in
the 9.6 branch to "10" in the 10 branch, so there's no failure of
monotonicity.

Discussion: <1471332659.4410.67.camel@postgresql.org>
2016-08-16 12:49:30 -04:00
Robert Haas b25b6c9701 Once again allow LWLocks to be used within DSM segments.
Prior to commit 7882c3b0b9, it was
possible to use LWLocks within DSM segments, but that commit broke
this use case by switching from a doubly linked list to a circular
linked list.  Switch back, using a new bit of general infrastructure
for maintaining lists of PGPROCs.

Thomas Munro, reviewed by me.
2016-08-15 18:09:55 -04:00
Tom Lane 3149a12166 Update git_changelog to know that there's a 9.6 branch.
Missed this in the main 10devel version stamping patch.
2016-08-15 15:24:54 -04:00
Tom Lane 0b9358d440 Stamp shared-library minor version numbers for v10. 2016-08-15 14:35:55 -04:00
Tom Lane ca9112a424 Stamp HEAD as 10devel.
This is a good bit more complicated than the average new-version stamping
commit, because it includes various adjustments in pursuit of changing
from three-part to two-part version numbers.  It's likely some further
work will be needed around that change; but this is enough to get through
the regression tests, at least in Unix builds.

Peter Eisentraut and Tom Lane
2016-08-15 13:49:49 -04:00
Tom Lane b5bce6c1ec Final pgindent + perltidy run for 9.6. 2016-08-15 13:42:51 -04:00
Tom Lane 05d8dec690 Simplify the process of perltidy'ing our Perl files.
Wrap the perltidy invocation into a shell script to reduce the risk of
copy-and-paste errors.  Include removal of *.bak files in the script,
so they don't accidentally get committed.  Improve the directions in
the README file.
2016-08-15 11:32:09 -04:00
Noah Misch fcd15f1358 Obstruct shell, SQL, and conninfo injection via database and role names.
Due to simplistic quoting and confusion of database names with conninfo
strings, roles with the CREATEDB or CREATEROLE option could escalate to
superuser privileges when a superuser next ran certain maintenance
commands.  The new coding rule for PQconnectdbParams() calls, documented
at conninfo_array_parse(), is to pass expand_dbname=true and wrap
literal database names in a trivial connection string.  Escape
zero-length values in appendConnStrVal().  Back-patch to 9.1 (all
supported versions).

Nathan Bossart, Michael Paquier, and Noah Misch.  Reviewed by Peter
Eisentraut.  Reported by Nathan Bossart.

Security: CVE-2016-5424
2016-08-08 10:07:46 -04:00
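From the client side, the calling convention looks like the sketch below,
using libpq's PQconnectdbParams(); the database name and the quoting note
are illustrative only, and real callers must escape backslashes and single
quotes inside the wrapped value:

    #include <libpq-fe.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* a literal database name wrapped in a trivial connection string */
        const char *keywords[] = {"dbname", NULL};
        const char *values[]   = {"dbname='my database'", NULL};

        PGconn *conn = PQconnectdbParams(keywords, values,
                                         1 /* expand_dbname */ );

        if (PQstatus(conn) != CONNECTION_OK)
            fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        return 0;
    }
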
Tom Lane 18555b1323 Establish conventions about global object names used in regression tests.
To ensure that "make installcheck" can be used safely against an existing
installation, we need to be careful about what global object names
(database, role, and tablespace names) we use; otherwise we might
accidentally clobber important objects.  There's been a weak consensus that
test databases should have names including "regression", and that test role
names should start with "regress_", but we didn't have any particular rule
about tablespace names; and neither of the other rules was followed with
any consistency either.

This commit moves us a long way towards having a hard-and-fast rule that
regression test databases must have names including "regression", and that
test role and tablespace names must start with "regress_".  It's not
completely there because I did not touch some test cases in rolenames.sql
that test creation of special role names like "session_user".  That will
require some rethinking of exactly what we want to test, whereas the intent
of this patch is just to hit all the cases in which the needed renamings
are cosmetic.

There is no enforcement mechanism in this patch either, but if we don't
add one we can expect that the tests will soon be violating the convention
again.  Again, that's not such a cosmetic change and it will require
discussion.  (But I did use a quick-hack enforcement patch to find these
cases.)

Discussion: <16638.1468620817@sss.pgh.pa.us>
2016-07-17 18:42:43 -04:00
Tom Lane 30b2731bd2 Fix TAP tests and MSVC scripts for pathnames with spaces.
Change assorted places in our Perl code that did things like
	system("prog $path/file");
to do it more like
	system('prog', "$path/file");
which is safe against spaces and other special characters in the path
variable.  The latter was already the prevailing style, but a few bits
of code hadn't gotten this memo.  Back-patch to 9.4 as relevant.

Michael Paquier, Kyotaro Horiguchi

Discussion: <20160704.160213.111134711.horiguchi.kyotaro@lab.ntt.co.jp>
2016-07-09 16:47:38 -04:00
Tom Lane 63ae052367 Update oidjoins regression test for 9.6.
Looks like we had some more catalog drift recently.
2016-06-22 17:12:55 -04:00
Robert Haas e472ce9624 Add integrity-checking functions to pg_visibility.
The new pg_check_visible() and pg_check_frozen() functions can be used to
verify that the visibility map bits for a relation's data pages match the
actual state of the tuples on those pages.

Amit Kapila and Robert Haas, reviewed (in earlier versions) by Andres
Freund.  Additional testing help by Thomas Munro.
2016-06-15 14:33:58 -04:00
Noah Misch 3be0a62ffe Finish pgindent run for 9.6: Perl files. 2016-06-12 04:19:56 -04:00
Noah Misch b098abf905 Document the authoritative version of perltidy.
Every whole-tree perltidy run has used this version, firmly establishing
it as the de facto standard.
2016-06-12 04:19:44 -04:00
Robert Haas e7bcd983f5 Yet again update typedefs.list file in preparation for pgindent run
Because the run was delayed, the file had a chance to get out of date.
2016-06-09 12:16:17 -04:00
Greg Stark e1623c3959 Fix various common misspellings.
Mostly these are just comments but there are a few in documentation
and a handful in code and tests. Hopefully this doesn't cause too much
unnecessary pain for backpatching. I relented from some of the most
common like "thru" for that reason. The rest don't seem numerous
enough to cause problems.

Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings
2016-06-03 16:08:45 +01:00
Peter Eisentraut 5251f2fc4d Update release instructions for translation updates
We don't tag the translations repository any more, because the commits
into postgresql contain the git hashes, and that's authoritative.
2016-05-13 21:21:47 -04:00
Stephen Frost 6e243c43c9 Add test_pg_dump to @contrib_excludes
The test_pg_dump extension doesn't have a C component, so we need
to exclude it from the MSVC build system trying to figure out how
to build it.

Also add a "MODULES" line to the Makefile, as test_extensions has.
Might not be necessary, but seems good to keep things consistent.

Lastly, remove the 'installcheck' line from test_pg_dump, as that
was causing redefinition errors, at least on my box.  This also
makes test_pg_dump consistent with how commit_ts is set up.
2016-05-06 16:39:56 -04:00
Robert Haas f2f5e7e78e Again update typedefs.list file in preparation for pgindent run
This time, use the buildfarm-supplied contents for this file, instead
of trying to update it by eyeballing the pgindent output.

Per discussion with Tom and Bruce.
2016-05-02 09:23:55 -04:00
Tom Lane 8473b7f95f Add a --non-master-only option to git_changelog.
This has the inverse effect of --master-only.  It's needed to help find
cases where a commit should not be described in major release notes
because it was back-patched into older branches, though not at the same
time as the HEAD commit.
2016-05-01 11:24:32 -04:00
Andrew Dunstan 0fb54de9aa Support building with Visual Studio 2015
Adjust the way we detect the locale. As a result the minimum Windows
version supported by VS2015 and later is Windows Vista. Add some tweaks
to remove new compiler warnings. Remove documentation references to the
now obsolete msysGit.

Michael Paquier, somewhat edited by me, reviewed by Christian Ullrich.

Backpatch to 9.5
2016-04-29 08:09:07 -04:00
Robert Haas acb51bd71d Update typedefs.list file in preparation for pgindent run
In addition to adding new typedefs, I also re-sorted the file so that
various entries added piecemeal, mostly or entirely by me, were alphabetized
the same way as other entries in the file.
2016-04-27 11:50:34 -04:00
Tom Lane 8067c8f86b Add a --brief option to git_changelog.
In commit c0b050192, Andres introduced the idea of including one-line
commit references in our major release notes.  Teach git_changelog to
emit a (lightly adapted) version of that format, so that we don't
have to laboriously add it to the notes after the fact.  The default
output isn't changed, since I anticipate still using that for minor
release notes.
2016-04-26 18:52:41 -04:00
Magnus Hagander cf086b1c2f Update helptext for vcregress.pl
This has clearly not been tracking the code changes for quite some time.

Michael Paquier, problem spotted by Kyotaro HORIGUCHI
2016-04-15 10:04:10 +02:00
Kevin Grittner 80647bf65a Make oldSnapshotControl a pointer to a volatile structure
It was incorrectly declared as a volatile pointer to a non-volatile
structure.  Eliminate the OldSnapshotControl struct definition; it
is really not needed.  Pointed out by Tom Lane.

While at it, add OldSnapshotControlData to pgindent's list of
structures.
2016-04-11 15:43:52 -05:00
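The declaration distinction at issue, shown in isolation with an
illustrative struct rather than the actual snapmgr.c code; only the second
form makes accesses through the pointer volatile:

    typedef struct SharedState
    {
        int some_field;
    } SharedState;

    /* the pointer variable itself is volatile; not what shared data needs */
    static SharedState *volatile wrong_decl;

    /* what was wanted: the pointed-to structure is treated as volatile */
    static volatile SharedState *right_decl;
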
Peter Eisentraut ee5dbc8173 cpluspluscheck: Update include path
Some things in src/include/fe_utils require libpq headers, so add
libpq's include path to the command line used here.
2016-04-11 11:16:16 -04:00
Andres Freund 48354581a4 Allow Pin/UnpinBuffer to operate in a lockfree manner.
Pinning/Unpinning a buffer is a very frequent operation; especially in
read-mostly cache resident workloads. Benchmarking shows that in various
scenarios the spinlock protecting a buffer header's state becomes a
significant bottleneck. The problem can be reproduced with pgbench -S on
larger machines, but can be considerably worse for queries which touch
the same buffers over and over at a high frequency (e.g. nested loops
over a small inner table).

To allow atomic operations to be used, cram BufferDesc's flags,
usage_count, buf_hdr_lock, refcount into a single 32bit atomic variable;
that allows manipulating them together using 32bit compare-and-swap
operations. This requires reducing MAX_BACKENDS to 2^18-1 (which could
be lifted by using a 64bit field, but it's not a realistic configuration
atm).

As not all operations can easily be implemented in a lockfree manner,
implement the previous buf_hdr_lock via a flag bit in the atomic
variable. That way we can continue to lock the header in places where
it's needed, but can get away without acquiring it in the more frequent
hot-paths.  There's some additional operations which can be done without
the lock, but aren't in this patch; but the most important places are
covered.

As bufmgr.c now essentially re-implements spinlocks, abstract the delay
logic from s_lock.c into something more generic. It already has two
users, and more are coming up; there's a follow-up patch for lwlock.c at
least.

This patch is based on a proof-of-concept written by me, which Alexander
Korotkov made into a fully working patch; the committed version is again
revised by me.  Benchmarking and testing has, amongst others, been
provided by Dilip Kumar, Alexander Korotkov, Robert Haas.

On a large x86 system improvements for readonly pgbench, with a high
client count, of a factor of 8 have been observed.

Author: Alexander Korotkov and Andres Freund
Discussion: 2400449.GjM57CE0Yg@dinodell
2016-04-10 20:12:32 -07:00
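An illustrative sketch of the compare-and-swap pattern, with a made-up field
layout; the real BufferDesc state word and its helper routines differ, and
real code backs off instead of spinning tightly:

    #include <stdatomic.h>
    #include <stdint.h>

    #define LOCKED_FLAG 0x80000000u     /* stand-in for the header-lock bit */

    static _Atomic uint32_t buf_state;  /* refcount packed into the low bits */

    static void
    pin_buffer(void)
    {
        uint32_t old = atomic_load(&buf_state);

        for (;;)
        {
            if (old & LOCKED_FLAG)
            {
                /* header currently "locked": reread the word and retry */
                old = atomic_load(&buf_state);
                continue;
            }

            uint32_t desired = old + 1; /* bump the packed refcount */

            if (atomic_compare_exchange_weak(&buf_state, &old, desired))
                break;                  /* pinned without any spinlock */
            /* on failure, 'old' was refreshed; just retry */
        }
    }
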
Kevin Grittner 279d86afdb Add snapshot_too_old to MSVC @contrib_excludes
The buildfarm showed failure for Windows MSVC builds due to this
omission.  This might not be the only problem with the Makefile for
this feature, but hopefully this will get it past the immediate
problem.

Fix suggested by Tom Lane
2016-04-08 17:22:21 -05:00
Andrew Dunstan 76a1c97bf2 Silence warning from modern perl about unescaped braces 2016-04-08 12:50:30 -04:00
Andrew Dunstan 01a07e6c11 Turn down MSVC compiler verbosity
Most of what is produced by the detailed verbosity level is of no
interest at all, so switch to the normal level for more usable output.

Christian Ullrich

Backpatch to all live branches
2016-04-08 12:37:20 -04:00
Fujii Masao 989be0810d Support multiple synchronous standby servers.
Previously synchronous replication offered only the ability to confirm
that all changes made by a transaction had been transferred to at most
one synchronous standby server.

This commit extends synchronous replication so that it supports multiple
synchronous standby servers. It enables users to consider one or more
standby servers as synchronous, and increase the level of transaction
durability by ensuring that transaction commits wait for replies from
all of those synchronous standbys.

Multiple synchronous standby servers are configured in
synchronous_standby_names which is extended to support new syntax of
'num_sync ( standby_name [ , ... ] )', where num_sync specifies
the number of synchronous standbys that transaction commits need to
wait for replies from and standby_name is the name of a standby
server.

The syntax of 'standby_name [ , ... ]' which was used in 9.5 or before
is also still supported. It's the same as new syntax with num_sync=1.

This commit doesn't include "quorum commit" feature which was discussed
in pgsql-hackers. Synchronous standbys are chosen based on their priorities.
synchronous_standby_names determines the priority of each standby for
being chosen as a synchronous standby. The standbys whose names appear
earlier in the list are given higher priority and will be considered as
synchronous. Other standby servers appearing later in this list
represent potential synchronous standbys.

The regression test for multiple synchronous standbys is not included
in this commit. It should come later.

Authors: Sawada Masahiko, Beena Emerson, Michael Paquier, Fujii Masao
Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Robert Haas, Simon Riggs,
Amit Langote, Thomas Munro, Sameer Thakur, Suraj Kharage, Abhijit Menon-Sen,
Rajeev Rastogi

Many thanks to the various individuals who were involved in
discussing and developing this feature.
2016-04-06 17:18:25 +09:00
Alvaro Herrera f2fcad27d5 Support ALTER THING .. DEPENDS ON EXTENSION
This introduces a new dependency type which marks an object as depending
on an extension, such that if the extension is dropped, the object
automatically goes away; and also, if the database is dumped, the object
is included in the dump output.  Currently the grammar supports this for
indexes, triggers, materialized views and functions only, although the
utility code is generic so adding support for more object types is a
matter of touching the parser rules only.

Author: Abhijit Menon-Sen
Reviewed-by: Alexander Korotkov, Álvaro Herrera
Discussion: http://www.postgresql.org/message-id/20160115062649.GA5068@toroid.org
2016-04-05 18:38:54 -03:00
Noah Misch 4ad6f13500 Copyedit comments and documentation. 2016-04-01 21:53:10 -04:00
Magnus Hagander 3063e7a840 Add missing gss option to msvc config template
Michael Paquier
2016-03-30 10:49:44 +02:00
Tom Lane f5f15ea6aa Fix MSVC build for changes in zic.
zic now only needs zic.c, but I didn't realize knowledge about it was
hardwired into Mkvcbuild.pm.  Per buildfarm.
2016-03-28 16:02:07 -04:00
Tom Lane cd37bb7859 Improve PL/Tcl errorCode facility by providing decoded name for SQLSTATE.
We don't really want to encourage people to write numeric SQLSTATEs in
programs; that's unreadable and error-prone.  Copy plpgsql's infrastructure
for converting between SQLSTATEs and exception names shown in Appendix A,
and modify examples in tests and documentation to do it that way.
2016-03-25 16:54:52 -04:00
Tom Lane c1156411ad Move psql's psqlscan.l into src/fe_utils.
This completes (at least for now) the project of getting rid of ad-hoc
linkages among the src/bin/ subdirectories.  Everything they share is now
in src/fe_utils/ and is included from a static library at link time.

A side benefit is that we can restore the FLEX_NO_BACKUP check for
psqlscanslash.l.  We might need to think of another way to do that check
if we ever need to build two lexers with that property in the same source
directory, but there's no foreseeable reason to need that.
2016-03-24 20:28:47 -04:00
Tom Lane d65bea26a8 Move psql's print.c and mbprint.c into src/fe_utils.
Just turning the crank ...
2016-03-24 18:27:28 -04:00
Tom Lane 0ecd3fedfc Add missed inclusion requirement in Mkvcbuild.pm.
Per buildfarm.
2016-03-24 17:12:40 -04:00
Tom Lane 588d963b00 Create src/fe_utils/, and move stuff into there from pg_dump's dumputils.
Per discussion, we want to create a static library and put the stuff into
it that until now has been shared across src/bin/ directories by ad-hoc
methods like symlinking a source file.  This commit creates the library and
populates it with a couple of files that contain the widely-useful portions
of pg_dump's dumputils.c file.  dumputils.c survives, because it has some
stuff that didn't seem appropriate for fe_utils, but it's significantly
smaller and is no longer referenced from any other directory.

Follow-on patches will move more stuff into fe_utils.

The Mkvcbuild.pm hacking here is just a best guess; we'll see how the
buildfarm likes it.
2016-03-24 15:55:57 -04:00
Tom Lane 2c6af4f442 Move keywords.c/kwlookup.c into src/common/.
Now that we have src/common/ for code shared between frontend and backend,
we can get rid of (most of) the klugy ways that the keyword table and
keyword lookup code were formerly shared between different uses.
This is a first step towards a more general plan of getting rid of
special-purpose kluges for sharing code in src/bin/.

I chose to merge kwlookup.c back into keywords.c, as it once was, and
always has been so far as keywords.h is concerned.  We could have
kept them separate, but there is noplace that uses ScanKeywordLookup
without also wanting access to the backend's keyword list, so there
seems little point.

ecpg is still a bit weird, but at least now the trickiness is documented.

I think that the MSVC build script should require no adjustments beyond
what's done here ... but we'll soon find out.
2016-03-23 20:22:08 -04:00
Andres Freund 98a64d0bd7 Introduce WaitEventSet API.
Commit ac1d794 ("Make idle backends exit if the postmaster dies.")
introduced a regression on, at least, large linux systems. Constantly
adding the same postmaster_alive_fds to the OS's internal data structures
for implementing poll/select can cause significant contention, leading
to a performance regression of nearly 3x in one example.

This can be avoided by using e.g. linux' epoll, which avoids having to
add/remove file descriptors to the wait datastructures at a high rate.
Unfortunately the current latch interface makes it hard to allocate any
persistent per-backend resources.

Replace, with a backward compatibility layer, WaitLatchOrSocket with a
new WaitEventSet API. Users can allocate such a Set across multiple
calls, and add more than one file-descriptor to wait on. The latter has
been added because there's upcoming postgres features where that will be
helpful.

In addition to the previously existing poll(2), select(2),
WaitForMultipleObjects() implementations also provide an epoll_wait(2)
based implementation to address the aforementioned performance
problem. Epoll is only available on linux, but that is the most likely
OS for machines large enough (four sockets) to reproduce the problem.

To actually address the aforementioned regression, create and use a
long-lived WaitEventSet for FE/BE communication.  There are additional
places that would benefit from a long-lived set, but that's a task for
another day.

Thanks to Amit Kapila, who helped make the windows code I blindly wrote
actually work.

Reported-By: Dmitry Vasilyev
Discussion: CAB-SwXZh44_2ybvS5Z67p_CDz=XFn4hNAD=CnMEF+QqkXwFrGg@mail.gmail.com
    20160114143931.GG10941@awork2.anarazel.de
2016-03-21 12:22:54 +01:00
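A Linux-only sketch of why a long-lived set helps: the descriptors are
registered with the kernel once, and each subsequent wait is a single
epoll_wait() call, instead of rebuilding the poll/select interest list every
time. This shows the underlying idea, not the WaitEventSet code itself:

    #include <sys/epoll.h>

    static int
    wait_once(int epfd)
    {
        struct epoll_event ev;
        int     n = epoll_wait(epfd, &ev, 1, -1);   /* block until ready */

        return (n == 1) ? ev.data.fd : -1;
    }

    int
    example(int client_fd, int postmaster_fd)
    {
        int     epfd = epoll_create1(0);            /* long-lived set */
        struct epoll_event ev = {0};

        ev.events = EPOLLIN;
        ev.data.fd = client_fd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, client_fd, &ev);     /* register once */

        ev.data.fd = postmaster_fd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, postmaster_fd, &ev);

        /* many waits later, no further EPOLL_CTL_ADD calls are needed */
        return wait_once(epfd);
    }
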
Andres Freund 72e2d21c12 Combine win32 and unix latch implementations.
Previously latches for windows and unix had been implemented in
different files. A later patch introduce an expanded wait
infrastructure, keeping the implementation separate would introduce too
much duplication.

This basically just moves the functions, without too much change. The
reason to keep this separate is that it allows blame to continue working
a little less badly; and to make review a tiny bit easier.

Discussion: 20160114143931.GG10941@awork2.anarazel.de
2016-03-21 11:03:26 +01:00
Andres Freund 326d73c86f Second attempt at fixing MSVC build for 68ab8e8ba4.
After the previous fix in 6f1f34c9 msvc ended up looking for psqlscan.c
in the wrong directory.

David's fix just forces the path to be adjusted. That's not a
particularly pretty fix, but it hopefully will make the buildfarm green
again.

Author: David Rowley
Discussion: CAKJS1f_9CCi_t+LEgV5GWoCj3wjavcMoDc5qfcf_A0UwpQoPoA@mail.gmail.com
2016-03-21 10:49:45 +01:00
Tom Lane 6f1f34c92b Best-guess attempt at fixing MSVC build for 68ab8e8ba4.
pgbench now needs to use src/bin/psql/psqlscan.l, but it's not very clear
how to fit that into the MSVC build system.  If this doesn't work I'm going
to need some help from somebody who actually understands those scripts ...
2016-03-20 17:51:54 -04:00
Andrew Dunstan 5d03201056 Remove dependency on psed for MSVC builds.
Modern Perl has removed psed from its core distribution, so it might not
be readily available on some build platforms. We therefore replace its
use with a Perl script generated by s2p, which is equivalent to the sed
script. The latter is retained for non-MSVC builds to avoid creating a
new hard dependency on Perl for non-Windows tarball builds.

Backpatch to all live branches.

Michael Paquier and me.
2016-03-19 18:36:35 -04:00
Tom Lane 0ea9efbe9e Split psql's lexer into two separate .l files for SQL and backslash cases.
This gets us to a point where psqlscan.l can be used by other frontend
programs for the same purpose psql uses it for, ie to detect when it's
collected a complete SQL command from input that is divided across
line boundaries.  Moreover, other programs can supply their own lexers
for backslash commands of their own choosing.  A follow-on patch will
use this in pgbench.

The end result here is roughly the same as in Kyotaro Horiguchi's
0001-Make-SQL-parser-part-of-psqlscan-independent-from-ps.patch, although
the details of the method for switching between lexers are quite different.
Basically, in this patch we share the entire PsqlScanState, YY_BUFFER_STATE
stack, *and* yyscan_t between different lexers.  The only thing we need
to do to switch to a different lexer is to make sure the start_state is
valid for the new lexer.  This works because flex doesn't keep any other
persistent state that depends on the specific lexing tables generated for
a particular .l file.  (We are assuming that both lexers are built with
the same flex version, or at least versions that are compatible with
respect to the contents of yyscan_t; but that doesn't seem likely to
be a big problem in practice, considering how slowly flex changes.)

Aside from being more efficient than Horiguchi-san's original solution,
this avoids possible corner-case changes in semantics: the original code
was capable of popping the input buffer stack while still staying in
backslash-related parsing states.  I'm not sure that that equates to any
useful user-visible behaviors, but I'm not sure it doesn't either, so
I'm loath to assume that we only need to consider the topmost buffer when
parsing a backslash command.

I've attempted to update the MSVC build scripts for the added .l file,
but will rely on the buildfarm to see if I missed anything.

Kyotaro Horiguchi and Tom Lane
2016-03-19 00:24:55 -04:00
Andres Freund 9cd00c457e Checkpoint sorting and balancing.
Up to now checkpoints were written in the order they're in the
BufferDescriptors. That's nearly random in a lot of cases, which
performs badly on rotating media, but even on SSDs it causes slowdowns.

To avoid that, sort checkpoints before writing them out. We currently
sort by tablespace, relfilenode, fork and block number.

One of the major reasons that previously wasn't done was fear of
imbalance between tablespaces. To address that, balance writes between
tablespaces.

The other prime concern was that the relatively large allocation to sort
the buffers in might fail, preventing checkpoints from happening. Thus
pre-allocate the required memory in shared memory, at server startup.

This particularly makes it more efficient to have checkpoint flushing
enabled, because that'll often result in a lot of writes that can be
coalesced into one flush.

Discussion: alpine.DEB.2.10.1506011320000.28433@sto
Author: Fabien Coelho and Andres Freund
2016-03-10 17:05:09 -08:00
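The sort key described above, written as a plain qsort() comparator over a
simplified buffer tag; the real code sorts checkpoint-private items and then
interleaves the per-tablespace write streams to keep them balanced:

    #include <stdlib.h>
    #include <stdint.h>

    typedef struct BufTag
    {
        uint32_t tablespace;
        uint32_t relnode;
        int      fork;
        uint32_t block;
    } BufTag;

    static int
    ckpt_buforder_cmp(const void *pa, const void *pb)
    {
        const BufTag *a = pa;
        const BufTag *b = pb;

        if (a->tablespace != b->tablespace)
            return a->tablespace < b->tablespace ? -1 : 1;
        if (a->relnode != b->relnode)
            return a->relnode < b->relnode ? -1 : 1;
        if (a->fork != b->fork)
            return a->fork < b->fork ? -1 : 1;
        if (a->block != b->block)
            return a->block < b->block ? -1 : 1;
        return 0;
    }

    /* usage: qsort(tags, ntags, sizeof(BufTag), ckpt_buforder_cmp); */
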
Andres Freund 428b1d6b29 Allow to trigger kernel writeback after a configurable number of writes.
Currently writes to the main data files of postgres all go through the
OS page cache. This means that some operating systems can end up
collecting a large number of dirty buffers in their respective page
caches.  When these dirty buffers are flushed to storage rapidly, be it
because of fsync(), timeouts, or dirty ratios, latency for other reads
and writes can increase massively.  This is the primary reason for
regular massive stalls observed in real world scenarios and artificial
benchmarks; on rotating disks stalls on the order of hundreds of seconds
have been observed.

On Linux it is possible to control this by reducing the global dirty
limits significantly, which mitigates the above problem. But global
configuration is rather problematic because it'll affect other
applications; also, PostgreSQL itself doesn't always want this behavior,
e.g. for temporary files it's undesirable.

Several operating systems allow some control over the kernel page
cache. Linux has sync_file_range(2), several posix systems have msync(2)
and posix_fadvise(2). sync_file_range(2) is preferable because it
requires no special setup, whereas msync() requires the to-be-flushed
range to be mmap'ed. For the purpose of flushing dirty data
posix_fadvise(2) is the worst alternative, as flushing dirty data is
just a side-effect of POSIX_FADV_DONTNEED, which also removes the pages
from the page cache.  Thus the feature is enabled by default only on
Linux, but can be enabled on all systems that have any of the above
APIs.

While desirable and likely possible this patch does not contain an
implementation for windows.

With the infrastructure added, writes made via checkpointer, bgwriter
and normal user backends can be flushed after a configurable number of
writes. Each of these sources of writes is controlled by a separate GUC,
checkpointer_flush_after, bgwriter_flush_after and backend_flush_after
respectively; they're separate because the number of writes after which
flushing is worthwhile differs between them, and because the performance
considerations of controlled flushing are different for each.

A later patch will add checkpoint sorting - after that, flushes from the
checkpoint will almost always be desirable. Bgwriter flushes are most of
the time going to be random, which is slow on lots of storage hardware.
Flushing in backends works well if the storage and bgwriter can keep up,
but if not it can have negative consequences.  This patch is likely to
have negative performance consequences without checkpoint sorting, but
unfortunately so does sorting without flush control.
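
A minimal, Linux-only sketch of the idea (illustrative threshold and helper,
assuming roughly sequential writes to a single file; not the actual
checkpointer or bgwriter code):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <unistd.h>

    #define FLUSH_AFTER 32          /* illustrative write count, cf. *_flush_after GUCs */

    static int   writes_since_hint = 0;
    static off_t range_start = 0;
    static off_t range_end = 0;

    /* Write one block and, every FLUSH_AFTER writes, ask the kernel to start
     * writeback for the accumulated range.  SYNC_FILE_RANGE_WRITE only initiates
     * writeback; it does not wait for completion and, unlike
     * posix_fadvise(POSIX_FADV_DONTNEED), does not evict pages from the cache. */
    void
    write_and_maybe_hint(int fd, const void *buf, size_t len, off_t offset)
    {
        (void) pwrite(fd, buf, len, offset);

        if (writes_since_hint == 0)
            range_start = offset;
        range_end = offset + (off_t) len;

        if (++writes_since_hint >= FLUSH_AFTER)
        {
            (void) sync_file_range(fd, range_start, range_end - range_start,
                                   SYNC_FILE_RANGE_WRITE);
            writes_since_hint = 0;
        }
    }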

Discussion: alpine.DEB.2.10.1506011320000.28433@sto
Author: Fabien Coelho and Andres Freund
2016-03-10 17:04:34 -08:00
Andres Freund 2f1f443930 Add valgrind suppressions for python code.
Python's allocator does some low-level tricks for efficiency;
unfortunately they trigger valgrind errors. Those tricks can be disabled,
making instrumentation easier; but few people testing postgres will have
such a build of python. So add broad suppressions of the resulting
errors.

See also https://svn.python.org/projects/python/trunk/Misc/README.valgrind

This possibly will suppress valid errors, but without it it's basically
impossible to use valgrind with plpython code.
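
Purely as an illustration of the suppression format (the name and frame below
are hypothetical; the real entries in src/tools/valgrind.supp differ):

    # Illustrative suppression only; real entries match Python allocator frames.
    {
       plpython_allocator_false_positive
       Memcheck:Cond
       ...
       fun:PyObject_Free
    }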

Author: Andres Freund
Backpatch: 9.4, where we started to maintain valgrind suppressions
2016-03-08 19:40:58 -08:00
Andres Freund 5e43bee830 Add valgrind suppressions for bootstrap related code.
Author: Andres Freund
Backpatch: 9.4, where we started to maintain valgrind suppressions
2016-03-08 19:40:58 -08:00
Joe Conway dc7d70ea05 Expose control file data via SQL accessible functions.
Add four new SQL accessible functions: pg_control_system(),
pg_control_checkpoint(), pg_control_recovery(), and pg_control_init()
which expose a subset of the control file data.

Along the way move the code to read and validate the control file to
src/common, where it can be shared by the new backend functions
and the original pg_controldata frontend program.
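
For example, the checkpoint-related portion of the control file then becomes
readable from SQL:

    SELECT * FROM pg_control_checkpoint();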

Patch by me, significant input, testing, and review by Michael Paquier.
2016-03-05 11:10:19 -08:00
Teodor Sigaev 0e7557dc8d Fix Windows build broken by d78a7d9c7f 2016-03-04 21:36:49 +03:00
Alvaro Herrera 52fe6f4e02 Add 'tap_tests' flag in config_default.pl
This makes the flag more visible for testers using the default file as a
template, increasing the likelihood that the test suite will be run.
Also have the flag be displayed in the fake "configure" output, if set.

This patch is two new lines only, but perltidy decides to shift things
around, which makes it appear a bit bigger.

Author: Michaël Paquier
Reviewed-by: Craig Ringer
Discussion: https://www.postgresql.org/message-id/CAB7nPqRet6UAP2APhZAZw%3DVhJ6w-Q-gGLdZkrOqFgd2vc9-ZDw%40mail.gmail.com
2016-03-04 13:04:53 -03:00
Alvaro Herrera 5847397dec Minor tweaks for new src/test/recovery
Author: Michael Paquier
2016-02-29 18:16:59 -03:00
Alvaro Herrera 49148645f7 Add a test framework for recovery
This long-awaited framework is an expansion of the existing PostgresNode
stuff to support additional features for recovery testing; the recovery
tests included in this commit are a starting point that cover some of
the recovery features we have.  More scripts are expected to be added
later.

Author: Michaël Paquier, a bit of help from Amir Rohan
Reviewed by: Amir Rohan, Stas Kelvich, Kyotaro Horiguchi, Victor Wagner,
Craig Ringer, Álvaro Herrera
Discussion: http://www.postgresql.org/message-id/CAB7nPqTf7V6rswrFa=q_rrWeETUWagP=h8LX8XAov2Jcxw0DRg@mail.gmail.com
Discussion: http://www.postgresql.org/message-id/trinity-b4a8035d-59af-4c42-a37e-258f0f28e44a-1443795007012@3capp-mailcom-lxa08
2016-02-26 16:13:30 -03:00
Noah Misch 4163588783 MSVC: Clean tmp_check directory of pg_controldata test suite.
Back-patch to 9.4, where the suite was introduced.
2016-02-24 23:41:33 -05:00
Noah Misch 5882ca6686 Call xlc __isync() after, not before, associated compare-and-swap.
Architecture reference material specifies this order, and s_lock.h
inline assembly agrees.  The former order failed to provide mutual
exclusion to lwlock.c and perhaps to other clients.  The two xlc
buildfarm members, hornet and mandrill, have failed sixteen times with
duplicate key errors involving pg_class_oid_index or pg_type_oid_index.
Back-patch to 9.5, where commit b64d92f1a5
introduced atomics.

Reviewed by Andres Freund and Tom Lane.
2016-02-19 22:47:50 -05:00
Joe Conway a5c43b8869 Add new system view, pg_config
Move and refactor the underlying code for the pg_config client
application to src/common in support of sharing it with a new
system information SRF called pg_config() which makes the same
information available via SQL. Additionally wrap the SRF with a
new system view, likewise called pg_config.
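
For example, the same information the pg_config client prints becomes
queryable:

    SELECT * FROM pg_config;      -- the view
    SELECT * FROM pg_config();    -- or the underlying SRF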

Patch by me with extensive input and review by Michael Paquier
and additional review by Alvaro Herrera.
2016-02-17 09:12:06 -08:00
Robert Haas c1772ad922 Change the way that LWLocks for extensions are allocated.
The previous RequestAddinLWLocks() method had several disadvantages.
First, the locks would be in the main tranche; we've recently decided
that it's useful for LWLocks used for separate purposes to have
separate tranche IDs.  Second, there wasn't any correlation between
what code called RequestAddinLWLocks() and what code called
LWLockAssign(); when multiple modules are in use, it could become
quite difficult to troubleshoot problems where LWLockAssign() ran out
of locks.  To fix, create a concept of named LWLock tranches which
can be used either by extension or by core code.
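
A sketch of how an extension might use this; the function names
(RequestNamedLWLockTranche, GetNamedLWLockTranche) are assumed here rather
than taken from this message:

    #include "postgres.h"
    #include "fmgr.h"
    #include "miscadmin.h"
    #include "storage/ipc.h"
    #include "storage/lwlock.h"

    PG_MODULE_MAGIC;

    static LWLock *my_lock;
    static void my_shmem_startup(void);

    void
    _PG_init(void)
    {
        if (!process_shared_preload_libraries_in_progress)
            return;
        /* Ask for one lock in a tranche named after the extension. */
        RequestNamedLWLockTranche("my_extension", 1);
        shmem_startup_hook = my_shmem_startup;   /* chaining to a previous hook omitted */
    }

    static void
    my_shmem_startup(void)
    {
        /* Look the lock up by the same tranche name once shared memory exists. */
        my_lock = &(GetNamedLWLockTranche("my_extension"))[0].lock;
    }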

Amit Kapila and Robert Haas
2016-02-04 16:43:04 -05:00
Bruce Momjian 216d568432 Properly install dynloader.h on MSVC builds
This will enable PL/Java to be cleanly compiled, as dynloader.h is a
requirement.

Report by Chapman Flack

Patch by Michael Paquier

Backpatch through 9.1
2016-01-19 23:30:29 -05:00
Tom Lane 65c5fcd353 Restructure index access method API to hide most of it at the C level.
This patch reduces pg_am to just two columns, a name and a handler
function.  All the data formerly obtained from pg_am is now provided
in a C struct returned by the handler function.  This is similar to
the designs we've adopted for FDWs and tablesample methods.  There
are multiple advantages.  For one, the index AM's support functions
are now simple C functions, making them faster to call and much less
error-prone, since the C compiler can now check function signatures.
For another, this will make it far more practical to define index access
methods in installable extensions.

A disadvantage is that SQL-level code can no longer see attributes
of index AMs; in particular, some of the crosschecks in the opr_sanity
regression test are no longer possible from SQL.  We've addressed that
by adding a facility for the index AM to perform such checks instead.
(Much more could be done in that line, but for now we're content if the
amvalidate functions more or less replace what opr_sanity used to do.)
We might also want to expose some sort of reporting functionality, but
this patch doesn't do that.
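
Schematically (generic names only, not the actual pg_am or IndexAmRoutine
declarations), the catalog row now points at a handler, and the handler
returns a struct of fixed properties and C function pointers:

    #include <stdbool.h>

    /* Schematic only: illustrative names, not the real index-AM structs. */
    typedef struct ExampleAmRoutine
    {
        int   amstrategies;                             /* fixed properties, formerly pg_am columns */
        bool  amcanorder;
        bool (*amvalidate) (unsigned int opclassoid);   /* C-level checks replacing opr_sanity SQL */
        /* ...scan, insert and build entry points would follow... */
    } ExampleAmRoutine;

    static bool
    example_validate(unsigned int opclassoid)
    {
        (void) opclassoid;
        return true;              /* a real AM checks the operator class contents here */
    }

    /* The reduced catalog row stores just a name and this handler. */
    const ExampleAmRoutine *
    example_am_handler(void)
    {
        static const ExampleAmRoutine routine = {
            .amstrategies = 4,
            .amcanorder = false,
            .amvalidate = example_validate,
        };
        return &routine;
    }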

Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily
editorialized on by me.
2016-01-17 19:36:59 -05:00
Alvaro Herrera a967613911 Windows: Make pg_ctl reliably detect service status
pg_ctl is using isatty() to verify whether the process is running in a
terminal, and if not it sends its output to Windows' Event Log ... which
does the wrong thing when the output has been redirected to a pipe, as
reported in bug #13592.

To fix, make pg_ctl use the code we already have to detect service-ness:
in the master branch, move src/backend/port/win32/security.c to src/port
(with suitable tweaks so that it runs properly in backend and frontend
environments); pg_ctl already has access to pgport so it Just Works.  In
older branches, that's likely to cause trouble, so instead duplicate the
required code in pg_ctl.c.

Author: Michael Paquier
Bug report and diagnosis: Egon Kocjan
Backpatch: all supported branches
2016-01-07 11:59:08 -03:00
Tom Lane de7c8dbea1 Make copyright.pl cope with nonstandard case choices in copyright notices.
The need for this is shown by the files it missed in Bruce's recent run.
I fixed it so that it will actually adjust the case when needed.

In passing, also make it skip .po files, since those will just get
overwritten anyway from the translation repository.
2016-01-02 14:45:21 -05:00
Bruce Momjian ee94300446 Update copyright for 2016
Backpatch certain files through 9.1
2016-01-02 13:33:40 -05:00
Tom Lane 870df2b3b7 Fix omission of -X (--no-psqlrc) in some psql invocations.
As of commit d5563d7df, psql -c no longer implies -X, but not all of
our regression testing scripts had gotten that memo.

To ensure consistency of results across different developers, make
sure that *all* invocations of psql in all scripts in our tree
use -X, even where this is not what previously happened.

Michael Paquier and Tom Lane
2015-12-28 11:46:43 -05:00
Tom Lane c4a8812cf6 Use just one standalone-backend session for initdb's post-bootstrap steps.
Previously, each subroutine in initdb fired up its own standalone backend
session.  Over time we'd grown as many as fifteen of these sessions,
and the cumulative startup and shutdown work for them was getting pretty
noticeable.  Combining things so that all these steps share a single
backend session cuts a good 10% off the total runtime of initdb, more
if you're not fsync'ing.

The main stumbling block to doing this before was that some of the sessions
were run with -j and some not.  The improved definition of -j mode
implemented by my previous commit makes it possible to fix that by running
all the post-bootstrap steps with -j; we just have to use double instead of
single newlines to end command strings.  (This is only absolutely necessary
around the VACUUM and CREATE DATABASE steps, since those can't be run in a
transaction block.  But it seems best to make them all use double newlines
so that the commands remain separate for error-reporting purposes.)

A minor disadvantage is that since initdb can't tell how much of its
output the backend has executed, we can no longer have the per-step
progress reporting initdb used to print.  But things are fast enough
nowadays that that's not really all that useful anyway.

In passing, add more const decoration to some of the static arrays in
initdb.c.
2015-12-17 19:38:21 -05:00
Robert Haas 6150a1b08a Move buffer I/O and content LWLocks out of the main tranche.
Move the content lock directly into the BufferDesc, so that locking and
pinning a buffer touches only one cache line rather than two.  Adjust
the definition of BufferDesc slightly so that this doesn't make the
BufferDesc any larger than one cache line (at least on platforms where
a spinlock is only 1 or 2 bytes).

We can't fit the I/O locks into the BufferDesc and stay within one
cache line, so move those to a completely separate tranche.  This
leaves a relatively limited number of LWLocks in the main tranche, so
increase the padding of those remaining locks to a full cache line,
rather than allowing adjacent locks to share a cache line, hopefully
reducing false sharing.

Performance testing shows that these changes make little difference
on laptop-class machines, but help significantly on larger servers,
especially those with more than 2 sockets.
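
The padding technique itself, as a generic sketch (illustrative fields and a
hard-coded 64-byte line size):

    #include <stdint.h>

    #define CACHE_LINE_SIZE 64          /* illustrative; varies by platform */

    typedef struct ExampleBufferDesc
    {
        uint32_t tag_hash;              /* stand-ins for the real fields */
        uint32_t state;
        int32_t  content_lock;          /* content lock kept inside the descriptor */
    } ExampleBufferDesc;

    /* Pad each descriptor to a full cache line so neighbouring descriptors
     * (and their locks) never share a line, reducing false sharing. */
    typedef union ExampleBufferDescPadded
    {
        ExampleBufferDesc desc;
        char              pad[CACHE_LINE_SIZE];
    } ExampleBufferDescPadded;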

Andres Freund, originally based on an earlier patch by Simon Riggs.
Review and cosmetic adjustments (including heavy rewriting of the
comments) by me.
2015-12-15 13:32:54 -05:00
Tom Lane 9c779c49e3 Accept flex > 2.5.x on Windows, too.
Commit 32f15d05c fixed this in configure, but missed the similar check
in the MSVC scripts.

Michael Paquier, per report from Victor Wagner
2015-12-10 10:19:13 -05:00
Andrew Dunstan f11c557e92 fix a perl typo 2015-11-19 02:42:02 -05:00
Andrew Dunstan d835dd6685 Improve vcregress.pl's handling of tap tests for client programs
The target is now named 'bincheck' rather than 'tapcheck' so that it
reflects what is checked instead of the test mechanism. Some of the
logic is improved, making it easier to add further sets of TAP-based
tests in future. Also, the environment setting logic is improved.

As discussed on -hackers a couple of months ago.
2015-11-18 22:47:41 -05:00
Robert Haas 6e71dd7ce9 Modify tqueue infrastructure to support transient record types.
Commit 4a4e6893aa, which introduced this
mechanism, failed to account for the fact that the RECORD pseudo-type
uses transient typmods that are only meaningful within a single
backend.  Transferring such tuples without modification between two
cooperating backends does not work.  This commit installs a system
for passing the tuple descriptors over the same shm_mq being used to
send the tuples themselves.  The two sides might not assign the same
transient typmod to any given tuple descriptor, so we must also
substitute the appropriate receiver-side typmod for the one used by
the sender.  That adds some CPU overhead, but still seems better than
being unable to pass records between cooperating parallel processes.

Along the way, move the logic for handling multiple tuple queues from
tqueue.c to nodeGather.c; tqueue.c now provides a TupleQueueReader,
which reads from a single queue, rather than a TupleQueueFunnel, which
potentially reads from multiple queues.  This change was suggested
previously as a way to make sure that nodeGather.c rather than tqueue.c
had policy control over the order in which to read from queues, but
it wasn't clear to me until now how good an idea it was.  typmod
mapping needs to be performed separately for each queue, and it is
much simpler if the tqueue.c code handles that and leaves multiplexing
multiple queues to higher layers of the stack.
2015-11-06 16:58:45 -05:00
Robert Haas fd5eaad715 Correct pg_indent to pgindent in various comments.
David Christensen
2015-10-08 12:27:54 -04:00