Commit Graph

1649 Commits

Author SHA1 Message Date
Andres Freund 558a9165e0 Compute XID horizon for page level index vacuum on primary.
Previously the xid horizon was only computed during WAL replay. That
had two major problems:
1) It relied on knowing what the table pointed to by the index looks
   like. That was easy enough before the introduction of tableam (we
   knew it had to be heap, although some trickery around logging the
   heap relfilenodes was required). But to properly handle table AMs we
   need per-database catalog access to look up the AM handler, which
   recovery doesn't allow.
2) Not knowing the xid horizon also makes it hard to support logical
   decoding on standbys. When on a catalog table, we need to be able
   to conflict with slots that have an xid horizon that's too old.
   Computing the horizon by visiting the heap only works once
   consistency is reached, but we always need to be able to detect
   conflicts.

There's also a secondary problem, in that the current method performs
redundant work on every standby. But fixing that is counterbalanced by
the primary potentially computing the value when it's not necessary
(either because there's no standby, or because there are no connected
backends).

Solve 1) and 2) by moving computation of the xid horizon to the
primary and by involving tableam in the computation of the horizon.

To address the potentially increased overhead, increase the efficiency
of the xid horizon computation for heap by sorting the tids and
eliminating redundant buffer accesses. When prefetching is available,
additionally prefetch the buffers.  As this is more of a maintenance
task, rather than something routinely done in every read-only query,
we add an arbitrary 10 to the effective concurrency - thereby using IO
concurrency even when it's not globally enabled.  That's possibly not
the perfect formula, but it seems good enough for now.
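
As a rough sketch of the sort-and-prefetch idea (illustrative only, not
the committed code; the helper name is hypothetical, while
ItemPointerCompare and PrefetchBuffer are existing primitives):

    #include "postgres.h"
    #include "storage/bufmgr.h"
    #include "storage/itemptr.h"
    #include "utils/rel.h"

    static int
    tid_cmp(const void *a, const void *b)
    {
        return ItemPointerCompare((ItemPointer) a, (ItemPointer) b);
    }

    /* Sort TIDs so accesses are ordered, then prefetch each block once. */
    static void
    prefetch_sorted_tids(Relation rel, ItemPointerData *tids, int ntids)
    {
        BlockNumber last = InvalidBlockNumber;

        qsort(tids, ntids, sizeof(ItemPointerData), tid_cmp);
        for (int i = 0; i < ntids; i++)
        {
            BlockNumber blk = ItemPointerGetBlockNumber(&tids[i]);

            if (blk != last)            /* skip redundant page accesses */
                PrefetchBuffer(rel, MAIN_FORKNUM, blk);
            last = blk;
        }
    }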

Bumps WAL format, as latestRemovedXid is now part of the records, and
the heap's relfilenode isn't anymore.

Author: Andres Freund, Amit Khandekar, Robert Haas
Reviewed-By: Robert Haas
Discussion:
    https://postgr.es/m/20181212204154.nsxf3gzqv3gesl32@alap3.anarazel.de
    https://postgr.es/m/20181214014235.dal5ogljs3bmlq44@alap3.anarazel.de
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-03-26 16:52:54 -07:00
Andres Freund 5db6df0c01 tableam: Add tuple_{insert, delete, update, lock} and use.
This adds new, required, table AM callbacks for insert/delete/update
and lock_tuple. To be able to reasonably use those, the EvalPlanQual
mechanism had to be adapted, moving more logic into the AM.

Previously both delete/update/lock call-sites and the EPQ mechanism had
to have awareness of the specific tuple format to be able to fetch the
latest version of a tuple. Obviously that needs to be abstracted
away. To do so, move the logic that finds the latest row version into
the AM. lock_tuple has a new flag argument,
TUPLE_LOCK_FLAG_FIND_LAST_VERSION, that forces it to lock the last
version, rather than the current one.  It'd have been possible to do
so via a separate callback as well, but finding the last version
usually also necessitates locking the newest version, making it
sensible to combine the two. This replaces the previous use of
EvalPlanQualFetch().  Additionally HeapTupleUpdated, which previously
signaled either a concurrent update or delete, is now split into two,
to avoid callers needing AM specific knowledge to differentiate.

The move of finding the latest row version into tuple_lock means that
encountering a row concurrently moved into another partition will now
raise an error about "tuple to be locked" rather than "tuple to be
updated/deleted" - which is accurate, as that always happens when
locking rows. While possibly slightly less helpful for users, it seems
like an acceptable trade-off.

As part of this commit HTSU_Result has been renamed to TM_Result, and
its members have been expanded to differentiate between updating and
deleting. HeapUpdateFailureData has been renamed to TM_FailureData.
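
The renamed enum looks roughly like this (abridged sketch of the
tableam.h definition):

    typedef enum TM_Result
    {
        TM_Ok,              /* operation performed */
        TM_Invisible,       /* tuple was never visible to us */
        TM_SelfModified,    /* modified by the current command */
        TM_Updated,         /* concurrently updated (was HeapTupleUpdated) */
        TM_Deleted,         /* concurrently deleted (was HeapTupleUpdated) */
        TM_BeingModified,   /* being modified by another transaction */
        TM_WouldBlock       /* lock couldn't be acquired without blocking */
    } TM_Result;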

The interface to speculative insertion is changed so nodeModifyTable.c
does not have to set the speculative token itself anymore. Instead
there's a version of tuple_insert, tuple_insert_speculative, that
performs the speculative insertion (without requiring a flag to signal
that fact), and the speculative insertion is either made permanent
with table_complete_speculative(succeeded = true) or aborted with
succeeded = false.
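
The resulting flow in nodeModifyTable.c is roughly the following sketch
(using the names as described here, some of which were renamed in later
releases; rel, slot, cid and the conflict check are placeholders):

    /* Sketch: speculative insertion routed through tableam. */
    uint32  specToken;
    bool    conflicted;

    specToken = SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
    table_insert_speculative(rel, slot, cid, 0, NULL, specToken);

    conflicted = /* ... recheck the unique index for a conflict ... */ false;

    /* make the insertion permanent, or abort it */
    table_complete_speculative(rel, slot, specToken, !conflicted);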

Note that multi_insert is not yet routed through tableam, nor is
COPY. Changing multi_insert requires changes to copy.c that are large
enough to better be done separately.

Similarly, although simpler, CREATE TABLE AS and CREATE MATERIALIZED
VIEW are also only going to be adjusted in a later commit.

Author: Andres Freund and Haribabu Kommi
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20190313003903.nwvrxi7rw3ywhdel@alap3.anarazel.de
    https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
2019-03-23 19:55:57 -07:00
Michael Paquier ed308d7837 Add options to enable and disable checksums in pg_checksums
An offline cluster can now work with more modes in pg_checksums:
- --enable enables checksums in a cluster, updating all blocks with a
correct checksum, and updating the control file at the end.
- --disable disables checksums in a cluster, updating only the control
file.
- --check is an extra option able to verify checksums for a cluster, and
the default used if no mode is specified.

When running --enable or --disable, the data folder gets fsync'd for
durability, and then it is followed by a control file update and flush
to keep the operation consistent should the tool be interrupted, killed
or the host unplugged.  If no mode is specified in the options, then
--check is used for compatibility with older versions of pg_checksums
(named pg_verify_checksums in v11 where it was introduced).

Author: Michael Banck, Michael Paquier
Reviewed-by: Fabien Coelho, Magnus Hagander, Sergei Kornilov
Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
2019-03-23 08:12:55 +09:00
Alexander Korotkov 550b9d26f8 Get rid of jsonpath_gram.h and jsonpath_scanner.h
Jsonpath grammar and scanner are both quite small.  It isn't worth the
complexity to compile them separately.  This commit makes the grammar
and scanner be compiled at once.  Therefore, jsonpath_gram.h and
jsonpath_scanner.h are no longer needed.  This commit also does some
reorganization of code in jsonpath_gram.y.

Discussion: https://postgr.es/m/d47b2023-3ecb-5f04-d253-d557547cf74f%402ndQuadrant.com
2019-03-20 11:13:34 +03:00
Alexander Korotkov 72b6460336 Partial implementation of SQL/JSON path language
The SQL:2016 standard, among other things, contains a set of SQL/JSON
features for JSON processing inside a relational database.  The core of
SQL/JSON is the JSON path language, which allows accessing parts of
JSON documents and making computations over them.  This commit
implements partial support for the JSON path language as a separate
datatype called "jsonpath".  The implementation is partial because it
lacks datetime support and suppression of numeric errors.  Missing
features will be added later in separate commits.

Support for SQL/JSON features requires implementation of separate
nodes, which will be considered in subsequent patches.  This commit
includes the following set of plain functions for executing jsonpath
over jsonb values:

 * jsonb_path_exists(jsonb, jsonpath[, jsonb, bool])
 * jsonb_path_match(jsonb, jsonpath[, jsonb, bool])
 * jsonb_path_query(jsonb, jsonpath[, jsonb, bool])
 * jsonb_path_query_array(jsonb, jsonpath[, jsonb, bool])
 * jsonb_path_query_first(jsonb, jsonpath[, jsonb, bool])

This commit also implements "jsonb @? jsonpath" and "jsonb @@ jsonpath",
which are wrappers over jsonpath_exists(jsonb, jsonpath) and
jsonpath_predicate(jsonb, jsonpath) respectively.  These operators will
have index support (implemented in subsequent patches).

Catversion bumped, to add new functions and operators.

Code was written by Nikita Glukhov and Teodor Sigaev, revised by me.
Documentation was written by Oleg Bartunov and Liudmila Mantrova.  The work
was inspired by Oleg Bartunov.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Author: Nikita Glukhov, Teodor Sigaev, Alexander Korotkov, Oleg Bartunov, Liudmila Mantrova
Reviewed-by: Tomas Vondra, Andrew Dunstan, Pavel Stehule, Alexander Korotkov
2019-03-16 12:16:48 +03:00
Tom Lane a6417078c4 Create a script that can renumber manually-assigned OIDs.
This commit adds a Perl script renumber_oids.pl, which can reassign a
range of manually-assigned OIDs to someplace else by modifying OID
fields of the catalog *.dat files and OID-assigning macros in the
catalog *.h files.

Up to now, we've encouraged new patches that need manually-assigned
OIDs to use OIDs just above the range of existing OIDs.  Predictably,
this leads to patches stepping on each others' toes, as whichever
one gets committed first creates an OID conflict that other patch
author(s) have to resolve manually.  With the availability of
renumber_oids.pl, we can eliminate a lot of this hassle.
The new project policy, therefore, is:

* Encourage new patches to use high OIDs (the documentation suggests
choosing a block of OIDs at random in 8000..9999).

* After feature freeze in each development cycle, run renumber_oids.pl
to move all such OIDs down to lower numbers, thus freeing the high OID
range for the next development cycle.

This plan should greatly reduce the risk of OID collisions between
concurrently-developed patches.  Also, if such a collision happens
anyway, we have the option to resolve it without much effort by doing
an off-schedule OID renumbering to get the first-committed patch out
of the way.  Or a patch author could use renumber_oids.pl to change
their patch's assignments without much pain.

This approach does put a premium on not hard-wiring any OID values
in places where renumber_oids.pl and genbki.pl can't fix them.
Project practice in that respect seems to be pretty good already,
but a follow-on patch will sand down some rough edges.

John Naylor and Tom Lane, per an idea of Peter Geoghegan's

Discussion: https://postgr.es/m/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-=eaochT0dd2BN9CQ@mail.gmail.com
2019-03-12 10:50:48 -04:00
Andres Freund c2fe139c20 tableam: Add and use scan APIs.
To allow table accesses to not depend directly on heap, several
new abstractions are needed. Specifically:

1) Heap scans need to be generalized into table scans. Do this by
   introducing TableScanDesc, which will be the "base class" for
   individual AMs. This contains the AM independent fields from
   HeapScanDesc.

   The previous heap_{beginscan,rescan,endscan} et al. have been
   replaced with a table_ version.

   There's no direct replacement for heap_getnext(), as that returned
   a HeapTuple, which is undesirable for other AMs. Instead there's
   table_scan_getnextslot().  But note that heap_getnext() lives on;
   it's still used widely to access catalog tables.

   This is achieved by new scan_begin, scan_end, scan_rescan,
   scan_getnextslot callbacks.

2) The portion of parallel scans that's shared between backends needs
   to work without the user doing per-AM work. To achieve
   that new parallelscan_{estimate, initialize, reinitialize}
   callbacks are introduced, which operate on a new
   ParallelTableScanDesc, which again can be subclassed by AMs.

   As it is likely that several AMs are going to be block oriented,
   block oriented callbacks that can be shared between such AMs are
   provided and used by heap. table_block_parallelscan_{estimate,
   initialize, reinitialize} as callbacks, and
   table_block_parallelscan_{nextpage, init} for use in AMs. These
   operate on a ParallelBlockTableScanDesc.

3) Index scans need to be able to access tables to return a tuple, and
   there needs to be state kept across individual accesses to the heap,
   e.g. buffers. That's now handled by introducing a
   sort-of-scan IndexFetchTable, which again is intended to be
   subclassed by individual AMs (for heap IndexFetchHeap).

   The relevant callbacks for an AM are index_fetch_{end, begin,
   reset} to create the necessary state, and index_fetch_tuple to
   retrieve an indexed tuple.  Note that index_fetch_tuple
   implementations need to be smarter than just blindly fetching the
   tuples for AMs that have optimizations similar to heap's HOT - the
   currently alive tuple in the update chain needs to be fetched if
   appropriate.

   Similar to table_scan_getnextslot(), it's undesirable to continue
   to return HeapTuples. Thus index_fetch_heap (might want to rename
   that later) now accepts a slot as an argument. Core code doesn't
   have a lot of call sites performing index scans without going
   through the systable_* API (in contrast to loads of heap_getnext
   calls and working directly with HeapTuples).

   Index scans now store the result of a search in
   IndexScanDesc->xs_heaptid, rather than xs_ctup->t_self. As the
   target is not generally a HeapTuple anymore that seems cleaner.

To be able to sensibly adapt code to use the above, two further
callbacks have been introduced:

a) slot_callbacks returns a TupleTableSlotOps* suitable for creating
   slots capable of holding a tuple of the AM's
   type. table_slot_callbacks() and table_slot_create() are based
   upon that, but have additional logic to deal with views, foreign
   tables, etc.

   While this change could have been done separately, nearly all the
   call sites that needed to be adapted for the rest of this commit
   would also have needed to be adapted for
   table_slot_callbacks(), making separation not worthwhile.

b) tuple_satisfies_snapshot checks whether the tuple in a slot is
   currently visible according to a snapshot. That's required as a few
   places now don't have a buffer + HeapTuple around, but a
   slot (which in heap's case internally has that information).

Additionally a few infrastructure changes were needed:

I) SysScanDesc, as used by systable_{beginscan, getnext} et al. now
   internally uses a slot to keep track of tuples. While
   systable_getnext() still returns HeapTuples, and will so for the
   foreseeable future, the index API (see 1) above) now only deals with
   slots.

The remainder, and largest part, of this commit is then adjusting all
scans in postgres to use the new APIs.
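
In practice, adapting a plain sequential scan to the new API looks
roughly like this sketch (rel and snapshot are placeholders; error
handling elided):

    TupleTableSlot *slot = table_slot_create(rel, NULL);
    TableScanDesc scan = table_beginscan(rel, snapshot, 0, NULL);

    while (table_scan_getnextslot(scan, ForwardScanDirection, slot))
    {
        /* process the tuple via the slot, with no HeapTuple assumption */
    }

    table_endscan(scan);
    ExecDropSingleTupleTableSlot(slot);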

Author: Andres Freund, Haribabu Kommi, Alvaro Herrera
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
2019-03-11 12:46:41 -07:00
Tom Lane a0b7626268 Simplify release-note links to back branches.
Now that https://www.postgresql.org/docs/release/ is populated,
replace the stopgap text we had under "Prior Releases" with
a pointer to that archive.

Discussion: https://postgr.es/m/e0f09c9a-bd2b-862a-d379-601dfabc8969@postgresql.org
2019-03-09 18:42:39 -05:00
Andres Freund 8586bf7ed8 tableam: introduce table AM infrastructure.
This introduces the concept of table access methods, i.e. CREATE
  ACCESS METHOD ... TYPE TABLE and
  CREATE TABLE ... USING (storage-engine).
No table access functionality is delegated to table AMs as of this
commit; that'll be done in following commits.

Subsequent commits will incrementally abstract table access
functionality to be routed through table access methods. That change
is too large to be reviewed & committed at once, so it'll be done
incrementally.

Docs will be updated at the end, as adding them incrementally would
likely make them less coherent, and it definitely is a lot more work,
without a lot of benefit.

Table access methods are specified similarly to index access methods,
i.e. pg_am.amhandler returns, as INTERNAL, a pointer to a struct with
callbacks. In contrast to index AMs, that struct needs to live as long
as the backend; typically that's achieved by just returning a pointer
to a constant struct.
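
A minimal handler following that pattern might look like this sketch
("my_tableam" is a hypothetical example, not part of this commit):

    #include "postgres.h"
    #include "access/tableam.h"
    #include "fmgr.h"

    /* callback fields are filled in by the follow-up commits */
    static const TableAmRoutine my_tableam_methods = {
        .type = T_TableAmRoutine,
    };

    PG_FUNCTION_INFO_V1(my_tableam_handler);

    Datum
    my_tableam_handler(PG_FUNCTION_ARGS)
    {
        /* a pointer to a constant struct trivially outlives the backend */
        PG_RETURN_POINTER(&my_tableam_methods);
    }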

Psql's \d+ now displays a table's access method. That can be disabled
with HIDE_TABLEAM=true, which is mainly useful so regression tests can
be run against different AMs.  It's quite possible that this behaviour
still needs to be fine-tuned.

For now it's not allowed to set a table AM for a partitioned table, as
we've not resolved how partitions would inherit that. Disallowing it
now allows us to introduce such behaviour later, without a
compatibility break, if we decide that's the way forward.

Catversion bumped, to add the heap table AM and references to it.

Author: Haribabu Kommi, Andres Freund, Alvaro Herrera, Dimitri Golgov and others
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
    https://postgr.es/m/20190107235616.6lur25ph22u5u5av@alap3.anarazel.de
    https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
2019-03-06 09:54:38 -08:00
Tom Lane 02a6a54ecd Make use of compiler builtins and/or assembly for CLZ, CTZ, POPCNT.
Test for the compiler builtins __builtin_clz, __builtin_ctz, and
__builtin_popcount, and make use of these in preference to
handwritten C code if they're available.  Create src/port
infrastructure for "leftmost one", "rightmost one", and "popcount"
so as to centralize these decisions.
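
The centralized wrappers follow roughly this builtin-with-fallback
shape (simplified sketch; "my_popcount64" is a hypothetical stand-in
for the src/port functions):

    static inline int
    my_popcount64(uint64 word)
    {
    #ifdef HAVE__BUILTIN_POPCOUNT
        return __builtin_popcountll(word);
    #else
        int     count = 0;

        while (word)
        {
            word &= word - 1;   /* clear the lowest set bit */
            count++;
        }
        return count;
    #endif
    }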

On x86_64, __builtin_popcount generally won't make use of the POPCNT
opcode because that's not universally supported yet.  Provide code
that checks CPUID and then calls POPCNT via asm() if available.
This requires indirecting through a function pointer, which is
an annoying amount of overhead for a one-instruction operation,
but it's probably not worth working harder than this for our
current use-cases.

I'm not sure we've found all the existing places that could profit
from this new infrastructure; but we at least touched all the
ones that used copied-and-pasted versions of the bitmapset.c code,
and got rid of multiple copies of the associated constant arrays.

While at it, replace c-compiler.m4's one-per-builtin-function
macros with a single one that can handle all the cases we need
to worry about so far.  Also, because I'm paranoid, make those
checks into AC_LINK checks rather than just AC_COMPILE; the
former coding failed to verify that libgcc has support for the
builtin, in cases where it's not inline code.

David Rowley, Thomas Munro, Alvaro Herrera, Tom Lane

Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
2019-02-15 23:22:33 -05:00
Alvaro Herrera 457aef0f1f Revert attempts to use POPCNT etc instructions
This reverts commits fc6c72747a, 109de05cbb, d0b4663c23 and
711bab1e4d.

Somebody will have to try harder before submitting this patch again.
I've spent entirely too much time on it already, and the #ifdef maze yet
to be written in order for it to build at all got on my nerves.  The
amount of work needed to get a platform-specific performance improvement
that's barely above the noise level is not worth it.
2019-02-15 16:32:30 -03:00
Alvaro Herrera 711bab1e4d Add basic support for using the POPCNT and SSE4.2's LZCNT opcodes
These opcodes have been around in the AMD world since 2007, and 2008 in
the case of Intel.  They're supported in GCC and Clang via some __builtin
macros.  The opcodes may be unavailable during runtime, in which case we
fall back on a C-based implementation of the code.  In order to get the
POPCNT instruction we must pass the -mpopcnt option to the compiler.  We
do this only for the pg_bitutils.c file.

David Rowley (with fragments taken from a patch by Thomas Munro)

Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
2019-02-13 16:10:06 -03:00
Andrew Gierth 02ddd49932 Change floating-point output format for improved performance.
Previously, floating-point output was done by rounding to a specific
decimal precision; by default, to 6 or 15 decimal digits (losing
information) or as requested using extra_float_digits. Drivers that
wanted exact float values, and applications like pg_dump that must
preserve values exactly, set extra_float_digits=3 (or sometimes 2 for
historical reasons, though this isn't enough for float4).

Unfortunately, decimal rounded output is slow enough to become a
noticeable bottleneck when dealing with large result sets or COPY of
large tables when many floating-point values are involved.

Floating-point output can be done much faster when the output is not
rounded to a specific decimal length, but rather is chosen as the
shortest decimal representation that is closer to the original float
value than to any other value representable in the same precision. The
recently published Ryu algorithm by Ulf Adams is both relatively
simple and remarkably fast.
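
The shortest-round-trip property can be illustrated with a slow probe
loop; Ryu computes the same answer directly, without trial and error
(illustration only, not the Ryu code):

    #include <stdio.h>
    #include <stdlib.h>

    /* Find the fewest significant digits for which the decimal text
     * still converts back to exactly the same double. */
    static int
    shortest_digits(double value)
    {
        char    buf[32];

        for (int prec = 1; prec <= 17; prec++)
        {
            snprintf(buf, sizeof(buf), "%.*g", prec, value);
            if (strtod(buf, NULL) == value)
                return prec;    /* round-trips exactly */
        }
        return 17;              /* always sufficient for IEEE double */
    }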

Accordingly, change float4out/float8out to output shortest decimal
representations if extra_float_digits is greater than 0, and make that
the new default. Applications that need rounded output can set
extra_float_digits back to 0 or below, and take the resulting
performance hit.

We make one concession to portability for systems with buggy
floating-point input: we do not output decimal values that fall
exactly halfway between adjacent representable binary values (which
would rely on the reader doing round-to-nearest-even correctly). This
is known to be a problem at least for VS2013 on Windows.

Our version of the Ryu code originates from
https://github.com/ulfjack/ryu/ at commit c9c3fb1979, but with the
following (significant) modifications:

 - Output format is changed to use fixed-point notation for small
   exponents, as printf would, and also to use lowercase 'e', a
   minimum of 2 exponent digits, and a mandatory sign on the exponent,
   to keep the formatting as close as possible to previous output.

 - The output of exact midpoint values is disabled as noted above.

 - The integer fast-path code is changed somewhat (since we have
   fixed-point output and the upstream did not).

 - Our project style has been largely applied to the code with the
   exception of C99 declaration-after-statement, which has been
   retained as an exception to our present policy.

 - Most of upstream's debugging and conditionals are removed, and we
   use our own configure tests to determine things like uint128
   availability.

Changing the float output format obviously affects a number of
regression tests. This patch uses an explicit setting of
extra_float_digits=0 for test output that is not expected to be
exactly reproducible (e.g. due to numerical instability or differing
algorithms for transcendental functions).

Conversions from floats to numeric are unchanged by this patch. These
may appear in index expressions and it is not yet clear whether any
change should be made, so that can be left for another day.

This patch assumes that the only supported floating point format is
now IEEE format, and the documentation is updated to reflect that.

Code by me, adapting the work of Ulf Adams and other contributors.

References:
https://dl.acm.org/citation.cfm?id=3192369

Reviewed-by: Tom Lane, Andres Freund, Donald Dong
Discussion: https://postgr.es/m/87r2el1bx6.fsf@news-spur.riddles.org.uk
2019-02-13 15:20:33 +00:00
Andrew Gierth f397e08599 Use strtof() and not strtod() for float4 input.
Using strtod() creates a double-rounding problem; the input decimal
value is first rounded to the nearest double; rounding that to the
nearest float may then give an incorrect result.

An example is that 7.038531e-26 when input via strtod and then rounded
to float4 gives 0xAE43FEp-107 instead of the correct 0xAE43FDp-107.
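
A minimal standalone reproduction of the double-rounding hazard
described above:

    #include <stdio.h>
    #include <stdlib.h>

    int
    main(void)
    {
        const char *s = "7.038531e-26";
        float   via_double = (float) strtod(s, NULL);   /* rounds twice */
        float   direct = strtof(s, NULL);               /* rounds once */

        /* On IEEE-float platforms these differ by one ULP, as above. */
        printf("%a vs %a\n", (double) via_double, (double) direct);
        return 0;
    }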

Values output by earlier PG versions with extra_float_digits=3 should
all be read in with the same values as previously. However, values
supplied by other software using shortest representations could be
mis-read.

On platforms that lack a strtof() entirely, we fall back to the old
incorrect rounding behavior. (As strtof() is required by C99, such
platforms are considered of primarily historical interest.) On VS2013,
some workarounds are used to get correct error handling.

The regression tests now test for the correct input values, so
platforms that lack strtof() will need resultmap entries. An entry for
HP-UX 10 is included (more may be needed).

Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/871s5emitx.fsf@news-spur.riddles.org.uk
Discussion: https://postgr.es/m/87d0owlqpv.fsf@news-spur.riddles.org.uk
2019-02-13 15:19:44 +00:00
Alvaro Herrera d357a16997 Blind attempt at fixing Windows build
Broken since fe33a196de.
2019-02-12 18:29:26 -03:00
Tom Lane 1fb57af920 Create the infrastructure for planner support functions.
Rename/repurpose pg_proc.protransform as "prosupport".  The idea is
still that it names an internal function that provides knowledge to
the planner about the behavior of the function it's attached to;
but redesign the API specification so that it's not limited to doing
just one thing, but can support an extensible set of requests.

The original purpose of simplifying a function call is handled by
the first request type to be invented, SupportRequestSimplify.
Adjust all the existing transform functions to handle this API,
and rename them from "xxx_transform" to "xxx_support" to reflect
the potential generalization of what they do.  (Since we never
previously provided any way for extensions to add transform functions,
this change doesn't create an API break for them.)
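
A support function under the new API has roughly this shape (sketch;
"my_func_support" is hypothetical):

    #include "postgres.h"
    #include "fmgr.h"
    #include "nodes/supportnodes.h"

    PG_FUNCTION_INFO_V1(my_func_support);

    Datum
    my_func_support(PG_FUNCTION_ARGS)
    {
        Node   *rawreq = (Node *) PG_GETARG_POINTER(0);
        Node   *ret = NULL;

        if (IsA(rawreq, SupportRequestSimplify))
        {
            SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;

            /* examine req->fcall; set ret to a replacement expression
             * node, or leave it NULL if no simplification applies */
        }
        PG_RETURN_POINTER(ret);
    }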

Also add DDL and pg_dump support for attaching a support function to a
user-defined function.  Unfortunately, DDL access has to be restricted
to superusers, at least for now; but seeing that support functions
will pretty much have to be written in C, that limitation is just
theoretical.  (This support is untested in this patch, but a follow-on
patch will add cases that exercise it.)

Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-09 18:08:48 -05:00
Andrew Dunstan 51b025933d Fix perl searchpath for gen_keywordlist.pl
as found by running src/tools/perlcheck/pgperlsyncheck
2019-02-07 11:14:29 -05:00
Andrew Dunstan 592123efbb Unify searchpath and do file logic in MSVC build scripts.
Commit f83419b739 failed to notice that mkvcbuild.pl and build.pl use
different searchpath and do-file logic, breaking the latter, so it is
adjusted to use the same logic as mkvcbuild.pl.
2019-02-06 07:36:02 -05:00
Andrew Dunstan f83419b739 Fix included file path for modern perl
Contrary to the comment on 772d4b76, only paths starting with "./" or
"../" are considered relative to the current working directory by perl's
"do" function. So this patch converts all the relevant cases to use "./"
paths. This only affects MSVC.

Backpatch to all live branches.
2019-02-05 19:27:47 -05:00
Tom Lane 527b5ed1ad Doc: in each release branch, keep only that branch's own release notes.
Historically we've had each release branch include all prior branches'
notes, including minor-release changes, back to the beginning of the
project.  That's basically an O(N^2) proposition, and it was starting to
catch up with us: as of HEAD the back-branch release notes alone accounted
for nearly 30% of the documentation.  While there's certainly some value
in easy access to back-branch notes, this is getting out of hand.

Hence, switch over to the rule that each branch contains only its own
release notes.  So as to not make older notes too hard to find, each
branch will provide URLs for the immediately preceding branches'
release notes on the project website.

There might be value in providing aggregated notes across all branches
somewhere on the website, but that's a task for another day.

Discussion: https://postgr.es/m/cbd4aeb5-2d9c-8b84-e968-9e09393d4c83@postgresql.org
2019-02-04 19:18:49 -05:00
Andres Freund a9c35cf85c Change function call information to be variable length.
Before this change FunctionCallInfoData, the struct that arguments etc.
for V1 function calls are stored in, always had space for
FUNC_MAX_ARGS (i.e. 100) arguments, storing datums and their nullness
in two arrays.  For nearly every function call 100 arguments is far
more than needed, therefore wasting memory. Arg and argnull being two
separate arrays also guarantees that to access a single argument, two
cachelines have to be touched.

Change the layout so there's a single variable-length array with pairs
of value / isnull. That drastically reduces memory consumption for
most function calls (on x86-64 a two-argument function now uses
64 bytes, previously 936 bytes), and makes it very likely that an
argument's value and its nullness are on the same cacheline.

Arguments are stored in a new NullableDatum struct, which, due to
padding, needs more memory per argument than before. But as usually
far fewer arguments are stored, and individual arguments are cheaper
to access, that's still a clear win.  It's likely that there's other
places where conversion to NullableDatum arrays would make sense,
e.g. TupleTableSlots, but that's for another commit.

Because the function call information is now variable-length,
allocations have to take the number of arguments into account. For
heap allocations that can be done with SizeForFunctionCallInfoData(),
for on-stack allocations there's a new LOCAL_FCINFO(name, nargs) macro
that helps to allocate an appropriately sized and aligned variable.
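
Typical usage of the new macro, sketched (flinfo, collation and the
argument values are assumed to be set up elsewhere):

    /* Sketch: call a two-argument function via the new layout. */
    LOCAL_FCINFO(fcinfo, 2);            /* on-stack space for exactly 2 args */
    Datum   result;

    InitFunctionCallInfoData(*fcinfo, flinfo, 2, collation, NULL, NULL);

    fcinfo->args[0].value = arg0;       /* NullableDatum: value + isnull */
    fcinfo->args[0].isnull = false;
    fcinfo->args[1].value = arg1;
    fcinfo->args[1].isnull = false;

    result = FunctionCallInvoke(fcinfo);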

Some places that stack-allocate function call information don't know
the number of arguments at compile time, and currently variably sized
stack allocations aren't allowed in postgres. Therefore allow for
FUNC_MAX_ARGS space in these cases. They're not that common, so for
now that seems acceptable.

Because of the need to allocate FunctionCallInfo of the appropriate
size, older extensions may need to update their code. To avoid subtle
breakages, the FunctionCallInfoData struct has been renamed to
FunctionCallInfoBaseData. Most code only references FunctionCallInfo,
so that shouldn't cause much collateral damage.

This change is also a prerequisite for more efficient expression JIT
compilation (by allocating the function call information on the stack,
allowing LLVM to optimize it away); previously the size of the call
information caused problems inside LLVM's optimizer.

Author: Andres Freund
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/20180605172952.x34m5uz6ju6enaem@alap3.anarazel.de
2019-01-26 14:17:52 -08:00
Andres Freund 63746189b2 Change snapshot type to be determined by enum rather than callback.
This is in preparation for allowing the same snapshot to be used for
different table AMs. With the current callback-based approach we would
need one callback for each supported AM, which clearly would not be
extensible.  Thus add a new Snapshot->snapshot_type field, and move
the dispatch into HeapTupleSatisfiesVisibility() (which is now a
function). Later work will then dispatch calls to
HeapTupleSatisfiesVisibility() and other AMs visibility functions
depending on the type of the table.  The central SnapshotType enum
also seems like a good location to centralize documentation about the
intended behaviour of various types of snapshots.
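
The dispatch now has roughly this shape (abridged sketch of the new
HeapTupleSatisfiesVisibility()):

    switch (snapshot->snapshot_type)
    {
        case SNAPSHOT_MVCC:
            return HeapTupleSatisfiesMVCC(tup, snapshot, buffer);
        case SNAPSHOT_SELF:
            return HeapTupleSatisfiesSelf(tup, snapshot, buffer);
        case SNAPSHOT_DIRTY:
            return HeapTupleSatisfiesDirty(tup, snapshot, buffer);
        /* ... SNAPSHOT_ANY, SNAPSHOT_TOAST, SNAPSHOT_HISTORIC_MVCC,
         * SNAPSHOT_NON_VACUUMABLE ... */
    }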

As tqual.h isn't included by bufmgr.h any more (as HeapTupleSatisfies*
isn't referenced by TestForOldSnapshot() anymore), a few files now need
to include it directly.

Author: Andres Freund, loosely based on earlier work by Haribabu Kommi
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
2019-01-21 17:03:15 -08:00
Tomas Vondra d6ef7fe75c Revert "Add valgrind suppressions for wcsrtombs optimizations"
This reverts commit d3bbc4b96a.

Per discussion, it's not desirable to add valgrind suppressions for
code outside our own code base (e.g. glibc in this case), especially when
the suppressions may be platform-specific. There are better ways to
deal with that, e.g. by providing local suppressions.

Discussion: https://www.postgresql.org/message-id/flat/90ac0452-e907-e7a4-b3c8-15bd33780e62%402ndquadrant.com
2019-01-19 20:49:51 +01:00
Magnus Hagander 0301db623d Replace @postgresql.org with @lists.postgresql.org for mailinglists
Commit c0d0e54084 replaced the ones in the documentation, but missed out
on the ones in the code. Replace those as well, but unlike c0d0e54084,
don't backpatch the code changes to avoid breaking translations.
2019-01-19 19:06:35 +01:00
Noah Misch 472e1e4cf6 Make Emacs perl-mode indent more like perltidy.
This especially helps braces that surround code blocks.  Back-patch to
v11, where commit 56fb890ace first
appeared; before that, settings were even more distant from perltidy.

Reviewed by Andrew Dunstan.

Discussion: https://postgr.es/m/20190103055355.GB267595@gust.leadboat.com
2019-01-13 11:32:31 -08:00
Tom Lane 52a301e0d9 Improve missing-program error handling in make_ctags and make_etags.
If ctags (resp. etags) isn't installed, these scripts naturally fail,
but the error messages were less clear than one could wish.
It seems worth installing an explicit test to improve that.

Nikolay Shaplov, with suggestions from Michael Paquier and Andrew Dunstan

Discussion: https://postgr.es/m/2394207.ccz7JgCJsh@x200m
2019-01-13 13:33:50 -05:00
Tom Lane c64d0cd5ce Use perfect hashing, instead of binary search, for keyword lookup.
We've been speculating for a long time that hash-based keyword lookup
ought to be faster than binary search, but up to now we hadn't found
a suitable tool for generating the hash function.  Joerg Sonnenberger
provided the inspiration, and sample code, to show us that rolling our
own generator wasn't a ridiculous idea.  Hence, do that.

The method used here requires a lookup table of approximately 4 bytes
per keyword, but that's less than what we saved in the predecessor commit
afb0d0712, so it's not a big problem.  The time savings is indeed
significant: preliminary testing suggests that the total time for raw
parsing (flex + bison phases) drops by ~20%.
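
A generated perfect-hash lookup has this generic shape (illustrative
sketch; every name here is hypothetical, since the real hash function
and tables are emitted by the generator):

    #include <string.h>

    /* A perfect hash is collision-free on the keyword set, so a single
     * string comparison decides membership. */
    static int
    keyword_lookup(const char *word, size_t len)
    {
        int     h = generated_hash(word, len);  /* generator-emitted */

        if (h < 0 || h >= NUM_KEYWORDS)
            return -1;
        if (strlen(kw_strings[h]) != len ||
            memcmp(kw_strings[h], word, len) != 0)
            return -1;                          /* not actually a keyword */
        return h;                               /* keyword number */
    }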

Patch by me, but it owes its existence to Joerg Sonnenberger;
thanks also to John Naylor for review.

Discussion: https://postgr.es/m/20190103163340.GA15803@britannica.bec.de
2019-01-09 19:47:46 -05:00
Tom Lane afb0d0712f Replace the data structure used for keyword lookup.
Previously, ScanKeywordLookup was passed an array of string pointers.
This had some performance deficiencies: the strings themselves might
be scattered all over the place depending on the compiler (and some
quick checking shows that at least with gcc-on-Linux, they indeed
weren't reliably close together).  That led to very cache-unfriendly
behavior as the binary search touched strings in many different pages.
Also, depending on the platform, the string pointers might need to
be adjusted at program start, so that they couldn't be simple constant
data.  And the ScanKeyword struct had been designed with an eye to
32-bit machines originally; on 64-bit it requires 16 bytes per
keyword, making it even more cache-unfriendly.

Redesign so that the keyword strings themselves are allocated
consecutively (as part of one big char-string constant), thereby
eliminating the touch-lots-of-unrelated-pages syndrome.  And get
rid of the ScanKeyword array in favor of three separate arrays:
uint16 offsets into the keyword array, uint16 token codes, and
uint8 keyword categories.  That reduces the overhead per keyword
to 5 bytes instead of 16 (even less in programs that only need
one of the token codes and categories); moreover, the binary search
only touches the offsets array, further reducing its cache footprint.
This also lets us put the token codes somewhere else than the
keyword strings are, which avoids some unpleasant build dependencies.
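
The resulting structure is close to this sketch (paraphrasing the new
kwlookup.h; the token codes and categories live in separate arrays, as
described above):

    typedef struct ScanKeywordList
    {
        const char *kw_string;      /* all keywords in one \0-separated blob */
        const uint16 *kw_offsets;   /* offsets into kw_string, sorted */
        int         num_keywords;
        int         max_kw_len;
    } ScanKeywordList;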

While we're at it, wrap the data used by ScanKeywordLookup into
a struct that can be treated as an opaque type by most callers.
That doesn't change things much right now, but it will make it
less painful to switch to a hash-based lookup method, as is being
discussed in the mailing list thread.

Most of the change here is associated with adding a generator
script that can build the new data structure from the same
list-of-PG_KEYWORD header representation we used before.
The PG_KEYWORD lists that plpgsql and ecpg used to embed in
their scanner .c files have to be moved into headers, and the
Makefiles have to be taught to invoke the generator script.
This work is also necessary if we're to consider hash-based lookup,
since the generator script is what would be responsible for
constructing a hash table.

Aside from saving a few kilobytes in each program that includes
the keyword table, this seems to speed up raw parsing (flex+bison)
by a few percent.  So it's worth doing even as it stands, though
we think we can gain even more with a follow-on patch to switch
to hash-based lookup.

John Naylor, with further hacking by me

Discussion: https://postgr.es/m/CAJVSVGXdFVU2sgym89XPL=Lv1zOS5=EHHQ8XWNzFL=mTXkKMLw@mail.gmail.com
2019-01-06 17:02:57 -05:00
Tom Lane d33faa285b Move the built-in conversions into the initial catalog data.
Instead of running a SQL script to create the standard conversion
functions and pg_conversion entries, put those entries into the
initial data in postgres.bki.

This shaves a few percent off the runtime of initdb, and also allows
accurate comments to be attached to the conversion functions; the
previous script labeled them with machine-generated comments that
were not quite right for multi-purpose conversion functions.
Also, we can get rid of the duplicative Makefile and MSVC perl
implementations of the generation code for that SQL script.

A functional change is that these pg_proc and pg_conversion entries
are now "pinned" by initdb.  Leaving them unpinned was perhaps a
good thing back while the conversions feature was under development,
but there seems no valid reason for it now.

Also, the conversion functions are now marked as immutable, where
before they were volatile by virtue of lacking any explicit
specification.  That seems like it was just an oversight.

To avoid using magic constants in pg_conversion.dat, extend
genbki.pl to allow encoding names to be converted, much as it
does for language, access method, etc names.

John Naylor

Discussion: https://postgr.es/m/CAJVSVGWtUqxpfAaxS88vEGvi+jKzWZb2EStu5io-UPc4p9rSJg@mail.gmail.com
2019-01-03 19:47:53 -05:00
Tom Lane 814c9019aa Use symbolic references for pg_language OIDs in the bootstrap data.
This patch teaches genbki.pl to replace pg_language names by OIDs
in much the same way as it already does for pg_am names etc, and
converts pg_proc.dat to use such symbolic references in the prolang
column.

Aside from getting rid of a few more magic numbers in the initial
catalog data, this means that Gen_fmgrtab.pl no longer needs to read
pg_language.dat, since it doesn't have to know the OID of the "internal"
language; now it's just looking for the string "internal".

No need for a catversion bump, since the contents of postgres.bki
don't actually change at all.

John Naylor

Discussion: https://postgr.es/m/CAJVSVGWtUqxpfAaxS88vEGvi+jKzWZb2EStu5io-UPc4p9rSJg@mail.gmail.com
2019-01-03 18:38:49 -05:00
Bruce Momjian 97c39498e5 Update copyright for 2019
Backpatch-through: certain files through 9.4
2019-01-02 12:44:25 -05:00
Andres Freund 09568ec3d3 Create a separate oid range for oids assigned by genbki.pl.
The changes I made in 578b229718 assigned oids below
FirstBootstrapObjectId to objects in include/catalog/*.dat files that
did not have an oid assigned, starting at the max oid explicitly
assigned.  Tom criticized that for mainly two reasons:
1) It's not clear which values are manually and which explicitly
   assigned.
2) The space below FirstBootstrapObjectId gets pretty crowded, and
   some PostgreSQL forks have used oids >= 9000 for their own objects,
   to avoid conflicting.

Thus create a new range for objects not assigned explicit oids, but
assigned by genbki.pl. For now 1-9999 is for explicitly assigned oids,
FirstGenbkiObjectId (10000) to FirstBootstrapObjectId (12000) - 1 is for
genbki.pl assigned oids, and < FirstNormalObjectId (16384) is for oids
assigned during bootstrap.  It's possible that we'll have to adjust
these boundaries, but there's some headroom for now.

Add a note suggesting that oids in forks should be assigned in the
9000-9999 range.

Catversion bump for obvious reasons.

Per complaint from Tom Lane.

Author: Andres Freund
Discussion: https://postgr.es/m/16845.1544393682@sss.pgh.pa.us
2018-12-13 14:50:57 -08:00
Tom Lane 29180e5d78 Eliminate parallel-make hazard in ecpg/preproc.
Re-making ecpglib's typename.o is dangerous because another make thread
could be doing that at the same time.  While we've not heard field
complaints traceable to this, it seems inevitable that it'd bite someone
eventually.  Instead, symlink typename.c into the preproc directory and
recompile it there.  That file is small enough that compiling it twice
isn't much of a penalty.  Furthermore, this way we get a .o file that's
made without shlib CFLAGS, which seems cleaner.

This requires adding more stuff to the module's -I list.  The MSVC
aspect of that is untested, but I'm sure the buildfarm will tell me
if I got it wrong.

Per a suggestion from Peter Eisentraut.  Although this is theoretically
a bug fix, the lack of field reports makes me feel we needn't back-patch.

Discussion: https://postgr.es/m/31364.1543511708@sss.pgh.pa.us
2018-12-01 17:19:51 -05:00
Michael Paquier 431f1599a2 Add support for NO_INSTALLCHECK in MSVC scripts
When fetching a list of tests for a given extension in contrib/ or
src/test/modules/, NO_INSTALLCHECK now gets checked first.  If present,
an empty list of tests is returned to let the caller know that tests
for this module need to be bypassed.

This actually fixes a set of issues with MSVC with modules using
REGRESS_OPTS, as an incorrect parsing caused the launched command
to eat the first test listed.  The actual effect on the tree is that
several modules listed a single test, so regressions have been running
with no actual tests.  pg_stat_statements, test_rls_hooks and commit_ts
were impacted by that.  Some other modules like test_decoding (or
snapshot_too_old) don't yet use PGXS rules, but their makefiles will
soon be refactored with an upcoming patch.

Author: Michael Paquier
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/20181126054302.GI1776@paquier.xyz
2018-11-29 10:31:12 +09:00
Andres Freund 578b229718 Remove WITH OIDS support, change oid catalog column visibility.
Previously tables declared WITH OIDS, including a significant fraction
of the catalog tables, stored the oid column not as a normal column,
but as part of the tuple header.

This special column was not shown by default, which was somewhat odd,
as it's often (consider e.g. pg_class.oid) one of the more important
parts of a row.  Neither pg_dump nor COPY included the contents of the
oid column by default.

The fact that the oid column was not an ordinary column necessitated a
significant amount of special case code to support oid columns. That
already was painful for the existing code, but upcoming work aiming to
make table storage pluggable would have required expanding and
duplicating that "specialness" significantly.

WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0).
Remove it.

Removing includes:
- CREATE TABLE and ALTER TABLE syntax for declaring the table to be
  WITH OIDS has been removed (WITH (oids[ = true]) will error out)
- pg_dump does not support dumping tables declared WITH OIDS and will
  issue a warning when dumping one (and ignore the oid column).
- restoring a pg_dump archive with pg_restore will warn when
  restoring a table with oid contents (and ignore the oid column)
- COPY will refuse to load a binary dump that includes oids.
- pg_upgrade will error out when encountering tables declared WITH
  OIDS; they have to be altered to remove the oid column first.
- Functionality to access the oid of the last inserted row (like
  plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed.

The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false)
for CREATE TABLE) is still supported. While that requires a bit of
support code, it seems unnecessary to break applications / dumps that
do not use oids, and are explicit about not using them.

The biggest user of WITH OID columns was postgres' catalog. This
commit changes all 'magic' oid columns to be columns that are normally
declared and stored. To reduce unnecessary query breakage all the
newly added columns are still named 'oid', even if a table's column
naming scheme would indicate 'reloid' or such.  This obviously
requires adapting a lot code, mostly replacing oid access via
HeapTupleGetOid() with access to the underlying Form_pg_*->oid column.

The bootstrap process now assigns oids for all oid columns in
genbki.pl that do not have an explicit value (starting at the largest
oid previously used); only oids assigned later, at runtime, will be
above FirstBootstrapObjectId. As the oid column now is a normal column,
the special bootstrap syntax for oids has been removed.

Oids are not automatically assigned during insertion anymore; all
backend code explicitly assigns oids with GetNewOidWithIndex(). For
the rare case that insertions into the catalog via SQL are called for,
the new pg_nextoid() function can be used (which only works on catalog
tables).

The fact that oid columns on system tables are now normal columns
means that they will be included in the set of columns expanded
by * (i.e. SELECT * FROM pg_class will now include the table's oid,
previously it did not). It'd not technically be hard to hide the oid
column by default, but that'd mean confusing behavior would either
have to be carried forward forever, or it'd cause breakage down the
line.

While it's not unlikely that further adjustments are needed, the
scope/invasiveness of the patch makes it worthwhile to get this merged
now. It's painful to maintain externally, too complicated to commit
after the code freeze, and a dependency of a number of other
patches.

Catversion bump, for obvious reasons.

Author: Andres Freund, with contributions by John Naylor
Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de
2018-11-20 16:00:17 -08:00
Tomas Vondra d3bbc4b96a Add valgrind suppressions for wcsrtombs optimizations
wcsrtombs (called through wchar2char from common functions like lower,
upper, etc.) uses various optimizations that may look like access to
uninitialized data, triggering valgrind reports.

For example AVX2 instructions load data in 256-bit chunks, and gconv
does something similar with 32-bit chunks.  This is faster than
accessing the bytes one by one, and the uninitialized part of the
buffer is not actually used. So suppress the bogus reports.

The exact stack depends on possible optimizations - it might be AVX, SSE
(as in the report by Aleksander Alekseev) or something else. Hence the
last frame is wildcarded, to deal with this.

Backpatch all the way back to 9.4.

Author: Tomas Vondra
Discussion: https://www.postgresql.org/message-id/flat/90ac0452-e907-e7a4-b3c8-15bd33780e62%402ndquadrant.com
Discussion: https://www.postgresql.org/message-id/20180220150838.GD18315@e733.localdomain
2018-11-17 23:50:21 +01:00
Andrew Dunstan 12d5f39b15 Adjust valgrind fix in commit 517b0d0b5f
lousyjack still wasn't happy. I have tested this modification and it
worked.
2018-11-08 08:38:46 -05:00
Andrew Dunstan 54ad7282fe Use parallel installcheck in vcregress.pl's upgrade test
This is to keep the test in sync with what's done in test.sh, which
acquired this change in commit da906766c.
2018-11-07 14:17:57 -05:00
Andrew Dunstan 517b0d0b5f Quiet valgrind complaints following pread/pwrite changes
Per complaints from buildfarm and elsewhere
Patch from Jasper Pedersen
Discussion: https://postgr.es/m/0f419c91-49ab-2399-0143-13063bd97c46@redhat.com
2018-11-07 12:58:39 -05:00
Thomas Munro 3fd2a7932e Provide pg_pread() and pg_pwrite() for random I/O.
Forward to POSIX pread() and pwrite(), or emulate them if unavailable.
The emulation is not perfect as the file position is changed, so
we'll put pg_ prefixes on the names to minimize the risk of confusion
in future patches that might inadvertently try to mix pread() and read()
on the same file descriptor.
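
The emulation path is essentially this simplified sketch of the
fallback:

    #include <unistd.h>

    /* Sketch: correct result, but the file position moves, which is
     * why the function carries a pg_ prefix. */
    ssize_t
    pg_pread(int fd, void *buf, size_t nbyte, off_t offset)
    {
    #ifdef HAVE_PREAD
        return pread(fd, buf, nbyte, offset);
    #else
        if (lseek(fd, offset, SEEK_SET) < 0)
            return -1;
        return read(fd, buf, nbyte);
    #endif
    }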

Author: Thomas Munro
Reviewed-by: Tom Lane, Jesper Pedersen
Discussion: https://postgr.es/m/CAEepm=02rapCpPR3ZGF2vW=SBHSdFYO_bz_f-wwWJonmA3APgw@mail.gmail.com
2018-11-07 09:50:01 +13:00
Magnus Hagander fbec7459aa Fix spelling errors and typos in comments
Author: Daniel Gustafsson <daniel@yesql.se>
2018-11-02 13:56:52 +01:00
Peter Eisentraut a9e5f8e781 Exclude temporary directories from pgindent
Exclude tmp_check and tmp_install from pgindent.  In a fully-built
tree, pgindent would spend a lot of time digging through these
directories and end up re-indenting installed header files.
2018-10-29 11:39:44 +01:00
Andrew Dunstan 1df92eeafe Fix perl searchpath for modern perl for MSVC tools
Modern versions of perl no longer include the current directory in the
perl searchpath, as it's insecure. Instead of adding the current
directory, we get around the problem by adding the directory where the
script lives.

Problem noted by Victor Wagner.

Solution adapted from buildfarm client code.

Backpatch to all live versions.
2018-10-28 12:22:32 -04:00
Andres Freund 2d10defa77 Remove timetravel extension.
The extension depended on old types which are about to be removed. As
the code additionally was pretty crufty and didn't provide much in the
way of functionality, removing the extension seems to be the best way
forward.  It's fairly trivial to write functionality in plpgsql that
more than covers what timetravel did.

Author: Andres Freund
Discussion:
    https://postgr.es/m/20171213080506.cwjkpcz3bkk6yz2u@alap3.anarazel.de
    https://postgr.es/m/25615.1513115237@sss.pgh.pa.us
2018-10-11 11:43:56 -07:00
Tom Lane 61f14cc8c8 Tweak MSVC build system to match changes in 7143b3e82.
Looks like we need to pull in $libpgcommon in a couple more
places than before.

Per buildfarm.
2018-09-28 15:49:05 -04:00
Tom Lane 97c6852ff7 Tweak MSVC build system to match changes in 7143b3e82.
Also try to make the comment suggesting that this might be needed
more intelligible.

Per buildfarm.
2018-09-28 15:17:07 -04:00
Tom Lane 26e9d4d4ef Convert elog.c's useful_strerror() into a globally-used strerror wrapper.
elog.c has long had a private strerror wrapper that handles assorted
possible failures or deficiencies of the platform's strerror.  On Windows,
it also knows how to translate Winsock error codes, which the native
strerror does not.  Move all this code into src/port/strerror.c and
define strerror() as a macro that invokes it, so that both our frontend
and backend code will have all of this behavior.

I believe this constitutes an actual bug fix on Windows, since AFAICS
our frontend code did not report Winsock error codes properly before this.
However, the main point is to lay the groundwork for implementing %m
in src/port/snprintf.c: the behavior we want %m to have is this one,
not the native strerror's.

Note that this throws away the prior use of src/port/strerror.c,
which was to implement strerror() on platforms lacking it.  That's
been dead code for nigh twenty years now, since strerror() was
already required by C89.

We should likewise cause strerror_r to use this behavior, but
I'll tackle that separately.

Patch by me, reviewed by Michael Paquier

Discussion: https://postgr.es/m/2975.1526862605@sss.pgh.pa.us
2018-09-26 11:06:42 -04:00
Tom Lane b09a64d602 Add missing pg_description strings for pg_type entries.
I noticed that all non-composite, non-array entries in pg_type.dat
had descr strings, except for "json" and the pseudo-types.  The
lack for json seems certainly an oversight, and there's surely
little reason to not have entries for the pseudo-types either.
So add some.

"make reformat-dat-files" turned up some formatting issues in
pg_amop.dat, too, so fix those in passing.

No catversion bump since the backend doesn't care too much what is
in pg_description.
2018-09-20 16:06:18 -04:00
Andrew Gierth 60f6756f92 Fix out-of-tree build for transform modules.
Neither plperl nor plpython installed sufficient header files to
permit transform modules to be built out-of-tree using PGXS. Fix that
by installing all plperl and plpython header files (other than those
with special purposes such as generated data tables), and also install
plpython's special .mk file for mangling regression tests.

(This commit does not fix the windows install, which does not
currently install _any_ plperl or plpython headers.)

Also fix the existing transform modules for hstore and ltree so that
their cross-module #include directives work as anticipated by commit
df163230b9 et seq. This allows them to serve as working examples of
how to reference other modules when doing separate out-of-tree builds.

Discussion: https://postgr.es/m/87o9ej8bgl.fsf%40news-spur.riddles.org.uk
2018-09-16 18:46:45 +01:00
Tom Lane ed0cdf0e05 Install a check for mis-linking of src/port and src/common functions.
On ELF-based platforms (and maybe others?) it's possible for a shared
library, when dynamically loaded into the backend, to call the backend
versions of src/port and src/common functions rather than the frontend
versions that are actually linked into the shlib.  This is definitely
not what we want, because the frontend versions often behave slightly
differently.  Up to now it's been "slight" enough that nobody noticed;
but with the addition of SCRAM support functions in src/common, we're
observing crashes due to the difference between palloc and malloc
memory allocation rules, as reported in bug #15367 from Jeremy Evans.

The purpose of this patch is to create a direct test for this type of
mis-linking, so that we know whether any given platform requires extra
measures to prevent using the wrong functions.  If the test fails, it
will lead to connection failures in the contrib/postgres_fdw regression
test.  At the moment, *BSD platforms using ELF format are known to have
the problem and can be expected to fail; but we need to know whether
anything else does, and we need a reliable ongoing check for future
platforms.

Actually fixing the problem will be the subject of later commit(s).

Discussion: https://postgr.es/m/153626613985.23143.4743626885618266803@wrigleys.postgresql.org
2018-09-09 12:23:23 -04:00
Peter Eisentraut 842cb9fa62 Refactor dlopen() support
Nowadays, all platforms except Windows and older HP-UX have standard
dlopen() support.  So having a separate implementation per platform
under src/backend/port/dynloader/ is a bit excessive.  Instead, treat
dlopen() like other library functions that happen to be missing
sometimes and put a replacement implementation under src/port/.
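
On Windows, the replacement implementation is roughly this sketch
(simplified along the lines of src/port/dlopen.c):

    #include <windows.h>

    void *
    dlopen(const char *file, int mode)
    {
        return (void *) LoadLibraryEx(file, NULL, 0);
    }

    void *
    dlsym(void *handle, const char *symbol)
    {
        return (void *) GetProcAddress((HMODULE) handle, symbol);
    }

    int
    dlclose(void *handle)
    {
        return FreeLibrary((HMODULE) handle) ? 0 : 1;
    }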

Discussion: https://www.postgresql.org/message-id/flat/e11a49cb-570a-60b7-707d-7084c8de0e61%402ndquadrant.com#54e735ae37476a121abb4e33c2549b03
2018-09-06 11:33:04 +02:00
Andres Freund d9dd406fe2 Require C99 (and thus MSVC 2013 upwards).
In 86d78ef50e I enabled configure to check for C99 support, with the
goal of checking which platforms support C99.  While there are a few
machines without C99 support among our buildfarm animals,
de-supporting them for v12 was deemed acceptable.

While not tested in aforementioned commit, the biggest increase in
minimum compiler version comes from MSVC, which gained C99 support
fairly late. The subset in MSVC 2013 is sufficient for our needs, at
this point. While that is a significant increase in minimum version,
the existing windows binaries are already built with a new enough
version.

Make configure error out if C99 support could not be detected. For
MSVC builds, increase the minimum version to 2013.

The increase to MSVC 2013 allows us to get rid of VCBuildProject.pm,
as that was only required for MSVC 2005/2008.

Author: Andres Freund
Discussion: https://postgr.es/m/97d4b165-192d-3605-749c-f614a0c4e783@2ndquadrant.com
2018-08-23 18:33:57 -07:00
Noah Misch f3efef434f MSVC: Finish clean.bat tmp_check coverage.
Use wildcards, so one can add a TAP test suite without updating this
file.  Back-patch to v11, which omitted multiple new suites.
2018-08-19 01:12:22 -07:00
Noah Misch a53f0edd64 MSVC: Remove any tmp_check directory before running a TAP test suite.
Back-patch to v11, where commit 90627cf98a
made the GNU make build system do likewise.  Without this, when a
typical PostgresNode-using test failed, subsequent runs bailed out with
a "File exists" error.
2018-08-19 01:12:22 -07:00
Andrew Gierth df163230b9 Provide for contrib and pgxs modules to install include files.
This allows out-of-tree PLs and similar code to get access to
definitions needed to work with extension data types.

The following existing modules now install headers: contrib/cube,
contrib/hstore, contrib/isn, contrib/ltree, contrib/seg.

Discussion: https://postgr.es/m/87y3euomjh.fsf%40news-spur.riddles.org.uk
2018-07-31 20:07:39 +01:00
Bruce Momjian e2c0df7828 pgtest: run clean, build, and check stages separately
This allows for cleaner error reporting.

Backpatch-through: 9.5
2018-07-28 15:34:06 -04:00
Bruce Momjian 96313bff29 pgtest: grab possible warnings from install.log
Since PG 9.5, 'make check' records the build output in install.log, so
look in there for warnings too.

Backpatch-through: 9.5
2018-07-28 11:35:53 -04:00
Michael Paquier 9ebe0572ce Refactor cluster_rel() to handle more options
This extends cluster_rel() in such a way that more options can be added
in the future, which will reduce the amount of code churn for an
upcoming SKIP_LOCKED option aimed at VACUUM.  As VACUUM FULL is a different
flavor of CLUSTER, we want to make that extensible to ease integration.

This only reworks the API and its callers, without providing anything
user-facing.  Two options are present now: verbose mode and relation
recheck when doing the cluster command work across multiple
transactions.  This could be used as well as a base to extend the
grammar of CLUSTER later on.
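
In outline (a hedged sketch; the flag names and exact signature are
assumed from the description above):

    #include "postgres_ext.h"      /* Oid */

    /* bit flags keep future options, e.g. SKIP_LOCKED, cheap to add */
    #define CLUOPT_VERBOSE 0x01    /* print progress messages */
    #define CLUOPT_RECHECK 0x02    /* recheck relation state across xacts */

    extern void cluster_rel(Oid tableOid, Oid indexOid, int options);

    /* a VACUUM FULL caller might then do, schematically:
     *     cluster_rel(relid, InvalidOid,
     *                 CLUOPT_RECHECK | (verbose ? CLUOPT_VERBOSE : 0));
     */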

Author: Michael Paquier
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/20180723031058.GE2854@paquier.xyz
2018-07-24 11:37:32 +09:00
Robert Haas 32df1c9afa Add subtransaction handling for table synchronization workers.
Since the old logic was completely unaware of subtransactions, a
change made in a subsequently-aborted subtransaction would still cause
workers to be stopped at toplevel transaction commit.  Fix that by
managing a stack of worker lists rather than just one.
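
Schematically (a hedged sketch; the struct and function names here are
hypothetical):

    #include "nodes/pg_list.h"

    /* one entry per open (sub)transaction level */
    typedef struct WorkerStopStack
    {
        List       *workers;       /* workers to stop if this level commits */
        struct WorkerStopStack *parent;
    } WorkerStopStack;

    /* on subcommit, the child's pending stops survive into the parent */
    static void
    on_subcommit(WorkerStopStack **top)
    {
        WorkerStopStack *child = *top;

        child->parent->workers =
            list_concat(child->parent->workers, child->workers);
        *top = child->parent;      /* on subabort, just discard the child */
    }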

Amit Khandekar and Robert Haas

Discussion: http://postgr.es/m/CAJ3gD9eaG_mWqiOTA2LfAug-VRNn1hrhf50Xi1YroxL37QkZNg@mail.gmail.com
2018-07-16 17:33:22 -04:00
Peter Eisentraut 333224c99e Update documentation editor setup instructions
Now that the documentation sources are in XML rather than SGML, some of
the documentation about the editor, or more specifically Emacs, setup
needs updating.  The updated instructions recommend using nxml-mode,
which works mostly out of the box, with some small tweaks in
emacs.samples and .dir-locals.el.

Also remove some obsolete stuff in .dir-locals.el.  I did, however,
leave the sgml-mode settings in there so that someone using Emacs
without emacs.samples gets those settings when editing a *.sgml file.
2018-07-13 21:23:41 +02:00
Peter Eisentraut 1f4ec89459 Remove obsolete documentation build tools for Windows
The scripts and instructions have been nonfunctional at least since
PostgreSQL 10 (commit 510074f9f0) and
nobody has stepped up to fix them.  So right now just remove them until
someone wants to resurrect them.

Discussion: https://www.postgresql.org/message-id/flat/B74C0219-6BA9-46E1-A524-5B9E8CD3BDB3%40yesql.se
2018-07-13 10:01:04 +02:00
Andrew Dunstan 56b4da8c9d Use more modern instructions for creating a new dev cycle 2018-07-01 07:55:05 -04:00
Andrew Dunstan feced1387f Stamp HEAD as 12devel
Let the hacking begin ...
2018-06-30 12:47:59 -04:00
Andrew Dunstan d842139099 perltidy run prior to branching 2018-06-30 12:28:55 -04:00
Andrew Dunstan 2c64d20048 Update typedefs list 2018-06-30 12:07:27 -04:00
Peter Eisentraut 0fcf5e0e6e Fix whitespace 2018-06-27 08:03:54 +02:00
Andrew Dunstan e3eb8be77e Exclude files in .git from list of perl files
The .git directory might contain perl files, as hooks, for example.
Since we have no control over these they should be excluded from things
like our perlcritic checks.

Per offline report from Mike Blackwell.
2018-06-12 14:54:43 -04:00
Andrew Dunstan 85dd744a70 Move perlcritic files to new perlcheck directory 2018-06-11 14:54:28 -04:00
Andrew Dunstan af616ce483 Add a script to detect perl compile time errors and warnings
Also add a function that centralizes the logic for locating all our perl
files and use it in pgperlcritic and pgperltidy as well as the new
pgperlcheck.
2018-06-11 14:47:20 -04:00
Andrew Dunstan 0039049fb1 Fix compile-time warnings on all perl code
This patch does two things. First, it silences a number of compile-time
warnings in the msvc tools files, mainly those due to the fact that in
some cases we have more than one package per file. Second it supplies a
dummy Perl library with just enough of the Windows API referred to in
our code to let it run these checks cleanly, even on Unix machines where
the code is never supposed to run. The dummy library should only be used
for that purpose, as its README notes.
2018-05-31 08:13:02 -04:00
Andrew Dunstan 01deec5f8a Return a value from Install.pm's lcopy function
Commit 3a7cc727c was a little over eager about adding an explicit return
to this function, whose value is checked in most call sites. This change
reverses that and returns the expected value explicitly. It also adds a
check to the one call site lacking one.
2018-05-28 16:48:48 -04:00
Andrew Dunstan 3a7cc727c7 Don't fall off the end of perl functions
This complies with the perlcritic policy
Subroutines::RequireFinalReturn, which is a severity 4 policy. Since we
only currently check at severity level 5, the policy is raised to that
level until we move to level 4 or lower, so that any new infringements
will be caught.

A small cosmetic piece of tidying of the pgperlcritic script is
included.

Mike Blackwell

Discussion: https://postgr.es/m/CAESHdJpfFm_9wQnQ3koY3c91FoRQsO-fh02za9R3OEMndOn84A@mail.gmail.com
2018-05-27 09:08:42 -04:00
Andrew Dunstan 8a56ff4848 Don't force a blank line before comments in perl code
Suggestion from Bruce Momjian

Discussion: https://postgr.es/m/20180525190445.GA2213@momjian.us
2018-05-27 08:53:54 -04:00
Andrew Dunstan 4431c94c36 Preserve information on use of git-external-diff
Now that the Working with git wiki page no longer suggests producing
context diffs, we should preserve the information on how to use
git-external-diff for those people who want to view context format
diffs. The most obvious place is in the script itself, so that's what's
done here.
2018-05-24 23:45:31 +09:30
Tom Lane a194106c1b MSVC builds must use a separate stamp file for copying generated headers.
Commit bad51a49a tried to use a shortcut with just one stamp file
recording the actions of generating the pg_*_d.h headers and copying
them to the src/include/catalog/ directory.  That doesn't work in all
scenarios though, so we must use two stamp files like the Makefiles do.

John Naylor

Discussion: https://postgr.es/m/CANFyU944GdHr=puPbA78STnqr=8kgMrGF-VDHck6aO_-qNDALg@mail.gmail.com
2018-05-18 11:53:18 -04:00
Andrew Dunstan 3dacd9bf32 Prevent possibly spurious error when running perl -cw 2018-05-18 10:47:41 -04:00
Andrew Dunstan 35361ee788 Restrict vertical tightness to parentheses in Perl code
The vertical tightness settings collapse vertical whitespace between
opening and closing brackets (parentheses, square brackets and braces).
This can make data structures in particular harder to read, and is not
very consistent with our style in non-Perl code. This patch restricts
that setting to parentheses only, and reformats all the perl code
accordingly. Not applying this to parentheses has some unfortunate
effects, so the consensus is to keep the setting for parentheses and not
for the others.

The diff for this patch does highlight some places where structures
should have trailing commas. They can be added manually, as there is no
automatic tool to do so.

Discussion: https://postgr.es/m/a2f2b87c-56be-c070-bfc0-36288b4b41c1@2ndQuadrant.com
2018-05-09 10:14:46 -04:00
Andrew Dunstan 286bb240e1 perltidy some recent code changes before changing perltidy settings 2018-05-09 10:05:35 -04:00
Andrew Dunstan 91703ca214 Add a script and a config file to run perlcritic
This is similar to what we do to run perltidy. For now we only run at
severity level 5. Over time we can improve our perl code and reduce the
severity level.

Discussion: https://postgr.es/m/86aa2a3a-0c68-21fb-9560-84ad6914d561@2ndQuadrant.com
2018-05-09 07:55:23 -04:00
Tom Lane fbb99e5883 Update oidjoins regression test for v11.
Commit 86f575948 already manually updated the oidjoins test for the
new pg_constraint.conparentid => pg_constraint.oid relationship, but
failed to update findoidjoins/README, thus the apparent inconsistency
here.

Michael Paquier

Discussion: https://postgr.es/m/20180507001811.GA27389@paquier.xyz
2018-05-07 14:32:04 -04:00
Andrew Dunstan 2b9bdda744 Clear severity 5 perlcritic warnings from vcregress.pl
My recent update for python3 support used some idioms that are
unapproved. This fixes them. Backpatch to all live branches like the
original.
2018-05-06 07:37:05 -04:00
Andrew Dunstan 966268c762 Provide for testing on python3 modules when under MSVC
This should have been done some years ago as promised in commit
c4dcdd0c2. However, better late than never.

Along the way do a little housekeeping, including using a simpler test
for the python version being tested, and removing a redundant subroutine
parameter. These changes only apply back to release 9.5.

Backpatch to all live releases.
2018-05-04 15:22:48 -04:00
Tom Lane bad51a49a4 Blindly try to fix MSVC build's use of genbki.pl and Gen_fmgrtab.pl.
We need to use a stamp file to record the runs of these scripts, as
is done on the Unix side.  I think I got it right, but can't test.

While at it, extend this handmade dependency logic to also check the
generating script files, as the makefiles do.

Discussion: https://postgr.es/m/16925.1525376229@sss.pgh.pa.us
2018-05-03 18:47:41 -04:00
Tom Lane f7df8043f0 Remove Windows module-list-dumping code.
This code is evidently allocating memory and thus confusing matters
even more.  Let's see whether we can learn anything with
just VirtualQuery.

Discussion: https://postgr.es/m/25495.1524517820@sss.pgh.pa.us
2018-04-30 13:20:13 -04:00
Andrew Dunstan bb779006f4 clean up pg_upgrade tmp_check under MSVC 2018-04-30 12:43:57 -04:00
Tom Lane 6ba0cc4bd3 Dump full memory maps around failing Windows reattach code.
This morning's results from buildfarm member dory make it pretty
clear that something is getting mapped into the just-freed space,
but not what that something is.  Replace my minimalistic probes
with a full dump of the process address space and module space,
based on Noah's work at
<20170403065106.GA2624300%40tornado.leadboat.com>

This is all (probably) to get reverted once we have fixed the
problem, but for now we need information.
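
The probe boils down to walking the whole address space with
VirtualQuery(); a hedged sketch:

    #include <windows.h>
    #include <stdio.h>

    static void
    dump_address_space(void)
    {
        MEMORY_BASIC_INFORMATION mbi;
        char       *addr = NULL;

        while (VirtualQuery(addr, &mbi, sizeof(mbi)) == sizeof(mbi))
        {
            printf("base %p size %llu state 0x%x type 0x%x\n",
                   mbi.BaseAddress,
                   (unsigned long long) mbi.RegionSize,
                   (unsigned) mbi.State, (unsigned) mbi.Type);
            addr = (char *) mbi.BaseAddress + mbi.RegionSize;
        }
    }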

Discussion: https://postgr.es/m/25495.1524517820@sss.pgh.pa.us
2018-04-30 11:16:21 -04:00
Peter Eisentraut 76ece16974 perltidy: Add option --nooutdent-long-comments 2018-04-27 11:37:43 -04:00
Peter Eisentraut d4f16d5071 perltidy: Add option --nooutdent-long-quotes 2018-04-27 11:37:43 -04:00
Peter Eisentraut a2ada08d4c perltidy: Don't write backup files
Newer perltidy versions can just avoid writing backup files, so we don't
need the old dance of deleting them afterwards.  Supported since 20120619.

https://metacpan.org/source/SHANCOCK/Perl-Tidy-20120619/CHANGES#L61
2018-04-27 11:37:43 -04:00
Tom Lane f83bf385c1 Preliminary work for pgindent run.
Update typedefs.list from current buildfarm results.  Adjust pgindent's
typedef blacklist to block some more unfortunate typedef names that have
snuck in since last time.  Manually tweak a few places where I didn't
like the initial results of pgindent'ing.
2018-04-26 14:45:04 -04:00
Tom Lane f04d4ac919 Reindent Perl files with perltidy version 20170521.
Discussion: https://postgr.es/m/CABUevEzK3cNiHZQ18f5tK0guoT+cN_jWeVzhYYxY=r+1Q3SmoA@mail.gmail.com
2018-04-25 14:00:19 -04:00
Tom Lane 46cda5bf7b Change pgindent/README to specify that we use perltidy version 20170521.
Per discussion, this is now the project's standard version.
Update the obsolete URL, and explain how to install a specific version.

Discussion: https://postgr.es/m/CABUevEzK3cNiHZQ18f5tK0guoT+cN_jWeVzhYYxY=r+1Q3SmoA@mail.gmail.com
2018-04-25 13:58:39 -04:00
Peter Eisentraut 56fb890ace Make Emacs settings match perltidy configuration
Set Emacs's perl-continued-statement-offset to match perltidy's
--continuation-indentation, which is 2 (not overridden in PostgreSQL's
profile) rather than the 4 that Emacs uses by default.
2018-04-23 11:44:31 -04:00
Simon Riggs 08ea7a2291 Revert MERGE patch
This reverts commits d204ef6377,
83454e3c2b and a few more commits thereafter
(complete list at the end) related to MERGE feature.

While the feature was fully functional, with sufficient test coverage and
necessary documentation, it was felt that some parts of the executor and
parse-analyzer could use a different design, and it wasn't possible to do that in
the available time. So it was decided to revert the patch for PG11 and retry
again in the future.

Thanks again to all reviewers and bug reporters.

List of commits reverted, in reverse chronological order:

 f1464c5380 Improve parse representation for MERGE
 ddb4158579 MERGE syntax diagram correction
 530e69e59b Allow cpluspluscheck to pass by renaming variable
 01b88b4df5 MERGE minor errata
 3af7b2b0d4 MERGE fix variable warning in non-assert builds
 a5d86181ec MERGE INSERT allows only one VALUES clause
 4b2d44031f MERGE post-commit review
 4923550c20 Tab completion for MERGE
 aa3faa3c7a WITH support in MERGE
 83454e3c2b New files for MERGE
 d204ef6377 MERGE SQL Command following SQL:2016

Author: Pavan Deolasee
Reviewed-by: Michael Paquier
2018-04-12 11:22:56 +01:00
Tom Lane a65e17bd6f Reduce chattiness of genbki.pl and Gen_fmgrtab.pl.
Make these scripts emit just one log message when they run, not one
per output file.  The latter is way too verbose in the wake of
commit 372728b0d.  The specific wording used is what already existed
in the MSVC scripts.

John Naylor

Discussion: https://postgr.es/m/11103.1523208822@sss.pgh.pa.us
2018-04-09 15:01:10 -04:00
Tom Lane 9c0a0de4c9 Switch client-side code to include catalog/pg_foo_d.h not pg_foo.h.
Everything of use to frontend code should now appear in the _d.h files,
and making this change frees us from needing to worry about whether the
catalog header files proper are frontend-safe.

Remove src/interfaces/ecpg/ecpglib/pg_type.h entirely, as the previous
commit reduced it to a confusingly-named wrapper around pg_type_d.h.

In passing, make test_rls_hooks.c follow project convention of including
our own files with #include "" not <>.

Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us
2018-04-08 13:59:52 -04:00
Tom Lane 372728b0d4 Replace our traditional initial-catalog-data format with a better design.
Historically, the initial catalog data to be installed during bootstrap
has been written in DATA() lines in the catalog header files.  This had
lots of disadvantages: the format was badly underdocumented, it was
very difficult to edit the data in any mechanized way, and due to the
lack of any abstraction the data was verbose, hard to read/understand,
and easy to get wrong.

Hence, move this data into separate ".dat" files and represent it in a way
that can easily be read and rewritten by Perl scripts.  The new format is
essentially "key => value" for each column; while it's a bit repetitive,
explicit labeling of each value makes the data far more readable and less
error-prone.  Provide a way to abbreviate entries by omitting field values
that match a specified default value for their column.  This allows removal
of a large amount of repetitive boilerplate and also lowers the barrier to
adding new columns.

Also teach genbki.pl how to translate symbolic OID references into
numeric OIDs for more cases than just "regproc"-like pg_proc references.
It can now do that for regprocedure-like references (thus solving the
problem that regproc is ambiguous for overloaded functions), operators,
types, opfamilies, opclasses, and access methods.  Use this to turn
nearly all OID cross-references in the initial data into symbolic form.
This represents a very large step forward in readability and error
resistance of the initial catalog data.  It should also reduce the
difficulty of renumbering OID assignments in uncommitted patches.

Also, solve the longstanding problem that frontend code that would like to
use OID macros and other information from the catalog headers often had
difficulty with backend-only code in the headers.  To do this, arrange for
all generated macros, plus such other declarations as we deem fit, to be
placed in "derived" header files that are safe for frontend inclusion.
(Once clients migrate to using these pg_*_d.h headers, it will be possible
to get rid of the pg_*_fn.h headers, which only exist to quarantine code
away from clients.  That is left for follow-on patches, however.)

The now-automatically-generated macros include the Anum_xxx and Natts_xxx
constants that we used to have to update by hand when adding or removing
catalog columns.

Replace the former manual method of generating OID macros for pg_type
entries with an automatic method, ensuring that all built-in types have
OID macros.  (But note that this patch does not change the way that
OID macros for pg_proc entries are built and used.  It's not clear that
making that match the other catalogs would be worth extra code churn.)

Add SGML documentation explaining what the new data format is and how to
work with it.

Despite being a very large change in the catalog headers, there is no
catversion bump here, because postgres.bki and related output files
haven't changed at all.

John Naylor, based on ideas from various people; review and minor
additional coding by me; previous review by Alvaro Herrera

Discussion: https://postgr.es/m/CAJVSVGWO48JbbwXkJz_yBFyGYW-M9YWxnPdxJBUosDC9ou_F0Q@mail.gmail.com
2018-04-08 13:17:27 -04:00
Stephen Frost da9b580d89 Refactor dir/file permissions
Consolidate directory and file create permissions for tools which work
with the PG data directory by adding a new module (common/file_perm.c)
that contains variables (pg_file_create_mode, pg_dir_create_mode) and
constants to initialize them (0600 for files and 0700 for directories).

Convert mkdir() calls in the backend to MakePGDirectory() if the
original call used default permissions (always the case for regular PG
directories).

Add tests to make sure permissions in PGDATA are set correctly by the
tools which modify the PG data directory.
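
In outline (a hedged sketch of the new module and wrapper):

    /* common/file_perm.c, roughly */
    #include <sys/stat.h>
    #include <sys/types.h>

    /* defaults: 0600 for files, 0700 for directories */
    int pg_file_create_mode = S_IRUSR | S_IWUSR;
    int pg_dir_create_mode = S_IRWXU;

    /* backend wrapper replacing bare mkdir() calls (sketch) */
    int
    MakePGDirectory(const char *directoryName)
    {
        return mkdir(directoryName, pg_dir_create_mode);
    }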

Authors: David Steele <david@pgmasters.net>,
         Adam Brightwell <adam.brightwell@crunchydata.com>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
2018-04-07 17:45:39 -04:00
Andrew Gierth 1fd8690668 Install errcodes.txt for use by extensions.
Maintainers of out-of-tree PLs typically need access to the set of
error codes. To avoid the need to duplicate that information in some
form in PL source trees, provide errcodes.txt as part of a server
installation.

Thomas Munro, based on a suggestion from Andrew Gierth
Discussion: https://postgr.es/m/87woykk7mu.fsf%40news-spur.riddles.org.uk
2018-04-05 04:05:40 +01:00
Heikki Linnakangas f044d71e33 Use ARMv8 CRC instructions where available.
ARMv8 introduced special CPU instructions for calculating CRC-32C. Use
them, when available, for speed.

Like with the similar Intel CRC instructions, several factors affect
whether the instructions can be used. The compiler intrinsics for them must
be supported by the compiler, and the instructions must be supported by the
target architecture. If the compilation target architecture does not
support the instructions, but adding "-march=armv8-a+crc" makes them
available, then we compile the code with a runtime check to determine if
the host we're running on supports them or not.

For the runtime check, use glibc getauxval() function. Unfortunately,
that's not very portable, but I couldn't find any more portable way to do
it. If getauxval() is not available, the CRC instructions will still be
used if the target architecture supports them without any additional
compiler flags, but the runtime check will not be available.
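
The runtime check is roughly this (a hedged sketch; the headers are
Linux-specific):

    #include <sys/auxv.h>      /* getauxval() */
    #include <asm/hwcap.h>     /* HWCAP_CRC32 */

    static int
    crc32c_armv8_available(void)
    {
    #ifdef HWCAP_CRC32
        return (getauxval(AT_HWCAP) & HWCAP_CRC32) != 0;
    #else
        return 0;              /* no getauxval(): rely on compile-time -march */
    #endif
    }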

Original patch by Yuqi Gu, heavily modified by me. Reviewed by Andres
Freund, Thomas Munro.

Discussion: https://www.postgresql.org/message-id/HE1PR0801MB1323D171938EABC04FFE7FA9E3110%40HE1PR0801MB1323.eurprd08.prod.outlook.com
2018-04-04 12:22:45 +03:00
Tom Lane dddfc4cb2e Prevent accidental linking of system-supplied copies of libpq.so etc.
We were being careless in some places about the order of -L switches in
link command lines, such that -L switches referring to external directories
could come before those referring to directories within the build tree.
This made it possible to accidentally link a system-supplied library, for
example /usr/lib/libpq.so, in place of the one built in the build tree.
Hilarity ensued, the more so the older the system-supplied library is.

To fix, break LDFLAGS into two parts, a sub-variable LDFLAGS_INTERNAL
and the main LDFLAGS variable, both of which are "recursively expanded"
so that they can be incrementally adjusted by different makefiles.
Establish a policy that -L switches for directories in the build tree
must always be added to LDFLAGS_INTERNAL, while -L switches for external
directories must always be added to LDFLAGS.  This is sufficient to
ensure a safe search order.  For simplicity, we typically also put -l
switches for the respective libraries into those same variables.
(Traditional make usage would have us put -l switches into LIBS, but
cleaning that up is a project for another day, as there's no clear
need for it.)

This turns out to also require separating SHLIB_LINK into two variables,
SHLIB_LINK and SHLIB_LINK_INTERNAL, with a similar rule about which
switches go into which variable.  And likewise for PG_LIBS.

Although this change might appear to affect external users of pgxs.mk,
I think it doesn't; they shouldn't have any need to touch the _INTERNAL
variables.

In passing, tweak src/common/Makefile so that the value of CPPFLAGS
recorded in pg_config lacks "-DFRONTEND" and the recorded value of
LDFLAGS lacks "-L../../../src/common".  Both of those things are
mistakes, apparently introduced during prior code rearrangements,
as old versions of pg_config don't print them.  In general we don't
want anything that's specific to the src/common subdirectory to
appear in those outputs.

This is certainly a bug fix, but in view of the lack of field
complaints, I'm unsure whether it's worth the risk of back-patching.
In any case it seems wise to see what the buildfarm makes of it first.

Discussion: https://postgr.es/m/25214.1522604295@sss.pgh.pa.us
2018-04-03 16:26:05 -04:00
Peter Eisentraut 4ab2999815 Attempt to fix jsonb_plperl build on Windows 2018-04-03 10:43:41 -04:00
Simon Riggs d204ef6377 MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows,
a task that would otherwise require multiple PL statements.
e.g.

MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
  UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
  INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
  DO NOTHING;

MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.

MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.

MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.

Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.

This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.

Various issues reported via sqlsmith by Andreas Seltenreich

Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs

Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 09:28:16 +01:00
Simon Riggs 7cf8a5c302 Revert "Modified files for MERGE"
This reverts commit 354f13855e.
2018-04-02 21:34:15 +01:00
Simon Riggs 354f13855e Modified files for MERGE 2018-04-02 21:12:47 +01:00
Andres Freund 51bc271790 Add Bloom filter implementation.
A Bloom filter is a space-efficient, probabilistic data structure that
can be used to test set membership.  Callers will sometimes incur false
positives, but never false negatives.  The rate of false positives is a
function of the total number of elements and the amount of memory
available for the Bloom filter.
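
For intuition, a toy version of the structure (hedged; the committed
code sizes the filter and derives the hashes far more carefully):

    #include <stdbool.h>
    #include <stdint.h>

    #define NBITS  (1 << 20)   /* filter size; must be tuned to element count */
    #define NHASH  2           /* number of hash functions */

    static uint8_t bits[NBITS / 8];

    /* callers supply two independent hash values per element */
    static void
    bloom_add(uint64_t h1, uint64_t h2)
    {
        for (int i = 0; i < NHASH; i++)
        {
            uint64_t bit = (h1 + i * h2) % NBITS;

            bits[bit / 8] |= 1 << (bit % 8);
        }
    }

    /* false: definitely absent; true: possibly present */
    static bool
    bloom_maybe_contains(uint64_t h1, uint64_t h2)
    {
        for (int i = 0; i < NHASH; i++)
        {
            uint64_t bit = (h1 + i * h2) % NBITS;

            if (!(bits[bit / 8] & (1 << (bit % 8))))
                return false;
        }
        return true;
    }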

Two classic applications of Bloom filters are cache filtering, and data
synchronization testing.  Any user of Bloom filters must accept the
possibility of false positives as a cost worth paying for the benefit in
space efficiency.

This commit adds a test harness extension module, test_bloomfilter.  It
can be used to get a sense of how the Bloom filter implementation
performs under varying conditions.

This is infrastructure for the upcoming "heapallindexed" amcheck patch,
which verifies the consistency of a heap relation against one of its
indexes.

Author: Peter Geoghegan
Reviewed-By: Andrey Borodin, Michael Paquier, Thomas Munro, Andres Freund
Discussion: https://postgr.es/m/CAH2-Wzm5VmG7cu1N-H=nnS57wZThoSDQU+F5dewx3o84M+jY=g@mail.gmail.com
2018-03-31 17:49:41 -07:00
Peter Eisentraut 75e95dd79b Attempt to fix jsonb_plpython build on Windows 2018-03-28 11:49:23 -04:00
Tom Lane ef1978d6ed Update pgindent's typedefs blacklist, and make it easier to adjust.
It seems that all buildfarm members are now using the <stdbool.h> code
path, so that none of them report "bool" as a typedef.  We still need it
to be treated that way, so adjust pgindent to force that whether or not
it's in the given list.

Also, the recent introduction of LLVM infrastructure has caused the
appearance of some typedef names that we definitely *don't* want
treated as typedefs, such as "string" and "abs".  Extend the existing
blacklist to include these.  (Additions based on comparing v10's
typedefs list to what the buildfarm is currently emitting.)

Rearrange the code so that the lists of whitelisted/blacklisted
names are a bit easier to find and modify.

Andrew Dunstan and Tom Lane

Discussion: https://postgr.es/m/28690.1521912334@sss.pgh.pa.us
2018-03-27 18:15:39 -04:00
Andres Freund 2a0faed9d7 Add expression compilation support to LLVM JIT provider.
In addition to the interpretation of expressions (which backs
evaluation of WHERE clauses, target list projection, aggregate
transition values, etc.), support compiling expressions to native code,
using the infrastructure added in earlier commits.

To avoid duplicating a lot of code, only support emitting code for
cases that are likely to be performance critical. For expression steps
that aren't deemed that, use the existing interpreter.

The generated code isn't great - some architectural changes are
required to address that. But this already yields a significant
speedup for some analytics queries, particularly with WHERE clauses
filtering a lot, or computing multiple aggregates.

Author: Andres Freund
Tested-By: Thomas Munro
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de

Disable JITing for VALUES() nodes.

VALUES() nodes are only ever executed once. This is primarily helpful
for debugging, when forcing JITing even for cheap queries.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-22 14:45:59 -07:00
Andres Freund 7ec0d80c05 Add helpers for emitting LLVM IR.
These basically just help to make code a bit more concise and pgindent
proof.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-22 11:51:58 -07:00
Andres Freund b96d550eb0 Support for optimizing and emitting code in LLVM JIT provider.
This commit introduces the ability to actually generate code using
LLVM. In particular, this adds:

- Ability to emit code both in heavily optimized and largely
  unoptimized fashion
- Batching facility to allow functions to be defined in small
  increments, but optimized and emitted in executable form in larger
  batches (for performance and memory efficiency)
- Type and function declaration synchronization between runtime
  generated code and normal postgres code. This is critical to be able
  to access struct fields etc.
- Developer oriented jit_dump_bitcode GUC, for inspecting / debugging
  the generated code.
- per JitContext statistics of number of functions, time spent
  generating code, optimizing, and emitting it.  This will later be
  employed for EXPLAIN support.

This commit doesn't yet contain any code actually generating
functions. That'll follow in later commits.

Documentation for GUCs added, and for JIT in general, will be added in
later commits.

Author: Andres Freund, with contributions by Pierre Ducroquet
Testing-By: Thomas Munro, Peter Eisentraut
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-22 11:05:22 -07:00
Robert Haas e2f1eb0ee3 Implement partition-wise grouping/aggregation.
If the partition keys of the input relation are part of the GROUP BY
clause, all the rows belonging to a given group come from a single
partition.  This allows aggregation/grouping over a partitioned
relation to be broken down into aggregation/grouping on each
partition.  This should be no worse, and often better, than the normal
approach.

If the GROUP BY clause does not contain all the partition keys, we can
still perform partial aggregation for each partition and then finalize
aggregation after appending the partial results.  This is less certain
to be a win, but it's still useful.

Jeevan Chalke, Ashutosh Bapat, Robert Haas.  The larger patch series
of which this patch is a part was also reviewed and tested by Antonin
Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin
Knizhnik, Pascal Legrand, and Rafia Sabih.

Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com
2018-03-22 12:49:48 -04:00
Teodor Sigaev 2058d6a22b Add conditional.c to libpgfeutils for MSVC build
conditional.c was moved in commit f67b113ac6
but was not added to the Windows build system.

I don't have a Windows box, so blind attempt.
2018-03-22 19:45:34 +03:00
Andres Freund 432bb9e04d Basic JIT provider and error handling infrastructure.
This commit introduces:

1) JIT provider abstraction, which allows JIT functionality to be
   implemented in separate shared libraries. That's desirable because
it allows installing JIT support as a separate package, and because
   it allows experimentation with different forms of JITing.
2) JITContexts which can be, using functions introduced in follow up
   commits, used to emit JITed functions, and have them be cleaned up
   on error.
3) The outline of a LLVM JIT provider, which will be fleshed out in
   subsequent commits.
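
Concretely, point 1 means a provider is just a shared library exporting
an init hook that fills in a callback table; a hedged sketch (callback
names assumed):

    #include <stdbool.h>

    struct JitContext;
    struct ExprState;

    typedef struct JitProviderCallbacks
    {
        void        (*reset_after_error) (void);
        void        (*release_context) (struct JitContext *context);
        bool        (*compile_expr) (struct ExprState *state);
    } JitProviderCallbacks;

    /* implementations live in the provider library, e.g. the LLVM one */
    extern void llvm_reset_after_error(void);
    extern void llvm_release_context(struct JitContext *context);
    extern bool llvm_compile_expr(struct ExprState *state);

    /* core code loads the provider .so and calls this well-known symbol */
    void
    _PG_jit_provider_init(JitProviderCallbacks *cb)
    {
        cb->reset_after_error = llvm_reset_after_error;
        cb->release_context = llvm_release_context;
        cb->compile_expr = llvm_compile_expr;
    }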

Documentation for GUCs added, and for JIT in general, will be added in
later commits.

Author: Andres Freund, with architectural input from Jeff Davis
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-21 19:28:28 -07:00
Peter Eisentraut 8a3d942529 Add ssl_passphrase_command setting
This allows specifying an external command for prompting for or
otherwise obtaining passphrases for SSL key files.  This is useful
because in many cases there is no TTY easily available during service
startup.

Also add a setting ssl_passphrase_command_supports_reload, which allows
supporting SSL configuration reload even if SSL files need passphrases.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
2018-03-17 08:28:51 -04:00
Tom Lane 84a3611ccc Fix msvc/ecpg_regression.proj for recent ECPG test additions.
Commit 3b7ab4380 added some tests that require ecpg to be given the
new "-C ORACLE" switch.  Teach the MSVC build infrastructure about
that.

Michael Paquier

Discussion: https://postgr.es/m/8299.1521154647@sss.pgh.pa.us
2018-03-15 22:36:19 -04:00
Magnus Hagander 6946280cde Actually pick .lib file when multiple perl libs are present
7240962f86 got it right in the comment,
but the code did not actually do what the comment said. Fix that.

Issue pointed out by Noah Misch.
2018-03-04 18:01:25 +01:00
Magnus Hagander 7240962f86 Fix msvc builds for ActivePerl > 5.24
From this version ActivePerl ships both a .lib and a .a file for the
perl library, which our code would detect as there being no library
available. Instead, we should pick the .lib version and use that.

Report and suggested fix in bug #15065

Author: Heath Lord
2018-03-02 12:40:49 +01:00
Tom Lane 51057feaa6 Fix up ecpg's configuration so it handles "long long int" in MSVC builds.
Although configure-based builds correctly define HAVE_LONG_LONG_INT when
appropriate (in both pg_config.h and ecpg_config.h), builds using the MSVC
scripts failed to do so.  This currently has no impact on the backend,
since it uses that symbol nowhere; but it does prevent ecpg from
supporting "long long int".  Fix that.

Also, adjust Solution.pm so that in the constructed ecpg_config.h file,
the "#if (_MSC_VER > 1200)" covers only the LONG_LONG_INT-related
#defines, not the whole file.  AFAICS this was a thinko on somebody's
part: ENABLE_THREAD_SAFETY should always be defined in Windows builds,
and in branches using USE_INTEGER_DATETIMES, the setting of that shouldn't
depend on the compiler version either.  If I'm wrong, I imagine the
buildfarm will say so.
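
After the fix, the generated ecpg_config.h looks roughly like this (a
hedged sketch):

    /* only the long-long bits stay under the compiler-version guard */
    #if (_MSC_VER > 1200)
    #define HAVE_LONG_LONG_INT 1
    #define HAVE_LONG_LONG_INT_64 1
    #endif

    /* unconditional, outside the guard */
    #define ENABLE_THREAD_SAFETY 1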

Per bug #15080 from Jonathan Allen; issue diagnosed by Michael Meskes
and Andrew Gierth.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/151935568942.1461.14623890240535309745@wrigleys.postgresql.org
2018-02-27 16:46:52 -05:00
Noah Misch 582edc369c Empty search_path in Autovacuum and non-psql/pgbench clients.
This makes the client programs behave as documented regardless of the
connect-time search_path and regardless of user-created objects.  Today,
a malicious user with CREATE permission on a search_path schema can take
control of certain of these clients' queries and invoke arbitrary SQL
functions under the client identity, often a superuser.  This is
exploitable in the default configuration, where all users have CREATE
privilege on schema "public".
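
The hardening amounts to each client pinning an empty search_path right
after connecting, along these lines (a hedged libpq sketch):

    #include <libpq-fe.h>

    /* with an empty search_path, only schema-qualified names resolve */
    static void
    harden_search_path(PGconn *conn)
    {
        PGresult   *res = PQexec(conn,
            "SELECT pg_catalog.set_config('search_path', '', false)");

        PQclear(res);
    }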

This changes behavior of user-defined code stored in the database, like
pg_index.indexprs and pg_extension_config_dump().  If they reach code
bearing unqualified names, "does not exist" or "no schema has been
selected to create in" errors might appear.  Users may fix such errors
by schema-qualifying affected names.  After upgrading, consider watching
server logs for these errors.

The --table arguments of src/bin/scripts clients have been lax; for
example, "vacuumdb -Zt pg_am\;CHECKPOINT" performed a checkpoint.  That
now fails, but for now, "vacuumdb -Zt 'pg_am(amname);CHECKPOINT'" still
performs a checkpoint.

Back-patch to 9.3 (all supported versions).

Reviewed by Tom Lane, though this fix strategy was not his first choice.
Reported by Arseniy Sharoglazov.

Security: CVE-2018-1058
2018-02-26 07:39:44 -08:00
Robert Haas 9da0cc3528 Support parallel btree index builds.
To make this work, tuplesort.c and logtape.c must also support
parallelism, so this patch adds that infrastructure and then applies
it to the particular case of parallel btree index builds.  Testing
to date shows that this can often be 2-3x faster than a serial
index build.

The model for deciding how many workers to use is fairly primitive
at present, but it's better than not having the feature.  We can
refine it as we get more experience.

Peter Geoghegan with some help from Rushabh Lathia.  While Heikki
Linnakangas is not an author of this patch, he wrote other patches
without which this feature would not have been possible, and
therefore the release notes should possibly credit him as an author
of this feature.  Reviewed by Claudio Freire, Heikki Linnakangas,
Thomas Munro, Tels, Amit Kapila, me.

Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com
Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com
2018-02-02 13:32:44 -05:00
Peter Eisentraut f75a959155 Refactor client-side SSL certificate checking code
Separate the parts specific to the SSL library from the general logic.

The previous code structure was

open_client_SSL()
calls verify_peer_name_matches_certificate()
calls verify_peer_name_matches_certificate_name()
calls wildcard_certificate_match()

and was completely in fe-secure-openssl.c.  The new structure is

open_client_SSL() [openssl]
calls pq_verify_peer_name_matches_certificate() [generic]
calls pgtls_verify_peer_name_matches_certificate_guts() [openssl]
calls openssl_verify_peer_name_matches_certificate_name() [openssl]
calls pq_verify_peer_name_matches_certificate_name() [generic]
calls wildcard_certificate_match() [generic]

Move the generic functions into a new file fe-secure-common.c, so the
calls generally go fe-connect.c -> fe-secure.c -> fe-secure-${impl}.c ->
fe-secure-common.c, although there is a bit of back-and-forth between
the last two.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2018-01-30 22:56:24 -05:00
Robert Haas 2f17844104 Allow UPDATE to move rows between partitions.
When an UPDATE causes a row to no longer match the partition
constraint, try to move it to a different partition where it does
match the partition constraint.  In essence, the UPDATE is split into
a DELETE from the old partition and an INSERT into the new one.  This
can lead to surprising behavior in concurrency scenarios because
EvalPlanQual rechecks won't work as they normally did; the known
problems are documented.  (There is a pending patch to improve the
situation further, but it needs more review.)

Amit Khandekar, reviewed and tested by Amit Langote, David Rowley,
Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro
Herrera, Amit Kapila, and me.  A few final revisions by me.

Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com
2018-01-19 15:33:06 -05:00
Andres Freund 69c3936a14 Expression evaluation based aggregate transition invocation.
Previously aggregate transition and combination functions were invoked
by special case code in nodeAgg.c, evaluating input and filters
separately using the expression evaluation machinery. That turns out
to not be great for performance for several reasons:

- repeated expression evaluations have some cost
- the transition functions invocations are poorly predicted, as
  commonly there are multiple aggregates in a query, resulting in the
  same call-stack invoking different functions.
- filter and input computation had to be done separately
- the special case code made it hard to implement JITing of the whole
  transition function invocation

Address this by building one large expression that computes input,
evaluates filters, and invokes transition functions.

This leads to moderate speedups in queries bottlenecked by aggregate
computations, and enables large speedups for similar cases once JITing
is done.

There's potential for further improvement:
- It'd be nice if we could simplify the somewhat expensive
  aggstate->all_pergroups lookups.
- right now there's still an advance_transition_function invocation in
  nodeAgg.c, leading to some code duplication.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-01-09 13:25:38 -08:00
Andrew Dunstan 99d5a3ffb9 Fix use of config-specific libraries for Windows OpenSSL
Commit 614350a3 allowed for an different builds of OpenSSL libraries on
Windows, but ignored the fact that the alternative builds don't have
config-specific libraries. This patch fixes the Solution file to ask for
the correct libraries.

per offline discussions with Leonardo Cecchi and Marco Nenciarini,

Backpatch to all live branches.
2018-01-03 15:36:54 -05:00
Bruce Momjian 9d4649ca49 Update copyright for 2018
Backpatch-through: certain files through 9.3
2018-01-02 23:30:12 -05:00
Andres Freund 1804284042 Add parallel-aware hash joins.
Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel
Hash Join with Parallel Hash.  While hash joins could already appear in
parallel queries, they were previously always parallel-oblivious and had a
partial subplan only on the outer side, meaning that the work of the inner
subplan was duplicated in every worker.

After this commit, the planner will consider using a partial subplan on the
inner side too, using the Parallel Hash node to divide the work over the
available CPU cores and combine its results in shared memory.  If the join
needs to be split into multiple batches in order to respect work_mem, then
workers process different batches as much as possible and then work together
on the remaining batches.

The advantages of a parallel-aware hash join over a parallel-oblivious hash
join used in a parallel query are that it:

 * avoids wasting memory on duplicated hash tables
 * avoids wasting disk space on duplicated batch files
 * divides the work of building the hash table over the CPUs

One disadvantage is that there is some communication between the participating
CPUs which might outweigh the benefits of parallelism in the case of small
hash tables.  This is avoided by the planner's existing reluctance to supply
partial plans for small scans, but it may be necessary to estimate
synchronization costs in future if that situation changes.  Another is that
outer batch 0 must be written to disk if multiple batches are required.

A potential future advantage of parallel-aware hash joins is that right and
full outer joins could be supported, since there is a single set of matched
bits for each hashtable, but that is not yet implemented.

A new GUC enable_parallel_hash is defined to control the feature, defaulting
to on.

Author: Thomas Munro
Reviewed-By: Andres Freund, Robert Haas
Tested-By: Rafia Sabih, Prabhat Sahu
Discussion:
    https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
    https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com
2017-12-21 00:43:41 -08:00
Andres Freund ab9e0e718a Add shared tuplestores.
SharedTuplestore allows multiple participants to write into it and
then read the tuples back from it in parallel.  Each reader receives
partial results.

For now it always uses disk files, but other buffering policies and
other kinds of scans (ie each reader receives complete results) may be
useful in future.

The upcoming parallel hash join feature will use this facility.

Author: Thomas Munro
Reviewed-By: Peter Geoghegan, Andres Freund, Robert Haas
Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
2017-12-18 14:23:19 -08:00
Peter Eisentraut 632b03da31 Start a separate test suite for plpgsql
The plpgsql.sql test file in the main regression tests is now by far the
largest after numeric_big, making editing and managing the test cases
very cumbersome.  The other PLs have their own test suites split up into
smaller files by topic.  It would be nice to have that for plpgsql as
well.  So, to get that started, set up test infrastructure in
src/pl/plpgsql/src/ and split out the recently added procedure test
cases into a new file there.  That file now mirrors the test cases added
to the other PLs, making managing those matching tests a bit easier too.

msvc build system changes with help from Michael Paquier
2017-12-13 11:02:29 -05:00
Noah Misch 7e0c574ee2 MSVC 2012+: Permit linking to 32-bit, MinGW-built libraries.
Notably, this permits linking to the 32-bit Perl binaries advertised on
perl.org, namely Strawberry Perl and ActivePerl.  This has a side effect
of permitting linking to binaries built with obsolete MSVC versions.

By default, MSVC 2012 and later require a "safe exception handler table"
in each binary.  MinGW-built, 32-bit DLLs lack the relevant exception
handler metadata, so linking to them failed with error LNK2026.  Restore
the semantics of MSVC 2010, which omits the table from a given binary if
some linker input lacks metadata.  This has no effect on 64-bit builds
or on MSVC 2010 and earlier.  Back-patch to 9.3 (all supported
versions).

Reported by Victor Wagner.

Discussion: https://postgr.es/m/20160326154321.7754ab8f@wagner.wagner.home
2017-12-09 00:58:55 -08:00
Noah Misch 65a00f3035 MSVC: Test whether 32-bit Perl needs -D_USE_32BIT_TIME_T.
Commits 5a5c2feca3 and
b5178c5d08 introduced support for modern
MSVC-built, 32-bit Perl, but they broke use of MinGW-built, 32-bit Perl
distributions like Strawberry Perl and modern ActivePerl.  Perl has no
robust means to report whether it expects a -D_USE_32BIT_TIME_T ABI, so
test this.  Back-patch to 9.3 (all supported versions).

The chief alternative was a heuristic of adding -D_USE_32BIT_TIME_T when
$Config{gccversion} is nonempty.  That banks on every gcc-built Perl
using the same ABI.  gcc could change its default ABI the way MSVC once
did, and one could build Perl with gcc and the non-default ABI.

The GNU make build system could benefit from a similar test, without
which it does not support MSVC-built Perl.  For now, just add a comment.
Most users taking the special step of building Perl with MSVC probably
build PostgreSQL with MSVC.

Discussion: https://postgr.es/m/20171130041441.GA3161526@rfd.leadboat.com
2017-12-08 18:06:05 -08:00
Andres Freund dc6c4c9dc2 Add infrastructure for sharing temporary files between backends.
SharedFileSet allows temporary files to be created by one backend and
then exported for read-only access by other backends, with clean-up
managed by reference counting associated with a DSM segment.  This
includes changes to fd.c and buffile.c to support the new kind of
temporary file.
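
Usage, in rough outline (hedged; function names as introduced here,
exact signatures assumed):

    #include "storage/buffile.h"
    #include "storage/dsm.h"
    #include "storage/sharedfileset.h"

    /* exporting backend: files live in a fileset cleaned up with the segment */
    static void
    spill_writer(SharedFileSet *fileset, dsm_segment *seg)
    {
        BufFile    *f;

        SharedFileSetInit(fileset, seg);
        f = BufFileCreateShared(fileset, "spill");
        /* ... BufFileWrite(f, data, len) ... */
        BufFileExportShared(f);    /* flushed; read-only from now on */
    }

    /* any backend attached to the same fileset can then read it back */
    static void
    spill_reader(SharedFileSet *fileset)
    {
        BufFile    *f = BufFileOpenShared(fileset, "spill");

        /* ... BufFileRead(f, buf, len) ... */
        BufFileClose(f);
    }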

This will be used by an upcoming patch adding support for parallel
hash joins.

Author: Thomas Munro
Reviewed-By: Peter Geoghegan, Andres Freund, Robert Haas, Rushabh Lathia
Discussion:
    https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
    https://postgr.es/m/CAH2-WznJ_UgLux=_jTgCQ4yFz0iBntudsNKa1we3kN1BAG=88w@mail.gmail.com
2017-12-01 16:30:56 -08:00
Robert Haas eaedf0df71 Update typedefs.list and re-run pgindent
Discussion: http://postgr.es/m/CA+TgmoaA9=1RWKtBWpDaj+sF3Stgc8sHgf5z=KGtbjwPLQVDMA@mail.gmail.com
2017-11-29 09:24:24 -05:00
Magnus Hagander d5f965c257 Fix typo in comment
Andreas Karlsson
2017-11-27 09:24:14 +01:00
Tom Lane 6d4ae6a8e7 Update MSVC build process for new timezone data.
Missed this dependency in commits 7cce222c9 et al.
2017-11-25 18:15:22 -05:00
Noah Misch 84c4313c6f Support linking with MinGW-built Perl.
This is necessary for ActivePerl 5.18 onwards and for Strawberry Perl.
It is not sufficient for 32-bit builds with newer Visual Studio; these
fail with error LNK2026.  Back-patch to 9.3 (all supported versions).

Reported by Victor Wagner.

Discussion: https://postgr.es/m/20160326154321.7754ab8f@wagner.wagner.home
2017-11-23 20:22:04 -08:00
Andres Freund 7082e614c0 Provide DSM segment to ExecXXXInitializeWorker functions.
Previously, executor nodes running in parallel worker processes didn't
have access to the dsm_segment object used for parallel execution.  In
order to support resource management based on DSM segment lifetime,
they need that.  So create a ParallelWorkerContext object to hold it
and pass it to all InitializeWorker functions.
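
The context object itself is small (a hedged sketch of the shape
described):

    #include "storage/dsm.h"
    #include "storage/shm_toc.h"

    /* passed to each ExecXXXInitializeWorker() in place of a bare shm_toc */
    typedef struct ParallelWorkerContext
    {
        dsm_segment *seg;   /* lets nodes tie resources to segment lifetime */
        shm_toc    *toc;    /* per-query table of contents, as before */
    } ParallelWorkerContext;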

Author: Thomas Munro
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
2017-11-16 17:39:18 -08:00
Andrew Dunstan 98d54bb779 Back out the session_start and session_end hooks feature.
It's become apparent during testing that there are problems with at
least the testing regime. I don't think we should have it without a
working test regime, and the difficulties might indicate implementation
problems anyway, so I'm backing out the whole thing until that's sorted
out.

This reverts commits 7459484 9989f92 cd8ce3a
2017-11-16 11:35:02 -05:00
Andrew Dunstan 745948422c Disable installcheck tests for test_session_hooks
The module requires a preloaded library and the defect can't be cured by
a LOAD instruction in the test script. To achieve this we override the
installcheck target in the module's Makefile, and exclude ithe module in
vcregress.pl.

Along the way, revert commit 9989f92aab.
2017-11-15 17:49:04 -05:00
Noah Misch 9363b8b79b MSVC: Rebuild spiexceptions.h when out of date.
Also, add a warning to catch future instances of naming a nonexistent
file as a prerequisite.  Back-patch to 9.3 (all supported versions).
2017-11-12 18:43:32 -08:00
Robert Haas 1aba8e651a Add hash partitioning.
Hash partitioning is useful when you want to partition a growing data
set evenly.  This can be useful to keep table sizes reasonable, which
makes maintenance operations such as VACUUM faster, or to enable
partition-wise join.

At present, we still depend on constraint exclusion for partitioning
pruning, and the shape of the partition constraints for hash
partitioning is such that that doesn't work.  Work is underway to fix
that, which should both improve performance and make partitioning
pruning work with hash partitioning.

Amul Sul, reviewed and tested by Dilip Kumar, Ashutosh Bapat, Yugo
Nagata, Rajkumar Raghuwanshi, Jesper Pedersen, and by me.  A few
final tweaks also by me.

Discussion: http://postgr.es/m/CAAJ_b96fhpJAP=ALbETmeLk1Uni_GFZD938zgenhF49qgDTjaQ@mail.gmail.com
2017-11-09 18:07:44 -05:00
Andrew Dunstan 74d2c0dbfd Improve gendef.pl diagnostic on failure to open sym file
There have been numerous buildfarm failures but the diagnostic is
currently silent about the reason for failure to open the file. Let's
see if we can get to the bottom of it.

Backpatch to all live branches.
2017-10-26 10:01:02 -04:00
Andres Freund 141fd1b66c Improve sys/catcache performance.
The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
   more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
entries - instead store them as a Datum array. This also allows us to
   get rid of having to build dummy tuples for negative & list
   entries, and of a hack for dealing with cstring vs. text weirdness.
3) Reorder members of the catcache.h struct, so important entries are more
   likely to be on one cacheline.
4) Allowing the compiler to specialize critical SearchCatCache for a
   specific number of attributes allows it to unroll loops and avoid
   other nkeys-dependent initialization.
5) Only initializing the ScanKey when necessary, i.e. catcache misses,
   greatly reduces unnecessary CPU cache misses.
6) Split of the cache-miss case from the hash lookup, reducing stack
   allocations etc in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
   piece.

This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.
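
Point 7, for instance, replaces two allocations with one; a hedged
sketch (ntp here stands for the tuple being cached):

    /* allocate header and tuple copy as a single palloc chunk, so that
     * freeing the CatCTup releases the tuple as well */
    ct = (CatCTup *) palloc(sizeof(CatCTup) + MAXIMUM_ALIGNOF + ntp->t_len);
    ct->tuple.t_len = ntp->t_len;
    ct->tuple.t_data = (HeapTupleHeader)
        MAXALIGN(((char *) ct) + sizeof(CatCTup));
    memcpy((char *) ct->tuple.t_data, (char *) ntp->t_data, ntp->t_len);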

I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
  shows up in profiles. Unfortunately it's not easy to do so safely as
  an entry's memory location can change at various times, which
  doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
  but the win isn't big and the code for it is ugly, because the
  tuples have to be freed as well.
- add more proper functions, rather than macros for
  SearchSysCacheCopyN etc., but right now they don't show up in
  profiles.

The reason the macro wrapper for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside.  That might be a good idea
anyway, but it's for another day.

Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-10-13 14:22:41 -07:00
Tom Lane ef73a8162a Enforce our convention about max number of parallel regression tests.
We have a very old rule that parallel_schedule should have no more
than twenty tests in any one parallel group, so as to provide a
bound on the number of concurrently running processes needed to
pass the tests.  But people keep forgetting the rule, so let's add
a few lines of code to check it.

Discussion: https://postgr.es/m/a37e9c57-22d4-1b82-1270-4501cd2e984e@2ndquadrant.com
2017-10-07 17:20:09 -04:00
Andres Freund 15334ad19a Attempt to adapt windows build for 212e6f34d5.
Per buildfarm animal baiji.
2017-10-04 09:32:02 -07:00
Tom Lane 4736d74479 Adjust git_changelog for new-style release tags.
It wasn't on board with REL_n_n format.
2017-10-04 00:45:15 -04:00
Andrew Dunstan f2ab3898f3 Support building with Visual Studio 2017
Haribabu Kommi, reviewed by Takeshi Ideriha and Christian Ullrich

Backpatch to 9.6
2017-09-25 08:03:05 -04:00
Andres Freund fc49e24fa6 Make WAL segment size configurable at initdb time.
For performance reasons a larger segment size than the default 16MB
can be useful. A larger segment size has two main benefits: Firstly,
in setups using archiving, it makes it easier to write scripts that
can keep up with higher amounts of WAL; secondly, the WAL has to be
written and synced to disk less frequently.

But at the same time large segment sizes are disadvantageous for
smaller databases. So far the segment size had to be configured at
compile time, often making it unrealistic to choose one fitting a
particular load. Therefore change it to an initdb-time setting.

This includes a breaking change to the xlogreader.h API, which now
requires the current segment size to be configured.  For that and
similar reasons a number of binaries had to be taught how to recognize
the current segment size.

Author: Beena Emerson, editorialized by Andres Freund
Reviewed-By: Andres Freund, David Steele, Kuntal Ghosh, Michael
    Paquier, Peter Eisentraut, Robert Haas, Tushar Ahuja
Discussion: https://postgr.es/m/CAOG9ApEAcQ--1ieKbhFzXSQPw_YLmepaa4hNdnY5+ZULpt81Mw@mail.gmail.com
2017-09-19 22:03:48 -07:00
Andres Freund cc5f81366c Add support for coordinating record typmods among parallel workers.
Tuples can have type RECORDOID and a typmod number that identifies a blessed
TupleDesc in a backend-private cache.  To support the sharing of such tuples
through shared memory and temporary files, provide a typmod registry in
shared memory.

To achieve that, introduce per-session DSM segments, created on demand when a
backend first runs a parallel query.  The per-session DSM segment has a
table-of-contents just like the per-query DSM segment, and initially the
contents are a shared record typmod registry and a DSA area to provide the
space it needs to grow.

State relating to the current session is accessed via a Session object
reached through global variable CurrentSession that may require significant
redesign further down the road as we figure out what else needs to be shared
or remodelled.

Author: Thomas Munro
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CAEepm=0ZtQ-SpsgCyzzYpsXS6e=kZWqk3g5Ygn3MDV7A8dabUA@mail.gmail.com
2017-09-14 19:59:21 -07:00
Tom Lane e451901804 Fix macro-redefinition warning on MSVC.
In commit 9d6b160d7, I tweaked pg_config.h.win32 to use
"#define HAVE_LONG_LONG_INT_64 1" rather than defining it as empty,
for consistency with what happens in an autoconf'd build.
But Solution.pm injects another definition of that macro into
ecpg_config.h, leading to justifiable (though harmless) compiler whining.
Make that one consistent too.  Back-patch, like the previous patch.

Discussion: https://postgr.es/m/CAEepm=1dWsXROuSbRg8PbKLh0S=8Ou-V8sr05DxmJOF5chBxqQ@mail.gmail.com
2017-09-03 11:01:08 -04:00