Commit Graph

8321 Commits

Author SHA1 Message Date
Teodor Sigaev
075aade436 Adjust INCLUDE index truncation comments and code.
Add several assertions that ensure that we're dealing with a pivot tuple
without non-key attributes where that's expected.  Also, remove the
assertion within _bt_isequal(), restoring the v10 function signature.  A
similar check will be performed for the page highkey within
_bt_moveright() in most cases.  Also avoid dropping all objects within
regression tests, to increase pg_dump test coverage for INCLUDE indexes.

Rather than using infrastructure that's generally intended to be used
with reference counted heap tuple descriptors during truncation, use the
same function that was introduced to store flat TupleDescs in shared
memory (we use a temp palloc'd buffer).  This isn't strictly necessary,
but seems more future-proof than the old approach.  It also lets us
avoid including rel.h within indextuple.c, which was arguably a
modularity violation.  Also, we now call index_deform_tuple() with the
truncated TupleDesc, not the source TupleDesc, since that's more robust,
and saves a few cycles.

In passing, fix a memory leak by pfree'ing truncated pivot tuple memory
during CREATE INDEX.  Also pfree during a page split, just to be
consistent.

Refactor _bt_check_natts() to be more readable.

Author: Peter Geoghegan with some editorization by me
Reviewed by: Alexander Korotkov, Teodor Sigaev
Discussion: https://www.postgresql.org/message-id/CAH2-Wz%3DkCWuXeMrBCopC-tFs3FbiVxQNjjgNKdG2sHxZ5k2y3w%40mail.gmail.com
2018-04-19 08:45:58 +03:00
Tom Lane
55d26ff638 Rationalize handling of single and double quotes in bootstrap data.
Change things around so that proper quoting of values interpolated into
the BKI data by initdb is the responsibility of initdb, not something
we half-heartedly handle by putting double quotes into the raw BKI data.
(Note: experimentation shows that it still doesn't work to put a double
quote into the initial superuser username, but that's the fault of
inadequate quoting while interpolating the name into SQL scripts;
the BKI aspect of it works fine now.)

Having done that, we can remove the special-case handling of values
that look like "something" from genbki.pl, and instead teach it to
escape double --- and single --- quotes properly.  This removes the
nowhere-documented need to treat those specially in the BKI source
data; whatever you write will be passed through unchanged into the
inserted data value, modulo Perl's rules about single-quoted strings.

Add documentation explaining the (pre-existing) handling of backslashes
in the BKI data.

Per an earlier discussion with John Naylor.

Discussion: https://postgr.es/m/CAJVSVGUNao=-Q2-vAN3PYcdF5tnL5JAHwGwzZGuYHtq+Mk_9ng@mail.gmail.com
2018-04-17 19:53:50 -04:00
Tom Lane
9ffcccdb95 Rationalize handling of array type names in bootstrap data.
Formerly, Catalog.pm turned a C array type declaration in the catalog
header files into a SQL type, e.g., 'foo[]'.  Along the way, genbki.pl
turned this into '_foo' for the purpose of type lookups, but wrote 'foo[]'
to postgres.bki.  During bootstrap, bootscanner.l had to have a special
case rule to tokenize this, and then MapArrayTypeName() would turn 'foo[]'
into '_foo' one more time.

This seems unnecessarily complicated, especially since nobody cares that
much about the readability of postgres.bki.  Instead, make Catalog.pm
convert the C declaration into '_foo' to start with, and preserve that
representation of the type name throughout bootstrap data processing.
Then rip out the special-case code in bootscanner.l and bootstrap.c.

This changes postgres.bki to the extent that array fields are now
declared like
  proconfig = _text ,
rather than
  proconfig = text[] ,

No documentation update, since the SGML docs didn't mention any of this
in the first place, and it's all pretty transparent to writers of
catalog header files anyway.

John Naylor

Discussion: https://postgr.es/m/CAJVSVGUNao=-Q2-vAN3PYcdF5tnL5JAHwGwzZGuYHtq+Mk_9ng@mail.gmail.com
2018-04-17 18:29:11 -04:00
Heikki Linnakangas
cf5a189059 Fix confusion on the padding of GIDs in on commit and abort records.
Review of commit 1eb6d652: It's pointless to add padding to the GID fields,
when the code that follows assumes that there is no alignment, and uses
memcpy(). Remove the pointless padding.

Update comments to note the new fields in the WAL records.

Reviewed-by: Michael Paquier
Discussion: https://www.postgresql.org/message-id/33b787bf-dc20-1161-54e9-3f3b607bf59d%40iki.fi
2018-04-17 16:10:42 -04:00
Alvaro Herrera
b7e2cbc5b4 Update Append's idea of first_partial_plan
It turns out that after runtime partition pruning, Append's
first_partial_plan does not accurately represent partial plans to run,
if any of those got pruned.  This could limit participation of workers
in some partial subplans, if other subplans got pruned.  Fix it by
keeping an index of the first valid partial subplan in the state node,
determined at execnode Init time.

Author: David Rowley, with cosmetic changes by me.
Discussion: https://postgr.es/m/CAKJS1f8o2Yd=rOP=Et3A0FWgF+gSAOkFSU6eNhnGzTPV7nN8sQ@mail.gmail.com
2018-04-17 16:25:02 -03:00
Alvaro Herrera
4d64abc2fe List src/include/partitioning in src/include/Makefile
This omission prevented partitioning header files from being installed.

Per buildfarm member crake.
2018-04-14 21:33:32 -03:00
Alvaro Herrera
da6f3e45dd Reorganize partitioning code
There's been a massive addition of partitioning code in PostgreSQL 11,
with little oversight on its placement, resulting in a
catalog/partition.c with poorly defined boundaries and responsibilities.
This commit tries to set a couple of distinct modules to separate things
a little bit.  There are no code changes here, only code movement.

There are three new files:
  src/backend/utils/cache/partcache.c
  src/include/partitioning/partdefs.h
  src/include/utils/partcache.h

The previous arrangement of #including catalog/partition.h almost
everywhere is no more.

Authors: Amit Langote and Álvaro Herrera
Discussion: https://postgr.es/m/98e8d509-790a-128c-be7f-e48a5b2d8d97@lab.ntt.co.jp
	https://postgr.es/m/11aa0c50-316b-18bb-722d-c23814f39059@lab.ntt.co.jp
	https://postgr.es/m/143ed9a4-6038-76d4-9a55-502035815e68@lab.ntt.co.jp
	https://postgr.es/m/20180413193503.nynq7bnmgh6vs5vm@alvherre.pgsql
2018-04-14 21:12:14 -03:00
Tom Lane
2a67d6440d Add commentary explaining why MaxIndexTuplesPerPage calculation is safe.
MaxIndexTuplesPerPage ignores the fact that btree indexes sometimes
store tuples with no data payload.  But it also ignores the possibility
of "special space" on index pages, which offsets that, so that the
result isn't an underestimate.  This all seems worth documenting, though.

In passing, remove #define MinIndexTupleSize, which was added by
commit 2c03216d8 but not used in that commit nor later ones.

Comment text by me; issue noticed by Peter Geoghegan.

Discussion: https://postgr.es/m/CAH2-WzkQmb54Kbx-YHXstRKXcNc+_87jwV3DRb54xcybLR7Oig@mail.gmail.com
2018-04-14 12:33:15 -04:00
Peter Eisentraut
a8677e3ff6 Support named and default arguments in CALL
We need to call expand_function_arguments() to expand named and default
arguments.

In PL/pgSQL, we also need to deal with named and default INOUT arguments
when receiving the output values into variables.

Author: Pavel Stehule <pavel.stehule@gmail.com>
2018-04-14 09:13:53 -04:00
Simon Riggs
08ea7a2291 Revert MERGE patch
This reverts commits d204ef6377,
83454e3c2b and a few more commits thereafter
(complete list at the end) related to MERGE feature.

While the feature was fully functional, with sufficient test coverage and
necessary documentation, it was felt that some parts of the executor and
parse-analyzer can use a different design and it wasn't possible to do that in
the available time. So it was decided to revert the patch for PG11 and retry
again in the future.

Thanks again to all reviewers and bug reporters.

List of commits reverted, in reverse chronological order:

 f1464c5380 Improve parse representation for MERGE
 ddb4158579 MERGE syntax diagram correction
 530e69e59b Allow cpluspluscheck to pass by renaming variable
 01b88b4df5 MERGE minor errata
 3af7b2b0d4 MERGE fix variable warning in non-assert builds
 a5d86181ec MERGE INSERT allows only one VALUES clause
 4b2d44031f MERGE post-commit review
 4923550c20 Tab completion for MERGE
 aa3faa3c7a WITH support in MERGE
 83454e3c2b New files for MERGE
 d204ef6377 MERGE SQL Command following SQL:2016

Author: Pavan Deolasee
Reviewed-by: Michael Paquier
2018-04-12 11:22:56 +01:00
Teodor Sigaev
c9c875a28f Rename IndexInfo.ii_KeyAttrNumbers array
Rename ii_KeyAttrNumbers to ii_IndexAttrNumbers to prevent confusion with
ii_NumIndexAttrs/ii_NumIndexKeyAttrs. ii_IndexAttrNumbers contains
all attributes including "including" columns, not only key attribute.

Discussion: https://www.postgresql.org/message-id/13123421-1d52-d0e4-c95c-6d69011e0595%40sigaev.ru
2018-04-12 13:02:45 +03:00
Alvaro Herrera
15a8f8caad Fix IndexOnlyScan counter for heap fetches in parallel mode
The HeapFetches counter was using a simple value in IndexOnlyScanState,
which fails to propagate values from parallel workers; so the counts are
wrong when IndexOnlyScan runs in parallel.  Move it to Instrumentation,
like all the other counters.

While at it, change INSERT ON CONFLICT conflicting tuple counter to use
the new ntuples2 instead of nfiltered2, which is a blatant misuse.

Discussion: https://postgr.es/m/20180409215851.idwc75ct2bzi6tea@alvherre.pgsql
2018-04-10 15:56:15 -03:00
Alvaro Herrera
468abb8f7a Fix incorrect logic for choosing the next Parallel Append subplan
In 499be013de support for pruning unneeded Append subnodes was added.
The logic in that commit was not correctly checking if the next subplan
was in fact a valid subplan. This could cause parallel workers processes
to be given a subplan to work on which didn't require any work.

Per code review following an otherwise unexplained regression failure in
buildfarm member Pademelon.  (We haven't been able to reproduce the
failure, so this is a bit of a blind fix in terms of whether it'll
actually fix it; but it is a clear bug nonetheless).

In passing, also add a comment to explain what first_partial_plan means.

Author: David Rowley
Discussion: https://postgr.es/m/CAKJS1f_E5r05hHUVG3UmCQJ49DGKKHtN=SHybD44LdzBn+CJng@mail.gmail.com
2018-04-09 17:23:49 -03:00
Tom Lane
2cdf359fc4 Make reformat_dat_file.pl preserve all blank lines.
In its original form, reformat_dat_file.pl smashed consecutive blank
lines to a single blank line, which was helpful for mopping up excess
whitespace during the bootstrap data format conversion.  But going
forward, there seems little reason to do that; if developers want to
put in multiple blank lines, let 'em.  This makes it conform to the
documentation I (tgl) wrote, too.

In passing, clean up some sloppy markup choices in bki.sgml.

John Naylor

Discussion: https://postgr.es/m/28827.1523039259@sss.pgh.pa.us
2018-04-09 14:58:39 -04:00
Tom Lane
af1a949109 Further cleanup of client dependencies on src/include/catalog headers.
In commit 9c0a0de4c, I'd failed to notice that catalog/catalog.h
should also be considered a frontend-unsafe header, because it includes
(and needs) the full form of pg_class.h, not to mention relcache.h.
However, various frontend code was depending on it to get
TABLESPACE_VERSION_DIRECTORY, so refactoring of some sort is called for.

The cleanest answer seems to be to move TABLESPACE_VERSION_DIRECTORY,
as well as the OIDCHARS symbol, to common/relpath.h.  Do that, and mop up
inclusions as necessary.  (I found that quite a few current users of
catalog/catalog.h don't seem to need it at all anymore, apparently as a
result of the refactorings that created common/relpath.[hc].  And
initdb.c needed it only as a route to pg_class_d.h.)

Discussion: https://postgr.es/m/6629.1523294509@sss.pgh.pa.us
2018-04-09 14:39:58 -04:00
Magnus Hagander
f5543d47bc catversion bump for online-checksums revert
Lack thereof pointed out by Tom Lane.
2018-04-09 19:26:58 +02:00
Magnus Hagander
a228cc13ae Revert "Allow on-line enabling and disabling of data checksums"
This reverts the backend sides of commit 1fde38beaa.
I have, at least for now, left the pg_verify_checksums tool in place, as
this tool can be very valuable without the rest of the patch as well,
and since it's a read-only tool that only runs when the cluster is down
it should be a lot safer.
2018-04-09 19:03:42 +02:00
Alvaro Herrera
d7a95f06a1 Minor comment updates
Fix a couple of typos, and update a comment about why we set a BMS to
NULL.

Author: David Rowley
Discussion: http://postgr.es/m/CAKJS1f-tux=KdUz6ENJ9GHM_V2qgxysadYiOyQS9Ko9PTteVhQ@mail.gmail.com
2018-04-09 11:17:35 -03:00
Tom Lane
4f85f66469 Cosmetic cleanups in initial catalog data.
Write ',' and ';' for typdelim values instead of the obscurantist
ASCII octal equivalents.  Not sure why anybody ever thought the
latter were better; maybe it had something to do with lack of
a better quoting convention, twenty-plus years ago?

Reassign a couple of high-numbered OIDs that were left in during
yesterday's mad rush to commit stuff of uncertain internal
temperature.

The latter requires a catversion bump, though the former wouldn't
since the end-result catalog data is unchanged.
2018-04-08 15:55:49 -04:00
Tom Lane
cca563f384 Reduce worst-case shell command line length during "make install".
Addition of the catalog/pg_foo_d.h headers seems to have pushed us over
the brink of the maximum command line length for some older platforms
during "make install" for our header files.  The main culprit here is
repetition of the target directory path, which could be long.
Rearrange so that we don't repeat that once per file, but only once
per subdirectory.

Per buildfarm.

Discussion: https://postgr.es/m/E1f5Dwm-0004n5-7O@gemulon.postgresql.org
2018-04-08 15:08:32 -04:00
Tom Lane
cefa387153 Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.
Traditionally, include/catalog/pg_foo.h contains extern declarations
for functions in backend/catalog/pg_foo.c, in addition to its function
as the authoritative definition of the pg_foo catalog's rowtype.
In some cases, we'd been forced to split out those extern declarations
into separate pg_foo_fn.h headers so that the catalog definitions
could be #include'd by frontend code.  That problem is gone as of
commit 9c0a0de4c, so let's undo the splits to make things less
confusing.

Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us
2018-04-08 14:35:29 -04:00
Tom Lane
372728b0d4 Replace our traditional initial-catalog-data format with a better design.
Historically, the initial catalog data to be installed during bootstrap
has been written in DATA() lines in the catalog header files.  This had
lots of disadvantages: the format was badly underdocumented, it was
very difficult to edit the data in any mechanized way, and due to the
lack of any abstraction the data was verbose, hard to read/understand,
and easy to get wrong.

Hence, move this data into separate ".dat" files and represent it in a way
that can easily be read and rewritten by Perl scripts.  The new format is
essentially "key => value" for each column; while it's a bit repetitive,
explicit labeling of each value makes the data far more readable and less
error-prone.  Provide a way to abbreviate entries by omitting field values
that match a specified default value for their column.  This allows removal
of a large amount of repetitive boilerplate and also lowers the barrier to
adding new columns.

Also teach genbki.pl how to translate symbolic OID references into
numeric OIDs for more cases than just "regproc"-like pg_proc references.
It can now do that for regprocedure-like references (thus solving the
problem that regproc is ambiguous for overloaded functions), operators,
types, opfamilies, opclasses, and access methods.  Use this to turn
nearly all OID cross-references in the initial data into symbolic form.
This represents a very large step forward in readability and error
resistance of the initial catalog data.  It should also reduce the
difficulty of renumbering OID assignments in uncommitted patches.

Also, solve the longstanding problem that frontend code that would like to
use OID macros and other information from the catalog headers often had
difficulty with backend-only code in the headers.  To do this, arrange for
all generated macros, plus such other declarations as we deem fit, to be
placed in "derived" header files that are safe for frontend inclusion.
(Once clients migrate to using these pg_*_d.h headers, it will be possible
to get rid of the pg_*_fn.h headers, which only exist to quarantine code
away from clients.  That is left for follow-on patches, however.)

The now-automatically-generated macros include the Anum_xxx and Natts_xxx
constants that we used to have to update by hand when adding or removing
catalog columns.

Replace the former manual method of generating OID macros for pg_type
entries with an automatic method, ensuring that all built-in types have
OID macros.  (But note that this patch does not change the way that
OID macros for pg_proc entries are built and used.  It's not clear that
making that match the other catalogs would be worth extra code churn.)

Add SGML documentation explaining what the new data format is and how to
work with it.

Despite being a very large change in the catalog headers, there is no
catversion bump here, because postgres.bki and related output files
haven't changed at all.

John Naylor, based on ideas from various people; review and minor
additional coding by me; previous review by Alvaro Herrera

Discussion: https://postgr.es/m/CAJVSVGWO48JbbwXkJz_yBFyGYW-M9YWxnPdxJBUosDC9ou_F0Q@mail.gmail.com
2018-04-08 13:17:27 -04:00
Andrew Gierth
49b0e300f7 Support index INCLUDE in the AM properties interface.
This rectifies an oversight in commit 8224de4f4, by adding a new
property 'can_include' for pg_indexam_has_property, and adjusting the
results of pg_index_column_has_property to give more appropriate
results for INCLUDEd columns.
2018-04-08 06:02:05 +01:00
Andres Freund
d234602c28 Remove overzeleous assertions in pg_atomic_flag code.
The atomics code asserts proper alignment in various places. That's
mainly because the alignment of 64bit integers is not sufficient for
atomic operations on all platforms. Some ABIs only have four byte
alignment, but don't have atomic behavior when crossing page
boundaries.

The flags code isn't affected by that however, as the type alignment
always is sufficient for atomic operations. Nevertheless the code
asserted alignment requirements. Before 8c3debbb it was only broken on
hppa, after it probably affect further platforms.

Thus remove the assertions for pg_atomic_flag operators.

Per buildfarm animal pademelon.

Discussion: https://postgr.es/m/7223.1523124425@sss.pgh.pa.us
Backpatch: 9.5-
2018-04-07 18:30:15 -07:00
Stephen Frost
c37b3d08ca Allow group access on PGDATA
Allow the cluster to be optionally init'd with read access for the
group.

This means a relatively non-privileged user can perform a backup of the
cluster without requiring write privileges, which enhances security.

The mode of PGDATA is used to determine whether group permissions are
enabled for directory and file creates.  This method was chosen as it's
simple and works well for the various utilities that write into PGDATA.

Changing the mode of PGDATA manually will not automatically change the
mode of all the files contained therein.  If the user would like to
enable group access on an existing cluster then changing the mode of all
the existing files will be required.  Note that pg_upgrade will
automatically change the mode of all migrated files if the new cluster
is init'd with the -g option.

Tests are included for the backend and all the utilities which operate
on the PG data directory to ensure that the correct mode is set based on
the data directory permissions.

Author: David Steele <david@pgmasters.net>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
2018-04-07 17:45:39 -04:00
Stephen Frost
da9b580d89 Refactor dir/file permissions
Consolidate directory and file create permissions for tools which work
with the PG data directory by adding a new module (common/file_perm.c)
that contains variables (pg_file_create_mode, pg_dir_create_mode) and
constants to initialize them (0600 for files and 0700 for directories).

Convert mkdir() calls in the backend to MakePGDirectory() if the
original call used default permissions (always the case for regular PG
directories).

Add tests to make sure permissions in PGDATA are set correctly by the
tools which modify the PG data directory.

Authors: David Steele <david@pgmasters.net>,
         Adam Brightwell <adam.brightwell@crunchydata.com>
Reviewed-By: Michael Paquier, with discussion amongst many others.
Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net
2018-04-07 17:45:39 -04:00
Alvaro Herrera
499be013de Support partition pruning at execution time
Existing partition pruning is only able to work at plan time, for query
quals that appear in the parsed query.  This is good but limiting, as
there can be parameters that appear later that can be usefully used to
further prune partitions.

This commit adds support for pruning subnodes of Append which cannot
possibly contain any matching tuples, during execution, by evaluating
Params to determine the minimum set of subnodes that can possibly match.
We support more than just simple Params in WHERE clauses. Support
additionally includes:

1. Parameterized Nested Loop Joins: The parameter from the outer side of the
   join can be used to determine the minimum set of inner side partitions to
   scan.

2. Initplans: Once an initplan has been executed we can then determine which
   partitions match the value from the initplan.

Partition pruning is performed in two ways.  When Params external to the plan
are found to match the partition key we attempt to prune away unneeded Append
subplans during the initialization of the executor.  This allows us to bypass
the initialization of non-matching subplans meaning they won't appear in the
EXPLAIN or EXPLAIN ANALYZE output.

For parameters whose value is only known during the actual execution
then the pruning of these subplans must wait.  Subplans which are
eliminated during this stage of pruning are still visible in the EXPLAIN
output.  In order to determine if pruning has actually taken place, the
EXPLAIN ANALYZE must be viewed.  If a certain Append subplan was never
executed due to the elimination of the partition then the execution
timing area will state "(never executed)".  Whereas, if, for example in
the case of parameterized nested loops, the number of loops stated in
the EXPLAIN ANALYZE output for certain subplans may appear lower than
others due to the subplan having been scanned fewer times.  This is due
to the list of matching subnodes having to be evaluated whenever a
parameter which was found to match the partition key changes.

This commit required some additional infrastructure that permits the
building of a data structure which is able to perform the translation of
the matching partition IDs, as returned by get_matching_partitions, into
the list index of a subpaths list, as exist in node types such as
Append, MergeAppend and ModifyTable.  This allows us to translate a list
of clauses into a Bitmapset of all the subpath indexes which must be
included to satisfy the clause list.

Author: David Rowley, based on an earlier effort by Beena Emerson
Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi,
Jesper Pedersen
Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com
2018-04-07 17:54:39 -03:00
Alvaro Herrera
5c0675215e Add bms_prev_member function
This works very much like the existing bms_last_member function, only it
traverses through the Bitmapset in the opposite direction from the most
significant bit down to the least significant bit.  A special prevbit value of
-1 may be used to have the function determine the most significant bit.  This
is useful for starting a loop.  When there are no members less than prevbit,
the function returns -2 to indicate there are no more members.

Author: David Rowley
Discussion: https://postgr.es/m/CAKJS1f-K=3d5MDASNYFJpUpc20xcBnAwNC1-AOeunhn0OtkWbQ@mail.gmail.com
2018-04-07 17:54:39 -03:00
Andres Freund
f16241bef7 Raise error when affecting tuple moved into different partition.
When an update moves a row between partitions (supported since
2f17844104), our normal logic for following update chains in READ
COMMITTED mode doesn't work anymore. Cross partition updates are
modeled as an delete from the old and insert into the new
partition. No ctid chain exists across partitions, and there's no
convenient space to introduce that link.

Not throwing an error in a partitioned context when one would have
been thrown without partitioning is obviously problematic. This commit
introduces infrastructure to detect when a tuple has been moved, not
just plainly deleted. That allows to throw an error when encountering
a deletion that's actually a move, while attempting to following a
ctid chain.

The row deleted as part of a cross partition update is marked by
pointing it's t_ctid to an invalid block, instead of self as a normal
update would.  That was deemed to be the least invasive and most
future proof way to represent the knowledge, given how few infomask
bits are there to be recycled (there's also some locking issues with
using infomask bits).

External code following ctid chains should be updated to check for
moved tuples. The most likely consequence of not doing so is a missed
error.

Author: Amul Sul, editorialized by me
Reviewed-By: Amit Kapila, Pavan Deolasee, Andres Freund, Robert Haas
Discussion: http://postgr.es/m/CAAJ_b95PkwojoYfz0bzXU8OokcTVGzN6vYGCNVUukeUDrnF3dw@mail.gmail.com
2018-04-07 13:24:27 -07:00
Teodor Sigaev
8224de4f42 Indexes with INCLUDE columns and their support in B-tree
This patch introduces INCLUDE clause to index definition.  This clause
specifies a list of columns which will be included as a non-key part in
the index.  The INCLUDE columns exist solely to allow more queries to
benefit from index-only scans.  Also, such columns don't need to have
appropriate operator classes.  Expressions are not supported as INCLUDE
columns since they cannot be used in index-only scans.

Index access methods supporting INCLUDE are indicated by amcaninclude flag
in IndexAmRoutine.  For now, only B-tree indexes support INCLUDE clause.

In B-tree indexes INCLUDE columns are truncated from pivot index tuples
(tuples located in non-leaf pages and high keys).  Therefore, B-tree indexes
now might have variable number of attributes.  This patch also provides
generic facility to support that: pivot tuples contain number of their
attributes in t_tid.ip_posid.  Free 13th bit of t_info is used for indicating
that.  This facility will simplify further support of index suffix truncation.
The changes of above are backward-compatible, pg_upgrade doesn't need special
handling of B-tree indexes for that.

Bump catalog version

Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me
Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes,
			 David Rowley, Alexander Korotkov
Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru
2018-04-07 23:00:39 +03:00
Teodor Sigaev
1c1791e000 Add json(b)_to_tsvector function
Jsonb has a complex nature so there isn't best-for-everything way to convert it
to tsvector for full text search. Current to_tsvector(json(b)) suggests to
convert only string values, but it's possible to index keys, numerics and even
booleans value. To solve that json(b)_to_tsvector has a second required
argument contained a list of desired types of json fields. Second argument is
a jsonb scalar or array right now with possibility to add new options in a
future.

Bump catalog version

Author: Dmitry Dolgov with some editorization by me
Reviewed by: Teodor Sigaev
Discussion: https://www.postgresql.org/message-id/CA+q6zcXJQbS1b4kJ_HeAOoOc=unfnOrUEL=KGgE32QKDww7d8g@mail.gmail.com
2018-04-07 20:58:03 +03:00
Peter Eisentraut
039eb6e92f Logical replication support for TRUNCATE
Update the built-in logical replication system to make use of the
previously added logical decoding for TRUNCATE support.  Add the
required truncate callback to pgoutput and a new logical replication
protocol message.

Publications get a new attribute to determine whether to replicate
truncate actions.  When updating a publication via pg_dump from an older
version, this is not set, thus preserving the previous behavior.

Author: Simon Riggs <simon@2ndquadrant.com>
Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it>
Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
2018-04-07 11:34:11 -04:00
Peter Eisentraut
5dfd1e5a66 Logical decoding of TRUNCATE
Add a new WAL record type for TRUNCATE, which is only used when
wal_level >= logical.  (For physical replication, TRUNCATE is already
replicated via SMGR records.)  Add new callback for logical decoding
output plugins to receive TRUNCATE actions.

Author: Simon Riggs <simon@2ndquadrant.com>
Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it>
Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
2018-04-07 11:34:10 -04:00
Andres Freund
8c3debbbf6 Fix and improve pg_atomic_flag fallback implementation.
The atomics fallback implementation for pg_atomic_flag was broken,
returning the inverted value from pg_atomic_test_set_flag().  This was
unnoticed because a) atomic flags were unused until recently b) the
test code wasn't run when the fallback implementation was in
use (because it didn't allow to test for some edge cases).

Fix the bug, and improve the fallback so it has the same behaviour as
the non-fallback implementation in the problematic edge cases. That
breaks ABI compatibility in the back branches when fallbacks are in
use, but given they were broken until now...

Author: Andres Freund
Reported-by: Daniel Gustafsson
Discussion:
    https://postgr.es/m/FB948276-7B32-4B77-83E6-D00167F8EEB4@yesql.se
    https://postgr.es/m/20180406233854.uni2h3mbnveczl32@alap3.anarazel.de
Backpatch: 9.5-, where the atomics abstraction was introduced.
2018-04-06 19:55:32 -07:00
Robert Haas
3d956d9562 Allow insert and update tuple routing and COPY for foreign tables.
Also enable this for postgres_fdw.

Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
patch series of which this is a part has been reviewed by Amit
Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
and me.  Minor documentation changes to the final version by me.

Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@lab.ntt.co.jp
2018-04-06 19:22:03 -04:00
Tom Lane
cb1ff1e5af Remove some unnecessary quote marks from catalog DATA lines.
This has no functional impact whatsoever.  However, it causes
these unnecessary quote marks to disappear from the generated
postgres.bki file, making it easier to verify that the upcoming
bootstrap data conversion patch doesn't change the generated file.
2018-04-06 18:58:38 -04:00
Alvaro Herrera
9fdb675fc5 Faster partition pruning
Add a new module backend/partitioning/partprune.c, implementing a more
sophisticated algorithm for partition pruning.  The new module uses each
partition's "boundinfo" for pruning instead of constraint exclusion,
based on an idea proposed by Robert Haas of a "pruning program": a list
of steps generated from the query quals which are run iteratively to
obtain a list of partitions that must be scanned in order to satisfy
those quals.

At present, this targets planner-time partition pruning, but there exist
further patches to apply partition pruning at execution time as well.

This commit also moves some definitions from include/catalog/partition.h
to a new file include/partitioning/partbounds.h, in an attempt to
rationalize partitioning related code.

Authors: Amit Langote, David Rowley, Dilip Kumar
Reviewers: Robert Haas, Kyotaro Horiguchi, Ashutosh Bapat, Jesper Pedersen.
Discussion: https://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp
2018-04-06 16:44:05 -03:00
Stephen Frost
11523e860f Support new default roles with adminpack
This provides a newer version of adminpack which works with the newly
added default roles to support GRANT'ing to non-superusers access to
read and write files, along with related functions (unlinking files,
getting file length, renaming/removing files, scanning the log file
directory) which are supported through adminpack.

Note that new versions of the functions are required because an
environment might have an updated version of the library but still have
the old adminpack 1.0 catalog definitions (where EXECUTE is GRANT'd to
PUBLIC for the functions).

This patch also removes the long-deprecated alternative names for
functions that adminpack used to include and which are now included in
the backend, in adminpack v1.1.  Applications using the deprecated names
should be updated to use the backend functions instead.  Existing
installations which continue to use adminpack v1.0 should continue to
function until/unless adminpack is upgraded.

Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net
2018-04-06 14:47:10 -04:00
Stephen Frost
0fdc8495bf Add default roles for file/program access
This patch adds new default roles named 'pg_read_server_files',
'pg_write_server_files', 'pg_execute_server_program' which
allow an administrator to GRANT to a non-superuser role the ability to
access server-side files or run programs through PostgreSQL (as the user
the database is running as).  Having one of these roles allows a
non-superuser to use server-side COPY to read, write, or with a program,
and to use file_fdw (if installed by a superuser and GRANT'd USAGE on
it) to read from files or run a program.

The existing misc file functions are also changed to allow a user with
the 'pg_read_server_files' default role to read any files on the
filesystem, matching the privileges given to that role through COPY and
file_fdw from above.

Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net
2018-04-06 14:47:10 -04:00
Peter Eisentraut
bbca77623f Rename MemoryContextCopySetIdentifier() for clarity
MemoryContextCopySetIdentifier -> MemoryContextCopyAndSetIdentifier

Discussion: https://www.postgresql.org/message-id/6421.1522194949@sss.pgh.pa.us
2018-04-06 12:37:54 -04:00
Peter Eisentraut
bcf79b5bb6 Split the SetSubscriptionRelState function into two
We don't actually need the insert-or-update logic, so it's clearer to
have separate functions for the inserting and updating.

Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
2018-04-06 10:00:26 -04:00
Simon Riggs
f1464c5380 Improve parse representation for MERGE
Separation of parser data structures from executor, as
requested by Tom Lane. Further improvements possible.

While there, implement error for multiple VALUES clauses via parser
to allow line number of error, as requested by Andres Freund.

Author: Pavan Deolasee

Discussion: https://www.postgresql.org/message-id/CABOikdPpqjectFchg0FyTOpsGXyPoqwgC==OLKWuxgBOsrDDZw@mail.gmail.com
2018-04-06 09:38:59 +01:00
Magnus Hagander
1fde38beaa Allow on-line enabling and disabling of data checksums
This makes it possible to turn checksums on in a live cluster, without
the previous need for dump/reload or logical replication (and to turn it
off).

Enabling checkusm starts a background process in the form of a
launcher/worker combination that goes through the entire database and
recalculates checksums on each and every page. Only when all pages have
been checksummed are they fully enabled in the cluster. Any failure of
the process will revert to checksums off and the process has to be
started.

This adds a new WAL record that indicates the state of checksums, so
the process works across replicated clusters.

Authors: Magnus Hagander and Daniel Gustafsson
Review: Tomas Vondra, Michael Banck, Heikki Linnakangas, Andrey Borodin
2018-04-05 22:04:48 +02:00
Simon Riggs
530e69e59b Allow cpluspluscheck to pass by renaming variable
Use of a C++ keyword as a function name caused problems

Reported-by: Álvaro Herrera
2018-04-05 20:06:02 +01:00
Magnus Hagander
eed1ce72e1 Allow background workers to bypass datallowconn
THis adds a "flags" field to the BackgroundWorkerInitializeConnection()
and BackgroundWorkerInitializeConnectionByOid(). For now only one flag,
BGWORKER_BYPASS_ALLOWCONN, is defined, which allows the worker to ignore
datallowconn.
2018-04-05 19:02:45 +02:00
Teodor Sigaev
1664ae1978 Add websearch_to_tsquery
Error-tolerant conversion function with web-like syntax for search query,
it simplifies  constraining search engine with close to habitual interface for
users.

Bump catalog version

Authors: Victor Drobny, Dmitry Ivanov with editorization by me
Reviewed by: Aleksander Alekseev, Tomas Vondra, Thomas Munro, Aleksandr Parfenov
Discussion: https://www.postgresql.org/message-id/flat/fe931111ff7e9ad79196486ada79e268@postgrespro.ru
2018-04-05 19:55:11 +03:00
Alvaro Herrera
fbc27330b8 Add missing include
Newly added prototype broke cpluspluscheck.

Minor buglet in commit 8694cc96b5.
2018-04-05 12:20:17 -03:00
Simon Riggs
4b2d44031f MERGE post-commit review
Review comments from Andres Freund

* Consolidate code into AfterTriggerGetTransitionTable()
* Rename nodeMerge.c to execMerge.c
* Rename nodeMerge.h to execMerge.h
* Move MERGE handling in ExecInitModifyTable()
  into a execMerge.c ExecInitMerge()
* Move mt_merge_subcommands flags into execMerge.h
* Rename opt_and_condition to opt_merge_when_and_condition
* Wordsmith various comments

Author: Pavan Deolasee
Reviewer: Simon Riggs
2018-04-05 09:54:07 +01:00
Tom Lane
1383e2a1a9 Improve FSM management for BRIN indexes.
BRIN indexes like to propagate additions of free space into the upper pages
of their free space maps as soon as the new space is known, even when it's
just on one individual index page.  Previously this required calling
FreeSpaceMapVacuum, which is quite an expensive thing if the map is large.
Use the FreeSpaceMapVacuumRange function recently added by commit c79f6df75
to reduce the amount of work done for this purpose.

Fix a couple of places that neglected to do the upper-page vacuuming at all
after recording new free space.  If the policy is to be that BRIN should do
that, it should do it everywhere.

Do RecordPageWithFreeSpace unconditionally in brin_page_cleanup, and do
FreeSpaceMapVacuum unconditionally in brin_vacuum_scan.  Because of the
FSM's imprecise storage of free space, the old complications here seldom
bought anything, they just slowed things down.  This approach also
provides a predictable path for FSM corruption to be repaired.

Remove premature RecordPageWithFreeSpace call in brin_getinsertbuffer
where it's about to return an extended page to the caller.  The caller
should do that, instead, after it's inserted its new tuple.  Fix the
one caller that forgot to do so.

Simplify logic in brin_doupdate's same-page-update case by postponing
brin_initialize_empty_new_buffer to after the critical section; I see
little point in doing it before.

Avoid repeat calls of RelationGetNumberOfBlocks in brin_vacuum_scan.
Avoid duplicate BufferGetBlockNumber and BufferGetPage calls in
a couple of places where we already had the right values.

Move a BRIN_elog debug logging call out of a critical section; that's
pretty unsafe and I don't think it buys us anything to not wait till
after the critical section.

Move the "*extended = false" step in brin_getinsertbuffer into the
routine's main loop.  There's no actual bug there, since the loop can't
iterate with *extended still true, but it doesn't seem very future-proof
as coded; and it's certainly not documented as a loop invariant.

This is all from follow-on investigation inspired by commit c79f6df75.

Discussion: https://postgr.es/m/5801.1522429460@sss.pgh.pa.us
2018-04-04 14:26:04 -04:00
Alvaro Herrera
3de241dba8 Foreign keys on partitioned tables
Author: Álvaro Herrera
Discussion: https://postgr.es/m/20171231194359.cvojcour423ulha4@alvherre.pgsql
Reviewed-by: Peter Eisentraut
2018-04-04 14:02:49 -03:00
Teodor Sigaev
857f9c36cd Skip full index scan during cleanup of B-tree indexes when possible
Vacuum of index consists from two stages: multiple (zero of more) ambulkdelete
calls and one amvacuumcleanup call. When workload on particular table
is append-only, then autovacuum isn't intended to touch this table. However,
user may run vacuum manually in order to fill visibility map and get benefits
of index-only scans. Then ambulkdelete wouldn't be called for indexes
of such table (because no heap tuples were deleted), only amvacuumcleanup would
be called In this case, amvacuumcleanup would perform full index scan for
two objectives: put recyclable pages into free space map and update index
statistics.

This patch allows btvacuumclanup to skip full index scan when two conditions
are satisfied: no pages are going to be put into free space map and index
statistics isn't stalled. In order to check first condition, we store
oldest btpo_xact in the meta-page. When it's precedes RecentGlobalXmin, then
there are some recyclable pages. In order to check second condition we store
number of heap tuples observed during previous full index scan by cleanup.
If fraction of newly inserted tuples is less than
vacuum_cleanup_index_scale_factor, then statistics isn't considered to be
stalled. vacuum_cleanup_index_scale_factor can be defined as both reloption and GUC (default).

This patch bumps B-tree meta-page version. Upgrade of meta-page is performed
"on the fly": during VACUUM meta-page is rewritten with new version. No special
handling in pg_upgrade is required.

Author: Masahiko Sawada, Alexander Korotkov
Review by: Peter Geoghegan, Kyotaro Horiguchi, Alexander Korotkov, Yura Sokolov
Discussion: https://www.postgresql.org/message-id/flat/CAD21AoAX+d2oD_nrd9O2YkpzHaFr=uQeGr9s1rKC3O4ENc568g@mail.gmail.com
2018-04-04 19:29:00 +03:00
Heikki Linnakangas
f044d71e33 Use ARMv8 CRC instructions where available.
ARMv8 introduced special CPU instructions for calculating CRC-32C. Use
them, when available, for speed.

Like with the similar Intel CRC instructions, several factors affect
whether the instructions can be used. The compiler intrinsics for them must
be supported by the compiler, and the instructions must be supported by the
target architecture. If the compilation target architecture does not
support the instructions, but adding "-march=armv8-a+crc" makes them
available, then we compile the code with a runtime check to determine if
the host we're running on supports them or not.

For the runtime check, use glibc getauxval() function. Unfortunately,
that's not very portable, but I couldn't find any more portable way to do
it. If getauxval() is not available, the CRC instructions will still be
used if the target architecture supports them without any additional
compiler flags, but the runtime check will not be available.

Original patch by Yuqi Gu, heavily modified by me. Reviewed by Andres
Freund, Thomas Munro.

Discussion: https://www.postgresql.org/message-id/HE1PR0801MB1323D171938EABC04FFE7FA9E3110%40HE1PR0801MB1323.eurprd08.prod.outlook.com
2018-04-04 12:22:45 +03:00
Heikki Linnakangas
638a199fa9 Also fix the descriptions in pg_config.h.win32.
I missed pg_config.h.win32 in the previous commit that fixed these in
pg_config.h.in.
2018-04-04 11:33:39 +03:00
Heikki Linnakangas
8989f52b1b Fix incorrect description of USE_SLICING_BY_8_CRC32C.
And a typo in the description of USE_SSE42_CRC32C_WITH_RUNTIME_CHECK,
spotted by Daniel Gustafsson.
2018-04-04 11:20:53 +03:00
Teodor Sigaev
710d90da1f Add prefix operator for TEXT type.
The prefix operator along with SP-GiST indexes can be used as an alternative
for LIKE 'word%' commands  and it doesn't have a limitation of string/prefix
length as B-Tree has.

Bump catalog version

Author: Ildus Kurbangaliev with some editorization by me
Review by: Arthur Zakirov, Alexander Korotkov, and me
Discussion: https://www.postgresql.org/message-id/flat/20180202180327.222b04b3@wp.localdomain
2018-04-03 19:46:45 +03:00
Simon Riggs
aa3faa3c7a WITH support in MERGE
Author: Peter Geoghegan
Recursive support removed, no tests
Docs added by me
2018-04-03 12:13:59 +01:00
Simon Riggs
83454e3c2b New files for MERGE 2018-04-03 10:22:21 +01:00
Simon Riggs
d204ef6377 MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows
a task that would other require multiple PL statements.
e.g.

MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
  UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
  INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
  DO NOTHING;

MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.

MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.

MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.

Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.

This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.

Various issues reported via sqlsmith by Andreas Seltenreich

Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs

Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 09:28:16 +01:00
Simon Riggs
aa5877bb26 Revert "MERGE SQL Command following SQL:2016"
This reverts commit e6597dc353.
2018-04-02 21:36:38 +01:00
Simon Riggs
7cf8a5c302 Revert "Modified files for MERGE"
This reverts commit 354f13855e.
2018-04-02 21:34:15 +01:00
Simon Riggs
354f13855e Modified files for MERGE 2018-04-02 21:12:47 +01:00
Simon Riggs
e6597dc353 MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows
a task that would other require multiple PL statements.
e.g.

MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
  UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
  INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
  DO NOTHING;

MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.

MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.

MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.

Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.

This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.

Various issues reported via sqlsmith by Andreas Seltenreich

Authors: Pavan Deolasee, Simon Riggs
Reviewers: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs

Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-02 21:04:35 +01:00
Peter Eisentraut
2764d5dcfa Make be-secure-common.c more consistent for future SSL implementations
Recent commit 8a3d9425 has introduced be-secure-common.c, which is aimed
at including backend-side APIs that can be used by any SSL
implementation.  The purpose is similar to fe-secure-common.c for the
frontend-side APIs.

However, this has forgotten to include check_ssl_key_file_permissions()
in the move, which causes a double dependency between be-secure.c and
be-secure-openssl.c.

Refactor the code in a more logical way.  This also puts into light an
API which is usable by future SSL implementations for permissions on SSL
key files.

Author: Michael Paquier <michael@paquier.xyz>
2018-04-02 11:37:40 -04:00
Robert Haas
7e0d64c7a5 postgres_fdw: Push down partition-wise aggregation.
Since commit 7012b132d0, postgres_fdw
has been able to push down the toplevel aggregation operation to the
remote server.  Commit e2f1eb0ee3 made
it possible to break down the toplevel aggregation into one
aggregate per partition.  This commit lets postgres_fdw push down
aggregation in that case just as it does at the top level.

In order to make this work, this commit adds an additional argument
to the GetForeignUpperPaths FDW API.  A matching argument is added
to the signature for create_upper_paths_hook.  Third-party code using
either of these will need to be updated.

Also adjust create_foreignscan_plan() so that it picks up the correct
set of relids in this case.

Jeevan Chalke, reviewed by Ashutosh Bapat and by me and with some
adjustments by me.  The larger patch series of which this patch is a
part was also reviewed and tested by Antonin Houska, Rajkumar
Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal
Legrand, and Rafia Sabih.

Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com
Discussion: http://postgr.es/m/CAM2+6=XPWujjmj5zUaBTGDoB38CemwcPmjkRy0qOcsQj_V+2sQ@mail.gmail.com
2018-04-02 10:51:50 -04:00
Tom Lane
0b11a674fb Fix a boatload of typos in C comments.
Justin Pryzby

Discussion: https://postgr.es/m/20180331105640.GK28454@telsasoft.com
2018-04-01 15:01:28 -04:00
Andres Freund
51bc271790 Add Bloom filter implementation.
A Bloom filter is a space-efficient, probabilistic data structure that
can be used to test set membership.  Callers will sometimes incur false
positives, but never false negatives.  The rate of false positives is a
function of the total number of elements and the amount of memory
available for the Bloom filter.

Two classic applications of Bloom filters are cache filtering, and data
synchronization testing.  Any user of Bloom filters must accept the
possibility of false positives as a cost worth paying for the benefit in
space efficiency.

This commit adds a test harness extension module, test_bloomfilter.  It
can be used to get a sense of how the Bloom filter implementation
performs under varying conditions.

This is infrastructure for the upcoming "heapallindexed" amcheck patch,
which verifies the consistency of a heap relation against one of its
indexes.

Author: Peter Geoghegan
Reviewed-By: Andrey Borodin, Michael Paquier, Thomas Munro, Andres Freund
Discussion: https://postgr.es/m/CAH2-Wzm5VmG7cu1N-H=nnS57wZThoSDQU+F5dewx3o84M+jY=g@mail.gmail.com
2018-03-31 17:49:41 -07:00
Tom Lane
b0c90c85fc Portability fix for commit 9a895462d.
So far as I can find, NI_MAXHOST isn't actually required anywhere by
POSIX.  Nonetheless, commit 9a895462d supposed that it could rely on
having that symbol without any ceremony at all.  We do have a hack
for providing it if the platform doesn't, in getaddrinfo.h, so fix
the problem by #including that file.  Per buildfarm.
2018-03-30 20:52:13 -04:00
Andres Freund
3e256e5506 Add SKIP_LOCKED option to RangeVarGetRelidExtended().
This will be used for VACUUM (SKIP LOCKED).

Author: Nathan Bossart
Reviewed-By: Michael Paquier and Andres Freund
Discussion: https://postgr.es/m/20180306005349.b65whmvj7z6hbe2y@alap3.anarazel.de
2018-03-30 17:05:16 -07:00
Andres Freund
d87510a524 Combine options for RangeVarGetRelidExtended() into a flags argument.
A followup patch will add a SKIP_LOCKED option. To avoid introducing
evermore arguments, breaking existing callers each time, introduce a
flags argument. This'll no doubt break a few external users...

Also change the MISSING_OK behaviour so a DEBUG1 debug message is
emitted when a relation is not found.

Author: Nathan Bossart
Reviewed-By: Michael Paquier and Andres Freund
Discussion: https://postgr.es/m/20180306005349.b65whmvj7z6hbe2y@alap3.anarazel.de
2018-03-30 17:05:16 -07:00
Fujii Masao
9a895462d9 Enhance pg_stat_wal_receiver view to display host and port of sender server.
Previously there was no way in the standby side to find out the host and port
of the sender server that the walreceiver was currently connected to when
multiple hosts and ports were specified in primary_conninfo. For that purpose,
this patch adds sender_host and sender_port columns into pg_stat_wal_receiver
view. They report the host and port that the active replication connection
currently uses.

Bump catalog version.

Author: Haribabu Kommi
Reviewed-by: Michael Paquier and me

Discussion: https://postgr.es/m/CAJrrPGcV_aq8=cdqkFhVDJKEnDQ70yRTTdY9RODzMnXNrCz2Ow@mail.gmail.com
2018-03-31 07:51:22 +09:00
Tom Lane
11002f8afa Fix bogus provolatile/proparallel markings on a few built-in functions.
Richard Yen reported that pg_upgrade failed if the target cluster had
force_parallel_mode = on, because binary_upgrade_create_empty_extension()
is marked parallel restricted, allowing it to be executed in parallel
mode, which complains because it tries to acquire an XID.

In general, no function that might try to modify database data should
be considered parallel safe or restricted, since execution of it might
force XID acquisition.  We found several other examples of this mistake.

Furthermore, functions that execute user-supplied SQL queries or query
fragments, or pull data from user-supplied cursors, had better be marked
both volatile and parallel unsafe, because we don't know what the supplied
query or cursor might try to do.  There were several tsquery and XML
functions that had the wrong proparallel marking for this, and some of
them were even mislabeled as to volatility.

All these bugs are old, dating back to 9.6 for the proparallel mistakes
and much further for the provolatile mistakes.  We can't force a
catversion bump in the back branches, but we can at least ensure that
installations initdb'd in future have the right values.

Thomas Munro and Tom Lane

Discussion: https://postgr.es/m/CAEepm=2sNDScSLTfyMYu32Q=ob98ZGW-vM_2oLxinzSABGQ6VA@mail.gmail.com
2018-03-30 18:14:51 -04:00
Teodor Sigaev
43d1ed60fd Predicate locking in GIN index
Predicate locks are used on per page basis only if fastupdate = off, in
opposite case predicate lock on pending list will effectively lock whole index,
to reduce locking overhead, just lock a relation. Entry and posting trees are
essentially B-tree, so locks are acquired on leaf pages only.

Author: Shubham Barai with some editorization by me and Dmitry Ivanov
Review by: Alexander Korotkov, Dmitry Ivanov, Fedor Sigaev
Discussion: https://www.postgresql.org/message-id/flat/CALxAEPt5sWW+EwTaKUGFL5_XFcZ0MuGBcyJ70oqbWqr42YKR8Q@mail.gmail.com
2018-03-30 14:23:17 +03:00
Robert Haas
c1de1a3a8b Remove 'target' from GroupPathExtraData.
It's not needed.

Jeevan Chalke

Discussion: http://postgr.es/m/CAM2+6=XPWujjmj5zUaBTGDoB38CemwcPmjkRy0qOcsQj_V+2sQ@mail.gmail.com
2018-03-29 16:17:18 -04:00
Tom Lane
a063baaced Remove UpdateFreeSpaceMap(), use FreeSpaceMapVacuumRange() instead.
FreeSpaceMapVacuumRange has the same effect, is more efficient if many
pages are involved, and makes fewer assumptions about how it's used.
Notably, Claudio Freire pointed out that UpdateFreeSpaceMap could fail
if the specified freespace value isn't the maximum possible.  This isn't
a problem for the single existing user, but the function represents an
attractive nuisance IMO, because it's named as though it were a
general-purpose update function and its limitations are undocumented.
In any case we don't need multiple ways to get the same result.

In passing, do some code review and cleanup in RelationAddExtraBlocks.
In particular, I see no excuse for it to omit the PageIsNew safety check
that's done in the mainline extension path in RelationGetBufferForTuple.

Discussion: https://postgr.es/m/CAGTBQpYR0uJCNTt3M5GOzBRHo+-GccNO1nCaQ8yEJmZKSW5q1A@mail.gmail.com
2018-03-29 12:22:44 -04:00
Tom Lane
851a26e266 While vacuuming a large table, update upper-level FSM data every so often.
VACUUM updates leaf-level FSM entries immediately after cleaning the
corresponding heap blocks.  fsmpage.c updates the intra-page search trees
on the leaf-level FSM pages when this happens, but it does not touch the
upper-level FSM pages, so that the released space might not actually be
findable by searchers.  Previously, updating the upper-level pages happened
only at the conclusion of the VACUUM run, in a single FreeSpaceMapVacuum()
call.  This is bad because the VACUUM might get canceled before ever
reaching that point, so that from the point of view of searchers no space
has been freed at all, leading to table bloat.

We can improve matters by updating the upper pages immediately after each
cycle of index-cleaning and heap-cleaning, processing just the FSM pages
corresponding to the range of heap blocks we have now fully cleaned.
This adds a small amount of extra work, since the FSM pages leading down
to each range boundary will be touched twice, but it's pretty negligible
compared to everything else going on in a large VACUUM.

If there are no indexes, VACUUM doesn't work in cycles but just cleans
each heap page on first visit.  In that case we just arbitrarily update
upper FSM pages after each 8GB of heap.  That maintains the goal of not
letting all this work slide until the very end, and it doesn't seem worth
expending extra complexity on a case that so seldom occurs in practice.

In either case, the FSM is fully up to date before any attempt is made
to truncate the relation, so that the most likely scenario for VACUUM
cancellation no longer results in out-of-date upper FSM pages.  When
we do successfully truncate, adjusting the FSM to reflect that is now
fully handled within FreeSpaceMapTruncateRel.

Claudio Freire, reviewed by Masahiko Sawada and Jing Wang, some additional
tweaks by me

Discussion: https://postgr.es/m/CAGTBQpYR0uJCNTt3M5GOzBRHo+-GccNO1nCaQ8yEJmZKSW5q1A@mail.gmail.com
2018-03-29 11:29:54 -04:00
Teodor Sigaev
c0cbe00fee Add casts from jsonb
Add explicit cast from scalar jsonb to all numeric and bool types. It would be
better to have cast from scalar jsonb to text too but there is already a cast
from jsonb to text as just text representation of json. There is no way to have
two different casts for the same type's pair.

Bump catalog version

Author: Anastasia Lubennikova with editorization by Nikita Glukhov and me
Review by: Aleksander Alekseev, Nikita Glukhov, Darafei Praliaskouski
Discussion: https://www.postgresql.org/message-id/flat/0154d35a-24ae-f063-5273-9ffcdf1c7f2e@postgrespro.ru
2018-03-29 16:33:56 +03:00
Peter Eisentraut
056a5a3f63 Allow committing inside cursor loop
Previously, committing or aborting inside a cursor loop was prohibited
because that would close and remove the cursor.  To allow that,
automatically convert such cursors to holdable cursors so they survive
commits or rollbacks.  Portals now have a new state "auto-held", which
means they have been converted automatically from pinned.  An auto-held
portal is kept on transaction commit or rollback, but is still removed
when returning to the main loop on error.

This supports all languages that have cursor loop constructs: PL/pgSQL,
PL/Python, PL/Perl.

Reviewed-by: Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru>
2018-03-28 19:03:26 -04:00
Andres Freund
1f0c6a9e7d Add EXPLAIN support for JIT.
This just shows a few details about JITing, e.g. how many functions
have been JITed, and how long that took.  To avoid noise in regression
tests with functions sometimes being JITed in --with-llvm builds,
disable display when COSTS OFF is specified.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-28 13:26:51 -07:00
Andres Freund
9370462e9a Add inlining support to LLVM JIT provider.
This provides infrastructure to allow JITed code to inline code
implemented in C. This e.g. can be postgres internal functions or
extension code.

This already speeds up long running queries, by allowing the LLVM
optimizer to optimize across function boundaries. The optimization
potential currently doesn't reach its full potential because LLVM
cannot optimize the FunctionCallInfoData argument fully away, because
it's allocated on the heap rather than the stack. Fixing that is
beyond what's realistic for v11.

To be able to do that, use CLANG to convert C code to LLVM bitcode,
and have LLVM build a summary for it. That bitcode can then be used to
to inline functions at runtime. For that the bitcode needs to be
installed. Postgres bitcode goes into $pkglibdir/bitcode/postgres,
extensions go into equivalent directories.  PGXS has been modified so
that happens automatically if postgres has been compiled with LLVM
support.

Currently this isn't the fastest inline implementation, modules are
reloaded from disk during inlining. That's to work around an apparent
LLVM bug, triggering an apparently spurious error in LLVM assertion
enabled builds.  Once that is resolved we can remove the superfluous
read from disk.

Docs will follow in a later commit containing docs for the whole JIT
feature.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-28 13:19:08 -07:00
Andres Freund
8a934d6778 Use isinf builtin for clang, for performance.
When compiling with clang glibc's definition of isinf() ends up
leading to and external libc function call. That's because there was a
bug in the builtin in an old gcc version, and clang claims
compatibility with an older version.  That causes clang to be
measurably slower for floating point heavy workloads than gcc.

To fix simply redirect isinf when using clang and clang confirms it
has __builtin_isinf().
2018-03-28 13:12:15 -07:00
Peter Eisentraut
d92bc83c48 PL/pgSQL: Nested CALL with transactions
So far, a nested CALL or DO in PL/pgSQL would not establish a context
where transaction control statements were allowed.  This fixes that by
handling CALL and DO specially in PL/pgSQL, passing the atomic/nonatomic
execution context through and doing the required management around
transaction boundaries.

Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>
2018-03-28 13:31:27 -04:00
Simon Riggs
1eb6d6527a Store 2PC GID in commit/abort WAL recs for logical decoding
Store GID of 2PC in commit/abort WAL records when wal_level = logical.
This allows logical decoding to send the SAME gid to subscribers
across restarts of logical replication.

Track relica origin replay progress for 2PC.

(Edited from patch 0003 in the logical decoding 2PC series.)

Authors: Nikhil Sontakke, Stas Kelvich
Reviewed-by: Simon Riggs, Andres Freund
2018-03-28 17:42:50 +01:00
Andres Freund
f4f5845b31 Quick adaption of JIT tuple deforming to the fast default patch.
Instead using memset to set tts_isnull, call the new
slot_getmissingattrs().

Also fix a bug (= instead of >=) in the code generation. Normally = is
correct, but when repeatedly deforming fields not in a
tuple (e.g. deform up to natts + 1 and then natts + 2) >= is needed.

Discussion: https://postgr.es/m/20180328010053.i2qvsuuusst4lgmc@alap3.anarazel.de
2018-03-27 21:03:10 -07:00
Andres Freund
b4013b8e4a Add catversion bump missed in 16828d5c0.
Given that pg_attribute changed its layout...
2018-03-27 19:07:39 -07:00
Andrew Dunstan
16828d5c02 Fast ALTER TABLE ADD COLUMN with a non-NULL default
Currently adding a column to a table with a non-NULL default results in
a rewrite of the table. For large tables this can be both expensive and
disruptive. This patch removes the need for the rewrite as long as the
default value is not volatile. The default expression is evaluated at
the time of the ALTER TABLE and the result stored in a new column
(attmissingval) in pg_attribute, and a new column (atthasmissing) is set
to true. Any existing row when fetched will be supplied with the
attmissingval. New rows will have the supplied value or the default and
so will never need the attmissingval.

Any time the table is rewritten all the atthasmissing and attmissingval
settings for the attributes are cleared, as they are no longer needed.

The most visible code change from this is in heap_attisnull, which
acquires a third TupleDesc argument, allowing it to detect a missing
value if there is one. In many cases where it is known that there will
not be any (e.g.  catalog relations) NULL can be passed for this
argument.

Andrew Dunstan, heavily modified from an original patch from Serge
Rielau.
Reviewed by Tom Lane, Andres Freund, Tomas Vondra and David Rowley.

Discussion: https://postgr.es/m/31e2e921-7002-4c27-59f5-51f08404c858@2ndQuadrant.com
2018-03-28 10:43:52 +10:30
Tom Lane
442accc3fe Allow memory contexts to have both fixed and variable ident strings.
Originally, we treated memory context names as potentially variable in
all cases, and therefore always copied them into the context header.
Commit 9fa6f00b1 rethought this a little bit and invented a distinction
between fixed and variable names, skipping the copy step for the former.
But we can make things both simpler and more useful by instead allowing
there to be two parts to a context's identification, a fixed "name" and
an optional, variable "ident".  The name supplied in the context create
call is now required to be a compile-time-constant string in all cases,
as it is never copied but just pointed to.  The "ident" string, if
wanted, is supplied later.  This is needed because typically we want
the ident to be stored inside the context so that it's cleaned up
automatically on context deletion; that means it has to be copied into
the context before we can set the pointer.

The cost of this approach is basically just an additional pointer field
in struct MemoryContextData, which isn't much overhead, and is bought
back entirely in the AllocSet case by not needing a headerSize field
anymore, since we no longer have to cope with variable header length.
In addition, we can simplify the internal interfaces for memory context
creation still further, saving a few cycles there.  And it's no longer
true that a custom identifier disqualifies a context from participating
in aset.c's freelist scheme, so possibly there's some win on that end.

All the places that were using non-compile-time-constant context names
are adjusted to put the variable info into the "ident" instead.  This
allows more effective identification of those contexts in many cases;
for example, subsidary contexts of relcache entries are now identified
by both type (e.g. "index info") and relname, where before you got only
one or the other.  Contexts associated with PL function cache entries
are now identified more fully and uniformly, too.

I also arranged for plancache contexts to use the query source string
as their identifier.  This is basically free for CachedPlanSources, as
they contained a copy of that string already.  We pay an extra pstrdup
to do it for CachedPlans.  That could perhaps be avoided, but it would
make things more fragile (since the CachedPlanSource is sometimes
destroyed first).  I suspect future improvements in error reporting will
require CachedPlans to have a copy of that string anyway, so it's not
clear that it's worth moving mountains to avoid it now.

This also changes the APIs for context statistics routines so that the
context-specific routines no longer assume that output goes straight
to stderr, nor do they know all details of the output format.  This
is useful immediately to reduce code duplication, and it also allows
for external code to do something with stats output that's different
from printing to stderr.

The reason for pushing this now rather than waiting for v12 is that
it rethinks some of the API changes made by commit 9fa6f00b1.  Seems
better for extension authors to endure just one round of API changes
not two.

Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com
2018-03-27 16:46:51 -04:00
Simon Riggs
c203d6cf81 Allow HOT updates for some expression indexes
If the value of an index expression is unchanged after UPDATE,
allow HOT updates where previously we disallowed them, giving
a significant performance boost in those cases.

Particularly useful for indexes such as JSON->>field where the
JSON value changes but the indexed value does not.

Submitted as "surjective indexes" patch, now enabled by use
of new "recheck_on_update" parameter.

Author: Konstantin Knizhnik
Reviewer: Simon Riggs, with much wordsmithing and some cleanup
2018-03-27 19:57:02 +01:00
Teodor Sigaev
920a5e500a Skip temp tables from basebackup.
Do not store temp tables in basebackup, they will not be visible anyway, so,
there are not reasons to store them.

Author: David Steel
Reviewed by: me
Discussion: https://www.postgresql.org/message-id/flat/5ea4d26a-a453-c1b7-eff9-5a3ef8f8aceb@pgmasters.net
2018-03-27 16:14:40 +03:00
Andres Freund
32af96b2b1 JIT tuple deforming in LLVM JIT provider.
Performing JIT compilation for deforming gains performance benefits
over unJITed deforming from compile-time knowledge of the tuple
descriptor. Fixed column widths, NOT NULLness, etc can be taken
advantage of.

Right now the JITed deforming is only used when deforming tuples as
part of expression evaluation (and obviously only if the descriptor is
known). It's likely to be beneficial in other cases, too.

By default tuple deforming is JITed whenever an expression is JIT
compiled. There's a separate boolean GUC controlling it, but that's
expected to be primarily useful for development and benchmarking.

Docs will follow in a later commit containing docs for the whole JIT
feature.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-26 12:57:19 -07:00
Alvaro Herrera
555ee77a96 Handle INSERT .. ON CONFLICT with partitioned tables
Commit eb7ed3f306 enabled unique constraints on partitioned tables,
but one thing that was not working properly is INSERT/ON CONFLICT.
This commit introduces a new node keeps state related to the ON CONFLICT
clause per partition, and fills it when that partition is about to be
used for tuple routing.

Author: Amit Langote, Álvaro Herrera
Reviewed-by: Etsuro Fujita, Pavan Deolasee
Discussion: https://postgr.es/m/20180228004602.cwdyralmg5ejdqkq@alvherre.pgsql
2018-03-26 10:43:54 -03:00
Teodor Sigaev
8694cc96b5 Exclude unlogged tables from base backups
Exclude unlogged tables from base backup entirely except init fork which marks
created unlogged table. The next question is do not backup temp table but
it's a story for separate patch.

Author: David Steele
Review by: Adam Brightwell, Masahiko Sawada
Discussion: https://www.postgresql.org/message-id/flat/04791bab-cb04-ba43-e9c0-664a4c1ffb2c@pgmasters.net
2018-03-23 19:14:12 +03:00
Peter Eisentraut
7ba7986fb4 Fix interaction of Perl and stdbool.h
Revert the PL/Perl-specific change in
9a95a77d9d.  We must not prevent Perl from
using stdbool.h when it has been built to do so, even if it uses an
incompatible size.  Otherwise, we would be imposing our bool on Perl,
which will lead to crashes because of the size mismatch.

Instead, we undef bool after including the Perl headers, as we did
previously, but now only if we are not using stdbool.h ourselves.
Record that choice in c.h as USE_STDBOOL.  This will also make it easier
to apply that coding pattern elsewhere if necessary.
2018-03-23 10:31:10 -04:00
Alvaro Herrera
86f575948c Allow FOR EACH ROW triggers on partitioned tables
Previously, FOR EACH ROW triggers were not allowed in partitioned
tables.  Now we allow AFTER triggers on them, and on trigger creation we
cascade to create an identical trigger in each partition.  We also clone
the triggers to each partition that is created or attached later.

This means that deferred unique keys are allowed on partitioned tables,
too.

Author: Álvaro Herrera
Reviewed-by: Peter Eisentraut, Simon Riggs, Amit Langote, Robert Haas,
	Thomas Munro
Discussion: https://postgr.es/m/20171229225319.ajltgss2ojkfd3kp@alvherre.pgsql
2018-03-23 10:48:22 -03:00
Andres Freund
2111a48a0c Adapt expression JIT to stdbool.h introduction.
The LLVM JIT provider uses clang to synchronize types between normal C
code and runtime generated code. Clang represents stdbool.h style
booleans in return values & parameters differently from booleans
stored in variables.

Thus the expression compilation code from 2a0faed9d needs to be
adapted to 9a95a77d9. Instead of hardcoding i8 as the type for
booleans (which already was wrong on some edge case platforms!), use
postgres' notion of a boolean as used for storage and for parameters.

Per buildfarm animal xenodermus.

Author: Andres Freund
2018-03-22 22:15:51 -07:00
Peter Eisentraut
9a95a77d9d Use stdbool.h if suitable
Using the standard bool type provided by C allows some recent compilers
and debuggers to give better diagnostics.  Also, some extension code and
third-party headers are increasingly pulling in stdbool.h, so it's
probably saner if everyone uses the same definition.

But PostgreSQL code is not prepared to handle bool of a size other than
1, so we keep our own old definition if we encounter a stdbool.h with a
bool of a different size.  (Among current build farm members, this only
applies to old macOS versions on PowerPC.)

To check that the used bool is of the right size, add a static
assertions about size of GinTernaryValue vs bool.  This is currently the
only place that assumes that bool and char are of the same size.

Discussion: https://www.postgresql.org/message-id/flat/3a0fe7e1-5ed1-414b-9230-53bbc0ed1f49@2ndquadrant.com
2018-03-22 20:42:25 -04:00
Andres Freund
2a0faed9d7 Add expression compilation support to LLVM JIT provider.
In addition to the interpretation of expressions (which back
evaluation of WHERE clauses, target list projection, aggregates
transition values etc) support compiling expressions to native code,
using the infrastructure added in earlier commits.

To avoid duplicating a lot of code, only support emitting code for
cases that are likely to be performance critical. For expression steps
that aren't deemed that, use the existing interpreter.

The generated code isn't great - some architectural changes are
required to address that. But this already yields a significant
speedup for some analytics queries, particularly with WHERE clauses
filtering a lot, or computing multiple aggregates.

Author: Andres Freund
Tested-By: Thomas Munro
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de

Disable JITing for VALUES() nodes.

VALUES() nodes are only ever executed once. This is primarily helpful
for debugging, when forcing JITing even for cheap queries.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-22 14:45:59 -07:00
Andres Freund
7ced1d1247 Add FIELDNO_* macro designating offset into structs required for JIT.
For any interesting JIT target, fields inside structs need to be
accessed. b96d550e contains infrastructure for syncing the definition
of types between postgres C code and runtime code generation with
LLVM. But that doesn't sync the number or names of fields inside
structs, just the types (including padding etc).

One option would be to hardcode the offset numbers in the JIT code,
but that'd be hard to keep in sync. Instead add macros indicating the
field offset to the fields that need to be accessed. Not pretty, but
manageable.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-22 14:45:59 -07:00
Andres Freund
fb46ac26fe Expand list of synchronized types and functions in LLVM JIT provider.
Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-22 14:45:59 -07:00
Andres Freund
cc415a56d0 Basic planner and executor integration for JIT.
This adds simple cost based plan time decision about whether JIT
should be performed. jit_above_cost, jit_optimize_above_cost are
compared with the total cost of a plan, and if the cost is above them
JIT is performed / optimization is performed respectively.

For that PlannedStmt and EState have a jitFlags (es_jit_flags) field
that stores information about what JIT operations should be performed.

EState now also has a new es_jit field, which can store a
JitContext. When there are no errors the context is released in
standard_ExecutorEnd().

It is likely that the default values for jit_[optimize_]above_cost
will need to be adapted further, but in my test these values seem to
work reasonably.

Author: Andres Freund, with feedback by Peter Eisentraut
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-22 11:51:58 -07:00
Andres Freund
7ec0d80c05 Add helpers for emitting LLVM IR.
These basically just help to make code a bit more concise and pgindent
proof.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-22 11:51:58 -07:00