Commit Graph

33584 Commits

Author SHA1 Message Date
Michael Paquier 4e197bf195 Fix typo related to to_tsvector() in tests of json and jsonb
Author: Sho Kato
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/25C1C6B2E7BE044889E4FE8643A58BA963E1D03D@G01JPEXMBKW03
2019-03-15 16:20:11 +09:00
Thomas Munro bb16aba50c Enable parallel query with SERIALIZABLE isolation.
Previously, the SERIALIZABLE isolation level prevented parallel query
from being used.  Allow the two features to be used together by
sharing the leader's SERIALIZABLEXACT with parallel workers.

An extra per-SERIALIZABLEXACT LWLock is introduced to make it safe to
share, and new logic is introduced to coordinate the early release
of the SERIALIZABLEXACT required for the SXACT_FLAG_RO_SAFE
optimization, as follows:

The first backend to observe the SXACT_FLAG_RO_SAFE flag (set by
some other transaction) will 'partially release' the SERIALIZABLEXACT,
meaning that the conflicts and locks it holds are released, but the
SERIALIZABLEXACT itself will remain active because other backends
might still have a pointer to it.

Whenever any backend notices the SXACT_FLAG_RO_SAFE flag, it clears
its own MySerializableXact variable and frees local resources so that
it can skip SSI checks for the rest of the transaction.  In the
special case of the leader process, it transfers the SERIALIZABLEXACT
to a new variable SavedSerializableXact, so that it can be completely
released at the end of the transaction after all workers have exited.

Remove the serializable_okay flag added to CreateParallelContext() by
commit 9da0cc35, because it's now redundant.

Author: Thomas Munro
Reviewed-by: Haribabu Kommi, Robert Haas, Masahiko Sawada, Kevin Grittner
Discussion: https://postgr.es/m/CAEepm=0gXGYhtrVDWOTHS8SQQy_=S9xo+8oCxGLWZAOoeJ=yzQ@mail.gmail.com
2019-03-15 17:47:04 +13:00
Amit Kapila 13e8643bfc During pg_upgrade, conditionally skip transfer of FSMs.
If a heap on the old cluster has 4 pages or fewer, and the old cluster
was PG v11 or earlier, don't copy or link the FSM. This will shrink
space usage for installations with large numbers of small tables.

This will allow pg_upgrade to take advantage of commit b0eaa4c51b where
we have avoided creation of the free space map for small heap relations.

Author: John Naylor
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CACPNZCu4cOdm3uGnNEGXivy7Gz8UWyQjynDpdkPGabQ18_zK6g%40mail.gmail.com
2019-03-15 08:25:57 +05:30
Peter Eisentraut 2fadf24e24 Reorder identity regression test
The previous test order had the effect that if something was wrong
with the identity functionality, the create_table_like test would
likely fail or crash first, which is confusing.  Reorder so that the
identity test comes before create_table_like.
2019-03-15 00:21:30 +01:00
Tom Lane de57004799 Fix some oversights in commit 2455ab488.
The idea was to generate all the junk in a destroyable subcontext rather
than leaking it in the caller's context, but partition_bounds_create was
still being called in the caller's context, allowing plenty of scope for
leakage.  Also, get_rel_relkind() was still being called in the rel's
rd_pdcxt, creating a risk of session-lifespan memory wastage.

Simplify the logic a bit while at it.  Also, reduce rd_pdcxt to
ALLOCSET_SMALL_SIZES, since it seems likely to not usually be big.

Probably something like this needs to be back-patched into v11,
but for now let's get some buildfarm testing on this.

Discussion: https://postgr.es/m/15943.1552601288@sss.pgh.pa.us
2019-03-14 18:36:33 -04:00
Peter Eisentraut 61dc407893 Improve code comment 2019-03-14 22:44:21 +01:00
Peter Eisentraut 8bee36708f Remove unused #include 2019-03-14 22:03:14 +01:00
Peter Eisentraut b13a913607 Add BKI_DEFAULT to pg_class.relrewrite
This column is always 0 on disk, so it doesn't have to be tracked
separately for each entry.
2019-03-14 21:25:39 +01:00
Tom Lane 0a9d7e1f6d Ensure dummy paths have correct required_outer if rel is parameterized.
The assertions added by commits 34ea1ab7f et al found another problem:
set_dummy_rel_pathlist and mark_dummy_rel were failing to label
the dummy paths they create with the correct outer_relids, in case
the relation is necessarily parameterized due to having lateral
references in its tlist.  It's likely that this has no user-visible
consequences in production builds, at the moment; but still an assertion
failure is a bad thing, so back-patch the fix.

Per bug #15694 from Roman Zharkov (via Alexander Lakhin)
and an independent report by Tushar Ahuja.

Discussion: https://postgr.es/m/15694-74f2ca97e7044f7f@postgresql.org
Discussion: https://postgr.es/m/7d72ab20-c725-3ce2-f99d-4e64dd8a0de6@enterprisedb.com
2019-03-14 12:16:36 -04:00
Robert Haas 2455ab4884 Defend against leaks into RelationBuildPartitionDesc.
In normal builds, this isn't very important, because the leaks go
into fairly short-lived contexts, but under CLOBBER_CACHE_ALWAYS,
this can result in leaking hundreds of megabytes into MessageContext,
which probably explains recent failures on hyrax.

This may or may not be the best long-term strategy for dealing
with this leak, but we can change it later if we come up with
something better.  For now, do this to make the buildfarm green
again (hopefully).  Commit 898e5e3290
seems to have exacerbated this problem for reasons that are not
quite clear, but I don't believe it's actually the cause.

Discussion: http://postgr.es/m/CA+TgmoY3bRmGB6-DUnoVy5fJoreiBJ43rwMrQRCdPXuKt4Ykaw@mail.gmail.com
2019-03-14 12:14:47 -04:00
Peter Eisentraut c6ff0b892c Refactor ParamListInfo initialization
There were six copies of identical nontrivial code.  Put it into a
function.
2019-03-14 13:30:09 +01:00
Michael Paquier 6eebfdc38b Fix thinko when bumping on temporary directories in pg_checksums
This fixes an oversight from 5c99513.  This has no actual consequence as
PG_TEMP_FILE_PREFIX and PG_TEMP_FILES_DIR have the same value so when
bumping on a temporary path the directory scan was still moving on to
the next entry instead of skipping the rest of the scan, but let's keep
the logic correct.

Author: Michael Banck
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20190314.115417.58230569.horiguchi.kyotaro@lab.ntt.co.jp
Backpatch-through: 11
2019-03-14 14:14:49 +09:00
Tom Lane 401b87a24f Sync commentary in transam.h and bki.sgml.
Commit a6417078c missed updating some comments in transam.h about
reservation of high OIDs for development purposes.  Also tamp down
an over-optimistic comment there about how easy it'd be to change
FirstNormalObjectId.

Earlier, commit 09568ec3d failed to update bki.sgml for the split
between genbki.pl-assigned OIDs and those assigned during initdb.

Also fix genbki.pl so that it will complain if it overruns
that split.  It's possible that doing so would have no very bad
consequences, but that's no excuse for not detecting it.
2019-03-14 00:23:40 -04:00
Michael Paquier 364298be22 Fix race condition in recently-added TAP test for recovery consistency
A couple of queries are run on the primary to create and fill in a test
table, which gets checked on the standby afterwards.  However the test
was not waiting for the confirmation that the necessary records have
been replayed on the standby, leading to spurious failures.

Per buildfarm member loach.  Thanks to Thomas Munro for the report and
Tom Lane for the failure analysis.

Discussion: https://postgr.es/m/CA+hUKGLUpqG52xtriUz5RpmeKPoEfNxNc-CginG+Cx+X2-Ycew@mail.gmail.com
2019-03-14 12:41:45 +09:00
Tom Lane c015f853bf Adjust the tests for the hyperbolic functions.
Preliminary results from the buildfarm suggest that no platform gets
commit c6f153dcf's test cases wrong by more than one or two units in
the last place, so setting extra_float_digits = 0 should be plenty
to hide the cross-platform variations.

Also, add tests for Infinity/NaN inputs.  I think it highly likely
that we'll end up removing these again, rather than adding code to
make ancient platforms conform.  But it seems useful to find out
just how many platforms have such issues before we make a decision.

Discussion: https://postgr.es/m/E1h3nUY-0000sM-Vf@gemulon.postgresql.org
2019-03-13 21:05:33 -04:00
Tom Lane c6f153dcfe Rethink how to test the hyperbolic functions.
The initial commit tried to test them on trivial cases such as 0,
reasoning that we shouldn't hit any portability issues that way.
The buildfarm immediately proved that hope ill-founded, and anyway
it's not a great testing scheme because it doesn't prove that we're
even calling the right library function for each SQL function.

Instead, let's test them at inputs such as 1 (or something within
the valid range, as needed), so that each function should produce
a different output.

As committed, this is just about certain to show portability
failures, because it's very unlikely that every platform computes
these functions the same as mine down to the last bit.  However,
I want to put it through a buildfarm cycle this way, so that
we can see how big the variations are.  The plan is to add
"set extra_float_digits = -1", or whatever we need in order to
hide the variations; but first we need data.

Discussion: https://postgr.es/m/E1h3nUY-0000sM-Vf@gemulon.postgresql.org
2019-03-13 18:13:45 -04:00
Thomas Munro c6c9474aaf Use condition variables to wait for checkpoints.
Previously we used a polling/sleeping loop to wait for checkpoints
to begin and end, which leads to up to a couple hundred milliseconds
of needless thumb-twiddling.  Use condition variables instead.

Author: Thomas Munro
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CA%2BhUKGLY7sDe%2Bbg1K%3DbnEzOofGoo4bJHYh9%2BcDCXJepb6DQmLw%40mail.gmail.com
2019-03-14 10:59:33 +13:00
Robert Haas 5655565c07 Revert setting client_min_messages to 'debug1' in new tests.
The buildfarm doesn't like this, because some buildfarm members have
log_statement = 'all'.  We could change the log level of the messages
instead, but Tom doesn't like that.  So let's do this instead, at
least for now.

Patch by Sergei Kornilov, applied here in reverse.

Discussion: http://postgr.es/m/2123251552490241@myt6-fe24916a5562.qloud-c.yandex.net
2019-03-13 13:18:25 -04:00
Peter Eisentraut f177660ab0 Include all columns in default names for foreign key constraints
When creating a name for a foreign key constraint when none is
specified, use all column names instead of only the first one, similar
to how it is already done for index names.

Author: Paul Martinez <hellopfm@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CAF+2_SFjky6XRfLNRXpkG97W6PRbOO_mjAxqXzAAimU=c7w7_A@mail.gmail.com
2019-03-13 14:25:42 +01:00
Robert Haas bbb96c3704 Allow ALTER TABLE .. SET NOT NULL to skip provably unnecessary scans.
If existing CHECK or NOT NULL constraints preclude the presence
of nulls, we need not look to see whether any are present.

Sergei Kornilov, reviewed by Stephen Frost, Ildar Musin, David Rowley,
and by me.

Discussion: http://postgr.es/m/81911511895540@web58j.yandex.ru
2019-03-13 08:55:00 -04:00
Michael Paquier b0825d28ea Add TAP test to check consistency of minimum recovery LSN
c186ba13 has fixed an issue related to the updates of the minimum
recovery LSN across multiple processes on standbys, but we never really
had a test case able to reliably check its logic.

This commit introduces a new test case to close the gap, and is designed
to check the consistency of data based on the minimum recovery point set
by either the startup process or the checkpointer for both an offline
cluster (by looking at the on-disk page headers) and an online cluster
(using pageinspect).

Note that with c186ba13 reverted, this test fails badly for both the
online and offline cases, as designed.

Author: Michael Paquier, Andrew Gierth
Reviewed-by: Andrew Gierth, Georgios Kokolatos, Arthur Zakirov
Discussion: https://postgr.es/m/20181108044525.GA17482@paquier.xyz
2019-03-13 14:58:24 +09:00
Michael Paquier 6dd263cfaa Rename pg_verify_checksums to pg_checksums
The current tool name is too restrictive and focuses only on verifying
checksums.  As more options to control checksums for an offline cluster
are planned to be added, switch to a more generic name.  Documentation
as well as all past references to the tool are updated.

Author: Michael Paquier
Reviewed-by: Michael Banck, Fabien Coelho, Seigei Kornilov
Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
2019-03-13 10:43:20 +09:00
Michael Paquier c9ae7f704c Fix cross-version compatibility checks of pg_verify_checksums
pg_verify_checksums performs a read of the control file, and the data it
fetches should be from a data folder compatible with the major version
of Postgres the binary has been compiled with, but we never actually
checked that compatibility.

Reported-by: Sergei Kornilov
Author: Michael Paquier
Reviewed-by: Sergei Kornilov
Discussion: https://postgr.es/m/155231347133.16480.11453587097036807558.pgcf@coridan.postgresql.org
Backpatch-through: 11
2019-03-13 09:51:02 +09:00
Peter Geoghegan 3f34283973 Correct obsolete nbtree page split comment.
Commit 40dae7ec53, which made the nbtree page split algorithm more
robust, made _bt_insert_parent() only unlock the right child of the
parent page before inserting a new downlink into the parent.  Update a
comment from the Berkeley days claiming that both left and right child
pages are unlocked before the new downlink actually gets inserted.

The claim that it is okay to release both locks early based on Lehman
and Yao's say-so never made much sense.  Lehman and Yao must sometimes
"couple" buffer locks across a pair of internal pages when relocating a
downlink, unlike the corresponding code within _bt_getstack().
2019-03-12 16:40:05 -07:00
Tom Lane f1d85aa98e Add support for hyperbolic functions, as well as log10().
The SQL:2016 standard adds support for the hyperbolic functions
sinh(), cosh(), and tanh().  POSIX has long required libm to
provide those functions as well as their inverses asinh(),
acosh(), atanh().  Hence, let's just expose the libm functions
to the SQL level.  As with the trig functions, we only implement
versions for float8, not numeric.

For the moment, we'll assume that all platforms actually do have
these functions; if experience teaches otherwise, some autoconf
effort may be needed.

SQL:2016 also adds support for base-10 logarithm, but with the
function name log10(), whereas the name we've long used is log().
Add aliases named log10() for the float8 and numeric versions.

Lætitia Avrot

Discussion: https://postgr.es/m/CAB_COdguG22LO=rnxDQ2DW1uzv8aQoUzyDQNJjrR4k00XSgm5w@mail.gmail.com
2019-03-12 15:55:09 -04:00
Tom Lane 3aa0395d4e Remove remaining hard-wired OID references in the initial catalog data.
In the v11-era commits that taught genbki.pl to resolve symbolic
OID references in the initial catalog data, we didn't bother to
make every last reference symbolic; some of the catalogs have so
few initial rows that it didn't seem worthwhile.

However, the new project policy that OIDs assigned by new patches
should be automatically renumberable changes this calculus.
A patch that wants to add a row in one of these catalogs would have
a problem when the OID it assigns gets renumbered.  Hence, do the
mop-up work needed to make all OID references in initial data be
symbolic, and establish an associated project policy that we'll
never again write a hard-wired OID reference there.

No catversion bump since the contents of postgres.bki aren't
actually changed by this commit.

Discussion: https://postgr.es/m/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-=eaochT0dd2BN9CQ@mail.gmail.com
2019-03-12 12:30:35 -04:00
Tom Lane a6417078c4 Create a script that can renumber manually-assigned OIDs.
This commit adds a Perl script renumber_oids.pl, which can reassign a
range of manually-assigned OIDs to someplace else by modifying OID
fields of the catalog *.dat files and OID-assigning macros in the
catalog *.h files.

Up to now, we've encouraged new patches that need manually-assigned
OIDs to use OIDs just above the range of existing OIDs.  Predictably,
this leads to patches stepping on each others' toes, as whichever
one gets committed first creates an OID conflict that other patch
author(s) have to resolve manually.  With the availability of
renumber_oids.pl, we can eliminate a lot of this hassle.
The new project policy, therefore, is:

* Encourage new patches to use high OIDs (the documentation suggests
choosing a block of OIDs at random in 8000..9999).

* After feature freeze in each development cycle, run renumber_oids.pl
to move all such OIDs down to lower numbers, thus freeing the high OID
range for the next development cycle.

This plan should greatly reduce the risk of OID collisions between
concurrently-developed patches.  Also, if such a collision happens
anyway, we have the option to resolve it without much effort by doing
an off-schedule OID renumbering to get the first-committed patch out
of the way.  Or a patch author could use renumber_oids.pl to change
their patch's assignments without much pain.

This approach does put a premium on not hard-wiring any OID values
in places where renumber_oids.pl and genbki.pl can't fix them.
Project practice in that respect seems to be pretty good already,
but a follow-on patch will sand down some rough edges.

John Naylor and Tom Lane, per an idea of Peter Geoghegan's

Discussion: https://postgr.es/m/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-=eaochT0dd2BN9CQ@mail.gmail.com
2019-03-12 10:50:48 -04:00
Etsuro Fujita b5afdde6a7 Fix testing of parallel-safety of scan/join target.
In commit 960df2a971 ("Correctly assess parallel-safety of tlists when
SRFs are used."), the testing of scan/join target was done incorrectly,
which caused a plan-quality problem.  Backpatch through to v11 where
the aforementioned commit went in, since this is a regression from v10.

Author: Etsuro Fujita
Reviewed-by: Robert Haas and Tom Lane
Discussion: https://postgr.es/m/5C75303E.8020303@lab.ntt.co.jp
2019-03-12 16:21:57 +09:00
Amit Kapila 6f918159a9 Add more tests for FSM.
In commit b0eaa4c51b, we left out a test that used a vacuum to remove dead
rows as the behavior of test was not predictable.  This test has been
rewritten to use fillfactor instead to control free space.  Since we no
longer need to remove dead rows as part of the test, put the fsm regression
test in a parallel group.

Author: John Naylor
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1L=qWp_bJ5aTc9+fy4Ewx2LPaLWY-RbR4a60g_rupCKnQ@mail.gmail.com
2019-03-12 08:14:28 +05:30
Michael Paquier ce6afc6823 Add routine able to update the control file to src/common/
This adds a new routine to src/common/ which is compatible with both the
frontend and backend code, able to update the control file's contents.
This is now getting used only by pg_rewind, but some upcoming patches
which add more control on checksums for offline instances will make use
of it.  This could also get used more by the backend as xlog.c has its
own flavor of the same logic with some wait events and an additional
flush phase before closing the opened file descriptor, but this is let
as separate work.

Author: Michael Banck, Michael Paquier
Reviewed-by: Fabien Coelho, Sergei Kornilov
Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
2019-03-12 10:03:33 +09:00
Tom Lane 1a83a80a2f Allow fractional input values for integer GUCs, and improve rounding logic.
Historically guc.c has just refused examples like set work_mem = '30.1GB',
but it seems more useful for it to take that and round off the value to
some reasonable approximation of what the user said.  Just rounding to
the parameter's native unit would work, but it would lead to rather
silly-looking settings, such as 31562138kB for this example.  Instead
let's round to the nearest multiple of the next smaller unit (if any),
producing 30822MB.

Also, do the units conversion math in floating point and round to integer
(if needed) only at the end.  This produces saner results for inputs that
aren't exact multiples of the parameter's native unit, and removes another
difference in the behavior for integer vs. float parameters.

In passing, document the ability to use hex or octal input where it
ought to be documented.

Discussion: https://postgr.es/m/1798.1552165479@sss.pgh.pa.us
2019-03-11 19:13:55 -04:00
Andres Freund 32b8f0b033 Remove spurious return.
Per buildfarm member anole.

Author: Andres Freund
2019-03-11 15:04:00 -07:00
Tom Lane d9c5e9629b Give up on testing guc.c's behavior for "infinity" inputs.
Further buildfarm testing shows that on the machines that are failing
ac75959cd's test case, what we're actually getting from strtod("-infinity")
is a syntax error (endptr == value) not ERANGE at all.  This test case
is not worth carrying two sets of expected output for, so just remove it,
and revert commit b212245f9's misguided attempt to work around the platform
dependency.

Discussion: https://postgr.es/m/E1h33xk-0001Og-Gs@gemulon.postgresql.org
2019-03-11 17:53:09 -04:00
Andres Freund 8cacea7a72 Ensure sufficient alignment for ParallelTableScanDescData in BTShared.
Previously ParallelTableScanDescData was just a member in BTShared,
but after c2fe139c2 that doesn't guarantee sufficient alignment as
specific AMs might (are likely to) need atomic variables in the
struct.

One might think that MAXALIGNing would be sufficient, but as a
comment in shm_toc_allocate() explains, that's not enough. For now,
copy the hack described there.

For parallel sequential scans no such change is needed, as its
allocations go through shm_toc_allocate().

An alternative approach would have been to allocate the parallel scan
descriptor in a separate TOC entry, but there seems little benefit in
doing so.

Per buildfarm member dromedary.

Author: Andres Freund
Discussion: https://postgr.es/m/20190311203126.ty5gbfz42gjbm6i6@alap3.anarazel.de
2019-03-11 14:26:43 -07:00
Andres Freund c2fe139c20 tableam: Add and use scan APIs.
Too allow table accesses to be not directly dependent on heap, several
new abstractions are needed. Specifically:

1) Heap scans need to be generalized into table scans. Do this by
   introducing TableScanDesc, which will be the "base class" for
   individual AMs. This contains the AM independent fields from
   HeapScanDesc.

   The previous heap_{beginscan,rescan,endscan} et al. have been
   replaced with a table_ version.

   There's no direct replacement for heap_getnext(), as that returned
   a HeapTuple, which is undesirable for a other AMs. Instead there's
   table_scan_getnextslot().  But note that heap_getnext() lives on,
   it's still used widely to access catalog tables.

   This is achieved by new scan_begin, scan_end, scan_rescan,
   scan_getnextslot callbacks.

2) The portion of parallel scans that's shared between backends need
   to be able to do so without the user doing per-AM work. To achieve
   that new parallelscan_{estimate, initialize, reinitialize}
   callbacks are introduced, which operate on a new
   ParallelTableScanDesc, which again can be subclassed by AMs.

   As it is likely that several AMs are going to be block oriented,
   block oriented callbacks that can be shared between such AMs are
   provided and used by heap. table_block_parallelscan_{estimate,
   intiialize, reinitialize} as callbacks, and
   table_block_parallelscan_{nextpage, init} for use in AMs. These
   operate on a ParallelBlockTableScanDesc.

3) Index scans need to be able to access tables to return a tuple, and
   there needs to be state across individual accesses to the heap to
   store state like buffers. That's now handled by introducing a
   sort-of-scan IndexFetchTable, which again is intended to be
   subclassed by individual AMs (for heap IndexFetchHeap).

   The relevant callbacks for an AM are index_fetch_{end, begin,
   reset} to create the necessary state, and index_fetch_tuple to
   retrieve an indexed tuple.  Note that index_fetch_tuple
   implementations need to be smarter than just blindly fetching the
   tuples for AMs that have optimizations similar to heap's HOT - the
   currently alive tuple in the update chain needs to be fetched if
   appropriate.

   Similar to table_scan_getnextslot(), it's undesirable to continue
   to return HeapTuples. Thus index_fetch_heap (might want to rename
   that later) now accepts a slot as an argument. Core code doesn't
   have a lot of call sites performing index scans without going
   through the systable_* API (in contrast to loads of heap_getnext
   calls and working directly with HeapTuples).

   Index scans now store the result of a search in
   IndexScanDesc->xs_heaptid, rather than xs_ctup->t_self. As the
   target is not generally a HeapTuple anymore that seems cleaner.

To be able to sensible adapt code to use the above, two further
callbacks have been introduced:

a) slot_callbacks returns a TupleTableSlotOps* suitable for creating
   slots capable of holding a tuple of the AMs
   type. table_slot_callbacks() and table_slot_create() are based
   upon that, but have additional logic to deal with views, foreign
   tables, etc.

   While this change could have been done separately, nearly all the
   call sites that needed to be adapted for the rest of this commit
   also would have been needed to be adapted for
   table_slot_callbacks(), making separation not worthwhile.

b) tuple_satisfies_snapshot checks whether the tuple in a slot is
   currently visible according to a snapshot. That's required as a few
   places now don't have a buffer + HeapTuple around, but a
   slot (which in heap's case internally has that information).

Additionally a few infrastructure changes were needed:

I) SysScanDesc, as used by systable_{beginscan, getnext} et al. now
   internally uses a slot to keep track of tuples. While
   systable_getnext() still returns HeapTuples, and will so for the
   foreseeable future, the index API (see 1) above) now only deals with
   slots.

The remainder, and largest part, of this commit is then adjusting all
scans in postgres to use the new APIs.

Author: Andres Freund, Haribabu Kommi, Alvaro Herrera
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
2019-03-11 12:46:41 -07:00
Andrew Dunstan a478415281 pgbench: increase the maximum number of variables/arguments
pgbench's arbitrary limit of 10 arguments for SQL statements or
metacommands is far too low. Increase it to 256.

This results in a very modest increase in memory usage, not enough to
worry about.

The maximum includes the SQL statement or metacommand. This is reflected
in the comments and revised TAP tests.

Simon Riggs and Dagfinn Ilmari Mannsåker with some light editing by me.
Reviewed by: David Rowley and Fabien Coelho

Discussion: https://postgr.es/m/CANP8+jJiMJOAf-dLoHuR-8GENiK+eHTY=Omw38Qx7j2g0NDTXA@mail.gmail.com
2019-03-11 13:18:37 -04:00
Amit Kapila a6e48da088 Fix typos in commit 8586bf7ed8.
Author: Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1KNv1Mg2krf4E9ssWFnE=8A9mZ1VbVywXBZTFSzb+wP2g@mail.gmail.com
2019-03-11 09:58:46 -07:00
Alvaro Herrera af38498d4c Move hash_any prototype from access/hash.h to utils/hashutils.h
... as well as its implementation from backend/access/hash/hashfunc.c to
backend/utils/hash/hashfn.c.

access/hash is the place for the hash index AM, not really appropriate
for generic facilities, which is what hash_any is; having things the old
way meant that anything using hash_any had to include the AM's include
file, pointlessly polluting its namespace with unrelated, unnecessary
cruft.

Also move the HTEqual strategy number to access/stratnum.h from
access/hash.h.

To avoid breaking third-party extension code, add an #include
"utils/hashutils.h" to access/hash.h.  (An easily removed line by
committers who enjoy their asbestos suits to protect them from angry
extension authors.)

Discussion: https://postgr.es/m/201901251935.ser5e4h6djt2@alvherre.pgsql
2019-03-11 13:17:50 -03:00
Tom Lane b212245f96 In guc.c, ignore ERANGE errors from strtod().
Instead, just proceed with the infinity or zero result that it should
return for overflow/underflow.  This avoids a platform dependency,
in that various versions of strtod are inconsistent about whether they
signal ERANGE for a value that's specified as infinity.

It's possible this won't be enough to remove the buildfarm failures
we're seeing from ac75959cd, in which case I'll take out the infinity
test case that commit added.  But first let's see if we can fix it.

Discussion: https://postgr.es/m/E1h33xk-0001Og-Gs@gemulon.postgresql.org
2019-03-11 11:25:26 -04:00
Michael Meskes 08cecfaf60 Fix potential memory access violation in ecpg if filename of include file is
shorter than 2 characters.

Patch by: "Wu, Fei" <wufei.fnst@cn.fujitsu.com>
2019-03-11 16:11:16 +01:00
Michael Meskes 98bdaab0d9 Fix ecpglib regression that made it impossible to close a cursor that was
opened in a prepared statement.

Patch by: "Kuroda, Hayato" <kuroda.hayato@jp.fujitsu.com>
2019-03-11 16:00:13 +01:00
Peter Eisentraut 3c06715447 Remove unused macro
Use was removed in 25ca5a9a54.
2019-03-11 09:29:50 +01:00
Peter Eisentraut 27f3dea648 psql: Add documentation URL to \help output
Add a link to the specific command's reference web page to the bottom
of its \help output.

Discussion: https://www.postgresql.org/message-id/flat/40179bd0-fa7d-4108-1991-a20ae9ad5667%402ndquadrant.com
2019-03-11 09:11:37 +01:00
Michael Paquier f2d84a4a6b Adjust error message for partial writes in WAL segments
93473c6 has removed openLogOff, changing on the way the error message
which is used to report partial writes to WAL segments.  The
newly-introduced error message used the offset up to which the write has
happened, keeping always the same total length to write.  This changes
the error message so as the number of bytes left to write are reported.

Reported-by: Michael Paquier
Author: Robert Haas
Discussion: https://postgr.es/m/20190306235251.GA17293@paquier.xyz
2019-03-11 09:31:25 +09:00
Tom Lane cbccac371c Reduce the default value of autovacuum_vacuum_cost_delay to 2ms.
This is a better way to implement the desired change of increasing
autovacuum's default resource consumption.

Discussion: https://postgr.es/m/28720.1552101086@sss.pgh.pa.us
2019-03-10 15:16:21 -04:00
Tom Lane 52985e4fea Revert "Increase the default vacuum_cost_limit from 200 to 2000"
This reverts commit bd09503e63.

Per discussion, it seems like what we should do instead is to
reduce the default value of autovacuum_vacuum_cost_delay by the
same factor.  That's functionally equivalent as long as the
platform can accurately service the smaller delay request, which
should be true on anything released in the last 10 years or more.
And smaller, more-closely-spaced delays are better in terms of
providing a steady I/O load.

Discussion: https://postgr.es/m/28720.1552101086@sss.pgh.pa.us
2019-03-10 15:05:25 -04:00
Tom Lane caf626b2cd Convert [autovacuum_]vacuum_cost_delay into floating-point GUCs.
This change makes it possible to specify sub-millisecond delays,
which work well on most modern platforms, though that was not true
when the cost-delay feature was designed.

To support this without breaking existing configuration entries,
improve guc.c to allow floating-point GUCs to have units.  Also,
allow "us" (microseconds) as an input/output unit for time-unit GUCs.
(It's not allowed as a base unit, at least not yet.)

Likewise change the autovacuum_vacuum_cost_delay reloption to be
floating-point; this forces a catversion bump because the layout of
StdRdOptions changes.

This patch doesn't in itself change the default values or allowed
ranges for these parameters, and it should not affect the behavior
for any already-allowed setting for them.

Discussion: https://postgr.es/m/1798.1552165479@sss.pgh.pa.us
2019-03-10 15:01:39 -04:00
Tom Lane 28a65fc360 Include GUC's unit, if it has one, in out-of-range error messages.
This should reduce confusion in cases where we've applied a units
conversion, so that the number being reported (and the quoted range
limits) are in some other units than what the user gave in the
setting we're rejecting.

Some of the changes here assume that float GUCs can have units,
which isn't true just yet, but will be shortly.

Discussion: https://postgr.es/m/3811.1552169665@sss.pgh.pa.us
2019-03-10 13:18:17 -04:00
Tom Lane ac75959cdc Disallow NaN as a value for floating-point GUCs.
None of the code that uses GUC values is really prepared for them to
hold NaN, but parse_real() didn't have any defense against accepting
such a value.  Treat it the same as a syntax error.

I haven't attempted to analyze the exact consequences of setting any
of the float GUCs to NaN, but since they're quite unlikely to be good,
this seems like a back-patchable bug fix.

Note: we don't need an explicit test for +-Infinity because those will
be rejected by existing range checks.  I added a regression test for
that in HEAD, but not older branches because the spelling of the value
in the error message will be platform-dependent in branches where we
don't always use port/snprintf.c.

Discussion: https://postgr.es/m/1798.1552165479@sss.pgh.pa.us
2019-03-10 12:59:16 -04:00
Alvaro Herrera 203749a8a6 pg_upgrade: Ignore TOAST for partitioned tables
Since partitioned tables in pg12 do not have toast tables, trying to set
the toast OID confuses pg_upgrade.  Have pg_dump omit those values to
avoid the problem.

Per Andres Freund and buildfarm members crake and snapper
Discussion: https://postgr.es/m/20190306204104.yle5jfbnqkcwykni@alap3.anarazel.de
2019-03-10 13:20:58 -03:00
Alexander Korotkov f2e403803f Support for INCLUDE attributes in GiST indexes
Similarly to B-tree, GiST index access method gets support of INCLUDE
attributes.  These attributes aren't used for tree navigation and aren't
present in non-leaf pages.  But they are present in leaf pages and can be
fetched during index-only scan.

The point of having INCLUDE attributes in GiST indexes is slightly different
from the point of having them in B-tree.  The main point of INCLUDE attributes
in B-tree is to define UNIQUE constraint over part of attributes enabled for
index-only scan.  In GiST the main point of INCLUDE attributes is to use
index-only scan for attributes, whose data types don't have GiST opclasses.

Discussion: https://postgr.es/m/73A1A452-AD5F-40D4-BD61-978622FF75C1%40yandex-team.ru
Author: Andrey Borodin, with small changes by me
Reviewed-by: Andreas Karlsson
2019-03-10 11:37:17 +03:00
Tom Lane a0b7626268 Simplify release-note links to back branches.
Now that https://www.postgresql.org/docs/release/ is populated,
replace the stopgap text we had under "Prior Releases" with
a pointer to that archive.

Discussion: https://postgr.es/m/e0f09c9a-bd2b-862a-d379-601dfabc8969@postgresql.org
2019-03-09 18:42:39 -05:00
Magnus Hagander 0516c61b75 Add new clientcert hba option verify-full
This allows a login to require both that the cn of the certificate
matches (like authentication type cert) *and* that another
authentication method (such as password or kerberos) succeeds as well.

The old value of clientcert=1 maps to the new clientcert=verify-ca,
clientcert=0 maps to the new clientcert=no-verify, and the new option
erify-full will add the validation of the CN.

Author: Julian Markwort, Marius Timmer
Reviewed by: Magnus Hagander, Thomas Munro
2019-03-09 12:19:47 -08:00
Magnus Hagander 6b9e875f72 Track block level checksum failures in pg_stat_database
This adds a column that counts how many checksum failures have occurred
on files belonging to a specific database. Both checksum failures
during normal backend processing and those created when a base backup
detects a checksum failure are counted.

Author: Magnus Hagander
Reviewed by: Julien Rouhaud
2019-03-09 10:47:30 -08:00
Noah Misch 3c5926301a Avoid some table rewrites for ALTER TABLE .. SET DATA TYPE timestamp.
When the timezone is UTC, timestamptz and timestamp are binary coercible
in both directions.  See b8a18ad485 and
c22ecc6562 for the previous attempt in
this problem space.  Skip the table rewrite; for now, continue to
needlessly rewrite any index on an affected column.

Reviewed by Simon Riggs and Tom Lane.

Discussion: https://postgr.es/m/20190226061450.GA1665944@rfd.leadboat.com
2019-03-08 20:16:27 -08:00
Michael Paquier 82a5649fb9 Tighten use of OpenTransientFile and CloseTransientFile
This fixes two sets of issues related to the use of transient files in
the backend:
1) OpenTransientFile() has been used in some code paths with read-write
flags while read-only is sufficient, so switch those calls to be
read-only where necessary.  These have been reported by Joe Conway.
2) When opening transient files, it is up to the caller to close the
file descriptors opened.  In error code paths, CloseTransientFile() gets
called to clean up things before issuing an error.  However in normal
exit paths, a lot of callers of CloseTransientFile() never actually
reported errors, which could leave a file descriptor open without
knowing about it.  This is an issue I complained about a couple of
times, but never had the courage to write and submit a patch, so here we
go.

Note that one frontend code path is impacted by this commit so as an
error is issued when fetching control file data, making backend and
frontend to be treated consistently.

Reported-by: Joe Conway, Michael Paquier
Author: Michael Paquier
Reviewed-by: Álvaro Herrera, Georgios Kokolatos, Joe Conway
Discussion: https://postgr.es/m/20190301023338.GD1348@paquier.xyz
Discussion: https://postgr.es/m/c49b69ec-e2f7-ff33-4f17-0eaa4f2cef27@joeconway.com
2019-03-09 08:50:55 +09:00
Alvaro Herrera 2e616dee9e Fix crash with old libxml2
Certain libxml2 versions (such as the 2.7.6 commonly seen in older
distributions, but apparently only on x86_64) contain a bug that causes
xmlCopyNode, when called on a XML_DOCUMENT_NODE, to return a node that
xmlFreeNode crashes on.  Arrange to call xmlFreeDoc instead of
xmlFreeNode for those nodes.

Per buildfarm members lapwing and grison.

Author: Pavel Stehule, light editing by Álvaro.
Discussion: https://postgr.es/m/20190308024436.GA2374@alvherre.pgsql
2019-03-08 19:13:25 -03:00
Tom Lane 1b76168da7 Reformat catalog .dat files.
Test run for my previous commit; cleans up formatting issues in some
other recent commits.
2019-03-08 12:01:27 -05:00
Tom Lane 27aaf6eff4 Minor improvements for reformat_dat_file.pl.
Use Getopt::Long in preference to hand-rolled option parsing code.

Also, remove "-I .../backend/catalog" switch from the Makefile
invocations.  That's been unnecessary for some time, and leaving it
there gives the false impression it's needed in manual invocations.

John Naylor (extracted from a larger but more controversial patch)

Discussion: https://postgr.es/m/CACPNZCsHdcQN2jQ1=ptbi1Co2Nj3aHgRCUMk62=ThgWNabPY+Q@mail.gmail.com
2019-03-08 11:48:49 -05:00
Michael Paquier beeb8e2e07 Fix compatibility of pg_basebackup -R with 11 and older versions
When 2dedf4d9 has integrated recovery.conf into postgresql.conf, it also
changed pg_basebackup -R in the way recovery configuration is
generated.  However this implementation forgot the fact that
pg_basebackup needs to keep compatibility with older server versions as
well.

Reported-by: Devrim Gündüz
Author: Sergei Kornilov, Michael Paquier
Discussion: https://postgr.es/m/3458f7cd12d74acd90180a671c8d5a081d60e162.camel@gunduz.org
2019-03-08 10:17:23 +09:00
Alvaro Herrera 251cf2e27b Fix minor deficiencies in XMLTABLE, xpath(), xmlexists()
Correctly process nodes of more types than previously.  In some cases,
nodes were being ignored (nothing was output); in other cases, trying to
return them resulted in errors about unrecognized nodes.  In yet other
cases, necessary escaping (of XML special characters) was not being
done.  Fix all those (as far as the authors could find) and add
regression tests cases verifying the new behavior.

I (Álvaro) was of two minds about backpatching these changes.  They do
seem bugfixes that would benefit most users of the affected functions;
but on the other hand it would change established behavior in minor
releases, so it seems prudent not to.

Authors: Pavel Stehule, Markus Winand, Chapman Flack
Discussion:
   https://postgr.es/m/CAFj8pRA6J25CtAZ2TuRvxK3gat7-bBUYh0rfE2yM7Hj9GD14Dg@mail.gmail.com
   https://postgr.es/m/8BDB0627-2105-4564-AA76-7849F028B96E@winand.at

The elephant in the room as pointed out by Chapman Flack, not fixed in
this commit, is that we still have XMLTABLE operating on XPath 1.0
instead of the standard-mandated XQuery (or even its subset XPath 2.0).
Fixing that is a major undertaking, however.
2019-03-07 18:16:34 -03:00
Tom Lane 1d33858406 Fix handling of targetlist SRFs when scan/join relation is known empty.
When we introduced separate ProjectSetPath nodes for application of
set-returning functions in v10, we inadvertently broke some cases where
we're supposed to recognize that the result of a subquery is known to be
empty (contain zero rows).  That's because IS_DUMMY_REL was just looking
for a childless AppendPath without allowing for a ProjectSetPath being
possibly stuck on top.  In itself, this didn't do anything much worse
than produce slightly worse plans for some corner cases.

Then in v11, commit 11cf92f6e rearranged things to allow the scan/join
targetlist to be applied directly to partial paths before they get
gathered.  But it inserted a short-circuit path for dummy relations
that was a little too short: it failed to insert a ProjectSetPath node
at all for a targetlist containing set-returning functions, resulting in
bogus "set-valued function called in context that cannot accept a set"
errors, as reported in bug #15669 from Madelaine Thibaut.

The best way to fix this mess seems to be to reimplement IS_DUMMY_REL
so that it drills down through any ProjectSetPath nodes that might be
there (and it seems like we'd better allow for ProjectionPath as well).

While we're at it, make it look at rel->pathlist not cheapest_total_path,
so that it gives the right answer independently of whether set_cheapest
has been done lately.  That dependency looks pretty shaky in the context
of code like apply_scanjoin_target_to_paths, and even if it's not broken
today it'd certainly bite us at some point.  (Nastily, unsafe use of the
old coding would almost always work; the hazard comes down to possibly
looking through a dangling pointer, and only once in a blue moon would
you find something there that resulted in the wrong answer.)

It now looks like it was a mistake for IS_DUMMY_REL to be a macro: if
there are any extensions using it, they'll continue to use the old
inadequate logic until they're recompiled, after which they'll fail
to load into server versions predating this fix.  Hopefully there are
few such extensions.

Having fixed IS_DUMMY_REL, the special path for dummy rels in
apply_scanjoin_target_to_paths is unnecessary as well as being wrong,
so we can just drop it.

Also change a few places that were testing for partitioned-ness of a
planner relation but not using IS_PARTITIONED_REL for the purpose; that
seems unsafe as well as inconsistent, plus it required an ugly hack in
apply_scanjoin_target_to_paths.

In passing, save a few cycles in apply_scanjoin_target_to_paths by
skipping processing of pre-existing paths for partitioned rels,
and do some cosmetic cleanup and comment adjustment in that function.

I renamed IS_DUMMY_PATH to IS_DUMMY_APPEND with the intention of breaking
any code that might be using it, since in almost every case that would
be wrong; IS_DUMMY_REL is what to be using instead.

In HEAD, also make set_dummy_rel_pathlist static (since it's no longer
used from outside allpaths.c), and delete is_dummy_plan, since it's no
longer used anywhere.

Back-patch as appropriate into v11 and v10.

Tom Lane and Julien Rouhaud

Discussion: https://postgr.es/m/15669-02fb3296cca26203@postgresql.org
2019-03-07 14:22:13 -05:00
Robert Haas 898e5e3290 Allow ATTACH PARTITION with only ShareUpdateExclusiveLock.
We still require AccessExclusiveLock on the partition itself, because
otherwise an insert that violates the newly-imposed partition
constraint could be in progress at the same time that we're changing
that constraint; only the lock level on the parent relation is
weakened.

To make this safe, we have to cope with (at least) three separate
problems. First, relevant DDL might commit while we're in the process
of building a PartitionDesc.  If so, find_inheritance_children() might
see a new partition while the RELOID system cache still has the old
partition bound cached, and even before invalidation messages have
been queued.  To fix that, if we see that the pg_class tuple seems to
be missing or to have a null relpartbound, refetch the value directly
from the table. We can't get the wrong value, because DETACH PARTITION
still requires AccessExclusiveLock throughout; if we ever want to
change that, this will need more thought. In testing, I found it quite
difficult to hit even the null-relpartbound case; the race condition
is extremely tight, but the theoretical risk is there.

Second, successive calls to RelationGetPartitionDesc might not return
the same answer.  The query planner will get confused if lookup up the
PartitionDesc for a particular relation does not return a consistent
answer for the entire duration of query planning.  Likewise, query
execution will get confused if the same relation seems to have a
different PartitionDesc at different times.  Invent a new
PartitionDirectory concept and use it to ensure consistency.  This
ensures that a single invocation of either the planner or the executor
sees the same view of the PartitionDesc from beginning to end, but it
does not guarantee that the planner and the executor see the same
view.  Since this allows pointers to old PartitionDesc entries to
survive even after a relcache rebuild, also postpone removing the old
PartitionDesc entry until we're certain no one is using it.

For the most part, it seems to be OK for the planner and executor to
have different views of the PartitionDesc, because the executor will
just ignore any concurrently added partitions which were unknown at
plan time; those partitions won't be part of the inheritance
expansion, but invalidation messages will trigger replanning at some
point.  Normally, this happens by the time the very next command is
executed, but if the next command acquires no locks and executes a
prepared query, it can manage not to notice until a new transaction is
started.  We might want to tighten that up, but it's material for a
separate patch.  There would still be a small window where a query
that started just after an ATTACH PARTITION command committed might
fail to notice its results -- but only if the command starts before
the commit has been acknowledged to the user. All in all, the warts
here around serializability seem small enough to be worth accepting
for the considerable advantage of being able to add partitions without
a full table lock.

Although in general the consequences of new partitions showing up
between planning and execution are limited to the query not noticing
the new partitions, run-time partition pruning will get confused in
that case, so that's the third problem that this patch fixes.
Run-time partition pruning assumes that indexes into the PartitionDesc
are stable between planning and execution.  So, add code so that if
new partitions are added between plan time and execution time, the
indexes stored in the subplan_map[] and subpart_map[] arrays within
the plan's PartitionedRelPruneInfo get adjusted accordingly.  There
does not seem to be a simple way to generalize this scheme to cope
with partitions that are removed, mostly because they could then get
added back again with different bounds, but it works OK for added
partitions.

This code does not try to ensure that every backend participating in
a parallel query sees the same view of the PartitionDesc.  That
currently doesn't matter, because we never pass PartitionDesc
indexes between backends.  Each backend will ignore the concurrently
added partitions which it notices, and it doesn't matter if different
backends are ignoring different sets of concurrently added partitions.
If in the future that matters, for example because we allow writes in
parallel query and want all participants to do tuple routing to the same
set of partitions, the PartitionDirectory concept could be improved to
share PartitionDescs across backends.  There is a draft patch to
serialize and restore PartitionDescs on the thread where this patch
was discussed, which may be a useful place to start.

Patch by me.  Thanks to Alvaro Herrera, David Rowley, Simon Riggs,
Amit Langote, and Michael Paquier for discussion, and to Alvaro
Herrera for some review.

Discussion: http://postgr.es/m/CA+Tgmobt2upbSocvvDej3yzokd7AkiT+PvgFH+a9-5VV1oJNSQ@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoZE0r9-cyA-aY6f8WFEROaDLLL7Vf81kZ8MtFCkxpeQSw@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoY13KQZF-=HNTrt9UYWYx3_oYOQpu9ioNT49jGgiDpUEA@mail.gmail.com
2019-03-07 11:13:12 -05:00
Alvaro Herrera eaaa5986ad Fix the BY {REF,VALUE} clause of XMLEXISTS/XMLTABLE
This clause is used to indicate the passing mode of a XML document, but
we were doing it wrong: we accepted BY REF and ignored it, and rejected
BY VALUE as a syntax error.  The reality, however, is that documents are
always passed BY VALUE, so rejecting that clause was silly.  Change
things so that we accept BY VALUE.

BY REF continues to be accepted, and continues to be ignored.

Author: Chapman Flack
Reviewed-by: Pavel Stehule
Discussion: https://postgr.es/m/5C297BB7.9070509@anastigmatix.net
2019-03-07 11:20:35 -03:00
Alvaro Herrera cb706ec4b6 Add missing <limits.h>
Per buildfarm
2019-03-07 10:08:29 -03:00
Alvaro Herrera 7e413a0f82 pg_dump: allow multiple rows per insert
This is useful to speed up loading data in a different database engine.

Authors: Surafel Temesgen and David Rowley.  Lightly edited by Álvaro.
Reviewed-by: Fabien Coelho
Discussion: https://postgr.es/m/CALAY4q9kumSdnRBzvRJvSRf2+BH20YmSvzqOkvwpEmodD-xv6g@mail.gmail.com
2019-03-07 09:34:17 -03:00
Thomas Munro 42210524cc Remove useless header inclusion. 2019-03-07 20:02:22 +13:00
Thomas Munro 91595f9d49 Drop the vestigial "smgr" type.
Before commit 3fa2bb31 this type appeared in the catalogs to
select which of several block storage mechanisms each relation
used.

New features under development propose to revive the concept of
different block storage managers for new kinds of data accessed
via bufmgr.c, but don't need to put references to them in the
catalogs.  So, avoid useless maintenance work on this type by
dropping it.  Update some regression tests that were referencing
it where any type would do.

Discussion: https://postgr.es/m/CA%2BhUKG%2BDE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo%2B83vvw%40mail.gmail.com
2019-03-07 15:44:04 +13:00
Andres Freund 277cb78983 Don't reuse slots between root and partition in ON CONFLICT ... UPDATE.
Until now the the slot to store the conflicting tuple, and the result
of the ON CONFLICT SET, where reused between partitions. That
necessitated changing slots descriptor when switching partitions.

Besides the overhead of switching descriptors on a slot (which
requires memory allocations and prevents JITing), that's importantly
also problematic for tableam. There individual partitions might belong
to different tableams, needing different kinds of slots.

In passing also fix ExecOnConflictUpdate to clear the existing slot at
exit. Otherwise that slot could continue to hold a pin till the query
ends, which could be far too long if the input data set is large, and
there's no further conflicts. While previously also problematic, it's
now more important as there will be more such slots when partitioned.

Author: Andres Freund
Reviewed-By: Robert Haas, David Rowley
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-03-06 15:43:33 -08:00
Andres Freund d16a74c20c Fix equalfuncs for accessMethod addition in 8586bf7ed8.
In a complete brown paper bag moment, I forgot to include equalfuncs
in my previous fix of copy/out/readfuncs.  Thanks Tom for noticing.

Discussion: https://postgr.es/m/1659.1551903210@sss.pgh.pa.us
2019-03-06 13:04:09 -08:00
Andrew Dunstan 342cb650e0 Don't log incomplete startup packet if it's empty
This will stop logging cases where, for example, a monitor opens a
connection and immediately closes it. If the packet contains any data an
incomplete packet will still be logged.

Author: Tom Lane

Discussion: https://postgr.es/m/a1379a72-2958-1ed0-ef51-09a21219b155@2ndQuadrant.com
2019-03-06 15:36:41 -05:00
Andres Freund b172342321 Fix copy/out/readfuncs for accessMethod addition in 8586bf7ed8.
This includes a catversion bump, as IntoClause is theoretically
speaking part of storable rules. In practice I don't think that can
happen, but there's no reason to be stingy here.

Per buildfarm member calliphoridae.
2019-03-06 11:55:28 -08:00
Andres Freund 863aa55624 Fix collation dependency in test introduced in 8586bf7ed8, take 2.
Per buildfarm.  This time I hopefully actually made sure to get all
the cases...
2019-03-06 11:27:14 -08:00
Andres Freund 836f634522 Fix collation dependency in test introduced in 8586bf7ed8.
Per buildfarm.
2019-03-06 11:06:01 -08:00
Andres Freund 3b925e905d tableam: Add pg_dump support.
This adds pg_dump support for table AMs in a similar manner to how
tablespaces are handled. That is, instead of specifying the AM for
every CREATE TABLE etc, emit SET default_table_access_method
statements. That makes it easier to change the AM for all/most tables
in a dump, and allows restore to succeed even if some AM is not
available.

This increases the dump archive version, as a tables/matview's AM
needs to be tracked therein.

Author: Dimitri Dolgov, Andres Freund
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
2019-03-06 09:54:38 -08:00
Andres Freund 8586bf7ed8 tableam: introduce table AM infrastructure.
This introduces the concept of table access methods, i.e. CREATE
  ACCESS METHOD ... TYPE TABLE and
  CREATE TABLE ... USING (storage-engine).
No table access functionality is delegated to table AMs as of this
commit, that'll be done in following commits.

Subsequent commits will incrementally abstract table access
functionality to be routed through table access methods. That change
is too large to be reviewed & committed at once, so it'll be done
incrementally.

Docs will be updated at the end, as adding them incrementally would
likely make them less coherent, and definitely is a lot more work,
without a lot of benefit.

Table access methods are specified similar to index access methods,
i.e. pg_am.amhandler returns, as INTERNAL, a pointer to a struct with
callbacks. In contrast to index AMs that struct needs to live as long
as a backend, typically that's achieved by just returning a pointer to
a constant struct.

Psql's \d+ now displays a table's access method. That can be disabled
with HIDE_TABLEAM=true, which is mainly useful so regression tests can
be run against different AMs.  It's quite possible that this behaviour
still needs to be fine tuned.

For now it's not allowed to set a table AM for a partitioned table, as
we've not resolved how partitions would inherit that. Disallowing
allows us to introduce, if we decide that's the way forward, such a
behaviour without a compatibility break.

Catversion bumped, to add the heap table AM and references to it.

Author: Haribabu Kommi, Andres Freund, Alvaro Herrera, Dimitri Golgov and others
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
    https://postgr.es/m/20190107235616.6lur25ph22u5u5av@alap3.anarazel.de
    https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
2019-03-06 09:54:38 -08:00
Andres Freund f217761856 Fix bug in clearing of virtual tuple slot.
I broke/typoed this in 4da597edf1. Astonishingly this mostly
doesn't cause breakage, except when trying to change the tuple
descriptor of a slot (because TTS_FLAG_FIXED is assumed to be set).

Author: Andres Freund
2019-03-06 09:54:38 -08:00
Robert Haas 93473c6ac8 Removed unused variable, openLogOff.
Antonin Houska

Discussion: http://postgr.es/m/30413.1551870730@localhost
2019-03-06 09:44:08 -05:00
Andrew Dunstan bd09503e63 Increase the default vacuum_cost_limit from 200 to 2000
The original 200 default value was set back in f425b605f4 when the cost
delay settings were first added.  Hardware has improved quite a bit since
then and we've also made improvements such as sorting buffers during
checkpoints (9cd00c457e) which should result in less random writes.

This low default value was reportedly causing problems with badly
configured servers and in the absence of a native method to remove
excessive bloat from tables without incurring an AccessExclusiveLock, this
often made cleaning up the damage caused by badly configured auto-vacuums
difficult.

It seems more likely that someone will notice that auto-vacuum is running
too quickly than too slowly, so let's go all out and multiple the default
value for the setting by 10.  With the default vacuum_cost_page_dirty and
autovacuum_vacuum_cost_delay (assuming a page size of 8192 bytes), this
allows autovacuum a theoretical maximum dirty write rate of around 39MB/s
instead of just 3.9MB/s.

Author: David Rowley

Discussion: https://postgr.es/m/CAKJS1f_YbXC2qTMPyCbmsPiKvZYwpuQNQMohiRXLj1r=8_rYvw@mail.gmail.com
2019-03-06 09:10:12 -05:00
Michael Paquier ff9bff0a85 Teach SKIP_LOCKED to psql tab completion of VACUUM and ANALYZE
This was missing since 803b130, which has introduced the option for the
user-facing VACUUM and ANALYZE.

Author: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoD2TMdTxRhZ7WSp940V82_OAyPmgHnbi25UbbArLgA92Q@mail.gmail.com
2019-03-06 14:42:52 +09:00
Andrew Dunstan e988878f85 Fix pgbench TAP test failure with funky file names (redux)
This test fails if the containing directory contains a funny character
such as a space or some perl metacharacter. To avoid that, we check for
files names using readdir and a regex, rather than using a glob pattern.

Discussion: https://postgr.es/m/CAM6_UM6dGdU39PKAC24T+HD9ouy0jLN9vH6163K8QEEzr__iZw@mail.gmail.com

Author: Fabien COELHO
Reviewed-by: Raúl Marín Rodríguez
2019-03-05 10:46:35 -05:00
Peter Eisentraut 9b1384dd6a Remove duplicate macro
The original commit appears to have accidentally introduced a
duplicate definition.  Keep only one of them.
2019-03-05 15:02:56 +01:00
Heikki Linnakangas fe280694d0 Scan GiST indexes in physical order during VACUUM.
Scanning an index in physical order is faster than walking it in logical
order, because sequential I/O is faster than random I/O. The idea and code
structure is borrowed from B-tree vacuum code.

Patch by Andrey Borodin, with changes by me. Based on early work by
Konstantin Kuznetsov, although the patch has been rewritten multiple times
since his original version.

Discussion: https://www.postgresql.org/message-id/1B9FAC6F-FA19-4A24-8C1B-F4F574844892%40yandex-team.ru
2019-03-05 15:19:48 +02:00
Peter Geoghegan 35bc0ec7c8 Note case where nbtree VACUUM finishes splits.
The nbtree README claims that VACUUM can never finish interrupted page
splits by design.  That isn't entirely accurate, though.  Note an
exception to the general rule.

Discussion: https://postgr.es/m/CAH2-Wz=_Xvv8byzK_LvY4ci76OgsHCQzoKF7We8yG9waO7j6rA@mail.gmail.com
2019-03-04 17:57:36 -08:00
Andrew Dunstan 1638623f34 Disable dump_connstr test on Msys2
For some reason the dump test with names with high bits set fails on
Msys2 (although not Msys1). Disable the tests for now, so that other
tests can run.
2019-03-04 17:35:54 -05:00
Andrew Dunstan 4eff1e9f0b Allow recovery tests to run on Windows as an admin user
This is the only test that fails when run as an admin user. The reason
is that when Postgres is started via pg_ctl its admin privileges are
lowered. However, this test called 'postgres -D datadir' directly,
resulting in a failure. Replace that by calling pg_ctl and then checking
the result for the expected failure, and the logfile for the expected
error message.
2019-03-04 15:54:02 -05:00
Peter Geoghegan 72c7c4e386 Correct obsolete nbtree page split WAL comment.
Commit 2c03216d83, which revamped the WAL record format, failed to
update a comment referencing the old API.  Update the comment.
2019-03-04 12:32:40 -08:00
Alvaro Herrera b96f6b1948 pg_partition_ancestors
Adds another introspection feature for partitioning, necessary for
further psql patches.

Reviewed-by: Michaël Paquier
Discussion: https://postgr.es/m/20190226222757.GA31622@alvherre.pgsql
2019-03-04 16:14:29 -03:00
Alvaro Herrera d12fbe2f8e Test partition functions with legacy inheritance children, too
It's worth immortalizing this behavior, per discussion.

Discussion: https://postgr.es/m/20190228193203.GA26151@alvherre.pgsql
2019-03-04 15:52:41 -03:00
Peter Eisentraut 278584b526 Remove volatile from latch API
This was no longer useful since the latch functions use memory
barriers already, which are also compiler barriers, and volatile does
not help with cross-process access.

Discussion: https://www.postgresql.org/message-id/flat/20190218202511.qsfpuj5sy4dbezcw%40alap3.anarazel.de#18783c27d73e9e40009c82f6e0df0974
2019-03-04 11:30:41 +01:00
Michael Paquier 754b90f657 Fix error handling of readdir() port implementation on first file lookup
The implementation of readdir() in src/port/ which gets used by MSVC has
been added in 399a36a, and since the beginning it considers all errors
on the first file lookup as ENOENT, setting errno accordingly and
letting the routine caller think that the directory is empty.  While
this is normally enough for the case of the backend, this can confuse
callers of this routine on Windows as all errors would map to the same
behavior.  So, for example, even permission errors would be thought as
having an empty directory, while there could be contents in it.

This commit changes the error handling so as readdir() gets a behavior
similar to native implementations: force errno=0 when seeing
ERROR_FILE_NOT_FOUND as error and consider other errors as plain
failures.

While looking at the patch, I noticed that MinGW does not enforce
errno=0 when looking at the first file, but it gets enforced on the next
file lookups.  A comment related to that was incorrect in the code.

Reported-by: Yuri Kurenkov
Diagnosed-by: Yuri Kurenkov, Grigory Smolkin
Author:	Konstantin Knizhnik
Reviewed-by: Andrew Dunstan, Michael Paquier
Discussion: https://postgr.es/m/2cad7829-8d66-e39c-b937-ac825db5203d@postgrespro.ru
Backpatch-through: 9.4
2019-03-04 09:49:06 +09:00
Andrew Dunstan 5bd9160f27 fix thinko in logrotate test 2019-03-03 19:15:47 -05:00
Andrew Dunstan d611175e53 Don't do pg_ctl logrotate test on Windows
The test crashes and burns quite badly, for some reason, but even if it
didn't it wouldn't work, since Windows doesn't let you rename a file
held by a running process.
2019-03-03 18:19:44 -05:00
Tom Lane 80b9e9c466 Improve performance of index-only scans with many index columns.
StoreIndexTuple was a loop over index_getattr, which is O(N^2)
if the index columns are variable-width, and the performance
impact is already quite visible at ten columns.  The obvious
move is to replace that with a call to index_deform_tuple ...
but that's *also* a loop over index_getattr.  Improve it to
be essentially a clone of heap_deform_tuple.

(There are a few other places that loop over all index columns
with index_getattr, and perhaps should be changed likewise,
but most of them don't seem performance-critical.  Anyway, the
rest would mostly only be interested in the index key columns,
which there aren't likely to be so many of.  Wide index tuples
are a new thing with INCLUDE.)

Konstantin Knizhnik

Discussion: https://postgr.es/m/e06b2d27-04fc-5c0e-bb8c-ecd72aa24959@postgrespro.ru
2019-03-03 16:57:14 -05:00
Andrew Dunstan 78b408a20a Avoid accidental wildcard expansion in msys shell
Commit f092de05 added a test for pg_dumpall --exclude-database including
the wildcard pattern '*dump*' which matches some files in the source
directory. The test library on msys uses the shell which expands this
and thus the program gets incorrect arguments. This doesn't happen if
the pattern doesn't match any files, so here the pattern is set to
'*dump_test*' which is such a pattern.

Per buildfarm animal jacana.
2019-03-03 11:48:12 -05:00
Dean Rasheed ed4653db8c Further fixing for multi-row VALUES lists for updatable views.
Previously, rewriteTargetListIU() generated a list of attribute
numbers from the targetlist, which were passed to rewriteValuesRTE(),
which expected them to contain the same number of entries as there are
columns in the VALUES RTE, and to be in the same order. That was fine
when the target relation was a table, but for an updatable view it
could be broken in at least three different ways ---
rewriteTargetListIU() could insert additional targetlist entries for
view columns with defaults, the view columns could be in a different
order from the columns of the underlying base relation, and targetlist
entries could be merged together when assigning to elements of an
array or composite type. As a result, when recursing to the base
relation, the list of attribute numbers generated from the rewritten
targetlist could no longer be relied upon to match the columns of the
VALUES RTE. We got away with that prior to 41531e42d3 because it used
to always be the case that rewriteValuesRTE() did nothing for the
underlying base relation, since all DEFAULTS had already been replaced
when it was initially invoked for the view, but that was incorrect
because it failed to apply defaults from the base relation.

Fix this by examining the targetlist entries more carefully and
picking out just those that are simple Vars referencing the VALUES
RTE. That's sufficient for the purposes of rewriteValuesRTE(), which
is only responsible for dealing with DEFAULT items in the VALUES
RTE. Any DEFAULT item in the VALUES RTE that doesn't have a matching
simple-Var-assignment in the targetlist is an error which we complain
about, but in theory that ought to be impossible.

Additionally, move this code into rewriteValuesRTE() to give a clearer
separation of concerns between the 2 functions. There is no need for
rewriteTargetListIU() to know about the details of the VALUES RTE.

While at it, fix the comment for rewriteValuesRTE() which claimed that
it doesn't support array element and field assignments --- that hasn't
been true since a3c7a993d5 (9.6 and later).

Back-patch to all supported versions, with minor differences for the
pre-9.6 branches, which don't support array element and field
assignments to the same column in multi-row VALUES lists.

Reviewed by Amit Langote.

Discussion: https://postgr.es/m/15623-5d67a46788ec8b7f@postgresql.org
2019-03-03 10:51:13 +00:00
Michael Paquier 3422955735 Consider only relations part of partition trees in partition functions
This changes the partition functions so as tables and indexes which are
not part of partition trees are handled the same way as what is done for
undefined objects and unsupported relkinds: pg_partition_tree() returns
no rows and pg_partition_root() returns a NULL result.  Hence,
partitioned tables, partitioned indexes and relations whose flag
pg_class.relispartition is set are considered as valid objects to
process.

Previously, tables and indexes not included in a partition tree were
processed the same way as a partition or a partitioned table, which
caused the functions to return inconsistent results for inherited
tables, especially when inheriting from multiple tables.

Reported-by: Álvaro Herrera
Author: Amit Langote, Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/20190228193203.GA26151@alvherre.pgsql
2019-03-02 18:18:59 +09:00
Andres Freund 70b9bda65f Use a virtual rather than a heap slot in two places where that suffices.
Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-03-01 17:26:43 -08:00
Tom Lane 3396138a6d Check we don't misoptimize a NOT IN where the subquery returns no rows.
Future-proofing against a common mistake in attempts to optimize NOT IN.
We don't have such an optimization right now, but attempts to do so
are in the works, and some of 'em are buggy.  Add a regression test case
covering the point.

David Rowley

Discussion: https://postgr.es/m/CAKJS1f90E9agVZryVyUpbHQbjTt5ExqS2Fsodmt5_A7E_cEyVA@mail.gmail.com
2019-03-01 17:57:20 -05:00
Tom Lane 65ce07e020 Teach optimizer's predtest.c more things about ScalarArrayOpExpr.
In particular, make it possible to prove/refute "x IS NULL" and
"x IS NOT NULL" predicates from a clause involving a ScalarArrayOpExpr
even when we are unable or unwilling to deconstruct the expression
into an AND/OR tree.  This avoids a former unexpected degradation of
plan quality when the size of an ARRAY[] expression or array constant
exceeded the arbitrary MAX_SAOP_ARRAY_SIZE limit.  For IS-NULL proofs,
we don't really care about the values of the individual array elements;
at most, we care whether there are any, and for some common cases we
needn't even know that.

The main user-visible effect of this is to let the optimizer recognize
applicability of partial indexes with "x IS NOT NULL" predicates to
queries with "x IN (array)" clauses in some cases where it previously
failed to recognize that.  The structure of predtest.c is such that a
bunch of related proofs will now also succeed, but they're probably
much less useful in the wild.

James Coleman, reviewed by David Rowley

Discussion: https://postgr.es/m/CAAaqYe8yKSvzbyu8w-dThRs9aTFMwrFxn_BkTYeXgjqe3CbNjg@mail.gmail.com
2019-03-01 17:14:17 -05:00
Peter Eisentraut aad21d4c3c Fix whitespace 2019-03-01 20:56:53 +01:00
Andrew Dunstan 97b6f2eb75 Remove tests for pg_dumpall --exclude-database missing argument
It turns out that different getopt implementations spell the error for
missing arguments different ways. This test is of fairly marginal
value, so instead of trying to keep up  with the different error
messages just remove the test.
2019-03-01 14:14:50 -05:00
Andres Freund ad0bda5d24 Store tuples for EvalPlanQual in slots, rather than as HeapTuples.
For the upcoming pluggable table access methods it's quite
inconvenient to store tuples as HeapTuples, as that'd require
converting tuples from a their native format into HeapTuples. Instead
use slots to manage epq tuples.

To fit into that scheme, change the foreign data wrapper callback
RefetchForeignRow, to store the tuple in a slot. Insist on using the
caller provided slot, so it conveniently can be stored in the
corresponding EPQ slot.  As there is no in core user of
RefetchForeignRow, that change was done blindly, but we plan to test
that soon.

To avoid duplicating that work for row locks, move row locks to just
directly use the EPQ slots - it previously temporarily stored tuples
in LockRowsState.lr_curtuples, but that doesn't seem beneficial, given
we'd possibly end up with a significant number of additional slots.

The behaviour of es_epqTupleSet[rti -1] is now checked by
es_epqTupleSlot[rti -1] != NULL, as that is distinguishable from a
slot containing an empty tuple.

Author: Andres Freund, Haribabu Kommi, Ashutosh Bapat
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-03-01 10:37:57 -08:00
Andrew Dunstan 6cbdbd9e8d Add extra descriptive headings in pg_dumpall
Headings are added for the User Configurations and Databases sections,
and for each user configuration and database in the output.

Author: Fabien Coelho
Discussion: https://postgr.es/m/alpine.DEB.2.21.1812272222130.32444@lancre
2019-03-01 11:38:54 -05:00
Andrew Dunstan f092de0503 Add --exclude-database option to pg_dumpall
This option functions similarly to pg_dump's --exclude-table option, but
for database names. The option can be given once, and the argument can
be a pattern including wildcard characters.

Author: Andrew Dunstan.
Reviewd-by: Fabien Coelho and Michael Paquier
Discussion: https://postgr.es/m/43a54a47-4aa7-c70e-9ca6-648f436dd6e6@2ndQuadrant.com
2019-03-01 10:47:44 -05:00
Amit Kapila 9c32e4c350 Clear the local map when not used.
After commit b0eaa4c51b, we use a local map of pages to find the required
space for small relations.  We do clear this map when we have found a block
with enough free space, when we extend the relation, or on transaction
abort so that it can be used next time.  However, we miss to clear it when
we didn't find any pages to try from the map which leads to an assertion
failure when we later tried to use it after relation extension.

In the passing, I have improved some comments in this area.

Reported-by: Tom Lane based on buildfarm results
Author: Amit Kapila
Reviewed-by: John Naylor
Tested-by: Kuntal Ghosh
Discussion: https://postgr.es/m/32368.1551114120@sss.pgh.pa.us
2019-03-01 07:38:47 +05:30
Michael Paquier 0f3cdf873e Make pg_partition_tree return no rows on unsupported and undefined objects
The function was tweaked so as it returned one row full of NULLs when
working on an unsupported relkind or an undefined object as of cc53123,
and after discussion with Amit and Álvaro it looks more natural to make
it return no rows.

Author: Michael Paquier
Reviewed-by: Álvaro Herrera, Amit Langote
Discussion: https://postgr.es/m/20190227184808.GA17357@alvherre.pgsql
2019-03-01 09:07:07 +09:00
Andres Freund 253655116b Don't superfluously materialize slot after DELETE from an FDW.
Previously that was needed to safely store the table oid, but after
b8d71745ea that's not necessary anymore.

Author: Andres Freund
2019-02-28 14:54:12 -08:00
Andres Freund 8f0577386e Don't force materializing when copying a buffer tuple table slot.
After 5408e233f0 it's not necessary to force materializing the
target slot when copying from one buffer slot to another. Previously
that was required because the HeapTupleData portion of the source slot
wasn't guaranteed to stay valid long enough, but now we can simply
copy that part into the destination slot's tupdata.

Author: Andres Freund
2019-02-28 14:54:12 -08:00
Joe Conway 4598a99cf2 Make get_controlfile not leak file descriptors
When backend functions were added to expose controldata via SQL,
reading of pg_control was consolidated under src/common so that
both frontend and backend could share the same code. That move
from frontend-only to shared frontend-backend failed to recognize
the risk (and coding standards violation) of using a bare open().
In particular, it risked leaking file descriptors if transient
errors occurred while reading the file. Fix that by using
OpenTransientFile() instead in the backend case, which is
purpose-built for this type of usage.

Since there have been no complaints from the field, and an intermittent
failure low risk, no backpatch. Hard failure would of course be bad, but
in that case these functions are probably the least of your worries.

Author: Joe Conway
Reviewed-By: Michael Paquier
Reported by: Michael Paquier
Discussion: https://postgr.es/m/20190227074728.GA15710@paquier.xyz
2019-02-28 15:57:40 -05:00
Andres Freund f414abd62d Allow buffer tuple table slots to materialize after ExecStoreVirtualTuple().
While not common, it can be useful to store a virtual tuple into a
buffer tuple table slot, and then materialize that slot. So far we've
asserted out, which surprisingly wasn't a problem for anything in
core. But that seems fragile, and it also breaks redis_fdw after
ff11e7f4b9.

Thus, allow materializing a virtual tuple stored in a buffer tuple
table slot.

Author: Andres Freund
Discussion:
    https://postgr.es/m/20190227181621.xholonj7ff7ohxsg@alap3.anarazel.de
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-02-28 12:28:03 -08:00
Alvaro Herrera 19455c9f56 pg_dump: Fix ArchiveEntry handling of some empty values
Commit f831d4acc changed what pg_dump emits for some empty fields: they
were output as empty strings before, NULL pointer afterwards.  That
makes old pg_restore unable to work (crash) with such files, which is
unacceptable.  Return to the original representation by explicitly
setting those struct members to "" where needed; remove some no longer
needed checks for NULL input.

We can declutter the code a little by returning to NULLs when we next
update the archive version, so add a note to remind us later.

Discussion: https://postgr.es/m/20190225074539.az6j3u464cvsoxh6@depesz.com
Reported-by: hubert depesz lubaczewski
Author: Dmitry Dolgov
2019-02-28 17:19:44 -03:00
Peter Eisentraut 3f61999cc9 Merge near-duplicate code in RI triggers
Merge ri_setnull and ri_setdefault into one function ri_set.  These
functions were to a large part identical.

This is a continuation in spirit of
4797f9b519.

Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0ccdd3e1-10b0-dd05-d8a7-183507c11eb1%402ndquadrant.com
2019-02-28 20:35:55 +01:00
Tom Lane c94fb8e8ac Standardize some more loops that chase down parallel lists.
We have forboth() and forthree() macros that simplify iterating
through several parallel lists, but not everyplace that could
reasonably use those was doing so.  Also invent forfour() and
forfive() macros to do the same for four or five parallel lists,
and use those where applicable.

The immediate motivation for doing this is to reduce the number
of ad-hoc lnext() calls, to reduce the footprint of a WIP patch.
However, it seems like good cleanup and error-proofing anyway;
the places that were combining forthree() with a manually iterated
loop seem particularly illegible and bug-prone.

There was some speculation about restructuring related parsetree
representations to reduce the need for parallel list chasing of
this sort.  Perhaps that's a win, or perhaps not, but in any case
it would be considerably more invasive than this patch; and it's
not particularly related to my immediate goal of improving the
List infrastructure.  So I'll leave that question for another day.

Patch by me; thanks to David Rowley for review.

Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-02-28 14:25:01 -05:00
Peter Eisentraut 0a271df705 Clean up some variable names in ri_triggers.c
There was a mix of old_slot/oldslot, new_slot/newslot.  Since we've
changed everything from row to slot, we might as well take this
opportunity to clean this up.

Also update some more comments for the slot change.
2019-02-28 15:29:00 +01:00
Peter Eisentraut 554ebf6878 Compact for loops
Declare loop variable in for loop, for readability and to save space.

Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0ccdd3e1-10b0-dd05-d8a7-183507c11eb1%402ndquadrant.com
2019-02-28 11:06:33 +01:00
Peter Eisentraut 05d604718f Reduce comments
Reduce the vertical space used by comments in ri_triggers.c, making
the file longer and more tedious to read than it needs to be.  Update
some comments to use a more common style.

Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0ccdd3e1-10b0-dd05-d8a7-183507c11eb1%402ndquadrant.com
2019-02-28 11:06:33 +01:00
Peter Eisentraut 0afdecc1e5 Remove unnecessary unused MATCH PARTIAL code
ri_triggers.c spends a lot of space catering to a not-yet-implemented
MATCH PARTIAL option.  An actual implementation would probably not use
the existing code structure anyway, so let's just simplify this for
now.

First, have ri_FetchConstraintInfo() check that riinfo->confmatchtype
is valid.  Then we don't have to repeat that everywhere.

In the various referential action functions, we don't need to pay
attention to the match type at all right now, so remove all that code.
A future MATCH PARTIAL implementation would probably have some
conditions added to the present code, but it won't need an entirely
separate switch branch in each case.

In RI_FKey_fk_upd_check_required(), reorganize the code to make it
much simpler.

Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0ccdd3e1-10b0-dd05-d8a7-183507c11eb1%402ndquadrant.com
2019-02-28 11:06:33 +01:00
Peter Eisentraut 22c0d52b8d Update comment
for ff11e7f4b9
2019-02-28 10:43:00 +01:00
Michael Paquier b3a156858a Improve documentation of data_sync_retry
Reflecting an updated parameter value requires a server restart, which
was not mentioned in the documentation and in postgresql.conf.sample.

Reported-by: Thomas Poty
Discussion: https://postgr.es/m/15659-0cd812f13027a2d8@postgresql.org
2019-02-28 11:02:11 +09:00
Michael Paquier 87c346a35e Fix SCRAM authentication via SSL when mixing versions of OpenSSL
When using a libpq client linked with OpenSSL 1.0.1 or older to connect
to a backend linked with OpenSSL 1.0.2 or newer, the server would send
SCRAM-SHA-256-PLUS and SCRAM-SHA-256 as valid mechanisms for the SASL
exchange, and the client would choose SCRAM-SHA-256-PLUS even if it does
not support channel binding, leading to a confusing error.  In this
case, what the client ought to do is switch to SCRAM-SHA-256 so as the
authentication can move on and succeed.

So for a SCRAM authentication over SSL, here are all the cases present
and how we deal with them using libpq:
1) Server supports channel binding, it sends SCRAM-SHA-256-PLUS and
SCRAM-SHA-256 as allowed mechanisms.
1-1) Client supports channel binding, chooses SCRAM-SHA-256-PLUS.
1-2) Client does not support channel binding, chooses SCRAM-SHA-256.
2) Server does not support channel binding, sends SCRAM-SHA-256 as
allowed mechanism.
2-1) Client supports channel binding, still it has no choice but to
choose SCRAM-SHA-256.
2-2) Client does not support channel binding, it chooses SCRAM-SHA-256.
In all these scenarios the connection should succeed, and the one which
was handled incorrectly prior this commit is 1-2), causing the
connection attempt to fail because client chose SCRAM-SHA-256-PLUS over
SCRAM-SHA-256.

Reported-by: Hugh Ranalli
Diagnosed-by: Peter Eisentraut
Author: Michael Paquier
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/CAAhbUMO89SqUk-5mMY+OapgWf-twF2NA5sCucbHEzMfGbvcepA@mail.gmail.com
Backpatch-through: 11
2019-02-28 09:40:28 +09:00
Andres Freund 5963b29e03 Initialize variable to silence compiler warning.
After ff11e7f4b9 Tom's compiler warns about accessing a potentially
uninitialized rInfo. That's not actually possible, but it's
understandable the compiler would get this wrong. NULL initialize too.

Reported-By: Tom Lane
Discussion: https://postgr.es/m/11199.1551285318@sss.pgh.pa.us
2019-02-27 09:14:34 -08:00
Peter Eisentraut 538ecc17c4 Set cluster_name for PostgresNode.pm instances
This can help identifying test instances more easily at run time, and
it also provides some minimal test coverage for the cluster_name
feature.

Reviewed-by: Euler Taveira <euler@timbira.com.br>
Discussion: https://www.postgresql.org/message-id/flat/1257eaee-4874-e791-e83a-46720c72cac7@2ndquadrant.com
2019-02-27 10:59:25 +01:00
Peter Eisentraut 6ae578a91e Set fallback_application_name for a walreceiver to cluster_name
By default, the fallback_application_name for a physical walreceiver
is "walreceiver".  This means that multiple standbys cannot be
distinguished easily on a primary, for example in pg_stat_activity or
synchronous_standby_names.

If cluster_name is set, use that for fallback_application_name in the
walreceiver.  (If it's not set, it remains "walreceiver".)  If someone
set cluster_name to identify their instance, we might as well use that
by default to identify the node remotely as well.  It's still possible
to specify another application_name in primary_conninfo explicitly.

Reviewed-by: Euler Taveira <euler@timbira.com.br>
Discussion: https://www.postgresql.org/message-id/flat/1257eaee-4874-e791-e83a-46720c72cac7@2ndquadrant.com
2019-02-27 10:59:25 +01:00
Michael Paquier 414a9d3cf3 Fix memory leak when inserting tuple at relation creation for CTAS
The leak has been introduced by 763f2ed which has addressed the problem
for transient tables, and forgot CREATE TABLE AS which shares a similar
logic when receiving a new tuple to store into the newly-created
relation.

Author: Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1xZXtz3mziPEPD2Fubbas4G2RWkZm5HHABtfKVcbu1=Sg@mail.gmail.com
2019-02-27 14:14:06 +09:00
Andres Freund ff11e7f4b9 Use slots in trigger infrastructure, except for the actual invocation.
In preparation for abstracting table storage, convert trigger.c to
track tuples in slots. Which also happens to make code calling
triggers simpler.

As the calling interface for triggers themselves is not changed in
this patch, HeapTuples still are extracted from the slot at that
time. But that's handled solely inside trigger.c, not visible to
callers. It's quite likely that we'll want to revise the external
trigger interface, but that's a separate large project.

As part of this work the slots used for old/new/return tuples are
moved from EState into ResultRelInfo, as different updated tables
might need different slots. The slots are now also now created
on-demand, which is good both from an efficiency POV, but also makes
the modifying code simpler.

Author: Andres Freund, Amit Khandekar and Ashutosh Bapat
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-02-26 20:31:38 -08:00
Andres Freund b8d71745ea Store table oid and tuple's tid in tuple slots directly.
After the introduction of tuple table slots all table AMs need to
support returning the table oid of the tuple stored in a slot created
by said AM. It does not make sense to re-implement that in every AM,
therefore move handling of table OIDs into the TupleTableSlot
structure itself.  It's possible that we, at a later date, might want
to get rid of HeapTupleData.t_tableOid entirely, but doing so before
the abstractions for table AMs are integrated turns out to be too
hard, so delay that for now.

Similarly, every AM needs to support the concept of a tuple
identifier (tid / item pointer) for its tuples. It's quite possible
that we'll generalize the exact form of a tid at a future point (to
allow for things like index organized tables), but for now many parts
of the code know about tids, so there's not much point in abstracting
tids away. Therefore also move into slot (rather than providing API to
set/get the tid associated with the tuple in a slot).

Once table AM includes insert/updating/deleting tuples, the
responsibility to set the correct tid after such an action will move
into that. After that change, code doing such modifications, should
not have to deal with HeapTuples directly anymore.

Author: Andres Freund, Haribabu Kommi and Ashutosh Bapat
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-02-26 20:31:16 -08:00
Andres Freund 5408e233f0 Allow to use HeapTupleData embedded in [Buffer]HeapTupleTableSlot.
That avoids having to care about the lifetime of the
HeapTupleHeaderData passed to ExecStore[Buffer]HeapTuple(). That
doesn't make a huge difference for a plain HeapTupleTableSlot, but for
BufferHeapTupleTableSlot it can be a significant advantage, avoiding
the need to materialize slots where it's inconvenient to provide a
HeapTupleData with appropriate lifetime to point to the on-disk tuple.

It's quite possible that we'll want to add support functions for
constructing HeapTuples using that embedded HeapTupleData, but for now
callers do so themselves.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-02-26 18:15:59 -08:00
Andres Freund 8aa02b52db Add ExecStorePinnedBufferHeapTuple.
This allows to avoid an unnecessary pin/unpin cycle when storing a
tuple in an already pinned buffer into a slot, when the pin isn't
further needed at the call site.

Only a single caller for now (to ensure coverage), but upcoming
patches will increase use of the new function.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-02-26 17:59:01 -08:00
Robert Haas f4b6341d5f Change lock acquisition order in expand_inherited_rtentry.
Previously, this function acquired locks in the order using
find_all_inheritors(), which locks the children of each table that it
processes in ascending OID order, and which processes the inheritance
hierarchy as a whole in a breadth-first fashion.  Now, it processes
the inheritance hierarchy in a depth-first fashion, and at each level
it proceeds in the order in which tables appear in the PartitionDesc.
If table inheritance rather than table partitioning is used, the old
order is preserved.

This change moves the locking of any given partition much closer to
the code that actually expands that partition.  This seems essential
if we ever want to allow concurrent DDL to add or remove partitions,
because if the set of partitions can change, we must use the same data
to decide which partitions to lock as we do to decide which partitions
to expand; otherwise, we might expand a partition that we haven't
locked.  It should hopefully also facilitate efforts to postpone
inheritance expansion or locking for performance reasons, because
there's really no way to postpone locking some partitions if
we're blindly locking them all using find_all_inheritors().

The only downside of this change which is known to me is that it
further deviates from the principle that we should always lock the
inheritance hierarchy in find_all_inheritors() order to avoid deadlock
risk.  However, we've already crossed that bridge in commit
9eefba181f and there are futher patches
pending that make similar changes, so this isn't really giving up
anything that we haven't surrendered already -- and it seems entirely
worth it, given the performance benefits some of those changes seem
likely to bring.

Patch by me; thanks to David Rowley for discussion of these issues.

Discussion: http://postgr.es/m/CAKJS1f_eEYVEq5tM8sm1k-HOwG0AyCPwX54XG9x4w0zy_N4Q_Q@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoZUwPf_uanjF==gTGBMJrn8uCq52XYvAEorNkLrUdoawg@mail.gmail.com
2019-02-26 12:22:57 -05:00
Michael Meskes 42ccbe4351 Free memory in ecpg bytea regression test.
While not really a problem it's easier to run tools like valgrind against it
when fixed.
2019-02-26 11:59:35 +01:00
Michael Meskes 0cc0507940 Hopefully fixing memory handling issues in ecpglib that Coverity found. 2019-02-26 10:56:54 +01:00
Michael Paquier 6e52209eb1 Simplify some code in pg_rewind when syncing target directory
9a4059d simplified the flush of target data folder when finishing
processing, and could have done a bit more.

Discussion: https://postgr.es/m/20190131064759.GA13429@paquier.xyz
2019-02-26 16:08:24 +09:00
Peter Geoghegan 2ab23445bc Remove unneeded argument from _bt_getstackbuf().
_bt_getstackbuf() is called at exactly two points following commit
efada2b8e9 (one call site is concerned with page splits, while the
other is concerned with page deletion).  The parent buffer returned by
_bt_getstackbuf() is write-locked in both cases.  Remove the 'access'
argument and make _bt_getstackbuf() assume that callers require a
write-lock.
2019-02-25 17:47:43 -08:00
Peter Geoghegan 067786cea0 Correct obsolete nbtree page deletion comment.
Commit efada2b8e9, which made the nbtree page deletion algorithm more
robust, removed _bt_getstackbuf() calls from _bt_pagedel().  It failed
to update a comment that referenced the earlier approach.  Update the
comment to explain that the _bt_getstackbuf() page deletion call site
mirrors the only other remaining _bt_getstackbuf() call site, which is
reached during page splits.
2019-02-25 16:54:18 -08:00
Peter Eisentraut b6926dee0e psql: Remove obsolete code
The check in create_help.pl for a null end tag (</>) has been obsolete
since the conversion from SGML to XML, since XML does not allow that
anymore.
2019-02-25 12:00:29 +01:00
Peter Eisentraut bc09d5e4cc Remove unnecessary use of PROCEDURAL
Remove some unnecessary, legacy-looking use of the PROCEDURAL keyword
before LANGUAGE.  We mostly don't use this anymore, so some of these
look a bit old.

There is still some use in pg_dump, which is harder to remove because
it's baked into the archive format, so I'm not touching that.

Discussion: https://www.postgresql.org/message-id/2330919b-62d9-29ac-8de3-58c024fdcb96@2ndquadrant.com
2019-02-25 08:38:59 +01:00
Michael Paquier effe7d9552 Make release of 2PC identifier and locks consistent in COMMIT PREPARED
When preparing a transaction in two-phase commit, a dummy PGPROC entry
holding the GID used for the transaction is registered, which gets
released once COMMIT PREPARED is run.  Prior releasing its shared memory
state, all the locks taken in the prepared transaction are released
using a dedicated set of callbacks (pgstat and multixact having similar
callbacks), which may cause the locks to be released before the GID is
set free.

Hence, there is a small window where lock conflicts could happen, for
example:
- Transaction A releases its locks, still holding its GID in shared
memory.
- Transaction B held a lock which conflicted with locks of transaction
A.
- Transaction B continues its processing, reusing the same GID as
transaction A.
- Transaction B fails because of a conflicting GID, already in use by
transaction A.

This commit changes the shared memory state release so as post-commit
callbacks and predicate lock cleanup happen consistently with the shared
memory state cleanup for the dummy PGPROC entry.  The race window is
small and 2PC had this issue from the start, so no backpatch is done.
On top if that fixes discussed involved ABI breakages, which are not
welcome in stable branches.

Reported-by: Oleksii Kliukin, Ildar Musin
Diagnosed-by: Oleksii Kliukin, Ildar Musin
Author: Michael Paquier
Reviewed-by: Masahiko Sawada, Oleksii Kliukin
Discussion: https://postgr.es/m/BF9B38A4-2BFF-46E8-BA87-A2D00A8047A6@hintbits.com
2019-02-25 14:19:34 +09:00
Thomas Munro 29ddb548f6 Fix inconsistent out-of-memory error reporting in dsa.c.
Commit 16be2fd1 introduced the flag DSA_ALLOC_NO_OOM to control whether
the DSA allocator would raise an error or return InvalidDsaPointer on
failure to allocate.  One edge case was not handled correctly: if we
fail to allocate an internal "span" object for a large allocation, we
would always return InvalidDsaPointer regardless of the flag; a caller
not expecting that could then dereference a null pointer.

This is a plausible explanation for a one-off report of a segfault.

Remove a redundant pair of braces so that all three stanzas that handle
DSA_ALLOC_NO_OOM match in style, for visual consistency.

While fixing inconsistencies, if FreePageManagerGet() can't supply the
pages that our book-keeping says it should be able to supply, then we
should always report a FATAL error.  Previously we treated that as a
regular allocation failure in one code path, but as a FATAL condition
in another.

Back-patch to 10, where dsa.c landed.

Author: Thomas Munro
Reported-by: Jakub Glapa
Discussion: https://postgr.es/m/CAEepm=2oPqXxyWQ-1o60tpOLrwkw=VpgNXqqF1VN2EyO9zKGQw@mail.gmail.com
2019-02-25 11:11:40 +13:00
Tom Lane 9e138a401d Fix ecpg bugs caused by missing semicolons in the backend grammar.
The Bison documentation clearly states that a semicolon is required
after every grammar rule, and our scripts that generate ecpg's
grammar from the backend's implicitly assumed this is true.  But it
turns out that only ancient versions of Bison actually enforce that.
There have been a couple of rules without trailing semicolons in
gram.y for some time, and as a consequence, ecpg's grammar was faulty
and produced wrong output for the affected statements.

To fix, add the missing semis, and add some cross-checks to ecpg's
scripts so that they'll bleat if we mess this up again.

The cases that were broken were:
* "SET variable = DEFAULT" (but not "SET variable TO DEFAULT"),
  as well as allied syntaxes such as ALTER SYSTEM SET ... DEFAULT.
  These produced syntactically invalid output that the server
  would reject.
* Multiple type names in DROP TYPE/DOMAIN commands.  Only the
  first type name would be listed in the emitted command.

Per report from Daisuke Higuchi.  Back-patch to all supported versions.

Discussion: https://postgr.es/m/1803D792815FC24D871C00D17AE95905DB51CE@g01jpexmbkw24
2019-02-24 12:51:50 -05:00
Thomas Munro f16735d80d Tolerate EINVAL when calling fsync() on a directory.
Previously, we tolerated EBADF as a way for the operating system to
indicate that it doesn't support fsync() on a directory.  Tolerate
EINVAL too, for older versions of Linux CIFS.

Bug #15636.  Back-patch all the way.

Reported-by: John Klann
Discussion: https://postgr.es/m/15636-d380890dafd78fc6@postgresql.org
2019-02-24 23:50:20 +13:00
Thomas Munro 483520eca4 Tolerate ENOSYS failure from sync_file_range().
One unintended consequence of commit 9ccdd7f6 was that Windows WSL
users started getting a panic whenever we tried to initiate data
flushing with sync_file_range(), because WSL does not implement that
system call.  Previously, they got a stream of periodic warnings,
which was also undesirable but at least ignorable.

Prevent the panic by handling ENOSYS specially and skipping the panic
promotion with data_sync_elevel().  Also suppress future attempts
after the first such failure so that the pre-existing problem of
noisy warnings is improved.

Back-patch to 9.6 (older branches were not affected in this way by
9ccdd7f6).

Author: Thomas Munro and James Sewell
Tested-by: James Sewell
Reported-by: Bruce Klein
Discussion: https://postgr.es/m/CA+mCpegfOUph2U4ZADtQT16dfbkjjYNJL1bSTWErsazaFjQW9A@mail.gmail.com
2019-02-24 22:37:20 +13:00
Peter Eisentraut f275225539 Revert "pg_regress: Don't use absolute paths for the diff"
This reverts commit 1995552deb.

Several developers didn't like the new behavior.
2019-02-23 09:37:25 +01:00
Michael Paquier 4c23216002 Fix incorrect function reference in comment of twophase.c
The header block of TwoPhaseGetDummyBackendId mentioned incorrectly
TwoPhaseGetDummyProc.

Reported-by: Oleksii Kliukin
Discussion: https://postgr.es/m/D8336E40-BBE1-4954-98BB-7830D3F5CB36@hintbits.com
2019-02-23 08:40:01 +09:00
Michael Paquier b108676708 Add TAP tests for 2PC post-commit callbacks of multixacts at recovery
The current set of TAP tests for two-phase transactions include some
coverage for post-commit callbacks of multixact, but it lacked tests for
testing the recovery of those callbacks.  This commit adds some tests
with soft and hard restarts in this case, using transactions which
include DDLs.

Author: Michael Paquier
Reviewed-by: Oleksii Kliukin
Discussion: https://postgr.es/m/20190221055431.GO15532@paquier.xyz
2019-02-23 08:19:31 +09:00
Tom Lane ab5fcf2b04 Fix plan created for inherited UPDATE/DELETE with all tables excluded.
In the case where inheritance_planner() finds that every table has
been excluded by constraints, it thought it could get away with
making a plan consisting of just a dummy Result node.  While certainly
there's no updating or deleting to be done, this had two user-visible
problems: the plan did not report the correct set of output columns
when a RETURNING clause was present, and if there were any
statement-level triggers that should be fired, it didn't fire them.

Hence, rather than only generating the dummy Result, we need to
stick a valid ModifyTable node on top, which requires a tad more
effort here.

It's been broken this way for as long as inheritance_planner() has
known about deleting excluded subplans at all (cf commit 635d42e9c),
so back-patch to all supported branches.

Amit Langote and Tom Lane, per a report from Petr Fedorov.

Discussion: https://postgr.es/m/5da6f0f0-1364-1876-6978-907678f89a3e@phystech.edu
2019-02-22 12:23:19 -05:00
Alvaro Herrera 98098faaff Report correct name in autovacuum "work items" activity
We were reporting the database name instead of the relation name to
pg_stat_activity.  Repair.

Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20190220185552.GR28750@telsasoft.com
2019-02-22 13:00:16 -03:00
Peter Eisentraut 1373ba55c9 Add const qualifier
New code introduced in 050710b369.  The
lack of const is not currently a compiler warning, but it's nice to
have for consistency with surrounding code.
2019-02-22 09:01:19 +01:00
Michael Paquier 554ca6954e Remove duplicate variable declaration in fe-connect.c
The same variables are declared twice when checking if a connection is
writable, which is useless.

Author: Haribabu Kommi
Discussion: https://postgr.es/m/CAJrrPGf=rcALB54w_Tg1_hx3y+cgSWaERY-uYSQzGc3Zt5XN4g@mail.gmail.com
2019-02-22 13:16:47 +09:00
Tom Lane 24d08f3c0a Fix mark-and-restore-skipping test case to not be a self-join.
There isn't any good reason for this test to be a self-join rather
than a join between separate tables, except that it saved a couple
of SQL commands for setup.  A proposed patch to optimize away
self-joins breaks the test, so adjust it to avoid that happening.

Discussion: https://postgr.es/m/64486b0b-0404-e39e-322d-0801154901f3@postgrespro.ru
2019-02-21 18:55:29 -05:00
Tom Lane 0c7d537930 Move estimate_hashagg_tablesize to selfuncs.c, and widen result to double.
It seems to make more sense for this to be in selfuncs.c, since it's
largely a statistical-estimation thing, and it's related to other
functions like estimate_hash_bucket_stats that are there.

While at it, change the result type from Size to double.  Perhaps at one
point it was impossible for the result to overflow an integer, but
I've got no confidence in that proposition anymore.  Nothing's actually
done with the result except to compare it to a work_mem-based limit,
so as long as we don't get an overflow on the way to that comparison,
things should be fine even with very large dNumGroups.

Code movement proposed by Antonin Houska, type change by me

Discussion: https://postgr.es/m/25767.1549359615@localhost
2019-02-21 14:59:12 -05:00
Peter Eisentraut f9692a769b Hide other user's pg_stat_ssl rows
Change pg_stat_ssl so that an unprivileged user can only see their own
rows; other rows will be all null.  This makes the behavior consistent
with pg_stat_activity, where information about where the connection
came from is also restricted.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/63117976-d02c-c8e2-3aef-caa31a5ab8d3%402ndquadrant.com
2019-02-21 19:51:52 +01:00
Peter Eisentraut 1995552deb pg_regress: Don't use absolute paths for the diff
Don't expand inputfile and outputfile to absolute paths globally, just
where needed.  In particular, pass them as is to the file name
arguments of the diff command, so that we don't see the full absolute
path in the diff header, which makes the diff unnecessarily verbose
and harder to read.

Discussion: https://www.postgresql.org/message-id/0cc82900-c457-1cee-3ab2-7b0f5d215061@2ndquadrant.com
2019-02-21 18:34:19 +01:00
Robert Haas 1bb5e78218 Move code for managing PartitionDescs into a new file, partdesc.c
This is similar in spirit to the existing partbounds.c file in the
same directory, except that there's a lot less code in the new file
created by this commit.  Pending work in this area proposes to add a
bunch more code related to PartitionDescs, though, and this will give
us a good place to put it.

Discussion: http://postgr.es/m/CA+TgmoZUwPf_uanjF==gTGBMJrn8uCq52XYvAEorNkLrUdoawg@mail.gmail.com
2019-02-21 11:45:02 -05:00
Robert Haas 9eefba181f Delay lock acquisition for partitions until we route a tuple to them.
Instead of locking all partitions to which we might route a tuple at
executor startup, just lock them as we use them.  In some cases such a
partition might get locked at executor startup anyway because it
appears in the query's range table for some other reason, but in other
cases this is a bit savings.

This changes the order in which partitions are locked in some cases,
which might conceivably create deadlock hazards that don't exist
today, but per discussion, it seems like such cases should be rare
enough that we can neglect them in favor of improving performance.

David Rowley, reviewed and tested by Tomas Vondra, Sho Kato, John
Naylor, Tom Lane, and me.

Discussion: http://postgr.es/m/CAKJS1f-=FnMqmQP6qitkD+xEddxw22ySLP-0xFk3JAqUX2yfMw@mail.gmail.com
2019-02-21 11:24:40 -05:00
Tom Lane fa86238f1e Speed up match_eclasses_to_foreign_key_col() when there are many ECs.
Check ec_relids before bothering to iterate through the EC members.
On a perhaps extreme, but still real-world, query in which
match_eclasses_to_foreign_key_col() accounts for the bulk of the
planner's runtime, this saves nearly 40% of the runtime.  It's a bit
of a stopgap fix, but it's simple enough to be back-patched to 9.6
where this code came in; so let's do that.

David Rowley

Discussion: https://postgr.es/m/6970.1545327857@sss.pgh.pa.us
2019-02-20 20:53:17 -05:00
Andrew Gierth d26a810ebf Use an unsigned char for bool if we don't use the native bool.
On (rare) platforms where sizeof(bool) > 1, we need to use our own
bool, but imported c99 code (such as Ryu) may want to use bool values
as array subscripts, which elicits warnings if bool is defined as
char. Using unsigned char instead should work just as well for our
purposes, and avoid such warnings.

Per buildfarm members prariedog and locust.
2019-02-20 21:31:02 +00:00
Tom Lane e04a3905e4 Improve planner's understanding of strictness of type coercions.
PG type coercions are generally strict, ie a NULL input must produce
a NULL output (or, in domain cases, possibly an error).  The planner's
understanding of that was a bit incomplete though, so improve it:

* Teach contain_nonstrict_functions() that CoerceViaIO can always be
considered strict.  Previously it believed that only if the underlying
I/O functions were marked strict, which is often but not always true.

* Teach clause_is_strict_for() that CoerceViaIO, ArrayCoerceExpr,
ConvertRowtypeExpr, CoerceToDomain can all be considered strict.
Previously it knew nothing about any of them.

The main user-visible impact of this is that IS NOT NULL predicates
can be proven to hold from expressions involving casts in more cases
than before, allowing partial indexes with such predicates to be used
without extra pushups.  This reduces the surprise factor for users,
who may well be used to ordinary (function-call-based) casts being
known to be strict.

Per a gripe from Samuel Williams.  This doesn't rise to the level of
a bug, IMO, so no back-patch.

Discussion: https://postgr.es/m/27571.1550617881@sss.pgh.pa.us
2019-02-20 14:39:11 -05:00
Tom Lane 1571bc0f06 Fix incorrect strictness test for ArrayCoerceExpr expressions.
The recursion in contain_nonstrict_functions_walker() was done wrong,
causing the strictness check to be bypassed for a parse node that
is the immediate input of an ArrayCoerceExpr node.  This could allow,
for example, incorrect decisions about whether a strict SQL function
can be inlined.

I didn't add a regression test, because (a) the bug is so narrow
and (b) I couldn't think of a test case that wasn't dependent on a
large number of other behaviors, to the point where it would likely
soon rot to the point of not testing what it was intended to.

I broke this in commit c12d570fa, so back-patch to v11.

Discussion: https://postgr.es/m/27571.1550617881@sss.pgh.pa.us
2019-02-20 13:36:55 -05:00
Alvaro Herrera 5721b9b3ce Make object address handling more robust
pg_identify_object_as_address crashes when passed certain tuples from
inconsistent system catalogs.  Make it more defensive.

Author: Álvaro Herrera
Reviewed-by: Michaël Paquier
Discussion: https://postgr.es/m/20190218202743.GA12392@alvherre.pgsql
2019-02-20 11:26:08 -03:00
Dean Rasheed 41531e42d3 Fix DEFAULT-handling in multi-row VALUES lists for updatable views.
INSERT ... VALUES for a single VALUES row is implemented differently
from a multi-row VALUES list, which causes inconsistent behaviour in
the way that DEFAULT items are handled. In particular, when inserting
into an auto-updatable view on top of a table with a column default, a
DEFAULT item in a single VALUES row gets correctly replaced with the
table column's default, but for a multi-row VALUES list it is replaced
with NULL.

Fix this by allowing rewriteValuesRTE() to leave DEFAULT items in the
VALUES list untouched if the target relation is an auto-updatable view
and has no column default, deferring DEFAULT-expansion until the query
against the base relation is rewritten. For all other types of target
relation, including tables and trigger- and rule-updatable views, we
must continue to replace DEFAULT items with NULL in the absence of a
column default.

This is somewhat complicated by the fact that if an auto-updatable
view has DO ALSO rules attached, the VALUES lists for the product
queries need to be handled differently from the original query, since
the product queries need to act like rule-updatable views whereas the
original query has auto-updatable view semantics.

Back-patch to all supported versions.

Reported by Roger Curley (bug #15623). Patch by Amit Langote and me.

Discussion: https://postgr.es/m/15623-5d67a46788ec8b7f@postgresql.org
2019-02-20 08:30:21 +00:00
Michael Paquier 56fadbedbd Mark correctly initial slot snapshots with MVCC type when built
When building an initial slot snapshot, snapshots are marked with
historic MVCC snapshots as type with the marker field being set in
SnapBuildBuildSnapshot() but not overriden in SnapBuildInitialSnapshot().
Existing callers of SnapBuildBuildSnapshot() do not care about the type
of snapshot used, but extensions calling it actually may, as reported.

While on it, mark correctly the snapshot type when importing one.  This
is cosmetic as the field is enforced to 0.

Author: Antonin Houska
Reviewed-by: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/23215.1527665193@localhost
Backpatch-through: 9.4
2019-02-20 12:31:07 +09:00
Peter Eisentraut 90cfa49003 Use varargs macro for CACHEDEBUG
Reviewed-by: Andres Freund <andres@anarazel.de>
2019-02-19 11:42:39 +01:00
Tom Lane 315dcffb94 Fix omissions in ecpg/test/sql/.gitignore.
Oversights in commits 050710b36 and e81f0e311.
2019-02-18 21:24:38 -05:00
Andres Freund 22bc403029 Remove line duplicated during conflict resolution.
I included the duplicated ExecTypeFromTL in 578b2297 "Remove WITH OIDS
support".

Reported-By: Peter Eisentraut
Discussion: https://postgr.es/m/ba819888-63c6-7f98-6acb-3731142d9414@2ndquadrant.com
2019-02-18 11:07:30 -08:00
Tom Lane 93b5cc039e De-clutter display of script runtimes in pg_regress.
Add more whitespace, per suggestion from Peter Eisentraut.

Discussion: https://postgr.es/m/e265e2ae-e92e-5ab9-dc68-60b6cb047b3d@2ndquadrant.com
2019-02-18 11:16:39 -05:00
Andrew Dunstan af25bc03e1 Provide an extra-float-digits setting for pg_dump / pg_dumpall
Changes made by commit 02ddd49 mean that dumps made against pre version
12 instances are no longer comparable with those made against version 12
or later instances. This makes cross-version upgrade testing fail in the
buildfarm. Experimentation has shown that the error is cured if the
dumps are made when extra_float_digits is set to 0. Hence this patch
allows for it to be explicitly set rather than relying on pg_dump's
builtin default (3 in almost all cases). This feature might have other
uses, but should not normally be used.

Discussion: https://postgr.es/m/c76f7051-8fd3-ec10-7579-1f8842305b85@2ndQuadrant.com
2019-02-18 07:37:07 -05:00
Michael Meskes 8e6ab9f801 Properly end string to make sure ecpglib does not read beyond its boundaries. 2019-02-18 12:52:53 +01:00
Michael Meskes e81f0e3113 Sync ECPG's CREATE TABLE AS statement with backend's.
Author: Higuchi-san ("Higuchi, Daisuke" <higuchi.daisuke@jp.fujitsu.com>)
2019-02-18 11:57:34 +01:00
Michael Meskes 050710b369 Add bytea datatype to ECPG.
So far ECPG programs had to treat binary data for bytea column as 'char' type.
But this meant converting from/to escaped format with PQunescapeBytea/
PQescapeBytea() and therefore forcing users to add unnecessary code and cost
for the conversion in runtime. By adding a dedicated datatype for bytea most of
this special handling is no longer needed.

Author: Matsumura-san ("Matsumura, Ryo" <matsumura.ryo@jp.fujitsu.com>)

Discussion: https://postgr.es/m/flat/03040DFF97E6E54E88D3BFEE5F5480F737A141F9@G01JPEXMBYT04
2019-02-18 10:20:31 +01:00
Etsuro Fujita 3fdc374b5d Save PathTargets for distinct/ordered relations in root->upper_targets[].
For the convenience of extensions, we previously only saved PathTargets
for grouped, window, and final relations in root->upper_targets[] in
grouping_planner().  To improve the convenience, save PathTargets for
distinct and ordered relations as well.

Author: Antonin Houska, with an additional change by me
Discussion: https://postgr.es/m/10994.1549559088@localhost
2019-02-18 16:13:46 +09:00
Michael Paquier a916bdc496 Fix some issues with TAP tests of pg_basebackup and pg_verify_checksums
ee9e145 has fixed the tests of pg_basebackup for checksums a first time,
still one seek() call missed the shot.  Also, the data written in files
to emulate corruptions was not actually writing zeros as the quoting
style was incorrect.

Backpatch the portion for pg_basebackup to v11 where these tests have
been introduced.  The tests of pg_verify_checksums are new as of v12.

Author: Michael Banck
Discussion: https://postgr.es/m/1550153276.796.35.camel@credativ.de
Backpatch-through: 11
2019-02-18 14:23:30 +09:00
Michael Paquier f0cce9fcb5 Fix typo in transam.h for OIDs assigned by genbki.pl
The actual range of reserved OIDs in this case is [11000,11999] and not
[11000,12000].

Author: John Naylor
Discussion: https://postgr.es/m/CAJVSVGV5StmK-inxbmrf0nLbBGeaAKnjnqxXmk+4ufeav8JMSA@mail.gmail.com
2019-02-18 12:44:25 +09:00
Michael Paquier 0dd6ff0ac8 Avoid some unnecessary block reads in WAL reader
When reading a new page internally and depending on the way the WAL
reader facility gets used by plugins, the current implementation of the
WAL reader may finish by reading a block multiple times while it is not
actually necessary as the requested data length may be equal to what has
been already read.  This can happen for any size, but is more likely to
happen at the end of a page.  This can cause performance penalties in
plugins which rely on the block reads to be purely sequential, zlib not
liking backward reads for example.  The new behavior also shaves some
cycles when doing recovery.

Author: Arthur Zakirov
Reviewed-by: Andrey Lepikhov, Michael Paquier, Grigory Smolkin
Discussion: https://postgr.es/m/2ddf4a32-517e-d6f4-d992-4a63b6035bfd@postgrespro.ru
2019-02-18 09:52:02 +09:00
Thomas Munro 0b55aaacec Fix race in dsm_unpin_segment() when handles are reused.
Teach dsm_unpin_segment() to skip segments that are in the process
of being destroyed by another backend, when searching for a handle.
Such a segment cannot possibly be the one we are looking for, even
if its handle matches.  Another slot might hold a recently created
segment that has the same handle value by coincidence, and we need
to keep searching for that one.

The bug caused rare "cannot unpin a segment that is not pinned"
errors on 10 and 11.  Similar to commit 6c0fb941 for dsm_attach().

Back-patch to 10, where dsm_unpin_segment() landed.

Author: Thomas Munro
Reported-by: Justin Pryzby
Tested-by: Justin Pryzby (along with other recent DSA/DSM fixes)
Discussion: https://postgr.es/m/20190216023854.GF30291@telsasoft.com
2019-02-18 09:58:29 +13:00
Tom Lane a32ca78836 Fix CREATE VIEW to allow zero-column views.
We should logically have allowed this case when we allowed zero-column
tables, but it was overlooked.

Although this might be thought a feature addition, it's really a bug
fix, because it was possible to create a zero-column view via
the convert-table-to-view code path, and then you'd have a situation
where dump/reload would fail.  Hence, back-patch to all supported
branches.

Arrange the added test cases to provide coverage of the related
pg_dump code paths (since these views will be dumped and reloaded
during the pg_upgrade regression test).  I also made them test
the case where pg_dump has to postpone the view rule into post-data,
which disturbingly had no regression coverage before.

Report and patch by Ashutosh Sharma (test case by me)

Discussion: https://postgr.es/m/CAE9k0PkmHdeSaeZt2ujnb_cKucmK3sDDceDzw7+d5UZoNJPYOg@mail.gmail.com
2019-02-17 12:37:31 -05:00
Joe Conway 290e3b77fd Mark pg_config() stable rather than immutable
pg_config() has been marked immutable since its inception. As part of a
larger discussion around the definition of immutable versus stable and
related implications for marking functions parallel safe raised by
Andres, the consensus was clearly that pg_config() is stable, since
it could possibly change output even for the same minor version with
a recompile or installation of a new binary. So mark it stable.

Theoretically this could/should be backpatched, but it was deemed to be not
worth the effort since in practice this is very unlikely to cause problems
in the real world.

Discussion: https://postgr.es/m/20181126234521.rh3grz7aavx2ubjv@alap3.anarazel.de
2019-02-17 09:21:13 -05:00
Andrew Gierth 6ee89952d4 Remove float8-small-is-zero regression test variant.
Since this was also the variant used as an example in the docs, update
the docs to use float4-misrounded-input as an example instead, since
that is now the only remaining variant file.
2019-02-16 22:11:04 +00:00
Tom Lane 608b167f9f Allow user control of CTE materialization, and change the default behavior.
Historically we've always materialized the full output of a CTE query,
treating WITH as an optimization fence (so that, for example, restrictions
from the outer query cannot be pushed into it).  This is appropriate when
the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE
query is non-recursive and side-effect-free, there's no hazard of changing
the query results by pushing restrictions down.

Another argument for materialization is that it can avoid duplicate
computation of an expensive WITH query --- but that only applies if
the WITH query is called more than once in the outer query.  Even then
it could still be a net loss, if each call has restrictions that
would allow just a small part of the WITH query to be computed.

Hence, let's change the behavior for WITH queries that are non-recursive
and side-effect-free.  By default, we will inline them into the outer
query (removing the optimization fence) if they are called just once.
If they are called more than once, we will keep the old behavior by
default, but the user can override this and force inlining by specifying
NOT MATERIALIZED.  Lastly, the user can force the old behavior by
specifying MATERIALIZED; this would mainly be useful when the query had
deliberately been employing WITH as an optimization fence to prevent a
poor choice of plan.

Andreas Karlsson, Andrew Gierth, David Fetter

Discussion: https://postgr.es/m/87sh48ffhb.fsf@news-spur.riddles.org.uk
2019-02-16 16:11:12 -05:00
Andrew Gierth 79730e2a9b Fix previous MinGW fix.
Definitions required for MinGW need to be outside #if _MSC_VER. Oops.
2019-02-16 15:23:02 +00:00
Michael Meskes bd7c95f0c1 Add DECLARE STATEMENT support to ECPG.
DECLARE STATEMENT is a statement that lets users declare an identifier
pointing at a connection.  This identifier will be used in other embedded
dynamic SQL statement such as PREPARE, EXECUTE, DECLARE CURSOR and so on.
When connecting to a non-default connection, the AT clause can be used in
a DECLARE STATEMENT once and is no longer needed in every dynamic SQL
statement.  This makes ECPG applications easier and more efficient.  Moreover,
writing code without designating connection explicitly improves portability.

Authors: Ideriha-san ("Ideriha, Takeshi" <ideriha.takeshi@jp.fujitsu.com>)
         Kuroda-san ("Kuroda, Hayato" <kuroda.hayato@jp.fujitsu.com>)

Discussion: https://postgr.es/m4E72940DA2BF16479384A86D54D0988A565669DF@G01JPEXMBKW04
2019-02-16 11:05:54 +01:00
Tom Lane 02a6a54ecd Make use of compiler builtins and/or assembly for CLZ, CTZ, POPCNT.
Test for the compiler builtins __builtin_clz, __builtin_ctz, and
__builtin_popcount, and make use of these in preference to
handwritten C code if they're available.  Create src/port
infrastructure for "leftmost one", "rightmost one", and "popcount"
so as to centralize these decisions.

On x86_64, __builtin_popcount generally won't make use of the POPCNT
opcode because that's not universally supported yet.  Provide code
that checks CPUID and then calls POPCNT via asm() if available.
This requires indirecting through a function pointer, which is
an annoying amount of overhead for a one-instruction operation,
but it's probably not worth working harder than this for our
current use-cases.

I'm not sure we've found all the existing places that could profit
from this new infrastructure; but we at least touched all the
ones that used copied-and-pasted versions of the bitmapset.c code,
and got rid of multiple copies of the associated constant arrays.

While at it, replace c-compiler.m4's one-per-builtin-function
macros with a single one that can handle all the cases we need
to worry about so far.  Also, because I'm paranoid, make those
checks into AC_LINK checks rather than just AC_COMPILE; the
former coding failed to verify that libgcc has support for the
builtin, in cases where it's not inline code.

David Rowley, Thomas Munro, Alvaro Herrera, Tom Lane

Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
2019-02-15 23:22:33 -05:00
Andrew Gierth 72880ac182 Cygwin and Mingw floating-point fixes.
Deal with silent-underflow errors in float4 for cygwin and mingw by
using our strtof() wrapper; deal with misrounding errors by adding
them to the resultmap. Some slight reorganization of declarations was
done to avoid duplicating material between cygwin.h and win32_port.h.

While here, remove from the resultmap all references to
float8-small-is-zero; inspection of cygwin output suggests it's no
longer required there, and the freebsd/netbsd/openbsd entries should
no longer be necessary (these date back to c. 2000). This commit
doesn't remove the file itself nor the documentation references for
it; that will happen in a subsequent commit if all goes well.
2019-02-16 01:50:16 +00:00
Alvaro Herrera 457aef0f1f Revert attempts to use POPCNT etc instructions
This reverts commits fc6c72747a, 109de05cbb, d0b4663c23 and
711bab1e4d.

Somebody will have to try harder before submitting this patch again.
I've spent entirely too much time on it already, and the #ifdef maze yet
to be written in order for it to build at all got on my nerves.  The
amount of work needed to get a platform-specific performance improvement
that's barely above the noise level is not worth it.
2019-02-15 16:32:30 -03:00
Tom Lane e89f14e2bb Refactor index cost estimation functions in view of IndexClause changes.
Get rid of deconstruct_indexquals() in favor of just iterating over the
IndexClause list directly.  The extra services that that function used to
provide, such as hiding clause commutation and associating the right index
column with each clause, are no longer useful given the new data structure.
I'd originally thought that it'd provide a useful amount of abstraction
by freeing callers from paying attention to the exact clause type of each
indexqual, but that hope proves to have been vain, because few callers can
ignore the semantic differences between different clause types.  Indeed,
removing it results in a net code savings, and probably some cycles shaved
by not having to build an extra list-of-structs data structure.

Also, export a few formerly-static support functions, with the goal
of allowing extension AMs to write functionality equivalent to
genericcostestimate() without pointless code duplication.

Discussion: https://postgr.es/m/24586.1550106354@sss.pgh.pa.us
2019-02-15 13:05:19 -05:00
Alvaro Herrera fc6c72747a Fix compiler builtin usage in new pg_bitutils.c
Split out these new functions in three parts: one in a new file that
uses the compiler builtin and gets compiled with the -mpopcnt compiler
option if it exists; another one that uses the compiler builtin but not
the compiler option; and finally the fallback with open-coded
algorithms.

Split out the configure logic: in the original commit, it was selecting
to use the -mpopcnt compiler switch together with deciding whether to
use the compiler builtin, but those two things are really separate.
Split them out.  Also, expose whether the builtin exists to
Makefile.global, so that src/port's Makefile can decide whether to
compile the hw-optimized file.

Remove CPUID test for CTZ/CLZ.  Make pg_{right,left}most_ones use either
the compiler intrinsic or open-coded algo; trying to use the
HW-optimized version is a waste of time.  Make them static inline
functions.

Discussion: https://postgr.es/m/20190213221719.GA15976@alvherre.pgsql
2019-02-15 13:39:56 -03:00
Peter Eisentraut 8f27a14b1b Use standard diff separator for regression.diffs
Instead of ======..., use the standard separator for a multi-file
diff, which is, per POSIX,

    "diff %s %s %s\n", <diff_options>, <filename1>, <filename2>

This makes regression.diffs behave more like a proper diff file, for
use with other tools.  And it shows the diff options used, for
clarity.

Discussion: https://www.postgresql.org/message-id/70440c81-37bb-76dd-e48b-b5a9550d5613@2ndquadrant.com
2019-02-15 15:09:50 +01:00
Michael Paquier 331a613e9d Fix support for CREATE TABLE IF NOT EXISTS AS EXECUTE
The grammar IF NOT EXISTS for CTAS is supported since 9.5 and documented
as such, however the case of using EXECUTE as query has never been
covered as EXECUTE CTAS statements and normal CTAS statements are parsed
separately.

Author: Andreas Karlsson
Discussion: https://postgr.es/m/2ddcc188-e37c-a0be-32bf-a56b07c3559e@proxel.se
Backpatch-through: 9.5
2019-02-15 17:12:24 +09:00
Thomas Munro 6c0fb94189 Fix race in dsm_attach() when handles are reused.
DSM handle values can be reused as soon as the underlying shared memory
object has been destroyed.  That means that for a brief moment we
might have two DSM slots with the same handle.  While trying to attach,
if we encounter a slot with refcnt == 1, meaning that it is currently
being destroyed, we should continue our search in case the same handle
exists in another slot.

The race manifested as a rare "dsa_area could not attach to segment"
error, and was more likely in 10 and 11 due to the lack of distinct
seed for random() in parallel workers.  It was made very unlikely in
in master by commit 197e4af9, and older releases don't usually create
new DSM segments in background workers so it was also unlikely there.

This fixes the root cause of bug report #15585, in which the error
could also sometimes result in a self-deadlock in the error path.
It's not yet clear if further changes are needed to avoid that failure
mode.

Back-patch to 9.4, where dsm.c arrived.

Author: Thomas Munro
Reported-by: Justin Pryzby, Sergei Kornilov
Discussion: https://postgr.es/m/20190207014719.GJ29720@telsasoft.com
Discussion: https://postgr.es/m/15585-324ff6a93a18da46@postgresql.org
2019-02-15 14:05:09 +13:00
Tom Lane 8fd3fdd85a Simplify the planner's new representation of indexable clauses a little.
In commit 1a8d5afb0, I thought it'd be a good idea to define
IndexClause.indexquals as NIL in the most common case where the given
clause (IndexClause.rinfo) is usable exactly as-is.  It'd be more
consistent to define the indexquals in that case as being a one-element
list containing IndexClause.rinfo, but I thought saving the palloc
overhead for making such a list would be worthwhile.

In hindsight, that was a great example of "premature optimization is the
root of all evil": it's complicated everyplace that needs to deal with
the indexquals, requiring duplicative code to handle both the simple
case and the not-simple case.  I'd initially found that tolerable but
it's getting less so as I mop up some areas that I'd not touched in
1a8d5afb0.  In any case, two more pallocs during a planner run are
surely at the noise level (a conclusion confirmed by a bit of
microbenchmarking).  So let's change this decision before it becomes
set in stone, and insist that IndexClause.indexquals always be a valid
list of the actual index quals for the clause.

Discussion: https://postgr.es/m/24586.1550106354@sss.pgh.pa.us
2019-02-14 19:37:30 -05:00
Peter Eisentraut 86eea78694 Get rid of another unconstify through API changes
This also makes the code in read_client_first_message() more similar
to read_client_final_message().

Reported-by: Mark Dilger <hornschnorter@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/53a28052-f9f3-1808-fed9-460fd43035ab%402ndquadrant.com
2019-02-14 20:44:47 +01:00
Tom Lane 49fa99e54e Move pattern selectivity code from selfuncs.c to like_support.c.
While at it, refactor patternsel() a bit so that it can be used from
the LIKE/regex planner support functions as well.  This makes the
planner able to deal equally well with either operator or function
syntax for these operations.  I'm not excited about that as a feature
in itself, but it provides a nice model for extensions to follow if
they want such behavior for their operations.

This change localizes the use of pattern_fixed_prefix() and
make_greater_string() so that they no longer need be exported.
(We might get pushback from extensions about that, perhaps,
in which case I'd be inclined to re-export them in a new header
file like_support.h.)

This reduces the bulk of selfuncs.c a fair amount, removing ~1370
lines or about one-sixth of that file; it's still too big, but this
is progress.

Discussion: https://postgr.es/m/24537.1550093915@sss.pgh.pa.us
2019-02-14 10:51:59 -05:00
Alvaro Herrera 109de05cbb Fix portability issues in pg_bitutils
We were using uint64 function arguments as "long int" arguments to
compiler builtins, which fails on machines where long ints are 32 bits:
the upper half of the uint64 was being ignored.  Fix by using the "ll"
builtin variants instead, which on those machines take 64 bit arguments.

Also, remove configure tests for __builtin_popcountl() (as well as
"long" variants for ctz and clz): the theory here is that any compiler
version will provide all widths or none, so one test suffices.  Were
this theory to be wrong, we'd have to add tests for
__builtin_popcountll() and friends, which would be tedious.

Per failures in buildfarm member lapwing and ensuing discussion.
2019-02-13 20:09:48 -03:00
Andrew Gierth 80c468b4a4 Remove a stray subnormal value from float tests.
We don't care to assume that input of subnormal float values works,
but a stray subnormal value from the upstream Ryu regression test had
been left in the test data by mistake. Remove it.

Per buildfarm member fulmar.
2019-02-13 20:42:57 +00:00
Alvaro Herrera d0b4663c23 Add -mpopcnt to all compiles of pg_bitutils.c
The way this makefile works, we need to specify it three times.
2019-02-13 17:39:05 -03:00
Andrew Gierth da6520be7f More float test and portability fixes.
Avoid assuming exact results in tstypes test; some platforms vary.
(per buildfarm members eulachon, danio, lapwing)

Avoid dubious usage (inherited from upstream) of bool parameters to
copy_special_str, to see if this fixes the mac/ppc failures (per
buildfarm members prariedog and locust). (Isolated test programs on a
ppc mac don't seem to show any other cause that would explain them.)
2019-02-13 19:35:50 +00:00
Alvaro Herrera 711bab1e4d Add basic support for using the POPCNT and SSE4.2s LZCNT opcodes
These opcodes have been around in the AMD world since 2007, and 2008 in
the case of intel.  They're supported in GCC and Clang via some __builtin
macros.  The opcodes may be unavailable during runtime, in which case we
fall back on a C-based implementation of the code.  In order to get the
POPCNT instruction we must pass the -mpopcnt option to the compiler.  We
do this only for the pg_bitutils.c file.

David Rowley (with fragments taken from a patch by Thomas Munro)

Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
2019-02-13 16:10:06 -03:00
Andrew Gierth 754ca99314 Fix an overlooked UINT32_MAX.
Replace with PG_UINT32_MAX. Per buildfarm members dory and woodlouse.
2019-02-13 15:57:54 +00:00
Andrew Gierth 02ddd49932 Change floating-point output format for improved performance.
Previously, floating-point output was done by rounding to a specific
decimal precision; by default, to 6 or 15 decimal digits (losing
information) or as requested using extra_float_digits. Drivers that
wanted exact float values, and applications like pg_dump that must
preserve values exactly, set extra_float_digits=3 (or sometimes 2 for
historical reasons, though this isn't enough for float4).

Unfortunately, decimal rounded output is slow enough to become a
noticable bottleneck when dealing with large result sets or COPY of
large tables when many floating-point values are involved.

Floating-point output can be done much faster when the output is not
rounded to a specific decimal length, but rather is chosen as the
shortest decimal representation that is closer to the original float
value than to any other value representable in the same precision. The
recently published Ryu algorithm by Ulf Adams is both relatively
simple and remarkably fast.

Accordingly, change float4out/float8out to output shortest decimal
representations if extra_float_digits is greater than 0, and make that
the new default. Applications that need rounded output can set
extra_float_digits back to 0 or below, and take the resulting
performance hit.

We make one concession to portability for systems with buggy
floating-point input: we do not output decimal values that fall
exactly halfway between adjacent representable binary values (which
would rely on the reader doing round-to-nearest-even correctly). This
is known to be a problem at least for VS2013 on Windows.

Our version of the Ryu code originates from
https://github.com/ulfjack/ryu/ at commit c9c3fb1979, but with the
following (significant) modifications:

 - Output format is changed to use fixed-point notation for small
   exponents, as printf would, and also to use lowercase 'e', a
   minimum of 2 exponent digits, and a mandatory sign on the exponent,
   to keep the formatting as close as possible to previous output.

 - The output of exact midpoint values is disabled as noted above.

 - The integer fast-path code is changed somewhat (since we have
   fixed-point output and the upstream did not).

 - Our project style has been largely applied to the code with the
   exception of C99 declaration-after-statement, which has been
   retained as an exception to our present policy.

 - Most of upstream's debugging and conditionals are removed, and we
   use our own configure tests to determine things like uint128
   availability.

Changing the float output format obviously affects a number of
regression tests. This patch uses an explicit setting of
extra_float_digits=0 for test output that is not expected to be
exactly reproducible (e.g. due to numerical instability or differing
algorithms for transcendental functions).

Conversions from floats to numeric are unchanged by this patch. These
may appear in index expressions and it is not yet clear whether any
change should be made, so that can be left for another day.

This patch assumes that the only supported floating point format is
now IEEE format, and the documentation is updated to reflect that.

Code by me, adapting the work of Ulf Adams and other contributors.

References:
https://dl.acm.org/citation.cfm?id=3192369

Reviewed-by: Tom Lane, Andres Freund, Donald Dong
Discussion: https://postgr.es/m/87r2el1bx6.fsf@news-spur.riddles.org.uk
2019-02-13 15:20:33 +00:00
Andrew Gierth f397e08599 Use strtof() and not strtod() for float4 input.
Using strtod() creates a double-rounding problem; the input decimal
value is first rounded to the nearest double; rounding that to the
nearest float may then give an incorrect result.

An example is that 7.038531e-26 when input via strtod and then rounded
to float4 gives 0xAE43FEp-107 instead of the correct 0xAE43FDp-107.

Values output by earlier PG versions with extra_float_digits=3 should
all be read in with the same values as previously. However, values
supplied by other software using shortest representations could be
mis-read.

On platforms that lack a strtof() entirely, we fall back to the old
incorrect rounding behavior. (As strtof() is required by C99, such
platforms are considered of primarily historical interest.) On VS2013,
some workarounds are used to get correct error handling.

The regression tests now test for the correct input values, so
platforms that lack strtof() will need resultmap entries. An entry for
HP-UX 10 is included (more may be needed).

Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/871s5emitx.fsf@news-spur.riddles.org.uk
Discussion: https://postgr.es/m/87d0owlqpv.fsf@news-spur.riddles.org.uk
2019-02-13 15:19:44 +00:00
Peter Eisentraut 37d9916020 More unconstify use
Replace casts whose only purpose is to cast away const with the
unconstify() macro.

Discussion: https://www.postgresql.org/message-id/flat/53a28052-f9f3-1808-fed9-460fd43035ab%402ndquadrant.com
2019-02-13 11:50:16 +01:00
Peter Eisentraut cf40dc65b6 Remove useless casts
Some of these were uselessly casting away "const", some were just
nearby, but they where all unnecessary anyway.

Discussion: https://www.postgresql.org/message-id/flat/53a28052-f9f3-1808-fed9-460fd43035ab%402ndquadrant.com
2019-02-13 11:50:09 +01:00
Michael Paquier 6ea95166a0 Fix comment related to calculation location of total_table_pages
As of commit c6e4133, the calculation happens in make_one_rel() and not
query_planner().

Author: Amit Langote
Discussion: https://postgr.es/m/c7a04a90-42e6-28a4-811a-a7e352831ba1@lab.ntt.co.jp
2019-02-13 16:31:20 +09:00
Thomas Munro 7215efdc00 Fix rare dsa_allocate() failures due to freepage.c corruption.
In a corner case, a btree page was allocated during a clean-up operation
that could cause the tracking of the largest contiguous span of free
space to get out of whack.  That was supposed to be prevented by the use
of the "soft" flag to avoid allocating internal pages during incidental
clean-up work, but the flag was ignored in the case where the FPM was
promoted from singleton format to btree format.  Repair.

Remove an obsolete comment in passing.

Back-patch to 10, where freepage.c arrived (as support for dsa.c).

Author: Robert Haas
Diagnosed-by: Thomas Munro and Robert Haas
Reported-by: Justin Pryzby, Rick Otten, Sand Stone, Arne Roland and others
Discussion: https://postgr.es/m/CAMAYy4%2Bw3NTBM5JLWFi8twhWK4%3Dk_5L4nV5%2BbYDSPu8r4b97Zg%40mail.gmail.com
2019-02-13 13:24:11 +13:00
Tom Lane 75c46149fc Clean up planner confusion between ncolumns and nkeycolumns.
We're only going to consider key columns when creating indexquals,
so there is no point in having the outer loops in indxpath.c iterate
further than nkeycolumns.

Doing so in match_pathkeys_to_index() is actually wrong, and would have
caused crashes by now, except that we have no index AMs supporting both
amcanorderbyop and amcaninclude.

It's also wrong in relation_has_unique_index_for().  The effect there is
to fail to prove uniqueness even when the index does prove it, if there
are extra columns.

Also future-proof examine_variable() for the day when extra columns can
be expressions, and fix what's either a thinko or just an oversight in
btcostestimate(): we should consider the number of key columns, not the
total, when deciding whether to derate correlation.

None of these things seemed important enough to risk changing in a
just-before-wrap patch, but since we're past the release wrap window,
time to fix 'em.

Discussion: https://postgr.es/m/25526.1549847928@sss.pgh.pa.us
2019-02-12 18:38:32 -05:00
Alvaro Herrera 8c67d29fd5 Relax overly strict assertion
Ever since its birth, ReorderBufferBuildTupleCidHash() has contained an
assertion that a catalog tuple cannot change Cmax after acquiring one.  But
that's wrong: if a subtransaction executes DDL that affects that catalog
tuple, and later aborts and another DDL affects the same tuple, it will
change Cmax.  Relax the assertion to merely verify that the Cmax remains
valid and monotonically increasing, instead.

Add a test that tickles the relevant code.

Diagnosed by, and initial patch submitted by: Arseny Sher
Co-authored-by: Arseny Sher
Discussion: https://postgr.es/m/874l9p8hyw.fsf@ars-thinkpad
2019-02-12 18:42:37 -03:00
Alvaro Herrera d357a16997 Blind attempt at fixing Windows build
Broken since fe33a196de.
2019-02-12 18:29:26 -03:00
Alvaro Herrera fe33a196de Use Getopt::Long for catalog scripts
Replace hand-rolled option parsing with the Getopt module. This is
shorter and easier to read. In passing, make some cosmetic adjustments
for consistency.

Author: John Naylor
Reviewed-by: David Fetter
Discussion: https://postgr.es/m/CACPNZCvRjepXh5b2N50njN+rO_2Nzcf=jhMkKX7=79XWUKJyKA@mail.gmail.com
2019-02-12 12:22:08 -03:00
Tom Lane 232a8e233f Fix erroneous error reports in snapbuild.c.
It's pretty unhelpful to report the wrong file name in a complaint
about syscall failure, but SnapBuildSerialize managed to do that twice
in a span of 50 lines.  Also fix half a dozen missing or poorly-chosen
errcode assignments; that's mostly cosmetic, but still wrong.

Noted while studying recent failures on buildfarm member nightjar.
I'm not sure whether those reports are actually giving the wrong
filename, because there are two places here with identically
spelled error messages.  The other one is specifically coded not
to report ENOENT, but if it's this one, how could we be getting
ENOENT from open() with O_CREAT?  Need to sit back and await results.

However, these ereports are clearly broken from birth, so back-patch.
2019-02-12 01:12:52 -05:00
Michael Paquier b7ec820559 Fix description of WAL record XLOG_PARAMETER_CHANGE
max_wal_senders and max_worker_processes got reversed in the output
generated because of ea92368.

Reported-by: Kevin Hale Boyes
Discussion: https://postgr.es/m/CADAecHVAD4=26KAx4nj5DBvxqqvJkuwsy+riiiNhQqwnZg2K8Q@mail.gmail.com
2019-02-12 13:10:59 +09:00
Tom Lane b07c695d9c Fix header inclusion issue.
partprune.h failed to compile by itself; needs to include partdefs.h.

I think I must've broken this in fa2cf164a, though I'd swear I ran
the appropriate tests when removing #includes.  Anyway, it's very
sensible for this file to include partdefs.h, so let's just do that.

Per cpluspluscheck.
2019-02-11 22:37:24 -05:00
Tom Lane 74dfe58a59 Allow extensions to generate lossy index conditions.
For a long time, indxpath.c has had the ability to extract derived (lossy)
index conditions from certain operators such as LIKE.  For just as long,
it's been obvious that we really ought to make that capability available
to extensions.  This commit finally accomplishes that, by adding another
API for planner support functions that lets them create derived index
conditions for their functions.  As proof of concept, the hardwired
"special index operator" code formerly present in indxpath.c is pushed
out to planner support functions attached to LIKE and other relevant
operators.

A weak spot in this design is that an extension needs to know OIDs for
the operators, datatypes, and opfamilies involved in the transformation
it wants to make.  The core-code prototypes use hard-wired OID references
but extensions don't have that option for their own operators etc.  It's
usually possible to look up the required info, but that may be slow and
inconvenient.  However, improving that situation is a separate task.

I want to do some additional refactorization around selfuncs.c, but
that also seems like a separate task.

Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-11 21:26:14 -05:00
Michael Paquier ea92368cd1 Move max_wal_senders out of max_connections for connection slot handling
Since its introduction, max_wal_senders is counted as part of
max_connections when it comes to define how many connection slots can be
used for replication connections with a WAL sender context.  This can
lead to confusion for some users, as it could be possible to block a
base backup or replication from happening because other backend sessions
are already taken for other purposes by an application, and
superuser-only connection slots are not a correct solution to handle
that case.

This commit makes max_wal_senders independent of max_connections for its
handling of PGPROC entries in ProcGlobal, meaning that connection slots
for WAL senders are handled using their own free queue, like autovacuum
workers and bgworkers.

One compatibility issue that this change creates is that a standby now
requires to have a value of max_wal_senders at least equal to its
primary.  So, if a standby created enforces the value of
max_wal_senders to be lower than that, then this could break failovers.
Normally this should not be an issue though, as any settings of a
standby are inherited from its primary as postgresql.conf gets normally
copied as part of a base backup, so parameters would be consistent.

Author: Alexander Kukushkin
Reviewed-by: Kyotaro Horiguchi, Petr Jelínek, Masahiko Sawada, Oleksii
Kliukin
Discussion: https://postgr.es/m/CAFh8B=nBzHQeYAu0b8fjK-AF1X4+_p6GRtwG+cCgs6Vci2uRuQ@mail.gmail.com
2019-02-12 10:07:56 +09:00
Tom Lane 1d92a0c9f7 Redesign the partition dependency mechanism.
The original setup for dependencies of partitioned objects had
serious problems:

1. It did not verify that a drop cascading to a partition-child object
also cascaded to at least one of the object's partition parents.  Now,
normally a child object would share all its dependencies with one or
another parent (e.g. a child index's opclass dependencies would be shared
with the parent index), so that this oversight is usually harmless.
But if some dependency failed to fit this pattern, the child could be
dropped while all its parents remain, creating a logically broken
situation.  (It's easy to construct artificial cases that break it,
such as attaching an unrelated extension dependency to the child object
and then dropping the extension.  I'm not sure if any less-artificial
cases exist.)

2. Management of partition dependencies during ATTACH/DETACH PARTITION
was complicated and buggy; for example, after detaching a partition
table it was possible to create cases where a formerly-child index
should be dropped and was not, because the correct set of dependencies
had not been reconstructed.

Less seriously, because multiple partition relationships were
represented identically in pg_depend, there was an order-of-traversal
dependency on which partition parent was cited in error messages.
We also had some pre-existing order-of-traversal hazards for error
messages related to internal and extension dependencies.  This is
cosmetic to users but causes testing problems.

To fix #1, add a check at the end of the partition tree traversal
to ensure that at least one partition parent got deleted.  To fix #2,
establish a new policy that partition dependencies are in addition to,
not instead of, a child object's usual dependencies; in this way
ATTACH/DETACH PARTITION need not cope with adding or removing the
usual dependencies.

To fix the cosmetic problem, distinguish between primary and secondary
partition dependency entries in pg_depend, by giving them different
deptypes.  (They behave identically except for having different
priorities for being cited in error messages.)  This means that the
former 'I' dependency type is replaced with new 'P' and 'S' types.

This also fixes a longstanding bug that after handling an internal
dependency by recursing to the owning object, findDependentObjects
did not verify that the current target was now scheduled for deletion,
and did not apply the current recursion level's objflags to it.
Perhaps that should be back-patched; but in the back branches it
would only matter if some concurrent transaction had removed the
internal-linkage pg_depend entry before the recursive call found it,
or the recursive call somehow failed to find it, both of which seem
unlikely.

Catversion bump because the contents of pg_depend change for
partitioning relationships.

Patch HEAD only.  It's annoying that we're not fixing #2 in v11,
but there seems no practical way to do so given that the problem
is exactly a poor choice of what entries to put in pg_depend.
We can't really fix that while staying compatible with what's
in pg_depend in existing v11 installations.

Discussion: https://postgr.es/m/CAH2-Wzkypv1R+teZrr71U23J578NnTBt2X8+Y=Odr4pOdW1rXg@mail.gmail.com
2019-02-11 14:41:17 -05:00
Alvaro Herrera c603b392c3 Fix misleading PG_RE_THROW commentary
The old verbiage indicated that PG_RE_THROW is optional, which is not
really true.  This has confused many people, so it seems worth fixing.

Discussion: https://postgr.es/m/20190206160958.GA22304@alvherre.pgsql
2019-02-11 15:56:09 -03:00
Peter Eisentraut 256fc004af Adjust gratuitously different error message wording 2019-02-11 14:01:05 +01:00
Peter Eisentraut 78b0cac74d Remove unused macro
Last use was removed in 2c66f9924c.
2019-02-11 10:07:25 +01:00
Tom Lane 6bdc3005b5 Fix indexable-row-comparison logic to account for covering indexes.
indxpath.c needs a good deal more attention for covering indexes than
it's gotten.  But so far as I can tell, the only really awful breakage
is in expand_indexqual_rowcompare (nee adjust_rowcompare_for_index),
which was only half fixed in c266ed31a.  The other problems aren't
bad enough to take the risk of a just-before-wrap fix.

The problem here is that if the leading column of a row comparison
matches an index (allowing this code to be reached), and some later
column doesn't match the index, it'll nonetheless believe that that
column matches the first included index column.  Typically that'll
lead to an error like "operator M is not a member of opfamily N" as
a result of fetching a garbage opfamily OID.  But with enough bad
luck, maybe a broken plan would be generated.

Discussion: https://postgr.es/m/25526.1549847928@sss.pgh.pa.us
2019-02-10 22:51:32 -05:00
Tom Lane 72d71e0356 Add per-test-script runtime display to pg_regress.
It seems useful to have this information available, so that it's
easier to tell when a test script is taking a disproportionate
amount of time.

Discussion: https://postgr.es/m/16646.1549770618@sss.pgh.pa.us
2019-02-10 16:54:31 -05:00
Alvaro Herrera cb90de1aac Fix trigger drop procedure
After commit 123cc697a8, we remove redundant FK action triggers during
partition ATTACH by merely deleting the catalog tuple, but that's wrong:
it should use performDeletion() instead.  Repair, and make the comments
more explicit.

Per code review from Tom Lane.

Discussion: https://postgr.es/m/18885.1549642539@sss.pgh.pa.us
2019-02-10 10:00:11 -03:00
Tom Lane 068503c765 Solve cross-version-upgrade testing problem induced by 1fb57af92.
Renaming varchar_transform to varchar_support had a side effect
I hadn't foreseen: the core regression tests leave around a
transform object that relies on that function, so the name
change breaks cross-version upgrade tests, because the name
used in the older branches doesn't match.

Since the dependency on varchar_transform was chosen with the
aid of a dartboard anyway (it would surely not work as a
language transform support function), fix by just choosing
a different random builtin function with the right signature.
Also add some comments explaining why this isn't horribly unsafe.

I chose to make the same substitution in a couple of other
copied-and-pasted test cases, for consistency, though those
aren't directly contributing to the testing problem.

Per buildfarm.  Back-patch, else it doesn't fix the problem.
2019-02-09 21:02:06 -05:00
Tom Lane 4dbe196907 Repair unsafe/unportable snprintf usage in pg_restore.
warn_or_exit_horribly() was blithely passing a potentially-NULL
string pointer to a %s format specifier.  That works (at least
to the extent of not crashing) on some platforms, but not all,
and since we switched to our own snprintf.c it doesn't work
for us anywhere.

Of the three string fields being handled this way here, I think
that only "owner" is supposed to be nullable ... but considering
that this is error-reporting code, it has very little business
assuming anything, so put in defenses for all three.

Per a crash observed on buildfarm member crake and then
reproduced here.  Because of the portability aspect,
back-patch to all supported versions.
2019-02-09 19:45:38 -05:00
Tom Lane a391ff3c3d Build out the planner support function infrastructure.
Add support function requests for estimating the selectivity, cost,
and number of result rows (if a SRF) of the target function.

The lack of a way to estimate selectivity of a boolean-returning
function in WHERE has been a recognized deficiency of the planner
since Berkeley days.  This commit finally fixes it.

In addition, non-constant estimates of cost and number of output
rows are now possible.  We still fall back to looking at procost
and prorows if the support function doesn't service the request,
of course.

To make concrete use of the possibility of estimating output rowcount
for SRFs, this commit adds support functions for array_unnest(anyarray)
and the integer variants of generate_series; the lack of plausible
rowcount estimates for those, even when it's obvious to a human,
has been a repeated subject of complaints.  Obviously, much more
could now be done in this line, but I'm mostly just trying to get
the infrastructure in place.

Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-09 18:32:23 -05:00
Tom Lane 1fb57af920 Create the infrastructure for planner support functions.
Rename/repurpose pg_proc.protransform as "prosupport".  The idea is
still that it names an internal function that provides knowledge to
the planner about the behavior of the function it's attached to;
but redesign the API specification so that it's not limited to doing
just one thing, but can support an extensible set of requests.

The original purpose of simplifying a function call is handled by
the first request type to be invented, SupportRequestSimplify.
Adjust all the existing transform functions to handle this API,
and rename them fron "xxx_transform" to "xxx_support" to reflect
the potential generalization of what they do.  (Since we never
previously provided any way for extensions to add transform functions,
this change doesn't create an API break for them.)

Also add DDL and pg_dump support for attaching a support function to a
user-defined function.  Unfortunately, DDL access has to be restricted
to superusers, at least for now; but seeing that support functions
will pretty much have to be written in C, that limitation is just
theoretical.  (This support is untested in this patch, but a follow-on
patch will add cases that exercise it.)

Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
2019-02-09 18:08:48 -05:00
Tom Lane 1a8d5afb0d Refactor the representation of indexable clauses in IndexPaths.
In place of three separate but interrelated lists (indexclauses,
indexquals, and indexqualcols), an IndexPath now has one list
"indexclauses" of IndexClause nodes.  This holds basically the same
information as before, but in a more useful format: in particular, there
is now a clear connection between an indexclause (an original restriction
clause from WHERE or JOIN/ON) and the indexquals (directly usable index
conditions) derived from it.

We also change the ground rules a bit by mandating that clause commutation,
if needed, be done up-front so that what is stored in the indexquals list
is always directly usable as an index condition.  This gets rid of repeated
re-determination of which side of the clause is the indexkey during costing
and plan generation, as well as repeated lookups of the commutator
operator.  To minimize the added up-front cost, the typical case of
commuting a plain OpExpr is handled by a new special-purpose function
commute_restrictinfo().  For RowCompareExprs, generating the new clause
properly commuted to begin with is not really any more complex than before,
it's just different --- and we can save doing that work twice, as the
pretty-klugy original implementation did.

Tracking the connection between original and derived clauses lets us
also track explicitly whether the derived clauses are an exact or lossy
translation of the original.  This provides a cheap solution to getting
rid of unnecessary rechecks of boolean index clauses, which previously
seemed like it'd be more expensive than it was worth.

Another pleasant (IMO) side-effect is that EXPLAIN now always shows
index clauses with the indexkey on the left; this seems less confusing.

This commit leaves expand_indexqual_conditions() and some related
functions in a slightly messy state.  I didn't bother to change them
any more than minimally necessary to work with the new data structure,
because all that code is going to be refactored out of existence in
a follow-on patch.

Discussion: https://postgr.es/m/22182.1549124950@sss.pgh.pa.us
2019-02-09 17:30:43 -05:00
Tom Lane 6401583863 Call set_rel_pathlist_hook before generate_gather_paths, not after.
The previous ordering of these steps satisfied the nominal requirement
that set_rel_pathlist_hook could editorialize on the whole set of Paths
constructed for a base relation.  In practice, though, trying to change
the set of partial paths was impossible.  Adding one didn't work because
(a) it was too late to be included in Gather paths made by the core code,
and (b) calling add_partial_path after generate_gather_paths is unsafe,
because it might try to delete a path it thinks is dominated, but that
is already embedded in some Gather path(s).  Nor could the hook safely
remove partial paths, for the same reason that they might already be
embedded in Gathers.

Better to call extensions first, let them add partial paths as desired,
and then gather.  In v11 and up, we already doubled down on that ordering
by postponing gathering even further for single-relation queries; so even
if the hook wished to editorialize on Gather path construction, it could
not.

Report and patch by KaiGai Kohei.  Back-patch to 9.6 where Gather paths
were added.

Discussion: https://postgr.es/m/CAOP8fzahwpKJRTVVTqo2AE=mDTz_efVzV6Get_0=U3SO+-ha1A@mail.gmail.com
2019-02-09 11:41:09 -05:00
Andres Freund 356687bd82 Reset, not recreate, execGrouping.c style hashtables.
This uses the facility added in the preceding commit to fix
performance issues caused by rebuilding the hashtable (with its
comparator expression being the most expensive bit), after every
reset. That's especially important when the comparator is JIT
compiled.

Bug: #15592 #15486
Reported-By: Jakub Janeček, Dmitry Marakasov
Author: Andres Freund
Discussion:
    https://postgr.es/m/15486-05850f065da42931@postgresql.org
    https://postgr.es/m/20190114180423.ywhdg2iagzvh43we@alap3.anarazel.de
Backpatch: 11, where I broke this in bf6c614a2f
2019-02-09 01:05:49 -08:00
Andres Freund 317ffdfeaa Allow to reset execGrouping.c style tuple hashtables.
This has the advantage that the comparator expression, the table's
slot, etc do not have to be rebuilt. Additionally the simplehash.h
hashtable within the tuple hashtable now keeps its previous size and
doesn't need to be reallocated. That both reduces allocator overhead,
and improves performance in cases where the input estimation was off
by a significant factor.

To avoid an API/ABI break, the new parameter is exposed via the new
BuildTupleHashTableExt(), and BuildTupleHashTable() now is a wrapper
around the former, that continues to allocate the table itself in the
tablecxt.

Using this fixes performance issues discovered in the two bugs
referenced. This commit however has not converted the callers, that's
done in a separate commit.

Bug: #15592 #15486
Reported-By: Jakub Janeček, Dmitry Marakasov
Author: Andres Freund
Discussion:
    https://postgr.es/m/15486-05850f065da42931@postgresql.org
    https://postgr.es/m/20190114180423.ywhdg2iagzvh43we@alap3.anarazel.de
Backpatch: 11, this is a prerequisite for other fixes
2019-02-09 01:05:49 -08:00
Andres Freund 3b632a58e7 simplehash: Add support for resetting a hashtable's contents.
A hashtable reset just reset the hashtable entries, but does not free
memory.

Author: Andres Freund
Discussion: https://postgr.es/m/20190114180423.ywhdg2iagzvh43we@alap3.anarazel.de
Bug: #15592 #15486
Backpatch: 11, this is a prerequisite for other fixes
2019-02-09 01:05:49 -08:00
Andres Freund 5567d12ce0 Plug leak in BuildTupleHashTable by creating ExprContext in correct context.
In bf6c614a2f I added a expr context to evaluate the grouping
expression. Unfortunately the code I added initialized them while in
the calling context, rather the table context.  Additionally, I used
CreateExprContext() rather than CreateStandaloneExprContext(), which
creates the econtext in the estate's query context.

Fix that by using CreateStandaloneExprContext when in the table's
tablecxt. As we rely on the memory being freed by a memory context
reset that means that the econtext's shutdown callbacks aren't being
called, but that seems ok as the expressions are tightly controlled
due to ExecBuildGroupingEqual().

Bug: #15592
Reported-By: Dmitry Marakasov
Author: Andres Freund
Discussion: https://postgr.es/m/20190114222838.h6r3fuyxjxkykf6t@alap3.anarazel.de
Backpatch: 11, where I broke this in bf6c614a2f
2019-02-09 01:05:49 -08:00
Tom Lane 0edef16d76 Defend against null error message reported by libxml2.
While this isn't really supposed to happen, it can occur in OOM
situations and perhaps others.  Instead of crashing, substitute
"(no message provided)".

I didn't worry about localizing this text, since we aren't
localizing anything else here; besides, if we're on the edge of
OOM, it's unlikely gettext() would work.

Report and fix by Sergio Conde Gómez in bug #15624.

Discussion: https://postgr.es/m/15624-4dea54091a2864e6@postgresql.org
2019-02-08 13:30:42 -05:00
Peter Eisentraut 3d462f0861 Fix error handling around ssl_*_protocol_version settings
In case of a reload, we just want to LOG errors instead of FATAL when
processing SSL configuration, but the more recent code for the
ssl_*_protocol_version settings didn't behave like that.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
2019-02-08 11:58:19 +01:00
Peter Eisentraut 08d25d7850 Add some const decorations
These mainly help understanding the function signatures better.
2019-02-08 10:13:24 +01:00
Michael Paquier 3677a0b26b Add pg_partition_root to display top-most parent of a partition tree
This is useful when looking at partition trees with multiple layers, and
combined with pg_partition_tree, it provides the possibility to show up
an entire tree by just knowing one member at any level.

Author: Michael Paquier
Reviewed-by: Álvaro Herrera, Amit Langote
Discussion: https://postgr.es/m/20181207014015.GP2407@paquier.xyz
2019-02-08 08:56:14 +09:00
Tom Lane 34ea1ab7fd Split create_foreignscan_path() into three functions.
Up to now postgres_fdw has been using create_foreignscan_path() to
generate not only base-relation paths, but also paths for foreign joins
and foreign upperrels.  This is wrong, because create_foreignscan_path()
calls get_baserel_parampathinfo() which will only do the right thing for
baserels.  It accidentally fails to fail for unparameterized paths, which
are the only ones postgres_fdw (thought it) was handling, but we really
need different APIs for the baserel and join cases.

In HEAD, the best thing to do seems to be to split up the baserel,
joinrel, and upperrel cases into three functions so that they can
have different APIs.  I haven't actually given create_foreign_join_path
a different API in this commit: we should spend a bit of time thinking
about just what we want to do there, since perhaps FDWs would want to
do something different from the build-up-a-join-pairwise approach that
get_joinrel_parampathinfo expects.  In the meantime, since postgres_fdw
isn't prepared to generate parameterized joins anyway, just give it a
defense against trying to plan joins with lateral refs.

In addition (and this is what triggered this whole mess) fix bug #15613
from Srinivasan S A, by teaching file_fdw and postgres_fdw that plain
baserel foreign paths still have outer refs if the relation has
lateral_relids.  Add some assertions in relnode.c to catch future
occurrences of the same error --- in particular, to catch other FDWs
doing that, but also as backstop against core-code mistakes like the
one fixed by commit bdd9a99aa.

Bug #15613 also needs to be fixed in the back branches, but the
appropriate fix will look quite a bit different there, since we don't
want to assume that existing FDWs get the word right away.

Discussion: https://postgr.es/m/15613-092be1be9576c728@postgresql.org
2019-02-07 13:11:12 -05:00
Andrew Dunstan 51b025933d Fix perl searchpath for gen_keywordlist.pl
as found by running src/tools/perlcheck/pgperlsyncheck
2019-02-07 11:14:29 -05:00
Andrew Dunstan 8ce641f997 Fix searchpath and module location for pg_rewind and ssl TAP tests
The modules RewindTest.pm and ServerSetup.pm are really only useful for
TAP tests, so they really belong in the TAP test directories. In
addition, ServerSetup.pm is renamed to SSLServer.pm.

The test scripts have their own directories added to the search path so
that the relocated modules will be found, regardless of where the tests
are run from, even on modern perl where "." is no longer in the
searchpath.

Discussion: https://postgr.es/m/e4b0f366-269c-73c3-9c90-d9cb0f4db1f9@2ndQuadrant.com

Backpatch as appropriate to 9.5
2019-02-07 11:09:08 -05:00
Peter Eisentraut 0c1f8f166c Use EXECUTE FUNCTION syntax for triggers more
Change pg_dump and ruleutils.c to use the FUNCTION keyword instead of
PROCEDURE in trigger and event trigger definitions.

This completes the pieces of the transition started in
0a63f996e0 that were kept out of
PostgreSQL 11 because of the required catversion change.

Discussion: https://www.postgresql.org/message-id/381bef53-f7be-29c8-d977-948e389161d6@2ndquadrant.com
2019-02-07 09:21:34 +01:00
Peter Eisentraut 13b89f96d0 Allow some recovery parameters to be changed with reload
Change

archive_cleanup_command
promote_trigger_file
recovery_end_command
recovery_min_apply_delay

from PGC_POSTMASTER to PGC_SIGHUP.  This did not require any further
changes.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/ca28011a-cfaa-565c-d622-c1907c33ecf7%402ndquadrant.com
2019-02-07 08:34:48 +01:00
Peter Eisentraut cd5afd8175 Add collation assignment to CALL statement
Otherwise functions that require collation information will not have
it if they are called in arguments to a CALL statement.

Reported-by: Jean-Marc Voillequin <Jean-Marc.Voillequin@moodys.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/1EC8157EB499BF459A516ADCF135ADCE39FFAC54%40LON-WGMSX712.ad.moodys.net
2019-02-07 08:25:47 +01:00
Michael Paquier f339a998ff Align better test output regex with grammar in pg_dump TAP tests
This enforces one-or-more character matches in the regular expressions
for pg_dump testing on SQL syntax output where zero-or-more matches
implies a syntax error.

Author: Daniel Gustafsson
Reviewed-by: David G. Johnston, Michael Paquier
Discussion: https://postgr.es/m/B313C32C-0E24-4AFB-95FF-6DA0C4E18A89@yesql.se
2019-02-07 10:04:55 +09:00
Michael Paquier 537898bd81 Add more tests for CREATE TABLE AS with WITH NO DATA
The relation creation is done at executor startup, however the main
regression test suite is lacking scenarios where no data is inserted
which is something that can happen when using EXECUTE or EXPLAIN with
CREATE TABLE AS and WITH NO DATA.

Some patches are worked on to reshape the way CTAS relations are
created, so this makes sure that we do not miss some query patterns
already supported.

Reported-by: Andreas Karlsson
Author: Michael Paquier
Reviewed-by: Andreas Karlsson
Discussion: https://postgr.es/m/20190206091817.GB14980@paquier.xyz
2019-02-07 09:21:57 +09:00
Peter Eisentraut 727921f466 Hide cascade messages in collate tests
These are not relevant to the tests and would just uselessly bloat
patches.
2019-02-06 22:17:57 +01:00
Tom Lane bdd9a99aac Propagate lateral-reference information to indirect descendant relations.
create_lateral_join_info() computes a bunch of information about lateral
references between base relations, and then attempts to propagate those
markings to appendrel children of the original base relations.  But the
original coding neglected the possibility of indirect descendants
(grandchildren etc).  During v11 development we noticed that this was
wrong for partitioned-table cases, but failed to realize that it was just
as wrong for any appendrel.  While the case can't arise for appendrels
derived from traditional table inheritance (because we make a flat
appendrel for that), nested appendrels can arise from nested UNION ALL
subqueries.  Failure to mark the lower-level relations as having lateral
references leads to confusion in add_paths_to_append_rel about whether
unparameterized paths can be built.  It's not very clear whether that
leads to any user-visible misbehavior; the lack of field reports suggests
that it may cause nothing worse than minor cost misestimation.  Still,
it's a bug, and it leads to failures of Asserts that I intend to add
later.

To fix, we need to propagate information from all appendrel parents,
not just those that are RELOPT_BASERELs.  We can still do it in one
pass, if we rely on the append_rel_list to be ordered with ancestor
relationships before descendant ones; add assertions checking that.
While fixing this, we can make a small performance improvement by
traversing the append_rel_list just once instead of separately for
each appendrel parent relation.

Noted while investigating bug #15613, though this patch does not fix
that (which is why I'm not committing the related Asserts yet).

Discussion: https://postgr.es/m/3951.1549403812@sss.pgh.pa.us
2019-02-06 12:45:21 -05:00
Andrew Dunstan 592123efbb Unify searchpath and do file logic in MSVC build scripts.
Commit f83419b739 failed to notice that mkvcbuild.pl and build.pl use
different searchpath and do-file logic, breaking the latter, so it is
adjusted to use the same logic as mkvcbuild.pl.
2019-02-06 07:36:02 -05:00
Andres Freund 171e0418b0 Fix heap_getattr() handling of fast defaults.
Previously heap_getattr() returned NULL for attributes with a fast
default value (c.f. 16828d5c02), as it had no handling whatsoever
for that case.

A previous fix, 7636e5c60f, attempted to fix issues caused by this
oversight, but just expanding OLD tuples for triggers doesn't actually
solve the underlying issue.

One known consequence of this bug is that the check for HOT updates
can return the wrong result, when a previously fast-default'ed column
is set to NULL. Which in turn means that an index over a column with
fast default'ed columns might be corrupt if the underlying column(s)
allow NULLs.

Fix by handling fast default columns in heap_getattr(), remove now
superfluous expansion in GetTupleForTrigger().

Author: Andres Freund
Discussion: https://postgr.es/m/20190201162404.onngi77f26baem4g@alap3.anarazel.de
Backpatch: 11, where fast defaults were introduced
2019-02-06 01:09:32 -08:00
Michael Paquier d07fb6810e Tighten some regexes with proper character escaping in pg_dump TAP tests
Some tests have been using regular expressions which have been lax in
escaping dots, which may cause tests to pass when they should not.  This
make the whole set of tests more robust where needed.

Author: David Rowley
Reviewed-by: Daniel Gustafsson, Michael Paquier
Discussion: https://postgr.es/m/CAKJS1f9jD8aVo1BTH+Vgwd=f-ynbuRVrS90XbWMT6UigaOQJTA@mail.gmail.com
2019-02-06 17:33:55 +09:00
Andrew Dunstan f83419b739 Fix included file path for modern perl
Contrary to the comment on 772d4b76, only paths starting with "./" or
"../" are considered relative to the current working directory by perl's
"do" function. So this patch converts all the relevant cases to use "./"
paths. This only affects MSVC.

Backpatch to all live branches.
2019-02-05 19:27:47 -05:00
Andrew Dunstan 8916b33e52 Keep perl style checker happy
It doesn't like code before "use strict;".
2019-02-05 15:16:55 -05:00
Tom Lane d63dc0aa0c Update time zone data files to tzdata release 2018i.
DST law changes in Kazakhstan, Metlakatla, and São Tomé and Príncipe.
Kazakhstan's Qyzylorda zone is split in two, creating a new zone
Asia/Qostanay, as some areas did not change UTC offset.
Historical corrections for Hong Kong and numerous Pacific islands.
2019-02-05 10:58:53 -05:00