Commit Graph

20175 Commits

Author SHA1 Message Date
Michael Paquier 9989d37d1c Remove XLogFileNameP() from the tree
XLogFileNameP() is a wrapper routine able to build a palloc'd string for
a WAL segment name, which is used for error string generation.  There
were several code paths where it gets called in a critical section,
where memory allocation is not allowed.  This results in triggering
an assertion failure instead of generating the wanted error message.

Another, more annoying, problem is that if the allocation to generate
the WAL segment name fails on OOM, then the failure would be escalated
to a PANIC.

This removes the routine and all its callers are replaced with a logic
using a fixed-size buffer.  This way, all the existing mistakes are
fixed and future ones are prevented.

Author: Masahiko Sawada
Reviewed-by: Michael Paquier, Álvaro Herrera
Discussion: https://postgr.es/m/CA+fd4k5gC9H4uoWMLg9K_QfNrnkkdEw+-AFveob9YX7z8JnKTA@mail.gmail.com
2019-12-03 15:06:04 +09:00
Tom Lane 55a1954da1 Fix EXPLAIN's column alias output for mismatched child tables.
If an inheritance/partitioning parent table is assigned some column
alias names in the query, EXPLAIN mapped those aliases onto the
child tables' columns by physical position, resulting in bogus output
if a child table's columns aren't one-for-one with the parent's.

To fix, make expand_single_inheritance_child() generate a correctly
re-mapped column alias list, rather than just copying the parent
RTE's alias node.  (We have to fill the alias field, not just
adjust the eref field, because ruleutils.c will ignore eref in
favor of looking at the real column names.)

This means that child tables will now always have alias fields in
plan rtables, where before they might not have.  That results in
a rather substantial set of regression test output changes:
EXPLAIN will now always show child tables with aliases that match
the parent table (usually with "_N" appended for uniqueness).
But that seems like a net positive for understandability, since
the parent alias corresponds to something that actually appeared
in the original query, while the child table names didn't.
(Note that this does not change anything for cases where an explicit
table alias was written in the query for the parent table; it
just makes cases without such aliases behave similarly to that.)
Hence, while we could avoid these subsidiary changes if we made
inherit.c more complicated, we choose not to.

Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us
2019-12-02 19:08:10 -05:00
Tom Lane ce76c0ba53 Add a reverse-translation column number array to struct AppendRelInfo.
This provides for cheaper mapping of child columns back to parent
columns.  The one existing use-case in examine_simple_variable()
would hardly justify this by itself; but an upcoming bug fix will
make use of this array in a mainstream code path, and it seems
likely that we'll find other uses for it as we continue to build
out the partitioning infrastructure.

Discussion: https://postgr.es/m/12424.1575168015@sss.pgh.pa.us
2019-12-02 18:05:29 -05:00
Tom Lane c35b714caf Fix misbehavior with expression indexes on ON COMMIT DELETE ROWS tables.
We implement ON COMMIT DELETE ROWS by truncating tables marked that
way, which requires also truncating/rebuilding their indexes.  But
RelationTruncateIndexes asks the relcache for up-to-date copies of any
index expressions, which may cause execution of eval_const_expressions
on them, which can result in actual execution of subexpressions.
This is a bad thing to have happening during ON COMMIT.  Manuel Rigger
reported that use of a SQL function resulted in crashes due to
expectations that ActiveSnapshot would be set, which it isn't.
The most obvious fix perhaps would be to push a snapshot during
PreCommit_on_commit_actions, but I think that would just open the door
to more problems: CommitTransaction explicitly expects that no
user-defined code can be running at this point.

Fortunately, since we know that no tuples exist to be indexed, there
seems no need to use the real index expressions or predicates during
RelationTruncateIndexes.  We can set up dummy index expressions
instead (we do need something that will expose the right data type,
as there are places that build index tupdescs based on this), and
just ignore predicates and exclusion constraints.

In a green field it'd likely be better to reimplement ON COMMIT DELETE
ROWS using the same "init fork" infrastructure used for unlogged
relations.  That seems impractical without catalog changes though,
and even without that it'd be too big a change to back-patch.
So for now do it like this.

Per private report from Manuel Rigger.  This has been broken forever,
so back-patch to all supported branches.
2019-12-01 13:09:26 -05:00
Peter Eisentraut e6c2d17c53 Small code simplification
FLOAT8PASSBYVAL can be used instead of USE_FLOAT8_BYVAL here.
2019-11-29 10:55:31 +01:00
Peter Eisentraut c4a7a392ec Make allow_system_table_mods settable at run time
Make allow_system_table_mods settable at run time by superusers.  It
was previously postmaster start only.

We don't want to make system catalog DDL wide-open, but there are
occasionally useful things to do like setting reloptions or statistics
on a busy system table, and blocking those doesn't help anyone.  Also,
this enables the possibility of writing a test suite for this setting.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/8b00ea5e-28a7-88ba-e848-21528b632354%402ndquadrant.com
2019-11-29 10:22:13 +01:00
Peter Eisentraut 508bf95b76 Remove any-user DML capability from allow_system_table_mods
Previously, allow_system_table_mods allowed a non-superuser to do DML
on a system table without further permission checks.  This has been
removed, as it was quite inconsistent with the rest of the meaning of
this setting.  (Since allow_system_table_mods was previously only
accessible with a server restart, it is unlikely that anyone was using
this possibility.)

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/8b00ea5e-28a7-88ba-e848-21528b632354%402ndquadrant.com
2019-11-29 10:22:13 +01:00
Peter Eisentraut d4feadeca1 Add error position to an error message
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/6e7aa4a1-be6a-1a75-b1f9-83a678e5184a@2ndquadrant.com
2019-11-29 09:10:17 +01:00
Tomas Vondra 6d61c3f1cb Remove unnecessary clauses_attnums variable
Commit c676e659b2 reworked how choose_best_statistics() picks the best
extended statistics, but failed to remove clauses_attnums which is now
unnecessary. So get rid of it and backpatch to 12, same as c676e659b2.

Author: Tomas Vondra
Discussion: https://postgr.es/m/CA+u7OA7H5rcE2=8f263w4NZD6ipO_XOrYB816nuLXbmSTH9pQQ@mail.gmail.com
Backpatch-through: 12
2019-11-28 23:25:14 +01:00
Tomas Vondra c676e659b2 Fix choose_best_statistics to check clauses individually
When picking the best extended statistics object for a list of clauses,
it's not enough to look at attnums extracted from the clause list as a
whole. Consider for example this query with OR clauses:

   SELECT * FROM t WHERE (t.a = 1) OR (t.b = 1) OR (t.c = 1)

with a statistics defined on columns (a,b). Relying on attnums extracted
from the whole OR clause, we'd consider the statistics usable. That does
not work, as we see the conditions as a single OR-clause, referencing an
attribute not covered by the statistic, leading to empty list of clauses
to be estimated using the statistics and an assert failure.

This changes choose_best_statistics to check which clauses are actually
covered, and only using attributes from the fully covered ones. For the
previous example this means the statistics object will not be considered
as compatible with the OR-clause.

Backpatch to 12, where MCVs were introduced. The issue does not affect
older versions because functional dependencies don't handle OR clauses.

Author: Tomas Vondra
Reviewed-by: Dean Rasheed
Reported-By: Manuel Rigger
Discussion: https://postgr.es/m/CA+u7OA7H5rcE2=8f263w4NZD6ipO_XOrYB816nuLXbmSTH9pQQ@mail.gmail.com
Backpatch-through: 12
2019-11-28 22:20:45 +01:00
Alvaro Herrera 3974c4a724 Remove useless "return;" lines
Discussion: https://postgr.es/m/20191128144653.GA27883@alvherre.pgsql
2019-11-28 16:48:37 -03:00
Etsuro Fujita 47a3c7fa06 Fix typo in comment. 2019-11-27 16:00:45 +09:00
Tom Lane 553d2ec271 Allow access to child table statistics if user can read parent table.
The fix for CVE-2017-7484 disallowed use of pg_statistic data for
planning purposes if the user would not be able to select the associated
column and a non-leakproof function is to be applied to the statistics
values.  That turns out to disable use of pg_statistic data in some
common cases involving inheritance/partitioning, where the user does
have permission to select from the parent table that was actually named
in the query, but not from a child table whose stats are needed.  Since,
in non-corner cases, the user *can* select the child table's data via
the parent, this restriction is not actually useful from a security
standpoint.  Improve the logic so that we also check the permissions of
the originally-named table, and allow access if select permission exists
for that.

When checking access to stats for a simple child column, we can map
the child column number back to the parent, and perform this test
exactly (including not allowing access if the child column isn't
exposed by the parent).  For expression indexes, the current logic
just insists on whole-table select access, and this patch allows
access if the user can select the whole parent table.  In principle,
if the child table has extra columns, this might allow access to
stats on columns the user can't read.  In practice, it's unlikely
that the planner is going to do any stats calculations involving
expressions that are not visible to the query, so we'll ignore that
fine point for now.  Perhaps someday we'll improve that logic to
detect exactly which columns are used by an expression index ...
but today is not that day.

Back-patch to v11.  The issue was created in 9.2 and up by the
CVE-2017-7484 fix, but this patch depends on the append_rel_array[]
planner data structure which only exists in v11 and up.  In
practice the issue is most urgent with partitioned tables, so
fixing v11 and later should satisfy much of the practical need.

Dilip Kumar and Amit Langote, with some kibitzing by me

Discussion: https://postgr.es/m/3876.1531261875@sss.pgh.pa.us
2019-11-26 14:41:48 -05:00
Michael Paquier 12198239c0 Add safeguards for pg_fsync() called with incorrectly-opened fds
On some platforms, fsync() returns EBADFD when opening a file descriptor
with O_RDONLY (read-only), leading ultimately now to a PANIC to prevent
data corruption.

This commit adds a new sanity check in pg_fsync() based on fcntl() to
make sure that we don't repeat again mistakes with incorrectly-set file
descriptors so as problems are detected at an early stage.  Without
that, such errors could only be detected after running Postgres on a
specific supported platform for the culprit code path, which could take
some time before being found.  b8e19b93 was a fix for such a problem,
which got undetected for more than 5 years, and a586cc4b fixed another
similar issue.

Note that the new check added works as well when fsync=off is
configured, so as all regression tests would detect problems as long as
assertions are enabled.  fcntl() being not available on Windows, the
new checks do not happen there.

Author: Michael Paquier
Reviewed-by: Mark Dilger
Discussion: https://postgr.es/m/20191009062640.GB21379@paquier.xyz
2019-11-26 13:32:52 +09:00
Amit Kapila 080313f829 Don't shut down Gather[Merge] early under Limit.
Revert part of commit 19df1702f5.

Early shutdown was added by that commit so that we could collect
statistics from workers, but unfortunately, it interacted badly with
rescans.  The problem is that we ended up destroying the parallel context
which is required for rescans.  This leads to rescans of a Limit node over
a Gather node to produce unpredictable results as it tries to access
destroyed parallel context.  By reverting the early shutdown code, we
might lose statistics in some cases of Limit over Gather [Merge], but that
will require further study to fix.

Reported-by: Jerry Sievers
Diagnosed-by: Thomas Munro
Author: Amit Kapila, testcase by Vignesh C
Backpatch-through: 9.6
Discussion: https://postgr.es/m/87ims2amh6.fsf@jsievers.enova.com
2019-11-26 08:30:24 +05:30
Robert Haas 0d3c3aae33 Use procsignal_sigusr1_handler for auxiliary processes.
AuxiliaryProcessMain does ProcSignalInit, so one might expect that
auxiliary processes would need to respond to SendProcSignal, but none
of the auxiliary processes do that. Change them to use
procsignal_sigusr1_handler instead of their own private handlers so
that they do. Besides seeming more correct, this is also less code. It
shouldn't make any functional difference right now because, as far as
we know, there are no current cases where SendProcSignal targets an
auxiliary process, but there are plans to change that in the future.

Andres Freund

Discussion: http://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de
2019-11-25 16:16:27 -05:00
Alvaro Herrera 0dc8ead463 Refactor WAL file-reading code into WALRead()
XLogReader, walsender and pg_waldump all had their own routines to read
data from WAL files to memory, with slightly different approaches
according to the particular conditions of each environment.  There's a
lot of commonality, so we can refactor that into a single routine
WALRead in XLogReader, and move the differences to a separate (simpler)
callback that just opens the next WAL-segment.  This results in a
clearer (ahem) code flow.

The error reporting needs are covered by filling in a new error-info
struct, WALReadError, and it's the caller's responsibility to act on it.
The backend has WALReadRaiseError() to do so.

We no longer ever need to seek in this interface; switch to using
pg_pread().

Author: Antonin Houska, with contributions from Álvaro Herrera
Reviewed-by: Michaël Paquier, Kyotaro Horiguchi
Discussion: https://postgr.es/m/14984.1554998742@spoje.net
2019-11-25 15:04:54 -03:00
Tom Lane 5883f5fe27 Fix unportable printf format introduced in commit 9290ad198.
"%ld" is not an acceptable format spec for int64 variables, though
it accidentally works on most non-Windows 64-bit platforms.  Follow
the lead of commit 6a1cd8b92, and use "%lld" with an explicit cast
to long long.  Per buildfarm.
2019-11-25 10:48:36 -05:00
Amit Kapila e0487223ec Make the order of the header file includes consistent.
Similar to commits 14aec03502, 7e735035f2 and dddf4cdc33, this commit
makes the order of header file inclusion consistent in more places.

Author: Vignesh C
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
2019-11-25 08:08:57 +05:30
Michael Paquier 2aa84520b3 Fix inconsistent variable name in static function of mac8.c
Both argument names were reversed in the declaration of the function.

Author: Ranier Vilela
Discussion: https://postgr.es/m/MN2PR18MB292755AEFF9A9144B220ABEEE34B0@MN2PR18MB2927.namprd18.prod.outlook.com
2019-11-25 09:57:35 +09:00
Michael Paquier 4cb658af70 Refactor reloption handling for index AMs in-core
This reworks the reloption parsing and build of a couple of index AMs by
creating new structures for each index AM's options.  This split was
already done for BRIN, GIN and GiST (which actually has a fillfactor
parameter), but not for hash, B-tree and SPGiST which relied on
StdRdOptions due to an overlap with the default option set.

This saves a couple of bytes for rd_options in each relcache entry with
indexes making use of relation options, and brings more consistency
between all index AMs.  While on it, add a couple of AssertMacro() calls
to make sure that utility macros to grab values of reloptions are used
with the expected index AM.

Author: Nikolay Shaplov
Reviewed-by: Amit Langote, Michael Paquier, Álvaro Herrera, Dent John
Discussion: https://postgr.es/m/4127670.gFlpRb6XCm@x200m
2019-11-25 09:40:53 +09:00
Tom Lane d3aa114ac4 Doc: improve discussion of race conditions involved in LISTEN.
The user docs didn't really explain how to use LISTEN safely,
so clarify that.  Also clean up some fuzzy-headed explanations
in comments.  No code changes.

Discussion: https://postgr.es/m/3ac7f397-4d5f-be8e-f354-440020675694@gmail.com
2019-11-24 18:03:39 -05:00
Tom Lane 6b802cfc7f Avoid assertion failure with LISTEN in a serializable transaction.
If LISTEN is the only action in a serializable-mode transaction,
and the session was not previously listening, and the notify queue
is not empty, predicate.c reported an assertion failure.  That
happened because we'd acquire the transaction's initial snapshot
during PreCommit_Notify, which was called *after* predicate.c
expects any such snapshot to have been established.

To fix, just swap the order of the PreCommit_Notify and
PreCommit_CheckForSerializationFailure calls during CommitTransaction.
This will imply holding the notify-insertion lock slightly longer,
but the difference could only be meaningful in serializable mode,
which is an expensive option anyway.

It appears that this is just an assertion failure, with no
consequences in non-assert builds.  A snapshot used only to scan
the notify queue could not have been involved in any serialization
conflicts, so there would be nothing for
PreCommit_CheckForSerializationFailure to do except assign it a
prepareSeqNo and set the SXACT_FLAG_PREPARED flag.  And given no
conflicts, neither of those omissions affect the behavior of
ReleasePredicateLocks.  This admittedly once-over-lightly analysis
is backed up by the lack of field reports of trouble.

Per report from Mark Dilger.  The bug is old, so back-patch to all
supported branches; but the new test case only goes back to 9.6,
for lack of adequate isolationtester infrastructure before that.

Discussion: https://postgr.es/m/3ac7f397-4d5f-be8e-f354-440020675694@gmail.com
Discussion: https://postgr.es/m/13881.1574557302@sss.pgh.pa.us
2019-11-24 15:57:49 -05:00
Tom Lane 7900269724 Stabilize NOTIFY behavior by transmitting notifies before ReadyForQuery.
This patch ensures that, if any notify messages were received during
a just-finished transaction, they get sent to the frontend just before
not just after the ReadyForQuery message.  With libpq and other client
libraries that act similarly, this guarantees that the client will see
the notify messages as available as soon as it thinks the transaction
is done.

This probably makes no difference in practice, since in realistic
use-cases the application would have to cope with asynchronous
arrival of notify events anyhow.  However, it makes it a lot easier
to build cross-session-notify test cases with stable behavior.
I'm a bit surprised now that we've not seen any buildfarm instability
with the test cases added by commit b10f40bf0.  Tests that I intend
to add in an upcoming bug fix are definitely unstable without this.

Back-patch to 9.6, which is as far back as we can do NOTIFY testing
with the isolationtester infrastructure.

Discussion: https://postgr.es/m/13881.1574557302@sss.pgh.pa.us
2019-11-24 14:42:59 -05:00
Tom Lane 8b7ae5a82d Stabilize the results of pg_notification_queue_usage().
This function wasn't touched in commit 51004c717, but that turns out
to be a bad idea, because its results now include any dead space
that exists in the NOTIFY queue on account of our being lazy about
advancing the queue tail.  Notably, the isolation tests now fail
if run twice without a server restart between, because async-notify's
first test of the function will already show a positive value.
It seems likely that end users would be equally unhappy about the
result's instability.  To fix, just make the function call
asyncQueueAdvanceTail before computing its result.  That should end
in producing the same value as before, and it's hard to believe that
there's any practical use-case where pg_notification_queue_usage()
is called so often as to create a performance degradation, especially
compared to what we did before.

Out of paranoia, also mark this function parallel-restricted (it
was volatile, but parallel-safe by default, before).  Although the
code seems to work fine when run in a parallel worker, that's outside
the design scope of async.c, and it's a bit scary to have intentional
side-effects happening in a parallel worker.  There seems no plausible
use-case where it'd be important to try to parallelize this, so let's
not take any risk of introducing new bugs.

In passing, re-pgindent async.c and run reformat-dat-files on
pg_proc.dat, just because I'm a neatnik.

Discussion: https://postgr.es/m/13881.1574557302@sss.pgh.pa.us
2019-11-24 14:09:33 -05:00
Alvaro Herrera 45ff049e28 Remove debugging aid
This Assert(false) was not supposed to be in the committed copy.

Reported by: Tom Lane
Discussion: https://postgr.es/m/26476.1574525468@sss.pgh.pa.us
2019-11-23 13:19:20 -03:00
Joe Conway f7a2002e82 Add object TRUNCATE hook
All operations with acl permissions checks should have a corresponding hook
so that, for example, mandatory access control (MAC) may be enforced by an
extension. The command TRUNCATE is missing this hook, so add it. Patch by
Yuli Khodorkovskiy with some editorialization by me. Based on the discussion
not back-patched. A separate patch will exercise the hook in the sepgsql
extension.

Author: Yuli Khodorkovskiy
Reviewed-by: Joe Conway
Discussion: https://postgr.es/m/CAFL5wJcomybj1Xdw7qWmPJRpGuFukKgNrDb6uVBaCMgYS9dkaA%40mail.gmail.com
2019-11-23 10:39:20 -05:00
Tom Lane 4d9ceb0018 Fix bogus tuple-slot management in logical replication UPDATE handling.
slot_modify_cstrings seriously abused the TupleTableSlot API by relying
on a slot's underlying data to stay valid across ExecClearTuple.  Since
this abuse was also quite undocumented, it's little surprise that the
case got broken during the v12 slot rewrites.  As reported in bug #16129
from Ondřej Jirman, this could lead to crashes or data corruption when
a logical replication subscriber processes a row update.  Problems would
only arise if the subscriber's table contained columns of pass-by-ref
types that were not being copied from the publisher.

Fix by explicitly copying the datum/isnull arrays from the source slot
that the old row was in already.  This ends up being about the same
thing that happened pre-v12, but hopefully in a less opaque and
fragile way.

We might've caught the problem sooner if there were any test cases
dealing with updates involving non-replicated or dropped columns.
Now there are.

Back-patch to v10 where this code came in.  Even though the failure
does not manifest before v12, IMO this code is too fragile to leave
as-is.  In any case we certainly want the additional test coverage.

Patch by me; thanks to Tomas Vondra for initial investigation.

Discussion: https://postgr.es/m/16129-a0c0f48e71741e5f@postgresql.org
2019-11-22 11:31:19 -05:00
Tom Lane 4a0aab14dc Defend against self-referential views in relation_is_updatable().
While a self-referential view doesn't actually work, it's possible
to create one, and it turns out that this breaks some of the
information_schema views.  Those views call relation_is_updatable(),
which neglected to consider the hazards of being recursive.  In
older PG versions you get a "stack depth limit exceeded" error,
but since v10 it'd recurse to the point of stack overrun and crash,
because commit a4c35ea1c took out the expression_returns_set() call
that was incidentally checking the stack depth.

Since this function is only used by information_schema views, it
seems like it'd be better to return "not updatable" than suffer
an error.  Hence, add tracking of what views we're examining,
in just the same way that the nearby fireRIRrules() code detects
self-referential views.  I added a check_stack_depth() call too,
just to be defensive.

Per private report from Manuel Rigger.  Back-patch to all
supported versions.
2019-11-21 16:21:43 -05:00
Peter Eisentraut 2e4db241bf Remove configure --disable-float4-byval
This build option was only useful to maintain compatibility for
version-0 functions, but those are no longer supported, so this option
can be removed.

float4 is now always pass-by-value; the pass-by-reference code path is
completely removed.

Discussion: https://www.postgresql.org/message-id/flat/f3e1e576-2749-bbd7-2d57-3f9dcf75255a@2ndquadrant.com
2019-11-21 18:29:21 +01:00
Fujii Masao e6d8069522 Make DROP DATABASE command generate less WAL records.
Previously DROP DATABASE generated as many XLOG_DBASE_DROP WAL records
as the number of tablespaces that the database to drop uses. This caused
the scans of shared_buffers as many times as the number of the tablespaces
during recovery because WAL replay of one XLOG_DBASE_DROP record needs
that full scan. This could make the recovery time longer especially
when shared_buffers is large.

This commit changes DROP DATABASE so that it generates only one
XLOG_DBASE_DROP record, and registers the information of all the tablespaces
into it. Then, WAL replay of XLOG_DBASE_DROP record needs full scan of
shared_buffers only once, and which may improve the recovery performance.

Author: Fujii Masao
Reviewed-by: Kirk Jamison, Simon Riggs
Discussion: https://postgr.es/m/CAHGQGwF8YwNH0ZaL+2wjZPkj+ji9UhC+Z4ScnG97WKtVY5L9iw@mail.gmail.com
2019-11-21 21:10:37 +09:00
Fujii Masao 30840c92ac Allow ALTER VIEW command to rename the column in the view.
ALTER TABLE RENAME COLUMN command always can be used to rename the column
in the view, but it's reasonable to add that syntax to ALTER VIEW too.

Author: Fujii Masao
Reviewed-by: Ibrar Ahmed, Yu Kimura
Discussion: https://postgr.es/m/CAHGQGwHoQMD3b-MqTLcp1MgdhCpOKU7QNRwjFooT4_d+ti5v6g@mail.gmail.com
2019-11-21 19:55:13 +09:00
Amit Kapila 9290ad198b Track statistics for spilling of changes from ReorderBuffer.
This adds the statistics about transactions spilled to disk from
ReorderBuffer.  Users can query the pg_stat_replication view to check
these stats.

Author: Tomas Vondra, with bug-fixes and minor changes by Dilip Kumar
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2019-11-21 08:06:51 +05:30
Michael Paquier 168d206400 Provide statistics for hypothetical BRIN indexes
Trying to use hypothetical indexes with BRIN currently fails when trying
to access a relation that does not exist when looking for the
statistics.  With the current API, it is not possible to easily pass
a value for pages_per_range down to the hypothetical index, so this
makes use of the default value of BRIN_DEFAULT_PAGES_PER_RANGE, which
should be fine enough in most cases.

Being able to refine or enforce the hypothetical costs in more
optimistic ways would require more refactoring by filling in the
statistics when building IndexOptInfo in plancat.c.  This would involve
ABI breakages around the costing routines, something not fit for stable
branches.

This is broken since 7e534ad, so backpatch down to v10.

Author: Julien Rouhaud, Heikki Linnakangas
Reviewed-by: Álvaro Herrera, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CAOBaU_ZH0LKEA8VFCocr6Lpte1ab0b6FpvgS0y4way+RPSXfYg@mail.gmail.com
Backpatch-through: 10
2019-11-21 10:23:28 +09:00
Tom Lane 9ff5b699ed Sync patternsel_common's operator selection logic with pattern_prefix's.
Make patternsel_common() select the comparison operators to use with
hardwired logic that matches pattern_prefix()'s new logic, eliminating
its dependencies on particular index opfamilies.

This shouldn't change any behavior, as it's just replacing runtime
operator lookups with the same values hard-wired.  But it makes these
closely-related functions look more alike, and saving some runtime
syscache lookups is worth something.

Actually, it's not quite true that this is zero behavioral change:
when estimating for a column of type "name", the comparison constant
will be kept as "text" not coerced to "name".  But that's more correct
anyway, and it allows additional simplification of the coercion logic,
again syncing this more closely with pattern_prefix().

Per consideration of a report from Manuel Rigger.

Discussion: https://postgr.es/m/CA+u7OA7nnGYy8rY0vdTe811NuA+Frr9nbcBO9u2Z+JxqNaud+g@mail.gmail.com
2019-11-20 15:00:18 -05:00
Peter Geoghegan 9f0f12ac57 Fix HeapTupleSatisfiesNonVacuumable() comment.
Oversight in commit 63746189b2.
2019-11-20 11:36:54 -08:00
Tom Lane 2ddedcafca Reduce match_pattern_prefix()'s dependencies on index opfamilies.
Historically, the planner's LIKE/regex index optimizations were only
carried out for specific index opfamilies.  That's never been a great
idea from the standpoint of extensibility, but it didn't matter so
much as long as we had no practical way to extend such behaviors anyway.
With the addition of planner support functions, and in view of ongoing
work to support additional table and index AMs, it seems like a good
time to relax this.

Hence, recast the decisions in match_pattern_prefix() so that rather
than decide which operators to generate by looking at what the index
opfamily contains, we decide which operators to generate a-priori
and then see if the opfamily supports them.  This is much more
defensible from a semantic standpoint anyway, since we know the
semantics of the chosen operators precisely, and we only need to
assume that the opfamily correctly implements operators it claims
to support.

The existing "pattern" opfamilies put a crimp in this approach, since
we need to select the pattern operators if we want those to work.
So we still have to special-case those opfamilies.  But that seems
all right, since in view of the addition of collations, the pattern
opfamilies seem like a legacy hack that nobody will be building on.

The only immediate effect of this change, so far as the core code is
concerned, is that anchored LIKE/regex patterns can be mapped onto
BRIN index searches, and exact-match patterns can be mapped onto hash
indexes, not only btree and spgist indexes as before.  That's not a
terribly exciting result, but it does fix an omission mentioned in
the ancient comments here.

Note: no catversion bump, even though this touches pg_operator.dat,
because it's only adding OID macros not changing the contents of
postgres.bki.

Per consideration of a report from Manuel Rigger.

Discussion: https://postgr.es/m/CA+u7OA7nnGYy8rY0vdTe811NuA+Frr9nbcBO9u2Z+JxqNaud+g@mail.gmail.com
2019-11-20 14:13:04 -05:00
Tom Lane b3c265d7be Fix corner-case failure in match_pattern_prefix().
The planner's optimization code for LIKE and regex operators could
error out with a complaint like "no = operator for opfamily NNN"
if someone created a binary-compatible index (for example, a
bpchar_ops index on a text column) on the LIKE's left argument.

This is a consequence of careless refactoring in commit 74dfe58a5.
The old code in match_special_index_operator only accepted specific
combinations of the pattern operator and the index opclass, thereby
indirectly guaranteeing that the opclass would have a comparison
operator with the same LHS input type as the pattern operator.
While moving the logic out to a planner support function, I simplified
that test in a way that no longer guarantees that.  Really though we'd
like an altogether weaker dependency on the opclass, so rather than
put back exactly the old code, just allow lookup failure.  I have in
mind now to rewrite this logic completely, but this is the minimum
change needed to fix the bug in v12.

Per report from Manuel Rigger.  Back-patch to v12 where the mistake
came in.

Discussion: https://postgr.es/m/CA+u7OA7nnGYy8rY0vdTe811NuA+Frr9nbcBO9u2Z+JxqNaud+g@mail.gmail.com
2019-11-19 17:03:34 -05:00
Alexander Korotkov b107140804 Fix page modification outside of critical section in GIN
By oversight 52ac6cd2d0 makes ginDeletePage() sets pd_prune_xid of page to be
deleted before entering the critical section.  It appears that only versions 11
and later were affected by this oversight.

Backpatch-through: 11
2019-11-20 00:12:33 +03:00
Alexander Korotkov 32ca32d0be Revise GIN README
We find GIN concurrency bugs from time to time.  One of the problems here is
that concurrency of GIN isn't well-documented in README.  So, it might be even
hard to distinguish design bugs from implementation bugs.

This commit revised concurrency section in GIN README providing more details.
Some examples are illustrated in ASCII art.

Also, this commit add the explanation of how is tuple layout in internal GIN
B-tree page different in comparison with nbtree.

Discussion: https://postgr.es/m/CAPpHfduXR_ywyaVN4%2BOYEGaw%3DcPLzWX6RxYLBncKw8de9vOkqw%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Peter Geoghegan
Backpatch-through: 9.4
2019-11-20 00:04:22 +03:00
Alexander Korotkov d5ad7a09af Fix traversing to the deleted GIN page via downlink
Current GIN code appears to don't handle traversing to the deleted page via
downlink.  This commit fixes that by stepping right from the delete page like
we do in nbtree.

This commit also fixes setting 'deleted' flag to the GIN pages.  Now other page
flags are not erased once page is deleted.  That helps to keep our assertions
true if we arrive deleted page via downlink.

Discussion: https://postgr.es/m/CAPpHfdvMvsw-NcE5bRS7R1BbvA4BxoDnVVjkXC5W0Czvy9LVrg%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Peter Geoghegan
Backpatch-through: 9.4
2019-11-20 00:04:22 +03:00
Alexander Korotkov e14641197a Fix deadlock between ginDeletePage() and ginStepRight()
When ginDeletePage() is about to delete page it locks its left sibling to revise
the rightlink.  So, it locks pages in right to left manner.  Int he same time
ginStepRight() locks pages in left to right manner, and that could cause a
deadlock.

This commit makes ginScanToDelete() keep exclusive lock on left siblings of
currently investigated path.  That elimites need to relock left sibling in
ginDeletePage().  Thus, deadlock with ginStepRight() can't happen anymore.

Reported-by: Chen Huajun
Discussion: https://postgr.es/m/5c332bd1.87b6.16d7c17aa98.Coremail.chjischj%40163.com
Author: Alexander Korotkov
Reviewed-by: Peter Geoghegan
Backpatch-through: 10
2019-11-20 00:04:09 +03:00
Amit Kapila cec2edfa78 Add logical_decoding_work_mem to limit ReorderBuffer memory usage.
Instead of deciding to serialize a transaction merely based on the
number of changes in that xact (toplevel or subxact), this makes
the decisions based on amount of memory consumed by the changes.

The memory limit is defined by a new logical_decoding_work_mem GUC,
so for example we can do this

    SET logical_decoding_work_mem = '128kB'

to reduce the memory usage of walsenders or set the higher value to
reduce disk writes. The minimum value is 64kB.

When adding a change to a transaction, we account for the size in
two places. Firstly, in the ReorderBuffer, which is then used to
decide if we reached the total memory limit. And secondly in the
transaction the change belongs to, so that we can pick the largest
transaction to evict (and serialize to disk).

We still use max_changes_in_memory when loading changes serialized
to disk. The trouble is we can't use the memory limit directly as
there might be multiple subxact serialized, we need to read all of
them but we don't know how many are there (and which subxact to
read first).

We do not serialize the ReorderBufferTXN entries, so if there is a
transaction with many subxacts, most memory may be in this type of
objects. Those records are not included in the memory accounting.

We also do not account for INTERNAL_TUPLECID changes, which are
kept in a separate list and not evicted from memory. Transactions
with many CTID changes may consume significant amounts of memory,
but we can't really do much about that.

The current eviction algorithm is very simple - the transaction is
picked merely by size, while it might be useful to also consider age
(LSN) of the changes for example. With the new Generational memory
allocator, evicting the oldest changes would make it more likely
the memory gets actually pfreed.

The logical_decoding_work_mem can be set in postgresql.conf, in which
case it serves as the default for all publishers on that instance.

Author: Tomas Vondra, with changes by Dilip Kumar and Amit Kapila
Reviewed-by: Dilip Kumar and Amit Kapila
Tested-By: Vignesh C
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2019-11-19 07:32:36 +05:30
Peter Geoghegan 2110f71696 nbtree: Tweak _bt_pgaddtup() comments.
Make it clear that _bt_pgaddtup() truncates the first data item on an
internal page because its key is supposed to be treated as minus
infinity within _bt_compare().
2019-11-18 13:04:53 -08:00
Tom Lane bf2efc55da Further fix dumping of views that contain just VALUES(...).
It turns out that commit e9f1c01b7 missed a case: we must print a
VALUES clause in long format if get_query_def is given a resultDesc
that would require the query's output column name(s) to be different
from what the bare VALUES clause would produce.

This applies in case an ALTER ... RENAME COLUMN has been done to
a view that formerly could be printed in simple format, as shown
in the added regression test case.  It also explains bug #16119
from Dmitry Telpt, because it turns out that (unlike CREATE VIEW)
CREATE MATERIALIZED VIEW fails to apply any column aliases it's
given to the stored ON SELECT rule.  So to get them to be printed,
we have to account for the resultDesc renaming.  It might be worth
changing the matview code so that it creates the ON SELECT rule
with the correct aliases; but we'd still need these messy checks in
get_simple_values_rte to handle the case of a subsequent column
rename, so any such change would be just neatnik-ism not a bug fix.

Like the previous patch, back-patch to all supported branches.

Discussion: https://postgr.es/m/16119-e64823f30a45a754@postgresql.org
2019-11-16 20:00:19 -05:00
Tomas Vondra 2dc08bd617 Properly determine length for on-disk TOAST values
In detoast_attr_slice, VARSIZE_ANY was used to compute compressed length
of on-disk TOAST values. That's incorrect, because the varlena value may
be just a TOAST pointer, producing either bogus value or crashing.

This is likely why the code was crashing on big-endian machines before
540f316809 replaced the VARSIZE with VARSIZE_ANY, which however only
masked the issue.

Reported-by: Rushabh Lathia
Discussion: https://postgr.es/m/CAL-OGkthU9Gs7TZchf5OWaL-Gsi=hXqufTxKv9qpNG73d5na_g@mail.gmail.com
2019-11-16 03:07:11 +01:00
Tomas Vondra d482f7f867 Skip system attributes when applying mvdistinct stats
When estimating number of distinct groups, we failed to ignore system
attributes when matching the group expressions to mvdistinct stats,
causing failures like

  ERROR: negative bitmapset member not allowed

Fix that by simply skipping anything that is not a regular attribute.
Backpatch to PostgreSQL 10, where the extended stats were introduced.

Bug: #16111
Reported-by: Tuomas Leikola
Author: Tomas Vondra
Backpatch-through: 10
Discussion: https://postgr.es/m/16111-687799584c3a7e73@postgresql.org
2019-11-16 01:17:15 +01:00
Thomas Munro 76cbfcdf3a Always call ExecShutdownNode() if appropriate.
Call ExecShutdownNode() after ExecutePlan()'s loop, rather than at each
break.  We had forgotten to do that in one case.  The omission caused
intermittent "temporary file leak" warnings from multi-batch parallel
hash joins with a LIMIT clause.

Back-patch to 11.  Though the problem exists in theory in earlier
parallel query releases, nothing really depended on it.

Author: Kyotaro Horiguchi
Reviewed-by: Thomas Munro, Amit Kapila
Discussion: https://postgr.es/m/20191111.212418.2222262873417235945.horikyota.ntt%40gmail.com
2019-11-16 10:11:30 +13:00
Michael Paquier 50d22de932 Cleanup code in reloptions.h regarding reloption handling
reloptions.h includes since ba748f7 a set of macros to handle reloption
types in a way similar to how parseRelOptions() works.  They have never
been used in the core code, and we have more simple methods now to parse
and fill in rd_options for a given relation depending on its relkind, so
remove this interface to simplify things.

Per discussion between Amit Langote, Álvaro Herrera and me.

Discussion: https://postgr.es/m/CA+HiwqE6zbNO92az6pp5GiTw4tr-9rfCE0t84whQSP+YwSKjMQ@mail.gmail.com
2019-11-14 13:59:59 +09:00
Michael Paquier 1bbd608fda Split handling of reloptions for partitioned tables
Partitioned tables do not have relation options yet, but, similarly to
what's done for views which have their own parsing table, it could make
sense to introduce new parameters for some of the existing default ones
like fillfactor, autovacuum, etc.  Splitting things has the advantage to
make the information stored in rd_options include only the necessary
information, reducing the amount of memory used for a relcache entry
with partitioned tables if new reloptions are introduced at this level.

Author:  Nikolay Shaplov
Reviewed-by: Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/1627387.Qykg9O6zpu@x200m
2019-11-14 12:34:28 +09:00
Andres Freund 7d962eaf50 Remove unused code from tuplesort.
copytup_index() is unused, as tuplesort_putindextuplevalues() doesn't
use COPYTUP(). Replace function body with an elog(ERROR), as already
done e.g. for copytup_datum().

Author: Andres Freund
Discussion: https://postgr.es/m/20191013144153.ooxrfglvnaocsrx2@alap3.anarazel.de
2019-11-13 15:57:01 -08:00
Tom Lane d57d61533a Add missing check_collation_set call to bpcharne().
We should throw an error for indeterminate collation, but bpcharne()
was missing that logic, resulting in a much less user-friendly error
(either an assertion failure or "cache lookup failed for collation 0").

Per report from Manuel Rigger.  Back-patch to v12 where the mistake
came in, evidently in commit 5e1963fb7.  (Before non-deterministic
collations, this function wasn't collation sensitive.)

Discussion: https://postgr.es/m/CA+u7OA4HOjtymxAbuGNh4-X_2R0Lw5n01tzvP8E5-i-2gQXYWA@mail.gmail.com
2019-11-13 15:53:53 -05:00
Tom Lane 0cafdd03a8 Fix silly initializations (cosmetic only).
Initializing a pointer to "false" isn't per project style,
and reportedly some compilers warn about it (though I've
not seen any such warnings in the buildfarm).

Seems to have come in with commit ff11e7f4b, so back-patch
to v12 where that was added.

Didier Gautheron

Discussion: https://postgr.es/m/CAJRYxu+XQuM0qnSqt1Ujztu6fBPzMMAT3VEn6W32rgKG6A2Fsw@mail.gmail.com
2019-11-13 15:26:54 -05:00
Tom Lane 7bf40ea0d0 Avoid using SplitIdentifierString to parse ListenAddresses, too.
This gets rid of our former behavior of forcibly downcasing
the postmaster's hostname list and truncating the elements to
NAMEDATALEN.  In principle, DNS hostnames are case-insensitive
so the first behavior should be harmless, and server hostnames
are seldom long enough for the second behavior to be an issue.
But it's still dubious, and an easy fix is available: just use
SplitGUCList instead.

AFAICT, all other SplitIdentifierString calls in the backend are
OK: either the items actually are SQL identifiers, or they are
keywords that are short and case-insensitive.

Per thinking about bug #16106.  While this has been wrong for
a very long time, the lack of field complaints means there's
little reason to back-patch.

Discussion: https://postgr.es/m/16106-7d319e4295d08e70@postgresql.org
2019-11-13 13:51:58 -05:00
Tom Lane 7618eaf5f3 Avoid downcasing/truncation of RADIUS authentication parameters.
Commit 6b76f1bb5 changed all the RADIUS auth parameters to be lists
rather than single values.  But its use of SplitIdentifierString
to parse the list format was not very carefully thought through,
because that function thinks it's parsing SQL identifiers, which
means it will (a) downcase the strings and (b) truncate them to
be shorter than NAMEDATALEN.  While downcasing should be harmless
for the server names and ports, it's just wrong for the shared
secrets, and probably for the NAS Identifier strings as well.
The truncation aspect is at least potentially a problem too,
though typical values for these parameters would fit in 63 bytes.

Fortunately, we now have a function SplitGUCList that is exactly
the same except for not doing the two unwanted things, so fixing
this is a trivial matter of calling that function instead.

While here, improve the documentation to show how to double-quote
the parameter values.  I failed to resist the temptation to do
some copy-editing as well.

Report and patch from Marcos David (bug #16106); doc changes by me.
Back-patch to v10 where the aforesaid commit came in, since this is
arguably a regression from our previous behavior with RADIUS auth.

Discussion: https://postgr.es/m/16106-7d319e4295d08e70@postgresql.org
2019-11-13 13:41:04 -05:00
Tom Lane 2c7b5dad6e Include TableFunc references when computing expression dependencies.
The TableFunc node (i.e., XMLTABLE) includes type and collation OIDs
that might not be referenced anywhere else in the expression tree,
so they need to be accounted for when extracting dependencies.

Fortunately, the practical effects of this are limited, since
(a) it's somewhat unlikely that people would be extracting
columns of non-builtin types from an XML document, and (b)
in many scenarios, the query would contain other references
to such types, or functions depending on them.  However, it's
not hard to construct examples wherein the existing code lets
one drop a type used in XMLTABLE and thereby break a view.

This is evidently an original oversight in the XMLTABLE patch,
so back-patch to v10 where that came in.

Discussion: https://postgr.es/m/18427.1573508501@sss.pgh.pa.us
2019-11-13 12:11:49 -05:00
Fujii Masao 7b8a899bde Make pg_waldump report more detail information about PREPARE TRANSACTION record.
This commit changes xact_desc() so that it reports the detail information about
PREPARE TRANSACTION record, like GID (global transaction identifier),
timestamp at prepare transaction, delete-on-abort/commit relations,
XID of subtransactions, and invalidation messages. These are helpful
when diagnosing 2PC-related troubles.

Author: Fujii Masao
Reviewed-by: Michael Paquier, Andrey Lepikhov, Kyotaro Horiguchi, Julien Rouhaud, Alvaro Herrera
Discussion: https://postgr.es/m/CAHGQGwEvhASad4JJnCv=0dW2TJypZgW_Vpb-oZik2a3utCqcrA@mail.gmail.com
2019-11-13 16:59:17 +09:00
Amit Kapila 1379fd537f Introduce the 'force' option for the Drop Database command.
This new option terminates the other sessions connected to the target
database and then drop it.  To terminate other sessions, the current user
must have desired permissions (same as pg_terminate_backend()).  We don't
allow to terminate the sessions if prepared transactions, active logical
replication slots or subscriptions are present in the target database.

Author: Pavel Stehule with changes by me
Reviewed-by: Dilip Kumar, Vignesh C, Ibrar Ahmed, Anthony Nowocien,
Ryan Lambert and Amit Kapila
Discussion: https://postgr.es/m/CAP_rwwmLJJbn70vLOZFpxGw3XD7nLB_7+NKz46H5EOO2k5H7OQ@mail.gmail.com
2019-11-13 08:25:33 +05:30
Tom Lane 112caf9039 Finish reverting commit 0a52d378b.
Apply the solution adopted in commit dcb7d3caf (ie, explicitly
don't call memcmp for a zero-length comparison) to func_get_detail()
as well, removing one other place where we were passing an
uninitialized array to a parse_func.c entry point.

Discussion: https://postgr.es/m/MN2PR18MB2927F24692485D754794F01BE3740@MN2PR18MB2927.namprd18.prod.outlook.com
Discussion: https://postgr.es/m/MN2PR18MB2927F6873DF2774A505AC298E3740@MN2PR18MB2927.namprd18.prod.outlook.com
2019-11-12 16:58:08 -05:00
Alvaro Herrera 5c46e7d82e pg_stat_{ssl,gssapi}: Show only processes with connections
It is pointless to show in those views auxiliary processes that don't
open network connections.

A small incompatibility is that anybody joining pg_stat_activity and
pg_stat_ssl/pg_stat_gssapi will have to use a left join if they want to
see such auxiliary processes.

Author: Euler Taveira
Discussion: https://postgr.es/m/20190904151535.GA29108@alvherre.pgsql
2019-11-12 18:48:41 -03:00
Peter Geoghegan 1f55ebae27 Make _bt_keep_natts_fast() use datum_image_eq().
An upcoming patch that adds deduplication to the nbtree AM will rely on
_bt_keep_natts_fast() understanding that differences in TOAST input
state can never affect its answer.  In particular, two opclass-equal
datums (with opclasses deemed safe for deduplication) should never be
treated as unequal by _bt_keep_natts_fast() due to TOAST input
differences.

This also seems like a good idea on general principle.  nbtsplitloc.c
will now occasionally make better decisions about where to split a leaf
page.  The behavior of _bt_keep_natts_fast() is now somewhat closer to
the behavior of _bt_keep_natts().

Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
2019-11-12 13:08:41 -08:00
Alvaro Herrera dcb7d3cafa Have LookupFuncName accept NULL argtypes for 0 args
Prior to this change, it requires to be passed a valid pointer just to
be able to pass it to a zero-byte memcmp, per 0a52d378b0.  Given the
strange resulting code in callsites, it seems better to test for the
case specifically and remove the requirement.

Reported-by: Ranier Vilela
Discussion: https://postgr.es/m/MN2PR18MB2927F24692485D754794F01BE3740@MN2PR18MB2927.namprd18.prod.outlook.com
Discussion: https://postgr.es/m/MN2PR18MB2927F6873DF2774A505AC298E3740@MN2PR18MB2927.namprd18.prod.outlook.com
2019-11-12 17:06:58 -03:00
Peter Geoghegan 8c951687f5 Teach datum_image_eq() about cstring datums.
Bring datum_image_eq() in line with datumIsEqual() by adding support for
comparing cstring datums.

An upcoming patch that adds deduplication to the nbtree AM will use
datum_image_eq().  datum_image_eq() will need to work with all datatypes
that can be used as the storage type of a B-Tree index column, including
cstring.  (cstring is used as the storage type for columns of type
"name" as a space-saving optimization.)

Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
2019-11-12 11:25:34 -08:00
Tom Lane 7a0574b50e Fix ecpglib.h to declare bool consistently with c.h.
This completes the task begun in commit 1408d5d86, to synchronize
ECPG's exported definitions with the definition of bool used by
c.h (and, therefore, the one actually in use in the ECPG library).
On practically all modern platforms, ecpglib.h will now just
include <stdbool.h>, which should surprise nobody anymore.
That removes a header-inclusion-order hazard for ECPG clients,
who previously might get build failures or unexpected behavior
depending on whether they'd included <stdbool.h> themselves,
and if so, whether before or after ecpglib.h.

On platforms where sizeof(_Bool) is not 1 (only old PPC-based
Mac systems, as far as I know), things are still messy, as
inclusion of <stdbool.h> could still break ECPG client code.
There doesn't seem to be any clean fix for that, and given the
probably-negligible population of users who would care anymore,
it's not clear we should go far out of our way to cope with it.
This change at least fixes some header-inclusion-order hazards
for our own code, since c.h and ecpglib.h previously disagreed
on whether bool should be char or unsigned char.

To implement this with minimal invasion of ECPG client namespace,
move the choice of whether to rely on <stdbool.h> into configure,
and have it export a configuration symbol PG_USE_STDBOOL.

ecpglib.h no longer exports definitions for TRUE and FALSE,
only their lowercase brethren.  We could undo that if we get
push-back about it.

Ideally we'd back-patch this as far as v11, which is where c.h
started to rely on <stdbool.h>.  But the odds of creating problems
for formerly-working ECPG client code seem about as large as the
odds of fixing any non-working cases, so we'll just do this in HEAD.

Discussion: https://postgr.es/m/CAA4eK1LmaKO7Du9M9Lo=kxGU8sB6aL8fa3sF6z6d5yYYVe3BuQ@mail.gmail.com
2019-11-12 13:00:04 -05:00
Amit Kapila 14aec03502 Make the order of the header file includes consistent in backend modules.
Similar to commits 7e735035f2 and dddf4cdc33, this commit makes the order
of header file inclusion consistent for backend modules.

In the passing, removed a couple of duplicate inclusions.

Author: Vignesh C
Reviewed-by: Kuntal Ghosh and Amit Kapila
Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
2019-11-12 08:30:16 +05:30
Peter Eisentraut d0c92527cc Fix whitespace 2019-11-11 09:51:10 +01:00
Thomas Munro db2687d1f3 Optimize PredicateLockTuple().
PredicateLockTuple() has a fast exit if tuple was written by the current
transaction, as in that case it already has a lock.  This check can be
performed using TransactionIdIsCurrentTransactionId() instead of
SubTransGetTopmostTransaction(), to avoid any chance of having to hit the
disk.

Author: Ashwin Agrawal, based on a suggestion from Andres Freund
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
2019-11-11 17:06:59 +13:00
Thomas Munro 695c5977c8 Optimize TransactionIdIsCurrentTransactionId().
If the passed in xid is the current top transaction, we can do a fast
check and exit early.  This should work well for the current heap but
also works very well for proposed AMs that don't use a separate xid
for subtransactions.

Author: Ashwin Agrawal, based on a suggestion from Andres Freund
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
2019-11-11 16:33:04 +13:00
Amit Kapila 9fab25c6cd Rearrange dropdb() to avoid errors after allowing other sessions to exit.
During Drop Database, it is better to error out before allowing other
sessions to exit and forcefully terminating autovacuum workers.  All the
other errors except for checking subscriptions are already done before.

Author: Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1+qhLkCYG2oy9xug9ur_j=G2wQNRYAyd+-kZfZ1z42pLw@mail.gmail.com
2019-11-11 07:42:45 +05:30
Peter Eisentraut 1c60e40ad5 Fix negative bitmapset member not allowed error in logical replication
This happens when we add a replica identity column on a subscriber
that does not yet exist on the publisher, according to the mapping
maintained by the subscriber.  Code that checks whether the target
relation on the subscriber is updatable would check the replica
identity attribute bitmap with a column number -1, which would result
in an error.  To fix, skip such columns in the bitmap lookup and
consider the relation not updatable.  The result is consistent with
the rule that the replica identity columns on the subscriber must be a
subset of those on the publisher, since if the column doesn't exist on
the publisher, the column set on the subscriber can't be a subset.

Reported-by: Tim Clarke <tim.clarke@minerva.info>
Analyzed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://www.postgresql.org/message-id/flat/a9139c29-7ddd-973b-aa7f-71fed9c38d75%40minerva.info
2019-11-09 08:35:44 +01:00
Andres Freund aae50236e4 Pass ItemPointer not HeapTuple to IndexBuildCallback.
Not all AMs use HeapTuples internally, making it inconvenient to pass
a HeapTuple. As the index callbacks really only need the TID, not the
full tuple, modify callback to only take ItemPointer.

Author: Ashwin Agrawal
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CALfoeis6=8ehuR=VNtHvj3z16cYfCwPdTcpaxU+sfSUJ5QgR3g@mail.gmail.com
2019-11-08 11:49:29 -08:00
Alvaro Herrera 71a8a4f6e3 Add backtrace support for error reporting
Add some support for automatically showing backtraces in certain error
situations in the server.  Backtraces are shown on assertion failure;
also, a new setting backtrace_functions can be set to a list of C
function names, and all ereport()s and elog()s from the mentioned
functions will have backtraces generated.  Finally, the function
errbacktrace() can be manually added to an ereport() call to generate a
backtrace for that call.

Authors: Peter Eisentraut, Álvaro Herrera
Discussion: https://postgr.es/m//5f48cb47-bf1e-05b6-7aae-3bf2cd01586d@2ndquadrant.com
Discussion: https://postgr.es/m/CAMsr+YGL+yfWE=JvbUbnpWtrRZNey7hJ07+zT4bYJdVp4Szdrg@mail.gmail.com
2019-11-08 15:44:20 -03:00
Peter Eisentraut 3dcffb381c Fix gratuitous error message variation 2019-11-08 18:37:17 +01:00
Peter Eisentraut b85e43feb3 More precise errors from initial pg_control check
Use a separate error message for invalid checkpoint location and
invalid state instead of just "invalid data" for both.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/20191107041630.GK1768@paquier.xyz
2019-11-08 08:03:16 +01:00
Peter Geoghegan e86c8ef243 Use "low key" terminology in nbtsort.c.
nbtree index builds once stashed the "minimum key" for a page, which was
used as the basis of the pivot tuple that gets placed in the next level
up (i.e. the tuple that stores the downlink to the page in question).
It doesn't quite work that way anymore, so the "minimum key" terminology
now seems misleading (these days the minimum key is actually a straight
copy of the high key from the left sibling, which is a distinct thing in
subtle but important ways).  Rename this concept to "low key".  This
name is a lot clearer given that there is now a sharp distinction
between pivot and non-pivot tuples.  Also remove comments that describe
obsolete details about how the minimum key concept used to work.

Rather than generating the minus infinity item for the leftmost page on
a level by copying the new item and truncating that copy, simply
allocate a small buffer.  The old approach confusingly created the
impression that the new item had some kind of significance.  This was
another artifact of how things used to work before commits 8224de4f and
dd299df8.
2019-11-07 17:12:09 -08:00
Alvaro Herrera b4bcc6bfdf Fix SET CONSTRAINTS .. DEFERRED on partitioned tables
SET CONSTRAINTS ... DEFERRED failed on partitioned tables, because of a
sanity check that ensures that the affected constraints have triggers.
On partitioned tables, the triggers are in the leaf partitions, not in
the partitioned relations themselves, so the sanity check fails.
Removing the sanity check solves the problem, because the code needed to
support the case is already there.

Backpatch to 11.

Note: deferred unique constraints are not affected by this bug, because
they do have triggers in the parent partitioned table.  I did not add a
test for this scenario.

Discussion: https://postgr.es/m/20191105212915.GA11324@alvherre.pgsql
2019-11-07 13:59:24 -03:00
Tom Lane a7145f6bc8 Fix integer-overflow edge case detection in interval_mul and pgbench.
This patch adopts the overflow check logic introduced by commit cbdb8b4c0
into two more places.  interval_mul() failed to notice if it computed a
new microseconds value that was one more than INT64_MAX, and pgbench's
double-to-int64 logic had the same sorts of edge-case problems that
cbdb8b4c0 fixed in the core code.

To make this easier to get right in future, put the guts of the checks
into new macros in c.h, and add commentary about how to use the macros
correctly.

Back-patch to all supported branches, as we did with the previous fix.

Yuya Watari

Discussion: https://postgr.es/m/CAJ2pMkbkkFw2hb9Qb1Zj8d06EhWAQXFLy73St4qWv6aX=vqnjw@mail.gmail.com
2019-11-07 11:22:58 -05:00
Peter Eisentraut 581a55889b Fix nested error handling in PG_FINALLY
We need to pop the error stack before running the user-supplied
PG_FINALLY code.  Otherwise an error in the cleanup code would end up
at the same sigsetjmp() invocation and result in an infinite error
handling loop.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/95a822c3-728b-af0e-d7e5-71890507ae0c%402ndquadrant.com
2019-11-07 09:56:47 +01:00
Fujii Masao a0c96856e8 Fix assertion failure when running pgbench -s.
If there is the WAL page that the continuation WAL record just fits within
(i.e., the continuation record ends just at the end of the page) and
the LSN in such page is specified with -s option, previously pg_waldump
caused an assertion failure. The cause of this assertion failure was that
XLogFindNextRecord() that pg_waldump -s calls mistakenly handled
such special WAL page.

This commit changes XLogFindNextRecord() so that it can handle
such WAL page correctly.

Back-patch to all supported versions.

Author: Andrey Lepikhov
Reviewed-by: Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/99303554-5dd5-06e6-f943-b3005ccd6edd@postgrespro.ru
2019-11-07 16:31:36 +09:00
Thomas Munro 7815e7efdb Add reusable routine for making arrays unique.
Introduce qunique() and qunique_arg(), which can be used after qsort()
and qsort_arg() respectively to remove duplicate values.  Use it where
appropriate.

Author: Thomas Munro
Reviewed-by: Tom Lane (in an earlier version)
Discussion: https://postgr.es/m/CAEepm%3D2vmFTNpAmwbGGD2WaryM6T3hSDVKQPfUwjdD_5XY6vAA%40mail.gmail.com
2019-11-07 17:00:48 +13:00
Michael Paquier 3feb6ace7c Check after errors of SPI_execute() in xml.c
SPI gets used to build a list of relation OIDs for XML object
generation, and one code path building a list uses SPI_execute() without
looking at errors it produces.  So fix that.

Author: Mark Dilger
Reviewed-by: Michael Paquier, Pavel Stehule
Discussion: https://postgr.es/m/17d30445-4862-7917-170f-84328dcd292d@gmail.com
2019-11-07 11:13:31 +09:00
Tomas Vondra 6e3e6cc0e8 Allow sampling of statements depending on duration
This allows logging a sample of statements, without incurring excessive
log traffic (which may impact performance).  This can be useful when
analyzing workloads with lots of short queries.

The sampling is configured using two new GUC parameters:

 * log_min_duration_sample - minimum required statement duration

 * log_statement_sample_rate - sample rate (0.0 - 1.0)

Only statements with duration exceeding log_min_duration_sample are
considered for sampling. To enable sampling, both those GUCs have to
be set correctly.

The existing log_min_duration_statement GUC has a higher priority, i.e.
statements with duration exceeding log_min_duration_statement will be
always logged, irrespectedly of how the sampling is configured. This
means only configurations

  log_min_duration_sample < log_min_duration_statement

do actually sample the statements, instead of logging everything.

Author: Adrien Nayrat
Reviewed-by: David Rowley, Vik Fearing, Tomas Vondra
Discussion: https://postgr.es/m/bbe0a1a8-a8f7-3be2-155a-888e661cc06c@anayrat.info
2019-11-06 19:11:07 +01:00
Tom Lane 22e44e8dbc Minor code review for tuple slot rewrite.
Avoid creating transiently-inconsistent slot states where possible,
by not setting TTS_FLAG_SHOULDFREE until after the slot actually has
a free'able tuple pointer, and by making sure that we reset tts_nvalid
and related derived state before we replace the tuple contents.  This
would only matter if something were to examine the slot after we'd
suffered some kind of error (e.g. out of memory) while manipulating
the slot.  We typically don't do that, so these changes might just be
cosmetic --- but even if so, it seems like good future-proofing.

Also remove some redundant Asserts, and add a couple for consistency.

Back-patch to v12 where all this code was rewritten.

Discussion: https://postgr.es/m/16095-c3ff2e5283b8dba5@postgresql.org
2019-11-06 12:00:17 -05:00
Tom Lane ff43b3e88e Sync our DTrace infrastructure with c.h's definition of type bool.
Since commit d26a810eb, we've defined bool as being either _Bool from
<stdbool.h>, or "unsigned char"; but that commit overlooked the fact
that probes.d has "#define bool char".  For consistency, make it say
"unsigned char" instead.  This should be strictly a cosmetic change,
but it seems best to be in sync.

Formally, in the now-normal case where we're using <stdbool.h>, it'd
be better to write "#define bool _Bool".  However, then we'd need
some build infrastructure to inject that configuration choice into
probes.d, and it doesn't seem worth the trouble.  We only use
<stdbool.h> if sizeof(_Bool) is 1, so having DTrace think that
bool parameters are "unsigned char" should be close enough.

Back-patch to v12 where d26a810eb came in.

Discussion: https://postgr.es/m/CAA4eK1LmaKO7Du9M9Lo=kxGU8sB6aL8fa3sF6z6d5yYYVe3BuQ@mail.gmail.com
2019-11-06 11:11:40 -05:00
Peter Eisentraut d40abd5fcf Fix memory allocation mistake
The previous code was allocating more memory than necessary because
the formula used the wrong data type.

Reported-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://www.postgresql.org/message-id/20191105172918.3e32a446@firost
2019-11-06 14:20:29 +01:00
Peter Eisentraut 5b7ba75f7f Remove unused function argument
The cache_plan argument to ri_PlanCheck has not been used since
e8c9fd5fdf.

Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/ec8a8b45-a30b-9193-cd4b-985d60d1497e%402ndquadrant.com
2019-11-06 08:19:27 +01:00
Michael Paquier 5f6b1eb0cf Fix timestamp of sent message for write context in logical decoding
When sending data for logical decoding using the streaming replication
protocol via a WAL sender, the timestamp of the sent write message is
allocated at the beginning of the message when preparing for the write,
and actually computed when the write message is ready to be sent.

The timestamp was getting computed after sending the message.  This
impacts anything using logical decoding, causing for example logical
replication to report mostly NULL for last_msg_send_time in
pg_stat_subscription.

This commit makes sure that the timestamp is computed before sending the
message.  This is wrong since 5a991ef, so backpatch down to 9.4.

Author: Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1z=WMn8jt7iEdC5sYNaPgAgOASb_OW5JYv-vMdYaJSL-w@mail.gmail.com
Backpatch-through: 9.4
2019-11-06 16:12:21 +09:00
Andrew Gierth a9056cc637 Request small targetlist for input to WindowAgg.
WindowAgg will potentially store large numbers of input rows into
tuplestores to allow access to other rows in the frame. If the input
is coming via an explicit Sort node, then unneeded columns will
already have been discarded (since Sort requests a small tlist); but
there are idioms like COUNT(*) OVER () that result in the input not
being sorted at all, and cases where the input is being sorted by some
means other than a Sort; if we don't request a small tlist, then
WindowAgg's storage requirement is inflated by the unneeded columns.

Backpatch back to 9.6, where the current tlist handling was added.
(Prior to that, WindowAgg would always use a small tlist.)

Discussion: https://postgr.es/m/87a7ator8n.fsf@news-spur.riddles.org.uk
2019-11-06 04:13:30 +00:00
Fujii Masao 979766c0af Correct the command tags for ALTER ... RENAME COLUMN.
Previously ALTER MATERIALIZED VIEW / FOREIGN TABLE ... RENAME COLUMN ...
returned "ALTER TABLE" as a command tag. This commit fixes them so that
they return "ALTER MATERIALIZED VIEW" and "ALTER FOREIGN TABLE" as
command tags, respectively.

This issue exists in all supported versions, but we don't back-patch this
because it's not enough of a bug to justify taking any compatibility risks for.
Otherwise, the back-patch would cause minor version update to break,
for example, the existing event trigger functions using TG_TAG.

Author: Fujii Masao
Reviewed-by: Ibrar Ahmed
Discussion: https://postgr.es/m/CAHGQGwGUaC03FFdTFoHsCuDrrNvFvNVQ6xyd40==P25WvuBJjg@mail.gmail.com
2019-11-06 12:54:17 +09:00
Andres Freund 26aaf97b68 Make StringInfo available to frontend code.
There's plenty places in frontend code that could benefit from a
string buffer implementation. Some because it yields simpler and
faster code, and some others because of the desire to share code
between backend and frontend.

While there is a string buffer implementation available to frontend
code, libpq's PQExpBuffer, it is clunkier than stringinfo, it
introduces a libpq dependency, doesn't allow for sharing between
frontend and backend code, and has a higher API/ABI stability
requirement due to being exposed via libpq.

Therefore it seems best to just making StringInfo being usable by
frontend code. There's not much to do for that, except for rewriting
two subsequent elog/ereport calls into others types of error
reporting, and deciding on a maximum string length.

For the maximum string size I decided to privately define MaxAllocSize
to the same value as used in the backend. It seems likely that we'll
want to reconsider this for both backend and frontend code in the not
too far away future.

For now I've left stringinfo.h in lib/, rather than common/, to reduce
the likelihood of unnecessary breakage. We could alternatively decide
to provide a redirecting stringinfo.h in lib/, or just not provide
compatibility.

Author: Andres Freund
Reviewed-By: Kyotaro Horiguchi, Daniel Gustafsson
Discussion: https://postgr.es/m/20190920051857.2fhnvhvx4qdddviz@alap3.anarazel.de
2019-11-05 14:56:40 -08:00
Andres Freund 01368e5d9d Split all OBJS style lines in makefiles into one-line-per-entry style.
When maintaining or merging patches, one of the most common sources
for conflicts are the list of objects in makefiles. Especially when
the split across lines has been changed on both sides, which is
somewhat common due to attempting to stay below 80 columns, those
conflicts are unnecessarily laborious to resolve.

By splitting, and alphabetically sorting, OBJS style lines into one
object per line, conflicts should be less frequent, and easier to
resolve when they still occur.

Author: Andres Freund
Discussion: https://postgr.es/m/20191029200901.vww4idgcxv74cwes@alap3.anarazel.de
2019-11-05 14:41:07 -08:00
Tom Lane 66c61c81b9 Tweak some authentication debug messages to follow project style.
Avoid initial capital, since that's not how we do it.

Discussion: https://postgr.es/m/CACP=ajbrFFYUrLyJBLV8=q+eNCapa1xDEyvXhMoYrNphs-xqPw@mail.gmail.com
2019-11-05 14:29:08 -05:00
Tom Lane 3affe76ef8 Avoid logging complaints about abandoned connections when using PAM.
For a long time (since commit aed378e8d) we have had a policy to log
nothing about a connection if the client disconnects when challenged
for a password.  This is because libpq-using clients will typically
do that, and then come back for a new connection attempt once they've
collected a password from their user, so that logging the abandoned
connection attempt will just result in log spam.  However, this did
not work well for PAM authentication: the bottom-level function
pam_passwd_conv_proc() was on board with it, but we logged messages
at higher levels anyway, for lack of any reporting mechanism.
Add a flag and tweak the logic so that the case is silent, as it is
for other password-using auth mechanisms.

Per complaint from Yoann La Cancellera.  It's been like this for awhile,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/CACP=ajbrFFYUrLyJBLV8=q+eNCapa1xDEyvXhMoYrNphs-xqPw@mail.gmail.com
2019-11-05 14:27:37 -05:00
Tom Lane a30531c5c8 Fix "unexpected relkind" error when denying permissions on toast tables.
get_relkind_objtype, and hence get_object_type, failed when applied to a
toast table.  This is not a good thing, because it prevents reporting of
perfectly legitimate permissions errors.  (At present, these functions
are in fact *only* used to determine the ObjectType argument for
acl_error() calls.)  It seems best to have them fall back to returning
OBJECT_TABLE in every case where they can't determine an object type
for a pg_class entry, so do that.

In passing, make some edits to alter.c to make it more obvious that
those calls of get_object_type() are used only for error reporting.
This might save a few cycles in the non-error code path, too.

Back-patch to v11 where this issue originated.

John Hsu, Michael Paquier, Tom Lane

Discussion: https://postgr.es/m/C652D3DF-2B0C-4128-9420-FB5379F6B1E4@amazon.com
2019-11-05 13:40:37 -05:00
Tom Lane 529ebb20aa Generate EquivalenceClass members for partitionwise child join rels.
Commit d25ea0127 got rid of what I thought were entirely unnecessary
derived child expressions in EquivalenceClasses for EC members that
mention multiple baserels.  But it turns out that some of the child
expressions that code created are necessary for partitionwise joins,
else we fail to find matching pathkeys for Sort nodes.  (This happens
only for certain shapes of the resulting plan; it may be that
partitionwise aggregation is also necessary to show the failure,
though I'm not sure of that.)

Reverting that commit entirely would be quite painful performance-wise
for large partition sets.  So instead, add code that explicitly
generates child expressions that match only partitionwise child join
rels we have actually generated.

Per report from Justin Pryzby.  (Amit Langote noticed the problem
earlier, though it's not clear if he recognized then that it could
result in a planner error, not merely failure to exploit partitionwise
join, in the code as-committed.)  Back-patch to v12 where commit
d25ea0127 came in.

Amit Langote, with lots of kibitzing from me

Discussion: https://postgr.es/m/CA+HiwqG2WVUGmLJqtR0tPFhniO=H=9qQ+Z3L_ZC+Y3-EVQHFGg@mail.gmail.com
Discussion: https://postgr.es/m/20191011143703.GN10470@telsasoft.com
2019-11-05 11:42:24 -05:00
Michael Paquier 3534fa2233 Refactor code building relation options
Historically, the code to build relation options has been shaped the
same way in multiple code paths by using a set of datums in input with
the options parsed with a static table which is then filled with the
option values.  This introduces a new common routine in reloptions.c to
do most of the legwork for the in-core code paths.

Author: Amit Langote
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CA+HiwqGsoSn_uTPPYT19WrtR7oYpYtv4CdS0xuedTKiHHWuk_g@mail.gmail.com
2019-11-05 09:17:05 +09:00
Tom Lane ec28808ba8 Fix ginEntryInsert's counting of GIN leaf tuples.
As the code stands, nEntries counts the number of ginEntryInsert()
calls, so that's what you end up with at the end of a GIN index build.
However, ginvacuumcleanup() recomputes nEntries as the number of
surviving leaf tuples, and that's generally consistent with the way that
gincostestimate() uses the value.  So let's clearly define nEntries
as the number of leaf tuples, and therefore adjust ginEntryInsert() to
increment it only when we make a new one, not when we add TIDs into an
existing tuple or posting tree.

In practice this inconsistency probably has little impact, so I don't
feel a need to back-patch.

Insung Moon and Keisuke Kuroda

Discussion: https://postgr.es/m/CAEMmqBuH_O-oXL+3_ArQ6F5cJ7kXVow2SGQB3HRacku_T+xkmA@mail.gmail.com
2019-11-04 14:16:42 -05:00
Peter Eisentraut a63c84e59a Fix some compiler warnings on older compilers
Some older compilers appear to not understand the recently introduced
PG_FINALLY code structure that well in some circumstances and complain
about possibly uninitialized variables.  So to fix, initialize the
variables explicitly in the cases complained about.

Discussion: https://www.postgresql.org/message-id/flat/95a822c3-728b-af0e-d7e5-71890507ae0c%402ndquadrant.com
2019-11-04 11:07:32 +01:00
Peter Eisentraut 8557a6f10c Catch invalid typlens in a couple of places
Rearrange the logic in record_image_cmp() and datum_image_eq() to
error out on unexpected typlens (either not supported there or
completely invalid due to corruption).  Barring corruption, this is
not possible today but it seems more future-proof and robust to fix
this.

Reported-by: Peter Geoghegan <pg@bowt.ie>
2019-11-04 09:08:15 +01:00
Tom Lane db27b60f07 Suppress warning from older compilers.
Commit 8af1624e3 introduced a warning about possibly returning
without a value, on compilers that don't realize that ereport(ERROR)
doesn't return.  Tweak the code to avoid that.

Per buildfarm.  Back-patch to 9.6, like the aforesaid commit.
2019-11-03 16:10:23 -05:00
Tom Lane 8af1624e3f Validate ispell dictionaries more carefully.
Using incorrect, or just mismatched, dictionary and affix files
could result in a crash, due to failure to cross-check offsets
obtained from the file.  Add necessary validation, as well as
some Asserts for future-proofing.

Per bug #16050 from Alexander Lakhin.  Back-patch to 9.6 where the
problem was introduced.

Arthur Zakirov, per initial investigation by Tomas Vondra

Discussion: https://postgr.es/m/16050-024ae722464ab604@postgresql.org
Discussion: https://postgr.es/m/20191013012610.2p2fp3zzpoav7jzf@development
2019-11-02 16:45:32 -04:00
Michael Paquier dc816e5815 Fix failure when creating cloned indexes for a partition
When using CREATE TABLE for a new partition, the partitioned indexes of
the parent are created automatically in a fashion similar to LIKE
INDEXES.  The new partition and its parent use a mapping for attribute
numbers for this operation, and while the mapping was correctly built,
its length was defined as the number of attributes of the newly-created
child, and not the parent.  If the parent includes dropped columns, this
could cause failures.

This is wrong since 8b08f7d which has introduced the concept of
partitioned indexes, so backpatch down to 11.

Reported-by: Wyatt Alt
Author: Michael Paquier
Reviewed-by: Amit Langote
Discussion: https://postgr.es/m/CAGem3qCcRmhbs4jYMkenYNfP2kEusDXvTfw-q+eOhM0zTceG-g@mail.gmail.com
Backpatch-through: 11
2019-11-02 14:16:04 +09:00
Michael Paquier e174f699c4 Add some assertions in syncrep.c
A couple of routines assume that the LWLock SyncRepLock needs to be
taken, so add a couple of assertions to be sure of that.  Also, when
waiting for a given LSN at transaction commit, the code implied that the
syncrep queue cleanup happens while holding interrupts, but the code
never checked after that.

Author: Michael Paquier
Reviewed-by: Fujii Masao, Kyotaro Horiguchi, Dongming Liu
Discussion: https://postgr.es/m/a0806273-8bbb-43b3-bbe1-c45a58f6ae21.lingce.ldm@alibaba-inc.com
2019-11-01 22:51:05 +09:00
Michael Paquier 20345197ff Fix race condition at backend exit when deleting element in syncrep queue
When a backend exits, it gets deleted from the syncrep queue if present.
The queue was checked without SyncRepLock taken in exclusive mode, so it
would have been possible for a backend to remove itself after a WAL
sender already did the job.  Fix this issue based on a suggestion from
Fujii Masao, by first checking the queue without the lock.  Then, if the
backend is present in the queue, take the lock and perform an additional
lookup check before doing the element deletion.

Author: Dongming Liu
Reviewed-by: Kyotaro Horiguchi, Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/a0806273-8bbb-43b3-bbe1-c45a58f6ae21.lingce.ldm@alibaba-inc.com
Backpatch-through: 9.4
2019-11-01 22:38:32 +09:00
Peter Eisentraut 604bd36711 PG_FINALLY
This gives an alternative way of catching exceptions, for the common
case where the cleanup code is the same in the error and non-error
cases.  So instead of

    PG_TRY();
    {
        ... code that might throw ereport(ERROR) ...
    }
    PG_CATCH();
    {
        cleanup();
	PG_RE_THROW();
    }
    PG_END_TRY();
    cleanup();

one can write

    PG_TRY();
    {
        ... code that might throw ereport(ERROR) ...
    }
    PG_FINALLY();
    {
        cleanup();
    }
    PG_END_TRY();

Discussion: https://www.postgresql.org/message-id/flat/95a822c3-728b-af0e-d7e5-71890507ae0c%402ndquadrant.com
2019-11-01 11:18:03 +01:00
Peter Eisentraut 7302514088 Add const qualifiers to internal range type APIs
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/dc9b45fa-b950-fadc-4751-85d6f729df55%402ndquadrant.com
2019-10-31 07:48:21 +01:00
Michael Paquier f921ea624e Fix typo in comment of syncrep.c
Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20191030.123428.18823202335157111.horikyota.ntt@gmail.com
2019-10-31 10:22:24 +09:00
Peter Eisentraut c5e1df951d Remove one use of IDENT_USERNAME_MAX
IDENT_USERNAME_MAX is the maximum length of the information returned
by an ident server, per RFC 1413.  Using it as the buffer size in peer
authentication is inappropriate.  It was done here because of the
historical relationship between peer and ident authentication.  To
reduce confusion between the two authenticaton methods and disentangle
their code, use a dynamically allocated buffer instead.

Discussion: https://www.postgresql.org/message-id/flat/c798fba5-8b71-4f27-c78e-37714037ea31%402ndquadrant.com
2019-10-30 11:18:00 +01:00
Peter Eisentraut 5cc1e64fb6 Update code comments about peer authenticaton
For historical reasons, the functions for peer authentication were
grouped under ident authentication.  But they are really completely
separate, so give them their own section headings.
2019-10-30 09:13:39 +01:00
Michael Paquier 6ca86bb7e9 Fix typos in the code
Author: Vignesh C
Reviewed-by: Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/CALDaNm0ni+GAOe4+fbXiOxNrVudajMYmhJFtXGX-zBPoN8ixhw@mail.gmail.com
2019-10-30 10:03:00 +09:00
Michael Paquier d80be6f2f6 Fix handling of pg_class.relispartition at swap phase in REINDEX CONCURRENTLY
When cancelling REINDEX CONCURRENTLY after swapping the old and new
indexes (for example interruption at step 5), the old index remains
around and is marked as invalid.  The old index should also be manually
droppable to clean up the parent relation from any invalid indexes still
remaining.  For a partition index reindexed, pg_class.relispartition was
not getting updated, causing the index to not be droppable as DROP INDEX
would look for dependencies in a partition tree, which do not exist
anymore after the swap phase is done.

The fix here is simple: when swapping the old and new indexes, make sure
that pg_class.relispartition is correctly switched, similarly to what is
done for the index name.

Reported-by: Justin Pryzby
Author: Michael Paquier
Discussion: https://postgr.es/m/20191015164047.GA22729@telsasoft.com
Backpatch-through: 12
2019-10-29 11:08:09 +09:00
Tom Lane 8b7a0f1d11 Allow extracting fields from a ROW() expression in more cases.
Teach get_expr_result_type() to manufacture a tuple descriptor directly
from a RowExpr node.  If the RowExpr has type RECORD, this is the only
way to get a tupdesc for its result, since even if the rowtype has been
blessed, we don't have its typmod available at this point.  (If the
RowExpr has some named composite type, we continue to let the existing
code handle it, since the RowExpr might well not have the correct column
names embedded in it.)

This fixes assorted corner cases illustrated by the added regression
tests.

Discussion: https://postgr.es/m/10872.1572202006@sss.pgh.pa.us
2019-10-28 15:08:24 -04:00
Tom Lane bd1ef5799b Handle empty-string edge cases correctly in strpos().
Commit 9556aa01c rearranged the innards of text_position() in a way
that would make it not work for empty search strings.  Which is fine,
because all callers of that code special-case an empty pattern in
some way.  However, the primary use-case (text_position itself) got
special-cased incorrectly: historically it's returned 1 not 0 for
an empty search string.  Restore the historical behavior.

Per complaint from Austin Drenski (via Shay Rojansky).
Back-patch to v12 where it got broken.

Discussion: https://postgr.es/m/CADT4RqAz7oN4vkPir86Kg1_mQBmBxCp-L_=9vRpgSNPJf0KRkw@mail.gmail.com
2019-10-28 12:21:13 -04:00
Michael Paquier 68ac9cf249 Fix dependency handling at swap phase of REINDEX CONCURRENTLY
When swapping the dependencies of the old and new indexes, the code has
been correctly switching all links in pg_depend from the old to the new
index for both referencing and referenced entries.  However it forgot
the fact that the new index may itself have existing entries in
pg_depend, like references to the parent table attributes.  This
resulted in duplicated entries in pg_depend after running REINDEX
CONCURRENTLY.

Fix this problem by removing any existing entries in pg_depend on the
new index before switching the dependencies of the old index to the new
one.  More regression tests are added to check the consistency of
entries in pg_depend for indexes, including partition indexes.

Author: Michael Paquier
Discussion: https://postgr.es/m/20191025064318.GF8671@paquier.xyz
Backpatch-through: 12
2019-10-28 11:57:31 +09:00
Michael Paquier 51970fa8df Fix initialization of fake LSN for unlogged relations
9155580 has changed the value of the first fake LSN for unlogged
relations from 1 to FirstNormalUnloggedLSN (aka 1000), GiST requiring a
non-zero LSN on some pages to allow an interlocking logic to work, but
its value was still initialized to 1 at the beginning of recovery or
after running pg_resetwal.  This fixes the initialization for both code
paths.

Author: Takayuki Tsunakawa
Reviewed-by: Dilip Kumar, Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/OSBPR01MB2503CE851940C17DE44AE3D9FE6F0@OSBPR01MB2503.jpnprd01.prod.outlook.com
Backpatch-through: 12
2019-10-27 13:54:12 +09:00
Peter Eisentraut 2fc2a88e67 Remove obsolete information schema tables
Remove SQL_LANGUAGES, which was eliminated in SQL:2008, and
SQL_PACKAGES and SQL_SIZING_PROFILES, which were eliminated in
SQL:2011.  Since they were dropped by the SQL standard, the
information in them was no longer updated and therefore no longer
useful.

This also removes the feature-package association information in
sql_feature_packages.txt, but for the time begin we are keeping the
information which features are in the Core package (that is, mandatory
SQL features).  Maybe at some point someone wants to invent a way to
store that that does not involve using the "package" mechanism
anymore.

Discussion https://www.postgresql.org/message-id/flat/91334220-7900-071b-9327-0c6ecd012017%402ndquadrant.com
2019-10-25 21:37:14 +02:00
Tom Lane 22f6f2c1cc Improve management of statement timeouts.
Commit f8e5f156b added private state in postgres.c to track whether
a statement timeout is running.  This seems like bad design to me;
timeout.c's private state should be the single source of truth about
that.  We already fixed one bug associated with failure to keep those
states in sync (cf. be42015fc), and I've got little faith that we
won't find more in future.  So get rid of postgres.c's local variable
by exposing a way to ask timeout.c whether a timeout is running.
(Obviously, such an inquiry is subject to race conditions, but it
seems fine for the purpose at hand.)

To make get_timeout_active() as cheap as possible, add a flag in
the per-timeout struct showing whether that timeout is active.
This allows some small savings elsewhere in timeout.c, mainly
elimination of unnecessary searches of the active_timeouts array.

While at it, fix enable_statement_timeout to not call disable_timeout
when statement_timeout is 0 and the timeout is not running.  This
avoids a useless deschedule-and-reschedule-timeouts cycle, which
represents a significant savings (at least one kernel call) when
there is any other active timeout.  Right now, there usually isn't,
but there are proposals around to change that.

Discussion: https://postgr.es/m/16035-456e6e69ebfd4374@postgresql.org
2019-10-25 11:41:16 -04:00
Tom Lane 2b2bacdca0 Reset statement_timeout between queries of a multi-query string.
Historically, we started the timer (if StatementTimeout > 0) at the
beginning of a simple-Query message and usually let it run until the
end, so that the timeout limit applied to the entire query string,
and intra-string changes of the statement_timeout GUC had no effect.
But, confusingly, a COMMIT within the string would reset the state
and allow a fresh timeout cycle to start with the current setting.

Commit f8e5f156b changed the behavior of statement_timeout for extended
query protocol, and as an apparently-unintended side effect, a change in
the statement_timeout GUC during a multi-statement simple-Query message
might have an effect immediately --- but only if it was going from
"disabled" to "enabled".

This is all pretty confusing, not to mention completely undocumented.
Let's change things so that the timeout is always reset between queries
of a multi-query string, whether they're transaction control commands
or not.  Thus the active timeout setting is applied to each query in
the string, separately.  This costs a few more cycles if statement_timeout
is active, but it provides much more intuitive behavior, especially if one
changes statement_timeout in one of the queries of the string.

Also, add something to the documentation to explain all this.

Per bug #16035 from Raj Mohite.  Although this is a bug fix, I'm hesitant
to back-patch it; conceivably somebody has worked out the old behavior
and is depending on it.  (But note that this change should make the
behavior less restrictive in most cases, since the timeout will now
be applied to shorter segments of code.)

Discussion: https://postgr.es/m/16035-456e6e69ebfd4374@postgresql.org
2019-10-25 11:15:50 -04:00
Michael Paquier 8270a0d9a9 Handle interrupts within a transaction context in REINDEX CONCURRENTLY
Phases 2 (building the new index) and 3 (validating the new index)
checked for interrupts outside a transaction context, having as
consequence to not release session-level locks taken on the parent
relation and the old and new indexes processed.  This could for example
be triggered with statement_timeout and a bad timing, and would issue
confusing error messages when shutting down the session still holding
the locks (note that an assertion failure would be triggered first), on
top of more issues with concurrent sessions trying to take a lock that
would interfere with the SHARE UPDATE EXCLUSIVE locks hold here.

This moves all the interruption checks inside a transaction context.
Note that I have manually tested all interruptions to make sure that
invalid indexes can be cleaned up properly.  Partition indexes still
have issues on their own with some missing dependency handling, which
will be dealt with in a follow-up patch.

Reported-by: Justin Pryzby
Author: Michael Paquier
Discussion: https://postgr.es/m/20191013025145.GC4475@telsasoft.com
Backpatch-through: 12
2019-10-25 10:20:08 +09:00
Fujii Masao 3b0c59ac1c Fix typo in xlog.c.
Author: Fujii Masao
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAHGQGwH7dtYvOZZ8c0AG5AJwH5pfiRdKaCptY1_RdHy0HYeRfQ@mail.gmail.com
2019-10-24 14:13:36 +09:00
Michael Paquier 5d3500da72 Acquire properly session-level lock on new index in REINDEX CONCURRENTLY
In the first transaction run for REINDEX CONCURRENTLY, a thinko in the
existing logic caused two session locks to be taken on the old index,
causing the session lock on the newly-created index to be missed.  This
made possible concurrent DDL commands (like ALTER INDEX) on the new
index while REINDEX CONCURRENTLY was processing from the point where the
first internal transaction committed.

This issue has been discovered while digging into another bug.

Author: Michael Paquier
Discussion: https://postgr.es/m/20191021074323.GB1869@paquier.xyz
Backpatch-through: 12
2019-10-23 15:04:48 +09:00
Michael Paquier e3db3f829f Clean up properly error_context_stack in autovacuum worker on exception
Any callback set would have no meaning in the context of an exception.
As an autovacuum worker exits quickly in this context, this could be
only an issue within EmitErrorReport(), where the elog hook is for
example called.  That's unlikely to going to be a problem, but let's be
clean and consistent with other code paths handling exceptions.  This is
present since 2909419, which introduced autovacuum.

Author: Ashwin Agrawal
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CALfoeisM+_+dgmAdAOHAu0k-ZpEHHqSSG=GRf3pKJGm8OqWX0w@mail.gmail.com
Backpatch-through: 9.4
2019-10-23 10:25:06 +09:00
Peter Eisentraut f86f46d091 Fix comment
The last argument of smgrextend() was renamed from isTemp to skipFsync
in debcec7dc3, but the comments at two
call sites were not updated.
2019-10-22 09:58:20 +02:00
Alexander Korotkov 52ad1e6599 Refactor jsonpath's compareDatetime()
This commit refactors come ridiculous coding in compareDatetime().  Also, it
provides correct cross-datatype comparison even when one of values overflows
during cast.  That eliminates dilemma on whether we should suppress overflow
errors during cast.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/32308.1569455803%40sss.pgh.pa.us
Discussion: https://postgr.es/m/a5629d0c-8162-7559-16aa-0c8390d6ba5f%40postgrespro.ru
Author: Nikita Glukhov, Alexander Korotkov
2019-10-21 23:07:07 +03:00
Alexander Korotkov a6888fde7f Refactor timestamp2timestamptz_opt_error()
While casting from timestamp to timestamptz we do timestamp2tm() then
tm2timestamp().  This commit eliminates call to tm2timestamp().  Instead, it
directly applies timezone offset to the original timestamp value.  That makes
upcoming datetime overflow handling in jsonpath easier.  That should also save
us some CPU cycles.

Discussion: https://postgr.es/m/CAPpHfdvRPRh_mTGar5WmDeRZ%3DU5dOXHdxspYYD%3D76m3knNGjXA%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Tom Lane
2019-10-21 23:07:07 +03:00
Etsuro Fujita 80831bcdbe Update obsolete comment.
Commit b52b7dc25, which moved code creating PartitionBoundInfo in
RelationBuildPartitionDesc() in partcache.c (relocated to partdesc.c
afterwards) to partbounds.c, should have updated this, but didn't.

Author: Etsuro Fujita
Reviewed-by: Alvaro Herrera
Backpatch-through: 12
Discussion: https://postgr.es/m/CAPmGK16Uxr%3DPatiGyaRwiQVLB7Y-GqbkK3AxRLVYzU0Czv%3DsEw%40mail.gmail.com
2019-10-21 17:30:00 +09:00
Amit Kapila 70a6c37d52 Fix memory leak introduced in commit 7df159a620.
We memorize all internal and empty leaf pages in the 1st vacuum stage for
gist indexes.  They are used in the 2nd stage, to delete all the empty
pages.  There was a memory context page_set_context for this purpose, but
we never used it.

Reported-by: Amit Kapila
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Backpatch-through: 12, where it got introduced
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
2019-10-21 08:57:32 +05:30
Peter Eisentraut 5d3587d14b Fix most -Wundef warnings
In some cases #if was used instead of #ifdef in an inconsistent style.
Cleaning this up also helps when analyzing cases like
38d8dce61f where this makes a
difference.

There are no behavior changes here, but the change in pg_bswap.h would
prevent possible accidental misuse by third-party code.

Discussion: https://www.postgresql.org/message-id/flat/3b615ca5-c595-3f1d-fdf7-a429e564f614%402ndquadrant.com
2019-10-19 18:31:38 +02:00
Noah Misch 48cc59ed24 Use standard compare_exchange loop style in ProcArrayGroupClearXid().
Besides style, this might improve performance in the contended case.

Reviewed by Amit Kapila.

Discussion: https://postgr.es/m/20191015035348.GA4166224@rfd.leadboat.com
2019-10-18 20:21:10 -07:00
Michael Paquier f25968c496 Remove last traces of heap_open/close in the tree
Since pluggable storage has been introduced, those two routines have
been replaced by table_open/close, with some compatibility macros still
present to allow extensions to compile correctly with v12.

Some code paths using the old routines still remained, so replace them.
Based on the discussion done, the consensus reached is that it is better
to remove those compatibility macros so as nothing new uses the old
routines, so remove also the compatibility macros.

Discussion: https://postgr.es/m/20191017014706.GF5605@paquier.xyz
2019-10-19 11:18:15 +09:00
Fujii Masao ec1259e880 Fix failure of archive recovery with recovery_min_apply_delay enabled.
recovery_min_apply_delay parameter is intended for use with streaming
replication deployments. However, the document clearly explains that
the parameter will be honored in all cases if it's specified. So it should
take effect even if in archive recovery. But, previously, archive recovery
with recovery_min_apply_delay enabled always failed, and caused assertion
failure if --enable-caasert is enabled.

The cause of this problem is that; the ownership of recoveryWakeupLatch
that recovery_min_apply_delay uses was taken only when standby mode
is requested. So unowned latch could be used in archive recovery, and
which caused the failure.

This commit changes recovery code so that the ownership of
recoveryWakeupLatch is taken even in archive recovery. Which prevents
archive recovery with recovery_min_apply_delay from failing.

Back-patch to v9.4 where recovery_min_apply_delay was added.

Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com
2019-10-18 22:32:18 +09:00
Fujii Masao 9b95a36be8 Make crash recovery ignore recovery_min_apply_delay setting.
In v11 or before, this setting could not take effect in crash recovery
because it's specified in recovery.conf and crash recovery always
starts without recovery.conf. But commit 2dedf4d9a8 integrated
recovery.conf into postgresql.conf and which unexpectedly allowed
this setting to take effect even in crash recovery. This is definitely
not good behavior.

To fix the issue, this commit makes crash recovery always ignore
recovery_min_apply_delay setting.

Back-patch to v12 where the issue was added.

Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com
Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
2019-10-18 22:24:18 +09:00
Alvaro Herrera 89403ed228 Fix typo
Apparently while this code was being developed,
ReindexRelationConcurrently operated on multiple relations.  The version
that was ultimately pushed doesn't, so this comment's use of plural is
inaccurate.
2019-10-18 14:49:39 +02:00
Alvaro Herrera d2efb90dba Update comments about progress reporting by index_drop
Michaël Paquier complained that index_drop is requesting progress
reporting for non-obvious reasons, so let's add a comment to explain
why.

Discussion: https://postgr.es/m/20191017010412.GH2602@paquier.xyz
2019-10-18 07:23:05 -03:00
Michael Paquier 3f60f690fa Fix timeout handling in logical replication worker
The timestamp tracking the last moment a message is received in a
logical replication worker was initialized in each loop checking if a
message was received or not, causing wal_receiver_timeout to be ignored
in basically any logical replication deployments.  This also broke the
ping sent to the server when reaching half of wal_receiver_timeout.

This simply moves the initialization of the timestamp out of the apply
loop to the beginning of LogicalRepApplyLoop().

Reported-by: Jehan-Guillaume De Rorthais
Author: Julien Rouhaud
Discussion: https://postgr.es/m/CAOBaU_ZHESFcWva8jLjtZdCLspMj7vqaB2k++rjHLY897ZxbYw@mail.gmail.com
Backpatch-through: 10
2019-10-18 14:26:29 +09:00
Alvaro Herrera 38ddeab13b Fix minor bug in logical-replication walsender shutdown
Logical walsender should exit when it catches up with sending WAL during
shutdown; but there was a rare corner case when it failed to because of
a race condition that puts it back to wait for more WAL instead -- but
since there wasn't any, it'd not shut down immediately.  It would only
continue the shutdown when wal_sender_timeout terminates the sleep,
which causes annoying waits during shutdown procedure.  Restructure the
code so that we no longer forget to set WalSndCaughtUp in that case.

This was an oversight in commit c6c333436.

Backpatch all the way down to 9.4.

Author: Craig Ringer, Álvaro Herrera
Discussion: https://postgr.es/m/CAMsr+YEuz4XwZX_QmnX_-2530XhyAmnK=zCmicEnq1vLr0aZ-g@mail.gmail.com
2019-10-17 15:06:06 +02:00
Thomas Munro 3c8c55dd54 When restoring GUCs in parallel workers, show an error context.
Otherwise it can be hard to see where an error is coming from, when
the parallel worker sets all the GUCs that it received from the
leader.  Bug #15726.  Back-patch to 9.5, where RestoreGUCState()
appeared.

Reported-by: Tiago Anastacio
Reviewed-by: Daniel Gustafsson, Tom Lane
Discussion: https://postgr.es/m/15726-6d67e4fa14f027b3%40postgresql.org
2019-10-17 13:47:01 +13:00
Thomas Munro 6bda2af039 Fix bug that could try to freeze running multixacts.
Commits 801c2dc7 and 801c2dc7 made it possible for vacuum to
try to freeze a multixact that is still running.  That was
prevented by a check, but raised an error.  Repair.

Back-patch all the way.

Author: Nathan Bossart, Jeremy Schneider
Reported-by: Jeremy Schneider
Reviewed-by: Jim Nasby, Thomas Munro
Discussion: https://postgr.es/m/DAFB8AFF-2F05-4E33-AD7F-FF8B0F760C17%40amazon.com
2019-10-17 09:59:21 +13:00
Alvaro Herrera 0d21f919eb Fix crash when reporting CREATE INDEX progress
A race condition can make us try to dereference a NULL pointer to the
PGPROC struct of a process that's already finished.  That results in
crashes during REINDEX CONCURRENTLY and CREATE INDEX CONCURRENTLY.

This was introduced in ab0dfc961b, so backpatch to pg12.

Reported by: Justin Pryzby
Reviewed-by: Michaël Paquier
Discussion: https://postgr.es/m/20191012004446.GT10470@telsasoft.com
2019-10-16 14:51:34 +02:00
Michael Paquier 1de4fd1092 Refresh some incorrect links in pg_crc.c/h
Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm0LPk9vTGTBPBRv0=fX=94o4r6-DuBbHNeCN2AH5bufLw@mail.gmail.com
2019-10-16 15:10:14 +09:00
Thomas Munro d5ac14f9cc Use libc version as a collation version on glibc systems.
Using glibc's version string to detect potential collation definition
changes is not 100% reliable, but it's better than nothing.  Currently
this affects only collations explicitly provided by "libc".  More work
will be needed to handle the default collation.

Author: Thomas Munro, based on a suggestion from Christoph Berg
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/4b76c6d4-ae5e-0dc6-7d0d-b5c796a07e34%402ndquadrant.com
2019-10-16 17:28:24 +13:00
Andres Freund cef82eda14 Fix CLUSTER on expression indexes.
Since the introduction of different slot types, in 1a0586de36, we
create a virtual slot in tuplesort_begin_cluster(). While that looks
right, it unfortunately doesn't actually work, as ExecStoreHeapTuple()
is used to store tuples in the slot. Unfortunately no regression tests
for CLUSTER on expression indexes existed so far.

Fix the slot type, and add bare bones tests for CLUSTER on expression
indexes.

Reported-By: Justin Pryzby
Author: Andres Freund
Discussion: https://postgr.es/m/20191011210320.GS10470@telsasoft.com
Backpatch: 12, like 1a0586de36
2019-10-15 10:40:13 -07:00
Peter Eisentraut bdb839cbde Update unicode.org URLs
Use https, consistent host name, remove references to ftp.  Also
update the URLs for CLDR, which has moved from Trac to GitHub.
2019-10-13 22:10:38 +02:00
Tom Lane 9abb2bfc04 In the postmaster, rely on the signal infrastructure to block signals.
POSIX sigaction(2) can be told to block a set of signals while a
signal handler executes.  Make use of that instead of manually
blocking and unblocking signals in the postmaster's signal handlers.
This should save a few cycles, and it also prevents recursive
invocation of signal handlers when many signals arrive in close
succession.  We have seen buildfarm failures that seem to be due to
postmaster stack overflow caused by such recursion (exacerbated by
a Linux PPC64 kernel bug).

This doesn't change anything about the way that it works on Windows.
Somebody might consider adjusting port/win32/signal.c to let it work
similarly, but I'm not in a position to do that.

For the moment, just apply to HEAD.  Possibly we should consider
back-patching this, but it'd be good to let it age awhile first.

Discussion: https://postgr.es/m/14878.1570820201@sss.pgh.pa.us
2019-10-13 15:48:26 -04:00
Michael Paquier 1df5875d39 Fix dependency handling of column drop with partitioned tables
When dropping a column on a partitioned table which has one or more
partitioned indexes, the operation was failing as dependencies with
partitioned indexes using the column dropped were not getting removed in
a way consistent with the columns involved across all the relations part
of an inheritance tree.

This commit refactors the code executing column drop so as all the
columns from an inheritance tree to remove are gathered first, and
dropped all at the end.  This way, we let the dependency machinery sort
out by itself the deletion of all the columns with the partitioned
indexes across a partition tree.

This issue has been introduced by 1d92a0c, so backpatch down to
REL_12_STABLE.

Author: Amit Langote, Michael Paquier
Reviewed-by: Álvaro Herrera, Ashutosh Sharma
Discussion: https://postgr.es/m/CA+HiwqE9kuBsZ3b5pob2-cvE8ofzPWs-og+g8bKKGnu6b4-yTQ@mail.gmail.com
Backpatch-through: 12
2019-10-13 17:51:55 +09:00
Peter Eisentraut b4675a8ae2 Fix use of term "verifier"
Within the context of SCRAM, "verifier" has a specific meaning in the
protocol, per RFCs.  The existing code used "verifier" differently, to
mean whatever is or would be stored in pg_auth.rolpassword.

Fix this by using the term "secret" for this, following RFC 5803.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/be397b06-6e4b-ba71-c7fb-54cae84a7e18%402ndquadrant.com
2019-10-12 21:41:59 +02:00
Fujii Masao 20961ceaf0 Make crash recovery ignore restore_command and recovery_end_command settings.
In v11 or before, those settings could not take effect in crash recovery
because they are specified in recovery.conf and crash recovery always
starts without recovery.conf. But commit 2dedf4d9a8 integrated
recovery.conf into postgresql.conf and which unexpectedly allowed
those settings to take effect even in crash recovery. This is definitely
not good behavior.

To fix the issue, this commit makes crash recovery always ignore
restore_command and recovery_end_command settings.

Back-patch to v12 where the issue was added.

Author: Fujii Masao
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
2019-10-11 15:47:59 +09:00
Andres Freund 93765bd956 Fix table rewrites that include a column without a default.
In c2fe139c20 I made ATRewriteTable() use tuple slots. Unfortunately
I did not notice that columns can be added in a rewrite that do not
have a default, when another column is added/altered requiring one.

Initialize columns to NULL again, and add tests.

Bug: #16038
Reported-By: anonymous
Author: Andres Freund
Discussion: https://postgr.es/m/16038-5c974541f2bf6749@postgresql.org
Backpatch: 12, where the bug was introduced in c2fe139c20
2019-10-09 22:00:50 -07:00
Peter Eisentraut 50518ec296 Revert "Use libc version as a collation version on glibc systems."
This reverts commit 9f90b1d08d.

This needs some refinements in the pg_dump and pg_upgrade tests.
2019-10-09 21:36:01 +02:00
Peter Eisentraut 9f90b1d08d Use libc version as a collation version on glibc systems.
Using glibc's version number to detect potential collation definition
changes is not 100% reliable, but it's better than nothing.

Author: Thomas Munro
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/4b76c6d4-ae5e-0dc6-7d0d-b5c796a07e34%402ndquadrant.com
2019-10-09 21:17:47 +02:00
Michael Paquier b8e19b932a Flush logical mapping files with fd opened for read/write at checkpoint
The file descriptor was opened with read-only to fsync a regular file,
which would cause EBADFD errors on some platforms.

This is similar to the recent fix done by a586cc4b (which was broken by
me with 82a5649), except that I noticed this issue while monitoring the
backend code for similar mistakes.  Backpatch to 9.4, as this has been
introduced since logical decoding exists as of b89e151.

Author: Michael Paquier
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/20191006045548.GA14532@paquier.xyz
Backpatch-through: 9.4
2019-10-09 13:30:43 +09:00
Peter Eisentraut 38d8dce61f Remove some code for old unsupported versions of MSVC
As of d9dd406fe2, we require MSVC 2013,
which means _MSC_VER >= 1800.  This means that conditionals about
older versions of _MSC_VER can be removed or simplified.

Previous code was also in some cases handling MinGW, where _MSC_VER is
not defined at all, incorrectly, such as in pg_ctl.c and win32_port.h,
leading to some compiler warnings.  This should now be handled better.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
2019-10-08 10:50:54 +02:00
Michael Paquier a7471bd85c Update some outdated links about XLC and UNIX specification
Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm3Dy=dTdx8UCVw=DWbzLzmRUC1dkq45=heOZDUg3U_PtA@mail.gmail.com
2019-10-08 14:31:30 +09:00
Tom Lane 3887e9455f Check for too many postmaster children before spawning a bgworker.
The postmaster's code path for spawning a bgworker neglected to check
whether we already have the max number of live child processes.  That's
a bit hard to hit, since it would necessarily be a transient condition;
but if we do, AssignPostmasterChildSlot() fails causing a postmaster
crash, as seen in a report from Bhargav Kamineni.

To fix, invoke canAcceptConnections() in the bgworker code path, as we
do in the other code paths that spawn children.  Since we don't want
the same pmState tests in this case, add a child-process-type parameter
to canAcceptConnections() so that it can know what to do.

Back-patch to 9.5.  In principle the same hazard exists in 9.4, but the
code is enough different that this patch wouldn't quite fix it there.
Given the tiny usage of bgworkers in that branch it doesn't seem worth
creating a variant patch for it.

Discussion: https://postgr.es/m/18733.1570382257@sss.pgh.pa.us
2019-10-07 12:39:09 -04:00
Tom Lane ac12ab06a9 Avoid trying to release a List's initial allocation via repalloc().
Commit 1cff1b95a included some code that supposed it could repalloc()
a memory chunk to a smaller size without risk of the chunk moving.
That was not a great idea, because it depended on undocumented behavior
of AllocSetRealloc, which commit c477f3e44 changed thereby breaking it.
(Not to mention that this code ought to work with other memory context
types, which might not work the same...)  So get rid of the repalloc
calls, and instead just wipe the now-unused ListCell array and/or tell
Valgrind it's NOACCESS, as if we'd freed it.

In cases where the initial list allocation had been quite large, this
could represent an annoying waste of space.  In principle we could
ameliorate that by allocating the initial cell array separately when
it exceeds some threshold.  But that would complicate new_list() which
is hot code, and the returns would materialize only in narrow cases.
On balance I don't think it'd be worth it.

Discussion: https://postgr.es/m/17059.1570208426@sss.pgh.pa.us
2019-10-06 12:06:30 -04:00
Tomas Vondra 36425ece5d Change MemoryContextMemAllocated to return Size
Commit f2369bc610 switched most of the memory accounting from int64 to
Size, but it forgot to change the MemoryContextMemAllocated return type.
So this fixes that omission.

Discussion: https://www.postgresql.org/message-id/11238.1570200198%40sss.pgh.pa.us
2019-10-05 20:49:39 +02:00
Andres Freund d986d4e87f Fix crash caused by EPQ happening with a before update trigger present.
When ExecBRUpdateTriggers()'s GetTupleForTrigger() follows an EPQ
chain the former needs to run the result tuple through the junkfilter
again, and update the slot containing the new version of the tuple to
contain that new version. The input tuple may already be in the
junkfilter's output slot, which used to be OK - we don't need the
previous version anymore. Unfortunately ff11e7f4b9 started to use
ExecCopySlot() to update newslot, and ExecCopySlot() doesn't support
copying a slot into itself, leading to a slot in a corrupt
state, which then can cause crashes or other symptoms.

Fix this by skipping the ExecCopySlot() when copying into itself.

While we could have easily made ExecCopySlot() handle that case, it
seems better to add an assert forbidding doing so instead. As the goal
of copying might be to make the contents of one slot independent from
another, it seems failure prone to handle doing so silently.

A follow-up commit will add tests for the obviously under-covered
combination of EPQ and triggers. Done as a separate commit as it might
make sense to backpatch them further than this bug.

Also remove confusion with confusing variable names for slots in
ExecBRDeleteTriggers() and ExecBRUpdateTriggers().

Bug: #16036
Reported-By: Антон Власов
Author: Andres Freund
Discussion: https://postgr.es/m/16036-28184c90d952fb7f@postgresql.org
Backpatch: 12-, where ff11e7f4b9 was merged
2019-10-04 13:50:49 -07:00
Andres Freund a586cc4b6c Use a fd opened for read/write when syncing slots during startup, take 2.
Cribbing from dfbaed45975:
    Some operating systems, including the reporter's windows, return EBADFD
    or similar when fsync() is invoked on a O_RDONLY file descriptor.
    Unfortunately RestoreSlotFromDisk() does exactly that; which causes
    failures after restarts in at least some scenarios.

    If you hit the bug the error message will be something like
    ERROR: could not fsync file "pg_replslot/$name/state": Bad file descriptor

    Simply use O_RDWR instead of O_RDONLY when opening the relevant file
    descriptor to fix the bug.

Unfortunately this fix was undone in 82a5649fb9. Re-apply, and add a
comment.

Bug: 16039
Reported-By: Hans Buschmann
Author: Andres Freund
Discussion: https://postgr.es/m/16039-196fc97cc05e141c@postgresql.org
Backpatch: 12-, as 82a5649fb9
2019-10-04 13:34:28 -07:00
Robert Haas 2e8b6bfa90 Rename some toasting functions based on whether they are heap-specific.
The old names for the attribute-detoasting functions names included
the word "heap," which seems outdated now that the heap is only one of
potentially many table access methods.

On the other hand, toast_insert_or_update and toast_delete are
heap-specific, so rename them by adding "heap_" as a prefix.

Not all of the work of making the TOAST system fully accessible to AMs
other than the heap is done yet, but there seems to be little harm in
getting this renaming out of the way now. Commit
8b94dab066 already divided up the
functions among various files partially according to whether it was
intended that they should be heap-specific or AM-agnostic, so this is
just clarifying the division contemplated by that commit.

Patch by me, reviewed and tested by Prabhat Sabu, Thomas Munro,
Andres Freund, and Álvaro Herrera.

Discussion: http://postgr.es/m/CA+TgmoZv-=2iWM4jcw5ZhJeL18HF96+W1yJeYrnGMYdkFFnEpQ@mail.gmail.com
2019-10-04 14:24:46 -04:00
Tom Lane 61aa9f544a Fix bitshiftright()'s zero-padding some more.
Commit 5ac0d9360 failed to entirely fix bitshiftright's habit of
leaving one-bits in the pad space that should be all zeroes,
because in a moment of sheer brain fade I'd concluded that only
the code path used for not-a-multiple-of-8 shift distances needed
to be fixed.  Of course, a multiple-of-8 shift distance can also
cause the problem, so we need to forcibly zero the extra bits
in both cases.

Per bug #16037 from Alexander Lakhin.  As before, back-patch to all
supported branches.

Discussion: https://postgr.es/m/16037-1d1ebca564db54f4@postgresql.org
2019-10-04 10:34:40 -04:00
Tomas Vondra f2369bc610 Use Size instead of int64 to track allocated memory
Commit 5dd7fc1519 added block-level memory accounting, but used int64 variable to
track the amount of allocated memory. That is incorrect, because we have Size for
exactly these purposes, but it was mostly harmless until c477f3e449 which changed
how we handle with repalloc() when downsizing the chunk. Previously we've ignored
these cases and just kept using the original chunk, but now we need to update the
accounting, and the code was doing this:

    context->mem_allocated += blksize - oldblksize;

Both blksize and oldblksize are Size (so unsigned) which means the subtraction
underflows, producing a very high positive value. On 64-bit platforms (where Size
has the same size as mem_alllocated) this happens to work because the result wraps
to the right value, but on (some) 32-bit platforms this fails.

This fixes two things - it changes mem_allocated (and related variables) to Size,
and it splits the update to two separate steps, to prevent any underflows.

Discussion: https://www.postgresql.org/message-id/15151.1570163761%40sss.pgh.pa.us
2019-10-04 16:10:56 +02:00
Robert Haas 967e276e9f Remove AtSubStart_Notify.
Allocate notify-related state lazily instead. This makes trivial
subtransactions noticeably faster.

Patch by me, reviewed and tested by Dilip Kumar, Kyotaro Horiguchi,
and Jeevan Ladhe.

Discussion: https://postgr.es/m/CA+TgmobE1J22S1eC-6N-je9LgrcwZypkwp+zH6JXo9mc=4Nk3A@mail.gmail.com
2019-10-04 08:19:25 -04:00
Tom Lane 8e10405c74 Avoid unnecessary out-of-memory errors during encoding conversion.
Encoding conversion uses the very simplistic rule that the output
can't be more than 4X longer than the input, and palloc's a buffer
of that size.  This results in failure to convert any string longer
than 1/4 GB, which is becoming an annoying limitation.

As a band-aid to improve matters, allow the allocated output buffer
size to exceed 1GB.  We still insist that the final result fit into
MaxAllocSize (1GB), though.  Perhaps it'd be safe to relax that
restriction, but it'd require close analysis of all callers, which
is daunting (not least because external modules might call these
functions).  For the moment, this should allow a 2X to 4X improvement
in the longest string we can convert, which is a useful gain in
return for quite a simple patch.

Also, once we have successfully converted a long string, repalloc
the output down to the actual string length, returning the excess
to the malloc pool.  This seems worth doing since we can usually
expect to give back several MB if we take this path at all.

This still leaves much to be desired, most notably that the assumption
that MAX_CONVERSION_GROWTH == 4 is very fragile, and yet we have no
guard code verifying that the output buffer isn't overrun.  Fixing
that would require significant changes in the encoding conversion
APIs, so it'll have to wait for some other day.

The present patch seems safely back-patchable, so patch all supported
branches.

Alvaro Herrera and Tom Lane

Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql
Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us
2019-10-03 17:34:25 -04:00
Tom Lane c477f3e449 Allow repalloc() to give back space when a large chunk is downsized.
Up to now, if you resized a large (>8K) palloc chunk down to a smaller
size, aset.c made no attempt to return any space to the malloc pool.
That's unpleasant if a really large allocation is resized to a
significantly smaller size.  I think no such cases existed when this
code was designed, and I'm not sure whether they're common even yet,
but an upcoming fix to encoding conversion will certainly create such
cases.  Therefore, fix AllocSetRealloc so that it gives realloc()
a chance to do something with the block.  This doesn't noticeably
increase complexity, we mostly just have to change the order in which
the cases are considered.

Back-patch to all supported branches.

Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql
Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us
2019-10-03 13:56:26 -04:00
Andrew Gierth b7a1c5539a Selectively include window frames in expression walks/mutates.
query_tree_walker and query_tree_mutator were skipping the
windowClause of the query, without regard for the fact that the
startOffset and endOffset in a WindowClause node are expression trees
that need to be processed. This was an oversight in commit ec4be2ee6
from 2010 which added the expression fields; the main symptom is that
function parameters in window frame clauses don't work in inlined
functions.

Fix (as conservatively as possible since this needs to not break
existing out-of-tree callers) and add tests.

Backpatch all the way, since this has been broken since 9.0.

Per report from Alastair McKinley; fix by me with kibitzing and review
from Tom Lane.

Discussion: https://postgr.es/m/DB6PR0202MB2904E7FDDA9D81504D1E8C68E3800@DB6PR0202MB2904.eurprd02.prod.outlook.com
2019-10-03 10:54:52 +01:00
Michael Paquier df86e52cac Remove temporary WAL and history files at the end of archive recovery
cbc55da has reworked the order of some actions at the end of archive
recovery.  Unfortunately this overlooked the fact that the startup
process needs to remove RECOVERYXLOG (for temporary WAL segment newly
recovered from archives) and RECOVERYHISTORY (for temporary history
file) at this step, leaving the files around even after recovery ended.

Backpatch to 9.5, like the previous commit.

Author: Sawada Masahiko
Reviewed-by: Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/CAD21AoBO_eDQub6zojFnWtnmutRBWvYf7=cW4Hsqj+U_R26w3Q@mail.gmail.com
Backpatch-through: 9.5
2019-10-02 15:53:07 +09:00
Michael Paquier 9555cc8d2b Revert hooks for session start and end, take two
The location of the session end hook has been chosen so as it is
possible to allow modules to do their own transactions, however any
trying to any any subsystem which went through before_shmem_exit()
would cause issues, limiting the pluggability of the hook.

Per discussion with Tom Lane and Andres Freund.

Discussion: https://postgr.es/m/18722.1569906636@sss.pgh.pa.us
2019-10-02 09:55:27 +09:00
Tomas Vondra fa2fe04bf1 Mark two variables in in aset.c with PG_USED_FOR_ASSERTS_ONLY
This fixes two compiler warnings about unused variables in non-assert builds,
introduced by 5dd7fc1519.
2019-10-01 14:39:06 +02:00
Tomas Vondra 11a078cf87 Optimize partial TOAST decompression
Commit 4d0e994eed added support for partial TOAST decompression, so the
decompression is interrupted after producing the requested prefix. For
prefix and slices near the beginning of the entry, this may saves a lot
of decompression work.

That however only deals with decompression - the whole compressed entry
was still fetched and re-assembled, even though the compression used
only a small fraction of it. This commit improves that by computing how
much compressed data may be needed to decompress the requested prefix,
and then fetches only the necessary part.

We always need to fetch a bit more compressed data than the requested
(uncompressed) prefix, because the prefix may not be compressible at all
and pglz itself adds a bit of overhead. That means this optimization is
most effective when the requested prefix is much smaller than the whole
compressed entry.

Author: Binguo Bao
Reviewed-by: Andrey Borodin, Tomas Vondra, Paul Ramsey
Discussion: https://www.postgresql.org/message-id/flat/CAL-OGkthU9Gs7TZchf5OWaL-Gsi=hXqufTxKv9qpNG73d5na_g@mail.gmail.com
2019-10-01 14:28:28 +02:00
Michael Paquier e788bd924c Add hooks for session start and session end, take two
These hooks can be used in loadable modules.  A simple test module is
included.

The first attempt was done with cd8ce3a but we lacked handling for
NO_INSTALLCHECK in the MSVC scripts (problem solved afterwards by
431f1599) so the buildfarm got angry.  This also fixes a couple of
issues noticed upon review compared to the first attempt, so the code
has slightly changed, resulting in a more simple test module.

Author: Fabrízio de Royes Mello, Yugo Nagata
Reviewed-by: Andrew Dunstan, Michael Paquier, Aleksandr Parfenov
Discussion: https://postgr.es/m/20170720204733.40f2b7eb.nagata@sraoss.co.jp
Discussion: https://postgr.es/m/20190823042602.GB5275@paquier.xyz
2019-10-01 12:15:25 +09:00
Tomas Vondra 5dd7fc1519 Add transparent block-level memory accounting
Adds accounting of memory allocated in a memory context. Compared to
various ad hoc solutions, the main advantage is that the accounting is
transparent and does not require direct control over allocations (this
matters for use cases where the allocations happen in user code, like
for example aggregate states allocated in a transition functions).

To reduce overhead, the accounting happens at the block level (not for
individual chunks) and only the context immediately owning the block is
updated. When inquiring about amount of memory allocated in a context,
we have to recursively walk all children contexts.

This "lazy" accounting works well for cases with relatively small number
of contexts in the relevant subtree and/or with infrequent inquiries.

Author: Jeff Davis
Reivewed-by: Tomas Vondra, Melanie Plageman, Soumyadeep Chakraborty
Discussion: https://www.postgresql.org/message-id/flat/027a129b8525601c6a680d27ce3a7172dab61aab.camel@j-davis.com
2019-10-01 03:13:39 +02:00
Andres Freund 36d22dd95b Don't generate EEOP_*_FETCHSOME operations for slots know to be virtual.
That avoids unnecessary work during both interpreted execution, and
JIT compiled expression evaluation. Both benefit from fewer expression
steps needing be processed, and for interpreted execution there now is
a fastpath dedicated to just fetching a value from a virtual
slot. That's e.g. beneficial for hashjoins over nodes that perform
projections, as the hashed columns are currently fetched individually.

Author: Soumyadeep Chakraborty, Andres Freund
Discussion: https://postgr.es/m/CAE-ML+9OKSN71+mHtfMD-L24oDp8dGTfaVjDU6U+j+FNAW5kRQ@mail.gmail.com
2019-09-30 16:06:16 -07:00
Andres Freund 34c9c53bb0 Reduce code duplication for ExecJust*Var operations.
This is mainly in preparation for adding further fastpath evaluation
routines.

Also reorder ExecJust*Var functions to be consistent with the order in
which they're used.

Author: Andres Freund
Discussion: https://postgr.es/m/CAE-ML+9OKSN71+mHtfMD-L24oDp8dGTfaVjDU6U+j+FNAW5kRQ@mail.gmail.com
2019-09-30 15:32:00 -07:00
Fujii Masao 7acf8a876b Make crash recovery ignore recovery target settings.
In v11 or before, recovery target settings could not take effect in
crash recovery because they are specified in recovery.conf and
crash recovery always starts without recovery.conf. But commit
2dedf4d9a8 integrated recovery.conf into postgresql.conf and
which unexpectedly allowed recovery target settings to take effect
even in crash recovery. This is definitely not good behavior.

To fix the issue, this commit makes crash recovery always ignore
recovery target settings.

Back-patch to v12.

Author: Peter Eisentraut
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
2019-09-30 10:18:15 +09:00
Andres Freund ac88807f9b jit: Re-allow JIT compilation of execGrouping.c hashtable comparisons.
In the course of 5567d12ce0, 356687bd8 and 317ffdfeaa, I changed
BuildTupleHashTable[Ext]'s call to ExecBuildGroupingEqual to not pass
in the parent node, but NULL. Which in turn prevents the tuple
equality comparator from being JIT compiled.  While that fixes
bug #15486, it is not actually necessary after all of the above commits,
as we don't re-build the comparator when using the new
BuildTupleHashTableExt() interface (as the content of the hashtable
are reset, but the TupleHashTable itself is not).

Therefore re-allow jit compilation for callers that use
BuildTupleHashTableExt with a separate context for "metadata" and
content.

As in the previous commit, there's ongoing work to make this easier to
test to prevent such regressions in the future, but that
infrastructure is not going to be backpatchable.

The performance impact of not JIT compiling hashtable equality
comparators can be substantial e.g. for aggregation queries that
aggregate a lot of input rows to few output rows (when there are a lot
of output groups, there will be fewer comparisons).

Author: Andres Freund
Discussion: https://postgr.es/m/20190927072053.njf6prdl3vb7y7qb@alap3.anarazel.de
Backpatch: 11, just as 5567d12ce0
2019-09-29 16:24:32 -07:00
Andres Freund 97e971ee05 Fix determination when slot types for upper executor nodes are fixed.
For many queries the fact that the tuple descriptor from the lower
node was not taken into account when determining whether the type of a
slot is fixed, lead to tuple deforming for such upper nodes not to be
JIT accelerated.

I broke this in 675af5c01e.

There is ongoing work to enable writing regression tests for related
behavior (including a patch that would have detected this
regression), by optionally showing such details in EXPLAIN. But as it
seems unlikely that that will be suitable for stable branches, just
merge the fix for now.

While it's fairly close to the 12 release window, the fact that 11
continues to perform JITed tuple deforming in these cases, that
there's still cases where we do so in 12, and the fact that the
performance regression can be sizable, weigh in favor of fixing it
now.

Author: Andres Freund
Discussion: https://postgr.es/m/20190927072053.njf6prdl3vb7y7qb@alap3.anarazel.de
Backpatch: 12-, where 675af5c01e was merged.
2019-09-29 15:46:17 -07:00
Peter Eisentraut 4e6f101e92 Fix compilation with older OpenSSL versions
Some older OpenSSL versions (0.9.8 branch) define TLS*_VERSION macros
but not the corresponding SSL_OP_NO_* macro, which causes the code for
handling ssl_min_protocol_version/ssl_max_protocol_version to fail to
compile.  To fix, add more #ifdefs and error handling.

Reported-by: Victor Wagner <vitus@wagner.pp.ru>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/20190924101859.09383b4f%40fafnir.local.vm
2019-09-28 22:49:01 +02:00
Michael Paquier 55282fa20f Remove code relevant to OpenSSL 0.9.6 in be/fe-secure-openssl.c
HEAD supports OpenSSL 0.9.8 and newer versions, and this code likely got
forgotten as its surrounding comments mention an incorrect version
number.

Author: Michael Paquier
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/20190927032311.GB8485@paquier.xyz
2019-09-28 15:22:49 +09:00
Andres Freund 3f6b3be39c Silence -Wmaybe-uninitialized compiler warnings in dbcommands.c.
When compiling postgres using gcc -O3, there are false-positive
warnings about the now initialized variables. Silence them.

Author: Peter Eisentraut, Andres Freund
Discussion: https://postgr.es/m/15fb2350-b8b8-e188-278f-0b34fdee5210@2ndquadrant.com
2019-09-27 14:14:30 -07:00
Andres Freund c967e13f40 Fix implicit-fallthrough compiler warning introduced in 6dda292d4d.
For some reason at least gcc-9 warns about the fallthrough, even
though it otherwise recognizes that elog(ERROR, ...) doesn't return.

Author: Andres Freund
2019-09-27 10:29:25 -07:00
Michael Paquier fbfa566488 Fix lockmode initialization for custom relation options
The code was enforcing AccessExclusiveLock for all custom relation
options, which is incorrect as the APIs allow a custom lock level to be
set.

While on it, fix a couple of inconsistencies in the tests and the README
of dummy_index_am.

Oversights in commit 773df88.

Discussion: https://postgr.es/m/20190925234152.GA2115@paquier.xyz
2019-09-27 09:31:20 +09:00
Michael Paquier 6e22813b2d Fix comment in xlogreader.c
This has been introduced by 709d003, that has moved readSegNo, readOff
and readPageTLI into a new structure called WALOpenSegment initialized
separately.

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20190926.110809.248342687.horikyota.ntt@gmail.com
2019-09-26 11:53:37 +09:00
Alexander Korotkov 7881bb14f4 Correctly cast types to Datum and back in compareDatetime()
Discussion: https://postgr.es/m/CAPpHfdteFKW6MLpXM4md99m55YAuXs0n9_P2wiTq_EmG09doUA%40mail.gmail.com
2019-09-26 02:09:01 +03:00
Tom Lane b81a9c2fc5 Fix handling of GENERATED columns in CREATE TABLE LIKE INCLUDING DEFAULTS.
LIKE INCLUDING DEFAULTS tried to copy the attrdef expression without
copying the state of the attgenerated column.  This is in fact wrong,
because GENERATED and DEFAULT expressions are not the same kind of animal;
one can contain Vars and the other not.  We *must* copy attgenerated
when we're copying the attrdef expression.  Rearrange the if-tests
so that the expression is copied only when the correct one of
INCLUDING DEFAULTS and INCLUDING GENERATED has been specified.

Per private report from Manuel Rigger.

Tom Lane and Peter Eisentraut
2019-09-25 17:30:42 -04:00
Alexander Korotkov bffe1bd684 Implement jsonpath .datetime() method
This commit implements jsonpath .datetime() method as it's specified in
SQL/JSON standard.  There are no-argument and single-argument versions of
this method.  No-argument version selects first of ISO datetime formats
matching input string.  Single-argument version accepts template string as
its argument.

Additionally to .datetime() method itself this commit also implements
comparison ability of resulting date and time values.  There is some difficulty
because exising jsonb_path_*() functions are immutable, while comparison of
timezoned and non-timezoned types involves current timezone.  At first, current
timezone could be changes in session.  Moreover, timezones themselves are not
immutable and could be updated.  This is why we let existing immutable functions
throw errors on such non-immutable comparison.  In the same time this commit
provides jsonb_path_*_tz() functions which are stable and support operations
involving timezones.  As new functions are added to the system catalog,
catversion is bumped.

Support of .datetime() method was the only blocker prevents T832 from being
marked as supported.  sql_features.txt is updated correspondingly.

Extracted from original patch by Nikita Glukhov, Teodor Sigaev, Oleg Bartunov.
Heavily revised by me.  Comments were adjusted by Liudmila Mantrova.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Alexander Korotkov, Nikita Glukhov, Teodor Sigaev, Oleg Bartunov, Liudmila Mantrova
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-25 22:51:51 +03:00
Alexander Korotkov 6dda292d4d Allow datetime values in JsonbValue
SQL/JSON standard allows manipulation with datetime values.  So, it appears to
be convinient to allow datetime values to be represented in JsonbValue struct.
These datetime values are allowed for temporary representation only.  During
serialization datetime values are converted into strings.

SQL/JSON requires writing timestamps with timezone in the same timezone offset
as they were parsed.  This is why we allow storage of timezone offset in
JsonbValue struct.  For the same reason timezone offset argument is added to
JsonEncodeDateTime() function.

Extracted from original patch by Nikita Glukhov, Teodor Sigaev, Oleg Bartunov.
Revised by me.  Comments were adjusted by Liudmila Mantrova.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Nikita Glukhov, Teodor Sigaev, Oleg Bartunov, Alexander Korotkov, Liudmila Mantrova
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-25 22:51:51 +03:00
Alexander Korotkov 5bc450629b Error suppression support for upcoming jsonpath .datetime() method
Add support of error suppression in some date and time manipulation functions
as it's required for jsonpath .datetime() method support.  This commit doesn't
use PG_TRY()/PG_CATCH() in order to implement that.  Instead, it provides
internal versions of date and time functions used, which support error
suppression.

Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Alexander Korotkov, Nikita Glukhov
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-25 22:51:51 +03:00
Alexander Korotkov 66c74f8b6e Implement parse_datetime() function
This commit adds parse_datetime() function, which implements datetime
parsing with extended features demanded by upcoming jsonpath .datetime()
method:

 * Dynamic type identification based on template string,
 * Support for standard-conforming 'strict' mode,
 * Timezone offset is returned as separate value.

Extracted from original patch by Nikita Glukhov, Teodor Sigaev, Oleg Bartunov.
Revised by me.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Nikita Glukhov, Teodor Sigaev, Oleg Bartunov, Alexander Korotkov
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-25 22:51:51 +03:00
Alexander Korotkov 1a950f37d0 Implement standard datetime parsing mode
SQL Standard 2016 defines rules for handling separators in datetime template
strings, which are different to to_date()/to_timestamp() rules.  Standard
allows only small set of separators and requires strict matching for them.

Standard applies to jsonpath .datetime() method and CAST (... FORMAT ...) SQL
clause.  We're not going to change handling of separators in existing
to_date()/to_timestamp() functions, because their current behavior is familiar
for users.  Standard behavior now available by special flag, which will be used
in upcoming .datetime() jsonpath method.

Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Alexander Korotkov
2019-09-25 22:51:29 +03:00
Alvaro Herrera 773df883e8 Support reloptions of enum type
All our current in core relation options of type string (not many,
admittedly) behave in reality like enums.  But after seeing an
implementation for enum reloptions, it's clear that strings are messier,
so introduce the new reloption type.  Switch all string options to be
enums instead.

Fortunately we have a recently introduced test module for reloptions, so
we don't lose coverage of string reloptions, which may still be used by
third-party modules.

Authors: Nikolay Shaplov, Álvaro Herrera
Reviewed-by: Nikita Glukhov, Aleksandr Parfenov
Discussion: https://postgr.es/m/43332102.S2V5pIjXRx@x200m
2019-09-25 15:56:52 -03:00
Michael Paquier 69f9410807 Allow definition of lock mode for custom reloptions
Relation options can define a lock mode other than AccessExclusiveMode
since 47167b7, but modules defining custom relation options did not
really have a way to enforce that.  Correct that by extending the
current API set so as modules can define a custom lock mode.

Author: Michael Paquier
Reviewed-by: Kuntal Ghosh
Discussion: https://postgr.es/m/20190920013831.GD1844@paquier.xyz
2019-09-25 10:13:52 +09:00
Michael Paquier 736b84eede Fix failure with lock mode used for custom relation options
In-core relation options can use a custom lock mode since 47167b7, that
has lowered the lock available for some autovacuum parameters.  However
it forgot to consider custom relation options.  This causes failures
with ALTER TABLE SET when changing a custom relation option, as its lock
is not defined.  The existing APIs to define a custom reloption does not
allow to define a custom lock mode, so enforce its initialization to
AccessExclusiveMode which should be safe enough in all cases.  An
upcoming patch will extend the existing APIs to allow a custom lock mode
to be defined.

The problem can be reproduced with bloom indexes, so add a test there.

Reported-by: Nikolay Sharplov
Analyzed-by: Thomas Munro, Michael Paquier
Author: Michael Paquier
Reviewed-by: Kuntal Ghosh
Discussion: https://postgr.es/m/20190920013831.GD1844@paquier.xyz
Backpatch-through: 9.6
2019-09-25 10:07:23 +09:00
Alexander Korotkov 90c0987258 Fix bug in pairingheap_SpGistSearchItem_cmp()
Our item contains only so->numberOfNonNullOrderBys of distances.  Reflect that
in the loop upper bound.

Discussion: https://postgr.es/m/53536807-784c-e029-6e92-6da802ab8d60%40postgrespro.ru
Author: Nikita Glukhov
Backpatch-through: 12
2019-09-25 01:47:36 +03:00
Alvaro Herrera 709d003fbd Rework WAL-reading supporting structs
The state-tracking of WAL reading in various places was pretty messy,
mostly because the ancient physical-replication WAL reading code wasn't
using the XLogReader abstraction.  This led to some untidy code.  Make
it prettier by creating two additional supporting structs,
WALSegmentContext and WALOpenSegment which keep track of WAL-reading
state.  This makes code cleaner, as well as supports more future
cleanup.

Author: Antonin Houska
Reviewed-by: Álvaro Herrera and (older versions) Robert Haas
Discussion: https://postgr.es/m/14984.1554998742@spoje.net
2019-09-24 16:39:53 -03:00
Tom Lane a9ae99d019 Prevent bogus pullup of constant-valued functions returning composite.
Fix an oversight in commit 7266d0997: as it stood, the code failed
when a function-in-FROM returns composite and can be simplified
to a composite constant.

For the moment, just test for composite result and abandon pullup
if we see one.  To make it actually work, we'd have to decompose
the composite constant into per-column constants; which is surely
do-able, but I'm not convinced it's worth the code space.

Per report from Raúl Marín Rodríguez.

Discussion: https://postgr.es/m/CAM6_UM4isP+buRA5sWodO_MUEgutms-KDfnkwGmryc5DGj9XuQ@mail.gmail.com
2019-09-24 12:11:32 -04:00
Fujii Masao 6d05086c0a Speedup truncations of relation forks.
When a relation is truncated, shared_buffers needs to be scanned
so that any buffers for the relation forks are invalidated in it.
Previously, shared_buffers was scanned for each relation forks, i.e.,
MAIN, FSM and VM, when VACUUM truncated off any empty pages
at the end of relation or TRUNCATE truncated the relation in place.
Since shared_buffers needed to be scanned multiple times,
it could take a long time to finish those commands especially
when shared_buffers was large.

This commit changes the logic so that shared_buffers is scanned only
one time for those three relation forks.

Author: Kirk Jamison
Reviewed-by: Masahiko Sawada, Thomas Munro, Alvaro Herrera, Takayuki Tsunakawa and Fujii Masao
Discussion: https://postgr.es/m/D09B13F772D2274BB348A310EE3027C64E2067@g01jpexmbkw24
2019-09-24 17:31:26 +09:00
Andres Freund 30d1379658 Fix ExprState's tag to be of type NodeTag rather than Node.
This appears to have been an oversight in b8d7f053c5. As it's
effectively harmless, though confusing, only fix in master.

Author: Andres Freund
2019-09-23 15:28:13 -07:00
Peter Eisentraut 887248e97e Message style fixes 2019-09-23 13:38:39 +02:00
Tom Lane 5ac0d93600 Fix failure to zero-pad the result of bitshiftright().
If the bitstring length is not a multiple of 8, we'd shift the
rightmost bits into the pad space, which must be zeroes --- bit_cmp,
for one, depends on that.  This'd lead to the result failing to
compare equal to what it should compare equal to, as reported in
bug #16013 from Daryl Waycott.

This is, if memory serves, not the first such bug in the bitstring
functions.  In hopes of making it the last one, do a bit more work
than minimally necessary to fix the bug:

* Add assertion checks to bit_out() and varbit_out() to complain if
they are given incorrectly-padded input.  This will improve the
odds that manual testing of any new patch finds problems.

* Encapsulate the padding-related logic in macros to make it
easier to use.

Also, remove unnecessary padding logic from bit_or() and bitxor().
Somebody had already noted that we need not re-pad the result of
bit_and() since the inputs are required to be the same length,
but failed to extrapolate that to the other two.

Also, move a comment block that once was near the head of varbit.c
(but people kept putting other stuff in front of it), to put it in
the header block.

Note for the release notes: if anyone has inconsistent data as a
result of saving the output of bitshiftright() in a table, it's
possible to fix it with something like
UPDATE mytab SET bitcol = ~(~bitcol) WHERE bitcol != ~(~bitcol);

This has been broken since day one, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/16013-c2765b6996aacae9@postgresql.org
2019-09-22 17:45:59 -04:00
Tom Lane 0a2f894c3c Fix typo in tts_virtual_copyslot.
The code used the destination slot's natts where it intended to
use the source slot's natts.  Adding an Assert shows that there
is no case in "make check-world" where these counts are different,
so maybe this is a harmless bug, but it's still a bug.

Takayuki Tsunakawa

Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1FD34C0E@G01JPEXMBYT05
2019-09-22 14:21:07 -04:00
Tom Lane 51004c7172 Make some efficiency improvements in LISTEN/NOTIFY.
Move the responsibility for advancing the NOTIFY queue tail pointer
from the listener(s) to the notification sender, and only have the
sender do it once every few queue pages, rather than after every batch
of notifications as at present.  This reduces the number of times we
execute asyncQueueAdvanceTail, and reduces contention when there are
multiple listeners (since that function requires exclusive lock).
This change relies on the observation that we don't really need the tail
pointer to be exactly up-to-date.  It's certainly not necessary to
attempt to release disk space more often than once per SLRU segment.
The only other usage of the tail pointer is that an incoming listener,
if it's the only listener in its database, will need to scan the queue
forward from the tail; but that's surely a less performance-critical
path than routine sending and receiving of notifies.  We compromise by
advancing the tail pointer after every 4 pages of output, so that it
shouldn't get more than a few pages behind.

Also, when sending signals to other backends after adding notify
message(s) to the queue, recognize that only backends in our own
database are going to care about those messages, so only such
backends really need to be awakened promptly.  Backends in other
databases should get kicked if they're well behind on reading the
queue, else they'll hold back the global tail pointer; but wakening
them for every single message is pointless.  This change can
substantially reduce signal traffic if listeners are spread among
many databases.  It won't help for the common case of only a single
active database, but the extra check costs very little.

Martijn van Oosterhout, with some adjustments by me

Discussion: https://postgr.es/m/CADWG95vtRBFDdrx1JdT1_9nhOFw48KaeTev6F_LtDQAFVpSPhA@mail.gmail.com
Discussion: https://postgr.es/m/CADWG95uFj8rLM52Er80JnhRsTbb_AqPP1ANHS8XQRGbqLrU+jA@mail.gmail.com
2019-09-22 11:46:29 -04:00
Tom Lane c160b8928c Straighten out leakproofness markings on text comparison functions.
Since we introduced the idea of leakproof functions, texteq and textne
were marked leakproof but their sibling text comparison functions were
not.  This inconsistency seemed justified because texteq/textne just
relied on memcmp() and so could easily be seen to be leakproof, while
the other comparison functions are far more complex and indeed can
throw input-dependent errors.

However, that argument crashed and burned with the addition of
nondeterministic collations, because now texteq/textne may invoke
the exact same varstr_cmp() infrastructure as the rest.  It makes no
sense whatever to give them different leakproofness markings.

After a certain amount of angst we've concluded that it's all right
to consider varstr_cmp() to be leakproof, mostly because the other
choice would be disastrous for performance of many queries where
leakproofness matters.  The input-dependent errors should only be
reachable for corrupt input data, or so we hope anyway; certainly,
if they are reachable in practice, we've got problems with requirements
as basic as maintaining a btree index on a text column.

Hence, run around to all the SQL functions that derive from varstr_cmp()
and mark them leakproof.  This should result in a useful gain in
flexibility/performance for queries in which non-leakproofness degrades
the efficiency of the query plan.

Back-patch to v12 where nondeterministic collations were added.
While this isn't an essential bug fix given the determination
that varstr_cmp() is leakproof, we might as well apply it now that
we've been forced into a post-beta4 catversion bump.

Discussion: https://postgr.es/m/31481.1568303470@sss.pgh.pa.us
2019-09-21 16:56:30 -04:00
Tom Lane 2810396312 Fix up handling of nondeterministic collations with pattern_ops opclasses.
text_pattern_ops and its siblings can't be used with nondeterministic
collations, because they use the text_eq operator which will not behave
as bitwise equality if applied with a nondeterministic collation.  The
initial implementation of that restriction was to insert a run-time test
in the related comparison functions, but that is inefficient, may throw
misleading errors, and will throw errors in some cases that would work.
It seems sufficient to just prevent the combination during CREATE INDEX,
so do that instead.

Lacking any better way to identify the opclasses involved, we need to
hard-wire tests for them, which requires hand-assigned values for their
OIDs, which forces a catversion bump because they previously had OIDs
that would be assigned automatically.  That's slightly annoying in the
v12 branch, but fortunately we're not at rc1 yet, so just do it.

Back-patch to v12 where nondeterministic collations were added.

In passing, run make reformat-dat-files, which found some unrelated
whitespace issues (slightly different ones in HEAD and v12).

Peter Eisentraut, with small corrections by me

Discussion: https://postgr.es/m/22566.1568675619@sss.pgh.pa.us
2019-09-21 16:29:17 -04:00
Alvaro Herrera 1a2983231d Split out code into new getKeyJsonValueFromContainer()
The new function stashes its output value in a JsonbValue that can be
passed in by the caller, which enables some of them to pass
stack-allocated structs -- saving palloc cycles.  It also allows some
callers that know they are handling a jsonb object to use this new jsonb
object-specific API, instead of going through generic container
findJsonbValueFromContainer.

Author: Nikita Glukhov
Discussion: https://postgr.es/m/7c417f90-f95f-247e-ba63-d95e39c0ad14@postgrespro.ru
2019-09-20 20:18:11 -03:00
Alvaro Herrera dbb9aeda99 Optimize get_jsonb_path_all avoiding an iterator
Instead of creating an iterator object at each step down the JSONB
object/array, we can just just examine its object/array flags, which is
faster.  Also, use the recently introduced JsonbValueAsText instead of
open-coding the same thing, for code simplicity.

Author: Nikita Glukhov
Discussion: https://postgr.es/m/7c417f90-f95f-247e-ba63-d95e39c0ad14@postgrespro.ru
2019-09-20 19:31:32 -03:00
Alvaro Herrera abb014a631 Refactor code into new JsonbValueAsText, and use it more
jsonb_object_field_text and jsonb_array_element_text both contained
identical copies of this code, so extract that into new routine
JsonbValueAsText.  This can also be used in other places, to measurable
performance benefit: the jsonb_each() and jsonb_array_elements()
functions can use it for outputting text forms instead of their less
efficient current implementation (because we no longer need to build
intermediate a jsonb representation of each value).

Author: Nikita Glukhov
Discussion: https://postgr.es/m/7c417f90-f95f-247e-ba63-d95e39c0ad14@postgrespro.ru
2019-09-20 19:30:16 -03:00
Tom Lane e56cad84d5 Fix some minor spec-compliance issues in jsonpath lexer.
Although the SQL/JSON tech report makes reference to ECMAScript which
allows both single- and double-quoted strings, all the rest of the
report speaks only of double-quoted string literals in jsonpaths.
That's more compatible with JSON itself; moreover single-quoted strings
are hard to use inside a jsonpath that is itself a single-quoted SQL
literal.  So guess that the intent is to allow only double-quoted
literals, and remove lexer support for single-quoted literals.
It'll be less painful to add this again later if we're wrong, than to
remove a shipped feature.

Also, adjust the lexer so that unrecognized backslash sequences are
treated as just meaning the escaped character, not as errors.  This
change has much better support in the standards, as JSON, JavaScript
and ECMAScript all make it plain that that's what's supposed to
happen.

Back-patch to v12.

Discussion: https://postgr.es/m/CAPpHfdvDci4iqNF9fhRkTqhe-5_8HmzeLt56drH%2B_Rv2rNRqfg@mail.gmail.com
2019-09-20 14:22:58 -04:00
Alvaro Herrera d1b0007639 Fix progress report of REINDEX INDEX
I (Álvaro) broke that in commit 6212276e43 -- forgot to set the
necessary flag.  Repair.

Author: Amit Langote
Discussion: https://postgr.es/m/CA+HiwqEaM2tV5awKhP1vSbgjQe_uXVU15Oi4sTgwgempwMiT8g@mail.gmail.com
2019-09-20 12:56:00 -03:00
Alexander Korotkov 8c8a267201 Fix freeing old values in index_store_float8_orderby_distances()
6cae9d2c10 has added an error in freeing old values in
index_store_float8_orderby_distances() function.  It looks for old value in
scan->xs_orderbynulls[i] after setting a new value there.
This commit fixes that.  Also it removes short-circuit in handling
distances == NULL situation.  Now distances == NULL will be treated the same
way as array with all null distances.  That is, previous values will be freed
if any.

Reported-by: Tom Lane, Nikita Glukhov
Discussion: https://postgr.es/m/CAPpHfdu2wcoAVAm3Ek66rP%3Duo_C-D84%2B%2Buf1VEcbyi_caBXWCA%40mail.gmail.com
Discussion: https://postgr.es/m/426580d3-a668-b9d1-7b8e-f74d1a6524e0%40postgrespro.ru
Backpatch-through: 12
2019-09-20 01:19:08 +03:00
Alexander Korotkov 6cae9d2c10 Improve handling of NULLs in KNN-GiST and KNN-SP-GiST
This commit improves subject in two ways:

 * It removes ugliness of 02f90879e7, which stores distance values and null
   flags in two separate arrays after GISTSearchItem struct.  Instead we pack
   both distance value and null flag in IndexOrderByDistance struct.  Alignment
   overhead should be negligible, because we typically deal with at most few
   "col op const" expressions in ORDER BY clause.
 * It fixes handling of "col op NULL" expression in KNN-SP-GiST.  Now, these
   expression are not passed to support functions, which can't deal with them.
   Instead, NULL result is implicitly assumed.  It future we may decide to
   teach support functions to deal with NULL arguments, but current solution is
   bugfix suitable for backpatch.

Reported-by: Nikita Glukhov
Discussion: https://postgr.es/m/826f57ee-afc7-8977-c44c-6111d18b02ec%40postgrespro.ru
Author: Nikita Glukhov
Reviewed-by: Alexander Korotkov
Backpatch-through: 9.4
2019-09-19 21:48:39 +03:00
Peter Eisentraut e1c8743e6c GSSAPI error message improvements
Make the error messages around GSSAPI encryption a bit clearer.  Tweak
some messages to avoid plural problems.

Also make a code change for clarity.  Using "conf" for "confidential"
is quite confusing.  Using "conf_state" is perhaps not much better but
that's what the GSSAPI documentation uses, so there is at least some
hope of understanding it.
2019-09-19 15:09:49 +02:00
Fujii Masao 33a94bae60 Remove unused smgrdounlinkfork() function.
smgrdounlinkfork() became dead code as the result of commit ece01aae47,
but it was left in place just in case we want it someday. However no users
have appeared in 7 years, so it's time to remove this unused function.

Author: Kirk Jamison
Discussion: https://www.postgresql.org/message-id/D09B13F772D2274BB348A310EE3027C64E2067@g01jpexmbkw24
2019-09-18 21:05:33 +09:00
Peter Eisentraut 48770492c3 Add some const decorations to array constants
Author: Mark G <markg735@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAEeOP_YFVeFjq4zDZLDQbLSRFxBiTpwBQHxCNgGd%2Bp5VztTXyQ%40mail.gmail.com
2019-09-17 22:03:00 +02:00
Tom Lane d5b90cd648 Fix bogus handling of XQuery regex option flags.
The SQL spec defers to XQuery to define what the option flags are
for LIKE_REGEX patterns.  XQuery says that:
* 's' allows the dot character to match newlines, which by
  default it will not;
* 'm' allows ^ and $ to match at newlines, not only at the
  start/end of the whole string.
Thus, these are *not* inverses as they are for the similarly-named
POSIX options, and neither one corresponds to the POSIX 'n' option.
Fortunately, Spencer's library does expose these two behaviors as
separately twiddlable flags, so we just have to fix the mapping from
JSP flag bits to REG flag bits.  I also chose to rename the symbol
for 's' to DOTALL, to make it clearer that it's not the inverse
of MLINE.

Also, XQuery says that if the 'q' flag "is used together with the m, s,
or x flag, that flag has no effect".  I read this as saying that 'q'
overrides the other flags; whoever wrote our code seems to have read
it backwards.

Lastly, while XQuery's 'x' flag is related to what Spencer's code
does for REG_EXPANDED, it's not the same or a subset.  It seems best
to treat XQuery's 'x' as unimplemented for now.  Maybe later we can
expand our regex code to offer 'x'-style parsing as a separate option.

While at it, refactor the jsonpath code so that (a) there's only
one copy of the flag transformation logic not two, and (b) the
processing of flags is independent of the order in which the flags
are written.

We need some documentation updates to go with this, but I'll
tackle that separately.

Back-patch to v12 where this code originated.

Discussion: https://postgr.es/m/CAPpHfdvDci4iqNF9fhRkTqhe-5_8HmzeLt56drH%2B_Rv2rNRqfg@mail.gmail.com
Reference: https://www.w3.org/TR/2017/REC-xpath-functions-31-20170321/#flags
2019-09-17 15:39:51 -04:00
Peter Eisentraut a25221f53c Remove mingwcompat.c
We believe that the issues that this was working around have been
fixed in MinGW more than 5 years ago, so this isn't necessary anymore.

Discussion: https://www.postgresql.org/message-id/flat/20190719050830.GK1859%40paquier.xyz
2019-09-17 11:34:28 +02:00
Alexander Korotkov b64b857f50 Support for SSSSS datetime format pattern
SQL Standard 2016 defines SSSSS format pattern for seconds past midnight in
jsonpath .datetime() method and CAST (... FORMAT ...) SQL clause.  In our
datetime parsing engine we currently support it with SSSS name.

This commit adds SSSSS as an alias for SSSS.  Alias is added in favor of
upcoming jsonpath .datetime() method.  But it's also supported in to_date()/
to_timestamp() as positive side effect.

Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Nikita Glukhov, Alexander Korotkov
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-16 21:14:56 +03:00
Alexander Korotkov d589f94460 Support for FF1-FF6 datetime format patterns
SQL Standard 2016 defines FF1-FF9 format patters for fractions of seconds in
jsonpath .datetime() method and CAST (... FORMAT ...) SQL clause.  Parsing
engine of upcoming .datetime() method will be shared with to_date()/
to_timestamp().

This patch implements FF1-FF6 format patterns for upcoming jsonpath .datetime()
method.  to_date()/to_timestamp() functions will also get support of this
format patterns as positive side effect.  FF7-FF9 are not supported due to
lack of precision in our internal timestamp representation.

Extracted from original patch by Nikita Glukhov, Teodor Sigaev, Oleg Bartunov.
Heavily revised by me.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Nikita Glukhov, Teodor Sigaev, Oleg Bartunov, Alexander Korotkov
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-16 21:14:32 +03:00
Tom Lane d812257809 Fix bogus sizeof calculations.
Noted by Coverity.  Typo in 27cc7cd2b, so back-patch to v12
as that was.
2019-09-15 11:51:57 -04:00
Tom Lane b360e0fcd7 Make tuplesort_set_bound() assertions more comprehensible, hopefully.
Add the comments that I griped were missing.  Also re-order tests
so that parallelism-related tests aren't randomly separated from
each other.

Discussion: https://postgr.es/m/CAAaqYe9GD__4Crm=ddz+-XXcNhfY_V5gFYdLdmkFNq=2VHO56Q@mail.gmail.com
2019-09-13 16:57:07 -04:00
Alvaro Herrera bac2fae05c logical decoding: process ASSIGNMENT during snapshot build
Most WAL records are ignored in early SnapBuild snapshot build phases.
But it's critical to process some of them, so that later messages have
the correct transaction state after the snapshot is completely built; in
particular, XLOG_XACT_ASSIGNMENT messages are critical in order for
sub-transactions to be correctly assigned to their parent transactions,
or at least one assert misbehaves, as reported by Ildar Musin.

Diagnosed-by: Masahiko Sawada
Author: Masahiko Sawada
Discussion: https://postgr.es/m/CAONYFtOv+Er1p3WAuwUsy1zsCFrSYvpHLhapC_fMD-zNaRWxYg@mail.gmail.com
2019-09-13 16:36:28 -03:00
Alvaro Herrera 6212276e43 Fix progress reporting of CLUSTER / VACUUM FULL
The progress state was being clobbered once the first index completed
being rebuilt, causing the final phases of the operation not show
anything in the progress view.  This was inadvertently broken in
03f9e5cba0, which added progress tracking for REINDEX.

(The reason this bugfix is this small is that I had already noticed this
problem when writing monitoring for CREATE INDEX, and had already worked
around it, as can be seen in discussion starting at
https://postgr.es/m/20190329150218.GA25010@alvherre.pgsql Fixing the
problem is just a matter of fixing one place touched by the REINDEX
monitoring.)

Reported by: Álvaro Herrera
Author: Álvaro Herrera
Discussion: https://postgr.es/m/20190801184333.GA21369@alvherre.pgsql
2019-09-13 14:54:26 -03:00
Peter Geoghegan 3b6b54f178 Fix nbtree page split rmgr desc routine.
Include newitemoff in rmgr desc output for nbtree page split records.
In passing, correct an obsolete comment that claimed that newitemoff is
only logged for _L variant nbtree page split WAL records.

Both issues were oversights in commit 2c03216d83, which revamped the
WAL format.

Author: Peter Geoghegan
Backpatch: 9.5-, where the WAL format was revamped.
2019-09-12 15:45:08 -07:00
Peter Geoghegan 1b9becd43c Remove redundant _bt_truncate() comment paragraph. 2019-09-12 09:51:27 -07:00
Alvaro Herrera bc98e1ea64 Merge two assertions to make comment clearer
Authored by Tom Lane, after a gripe from James Coleman.

Discussion: https://postgr.es/m/CAAaqYe9GD__4Crm=ddz+-XXcNhfY_V5gFYdLdmkFNq=2VHO56Q@mail.gmail.com
2019-09-12 10:37:04 -03:00
Tom Lane 9a86f03b4e Rearrange postmaster's startup sequence for better syslogger results.
This is a second try at what commit 57431a911 tried to do, namely,
launch the syslogger before we open postmaster sockets so that our
messages about the sockets end up in the syslogger files.  That
commit fell foul of a bunch of subtle issues caused by trying to
launch a postmaster child process before creating shared memory.
Rather than messing with that interaction, let's postpone opening
the sockets till after we launch the syslogger.

This would not have been terribly safe before commit 7de19fbc0,
because we relied on socket opening to detect whether any competing
postmasters were using the same port number.  But now that we choose
IPC keys without regard to the port number, there's no interaction
to worry about.

Also delay creation of the external PID file (if requested) till after
the sockets are open, since external code could plausibly be relying
on that ordering of events.  And postpone most of the work of
RemovePgTempFiles() so that that potentially-slow processing still
happens after we make the external PID file.  We have to be a bit
careful about that last though: as noted in the discussion subsequent to
bug #15804, EXEC_BACKEND builds still have to clear the parameter-file
temp dir before launching the syslogger.

Patch by me; thanks to Michael Paquier for review/testing.

Discussion: https://postgr.es/m/15804-3721117bf40fb654@postgresql.org
2019-09-11 11:43:01 -04:00
Tomas Vondra d06215d03b Allow setting statistics target for extended statistics
When building statistics, we need to decide how many rows to sample and
how accurate the resulting statistics should be. Until now, it was not
possible to explicitly define statistics target for extended statistics
objects, the value was always computed from the per-attribute targets
with a fallback to the system-wide default statistics target.

That's a bit inconvenient, as it ties together the statistics target set
for per-column and extended statistics. In some cases it may be useful
to require larger sample / higher accuracy for extended statics (or the
other way around), but with this approach that's not possible.

So this commit introduces a new command, allowing to specify statistics
target for individual extended statistics objects, overriding the value
derived from per-attribute targets (and the system default).

  ALTER STATISTICS stat_name SET STATISTICS target_value;

When determining statistics target for an extended statistics object we
first look at this explicitly set value. When this value is -1, we fall
back to the old formula, looking at the per-attribute targets first and
then the system default. This means the behavior is backwards compatible
with older PostgreSQL releases.

Author: Tomas Vondra
Discussion: https://postgr.es/m/20190618213357.vli3i23vpkset2xd@development
Reviewed-by: Kirk Jamison, Dean Rasheed
2019-09-11 00:25:51 +02:00
Tom Lane bca6e64354 Reduce overhead of scanning the backend[] array in LISTEN/NOTIFY.
Up to now, async.c scanned its whole array of per-backend state
whenever it needed to find listening backends.  That's expensive
if MaxBackends is large, so extend the data structure with list
links that thread the active entries together.

A downside of this change is that asyncQueueUnregister (unregister
a listening backend at backend exit) now requires exclusive not shared
lock, and it can take awhile if there are many other listening
backends.  We could improve the latter issue by using a doubly- not
singly-linked list, but it's probably not worth the storage space;
typical usage patterns for LISTEN/NOTIFY have fairly long-lived
listeners.

In return for that, Exec_ListenPreCommit (initially register a
listening backend), SignalBackends, and asyncQueueAdvanceTail
get significantly faster when MaxBackends is much larger than
the number of listening backends.  If most of the potential
backend slots are listening, we don't win, but that's a case
where the actual interprocess-signal overhead is going to swamp
these considerations anyway.

Martijn van Oosterhout, hacked a bit more by me

Discussion: https://postgr.es/m/CADWG95vtRBFDdrx1JdT1_9nhOFw48KaeTev6F_LtDQAFVpSPhA@mail.gmail.com
2019-09-10 18:15:20 -04:00
Peter Geoghegan 55d015bde0 Add _bt_binsrch() scantid assertion to nbtree.
Assert that _bt_binsrch() binary searches with scantid set in insertion
scankey cannot be performed on leaf pages.  Leaf-level binary searches
where scantid is set must use _bt_binsrch_insert() instead.

_bt_binsrch_insert() is likely to have additional responsibilities in
the future, such as searching within GIN-style posting lists using
scantid.  It seems like a good idea to tighten things up now.
2019-09-09 11:41:19 -07:00
Andres Freund 27cc7cd2bc Reorder EPQ work, to fix rowmark related bugs and improve efficiency.
In ad0bda5d24 I changed the EvalPlanQual machinery to store
substitution tuples in slot, instead of using plain HeapTuples. The
main motivation for that was that using HeapTuples will be inefficient
for future tableams.  But it turns out that that conversion was buggy
for non-locking rowmarks - the wrong tuple descriptor was used to
create the slot.

As a secondary issue 5db6df0c0 changed ExecLockRows() to begin EPQ
earlier, to allow to fetch the locked rows directly into the EPQ
slots, instead of having to copy tuples around. Unfortunately, as Tom
complained, that forces some expensive initialization to happen
earlier.

As a third issue, the test coverage for EPQ was clearly insufficient.

Fixing the first issue is unfortunately not trivial: Non-locked row
marks were fetched at the start of EPQ, and we don't have the type
information for the rowmarks available at that point. While we could
change that, it's not easy. It might be worthwhile to change that at
some point, but to fix this bug, it seems better to delay fetching
non-locking rowmarks when they're actually needed, rather than
eagerly. They're referenced at most once, and in cases where EPQ
fails, might never be referenced. Fetching them when needed also
increases locality a bit.

To be able to fetch rowmarks during execution, rather than
initialization, we need to be able to access the active EPQState, as
that contains necessary data. To do so move EPQ related data from
EState to EPQState, and, only for EStates creates as part of EPQ,
reference the associated EPQState from EState.

To fix the second issue, change EPQ initialization to allow use of
EvalPlanQualSlot() to be used before EvalPlanQualBegin() (but
obviously still requiring EvalPlanQualInit() to have been done).

As these changes made struct EState harder to understand, e.g. by
adding multiple EStates, significantly reorder the members, and add a
lot more comments.

Also add a few more EPQ tests, including one that fails for the first
issue above. More is needed.

Reported-By: yi huang
Author: Andres Freund
Reviewed-By: Tom Lane
Discussion:
    https://postgr.es/m/CAHU7rYZo_C4ULsAx_LAj8az9zqgrD8WDd4hTegDTMM1LMqrBsg@mail.gmail.com
    https://postgr.es/m/24530.1562686693@sss.pgh.pa.us
Backpatch: 12-, where the EPQ changes were introduced
2019-09-09 05:14:11 -07:00
Alexander Korotkov 7e04160390 Fix handling of non-key columns get_index_column_opclass()
f2e40380 introduces support of non-key attributes in GiST indexes.  Then if
get_index_column_opclass() is asked by gistproperty() to get an opclass of
non-key column, it returns garbage past oidvector value.  This commit fixes
that by making get_index_column_opclass() return InvalidOid in this case.

Discussion: https://postgr.es/m/20190902231948.GA5343%40alvherre.pgsql
Author: Nikita Glukhov, Alexander Korotkov
Backpatch-through: 12
2019-09-09 13:50:12 +03:00
Tom Lane 1192e3fb54 Fix RelationIdGetRelation calls that weren't bothering with error checks.
Some of these are quite old, but that doesn't make them not bugs.
We'd rather report a failure via elog than SIGSEGV.

While at it, uniformly spell the error check as !RelationIsValid(rel)
rather than a bare rel == NULL test.  The machine code is the same
but it seems better to be consistent.

Coverity complained about this today, not sure why, because the
mistake is in fact old.
2019-09-08 17:00:50 -04:00
Alexander Korotkov 02f90879e7 Fix handling of NULL distances in KNN-GiST
In order to implement NULL LAST semantic GiST previously assumed distance to
the NULL value to be Inf.  However, our distance functions can return Inf and
NaN for non-null values.  In such cases, NULL LAST semantic appears to be
broken.  This commit fixes that by introducing separate array of null flags for
distances.

Backpatch to all supported versions.

Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com
Author: Alexander Korotkov
Backpatch-through: 9.4
2019-09-08 22:08:12 +03:00
Alexander Korotkov e5d8f35961 Fix handling Inf and Nan values in GiST pairing heap comparator
Previously plain float comparison was used in GiST pairing heap.  Such
comparison doesn't provide proper ordering for value sets containing Inf and Nan
values.  This commit fixes that by usage of float8_cmp_internal().  Note, there
is remaining problem with NULL distances, which are represented as Inf in
pairing heap.  It would be fixes in subsequent commit.

Backpatch to all supported versions.

Reported-by: Andrey Borodin
Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Heikki Linnakangas
Backpatch-through: 9.4
2019-09-08 22:08:12 +03:00
Peter Eisentraut 862ef372d6 Fix behavior of AND CHAIN outside of explicit transaction blocks
When using COMMIT AND CHAIN or ROLLBACK AND CHAIN not in an explicit
transaction block, the previous implementation would leave a
transaction block active in the ROLLBACK case but not the COMMIT case.
To fix for now, error out when using these commands not in an explicit
transaction block.  This restriction could be lifted if a sensible
definition and implementation is found.

Bug: #15977
Author: fn ln <emuser20140816@gmail.com>
Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
2019-09-08 16:23:03 +02:00
Tom Lane db43831899 Avoid using INFO elevel for what are fundamentally debug messages.
Commit 6f6b99d13 stuck an INFO message into the fast path for
checking partition constraints, for no very good reason except
that it made it easy for the regression tests to verify that
that path was taken.  Assorted later patches did likewise,
increasing the unsuppressable-chatter level from ALTER TABLE
even more.  This isn't good for the user experience, so let's
drop these messages down to DEBUG1 where they belong.  So as
not to have a loss of test coverage, create a TAP test that
runs the relevant queries with client_min_messages = DEBUG1
and greps for the expected messages.

This testing method is a bit brute-force --- in particular,
it duplicates the execution of a fair amount of the core
create_table and alter_table tests.  We experimented with
other solutions, but running any significant amount of
standard testing with client_min_messages = DEBUG1 seems
to have a lot of output-stability pitfalls, cf commits
bbb96c370 and 5655565c0.  Possibly at some point we'll look
into whether we can reduce the amount of test duplication.

Backpatch into v12, because some of these messages are new
in v12 and we don't really want to ship it that way.

Sergei Kornilov

Discussion: https://postgr.es/m/81911511895540@web58j.yandex.ru
Discussion: https://postgr.es/m/4859321552643736@myt5-02b80404fd9e.qloud-c.yandex.net
2019-09-07 19:03:11 -04:00
Tom Lane ca70bdaefe Fix issues around strictness of SIMILAR TO.
As a result of some long-ago quick hacks, the SIMILAR TO operator
and the corresponding flavor of substring() interpreted "ESCAPE NULL"
as selecting the default escape character '\'.  This is both
surprising and not per spec: the standard is clear that these
functions should return NULL for NULL input.

Additionally, because of inconsistency of the strictness markings
of 3-argument substring() and similar_escape(), the planner could not
inline the SQL definition of substring(), resulting in a substantial
performance penalty compared to the underlying POSIX substring()
function.

The simplest fix for this would be to change the strictness marking
of similar_escape(), but if we do that we risk breaking existing views
that depend on that function.  Hence, leave similar_escape() as-is
as a compatibility function, and instead invent a new function
similar_to_escape() that comes in two strict variants.

There are a couple of other behaviors in this area that are also
not per spec, but they are documented and seem generally at least
as sane as the spec's definition, so leave them alone.  But improve
the documentation to describe them fully.

Patch by me; thanks to Álvaro Herrera and Andrew Gierth for review
and discussion.

Discussion: https://postgr.es/m/14047.1557708214@sss.pgh.pa.us
2019-09-07 14:21:59 -04:00
Robert Haas bd124996ef Create an API for inserting and deleting rows in TOAST tables.
This moves much of the non-heap-specific logic from toast_delete and
toast_insert_or_update into a helper functions accessible via a new
header, toast_helper.h.  Using the functions in this module, a table
AM can implement creation and deletion of TOAST table rows with
much less code duplication than was possible heretofore.  Some
table AMs won't want to use the TOAST logic at all, but for those
that do this will make that easier.

Patch by me, reviewed and tested by Prabhat Sabu, Thomas Munro,
Andres Freund, and Álvaro Herrera.

Discussion: http://postgr.es/m/CA+TgmoZv-=2iWM4jcw5ZhJeL18HF96+W1yJeYrnGMYdkFFnEpQ@mail.gmail.com
2019-09-06 10:38:51 -04:00
Robert Haas 286af0ce12 When performing a base backup, check for read errors.
The old code didn't differentiate between a read error and a
concurrent truncation. fread reports both of these by returning 0;
you have to use feof() or ferror() to distinguish between them,
which this code did not do.

It might be a better idea to use read() rather than fread() here,
so that we can display a less-generic error message, but I'm not
sure that would qualify as a back-patchable bug fix, so just do
this much for now.

Jeevan Chalke, reviewed by Jeevan Ladhe and by me.

Discussion: http://postgr.es/m/CA+TgmobG4ywMzL5oQq2a8YKp8x2p3p1LOMMcGqpS7aekT9+ETA@mail.gmail.com
2019-09-06 08:22:32 -04:00
Fujii Masao 946647f845 Make pg_promote() detect postmaster death while waiting for promotion to end.
Previously even if postmaster died and WaitLatch() woke up with that event
while pg_promote() was waiting for the standby promotion to finish,
pg_promote() did nothing special and kept waiting until timeout occurred.
This could cause a busy loop.

This patch make pg_promote() return false immediately when postmaster
dies, to avoid such a busy loop.

Back-patch to v12 where pg_promote() was added.

Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAHGQGwEs9ROgSp+QF+YdDU+xP8W=CY1k-_Ov-d_Z3JY+to3eXA@mail.gmail.com
2019-09-06 14:27:25 +09:00
Tom Lane 7de19fbc0b Use data directory inode number, not port, to select SysV resource keys.
This approach provides a much tighter binding between a data directory
and the associated SysV shared memory block (and SysV or named-POSIX
semaphores, if we're using those).  Key collisions are still possible,
but only between data directories stored on different filesystems,
so the situation should be negligible in practice.  More importantly,
restarting the postmaster with a different port number no longer
risks failing to identify a relevant shared memory block, even when
postmaster.pid has been removed.  A standalone backend is likewise
much more certain to detect conflicting leftover backends.

(In the longer term, we might now think about deprecating the port as
a cluster-wide value, so that one postmaster could support sockets
with varying port numbers.  But that's for another day.)

The hazards fixed here apply only on Unix systems; our Windows code
paths already use identifiers derived from the data directory path
name rather than the port.

src/test/recovery/t/017_shm.pl, which intends to test key-collision
cases, has been substantially rewritten since it can no longer use
two postmasters with identical port numbers to trigger the case.
Instead, use Perl's IPC::SharedMem module to create a conflicting
shmem segment directly.  The test script will be skipped if that
module is not available.  (This means that some older buildfarm
members won't run it, but I don't think that that results in any
meaningful coverage loss.)

Patch by me; thanks to Noah Misch and Peter Eisentraut for discussion
and review.

Discussion: https://postgr.es/m/16908.1557521200@sss.pgh.pa.us
2019-09-05 13:31:46 -04:00
Robert Haas 8b94dab066 Split tuptoaster.c into three separate files.
detoast.c/h contain functions required to detoast a datum, partially
or completely, plus a few other utility functions for examining the
size of toasted datums.

toast_internals.c/h contain functions that are used internally to the
TOAST subsystem but which (mostly) do not need to be accessed from
outside.

heaptoast.c/h contains code that is intrinsically specific to the
heap AM, either because it operates on HeapTuples or is based on the
layout of a heap page.

detoast.c and toast_internals.c are placed in
src/backend/access/common rather than src/backend/access/heap.  At
present, both files still have dependencies on the heap, but that will
be improved in a future commit.

Patch by me, reviewed and tested by Prabhat Sabu, Thomas Munro,
Andres Freund, and Álvaro Herrera.

Discussion: http://postgr.es/m/CA+TgmoZv-=2iWM4jcw5ZhJeL18HF96+W1yJeYrnGMYdkFFnEpQ@mail.gmail.com
2019-09-05 13:15:10 -04:00
Peter Eisentraut 74a308cf52 Use explicit_bzero
Use the explicit_bzero() function in places where it is important that
security information such as passwords is cleared from memory.  There
might be other places where it could be useful; this is just an
initial collection.

For platforms that don't have explicit_bzero(), provide various
fallback implementations.  (explicit_bzero() itself isn't standard,
but as Linux/glibc, FreeBSD, and OpenBSD have it, it's the most common
spelling, so it makes sense to make that the invocation point.)

Discussion: https://www.postgresql.org/message-id/flat/42d26bde-5d5b-c90d-87ae-6cab875f73be%402ndquadrant.com
2019-09-05 08:30:42 +02:00
Michael Paquier ae060a52b2 Fix thinko when ending progress report for a backend
The logic ending progress reporting for a backend entry introduced by
b6fb647 causes callers of pgstat_progress_end_command() to do some extra
work when track_activities is enabled as the process fields are reset in
the backend entry even if no command were started for reporting.

This resets the fields only if a command is registered for progress
reporting, and only if track_activities is enabled.

Author: Masahiho Sawada
Discussion: https://postgr.es/m/CAD21AoCry_vJ0E-m5oxJXGL3pnos-xYGCzF95rK5Bbi3Uf-rpA@mail.gmail.com
Backpatch-through: 9.6
2019-09-04 15:46:37 +09:00
Alvaro Herrera 25dcc9d35d Make XLogReaderInvalReadState static
This function is only used by xlogreader.c itself, so there's no need to
export it.  It was introduced by commit 3b02ea4f07 with the apparent
intention that it could be used externally, but I couldn't find any
external code calling it.

I (Álvaro) couldn't resist the urge to sort nearby function prototypes
properly while at it.

Author: Antonin Houska
Discussion: https://postgr.es/m/14984.1554998742@spoje.net
2019-09-03 17:41:43 -04:00
Alvaro Herrera fe66125974 Remove 'msg' parameter from convert_tuples_by_name
The message was included as a parameter when this function was added in
dcb2bda9b7, but I don't think it has ever served any useful purpose.
Let's stop spreading it pointlessly.

Reviewed by Amit Langote and Peter Eisentraut.

Discussion: https://postgr.es/m/20190806224728.GA17233@alvherre.pgsql
2019-09-03 14:47:29 -04:00
Peter Eisentraut 396e4afdbc Better error messages for short reads/writes in SLRU
This avoids getting a

    Could not read from file ...: Success.

for a short read or write (since errno is not set in that case).
Instead, report a more specific error messages.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/5de61b6b-8be9-7771-0048-860328efe027%402ndquadrant.com
2019-09-03 08:30:21 +02:00
Michael Paquier 3a54eb1a38 Fix memory leak with lower, upper and initcap with ICU-provided collations
The leak happens in str_tolower, str_toupper and str_initcap, which are
used in several places including their equivalent SQL-level functions,
and can only be triggered when using an ICU-provided collation when
converting the input string.

b615920 fixed a similar leak.  Backpatch down 10 where ICU collations
have been introduced.

Author: Konstantin Knizhnik
Discussion: https://postgr.es/m/94c0ad0a-cbc2-e4a3-7829-2bdeaf9146db@postgrespro.ru
Backpatch-through: 10
2019-09-03 12:30:53 +09:00
Tom Lane f63a5ead9d Avoid touching replica identity index in ExtractReplicaIdentity().
In what seems like a fit of misplaced optimization,
ExtractReplicaIdentity() accessed the relation's replica-identity
index without taking any lock on it.  Usually, the surrounding query
already holds some lock so this is safe enough ... but in the case
of a previously-planned delete, there might be no existing lock.
Given a suitable test case, this is exposed in v12 and HEAD by an
assertion added by commit b04aeb0a0.

The whole thing's rather poorly thought out anyway; rather than
looking directly at the index, we should use the index-attributes
bitmap that's held by the parent table's relcache entry, as the
caller functions do.  This is more consistent and likely a bit
faster, since it avoids a cache lookup.  Hence, change to doing it
that way.

While at it, rather than blithely assuming that the identity
columns are non-null (with catastrophic results if that's wrong),
add assertion checks that they aren't null.  Possibly those should
be actual test-and-elog, but I'll leave it like this for now.

In principle, this is a bug that's been there since this code was
introduced (in 9.4).  In practice, the risk seems quite low, since
we do have a lock on the index's parent table, so concurrent
changes to the index's catalog entries seem unlikely.  Given the
precedent that commit 9c703c169 wasn't back-patched, I won't risk
back-patching this further than v12.

Per report from Hadi Moshayedi.

Discussion: https://postgr.es/m/CAK=1=Wrek44Ese1V7LjKiQS-Nd-5LgLi_5_CskGbpggKEf3tKQ@mail.gmail.com
2019-09-02 16:10:37 -04:00
Heikki Linnakangas bde7493d10 Fix overflow check and comment in GIN posting list encoding.
The comment did not match what the code actually did for integers with
the 43rd bit set. You get an integer like that, if you have a posting
list with two adjacent TIDs that are more than 2^31 blocks apart.
According to the comment, we would store that in 6 bytes, with no
continuation bit on the 6th byte, but in reality, the code encodes it
using 7 bytes, with a continuation bit on the 6th byte as normal.

The decoding routine also handled these 7-byte integers correctly, except
for an overflow check that assumed that one integer needs at most 6 bytes.
Fix the overflow check, and fix the comment to match what the code
actually does. Also fix the comment that claimed that there are 17 unused
bits in the 64-bit representation of an item pointer. In reality, there
are 64-32-11=21.

Fitting any item pointer into max 6 bytes was an important property when
this was written, because in the old pre-9.4 format, item pointers were
stored as plain arrays, with 6 bytes for every item pointer. The maximum
of 6 bytes per integer in the new format guaranteed that we could convert
any page from the old format to the new format after upgrade, so that the
new format was never larger than the old format. But we hardly need to
worry about that anymore, and running into that problem during upgrade,
where an item pointer is expanded from 6 to 7 bytes such that the data
doesn't fit on a page anymore, is implausible in practice anyway.

Backpatch to all supported versions.

This also includes a little test module to test these large distances
between item pointers, without requiring a 16 TB table. It is not
backpatched, I'm including it more for the benefit of future development
of new posting list formats.

Discussion: https://www.postgresql.org/message-id/33bfc20a-5c86-f50c-f5a5-58e9925d05ff%40iki.fi
Reviewed-by: Masahiko Sawada, Alexander Korotkov
2019-08-28 12:55:33 +03:00
Thomas Munro 720b59b55b Avoid catalog lookups in RelationAllowsEarlyPruning().
RelationAllowsEarlyPruning() performed a catalog scan, but is used
in two contexts where that was a bad idea:

1.  In heap_page_prune_opt(), which runs very frequently in some large
    scans.  This caused major performance problems in a field report
    that was easy to reproduce.

2.  In TestForOldSnapshot(), which runs while we hold a buffer content
    lock.  It's not clear if this was guaranteed to be free of buffer
    deadlock risk.

The check was introduced in commit 2cc41acd8 and defended against a
real problem: 9.6's hash indexes have no page LSN and so we can't
allow early pruning (ie the snapshot-too-old feature).  We can remove
the check from all later releases though: hash indexes are now logged,
and there is no way to create UNLOGGED indexes on regular logged
tables.

If a future release allows such a combination, it might need to put
a similar check in place, but it'll need some more thought.

Back-patch to 10.

Author: Thomas Munro
Reviewed-by: Tom Lane, who spotted the second problem
Discussion: https://postgr.es/m/CA%2BhUKGKT8oTkp5jw_U4p0S-7UG9zsvtw_M47Y285bER6a2gD%2Bg%40mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1%2BWy%2BN4eE5zPm765h68LrkWc3Biu_8rzzi%2BOYX4j%2BiHRw%40mail.gmail.com
2019-08-28 16:18:29 +12:00