Commit Graph

20265 Commits

Author SHA1 Message Date
Peter Eisentraut 17b9e7f9fe Support adding partitioned tables to publication
When a partitioned table is added to a publication, changes of all of
its partitions (current or future) are published via that publication.

This change only affects which tables a publication considers as its
members.  The receiving side still sees the data coming from the
individual leaf partitions.  So existing restrictions that partition
hierarchies can only be replicated one-to-one are not changed by this.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-03-10 09:09:32 +01:00
Michael Paquier 61d7c7bce3 Prevent reindex of invalid indexes on TOAST tables
Such indexes can only be duplicated leftovers of a previously failed
REINDEX CONCURRENTLY command, and a valid equivalent is guaranteed to
exist.  As toast indexes can only be dropped if invalid, reindexing
these would lead to useless duplicated indexes that can't be dropped
anymore, except if the parent relation is dropped.

Thanks to Justin Pryzby for reminding that this problem was reported
long ago during the review of the original patch of REINDEX
CONCURRENTLY, but the issue was never addressed.

Reported-by: Sergei Kornilov, Justin Pryzby
Author: Julien Rouhaud
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net
Discussion: https://postgr.es/m/20200216190835.GA21832@telsasoft.com
Backpatch-through: 12
2020-03-10 15:38:17 +09:00
Fujii Masao 71e0d0a737 Tidy up XLogSource code in xlog.c.
This commit replaces 0 used as an initial value of XLogSource variable,
with XLOG_FROM_ANY. Also this commit changes those variable so that
XLogSource instead of int is used as the type for them. These changes
are for code readability and debugger-friendliness.

Author: Kyotaro Horiguchi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20200227.124830.2197604521555566121.horikyota.ntt@gmail.com
2020-03-10 09:41:44 +09:00
Jeff Davis 24d85952a5 Introduce LogicalTapeSetExtend().
Increases the number of tapes in a logical tape set. This will be
important for disk-based hash aggregation, because the maximum number
of tapes is not known ahead of time.

While discussing this change, it was observed to regress the
performance of Sort for at least one test case. The performance
regression was because some versions of GCC switch to an inlined
version of memcpy() in LogicalTapeWrite() after this change. No
performance regression for clang was observed.

Because the regression is due to an arbitrary decision by the
compiler, I decided it shouldn't hold up this change. If it needs to
be fixed, we can find a workaround.

Author: Adam Lee, Jeff Davis
Discussion: https://postgr.es/m/e54bfec11c59689890f277722aaaabd05f78e22c.camel%40j-davis.com
2020-03-09 10:40:02 -07:00
Fujii Masao 17d3fcdc39 Fix bug that causes to report waiting in PS display twice, in hot standby.
Previously "waiting" could appear twice via PS in case of lock conflict
in hot standby mode. Specifically this issue happend when the delay
in WAL application determined by max_standby_archive_delay and
max_standby_streaming_delay had passed but it took more than 500 msec
to cancel all the conflicting transactions. Especially we can observe this
easily by setting those delay parameters to -1.

The cause of this issue was that WaitOnLock() and
ResolveRecoveryConflictWithVirtualXIDs() added "waiting" to
the process title in that case. This commit prevents
ResolveRecoveryConflictWithVirtualXIDs() from reporting waiting
in case of lock conflict, to fix the bug.

Back-patch to all back branches.

Author: Masahiko Sawada
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/CA+fd4k4mXWTwfQLS3RPwGr4xnfAEs1ysFfgYHvmmoUgv6Zxvmg@mail.gmail.com
2020-03-10 00:14:43 +09:00
Peter Eisentraut 71d60e2aa0 Add tg_updatedcols to TriggerData
This allows a trigger function to determine for an UPDATE trigger
which columns were actually updated.  This allows some optimizations
in generic trigger functions such as lo_manage and
tsvector_update_trigger.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/11c5f156-67a9-0fb5-8200-2a8018eb2e0c@2ndquadrant.com
2020-03-09 09:34:55 +01:00
Peter Eisentraut 8f152b6c50 Code simplification
Initialize TriggerData to 0 for the whole struct together, instead of
each field separately.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/11c5f156-67a9-0fb5-8200-2a8018eb2e0c@2ndquadrant.com
2020-03-09 09:34:55 +01:00
Fujii Masao ef34ab42a8 Avoid assertion failure with targeted recovery in standby mode.
At the end of recovery, standby mode is turned off to re-fetch the last
valid record from archive or pg_wal. Previously, if recovery target was
reached and standby mode was turned off while the current WAL source
was stream, recovery could try to retrieve WAL file containing the last
valid record unexpectedly from stream even though not in standby mode.
This caused an assertion failure. That is, the assertion test confirms that
WAL file should not be retrieved from stream if standby mode is not true.

This commit moves back the current WAL source to archive if it's stream
even though not in standby mode, to avoid that assertion failure.

This issue doesn't cause the server to crash when built with assertion
disabled. In this case, the attempt to retrieve WAL file from stream not
in standby mode just fails. And then recovery tries to retrieve WAL file
from archive or pg_wal.

Back-patch to all supported branches.

Author: Kyotaro Horiguchi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20200227.124830.2197604521555566121.horikyota.ntt@gmail.com
2020-03-09 15:32:32 +09:00
Fujii Masao d9249441ef Mark ssl_passphrase_command as GUC_SUPERUSER_ONLY.
This commit changes the GUC ssl_passphrase_command so that
it's examinable by only superuser and a member of pg_read_all_settings.
Per discussion, we determined to do this because the parameter may
contain a sensitive informtaion like a passphrase itself.

Author: Insung Moon
Reviewed-by: Keisuke Kuroda
Discussion: https://postgr.es/m/CAEMmqBuHVGayc+QkYKgx3gWSdqwTAQGw+0DYn3WhcX-eNa2ntA@mail.gmail.com
2020-03-09 11:41:31 +09:00
Tom Lane ea7dace2aa Simplify/speed up assertion cross-check in ginCompressPostingList().
I noticed while testing some other stuff that the CHECK_ENCODING_ROUNDTRIP
logic in ginCompressPostingList could account for over 50% of the runtime
of an INSERT with a GIN index.  While that's not relevant to production
performance, it's still kind of annoying in a debug build.  Replacing
the loop around short memcmp's with one long memcmp works just as well
and is significantly faster, at least on my machine.
2020-03-07 13:31:17 -05:00
Peter Eisentraut 8d7def5c27 Fix typo 2020-03-07 08:51:59 +01:00
Tom Lane a6525588b7 Allow Unicode escapes in any server encoding, not only UTF-8.
SQL includes provisions for numeric Unicode escapes in string
literals and identifiers.  Previously we only accepted those
if they represented ASCII characters or the server encoding
was UTF-8, making the conversion to internal form trivial.
This patch adjusts things so that we'll call the appropriate
encoding conversion function in less-trivial cases, allowing
the escape sequence to be accepted so long as it corresponds
to some character available in the server encoding.

This also applies to processing of Unicode escapes in JSONB.
However, the old restriction still applies to client-side
JSON processing, since that hasn't got access to the server's
encoding conversion infrastructure.

This patch includes some lexer infrastructure that simplifies
throwing errors with error cursors pointing into the middle of
a string (or other complex token).  For the moment I only used
it for errors relating to Unicode escapes, but we might later
expand the usage to some other cases.

Patch by me, reviewed by John Naylor.

Discussion: https://postgr.es/m/2393.1578958316@sss.pgh.pa.us
2020-03-06 14:17:43 -05:00
Tom Lane fe30e7ebfa Allow ALTER TYPE to change some properties of a base type.
Specifically, this patch allows ALTER TYPE to:
* Change the default TOAST strategy for a toastable base type;
* Promote a non-toastable type to toastable;
* Add/remove binary I/O functions for a type;
* Add/remove typmod I/O functions for a type;
* Add/remove a custom ANALYZE statistics functions for a type.

The first of these can be done by the type's owner; all the others
require superuser privilege since misuse could cause problems.

The main motivation for this patch is to allow extensions to
upgrade the feature sets of their data types, so the set of
alterable properties is biased towards that use-case.  However
it's also true that changing some other properties would be
a lot harder, as they get baked into physical storage and/or
stored expressions that depend on the type.

Along the way, refactor GenerateTypeDependencies() to make it easier
to call, refactor DefineType's volatility checks so they can be shared
by AlterType, and teach typcache.c that it might have to reload data
from the type's pg_type row, a scenario it never handled before.
Also rearrange alter_type.sgml a bit for clarity (put the
composite-type operations together).

Tomas Vondra and Tom Lane

Discussion: https://postgr.es/m/20200228004440.b23ein4qvmxnlpht@development
2020-03-06 12:19:29 -05:00
Tom Lane bb03010b9f Remove the "opaque" pseudo-type and associated compatibility hacks.
A long time ago, it was necessary to declare datatype I/O functions,
triggers, and language handler support functions in a very type-unsafe
way involving a single pseudo-type "opaque".  We got rid of those
conventions in 7.3, but there was still support in various places to
automatically convert such functions to the modern declaration style,
to be able to transparently re-load dumps from pre-7.3 servers.
It seems unnecessary to continue to support that anymore, so take out
the hacks; whereupon the "opaque" pseudo-type itself is no longer
needed and can be dropped.

This is part of a group of patches removing various server-side kluges
for transparently upgrading pre-8.0 dump files.  Since we've had few
complaints about dropping pg_dump's support for dumping from pre-8.0
servers (commit 64f3524e2), it seems okay to now remove these kluges.

Discussion: https://postgr.es/m/4110.1583255415@sss.pgh.pa.us
2020-03-05 15:48:56 -05:00
Tom Lane 84eca14bc4 Remove ancient hacks to ignore certain opclass names in CREATE INDEX.
Twenty years ago, we removed certain operator classes in favor of
letting indexes over their data types be built with some other
binary-compatible, more standard opclass.  As a hack to allow existing
index definitions to be dumped and reloaded, we made CREATE INDEX ignore
the removed opclass names, so that such indexes would fall back to the
new default opclass for their data types.  This was never intended to
be a long-lived thing; it carries the obvious risk of breaking some
future developer's attempt to re-use those old opclass names.  Since
all of the cases in question are for opclasses that were removed
before PG 8.0, it seems okay to get rid of these hacks now.

This is part of a group of patches removing various server-side kluges
for transparently upgrading pre-8.0 dump files.  Since we've had few
complaints about dropping pg_dump's support for dumping from pre-8.0
servers (commit 64f3524e2), it seems okay to now remove these kluges.

Discussion: https://postgr.es/m/3685.1583422389@sss.pgh.pa.us
2020-03-05 15:36:06 -05:00
Tom Lane e58a599752 Remove ancient support for upgrading pre-7.3 foreign key constraints.
Before 7.3, foreign key constraints had no explicit catalog
representation, so that what pg_dump produced for them was (usually)
a set of three CREATE CONSTRAINT TRIGGER commands.  Commit a2899ebdc
and some follow-on fixes added an ugly hack in CreateTrigger() to
recognize that pattern and reconstruct the foreign key definition.
However, we've never had any test coverage for that code, so that it's
legitimate to wonder if it still works; and having to maintain it in
the face of upcoming trigger-related patches seems rather pointless.
Let's decree that its time has passed, and drop it.

This is part of a group of patches removing various server-side kluges
for transparently upgrading pre-8.0 dump files.  Since we've had few
complaints about dropping pg_dump's support for dumping from pre-8.0
servers (commit 64f3524e2), it seems okay to now remove these kluges.

Daniel Gustafsson

Discussion: https://postgr.es/m/805874E2-999C-4CDA-856F-1AFBD9DFE933@yesql.se
2020-03-05 15:25:45 -05:00
Alvaro Herrera a77315fdf2
Remove RangeIOData->typiofunc
We used to carry the I/O function OID in RangeIOData, but it's not used
for anything.  Since the struct is not exposed to the world anyway, we
can simplify it a bit.  Also, rename the FmgrInfo member to match
the accompanying 'typioparam' and put them in a more sensible order.

Reviewed by Tom Lane and Paul Jungwirth.

Discussion: https://postgr.es/m/20200304215711.GA8732@alvherre.pgsql
2020-03-05 11:35:02 -03:00
Michael Paquier fbcf087112 Fix more issues with dependency handling at swap phase of REINDEX CONCURRENTLY
When canceling a REINDEX CONCURRENTLY operation after swapping is done,
a drop of the parent table would leave behind old indexes.  This is a
consequence of 68ac9cf, which fixed the case of pg_depend bloat when
repeating REINDEX CONCURRENTLY on the same relation.

In order to take care of the problem without breaking the previous fix,
this uses a different strategy, possible even with the exiting set of
routines to handle dependency changes.  The dependencies of/on the
new index are additionally switched to the old one, allowing an old
invalid index remaining around because of a cancellation or a failure to
use the dependency links of the concurrently-created index.  This
ensures that dropping any objects the old invalid index depends on also
drops the old index automatically.

Reported-by: Julien Rouhaud
Author: Michael Paquier
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/20200227080735.l32fqcauy73lon7o@nol
Backpatch-through: 12
2020-03-05 12:50:15 +09:00
Jeff Davis c954d49046 Extend ExecBuildAggTrans() to support a NULL pointer check.
Optionally push a step to check for a NULL pointer to the pergroup
state.

This will be important for disk-based hash aggregation in combination
with grouping sets. When memory limits are reached, a given tuple may
find its per-group state for some grouping sets but not others. For
the former, it advances the per-group state as normal; for the latter,
it skips evaluation and the calling code will have to spill the tuple
and reprocess it in a later batch.

Add the NULL check as a separate expression step because in some
common cases it's not needed.

Discussion: https://postgr.es/m/20200221202212.ssb2qpmdgrnx52sj%40alap3.anarazel.de
2020-03-04 17:29:18 -08:00
Tom Lane 3ed2005ff5 Introduce macros for typalign and typstorage constants.
Our usual practice for "poor man's enum" catalog columns is to define
macros for the possible values and use those, not literal constants,
in C code.  But for some reason lost in the mists of time, this was
never done for typalign/attalign or typstorage/attstorage.  It's never
too late to make it better though, so let's do that.

The reason I got interested in this right now is the need to duplicate
some uses of the TYPSTORAGE constants in an upcoming ALTER TYPE patch.
But in general, this sort of change aids greppability and readability,
so it's a good idea even without any specific motivation.

I may have missed a few places that could be converted, and it's even
more likely that pending patches will re-introduce some hard-coded
references.  But that's not fatal --- there's no expectation that
we'd actually change any of these values.  We can clean up stragglers
over time.

Discussion: https://postgr.es/m/16457.1583189537@sss.pgh.pa.us
2020-03-04 10:34:25 -05:00
Tom Lane d677550493 Allow to_date/to_timestamp to recognize non-English month/day names.
to_char() has long allowed the TM (translation mode) prefix to
specify output of translated month or day names; but that prefix
had no effect in input format strings.  Now it does.  to_date()
and to_timestamp() will now recognize the same month or day names
that to_char() would output for the same format code.  Matching is
case-insensitive (per the active collation's notion of what that
means), just as it has always been for English month/day names
without the TM prefix.

(As per the discussion thread, there are lots of cases that this
feature will not handle, such as alternate day names.  But being
able to accept what to_char() will output seems useful enough.)

In passing, fix some shaky English and violations of message
style guidelines in jsonpath errors for the .datetime() method,
which depends on this code.

Juan José Santamaría Flecha, reviewed and modified by me,
with other commentary from Alvaro Herrera, Tomas Vondra,
Arthur Zakirov, Peter Eisentraut, Mark Dilger.

Discussion: https://postgr.es/m/CAC+AXB3u1jTngJcoC1nAHBf=M3v-jrEfo86UFtCqCjzbWS9QhA@mail.gmail.com
2020-03-03 11:06:47 -05:00
Peter Geoghegan 1e07f5e0a1 Remove overzealous _bt_split() assertions.
_bt_split() is passed NULL as its insertion scankey for internal page
splits.  Two recently added Assert() statements failed to consider this,
leading to a crash with pg_upgrade'd BREE_VERSION < 4 indexes.  Remove
the assertions.

The assertions in question were added by commit 0d861bbb, which added
nbtree deduplication.  It would be possible to fix the assertions
directly instead, but they weren't adding much anyway.
2020-03-02 21:40:11 -08:00
Michael Paquier 0b48f1335d Fix assertion failure with ALTER TABLE ATTACH PARTITION and indexes
Using ALTER TABLE ATTACH PARTITION causes an assertion failure when
attempting to work on a partitioned index, because partitioned indexes
cannot have partition bounds.

The grammar of ALTER TABLE ATTACH PARTITION requires partition bounds,
but not ALTER INDEX, so mixing ALTER TABLE with partitioned indexes is
confusing.  Hence, on HEAD, prevent ALTER TABLE to attach a partition if
the relation involved is a partitioned index.  On back-branches, as
applications may rely on the existing behavior, just remove the
culprit assertion.

Reported-by: Alexander Lakhin
Author: Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/16276-5cd1dcc8fb8be7b5@postgresql.org
Backpatch-through: 11
2020-03-03 13:55:41 +09:00
Fujii Masao e65497df8f Report progress of streaming base backup.
This commit adds pg_stat_progress_basebackup view that reports
the progress while an application like pg_basebackup is taking
a base backup. This uses the progress reporting infrastructure
added by c16dc1aca5, adding support for streaming base backup.

Bump catversion.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Amit Langote, Sergei Kornilov
Discussion: https://postgr.es/m/9ed8b801-8215-1f3d-62d7-65bff53f6e94@oss.nttdata.com
2020-03-03 12:03:43 +09:00
Michael Paquier d79fb88ac7 Preserve pg_index.indisclustered across REINDEX CONCURRENTLY
If the flag value is lost, a CLUSTER query following REINDEX
CONCURRENTLY could fail.  Non-concurrent REINDEX is already handling
this case consistently.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20200229024202.GH29456@telsasoft.com
Backpatch-through: 12
2020-03-03 10:12:28 +09:00
Alvaro Herrera 2f9661311b
Represent command completion tags as structs
The backend was using strings to represent command tags and doing string
comparisons in multiple places, but that's slow and unhelpful.  Create a
new command list with a supporting structure to use instead; this is
stored in a tag-list-file that can be tailored to specific purposes with
a caller-definable C macro, similar to what we do for WAL resource
managers.  The first first such uses are a new CommandTag enum and a
CommandTagBehavior struct.

Replace numerous occurrences of char *completionTag with a
QueryCompletion struct so that the code no longer stores information
about completed queries in a cstring.  Only at the last moment, in
EndCommand(), does this get converted to a string.

EventTriggerCacheItem no longer holds an array of palloc’d tag strings
in sorted order, but rather just a Bitmapset over the CommandTags.

Author: Mark Dilger, with unsolicited help from Álvaro Herrera
Reviewed-by: John Naylor, Tom Lane
Discussion: https://postgr.es/m/981A9DB4-3F0C-4DA5-88AD-CB9CFF4D6CAD@enterprisedb.com
2020-03-02 18:19:51 -03:00
Peter Geoghegan 77b88bd5dc Add assertions to _bt_update_posting().
Copy some assertions from _bt_form_posting() to its sibling function,
_bt_update_posting().

Discussion: https://postgr.es/m/CAH2-WzkPR8KMwkL0ap976kmXwBCeukTeHz6fB-U__wvuP1S9Zg@mail.gmail.com
2020-03-02 08:07:16 -08:00
Michael Paquier 12c5cad76f Handle logical decoding in multi-insert for catalog tuples
The code path for multi-insert decoding is not stressed yet for
catalogs (a future patch may introduce this capability), so no
back-patch is needed.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/9690D72F-5C4F-4016-9572-6D16684E1D87@yesql.se
2020-03-02 10:00:37 +09:00
Peter Geoghegan 84ec9b231a Remove dead code from _bt_update_posting().
Discussion: https://postgr.es/m/CAH2-WzmAufHiOku6AGiFD=81VQs5nYJ1L2YkhW1t+BH4CMsgRw@mail.gmail.com
2020-03-01 12:11:26 -08:00
Dean Rasheed 43a899f41f Fix corner-case loss of precision in numeric ln().
When deciding on the local rscale to use for the Taylor series
expansion, ln_var() neglected to account for the fact that the result
is subsequently multiplied by a factor of 2^(nsqrt+1), where nsqrt is
the number of square root operations performed in the range reduction
step, which can be as high as 22 for very large inputs. This could
result in a loss of precision, particularly when combined with large
rscale values, for which a large number of Taylor series terms is
required (up to around 400).

Fix by computing a few extra digits in the Taylor series, based on the
weight of the multiplicative factor log10(2^(nsqrt+1)). It remains to
be proven whether or not the other 8 extra digits used for the Taylor
series is appropriate, but this at least deals with the obvious
oversight of failing to account for the effects of the final
multiplication.

Per report from Justin AnyhowStep. Reviewed by Tom Lane.

Discussion: https://postgr.es/m/16280-279f299d9c06e56f@postgresql.org
2020-03-01 14:49:25 +00:00
Tom Lane 58c47ccfff Correctly re-use hash tables in buildSubPlanHash().
Commit 356687bd8 omitted to remove leftover code for destroying
a hashed subplan's hash tables, with the result that the tables
were always rebuilt not reused; this leads to severe memory
leakage if a hashed subplan is re-executed enough times.
Moreover, the code for reusing the hashnulls table had a typo
that would have made it do the wrong thing if it were reached.

Looking at the code coverage report shows severe under-coverage
of the potential callers of ResetTupleHashTable, so add some test
cases that exercise them.

Andreas Karlsson and Tom Lane, per reports from Ranier Vilela
and Justin Pryzby.

Backpatch to v11, as the faulty commit was.

Discussion: https://postgr.es/m/edb62547-c453-c35b-3ed6-a069e4d6b937@proxel.se
Discussion: https://postgr.es/m/CAEudQAo=DCebm1RXtig9OH+QivpS97sMkikt0A9qHmMUs+g6ZA@mail.gmail.com
Discussion: https://postgr.es/m/20200210032547.GA1412@telsasoft.com
2020-02-29 13:48:09 -05:00
Tom Lane 6afc8aefd3 Remove obsolete comment.
Noted while studying subplan hash issue.
2020-02-29 13:23:12 -05:00
Tom Lane 80d76be51c Avoid failure if autovacuum tries to access a just-dropped temp namespace.
Such an access became possible when commit 246a6c8f7 added more
aggressive cleanup of orphaned temp relations by autovacuum.
Since autovacuum's snapshot might be slightly stale, it could
attempt to access an already-dropped temp namespace, resulting in
an assertion failure or null-pointer dereference.  (In practice,
since we don't drop temp namespaces automatically but merely
recycle them, this situation could only arise if a superuser does
a manual drop of a temp namespace.  Still, that should be allowed.)

The core of the bug, IMO, is that isTempNamespaceInUse and its callers
failed to think hard about whether to treat "temp namespace isn't there"
differently from "temp namespace isn't in use".  In hopes of forestalling
future mistakes of the same ilk, replace that function with a new one
checkTempNamespaceStatus, which makes the same tests but returns a
three-way enum rather than just a bool.  isTempNamespaceInUse is gone
entirely in HEAD; but just in case some external code is relying on it,
keep it in the back branches, as a bug-compatible wrapper around the
new function.

Per report originally from Prabhat Kumar Sahu, investigated by Mahendra
Singh and Michael Paquier; the final form of the patch is my fault.
This replaces the failed fix attempt in a052f6cbb.

Backpatch as far as v11, as 246a6c8f7 was.

Discussion: https://postgr.es/m/CAKYtNAr9Zq=1-ww4etHo-VCC-k120YxZy5OS01VkaLPaDbv2tg@mail.gmail.com
2020-02-28 20:28:34 -05:00
Jeff Davis 32bb4535a0 Fix commit c11cb17d.
I neglected to update copyfuncs/outfuncs/readfuncs.

Discussion: https://postgr.es/m/12491.1582833409%40sss.pgh.pa.us
2020-02-28 09:35:11 -08:00
Alvaro Herrera db989184cd
Add comments on avoid reuse of parse-time snapshot
Apparently, reusing the parse-time query snapshot for later steps
(execution) is a frequently considered optimization ... but it doesn't
work, for reasons discovered in thread [1].  Adding some comments about
why it doesn't really work can relieve some future hackers from wasting
time reimplementing it again.

[1] https://postgr.es/m/flat/5075D8DF.6050500@fuzzy.cz

Author: Michail Nikolaev
Discussion: https://postgr.es/m/CANtu0ogp6cTvMJObXP8n=k+JtqxY1iT9UV5MbGCpjjPa5crCiw@mail.gmail.com
2020-02-28 13:25:26 -03:00
Peter Eisentraut 1933ae629e Add PostgreSQL home page to --help output
Per emerging standard in GNU programs and elsewhere.  Autoconf already
has support for specifying a home page, so we can just that.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
2020-02-28 13:12:21 +01:00
Peter Eisentraut 864934131e Refer to bug report address by symbol rather than hardcoding
Use the PACKAGE_BUGREPORT macro that is created by Autoconf for
referring to the bug reporting address rather than hardcoding it
everywhere.  This makes it easier to change the address and it reduces
translation work.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
2020-02-28 13:12:21 +01:00
Jeff Davis c11cb17dc5 Save calculated transitionSpace in Agg node.
This will be useful in the upcoming Hash Aggregation work to improve
estimates for hash table sizing.

Discussion: https://postgr.es/m/37091115219dd522fd9ed67333ee8ed1b7e09443.camel%40j-davis.com
2020-02-27 11:20:56 -08:00
Alvaro Herrera b9b408c487
Record parents of triggers
This let us get rid of a recently introduced ugly hack (commit
1fa846f1c9).

Author: Álvaro Herrera
Reviewed-by: Amit Langote, Tom Lane
Discussion: https://postgr.es/m/20200217215641.GA29784@alvherre.pgsql
2020-02-27 13:23:33 -03:00
Robert Haas 05d8449e73 Move src/backend/utils/hash/hashfn.c to src/common
This also involves renaming src/include/utils/hashutils.h, which
becomes src/include/common/hashfn.h. Perhaps an argument can be
made for keeping the hashutils.h name, but it seemed more
consistent to make it match the name of the file, and also more
descriptive of what is actually going on here.

Patch by me, reviewed by Suraj Kharage and Mark Dilger. Off-list
advice on how not to break the Windows build from Davinder Singh
and Amit Kapila.

Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-27 09:25:41 +05:30
Peter Geoghegan 2c0797da2c Silence another compiler warning in nbtinsert.c.
Per complaint from Álvaro Herrera.
2020-02-26 15:15:45 -08:00
Tom Lane a477bfc1df Suppress unnecessary RelabelType nodes in more cases.
eval_const_expressions sometimes produced RelabelType nodes that
were useless because they just relabeled an expression to the same
exposed type it already had.  This is worth avoiding because it can
cause two equivalent expressions to not be equal(), preventing
recognition of useful optimizations.  In the test case added here,
an unpatched planner fails to notice that the "sqli = constant" clause
renders a sort step unnecessary, because one code path produces an
extra RelabelType and another doesn't.

Fix by ensuring that eval_const_expressions_mutator's T_RelabelType
case will not add in an unnecessary RelabelType.  Also save some
code by sharing a subroutine with the effectively-equivalent cases
for CollateExpr and CoerceToDomain.  (CollateExpr had no bug, and
I think that the case couldn't arise with CoerceToDomain, but
it seems prudent to do the same check for all three cases.)

Back-patch to v12.  In principle this has been wrong all along,
but I haven't seen a case where it causes visible misbehavior
before v12, so refrain from changing stable branches unnecessarily.

Per investigation of a report from Eric Gillum.

Discussion: https://postgr.es/m/CAMmjdmvAZsUEskHYj=KT9sTukVVCiCSoe_PBKOXsncFeAUDPCQ@mail.gmail.com
2020-02-26 18:14:12 -05:00
Peter Geoghegan 2d8a6fad18 Silence compiler warning in nbtinsert.c.
Per buildfarm member longfin.
2020-02-26 13:17:36 -08:00
Peter Geoghegan 0d861bbb70 Add deduplication to nbtree.
Deduplication reduces the storage overhead of duplicates in indexes that
use the standard nbtree index access method.  The deduplication process
is applied lazily, after the point where opportunistic deletion of
LP_DEAD-marked index tuples occurs.  Deduplication is only applied at
the point where a leaf page split would otherwise be required.  New
posting list tuples are formed by merging together existing duplicate
tuples.  The physical representation of the items on an nbtree leaf page
is made more space efficient by deduplication, but the logical contents
of the page are not changed.  Even unique indexes make use of
deduplication as a way of controlling bloat from duplicates whose TIDs
point to different versions of the same logical table row.

The lazy approach taken by nbtree has significant advantages over a GIN
style eager approach.  Most individual inserts of index tuples have
exactly the same overhead as before.  The extra overhead of
deduplication is amortized across insertions, just like the overhead of
page splits.  The key space of indexes works in the same way as it has
since commit dd299df8 (the commit that made heap TID a tiebreaker
column).

Testing has shown that nbtree deduplication can generally make indexes
with about 10 or 15 tuples for each distinct key value about 2.5X - 4X
smaller, even with single column integer indexes (e.g., an index on a
referencing column that accompanies a foreign key).  The final size of
single column nbtree indexes comes close to the final size of a similar
contrib/btree_gin index, at least in cases where GIN's posting list
compression isn't very effective.  This can significantly improve
transaction throughput, and significantly reduce the cost of vacuuming
indexes.

A new index storage parameter (deduplicate_items) controls the use of
deduplication.  The default setting is 'on', so all new B-Tree indexes
automatically use deduplication where possible.  This decision will be
reviewed at the end of the Postgres 13 beta period.

There is a regression of approximately 2% of transaction throughput with
synthetic workloads that consist of append-only inserts into a table
with several non-unique indexes, where all indexes have few or no
repeated values.  The underlying issue is that cycles are wasted on
unsuccessful attempts at deduplicating items in non-unique indexes.
There doesn't seem to be a way around it short of disabling
deduplication entirely.  Note that deduplication of items in unique
indexes is fairly well targeted in general, which avoids the problem
there (we can use a special heuristic to trigger deduplication passes in
unique indexes, since we're specifically targeting "version bloat").

Bump XLOG_PAGE_MAGIC because xl_btree_vacuum changed.

No bump in BTREE_VERSION, since the representation of posting list
tuples works in a way that's backwards compatible with version 4 indexes
(i.e. indexes built on PostgreSQL 12).  However, users must still
REINDEX a pg_upgrade'd index to use deduplication, regardless of the
Postgres version they've upgraded from.  This is the only way to set the
new nbtree metapage flag indicating that deduplication is generally
safe.

Author: Anastasia Lubennikova, Peter Geoghegan
Reviewed-By: Peter Geoghegan, Heikki Linnakangas
Discussion:
    https://postgr.es/m/55E4051B.7020209@postgrespro.ru
    https://postgr.es/m/4ab6e2db-bcee-f4cf-0916-3a06e6ccbb55@postgrespro.ru
2020-02-26 13:05:30 -08:00
Peter Geoghegan 612a1ab767 Add equalimage B-Tree support functions.
Invent the concept of a B-Tree equalimage ("equality implies image
equality") support function, registered as support function 4.  This
indicates whether it is safe (or not safe) to apply optimizations that
assume that any two datums considered equal by an operator class's order
method must be interchangeable without any loss of semantic information.
This is static information about an operator class and a collation.

Register an equalimage routine for almost all of the existing B-Tree
opclasses.  We only need two trivial routines for all of the opclasses
that are included with the core distribution.  There is one routine for
opclasses that index non-collatable types (which returns 'true'
unconditionally), plus another routine for collatable types (which
returns 'true' when the collation is a deterministic collation).

This patch is infrastructure for an upcoming patch that adds B-Tree
deduplication.

Author: Peter Geoghegan, Anastasia Lubennikova
Discussion: https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
2020-02-26 11:28:25 -08:00
Andres Freund 2742c45080 expression eval: Reduce number of steps for agg transition invocations.
Do so by combining the various steps that are part of aggregate
transition function invocation into one larger step. As some of the
current steps are only necessary for some aggregates, have one variant
of the aggregate transition step for each possible combination.

To avoid further manual copies of code in the different transition
step implementations, move most of the code into helper functions
marked as "always inline".

The benefit of this change is an increase in performance when
aggregating lots of rows. This comes in part due to the reduced number
of indirect jumps due to the reduced number of steps, and in part by
reducing redundant setup code across steps. This mainly benefits
interpreted execution, but the code generated by JIT is also improved
a bit.

As a nice side-effect it also ends up making the code a bit simpler.

A small additional optimization is removing the need to set
aggstate->curaggcontext before calling ExecAggInitGroup, choosing to
instead passign curaggcontext as an argument. It was, in contrast to
other aggregate related functions, only needed to fetch a memory
context to copy the transition value into.

Author: Andres Freund
Discussion:
   https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
   https://postgr.es/m/5c371df7cee903e8cd4c685f90c6c72086d3a2dc.camel@j-davis.com
2020-02-24 15:09:09 -08:00
Michael Paquier 7d672b76bf Issue properly WAL record for CID of first catalog tuple in multi-insert
Multi-insert for heap is not yet used actively for catalogs, but the
code to support this case is in place for logical decoding.  The
existing code forgot to issue a XLOG_HEAP2_NEW_CID record for the first
tuple inserted, leading to failures when attempting to use multiple
inserts for catalogs at decoding time.  This commit fixes the problem by
WAL-logging the needed CID.

This is not an active bug, so no back-patch is done.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/E0D4CC67-A1CF-4DF4-991D-B3AC2EB5FAE9@yesql.se
2020-02-25 07:55:22 +09:00
Tom Lane 3d475515a1 Account explicitly for long-lived FDs that are allocated outside fd.c.
The comments in fd.c have long claimed that all file allocations should
go through that module, but in reality that's not always practical.
fd.c doesn't supply APIs for invoking some FD-producing syscalls like
pipe() or epoll_create(); and the APIs it does supply for non-virtual
FDs are mostly insistent on releasing those FDs at transaction end;
and in some cases the actual open() call is in code that can't be made
to use fd.c, such as libpq.

This has led to a situation where, in a modern server, there are likely
to be seven or so long-lived FDs per backend process that are not known
to fd.c.  Since NUM_RESERVED_FDS is only 10, that meant we had *very*
few spare FDs if max_files_per_process is >= the system ulimit and
fd.c had opened all the files it thought it safely could.  The
contrib/postgres_fdw regression test, in particular, could easily be
made to fall over by running it under a restrictive ulimit.

To improve matters, invent functions Acquire/Reserve/ReleaseExternalFD
that allow outside callers to tell fd.c that they have or want to allocate
a FD that's not directly managed by fd.c.  Add calls to track all the
fixed FDs in a standard backend session, so that we are honestly
guaranteeing that NUM_RESERVED_FDS FDs remain unused below the EMFILE
limit in a backend's idle state.  The coding rules for these functions say
that there's no need to call them in code that just allocates one FD over
a fairly short interval; we can dip into NUM_RESERVED_FDS for such cases.
That means that there aren't all that many places where we need to worry.
But postgres_fdw and dblink must use this facility to account for
long-lived FDs consumed by libpq connections.  There may be other places
where it's worth doing such accounting, too, but this seems like enough
to solve the immediate problem.

Internally to fd.c, "external" FDs are limited to max_safe_fds/3 FDs.
(Callers can choose to ignore this limit, but of course it's unwise
to do so except for fixed file allocations.)  I also reduced the limit
on "allocated" files to max_safe_fds/3 FDs (it had been max_safe_fds/2).
Conceivably a smarter rule could be used here --- but in practice,
on reasonable systems, max_safe_fds should be large enough that this
isn't much of an issue, so KISS for now.  To avoid possible regression
in the number of external or allocated files that can be opened,
increase FD_MINFREE and the lower limit on max_files_per_process a
little bit; we now insist that the effective "ulimit -n" be at least 64.

This seems like pretty clearly a bug fix, but in view of the lack of
field complaints, I'll refrain from risking a back-patch.

Discussion: https://postgr.es/m/E1izCmM-0005pV-Co@gemulon.postgresql.org
2020-02-24 17:28:33 -05:00
Robert Haas a91e2fa941 Adapt hashfn.c and hashutils.h for frontend use.
hash_any() and its various variants are defined to return Datum,
which is a backend-only concept, but the underlying functions
actually want to return uint32 and uint64, and only return Datum
because it's convenient for callers who are using them to
implement a hash function for some SQL datatype.

However, changing these functions to return uint32 and uint64
seems like it might lead to programming errors or back-patching
difficulties, both because they are widely used and because
failure to use UInt{32,64}GetDatum() might not provoke a
compilation error. Instead, rename the existing functions as
well as changing the return type, and add static inline wrappers
for those callers that need the previous behavior.

Although this commit adapts hashutils.h and hashfn.c so that they
can be compiled as frontend code, it does not actually do
anything that would cause them to be so compiled. That is left
for another commit.

Patch by me, reviewed by Suraj Kharage and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-24 17:27:15 +05:30
Robert Haas 9341c783cc Put all the prototypes for hashfn.c into the same header file.
Previously, some of the prototypes for functions in hashfn.c were
in utils/hashutils.h and others were in utils/hsearch.h, but that
is confusing and has no particular benefit.

Patch by me, reviewed by Suraj Kharage and Mark Dilger.

Discussion: http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
2020-02-24 17:22:45 +05:30