Commit Graph

3933 Commits

Author SHA1 Message Date
Tom Lane
a5cc4dab6d Yet more elimination of dead stores and useless initializations.
I'm not sure what tool Ranier was using, but the ones I contributed
were found by using a newer version of scan-build than I tried before.

Ranier Vilela and Tom Lane

Discussion: https://postgr.es/m/CAEudQAo1+AcGppxDSg8k+zF4+Kv+eJyqzEDdbpDg58-=MQcerQ@mail.gmail.com
2020-09-05 13:17:32 -04:00
Michael Paquier
8febfd1855 Switch to multi-inserts when registering dependencies for many code paths
This commit improves the dependency registrations by taking advantage of
the preliminary work done in 63110c62, to group together the insertion
of dependencies of the same type to pg_depend.  With the current layer
of routines available, and as only dependencies of the same type can be
grouped, there are code paths still doing more than one multi-insert
when it is necessary to register dependencies of multiple types
(constraint and index creation are two cases doing that).

While at it, this refactors some of the code to use ObjectAddressSet()
when manipulating object addresses.

Author: Daniel Gustafsson, Michael Paquier
Reviewed-by: Andres Freund, Álvaro Herrera
Discussion: https://postgr.es/m/20200807061619.GA23955@paquier.xyz
2020-09-05 21:33:53 +09:00
Michael Paquier
844c05abc3 Remove variable "concurrent" from ReindexStmt
This node already handles multiple options using a bitmask, so having a
separate boolean flag is not necessary.  This simplifies the code a bit
with less arguments to give to the reindex routines, by replacing the
boolean with an equivalent bitmask value.

Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/20200902110326.GA14963@paquier.xyz
2020-09-04 10:43:32 +09:00
Amit Kapila
464824323e Add support for streaming to built-in logical replication.
To add support for streaming of in-progress transactions into the
built-in logical replication, we need to do three things:

* Extend the logical replication protocol to identify in-progress
transactions, and allow adding additional bits of information (e.g.
XID of subtransactions).

* Modify the output plugin (pgoutput) to implement the new stream
API callbacks, by leveraging the extended replication protocol.

* Modify the replication apply worker to properly handle streamed
in-progress transactions by spilling the data to disk and then
replaying them on commit.

We however must explicitly disable streaming replication during
replication slot creation, even if the plugin supports it. We
don't need to replicate the changes accumulated during this phase,
and moreover we don't have a replication connection open, so there is
nowhere to send the data anyway.
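
Once both publisher and subscriber run a version supporting it, streaming
is opted into per subscription; a minimal sketch (connection string and
object names illustrative):

    CREATE SUBSCRIPTION mysub
        CONNECTION 'host=publisher dbname=postgres'
        PUBLICATION mypub
        WITH (streaming = on);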

Author: Tomas Vondra, Dilip Kumar and Amit Kapila
Reviewed-by: Amit Kapila, Kuntal Ghosh and Ajin Cherian
Tested-by: Neha Sharma, Mahendra Singh Thalor and Ajin Cherian
Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com
2020-09-03 07:54:07 +05:30
Michael Paquier
1d65416661 Improve handling of dropped relations for REINDEX DATABASE/SCHEMA/SYSTEM
When multiple relations are reindexed, a scan of pg_class is done first
to build the list of relations to work on.  However the REINDEX logic
has never checked if a relation listed still exists when beginning the
work on it, causing for example sudden cache lookup failures.

This commit adds safeguards against dropped relations for REINDEX,
similarly to VACUUM or CLUSTER where we try to open the relation,
ignoring it if it is missing.  A new option is added to the REINDEX
routines to control whether a missing relation is OK to ignore or not.

An isolation test, based on REINDEX SCHEMA, is added for the concurrent
and non-concurrent cases.

Author: Michael Paquier
Reviewed-by: Anastasia Lubennikova
Discussion: https://postgr.es/m/20200813043805.GE11663@paquier.xyz
2020-09-02 09:08:12 +09:00
Tom Lane
a7212be8b9 Set cutoff xmin more aggressively when vacuuming a temporary table.
Since other sessions aren't allowed to look into a temporary table
of our own session, we do not need to worry about the global xmin
horizon when setting the vacuum XID cutoff.  Indeed, if we're not
inside a transaction block, we may set oldestXmin to be the next
XID, because there cannot be any in-doubt tuples in a temp table,
nor any tuples that are dead but still visible to some snapshot of
our transaction.  (VACUUM, of course, is never inside a transaction
block; but we need to test that because CLUSTER shares the same code.)

This approach allows us to always clean out a temp table completely
during VACUUM, independently of concurrent activity.  Aside from
being useful in its own right, that simplifies building reproducible
test cases.
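
A quick way to see the effect (names illustrative):

    CREATE TEMP TABLE tmp_demo AS
        SELECT g AS id FROM generate_series(1, 1000) g;
    DELETE FROM tmp_demo;
    -- removes all 1000 dead row versions, even if other sessions hold
    -- snapshots that would normally hold back the global xmin horizon
    VACUUM (VERBOSE) tmp_demo;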

Discussion: https://postgr.es/m/3490536.1598629609@sss.pgh.pa.us
2020-09-01 18:40:43 -04:00
Alvaro Herrera
afc7e0ad55 Raise error on concurrent drop of partitioned index
We were already raising an error for DROP INDEX CONCURRENTLY on a
partitioned table, albeit a different and confusing one:
  ERROR:  DROP INDEX CONCURRENTLY must be first action in transaction

Change that to throw a more comprehensible error:
  ERROR:  cannot drop partitioned index "%s" concurrently
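
For example (names illustrative):

    CREATE TABLE parted (a int) PARTITION BY RANGE (a);
    CREATE INDEX parted_a_idx ON parted (a);
    DROP INDEX CONCURRENTLY parted_a_idx;
    ERROR:  cannot drop partitioned index "parted_a_idx" concurrently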

Michael Paquier authored the test case for indexes on temporary
partitioned tables.

Backpatch to 11, where indexes on partitioned tables were added.

Reported-by: Jan Mussler <jan.mussler@zalando.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/16594-d2956ca909585067@postgresql.org
2020-09-01 13:40:43 -04:00
Tom Lane
3d351d916b Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.
Historically, we've considered the state with relpages and reltuples
both zero as indicating that we do not know the table's tuple density.
This is problematic because it's impossible to distinguish "never yet
vacuumed" from "vacuumed and seen to be empty".  In particular, a user
cannot use VACUUM or ANALYZE to override the planner's normal heuristic
that an empty table should not be believed to be empty because it is
probably about to get populated.  That heuristic is a good safety
measure, so I don't care to abandon it, but there should be a way to
override it if the table is indeed intended to stay empty.

Hence, represent the initial state of ignorance by setting reltuples
to -1 (relpages is still set to zero), and apply the minimum-ten-pages
heuristic only when reltuples is still -1.  If the table is empty,
VACUUM or ANALYZE (but not CREATE INDEX) will override that to
reltuples = relpages = 0, and then we'll plan on that basis.
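
The two states can be observed like this (sketch; names illustrative):

    CREATE TABLE t (a int);
    SELECT relpages, reltuples FROM pg_class WHERE relname = 't';
    -- 0 | -1    density unknown; the minimum-ten-pages heuristic applies
    VACUUM t;
    SELECT relpages, reltuples FROM pg_class WHERE relname = 't';
    -- 0 |  0    known to be empty, and planned as such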

This requires a bunch of fiddly little changes, but we can get rid of
some ugly kluges that were formerly needed to maintain the old definition.

One notable point is that FDWs' GetForeignRelSize methods will see
baserel->tuples = -1 when no ANALYZE has been done on the foreign table.
That seems like a net improvement, since those methods were formerly
also in the dark about what baserel->tuples = 0 really meant.  Still,
it is an API change.

I bumped catversion because code predating this change would get confused
by seeing reltuples = -1.

Discussion: https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
2020-08-30 12:21:51 -04:00
Tom Lane
5b02d68e75 Fix ALTER TABLE's scheduling rules for AT_AddConstraint subcommands.
Commit 1281a5c90 rearranged the logic in this area rather drastically,
and it broke the case of adding a foreign key constraint in the same
ALTER that adds the pkey or unique constraint it depends on.  While
self-referential fkeys are surely a pretty niche case, this used to
work so we shouldn't break it.
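
That is, a command like the following works again (names illustrative):

    CREATE TABLE node (id int, parent int);
    ALTER TABLE node
        ADD PRIMARY KEY (id),
        ADD FOREIGN KEY (parent) REFERENCES node;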

To fix, reorganize the scheduling rules in ATParseTransformCmd so
that a transformed AT_AddConstraint subcommand will be delayed into
a later pass in all cases, not only when it's been spit out as a
side-effect of parsing some other command type.

Also tweak the logic so that we won't run ATParseTransformCmd twice
while doing this.  It seems to work even without that, but it's
surely wasting cycles to do so.

Per bug #16589 from Jeremy Evans.  Back-patch to v13 where the new
code was introduced.

Discussion: https://postgr.es/m/16589-31c8d981ca503896@postgresql.org
2020-08-22 12:34:17 -04:00
Tom Lane
5028981923 Fix handling of CREATE TABLE LIKE with inheritance.
If a CREATE TABLE command uses both LIKE and traditional inheritance,
Vars in CHECK constraints and expression indexes that are absorbed
from a LIKE parent table tended to get mis-numbered, resulting in
wrong answers and/or bizarre error messages (though probably not any
actual crashes, thanks to validation occurring in the executor).
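
A minimal sketch of the affected LIKE case (names illustrative):

    CREATE TABLE src (a int, b int CHECK (b > a));
    CREATE TABLE parent (p int);
    -- the inherited column "p" comes first, so the Vars in the copied
    -- CHECK constraint must be renumbered to still refer to a and b
    CREATE TABLE child (LIKE src INCLUDING CONSTRAINTS) INHERITS (parent);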

In v12 and up, the same could happen to Vars in GENERATED expressions,
even in cases with no LIKE clause but multiple traditional-inheritance
parents.

The cause of the problem for LIKE is that parse_utilcmd.c supposed
it could renumber such Vars correctly during transformCreateStmt(),
which it cannot since we have not yet accounted for columns added via
inheritance.  Fix that by postponing processing of LIKE INCLUDING
CONSTRAINTS, DEFAULTS, GENERATED, INDEXES till after we've performed
DefineRelation().

The error with GENERATED and multiple inheritance is a simple oversight
in MergeAttributes(); it knows it has to renumber Vars in inherited
CHECK constraints, but forgot to apply the same processing to inherited
GENERATED expressions (a/k/a defaults).

Per bug #16272 from Tom Gottfried.  The non-GENERATED variants of the
issue are ancient, presumably dating right back to the addition of
CREATE TABLE LIKE; hence back-patch to all supported branches.

Discussion: https://postgr.es/m/16272-6e32da020e9a9381@postgresql.org
2020-08-21 15:00:47 -04:00
Fujii Masao
9d701e624f Rework EXPLAIN for planner's buffer usage.
Commit ce77abe63c allowed EXPLAIN (BUFFERS) to report the information
on buffer usage during the planning phase. However, three issues were
reported regarding this feature.

(1) Previously, EXPLAIN option BUFFERS required ANALYZE. So the query
    had to be actually executed by specifying ANALYZE even when we
    want to see only the planner's buffer usage. This was inconvenient
    especially when the query was a write query such as DELETE.

(2) EXPLAIN included the planner's buffer usage in the summary
    information, so the SUMMARY option had to be enabled to report it.
    Also, this format was confusing.

(3) The output structure for planning information was not consistent
    between TEXT format and the others. For example, "Planning" tag
    was output in JSON format, but not in TEXT format.

For (1), this commit allows us to perform EXPLAIN (BUFFERS) without
ANALYZE to report the planner's buffer usage.

For (2), this commit changed EXPLAIN output so that the planner's
buffer usage is reported before summary information.

For (3), this commit made the output structure for planning
information more consistent between the formats.
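
For example, the planner's buffer usage of a DELETE can now be inspected
without executing it, and the text format gains a "Planning" section
(values illustrative):

    EXPLAIN (BUFFERS) DELETE FROM t WHERE id = 1;
    ...
     Planning:
       Buffers: shared hit=12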

Back-patch to v13 where the planner's buffer usage was allowed to
be reported in EXPLAIN.

Reported-by: Pierre Giraud, David Rowley
Author: Fujii Masao
Reviewed-by: David Rowley, Julien Rouhaud, Pierre Giraud
Discussion: https://postgr.es/m/07b226e6-fa49-687f-b110-b7c37572f69e@dalibo.com
2020-08-21 20:48:59 +09:00
Noah Misch
676a9c3cc4 Correct several behavior descriptions in comments.
Reuse cautionary language from src/test/ssl/README in
src/test/kerberos/README.  SLRUs have had access to six-character
segments names since commit 73c986adde,
and recovery stopped calling HeapTupleHeaderAdvanceLatestRemovedXid() in
commit 558a9165e0.  The other corrections
are more self-evident.
2020-08-15 20:21:52 -07:00
Noah Misch
566372b3d6 Prevent concurrent SimpleLruTruncate() for any given SLRU.
The SimpleLruTruncate() header comment states the new coding rule.  To
achieve this, add locktype "frozenid" and two LWLocks.  This closes a
rare opportunity for data loss, which manifested as "apparent
wraparound" or "could not access status of transaction" errors.  Data
loss is more likely in pg_multixact, due to released branches' thin
margin between multiStopLimit and multiWrapLimit.  If a user's physical
replication primary logged ":  apparent wraparound" messages, the user
should rebuild standbys of that primary regardless of symptoms.  At less
risk is a cluster having emitted "not accepting commands" errors or
"must be vacuumed" warnings at some point.  One can test a cluster for
this data loss by running VACUUM FREEZE in every database.  Back-patch
to 9.5 (all supported versions).

Discussion: https://postgr.es/m/20190218073103.GA1434723@rfd.leadboat.com
2020-08-15 10:15:53 -07:00
Andres Freund
5788e258bb snapshot scalability: Move PGXACT->vacuumFlags to ProcGlobal->vacuumFlags.
Similar to the previous commit this increases the chance that data
frequently needed by GetSnapshotData() stays in l2 cache. As we now
take care to not unnecessarily write to ProcGlobal->vacuumFlags, there
should be very few modifications to the ProcGlobal->vacuumFlags array.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
2020-08-14 15:33:35 -07:00
Andres Freund
941697c3c1 snapshot scalability: Introduce dense array of in-progress xids.
The new array contains the xids for all connected backends / in-use
PGPROC entries in a dense manner (in contrast to the PGPROC/PGXACT
arrays which can have unused entries interspersed).

This improves performance because GetSnapshotData() always needs to
scan the xids of all live procarray entries and now there's no need to
go through the procArray->pgprocnos indirection anymore.

As the set of running top-level xids changes rarely, compared to the
number of snapshots taken, this substantially increases the likelihood
of most data required for a snapshot being in l2 cache.  In
read-mostly workloads scanning the xids[] array will be sufficient to
build a snapshot, as most backends will not have an xid assigned.

To keep the xid array dense ProcArrayRemove() needs to move entries
behind the to-be-removed proc's one further up in the array. Obviously
moving array entries cannot happen while a backend sets its
xid, i.e. locking needs to prevent array entries from being moved while
a backend modifies its xid.

To avoid locking ProcArrayLock in GetNewTransactionId() - a fairly hot
spot already - ProcArrayAdd() / ProcArrayRemove() now need to hold
XidGenLock in addition to ProcArrayLock. Adding / Removing a procarray
entry is not a very frequent operation, even taking 2PC into account.

Due to the above, the dense array entries can only be read or modified
while holding ProcArrayLock and/or XidGenLock. This prevents a
concurrent ProcArrayRemove() from shifting the dense array while it is
accessed concurrently.

While the new dense array is very good when needing to look at all
xids, it is less suitable when accessing a single backend's xid. In
particular it would be problematic to have to acquire a lock to access
a backend's own xid. Therefore a backend's xid is not just stored in
the dense array, but also in PGPROC. This also allows a backend to
access the shared xid value only when the backend has acquired an
xid.

The infrastructure added in this commit will be used for the remaining
PGXACT fields in subsequent commits. They are kept separate to make
review easier.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
2020-08-14 15:33:35 -07:00
Andres Freund
1f51c17c68 snapshot scalability: Move PGXACT->xmin back to PGPROC.
Now that xmin isn't needed for GetSnapshotData() anymore, it leads to
unnecessary cacheline ping-pong to have it in PGXACT, as it is updated
considerably more frequently than the other PGXACT members.

After the changes in dc7420c2c9, this is a very straight-forward change.

For highly concurrent, snapshot acquisition heavy, workloads this change alone
can significantly increase scalability. E.g. plain pgbench on a smaller 2
socket machine gains 1.07x for read-only pgbench, 1.22x for read-only pgbench
when submitting queries in batches of 100, and 2.85x for batches of 100
'SELECT;'.  The latter numbers are obviously not to be expected in the
real world, but they micro-benchmark the snapshot computation
scalability (previously ~80% of the time was spent in GetSnapshotData()).

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
2020-08-13 16:25:21 -07:00
Andres Freund
dc7420c2c9 snapshot scalability: Don't compute global horizons while building snapshots.
To make GetSnapshotData() more scalable, it cannot look at each proc's
xmin: While snapshot contents do not need to change whenever a read-only
transaction commits or a snapshot is released, a proc's xmin is modified in
those cases. The frequency of xmin modifications leads to, particularly on
higher core count systems, many cache misses inside GetSnapshotData(), despite
the data underlying a snapshot not changing. That is the most
significant source of GetSnapshotData() scaling poorly on larger systems.

Without accessing xmins, GetSnapshotData() cannot calculate accurate horizons /
thresholds as it has so far. But we don't really have to: The horizons don't
actually change that much between GetSnapshotData() calls. Nor are the horizons
actually used every time a snapshot is built.

The trick this commit introduces is to delay computation of accurate horizons
until their use, and to use horizon boundaries to determine whether accurate
horizons need to be computed.

The use of RecentGlobal[Data]Xmin to decide whether a row version could be
removed has been replaced with new GlobalVisTest* functions.  These use two
thresholds to determine whether a row can be pruned:
1) definitely_needed, indicating that rows deleted by XIDs >= definitely_needed
   are definitely still visible.
2) maybe_needed, indicating that rows deleted by XIDs < maybe_needed can
   definitely be removed.
GetSnapshotData() updates definitely_needed to be the xmin of the computed
snapshot.

When testing whether a row can be removed (with GlobalVisTestIsRemovableXid())
and the tested XID falls in between the two (i.e. XID >= maybe_needed && XID <
definitely_needed) the boundaries can be recomputed to be more accurate. As it
is not cheap to compute accurate boundaries, we limit the number of times that
happens in short succession.  As the boundaries used by
GlobalVisTestIsRemovableXid() are never reset (with maybe_needed updated by
GetSnapshotData()), it is likely that further tests can benefit from an earlier
computation of accurate horizons.

To avoid regressing performance when old_snapshot_threshold is set (as that
requires an accurate horizon to be computed), heap_page_prune_opt() doesn't
unconditionally call TransactionIdLimitedForOldSnapshots() anymore. Both the
computation of the limited horizon, and the triggering of errors (with
SetOldSnapshotThresholdTimestamp()) are now only done when necessary to remove
tuples.

This commit just removes the accesses to PGXACT->xmin from
GetSnapshotData(), but other members of PGXACT residing in the same
cache line are accessed. Therefore this in itself does not result in a
significant improvement. Subsequent commits will take advantage of the
fact that GetSnapshotData() now does not need to access xmins anymore.

Note: This contains a workaround in heap_page_prune_opt() to keep the
snapshot_too_old tests working. While that workaround is ugly, the tests
currently are not meaningful, and it seems best to address them separately.

Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de
2020-08-12 16:03:49 -07:00
Peter Eisentraut
1784f278a6 Replace remaining StrNCpy() by strlcpy()
They are equivalent, except that StrNCpy() zero-fills the entire
destination buffer instead of providing just one trailing zero.  For
all but a tiny number of callers, that's just overhead rather than
being desirable.

Remove StrNCpy() as it is now unused.

In some cases, namestrcpy() is the more appropriate function to use.
While we're here, simplify the API of namestrcpy(): Remove the return
value, don't check for NULL input.  Nothing was using that anyway.
Also, remove a few unused name-related functions.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/44f5e198-36f6-6cdb-7fa9-60e34784daae%402ndquadrant.com
2020-08-10 23:20:37 +02:00
Tom Lane
7eeb1d9861 Make contrib modules' installation scripts more secure.
Hostile objects located within the installation-time search_path could
capture references in an extension's installation or upgrade script.
If the extension is being installed with superuser privileges, this
opens the door to privilege escalation.  While such hazards have existed
all along, their urgency increases with the v13 "trusted extensions"
feature, because that lets a non-superuser control the installation path
for a superuser-privileged script.  Therefore, make a number of changes
to make such situations more secure:

* Tweak the construction of the installation-time search_path to ensure
that references to objects in pg_catalog can't be subverted; and
explicitly add pg_temp to the end of the path to prevent attacks using
temporary objects.

* Disable check_function_bodies within installation/upgrade scripts,
so that any security gaps in SQL-language or PL-language function bodies
cannot create a risk of unwanted installation-time code execution.

* Adjust lookup of type input/receive functions and join estimator
functions to complain if there are multiple candidate functions.  This
prevents capture of references to functions whose signature is not the
first one checked; and it's arguably more user-friendly anyway.

* Modify various contrib upgrade scripts to ensure that catalog
modification queries are executed with secure search paths.  (These
are in-place modifications with no extension version changes, since
it is the update process itself that is at issue, not the end result.)

Extensions that depend on other extensions cannot be made fully secure
by these methods alone; therefore, revert the "trusted" marking that
commit eb67623c9 applied to earthdistance and hstore_plperl, pending
some better solution to that set of issues.

Also add documentation around these issues, to help extension authors
write secure installation scripts.

Patch by me, following an observation by Andres Freund; thanks
to Noah Misch for review.

Security: CVE-2020-14350
2020-08-10 10:44:42 -04:00
Alvaro Herrera
cea3d55898 Remove PROC_IN_ANALYZE and derived flags
These flags are unused and always have been.

Discussion: https://postgr.es/m/20200805235549.GA8118@alvherre.pgsql
2020-08-07 17:24:40 -04:00
David Rowley
d5e96520ff Fix bogus EXPLAIN output for Hash Aggregate
9bdb300de modified the EXPLAIN output for Hash Aggregate to show details
from parallel workers. However, it neglected to consider that a given
parallel worker may not have assisted with the given Hash Aggregate. This
can occur when workers fail to start or during Parallel Append with
enable_partitionwise_join enabled when only a single worker is working on
a non-parallel aware sub-plan. It could also happen if a worker simply
wasn't fast enough to get any work done before other processes went and
finished all the work.

The bogus output came from the fact that ExplainOpenWorker() skipped
showing any details for non-initialized workers but show_hashagg_info()
did show details from the worker.  This meant that the worker properties
that were shown were not properly attributed to the worker that they
belong to.

In passing, we also now don't show Hash Aggregate properties for the
leader process when it did not contribute any work to the Hash Aggregate.
This can occur either during Parallel Append when only a parallel worker
worked on a given sub plan or with parallel_leader_participation set to
off.  This aims to make the behavior of Hash Aggregate's EXPLAIN output
more similar to Sort's.

Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20200805012105.GZ28072%40telsasoft.com
Backpatch-through: 13, where the original breakage was introduced
2020-08-07 10:22:18 +12:00
David Rowley
6ee3b5fb99 Use int64 instead of long in incremental sort code
Windows 64-bit has 4-byte long values, which are not suitable for tracking
disk space usage in the incremental sort code. Let's just make all these
fields int64s.

Author: James Coleman
Discussion: https://postgr.es/m/CAApHDvpky%2BUhof8mryPf5i%3D6e6fib2dxHqBrhp0Qhu0NeBhLJw%40mail.gmail.com
Backpatch-through: 13, where the incremental sort code was added
2020-08-02 14:24:46 +12:00
Tom Lane
9f9682783b Invent "amadjustmembers" AM method for validating opclass members.
This allows AM-specific knowledge to be applied during creation of
pg_amop and pg_amproc entries.  Specifically, the AM knows better than
core code which entries to consider as required or optional.  Giving
the latter entries the appropriate sort of dependency allows them to
be dropped without taking out the whole opclass or opfamily; which
is something we'd like to have to correct obsolescent entries in
extensions.

This callback also opens the door to performing AM-specific validity
checks during opclass creation, rather than hoping that an opclass
developer will remember to test with "amvalidate".  For the most part
I've not actually added any such checks yet; that can happen in a
follow-on patch.  (Note that we shouldn't remove any tests from
"amvalidate", as those are still needed to cross-check manually
constructed entries in the initdb data.  So adding tests to
"amadjustmembers" will be somewhat duplicative, but it seems like
a good idea anyway.)

Patch by me, reviewed by Alexander Korotkov, Hamid Akhtar, and
Anastasia Lubennikova.

Discussion: https://postgr.es/m/4578.1565195302@sss.pgh.pa.us
2020-08-01 17:12:47 -04:00
Tom Lane
3d2376d55c Fix oversight in ALTER TYPE: typmodin/typmodout must propagate to arrays.
If a base type supports typmods, its array type does too, with the
same interpretation.  Hence changes in pg_type.typmodin/typmodout
must be propagated to the array type.

While here, improve AlterTypeRecurse to not recurse to domains if
there is nothing we'd need to change.
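
For example, assuming a type mytype whose typmod support functions were
just changed (names illustrative), the following now updates the array
type's entry as well:

    ALTER TYPE mytype
        SET (TYPMOD_IN = mytype_typmodin, TYPMOD_OUT = mytype_typmodout);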

Oversight in fe30e7ebf.  Back-patch to v13 where that came in.
2020-07-31 17:11:28 -04:00
Michael Paquier
e3931d01f3 Use multi-inserts for pg_attribute and pg_shdepend
For pg_attribute, this allows inserting a full set of attributes for a
relation at once (roughly 15% of WAL reduction in extreme cases).  For
pg_shdepend, this reduces the work done when creating new shared
dependencies from a database template.  The number of slots used for the
insertion is capped at 64kB of data inserted for both, depending on the
number of items to insert and the length of the rows involved.

More can be done for other catalogs, like pg_depend.  This part requires
a different approach as the number of slots to use depends also on the
number of entries discarded as pinned dependencies.  This is also
related to the rework of dependency handling for ALTER TABLE and CREATE
TABLE, mainly.

Author: Daniel Gustafsson
Reviewed-by: Andres Freund, Michael Paquier
Discussion: https://postgr.es/m/20190213182737.mxn6hkdxwrzgxk35@alap3.anarazel.de
2020-07-31 10:54:26 +09:00
David Rowley
0e3e1c4e1c Make EXPLAIN ANALYZE of HashAgg more similar to Hash Join
There were various unnecessary differences between Hash Agg's EXPLAIN
ANALYZE output and Hash Join's.  Here we modify the Hash Agg output so
that it's better aligned to Hash Join's.

The following changes have been made:
1. Start batches counter at 1 instead of 0.
2. Always display the "Batches" property, even when we didn't spill to
   disk.
3. Use the text "Batches" instead of "HashAgg Batches" for text format.
4. Use the text "Memory Usage" instead of "Peak Memory Usage" for text
   format.
5. Include "Batches" before "Memory Usage" in both text and non-text
   formats.
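
With these changes, a hash aggregation that stayed in memory now shows up
in the text format roughly as (values illustrative):

    HashAggregate
      Group Key: x
      Batches: 1  Memory Usage: 24kB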

In passing, also modify the "Planned Partitions" property so that we show
it regardless of whether the value is 0 for non-text EXPLAIN formats.
This was pointed out by Justin Pryzby and probably should have been part
of 40efbf870.

Reviewed-by: Justin Pryzby, Jeff Davis
Discussion: https://postgr.es/m/CAApHDvrshRnA6C0VFnu7Fb9TVvgGo80PUMm5+2DiaS1gEkPvtw@mail.gmail.com
Backpatch-through: 13, where HashAgg batching was introduced
2020-07-29 11:42:21 +12:00
Tom Lane
0a0727ccfc Improve performance of binary COPY FROM through better buffering.
At least on Linux and macOS, fread() turns out to have far higher
per-call overhead than one could wish.  Reading 64KB of data at a time
and then parceling it out with our own memcpy logic makes binary COPY
from a file significantly faster --- around 30% in simple testing for
cases with narrow text columns (on Linux ... even more on macOS).

In binary COPY from frontend, there's no per-call fread(), and this
patch introduces an extra layer of memcpy'ing, but it still manages
to eke out a small win.  Apparently, the control-logic overhead in
CopyGetData() is enough to be worth avoiding for small fetches.
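
The server-side file case that benefits is simply (path and table name
illustrative):

    COPY mytab FROM '/tmp/mytab.dat' WITH (FORMAT binary);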

Bharath Rupireddy and Amit Langote, reviewed by Vignesh C,
cosmetic tweaks by me

Discussion: https://postgr.es/m/CALj2ACU5Bz06HWLwqSzNMN=Gupoj6Rcn_QVC+k070V4em9wu=A@mail.gmail.com
2020-07-25 16:34:35 -04:00
Fujii Masao
d05b172a76 Add generic_plans and custom_plans fields into pg_prepared_statements.
There was no easy way to find how many times generic and custom plans
have been executed for a prepared statement. This commit exposes those
counts in the pg_prepared_statements view.
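
A sketch of the new columns in use (output illustrative):

    PREPARE stmt AS SELECT $1::int;
    EXECUTE stmt(1);
    SELECT name, generic_plans, custom_plans FROM pg_prepared_statements;
    --  name | generic_plans | custom_plans
    --  stmt |             0 |            1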

Author: Atsushi Torikoshi, Kyotaro Horiguchi
Reviewed-by: Tatsuro Yamada, Masahiro Ikeda, Fujii Masao
Discussion: https://postgr.es/m/CACZ0uYHZ4M=NZpofH6JuPHeX=__5xcDELF8hT8_2T+R55w4RQw@mail.gmail.com
2020-07-20 11:55:50 +09:00
Tom Lane
9de77b5453 Allow logical replication to transfer data in binary format.
This patch adds a "binary" option to CREATE/ALTER SUBSCRIPTION.
When that's set, the publisher will send data using the data type's
typsend function if any, rather than typoutput.  This is generally
faster, if slightly less robust.

As committed, we won't try to transfer user-defined array or composite
types in binary, for fear that type OIDs won't match at the subscriber.
This might be changed later, but it seems like fit material for a
follow-on patch.
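
A sketch of enabling the option (connection string and object names
illustrative):

    CREATE SUBSCRIPTION mysub
        CONNECTION 'host=publisher dbname=postgres'
        PUBLICATION mypub
        WITH (binary = true);
    -- or, for an existing subscription:
    ALTER SUBSCRIPTION mysub SET (binary = true);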

Dave Cramer, reviewed by Daniel Gustafsson, Petr Jelinek, and others;
adjusted some by me

Discussion: https://postgr.es/m/CADK3HH+R3xMn=8t3Ct+uD+qJ1KD=Hbif5NFMJ+d5DkoCzp6Vgw@mail.gmail.com
2020-07-18 12:44:51 -04:00
Michael Paquier
2a10fdc430 Eliminate cache lookup errors in SQL functions for object addresses
When using the following functions, users could see various errors of
the type "cache lookup failed for OID XXX", raised with elog(), which
can only be used for internal errors:
* pg_describe_object()
* pg_identify_object()
* pg_identify_object_as_address()

The set of APIs managing object addresses for all object types is made
smarter by gaining a new argument "missing_ok" that allows any caller to
control if an error is raised or not on an undefined object.  The SQL
functions listed above are changed to handle the case where an object is
missing.

Regression tests are added for all object types for the cases where
these are undefined.  Before this commit, these cases failed with cache
lookup errors, and now they basically return NULL (minus the name of the
object type requested).
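
For example, assuming OID 0 identifies no relation:

    SELECT pg_describe_object('pg_class'::regclass, 0, 0);
    -- previously: an internal "cache lookup failed" error
    -- now: returns NULL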

Author: Michael Paquier
Reviewed-by: Aleksander Alekseev, Dmitry Dolgov, Daniel Gustafsson,
Álvaro Herrera, Kyotaro Horiguchi
Discussion: https://postgr.es/m/CAB7nPqSZxrSmdHK-rny7z8mi=EAFXJ5J-0RbzDw6aus=wB5azQ@mail.gmail.com
2020-07-15 09:03:10 +09:00
David Rowley
101f903e51 Add comment to explain an unused function parameter
The removal of the unused 'miinfo' parameter has been raised a couple of
times now.  It was decided in the 2nd discussion below that we're going to
leave it alone.  It seems like it might be useful to add a comment to mention
this fact so that nobody wastes any time in the future proposing its
removal again.

Discussion: https://postgr.es/m/CAApHDvpCf-qR5HC1rXskUM4ToV+3YDb4-n1meY=vpAHsRS_1PA@mail.gmail.com
Discussion: https://postgr.es/m/CAE9k0P%3DFvcDswnSVtRpSyZMpcAWC%3DGp%3DifZ0HdfPaRQ%3D__LBtw%40mail.gmail.com
2020-07-14 17:29:52 +12:00
David Rowley
f1fcf2d3b2 Fix timing issue with ALTER TABLE's validate constraint
An ALTER TABLE to validate a foreign key in which another subcommand
already caused a pending table rewrite could fail due to ALTER TABLE
attempting to validate the foreign key before the actual table rewrite
takes place.  This situation could result in an error such as:

ERROR:  could not read block 0 in file "base/nnnnn/nnnnn": read only 0 of 8192 bytes

The failure here was due to the SPI call which validates the foreign key
trying to access an index which is yet to be rebuilt.
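
A sketch of the failing pattern (names illustrative):

    CREATE TABLE ref (a int PRIMARY KEY);
    CREATE TABLE t (a int, b int);
    ALTER TABLE t ADD CONSTRAINT t_a_fkey
        FOREIGN KEY (a) REFERENCES ref NOT VALID;
    -- a rewriting subcommand plus validation in a single command:
    ALTER TABLE t ALTER COLUMN b TYPE bigint,
        VALIDATE CONSTRAINT t_a_fkey;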

Similarly, we also incorrectly tried to validate CHECK constraints before
the heap had been rewritten.

The fix for both is to delay constraint validation until phase 3, after
the table has been rewritten.  For CHECK constraints this means a slight
behavioral change.  Previously ALTER TABLE VALIDATE CONSTRAINT on
inheritance tables would be validated from the bottom up.  This was
different from the order of evaluation when a new CHECK constraint was
added.  The changes made here align the VALIDATE CONSTRAINT evaluation
order for inheritance tables to be the same as ADD CONSTRAINT, which is
generally top-down.

Reported-by: Nazli Ugur Koyluoglu, using SQLancer
Discussion: https://postgr.es/m/CAApHDvp%3DZXv8wiRyk_0rWr00skhGkt8vXDrHJYXRMft3TjkxCA%40mail.gmail.com
Backpatch-through: 9.5 (all supported versions)
2020-07-14 16:55:35 +12:00
Tom Lane
cd22d3cdb9 Avoid useless buffer allocations during binary COPY FROM.
The raw_buf and line_buf buffers aren't used when reading binary format,
so skip allocating them.  raw_buf is 64K so that seems like a worthwhile
savings.  An unused line_buf only wastes 1K, but as long as we're checking
it's free to avoid allocating that too.

Bharath Rupireddy, tweaked a bit by me

Discussion: https://postgr.es/m/CALj2ACXcCKaGPY0whowqrJ4OPJvDnTssgpGCzvuFQu5z0CXb-g@mail.gmail.com
2020-07-11 14:21:28 -04:00
Michael Paquier
cc35d8933a Rename field "relkind" to "objtype" for CTAS and ALTER TABLE nodes
"relkind" normally refers to the char field from pg_class.  However, in
the parse nodes AlterTableStmt and CreateTableAsStmt, "relkind" was used
for a field of type enum ObjectType, that could refer to other object
types than those possible for a relkind.  Such fields being usually
named "objtype", switch the name in both structures to make things more
consistent.  Note that this led to some confusion in functions that
also operate on a RangeTableEntry object, which also has a field named
"relkind".

This naming goes back to commit 09d4e96, where only OBJECT_TABLE and
OBJECT_INDEX were used.  This was later extended by e440e12 to also use
OBJECT_TYPE, which is not really a relation kind.

Author: Mark Dilger
Reviewed-by: Daniel Gustafsson, Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/609181AE-E399-47C7-9221-856E0F96BF93@enterprisedb.com
2020-07-11 13:32:28 +09:00
David Rowley
2b7dbc0db6 Fix whitespace in HashAgg EXPLAIN ANALYZE
The Sort node does not put a space between the number of kilobytes and
the "kB" of memory or disk space used, but HashAgg does.  Here we align
HashAgg to do the same as Sort.  Sort has been displaying this
information for longer than HashAgg, so it makes sense to align HashAgg
to Sort rather than the other way around.

Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20200708163021.GW4107@telsasoft.com
Backpatch-through: 13, where the hashagg started showing these details
2020-07-09 10:06:24 +12:00
Andres Freund
a9a4a7ad56 code: replace most remaining uses of 'master'.
Author: Andres Freund
Reviewed-By: David Steele
Discussion: https://postgr.es/m/20200615182235.x7lch5n6kcjq4aue@alap3.anarazel.de
2020-07-08 13:24:35 -07:00
Andres Freund
5e7bbb5286 code: replace 'master' with 'primary' where appropriate.
Also changed "in the primary" to "on the primary", and added a few
"the" before "primary".

Author: Andres Freund
Reviewed-By: David Steele
Discussion: https://postgr.es/m/20200615182235.x7lch5n6kcjq4aue@alap3.anarazel.de
2020-07-08 12:57:23 -07:00
Tom Lane
f3faf35f37 Don't create pg_type entries for sequences or toast tables.
Commit f7f70d5e2 left one inconsistency behind: we're still creating
pg_type entries for the composite types of sequences and toast tables,
but not arrays over those composites.  But there seems precious little
reason to have named composite types for toast tables, and not much more
to have them for sequences (especially given the thought that sequences
may someday not be standalone relations at all).

So, let's close that inconsistency by removing these composite types,
rather than adding arrays for them.  This buys back a little bit of
the initial pg_type bloat added by the previous patch, and could be
a significant savings in a large database with many toast tables.

Aside from a small logic rearrangement in heap_create_with_catalog,
this patch mostly needs to clean up some places that were assuming that
pg_class.reltype always has a valid value.  Those are really pre-existing
bugs, given that it's documented otherwise; notably, the plpgsql changes
fix code that gives "cache lookup failed for type 0" on indexes today.
But none of these seem interesting enough to back-patch.

Also, remove the pg_dump/pg_upgrade infrastructure for propagating
a toast table's pg_type OID into the new database, since we no longer
need that.

Discussion: https://postgr.es/m/761F1389-C6A8-4C15-80CE-950C961F5341@gmail.com
2020-07-07 15:43:22 -04:00
Tom Lane
f7b5988cc0 Fix temporary tablespaces for shared filesets some more.
Commit ecd9e9f0b fixed the problem in the wrong place, causing unwanted
side-effects on the behavior of GetNextTempTableSpace().  Instead,
let's make SharedFileSetInit() responsible for subbing in the value
of MyDatabaseTableSpace when the default tablespace is called for.

The convention about what is in the tempTableSpaces[] array is
evidently insufficiently documented, so try to improve that.

It also looks like SharedFileSetInit() is doing the wrong thing in the
case where temp_tablespaces is empty.  It was hard-wiring use of the
pg_default tablespace, but it seems like using MyDatabaseTableSpace
is more consistent with what happens for other temp files.

Back-patch the reversion of PrepareTempTablespaces()'s behavior to
9.5, as ecd9e9f0b was.  The changes in SharedFileSetInit() go back
to v11 where that was introduced.  (Note there is net zero code change
before v11 from these two patch sets, so nothing to release-note.)

Magnus Hagander and Tom Lane

Discussion: https://postgr.es/m/CABUevExg5YEsOvqMxrjoNvb3ApVyH+9jggWGKwTDFyFCVWczGQ@mail.gmail.com
2020-07-03 17:01:34 -04:00
Magnus Hagander
ecd9e9f0bc Fix temporary tablespaces for shared filesets
A likely copy/paste error in 98e8b48053, from back in 2004, would
cause the temp tablespace to be reset to InvalidOid if temp_tablespaces
was set to the same value as the primary tablespace in the database.
This would cause shared filesets (such as for parallel hash joins)
to ignore them, putting the temporary files in the default tablespace
instead of the configured one. The bug is in the old code, but it
appears to have been exposed only once we had shared filesets.

Reviewed-By: Daniel Gustafsson
Discussion: https://postgr.es/m/CABUevExg5YEsOvqMxrjoNvb3ApVyH+9jggWGKwTDFyFCVWczGQ@mail.gmail.com
Backpatch-through: 9.5
2020-07-03 15:09:06 +02:00
Michael Paquier
684b4f29b7 Refactor creation of normal dependency records when creating extension
When creating an extension, the same type of dependency is used when
registering a dependency on a schema and on required extensions.  This
improves the code so that those dependencies are not recorded one-by-one,
but grouped together.  Note that this has the side effect of removing
duplicate dependency entries, even if that should not happen in practice,
as extensions listed as required in a control file should be listed only
once.

Extracted from a larger patch by the same author.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/20200629065535.GA183079@paquier.xyz
2020-07-01 11:12:33 +09:00
David Rowley
40efbf8706 Further adjustments to Hashagg EXPLAIN ANALYZE output
The "Disk Usage" and "HashAgg Batches" properties in the EXPLAIN ANALYZE
output for HashAgg were previously only shown if the number of batches
was greater than 0.  Here we change this so that these properties are
always shown for EXPLAIN ANALYZE formats other than "text".  The idea here
is that since the HashAgg could have spilled to disk if there had been
more data or groups to aggregate, it's relevant that we're clear in
the EXPLAIN ANALYZE output when no spilling occurred in this particular
execution of the given plan.

For the "text" EXPLAIN format, we still hide these properties when no
spilling occurs.  This EXPLAIN format is designed to be easy for humans
to read.  To maintain the readability we have a higher threshold for which
properties we display for this format.

Discussion: https://postgr.es/m/CAApHDvo_dmNozQQTmN-2jGp1vT%3Ddxx7Q0vd%2BMvD1cGpv2HU%3DSg%40mail.gmail.com
Backpatch-through: 13, where the hashagg spilling code was added.
2020-07-01 12:15:59 +12:00
Tom Lane
c410af098c Mop up some no-longer-necessary hacks around printf %.*s format.
Commit 54cd4f045 added some kluges to work around an old glibc bug,
namely that %.*s could misbehave if glibc thought any characters in
the supplied string were incorrectly encoded.  Now that we use our
own snprintf.c implementation, we need not worry about that bug (even
if it still exists in the wild).  Revert a couple of particularly
ugly hacks, and remove or improve assorted comments.

Note that there can still be encoding-related hazards here: blindly
clipping at a fixed length risks producing wrongly-encoded output
if the clip splits a multibyte character.  However, code that's
doing correct multibyte-aware clipping doesn't really need a comment
about that, while code that isn't needs an explanation why not,
rather than a red-herring comment about an obsolete bug.

Discussion: https://postgr.es/m/279428.1593373684@sss.pgh.pa.us
2020-06-29 17:12:38 -04:00
Tom Lane
63d2ac23b0 Undo double-quoting of index names in non-text EXPLAIN output formats.
explain_get_index_name() applied quote_identifier() to the index name.
This is fine for text output, but the non-text output formats all have
their own quoting conventions and would much rather start from the
actual index name.  For example in JSON you'd get something like

       "Index Name": "\"My Index\"",

which is surely not desirable, especially when the same does not
happen for table names.  Hence, move the responsibility for applying
quoting out to the callers, where it can go into already-existing
special code paths for text format.

This changes the API spec for users of explain_get_index_name_hook:
before, they were supposed to apply quote_identifier() if necessary,
now they should not.  Research suggests that the only publicly
available user of the hook is hypopg, and it actually forgot to
apply quoting anyway, so it's fine.  (In any case, there's no
behavioral change for the output of a hook as seen in non-text
EXPLAIN formats, so this won't break any case that programs should
be relying on.)

Digging in the commit logs, it appears that quoting was included in
explain_get_index_name's duties when commit 604ffd280 invented it;
and that was fine at the time because we only had text output format.
This should have been rethought when non-text formats were invented,
but it wasn't.

This is a fairly clear bug for users of non-text EXPLAIN formats,
so back-patch to all supported branches.

Per bug #16502 from Maciek Sakrejda.  Patch by me (based on
investigation by Euler Taveira); thanks to Julien Rouhaud for review.

Discussion: https://postgr.es/m/16502-57bd1c9f913ed1d1@postgresql.org
2020-06-22 11:46:41 -04:00
Amit Kapila
74b4d78e03 Remove unused function parameter in CopyReadBinaryAttribute.
The function parameter column_no is not used in CopyReadBinaryAttribute,
so it can be removed.

Commit 0e319c7ad7 removed the usage of column_no parameter in function
CopyReadBinaryAttribute but forgot to remove the parameter.

Reported-by: Vignesh C
Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm1TYSNTfqx_jfz9_mwEZ2Er=dZnu++duXpC1uQo1cG=WA@mail.gmail.com
2020-06-20 09:18:57 +05:30
David Rowley
9bdb300ded Fix EXPLAIN ANALYZE for parallel HashAgg plans
Since 1f39bce02, HashAgg nodes have had the ability to spill to disk when
memory consumption exceeds work_mem. That commit added new properties to
EXPLAIN ANALYZE to show the maximum memory usage and disk usage, however,
it didn't quite go as far as showing that information for parallel
workers.  Since workers may have experienced something very different from
the main process, we should show this information per worker, as is done
in Sort.

Reviewed-by: Justin Pryzby
Reviewed-by: Jeff Davis
Discussion: https://postgr.es/m/CAApHDvpEKbfZa18mM1TD7qV6PG+w97pwCWq5tVD0dX7e11gRJw@mail.gmail.com
Backpatch-through: 13, where the hashagg spilling code was added.
2020-06-19 17:24:27 +12:00
Michael Paquier
7a3543c2ea Fix some comments referring to past features
Timestamp can only be an int64 since b9d092c, and support for WITH OIDS
has been removed as of 578b229.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20200612023709.GC14879@telsasoft.com
2020-06-15 21:18:14 +09:00
Peter Eisentraut
8f5b596744 Refactor AlterExtensionContentsStmt grammar
Make use of the general object support already used by COMMENT, DROP,
and SECURITY LABEL.

Discussion: https://www.postgresql.org/message-id/flat/163c00a5-f634-ca52-fc7c-0e53deda8735%402ndquadrant.com
2020-06-13 09:19:30 +02:00
Peter Eisentraut
a332b366d4 Grammar object type refactoring
Unify the grammar of COMMENT, DROP, and SECURITY LABEL further.  They
all effectively just take an object address for later processing, so
we can make the grammar more generalized.  Some extra checking about
which object types are supported can be done later in the statement
execution.

Discussion: https://www.postgresql.org/message-id/flat/163c00a5-f634-ca52-fc7c-0e53deda8735%402ndquadrant.com
2020-06-13 09:19:30 +02:00
Tom Lane
2f48ede080 Avoid using a cursor in plpgsql's RETURN QUERY statement.
plpgsql has always executed the query given in a RETURN QUERY command
by opening it as a cursor and then fetching a few rows at a time,
which it turns around and dumps into the function's result tuplestore.
The point of this was to keep from blowing out memory with an oversized
SPITupleTable result (note that while a tuplestore can spill tuples
to disk, SPITupleTable cannot).  However, it's rather inefficient, both
because of extra data copying and because of executor entry/exit
overhead.  In recent versions, a new performance problem has emerged:
use of a cursor prevents use of a parallel plan for the executed query.
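
After this change, the query in a function such as (names illustrative)

    CREATE FUNCTION big_rows() RETURNS SETOF big_table
    LANGUAGE plpgsql AS $$
    BEGIN
        RETURN QUERY SELECT * FROM big_table WHERE flag;
    END;
    $$;

can be executed with a parallel plan, since no cursor is involved.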

We can improve matters by skipping use of a cursor and having the
executor push result tuples directly into the function's result
tuplestore.  However, a moderate amount of new infrastructure is needed
to make that idea work:

* We can use the existing tstoreReceiver.c DestReceiver code to funnel
executor output to the tuplestore, but it has to be extended to support
plpgsql's requirement for possibly applying a tuple conversion map.

* SPI needs to be extended to allow use of a caller-supplied
DestReceiver instead of its usual receiver that puts tuples into
a SPITupleTable.  Two new API calls are needed to handle both the
RETURN QUERY and RETURN QUERY EXECUTE cases.

I also felt that I didn't want these new API calls to use the legacy
method of specifying query parameter values with "char" null flags
(the old ' '/'n' convention); rather they should accept ParamListInfo
objects containing the parameter type and value info.  This required
a bit of additional new infrastructure since we didn't yet have any
parse analysis callback that would interpret $N parameter symbols
according to type data supplied in a ParamListInfo.  There seems to be
no harm in letting makeParamList install that callback by default,
rather than leaving a new ParamListInfo's parserSetup hook as NULL.
(Indeed, as of HEAD, I couldn't find anyplace that was using the
parserSetup field at all; plpgsql was using parserSetupArg for its
own purposes, but parserSetup seemed to be write-only.)

We can actually get plpgsql out of the business of using legacy null
flags altogether, and using ParamListInfo instead of its ad-hoc
PreparedParamsData structure; but this requires inventing one more
SPI API call that can replace SPI_cursor_open_with_args.  That seems
worth doing, though.

SPI_execute_with_args and SPI_cursor_open_with_args are now unused
anywhere in the core PG distribution.  Perhaps someday we could
deprecate/remove them.  But cleaning up the crufty bits of the SPI
API is a task for a different patch.

Per bug #16040 from Jeremy Smith.  This is unfortunately too invasive to
consider back-patching.  Patch by me; thanks to Hamid Akhtar for review.

Discussion: https://postgr.es/m/16040-eaacad11fecfb198@postgresql.org
2020-06-12 12:14:32 -04:00