Commit Graph

23408 Commits

Author SHA1 Message Date
Tom Lane 407b50f2d4 Store GUC data in a memory context, instead of using malloc().
The only real argument for using malloc directly was that we needed
the ability to not throw error on OOM; but mcxt.c grew that feature
awhile ago.

Keeping the data in a memory context improves accountability and
debuggability --- for example, without this it's almost impossible
to detect memory leaks in the GUC code with anything less costly
than valgrind.  Moreover, the next patch in this series will add a
hash table for GUC lookup, and it'd be pretty silly to be using
palloc-dependent hash facilities alongside malloc'd storage of the
underlying data.

This is a bit invasive though, in particular causing an API break
for GUC check hooks that want to modify the GUC's value or use an
"extra" data structure.  They must now use guc_malloc() and
guc_free() instead of malloc() and free().  Failure to change
affected code will result in assertion failures or worse; but
thanks to recent effort in the mcxt infrastructure, it shouldn't
be too hard to diagnose such oversights (at least in assert-enabled
builds).

One note is that this changes ParseLongOption() to return short-lived
palloc'd not malloc'd data.  There wasn't any caller for which the
previous definition was better.

Discussion: https://postgr.es/m/2982579.1662416866@sss.pgh.pa.us
2022-10-14 12:10:48 -04:00
Tom Lane 9c911ec065 Make some minor improvements in memory-context infrastructure.
We lack a version of repalloc() that supports MCXT_ALLOC_NO_OOM
semantics, so invent repalloc_extended() with the usual set of
flags.  repalloc_huge() becomes a legacy wrapper for that.

Also, fix dynahash.c so that it can support HASH_ENTER_NULL
requests when using the default palloc-based allocator.
The only reason it didn't do that already was the lack of the
MCXT_ALLOC_NO_OOM option when that code was written, ages ago.

While here, simplify a few overcomplicated tests in mcxt.c.

Discussion: https://postgr.es/m/2982579.1662416866@sss.pgh.pa.us
2022-10-14 11:55:56 -04:00
Peter Eisentraut 1b11561cc1 Standardize format for printing PIDs
Most code prints PIDs as %d, but some code tried to print them as long
or unsigned long.  While this is in theory allowed, the fact that PIDs
fit into int is deeply baked into all PostgreSQL code, so these random
deviations don't accomplish anything except confusion.

Note that we still need casts from pid_t to int, because on 64-bit
MinGW, pid_t is long long int.  (But per above, actually supporting
that range in PostgreSQL code would be major surgery and probably not
useful.)

Discussion: https://www.postgresql.org/message-id/289c2e45-c7d9-5ce4-7eff-a9e2a33e1580@enterprisedb.com
2022-10-14 08:38:53 +02:00
David Rowley 39b8c293fc Fix incorrect comment regarding command completion tags
The comment talked about some Asserts which did not exist and also a
variable name which seems to have long since disappeared.

Rewrite the comment in a way that will hopefully stand the test of
time and inform people why we always write "INSERT 0 <nrows>" instead of
"INSERT <nrows>" in the command completion tag for INSERT.

Reviewed-by: Mark Dilger
Discussion: https://postgr.es/m/CAApHDvpiUg09AvvGAVopNAKemA9z-kCmt7Fi6HKauc32bKzx4w@mail.gmail.com
2022-10-14 14:32:00 +13:00
Etsuro Fujita 97da48246d Allow batch insertion during COPY into a foreign table.
Commit 3d956d956 allowed the COPY, but it's done by inserting individual
rows to the foreign table, so it can be inefficient due to the overhead
caused by each round-trip to the foreign server.  To improve performance
of the COPY in such a case, this patch allows batch insertion, by
extending the multi-insert machinery in CopyFrom() to the foreign-table
case so that we insert multiple rows to the foreign table at once using
the FDW callback routine added by commit b663a4136.  This patch also
allows this for postgres_fdw.  It is enabled by the "batch_size" option
added by commit b663a4136, which is disabled by default.

When doing batch insertion, we update progress of the COPY command after
performing the FDW callback routine, to count rows not suppressed by the
FDW as well as a BEFORE ROW INSERT trigger.  For consistency, this patch
changes the timing of updating it for plain tables: previously, we
updated it immediately after adding each row to the multi-insert buffer,
but we do so only after writing the rows stored in the buffer out to the
table using table_multi_insert(), which I think would be consistent even
with non-batching mode, because in that mode we update it after writing
each row out to the table using table_tuple_insert().

Andrey Lepikhov, heavily revised by me, with review from Ian Barwick,
Andrey Lepikhov, and Zhihong Yu.

Discussion: https://postgr.es/m/bc489202-9855-7550-d64c-ad2d83c24867%40postgrespro.ru
2022-10-13 18:45:00 +09:00
Amit Kapila 5263c6b095 Improve the WARNING message for CREATE SUBSCRIPTION.
Author: Peter Smith
Reviewed-By: Alvaro Herrera, Tom Lane, Amit Kapila
Discussion: https://postgr.es/m/CAHut+PvqdqOanheWSHDyhQiF+Z-7w=-+k4U+bwbT=b6YQ_hrXQ@mail.gmail.com
2022-10-13 06:09:43 +05:30
Michael Paquier 56b662523f Fix ordering issue with WAL operations in GIN fast insert path
Contrary to what is documented in src/backend/access/transam/README,
ginHeapTupleFastInsert() had a few ordering issues with the way it does
its WAL operations when inserting items in its fast path.

First, when using a separate list, XLogBeginInsert() was being always
called before START_CRIT_SECTION(), and in this case a second thing was
wrong when merging lists, as an exclusive lock was taken on the tail
page *before* calling XLogBeginInsert().  Finally, when inserting items
into a tail page, the order of XLogBeginInsert() and
START_CRIT_SECTION() was reversed.  This commit addresses all these
issues by moving the calls of XLogBeginInsert() after all the pages
logged are locked and pinned, within a critical section.

Author:  Matthias van de Meent, Zhang Mingli
Discussion: https://postgr.es/m/CAEze2WhL8uLMqynnnCu1LAPwxD5RKEo0nHV+eXGg_N6ELU88HQ@mail.gmail.com
2022-10-13 09:31:57 +09:00
Michael Paquier 63585b1ebd doc: Fix description of replication command CREATE_REPLICATION_SLOT
The output plugin name is a mandatory option when creating a logical
slot, but the grammar documented was not described as such.  While on
it, fix two comments in repl_gram.y to show that TEMPORARY is an
optional grammar choice.

Author: Ayaki Tachikake
Discussion: https://postgr.es/m/OSAPR01MB2852607B2329FFA27834105AF1229@OSAPR01MB2852.jpnprd01.prod.outlook.com
Backpatch-through: 15
2022-10-13 08:53:42 +09:00
Michael Paquier 4574eb9d38 Fix shadow variable in postgres.c
-Wshadow=compatible-local is added by default since 0fe954c, and this
warning was detected under -DWRITE_READ_PARSE_PLAN_TREES.

Reviewed-by: David Rowley
Discussion: https://postgr.es/m/Y0Ya5SH0QiaO9kKG@paquier.xyz
2022-10-12 13:42:30 +09:00
Michael Paquier a1176c67c4 Simplify some maths in xlogreader.c
An LSN was calculated from a segment number, a segment size and a
position offset, matching exactly the LSN given by the caller of
XLogReaderValidatePageHeader().  This change removes the extra LSN
calculation, relying only on the LSN given by the function caller
instead.

Author: Bharath Rupireddy
Reviewed-by: Richard Guo, Álvaro Herrera, Kyotaro Horiguchi
Discussion: https://postgr.es/m/CALj2ACXuh4Ms9j9sxMYdtHEe=5sFcyrs-GAHyADu_A_G71kZTg@mail.gmail.com
2022-10-12 09:59:36 +09:00
Tom Lane 18a4a620e2 Harden pmsignal.c against clobbered shared memory.
The postmaster is not supposed to do anything that depends
fundamentally on shared memory contents, because that creates
the risk that a backend crash that trashes shared memory will
take the postmaster down with it, preventing automatic recovery.
In commit 969d7cd43 I lost sight of this principle and coded
AssignPostmasterChildSlot() in such a way that it could fail
or even crash if the shared PMSignalState structure became
corrupted.  Remarkably, we've not seen field reports of such
crashes; but I managed to induce one while testing the recent
changes around palloc chunk headers.

To fix, make a semi-duplicative state array inside the postmaster
so that we need consult only local state while choosing a "child
slot" for a new backend.  Ensure that other postmaster-executed
routines in pmsignal.c don't have critical dependencies on the
shared state, either.  Corruption of PMSignalState might now
lead ReleasePostmasterChildSlot() to conclude that backend X
failed, when actually backend Y was the one that trashed things.
But that doesn't matter, because we'll force a cluster-wide reset
regardless.

Back-patch to all supported branches, since this is an old bug.

Discussion: https://postgr.es/m/3436789.1665187055@sss.pgh.pa.us
2022-10-11 18:54:31 -04:00
Tom Lane b8f2687fdc Yet further fixes for multi-row VALUES lists for updatable views.
DEFAULT markers appearing in an INSERT on an updatable view
could be mis-processed if they were in a multi-row VALUES clause.
This would lead to strange errors such as "cache lookup failed
for type NNNN", or in older branches even to crashes.

The cause is that commit 41531e42d tried to re-use rewriteValuesRTE()
to remove any SetToDefault nodes (that hadn't previously been replaced
by the view's own default values) appearing in "product" queries,
that is DO ALSO queries.  That's fundamentally wrong because the
DO ALSO queries might not even be INSERTs; and even if they are,
their targetlists don't necessarily match the view's column list,
so that almost all the logic in rewriteValuesRTE() is inapplicable.

What we want is a narrow focus on replacing any such nodes with NULL
constants.  (That is, in this context we are interpreting the defaults
as being strictly those of the view itself; and we already replaced
any that aren't NULL.)  We could add still more !force_nulls tests
to further lobotomize rewriteValuesRTE(); but it seems cleaner to
split out this case to a new function, restoring rewriteValuesRTE()
to the charter it had before.

Per bug #17633 from jiye_sw.  Patch by me, but thanks to
Richard Guo and Japin Li for initial investigation.
Back-patch to all supported branches, as the previous fix was.

Discussion: https://postgr.es/m/17633-98cc85e1fa91e905@postgresql.org
2022-10-11 18:24:14 -04:00
Amit Kapila 776e1c8a5d Add a common function to generate the origin name.
Make a common replication origin name formatting function to replace
multiple snprintf() expressions. This also includes logic previously done
by ReplicationOriginNameForTablesync().

This makes the code to generate the origin name consistent among apply
worker and tablesync worker.

Author: Peter Smith
Reviewed-By: Aleksander Alekseev
Discussion: https://postgr.es/m/CAHut%2BPsa8hhfSE6ozUK-ih7GkQziAVAf4f3bqiXEj2nQiu-43g%40mail.gmail.com
2022-10-11 10:37:52 +05:30
Michael Paquier 9fcdf2c787 Add support for COPY TO callback functions
This is useful as a way for extensions to process COPY TO rows in the
way they see fit (say auditing, analytics, backend, etc.) without the
need to invoke an external process running as the OS user running the
backend through PROGRAM that requires superuser rights.  COPY FROM
already provides a similar callback for logical replication.  For COPY
TO, the callback is triggered when we are ready to send a row in
CopySendEndOfRow(), which is the same code path as when sending a row
to a frontend or a pipe/file.

A small test module, test_copy_callbacks, is added to provide some
coverage for this facility.

Author: Bilva Sanaba, Nathan Bossart
Discussion: https://postgr.es/m/253C21D1-FCEB-41D9-A2AF-E6517015B7D7@amazon.com
2022-10-11 11:45:52 +09:00
Tom Lane 0e87dfe464 Harden memory context allocators against bogus chunk pointers.
Before commit c6e0fe1f2, functions such as AllocSetFree could pretty
safely presume that they were given a valid chunk pointer for their
own type of context, because the indirect call through a memory
context object and method struct would be very unlikely to work
otherwise.  But now, if pfree() is mistakenly invoked on a pointer
to garbage, we have three chances in eight of ending up at one of
these functions.  That means we need to take extra measures to
verify that we are looking at what we're supposed to be looking at,
especially in debug builds.

Hence, add code to verify that the chunk's back-link to a block header
leads to a memory context object that satisfies the right sort of
IsA() check.  This is still a bit weaker than what we did before,
but for the moment assume that an IsA() check is sufficient.

As a compromise between speed and safety, implement these checks
as Asserts when dealing with small chunks but plain test-and-elogs
when dealing with large (external) chunks.  The latter case should
not be too performance-critical, but the former case probably is.
In slab.c, all chunks are small; but nonetheless use a plain test
in SlabRealloc, because that is certainly not performance-critical,
indeed we should be suspicious that it's being called in error.

In aset.c, additionally add some assertions that the "value" field
of the chunk header is within the small range allowed for freelist
indexes.  Without that, we might find ourselves trying to wipe
most of memory when CLOBBER_FREED_MEMORY is enabled, or scribbling
on a "freelist header" that's far away from the context object.

Eventually, field experience might show us that it's smarter for
these tests to be active always, but for now we'll try to get
away with just having them as assertions.

While at it, also be more uniform about asserting that context
objects passed as parameters are of the type we expect.  Some
places missed that altogether, and slab.c was for no very good
reason doing it differently from the other allocators.

Discussion: https://postgr.es/m/3578387.1665244345@sss.pgh.pa.us
2022-10-10 18:45:34 -04:00
Tom Lane 235eb4db98 Simplify our Assert infrastructure a little.
Remove the Trap and TrapMacro macros, which were nearly unused
and confusingly had the opposite condition polarity from the
otherwise-functionally-equivalent Assert macros.

Having done that, it's very hard to justify carrying the errorType
argument of ExceptionalCondition, so drop that too, and just
let it assume everything's an Assert.  This saves about 64K
of code space as of current HEAD.

Discussion: https://postgr.es/m/3928703.1665345117@sss.pgh.pa.us
2022-10-10 15:16:56 -04:00
John Naylor 6291b2546c Remove unnecessary semicolons after goto labels
According to the C standard, a label must followed by a statement.
If there was ever a time we needed an empty statement here, it was
a long time ago.

Japin Li

Reviewed by Julien Rouhaud
Discussion: https://www.postgresql.org/message-id/MEYP282MB16690F40189A4F060B41D56DB65E9%40MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
2022-10-10 15:08:38 +07:00
Peter Eisentraut 357cfefb09 Use C library functions instead of Abs() for int64
Instead of Abs() for int64, use the C standard functions labs() or
llabs() as appropriate.  Define a small wrapper around them that
matches our definition of int64.  (labs() is C90, llabs() is C99.)

Reviewed-by: Zhang Mingli <zmlpostgres@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/4beb42b5-216b-bce8-d452-d924d5794c63%40enterprisedb.com
2022-10-10 09:01:17 +02:00
Andres Freund 06dbd619bf pgstat: Prevent stats reset from corrupting slotname by removing slotname
Previously PgStat_StatReplSlotEntry contained the slotname, which was mainly
used when writing out the stats during shutdown, to identify the slot in the
serialized data (at runtime the index in ReplicationSlotCtl->replication_slots
is used, but that can change during a restart). Unfortunately the slotname was
overwritten when the slot's stats were reset.

That turned out to only cause "real" problems if the slot was active during
the reset, triggering an assertion failure at the next
pgstat_report_replslot(). In other paths the stats were re-initialized during
pgstat_acquire_replslot().

Fix this by removing slotname from PgStat_StatReplSlotEntry. Instead we can
get the slot's name from the slot itself. Besides fixing a bug, this also is
architecturally cleaner (a name is not really statistics). This is safe
because stats, for a slot removed while shut down, will not be restored at
startup.

In 15 the slotname is not removed, but renamed, to avoid changing the stats
format. In master, bump PGSTAT_FILE_FORMAT_ID.

This commit does not contain a test for the fix. I think this can only be
tested by a tap test starting pg_recvlogical in the background and checking
pg_recvlogical's output. That type of test is notoriously hard to be reliable,
so committing it shortly before the release is wrapped seems like a bad idea.

Reported-by: Jaime Casanova <jcasanov@systemguards.com.ec>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/YxfagaTXUNa9ggLb@ahch-to
Backpatch: 15-, where the bug was introduced in 5891c7a8ed
2022-10-08 09:43:29 -07:00
Peter Eisentraut e4c61bedcb Use fabsf() instead of Abs() or fabs() where appropriate
This function is new in C99.

Reviewed-by: Zhang Mingli <zmlpostgres@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/4beb42b5-216b-bce8-d452-d924d5794c63%40enterprisedb.com
2022-10-08 13:43:26 +02:00
Alvaro Herrera 614a406b4f
Fix self-referencing foreign keys with partitioned tables
There are a number of bugs in this area.  Two of them are fixed here,
namely:
1. get_relation_idx_constraint_oid does not restrict the type of
   constraint that's returned, so with sufficient bad luck it can
   return the OID of a foreign key constraint.  This has the effect that
   a primary key in a partition can end up as a child of a foreign key,
   which makes no sense (it needs to be the child of the equivalent
   primary key.)
   Change the API contract so that only index-backed constraints are
   returned, mimicking get_constraint_index().

2. Both CloneFkReferenced and CloneFkReferencing clone a
   self-referencing foreign key, so the partition ends up with
   a duplicate foreign key.  Change the former function to ignore such
   constraints.

Add some tests to verify that things are better now.  (However, these
new tests show some additional misbehavior that will be fixed later --
namely that there's a constraint marked NOT VALID.)

Backpatch to 12, where these constraints are possible at all.

Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://postgr.es/m/20220603154232.1715b14c@karst
2022-10-07 19:37:48 +02:00
Peter Eisentraut f14aad5169 Remove unnecessary uses of Abs()
Use C standard abs() or fabs() instead.

Reviewed-by: Zhang Mingli <zmlpostgres@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/4beb42b5-216b-bce8-d452-d924d5794c63%40enterprisedb.com
2022-10-07 13:29:33 +02:00
Tom Lane 80ef926758 Improve our ability to detect bogus pointers passed to pfree et al.
Commit c6e0fe1f2 was a shade too trusting that any pointer passed
to pfree, repalloc, etc will point at a valid chunk.  Notably,
passing a pointer that was actually obtained from malloc tended
to result in obscure assertion failures, if not worse.  (On FreeBSD
I've seen such mistakes take down the entire cluster, seemingly as
a result of clobbering shared memory.)

To improve matters, extend the mcxt_methods[] array so that it
has entries for every possible MemoryContextMethodID bit-pattern,
with the currently unassigned ID codes pointing to error-reporting
functions.  Then, fiddle with the ID assignments so that patterns
likely to be associated with bad pointers aren't valid ID codes.
In particular, we should avoid assigning bit patterns 000 (zeroed
memory) and 111 (wipe_mem'd memory).

It turns out that on glibc (Linux), malloc uses chunk headers that
have flag bits in the same place we keep MemoryContextMethodID,
and that the bit patterns 000, 001, 010 are the only ones we'll
see as long as the backend isn't threaded.  So we can have very
robust detection of pfree'ing a malloc-assigned block on that
platform, at least so long as we can refrain from using up those
ID codes.  On other platforms, we don't have such a good guarantee,
but keeping 000 reserved will be enough to catch many such cases.

While here, make GetMemoryChunkMethodID() local to mcxt.c, as there
seems no need for it to be exposed even in memutils_internal.h.

Patch by me, with suggestions from Andres Freund and David Rowley.

Discussion: https://postgr.es/m/2910981.1665080361@sss.pgh.pa.us
2022-10-06 21:24:00 -04:00
Andres Freund e5555657ba meson: Add support for building with precompiled headers
This substantially speeds up building for windows, due to the vast amount of
headers included via windows.h. A cross build from linux targetting mingw goes
from

994.11user 136.43system 0:31.58elapsed 3579%CPU
to
422.41user 89.05system 0:14.35elapsed 3562%CPU

The wins on windows are similar-ish (but I don't have a system at hand just
now for actual numbers). Targetting other operating systems the wins are far
smaller (tested linux, macOS, FreeBSD).

For now precompiled headers are disabled by default, it's not clear how well
they work on all platforms. E.g. on FreeBSD gcc doesn't seem to have working
support, but clang does.

When doing a full build precompiled headers are only beneficial for targets
with multiple .c files, as meson builds a separate precompiled header for each
target (so that different compilation options take effect). This commit
therefore only changes target with at least two .c files to use precompiled
headers.

Because this commit adds b_pch=false to the default_options new build
directories will have precompiled headers disabled by default, however
existing build directories will continue use the default value of b_pch, which
is true.

Note that using precompiled headers with ccache requires setting
CCACHE_SLOPPINESS=pch_defines,time_macros to get hits.

Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CA+hUKG+50eOUbN++ocDc0Qnp9Pvmou23DSXu=ZA6fepOcftKqA@mail.gmail.com
Discussion: https://postgr.es/m/c5736f70-bb6d-8d25-e35c-e3d886e4e905@enterprisedb.com
Discussion: https://postgr.es/m/20190826054000.GE7005%40paquier.xyz
2022-10-06 17:19:30 -07:00
Andres Freund e0b0142959 Create subscription stats entry at CREATE SUBSCRIPTION time
Previously, the subscription stats entry was created when the first
stats, i.e., an error on apply worker or tablesync worker,  were
reported. Therefore, the stats_reset field was not updated by
pg_stat_reset_subscription_stats() if the stats entry was not
populated yet, which was different behavior than other statistics.

This change creates the subscription stats entry and initializes it at
CREATE SUBSCRIPTION time.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_Zqd-e5imT_3-ZiQv1cfsWuy16OJTiUaCvqpq4V7GVdSg@mail.gmail.com
2022-10-06 17:17:16 -07:00
Andres Freund 0fa41648d7 meson: Fix two comments
Author: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CAEG8a3KxObc9g8NTzx1kX0Auf=J7FNiubYZXSK6G5wv5ShmP6A@mail.gmail.com
2022-10-06 13:09:37 -07:00
Tom Lane 9543eff5e0 Remove MemoryContextContains().
MemoryContextContains is no longer reliable in the wake of c6e0fe1f2,
because there's no longer very much redundancy in chunk headers.
(It wasn't *completely* reliable even before that, as there was a
chance of a false positive if you passed it something that didn't
point to an mcxt chunk at all.  But it was generally good enough.)

Hence, remove it.  There is no remaining core code that requires it.
Extensions that have been using it might be able to substitute a
test like "GetMemoryChunkContext(ptr) == context", recognizing that
this explicitly requires that the pointer point to some chunk.

Tom Lane and David Rowley

Discussion: https://postgr.es/m/1913788.1664898906@sss.pgh.pa.us
2022-10-06 13:35:31 -04:00
Tom Lane 42b746d4c9 Remove uses of MemoryContextContains in nodeAgg.c and nodeWindowAgg.c.
MemoryContextContains is no longer reliable in the wake of c6e0fe1f2,
so we need to get rid of these uses.

It appears that there's no really good reason to force the result of
an aggregate's finalfn or serialfn to be allocated in the per-tuple
context.  The only other plausible case is that the result points to
or into the aggregate's transition value, and that's fine because it
will last as long as we need it to.  (This conclusion depends on the
assumption that finalfns are not allowed to scribble on the transition
value, but we've long required that.)  So we can just drop the
MemoryContextContains plus datumCopy business, although we do need
to take care to not return a read-write pointer when the transition
value is an expanded datum.

Likewise, we don't really need to force the result of a window
function to be in the output context.  In this case, the plausible
alternative is that it's pointing into the temporary tuple slot used
by WinGetFuncArgInPartition or WinGetFuncArgInFrame (since those
functions could return such a pointer, which might become the window
function's result).  That will hold still for long enough, unless
there is another window function using the same WindowObject.
I'm content to always perform a datumCopy when there's more than one
such function.

On net, these changes should provide small speed improvements as well
as removing problematic code.

Tom Lane and David Rowley

Discussion: https://postgr.es/m/1913788.1664898906@sss.pgh.pa.us
2022-10-06 13:27:34 -04:00
Tom Lane 66c2922e76 Take care to de-duplicate entries in standby.c's table of locks.
The RecoveryLockLists data structure, which tracks all exclusive
locks that the startup process is holding on behalf of transactions
being replayed, did not have any provision for avoiding duplicate
entries for the same lock.  Maybe that was okay when the code was
first written.  However, modern practice is for checkpoints to
write fresh lists of all active exclusive locks into the WAL.
Thus, an exclusive lock that survives across multiple checkpoints
causes bloat in standbys' startup processes.  If there are a lot
of such locks this can look like a memory leak, and it's even
possible to drive the startup process into a palloc failure from
an over-length List.

To fix, use a hash table instead of simple lists to track the
locks being held.  Allowing for dynahash overhead, this requires
a little more space per lock than the old way (although it's the
same size as what we were allocating prior to c6e0fe1f2).  It's
probably a shade slower too.  However, testing indicates that the
penalty is negligible on ordinary workloads, so let's make this
change to improve robustness in extreme cases.

Patch by me, per report from Dmitriy Kuzmin.  No back-patch
(for now anyway), since it seems that a significant improvement
would only occur in corner cases.

Discussion: https://postgr.es/m/CAHLDt=_ts0A7Agn=hCpUh+RCFkxd+G6uuT=kcTfqFtGur0dp=A@mail.gmail.com
2022-10-06 12:27:36 -04:00
Tom Lane ca71131eeb Introduce t_isalnum() to replace t_isalpha() || t_isdigit() tests.
ts_locale.c omitted support for "isalnum" tests, perhaps on the
grounds that there were initially no use-cases for that.  However,
both ltree and pg_trgm need such tests, and we do also have one
use-case now in the core backend.  The workaround of testing
isalpha and isdigit separately seems quite inefficient, especially
when dealing with multibyte characters; so let's fill in the
missing support.

Discussion: https://postgr.es/m/2548310.1664999615@sss.pgh.pa.us
2022-10-06 11:08:56 -04:00
Michael Paquier 5757141cae Fix comment in xlogprefetcher.c
Author: Sho Kato
Discussion: https://postgr.es/m/TYCPR01MB684954052EC534A3261B29249F5C9@TYCPR01MB6849.jpnprd01.prod.outlook.com
2022-10-06 20:25:02 +09:00
David Rowley 112f0225db Add optional parameter to PG_TRY() macros
This optional parameter can be specified in cases where there are nested
PG_TRY() statements within a function in order to stop the compiler from
issuing warnings about shadowed local variables when compiling with
-Wshadow.  The optional parameter is used as a suffix on the variable
names declared within the PG_TRY(), PG_CATCH(), PG_FINALLY() and
PG_END_TRY() macros.  The parameter, if specified, must be the same in
each component macro of the given PG_TRY() block.

This also adjusts the single case where we have nested PG_TRY() statements
to add a parameter to the inner-most PG_TRY().

This reduces the number of compiler warnings when compiling with
-Wshadow=compatible-local from 5 down to 1.

Author: David Rowley
Discussion: https://postgr.es/m/CAApHDvqWGMdB_pATeUqE=JCtNqNxObPOJ00jFEa2_sZ20j_Wvg@mail.gmail.com
2022-10-06 10:08:31 +13:00
Andres Freund 902ab2fcef meson: Add windows resource files
The generated resource files aren't exactly the same ones as the old
buildsystems generate. Previously "InternalName" and "OriginalFileName" were
mostly wrong / not set (despite being required), but that was hard to fix in
at least the make build. Additionally, the meson build falls back to a
"auto-generated" description when not set, and doesn't set it in a few cases -
unlikely that anybody looks at these descriptions in detail.

Author: Andres Freund <andres@anarazel.de>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
2022-10-05 09:56:05 -07:00
David Rowley 2d0bbedda7 Rename shadowed local variables
In a similar effort to f01592f91, here we mostly rename shadowed local
variables to remove the warnings produced when compiling with
-Wshadow=compatible-local.

This fixes 63 warnings and leaves just 5.

Author: Justin Pryzby, David Rowley
Reviewed-by: Justin Pryzby
Discussion https://postgr.es/m/20220817145434.GC26426%40telsasoft.com
2022-10-05 21:01:41 +13:00
Michael Paquier bdf9b60085 Fix comment in guc_tables.c
s/ERROR_HANDLING/ERROR_HANDLING_OPTIONS/.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+PtDj3CV+f0pVisc0XYMi2LHGBpQxQWtF0FjiSVN_nV17Q@mail.gmail.com
2022-10-04 15:39:41 +09:00
Michael Paquier c42cd05c58 Cleanup useless assignments and checks
This cleans up a couple of areas:
- Remove XLogSegNo calculation for the last WAL segment in backup in
xlog.c (7d70809 has moved this logic entirely to xlogbackup.c when
building the contents of the backup history file).
- Remove check on log_min_duration in analyze.c, as it is already true
where this code path is reached.
- Simplify call to find_option() in guc.c.

Author: Ranier Vilela
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAEudQArCDQQiPiFR16=yu9k5s2tp4tgEe1U1ZbkW4ofx81AWWQ@mail.gmail.com
2022-10-04 13:16:23 +09:00
Andres Freund 908e17151b meson: llvm: Use llvm-config's --cxxflags when building llvmjit
Otherwise we don't use LLVM's flags when building llvmjit_wrap.cpp and
llvmjit_inline.cpp. That can cause compile time failures if the C++ compiler
doesn't default to a new enough C++ standards version and link time failures
due to ABI influencing flags like -fno-rtti.
2022-10-03 14:55:52 -07:00
Tom Lane f4c7c410ee Revert "Optimize order of GROUP BY keys".
This reverts commit db0d67db24 and
several follow-on fixes.  The idea of making a cost-based choice
of the order of the sorting columns is not fundamentally unsound,
but it requires cost information and data statistics that we don't
really have.  For example, relying on procost to distinguish the
relative costs of different sort comparators is pretty pointless
so long as most such comparator functions are labeled with cost 1.0.
Moreover, estimating the number of comparisons done by Quicksort
requires more than just an estimate of the number of distinct values
in the input: you also need some idea of the sizes of the larger
groups, if you want an estimate that's good to better than a factor of
three or so.  That's data that's often unknown or not very reliable.
Worse, to arrive at estimates of the number of calls made to the
lower-order-column comparison functions, the code needs to make
estimates of the numbers of distinct values of multiple columns,
which are necessarily even less trustworthy than per-column stats.
Even if all the inputs are perfectly reliable, the cost algorithm
as-implemented cannot offer useful information about how to order
sorting columns beyond the point at which the average group size
is estimated to drop to 1.

Close inspection of the code added by db0d67db2 shows that there
are also multiple small bugs.  These could have been fixed, but
there's not much point if we don't trust the estimates to be
accurate in-principle.

Finally, the changes in cost_sort's behavior made for very large
changes (often a factor of 2 or so) in the cost estimates for all
sorting operations, not only those for multi-column GROUP BY.
That naturally changes plan choices in many situations, and there's
precious little evidence to show that the changes are for the better.
Given the above doubts about whether the new estimates are really
trustworthy, it's hard to summon much confidence that these changes
are better on the average.

Since we're hard up against the release deadline for v15, let's
revert these changes for now.  We can always try again later.

Note: in v15, I left T_PathKeyInfo in place in nodes.h even though
it's unreferenced.  Removing it would be an ABI break, and it seems
a bit late in the release cycle for that.

Discussion: https://postgr.es/m/TYAPR01MB586665EB5FB2C3807E893941F5579@TYAPR01MB5866.jpnprd01.prod.outlook.com
2022-10-03 10:56:16 -04:00
Peter Eisentraut a9d58bfe8a Fix tiny memory leaks
Both check_application_name() and check_cluster_name() use
pg_clean_ascii() but didn't release the memory.  Depending on when the
GUC is set, this might be cleaned up at some later time or it would
leak postmaster memory once.  In any case, it seems better not to have
to rely on such analysis and make the code locally robust.  Also, this
makes Valgrind happier.

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Jacob Champion <jchampion@timescale.com>
Discussion: https://www.postgresql.org/message-id/CAD21AoBmFNy9MPfA0UUbMubQqH3AaK5U3mrv6pSeWrwCk3LJ8g@mail.gmail.com
2022-10-01 12:48:24 +02:00
Michael Paquier 83e42a0035 doc: Fix some grammar and typos
This fixes some areas related to logical replication and custom RMGRs.

Author: Ekaterina Kiryanova
Discussion: https://postgr.es/m/fa4773f1-1396-384a-bcd7-85b5e013f399@postgrespro.ru
Backpatch-through: 15
2022-10-01 15:28:02 +09:00
Tom Lane 2dc2e4e31a Avoid improbable PANIC during heap_update, redux.
Commit 34f581c39 intended to ensure that RelationGetBufferForTuple
would acquire a visibility-map page pin in case the otherBuffer's
all-visible bit had become set since we last had lock on that page.
But I missed a case: when we're extending the relation, VM concerns
were dealt with only in the relatively-less-likely case that we
fail to conditionally lock the otherBuffer.  I think I'd believed
that we couldn't need to worry about it if the conditional lock
succeeds, which is true for the target buffer; but the otherBuffer
was unlocked for awhile so its bit might be set anyway.  So we need
to do the GetVisibilityMapPins dance, and then also recheck the
page's free space, in both cases.

Per report from Jaime Casanova.  Back-patch to v12 as the previous
patch was (although there's still no evidence that the bug is
reachable pre-v14).

Discussion: https://postgr.es/m/E1lWLjP-00006Y-Ml@gemulon.postgresql.org
2022-09-30 19:37:00 -04:00
Alvaro Herrera 69298db8e1
Fix tab-completion after commit 790bf615dd
I (Álvaro) broke tab-completion for GRANT .. ALL TABLES IN SCHEMA while
removing ALL from the publication syntax for schemas in the
aforementioned commit.  I also missed to update a bunch of
tab-completion rules for ALTER/CREATE PUBLICATION that match each
individual piece of ALL TABLES IN SCHEMA.  Repair those bugs.

While fixing up that commit, update a couple of outdated comments
related to the same change.

Backpatch to 15.

Author: Shi yu <shiy.fnst@fujitsu.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/OSZPR01MB6310FCE8609185A56344EED2FD559@OSZPR01MB6310.jpnprd01.prod.outlook.com
2022-09-30 12:53:31 +02:00
Michael Paquier 65b158ae4e Remove useless argument from UnpinBuffer()
The last caller of UnpinBuffer() that did not want to adjust
CurrentResourceOwner was removed in 2d115e4, and nothing has been
introduced in bufmgr.c to do the same thing since.  This simplifies 10
code paths.

Author: Aleksander Alekseev
Reviewed-by: Nathan Bossart, Zhang Mingli, Bharath Rupireddy
Discussion: https://postgr.es/m/CAJ7c6TOmmFpb6ohurLhTC7hKNJWGzdwf8s4EAtAZxD48g-e6Jw@mail.gmail.com
2022-09-30 15:57:47 +09:00
Tom Lane 551aa6b7b9 Improve wording of log messages triggered by max_slot_wal_keep_size.
The one about "terminating process to release replication slot" told
you nothing about why that was happening.  The one about "invalidating
slot because its restart_lsn exceeds max_slot_wal_keep_size" told you
what was happening, but violated our message style guideline about
keeping the primary message short.  Add DETAIL/HINT lines to carry
the appropriate detail and make the two cases more uniform.

While here, fix bogus test logic in 019_replslot_limit.pl: if it timed
out without seeing the expected log message, no test failure would be
reported.  This is flat broken since commit 549ec201d removed the test
counts; even before that it was horribly bad style, since you'd only
get told that not all tests had been run.

Kyotaro Horiguchi, reviewed by Bertrand Drouvot; test fixes by me

Discussion: https://postgr.es/m/20211214.130456.2233153190058148084.horikyota.ntt@gmail.com
2022-09-29 13:27:48 -04:00
Tom Lane d7e39d72ca Use actual backend IDs in pg_stat_get_backend_idset() and friends.
Up to now, the ID values returned by pg_stat_get_backend_idset() and
used by pg_stat_get_backend_activity() and allied functions were just
indexes into a local array of sessions seen by the last stats refresh.
This is problematic for a few reasons.  The "ID" of a session can vary
over its existence, which is surprising.  Also, while these numbers
often match the "backend ID" used for purposes like temp schema
assignment, that isn't reliably true.  We can fairly cheaply switch
things around to make these numbers actually be the sessions' backend
IDs.  The added test case illustrates that with this definition, the
temp schema used by a given session can be obtained given its PID.

While here, delete some dead code that guarded against getting
a NULL return from pgstat_fetch_stat_local_beentry().  That can't
happen as long as the caller is careful to pass an in-range array
index, as all the callers are.  (This code may not have been dead
when written, but it surely is now.)

Nathan Bossart

Discussion: https://postgr.es/m/20220815205811.GA250990@nathanxps13
2022-09-29 12:14:39 -04:00
Etsuro Fujita d5e3fe682a Update comment in ExecInsert() regarding batch insertion.
Remove the stale text that is a leftover from an earlier version of the
patch to add support for batch insertion, and adjust the wording in the
remaining text.

Back-patch to v14 where batch insertion came in.

Review and wording adjustment by Tom Lane.

Discussion: https://postgr.es/m/CAPmGK14goatHPHQv2Aeu_UTKqZ%2BBO%2BP%2Bzd3HKv5D%2BdyyfWKDSw%40mail.gmail.com
2022-09-29 16:55:00 +09:00
Michael Paquier 0823d061b0 Introduce SYSTEM_USER
SYSTEM_USER is a reserved keyword of the SQL specification that,
roughly described, is aimed at reporting some information about the
system user who has connected to the database server.  It may include
implementation-specific information about the means by the user
connected, like an authentication method.

This commit implements SYSTEM_USER as of auth_method:identity, where
"auth_method" is a keyword about the authentication method used to log
into the server (like peer, md5, scram-sha-256, gss, etc.) and
"identity" is the authentication identity as introduced by 9afffcb (peer
sets authn to the OS user name, gss to the user principal, etc.).  This
format has been suggested by Tom Lane.

Note that thanks to d951052, SYSTEM_USER is available to parallel
workers.

Bump catalog version.

Author: Bertrand Drouvot
Reviewed-by: Jacob Champion, Joe Conway, Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/7e692b8c-0b11-45db-1cad-3afc5b57409f@amazon.com
2022-09-29 15:05:40 +09:00
Thomas Munro b6d8a60aba Restore pg_pread and friends.
Commits cf112c12 and a0dc8271 were a little too hasty in getting rid of
the pg_ prefixes where we use pread(), pwrite() and vectored variants.

We dropped support for ancient Unixes where we needed to use lseek() to
implement replacements for those, but it turns out that Windows also
changes the current position even when you pass in an offset to
ReadFile() and WriteFile() if the file handle is synchronous, despite
its documentation saying otherwise.

Switching to asynchronous file handles would fix that, but have other
complications.  For now let's just put back the pg_ prefix and add some
comments to highlight the non-standard side-effect, which we can now
describe as Windows-only.

Reported-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/20220923202439.GA1156054%40nathanxps13
2022-09-29 13:12:11 +13:00
David Rowley 3a5817695a Restrict Datum sort optimization to byval types only
91e9e89dc modified nodeSort.c so that it used datum sorts when the
targetlist of the outer node contained only a single column.  That commit
failed to recognise that the Datum returned by tuplesort_getdatum() must
be pfree'd when the type is a byref type.  Ronan Dunklau did originally
propose the patch with that restriction, but that, probably through my own
fault, got lost during further development work.

Due to the timing of this report (PG15 RC1 is almost out the door), let's
just restrict the datum sort optimization to apply for byval types only.
We might want to look harder into making this work for byref types in
PG16.

Reported-by: Önder Kalacı
Diagnosis-by: Tom Lane
Discussion: https://postgr.es/m/CACawEhVxe0ufR26UcqtU7GYGRuubq3p6ZWPGXL4cxy_uexpAAQ@mail.gmail.com
Backpatch-through: 15, where 91e9e89dc was introduced.
2022-09-29 11:43:00 +13:00
Tom Lane 4d2a844242 Allow callback functions to deregister themselves during a call.
Fetch the next-item pointer before the call not after, so that
we aren't dereferencing a dangling pointer if the callback
deregistered itself during the call.  The risky coding pattern
appears in CallXactCallbacks, CallSubXactCallbacks, and
ResourceOwnerReleaseInternal.  (There are some other places that
might be at hazard if they offered deregistration functionality,
but they don't.)

I (tgl) considered back-patching this, but desisted because it
wouldn't be very safe for extensions to rely on this working in
pre-v16 branches.

Hao Wu

Discussion: https://postgr.es/m/CAH+9SWXTiERkmhRke+QCcc+jRH8d5fFHTxh8ZK0-Yn4BSpyaAg@mail.gmail.com
2022-09-28 11:23:27 -04:00