Commit Graph

6793 Commits

Author SHA1 Message Date
Thomas Munro faeedbcefd Introduce PG_IO_ALIGN_SIZE and align all I/O buffers.
In order to have the option to use O_DIRECT/FILE_FLAG_NO_BUFFERING in a
later commit, we need the addresses of user space buffers to be well
aligned.  The exact requirements vary by OS and file system (typically
sectors and/or memory pages).  The address alignment size is set to
4096, which is enough for currently known systems: it matches modern
sectors and common memory page size.  There is no standard governing
O_DIRECT's requirements so we might eventually have to reconsider this
with more information from the field or future systems.

Aligning I/O buffers on memory pages is also known to improve regular
buffered I/O performance.

Three classes of I/O buffers for regular data pages are adjusted:
(1) Heap buffers are now allocated with the new palloc_aligned() or
MemoryContextAllocAligned() functions introduced by commit 439f6175.
(2) Stack buffers now use a new struct PGIOAlignedBlock to respect
PG_IO_ALIGN_SIZE, if possible with this compiler.  (3) The buffer
pool is also aligned in shared memory.

WAL buffers were already aligned on XLOG_BLCKSZ.  It's possible for
XLOG_BLCKSZ to be configured smaller than PG_IO_ALIGNED_SIZE and thus
for O_DIRECT WAL writes to fail to be well aligned, but that's a
pre-existing condition and will be addressed by a later commit.

BufFiles are not yet addressed (there's no current plan to use O_DIRECT
for those, but they could potentially get some incidental speedup even
in plain buffered I/O operations through better alignment).

If we can't align stack objects suitably using the compiler extensions
we know about, we disable the use of O_DIRECT by setting PG_O_DIRECT to
0.  This avoids the need to consider systems that have O_DIRECT but
can't align stack objects the way we want; such systems could in theory
be supported with more work but we don't currently know of any such
machines, so it's easier to pretend there is no O_DIRECT support
instead.  That's an existing and tested class of system.

Add assertions that all buffers passed into smgrread(), smgrwrite() and
smgrextend() are correctly aligned, unless PG_O_DIRECT is 0 (= stack
alignment tricks may be unavailable) or the block size has been set too
small to allow arrays of buffers to be all aligned.

Author: Thomas Munro <thomas.munro@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CA+hUKGK1X532hYqJ_MzFWt0n1zt8trz980D79WbjwnT-yYLZpg@mail.gmail.com
2023-04-08 16:34:50 +12:00
Tom Lane db6957bae8 Add missing .gitignore entry.
Seems an oversight in 7d8219a44.  Fix before somebody commits
a generated file.
2023-04-07 23:32:49 -04:00
Peter Geoghegan 7d8219a444 Show more detail in heapam rmgr descriptions.
Add helper functions that output arrays in a standard format, and use
the functions inside heapdesc routines.  This allows tools like
pg_walinspect to show a detailed description of the page offset number
arrays for records like PRUNE and VACUUM (unless there was an FPI).

Also document the conventions that desc routines should follow.  Only
the heapdesc routines follow the conventions for now, so they're just
guidelines for the time being.

Based on a suggestion from Andres Freund.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/flat/20230109215842.fktuhesvayno6o4g%40awork3.anarazel.de
2023-04-07 16:08:52 -07:00
Daniel Gustafsson 664d757531 Refactor background psql TAP functions
This breaks out the background and interactive psql functionality into a
new class, PostgreSQL::Test::BackgroundPsql.  Sessions are still initiated
via PostgreSQL::Test::Cluster, but once started they can be manipulated by
the new helper functions which intend to make querying easier.  A sample
session for a command which can be expected to finish at a later time can
be seen below.

  my $session = $node->background_psql('postgres');
  $bsession->query_until(qr/start/, q(
    \echo start
	CREATE INDEX CONCURRENTLY idx ON t(a);
  ));
  $bsession->quit;

Patch by Andres Freund with some additional hacking by me.

Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/20230130194350.zj5v467x4jgqt3d6@awork3.anarazel.de
2023-04-07 22:14:20 +02:00
Alvaro Herrera e056c557ae
Catalog NOT NULL constraints
We now create pg_constaint rows for NOT NULL constraints with
contype='n'.

We propagate these constraints during operations such as adding
inheritance relationships, creating and attaching partitions, creating
tables LIKE other tables.  We mostly follow the well-known rules of
conislocal and coninhcount that we have for CHECK constraints, with some
adaptations; for example, as opposed to CHECK constraints, we don't
match NOT NULL ones by name when descending a hierarchy to alter it;
instead we match by column number.  This means we don't require the
constraint names to be identical across a hierarchy.

For now, we omit them from system catalogs.  Maybe this is worth
reconsidering.  We don't support NOT VALID nor DEFERRABLE clauses
either; these can be added as separate features later (this patch is
already large and complicated enough.)

This has been very long in the making.  The first patch was written by
Bernd Helmle in 2010 to add a new pg_constraint.contype value ('n'),
which I (Álvaro) then hijacked in 2011 and 2012, until that one was
killed by the realization that we ought to use contype='c' instead:
manufactured CHECK constraints.  However, later SQL standard
development, as well as nonobvious emergent properties of that design
(mostly, failure to distinguish them from "normal" CHECK constraints as
well as the performance implication of having to test the CHECK
expression) led us to reconsider this choice, so now the current
implementation uses contype='n' again.

In 2016 Vitaly Burovoy also worked on this feature[1] but found no
consensus for his proposed approach, which was claimed to be closer to
the letter of the standard, requiring additional pg_attribute columns to
track the OID of the NOT NULL constraint for that column.
[1] https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>

Discussion: https://postgr.es/m/CACA0E642A0267EDA387AF2B%40%5B172.26.14.62%5D
Discussion: https://postgr.es/m/AANLkTinLXMOEMz+0J29tf1POokKi4XDkWJ6-DDR9BKgU@mail.gmail.com
Discussion: https://postgr.es/m/20110707213401.GA27098@alvh.no-ip.org
Discussion: https://postgr.es/m/1343682669-sup-2532@alvh.no-ip.org
Discussion: https://postgr.es/m/CAKOSWNkN6HSyatuys8xZxzRCR-KL1OkHS5-b9qd9bf1Rad3PLA@mail.gmail.com
Discussion: https://postgr.es/m/20220817181249.q7qvj3okywctra3c@alvherre.pgsql
2023-04-07 19:59:57 +02:00
Tom Lane cd82e5c79d Fix locale-dependent test case.
psql parses the interval argument of \watch with locale-dependent
strtod().  In commit 00beecfe8 I added a test case that exercises
a fractional interval, but I hard-coded 0.01 which doesn't work
in locales where the radix point isn't ".".  We don't want to
change this longstanding parsing behavior, so fix the test case
to generate a suitably locale-aware spelling.

Report and patch by Alexander Korotkov.

Discussion: https://postgr.es/m/CAPpHfdv+10Uk6FWjsh3+ju7kHYr76LaRXbYayXmrM7FBU-=Hgg@mail.gmail.com
2023-04-07 10:35:11 -04:00
Amit Kapila 96c498d2f8 Add tab-completion for newly added SUBSCRIPTION options.
Commits c3afe8cf5a and 482675987b added new subscription options
"password_required" and "run_as_owner". This patch adds tab-completion
for these newly added options.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pu=pnJf=SS1583pknSQ3CbOqLCkWcJCQYt6zxTagHEdmw@mail.gmail.com
2023-04-07 10:32:36 +05:30
David Rowley ae78cae3be Add --buffer-usage-limit option to vacuumdb
1cbbee033 added BUFFER_USAGE_LIMIT to the VACUUM and ANALYZE commands, so
here we permit that option to be specified in vacuumdb.

In passing, adjust the documents for vacuum_buffer_usage_limit and the
BUFFER_USAGE_LIMIT VACUUM option to mention "kB" rather than "KB".  Do the
same for the ERROR message in ExecVacuum() and
check_vacuum_buffer_usage_limit().  Without that we might tell a user that
the valid minimum value is 128 KB only to reject that because we accept
only "kB" and not "KB".

Also, add a small reminder comment in vacuum.h to try to trigger the
memory of anyone adding new fields to VacuumParams that they might want to
consider if vacuumdb needs to grow a new option too.

Author: Melanie Plageman
Reviewed-by: Justin Pryzby
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/ZAzTg3iEnubscvbf@telsasoft.com
2023-04-07 12:47:10 +12:00
David Rowley 1cbbee0338 Add VACUUM/ANALYZE BUFFER_USAGE_LIMIT option
Add new options to the VACUUM and ANALYZE commands called
BUFFER_USAGE_LIMIT to allow users more control over how large to make the
buffer access strategy that is used to limit the usage of buffers in
shared buffers.  Larger rings can allow VACUUM to run more quickly but
have the drawback of VACUUM possibly evicting more buffers from shared
buffers that might be useful for other queries running on the database.

Here we also add a new GUC named vacuum_buffer_usage_limit which controls
how large to make the access strategy when it's not specified in the
VACUUM/ANALYZE command.  This defaults to 256KB, which is the same size as
the access strategy was prior to this change.  This setting also
controls how large to make the buffer access strategy for autovacuum.

Per idea by Andres Freund.

Author: Melanie Plageman
Reviewed-by: David Rowley
Reviewed-by: Andres Freund
Reviewed-by: Justin Pryzby
Reviewed-by: Bharath Rupireddy
Discussion: https://postgr.es/m/20230111182720.ejifsclfwymw2reb@awork3.anarazel.de
2023-04-07 11:40:31 +12:00
Tom Lane 31ae2aa9d2 psql: set SHELL_ERROR and SHELL_EXIT_CODE in more places.
Make the \g, \o, \w, and \copy commands set these variables
when closing a pipe.  We missed doing this in commit b0d8f2d98,
but it seems like a good idea.

There are some remaining places in psql that intentionally don't
update these variables after running a child program:
	* pager invocations
	* backtick evaluation within a prompt
	* \e (edit query buffer)

Corey Huinker and Tom Lane

Discussion: https://postgr.es/m/CADkLM=eSKwRGF-rnRqhtBORRtL49QsjcVUCa-kLxKTqxypsakw@mail.gmail.com
2023-04-06 17:33:38 -04:00
Tom Lane 00beecfe83 psql: add an optional execution-count limit to \watch.
\watch can now be told to stop after N executions of the query.

With the idea that we might want to add more options to \watch
in future, this patch generalizes the command's syntax to a list
of name=value options, with the interval allowed to omit the name
for backwards compatibility.

Andrey Borodin, reviewed by Kyotaro Horiguchi, Nathan Bossart,
Michael Paquier, Yugo Nagata, and myself

Discussion: https://postgr.es/m/CAAhFRxiZ2-n_L1ErMm9AZjgmUK=qS6VHb+0SaMn8sqqbhF7How@mail.gmail.com
2023-04-06 13:18:14 -04:00
Tomas Vondra 2820adf775 Support long distance matching for zstd compression
zstd compression supports a special mode for finding matched in distant
past, which may result in better compression ratio, at the expense of
using more memory (the window size is 128MB).

To enable this optional mode, use the "long" keyword when specifying the
compression method (--compress=zstd:long).

Author: Justin Pryzby
Reviewed-by: Tomas Vondra, Jacob Champion
Discussion: https://postgr.es/m/20230224191840.GD1653@telsasoft.com
Discussion: https://postgr.es/m/20220327205020.GM28503@telsasoft.com
2023-04-06 17:18:42 +02:00
Tomas Vondra 84adc8e20f pg_dump: Add support for zstd compression
Allow pg_dump to use the zstd compression, in addition to gzip/lz4. Bulk
of the new compression method is implemented in compress_zstd.{c,h},
covering the pg_dump compression APIs. The rest of the patch adds test
and makes various places aware of the new compression method.

The zstd library (which this patch relies on) supports multithreaded
compression since version 1.5. We however disallow that feature for now,
as it might interfere with parallel backups on platforms that rely on
threads (e.g. Windows). This can be improved / relaxed in the future.

This also fixes a minor issue in InitDiscoverCompressFileHandle(), which
was not updated to check if the file already has the .lz4 extension.

Adding zstd compression was originally proposed in 2020 (see the second
thread), but then was reworked to use the new compression API introduced
in e9960732a9. I've considered both threads when compiling the list of
reviewers.

Author: Justin Pryzby
Reviewed-by: Tomas Vondra, Jacob Champion, Andreas Karlsson
Discussion: https://postgr.es/m/20230224191840.GD1653@telsasoft.com
Discussion: https://postgr.es/m/20201221194924.GI30237@telsasoft.com
2023-04-05 21:39:33 +02:00
Amit Kapila 8df0d3d530 Add Copyright notice in 001_basic.pl and 002_pg_upgrade.pl.
Author: Kuroda Hayato
Discussion: https://postgr.es/m/TYCPR01MB587073D91E372B8EF719931EF5929@TYCPR01MB5870.jpnprd01.prod.outlook.com
2023-04-05 09:20:14 +05:30
Jeff Davis 36320cbc16 Fix MSVC warning introduced in ea1db8ae70.
Discussion: https://postgr.es/m/CA+hUKGJR1BhCORa5WdvwxztD3arhENcwaN1zEQ1Upg20BwjKWA@mail.gmail.com
Reported-by: Thomas Munro
2023-04-04 15:43:18 -07:00
Jeff Davis ea1db8ae70 Canonicalize ICU locale names to language tags.
Convert to BCP47 language tags before storing in the catalog, except
during binary upgrade or when the locale comes from an existing
collation or template database.

The resulting language tags can vary slightly between ICU
versions. For instance, "@colBackwards=yes" is converted to
"und-u-kb-true" in older versions of ICU, and to the simpler (but
equivalent) "und-u-kb" in newer versions.

The process of canonicalizing to a language tag also understands more
input locale string formats than ucol_open(). For instance,
"fr_CA.UTF-8" is misinterpreted by ucol_open() and the region is
ignored; effectively treating it the same as the locale "fr" and
opening the wrong collator. Canonicalization properly interprets the
language and region, resulting in the language tag "fr-CA", which can
then be understood by ucol_open().

This commit fixes a problem in prior versions due to ucol_open()
misinterpreting locale strings as described above. For instance,
creating an ICU collation with locale "fr_CA.UTF-8" would store that
string directly in the catalog, which would later be passed to (and
misinterpreted by) ucol_open(). After this commit, the locale string
will be canonicalized to language tag "fr-CA" in the catalog, which
will be properly understood by ucol_open(). Because this fix affects
the resulting collator, we cannot change the locale string stored in
the catalog for existing databases or collations; otherwise we'd risk
corrupting indexes. Therefore, only canonicalize locales for
newly-created (not upgraded) collations/databases. For similar
reasons, do not backport.

Discussion: https://postgr.es/m/8c7af6820aed94dc7bc259d2aa7f9663518e6137.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-04-04 10:38:58 -07:00
Robert Haas 482675987b Add a run_as_owner option to subscriptions.
This option is normally false, but can be set to true to obtain
the legacy behavior where the subscription runs with the permissions
of the subscription owner rather than the permissions of the
table owner. The advantages of this mode are (1) it doesn't require
that the subscription owner have permission to SET ROLE to each
table owner and (2) since no role switching occurs, the
SECURITY_RESTRICTED_OPERATION restrictions do not apply.

On the downside, it allows any table owner to easily usurp
the privileges of the subscription owner - basically, to take
over their account. Because that's generally quite undesirable,
we don't make this mode the default, but we do make it available,
just in case the new behavior causes too many problems for someone.

Discussion: http://postgr.es/m/CA+TgmoZ-WEeG6Z14AfH7KhmpX2eFh+tZ0z+vf0=eMDdbda269g@mail.gmail.com
2023-04-04 12:03:03 -04:00
Peter Eisentraut 1980d3585e pg_basebackup: Correct type of WalSegSz
The pg_basebackup code had WalSegSz as uint32, whereas the rest of the
code has it as int.  This seems confusing, and using the extra range
wouldn't actually work.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/1bf15c7a-0acd-1864-081e-7a28814310fe%40enterprisedb.com
2023-04-03 07:21:06 +02:00
Tomas Vondra 0070b66fef pg_dump: Use only LZ4 frame format for compression
After 0da243fed0 got committed, it was reported that in some cases the
compression ratio is rather poor - particularly for custom format with
narrow tables - due to writing the LZ4 header/footer for each row.

This commit switches to LZ4F (LZ4 frame format), eliminating most of the
overhead and greatly improving the compression ratio. This makes the
compressed size about the same for plain and custom formats (just like
for gzip, for example).

LZ4F is now used by both compression APIs, which allowed refactoring and
reusing more of the code. For consistency this also renames the LZ4File
struct to LZ4State, and a number of functions are now prefixed with
LZ4Stream_ (instead of LZ4File_).

Patch by Georgios Kokolatos, based on report and initial patch by Justin
Pryzby. Review and minor cleanups by me.

Author: Georgios Kokolatos, Justin Pryzby
Reported-by: Justin Pryzby
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/20230227044910.GO1653%40telsasoft.com
2023-04-01 00:54:50 +02:00
Robert Haas c3afe8cf5a Add new predefined role pg_create_subscription.
This role can be granted to non-superusers to allow them to issue
CREATE SUBSCRIPTION. The non-superuser must additionally have CREATE
permissions on the database in which the subscription is to be
created.

Most forms of ALTER SUBSCRIPTION, including ALTER SUBSCRIPTION .. SKIP,
now require only that the role performing the operation own the
subscription, or inherit the privileges of the owner. However, to
use ALTER SUBSCRIPTION ... RENAME or ALTER SUBSCRIPTION ... OWNER TO,
you also need CREATE permission on the database. This is similar to
what we do for schemas. To change the owner of a schema, you must also
have permission to SET ROLE to the new owner, similar to what we do
for other object types.

Non-superusers are required to specify a password for authentication
and the remote side must use the password, similar to what is required
for postgres_fdw and dblink.  A superuser who wants a non-superuser to
own a subscription that does not rely on password authentication may
set the new password_required=false property on that subscription. A
non-superuser may not set password_required=false and may not modify a
subscription that already has password_required=false.

This new password_required subscription property works much like the
eponymous postgres_fdw property.  In both cases, the actual semantics
are that a password is not required if either (1) the property is set
to false or (2) the relevant user is the superuser.

Patch by me, reviewed by Andres Freund, Jeff Davis, Mark Dilger,
and Stephen Frost (but some of those people did not fully endorse
all of the decisions that the patch makes).

Discussion: http://postgr.es/m/CA+TgmoaDH=0Xj7OBiQnsHTKcF2c4L+=gzPBUKSJLh8zed2_+Dg@mail.gmail.com
2023-03-30 11:37:19 -04:00
Peter Eisentraut 563f21cda8 Move definition of standard collations from initdb to pg_collation.dat
The standard collations "ucs_basic" and "unicode" were defined in
initdb, even though pg_collation.dat seems like the correct place for
them.  It seems this was just forgotten during various reorganizations
of initdb and pg_collation.dat/.h over time.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/08b58ecd-0d50-9395-ed51-dc8294e3fd2b%40enterprisedb.com
2023-03-29 09:45:21 +02:00
Tomas Vondra 00d9dcf5be pg_dump: Fix gzip compression of empty data
The pg_dump Compressor API has three basic callbacks - Allocate, Write
and End.  The gzip implementation (since e9960732a) wrongly assumed the
Write function would always be called, and deferred the initialization
of the internal compression system until the first such call.  But when
there's no data to compress (e.g. for empty LO), this would result in
not finalizing the compression state (because it was not actually
initialized), producing invalid dump.

Fixed by initializing the internal compression system in the Allocate
call, whenever the caller provides the Write.  For decompression the
state is not needed, so we leave the private_data member unpopulated.

Introduces a pg_dump TAP test compressing an empty large object.

This also rearranges the functions to their original order, to make
diffs against older code simpler to understand.  Finally, replace an
unreachable pg_fatal() with a simple assert check.

Reported-by: Justin Pryzby
Author: Justin Pryzby, Georgios Kokolatos
Reviewed-by: Georgios Kokolatos, Tomas Vondra

https://postgr.es/m/20230228235834.GC30529%40telsasoft.com
2023-03-29 02:34:48 +02:00
Jeff Davis 1671f990dd Validate ICU locales.
For ICU collations, ensure that the locale's language exists in ICU,
and that the locale can be opened.

Basic validation helps avoid minor mistakes and misspellings, which
often fall back to the root locale instead of the intended
locale. It's even more important to avoid such mistakes in ICU
versions 54 and earlier, where the same (misspelled) locale string
could fall back to different locales depending on the environment.

Discussion: https://postgr.es/m/11b1eeb7e7667fdd4178497aeb796c48d26e69b9.camel@j-davis.com
Discussion: https://postgr.es/m/df2efad0cae7c65180df8e5ebb709e5eb4f2a82b.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-28 16:34:29 -07:00
Jeff Davis c1f1c1f87f initdb: emit message when using default ICU locale.
Helpful to determine from test logs whether the locale came from the
environment or a command-line option.

Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com
Reviewed-by: Peter Eisentraut
2023-03-28 08:24:43 -07:00
Jeff Davis f8ca22295e initdb: replace check_icu_locale() with default_icu_locale().
The extra checks done in check_icu_locale() are not necessary. An
existing comment already pointed out that the checks would be done
during post-bootstrap initialization, when the locale is opened by the
backend. This was a mistake in commit 27b62377b4.

This commit creates a simpler function default_icu_locale() to just
return the locale of the default collator.

Discussion: https://postgr.es/m/04182066-7655-344a-b8b7-040b1b2490fb%40enterprisedb.com
Reviewed-by: Peter Eisentraut
2023-03-28 08:24:21 -07:00
Robert Haas c87aff065c amcheck: Generalize one of the recently-added update chain checks.
Commit bbc1376b39 checked that if
a redirected line pointer pointed to a tuple, the tuple should be
marked both HEAP_ONLY_TUPLE and HEAP_UPDATED. But Andres Freund
pointed out that *any* tuple that is marked HEAP_ONLY_TUPLE should
be marked HEAP_UPDATED, not just one that is the target of a
redirected line pointer. Do that instead.

To see why this is better, consider a redirect line pointer A
which points to a heap-only tuple B which points (via CTID)
to another heap-only tuple C. With the old code, we'd complain
if B was not marked HEAP_UPDATED, but with this change, we'll
complain if either B or C is not marked HEAP_UPDATED.

(Note that, with or without this commit, if either B or C were
not marked HEAP_ONLY_TUPLE, we would also complain about that.)

Discussion: http://postgr.es/m/CA%2BTgmobLypZx%3DcOH%2ByY1GZmCruaoucHm77A6y_-Bo%3Dh-_3H28g%40mail.gmail.com
2023-03-27 13:37:16 -04:00
Tom Lane 3c05284d83 Invent GENERIC_PLAN option for EXPLAIN.
This provides a very simple way to see the generic plan for a
parameterized query.  Without this, it's necessary to define
a prepared statement and temporarily change plan_cache_mode,
which is a bit tedious.

One thing that's a bit of a hack perhaps is that we disable
execution-time partition pruning when the GENERIC_PLAN option
is given.  That's because the pruning code may attempt to
fetch the value of one of the parameters, which would fail.

Laurenz Albe, reviewed by Julien Rouhaud, Christoph Berg,
Michel Pelletier, Jim Jones, and myself

Discussion: https://postgr.es/m/0a29b954b10b57f0d135fe12aa0909bd41883eb0.camel@cybertec.at
2023-03-24 17:07:22 -04:00
Andres Freund e522049f23 meson: add install-{quiet, world} targets
To define our own install target, we need dependencies on the i18n targets,
which we did not collect so far.

Discussion: https://postgr.es/m/3fc3bb9b-f7f8-d442-35c1-ec82280c564a@enterprisedb.com
2023-03-23 21:20:18 -07:00
Jeff Davis 3b50275b12 Handle the "und" locale in ICU versions 54 and older.
The "und" locale is an alternative spelling of the root locale, but it
was not recognized until ICU 55. To maintain common behavior across
all supported ICU versions, check for "und" and replace with "root"
before opening.

Previously, the lack of support for "und" was dangerous, because
versions 54 and older fall back to the environment when a locale is
not found. If the user specified "und" for the language (which is
expected and documented), it could not only resolve to the wrong
collator, but it could unexpectedly change (which could lead to
corrupt indexes).

This effectively reverts commit d72900bded, which worked around the
problem for the built-in "unicode" collation, and is no longer
necessary.

Discussion: https://postgr.es/m/60da0cecfb512a78b8666b31631a636215d8ce73.camel@j-davis.com
Discussion: https://postgr.es/m/0c6fa66f2753217d2a40480a96bd2ccf023536a1.camel@j-davis.com
Reviewed-by: Peter Eisentraut
2023-03-23 10:08:27 -07:00
Tomas Vondra d0160ca11e Minor comment improvements for compress_lz4
Author: Tomas Vondra
Reviewed-by: Georgios Kokolatos, Justin Pryzby
Discussion: https://postgr.es/m/33496f7c-3449-1426-d568-63f6bca2ac1f@gmail.com
2023-03-23 17:55:52 +01:00
Tomas Vondra f081a48f9a Unify buffer sizes in pg_dump compression API
Prior to the introduction of the compression API in e9960732a9, pg_dump
would use the ZLIB_IN_SIZE/ZLIB_OUT_SIZE to size input/output buffers.
Commit 0da243fed0 introduced similar constants for LZ4, but while gzip
defined both buffers to be 4kB, LZ4 used 4kB and 16kB without any clear
reasoning why that's desirable.

Furthermore, parts of the code unaware of which compression is used
(e.g. pg_backup_directory.c) continued to use ZLIB_OUT_SIZE directly.

Simplify by replacing the various constants with DEFAULT_IO_BUFFER_SIZE,
set to 4kB. The compression implementations still have an option to use
a custom value, but considering 4kB was fine for 20+ years, I find that
unlikely (and we'd probably just increase the default buffer size).

Author: Georgios Kokolatos
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/33496f7c-3449-1426-d568-63f6bca2ac1f@gmail.com
2023-03-23 17:55:52 +01:00
Tomas Vondra d3b57755e6 Improve type handling in pg_dump's compress file API
After 0da243fed0 got committed, we've received a report about a compiler
warning, related to the new LZ4File_gets() function:

  compress_lz4.c: In function 'LZ4File_gets':
  compress_lz4.c:492:19: warning: comparison of unsigned expression in
                                  '< 0' is always false [-Wtype-limits]
    492 |         if (dsize < 0)

The reason is very simple - dsize is declared as size_t, which is an
unsigned integer, and thus the check is pointless and we might fail to
notice an error in some cases (or fail in a strange way a bit later).

The warning could have been silenced by simply changing the type, but we
realized the API mostly assumes all the libraries use the same types and
report errors the same way (e.g. by returning 0 and/or negative value).

But we can't make this assumption - the gzip/lz4 libraries already
disagree on some of this, and even if they did a library added in the
future might not.

The right solution is to define what the API does, and translate the
library-specific behavior in consistent way (so that the internal errors
are not exposed to users of our compression API). So this adjusts the
data types in a couple places, so that we don't miss library errors, and
simplifies and unifies the error reporting to simply return true/false
(instead of e.g. size_t).

While at it, make sure LZ4File_open_write() does not clobber errno in
case open_func() fails.

Author: Georgios Kokolatos
Reported-by: Alexander Lakhin
Reviewed-by: Tomas Vondra, Justin Pryzby
Discussion: https://postgr.es/m/33496f7c-3449-1426-d568-63f6bca2ac1f@gmail.com
2023-03-23 17:55:17 +01:00
Tom Lane c75a623304 Fix new test case to work on (some?) big-endian architectures.
Use of pack("L") gets around the basic endian problem, but it doesn't
deal with the fact that the order of the bitfields within the struct
may differ.  This patch fixes it to work with gcc on NetBSD/macppc,
but I wonder whether that will be enough --- in principle, there
could be four different combinations of bitpatterns needed here.

Discussion: https://postgr.es/m/1650745.1679513221@sss.pgh.pa.us
2023-03-22 17:14:21 -04:00
Tom Lane b48af6d174 Fix initdb's handling of min_wal_size and max_wal_size.
In commit 3e51b278d, I misinterpreted the coding in setup_config()
as setting min_wal_size and max_wal_size to compile-time-constant
values.  But it's not: there's a hidden dependency on --wal-segsize.
Therefore leaving these variables commented out is the wrong thing.
Per report from Andres Freund.

Discussion: https://postgr.es/m/20230322200751.jvfvsuuhd3hgm6vv@awork3.anarazel.de
2023-03-22 16:37:41 -04:00
Tom Lane 4fe2aa7656 Reduce memory leakage in initdb.
While testing commit 3e51b278d, I noted that initdb leaks about a
megabyte worth of data due to the sloppy bookkeeping in its
string-manipulating code.  That's not a huge amount on modern machines,
but it's still kind of annoying, and it's easy to fix by recognizing
that we might as well treat these arrays of strings as
modifiable-in-place.  There's no caller that cares about preserving
the old state of the array after replace_token or replace_guc_value.

With this fix, valgrind sees only a few hundred bytes leaked during
an initdb run.

Discussion: https://postgr.es/m/2844176.1674681919@sss.pgh.pa.us
2023-03-22 14:28:45 -04:00
Tom Lane 3e51b278db Add "-c name=value" switch to initdb.
This option, or its long form --set, sets the GUC "name" to "value".
The setting applies in the bootstrap and standalone servers run by
initdb, and is also written into the generated postgresql.conf.

This can save an extra editing step when creating a new cluster,
but the real use-case is for coping with situations where the
bootstrap server fails to start due to environmental issues;
for example, if it's necessary to force huge_pages to off.

Discussion: https://postgr.es/m/2844176.1674681919@sss.pgh.pa.us
2023-03-22 13:49:05 -04:00
Robert Haas bbc1376b39 Teach verify_heapam() to validate update chains within a page.
Prior to this commit, we only consider each tuple or line pointer
on the page in isolation, but now we can do some validation of a line
pointer against its successor. For example, a redirect line pointer
shouldn't point to another redirect line pointer, and if a tuple
is HOT-updated, the result should be a heap-only tuple.

Himanshu Upadhyaya and Robert Haas, reviewed by Aleksander Alekseev,
Andres Freund, and Peter Geoghegan.
2023-03-22 08:48:54 -04:00
Tom Lane b0d8f2d983 Add SHELL_ERROR and SHELL_EXIT_CODE magic variables to psql.
These are set after a \! command or a backtick substitution.
SHELL_ERROR is just "true" for error (nonzero exit status) or "false"
for success, while SHELL_EXIT_CODE records the actual exit status
following standard shell/system(3) conventions.

Corey Huinker, reviewed by Maxim Orlov and myself

Discussion: https://postgr.es/m/CADkLM=cWao2x2f+UDw15W1JkVFr_bsxfstw=NGea7r9m4j-7rQ@mail.gmail.com
2023-03-21 13:03:56 -04:00
Daniel Gustafsson 106f26a849 Avoid using atooid for numerical comparisons which arent Oids
The check for the number of roles in the target cluster for an upgrade
selects the existing roles and performs a COUNT(*) over the result.  A
value of one is the expected query result value indicating that only
the install user is present in the new cluster. The result was converted
with the function for converting a string containing an Oid into a numeric,
which avoids potential overflow but makes the code less readable since
it's not actually an Oid at all.

Discussion: https://postgr.es/m/41AB5F1F-4389-4B25-9668-5C430375836C@yesql.se
2023-03-21 12:57:21 +01:00
Peter Eisentraut 4c8044c044 pg_waldump: Allow hexadecimal values for -t/--timeline option
This makes it easier to specify values taken directly from WAL file
names.

The option parsing is arranged in the style of option_parse_int() (but
we need to parse unsigned int), to allow future refactoring in the
same manner.

Reviewed-by: Sébastien Lardière <sebastien@lardiere.net>
Discussion: https://www.postgresql.org/message-id/flat/8fef346e-2541-76c3-d768-6536ae052993@lardiere.net
2023-03-21 08:05:23 +01:00
Tom Lane 064709f803 Simplify and speed up pg_dump's creation of parent-table links.
Instead of trying to optimize this by skipping creation of the
links for tables we don't plan to dump, just create them all in
bulk with a single scan over the pg_inherits data.  The previous
approach was more or less O(N^2) in the number of pg_inherits
entries, not to mention being way too complicated.

Also, don't create useless TableAttachInfo objects.
It's silly to create a TableAttachInfo object that we're not
going to dump, when we know perfectly well at creation time
that it won't be dumped.

Patch by me; thanks to Julien Rouhaud for review.

Discussion: https://postgr.es/m/1376149.1675268279@sss.pgh.pa.us
2023-03-17 13:43:10 -04:00
Tom Lane bc8cd50fef Fix pg_dump for hash partitioning on enum columns.
Hash partitioning on an enum is problematic because the hash codes are
derived from the OIDs assigned to the enum values, which will almost
certainly be different after a dump-and-reload than they were before.
This means that some rows probably end up in different partitions than
before, causing restore to fail because of partition constraint
violations.  (pg_upgrade dodges this problem by using hacks to force
the enum values to keep the same OIDs, but that's not possible nor
desirable for pg_dump.)

Users can work around that by specifying --load-via-partition-root,
but since that's a dump-time not restore-time decision, one might
find out the need for it far too late.  Instead, teach pg_dump to
apply that option automatically when dealing with a partitioned
table that has hash-on-enum partitioning.

Also deal with a pre-existing issue for --load-via-partition-root
mode: in a parallel restore, we try to TRUNCATE target tables just
before loading them, in order to enable some backend optimizations.
This is bad when using --load-via-partition-root because (a) we're
likely to suffer deadlocks from restore jobs trying to restore rows
into other partitions than they came from, and (b) if we miss getting
a deadlock we might still lose data due to a TRUNCATE removing rows
from some already-completed restore job.

The fix for this is conceptually simple: just don't TRUNCATE if we're
dealing with a --load-via-partition-root case.  The tricky bit is for
pg_restore to identify those cases.  In dumps using COPY commands we
can inspect each COPY command to see if it targets the nominal target
table or some ancestor.  However, in dumps using INSERT commands it's
pretty impractical to examine the INSERTs in advance.  To provide a
solution for that going forward, modify pg_dump to mark TABLE DATA
items that are using --load-via-partition-root with a comment.
(This change also responds to a complaint from Robert Haas that
the dump output for --load-via-partition-root is pretty confusing.)
pg_restore checks for the special comment as well as checking the
COPY command if present.  This will fail to identify the combination
of --load-via-partition-root and --inserts in pre-existing dump files,
but that should be a pretty rare case in the field.  If it does
happen you will probably get a deadlock failure that you can work
around by not using parallel restore, which is the same as before
this bug fix.

Having done this, there seems no remaining reason for the alarmism
in the pg_dump man page about combining --load-via-partition-root
with parallel restore, so remove that warning.

Patch by me; thanks to Julien Rouhaud for review.  Back-patch to
v11 where hash partitioning was introduced.

Discussion: https://postgr.es/m/1376149.1675268279@sss.pgh.pa.us
2023-03-17 13:31:40 -04:00
Peter Eisentraut 95a828378e Fix incorrect format placeholders
Small fixup for 9637badd9f.
2023-03-17 07:48:24 +01:00
Michael Paquier 6f9ee74d45 Improve handling of psql \watch's interval argument
A failure in parsing the interval value defined in the \watch command
was silently switched to 1s of interval between two queries, which can
be confusing.  This commit improves the error handling, and a couple of
tests are added to check after:
- An incorrect value.
- An out-of-range value.
- A negative value.

A value of zero is able to work now, meaning that there is no interval
of time between two queries in a \watch loop.  No backpatch is done, as
it could break existing applications.

Author: Andrey Borodin
Reviewed-by: Kyotaro Horiguchi, Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/CAAhFRxiZ2-n_L1ErMm9AZjgmUK=qS6VHb+0SaMn8sqqbhF7How@mail.gmail.com
2023-03-16 09:32:36 +09:00
Tom Lane a563c24c95 Allow pg_dump to include/exclude child tables automatically.
This patch adds new pg_dump switches
    --table-and-children=pattern
    --exclude-table-and-children=pattern
    --exclude-table-data-and-children=pattern
which act the same as the existing --table, --exclude-table, and
--exclude-table-data switches, except that any partitions or
inheritance child tables of the table(s) matching the pattern
are also included or excluded.

Gilles Darold, reviewed by Stéphane Tachoires

Discussion: https://postgr.es/m/5aa393b5-5f67-8447-b83e-544516990ee2@migops.com
2023-03-14 16:09:03 -04:00
Andrew Dunstan 9f8377f7a2 Add a DEFAULT option to COPY FROM
This allows for a string which if an input field matches causes the
column's default value to be inserted. The advantage of this is that
the default can be inserted in some rows and not others, for which
non-default data is available.

The file_fdw extension is also modified to take allow use of this
option.

Israel Barth Rubio

Discussion: https://postgr.es/m/CAO_rXXAcqesk6DsvioOZ5zmeEmpUN5ktZf-9=9yu+DTr0Xr8Uw@mail.gmail.com
2023-03-13 10:01:56 -04:00
Peter Eisentraut d72900bded Improve support for UNICODE collation on older ICU
The recently added standard collation UNICODE (0d21d4b9bc) doesn't
give consistent results on some build farm members with old ICU
versions.  Apparently, the ICU locale specification 'und' (language
tag style) misbehaves on some older ICU versions.  Replacing it with
'' (ICU locale ID style) fixes it at least on some OS versions.  Let's
see what the build farm says.
2023-03-13 09:08:58 +01:00
Andres Freund a4f23f9b3c pg_amcheck: Minor test speedups
Freezing the relation N times and fetching the tuples one-by-one isn't that
cheap. On my machine this reduces test times by a bit less than one second, on
windows CI it's a few seconds.

Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/20230309001558.b7shzvio645ebdta@awork3.anarazel.de
2023-03-11 15:41:47 -08:00
Andres Freund 4f5d461e04 amcheck: Fix FullTransactionIdFromXidAndCtx() for xids before epoch 0
64bit xids can't represent xids before epoch 0 (see also be504a3e97). When
FullTransactionIdFromXidAndCtx() was passed such an xid, it'd create a 64bit
xid far into the future. Noticed while adding assertions in the course of
investigating be504a3e97, as amcheck's test create such xids.

To fix the issue, just return FirstNormalFullTransactionId in this case. A
freshly initdb'd cluster already has a newer horizon. The most minimal version
of this would make the messages for some detected corruptions differently
inaccurate. To make those cases accurate, switch
FullTransactionIdFromXidAndCtx() to use the 32bit modulo difference between
xid and nextxid to compute the 64bit xid, yielding sensible "in the future" /
"in the past" answers.

Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://postgr.es/m/20230108002923.cyoser3ttmt63bfn@awork3.anarazel.de
Backpatch: 14-, where heapam verification was introduced
2023-03-11 14:12:52 -08:00
Jeff Davis c45dc7ffbb initdb: derive encoding from locale for ICU; similar to libc.
Previously, the default encoding was derived from the locale when
using libc; while the default was always UTF-8 when using ICU. That
would throw an error when the locale was not compatible with UTF-8.

This commit causes initdb to derive the default encoding from the
locale for both providers. If --no-locale is specified (or if the
locale is C or POSIX), the default encoding will be UTF-8 for ICU
(because ICU does not support SQL_ASCII) and SQL_ASCII for libc.

Per buildfarm failure on system "hoverfly" related to commit
27b62377b4.

Discussion: https://postgr.es/m/d191d5841347301a8f1238f609471ddd957fc47e.camel%40j-davis.com
2023-03-10 10:51:24 -08:00