Coverity complained about not freeing some memory associated with
incrementally parsing backup manifests. To fix that, provide and use a new
shutdown function for the JsonManifestParseIncrementalState object, in
line with a suggestion from Tom Lane.
While analysing the problem, I noticed a buglet in freeing memory for
incremental json lexers. To fix that remove a bogus condition on
freeing the memory allocated for them.
Remove duplicate hash_string_pointer() function definitions by creating
a new inline function hash_string() for this purpose.
This has the added advantage of avoiding strlen() calls when doing hash
lookup. It's not clear how many of these are perfomance-sensitive
enough to benefit from that, but the simplification is worth it on
its own.
Reviewed by Jeff Davis
Discussion: https://postgr.es/m/CANWCAZbg_XeSeY0a_PqWmWqeRATvzTzUNYRLeT%2Bbzs%2BYQdC92g%40mail.gmail.com
This changes the three callers to json_parse_manifest() to use
json_parse_manifest_incremental_chunk() if appropriate. In the case of
the backend caller, since we don't know the size of the manifest in
advance we always call the incremental parser.
Author: Andrew Dunstan
Reviewed-By: Jacob Champion
Discussion: https://postgr.es/m/7b0a51d6-0d9d-7366-3a1a-f74397a02f55@dunslane.net
Allocate memory for the error message inside memory owned by the
JsonLexContext and move responsibility away from the caller for
freeing it. This means that we can partially revert b44669b2ca
as this is now safe to use in FRONTEND code. The motivation for
this comes from the OAuth and incremental JSON patchsets but it
also adds value on its own.
Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAOYmi+mWdTd6ujtyF7MsvXvk7ToLRVG_tYAcaGbQLvf=N4KrQw@mail.gmail.com
Before this patch, if you took a full backup on server A and then
tried to use the backup manifest to take an incremental backup on
server B, it wouldn't know that the manifest was from a different
server and so the incremental backup operation could potentially
complete without error. When you later tried to run pg_combinebackup,
you'd find out that your incremental backup was and always had been
invalid. That's poor timing, because nobody likes finding out about
backup problems only at restore time.
With this patch, you'll get an error when trying to take the (invalid)
incremental backup, which seems a lot nicer.
Amul Sul, revised by me. Review by Michael Paquier.
Discussion: http://postgr.es/m/CA+TgmoYLZzbSAMM3cAjV4Y+iCRZn-bR9H2+Mdz7NdaJFU1Zb5w@mail.gmail.com
Return a pointer to the manifest_data instead of individual pointers
to relevant data stored within the manifest_data object. The previous
approach scales poorly if we add more things to the backup manifest,
as has been proposed.
Amul Sul, reviewed by Sravan Velagandula, Michael Paquier, and me.
Discussion: http://postgr.es/m/CAAJ_b95=1LONf99-M_ep588fL_WgLJfdnb7XG4GWE7JDD22E4w@mail.gmail.com
There isn't a lot of user demand for AIX support, we have a bunch of
hacks to work around AIX-specific compiler bugs and idiosyncrasies,
and no one has stepped up to the plate to properly maintain it.
Remove support for AIX to get rid of that maintenance overhead. It's
still supported for stable versions.
The acute issue that triggered this decision was that after commit
8af2565248, the AIX buildfarm members have been hitting this
assertion:
TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)"), File: "md.c", Line: 472, PID: 2949728
Apperently the "pg_attribute_aligned(a)" attribute doesn't work on AIX
for values larger than PG_IO_ALIGN_SIZE, for a static const variable.
That could be worked around, but we decided to just drop the AIX support
instead.
Discussion: https://www.postgresql.org/message-id/20240224172345.32@rfd.leadboat.com
Reviewed-by: Andres Freund, Noah Misch, Thomas Munro
There are a lot of Perl scripts in the tree, mostly code generation
and TAP tests. Occasionally, these scripts produce warnings. These
are probably always mistakes on the developer side (true positives).
Typical examples are warnings from genbki.pl or related when you make
a mess in the catalog files during development, or warnings from tests
when they massage a config file that looks different on different
hosts, or mistakes during merges (e.g., duplicate subroutine
definitions), or just mistakes that weren't noticed because there is a
lot of output in a verbose build.
This changes all warnings into fatal errors, by replacing
use warnings;
by
use warnings FATAL => 'all';
in all Perl files.
Discussion: https://www.postgresql.org/message-id/flat/06f899fd-1826-05ab-42d6-adeb1fd5e200%40eisentraut.org
This makes it possible for the code to be easily reused by other
client-side tools, and/or by the server.
Patch by me. Review of this patch in particular by at least Peter
Eisentraut; reviewers for the patch series in general include Dilip
Kumar, Andres Fruend, David Steele, Álvaro Herrera, and Jakub Wartak.
Discussion: http://postgr.es/m/CA+TgmoZ6UGZVnSy5iak6s6+AXu_DewXovDjhLs3-su6nmU_x_g@mail.gmail.com
The old names were too generic, and would have applied to any binary
that made use of JsonManifestParseContext. Rename to make the names
specific to pg_verifybackup, since there are plans afoot to reuse
this infrastructure.
Per suggestion from Álvaro Herrra.
Discussion: http://postgr.es/m/202311131625.o7hzq3oukuyd@alvherre.pgsql
There is no real reason to just run multiple words together when
we can instead punctuate them with marks intended for that purpose.
Per suggestion from Álvaro Herrera.
Discussion: http://postgr.es/m/202311161021.nisg7imt7kyf@alvherre.pgsql
A PostgreSQL release tarball contains a number of prebuilt files, in
particular files produced by bison, flex, perl, and well as html and
man documentation. We have done this consistent with established
practice at the time to not require these tools for building from a
tarball. Some of these tools were hard to get, or get the right
version of, from time to time, and shipping the prebuilt output was a
convenience to users.
Now this has at least two problems:
One, we have to make the build system(s) work in two modes: Building
from a git checkout and building from a tarball. This is pretty
complicated, but it works so far for autoconf/make. It does not
currently work for meson; you can currently only build with meson from
a git checkout. Making meson builds work from a tarball seems very
difficult or impossible. One particular problem is that since meson
requires a separate build directory, we cannot make the build update
files like gram.h in the source tree. So if you were to build from a
tarball and update gram.y, you will have a gram.h in the source tree
and one in the build tree, but the way things work is that the
compiler will always use the one in the source tree. So you cannot,
for example, make any gram.y changes when building from a tarball.
This seems impossible to fix in a non-horrible way.
Second, there is increased interest nowadays in precisely tracking the
origin of software. We can reasonably track contributions into the
git tree, and users can reasonably track the path from a tarball to
packages and downloads and installs. But what happens between the git
tree and the tarball is obscure and in some cases non-reproducible.
The solution for both of these issues is to get rid of the step that
adds prebuilt files to the tarball. The tarball now only contains
what is in the git tree (*). Getting the additional build
dependencies is no longer a problem nowadays, and the complications to
keep these dual build modes working are significant. And of course we
want to get the meson build system working universally.
This commit removes the make distprep target altogether. The make
dist target continues to do its job, it just doesn't call distprep
anymore.
(*) - The tarball also contains the INSTALL file that is built at make
dist time, but not by distprep. This is unchanged for now.
The make maintainer-clean target, whose job it is to remove the
prebuilt files in addition to what make distclean does, is now just an
alias to make distprep. (In practice, it is probably obsolete given
that git clean is available.)
The following programs are now hard build requirements in configure
(they were already required by meson.build):
- bison
- flex
- perl
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e07408d9-e5f2-d9fd-5672-f53354e9305e@eisentraut.org
Since C99, there can be a trailing comma after the last value in an
enum definition. A lot of new code has been introducing this style on
the fly. Some new patches are now taking an inconsistent approach to
this. Some add the last comma on the fly if they add a new last
value, some are trying to preserve the existing style in each place,
some are even dropping the last comma if there was one. We could
nudge this all in a consistent direction if we just add the trailing
commas everywhere once.
I omitted a few places where there was a fixed "last" value that will
always stay last. I also skipped the header files of libpq and ecpg,
in case people want to use those with older compilers. There were
also a small number of cases where the enum type wasn't used anywhere
(but the enum values were), which ended up confusing pgindent a bit,
so I left those alone.
Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org
Previously, the JSON code didn't have to worry too much about freeing
JsonLexContext, because it was never too long-lived. With new features
being added for SQL/JSON this is no longer the case. Add a routine
that knows how to free this struct and apply that to a few places, to
prevent this from becoming problematic.
At the same time, we change the API of makeJsonLexContextCstringLen to
make it receive a pointer to JsonLexContext for callers that want it to
be stack-allocated; it can also be passed as NULL to get the original
behavior of a palloc'ed one.
This also causes an ABI break due to the addition of flags to
JsonLexContext, so we can't easily backpatch it. AFAICS that's not much
of a problem; apparently some leaks might exist in JSON usage of
text-search, for example via json_to_tsvector, but I haven't seen any
complaints about that.
Per Coverity complaint about datum_to_jsonb_internal().
Discussion: https://postgr.es/m/20230808174110.oq3iymllsv6amkih@alvherre.pgsql
Run pgindent, pgperltidy, and reformat-dat-files.
This set of diffs is a bit larger than typical. We've updated to
pg_bsd_indent 2.1.2, which properly indents variable declarations that
have multi-line initialization expressions (the continuation lines are
now indented one tab stop). We've also updated to perltidy version
20230309 and changed some of its settings, which reduces its desire to
add whitespace to lines to make assignments etc. line up. Going
forward, that should make for fewer random-seeming changes to existing
code.
Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql
zstd compression supports a special mode for finding matched in distant
past, which may result in better compression ratio, at the expense of
using more memory (the window size is 128MB).
To enable this optional mode, use the "long" keyword when specifying the
compression method (--compress=zstd:long).
Author: Justin Pryzby
Reviewed-by: Tomas Vondra, Jacob Champion
Discussion: https://postgr.es/m/20230224191840.GD1653@telsasoft.com
Discussion: https://postgr.es/m/20220327205020.GM28503@telsasoft.com
This adds a new option to pg_verifybackup called -P/--progress, showing
every second some information about the progress of the checksum
verification based on the data of a backup manifest.
Similarly to what is done for pg_rewind and pg_basebackup, the
information printed in the progress report consists of the current
amount of data computed and the total amount of data that will be
computed. Note that files found with an incorrect size do not have
their checksum verified, hence their size is not appended to the total
amount of data estimated during the first scan of the manifest data
(such incorrect sizes could be overly high, for one, falsifying the
progress report).
Author: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoC5+JOgMd4o3z_oxw0f8JDSsCYY7zSbhe-O9x7f33rw_A@mail.gmail.com
Formerly, semantic action functions for the JSON parser returned void,
so that there was no way for them to affect the parser's behavior.
That means in particular that they can't force an error exit except by
longjmp'ing. That won't do in the context of our project to make input
functions return errors softly. Hence, change them to return the same
JsonParseErrorType enum value as the parser itself uses. If an action
function returns anything besides JSON_SUCCESS, the parse is abandoned
and that error code is returned.
Action functions can thus easily return the same error conditions that
the parser already knows about. As an escape hatch for expansion, also
invent a code JSON_SEM_ACTION_FAILED that the core parser does not know
the exact meaning of. When returning this code, an action function
must use some out-of-band mechanism for reporting the error details.
This commit simply makes the API change and causes all the existing
action functions to return JSON_SUCCESS, so that there is no actual
change in behavior here. This is long enough and boring enough that
it seemed best to commit it separately from the changes that make
real use of the new mechanism.
In passing, remove a duplicate assignment of
transform_string_values_scalar.
Discussion: https://postgr.es/m/1436686.1670701118@sss.pgh.pa.us
The generated resource files aren't exactly the same ones as the old
buildsystems generate. Previously "InternalName" and "OriginalFileName" were
mostly wrong / not set (despite being required), but that was hard to fix in
at least the make build. Additionally, the meson build falls back to a
"auto-generated" description when not set, and doesn't set it in a few cases -
unlikely that anybody looks at these descriptions in detail.
Author: Andres Freund <andres@anarazel.de>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Autoconf is showing its age, fewer and fewer contributors know how to wrangle
it. Recursive make has a lot of hard to resolve dependency issues and slow
incremental rebuilds. Our home-grown MSVC build system is hard to maintain for
developers not using Windows and runs tests serially. While these and other
issues could individually be addressed with incremental improvements, together
they seem best addressed by moving to a more modern build system.
After evaluating different build system choices, we chose to use meson, to a
good degree based on the adoption by other open source projects.
We decided that it's more realistic to commit a relatively early version of
the new build system and mature it in tree.
This commit adds an initial version of a meson based build system. It supports
building postgres on at least AIX, FreeBSD, Linux, macOS, NetBSD, OpenBSD,
Solaris and Windows (however only gcc is supported on aix, solaris). For
Windows/MSVC postgres can now be built with ninja (faster, particularly for
incremental builds) and msbuild (supporting the visual studio GUI, but
building slower).
Several aspects (e.g. Windows rc file generation, PGXS compatibility, LLVM
bitcode generation, documentation adjustments) are done in subsequent commits
requiring further review. Other aspects (e.g. not installing test-only
extensions) are not yet addressed.
When building on Windows with msbuild, builds are slower when using a visual
studio version older than 2019, because those versions do not support
MultiToolTask, required by meson for intra-target parallelism.
The plan is to remove the MSVC specific build system in src/tools/msvc soon
after reaching feature parity. However, we're not planning to remove the
autoconf/make build system in the near future. Likely we're going to keep at
least the parts required for PGXS to keep working around until all supported
versions build with meson.
Some initial help for postgres developers is at
https://wiki.postgresql.org/wiki/Meson
With contributions from Thomas Munro, John Naylor, Stone Tickle and others.
Author: Andres Freund <andres@anarazel.de>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Author: Peter Eisentraut <peter@eisentraut.org>
Reviewed-By: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/20211012083721.hvixq4pnh2pixr3j@alap3.anarazel.de
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in optimizer, parser,
utility, libpq, and "commands" code, as well as in remaining library
code. Do the same for all code related to frontend programs (with the
exception of pg_dump/pg_dumpall related code).
Like other recent commits that cleaned up function parameter names, this
commit was written with help from clang-tidy. Later commits will handle
ecpg and pg_dump/pg_dumpall.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com
More than twenty years ago (79fcde48b), we hacked the postmaster
to avoid a core-dump on systems that didn't support fflush(NULL).
We've mostly, though not completely, hewed to that rule ever since.
But such systems are surely gone in the wild, so in the spirit of
cleaning out no-longer-needed portability hacks let's get rid of
multiple per-file fflush() calls in favor of using fflush(NULL).
Also, we were fairly inconsistent about whether to fflush() before
popen() and system() calls. While we've received no bug reports
about that, it seems likely that at least some of these call sites
are at risk of odd behavior, such as error messages appearing in
an unexpected order. Rather than expend a lot of brain cells
figuring out which places are at hazard, let's just establish a
uniform coding rule that we should fflush(NULL) before these calls.
A no-op fflush() is surely of trivial cost compared to launching
a sub-process via a shell; while if it's not a no-op then we likely
need it.
Discussion: https://postgr.es/m/2923412.1661722825@sss.pgh.pa.us
This reverts commit 617d691412.
While I still think the basic idea is attractive, we need to sort
out what happens with built .c files, and there also seem to be
VPATH issues.
The backend already used a mechanically-generated list of *.c files,
but everywhere else we had a manually-written-out list of files in
which to seek translatable messages. Commit b0a55e432 contains the
latest in a long line of failures to update those lists. Rather than
manually fix its oversight, let's change to using "$(wildcard *.c)"
in all these nls.mk files.
Many of these files also have manual references to some *.c files
in other directories, most often src/common/. Perhaps we should try
to improve that situation too; but it's a bit less clear how, so for
now just fix the local file references.
Kyotaro Horiguchi and Tom Lane
Discussion: https://postgr.es/m/20220713.160853.453362706160476128.horikyota.ntt@gmail.com
This moves the list of available languages from nls.mk into a separate
file called po/LINGUAS. Advantages:
- It keeps the parts notionally managed by programmers (nls.mk)
separate from the parts notionally managed by translators (LINGUAS).
- It's the standard practice recommended by the Gettext manual
nowadays.
- The Meson build system also supports this layout (and of course
doesn't know anything about our custom nls.mk), so this would enable
sharing the list of languages between the two build systems.
(The MSVC build system currently finds all po files by globbing, so it
is not affected by this change.)
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/557a9f5c-e871-edc7-2f58-a4140fb65b7b@enterprisedb.com
Get rid of the separate "FATAL" log level, as it was applied
so inconsistently as to be meaningless. This mostly involves
s/pg_log_fatal/pg_log_error/g.
Create a macro pg_fatal() to handle the common use-case of
pg_log_error() immediately followed by exit(1). Various
modules had already invented either this or equivalent macros;
standardize on pg_fatal() and apply it where possible.
Invent the ability to add "detail" and "hint" messages to a
frontend message, much as we have long had in the backend.
Except where rewording was needed to convert existing coding
to detail/hint style, I have (mostly) resisted the temptation
to change existing message wording.
Patch by me. Design and patch reviewed at various stages by
Robert Haas, Kyotaro Horiguchi, Peter Eisentraut and
Daniel Gustafsson.
Discussion: https://postgr.es/m/1363732.1636496441@sss.pgh.pa.us
libzstd allows transparent parallel compression just by setting
an option when creating the compression context, so permit that
for both client and server-side backup compression. To use this,
use something like pg_basebackup --compress WHERE-zstd:workers=N
where WHERE is "client" or "server" and N is an integer.
When compression is performed on the server side, this will spawn
threads inside the PostgreSQL backend. While there is almost no
PostgreSQL server code which is thread-safe, the threads here are used
internally by libzstd and touch only data structures controlled by
libzstd.
Patch by me, based in part on earlier work by Dipesh Pandit
and Jeevan Ladhe. Reviewed by Justin Pryzby.
Discussion: http://postgr.es/m/CA+Tgmobj6u-nWF-j=FemygUhobhryLxf9h-wJN7W-2rSsseHNA@mail.gmail.com
The Config and Cwd modules were no longer used, but remained imported,
in a number of tests. Remove to keep the imports to the actually used
modules.
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/A5A074CD-3198-492B-BE5E-7961EFC3733F@yesql.se
These tests were added recently, but older code tests USE_LZ4 rathr
than HAVE_LIBLZ4, so let's follow the established precedent. It
also seems more consistent with the intent of the configure tests,
since I think that the USE_* symbols are intended to correspond to
what the user requested, and the HAVE_* symbols to what configure
found while probing.
Discussion: http://postgr.es/m/CA+Tgmoap+hTD2-QNPJLH4tffeFE8MX5+xkbFKMU3FKBy=ZSNKA@mail.gmail.com
Both client-side compression and server-side compression are now
supported for zstd. In addition, a backup compressed by the server
using zstd can now be decompressed by the client in order to
accommodate the use of -Fp.
Jeevan Ladhe, with some edits by me.
Discussion: http://postgr.es/m/CA+Tgmobyzfbz=gyze2_LL1ZumZunmaEKbHQxjrFkOR7APZGu-g@mail.gmail.com
Some environments may compile with --with-lz4 while the command "lz4"
goes missing, causing two failures in the TAP tests of pg_verifybackup
(008_untar.pl and 010_client_untar.pl) as the code assumed that the
command always existed with a hardcoded value in src/Makefile.global.
Rather than this method, this adds a ./configure check based on
PGAC_PATH_PROGS() to find automatically the command and get an absolute
path to it.
Both tests need to be adjusted for the case where the command does not
exist, actually, as Makefile.global would set now LZ4 to an empty value
in this case. The TAP tests of pg_receivewal already do that.
Per report from buildfarm member copperhead, as an effect of dab2984.
The origin of the failure is actually babbbb5 that did not centralize
the check for the existence of a "lz4" command at ./configure to shave a
few cycles. Note that one just needs to tweak an environment to move
"lz4" out of the way to reproduce the problem, which is what I did to
test this change.
Per discussion with Robert Haas, Tom Lane, Andres Freund and myself.
Discussion: https://postgr.es/m/Ygc51WVAFGocSu4h@paquier.xyz
Rather than doing manual book keeping to plan the number of tests to run
in each TAP suite, conclude each run with done_testing() summing up the
the number of tests that ran. This removes the need for maintaning and
updating the plan count at the expense of an accurate count of remaining
during the test suite runtime.
This patch has been discussed a number of times, often in the context of
other patches which updates tests, so a larger number of discussions can
be found in the archives.
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/DD399313-3D56-4666-8079-88949DAC870F@yesql.se
LZ4 compression can now be performed on the client using
pg_basebackup -Ft --compress client-lz4, and LZ4 decompression of
a backup compressed on the server can be performed on the client
using pg_basebackup -Fp --compress server-lz4.
Dipesh Pandit, reviewed and tested by Jeevan Ladhe and Tushar Ahuja,
with a few corrections - and some documentation - by me.
Discussion: http://postgr.es/m/CAN1g5_FeDmiA9D8wdG2W6Lkq5CpubxOAqTmd2et9hsinTJtsMQ@mail.gmail.com
LZ4 compression can be a lot faster than gzip compression, so users
may prefer it even if the compression ratio is not as good. We will
want pg_basebackup to support LZ4 compression and decompression on the
client side as well, and there is a pending patch for that, but it's
by a different author, so I am committing this part separately for
that reason.
Jeevan Ladhe, reviewed by Tushar Ahuja and by me.
Discussion: http://postgr.es/m/CANm22Cg9cArXEaYgHVZhCnzPLfqXCZLAzjwTq7Fc0quXRPfbxA@mail.gmail.com