Commit Graph

463 Commits

Author SHA1 Message Date
Peter Eisentraut 97d85be365 Make the order of the header file includes consistent
Similar to commit 7e735035f2.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4-WhpCFMbXCjtJ%2BFzmjfPrp7Hw1pk4p%2BZpU95Kh3ofZ1A%40mail.gmail.com
2024-03-13 15:07:00 +01:00
Peter Eisentraut dbbca2cf29 Remove unused #include's from backend .c files
as determined by include-what-you-use (IWYU)

While IWYU also suggests to *add* a bunch of #include's (which is its
main purpose), this patch does not do that.  In some cases, a more
specific #include replaces another less specific one.

Some manual adjustments of the automatic result:

- IWYU currently doesn't know about includes that provide global
  variable declarations (like -Wmissing-variable-declarations), so
  those includes are being kept manually.

- All includes for port(ability) headers are being kept for now, to
  play it safe.

- No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the
  patch from exploding in size.

Note that this patch touches just *.c files, so nothing declared in
header files changes in hidden ways.

As a small example, in src/backend/access/transam/rmgr.c, some IWYU
pragma annotations are added to handle a special case there.

Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
2024-03-04 12:02:20 +01:00
Michael Paquier 655dc31046 Simplify pg_enc2gettext_tbl[] with C99-designated initializer syntax
This commit switches pg_enc2gettext_tbl[] in encnames.c to use a
C99-designated initializer syntax.

pg_bind_textdomain_codeset() is simplified so as it is possible to do
a direct lookup at the gettext() array with a value of the enum pg_enc
rather than doing a loop through all its elements, as long as the
encoding value provided by GetDatabaseEncoding() is in the correct range
of supported encoding values.  Note that PG_MULE_INTERNAL gains a value
in the array, pointing to NULL.

Author: Jelte Fennema-Nio
Discussion: https://postgr.es/m/CAGECzQT3caUbcCcszNewCCmMbCuyP7XNAm60J3ybd6PN5kH2Dw@mail.gmail.com
2024-03-01 18:03:48 +09:00
Bruce Momjian 29275b1d17 Update copyright for 2024
Reported-by: Michael Paquier

Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz

Backpatch-through: 12
2024-01-03 20:49:05 -05:00
Peter Eisentraut c538592959 Make all Perl warnings fatal
There are a lot of Perl scripts in the tree, mostly code generation
and TAP tests.  Occasionally, these scripts produce warnings.  These
are probably always mistakes on the developer side (true positives).
Typical examples are warnings from genbki.pl or related when you make
a mess in the catalog files during development, or warnings from tests
when they massage a config file that looks different on different
hosts, or mistakes during merges (e.g., duplicate subroutine
definitions), or just mistakes that weren't noticed because there is a
lot of output in a verbose build.

This changes all warnings into fatal errors, by replacing

    use warnings;

by

    use warnings FATAL => 'all';

in all Perl files.

Discussion: https://www.postgresql.org/message-id/flat/06f899fd-1826-05ab-42d6-adeb1fd5e200%40eisentraut.org
2023-12-29 18:20:00 +01:00
Peter Eisentraut d327f418d1 Remove unused macro
Usage was removed in 6c5576075b but the definition was not removed.
2023-12-26 20:13:11 +01:00
Nathan Bossart b14b1eb4da Teach convert() and friends to avoid copying when possible.
Presently, pg_convert() allocates a new bytea and copies the result
regardless of whether any conversion actually happened.  This
commit adjusts this function to return the source pointer as-is if
no conversion occurred.  This optimization isn't expected to make a
tremendous difference, but it still seems worthwhile to avoid
unnecessary memory allocations.

Author: Yurii Rashkovskii
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/CA%2BRLCQyknBPSWXRBQGOi6aYEcdQ9RpH9Kch4GjoeY8dQ3D%2Bvhw%40mail.gmail.com
2023-12-04 11:55:18 -06:00
Peter Eisentraut 721856ff24 Remove distprep
A PostgreSQL release tarball contains a number of prebuilt files, in
particular files produced by bison, flex, perl, and well as html and
man documentation.  We have done this consistent with established
practice at the time to not require these tools for building from a
tarball.  Some of these tools were hard to get, or get the right
version of, from time to time, and shipping the prebuilt output was a
convenience to users.

Now this has at least two problems:

One, we have to make the build system(s) work in two modes: Building
from a git checkout and building from a tarball.  This is pretty
complicated, but it works so far for autoconf/make.  It does not
currently work for meson; you can currently only build with meson from
a git checkout.  Making meson builds work from a tarball seems very
difficult or impossible.  One particular problem is that since meson
requires a separate build directory, we cannot make the build update
files like gram.h in the source tree.  So if you were to build from a
tarball and update gram.y, you will have a gram.h in the source tree
and one in the build tree, but the way things work is that the
compiler will always use the one in the source tree.  So you cannot,
for example, make any gram.y changes when building from a tarball.
This seems impossible to fix in a non-horrible way.

Second, there is increased interest nowadays in precisely tracking the
origin of software.  We can reasonably track contributions into the
git tree, and users can reasonably track the path from a tarball to
packages and downloads and installs.  But what happens between the git
tree and the tarball is obscure and in some cases non-reproducible.

The solution for both of these issues is to get rid of the step that
adds prebuilt files to the tarball.  The tarball now only contains
what is in the git tree (*).  Getting the additional build
dependencies is no longer a problem nowadays, and the complications to
keep these dual build modes working are significant.  And of course we
want to get the meson build system working universally.

This commit removes the make distprep target altogether.  The make
dist target continues to do its job, it just doesn't call distprep
anymore.

(*) - The tarball also contains the INSTALL file that is built at make
dist time, but not by distprep.  This is unchanged for now.

The make maintainer-clean target, whose job it is to remove the
prebuilt files in addition to what make distclean does, is now just an
alias to make distprep.  (In practice, it is probably obsolete given
that git clean is available.)

The following programs are now hard build requirements in configure
(they were already required by meson.build):

- bison
- flex
- perl

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e07408d9-e5f2-d9fd-5672-f53354e9305e@eisentraut.org
2023-11-06 15:18:04 +01:00
Tom Lane 3aff1d3fd0 Doc: improve cross-reference in Makefile comment.
Per gripe from Japin Li.

Discussion: https://postgr.es/m/MEYP282MB16692171F13B5DF40DB768EEB6FCA@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
2023-09-25 11:25:19 -04:00
Tom Lane 0245f8db36 Pre-beta mechanical code beautification.
Run pgindent, pgperltidy, and reformat-dat-files.

This set of diffs is a bit larger than typical.  We've updated to
pg_bsd_indent 2.1.2, which properly indents variable declarations that
have multi-line initialization expressions (the continuation lines are
now indented one tab stop).  We've also updated to perltidy version
20230309 and changed some of its settings, which reduces its desire to
add whitespace to lines to make assignments etc. line up.  Going
forward, that should make for fewer random-seeming changes to existing
code.

Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql
2023-05-19 17:24:48 -04:00
Michael Paquier 8961cb9a03 Fix typos in comments
The changes done in this commit impact comments with no direct
user-visible changes, with fixes for incorrect function, variable or
structure names.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/e8c38840-596a-83d6-bd8d-cebc51111572@gmail.com
2023-05-02 12:23:08 +09:00
Peter Eisentraut d952373a98 New header varatt.h split off from postgres.h
This new header contains all the variable-length data types support
(TOAST support) from postgres.h, which isn't needed by large parts of
the backend code.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/ddcce239-0f29-6e62-4b47-1f8ca742addf%40enterprisedb.com
2023-01-10 05:54:36 +01:00
Bruce Momjian c8e1ba736b Update copyright for 2023
Backpatch-through: 11
2023-01-02 15:00:37 -05:00
Andrew Dunstan 8284cf5f74 Add copyright notices to meson files
Discussion: https://postgr.es/m/222b43a5-2fb3-2c1b-9cd0-375d376c8246@dunslane.net
2022-12-20 07:54:39 -05:00
Tom Lane c60c9badba Convert json_in and jsonb_in to report errors softly.
This requires a bit of further infrastructure-extension to allow
trapping errors reported by numeric_in and pg_unicode_to_server,
but otherwise it's pretty straightforward.

In the case of jsonb_in, we are only capturing errors reported
during the initial "parse" phase.  The value-construction phase
(JsonbValueToJsonb) can also throw errors if assorted implementation
limits are exceeded.  We should improve that, but it seems like a
separable project.

Andrew Dunstan and Tom Lane

Discussion: https://postgr.es/m/3bac9841-fe07-713d-fa42-606c225567d6@dunslane.net
2022-12-11 11:28:15 -05:00
Andres Freund 902ab2fcef meson: Add windows resource files
The generated resource files aren't exactly the same ones as the old
buildsystems generate. Previously "InternalName" and "OriginalFileName" were
mostly wrong / not set (despite being required), but that was hard to fix in
at least the make build. Additionally, the meson build falls back to a
"auto-generated" description when not set, and doesn't set it in a few cases -
unlikely that anybody looks at these descriptions in detail.

Author: Andres Freund <andres@anarazel.de>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
2022-10-05 09:56:05 -07:00
Peter Eisentraut c8b2ef05f4 Convert *GetDatum() and DatumGet*() macros to inline functions
The previous macro implementations just cast the argument to a target
type but did not check whether the input type was appropriate.  The
function implementation can do better type checking of the input type.

For the *GetDatumFast() macros, converting to an inline function
doesn't work in the !USE_FLOAT8_BYVAL case, but we can use
AssertVariableIsOfTypeMacro() to get a similar level of type checking.

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/8528fb7e-0aa2-6b54-85fb-0c0886dbd6ed%40enterprisedb.com
2022-09-27 20:50:21 +02:00
Andres Freund e6927270cd meson: Add initial version of meson based build system
Autoconf is showing its age, fewer and fewer contributors know how to wrangle
it. Recursive make has a lot of hard to resolve dependency issues and slow
incremental rebuilds. Our home-grown MSVC build system is hard to maintain for
developers not using Windows and runs tests serially. While these and other
issues could individually be addressed with incremental improvements, together
they seem best addressed by moving to a more modern build system.

After evaluating different build system choices, we chose to use meson, to a
good degree based on the adoption by other open source projects.

We decided that it's more realistic to commit a relatively early version of
the new build system and mature it in tree.

This commit adds an initial version of a meson based build system. It supports
building postgres on at least AIX, FreeBSD, Linux, macOS, NetBSD, OpenBSD,
Solaris and Windows (however only gcc is supported on aix, solaris). For
Windows/MSVC postgres can now be built with ninja (faster, particularly for
incremental builds) and msbuild (supporting the visual studio GUI, but
building slower).

Several aspects (e.g. Windows rc file generation, PGXS compatibility, LLVM
bitcode generation, documentation adjustments) are done in subsequent commits
requiring further review. Other aspects (e.g. not installing test-only
extensions) are not yet addressed.

When building on Windows with msbuild, builds are slower when using a visual
studio version older than 2019, because those versions do not support
MultiToolTask, required by meson for intra-target parallelism.

The plan is to remove the MSVC specific build system in src/tools/msvc soon
after reaching feature parity. However, we're not planning to remove the
autoconf/make build system in the near future. Likely we're going to keep at
least the parts required for PGXS to keep working around until all supported
versions build with meson.

Some initial help for postgres developers is at
https://wiki.postgresql.org/wiki/Meson

With contributions from Thomas Munro, John Naylor, Stone Tickle and others.

Author: Andres Freund <andres@anarazel.de>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Author: Peter Eisentraut <peter@eisentraut.org>
Reviewed-By: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/20211012083721.hvixq4pnh2pixr3j@alap3.anarazel.de
2022-09-21 22:37:17 -07:00
Peter Geoghegan bfcf1b3480 Harmonize parameter names in storage and AM code.
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in storage, catalog,
access method, executor, and logical replication code, as well as in
miscellaneous utility/library code.

Like other recent commits that cleaned up function parameter names, this
commit was written with help from clang-tidy.  Later commits will do the
same for other parts of the codebase.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com
2022-09-19 19:18:36 -07:00
Peter Eisentraut e8d78581bb Revert "Convert *GetDatum() and DatumGet*() macros to inline functions"
This reverts commit 595836e99b.

It has problems when USE_FLOAT8_BYVAL is off.
2022-09-12 19:57:07 +02:00
Peter Eisentraut 595836e99b Convert *GetDatum() and DatumGet*() macros to inline functions
The previous macro implementations just cast the argument to a target
type but did not check whether the input type was appropriate.  The
function implementation can do better type checking of the input type.

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://www.postgresql.org/message-id/flat/8528fb7e-0aa2-6b54-85fb-0c0886dbd6ed%40enterprisedb.com
2022-09-12 17:36:26 +02:00
Bruce Momjian 27b77ecf9f Update copyright for 2022
Backpatch-through: 10
2022-01-07 19:04:57 -05:00
Peter Eisentraut e752727195 Make Unicode makefile parallel-safe
Fix the rules so that each rule is parallel safe, using the same
trickery that we use elsewhere in the tree for rules that produce more
than one output file.  Refactor the whole makefile so that there is
less repetition.

Discussion: https://www.postgresql.org/message-id/18e34084-aab1-1b4c-edd1-c4f9fb04f714%40enterprisedb.com
2021-10-04 20:26:48 +02:00
Peter Eisentraut ce27c8953e Update Unicode map text files
A couple of newer ones are available.  There are no functional
differences, but let's get them in anyway, so that there is no
surprise diff next time someone wants to do some actual work in this
area.
2021-10-04 13:02:58 +02:00
Peter Eisentraut 86a1aae764 Fix typo in comment
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/20210716.170209.175434392011070182.horikyota.ntt%40gmail.com
2021-07-22 09:37:35 +02:00
John Naylor 004874b72e Add missing check of noError parameter in euc_tw_and_big5.c
Oversight in ea1b99a66

Yukun Wang

Backpatch to v14 where this parameter was introduced

Discussion: https://www.postgresql.org/message-id/flat/OS0PR01MB6003FCEFF0201EF21685FD33B4E39%40OS0PR01MB6003.jpnprd01.prod.outlook.com
2021-07-21 09:11:32 -04:00
Peter Eisentraut 344dedfd1c Remove some whitespace in generated C output
It doesn't match the normal coding style.

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://www.postgresql.org/message-id/flat/22016aa9-ca59-15c7-01df-f292cb558c4d@enterprisedb.com
2021-07-19 09:48:14 +02:00
Peter Eisentraut 4d56115f72 Make UCS_to_most.pl process encodings in sorted order
This just makes the progress output easier to follow.

Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://www.postgresql.org/message-id/flat/22016aa9-ca59-15c7-01df-f292cb558c4d@enterprisedb.com
2021-07-19 09:48:09 +02:00
Tom Lane def5b065ff Initial pgindent and pgperltidy run for v14.
Also "make reformat-dat-files".

The only change worthy of note is that pgindent messed up the formatting
of launcher.c's struct LogicalRepWorkerId, which led me to notice that
that struct wasn't used at all anymore, so I just took it out.
2021-05-12 13:14:10 -04:00
Heikki Linnakangas ea1b99a661 Add 'noError' argument to encoding conversion functions.
With the 'noError' argument, you can try to convert a buffer without
knowing the character boundaries beforehand. The functions now need to
return the number of input bytes successfully converted.

This is is a backwards-incompatible change, if you have created a custom
encoding conversion with CREATE CONVERSION. This adds a check to
pg_upgrade for that, refusing the upgrade if there are any user-defined
encoding conversions. Custom conversions are very rare, there are no
commonly used extensions that I know of that uses that feature. No other
objects can depend on conversions, so if you do have one, you can fairly
easily drop it before upgrading, and recreate it after the upgrade with
an updated version.

Add regression tests for built-in encoding conversions. This doesn't cover
every conversion, but it covers all the internal functions in conv.c that
are used to implement the conversions.

Reviewed-by: John Naylor
Discussion: https://www.postgresql.org/message-id/e7861509-3960-538a-9025-b75a61188e01%40iki.fi
2021-04-01 11:45:22 +03:00
Heikki Linnakangas 6c5576075b Add direct conversion routines between EUC_TW and Big5.
Conversions between EUC_TW and Big5 were previously implemented by
converting the whole input to MIC first, and then from MIC to the target
encoding. Implement functions to convert directly between the two.

The reason to do this now is that I'm working on a patch that will change
the conversion function signature so that if the input is invalid, we
convert as much as we can and return the number of bytes successfully
converted. That's not possible if we use an intermediary format, because
if an error happens in the intermediary -> final conversion, we lose track
of the location of the invalid character in the original input. Avoiding
the intermediate step makes the conversions faster, too.

Reviewed-by: John Naylor
Discussion: https://www.postgresql.org/message-id/b9e3167f-f84b-7aa4-5738-be578a4db924%40iki.fi
2021-01-28 14:53:03 +02:00
Heikki Linnakangas b80e10638e Add mbverifystr() functions specific to each encoding.
This makes pg_verify_mbstr() function faster, by allowing more efficient
encoding-specific implementations. All the implementations included in
this commit are pretty naive, they just call the same encoding-specific
verifychar functions that were used previously, but that already gives a
performance boost because the tight character-at-a-time loop is simpler.

Reviewed-by: John Naylor
Discussion: https://www.postgresql.org/message-id/e7861509-3960-538a-9025-b75a61188e01@iki.fi
2021-01-28 14:40:07 +02:00
Bruce Momjian ca3b37487b Update copyright for 2021
Backpatch-through: 9.5
2021-01-02 13:06:25 -05:00
Michael Paquier 788dd0b839 Fix some typos
Author: Daniel Gustafsson
Discussion: https://postgr.es/m/C36ADFDF-D09A-4EE5-B186-CB46C3653F4C@yesql.se
2020-11-14 11:43:10 +09:00
Thomas Munro a5073871ea Fix conversion table generator scripts.
convutils.pm used implicit conversion of undefined value to integer
zero.  Some of conversion scripts are susceptible to regexp greediness.
Fix, avoiding whitespace changes in the output.  Also update ICU URLs
that moved.

No need to back-patch, because the output of these scripts is also in
the source tree so we shouldn't need to rerun them on back-branches.

Author: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGJ7SEGLbj%3D%3DTQCcyKRA9aqj8%2B6L%3DexSq1y25TA%3DWxLziQ%40mail.gmail.com
2020-07-22 16:50:03 +12:00
Alvaro Herrera 17cc133f01
Dial back -Wimplicit-fallthrough to level 3
The additional pain from level 4 is excessive for the gain.

Also revert all the source annotation changes to their original
wordings, to avoid back-patching pain.

Discussion: https://postgr.es/m/31166.1589378554@sss.pgh.pa.us
2020-05-13 15:31:14 -04:00
Alvaro Herrera 3e9744465d
Add -Wimplicit-fallthrough to CFLAGS and CXXFLAGS
Use it at level 4, a bit more restrictive than the default level, and
tweak our commanding comments to FALLTHROUGH.

(However, leave zic.c alone, since it's external code; to avoid the
warnings that would appear there, change CFLAGS for that file in the
Makefile.)

Author: Julien Rouhaud <rjuju123@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20200412081825.qyo5vwwco3fv4gdo@nol
Discussion: https://postgr.es/m/flat/E1fDenm-0000C8-IJ@gemulon.postgresql.org
2020-05-12 16:07:30 -04:00
Andrew Dunstan 7be5d8df1f Use perl warnings pragma consistently
We've had a mixture of the warnings pragma, the -w switch on the shebang
line, and no warnings at all. This patch removes the -w swicth and add
the warnings pragma to all perl sources missing it. It raises the
severity of the TestingAndDebugging::RequireUseWarnings  perlcritic
policy to level 5, so that we catch any future violations.

Discussion: https://postgr.es/m/20200412074245.GB623763@rfd.leadboat.com
2020-04-13 11:55:45 -04:00
Tom Lane 0b34e7d307 Improve user control over truncation of logged bind-parameter values.
This patch replaces the boolean GUC log_parameters_on_error introduced
by commit ba79cb5dc with an integer log_parameter_max_length_on_error,
adding the ability to specify how many bytes to trim each logged
parameter value to.  (The previous coding hard-wired that choice at
64 bytes.)

In addition, add a new parameter log_parameter_max_length that provides
similar control over truncation of query parameters that are logged in
response to statement-logging options, as opposed to errors.  Previous
releases always logged such parameters in full, possibly causing log
bloat.

For backwards compatibility with prior releases,
log_parameter_max_length defaults to -1 (log in full), while
log_parameter_max_length_on_error defaults to 0 (no logging).

Per discussion, log_parameter_max_length is SUSET since the DBA should
control routine logging behavior, but log_parameter_max_length_on_error
is USERSET because it also affects errcontext data sent back to the
client.

Alexey Bashtanov, editorialized a little by me

Discussion: https://postgr.es/m/b10493cc-a399-a03a-67c7-068f2791ee50@imap.cc
2020-04-02 15:04:51 -04:00
Magnus Hagander 087d3d0583 Fix assorted typos
Author: Daniel Gustafsson <daniel@yesql.se>
2020-03-31 16:00:06 +02:00
Tom Lane c8e8b2f9df Marginal comments and docs cleanup.
Fix up some imprecise comments and poor markup from ba79cb5dc.  Also try
to convert the documentation of log_min_duration_sample and friends into
passable English.
2020-03-10 17:34:09 -04:00
Tom Lane a6525588b7 Allow Unicode escapes in any server encoding, not only UTF-8.
SQL includes provisions for numeric Unicode escapes in string
literals and identifiers.  Previously we only accepted those
if they represented ASCII characters or the server encoding
was UTF-8, making the conversion to internal form trivial.
This patch adjusts things so that we'll call the appropriate
encoding conversion function in less-trivial cases, allowing
the escape sequence to be accepted so long as it corresponds
to some character available in the server encoding.

This also applies to processing of Unicode escapes in JSONB.
However, the old restriction still applies to client-side
JSON processing, since that hasn't got access to the server's
encoding conversion infrastructure.

This patch includes some lexer infrastructure that simplifies
throwing errors with error cursors pointing into the middle of
a string (or other complex token).  For the moment I only used
it for errors relating to Unicode escapes, but we might later
expand the usage to some other cases.

Patch by me, reviewed by John Naylor.

Discussion: https://postgr.es/m/2393.1578958316@sss.pgh.pa.us
2020-03-06 14:17:43 -05:00
Tom Lane 5afaa2e426 Rationalize code placement between wchar.c, encnames.c, and mbutils.c.
Move all the backend-only code that'd crept into wchar.c and encnames.c
into mbutils.c.

To remove the last few #ifdef dependencies from wchar.c and encnames.c,
also make the following changes:

* Adjust get_encoding_name_for_icu to return NULL, not throw an error,
for unsupported encodings.  Its sole caller can perfectly well throw an
error instead.  (While at it, I also made this function and its sibling
is_encoding_supported_by_icu proof against out-of-range encoding IDs.)

* Remove the overlength-name error condition from pg_char_to_encoding.
It's completely silly not to treat that just like any other
the-name-is-not-in-the-table case.

Also, get rid of pg_mic_mblen --- there's no obvious reason why
conv.c shouldn't call pg_mule_mblen instead.

Other than that, this is just code movement and comment-polishing with
no functional changes.  Notably, I reordered declarations in pg_wchar.h
to show which functions are frontend-accessible and which are not.

Discussion: https://postgr.es/m/CA+TgmoYO8oq-iy8E02rD8eX25T-9SmyxKWqqks5OMHxKvGXpXQ@mail.gmail.com
2020-01-16 18:08:21 -05:00
Tom Lane e6afa8918c Move wchar.c and encnames.c to src/common/.
Formerly, various frontend directories symlinked these two sources
and then built them locally.  That's an ancient, ugly hack, and
we now have a much better way: put them into libpgcommon.
So do that.  (The immediate motivation for this is the prospect
of having to introduce still more symlinking if we don't.)

This commit moves these two files absolutely verbatim, for ease of
reviewing the git history.  There's some follow-on work to be done
that will modify them a bit.

Robert Haas, Tom Lane

Discussion: https://postgr.es/m/CA+TgmoYO8oq-iy8E02rD8eX25T-9SmyxKWqqks5OMHxKvGXpXQ@mail.gmail.com
2020-01-16 15:58:55 -05:00
Peter Eisentraut f85a485f89 Add support for automatically updating Unicode derived files
We currently have several sets of files generated from data provided
by Unicode.  These all have ad hoc rules and instructions for updating
when new Unicode versions appear, and it's not done consistently.

This patch centralizes and automates the process and makes it part of
the release checklist.  The Unicode and CLDR versions are specified in
Makefile.global.in.  There is a new make target "update-unicode" that
downloads all the relevant files and runs the generation script.

There is also a new script for generating the table of combining
characters for ucs_wcwidth().  That table is now in a separate include
file rather than hardcoded into the middle of other code.  This is
based on the script that was used for generating
d8594d123c, but the script itself wasn't
committed at that time.

Reviewed-by: John Naylor <john.naylor@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/c8d05f42-443e-6c23-819b-05b31759a37c@2ndquadrant.com
2020-01-09 10:08:14 +01:00
Bruce Momjian 7559d8ebfa Update copyrights for 2020
Backpatch-through: update all files in master, backpatch legal files through 9.4
2020-01-01 12:21:45 -05:00
Alvaro Herrera 6cafde1bd4 Add backend-only appendStringInfoStringQuoted
This provides a mechanism to emit literal values in informative
messages, such as query parameters.  The new code is more complex than
what it replaces, primarily because it wants to be more efficient.
It also has the (currently unused) additional optional capability of
specifying a maximum size to print.

The new function lives out of common/stringinfo.c so that frontend users
of that file need not pull in unnecessary multibyte-encoding support
code.

Author: Álvaro Herrera and Alexey Bashtanov, after a suggestion from Andres Freund
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/20190920203905.xkv5udsd5dxfs6tr@alap3.anarazel.de
2019-12-10 17:12:56 -03:00
Andres Freund 01368e5d9d Split all OBJS style lines in makefiles into one-line-per-entry style.
When maintaining or merging patches, one of the most common sources
for conflicts are the list of objects in makefiles. Especially when
the split across lines has been changed on both sides, which is
somewhat common due to attempting to stay below 80 columns, those
conflicts are unnecessarily laborious to resolve.

By splitting, and alphabetically sorting, OBJS style lines into one
object per line, conflicts should be less frequent, and easier to
resolve when they still occur.

Author: Andres Freund
Discussion: https://postgr.es/m/20191029200901.vww4idgcxv74cwes@alap3.anarazel.de
2019-11-05 14:41:07 -08:00
Peter Eisentraut bdb839cbde Update unicode.org URLs
Use https, consistent host name, remove references to ftp.  Also
update the URLs for CLDR, which has moved from Trac to GitHub.
2019-10-13 22:10:38 +02:00
Michael Paquier a7471bd85c Update some outdated links about XLC and UNIX specification
Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm3Dy=dTdx8UCVw=DWbzLzmRUC1dkq45=heOZDUg3U_PtA@mail.gmail.com
2019-10-08 14:31:30 +09:00