postgresql/src/common
Michael Paquier 377c25d322 Fix overread in JSON parsing errors for incomplete byte sequences
json_lex_string() relies on pg_encoding_mblen_bounded() to point to the
end of a JSON string when generating an error message, and the input it
uses is not guaranteed to be null-terminated.

It was possible to walk off the end of the input buffer by a few bytes
when the last bytes consist of an incomplete multi-byte sequence, as
token_terminator would point to a location defined by
pg_encoding_mblen_bounded() rather than the end of the input.  This
commit switches token_terminator so that the error message uses data
only up to the end of the JSON input.
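
The overread described above can be illustrated with a minimal,
standalone sketch.  It is not the code from jsonapi.c: the helper names
(claimed_utf8_len, token_end_unclamped, token_end_at_input_end) are
hypothetical, and UTF-8 is assumed purely for illustration.  It shows
how trusting the length claimed by a lead byte can place the end of the
token past the end of a buffer that is not null-terminated, and how
ending the token at the input length avoids that.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: the length a UTF-8 sequence claims to have, judged
 * from its lead byte alone.  Illustration only, not PostgreSQL's code.
 */
static size_t
claimed_utf8_len(unsigned char first)
{
	if (first < 0x80)
		return 1;
	if ((first & 0xE0) == 0xC0)
		return 2;
	if ((first & 0xF0) == 0xE0)
		return 3;
	if ((first & 0xF8) == 0xF0)
		return 4;
	return 1;					/* invalid lead byte: count it as one byte */
}

/*
 * Unsafe pattern: the token end is derived from the claimed sequence
 * length, so the result can exceed len when the input is truncated.
 */
static size_t
token_end_unclamped(const char *buf, size_t len, size_t pos)
{
	assert(pos < len);
	return pos + claimed_utf8_len((unsigned char) buf[pos]);
}

/*
 * Safe pattern, in the spirit of the fix: the token used for the error
 * message ends at the end of the input, never beyond it.
 */
static size_t
token_end_at_input_end(const char *buf, size_t len, size_t pos)
{
	(void) buf;					/* the buffer contents are not consulted */
	assert(pos <= len);
	return len;
}
```

With a 4-byte input whose last two bytes start a 3-byte sequence, the
first function reports an end offset of 5, one byte past the buffer,
while the second reports 4.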

More work should be done so that this code can rely on an equivalent of
report_invalid_encoding(), which would let incorrect byte sequences show
in error messages in a readable form.  This requires work for at least two
cases in the JSON parsing API: an incomplete token and an invalid escape
sequence.  A more complete solution may be too invasive for a backpatch,
so this is left as a future improvement, taking care of the overread
first.

A test is added on HEAD, as test_json_parser makes this issue
straightforward to check.

Note that pg_encoding_mblen_bounded() no longer has any callers.  It
will be removed on HEAD in a separate commit, as it has proven to
encourage unsafe coding.

Author: Jacob Champion
Discussion: https://postgr.es/m/CAOYmi+ncM7pwLS3AnKCSmoqqtpjvA8wmCdoBtKA3ZrB2hZG6zA@mail.gmail.com
Backpatch-through: 13
2024-05-09 12:45:51 +09:00
unicode Initial pgindent and pgperltidy run for v13. 2020-05-14 13:06:50 -04:00
.gitignore Replace the data structure used for keyword lookup. 2019-01-06 17:02:57 -05:00
Makefile Move frontend-side archive APIs from src/common/ to src/fe_utils/ 2020-06-11 15:48:56 +09:00
archive.c Move routine building restore_command to src/common/ 2020-03-24 12:13:36 +09:00
base64.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
checksum_helper.c Add checksum helper functions. 2020-04-03 11:52:43 -04:00
config_info.c Simplify passing of configure arguments to pg_config 2020-02-10 19:23:41 +01:00
controldata_utils.c Try to handle torn reads of pg_control in frontend. 2023-10-16 17:24:35 +13:00
d2s.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
d2s_full_table.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
d2s_intrinsics.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
digit_table.h Change floating-point output format for improved performance. 2019-02-13 15:20:33 +00:00
encnames.c Rationalize code placement between wchar.c, encnames.c, and mbutils.c. 2020-01-16 18:08:21 -05:00
exec.c Make EXEC_BACKEND more convenient on Linux and FreeBSD. 2023-02-08 13:09:49 +09:00
f2s.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
fe_memutils.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
file_perm.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
file_utils.c Change client-side fsync_fname() to report errors fatally 2020-02-24 16:51:26 +01:00
hashfn.c Dial back -Wimplicit-fallthrough to level 3 2020-05-13 15:31:14 -04:00
ip.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
jsonapi.c Fix overread in JSON parsing errors for incomplete byte sequences 2024-05-09 12:45:51 +09:00
keywords.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
kwlookup.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
link-canary.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
logging.c Fix command-line colorization on Windows with VT100-compatible environments 2020-03-02 15:45:34 +09:00
md5.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
pg_lzcompress.c Improve pglz_decompress's defenses against corrupt compressed data. 2023-10-18 20:43:17 -04:00
pgfnames.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
protocol_openssl.c Move OpenSSL routines for min/max protocol setting to src/common/ 2020-01-17 10:06:17 +09:00
psprintf.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
relpath.c Add declaration-level assertions for compile-time checks 2020-02-03 14:48:42 +09:00
restricted_token.c Improve error messages after LoadLibrary() 2020-04-13 10:24:46 +02:00
rmtree.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
ryu_common.h Update copyrights for 2020 2020-01-01 12:21:45 -05:00
saslprep.c Add support for other normal forms to Unicode normalization API 2020-03-24 10:02:46 +01:00
scram-common.c Initial pgindent and pgperltidy run for v13. 2020-05-14 13:06:50 -04:00
sha2.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
sha2_openssl.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
string.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
stringinfo.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
unicode_norm.c Fix buffer overrun in unicode string normalization with empty input 2021-11-11 15:01:54 +09:00
username.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
wait_error.c Update copyrights for 2020 2020-01-01 12:21:45 -05:00
wchar.c Fix incautious handling of possibly-miscoded strings in client code. 2021-06-07 14:15:25 -04:00