This introduces a new generic SASL authentication method, similar to the
GSS and SSPI methods. The server first tells the client which SASL
authentication mechanism to use, and then the mechanism-specific SASL
messages are exchanged in AuthenticationSASLcontinue and PasswordMessage
messages. Only SCRAM-SHA-256 is supported at the moment, but this allows
adding more SASL mechanisms in the future, without changing the overall
protocol.
Support for channel binding, aka SCRAM-SHA-256-PLUS is left for later.
The SASLPrep algorithm, for pre-processing the password, is not yet
implemented. That could cause trouble, if you use a password with
non-ASCII characters, and a client library that does implement SASLprep.
That will hopefully be added later.
Authorization identities, as specified in the SCRAM-SHA-256 specification,
are ignored. SET SESSION AUTHORIZATION provides more or less the same
functionality, anyway.
If a user doesn't exist, perform a "mock" authentication, by constructing
an authentic-looking challenge on the fly. The challenge is derived from
a new system-wide random value, "mock authentication nonce", which is
created at initdb, and stored in the control file. We go through these
motions, in order to not give away the information on whether the user
exists, to unauthenticated users.
Bumps PG_CONTROL_VERSION, because of the new field in control file.
Patch by Michael Paquier and Heikki Linnakangas, reviewed at different
stages by Robert Haas, Stephen Frost, David Steele, Aleksander Alekseev,
and many others.
Discussion: https://www.postgresql.org/message-id/CAB7nPqRbR3GmFYdedCAhzukfKrgBLTLtMvENOmPrVWREsZkF8g%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/CAB7nPqSMXU35g%3DW9X74HVeQp0uvgJxvYOuA4A-A3M%2B0wfEBv-w%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/55192AFE.6080106@iki.fi
c.h #includes a number of core libc header files, such as <stdio.h>.
There's no point in re-including these after having read postgres.h,
postgres_fe.h, or c.h; so remove code that did so.
While at it, also fix some places that were ignoring our standard pattern
of "include postgres[_fe].h, then system header files, then other Postgres
header files". While there's not any great magic in doing it that way
rather than system headers last, it's silly to have just a few files
deviating from the general pattern. (But I didn't attempt to enforce this
globally, only in files I was touching anyway.)
I'd be the first to say that this is mostly compulsive neatnik-ism,
but over time it might save enough compile cycles to be useful.
nan_test.pgc supposed that it could unconditionally #define isnan()
and isinf() on WIN32. This was evidently copied at some point from
src/include/port/win32.h, but nowadays there's a test on _MSC_VER
there. Make nan_test.pgc look the same.
Per buildfarm warnings. There's no evidence this produces anything
worse than a warning, and besides it's only a test case, so I don't
feel a need to back-patch.
Per discussion, the time has come to do this. The handwriting has been
on the wall at least since 9.0 that this would happen someday, whenever
it got to be too much of a burden to support the float-timestamp option.
The triggering factor now is the discovery that there are multiple bugs
in the code that attempts to implement use of integer timestamps in the
replication protocol even when the server is built for float timestamps.
The internal float timestamps leak into the protocol fields in places.
While we could fix the identified bugs, there's a very high risk of
introducing more. Trying to build a wall that would positively prevent
mixing integer and float timestamps is more complexity than we want to
undertake to maintain a long-deprecated option. The fact that these
bugs weren't found through testing also indicates a lack of interest
in float timestamps.
This commit disables configure's --disable-integer-datetimes switch
(it'll still accept --enable-integer-datetimes, though), removes direct
references to USE_INTEGER_DATETIMES, and removes discussion of float
timestamps from the user documentation. A considerable amount of code is
rendered dead by this, but removing that will occur as separate mop-up.
Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us
Coverity complained that we might pass a null pointer to strcmp()
if PQresultErrorField were to return NULL. That shouldn't be possible,
since the server is supposed to always provide some SQLSTATE or other
in an error message. But we usually defend against such hazards, and
it only takes a little more code to do so here.
There's no good reason to think this is a live bug, so no back-patch.
Formerly an alternate password file could only be selected via the
environment variable PGPASSFILE; now it can also be selected via a
new connection parameter "passfile", corresponding to the conventions
for most other connection parameters. There was some concern about
this creating a security weakness, but it was agreed that that argument
was pretty thin, and there are clear use-cases for handling password
files this way.
Julian Markwort, reviewed by Fabien Coelho, some adjustments by me
Discussion: https://postgr.es/m/a4b4f4f1-7b58-a0e8-5268-5f7db8e8ccaa@uni-muenster.de
Clean up some technical debt left behind by commit 72b1e3a21: instead of
quickly hacking the name of base_yylex() with a #define, set it properly
with "%option prefix". This causes the names of pgc.l's other exported
symbols to change as well, so run around and modify the outside references
to them as needed. Similarly, make pgc.l's external references to
base_yylval use that variable's true name instead of a macro.
The reason for doing this now is that the quick-hack solution will fail
with future versions of flex, as reported by Дилян Палаузов.
Hence, back-patch into 9.6 where the previous commit appeared, since
it's likely people will build 9.6 with newer flex versions during
its lifetime.
Discussion: https://postgr.es/m/d845c1af-e18d-6651-178f-9f08cdf37e10@aegee.org
If the PAGER environment variable is set but contains an empty string,
psql would pass it to "sh" which would silently exit, causing whatever
query output we were printing to vanish entirely. This is quite
mystifying; it took a long time for us to figure out that this was the
cause of Joseph Brenner's trouble report. Rather than allowing that
to happen, we should treat this as another way to specify "no pager".
(We could alternatively treat it as selecting the default pager, but
it seems more likely that the former is what the user meant to achieve
by setting PAGER this way.)
Nonempty, but all-white-space, PAGER values have the same behavior, and
it's pretty easy to test for that, so let's handle that case the same way.
Most other cases of faulty PAGER values will result in the shell printing
some kind of complaint to stderr, which should be enough to diagnose the
problem, so we don't need to work harder than this. (Note that there's
been an intentional decision not to be very chatty about apparent failure
returns from the pager process, since that may happen if, eg, the user
quits the pager with control-C or some such. I'd just as soon not start
splitting hairs about which exit codes might merit making our own report.)
libpq's old PQprint() function was already on board with ignoring empty
PAGER values, but for consistency, make it ignore all-white-space values
as well.
It's been like this a long time, so back-patch to all supported branches.
Discussion: https://postgr.es/m/CAFfgvXWLOE2novHzYjmQK8-J6TmHz42G8f3X0SORM44+stUGmw@mail.gmail.com
If we failed to connect to one or more hosts, and then afterwards we
find one that fails to be read-write, the latter error message was
clobbering any earlier ones. Repair.
Mithun Cy, slightly revised by me.
Commit 274bb2b385 caused PQhost() to
return the value of the hostaddr parameter rather than the relevant
host when the latter parameter was specified. That's wrong. Commit
9a1d0af4ad then amplified the damage by
using PQhost() in more places, so that the SSL test suite started
failing.
Report by Andreas Karlsson; patch by me.
Commit 274bb2b385 made it possible to
specify multiple IPs in a connection string, but that's not good
enough for the case where you have a read-write master and a bunch of
read-only standbys and want to connect to whichever server is the
master at the current time. This commit allows that, by making it
possible to specify target_session_attrs=read-write as a connection
parameter.
There was extensive discussion of the best name for the connection
parameter and its values as well as the best way to distinguish master
and standbys. For now, adopt the same solution as JDBC: if the user
wants a read-write connection, issue 'show transaction_read_only' and
rejection the connection if the result is 'on'. In the future, we
could add additional values of this new target_session_attrs parameter
that issue different queries; or we might have some way of
distinguishing the server type without resorting to an SQL query; but
right now, we have this, and that's (hopefully) a good start.
Victor Wagner and Mithun Cy. Design review by Álvaro Herrera, Catalin
Iacob, Takayuki Tsunakawa, and Craig Ringer; code review by me. I
changed Mithun's patch to skip all remaining IPs for a host if we
reject a connection based on this new parameter, rewrote the
documentation, and did some other cosmetic cleanup.
Discussion: http://postgr.es/m/CAD__OuhqPRGpcsfwPHz_PDqAGkoqS1UvnUnOnAB-LBWBW=wu4A@mail.gmail.com
Avoid memory leak in conninfo_uri_parse_options. Use the current host
rather than the comma-separated list of host names when the host name
is needed for GSS, SSPI, or SSL authentication. Document the way
connect_timeout interacts with multiple host specifications.
Takayuki Tsunakawa
On Windows, libc will mask \r\n line endings for us, since we read the
password file in text mode. But that doesn't happen on Unix. People
who share password files across both systems might have \r\n line endings
in a file they use on Unix, so as a convenience, ignore trailing \r.
Per gripe from Josh Berkus.
In passing, put the existing check for empty line somewhere where it's
actually useful, ie after stripping the newline not before.
Vik Fearing, adjusted a bit by me
Discussion: <0de37763-5843-b2cc-855e-5d0e5df25807@agliodbs.com>
It's also possible to specify a separate port for each host.
Previously, we'd loop over every address returned by looking up the
host name; now, we'll try every address for every host name.
Patch by me. Victor Wagner wrote an earlier patch for this feature,
which I read, but I didn't use any of his code. Review by Mithun Cy.
Ordinarily there would not be an async result sitting around at this
point, but it appears that in corner cases there can be. Considering
all the work we're about to launch, it's hardly going to cost anything
noticeable to check.
It's been like this forever, so back-patch to all supported branches.
Report: <CAD-Qf1eLUtBOTPXyFQGW-4eEsop31tVVdZPu4kL9pbQ6tJPO8g@mail.gmail.com>
Leaving the error in the error queue used to be harmless, because the
X509_STORE_load_locations() call used to be the last step in
initialize_SSL(), and we would clear the queue before the next
SSL_connect() call. But previous commit moved things around. The symptom
was that if a CRL file was not found, and one of the subsequent
initialization steps, like loading the client certificate or private key,
failed, we would incorrectly print the "no such file" error message from
the earlier X509_STORE_load_locations() call as the reason.
Backpatch to all supported versions, like the previous patch.
There were several issues with the old coding:
1. There was a race condition, if two threads opened a connection at the
same time. We used a mutex around SSL_CTX_* calls, but that was not
enough, e.g. if one thread SSL_CTX_load_verify_locations() with one
path, and another thread set it with a different path, before the first
thread got to establish the connection.
2. Opening two different connections, with different sslrootcert settings,
seemed to fail outright with "SSL error: block type is not 01". Not sure
why.
3. We created the SSL object, before calling SSL_CTX_load_verify_locations
and SSL_CTX_use_certificate_chain_file on the SSL context. That was
wrong, because the options set on the SSL context are propagated to the
SSL object, when the SSL object is created. If they are set after the
SSL object has already been created, they won't take effect until the
next connection. (This is bug #14329)
At least some of these could've been fixed while still using a shared
context, but it would've been more complicated and error-prone. To keep
things simple, let's just use a separate SSL context for each connection,
and accept the overhead.
Backpatch to all supported versions.
Report, analysis and test case by Kacper Zuk.
Discussion: <20160920101051.1355.79453@wrigleys.postgresql.org>
We weren't terribly consistent about whether to call Apple's OS "OS X"
or "Mac OS X", and the former is probably confusing to people who aren't
Apple users. Now that Apple has rebranded it "macOS", follow their lead
to establish a consistent naming pattern. Also, avoid the use of the
ancient project name "Darwin", except as the port code name which does not
seem desirable to change. (In short, this patch touches documentation and
comments, but no actual code.)
I didn't touch contrib/start-scripts/osx/, either. I suspect those are
obsolete and due for a rewrite, anyway.
I dithered about whether to apply this edit to old release notes, but
those were responsible for quite a lot of the inconsistencies, so I ended
up changing them too. Anyway, Apple's being ahistorical about this,
so why shouldn't we be?
This makes the -? and -V options work consistently with other binaries.
--help and --version are now only recognized as the first option, i.e.
"ecpg --foobar --help" no longer prints the help, but that's consistent
with most of our other binaries, too.
Backpatch to all supported versions.
Haribabu Kommi
Discussion: <CAJrrPGfnRXvmCzxq6Dy=stAWebfNHxiL+Y_z7uqksZUCkW_waQ@mail.gmail.com>
LibreSSL defines OPENSSL_VERSION_NUMBER to claim that it is version 2.0.0,
but it doesn't have the functions added in OpenSSL 1.1.0. Add autoconf
checks for the individual functions we need, and stop relying on
OPENSSL_VERSION_NUMBER.
Backport to 9.5 and 9.6, like the patch that broke this. In the
back-branches, there are still a few OPENSSL_VERSION_NUMBER checks left,
to check for OpenSSL 0.9.8 or 0.9.7. I left them as they were - LibreSSL
has all those functions, so they work as intended.
Per buildfarm member curculio.
Discussion: <2442.1473957669@sss.pgh.pa.us>
Changes needed to build at all:
- Check for SSL_new in configure, now that SSL_library_init is a macro.
- Do not access struct members directly. This includes some new code in
pgcrypto, to use the resource owner mechanism to ensure that we don't
leak OpenSSL handles, now that we can't embed them in other structs
anymore.
- RAND_SSLeay() -> RAND_OpenSSL()
Changes that were needed to silence deprecation warnings, but were not
strictly necessary:
- RAND_pseudo_bytes() -> RAND_bytes().
- SSL_library_init() and OpenSSL_config() -> OPENSSL_init_ssl()
- ASN1_STRING_data() -> ASN1_STRING_get0_data()
- DH_generate_parameters() -> DH_generate_parameters()
- Locking callbacks are not needed with OpenSSL 1.1.0 anymore. (Good
riddance!)
Also change references to SSLEAY_VERSION_NUMBER with OPENSSL_VERSION_NUMBER,
for the sake of consistency. OPENSSL_VERSION_NUMBER has existed since time
immemorial.
Fix SSL test suite to work with OpenSSL 1.1.0. CA certificates must have
the "CA:true" basic constraint extension now, or OpenSSL will refuse them.
Regenerate the test certificates with that. The "openssl" binary, used to
generate the certificates, is also now more picky, and throws an error
if an X509 extension is specified in "req_extensions", but that section
is empty.
Backpatch to all supported branches, per popular demand. In back-branches,
we still support OpenSSL 0.9.7 and above. OpenSSL 0.9.6 should still work
too, but I didn't test it. In master, we only support 0.9.8 and above.
Patch by Andreas Karlsson, with additional changes by me.
Discussion: <20160627151604.GD1051@msg.df7cb.de>
When building libpq, ip.c and md5.c were symlinked or copied from
src/backend/libpq into src/interfaces/libpq, but now that we have a
directory specifically for routines that are shared between the server and
client binaries, src/common/, move them there.
Some routines in ip.c were only used in the backend. Keep those in
src/backend/libpq, but rename to ifaddr.c to avoid confusion with the file
that's now in common.
Fix the comment in src/common/Makefile to reflect how libpq actually links
those files.
There are two more files that libpq symlinks directly from src/backend:
encnames.c and wchar.c. I don't feel compelled to move those right now,
though.
Patch by Michael Paquier, with some changes by me.
Discussion: <69938195-9c76-8523-0af8-eb718ea5b36e@iki.fi>
OpenSSL officially only supports 1.0.1 and newer. Some OS distributions
still provide patches for 0.9.8, but anything older than that is not
interesting anymore. Let's simplify things by removing compatibility code.
Andreas Karlsson, with small changes by me.
This has been requested a few times, but the use-case for it was never
entirely clear. The reason for adding it now is that transmission of
error reports from parallel workers fails when NLS is active, because
pq_parse_errornotice() wrongly assumes that the existing severity field
is nonlocalized. There are other ways we could have fixed that, but the
other options were basically kluges, whereas this way provides something
that's at least arguably a useful feature along with the bug fix.
Per report from Jakob Egger. Back-patch into 9.6, because otherwise
parallel query is essentially unusable in non-English locales. The
problem exists in 9.5 as well, but we don't want to risk changing
on-the-wire behavior in 9.5 (even though the possibility of new error
fields is specifically called out in the protocol document). It may
be sufficient to leave the issue unfixed in 9.5, given the very limited
usefulness of pq_parse_errornotice in that version.
Discussion: <A88E0006-13CB-49C6-95CC-1A77D717213C@eggerapps.at>
Up to now we've manually adjusted these numbers in several different
Makefiles at the start of each development cycle. While that's not
much work, it's easily forgotten, so let's get rid of it by setting
the SO_MINOR_VERSION values directly from $(MAJORVERSION).
In the case of libpq, this dev cycle's value of SO_MINOR_VERSION happens
to be "10" anyway, so this switch is transparent. For ecpg's shared
libraries, this will result in skipping one or two minor version numbers
between v9.6 and v10, which seems like no big problem; and it was a bit
inconsistent that they didn't have equal minor version numbers anyway.
Discussion: <21969.1471287988@sss.pgh.pa.us>
Once upon a time, it made sense for the ecpg preprocessor to have its
own version number, because it used a manually-maintained grammar that
wasn't always in sync with the core grammar. But those days are
thankfully long gone, leaving only a maintenance nuisance behind.
Let's use the PG v10 version numbering changeover as an excuse to get
rid of the ecpg version number and just have ecpg identify itself by
PG_VERSION. From the user's standpoint, ecpg will go from "4.12" in
the 9.6 branch to "10" in the 10 branch, so there's no failure of
monotonicity.
Discussion: <1471332659.4410.67.camel@postgresql.org>
This is a good bit more complicated than the average new-version stamping
commit, because it includes various adjustments in pursuit of changing
from three-part to two-part version numbers. It's likely some further
work will be needed around that change; but this is enough to get through
the regression tests, at least in Unix builds.
Peter Eisentraut and Tom Lane
NUMERIC_MAX_PRECISION is a purely arbitrary constraint on the precision
and scale you can write in a numeric typmod. It might once have had
something to do with the allowed range of a typmod-less numeric value,
but at least since 9.1 we've allowed, and documented that we allowed,
any value that would physically fit in the numeric storage format;
which is something over 100000 decimal digits, not 1000.
Hence, get rid of numeric_in()'s use of NUMERIC_MAX_PRECISION as a limit
on the allowed range of the exponent in scientific-format input. That was
especially silly in view of the fact that you can enter larger numbers as
long as you don't use 'e' to do it. Just constrain the value enough to
avoid localized overflow, and let make_result be the final arbiter of what
is too large. Likewise adjust ecpg's equivalent of this code.
Also get rid of numeric_recv()'s use of NUMERIC_MAX_PRECISION to limit the
number of base-NBASE digits it would accept. That created a dump/restore
hazard for binary COPY without doing anything useful; the wire-format
limit on number of digits (65535) is about as tight as we would want.
In HEAD, also get rid of pg_size_bytes()'s unnecessary intimacy with what
the numeric range limit is. That code doesn't exist in the back branches.
Per gripe from Aravind Kumar. Back-patch to all supported branches,
since they all contain the documentation claim about allowed range of
NUMERIC (cf commit cabf5d84b).
Discussion: <2895.1471195721@sss.pgh.pa.us>
Due to simplistic quoting and confusion of database names with conninfo
strings, roles with the CREATEDB or CREATEROLE option could escalate to
superuser privileges when a superuser next ran certain maintenance
commands. The new coding rule for PQconnectdbParams() calls, documented
at conninfo_array_parse(), is to pass expand_dbname=true and wrap
literal database names in a trivial connection string. Escape
zero-length values in appendConnStrVal(). Back-patch to 9.1 (all
supported versions).
Nathan Bossart, Michael Paquier, and Noah Misch. Reviewed by Peter
Eisentraut. Reported by Nathan Bossart.
Security: CVE-2016-5424
Beginning with the next development cycle, PG servers will report two-part
not three-part version numbers. Fix libpq so that it will compute the
correct numeric representation of such server versions for reporting by
PQserverVersion(). It's desirable to get this into the field and
back-patched ASAP, so that older clients are more likely to understand the
new server version numbering by the time any such servers are in the wild.
(The results with an old client would probably not be catastrophic anyway
for a released server; for example "10.1" would be interpreted as 100100
which would be wrong in detail but would not likely cause an old client to
misbehave badly. But "10devel" or "10beta1" would result in sversion==0
which at best would result in disabling all use of modern features.)
Extracted from a patch by Peter Eisentraut; comments added by me
Patch: <802ec140-635d-ad86-5fdf-d3af0e260c22@2ndquadrant.com>
To ensure that "make installcheck" can be used safely against an existing
installation, we need to be careful about what global object names
(database, role, and tablespace names) we use; otherwise we might
accidentally clobber important objects. There's been a weak consensus that
test databases should have names including "regression", and that test role
names should start with "regress_", but we didn't have any particular rule
about tablespace names; and neither of the other rules was followed with
any consistency either.
This commit moves us a long way towards having a hard-and-fast rule that
regression test databases must have names including "regression", and that
test role and tablespace names must start with "regress_". It's not
completely there because I did not touch some test cases in rolenames.sql
that test creation of special role names like "session_user". That will
require some rethinking of exactly what we want to test, whereas the intent
of this patch is just to hit all the cases in which the needed renamings
are cosmetic.
There is no enforcement mechanism in this patch either, but if we don't
add one we can expect that the tests will soon be violating the convention
again. Again, that's not such a cosmetic change and it will require
discussion. (But I did use a quick-hack enforcement patch to find these
cases.)
Discussion: <16638.1468620817@sss.pgh.pa.us>
Apparently, at least some versions of Microsoft's shell fail on variable
assignments that have leading whitespace. This instance, introduced in
commit 680513ab7, managed to escape notice for awhile because it's only
invoked if building with OpenSSL. Per bug #14185 from Torben Dannhauer.
Report: <20160613140119.5798.78501@wrigleys.postgresql.org>
NetBSD has seen fit to invent a libc function named strtoi(), which
conflicts with the long-established static functions of the same name in
datetime.c and ecpg's interval.c. While muttering darkly about intrusions
on application namespace, we'll rename our functions to avoid the conflict.
Back-patch to all supported branches, since this would affect attempts
to build any of them on recent NetBSD.
Thomas Munro
In commit b0e40d1893, I should have just
removed the /D switch defining WIN64. The reason the code worked before
is that all Windows64 compilers automatically predefine _WIN64. Perhaps
at one time we had code that depended on WIN64 being defined, but it's
long gone, and we should not encourage any reappearance. Per discussion
with Christian Ullrich.
Everyplace else thinks it's _WIN64, so make these places fall in line.
The pg_regress.c usage is not going to result in any change in behavior,
only suppressing (or not) a compiler warning about downcasting HANDLEs.
So there seems no need for back-patching there.
The libpq/win32.mak usage might represent an actual bug, if anyone were
using this script to build for Windows64, which perhaps nobody is.
Given the lack of field complaints, no back-patch here either.
pg_regress.c problem found by Christian Ullrich, the other by me.
OpenSSL has an unfortunate tendency to mix per-session state error
handling with per-thread error handling. This can cause problems when
programs that link to libpq with OpenSSL enabled have some other use of
OpenSSL; without care, one caller of OpenSSL may cause problems for the
other caller. Backend code might similarly be affected, for example
when a third party extension independently uses OpenSSL without taking
the appropriate precautions.
To fix, don't trust other users of OpenSSL to clear the per-thread error
queue. Instead, clear the entire per-thread queue ahead of certain I/O
operations when it appears that there might be trouble (these I/O
operations mostly need to call SSL_get_error() to check for success,
which relies on the queue being empty). This is slightly aggressive,
but it's pretty clear that the other callers have a very dubious claim
to ownership of the per-thread queue. Do this is both frontend and
backend code.
Finally, be more careful about clearing our own error queue, so as to
not cause these problems ourself. It's possibly that control previously
did not always reach SSLerrmessage(), where ERR_get_error() was supposed
to be called to clear the queue's earliest code. Make sure
ERR_get_error() is always called, so as to spare other users of OpenSSL
the possibility of similar problems caused by libpq (as opposed to
problems caused by a third party OpenSSL library like PHP's OpenSSL
extension). Again, do this is both frontend and backend code.
See bug #12799 and https://bugs.php.net/bug.php?id=68276
Based on patches by Dave Vitek and Peter Eisentraut.
From: Peter Geoghegan <pg@bowt.ie>
Often, upon getting an unexpected error in psql, one's first wish is that
the verbosity setting had been higher; for example, to be able to see the
schema-name field or the server code location info. Up to now the only way
has been to adjust the VERBOSITY variable and repeat the failing query.
That's a pain, and it doesn't work if the error isn't reproducible.
This commit adds support in libpq for regenerating the error message for
an existing error PGresult at any desired verbosity level. This is almost
just a matter of refactoring the existing code into a subroutine, but there
is one bit of possibly-needed information that was not getting put into
PGresults: the text of the last query sent to the server. We must add that
string to the contents of an error PGresult. But we only need to save it
if it might be used, which with the existing error-formatting code only
happens if there is a PG_DIAG_STATEMENT_POSITION error field, which is
probably pretty rare for errors in production situations. So really the
overhead when the feature isn't used should be negligible.
Alex Shulgin, reviewed by Daniel Vérité, some improvements by me
When getParamDescriptions was changed to handle out-of-memory better
by cribbing error recovery logic from getRowDescriptions/getAnotherTuple,
somebody omitted to copy the stanza about checking for excess data in
the message. But you need to do that, since continue'ing out of the
switch in pqParseInput3 means no such check gets applied there anymore.
Noted while looking at Michael Paquier's patch that made yet another
copy of this advance_and_error logic.
(This whole business desperately needs refactoring, because I sure don't
want to see a dozen copies of this code, but that's where we seem to be
headed. What's more, the "suspend parsing on EOF return" convention is a
holdover from protocol 2 and shouldn't exist at all in protocol 3, because
we don't process partial messages anymore. But for now, just fix the
obvious bug.)
Also, fix some wrong/missing comments about what the API spec is
for these three functions.
This doesn't seem worthy of back-patching, even though it's a bug;
the case shouldn't ever arise in the field.
Whenever this function is used with the FORMAT_MESSAGE_FROM_SYSTEM flag,
it's good practice to include FORMAT_MESSAGE_IGNORE_INSERTS as well.
Otherwise, if the message contains any %n insertion markers, the function
will try to fetch argument strings to substitute --- which we are not
passing, possibly leading to a crash. This is exactly analogous to the
rule about not giving printf() a format string you're not in control of.
Noted and patched by Christian Ullrich.
Back-patch to all supported branches.
Now that we have src/common/ for code shared between frontend and backend,
we can get rid of (most of) the klugy ways that the keyword table and
keyword lookup code were formerly shared between different uses.
This is a first step towards a more general plan of getting rid of
special-purpose kluges for sharing code in src/bin/.
I chose to merge kwlookup.c back into keywords.c, as it once was, and
always has been so far as keywords.h is concerned. We could have
kept them separate, but there is noplace that uses ScanKeywordLookup
without also wanting access to the backend's keyword list, so there
seems little point.
ecpg is still a bit weird, but at least now the trickiness is documented.
I think that the MSVC build script should require no adjustments beyond
what's done here ... but we'll soon find out.
Now that we know about the %top{} trick, we can revert to building flex
lexers as separate .o files. This is worth doing for a couple of reasons
besides sheer cleanliness. We can narrow the scope of the -Wno-error flag
that's forced on scan.c. Also, since these grammar and lexer files are
so large, splitting them into separate build targets should have some
advantages in build speed, particularly in parallel or ccache'd builds.
We have quite a few other .l files that could be changed likewise, but the
above arguments don't apply to them, so the benefit of fixing them seems
pretty minimal. Leave the rest for some other day.
Tighten the semantics of boundary-case timestamptz so that we allow
timestamps >= '4714-11-24 00:00+00 BC' and < 'ENDYEAR-01-01 00:00+00 AD'
exactly, no more and no less, but it is allowed to enter timestamps
within that range using non-GMT timezone offsets (which could make the
nominal date 4714-11-23 BC or ENDYEAR-01-01 AD). This eliminates
dump/reload failure conditions for timestamps near the endpoints.
To do this, separate checking of the inputs for date2j() from the
final range check, and allow the Julian date code to handle a range
slightly wider than the nominal range of the datatypes.
Also add a bunch of checks to detect out-of-range dates and timestamps
that formerly could be returned by operations such as date-plus-integer.
All C-level functions that return date, timestamp, or timestamptz should
now be proof against returning a value that doesn't pass IS_VALID_DATE()
or IS_VALID_TIMESTAMP().
Vitaly Burovoy, reviewed by Anastasia Lubennikova, and substantially
whacked around by me
If libpq ran out of memory while constructing the result set, it would hang,
waiting for more data from the server, which might never arrive. To fix,
distinguish between out-of-memory error and not-enough-data cases, and give
a proper error message back to the client on OOM.
There are still similar issues in handling COPY start messages, but let's
handle that as a separate patch.
Michael Paquier, Amit Kapila and me. Backpatch to all supported versions.
The previous coding could overrun the provided buffer size for a very large
input, or lose precision for a very small input. Adopt the methodology
that's been in use in the equivalent backend code for a long time.
Per private report from Bas van Schaik. Back-patch to all supported
branches.
Previously, if no host information had been specified at connection time,
PQhost() would return NULL (unless you are on Windows, in which case you
got "localhost"). This is an unhelpful definition for a couple of reasons:
it can cause corner-case crashes in applications (cf commit c5ef8ce53d),
and there's no well-defined way for applications to find out the socket
directory path that's actually in use. As an example of the latter
problem, psql substituted DEFAULT_PGSOCKET_DIR for NULL in a couple of
places, but this is subtly wrong because it's conceivable that psql is
using a libpq shared library that was built with a different setting.
Hence, change PQhost() to return DEFAULT_PGSOCKET_DIR when appropriate,
and strip out the now-dead substitutions in psql. (There is still one
remaining reference to DEFAULT_PGSOCKET_DIR in psql, in prompt.c, which
I don't see a nice way to get rid of. But it only controls a prompt
abbreviation decision, so it seems noncritical.)
Also update the docs for PQhost, which had never previously mentioned
the possibility of a socket directory path being returned. In passing
fix the outright-incorrect code comment about PGconn.pgunixsocket.
In commit 210eb9b743 I centralized libpq's logic for closing down
the backend communication socket, and made the new pqDropConnection
routine always reset the I/O buffers to empty. Many of the call sites
previously had not had such code, and while that amounted to an oversight
in some cases, there was one place where it was intentional and necessary
*not* to flush the input buffer: pqReadData should never cause that to
happen, since we probably still want to process whatever data we read.
This is the true cause of the problem Robert was attempting to fix in
c3e7c24a1d, namely that libpq no longer reported the backend's final
ERROR message before reporting "server closed the connection unexpectedly".
But that only accidentally fixed it, by invoking parseInput before the
input buffer got flushed; and very likely there are timing scenarios
where we'd still lose the message before processing it.
To fix, pass a flag to pqDropConnection to tell it whether to flush the
input buffer or not. On review I think flushing is actually correct for
every other call site.
Back-patch to 9.3 where the problem was introduced. In HEAD, also improve
the comments added by c3e7c24a1d.
At least since the introduction of Hot Standby, the backend has
sometimes sent fatal errors even when no client query was in
progress, assuming that the client would receive it. However,
pqHandleSendFailure was not in sync with this assumption, and
only tries to catch notices and notifies. Add a parseInput call
to the loop there to fix.
Andres Freund suggested the fix. Comments are by me.
Reviewed by Michael Paquier.
Per discussion, the original name was a bit misleading, and
PQsslAttributeNames() seems more apropos. It's not quite too late to
change this in 9.5, so let's change it while we can.
Also, make sure that the pointer array is const, not only the pointed-to
strings.
Minor documentation wordsmithing while at it.
Lars Kanis, slight adjustments by me
Thom Brown reported that SSL connections didn't seem to work on Windows in
9.5. Asif Naeem figured out that the cause was my_sock_read() looking at
"errno" when it needs to look at "SOCK_ERRNO". This mistake was introduced
in commit 680513ab79, which cloned the
backend's custom SSL BIO code into libpq, and didn't translate the errno
handling properly. Moreover, it introduced unnecessary errno save/restore
logic, which was particularly confusing because it was incomplete; and it
failed to check for all three of EINTR, EAGAIN, and EWOULDBLOCK in
my_sock_write. (That might not be necessary; but since we're copying
well-tested backend code that does do that, it seems prudent to copy it
faithfully.)
Remove the code in plpgsql that suppressed the innermost line of CONTEXT
for messages emitted by RAISE commands. That was never more than a quick
backwards-compatibility hack, and it's pretty silly in cases where the
RAISE is nested in several levels of function. What's more, it violated
our design theory that verbosity of error reports should be controlled
on the client side not the server side.
To alleviate the resulting noise increase, introduce a feature in libpq
and psql whereby the CONTEXT field of messages can be suppressed, either
always or only for non-error messages. Printing CONTEXT for errors only
is now their default behavior.
The actual code changes here are pretty small, but the effects on the
regression test outputs are widespread. I had to edit some of the
alternative expected outputs by hand; hopefully the buildfarm will soon
find anything I fat-fingered.
In passing, fix up (again) the output line counts in psql's various
help displays. Add some commentary about how to verify them.
Pavel Stehule, reviewed by Petr Jelínek, Jeevan Chalke, and others
If an allocation fails in the main message handling loop, pqParseInput3
or pqParseInput2, it should not be treated as "not enough data available
yet". Otherwise libpq will wait indefinitely for more data to arrive from
the server, and gets stuck forever.
This isn't a complete fix - getParamDescriptions and getCopyStart still
have the same issue, but it's a step in the right direction.
Michael Paquier and me. Backpatch to all supported versions.
Use "a" and "an" correctly, mostly in comments. Two error messages were
also fixed (they were just elogs, so no translation work required). Two
function comments in pg_proc.h were also fixed. Etsuro Fujita reported one
of these, but I found a lot more with grep.
Also fix a few other typos spotted while grepping for the a/an typos.
For example, "consists out of ..." -> "consists of ...". Plus a "though"/
"through" mixup reported by Euler Taveira.
Many of these typos were in old code, which would be nice to backpatch to
make future backpatching easier. But much of the code was new, and I didn't
feel like crafting separate patches for each branch. So no backpatching.
This reverts commit 16304a0134, except
for its changes in src/port/snprintf.c; as well as commit
cac18a76bb which is no longer needed.
Fujii Masao reported that the previous commit caused failures in psql on
OS X, since if one exits the pager program early while viewing a query
result, psql sees an EPIPE error from fprintf --- and the wrapper function
thought that was reason to panic. (It's a bit surprising that the same
does not happen on Linux.) Further discussion among the security list
concluded that the risk of other such failures was far too great, and
that the one-size-fits-all approach to error handling embodied in the
previous patch is unlikely to be workable.
This leaves us again exposed to the possibility of the type of failure
envisioned in CVE-2015-3166. However, that failure mode is strictly
hypothetical at this point: there is no concrete reason to believe that
an attacker could trigger information disclosure through the supposed
mechanism. In the first place, the attack surface is fairly limited,
since so much of what the backend does with format strings goes through
stringinfo.c or psprintf(), and those already had adequate defenses.
In the second place, even granting that an unprivileged attacker could
control the occurrence of ENOMEM with some precision, it's a stretch to
believe that he could induce it just where the target buffer contains some
valuable information. So we concluded that the risk of non-hypothetical
problems induced by the patch greatly outweighs the security risks.
We will therefore revert, and instead undertake closer analysis to
identify specific calls that may need hardening, rather than attempt a
universal solution.
We have kept the portion of the previous patch that improved snprintf.c's
handling of errors when it calls the platform's sprintf(). That seems to
be an unalloyed improvement.
Security: CVE-2015-3166
All known standard library implementations of these functions can fail
with ENOMEM. A caller neglecting to check for failure would experience
missing output, information exposure, or a crash. Check return values
within wrappers and code, currently just snprintf.c, that bypasses the
wrappers. The wrappers do not return after an error, so their callers
need not check. Back-patch to 9.0 (all supported versions).
Popular free software standard library implementations do take pains to
bypass malloc() in simple cases, but they risk ENOMEM for floating point
numbers, positional arguments, large field widths, and large precisions.
No specification demands such caution, so this commit regards every call
to a printf family function as a potential threat.
Injecting the wrappers implicitly is a compromise between patch scope
and design goals. I would prefer to edit each call site to name a
wrapper explicitly. libpq and the ECPG libraries would, ideally, convey
errors to the caller rather than abort(). All that would be painfully
invasive for a back-patched security fix, hence this compromise.
Security: CVE-2015-3166
The "check" target no longer needs to depend on "all", because it now
runs "install" directly, which in turn depends on "all". Doing both
will cause problems with parallel make, because two builds will run next
to each other.
Also remove the redirection of the temp-install output into a log file.
This was appropriate when this was done from within pg_regress, but now
it's just a regular make run, and especially with the above changes this
will now take the place of running the "all" target before the test
suites.
problem report by Jeff Janes, patch in part by Michael Paquier
This provides a mechanism for specifying conversions between SQL data
types and procedural languages. As examples, there are transforms
for hstore and ltree for PL/Perl and PL/Python.
reviews by Pavel Stěhule and Andres Freund
Each of the libraries incorporates src/port files, which often check
FRONTEND. Build systems disagreed on whether to build libpgtypes this
way. Only libecpg incorporates files that rely on it today. Back-patch
to 9.0 (all supported versions) to forestall surprises.
Before, make check-world would create a new temporary installation for
each test suite, which is slow and wasteful. Instead, we now create one
test installation that is used by all test suites that are part of a
make run.
The management of the temporary installation is removed from pg_regress
and handled in the makefiles. This allows for better control, and
unifies the code with that of test suites not run through pg_regress.
review and msvc support by Michael Paquier <michael.paquier@gmail.com>
more review by Fabien Coelho <coelho@cri.ensmp.fr>
If someone else already set the callbacks, don't overwrite them with
ours. When unsetting the callbacks, only unset them if they point to
ours.
Author: Jan Urbański <wulczer@wulczer.org>
This is the second try at this, after fcef161729 failed miserably and
had to be reverted: as it turns out, libpq cannot depend on libpgcommon
after all. Instead of shuffling code in the master branch, make that one
just like 9.4 and accept the duplication. (This was all my own mistake,
not the patch submitter's).
psql was already accepting conninfo strings as the first parameter in
\connect, but the way it worked wasn't sane; some of the other
parameters would get the previous connection's values, causing it to
connect to a completely unexpected server or, more likely, not finding
any server at all because of completely wrong combinations of
parameters.
Fix by explicitely checking for a conninfo-looking parameter in the
dbname position; if one is found, use its complete specification rather
than mix with the other arguments. Also, change tab-completion to not
try to complete conninfo/URI-looking "dbnames" and document that
conninfos are accepted as first argument.
There was a weak consensus to backpatch this, because while the behavior
of using the dbname as a conninfo is nowhere documented for \connect, it
is reasonable to expect that it works because it does work in many other
contexts. Therefore this is backpatched all the way back to 9.0.
Author: David Fetter, Andrew Dunstan. Some editorialization by me
(probably earning a Gierth's "Sloppy" badge in the process.)
Reviewers: Andrew Gierth, Erik Rijkers, Pavel Stěhule, Stephen Frost,
Robert Haas, Andrew Dunstan.
psql was already accepting conninfo strings as the first parameter in
\connect, but the way it worked wasn't sane; some of the other
parameters would get the previous connection's values, causing it to
connect to a completely unexpected server or, more likely, not finding
any server at all because of completely wrong combinations of
parameters.
Fix by explicitely checking for a conninfo-looking parameter in the
dbname position; if one is found, use its complete specification rather
than mix with the other arguments. Also, change tab-completion to not
try to complete conninfo/URI-looking "dbnames" and document that
conninfos are accepted as first argument.
There was a weak consensus to backpatch this, because while the behavior
of using the dbname as a conninfo is nowhere documented for \connect, it
is reasonable to expect that it works because it does work in many other
contexts. Therefore this is backpatched all the way back to 9.0.
To implement this, routines previously private to libpq have been
duplicated so that psql can decide what looks like a conninfo/URI
string. In back branches, just duplicate the same code all the way back
to 9.2, where URIs where introduced; 9.0 and 9.1 have a simpler version.
In master, the routines are moved to src/common and renamed.
Author: David Fetter, Andrew Dunstan. Some editorialization by me
(probably earning a Gierth's "Sloppy" badge in the process.)
Reviewers: Andrew Gierth, Erik Rijkers, Pavel Stěhule, Stephen Frost,
Robert Haas, Andrew Dunstan.
This improves on commit bbfd7edae5 by
making two simple changes:
* pg_attribute_noreturn now takes parentheses, ie pg_attribute_noreturn().
Likewise pg_attribute_unused(), pg_attribute_packed(). This reduces
pgindent's tendency to misformat declarations involving them.
* attributes are now always attached to function declarations, not
definitions. Previously some places were taking creative shortcuts,
which were not merely candidates for bad misformatting by pgindent
but often were outright wrong anyway. (It does little good to put a
noreturn annotation where callers can't see it.) In any case, if
we would like to believe that these macros can be used with non-gcc
compilers, we should avoid gratuitous variance in usage patterns.
I also went through and manually improved the formatting of a lot of
declarations, and got rid of excessively repetitive (and now obsolete
anyway) comments informing the reader what pg_attribute_printf is for.
While the SQL standard is pretty vague on the overall topic of operator
precedence (because it never presents a unified BNF for all expressions),
it does seem reasonable to conclude from the spec for <boolean value
expression> that OR has the lowest precedence, then AND, then NOT, then IS
tests, then the six standard comparison operators, then everything else
(since any non-boolean operator in a WHERE clause would need to be an
argument of one of these).
We were only sort of on board with that: most notably, while "<" ">" and
"=" had properly low precedence, "<=" ">=" and "<>" were treated as generic
operators and so had significantly higher precedence. And "IS" tests were
even higher precedence than those, which is very clearly wrong per spec.
Another problem was that "foo NOT SOMETHING bar" constructs, such as
"x NOT LIKE y", were treated inconsistently because of a bison
implementation artifact: they had the documented precedence with respect
to operators to their right, but behaved like NOT (i.e., very low priority)
with respect to operators to their left.
Fixing the precedence issues is just a small matter of rearranging the
precedence declarations in gram.y, except for the NOT problem, which
requires adding an additional lookahead case in base_yylex() so that we
can attach a different token precedence to NOT LIKE and allied two-word
operators.
The bulk of this patch is not the bug fix per se, but adding logic to
parse_expr.c to allow giving warnings if an expression has changed meaning
because of these precedence changes. These warnings are off by default
and are enabled by the new GUC operator_precedence_warning. It's believed
that very few applications will be affected by these changes, but it was
agreed that a warning mechanism is essential to help debug any that are.
Until now __attribute__() was defined to be empty for all compilers but
gcc. That's problematic because it prevents using it in other compilers;
which is necessary e.g. for atomics portability. It's also just
generally dubious to do so in a header as widely included as c.h.
Instead add pg_attribute_format_arg, pg_attribute_printf,
pg_attribute_noreturn macros which are implemented in the compilers that
understand them. Also add pg_attribute_noreturn and pg_attribute_packed,
but don't provide fallbacks, since they can affect functionality.
This means that external code that, possibly unwittingly, relied on
__attribute__ defined to be empty on !gcc compilers may now run into
warnings or errors on those compilers. But there shouldn't be many
occurances of that and it's hard to work around...
Discussion: 54B58BA3.8040302@ohmu.fi
Author: Oskari Saarenmaa, with some minor changes by me.
Commit 865f14a2d3 was quite a few bricks
shy of a load: psql, ecpg, and plpgsql were all left out-of-step with
the core lexer. Of these only the last was likely to be a fatal
problem; but still, a minimal amount of grepping, or even just reading
the comments adjacent to the places that were changed, would have found
the other places that needed to be changed.
This is a possibly-vain effort to silence a Coverity warning about
bogus endianness dependency. The code's fine, because it takes care
of endianness issues for itself, but Coverity sees an int64 being
passed to an int* argument and not unreasonably suspects something's
wrong. I'm not sure if putting the void* cast in the way will shut it
up; but it can't hurt and seems better from a documentation standpoint
anyway, since the pointer is not used as an int* in this code path.
Just for a bit of additional safety, verify that the result length
is 8 bytes as expected.
Back-patch to 9.3 where the code in question was added.
The SGML docs claimed that 1-byte integers could be sent or received with
the "isint" options, but no such behavior has ever been implemented in
pqGetInt() or pqPutInt(). The in-code documentation header for PQfn() was
even less in tune with reality, and the code itself used parameter names
matching neither the SGML docs nor its libpq-fe.h declaration. Do a bit
of additional wordsmithing on the SGML docs while at it.
Since the business about 1-byte integers is a clear documentation bug,
back-patch to all supported branches.
There are a couple of places in our grammar that fail to be strict LALR(1),
by requiring more than a single token of lookahead to decide what to do.
Up to now we've dealt with that by using a filter between the lexer and
parser that merges adjacent tokens into one in the places where two tokens
of lookahead are necessary. But that creates a number of user-visible
anomalies, for instance that you can't name a CTE "ordinality" because
"WITH ordinality AS ..." triggers folding of WITH and ORDINALITY into one
token. I realized that there's a better way.
In this patch, we still do the lookahead basically as before, but we never
merge the second token into the first; we replace just the first token by
a special lookahead symbol when one of the lookahead pairs is seen.
This requires a couple extra productions in the grammar, but it involves
fewer special tokens, so that the grammar tables come out a bit smaller
than before. The filter logic is no slower than before, perhaps a bit
faster.
I also fixed the filter logic so that when backing up after a lookahead,
the current token's terminator is correctly restored; this eliminates some
weird behavior in error message issuance, as is shown by the one change in
existing regression test outputs.
I believe that this patch entirely eliminates odd behaviors caused by
lookahead for WITH. It doesn't really improve the situation for NULLS
followed by FIRST/LAST unfortunately: those sequences still act like a
reserved word, even though there are cases where they should be seen as two
ordinary identifiers, eg "SELECT nulls first FROM ...". I experimented
with additional grammar hacks but couldn't find any simple solution for
that. Still, this is better than before, and it seems much more likely
that we *could* somehow solve the NULLS case on the basis of this filter
behavior than the previous one.
If libpq output buffer is full, pqSendSome() function tries to drain any
incoming data. This avoids deadlock, if the server e.g. sends a lot of
NOTICE messages, and blocks until we read them. However, pqSendSome() only
did that in blocking mode. In non-blocking mode, the deadlock could still
happen.
To fix, take a two-pronged approach:
1. Change the documentation to instruct that when PQflush() returns 1, you
should wait for both read- and write-ready, and call PQconsumeInput() if it
becomes read-ready. That fixes the deadlock, but applications are not going
to change overnight.
2. In pqSendSome(), drain the input buffer before returning 1. This
alleviates the problem for applications that only wait for write-ready. In
particular, a slow but steady stream of NOTICE messages during COPY FROM
STDIN will no longer cause a deadlock. The risk remains that the server
attempts to send a large burst of data and fills its output buffer, and at
the same time the client also sends enough data to fill its output buffer.
The application will deadlock if it goes to sleep, waiting for the socket
to become write-ready, before the server's data arrives. In practice,
NOTICE messages and such that the server might be sending are usually
short, so it's highly unlikely that the server would fill its output buffer
so quickly.
Backpatch to all supported versions.
After finding an "=" character, the pointer was advanced twice when it
should only advance once. This is harmless as long as the value after "="
has at least one character; but if it doesn't, we'd miss the terminator
character and include too much in the value.
In principle this could lead to reading off the end of memory. It does not
seem worth treating as a security issue though, because it would happen on
client side, and besides client logic that's taking conninfo strings from
untrusted sources has much worse security problems than this.
Report and patch received off-list from Thomas Fanghaenel.
Back-patch to 9.2 where the faulty code was introduced.
When ecpg was rewritten to the new protocol version not all variable types
were corrected. This patch rewrites the code for these types to fix that. It
also fixes the documentation to correctly tell the status of array handling.
actually check the returned pointer allocated, potentially NULL which
could be the result of a malloc call.
Issue noted by Coverity, fixed by Michael Paquier <michael@otacoo.com>
All the other new SSL information functions had dummy versions in
be-secure.c, but I missed PQsslAttributes(). Oops. Surprisingly, the linker
did not complain about the missing function on most platforms represented in
the buildfarm, even though it is exported, except for a few Windows systems.
This makes it possible to query for things like the SSL version and cipher
used, without depending on OpenSSL functions or macros. That is a good
thing if we ever get another SSL implementation.
PQgetssl() still works, but it should be considered as deprecated as it
only works with OpenSSL. In particular, PQgetSslInUse() should be used to
check if a connection uses SSL, because as soon as we have another
implementation, PQgetssl() will return NULL even if SSL is in use.
strncpy() has a well-deserved reputation for being unsafe, so make an
effort to get rid of nearly all occurrences in HEAD.
A large fraction of the remaining uses were passing length less than or
equal to the known strlen() of the source, in which case no null-padding
can occur and the behavior is equivalent to memcpy(), though doubtless
slower and certainly harder to reason about. So just use memcpy() in
these cases.
In other cases, use either StrNCpy() or strlcpy() as appropriate (depending
on whether padding to the full length of the destination buffer seems
useful).
I left a few strncpy() calls alone in the src/timezone/ code, to keep it
in sync with upstream (the IANA tzcode distribution). There are also a
few such calls in ecpg that could possibly do with more analysis.
AFAICT, none of these changes are more than cosmetic, except for the four
occurrences in fe-secure-openssl.c, which are in fact buggy: an overlength
source leads to a non-null-terminated destination buffer and ensuing
misbehavior. These don't seem like security issues, first because no stack
clobber is possible and second because if your values of sslcert etc are
coming from untrusted sources then you've got problems way worse than this.
Still, it's undesirable to have unpredictable behavior for overlength
inputs, so back-patch those four changes to all active branches.
Some users run their applications in chroot environments that lack an
/etc/passwd file. This means that the current UID's user name and home
directory are not obtainable. libpq used to be all right with that,
so long as the database role name to use was specified explicitly.
But commit a4c8f14364 broke such cases by
causing any failure of pg_fe_getauthname() to be treated as a hard error.
In any case it did little to advance its nominal goal of causing errors
in pg_fe_getauthname() to be reported better. So revert that and instead
put some real error-reporting code in place. This requires changes to the
APIs of pg_fe_getauthname() and pqGetpwuid(), since the latter had
departed from the POSIX-specified API of getpwuid_r() in a way that made
it impossible to distinguish actual lookup errors from "no such user".
To allow such failures to be reported, while not failing if the caller
supplies a role name, add a second call of pg_fe_getauthname() in
connectOptions2(). This is a tad ugly, and could perhaps be avoided with
some refactoring of PQsetdbLogin(), but I'll leave that idea for later.
(Note that the complained-of misbehavior only occurs in PQsetdbLogin,
not when using the PQconnect functions, because in the latter we will
never bother to call pg_fe_getauthname() if the user gives a role name.)
In passing also clean up the Windows-side usage of GetUserName(): the
recommended buffer size is 257 bytes, the passed buffer length should
be the buffer size not buffer size less 1, and any error is reported
by GetLastError() not errno.
Per report from Christoph Berg. Back-patch to 9.4 where the chroot
failure case was introduced. The generally poor reporting of errors
here is of very long standing, of course, but given the lack of field
complaints about it we won't risk changing these APIs further back
(even though they're theoretically internal to libpq).
Coverity complained that the "else" added to fillPGconn() was unreachable,
which it was. Remove the dead code. In passing, rearrange the tests so as
not to bother trying to fetch values for options that can't be assigned.
Pre-9.3 did not have that issue, but it did have a "return" that should be
"goto oom_error" to ensure that a suitable error message gets filled in.
This reverts commit 9f80f4835a. The
function returned the raw value of a connection parameter, a task served
by PQconninfo(). The next commit will reimplement the psql \conninfo
change that way. Back-patch to 9.4, where that commit first appeared.
If the "dbname" attribute in PQconnectDBParams contained a connection string
or URI (and expand_dbname = TRUE), the database name from the connection
string could not be overridden by a subsequent "dbname" keyword in the
array. That was not intentional; all other options can be overridden.
Furthermore, any subsequent "dbname" caused the connection string from the
first dbname value to be processed again, overriding any values for the same
options that were given between the connection string and the second dbname
option.
In the passing, clarify in the docs that only the first dbname option in the
array is parsed as a connection string.
Alex Shulgin. Backpatch to all supported versions.
An out-of-memory in most of these would lead to strange behavior, like
connecting to a different database than intended, but some would lead to
an outright segfault.
Alex Shulgin and me. Backpatch to all supported versions.
Cygwin builds require this of dependencies pertaining to pattern rules.
On Cygwin, stat("foo") in the absence of a file with that exact name can
locate foo.exe. While GNU make uses stat() for dependencies of ordinary
rules, it uses readdir() to assess dependencies of pattern rules.
Therefore, a pattern rule dependency should match any underlying file
name exactly. Back-patch to 9.4, where the dependency was introduced.
If you call PQreset() repeatedly, and the connection cannot be
re-established, the error messages from the failed connection attempts
kept accumulating in the error string.
Fixes bug #11455 reported by Caleb Epstein. Backpatch to all supported
versions.
The EOF-detection logic in pqReadData was a bit confused about who should
set up the error message in case the kernel gives us read-ready-but-no-data
rather than ECONNRESET or some other explicit error condition. Since the
whole point of this situation is that the lower-level functions don't know
there's anything wrong, pqReadData itself must set up the message. But
keep the assumption that if an errno was reported, a message was set up at
lower levels.
Per bug #11712 from Marko Tiikkaja. It's been like this for a very long
time, so back-patch to all supported branches.
This improves consistency with the MSVC build. On buildfarm member
narwhal, since commit 846e91e022,
shfolder.dll:SHGetFolderPath() crashes when dblink calls it by way of
pqGetHomeDirectory(). Back-patch to 9.4, where that commit first
appeared. How it caused this regression remains a mystery. This is a
partial revert of commit 889f038129, which
adopted shfolder.dll for Windows NT 4.0 compatibility. PostgreSQL 8.2
dropped support for that operating system.
Up to now, PG has assumed that any given timezone abbreviation (such as
"EDT") represents a constant GMT offset in the usage of any particular
region; we had a way to configure what that offset was, but not for it
to be changeable over time. But, as with most things horological, this
view of the world is too simplistic: there are numerous regions that have
at one time or another switched to a different GMT offset but kept using
the same timezone abbreviation. Almost the entire Russian Federation did
that a few years ago, and later this month they're going to do it again.
And there are similar examples all over the world.
To cope with this, invent the notion of a "dynamic timezone abbreviation",
which is one that is referenced to a particular underlying timezone
(as defined in the IANA timezone database) and means whatever it currently
means in that zone. For zones that use or have used daylight-savings time,
the standard and DST abbreviations continue to have the property that you
can specify standard or DST time and get that time offset whether or not
DST was theoretically in effect at the time. However, the abbreviations
mean what they meant at the time in question (or most recently before that
time) rather than being absolutely fixed.
The standard abbreviation-list files have been changed to use this behavior
for abbreviations that have actually varied in meaning since 1970. The
old simple-numeric definitions are kept for abbreviations that have not
changed, since they are a bit faster to resolve.
While this is clearly a new feature, it seems necessary to back-patch it
into all active branches, because otherwise use of Russian zone
abbreviations is going to become even more problematic than it already was.
This change supersedes the changes in commit 513d06ded et al to modify the
fixed meanings of the Russian abbreviations; since we've not shipped that
yet, this will avoid an undesirably incompatible (not to mention incorrect)
change in behavior for timestamps between 2011 and 2014.
This patch makes some cosmetic changes in ecpglib to keep its usage of
datetime lookup tables as similar as possible to the backend code, but
doesn't do anything about the increasingly obsolete set of timezone
abbreviation definitions that are hard-wired into ecpglib. Whatever we
do about that will likely not be appropriate material for back-patching.
Also, a potential free() of a garbage pointer after an out-of-memory
failure in ecpglib has been fixed.
This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that
caused it to produce unexpected results near a timezone transition, if
both the "before" and "after" states are marked as standard time. We'd
only ever thought about or tested transitions between standard and DST
time, but that's not what's happening when a zone simply redefines their
base GMT offset.
In passing, update the SGML documentation to refer to the Olson/zoneinfo/
zic timezone database as the "IANA" database, since it's now being
maintained under the auspices of IANA.
The code wrote a value into the caller's field[] array before checking
to see if there was room, which of course is backwards. Per report from
Michael Paquier.
I fixed the equivalent bug in the backend's version of this code way back
in 630684d3a1, but failed to think about ecpg's copy. Fortunately
this doesn't look like it would be exploitable for anything worse than a
core dump: an external attacker would have no control over the single word
that gets written.
The RFCs say that the CN must not be checked if a subjectAltName extension
of type dNSName is present. IOW, if subjectAltName extension is present,
but there are no dNSNames, we can still check the CN.
Alexey Klyukin
This patch makes libpq check the server's hostname against DNS names listed
in the X509 subjectAltName extension field in the server certificate. This
allows the same certificate to be used for multiple domain names. If there
are no SANs in the certificate, the Common Name field is used, like before
this patch. If both are given, the Common Name is ignored. That is a bit
surprising, but that's the behavior mandated by the relevant RFCs, and it's
also what the common web browsers do.
This also adds a libpq_ngettext helper macro to allow plural messages to be
translated in libpq. Apparently this happened to be the first plural message
in libpq, so it was not needed before.
Alexey Klyukin, with some kibitzing by me.
Programs need execute permission on a DLL file to load it. MSYS
"install" ignores the mode argument, and our Cygwin build statically
links libpq into programs. That explains the lack of buildfarm trouble.
Back-patch to 9.0 (all supported versions).
In support of this, have the MSVC build follow GNU make in preferring
GNUmakefile over Makefile when a directory contains both.
Michael Paquier, reviewed by MauMau.
This refactoring is in preparation for adding support for other SSL
implementations, with no user-visible effects. There are now two #defines,
USE_OPENSSL which is defined when building with OpenSSL, and USE_SSL which
is defined when building with any SSL implementation. Currently, OpenSSL is
the only implementation so the two #defines go together, but USE_SSL is
supposed to be used for implementation-independent code.
The libpq SSL code is changed to use a custom BIO, which does all the raw
I/O, like we've been doing in the backend for a long time. That makes it
possible to use MSG_NOSIGNAL to block SIGPIPE when using SSL, which avoids
a couple of syscall for each send(). Probably doesn't make much performance
difference in practice - the SSL encryption is expensive enough to mask the
effect - but it was a natural result of this refactoring.
Based on a patch by Martijn van Oosterhout from 2006. Briefly reviewed by
Alvaro Herrera, Andreas Karlsson, Jeff Janes.
Based on the old comment, it took me a while to figure out what the
problem was. The importnat detail is that SSL_read() can return WANT_READ
even though some raw data was received from the socket.
ws2_32 is the new version of the library that should be used, as
it contains the require functionality from wsock32 as well as some
more (which is why some binaries were already using ws2_32).
Michael Paquier, reviewed by MauMau
Prominent binaries already had this metadata. A handful of minor
binaries, such as pg_regress.exe, still lack it; efforts to eliminate
such exceptions are welcome.
Michael Paquier, reviewed by MauMau.
Give passwords to each user created in support of an ECPG connection
test case. Use SET SESSION AUTHORIZATION, not a fresh connection, to
reduce privileges during a dblink test case.
To test against such a server, both the "make installcheck-world"
environment and the postmaster environment must provide the default
user's password; $PGPASSFILE is the principal way to do so. (The
postmaster environment needs it for dblink and postgres_fdw tests.)
This reverts commit 45b7abe59e.
It turns out that the %name-prefix syntax without "=" does not work
at all in pre-2.4 Bison. We are not prepared to make such a large
jump in minimum required Bison version just to suppress a warning
message in a version hardly any developers are using yet.
When 3.0 gets more popular, we'll figure out a way to deal with this.
In the meantime, BISONFLAGS=-Wno-deprecated is recommendable for
anyone using 3.0 who doesn't want to see the warning.
%name-prefix doesn't use an "=" sign according to the Bison docs, but it
silently accepted one anyway, until Bison 3.0. This was originally a
typo of mine in commit 012abebab1, and we
seem to have slavishly copied the error into all the other grammar files.
Per report from Vik Fearing; analysis by Peter Eisentraut.
Back-patch to all active branches, since somebody might try to build
a back branch with up-to-date tools.
Ensure that ecpg preprocessor output files are rebuilt when re-testing
after a change in the ecpg preprocessor itself, or a change in any of
several include files that get copied verbatim into the output files.
The lack of these dependencies was what created problems for Kevin Grittner
after the recent pgindent run. There's no way for --enable-depend to
discover these dependencies automatically, so we've gotta put them into
the Makefiles by hand.
While at it, reduce the amount of duplication in the ecpg invocations.
Commit 4318daecc9 broke it. The change in
sub-second precision at extreme dates is normal. The inconsistent
truncation vs. rounding is essentially a bug, albeit a longstanding one.
Back-patch to 8.4, like the causative commit.
If the server sends a long stream of data, and the server + network are
consistently fast enough to force the recv() loop in pqReadData() to
iterate until libpq's input buffer is full, then upon processing the last
incomplete message in each bufferload we'd usually double the buffer size,
due to supposing that we didn't have enough room in the buffer to finish
collecting that message. After filling the newly-enlarged buffer, the
cycle repeats, eventually resulting in an out-of-memory situation (which
would be reported misleadingly as "lost synchronization with server").
Of course, we should not enlarge the buffer unless we still need room
after discarding already-processed messages.
This bug dates back quite a long time: pqParseInput3 has had the behavior
since perhaps 2003, getCopyDataMessage at least since commit 70066eb1a1
in 2008. Probably the reason it's not been isolated before is that in
common environments the recv() loop would always be faster than the server
(if on the same machine) or faster than the network (if not); or at least
it wouldn't be slower consistently enough to let the buffer ramp up to a
problematic size. The reported cases involve Windows, which perhaps has
different timing behavior than other platforms.
Per bug #7914 from Shin-ichi Morita, though this is different from his
proposed solution. Back-patch to all supported branches.
When array of char * was used as target for a FETCH statement returning more
than one row, it tried to store all the result in the first element. Instead it
should dump array of char pointers with right offset, use the address instead
of the value of the C variable while reading the array and treat such variable
as char **, instead of char * for pointer arithmetic.
Patch by Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
It's easy to forget using SYSTEMQUOTEs when constructing command strings
for system() or popen(). Even if we fix all the places missing it now, it is
bound to be forgotten again in the future. Introduce wrapper functions that
do the the extra quoting for you, and get rid of SYSTEMQUOTEs in all the
callers.
We previosly used SYSTEMQUOTEs in all the hard-coded command strings, and
this doesn't change the behavior of those. But user-supplied commands, like
archive_command, restore_command, COPY TO/FROM PROGRAM calls, as well as
pgbench's \shell, will now gain an extra pair of quotes. That is desirable,
but if you have existing scripts or config files that include an extra
pair of quotes, those might need to be adjusted.
Reviewed by Amit Kapila and Tom Lane
Previously, these functions treated "" optin values as defaults in some
ways, but not in others, like when comparing to .pgpass. Also, add
documentation to clarify that now "" and NULL use defaults, like
PQsetdbLogin() has always done.
BACKWARD INCOMPATIBILITY
Patch by Adrian Vondendriesch, docs by me
Report by Jeff Janes
Introduced in 585bca39: msgid is not used in the Windows code path.
Also adjust comments a tad (mostly to keep pgindent from messing it up).
David Rowley
Previously, 'int' was used for socket values in libpq, but socket values
are unsigned on Windows. This is a style correction.
Initial patch and previous PGINVALID_SOCKET initial patch by Joel
Jacobson, modified by me
Report from PVS-Studio
Bind attempts to an LDAP server should time out after two seconds,
allowing additional lines in the service control file to be parsed
(which provide a fall back to a secondary LDAP server or default options).
The existing code failed to enforce that timeout during TCP connect,
resulting in a hang far longer than two seconds if the LDAP server
does not respond.
Laurenz Albe
Previously, in some places, socket creation errors were checked for
negative values, which is not true for Windows because sockets are
unsigned. This masked socket creation errors on Windows.
Backpatch through 9.0. 8.4 doesn't have the infrastructure to fix this.
Remarkably, this hasn't been noticed before, though it surely should
have been happening since around the fall of the Byzantine empire.
Commit 438b529604 changed path.c to depend on FRONTEND, and that exposed
the omission, per buildfarm reports.
I'm suspicious that some other subdirectories are missing this too,
but this one change is enough to make ecpg tests pass for me.
"8" was correct back when "disable" was the longest allowed value, but
since "verify-full" was added, it should be "12". Given the lack of
complaints, I wouldn't be surprised if nobody is actually using these
values ... but still, if they're in the API, they should be right.
Noticed while pursuing a different problem. It's been wrong for quite
a long time, so back-patch to all supported branches.
A number of issues were identified by the Coverity scanner and are
addressed in this patch. None of these appear to be security issues
and many are mostly cosmetic changes.
Short comments for each of the changes follows.
Correct the semi-colon placement in be-secure.c regarding SSL retries.
Remove a useless comparison-to-NULL in proc.c (value is dereferenced
prior to this check and therefore can't be NULL).
Add checking of chmod() return values to initdb.
Fix a couple minor memory leaks in initdb.
Fix memory leak in pg_ctl- involves free'ing the config file contents.
Use an int to capture fgetc() return instead of an enum in pg_dump.
Fix minor memory leaks in pg_dump.
(note minor change to convertOperatorReference()'s API)
Check fclose()/remove() return codes in psql.
Check fstat(), find_my_exec() return codes in psql.
Various ECPG memory leak fixes.
Check find_my_exec() return in ECPG.
Explicitly ignore pqFlush return in libpq error-path.
Change PQfnumber() to avoid doing an strdup() when no changes required.
Remove a few useless check-against-NULL's (value deref'd beforehand).
Check rmtree(), malloc() results in pg_regress.
Also check get_alternative_expectfile() return in pg_regress.
Some of the files we optionally link in from elsewhere weren't ignored
and/or weren't cleaned up at "make clean". Noted while testing on a
machine that needs our version of snprintf.c.
Coverity identified a number of places in which it couldn't prove that a
string being copied into a fixed-size buffer would fit. We believe that
most, perhaps all of these are in fact safe, or are copying data that is
coming from a trusted source so that any overrun is not really a security
issue. Nonetheless it seems prudent to forestall any risk by using
strlcpy() and similar functions.
Fixes by Peter Eisentraut and Jozef Mlich based on Coverity reports.
In addition, fix a potential null-pointer-dereference crash in
contrib/chkpass. The crypt(3) function is defined to return NULL on
failure, but chkpass.c didn't check for that before using the result.
The main practical case in which this could be an issue is if libc is
configured to refuse to execute unapproved hashing algorithms (e.g.,
"FIPS mode"). This ideally should've been a separate commit, but
since it touches code adjacent to one of the buffer overrun changes,
I included it in this commit to avoid last-minute merge issues.
This issue was reported by Honza Horak.
Security: CVE-2014-0065 for buffer overruns, CVE-2014-0066 for crypt()
Many server functions use the MAXDATELEN constant to size a buffer for
parsing or displaying a datetime value. It was much too small for the
longest possible interval output and slightly too small for certain
valid timestamp input, particularly input with a long timezone name.
The long input was rejected needlessly; the long output caused
interval_out() to overrun its buffer. ECPG's pgtypes library has a copy
of the vulnerable functions, which bore the same vulnerabilities along
with some of its own. In contrast to the server, certain long inputs
caused stack overflow rather than failing cleanly. Back-patch to 8.4
(all supported versions).
Reported by Daniel Schüssler, reviewed by Tom Lane.
Security: CVE-2014-0063
In pqSendSome, if the connection is already closed at entry, discard any
queued output data before returning. There is no possibility of ever
sending the data, and anyway this corresponds to what we'd do if we'd
detected a hard error while trying to send(). This avoids possible
indefinite bloat of the output buffer if the application keeps trying
to send data (or even just keeps trying to do PQputCopyEnd, as psql
indeed will).
Because PQputCopyEnd won't transition out of PGASYNC_COPY_IN state
until it's successfully queued the COPY END message, and pqPutMsgEnd
doesn't distinguish a queuing failure from a pqSendSome failure,
this omission allowed an infinite loop in psql if the connection closure
occurred when we had at least 8K queued to send. It might be worth
refactoring so that we can make that distinction, but for the moment
the other changes made here seem to offer adequate defenses.
To guard against other variants of this scenario, do not allow
PQgetResult to return a PGRES_COPY_XXX result if the connection is
already known dead. Make sure it returns PGRES_FATAL_ERROR instead.
Per report from Stephen Frost. Back-patch to all active branches.
This has long been done by the MSVC build system, and has caused
confusion in the past when programs like psql have failed to start
because they can't find the DLL. If it's in the same directory as it now
will be they will find it.
Backpatch to all live branches.
Commit 820f08cabd claimed to make the server
and libpq handle SSL protocol versions identically, but actually the server
was still accepting SSL v3 protocol while libpq wasn't. Per discussion,
SSL v3 is obsolete, and there's no good reason to continue to accept it.
So make the code really equivalent on both sides. The behavior now is
that we use the highest mutually-supported TLS protocol version.
Marko Kreen, some comment-smithing by me
New checks include input, month/day/time internal adjustments, addition,
subtraction, multiplication, and negation. Also adjust docs to
correctly specify interval size in bytes.
Report from Rok Kralj
Per report from Jeffrey Walton, libpq has been accepting only TLSv1
exactly. Along the lines of the backend code, libpq will now support
new versions as OpenSSL adds them.
Marko Kreen, reviewed by Wim Lewis.
There was a bug in the psql's meta command \conninfo. When the
IP address was specified in the hostaddr and psql used it to create
a connection (i.e., psql -d "hostaddr=xxx"), \conninfo could not
display that address. This is because \conninfo got the connection
information only from PQhost() which could not return hostaddr.
This patch adds PQhostaddr(), and changes \conninfo so that it
can display not only the host name that PQhost() returns but also
the IP address which PQhostaddr() returns.
The bug has existed since 9.1 where \conninfo was introduced.
But it's too late to add new libpq function into the released versions,
so no backpatch.
In the platform that doesn't support Unix-domain socket, when
neither host nor hostaddr are specified, the default host
'localhost' is used to connect to the server and PQhost() must
return that, but it didn't. This patch fixes PQhost() so that
it returns the default host in that case.
Also this patch fixes PQhost() so that it doesn't return
Unix-domain socket directory path in the platform that doesn't
support Unix-domain socket.
Back-patch to all supported versions.
krb5 has been deprecated since 8.3, and the recommended way to do
Kerberos authentication is using the GSSAPI authentication method
(which is still fully supported).
libpq retains the ability to identify krb5 authentication, but only
gives an error message about it being unsupported. Since all authentication
is initiated from the backend, there is no need to keep it at all
in the backend.
Split the rather long ecpg_execute() function into ecpg_build_params(),
ecpg_autostart_transaction(), a smaller ecpg_execute() and
ecpg_process_output(). There is no user-visible change here, only code
reorganization to support future patches.
Author: Zoltán Böszörményi
Reviewed by Antonin Houska. Larger, older versions of this patch were
reviewed by Noah Misch and Michael Meskes.
This splits ECPGdo() into ecpg_prologue(), ecpg_do() and
ecpg_epilogue(), and renames free_params() into ecpg_free_params() and
exports it. This makes it possible for future code to use these
routines for their own purposes.
There is no user-visible functionality change here, only code
reorganization.
Zoltán Böszörményi
Reviewed by Antonin Houska. Larger, older versions of this patch were
reviewed by Noah Misch and Michael Meskes.
While working on most platforms the old way sometimes created alignment
problems. This should fix it. Also the regresion tests were updated to test for
the reported case.
Report and fix by MauMau <maumau307@gmail.com>
Previously missing or invalid service files returned NULL. Also fix
pg_upgrade to report "out of memory" for a null return from
PQconndefaults().
Patch by Steve Singer, rewritten by me
variables is varchar. This fixes this test case:
int main(void)
{
exec sql begin declare section;
varchar a[50], b[50];
exec sql end declare section;
return 0;
}
Since varchars are internally turned into custom structs and
the type name is emitted for these variable declarations,
the preprocessed code previously had:
struct varchar_1 { ... } a _,_ struct varchar_2 { ... } b ;
The comma in the generated C file was a syntax error.
There are no regression test changes since it's not exercised.
Patch by Boszormenyi Zoltan <zb@cybertec.at>
ECPG is not supposed to allow and output nested comments in C. These comments
are only allowed in the SQL parts and must not be written into the C file.
Also the different handling of different comments is documented.
Commit 9b4d52f209 failed to notice
that pg_regress_ecpg needed updating.
This patch was independently submitted by both David Rowley
and Andres Freund.
When using a C99-compliant vsnprintf, we can use its report of the required
buffer size to avoid making multiple loops through the formatting logic.
This is similar to the changes recently made in stringinfo.c, but we can't
use psprintf.c here because in libpq we don't want to exit() on error.
(The behavior pqexpbuffer.c has historically used is to mark the
PQExpBuffer as "broken", ie empty, if it runs into any fatal problem.)
To avoid duplicating code more than necessary, I refactored
printfPQExpBuffer and appendPQExpBuffer to share a subroutine that's
very similar to psprintf.c's pvsnprintf in spirit.
asprintf(), aside from not being particularly portable, has a fundamentally
badly-designed API; the psprintf() function that was added in passing in
the previous patch has a much better API choice. Moreover, the NetBSD
implementation that was borrowed for the previous patch doesn't work with
non-C99-compliant vsnprintf, which is something we still have to cope with
on some platforms; and it depends on va_copy which isn't all that portable
either. Get rid of that code in favor of an implementation similar to what
we've used for many years in stringinfo.c. Also, move it into libpgcommon
since it's not really libpgport material.
I think this patch will be enough to turn the buildfarm green again, but
there's still cosmetic work left to do, namely get rid of pg_asprintf()
in favor of using psprintf(). That will come in a followon patch.
Now that msgfmt is run with -c by default, older versions of gettext are
complaining about the PO headers Last-Translator and Language-Team
still having their default values. Newer gettext versions fail to catch
this because of a bug (https://savannah.gnu.org/bugs/?40261), which is
why this hasn't been noticed before.
Copy updated versions of affected translation files from the
pgtranslations repository, were those files have been fixed.
Add asprintf(), pg_asprintf(), and psprintf() to simplify string
allocation and composition. Replacement implementations taken from
NetBSD.
Reviewed-by: Álvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Asif Naeem <anaeem.it@gmail.com>
In libpq, we set up and pass to OpenSSL callback routines to handle
locking. When we run out of SSL connections, we try to clean things
up by de-registering the hooks. Unfortunately, we had a few calls
into the OpenSSL library after these hooks were de-registered during
SSL cleanup which lead to deadlocking. This moves the thread callback
cleanup to be after all SSL-cleanup related OpenSSL library calls.
I've been unable to reproduce the deadlock with this fix.
In passing, also move the close_SSL call to be after unlocking our
ssl_config mutex when in a failure state. While it looks pretty
unlikely to be an issue, it could have resulted in deadlocks if we
ended up in this code path due to something other than SSL_new
failing. Thanks to Heikki for pointing this out.
Back-patch to all supported versions; note that the close_SSL issue
only goes back to 9.0, so that hunk isn't included in the 8.4 patch.
Initially found and reported by Vesa-Matti J Kari; many thanks to
both Heikki and Andres for their help running down the specific
issue and reviewing the patch.
We should really be reporting a useful error along with returning
a valid return code if pthread_mutex_lock() throws an error for
some reason. Add that and back-patch to 9.0 as the prior patch.
Pointed out by Alvaro Herrera
I've been working with Nick Phillips on an issue he ran into when
trying to use threads with SSL client certificates. As it turns out,
the call in initialize_SSL() to SSL_CTX_use_certificate_chain_file()
will modify our SSL_context without any protection from other threads
also calling that function or being at some other point and trying to
read from SSL_context.
To protect against this, I've written up the attached (based on an
initial patch from Nick and much subsequent discussion) which puts
locks around SSL_CTX_use_certificate_chain_file() and all of the other
users of SSL_context which weren't already protected.
Nick Phillips, much reworked by Stephen Frost
Back-patch to 9.0 where we started loading the cert directly instead of
using a callback.
PGTYPEStimestamp_defmt_scan() was declared twice inside different .c
files, with slightly different prototypes. Move it into a header file
and correct the prototype.
On Unix-ish platforms, EWOULDBLOCK may be the same as EAGAIN, which is
*not* a success return, at least not on Linux. We need to treat it as a
failure to avoid giving a misleading error message. Per the Single Unix
Spec, only EINPROGRESS and EINTR returns indicate that the connection
attempt is in progress.
On Windows, on the other hand, EWOULDBLOCK (WSAEWOULDBLOCK) is the expected
case. We must accept EINPROGRESS as well because Cygwin will return that,
and it doesn't seem worth distinguishing Cygwin from native Windows here.
It's not very clear whether EINTR can occur on Windows, but let's leave
that part of the logic alone in the absence of concrete trouble reports.
Also, remove the test for errno == 0, effectively reverting commit
da9501bddb, which AFAICS was just a thinko;
or at best it might have been a workaround for a platform-specific bug,
which we can hope is gone now thirteen years later. In any case, since
libpq makes no effort to reset errno to zero before calling connect(),
it seems unlikely that that test has ever reliably done anything useful.
Andres Freund and Tom Lane
Make slightly better decisions about indentation than what pgindent
is capable of. Mostly breaking out long function calls into one
line per argument, with a few other minor adjustments.
No functional changes- all whitespace.
pgindent ran cleanly (didn't change anything) after.
Passes all regressions.
Previously, libpq and the backend had opposite ideas about whether
it was necessary for the client to send a CopyDone message after
receiving an ErrorResponse, making it impossible to cleanly exit
COPY BOTH mode. Fix libpq so that works correctly, adopting the
backend's notion that an ErrorResponse kills the copy in both
directions.
Adjust receivelog.c to avoid a degradation in the quality of the
resulting error messages. libpqwalreceiver.c is already doing
the right thing, so no adjustment needed there.
Add an explicit statement to the documentation explaining how
this part of the protocol is supposed to work, in the hopes of
avoiding future confusion in this area.
Since the consequences of all this confusion are very limited,
especially in the back-branches where no client ever attempts
to exit COPY BOTH mode without closing the connection entirely,
no back-patch.
There's probably no real bug here at present, so not backpatching.
But it seems good to make these bits consistent with the rest of
libpq, so as to avoid future surprises.
Patch by me. Review by Tom Lane.
In most cases, these were just references to the SQL standard in
general. In a few cases, a contrast was made between SQL92 and later
standards -- those have been kept unchanged.
This reverts commit 3780fc679c.
HP-UX didn't like it. There would probably be a way to fix that, but
since the net effect of all of this is zero because ecpg ends up using
libpq anyway, it's not worth bothering further.
This will hopefully be easier to use than pg_config for users who are
already used to the pkg-config interface. It also works better for
multi-arch installations.
reviewed by Tom Lane
It doesn't actually use libpq. But we need to keep libpq in the
CPPFLAGS for building, because compatlib uses ecpglib.h which uses
libpq-fe.h, but we don't need to refer to libpq for linking.
reviewed by Tom Lane
In some parallel make situations, the install-headers target could be
called before the installation directories are created by installdirs,
causing the installation to fail. Fix that by making install-headers
depend on installdirs.
We need this in non-ENABLE_THREAD_SAFETY builds, and also to satisfy
the exports.txt entry; while it might be a good idea to remove the
latter, I'm hesitant to do so except in the context of an intentional
ABI break. At least we don't have a separately maintained source file
for it anymore.
We had two copies of this function in the backend and libpq, which was
already pretty bogus, but it turns out that we need it in some other
programs that don't use libpq (such as pg_test_fsync). So put it where
it probably should have been all along. The signal-mask-initialization
support in src/backend/libpq/pqsignal.c stays where it is, though, since
we only need that in the backend.
I fixed this code back in commit 841b4a2d5, but didn't think carefully
enough about the behavior near zero, which meant it improperly rejected
1999-12-31 24:00:00. Per report from Magnus Hagander.
This includes backend "COPY TO/FROM PROGRAM '...'" syntax, and corresponding
psql \copy syntax. Like with reading/writing files, the backend version is
superuser-only, and in the psql version, the program is run in the client.
In the passing, the psql \copy STDIN/STDOUT syntax is subtly changed: if you
the stdin/stdout is quoted, it's now interpreted as a filename. For example,
"\copy foo from 'stdin'" now reads from a file called 'stdin', not from
standard input. Before this, there was no way to specify a filename called
stdin, stdout, pstdin or pstdout.
This creates a new function in pgport, wait_result_to_str(), which can
be used to convert the exit status of a process, as returned by wait(3),
to a human-readable string.
Etsuro Fujita, reviewed by Amit Kapila.
The backend grammar treats STDIN and STDOUT completely interchangeable, so
that the above accepted. Arguably that was a mistake the backend grammar,
but it's not ecpg's business to second guess that.
This patch addresses the problem that applications currently have to
extract object names from possibly-localized textual error messages,
if they want to know for example which index caused a UNIQUE_VIOLATION
failure. It adds new error message fields to the wire protocol, which
can carry the name of a table, table column, data type, or constraint
associated with the error. (Since the protocol spec has always instructed
clients to ignore unrecognized field types, this should not create any
compatibility problem.)
Support for providing these new fields has been added to just a limited set
of error reports (mainly, those in the "integrity constraint violation"
SQLSTATE class), but we will doubtless add them to more calls in future.
Pavel Stehule, reviewed and extensively revised by Peter Geoghegan, with
additional hacking by Tom Lane.
This bug goes back to the original Postgres95 sources. Its significance
to modern PG versions is marginal, since we have not used PQprintTuples()
internally in a very long time, and it doesn't seem to have ever been
documented either. Still, it *is* exposed to client apps, so somebody
out there might possibly be using it.
Xi Wang
This is now used by ecpg tests, and not clobbered by pg_upgrade
tests. This change won't affect anything that doesn't set this
environment variable, but will enable the buildfarm to control
exactly what port regression test installs will be running on,
and thus to detect possible rogue postmasters more easily.
Backpatch to release 9.2 where EXTRA_REGRESS_OPTS was first used.
Before this patch, streaming replication would refuse to start replicating
if the timeline in the primary doesn't exactly match the standby. The
situation where it doesn't match is when you have a master, and two
standbys, and you promote one of the standbys to become new master.
Promoting bumps up the timeline ID, and after that bump, the other standby
would refuse to continue.
There's significantly more timeline related logic in streaming replication
now. First of all, when a standby connects to primary, it will ask the
primary for any timeline history files that are missing from the standby.
The missing files are sent using a new replication command TIMELINE_HISTORY,
and stored in standby's pg_xlog directory. Using the timeline history files,
the standby can follow the latest timeline present in the primary
(recovery_target_timeline='latest'), just as it can follow new timelines
appearing in an archive directory.
START_REPLICATION now takes a TIMELINE parameter, to specify exactly which
timeline to stream WAL from. This allows the standby to request the primary
to send over WAL that precedes the promotion. The replication protocol is
changed slightly (in a backwards-compatible way although there's little hope
of streaming replication working across major versions anyway), to allow
replication to stop when the end of timeline reached, putting the walsender
back into accepting a replication command.
Many thanks to Amit Kapila for testing and reviewing various versions of
this patch.
This allows a caller to get back the exact conninfo array that was
used to create a connection, including parameters read from the
environment.
In doing this, restructure how options are copied from the conninfo
to the actual connection.
Zoltan Boszormenyi and Magnus Hagander
The length of a socket path name is constrained by the size of struct
sockaddr_un, and there's not a lot we can do about it since that is a
kernel API. However, it would be a good thing if we produced an
intelligible error message when the user specifies a socket path that's too
long --- and getaddrinfo's standard API is too impoverished to do this in
the natural way. So insert explicit tests at the places where we construct
a socket path name. Now you'll get an error that makes sense and even
tells you what the limit is, rather than something generic like
"Non-recoverable failure in name resolution".
Per trouble report from Jeremy Drake and a fix idea from Andrew Dunstan.
This is to see if it will stop intermittent build failures on buildfarm
member okapi. We know that gmake 3.82 has some problems with sometimes
not honoring dependencies in parallel builds, and it seems likely that
this is more of the same. Since the vast bulk of the work in the preproc
directory is associated with creating preproc.c and then preproc.o,
parallelism buys us hardly anything here anyway.
Also, make both this .NOTPARALLEL and the one previously added in
interfaces/ecpg/Makefile be conditional on "ifeq ($(MAKE_VERSION),3.82)".
The known bug in gmake is fixed upstream and should not be present in
3.83 and up, and there's no reason to think it affects older releases.
Numerous flex and bison make rules have appeared in the source tree
over time, and they are all virtually identical, so we can replace
them by pattern rules with some variables for customization.
Users of pgxs will also be able to benefit from this.
I found that these functions tend to return -1 while leaving an empty error
message string in the PGconn, if they suffer some kind of I/O error on the
file. The reason is that lo_close, which thinks it's executed a perfectly
fine SQL command, clears the errorMessage. The minimum-change workaround
is to reorder operations here so that we don't fill the errorMessage until
after lo_close.
libpq defines these functions as accepting "size_t" lengths ... but the
underlying backend functions expect signed int32 length parameters, and so
will misinterpret any value exceeding INT_MAX. Fix the libpq side to throw
error rather than possibly doing something unexpected.
This is a bug of long standing, but I doubt it's worth back-patching. The
problem is really pretty academic anyway with lo_read/lo_write, since any
caller expecting sane behavior would have to have provided a multi-gigabyte
buffer. It's slightly more pressing with lo_truncate, but still we haven't
supported large objects over 2GB until now.
Fix broken-on-bigendian-machines byte-swapping functions, add missed update
of alternate regression expected file, improve error reporting, remove some
unnecessary code, sync testlo64.c with current testlo.c (it seems to have
been cloned from a very old copy of that), assorted cosmetic improvements.
Get rid of the fundamentally indefensible assumption that "long long int"
exists and is exactly 64 bits wide on every platform Postgres runs on.
Instead let the configure script select the type to use for "pg_int64".
This is a bit of a pain in the rear since we do not want to pollute client
namespace with all the random symbols that pg_config.h defines; instead
we have to create a separate generated header file, "pg_config_ext.h".
But now that the infrastructure is there, we might have the ability to
add some other stuff that's long been wanting in this area.
4TB large objects (standard 8KB BLCKSZ case). For this purpose new
libpq API lo_lseek64, lo_tell64 and lo_truncate64 are added. Also
corresponding new backend functions lo_lseek64, lo_tell64 and
lo_truncate64 are added. inv_api.c is changed to handle 64-bit
offsets.
Patch contributed by Nozomi Anzai (backend side) and Yugo Nagata
(frontend side, docs, regression tests and example program). Reviewed
by Kohei Kaigai. Committed by Tatsuo Ishii with minor editings.
Instead of continuing if the next character is not an array boundary get_data()
used to continue only on finding a boundary so it was not able to read any
element after the first.