we are on a 64-bit machine (ie, size_t is wider than int) and someone passes
in a query string that approaches or exceeds INT_MAX bytes. Also, just for
paranoia's sake, guard against similar overflows in sizing the input buffer.
The backend will not in the foreseeable future be prepared to send or receive
strings exceeding 1GB, so I didn't take the more invasive step of switching
all the buffer index variables from int to size_t; though someday we might
want to do that.
I have a suspicion that this is not the only such bug in libpq, but this
fix is enough to take care of the crash reported by Francisco Reyes.
renumbering of encoding IDs done between 8.2 and 8.3 turns out to break 8.2
initdb and psql if they are run with an 8.3beta1 libpq.so. For the moment
we can rearrange the order of enum pg_enc to keep the same number for
everything except PG_JOHAB, which isn't a problem since there are no direct
references to it in the 8.2 programs anyway. (This does force initdb
unfortunately.)
Going forward, we want to fix things so that encoding IDs can be changed
without an ABI break, and this commit includes the changes needed to allow
libpq's encoding IDs to be treated as fully independent of the backend's.
The main issue is that libpq clients should not include pg_wchar.h or
otherwise assume they know the specific values of libpq's encoding IDs,
since they might encounter version skew between pg_wchar.h and the libpq.so
they are using. To fix, have libpq officially export functions needed for
encoding name<=>ID conversion and validity checking; it was doing this
anyway unofficially.
It's still the case that we can't renumber backend encoding IDs until the
next bump in libpq's major version number, since doing so will break the
8.2-era client programs. However the code is now prepared to avoid this
type of problem in future.
Note that initdb is no longer a libpq client: we just pull in the two
source files we need directly. The patch also fixes a few places that
were being sloppy about checking for an unrecognized encoding name.
and standard_conforming_strings; likewise for the other client programs
that need it. As per previous discussion, a pg_dump dump now conforms
to the standard_conforming_strings setting of the source database.
We don't use E'' syntax in the dump, thereby improving portability of
the SQL. I added a SET escape_strings_warning = off command to keep
the dumps from getting a lot of back-chatter from that.
Per Coverity bug #304. Thanks to Martijn van Oosterhout for reporting it.
Zero out the pointer fields of PGresult so that these mistakes are more
easily catched, per discussion.
and standard_conforming_strings. The encoding changes are needed for proper
escaping in multibyte encodings, as per the SQL-injection vulnerabilities
noted in CVE-2006-2313 and CVE-2006-2314. Concurrent fixes are being applied
to the server to ensure that it rejects queries that may have been corrupted
by attempted SQL injection, but this merely guarantees that unpatched clients
will fail rather than allow injection. An actual fix requires changing the
client-side code. While at it we have also fixed these routines to understand
about standard_conforming_strings, so that the upcoming changeover to SQL-spec
string syntax can be somewhat transparent to client code.
Since the existing API of PQescapeString and PQescapeBytea provides no way to
inform them which settings are in use, these functions are now deprecated in
favor of new functions PQescapeStringConn and PQescapeByteaConn. The new
functions take the PGconn to which the string will be sent as an additional
parameter, and look inside the connection structure to determine what to do.
So as to provide some functionality for clients using the old functions,
libpq stores the latest encoding and standard_conforming_strings values
received from the backend in static variables, and the old functions consult
these variables. This will work reliably in clients using only one Postgres
connection at a time, or even multiple connections if they all use the same
encoding and string syntax settings; which should cover many practical
scenarios.
Clients that use homebrew escaping methods, such as PHP's addslashes()
function or even hardwired regexp substitution, will require extra effort
to fix :-(. It is strongly recommended that such code be replaced by use of
PQescapeStringConn/PQescapeByteaConn if at all feasible.
during parse analysis, not only errors detected in the flex/bison stages.
This is per my earlier proposal. This commit includes all the basic
infrastructure, but locations are only tracked and reported for errors
involving column references, function calls, and operators. More could
be done later but this seems like a good set to start with. I've also
moved the ReportSyntaxErrorPosition logic out of psql and into libpq,
which should make it available to more people --- even within psql this
is an improvement because warnings weren't handled by ReportSyntaxErrorPosition.
because pqSendSome will absorb input data anytime it'd be forced to block.
Avoiding a kernel call per PQputCopyData call helps COPY speed materially.
Alon Goldshuv
rather than "return expr;" -- the latter style is used in most of the
tree. I kept the parentheses when they were necessary or useful because
the return expression was complex.
comment line where output as too long, and update typedefs for /lib
directory. Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).
Backpatch to 8.1.X.
Windows. The test itself is bypassed in configure as discussed, and
libpq has been updated appropriately to allow it to build in thread-safe
mode.
Dave Page
patch adds missing checks to the call sites of malloc(), strdup(),
PQmakeEmptyPGresult(), pqResultAlloc(), and pqResultStrdup(), and updates
the documentation. Per original report from Volkan Yazici about
PQmakeEmptyPGresult() not checking for malloc() failure.
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
just stick a list-link into struct PGnotify instead. Result is a smaller
faster and more robust library (mainly because we reduce the number of
malloc's and free's involved in notify processing), plus less pollution
of application link-symbol namespace.
conversion of basic ASCII letters. Remove all uses of strcasecmp and
strncasecmp in favor of new functions pg_strcasecmp and pg_strncasecmp;
remove most but not all direct uses of toupper and tolower in favor of
pg_toupper and pg_tolower. These functions use the same notions of
case folding already developed for identifier case conversion. I left
the straight locale-based folding in place for situations where we are
just manipulating user data and not trying to match it to built-in
strings --- for example, the SQL upper() function is still locale
dependent. Perhaps this will prove not to be what's wanted, but at
the moment we can initdb and pass regression tests in Turkish locale.
behavior of malloc and realloc when request size is 0. Fix escape
sequence recognizer so that only valid 3-digit octal sequences are
treated as escape sequences ... isdigit() is not a correct test.
will downcase the supplied field name unless it is double-quoted. Also,
upgrade the routine's handling of double quotes to match the backend,
in particular support doubled double quotes within quoted identifiers.
Per pgsql-interfaces discussion a couple weeks ago.
via extended query protocol, because it sends Sync right after Execute
without realizing that the command to be executed is COPY. There seems
to be no reasonable way for it to realize that, either, so the best fix
seems to be to make the backend ignore Sync during copy-in mode. Bit of
a wart on the protocol, but little alternative. Also, libpq must send
another Sync after terminating the COPY, if the command was issued via
Execute.
libpq users to perform Bind/Execute of previously prepared statements.
Per yesterday's discussion, this offers enough performance improvement
to justify bending the 'no new features during beta' rule.
minute and a half to decode a 500Kb on a fairly fast machine. I think the
culprit is sscanf.
I attach a patch that replaces the function with one used to perform the same
task in pyPgSQL (a Python interface to PostgreSQL). This code was written by
Billy Allie, author of pyPgSQL. I've changed a few variable names to match
those in the original code and removed a bit of Pythonness.
Billy has kindly looked at the code and points out that it is slightly
stricter than the original implementation and if it encounters an invalid
bytea such as '\12C' it drops the unescape '\' and outputs '12C'.
The code is licensed by the author under a BSD license.
I've performed limited testing of the function by putting JPEGs into
PostgreSQL, extracting them using them using the new function and diffing
against the original files.
The new function is significantly faster on my machine with the JPEGs being
decoded in less than a second. I attach a modified libpq example program that
I used for my testing.
Ben Lamb.
protocol 3, then falls back to 2 if postmaster rejects the startup packet
with an old-format error message. A side benefit of the rewrite is that
SSL-encrypted connections can now be made without blocking. (I think,
anyway, but do not have a good way to test.)
handle multiple 'formats' for data I/O. Restructure CommandDest and
DestReceiver stuff one more time (it's finally starting to look a bit
clean though). Code now matches latest 3.0 protocol document as far
as message formats go --- but there is no support for binary I/O yet.
for tableID/columnID in RowDescription. (The latter isn't really
implemented yet though --- the backend always sends zeroes, and libpq
just throws away the data.)
initial values and runtime changes in selected parameters. This gets
rid of the need for an initial 'select pg_client_encoding()' query in
libpq, bringing us back to one message transmitted in each direction
for a standard connection startup. To allow server version to be sent
using the same GUC mechanism that handles other parameters, invent the
concept of a never-settable GUC parameter: you can 'show server_version'
but it's not settable by any GUC input source. Create 'lc_collate' and
'lc_ctype' never-settable parameters so that people can find out these
settings without need for pg_controldata. (These side ideas were all
discussed some time ago in pgsql-hackers, but not yet implemented.)
rewritten and the protocol is changed, but most elog calls are still
elog calls. Also, we need to contemplate mechanisms for controlling
all this functionality --- eg, how much stuff should appear in the
postmaster log? And what API should libpq expose for it?
have length words. COPY OUT reimplemented per new protocol: it doesn't
need \. anymore, thank goodness. COPY BINARY to/from frontend works,
at least as far as the backend is concerned --- libpq's PQgetline API
is not up to snuff, and will have to be replaced with something that is
null-safe. libpq uses message length words for performance improvement
(no cycles wasted rescanning long messages), but not yet for error
recovery.
value '-2' is used to indicate a variable-width type whose width is
computed as strlen(datum)+1. Everything that looks at typlen is updated
except for array support, which Joe Conway is working on; at the moment
it wouldn't work to try to create an array of cstring.
This is necessary for mulibyte character sequences.
See "[HACKERS] PQescapeBytea is not multibyte aware" thread posted around
2002/04/05 for more details.
o Change all current CVS messages of NOTICE to WARNING. We were going
to do this just before 7.3 beta but it has to be done now, as you will
see below.
o Change current INFO messages that should be controlled by
client_min_messages to NOTICE.
o Force remaining INFO messages, like from EXPLAIN, VACUUM VERBOSE, etc.
to always go to the client.
o Remove INFO from the client_min_messages options and add NOTICE.
Seems we do need three non-ERROR elog levels to handle the various
behaviors we need for these messages.
Regression passed.
queries over non-blocking connections with libpq. "Larger" here
basically means that it doesn't fit into the output buffer.
The basic strategy is to fix pqFlush and pqPutBytes.
The problem with pqFlush as it stands now is that it returns EOF when an
error occurs or when not all data could be sent. The latter case is
clearly not an error for a non-blocking connection but the caller can't
distringuish it from an error very well.
The first part of the fix is therefore to fix pqFlush. This is done by
to renaming it to pqSendSome which only differs from pqFlush in its
return values to allow the caller to make the above distinction and a
new pqFlush which is implemented in terms of pqSendSome and behaves
exactly like the old pqFlush.
The second part of the fix modifies pqPutBytes to use pqSendSome instead
of pqFlush and to either send all the data or if not all data can be
sent on a non-blocking connection to at least put all data into the
output buffer, enlarging it if necessary. The callers of pqPutBytes
don't have to be changed because from their point of view pqPutBytes
behaves like before. It either succeeds in queueing all output data or
fails with an error.
I've also added a new API function PQsendSome which analogously to
PQflush just calls pqSendSome. Programs using non-blocking queries
should use this new function. The main difference is that this function
will have to be called repeatedly (calling select() properly in between)
until all data has been written.
AFAICT, the code in CVS HEAD hasn't changed with respect to non-blocking
queries and this fix should work there, too, but I haven't tested that
yet.
Bernhard Herzog
>
> 1. Now outputs '\\' instead of '\134' when using encode(bytea, 'escape')
> Note that I ended up leaving \0 as \000 so that there are no ambiguities
> when decoding something like, for example, \0123.
>
> 2. Fixed bug in byteain which allowed input values which were not valid
> octals (e.g. \789), to be parsed as if they were octals.
>
> Joe
>
Here's rev 2 of the bytea string support patch. Changes:
1. Added missing declaration for MatchBytea function
2. Added PQescapeBytea to fe-exec.c
3. Applies cleanly on cvs tip from this afternoon
I'm hoping that someone can review/approve/apply this before beta starts, so
I guess I'd vote (not that it counts for much) to delay beta a few days :-)
Joe Conway
> null bytes to be literally '\0', the following can happen:
> 1. User inputs string value as "<null byte>##" where ## are digits in the
> range of 0 to 7.
> 2. PQescapeString converts this to "\0##"
> 3. Escaped string is used in a context that causes "\0##" to be evaluated as
> an octal escape sequence.
I agree that this is a problem, though it is not possible to do
anything harmful with it. In addition, it only occurs if there are
any NUL characters in its input, which is very unlikely if you are
using C strings.
The patch below addresses the issue by removing escaping of \0
characters entirely.
> If the goal is to "safely" encode null bytes, and preserve the rest of the
> string as it was entered, I think the null bytes should be escaped as \\000
> (note that if you simply use \000 the same string truncation problem
> occurs).
We can't do that, this would require 4n + 1 bytes of storage for the
result, breaking the interface.
Florian Weimer
discussion on pgsql-hackers (especially the frightening memory dump in
<12273.999562219@sss.pgh.pa.us>), we decided that it is best not to
use identifiers from an untrusted source at all. Therefore, all
claims of the suitability of PQescapeString() for identifiers have
been removed.
Florian Weimer
two additional files win32.mak and libpgtcl.def.
This patch allows to compile libpgtcl.dll on Windows
with tcl > 8.0. I've tested it on WinNT (VC6.0), SUSE Linux (7.0)
and Solaris 2.6 with tcl 8.3.3.
Mikhail Terekhov
> It seems that win9x doesn't have the "netmsg.dll" so it defaults to "normal"
> FormatMessage.
> I wonder if one could load wsock32.dll or winsock.dll on those systems
> instead of netmsg.dll.
>
> Mikhail, could you please test this code on your nt4 system?
> Could someone else test this code on a win98/95 system?
>
> It works on win2k over here.
It works on win2k here too but not on win98/95 or winNT.
Anyway, attached is the patch which uses Magnus's my_sock_strerror
function (renamed to winsock_strerror). The only difference is that
I put the code to load and unload netmsg.dll in the libpqdll.c
(is this OK Magnus?).
Mikhail Terekhov
Allow pg_shadow to be MD5 encrypted.
Add ENCRYPTED/UNENCRYPTED option to CREATE/ALTER user.
Add password_encryption postgresql.conf option.
Update wire protocol version to 2.1.
functions do not set errno, so some normal conditions are treated as
fatal errors. e.g. fetching large tuples fails, as at some point recv()
returns EWOULDBLOCK. here's a patch, which replaces errno with
WSAGetLastError(). i've tried to to affect non-win32 code.
Dmitry Yurtaev
are now separate files "postgres.h" and "postgres_fe.h", which are meant
to be the primary include files for backend .c files and frontend .c files
respectively. By default, only include files meant for frontend use are
installed into the installation include directory. There is a new make
target 'make install-all-headers' that adds the whole content of the
src/include tree to the installed fileset, for use by people who want to
develop server-side code without keeping the complete source tree on hand.
Cleaned up a whole lot of crufty and inconsistent header inclusions.