libpq: URI parsing fixes

Drop the special handling that treated a host component containing
slashes as a Unix-domain socket directory.  Specify the socket
directory as a separate parameter or percent-encode it instead.

Allow omitting the user name, password, and port even if the
corresponding designators are present in the URI.

Handle percent-encoding in query parameter keywords.

Alex Shulgin

Some documentation improvements by me.
Peter Eisentraut 2012-05-28 22:44:34 +03:00
parent 388d251679
commit 2d612abd4d
4 changed files with 272 additions and 259 deletions
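
The percent-encoding handling mentioned in the commit message can be illustrated with a small standalone sketch in C. This is not libpq's conninfo_uri_decode; the names hex_value and sketch_percent_decode are hypothetical, and rejecting %00 is an assumption made for this sketch. It only shows the general technique: copy ordinary bytes, turn each %xy token into one byte, and fail on a truncated or malformed token.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the numeric value of a hex digit, or -1 if c is not one. */
static int
hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper (not part of libpq): return a malloc'd,
 * percent-decoded copy of src, or NULL on a malformed token or
 * out-of-memory.  The caller frees the result.
 */
static char *
sketch_percent_decode(const char *src)
{
    char *buf;
    char *out;

    assert(src != NULL);

    /* Decoding never lengthens the string, so strlen(src) + 1 suffices. */
    buf = malloc(strlen(src) + 1);
    if (buf == NULL)
        return NULL;
    out = buf;

    while (*src)
    {
        int hi;
        int lo;

        if (*src != '%')
        {
            *out++ = *src++;    /* ordinary byte: copy as-is */
            continue;
        }

        /* Only look at src[2] when src[1] was a valid hex digit. */
        hi = hex_value(src[1]);
        lo = (hi < 0) ? -1 : hex_value(src[2]);

        /* Reject truncated or non-hex tokens; %00 is rejected by assumption. */
        if (hi < 0 || lo < 0 || (hi == 0 && lo == 0))
        {
            free(buf);
            return NULL;
        }

        *out++ = (char) (hi * 16 + lo);
        src += 3;               /* skip '%' and both hex digits */
    }

    *out = '\0';
    return buf;
}
```

Under these assumptions, decoding the keyword u%7aer gives uzer, which is the form shown in the updated warning in the regression output, and decoding %2Fvar%2Flib%2Fpostgresql gives /var/lib/postgresql, the socket directory from the new documentation example.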

@ -711,6 +711,124 @@ PGPing PQping(const char *conninfo);
</variablelist>
</para>
<sect2 id="libpq-connstring">
<title>Connection Strings</title>
<indexterm zone="libpq-connstring">
<primary><literal>conninfo</literal></primary>
</indexterm>
<indexterm zone="libpq-connstring">
<primary><literal>URI</literal></primary>
</indexterm>
<para>
Several <application>libpq</> functions parse a user-specified string to obtain
connection parameters. There are two accepted formats for these strings:
plain <literal>keyword = value</literal> strings
and <ulink url="http://www.ietf.org/rfc/rfc3986.txt">RFC
3986</ulink> URIs.
</para>
<sect3>
<title>Keyword/Value Connection Strings</title>
<para>
In the first format, each parameter setting is in the form
<literal>keyword = value</literal>. Spaces around the equal sign are
optional. To write an empty value, or a value containing spaces, surround it
with single quotes, e.g., <literal>keyword = 'a value'</literal>. Single
quotes and backslashes within
the value must be escaped with a backslash, i.e., <literal>\'</literal> and
<literal>\\</literal>.
</para>
<para>
Example:
<programlisting>
host=localhost port=5432 dbname=mydb connect_timeout=10
</programlisting>
</para>
<para>
The recognized parameter key words are listed in <xref
linkend="libpq-paramkeywords">.
</para>
</sect3>
<sect3>
<title>Connection URIs</title>
<para>
The general form for a connection <acronym>URI</acronym> is:
<synopsis>
postgresql://[user[:password]@][netloc][:port][/dbname][?param1=value1&amp;...]
</synopsis>
</para>
<para>
The <acronym>URI</acronym> scheme designator can be either
<literal>postgresql://</literal> or <literal>postgres://</literal>. Each
of the <acronym>URI</acronym> parts is optional. The following examples
illustrate valid <acronym>URI</acronym> syntax uses:
<programlisting>
postgresql://
postgresql://localhost
postgresql://localhost:5433
postgresql://localhost/mydb
postgresql://user@localhost
postgresql://user:secret@localhost
postgresql://other@localhost/otherdb?connect_timeout=10&amp;application_name=myapp
</programlisting>
Components of the hierarchical part of the <acronym>URI</acronym> can also
be given as parameters. For example:
<programlisting>
postgresql:///mydb?host=localhost&amp;port=5433
</programlisting>
</para>
<para>
Percent-encoding may be used to include symbols with special meaning in any
of the <acronym>URI</acronym> parts.
</para>
<para>
Any connection parameters not corresponding to key words listed in <xref
linkend="libpq-paramkeywords"> are ignored and a warning message about them
is sent to <filename>stderr</filename>.
</para>
<para>
For improved compatibility with JDBC connection <acronym>URI</acronym>s,
instances of parameter <literal>ssl=true</literal> are translated into
<literal>sslmode=require</literal>.
</para>
<para>
The host part may be either a host name or an IP address. To specify an
IPv6 host address, enclose it in square brackets:
<synopsis>
postgresql://[2001:db8::1234]/database
</synopsis>
</para>
<para>
The host component is interpreted as described for the parameter <xref
linkend="libpq-connect-host">. In particular, a Unix-domain socket
connection is chosen if the host part is either empty or starts with a
slash, otherwise a TCP/IP connection is initiated. Note, however, that the
slash is a reserved character in the hierarchical part of the URI. So, to
specify a non-standard Unix-domain socket directory, either omit the host
specification in the URI and specify the host as a parameter, or
percent-encode the path in the host component of the URI:
<programlisting>
postgresql:///dbname?host=/var/lib/postgresql
postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
</programlisting>
</para>
</sect3>
</sect2>
<sect2 id="libpq-paramkeywords">
<title>Parameter Key Words</title>
@ -1220,107 +1338,6 @@ PGPing PQping(const char *conninfo);
</variablelist>
</para>
</sect2>
<sect2 id="libpq-connstring">
<title>Connection Strings</title>
<indexterm zone="libpq-connstring">
<primary><literal>conninfo</literal></primary>
</indexterm>
<indexterm zone="libpq-connstring">
<primary><literal>URI</literal></primary>
</indexterm>
<para>
Several <application>libpq</> functions parse a user-specified string to obtain
connection parameters. There are two accepted formats for these strings:
plain <literal>keyword = value</literal> strings, and URIs.
</para>
<para>
In the first format, each parameter setting is in the form
<literal>keyword = value</literal>. Spaces around the equal sign are
optional. To write an empty value, or a value containing spaces, surround it
with single quotes, e.g., <literal>keyword = 'a value'</literal>. Single
quotes and backslashes within
the value must be escaped with a backslash, i.e., <literal>\'</literal> and
<literal>\\</literal>.
</para>
<para>
The currently recognized parameter key words are listed in
<xref linkend="libpq-paramkeywords">.
</para>
<para>
The general form for connection <acronym>URI</acronym> is the
following:
<synopsis>
postgresql://[user[:password]@][unix-socket][:port[/dbname]][?param1=value1&amp;...]
postgresql://[user[:password]@][net-location][:port][/dbname][?param1=value1&amp;...]
</synopsis>
</para>
<para>
The <acronym>URI</acronym> designator can be either
<literal>postgresql://</literal> or <literal>postgres://</literal> and
each of the <acronym>URI</acronym> parts is optional. The following
examples illustrate valid <acronym>URI</acronym> syntax uses:
<synopsis>
postgresql://
postgresql://localhost
postgresql://localhost:5433
postgresql://localhost/mydb
postgresql://user@localhost
postgresql://user:secret@localhost
postgresql://other@localhost/otherdb
</synopsis>
</para>
<para>
Percent-encoding may be used to include a symbol with special meaning in
any of the <acronym>URI</acronym> parts.
</para>
<para>
Additional connection parameters may optionally follow the base <acronym>URI</acronym>.
Any connection parameters not corresponding to key words listed
in <xref linkend="libpq-paramkeywords"> are ignored and a warning message
about them is sent to <filename>stderr</filename>.
</para>
<para>
For improved compatibility with JDBC connection <acronym>URI</acronym>
syntax, instances of parameter <literal>ssl=true</literal> are translated
into <literal>sslmode=require</literal> (see above.)
</para>
<para>
The host part may be either hostname or an IP address. To specify an
IPv6 host address, enclose it in square brackets:
<synopsis>
postgresql://[2001:db8::1234]/database
</synopsis>
As a special case, a host part which starts with <symbol>/</symbol> is
treated as a local Unix socket directory to look for the connection
socket special file:
<synopsis>
postgresql:///path/to/pgsql/socket/dir
</synopsis>
The whole connection string up to the extra parameters designator
(<symbol>?</symbol>) or the port designator (<symbol>:</symbol>) is treated
as the absolute path to the socket directory
(<literal>/path/to/pgsql/socket/dir</literal> in this example.) To specify
a non-default database name in this case you can use either of the following
syntaxes:
<synopsis>
postgresql:///path/to/pgsql/socket/dir?dbname=otherdb
postgresql:///path/to/pgsql/socket/dir:5432/otherdb
</synopsis>
</para>
</sect2>
</sect1>
<sect1 id="libpq-status">

@ -4544,18 +4544,15 @@ conninfo_uri_parse(const char *uri, PQExpBuffer errorMessage,
* options from the URI.
* If not successful, returns false and fills errorMessage accordingly.
*
* Parses the connection URI string in 'uri' according to the URI syntax:
* Parses the connection URI string in 'uri' according to the URI syntax (RFC
* 3986):
*
* postgresql://[user[:pwd]@][unix-socket][:port[/dbname]][?param1=value1&...]
* postgresql://[user[:pwd]@][net-location][:port][/dbname][?param1=value1&...]
* postgresql://[user[:password]@][netloc][:port][/dbname][?param1=value1&...]
*
* "net-location" is a hostname, an IPv4 address, or an IPv6 address surrounded
* by literal square brackets. To be recognized as a unix-domain socket, the
* value must start with a slash '/'. Note slight inconsistency in that dbname
* can always be specified after net-location, but after unix-socket it can only
* be specified if there is a port specification.
* where "netloc" is a hostname, an IPv4 address, or an IPv6 address surrounded
* by literal square brackets.
*
* Any of those elements might be percent-encoded (%xy).
* Any of the URI parts might use percent-encoding (%xy).
*/
static bool
conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
@ -4566,6 +4563,8 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
char *buf = strdup(uri); /* need a modifiable copy of the input URI */
char *start = buf;
char prevchar = '\0';
char *user = NULL;
char *host = NULL;
bool retval = false;
if (buf == NULL)
@ -4593,8 +4592,6 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
++p;
if (*p == '@')
{
char *user;
/*
* Found username/password designator, so URI should be of the form
* "scheme://user[:password]@[netloc]".
@ -4609,14 +4606,8 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
prevchar = *p;
*p = '\0';
if (!*user)
{
printfPQExpBuffer(errorMessage,
libpq_gettext("invalid empty username specifier in URI: %s\n"),
uri);
goto cleanup;
}
if (!conninfo_storeval(options, "user", user,
if (*user &&
!conninfo_storeval(options, "user", user,
errorMessage, false, true))
goto cleanup;
@ -4628,15 +4619,8 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
++p;
*p = '\0';
if (!*password)
{
printfPQExpBuffer(errorMessage,
libpq_gettext("invalid empty password specifier in URI: %s\n"),
uri);
goto cleanup;
}
if (!conninfo_storeval(options, "password", password,
if (*password &&
!conninfo_storeval(options, "password", password,
errorMessage, false, true))
goto cleanup;
}
@ -4656,88 +4640,66 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
* "p" has been incremented past optional URI credential information at
* this point and now points at the "netloc" part of the URI.
*
* Check for local unix socket dir.
* Look for IPv6 address.
*/
if (*p == '/')
if (*p == '[')
{
const char *socket = p;
/* Look for possible port specifier or query parameters */
while (*p && *p != ':' && *p != '?')
host = ++p;
while (*p && *p != ']')
++p;
prevchar = *p;
*p = '\0';
if (!conninfo_storeval(options, "host", socket,
errorMessage, false, true))
if (!*p)
{
printfPQExpBuffer(errorMessage,
libpq_gettext("end of string reached when looking for matching ']' in IPv6 host address in URI: %s\n"),
uri);
goto cleanup;
}
if (p == host)
{
printfPQExpBuffer(errorMessage,
libpq_gettext("IPv6 host address may not be empty in URI: %s\n"),
uri);
goto cleanup;
}
/* Cut off the bracket and advance */
*(p++) = '\0';
/*
* The address may be followed by a port specifier or a slash or a
* query.
*/
if (*p && *p != ':' && *p != '/' && *p != '?')
{
printfPQExpBuffer(errorMessage,
libpq_gettext("unexpected '%c' at position %d in URI (expecting ':' or '/'): %s\n"),
*p, (int) (p - buf + 1), uri);
goto cleanup;
}
}
else
{
/* Not a unix socket dir: parse as host name or address */
const char *host;
/* not an IPv6 address: DNS-named or IPv4 netloc */
host = p;
/*
*
* Look for IPv6 address
* Look for port specifier (colon) or end of host specifier
* (slash), or query (question mark).
*/
if (*p == '[')
{
host = ++p;
while (*p && *p != ']')
++p;
if (!*p)
{
printfPQExpBuffer(errorMessage,
libpq_gettext("end of string reached when looking for matching ']' in IPv6 host address in URI: %s\n"),
uri);
goto cleanup;
}
if (p == host)
{
printfPQExpBuffer(errorMessage,
libpq_gettext("IPv6 host address may not be empty in URI: %s\n"),
uri);
goto cleanup;
}
/* Cut off the bracket and advance */
*(p++) = '\0';
/*
* The address may be followed by a port specifier or a slash or a
* query.
*/
if (*p && *p != ':' && *p != '/' && *p != '?')
{
printfPQExpBuffer(errorMessage,
libpq_gettext("unexpected '%c' at position %d in URI (expecting ':' or '/'): %s\n"),
*p, (int) (p - buf + 1), uri);
goto cleanup;
}
}
else
{
/* not an IPv6 address: DNS-named or IPv4 netloc */
host = p;
/*
* Look for port specifier (colon) or end of host specifier
* (slash), or query (question mark).
*/
while (*p && *p != ':' && *p != '/' && *p != '?')
++p;
}
/* Save the hostname terminator before we null it */
prevchar = *p;
*p = '\0';
if (!conninfo_storeval(options, "host", host,
errorMessage, false, true))
goto cleanup;
while (*p && *p != ':' && *p != '/' && *p != '?')
++p;
}
/* Save the hostname terminator before we null it */
prevchar = *p;
*p = '\0';
if (*host &&
!conninfo_storeval(options, "host", host,
errorMessage, false, true))
goto cleanup;
if (prevchar == ':')
{
const char *port = ++p; /* advance past host terminator */
@ -4748,14 +4710,8 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
prevchar = *p;
*p = '\0';
if (!*port)
{
printfPQExpBuffer(errorMessage,
libpq_gettext("missing port specifier in URI: %s\n"),
uri);
goto cleanup;
}
if (!conninfo_storeval(options, "port", port,
if (*port &&
!conninfo_storeval(options, "port", port,
errorMessage, false, true))
goto cleanup;
}
@ -4813,9 +4769,10 @@ conninfo_uri_parse_params(char *params,
{
while (*params)
{
const char *keyword = params;
const char *value = NULL;
char *keyword = params;
char *value = NULL;
char *p = params;
bool malloced = false;
/*
* Scan the params string for '=' and '&', marking the end of keyword
@ -4866,35 +4823,66 @@ conninfo_uri_parse_params(char *params,
++p;
}
keyword = conninfo_uri_decode(keyword, errorMessage);
if (keyword == NULL)
{
/* conninfo_uri_decode already set an error message */
return false;
}
value = conninfo_uri_decode(value, errorMessage);
if (value == NULL)
{
/* conninfo_uri_decode already set an error message */
free(keyword);
return false;
}
malloced = true;
/*
* Special keyword handling for improved JDBC compatibility. Note
* we fail to detect URI-encoded values here, but we don't care.
* Special keyword handling for improved JDBC compatibility.
*/
if (strcmp(keyword, "ssl") == 0 &&
strcmp(value, "true") == 0)
{
free(keyword);
free(value);
malloced = false;
keyword = "sslmode";
value = "require";
}
/*
* Store the value if the corresponding option exists; ignore
* otherwise.
* otherwise. At this point both keyword and value are not
* URI-encoded.
*/
if (!conninfo_storeval(connOptions, keyword, value,
errorMessage, true, true))
errorMessage, true, false))
{
/*
* Check if there was a hard error when decoding or storing the
* option.
*/
if (errorMessage->len != 0)
{
if (malloced)
{
free(keyword);
free(value);
}
return false;
}
fprintf(stderr,
libpq_gettext("WARNING: ignoring unrecognized URI query parameter: %s\n"),
keyword);
}
if (malloced)
{
free(keyword);
free(value);
}
/* Proceed to next key=value pair */
params = p;
@ -5017,7 +5005,8 @@ conninfo_getval(PQconninfoOption *connOptions,
* Store a (new) value for an option corresponding to the keyword in
* connOptions array.
*
* If uri_decode is true, keyword and value are URI-decoded.
* If uri_decode is true, the value is URI-decoded. The keyword is always
* assumed to be non URI-encoded.
*
* If successful, returns a pointer to the corresponding PQconninfoOption,
* which value is replaced with a strdup'd copy of the passed value string.
@ -5034,32 +5023,16 @@ conninfo_storeval(PQconninfoOption *connOptions,
bool uri_decode)
{
PQconninfoOption *option;
char *value_copy;
char *keyword_copy = NULL;
char *value_copy;
/*
* Decode the keyword. XXX this is seldom necessary as keywords do not
* normally need URI-escaping. It'd be good to do away with the
* malloc/free overhead and the general ugliness, but I don't see a
* better way to handle it.
*/
if (uri_decode)
{
keyword_copy = conninfo_uri_decode(keyword, errorMessage);
if (keyword_copy == NULL)
/* conninfo_uri_decode already set an error message */
goto failed;
}
option = conninfo_find(connOptions,
keyword_copy != NULL ? keyword_copy : keyword);
option = conninfo_find(connOptions, keyword);
if (option == NULL)
{
if (!ignoreMissing)
printfPQExpBuffer(errorMessage,
libpq_gettext("invalid connection option \"%s\"\n"),
keyword);
goto failed;
return NULL;
}
if (uri_decode)
@ -5067,7 +5040,7 @@ conninfo_storeval(PQconninfoOption *connOptions,
value_copy = conninfo_uri_decode(value, errorMessage);
if (value_copy == NULL)
/* conninfo_uri_decode already set an error message */
goto failed;
return NULL;
}
else
{
@ -5076,7 +5049,7 @@ conninfo_storeval(PQconninfoOption *connOptions,
if (value_copy == NULL)
{
printfPQExpBuffer(errorMessage, libpq_gettext("out of memory\n"));
goto failed;
return NULL;
}
}
@ -5084,14 +5057,7 @@ conninfo_storeval(PQconninfoOption *connOptions,
free(option->val);
option->val = value_copy;
if (keyword_copy != NULL)
free(keyword_copy);
return option;
failed:
if (keyword_copy != NULL)
free(keyword_copy);
return NULL;
}
/*

@ -20,7 +20,7 @@ trying postgresql://uri-user@host/
user='uri-user' host='host' (inet)
trying postgresql://uri-user@
user='uri-user' host='' (local)
user='uri-user' (local)
trying postgresql://host:12345/
host='host' port='12345' (inet)
@ -38,10 +38,10 @@ trying postgresql://host
host='host' (inet)
trying postgresql://
host='' (local)
(local)
trying postgresql://?hostaddr=127.0.0.1
host='' hostaddr='127.0.0.1' (inet)
hostaddr='127.0.0.1' (inet)
trying postgresql://example.com?hostaddr=63.1.2.4
host='example.com' hostaddr='63.1.2.4' (inet)
@ -59,7 +59,7 @@ trying postgresql://host/db?u%73er=someotheruser&port=12345
user='someotheruser' dbname='db' host='host' port='12345' (inet)
trying postgresql://host/db?u%7aer=someotheruser&port=12345
WARNING: ignoring unrecognized URI query parameter: u%7aer
WARNING: ignoring unrecognized URI query parameter: uzer
dbname='db' host='host' port='12345' (inet)
trying postgresql://host:12345?user=uri-user
@ -87,10 +87,19 @@ trying postgresql://[::1]
host='::1' (inet)
trying postgres://
host='' (local)
(local)
trying postgres:///tmp
host='/tmp' (local)
trying postgres:///
(local)
trying postgres:///db
dbname='db' (local)
trying postgres://uri-user@/db
user='uri-user' dbname='db' (local)
trying postgres://?host=/path/to/socket/dir
host='/path/to/socket/dir' (local)
trying postgresql://host?uzer=
WARNING: ignoring unrecognized URI query parameter: uzer
@ -145,19 +154,32 @@ uri-regress: invalid percent-encoded token: %
trying postgres://@host
uri-regress: invalid empty username specifier in URI: postgres://@host
host='host' (inet)
trying postgres://host:/
uri-regress: missing port specifier in URI: postgres://host:/
host='host' (inet)
trying postgres://:12345/
port='12345' (local)
trying postgres://otheruser@/no/such/directory
trying postgres://otheruser@?host=/no/such/directory
user='otheruser' host='/no/such/directory' (local)
trying postgres://otheruser@/no/such/socket/path:12345
trying postgres://otheruser@/?host=/no/such/directory
user='otheruser' host='/no/such/directory' (local)
trying postgres://otheruser@:12345?host=/no/such/socket/path
user='otheruser' host='/no/such/socket/path' port='12345' (local)
trying postgres://otheruser@/path/to/socket:12345/db
trying postgres://otheruser@:12345/db?host=/path/to/socket
user='otheruser' dbname='db' host='/path/to/socket' port='12345' (local)
trying postgres://:12345/db?host=/path/to/socket
dbname='db' host='/path/to/socket' port='12345' (local)
trying postgres://:12345?host=/path/to/socket
host='/path/to/socket' port='12345' (local)
trying postgres://%2Fvar%2Flib%2Fpostgresql/dbname
dbname='dbname' host='/var/lib/postgresql' (local)

@ -28,7 +28,10 @@ postgresql://[2001:db8::1234]/
postgresql://[200z:db8::1234]/
postgresql://[::1]
postgres://
postgres:///tmp
postgres:///
postgres:///db
postgres://uri-user@/db
postgres://?host=/path/to/socket/dir
postgresql://host?uzer=
postgre://
postgres://[::1
@ -44,6 +47,11 @@ postgresql://%1
postgresql://%
postgres://@host
postgres://host:/
postgres://otheruser@/no/such/directory
postgres://otheruser@/no/such/socket/path:12345
postgres://otheruser@/path/to/socket:12345/db
postgres://:12345/
postgres://otheruser@?host=/no/such/directory
postgres://otheruser@/?host=/no/such/directory
postgres://otheruser@:12345?host=/no/such/socket/path
postgres://otheruser@:12345/db?host=/path/to/socket
postgres://:12345/db?host=/path/to/socket
postgres://:12345?host=/path/to/socket
postgres://%2Fvar%2Flib%2Fpostgresql/dbname