Clean up weird corner cases in lexing of psql meta-command arguments.

These changes allow backtick command evaluation and psql variable
interpolation to happen on substrings of a single meta-command argument.
Formerly, no such evaluations happened at all if the backtick or colon
wasn't the first character of the argument, and we considered an argument
completed as soon as we'd processed one backtick, variable reference, or
quoted substring.  A string like 'FOO'BAR was thus taken as two arguments
not one, not exactly what one would expect.  In the new coding, an argument
is considered terminated only by unquoted whitespace or backslash.

Also, clean up a bunch of omissions, infelicities and outright errors in
the psql documentation of variables and metacommand argument syntax.
Tom Lane 2011-08-26 13:52:23 -04:00
parent e86fdb0ab2
commit 928311a463
4 changed files with 275 additions and 247 deletions

View File

@ -156,8 +156,8 @@ PostgreSQL documentation
Use the file <replaceable class="parameter">filename</replaceable>
as the source of commands instead of reading commands interactively.
After the file is processed, <application>psql</application>
terminates. This is in many ways equivalent to the internal
command <command>\i</command>.
terminates. This is in many ways equivalent to the meta-command
<command>\i</command>.
</para>
<para>
@ -223,7 +223,7 @@ PostgreSQL documentation
<listitem>
<para>
List all available databases, then exit. Other non-connection
options are ignored. This is similar to the internal command
options are ignored. This is similar to the meta-command
<command>\list</command>.
</para>
</listitem>
@ -393,9 +393,9 @@ PostgreSQL documentation
<listitem>
<para>
Perform a variable assignment, like the <command>\set</command>
internal command. Note that you must separate name and value, if
meta-command. Note that you must separate name and value, if
any, by an equal sign on the command line. To unset a variable,
leave off the equal sign. To just set a variable without a value,
leave off the equal sign. To set a variable with an empty value,
use the equal sign but leave off the value. These assignments are
done during a very early stage of start-up, so variables reserved
for internal purposes might get overwritten later.
@ -659,32 +659,32 @@ testdb=&gt;
</para>
<para>
To include whitespace into an argument you can quote it with a
single quote. To include a single quote into such an argument,
use two single quotes. Anything contained in single quotes is
To include whitespace in an argument you can quote it with
single quotes. To include a single quote in an argument,
write two single quotes within single-quoted text.
Anything contained in single quotes is
furthermore subject to C-like substitutions for
<literal>\n</literal> (new line), <literal>\t</literal> (tab),
<literal>\b</literal> (backspace), <literal>\r</literal> (carriage return),
<literal>\f</literal> (form feed),
<literal>\</literal><replaceable>digits</replaceable> (octal), and
<literal>\x</literal><replaceable>digits</replaceable> (hexadecimal).
A backslash preceding any other character within single-quoted text
quotes that single character, whatever it is.
</para>
<para>
If an unquoted argument begins with a colon (<literal>:</literal>),
it is taken as a <application>psql</> variable and the value of the
variable is used as the argument instead. If the variable name is
surrounded by single quotes (e.g. <literal>:'var'</literal>), it
will be escaped as an SQL literal and the result will be used as
the argument. If the variable name is surrounded by double quotes,
it will be escaped as an SQL identifier and the result will be used
as the argument.
Within an argument, text that is enclosed in backquotes
(<literal>`</literal>) is taken as a command line that is passed to the
shell. The output of the command (with any trailing newline removed)
replaces the backquoted text.
</para>
<para>
Arguments that are enclosed in backquotes (<literal>`</literal>)
are taken as a command line that is passed to the shell. The
output of the command (with any trailing newline removed) is taken
as the argument value. The above escape sequences also apply in
backquotes.
If an unquoted colon (<literal>:</literal>) followed by a
<application>psql</> variable name appears within an argument, it is
replaced by the variable's value, as described in <xref
linkend="APP-PSQL-interpolation" endterm="APP-PSQL-interpolation-title">.
</para>
<para>
@ -1803,15 +1803,16 @@ lo_import 152801
<term><literal>\prompt [ <replaceable class="parameter">text</replaceable> ] <replaceable class="parameter">name</replaceable></literal></term>
<listitem>
<para>
Prompts the user to set variable <replaceable
class="parameter">name</>. An optional prompt, <replaceable
Prompts the user to supply text, which is assigned to the variable
<replaceable class="parameter">name</>.
An optional prompt string, <replaceable
class="parameter">text</>, can be specified. (For multiword
prompts, use single quotes.)
prompts, surround the text with single quotes.)
</para>
<para>
By default, <literal>\prompt</> uses the terminal for input and
output. However, if the <option>-f</> command line switch is
output. However, if the <option>-f</> command line switch was
used, <literal>\prompt</> uses standard input and standard output.
</para>
</listitem>
@ -2197,14 +2198,19 @@ lo_import 152801
<listitem>
<para>
Sets the internal variable <replaceable
Sets the <application>psql</> variable <replaceable
class="parameter">name</replaceable> to <replaceable
class="parameter">value</replaceable> or, if more than one value
is given, to the concatenation of all of them. If no second
argument is given, the variable is just set with no value. To
class="parameter">value</replaceable>, or if more than one value
is given, to the concatenation of all of them. If only one
argument is given, the variable is set with an empty value. To
unset a variable, use the <command>\unset</command> command.
</para>
<para>
<command>\set</> without any arguments displays the names and values
of all currently-set <application>psql</> variables.
</para>
<para>
Valid variable names can contain letters, digits, and
underscores. See the section <xref
@ -2221,7 +2227,7 @@ lo_import 152801
<note>
<para>
This command is totally separate from the <acronym>SQL</acronym>
This command is unrelated to the <acronym>SQL</acronym>
command <xref linkend="SQL-SET">.
</para>
</note>
@ -2293,6 +2299,18 @@ lo_import 152801
</varlistentry>
<varlistentry>
<term><literal>\unset <replaceable class="parameter">name</replaceable></literal></term>
<listitem>
<para>
Unsets (deletes) the <application>psql</> variable <replaceable
class="parameter">name</replaceable>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>\w</literal> <replaceable class="parameter">filename</replaceable></term>
<term><literal>\w</literal> <literal>|</><replaceable class="parameter">command</replaceable></term>
@ -2467,18 +2485,28 @@ lo_import 152801
<para>
To set a variable, use the <application>psql</application> meta-command
<command>\set</command>:
<command>\set</command>. For example,
<programlisting>
testdb=&gt; <userinput>\set foo bar</userinput>
</programlisting>
sets the variable <literal>foo</literal> to the value
<literal>bar</literal>. To retrieve the content of the variable, precede
the name with a colon and use it as the argument of any slash
command:
the name with a colon, for example:
<programlisting>
testdb=&gt; <userinput>\echo :foo</userinput>
bar
</programlisting></para>
</programlisting>
This works in both regular SQL commands and meta-commands; there is
more detail in <xref linkend="APP-PSQL-interpolation"
endterm="APP-PSQL-interpolation-title">, below.
</para>
<para>
If you call <command>\set</command> without a second argument, the
variable is set, with an empty string as value. To unset (i.e., delete)
a variable, use the command <command>\unset</command>. To show the
values of all variables, call <command>\set</command> without any argument.
</para>
<note>
<para>
@ -2495,12 +2523,6 @@ bar
</para>
</note>
<para>
If you call <command>\set</command> without a second argument, the
variable is set, with an empty string as value. To unset (or delete) a
variable, use the command <command>\unset</command>.
</para>
<para>
A number of these variables are treated specially
by <application>psql</application>. They represent certain option
@ -2863,47 +2885,57 @@ bar
</refsect3>
<refsect3>
<title><acronym>SQL</acronym> Interpolation</title>
<refsect3 id="APP-PSQL-interpolation">
<title id="APP-PSQL-interpolation-title"><acronym>SQL</acronym> Interpolation</title>
<para>
An additional useful feature of <application>psql</application>
A key feature of <application>psql</application>
variables is that you can substitute (<quote>interpolate</quote>)
them into regular <acronym>SQL</acronym> statements.
<application>psql</application> provides special facilities for
ensuring that values used as SQL literals and identifiers are
properly escaped. The syntax for interpolating a value without
any special escaping is again to prepend the variable name with a colon
(<literal>:</literal>):
them into regular <acronym>SQL</acronym> statements, as well as the
arguments of meta-commands. Furthermore,
<application>psql</application> provides facilities for
ensuring that variable values used as SQL literals and identifiers are
properly quoted. The syntax for interpolating a value without
any quoting is to prepend the variable name with a colon
(<literal>:</literal>). For example,
<programlisting>
testdb=&gt; <userinput>\set foo 'my_table'</userinput>
testdb=&gt; <userinput>SELECT * FROM :foo;</userinput>
</programlisting>
would then query the table <literal>my_table</literal>. Note that this
would query the table <literal>my_table</literal>. Note that this
may be unsafe: the value of the variable is copied literally, so it can
even contain unbalanced quotes or backslash commands. You must make sure
contain unbalanced quotes, or even backslash commands. You must make sure
that it makes sense where you put it.
</para>
<para>
When a value is to be used as an SQL literal or identifier, it is
safest to arrange for it to be escaped. To escape the value of
safest to arrange for it to be quoted. To quote the value of
a variable as an SQL literal, write a colon followed by the variable
name in single quotes. To escape the value an SQL identifier, write
a colon followed by the variable name in double quotes. The previous
example would be more safely written this way:
name in single quotes. To quote the value as an SQL identifier, write
a colon followed by the variable name in double quotes.
These constructs deal correctly with quotes and other special
characters embedded within the variable value.
The previous example would be more safely written this way:
<programlisting>
testdb=&gt; <userinput>\set foo 'my_table'</userinput>
testdb=&gt; <userinput>SELECT * FROM :"foo";</userinput>
</programlisting>
Variable interpolation will not be performed into quoted
<acronym>SQL</acronym> entities.
</para>
<para>
One possible use of this mechanism is to
copy the contents of a file into a table column. First load the file into a
variable and then proceed as above:
Variable interpolation will not be performed within quoted
<acronym>SQL</acronym> literals and identifiers. Therefore, a
construction such as <literal>':foo'</> doesn't work to produce a quoted
literal from a variable's value (and it would be unsafe if it did work,
since it wouldn't correctly handle quotes embedded in the value).
</para>
<para>
One example use of this mechanism is to
copy the contents of a file into a table column.
First load the file into a variable and then interpolate the variable's
value as a quoted string:
<programlisting>
testdb=&gt; <userinput>\set content `cat my_file.txt`</userinput>
testdb=&gt; <userinput>INSERT INTO my_table VALUES (:'content');</userinput>
@ -2914,17 +2946,20 @@ testdb=&gt; <userinput>INSERT INTO my_table VALUES (:'content');</userinput>
<para>
Since colons can legally appear in SQL commands, an apparent attempt
at interpolation (such as <literal>:name</literal>,
at interpolation (that is, <literal>:name</literal>,
<literal>:'name'</literal>, or <literal>:"name"</literal>) is not
changed unless the named variable is currently set. In any case, you
replaced unless the named variable is currently set. In any case, you
can escape a colon with a backslash to protect it from substitution.
(The colon syntax for variables is standard <acronym>SQL</acronym> for
</para>
<para>
The colon syntax for variables is standard <acronym>SQL</acronym> for
embedded query languages, such as <application>ECPG</application>.
The colon syntax for array slices and type casts are
<productname>PostgreSQL</productname> extensions, hence the
conflict. The colon syntax for escaping a variable's value as an
SQL literal or identifier is a <application>psql</application>
extension.)
The colon syntaxes for array slices and type casts are
<productname>PostgreSQL</productname> extensions, which can sometimes
conflict with the standard usage. The colon-quote syntax for escaping a
variable's value as an SQL literal or identifier is a
<application>psql</application> extension.
</para>
</refsect3>
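
The documentation text above says that :'name' and :"name" quote a variable's value as an SQL literal or identifier. The core idea, doubling any embedded quote character, can be sketched as follows. This is an illustrative assumption, not psql's implementation: `quote_value` is a hypothetical helper, and it assumes backslashes are not special inside string literals (standard-conforming strings). Real client code would normally use libpq's `PQescapeLiteral` and `PQescapeIdentifier` instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not the psql implementation: return a malloc'd copy
 * of "value" wrapped in "quote" characters, with every embedded "quote"
 * doubled.  Pass '\'' for an SQL literal and '"' for an SQL identifier.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static char *
quote_value(const char *value, char quote)
{
    size_t      len;
    char       *result;
    char       *p;

    assert(value != NULL);

    /* Worst case: every character doubled, plus two quotes and the NUL. */
    len = strlen(value);
    result = malloc(2 * len + 3);
    if (result == NULL)
        return NULL;

    p = result;
    *p++ = quote;
    for (; *value != '\0'; value++)
    {
        if (*value == quote)
            *p++ = quote;       /* double the embedded quote character */
        *p++ = *value;
    }
    *p++ = quote;
    *p = '\0';
    return result;
}
```

Under these assumptions, a value of `it's` becomes `'it''s'` when quoted as a literal, which is why a plain `':foo'` construction could not be made safe by textual substitution alone.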

View File

@ -121,7 +121,7 @@ HandleSlashCmds(PsqlScanState scan_state,
/* eat any remaining arguments after a valid command */
/* note we suppress evaluation of backticks here */
while ((arg = psql_scan_slash_option(scan_state,
OT_VERBATIM, NULL, false)))
OT_NO_EVAL, NULL, false)))
{
psql_error("\\%s: extra argument \"%s\" ignored\n", cmd, arg);
free(arg);

View File

@ -33,7 +33,7 @@ enum slash_option_type
OT_SQLIDHACK, /* SQL identifier, but don't downcase */
OT_FILEPIPE, /* it's a filename or pipe */
OT_WHOLE_LINE, /* just snarf the rest of the line */
OT_VERBATIM /* literal (no backticks or variables) */
OT_NO_EVAL /* no expansion of backticks or variables */
};

View File

@ -103,6 +103,8 @@ static PQExpBuffer output_buf; /* current output buffer */
/* these variables do not need to be saved across calls */
static enum slash_option_type option_type;
static char *option_quote;
static int unquoted_option_chars;
static int backtick_start_offset;
/* Return values from yylex() */
@ -114,6 +116,7 @@ static char *option_quote;
int yylex(void);
static void evaluate_backtick(void);
static void push_new_buffer(const char *newstr, const char *varname);
static void pop_buffer_stack(PsqlScanState state);
static bool var_is_current_source(PsqlScanState state, const char *varname);
@ -182,11 +185,11 @@ static void escape_variable(bool as_ident);
%x xus
/* Additional exclusive states for psql only: lex backslash commands */
%x xslashcmd
%x xslashargstart
%x xslasharg
%x xslashquote
%x xslashbackquote
%x xslashdefaultarg
%x xslashquotedarg
%x xslashdquote
%x xslashwholeline
%x xslashend
@ -900,17 +903,53 @@ other .
}
<xslasharg>{
/* eat any whitespace, then decide what to do at first nonblank */
<xslashargstart>{
/*
* Discard any whitespace before argument, then go to xslasharg state.
* An exception is that "|" is only special at start of argument, so we
* check for it here.
*/
{space}+ { }
"\\" {
"|" {
if (option_type == OT_FILEPIPE)
{
/* treat like whole-string case */
ECHO;
BEGIN(xslashwholeline);
}
else
{
/* vertical bar is not special otherwise */
yyless(0);
BEGIN(xslasharg);
}
}
{other} {
yyless(0);
BEGIN(xslasharg);
}
}
<xslasharg>{
/*
* Default processing of text in a slash command's argument.
*
* Note: unquoted_option_chars counts the number of characters at the
* end of the argument that were not subject to any form of quoting.
* psql_scan_slash_option needs this to strip trailing semicolons safely.
*/
{space}|"\\" {
/*
* Unquoted space is end of arg; do not eat. Likewise
* backslash is end of command or next command, do not eat
*
* XXX this means we can't conveniently accept options
* that start with a backslash; therefore, option
* that include unquoted backslashes; therefore, option
* processing that encourages use of backslashes is rather
* broken.
*/
@ -920,26 +959,27 @@ other .
{quote} {
*option_quote = '\'';
unquoted_option_chars = 0;
BEGIN(xslashquote);
}
"`" {
if (option_type == OT_VERBATIM)
{
/* in verbatim mode, backquote is not special */
ECHO;
BEGIN(xslashdefaultarg);
}
else
{
*option_quote = '`';
BEGIN(xslashbackquote);
}
backtick_start_offset = output_buf->len;
*option_quote = '`';
unquoted_option_chars = 0;
BEGIN(xslashbackquote);
}
{dquote} {
ECHO;
*option_quote = '"';
unquoted_option_chars = 0;
BEGIN(xslashdquote);
}
:{variable_char}+ {
/* Possible psql variable substitution */
if (option_type == OT_VERBATIM)
if (option_type == OT_NO_EVAL)
ECHO;
else
{
@ -959,71 +999,54 @@ other .
*/
if (value)
appendPQExpBufferStr(output_buf, value);
else
ECHO;
*option_quote = ':';
}
*option_quote = ':';
return LEXRES_OK;
unquoted_option_chars = 0;
}
:'{variable_char}+' {
if (option_type == OT_VERBATIM)
if (option_type == OT_NO_EVAL)
ECHO;
else
{
escape_variable(false);
return LEXRES_OK;
*option_quote = ':';
}
unquoted_option_chars = 0;
}
:\"{variable_char}+\" {
if (option_type == OT_VERBATIM)
if (option_type == OT_NO_EVAL)
ECHO;
else
{
escape_variable(true);
return LEXRES_OK;
*option_quote = ':';
}
unquoted_option_chars = 0;
}
:'{variable_char}* {
/* Throw back everything but the colon */
yyless(1);
unquoted_option_chars++;
ECHO;
BEGIN(xslashdefaultarg);
}
:\"{variable_char}* {
/* Throw back everything but the colon */
yyless(1);
unquoted_option_chars++;
ECHO;
BEGIN(xslashdefaultarg);
}
"|" {
ECHO;
if (option_type == OT_FILEPIPE)
{
/* treat like whole-string case */
BEGIN(xslashwholeline);
}
else
{
/* treat like default case */
BEGIN(xslashdefaultarg);
}
}
{dquote} {
*option_quote = '"';
ECHO;
BEGIN(xslashquotedarg);
}
{other} {
unquoted_option_chars++;
ECHO;
BEGIN(xslashdefaultarg);
}
}
@ -1034,7 +1057,7 @@ other .
* sequences
*/
{quote} { return LEXRES_OK; }
{quote} { BEGIN(xslasharg); }
{xqdouble} { appendPQExpBufferChar(output_buf, '\''); }
@ -1064,55 +1087,28 @@ other .
<xslashbackquote>{
/*
* backticked text: copy everything until next backquote or end of line.
* Invocation of the command will happen in psql_scan_slash_option.
* backticked text: copy everything until next backquote, then evaluate.
*
* XXX Possible future behavioral change: substitute for :VARIABLE?
*/
"`" { return LEXRES_OK; }
"`" {
/* In NO_EVAL mode, don't evaluate the command */
if (option_type != OT_NO_EVAL)
evaluate_backtick();
BEGIN(xslasharg);
}
{other}|\n { ECHO; }
}
<xslashdefaultarg>{
/*
* Copy everything until unquoted whitespace or end of line. Quotes
* do not get stripped yet.
*/
{space} {
yyless(0);
return LEXRES_OK;
}
"\\" {
/*
* unquoted backslash is end of command or next command,
* do not eat
*
* (this was not the behavior pre-8.0, but it seems
* consistent)
*/
yyless(0);
return LEXRES_OK;
}
{dquote} {
*option_quote = '"';
ECHO;
BEGIN(xslashquotedarg);
}
{other} { ECHO; }
}
<xslashquotedarg>{
/* double-quoted text within a default-type argument: copy */
<xslashdquote>{
/* double-quoted text: copy verbatim, including the double quotes */
{dquote} {
ECHO;
BEGIN(xslashdefaultarg);
BEGIN(xslasharg);
}
{other}|\n { ECHO; }
@ -1461,7 +1457,7 @@ psql_scan_slash_command(PsqlScanState state)
* letters.
*
* if quote is not NULL, *quote is set to 0 if no quoting was found, else
* the quote symbol.
* the last quote symbol used in the argument.
*
* if semicolon is true, unquoted trailing semicolon(s) that would otherwise
* be taken as part of the option string will be stripped.
@ -1480,7 +1476,6 @@ psql_scan_slash_option(PsqlScanState state,
PQExpBufferData mybuf;
int lexresult;
char local_quote;
bool badarg;
/* Must be scanning already */
psql_assert(state->scanbufhandle);
@ -1497,6 +1492,7 @@ psql_scan_slash_option(PsqlScanState state,
output_buf = &mybuf;
option_type = type;
option_quote = quote;
unquoted_option_chars = 0;
if (state->buffer_stack != NULL)
yy_switch_to_buffer(state->buffer_stack->buf);
@ -1506,7 +1502,7 @@ psql_scan_slash_option(PsqlScanState state,
if (type == OT_WHOLE_LINE)
BEGIN(xslashwholeline);
else
BEGIN(xslasharg);
BEGIN(xslashargstart);
/* And lex. */
lexresult = yylex();
@ -1517,85 +1513,18 @@ psql_scan_slash_option(PsqlScanState state,
* a quoted string, as indicated by YY_START, EOL is an error.
*/
psql_assert(lexresult == LEXRES_EOL || lexresult == LEXRES_OK);
badarg = false;
switch (YY_START)
{
case xslashargstart:
/* empty arg */
break;
case xslasharg:
/* empty arg, or possibly a psql variable substitution */
break;
case xslashquote:
if (lexresult != LEXRES_OK)
badarg = true; /* hit EOL not ending quote */
break;
case xslashbackquote:
if (lexresult != LEXRES_OK)
badarg = true; /* hit EOL not ending quote */
else
{
/* Perform evaluation of backticked command */
char *cmd = mybuf.data;
FILE *fd;
bool error = false;
PQExpBufferData output;
char buf[512];
size_t result;
fd = popen(cmd, PG_BINARY_R);
if (!fd)
{
psql_error("%s: %s\n", cmd, strerror(errno));
error = true;
}
initPQExpBuffer(&output);
if (!error)
{
do
{
result = fread(buf, 1, sizeof(buf), fd);
if (ferror(fd))
{
psql_error("%s: %s\n", cmd, strerror(errno));
error = true;
break;
}
appendBinaryPQExpBuffer(&output, buf, result);
} while (!feof(fd));
}
if (fd && pclose(fd) == -1)
{
psql_error("%s: %s\n", cmd, strerror(errno));
error = true;
}
if (PQExpBufferBroken(&output))
{
psql_error("%s: out of memory\n", cmd);
error = true;
}
/* Now done with cmd, transfer result to mybuf */
resetPQExpBuffer(&mybuf);
if (!error)
{
/* strip any trailing newline */
if (output.len > 0 &&
output.data[output.len - 1] == '\n')
output.len--;
appendBinaryPQExpBuffer(&mybuf, output.data, output.len);
}
termPQExpBuffer(&output);
}
break;
case xslashdefaultarg:
/* Strip any trailing semi-colons if requested */
/* Strip any unquoted trailing semi-colons if requested */
if (semicolon)
{
while (mybuf.len > 0 &&
while (unquoted_option_chars-- > 0 &&
mybuf.len > 0 &&
mybuf.data[mybuf.len - 1] == ';')
{
mybuf.data[--mybuf.len] = '\0';
@ -1642,10 +1571,13 @@ psql_scan_slash_option(PsqlScanState state,
}
}
break;
case xslashquotedarg:
/* must have hit EOL inside double quotes */
badarg = true;
break;
case xslashquote:
case xslashbackquote:
case xslashdquote:
/* must have hit EOL inside quotes */
psql_error("unterminated quoted string\n");
termPQExpBuffer(&mybuf);
return NULL;
case xslashwholeline:
/* always okay */
break;
@ -1655,13 +1587,6 @@ psql_scan_slash_option(PsqlScanState state,
exit(1);
}
if (badarg)
{
psql_error("unterminated quoted string\n");
termPQExpBuffer(&mybuf);
return NULL;
}
/*
* An unquoted empty argument isn't possible unless we are at end of
* command. Return NULL instead.
@ -1702,6 +1627,74 @@ psql_scan_slash_command_end(PsqlScanState state)
/* There are no possible errors in this lex state... */
}
/*
* Evaluate a backticked substring of a slash command's argument.
*
* The portion of output_buf starting at backtick_start_offset is evaluated
* as a shell command and then replaced by the command's output.
*/
static void
evaluate_backtick(void)
{
char *cmd = output_buf->data + backtick_start_offset;
PQExpBufferData cmd_output;
FILE *fd;
bool error = false;
char buf[512];
size_t result;
initPQExpBuffer(&cmd_output);
fd = popen(cmd, PG_BINARY_R);
if (!fd)
{
psql_error("%s: %s\n", cmd, strerror(errno));
error = true;
}
if (!error)
{
do
{
result = fread(buf, 1, sizeof(buf), fd);
if (ferror(fd))
{
psql_error("%s: %s\n", cmd, strerror(errno));
error = true;
break;
}
appendBinaryPQExpBuffer(&cmd_output, buf, result);
} while (!feof(fd));
}
if (fd && pclose(fd) == -1)
{
psql_error("%s: %s\n", cmd, strerror(errno));
error = true;
}
if (PQExpBufferBroken(&cmd_output))
{
psql_error("%s: out of memory\n", cmd);
error = true;
}
/* Now done with cmd, delete it from output_buf */
output_buf->len = backtick_start_offset;
output_buf->data[output_buf->len] = '\0';
/* If no error, transfer result to output_buf */
if (!error)
{
/* strip any trailing newline */
if (cmd_output.len > 0 &&
cmd_output.data[cmd_output.len - 1] == '\n')
cmd_output.len--;
appendBinaryPQExpBuffer(output_buf, cmd_output.data, cmd_output.len);
}
termPQExpBuffer(&cmd_output);
}
/*
* Push the given string onto the stack of stuff to scan.