Doc: fix thinko in description of how to escape a backslash in bytea.

Also clean up some discussion that had been left in a very confused
state thanks to half-hearted adjustments for the change to
standard_conforming_strings being the default.

Discussion: https://postgr.es/m/154954987367.1297.4358910045409218@wrigleys.postgresql.org
Tom Lane 2019-02-08 12:49:36 -05:00
parent 9d6d2b2134
commit 8cf3fada2f
1 changed file with 26 additions and 32 deletions
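
For orientation, the two parsing layers that the corrected text distinguishes can be sketched in Python. This is only a simplified model, not PostgreSQL's code: the function names parse_sql_literal and bytea_escape_input are made up for illustration, standard_conforming_strings is assumed to be on, the hex (\x) format is left out, and non-backslash characters are simply encoded as UTF-8.

```python
def parse_sql_literal(literal: str) -> str:
    """Model the generic string-literal parser (hypothetical helper):
    strip the outer single quotes and reduce each doubled single quote
    to one data character. Backslashes pass through untouched, as they
    do when standard_conforming_strings is on."""
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("not a single-quoted literal")
    body = literal[1:-1]
    out = []
    i = 0
    while i < len(body):
        if body[i] == "'":
            if i + 1 < len(body) and body[i + 1] == "'":
                out.append("'")
                i += 2
            else:
                raise ValueError("unescaped single quote inside literal")
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


def bytea_escape_input(text: str) -> bytes:
    """Model the bytea input function for escape format (hypothetical
    helper): a backslash must be followed either by another backslash
    or by three octal digits in the range 000-377."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.extend(ch.encode("utf-8"))
            i += 1
        elif text[i + 1:i + 2] == "\\":
            out.append(0x5C)  # doubled backslash -> one backslash octet
            i += 2
        elif (len(text) >= i + 4
              and text[i + 1] in "0123"
              and text[i + 2] in "01234567"
              and text[i + 3] in "01234567"):
            out.append(int(text[i + 1:i + 4], 8))  # three-digit octal escape
            i += 4
        else:
            raise ValueError("invalid input syntax for type bytea")
    return bytes(out)


if __name__ == "__main__":
    # The two spellings of a backslash given in the corrected table row.
    for lit in (r"'\\'", r"'\134'"):
        print(lit, bytea_escape_input(parse_sql_literal(lit)).hex())
```

In this model both '\\' and '\134' yield the single octet 5c, while the old table entry '\' is rejected by the second layer, which is the mistake the commit corrects.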

@@ -1335,9 +1335,9 @@ SELECT b, char_length(b) FROM test2;
 per byte, most significant nibble first. The entire string is
 preceded by the sequence <literal>\x</literal> (to distinguish it
 from the escape format). In some contexts, the initial backslash may
-need to be escaped by doubling it, in the same cases in which backslashes
-have to be doubled in escape format; details appear below.
-The hexadecimal digits can
+need to be escaped by doubling it
+(see <xref linkend="sql-syntax-strings"/>).
+For input, the hexadecimal digits can
 be either upper or lower case, and whitespace is permitted between
 digit pairs (but not within a digit pair nor in the starting
 <literal>\x</literal> sequence).
@@ -1379,9 +1379,7 @@ SELECT '\xDEADBEEF';
 values <emphasis>must</emphasis> be escaped, while all octet
 values <emphasis>can</emphasis> be escaped. In
 general, to escape an octet, convert it into its three-digit
-octal value and precede it
-by a backslash (or two backslashes, if writing the value as a
-literal using escape string syntax).
+octal value and precede it by a backslash.
 Backslash itself (octet decimal value 92) can alternatively be represented by
 double backslashes.
 <xref linkend="datatype-binary-sqlesc"/>
@@ -1398,7 +1396,7 @@ SELECT '\xDEADBEEF';
 <entry>Description</entry>
 <entry>Escaped Input Representation</entry>
 <entry>Example</entry>
-<entry>Output Representation</entry>
+<entry>Hex Representation</entry>
 </row>
 </thead>
@@ -1422,7 +1420,7 @@ SELECT '\xDEADBEEF';
 <row>
 <entry>92</entry>
 <entry>backslash</entry>
-<entry><literal>'\'</literal> or <literal>'\\134'</literal></entry>
+<entry><literal>'\\'</literal> or <literal>'\134'</literal></entry>
 <entry><literal>SELECT '\\'::bytea;</literal></entry>
 <entry><literal>\x5c</literal></entry>
 </row>
@@ -1442,39 +1440,35 @@ SELECT '\xDEADBEEF';
 <para>
 The requirement to escape <emphasis>non-printable</emphasis> octets
 varies depending on locale settings. In some instances you can get away
-with leaving them unescaped. Note that the result in each of the examples
-in <xref linkend="datatype-binary-sqlesc"/> was exactly one octet in
-length, even though the output representation is sometimes
-more than one character.
+with leaving them unescaped.
 </para>
 <para>
-The reason multiple backslashes are required, as shown
-in <xref linkend="datatype-binary-sqlesc"/>, is that an input
-string written as a string literal must pass through two parse
-phases in the <productname>PostgreSQL</productname> server.
-The first backslash of each pair is interpreted as an escape
-character by the string-literal parser (assuming escape string
-syntax is used) and is therefore consumed, leaving the second backslash of the
-pair. (Dollar-quoted strings can be used to avoid this level
-of escaping.) The remaining backslash is then recognized by the
-<type>bytea</type> input function as starting either a three
-digit octal value or escaping another backslash. For example,
-a string literal passed to the server as <literal>'\001'</literal>
-becomes <literal>\001</literal> after passing through the
-escape string parser. The <literal>\001</literal> is then sent
-to the <type>bytea</type> input function, where it is converted
-to a single octet with a decimal value of 1. Note that the
-single-quote character is not treated specially by <type>bytea</type>,
-so it follows the normal rules for string literals. (See also
-<xref linkend="sql-syntax-strings"/>.)
+The reason that single quotes must be doubled, as shown
+in <xref linkend="datatype-binary-sqlesc"/>, is that this
+is true for any string literal in a SQL command. The generic
+string-literal parser consumes the outermost single quotes
+and reduces any pair of single quotes to one data character.
+What the <type>bytea</type> input function sees is just one
+single quote, which it treats as a plain data character.
+However, the <type>bytea</type> input function treats
+backslashes as special, and the other behaviors shown in
+<xref linkend="datatype-binary-sqlesc"/> are implemented by
+that function.
+</para>
+<para>
+In some contexts, backslashes must be doubled compared to what is
+shown above, because the generic string-literal parser will also
+reduce pairs of backslashes to one data character;
+see <xref linkend="sql-syntax-strings"/>.
 </para>
 <para>
 <type>Bytea</type> octets are output in <literal>hex</literal>
 format by default. If you change <xref linkend="guc-bytea-output"/>
 to <literal>escape</literal>,
-<quote>non-printable</quote> octet are converted to
+<quote>non-printable</quote> octets are converted to their
 equivalent three-digit octal value and preceded by one backslash.
 Most <quote>printable</quote> octets are output by their standard
 representation in the client character set, e.g.: