Document method of removing invalid UTF8 escape sequences from dump
file.  Backpatch to 8.1.X.

Paul Lindner
Bruce Momjian 2005-12-06 19:26:43 +00:00
parent af2e8a872d
commit 394fedfd72
1 changed files with 15 additions and 1 deletions


@@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/release.sgml,v 1.403 2005/12/06 18:45:18 momjian Exp $
$PostgreSQL: pgsql/doc/src/sgml/release.sgml,v 1.404 2005/12/06 19:26:43 momjian Exp $
Typical markup:
@@ -525,6 +525,20 @@ psql -t -f fixseq.sql db1 | psql -e db1
<type>boolean</type> rather than an <type>integer</type> (Neil)
</para>
</listitem>
<listitem>
<para>
Some users are having problems loading <literal>UTF8</> data into
8.1.X. This is because previous versions allowed invalid <literal>UTF8</>
sequences to be entered into the database, and this release
properly accepts only valid <literal>UTF8</> sequences. One
way to correct a dump file is to run <command>iconv -c -f UTF8 -t UTF8</>
on it, which drops the invalid character sequences. <command>iconv</>
reads the entire input file into memory, so it might be necessary to
<command>split</> the dump into multiple smaller files for processing.
</para>
</listitem>
</itemizedlist>
</sect2>
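
The release note added here recommends iconv and warns that it reads the whole file into memory. As an illustration only, and not part of this commit, the same cleanup can be sketched in Python as a streaming filter that holds one chunk at a time. The function name `strip_invalid_utf8` and the `chunk_size` parameter are hypothetical. The sketch also assumes that Python's strict UTF-8 decoder is an acceptable stand-in for the server's validity check. The two may not agree on every edge case, so the result should still be verified by loading it.

```python
import codecs


def strip_invalid_utf8(src, dst, chunk_size=1 << 20):
    """Copy binary stream src to dst, dropping bytes that are not valid UTF-8.

    Hypothetical helper: only chunk_size bytes (plus a few buffered bytes of
    a sequence split across chunks) are held in memory at any time.
    """
    # errors="ignore" silently drops undecodable bytes, similar in spirit
    # to what `iconv -c` does (assumption: the exact rules may differ).
    decoder = codecs.getincrementaldecoder("utf-8")(errors="ignore")
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # The incremental decoder keeps an incomplete trailing sequence
        # buffered until the next chunk arrives.
        dst.write(decoder.decode(chunk).encode("utf-8"))
    # final=True flushes the decoder; a truncated sequence at end of
    # input is dropped rather than raising an error.
    dst.write(decoder.decode(b"", final=True).encode("utf-8"))


# Example use (file names are placeholders):
#   with open("dump.sql", "rb") as src, open("dump.clean.sql", "wb") as dst:
#       strip_invalid_utf8(src, dst)
```

Both files must be opened in binary mode so that no implicit decoding happens before the filter sees the bytes. Comparing the sizes of the input and output files afterwards gives a quick indication of how many bytes were dropped.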