Create a quick-and-dirty list of known migration issues for pre-8.3
users of tsearch.  This isn't meant to be permanent documentation,
but to call out the areas that need either fixing or real documentation.
Tom Lane 2007-10-22 03:37:04 +00:00
parent f1c87830b5
commit 6088bfb8b6

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.21 2007/10/21 20:04:37 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.22 2007/10/22 03:37:04 tgl Exp $ -->
<chapter id="textsearch">
<title id="textsearch-title">Full Text Search</title>
@ -3476,9 +3476,98 @@ Parser: "pg_catalog.default"
<title>Migration from Pre-8.3 Text Search</title>
<para>
This needs to be written ...
This area needs lots of work. Here is a quick list of known issues:
</para>
<itemizedlist mark="bullet">
<listitem>
<para>
The old contrib/tsearch2 objects <emphasis>must</> be removed from
the pg_dump output from a pre-8.3 database. While many of them won't
load for lack of a tsearch2.so library, some do and cause problems.
We have a working Perl script for doing this with a custom- or tar-format
backup, but there is a proposal to incorporate the functionality directly
into pg_restore. Neither approach will help for pg_dumpall output.
</para>
</listitem>
<listitem>
<para>
The old dump may include schema-qualified references to the old
contrib/tsearch2 objects; for example <literal>public.tsvector</>
columns in table definitions. These will fail since the objects
are now in the pg_catalog schema. Given current pg_dump behavior
this will happen only for tables that are in a different schema
from the tsearch2 objects, which makes it more likely to bite
people who carefully put their tsearch2 objects in a
non-<literal>public</> schema.
</para>
<para>
Question: will restore-time failures of this type happen for
any objects other than the tsvector and tsquery datatypes?
</para>
<para>
The basic alternatives for fixing this seem to involve creating
a dummy linkage, such as a public.tsvector domain linking to the
base pg_catalog.tsvector type (which only helps for the datatypes);
or stripping the schema references out of the dump. We could
just recommend that users do this manually, or try to provide
some tools to help.
</para>
</listitem>
<listitem>
<para>
We have renamed the built-in tsvector update triggers, and changed
their arguments too. This will result in CREATE TRIGGER commands
failing during load, which can be ignored, but users will need to
re-issue them with suitable argument adjustment. We probably
can't automate that for them. Also, the old tsearch2 trigger
function offered an option to invoke functions, which was removed
as being a security hole. Users who were relying on that will need to
write custom trigger functions as a substitute. I think all we
can do here is document what to do to fix it.
</para>
</listitem>
<listitem>
<para>
We have renamed a number of other functions besides the triggers,
compared to the tsearch2 versions. This seems unlikely to cause
any problems during dump/reload but it will require adjustments in
the bodies of stored procedures and in client application code.
Again, not much to do except document it.
</para>
</listitem>
<listitem>
<para>
Configuration setup is completely different now. Can we provide
any automated assistance for translating an old custom setup?
It probably can't be 100% automatic in any case, so maybe documentation
is the best we can do here too. Aside from the inside-the-database
differences, outside-the-database configuration files now have
prescribed locations and extensions, which was not true before.
</para>
</listitem>
<listitem>
<para>
Relocation of configuration from add-on tables into core system catalogs
will break client queries that looked at the add-on tables.
</para>
</listitem>
<listitem>
<para>
What else?
</para>
</listitem>
</itemizedlist>
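<para>
As a rough sketch of the dummy-linkage alternative mentioned above
(an illustration only, not a tested or recommended procedure), domains
could be created before loading the dump so that schema-qualified
references resolve. This assumes the old objects lived in the
<literal>public</> schema, and as noted above it helps only for the
datatypes:
</para>
<programlisting>
-- Assumption: the dump refers to public.tsvector and public.tsquery.
-- Each domain simply forwards to the built-in type in pg_catalog.
CREATE DOMAIN public.tsvector AS pg_catalog.tsvector;
CREATE DOMAIN public.tsquery AS pg_catalog.tsquery;
</programlisting>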
</sect1>
</chapter>