Create a GUC variable REGEX_FLAVOR to control the type of regular
expression accepted by the regex operators, per discussion yesterday.

Along the way, reduce deadlock_timeout from PGC_POSTMASTER to PGC_SIGHUP
category.  It is probably best to insist that all backends share the same
setting, but that doesn't mean it has to be frozen at startup.
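
For readers unfamiliar with the hook convention used by assign_regex_flavor
in the regexp.c hunk of this commit, here is a minimal standalone sketch of
the same validate-then-assign pattern. It is an illustration under stated
assumptions, not the backend code: every sketch_* name is hypothetical, the
enum values are placeholders rather than the real REG_* flags, and the
"interactive" argument of the real hook is omitted for brevity.

```c
/* Illustrative sketch only; all sketch_* names are hypothetical and the
 * flavor constants are placeholders, not the regex library's REG_* flags. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>			/* strcasecmp (POSIX) */

enum
{
	SKETCH_BASIC = 0,
	SKETCH_EXTENDED = 1,
	SKETCH_ADVANCED = 3
};

/* Stand-in for the static regex_flavor variable; defaults to advanced. */
static int	sketch_flavor = SKETCH_ADVANCED;

/*
 * Validate a flavor name case-insensitively.  The setting is changed only
 * when doit is true; with doit false the call is a pure validity check.
 * Returns the input string on success and NULL on an unrecognized name.
 */
const char *
sketch_assign_flavor(const char *value, bool doit)
{
	int			parsed;

	if (strcasecmp(value, "advanced") == 0)
		parsed = SKETCH_ADVANCED;
	else if (strcasecmp(value, "extended") == 0)
		parsed = SKETCH_EXTENDED;
	else if (strcasecmp(value, "basic") == 0)
		parsed = SKETCH_BASIC;
	else
		return NULL;			/* fail: leave the current setting alone */

	if (doit)
		sketch_flavor = parsed;
	return value;				/* OK */
}

/* Accessor so callers can observe the current setting. */
int
sketch_current_flavor(void)
{
	return sketch_flavor;
}

/* Small self-check showing the intended behavior. */
void
sketch_demo(void)
{
	assert(sketch_assign_flavor("Basic", false) != NULL);	/* validate only */
	assert(sketch_current_flavor() == SKETCH_ADVANCED);	/* unchanged */
}
```

The point of the doit flag is that a value can be checked without side
effects before it is committed, and that a rejected value never disturbs the
setting already in force.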
Tom Lane 2003-02-06 20:25:33 +00:00
parent 465ed56531
commit 77ede8900d
7 changed files with 114 additions and 53 deletions


@ -1,5 +1,5 @@
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/func.sgml,v 1.137 2003/02/05 17:41:32 tgl Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/func.sgml,v 1.138 2003/02/06 20:25:31 tgl Exp $
PostgreSQL documentation
-->
@ -2665,10 +2665,24 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
due to their availability in programming languages such as Perl and Tcl.
<acronym>RE</acronym>s using these non-POSIX extensions are called
<firstterm>advanced</> <acronym>RE</acronym>s or <acronym>ARE</>s
in this documentation. We first describe the ERE/ARE flavor and then
mention the restrictions of the BRE form.
in this documentation. AREs are almost an exact superset of EREs,
but BREs have several notational incompatibilities (as well as being
much more limited).
We first describe the ARE and ERE forms, noting features that apply
only to AREs, and then describe how BREs differ.
</para>
<note>
<para>
The form of regular expressions accepted by <productname>PostgreSQL</>
can be chosen by setting the <varname>REGEX_FLAVOR</> run-time parameter
(described in the &cite-admin;). The usual setting is
<literal>advanced</>, but one might choose <literal>extended</> for
maximum backwards compatibility with pre-7.4 releases of
<productname>PostgreSQL</>.
</para>
</note>
<para>
A regular expression is defined as one or more
<firstterm>branches</firstterm>, separated by
@ -2784,7 +2798,7 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
meaning in <productname>PostgreSQL</> string literals.
To write a pattern constant that contains a backslash,
you must write two backslashes in the query.
</para>
</para>
</note>
<table id="posix-quantifiers-table">
@ -3392,11 +3406,11 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
</para>
<para>
Normally the flavor of RE being used is specified by
application-dependent means.
However, this can be overridden by a <firstterm>director</>.
Normally the flavor of RE being used is determined by
<varname>REGEX_FLAVOR</>.
However, this can be overridden by a <firstterm>director</> prefix.
If an RE of any flavor begins with <literal>***:</>,
the rest of the RE is an ARE.
the rest of the RE is taken as an ARE.
If an RE of any flavor begins with <literal>***=</>,
the rest of the RE is taken to be a literal string,
with all characters considered ordinary characters.
@ -3407,8 +3421,8 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
a sequence <literal>(?</><replaceable>xyz</><literal>)</>
(where <replaceable>xyz</> is one or more alphabetic characters)
specifies options affecting the rest of the RE.
These supplement, and can override,
any options specified externally.
These options override any previously determined options (including
both the RE flavor and case sensitivity).
The available option letters are
shown in <xref linkend="posix-embedded-options-table">.
</para>
@ -3432,7 +3446,7 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
<row>
<entry> <literal>c</> </entry>
<entry> case-sensitive matching (usual default) </entry>
<entry> case-sensitive matching (overrides operator type) </entry>
</row>
<row>
@ -3443,7 +3457,7 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
<row>
<entry> <literal>i</> </entry>
<entry> case-insensitive matching (see
<xref linkend="posix-matching-rules">) </entry>
<xref linkend="posix-matching-rules">) (overrides operator type) </entry>
</row>
<row>
@ -3471,12 +3485,12 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
<row>
<entry> <literal>s</> </entry>
<entry> non-newline-sensitive matching (usual default) </entry>
<entry> non-newline-sensitive matching (default) </entry>
</row>
<row>
<entry> <literal>t</> </entry>
<entry> tight syntax (usual default; see below) </entry>
<entry> tight syntax (default; see below) </entry>
</row>
<row>
@ -3696,7 +3710,7 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
</para>
<para>
Two significant incompatibilites exist between AREs and the ERE syntax
Two significant incompatibilities exist between AREs and the ERE syntax
recognized by pre-7.4 releases of <productname>PostgreSQL</>:
<itemizedlist>
@ -3717,6 +3731,10 @@ SUBSTRING('foobar' FROM 'o(.)b') <lineannotation>o</lineannotation>
</para>
</listitem>
</itemizedlist>
While these differences are unlikely to create a problem for most
applications, you can avoid them if necessary by
setting <varname>REGEX_FLAVOR</> to <literal>extended</>.
</para>
</sect3>


@ -1,5 +1,5 @@
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.167 2003/01/25 23:10:27 tgl Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.168 2003/02/06 20:25:31 tgl Exp $
-->
<Chapter Id="runtime">
@ -1447,8 +1447,7 @@ env PGOPTIONS='-c geqo=off' psql
practice. On a heavily loaded server you might want to raise it.
Ideally the setting should exceed your typical transaction time,
so as to improve the odds that the lock will be released before
the waiter decides to check for deadlock. This option can only
be set at server start.
the waiter decides to check for deadlock.
</para>
</listitem>
</varlistentry>
@ -1781,6 +1780,20 @@ dynamic_library_path = '/usr/local/lib/postgresql:/home/my_project/lib:$libdir'
</listitem>
</varlistentry>
<varlistentry>
<term><varname>REGEX_FLAVOR</varname> (<type>string</type>)</term>
<indexterm><primary>regular expressions</></>
<listitem>
<para>
The regular expression <quote>flavor</> can be set to
<literal>advanced</>, <literal>extended</>, or <literal>basic</>.
The usual default is <literal>advanced</>. The <literal>extended</>
setting may be useful for exact backwards compatibility with
pre-7.4 releases of <productname>PostgreSQL</>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><varname>SEARCH_PATH</varname> (<type>string</type>)</term>
<indexterm><primary>search_path</></>


@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $Header: /cvsroot/pgsql/src/backend/utils/adt/regexp.c,v 1.44 2003/02/05 17:41:32 tgl Exp $
* $Header: /cvsroot/pgsql/src/backend/utils/adt/regexp.c,v 1.45 2003/02/06 20:25:33 tgl Exp $
*
* Alistair Crooks added the code for the regex caching
* agc - cached the regular expressions used - there's a good chance
@ -34,6 +34,10 @@
#include "utils/builtins.h"
/* GUC-settable flavor parameter */
static int regex_flavor = REG_ADVANCED;
/*
* We cache precompiled regular expressions using a "self organizing list"
* structure, in which recently-used items tend to be near the front.
@ -216,6 +220,34 @@ RE_compile_and_execute(text *text_re, unsigned char *dat, int dat_len,
}
/*
* assign_regex_flavor - GUC hook to validate and set REGEX_FLAVOR
*/
const char *
assign_regex_flavor(const char *value,
bool doit, bool interactive)
{
if (strcasecmp(value, "advanced") == 0)
{
if (doit)
regex_flavor = REG_ADVANCED;
}
else if (strcasecmp(value, "extended") == 0)
{
if (doit)
regex_flavor = REG_EXTENDED;
}
else if (strcasecmp(value, "basic") == 0)
{
if (doit)
regex_flavor = REG_BASIC;
}
else
return NULL; /* fail */
return value; /* OK */
}
/*
* interface routines called by the function manager
*/
@ -229,7 +261,7 @@ nameregexeq(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(RE_compile_and_execute(p,
(unsigned char *) NameStr(*n),
strlen(NameStr(*n)),
REG_ADVANCED,
regex_flavor,
0, NULL));
}
@ -242,7 +274,7 @@ nameregexne(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(!RE_compile_and_execute(p,
(unsigned char *) NameStr(*n),
strlen(NameStr(*n)),
REG_ADVANCED,
regex_flavor,
0, NULL));
}
@ -255,7 +287,7 @@ textregexeq(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(RE_compile_and_execute(p,
(unsigned char *) VARDATA(s),
VARSIZE(s) - VARHDRSZ,
REG_ADVANCED,
regex_flavor,
0, NULL));
}
@ -268,7 +300,7 @@ textregexne(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(!RE_compile_and_execute(p,
(unsigned char *) VARDATA(s),
VARSIZE(s) - VARHDRSZ,
REG_ADVANCED,
regex_flavor,
0, NULL));
}
@ -288,7 +320,7 @@ nameicregexeq(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(RE_compile_and_execute(p,
(unsigned char *) NameStr(*n),
strlen(NameStr(*n)),
REG_ICASE | REG_ADVANCED,
regex_flavor | REG_ICASE,
0, NULL));
}
@ -301,7 +333,7 @@ nameicregexne(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(!RE_compile_and_execute(p,
(unsigned char *) NameStr(*n),
strlen(NameStr(*n)),
REG_ICASE | REG_ADVANCED,
regex_flavor | REG_ICASE,
0, NULL));
}
@ -314,7 +346,7 @@ texticregexeq(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(RE_compile_and_execute(p,
(unsigned char *) VARDATA(s),
VARSIZE(s) - VARHDRSZ,
REG_ICASE | REG_ADVANCED,
regex_flavor | REG_ICASE,
0, NULL));
}
@ -327,7 +359,7 @@ texticregexne(PG_FUNCTION_ARGS)
PG_RETURN_BOOL(!RE_compile_and_execute(p,
(unsigned char *) VARDATA(s),
VARSIZE(s) - VARHDRSZ,
REG_ICASE | REG_ADVANCED,
regex_flavor | REG_ICASE,
0, NULL));
}
@ -353,7 +385,7 @@ textregexsubstr(PG_FUNCTION_ARGS)
match = RE_compile_and_execute(p,
(unsigned char *) VARDATA(s),
VARSIZE(s) - VARHDRSZ,
REG_ADVANCED,
regex_flavor,
2, pmatch);
/* match? then return the substring matching the pattern */


@ -5,7 +5,7 @@
* command, configuration file, and command line options.
* See src/backend/utils/misc/README for more information.
*
* $Header: /cvsroot/pgsql/src/backend/utils/misc/guc.c,v 1.113 2003/01/28 18:04:02 tgl Exp $
* $Header: /cvsroot/pgsql/src/backend/utils/misc/guc.c,v 1.114 2003/02/06 20:25:33 tgl Exp $
*
* Copyright 2000 by PostgreSQL Global Development Group
* Written by Peter Eisentraut <peter_e@gmx.net>.
@ -127,6 +127,7 @@ static double phony_random_seed;
static char *client_encoding_string;
static char *datestyle_string;
static char *default_iso_level_string;
static char *regex_flavor_string;
static char *server_encoding_string;
static char *session_authorization_string;
static char *timezone_string;
@ -568,7 +569,7 @@ static struct config_int
},
{
{"deadlock_timeout", PGC_POSTMASTER}, &DeadlockTimeout,
{"deadlock_timeout", PGC_SIGHUP}, &DeadlockTimeout,
1000, 0, INT_MAX, NULL, NULL
},
@ -818,6 +819,11 @@ static struct config_string
"C", locale_time_assign, NULL
},
{
{"regex_flavor", PGC_USERSET}, &regex_flavor_string,
"advanced", assign_regex_flavor, NULL
},
{
{"search_path", PGC_USERSET, GUC_LIST_INPUT | GUC_LIST_QUOTE},
&namespace_search_path,


@ -208,6 +208,7 @@
#max_expr_depth = 10000 # min 10
#max_files_per_process = 1000 # min 25
#password_encryption = true
#regex_flavor = advanced # advanced, extended, or basic
#sql_inheritance = true
#transform_null_equals = false
#statement_timeout = 0 # 0 is disabled, in milliseconds


@ -3,7 +3,7 @@
*
* Copyright 2000-2002 by PostgreSQL Global Development Group
*
* $Header: /cvsroot/pgsql/src/bin/psql/tab-complete.c,v 1.72 2003/01/25 23:10:30 tgl Exp $
* $Header: /cvsroot/pgsql/src/bin/psql/tab-complete.c,v 1.73 2003/02/06 20:25:33 tgl Exp $
*/
/*----------------------------------------------------------------------
@ -215,14 +215,11 @@ psql_completion(char *text, int start, int end)
"TRANSACTION",
/*
* the rest should match USERSET entries in
* backend/utils/misc/guc.c
* the rest should match USERSET and possibly SUSET entries in
* backend/utils/misc/guc.c.
*/
"australian_timezones",
"authentication_timeout",
"autocommit",
"checkpoint_segments",
"checkpoint_timeout",
"client_encoding",
"client_min_messages",
"commit_delay",
@ -231,7 +228,6 @@ psql_completion(char *text, int start, int end)
"cpu_operator_cost",
"cpu_tuple_cost",
"DateStyle",
"db_user_namespace",
"deadlock_timeout",
"debug_pretty_print",
"debug_print_parse",
@ -239,19 +235,19 @@ psql_completion(char *text, int start, int end)
"debug_print_rewritten",
"default_statistics_target",
"default_transaction_isolation",
"default_transaction_read_only",
"dynamic_library_path",
"effective_cache_size",
"enable_hashagg",
"enable_hashjoin",
"enable_indexscan",
"enable_mergejoin",
"enable_nestloop",
"enable_seqscan",
"enable_sort",
"enable_hashagg",
"enable_tidscan",
"explain_pretty_print",
"extra_float_digits",
"fixbtree",
"from_collapse_limit",
"fsync",
"geqo",
@ -262,18 +258,19 @@ psql_completion(char *text, int start, int end)
"geqo_selection_bias",
"geqo_threshold",
"join_collapse_limit",
"log_hostname",
"krb_server_keyfile",
"lc_messages",
"lc_monetary",
"lc_numeric",
"lc_time",
"log_connections",
"log_duration",
"log_executor_stats",
"log_min_error_statement",
"log_pid",
"log_min_messages",
"log_parser_stats",
"log_planner_stats",
"log_statement",
"log_timestamp",
"log_statement_stats",
"max_connections",
"max_expr_depth",
"max_files_per_process",
@ -282,19 +279,12 @@ psql_completion(char *text, int start, int end)
"max_locks_per_transaction",
"password_encryption",
"port",
"pre_auth_delay",
"random_page_cost",
"regex_flavor",
"search_path",
"log_min_messages",
"shared_buffers",
"log_executor_stats",
"log_parser_stats",
"log_planner_stats",
"log_source_port",
"log_statement_stats",
"seed",
"server_encoding",
"silent_mode",
"sort_mem",
"sql_inheritance",
"ssl",
@ -316,7 +306,6 @@ psql_completion(char *text, int start, int end)
"unix_socket_group",
"unix_socket_permissions",
"vacuum_mem",
"virtual_hostt",
"wal_buffers",
"wal_debug",
"wal_sync_method",


@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2002, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $Id: builtins.h,v 1.206 2002/12/06 05:20:28 momjian Exp $
* $Id: builtins.h,v 1.207 2003/02/06 20:25:33 tgl Exp $
*
*-------------------------------------------------------------------------
*/
@ -377,6 +377,8 @@ extern Datum texticregexeq(PG_FUNCTION_ARGS);
extern Datum texticregexne(PG_FUNCTION_ARGS);
extern Datum textregexsubstr(PG_FUNCTION_ARGS);
extern Datum similar_escape(PG_FUNCTION_ARGS);
extern const char *assign_regex_flavor(const char *value,
bool doit, bool interactive);
/* regproc.c */
extern Datum regprocin(PG_FUNCTION_ARGS);