Commit Graph

32 Commits

Author SHA1 Message Date
Tom Lane 08fd6ff37f Sync regex code with Tcl 8.5.11.
Sync our regex code with upstream changes since last time we did this,
which was Tcl 8.5.0 (see commit df1e965e12).

There are no functional changes here; the main point is just to lay down
a commit-log marker that somebody has looked at this recently, and to do
what we can to keep the two codebases comparable.
2012-02-17 19:44:26 -05:00
Tom Lane 1e16a8107d Teach regular expression operators to honor collations.
This involves getting the character classification and case-folding
functions in the regex library to use the collations infrastructure.
Most of this work had been done already in connection with the upper/lower
and LIKE logic, so it was a simple matter of transposition.

While at it, split out these functions into a separate source file
regc_pg_locale.c, so that they can be correctly labeled with the Postgres
project's license rather than the Scriptics license.  These functions are
100% Postgres-written code whereas what remains in regc_locale.c is still
mostly not ours, so lumping them both under the same copyright notice was
getting more and more misleading.
2011-04-10 18:03:09 -04:00
Magnus Hagander 9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
Tom Lane e621037eec Tweak a couple of macros in the regex code to suppress compiler warnings
from "clang".  The VERR changes make an assignment unconditional, which is
probably easier to read/understand anyway, and one can hardly argue that
it's worth shaving cycles off the case of reporting another error when
one has already been detected.  The INSIST change limits where that macro
can be used, but not in a way that creates a problem for any existing call.
2010-08-02 02:29:39 +00:00
Tom Lane ee3a81f0a0 Change regexp engine's ccondissect/crevdissect routines to perform DFA
matching before recursing instead of after.  The DFA match eliminates
unworkable midpoint choices a lot faster than the recursive check, in most
cases, so doing it first can speed things up; particularly in pathological
cases such as recently exhibited by Michael Glaesemann.

In addition, apply some cosmetic changes that were applied upstream (in the
Tcl project) at the same time, in order to sync with upstream version 1.15
of regexec.c.

Upstream apparently intends to backpatch this, so I will too.  The
pathological behavior could be unpleasant if encountered in the field,
which seems to justify any risk of introducing new bugs.

Tom Lane, reviewed by Donal K. Fellows of Tcl project
2010-02-01 02:45:29 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane 303e089df5 Clean up possibly-uninitialized-variable warnings reported by gcc 4.x. 2005-09-24 22:54:44 +00:00
Bruce Momjian 75a64eeb4b I made the patch that implements regexp_replace again.
The specification of this function is as follows.

regexp_replace(source text, pattern text, replacement text, [flags
text])
returns text

Replace string that matches to regular expression in source text to
replacement text.

 - pattern is regular expression pattern.
 - replacement is replace string that can use '\1'-'\9', and '\&'.
    '\1'-'\9': back reference to the n'th subexpression.
    '\&'     : entire matched string.
 - flags can use the following values:
    g: global (replace all)
    i: ignore case
    When the flags is not specified, case sensitive, replace the first
    instance only.

Atsushi Ogawa
2005-07-10 04:54:33 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Bruce Momjian 46785776c4 Another pgindent run with updated typedefs. 2003-08-08 21:42:59 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane 7bcc6d98fb Replace regular expression package with Henry Spencer's latest version
(extracted from Tcl 8.4.1 release, as Henry still hasn't got round to
making it a separate library).  This solves a performance problem for
multibyte, as well as upgrading our regexp support to match recent Tcl
and nearly match recent Perl.
2003-02-05 17:41:33 +00:00
Bruce Momjian bea4792125 This patch removes a bunch of superfluous #include directives: if
postgres.h or c.h includes a system header (such as stdio.h or
stdlib.h), there's no need to specifically include it in any of the .c
files in the backend.

Neil Conway
2002-11-08 20:23:57 +00:00
Tatsuo Ishii ed7baeaf4d Remove #ifdef MULTIBYTE per hackers list discussion. 2002-08-29 07:22:30 +00:00
Thomas G. Lockhart ea01a451cc Implement SQL99 OVERLAY(). Allows substitution of a substring in a string.
Implement SQL99 SIMILAR TO as a synonym for our existing operator "~".
Implement SQL99 regular expression SUBSTRING(string FROM pat FOR escape).
 Extend the definition to make the FOR clause optional.
 Define textregexsubstr() to actually implement this feature.
Update the regression test to include these new string features.
 All tests pass.
Rename the regular expression support routines from "pg95_xxx" to "pg_xxx".
Define CREATE CHARACTER SET in the parser per SQL99. No implementation yet.
2002-06-11 15:44:38 +00:00
Bruce Momjian b81844b173 pgindent run on all C files. Java run to follow. initdb/regression
tests pass.
2001-10-25 05:50:21 +00:00
Bruce Momjian fde8edaf53 Add do { ... } while (0) to more bad macros. 2001-10-25 01:29:37 +00:00
Bruce Momjian 9e1552607a pgindent run. Make it all clean. 2001-03-22 04:01:46 +00:00
Tom Lane f7a839bc2b Clean up portability problems in regexp package: change all routine
definitions from K&R to ANSI C style, and fix broken assumption that
int and long are the same datatype.  This repairs problems observed
on Alpha with regexps having between 32 and 63 states.
2001-02-13 00:02:36 +00:00
Peter Eisentraut 533d516629 Removed MBFLAGS from makefiles since it's now done in include/config.h. 2000-01-19 02:59:03 +00:00
Bruce Momjian a9591ce66a Change #include's to use <> and "" as appropriate. 1999-07-15 23:04:24 +00:00
Bruce Momjian fa1a8d6a97 OK, folks, here is the pgindent output. 1998-09-01 04:40:42 +00:00
Bruce Momjian af74855a60 Renaming cleanup, no pgindent yet. 1998-09-01 03:29:17 +00:00
Bruce Momjian 7b2b779a2a Add auto-size to screen to \d? commands. Use UNION to show all
\d? results in one query. Add \d? field search feature.  Rename MB
to MULTIBYTE.
1998-07-18 18:34:34 +00:00
Marc G. Fournier 661ecf3c48 From: t-ishii@sra.co.jp
Included are patches intended for allowing PostgreSQL to handle
multi-byte charachter sets such as EUC(Extende Unix Code), Unicode and
Mule internal code. With the MB patch you can use multi-byte character
sets in regexp and LIKE. The encoding system chosen is determined at
the compile time.

To enable the MB extension, you need to define a variable "MB" in
Makefile.global or in Makefile.custom. For further information please
take a look at README.mb under doc directory.

(Note that unlike "jp patch" I do not use modified GNU regexp any
more. I changed Henry Spencer's regexp coming with PostgreSQL.)
1998-03-15 07:39:04 +00:00
Bruce Momjian 24cab6bd0d Goodbye register keyword. Compiler knows better. 1998-02-11 19:14:04 +00:00
Bruce Momjian 319dbfa736 Another PGINDENT run that changes variable indenting and case label indenting. Also static variable indenting. 1997-09-08 02:41:22 +00:00
Bruce Momjian 1ccd423235 Massive commit to run PGINDENT on all *.c and *.h files. 1997-09-07 05:04:48 +00:00
Bruce Momjian 4b2b8592a0 Compile and warning cleanup 1996-11-08 06:02:30 +00:00
Marc G. Fournier 1a003fbcc2 Various patches from Bryan that *should* clean up the compile problems
ppl are seeing with v2.0
1996-09-20 08:34:39 +00:00
Marc G. Fournier ca405ae4bf Moved the include files to src/include/regex 1996-08-28 01:55:44 +00:00
Marc G. Fournier d31084e9d1 Postgres95 1.01 Distribution - Virgin Sources 1996-07-09 06:22:35 +00:00