Commit Graph

121 Commits

Author SHA1 Message Date
Neil Conway 4591fb1aa8 Code cleanup for the new regexp UDFs: we can hardcode the OID and some
properties of the "text" type, and then simplify the code accordingly.
Patch from Jeremy Drake.
2007-03-28 22:59:37 +00:00
Neil Conway 9eb78beeae Add three new regexp functions: regexp_matches, regexp_split_to_array,
and regexp_split_to_table. These functions provide access to the
capture groups resulting from a POSIX regular expression match,
and provide the ability to split a string on a POSIX regular
expression, respectively. Patch from Jeremy Drake; code review by
Neil Conway, additional comments and suggestions from Tom and
Peter E.

This patch bumps the catversion, adds some regression tests,
and updates the docs.
2007-03-20 05:45:00 +00:00
Tom Lane 234a02b2a8 Replace direct assignments to VARATT_SIZEP(x) with SET_VARSIZE(x, len).
Get rid of VARATT_SIZE and VARATT_DATA, which were simply redundant with
VARSIZE and VARDATA, and as a consequence almost no code was using the
longer names.  Rename the length fields of struct varlena and various
derived structures to catch anyplace that was accessing them directly;
and clean up various places so caught.  In itself this patch doesn't
change any behavior at all, but it is necessary infrastructure if we hope
to play any games with the representation of varlena headers.
Greg Stark and Tom Lane
2007-02-27 23:48:10 +00:00
Bruce Momjian 29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Tom Lane d6061d2f31 Fix regex_fixed_prefix() to cope reasonably well with regex patterns of the
form '^(foo)$'.  Before, these could never be optimized into indexscans.
The recent changes to make psql and pg_dump generate such patterns (for \d
commands and -t and related switches, respectively) therefore represented
a big performance hit for people with large pg_class catalogs, as seen in
recent gripe from Erik Jones.  While at it, be more paranoid about
case-sensitivity checking in multibyte encodings, and fix some other
corner cases in which a regex might be interpreted too liberally.
2007-01-03 22:39:26 +00:00
Bruce Momjian f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Bruce Momjian e0522505bd Remove 576 references of include files that were not needed. 2006-07-14 14:52:27 +00:00
Bruce Momjian 0ff3461bcc Alphabetically order reference to include files, "N" - "S". 2006-07-11 17:26:59 +00:00
Tom Lane cc39aca7d4 Fix similar_escape() so that SIMILAR TO works properly for patterns involving
alternatives ("|" symbol).  The original coding allowed the added ^ and $
constraints to be absorbed into the first and last alternatives, producing
a pattern that would match more than it should.  Per report from Eric Noriega.

I also changed the pattern to add an ARE director ("***:"), ensuring that
SIMILAR TO patterns do not change behavior if regex_flavor is changed.  This
is necessary to make the non-capturing parentheses work, and seems like a
good idea on general principles.

Back-patched as far as 7.4.  7.3 also has the bug, but a fix seems impractical
because that version's regex engine doesn't have non-capturing parens.
2006-04-13 18:01:31 +00:00
Bruce Momjian f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Bruce Momjian 436a2956d8 Re-run pgindent, fixing a problem where comment lines after a blank
comment line where output as too long, and update typedefs for /lib
directory.  Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).

Backpatch to 8.1.X.
2005-11-22 18:17:34 +00:00
Tom Lane 220f2a7d15 Code review for regexp_replace patch. Improve documentation and comments,
fix problems with replacement-string backslashes that aren't followed by
one of the expected characters, avoid giving the impression that
replace_text_regexp() is meant to be called directly as a SQL function,
etc.
2005-10-18 20:38:58 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane 8889685555 Suppress signed-vs-unsigned-char warnings. 2005-09-24 17:53:28 +00:00
Bruce Momjian 75a64eeb4b I made the patch that implements regexp_replace again.
The specification of this function is as follows.

regexp_replace(source text, pattern text, replacement text, [flags
text])
returns text

Replace string that matches to regular expression in source text to
replacement text.

 - pattern is regular expression pattern.
 - replacement is replace string that can use '\1'-'\9', and '\&'.
    '\1'-'\9': back reference to the n'th subexpression.
    '\&'     : entire matched string.
 - flags can use the following values:
    g: global (replace all)
    i: ignore case
    When the flags is not specified, case sensitive, replace the first
    instance only.

Atsushi Ogawa
2005-07-10 04:54:33 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Tom Lane 236404fcd1 Our interface code for Spencer's regexp package was checking for regexp
error conditions during regexp compile, but not during regexp execution;
any sort of "can't happen" errors would be treated as no-match instead
of being reported as they should be.  Noticed while trying to duplicate
a reported Tcl bug.
2004-11-24 22:44:07 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane 0bd61548ab Solve the 'Turkish problem' with undesirable locale behavior for case
conversion of basic ASCII letters.  Remove all uses of strcasecmp and
strncasecmp in favor of new functions pg_strcasecmp and pg_strncasecmp;
remove most but not all direct uses of toupper and tolower in favor of
pg_toupper and pg_tolower.  These functions use the same notions of
case folding already developed for identifier case conversion.  I left
the straight locale-based folding in place for situations where we are
just manipulating user data and not trying to match it to built-in
strings --- for example, the SQL upper() function is still locale
dependent.  Perhaps this will prove not to be what's wanted, but at
the moment we can initdb and pass regression tests in Turkish locale.
2004-05-07 00:24:59 +00:00
Tom Lane d3917186b2 pwd 2004-02-03 17:52:55 +00:00
Tom Lane 9bd681a522 Repair problem identified by Olivier Prenant: ALTER DATABASE SET search_path
should not be too eager to reject paths involving unknown schemas, since
it can't really tell whether the schemas exist in the target database.
(Also, when reading pg_dumpall output, it could be that the schemas
don't exist yet, but eventually will.)  ALTER USER SET has a similar issue.
So, reduce the normal ERROR to a NOTICE when checking search_path values
for these commands.  Supporting this requires changing the API for GUC
assign_hook functions, which causes the patch to touch a lot of places,
but the changes are conceptually trivial.
2004-01-19 19:04:40 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Bruce Momjian 46785776c4 Another pgindent run with updated typedefs. 2003-08-08 21:42:59 +00:00
Bruce Momjian f3c3deb7d0 Update copyrights to 2003. 2003-08-04 02:40:20 +00:00
Bruce Momjian 089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Tom Lane b6a1d25b0a Error message editing in utils/adt. Again thanks to Joe Conway for doing
the bulk of the heavy lifting ...
2003-07-27 04:53:12 +00:00
Tom Lane 77ede8900d Create a GUC variable REGEX_FLAVOR to control the type of regular
expression accepted by the regex operators, per discussion yesterday.

Along the way, reduce deadlock_timeout from PGC_POSTMASTER to PGC_SIGHUP
category.  It is probably best to insist that all backends share the same
setting, but that doesn't mean it has to be frozen at startup.
2003-02-06 20:25:33 +00:00
Tom Lane 7bcc6d98fb Replace regular expression package with Henry Spencer's latest version
(extracted from Tcl 8.4.1 release, as Henry still hasn't got round to
making it a separate library).  This solves a performance problem for
multibyte, as well as upgrading our regexp support to match recent Tcl
and nearly match recent Perl.
2003-02-05 17:41:33 +00:00
Tom Lane 9946b83ded Bring SIMILAR TO and SUBSTRING into some semblance of conformance with
the SQL99 standard.  (I'm not sure that the character-class features are
quite right, but that can be fixed later.)  Document SQL99 and POSIX
regexps as being different features; provide variants of SUBSTRING for
each.
2002-09-22 17:27:25 +00:00
Bruce Momjian e50f52a074 pgindent run. 2002-09-04 20:31:48 +00:00
Bruce Momjian d84fe82230 Update copyright to 2002. 2002-06-20 20:29:54 +00:00
Thomas G. Lockhart bad5fe9797 Search the existing regular expression cache as a ring buffer.
Will optimize the case for repeated calls for the same expression,
 which seems to be the most common case. Formerly, always searched
 from the first entry.
May want to look at the least-recently-used algorithm to make sure it
 is identifying the right slots to reclaim. Seems silly to do math when
 it seems that we could simply use an incrementing counter...
2002-06-15 02:49:47 +00:00
Thomas G. Lockhart ea01a451cc Implement SQL99 OVERLAY(). Allows substitution of a substring in a string.
Implement SQL99 SIMILAR TO as a synonym for our existing operator "~".
Implement SQL99 regular expression SUBSTRING(string FROM pat FOR escape).
 Extend the definition to make the FOR clause optional.
 Define textregexsubstr() to actually implement this feature.
Update the regression test to include these new string features.
 All tests pass.
Rename the regular expression support routines from "pg95_xxx" to "pg_xxx".
Define CREATE CHARACTER SET in the parser per SQL99. No implementation yet.
2002-06-11 15:44:38 +00:00
Bruce Momjian ea08e6cd55 New pgindent run with fixes suggested by Tom. Patch manually reviewed,
initdb/regression tests pass.
2001-11-05 17:46:40 +00:00
Bruce Momjian b81844b173 pgindent run on all C files. Java run to follow. initdb/regression
tests pass.
2001-10-25 05:50:21 +00:00
Bruce Momjian 9e1552607a pgindent run. Make it all clean. 2001-03-22 04:01:46 +00:00
Tom Lane 184fb14b90 Make regular-expression error messages a tad less obscure,
per gripe from Josh Berkus.
2001-03-19 22:27:46 +00:00
Bruce Momjian 623bf843d2 Change Copyright from PostgreSQL, Inc to PostgreSQL Global Development Group. 2001-01-24 19:43:33 +00:00
Tom Lane 65da0d66b4 Fix misuse of StrNCpy to copy and add null to non-null-terminated data.
Does not work since it fetches one byte beyond the source data, and when
the phase of the moon is wrong, the source data is smack up against the
end of backend memory and you get SIGSEGV.  Don't laugh, this is a fix
for an actual user bug report.
2000-07-07 21:12:53 +00:00
Tom Lane 8ecac94bb2 Functions on 'text' type updated to new fmgr style. 'text' is
now TOAST-able.
2000-07-06 05:48:31 +00:00
Tom Lane 40f64064ff Update textin() and textout() to new fmgr style. This is just phase
one of updating the whole text datatype, but there are so dang many
calls of these two routines that it seems worth a separate commit.
2000-07-05 23:12:09 +00:00
Bruce Momjian 5c25d60244 Add:
* Portions Copyright (c) 1996-2000, PostgreSQL, Inc

to all files copyright Regents of Berkeley.  Man, that's a lot of files.
2000-01-26 05:58:53 +00:00
Bruce Momjian 86ef36c907 New NameStr macro to convert Name to Str. No need for var.data anymore.
Fewer calls to nameout.

Better use of RelationGetRelationName.
1999-11-07 23:08:36 +00:00
Bruce Momjian 3406901a29 Move some system includes into c.h, and remove duplicates. 1999-07-17 20:18:55 +00:00
Bruce Momjian a71802e12e Final cleanup. 1999-07-16 05:00:38 +00:00
Bruce Momjian 9b645d481c Update #include cleanups 1999-07-16 03:14:30 +00:00
Bruce Momjian a9591ce66a Change #include's to use <> and "" as appropriate. 1999-07-15 23:04:24 +00:00
Bruce Momjian 4b2c2850bf Clean up #include in /include directory. Add scripts for checking includes. 1999-07-15 15:21:54 +00:00
Bruce Momjian 0cf1b79528 Cleanup of /include #include's, for 6.6 only. 1999-07-14 01:20:30 +00:00
Bruce Momjian 6724a50787 Change my-function-name-- to my_function_name, and optimizer renames. 1999-02-13 23:22:53 +00:00
Bruce Momjian 9322950aa4 Cleanup of source files where 'return' or 'var =' is alone on a line. 1999-02-03 21:18:02 +00:00
Bruce Momjian fa1a8d6a97 OK, folks, here is the pgindent output. 1998-09-01 04:40:42 +00:00
Bruce Momjian af74855a60 Renaming cleanup, no pgindent yet. 1998-09-01 03:29:17 +00:00
Bruce Momjian 6bd323c6b3 Remove un-needed braces around single statements. 1998-06-15 19:30:31 +00:00
Bruce Momjian 0d203b745d Re-apply Darren's char2-16 removal code. 1998-04-26 04:12:15 +00:00
Bruce Momjian db21523314 Back out char2-char16 removal. Add later. 1998-04-07 18:14:38 +00:00
Bruce Momjian 57b5966405 The following uuencoded, gzip'd file will ...
1. Remove the char2, char4, char8 and char16 types from postgresql
2. Change references of char16 to name in the regression tests.
3. Rename the char16.sql regression test to name.sql.  4. Modify
the regression test scripts and outputs to match up.

Might require new regression.{SYSTEM} files...

Darren King
1998-03-30 17:28:21 +00:00
Bruce Momjian a32450a585 pgindent run before 6.3 release, with Thomas' requested changes. 1998-02-26 04:46:47 +00:00
Bruce Momjian deea69b90e Change some ABORTS to ERROR. Add line number when COPY Failure. 1998-01-05 16:40:20 +00:00
Bruce Momjian 0d9fc5afd6 Change elog(WARN) to elog(ERROR) and elog(ABORT). 1998-01-05 03:35:55 +00:00
Bruce Momjian f3af1368bd Rename strNcpy to StrNCpy, and change third parameter. 1997-10-25 01:10:58 +00:00
Bruce Momjian 59f6a57e59 Used modified version of indent that understands over 100 typedefs. 1997-09-08 21:56:23 +00:00
Bruce Momjian 319dbfa736 Another PGINDENT run that changes variable indenting and case label indenting. Also static variable indenting. 1997-09-08 02:41:22 +00:00
Bruce Momjian 1ccd423235 Massive commit to run PGINDENT on all *.c and *.h files. 1997-09-07 05:04:48 +00:00
Bruce Momjian ea5b5357cd Remove more (void) and fix -Wall warnings. 1997-08-12 22:55:25 +00:00
Bruce Momjian edb58721b8 Fix pgproc names over 15 chars in output. Add strNcpy() function. remove some (void) casts that are unnecessary. 1997-08-12 20:16:25 +00:00
Bryan Henderson 0e5ab3655c Remove #include <regex.h> so it compiles on systems with GNU regex library. 1996-11-10 01:20:44 +00:00
Marc G. Fournier 0020e8790d Another directory that compiles with no errors, and few warnings 1996-11-06 10:32:10 +00:00
Marc G. Fournier ce4c0ce1de Some compile failure fixes from Keith Parks <emkxp01@mtcc.demon.co.uk> 1996-11-06 06:52:23 +00:00
Marc G. Fournier 950b6ab022 Fixes: Using LIKE or ~ operator on text type files which are null valued
causes segmentation fault.

Thanks to: Salvador Ortiz Garcia, Robert Patrick, Paul 'Shag' Walmsley,
           and James Cooper for finding and fixing the problem.
1996-07-09 06:39:19 +00:00
Marc G. Fournier d31084e9d1 Postgres95 1.01 Distribution - Virgin Sources 1996-07-09 06:22:35 +00:00