postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	c6aae3042b	Simplify and document regex library's compact-NFA representation. The previous coding abused the first element of a cNFA state's arcs list to hold a per-state flag bit, which was confusing, undocumented, and not even particularly efficient. Get rid of that in favor of a separate "stflags" vector. Since there's only one bit in use, I chose to allocate a char per state; we could possibly replace this with a bitmap at some point, but that would make accesses a little slower. It's already about 8X smaller than before, so let's not get overly tense. Also document the representation better than it was before, which is to say not at all. This patch is a byproduct of investigations towards extracting a "fixed prefix" string from the compact-NFA representation of regex patterns. Might need to back-patch it if we decide to back-patch that fix, but for now it's just code cleanup so I'll just put it in HEAD.	2012-07-07 17:39:50 -04:00
Bruce Momjian	927d61eeff	Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.	2012-06-10 15:20:04 -04:00
Tom Lane	2a4c46e0ba	Fix array overrun in regex code. zaptreesubs() was coded to unconditionally reset a capture subre's corresponding pmatch[] entry. However, in regexes without backrefs, that array is caller-supplied and might not have as many entries as the regex has capturing parens. So check the array length and do nothing if there is no corresponding entry, much as subset() does. Failure to check this resulted in a stack clobber in the case reported by Marko Kreen. This bug appears to have been latent in the regex library from the beginning. It was not exposed because find() called dissect() not cdissect(), and the dissect() code path didn't ever call zaptreesubs() (formerly zapmem()). When I unified dissect() and cdissect() in commit `4dd78bf37a`, the problem was exposed. Now that I've seen this, I'm rather suspicious that we might need to back-patch it; but will refrain for now, for lack of evidence that the case can be hit in the previous coding.	2012-05-24 13:56:16 -04:00
Tom Lane	4dd78bf37a	Merge dissect() into cdissect() to remove a pile of near-duplicate code. The "uncomplicated" case isn't materially less complicated than the full case, certainly not enough so to justify duplicating nearly 500 lines of code. The only extra work being done in the full path is zaptreesubs, which is very cheap compared to everything else being done here, and besides that I'm less than convinced that it's not needed in some cases even without backrefs.	2012-02-24 18:40:31 -05:00
Tom Lane	587359479a	Avoid repeated creation/freeing of per-subre DFAs during regex search. In nested sub-regex trees, lower-level nodes created DFAs and then destroyed them again before exiting, which is a bit dumb considering that the recursive search is likely to call those nodes again later. Instead cache each created DFA until the end of pg_regexec(). This is basically a space for time tradeoff, in that it might increase the maximum memory usage. However, in most regex patterns there are not all that many subre nodes, so not that many DFAs --- and in any case, the peak usage occurs when reaching the bottom recursion level, and except for alternation cases that's going to be the same anyway.	2012-02-24 18:40:30 -05:00
Tom Lane	3cbfe485e4	Remove useless "retry memory" logic within regex engine. Apparently some primordial version of Spencer's engine needed cdissect() and child functions to be able to continue matching from a previous position when re-called. That is dead code, though, since trivial inspection shows that cdissect can never be entered without having previously done zapmem which resets the relevant retry counter. I have also verified experimentally that no case in the Tcl regression tests reaches cdissect with a nonzero retry value. Accordingly, remove that logic. This doesn't really save any noticeable number of cycles in itself, but it is one step towards making dissect() and cdissect() equivalent, which will allow removing hundreds of lines of near-duplicated code. Since struct subre's "retry" field is no longer particularly related to any kind of retry, rename it to "id". As of this commit it's only used for identifying a subre node in debug printouts, so you might think we should get rid of the field entirely; but I have a plan for another use.	2012-02-24 18:40:28 -05:00
Tom Lane	173e29aa5d	Fix the general case of quantified regex back-references. Cases where a back-reference is part of a larger subexpression that is quantified have never worked in Spencer's regex engine, because he used a compile-time transformation that neglected the need to check the back-reference match in iterations before the last one. (That was okay for capturing parens, and we still do it if the regex has only capturing parens ... but it's not okay for backrefs.) To make this work properly, we have to add an "iteration" node type to the regex engine's vocabulary of sub-regex nodes. Since this is a moderately large change with a fair risk of introducing new bugs of its own, apply to HEAD only, even though it's a fix for a longstanding bug.	2012-02-24 01:41:03 -05:00
Tom Lane	5223f96d92	Fix regex back-references that are directly quantified with . The syntax "\n", that is a backref with a * quantifier directly applied to it, has never worked correctly in Spencer's library. This has been an open bug in the Tcl bug tracker since 2005: https://sourceforge.net/tracker/index.php?func=detail&aid=1115587&group_id=10894&atid=110894 The core of the problem is in parseqatom(), which first changes "\n" to "\n+\|" and then applies repeat() to the NFA representing the backref atom. repeat() thinks that any arc leading into its "rp" argument is part of the sub-NFA to be repeated. Unfortunately, since parseqatom() already created the arc that was intended to represent the empty bypass around "\n+", this arc gets moved too, so that it now leads into the state loop created by repeat(). Thus, what was supposed to be an "empty" bypass gets turned into something that represents zero or more repetitions of the NFA representing the backref atom. In the original example, in place of ^([bc])\1$ we now have something that acts like ^([bc])(\1+\|[bc])$ At runtime, the branch involving the actual backref fails, as it's supposed to, but then the other branch succeeds anyway. We could no doubt fix this by some rearrangement of the operations in parseqatom(), but that code is plenty ugly already, and what's more the whole business of converting "x" to "x+\|" probably needs to go away to fix another problem I'll mention in a moment. Instead, this patch suppresses the *-conversion when the target is a simple backref atom, leaving the case of m == 0 to be handled at runtime. This makes the patch in regcomp.c a one-liner, at the cost of having to tweak cbrdissect() a little. In the event I went a bit further than that and rewrote cbrdissect() to check all the string-length-related conditions before it starts comparing characters. It seems a bit stupid to possibly iterate through many copies of an n-character backreference, only to fail at the end because the target string's length isn't a multiple of n --- we could have found that out before starting. The existing coding could only be a win if integer division is hugely expensive compared to character comparison, but I don't know of any modern machine where that might be true. This does not fix all the problems with quantified back-references. In particular, the code is still broken for back-references that appear within a larger expression that is quantified (so that direct insertion of the quantification limits into the BACKREF node doesn't apply). I think fixing that will take some major surgery on the NFA code, specifically introducing an explicit iteration node type instead of trying to transform iteration into concatenation of modified regexps. Back-patch to all supported branches. In HEAD, also add a regression test case for this. (It may seem a bit silly to create a regression test file for just one test case; but I'm expecting that we will soon import a whole bunch of regex regression tests from Tcl, so might as well create the infrastructure now.)	2012-02-20 00:52:33 -05:00
Tom Lane	e00f68e49c	Add caching of ctype.h/wctype.h results in regc_locale.c. While this doesn't save a huge amount of runtime, it still seems worth doing, especially since I realized that the data copying I did in my first draft was quite unnecessary. In this version, once we have the results cached, getting them back for re-use is really very cheap. Also, remove the hard-wired limitation to not consider wctype.h results for character codes above 255. It turns out that we can't push the limit as far up as I'd originally hoped, because the regex colormap code is not efficient enough to cope very well with character classes containing many thousand letters, which a Unicode locale is entirely capable of producing. Still, we can push it up to U+7FF (which I chose as the limit of 2-byte UTF8 characters), which will at least make Eastern Europeans happy pending a better solution. Thus, this commit resolves the specific complaint in bug #6457, but not the more general issue that letters of non-western alphabets are mostly not recognized as matching [[:alpha:]].	2012-02-19 21:01:13 -05:00
Tom Lane	27af91438b	Create the beginnings of internals documentation for the regex code. Create src/backend/regex/README to hold an implementation overview of the regex package, and fill it in with some preliminary notes about the code's DFA/NFA processing and colormap management. Much more to do there of course. Also, improve some code comments around the colormap and cvec code. No functional changes except to add one missing assert.	2012-02-19 18:58:23 -05:00
Tom Lane	08fd6ff37f	Sync regex code with Tcl 8.5.11. Sync our regex code with upstream changes since last time we did this, which was Tcl 8.5.0 (see commit `df1e965e12`). There are no functional changes here; the main point is just to lay down a commit-log marker that somebody has looked at this recently, and to do what we can to keep the two codebases comparable.	2012-02-17 19:44:26 -05:00
Bruce Momjian	e126958c2e	Update copyright notices for year 2012.	2012-01-01 18:01:58 -05:00
Bruce Momjian	f8fc37b337	Add markers for skips.	2011-08-26 18:15:13 -04:00
Bruce Momjian	6560407c7d	Pgindent run before 9.1 beta2.	2011-06-09 14:32:50 -04:00
Tom Lane	7aa3f1d082	Insert dummy "break"s to silence compiler complaints. Apparently some compilers dislike a case label with nothing after it. Per buildfarm.	2011-04-10 18:44:07 -04:00
Tom Lane	1e16a8107d	Teach regular expression operators to honor collations. This involves getting the character classification and case-folding functions in the regex library to use the collations infrastructure. Most of this work had been done already in connection with the upper/lower and LIKE logic, so it was a simple matter of transposition. While at it, split out these functions into a separate source file regc_pg_locale.c, so that they can be correctly labeled with the Postgres project's license rather than the Scriptics license. These functions are 100% Postgres-written code whereas what remains in regc_locale.c is still mostly not ours, so lumping them both under the same copyright notice was getting more and more misleading.	2011-04-10 18:03:09 -04:00
Bruce Momjian	bf50caf105	pgindent run before PG 9.1 beta 1.	2011-04-10 11:42:00 -04:00
Tom Lane	bfd3f37be3	Fix comparisons of pointers with zero to compare with NULL instead. Per C standard, these are semantically the same thing; but saying NULL when you mean NULL is good for readability. Marti Raudsepp, per results of INRIA's Coccinelle.	2010-10-29 15:51:52 -04:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Tom Lane	e621037eec	Tweak a couple of macros in the regex code to suppress compiler warnings from "clang". The VERR changes make an assignment unconditional, which is probably easier to read/understand anyway, and one can hardly argue that it's worth shaving cycles off the case of reporting another error when one has already been detected. The INSIST change limits where that macro can be used, but not in a way that creates a problem for any existing call.	2010-08-02 02:29:39 +00:00
Bruce Momjian	65e806cba1	pgindent run for 9.0	2010-02-26 02:01:40 +00:00
Tom Lane	ee3a81f0a0	Change regexp engine's ccondissect/crevdissect routines to perform DFA matching before recursing instead of after. The DFA match eliminates unworkable midpoint choices a lot faster than the recursive check, in most cases, so doing it first can speed things up; particularly in pathological cases such as recently exhibited by Michael Glaesemann. In addition, apply some cosmetic changes that were applied upstream (in the Tcl project) at the same time, in order to sync with upstream version 1.15 of regexec.c. Upstream apparently intends to backpatch this, so I will too. The pathological behavior could be unpleasant if encountered in the field, which seems to justify any risk of introducing new bugs. Tom Lane, reviewed by Donal K. Fellows of Tcl project	2010-02-01 02:45:29 +00:00
Tom Lane	3e51ae491d	Fix some comments that got mangled by pgindent.	2010-01-30 04:18:00 +00:00
Tom Lane	0d32342501	Teach the regular expression functions to do case-insensitive matching and locale-dependent character classification properly when the database encoding is UTF8. The previous coding worked okay in single-byte encodings, or in any case for ASCII characters, but failed entirely on multibyte characters. The fix assumes that the <wctype.h> functions use Unicode code points as the wchar representation for Unicode, ie, wchar matches pg_wchar. This is only a partial solution, since we're still stupid about non-ASCII characters in multibyte encodings other than UTF8. The practical effect of that is limited, however, since those cases are generally Far Eastern glyphs for which concepts like case-folding don't apply anyway. Certainly all or nearly all of the field reports of problems have been about UTF8. A more general solution would require switching to the platform's wchar representation for all regex operations; which is possible but would have substantial disadvantages. Let's try this and see if it's sufficient in practice.	2009-12-01 21:00:24 +00:00
Bruce Momjian	d747140279	8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list provided by Andrew.	2009-06-11 14:49:15 +00:00
Peter Eisentraut	0474dcb608	Refactor backend makefiles to remove lots of duplicate code	2008-02-19 10:30:09 +00:00
Tom Lane	df1e965e12	Sync our regex code with upstream changes since last time we did this, which was Tcl 8.4.8. The main changes are to remove the never-fully-implemented code for multi-character collating elements, and to const-ify some stuff a bit more fully. In combination with the recent security patch, this commit brings us into line with Tcl 8.5.0. Note that I didn't make any effort to duplicate a lot of cosmetic changes that they made to bring their copy into line with their own style guidelines, such as adding braces around single-line IF bodies. Most of those we either had done already (such as ANSI-fication of function headers) or there is no point because pgindent would undo the change anyway.	2008-02-14 17:33:37 +00:00
Tom Lane	98f27aaef3	Fix assorted security-grade bugs in the regex engine. All of these problems are shared with Tcl, since it's their code to begin with, and the patches have been copied from Tcl 8.5.0. Problems: CVE-2007-4769: Inadequate check on the range of backref numbers allows crash due to out-of-bounds read. CVE-2007-4772: Infinite loop in regex optimizer for pattern '($\|^)*'. CVE-2007-6067: Very slow optimizer cleanup for regex with a large NFA representation, as well as crash if we encounter an out-of-memory condition during NFA construction. Part of the response to CVE-2007-6067 is to put a limit on the number of states in the NFA representation of a regex. This seems needed even though the within-the-code problems have been corrected, since otherwise the code could try to use very large amounts of memory for a suitably-crafted regex, leading to potential DOS by driving the system into swap, activating a kernel OOM killer, etc. Although there are certainly plenty of ways to drive the system into effective DOS with poorly-written SQL queries, these problems seem worth treating as security issues because many applications might accept regex search patterns from untrustworthy sources. Thanks to Will Drewry of Google for reporting these problems. Patches by Will Drewry and Tom Lane. Security: CVE-2007-4769, CVE-2007-4772, CVE-2007-6067	2008-01-03 20:47:55 +00:00
Bruce Momjian	fdf5a5efb7	pgindent run for 8.3.	2007-11-15 21:14:46 +00:00
Tom Lane	f1c87830b5	Add a useless return statement to suppress a warning seen with some versions of gcc (I'm seeing it with Apple's gcc 4.0.1). I think the reason we did not see this before was that the assert() macros in the regex code were all no-ops till recently.	2007-10-22 01:02:22 +00:00
Tom Lane	298c457520	Make dumpcolors() have tolerable performance when using 32-bit chr, as we do (and upstream Tcl doesn't). The loop limit might be subject to negotiation if anyone ever tries to do regex debugging in Far Eastern languages, but for now 1000 seems plenty. CHR_MAX was right out :-(	2007-10-06 16:18:09 +00:00
Tom Lane	06ce02f989	Adjust some regex debugging printouts to not give wrong-format-width warnings on a 64-bit machine. Noted while chasing a recent regex bug report.	2007-10-06 16:05:54 +00:00
Bruce Momjian	8b4ff8b6a1	Wording cleanup for error messages. Also change can't -> cannot. Standard English uses "may", "can", and "might" in different ways: may - permission, "You may borrow my rake." can - ability, "I can lift that log." might - possibility, "It might rain today." Unfortunately, in conversational English, their use is often mixed, as in, "You may use this variable to do X", when in fact, "can" is a better choice. Similarly, "It may crash" is better stated, "It might crash".	2007-02-01 19:10:30 +00:00
Bruce Momjian	436a2956d8	Re-run pgindent, fixing a problem where comment lines after a blank comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.	2005-11-22 18:17:34 +00:00
Bruce Momjian	1dc3498251	Standard pgindent run for 8.1.	2005-10-15 02:49:52 +00:00
Tom Lane	303e089df5	Clean up possibly-uninitialized-variable warnings reported by gcc 4.x.	2005-09-24 22:54:44 +00:00
Bruce Momjian	75a64eeb4b	I made the patch that implements regexp_replace again. The specification of this function is as follows. regexp_replace(source text, pattern text, replacement text, [flags text]) returns text Replace string that matches to regular expression in source text to replacement text. - pattern is regular expression pattern. - replacement is replace string that can use '\1'-'\9', and '\&'. '\1'-'\9': back reference to the n'th subexpression. '\&' : entire matched string. - flags can use the following values: g: global (replace all) i: ignore case When the flags is not specified, case sensitive, replace the first instance only. Atsushi Ogawa	2005-07-10 04:54:33 +00:00
Bruce Momjian	b492c3accc	Add parentheses to macros when args are used in computations. Without them, the executation behavior could be unexpected.	2005-05-25 21:40:43 +00:00
Tom Lane	d4c4d28427	Install Tcl regex fixes to sync our regex engine with Tcl 8.4.8 (up from 8.4.1). This corrects some curious regex bugs, though not the greediness issue I was hoping to find a solution for :-(	2004-11-24 22:56:54 +00:00
Tom Lane	0bd61548ab	Solve the 'Turkish problem' with undesirable locale behavior for case conversion of basic ASCII letters. Remove all uses of strcasecmp and strncasecmp in favor of new functions pg_strcasecmp and pg_strncasecmp; remove most but not all direct uses of toupper and tolower in favor of pg_toupper and pg_tolower. These functions use the same notions of case folding already developed for identifier case conversion. I left the straight locale-based folding in place for situations where we are just manipulating user data and not trying to match it to built-in strings --- for example, the SQL upper() function is still locale dependent. Perhaps this will prove not to be what's wanted, but at the moment we can initdb and pass regression tests in Turkish locale.	2004-05-07 00:24:59 +00:00
PostgreSQL Daemon	969685ad44	$Header: -> $PostgreSQL Changes ...	2003-11-29 19:52:15 +00:00
Tom Lane	5594aa6a6e	Fix broken definition of :print: character class, per Bruno Wolff. Also, make :alnum: character class directly dependent on isalnum() rather than guessing.	2003-09-29 00:21:58 +00:00
Bruce Momjian	46785776c4	Another pgindent run with updated typedefs.	2003-08-08 21:42:59 +00:00
Bruce Momjian	089003fb46	pgindent run.	2003-08-04 00:43:34 +00:00
Tom Lane	7bcc6d98fb	Replace regular expression package with Henry Spencer's latest version (extracted from Tcl 8.4.1 release, as Henry still hasn't got round to making it a separate library). This solves a performance problem for multibyte, as well as upgrading our regexp support to match recent Tcl and nearly match recent Perl.	2003-02-05 17:41:33 +00:00
Bruce Momjian	bea4792125	This patch removes a bunch of superfluous #include directives: if postgres.h or c.h includes a system header (such as stdio.h or stdlib.h), there's no need to specifically include it in any of the .c files in the backend. Neil Conway	2002-11-08 20:23:57 +00:00
Bruce Momjian	a2ba9a76b8	Remove retest Makefile entry because it does not compile.	2002-09-16 16:02:43 +00:00
Bruce Momjian	e50f52a074	pgindent run.	2002-09-04 20:31:48 +00:00
Peter Eisentraut	77f7763b55	Remove all traces of multibyte and locale options. Clean up comments referring to "multibyte" where it really means character encoding.	2002-09-03 21:45:44 +00:00
Bruce Momjian	97ac103289	Remove sys/types.h in files that include postgres.h, and hence c.h, because c.h has sys/types.h.	2002-09-02 02:47:07 +00:00
Tatsuo Ishii	ed7baeaf4d	Remove #ifdef MULTIBYTE per hackers list discussion.	2002-08-29 07:22:30 +00:00
Thomas G. Lockhart	ea01a451cc	Implement SQL99 OVERLAY(). Allows substitution of a substring in a string. Implement SQL99 SIMILAR TO as a synonym for our existing operator "~". Implement SQL99 regular expression SUBSTRING(string FROM pat FOR escape). Extend the definition to make the FOR clause optional. Define textregexsubstr() to actually implement this feature. Update the regression test to include these new string features. All tests pass. Rename the regular expression support routines from "pg95_xxx" to "pg_xxx". Define CREATE CHARACTER SET in the parser per SQL99. No implementation yet.	2002-06-11 15:44:38 +00:00
Tom Lane	3e48c66136	Fix code to work when isalpha and friends are macros, not functions.	2002-05-05 00:50:31 +00:00
Bruce Momjian	f71c924f10	[ Patch comments in three pieces.] Attached is a pacth against 7.2 which adds locale awareness to the character classes of the regular expression engine. ... > > I still think the xdigit class could be handled the same way the digit > > class is (by enumeration rather than using the isxdigit function). That > > saves you a cicle, and I don't think there's any loss. > > In fact, I will email you when I apply the original patch. I miss that case :-(. Here is the pached patch. ... Here is a patch which addresses Tatsuo's concerns (it does return an static struct instead of constructing it).	2002-04-24 01:51:11 +00:00
Bruce Momjian	ea08e6cd55	New pgindent run with fixes suggested by Tom. Patch manually reviewed, initdb/regression tests pass.	2001-11-05 17:46:40 +00:00
Bruce Momjian	6783b2372e	Another pgindent run. Fixes enum indenting, and improves #endif spacing. Also adds space for one-line comments.	2001-10-28 06:26:15 +00:00
Bruce Momjian	b81844b173	pgindent run on all C files. Java run to follow. initdb/regression tests pass.	2001-10-25 05:50:21 +00:00
Bruce Momjian	fde8edaf53	Add do { ... } while (0) to more bad macros.	2001-10-25 01:29:37 +00:00
Tatsuo Ishii	f03b7ba0de	Add dependency for regexec.c	2001-10-04 04:16:16 +00:00
Bruce Momjian	9e1552607a	pgindent run. Make it all clean.	2001-03-22 04:01:46 +00:00
Tom Lane	184fb14b90	Make regular-expression error messages a tad less obscure, per gripe from Josh Berkus.	2001-03-19 22:27:46 +00:00
Tom Lane	f7a839bc2b	Clean up portability problems in regexp package: change all routine definitions from K&R to ANSI C style, and fix broken assumption that int and long are the same datatype. This repairs problems observed on Alpha with regexps having between 32 and 63 states.	2001-02-13 00:02:36 +00:00
Tom Lane	a27b691e29	Ensure that all uses of <ctype.h> functions are applied to unsigned-char values, whether the local char type is signed or not. This is necessary for portability. Per discussion on pghackers around 9/16/00.	2000-12-03 20:45:40 +00:00
Peter Eisentraut	e5ba2fc5b5	Make all commands that link a program look like $(CC) $(CFLAGS) $(LDFLAGS) <object files> <extra-libraries> $(LIBS) -o $@ This form seemed to be the most portable, readable, and logical, but in any case it's better than having a dozen different ones in the tree.	2000-11-30 20:36:13 +00:00
Peter Eisentraut	805e431a38	Add support for VPATH builds, that is, building somewhere else than in the source directory. This involves mostly makefiles using $(srcdir) when they might have used ".". (Regression tests don't work with this, yet.) Sort out usage of CPPFLAGS, CFLAGS (and CXXFLAGS). Add "override" keyword in most places, to preserve necessary flags even when the user overrode the flags.	2000-10-20 21:04:27 +00:00
Peter Eisentraut	424f0edcb8	Fix relative path references so that make knowns which dependencies refer to one another. Sort out builddir vs srcdir variable namings. Remove some now obsoleted make variables.	2000-08-31 16:12:35 +00:00
Tom Lane	091126fa28	Generated header files parse.h and fmgroids.h are now copied into the src/include tree, so that -I backend is no longer necessary anywhere. Also, clean up some bit rot in contrib tree.	2000-05-29 05:45:56 +00:00
Peter Eisentraut	533d516629	Removed MBFLAGS from makefiles since it's now done in include/config.h.	2000-01-19 02:59:03 +00:00
Bruce Momjian	a82f9ffde6	New LDOUT makefile variable for QNX os.	1999-12-13 22:35:27 +00:00
Bruce Momjian	3ffd3d82db	Make LD -r as macros that can be changed for QNX.	1999-12-09 19:15:45 +00:00
Bruce Momjian	3406901a29	Move some system includes into c.h, and remove duplicates.	1999-07-17 20:18:55 +00:00
Bruce Momjian	a9591ce66a	Change #include's to use <> and "" as appropriate.	1999-07-15 23:04:24 +00:00
Bruce Momjian	07842084fe	pgindent run over code.	1999-05-25 16:15:34 +00:00
Tatsuo Ishii	08bcc77a5c	add retest, a regex testing program	1999-05-21 06:27:54 +00:00
Tatsuo Ishii	235a569aaa	Fix multi-byte+locale problem	1999-03-25 04:46:53 +00:00
Bruce Momjian	fa1a8d6a97	OK, folks, here is the pgindent output.	1998-09-01 04:40:42 +00:00
Bruce Momjian	af74855a60	Renaming cleanup, no pgindent yet.	1998-09-01 03:29:17 +00:00
Marc G. Fournier	5979d73841	From: t-ishii@sra.co.jp As Bruce mentioned, this is due to the conflict among changes we made. Included patches should fix the problem(I changed all MB to MULTIBYTE). Please let me know if you have further problem. P.S. I did not include pathces to configure and gram.c to save the file size(configure.in and gram.y modified).	1998-07-26 04:31:41 +00:00
Marc G. Fournier	bf00bbb0c4	I really hope that I haven't missed anything in this one... From: t-ishii@sra.co.jp Attached are patches to enhance the multi-byte support. (patches are against 7/18 snapshot) * determine encoding at initdb/createdb rather than compile time Now initdb/createdb has an option to specify the encoding. Also, I modified the syntax of CREATE DATABASE to accept encoding option. See README.mb for more details. For this purpose I have added new column "encoding" to pg_database. Also pg_attribute and pg_class are changed to catch up the modification to pg_database. Actually I haved added pg_database_mb.h, pg_attribute_mb.h and pg_class_mb.h. These are used only when MB is enabled. The reason having separate files is I couldn't find a way to use ifdef or whatever in those files. I have to admit it looks ugly. No way. * support for PGCLIENTENCODING when issuing COPY command commands/copy.c modified. * support for SQL92 syntax "SET NAMES" See gram.y. * support for LATIN2-5 * add UNICODE regression test case * new test suite for MB New directory test/mb added. * clean up source files Basic idea is to have MB's own subdirectory for easier maintenance. These are include/mb and backend/utils/mb.	1998-07-24 03:32:46 +00:00
Bruce Momjian	7b2b779a2a	Add auto-size to screen to \d? commands. Use UNION to show all \d? results in one query. Add \d? field search feature. Rename MB to MULTIBYTE.	1998-07-18 18:34:34 +00:00
Bruce Momjian	cb7cbc16fa	Hi, here are the patches to enhance existing MB handling. This time I have implemented a framework of encoding translation between the backend and the frontend. Also I have added a new variable setting command: SET CLIENT_ENCODING TO 'encoding'; Other features include: Latin1 support more 8 bit cleaness See doc/README.mb for more details. Note that the pacthes are against May 30 snapshot. Tatsuo Ishii	1998-06-16 07:29:54 +00:00
Bruce Momjian	6bd323c6b3	Remove un-needed braces around single statements.	1998-06-15 19:30:31 +00:00
Marc G. Fournier	f554af0a9f	From: t-ishii@sra.co.jp Hi, here are patches I promised (against 6.3.2): * character_length(), position(), substring() are now aware of multi-byte characters * add octet_length() * add --with-mb option to configure * new regression tests for EUC_KR (contributed by "Soonmyung. Hong" <hong@lunaris.hanmesoft.co.kr>) * add some test cases to the EUC_JP regression test * fix problem in regress/regress.sh in case of System V * fix toupper(), tolower() to handle 8bit chars note that: o patches for both configure.in and configure are included. maybe the one for configure is not necessary. o pg_proc.h was modified to add octet_length(). I used OIDs (1374-1379) for that. Please let me know if these numbers are not appropriate.	1998-04-27 17:10:50 +00:00
Bruce Momjian	645146260e	Cleanup of compiler warnings.	1998-04-06 02:38:26 +00:00
Bruce Momjian	1e801a8f16	Hi, Attached you'll find a (big) patch that fixes make dep and make depend in all Makefiles where I found it to be appropriate. It also removes the dependency in Makefile.global for NAMEDATALEN and OIDNAMELEN by making backend/catalog/genbki.sh and bin/initdb/initdb.sh a little smarter. This no longer requires initdb.sh that is turned into initdb with a sed script when installing Postgres, hence initdb.sh should be renamed to initdb (after the patch has been applied :-) ) This patch is against the 6.3 sources, as it took a while to complete. Please review and apply, Cheers, Jeroen van Vianen	1998-04-06 00:32:26 +00:00
Marc G. Fournier	661ecf3c48	From: t-ishii@sra.co.jp Included are patches intended for allowing PostgreSQL to handle multi-byte charachter sets such as EUC(Extende Unix Code), Unicode and Mule internal code. With the MB patch you can use multi-byte character sets in regexp and LIKE. The encoding system chosen is determined at the compile time. To enable the MB extension, you need to define a variable "MB" in Makefile.global or in Makefile.custom. For further information please take a look at README.mb under doc directory. (Note that unlike "jp patch" I do not use modified GNU regexp any more. I changed Henry Spencer's regexp coming with PostgreSQL.)	1998-03-15 07:39:04 +00:00
Bruce Momjian	a32450a585	pgindent run before 6.3 release, with Thomas' requested changes.	1998-02-26 04:46:47 +00:00
Bruce Momjian	24cab6bd0d	Goodbye register keyword. Compiler knows better.	1998-02-11 19:14:04 +00:00
Marc G. Fournier	6e337eef45	Major cleanout of PORTNAME variables from Makefiles...bound to screw up some of the ports...	1997-12-20 00:29:35 +00:00
Marc G. Fournier	5379b84eff	More cleanups. I can now compile without PORTNAME being defined n Makefile.global. End result, if all goes well, should allow for much easier porting, since there will no longer be a concept of a "port". Most, if not everything, should be determined by configure, or by the compiler itself. Still work to be done though :)	1997-12-19 02:09:10 +00:00
Bruce Momjian	59f6a57e59	Used modified version of indent that understands over 100 typedefs.	1997-09-08 21:56:23 +00:00
Bruce Momjian	319dbfa736	Another PGINDENT run that changes variable indenting and case label indenting. Also static variable indenting.	1997-09-08 02:41:22 +00:00
Bruce Momjian	1ccd423235	Massive commit to run PGINDENT on all .c and .h files.	1997-09-07 05:04:48 +00:00
Bruce Momjian	ea5b5357cd	Remove more (void) and fix -Wall warnings.	1997-08-12 22:55:25 +00:00
Bruce Momjian	edb58721b8	Fix pgproc names over 15 chars in output. Add strNcpy() function. remove some (void) casts that are unnecessary.	1997-08-12 20:16:25 +00:00
Bryan Henderson	fa9c0fff36	Remove __P macro usage so it compiles without cdefs.h.	1996-12-15 09:21:37 +00:00
Vadim B. Mikheev	14ad435294	const register ... --> register const ...	1996-12-14 06:08:14 +00:00
Bruce Momjian	a0990e1884	Makefile cleanup after reorganization	1996-11-09 06:24:51 +00:00
Bruce Momjian	4b2b8592a0	Compile and warning cleanup	1996-11-08 06:02:30 +00:00
Bryan Henderson	b0d6f0aa63	Simplify make files, add full dependencies.	1996-10-27 09:55:05 +00:00

1 2 3 4

153 Commits