postgresql

Commit Graph

Author	SHA1	Message	Date
Peter Eisentraut	abb9c63b2c	Unbreak index optimization for LIKE on bytea The same code is used to handle both text and bytea, but bytea is not collation-aware, so we shouldn't call get_collation_isdeterministic() in that case, since that will error out with an invalid collation. Reported-by: Jeevan Chalke <jeevan.chalke@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/CAM2%2B6%3DWaf3qJ1%3DyVTUH8_yG-SC0xcBMY%2BSFLhvKKNnWNXSUDBw%40mail.gmail.com	2019-04-15 09:29:17 +02:00
Michael Paquier	92c76021ae	Improve readability of some tests in strings.sql `c251336` has added some tests to check if a toast relation should be empty or not, hardcoding the toast relation name when calling pg_relation_size(). pg_class.reltoastrelid offers the same information, so simplify the tests to use that. Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/20190403065949.GH3298@paquier.xyz	2019-04-04 10:24:56 +09:00
Andrew Gierth	b7f6bcbffc	Repair bug in regexp split performance improvements. Commit `c8ea87e4b` introduced a temporary conversion buffer for substrings extracted during regexp splits. Unfortunately the code that sized it was failing to ignore the effects of ignored degenerate regexp matches, so for regexp_split_* calls it could under-size the buffer in such cases. Fix, and add some regression test cases (though those will only catch the bug if run in a multibyte encoding). Backpatch to 9.3 as the faulty code was. Thanks to the PostGIS project, Regina Obe and Paul Ramsey for the report (via IRC) and assistance in analysis. Patch by me.	2018-09-12 19:31:06 +01:00
Peter Eisentraut	10cfce34c0	Add user-callable SHA-2 functions Add the user-callable functions sha224, sha256, sha384, sha512. We already had these in the C code to support SCRAM, but there was no test coverage outside of the SCRAM tests. Adding these as user-callable functions allows writing some tests. Also, we have a user-callable md5 function but no more modern alternative, which led to wide use of md5 as a general-purpose hash function, which leads to occasional complaints about using md5. Also mark the existing md5 functions as leak-proof. Reviewed-by: Michael Paquier <michael@paquier.xyz>	2018-02-22 11:34:53 -05:00
Simon Riggs	56f3468622	Reduce test variability for toast_tuple_target test	2017-11-20 12:09:40 +11:00
Simon Riggs	c2513365a0	Parameter toast_tuple_target controls TOAST for new rows Specifies the point at which we try to move long column values into TOAST tables. No effect on existing rows. Discussion: https://postgr.es/m/CANP8+jKsVmw6CX6YP9z7zqkTzcKV1+Uzr3XjKcZW=2Ya00OyQQ@mail.gmail.com Author: Simon Riggs <simon@2ndQudrant.com> Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndQuadrant.com>	2017-11-20 09:50:10 +11:00
Tom Lane	cf9b0fea5f	Implement regexp_match(), a simplified alternative to regexp_matches(). regexp_match() is like regexp_matches(), but it disallows the 'g' flag and in consequence does not need to return a set. Instead, it returns a simple text array value, or NULL if there's no match. Previously people usually got that behavior with a sub-select, but this way is considerably more efficient. Documentation adjusted so that regexp_match() is presented first and then regexp_matches() is introduced as a more complicated version. This is a bit historically revisionist but seems pedagogically better. Still TODO: extend contrib/citext to support this function. Emre Hasegeli, reviewed by David Johnston Discussion: <CAE2gYzy42sna2ME_e3y1KLQ-4UBrB-eVF0SWn8QG39sQSeVhEw@mail.gmail.com>	2016-08-17 18:33:01 -04:00
Tom Lane	d074b4e50d	Fix regexp_matches() handling of zero-length matches. We'd find the same match twice if it was of zero length and not immediately adjacent to the previous match. replace_text_regexp() got similar cases right, so adjust this search logic to match that. Note that even though the regexp_split_to_xxx() functions share this code, they did not display equivalent misbehavior, because the second match would be considered degenerate and ignored. Jeevan Chalke, with some cosmetic changes by me.	2013-07-31 11:31:22 -04:00
Tom Lane	e7bfc7e42c	Fix some uses of "the quick brown fox". If we're going to quote a well-known pangram, we should quote it accurately. Per gripe from Thom Brown.	2013-05-16 12:30:41 -04:00
Peter Eisentraut	cc26ea9fe2	Clean up references to SQL92 In most cases, these were just references to the SQL standard in general. In a few cases, a contrast was made between SQL92 and later standards -- those have been kept unchanged.	2013-04-20 11:04:41 -04:00
Tom Lane	dbde97cdde	Rewrite LIKE's %-followed-by-_ optimization so it really works (this time for sure ;-)). It now also optimizes more cases, such as %_%_. Improve comments too. Per bug #5478. In passing, also rename the TCHAR macro to GETCHAR, because pgindent is messing with the formatting of the former (apparently it now thinks TCHAR is a typedef name). Back-patch to 8.3, where the bug was introduced.	2010-05-28 17:35:23 +00:00
Tom Lane	9507c8a1db	Add get_bit/set_bit functions for bit strings, paralleling those for bytea, and implement OVERLAY() for bit strings and bytea. In passing also convert text OVERLAY() to a true built-in, instead of relying on a SQL function. Leonardo F, reviewed by Kevin Grittner	2010-01-25 20:55:32 +00:00
Tom Lane	a2a8c7a662	Support hex-string input and output for type BYTEA. Both hex format and the traditional "escape" format are automatically handled on input. The output format is selected by the new GUC variable bytea_output. As committed, bytea_output defaults to HEX, which is an incompatible change. We will keep it this way for awhile for testing purposes, but should consider whether to switch to the more backwards-compatible default of ESCAPE before 8.5 is released. Peter Eisentraut	2009-08-04 16:08:37 +00:00
Tom Lane	fc2660fc25	Fix LIKE's special-case code for % followed by _. I'm not entirely sure that this case is worth a special code path, but a special code path that gets the boundary condition wrong is definitely no good. Per bug #4821 from Andrew Gierth. In passing, clean up some minor code formatting issues (excess parentheses and blank lines in odd places). Back-patch to 8.3, where the bug was introduced.	2009-05-24 18:10:38 +00:00
Tom Lane	1bbbcb04f0	Make new complaint about unsafe Unicode literals include an error location. Every other ereport in scan.l has one, this should too.	2009-05-05 21:09:23 +00:00
Peter Eisentraut	40bc4c2605	Disable the use of Unicode escapes in string constants (U&'') when standard_conforming_strings is not on, for security reasons.	2009-05-05 18:32:17 +00:00
Peter Eisentraut	06735e3256	Unicode escapes in strings and identifiers	2008-10-29 08:04:54 +00:00
Peter Eisentraut	607b2be7bb	Additional string function tests for coverage of oracle_compat.c	2008-10-04 13:55:45 +00:00
Tom Lane	1b70619311	Code review for regexp_matches/regexp_split patch. Refactor to avoid assuming that cached compiled patterns will still be there when the function is next called. Clean up looping logic, thereby fixing bug identified by Pavel Stehule. Share setup code between the two functions, add some comments, and avoid risky mixing of int and size_t variables. Clean up the documentation a tad, and accept all the flag characters mentioned in table 9-19 rather than just a subset.	2007-08-11 03:56:24 +00:00
Tom Lane	31edbadf4a	Downgrade implicit casts to text to be assignment-only, except for the ones from the other string-category types; this eliminates a lot of surprising interpretations that the parser could formerly make when there was no directly applicable operator. Create a general mechanism that supports casts to and from the standard string types (text,varchar,bpchar) for every datatype, by invoking the datatype's I/O functions. These new casts are assignment-only in the to-string direction, explicit-only in the other, and therefore should create no surprising behavior. Remove a bunch of thereby-obsoleted datatype-specific casting functions. The "general mechanism" is a new expression node type CoerceViaIO that can actually convert between any two datatypes if their external text representations are compatible. This is more general than needed for the immediate feature, but might be useful in plpgsql or other places in future. This commit does nothing about the issue that applying the concatenation operator || to non-text types will now fail, often with strange error messages due to misinterpreting the operator as array concatenation. Since it often (not always) worked before, we should either make it succeed or at least give a more user-friendly error; but details are still under debate. Peter Eisentraut and Tom Lane	2007-06-05 21:31:09 +00:00
Tom Lane	3e23b68dac	Support varlena fields with single-byte headers and unaligned storage. This commit breaks any code that assumes that the mere act of forming a tuple (without writing it to disk) does not "toast" any fields. While all available regression tests pass, I'm not totally sure that we've fixed every nook and cranny, especially in contrib. Greg Stark with some help from Tom Lane	2007-04-06 04:21:44 +00:00
Neil Conway	9eb78beeae	Add three new regexp functions: regexp_matches, regexp_split_to_array, and regexp_split_to_table. These functions provide access to the capture groups resulting from a POSIX regular expression match, and provide the ability to split a string on a POSIX regular expression, respectively. Patch from Jeremy Drake; code review by Neil Conway, additional comments and suggestions from Tom and Peter E. This patch bumps the catversion, adds some regression tests, and updates the docs.	2007-03-20 05:45:00 +00:00
Tom Lane	637028afe1	Code review for standard_conforming_strings patch. Fix it so it does not throw warnings for 100%-SQL-standard constructs, clean up some minor infelicities, try to un-break ecpg to the best of my ability. (It's not clear how ecpg is going to find out the setting of standard_conforming_strings, though.) I think pg_dump still needs work, too.	2006-05-11 19:15:36 +00:00
Tom Lane	20ab467d76	Improve parser so that we can show an error cursor position for errors during parse analysis, not only errors detected in the flex/bison stages. This is per my earlier proposal. This commit includes all the basic infrastructure, but locations are only tracked and reported for errors involving column references, function calls, and operators. More could be done later but this seems like a good set to start with. I've also moved the ReportSyntaxErrorPosition logic out of psql and into libpq, which should make it available to more people --- even within psql this is an improvement because warnings weren't handled by ReportSyntaxErrorPosition.	2006-03-14 22:48:25 +00:00
Bruce Momjian	19c21d115d	Enable standard_conforming_strings to be turned on. Kevin Grittner	2006-03-06 19:49:20 +00:00
Bruce Momjian	75a64eeb4b	I made the patch that implements regexp_replace again. The specification of this function is as follows. regexp_replace(source text, pattern text, replacement text, [flags text]) returns text Replace string that matches to regular expression in source text to replacement text. - pattern is regular expression pattern. - replacement is replace string that can use '\1'-'\9', and '\&'. '\1'-'\9': back reference to the n'th subexpression. '\&' : entire matched string. - flags can use the following values: g: global (replace all) i: ignore case When the flags is not specified, case sensitive, replace the first instance only. Atsushi Ogawa	2005-07-10 04:54:33 +00:00
Neil Conway	f3567eeaf2	Implement md5(bytea), update regression tests and documentation. Patch from Abhijit Menon-Sen, minor editorialization from Neil Conway. Also, improve md5(text) to allocate a constant-sized buffer on the stack rather than via palloc. Catalog version bumped.	2005-05-20 01:29:56 +00:00
Tom Lane	7665d1bc16	Teach psql to show the location of syntax errors visually, per recent discussions. Patch by Fabien Coelho and Tom Lane. Still needs to be taught about multi-screen-column kanji characters; Tatsuo has promised to provide the needed infrastructure for that.	2004-03-14 04:25:18 +00:00
Tom Lane	b6a1d25b0a	Error message editing in utils/adt. Again thanks to Joe Conway for doing the bulk of the heavy lifting ...	2003-07-27 04:53:12 +00:00
Tom Lane	9fbd52808e	Adopt latest bison's spelling of 'syntax error' rather than 'parse error' for grammar-detected problems. Revert Makefile hack that kept it looking like the pre-bison-1.875 output.	2003-05-29 20:40:36 +00:00
Tom Lane	f45df8c014	Cause CHAR(n) to TEXT or VARCHAR conversion to automatically strip trailing blanks, in hopes of reducing the surprise factor for newbies. Remove redundant operators for VARCHAR (it depends wholly on TEXT operations now). Clean up resolution of ambiguous operators/functions to avoid surprising choices for domains: domains are treated as equivalent to their base types and binary-coercibility is no longer considered a preference item when choosing among multiple operators/functions. IsBinaryCoercible now correctly reflects the notion that you need only relabel the type to get from type A to type B: that is, a domain is binary-coercible to its base type, but not vice versa. Various marginal cleanup, including merging the essentially duplicate resolution code in parse_func.c and parse_oper.c. Improve opr_sanity regression test to understand about binary compatibility (using pg_cast), and fix a couple of small errors in the catalogs revealed thereby. Restructure "special operator" handling to fetch operators via index opclasses rather than hardwiring assumptions about names (cleans up the pattern_ops stuff a little).	2003-05-26 00:11:29 +00:00
Bruce Momjian	e87e82d2b7	Attached are two small patches to expose md5 as a user function -- including documentation and regression test mods. It seemed small and unobtrusive enough to not require a specific proposal on the hackers list -- but if not, let me know and I'll make a pitch. Otherwise, if there are no objections please apply. Joe Conway	2002-12-06 05:20:28 +00:00
Tom Lane	9946b83ded	Bring SIMILAR TO and SUBSTRING into some semblance of conformance with the SQL99 standard. (I'm not sure that the character-class features are quite right, but that can be fixed later.) Document SQL99 and POSIX regexps as being different features; provide variants of SUBSTRING for each.	2002-09-22 17:27:25 +00:00
Tom Lane	b26dfb9522	Extend pg_cast castimplicit column to a three-way value; this allows us to be flexible about assignment casts without introducing ambiguity in operator/function resolution. Introduce a well-defined promotion hierarchy for numeric datatypes (int2->int4->int8->numeric->float4->float8). Change make_const to initially label numeric literals as int4, int8, or numeric (never float8 anymore). Explicitly mark Func and RelabelType nodes to indicate whether they came from a function call, explicit cast, or implicit cast; use this to do reverse-listing more accurately and without so many heuristics. Explicit casts to char, varchar, bit, varbit will truncate or pad without raising an error (the pre-7.2 behavior), while assigning to a column without any explicit cast will still raise an error for wrong-length data like 7.3. This more nearly follows the SQL spec than 7.2 behavior (we should be reporting a 'completion condition' in the explicit-cast cases, but we have no mechanism for that, so just do silent truncation). Fix some problems with enforcement of typmod for array elements; it didn't work at all in 'UPDATE ... SET array[n] = foo', for example. Provide a generalized array_length_coerce() function to replace the specialized per-array-type functions that used to be needed (and were missing for NUMERIC as well as all the datetime types). Add missing conversions int8<->float4, text<->numeric, oid<->int8. initdb forced.	2002-09-18 21:35:25 +00:00
Bruce Momjian	81186865fe	Joe Conway wrote: > Hannu Krosing wrote: > >> It seems that my last mail on this did not get through to the list >> ;( >> >> Please consider renaming the new builtin function >> split(text,text,int) >> >> to something else, perhaps >> >> split_part(text,text,int) >> >> (like date_part) >> >> The reason for this request is that 3 most popular scripting >> languages (perl, python, php) all have also a function with similar >> signature, but returning an array instead of single element and the >> (optional) third argument is limit (maximum number of splits to >> perform) >> >> I think that it would be good to have similar function in (some >> future release of) postgres, but if we now let in a function with >> same name and arguments but returning a single string instead an >> array of them, then we will need to invent a new and not so easy to >> recognise name for the "real" split function. >> > > This is a good point, and I'm not opposed to changing the name, but > it is too bad your original email didn't get through before beta1 was > rolled. The change would now require an initdb, which I know we were > trying to avoid once beta started (although we could change it > without requiring an initdb I suppose). > > I guess if we do end up needing an initdb for other reasons, we > should make this change too. Any other opinions? Is split_part an > acceptable name? > > Also, if we add a todo to produce a "real" split function that > returns an array, similar to those languages, I'll take it for 7.4. No one commented on the choice of name, so the attached patch changes the name of split(text,text,int) to split_part(text,text,int) per Hannu's recommendation above. This can be applied without an initdb if current beta testers are advised to run: update pg_proc set proname = 'split_part' where proname = 'split'; in the case they want to use this function. Regression and doc fix is also included in the patch. Joe Conway	2002-09-12 00:21:25 +00:00
Bruce Momjian	b60acaf568	The following small patch provides a couple of minor updates (against CVS HEAD): Amended "strings" regression test. TOAST tests now insert two values with storage set to "external", to exercise properly the TOAST slice routines which fetch only a subset of the chunks. Changed now-misleading comment on AlterTableCreateToastTable in tablecmds.c, because both columns of the index on a toast table are now used. John Gray	2002-08-28 20:18:29 +00:00
Bruce Momjian	89260124db	Add: replace(string, from, to) -- replaces all occurrences of "from" in "string" to "to" split(string, fldsep, column) -- splits "string" on "fldsep" and returns "column" number piece to_hex(int32_num) & to_hex(int64_num) -- takes integer number and returns as hex string Joe Conway	2002-08-22 03:24:01 +00:00
Tom Lane	2efb8e8070	Code review for 'at character n' patch --- point at proper end of a token scanned by multiple lex rules.	2002-08-18 03:35:08 +00:00
Bruce Momjian	06b604b737	Modify regression tests to match new error reporting format from Gavin.	2002-08-18 02:48:29 +00:00
Thomas G. Lockhart	ea01a451cc	Implement SQL99 OVERLAY(). Allows substitution of a substring in a string. Implement SQL99 SIMILAR TO as a synonym for our existing operator "~". Implement SQL99 regular expression SUBSTRING(string FROM pat FOR escape). Extend the definition to make the FOR clause optional. Define textregexsubstr() to actually implement this feature. Update the regression test to include these new string features. All tests pass. Rename the regular expression support routines from "pg95_xxx" to "pg_xxx". Define CREATE CHARACTER SET in the parser per SQL99. No implementation yet.	2002-06-11 15:44:38 +00:00
Tom Lane	61446e0927	Improve lexer's error reporting. You get the whole token mentioned now in parse error messages, not just the part scanned by the last flex rule. For example, select "foo" "bar"; used to draw ERROR: parser: parse error at or near """ which was rather unhelpful. Now it gives ERROR: parser: parse error at or near ""bar"" Also, error messages concerning bitstring literals and suchlike will quote the source text at you, not the processed internal form of the literal.	2002-05-01 17:12:08 +00:00
Tom Lane	7c0c9b3cce	New improved version of bpcharin() may have got the truncation case right, but it failed to get the padding case right. This was obscured by subsequent application of bpchar() in all but one regression test case, and that one didn't fail in an obvious way --- trailing blanks are hard to see. Add another test case to make it more obvious if it breaks again.	2001-06-01 17:49:17 +00:00
Peter Eisentraut	5546ec289b	Make char(n) and varchar(n) types raise an error if the inserted string is too long. While I was adjusting the regression tests I moved the array things all into array.sql, to make things more manageable.	2001-05-21 16:54:46 +00:00
Tom Lane	8ae9ad1cb8	Reimplement LIKE/ESCAPE as operators so that indexscan optimization can still work, per recent discussion on pghackers. Correct some bugs in ILIKE implementation.	2000-09-15 18:45:31 +00:00
Thomas G. Lockhart	701de35a3f	Forgot to update the regression test output.	2000-08-07 01:43:14 +00:00
Tom Lane	2d4a05d7df	Update strings test to reflect the fact that casting to char() will now truncate or pad to the specified length.	2000-01-17 00:16:41 +00:00
Thomas G. Lockhart	d83105539a	Verified output from new psql. Include a few new tests for datetime/timespan arithmetic.	2000-01-05 06:06:23 +00:00
Thomas G. Lockhart	3955d66803	Add test for UNION. Add additional tests in strings for conversions of the "name" data type. Test SQL92 string functions such as SUBSTRING() and POSITION(). Fix geometry tests to reflect code fixed by Gautam. Update error messages.	1998-05-29 13:22:42 +00:00
Marc G. Fournier	2a3c589c5a	Clean up regression tests for SunOS (based on Solaris v2.6) Clean up strings.out , removed func_get_detail from error message	1998-02-10 14:22:50 +00:00
Bruce Momjian	0d9fc5afd6	Change elog(WARN) to elog(ERROR) and elog(ABORT).	1998-01-05 03:35:55 +00:00

51 Commits