Commit Graph

154 Commits

Author SHA1 Message Date
Bruce Momjian 29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
Tom Lane a5cf12e2ef Fix performance issues in replace_text(), replace_text_regexp(), and
text_to_array(): they all had O(N^2) behavior on long input strings in
multibyte encodings, because of repeated rescanning of the input text to
identify substrings whose positions/lengths were computed in characters
instead of bytes.  Fix by tracking the current source position as a char
pointer as well as a character-count.  Also avoid some unnecessary palloc
operations.  text_to_array() also leaked memory intracall due to failure
to pfree temporary strings.  Per gripe from Tatsuo Ishii.
2006-11-08 19:22:25 +00:00
Tom Lane 452fa214e5 Fix string_to_array() to correctly handle the case where there are
overlapping possible matches for the separator string, such as
string_to_array('123xx456xxx789', 'xx').
Also, revise the logic of replace(), split_part(), and string_to_array()
to avoid O(N^2) work from redundant searches and conversions to pg_wchar
format when there are N matches to the separator string.
Backpatched the full patch as far as 8.0.  7.4 also has the bug, but the
code has diverged a lot, so I just went for a quick-and-dirty fix of the
bug itself in that branch.
2006-10-07 00:11:53 +00:00
Bruce Momjian f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Bruce Momjian e0522505bd Remove 576 references of include files that were not needed. 2006-07-14 14:52:27 +00:00
Bruce Momjian a22d76d96a Allow include files to compile own their own.
Strip unused include files out unused include files, and add needed
includes to C files.

The next step is to remove unused include files in C files.
2006-07-13 16:49:20 +00:00
Tom Lane 47a37aeebd Split definitions for md5.c out of crypt.h and into their own header
libpq/md5.h, so that there's a clear separation between backend-only
definitions and shared frontend/backend definitions.  (Turns out this
is reversing a bad decision from some years ago...)  Fix up references
to crypt.h as needed.  I looked into moving the code into src/port, but
the headers in src/include/libpq are sufficiently intertwined that it
seems more work than it's worth to do that.
2006-06-20 19:56:52 +00:00
Tom Lane c61a2f5841 Change the backend to reject strings containing invalidly-encoded multibyte
characters in all cases.  Formerly we mostly just threw warnings for invalid
input, and failed to detect it at all if no encoding conversion was required.
The tighter check is needed to defend against SQL-injection attacks as per
CVE-2006-2313 (further details will be published after release).  Embedded
zero (null) bytes will be rejected as well.  The checks are applied during
input to the backend (receipt from client or COPY IN), so it no longer seems
necessary to check in textin() and related routines; any string arriving at
those functions will already have been validated.  Conversion failure
reporting (for characters with no equivalent in the destination encoding)
has been cleaned up and made consistent while at it.

Also, fix a few longstanding errors in little-used encoding conversion
routines: win1251_to_iso, win866_to_iso, euc_tw_to_big5, euc_tw_to_mic,
mic_to_euc_tw were all broken to varying extents.

Patches by Tatsuo Ishii and Tom Lane.  Thanks to Akio Ishida and Yasuo Ohgaki
for identifying the security issues.
2006-05-21 20:05:21 +00:00
Tom Lane 147d4bf3e5 Modify all callers of datatype input and receive functions so that if these
functions are not strict, they will be called (passing a NULL first parameter)
during any attempt to input a NULL value of their datatype.  Currently, all
our input functions are strict and so this commit does not change any
behavior.  However, this will make it possible to build domain input functions
that centralize checking of domain constraints, thereby closing numerous holes
in our domain support, as per previous discussion.

While at it, I took the opportunity to introduce convenience functions
InputFunctionCall, OutputFunctionCall, etc to use in code that calls I/O
functions.  This eliminates a lot of grotty-looking casts, but the main
motivation is to make it easier to grep for these places if we ever need
to touch them again.
2006-04-04 19:35:37 +00:00
Bruce Momjian f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00
Neil Conway 0d9742f99a Attached is a patch that replaces a bunch of places where StringInfos
are unnecessarily allocated on the heap rather than the stack. If the
StringInfo doesn't outlive the stack frame in which it is created,
there is no need to allocate it on the heap via makeStringInfo() --
stack allocation is faster.  While it's not a big deal unless the
code is in a critical path, I don't see a reason not to save a few
cycles -- using stack allocation is not less readable.

I also cleaned up a bit of code along the way: moved variable
declarations into a more tightly-enclosing scope where possible,
fixed some pointless copying of strings in dblink, etc.
2006-03-01 06:51:01 +00:00
Neil Conway 4d39c6bcf5 Fix typo in comment. 2006-02-26 02:23:41 +00:00
Tom Lane 656beff590 Adjust string comparison so that only bitwise-equal strings are considered
equal: if strcoll claims two strings are equal, check it with strcmp, and
sort according to strcmp if not identical.  This fixes inconsistent
behavior under glibc's hu_HU locale, and probably under some other locales
as well.  Also, take advantage of the now-well-defined behavior to speed up
texteq, textne, bpchareq, bpcharne: they may as well just do a bitwise
comparison and not bother with strcoll at all.

NOTE: affected databases may need to REINDEX indexes on text columns to be
sure they are self-consistent.
2005-12-22 22:50:00 +00:00
Bruce Momjian 436a2956d8 Re-run pgindent, fixing a problem where comment lines after a blank
comment line where output as too long, and update typedefs for /lib
directory.  Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).

Backpatch to 8.1.X.
2005-11-22 18:17:34 +00:00
Tom Lane 1d0d8d3c38 Mop-up for nulls-in-arrays patch: fix some places that access array
contents directly.
2005-11-18 02:38:24 +00:00
Peter Eisentraut 07bb9f086b Message corrections 2005-10-29 00:31:52 +00:00
Tom Lane 220f2a7d15 Code review for regexp_replace patch. Improve documentation and comments,
fix problems with replacement-string backslashes that aren't followed by
one of the expected characters, avoid giving the impression that
replace_text_regexp() is meant to be called directly as a SQL function,
etc.
2005-10-18 20:38:58 +00:00
Tom Lane d330f1554d Clean up libpq's pollution of application namespace by renaming the
exported routines of ip.c, md5.c, and fe-auth.c to begin with 'pg_'.
Also get rid of the vestigial fe_setauthsvc/fe_getauthsvc routines
altogether.
2005-10-17 16:24:20 +00:00
Bruce Momjian 1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Tom Lane 8889685555 Suppress signed-vs-unsigned-char warnings. 2005-09-24 17:53:28 +00:00
Neil Conway 148c00acbb Update two comments to refer to use the new list API names. 2005-09-16 04:13:18 +00:00
Tom Lane a9118fc5a8 The idea of using _strncoll() on Windows doesn't work. Revert to same
code as we use on other platforms when encoding is not UTF8.
2005-08-26 17:40:36 +00:00
Tom Lane 767a9021b3 Add small hack to support use of Unicode-based locales on WIN32. This
is not adequately tested yet, but let's get it into beta1 so it can be
tested.  Magnus Hagander and Tom Lane.
2005-08-24 17:50:00 +00:00
Tom Lane 0001e98d54 Code and docs review for pg_column_size() patch. 2005-08-02 16:11:57 +00:00
Bruce Momjian 722f31f786 Thank you for applying patch --- regexp_replace.
An attached patch is a small additional improvement.

This patch use appendStringInfoText instead of appendStringInfoString.
There is an overhead of PG_TEXT_GET_STR when appendStringInfoString is
executed by text type. This can be reduced by appendStringInfoText.

Atsushi Ogawa
2005-07-29 03:17:55 +00:00
Bruce Momjian 9dbd00b0e2 Remove unnecessary parentheses in assignments.
Add spaces where needed.
Reference time interval variables as tinterval.
2005-07-21 04:41:43 +00:00
Tom Lane d78397d301 Change typreceive function API so that receive functions get the same
optional arguments as text input functions, ie, typioparam OID and
atttypmod.  Make all the datatypes that use typmod enforce it the same
way in typreceive as they do in typinput.  This fixes a problem with
failure to enforce length restrictions during COPY FROM BINARY.
2005-07-10 21:14:00 +00:00
Bruce Momjian 75a64eeb4b I made the patch that implements regexp_replace again.
The specification of this function is as follows.

regexp_replace(source text, pattern text, replacement text, [flags
text])
returns text

Replace string that matches to regular expression in source text to
replacement text.

 - pattern is regular expression pattern.
 - replacement is replace string that can use '\1'-'\9', and '\&'.
    '\1'-'\9': back reference to the n'th subexpression.
    '\&'     : entire matched string.
 - flags can use the following values:
    g: global (replace all)
    i: ignore case
    When the flags is not specified, case sensitive, replace the first
    instance only.

Atsushi Ogawa
2005-07-10 04:54:33 +00:00
Bruce Momjian 294de2dc01 pg_column_size() cleanup for messages and code cleanup.
Mark Kirkwood
2005-07-07 04:36:08 +00:00
Bruce Momjian a923602855 Add pg_column_size() to return storage size of a column, including
possible compression.

Mark Kirkwood
2005-07-06 19:02:54 +00:00
Bruce Momjian 109f079be6 I made the patch that improved the performance of replace_text().
The content of the patch is as follows:

(1)Create shortcut when subtext was not found.

(2)Stop using LEFT and RIGHT macro.
In LEFT and RIGHT macro, TEXTPOS is executed by the same content as
execution immediately before. The execution frequency of TEXTPOS can be
reduced by using text_substring instead of LEFT and RIGHT macro.

(3)Add appendStringInfoText, and use it instead of
appendStringInfoString.
There is an overhead of PG_TEXT_GET_STR when appendStringInfoString is
executed by text type. This can be reduced by appendStringInfoText.

(4)Reduce execution of TEXTDUP.

The effect of the patch that I measured is as follows:

- The Data for test was created by 'pgbench -i'.

- Test SQL:
 select replace(aid, '9', 'A') from accounts;

- Test results: Linux(CPU: Pentium III, Compiler option: -O2)
 original: 1.515s
 patched:  1.250s

Atsushi Ogawa
2005-07-04 18:56:44 +00:00
Tom Lane cfd9be939e Change the UNKNOWN type to have an internal representation matching
cstring, rather than text, so as to eliminate useless conversions
inside the parser.  Per recent discussion.
2005-05-30 01:20:50 +00:00
Neil Conway a4374f9070 Remove second argument from textToQualifiedNameList(), as it is no longer
used. From Jaime Casanova.
2005-05-27 00:57:49 +00:00
Neil Conway f3567eeaf2 Implement md5(bytea), update regression tests and documentation. Patch
from Abhijit Menon-Sen, minor editorialization from Neil Conway. Also,
improve md5(text) to allocate a constant-sized buffer on the stack
rather than via palloc.

Catalog version bumped.
2005-05-20 01:29:56 +00:00
Tom Lane 6c412f0605 Change CREATE TYPE to require datatype output and send functions to have
only one argument.  (Per recent discussion, the option to accept multiple
arguments is pretty useless for user-defined types, and would be a likely
source of security holes if it was used.)  Simplify call sites of
output/send functions to not bother passing more than one argument.
2005-05-01 18:56:19 +00:00
Neil Conway 3350b3740e This patch optimizes the md5_text() function (which is used to
implement the md5() SQL-level function). The old code did the
following:

1. de-toast the datum
2. convert it to a cstring via textout()
3. get the length of the cstring via strlen()

Since we are treating the datum context as a blob of binary data,
the latter two steps are unnecessary. Once the data has been
detoasted, we can just use it as-is, and derive its length from
the varlena metadata.

This patch improves some run-of-the-mill md5() computations by
just under 10% in my limited tests, and passes the regression tests.

I also noticed that md5_text() wasn't checking the return value
of md5_hash(); encountering OOM at precisely the right moment
could result in returning a random md5 hash. This patch corrects
that. A better fix would be to make md5_hash() only return on
success (and/or allocate via palloc()), but since it's used in
the frontend as well I don't see an easy way to do that.
2005-02-23 22:46:17 +00:00
PostgreSQL Daemon 2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Bruce Momjian b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane c541bb86e9 Infrastructure for I/O of composite types: arrange for the I/O routines
of a composite type to get that type's OID as their second parameter,
in place of typelem which is useless.  The actual changes are mostly
centralized in getTypeInputInfo and siblings, but I had to fix a few
places that were fetching pg_type.typelem for themselves instead of
using the lsyscache.c routines.  Also, I renamed all the related variables
from 'typelem' to 'typioparam' to discourage people from assuming that
they necessarily contain array element types.
2004-06-06 00:41:28 +00:00
Neil Conway 72b6ad6313 Use the new List API function names throughout the backend, and disable the
list compatibility API by default. While doing this, I decided to keep
the llast() macro around and introduce llast_int() and llast_oid() variants.
2004-05-30 23:40:41 +00:00
Neil Conway d0b4399d81 Reimplement the linked list data structure used throughout the backend.
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.

The function names of the list API have also been changed to, I hope,
be more logically consistent. By default, the old function names are
still available; they will be disabled-by-default once the rest of
the tree has been updated to use the new API names.
2004-05-26 04:41:50 +00:00
Tom Lane 59f9a0b9df Implement a solution to the 'Turkish locale downcases I incorrectly'
problem, per previous discussion.  Make some additional changes to
centralize the knowledge of just how identifier downcasing is done,
in hopes of simplifying any future tweaking in this area.
2004-02-21 00:34:53 +00:00
Neil Conway 7b2cf1713d Micro-opt: replace calls like
appendStringInfo(buf, "%s", str);
with
    appendStringInfoString(buf, str);
as the latter form is slightly faster.
2004-01-31 05:09:41 +00:00
Tom Lane d4fd7d85f3 Fix text_position to not scan past end of source string in multibyte
case, per report from Korea PostgreSQL Users' Group.  Also do some
cosmetic cleanup in nearby code.
2004-01-31 00:45:21 +00:00
Tom Lane 7fc2d50877 Make to_hex() behave portably on negative input values (treat them as
unsigned integers).  Per report from Jim Crate.
2003-12-19 04:56:41 +00:00
Joe Conway b8f40ced2f Make PQescapeBytea and byteaout consistent with each other, and
octal escape all octets outside the range 0x20 to 0x7e. This fixes
the problem pointed out by Sergey Yatskevich here:
http://archives.postgresql.org/pgsql-bugs/2003-11/msg00140.php
2003-11-30 20:55:09 +00:00
PostgreSQL Daemon 969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Peter Eisentraut feb4f44d29 Message editing: remove gratuitous variations in message wording, standardize
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
2003-09-25 06:58:07 +00:00
Tom Lane 4c3c8c048d Remove --enable-recode feature, since it's been broken by IPv6 changes,
and seems to have too few users to justify maintaining.
2003-08-04 04:03:10 +00:00