Commit Graph

2018 Commits

Author SHA1 Message Date
Bruce Momjian 6f3149e464 Fix typo in comment. 2007-11-21 21:49:22 +00:00
Tom Lane d23ba77a44 Fix bogus length calculation that could lead to crash if the string
happened to be right up against the end of memory, per report from
Matt Magoffin.  While at it, avoid useless multiple copying of string
by not depending on xmlStrncatNew.
2007-11-20 23:14:41 +00:00
Teodor Sigaev a867b40cf4 Fix tsvectorout() and tsqueryout() to escape backslesh, add test of that.
Patch by Bruce Momjian <bruce@momjian.us>

Backpatch is needed, but it's impossible to apply it directly
2007-11-16 15:05:59 +00:00
Bruce Momjian f639df0d61 Small comment spacing improvement. 2007-11-16 01:51:22 +00:00
Bruce Momjian 5f0bf6cb0d Run pgindent on remaining files now that LOOPBYTE is a usable macro. 2007-11-16 01:12:24 +00:00
Bruce Momjian 224f91f66d Modify LOOPBYTE/LOOPBIT macros to be more logical; rather than have the
for() body passed as a parameter, make the macros act as simple headers
to code blocks.

This allows pgindent to be run on these files.
2007-11-16 00:13:02 +00:00
Bruce Momjian 7d4c99b414 Fix pgindent to properly handle 'else' and single-line comments on the
same line;  previous fix was only partial.  Re-run pgindent on files
that need it.
2007-11-15 23:23:44 +00:00
Bruce Momjian f6e8730d11 Re-run pgindent with updated list of typedefs. (Updated README should
avoid this problem in the future.)
2007-11-15 22:25:18 +00:00
Bruce Momjian fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Tom Lane 866bad9543 Add a rank/(rank+1) normalization option to ts_rank(). While the usefulness
of this seems a bit marginal, if it's useful enough to be shown in the manual
then we probably ought to support doing it without double evaluation of the
ts_rank function.  Per my proposal earlier today.
2007-11-14 23:43:27 +00:00
Tom Lane 4394c1b09c Resurrect the code for the rewrite(ARRAY[...]) aggregate function,
and put it into contrib/tsearch2 compatibility module.
2007-11-13 22:14:50 +00:00
Tom Lane 2b477a2c73 Add missing closing / in xsd:restriction, and remove some unnecessary
spaces for consistency.  Per bug #3734 from Ben Leslie; fix by
Euler Taveira de Oliveira.
2007-11-10 19:29:54 +00:00
Tom Lane d2d52bbb55 xmlGetUTF8Char()'s second argument is both input and output. Fix
uninitialized value, and avoid invoking the function nine separate
times in the pg_xmlIsNameChar macro.  Should resolve buildfarm failures.
Per report from Ben Leslie.
2007-11-10 18:51:20 +00:00
Tom Lane a96fa85025 Second pass at improving LIKE/regex estimation in non-C locales. It turns
out that it's actually quite likely that a string that is an extension of
the given prefix will sort as larger than the "greater" string our previous
code created.  To provide some defense against that, do the comparisons
against a modified string instead of just the bare prefix.  We tack on
"Z", "z", "y", or "9", whichever is seen as largest in the current locale.
Testing suggests that this is sufficient at least for cases involving
ASCII data.
2007-11-09 20:10:02 +00:00
Peter Eisentraut 8db43db01e Allow XML processing instructions starting with "xml" while prohibiting
those being exactly "xml".  Bug #3735 from Ben Leslie
2007-11-09 15:52:51 +00:00
Peter Eisentraut 4c726d5c11 After conferencing again with Bruce, put in more accurate XML error message. 2007-11-08 15:16:45 +00:00
Peter Eisentraut 79cff6bc7e Improve error message 2007-11-08 13:12:56 +00:00
Tom Lane 2de946be6a Improve the performance of LIKE/regex estimation in non-C locales, by making
make_greater_string() try harder to generate a string that's actually greater
than its input string.  Before we just assumed that making a string that was
memcmp-greater was enough, but it is easy to generate examples where this is
not so when the locale is not C.  Instead, loop until the relevant comparison
function agrees that the generated string is greater than the input.

Unfortunately this is probably not enough to guarantee that the generated
string is greater than all extensions of the input, so we cannot relax the
restriction to C locale for the LIKE/regex index optimization.  But it should
at least improve the odds of getting a useful selectivity estimate in
prefix_selectivity().  Per example from Guillaume Smet.

Backpatch to 8.1, mainly because that's what the complainant is using...
2007-11-07 22:37:24 +00:00
Tom Lane 9542287123 Fix patternsel() and callers to do the right thing for NOT LIKE and the other
negated-match operators.  patternsel had been using the supplied operator as
though it were a positive-match operator, and thus obtaining a wrong result,
which was even more wrong after the caller subtracted it from 1.  Seems
cleanest to give patternsel an explicit "negate" argument so that it knows
what's going on.  Also install the same factorization scheme for pattern
join selectivity estimators; even though they are just stubs at the
moment, this may keep someone from making the same type of mistake when
they get filled out.  Per report from Greg Mullane.

Backpatch to 8.2 --- previous releases do not show the problem because
patternsel() doesn't actually use the operator directly.
2007-11-07 21:00:37 +00:00
Tom Lane 5e51297104 Some code review for xml.c:
Add some more xml_init() calls that might not be necessary, but seem like a
good idea to avoid possible problems like we saw in xmlelement().
Fix unsafe assumption that you can keep using the tupledesc of a relcache
entry you don't have open.
Add missing error checks for SearchSysCache failure.
Get rid of handwritten array traversal in xpath() and O(N^2), broken-for-nulls
array access code in map_sql_value_to_xml_value(), in favor of using
deconstruct_array.
Manually adjust a lot of line breaks in places where the code is otherwise
gonna look pretty awful after pg_indent hacks it up (original author seems to
have liked to lay out code for a 200-column window).
2007-11-06 03:06:28 +00:00
Tom Lane 85f807d782 Fix xmlelement() to initialize libxml correctly before using it, and to avoid
assuming that evaluation of its input expressions won't change the state of
libxml.  This requires refactoring xml_init() to not call xmlInitParser(),
since now not all of its callers want that.  I also tweaked things to avoid
repeated execution of one-time-only tests inside xml_init(), though this is
mostly for clarity rather than in hopes of saving any noticeable amount of
runtime.  Per report from Sheikh Amjad and subsequent discussion.
In passing, fix a couple of inadequately schema-qualified queries.
2007-11-05 22:23:07 +00:00
Tom Lane 1c92724985 Set read_only = TRUE while evaluating input queries for ts_rewrite()
and ts_stat(), per my recent suggestion.  Also add a possibly-not-needed-
but-can't-hurt check for NULL SPI_tuptable, before we try to dereference
same.
2007-10-24 03:30:03 +00:00
Tom Lane 592c88a0d2 Remove the aggregate form of ts_rewrite(), since it doesn't work as desired
if there are zero rows to aggregate over, and the API seems both conceptually
and notationally ugly anyway.  We should look for something that improves
on the tsquery-and-text-SELECT version (which is also pretty ugly but at
least it works...), but it seems that will take query infrastructure that
doesn't exist today.  (Hm, I wonder if there's anything in or near SQL2003
window functions that would help?)  Per discussion.
2007-10-24 02:24:49 +00:00
Tom Lane 12f25e70a6 Fix two-argument form of ts_rewrite() so it actually works for cases where
a later rewrite rule should change a subtree modified by an earlier one.
Per my gripe of a few days ago.
2007-10-23 01:44:40 +00:00
Tom Lane bb36c51fcd Fix several bugs in tsvectorin, including crash due to uninitialized field and
miscomputation of required palloc size.  The crash could only occur if the
input contained lexemes both with and without positions, which is probably not
common in practice.  The miscomputation would definitely result in wasted
space.  Also fix some inconsistent coding around alignment of strings and
positions in a tsvector value; these errors could also lead to crashes given
mixed with/without position data and a machine that's picky about alignment.
And be more careful about checking for overflow of string offsets.

Patch is only against HEAD --- I have not looked to see if same bugs are
in back-branch contrib/tsearch2 code.
2007-10-23 00:51:23 +00:00
Tom Lane 1ea47dd8cb Fix shared tsvector/tsquery input code so that we don't say "syntax error in
tsvector" when we are really parsing a tsquery.  Report the bogus input,
too.  Make styles of some related error messages more consistent.
2007-10-21 22:29:56 +00:00
Tom Lane 531ead8ab4 Adjust error message to agree with documentation. The tsearch documentation
uniformly calls these things weights, not classes.
2007-10-20 21:06:20 +00:00
Tom Lane 18e3fcc31e Migrate the former contrib/txid module into core. This will make it easier
for Slony and Skytools to depend on it.  Per discussion.
2007-10-13 23:06:28 +00:00
Tom Lane ff1de5cef6 Guard against possible double free during error escape from XML
functions.  Patch for the reported issue from Kris Jurka, some
other potential trouble spots plugged by Tom.
2007-10-13 20:46:47 +00:00
Tom Lane 8468146b03 Fix the inadvertent libpq ABI breakage discovered by Martin Pitt: the
renumbering of encoding IDs done between 8.2 and 8.3 turns out to break 8.2
initdb and psql if they are run with an 8.3beta1 libpq.so.  For the moment
we can rearrange the order of enum pg_enc to keep the same number for
everything except PG_JOHAB, which isn't a problem since there are no direct
references to it in the 8.2 programs anyway.  (This does force initdb
unfortunately.)

Going forward, we want to fix things so that encoding IDs can be changed
without an ABI break, and this commit includes the changes needed to allow
libpq's encoding IDs to be treated as fully independent of the backend's.
The main issue is that libpq clients should not include pg_wchar.h or
otherwise assume they know the specific values of libpq's encoding IDs,
since they might encounter version skew between pg_wchar.h and the libpq.so
they are using.  To fix, have libpq officially export functions needed for
encoding name<=>ID conversion and validity checking; it was doing this
anyway unofficially.

It's still the case that we can't renumber backend encoding IDs until the
next bump in libpq's major version number, since doing so will break the
8.2-era client programs.  However the code is now prepared to avoid this
type of problem in future.

Note that initdb is no longer a libpq client: we just pull in the two
source files we need directly.  The patch also fixes a few places that
were being sloppy about checking for an unrecognized encoding name.
2007-10-13 20:18:42 +00:00
Tom Lane 537e92e41f Fix ALTER COLUMN TYPE to preserve the tablespace and reloptions of indexes
it affects.  The original coding neglected tablespace entirely (causing
the indexes to move to the database's default tablespace) and for an index
belonging to a UNIQUE or PRIMARY KEY constraint, it would actually try to
assign the parent table's reloptions to the index :-(.  Per bug #3672 and
subsequent investigation.

8.0 and 8.1 did not have reloptions, but the tablespace bug is present.
2007-10-13 15:55:40 +00:00
Tom Lane 6f21c57a97 In the integer-datetimes case, date2timestamp and date2timestamptz need
to check for overflow because the legal range of type date is actually
wider than timestamp's.  Problem found by Neil Conway.
2007-09-26 01:10:42 +00:00
Tom Lane 6f5c38dcd0 Just-in-time background writing strategy. This code avoids re-scanning
buffers that cannot possibly need to be cleaned, and estimates how many
buffers it should try to clean based on moving averages of recent allocation
requests and density of reusable buffers.  The patch also adds a couple
more columns to pg_stat_bgwriter to help measure the effectiveness of the
bgwriter.

Greg Smith, building on his own work and ideas from several other people,
in particular a much older patch from Itagaki Takahiro.
2007-09-25 20:03:38 +00:00
Tom Lane f71c7b9dfd Fix bugs in XML binary I/O functions. Heikki and Tom 2007-09-23 21:36:42 +00:00
Tom Lane bbda96d76d Fix bogus calculation of potential output string length in translate(). 2007-09-22 05:35:42 +00:00
Tom Lane 5e87ebb0c3 Although I'd misdiagnosed the reason for the recent failures on
buildfarm member grebe, I see no reason to revert the 1-byte-header-friendly
changes I made in varlena.c.  Instead, tweak the code a little bit to
get more advantage out of that.
2007-09-22 04:40:03 +00:00
Tom Lane 94470b9499 Doh --- what's really happening on buildfarm member grebe is that its
malloc returns NULL for malloc(0).  Defend against that case.
2007-09-22 04:37:53 +00:00
Andrew Dunstan e152893305 Go back to using a separate method for doing ILIKE for single byte
character encodings that doesn't involve calling lower(). This should
cure the performance regression in this case complained of by Guillaume
Smet. It still leaves the horrid performance for multi-byte encodings
introduced in 8.2, but there's no obvious solution for that in sight.
2007-09-22 03:58:34 +00:00
Tom Lane b5d1608b0a Fix varlena.c routines to allow 1-byte-header text values. This is now
demonstrably necessary for text_substring() since regexp_split functions
may pass it such a value; and we might as well convert the whole file
at once.  Per buildfarm results (though I wonder why most machines aren't
showing a failure).
2007-09-22 00:36:38 +00:00
Tom Lane 7583f9a7ca Fix regex, LIKE, and some other second-rank text-manipulation functions
to not cause needless copying of text datums that have 1-byte headers.
Greg Stark, in response to performance gripe from Guillaume Smet and
ITAGAKI Takahiro.
2007-09-21 22:52:52 +00:00
Tom Lane d22ae3ecc2 Solaris portability fix that was previously made in contrib/tsearch2
but got lost from the version committed to main tree.  Per Greg Stark.
2007-09-20 23:27:11 +00:00
Teodor Sigaev bab16af807 Fix msvc warnings, patch by Hannes Eder <Hannes@HannesEder.net> 2007-09-20 18:10:57 +00:00
Tom Lane 282d2a03dd HOT updates. When we update a tuple without changing any of its indexed
columns, and the new version can be stored on the same heap page, we no longer
generate extra index entries for the new version.  Instead, index searches
follow the HOT-chain links to ensure they find the correct tuple version.

In addition, this patch introduces the ability to "prune" dead tuples on a
per-page basis, without having to do a complete VACUUM pass to recover space.
VACUUM is still needed to clean up dead index entries, however.

Pavan Deolasee, with help from a bunch of other people.
2007-09-20 17:56:33 +00:00
Neil Conway bbf4fdc253 Prevent corr() from returning the wrong results for negative correlation
values. The previous coding essentially assumed that x = sqrt(x*x), which
does not hold for x < 0.

Thanks to Jie Zhang at Greenplum and Gavin Sherry for reporting this
issue.
2007-09-19 22:31:48 +00:00
Andrew Dunstan 55613bf9cd Close previously open holes for invalidly encoded data to enter the
database via builtin functions, as recently discussed on -hackers.

chr() now returns a character in the database encoding. For UTF8 encoded databases
the argument is treated as a Unicode code point. For other multi-byte encodings
the argument must designate a strict ascii character, or an error is raised,
as is also the case if the argument is 0.

ascii() is adjusted so that it remains the inverse of chr().

The two argument form of convert() is gone, and the three argument form now
takes a bytea first argument and returns a bytea. To cover this loss three new
functions are introduced:
. convert_from(bytea, name) returns text - converts the first argument from the
  named encoding to the database encoding
. convert_to(text, name) returns bytea - converts the first argument from the
  database encoding to the named encoding
. length(bytea, name) returns int - gives the length of the first argument in
  characters in the named encoding
2007-09-18 17:41:17 +00:00
Tom Lane 22d98e7934 Fix overflow in extract(epoch from interval) for intervals exceeding 68 years.
Seems to have been introduced in 8.1 by careless SECS_PER_DAY
search-and-replace.
2007-09-16 15:56:20 +00:00
Teodor Sigaev 3e805fdcf7 Fix typo in typecasting.
patch from  ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>
2007-09-13 06:54:35 +00:00
Teodor Sigaev 476045a21b Remove QueryOperand->istrue flag, it was used only in cover ranking
(ts_rank_cd). Use palloc'ed array in ranking instead of flag.
2007-09-11 16:01:40 +00:00
Teodor Sigaev 57cafe7982 Refactor from Heikki Linnakangas <heikki@enterprisedb.com>:
* Defined new struct WordEntryPosVector that holds a uint16 length and a
variable size array of WordEntries. This replaces the previous
convention of a variable size uint16 array, with the first element
implying the length. WordEntryPosVector has the same layout in memory,
but is more readable in source code. The POSDATAPTR and POSDATALEN
macros are still used, though it would now be more readable to access
the fields in WordEntryPosVector directly.

* Removed needfree field from DocRepresentation. It was always set to false.

* Miscellaneous other commenting and refactoring
2007-09-11 08:46:29 +00:00
Tom Lane ef4d38c86c Rename recently-added pg_stat_activity column from txn_start to xact_start,
for consistency with other column names such as in pg_stat_database.
2007-09-11 03:28:05 +00:00
Tom Lane 82a47982f3 Arrange for SET LOCAL's effects to persist until the end of the current top
transaction, unless rolled back or overridden by a SET clause for the same
variable attached to a surrounding function call.  Per discussion, these
seem the best semantics.  Note that this is an INCOMPATIBLE CHANGE: in 8.0
through 8.2, SET LOCAL's effects disappeared at subtransaction commit
(leading to behavior that made little sense at the SQL level).

I took advantage of the opportunity to rewrite and simplify the GUC variable
save/restore logic a little bit.  The old idea of a "tentative" value is gone;
it was a hangover from before we had a stack.  Also, we no longer need a stack
entry for every nesting level, but only for those in which a variable's value
actually changed.
2007-09-11 00:06:42 +00:00
Teodor Sigaev d982daae0b Change void* opaque argument to Datum type, add argument's
name to PushFunction type definition.

Per suggestion by Tome Lane <tgl@sss.pgh.pa.us>
2007-09-10 12:36:41 +00:00
Teodor Sigaev 978de9d06d Improvements from Heikki Linnakangas <heikki@enterprisedb.com>
- change the alignment requirement of lexemes in TSVector slightly.
Lexeme strings were always padded to 2-byte aligned length to make sure
that if there's position array (uint16[]) it has the right alignment.
The patch changes that so that the padding is not done when there's no
positions. That makes the storage of tsvectors without positions
slightly more compact.

- added some #include "miscadmin.h" lines I missed in the earlier when I
added calls to check_stack_depth().

- Reimplement the send/recv functions, and added a comment
above them describing the on-wire format. The CRC is now recalculated in
tsquery as well per previous discussion.
2007-09-07 16:03:40 +00:00
Teodor Sigaev 8983852e34 Improving various checks by Heikki Linnakangas <heikki@enterprisedb.com>
- add code to check that the query tree is well-formed. It was indeed
  possible to send malformed queries in binary mode, which produced all
  kinds of strange results.

- make the left-field a uint32. There's no reason to
  arbitrarily limit it to 16-bits, and it won't increase the disk/memory
  footprint either now that QueryOperator and QueryOperand are separate
  structs.

- add check_stack_depth() call to all recursive functions I found.
  Some of them might have a natural limit so that you can't force
  arbitrarily deep recursions, but check_stack_depth() is cheap enough
  that seems best to just stick it into anything that might be a problem.
2007-09-07 15:35:11 +00:00
Teodor Sigaev e5be89981f Refactoring by Heikki Linnakangas <heikki@enterprisedb.com> with
small editorization by me

- Brake the QueryItem struct into QueryOperator and QueryOperand.
  Type was really the only common field between them. QueryItem still
  exists, and is used in the TSQuery struct as before, but it's now a
  union of the two. Many other changes fell from that, like separation
  of pushval_asis function into pushValue, pushOperator and pushStop.

- Moved some structs that were for internal use only from header files
  to the right .c-files.

- Moved tsvector parser to a new tsvector_parser.c file. Parser code was
  about half of the size of tsvector.c, it's also used from tsquery.c, and
  it has some data structures of its own, so it seems better to separate
  it. Cleaned up the API so that TSVectorParserState is not accessed from
  outside tsvector_parser.c.

- Separated enumerations (#defines, really) used for QueryItem.type
  field and as return codes from gettoken_query. It was just accidental
  code sharing.

- Removed ParseQueryNode struct used internally by makepol and friends.
  push*-functions now construct QueryItems directly.

- Changed int4 variables to just ints for variables like "i" or "array
  size", where the storage-size was not significant.
2007-09-07 15:09:56 +00:00
Tom Lane 295e63983d Implement lazy XID allocation: transactions that do not modify any database
rows will normally never obtain an XID at all.  We already did things this way
for subtransactions, but this patch extends the concept to top-level
transactions.  In applications where there are lots of short read-only
transactions, this should improve performance noticeably; not so much from
removal of the actual XID-assignments, as from reduction of overhead that's
driven by the rate of XID consumption.  We add a concept of a "virtual
transaction ID" so that active transactions can be uniquely identified even
if they don't have a regular XID.  This is a much lighter-weight concept:
uniqueness of VXIDs is only guaranteed over the short term, and no on-disk
record is made about them.

Florian Pflug, with some editorialization by Tom.
2007-09-05 18:10:48 +00:00
Andrew Dunstan 2e74c53ec1 Provide for binary input/output of enums, to fix complaint from Merlin Moncure.
This just provides text values, we're not exposing the underlying Oid representation.
Catalog version bumped.
2007-09-04 16:41:43 +00:00
Tom Lane 0ee5a39862 Apply a band-aid fix for the problem that 8.2 and up completely misestimate
the number of rows likely to be produced by a query such as
	SELECT * FROM t1 LEFT JOIN t2 USING (key) WHERE t2.key IS NULL;
What this is doing is selecting for t1 rows with no match in t2, and thus
it may produce a significant number of rows even if the t2.key table column
contains no nulls at all.  8.2 thinks the table column's null fraction is
relevant and thus may estimate no rows out, which results in terrible plans
if there are more joins above this one.  A proper fix for this will involve
passing much more information about the context of a clause to the selectivity
estimator functions than we ever have.  There's no time left to write such a
patch for 8.3, and it wouldn't be back-patchable into 8.2 anyway.  Instead,
put in an ad-hoc test to defeat the normal table-stats-based estimation when
an IS NULL test is evaluated at an outer join, and just use a constant
estimate instead --- I went with 0.5 for lack of a better idea.  This won't
catch every case but it will catch the typical ways of writing such queries,
and it seems unlikely to make things worse for other queries.
2007-08-31 23:35:22 +00:00
Tom Lane 79048ca1a4 Install check_stack_depth() protection in two recursive tsquery
processing routines.  Per Heikki.
2007-08-31 02:26:29 +00:00
Tom Lane e75d365633 Fix int8mul so that overflow check is applied correctly for INT64_IS_BUSTED
case, per Florian Pflug.
Not back-patched since it's unclear that anyone but me still cares ...
2007-08-30 05:27:29 +00:00
Tom Lane 8bc225e799 Relax permissions checks on dbsize functions, per discussion. Revert out all
checks for individual-table-size functions, since anyone in the database could
get approximate values from pg_class.relpages anyway.  Allow database-size to
users with CONNECT privilege for the target database (note that this is
granted by default).  Allow tablespace-size if the user has CREATE privilege
on the tablespace (which is *not* granted by default), or if the tablespace is
the default tablespace for the current database (since we treat that as
implicitly allowing use of the tablespace).
2007-08-29 17:24:29 +00:00
Tom Lane 6c96188cb5 Remove the 'not in' operator (!!=). This was a hangover from Berkeley
days that was obsolete the moment we had IN (SELECT ...) capability.
It's arguably a security hole since it applied no permissions check to
the table it searched, and since it was never documented anywhere,
removing it seems more appropriate than fixing it.
2007-08-27 01:39:25 +00:00
Tom Lane cc26599b72 Restrict pg_relation_size to relation owner, pg_database_size to DB owner,
and pg_tablespace_size to superusers.  Perhaps we could weaken the first
case to just require SELECT privilege, but that doesn't work for the
other cases, so use ownership as the common concept.
2007-08-27 01:19:14 +00:00
Tom Lane 741e952b54 Make currtid() functions require SELECT privileges on the target table.
While it's not clear that TID linkage info is of any great use to a
nefarious user, it's certainly unexpected that these functions wouldn't
insist on read privileges.
2007-08-27 00:57:36 +00:00
Tom Lane d01741bfa1 Remove extraneous semicolon --- buildfarm member bear, for one,
objects to it.
2007-08-21 06:34:42 +00:00
Tom Lane 14572e4324 Fix cash_mul_int4 and cash_div_int4 for overenthusiastic substitution
of int64 for int32.  Per reports from Merlin Moncure and Andrew Chernow.
2007-08-21 03:56:07 +00:00
Tom Lane 1783e5db3e Fix money type's send/receive functions to conform to recent widening
of the datatype to int64.  Per Andrew Chernow.
2007-08-21 03:14:36 +00:00
Tom Lane 1cee06ac02 Fix potential access-off-the-end-of-memory in varbit_out(): it fetched the
byte after the last full byte of the bit array, regardless of whether that
byte was part of the valid data or not.  Found by buildfarm testing.
Thanks to Stefan Kaltenbrunner for nailing down the cause.
2007-08-21 02:40:06 +00:00
Tom Lane 440a330a31 Fix a small 64-bit problem in tsearch patch. 2007-08-21 01:45:33 +00:00
Tom Lane 140d4ebcb4 Tsearch2 functionality migrates to core. The bulk of this work is by
Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing,
so anything that's broken is probably my fault.

Documentation is nonexistent as yet, but let's land the patch so we can
get some portability testing done.
2007-08-21 01:11:32 +00:00
Andrew Dunstan fd801f4faa Provide for logfiles in machine readable CSV format. In consequence, rename
redirect_stderr to logging_collector.
Original patch from Arul Shaji, subsequently modified by Greg Smith, and then
heavily modified by me.
2007-08-19 01:41:25 +00:00
Tom Lane 9cb8409762 Repair problems occurring when multiple RI updates have to be done to the same
row within one query: we were firing check triggers before all the updates
were done, leading to bogus failures.  Fix by making the triggers queued by
an RI update go at the end of the outer query's trigger event list, thereby
effectively making the processing "breadth-first".  This was indeed how it
worked pre-8.0, so the bug does not occur in the 7.x branches.
Per report from Pavel Stehule.
2007-08-15 19:15:47 +00:00
Tom Lane ae65ca312f Avoid memory leakage across successive calls of regexp_matches() or
regexp_split_to_table() within a single query.  This is only a partial
solution, as it turns out that with enough matches per string these
functions can also tickle a repalloc() misbehavior.  But fixing that
is a topic for a separate patch.
2007-08-11 19:16:41 +00:00
Tom Lane 1b70619311 Code review for regexp_matches/regexp_split patch. Refactor to avoid assuming
that cached compiled patterns will still be there when the function is next
called.  Clean up looping logic, thereby fixing bug identified by Pavel
Stehule.  Share setup code between the two functions, add some comments, and
avoid risky mixing of int and size_t variables.  Clean up the documentation a
tad, and accept all the flag characters mentioned in table 9-19 rather than
just a subset.
2007-08-11 03:56:24 +00:00
Tom Lane 8d30337566 Fix up bad layout of some comments (probably pg_indent's fault), and
improve grammar a tad.  Per Greg Stark.
2007-08-04 21:53:00 +00:00
Tom Lane bdd6b62245 Switch over to using the src/timezone functions for formatting timestamps
displayed in the postmaster log.  This avoids Windows-specific problems with
localized time zone names that are in the wrong encoding, and generally seems
like a good idea to forestall other potential platform-dependent issues.
To preserve the existing behavior that all backends will log in the same time
zone, create a new GUC variable log_timezone that can only be changed on a
system-wide basis, and reference log-related calculations to that zone instead
of the TimeZone variable.

This fixes the issue reported by Hiroshi Saito that timestamps printed by
xlog.c startup could be improperly localized on Windows.  We still need a
simpler patch for that problem in the back branches, however.
2007-08-04 01:26:54 +00:00
Tom Lane 4ca7a2dacb Make replace(), split_part(), and string_to_array() behave somewhat sanely
when handed an invalidly-encoded pattern.  The previous coding could get
into an infinite loop if pg_mb2wchar_with_len() returned a zero-length
string after we'd tested for nonempty pattern; which is exactly what it
will do if the string consists only of an incomplete multibyte character.
This led to either an out-of-memory error or a backend crash depending
on platform.  Per report from Wiktor Wodecki.
2007-07-19 20:34:20 +00:00
Bruce Momjian b6ed78b2bd Properly adjust age() seconds to match the sign of the larger units.
Patch from Tom.
2007-07-18 03:13:13 +00:00
Neil Conway 474774918b Implement CREATE TABLE LIKE ... INCLUDING INDEXES. Patch from NikhilS,
based in part on an earlier patch from Trevor Hardcastle, and reviewed
by myself.
2007-07-17 05:02:03 +00:00
Tom Lane 39f06dcad6 Fix map_sql_typecoll_to_xmlschema_types() to not fail on dropped
columns, per my gripe earlier today.  Make it look a bit less like
someone's first effort at backend coding.
2007-07-13 03:43:23 +00:00
Tom Lane 04b54876b6 Fix a portability bug (ye olde not casting a <ctype.h> argument to
unsigned char).  Fortunately we still have buildfarm machines that
will flag this.  Seems to be new in CVS HEAD, so no backpatch.
2007-07-12 23:51:10 +00:00
Tom Lane 706754c16b Compute max and min int8 values using unsigned arithmetic, in hopes of
suppressing Sun Studio compiler warnings.  Per Stefan.
2007-07-12 21:04:45 +00:00
Tom Lane 6244c2dfff Fix stddev_pop(numeric) and var_pop(numeric), which were incorrectly producing
the same outputs as stddev_samp() and var_samp() respectively.
2007-07-09 16:13:57 +00:00
Tom Lane 7af3a6fc6f Fix up hash functions for datetime datatypes so that they don't take
unwarranted liberties with int8 vs float8 values for these types.
Specifically, be sure to apply either hashint8 or hashfloat8 depending
on HAVE_INT64_TIMESTAMP.  Per my gripe of even date.
2007-07-06 04:16:00 +00:00
Tom Lane 6faf795662 Fix a passel of ancient bugs in to_char(), including two distinct buffer
overruns (neither of which seem likely to be exploitable as security holes,
fortunately, since the provoker can't control the data written).  One of
these is due to choosing to stomp on the output of a called function, which
is bad news in any case; make it treat the called functions' results as
read-only.  Avoid some unnecessary palloc/pfree traffic too; it's not
really helpful to free small temporary objects, and again this is presuming
more than it ought to about the nature of the results of called functions.
Per report from Patrick Welche and additional code-reading by Imad.
2007-06-29 01:51:35 +00:00
Tom Lane 867e2c91a0 Implement "distributed" checkpoints in which the checkpoint I/O is spread
over a fairly long period of time, rather than being spat out in a burst.
This happens only for background checkpoints carried out by the bgwriter;
other cases, such as a shutdown checkpoint, are still done at full speed.

Remove the "all buffers" scan in the bgwriter, and associated stats
infrastructure, since this seems no longer very useful when the checkpoint
itself is properly throttled.

Original patch by Itagaki Takahiro, reworked by Heikki Linnakangas,
and some minor API editorialization by me.
2007-06-28 00:02:40 +00:00
Alvaro Herrera 80f3b5ad2e Remove unused "caller" argument from stringToQualifiedNameList. 2007-06-26 16:48:09 +00:00
Tom Lane 4c310eca2e Arrange for quote_identifier() and pg_dump to not quote keywords that are
unreserved according to the grammar.  The list of unreserved words has gotten
extensive enough that the unnecessary quoting is becoming a bit of an eyesore.
To do this, add knowledge of the keyword category to keywords.c's table.
(Someday we might be able to generate keywords.c's table and the keyword lists
in gram.y from a common source.)  For the moment, lie about WITH's status in
the table so it will still get quoted --- this is because of the expectation
that WITH will become reserved when the SQL recursive-queries patch gets done.

I didn't force initdb because this affects nothing on-disk; but note that a
few regression tests have changed expected output.
2007-06-18 21:40:58 +00:00
Tom Lane 23347231a5 Tweak the API for per-datatype typmodin functions so that they are passed
an array of strings rather than an array of integers, and allow any simple
constant or identifier to be used in typmods; for example
	create table foo (f1 widget(42,'23skidoo',point));
Of course the typmodin function has still got to pack this info into a
non-negative int32 for storage, but it's still a useful improvement in
flexibility, especially considering that you can do nearly anything if you
are willing to keep the info in a side table.  We can get away with this
change since we have not yet released a version providing user-definable
typmods.  Per discussion.
2007-06-15 20:56:52 +00:00
Tom Lane d0599994da Fix DecodeDateTime to allow timezone to appear before year. This had
historically worked in some but not all cases, but as of 8.2 it failed for all
timezone formats.  Fix, and add regression test cases to catch future
regressions in this area.  Per gripe from Adam Witney.
2007-06-12 15:58:32 +00:00
Tom Lane a9545b3aef Improve UPDATE/DELETE WHERE CURRENT OF so that they can be used from plpgsql
with a plpgsql-defined cursor.  The underlying mechanism for this is that the
main SQL engine will now take "WHERE CURRENT OF $n" where $n is a refcursor
parameter.  Not sure if we should document that fact or consider it an
implementation detail.  Per discussion with Pavel Stehule.
2007-06-11 22:22:42 +00:00
Tom Lane 6808f1b1de Support UPDATE/DELETE WHERE CURRENT OF cursor_name, per SQL standard.
Along the way, allow FOR UPDATE in non-WITH-HOLD cursors; there may once
have been a reason to disallow that, but it seems to work now, and it's
really rather necessary if you want to select a row via a cursor and then
update it in a concurrent-safe fashion.

Original patch by Arul Shaji, rather heavily editorialized by Tom Lane.
2007-06-11 01:16:30 +00:00
Tom Lane e17e40f783 Allow numeric_fac() to be interrupted, since it can take quite a while for
large inputs.  Also cause it to error out immediately if the result will
overflow, instead of grinding through a lot of calculation first.
Per gripe from Jim Nasby.
2007-06-09 15:52:30 +00:00
Tom Lane 2d4db3675f Fix up text concatenation so that it accepts all the reasonable cases that
were accepted by prior Postgres releases.  This takes care of the loose end
left by the preceding patch to downgrade implicit casts-to-text.  To avoid
breaking desirable behavior for array concatenation, introduce a new
polymorphic pseudo-type "anynonarray" --- the added concatenation operators
are actually text || anynonarray and anynonarray || text.
2007-06-06 23:00:50 +00:00
Tom Lane 31edbadf4a Downgrade implicit casts to text to be assignment-only, except for the ones
from the other string-category types; this eliminates a lot of surprising
interpretations that the parser could formerly make when there was no directly
applicable operator.

Create a general mechanism that supports casts to and from the standard string
types (text,varchar,bpchar) for *every* datatype, by invoking the datatype's
I/O functions.  These new casts are assignment-only in the to-string direction,
explicit-only in the other, and therefore should create no surprising behavior.
Remove a bunch of thereby-obsoleted datatype-specific casting functions.

The "general mechanism" is a new expression node type CoerceViaIO that can
actually convert between *any* two datatypes if their external text
representations are compatible.  This is more general than needed for the
immediate feature, but might be useful in plpgsql or other places in future.

This commit does nothing about the issue that applying the concatenation
operator || to non-text types will now fail, often with strange error messages
due to misinterpreting the operator as array concatenation.  Since it often
(not always) worked before, we should either make it succeed or at least give
a more user-friendly error; but details are still under debate.

Peter Eisentraut and Tom Lane
2007-06-05 21:31:09 +00:00
Tom Lane 376ee15033 Fix erroneous error reporting for overlength input in text_date(),
text_time(), and text_timetz().  7.4-vintage bug found by Greg Stark.
2007-06-02 16:41:09 +00:00
Andrew Dunstan 15f8202c20 Improve efficiency of LIKE/ILIKE code, especially for multi-byte charsets,
and most especially for UTF8. Remove unnecessary special cases for bytea
processing and single-byte charset ILIKE.  a ILIKE b is now processed as
lower(a) LIKE lower(b) in all cases. The code is now considerably simpler. All
comparisons are now performed byte-wise, and the text and pattern are also
advanced byte-wise where it is safe to do so - essentially where a wildcard is
not being matched.
Andrew Dunstan, from an original patch by ITAGAKI Takahiro, with ideas from
Tom Lane and Mark Mielke.
2007-06-02 02:03:42 +00:00
Neil Conway f086be3d39 Allow leading and trailing whitespace in the input to the boolean
type. Also, add explicit casts between boolean and text/varchar. Both
of these changes are for conformance with SQL:2003.

Update the regression tests, bump the catversion.
2007-06-01 23:40:19 +00:00
Neil Conway f14f27dd38 Tweak: use memcpy() in text_time(), rather than manually copying bytes
in a loop.
2007-05-30 19:38:05 +00:00
Neil Conway 6af04882de Fix a bug in input processing for the "interval" type. Previously,
"microsecond" and "millisecond" units were not considered valid input
by themselves, which caused inputs like "1 millisecond" to be rejected
erroneously.

Update the docs, add regression tests, and backport to 8.2 and 8.1
2007-05-29 04:58:43 +00:00