Commit Graph

2271 Commits

Author SHA1 Message Date
Tom Lane 7f3eba30c9 When estimating without benefit of MCV lists (suggesting that one or both
inputs is unique or nearly so), make eqjoinsel() clamp the ndistinct estimates
to be not more than the estimated number of rows coming from the input
relations.  This allows the estimate to change in response to the selectivity
of restriction conditions on the inputs.

This is a pretty narrow patch and maybe we should be more aggressive about
similarly clamping ndistinct in other cases; but I'm worried about
double-counting the effects of the restriction conditions.  However, it seems
to help for the case exhibited by Grzegorz Jaskiewicz (antijoin against a
small subset of a relation), so let's try this for awhile.
2008-10-23 00:24:50 +00:00
Tom Lane e6ae3b5dbf Add a concept of "placeholder" variables to the planner. These are variables
that represent some expression that we desire to compute below the top level
of the plan, and then let that value "bubble up" as though it were a plain
Var (ie, a column value).

The immediate application is to allow sub-selects to be flattened even when
they are below an outer join and have non-nullable output expressions.
Formerly we couldn't flatten because such an expression wouldn't properly
go to NULL when evaluated above the outer join.  Now, we wrap it in a
PlaceHolderVar and arrange for the actual evaluation to occur below the outer
join.  When the resulting Var bubbles up through the join, it will be set to
NULL if necessary, yielding the correct results.  This fixes a planner
limitation that's existed since 7.1.

In future we might want to use this mechanism to re-introduce some form of
Hellerstein's "expensive functions" optimization, ie place the evaluation of
an expensive function at the most suitable point in the plan tree.
2008-10-21 20:42:53 +00:00
Tom Lane a303e4dc43 Extend the date type to support infinity and -infinity, analogously to
the timestamp types.  Turns out this doesn't even reduce the available
range of dates, since the restriction to dates that work for Julian-date
arithmetic is much tighter than the int32 range anyway.  Per a longstanding
TODO item.
2008-10-14 17:12:33 +00:00
Tom Lane 791359fe0e Fix EncodeSpecialTimestamp to throw error on unrecognized input, rather than
returning a failure code that none of its callers bothered to check for.
2008-10-14 15:44:29 +00:00
Tom Lane e3b0117459 Implement comparison of generic records (composite types), and invent a
pseudo-type record[] to represent arrays of possibly-anonymous composite
types.  Since composite datums carry their own type identification, no
extra knowledge is needed at the array level.

The main reason for doing this right now is that it is necessary to support
the general case of detection of cycles in recursive queries: if you need to
compare more than one column to detect a cycle, you need to compare a ROW()
to an array built from ROW()s, at least if you want to do it as the spec
suggests.  Add some documentation and regression tests concerning the cycle
detection issue.
2008-10-13 16:25:20 +00:00
Tom Lane 1b0f58a9ce Fix crash in bytea-to-XML mapping when the source value is toasted.
Report and fix by Michael McMaster.  Some minor code beautification by me,
also avoid memory leaks in the special-case paths.
2008-10-09 15:49:04 +00:00
Tom Lane 742fd06d98 Fix up ruleutils.c for CTE features. The main problem was that
get_name_for_var_field didn't have enough context to interpret a reference to
a CTE query's output.  Fixing this requires separate hacks for the regular
deparse case (pg_get_ruledef) and for the EXPLAIN case, since the available
context information is quite different.  It's pretty nearly parallel to the
existing code for SUBQUERY RTEs, though.  Also, add code to make sure we
qualify a relation name that matches a CTE name; else the CTE will mistakenly
capture the reference when reloading the rule.

In passing, fix a pre-existing problem with get_name_for_var_field not working
on variables in targetlists of SubqueryScan plan nodes.  Although latent all
along, this wasn't a problem until we made EXPLAIN VERBOSE try to print
targetlists.  To do this, refactor the deparse_context_for_plan API so that
the special case for SubqueryScan is all on ruleutils.c's side.
2008-10-06 20:29:38 +00:00
Tom Lane bf461538e1 When expanding a whole-row Var into a RowExpr during ResolveNew(), attach
the column alias names of the RTE referenced by the Var to the RowExpr.
This is needed to allow ruleutils.c to correctly deparse FieldSelect nodes
referencing such a construct.  Per my recent bug report.

Adding a field to RowExpr forces initdb (because of stored rules changes)
so this solution is not back-patchable; which is unfortunate because 8.2
and 8.3 have this issue.  But it only affects EXPLAIN for some pretty odd
corner cases, so we can probably live without a solution for the back
branches.
2008-10-06 17:39:26 +00:00
Heikki Linnakangas 5f853c6556 Use fork names instead of numbers in the file names for additional
relation forks. While the file names are not visible to users, for those
that do peek into the data directory, it's nice to have more descriptive
names. Per Greg Stark's suggestion.
2008-10-06 14:13:17 +00:00
Tom Lane 557faa4fb3 Random speculation about the reason for PPC64 buildfarm failures:
maybe isalnum is returning a value with the low-order byte all zero?
2008-10-06 05:03:27 +00:00
Tom Lane 8acfc7594d Tweak the overflow checks in integer division functions to complain if the
machine produces zero (rather than the more usual minimum-possible-integer)
for the only possible overflow case.  This has been seen to occur for at least
some word widths on some hardware, and it's cheap enough to check for
everywhere.  Per Peter's analysis of buildfarm reports.

This could be back-patched, but in the absence of any gripes from the field
I doubt it's worth the trouble.
2008-10-05 23:18:37 +00:00
Peter Eisentraut 2cf8afe5d1 Remove obsolete internal functions istrue, isfalse, isnottrue, isnotfalse,
nullvalue, nonvalue.  A long time ago, these were used to implement the SQL
constructs IS TRUE, etc.
2008-10-05 17:33:17 +00:00
Tom Lane 44d5be0e53 Implement SQL-standard WITH clauses, including WITH RECURSIVE.
There are some unimplemented aspects: recursive queries must use UNION ALL
(should allow UNION too), and we don't have SEARCH or CYCLE clauses.
These might or might not get done for 8.4, but even without them it's a
pretty useful feature.

There are also a couple of small loose ends and definitional quibbles,
which I'll send a memo about to pgsql-hackers shortly.  But let's land
the patch now so we can get on with other development.

Yoshiyuki Asaba, with lots of help from Tatsuo Ishii and Tom Lane
2008-10-04 21:56:55 +00:00
Heikki Linnakangas 706a308806 Add relation fork support to pg_relation_size() function. You can now pass
name of a fork ('main' or 'fsm', at the moment) to pg_relation_size() to
get the size of a specific fork. Defaults to 'main', if none given.

While we're at it, modify pg_relation_size to take a regclass as argument,
instead of separate variants taking oid and name. This change is
transparent to typical use where the table name is passed as a string
literal, like pg_relation_size('table'), but will break queries like
pg_relation_size(namecol), where namecol is of type name. text-type input
still works, and using a non-schema-qualified table name is not very
reliable anyway, so this is unlikely to break anyone's queries in practice.
2008-10-03 07:33:10 +00:00
Tom Lane 607b39855a Fix improper display of fractional seconds in interval values
when using --enable-integer-datetimes and a non-ISO datestyle.

Ron Mayer
2008-10-02 13:47:38 +00:00
Tom Lane 2dbc0ca937 Dept of second thoughts: let's make sure that get_index_stats_hook is only
applied to expression indexes, not to plain relations.  The original coding
in btcostestimate conflated the two cases, but it's not hard to use
get_relation_stats_hook instead when we're looking to the underlying relation.
2008-09-28 20:42:12 +00:00
Tom Lane 7b7df9f0b1 Add hooks to let plugins override the planner's lookups in pg_statistic.
Simon Riggs, with some editorialization by me.
2008-09-28 19:51:40 +00:00
Andrew Dunstan bc965e840a Compare escaped chars case insensitively for ILIKE - per gripe from TGL. 2008-09-27 16:53:54 +00:00
Tom Lane b1e929f295 Fix pointer-advancement bugs in MS and US cases of new to_timestamp() code.
Alex Hunsaker
2008-09-26 15:35:28 +00:00
Tom Lane 3d8fd75732 Make LIKE throw an error if the escape character is at the end of the pattern
(ie, has nothing to quote), rather than silently ignoring the character as has
been our historical behavior.  This is required by SQL spec and should help
reduce the sort of user confusion seen in bug #4436.  Per discussion.

This is not so much a bug fix as a definitional change, and it could break
existing applications; so not back-patched.  It might deserve being mentioned
as an incompatibility in the 8.4 release notes.
2008-09-26 02:16:40 +00:00
Bruce Momjian fb4bb8b9c5 Fix integral timestamps so the output is consistent in all cases to
round:

	select interval '0:0:0.7', interval '@ 0.70 secs', interval '0.7
		seconds';

Ron Mayer
2008-09-24 19:46:44 +00:00
Heikki Linnakangas 61d9674988 Make LC_COLLATE and LC_CTYPE database-level settings. Collation and
ctype are now more like encoding, stored in new datcollate and datctype
columns in pg_database.

This is a stripped-down version of Radek Strnad's patch, with further
changes by me.
2008-09-23 09:20:39 +00:00
Tom Lane b73c0c2a51 Clean up a couple of weird corner cases in interval parsing: make -yyyy-mm be
interpreted as expected (the sign should affect months too), and get rid of
hard-wired assumption that unmarked signed values must be hours (if integers)
or seconds (if floats).  The former was just a bug in my previous patch,
while the latter may have made sense at one time but seems illogical now
that we support determination of the units from typmod information.
Ron Mayer and myself.
2008-09-16 22:31:21 +00:00
Tom Lane 8948ee37e5 Fix multiple memory leaks in xml_out(). Per report from Matt Magoffin. 2008-09-16 00:49:41 +00:00
Tom Lane 1cd935609f Fix caching of foreign-key-checking queries so that when a replan is needed,
we regenerate the SQL query text not merely the plan derived from it.  This
is needed to handle contingencies such as renaming of a table or column
used in an FK.  Pre-8.3, such cases worked despite the lack of replanning
(because the cached plan needn't actually change), so this is a regression.
Per bug #4417 from Benjamin Bihler.
2008-09-15 23:37:40 +00:00
Tom Lane 06edce4c3f Tighten up to_date/to_timestamp so that they are more likely to reject
erroneous input, rather than silently producing bizarre results as formerly
happened.

Brendan Jurd
2008-09-11 17:32:34 +00:00
Tom Lane 70530c808b Adjust the parser to accept the typename syntax INTERVAL ... SECOND(n)
and the literal syntax INTERVAL 'string' ... SECOND(n), as required by the
SQL standard.  Our old syntax put (n) directly after INTERVAL, which was
a mistake, but will still be accepted for backward compatibility as well
as symmetry with the TIMESTAMP cases.

Change intervaltypmodout to show it in the spec's way, too.  (This could
potentially affect clients, if there are any that analyze the typmod of an
INTERVAL in any detail.)

Also fix interval input to handle 'min:sec.frac' properly; I had overlooked
this case in my previous patch.

Document the use of the interval fields qualifier, which up to now we had
never mentioned in the docs.  (I think the omission was intentional because
it didn't work per spec; but it does now, or at least close enough to be
credible.)
2008-09-11 15:27:30 +00:00
Tom Lane f867339c01 Make our parsing of INTERVAL literals spec-compliant (or at least a heck of
a lot closer than it was before).  To do this, tweak coerce_type() to pass
through the typmod information when invoking interval_in() on an UNKNOWN
constant; then fix DecodeInterval to pay attention to the typmod when deciding
how to interpret a units-less integer value.  I changed one or two other
details as well.  I believe the code now reacts as expected by spec for all
the literal syntaxes that are specifically enumerated in the spec.  There
are corner cases involving strings that don't exactly match the set of fields
called out by the typmod, for which we might want to tweak the behavior some
more; but I think this is an area of user friendliness rather than spec
compliance.  There remain some non-compliant details about the SQL syntax
(as opposed to what's inside the literal string); but at least we'll throw
error rather than silently doing the wrong thing in those cases.
2008-09-10 18:29:41 +00:00
Tom Lane ee33b95d9c Improve the plan cache invalidation mechanism to make it invalidate plans
when user-defined functions used in a plan are modified.  Also invalidate
plans when schemas, operators, or operator classes are modified; but for these
cases we just invalidate everything rather than tracking exact dependencies,
since these types of objects seldom change in a production database.

Tom Lane; loosely based on a patch by Martin Pihlak.
2008-09-09 18:58:09 +00:00
Tom Lane a0b76dc662 Create a separate grantable privilege for TRUNCATE, rather than having it be
always owner-only.  The TRUNCATE privilege works identically to the DELETE
privilege so far as interactions with the rest of the system go.

Robert Haas
2008-09-08 00:47:41 +00:00
Tom Lane e6a310b281 Reimplement text_position and related functions to use Boyer-Moore-Horspool
searching instead of naive matching.  In the worst case this has the same
O(M*N) complexity as the naive method, but the worst case is hard to hit,
and the average case is very fast, especially with longer patterns.

David Rowley
2008-09-07 04:20:00 +00:00
Tom Lane 409c144d83 Adjust psql's new \ef command to present an empty CREATE FUNCTION template
for editing if no function name is specified.  This seems a much cleaner way
to offer that functionality than the original patch had.  In passing,
de-clutter the error displays that are given for a bogus function-name
argument, and standardize on "$function$" as the default delimiter for the
function body.  (The original coding would use the shortest possible
dollar-quote delimiter, which seems to create unnecessarily high risk of
later conflicts with the user-modified function body.)
2008-09-06 20:18:08 +00:00
Tom Lane 2c863ca818 Implement a psql command "\ef" to edit the definition of a function.
In support of that, create a backend function pg_get_functiondef().
The psql command is functional but maybe a bit rough around the edges...

Abhijit Menon-Sen
2008-09-06 00:01:25 +00:00
Tom Lane b153c09209 Add a bunch of new error location reports to parse-analysis error messages.
There are still some weak spots around JOIN USING and relation alias lists,
but most errors reported within backend/parser/ now have locations.
2008-09-01 20:42:46 +00:00
Tom Lane e5536e77a5 Move exprType(), exprTypmod(), expression_tree_walker(), and related routines
into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside
the backend.  There's probably more that should be done along this line,
but this is a start anyway.
2008-08-25 22:42:34 +00:00
Bruce Momjian 6152de97d3 Minor patch on pgbench
1. -i option should run vacuum analyze only on pgbench tables, not *all*
tables in database.

2. pre-run cleanup step was DELETE FROM HISTORY then VACUUM HISTORY.
This is just a slow version of TRUNCATE HISTORY.

Simon Riggs
2008-08-22 17:57:34 +00:00
Tom Lane bd3daddaf2 Arrange to convert EXISTS subqueries that are equivalent to hashable IN
subqueries into the same thing you'd have gotten from IN (except always with
unknownEqFalse = true, so as to get the proper semantics for an EXISTS).
I believe this fixes the last case within CVS HEAD in which an EXISTS could
give worse performance than an equivalent IN subquery.

The tricky part of this is that if the upper query probes the EXISTS for only
a few rows, the hashing implementation can actually be worse than the default,
and therefore we need to make a cost-based decision about which way to use.
But at the time when the planner generates plans for subqueries, it doesn't
really know how many times the subquery will be executed.  The least invasive
solution seems to be to generate both plans and postpone the choice until
execution.  Therefore, in a query that has been optimized this way, EXPLAIN
will show two subplans for the EXISTS, of which only one will actually get
executed.

There is a lot more that could be done based on this infrastructure: in
particular it's interesting to consider switching to the hash plan if we start
out using the non-hashed plan but find a lot more upper rows going by than we
expected.  I have therefore left some minor inefficiencies in place, such as
initializing both subplans even though we will currently only use one.
2008-08-22 00:16:04 +00:00
Tom Lane d4af2a6481 Clean up the loose ends in selectivity estimation left by my patch for semi
and anti joins.  To do this, pass the SpecialJoinInfo struct for the current
join as an additional optional argument to operator join selectivity
estimation functions.  This allows the estimator to tell not only what kind
of join is being formed, but which variable is on which side of the join;
a requirement long recognized but not dealt with till now.  This also leaves
the door open for future improvements in the estimators, such as accounting
for the null-insertion effects of lower outer joins.  I didn't do anything
about that in the current patch but the information is in principle deducible
from what's passed.

The patch also clarifies the definition of join selectivity for semi/anti
joins: it's the fraction of the left input that has (at least one) match
in the right input.  This allows getting rid of some very fuzzy thinking
that I had committed in the original 7.4-era IN-optimization patch.
There's probably room to estimate this better than the present patch does,
but at least we know what to estimate.

Since I had to touch CREATE OPERATOR anyway to allow a variant signature
for join estimator functions, I took the opportunity to add a couple of
additional checks that were missing, per my recent message to -hackers:
* Check that estimator functions return float8;
* Require execute permission at the time of CREATE OPERATOR on the
operator's function as well as the estimator functions;
* Require ownership of any pre-existing operator that's modified by
the command.
I also moved the lookup of the functions out of OperatorCreate() and
into operatorcmds.c, since that seemed more consistent with most of
the other catalog object creation processes, eg CREATE TYPE.
2008-08-16 00:01:38 +00:00
Tom Lane e006a24ad1 Implement SEMI and ANTI joins in the planner and executor. (Semijoins replace
the old JOIN_IN code, but antijoins are new functionality.)  Teach the planner
to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti
joins respectively.  Also, LEFT JOINs with suitable upper-level IS NULL
filters are recognized as being anti joins.  Unify the InClauseInfo and
OuterJoinInfo infrastructure into "SpecialJoinInfo".  With that change,
it becomes possible to associate a SpecialJoinInfo with every join attempt,
which permits some cleanup of join selectivity estimation.  That needs to be
taken much further than this patch does, but the next step is to change the
API for oprjoin selectivity functions, which seems like material for a
separate patch.  So for the moment the output size estimates for semi and
especially anti joins are quite bogus.
2008-08-14 18:48:00 +00:00
Heikki Linnakangas 3f0e808c4a Introduce the concept of relation forks. An smgr relation can now consist
of multiple forks, and each fork can be created and grown separately.

The bulk of this patch is about changing the smgr API to include an extra
ForkNumber argument in every smgr function. Also, smgrscheduleunlink and
smgrdounlink no longer implicitly call smgrclose, because other forks might
still exist after unlinking one. The callers of those functions have been
modified to call smgrclose instead.

This patch in itself doesn't have any user-visible effect, but provides the
infrastructure needed for upcoming patches. The additional forks envisioned
are a rewritten FSM implementation that doesn't rely on a fixed-size shared
memory block, and a visibility map to allow skipping portions of a table in
VACUUM that have no dead tuples.
2008-08-11 11:05:11 +00:00
Tom Lane 9511304752 Rearrange the querytree representation of ORDER BY/GROUP BY/DISTINCT items
as per my recent proposal:

1. Fold SortClause and GroupClause into a single node type SortGroupClause.
We were already relying on them to be struct-equivalent, so using two node
tags wasn't accomplishing much except to get in the way of comparing items
with equal().

2. Add an "eqop" field to SortGroupClause to carry the associated equality
operator.  This is cheap for the parser to get at the same time it's looking
up the sort operator, and storing it eliminates the need for repeated
not-so-cheap lookups during planning.  In future this will also let us
represent GROUP/DISTINCT operations on datatypes that have hash opclasses
but no btree opclasses (ie, they have equality but no natural sort order).
The previous representation simply didn't work for that, since its only
indicator of comparison semantics was a sort operator.

3. Add a hasDistinctOn boolean to struct Query to explicitly record whether
the distinctClause came from DISTINCT or DISTINCT ON.  This allows removing
some complicated and not 100% bulletproof code that attempted to figure
that out from the distinctClause alone.

This patch doesn't in itself create any new capability, but it's necessary
infrastructure for future attempts to use hash-based grouping for DISTINCT
and UNION/INTERSECT/EXCEPT.
2008-08-02 21:32:01 +00:00
Tom Lane 5618ece82b Code review for array_fill patch: fix inadequate check for array size overflow
and bogus documentation (dimension arrays are int[] not anyarray).  Also the
errhint() messages seem to be really errdetail(), since there is nothing
heuristic about them.  Some other trivial cosmetic improvements.
2008-07-21 04:47:00 +00:00
Tom Lane 69a785b8bf Implement SQL-spec RETURNS TABLE syntax for functions.
(Unlike the original submission, this patch treats TABLE output parameters
as being entirely equivalent to OUT parameters -- tgl)

Pavel Stehule
2008-07-18 03:32:53 +00:00
Tom Lane 6563e9e2e8 Add a "provariadic" column to pg_proc to eliminate the remarkably expensive
need to deconstruct proargmodes for each pg_proc entry inspected by
FuncnameGetCandidates().  Fixes function lookup performance regression
caused by yesterday's variadic-functions patch.

In passing, make pg_proc.probin be NULL, rather than a dummy value '-',
in cases where it is not actually used for the particular type of function.
This should buy back some of the space cost of the extra column.
2008-07-16 16:55:24 +00:00
Tom Lane d89737d31c Support "variadic" functions, which can accept a variable number of arguments
so long as all the trailing arguments are of the same (non-array) type.
The function receives them as a single array argument (which is why they
have to all be the same type).

It might be useful to extend this facility to aggregates, but this patch
doesn't do that.

This patch imposes a noticeable slowdown on function lookup --- a follow-on
patch will fix that by adding a redundant column to pg_proc.

Pavel Stehule
2008-07-16 01:30:23 +00:00
Bruce Momjian 2c773296f8 Add array_fill() to create arrays initialized with a value.
Pavel Stehule
2008-07-16 00:48:54 +00:00
Tom Lane 960af47efd Const-ify the arguments of str_tolower() and friends to suppress compile
warnings.  Clean up various unneeded cruft that was left behind after
creating those routines.  Introduce some convenience functions str_tolower_z
etc to eliminate tedious and error-prone double arguments in formatting.c.
(Currently there seems no need to export the latter, but maybe reconsider
this later.)
2008-07-12 00:44:38 +00:00
Tom Lane 170063cd1e Fix estimate_num_groups() to assume that GROUP BY expressions yielding boolean
results always contribute two groups, regardless of the expression contents.
This is very substantially more accurate than the regular heuristic for
certain boolean tests like "col IS NULL".  Per gripe from Sam Mason.

Back-patch to all supported releases, since the behavior of
estimate_num_groups() hasn't changed all that much since 7.4.
2008-07-07 20:24:55 +00:00
Tom Lane c50838533b Fix AT TIME ZONE (in all three variants) so that we first try to interpret
the timezone argument as a timezone abbreviation, and only try it as a full
timezone name if that fails.  The zic database has four zones (CET, EET, MET,
WET) that are full daylight-savings zones and yet have names that are the
same as their abbreviations for standard time, resulting in ambiguity.
In the timestamp input functions we resolve the ambiguity by preferring the
abbreviation, and AT TIME ZONE should work the same way.  (No functionality
is lost because the zic database also has other names for these zones, eg
Europe/Zurich.)  Per gripe from Jaromir Talir.

Backpatch to 8.1.  Older releases did not have the issue because AT TIME ZONE
only accepted abbreviations not zone names.  (Thus, this patch also arguably
fixes a compatibility botch introduced at 8.1: in ambiguous cases we now
behave the same as 8.0 did.)
2008-07-07 18:09:46 +00:00
Tom Lane c63147d6f0 Add a function pg_get_keywords() to let clients find out the set of keywords
known to the SQL parser.  Dave Page
2008-07-03 20:58:47 +00:00
Tom Lane c5f4b98fae Fix transaction-lifespan memory leak in xpath(). Report by Matt Magoffin,
fix by Kris Jurka.
2008-07-03 00:04:24 +00:00
Teodor Sigaev 5ff9899933 Fix bug "select lower('asd') = 'asd'" returns false with multibyte encoding
and non-C locale. Fix is just to use correct source's length for char2wchar
call.
2008-06-26 16:06:37 +00:00
Tom Lane 5f6f840e93 Reduce the alignment requirement of type "name" from int to char, and arrange
to suppress zero-padding of "name" entries in indexes.

The alignment change is unlikely to save any space, but it is really needed
anyway to make the world safe for our widespread practice of passing plain
old C strings to functions that are declared as taking Name.  In the previous
coding, the C compiler was entitled to assume that a Name pointer was
word-aligned; but we were failing to guarantee that.  I think the reason
we'd not seen failures is that usually the only thing that gets done with
such a pointer is strcmp(), which is hard to optimize in a way that exploits
word-alignment.  Still, some enterprising compiler guy will probably think
of a way eventually, or we might change our code in a way that exposes
more-obvious optimization opportunities.

The padding change is accomplished in one-liner fashion by declaring the
"name" index opclasses to use storage type "cstring" in pg_opclass.h.
Normally btree and hash don't allow a nondefault storage type, because they
don't have any provisions for converting the input datum to another type.
However, because name and cstring are effectively the same thing except for
padding, no conversion is needed --- we only need index_form_tuple() to treat
the datum as being cstring not name, and this is sufficient.  This seems to
make for about a one-third reduction in the typical sizes of system catalog
indexes that involve "name" columns, of which we have many.

These two changes are only weakly related, but the alignment change makes
me feel safer that the padding change won't introduce problems, so I'm
committing them together.
2008-06-24 17:58:27 +00:00
Bruce Momjian f6ec7430f9 Merge duplicate upper/lower/initcap() routines in oracle_compat.c and
formatting.c to use common code;  remove duplicate functions and support
routines that are no longer needed.
2008-06-23 19:27:19 +00:00
Alvaro Herrera a3540b0f65 Improve our #include situation by moving pointer types away from the
corresponding struct definitions.  This allows other headers to avoid including
certain highly-loaded headers such as rel.h and relscan.h, instead using just
relcache.h, heapam.h or genam.h, which are more lightweight and thus cause less
unnecessary dependencies.
2008-06-19 00:46:06 +00:00
Tom Lane b163baa89c Clean up some problems with redundant cross-type arithmetic operators. Add
int2-and-int8 implementations of the basic arithmetic operators +, -, *, /.
This doesn't really add any new functionality, but it avoids "operator is not
unique" failures that formerly occurred in these cases because the parser
couldn't decide whether to promote the int2 to int4 or int8.  We could
alternatively have removed the existing cross-type operators, but
experimentation shows that the cost of an additional type coercion expression
node is noticeable compared to such cheap operators; so let's not give up any
performance here.  On the other hand, I removed the int2-and-int4 modulo (%)
operators since they didn't seem as important from a performance standpoint.
Per a complaint last January from ykhuang.
2008-06-17 19:10:56 +00:00
Bruce Momjian dc69c0362f Move USE_WIDE_UPPER_LOWER define to c.h, and remove TS_USE_WIDE and use
USE_WIDE_UPPER_LOWER instead.
2008-06-17 16:09:06 +00:00
Tom Lane 0b510ad920 Fix unportable (and incorrect anyway) usage of LL constant suffix that
recently snuck into cash.c.  Per report from Edmundo Robles Lopez.
2008-06-09 19:58:39 +00:00
Tom Lane 3a4e929b76 Fix datetime input functions to correctly detect integer overflow when
running on a 64-bit platform ... strtol() will happily return 64-bit
output in that case.  Per bug #4231 from Geoff Tolley.
2008-06-09 19:34:02 +00:00
Alvaro Herrera e4ca6cac43 Change xlog.h to xlogdefs.h in bufpage.h, and fix fallout. 2008-06-06 22:35:22 +00:00
Tom Lane c1943dbaef Fix pg_get_ruledef() so that negative numeric constants are parenthesized.
This is needed because :: casting binds more tightly than minus, so for
example -1::integer is not the same as (-1)::integer, and there are cases
where the difference is important.  In particular this caused a failure
in SELECT DISTINCT ... ORDER BY ... where expressions that should have
matched were seen as different by the parser; but I suspect that there
could be other cases where failure to parenthesize leads to subtler
semantic differences in reloaded rules.  Per report from Alexandr Popov.
2008-06-06 17:59:29 +00:00
Tom Lane 7b8a63c3e9 Alter the xxx_pattern_ops opclasses to use the regular equality operator of
the associated datatype as their equality member.  This means that these
opclasses can now support plain equality comparisons along with LIKE tests,
thus avoiding the need for an extra index in some applications.  This
optimization was not possible when the pattern opclasses were first introduced,
because we didn't insist that text equality meant bitwise equality; but we
do now, so there is no semantic difference between regular and pattern
equality operators.

I removed the name_pattern_ops opclass altogether, since it's really useless:
name's regular comparisons are just strcmp() and are unlikely to become
something different.  Instead teach indxpath.c that btree name_ops can be
used for LIKE whether or not the locale is C.  This might lead to a useful
speedup in LIKE queries on the system catalogs in non-C locales.

The ~=~ and ~<>~ operators are gone altogether.  (It would have been nice to
keep them for backward compatibility's sake, but since the pg_amop structure
doesn't allow multiple equality operators per opclass, there's no way.)

A not-immediately-obvious incompatibility is that the sort order within
bpchar_pattern_ops indexes changes --- it had been identical to plain
strcmp, but is now trailing-blank-insensitive.  This will impact
in-place upgrades, if those ever happen.

Per discussions a couple months ago.
2008-05-27 00:13:09 +00:00
Bruce Momjian 9f19470966 Simplify code in formatting.c now that to upper/lower/initcase do not
modify the passed string.
2008-05-20 01:41:02 +00:00
Tom Lane 07a5606735 Make to_char()'s localized month/day names depend on LC_TIME, not LC_MESSAGES.
Euler Taveira de Oliveira
2008-05-19 18:08:16 +00:00
Tom Lane 63e98b55f0 Coercion sanity check in ri_HashCompareOp failed to allow for enums, as per
example from Rod Taylor.  On reflection the correct test here is for any
polymorphic type, not specifically ANYARRAY as in the original coding.
2008-05-19 04:14:24 +00:00
Tom Lane e6dbcb72fa Extend GIN to support partial-match searches, and extend tsquery to support
prefix matching using this facility.

Teodor Sigaev and Oleg Bartunov
2008-05-16 16:31:02 +00:00
Tom Lane 93c701edc6 Add support for tracking call counts and elapsed runtime for user-defined
functions.

Note that because this patch changes FmgrInfo, any external C functions
you might be testing with 8.4 will need to be recompiled.

Patch by Martin Pihlak, some editorialization by me (principally, removing
tracking of getrusage() numbers)
2008-05-15 00:17:41 +00:00
Alvaro Herrera 5da9da71c4 Improve snapshot manager by keeping explicit track of snapshots.
There are two ways to track a snapshot: there's the "registered" list, which
is used for arbitrary long-lived snapshots; and there's the "active stack",
which is used for the snapshot that is considered "active" at any time.
This also allows users of snapshots to stop worrying about snapshot memory
allocation and freeing, and about using PG_TRY blocks around ActiveSnapshot
assignment.  This is all done automatically now.

As a consequence, this allows us to reset MyProc->xmin when there are no
more snapshots registered in the current backend, reducing the impact that
long-running transactions have on VACUUM.
2008-05-12 20:02:02 +00:00
Alvaro Herrera f8c4d7db60 Restructure some header files a bit, in particular heapam.h, by removing some
unnecessary #include lines in it.  Also, move some tuple routine prototypes and
macros to htup.h, which allows removal of heapam.h inclusion from some .c
files.

For this to work, a new header file access/sysattr.h needed to be created,
initially containing attribute numbers of system columns, for pg_dump usage.

While at it, make contrib ltree, intarray and hstore header files more
consistent with our header style.
2008-05-12 00:00:54 +00:00
Bruce Momjian f8df836ae3 Adjust power() error messages to be more descriptive. 2008-05-09 21:31:23 +00:00
Bruce Momjian 6e3e60095d Update C comments to mention SQL:2003 handling of power return values. 2008-05-09 15:36:06 +00:00
Bruce Momjian 4a586bd405 Add regression test for various power expressions with a zero base, and
adjust source code to be more modular.
2008-05-08 22:17:54 +00:00
Bruce Momjian 6b4e9d1654 Have numeric 0 ^ 4.3 return 1, rather than an error, and have 0 ^ 0.0
return 1, rather than error.

This was already the float8 behavior.
2008-05-08 19:25:38 +00:00
Magnus Hagander 0423de4d30 Make the pg_stat_activity view call a SRF (pg_stat_get_activity())
instead of calling a bunch of individual functions.

This function can also be called directly, taking a PID as an argument, to
return only the data for a single PID.
2008-05-07 14:41:56 +00:00
Tom Lane b6d15590f7 Add timestamp and timestamptz versions of generate_series().
Hitoshi Harada
2008-05-04 23:19:24 +00:00
Tom Lane 600da67fbe Add pg_conf_load_time() function to report when the Postgres configuration
files were last loaded.

George Gensure
2008-05-04 21:13:36 +00:00
Tom Lane 45173ae24e Use new cstring/text conversion functions in some additional places.
These changes assume that the varchar and xml data types are represented
the same as text.  (I did not, however, accept the portions of the proposed
patch that wanted to assume bytea is the same as text --- tgl.)

Brendan Jurd
2008-05-04 16:42:41 +00:00
Tom Lane c88850f4a0 The 8.2 patch that added support for an alias on the target table of
UPDATE/DELETE forgot to teach ruleutils.c to display the alias.
Per bug #4141 from Mathias Seiler.
2008-05-03 23:19:20 +00:00
Alvaro Herrera 1fcb977a13 Add generate_subscripts, a series-generation function which generates an
array's subscripts.

Pavel Stehule, some editorialization by me.
2008-04-28 14:48:58 +00:00
Tom Lane 8472bf7a73 Allow float8, int8, and related datatypes to be passed by value on machines
where Datum is 8 bytes wide.  Since this will break old-style C functions
(those still using version 0 calling convention) that have arguments or
results of these types, provide a configure option to disable it and retain
the old pass-by-reference behavior.  Likewise, provide a configure option
to disable the recently-committed float4 pass-by-value change.

Zoltan Boszormenyi, plus configurability stuff by me.
2008-04-21 00:26:47 +00:00
Teodor Sigaev be939544a6 Fix broken compare function for tsquery_ops. Per Tom's report.
I never understood why initial authors GiST in pgsql choose so
stgrange signature for 'same' method:
bool *sameFn(Datum a, Datum b, bool* result)
instead of simple, logical
bool sameFn(Datum a, Datum b)
This change will break any existing GiST extension, so we still live with
it and will live.
2008-04-20 09:17:57 +00:00
Bruce Momjian c4fd93b3f3 Re-enable pg_terminate_backend() using SIGTERM. SIGTERM testing still
needed.
2008-04-17 20:56:41 +00:00
Bruce Momjian 76365960d2 Revert addition of pg_terminate_backend() because of race conditions. 2008-04-15 20:28:47 +00:00
Bruce Momjian 18b286f3e3 Add pg_terminate_backend() to allow terminating only a single session. 2008-04-15 13:55:12 +00:00
Tom Lane 9b5c8d45f6 Push index operator lossiness determination down to GIST/GIN opclass
"consistent" functions, and remove pg_amop.opreqcheck, as per recent
discussion.  The main immediate benefit of this is that we no longer need
8.3's ugly hack of requiring @@@ rather than @@ to test weight-using tsquery
searches on GIN indexes.  In future it should be possible to optimize some
other queries better than is done now, by detecting at runtime whether the
index match is exact or not.

Tom Lane, after an idea of Heikki's, and with some help from Teodor.
2008-04-14 17:05:34 +00:00
Tom Lane 226837e57e Since createplan.c no longer cares whether index operators are lossy, it has
no particular need to do get_op_opfamily_properties() while building an
indexscan plan.  Postpone that lookup until executor start.  This simplifies
createplan.c a lot more than it complicates nodeIndexscan.c, and makes things
more uniform since we already had to do it that way for RowCompare
expressions.  Should be a bit faster too, at least for plans that aren't
re-used many times, since we avoid palloc'ing and perhaps copying the
intermediate list data structure.
2008-04-13 20:51:21 +00:00
Tom Lane ba1c463096 Clean up a few places where Datums were being treated as pointers without
going through DatumGetPointer or some other "official" conversion macro.
Not actually a bug, since Datum the same size as pointer is the only
supported case at the moment, but good cleanup for the future.

Gavin Sherry
2008-04-12 23:21:04 +00:00
Tom Lane c846f7ca8a Fix several datatype input functions that were allowing unused bytes in their
results to contain uninitialized, unpredictable values.  While this was okay
as far as the datatypes themselves were concerned, it's a problem for the
parser because occurrences of the "same" literal might not be recognized as
equal by datumIsEqual (and hence not by equal()).  It seems sufficient to fix
this in the input functions since the only critical use of equal() is in the
parser's comparisons of ORDER BY and DISTINCT expressions.
Per a trouble report from Marc Cousin.

Patch all the way back.  Interestingly, array_in did not have the bug before
8.2, which may explain why the issue went unnoticed for so long.
2008-04-11 22:52:05 +00:00
Tom Lane 635aaab278 Fix tsvector_update_trigger() to be domain-friendly: it needs to allow all
the columns it works with to be domains over the expected type, not just
exactly the expected type.  In passing, fix ts_stat() the same way.
Per report from Markus Wollny.
2008-04-08 18:20:29 +00:00
Tom Lane a0fad9762a Re-implement division for numeric values using the traditional "schoolbook"
algorithm.  This is a good deal slower than our old roundoff-error-prone
code for long inputs, so we keep the old code for use in the transcendental
functions, where everything is approximate anyway.  Also create a
user-accessible function div(numeric, numeric) to provide access to the
exact result of trunc(x/y) --- since the regular numeric / operator will
round off its result, simply computing that expression in SQL doesn't
reliably give the desired answer.  This fixes bug #3387 and various related
corner cases, and improves the usefulness of PG for high-precision integer
arithmetic.
2008-04-04 18:45:36 +00:00
Bruce Momjian f96928fde9 Implement current_query(), that shows the currently executing query.
At the same time remove dblink/dblink_current_query() as it is no longer
necessary
*BACKWARD COMPATIBILITY ISSUE* for dblink

Tomas Doran
2008-04-04 16:57:21 +00:00
Magnus Hagander d672ea6ffa Turn xmlbinary and xmloption GUC variables into enumsTurn xmlbinary and
xmloption GUC variables into enums..
2008-04-04 08:33:15 +00:00
Magnus Hagander ad6bf716ba Convert three more guc settings to enum type:
default_transaction_isolation, session_replication_role and regex_flavor.
2008-04-02 14:42:56 +00:00
Tom Lane c5f11f9d19 Fix a number of places that were making file-type tests infelicitously.
The places that did, eg,
	(statbuf.st_mode & S_IFMT) == S_IFDIR
were correct, but there is no good reason not to use S_ISDIR() instead,
especially when that's what the other 90% of our code does.  The places
that did, eg,
	(statbuf.st_mode & S_IFDIR)
were flat out *wrong* and would fail in various platform-specific ways,
eg a symlink could be mistaken for a regular file on most Unixen.

The actual impact of this is probably small, since the problem cases
seem to always involve symlinks or sockets, which are unlikely to be
found in the directories that PG code might be scanning.  But it's
clearly trouble waiting to happen, so patch all the way back anyway.
(There seem to be no occurrences of the mistake in 7.4.)
2008-03-31 01:31:43 +00:00
Tom Lane 7692d8d5b7 Support statement-level ON TRUNCATE triggers. Simon Riggs 2008-03-28 00:21:56 +00:00
Alvaro Herrera 73b0300b2a Move the HTSU_Result enum definition into snapshot.h, to avoid including
tqual.h into heapam.h.  This makes all inclusion of tqual.h explicit.

I also sorted alphabetically the includes on some source files.
2008-03-26 21:10:39 +00:00
Alvaro Herrera 78f02ca1f5 Rename snapmgmt.c/h to snapmgr.c/h, for consistency with other files.
Per complaint from Tom Lane.
2008-03-26 18:48:59 +00:00
Alvaro Herrera d43b085d57 Separate snapshot management code from tuple visibility code, create a
snapmgmt.c file for the former.  The header files have also been reorganized
in three parts: the most basic snapshot definitions are now in a new file
snapshot.h, and the also new snapmgmt.h keeps the definitions for snapmgmt.c.
tqual.h has been reduced to the bare minimum.

This patch is just a first step towards managing live snapshots within a
transaction; there is no functionality change.

Per my proposal to pgsql-patches on 20080318191940.GB27458@alvh.no-ip.org and
subsequent discussion.
2008-03-26 16:20:48 +00:00
Tom Lane 220db7ccd8 Simplify and standardize conversions between TEXT datums and ordinary C
strings.  This patch introduces four support functions cstring_to_text,
cstring_to_text_with_len, text_to_cstring, and text_to_cstring_buffer, and
two macros CStringGetTextDatum and TextDatumGetCString.  A number of
existing macros that provided variants on these themes were removed.

Most of the places that need to make such conversions now require just one
function or macro call, in place of the multiple notational layers that used
to be needed.  There are no longer any direct calls of textout or textin,
and we got most of the places that were using handmade conversions via
memcpy (there may be a few still lurking, though).

This commit doesn't make any serious effort to eliminate transient memory
leaks caused by detoasting toasted text objects before they reach
text_to_cstring.  We changed PG_GETARG_TEXT_P to PG_GETARG_TEXT_PP in a few
places where it was easy, but much more could be done.

Brendan Jurd and Tom Lane
2008-03-25 22:42:46 +00:00
Tom Lane 32b58d0220 Fix various infelicities that have snuck into usage of errdetail() and
friends.  Avoid double translation of some messages, ensure other messages
are exposed for translation (and make them follow the style guidelines),
avoid unsafe passing of an unpredictable message text as a format string.
2008-03-24 19:12:49 +00:00
Tom Lane 7de81124d5 Create a function quote_nullable(), which works the same as quote_literal()
except that it returns the string 'NULL', rather than a SQL null, when called
with a null argument.  This is often a much more useful behavior for
constructing dynamic queries.  Add more discussion to the documentation
about how to use these functions.

Brendan Jurd
2008-03-23 00:24:20 +00:00
Tom Lane 19595835c3 Refactor to_char/to_date formatting code; primarily, replace DCH_processor
with two new functions DCH_to_char and DCH_from_char that have less confusing
APIs.  Brendan Jurd
2008-03-22 22:32:19 +00:00
Tom Lane 2d0583a166 Get rid of a bunch of #ifdef HAVE_INT64_TIMESTAMP conditionals by inventing
a new typedef TimeOffset to represent an intermediate time value.  It's
either int64 or double as appropriate, and in most usages will be measured
in microseconds or seconds the same as Timestamp.  We don't call it
Timestamp, though, since the value doesn't necessarily represent an absolute
time instant.

Warren Turkal
2008-03-21 01:31:43 +00:00
Tom Lane 965a2a191a Fix regexp substring matching (substring(string from pattern)) for the corner
case where there is a match to the pattern overall but the user has specified
a parenthesized subexpression and that subexpression hasn't got a match.
An example is substring('foo' from 'foo(bar)?').  This should return NULL,
since (bar) isn't matched, but it was mistakenly returning the whole-pattern
match instead (ie, 'foo').  Per bug #4044 from Rui Martins.

This has been broken since the beginning; patch in all supported versions.
The old behavior was sufficiently inconsistent that it's impossible to believe
anyone is depending on it.
2008-03-19 02:40:37 +00:00
Tom Lane 164899db1c Revert thinko introduced into prefix_selectivity() by my recent patch:
make_greater_string needs the < procedure not the >= one.  Spotted by
Peter.
2008-03-17 17:13:54 +00:00
Tom Lane 5e00913daf Fix varstr_cmp's special case for UTF8 encoding on Windows so that strings
that are reported as "equal" by wcscoll() are checked to see if they really
are bitwise equal, and are sorted per strcmp() if not.  We made this happen
a couple of years ago in the regular code path, but it unaccountably got
left out of the Windows/UTF8 case (probably brain fade on my part at the
time).  As in the prior set of changes, affected users may need to reindex
indexes on textual columns.

Backpatch as far as 8.2, which is the oldest release we are still supporting
on Windows.
2008-03-13 18:31:56 +00:00
Tom Lane 9d966c2e32 Fix unportable coding of new error message, per Kris Jurka. 2008-03-10 12:57:05 +00:00
Tom Lane 23c356ccec Document and enforce that the usable range of setseed() arguments is
-1 to 1, not 0 to 1.  The actual behavior for values within this range
does not change.  Kris Jurka
2008-03-10 12:39:23 +00:00
Tom Lane f4230d2937 Change patternsel() so that instead of switching from a pure
pattern-examination heuristic method to purely histogram-driven selectivity at
histogram size 100, we compute both estimates and use a weighted average.
The weight put on the heuristic estimate decreases linearly with histogram
size, dropping to zero for 100 or more histogram entries.
Likewise in ltreeparentsel().  After a patch by Greg Stark, though I
reorganized the logic a bit to give the caller of histogram_selectivity()
more control.
2008-03-09 00:32:09 +00:00
Tom Lane 422495d0da Modify prefix_selectivity() so that it will never estimate the selectivity
of the generated range condition var >= 'foo' AND var < 'fop' as being less
than what eqsel() would estimate for var = 'foo'.  This is intuitively
reasonable and it gets rid of the need for some entirely ad-hoc coding we
formerly used to reject bogus estimates.  The basic problem here is that
if the prefix is more than a few characters long, the two boundary values
are too close together to be distinguishable by comparison to the column
histogram, resulting in a selectivity estimate of zero, which is often
not very sane.  Change motivated by an example from Peter Eisentraut.

Arguably this is a bug fix, but I'll refrain from back-patching it
for the moment.
2008-03-08 22:41:38 +00:00
Tom Lane 9c767ad57b Improve pglz_decompress() so that it cannot clobber memory beyond the
available output buffer when presented with corrupt input.  Some testing
suggests that this slows the decompression loop about 1%, which seems an
acceptable price to pay for more robustness.  (Curiously, the penalty
seems to be *less* on not-very-compressible data, which I didn't expect
since the overhead per output byte ought to be more in the literal-bytes
path.)

Patch from Zdenek Kotala.  I fixed a corner case and did some renaming
of variables to make the routine more readable.
2008-03-08 01:09:36 +00:00
Tom Lane ad434473eb This patch addresses some issues in TOAST compression strategy that
were discussed last year, but we felt it was too late in the 8.3 cycle to
change the code immediately.  Specifically, the patch:

* Reduces the minimum datum size to be considered for compression from
256 to 32 bytes, as suggested by Greg Stark.

* Increases the required compression rate for compressed storage from
20% to 25%, again per Greg's suggestion.

* Replaces force_input_size (size above which compression is forced)
with a maximum size to be considered for compression.  It was agreed
that allowing large inputs to escape the minimum-compression-rate
requirement was not bright, and that indeed we'd rather have a knob
that acted in the other direction.  I set this value to 1MB for the
moment, but it could use some performance studies to tune it.

* Adds an early-failure path to the compressor as suggested by Jan:
if it's been unable to find even one compressible substring in the
first 1KB (parameterizable), assume we're looking at incompressible
input and give up.  (Possibly this logic can be improved, but I'll
commit it as-is for now.)

* Improves the toasting heuristics so that when we have very large
fields with attstorage 'x' or 'e', we will push those out to toast
storage before considering inline compression of shorter fields.
This also responds to a suggestion of Greg's, though my original
proposal for a solution was a bit off base because it didn't fix
the problem for large 'e' fields.

There was some discussion in the earlier threads of exposing some
of the compression knobs to users, perhaps even on a per-column
basis.  I have not done anything about that here.  It seems to me
that if we are changing around the parameters, we'd better get some
experience and be sure we are happy with the design before we set
things in stone by providing user-visible knobs.
2008-03-07 23:20:21 +00:00
Bruce Momjian 910bc51862 When text search string is too long, in error message report actual and
maximum number of bytes allowed.
2008-03-05 15:50:37 +00:00
Tom Lane e04fa58dcd Fix unportable usages of tolower(). On signed-char machines, it is necessary
to explicitly cast the output back to char before comparing it to a char
value, else we get the wrong result for high-bit-set characters.  Found by
Rolf Jentsch.  Also, fix several places where <ctype.h> functions were being
called without casting the argument to unsigned char; this is likewise
unportable, but we keep making that mistake :-(.  These found by buildfarm
member salamander, which I will desperately miss if it ever goes belly-up.
2008-03-01 03:26:35 +00:00
Tom Lane 3bf822c4d7 Disable the undocumented xmlvalidate() function, which was unintentionally
left in the code though it was not meant to be provided.  It represents a
security hole because unprivileged users could use it to look at (at least the
first line of) any file readable by the backend.  Fortunately, this is only
possible if the backend was built with XML support, so the damage is at least
mitigated; and 8.3 probably hasn't propagated into any security-critical uses
yet anyway.  Per report from Sergey Burladyan.
2008-03-01 02:46:49 +00:00
Alvaro Herrera 7157114d54 Remove long-unused and broken TCL_ARRAYS. 2008-02-29 20:58:33 +00:00
Tom Lane fd15dba543 Fix encode(...bytea..., 'escape') so that it converts all high-bit-set byte
values into \nnn octal escape sequences.  When the database encoding is
multibyte this is *necessary* to avoid generating invalidly encoded text.
Even in a single-byte encoding, the old behavior seems very hazardous ---
consider for example what happens if the text is transferred to another
database with a different encoding.  Decoding would then yield some other
bytea value than what was encoded, which is surely undesirable.  Per gripe
from Hernan Gonzalez.

Backpatch to 8.3, but not further.  This is a bit of a judgment call, but I
make it on these grounds: pre-8.3 we don't really have much encoding safety
anyway because of the convert() function family, and we would also have much
higher risk of breaking existing apps that may not be expecting this behavior.
8.3 is still new enough that we can probably get away with making this change
in the function's behavior.
2008-02-26 02:54:08 +00:00
Tom Lane bc93919be7 Reject year zero during datetime input, except when it's a 2-digit year
(then it means 2000 AD).  Formerly we silently interpreted this as 1 BC,
which at best is unwarranted familiarity with the implementation.
It's barely possible that some app somewhere expects the old behavior,
though, so we won't back-patch this into existing release branches.
2008-02-25 23:36:28 +00:00
Tom Lane 05506fc4af Fix datetime input to behave correctly for Feb 29 in years BC.
Formerly, DecodeDate attempted to verify the day-of-the-month exactly, but
it was under the misapprehension that it would know whether we were looking
at a BC year or not.  In reality this check can't be made until the calling
function (eg DecodeDateTime) has processed all the fields.  So, split the
BC adjustment and validity checks out into a new function ValidateDate that
is called only after processing all the fields.  In passing, this patch
makes DecodeTimeOnly work for BC inputs, which it never did before.

(The historical veracity of all this is nonexistent, of course, but if
we're going to say we support proleptic Gregorian calendar then we should
do it correctly.  In any case the unpatched code is broken because it could
emit dates that it would then reject on re-inputting.)

Per report from Bernd Helmle.  Back-patch as far as 8.0; in 7.x we were
not using our own calendar support and so this seems a bit too risky
to put into 7.4.
2008-02-25 23:21:01 +00:00
Peter Eisentraut 0474dcb608 Refactor backend makefiles to remove lots of duplicate code 2008-02-19 10:30:09 +00:00
Tom Lane cf59277ac9 Remove unnecessary opening of other relation in RI_FKey_keyequal_upd_pk
and RI_FKey_keyequal_upd_fk, as well as no-longer-needed calls of
ri_BuildQueryKeyFull.  Aside from saving a few cycles, this avoids needless
deadlock risks when an update is not changing the columns that participate
in an RI constraint.  Per a gripe from Alexey Nalbat.

Back-patch to 8.3.  Earlier releases did have a need to open the other
relation due to the way in which they retrieved information about the RI
constraint, so this problem unfortunately can't easily be improved pre-8.3.

Tom Lane and Stephan Szabo
2008-02-18 23:00:32 +00:00
Tom Lane cd00406774 Replace time_t with pg_time_t (same values, but always int64) in on-disk
data structures and backend internal APIs.  This solves problems we've seen
recently with inconsistent layout of pg_control between machines that have
32-bit time_t and those that have already migrated to 64-bit time_t.  Also,
we can get out from under the problem that Windows' Unix-API emulation is not
consistent about the width of time_t.

There are a few remaining places where local time_t variables are used to hold
the current or recent result of time(NULL).  I didn't bother changing these
since they do not affect any cross-module APIs and surely all platforms will
have 64-bit time_t before overflow becomes an actual risk.  time_t should
be avoided for anything visible to extension modules, however.
2008-02-17 02:09:32 +00:00
Tom Lane 9b43c245e3 Avoid misbehavior in foreign key checks when casting to a datatype for which
the parser supplies a default typmod that can result in data loss (ie,
truncation).  Currently that appears to be only CHARACTER and BIT.
We can avoid the problem by specifying the type's internal name instead
of using SQL-spec syntax.  Since the queries generated here are only used
internally, there's no need to worry about portability.  This problem is
new in 8.3; before we just let the parser do whatever it wanted to resolve
the operator, but 8.3 is trying to be sure that the semantics of FK checks
are consistent.  Per report from Harald Fuchs.
2008-02-07 22:58:35 +00:00
Tom Lane 353a1cca9f Release any detoasted copies of arrays that are made temporarily in
ri_FetchConstraintInfo, to avoid a query-duration memory leak when that
routine is called by RI_FKey_keyequal_upd_fk (which isn't executed in a
short-lived context).  This problem was latent when the routine was added
in February, but it didn't become serious until the varvarlena patch made
it quite likely that the fields being examined would be "toasted" (ie, have
short headers).  Per report from Stephen Denne.
2008-01-25 04:46:07 +00:00
Tom Lane ac12412ede Revise memory management for libxml calls. Instead of keeping libxml's data
in whichever context happens to be current during a call of an xml.c function,
use a dedicated context that will not go away until we explicitly delete it
(which we do at transaction end or subtransaction abort).  This makes recovery
after an error much simpler --- we don't have to individually delete the data
structures created by libxml.  Also, we need to initialize and cleanup libxml
only once per transaction (if there's no error) instead of once per function
call, so it should be a bit faster.  We'll need to keep an eye out for
intra-transaction memory leaks, though.  Alvaro and Tom.
2008-01-15 18:57:00 +00:00
Tom Lane 1bbf8706ae It turns out the LIBXML_TEST_VERSION macro calls xmlInitParser().
Therefore we must xmlCleanupParser(), or we risk leaving behind
dangling pointers to whatever memory context is current when xml_init()
is called.  This seems to fix bug #3860, though we might still want
the more invasive solution being worked on by Alvaro.
2008-01-12 21:14:08 +00:00
Neil Conway 5217663372 Fix two places in xml.c that neglected to check the return values of
SPI_prepare() and SPI_cursor_open(), to silence a Coverity warning.
2008-01-12 10:50:03 +00:00
Neil Conway 25b7583f67 Minor perf tweak for _SPI_strdup(): if we're going to call strlen()
anyway, it is faster to memcpy() than to strcpy().
2008-01-12 10:38:32 +00:00
Tom Lane da3df47c84 lmgr.c:DescribeLockTag was never taught about virtual xids, per Greg Stark.
Also a couple of minor tweaks to try to future-proof the code a bit better
against future locktag additions.
2008-01-08 23:18:51 +00:00
Tom Lane 8c71752ae4 Remove unnecessary comma in enum definition ... some C compilers don't
like that.  Per report from J6M.
2008-01-08 01:04:08 +00:00
Tom Lane 5935890775 A long time ago, Peter pointed out that ruleutils.c didn't dump simple
constant ORDER/GROUP BY entries properly:
http://archives.postgresql.org/pgsql-hackers/2001-04/msg00457.php
The original solution to that was in fact no good, as demonstrated by
today's report from Martin Pitt:
http://archives.postgresql.org/pgsql-bugs/2008-01/msg00027.php
We can't use the column-number-reference format for a constant that is
a resjunk targetlist entry, a case that was unfortunately not thought of
in the original discussion.  What we can do instead (which did not work
at the time, but does work in 7.3 and up) is to emit the constant with
explicit ::typename decoration, even if it otherwise wouldn't need it.
This is sufficient to keep the parser from thinking it's a column number
reference, and indeed is probably what the user must have done to get
such a thing into the querytree in the first place.
2008-01-06 01:03:16 +00:00
Tom Lane eedb068c0a Make standard maintenance operations (including VACUUM, ANALYZE, REINDEX,
and CLUSTER) execute as the table owner rather than the calling user, using
the same privilege-switching mechanism already used for SECURITY DEFINER
functions.  The purpose of this change is to ensure that user-defined
functions used in index definitions cannot acquire the privileges of a
superuser account that is performing routine maintenance.  While a function
used in an index is supposed to be IMMUTABLE and thus not able to do anything
very interesting, there are several easy ways around that restriction; and
even if we could plug them all, there would remain a risk of reading sensitive
information and broadcasting it through a covert channel such as CPU usage.

To prevent bypassing this security measure, execution of SET SESSION
AUTHORIZATION and SET ROLE is now forbidden within a SECURITY DEFINER context.

Thanks to Itagaki Takahiro for reporting this vulnerability.

Security: CVE-2007-6600
2008-01-03 21:23:15 +00:00
Tom Lane ce9baa06f0 Fix some missed copyright updates. 2008-01-01 20:31:21 +00:00
Bruce Momjian 9098ab9e32 Update copyrights in source tree to 2008. 2008-01-01 19:46:01 +00:00
Peter Eisentraut f5f1355dc4 Wording improvements 2007-12-27 13:02:48 +00:00
Tom Lane ef6bac3323 When given a nonzero column number, pg_get_indexdef() is only supposed to
print the index key variable or expression for that column.  It was mistakenly
printing ASC/DESC/NULLS FIRST/NULLS LAST decoration too --- and not only for
the target column, but all columns.  Someday we should have an option to
extract that info (and the opclass decoration as well) for a single index
column ... but today is not that day.  Per bug #3829 and subsequent
discussion.
2007-12-20 00:23:19 +00:00
Andrew Dunstan 3f2b1db240 Fix thinko in encoding check for chr() 2007-12-18 18:01:48 +00:00
Tom Lane dbc632eb37 Make path_recv() and poly_recv() reject paths/polygons containing no points.
The zero-point case is sensible so far as the data structure is concerned,
so maybe we ought to allow it sometime; but right now the textual input
routines for these types don't allow it, and it seems that not all the
functions for the types are prepared to cope.
Report and patch by Merlin Moncure.
2007-12-18 00:04:08 +00:00
Tom Lane 9fd8843647 Fix mergejoin cost estimation so that we consider the statistical ranges of
the two join variables at both ends: not only trailing rows that need not be
scanned because there cannot be a match on the other side, but initial rows
that will be scanned without possibly having a match.  This allows a more
realistic estimate of startup cost to be made, per recent pgsql-performance
discussion.  In passing, fix a couple of bugs that had crept into
mergejoinscansel: it was not quite up to speed for the task of estimating
descending-order scans, which is a new requirement in 8.3.
2007-12-08 21:05:11 +00:00
Tom Lane 265f904d8f Code review for LIKE ... INCLUDING INDEXES patch. Fix failure to propagate
constraint status of copied indexes (bug #3774), as well as various other
small bugs such as failure to pstrdup when needed.  Allow INCLUDING INDEXES
indexes to be merged with identical declared indexes (perhaps not real useful,
but the code is there and having it not apply to LIKE indexes seems pretty
unorthogonal).  Avoid useless work in generateClonedIndexStmt().  Undo some
poorly chosen API changes, and put a couple of routines in modules that seem
to be better places for them.
2007-12-01 23:44:44 +00:00
Tom Lane bb0e3011f8 Make a cleanup pass over error reports in tsearch code. Use ereport
for user-facing errors, fix some poor choices of errcode, adhere to
message style guide.
2007-11-28 21:56:30 +00:00
Tom Lane 11fccbeaeb Adjust the names of a couple of tsearch index support functions that had
inappropriately generic-sounding names.  This is more or less free since
we already forced initdb for the next beta, and it may prevent confusion or
name conflicts (particularly at the C-global-symbol level) down the road.
Per my proposal yesterday.
2007-11-28 19:33:05 +00:00
Peter Eisentraut 96ee6ff502 Fix XML Schema structure for char types without length (bug #3782) 2007-11-28 14:01:51 +00:00
Tom Lane 66d7bbf674 Suppress compiler warning. 2007-11-27 18:13:01 +00:00
Peter Eisentraut 7888b52076 Make casts from xml to text independent of the XML option setting, thus
immutable and indexable.  Also fix the volatility settings of some other
XML-related functions.
2007-11-27 12:21:05 +00:00
Peter Eisentraut a999ff63ff Use double quotes for quoting xml attributes. 2007-11-25 12:08:11 +00:00
Tom Lane 0f20e7a83e Slightly more paranoia and slightly better comments for use of
Windows-specific MultiByteToWideChar/WideCharToMultiByte calls.
2007-11-24 21:16:55 +00:00
Bruce Momjian 8a52d0c94d Clarify how MONEY trims off trailing thousands separator. 2007-11-24 16:18:48 +00:00
Bruce Momjian 5f128d5fe8 Make the MONEY data type have a thousands separator != decimal symbol,
if the locale has the thousands separator as "".  This now matches the
to_char and psql numericlocale behavior.  (Previously this data type was
basically useless for such setups.)
2007-11-24 15:28:02 +00:00
Bruce Momjian 335d9aff6f Fix white space in MONEY type code. Rename 'comma' to more generic
'ssymbol' as used in previous function.
2007-11-23 19:54:39 +00:00
Bruce Momjian b85cf684f7 Add more comments about thousands separator handling. 2007-11-22 17:51:39 +00:00
Bruce Momjian d9bc7a3946 Add comments about thousands separator logic. 2007-11-22 15:10:05 +00:00
Bruce Momjian 3894e7cc55 When setting default thousands separator when locale has "", use logic
so new thousands separator doesn't match decimal symbol.
2007-11-21 22:28:18 +00:00
Bruce Momjian 6f3149e464 Fix typo in comment. 2007-11-21 21:49:22 +00:00
Tom Lane d23ba77a44 Fix bogus length calculation that could lead to crash if the string
happened to be right up against the end of memory, per report from
Matt Magoffin.  While at it, avoid useless multiple copying of string
by not depending on xmlStrncatNew.
2007-11-20 23:14:41 +00:00
Teodor Sigaev a867b40cf4 Fix tsvectorout() and tsqueryout() to escape backslesh, add test of that.
Patch by Bruce Momjian <bruce@momjian.us>

Backpatch is needed, but it's impossible to apply it directly
2007-11-16 15:05:59 +00:00
Bruce Momjian f639df0d61 Small comment spacing improvement. 2007-11-16 01:51:22 +00:00
Bruce Momjian 5f0bf6cb0d Run pgindent on remaining files now that LOOPBYTE is a usable macro. 2007-11-16 01:12:24 +00:00
Bruce Momjian 224f91f66d Modify LOOPBYTE/LOOPBIT macros to be more logical; rather than have the
for() body passed as a parameter, make the macros act as simple headers
to code blocks.

This allows pgindent to be run on these files.
2007-11-16 00:13:02 +00:00
Bruce Momjian 7d4c99b414 Fix pgindent to properly handle 'else' and single-line comments on the
same line;  previous fix was only partial.  Re-run pgindent on files
that need it.
2007-11-15 23:23:44 +00:00
Bruce Momjian f6e8730d11 Re-run pgindent with updated list of typedefs. (Updated README should
avoid this problem in the future.)
2007-11-15 22:25:18 +00:00
Bruce Momjian fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Tom Lane 866bad9543 Add a rank/(rank+1) normalization option to ts_rank(). While the usefulness
of this seems a bit marginal, if it's useful enough to be shown in the manual
then we probably ought to support doing it without double evaluation of the
ts_rank function.  Per my proposal earlier today.
2007-11-14 23:43:27 +00:00
Tom Lane 4394c1b09c Resurrect the code for the rewrite(ARRAY[...]) aggregate function,
and put it into contrib/tsearch2 compatibility module.
2007-11-13 22:14:50 +00:00
Tom Lane 2b477a2c73 Add missing closing / in xsd:restriction, and remove some unnecessary
spaces for consistency.  Per bug #3734 from Ben Leslie; fix by
Euler Taveira de Oliveira.
2007-11-10 19:29:54 +00:00
Tom Lane d2d52bbb55 xmlGetUTF8Char()'s second argument is both input and output. Fix
uninitialized value, and avoid invoking the function nine separate
times in the pg_xmlIsNameChar macro.  Should resolve buildfarm failures.
Per report from Ben Leslie.
2007-11-10 18:51:20 +00:00
Tom Lane a96fa85025 Second pass at improving LIKE/regex estimation in non-C locales. It turns
out that it's actually quite likely that a string that is an extension of
the given prefix will sort as larger than the "greater" string our previous
code created.  To provide some defense against that, do the comparisons
against a modified string instead of just the bare prefix.  We tack on
"Z", "z", "y", or "9", whichever is seen as largest in the current locale.
Testing suggests that this is sufficient at least for cases involving
ASCII data.
2007-11-09 20:10:02 +00:00
Peter Eisentraut 8db43db01e Allow XML processing instructions starting with "xml" while prohibiting
those being exactly "xml".  Bug #3735 from Ben Leslie
2007-11-09 15:52:51 +00:00
Peter Eisentraut 4c726d5c11 After conferencing again with Bruce, put in more accurate XML error message. 2007-11-08 15:16:45 +00:00
Peter Eisentraut 79cff6bc7e Improve error message 2007-11-08 13:12:56 +00:00
Tom Lane 2de946be6a Improve the performance of LIKE/regex estimation in non-C locales, by making
make_greater_string() try harder to generate a string that's actually greater
than its input string.  Before we just assumed that making a string that was
memcmp-greater was enough, but it is easy to generate examples where this is
not so when the locale is not C.  Instead, loop until the relevant comparison
function agrees that the generated string is greater than the input.

Unfortunately this is probably not enough to guarantee that the generated
string is greater than all extensions of the input, so we cannot relax the
restriction to C locale for the LIKE/regex index optimization.  But it should
at least improve the odds of getting a useful selectivity estimate in
prefix_selectivity().  Per example from Guillaume Smet.

Backpatch to 8.1, mainly because that's what the complainant is using...
2007-11-07 22:37:24 +00:00
Tom Lane 9542287123 Fix patternsel() and callers to do the right thing for NOT LIKE and the other
negated-match operators.  patternsel had been using the supplied operator as
though it were a positive-match operator, and thus obtaining a wrong result,
which was even more wrong after the caller subtracted it from 1.  Seems
cleanest to give patternsel an explicit "negate" argument so that it knows
what's going on.  Also install the same factorization scheme for pattern
join selectivity estimators; even though they are just stubs at the
moment, this may keep someone from making the same type of mistake when
they get filled out.  Per report from Greg Mullane.

Backpatch to 8.2 --- previous releases do not show the problem because
patternsel() doesn't actually use the operator directly.
2007-11-07 21:00:37 +00:00
Tom Lane 5e51297104 Some code review for xml.c:
Add some more xml_init() calls that might not be necessary, but seem like a
good idea to avoid possible problems like we saw in xmlelement().
Fix unsafe assumption that you can keep using the tupledesc of a relcache
entry you don't have open.
Add missing error checks for SearchSysCache failure.
Get rid of handwritten array traversal in xpath() and O(N^2), broken-for-nulls
array access code in map_sql_value_to_xml_value(), in favor of using
deconstruct_array.
Manually adjust a lot of line breaks in places where the code is otherwise
gonna look pretty awful after pg_indent hacks it up (original author seems to
have liked to lay out code for a 200-column window).
2007-11-06 03:06:28 +00:00
Tom Lane 85f807d782 Fix xmlelement() to initialize libxml correctly before using it, and to avoid
assuming that evaluation of its input expressions won't change the state of
libxml.  This requires refactoring xml_init() to not call xmlInitParser(),
since now not all of its callers want that.  I also tweaked things to avoid
repeated execution of one-time-only tests inside xml_init(), though this is
mostly for clarity rather than in hopes of saving any noticeable amount of
runtime.  Per report from Sheikh Amjad and subsequent discussion.
In passing, fix a couple of inadequately schema-qualified queries.
2007-11-05 22:23:07 +00:00
Tom Lane 1c92724985 Set read_only = TRUE while evaluating input queries for ts_rewrite()
and ts_stat(), per my recent suggestion.  Also add a possibly-not-needed-
but-can't-hurt check for NULL SPI_tuptable, before we try to dereference
same.
2007-10-24 03:30:03 +00:00
Tom Lane 592c88a0d2 Remove the aggregate form of ts_rewrite(), since it doesn't work as desired
if there are zero rows to aggregate over, and the API seems both conceptually
and notationally ugly anyway.  We should look for something that improves
on the tsquery-and-text-SELECT version (which is also pretty ugly but at
least it works...), but it seems that will take query infrastructure that
doesn't exist today.  (Hm, I wonder if there's anything in or near SQL2003
window functions that would help?)  Per discussion.
2007-10-24 02:24:49 +00:00
Tom Lane 12f25e70a6 Fix two-argument form of ts_rewrite() so it actually works for cases where
a later rewrite rule should change a subtree modified by an earlier one.
Per my gripe of a few days ago.
2007-10-23 01:44:40 +00:00
Tom Lane bb36c51fcd Fix several bugs in tsvectorin, including crash due to uninitialized field and
miscomputation of required palloc size.  The crash could only occur if the
input contained lexemes both with and without positions, which is probably not
common in practice.  The miscomputation would definitely result in wasted
space.  Also fix some inconsistent coding around alignment of strings and
positions in a tsvector value; these errors could also lead to crashes given
mixed with/without position data and a machine that's picky about alignment.
And be more careful about checking for overflow of string offsets.

Patch is only against HEAD --- I have not looked to see if same bugs are
in back-branch contrib/tsearch2 code.
2007-10-23 00:51:23 +00:00
Tom Lane 1ea47dd8cb Fix shared tsvector/tsquery input code so that we don't say "syntax error in
tsvector" when we are really parsing a tsquery.  Report the bogus input,
too.  Make styles of some related error messages more consistent.
2007-10-21 22:29:56 +00:00
Tom Lane 531ead8ab4 Adjust error message to agree with documentation. The tsearch documentation
uniformly calls these things weights, not classes.
2007-10-20 21:06:20 +00:00
Tom Lane 18e3fcc31e Migrate the former contrib/txid module into core. This will make it easier
for Slony and Skytools to depend on it.  Per discussion.
2007-10-13 23:06:28 +00:00
Tom Lane ff1de5cef6 Guard against possible double free during error escape from XML
functions.  Patch for the reported issue from Kris Jurka, some
other potential trouble spots plugged by Tom.
2007-10-13 20:46:47 +00:00
Tom Lane 8468146b03 Fix the inadvertent libpq ABI breakage discovered by Martin Pitt: the
renumbering of encoding IDs done between 8.2 and 8.3 turns out to break 8.2
initdb and psql if they are run with an 8.3beta1 libpq.so.  For the moment
we can rearrange the order of enum pg_enc to keep the same number for
everything except PG_JOHAB, which isn't a problem since there are no direct
references to it in the 8.2 programs anyway.  (This does force initdb
unfortunately.)

Going forward, we want to fix things so that encoding IDs can be changed
without an ABI break, and this commit includes the changes needed to allow
libpq's encoding IDs to be treated as fully independent of the backend's.
The main issue is that libpq clients should not include pg_wchar.h or
otherwise assume they know the specific values of libpq's encoding IDs,
since they might encounter version skew between pg_wchar.h and the libpq.so
they are using.  To fix, have libpq officially export functions needed for
encoding name<=>ID conversion and validity checking; it was doing this
anyway unofficially.

It's still the case that we can't renumber backend encoding IDs until the
next bump in libpq's major version number, since doing so will break the
8.2-era client programs.  However the code is now prepared to avoid this
type of problem in future.

Note that initdb is no longer a libpq client: we just pull in the two
source files we need directly.  The patch also fixes a few places that
were being sloppy about checking for an unrecognized encoding name.
2007-10-13 20:18:42 +00:00
Tom Lane 537e92e41f Fix ALTER COLUMN TYPE to preserve the tablespace and reloptions of indexes
it affects.  The original coding neglected tablespace entirely (causing
the indexes to move to the database's default tablespace) and for an index
belonging to a UNIQUE or PRIMARY KEY constraint, it would actually try to
assign the parent table's reloptions to the index :-(.  Per bug #3672 and
subsequent investigation.

8.0 and 8.1 did not have reloptions, but the tablespace bug is present.
2007-10-13 15:55:40 +00:00
Tom Lane 6f21c57a97 In the integer-datetimes case, date2timestamp and date2timestamptz need
to check for overflow because the legal range of type date is actually
wider than timestamp's.  Problem found by Neil Conway.
2007-09-26 01:10:42 +00:00
Tom Lane 6f5c38dcd0 Just-in-time background writing strategy. This code avoids re-scanning
buffers that cannot possibly need to be cleaned, and estimates how many
buffers it should try to clean based on moving averages of recent allocation
requests and density of reusable buffers.  The patch also adds a couple
more columns to pg_stat_bgwriter to help measure the effectiveness of the
bgwriter.

Greg Smith, building on his own work and ideas from several other people,
in particular a much older patch from Itagaki Takahiro.
2007-09-25 20:03:38 +00:00
Tom Lane f71c7b9dfd Fix bugs in XML binary I/O functions. Heikki and Tom 2007-09-23 21:36:42 +00:00
Tom Lane bbda96d76d Fix bogus calculation of potential output string length in translate(). 2007-09-22 05:35:42 +00:00
Tom Lane 5e87ebb0c3 Although I'd misdiagnosed the reason for the recent failures on
buildfarm member grebe, I see no reason to revert the 1-byte-header-friendly
changes I made in varlena.c.  Instead, tweak the code a little bit to
get more advantage out of that.
2007-09-22 04:40:03 +00:00
Tom Lane 94470b9499 Doh --- what's really happening on buildfarm member grebe is that its
malloc returns NULL for malloc(0).  Defend against that case.
2007-09-22 04:37:53 +00:00
Andrew Dunstan e152893305 Go back to using a separate method for doing ILIKE for single byte
character encodings that doesn't involve calling lower(). This should
cure the performance regression in this case complained of by Guillaume
Smet. It still leaves the horrid performance for multi-byte encodings
introduced in 8.2, but there's no obvious solution for that in sight.
2007-09-22 03:58:34 +00:00
Tom Lane b5d1608b0a Fix varlena.c routines to allow 1-byte-header text values. This is now
demonstrably necessary for text_substring() since regexp_split functions
may pass it such a value; and we might as well convert the whole file
at once.  Per buildfarm results (though I wonder why most machines aren't
showing a failure).
2007-09-22 00:36:38 +00:00
Tom Lane 7583f9a7ca Fix regex, LIKE, and some other second-rank text-manipulation functions
to not cause needless copying of text datums that have 1-byte headers.
Greg Stark, in response to performance gripe from Guillaume Smet and
ITAGAKI Takahiro.
2007-09-21 22:52:52 +00:00
Tom Lane d22ae3ecc2 Solaris portability fix that was previously made in contrib/tsearch2
but got lost from the version committed to main tree.  Per Greg Stark.
2007-09-20 23:27:11 +00:00
Teodor Sigaev bab16af807 Fix msvc warnings, patch by Hannes Eder <Hannes@HannesEder.net> 2007-09-20 18:10:57 +00:00
Tom Lane 282d2a03dd HOT updates. When we update a tuple without changing any of its indexed
columns, and the new version can be stored on the same heap page, we no longer
generate extra index entries for the new version.  Instead, index searches
follow the HOT-chain links to ensure they find the correct tuple version.

In addition, this patch introduces the ability to "prune" dead tuples on a
per-page basis, without having to do a complete VACUUM pass to recover space.
VACUUM is still needed to clean up dead index entries, however.

Pavan Deolasee, with help from a bunch of other people.
2007-09-20 17:56:33 +00:00
Neil Conway bbf4fdc253 Prevent corr() from returning the wrong results for negative correlation
values. The previous coding essentially assumed that x = sqrt(x*x), which
does not hold for x < 0.

Thanks to Jie Zhang at Greenplum and Gavin Sherry for reporting this
issue.
2007-09-19 22:31:48 +00:00
Andrew Dunstan 55613bf9cd Close previously open holes for invalidly encoded data to enter the
database via builtin functions, as recently discussed on -hackers.

chr() now returns a character in the database encoding. For UTF8 encoded databases
the argument is treated as a Unicode code point. For other multi-byte encodings
the argument must designate a strict ascii character, or an error is raised,
as is also the case if the argument is 0.

ascii() is adjusted so that it remains the inverse of chr().

The two argument form of convert() is gone, and the three argument form now
takes a bytea first argument and returns a bytea. To cover this loss three new
functions are introduced:
. convert_from(bytea, name) returns text - converts the first argument from the
  named encoding to the database encoding
. convert_to(text, name) returns bytea - converts the first argument from the
  database encoding to the named encoding
. length(bytea, name) returns int - gives the length of the first argument in
  characters in the named encoding
2007-09-18 17:41:17 +00:00
Tom Lane 22d98e7934 Fix overflow in extract(epoch from interval) for intervals exceeding 68 years.
Seems to have been introduced in 8.1 by careless SECS_PER_DAY
search-and-replace.
2007-09-16 15:56:20 +00:00
Teodor Sigaev 3e805fdcf7 Fix typo in typecasting.
patch from  ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>
2007-09-13 06:54:35 +00:00