Commit Graph

9173 Commits

Author SHA1 Message Date
Tom Lane 6d871a2538 Restrict tsearch config file base names to contain a-z, 0-9, and underscore,
instead of the initial policy of whatever isalpha() likes.  Per discussion.
2007-09-04 02:16:56 +00:00
Tom Lane e7889b83b7 Support SET FROM CURRENT in CREATE/ALTER FUNCTION, ALTER DATABASE, ALTER ROLE.
(Actually, it works as a plain statement too, but I didn't document that
because it seems a bit useless.)  Unify VariableResetStmt with
VariableSetStmt, and clean up some ancient cruft in the representation of
same.
2007-09-03 18:46:30 +00:00
Tom Lane 7ab43b88d7 Improve stylistic consistency of descriptions of built-in objects by avoiding
initcap style --- the vast majority of the existing descriptions do not use
an initial cap.  I didn't change places where the first word was all-cap.

initdb not forced because this doesn't change any regression test results.
2007-09-03 02:30:45 +00:00
Tom Lane 2abae34a2e Implement function-local GUC parameter settings, as per recent discussion.
There are still some loose ends: I didn't do anything about the SET FROM
CURRENT idea yet, and it's not real clear whether we are happy with the
interaction of SET LOCAL with function-local settings.  The documentation
is a bit spartan, too.
2007-09-03 00:39:26 +00:00
Tom Lane d2825e1c85 Since sort_bounded_heap makes state changes that should be made
regardless of the number of tuples involved, it's incorrect to skip it
when memtupcount = 1; the number of cycles saved is minuscule anyway.
An alternative solution would be to pull the state changes out to the
call site in tuplesort_performsort, but keeping them near the corresponding
changes in make_bounded_heap seems marginally cleaner.  Noticed by
Greg Stark.
2007-09-01 18:47:39 +00:00
Tom Lane 0ee5a39862 Apply a band-aid fix for the problem that 8.2 and up completely misestimate
the number of rows likely to be produced by a query such as
	SELECT * FROM t1 LEFT JOIN t2 USING (key) WHERE t2.key IS NULL;
What this is doing is selecting for t1 rows with no match in t2, and thus
it may produce a significant number of rows even if the t2.key table column
contains no nulls at all.  8.2 thinks the table column's null fraction is
relevant and thus may estimate no rows out, which results in terrible plans
if there are more joins above this one.  A proper fix for this will involve
passing much more information about the context of a clause to the selectivity
estimator functions than we ever have.  There's no time left to write such a
patch for 8.3, and it wouldn't be back-patchable into 8.2 anyway.  Instead,
put in an ad-hoc test to defeat the normal table-stats-based estimation when
an IS NULL test is evaluated at an outer join, and just use a constant
estimate instead --- I went with 0.5 for lack of a better idea.  This won't
catch every case but it will catch the typical ways of writing such queries,
and it seems unlikely to make things worse for other queries.
2007-08-31 23:35:22 +00:00
Tom Lane 68e40998d0 Extend whole-row Var evaluation to cope with the case that the sub-plan
generating the tuples has resjunk output columns.  This is not possible for
simple table scans but can happen when evaluating a whole-row Var for a view.
Per example from Patryk Kordylewski.  The problem exists back to 8.0 but
I'm not going to risk back-patching further than 8.2 because of the many
changes in this area.
2007-08-31 18:33:40 +00:00
Tom Lane 79048ca1a4 Install check_stack_depth() protection in two recursive tsquery
processing routines.  Per Heikki.
2007-08-31 02:26:29 +00:00
Tom Lane b4c806faa8 Rewrite make_outerjoininfo's construction of min_lefthand and min_righthand
sets for outer joins, in the light of bug #3588 and additional thought and
experimentation.  The original methodology was fatally flawed for nests of
more than two outer joins: it got the relationships between adjacent joins
right, but didn't always come to the right conclusions about whether a join
could be interchanged with one two or more levels below it.  This was largely
caused by a mistaken idea that we should use the min_lefthand + min_righthand
sets of a sub-join as the minimum left or right input set of an upper join
when we conclude that the sub-join can't commute with the upper one.  If
there's a still-lower join that the sub-join *can* commute with, this method
led us to think that that one could commute with the topmost join; which it
can't.  Another problem (not directly connected to bug #3588) was that
make_outerjoininfo's processing-order-dependent method for enforcing outer
join identity #3 didn't work right: if we decided that join A could safely
commute with lower join B, we dropped all information about sub-joins under B
that join A could perhaps not safely commute with, because we removed B's
entire min_righthand from A's.

To fix, make an explicit computation of all inner join combinations that occur
below an outer join, and add to that the full syntactic relsets of any lower
outer joins that we determine it can't commute with.  This method gives much
more direct enforcement of the outer join rearrangement identities, and it
turns out not to cost a lot of additional bookkeeping.

Thanks to Richard Harris for the bug report and test case.
2007-08-31 01:44:06 +00:00
Tom Lane e75d365633 Fix int8mul so that overflow check is applied correctly for INT64_IS_BUSTED
case, per Florian Pflug.
Not back-patched since it's unclear that anyone but me still cares ...
2007-08-30 05:27:29 +00:00
Tom Lane 8bc225e799 Relax permissions checks on dbsize functions, per discussion. Revert out all
checks for individual-table-size functions, since anyone in the database could
get approximate values from pg_class.relpages anyway.  Allow database-size to
users with CONNECT privilege for the target database (note that this is
granted by default).  Allow tablespace-size if the user has CREATE privilege
on the tablespace (which is *not* granted by default), or if the tablespace is
the default tablespace for the current database (since we treat that as
implicitly allowing use of the tablespace).
2007-08-29 17:24:29 +00:00
Tom Lane a52e4408b9 Add a debug logging message when a resource manager rejects an attempted
restart point.  Per suggestion from Simon Riggs.
2007-08-28 23:17:47 +00:00
Tom Lane 24d4517b3b Improve behavior of log_lock_waits patch. Ensure that something gets logged
even if the "deadlock detected" ERROR message is suppressed by an exception
catcher.  Be clearer about the event sequence when a soft deadlock is fixed:
the fixing process might or might not still have to wait, so log that
separately.  Fix race condition when someone releases us from the lock partway
through printing all this junk --- we'd not get confused about our state, but
the log message sequence could have been misleading, ie, a "still waiting"
message with no subsequent "acquired" message.  Greg Stark and Tom Lane.
2007-08-28 03:23:44 +00:00
Magnus Hagander 3b1e04c3e9 Fix generation of snowball_create.sql on msvc builds. 2007-08-27 10:29:49 +00:00
Tom Lane 862861ee77 Fix a couple of misbehaviors rooted in the fact that the default creation
namespace isn't necessarily first in the search path (there could be implicit
schemas ahead of it).  Examples are

test=# set search_path TO s1;

test=# create view pg_timezone_names as select * from pg_timezone_names();
ERROR:  "pg_timezone_names" is already a view

test=# create table pg_class (f1 int primary key);
ERROR:  permission denied: "pg_class" is a system catalog

You'd expect these commands to create the requested objects in s1, since
names beginning with pg_ aren't supposed to be reserved anymore.  What is
happening is that we create the requested base table and then execute
additional commands (here, CREATE RULE or CREATE INDEX), and that code is
passed the same RangeVar that was in the original command.  Since that
RangeVar has schemaname = NULL, the secondary commands think they should do a
path search, and that means they find system catalogs that are implicitly in
front of s1 in the search path.

This is perilously close to being a security hole: if the secondary command
failed to apply a permission check then it'd be possible for unprivileged
users to make schema modifications to system catalogs.  But as far as I can
find, there is no code path in which a check doesn't occur.  Which makes it
just a weird corner-case bug for people who are silly enough to want to
name their tables the same as a system catalog.

The relevant code has changed quite a bit since 8.2, which means this patch
wouldn't work as-is in the back branches.  Since it's a corner case no one
has reported from the field, I'm not going to bother trying to back-patch.
2007-08-27 03:36:08 +00:00
Tom Lane 6c96188cb5 Remove the 'not in' operator (!!=). This was a hangover from Berkeley
days that was obsolete the moment we had IN (SELECT ...) capability.
It's arguably a security hole since it applied no permissions check to
the table it searched, and since it was never documented anywhere,
removing it seems more appropriate than fixing it.
2007-08-27 01:39:25 +00:00
Tom Lane cc26599b72 Restrict pg_relation_size to relation owner, pg_database_size to DB owner,
and pg_tablespace_size to superusers.  Perhaps we could weaken the first
case to just require SELECT privilege, but that doesn't work for the
other cases, so use ownership as the common concept.
2007-08-27 01:19:14 +00:00
Tom Lane 741e952b54 Make currtid() functions require SELECT privileges on the target table.
While it's not clear that TID linkage info is of any great use to a
nefarious user, it's certainly unexpected that these functions wouldn't
insist on read privileges.
2007-08-27 00:57:36 +00:00
Tom Lane 67bf7b919e Make ARRAY(SELECT ...) return an empty array, rather than a NULL, when the
sub-select returns zero rows.  Per complaint from Jens Schicke.  Since this
is more in the nature of a definition change than a bug, not back-patched.
2007-08-26 21:44:25 +00:00
Tom Lane 75d091a0d7 Fix brain fade in DefineIndex(): it was continuing to access the table's
relcache entry after having heap_close'd it.  This could lead to misbehavior
if a relcache flush wiped out the cache entry meanwhile.  In 8.2 there is a
very real risk of CREATE INDEX CONCURRENTLY using the wrong relid for locking
and waiting purposes.  I think the bug is only cosmetic in 8.0 and 8.1,
because their transgression is limited to using RelationGetRelationName(rel)
in an ereport message immediately after heap_close, and there's no way (except
with special debugging options) for a cache flush to occur in that interval.
Not quite sure that it's cosmetic in 7.4, but seems best to patch anyway.

Found by trying to run the regression tests with CLOBBER_CACHE_ALWAYS enabled.
Maybe we should try to do that on a regular basis --- it's awfully slow,
but perhaps some fast buildfarm machine could do it once in awhile.
2007-08-25 19:08:19 +00:00
Tom Lane 21168267b9 Simplify implementation of ts_debug() function --- use a join instead
of redundant sub-selects.  initdb not forced, since this is just a
cosmetic change, but the new code won't show up till you do one.
2007-08-25 17:47:44 +00:00
Tom Lane a13cefafb1 Fix synonym-dict breakage introduced in last patch :-(.
Minor other cleanups.
2007-08-25 02:29:45 +00:00
Tom Lane 93eab9312f Rename built-in Snowball stemmer dictionaries to be english_stem,
russian_stem, etc.  Per discussion.
2007-08-25 01:06:25 +00:00
Tom Lane 7351b5fa17 Cleanup for some problems in tsearch patch:
- ispell initialization crashed on empty dictionary file
- ispell initialization crashed on affix file with prefixes but no suffixes
- stop words file was run through pg_verify_mbstr, with database
  encoding, but it's supposed to be UTF-8; similar bug for synonym files
- bunch of comments added, typos fixed, and other cleanup

Introduced consistent encoding checking/conversion of data read from tsearch
configuration files, by doing this in a single t_readline() subroutine
(replacing direct usages of fgets).  Cleaned up API for readstopwords too.

Heikki Linnakangas
2007-08-25 00:03:59 +00:00
Andrew Dunstan 44b5efbae6 Reduce memory requirements for writing CSVlogs, so it will work with about
the same amount of memory in ErrorContext as standard logs.
2007-08-23 01:24:43 +00:00
Tom Lane 11fee4e3b5 Suppress testing the options of CREATE TEXT SEARCH DICTIONARY during
initdb.  We should create all the standard dictionaries even though
some of them may not work in template1's encoding.  Per Teodor.
2007-08-22 22:30:20 +00:00
Tom Lane f4ccdb3a17 Fix VPATH-build problem in new tsearch makefile, per Chad Wagner. 2007-08-22 06:11:56 +00:00
Tom Lane 8a5592daf1 Remove option to change parser of an existing text search configuration.
This prevents needing to do complex and poorly-defined updates of the
mapping table if the new parser has different token types than the old.
Per discussion.
2007-08-22 05:13:50 +00:00
Tom Lane b77c6c7311 Whoops, missed updating dsynonym_init for new dictionary parameter method. 2007-08-22 04:13:15 +00:00
Tom Lane d321421d0a Simplify the syntax of CREATE/ALTER TEXT SEARCH DICTIONARY by treating the
init options of the template as top-level options in the syntax.  This also
makes ALTER a bit easier to use, since options can be replaced individually.
I also made these statements verify that the tmplinit method will accept
the new settings before they get stored; in the original coding you didn't
find out about mistakes until the dictionary got invoked.

Under the hood, init methods now get options as a List of DefElem instead
of a raw text string --- that lets tsearch use existing options-pushing code
instead of duplicating functionality.
2007-08-22 01:39:46 +00:00
Tom Lane fd33d90a23 Simplify CREATE TEXT SEARCH CONFIGURATION by eliminating the separate
'with map' parameter; as things now stand there's really not much point
in specifying a config-to-copy if you don't copy its map.  Also, use
COPY instead of TEMPLATE as the key word for a config-to-copy, so as
to avoid confusion with text search templates.  Per discussion; the
just-committed reference page for the command already describes it
this way.
2007-08-21 21:24:00 +00:00
Tom Lane a4be395364 Avoid using TEXT as a Bison symbol, since this provokes warnings on
Windows builds.  In passing, fix an obsolete comment, per gripe from
Greg Stark.
2007-08-21 15:13:42 +00:00
Tom Lane d01741bfa1 Remove extraneous semicolon --- buildfarm member bear, for one,
objects to it.
2007-08-21 06:34:42 +00:00
Tom Lane 14572e4324 Fix cash_mul_int4 and cash_div_int4 for overenthusiastic substitution
of int64 for int32.  Per reports from Merlin Moncure and Andrew Chernow.
2007-08-21 03:56:07 +00:00
Tom Lane 1783e5db3e Fix money type's send/receive functions to conform to recent widening
of the datatype to int64.  Per Andrew Chernow.
2007-08-21 03:14:36 +00:00
Tom Lane 1cee06ac02 Fix potential access-off-the-end-of-memory in varbit_out(): it fetched the
byte after the last full byte of the bit array, regardless of whether that
byte was part of the valid data or not.  Found by buildfarm testing.
Thanks to Stefan Kaltenbrunner for nailing down the cause.
2007-08-21 02:40:06 +00:00
Tom Lane 25a4a77985 Suppress uninitialized-variable warning. 2007-08-21 01:47:19 +00:00
Tom Lane 440a330a31 Fix a small 64-bit problem in tsearch patch. 2007-08-21 01:45:33 +00:00
Tom Lane 140d4ebcb4 Tsearch2 functionality migrates to core. The bulk of this work is by
Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing,
so anything that's broken is probably my fault.

Documentation is nonexistent as yet, but let's land the patch so we can
get some portability testing done.
2007-08-21 01:11:32 +00:00
Andrew Dunstan fd801f4faa Provide for logfiles in machine readable CSV format. In consequence, rename
redirect_stderr to logging_collector.
Original patch from Arul Shaji, subsequently modified by Greg Smith, and then
heavily modified by me.
2007-08-19 01:41:25 +00:00
Tom Lane 817946bb04 Arrange to cache a ResultRelInfo in the executor's EState for relations that
are not one of the query's defined result relations, but nonetheless have
triggers fired against them while the query is active.  This was formerly
impossible but can now occur because of my recent patch to fix the firing
order for RI triggers.  Caching a ResultRelInfo avoids duplicating work by
repeatedly opening and closing the same relation, and also allows EXPLAIN
ANALYZE to "see" and report on these extra triggers.  Use the same mechanism
to cache open relations when firing deferred triggers at transaction shutdown;
this replaces the former one-element-cache strategy used in that case, and
should improve performance a bit when there are deferred triggers on a number
of relations.
2007-08-15 21:39:50 +00:00
Tom Lane 9cb8409762 Repair problems occurring when multiple RI updates have to be done to the same
row within one query: we were firing check triggers before all the updates
were done, leading to bogus failures.  Fix by making the triggers queued by
an RI update go at the end of the outer query's trigger event list, thereby
effectively making the processing "breadth-first".  This was indeed how it
worked pre-8.0, so the bug does not occur in the 7.x branches.
Per report from Pavel Stehule.
2007-08-15 19:15:47 +00:00
Tom Lane 67f99d216a Fix oversight in async-commit patch: there were some places in heapam.c
that still thought they could set HEAP_XMAX_COMMITTED immediately after
seeing the other transaction commit.  Make them use the same logic as
tqual.c does to determine if the hint bit can be set yet.
2007-08-14 17:35:18 +00:00
Tom Lane b83bd31bd9 TEMPORARILY make synchronous_commit default to OFF, so that we can get more
thorough testing of async-commit mode from the buildfarm.  This patch MUST
get reverted before 8.3 release!
2007-08-13 19:27:12 +00:00
Tom Lane 647fd9a108 Fix two bugs induced in VACUUM FULL by async-commit patch.
First, we cannot assume that XLogAsyncCommitFlush guarantees hint bits will be
settable, because clog.c's inexact LSN bookkeeping results in windows where a
previously flushed transaction is considered unhintable because it shares an
LSN slot with a later unflushed transaction.  But repair_frag requires
XMIN_COMMITTED to be correct so that it can distinguish tuples moved by the
current vacuum.  Since not being able to set the bit is an uncommon corner
case, the most practical way of dealing with it seems to be to abandon
shrinking (ie, don't invoke repair_frag) when we find a non-dead tuple whose
XMIN_COMMITTED bit couldn't be set.

Second, it is possible for the same reason that a RECENTLY_DEAD tuple does not
get its XMAX_COMMITTED bit set during scan_heap.  But by the time repair_frag
examines the tuple it might be possible to set the bit.  We therefore must
take buffer content lock when calling HeapTupleSatisfiesVacuum a second time,
else we can get an Assert failure in SetBufferCommitInfoNeedsSave.  This
latter bug is latent in existing releases, but I think it cannot actually
occur without async commit, since the first HeapTupleSatisfiesVacuum call
should always have set the bit.  So I'm not going to back-patch it.

In passing, reduce the existing "cannot shrink relation" messages from NOTICE
to LOG level.  The new message must be no higher than LOG if we don't want
unpredictable regression test failures, and consistency seems like a good
idea.  Also arrange that only one such message is reported per VACUUM FULL;
in typical scenarios you could get spammed with many such messages, which
seems a bit useless.
2007-08-13 19:08:26 +00:00
Tom Lane b70d4a62ee Remove an "optimization" I installed in 2001, to make repalloc() attempt to
enlarge the memory chunk in-place when it was feasible to do so.  This turns
out to not work well at all for scenarios involving repeated cycles of
palloc/repalloc/pfree: the eventually freed chunks go into the wrong freelist
for the next initial palloc request, and so we consume memory indefinitely.
While that could be defended against, the number of cases where the
optimization can still be applied drops significantly, and adjusting the
initial sizes of StringInfo buffers makes it drop to almost nothing.
Seems better to just remove the extra complexity.
Per recent discussion and testing.
2007-08-12 20:39:14 +00:00
Tom Lane 70868c012f Increase the initial size of StringInfo buffers to 1024 bytes (from 256);
likewise increase the initial size of the scanner's literal buffer to 1024
(from 128).  Instrumentation of the regression tests suggests that this
saves a useful amount of repalloc() traffic --- the number of calls occurring
during one set of tests drops from about 6900 to about 3900.  The old sizes
were chosen in the late 90's with an eye to machines much smaller than
are common today.
2007-08-12 20:18:06 +00:00
Tom Lane ae65ca312f Avoid memory leakage across successive calls of regexp_matches() or
regexp_split_to_table() within a single query.  This is only a partial
solution, as it turns out that with enough matches per string these
functions can also tickle a repalloc() misbehavior.  But fixing that
is a topic for a separate patch.
2007-08-11 19:16:41 +00:00
Tom Lane 1b70619311 Code review for regexp_matches/regexp_split patch. Refactor to avoid assuming
that cached compiled patterns will still be there when the function is next
called.  Clean up looping logic, thereby fixing bug identified by Pavel
Stehule.  Share setup code between the two functions, add some comments, and
avoid risky mixing of int and size_t variables.  Clean up the documentation a
tad, and accept all the flag characters mentioned in table 9-19 rather than
just a subset.
2007-08-11 03:56:24 +00:00
Tom Lane bbe3c02d38 Revise postmaster startup/shutdown logic to eliminate the problem that a
constant flow of new connection requests could prevent the postmaster from
completing a shutdown or crash restart.  This is done by labeling child
processes that are "dead ends", that is, we know that they were launched only
to tell a client that it can't connect.  These processes are managed
separately so that they don't confuse us into thinking that we can't advance
to the next stage of a shutdown or restart sequence, until the very end
where we must wait for them to drain out so we can delete the shmem segment.
Per discussion of a misbehavior reported by Keaton Adams.

Since this code was baroque already, and my first attempt at fixing the
problem made it entirely impenetrable, I took the opportunity to rewrite it
in a state-machine style.  That eliminates some duplicated code sections and
hopefully makes everything a bit clearer.
2007-08-09 01:18:43 +00:00