Commit Graph

21088 Commits

Author SHA1 Message Date
Tom Lane 511e902b51 Make TRUNCATE ... RESTART IDENTITY restart sequences transactionally.
In the previous coding, we simply issued ALTER SEQUENCE RESTART commands,
which do not roll back on error.  This meant that an error between
truncating and committing left the sequences out of sync with the table
contents, with potentially bad consequences as were noted in a Warning on
the TRUNCATE man page.

To fix, create a new storage file (relfilenode) for a sequence that is to
be reset due to RESTART IDENTITY.  If the transaction aborts, we'll
automatically revert to the old storage file.  This acts just like a
rewriting ALTER TABLE operation.  A penalty is that we have to take
exclusive lock on the sequence, but since we've already got exclusive lock
on its owning table, that seems unlikely to be much of a problem.

The interaction of this with usual nontransactional behaviors of sequence
operations is a bit weird, but it's hard to see what would be completely
consistent.  Our choice is to discard cached-but-unissued sequence values
both when the RESTART is executed, and at rollback if any; but to not touch
the currval() state either time.

In passing, move the sequence reset operations to happen before not after
any AFTER TRUNCATE triggers are fired.  The previous ordering was not
logically sensible, but was forced by the need to minimize inconsistency
if the triggers caused an error.  Transactional rollback is a much better
solution to that.

Patch by Steve Singer, rather heavily adjusted by me.
2010-11-17 16:42:18 -05:00
Peter Eisentraut cfad144f89 Additional fixes for parallel make
Add some additional dependencies to constrain the build order to prevent
parallel make from failing.  In the case of src/Makefile, this is likely to be
too complicated to be worth maintaining, so just add .NOTPARALLEL to get the
old for-loop-like behavior.

More fine-tuning might be necessary for some platforms or configurations.
2010-11-17 08:08:41 +02:00
Andrew Dunstan b7fcf68e86 Require VALUE keyword when extending an enum type. Based on a patch from Alvaro Herrera. 2010-11-16 22:18:33 -05:00
Magnus Hagander 4acf99b2f3 Send paramHandle to subprocesses as 64-bit on Win64
The handle to the shared memory segment containing startup
parameters was sent as 32-bit even on 64-bit systems. Since
HANDLEs appear to be allocated sequentially this shouldn't
be a problem until we reach 2^32 open handles in the postmaster,
but a 64-bit value should be sent across as 64-bit, and not
zero out the top 32 bits.

Noted by Tom Lane.
2010-11-16 12:40:56 +01:00
Heikki Linnakangas 2edc5cd493 The GiST scan algorithm uses LSNs to detect concurrent pages splits, but
temporary indexes are not WAL-logged. We used a constant LSN for temporary
indexes, on the assumption that we don't need to worry about concurrent page
splits in temporary indexes because they're only visible to the current
session. But that assumption is wrong, it's possible to insert rows and
split pages in the same session, while a scan is in progress. For example,
by opening a cursor and fetching some rows, and INSERTing new rows before
fetching some more.

Fix by generating fake increasing LSNs, used in place of real LSNs in
temporary GiST indexes.
2010-11-16 11:32:21 +02:00
Tom Lane add0ea88e7 Fix aboriginal mistake in plpython's set-returning-function support.
We must stay in the function's SPI context until done calling the iterator
that returns the set result.  Otherwise, any attempt to invoke SPI features
in the python code called by the iterator will malfunction.  Diagnosis and
patch by Jan Urbanski, per bug report from Jean-Baptiste Quenot.

Back-patch to 8.2; there was no support for SRFs in previous versions of
plpython.
2010-11-15 14:26:55 -05:00
Robert Haas 3134d8863e Add new buffers_backend_fsync field to pg_stat_bgwriter.
This new field counts the number of times that a backend which writes a
buffer out to the OS must also fsync() it.  This happens when the
bgwriter fsync request queue is full, and is generally detrimental to
performance, so it's good to know when it's happening.  Along the way,
log a new message at level DEBUG1 whenever we fail to hand off an fsync,
so that the problem can also be seen in examination of log files
(if the logging level is cranked up high enough).

Greg Smith, with minor tweaks by me.
2010-11-15 12:42:59 -05:00
Robert Haas 8d70ed84ba Remove outdated comments from the regression test files.
Since 2004, int2 and int4 operators do detect overflow; this was fixed by
commit 4171bb869f.

Extracted from a larger patch by Andres Freund.
2010-11-15 11:00:20 -05:00
Robert Haas 20cf8ae478 Fix copy-and-pasteo a little more completely.
copydir.c is no longer in src/port
2010-11-15 10:10:58 -05:00
Alvaro Herrera ae4b17edee Fix copy-and-pasteo. 2010-11-15 11:52:56 -03:00
Simon Riggs 52010027ef Avoid spurious Hot Standby conflicts from btree delete records.
Similar conflicts were already avoided for related record types.
Massive over-caution resulted in a usability bug. Clear theoretical
basis for doing this is now confirmed by me.
Request to remove from Heikki (twice), over-caution by me.
2010-11-15 09:30:13 +00:00
Tom Lane 357edc9a99 Adjust comments about what's needed to avoid make 3.80 bug.
... based on further tracing through that code.
2010-11-15 01:00:48 -05:00
Robert Haas 5ccbc3d802 Correct poor grammar in comment. 2010-11-14 23:10:45 -05:00
Robert Haas 5aa446c961 Cleanup various comparisons with the constant "true".
Itagaki Takahiro, with slight modifications.
2010-11-14 21:03:48 -05:00
Tom Lane 3892a2d861 Fix canAcceptConnections() bugs introduced by replication-related patches.
We must not return any "okay to proceed" result code without having checked
for too many children, else we might fail later on when trying to add the
new child to one of the per-child state arrays.  It's not clear whether
this oversight explains Stefan Kaltenbrunner's recent report, but it could
certainly produce a similar symptom.

Back-patch to 8.4; the logic was not broken before that.
2010-11-14 15:57:37 -05:00
Tom Lane 1bd2012149 Work around make 3.80 bug with long expansions of $(eval).
3.80 breaks if the expansion of $(eval) is long enough to require expansion
of its internal variable_buffer.  For the purposes of $(recurse) that means
it'll work so long as no single evaluation of _create_recursive_target
produces more than 195 bytes.  We can manage that by looping over
subdirectories outside the call instead of complicating the generated rule.
This coding is simpler and more readable anyway.

Or at least, this works for me.  We'll see if the buildfarm likes it.
2010-11-14 12:50:06 -05:00
Tom Lane 2138c701a3 Add missing outfuncs.c support for struct InhRelation.
This is needed to support debug_print_parse, per report from Jon Nelson.
Cursory testing via the regression tests suggests we aren't missing
anything else.
2010-11-13 00:36:14 -05:00
Andrew Dunstan 52e2c12288 Attempt to fix MSVC builds broken by parallel make changes. 2010-11-12 22:55:15 -05:00
Robert Haas 11e482c350 Move copydir() prototype into its own header file.
Having this in src/include/port.h makes no sense, now that copydir.c lives
in src/backend/strorage rather than src/port.  Along the way, remove an
obsolete comment from contrib/pg_upgrade that makes reference to the old
location.
2010-11-12 16:39:53 -05:00
Tom Lane d7304244e2 Fix old oversight in const-simplification of COALESCE() expressions.
Once we have found a non-null constant argument, there is no need to
examine additional arguments of the COALESCE.  The previous coding got it
right only if the constant was in the first argument position; otherwise
it tried to simplify following arguments too, leading to unexpected
behavior like this:

regression=# select coalesce(f1, 42, 1/0) from int4_tbl;
ERROR:  division by zero

It's a minor corner case, but a bug is a bug, so back-patch all the way.
2010-11-12 15:17:30 -05:00
Peter Eisentraut 19e231bbda Improved parallel make support
Replace for loops in makefiles with proper dependencies.  Parallel
make can now span across directories.  Also, make -k and make -q work
properly.

GNU make 3.80 or newer is now required.
2010-11-12 22:15:16 +02:00
Heikki Linnakangas e356743f3e Add missing support for removing foreign data wrapper / server privileges
belonging to a user at DROP OWNED BY. Foreign data wrappers and servers
don't do anything useful yet, which is why no-one has noticed, but since we
have them, seems prudent to fix this. Per report from Chetan Suttraway.
Backpatch to 9.0, 8.4 has the same problem but this patch didn't apply
there so I'm not going to bother.
2010-11-12 15:29:23 +02:00
Heikki Linnakangas 542bdb2146 Fix bug introduced by the recent patch to check that the checkpoint redo
location read from backup label file can be found: wasShutdown was set
incorrectly when a backup label file was found.

Jeff Davis, with a little tweaking by me.
2010-11-11 19:32:11 +02:00
Tom Lane b0f2d681bd Fix line_construct_pm() for the case of "infinite" (DBL_MAX) slope.
This code was just plain wrong: what you got was not a line through the
given point but a line almost indistinguishable from the Y-axis, although
not truly vertical.  The only caller that tries to use this function with
m == DBL_MAX is dist_ps_internal for the case where the lseg is horizontal;
it would end up producing the distance from the given point to the place
where the lseg's line crosses the Y-axis.  That function is used by other
operators too, so there are several operators that could compute wrong
distances from a line segment to something else.  Per bug #5745 from
jindiax.

Back-patch to all supported branches.
2010-11-10 16:52:24 -05:00
Robert Haas 7ba6e4f0e0 Add monitoring function pg_last_xact_replay_timestamp.
Fujii Masao, with a little wordsmithing by me.
2010-11-09 22:52:19 -05:00
Itagaki Takahiro 844ed5dc97 Don't use __declspec (dllimport) for PGDLLEXPORT to reduce warnings
by gcc version 4 on mingw and cygwin. We don't use dllexport here
because dllexport and dllwrap don't work well together.
2010-11-10 12:17:43 +09:00
Tom Lane 80fb2c1f40 Repair memory leakage while ANALYZE-ing complex index expressions.
The general design of memory management in Postgres is that intermediate
results computed by an expression are not freed until the end of the tuple
cycle.  For expression indexes, ANALYZE has to re-evaluate each expression
for each of its sample rows, and it wasn't bothering to free intermediate
results until the end of processing of that index.  This could lead to very
substantial leakage if the intermediate results were large, as in a recent
example from Jakub Ouhrabka.  Fix by doing ResetExprContext for each sample
row.  This necessitates adding a datumCopy step to ensure that the final
expression value isn't recycled too.  Some quick testing suggests that this
change adds at worst about 10% to the time needed to analyze a table with
an expression index; which is annoying, but seems a tolerable price to pay
to avoid unexpected out-of-memory problems.

Back-patch to all supported branches.
2010-11-09 11:55:13 -05:00
Heikki Linnakangas 000efc3dfd In rewriteheap.c (used by VACUUM FULL and CLUSTER), calculate the tuple
length stored in the line pointer the same way it's calculated in the normal
heap_insert() codepath. As noted by Jeff Davis, the length stored by
raw_heap_insert() included padding but the one stored by the normal codepath
did not. While the mismatch seems to be harmless, inconsistency isn't good,
and the normal codepath has received a lot more testing over the years.

Backpatch to 8.3 where the heap rewrite code was introduced.
2010-11-09 17:48:14 +02:00
Tom Lane 54428dbe90 Fix error handling in temp-file deletion with log_temp_files active.
The original coding in FileClose() reset the file-is-temp flag before
unlinking the file, so that if control came back through due to an error,
it wouldn't try to unlink the file twice.  This was correct when written,
but when the log_temp_files feature was added, the logging action was put
in between those two steps.  An error occurring during the logging action
--- such as a query cancel --- would result in the unlink not getting done
at all, as in recent report from Michael Glaesemann.

To fix this, make sure that we do both the stat and the unlink before doing
anything that could conceivably CHECK_FOR_INTERRUPTS.  There is a judgment
call here, which is which log message to emit first: if you can see only
one, which should it be?  I chose to log unlink failure at the risk of
losing the log_temp_files log message --- after all, if the unlink does
fail, the temp file is still there for you to see.

Back-patch to all versions that have log_temp_files.  The code was OK
before that.
2010-11-08 22:14:48 -05:00
Alvaro Herrera 854ae8c3a6 Fix permanent memory leak in autovacuum launcher
get_database_list was uselessly allocating its output data, along some
created along the way, in a permanent memory context.  This didn't
matter when autovacuum was a single, short-lived process, but now that
the launcher is permanent, it shows up as a permanent leak.

To fix, make get_database list allocate its output data in the caller's
context, which is in charge of freeing it when appropriate; and the
memory leaked by heap_beginscan et al is allocated in a throwaway
transaction context.
2010-11-08 18:35:42 -03:00
Tom Lane 947d0c862c Use appendrel planning logic for top-level UNION ALL structures.
Formerly, we could convert a UNION ALL structure inside a subquery-in-FROM
into an appendrel, as a side effect of pulling up the subquery into its
parent; but top-level UNION ALL always caused use of plan_set_operations().
That didn't matter too much because you got an Append-based plan either
way.  However, now that the appendrel code can do things with MergeAppend,
it's worthwhile to hack up the top-level case so it also uses appendrels.

This is a bit of a stopgap; but going much further than this will require
a major rewrite of the planner's set-operations support, which I'm not
prepared to undertake now.  For the moment let's grab the low-hanging fruit.
2010-11-08 15:15:02 -05:00
Tom Lane 543d22fc74 Prevent invoking I/O conversion casts via functional/attribute notation.
PG 8.4 added a built-in feature for casting pretty much any data type to
string types (text, varchar, etc).  We allowed this to work in any of the
historically-allowed syntaxes: CAST(x AS text), x::text, text(x), or
x.text.  However, multiple complaints have shown that it's too easy to
invoke such casts unintentionally in the latter two styles, particularly
field selection.  To cure the problem with the narrowest possible change
of behavior, disallow use of I/O conversion casts from composite types to
string types via functional/attribute syntax.  The new functionality is
still available via cast syntax.

In passing, document the equivalence of functional and attribute syntax
in a more visible place.
2010-11-07 13:03:19 -05:00
Tom Lane e43fb604d6 Implement an "S" option for psql's \dn command.
\dn without "S" now hides all pg_XXX schemas as well as information_schema.
Thus, in a bare database you'll only see "public".  ("public" is considered
a user schema, not a system schema, mainly because it's droppable.)
Per discussion back in late September.
2010-11-06 21:41:14 -04:00
Tom Lane d7a2ce4905 Add support for detecting register-stack overrun on IA64.
Per recent investigation, the register stack can grow faster than the
regular stack depending on compiler and choice of options.  To avoid
crashes we must check both stacks in check_stack_depth().

Since this is poorly-tested code, committing only to HEAD for the
moment ... but we might want to consider back-patching later.
2010-11-06 19:36:29 -04:00
Tom Lane dd1c781903 Make get_stack_depth_rlimit() handle RLIM_INFINITY more sanely.
Rather than considering this result as meaning "unknown", report LONG_MAX.
This won't change what superusers can set max_stack_depth to, but it will
cause InitializeGUCOptions() to set the built-in default to 2MB not 100kB.
The latter seems like a fairly unreasonable interpretation of "infinity".
Per my investigation of odd buildfarm results as well as an old complaint
from Heikki.

Since this should persuade all the buildfarm animals to use a reasonable
stack depth setting during "make check", revert previous patch that dumbed
down a recursive regression test to only 5 levels.
2010-11-06 16:50:18 -04:00
Tom Lane 6736916f5f Include the current value of max_stack_depth in stack depth complaints.
I'm mainly interested in finding out what it is on buildfarm machines,
but including the active value in the message seems like good practice
in any case.  Add the info to the HINT, not the ERROR string, so as not
to change the regression tests' expected output.
2010-11-04 17:15:38 -04:00
Tom Lane 09211659d9 Use appendStringInfoString() where appropriate in elog.c.
The nominally equivalent call appendStringInfo(buf, "%s", str) can be
significantly slower when str is large.  In particular, the former usage in
EVALUATE_MESSAGE led to O(N^2) behavior when collecting a large number of
context lines, as I found out while testing recursive functions.  The other
changes are just neatnik-ism and seem unlikely to save anything meaningful,
but a cycle shaved is a cycle earned.
2010-11-04 15:28:35 -04:00
Tom Lane 034967bdcb Reimplement planner's handling of MIN/MAX aggregate optimization.
Per my recent proposal, get rid of all the direct inspection of indexes
and manual generation of paths in planagg.c.  Instead, set up
EquivalenceClasses for the aggregate argument expressions, and let the
regular path generation logic deal with creating paths that can satisfy
those sort orders.  This makes planagg.c a bit more visible to the rest
of the planner than it was originally, but the approach is basically a lot
cleaner than before.  A major advantage of doing it this way is that we get
MIN/MAX optimization on inheritance trees (using MergeAppend of indexscans)
practically for free, whereas in the old way we'd have had to add a whole
lot more duplicative logic.

One small disadvantage of this approach is that MIN/MAX aggregates can no
longer exploit partial indexes having an "x IS NOT NULL" predicate, unless
that restriction or something that implies it is specified in the query.
The previous implementation was able to use the added "x IS NOT NULL"
condition as an extra predicate proof condition, but in this version we
rely entirely on indexes that are considered usable by the main planning
process.  That seems a fair tradeoff for the simplicity and functionality
gained.
2010-11-04 12:01:17 -04:00
Tom Lane 0abc8fdd4d Reduce recursion depth in recently-added regression test.
Some buildfarm members fail the test with the original depth of 10 levels,
apparently because they are running at the minimum max_stack_depth setting
of 100kB and using ~ 10k per recursion level.  While it might be
interesting to try to figure out why they're eating so much stack, it isn't
likely that any fix for that would be back-patchable.  So just change the
test to recurse only 5 levels.  The extra levels don't prove anything
correctness-wise anyway.
2010-11-03 13:41:46 -04:00
Tom Lane 70a0160b07 Use only one hash entry for all instances of a pltcl trigger function.
Like plperl and unlike plpgsql, there isn't any cached state that could
depend on exactly which relation the trigger is being fired for.  So we
can use just one hash entry for all relations, which might save a little
something.

Alex Hunsaker
2010-11-03 12:26:55 -04:00
Tom Lane 61d6dd0c03 Fix adjust_semi_join to be more cautious about clauseless joins.
It was reporting that these were fully indexed (hence cheap), when of
course they're the exact opposite of that.  I'm not certain if the case
would arise in practice, since a clauseless semijoin is hard to produce
in SQL, but if it did happen we'd make some dumb decisions.
2010-11-02 18:45:36 -04:00
Tom Lane 9f376e146b Ensure an index that uses a whole-row Var still depends on its table.
We failed to record any dependency on the underlying table for an index
declared like "create index i on t (foo(t.*))".  This would create trouble
if the table were dropped without previously dropping the index.  To fix,
simplify some overly-cute code in index_create(), accepting the possibility
that sometimes the whole-table dependency will be redundant.  Also document
this hazard in dependency.c.  Per report from Kevin Grittner.

In passing, prevent a core dump in pg_get_indexdef() if the index's table
can't be found.  I came across this while experimenting with Kevin's
example.  Not sure it's a real issue when the catalogs aren't corrupt, but
might as well be cautious.

Back-patch to all supported versions.
2010-11-02 17:15:07 -04:00
Michael Meskes 35d5d962e1 Some cleanup in ecpg code:
Use bool as type for booleans instead of int.
Do not implicitely cast size_t to int.
Make the compiler stop complaining about unused variables by adding an empty statement.
2010-11-02 18:12:01 +01:00
Heikki Linnakangas 8c843fff2d Bootstrap WAL to begin at segment logid=0 logseg=1 (000000010000000000000001)
rather than 0/0, so that we can safely use 0/0 as an invalid value. This is a
more future-proof fix for the corner-case bug in streaming replication that
was fixed yesterday. We had a similar corner-case bug with log/seg 0/0 back in
February as well. Avoiding 0/0 as a valid value should prevent bugs like that
in the future. Per Tom Lane's idea.

Back-patch to 9.0. Since this only affects bootstrapping, it makes no
difference to existing installations. We don't need to worry about the
bug in existing installations, because if you've managed to get past the
initial base backup already, you won't hit the bug in the future either.
2010-11-02 11:39:48 +02:00
Tom Lane 0811ff2063 Avoid using a local FunctionCallInfoData struct in ExecMakeFunctionResult
and related routines.

We already had a redundant FunctionCallInfoData struct in FuncExprState,
but were using that copy only in set-returning-function cases, to avoid
keeping function evaluation state in the expression tree for the benefit
of plpgsql's "simple expression" logic.  But of course that didn't work
anyway.  Given the recent fixes in plpgsql there is no need to have two
separate behaviors here.  Getting rid of the local FunctionCallInfoData
structs should make things a little faster (because we don't need to do
InitFunctionCallInfoData each time), and it also makes for a noticeable
reduction in stack space consumption during recursive calls.
2010-11-01 13:54:21 -04:00
Heikki Linnakangas 931b6db39b Fix corner-case bug in tracking of latest removed WAL segment during
streaming replication. We used log/seg 0/0 to indicate that no WAL segments
have been removed since startup, but 0/0 is a valid value for the very first
WAL segment after initdb. To make that disambiguous, store
(latest removed WAL segment + 1) in the global variable.

Per report from Matt Chesler, also reproduced by Greg Smith.
2010-11-01 10:05:15 +02:00
Tom Lane 76b12e0af7 Revert removal of trigger flag from plperl function hash key.
As noted by Jan Urbanski, this flag is in fact needed to ensure that the
function's input/result conversion functions are set up as expected.

Add a regression test to discourage anyone from making same mistake
in future.
2010-10-31 11:42:51 -04:00
Tom Lane 186cbbda8f Provide hashing support for arrays.
The core of this patch is hash_array() and associated typcache
infrastructure, which works just about exactly like the existing support
for array comparison.

In addition I did some work to ensure that the planner won't think that an
array type is hashable unless its element type is hashable, and similarly
for sorting.  This includes adding a datatype parameter to op_hashjoinable
and op_mergejoinable, and adding an explicit "hashable" flag to
SortGroupClause.  The lack of a cross-check on the element type was a
pre-existing bug in mergejoin support --- but it didn't matter so much
before, because if you couldn't sort the element type there wasn't any good
alternative to failing anyhow.  Now that we have the alternative of hashing
the array type, there are cases where we can avoid a failure by being picky
at the planner stage, so it's time to be picky.

The issue of exactly how to combine the per-element hash values to produce
an array hash is still open for discussion, but the rest of this is pretty
solid, so I'll commit it as-is.
2010-10-30 21:56:11 -04:00
Tom Lane bfd3f37be3 Fix comparisons of pointers with zero to compare with NULL instead.
Per C standard, these are semantically the same thing; but saying NULL
when you mean NULL is good for readability.

Marti Raudsepp, per results of INRIA's Coccinelle.
2010-10-29 15:51:52 -04:00
Tom Lane 48a1fb2390 Oops, missed one fix for EquivalenceClass rearrangement.
Now that we're expecting a mergeclause's left_ec/right_ec to persist from
the initial assignments, we can't just blithely zero these out when
transforming such a clause in adjust_appendrel_attrs.  But really it should
be okay to keep the parent's values, since a child table's derived Var
ought to be equivalent to the parent Var for all EquivalenceClass purposes.
(Indeed, I'm wondering whether we couldn't find a way to dispense with
add_child_rel_equivalences altogether.  But this is wrong in any case.)
2010-10-29 14:44:49 -04:00