status, including a status where the server is running but refuses a
postgres connection.
Have pg_ctl use this new function. This fixes the case where pg_ctl
reports that the server is not running (cannot connect) but in fact it
is running.
After a SQL object is created, we provide an opportunity for security
or logging plugins to get control; for example, a security label provider
could use this to assign an initial security label to newly created
objects. The basic infrastructure is (hopefully) reusable for other types
of events that might require similar treatment.
KaiGai Kohei, with minor adjustments.
Forcibly releasing all leftover buffer pins should be unnecessary now
that we have a robust ResourceOwner mechanism, and it significantly
increases the cost of process shutdown. Instead, in an assert-enabled
build, assert that no pins are held; in a non-assert-enabled build, do
nothing.
supplied, also print the IP address. This allows IPv4 and IPv6 failures
to be distinguished. Also useful when a hostname resolves to multiple
IP addresses.
Also, remove use of inet_ntoa() and use our own inet_net_ntop() in all
places, including in libpq, because it is thread-safe.
This commit adds columns amoppurpose and amopsortfamily to pg_amop, and
column amcanorderbyop to pg_am. For the moment all the entries in
amcanorderbyop are "false", since the underlying support isn't there yet.
Also, extend the CREATE OPERATOR CLASS/ALTER OPERATOR FAMILY commands with
[ FOR SEARCH | FOR ORDER BY sort_operator_family ] clauses to allow the new
columns of pg_amop to be populated, and create pg_dump support for dumping
that information.
I also added some documentation, although it's perhaps a bit premature
given that the feature doesn't do anything useful yet.
Teodor Sigaev, Robert Haas, Tom Lane
Any flavor of ALTER <whatever> .. SET SCHEMA fails if (1) the object
is already in the new schema, (2) either the old or new schema is
a temp schema, or (3) either the old or new schema is the TOAST schema.
Extraced from a patch by Dimitri Fontaine, with additional hacking by me.
Currently, three conversion format specifiers are supported: %s for a
string, %L for an SQL literal, and %I for an SQL identifier. The latter
two are deliberately designed not to overlap with what sprintf() already
supports, in case we want to add more of sprintf()'s functionality here
later.
Patch by Pavel Stehule, heavily revised by me. Reviewed by Jeff Janes
and, in earlier versions, by Itagaki Takahiro and Tom Lane.
We no longer need the terminating zero entry in opfamily[], so get rid of
it. Also replace assorted ad-hoc looping logic with simple for and foreach
constructs. This code is now noticeably more readable than it was an hour
ago; credit to Robert for seeing that it could be simplified.
Avoid depending on LL notation, which is likely to not work in pre-C99
compilers; don't pointlessly use INT32_MIN/INT64_MIN in code that has
the numerical value hard-wired into it anyway; remove some gratuitous
style inconsistencies between pg_ltoa and pg_lltoa; fix int2 test case
so it actually tests int2.
This eliminates the need for inefficient implementions of this
functionality in both contrib/dblink and contrib/tablefunc, so remove
them. The upcoming patch implementing an in-core format() function
will also require this functionality.
In passing, add some regression tests.
Use INT_MIN rather than INT32_MIN as we do elsewhere in the code, and
try to work around nonexistence of INT64_MIN if necessary. Adjust the
new regression tests to something hopefully saner, per observation by
Tom Lane.
When using default autovacuum_vac_cost_limit, autovac_balance_cost relied
on VacuumCostLimit to contain the correct global value ... but after the
first time through in a particular worker process, it didn't, because we'd
trashed it in previous iterations. Depending on the state of other autovac
workers, this could result in a steady reduction of the effective
cost_limit setting as a particular worker processed more and more tables,
causing it to go slower and slower. Spotted by Simon Poole (bug #5759).
Fix by saving and restoring the GUC variables in the loop in do_autovacuum.
In passing, improve a few comments.
Back-patch to 8.3 ... the cost rebalancing code has been buggy since it was
put in.
A hand-coded implementation turns out to be much faster than calling
printf(). In passing, add a few more regresion tests.
Andres Freund, with assorted, mostly cosmetic changes.
As per the ancient comment for set_rel_width, it really wasn't much good
for relations that aren't plain tables: it would never find any stats and
would always fall back on datatype-based estimates, which are often pretty
silly. Fix that by copying up width estimates from the subquery planning
process.
At some point we might want to do this for CTEs too, but that would be a
significantly more invasive patch because the sub-PlannerInfo is no longer
accessible by the time it's needed. I refrained from doing anything about
that, partly for fear of breaking the unmerged CTE-related patches.
In passing, also generate less bogus width estimates for whole-row Vars.
Per a gripe from Jon Nelson.
Given a column reference foo.bar, where there is a composite plpgsql
variable foo but it doesn't contain a column bar, the pre-9.0 coding would
immediately throw a "record foo has no field bar" error. In 9.0 the parser
hook instead falls through to let the core parser see if it can resolve the
reference. If not, you get a complaint about "missing FROM-clause entry
for table foo", which while in some sense correct isn't terribly helpful.
Complicate things a bit so that we can throw the old error message if
neither the core parser nor the hook are able to resolve the column
reference, while not changing the behavior in any other case.
Per bug #5757 from Andrey Galkin.
If we have Limit->Result->Sort, the Result might be projecting a tlist
that contains a set-returning function. If so, it's possible for the
SRF to sometimes return zero rows, which means we could need to fetch
more than N rows from the Sort in order to satisfy LIMIT N.
So top-N sorting cannot be used in this scenario.
Fix things so that top-N sorting can be used in child Sort nodes of a
MergeAppend node, when there is a LIMIT and no intervening joins or
grouping. Actually doing this on the executor side isn't too bad,
but it's a bit messier to get the planner to cost it properly.
Per gripe from Robert Haas.
In passing, fix an oversight in the original top-N-sorting patch:
query_planner should not assume that a LIMIT can be used to make an
explicit sort cheaper when there will be grouping or aggregation in
between. Possibly this should be back-patched, but I'm not sure the
mistake is serious enough to be a real problem in practice.
In the previous coding, we simply issued ALTER SEQUENCE RESTART commands,
which do not roll back on error. This meant that an error between
truncating and committing left the sequences out of sync with the table
contents, with potentially bad consequences as were noted in a Warning on
the TRUNCATE man page.
To fix, create a new storage file (relfilenode) for a sequence that is to
be reset due to RESTART IDENTITY. If the transaction aborts, we'll
automatically revert to the old storage file. This acts just like a
rewriting ALTER TABLE operation. A penalty is that we have to take
exclusive lock on the sequence, but since we've already got exclusive lock
on its owning table, that seems unlikely to be much of a problem.
The interaction of this with usual nontransactional behaviors of sequence
operations is a bit weird, but it's hard to see what would be completely
consistent. Our choice is to discard cached-but-unissued sequence values
both when the RESTART is executed, and at rollback if any; but to not touch
the currval() state either time.
In passing, move the sequence reset operations to happen before not after
any AFTER TRUNCATE triggers are fired. The previous ordering was not
logically sensible, but was forced by the need to minimize inconsistency
if the triggers caused an error. Transactional rollback is a much better
solution to that.
Patch by Steve Singer, rather heavily adjusted by me.
Add some additional dependencies to constrain the build order to prevent
parallel make from failing. In the case of src/Makefile, this is likely to be
too complicated to be worth maintaining, so just add .NOTPARALLEL to get the
old for-loop-like behavior.
More fine-tuning might be necessary for some platforms or configurations.
The handle to the shared memory segment containing startup
parameters was sent as 32-bit even on 64-bit systems. Since
HANDLEs appear to be allocated sequentially this shouldn't
be a problem until we reach 2^32 open handles in the postmaster,
but a 64-bit value should be sent across as 64-bit, and not
zero out the top 32 bits.
Noted by Tom Lane.
temporary indexes are not WAL-logged. We used a constant LSN for temporary
indexes, on the assumption that we don't need to worry about concurrent page
splits in temporary indexes because they're only visible to the current
session. But that assumption is wrong, it's possible to insert rows and
split pages in the same session, while a scan is in progress. For example,
by opening a cursor and fetching some rows, and INSERTing new rows before
fetching some more.
Fix by generating fake increasing LSNs, used in place of real LSNs in
temporary GiST indexes.
We must stay in the function's SPI context until done calling the iterator
that returns the set result. Otherwise, any attempt to invoke SPI features
in the python code called by the iterator will malfunction. Diagnosis and
patch by Jan Urbanski, per bug report from Jean-Baptiste Quenot.
Back-patch to 8.2; there was no support for SRFs in previous versions of
plpython.
This new field counts the number of times that a backend which writes a
buffer out to the OS must also fsync() it. This happens when the
bgwriter fsync request queue is full, and is generally detrimental to
performance, so it's good to know when it's happening. Along the way,
log a new message at level DEBUG1 whenever we fail to hand off an fsync,
so that the problem can also be seen in examination of log files
(if the logging level is cranked up high enough).
Greg Smith, with minor tweaks by me.
Similar conflicts were already avoided for related record types.
Massive over-caution resulted in a usability bug. Clear theoretical
basis for doing this is now confirmed by me.
Request to remove from Heikki (twice), over-caution by me.
We must not return any "okay to proceed" result code without having checked
for too many children, else we might fail later on when trying to add the
new child to one of the per-child state arrays. It's not clear whether
this oversight explains Stefan Kaltenbrunner's recent report, but it could
certainly produce a similar symptom.
Back-patch to 8.4; the logic was not broken before that.
3.80 breaks if the expansion of $(eval) is long enough to require expansion
of its internal variable_buffer. For the purposes of $(recurse) that means
it'll work so long as no single evaluation of _create_recursive_target
produces more than 195 bytes. We can manage that by looping over
subdirectories outside the call instead of complicating the generated rule.
This coding is simpler and more readable anyway.
Or at least, this works for me. We'll see if the buildfarm likes it.
This is needed to support debug_print_parse, per report from Jon Nelson.
Cursory testing via the regression tests suggests we aren't missing
anything else.
Having this in src/include/port.h makes no sense, now that copydir.c lives
in src/backend/strorage rather than src/port. Along the way, remove an
obsolete comment from contrib/pg_upgrade that makes reference to the old
location.
Once we have found a non-null constant argument, there is no need to
examine additional arguments of the COALESCE. The previous coding got it
right only if the constant was in the first argument position; otherwise
it tried to simplify following arguments too, leading to unexpected
behavior like this:
regression=# select coalesce(f1, 42, 1/0) from int4_tbl;
ERROR: division by zero
It's a minor corner case, but a bug is a bug, so back-patch all the way.
Replace for loops in makefiles with proper dependencies. Parallel
make can now span across directories. Also, make -k and make -q work
properly.
GNU make 3.80 or newer is now required.
belonging to a user at DROP OWNED BY. Foreign data wrappers and servers
don't do anything useful yet, which is why no-one has noticed, but since we
have them, seems prudent to fix this. Per report from Chetan Suttraway.
Backpatch to 9.0, 8.4 has the same problem but this patch didn't apply
there so I'm not going to bother.
location read from backup label file can be found: wasShutdown was set
incorrectly when a backup label file was found.
Jeff Davis, with a little tweaking by me.
This code was just plain wrong: what you got was not a line through the
given point but a line almost indistinguishable from the Y-axis, although
not truly vertical. The only caller that tries to use this function with
m == DBL_MAX is dist_ps_internal for the case where the lseg is horizontal;
it would end up producing the distance from the given point to the place
where the lseg's line crosses the Y-axis. That function is used by other
operators too, so there are several operators that could compute wrong
distances from a line segment to something else. Per bug #5745 from
jindiax.
Back-patch to all supported branches.
The general design of memory management in Postgres is that intermediate
results computed by an expression are not freed until the end of the tuple
cycle. For expression indexes, ANALYZE has to re-evaluate each expression
for each of its sample rows, and it wasn't bothering to free intermediate
results until the end of processing of that index. This could lead to very
substantial leakage if the intermediate results were large, as in a recent
example from Jakub Ouhrabka. Fix by doing ResetExprContext for each sample
row. This necessitates adding a datumCopy step to ensure that the final
expression value isn't recycled too. Some quick testing suggests that this
change adds at worst about 10% to the time needed to analyze a table with
an expression index; which is annoying, but seems a tolerable price to pay
to avoid unexpected out-of-memory problems.
Back-patch to all supported branches.
length stored in the line pointer the same way it's calculated in the normal
heap_insert() codepath. As noted by Jeff Davis, the length stored by
raw_heap_insert() included padding but the one stored by the normal codepath
did not. While the mismatch seems to be harmless, inconsistency isn't good,
and the normal codepath has received a lot more testing over the years.
Backpatch to 8.3 where the heap rewrite code was introduced.
The original coding in FileClose() reset the file-is-temp flag before
unlinking the file, so that if control came back through due to an error,
it wouldn't try to unlink the file twice. This was correct when written,
but when the log_temp_files feature was added, the logging action was put
in between those two steps. An error occurring during the logging action
--- such as a query cancel --- would result in the unlink not getting done
at all, as in recent report from Michael Glaesemann.
To fix this, make sure that we do both the stat and the unlink before doing
anything that could conceivably CHECK_FOR_INTERRUPTS. There is a judgment
call here, which is which log message to emit first: if you can see only
one, which should it be? I chose to log unlink failure at the risk of
losing the log_temp_files log message --- after all, if the unlink does
fail, the temp file is still there for you to see.
Back-patch to all versions that have log_temp_files. The code was OK
before that.
get_database_list was uselessly allocating its output data, along some
created along the way, in a permanent memory context. This didn't
matter when autovacuum was a single, short-lived process, but now that
the launcher is permanent, it shows up as a permanent leak.
To fix, make get_database list allocate its output data in the caller's
context, which is in charge of freeing it when appropriate; and the
memory leaked by heap_beginscan et al is allocated in a throwaway
transaction context.
Formerly, we could convert a UNION ALL structure inside a subquery-in-FROM
into an appendrel, as a side effect of pulling up the subquery into its
parent; but top-level UNION ALL always caused use of plan_set_operations().
That didn't matter too much because you got an Append-based plan either
way. However, now that the appendrel code can do things with MergeAppend,
it's worthwhile to hack up the top-level case so it also uses appendrels.
This is a bit of a stopgap; but going much further than this will require
a major rewrite of the planner's set-operations support, which I'm not
prepared to undertake now. For the moment let's grab the low-hanging fruit.
PG 8.4 added a built-in feature for casting pretty much any data type to
string types (text, varchar, etc). We allowed this to work in any of the
historically-allowed syntaxes: CAST(x AS text), x::text, text(x), or
x.text. However, multiple complaints have shown that it's too easy to
invoke such casts unintentionally in the latter two styles, particularly
field selection. To cure the problem with the narrowest possible change
of behavior, disallow use of I/O conversion casts from composite types to
string types via functional/attribute syntax. The new functionality is
still available via cast syntax.
In passing, document the equivalence of functional and attribute syntax
in a more visible place.
\dn without "S" now hides all pg_XXX schemas as well as information_schema.
Thus, in a bare database you'll only see "public". ("public" is considered
a user schema, not a system schema, mainly because it's droppable.)
Per discussion back in late September.
Per recent investigation, the register stack can grow faster than the
regular stack depending on compiler and choice of options. To avoid
crashes we must check both stacks in check_stack_depth().
Since this is poorly-tested code, committing only to HEAD for the
moment ... but we might want to consider back-patching later.
Rather than considering this result as meaning "unknown", report LONG_MAX.
This won't change what superusers can set max_stack_depth to, but it will
cause InitializeGUCOptions() to set the built-in default to 2MB not 100kB.
The latter seems like a fairly unreasonable interpretation of "infinity".
Per my investigation of odd buildfarm results as well as an old complaint
from Heikki.
Since this should persuade all the buildfarm animals to use a reasonable
stack depth setting during "make check", revert previous patch that dumbed
down a recursive regression test to only 5 levels.
I'm mainly interested in finding out what it is on buildfarm machines,
but including the active value in the message seems like good practice
in any case. Add the info to the HINT, not the ERROR string, so as not
to change the regression tests' expected output.
The nominally equivalent call appendStringInfo(buf, "%s", str) can be
significantly slower when str is large. In particular, the former usage in
EVALUATE_MESSAGE led to O(N^2) behavior when collecting a large number of
context lines, as I found out while testing recursive functions. The other
changes are just neatnik-ism and seem unlikely to save anything meaningful,
but a cycle shaved is a cycle earned.
Per my recent proposal, get rid of all the direct inspection of indexes
and manual generation of paths in planagg.c. Instead, set up
EquivalenceClasses for the aggregate argument expressions, and let the
regular path generation logic deal with creating paths that can satisfy
those sort orders. This makes planagg.c a bit more visible to the rest
of the planner than it was originally, but the approach is basically a lot
cleaner than before. A major advantage of doing it this way is that we get
MIN/MAX optimization on inheritance trees (using MergeAppend of indexscans)
practically for free, whereas in the old way we'd have had to add a whole
lot more duplicative logic.
One small disadvantage of this approach is that MIN/MAX aggregates can no
longer exploit partial indexes having an "x IS NOT NULL" predicate, unless
that restriction or something that implies it is specified in the query.
The previous implementation was able to use the added "x IS NOT NULL"
condition as an extra predicate proof condition, but in this version we
rely entirely on indexes that are considered usable by the main planning
process. That seems a fair tradeoff for the simplicity and functionality
gained.
Some buildfarm members fail the test with the original depth of 10 levels,
apparently because they are running at the minimum max_stack_depth setting
of 100kB and using ~ 10k per recursion level. While it might be
interesting to try to figure out why they're eating so much stack, it isn't
likely that any fix for that would be back-patchable. So just change the
test to recurse only 5 levels. The extra levels don't prove anything
correctness-wise anyway.
Like plperl and unlike plpgsql, there isn't any cached state that could
depend on exactly which relation the trigger is being fired for. So we
can use just one hash entry for all relations, which might save a little
something.
Alex Hunsaker
It was reporting that these were fully indexed (hence cheap), when of
course they're the exact opposite of that. I'm not certain if the case
would arise in practice, since a clauseless semijoin is hard to produce
in SQL, but if it did happen we'd make some dumb decisions.
We failed to record any dependency on the underlying table for an index
declared like "create index i on t (foo(t.*))". This would create trouble
if the table were dropped without previously dropping the index. To fix,
simplify some overly-cute code in index_create(), accepting the possibility
that sometimes the whole-table dependency will be redundant. Also document
this hazard in dependency.c. Per report from Kevin Grittner.
In passing, prevent a core dump in pg_get_indexdef() if the index's table
can't be found. I came across this while experimenting with Kevin's
example. Not sure it's a real issue when the catalogs aren't corrupt, but
might as well be cautious.
Back-patch to all supported versions.
Use bool as type for booleans instead of int.
Do not implicitely cast size_t to int.
Make the compiler stop complaining about unused variables by adding an empty statement.
rather than 0/0, so that we can safely use 0/0 as an invalid value. This is a
more future-proof fix for the corner-case bug in streaming replication that
was fixed yesterday. We had a similar corner-case bug with log/seg 0/0 back in
February as well. Avoiding 0/0 as a valid value should prevent bugs like that
in the future. Per Tom Lane's idea.
Back-patch to 9.0. Since this only affects bootstrapping, it makes no
difference to existing installations. We don't need to worry about the
bug in existing installations, because if you've managed to get past the
initial base backup already, you won't hit the bug in the future either.
and related routines.
We already had a redundant FunctionCallInfoData struct in FuncExprState,
but were using that copy only in set-returning-function cases, to avoid
keeping function evaluation state in the expression tree for the benefit
of plpgsql's "simple expression" logic. But of course that didn't work
anyway. Given the recent fixes in plpgsql there is no need to have two
separate behaviors here. Getting rid of the local FunctionCallInfoData
structs should make things a little faster (because we don't need to do
InitFunctionCallInfoData each time), and it also makes for a noticeable
reduction in stack space consumption during recursive calls.
streaming replication. We used log/seg 0/0 to indicate that no WAL segments
have been removed since startup, but 0/0 is a valid value for the very first
WAL segment after initdb. To make that disambiguous, store
(latest removed WAL segment + 1) in the global variable.
Per report from Matt Chesler, also reproduced by Greg Smith.
As noted by Jan Urbanski, this flag is in fact needed to ensure that the
function's input/result conversion functions are set up as expected.
Add a regression test to discourage anyone from making same mistake
in future.
The core of this patch is hash_array() and associated typcache
infrastructure, which works just about exactly like the existing support
for array comparison.
In addition I did some work to ensure that the planner won't think that an
array type is hashable unless its element type is hashable, and similarly
for sorting. This includes adding a datatype parameter to op_hashjoinable
and op_mergejoinable, and adding an explicit "hashable" flag to
SortGroupClause. The lack of a cross-check on the element type was a
pre-existing bug in mergejoin support --- but it didn't matter so much
before, because if you couldn't sort the element type there wasn't any good
alternative to failing anyhow. Now that we have the alternative of hashing
the array type, there are cases where we can avoid a failure by being picky
at the planner stage, so it's time to be picky.
The issue of exactly how to combine the per-element hash values to produce
an array hash is still open for discussion, but the rest of this is pretty
solid, so I'll commit it as-is.
Per C standard, these are semantically the same thing; but saying NULL
when you mean NULL is good for readability.
Marti Raudsepp, per results of INRIA's Coccinelle.
Now that we're expecting a mergeclause's left_ec/right_ec to persist from
the initial assignments, we can't just blithely zero these out when
transforming such a clause in adjust_appendrel_attrs. But really it should
be okay to keep the parent's values, since a child table's derived Var
ought to be equivalent to the parent Var for all EquivalenceClass purposes.
(Indeed, I'm wondering whether we couldn't find a way to dispense with
add_child_rel_equivalences altogether. But this is wrong in any case.)
Zoltan Boszormenyi exhibited a test case in which planning time was
dominated by construction of EquivalenceClasses and PathKeys that had no
actual relevance to the query (and in fact got discarded immediately).
This happened because we generated PathKeys describing the sort ordering of
every index on every table in the query, and only after that checked to see
if the sort ordering was relevant. The EC/PK construction code is O(N^2)
in the number of ECs, which is all right for the intended number of such
objects, but it gets out of hand if there are ECs for lots of irrelevant
indexes.
To fix, twiddle the handling of mergeclauses a little bit to ensure that
every interesting EC is created before we begin path generation. (This
doesn't cost anything --- in fact I think it's a bit cheaper than before
--- since we always eventually created those ECs anyway.) Then, if an
index column can't be found in any pre-existing EC, we know that that sort
ordering is irrelevant for the query. Instead of creating a useless EC,
we can just not build a pathkey for the index column in the first place.
The index will still be considered if it's useful for non-order-related
reasons, but we will think of its output as unsorted.
The previous wording might have suggested that \du only showed login roles
and \dg only group roles, but that is no longer the case.
proposed by Josh Kupershmidt
Instead of using ExecPrepareExpr, call ExecInitExpr. The net change here
is that we don't apply expression_planner() to the expression tree. There
is no need to do so, because that tree is extracted from a fully planned
plancache entry, so all the needed work is already done. This reduces
the setup costs by about a factor of 2 according to some simple tests.
Oversight noted while fooling around with the simple-expression code for
previous fix.
In general, expression execution state trees aren't re-entrantly usable,
since functions can store private state information in them.
For efficiency reasons, plpgsql tries to cache and reuse state trees for
"simple" expressions. It can get away with that most of the time, but it
can fail if the state tree is dirty from a previous failed execution (as
in an example from Alvaro) or is being used recursively (as noted by me).
Fix by tracking whether a state tree is in use, and falling back to the
"non-simple" code path if so. This results in a pretty considerable speed
hit when the non-simple path is taken, but the available alternatives seem
even more unpleasant because they add overhead in the simple path. Per
idea from Heikki.
Back-patch to all supported branches.
Original patch failed to include new exclusive states in a switch that
needed to include them; and also was guilty of very fuzzy thinking
about how to handle error cases. Per bug #5729 from Alan Choi.
- Avoid closing stdin, since we didn't open it. Previously multiple
inclusions of stdin would be terminated with a single quit, now a separate
quit is needed for each invocation. Previous behavior also accessed stdin
after it was fclose()d, which is undefined behavior per ANSI C.
- Properly restore pset.inputfile, since the caller expects to be able
to free that memory.
Marti Raudsepp
that WAL file containing the checkpoint redo-location can be found. This
avoids making the cluster irrecoverable if the redo location is in an earlie
WAL file than the checkpoint record.
Report, analysis and patch by Jeff Davis, with small changes by me.
Split the old typenameTypeId() into two functions: A new typenameTypeId() that
returns only a type OID, and typenameTypeIdAndMod() that returns type OID and
typmod. This isolates call sites better that actually care about the typmod.
A NestLoopParam's value can only be a Var or Aggref, but this isn't the
case in general for SubPlan parameters, so print_parameter_expr had better
be prepared to cope. Brain fade in my recent patch to print the referenced
expression instead of just printing $N for PARAM_EXEC Params. Per report
from Pavel Stehule.
This avoids a possible crash when inlining a SRF whose argument list
contains a reference to an inline-able user function. The crash is quite
reproducible with CLOBBER_FREED_MEMORY enabled, but would be less certain
in a production build. Problem introduced in 9.0 by the named-arguments
patch, which requires invoking eval_const_expressions() before we can try
to inline a SRF. Per report from Brendan Jurd.
After much expenditure of effort, we've got this to the point where the
performance penalty is pretty minimal in typical cases.
Andrew Dunstan, reviewed by Brendan Jurd, Dean Rasheed, and Tom Lane
as a variable or column name, and it's not reserved in recent versions of
the SQL spec either. This became particularly annoying in 9.0, before that
PL/pgSQL replaced variable names in queries with parameter markers, so
it was possible to use OFF and many other backend parser keywords as
variable names. Because of that, backpatch to 9.0.
This patch eliminates various bizarre behaviors caused by sloppy thinking
about the difference between a domain type and its underlying array type.
In particular, the operation of updating one element of such an array
has to be considered as yielding a value of the underlying array type,
*not* a value of the domain, because there's no assurance that the
domain's CHECK constraints are still satisfied. If we're intending to
store the result back into a domain column, we have to re-cast to the
domain type so that constraints are re-checked.
For similar reasons, such a domain can't be blindly matched to an ANYARRAY
polymorphic parameter, because the polymorphic function is likely to apply
array-ish operations that could invalidate the domain constraints. For the
moment, we just forbid such matching. We might later wish to insert an
automatic downcast to the underlying array type, but such a change should
also change matching of domains to ANYELEMENT for consistency.
To ensure that all such logic is rechecked, this patch removes the original
hack of setting a domain's pg_type.typelem field to match its base type;
the typelem will always be zero instead. In those places where it's really
okay to look through the domain type with no other logic changes, use the
newly added get_base_element_type function in place of get_element_type.
catversion bumped due to change in pg_type contents.
Per bug #5717 from Richard Huxton and subsequent discussion.
outside a transaction.
This repairs brain fade in my patch of 2009-08-30: the reason we had been
storing oldest-database name, not OID, in ShmemVariableCache was of course
to avoid having to do a catalog lookup at times when it might be unsafe.
This error explains why Aleksandr Dushein is having trouble getting out of
an XID wraparound state in bug #5718, though not how he got into that state
in the first place. I suspect pg_upgrade is at fault there.
This call was present in the aboriginal code from Berkeley, and has
never been touched; it may very well be that it was there to mask
effects of bugs in other places and it may no longer be necessary.
The removal has been foreseen in a code comment since 2007; this seems
to be a good time to test this hypothesis.
The trick is to not try to build executables directly from .c files,
but to always build the intermediate .o files. For obscure reasons,
Darwin's version of gcc will leave debug cruft behind in the first
case but not the second. Per complaint from Robert Haas.
A couple of places in the planner need to generate whole-row Vars, and were
cutting corners by setting vartype = RECORDOID in the Vars, even in cases
where there's an identifiable named composite type for the RTE being
referenced. While we mostly got away with this, it failed when there was
also a parser-generated whole-row reference to the same RTE, because the
two Vars weren't equal() due to the difference in vartype. Fix by
providing a subroutine the planner can call to generate whole-row Vars
the same way the parser does.
Per bug #5716 from Andrew Tipton. Back-patch to 9.0 where one of the bogus
calls was introduced (the other one is new in HEAD).
The GIN code has absolutely no business exporting GIN-specific functions
with names as generic as compareItemPointers() or newScanKey(); that's
just trouble waiting to happen. I got annoyed about this again just now
and decided to fix it. This commit ensures that all global symbols
defined in access/gin/ have names including "gin" or "Gin". There were a
couple of cases, like names involving "PostingItem", where arguably the
names were already sufficiently nongeneric; but I figured as long as I was
risking creating merge problems for unapplied GIN patches I might as well
impose a uniform policy.
I didn't touch any static symbol names. There might be some places
where it'd be appropriate to rename some static functions to match
siblings that are exported, but I'll leave that for another time.
The better estimate requires more statistics than we previously stored:
in particular, counts of "entry" versus "data" pages within the index,
as well as knowledge of the number of distinct key values. We collect
this information during initial index build and update it during VACUUM,
storing the info in new fields on the index metapage. No initdb is
required because these fields will read as zeroes in a pre-existing
index, and the new gincostestimate code is coded to behave (reasonably)
sanely if they are zeroes.
Teodor Sigaev, reviewed by Jan Urbanski, Tom Lane, and Itagaki Takahiro.
Look only at the non-localized part of the output from "vcbuild /?",
which is used to determine the version of Visual Studio in use. Different
languages seem to localize different amounts of the string, but we assume
the part "Microsoft Visual C++" won't be modified.
This is not the hoped-for facility of using INSERT/UPDATE/DELETE inside
a WITH, but rather the other way around. It seems useful in its own
right anyway.
Note: catversion bumped because, although the contents of stored rules
might look compatible, there's actually a subtle semantic change.
A single Query containing a WITH and INSERT...VALUES now represents
writing the WITH before the INSERT, not before the VALUES. While it's
not clear that that matters to anyone, it seems like a good idea to
have it cited in the git history for catversion.h.
Original patch by Marko Tiikkaja, with updating and cleanup by
Hitoshi Harada.
Corrupt RADIUS responses were treated as errors and not ignored
(which the RFC2865 states they should be). This meant that a
user with unfiltered access to the network of the PostgreSQL
or RADIUS server could send a spoofed RADIUS response
to the PostgreSQL server causing it to reject a valid login,
provided the attacker could also guess (or brute-force) the
correct port number.
Fix is to simply retry the receive in a loop until the timeout
has expired or a valid (signed by the correct RADIUS server)
packet arrives.
Reported by Alan DeKok in bug #5687.
This patch eliminates the former need to sort the output of an Append scan
when an ordered scan of an inheritance tree is wanted. This should be
particularly useful for fast-start cases such as queries with LIMIT.
Original patch by Greg Stark, with further hacking by Hans-Jurgen Schonig,
Robert Haas, and Tom Lane.
xgettext is only required when make init-po is run manually; it is not
required for a build. The intent to handle that was already there, but
the ifdef's were in the wrong place.
ecpglib. Instead of parsing the statement just as ask the database server. This
patch removes the whole client side track keeping of the current transaction
status.
to see if a particular privilege has been granted to PUBLIC.
The issue was reported by Jim Nasby.
Patch by Alvaro Herrera, and reviewed by KaiGai Kohei.
We may as well make pgstat_count_heap_scan() and related macros just count
whenever rel->pgstat_info isn't null. Testing pgstat_track_counts buys
nothing at all in the normal case where that flag is ON; and when it's OFF,
the pgstat_info link will be null, so it's still a useless test.
This change is unlikely to buy any noticeable performance improvement,
but a cycle shaved is a cycle earned; and my investigations earlier today
convinced me that we're down to the point where individual instructions in
the inner execution loops are starting to matter.
The original coding was quite sloppy about handling the case where
XLogReadBuffer fails (because the page has since been deleted). This
would result in either "bad buffer id: 0" or an Assert failure during
replay, if indeed the page were no longer there. In a couple of places
it also neglected to check whether the change had already been applied,
which would probably result in corrupted index contents. I believe that
bug #5703 is an instance of the first problem. These issues could show up
without replication, but only if you were unfortunate enough to crash
between modification of a GIN index and the next checkpoint.
Back-patch to 8.2, which is as far back as GIN has WAL support.
This patch merges the responsibility for NOT-flattening into
eval_const_expressions' processing. It wasn't done that way originally
because prepqual.c is far older than eval_const_expressions. But putting
this work into eval_const_expressions saves one pass over the qual trees,
and in fact saves even more than that because we can exploit the knowledge
that the subexpressions have already been recursively simplified. Doing it
this way also lets us do it uniformly over all expressions, whereas
prepqual.c formerly just did it at top level to save cycles. That should
improve the planner's ability to recognize logically-equivalent constructs.
While at it, also add the ability to fold a NOT into BooleanTest and
NullTest constructs (the latter only for the scalar-datatype case).
Per discussion of bug #5702.
Completion is supported in the context of \set and when interpolating
a variable value using :foo etc.
In passing, fix some places in tab-complete.c that weren't following
project style for comment formatting.
Pavel Stehule, reviewed by Itagaki Takahiro
This patch adds the SQL-standard concept of an INSTEAD OF trigger, which
is fired instead of performing a physical insert/update/delete. The
trigger function is passed the entire old and/or new rows of the view,
and must figure out what to do to the underlying tables to implement
the update. So this feature can be used to implement updatable views
using trigger programming style rather than rule hacking.
In passing, this patch corrects the names of some columns in the
information_schema.triggers view. It seems the SQL committee renamed
them somewhere between SQL:99 and SQL:2003.
Dean Rasheed, reviewed by Bernd Helmle; some additional hacking by me.
Various places were testing TRIGGER_FIRED_BEFORE() where what they really
meant was !TRIGGER_FIRED_AFTER(), or vice versa. This needs to be cleaned
up because there are about to be more than two possible states.
We might want to note this in the 9.1 release notes as something for
trigger authors to double-check.
For consistency's sake I also changed some places that assumed that
TRIGGER_FIRED_FOR_ROW and TRIGGER_FIRED_FOR_STATEMENT are necessarily
mutually exclusive; that's not in immediate danger of breaking, but
it's still sloppier than it should be.
Extracted from Dean Rasheed's patch for triggers on views. I'm committing
this separately since it's an identifiable separate issue, and is the
only reason for the patch to touch most of these particular files.
This patch resurrects some of the information that could be logged by the
old, now-dead implementation of VACUUM FULL, in particular counts of live
and dead tuples and the time taken for the table rebuild proper. There's
still no logging about the ensuing index rebuilds, though.
Itagaki Takahiro
Use a macro LogicalTapeReadExact() to encapsulate the error check when
we want to read an exact number of bytes from a "tape". Per a suggestion
of Takahiro Itagaki.
This patch eliminates per-chunk palloc overhead for most small allocations
needed in the representation of an ispell dictionary. This saves close to
a factor of 2 on the current Czech ispell data. While it doesn't cover
every last small allocation in the ispell code, we are at the point of
diminishing returns, because about 95% of the allocations are covered
already.
Pavel Stehule, rather heavily revised by Tom
Add explicit initialization and cleanup functions to spell.c, and keep
all working state in the already-existing ISpellDict struct. This lets us
get rid of a static variable along with some extremely shaky assumptions
about usage of child memory contexts.
This commit is just code beautification and has no impact on functionality
or performance, but it opens the way to a less-grotty implementation of
Pavel's memory-saving hack, which will follow shortly.
In versions 8.2 and up, the grammar allows attaching ORDER BY, LIMIT,
FOR UPDATE, or WITH to VALUES, and hence to INSERT ... VALUES. But the
special-case code for VALUES in transformInsertStmt() wasn't expecting any
of those, and just ignored them, leading to unexpected results. Rather
than complicate the special-case path, just ensure that the presence of any
of those clauses makes us treat the query as if it had a general SELECT.
Per report from Hitoshi Harada.
Actually making this case work, if the column is used in the trigger's
WHEN condition, will take some new code that probably isn't appropriate
to back-patch. For now, just throw a FEATURE_NOT_SUPPORTED error rather
than allowing control to reach the "unexpected object" case. Per bug #5688
from Daniel Grace. Back-patch to 9.0 where the possibility of such a
dependency was introduced.
There are numerous methods by which a Perl or Tcl function can subvert
the behavior of another such function executed later; for example, by
redefining standard functions or operators called by the target function.
If the target function is SECURITY DEFINER, or is called by such a
function, this means that any ordinary SQL user with Perl or Tcl language
usage rights can do essentially anything with the privileges of the target
function's owner.
To close this security hole, create a separate Perl or Tcl interpreter for
each SQL userid under which plperl or pltcl functions are executed within
a session. However, all plperlu or pltclu functions run within a session
still share a single interpreter, since they all execute at the trust
level of a database superuser anyway.
Note: this change results in a functionality loss when libperl has been
built without the "multiplicity" option: it's no longer possible to call
plperl functions under different userids in one session, since such a
libperl can't support multiple interpreters in one process. However, such
a libperl already failed to support concurrent use of plperl and plperlu,
so it's likely that few people use such versions with Postgres.
Security: CVE-2010-3433
The point of a PlaceHolderVar is to allow a non-strict expression to be
evaluated below an outer join, after which its value bubbles up like a Var
and can be forced to NULL when the outer join's semantics require that.
However, there was a serious design oversight in that, namely that we
didn't ensure that there was actually a correct place in the plan tree
to evaluate the placeholder :-(. It may be necessary to delay evaluation
of an outer join to ensure that a placeholder that should be evaluated
below the join can be evaluated there. Per recent bug report from Kirill
Simonov.
Back-patch to 8.4 where the PlaceHolderVar mechanism was introduced.
This is intended as infrastructure to support integration with label-based
mandatory access control systems such as SE-Linux. Further changes (mostly
hooks) will be needed, but this is a big chunk of it.
KaiGai Kohei and Robert Haas
1. Resurrect the behavior where old commits on master will have Branch:
labels for branches sprouted after the commit was made. I'm still
dubious about this mode, but if you want it, say --post-date or -p.
2. Annotate the Branch: labels with the release or branch in which the
commit was publicly released. For example, on a release branch you could
see
Branch: REL8_3_STABLE Release: REL8_3_2 [92c3a8004] 2008-03-29 00:15:37 +0000
showing that the fix was released in 8.3.2. Commits on master will
usually instead have notes like
Branch: master Release: REL8_4_BR [6fc9d4272] 2008-03-29 00:15:28 +0000
showing that this commit is ancestral to release branches 8.4 and later.
If no Release: marker appears, the commit hasn't yet made it into any
release.
3. Add support for release branches older than 7.4.
4. The implementation is improved by running git log on each branch only
back to where the branch sprouts from master. This saves a good deal
of time (about 50% of the runtime when generating the complete history).
We generate the post-date-mode tags via a direct understanding that
they should be applied to master commits made before the branch sprouted,
rather than backing into them via matching (which isn't any too
reliable when people used identical log messages for successive commits).
1. Don't assume there's only one candidate match; check them all and use the
one with the closest timestamp. Avoids funny output when someone makes
several successive commits with the same log message, as certain people
have been known to do.
2. When the same commit (with the same SHA1) is reachable from multiple
branch tips, don't report it for all the branches; instead report it only
for the first such branch. Given our development practices, this case
arises only for commits that occurred before a given branch split off from
master. The original coding blamed old commits on *all* the branches,
which isn't terribly useful; the new coding blames such a commit only on
master.
1. Don't forget the last (oldest) commit on the oldest branch.
2. When considering which commit to print next, if two alternatives have
the same "distortion" score (which is actually the normal case, since
generally the "distortion" is 0), then choose the later timestamp to
print first. I don't know where Robert got the idea to ignore timestamps
and sort by branch age, but it wasn't a good idea: the resulting ordering
of commits was just plain bizarre anywhere that some branches had many
fewer commits than others, which is the typical situation for us.
Avoid depending on Date::Calc, which isn't in a basic Perl installation,
when we can equally well use Time::Local which is. Also fix the parsing
of timestamps to take heed of the timezone. (It looks like cvs2git emitted
all commit timestamps with zone GMT, so this refinement might've looked
unnecessary when looking at converted data; but it's needed now.)
Fix parsing of message bodies so that blank lines that may or may not get
emitted by "git log" aren't confused with real data. This avoids strange
formatting of the oldest commit on a branch.
Check child-process exit status, so that we actually notice if "git log"
fails, and so that we don't accumulate zombie children.
The previous coding would decide that join removal was unsafe upon finding
a PlaceHolderVar that needed to be evaluated at the inner rel and then used
above the join. However, this fails to cover the case of PlaceHolderVars
that refer to both the inner rel and some other rels. Per bug report from
Andrus.
hasn't been set.
The only known case where this can happen is when show_session_authorization
is invoked in an autovacuum process, which is possible if an index function
calls it, as for example in bug #5669 from Andrew Geery. We could perhaps
try to return a sensible value, such as the name of the cluster-owning
superuser; but that seems like much more trouble than the case is worth,
and in any case it could create new possible failure modes. Simply
returning an empty string seems like the most appropriate fix.
Back-patch to all supported versions, even those before autovacuum, just
in case there's another way to provoke this crash.
In some situations the original coding led to corrupting the child AppendRel's
subpaths list, effectively adding other members of the parent's list to it.
This was usually masked because we never made any further use of the child's
list, but given the right combination of circumstances, we could do so. The
visible symptom would be a relation getting scanned twice, as in bug #5673
from David Schmitt.
Backpatch to 8.2, which is as far back as the risky coding appears. The
example submitted by David only fails in 8.4 and later, but I'm not convinced
that there aren't any even-more-obscure cases where 8.2 and 8.3 would fail.
We can't actually print the parent RelOptInfo in toto, because that would
lead to infinite recursion. But it's safe enough to reach into the parent
and print its identifying relids, and that makes it a whole lot easier
to figure out what a Path represents. Should have done this years ago.
servers. AFAICT it's harmless at the moment because nothing can depend on
either, but as soon as we introduce an object type with such dependencies,
tableoid needs to be set or pg_dump will fail to interpret the dependencies
correctly. In theory, I guess the uninitialized garbage in tableoid could
cause the object to be mistaken for some other object with same OID as well.
This was unintentionally broken in 8.4 while tightening up checking of
ordinary non-Julian date inputs to forbid references to "year zero".
Per bug #5672 from Benjamin Gigot.
The previous patches failed to cover a lot of symlinks that are only
added in platform-specific cases. Make the lists match what's in the
Makefile for each branch.
Poking around for remaining occurrences of CVS keyword strings, I came
across one that apparently reflects the use of a $Revision: ...$ string
in the original input data. Dunno why anybody would be using that in
an MTA's Received: lines, but there it is. Put it back to the way that
it was originally, according to inspection of the CVS repo.
This script is intended to substitute for cvs2cl in generating release
notes and scrutinizing what got back-patched to which branches.
Script by me. Support for --since by Alex Hunsaker.
The previous coding just terminated the COPY immediately after seeing
the EOF marker (-1 where a row field count is expected). The expected
CopyDone or CopyFail message just got thrown away later, since we weren't
in COPY mode anymore. This behavior complicated matters for the JDBC
driver, and arguably was the wrong thing in any case since a CopyFail
message after the marker wouldn't be honored.
Note that there is a behavioral change here: extra data after the EOF
marker was silently ignored before, but now it will cause an error.
Hence not back-patching, although this is arguably a bug.
Per report and patch by Kris Jurka.
the same number of columns expected by the insert. This suggests that there
were extra parentheses that converted the intended column list into a row
expression.
Original patch by Marko Tiikkaja, rather heavily editorialized by me.
since it can happen when a process fails to start when the system
is under high load.
Per several bug reports and many peoples investigation.
Back-patch to 8.4, which is as far back as the "deadman-switch"
for shared memory access exists.
new WAL arrives via streaming replication. This reduces the latency, and
also allows us to use a longer polling interval, which is good for energy
efficiency.
We still need to poll to check for the appearance of a trigger file, but
the interval is now 5 seconds (instead of 100ms), like when waiting for
a new WAL segment to appear in WAL archive.
dynamic pool of event handles, we can permanently assign one for each
shared latch. Thanks to that, we no longer need a separate shared memory
block for latches, and we don't need to know in advance how many shared
latches there is, so you no longer need to remember to update
NumSharedLatches when you introduce a new latch to the system.
In these cases a qual can get marked with the removable rel in its
required_relids, but this is just to schedule its evaluation correctly, not
because it really depends on the rel. We were assuming that, in effect,
we could throw away *all* quals so marked, which is nonsense. Tighten up
the logic to be a little more paranoid about which quals belong to the
outer join being considered for removal, and arrange for all quals that
don't belong to be updated so they will still get evaluated correctly.
Also fix another problem that happened to be exposed by this test case,
which was that make_join_rel() was failing to notice some cases where
a constant-false qual could be used to prove a join relation empty. If it's
a pushed-down constant false, then the relation is empty even if it's an
outer join, because the qual applies after the outer join expansion.
Per report from Nathan Grange. Back-patch into 9.0.
an online backup instead of performing one. pg_ctl can detect that by
checking if recovery.conf exists.
Backup label file is renamed away early in recovery, so the window where
backup label exists during recovery is normally very small, but you can run
into it e.g if restore_command is set incorrectly and the startup process
never finds even the first WAL segment containing the checkpoint record to
start recovery from.
Fujii Masao with comments by me.
make sense for walsender, but for example application_name and client_encoding
do. We still don't apply per-role settings from pg_db_role_setting, because
that would require connecting to a database to read the table.
Fujii Masao
transaction snapshots, i.e. a snapshot registered at the beginning of
a transaction. Change variable naming and comments to reflect this reality
in preparation for a future, truly serializable mode, e.g.
Serializable Snapshot Isolation (SSI).
For the moment transaction snapshots are still used to implement
SERIALIZABLE, but hopefully not for too much longer. Patch by Kevin
Grittner and Dan Ports with review and some minor wording changes by me.
wait until it is set. Latches can be used to reliably wait until a signal
arrives, which is hard otherwise because signals don't interrupt select()
on some platforms, and even when they do, there's race conditions.
On Unix, latches use the so called self-pipe trick under the covers to
implement the sleep until the latch is set, without race conditions. On
Windows, Windows events are used.
Use the new latch abstraction to sleep in walsender, so that as soon as
a transaction finishes, walsender is woken up to immediately send the WAL
to the standby. This reduces the latency between master and standby, which
is good.
Preliminary work by Fujii Masao. The latch implementation is by me, with
helpful comments from many people.
Peter's original patch had this right, but I dropped the check while revising
the code to search pg_constraint instead of pg_index. Spotted by Dean Rasheed.
A long time ago, this didn't work nicely, but it seems to work on all recent
versions of OS X. The blank-pad method is less desirable since it results
in lots of extra space in ps' output. Per Alexey Klyukin.
Since the code underlying pg_get_expr() is not secure against malformed
input, and can't practically be made so, we need to prevent miscreants
from feeding arbitrary data to it. We can do this securely by declaring
pg_get_expr() to take a new datatype "pg_node_tree" and declaring the
system catalog columns that hold nodeToString output to be of that type.
There is no way at SQL level to create a non-null value of type pg_node_tree.
Since the backend-internal operations that fill those catalog columns
operate below the SQL level, they are oblivious to the datatype relabeling
and don't need any changes.
SI invalidation events, rather than indirectly through the relcache.
In the previous coding, we had to flush a composite-type typcache entry
whenever we discarded the corresponding relcache entry. This caused problems
at least when testing with RELCACHE_FORCE_RELEASE, as shown in recent report
from Jeff Davis, and might result in real-world problems given the kind of
unexpected relcache flush that that test mechanism is intended to model.
The new coding decouples relcache and typcache management, which is a good
thing anyway from a structural perspective. The cost is that we have to
search the typcache linearly to find entries that need to be flushed. There
are a couple of ways we could avoid that, but at the moment it's not clear
it's worth any extra trouble, because the typcache contains very few entries
in typical operation.
Back-patch to 8.2, the same as some other recent fixes in this general area.
The patch could be carried back to 8.0 with some additional work, but given
that it's only hypothetical whether we're fixing any problem observable in
the field, it doesn't seem worth the work now.
initialize the rd_backend field of a fake Relation entry correctly.
Fortunately, that is easy, since only non-temp relations should ever be
mentioned in the WAL stream.
This patch changes _bt_split() and _bt_pagedel() to throw a plain ERROR,
rather than PANIC, for several cases that are reported from the field
from time to time:
* right sibling's left-link doesn't match;
* PageAddItem failure during _bt_split();
* parent page's next child isn't right sibling during _bt_pagedel().
In addition the error messages for these cases have been made a bit
more verbose, with additional values included.
The original motivation for PANIC here was to capture core dumps for
subsequent analysis. But with so many users whose platforms don't capture
core dumps by default, or who are unprepared to analyze them anyway, it's hard
to justify a forced database restart when we can fairly easily detect the
problems before we've reached the critical sections where PANIC would be
necessary. It is not currently known whether the reports of these messages
indicate well-hidden bugs in Postgres, or are a result of storage-level
malfeasance; the latter possibility suggests that we ought to try to be more
robust even if there is a bug here that's ultimately found.
Backpatch to 8.2. The code before that is sufficiently different that
it doesn't seem worth the trouble to back-port further.
Peter Eisentraut reports that some bits of the "address" variable
in get_object_address() give "may be used uninitialized" warnings;
this likes the only excuse his compiler could have for thinking
that's possible.
which is perhaps not a terribly good spot for it but there doesn't seem to be
a better place. Also add a source-code comment pointing out a couple reasons
for having a separate lock file. Per suggestion from Greg Smith.
Egypt and Palestine. Added new names for two Micronesian timezones:
Pacific/Chuuk is now preferred over Pacific/Truk (and the preferred
abbreviation is CHUT not TRUT) and Pacific/Pohnpei is preferred over
Pacific/Ponape. Historical corrections for Finland.
returning "record" actually do have the same rowtype. This is needed because
the parser can't realistically enforce that they will all have the same typmod,
as seen in a recent example from David Wheeler.
Back-patch to 8.0, which is as far back as we have the notion of RECORD
subtypes being distinguished by typmod. Wheeler's example depends on
8.4-and-up features, but I suspect there may be ways to provoke similar
failures before 8.4.
It turns out that some platforms return ENOMEM for a request that violates
SHMALL, whereas we were assuming that ENOSPC would always be used for that.
Apparently the latter is a Linuxism while ENOMEM is the BSD tradition.
Extend the ENOMEM hint to suggest that raising SHMALL might be needed.
Per gripe from A.M.
Backpatch to 9.0, but not further, because this doesn't seem important
enough to warrant creating extra translation work in the stable branches.
(If it were, we'd have figured this out years ago.)
This is reproducibly possible in Python 2.7 if the user turned
PendingDeprecationWarning into an error, but it's theoretically also possible
in earlier versions in case of exceptional conditions.
backpatched to 8.0
files automatically. Otherwise, the following could happen: When
starting from a clean source tree, the first build would delete the
intermediate file, but also create the dependency file, which
mentions the intermediate file, thus making it non-intermediate.
The second build will then need to rebuild the now non-intermediate
missing file. So the second build will do work even though nothing
had changed. One place where this happens is the .c -> .o -> .so
chain for some contrib modules.
There is no reason that proc.c should have to get involved in this dirty hack
for letting the postmaster know which children are walsenders. Revert that
file to the way it was, and confine the kluge to pmsignal.c and postmaster.c.
array_in discards unquoted leading and trailing whitespace in array values,
while array_out is careful to quote array elements that contain whitespace.
This is problematic when the definition of "whitespace" varies between
locales: array_in could drop characters that were meant to be part of the
value. To avoid that, lock down "whitespace" to mean only the traditional
six ASCII space characters.
This change also works around a bug in OS X and some older BSD systems, in
which isspace() could return true for character fragments in UTF8 locales.
(There may be other places in PG where that bug could cause problems, but
this is the only one complained of so far; see recent report from Steven
Schlansker.)
Back-patch to 9.0, but not further. Given the lack of previous reports
of trouble, changing this behavior in stable branches seems to offer
more risk of breaking applications than reward of avoiding problems.
The original coding tended to break down in the face of modified restore
orders, as shown in bug #5626 from Albert Ullrich, because it would flip over
into parallel-restore operation too soon. That causes problems because we
don't have sufficient dependency information in dump archives to allow safe
parallel processing of SECTION_PRE_DATA items. Even if we did, it's probably
undesirable to allow that to override the commanded restore order.
To fix the problem of omitted items causing unexpected changes in restore
order, tweak SortTocFromFile so that omitted items end up at the head of
the list not the tail. This ensures that they'll be examined and their
dependencies will be marked satisfied before we get to any interesting
items.
In HEAD and 9.0, we can easily change restore_toc_entries_parallel so that
all SECTION_PRE_DATA items are guaranteed to be processed in the initial
serial-restore loop, and hence in commanded order. Only DATA and POST_DATA
items are candidates for parallel processing. For them there might be
variations from the commanded order because of parallelism, but we should
do it in a safe order thanks to dependencies.
In 8.4 it's much harder to make such a guarantee. I settled for not
letting the initial loop break out into parallel processing mode if
it sees a DATA/POST_DATA item that's not to be restored; this at least
prevents a non-restorable item from causing premature exit from the loop.
This means that 8.4 will be more likely to fail given a badly-ordered -L
list than 9.x, but we don't really promise any such thing will work anyway.
Since an SMgrRelation now knows whether or not the underlying relation is
temporary, there's no point in also passing that information via an
additional argument.
Per gripe from Fujii Masao, though this is not exactly his proposed patch.
Categorize as DEVELOPER_OPTIONS and set context PGC_SIGHUP, as per Fujii,
but set the default to LOG because higher values aren't really sensible
(see the code for trace_recovery()). Fix the documentation to agree with
the code and to try to explain what the variable actually does. Get rid
of no-op calls trace_recovery(LOG), which accomplish nothing except to
demonstrate that this option confuses even its author.
Aside from being more forgiving, this prevents a rather surprising misbehavior
when the "wrong" order was used: the old code didn't throw a syntax error,
but absorbed the INTO clause into the last USING expression, which then did
strange things downstream.
Intentionally not changing the documentation; we'll continue to advertise
only the "standard" clause order.
Backpatch to 8.4, where the USING clause was added to EXECUTE.
It's not clear if this situation can occur in plpgsql other than via the
EXECUTE USING case Heikki illustrated, which I will shortly close off.
However, ignoring the intoClause if it's there is surely wrong, so let's
patch it for safety.
Backpatch to 8.3, which is as far back as this code has a PlannedStmt
to deal with. There might be another way to make an equivalent test
before that, but since this is just preventing hypothetical bugs,
I'm not going to obsess about it.
pointed out, it would need a 2nd pass after the whole query is processed to
correctly check that an unknown Param is coerced to the same target type
everywhere. Adding the 2nd pass would add a lot more code, which doesn't
seem worth the risk given that there isn't much of a use case for passing
unknown Params in the first place. The code would work without that check,
but it might be confusing and the behavior would be different from the
varparams case.
Instead, just coerce all unknown params in a PL/pgSQL USING clause to text.
That's simple, and is usually what users expect.
Revert the patch in CVS HEAD and master, and backpatch the new solution to
8.4. Unlike the previous solution, this applies easily to 8.4 too.
into TopMemoryContext. This makes no functional difference, but makes it
easier to see what the space is being used for in MemoryContextStats dumps.
Per a recent example in which I was surprised by the size of TopMemoryContext.
afterTriggerInvokeEvents failed to adjust events->tailfree when truncating
the last chunk of an event list. This could result in the data being
"de-truncated" by afterTriggerRestoreEventList during a subsequent
subtransaction abort. Even that wouldn't kill us, because the re-added data
would just be events marked DONE --- unless the data had been partially
overwritten by new events. Then we might crash, or in any case misbehave
(perhaps fire triggers twice, or fire triggers with the wrong event data).
Per bug #5622 from Thue Janus Kristensen.
Back-patch to 8.4 where the current trigger list representation was introduced.
In the new API introduced by my patch to include the backend ID in
temprel filenames, the last argument to smrgextend() became skipFsync
rather than isTemp, but these calls didn't get the memo. It's not
really a problem to pass rel->rd_istemp rather than just plain false,
because smgrextend() now automatically skips the fsync for temprels
anyway, but this seems cleaner and saves some minute number of cycles.
ExecModifyTable(). This avoids memory leakage when trigger functions leave
junk behind in that context (as they more or less must). Problem and solution
identified by Dean Rasheed.
I'm a bit concerned about the longevity of this solution --- once a plan can
have multiple ModifyTable nodes, we are very possibly going to have to do
something different. But it should hold up for 9.0.
elsewhere.
Similarly rename the version in mbprint.c, not because this affects anything
but just to keep the two copies in exact sync. There was some discussion of
having only one copy in src/port/ instead, but this function is so small
and unlikely to change that that seems like overkill.
Slightly editorialized version of a patch by Joseph Adams. (The bug-fix
aspect of his patch was applied separately, and back-patched.)
The implicitly created sequence was created as owned by the current user,
who could be different from the table owner, eg if current user is a
superuser or some member of the table's owning role. This caused sanity
checks in the SEQUENCE OWNED BY code to spit up. Although possibly we
don't need those sanity checks, the safest fix seems to be to make sure
the implicit sequence is assigned the same owner role as the table has.
(We still do all permissions checks as the current user, however.)
Per report from Josh Berkus.
Back-patch to 9.0. The bug goes back to the invention of SEQUENCE OWNED BY
in 8.2, but the fix requires an API change for DefineRelation(), which seems
to have potential for breaking third-party code if done in a minor release.
Given the lack of prior complaints, it's probably not worth fixing in the
stable branches.
_outPlannedStmt is only debug support, so the omission there was not very
serious, but the omission in _copyPlannedStmt is a real bug. The consequence
would be that a copied plan tree would never be marked as a transient plan,
so that we would forget we ought to replan it after some not-yet-ready index
becomes ready for use. This might explain some past complaints about indexes
created with CREATE INDEX CONCURRENTLY not being used right away. Problem
spotted by Yeb Havinga.
Back-patch to 8.3, where the field was added.
parse_analyze() function. That case occurs e.g with PL/pgSQL
EXECUTE ... USING 'stringconstant'.
The coercion with a CoerceViaIO node. The result is similar to the coercion
via input function performed for unknown constants in coerce_type(),
except that this happens at runtime.
Backpatch to 9.0. The issue is present in 8.4 as well, but the coerce param
hook infrastructure this patch relies on was introduced in 9.0. Given the
lack of user reports and harmlessness of the bug, it's not worth attempting
a different fix just for 8.4.
socket lockfile) when writing them. The lack of an fsync here may well
explain two different reports we've seen of corrupted lockfile contents,
which doesn't particularly bother the running server but can prevent a
new server from starting if the old one crashes. Per suggestion from
Alvaro.
Back-patch to all supported versions.
This is appropriate for the same reasons we already do it in
LockSharedObject(): things might have changed while we were waiting
for the lock. There doesn't seem to be a live bug here at the moment,
but that's mostly because it isn't currently used for very much.
in particular, propagate a fix in the test to see whether a UTF8 character has
length 4 bytes. This is likely of little real-world consequence because
5-or-more-byte UTF8 sequences are not supported by Postgres nor seen anywhere
in the wild, but still we may as well get it right. Problem found by Joseph
Adams.
Bug is aboriginal, so back-patch all the way.
used by array_agg(), string_agg(), and similar aggregate functions that use
"internal" as their transition datatype. The previous coding thought this
took *no* extra space, since "internal" is pass-by-value; but actually these
aggregates typically consume a great deal of space. Per bug #5608 from
Itagaki Takahiro, and fix suggestion from Hitoshi Harada.
Back-patch to 8.4, where array_agg was introduced.
This allows us to reliably remove all leftover temporary relation
files on cluster startup without reference to system catalogs or WAL;
therefore, we no longer include temporary relations in XLOG_XACT_COMMIT
and XLOG_XACT_ABORT WAL records.
Since these changes require including a backend ID in each
SharedInvalSmgrMsg, the size of the SharedInvalidationMessage.id
field has been reduced from two bytes to one, and the maximum number
of connections has been reduced from INT_MAX / 4 to 2^23-1. It would
be possible to remove these restrictions by increasing the size of
SharedInvalidationMessage by 4 bytes, but right now that doesn't seem
like a good trade-off.
Review by Jaime Casanova and Tom Lane.
I just noticed that libpq's pqsignal.h was violating our general inclusion
style guidelines by explicitly including postgres_fe.h. Remove that, and
put it in pqsignal.c where it belongs.
functions to the core XML code. Per discussion, the former depends on
XMLOPTION while the others do not. These supersede a version previously
offered by contrib/xml2.
Mike Fowler, reviewed by Pavel Stehule
path that specifies useTemp, but there is no active temp schema in the
current session. (This can happen if the path was saved during a transaction
that created a temp schema and was later rolled back.) For existing callers
it's sufficient to ignore the useTemp flag in this case, though we might
later want to offer an option to create a fresh temp schema. So far as I can
tell this is just an Assert failure: in a non-assert build, the code would
push a zero onto the new search path, which is useless but not very harmful.
Per bug report from Heikki.
Back-patch to 8.3; prior versions don't have this code.
Since the only purpose of WAL-loggin SharedInvalidationMessages is to support
Hot Standby operation, they needn't be included when wal_level < hot_standby.
Back-patch to 9.0.
Review by Heikki Linnakanagas and Fujii Masao.
and input file name, per bug #5617 from Leo Shklovskii. Rearrange the
corresponding code in pg_dump and pg_dumpall so that all three programs
handle this in a consistent, straightforward fashion.
Back-patch to 9.0, but no further. Although this is certainly a bug, it's
possible that people have scripts that will be broken by the added error
check, so it seems better not to change the behavior in stable branches.
and the editor's cursor will be initially placed on that line. In \e the
lines are counted with respect to the query buffer, while in \ef they are
counted with line 1 = first line of function body. These choices are useful
for positioning the cursor on the line of a previously-reported error.
To avoid assumptions about what switch the user's editor takes for this
purpose, invent a new psql variable EDITOR_LINENUMBER_SWITCH with (at
present) no default value.
One incompatibility from previous behavior is that "\e 1234" will now
take "1234" as a line number not a file name. There are at least two
ways to select a numerically-named file if you really want to.
Pavel Stehule, reviewed by Jan Urbanski, with further editing by Robert Haas
and Tom Lane
better handling of NULL elements within the arrays. The third parameter
is a string that should be used to represent a NULL element, or should
be translated into a NULL element, respectively. If the third parameter
is NULL it behaves the same as the two-parameter form.
There are two incompatible changes in the behavior of the two-parameter form
of string_to_array. First, it will return an empty (zero-element) array
rather than NULL when the input string is of zero length. Second, if the
field separator is NULL, the function splits the string into individual
characters, rather than returning NULL as before. These two changes make
this form fully compatible with the behavior of the new three-parameter form.
Pavel Stehule, reviewed by Brendan Jurd
expressions. We need to deal with this when handling subscripts in an array
assignment, and also when catching an exception. In an Assert-enabled build
these omissions led to Assert failures, but I think in a normal build the
only consequence would be short-term memory leakage; which may explain why
this wasn't reported from the field long ago.
Back-patch to all supported versions. 7.4 doesn't have exceptions, but
otherwise these bugs go all the way back.
Heikki Linnakangas and Tom Lane
can be caught in the same places that could catch an ordinary RAISE ERROR
in the same location. The previous coding insisted on throwing the error
from the block containing the active exception handler; which is arguably
more surprising, and definitely unlike Oracle's behavior.
Not back-patching, since this is a pretty obscure corner case. The risk
of breaking somebody's code in a minor version update seems to outweigh
any possible benefit.
Piyush Newe, reviewed by David Fetter
statistics counts. These numbers are being accumulated but haven't yet been
transmitted to the collector (and won't be, until the transaction ends).
For some purposes, though, it's handy to be able to look at them.
Joel Jacobson, reviewed by Itagaki Takahiro
other columns to be referenced without listing them in GROUP BY, so long as
the primary key column(s) are listed in GROUP BY.
Eventually we should also allow functional dependency on a UNIQUE constraint
when the columns are marked NOT NULL, but that has to wait until NOT NULL
constraints are represented in pg_constraint, because we need to have
pg_constraint OIDs for all the conditions needed to ensure functional
dependency.
Peter Eisentraut, reviewed by Alex Hunsaker and Tom Lane
matching a call like f(x, ORDER BY y,z). It could be that what the user
really wants is f(x,z ORDER BY y). We now have pretty conclusive evidence
that many people won't understand this problem without concrete guidance,
so give it to them. Per further discussion of the string_agg() problem.
functionality, while creating an ambiguity in usage with ORDER BY that at
least two people have already gotten seriously confused by. Also, add an
opr_sanity test to check that we don't in future violate the newly minted
policy of not having built-in aggregates with the same name and different
numbers of parameters. Per discussion of a complaint from Thom Brown.
- Rename TSParserGetPrsid to get_ts_parser_oid.
- Rename TSDictionaryGetDictid to get_ts_dict_oid.
- Rename TSTemplateGetTmplid to get_ts_template_oid.
- Rename TSConfigGetCfgid to get_ts_config_oid.
- Rename FindConversionByName to get_conversion_oid.
- Rename GetConstraintName to get_constraint_oid.
- Add new functions get_opclass_oid, get_opfamily_oid, get_rewrite_oid,
get_rewrite_oid_without_relid, get_trigger_oid, and get_cast_oid.
The name of each function matches the corresponding catalog.
Thanks to KaiGai Kohei for the review.
unqualified names.
- Add a missing_ok parameter to get_tablespace_oid.
- Avoid duplicating get_tablespace_od guts in objectNamesToOids.
- Add a missing_ok parameter to get_database_oid.
- Replace get_roleid and get_role_checked with get_role_oid.
- Add get_namespace_oid, get_language_oid, get_am_oid.
- Refactor existing code to use new interfaces.
Thanks to KaiGai Kohei for the review.
The old computation can sometimes underestimate the necessary space
by 2 bytes; however we're not back-patching this, because this result
isn't used for anything critical. Per discussion with Tom Lane,
make the typmod test in this function match the ones in numeric()
and apply_typmod() exactly.
the parameters of \connect, and fix oversight of not enabling translation
of the messages. Also, adjust \connect's similar messages to match, and
deal with 8.2-era violation of basic translatability guidelines there.
implementation deficiencies. Per discussion of bug #5592, we're not
going to change it, but these things should be documented so that if
anyone ever reimplements type tinterval, they will be more careful.
Without this patch, constraints inherited by children of a parent
table which itself has multiple inheritance parents can end up with
the wrong coninhcount. After dropping the constraint, the children
end up with a leftover copy of the constraint that is not dumped
and cannot be dropped. There is a similar problem with ALTER TABLE
.. ADD COLUMN, but that looks significantly more difficult to
resolve, so I'm committing this fix separately.
Back-patch to 8.4, which is the first release that has coninhcount.
Report by Hank Enting.
makeTSQuerySign. The first of these is a live bug, on some platforms,
as per bug #5590 from John Regehr. However the consequences seem limited
because of the relatively narrow scope of use of QTNode.sign. The shift in
makeTSQuerySign is actually safe because TSQS_SIGLEN is unsigned, but it
seems like a good idea to insert an explicit cast rather than depend on that.
tsqueries. CompareTSQ has to have a guard for the case rather than blindly
applying QTNodeCompare to random data past the end of the datums. Also,
change QTNodeCompare to be a little less trusting: use an actual test rather
than just Assert'ing that the input is sane. Problem encountered while
investigating another issue (I saw a core dump in autoanalyze on a table
containing multiple empty tsquery values).
Back-patch to all branches with tsquery support.
In HEAD, also fix some bizarre (though not outright wrong) coding in
tsq_mcontains().
since Apple shipped a compiler that needed this switch, and there's
increasing interest in using other compilers that won't accept the switch
at all. Better to let anybody who still needs the switch inject it via
CPPFLAGS. Per gripe from Neil Conway.
While this hack arguably has some benefit in terms of making PL/pgsql's
line numbering match the programmer's expectations, it also makes
PL/pgsql inconsistent with the remaining PLs, making it difficult for
clients to reliably determine where the error actually is. On balance,
it seems better to be consistent.
Pavel Stehule
from "clang". The VERR changes make an assignment unconditional, which is
probably easier to read/understand anyway, and one can hardly argue that
it's worth shaving cycles off the case of reporting another error when
one has already been detected. The INSIST change limits where that macro
can be used, but not in a way that creates a problem for any existing call.
interval input "invalid" was specified together with other fields. Spotted
by Neil Conway with the help of a clang warning. Although this has been
wrong since the interval code was written more than 10 years ago, it doesn't
affect anything beyond which error message you get for a wrong input, so not
worth back-patching very far.
indexes when the index column type (the opclass opckeytype) is different from
the expression's datatype. When coded, this limitation wasn't worth worrying
about because we had no intelligence to speak of in stats collection for the
datatypes used by such opclasses. However, now that there's non-toy
estimation capability for tsvector queries, it amounts to a bug that ANALYZE
fails to do this.
The fix changes struct VacAttrStats, and therefore constitutes an API break
for custom typanalyze functions. Therefore we can't back-patch it into
released branches, but it was agreed that 9.0 isn't yet frozen hard enough
to make such a change unacceptable. Ergo, back-patch to 9.0 but no further.
The API break had better be mentioned in 9.0 release notes.
bright, but it beats assuming that a prefix match behaves identically to an
exact match, which is what the code was doing before :-(. Noted while
experimenting with Artur Dobrowski's example.
Although the key-combining code claimed to work correctly if its input
contained both lossy and exact pointers for a single page in a single TID
stream, in fact this did not work, and could not work without pretty
fundamental redesign. Modify keyGetItem so that it will not return such a
stream, by handling lossy-pointer cases a bit more explicitly than we did
before.
Per followup investigation of a gripe from Artur Dabrowski.
An example of a query that failed given his data set is
select count(*) from search_tab where
(to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:* | dd:*')) and
(to_tsvector('german', keywords ) @@ to_tsquery('german', 'aa:*'));
Back-patch to 8.4 where the lossy pointer code was introduced.
struct representing a tree entry, rather than being a separately allocated
piece of storage. This API is at least as clean as the old one (if not
more so --- there were some bizarre choices in there) and it permits a
very substantial memory savings, on the order of 2X in ginbulk.c's usage.
Also, fix minor memory leaks in code called by ginEntryInsert, in
particular in ginInsertValue and entryFillRoot, as well as ginEntryInsert
itself. These leaks resulted in the GIN index build context continuing
to bloat even after we'd filled it to maintenance_work_mem and started
to dump data out to the index.
In combination these fixes restore the GIN index build code to honoring
the maintenance_work_mem limit about as well as it did in 8.4. Speed
seems on par with 8.4 too, maybe even a bit faster, for a non-pathological
case in which HEAD was formerly slower.
Back-patch to 9.0 so we don't have a performance regression from 8.4.
possible (ie, whenever the tsquery is a constant), even when no statistics
are available for the tsvector. For example, foo @@ 'a & b'::tsquery
can be expected to be more selective than foo @@ 'a'::tsquery, whether
or not we know anything about foo. We use DEFAULT_TS_MATCH_SEL as the assumed
selectivity of individual query terms when no stats are available, then
combine the terms according to the query's AND/OR structure as usual.
Per experimentation with Artur Dabrowski's example. (The fact that there
are no stats available in that example is a problem in itself, but
nonetheless tsmatchsel should be smarter about the case.)
Back-patch to 8.4 to keep all versions of tsmatchsel() in sync.
routines to make them behave better in the presence of "lossy" index pointers.
The previous coding was outright incorrect for some cases, as recently
reported by Artur Dabrowski: scanGetItem would fail to return index entries in
cases where one index key had multiple exact pointers on the same page as
another key had a lossy pointer. Also, keyGetItem was extremely inefficient
for cases where a single index key generates multiple "entry" streams, such as
an @@ operator with a multiple-clause tsquery. The presence of a lossy page
pointer in any one stream defeated its ability to use the opclass
consistentFn, resulting in probing many heap pages that didn't really need to
be visited. In Artur's example case, a query like
WHERE tsvector @@ to_tsquery('a & b')
was about 50X slower than the theoretically equivalent
WHERE tsvector @@ to_tsquery('a') AND tsvector @@ to_tsquery('b')
The way that I chose to fix this was to have GIN call the consistentFn
twice with both TRUE and FALSE values for the in-doubt entry stream,
returning a hit if either call produces TRUE, but not if they both return
FALSE. The code handles this for the case of a single in-doubt entry stream,
but punts (falling back to the stupid behavior) if there's more than one lossy
reference to the same page. The idea could be scaled up to deal with multiple
lossy references, but I think that would probably be wasted complexity. At
least to judge by Artur's example, such cases don't occur often enough to be
worth trying to optimize.
Back-patch to 8.4. 8.3 did not have lossy GIN index pointers, so not
subject to these problems.
look through join alias Vars to avoid breaking join queries, and
move the test to someplace where it will catch more possible ways
of calling a function. We still ought to throw away the whole thing
in favor of a data-type-based solution, but that's not feasible in
the back branches.
This needs to be back-patched further than 9.0, but I don't have time
to do so today. Committing now so that the fix gets into 9.0beta4.
Transaction aborts now record their LSN to avoid corner case
behaviour in SR/HS, hence change of name of variables and functions.
As pointed out by Fujii Masao. Cosmetic changes only.
related functions. Per today's discussion, we will henceforth assume
that datatype I/O functions are either stable or immutable, never volatile.
(This implies in particular that domain CHECK constraint expressions shouldn't
be volatile, since domain_in executes them.) In turn, functions that execute
the I/O functions of arbitrary datatypes should always be labeled stable.
This affects the labeling of array_to_string, which was unsafely marked
immutable, and record_in, record_out, record_recv, record_send,
domain_in, domain_recv, which were over-conservatively marked volatile.
The array I/O functions were already marked stable, which is correct
per this policy but would have been wrong if we maintained domain_in
as volatile.
Back-patch to 9.0, along with an earlier fix to correctly mark cash_in
and cash_out as stable not immutable (since they depend on lc_monetary).
No catversion bump --- the implications of this are not currently
severe enough to justify a forced initdb.
assuming that a local char[] array would be aligned on at least a word
boundary. There are architectures on which that is pretty much guaranteed to
NOT be the case ... and those arches also don't like non-aligned memory
accesses, meaning that log_newpage() would crash if it ever got invoked.
Even on Intel-ish machines there's a potential for a large performance penalty
from doing I/O to an inadequately aligned buffer. So palloc it instead.
Backpatch to 8.0 --- 7.4 doesn't have this code.
If a zeroed page is present in the heap, ALTER TABLE .. SET TABLESPACE will
set the LSN and TLI while copying it, which is wrong, and heap_xlog_newpage()
will do the same thing during replay, so the corruption propagates to any
standby. Note, however, that the bug can't be demonstrated unless archiving
is enabled, since in that case we skip WAL logging altogether, and the LSN/TLI
are not set.
Back-patch to 8.0; prior releases do not have tablespaces.
Analysis and patch by Jeff Davis. Adjustments for back-branches and minor
wordsmithing by me.
list in ExecLockRows() forgot to allow for the possibility that some of the
rowmarks are for child tables that aren't relevant to the current row.
Per report from Kenichiro Tanaka.
Avoid hard-coding lockmode used for many altering DDL commands, allowing easier
future changes of lock levels. Implementation of initial analysis on DDL
sub-commands, so that many lock levels are now at ShareUpdateExclusiveLock or
ShareRowExclusiveLock, allowing certain DDL not to block reads/writes.
First of number of planned changes in this area; additional docs required
when full project complete.
a pass-by-reference datatype with a nontrivial projection step.
We were using the same memory context for the projection operation as for
the temporary context used by the hashtable routines in execGrouping.c.
However, the hashtable routines feel free to reset their temp context at
any time, which'd lead to destroying input data that was still needed.
Report and diagnosis by Tao Ma.
Back-patch to 8.1, where the problem was introduced by the changes that
allowed us to work with "virtual" tuples instead of materializing intermediate
tuple values everywhere. The earlier code looks quite similar, but it doesn't
suffer the problem because the data gets copied into another context as a
result of having to materialize ExecProject's output tuple.
We used to be consistent about this, but my recent patch to add a
restart_after_crash GUC failed to follow the existing convention.
Report and patch from Fujii Masao.
We now use the phrase 'via local socket in' rather than 'on host' in both
\c and \conninfo output, when applicable.
Fujii Masao, with some kibitzing by me.
I've added a quote_all_identifiers GUC which affects the behavior
of the backend, and a --quote-all-identifiers argument to pg_dump
and pg_dumpall which sets the GUC and also affects the quoting done
internally by those applications.
Design by Tom Lane; review by Alex Hunsaker; in response to bug #5488
filed by Hartmut Goebel.
Remove bespoke code in DoCopy and RI_Initial_Check, which now instead
fabricate call ExecCheckRTPerms with a manufactured RangeTblEntry.
This is intended to make it feasible for an enhanced security provider
to actually make use of ExecutorCheckPerms_hook, but also has the
advantage that RI_Initial_Check can allow use of the fast-path when
column-level but not table-level permissions are present.
KaiGai Kohei. Reviewed (in an earlier version) by Stephen Frost, and by me.
Some further changes to the comments by me.
Per discussion with David Christensen, there can be multiple
instances of PG accessible via local sockets, and you need the port
to see which one you're actually connected to. David's original
patch worked this way, but I inadvertently ripped it out during
commit.
Normally, we automatically restart after a backend crash, but in some
cases when PostgreSQL is invoked by clusterware it may be desirable to
suppress this behavior, so we provide an option which does this.
Since no existing GUC group quite fits, create a new group called
"error handling options" for this and the previously undocumented GUC
exit_on_error, which is now documented.
Review by Fujii Masao.
path when CSV logging is configured but not yet operational. It's sufficient
to send the message to stderr, as we were already doing, and the "Not safe"
gripe has already confused at least two core members ...
Backpatch to 9.0, but not further --- doesn't seem appropriate to change
this behavior in stable branches.
any implicit casting previously applied to the targetlist item. This is
reasonable because the implicit cast, by definition, wasn't written by the
user; so we are preserving the expected behavior that ORDER BY items match
textually equivalent tlist items. The case never arose before because there
couldn't be any implicit casting of a top-level SELECT item before we process
ORDER BY etc. But now it can arise in the context of aggregates containing
ORDER BY clauses, since the "targetlist" is the already-casted list of
arguments for the aggregate. The net effect is that the datatype used for
ORDER BY/DISTINCT purposes is the aggregate's declared input type, not that
of the original input column; which is a bit debatable but not horrendous,
and to do otherwise would require major rework that doesn't seem justified.
Per bug #5564 from Daniel Grace. Back-patch to 9.0 where aggregate ORDER BY
was implemented.
This adds a libpq connection parameter requirepeer that specifies the user
name that the server process is expected to run under.
reviewed by KaiGai Kohei
log files created by the syslogger process.
In passing, make unix_file_permissions display its value in octal, same
as log_file_mode now does.
Martin Pihlak
from defining non-self-conflicting constraints.
Jeff Davis
Note: I (tgl) objected to removing this check in 9.0 on the grounds that it
was an important sanity check in new, poorly tested code. However, it should
be all right to remove it for 9.1, since we'll get field testing from the
9.0 branch.
to dump a PUBLIC user mapping correctly, as per bug #5560 from Shigeru Hanada.
Use the pg_user_mappings view rather than trying to access pg_user_mapping
directly, so that the code doesn't fail when run by a non-superuser. And
clean up some minor carelessness such as unsafe usage of fmtId().
Back-patch to 8.4 where this code was added.
parameter against server cert's CN field) to succeed in the case where
both host and hostaddr are specified. As with the existing precedents
for Kerberos, GSSAPI, SSPI, it is the calling application's responsibility
that host and hostaddr match up --- we just use the host name as given.
Per bug #5559 from Christopher Head.
In passing, make the error handling and messages for the no-host-name-given
failure more consistent among these four cases, and correct a lie in the
documentation: we don't attempt to reverse-lookup host from hostaddr
if host is missing.
Back-patch to 8.4 where SSL cert verification was introduced.
rather than just $N. This brings the display of nestloop-inner-indexscan
plans back to where it's been, and incidentally improves the display of
SubPlan parameters as well. In passing, simplify the EXPLAIN code by
having it deal primarily in the PlanState tree rather than separately
searching Plan and PlanState trees. This is noticeably cleaner for
subplans, and about a wash elsewhere.
One small difference from previous behavior is that EXPLAIN will no longer
qualify local variable references in inner-indexscan plan nodes, since it
no longer sees such nodes as possibly referencing multiple tables. Vars
referenced through PARAM_EXEC Params are still forcibly qualified, though,
so I don't think the display is any more confusing than before. Adjust a
couple of examples in the documentation to match this behavior.
loop from being dropped, I missed subtransaction cleanup. Pinned portals
must be dropped at subtransaction cleanup just as they are at main
transaction cleanup.
Per bug #5556 by Robert Walker. Backpatch to 8.0, 7.4 didn't have
subtransactions.
relation using the general PARAM_EXEC executor parameter mechanism, rather
than the ad-hoc kluge of passing the outer tuple down through ExecReScan.
The previous method was hard to understand and could never be extended to
handle parameters coming from multiple join levels. This patch doesn't
change the set of possible plans nor have any significant performance effect,
but it's necessary infrastructure for future generalization of the concept
of an inner indexscan plan.
ExecReScan's second parameter is now unused, so it's removed.
use the actual element type of the array it's disassembling, rather than
trusting the type OID passed in by its caller. This is needed because
sometimes the planner passes in a type OID that's only binary-compatible
with the target column's type, rather than being an exact match. Per an
example from Bernd Helmle.
Possibly we should refactor get_attstatsslot/free_attstatsslot to not expect
the caller to supply type ID data at all, but for now I'll just do the
minimum-change fix.
Back-patch to 7.4. Bernd's test case only crashes back to 8.0, but since
these subroutines are the same in 7.4, I suspect there may be variant
cases that would crash 7.4 as well.
resjunk outputs of subquery tlists, instead of throwing an error. Per bug
#5548 from Daniel Grace.
We might at some point find we ought to back-patch this further than 9.0,
but I think that such Vars can only occur as resjunk members of upper-level
tlists, in which case the problem can't arise because prior versions didn't
print resjunk tlist items in EXPLAIN VERBOSE.
This hook allows a loadable module to gain control when table permissions
are checked. It is expected to be used by an eventual SE-PostgreSQL
implementation, but there are other possible applications as well. A
sample contrib module can be found in the archives at:
http://archives.postgresql.org/pgsql-hackers/2010-05/msg01095.php
Robert Haas and Stephen Frost
(_PG_init should be called only once anyway, but as long as it's got an
internal guard against repeat calls, that should be in front of the
version check.)
This wasn't important when we used diff's -w (--ignore-all-space) option
to compare regression result files, but it is now. Per buildfarm member
canary, which evidently has been offline since we did that in November,
but came to life again today.
sub-select contains a join alias reference that expands into an expression
containing another sub-select. Per yesterday's report from Merlin Moncure
and subsequent off-list investigation.
Back-patch to 7.4. Older versions didn't attempt to flatten sub-selects in
ways that would trigger this problem.
To do that, replace L'\0' by (WCHAR) 0. Perhaps someday we should teach
pgindent about wide-character literals, but so long as this is the only
use-case in the entire Postgres sources, a workaround seems easier.
Per extensive discussion on pgsql-hackers. We are deliberately not
back-patching this even though the behavior of 8.3 and 8.4 is
unquestionably broken, for fear of breaking existing users of this
parameter. This incompatibility should be release-noted.
flag for src/port/ in front of any -L flags placed in LDFLAGS by configure.
This undoes an L-flag-ordering change that I had thought would be safe,
but seems to be making at least one buildfarm member fail --- the only
theory for orca's failure that I can think of is that it's got an old
copy of libpgport.a in /usr/lib. Also allow for LDFLAGS_SL to be set by
contrib makefiles before they invoke Makefile.global.
needs to appear before anything placed in SHLIB_LINK. This is because
SHLIB_LINK is typically a subset of LIBS, and LIBS has to appear after
LDFLAGS on platforms that are sensitive to the relative order of -L and -l
switches.
supposing that they should set SHLIB_LINK rather than LDFLAGS_SL. Since these
don't go through Makefile.shlib that was a no-op on most platforms. Also
regularize the few platform-specific Makefiles that did pay attention to
SHLIB_LINK: it seems that the real value of that is to pull in BE_DLLLIBS,
so do that instead. Per buildfarm failures on cygwin.
linking both executables and shared libraries, and we add on LDFLAGS_EX when
linking executables or LDFLAGS_SL when linking shared libraries. This
provides a significantly cleaner way of dealing with link-time switches than
the former behavior. Also, make sure that the various platform-specific
%.so: %.o rules incorporate LDFLAGS and LDFLAGS_SL; most of them missed that
before. (I did not add these variables for the platforms that invoke $(LD)
directly, however. It's not clear if we can do that safely, since for the
most part we assume these variables use CC command-line syntax.)
Per gripe from Aaron Swenson and subsequent investigation.
being used in a PL/pgSQL FOR loop is closed was inadequate, as Tom Lane
pointed out. The bug affects FOR statement variants too, because you can
close an implicitly created cursor too by guessing the "<unnamed portal X>"
name created for it.
To fix that, "pin" the portal to prevent it from being dropped while it's
being used in a PL/pgSQL FOR loop. Backpatch all the way to 7.4 which is
the oldest supported version.
idea from the start since the variable is only meant to track commit/abort
events. This patch reverts the logic around the variable to what it was in
8.4, except that the value is now kept in shared memory rather than a static
variable, so that it can be reported correctly by CreateRestartPoint (which is
executed in the bgwriter).
to have different values in different processes of the primary server.
Also put it into the "Streaming Replication" GUC category; it doesn't belong
in "Standby Servers" because you use it on the master not the standby.
In passing also correct guc.c's idea of wal_keep_segments' category.
max_standby_streaming_delay, and revise the implementation to avoid assuming
that timestamps found in WAL records can meaningfully be compared to clock
time on the standby server. Instead, the delay limits are compared to the
elapsed time since we last obtained a new WAL segment from archive or since
we were last "caught up" to WAL data arriving via streaming replication.
This avoids problems with clock skew between primary and standby, as well
as other corner cases that the original coding would misbehave in, such
as the primary server having significant idle time between transactions.
Per my complaint some time ago and considerable ensuing discussion.
Do some desultory editing on the hot standby documentation, too.
Backpatch to 8.3, which is as far back as we have opfamilies.
The opclass portion could probably be backpatched to 8.2, when
REASSIGN OWNED was added, but for now I have not done that.
Asko Tiidumaa, with minor adjustments by me.
The previous commit to make copydir() interruptible prevented
postgres.exe from linking on MinGW and Cygwin, because on those
platforms libpgport_srv.a can't freely reference symbols defined
by the backend. Since that code is already backend-specific anyway,
just move the whole file into the backend rather than adding further
kludges to deal with the symbols needed by CHECK_FOR_INTERRUPTS().
This probably needs some further cleanup, but this commit just moves
the file as-is, which should hopefully be enough to turn the
buildfarm green again.
This makes ALTER DATABASE .. SET TABLESPACE and CREATE DATABASE more
sensitive to interrupts. Backpatch to 8.4, where ALTER DATABASE .. SET
TABLESPACE was introduced. We could go back further, but in the absence
of complaints about the CREATE DATABASE case it doesn't seem worth it.
Guillaume Lelarge, with a small correction by me.