Commit Graph

12105 Commits

Author SHA1 Message Date
Peter Eisentraut ba4cacf075 Recode non-ASCII characters in source to UTF-8
For consistency, have all non-ASCII characters from contributors'
names in the source be in UTF-8.  But remove some other more
gratuitous uses of non-ASCII characters.
2011-05-31 23:11:46 +03:00
Tom Lane be4585b1c2 Replace use of credential control messages with getsockopt(LOCAL_PEERCRED).
It turns out the reason we hadn't found out about the portability issues
with our credential-control-message code is that almost no modern platforms
use that code at all; the ones that used to need it now offer getpeereid(),
which we choose first.  The last holdout was NetBSD, and they added
getpeereid() as of 5.0.  So far as I can tell, the only live platform on
which that code was being exercised was Debian/kFreeBSD, ie, FreeBSD kernel
with Linux userland --- since glibc doesn't provide getpeereid(), we fell
back to the control message code.  However, the FreeBSD kernel provides a
LOCAL_PEERCRED socket parameter that's functionally equivalent to Linux's
SO_PEERCRED.  That is both much simpler to use than control messages, and
superior because it doesn't require receiving a message from the other end
at just the right time.

Therefore, add code to use LOCAL_PEERCRED when necessary, and rip out all
the credential-control-message code in the backend.  (libpq still has such
code so that it can still talk to pre-9.1 servers ... but eventually we can
get rid of it there too.)  Clean up related autoconf probes, too.

This means that libpq's requirepeer parameter now works on exactly the same
platforms where the backend supports peer authentication, so adjust the
documentation accordingly.
2011-05-31 16:10:46 -04:00
Tom Lane 13c00ae8c7 Fix portability bugs in use of credentials control messages for peer auth.
Even though our existing code for handling credentials control messages has
been basically unchanged since 2001, it was fundamentally wrong: it did not
ensure proper alignment of the supplied buffer, and it was calculating
buffer sizes and message sizes incorrectly.  This led to failures on
platforms where alignment padding is relevant, for instance FreeBSD on
64-bit platforms, as seen in a recent Debian bug report passed on by
Martin Pitt (http://bugs.debian.org//cgi-bin/bugreport.cgi?bug=612888).

Rewrite to do the message-whacking using the macros specified in RFC 2292,
following a suggestion from Theo de Raadt in that thread.  Tested by me
on Debian/kFreeBSD-amd64; since OpenBSD and NetBSD document the identical
CMSG API, it should work there too.

Back-patch to all supported branches.
2011-05-30 19:16:05 -04:00
Tom Lane b4b6923e03 Fix VACUUM so that it always updates pg_class.reltuples/relpages.
When we added the ability for vacuum to skip heap pages by consulting the
visibility map, we made it just not update the reltuples/relpages
statistics if it skipped any pages.  But this could leave us with extremely
out-of-date stats for a table that contains any unchanging areas,
especially for TOAST tables which never get processed by ANALYZE.  In
particular this could result in autovacuum making poor decisions about when
to process the table, as in recent report from Florian Helmberger.  And in
general it's a bad idea to not update the stats at all.  Instead, use the
previous values of reltuples/relpages as an estimate of the tuple density
in unvisited pages.  This approach results in a "moving average" estimate
of reltuples, which should converge to the correct value over multiple
VACUUM and ANALYZE cycles even when individual measurements aren't very
good.

This new method for updating reltuples is used by both VACUUM and ANALYZE,
with the result that we no longer need the grotty interconnections that
caused ANALYZE to not update the stats depending on what had happened
in the parent VACUUM command.

Also, fix the logic for skipping all-visible pages during VACUUM so that it
looks ahead rather than behind to decide what to do, as per a suggestion
from Greg Stark.  This eliminates useless scanning of all-visible pages at
the start of the relation or just after a not-all-visible page.  In
particular, the first few pages of the relation will not be invariably
included in the scanned pages, which seems to help in not overweighting
them in the reltuples estimate.

Back-patch to 8.4, where the visibility map was introduced.
2011-05-30 17:06:52 -04:00
Magnus Hagander 5830f69665 Refuse "local" lines in pg_hba.conf on platforms that don't support it
This makes the behavior compatible with that of hostssl, which
also throws an error when there is no SSL support included.
2011-05-30 20:43:41 +02:00
Magnus Hagander 764bde0f16 Don't include local line on platforms without support
Since we now include a sample line for replication on local
connections in pg_hba.conf, don't include it where local
connections aren't available (such as on win32).

Also make sure we use authmethodlocal and not authmethod on
the sample line.
2011-05-30 20:21:06 +02:00
Heikki Linnakangas 3103f9a77d The row-version chaining in Serializable Snapshot Isolation was still wrong.
On further analysis, it turns out that it is not needed to duplicate predicate
locks to the new row version at update, the lock on the version that the
transaction saw as visible is enough. However, there was a different bug in
the code that checks for dangerous structures when a new rw-conflict happens.
Fix that bug, and remove all the row-version chaining related code.

Kevin Grittner & Dan Ports, with some comment editorialization by me.
2011-05-30 20:47:17 +03:00
Tom Lane 5e1365a965 Fix null-dereference crash in parse_xml_decl().
parse_xml_decl's header comment says you can pass NULL for any unwanted
output parameter, but it failed to honor this contract for the "standalone"
flag.  The only currently-affected caller is xml_recv, so the net effect is
that sending a binary XML value containing a standalone parameter in its
xml declaration would crash the backend.  Per bug #6044 from Christopher
Dillard.

In passing, remove useless initializations of parse_xml_decl's output
parameters in xml_parse.

Back-patch to 8.3, where this code was introduced.
2011-05-28 12:36:04 -04:00
Alvaro Herrera 4c60a77508 Remove unused variable
Cédric Villemain
2011-05-27 21:49:22 -04:00
Tom Lane 90857b48e1 Preserve caller's memory context in ProcessCompletedNotifies().
This is necessary to avoid long-term memory leakage, because the main loop
in PostgresMain expects to be executing in MessageContext, and hence is a
bit sloppy about freeing stuff that is only needed for the duration of
processing the current client message.  The known case of an actual leak
is when encoding conversion has to be done on the incoming command string,
but there might be others.  Per report from Per-Olov Esgard.

Back-patch to 9.0, where the bug was introduced by the LISTEN/NOTIFY
rewrite.
2011-05-27 12:10:32 -04:00
Tom Lane 3987e9e620 Make decompilation of optimized CASE constructs more robust.
We had some hacks in ruleutils.c to cope with various odd transformations
that the optimizer could do on a CASE foo WHEN "CaseTestExpr = RHS" clause.
However, the fundamental impossibility of covering all cases was exposed
by Heikki, who pointed out that the "=" operator could get replaced by an
inlined SQL function, which could contain nearly anything at all.  So give
up on the hacks and just print the expression as-is if we fail to recognize
it as "CaseTestExpr = RHS".  (We must cover that case so that decompiled
rules print correctly; but we are not under any obligation to make EXPLAIN
output be 100% valid SQL in all cases, and already could not do so in some
other cases.)  This approach requires that we have some printable
representation of the CaseTestExpr node type; I used "CASE_TEST_EXPR".

Back-patch to all supported branches, since the problem case fails in all.
2011-05-26 19:25:19 -04:00
Bruce Momjian 0711a8b2b3 Add C comment about why we don't spell out "month" in interval values. 2011-05-24 23:55:27 -04:00
Tom Lane b23aeb6519 Cleanup for pull-up-isReset patch.
Clear isReset before, not after, calling the context-specific alloc method,
so as to preserve the option to do a tail call in MemoryContextAlloc
(and also so this code isn't assuming that a failed alloc call won't have
changed the context's state before failing).  Fix missed direct invocation
of reset method.  Reformat a comment.
2011-05-24 17:57:32 -04:00
Peter Eisentraut f50655900a Add a "local" replication sample entry
Also adjust alignment a bit to distinguish commented out from comment.
2011-05-24 21:35:06 +03:00
Tom Lane cc24fb418d Avoid uninitialized bits in the result of QTN2QT().
Found with additional valgrind testing.

Noah Misch
2011-05-24 14:20:08 -04:00
Heikki Linnakangas 34be83b7e1 Fix integer overflow in text_format function, reported by Dean Rasheed.
In the passing, clarify the comment on why text_format_nv wrapper is needed.
2011-05-23 22:24:44 +03:00
Robert Haas 7149b128dc Improve hash_array() logic for combining hash values.
The new logic is less vulnerable to transpositions.

This invalidates the contents of hash indexes built with the old
functions; hence, bump catversion.

Dean Rasheed
2011-05-23 15:17:18 -04:00
Tom Lane 299d171652 Install defenses against overflow in BuildTupleHashTable().
The planner can sometimes compute very large values for numGroups, and in
cases where we have no alternative to building a hashtable, such a value
will get fed directly to BuildTupleHashTable as its nbuckets parameter.
There were two ways in which that could go bad.  First, BuildTupleHashTable
declared the parameter as "int" but most callers were passing "long"s,
so on 64-bit machines undetected overflow could occur leading to a bogus
negative value.  The obvious fix for that is to change the parameter to
"long", which is what I've done in HEAD.  In the back branches that seems a
bit risky, though, since third-party code might be calling this function.
So for them, just put in a kluge to treat negative inputs as INT_MAX.
Second, hash_create can go nuts with extremely large requested table sizes
(notably, my_log2 becomes an infinite loop for inputs larger than
LONG_MAX/2).  What seems most appropriate to avoid that is to bound the
initial table size request to work_mem.

This fixes bug #6035 reported by Daniel Schreiber.  Although the reported
case only occurs back to 8.4 since it involves WITH RECURSIVE, I think
it's a good idea to install the defenses in all supported branches.
2011-05-23 12:52:46 -04:00
Heikki Linnakangas 30e98a7e6e Pull up isReset flag from AllocSetContext to MemoryContext struct. This
avoids the overhead of one function call when calling MemoryContextReset(),
and it seems like the isReset optimization would be applicable to any new
memory context we might invent in the future anyway.

This buys back the overhead I just added in previous patch to always call
MemoryContextReset() in ExecScan, even when there's no quals or projections.
2011-05-21 14:47:19 -04:00
Heikki Linnakangas 0319da638f Reset per-tuple memory context between every row in a scan node, even when
there's no quals or projections. Currently this only matters for foreign
scans, as none of the other scan nodes litter the per-tuple memory context
when there's no quals or projections.
2011-05-21 14:30:11 -04:00
Peter Eisentraut bcf63a51e3 Message style improvements 2011-05-21 00:50:35 +03:00
Peter Eisentraut bb46d42859 Consistent spacing for lengthy error messages
Also, we removed the display of the current value of
max_connections/MaxBackends from some messages earlier, because it was
confusing, so do that in the remaining one as well.
2011-05-19 21:38:24 +03:00
Magnus Hagander a937b07121 Add example for replication in pg_hba.conf
Selena Deckelmann
2011-05-19 14:03:15 -04:00
Robert Haas 74aaa2136d Fix race condition in CheckTargetForConflictsIn.
Dan Ports
2011-05-19 12:12:04 -04:00
Peter Eisentraut c13dc6402b Spell checking and markup refinement 2011-05-19 01:14:45 +03:00
Robert Haas 9bb6d97952 More cleanup of FOREIGN TABLE permissions handling.
This commit fixes psql, pg_dump, and the information schema to be
consistent with the backend changes which I made as part of commit
be90032e0d, and also includes a
related documentation tweak.

Shigeru Hanada, with slight adjustment.
2011-05-13 15:51:03 -04:00
Robert Haas c5ab8425be Kill stray "not". 2011-05-12 17:10:30 -04:00
Alvaro Herrera c6eb5740b3 Fix assorted typos 2011-05-12 08:52:56 -04:00
Tom Lane e05b866447 Split PGC_S_DEFAULT into two values, for true boot_val vs computed default.
Failure to distinguish these cases is the real cause behind the recent
reports of Windows builds crashing on 'infinity'::timestamp, which was
directly due to failure to establish a value of timezone_abbreviations
in postmaster child processes.  The postmaster had the desired value,
but write_one_nondefault_variable() didn't transmit it to backends.

To fix that, invent a new value PGC_S_DYNAMIC_DEFAULT, and be sure to use
that or PGC_S_ENV_VAR (as appropriate) for "default" settings that are
computed during initialization.  (We need both because there's at least
one variable that could receive a value from either source.)

This commit also fixes ProcessConfigFile's failure to restore the correct
default value for certain GUC variables if they are set in postgresql.conf
and then removed/commented out of the file.  We have to recompute and
reinstall the value for any GUC variable that could have received a value
from PGC_S_DYNAMIC_DEFAULT or PGC_S_ENV_VAR sources, and there were a
number of oversights.  (That whole thing is a crock that needs to be
redesigned, but not today.)

However, I intentionally didn't make it work "exactly right" for the cases
of timezone and log_timezone.  The exactly right behavior would involve
running select_default_timezone, which we'd have to do independently in
each postgres process, causing the whole database to become entirely
unresponsive for as much as several seconds.  That didn't seem like a good
idea, especially since the variable's removal from postgresql.conf might be
just an accidental edit.  Instead the behavior is to adopt the previously
active setting as if it were default.

Note that this patch creates an ABI break for extensions that use any of
the PGC_S_XXX constants; they'll need to be recompiled.
2011-05-11 19:57:38 -04:00
Tom Lane 6fc6686b48 Clean up parsing of CREATE TRIGGER's argument list.
Use ColLabel in place of ColId, so that reserved words are accepted as if
they were not reserved.  Also, remove BCONST and XCONST, which were never
documented as allowed.  Allowing those exposes to users an implementation
detail, namely the format in which the lexer outputs such constants, that
seems unwise to expose.

No documentation change needed, since this just makes the code act more
like you'd expect from reading the CREATE TRIGGER man page.

Per complaint from Szymon Guz and subsequent discussion.
2011-05-11 14:43:01 -04:00
Heikki Linnakangas a0c8514149 Shut down WAL receiver if it's still running at end of recovery. We used to
just check that it's not running and PANIC if it was, but that can rightfully
happen if recovery stops at recovery target.
2011-05-11 12:46:08 +03:00
Tom Lane 2e82d0b396 Prevent datebsearch() from crashing on base == NULL && nel == 0.
Normally nel == 0 works okay because the initial value of "last" will be
less than "base"; but if "base" is zero then the calculation wraps around
and we have a very large (unsigned) value for "last", so that the loop can
be entered and we get a SIGSEGV on a bogus pointer.

This is certainly the proximate cause of the recent reports of Windows
builds crashing on 'infinity'::timestamp --- evidently, they're either not
setting an active timezonetktbl, or setting an empty one.  It's not yet
clear to me why it's only happening on Windows and not happening on any
buildfarm member.  But even if that's due to some bug elsewhere, it seems
wise for this function to not choke on the powerup values of
timezonetktbl/sztimezonetktbl.

I also changed the copy of this code in ecpglib, although I am not sure
whether it's exposed to a similar hazard.

Per report and stack trace from Richard Broersma.
2011-05-10 20:37:26 -04:00
Tom Lane 1453cd8f82 Adjust documentation with respect to "unknown" timezone setting.
The recent cleanup of GUC assign hooks got rid of the kludge of using
"unknown" as a magic value for timezone and log_timezone.  But I forgot
to update the documentation to match, as noted by Martin Pitt.
2011-05-10 13:48:40 -04:00
Bruce Momjian 76e5b4c85d Add C comment about the fact that the autovacuum limit can go backwards
by 3, but that is it OK.
2011-05-08 23:59:31 -04:00
Robert Haas 71932ecc2b Add comment about memory reordering to PredicateLockTupleRowVersionLink.
Dan Ports, per head-scratching from Simon Riggs and myself.
2011-05-06 21:55:10 -04:00
Tom Lane d2088ae949 Move RegisterPredicateLockingXid() call to a safer place.
The SSI patch inserted a call of RegisterPredicateLockingXid into
GetNewTransactionId, which was a bad idea on a couple of grounds.  First,
it's not necessary to hold XidGenLock while manipulating that shared
memory, and doing so is bad because XidGenLock is a high-contention lock
that should be held for as short a time as possible.  (Not to mention that
it adds an entirely unnecessary deadlock hazard, since we must take
SerializableXactHashLock as well.)  Second, the specific place where it was
put was between extending CLOG and advancing nextXid, which could result in
unpleasant behavior in case of a failure there.  Pull the call out to
AssignTransactionId, which is much safer and arguably better from a
modularity standpoint too.

There is more work to do to clean up the failure-before-advancing-nextXid
issue, but that is a separate change that will need to be back-patched.
So for the moment I just want to make GetNewTransactionId look the same as
it did in prior versions.
2011-05-06 12:57:28 -04:00
Tom Lane 12b7164578 Remove precedence labeling of keywords TRUE, FALSE, UNKNOWN, and ZONE.
These were labeled with precedences just to avoid attaching explicit
precedences to the productions in which they were the last terminal symbol.
Since a terminal symbol precedence marking can affect many other things
too, it seems like better practice to attach precedence labels to the
productions, and not mark the terminal symbols.

Ideally we'd also remove the precedence attached to NULL_P, but it turns
out that we are actually depending on that having a precedence higher than
POSTFIXOP, else we get a shift/reduce conflict for postfix operators in
b_expr.  (Which more or less proves my point about these markings having a
high risk of unexpected consequences.)  For the moment, move NULL_P into
the set of keywords grouped with IDENT, so that at least it will act
similarly to non-keywords; and document the interaction.
2011-05-05 20:38:52 -04:00
Magnus Hagander d76a149c95 Clarify error message when attempting to create index on foreign table
Instead of just saying "is not a table", specifically state that
indexes aren't supported on *foreign* tables.
2011-05-05 21:47:42 +02:00
Tom Lane dcc685debb Fix pull_up_sublinks' failure to handle nested pull-up opportunities.
After finding an EXISTS or ANY sub-select that can be converted to a
semi-join or anti-join, we should recurse into the body of the sub-select.
This allows cases such as EXISTS-within-EXISTS to be optimized properly.
The original coding would leave the lower sub-select as a SubLink, which
is no better and often worse than what we can do with a join.  Per example
from Wayne Conrad.

Back-patch to 8.4.  There is a related issue in older versions' handling
of pull_up_IN_clauses, but they're lame enough anyway about the whole area
that it seems not worth the extra work to try to fix.
2011-05-02 15:57:28 -04:00
Tom Lane 6755558b92 Improve aset.c's space management in contexts with small maxBlockSize.
The previous coding would allow requests up to half of maxBlockSize to be
treated as "chunks", but when that actually did happen, we'd waste nearly
half of the space in the malloc block containing the chunk, if no smaller
requests came along to fill it.  Avoid this scenario by limiting the
maximum size of a chunk to 1/8th maxBlockSize, so that we can waste no more
than 1/8th of the allocated space.  This will not change the behavior at
all for the default context size parameters (with large maxBlockSize),
but it will change the behavior when using ALLOCSET_SMALL_MAXSIZE.

In particular, there's no longer a need for spell.c to be overly concerned
about the request size parameters it uses, so remove a rather unhelpful
comment about that.

Merlin Moncure, per an idea of Tom Lane's
2011-05-02 12:08:08 -04:00
Tom Lane 83b7584944 Make CLUSTER lock the old table's toast table before copying data.
We must lock out autovacuuming of the old toast table before computing the
OldestXmin horizon we will use.  Otherwise, autovacuum could start on the
toast table later, compute a later OldestXmin horizon, and remove as DEAD
toast tuples that we still need (because we think their parent tuples are
only RECENTLY_DEAD).  Per further thought about bug #5998.
2011-05-01 17:57:33 -04:00
Bruce Momjian 5a71b64130 Lowercase status labels in pg_stat_replication view. 2011-04-29 22:20:43 -04:00
Tom Lane 44e4bbf75d Remove special case for xmin == xmax in HeapTupleSatisfiesVacuum().
VACUUM was willing to remove a committed-dead tuple immediately if it was
deleted by the same transaction that inserted it.  The idea is that such a
tuple could never have been visible to any other transaction, so we don't
need to keep it around to satisfy MVCC snapshots.  However, there was
already an exception for tuples that are part of an update chain, and this
exception created a problem: we might remove TOAST tuples (which are never
part of an update chain) while their parent tuple stayed around (if it was
part of an update chain).  This didn't pose a problem for most things,
since the parent tuple is indeed dead: no snapshot will ever consider it
visible.  But MVCC-safe CLUSTER had a problem, since it will try to copy
RECENTLY_DEAD tuples to the new table.  It then has to copy their TOAST
data too, and would fail if VACUUM had already removed the toast tuples.

Easiest fix is to get rid of the special case for xmin == xmax.  This may
delay reclaiming dead space for a little bit in some cases, but it's by far
the most reliable way to fix the issue.

Per bug #5998 from Mark Reid.  Back-patch to 8.3, which is the oldest
version with MVCC-safe CLUSTER.
2011-04-29 16:29:42 -04:00
Tom Lane fd2e2d09aa Rewrite pg_size_pretty() to avoid compiler bug.
Convert it to use successive shifts right instead of increasing a divisor.
This is probably a tad more efficient than the original coding, and it's
nicer-looking than the previous patch because we don't need a special case
to avoid overflow in the last branch.  But the real reason to do it is to
avoid a Solaris compiler bug, as per results from buildfarm member moa.
2011-04-29 01:45:58 -04:00
Andrew Dunstan ab0ba6e73a Add some casts to try to silence most of the remaining format warnings on MinGW-W64. 2011-04-28 15:05:58 -04:00
Andrew Dunstan c02d5b7c27 Use a macro variable PG_PRINTF_ATTRIBUTE for the style used for checking printf type functions.
The style is set to "printf" for backwards compatibility everywhere except
on Windows, where it is set to "gnu_printf", which eliminates hundreds of
false error messages from modern versions of gcc arising from  %m and %ll{d,u}
formats.
2011-04-28 10:56:14 -04:00
Tom Lane 18c0b4eccd Fix array- and path-creating functions to ensure padding bytes are zeroes.
Per recent discussion, it's important for all computed datums (not only the
results of input functions) to not contain any ill-defined (uninitialized)
bits.  Failing to ensure that can result in equal() reporting that
semantically indistinguishable Consts are not equal, which in turn leads to
bizarre and undesirable planner behavior, such as in a recent example from
David Johnston.  We might eventually try to fix this in a general manner by
allowing datatypes to define identity-testing functions, but for now the
path of least resistance is to expect datatypes to force all unused bits
into consistent states.

Per some testing by Noah Misch, array and path functions seem to be the
only ones presenting risks at the moment, so I looked through all the
functions in adt/array*.c and geo_ops.c and fixed them as necessary.  In
the array functions, the easiest/safest fix is to allocate result arrays
with palloc0 instead of palloc.  Possibly in future someone will want to
look into whether we can just zero the padding bytes, but that looks too
complex for a back-patchable fix.  In the path functions, we already had a
precedent in path_in for just zeroing the one known pad field, so duplicate
that code as needed.

Back-patch to all supported branches.
2011-04-27 13:58:36 -04:00
Andrew Dunstan 348c10efe0 Revert "Remove hard coded formats for INT64 and use configured settings instead."
This reverts commit 9b1508af89.

As requested by Tom.
2011-04-27 11:28:14 -04:00
Andrew Dunstan 9b1508af89 Remove hard coded formats for INT64 and use configured settings instead. 2011-04-27 11:07:52 -04:00
Andrew Dunstan c43d0791ac Use an explicit format string to keep the compiler happy. 2011-04-27 10:02:21 -04:00
Tom Lane 71e7083532 Rephrase some not-supported error messages in pg_hba.conf processing.
In a couple of places we said "not supported on this platform" for cases
that aren't really platform-specific, but could depend on configuration
options such as --with-openssl.  Use "not supported by this build" instead,
as that doesn't convey the impression that you can't fix it without moving
to another OS; that's also more consistent with the wording used for an
identical error case in guc.c.

No back-patch, as the clarity gain is small enough to not be worth
burdening translators with back-branch changes.
2011-04-26 15:56:28 -04:00
Tom Lane c464a0657b Complain if pg_hba.conf contains "hostssl" but SSL is disabled.
Most commenters agreed that this is more friendly than silently failing
to match the line during actual connection attempts.  Also, this will
prevent corner cases that might arise when trying to handle such a line
when the SSL code isn't turned on.  An example is that specifying
clientcert=1 in such a line would formerly result in a completely
misleading complaint that root.crt wasn't present, as seen in a recent
report from Marc-Andre Laverdiere.  While we could have instead fixed
that specific behavior, it seems likely that we'd have a continuing stream
of such bizarre behaviors if we keep on allowing hostssl lines when SSL is
disabled.

Back-patch to 8.4, where clientcert was introduced.  Earlier versions don't
have this specific issue, and the code is enough different to make this
patch not applicable without more work than it seems worth.
2011-04-26 15:40:11 -04:00
Tom Lane 6dab96abaa Remove incorrect HINT for use of ALTER FOREIGN TABLE on the wrong relkind.
Per discussion, removing the hint seems better than correcting it because
the adjacent analogous cases in RenameRelation don't have any hints, and
nobody seems to have missed 'em.

Shigeru Hanada
2011-04-25 20:13:53 -04:00
Robert Haas 68ef051f5c Refactor broken CREATE TABLE IF NOT EXISTS support.
Per bug #5988, reported by Marko Tiikkaja, and further analyzed by Tom
Lane, the previous coding was broken in several respects: even if the
target table already existed, a subsequent CREATE TABLE IF NOT EXISTS
might try to add additional constraints or sequences-for-serial
specified in the new CREATE TABLE statement.

In passing, this also fixes a minor information leak: it's no longer
possible to figure out whether a schema to which you don't have CREATE
access contains a sequence named like "x_y_seq" by attempting to create a
table in that schema called "x" with a serial column called "y".

Some more refactoring of this code in the future might be warranted,
but that will need to wait for a later major release.
2011-04-25 16:55:11 -04:00
Robert Haas be90032e0d Remove partial and undocumented GRANT .. FOREIGN TABLE support.
Instead, foreign tables are treated just like views: permissions can
be granted using GRANT privilege ON [TABLE] foreign_table_name TO role,
and revoked similarly.  GRANT/REVOKE .. FOREIGN TABLE is no longer
supported, just as we don't support GRANT/REVOKE .. VIEW.  The set of
accepted permissions for foreign tables is now identical to the set for
regular tables, and views.

Per report from Thom Brown, and subsequent discussion.
2011-04-25 16:39:18 -04:00
Tom Lane af0f20092c Fix pg_size_pretty() to avoid overflow for inputs close to INT64_MAX.
The expression that tried to round the value to the nearest TB could
overflow, leading to bogus output as reported in bug #5993 from Nicola
Cossu.  This isn't likely to ever happen in the intended usage of the
function (if it could, we'd be needing to use a wider datatype instead);
but it's not hard to give the expected output, so let's do so.
2011-04-25 16:22:12 -04:00
Andrew Dunstan 860be17ec3 Assorted minor changes to silence Windows compiler warnings.
Mostly to do with macro redefinitions or object signedness.
2011-04-25 12:56:53 -04:00
Bruce Momjian 76dd09bbec Add postmaster/postgres undocumented -b option for binary upgrades.
This option turns off autovacuum, prevents non-super-user connections,
and enables oid setting hooks in the backend.  The code continues to use
the old autoavacuum disable settings for servers with earlier catalog
versions.

This includes a catalog version bump to identify servers that support
the -b option.
2011-04-25 12:00:21 -04:00
Robert Haas 02e6a115cc Add fast paths for cases when no serializable transactions are running.
Dan Ports
2011-04-25 09:52:01 -04:00
Robert Haas b429519d8d Fix SSI-related assertion failure.
Bug #5899, reported by Marko Tiikkaja.

Heikki Linnakangas, reviewed by Kevin Grittner and Dan Ports.
2011-04-25 09:47:28 -04:00
Tom Lane e6a30a8c3c Improve cost estimation for aggregates and window functions.
The previous coding failed to account properly for the costs of evaluating
the input expressions of aggregates and window functions, as seen in a
recent gripe from Claudio Freire.  (I said at the time that it wasn't
counting these costs at all; but on closer inspection, it was effectively
charging these costs once per output tuple.  That is completely wrong for
aggregates, and not exactly right for window functions either.)

There was also a hard-wired assumption that aggregates and window functions
had procost 1.0, which is now fixed to respect the actual cataloged costs.

The costing of WindowAgg is still pretty bogus, since it doesn't try to
estimate the effects of spilling data to disk, but that seems like a
separate issue.
2011-04-24 16:55:20 -04:00
Andrew Dunstan d98711dfef Silence a few compiler warnings from gcc on MinGW.
Most of these cast DWORD to int or unsigned int for printf type handling.
This is safe even on 64 bit architectures because a DWORD is always 32 bits.

In one case a variable is initialised to keep the compiler happy.
2011-04-23 18:10:23 -04:00
Tom Lane a0b75a41a9 Hash indexes had better pass the index collation to support functions, too.
Per experimentation with contrib/citext, whose hash function assumes that
it'll be passed a collation.
2011-04-23 14:13:09 -04:00
Tom Lane 2ab0796d7a Fix char2wchar/wchar2char to support collations properly.
These functions should take a pg_locale_t, not a collation OID, and should
call mbstowcs_l/wcstombs_l where available.  Where those functions are not
available, temporarily select the correct locale with uselocale().

This change removes the bogus assumption that all locales selectable in
a given database have the same wide-character conversion method; in
particular, the collate.linux.utf8 regression test now passes with
LC_CTYPE=C, so long as the database encoding is UTF8.

I decided to move the char2wchar/wchar2char functions out of mbutils.c and
into pg_locale.c, because they work on wchar_t not pg_wchar_t and thus
don't really belong with the mbutils.c functions.  Keeping them where they
were would have required importing pg_locale_t into pg_wchar.h somehow,
which did not seem like a good plan.
2011-04-23 12:35:41 -04:00
Tom Lane ae20bf1740 Make GIN and GIST pass the index collation to all their support functions.
Experimentation with contrib/btree_gist shows that the majority of the GIST
support functions potentially need collation information.  Safest policy
seems to be to pass it to all of them, instead of making assumptions about
which ones could possibly need it.
2011-04-22 20:13:12 -04:00
Tom Lane 9e9b9ac7d1 Make a code-cleanup pass over the collations patch.
This patch is almost entirely cosmetic --- mostly cleaning up a lot of
neglected comments, and fixing code layout problems in places where the
patch made lines too long and then pgindent did weird things with that.
I did find a bug-of-omission in equalTupleDescs().
2011-04-22 17:43:18 -04:00
Tom Lane 92647fc4b9 Avoid possible divide-by-zero in gincostestimate.
Per report from Jeff Janes.
2011-04-21 19:28:36 -04:00
Robert Haas a0e8df527e Allow ALTER TYPE .. ADD ATTRIBUTE .. CASCADE to recurse to descendants.
Without this, adding an attribute to a typed table with an inheritance
child fails, which is surprising.

Noah Misch, with minor changes by me.
2011-04-20 22:49:37 -04:00
Robert Haas 8ede427938 Fix use of incorrect constant RemoveRoleFromObjectACL.
This could cause failures when DROP OWNED BY attempt to remove default
privileges on sequences.  Back-patching to 9.0.

Shigeru Hanada
2011-04-20 22:23:58 -04:00
Robert Haas 0babcdf6cf Typo fix. 2011-04-20 22:05:16 -04:00
Robert Haas 68739ba856 Allow ALTER TABLE name {OF type | NOT OF}.
This syntax allows a standalone table to be made into a typed table,
or a typed table to be made standalone.  This is possibly a mildly
useful feature in its own right, but the real motivation for this
change is that we need it to make pg_upgrade work with typed tables.
This doesn't actually fix that problem, but it's necessary
infrastructure.

Noah Misch
2011-04-20 21:38:47 -04:00
Tom Lane 520bcd9c9b Fix bugs in indexing of in-doubt HOT-updated tuples.
If we find a DELETE_IN_PROGRESS HOT-updated tuple, it is impossible to know
whether to index it or not except by waiting to see if the deleting
transaction commits.  If it doesn't, the tuple might again be LIVE, meaning
we have to index it.  So wait and recheck in that case.

Also, we must not rely on ii_BrokenHotChain to decide that it's possible to
omit tuples from the index.  That could result in omitting tuples that we
need, particularly in view of yesterday's fixes to not necessarily set
indcheckxmin (but it's broken even without that, as per my analysis today).
Since this is just an extremely marginal performance optimization, dropping
the test shouldn't hurt.

These cases are only expected to happen in system catalogs (they're
possible there due to early release of RowExclusiveLock in most
catalog-update code paths).  Since reindexing of a system catalog isn't a
particularly performance-critical operation anyway, there's no real need to
be concerned about possible performance degradation from these changes.

The worst aspects of this bug were introduced in 9.0 --- 8.x will always
wait out a DELETE_IN_PROGRESS tuple.  But I think dropping index entries
on the strength of ii_BrokenHotChain is dangerous even without that, so
back-patch removal of that optimization to 8.3 and 8.4.
2011-04-20 20:34:11 -04:00
Tom Lane 9ad7e15507 Set indcheckxmin true when REINDEX fixes an invalid or not-ready index.
Per comment from Greg Stark, it's less clear that HOT chains don't conflict
with the index than it would be for a valid index.  So, let's preserve the
former behavior that indcheckxmin does get set when there are
potentially-broken HOT chains in this case.  This change does not cause any
pg_index update that wouldn't have happened anyway, so we're not
re-introducing the previous bug with pg_index updates, and surely the case
is not significant from a performance standpoint; so let's be as
conservative as possible.
2011-04-20 19:01:20 -04:00
Tom Lane 5b8e442953 Make plan_cluster_use_sort cope with no IndexOptInfo for the target index.
The original coding assumed that such a case represents caller error, but
actually get_relation_info will omit generating an IndexOptInfo for any
index it thinks is unsafe to use.  Therefore, handle this case by returning
"true" to indicate that a seqscan-and-sort is the preferred way to
implement the CLUSTER operation.  New bug in 9.1, no backpatch needed.
Per bug #5985 from Daniel Grace.
2011-04-20 17:44:17 -04:00
Tom Lane 8c19977e9c Avoid changing an index's indcheckxmin horizon during REINDEX.
There can never be a need to push the indcheckxmin horizon forward, since
any HOT chains that are actually broken with respect to the index must
pre-date its original creation.  So we can just avoid changing pg_index
altogether during a REINDEX operation.

This offers a cleaner solution than my previous patch for the problem
found a few days ago that we mustn't try to update pg_index while we are
reindexing it.  System catalog indexes will always be created with
indcheckxmin = false during initdb, and with this modified code we should
never try to change their pg_index entries.  This avoids special-casing
system catalogs as the former patch did, and should provide a performance
benefit for many cases where REINDEX formerly caused an index to be
considered unusable for a short time.

Back-patch to 8.3 to cover all versions containing HOT.  Note that this
patch changes the API for index_build(), but I believe it is unlikely that
any add-on code is calling that directly.
2011-04-19 18:50:56 -04:00
Tom Lane c096d19b74 Revert "Prevent incorrect updates of pg_index while reindexing pg_index itself."
This reverts commit 4b6106ccfe of 2011-04-15.
There's a better way to do it, which will follow shortly.
2011-04-19 16:55:34 -04:00
Tom Lane 390cf3209b Refrain from canonicalizing a client_encoding setting of "UNICODE".
While "UTF8" is the correct name for this encoding, existing JDBC drivers
expect that if they send "UNICODE" it will read back the same way; they
fail with an opaque "Protocol error" complaint if not.  This will be fixed
in the 9.1 drivers, but until older drivers are no longer in use in the
wild, we'd better leave "UNICODE" alone.  Continue to canonicalize all
other inputs.  Per report from Steve Singer and subsequent discussion.
2011-04-19 12:25:13 -04:00
Tom Lane 918854cc08 Fix handling of collations in multi-row VALUES constructs.
Per spec we ought to apply select_common_collation() across the expressions
in each column of the VALUES table.  The original coding was just taking
the first row and assuming it was representative.

This patch adds a field to struct RangeTblEntry to carry the resolved
collations, so initdb is forced for changes in stored rule representation.
2011-04-18 15:31:52 -04:00
Robert Haas 04db0fdbfa Only allow typed tables to hang off composite types, not e.g. tables.
This also ensures that we take a relation lock on the composite type when
creating a typed table, which is necessary to prevent the composite type
and the typed table from getting out of step in the face of concurrent
DDL.

Noah Misch, with some changes.
2011-04-18 10:19:46 -04:00
Robert Haas aea1f24c2c recoveryStopsHere() must check the resource manager ID.
Before commit c016ce7281, this wasn't
needed, but now that multiple resource manager IDs can percolate down
through here, we have to make sure we know which one we've got.
Otherwise, we can confuse (for example) an XLOG_XACT_COMMIT record
with an XLOG_CHECKPOINT_SHUTDOWN record.

Review by Jaime Casanova
2011-04-18 08:27:19 -04:00
Tom Lane 49a642ab18 Add check for matching column collations in ALTER TABLE ... INHERIT.
The other DDL operations that create an inheritance relationship were
checking for collation match already, but this one got missed.

Also fix comments that failed to mention collation checks.
2011-04-17 16:22:13 -04:00
Tom Lane 88dc6fa7a1 foreach() and list_delete() don't mix.
Fix crash when releasing duplicate entries in the encoding conversion cache
list, caused by releasing the current entry of the list being chased by
foreach().  We have a standard idiom for handling such cases, but this
loop wasn't using it.

This got broken in my recent rewrite of GUC assign hooks.  Not sure how
I missed this when testing the modified code, but I did.  Per report from
Peter.
2011-04-17 13:37:39 -04:00
Tom Lane d2f60a3ab0 Add an Assert that indexam.c isn't used on an index awaiting reindexing.
This might have caught the recent embarrassment over trying to modify
pg_index while its indexes were being rebuilt.

Noah Misch
2011-04-16 19:29:10 -04:00
Tom Lane 2d3320d3d2 Simplify reindex_relation's API.
For what seem entirely historical reasons, a bitmask "flags" argument was
recently added to reindex_relation without subsuming its existing boolean
argument into that bitmask.  This seems a bit bizarre, so fold them
together.
2011-04-16 17:26:41 -04:00
Tom Lane 121f49a00e Clean up collation processing in prepunion.c.
This area was a few bricks shy of a load, and badly under-commented too.
We have to ensure that the generated targetlist entries for a set-operation
node expose the correct collation for each entry, since higher-level
processing expects the tlist to reflect the true ordering of the plan's
output.

This hackery wouldn't be necessary if SortGroupClause carried collation
info ... but making it do so would inject more pain in the parser than
would be saved here.  Still, we might want to rethink that sometime.
2011-04-16 16:40:42 -04:00
Tom Lane 4b6106ccfe Prevent incorrect updates of pg_index while reindexing pg_index itself.
The places that attempt to change pg_index.indcheckxmin during a reindexing
operation cannot be executed safely if pg_index itself is the subject of
the operation.  This is the explanation for a couple of recent reports of
VACUUM FULL failing with
	ERROR:  duplicate key value violates unique constraint "pg_index_indexrelid_index"
	DETAIL:  Key (indexrelid)=(2678) already exists.

However, there isn't any real need to update indcheckxmin in such a
situation, if we assume that pg_index can never contain a truly broken HOT
chain.  This assumption holds if new indexes are never created on it during
concurrent operations, which is something we don't consider safe for any
system catalog, not just pg_index.  Accordingly, modify the code to not
manipulate indcheckxmin when reindexing any system catalog.

Back-patch to 8.3, where HOT was introduced.  The known failure scenarios
involve 9.0-style VACUUM FULL, so there might not be any real risk before
9.0, but let's not assume that.
2011-04-15 20:18:57 -04:00
Tom Lane 72826fb362 Guard against incoming rowcount estimate of NaN in cost_mergejoin().
Although rowcount estimates really ought not be NaN, a bug elsewhere
could perhaps result in that, and that would cause Assert failure in
cost_mergejoin, which I believe to be the explanation for bug #5977 from
Anton Kuznetsov.  Seems like a good idea to expend a couple more cycles
to prevent that, even though the real bug is elsewhere.  Not back-patching,
though, because we don't encourage running production systems with
Asserts on.
2011-04-15 17:46:50 -04:00
Heikki Linnakangas 4c37c1e3b2 Reduce the initial size of local lock hash to 16 entries.
The hash table is seq scanned at transaction end, to release all locks,
and making the hash table larger than necessary makes that slower. With
very simple queries, that overhead can amount to a few percent of the total
CPU time used.

At the moment, backend startup needs 6 locks, and a simple query with one
table and index needs 3 locks. 16 is enough for even quite complicated
transactions, and it will grow automatically if it fills up.
2011-04-15 15:07:36 +03:00
Robert Haas 0c80b57d07 Remove obsolete comment.
The lock level for adding a parent table is now ShareUpdateExclusiveLock;
see commit fbcf4b92aa.  This comment didn't
get updated to match, but it doesn't seem important to mention this detail
here, so rather than updating it now, just take it out.
2011-04-13 19:20:39 -07:00
Robert Haas 39a68e5c6c Fix toast table creation.
Instead of using slightly-too-clever heuristics to decide when we must
create a TOAST table, just check whether one is needed every time the
table is altered.  Checking whether a toast table is needed is cheap
enough that we needn't worry about doing it on every ALTER TABLE command,
and the previous coding is apparently prone to accidental breakage:
commit 04e17bae50 broken ALTER TABLE ..
SET STORAGE, which moved some actions from AT_PASS_COL_ATTRS to
AT_PASS_MISC, and commit 6c57239985 broke
ALTER TABLE .. ADD COLUMN by changing the way that adding columns
recurses into child tables.

Noah Misch, with one comment change by me
2011-04-13 18:17:52 -07:00
Tom Lane eca75a12a2 Ensure mark_dummy_rel doesn't create dangling pointers in RelOptInfos.
When we are doing GEQO join planning, the current memory context is a
short-lived context that will be reset at the end of geqo_eval().  However,
the RelOptInfos for base relations are set up before that and then re-used
across many GEQO cycles.  Hence, any code that modifies a baserel during
join planning has to be careful not to put pointers to the short-lived
context into the baserel struct.  mark_dummy_rel got this wrong, leading to
easy-to-reproduce-once-you-know-how crashes in 8.4, as reported off-list by
Leo Carson of SDSC.  Some improvements made in 9.0 make it difficult to
demonstrate the crash in 9.0 or HEAD; but there's no doubt that there's
still a risk factor here, so patch all branches that have the function.
(Note: 8.3 has a similar function, but it's only applied to joinrels and
thus is not a hazard.)
2011-04-13 18:56:40 -04:00
Robert Haas 0a49c95c73 Avoid incorrectly granting replication to roles created with NOSUPERUSER.
Andres Freund
2011-04-13 12:28:53 -07:00
Heikki Linnakangas 40e64017f3 On HP/UX, the structs used by ioctl(SIOCGLIFCONF) are named differently
than on other platforms, and only IPv6 addresses are returned. Because of
those two issues, fall back to ioctl(SIOCGIFCONF) on HP/UX, so that it at
least compiles and finds IPv4 addresses. This function is currently only
used for interpreting samehost/samenet in pg_hba.conf, which isn't that
critical.
2011-04-13 22:25:27 +03:00
Heikki Linnakangas 54685b1c2b Revert the patch to check if we've reached end-of-backup also when doing
crash recovery, and throw an error if not. hubert depesz lubaczewski pointed
out that that situation also happens in the crash recovery following a
system crash that happens during an online backup.

We might want to do something smarter in 9.1, like put the check back for
backups taken with pg_basebackup, but that's for another patch.
2011-04-13 22:05:40 +03:00
Heikki Linnakangas b5bb040da6 On IA64 architecture, we check the depth of the register stack in addition
to the regular stack. The code to do that is platform and compiler specific,
add support for the HP-UX native compiler.
2011-04-13 11:50:55 +03:00
Tom Lane d64713df7e Pass collations to functions in FunctionCallInfoData, not FmgrInfo.
Since collation is effectively an argument, not a property of the function,
FmgrInfo is really the wrong place for it; and this becomes critical in
cases where a cached FmgrInfo is used for varying purposes that might need
different collation settings.  Fix by passing it in FunctionCallInfoData
instead.  In particular this allows a clean fix for bug #5970 (record_cmp
not working).  This requires touching a bit more code than the original
method, but nobody ever thought that collations would not be an invasive
patch...
2011-04-12 19:19:24 -04:00
Tom Lane 3f5d2fe302 Be more wary of missing statistics in eqjoinsel_semi().
In particular, if we don't have real ndistinct estimates for both sides,
fall back to assuming that half of the left-hand rows have join partners.
This is what was done in 8.2 and 8.3 (cf nulltestsel() in those versions).
It's pretty stupid but it won't lead us to think that an antijoin produces
no rows out, as seen in recent example from Uwe Schroeder.
2011-04-12 01:59:34 -04:00
Tom Lane 921b993677 Fix RI_Initial_Check to use a COLLATE clause when needed in its query.
If the referencing and referenced columns have different collations,
the parser will be unable to resolve which collation to use unless it's
helped out in this way.  The effects are sometimes masked, if we end up
using a non-collation-sensitive plan; but if we do use a mergejoin
we'll see a failure, as recently noted by Robert Haas.

The SQL spec states that the referenced column's collation should be used
to resolve RI checks, so that's what we do.  Note however that we currently
don't append a COLLATE clause when writing a query that examines only the
referencing column.  If we ever support collations that have varying
notions of equality, that will have to be changed.  For the moment, though,
it's preferable to leave it off so that we can use a normal index on the
referencing column.
2011-04-11 21:32:53 -04:00
Peter Eisentraut 5caa3479c2 Clean up most -Wunused-but-set-variable warnings from gcc 4.6
This warning is new in gcc 4.6 and part of -Wall.  This patch cleans
up most of the noise, but there are some still warnings that are
trickier to remove.
2011-04-11 22:28:45 +03:00
Tom Lane 3c381a55b0 Teach pattern_fixed_prefix() about collations.
This is necessary, not optional, now that ILIKE and regexes are collation
aware --- else we might derive a wrong comparison constant for index
optimized pattern matches.
2011-04-11 12:28:28 -04:00
Heikki Linnakangas dad1f46382 TransferPredicateLocksToNewTarget should initialize a new lock
entry's commitSeqNo to that of the old one being transferred, or take
the minimum commitSeqNo if it is merging two lock entries.

Also, CreatePredicateLock should initialize commitSeqNo for to
InvalidSerCommitSeqNo instead of to 0. (I don't think using 0 would
actually affect anything, but we should be consistent.)

I also added a couple of assertions I used to track this down: a
lock's commitSeqNo should never be zero, and it should be
InvalidSerCommitSeqNo if and only if the lock is not held by
OldCommittedSxact.

Dan Ports, to fix leak of predicate locks reported by YAMAMOTO Takashi.
2011-04-11 13:46:37 +03:00
Heikki Linnakangas 7c797e7194 Fix the size of predicate lock manager's shared memory hash tables at creation.
This way they don't compete with the regular lock manager for the slack shared
memory, making the behavior more predictable.
2011-04-11 13:43:31 +03:00
Tom Lane 7aa3f1d082 Insert dummy "break"s to silence compiler complaints.
Apparently some compilers dislike a case label with nothing after it.
Per buildfarm.
2011-04-10 18:44:07 -04:00
Tom Lane 1e16a8107d Teach regular expression operators to honor collations.
This involves getting the character classification and case-folding
functions in the regex library to use the collations infrastructure.
Most of this work had been done already in connection with the upper/lower
and LIKE logic, so it was a simple matter of transposition.

While at it, split out these functions into a separate source file
regc_pg_locale.c, so that they can be correctly labeled with the Postgres
project's license rather than the Scriptics license.  These functions are
100% Postgres-written code whereas what remains in regc_locale.c is still
mostly not ours, so lumping them both under the same copyright notice was
getting more and more misleading.
2011-04-10 18:03:09 -04:00
Andrew Dunstan ed557a373c Don't make "replication" magical as a user name, only as a database name, in pg_hba.conf.
Per gripe from Josh Berkus.
2011-04-10 14:51:26 -04:00
Bruce Momjian bf50caf105 pgindent run before PG 9.1 beta 1. 2011-04-10 11:42:00 -04:00
Tom Lane 9a8b73147c Clean up overly complex code for issuing some related error messages.
The original version was unreadable, and not mechanically checkable
either.
2011-04-09 17:59:03 -04:00
Peter Eisentraut 11745364d0 Add collation support on Windows (MSVC build)
There is not yet support in initdb to populate the pg_collation
catalog, but if that is done manually, the rest should work.
2011-04-10 00:15:41 +03:00
Tom Lane 00f11f419c Fix ILIKE to honor collation when working in single-byte encodings.
The original collation patch only fixed the multi-byte code path.
This change also ensures that ILIKE's idea of the case-folding rules
is exactly the same as str_tolower's.
2011-04-09 17:12:39 -04:00
Tom Lane a19002d4e5 Adjust collation determination rules as per discussion.
Remove crude hack that tried to propagate collation through a
function-returning-record, ie, from the function's arguments to individual
fields selected from its result record.  That is just plain inconsistent,
because the function result is composite and cannot have a collation;
and there's no hope of making this kind of action-at-a-distance work
consistently.  Adjust regression test cases that expected this to happen.

Meanwhile, the behavior of casting to a domain with a declared collation
stays the same as it was, since that seemed to be the consensus.
2011-04-09 14:40:09 -04:00
Tom Lane 69f1d5fe14 Clean up minor collation issues in indxpath.c.
Get rid of bogus collation test in match_special_index_operator (even for
ILIKE, the pattern match operator's collation doesn't matter here, and even
if it did the test was testing the wrong thing).
Fix broken looping logic in expand_indexqual_rowcompare.
Add collation check in match_clause_to_ordering_op.
Make naming and argument ordering more consistent; improve comments.
2011-04-08 19:19:17 -04:00
Tom Lane 466dac8656 Fix make_greater_string to not have an undocumented collation assumption.
The previous coding worked only if ltproc->fn_collation was always either
DEFAULT_COLLATION_OID or a C-compatible locale.  While that's true at the
moment, it wasn't documented (and in fact wasn't true when this code was
committed...).  But it only takes a couple more lines to make its internal
caching behavior locale-aware, so let's do that.
2011-04-08 17:40:20 -04:00
Robert Haas cdcdfca401 Truncate the predicate lock SLRU to empty, instead of almost empty.
Otherwise, the SLRU machinery can get confused and think that the SLRU
has wrapped around.  Along the way, regardless of whether we're
truncating all of the SLRU or just some of it, flush pages after
truncating, rather than before.

Kevin Grittner
2011-04-08 16:52:19 -04:00
Tom Lane 1766a5b63a Tweak collation setup for GIN index comparison functions.
Honor index column's collation spec if there is one, don't go to the
expense of calling get_typcollation when we can reasonably assume that
all GIN storage types will use default collation, and be sure to set
a collation for the comparePartialFn too.
2011-04-08 16:48:25 -04:00
Tom Lane c5ff3ff492 Avoid an unnecessary syscache lookup in parse_coerce.c.
All the other fields of the constant are being extracted from the syscache
entry we already have, so handle collation similarly.  (There don't seem
to be any other uses for the new function at the moment.)
2011-04-08 16:11:41 -04:00
Robert Haas 0bd155cbf2 Fix bug in propagating ALTER TABLE actions to typed tables.
We need to propagate such actions to all typed table children of a
given type, not just the first one.

Noah Misch
2011-04-08 15:46:13 -04:00
Robert Haas fbc0d07796 Partially roll back overenthusiastic SSI optimization.
When a regular lock is held, SSI can use that in lieu of a predicate lock
to detect rw conflicts; but if the regular lock is being taken by a
subtransaction, we can't assume that it'll commit, so releasing the
parent transaction's lock in that case is a no-no.

Kevin Grittner
2011-04-08 15:29:02 -04:00
Robert Haas 56c7140ca8 Tweaks for SSI out-of-shared memory behavior.
If we call hash_search() with HASH_ENTER, it will bail out rather than
return NULL, so it's redundant to check for NULL again in the caller.
Thus, in cases where we believe it's impossible for the hash table to run
out of slots anyway, we can simplify the code slightly.

On the flip side, in cases where it's theoretically possible to run out of
space, we don't want to rely on dynahash.c to throw an error; instead,
we pass HASH_ENTER_NULL and throw the error ourselves if a NULL comes
back, so that we can provide a more descriptive error message.

Kevin Grittner
2011-04-07 16:43:39 -04:00
Tom Lane 73d9a90814 Modernize dlopen interface code for FreeBSD and OpenBSD.
Remove the hard-wired assumption that __mips__ (and only __mips__) lacks
dlopen in FreeBSD and OpenBSD.  This assumption is outdated at least for
OpenBSD, as per report from an anonymous 9.1 tester.  We can perfectly well
use HAVE_DLOPEN instead to decide which code to use.

Some other cosmetic adjustments to make freebsd.c, netbsd.c, and openbsd.c
exactly alike.
2011-04-07 15:14:39 -04:00
Tom Lane d8d429890d Fix collations when we call transformWhereClause from outside the parser.
Previous patches took care of assorted places that call transformExpr from
outside the main parser, but I overlooked the fact that some places use
transformWhereClause as a shortcut for transformExpr + coerce_to_boolean.
In particular this broke collation-sensitive index WHERE clauses, as per
report from Thom Brown.  Trigger WHEN and rule WHERE clauses too.

I'm not forcing initdb for this fix, but any affected indexes, triggers,
or rules will need to be dropped and recreated.
2011-04-07 02:34:57 -04:00
Tom Lane 2594cf0e8c Revise the API for GUC variable assign hooks.
The previous functions of assign hooks are now split between check hooks
and assign hooks, where the former can fail but the latter shouldn't.
Aside from being conceptually clearer, this approach exposes the
"canonicalized" form of the variable value to guc.c without having to do
an actual assignment.  And that lets us fix the problem recently noted by
Bernd Helmle that the auto-tune patch for wal_buffers resulted in bogus
log messages about "parameter "wal_buffers" cannot be changed without
restarting the server".  There may be some speed advantage too, because
this design lets hook functions avoid re-parsing variable values when
restoring a previous state after a rollback (they can store a pre-parsed
representation of the value instead).  This patch also resolves a
longstanding annoyance about custom error messages from variable assign
hooks: they should modify, not appear separately from, guc.c's own message
about "invalid parameter value".
2011-04-07 00:12:02 -04:00
Robert Haas 632f0faa7c Repair some flakiness in CheckTargetForConflictsIn.
When we release and reacquire SerializableXactHashLock, we must recheck
whether an R/W conflict still needs to be flagged, because it could have
changed under us in the meantime.  And when we release the partition
lock, we must re-walk the list of predicate locks from the beginning,
because our pointer could get invalidated under us.

Bug report #5952 by Yamamoto Takashi.  Patch by Kevin Grittner.
2011-04-05 15:17:25 -04:00
Robert Haas f5e524d92b Add casts from int4 and int8 to numeric.
Joey Adams, per gripe from Ramanujam.  Review by myself and Tom Lane.
2011-04-05 09:35:43 -04:00
Simon Riggs 88f32b7ca2 Avoid assuming there will be only 3 states for synchronous_commit.
Also avoid hardcoding the current default state by giving it the name
"on" and replace with a meaningful name that reflects its behaviour.
Coding only, no change in behaviour.
2011-04-04 23:23:13 +01:00
Robert Haas 240067b3b0 Merge synchronous_replication setting into synchronous_commit.
This means one less thing to configure when setting up synchronous
replication, and also avoids some ambiguity around what the behavior
should be when the settings of these variables conflict.

Fujii Masao, with additional hacking by me.
2011-04-04 16:25:52 -04:00
Robert Haas a0e50e698b Include pid in pg_lock_status() results even for SIREAD locks.
Dan Ports
2011-04-04 13:23:43 -04:00
Robert Haas 6c57239985 Rearrange "add column" logic to merge columns at exec time.
The previous coding set attinhcount too high in some cases, resulting in
an undumpable, undroppable column.  Per bug #5856, reported by Naoya
Anzai.  See also commit 31b6fc06d8, which
fixes a similar bug in ALTER TABLE .. ADD CONSTRAINT.

Patch by Noah Misch.
2011-04-03 21:53:32 -04:00
Robert Haas 38b27792ea Avoid possible hang during smart shutdown.
If a smart shutdown occurs just as a child is starting up, and the
child subsequently becomes a walsender, there is a race condition:
the postmaster might count the exstant backends, determine that there
is one normal backend, and wait for it to die off.  Had the walsender
transition already occurred before the postmaster counted, it would
have proceeded with the shutdown.

To fix this, have each child that transforms into a walsender kick
the postmaster just after doing so, so that the state machine is
certain to advance.

Fujii Masao
2011-04-03 19:42:00 -04:00
Magnus Hagander 5735efee15 Avoid palloc before CurrentMemoryContext is set up on win32
Instead, write the unconverted output - it will be in the wrong
encoding, but at least we don't crash.

Rushabh Lathia
2011-04-01 19:59:44 +02:00
Robert Haas 50533a6dc5 Support comments on FOREIGN DATA WRAPPER and SERVER objects.
This mostly involves making it work with the objectaddress.c framework,
which does most of the heavy lifting.  In that vein, change
GetForeignDataWrapperOidByName to get_foreign_data_wrapper_oid and
GetForeignServerOidByName to get_foreign_server_oid, to match the
pattern we use for other object types.

Robert Haas and Shigeru Hanada
2011-04-01 11:28:28 -04:00
Robert Haas 7fcc75dd26 Fix compiler warning. 2011-04-01 10:36:38 -04:00
Heikki Linnakangas 9d56886112 Fix two missing spaces in error messages.
Josh Kupershmidt
2011-04-01 14:42:38 +03:00
Heikki Linnakangas 60b142b9a6 Fix a tiny race condition in predicate locking. Need to hold the lock while
examining the head of predicate locks list. Also, fix the comment of
RemoveTargetIfNoLongerUsed, it was neglected when we changed the way update
chains are handled.

Kevin Grittner
2011-03-31 18:43:23 +03:00
Heikki Linnakangas 1f0bab8494 Improve error message when WAL ends before reaching end of online backup. 2011-03-31 10:09:49 +03:00
Andrew Dunstan 382fb6a08f Attempt to unbreak windows builds broken by commit 754baa2. 2011-03-30 16:43:31 -04:00
Heikki Linnakangas acf4740132 Check that we've reached end-of-backup also when we're not performing
archive recovery.

It's possible to restore an online backup without recovery.conf, by simply
copying all the necessary WAL files to pg_xlog. "pg_basebackup -x" does that
too. That's the use case where this cross-check is useful.

Backpatch to 9.0. We used to do this in earlier versins, but in 9.0 the code
was inadvertently changed so that the check is only performed after archive
recovery.

Fujii Masao.
2011-03-30 10:53:28 +03:00
Heikki Linnakangas 754baa21f7 Automatically terminate replication connections that are idle for more
than replication_timeout (a new GUC) milliseconds. The TCP timeout is often
too long, you want the master to notice a dead connection much sooner.
People complained about that in 9.0 too, but with synchronous replication
it's even more important to notice dead connections promptly.

Fujii Masao and Heikki Linnakangas
2011-03-30 10:20:37 +03:00
Heikki Linnakangas bc03c5937d Adjust error message, now that we expect other message types than connection
close at this point. Fix PQsetnonblocking() comment.

Fujii Masao
2011-03-30 08:54:28 +03:00
Peter Eisentraut f564e65cda Update SQL features list
Feature F692 "Extended collation support" is now also supported.  This
refers to allowing the COLLATE clause anywhere in a column or domain
definition instead of just directly after the type.

Also correct the name of the feature in accordance with the latest SQL
standard.
2011-03-29 23:23:50 +03:00
Tom Lane eb51af71f2 Prevent a rowtype from being included in itself.
Eventually we might be able to allow that, but it's not clear how many
places need to be fixed to prevent infinite recursion when there's a direct
or indirect inclusion of a rowtype in itself.  One such place is
CheckAttributeType(), which will recurse to stack overflow in cases such as
those exhibited in bug #5950 from Alex Perepelica.  If we were sure it was
the only such place, we could easily modify the code added by this patch to
stop the recursion without a complaint ... but it probably isn't the only
such place.  Hence, throw error until such time as someone is excited
enough about this type of usage to put work into making it safe.

Back-patch as far as 8.3.  8.2 doesn't have the recursive call in
CheckAttributeType in the first place, so I see no need to add code there
in the absence of clear evidence of a problem elsewhere.
2011-03-28 15:46:04 -04:00
Tom Lane d0dd5c7352 Fix check_exclusion_constraint() to insert correct collations in ScanKeys. 2011-03-27 13:29:52 -04:00
Tom Lane 7208fae18f Clean up cruft around collation initialization for tupdescs and scankeys.
I found actual bugs in GiST and plpgsql; the rest of this is cosmetic
but meant to decrease the odds of future bugs of omission.
2011-03-26 18:28:40 -04:00
Tom Lane 0c9d9e8dd6 More collations cleanup, from trawling for missed collation assignments.
Mostly cosmetic, though I did find that generateClonedIndexStmt failed
to clone the index's collations.
2011-03-26 16:35:25 -04:00
Tom Lane b23c9fa929 Clean up a few failures to set collation fields in expression nodes.
I'm not sure these have any non-cosmetic implications, but I'm not sure
they don't, either.  In particular, ensure the CaseTestExpr generated
by transformAssignmentIndirection to represent the base target column
carries the correct collation, because parse_collate.c won't fix that.
Tweak lsyscache.c API so that we can get the appropriate collation
without an extra syscache lookup.
2011-03-26 14:25:48 -04:00
Simon Riggs 92f4786fa9 Additional test for each commit in sync rep path to plug minute
possibility of race condition that would effect performance only.
Requested by Robert Haas. Re-arrange related comments.
2011-03-26 10:09:37 +00:00
Tom Lane bfa4440ca5 Pass collation to makeConst() instead of looking it up internally.
In nearly all cases, the caller already knows the correct collation, and
in a number of places, the value the caller has handy is more correct than
the default for the type would be.  (In particular, this patch makes it
significantly less likely that eval_const_expressions will result in
changing the exposed collation of an expression.)  So an internal lookup
is both expensive and wrong.
2011-03-25 20:10:42 -04:00
Tom Lane c8e993503d Fix failure to propagate collation in negate_clause().
Turns out it was this, and not so much plpgsql, that was at fault in Stefan
Huehner's collation-error-in-a-trigger bug report of a couple weeks ago.
2011-03-25 18:44:47 -04:00
Robert Haas 30f6136f28 Make walreceiver send a reply after receiving data but before flushing it.
It originally worked this way, but was changed by commit
a8a8a3e096, since which time it's been impossible
for walreceiver to ever send a reply with write_location and flush_location
set to different values.
2011-03-25 11:28:16 -04:00
Tom Lane 27dc7e240b Fix handling of collation in SQL-language functions.
Ensure that parameter symbols receive collation from the function's
resolved input collation, and fix inlining to behave properly.

BTW, this commit lays about 90% of the infrastructure needed to support
use of argument names in SQL functions.  Parsing of parameters is now
done via the parser-hook infrastructure ... we'd just need to supply
a column-ref hook ...
2011-03-24 20:30:23 -04:00
Robert Haas a432e2783b Add post-creation hook for extensions, consistent with other object types.
KaiGai Kohei
2011-03-24 18:44:49 -04:00
Tom Lane 3bba9ce945 Clean up handling of COLLATE clauses in index column definitions.
Ensure that COLLATE at the top level of an index expression is treated the
same as a grammatically separate COLLATE.  Fix bogus reverse-parsing logic
in pg_get_indexdef.
2011-03-24 15:29:52 -04:00
Simon Riggs b5f2f2a712 Minor changes to recovery pause behaviour.
Change location LOG message so it works each time we pause, not
just for final pause.
Ensure that we pause only if we are in Hot Standby and can connect
to allow us to run resume function. This change supercedes the
code to override parameter recoveryPauseAtTarget to false if not
attempting to enter Hot Standby, which is now removed.
2011-03-23 19:35:53 +00:00
Robert Haas 19584ec659 Remove synchronous_replication/max_wal_senders cross-check.
This is no longer necessary, and might result in a situation where the
configuration file is reloaded (and everything seems OK) but a subsequent
restart of the database fails.

Per an observation from Fujii Masao.
2011-03-23 11:44:27 -04:00
Simon Riggs b98ac467f5 Prevent intermittent hang in recovery from bgwriter interaction.
Startup process waited for cleanup lock but when hot_standby = off
the pid was not registered, so that the bgwriter would not wake
the waiting process as intended.
2011-03-23 13:30:05 +00:00
Simon Riggs ec497a5ad6 Make FKs valid at creation when added as column constraints.
Bug report from Alvaro Herrera
2011-03-22 23:10:35 +00:00
Tom Lane 6e197cb2e5 Improve reporting of run-time-detected indeterminate-collation errors.
pg_newlocale_from_collation does not have enough context to give an error
message that's even a little bit useful, so move the responsibility for
complaining up to its callers.  Also, reword ERRCODE_INDETERMINATE_COLLATION
error messages in a less jargony, more message-style-guide-compliant
fashion.
2011-03-22 16:55:32 -04:00
Tom Lane 37d6d07dda Throw error for indeterminate collation of an ORDER/GROUP/DISTINCT target.
This restores a parse error that was thrown (though only in the ORDER BY
case) by the original collation patch.  I had removed it in my recent
revisions because it was thrown at a place where collations now haven't
been computed yet; but I thought of another way to handle it.

Throwing the error at parse time, rather than leaving it to be done at
runtime, is good because a syntax error pointer is helpful for localizing
the problem.  We can reasonably assume that the comparison function for a
collatable datatype will complain if it doesn't have a collation to use.
Now the planner might choose to implement GROUP or DISTINCT via hashing,
in which case no runtime error would actually occur, but it seems better
to throw error consistently rather than let the error depend on what the
planner chooses to do.  Another possible objection is that the user might
specify a nondefault sort operator that doesn't care about collation
... but that's surely an uncommon usage, and it wouldn't hurt him to throw
in a COLLATE clause anyway.  This change also makes the ORDER BY/GROUP
BY/DISTINCT case more consistent with the UNION/INTERSECT/EXCEPT case,
which was already coded to throw this error even though the same objections
could be raised there.
2011-03-22 15:58:03 -04:00
Tom Lane 1192ba8b67 Avoid potential deadlock in InitCatCachePhase2().
Opening a catcache's index could require reading from that cache's own
catalog, which of course would acquire AccessShareLock on the catalog.
So the original coding here risks locking index before heap, which could
deadlock against another backend trying to get exclusive locks in the
normal order.  Because InitCatCachePhase2 is only called when a backend
has to start up without a relcache init file, the deadlock was seldom seen
in the field.  (And by the same token, there's no need to worry about any
performance disadvantage; so not much point in trying to distinguish
exactly which catalogs have the risk.)

Bug report, diagnosis, and patch by Nikhil Sontakke.  Additional commentary
by me.  Back-patch to all supported branches.
2011-03-22 13:00:48 -04:00
Tom Lane 8df08c8489 Reimplement planner's handling of MIN/MAX aggregate optimization (again).
Instead of playing cute games with pathkeys, just build a direct
representation of the intended sub-select, and feed it through
query_planner to get a Path for the index access.  This is a bit slower
than 9.1's previous method, since we'll duplicate most of the overhead of
query_planner; but since the whole optimization only applies to rather
simple single-table queries, that probably won't be much of a problem in
practice.  The advantage is that we get to do the right thing when there's
a partial index that needs the implicit IS NOT NULL clause to be usable.
Also, although this makes planagg.c be a bit more closely tied to the
ordering of operations in grouping_planner, we can get rid of some coupling
to lower-level parts of the planner.  Per complaint from Marti Raudsepp.
2011-03-22 00:34:31 -04:00
Heikki Linnakangas 6d8096e2f3 When two base backups are started at the same time with pg_basebackup,
ensure that they use different checkpoints as the starting point. We use
the checkpoint redo location as a unique identifier for the base backup in
the end-of-backup record, and in the backup history file name.

Bug spotted by Fujii Masao.
2011-03-21 11:25:25 +02:00
Tom Lane 82e4d45bd0 Suppress platform-dependent unused-variable warning.
The local variable "sock" can be unused depending on compilation flags.
But there seems no particular need for it, since the kernel calls can
just as easily say port->sock instead.
2011-03-20 13:34:31 -04:00
Tom Lane 176d5bae1d Fix up handling of C/POSIX collations.
Install just one instance of the "C" and "POSIX" collations into
pg_collation, rather than one per encoding.  Make these instances exist
and do something useful even in machines without locale_t support: to wit,
it's now possible to force comparisons and case-folding functions to use C
locale in an otherwise non-C database, whether or not the platform has
support for using any additional collations.

Fix up severely broken upper/lower/initcap functions, too: the C/POSIX
fastpath now does what it is supposed to, and non-default collations are
handled correctly in single-byte database encodings.

Merge the two separate collation hashtables that were being maintained in
pg_locale.c, and be more wary of the possibility that we fail partway
through filling a cache entry.
2011-03-20 12:44:13 -04:00
Tom Lane b310b6e31c Revise collation derivation method and expression-tree representation.
All expression nodes now have an explicit output-collation field, unless
they are known to only return a noncollatable data type (such as boolean
or record).  Also, nodes that can invoke collation-aware functions store
a separate field that is the collation value to pass to the function.
This avoids confusion that arises when a function has collatable inputs
and noncollatable output type, or vice versa.

Also, replace the parser's on-the-fly collation assignment method with
a post-pass over the completed expression tree.  This allows us to use
a more complex (and hopefully more nearly spec-compliant) assignment
rule without paying for it in extra storage in every expression node.

Fix assorted bugs in the planner's handling of collations by making
collation one of the defining properties of an EquivalenceClass and
by converting CollateExprs into discardable RelabelType nodes during
expression preprocessing.
2011-03-19 20:30:08 -04:00
Magnus Hagander 6f9192df61 Rename ident authentication over local connections to peer
This removes an overloading of two authentication options where
one is very secure (peer) and one is often insecure (ident). Peer
is also the name used in libpq from 9.1 to specify the same type
of authentication.

Also make initdb select peer for local connections when ident is
chosen, and ident for TCP connections when peer is chosen.

ident keyword in pg_hba.conf is still accepted and maps to peer
authentication.
2011-03-19 18:44:35 +01:00
Robert Haas fbcf4b92aa Fix possible "tuple concurrently updated" error in ALTER TABLE.
When adding an inheritance parent to a table, an AccessShareLock on the
parent isn't strong enough to prevent trouble, so take
ShareUpdateExclusiveLock instead.  Since this is a behavior change,
albeit a fairly unobtrusive one, and since we have only one report
from the field, no back-patch.

Report by Jon Nelson, analysis by Alvaro Herrera, fix by me.
2011-03-18 22:09:57 -04:00
Robert Haas 727589995a Move synchronous_standbys_defined updates from WAL writer to BG writer.
This is advantageous because the BG writer is alive until much later in
the shutdown sequence than WAL writer; we want to make sure that it's
possible to shut off synchronous replication during a smart shutdown,
else it might not be possible to complete the shutdown at all.

Per very reasonable gripes from Fujii Masao and Simon Riggs.
2011-03-18 21:43:45 -04:00
Robert Haas 7a37900443 Make synchronous replication query cancel/die messages more consistent.
Per a gripe from Thom Brown about my previous commit in this area,
commit 9a56dc3389.
2011-03-18 10:20:22 -04:00
Robert Haas 777e8c0015 Remove bogus semicolons in recoveryPausesHere.
Without this, the startup process goes into a tight loop, consuming
100% of one CPU and failing to respond to interrupts.
2011-03-18 08:09:09 -04:00
Robert Haas 02b1f84e7d Remove bogus comment. 2011-03-17 14:43:57 -04:00
Peter Eisentraut 8c0a5eb78a Raise maximum value of several timeout parameters
The maximum value of deadlock_timeout, max_standby_archive_delay,
max_standby_streaming_delay, log_min_duration_statement, and
log_autovacuum_min_duration was INT_MAX/1000 milliseconds, which is
about 35min, which is too short for some practical uses.  Raise the
maximum value to INT_MAX; the code that uses the parameters already
supports that just fine.
2011-03-17 20:19:51 +02:00
Robert Haas 84abea76f6 Add pause_at_recovery_target to recovery.conf.sample; improve docs.
Fujii Masao, but with the proposed behavior change reverted, and the
rest adjusted accordingly.
2011-03-17 14:04:11 -04:00
Robert Haas 9a56dc3389 Fix various possible problems with synchronous replication.
1. Don't ignore query cancel interrupts.  Instead, if the user asks to
cancel the query after we've already committed it, but before it's on
the standby, just emit a warning and let the COMMIT finish.

2. Don't ignore die interrupts (pg_terminate_backend or fast shutdown).
Instead, emit a warning message and close the connection without
acknowledging the commit.  Other backends will still see the effect of
the commit, but there's no getting around that; it's too late to abort
at this point, and ignoring die interrupts altogether doesn't seem like
a good idea.

3. If synchronous_standby_names becomes empty, wake up all backends
waiting for synchronous replication to complete.  Without this, someone
attempting to shut synchronous replication off could easily wedge the
entire system instead.

4. Avoid depending on the assumption that if a walsender updates
MyProc->syncRepState, we'll see the change even if we read it without
holding the lock.  The window for this appears to be quite narrow (and
probably doesn't exist at all on machines with strong memory ordering)
but protecting against it is practically free, so do that.

5. Remove useless state SYNC_REP_MUST_DISCONNECT, which isn't needed and
doesn't actually do anything.

There's still some further work needed here to make the behavior of fast
shutdown plausible, but that looks complex, so I'm leaving it for a
separate commit.  Review by Fujii Masao.
2011-03-17 13:12:21 -04:00
Tom Lane 72cfc17aef Improve handling of unknown-type literals in UNION/INTERSECT/EXCEPT.
This patch causes unknown-type Consts to be coerced to the resolved output
type of the set operation at parse time.  Formerly such Consts were left
alone until late in the planning stage.  The disadvantage of that approach
is that it disables some optimizations, because the planner sees the set-op
leaf query as having different output column types than the overall set-op.
We saw an example of that in a recent performance gripe from Claudio
Freire.

Fixing such a Const requires scribbling on the leaf query in
transformSetOperationTree, but that should be all right since if the leaf
query's semantics depended on that output column, it would already have
resolved the unknown to something else.

Most of the bulk of this patch is a simple adjustment of
transformSetOperationTree's API so that upper levels can get at the
TargetEntry containing a Const to be replaced: it now returns a list of
TargetEntries, instead of just the bare expressions.
2011-03-15 21:52:10 -04:00
Robert Haas 5ca4dfc79f Remove 13 keywords that are used only for ROLE options.
Review by Tom Lane.
2011-03-15 10:22:58 -04:00
Tom Lane a2eb9e0c08 Simplify list traversal logic in add_path().
Its mechanism for recovering after deleting the current list cell was
a bit klugy.  Borrow the technique used in other places.
2011-03-13 12:57:14 -04:00
Tom Lane 696d1f7f06 Make all comparisons done for/with statistics use the default collation.
While this will give wrong answers when estimating selectivity for a
comparison operator that's using a non-default collation, the estimation
error probably won't be large; and anyway the former approach created
estimation errors of its own by trying to use a histogram that might have
been computed with some other collation.  So we'll adopt this simplified
approach for now and perhaps improve it sometime in the future.

This patch incorporates changes from Andres Freund to make sure that
selfuncs.c passes a valid collation OID to any datatype-specific function
it calls, in case that function wants collation information.  Said OID will
now always be DEFAULT_COLLATION_OID, but at least we won't get errors.
2011-03-12 16:30:36 -05:00
Bruce Momjian 94fe9c0f4e Use "backend process" rather than "backend server", where appropriate. 2011-03-12 09:38:56 -05:00
Bruce Momjian 3a3f39fdc0 Use macros for time-based constants, rather than constants. 2011-03-12 09:35:56 -05:00
Tom Lane 2a26639a5d On further reflection, we'd better do the same in int.c.
We previously heard of the same problem in int24div(), so there's not a
good reason to suppose the problem is confined to cases involving int8.
2011-03-11 19:04:02 -05:00
Tom Lane 72330995a5 Put in some more safeguards against executing a division-by-zero.
Add dummy returns before every potential division-by-zero in int8.c,
because apparently further "improvements" in gcc's optimizer have
enabled it to break functions that weren't broken before.

Aurelien Jarno, via Martin Pitt
2011-03-11 18:18:55 -05:00
Tom Lane 8acdb8bf9c Split CollateClause into separate raw and analyzed node types.
CollateClause is now used only in raw grammar output, and CollateExpr after
parse analysis.  This is for clarity and to avoid carrying collation names
in post-analysis parse trees: that's both wasteful and possibly misleading,
since the collation's name could be changed while the parsetree still
exists.

Also, clean up assorted infelicities and omissions in processing of the
node type.
2011-03-11 16:28:18 -05:00
Tom Lane e3c732a85c Create an explicit concept of collations that work for any encoding.
Use collencoding = -1 to represent such a collation in pg_collation.
We need this to make the "default" entry work sanely, and a later
patch will fix the C/POSIX entries to be represented this way instead
of duplicating them across all encodings.  All lookup operations now
search first for an entry that's database-encoding-specific, and then
for the same name with collencoding = -1.

Also some incidental code cleanup in collationcmds.c and pg_collation.c.
2011-03-11 13:20:11 -05:00
Bruce Momjian 5ca543fb2e Clarify C comment that O_SYNC/O_FSYNC are really the same settting, as
opposed to O_DSYNC.
2011-03-10 20:02:52 -05:00
Tom Lane 7564654adf Revert addition of third argument to format_type().
Including collation in the behavior of that function promotes a world view
we do not want.  Moreover, it was producing the wrong behavior for pg_dump
anyway: what we want is to dump a COLLATE clause on attributes whose
attcollation is different from the underlying type, and likewise for
domains, and the function cannot do that for us.  Doing it the hard way
in pg_dump is a bit more tedious but produces more correct output.

In passing, fix initdb so that the initial entry in pg_collation is
properly pinned.  It was droppable before :-(
2011-03-10 17:30:46 -05:00
Robert Haas 551c07d84a Make error handling of synchronous_standby_names consistent.
It's not a good idea to kill the postmaster just because someone muffs
this, and it's not consistent with what we do for other, similar GUCs.

Fujii Masao, with a bit more hacking by me
2011-03-10 16:24:52 -05:00
Robert Haas 2e019c8611 More synchronous replication typo fixes.
Fujii Masao
2011-03-10 15:56:18 -05:00
Robert Haas b8bb8dbf20 More synchronous replication tweaks.
SyncRepRequested() must check not only the value of the
synchronous_replication GUC but also whether max_wal_senders > 0.
Otherwise, we might end up waiting for sync rep even when there's no
possibility of a standby ever managing to connect.  There are some
existing cross-checks to prevent this, but they're not quite sufficient:
the user can start the server with max_wal_senders=0,
synchronous_standby_names='', and synchronous_replication=off and then
subsequent make synchronous_standby_names not empty using pg_ctl reload,
and then SET synchronous_standby=on, leading to an indefinite hang.

Along the way, rename the global variable for the synchronous_replication
GUC to match the name of the GUC itself, for clarity.

Report by Fujii Masao, though I didn't use his patch.
2011-03-10 15:43:37 -05:00
Robert Haas 6436098795 Minor sync rep corrections.
Fujii Masao, with a bit of additional wordsmithing by me.
2011-03-10 14:57:02 -05:00
Robert Haas d16e290a8a Emit a LOG message when pausing at the recovery target.
Fujii Masao
2011-03-10 14:37:14 -05:00
Robert Haas fcb99609b6 Replication README updates.
Fujii Masao
2011-03-10 08:59:59 -05:00
Itagaki Takahiro 2d8de0a50b Cleanup copyright years and file names in the header comments of some files. 2011-03-10 15:05:33 +09:00
Tom Lane f6587019ed replication/repl_gram.h needs to be cleaned too ... 2011-03-10 00:12:38 -05:00
Tom Lane 174f65ab00 Fix some oversights in distprep and maintainer-clean targets.
At least two recent commits have apparently imagined that a comment in
a Makefile stating that something would be included in the distribution
tarball was sufficient to make it so.  They hadn't bothered to hook
into the upper maintainer-clean targets either.  Per bug #5923 from
Charles Johnson, in which it emerged that the 9.1alpha4 tarballs are
short a few files that should be there.
2011-03-10 00:04:05 -05:00
Bruce Momjian 76fdee31c4 Mention gcc version in C comment. 2011-03-09 23:41:13 -05:00
Tom Lane a051ef699c Remove collation information from TypeName, where it does not belong.
The initial collations patch treated a COLLATE spec as part of a TypeName,
following what can only be described as brain fade on the part of the SQL
committee.  It's a lot more reasonable to treat COLLATE as a syntactically
separate object, so that it can be added in only the productions where it
actually belongs, rather than needing to reject it in a boatload of places
where it doesn't belong (something the original patch mostly failed to do).
In addition this change lets us meet the spec's requirement to allow
COLLATE anywhere in the clauses of a ColumnDef, and it avoids unfriendly
behavior for constructs such as "foo::type COLLATE collation".

To do this, pull collation information out of TypeName and put it in
ColumnDef instead, thus reverting most of the collation-related changes in
parse_type.c's API.  I made one additional structural change, which was to
use a ColumnDef as an intermediate node in AT_AlterColumnType AlterTableCmd
nodes.  This provides enough room to get rid of the "transform" wart in
AlterTableCmd too, since the ColumnDef can carry the USING expression
easily enough.

Also fix some other minor bugs that have crept in in the same areas,
like failure to copy recently-added fields of ColumnDef in copyfuncs.c.

While at it, document the formerly secret ability to specify a collation
in ALTER TABLE ALTER COLUMN TYPE, ALTER TYPE ADD ATTRIBUTE, and
ALTER TYPE ALTER ATTRIBUTE TYPE; and correct some misstatements about
what the default collation selection will be when COLLATE is omitted.

BTW, the three-parameter form of format_type() should go away too,
since it just contributes to the confusion in this area; but I'll do
that in a separate patch.
2011-03-09 22:39:20 -05:00
Tom Lane 49a08ca1e9 Adjust the permissions required for COMMENT ON ROLE.
Formerly, any member of a role could change the role's comment, as of
course could superusers; but holders of CREATEROLE privilege could not,
unless they were also members.  This led to the odd situation that a
CREATEROLE holder could create a role but then could not comment on it.
It also seems a bit dubious to let an unprivileged user change his own
comment, let alone those of group roles he belongs to.  So, change the
rule to be "you must be superuser to comment on a superuser role, or
hold CREATEROLE to comment on non-superuser roles".  This is the same
as the privilege check for creating/dropping roles, and thus fits much
better with the rule for other object types, namely that only the owner
of an object can comment on it.

In passing, clean up the documentation for COMMENT a little bit.

Per complaint from Owen Jacobson and subsequent discussion.
2011-03-09 11:28:34 -05:00
Tom Lane 3f7d24da16 Add missing keywords to gram.y's unreserved_keywords list.
We really need an automated check for this ... and did VALIDATE really
need to become a keyword at all, rather than picking some other syntax
using existing keywords?
2011-03-08 16:43:56 -05:00
Heikki Linnakangas 46c333a963 Fix overly strict assertion in SummarizeOldestCommittedSxact(). There's a
race condition where SummarizeOldestCommittedSxact() is called even though
another backend already cleared out all finished sxact entries. That's OK,
RegisterSerializableTransactionInt() can just retry getting a news xact
slot from the available-list when that happens.

Reported by YAMAMOTO Takashi, bug #5918.
2011-03-08 21:06:26 +02:00
Heikki Linnakangas 93d888232e Don't throw a warning if vacuum sees PD_ALL_VISIBLE flag set on a page that
contains newly-inserted tuples that according to our OldestXmin are not
yet visible to everyone. The value returned by GetOldestXmin() is conservative,
and it can move backwards on repeated calls, so if we see that contradiction
between the PD_ALL_VISIBLE flag and status of tuples on the page, we have to
assume it's because an earlier vacuum calculated a higher OldestXmin value,
and all the tuples really are visible to everyone.

We have received several reports of this bug, with the "PD_ALL_VISIBLE flag
was incorrectly set in relation ..." warning appearing in logs. We were
finally able to hunt it down with David Gould's help to run extra diagnostics
in an environment where this happened frequently.

Also reword the warning, per Robert Haas' suggestion, to not imply that the
PD_ALL_VISIBLE flag is necessarily at fault, as it might also be a symptom
of corruption on a tuple header.

Backpatch to 8.4, where the PD_ALL_VISIBLE flag was introduced.
2011-03-08 20:30:53 +02:00
Heikki Linnakangas 4cd3fb6e12 Truncate predicate lock manager's SLRU lazily at checkpoint. That's safer
than doing it aggressively whenever the tail-XID pointer is advanced, because
this way we don't need to do it while holding SerializableXactHashLock.

This also fixes bug #5915 spotted by YAMAMOTO Takashi, and removes an
obsolete comment spotted by Kevin Grittner.
2011-03-08 12:12:54 +02:00
Heikki Linnakangas 1a4ab9ec23 If recovery_target_timeline is set to 'latest' and standby mode is enabled,
periodically rescan the archive for new timelines, while waiting for new WAL
segments to arrive. This allows you to set up a standby server that follows
the TLI change if another standby server is promoted to master. Before this,
you had to restart the standby server to make it notice the new timeline.

This patch only scans the archive for TLI changes, it won't follow a TLI
change in streaming replication. That is much needed too, but it would be a
much bigger patch than I dare to sneak in this late in the release cycle.

There was discussion on improving the sanity checking of the WAL segments so
that the system would notice more reliably if the new timeline isn't an
ancestor of the current one, but that is not included in this patch.

Reviewed by Fujii Masao.
2011-03-07 21:14:47 +02:00
Tom Lane 7193a90fc1 Zero out vacuum_count and related counters in pgstat_recv_tabstat().
This fixes an oversight in commit 946045f04d
of 2010-08-21, as reported by Itagaki Takahiro.  Also a couple of minor
cosmetic adjustments.
2011-03-07 11:17:47 -05:00
Heikki Linnakangas 97e3dacd84 Begin error message with lower-case letter. 2011-03-07 10:41:13 +02:00
Heikki Linnakangas baabf05196 Silence compiler warning about undefined function when compiling without
assertions.
2011-03-07 09:56:53 +02:00
Simon Riggs cae4974e3d Dynamic array required within pg_stat_replication. 2011-03-07 00:26:30 +00:00
Simon Riggs 966fb05b58 Add new files for syncrep missed in previous commit 2011-03-06 23:39:14 +00:00
Simon Riggs a8a8a3e096 Efficient transaction-controlled synchronous replication.
If a standby is broadcasting reply messages and we have named
one or more standbys in synchronous_standby_names then allow
users who set synchronous_replication to wait for commit, which
then provides strict data integrity guarantees. Design avoids
sending and receiving transaction state information so minimises
bookkeeping overheads. We synchronize with the highest priority
standby that is connected and ready to synchronize. Other standbys
can be defined to takeover in case of standby failure.

This version has very strict behaviour; more relaxed options
may be added at a later date.

Simon Riggs and Fujii Masao, with reviews by Yeb Havinga, Jaime
Casanova, Heikki Linnakangas and Robert Haas, plus the assistance
of many other design reviewers.
2011-03-06 22:49:16 +00:00
Tom Lane 149b2673c2 Fix incorrect access to pg_index.indcollation.
Since this field is after a variable-length field, it can't simply be
accessed via the C struct for pg_index.  Fortunately, the relcache already
did the dirty work of pulling the information out to where it can be
accessed easily, so this is a one-line fix.

Andres Freund
2011-03-06 12:10:50 -05:00
Peter Eisentraut 9650364b7b Update of SQL feature conformance 2011-03-05 17:03:21 +02:00
Tom Lane 63b656b7bf Create extension infrastructure for the core procedural languages.
This mostly just involves creating control, install, and
update-from-unpackaged scripts for them.  However, I had to adjust plperl
and plpython to not share the same support functions between variants,
because we can't put the same function into multiple extensions.

catversion bump forced due to new contents of pg_pltemplate, and because
initdb now installs plpgsql as an extension not a bare language.

Add support for regression testing these as extensions not bare
languages.

Fix a couple of other issues that popped up while testing this: my initial
hack at pg_dump binary-upgrade support didn't work right, and we don't want
an extra schema permissions test after all.

Documentation changes still to come, but I'm committing now to see
whether the MSVC build scripts need work (likely they do).
2011-03-04 21:51:14 -05:00
Robert Haas efa415da8c Refactor seclabel.c to use the new check_object_ownership function.
This avoids duplicate (and not-quite-matching) code, and makes the logic
for SECURITY LABEL match COMMENT and ALTER EXTENSION ADD/DROP.
2011-03-04 17:26:37 -05:00
Peter Eisentraut b9cff97fdf Don't allow CREATE TABLE AS to create a column with invalid collation
It is possible that an expression ends up with a collatable type but
without a collation.  CREATE TABLE AS could then create a table based
on that.  But such a column cannot be dumped with valid SQL syntax, so
we disallow creating such a column.

per test report from Noah Misch
2011-03-04 23:42:07 +02:00
Tom Lane 8d3b421f5f Allow non-superusers to create (some) extensions.
Remove the unconditional superuser permissions check in CREATE EXTENSION,
and instead define a "superuser" extension property, which when false
(not the default) skips the superuser permissions check.  In this case
the calling user only needs enough permissions to execute the commands
in the extension's installation script.  The superuser property is also
enforced in the same way for ALTER EXTENSION UPDATE cases.

In other ALTER EXTENSION cases and DROP EXTENSION, test ownership of
the extension rather than superuserness.  ALTER EXTENSION ADD/DROP needs
to insist on ownership of the target object as well; to do that without
duplicating code, refactor comment.c's big switch for permissions checks
into a separate function in objectaddress.c.

I also removed the superuserness checks in pg_available_extensions and
related functions; there's no strong reason why everybody shouldn't
be able to see that info.

Also invent an IF NOT EXISTS variant of CREATE EXTENSION, and use that
in pg_dump, so that dumps won't fail for installed-by-default extensions.
We don't have any of those yet, but we will soon.

This is all per discussion of wrapping the standard procedural languages
into extensions.  I'll make those changes in a separate commit; this is
just putting the core infrastructure in place.
2011-03-04 16:08:53 -05:00
Peter Eisentraut 4442e1975d When creating a collation, check that the locales can be loaded
This is the same check that would happen later when the collation is
used, but it's friendlier to check the collation already when it is
created.
2011-03-04 22:14:37 +02:00
Heikki Linnakangas ee3838b1d3 You must hold a lock on the heap page when you call
CheckForSerializableConflictOut(), because it can set hint bits.

YAMAMOTO Takashi
2011-03-04 15:43:11 +02:00
Tom Lane 6252c4f9e2 Run a portal's cleanup hook immediately when pushing it to DONE state.
This works around the problem noted by Yamamoto Takashi in bug #5906,
that there were code paths whereby we could reach AtCleanup_Portals
with a portal's cleanup hook still unexecuted.  The changes I made
a few days ago were intended to prevent that from happening, and
I think that on balance it's still a good thing to avoid, so I don't
want to remove the Assert in AtCleanup_Portals.  Hence do this instead.
2011-03-03 13:04:06 -05:00
Peter Eisentraut 091bda0188 Add collations to information_schema.usage_privileges
This is faked information like for domains.
2011-03-02 23:17:56 +02:00
Heikki Linnakangas 6eba5a7c57 Change pg_last_xlog_receive_location() not to move backwards. That makes
it a lot more useful for determining which standby is most up-to-date,
for example. There was long discussions on whether overwriting existing
existing WAL makes sense to begin with, and whether we should do some more
extensive variable renaming, but this change nevertheless seems quite
uncontroversial.

Fujii Masao, reviewed by Jeff Janes, Robert Haas, Stephen Frost.
2011-03-01 20:54:35 +02:00
Heikki Linnakangas 47ad79122b Fix bugs in Serializable Snapshot Isolation.
Change the way UPDATEs are handled. Instead of maintaining a chain of
tuple-level locks in shared memory, copy any existing locks on the old
tuple to the new tuple at UPDATE. Any existing page-level lock needs to
be duplicated too, as a lock on the new tuple. That was neglected
previously.

Store xmin on tuple-level predicate locks, to distinguish a lock on an old
already-recycled tuple from a new tuple at the same physical location.
Failure to distinguish them caused loops in the tuple-lock chains, as
reported by YAMAMOTO Takashi. Although we don't use the chain representation
of UPDATEs anymore, it seems like a good idea to store the xmin to avoid
some false positives if no other reason.

CheckSingleTargetForConflictsIn now correctly handles the case where a lock
that's being held is not reflected in the local lock table. That happens
if another backend acquires a lock on our behalf due to an UPDATE or a page
split.

PredicateLockPageCombine now retains locks for the page that is being
removed, rather than removing them. This prevents a potentially dangerous
false-positive inconsistency where the local lock table believes that a lock
is held, but it is actually not.

Dan Ports and Kevin Grittner
2011-03-01 19:05:16 +02:00
Tom Lane 97c4ee94ad Include the target table in EXPLAIN output for ModifyTable nodes.
Per discussion, this seems important for plans involving writable CTEs,
since there can now be more than one ModifyTable node in the plan.

To retain the same formatting as for target tables of scan nodes, we
show only one target table, which will be the parent table in case of
an UPDATE or DELETE on an inheritance tree.  Individual child tables
can be determined by inspecting the child plan trees if needed.
2011-03-01 11:37:01 -05:00
Robert Haas 59d6a75942 Avoid excessive Hot Standby feedback messages.
Without this patch, when wal_receiver_status_interval=0, indicating that no
status messages should be sent, Hot Standby feedback messages are instead sent
extremely frequently.

Fujii Masao, with documentation changes by me.
2011-03-01 11:34:25 -05:00
Tom Lane c0b0076036 Rearrange snapshot handling to make rule expansion more consistent.
With this patch, portals, SQL functions, and SPI all agree that there
should be only a CommandCounterIncrement between the queries that are
generated from a single SQL command by rule expansion.  Fetching a whole
new snapshot now happens only between original queries.  This is equivalent
to the existing behavior of EXPLAIN ANALYZE, and it was judged to be the
best choice since it eliminates one source of concurrency hazards for
rules.  The patch should also make things marginally faster by reducing the
number of snapshot push/pop operations.

The patch removes pg_parse_and_rewrite(), which is no longer used anywhere.
There was considerable discussion about more aggressive refactoring of the
query-processing functions exported by postgres.c, but for the moment
nothing more has been done there.

I also took the opportunity to refactor snapmgr.c's API slightly: the
former PushUpdatedSnapshot() has been split into two functions.

Marko Tiikkaja, reviewed by Steve Singer and Tom Lane
2011-02-28 23:28:06 -05:00
Robert Haas 92c30fd2ed Rename pg_stat_replication.apply_location to replay_location.
For consistency with pg_last_xlog_replay_location.  Per discussion.
2011-02-28 12:49:57 -05:00
Tom Lane a874fe7b4c Refactor the executor's API to support data-modifying CTEs better.
The originally committed patch for modifying CTEs didn't interact well
with EXPLAIN, as noted by myself, and also had corner-case problems with
triggers, as noted by Dean Rasheed.  Those problems show it is really not
practical for ExecutorEnd to call any user-defined code; so split the
cleanup duties out into a new function ExecutorFinish, which must be called
between the last ExecutorRun call and ExecutorEnd.  Some Asserts have been
added to these functions to help verify correct usage.

It is no longer necessary for callers of the executor to call
AfterTriggerBeginQuery/AfterTriggerEndQuery for themselves, as this is now
done by ExecutorStart/ExecutorFinish respectively.  If you really need to
suppress that and do it for yourself, pass EXEC_FLAG_SKIP_TRIGGERS to
ExecutorStart.

Also, refactor portal commit processing to allow for the possibility that
PortalDrop will invoke user-defined code.  I think this is not actually
necessary just yet, since the portal-execution-strategy logic forces any
non-pure-SELECT query to be run to completion before we will consider
committing.  But it seems like good future-proofing.
2011-02-27 13:44:12 -05:00
Bruce Momjian 67a5e727c8 Be less detailed about reporting shared memory failure by avoiding the
output of actual Postgres parameter _values_ related to shared memory,
and suggesting that these are only possible parameters to reduce.
2011-02-27 12:21:58 -05:00
Heikki Linnakangas be6668d6ef Increase the default for wal_sender_delay from 200ms to 1s. Now that WAL
sender is immediately woken up by transaction commit, there's no need to
wake up so aggressively.
2011-02-26 23:38:25 +02:00
Tom Lane 000128bc7f Fix order of shutdown processing when CTEs contain inter-references.
We need ExecutorEnd to run the ModifyTable nodes to completion in
reverse order of initialization, not forward order.  Easily done
by constructing the list back-to-front.
2011-02-25 23:53:34 -05:00
Tom Lane 389af95155 Support data-modifying commands (INSERT/UPDATE/DELETE) in WITH.
This patch implements data-modifying WITH queries according to the
semantics that the updates all happen with the same command counter value,
and in an unspecified order.  Therefore one WITH clause can't see the
effects of another, nor can the outer query see the effects other than
through the RETURNING values.  And attempts to do conflicting updates will
have unpredictable results.  We'll need to document all that.

This commit just fixes the code; documentation updates are waiting on
author.

Marko Tiikkaja and Hitoshi Harada
2011-02-25 18:58:02 -05:00
Robert Haas 79ad8fc5f8 Named restore point improvements.
Emit a log message when creating a named restore point, and improve
documentation for pg_create_restore_point().

Euler Taveira de Oliveira, 	per suggestions from Thom Brown, with some
additional wordsmithing by me.
2011-02-24 19:02:00 -05:00
Tom Lane bdca82f44d Add a relkind field to RangeTblEntry to avoid some syscache lookups.
The recent additions for FDW support required checking foreign-table-ness
in several places in the parse/plan chain.  While it's not clear whether
that would really result in a noticeable slowdown, it seems best to avoid
any performance risk by keeping a copy of the relation's relkind in
RangeTblEntry.  That might have some other uses later, anyway.
Per discussion.
2011-02-22 19:24:40 -05:00
Peter Eisentraut 1c51c7d5ff Add PL/Python functions for quoting strings
Add functions plpy.quote_ident, plpy.quote_literal,
plpy.quote_nullable, which wrap the equivalent SQL functions.

To be able to propagate char * constness properly, make the argument
of quote_literal_cstr() const char *.  This also makes it more
consistent with quote_identifier().

Jan Urbański, reviewed by Hitoshi Harada, some refinements by Peter
Eisentraut
2011-02-22 23:41:23 +02:00
Robert Haas 3e6b305d9e Fix a couple of unlogged tables goofs.
"SELECT ... INTO UNLOGGED tabname" works, but wasn't documented; CREATE
UNLOGGED SEQUENCE and CREATE UNLOGGED VIEW failed an assertion, instead
of throwing a sensible error.

Latter issue reported by Itagaki Takahiro; patch review by Tom Lane.
2011-02-22 14:46:19 -05:00
Tom Lane 1ab9b012bd Allow binary I/O of type "void".
void_send is useful for the same reason that void_out doesn't throw error,
namely that someone might do "select void_returning_func(...)"  from a
client that prefers to operate in binary mode.  The void_recv function may
or may not have any practical use, but we provide it for symmetry.

Radosław Smogura
2011-02-22 13:08:22 -05:00
Tom Lane 2e852e541c Remove ExecRemoveJunk(), which is no longer used anywhere.
This was a leftover from the pre-8.1 design of junkfilters.  It doesn't
seem to have any reason to live, since it's merely a combination of two
easy function calls, and not a well-designed combination at that (it
encourages callers to leak the result tuple).
2011-02-21 21:41:08 -05:00
Tom Lane a210be7720 Fix dangling-pointer problem in before-row update trigger processing.
ExecUpdate checked for whether ExecBRUpdateTriggers had returned a new
tuple value by seeing if the returned tuple was pointer-equal to the old
one.  But the "old one" was in estate->es_junkFilter's result slot, which
would be scribbled on if we had done an EvalPlanQual update in response to
a concurrent update of the target tuple; therefore we were comparing a
dangling pointer to a live one.  Given the right set of circumstances we
could get a false match, resulting in not forcing the tuple to be stored in
the slot we thought it was stored in.  In the case reported by Maxim Boguk
in bug #5798, this led to "cannot extract system attribute from virtual
tuple" failures when trying to do "RETURNING ctid".  I believe there is a
very-low-probability chance of more serious errors, such as generating
incorrect index entries based on the original rather than the
trigger-modified version of the row.

In HEAD, change all of ExecBRInsertTriggers, ExecIRInsertTriggers,
ExecBRUpdateTriggers, and ExecIRUpdateTriggers so that they continue to
have similar APIs.  In the back branches I just changed
ExecBRUpdateTriggers, since there is no bug in the ExecBRInsertTriggers
case.
2011-02-21 21:19:50 -05:00
Itagaki Takahiro ca9cf85d54 Fix pg_server_to_client, that was broken in the previous commit. 2011-02-21 16:27:57 +09:00
Itagaki Takahiro 3cba8240a1 Add ENCODING option to COPY TO/FROM and file_fdw.
File encodings can be specified separately from client encoding.
If not specified, client encoding is used for backward compatibility.

Cases when the encoding doesn't match client encoding are slower
than matched cases because we don't have conversion procs for other
encodings. Performance improvement would be be a future work.

Original patch by Hitoshi Harada, and modified by me.
2011-02-21 14:32:40 +09:00
Tom Lane 7c5d0ae707 Add contrib/file_fdw foreign-data wrapper for reading files via COPY.
This is both very useful in its own right, and an important test case
for the core FDW support.

This commit includes a small refactoring of copy.c to expose its option
checking code as a separately callable function.  The original patch
submission duplicated hundreds of lines of that code, which seemed pretty
unmaintainable.

Shigeru Hanada, reviewed by Itagaki Takahiro and Tom Lane
2011-02-20 14:06:59 -05:00
Tom Lane bb74240794 Implement an API to let foreign-data wrappers actually be functional.
This commit provides the core code and documentation needed.  A contrib
module test case will follow shortly.

Shigeru Hanada, Jan Urbanski, Heikki Linnakangas
2011-02-20 00:18:14 -05:00
Tom Lane 327e025071 Create the catalog infrastructure for foreign-data-wrapper handlers.
Add a fdwhandler column to pg_foreign_data_wrapper, plus HANDLER options
in the CREATE FOREIGN DATA WRAPPER and ALTER FOREIGN DATA WRAPPER commands,
plus pg_dump support for same.  Also invent a new pseudotype fdw_handler
with properties similar to language_handler.

This is split out of the "FDW API" patch for ease of review; it's all stuff
we will certainly need, regardless of any other details of the FDW API.
FDW handler functions will not actually get called yet.

In passing, fix some omissions and infelicities in foreigncmds.c.

Shigeru Hanada, Jan Urbanski, Heikki Linnakangas
2011-02-19 00:07:15 -05:00
Tom Lane 82220e8832 Un-break building with BTREE_BUILD_STATS.
This has been broken for awhile, but not clear it's worth back-patching.

Euler Taveira de Oliveira
2011-02-18 14:06:16 -05:00
Simon Riggs bc76695c4c Make a hard state change from catchup to streaming mode.
More useful state change for monitoring purposes, plus a
required change for synchronous replication patch.
2011-02-18 15:07:26 +00:00
Simon Riggs 06828c5feb Separate messages for standby replies and hot standby feedback.
Allow messages to be sent at different times, and greatly reduce
the frequency of hot standby feedback. Refactor to allow additional
message types.
2011-02-18 11:31:49 +00:00
Magnus Hagander 45a6d79b17 Properly initialize variables
Kevin Grittner
2011-02-18 11:59:57 +01:00
Michael Meskes bc423879cc Applied a patch by Zoltán Böszörményi that makes ecpg's parser accept dynamic cursornames even in WHERE CURRENT OF clauses. 2011-02-18 11:16:16 +01:00
Itagaki Takahiro 5c63982af2 Fix an uninitialized field in DR_copy.
Shigeru HANADA
2011-02-18 14:32:19 +09:00
Itagaki Takahiro 62c7bd31c8 Add transaction-level advisory locks.
They share the same locking namespace with the existing session-level
advisory locks, but they are automatically released at the end of the
current transaction and cannot be released explicitly via unlock
functions.

Marko Tiikkaja, reviewed by me.
2011-02-18 14:05:12 +09:00
Tom Lane 52b60530f2 Fix tsmatchsel() to account properly for null rows.
ts_typanalyze.c computes MCE statistics as fractions of the non-null rows,
which seems fairly reasonable, and anyway changing it in released versions
wouldn't be a good idea.  But then ts_selfuncs.c has to account for that.
Failure to do so results in overestimates in columns with a significant
fraction of null documents.  Back-patch to 8.4 where this stuff was
introduced.

Jesper Krogh
2011-02-17 19:00:49 -05:00
Robert Haas 4a25bc145a Add client_hostname field to pg_stat_activity.
Peter Eisentraut, reviewed by Steve Singer, Alvaro Herrera, and me.
2011-02-17 16:03:28 -05:00
Robert Haas a3e8486dff Prevent possible compiler warnings.
Simon Riggs reports that rnode.dbNode and rnode.spcNode were
generating unused variable warnings on gcc 4.4.3 with CFLAGS=-O1
2011-02-17 16:01:46 -05:00