Commit Graph

24711 Commits

Author SHA1 Message Date
Andrew Dunstan 6697aa2bc2 Improve support for building PGXS modules with VPATH.
A VPATH build will be performed when the module's make file path is not
the current directory or when USE_VPATH is set.

This will assist packagers and others who prefer to build without
polluting the source directories.

There is still a bit of work to do here, notably documentation, but it's
probably a good idea to commit what we have so far and let people test
it out on their modules.

Cédric Villemain, with an addition from me.
2013-07-01 12:53:05 -04:00
Bruce Momjian 6d432152b9 Update LSB URL in pg_ctl
Update Linux Standard Base Core Specification 3.1 URL mention in pg_ctl
comments.
2013-07-01 12:46:13 -04:00
Bruce Momjian 06b804377c Remove undocumented -h (help) option
The -h option was not supported by many tools, and was not documented,
so remove it for consistency from pg_upgrade, pg_test_fsync, and
pg_test_timing.
2013-07-01 12:40:33 -04:00
Heikki Linnakangas 031cc55bbe Optimize pglz compressor for small inputs.
The pglz compressor has a significant startup cost, because it has to
initialize to zeros the history-tracking hash table. On a 64-bit system, the
hash table was 64kB in size. While clearing memory is pretty fast, for very
short inputs the relative cost of that was quite large.

This patch alleviates that in two ways. First, instead of storing pointers
in the hash table, store 16-bit indexes into the hist_entries array. That
slashes the size of the hash table to 1/2 or 1/4 of the original, depending
on the pointer width. Secondly, adjust the size of the hash table based on
input size. For very small inputs, you don't need a large hash table to
avoid collisions.

Review by Amit Kapila.
2013-07-01 11:00:14 +03:00
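
The size arithmetic behind commit 031cc55bbe can be sketched roughly as follows. This is an illustration in Python, not the pglz code. The slot count of 8192 is inferred from the stated 64kB table with 8-byte pointers, and `slots_for_input` with its thresholds is a hypothetical sizing rule, since the message does not give the actual one.

```python
# Assumption: 8192 hash slots, inferred from "64kB on a 64-bit system"
# (64 KiB / 8-byte pointers). The real table size may differ.
HASH_SLOTS = 8192
INDEX_BYTES = 2  # a 16-bit index into the hist_entries array


def table_bytes(slots: int, entry_bytes: int) -> int:
    """Bytes that must be zeroed before compression can start."""
    return slots * entry_bytes


def shrink_factor(pointer_bytes: int) -> float:
    """Relative table size once pointers are replaced by 16-bit indexes."""
    return INDEX_BYTES / pointer_bytes


def slots_for_input(input_len: int) -> int:
    """Hypothetical sizing rule: smallest power of two >= input_len,
    clamped to the range [512, HASH_SLOTS]. Thresholds are illustrative."""
    slots = 512
    while slots < input_len and slots < HASH_SLOTS:
        slots *= 2
    return slots
```

With 8-byte pointers the table shrinks to 1/4 of its size, and with 4-byte pointers to 1/2, which matches the figures in the message.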
Heikki Linnakangas 79ce29c734 Retry short writes when flushing WAL.
We don't normally bother retrying when the number of bytes written by
write() is short of what was requested. It is generally assumed that a
write() to disk doesn't return short, unless you run out of disk space.
While writing the WAL, however, it seems prudent to try a bit harder,
because a failure leads to PANIC. The write() is also much larger than most
write()s in the backend (up to wal_buffers), so there's more room for
surprises.

Also retry on EINTR. All signals used in the backend are flagged SA_RESTART
nowadays, so it shouldn't happen, but better to be defensive.
2013-07-01 09:36:00 +03:00
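
The retry loop described in commit 79ce29c734 might look something like the sketch below. It is written in Python against `os.write` rather than the backend's C code, and `write_all` is a hypothetical helper name. Treating a zero-byte write as "out of space" is an assumption made for the sketch. Recent Python versions already retry EINTR inside `os.write`, so the `InterruptedError` branch is mainly illustrative.

```python
import errno
import os


def write_all(fd: int, data: bytes) -> None:
    """Write all of data to fd, retrying on short writes and on EINTR."""
    view = memoryview(data)
    while len(view) > 0:
        try:
            written = os.write(fd, view)
        except InterruptedError:
            # EINTR: nothing was written, so simply try again.
            continue
        if written == 0:
            # Assumption for this sketch: no progress means no space left.
            raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))
        # Short write: drop the bytes that made it out and retry the rest.
        view = view[written:]
```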
Peter Eisentraut 129759d6a5 Fix cpluspluscheck in checksum code
C++ is more picky about comparing signed and unsigned integers.
2013-06-30 10:25:43 -04:00
Peter Eisentraut 14a85031b1 ecpg: Consistently use mm_strdup()
mm_strdup() is provided to check errors from strdup(), but some places
were failing to use it.
2013-06-29 22:14:56 -04:00
Heikki Linnakangas ee6556555b Inline ginCompareItemPointers function for speed.
The ginCompareItemPointers function is called heavily in GIN index scans;
inlining it speeds up some kinds of queries a lot.
2013-06-29 12:55:34 +03:00
Simon Riggs d51b271059 Change errcode for lock_timeout to match NOWAIT
Set errcode to ERRCODE_LOCK_NOT_AVAILABLE

Zoltán Böszörményi
2013-06-29 00:57:25 +01:00
Simon Riggs f177cbfe67 ALTER TABLE ... ALTER CONSTRAINT for FKs
Allow constraint attributes to be altered,
so the default setting of NOT DEFERRABLE
can be altered to DEFERRABLE and back.

Review by Abhijit Menon-Sen
2013-06-29 00:27:30 +01:00
Simon Riggs 2f74e4ec50 Assert that ALTER TABLE subcommands have pass set
2013-06-29 00:26:46 +01:00
Alvaro Herrera 82233ce7ea Send SIGKILL to children if they don't die quickly in immediate shutdown
On immediate shutdown, or during a restart-after-crash sequence,
postmaster used to send SIGQUIT (and then abandon ship if shutdown); but
this is not a good strategy if backends don't die because of that
signal.  (This might happen, for example, if a backend gets tangled
trying to malloc() due to gettext(), as in an example illustrated by
MauMau.)  This causes problems when later trying to restart the server,
because some processes are still attached to the shared memory segment.

Instead of just abandoning such backends to their fates, we now have
postmaster hang around for a little while longer, send a SIGKILL after
some reasonable waiting period, and then exit.  This makes immediate
shutdown more reliable.

There is disagreement on whether it's best for postmaster to exit after
sending SIGKILL, or to stick around until all children have reported
death.  If this controversy is resolved differently than what this patch
implements, it's an easy change to make.

Bug reported by MauMau in message 20DAEA8949EC4E2289C6E8E58560DEC0@maumau

MauMau and Álvaro Herrera
2013-06-28 17:49:46 -04:00
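
The escalation in commit 82233ce7ea (SIGQUIT first, SIGKILL after a waiting period) can be sketched for POSIX systems as below. This is a Python analogy using `subprocess.Popen` objects as stand-ins for backends. `immediate_shutdown` and the 5-second default are hypothetical. Unlike the patch as described, this sketch also reaps the children after SIGKILL, which is one side of the disagreement the message mentions.

```python
import signal
import subprocess
import time


def immediate_shutdown(children, grace_seconds: float = 5.0) -> None:
    """Send SIGQUIT to every live child, then SIGKILL any that survive."""
    for child in children:
        if child.poll() is None:
            child.send_signal(signal.SIGQUIT)

    # One shared deadline, so the total wait does not grow with child count.
    deadline = time.monotonic() + grace_seconds
    for child in children:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            child.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            child.kill()  # SIGKILL cannot be caught or ignored
            child.wait()  # reap it (a choice made for this sketch)
```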
Robert Haas 5893ffa79c Make the OVER keyword unreserved.
This results in a slightly less specific error message when OVER
is used in a context where we don't accept window functions, but
per discussion, it's worth it to get the benefit of not needing
to reserve this keyword any more.  This same refactoring will
also let us avoid reserving some other keywords that we expect
to add in upcoming patches (specifically, IGNORE, RESPECT, and
FILTER).

Troels Nielsen, with minor changes by me
2013-06-28 11:11:00 -04:00
Robert Haas 5ee73525d5 Define Trap and TrapMacro even in non-cassert builds.
In some cases, the use of these macros may be preferable to Assert()
or AssertMacro(), since this way the caller can set the trap message.

Andres Freund and Robert Haas
2013-06-28 09:33:34 -04:00
Heikki Linnakangas 9e0bc7c1e8 Track spinlock delay in microsecond granularity.
On many platforms the OS will round the sleep time to millisecond
resolution, but there is no reason for us to pre-emptively round the
argument to pg_usleep.

When the delay was measured in milliseconds and started from 1 ms, it
sometimes took many attempts until the logic that increases the delay by
multiplying with a random value between 1 and 2 actually managed to bump it
from 1 ms to 2 ms. That led to a sequence of 1 ms waits until the delay
started to increase. This wasn't really a problem but it looked odd if you
observed the waits. There is no measurable difference in performance, but
it's more readable this way.

Jeff Janes
2013-06-28 12:39:55 +03:00
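
The rounding effect described in commit 9e0bc7c1e8 can be reproduced with a small sketch. The exact expression used by the spinlock code is not quoted in the message, so the "add a rounded random fraction of the current delay" form below is an assumption. `r` stands for the random fraction in [0, 1), so the delay is multiplied by a value between 1 and 2.

```python
def next_delay_ms(delay_ms: int, r: float) -> int:
    """Delay tracked in whole milliseconds (the old granularity)."""
    return delay_ms + int(delay_ms * r + 0.5)


def next_delay_us(delay_us: int, r: float) -> int:
    """Delay tracked in microseconds (the new granularity)."""
    return delay_us + int(delay_us * r + 0.5)


# Starting at 1 ms, any r below 0.5 rounds the increment down to zero,
# so the delay stays at 1 ms. At 1000 us the same r makes visible progress.
stuck = next_delay_ms(1, 0.25)       # stays at 1
grows = next_delay_us(1000, 0.25)    # becomes 1250
```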
Alvaro Herrera 9db4ad44eb Update pg_resetxlog's documentation on multixacts
I added some more functionality to it in 0ac5ad5134 but neglected to
add it to the docs.

Per Peter Eisentraut in message
1367112171.32604.4.camel@vanquo.pezone.net
2013-06-27 15:32:58 -04:00
Noah Misch 263865a489 Permit super-MaxAllocSize allocations with MemoryContextAllocHuge().
The MaxAllocSize guard is convenient for most callers, because it
reduces the need for careful attention to overflow, data type selection,
and the SET_VARSIZE() limit.  A handful of callers are happy to navigate
those hazards in exchange for the ability to allocate a larger chunk.
Introduce MemoryContextAllocHuge() and repalloc_huge().  Use this in
tuplesort.c and tuplestore.c, enabling internal sorts of up to INT_MAX
tuples, a factor-of-48 increase.  In particular, B-tree index builds can
now benefit from much-larger maintenance_work_mem settings.

Reviewed by Stephen Frost, Simon Riggs and Jeff Janes.
2013-06-27 14:53:57 -04:00
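
The "factor-of-48" figure in commit 263865a489 can be checked with a little arithmetic. The sketch assumes that each in-memory sort entry occupies 24 bytes on a 64-bit build. That size is not stated in the message and is only an assumption here.

```python
MAX_ALLOC_SIZE = 0x3FFFFFFF    # MaxAllocSize: 1 GB - 1
INT_MAX = 2**31 - 1
SORT_TUPLE_BYTES = 24          # assumption: size of one sort array entry

# Largest tuple array that fits under the ordinary allocation limit.
old_limit = MAX_ALLOC_SIZE // SORT_TUPLE_BYTES

# New ceiling is INT_MAX tuples, so the increase is roughly this ratio.
ratio = INT_MAX / old_limit
```

Under that assumption `old_limit` is about 44.7 million tuples and `ratio` comes out very close to 48.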
Tom Lane 9ef86cd994 Mark index-constraint comments with correct dependency in pg_dump.
When there's a comment on an index that was created with UNIQUE or PRIMARY
KEY constraint syntax, we need to label the comment as depending on the
constraint not the index, since only the constraint object actually appears
in the dump.  This incorrect dependency can lead to parallel pg_restore
trying to restore the comment before the index has been created, per bug
#8257 from Lloyd Albin.

This patch fixes pg_dump to produce the right dependency in dumps made
in the future.  Usually we also try to hack pg_restore to work around
bogus dependencies, so that existing (wrong) dumps can still be restored in
parallel mode; but that doesn't seem practical here since there's no easy
way to relate the constraint dump entry to the comment after the fact.

Andres Freund
2013-06-27 13:54:50 -04:00
Tom Lane a099482c86 Expect EWOULDBLOCK from a non-blocking connect() call only on Windows.
On Unix-ish platforms, EWOULDBLOCK may be the same as EAGAIN, which is
*not* a success return, at least not on Linux.  We need to treat it as a
failure to avoid giving a misleading error message.  Per the Single Unix
Spec, only EINPROGRESS and EINTR returns indicate that the connection
attempt is in progress.

On Windows, on the other hand, EWOULDBLOCK (WSAEWOULDBLOCK) is the expected
case.  We must accept EINPROGRESS as well because Cygwin will return that,
and it doesn't seem worth distinguishing Cygwin from native Windows here.
It's not very clear whether EINTR can occur on Windows, but let's leave
that part of the logic alone in the absence of concrete trouble reports.

Also, remove the test for errno == 0, effectively reverting commit
da9501bddb, which AFAICS was just a thinko;
or at best it might have been a workaround for a platform-specific bug,
which we can hope is gone now thirteen years later.  In any case, since
libpq makes no effort to reset errno to zero before calling connect(),
it seems unlikely that that test has ever reliably done anything useful.

Andres Freund and Tom Lane
2013-06-27 12:36:44 -04:00
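
The errno classification described in commit a099482c86 can be summarized in a short sketch. `connect_in_progress` is a hypothetical helper, not a libpq function, and Python's `errno.EWOULDBLOCK` stands in for WSAEWOULDBLOCK on Windows.

```python
import errno


def connect_in_progress(err: int, on_windows: bool) -> bool:
    """Return True if a failed non-blocking connect() is still in progress."""
    # Per the Single Unix Spec, only these two mean "still connecting".
    if err in (errno.EINPROGRESS, errno.EINTR):
        return True
    # EWOULDBLOCK is the expected result only on Windows; elsewhere it may
    # equal EAGAIN, which is a real failure.
    if on_windows and err == errno.EWOULDBLOCK:
        return True
    return False
```

Note that there is no `err == 0` case, in line with the removal of that test in the commit.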
Noah Misch 19085116ee Cooperate with the Valgrind instrumentation framework.
Valgrind "client requests" in aset.c and mcxt.c teach Valgrind and its
Memcheck tool about the PostgreSQL allocator.  This makes Valgrind
roughly as sensitive to memory errors involving palloc chunks as it is
to memory errors involving malloc chunks.  Further client requests in
PageAddItem() and printtup() verify that all bits being added to a
buffer page or furnished to an output function are predictably-defined.
Those tests catch failures of C-language functions to fully initialize
the bits of a Datum, which in turn stymie optimizations that rely on
_equalConst().  Define the USE_VALGRIND symbol in pg_config_manual.h to
enable these additions.  An included "suppression file" silences nominal
errors we don't plan to fix.

Reviewed in earlier versions by Peter Geoghegan and Korry Douglas.
2013-06-26 20:22:25 -04:00
Noah Misch a855148a29 Refactor aset.c and mcxt.c in preparation for Valgrind cooperation.
Move some repeated debugging code into functions and store intermediates
in variables where not presently necessary.  No code-generation changes
in a production build, and no functional changes.  This simplifies and
focuses the main patch.
2013-06-26 19:56:03 -04:00
Noah Misch 1d96bb9602 Initialize pad bytes in GinFormTuple().
Every other core buffer page consumer initializes the bytes it furnishes
to PageAddItem().  For consistency, do the same here.  No back-patch;
regardless, we couldn't count on the fix so long as binary upgrade can
carry forward affected index builds.
2013-06-26 19:55:15 -04:00
Noah Misch 5f538ad004 Renovate display of non-ASCII messages on Windows.
GNU gettext selects a default encoding for the messages it emits in a
platform-specific manner; it uses the Windows ANSI code page on Windows
and follows LC_CTYPE on other platforms.  This is inconvenient for
PostgreSQL server processes, so realize consistent cross-platform
behavior by calling bind_textdomain_codeset() on Windows each time we
permanently change LC_CTYPE.  This primarily affects SQL_ASCII databases
and processes like the postmaster that do not attach to a database,
making their behavior consistent with PostgreSQL on non-Windows
platforms.  Messages from SQL_ASCII databases use the encoding implied
by the database LC_CTYPE, and messages from non-database processes use
LC_CTYPE from the postmaster system environment.  PlatformEncoding
becomes unused, so remove it.

Make write_console() prefer WriteConsoleW() to write() regardless of the
encodings in use.  In this situation, write() will invariably mishandle
non-ASCII characters.

elog.c has assumed that messages conform to the database encoding.
While usually true, this does not hold for SQL_ASCII and MULE_INTERNAL.
Introduce MessageEncoding to track the actual encoding of message text.
The present consumers are Windows-specific code for converting messages
to UTF16 for use in system interfaces.  This fixes the appearance in
Windows event logs and consoles of translated messages from SQL_ASCII
processes like the postmaster.  Note that SQL_ASCII inherently disclaims
a strong notion of encoding, so non-ASCII byte sequences interpolated
into messages by %s may yet yield a nonsensical message.  MULE_INTERNAL
has similar problems at present, albeit for a different reason: its lack
of libiconv support or a conversion to UTF8.

Consequently, one need no longer restart Windows with a different
Windows ANSI code page to broadly test backend logging under a given
language.  Changing the user's locale ("Format") is enough.  Several
accounts can simultaneously run postmasters under different locales, all
correctly logging localized messages to Windows event logs and consoles.

Alexander Law and Noah Misch
2013-06-26 11:17:33 -04:00
Peter Eisentraut 2c1031bd86 pg_receivexlog: Fix logic error
The code checking the WAL file name contained a logic error and wouldn't
actually catch some bad names.
2013-06-26 00:01:00 -04:00
Alvaro Herrera 4ca50e0710 Avoid inconsistent type declaration
Clang 3.3 correctly complains that a variable of type enum
MultiXactStatus cannot hold a value of -1, which makes sense.  Change
the declared type of the variable to int instead, and apply casting as
necessary to avoid the warning.

Per notice from Andres Freund
2013-06-25 16:41:47 -04:00
Andrew Dunstan 81166a2f7e Properly dump dropped foreign table cols in binary-upgrade mode.
In binary upgrade mode, we need to recreate and then drop dropped
columns so that all the columns get the right attribute number. This is
true for foreign tables as well as for native tables. For foreign
tables we have been getting the first part right but not the second,
leading to bogus columns in the upgraded database. Fix this all the way
back to 9.1, where foreign tables were introduced.
2013-06-25 13:46:34 -04:00
Fujii Masao 985bd7d497 Support clean switchover.
In replication, when we shut down the master, walsender tries to send
all the outstanding WAL records to the standby, and then to exit. This
basically means that all the WAL records are fully synced between
two servers after the clean shutdown of the master. So, after
promoting the standby to new master, we can restart the stopped
master as new standby without the need for a fresh backup from
new master.

But there was one problem so far: though walsender tries to send all
the outstanding WAL records, it doesn't wait for them to be replicated
to the standby. Then, before receiving all the WAL records,
walreceiver can detect the closure of connection and exit. We cannot
guarantee that there is no missing WAL in the standby after clean
shutdown of the master. In this case, backup from new master is
required when restarting the stopped master as new standby.

This patch fixes this problem. It just changes walsender so that it
waits for all the outstanding WAL records to be replicated to the
standby before closing the replication connection.

Per discussion, this is a fix that needs to get back-patched rather than a
new feature. So, back-patch to 9.1 where enough infrastructure for
this exists.

Patch by me, reviewed by Andres Freund.
2013-06-26 02:14:37 +09:00
Simon Riggs 4f14c86d74 Reverting previous commit, pending investigation
of sporadic seg faults from various build farm members.
2013-06-24 21:21:18 +01:00
Simon Riggs b577a57d41 ALTER TABLE ... ALTER CONSTRAINT for FKs
Allow constraint attributes to be altered,
so the default setting of NOT DEFERRABLE
can be altered to DEFERRABLE and back.

Review by Abhijit Menon-Sen
2013-06-24 20:07:41 +01:00
Peter Eisentraut ce18b01159 Translation updates
2013-06-24 14:16:44 -04:00
Tom Lane 8c1a71d36f Add a comment warning against use of pg_usleep() for long sleeps.
Follow-up to commit 873ab97219, in which
I noted that WaitLatch was a better solution in the commit log message,
but neglected to add any documentation in the code.
2013-06-23 14:43:10 -04:00
Simon Riggs 1f09121b4e Ensure no xid gaps during Hot Standby startup
In some cases with higher numbers of subtransactions
it was possible for us to incorrectly initialize
subtrans leading to complaints of missing pages.

Bug report by Sergey Konoplev
Analysis and fix by Andres Freund
2013-06-23 11:05:02 +01:00
Peter Eisentraut 7dfd5cd21c Clarify terminology standalone backend vs. single-user mode
Most of the documentation uses "single-user mode", so use that in the
code as well.  Adjust the documentation to match the new error message
wording.  Also add a documentation index entry for "single-user mode".

Based-on-patch-by: Jeff Janes <jeff.janes@gmail.com>
2013-06-20 23:03:18 -04:00
Peter Eisentraut 8df54b9fad initdb: Add blank line before output about checksums
This maintains the logical grouping of the output better.
2013-06-19 20:34:45 -04:00
Fujii Masao bab54e383d Support TB (terabyte) memory unit in GUC variables.
Patch by Simon Riggs, reviewed by Jeff Janes and me.
2013-06-20 08:17:14 +09:00
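
To illustrate what a TB unit means for a memory setting (commit bab54e383d), here is a minimal unit-conversion sketch. It is not the GUC parser: `parse_memory_kb` is a hypothetical helper, and it assumes the value is normalized to kilobytes, whereas real parameters can use other base units.

```python
# Multipliers relative to 1 kB; TB is the unit the commit adds.
KB_PER_UNIT = {"kB": 1, "MB": 1024, "GB": 1024**2, "TB": 1024**3}


def parse_memory_kb(text: str) -> int:
    """Convert a string such as '64MB' or '1TB' to a number of kilobytes."""
    text = text.strip()
    for unit, factor in KB_PER_UNIT.items():
        if text.endswith(unit):
            return int(text[: -len(unit)]) * factor
    # No recognized unit: assume the number is already in kB.
    return int(text)
```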
Bruce Momjian f979599b20 Modernize entab source code
Remove halt.c, improve comments, rename manual page file.
2013-06-19 12:31:26 -04:00
Kevin Grittner 8791627b8f Fix the create_index regression test for Danish collation.
In Danish collations, there are letter combinations which sort
higher than 'Z'.  A test for values > 'WA' was picking up rows
where the value started with 'AA', causing the test to fail.

Backpatch to 9.2, where the failing test was added.

Per report from Svenne Krap and analysis by Jeff Janes
2013-06-19 10:36:45 -05:00
Peter Eisentraut c3c86ae2af psql: Re-allow -1 together with -c or -l
2013-06-17 21:53:33 -04:00
Jeff Davis b8fd1a09f3 Add buffer_std flag to MarkBufferDirtyHint().
MarkBufferDirtyHint() writes WAL, and should know if it's got a
standard buffer or not. Currently, the only callers where buffer_std
is false are related to the FSM.

In passing, rename XLOG_HINT to XLOG_FPI, which is more descriptive.

Back-patch to 9.3.
2013-06-17 08:02:12 -07:00
Tom Lane a64ca63e59 Use WaitLatch, not pg_usleep, for delaying in pg_sleep().
This avoids platform-dependent behavior wherein pg_sleep() might fail to be
interrupted by statement timeout, query cancel, SIGTERM, etc.  Also, since
there's no reason to wake up once a second any more, we can reduce the
power consumption of a sleeping backend a tad.

Back-patch to 9.3, since use of SA_RESTART for SIGALRM makes this a bigger
issue than it used to be.
2013-06-15 16:23:24 -04:00
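
The difference between a plain sleep and a latch wait (commit a64ca63e59) can be shown by analogy in Python, with `threading.Event` standing in for a latch. This is only an analogy, not how WaitLatch is implemented, and `interruptible_sleep` is a hypothetical name.

```python
import threading
import time


def interruptible_sleep(seconds: float, latch: threading.Event) -> bool:
    """Sleep for up to `seconds`; return True if the latch woke us early."""
    # Unlike time.sleep(), this returns as soon as another thread sets the
    # event, so there is no need to wake up periodically to poll for it.
    return latch.wait(timeout=seconds)
```

A cancel request would call `latch.set()`, and the sleeper returns immediately instead of finishing its full interval.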
Fujii Masao f69aece6f4 Fix pg_restore -l with the directory archive to display the correct format name.
Back-patch to 9.1 where the directory archive was introduced.
2013-06-16 05:07:02 +09:00
Tom Lane 873ab97219 Use SA_RESTART for all signals, including SIGALRM.
The exclusion of SIGALRM dates back to Berkeley days, when Postgres used
SIGALRM in only one very short stretch of code.  Nowadays, allowing it to
interrupt kernel calls doesn't seem like a very good idea, since its use
for statement_timeout means SIGALRM could occur anyplace in the code, and
there are far too many call sites where we aren't prepared to deal with
EINTR failures.  When third-party code is taken into consideration, it
seems impossible that we ever could be fully EINTR-proof, so better to
use SA_RESTART always and deal with the implications of that.  One such
implication is that we should not assume pg_usleep() will be terminated
early by a signal.  Therefore, long sleeps should probably be replaced
by WaitLatch operations where practical.

Back-patch to 9.3 so we can get some beta testing on this change.
2013-06-15 15:39:51 -04:00
Tom Lane 5242fefb47 Be consistent about #define'ing configure symbols as "1" not empty.
This is just neatnik-ism, since all the tests in the code are #ifdefs,
but we shouldn't specify symbols as "Define to 1 ..." and then not
actually define them that way.
2013-06-15 14:11:43 -04:00
Tom Lane 46e1434f3d Update RELEASE_CHANGES to describe library version bumping more fully.
2013-06-14 14:53:23 -04:00
Tom Lane 8a3f0894a4 Stamp shared-library minor version numbers for 9.4.
2013-06-14 14:49:46 -04:00
Tom Lane 58ae1f4577 Stamp HEAD as 9.4devel.
Let the hacking begin ...
2013-06-14 14:41:28 -04:00
Tom Lane e472b92140 Avoid deadlocks during insertion into SP-GiST indexes.
SP-GiST's original scheme for avoiding deadlocks during concurrent index
insertions doesn't work, as per report from Hailong Li, and there isn't any
evident way to make it work completely.  We could possibly lock individual
inner tuples instead of their whole pages, but preliminary experimentation
suggests that the performance penalty would be huge.  Instead, if we fail
to get a buffer lock while descending the tree, just restart the tree
descent altogether.  We keep the old tuple positioning rules, though, in
hopes of reducing the number of cases where this can happen.

Teodor Sigaev, somewhat edited by Tom Lane
2013-06-14 14:26:43 -04:00
Tom Lane c62866eeaf Remove special-case treatment of LOG severity level in standalone mode.
elog.c has historically treated LOG messages as low-priority during
bootstrap and standalone operation.  This has led to confusion and even
masked a bug, because the normal expectation of code authors is that
elog(LOG) will put something into the postmaster log, and that wasn't
happening during initdb.  So get rid of the special-case rule and make
the priority order the same as it is in normal operation.  To keep from
cluttering initdb's output and the behavior of a standalone backend,
tweak the severity level of three messages routinely issued by xlog.c
during startup and shutdown so that they won't appear in these cases.
Per my proposal back in December.
2013-06-13 23:15:15 -04:00
Tom Lane f04216341d Refactor checksumming code to make it easier to use externally.
pg_filedump and other external utility programs are likely to want to be
able to check Postgres page checksums.  To avoid messy duplication of code,
move the checksumming functionality into an exported header file, much as
we did a while back for the CRC code.

In passing, get rid of an unportable assumption that a static char[] array
will be word-aligned, and do some other minor code beautification.
2013-06-13 22:35:56 -04:00
Peter Eisentraut fa2fc066f3 PL/Python: Fix type mixup
Memory was allocated based on the sizeof a type that was not the type of
the pointer that the result was being assigned to.  The types happen to
be of the same size, but it's still wrong.
2013-06-13 21:42:42 -04:00
Tom Lane 629b3e96dd Only install a portal's ResourceOwner if it actually has one.
In most scenarios a portal without a ResourceOwner is dead and not subject
to any further execution, but a portal for a cursor WITH HOLD remains in
existence with no ResourceOwner after the creating transaction is over.
In this situation, if we attempt to "execute" the portal directly to fetch
data from it, we were setting CurrentResourceOwner to NULL, leading to a
segfault if the datatype output code did anything that required a resource
owner (such as trying to fetch system catalog entries that weren't already
cached).  The case appears to be impossible to provoke with stock libpq,
but psqlODBC at least is able to cause it when working with held cursors.

Simplest fix is to just skip the assignment to CurrentResourceOwner, so
that any resources used by the data output operations will be managed by
the transaction-level resource owner instead.  For consistency I changed
all the places that install a portal's resowner as current, even though
some of them are probably not reachable with a held cursor's portal.

Per report from Joshua Berry (with thanks to Hiroshi Inoue for developing
a self-contained test case).  Back-patch to all supported versions.
2013-06-13 13:12:49 -04:00
Noah Misch 66008564f8 Avoid reading past datum end when parsing JSON.
Several loops in the JSON parser examined a byte in memory just before
checking whether its address was in-bounds, so they could read one byte
beyond the datum's allocation.  A SIGSEGV is possible.  New in 9.3, so
no back-patch.
2013-06-12 19:51:12 -04:00
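
The ordering problem fixed by commit 66008564f8 (examining a byte before checking that its address is in bounds) can be sketched as below. Python raises an exception on an out-of-range index instead of reading stray memory, so this only illustrates the safe ordering. `skip_whitespace` is a hypothetical helper, not a function from the JSON parser.

```python
def skip_whitespace(buf: bytes, pos: int) -> int:
    """Return the index of the first non-whitespace byte at or after pos."""
    # The bounds test comes first. Because `and` short-circuits, buf[pos]
    # is never evaluated once pos has reached len(buf).
    while pos < len(buf) and buf[pos] in b" \t\n\r":
        pos += 1
    return pos
```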
Noah Misch 3a5d0c5533 Avoid reading below the start of a stack variable in tokenize_file().
We would wrongly overwrite the prior stack byte if it happened to
contain '\n' or '\r'.  New in 9.3, so no back-patch.
2013-06-12 19:50:52 -04:00
Noah Misch 813895e4ac Don't pass oidvector by value.
Since the structure ends with a flexible array, doing so truncates any
vector having more than one element.  New in 9.3, so no back-patch.
2013-06-12 19:50:37 -04:00
Noah Misch fb435f40d5 Observe array length in HaveVirtualXIDsDelayingChkpt().
Since commit f21bb9cfb5, this function
ignores the caller-provided length and loops until it finds a
terminator, which GetVirtualXIDsDelayingChkpt() never adds.  Restore the
previous loop control logic.  In passing, revert the addition of an
unused variable by the same commit, presumably a debugging relic.
2013-06-12 19:50:14 -04:00
Noah Misch ff53890f68 Don't use ordinary NULL-terminated strings as Name datums.
Consumers are entitled to read the full 64 bytes pertaining to a Name;
using a shorter NULL-terminated string leads to reading beyond the end
of its allocation; a SIGSEGV is possible.  Use the frequent idiom of
copying to a NameData on the stack.  New in 9.3, so no back-patch.
2013-06-12 19:49:50 -04:00
Tom Lane dc3eb56383 Improve updatability checking for views and foreign tables.
Extend the FDW API (which we already changed for 9.3) so that an FDW can
report whether specific foreign tables are insertable/updatable/deletable.
The default assumption continues to be that they're updatable if the
relevant executor callback function is supplied by the FDW, but finer
granularity is now possible.  As a test case, add an "updatable" option to
contrib/postgres_fdw.

This patch also fixes the information_schema views, which previously did
not think that foreign tables were ever updatable, and fixes
view_is_auto_updatable() so that a view on a foreign table can be
auto-updatable.

initdb forced due to changes in information_schema views and the functions
they rely on.  This is a bit unfortunate to do post-beta1, but if we don't
change this now then we'll have another API break for FDWs when we do
change it.

Dean Rasheed, somewhat editorialized on by Tom Lane
2013-06-12 17:53:33 -04:00
Andrew Dunstan 78ed8e03c6 Fix unescaping of JSON Unicode escapes, especially for non-UTF8.
Per discussion on -hackers. We treat Unicode escapes when unescaping
them similarly to the way we treat them in PostgreSQL string literals.
Escapes in the ASCII range are always accepted, no matter what the
database encoding. Escapes for higher code points are only processed in
UTF8 databases, and attempts to process them in other databases will
result in an error. \u0000 is never unescaped, since it would result in
an impermissible null byte.
2013-06-12 13:35:24 -04:00
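
The three rules in commit 78ed8e03c6 can be written down as a small decision function. This is a sketch of the stated policy only. `unescape_codepoint` and its return convention are hypothetical, and the real code works on server-encoded bytes rather than Python strings.

```python
def unescape_codepoint(cp: int, database_is_utf8: bool) -> str:
    """Apply the unescaping policy to one \\uXXXX code point."""
    if cp == 0:
        # \u0000 is never unescaped; keep the escape text as it is.
        return "\\u0000"
    if cp < 0x80:
        # ASCII escapes are accepted in every database encoding.
        return chr(cp)
    if database_is_utf8:
        return chr(cp)
    raise ValueError("non-ASCII Unicode escape in a non-UTF8 database")
```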
Tom Lane e262755bfc Fix cache flush hazard in cache_record_field_properties().
We need to increment the refcount on the composite type's cached tuple
descriptor while we do lookups of its column types.  Otherwise a cache
flush could occur and release the tuple descriptor before we're done with
it.  This fails reliably with -DCLOBBER_CACHE_ALWAYS, but the odds of a
failure in a production build seem rather low (since the pfree'd descriptor
typically wouldn't get scribbled on immediately).  That may explain the
lack of any previous reports.  Buildfarm issue noted by Christian Ullrich.

Back-patch to 9.1 where the bogus code was added.
2013-06-11 17:26:42 -04:00
Fujii Masao 941c4ece98 Fix pg_isready to handle conninfo properly.
pg_isready displays the host name and the port number that it uses to connect
to the server. So far, pg_isready didn't use the conninfo specified in the -d option
when calculating that host name and port number. This could lead it to display the
wrong values to the user. This commit changes pg_isready so that it uses the conninfo
for that calculation.

Original patch by Phil Sorber, modified by me.
2013-06-11 03:03:16 +09:00
Joe Conway 33a4466f76 Fix ordering of obj id for Rules and EventTriggers in pg_dump.
getSchemaData() must identify extension member objects and mark them
as not to be dumped. This must happen after reading all objects that can be
direct members of extensions, but before we begin to process table subsidiary
objects. Both rules and event triggers were wrong in this regard.

Backport rules portion of patch to 9.1 -- event triggers do not exist prior to 9.3.
Suggested fix by Tom Lane, initial complaint and patch by me.
2013-06-09 17:30:39 -07:00
Tom Lane a4424c57c3 Remove unnecessary restrictions about RowExprs in transformAExprIn().
When the existing code here was written, it made sense to special-case
RowExprs because that was the only way that we could handle row comparisons
at all.  Now that we have record_eq() and arrays of composites, the generic
logic for "scalar" types will in fact work on RowExprs too, so there's no
reason to throw error for combinations of RowExprs and other ways of
forming composite values, nor to ignore the possibility of using a
ScalarArrayOpExpr.  But keep using the old logic when comparing two
RowExprs, for consistency with the main transformAExprOp() logic.  (This
allows some cases with not-quite-identical rowtypes to succeed, so we might
get push-back if we removed it.)  Per bug #8198 from Rafal Rzepecki.

Back-patch to all supported branches, since this works fine as far back as
8.4.

Rafal Rzepecki and Tom Lane
2013-06-09 18:39:20 -04:00
Tom Lane f3839ea117 Remove ALTER DEFAULT PRIVILEGES' requirement of schema CREATE permissions.
Per discussion, this restriction isn't needed for any real security reason,
and it seems to confuse people more often than it helps them.  It could
also result in some database states being unrestorable.  So just drop it.

Back-patch to 9.0, where ALTER DEFAULT PRIVILEGES was introduced.
2013-06-09 15:26:40 -04:00
Tom Lane 007556bf08 Remove fixed limit on the number of concurrent AllocateFile() requests.
AllocateFile(), AllocateDir(), and some sister routines share a small array
for remembering requests, so that the files can be closed on transaction
failure.  Previously that array had a fixed size, MAX_ALLOCATED_DESCS (32).
While historically that had seemed sufficient, Steve Toutant pointed out
that this meant you couldn't scan more than 32 file_fdw foreign tables in
one query, because file_fdw depends on the COPY code which uses
AllocateFile().  There are probably other cases, or will be in the future,
where this nonconfigurable limit impedes users.

We can't completely remove any such limit, at least not without a lot of
work, since each such request requires a kernel file descriptor and most
platforms limit the number we can have.  (In principle we could
"virtualize" these descriptors, as fd.c already does for the main VFD pool,
but not without an additional layer of overhead and a lot of notational
impact on the calling code.)  But we can at least let the array size be
configurable.  Hence, change the code to allow up to max_safe_fds/2
allocated file requests.  On modern platforms this should allow several
hundred concurrent file_fdw scans, or more if one increases the value of
max_files_per_process.  To go much further than that, we'd need to do some
more work on the data structure, since the current code for closing
requests has potentially O(N^2) runtime; but it should still be all right
for request counts in this range.

Back-patch to 9.1 where contrib/file_fdw was introduced.
2013-06-09 13:46:54 -04:00
Andrew Dunstan d535136b5d Don't downcase non-ascii identifier chars in multi-byte encodings.
Long-standing code has called tolower() on identifier character bytes
with the high bit set. This is clearly an error and produces junk output
when the encoding is multi-byte. This patch therefore restricts this
activity to cases where there is a character with the high bit set AND
the encoding is single-byte.

There have been numerous gripes about this, most recently from Martin
Schäfer.

Backpatch to all live releases.
2013-06-08 10:00:09 -04:00
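Note on d535136b5d: the rule can be illustrated with a small Python sketch working on raw bytes. This is not the server's C code. The function name is hypothetical, and Latin-1 is assumed as the single-byte encoding so that the high-bit case mapping is deterministic (the real code relies on the locale's tolower()).

```python
def downcase_identifier(ident: bytes, encoding_is_single_byte: bool) -> bytes:
    """Fold an identifier to lower case one byte at a time."""
    out = bytearray()
    for b in ident:
        if 0x41 <= b <= 0x5A:
            # ASCII 'A'..'Z' is always safe to fold.
            b += 0x20
        elif b >= 0x80 and encoding_is_single_byte:
            # High-bit byte in a single-byte encoding (Latin-1 assumed here):
            # 0xC0..0xDE are upper-case letters, except 0xD7 (multiplication sign).
            if 0xC0 <= b <= 0xDE and b != 0xD7:
                b += 0x20
        # High-bit bytes in a multi-byte encoding are copied unchanged,
        # because each one is only a fragment of a character.
        out.append(b)
    return bytes(out)
```

In UTF-8, "Ä" is the byte pair C3 84. Folding those bytes individually under a Latin-1 interpretation would turn C3 into E3 and corrupt the character, which is the kind of junk output the commit message describes.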
Andrew Dunstan 94e3311b97 Handle Unicode surrogate pairs correctly when processing JSON.
In 9.2, Unicode escape sequences are not analysed at all other than
to make sure that they are in the form \uXXXX. But in 9.3 many of the
new operators and functions try to turn JSON text values into text in
the server encoding, and this includes de-escaping Unicode escape
sequences. This processing had not taken into account the possibility
that this might contain a surrogate pair to designate a character
outside the BMP. That is now handled correctly.

This also enforces correct use of surrogate pairs, something that is not
done by the type's input routines. This fact is noted in the docs.
2013-06-08 09:12:48 -04:00
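Note on 94e3311b97: a rough Python sketch of how a sequence of \uXXXX code units can be turned into text while enforcing correct surrogate pairing. The function name and error messages are hypothetical; the arithmetic is the standard UTF-16 surrogate formula, not a copy of the server code.

```python
def decode_utf16_escapes(units):
    """Convert 16-bit code units (the XXXX of each \\uXXXX) into a string."""
    out = []
    pending_high = None
    for u in units:
        if 0xD800 <= u <= 0xDBFF:
            # High (leading) surrogate: must be followed by a low surrogate.
            if pending_high is not None:
                raise ValueError("high surrogate must not follow a high surrogate")
            pending_high = u
        elif 0xDC00 <= u <= 0xDFFF:
            # Low (trailing) surrogate: only valid right after a high one.
            if pending_high is None:
                raise ValueError("low surrogate must follow a high surrogate")
            code_point = 0x10000 + ((pending_high - 0xD800) << 10) + (u - 0xDC00)
            out.append(chr(code_point))
            pending_high = None
        else:
            # Ordinary BMP character.
            if pending_high is not None:
                raise ValueError("high surrogate must be followed by a low surrogate")
            out.append(chr(u))
    if pending_high is not None:
        raise ValueError("input ends with an unpaired high surrogate")
    return "".join(out)
```

For example, the escape pair \ud83d\ude00 combines to the single code point U+1F600, which lies outside the BMP.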
Heikki Linnakangas f73cb5567c Fix typo in comment. 2013-06-06 18:27:01 +03:00
Robert Haas a6370fd9ed Ensure that XLOG_HEAP2_VISIBLE always targets an initialized page.
Andres Freund
2013-06-06 10:21:47 -04:00
Tom Lane 964c0d0f80 Prevent pushing down WHERE clauses into unsafe UNION/INTERSECT nests.
The planner is aware that it mustn't push down upper-level quals into
subqueries if the quals reference subquery output columns that contain
set-returning functions or volatile functions, or are non-DISTINCT outputs
of a DISTINCT ON subquery.  However, it missed making this check when
there were one or more levels of UNION or INTERSECT above the dangerous
expression.  This could lead to "set-valued function called in context that
cannot accept a set" errors, as seen in bug #8213 from Eric Soroos, or to
silently wrong answers in the other cases.

To fix, refactor the checks so that we make the column-is-unsafe checks
during subquery_is_pushdown_safe(), which already has to recursively
inspect all arms of a set-operation tree.  This makes
qual_is_pushdown_safe() considerably simpler, at the cost that we will
spend some cycles checking output columns that possibly aren't referenced
in any upper qual.  But the cases where this code gets executed at all
are already nontrivial queries, so it's unlikely anybody will notice any
slowdown of planning.

This has been broken since commit 05f916e6ad,
which makes the bug over ten years old.  A bit surprising nobody noticed it
before now.
2013-06-05 23:45:11 -04:00
Peter Eisentraut a3bd6096bd Update SQL features list 2013-06-05 22:05:18 -04:00
Tom Lane 3f783c8827 Put analyze_keyword back in explain_option_name production.
In commit 2c92edad48, I broke "EXPLAIN
(ANALYZE)" syntax, because I mistakenly thought that ANALYZE/ANALYSE were
only partially reserved and thus would be included in NonReservedWord;
but actually they're fully reserved so they still need to be called out
here.

A nicer solution would be to demote these words to type_func_name_keyword
status (they can't be less than that because of "VACUUM [ANALYZE] ColId").
While that works fine so far as the core grammar is concerned, it breaks
ECPG's grammar for reasons I don't have time to isolate at the moment.
So do this for the time being.

Per report from Kevin Grittner.  Back-patch to 9.0, like the previous
commit.
2013-06-05 13:32:53 -04:00
Tom Lane 530acda4da Provide better message when CREATE EXTENSION can't find a target schema.
The new message (and SQLSTATE) matches the corresponding error cases in
namespace.c.

This was thought to be a "can't happen" case when extension.c was written,
so we didn't think hard about how to report it.  But it definitely can
happen in 9.2 and later, since we no longer require search_path to contain
any valid schema names.  It's probably also possible in 9.1 if search_path
came from a noninteractive source.  So, back-patch to all releases
containing this code.

Per report from Sean Chittenden, though this isn't exactly his patch.
2013-06-04 17:22:29 -04:00
Tom Lane 5c7603c318 Add ARM64 (aarch64) support to s_lock.h.
Use the same gcc atomic functions as we do on newer ARM chips.
(Basically this is a copy and paste of the __arm__ code block,
but omitting the SWPB option since that definitely won't work.)

Back-patch to 9.2.  The patch would work further back, but we'd also
need to update config.guess/config.sub in older branches to make them
build out-of-the-box, and there hasn't been demand for it.

Mark Salter
2013-06-04 15:42:02 -04:00
Tom Lane dbc6eb1f4b Fix memory leak in LogStandbySnapshot().
The array allocated by GetRunningTransactionLocks() needs to be pfree'd
when we're done with it.  Otherwise we leak some memory during each
checkpoint, if wal_level = hot_standby.  This manifests as memory bloat
in the checkpointer process, or in bgwriter in versions before we made
the checkpointer separate.

Reported and fixed by Naoya Anzai.  Back-patch to 9.0 where the issue
was introduced.

In passing, improve comments for GetRunningTransactionLocks(), and add
an Assert that we didn't overrun the palloc'd array.
2013-06-04 14:58:46 -04:00
Tom Lane 035a5e1e8c Add semicolons to eval'd strings to hide a minor Perl behavioral change.
"eval q{foo}" used to complain that the error was on line 2 of the eval'd
string, because eval internally tacked on "\n;" so that the end of the
erroneous command was indeed on line 2.  But as of Perl 5.18 it more
sanely says that the error is on line 1.  To avoid Perl-version-dependent
regression test results, use "eval q{foo;}" instead in the two places
where this matters.  Per buildfarm.

Since people might try to use newer Perl versions with older PG releases,
back-patch as far as 9.0 where these test cases were added.
2013-06-03 14:19:26 -04:00
Heikki Linnakangas 15386281a6 Put back allow_system_table_mods check in heap_create().
This reverts commit a475c60367.

Erik Rijkers reported back in January 2013 that after the patch, if you do
"pg_dump -t myschema.mytable" to dump a single table, and restore that in
a database where myschema does not exist, the table is silently created in
pg_catalog instead. That is because pg_dump uses
"SET search_path=myschema, pg_catalog" to set the schema the table is created
in. While allow_system_table_mods is not a very elegant solution to this,
we can't leave it as it is, so for now, revert it back to the way it was
previously.
2013-06-03 17:22:31 +03:00
Stephen Frost f129615fe7 Additional spelling corrections
A few more minor spelling corrections, no functional changes.

Thom Brown
2013-06-03 08:40:27 -04:00
Heikki Linnakangas e1e2bb34f1 Code review of recycling WAL segments in a restartpoint.
Seems cleaner to get the currently-replayed TLI in the same call to
GetXLogReplayRecPtr that we get the WAL position. Make it more clear in the
comment what the code does when recovery has already ended
(RecoveryInProgress() will set ThisTimeLineID in that case). Finally, make
resetting ThisTimeLineID afterwards more explicit.
2013-06-03 09:25:12 +03:00
Tom Lane 2c92edad48 Allow type_func_name_keywords in some places where they weren't before.
This change makes type_func_name_keywords less reserved than they were
before, by allowing them for role names, language names, EXPLAIN and COPY
options, and SET values for GUCs; which are all places where few if any
actual keywords could appear instead, so no new ambiguities are introduced.

The main driver for this change is to allow "COPY ... (FORMAT BINARY)"
to work without quoting the word "binary".  That is an inconsistency that
has been complained of repeatedly over the years (at least by Pavel Golub,
Kurt Lidl, and Simon Riggs); but we hadn't thought of any non-ugly solution
until now.

Back-patch to 9.0 where the COPY (FORMAT BINARY) syntax was introduced.
2013-06-02 20:09:20 -04:00
Tom Lane a149d8bd56 Fix unportable usage of isspace().
Must cast char argument to unsigned to avoid doing the wrong thing
with high-bit-set characters.  Oversight in commit
30b5ede715.
2013-06-01 13:58:23 -04:00
Stephen Frost c9fc28a7f1 Minor spelling fixes
Fix a few spelling mistakes.

Per bug report #8193 from Lajos Veres.
2013-06-01 10:18:59 -04:00
Stephen Frost 551938ae22 Post-pgindent cleanup
Make slightly better decisions about indentation than what pgindent
is capable of.  Mostly breaking out long function calls into one
line per argument, with a few other minor adjustments.

No functional changes; all whitespace.
pgindent ran cleanly (didn't change anything) after.
Passes all regressions.
2013-06-01 09:38:15 -04:00
Noah Misch 97c4d9b7c7 Don't emit non-canonical empty arrays in array_remove().
Dean Rasheed
2013-05-31 21:50:59 -04:00
Peter Eisentraut 01497e738e Add new source files to nls.mk 2013-05-31 20:03:39 -04:00
Peter Eisentraut 8b5a3998a1 Remove whitespace from end of lines 2013-05-30 21:05:07 -04:00
Peter Eisentraut d7eb6f46de Minor spell checking 2013-05-30 20:56:58 -04:00
Peter Eisentraut 97a11fd0e3 postgresql.conf.sample: Improve whitespace 2013-05-29 22:00:13 -04:00
Bruce Momjian 9af4159fce pgindent run for release 9.3
This is the first run of the Perl-based pgindent script.  Also update
pgindent instructions.
2013-05-29 16:58:43 -04:00
Robert Haas 6eb971bd64 Fix typo in comment.
Pavan Deolasee
2013-05-23 11:34:30 -04:00
Heikki Linnakangas e2ef289363 Print line number correctly in COPY.
When COPY uses the multi-insert method to insert a batch of tuples into the
heap at a time, an incorrect line number was printed if something went wrong in
inserting the index tuples (primary key failure, for example), or processing
after row triggers.

Fixes bug #8173 reported by Lloyd Albin. Backpatch to 9.2, where the multi-
insert code was added.
2013-05-23 07:49:59 -04:00
Simon Riggs 22a27ef113 After fast promotion use CHECKPOINT_FORCE
Not necessary for correctness, just to make
log_checkpoints output look less singular.

Requested by Fujii Masao
2013-05-21 21:27:12 +01:00
Simon Riggs 75a192638f Maintain ThisTimeLineID correctly in checkpointer
checkpointer needs to reset ThisTimeLineID after
a restartpoint to allow installing/recycling new
WAL files. If recovery has already ended this
would leave ThisTimeLineID set incorrectly and
so we must reset it otherwise later checkpoints
do not have the correct timeline.

Bug report by Heikki Linnakangas.
Further investigation by Heikki and myself.
2013-05-21 21:17:04 +01:00
Heikki Linnakangas 30b5ede715 Fix escaping in generated recovery.conf file.
In the primary_conninfo line that "pg_basebackup -R" generates, single
quotes in parameter values need to be escaped into \\'; the libpq parser
requires the quotes to be escaped into \', and recovery.conf parser requires
the \ to be escaped into \\.

Also, don't quote parameter values unnecessarily, to make the connection
string prettier. Most options in a libpq connection string don't need
quoting.

Reported by Hari Babu, closer analysis by Zoltan Boszormenyi, although I
didn't use his patch.
2013-05-20 19:41:45 +03:00
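Note on 30b5ede715: the libpq half of the escaping, together with the "quote only when needed" rule, might look roughly like the Python sketch below. It follows the documented libpq connection-string rules (backslash-escape single quotes and backslashes, quote empty values and values containing whitespace). The function name is hypothetical, and the additional escaping layer required by the recovery.conf parser is deliberately not shown.

```python
def escape_conninfo_value(value: str) -> str:
    """Return value in a form usable after 'keyword=' in a libpq conninfo string."""
    # Quoting is needed for the empty string and for values containing
    # whitespace, a single quote, or a backslash.
    needs_quoting = value == "" or any(c in " \t\n\r'\\" for c in value)
    if not needs_quoting:
        return value
    # Escape backslashes first so the backslashes added for quotes
    # are not doubled again.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"
```

A plain value such as 5432 passes through untouched, which keeps the generated connection string readable.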
Tom Lane 2af0971f35 Clarify documentation of EXPLAIN (TIMING OFF) option.
Clarify that this option doesn't suppress measurement of the statement's
total runtime.

Greg Smith
2013-05-19 22:03:32 -04:00
Simon Riggs d4337a0dcb Init crash recovery using the latest available TLI
This simplifies the handling of crashes after fast promotion and various
minor cases that can exist in short timing windows around that case.

Broad fix to bug reported by Michael Paquier on -hackers,
approach prompted by Heikki Linnakangas
2013-05-19 17:31:07 +01:00
Simon Riggs 1781744cfc Emit msg correctly for timeline-crossing crash 2013-05-19 17:00:18 +01:00
Simon Riggs c94dff4c3c Remove single space on end of a line in xlog.c
Michael Paquier
2013-05-19 15:38:47 +01:00
Heikki Linnakangas d0cab7903b Remove unused regression test files.
euc_* and mule_internal test cases were identical to the ones in
src/test/mb. sql_ascii didn't exist elsewhere, but has been broken since
2001, and doesn't seem very interesting anyway. drop.sql hasn't been used
since 2000, when regress.sh was removed.
2013-05-18 22:35:37 +03:00
Tom Lane 403bd6a18b Fix crash when trying to display a NOTIFY rule action.
Fixes oversight in commit 2ffa740be9.
Per report from Josh Kupershmidt.

I think we've broken this case before, so let's add a regression test
this time.
2013-05-16 16:47:26 -04:00
Tom Lane 6563fb2b45 Fix fd.c to preserve errno where needed.
PathNameOpenFile failed to ensure that the correct value of errno was
returned to its caller after a failure (because it incorrectly supposed
that free() can never change errno).  In some cases this would result
in a user-visible failure because an expected ENOENT errno was replaced
with something else.  Bogus EINVAL failures have been observed on OS X,
for example.

There were also a couple of places that could mangle an important value
of errno if FDDEBUG was defined.  While the usefulness of that debug
support is highly debatable, we might as well make it safe to use,
so add errno save/restore logic to the DO_DB macro.

Per bug #8167 from Nelson Minar, diagnosed by RhodiumToad.
Back-patch to all supported branches.
2013-05-16 15:04:31 -04:00
Tom Lane e7bfc7e42c Fix some uses of "the quick brown fox".
If we're going to quote a well-known pangram, we should quote it
accurately.  Per gripe from Thom Brown.
2013-05-16 12:30:41 -04:00
Tom Lane b142068622 Allow CREATE FOREIGN TABLE to include SERIAL columns.
The behavior is that the required sequence is created locally, which is
appropriate because the default expression will be evaluated locally.
Per gripe from Brad Nicholson that this case was refused with a confusing
error message.  We could have improved the error message but it seems
better to just allow the case.

Also, remove ALTER TABLE's arbitrary prohibition against being applied to
foreign tables, which was pretty inconsistent considering we allow it for
views, sequences, and other relation types that aren't even called tables.
This is needed to avoid breaking pg_dump, which sometimes emits column
defaults using separate ALTER TABLE commands.  (I think this can happen
even when the default is not associated with a sequence, so that was a
pre-existing bug once we allowed column defaults for foreign tables.)
2013-05-15 19:03:29 -04:00
Tom Lane e9c336c786 Fix handling of OID wraparound while in standalone mode.
If OID wraparound should occur while in standalone mode (unlikely but
possible), we want to advance the counter to FirstNormalObjectId not
FirstBootstrapObjectId.  Otherwise, user objects might be created with OIDs
in the system-reserved range.  That isn't immediately harmful but it poses
a risk of conflicts during future pg_upgrade operations.

Noted by Andres Freund.  Back-patch to all supported branches, since all of
them are supported sources for pg_upgrade operations.
2013-05-13 15:40:16 -04:00
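Note on e9c336c786: a simplified Python model of the intended wraparound behavior, not the actual OID generator. It assumes a 32-bit counter and FirstNormalObjectId = 16384, and it only models operation after initdb, when every new object is a user object.

```python
FIRST_NORMAL_OBJECT_ID = 16384   # assumed start of the user OID range
OID_MODULUS = 2 ** 32            # OIDs are modeled as unsigned 32-bit values


def next_oid(counter: int) -> int:
    """Return the OID that follows counter, skipping the reserved range."""
    candidate = (counter + 1) % OID_MODULUS
    if candidate < FIRST_NORMAL_OBJECT_ID:
        # After wraparound, restart at the first user OID. Restarting at a
        # lower, system-reserved value is what the commit fixes for
        # standalone mode.
        candidate = FIRST_NORMAL_OBJECT_ID
    return candidate
```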
Tom Lane 904af8db8a Fix handling of strict non-set functions with NULLs in set-valued inputs.
In a construct like "select plain_function(set_returning_function(...))",
the plain function is applied to each output row of the SRF successively.
If some of the SRF outputs are NULL, and the plain function is strict,
you'd expect to get NULL results for such rows ... but what actually
happened was that such rows were omitted entirely from the result set.
This was due to confusion of this case with what should happen for nested
set-returning functions; a strict SRF is indeed supposed to yield an empty
set for null input.  Per bug #8150 from Erwin Brandstetter.

Although this has been broken forever, we're not back-patching because
of the possibility that some apps out there expect the incorrect behavior.
This change should be listed as a possible incompatibility in the 9.3
release notes.
2013-05-12 13:08:12 -04:00
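Note on 904af8db8a: the difference between the old and new behavior can be modeled in a few lines of Python, with None standing in for SQL NULL and a list standing in for the rows produced by the set-returning function. Both helper names are hypothetical.

```python
def apply_strict_old(func, srf_rows):
    """Old behavior: rows with NULL input silently vanish from the result."""
    return [func(row) for row in srf_rows if row is not None]


def apply_strict_fixed(func, srf_rows):
    """Fixed behavior: a strict function yields NULL for NULL input,
    and the row is kept."""
    return [None if row is None else func(row) for row in srf_rows]
```

For the input rows [1, None, 3] and a function that adds one, the old behavior yields two rows and the fixed behavior yields three, with a NULL in the middle.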
Tom Lane 35d50b527a Fix to_number() to correctly ignore thousands separator when it's '.'.
The existing code in NUM_numpart_from_char has hard-wired logic to treat
'.' as decimal point, even when we're using a locale-aware format string
and the locale says that '.' is the thousands separator.  This results in
clearly wrong answers in FM mode (where we must be able to identify the
decimal point location), as per bug report from Patryk Kordylewski.

Since the initialization code in NUM_prepare_locale already sets up
Np->decimal as either the locale decimal-point string or "." depending
on which decimal-point format code was used, there's really no need to
have any extra logic at all in NUM_numpart_from_char: we only need to
test for a match to Np->decimal.

(Note: AFAICS there's nothing in here that explicitly checks for thousands
separators --- rather, any unmatched character is silently skipped over.
That's pretty bogus IMO but it's not the issue being complained of.)

This is a longstanding bug, but it's possible that some existing apps
are depending on '.' being recognized as decimal point even when using
a D format code.  Hence, no back-patch.  We should probably list this
as a potential incompatibility in the 9.3 release notes.
2013-05-11 16:35:03 -04:00
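Note on 35d50b527a: the simplified rule (match only the configured decimal-point string, keep digits, skip everything else) can be sketched in Python as follows. This is a toy model with a hypothetical function name; it ignores signs, format pictures, and the other things the real formatting code handles.

```python
from decimal import Decimal


def parse_number(text: str, decimal_point: str) -> Decimal:
    """Parse digits from text, treating only decimal_point as the decimal point."""
    pieces = []
    seen_decimal = False
    i = 0
    while i < len(text):
        if text[i] in "0123456789":
            pieces.append(text[i])
            i += 1
        elif not seen_decimal and text.startswith(decimal_point, i):
            # The first match of the decimal-point string marks the decimal point.
            pieces.append(".")
            seen_decimal = True
            i += len(decimal_point)
        else:
            # Any other character (for example a thousands separator)
            # is silently skipped, as the commit message notes.
            i += 1
    return Decimal("".join(pieces))
```

With "," as the decimal point, the "." in "1.234,56" is skipped as a thousands separator rather than being mistaken for the decimal point.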
Tom Lane 69cc60dcfd Guard against input_rows == 0 in estimate_num_groups().
This case doesn't normally happen, because the planner usually clamps
all row estimates to at least one row; but I found that it can arise
when dealing with relations excluded by constraints.  Without a defense,
estimate_num_groups() can return zero, which leads to divisions by zero
inside the planner as well as assertion failures in the executor.

An alternative fix would be to change set_dummy_rel_pathlist() to make
the size estimate for a dummy relation 1 row instead of 0, but that seemed
pretty ugly; and probably someday we'll want to drop the convention that
the minimum rowcount estimate is 1 row.

Back-patch to 8.4, as the problem can be demonstrated that far back.
2013-05-10 17:15:30 -04:00
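Note on 69cc60dcfd: the defensive idea is to clamp the row estimate before using it, so the group count can never come out as zero. The Python below is a deliberately crude stand-in for the planner's estimator; the min()-based formula and the function signatures are assumptions for illustration only.

```python
def clamp_row_est(nrows: float) -> float:
    """Force a row estimate to be a whole number of at least one row."""
    return max(1.0, float(round(nrows)))


def estimate_num_groups(input_rows: float, distinct_values: float) -> float:
    """Crude group-count estimate that never returns zero."""
    # Guard: dummy relations excluded by constraints can report zero rows.
    input_rows = clamp_row_est(input_rows)
    return clamp_row_est(min(distinct_values, input_rows))


def rows_per_group(input_rows: float, distinct_values: float) -> float:
    # Safe to divide: the group estimate is always at least 1.
    return clamp_row_est(input_rows) / estimate_num_groups(input_rows, distinct_values)
```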
Tom Lane 91715e8293 Fix management of fn_extra caching during repeated GiST index scans.
Commit d22a09dc70 introduced official support
for GiST consistentFns that want to cache data using the FmgrInfo fn_extra
pointer: the idea was to preserve the cached values across gistrescan(),
whereas formerly they'd been leaked.  However, there was an oversight in
that, namely that multiple scan keys might reference the same column's
consistentFn; the code would result in propagating the same cache value
into multiple scan keys, resulting in crashes or wrong answers.  Use a
separate array instead to ensure that each scan key keeps its own state.

Per bug #8143 from Joel Roller.  Back-patch to 9.2 where the bug was
introduced.
2013-05-09 23:09:04 -04:00
Peter Eisentraut bd98852cbd Remove make_keywords
It is not used anymore.
2013-05-09 22:21:43 -04:00
Tom Lane 284e28f228 Update collate.linux.utf8.out for ruleutils.c line-wrapping changes.
Missed in commit 62e666400d.
2013-05-08 22:47:33 -04:00
Tom Lane a7b965382c Better fix for permissions tests in excluded subqueries.
This reverts the code changes in 50c137487c,
which turned out to induce crashes and not completely fix the problem
anyway.  That commit only considered single subqueries that were excluded
by constraint-exclusion logic, but actually the problem also exists for
subqueries that are appendrel members (ie part of a UNION ALL list).  In
such cases we can't add a dummy subpath to the appendrel's AppendPath list
without defeating the logic that recognizes when an appendrel is completely
excluded.  Instead, fix the problem by having setrefs.c scan the rangetable
an extra time looking for subqueries that didn't get into the plan tree.
(This approach depends on the 9.2 change that made set_subquery_pathlist
generate dummy paths for excluded single subqueries, so that the exclusion
behavior is the same for single subqueries and appendrel members.)

Note: it turns out that the appendrel form of the missed-permissions-checks
bug exists as far back as 8.4.  However, since the practical effect of that
bug seems pretty minimal, consensus is to not attempt to fix it in the back
branches, at least not yet.  Possibly we could back-port this patch once
it's gotten a reasonable amount of testing in HEAD.  For the moment I'm
just going to revert the previous patch in 9.2.
2013-05-08 16:59:58 -04:00
Heikki Linnakangas 2ffa66f497 Fix walsender failure at promotion.
If a standby server has a cascading standby server connected to it, it's
possible that WAL has already been sent up to the next WAL page boundary,
splitting a WAL record in the middle, when the first standby server is
promoted. Don't throw an assertion failure or error in walsender if that
happens.

Also, fix a variant of the same bug in pg_receivexlog: if it had already
received WAL on previous timeline up to a segment boundary, when the
upstream standby server is promoted so that the timeline switch record falls
on the previous segment, pg_receivexlog would miss the segment containing
the timeline switch. To fix that, have walsender send the position of the
timeline switch at end-of-streaming, in addition to the next timeline's ID.
It was previously assumed that the switch happened exactly where the
streaming stopped.

Note: this is an incompatible change in the streaming protocol. You might
get an error if you try to stream over timeline switches, if the client is
running 9.3beta1 and the server is more recent. It should be fine after a
reconnect, however.

Reported by Fujii Masao.
2013-05-08 20:30:17 +03:00
Heikki Linnakangas cb953d8b1b Use the term "radix tree" instead of "suffix tree" for SP-GiST text opclass.
What we have implemented is a radix tree (or a radix trie or a patricia
trie), but the docs and code comments incorrectly called it a "suffix tree".

Alexander Korotkov
2013-05-08 14:34:26 +03:00
Tom Lane 817a89423f Stamp 9.3beta1. 2013-05-06 16:57:06 -04:00
Tom Lane 1d6c72a55b Move materialized views' is-populated status into their pg_class entries.
Previously this state was represented by whether the view's disk file had
zero or nonzero size, which is problematic for numerous reasons, since it's
breaking a fundamental assumption about heap storage.  This was done to
allow unlogged matviews to revert to unpopulated status after a crash
despite our lack of any ability to update catalog entries post-crash.
However, this poses enough risk of future problems that it seems better to
not support unlogged matviews until we can find another way.  Accordingly,
revert that choice as well as a number of existing kluges forced by it
in favor of creating a pg_class.relispopulated flag column.
2013-05-06 13:27:22 -04:00
Tom Lane 5da5798004 Back out some recent translation updates.
Very old versions of msgfmt choke on these specific messages, for reasons
that are unclear at the moment.  Remove them so that we can ship a beta
release and not get complaints from testers (these messages will just go
untranslated, instead, and we're hardly at 100% coverage anyway).
Peter Eisentraut will look for a better fix later.
2013-05-06 12:28:13 -04:00
Tom Lane 3223b25ff7 Disallow unlogged materialized views.
The initial implementation of this feature was really unsupportable,
because it's relying on the physical size of an on-disk file to carry the
relation's populated/unpopulated state, which is at least a modularity
violation and could have serious long-term consequences.  We could say that
an unlogged matview goes to empty on crash, but not everybody likes that
definition, so let's just remove the feature for 9.3.  We can add it back
when we have a less klugy implementation.

I left the grammar and tab-completion support for CREATE UNLOGGED
MATERIALIZED VIEW in place, since it's harmless and allows delivering a
more specific error message about the unsupported feature.

I'm committing this separately to ease identification of what should be
reverted when/if we are able to re-enable the feature.
2013-05-06 12:00:06 -04:00
Simon Riggs b2ad82dafa Execute SET TRANSACTION SNAPSHOT during pg_dump
Previous coding set the SQL buffer but never executed it.

Bug noted by me during beta testing
2013-05-06 15:37:17 +01:00
Bruce Momjian 8b06e6aba8 Revert idea of zero-padding session id in log_line_prefix
Removal of doc adjustment and release note mention as well.
2013-05-06 08:59:39 -04:00
Peter Eisentraut 539ecc9241 Translation updates 2013-05-05 22:34:23 -04:00
Tom Lane 626e6eda4f Improve behavior of \watch with non-tuple-returning commands.
Print the command tag if we get PGRES_COMMAND_OK, and throw an error for
other cases.  Per gripe from Michael Paquier.

In passing, add an fflush(), just to be real sure the output appears
before we sleep.
2013-05-04 16:41:22 -04:00
Kevin Grittner b69ec7cc99 Prevent (auto)vacuum from truncating first page of populated matview.
Per report from Fujii Masao, with regression test using his example.
2013-05-02 17:33:03 -05:00
Andrew Dunstan 5f8b4319b9 Use correct length to convert json unicode escapes.
Bug reported on IRC - fix due to Andrew Gierth.
2013-05-01 18:47:18 -04:00
Tom Lane 50c137487c Fix permission tests for views/tables proven empty by constraint exclusion.
A view defined as "select <something> where false" had the curious property
that the system wouldn't check whether users had the privileges necessary
to select from it.  More generally, permissions checks could be skipped
for tables referenced in sub-selects or views that were proven empty by
constraint exclusion (although some quick testing suggests this seldom
happens in cases of practical interest).  This happened because the planner
failed to include rangetable entries for such tables in the finished plan.

This was noticed in connection with erroneous handling of materialized
views, but actually the issue is quite unrelated to matviews.  Therefore,
revert commit 200ba1667b in favor of a more
direct test for the real problem.

Back-patch to 9.2 where the bug was introduced (by commit
7741dd6590).
2013-05-01 18:26:50 -04:00
Kevin Grittner 200ba1667b Add regression test for bug fixed by recent refactoring.
Test case by Andres Freund for bug fixed by Tom Lane's refactoring
in commit 5194024d72
2013-04-30 15:02:43 -05:00
Simon Riggs ceabfb20f9 Bump PG_CONTROL_VERSION to 937 2013-04-30 13:27:47 +01:00
Simon Riggs 443951748c Record data_checksum_version in control file.
The value is not used anywhere in code, but will
allow future changes to the checksum version
should that become necessary in the future.
2013-04-30 12:27:12 +01:00
Simon Riggs 730924397c Ensure we MarkBufferDirty before visibilitymap_set()
logs the heap page and sets the LSN. Otherwise a
checkpoint could occur between those actions and
leave us in an inconsistent state.

Jeff Davis
2013-04-30 08:15:49 +01:00
Simon Riggs fdea2530bd Compiler optimizations for page checksum code.
Ants Aasma and Jeff Davis
2013-04-30 06:59:26 +01:00
Peter Eisentraut 187ca5e8e9 Revert "pg_ctl: Add idempotent option"
This reverts commit 8730618458.  The
behavior in certain cases is still being debated, and it's too late to
solve this before beta.
2013-04-29 21:55:12 -04:00
Tom Lane db9f0e1d9a Postpone creation of pathkeys lists to fix bug #8049.
This patch gets rid of the concept of, and infrastructure for,
non-canonical PathKeys; we now only ever create canonical pathkey lists.

The need for non-canonical pathkeys came from the desire to have
grouping_planner initialize query_pathkeys and related pathkey lists before
calling query_planner.  However, since query_planner didn't actually *do*
anything with those lists before they'd been made canonical, we can get rid
of the whole mess by just not creating the lists at all until the point
where we formerly canonicalized them.

There are several ways in which we could implement that without making
query_planner itself deal with grouping/sorting features (which are
supposed to be the province of grouping_planner).  I chose to add a
callback function to query_planner's API; other alternatives would have
required adding more fields to PlannerInfo, which while not bad in itself
would create an ABI break for planner-related plugins in the 9.2 release
series.  This still breaks ABI for anything that calls query_planner
directly, but it seems somewhat unlikely that there are any such plugins.

I had originally conceived of this change as merely a step on the way to
fixing bug #8049 from Teun Hoogendoorn; but it turns out that this fixes
that bug all by itself, as per the added regression test.  The reason is
that now get_eclass_for_sort_expr is adding the ORDER BY expression at the
end of EquivalenceClass creation not the start, and so anything that is in
a multi-member EquivalenceClass has already been created with correct
em_nullable_relids.  I am suspicious that there are related scenarios in
which we still need to teach get_eclass_for_sort_expr to compute correct
nullable_relids, but am not eager to risk destabilizing either 9.2 or 9.3
to fix bugs that are only hypothetical.  So for the moment, do this and
stop here.

Back-patch to 9.2 but not to earlier branches, since they don't exhibit
this bug for lack of join-clause-movement logic that depends on
em_nullable_relids being correct.  (We might have to revisit that choice
if any related bugs turn up.)  In 9.2, don't change the signature of
make_pathkeys_for_sortclauses nor remove canonicalize_pathkeys, so as
not to risk more plugin breakage than we have to.
2013-04-29 14:50:03 -04:00
Kevin Grittner 5fc893760f Ensure ANALYZE phase is not skipped because of canceled truncate.
Patch b19e4250b4 attempted to
preserve existing behavior regarding statistics generation in the
case that a truncation attempt was canceled due to lock conflicts.
It failed to do this accurately in two regards: (1) autovacuum had
previously generated statistics if the truncate attempt failed to
initially get the lock rather than having started the attempt, and
(2) the VACUUM ANALYZE command had always generated statistics.

Both of these changes were unintended, and are reverted by this
patch.  On review, there seems to be consensus that the previous
failure to generate statistics when the truncate was terminated
was more an unfortunate consequence of how that effort was
previously terminated than a feature we want to keep; so this
patch generates statistics even when an autovacuum truncation
attempt terminates early.  Another unintended change which is kept
on the basis that it is an improvement is that when a VACUUM
command is truncating, it will use the new heuristic for avoiding
blocking other processes, rather than keeping an
AccessExclusiveLock on the table for however long the truncation
takes.

Per multiple reports, with some renaming per patch by Jeff Janes.

Backpatch to 9.0, where problem was created.
2013-04-29 13:05:26 -05:00
Robert Haas 91fa8532f4 Attempt to fix error recovery in COPY BOTH mode.
Previously, libpq and the backend had opposite ideas about whether
it was necessary for the client to send a CopyDone message after
receiving an ErrorResponse, making it impossible to cleanly exit
COPY BOTH mode.  Fix libpq so that works correctly, adopting the
backend's notion that an ErrorResponse kills the copy in both
directions.

Adjust receivelog.c to avoid a degradation in the quality of the
resulting error messages.  libpqwalreceiver.c is already doing
the right thing, so no adjustment needed there.

Add an explicit statement to the documentation explaining how
this part of the protocol is supposed to work, in the hopes of
avoiding future confusion in this area.

Since the consequences of all this confusion are very limited,
especially in the back-branches where no client ever attempts
to exit COPY BOTH mode without closing the connection entirely,
no back-patch.
2013-04-29 06:29:32 -04:00
Simon Riggs 43e7a66849 Introduce new page checksum algorithm and module.
Isolate checksum calculation to its own module, so that bufpage
knows little if anything about the details of the calculation.

This implementation is a modified FNV-1a hash checksum, details
of which are given in the new checksum.c header comments.
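
For orientation, here is a minimal sketch of the standard, unmodified
32-bit FNV-1a hash.  It is not the modified variant this commit adds (the
checksum.c header comments describe that one); it only shows the basic
XOR-then-multiply step.  The function name is illustrative.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME_32 = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """Return the standard 32-bit FNV-1a hash of data."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte                            # FNV-1a: XOR the byte in first
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF  # then multiply, keeping 32 bits
    return h
```

With no input the function returns the offset basis unchanged, which makes
a convenient sanity check.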

Basic implementation only, so we fix the output value.

Later related commits will add version numbers to pg_control,
compiler optimization flags and memory barriers.

Ants Aasma, reviewed by Jeff Davis and Simon Riggs
2013-04-29 09:05:27 +01:00
Tom Lane f8db76e875 Editorialize a bit on new ProcessUtility() API.
Choose a saner ordering of parameters (adding a new input param after
the output params seemed a bit random), update the function's header
comment to match reality (cmon folks, is this really that hard?),
get rid of useless and sloppily-defined distinction between
PROCESS_UTILITY_SUBCOMMAND and PROCESS_UTILITY_GENERATED.
2013-04-28 00:18:45 -04:00
Tom Lane 5525e6c40b Fix unsafe event-trigger coding in ProcessUtility().
We mustn't run any of the event-trigger support code when handling
utility statements like START TRANSACTION or ABORT, because that code
may need to refresh event-trigger cache data, which requires being
inside a valid transaction.  (This mistake explains the consistent
build failures exhibited by the CLOBBER_CACHE_ALWAYS buildfarm members,
as well as some irreproducible failures on other members.)

The least messy fix seems to be to break standard_ProcessUtility into two
functions, one that handles all the statements not supported by event
triggers, and one that contains the event-trigger support code and handles
the statements that are supported by event triggers.

This change also fixes several inconsistencies, such as four cases where
support had been installed for "ddl_event_start" but not "ddl_event_end"
triggers, plus the fact that InvokeDDLCommandEventTriggersIfSupported()
paid no mind to isCompleteQuery.

Dimitri Fontaine and Tom Lane
2013-04-27 23:11:51 -04:00
Peter Eisentraut bbb4db4e04 pg_dump: Improve message formatting 2013-04-27 23:06:37 -04:00
Tom Lane 5194024d72 Incidental cleanup of matviews code.
Move checking for unscannable matviews into ExecOpenScanRelation, which is
a better place for it first because the open relation is already available
(saving a relcache lookup cycle), and second because this eliminates the
problem of telling the difference between rangetable entries that will or
will not be scanned by the query.  In particular we can get rid of the
not-terribly-well-thought-out-or-implemented isResultRel field that the
initial matviews patch added to RangeTblEntry.

Also get rid of entirely unnecessary scannability check in the rewriter,
and a bogus decision about whether RefreshMatViewStmt requires a parse-time
snapshot.

catversion bump due to removal of a RangeTblEntry field, which changes
stored rules.
2013-04-27 17:48:57 -04:00
Peter Eisentraut f5d576c6d2 Improve message about failed transaction log archiving
The old phrasing appeared to imply that the failure was terminal.
Improve that by indicating that archiving will be tried again later.
2013-04-26 22:43:54 -04:00
Tom Lane 41a2760f61 Fix collation assignment for aggregates with ORDER BY.
ORDER BY expressions were being treated the same as regular aggregate
arguments for purposes of collation determination, but really they should
not affect the aggregate's collation at all; only collations of the
aggregate's regular arguments should affect it.

In many cases this mistake would lead to incorrectly throwing a "collation
conflict" error; but in some cases the corrected code will silently assign
a different collation to the aggregate than before, for example
	agg(foo ORDER BY bar COLLATE "x")
which will now use foo's collation rather than "x" for the aggregate.
Given this risk and the lack of field complaints about the issue, it
doesn't seem prudent to back-patch.

In passing, rearrange code in assign_collations_walker so that we don't
need multiple copies of the standard logic for computing collation of a
node with children.  (Previously, CaseExpr duplicated the standard logic,
and we would have needed a third copy for Aggref without this change.)

Andrew Gierth and David Fetter
2013-04-26 15:48:53 -04:00
Joe Conway b42ea7981c Ensure that user created rows in extension tables get dumped if the table is explicitly requested, either with a -t/--table switch of the table itself, or by -n/--schema switch of the schema containing the extension table. Patch reviewed by Vibhor Kumar and Dimitri Fontaine.
Backpatched to 9.1, where the extension management facility was added.
2013-04-26 12:02:40 -07:00
Robert Haas 5eb7c4d364 libpq: Fix a few bits that didn't get the memo about COPY BOTH.
There's probably no real bug here at present, so not backpatching.
But it seems good to make these bits consistent with the rest of
libpq, so as to avoid future surprises.

Patch by me.  Review by Tom Lane.
2013-04-26 08:59:40 -04:00
Tom Lane c3d09b3bd2 Avoid deadlock between concurrent CREATE INDEX CONCURRENTLY commands.
There was a high probability of two or more concurrent C.I.C. commands
deadlocking just before completion, because each would wait for the others
to release their reference snapshots.  Fix by releasing the snapshot
before waiting for other snapshots to go away.
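
The ordering problem can be sketched with a toy model.  This is not the
backend's locking code: sessions are plain integers, and each one, at its
final step, waits until no other session still holds a reference snapshot.
The function name and the model are assumptions made for illustration.

```python
def deadlocks(n_sessions, release_before_waiting):
    """Return True if the sessions get stuck waiting on each other's snapshots."""
    sessions = set(range(n_sessions))
    # Snapshots still held once every session has reached its final wait.
    holding = set() if release_before_waiting else set(sessions)
    waiting = set(sessions)
    progressed = True
    while waiting and progressed:
        progressed = False
        for s in sorted(waiting):
            # A session can finish once no other session holds a snapshot.
            if not (holding - {s}):
                waiting.discard(s)
                holding.discard(s)  # finishing releases anything still held
                progressed = True
    return bool(waiting)
```

In this model two sessions that keep their snapshots while waiting never
make progress, whereas releasing first lets both finish.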

Per report from Paul Hinze.  Back-patch to all active branches.
2013-04-25 16:58:05 -04:00
Heikki Linnakangas 447b3174f5 Fix typo in comment.
Peter Geoghegan
2013-04-25 14:09:07 +03:00
Peter Eisentraut 6cf8462834 pg_basebackup: Add missing newlines at end of lines 2013-04-24 22:51:10 -04:00
Peter Eisentraut 4c0343d4af initdb: Improve some messages 2013-04-24 22:50:33 -04:00
Heikki Linnakangas 0c1a160a68 Add missing #include.
On non-Windows systems, sys/time.h was pulled in by portability/instr_time.h,
which pulled in time.h. We certainly should include time.h directly, since
we're using time(2), but the indirect include masked the problem on most
platforms.

Andres Freund
2013-04-24 19:14:28 +03:00
Kevin Grittner 63e20041a2 Fix assertion failure for REFRESH MATERIALIZED VIEW in PL.
This was due to incomplete implementation of rowcount reporting
for RMV, which was due to initial waffling on whether it should
be provided.  It seems unlikely to be a useful or universally
available number as more sophisticated techniques for maintaining
matviews are added, so remove the partial support rather than
completing it.

Per report of Jeevan Chalke, but with a different fix
2013-04-24 08:39:06 -05:00
Simon Riggs 2317a63328 Make fast promotion the default promotion mode.
Continue to allow a request for synchronous
checkpoints as a mechanism in case of problems.
2013-04-24 12:21:18 +01:00
Tom Lane ac63dca607 Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction.  However, plancache.c busily
saved or examined the search_path for every cached plan.  If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt.  This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.

This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 17:00:23 -04:00
Peter Eisentraut cc26ea9fe2 Clean up references to SQL92
In most cases, these were just references to the SQL standard in
general.  In a few cases, a contrast was made between SQL92 and later
standards -- those have been kept unchanged.
2013-04-20 11:04:41 -04:00
Tom Lane 6e481ebff6 Improve error message when an FDW doesn't support WHERE CURRENT OF.
If an FDW fails to take special measures with a CurrentOfExpr, we will
end up trying to execute it as an ordinary qual, which was being treated
as a purely internal failure condition.  Provide a more user-oriented
error message for such cases.
2013-04-19 16:14:56 -04:00
Peter Eisentraut acd5803053 Standardize spelling of "nonblocking"
Only adjusted the user-exposed messages and documentation, not all
source code comments.
2013-04-18 23:35:19 -04:00
Bruce Momjian d61dddba37 pgindent: add newline to die() so script line number is not reported on failure. 2013-04-16 10:30:35 -04:00
Heikki Linnakangas 87ae9e7265 Remove some unused and seldom used fields from RelationAmInfo.
This saves some memory from each index relcache entry. At least on a 64-bit
machine, it saves just enough to shrink a typical relcache entry's memory
usage from 2k to 1k. That's nice if you have a lot of backends and a lot of
indexes.
2013-04-16 15:07:58 +03:00
Peter Eisentraut c74d586d2f Fix function return type confusion
When parse_hba_line's return type was changed from bool to a pointer,
the MANDATORY_AUTH_ARG macro wasn't adjusted.
2013-04-15 22:38:08 -04:00
Andrew Dunstan d788121aba Mark json IO and extraction functions immutable.
Per complaint from Hubert Depesz Lubaczewski.

Catalog version bumped.
2013-04-15 21:46:25 -04:00
Andrew Dunstan 728ec9731f Correct handling of NULL arguments in json funcs.
Per gripe from Tom Lane.
2013-04-15 16:20:21 -04:00
Peter Eisentraut e08fdf1310 Add serial comma 2013-04-14 11:12:30 -04:00
Peter Eisentraut 8730618458 pg_ctl: Add idempotent option
This changes the behavior of the start and stop actions to exit
successfully if the server was already started or stopped.

This changes the default behavior of the start action:  Before, if the
server was already running, it would print a message and succeed.  Now,
that situation will result in an error.  When running in idempotent
mode, no message is printed and pg_ctl exits successfully.

It was considered to just make the idempotent behavior the default and
only option, but pg_upgrade needs the old behavior.
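
The start behavior described above can be summarized as a small decision
table.  The sketch below models only the outcome (exit status and
message); the function name and message strings are hypothetical and are
not taken from pg_ctl.

```python
def start_action(server_running: bool, idempotent: bool):
    """Model the outcome of a start request as (exit_status, message)."""
    if not server_running:
        return 0, "server starting"        # normal start
    if idempotent:
        return 0, ""                       # already running: succeed silently
    return 1, "server is already running"  # new default: report an error
```

The stop action would mirror this, with "already stopped" taking the place
of "already running".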
2013-04-13 23:42:42 -04:00
Peter Eisentraut ba66752d27 Fix sporadic rebuilds for .pc files
The build of .pc (pkg-config) files depends on all makefiles in use, and
in dependency tracking mode, the previous coding ended up including
/dev/null as a makefile.  Apparently, on some platforms the modification
time of /dev/null changes sporadically, and so the .pc files would end
up being rebuilt every so often.  Fix that by changing the makefile code
to do without using /dev/null.
2013-04-12 22:49:25 -04:00
Tom Lane 0b33790421 Clean up the mess around EXPLAIN and materialized views.
Revert the matview-related changes in explain.c's API, as per recent
complaint from Robert Haas.  The reason for these appears to have been
principally some ill-considered choices around having intorel_startup do
what ought to be parse-time checking, plus a poor arrangement for passing
it the view parsetree it needs to store into pg_rewrite when creating a
materialized view.  Do the latter by having parse analysis stick a copy
into the IntoClause, instead of doing it at runtime.  (On the whole,
I seriously question the choice to represent CREATE MATERIALIZED VIEW as a
variant of SELECT INTO/CREATE TABLE AS, because that means injecting even
more complexity into what was already a horrid legacy kluge.  However,
I didn't go so far as to rethink that choice ... yet.)

I also moved several error checks into matview parse analysis, and
made the check for external Params in a matview more accurate.

In passing, clean things up a bit more around interpretOidsOption(),
and fix things so that we can use that to force no-oids for views,
sequences, etc, thereby eliminating the need to cons up "oids = false"
options when creating them.

catversion bump due to change in IntoClause.  (I wonder though if we
really need readfuncs/outfuncs support for IntoClause anymore.)
2013-04-12 19:25:31 -04:00
Bruce Momjian 5003f94f66 pgindent: improve error messages
per suggestion from Gurjeet Singh
2013-04-12 15:25:33 -04:00
Bruce Momjian 8daa4e960e pgindent: fix downloading of BSD indent binary
Also fix accessing pgentab binary and tar.

Gurjeet Singh
2013-04-12 11:42:27 -04:00
Robert Haas f8a54e936b sepgsql: Enforce db_procedure:{execute} permission.
To do this, we add an additional object access hook type,
OAT_FUNCTION_EXECUTE.

KaiGai Kohei
2013-04-12 08:58:01 -04:00
Robert Haas d017bf41a3 Minor wording corrections for object-access hook stuff.
KaiGai Kohei
2013-04-12 08:40:02 -04:00
Bruce Momjian be55f3b859 Document that git_changelog needs updating for major version stamping. 2013-04-11 12:27:02 -04:00
Alvaro Herrera 6cd18a88b6 Remove quotes around SQL statement in error message 2013-04-11 12:00:09 -03:00
Alvaro Herrera 6a76edb188 Fix confusion between ObjectType and ObjectClass
Per report by Will Leinweber and Peter Eisentraut
2013-04-11 11:59:47 -03:00
Alvaro Herrera f62ab623ad Fix SIGUSR1 handling by unconnected bgworkers
Latch activity was not being detected by non-database-connected workers; the
SIGUSR1 signal handler which is normally in charge of that was set to SIG_IGN.
Create a simple handler to call latch_sigusr1_handler instead.

Robert Haas (bug report and suggested fix)
2013-04-10 16:01:16 -03:00
Alvaro Herrera 61a7d576f2 Fix SIGHUP handling by unconnected bgworkers
Add a SignalUnconnectedWorkers() call so that non-database-connected background
workers are also notified when postmaster is SIGHUPped.  Previously, only
database-connected workers were.

Michael Paquier (bug report and fix)
2013-04-10 15:59:45 -03:00
Robert Haas 4cff7b9dd6 Remove duplicate initialization in XLogReadRecord.
Per a note from Dickson S. Guedes.
2013-04-09 23:59:31 -04:00
Kevin Grittner 52e6e33ab4 Create a distinction between a populated matview and a scannable one.
The intent was that being populated would, long term, be just one
of the conditions which could affect whether a matview was
scannable; being populated should be necessary but not always
sufficient to scan the relation.  Since only CREATE and REFRESH
currently determine the scannability, names and comments
accidentally conflated these concepts, leading to confusion.

Also add missing locking for the SQL function which allows a
test for scannability, and fix a modularity violation.

Per complaints from Tom Lane, although it's not clear that these
will satisfy his concerns.  Hopefully this will at least better
frame the discussion.
2013-04-09 13:02:49 -05:00
Robert Haas 0bf42a5f3b Adjust ExplainOneQuery_hook_type to take a DestReceiver argument.
The materialized views patch adjusted ExplainOneQuery to take an
additional DestReceiver argument, but failed to add a matching
argument to the definition of ExplainOneQuery_hook.  This is a
problem for users of the hook that want to call ExplainOnePlan.
Fix by adding the missing argument.
2013-04-09 10:25:08 -04:00
Tom Lane 3ccae48f44 Support indexing of regular-expression searches in contrib/pg_trgm.
This works by extracting trigrams from the given regular expression,
in generally the same spirit as the previously-existing support for
LIKE searches, though of course the details are far more complicated.
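
As background, the sketch below shows only the simplest building block:
splitting one plain word into trigrams, using the padding convention the
pg_trgm documentation describes (two spaces prefixed, one suffixed,
case folded).  Deriving trigrams from a regular expression is the
complicated part and is not attempted here.  The function name is
illustrative.

```python
def word_trigrams(word: str) -> set:
    """Return the set of three-character substrings of a padded word."""
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}
```

An index search can then discard any row whose trigram set lacks a trigram
that every match must contain, and recheck the remaining rows against the
original pattern.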

Currently, only GIN indexes are supported.  We might be able to make
it work with GiST indexes later.

The implementation includes adding API functions to backend/regex/
to provide a view of the search NFA created from a regular expression.
These functions are meant to be generic enough to be supportable in
a standalone version of the regex library, should that ever happen.

Alexander Korotkov, reviewed by Heikki Linnakangas and Tom Lane
2013-04-09 01:06:54 -04:00
Simon Riggs e60d20a35e Minor rewording of README comments 2013-04-08 17:20:26 +01:00
Heikki Linnakangas 594041311c Fix calculation of how many segments to retain for wal_keep_segments.
KeepLogSeg function was broken when we switched to use a 64-bit int for the
segment number.
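
The commit message does not spell out the calculation, so the following is
only a sketch of what such a computation plausibly looks like, assuming
segment numbers are plain integers starting at 1.  The function and
parameter names are hypothetical.

```python
def oldest_segment_to_keep(current_segno: int, wal_keep_segments: int,
                           candidate: int) -> int:
    """Return the oldest segment number that must not be removed."""
    if wal_keep_segments == 0:
        return candidate                 # feature disabled, nothing to adjust
    if current_segno <= wal_keep_segments:
        keep_from = 1                    # avoid going below the first segment
    else:
        keep_from = current_segno - wal_keep_segments
    # Never allow removal of anything newer than the computed segment.
    return min(candidate, keep_from)
```

Python integers do not overflow, so the underflow guard matters only as
documentation here; with fixed-width unsigned arithmetic it is essential.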

Per report from Jeff Janes.
2013-04-08 16:29:56 +03:00
Simon Riggs 5787c6730e Skip extraneous locking in XLogCheckBuffer().
Heikki reported comment was wrong, so fixed
code to match the comment: we only need to
take additional locking precautions when we
have a shared lock on the buffer.
2013-04-08 09:11:49 +01:00
Simon Riggs 47c4333189 Avoid tricky race condition recording XLOG_HINT
We copy the buffer before inserting an XLOG_HINT to avoid WAL CRC errors
caused by concurrent hint writes to buffer while share locked. To make this work
we refactor RestoreBackupBlock() to allow an XLOG_HINT to avoid the normal
path for backup blocks, which assumes the underlying buffer is exclusive locked.
Resulting code completely changes layout of XLOG_HINT WAL records, but
this isn't even beta code, so this is a low impact change.
In passing, avoid taking WALInsertLock for full page writes on checksummed
hints, remove related cruft from XLogInsert() and improve xlog_desc record for
XLOG_HINT.

Andres Freund

Bug report by Fujii Masao, testing by Jeff Janes and Jaime Casanova,
review by Jeff Davis and Simon Riggs. Applied with changes from review
and some comment editing.
2013-04-08 08:52:39 +01:00
Simon Riggs a4b94b8515 README comments on checksums on page holes. 2013-04-08 08:42:52 +01:00
Simon Riggs 1be203519a Tune BufferGetLSNAtomic() when checksums !enabled
From performance analysis by Heikki Linnakangas
2013-04-07 22:37:39 +01:00
Simon Riggs cf8dc9e10c Fix checksums for CLUSTER, VACUUM FULL etc.
In CLUSTER, VACUUM FULL and ALTER TABLE SET TABLESPACE
I erroneously set checksum before log_newpage, which
sets the LSN and invalidates the checksum. So set
checksum immediately *after* log_newpage.

Bug report by Fujii Masao, fix and patch by Jeff Davis
2013-04-07 22:16:51 +01:00
Tom Lane faf4726c9f In isolationtester, retry after EINTR return from select(2).
Per report from Jaime Casanova.  Very curious that no one else has seen
this failure ... but the code is clearly wrong as-is.
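
The retry pattern itself is short.  The sketch below expresses it in
Python purely for illustration; the fix in this commit is to C code.
Python 3.5 and later already retry interrupted system calls internally
(PEP 475), so the example drives the loop with a stand-in callable rather
than a real select().  Both names are hypothetical.

```python
def retry_on_eintr(call, *args):
    """Call call(*args), retrying for as long as a signal interrupts it."""
    while True:
        try:
            return call(*args)
        except InterruptedError:
            # EINTR: nothing went wrong, the call was merely interrupted.
            continue


class FlakyCall:
    """Stand-in that is interrupted a fixed number of times, then succeeds."""

    def __init__(self, interruptions):
        self.remaining = interruptions
        self.attempts = 0

    def __call__(self):
        self.attempts += 1
        if self.remaining > 0:
            self.remaining -= 1
            raise InterruptedError("interrupted system call")
        return "ready"
```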
2013-04-06 22:28:49 -04:00
Robert Haas e965e6344c sepgsql: Enforce db_schema:search permission.
KaiGai Kohei, with comment and doc wordsmithing by me
2013-04-05 08:51:31 -04:00
Tom Lane 927e1dc96c Fix line count in slashUsage().
Counting newlines shows that quite a few recent patches have neglected
to update the output-lines count given to PageOutput().  Fortunately
it's not terribly critical that this be exact, since we long since
exceeded the height of most people's terminal windows.  Still, maybe
we ought to think of a way to not have to maintain this manually anymore.
2013-04-04 20:29:46 -04:00
Tom Lane c6a3fce7dd Add \watch [SEC] command to psql.
This allows convenient re-execution of commands.

Will Leinweber, reviewed by Peter Eisentraut, Daniel Farina, and Tom Lane
2013-04-04 19:56:59 -04:00
Andrew Dunstan e75feb2834 Fix off by one error in JSON extract path code.
Bug report by David Wheeler, diagnosis assistance from Tom Lane.
2013-04-04 18:26:52 -04:00
Bruce Momjian 48a2cd370e psql: fix startup crash caused by PSQLRC containing a tilde
'strdup' the PSQLRC environment variable value before calling a routine
that might free() it.

Backpatch to 9.2, where the bug first appeared.
2013-04-04 12:56:24 -04:00
Heikki Linnakangas bf2b0a1478 Fix crash on compiling a regular expression with more than 32k colors.
Throw an error instead.

Backpatch to all supported branches.
2013-04-04 19:48:11 +03:00
Heikki Linnakangas b8ed4cc962 Calculate # of semaphores correctly with --disable-spinlocks.
The old formula didn't take into account that each WAL sender process needs
a spinlock. We had also already exceeded the fixed number of spinlocks
reserved for misc purposes (10). Bump that to 30.

Backpatch to 9.0, where WAL senders were introduced. If I counted correctly,
9.0 had exactly 10 predefined spinlocks, and 9.1 exceeded that, but bump the
limit in 9.0 too because 10 is uncomfortably close to the edge.
2013-04-04 16:40:37 +03:00
Tom Lane f7b0006f42 Avoid updating our PgBackendStatus entry when track_activities is off.
The point of turning off track_activities is to avoid this reporting
overhead, but a thinko in commit 4f42b546fd
caused pgstat_report_activity() to perform half of its updates anyway.
Fix that, and also make sure that we clear all the now-disabled fields
when transitioning to the non-reporting state.
2013-04-03 14:13:28 -04:00
Tom Lane 845d335a90 Minor robustness improvements for isolationtester.
Notice and complain about PQcancel() failures.  Also, don't dump core if
an error PGresult doesn't contain severity and message subfields, as it
might not if it was generated by libpq itself.  (We have a longstanding
TODO item to improve that, but in the meantime isolationtester had better
cope.)

I tripped across the latter item while investigating a trouble report on
buildfarm member spoonbill.  As for the former, there's no evidence that
PQcancel failure is actually involved in spoonbill's problem, but it still
seems like a bad idea to ignore an error return code.
2013-04-02 21:15:37 -04:00
Tom Lane 17fe2793ea Fix insecure parsing of server command-line switches.
An oversight in commit e710b65c1c allowed
database names beginning with "-" to be treated as though they were secure
command-line switches; and this switch processing occurs before client
authentication, so that even an unprivileged remote attacker could exploit
the bug, needing only connectivity to the postmaster's port.  Assorted
exploits for this are possible, some requiring a valid database login,
some not.  The worst known problem is that the "-r" switch can be invoked
to redirect the process's stderr output, so that subsequent error messages
will be appended to any file the server can write.  This can for example be
used to corrupt the server's configuration files, so that it will fail when
next restarted.  Complete destruction of database tables is also possible.

Fix by keeping the database name extracted from a startup packet fully
separate from command-line switches, as had already been done with the
user name field.
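
The class of bug can be shown with a toy parser.  This is not the server's
argument handling: the parser, its single "-r" switch, and all names below
are assumptions made for illustration.  The point is only that a value
taken from the client must never travel through the same list as the
switches.

```python
def parse_backend_args(args):
    """Toy parser: '-rFILE' redirects stderr, the first other word is a name."""
    settings = {"stderr_file": None, "dbname": None}
    for arg in args:
        if arg.startswith("-r"):
            settings["stderr_file"] = arg[2:]
        elif settings["dbname"] is None:
            settings["dbname"] = arg
    return settings


def unsafe_startup(options, dbname):
    # The client-supplied name is appended to the switches and parsed with them.
    return parse_backend_args(options + [dbname])


def safe_startup(options, dbname):
    # The name is kept separate, so it is never interpreted as a switch.
    settings = parse_backend_args(options)
    settings["dbname"] = dbname
    return settings
```

With a database name of "-r/tmp/x", the unsafe variant ends up with a
stderr redirection, while the safe variant simply records an odd-looking
database name.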

The Postgres project thanks Mitsumasa Kondo for discovering this bug,
Kyotaro Horiguchi for drafting the fix, and Noah Misch for recognizing
the full extent of the danger.

Security: CVE-2013-1899
2013-04-01 14:00:51 -04:00
Tom Lane ce9ab88981 Make REPLICATION privilege checks test current user not authenticated user.
The pg_start_backup() and pg_stop_backup() functions checked the privileges
of the initially-authenticated user rather than the current user, which is
wrong.  For example, a user-defined index function could successfully call
these functions when executed by ANALYZE within autovacuum.  This could
allow an attacker with valid but low-privilege database access to interfere
with creation of routine backups.  Reported and fixed by Noah Misch.

Security: CVE-2013-1901
2013-04-01 13:09:24 -04:00
Peter Eisentraut 85079078ac Revert "ecpg: Don't link compatlib with libpq"
This reverts commit 3780fc679c.

HP-UX didn't like it.  There would probably be a way to fix that, but
since the net effect of all of this is zero because ecpg ends up using
libpq anyway, it's not worth bothering further.
2013-03-31 23:50:51 -04:00
Tom Lane d931ac0ec4 Ignore extra subquery outputs in set_subquery_size_estimates().
In commit 0f61d4dd1b, I added code to copy up
column width estimates for each column of a subquery.  That code supposed
that the subquery couldn't have any output columns that didn't correspond
to known columns of the current query level --- which is true when a query
is parsed from scratch, but the assumption fails when planning a view that
depends on another view that's been redefined (adding output columns) since
the upper view was made.  This results in an assertion failure or even a
crash, as per bug #8025 from lindebg.  Remove the Assert and instead skip
the column if its resno is out of the expected range.
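
A minimal sketch of the "skip instead of assert" approach follows, assuming
the subquery's outputs arrive as (resno, width) pairs with resno counted
from 1.  The function name and data layout are hypothetical and do not
mirror the planner's structures.

```python
def copy_up_widths(subquery_outputs, n_known_columns):
    """Copy width estimates for the columns the current query level knows."""
    widths = [0] * n_known_columns
    for resno, width in subquery_outputs:
        if resno < 1 or resno > n_known_columns:
            continue  # extra output column, e.g. from a redefined view: ignore
        widths[resno - 1] = width
    return widths
```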
2013-03-31 18:34:15 -04:00
Peter Eisentraut 64f890905f Add pkg-config files for libpq and ecpg libraries
This will hopefully be easier to use than pg_config for users who are
already used to the pkg-config interface.  It also works better for
multi-arch installations.

reviewed by Tom Lane
2013-03-31 16:58:40 -04:00
Peter Eisentraut 3780fc679c ecpg: Don't link compatlib with libpq
It doesn't actually use libpq.  But we need to keep libpq in the
CPPFLAGS for building, because compatlib uses ecpglib.h which uses
libpq-fe.h, but we don't need to refer to libpq for linking.

reviewed by Tom Lane
2013-03-31 16:51:00 -04:00
Tom Lane 22f7b9613e Improve code documentation about "magnetic disk" storage manager.
The modern incarnation of md.c is by no means specific to magnetic disk
technology, but every so often we hear from someone who's misled by the
label.  Try to clarify that it will work for anything that supports
standard filesystem operations.  Per suggestion from Andrew Dunstan.
2013-03-30 14:23:45 -04:00
Peter Eisentraut 602070f9cc ecpg: Parallel make fix
In some parallel make situations, the install-headers target could be
called before the installation directories are created by installdirs,
causing the installation to fail.  Fix that by making install-headers
depend on installdirs.
2013-03-29 21:39:55 -04:00
Andrew Dunstan a570c98d7f Add new JSON processing functions and parser API.
The JSON parser is converted into a recursive descent parser, and
exposed for use by other modules such as extensions. The API provides
hooks for all the significant parser events such as the beginning and end
of objects and arrays, and providing functions to handle these hooks
allows for fairly simple construction of a wide variety of JSON
processing functions. A set of new basic processing functions and
operators is also added, which use this API, including operations to
extract array elements, object fields, get the length of arrays and the
set of keys of a field, deconstruct an object into a set of key/value
pairs, and create records from JSON objects and arrays of objects.
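
The hook style can be illustrated with a short sketch.  It is only an
analogy: it walks the result of Python's json.loads instead of hooking
into a real recursive descent parser, and every class, method, and
function name is hypothetical rather than part of the new API.

```python
import json


class Hooks:
    """Callbacks fired while walking a JSON value; all default to no-ops."""
    def object_start(self): pass
    def object_end(self): pass
    def array_start(self): pass
    def array_end(self): pass
    def object_field(self, name): pass
    def scalar(self, value): pass


def walk(value, hooks):
    """Visit value recursively, firing the matching hook for each event."""
    if isinstance(value, dict):
        hooks.object_start()
        for name, child in value.items():
            hooks.object_field(name)
            walk(child, hooks)
        hooks.object_end()
    elif isinstance(value, list):
        hooks.array_start()
        for child in value:
            walk(child, hooks)
        hooks.array_end()
    else:
        hooks.scalar(value)


class TopLevelKeys(Hooks):
    """Collect the field names of the outermost object only."""
    def __init__(self):
        self.depth = 0
        self.keys = []
    def object_start(self): self.depth += 1
    def object_end(self): self.depth -= 1
    def array_start(self): self.depth += 1
    def array_end(self): self.depth -= 1
    def object_field(self, name):
        if self.depth == 1:
            self.keys.append(name)


def object_keys(text):
    """Return the top-level keys of a JSON object given as text."""
    collector = TopLevelKeys()
    walk(json.loads(text), collector)
    return collector.keys
```

A different processing function needs only a different Hooks subclass; the
walking logic stays the same.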

Catalog version bumped.

Andrew Dunstan, with some documentation assistance from Merlin Moncure.
2013-03-29 14:12:13 -04:00
Tom Lane aa02864f64 Must check indisready not just indisvalid when dumping from 9.2 server.
9.2 uses a kluge representation of "indislive"; we have to account for
that when examining pg_index.  Simplest solution is to check indisready
for 9.0 and 9.1 as well; that's harmless though unnecessary, so it's
not worth making a version distinction for.

Fixes oversight in commit 683abc73df,
as noted by Andres Freund.
2013-03-28 22:09:12 -04:00
Tom Lane ae7f1c3ef2 Update time zone data files to tzdata release 2013b.
DST law changes in Chile, Haiti, Morocco, Paraguay, some Russian areas.
Historical corrections for numerous places.
2013-03-28 15:25:48 -04:00
Tom Lane 58bc48179b Avoid "variable might be clobbered by longjmp" warning.
On older-model gcc, the original coding of UTILITY_BEGIN_QUERY() can
draw this error because of multiple assignments to _needCleanup.
Rather than mark that variable volatile, we can suppress the warning
by arranging to have just one unconditional assignment before PG_TRY.
2013-03-28 13:19:49 -04:00
Alvaro Herrera 473ab40c8b Add sql_drop event for event triggers
This event takes place just before ddl_command_end, and is fired if and
only if at least one object has been dropped by the command.  (For
instance, DROP TABLE IF EXISTS of a table that does not in fact exist
will not lead to such a trigger firing).  Commands that drop multiple
objects (such as DROP SCHEMA or DROP OWNED BY) will cause a single event
to fire.  Some firings might be surprising, such as
ALTER TABLE DROP COLUMN.

The trigger is fired after the drop has taken place, because that has
been deemed the safest design, to avoid exposing possibly-inconsistent
internal state (system catalogs as well as current transaction) to the
user function code.  This means that careful tracking of object
identification is required during the object removal phase.

Like other currently existing events, there is support for tag
filtering.

To support the new event, add a new pg_event_trigger_dropped_objects()
set-returning function, which returns a set of rows comprising the
objects affected by the command.  This is to be used within the user
function code, and is mostly modelled after the recently introduced
pg_identify_object() function.

Catalog version bumped due to the new function.

Dimitri Fontaine and Álvaro Herrera
Review by Robert Haas, Tom Lane
2013-03-28 13:05:48 -03:00
Simon Riggs 593c39d156 Revoke bc5334d867 2013-03-28 09:18:02 +00:00
Simon Riggs d139a5e26b Revoke 7a5a59d378 2013-03-28 09:12:55 +00:00
Tom Lane 0d1ecd6300 Reset OpenSSL randomness state in each postmaster child process.
Previously, if the postmaster initialized OpenSSL's PRNG (which it will do
when ssl=on in postgresql.conf), the same pseudo-random state would be
inherited by each forked child process.  The problem is masked to a
considerable extent if the incoming connection uses SSL encryption, but
when it does not, identical pseudo-random state is made available to
functions like contrib/pgcrypto.  The process's PID does get mixed into any
requested random output, but on most systems that still only results in 32K
or so distinct random sequences available across all Postgres sessions.
This might allow an attacker who has database access to guess the results
of "secure" operations happening in another session.

To fix, forcibly reset the PRNG after fork().  Each child process that has
need for random numbers from OpenSSL's generator will thereby be forced to
go through OpenSSL's normal initialization sequence, which should provide
much greater variability of the sequences.  There are other ways we might
do this that would be slightly cheaper, but this approach seems the most
future-proof against SSL-related code changes.
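
The effect of inherited generator state can be demonstrated without
forking.  The sketch below uses Python's random module as a stand-in (it
is not OpenSSL's generator and is not suitable for cryptography) and
copies the state by hand where fork() would copy it implicitly.  All names
are illustrative.

```python
import os
import random

# Stands in for generator state initialized before any child is created.
parent = random.Random(20130327)


def inherit(parent_rng):
    """Return a generator with a copy of the parent's state, as after fork()."""
    child = random.Random()
    child.setstate(parent_rng.getstate())
    return child


def reseed(rng):
    """Discard the inherited state by seeding from the OS entropy source."""
    rng.seed(os.urandom(32))
    return rng


# Two children that keep the inherited state produce the same output.
inherited_a = inherit(parent).getrandbits(128)
inherited_b = inherit(parent).getrandbits(128)

# Two children that reseed first diverge.
fresh_a = reseed(inherit(parent)).getrandbits(128)
fresh_b = reseed(inherit(parent)).getrandbits(128)
```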

This has been assigned CVE-2013-1900, but since the issue and the patch
have already been publicized on pgsql-hackers, there's no point in trying
to hide this commit.

Back-patch to all supported branches.

Marko Kreen
2013-03-27 18:50:21 -04:00
Heikki Linnakangas 3cfb572dde Fix buffer pin leak in heap update redo routine.
In a heap update, if the old and new tuple were on different pages, and the
new page no longer existed (because it was subsequently truncated away by
vacuum), heap_xlog_update forgot to release the pin on the old buffer. This
bug was introduced by the "Fix multiple problems in WAL replay" patch,
commit 3bbf668de9 (on master branch).

With full_page_writes=off, this triggered an "incorrect local pin count"
error later in replay, if the old page was vacuumed.

This fixes bug #7969, reported by Yunong Xiao. Backpatch to 9.0, like the
commit that introduced this bug.
2013-03-27 22:00:01 +02:00
Simon Riggs 7a5a59d378 Set recovery_config_directory for EXEC_BACKEND.
Remove comment questioning whether this is necessary for DataDir.
From buildfarm failures on Windows.
2013-03-27 16:35:38 +00:00
Heikki Linnakangas 7800a71291 Move some pg_dump functions around.
Move functions used only by pg_dump and pg_restore from dumputils.c to a new
file, pg_backup_utils.c. dumputils.c is linked into psql and some programs
in bin/scripts, so it seems good to keep it slim. The parallel functionality
is moved to parallel.c, as is exit_horribly, because the interesting code in
exit_horribly is parallel-related.

This refactoring gets rid of the on_exit_msg_func function pointer. It was
problematic, because a modern gcc version with -Wmissing-format-attribute
complained if it wasn't marked with PF_PRINTF_ATTRIBUTE, but the ancient gcc
version that Tom Lane's old HP-UX box has didn't accept that attribute on a
function pointer, and gave an error. We still use a similar function pointer
trick for getLocalPQBuffer() function, to use a thread-local version of that
in parallel mode on Windows, but that dodges the problem because it doesn't
take printf-like arguments.
2013-03-27 18:10:40 +02:00
Simon Riggs bc5334d867 Allow external recovery_config_directory
If required, recovery.conf can now be located outside of the data directory.
Server needs read/write permissions on this directory.
2013-03-27 11:45:42 +00:00
Tom Lane f7f210b5c4 Fix grammatical errors in some new message strings.
Daniele Varrazzo
2013-03-26 17:52:00 -04:00
Tom Lane 683abc73df Ignore invalid indexes in pg_dump.
Dumping invalid indexes can cause problems at restore time, for example
if the reason the index creation failed was because it tried to enforce
a uniqueness condition not satisfied by the table's data.  Also, if the
index creation is in fact still in progress, it seems reasonable to
consider it to be an uncommitted DDL change, which pg_dump wouldn't be
expected to dump anyway.

Back-patch to all active versions, and teach them to ignore invalid
indexes in servers back to 8.2, where the concept was introduced.

Michael Paquier
2013-03-26 17:43:19 -04:00
Heikki Linnakangas 625b237f79 Fix pg_dump against 9.1/9.2 servers.
The parallel pg_dump patch forgot to add relpages column to 9.1/9.2 version
of the getTables() query.

Reported by Bernd Helmle.
2013-03-26 15:44:26 +02:00
Heikki Linnakangas 901b89e37b Get rid of obsolete parse_version helper function.
For getting the server's version in numeric form, use PQserverVersion().
It does the exact same parsing as dumputils.c's parse_version(), and has
been around in libpq for a long time. For the client's version, just use
the PG_VERSION_NUM constant.
2013-03-26 15:32:02 +02:00
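Note on 901b89e37b: the numeric form returned by PQserverVersion() packs a 9.x-style version such as 9.2.4 into one integer (90204). A minimal Python sketch of that encoding, for illustration only; `version_to_num` is a hypothetical helper, not a function in libpq or pg_dump:

```python
def version_to_num(version: str) -> int:
    """Encode a 'major.minor.patch' string the way 9.x versions are numbered.

    Hypothetical helper: two decimal digits per component, so
    '9.2.4' becomes 90204. A missing patch level is treated as 0.
    """
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    major, minor, patch = parts[:3]
    return major * 10000 + minor * 100 + patch
```

With versions in this form, a check such as "server is 9.1 or later" is a plain integer comparison against 90100.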
Andrew Dunstan ec143f9405 Fix a small logic bug in adjusted parallel restore code. 2013-03-25 22:52:28 -04:00
Heikki Linnakangas 28ba260906 In base backup, only include our own tablespace version directory.
If you have clusters of different versions pointing to the same tablespace
location, we would incorrectly include all the data belonging to the other
versions, too.

Fixes bug #7986, reported by Sergey Burladyan.
2013-03-25 20:19:22 +02:00
Heikki Linnakangas d298b50a3b Make pg_basebackup work with pre-9.3 servers, and add server version check.
A new 'starttli' field was added to the response of BASE_BACKUP command.
Make pg_basebackup tolerate the case that it's missing, so that it still
works with older servers.

Add an explicit check for the server version, so that you get a nicer error
message if you try to use it with a pre-9.1 server.

The streaming protocol message format changed in 9.3, so -X stream still won't
work with pre-9.3 servers. I added a version check to ReceiveXLogStream()
earlier, but write that slightly differently, so that in 9.4, it will still
work with a 9.3 server. (In 9.4, the error message needs to be adjusted to
"9.3 or above", though). Also, if the version check fails, don't retry.
2013-03-25 19:44:11 +02:00
Heikki Linnakangas ea988ee8c8 Add PF_PRINTF_ATTRIBUTE to on_exit_msg_fmt.
Per warning from -Wmissing-format-attribute.
2013-03-25 10:06:03 +02:00
Heikki Linnakangas 4eefd0f86b Add missing #include.
time(2) requires time.h.
2013-03-25 09:55:43 +02:00
Tom Lane 846681fdd5 Fix some unportable constructs in parallel pg_dump code.
Didn't compile on semi-obsolete gcc, and probably not on not-gcc-at-all
either.
2013-03-24 15:35:37 -04:00
Andrew Dunstan 9e257a181c Add parallel pg_dump option.
New infrastructure is added which creates a set number of workers
(threads on Windows, forked processes on Unix). Jobs are then
handed out to these workers by the master process as needed.
pg_restore is adjusted to use this new infrastructure in place of the
old setup which created a new worker for each step on the fly. Parallel
dumps acquire a snapshot clone in order to stay consistent, if
available.

The parallel option is selected by the -j / --jobs command line
parameter of pg_dump.

Joachim Wieland, lightly editorialized by Andrew Dunstan.
2013-03-24 11:27:20 -04:00
Tom Lane 3b91fe185a Update time zone abbreviation lists for changes missed since 2006.
Most (all?) of Russia has moved to what's effectively year-round daylight
savings time, so that the "standard" zone names now mean an hour later
than they used to.  Update that, notably changing MSK as per recent
complaint from Sergey Konoplev, but also CHOT, GET, IRKT, KGT, KRAT,
MAGT, NOVT, OMST, VLAT, YAKT, YEKT.  The corresponding DST abbreviations
are presumably now obsolete, but I left them in place with their old
definitions, just to reduce any possible breakage from this change.

Also add VOLT (Europe/Volgograd), which for some reason we never had
before, as well as MIST (Antarctica/Macquarie), and fix obsolete
definitions of MAWT, TKT, and WST.
2013-03-23 19:17:46 -04:00
Tom Lane 6960277270 Semi-automatically detect changes in timezone abbreviations.
Add an option to zic.c to dump out all non-obsolete timezone abbreviations
defined in the Olson database.  Comparing this list to its previous state
will clue us in when something happens that we may need to account for in
the tznames/ time zone abbreviation lists.  The README file's previous
exhortation to "just grep for differences" was completely useless advice,
in my now-considerable experience; but maybe this will be a bit more
useful.  As a starting point I built the same list from the tzdata files
as they existed in 2006, which is committed here as known_abbrevs.txt.
Comparison indeed turned up quite a few changes we had neglected to account
for, which I will commit separately.
2013-03-23 19:17:44 -04:00
Andrew Dunstan b7f8465cc6 Avoid renaming data directory during MSVC upgrade testing.
This appears to cause some intermittent file system problems
on Windows 8. Instead, set up the old data directory in its
intended final location to start with.
2013-03-23 16:26:06 -04:00
Kevin Grittner 549dae0352 Fix problems with incomplete attempt to prohibit OIDS with MVs.
Problem with assertion failure in restoring from pg_dump output
reported by Joachim Wieland.

Review and suggestions by Tom Lane and Robert Haas.
2013-03-22 13:27:34 -05:00
Tom Lane 4912385b56 Suppress uninitialized-variable warning in new checksum code.
Some compilers understand that this coding is safe, and some don't.
2013-03-22 12:27:50 -04:00
Simon Riggs 9df56f6d91 Add new README file for pages/checksums 2013-03-22 14:21:58 +00:00
Simon Riggs 96ef3b8ff1 Allow I/O reliability checks using 16-bit checksums
Checksums are set immediately prior to flush out of shared buffers
and checked when pages are read in again. Hint bit setting will
require full page write when block is dirtied, which causes various
infrastructure changes. Extensive comments, docs and README.

WARNING message thrown if checksum fails on non-all zeroes page;
ERROR thrown but can be disabled with ignore_checksum_failure = on.

Feature enabled by an initdb option, since transition from option off
to option on is long and complex and has not yet been implemented.
Default is not to use checksums.

Checksum used is WAL CRC-32 truncated to 16 bits.

Simon Riggs, Jeff Davis, Greg Smith
Wide input and assistance from many community members. Thank you.
2013-03-22 13:54:07 +00:00
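Note on 96ef3b8ff1: a rough Python sketch of the set-on-flush, check-on-read idea. Everything concrete here is an assumption made for illustration: zlib's CRC-32 stands in for the WAL CRC, the low 16 bits are the ones kept, and the function names are hypothetical. The real page layout, including where the checksum is stored, is not modelled.

```python
import zlib


def page_checksum(page: bytes) -> int:
    """Return a 16-bit checksum for a page image.

    Assumption: zlib.crc32 is used as a stand-in CRC-32 and the
    low 16 bits are kept; the commit does not say which bits survive.
    """
    return zlib.crc32(page) & 0xFFFF


def verify_page(page: bytes, stored: int) -> bool:
    """Check a page read back from disk against its stored checksum.

    All-zeroes pages are accepted without a check, mirroring the
    commit's note that failures are only reported for other pages.
    """
    if not any(page):
        return True
    return page_checksum(page) == stored
```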
Simon Riggs 13fe298ca0 Change commit_delay to be SUSET for 9.3+
Prior to 9.3 the commit_delay affected only the current user,
whereas now only the group leader waits while holding the
WALWriteLock. Deliberate or accidental settings to a poor
value could seriously degrade performance for all users.
Privileges may be delegated by SECURITY DEFINER functions
for anyone that needs per-user settings in real situations.
Request for change from Peter Geoghegan
2013-03-22 12:01:16 +00:00
Tom Lane 9cbc4b80dd Redo postgres_fdw's planner code so it can handle parameterized paths.
I wasn't going to ship this without having at least some example of how
to do that.  This version isn't terribly bright; in particular it won't
consider any combinations of multiple join clauses.  Given the cost of
executing a remote EXPLAIN, I'm not sure we want to be very aggressive
about doing that, anyway.

In support of this, refactor generate_implied_equalities_for_indexcol
so that it can be used to extract equivalence clauses that aren't
necessarily tied to an index.
2013-03-21 19:44:32 -04:00
Heikki Linnakangas f897c4744f Fix "element <@ range" cost estimation.
The statistics-based cost estimation patch for range types broke that, by
incorrectly assuming that the left operand of all range operators is a
range. That led to a "type x is not a range type" error. Because it took so
long for anyone to notice, add a regression test for that case.

We still don't do proper statistics-based cost estimation for that, so you
just get a default constant estimate. We should look into implementing that,
but this patch at least fixes the regression.

Spotted by Tom Lane, when testing query from Josh Berkus.
2013-03-21 11:21:51 +02:00
Alvaro Herrera f8348ea32e Allow extracting machine-readable object identity
Introduce pg_identify_object(oid,oid,int4), which is similar in spirit
to pg_describe_object but instead produces a row of machine-readable
information to uniquely identify the given object, without resorting to
OIDs or other internal representation.  This is intended to be used in
the event trigger implementation, to report objects being operated on;
but it has usefulness of its own.

Catalog version bumped because of the new function.
2013-03-20 18:19:19 -03:00
Tom Lane a7921f71a3 Bump up timeout delays some more in timeouts isolation test.
The buildfarm members using -DCLOBBER_CACHE_ALWAYS still don't like this
test.  Some experimentation shows that on my machine, isolationtester's
query to check for "waiting" state takes 2 to 2.5 seconds to bind+execute
under -DCLOBBER_CACHE_ALWAYS.  Set the timeouts to 5 seconds to leave some
headroom for possibly-slower buildfarm critters.

Really we ought to fix the "waiting" query, which is not only horridly
slow but outright wrong in detail; and then maybe we can back off these
timeouts.  But right now I'm just trying to get the buildfarm green again.
2013-03-20 13:53:43 -04:00
Kevin Grittner 241139ae4b Use ORDER BY on matview definitions where needed for stable plans.
Per report from Hadi Moshayedi of matview regression test failure
with optimization of aggregates.  A few ORDER BY clauses improve
code coverage for matviews while solving that problem.
2013-03-19 10:33:37 -05:00
Simon Riggs bb7cc2623f Remove PageSetTLI and rename pd_tli to pd_checksum
Remove use of PageSetTLI() from all page manipulation functions
and adjust README to indicate change in the way we make changes
to pages. Repurpose those bytes into the pd_checksum field and
explain how that works in comments about page header.

Refactoring ahead of actual feature patch which would make use
of the checksum field, arriving later.

Jeff Davis, with comments and doc changes by Simon Riggs
Direction suggested by Robert Haas; many others providing
review comments.
2013-03-18 13:46:42 +00:00
Tom Lane 4c855750fc Increase timeout delays in new timeouts isolation test.
Buildfarm member friarbird doesn't like this test as-committed, evidently
because it's so slow that the test framework doesn't reliably notice that
the backend is waiting before the timeout goes off.  (This is not totally
surprising, since friarbird builds with -DCLOBBER_CACHE_ALWAYS.)  Increase
the timeout delay from 1 second to 2 in hopes of resolving that problem.
2013-03-17 23:01:20 -04:00
Robert Haas 05f3f9c7b2 Extend object-access hook machinery to support post-alter events.
This also slightly widens the scope of what we support in terms of
post-create events.

KaiGai Kohei, with a few changes, mostly to the comments, by me
2013-03-17 22:57:26 -04:00
Tom Lane 6ac7facdd3 Improve signal-handler lockout mechanism in timeout.c.
Rather than doing a fairly-expensive setitimer() call to prevent interrupts
from happening, let's just invent a simple boolean flag that the signal
handler is required to check.  This is not only faster but considerably
more robust than before, since the previous code effectively assumed that
only ITIMER_REAL events would ever fire the SIGALRM handler, which is
obviously something that can be broken easily by third-party code.

Zoltán Böszörményi and Tom Lane
2013-03-17 22:42:19 -04:00
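Note on 6ac7facdd3: the lockout described there replaces a system call with a flag that the handler itself consults. A minimal Python sketch of that pattern, assuming a hypothetical `AlarmGate` class; it leaves out anything timeout.c does to reschedule an interrupt that arrived while locked out:

```python
class AlarmGate:
    """Signal-handler lockout via a boolean flag (illustrative only)."""

    def __init__(self):
        self.enabled = False   # handler is locked out while False
        self.fired = []        # record of signals actually processed

    def handler(self, signum, frame):
        """Handler with the (signum, frame) signature Python expects."""
        if not self.enabled:
            return             # locked out: do nothing
        self.fired.append(signum)
```

The handler could be registered with `signal.signal(signal.SIGALRM, gate.handler)` on platforms that have SIGALRM. Because the check lives inside the handler, it holds no matter what caused the signal to be delivered.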
Tom Lane b1fae823ee Re-include pqsignal() in libpq.
We need this in non-ENABLE_THREAD_SAFETY builds, and also to satisfy
the exports.txt entry; while it might be a good idea to remove the
latter, I'm hesitant to do so except in the context of an intentional
ABI break.  At least we don't have a separately maintained source file
for it anymore.
2013-03-17 15:45:31 -04:00
Tom Lane e2a203a190 initdb needs pqsignal() even on Windows.
I had thought we weren't using this version of pqsignal() at all on
Windows, but that's wrong --- initdb is using it (and coping with the
POSIX-ish semantics of bare signal() :-().  So allow the file to be
built in WIN32+FRONTEND case, and add it to the MSVC build logic.
2013-03-17 15:19:47 -04:00
Tom Lane c68b5eff13 Fix inclusions in pg_receivexlog.c.
Apparently this was depending on pqsignal.h for <signal.h>.
Not sure why I didn't see the failure on my other machine.
2013-03-17 14:11:48 -04:00
Tom Lane da5aeccf64 Move pqsignal() to libpgport.
We had two copies of this function in the backend and libpq, which was
already pretty bogus, but it turns out that we need it in some other
programs that don't use libpq (such as pg_test_fsync).  So put it where
it probably should have been all along.  The signal-mask-initialization
support in src/backend/libpq/pqsignal.c stays where it is, though, since
we only need that in the backend.
2013-03-17 12:06:42 -04:00
Tom Lane d43837d030 Add lock_timeout configuration parameter.
This GUC allows limiting the time spent waiting to acquire any one
heavyweight lock.

In support of this, improve the recently-added timeout infrastructure
to permit efficiently enabling or disabling multiple timeouts at once.
That reduces the performance hit from turning on lock_timeout, though
it's still not zero.

Zoltán Böszörményi, reviewed by Tom Lane,
Stephen Frost, and Hari Babu
2013-03-16 23:22:57 -04:00
Peter Eisentraut d2bef5f7db pg_resetxlog: Capitalize placeholder in --help output 2013-03-16 21:47:52 -04:00
Peter Eisentraut ea1aee88e3 pg_controldata: Undo message spelling change 2013-03-16 21:47:10 -04:00
Tom Lane dcafdbcde1 Improve error reporting in code that checks for buffer refcount leaks.
Formerly we just Assert'ed that each refcount was zero, which was quick
and easy but failed to provide a good overview of what was wrong.
Change the code so that we'll call PrintBufferLeakWarning() for each
buffer with a nonzero refcount, and then Assert at the end of the loop.
This costs nothing in runtime and might ease diagnosis of some bugs.

Greg Smith, reviewed by Satoshi Nagayasu, further tweaked by me
2013-03-15 12:26:26 -04:00
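Note on dcafdbcde1: the change is to report every leaked buffer first and assert only once, after the loop. A small Python sketch of that shape; the function names and message text are hypothetical, and `refcounts` is just a list of per-buffer pin counts:

```python
def report_buffer_leaks(refcounts):
    """Return one warning string per buffer whose refcount is nonzero."""
    return [
        f"buffer refcount leak: buffer {buf_id} has refcount {count}"
        for buf_id, count in enumerate(refcounts)
        if count != 0
    ]


def check_for_buffer_leaks(refcounts):
    """Print all leak warnings, then assert that there were none."""
    warnings = report_buffer_leaks(refcounts)
    for w in warnings:
        print(w)
    # Single assertion at the end, so every leak is reported before failing.
    assert not warnings, f"{len(warnings)} buffer(s) still pinned"
```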
Tom Lane 73e7025bd8 Extend format() to handle field width and left/right alignment.
This change adds some more standard sprintf() functionality to format().

Pavel Stehule, reviewed by Dean Rasheed and Kyotaro Horiguchi
2013-03-14 22:56:56 -04:00
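Note on 73e7025bd8: field width and the `-` left-alignment flag follow the usual sprintf() conventions. Python's `%` operator uses the same conventions, so it serves here as a stand-in to show the effect; this does not call PostgreSQL's format():

```python
# Width 10, default right alignment: value is padded on the left.
right_aligned = "|%10s|" % "foo"

# Width 10 with the '-' flag: value is padded on the right.
left_aligned = "|%-10s|" % "foo"

print(right_aligned)
print(left_aligned)
```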
Tom Lane 1a1832eb08 Avoid inserting no-op Limit plan nodes.
This was discussed in connection with the patch to avoid inserting no-op
Result nodes, but not actually implemented therein.
2013-03-14 15:11:05 -04:00
Kevin Grittner fb60e7296c Revert unnecessary change in MV call to checkRuleResultList().
Due to a misreading of the function's comment block, there was an
unneeded change to a call in rewriteDefine.c.  There is, in fact
no reason to pass false for a MV; it should be true just like a
view.

Fixes issue pointed out by Tom Lane
2013-03-14 13:59:52 -05:00
Kevin Grittner 8d7ff13ed5 Add regression test for MV join to view.
This would have caught a bug in the initial patch, and seems like
a good thing to test going forward.

Per bug report by Erik Rijkers and fix by Tom Lane
2013-03-14 13:34:51 -05:00
Heikki Linnakangas f7559c0101 Also update psqlscan.l with the UESCAPE error rule changes.
Even though this patch had no user-visible difference, better keep the code
in psqlscan.l in sync with the backend lexer. And of course it's nice to shrink
the psql binary, too. Ecpg's version of the lexer doesn't have the error
rule, it doesn't try to avoid backing up, so it doesn't need to be modified.

As reminded by Tom Lane
2013-03-14 20:31:27 +02:00
Tom Lane 4387cf956b Avoid inserting Result nodes that only compute identity projections.
The planner sometimes inserts Result nodes to perform column projections
(ie, arbitrary scalar calculations) above plan nodes that lack projection
logic of their own.  However, we did that even if the lower plan node was
in fact producing the required column set already; which is a pretty common
case given the popularity of "SELECT * FROM ...".  Measurements show that
the useless plan node adds non-negligible overhead, especially when there
are many columns in the result.  So add a check to avoid inserting a Result
node unless there's something useful for it to do.

There are a couple of remaining places where unnecessary Result nodes
could get inserted, but they are (a) much less performance-critical,
and (b) coded in such a way that it's hard to avoid inserting a Result,
because the desired tlist is changed on-the-fly in subsequent logic.
We'll leave those alone for now.

Kyotaro Horiguchi; reviewed and further hacked on by Amit Kapila and
Tom Lane.
2013-03-14 13:43:18 -04:00
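Note on 4387cf956b: the added check amounts to "skip the projection step when the lower node already emits exactly the wanted columns, in order". A simplified Python sketch, assuming columns can be compared as plain strings; the planner compares target-list expressions, and `needs_result_node` is a hypothetical name:

```python
def needs_result_node(child_output, target_list):
    """Return True only if a projection would change the child's output.

    Both arguments are sequences of column expressions, modelled
    here as strings. An identical list means an identity projection.
    """
    return list(child_output) != list(target_list)
```

For a `SELECT * FROM t` over a scan that already returns every column of `t` in table order, the two lists match and no extra node is needed.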
Heikki Linnakangas a5ff502fce Change the way UESCAPE is lexed, to reduce the size of the flex tables.
The error rule used to avoid backtracking with the U&'...' UESCAPE 'x'
syntax bloated the flex tables, so refactor that. This patch makes the error
rule shorter, by introducing a new exclusive flex state that's entered after
parsing U&'...'. This shrinks the postgres binary by about 220kB.
2013-03-14 19:04:43 +02:00
Heikki Linnakangas 59d0bf9dca Add cost estimation of range @> and <@ operators.
The estimates are based on the existing lower bound histogram, and a new
histogram of range lengths.

Bump catversion, because the range length histogram now needs to be present
in statistic slot kind 6, or you get an error on @> and <@ queries. (A
re-ANALYZE would be enough to fix that, though)

Alexander Korotkov, with some refactoring by me.
2013-03-14 15:36:56 +02:00
Peter Eisentraut 788bce13d3 Add regression tests for XML mapping of domains
Pavel Stěhule
2013-03-13 22:42:57 -04:00
Kevin Grittner a18b72adcd Fix bug in dumping prior releases due to MV REFRESH dependency checking.
Reports and suggested patches from Fujii Masao and Andrew Dunstan.

Andrew Dunstan
2013-03-13 20:20:32 -05:00
Tom Lane a0c6dfeecf Allow default expressions to be attached to columns of foreign tables.
There's still some discussion about exactly how postgres_fdw ought to
handle this case, but there seems no debate that we want to allow defaults
to be used for inserts into foreign tables.  So remove the core-code
restrictions that prevented it.

While at it, get rid of the special grammar productions for CREATE FOREIGN
TABLE, and instead add explicit FEATURE_NOT_SUPPORTED error checks for the
disallowed cases.  This makes the grammar a shade smaller, and more
importantly results in much more intelligible error messages for
unsupported cases.  It's also one less thing to fix if we ever start
supporting constraints on foreign tables.
2013-03-12 17:37:07 -04:00
Tom Lane 41eef0ff75 Fix thinko in matview patch.
"break" instead of "continue" suppressed view expansion for views appearing
later in the range table.  Per report from Erikjan Rijkers.

While at it, improve the associated comment a bit.
2013-03-11 12:00:24 -04:00
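Note on 41eef0ff75: the effect of `break` versus `continue` in a loop over range-table entries can be shown with a toy model. The entry kinds and the rule "skip matviews" are assumptions for illustration, as is the function name:

```python
def expand_views(range_table, use_break):
    """Collect the names of views that would be expanded.

    range_table is a list of (kind, name) pairs. Entries of kind
    'matview' are skipped; use_break selects the buggy behaviour.
    """
    expanded = []
    for kind, name in range_table:
        if kind == "matview":
            if use_break:
                break      # bug: abandons all later entries too
            continue       # fix: skip only this entry
        if kind == "view":
            expanded.append(name)
    return expanded
```

With the entries `v1` (view), `mv` (matview), `v2` (view), the `break` variant never reaches `v2`, while the `continue` variant expands both views.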
Andrew Dunstan 38fb4d978c JSON generation improvements.
This adds the following:

    json_agg(anyrecord) -> json
    to_json(any) -> json
    hstore_to_json(hstore) -> json (also used as a cast)
    hstore_to_json_loose(hstore) -> json

The last provides heuristic treatment of numbers and booleans.

Also, in json generation, if any non-builtin type has a cast to json,
that function is used instead of the type's output function.

Andrew Dunstan, reviewed by Steve Singer.

Catalog version bumped.
2013-03-10 17:35:36 -04:00
Peter Eisentraut 31531325a4 pg_ctl: Adjust nls.mk for split out of wait_error.c 2013-03-10 16:56:07 -04:00
Peter Eisentraut 74e629cb09 pg_basebackup: Add missing newlines to several error messages 2013-03-10 16:56:06 -04:00
Tom Lane 21734d2fb8 Support writable foreign tables.
This patch adds the core-system infrastructure needed to support updates
on foreign tables, and extends contrib/postgres_fdw to allow updates
against remote Postgres servers.  There's still a great deal of room for
improvement in optimization of remote updates, but at least there's basic
functionality there now.

KaiGai Kohei, reviewed by Alexander Korotkov and Laurenz Albe, and rather
heavily revised by Tom Lane.
2013-03-10 14:16:02 -04:00
Magnus Hagander 7f49a67f95 Report pg_hba line number and contents when users fail to log in
Instead of just reporting which user failed to log in, log both the
line number in the active pg_hba.conf file (which may not match reality
in case the file has been edited and not reloaded) and the contents of
the matching line (which will always be correct), to make it easier
to debug incorrect pg_hba.conf files.

The message to the client remains unchanged and does not include this
information, to prevent leaking security sensitive information.

Reviewed by Tom Lane and Dean Rasheed
2013-03-10 15:54:37 +01:00
Heikki Linnakangas 96443d1420 Forgot catversion bump in the SP-GiST adjacent support patch. 2013-03-08 17:12:38 +02:00
Heikki Linnakangas 23f10b6473 SP-GiST support of the range adjacent operator -|-
Alexander Korotkov, reviewed by Jeff Davis.
2013-03-08 15:03:19 +02:00
Heikki Linnakangas 2443a26b9b Remove unnecessary #ifdef FRONTEND check to choose between strdup and pstrdup.
The libpgcommon patch made that unnecessary, palloc and friends are now
available in frontend programs too, mapped to plain old malloc.

As pointed out by Alvaro Herrera.
2013-03-08 11:23:33 +02:00
Tom Lane a7b61d4f5a Fix infinite-loop risk in fixempties() stage of regex compilation.
The previous coding of this function could get into situations where it
would never terminate, because successive passes would re-add EMPTY arcs
that had been removed by the previous pass.  Rewrite the function
completely using a new algorithm that is guaranteed to terminate, and
also seems to be usually faster than the old one.  Per Tcl bugs 3604074
and 3606683.

Tom Lane and Don Porter
2013-03-07 11:51:03 -05:00
Heikki Linnakangas 7ccefe8610 Fix tli history file fetching, broken by the archive after crash recovery patch.
If we were about to enter archive recovery after crash recovery, we scanned
the archive for the latest tli history file, and set the recovery target
timeline to that. However, when we actually tried to read the history file,
we would not fetch the file from the archive, because we were not in archive
recovery yet.

To fix, make readTimeLineHistory and existsTimeLineHistory always fetch
the file from archive if archive recovery is requested, even if we're not in
archive recovery yet.

Backpatch to 9.2. Mitsumasa KONDO
2013-03-07 12:33:24 +02:00
Tom Lane 1908abc4a3 Arrange to cache FdwRoutine structs in foreign tables' relcache entries.
This saves several catalog lookups per reference.  It's not all that
exciting right now, because we'd managed to minimize the number of places
that need to fetch the data; but the upcoming writable-foreign-tables patch
needs this info in a lot more places.
2013-03-06 23:48:09 -05:00
Peter Eisentraut 9795113916 Add fe_memutils.c to nls.mk where used 2013-03-06 23:45:16 -05:00
Robert Haas f90cc26982 Code beautification for object-access hook machinery.
KaiGai Kohei
2013-03-06 20:53:25 -05:00
Peter Eisentraut f11af2bcab Adjust nls.mk for split out of wait_error.c 2013-03-06 20:26:14 -05:00
Tom Lane e11cb8ba2c Fix missing #include in commands/matview.h.
It needs parsenodes.h to be compilable regardless of previous headers.
2013-03-06 18:21:05 -05:00
Kevin Grittner c5bf7a2052 WAL-log the extension of a new empty MV heap which is being populated.
This page with no tuples is used to distinguish an MV containing a
zero-row resultset of its backing query from an MV which has not
been populated by its backing query.  Unless WAL-logged, recovery
and hot standby don't work correctly with what should be an empty
but scannable materialized view.

Fixes bugs reported by Fujii Masao in testing MVs on hot standby.
2013-03-06 17:15:34 -06:00
Kevin Grittner cfa3df3de1 Fix broken pg_dump for 9.0 and 9.1 caused by the MV patch.
Per report and suggestion from Bernd Helmle
2013-03-06 09:51:49 -06:00
Andrew Dunstan cd340ca89a Fix message typo. 2013-03-06 09:53:38 -05:00
Peter Eisentraut 71ea7e9737 pg_ctl: Add comma to message 2013-03-05 23:22:12 -05:00
Andrew Dunstan 0d147e43ad Remove dependency on the DLL of pythonxx.def file.
This confused Cygwin's make because of the colon in the path. The
DLL isn't likely to change under us so preserving the dependency
doesn't gain us much, and it's useful to be able to do a native
Windows build with the Cygwin mingw toolset.

Noah Misch.
2013-03-05 19:24:29 -05:00
Tom Lane 80b011ef0a Fix to_char() to use ASCII-only case-folding rules where appropriate.
formatting.c used locale-dependent case folding rules in some code paths
where the result isn't supposed to be locale-dependent, for example
to_char(timestamp, 'DAY').  Since the source data is always just ASCII
in these cases, that usually didn't matter ... but it does matter in
Turkish locales, which have unusual treatment of "i" and "I".  To confuse
matters even more, the misbehavior was only visible in UTF8 encoding,
because in single-byte encodings we used pg_toupper/pg_tolower which
don't have locale-specific behavior for ASCII characters.  Fix by providing
intentionally ASCII-only case-folding functions and using these where
appropriate.  Per bug #7913 from Adnan Dursun.  Back-patch to all active
branches, since it's been like this for a long time.
2013-03-05 13:02:30 -05:00
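Note on 80b011ef0a: an intentionally ASCII-only case-folding function maps only `a` to `z` and leaves every other character alone, so its result cannot depend on locale. A minimal Python sketch; `ascii_toupper` is a hypothetical name and not the function added to formatting.c:

```python
def ascii_toupper(text: str) -> str:
    """Upper-case ASCII letters only; all other characters pass through."""
    return "".join(
        chr(ord(ch) - 32) if "a" <= ch <= "z" else ch
        for ch in text
    )
```

For pure-ASCII input such as day names this always yields the same result, whatever rules a locale applies to "i" and "I".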
Kevin Grittner c8056592bc Bump catversion because of new function in the materialized view patch. 2013-03-05 05:32:03 -06:00
Tom Lane 542eeba269 Fix overflow check in tm2timestamp (this time for sure).
I fixed this code back in commit 841b4a2d5, but didn't think carefully
enough about the behavior near zero, which meant it improperly rejected
1999-12-31 24:00:00.  Per report from Magnus Hagander.
2013-03-04 15:13:31 -05:00
Peter Eisentraut 0ea1f6e98f psql: Let \l accept a pattern
reviewed by Satoshi Nagayasu
2013-03-04 15:17:40 +00:00
Kevin Grittner 54d6706ded Remove accidentally-committed .orig file. 2013-03-04 15:17:13 +00:00
Tom Lane bc61878682 Fix map_sql_value_to_xml_value() to treat domains like their base types.
This was already the case for domains over arrays, but not for domains
over certain built-in types such as boolean.  The special formatting
rules for those types should apply to domains over them as well.
Per discussion.

While this is a bug fix, it's also a behavioral change that seems likely
to trip up some applications.  So no back-patch.

Pavel Stehule
2013-03-03 19:32:22 -05:00
Kevin Grittner 3bf3ab8c56 Add materialized view relations.
A materialized view has a rule just like a view and a heap and
other physical properties like a table.  The rule is only used to
populate the table, references in queries refer to the
materialized data.

This is a minimal implementation, but should still be useful in
many cases.  Currently data is only populated "on demand" by the
CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW statements.
It is expected that future releases will add incremental updates
with various timings, and that a more refined concept of defining
what is "fresh" data will be developed.  At some point it may even
be possible to have queries use a materialized view in place of
references to underlying tables, but that requires the other
above-mentioned features to be working first.

Much of the documentation work by Robert Haas.
Review by Noah Misch, Thom Brown, Robert Haas, Marko Tiikkaja
Security review by KaiGai Kohei, with a decision on how best to
implement sepgsql still pending.
2013-03-03 18:23:31 -06:00
Tom Lane b15a6da292 Get rid of any toast table when converting a table to a view.
Also make sure other fields of the view's pg_class entry are appropriate
for a view; it shouldn't have relfrozenxid set for instance.

This ancient omission isn't believed to have any serious consequences in
versions 8.4-9.2, so no backpatch.  But let's fix it before it does bite
us in some serious way.  It's just luck that the case doesn't cause
problems for autovacuum.  (It did cause problems in 8.3, but that's out
of support.)

Andres Freund
2013-03-03 19:05:47 -05:00
Tom Lane 2b78d101d1 Fix SQL function execution to be safe with long-lived FmgrInfos.
fmgr_sql had been designed on the assumption that the FmgrInfo it's called
with has only query lifespan.  This is demonstrably unsafe in connection
with range types, as shown in bug #7881 from Andrew Gierth.  Fix things
so that we re-generate the function's cache data if the (sub)transaction
it was made in is no longer active.

Back-patch to 9.2.  This might be needed further back, but it's not clear
whether the case can realistically arise without range types, so for now
I'll desist from back-patching further.
2013-03-03 17:39:58 -05:00
Peter Eisentraut 1275b88f71 Exclude utils/probes.h and pg_trace.h from cpluspluscheck
They can include sys/sdt.h from SystemTap, which itself contains C++
code and so won't compile with a C++ compiler under extern "C" linkage.
2013-03-01 22:46:11 -05:00
Tom Lane a4d3a504e7 Eliminate memory leaks in plperl's spi_prepare() function.
Careless use of TopMemoryContext for I/O function data meant that repeated
use of spi_prepare and spi_freeplan would leak memory at the session level,
as per report from Christian Schröder.  In addition, spi_prepare
leaked a lot of transient data within the current plperl function's SPI
Proc context, which would be a problem for repeated use of spi_prepare
within a single plperl function call; and it wasn't terribly careful
about releasing permanent allocations in event of an error, either.

In passing, clean up some copy-and-pasteos in query-lookup error messages.

Alex Hunsaker and Tom Lane
2013-03-01 21:34:17 -05:00
Andrew Dunstan 63d283ecd0 Flush stderr and stdout in isolation tester.
This is a possibly vain attempt to fix a buffering issue
observed for some MSVC builds.
2013-02-27 19:13:07 -05:00
Heikki Linnakangas f70b1b2748 Fix MSVC build.
The new file in src/port needs to be listed in Mkvcbuild.pm as well.
2013-02-27 21:31:41 +02:00
Heikki Linnakangas 3a9e64aa0d Cannot use WL_SOCKET_WRITEABLE without WL_SOCKET_READABLE.
In copy-out mode, the frontend should not send any messages until the
backend has finished streaming, by sending a CopyDone message. I'm not sure
if it would be legal for the client to send a new query before receiving the
CopyDone message from the backend, but trying to support that would require
bigger changes to the backend code structure.

Fixes an assertion failure reported by Fujii Masao.
2013-02-27 19:28:51 +02:00
Heikki Linnakangas 5ddf38f21d Add standard file header comment to quotes.c. 2013-02-27 18:42:40 +02:00
Heikki Linnakangas 3d009e45bd Add support for piping COPY to/from an external program.
This includes backend "COPY TO/FROM PROGRAM '...'" syntax, and corresponding
psql \copy syntax. Like with reading/writing files, the backend version is
superuser-only, and in the psql version, the program is run in the client.

In passing, the psql \copy STDIN/STDOUT syntax is subtly changed: if
the stdin/stdout is quoted, it's now interpreted as a filename. For example,
"\copy foo from 'stdin'" now reads from a file called 'stdin', not from
standard input. Before this, there was no way to specify a filename called
stdin, stdout, pstdin or pstdout.

This creates a new function in pgport, wait_result_to_str(), which can
be used to convert the exit status of a process, as returned by wait(3),
to a human-readable string.

Etsuro Fujita, reviewed by Amit Kapila.
2013-02-27 18:22:31 +02:00
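As a rough illustration of what a helper like wait_result_to_str() does, the Python sketch below decodes a wait(3)-style status word into a message. The name describe_wait_status and the message wording are assumptions made for this example, not the pgport implementation, and the sketch assumes a Unix-like platform.

```python
import os
import signal


def describe_wait_status(status: int) -> str:
    """Turn a wait(3)-style status word into a human-readable string.

    Hypothetical helper; the message texts are illustrative only.
    """
    if os.WIFEXITED(status):
        # Normal termination: report the exit code.
        return f"child process exited with exit code {os.WEXITSTATUS(status)}"
    if os.WIFSIGNALED(status):
        # Killed by a signal: report the number and, if known, its name.
        signum = os.WTERMSIG(status)
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"child process was terminated by signal {signum}: {name}"
    return f"child process exited with unrecognized status {status}"
```

On Unix, a status returned by os.system() or os.waitpid() can be passed straight to describe_wait_status().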
Tom Lane 73dc003bee Add missing error check in regexp parser.
parseqatom() failed to check for an error return (NULL result) from its
recursive call to parsebranch(), and in consequence could crash with a
null-pointer dereference after an error return.  This bug has been there
since day one, but wasn't noticed before, probably because most error cases
in parsebranch() didn't actually lead to returning NULL.  Add the missing
error check, and also tweak parsebranch() to exit in a less indirect
fashion after a call to parseqatom() fails.

Report by Tomasz Karlik, fix by me.
2013-02-27 10:40:03 -05:00
Tom Lane c153530dc1 Install headers from the new src/include/common subdirectory.
This got missed in commit 8396447cdb.

Andres Freund
2013-02-26 15:27:30 -05:00
Heikki Linnakangas 0a4fe8a318 Remove the check for COPY TO STDIN and COPY FROM STDOUT from ecpg.
The backend grammar treats STDIN and STDOUT as completely interchangeable, so
that the above are accepted. Arguably that was a mistake in the backend grammar,
but it's not ecpg's business to second-guess that.
2013-02-26 19:33:15 +02:00
Heikki Linnakangas 2953cd6d17 Only quote libpq connection string values that need quoting.
There's no harm in excessive quoting per se, but it makes the strings nicer
to read. The values can get quite unwieldy, when they're first quoted
within single-quotes when included in the connection string, and then all
the single-quotes are escaped when the connection string is passed as a
shell argument.
2013-02-25 19:53:04 +02:00
Heikki Linnakangas 3dee636e04 Add -d option to pg_dumpall, for specifying a connection string.
Like with pg_basebackup and pg_receivexlog, it's a bit strange to call the
option -d/--dbname, when in fact you cannot pass a database name in it.

Original patch by Amit Kapila, heavily modified by me.
2013-02-25 19:39:10 +02:00
Heikki Linnakangas 691e595dd9 Add -d/--dbname option to pg_dump.
You could already pass a database name just by passing it as the last
option, without -d. This is an alias for that, like the -d/--dbname option
in psql and many other client applications. For consistency.
2013-02-25 19:39:04 +02:00
Andrew Dunstan a64e33f030 Redo MSVC build implementation for pg_xlogdump.
The previous commit didn't work on MSVC editions earlier than
Visual Studio 2011, apparently. This works by copying files into the
contrib directory, and making provision to clean them up, which should
work on all editions.
2013-02-25 12:00:53 -05:00
Heikki Linnakangas aa05c37e82 Add -d option to pg_basebackup and pg_receivexlog, for connection string.
Without this, there's no way to pass arbitrary libpq connection parameters
to these applications. It's a bit strange that the option is called
-d/--dbname, when in fact you can *not* pass a database name in it, but it's
consistent with other client applications where a connection string is also
passed using -d.

Original patch by Amit Kapila, heavily modified by me.
2013-02-25 14:59:33 +02:00
Andrew Dunstan 786170d74f Provide MSVC build setup for pg_xlogdump. 2013-02-24 20:28:42 -05:00
Peter Eisentraut ca9c666602 Correct tense in log message 2013-02-23 23:30:14 -05:00
Peter Eisentraut 4f36292669 Add quotes to messages 2013-02-22 23:33:07 -05:00
Alvaro Herrera 639ed4e84b Add pg_xlogdump contrib program
This program relies on rm_desc backend routines and the xlogreader
infrastructure to emit human-readable rendering of WAL records.

Author: Andres Freund, with many reworks by Álvaro
Reviewed (in a much earlier version) by Peter Eisentraut
2013-02-22 16:56:55 -03:00
Alvaro Herrera af0a4c5924 Blind attempt at fixing the non-MSVC Windows builds
Apparently, they need -DBUILDING_DLL for the Assert() declarations to
work correctly.
2013-02-22 11:51:15 -03:00
Heikki Linnakangas 6c4f6664b2 Fix thinko in previous commit.
We must still initialize minRecoveryPoint if we start straight with archive
recovery, e.g. when recovering from a normal base backup taken with
pg_start/stop_backup. Otherwise we never consider the system consistent.
2013-02-22 13:12:43 +02:00
Heikki Linnakangas abf5c5c9a4 If recovery.conf is created after "pg_ctl stop -m i", do crash recovery.
If you create a base backup using an atomic filesystem snapshot, and try to
perform PITR starting from that base backup, or if you just kill a master
server and create recovery.conf to put it into standby mode, we don't know
how far we need to recover before reaching consistency. Normally in crash
recovery, we replay all the WAL present in pg_xlog, and assume that we're
consistent after that. And normally in archive recovery, minRecoveryPoint,
backupEndRequired, or backupEndPoint is set in the control file, indicating
how far we need to replay to reach consistency. But if the server was
previously up and running normally, and you kill -9 it or take an atomic
filesystem snapshot, none of those fields are set in the control file.

The solution is to perform crash recovery first, replaying all the WAL in
pg_xlog. After that's done, we assume that the system is consistent like in
normal crash recovery, and switch to archive recovery mode after that.

Per report from Kyotaro HORIGUCHI. In his scenario, recovery.conf was
created after "pg_ctl stop -m i". I'm not sure we need to support that exact
scenario, but we should support backing up using a filesystem snapshot,
which looks identical.

This issue goes back to at least 9.0, where hot standby was introduced and
we started to track when consistency is reached. In 9.1 and 9.2, we would
open up for hot standby too early, and queries could briefly see an
inconsistent state. But 9.2 made it more visible, as we started to PANIC if
we see a reference to a non-existing page during recovery, if we've already
reached consistency. This is a fairly big patch, so back-patch to 9.2 only,
where the issue is more visible. We can consider back-patching further after
this has received some more testing in 9.2 and master.
2013-02-22 12:32:41 +02:00
Alvaro Herrera a730183926 Move relpath() to libpgcommon
This enables non-backend code, such as pg_xlogdump, to use it easily.
The previous location, in src/backend/catalog/catalog.c, made that
essentially impossible because that file depends on many backend-only
facilities; so this needs to live separately.
2013-02-21 22:46:17 -03:00
Alvaro Herrera 6e3fd96463 Remove useless variable
Per Jeff Janes
2013-02-21 11:46:46 -03:00
Tom Lane 54a2786835 Need to decorate XactIsoLevel as PGDLLIMPORT for postgres_fdw.
Per buildfarm.
2013-02-21 09:28:42 -05:00
Tom Lane 699d70b2ec Teach MSVC build system about postgres_fdw.
Per buildfarm.
2013-02-21 06:43:15 -05:00
Tom Lane d0d75c4022 Add postgres_fdw contrib module.
There's still a lot of room for improvement, but it basically works,
and we need this to be present before we can do anything much with the
writable-foreign-tables patch.  So let's commit it and get on with testing.

Shigeru Hanada, reviewed by KaiGai Kohei and Tom Lane
2013-02-21 05:27:16 -05:00
Heikki Linnakangas f435cd1d38 Fix pg_dumpall with database names containing =
If a database name contained a '=' character, pg_dumpall failed. The problem
was in the way pg_dumpall passes the database name to pg_dump on the
command line. If it contained a '=' character, pg_dump would interpret it
as a libpq connection string instead of a plain database name.

To fix, pass the database name to pg_dump as a connection string,
"dbname=foo", with the database name escaped if necessary.

Back-patch to all supported branches.
2013-02-20 17:08:54 +02:00
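The fix described above can be sketched in Python as follows. This is a minimal illustration, not the pg_dumpall code: the function names are hypothetical, and the set of characters treated as safe without quoting is an assumption. The escaping rule itself (wrap the value in single quotes and backslash-escape single quotes and backslashes) follows the libpq keyword=value connection string syntax.

```python
def conninfo_quote(value: str) -> str:
    """Quote a value for a libpq keyword=value connection string.

    Hypothetical helper. Values made up only of ASCII letters, digits,
    '_' and '.' are returned unchanged (assumed-safe set); anything else,
    including the empty string, is wrapped in single quotes.
    """
    safe = value != "" and all(
        c.isascii() and (c.isalnum() or c in "_.") for c in value
    )
    if safe:
        return value
    # Backslashes first, so the escapes added for quotes are not doubled.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def dbname_conninfo(dbname: str) -> str:
    """Build an unambiguous 'dbname=...' argument for a database name."""
    return "dbname=" + conninfo_quote(dbname)
```

With this, a database literally named a=b is passed as dbname='a=b', so the '=' inside the name can no longer be mistaken for a keyword separator.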
Heikki Linnakangas 2930c05634 Don't pass NULL to fprintf, if a bogus connection string is given to pg_dump.
Back-patch to all supported branches.
2013-02-20 16:33:24 +02:00
Heikki Linnakangas 5d6899dbae Fix yet another typo in comment.
Etsuro Fujita
2013-02-20 12:31:26 +02:00
Alvaro Herrera a40d09e27f Move ExceptionalCondition back to postgres.h
It needs to be defined in the backend even when assertions are not
enabled.  It's cleaner to put it back, than create a separate #ifdef
section in c.h.

Per trouble report from Jeff Janes
2013-02-18 18:53:32 -03:00
Alvaro Herrera 187492b6c2 Split pgstat file in smaller pieces
We now write one file per database and one global file, instead of
having the whole thing in a single huge file.  This reduces the I/O that
must be done when partial data is required -- which is all the time,
because each process only needs information on its own database anyway.
Also, the autovacuum launcher does not need data about tables and
functions in each database; having the global stats for all DBs is
enough.

Catalog version bumped because we have a new subdir under PGDATA.

Author: Tomas Vondra.  Some rework by Álvaro
Testing by Jeff Janes
Other discussion by Heikki Linnakangas, Tom Lane.
2013-02-18 18:12:52 -03:00
Peter Eisentraut 9475db3a4e Add ALTER ROLE ALL SET command
This generalizes the existing ALTER ROLE ... SET and ALTER DATABASE
... SET functionality to allow creating settings that apply to all users
in all databases.

reviewed by Pavel Stehule
2013-02-17 23:45:36 -05:00
Bruce Momjian 17f1523932 Warn about initdb using mount-points
Add code to detect and warn about trying to initdb or create pg_xlog on
mount points.
2013-02-16 18:52:50 -05:00
Heikki Linnakangas 1bd42cd70a Better fix for "unarchived WAL files get deleted on crash recovery" bug.
Revert my earlier fix for the bug that unarchived WAL files get deleted on
crash recovery, commit c9cc7e05c6. We create
a .done file for files streamed or restored from archive, so the WAL file
recycling logic used during normal operation works just as well during
archive recovery.

Per Fujii Masao's suggestion.
2013-02-15 19:33:31 +02:00
Simon Riggs c2f79ba269 Force archive_status of .done for xlogs created by dearchival/replication.
This is a forward-patch of commit 6f4b8a4f4f,
applied to 9.2 back in August. The plan was to do something else in master,
but it looks like it's not going to happen, so let's just apply the 9.2
solution to master as well.

Fujii Masao
2013-02-15 19:28:06 +02:00
Heikki Linnakangas c9cc7e05c6 Don't delete unarchived WAL files during crash recovery.
Bug reported by Jehan-Guillaume (ioguix) de Rorthais. This was introduced
with the change to keep WAL files restored from archive in pg_xlog, in 9.2.
2013-02-15 17:43:59 +02:00
Peter Eisentraut 8e6c8da16a pgindent: Fix order in instructions
The previous order of steps didn't literally work, because git clean
-fdx would delete the downloaded typedefs.list.  Also, pgindent needs to
be called with a path when one is at the top of the build tree.
2013-02-14 21:40:05 -05:00
Tom Lane fdaf44862b Invent pre-commit/pre-prepare/pre-subcommit events for xact callbacks.
Currently it's only possible for loadable modules to get control during
post-commit cleanup of a transaction.  That doesn't work too well if they
want to do something that could throw an error; for example, an FDW might
need to issue a remote commit, which could well fail.  To improve matters,
extend the existing APIs for XactCallback and SubXactCallback functions
to provide new pre-commit events for this purpose.

The release notes will need to mention that existing callback functions
should be checked to make sure they don't do something unwanted when one
of the new event types occurs.  In the examples within our source tree,
contrib/sepgsql was fine but plpgsql had been a bit too cute.
2013-02-14 20:35:08 -05:00
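The caution about existing callbacks can be illustrated with a toy Python model. Everything here is hypothetical (the enum, the callbacks, and the log strings); it only shows why a callback that treats "anything other than abort" as a commit misbehaves once a pre-commit event is delivered as well.

```python
from enum import Enum, auto


class XactEvent(Enum):
    # Event names modeled loosely on the commit message; illustrative only.
    PRE_COMMIT = auto()
    COMMIT = auto()
    ABORT = auto()


def careless_callback(event, log):
    # Treats every non-abort event as a commit, so the new pre-commit
    # event triggers the commit cleanup a second time.
    if event is XactEvent.ABORT:
        log.append("rollback cleanup")
    else:
        log.append("commit cleanup")


def careful_callback(event, log):
    # Acts only on the events it understands and ignores the rest.
    if event is XactEvent.ABORT:
        log.append("rollback cleanup")
    elif event is XactEvent.COMMIT:
        log.append("commit cleanup")


def run_commit(callback):
    """Fire the events of one successful commit, pre-commit first."""
    log = []
    for event in (XactEvent.PRE_COMMIT, XactEvent.COMMIT):
        callback(event, log)
    return log
```

In this model, run_commit(careless_callback) records the commit cleanup twice, while run_commit(careful_callback) records it once.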
Tom Lane 71627f3d19 Fix CVE-2013-0255 properly.
Revert commit ab0f7b6089 (in HEAD only)
in favor of the proper solution, which is to declare enum_recv() correctly
in the system catalogs.  It should be declared to take type "internal"
not "cstring".

Also improve the type_sanity regression test, which should have caught
this typo, so that it actually would.  Most of the relevant checks on
the signature of type I/O functions should not have been restricted to
basetypes/pseudotypes, as they should apply to any type's I/O functions.
2013-02-13 16:20:01 -05:00
Tom Lane cd89965aab Fix bogus when-to-deregister-from-listener-array logic.
Since a backend adds itself to the global listener array during
Exec_ListenPreCommit, it's inappropriate for it to remove itself during
Exec_UnlistenCommit or Exec_UnlistenAllCommit --- that leads to failure
when committing a transaction that did UNLISTEN then LISTEN, since we end
up not registered though we should be.  (This leads to missing later
notifications, or to Assert failures in assert-enabled builds.)  Instead
deal with deregistering at the bottom of AtCommit_Notify, when we know the
final state of the listenChannels list.

Also, simplify the representation of registration status by replacing the
transient backendHasExecutedInitialListen flag with an amRegisteredListener
flag.

Per report from Greg Sabino Mullane.  Back-patch to 9.0, where the problem
was introduced during the LISTEN/NOTIFY rewrite.
2013-02-13 12:48:05 -05:00
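A toy Python model of the UNLISTEN-then-LISTEN case may make the ordering problem easier to see. The class and its fields are hypothetical stand-ins (registered models membership in the global listener array, channels models the listenChannels list); the deregister_early switch selects between deregistering while processing UNLISTEN and deregistering once at the end of commit.

```python
class Backend:
    """Toy model of one backend's LISTEN state; not the real data structures."""

    def __init__(self):
        self.channels = set()    # channels this backend listens on
        self.registered = False  # membership in the global listener array

    def commit(self, actions, deregister_early):
        # Pre-commit phase: any pending LISTEN registers the backend.
        if any(kind == "LISTEN" for kind, _ in actions):
            self.registered = True

        # Commit phase: apply the pending actions in order.
        for kind, channel in actions:
            if kind == "LISTEN":
                self.channels.add(channel)
            else:  # "UNLISTEN"
                self.channels.discard(channel)
                if deregister_early and not self.channels:
                    # Too early: a later LISTEN in the same transaction
                    # leaves us listening but unregistered.
                    self.registered = False

        # Deciding at the bottom, once the final channel list is known.
        if not deregister_early and not self.channels:
            self.registered = False
```

Committing [("UNLISTEN", "a"), ("LISTEN", "a")] on a backend already listening on "a" ends with the channel present in both variants, but only the deregister_early=False variant leaves the backend registered.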
Heikki Linnakangas fdf9e21196 Update visibility map in the second phase of vacuum.
There's a high chance that a page becomes all-visible when the second phase
of vacuum removes all the dead tuples on it, so it makes sense to check for
that. Otherwise the visibility map won't get updated until the next vacuum.

Pavan Deolasee, reviewed by Jeff Janes.
2013-02-13 17:52:10 +02:00
Alvaro Herrera 0e81ddde2c Rename "string" pstrdup argument to "in"
The former name collides with a symbol also used in the isolation test's
parser, causing assorted failures on certain platforms.
2013-02-12 12:43:09 -03:00
Alvaro Herrera 0f980b0e17 Don't build libpgcommon_srv.a just yet
It's empty, and some archivers do not support that case.
2013-02-12 12:21:27 -03:00
Alvaro Herrera 8396447cdb Create libpgcommon, and move pg_malloc et al to it
libpgcommon is a new static library to allow sharing code among the
various frontend programs and backend; this lets us eliminate duplicate
implementations of common routines.  We avoid libpgport, because that's
intended as a place for porting issues; per discussion, it seems better
to keep them separate.

The first use case, and the only one implemented by this patch, is pg_malloc
and friends, which many frontend programs were already using.

At the same time, we can use this to provide palloc emulation functions
for the frontend; this way, some palloc-using files in the backend can
also be used by the frontend cleanly.  To do this, we change palloc() in
the backend to be a function instead of a macro on top of
MemoryContextAlloc().  This was previously believed to cause loss of
performance, but this implementation has been tweaked by Tom and Andres
so that on modern compilers it provides a slight improvement over the
previous one.

This lets us clean up some places that already had localized hacks.

Most of the pg_malloc/palloc changes in this patch were authored by
Andres Freund. Zoltán Böszörményi also independently provided a form of
that.  libpgcommon infrastructure was authored by Álvaro.
2013-02-12 11:21:05 -03:00
Peter Eisentraut 0cb1fac3b1 Add noreturn attributes to some error reporting functions 2013-02-12 07:13:22 -05:00
Heikki Linnakangas 62401db45c Support unlogged GiST index.
The reason this wasn't supported before was that GiST indexes need an
increasing sequence to detect concurrent page-splits. In a regular WAL-
logged GiST index, the LSN of the page-split record is used for that
purpose, and in a temporary index, we can get away with a backend-local
counter. Neither of those methods works for an unlogged relation.

To provide such an increasing sequence of numbers, create a "fake LSN"
counter that is saved and restored across shutdowns. On recovery, unlogged
relations are blown away, so the counter doesn't need to survive that
either.

Jeevan Chalke, based on discussions with Robert Haas, Tom Lane and me.
2013-02-11 23:07:09 +02:00
Heikki Linnakangas b669f416ce Fix checkpoint after fast promotion.
The intention was to request a regular online checkpoint immediately after
end of recovery, when performing "fast promotion". However, because the
checkpoint was requested before other backends were allowed to write WAL,
the checkpointer process performed a restartpoint rather than a checkpoint.

Delay the RequestCheckPoint call until after recovery has truly ended, so
that you get a real checkpoint.
2013-02-11 22:22:08 +02:00
Heikki Linnakangas 7803e9327d Include previous TLI in end-of-recovery and shutdown checkpoint records.
This isn't used for anything but a sanity check at the moment, but it could
be highly valuable for debugging purposes. It could also be used to recreate
timeline history by traversing WAL, which seems useful.
2013-02-11 18:16:25 +02:00
Tom Lane c352ea2d74 Further cleanup of gistsplit.c.
After further reflection I was unconvinced that the existing coding is
guaranteed to return valid union datums in every code path for multi-column
indexes.  Fix that by forcing a gistunionsubkey() call at the end of the
recursion.  Having done that, we can remove some clearly-redundant calls
elsewhere.  This should be a little faster for multi-column indexes (since
the previous coding would uselessly do such a call for each column while
unwinding the recursion), as well as much harder to break.

Also, simplify the handling of cases where one side or the other of a
primary split contains only don't-care tuples.  The previous coding used a
very ugly hack in removeDontCares() that essentially forced one random
tuple to be treated as non-don't-care, providing a random initial choice of
seed datum for the secondary split.  It seems unlikely that that method
will give better-than-random splits.  Instead, treat such a split as
degenerate and just let the next column determine the split, the same way
that we handle fully degenerate cases where the two sides produce identical
union datums.
2013-02-10 16:21:26 -05:00
Tom Lane db3d7e9f0d Remove useless picksplit-doesn't-support-secondary-split log spam.
This LOG message was put in over five years ago with the evident
expectation that we'd make all GiST opclasses support secondary split
directly.  However, no such thing ever happened, and indeed the number of
opclasses supporting it decreased to zero in 9.2.  The reason is that
improving on the default implementation isn't that easy --- the
opclass-specific code that did exist, before 9.2, doesn't appear to have
been any improvement over the default.

Hence, remove the message altogether.  There's certainly no point in
nagging users about this in released branches, but I doubt that we'll
ever implement complete opclass-specific support anyway.
2013-02-10 13:07:40 -05:00
Tom Lane dacc185f52 Remove vestigial secondary-split support in gist_box_picksplit().
Not only is this implementation of secondary-split not better than the
default implementation in gistsplit.c, it's actually worse.  The gistsplit.c
code at least looks to see if switching the left and right sides would make
a better merge with the previously-split tuples, while this doesn't.

In any case it's rather useless to support secondary split only in an edge
case.  There used to be more complete support for it here (in chooseLR()),
but that was removed in commit 7f3bd86843.
It appears to me though that the chooseLR() code was really isomorphic to
the default implementation, since it was still based on choosing the cheaper
way of adding two sub-split vectors that had been chosen without regard to
the primary split initially.  I think an implementation of secondary split
that could beat the default implementation would have to be pretty fully
integrated into the split algorithm, not plastered on at the end.

Back-patch to 9.2, but not further; previous branches have the chooseLR()
code which I don't feel a great need to mess with.  This is mainly so we
just have two behaviors and not three among the various branches (IOW, this
patch is cleanup for commit 7f3bd86843e5aad84585a57d3f6b80db3c609916's
incomplete removal of secondary-split support).
2013-02-10 12:40:09 -05:00
Tom Lane 0fd0f3688b Document and clean up gistsplit.c.
Improve comments, rename some variables and functions, slightly simplify
a couple of APIs, in an attempt to make this code readable by people other
than its original author.

Even though this is essentially just cosmetic, back-patch to all active
branches, because otherwise it's going to make back-patching future fixes
in this file very painful.
2013-02-10 11:58:15 -05:00
Tom Lane a187c96d26 Reduce log level of picksplit-doesn't-support-secondary-split whining.
This was agreed to back in 2007, but never actually done.

Josh Hansen
2013-02-09 12:17:55 -05:00
Peter Eisentraut 0343a59d11 psql: Improve unaligned expanded output for zero rows
This used to erroneously print an empty line.  Now it prints nothing.
2013-02-09 00:11:58 -05:00
Peter Eisentraut 8ade58a4ea psql: Improve expanded print output in tuples-only mode
When there are zero result rows, in expanded mode, "(No rows)" is
printed.  So far, there was no way to turn this off.  Now, when
tuples-only mode is turned on, nothing is printed in this case.
2013-02-09 00:11:58 -05:00
Tom Lane c61e26ee3e Add support for ALTER RULE ... RENAME TO.
Ali Dar, reviewed by Dean Rasheed.
2013-02-08 23:58:40 -05:00
Tom Lane f806c191a3 Simplify box_overlap computations.
Given the assumption that a box's high coordinates are not less than its
low coordinates, the tests in box_ov() are overly complicated and can be
reduced to about half as much work.  Since many other functions in
geo_ops.c rely on that assumption, there doesn't seem to be a good reason
not to use it here.

Per discussion of Alexander Korotkov's GiST fix, which was already using
the simplified logic (in a non-fuzzy form, but the equivalence holds just
as well for fuzzy).
2013-02-08 18:26:08 -05:00
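The simplification can be checked with a small Python sketch. Both functions below are assumptions written for illustration, not quotes of geo_ops.c: overlap_long is a case-by-case test of the kind the message calls overly complicated, overlap_short is the reduced form, and both use exact rather than fuzzy comparisons. The equivalence relies on every box satisfying high >= low in each dimension.

```python
import random
from typing import NamedTuple


class Box(NamedTuple):
    low_x: float
    low_y: float
    high_x: float
    high_y: float


def overlap_long(a: Box, b: Box) -> bool:
    """Case-by-case test: first decide which box reaches further."""
    x = ((a.high_x >= b.high_x and a.low_x <= b.high_x) or
         (b.high_x >= a.high_x and b.low_x <= a.high_x))
    y = ((a.high_y >= b.high_y and a.low_y <= b.high_y) or
         (b.high_y >= a.high_y and b.low_y <= a.high_y))
    return x and y


def overlap_short(a: Box, b: Box) -> bool:
    """Reduced test, valid when high >= low holds for both boxes."""
    return (a.low_x <= b.high_x and b.low_x <= a.high_x and
            a.low_y <= b.high_y and b.low_y <= a.high_y)


def random_box(rng: random.Random) -> Box:
    # Sort the coordinates so the high >= low assumption always holds.
    x1, x2 = sorted(rng.randint(0, 10) for _ in range(2))
    y1, y2 = sorted(rng.randint(0, 10) for _ in range(2))
    return Box(x1, y1, x2, y2)


def agree_on_random_boxes(trials: int = 2000, seed: int = 0) -> bool:
    """Return True if both tests agree on every randomly generated pair."""
    rng = random.Random(seed)
    return all(
        overlap_long(a, b) == overlap_short(a, b)
        for a, b in ((random_box(rng), random_box(rng)) for _ in range(trials))
    )
```

The short form does four comparisons per call where the long form may do up to eight, which matches the "about half as much work" remark.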
Tom Lane 3c29b196b0 Fix gist_box_same and gist_point_consistent to handle fuzziness correctly.
While there's considerable doubt that we want fuzzy behavior in the
geometric operators at all (let alone as currently implemented), nobody is
stepping forward to redesign that stuff.  In the meantime it behooves us
to make sure that index searches agree with the behavior of the underlying
operators.  This patch fixes two problems in this area.

First, gist_box_same was using fuzzy equality, but it really needs to use
exact equality to prevent not-quite-identical upper index keys from being
treated as identical, which for example would prevent an existing upper
key from being extended by an amount less than epsilon.  This would result
in inconsistent indexes.  (The next release notes will need to recommend
that users reindex GiST indexes on boxes, polygons, circles, and points,
since all four opclasses use gist_box_same.)

Second, gist_point_consistent used exact comparisons for upper-page
comparisons in ~= searches, when it needs to use fuzzy comparisons to
ensure it finds all matches; and it used fuzzy comparisons for point <@ box
searches, when it needs to use exact comparisons because that's what the
<@ operator (rather inconsistently) does.

The added regression test cases illustrate all three misbehaviors.

Back-patch to all active branches.  (8.4 did not have GiST point_ops,
but it still seems prudent to apply the gist_box_same patch to it.)

Alexander Korotkov, reviewed by Noah Misch
2013-02-08 18:03:17 -05:00
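The first problem, a fuzzy "same" test hiding a small key extension, can be sketched in Python. This is a one-dimensional toy, not the GiST code: the function names are hypothetical, the EPSILON value is assumed to mirror the tolerance used by the geometric operators, and an upper key is reduced to a single high coordinate.

```python
EPSILON = 1.0e-06  # assumed tolerance, for illustration


def fuzzy_eq(a: float, b: float) -> bool:
    """Equality within EPSILON."""
    return abs(a - b) <= EPSILON


def exact_eq(a: float, b: float) -> bool:
    return a == b


def stored_key_after_insert(old_high: float, entry_high: float, same) -> float:
    """Return the upper key's high bound kept after inserting an entry.

    The union of the old key and the new entry is computed, but it only
    replaces the stored key when the 'same' test reports a difference.
    """
    new_high = max(old_high, entry_high)
    return old_high if same(old_high, new_high) else new_high
```

With an old bound of 10.0 and an entry reaching 10.0000005, the fuzzy test reports "same" and the stored key stays at 10.0, so it no longer covers the entry; the exact test lets the key grow to 10.0000005.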
Alvaro Herrera 381d4b70a9 Clean up c.h / postgres.h after Assert() move
Per Tom
2013-02-08 12:50:58 -03:00
Alvaro Herrera 5766228bc6 Fix Xmax freeze conditions
I broke this in 0ac5ad5134; previously, freezing a tuple marked with an
IS_MULTI xmax was not necessary.

Per brokenness report from Jeff Janes.
2013-02-08 12:50:58 -03:00
Magnus Hagander c572bfaf39 Fix another typo in a comment
Noted by Thom Brown
2013-02-08 15:42:01 +01:00
Peter Eisentraut cf4d67e819 Exclude access/rmgrlist.h from cpluspluscheck
It is not meant to be included standalone.
2013-02-08 07:01:21 -05:00
Peter Eisentraut 4760142146 scripts: Add build prerequisite on libpgport
Without this, building in src/bin/scripts directly will fail if
libpgport wasn't built first.  Other bin components are handled the same
way.

Phil Sorber
2013-02-08 06:43:54 -05:00
Magnus Hagander 733701d274 Fix typo in comment
Etsuro Fujita
2013-02-08 11:45:42 +01:00
Tom Lane bcc6c4c291 Fix performance issue in EXPLAIN (ANALYZE, TIMING OFF).
Commit af7914c662, which added the TIMING
option to EXPLAIN, had an oversight: if the TIMING option is disabled
then control in InstrStartNode() goes through an elog(DEBUG2) call, which
typically does nothing but takes a noticeable amount of time to do it.
Tweak the logic to avoid that.

In HEAD, also change the elog(DEBUG2)'s in instrument.c to elog(ERROR).
It's not very clear why they weren't like that to begin with, but this
episode shows that not complaining more vociferously about misuse is
likely to do little except allow bugs to remain hidden.

While at it, adjust some code that was making possibly-dangerous
assumptions about flag bits being in the rightmost byte of the
instrument_options word.

Problem reported by Pavel Stehule (via Tomas Vondra).
2013-02-07 22:53:00 -05:00
Tom Lane 166d534fcd Repair bugs in GiST page splitting code for multi-column indexes.
When considering a non-last column in a multi-column GiST index,
gistsplit.c tries to improve on the split chosen by the opclass-specific
pickSplit function by considering penalties for the next column.  However,
there were two bugs in this code: it failed to recompute the union keys for
the leftmost index columns, even though these might well change after
reassigning tuples; and it included the old union keys in the recomputation
for the columns it did recompute, so that those keys couldn't get smaller
even if they should.  The first problem could result in an invalid index
in which searches wouldn't find index entries that are in fact present;
the second would make the index less efficient to search.

Both of these errors were caused by misuse of gistMakeUnionItVec, whose
API was designed in a way that just begged such errors to be made.  There
is no situation in which it's safe or useful to compute the union keys for
a subset of the index columns, and there is no caller that wants any
previous union keys to be included in the computation; so the undocumented
choice to treat the union keys as in/out rather than pure output parameters
is a waste of code as well as being dangerous.

Hence, rather than just making a minimal patch, I've changed the API of
gistMakeUnionItVec to remove the "startkey" parameter (it now always
processes all index columns) and treat the attr/isnull arrays as purely
output parameters.

In passing, also get rid of a couple of unnecessary and dangerous uses
of static variables in gistutil.c.  It's remarkable that the one in
gistMakeUnionKey hasn't given us portability troubles before now, because
in addition to posing a re-entrancy hazard, it was unsafely assuming that
a static char[] array would have at least Datum alignment.

Per investigation of a trouble report from Tomas Vondra.  (There are also
some bugs in contrib/btree_gist to be fixed, but that seems like material
for a separate patch.)  Back-patch to all supported branches.
2013-02-07 17:44:02 -05:00
Tom Lane c5aad8dc14 Fix possible failure to send final transaction counts to stats collector.
Normally, we suppress sending a tabstats message to the collector unless
there were some actual table stats to send.  However, during backend exit
we should force out the message if there are any transaction commit/abort
counts to send, else the session's last few commit/abort counts will never
get reported at all.  We had logic for this, but the short-circuit test
at the top of pgstat_report_stat() ignored the "force" flag, with the
consequence that session-ending transactions that touched no database-local
tables would not get counted.  Seems to be an oversight in my commit
641912b4d1, which added the "force" flag.
That was back in 8.3, so back-patch to all supported versions.
2013-02-07 14:44:00 -05:00
Simon Riggs 072521b8c8 Rely only on checkpoint 1 at end of recovery.
Searching for checkpoint 2 (previous) is not
correct in all cases.

Bug report from Heikki Linnakangas
2013-02-07 16:33:05 +00:00
Andrew Dunstan e1c1e21732 Enable building with Microsoft Visual Studio 2012.
Backpatch to release 9.2

Brar Piening and Noah Misch, reviewed by Craig Ringer.
2013-02-06 14:52:29 -05:00
Alvaro Herrera 5a1cd89f8f Split out list of XLog resource managers
The new rmgrlist.h header, containing all necessary data
about built-in resource managers, allows other pieces of code to
access them.

In particular, this allows a future pg_xlogdump program to extract
rm_desc function pointers, without having to keep a duplicate list of
them.
2013-02-06 08:47:28 -03:00
Alvaro Herrera cb9b66d31a Improve error message wording
The wording changes applied in 0ac5ad513 were universally disliked.

Per gripe from Andrew Dunstan
2013-02-06 00:19:53 -03:00
Tom Lane ab0f7b6089 Prevent execution of enum_recv() from SQL.
This function was misdeclared to take cstring when it should take internal.
This at least allows crashing the server, and in principle an attacker
might be able to use the function to examine the contents of server memory.

The correct fix is to adjust the system catalog contents (and fix the
regression tests that should have caught this but failed to).  However,
asking users to correct the catalog contents in existing installations
is a pain, so as a band-aid fix for the back branches, install a check
in enum_recv() to make it throw error if called with a cstring argument.
We will later revert this in HEAD in favor of correcting the catalogs.

Our thanks to Sumit Soni (via Secunia SVCRP) for reporting this issue.

Security: CVE-2013-0255
2013-02-04 16:25:01 -05:00
Simon Riggs f480e29449 Reset vacuum_defer_cleanup_age to PGC_SIGHUP.
Revert commit 84725aa5ef
2013-02-04 16:39:55 +00:00
Simon Riggs bd56e74127 Reset master xmin when hot_standby_feedback disabled.
If walsender has xmin of standby then ensure we
reset the value to 0 when we change from hot_standby_feedback=on
to hot_standby_feedback=off.
2013-02-04 10:29:22 +00:00
Tom Lane 62e666400d Perform line wrapping and indenting by default in ruleutils.c.
This patch changes pg_get_viewdef() and allied functions so that
PRETTY_INDENT processing is always enabled.  Per discussion, only the
PRETTY_PAREN processing (that is, stripping of "unnecessary" parentheses)
poses any real forward-compatibility risk, so we may as well make dump
output look as nice as we safely can.

Also, set the default wrap length to zero (i.e., wrap after each SELECT
or FROM list item), since there's no very principled argument for the
former default of 80-column wrapping, and most people seem to agree this
way looks better.

Marko Tiikkaja, reviewed by Jeevan Chalke, further hacking by Tom Lane
2013-02-03 15:56:45 -05:00
Peter Eisentraut 330ed4ac6c PL/Python: Add result object str handler
This is intended so that, say, plpy.debug(rv) prints something useful for
debugging query execution results.

reviewed by Steve Singer
2013-02-03 00:31:01 -05:00
Tom Lane d2d153fdb0 Create a psql command \gset to store query results into psql variables.
This eases manipulation of query results in psql scripts.

Pavel Stehule, reviewed by Piyush Newe, Shigeru Hanada, and Tom Lane
2013-02-02 17:06:38 -05:00
Tom Lane 101d6ae755 Prevent "\g filename" from affecting subsequent commands after an error.
In the previous coding, psql's state variable saying that output should
go to a file was only reset after successful completion of a query
returning tuples.  Thus for example,

regression=# select 1/0
regression-# \g somefile
ERROR:  division by zero
regression=# select 1/2;
regression=#

... huh, I wonder where that output went.  Even more oddly, the state
was not reset even if it's the file that's causing the failure:

regression=# select 1/2 \g /foo
/foo: Permission denied
regression=# select 1/2;
/foo: Permission denied
regression=# select 1/2;
/foo: Permission denied

This seems to me not to satisfy the principle of least surprise.
\g is certainly not documented in a way that suggests its effects are
at all persistent.

To fix, adjust the code so that the flag is reset at exit from SendQuery
no matter what happened.

Noted while reviewing the \gset patch, which had comparable issues.
Arguably this is a bug fix, but I'll refrain from back-patching for now.
2013-02-02 14:22:17 -05:00
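
The fix described in 101d6ae755 (reset the output-target flag at exit from SendQuery no matter what happened) can be illustrated with a small sketch. This is not psql's C code: `PsqlState`, `output_file`, `send_query` and `failing_query` are hypothetical names chosen for illustration, and the only point shown is that the reset happens on every exit path.

```python
class PsqlState:
    """Hypothetical stand-in for psql's per-session settings."""

    def __init__(self):
        self.output_file = None  # assumed name for the pending "\g" target


def send_query(state, run_query):
    """Run one query; always clear the one-shot output target afterwards."""
    try:
        return run_query(state.output_file)
    except Exception as exc:  # a failed query or an unwritable file
        return f"ERROR: {exc}"
    finally:
        # Reset on every exit path, not only after a successful result.
        state.output_file = None


def failing_query(target):
    """Simulates "select 1/0" sent with "\\g somefile"."""
    raise ZeroDivisionError("division by zero")


state = PsqlState()
state.output_file = "somefile"
first = send_query(state, failing_query)
# The next query no longer inherits the stale target.
second = send_query(state, lambda target: f"output to {target or 'stdout'}")
```

With the reset in a `finally` clause, the second query goes to the normal output even though the first one failed.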
Simon Riggs 84725aa5ef Mark vacuum_defer_cleanup_age as PGC_POSTMASTER.
Following bug analysis of #7819 by Tom Lane
2013-02-02 18:49:54 +00:00
Bruce Momjian e8ae019661 Adjust COPY FREEZE error message to be more accurate and consistent.
Per suggestions from Noah and Tom.
2013-02-02 12:56:52 -05:00
Alvaro Herrera e1d25de35a Move Assert() definitions to c.h
This way, they can be used by frontend and backend code.  We already
supported that, but doing it this way allows us to mix true frontend
files with backend files compiled in frontend environment.

Author: Andres Freund
2013-02-01 17:50:04 -03:00
Alvaro Herrera dd1569da67 Fix typo in freeze_table_age implementation
The original code used freeze_min_age instead of freeze_table_age.  The
main consequence of this mistake is that lowering freeze_min_age would
cause full-table scans to occur much more frequently, which causes
serious issues because the number of writes required is much larger.
That feature (freeze_min_age) is supposed to affect only how soon tuples
are frozen; some pages should still be skipped due to the visibility
map.

Backpatch to 8.4, where the freeze_table_age feature was introduced.

Report and patch from Andres Freund
2013-02-01 12:00:40 -03:00
Alvaro Herrera 9ee00ef4c7 Fill tuple before HeapSatisfiesHOTAndKeyUpdate
Failing to do this results in almost all updates to system catalogs
being non-HOT updates, because the OID column would differ (not having
been set for the new tuple), which is an indexed column.

While at it, make sure to set the tableoid early in both old and new
tuples as well.  This isn't of much consequence, since that column is
seldom (never?) indexed.

Report and patch from Andres Freund.
2013-02-01 10:43:09 -03:00
Peter Eisentraut 5839052693 Add CREATE RECURSIVE VIEW syntax
This is specified in the SQL standard.  The CREATE RECURSIVE VIEW
specification is transformed into a normal CREATE VIEW statement with a
WITH RECURSIVE clause.

reviewed by Abhijit Menon-Sen and Stephen Frost
2013-01-31 22:31:58 -05:00
Peter Eisentraut b1980f6d03 PL/Tcl: Fix compiler warnings with Tcl 8.6
Some constification was added in the Tcl APIs, so add the modifiers in
PL/Tcl as well.
2013-01-31 22:08:53 -05:00
Alvaro Herrera b78647a0e6 Restrict infomask bits to set on multixacts
We must only set the bit(s) for the strongest lock held in the tuple;
otherwise, a multixact containing members with exclusive lock and
key-share lock will behave as though only a share lock is held.

This bug was introduced in commit 0ac5ad5134, somewhere along
development, when we allowed a singleton FOR SHARE lock to be
implemented without a MultiXact by using a multi-bit pattern.
I overlooked that GetMultiXactIdHintBits() needed to be tweaked as well.
Previously, we could have the bits for FOR KEY SHARE and FOR UPDATE
simultaneously set and it wouldn't cause a problem.

Per report from digoal@126.com
2013-01-31 19:35:31 -03:00
Simon Riggs 3f0ab05233 Switch timelines if we crash soon after promotion.
Previous patch to skip checkpoints at end of recovery didn't
correctly perform crash recovery, fumbling the timeline switch.
Now we record the minRecoveryPointTLI of the newly selected
timeline, so that we crash recover to the correct timeline.

Bug report from Fujii Masao, investigated by me.
2013-01-31 19:29:32 +00:00
Tom Lane 9afc58396a Reject nonzero day fields in AT TIME ZONE INTERVAL functions.
It's not sensible for an interval that's used as a time zone value to be
larger than a day.  When we changed the interval type to contain a separate
day field, check_timezone() was adjusted to reject nonzero day values, but
timetz_izone(), timestamp_izone(), and timestamptz_izone() evidently were
overlooked.

While at it, make the error messages for these three cases consistent.
2013-01-31 12:12:23 -05:00
Magnus Hagander bfb8a8d381 Properly zero-pad the day-of-year part of the win32 build number
This ensures the version number increases over time. The first three digits
in the version number are still set to the actual PostgreSQL version
number, but the last one is intended to be an ever-increasing build number,
which previously failed when it changed between 1-, 2- and 3-digit values.

Noted by Deepak
2013-01-31 15:06:45 +01:00
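
A short sketch of why the padding in bfb8a8d381 matters. It assumes the build number is a two-digit year followed by the day of the year; that layout, and the function name `build_number`, are assumptions made for illustration rather than details taken from the commit.

```python
def build_number(year2, yday, pad):
    """Concatenate a two-digit year and a day-of-year into one integer."""
    day = f"{yday:03d}" if pad else str(yday)
    return int(f"{year2}{day}")


# Three build dates in chronological order: (two-digit year, day of year).
samples = [(13, 9), (13, 100), (14, 1)]

unpadded = [build_number(y, d, pad=False) for y, d in samples]
padded = [build_number(y, d, pad=True) for y, d in samples]

unpadded_increasing = unpadded == sorted(unpadded)
padded_increasing = padded == sorted(padded)
```

Without padding, the day part changes width and a later build can get a smaller number than an earlier one; with a fixed three-digit day part the sequence stays increasing.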
Tom Lane 2ab218b576 Don't use spi_priv.h in plpython.
There may once have been a reason to violate modularity like that,
but it doesn't appear that there is anymore.
2013-01-30 20:11:58 -05:00
Tom Lane 0900ac2d0d Fix plpgsql's reporting of plan-time errors in possibly-simple expressions.
exec_simple_check_plan and exec_eval_simple_expr attempted to call
GetCachedPlan directly.  This meant that if an error was thrown during
planning, the resulting context traceback would not include the line
normally contributed by _SPI_error_callback.  This is already inconsistent,
but just to be really odd, a re-execution of the very same expression
*would* show the additional context line, because we'd already have cached
the plan and marked the expression as non-simple.

The problem is easy to demonstrate in 9.2 and HEAD because planning of a
cached plan doesn't occur at all until GetCachedPlan is done.  In earlier
versions, it could only be an issue if initial planning had succeeded, then
a replan was forced (already somewhat improbable for a simple expression),
and the replan attempt failed.  Since the issue is mainly cosmetic in older
branches anyway, it doesn't seem worth the risk of trying to fix it there.
It is worth fixing in 9.2 since the instability of the context printout can
affect the results of GET STACKED DIAGNOSTICS, as per a recent discussion
on pgsql-novice.

To fix, introduce a SPI function that wraps GetCachedPlan while installing
the correct callback function.  Use this instead of calling GetCachedPlan
directly from plpgsql.

Also introduce a wrapper function for extracting a SPI plan's
CachedPlanSource list.  This lets us stop including spi_priv.h in
pl_exec.c, which was never a very good idea from a modularity standpoint.

In passing, fix a similar inconsistency that could occur in SPI_cursor_open,
which was also calling GetCachedPlan without setting up a context callback.
2013-01-30 20:02:23 -05:00
Tom Lane 670a6c7a22 Fix grammar for subscripting or field selection from a sub-SELECT result.
Such cases should work, but the grammar failed to accept them because of
our ancient precedence hacks to convince bison that extra parentheses
around a sub-SELECT in an expression are unambiguous.  (Formally, they
*are* ambiguous, but we don't especially care whether they're treated as
part of the sub-SELECT or part of the expression.  Bison cares, though.)
Fix by adding a redundant-looking production for this case.

This is a fine example of why fixing shift/reduce conflicts via
precedence declarations is more dangerous than it looks: you can easily
cause the parser to reject cases that should work.

This has been wrong since commit 3db4056e22
or maybe before, and apparently some people have been working around it
by inserting no-op casts.  That method introduces a dump/reload hazard,
as illustrated in bug #7838 from Jan Mate.  Hence, back-patch to all
active branches.
2013-01-30 14:17:48 -05:00
Peter Eisentraut 574f764321 pg_regress: Allow overriding diff options
By setting the environment variable PG_REGRESS_DIFF_OPTS, custom diff
options can be passed.

reviewed by Jeevan Chalke
2013-01-29 22:59:45 -05:00
Peter Eisentraut 5bb2ddc0af entab: Fix some compiler warnings 2013-01-29 22:21:21 -05:00
Tom Lane 991f3e5ab3 Provide database object names as separate fields in error messages.
This patch addresses the problem that applications currently have to
extract object names from possibly-localized textual error messages,
if they want to know for example which index caused a UNIQUE_VIOLATION
failure.  It adds new error message fields to the wire protocol, which
can carry the name of a table, table column, data type, or constraint
associated with the error.  (Since the protocol spec has always instructed
clients to ignore unrecognized field types, this should not create any
compatibility problem.)

Support for providing these new fields has been added to just a limited set
of error reports (mainly, those in the "integrity constraint violation"
SQLSTATE class), but we will doubtless add them to more calls in future.

Pavel Stehule, reviewed and extensively revised by Peter Geoghegan, with
additional hacking by Tom Lane.
2013-01-29 17:08:26 -05:00
Heikki Linnakangas c9d7dbacd3 Skip truncating ON COMMIT DELETE ROWS temp tables, if the transaction hasn't
touched any temporary tables.

We could try harder, and keep track of whether we've inserted to any temp
tables, rather than accessed them, and which temp tables have been inserted
to. But this is dead simple, and already covers many interesting scenarios.
2013-01-29 10:43:33 +02:00
Simon Riggs fd4ced5230 Fast promote mode skips checkpoint at end of recovery.
pg_ctl promote -m fast will skip the checkpoint at end of recovery so that we
can achieve very fast failover when the apply delay is low. Write new WAL record
XLOG_END_OF_RECOVERY to allow us to switch timeline correctly for downstream log
readers. If we skip synchronous end of recovery checkpoint we request a normal
spread checkpoint so that the window of re-recovery is low.

Simon Riggs and Kyotaro Horiguchi, with input from Fujii Masao.
Review by Heikki Linnakangas
2013-01-29 00:06:15 +00:00
Alvaro Herrera ee22c55f5a REASSIGN OWNED: handle shared objects, too
Give away ownership of shared objects (databases, tablespaces) along
with local objects, per original code intention.  Try to make the
documentation clearer, too.

Per discussion about DROP OWNED's brokenness, in bug #7748.

This is not backpatched because it'd require some refactoring of the
ALTER/SET OWNER code for databases and tablespaces.
2013-01-28 18:45:50 -03:00
Alvaro Herrera ec41b8edc1 DROP OWNED: don't try to drop tablespaces/databases
My "fix" for bugs #7578 and #6116 on DROP OWNED at fe3b5eb08a not only
misstated that it applied to REASSIGN OWNED (which it did not affect),
but it also failed to fix the problems fully, because I didn't test the
case of owned shared objects.  Thus I created a new bug, reported by
Thomas Kellerer as #7748, which would cause DROP OWNED to fail with a
not-for-user-consumption error message.  The code would attempt to drop
the database, which not only fails to work because the underlying code
does not support that, but is a pretty dangerous and undesirable thing
to be doing as well.

This patch fixes that bug by having DROP OWNED only attempt to process
shared objects when grants on them are found, ignoring ownership.

Backpatch to 8.3, which is as far as the previous bug was backpatched.
2013-01-28 18:40:51 -03:00
Heikki Linnakangas 316186f289 Handle SPIErrors raised directly in PL/Python code.
If a PL/Python function raises an SPIError (or one of its subclasses)
directly with Python's raise statement, treat it the same as an SPIError
generated internally. In particular, if the user sets the sqlstate
attribute, preserve that.

Oskari Saarenmaa and Jan Urbański, reviewed by Karl O. Pinc.
2013-01-28 09:46:23 +02:00
Michael Meskes 96bb29dc44 Made ecpglib use translated messages.
Bug reported and fixed by Chen Huajun <chenhj@cn.fujitsu.com>.
2013-01-27 13:48:12 +01:00
Tom Lane 2378d79ab2 Make LATERAL implicit for functions in FROM.
The SQL standard does not have general functions-in-FROM, but it does
allow UNNEST() there (see the <collection derived table> production),
and the semantics of that are defined to include lateral references.
So spec compliance requires allowing lateral references within UNNEST()
even without an explicit LATERAL keyword.  Rather than making UNNEST()
a special case, it seems best to extend this flexibility to any
function-in-FROM.  We'll still allow LATERAL to be written explicitly
for clarity's sake, but it's now a noise word in this context.

In theory this change could result in a change in behavior of existing
queries, by allowing what had been an outer reference in a function-in-FROM
to be captured by an earlier FROM-item at the same level.  However, all
pre-9.3 PG releases have a bug that causes them to match variable
references to earlier FROM-items in preference to outer references (and
then throw an error).  So no previously-working query could contain the
type of ambiguity that would risk a change of behavior.

Per a suggestion from Andrew Gierth, though I didn't use his patch.
2013-01-26 16:18:42 -05:00
Bruce Momjian 8865fe0ad3 Update comments in new DROP IF EXISTS code; commit message update
DROP IF EXISTS with a missing schema in commit
7e2322dff3 applies not only to tables, but
to DROP IF EXISTS with missing schemas for indexes, views, sequences,
and foreign tables.  Yeah!
2013-01-26 14:51:59 -05:00
Bruce Momjian 51cfb87ae2 Update LookupExplicitNamespace() comments; commit message update
Also, commit 7e2322dff3 affected DROP
TABLE IF EXISTS, not CREATE TABLE IF EXISTS.
2013-01-26 13:47:50 -05:00
Bruce Momjian 4deb57de7d Issue ERROR if FREEZE mode can't be honored by COPY
Previously non-honored FREEZE mode was ignored.  This also issues an
appropriate error message based on the cause of the failure, per
suggestion from Tom.  Additional regression test case added.
2013-01-26 13:33:24 -05:00
Bruce Momjian 7e2322dff3 Allow CREATE TABLE IF EXIST to succeed if the schema is nonexistent
Previously, CREATE TABLE IF EXIST threw an error if the schema was
nonexistent.  This was done by passing 'missing_ok' to the function that
looks up the schema oid.
2013-01-26 13:24:50 -05:00
Tom Lane 08be00fabe Fix plpython's handling of functions used as triggers on multiple tables.
plpython tried to use a single cache entry for a trigger function, but it
needs a separate cache entry for each table the trigger is applied to,
because there is table-dependent data in there.  This was done correctly
before 9.1, but commit 46211da1b8 broke it
by simplifying the lookup key from "function OID and triggered table OID"
to "function OID and is-trigger boolean".  Go back to using both OIDs
as the lookup key.  Per bug report from Sandro Santilli.

Andres Freund
2013-01-25 16:59:36 -05:00
Tom Lane 0d5fbdc157 Change plan caching to honor, not resist, changes in search_path.
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan.  The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes.  Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids.  So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.

This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution.  There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse.  We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
2013-01-25 14:14:41 -05:00
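
The rule adopted in 0d5fbdc157 (treat the cached plan as depending on search_path and replan when the active path differs from the saved one) can be sketched as follows. This is a toy model, not the plancache.c implementation: `CachedStatement`, `toy_planner` and `plan_count` are hypothetical names.

```python
class CachedStatement:
    """Toy plan cache that replans when the search path changes."""

    def __init__(self, query, planner):
        self.query = query
        self.planner = planner  # callable(query, search_path) -> plan
        self.plan = None
        self.saved_path = None
        self.plan_count = 0     # how many times planning actually ran

    def get_plan(self, search_path):
        path = tuple(search_path)
        # Replan on first use or whenever the active path differs
        # from the one saved with the cached plan.
        if self.plan is None or path != self.saved_path:
            self.plan = self.planner(self.query, path)
            self.saved_path = path
            self.plan_count += 1
        return self.plan


def toy_planner(query, path):
    """Pretend name resolution: objects resolve in the first schema."""
    return f"{query} resolved in {path[0]}"


stmt = CachedStatement("SELECT * FROM t", toy_planner)
p1 = stmt.get_plan(["public"])
p2 = stmt.get_plan(["public"])         # same path: cached plan reused
p3 = stmt.get_plan(["app", "public"])  # path changed: replanned
```

As the commit message notes, a comparison like this does not catch a new object appearing in an unchanged search path; the sketch shares that limitation.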
Robert Haas a37e83c0a9 Make it easy to time out pg_isready, and make the default 3 seconds.
Along the way, add a missing line to the help message.

Phil Sorber, reviewed by Fujii Masao
2013-01-25 12:03:37 -05:00
Heikki Linnakangas 8936867627 Add prosecdef to \df+ output.
Jon Erdman, reviewed by Phil Sorber and Stephen Frost.
2013-01-25 17:22:26 +02:00
Heikki Linnakangas ba1cc6501e Add some randomness to the choice of which GiST page to insert to.
When descending the tree for an insert, and there are multiple equally good
pages we could insert to, make the choice at random. Previously, we would
always choose the tuple with lowest offset number. That meant that when two
non-leaf pages overlap - in the extreme case they might have exactly the same
key - all but the first such page went unused. That wasn't optimal for space
usage; if you deleted some tuples from the non-first pages, the space would
never be reused.

With this patch, the other pages are sometimes chosen too, although there's
still a heavy bias towards low-offset tuples, so that we don't lose cache
locality when doing a lot of inserts with similar keys.

Original idea by Alexander Korotkov, although this patch version was written
by me and copy-edited by Tom Lane.
2013-01-25 16:58:38 +02:00
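
One way to get the behavior described in ba1cc6501e (random choice among equally good pages, with a heavy bias towards low offsets) is sketched below. The coin-flip scheme, the function name `choose_page` and the penalty list are assumptions made for illustration; the commit message does not spell out the exact method.

```python
import random
from collections import Counter


def choose_page(penalties, rng):
    """Pick the index with the lowest penalty, breaking ties randomly
    but favoring earlier (lower-offset) candidates."""
    best = 0
    keep_best = None  # undecided until the first tie with the current best
    for i in range(1, len(penalties)):
        if penalties[i] < penalties[best]:
            best, keep_best = i, None
        elif penalties[i] == penalties[best]:
            if keep_best is None:
                # Flip one coin per current best: keep it for all later ties?
                keep_best = rng.random() < 0.5
            if not keep_best:
                best, keep_best = i, None
    return best


rng = random.Random(12345)
penalties = [1, 2, 1, 1]  # indexes 0, 2 and 3 are equally good
counts = Counter(choose_page(penalties, rng) for _ in range(10000))
```

Under this scheme the first tied candidate is kept about half the time, so later candidates are still used but low offsets dominate.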
Magnus Hagander be926474be Make pg_dump exclude unlogged table data on hot standby slaves
Noted by Joe Van Dyk
2013-01-25 09:46:07 +01:00
Tom Lane 760f3c043a Fix concat() and format() to handle VARIADIC-labeled arguments correctly.
Previously, the VARIADIC labeling was effectively ignored, but now these
functions act as though the array elements had all been given as separate
arguments.

Pavel Stehule
2013-01-25 00:19:56 -05:00
Tom Lane 2ddc600f8f Fix SPI documentation for new handling of ExecutorRun's count parameter.
Since 9.0, the count parameter has only limited the number of tuples
actually returned by the executor.  It doesn't affect the behavior of
INSERT/UPDATE/DELETE unless RETURNING is specified, because without
RETURNING, the ModifyTable plan node doesn't return control to execMain.c
for each tuple.  And we only check the limit at the top level.

While this behavioral change was unintentional at the time, discussion of
bug #6572 led us to the conclusion that we prefer the new behavior anyway,
and so we should just adjust the docs to match rather than change the code.
Accordingly, do that.  Back-patch as far as 9.0 so that the docs match the
code in each branch.
2013-01-24 18:34:00 -05:00
Andrew Dunstan 1068771abf Use correct output device for Windows prompts.
This ensures that mapping of non-ASCII prompts
to the correct code page occurs.

Bug report and original patch from Alexander Law,
reviewed and reworked by Noah Misch.

Backpatch to all live branches.
2013-01-24 16:01:31 -05:00
Alvaro Herrera 74ebba84ae Redefine HEAP_XMAX_IS_LOCKED_ONLY
Tuples marked SELECT FOR UPDATE in a cluster that's later processed by
pg_upgrade would have a different infomask bit pattern than those
produced by 9.3dev; that bit pattern was being seen as "dead" by HEAD
(because they would fail the "is this tuple locked" test, and so the
visibility rules would think they're updated, even though there's no
HEAP_UPDATED version of them).  In other words, some rows could silently
disappear after pg_upgrade.

With this new definition, those tuples become visible again.

This is breakage resulting from my commit 0ac5ad5134.
2013-01-24 16:10:02 -03:00
Alvaro Herrera 6772c1e542 Make output identical to pg_resetxlog's 2013-01-24 11:55:10 -03:00
Simon Riggs 5c54f63fd6 Fix rare missing cancellations in Hot Standby.
The machinery around XLOG_HEAP2_CLEANUP_INFO failed
to correctly pass through the necessary information
on latestRemovedXid, avoiding cancellations in some
infrequent concurrent update/cleanup scenarios.

Backpatchable fix to 9.0

Detailed bug report and fix by Noah Misch,
backpatchable version by me.
2013-01-24 14:19:29 +00:00
Heikki Linnakangas 168d315703 Also fix rotation of csvlog on Windows.
Backpatch to 9.2, like the previous fix.
2013-01-24 11:41:30 +02:00
Tom Lane 8556869f2f Fix failure to rotate postmaster log file for size reasons on Windows.
When we eliminated "unnecessary" wakeups of the syslogger process, we
broke size-based logfile rotation on Windows, because on that platform
data transfer is done in a separate thread.  While non-Windows platforms
would recheck the output file size after every log message, Windows only
did so when the control thread woke up for some other reason, which might
be quite infrequent.  Per bug #7814 from Tsunezumi.  Back-patch to 9.2
where the problem was introduced.

Jeff Janes
2013-01-23 22:08:01 -05:00
Alvaro Herrera ca5db759b8 isolationtester: add a few fflush(stderr) calls
The lack of them is causing failures in some BF members.

Per Andrew Dunstan.
2013-01-23 13:30:14 -03:00
Robert Haas ac2e967362 pg_isready
New command-line utility to test whether a server is ready to
accept connections.

Phil Sorber, reviewed by Michael Paquier and Peter Eisentraut
2013-01-23 11:01:20 -05:00
Alvaro Herrera 0ac5ad5134 Improve concurrency of foreign key locking
This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE".  These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE".  UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.

Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.

The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid.  Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates.  This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed.  pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.

Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header.  This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.

Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This also causes additional WAL-logging (there was previously a single
WAL record for a locked tuple; now there is one for each updated copy
of the tuple that exists).

With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.

As a bonus, the old behavior that a subtransaction grabbing a stronger
tuple lock than the parent (sub)transaction held on a given tuple and
later aborting caused the weaker lock to be lost, has been fixed.

Many new spec files were added for isolation tester framework, to ensure
overall behavior is sane.  There's probably room for several more tests.

There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time in it.  Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.

This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
	AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
	1290721684-sup-3951@alvh.no-ip.org
	1294953201-sup-2099@alvh.no-ip.org
	1320343602-sup-2290@alvh.no-ip.org
	1339690386-sup-8927@alvh.no-ip.org
	4FE5FF020200002500048A3D@gw.wicourts.gov
	4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 12:04:59 -03:00
Robert Haas 601e2935e2 Update comments and output for event_trigger regression test. 2013-01-23 06:49:30 -05:00
Heikki Linnakangas 52906f175a Implement pg_unreachable() on MSVC. 2013-01-23 12:53:55 +02:00
Heikki Linnakangas 990fe3c4ed Fix more issues with cascading replication and timeline switches.
When a standby server follows the master using WAL archive, and it chooses
a new timeline (recovery_target_timeline='latest'), it only fetches the
timeline history file for the chosen target timeline, not any other history
files that might be missing from pg_xlog. For example, if the current
timeline is 2, and we choose 4 as the new recovery target timeline, the
history file for timeline 3 is not fetched, even if it's part of this
server's history. That's enough for the standby itself - the history file
for timeline 4 includes timeline 3 as well - but if a cascading standby
server wants to recover to timeline 3, it needs the history file. To fix,
when a new recovery target timeline is chosen, try to copy any missing
history files from the archive to pg_xlog between the old and new target
timeline.

A second similar issue was with the WAL files. When a standby recovers from
archive, and it reaches a segment that contains a switch to a new timeline,
recovery fetches only the WAL file labelled with the new timeline's ID. The
file from the new timeline contains a copy of the WAL from the old timeline
up to the point where the switch happened, and recovery recovers it from the
new file. But in streaming replication, walsender only tries to read it
from the old timeline's file. To fix, change walsender to read it from the
new file, so that it behaves the same as recovery in that sense, and doesn't
try to open the possibly nonexistent file with the old timeline's ID.
2013-01-23 10:19:20 +02:00
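
The first fix in 990fe3c4ed (copy any missing history files between the old and new target timeline) can be sketched as a simple range walk. The file-name pattern, the inclusive upper bound and the function names are assumptions made for illustration, not details confirmed by the commit message.

```python
def history_file_name(tli):
    """Assumed naming: timeline ID as 8 hex digits plus ".history"."""
    return f"{tli:08X}.history"


def missing_history_files(old_tli, new_tli, present):
    """List history files between the old and new target timelines
    that are not yet in the local directory (given as a set of names)."""
    wanted = [history_file_name(t) for t in range(old_tli + 1, new_tli + 1)]
    return [name for name in wanted if name not in present]


# The commit's example: current timeline 2, new target 4, and only the
# target's own history file has been fetched so far.
to_fetch = missing_history_files(2, 4, {"00000004.history"})
```

In the commit's example this yields the history file for timeline 3, which the standby itself does not need but a cascading standby recovering to timeline 3 does.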
Robert Haas ddef9a0028 Fix a few small bugs in yesterday's event trigger patch.
Dimitri Fontaine
2013-01-22 21:37:01 -05:00
Tom Lane 75b39e7909 Add infrastructure for storing a VARIADIC ANY function's VARIADIC flag.
Originally we didn't bother to mark FuncExprs with any indication whether
VARIADIC had been given in the source text, because there didn't seem to be
any need for it at runtime.  However, because we cannot fold a VARIADIC ANY
function's arguments into an array (since they're not necessarily all the
same type), we do actually need that information at runtime if VARIADIC ANY
functions are to respond unsurprisingly to use of the VARIADIC keyword.
Add the missing field, and also fix ruleutils.c so that VARIADIC ANY
function calls are dumped properly.

Extracted from a larger patch that also fixes concat() and format() (the
only two extant VARIADIC ANY functions) to behave properly when VARIADIC is
specified.  This portion seems appropriate to review and commit separately.

Pavel Stehule
2013-01-21 20:26:15 -05:00
Robert Haas 841a5150c5 Add ddl_command_end support for event triggers.
Dimitri Fontaine, with slight changes by me
2013-01-21 18:00:24 -05:00
Alvaro Herrera 765cbfdc92 Refactor ALTER some-obj RENAME implementation
Remove duplicate implementations of catalog munging and miscellaneous
privilege checks.  Instead rely on already existing data in
objectaddress.c to do the work.

Author: KaiGai Kohei, changes by me
Reviewed by: Robert Haas, Álvaro Herrera, Dimitri Fontaine
2013-01-21 12:06:41 -03:00
Tom Lane 8f0d8f481e Fix one-byte buffer overrun in PQprintTuples().
This bug goes back to the original Postgres95 sources.  Its significance
to modern PG versions is marginal, since we have not used PQprintTuples()
internally in a very long time, and it doesn't seem to have ever been
documented either.  Still, it *is* exposed to client apps, so somebody
out there might possibly be using it.

Xi Wang
2013-01-20 23:43:46 -05:00
Tom Lane 535e69a43f Fix error-checking typo in check_TSCurrentConfig().
The code failed to detect an out-of-memory failure.

Xi Wang
2013-01-20 23:09:35 -05:00
Tom Lane d5b31cc32b Fix an O(N^2) performance issue for sessions modifying many relations.
AtEOXact_RelationCache() scanned the entire relation cache at the end of
any transaction that created a new relation or assigned a new relfilenode.
Thus, clients such as pg_restore had an O(N^2) performance problem that
would start to be noticeable after creating 10000 or so tables.  Since
typically only a small number of relcache entries need any cleanup, we
can fix this by keeping a small list of their OIDs and doing hash_searches
for them.  We fall back to the full-table scan if the list overflows.

Ideally, the maximum list length would be set at the point where N
hash_searches would cost just less than the full-table scan.  Some quick
experimentation says that point might be around 50-100; I (tgl)
conservatively set MAX_EOXACT_LIST = 32.  For the case that we're worried
about here, which is short single-statement transactions, it's unlikely
there would ever be more than about a dozen list entries anyway; so it's
probably not worth being too tense about the value.

We could avoid the hash_searches by instead keeping the target relcache
entries linked into a list, but that would be noticeably more complicated
and bug-prone because of the need to maintain such a list in the face of
relcache entry drops.  Since a relcache entry can only need such cleanup
after a somewhat-heavyweight filesystem operation, trying to save a
hash_search per cleanup doesn't seem very useful anyway --- it's the scan
over all the not-needing-cleanup entries that we wish to avoid here.

Jeff Janes, reviewed and tweaked a bit by Tom Lane
2013-01-20 13:45:10 -05:00
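
The approach in d5b31cc32b (keep a small list of OIDs that need end-of-transaction cleanup, and fall back to a full scan only if the list overflows) can be sketched like this. Only the constant MAX_EOXACT_LIST = 32 comes from the commit message; the class, its fields and the `visited` counter are hypothetical and exist purely to make the cost difference visible.

```python
MAX_EOXACT_LIST = 32  # limit named in the commit message


class RelCacheSketch:
    """Toy relation cache: oid -> "needs cleanup" flag."""

    def __init__(self):
        self.entries = {}
        self.pending = []        # OIDs known to need cleanup
        self.overflowed = False
        self.visited = 0         # entries examined, for illustration only

    def add(self, oid):
        self.entries[oid] = False

    def mark_for_cleanup(self, oid):
        self.entries[oid] = True
        if len(self.pending) < MAX_EOXACT_LIST:
            self.pending.append(oid)
        else:
            self.overflowed = True  # too many to track: scan everything

    def at_eoxact(self):
        # Look up only the listed OIDs unless the list overflowed.
        targets = list(self.entries) if self.overflowed else self.pending
        for oid in targets:
            self.visited += 1
            if self.entries.get(oid):  # entry may have been dropped
                self.entries[oid] = False
        self.pending = []
        self.overflowed = False


cache = RelCacheSketch()
for oid in range(10000):
    cache.add(oid)
for oid in (1, 2, 3):
    cache.mark_for_cleanup(oid)
cache.at_eoxact()
visited_small = cache.visited  # only the three listed entries

for oid in range(40):          # more than MAX_EOXACT_LIST
    cache.mark_for_cleanup(oid)
cache.at_eoxact()
visited_overflow = cache.visited - visited_small  # full scan fallback
```

A transaction that touches a handful of relations examines only those entries, so a long run of such transactions no longer pays for the whole cache each time.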
Tom Lane 26d905a12d Use SET TRANSACTION READ ONLY in pg_dump, if server supports it.
This currently does little except serve as documentation.  (The one case
where it has a performance benefit, SERIALIZABLE mode in 9.1 and up, was
already using READ ONLY mode.)  However, it's possible that it might have
performance benefits in future, and in any case it seems like good
practice since it would catch any accidentally non-read-only operations.

Pavan Deolasee
2013-01-19 17:56:40 -05:00
Tom Lane 4b94cfb564 Modernize string literal syntax in tutorial example.
Un-double the backslashes in the LIKE patterns, since
standard_conforming_strings is now the default.  Just to be sure, include
a command to set standard_conforming_strings to ON in the example.

Back-patch to 9.1, where standard_conforming_strings became the default.

Josh Kupershmidt, reviewed by Jeff Janes
2013-01-19 17:20:32 -05:00
Andrew Dunstan 9f10f7dc57 Make pgxs build executables with the right suffix.
Complaint and patch from Zoltán Böszörményi.

When cross-compiling, the native make doesn't know
about the Windows .exe suffix, so it only builds with
it when explicitly told to do so.

The native make will not see the link between the target
name and the built executable, and might thus do unnecessary
work, but that's a bigger problem than this one, if in fact
we consider it a problem at all.

Back-patch to all live branches.
2013-01-19 14:54:29 -05:00
Tom Lane c2a14bc7c9 Protect against SnapshotNow race conditions in pg_tablespace scans.
Use of SnapshotNow is known to expose us to race conditions if the tuple(s)
being sought could be updated by concurrently-committing transactions.
CREATE DATABASE and DROP DATABASE are particularly exposed because they do
heavyweight filesystem operations during their scans of pg_tablespace,
so that the scans run for a very long time compared to most.  Furthermore,
the potential consequences of a missed or twice-visited row are nastier
than average:

* createdb() could fail with a bogus "file already exists" error, or
  silently fail to copy one or more tablespaces' worth of files into the
  new database.

* remove_dbtablespaces() could miss one or more tablespaces, thus failing
  to free filesystem space for the dropped database.

* check_db_file_conflict() could likewise miss a tablespace, leading to an
  OID conflict that could result in data loss either immediately or in
  future operations.  (This seems of very low probability, though, since a
  duplicate database OID would be unlikely to start with.)

Hence, it seems worth fixing these three places to use MVCC snapshots, even
though this will someday be superseded by a generic solution to SnapshotNow
race conditions.

Back-patch to all active branches.

Stephen Frost and Tom Lane
2013-01-18 18:06:20 -05:00
Bruce Momjian 530bbfac57 Rename new latex longtable function name, for consistency 2013-01-18 14:02:58 -05:00
Robert Haas d8c3896626 Unbreak lock conflict detection for Hot Standby.
This got broken in the original fast-path locking patch, because
I failed to account for the fact that Hot Standby startup process
might take a strong relation lock on a relation in a database to
which it is not bound, and confused MyDatabaseId with the database
ID of the relation being locked.

Report and diagnosis by Andres Freund.  Final form of patch by me.
2013-01-18 11:52:28 -05:00
Alvaro Herrera 8c17144c75 Fix off-by-one bug in xlog reading logic
Bug reported by Michael Paquier

Author: Andres Freund
2013-01-18 11:19:53 -03:00
Bruce Momjian 74a82bafe4 psql latex fixes
Remove extra line at bottom of table for new 'latex' mode border=3.
Also update 'latex'-longtable 'tableattr' docs to say
'whitespace-separated' instead of 'space'.
2013-01-18 08:30:31 -05:00
Heikki Linnakangas 6f7cddc7ae Now that START_REPLICATION returns the next timeline's ID after reaching end
of timeline, take advantage of that in walreceiver.

Startup process is still in control of choosing the target timeline, by
scanning the timeline history files present in pg_xlog, but walreceiver now
uses the next timeline's ID to fetch its history file immediately after it
has finished streaming the old timeline. Before, the standby would first try
to restart streaming on the old timeline, which fetches the missing timeline
history file as a side-effect, and only then restart from the new timeline.
This patch eliminates the extra iteration, which speeds up the timeline
switch and reduces the noise in the log caused by the extra restart on the
old timeline.
2013-01-18 11:59:34 +02:00
Heikki Linnakangas 2ff6555313 Use the right timeline when beginning to stream from master.
The xlogreader refactoring broke the logic to decide which timeline to start
streaming from. XLogPageRead() uses the timeline history to check which
timeline the requested WAL position falls into. However, after the
refactoring, XLogPageRead() is always first called with the first page in
the segment, to verify the segment header, and only then with the actual WAL
position we're interested in. That first read of the segment's header made
XLogPageRead() always start streaming from the old timeline containing
the segment header, not the timeline containing the actual record, if there
was a timeline switch within the segment.

I thought I fixed this yesterday, but that fix was too narrow and only fixed
this for the corner-case that the timeline switch happened in the first page
of the segment. To fix this more robustly, pass explicitly the position of
the record we're actually interested in to XLogPageRead, and use that to
decide which timeline to read from, rather than deduce it from the page and
offset.
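
The difference between asking for the segment header and asking for the
record itself can be sketched as below. This is only an illustration of the
idea: the TimelineEntry type and timeline_of_position() are hypothetical
names for this sketch, not the backend's actual structures or functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical history entry: timeline "tli" covers [begin, end). */
typedef struct
{
    uint32_t tli;
    uint64_t begin;   /* first WAL position on this timeline */
    uint64_t end;     /* switch point; UINT64_MAX for the latest timeline */
} TimelineEntry;

/*
 * Return the timeline that contains WAL position "pos", or 0 if the
 * history does not cover it.
 */
uint32_t
timeline_of_position(const TimelineEntry *history, size_t n, uint64_t pos)
{
    for (size_t i = 0; i < n; i++)
    {
        if (pos >= history[i].begin && pos < history[i].end)
            return history[i].tli;
    }
    return 0;
}
```

With a switch from timeline 1 to timeline 2 in the middle of a segment,
looking up the segment's first page yields timeline 1 while looking up the
record after the switch yields timeline 2, which is why the lookup has to
use the record's position.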

Per report from Fujii Masao.
2013-01-18 11:46:49 +02:00
Heikki Linnakangas 88228e6f1d When xlogreader asks the callback function to read a page, make sure we
get a large enough part of the page to include the beginning of the next
record we're interested in. The XLogPageRead callback uses the requested
length to decide which timeline to stream WAL from, and if the first call
is short, and the page contains a timeline switch, we'll repeatedly try
to stream that page from the old timeline, and never get across the
timeline switch.
2013-01-17 23:46:33 +02:00
Heikki Linnakangas 3684a534ef I added a result set to START_STREAMING command, but neglected walreceiver.
The patch to allow pg_receivexlog to switch timeline added a result set
after copy has ended in START_STREAMING command, to return the next
timeline's ID to the client. But walreceiver didn't get the memo, and threw
an error on the unexpected result set. Fix.
2013-01-17 23:45:45 +02:00
Alvaro Herrera 279628a0a7 Accelerate end-of-transaction dropping of relations
When relations are dropped, at end of transaction we need to remove the
files and clean the buffer pool of buffers containing pages of those
relations.  Previously we would scan the buffer pool once per relation
to clean up buffers.  When there are many relations to drop, the
repeated scans make this process slow; so we now instead pass a list of
relations to drop and scan the pool once, checking each buffer against
the passed list.  When the number of relations is larger than a
threshold (which as of this patch is being set to 20 relations) we sort
the array before starting, and bsearch the array; when it's smaller, we
simply scan the array linearly each time, because that's faster.  The
exact optimal threshold value depends on many factors, but the
difference is not likely to be significant enough to justify making it
user-settable.
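
A minimal sketch of the single-pass approach, assuming a simplified buffer
pool in which each buffer is tagged with one integer relation ID. RelId,
DROP_BSEARCH_THRESHOLD, and drop_buffers() are names invented for this
sketch; only the threshold value of 20 comes from the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a relation identifier. */
typedef unsigned int RelId;

/* Threshold from the commit message; the macro name is made up. */
#define DROP_BSEARCH_THRESHOLD 20

static int
relid_cmp(const void *a, const void *b)
{
    RelId x = *(const RelId *) a;
    RelId y = *(const RelId *) b;

    return (x > y) - (x < y);
}

/* Is "target" in rels[]?  Uses bsearch only if rels[] has been sorted. */
static bool
rel_is_dropped(const RelId *rels, size_t nrels, bool sorted, RelId target)
{
    if (sorted)
        return bsearch(&target, rels, nrels, sizeof(RelId), relid_cmp) != NULL;

    for (size_t i = 0; i < nrels; i++)
    {
        if (rels[i] == target)
            return true;
    }
    return false;
}

/*
 * Scan the pool once, invalidating every valid buffer that belongs to a
 * dropped relation.  Sorts rels[] in place when it is above the threshold.
 * Returns the number of buffers invalidated.
 */
size_t
drop_buffers(RelId *rels, size_t nrels,
             const RelId *buffer_rel, bool *buffer_valid, size_t nbuffers)
{
    bool   use_bsearch = nrels > DROP_BSEARCH_THRESHOLD;
    size_t dropped = 0;

    if (use_bsearch)
        qsort(rels, nrels, sizeof(RelId), relid_cmp);

    for (size_t i = 0; i < nbuffers; i++)
    {
        if (buffer_valid[i] &&
            rel_is_dropped(rels, nrels, use_bsearch, buffer_rel[i]))
        {
            buffer_valid[i] = false;
            dropped++;
        }
    }
    return dropped;
}
```

The pool is walked once regardless of how many relations are dropped; only
the per-buffer membership test changes with the list size.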

This has been measured to be a significant win (a 15x win when dropping
100,000 relations; an extreme case, but reportedly a real one).

Author: Tomas Vondra, some tweaks by me
Reviewed by: Robert Haas, Shigeru Hanada, Andres Freund, Álvaro Herrera
2013-01-17 16:13:17 -03:00
Heikki Linnakangas 0b6329130e Make pg_receivexlog and pg_basebackup -X stream work across timeline switches.
This mirrors the changes done earlier to the server in standby mode. When
receivelog reaches the end of a timeline, as reported by the server, it
fetches the timeline history file of the next timeline, and restarts
streaming from the new timeline by issuing a new START_STREAMING command.

When pg_receivexlog crosses a timeline, it leaves the .partial suffix on the
last segment on the old timeline. This helps you to tell apart a partial
segment left in the directory because of a timeline switch, and a completed
segment. If you just follow a single server, it won't make a difference, but
it can be significant in more complicated scenarios where new WAL is still
generated on the old timeline.

This includes two small changes to the streaming replication protocol:
First, when you reach the end of timeline while streaming, the server now
sends the TLI of the next timeline in the server's history to the client.
pg_receivexlog uses that as the next timeline, so that it doesn't need to
parse the timeline history file like a standby server does. Second, when
BASE_BACKUP command sends the begin and end WAL positions, it now also sends
the timeline IDs corresponding to the positions.
2013-01-17 20:23:00 +02:00
Tom Lane 8ae35e9180 Improve memory space management in tuplesort and tuplestore.
The code originally just doubled the size of the tuple-pointer array so
long as that would fit in allowedMem.  This could result in failing to use
as much as half of allowedMem, if (as is typical) the last doubling attempt
didn't quite fit.  Worse, we might double the array size but be unable to
use most of the added slots, because there was no room left within the
allowedMem limit for tuples the slots should point to.  To fix, double only
so long as we've used less than half of allowedMem in total.  Then do one
more array enlargement, but scale it based on total memory consumption so
far.  This will work nicely as long as the average tuple size is reasonably
stable, and in any case should be better than the old method.
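
The growth rule can be sketched roughly as follows. SortMem and
next_array_size() are hypothetical names for this illustration, and the
sketch treats used_mem as total consumption so far; it is not the actual
tuplesort code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one sort; field names are made up. */
typedef struct
{
    size_t allowed_mem;   /* total memory budget, in bytes */
    size_t used_mem;      /* bytes consumed so far */
    size_t array_slots;   /* current capacity of the tuple-pointer array */
    bool   can_grow;      /* false once the final enlargement is done */
} SortMem;

/*
 * Enlarge the tuple-pointer array and return its new capacity.
 * While less than half the budget is used, simply double.  After that,
 * do one last enlargement scaled by allowed/used and stop growing.
 */
size_t
next_array_size(SortMem *m)
{
    if (!m->can_grow)
        return m->array_slots;

    if (m->used_mem == 0 || m->used_mem < m->allowed_mem / 2)
    {
        m->array_slots *= 2;
    }
    else
    {
        double ratio = (double) m->allowed_mem / (double) m->used_mem;
        size_t scaled = (size_t) ((double) m->array_slots * ratio);

        if (scaled > m->array_slots)
            m->array_slots = scaled;
        m->can_grow = false;
    }
    return m->array_slots;
}
```

The final step assumes the tuples seen so far are representative: if the
current slots consumed used_mem bytes, then about allowed/used times as
many slots should fill the budget.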

This change will result in large sort operations consuming a larger
fraction of work_mem than they typically did in the past.  The release
notes should mention that users may want to revisit their work_mem
settings, if they'd tuned those settings based on the old behavior of
sorting.

Jeff Janes, reviewed by Peter Geoghegan and Robert Haas
2013-01-17 13:12:56 -05:00
Heikki Linnakangas 1296d5c53c Fix a couple of error-handling bugs in the xlogreader patch.
XLogReadRecord should reset its state on every error, to make sure it
re-reads the page on next call. It was inconsistent in that some errors did
that, but some did not.

In ReadRecord(), don't give up on an error if we're in standby mode. The
loop was set up to retry, but the checks within the loop broke out of the
loop on any error.

Andres Freund, with some tweaking by me.
2013-01-17 19:27:04 +02:00
Bruce Momjian b14f81bc9a Add a latex-longtable output format to psql
latex longtable is more powerful than the 'tabular' output format
'latex' uses.  Also add border=3 support to 'latex'.
2013-01-17 11:39:38 -05:00
Magnus Hagander 8ef6961685 Silence compiler warnings 2013-01-17 16:10:33 +01:00
Heikki Linnakangas 9ee4d06f3f Make GiST indexes on-disk compatible with 9.2 again.
The patch that turned XLogRecPtr into a uint64 inadvertently changed the
on-disk format of GiST indexes, because the NSN field in the GiST page
opaque is an XLogRecPtr. That breaks pg_upgrade. Revert the format of that
field back to the two-field struct that XLogRecPtr was before. This is the
same thing we did to LSNs in the page header to avoid changing on-disk format.
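
As an illustration of why the two representations are not interchangeable
on disk: a two-field struct always stores the high half first, whereas a
native 64-bit integer stores its low half first on little-endian machines.
A minimal sketch of converting between them, with OnDiskLSN, lsn_to_disk()
and lsn_from_disk() as made-up names for this example:

```c
#include <stdint.h>

/* Stand-in for the 64-bit in-memory WAL position. */
typedef uint64_t XLogRecPtr64;

/* Hypothetical on-disk form that keeps the old two-field layout. */
typedef struct
{
    uint32_t xlogid;    /* high 32 bits */
    uint32_t xrecoff;   /* low 32 bits */
} OnDiskLSN;

/* Split a 64-bit position into the two-field on-disk form. */
static inline OnDiskLSN
lsn_to_disk(XLogRecPtr64 ptr)
{
    OnDiskLSN v;

    v.xlogid = (uint32_t) (ptr >> 32);
    v.xrecoff = (uint32_t) (ptr & 0xFFFFFFFFu);
    return v;
}

/* Reassemble the 64-bit position from the two-field on-disk form. */
static inline XLogRecPtr64
lsn_from_disk(OnDiskLSN v)
{
    return ((uint64_t) v.xlogid << 32) | (uint64_t) v.xrecoff;
}
```

Code in memory can then work with the 64-bit value throughout and convert
only when reading or writing the page field.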

Bump catversion, as this invalidates any existing GiST indexes built on
9.3devel.
2013-01-17 16:46:16 +02:00
Magnus Hagander bba486f372 Base the default SSL ciphers on DEFAULT instead of ALL
It's better to start from what the OpenSSL people consider a good
default and then remove insecure things (low encryption, exportable
encryption and md5 at this point) from that, instead of starting
from everything that exists and remove from that. We trust the
OpenSSL people to make good choices about what the default is.
2013-01-17 15:04:44 +01:00
Magnus Hagander 4eebf1309f Make size-output fixed length in pg_basebackup verbose mode
This way the line doesn't shift right as the amount of data processed
increases.
2013-01-17 14:43:33 +01:00
Magnus Hagander d7e9ca7ff7 Truncate filenames in the leading end in pg_basebackup verbose output
When truncating at the end, like before, the output would often end up
just showing the path instead of the filename.

Also increase the length of the filename by 5, which still keeps us at
less than 80 characters in most outputs.
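
A small sketch of truncating at the leading end, so that the tail of the
path (usually the file name) stays visible. truncate_leading() is a
hypothetical helper written for this illustration, not the function used
in pg_basebackup.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy "path" into "out", using at most "width" characters.  If the path
 * is too long, drop characters from the front and prefix "..." so the
 * end of the path is what remains.
 */
void
truncate_leading(const char *path, size_t width, char *out, size_t outsize)
{
    size_t len = strlen(path);

    if (len <= width)
        snprintf(out, outsize, "%s", path);
    else if (width <= 3)
        snprintf(out, outsize, "%.*s", (int) width, "...");
    else
        snprintf(out, outsize, "...%s", path + len - (width - 3));
}
```

For example, with a width of 12, "base/16384/12345_fsm" becomes
"...12345_fsm", whereas cutting at the end would leave mostly the
directory part.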
2013-01-17 14:38:49 +01:00
Magnus Hagander f3af53441e Support multiple -t/--table arguments for more commands
On top of the previous support in pg_dump, add support to specify
multiple tables (by using the -t option multiple times) to
pg_restore, clusterdb, reindexdb and vacuumdb.

Josh Kupershmidt, reviewed by Karl O. Pinc
2013-01-17 11:24:47 +01:00
Peter Eisentraut 36bdfa52a0 Get rid of pg_dump's README
It was largely full of outdated and incorrect information.  Move the few
notes which were still relevant into header comments of pg_backup_tar.c
and pg_dumpall.c.

Josh Kupershmidt
2013-01-16 23:49:54 -05:00
Alvaro Herrera 7fcbf6a405 Split out XLog reading as an independent facility
This new facility can not only be used by xlog.c to carry out crash
recovery, but also by external programs.  By supplying a function to
read XLog pages from somewhere, all the WAL reading can be used for
completely different purposes.

For the standard backend use, the behavior should be pretty much the
same as previously.  As for non-backend programs, a hypothetical
pg_xlogdump program is now closer to reality, but some more backend
support is still necessary.

This patch was originally submitted by Andres Freund in a different
form, but Heikki Linnakangas opted for and authored another design of
the concept.  Andres has advanced the patch since Heikki's initial
version.  Review and some (mostly cosmetic) changes by me.
2013-01-16 16:12:53 -03:00
Heikki Linnakangas 8606dd8190 Make \? help message more clear when not connected.
On second thought, "none" could mislead you into thinking that you're
connected to a database with that name. Duplicate the whole string, so that
it can be more easily translated. In back-branches, though, just use an
empty string
in place of the database name, to avoid adding a translatable string.
2013-01-15 22:23:14 +02:00
Heikki Linnakangas b04ce529fd Don't pass NULL to fprintf, if not currently connected to a database.
Backpatch all the way to 8.3. Fixes bug #7811, per report and diagnosis by
Meng Qingzhong.
2013-01-15 19:23:47 +02:00
Alvaro Herrera 7ac5760fa2 Rework order of checks in ALTER / SET SCHEMA
When attempting to move an object into the schema in which it already
was, for most object classes we were correctly complaining about
exactly that ("object is already in schema"); but for some other object
classes, such as functions, we were instead complaining of a name
collision ("object already exists in schema").  The latter is wrong and
misleading, per complaint from Robert Haas in
CA+TgmoZ0+gNf7RDKRc3u5rHXffP=QjqPZKGxb4BsPz65k7qnHQ@mail.gmail.com

To fix, refactor the way these checks are done.  As a bonus, the
resulting code is smaller and can also share some code with Rename
cases.

While at it, remove use of getObjectDescriptionOids() in error messages.
These are normally disallowed because of translatability considerations,
but this one had slipped through since 9.1.  (Not sure that this is
worth backpatching, though, as it would create some untranslated
messages in back branches.)

This is loosely based on a patch by KaiGai Kohei, heavily reworked by
me.
2013-01-15 13:23:43 -03:00