Commit Graph

27055 Commits

Author SHA1 Message Date
Heikki Linnakangas f92d6a540a Use appendStringInfoString/Char et al where appropriate.
Patch by David Rowley. Backpatch to 9.5, as some of the calls were new in
9.5, and keeping the code in sync with master makes future backpatching
easier.
2015-07-02 12:36:03 +03:00
Heikki Linnakangas 7931622d1d Fix name of argument to pg_stat_file.
It's called "missing_ok" in the docs and in the C code.

I refrained from doing a catversion bump for this, because the name of an
input argument is just documentation, it has no effect on any callers.

Michael Paquier
2015-07-02 12:15:13 +03:00
Andrew Dunstan d5d00294b0 Allow MSVC's contribcheck and modulescheck to run independently.
These require a temp install to have been done, so we now make sure it
is done before proceeding.

Michael Paquier.
2015-07-01 23:28:41 -04:00
Fujii Masao fb174687f7 Make use of xlog_internal.h's macros in WAL-related utilities.
Commit 179cdd09 added macros to check if a filename is a WAL segment
or other such file. However there were still some instances of the
strlen + strspn combination to check for that in WAL-related utilities
like pg_archivecleanup. Those checks can be replaced with the macros.

This patch makes use of the macros in those utilities and
which would make the code a bit easier to read.

Back-patch to 9.5.

Michael Paquier
2015-07-02 10:35:38 +09:00
Tom Lane 1e24cf645d Don't leave pg_hba and pg_ident data lying around in running backends.
Free the contexts holding this data after we're done using it, by the
expedient of attaching them to the PostmasterContext which we were
already taking care to delete (and where, indeed, this data used to live
before commits e5e2fc842c and 7c45e3a3c6).  This saves a
probably-usually-negligible amount of space per running backend.  It also
avoids leaving potentially-security-sensitive data lying around in memory
in processes that don't need it.  You'd have to be unusually paranoid to
think that that amounts to a live security bug, so I've not gone so far as
to forcibly zero the memory; but there surely isn't a good reason to keep
this data around.

Arguably this is a memory management bug in the aforementioned commits,
but it doesn't seem important enough to back-patch.
2015-07-01 18:55:39 -04:00
Tom Lane d7c19d6855 Make sampler_random_fract() actually obey its API contract.
This function is documented to return a value in the range (0,1),
which is what its predecessor anl_random_fract() did.  However, the
new version depends on pg_erand48() which returns a value in [0,1).
The possibility of returning zero creates hazards of division by zero
or trying to compute log(0) at some call sites, and it might well
break third-party modules using anl_random_fract() too.  So let's
change it to never return zero.  Spotted by Coverity.

Michael Paquier, cosmetically adjusted by me
2015-07-01 18:07:48 -04:00
Fujii Masao 8217370864 Make XLogFileCopy() look the same as in 9.4.
XLogFileCopy() was changed heavily in commit de76884. However it was
partially reverted in commit 7abc685 and most of those changes to
XLogFileCopy() were no longer needed. Then commit 7cbee7c removed
those unnecessary code, but XLogFileCopy() looked different in master
and 9.4 though the contents are almost the same.

This patch makes XLogFileCopy() look the same in master and back-branches,
which makes back-patching easier, per discussion on pgsql-hackers.
Back-patch to 9.5.

Discussion: 55760844.7090703@iki.fi

Michael Paquier
2015-07-01 10:54:47 +09:00
Tom Lane 019f7813da Stamp shared-library minor version numbers for 9.6. 2015-06-30 14:06:04 -04:00
Tom Lane cf8d65de10 Stamp HEAD as 9.6devel.
Let the hacking begin ...
2015-06-30 14:01:15 -04:00
Tom Lane 131926a52d Remove useless check for NULL subexpression.
Coverity rightly gripes that it's silly to have a test here when
the adjacent ExecEvalExpr() would choke on a NULL expression pointer.

Petr Jelinek
2015-06-30 12:53:54 -04:00
Heikki Linnakangas 302ac7f271 Add assertion to check the special size is sane before dereferencing it.
This seems useful to catch errors of the sort I just fixed, where
PageGetSpecialPointer is called before initializing the page.
2015-06-30 13:44:04 +03:00
Heikki Linnakangas fdf28853ae Don't call PageGetSpecialPointer() on page until it's been initialized.
After calling XLogInitBufferForRedo(), the page might be all-zeros if it was
not in page cache already. btree_xlog_unlink_page initialized the page
correctly, but it called PageGetSpecialPointer before initializing it, which
would lead to a corrupt page at WAL replay, if the unlinked page is not in
page cache.

Backpatch to 9.4, the bug came with the rewrite of B-tree page deletion.
2015-06-30 13:41:30 +03:00
Robert Haas b48ecf862b In bttext_abbrev_convert, move pfree to the right place.
Without this, we might access memory that's already been freed, or
leak memory if in the C locale.

Peter Geoghegan
2015-06-29 23:53:05 -04:00
Heikki Linnakangas 47fe4d25d5 Initialize GIN metapage correctly when replaying metapage-update WAL record.
I broke this with my WAL format refactoring patch. Before that, the metapage
was read from disk, and modified in-place regardless of the LSN. That was
always a bit silly, as there's no need to read the old page version from
disk disk when we're overwriting it anyway. So that was changed in 9.5, but
I failed to add a GinInitPage call to initialize the page-headers correctly.
Usually you wouldn't notice, because the metapage is already in the page
cache and is not zeroed.

One way to reproduce this is to perform a VACUUM on an already vacuumed
table (so that the vacuum has no real work to do), immediately after a
checkpoint, and then perform an immediate shutdown. After recovery, the
page headers of the metapage will be incorrectly all-zeroes.

Reported by Jeff Janes
2015-06-30 00:06:00 +03:00
Tom Lane f78329d594 Stamp 9.5alpha1. 2015-06-29 15:42:18 -04:00
Tom Lane cbc8d65639 Code + docs review for escaping of option values (commit 11a020eb6).
Avoid memory leak from incorrect choice of how to free a StringInfo
(resetStringInfo doesn't do it).  Now that pg_split_opts doesn't scribble
on the optstr, mark that as "const" for clarity.  Attach the commentary in
protocol.sgml to the right place, and add documentation about the
user-visible effects of this change on postgres' -o option and libpq's
PGOPTIONS option.
2015-06-29 12:42:52 -04:00
Andres Freund 07cb8b02ab Replace ia64 S_UNLOCK compiler barrier with a full memory barrier.
_Asm_sched_fence() is just a compiler barrier, not a memory barrier. But
spinlock release on IA64 needs, at the very least, release
semantics. Use a full barrier instead.

This might be the cause for the occasional failures on buildfarm member
anole.

Discussion: 20150629101108.GB17640@alap3.anarazel.de
2015-06-29 14:53:32 +02:00
Peter Eisentraut c5e5d444de Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: fb7e72f46cfafa1b5bfe4564d9686d63a1e6383f
2015-06-28 23:56:55 -04:00
Tom Lane 2bdc51a294 Run the C portions of guc-file.l through pgindent.
Yeah, I know, pretty anal-retentive of me.  But we oughta find some
way to automate this for the .y and .l files.
2015-06-28 20:49:35 -04:00
Tom Lane 62d16c7fc5 Improve design and implementation of pg_file_settings view.
As first committed, this view reported on the file contents as they were
at the last SIGHUP event.  That's not as useful as reporting on the current
contents, and what's more, it didn't work right on Windows unless the
current session had serviced at least one SIGHUP.  Therefore, arrange to
re-read the files when pg_show_all_settings() is called.  This requires
only minor refactoring so that we can pass changeVal = false to
set_config_option() so that it won't actually apply any changes locally.

In addition, add error reporting so that errors that would prevent the
configuration files from being loaded, or would prevent individual settings
from being applied, are visible directly in the view.  This makes the view
usable for pre-testing whether edits made in the config files will have the
desired effect, before one actually issues a SIGHUP.

I also added an "applied" column so that it's easy to identify entries that
are superseded by later entries; this was the main use-case for the original
design, but it seemed unnecessarily hard to use for that.

Also fix a 9.4.1 regression that allowed multiple entries for a
PGC_POSTMASTER variable to cause bogus complaints in the postmaster log.
(The issue here was that commit bf007a27ac unintentionally reverted
3e3f65973a, which suppressed any duplicate entries within
ParseConfigFp.  However, since the original coding of the pg_file_settings
view depended on such suppression *not* happening, we couldn't have fixed
this issue now without first doing something with pg_file_settings.
Now we suppress duplicates by marking them "ignored" within
ProcessConfigFileInternal, which doesn't hide them in the view.)

Lesser changes include:

Drive the view directly off the ConfigVariable list, instead of making a
basically-equivalent second copy of the data.  There's no longer any need
to hang onto the data permanently, anyway.

Convert show_all_file_settings() to do its work in one call and return a
tuplestore; this avoids risks associated with assuming that the GUC state
will hold still over the course of query execution.  (I think there were
probably latent bugs here, though you might need something like a cursor
on the view to expose them.)

Arrange to run SIGHUP processing in a short-lived memory context, to
forestall process-lifespan memory leaks.  (There is one known leak in this
code, in ProcessConfigDirectory; it seems minor enough to not be worth
back-patching a specific fix for.)

Remove mistaken assignment to ConfigFileLineno that caused line counting
after an include_dir directive to be completely wrong.

Add missed failure check in AlterSystemSetConfigFile().  We don't really
expect ParseConfigFp() to fail, but that's not an excuse for not checking.
2015-06-28 18:06:14 -04:00
Heikki Linnakangas d661532e27 Also trigger restartpoints based on max_wal_size on standby.
When archive recovery and restartpoints were initially introduced,
checkpoint_segments was ignored on the grounds that the files restored from
archive don't consume any space in the recovery server. That was changed in
later releases, but even then it was arguably a feature rather than a bug,
as performing restartpoints as often as checkpoints during normal operation
might be excessive, but you might nevertheless not want to waste a lot of
space for pre-allocated WAL by setting checkpoint_segments to a high value.
But now that we have separate min_wal_size and max_wal_size settings, you
can bound WAL usage with max_wal_size, and still avoid consuming excessive
space usage by setting min_wal_size to a lower value, so that argument is
moot.

There are still some issues with actually limiting the space usage to
max_wal_size: restartpoints in recovery can only start after seeing the
checkpoint record, while a checkpoint starts flushing buffers as soon as
the redo-pointer is set. Restartpoint is paced to happen at the same
leisurily speed, determined by checkpoint_completion_target, as checkpoints,
but because they are started later, max_wal_size can be exceeded by upto
one checkpoint cycle's worth of WAL, depending on
checkpoint_completion_target. But that seems better than not trying at all,
and max_wal_size is a soft limit anyway.

The documentation already claimed that max_wal_size is obeyed in recovery,
so this just fixes the behaviour to match the docs. However, add some
weasel-words there to mention that max_wal_size may well be exceeded by
some amount in recovery.
2015-06-29 00:09:10 +03:00
Heikki Linnakangas a32c3ec893 Promote the assertion that XLogBeginInsert() is not called twice into ERROR.
Seems like cheap insurance for WAL bugs. A spurious call to
XLogBeginInsert() in itself would be fairly harmless, but if there is any
data registered and the insertion is not completed/cancelled properly, there
is a risk that the data ends up in a wrong WAL record.

Per Jeff Janes's suggestion.
2015-06-28 22:30:39 +03:00
Heikki Linnakangas a45c70acf3 Fix double-XLogBeginInsert call in GIN page splits.
If data checksums or wal_log_hints is on, and a GIN page is split, the code
to find a new, empty, block was called after having already called
XLogBeginInsert(). That causes an assertion failure or PANIC, if finding the
new block involves updating a FSM page that had not been modified since last
checkpoint, because that update is WAL-logged, which calls XLogBeginInsert
again. Nested XLogBeginInsert calls are not supported.

To fix, rearrange GIN code so that XLogBeginInsert is called later, after
finding the victim buffers.

Reported by Jeff Janes.
2015-06-28 22:16:21 +03:00
Heikki Linnakangas b36805f3c5 Don't choke on files that are removed while pg_rewind runs.
If a file is removed from the source server, while pg_rewind is running, the
invocation of pg_read_binary_file() will fail. Use the just-added missing_ok
option to that function, to have it return NULL instead, and handle that
gracefully. And similarly for pg_ls_dir and pg_stat_file.

Reported by Fujii Masao, fix by Michael Paquier.
2015-06-28 21:35:51 +03:00
Heikki Linnakangas cb2acb1081 Add missing_ok option to the SQL functions for reading files.
This makes it possible to use the functions without getting errors, if there
is a chance that the file might be removed or renamed concurrently.
pg_rewind needs to do just that, although this could be useful for other
purposes too. (The changes to pg_rewind to use these functions will come in
a separate commit.)

The read_binary_file() function isn't very well-suited for extensions.c's
purposes anymore, if it ever was. So bite the bullet and make a copy of it
in extension.c, tailored for that use case. This seems better than the
accidental code reuse, even if it's a some more lines of code.

Michael Paquier, with plenty of kibitzing by me.
2015-06-28 21:35:46 +03:00
Kevin Grittner cca8ba9529 Fix comment for GetCurrentIntegerTimestamp().
The unit of measure is microseconds, not milliseconds.

Backpatch to 9.3 where the function and its comment were added.
2015-06-28 12:43:59 -05:00
Tatsuo Ishii 527e6d3f09 Fix function declaration style to respect the coding standard. 2015-06-28 18:54:27 +09:00
Tom Lane 0a52d378b0 Avoid passing NULL to memcmp() in lookups of zero-argument functions.
A few places assumed they could pass NULL for the argtypes array when
looking up functions known to have zero arguments.  At first glance
it seems that this should be safe enough, since memcmp() is surely not
allowed to fetch any bytes if its count argument is zero.  However,
close reading of the C standard says that such calls have undefined
behavior, so we'd probably best avoid it.

Since the number of places doing this is quite small, and some other
places looking up zero-argument functions were already passing dummy
arrays, let's standardize on the latter solution rather than hacking
the function lookup code to avoid calling memcmp() in these cases.
I also added Asserts to catch any future violations of the new rule.

Given the utter lack of any evidence that this actually causes any
problems in the field, I don't feel a need to back-patch this change.

Per report from Piotr Stefaniak, though this is not his patch.
2015-06-27 17:47:39 -04:00
Kevin Grittner 604e99396d Add opaque declaration of HTAB to tqual.h.
Commit b89e151054 added the
ResolveCminCmaxDuringDecoding declaration to tqual.h, which uses an
HTAB parameter, without declaring HTAB.  It accidentally fails to
fail to build with current sources because a declaration happens to
be included, directly or indirectly, in all source files that
currently use tqual.h before tqual.h is first included, but we
shouldn't count on that.  Since an opaque declaration is enough
here, just use that, as was done in snapmgr.h.

Backpatch to 9.4, where the HTAB reference was added to tqual.h.
2015-06-27 09:55:06 -05:00
Heikki Linnakangas 7845db2aa7 Fix typo in comment
Etsuro Fujita
2015-06-27 10:17:42 +03:00
Simon Riggs 66fbcb0d2e Avoid hot standby cancels from VAC FREEZE
VACUUM FREEZE generated false cancelations of standby queries on an
otherwise idle master. Caused by an off-by-one error on cutoff_xid
which goes back to original commit.

Backpatch to all versions 9.0+

Analysis and report by Marco Nenciarini

Bug fix by Simon Riggs
2015-06-27 00:41:47 +01:00
Alvaro Herrera 7d60b2af34 Fix DDL command collection for TRANSFORM
Commit b488c580ae, which added the DDL command collection feature,
neglected to update the code that commit cac7658205 had previously
added two weeks earlier for the TRANSFORM feature.

Reported by Michael Paquier.
2015-06-26 18:17:54 -03:00
Alvaro Herrera 4028222468 Fix BRIN xlog replay
There was a confusion about which block number to use when storing an
item's pointer in the revmap -- the revmap page's blkno was being used,
not the data page's blkno.

Spotted-by: Jeff Janes
2015-06-26 18:13:05 -03:00
Robert Haas 8f15f74a44 Be more conservative about removing tablespace "symlinks".
Don't apply rmtree(), which will gleefully remove an entire subtree,
and don't even apply unlink() unless it's symlink or a directory,
the only things that we expect to find.

Amit Kapila, with minor tweaks by me, per extensive discussions
involving Andrew Dunstan, Fujii Masao, and Heikki Linnakangas,
at least some of whom also reviewed the code.
2015-06-26 15:53:13 -04:00
Robert Haas 8a8c581a8c Remove unnecessary NULL test.
Spotted by Coverity and reported by Michael Paquier.  Per discussion,
we don't necessarily care about making Coverity happy in all such
instances, but we can go ahead and change them where it otherwise
seems to improve the code.
2015-06-26 14:46:48 -04:00
Robert Haas 9043ef390f Don't warn about creating temporary or unlogged hash indexes.
Warning people that no WAL-logging will be done doesn't make sense
in this case.

Michael Paquier
2015-06-26 11:37:32 -04:00
Robert Haas 91118f1a59 Reduce log level for background worker events from LOG to DEBUG1.
Per discussion, LOG is just too chatty for something that will happen
as routinely as this.

Pavel Stehule
2015-06-26 11:23:32 -04:00
Andres Freund 1b468a131b Fix the fallback memory barrier implementation to be reentrant.
This was essentially "broken" since 0c8eda62; but until more
recently (14e8803f) barriers usage in signal handlers was infrequent.

The failure to be reentrant was noticed because the test_shm_mq, which
uses memory barriers at a high frequency, occasionally got stuck on some
solaris buildfarm animals. Turns out, those machines use sun studio
12.1, which doesn't yet have efficient memory barrier support. A machine
with a newer sun studio did not fail.  Forcing the barrier fallback to
be used on x86 allows to reproduce the problem.

The new fallback is to use kill(PostmasterPid, 0) based on the theory
that that'll always imply a barrier due to checking the liveliness of
PostmasterPid on systems old enough to need fallback support. It's hard
to come up with a good and performant fallback.

I'm not backpatching this for now - the problem isn't active in the back
branches, and we haven't backpatched barrier changes for
now. Additionally master looks entirely different than the back branches
due to the new atomics abstraction. It seems better to let this rest in
master, where the non-reentrancy actively causes a problem, and then
consider backpatching.

Found-By: Robert Haas
Discussion: 55626265.3060800@dunslane.net
2015-06-26 17:00:38 +02:00
Robert Haas 5ca611841b Improve handling of CustomPath/CustomPlan(State) children.
Allow CustomPath to have a list of paths, CustomPlan a list of plans,
and CustomPlanState a list of planstates known to the core system, so
that custom path/plan providers can more reasonably use this
infrastructure for nodes with multiple children.

KaiGai Kohei, per a design suggestion from Tom Lane, with some
further kibitzing by me.
2015-06-26 09:40:47 -04:00
Heikki Linnakangas 4b8e24b9ad Fix a couple of bugs with wal_log_hints.
1. Replay of the WAL record for setting a bit in the visibility map
contained an assertion that a full-page image of that record type can only
occur with checksums enabled. But it can also happen with wal_log_hints, so
remove the assertion. Unlike checksums, wal_log_hints can be changed on the
fly, so it would be complicated to figure out if it was enabled at the time
that the WAL record was generated.

2. wal_log_hints has the same effect on the locking needed to read the LSN
of a page as data checksums. BufferGetLSNAtomic() didn't get the memo.

Backpatch to 9.4, where wal_log_hints was added.
2015-06-26 12:38:24 +03:00
Robert Haas f7bb7f0625 Allow background workers to connect to no particular database.
The documentation claims that this is supported, but it didn't
actually work.  Fix that.

Reported by Pavel Stehule; patch by me.
2015-06-25 15:52:13 -04:00
Tom Lane 5d1ff6bd55 Fix the logic for putting relations into the relcache init file.
Commit f3b5565dd4 was a couple of bricks shy
of a load; specifically, it missed putting pg_trigger_tgrelid_tgname_index
into the relcache init file, because that index is not used by any
syscache.  However, we have historically nailed that index into cache for
performance reasons.  The upshot was that load_relcache_init_file always
decided that the init file was busted and silently ignored it, resulting
in a significant hit to backend startup speed.

To fix, reinstantiate RelationIdIsInInitFile() as a wrapper around
RelationSupportsSysCache(), which can know about additional relations
that should be in the init file despite being unknown to syscache.c.

Also install some guards against future mistakes of this type: make
write_relcache_init_file Assert that all nailed relations get written to
the init file, and make load_relcache_init_file emit a WARNING if it takes
the "wrong number of nailed relations" exit path.  Now that we remove the
init files during postmaster startup, that case should never occur in the
field, even if we are starting a minor-version update that added or removed
rels from the nailed set.  So the warning shouldn't ever be seen by end
users, but it will show up in the regression tests if somebody breaks this
logic.

Back-patch to all supported branches, like the previous commit.
2015-06-25 14:39:05 -04:00
Robert Haas 51d0fe5d56 Update get_relation_info comment.
Thomas Munro
2015-06-23 10:09:53 -04:00
Heikki Linnakangas 9cb36981fb Add missing newline to debug-message.
Michael Paquier
2015-06-23 15:49:28 +03:00
Peter Eisentraut e98d635d5d pg_rewind: Improve message wording 2015-06-22 20:40:41 -04:00
Peter Eisentraut 747781f25e pg_basebackup: Remove redundant newline in error message 2015-06-22 20:40:40 -04:00
Tom Lane 2cb9ec1bcb Improve inheritance_planner()'s performance for large inheritance sets.
Commit c03ad5602f introduced a planner
performance regression for UPDATE/DELETE on large inheritance sets.
It required copying the append_rel_list (which is of size proportional to
the number of inherited tables) once for each inherited table, thus
resulting in O(N^2) time and memory consumption.  While it's difficult to
avoid that in general, the extra work only has to be done for
append_rel_list entries that actually reference subquery RTEs, which
inheritance-set entries will not.  So we can buy back essentially all of
the loss in cases without subqueries in FROM; and even for those, the added
work is mainly proportional to the number of UNION ALL subqueries.

Back-patch to 9.2, like the previous commit.

Tom Lane and Dean Rasheed, per a complaint from Thomas Munro.
2015-06-22 18:53:27 -04:00
Robert Haas da9ee026a0 psql: Add some tab completion for TABLESAMPLE.
Petr Jelinek, reviewed by Brendan Jurd
2015-06-22 14:15:32 -04:00
Noah Misch 4318118edd Truncate strings in tarCreateHeader() with strlcpy(), not sprintf().
This supplements the GNU libc bug #6530 workarounds introduced in commit
54cd4f0457.  On affected systems, a
tar-format pg_basebackup failed when some filename beneath the data
directory was not valid character data in the postmaster/walsender
locale.  Back-patch to 9.1, where pg_basebackup was introduced.  Extant,
bug-prone conversion specifications receive only ASCII bytes or involve
low-importance messages.
2015-06-21 20:04:36 -04:00
Alvaro Herrera ad89a5d115 Add transforms to pg_get_object_address and friends
This was missed when transforms were added by commit cac7658205.

Extracted from a larger patch
Author: Michael Paquier
2015-06-21 16:08:49 -03:00
Andres Freund 667912aee6 Improve multixact emergency autovacuum logic.
Previously autovacuum was not necessarily triggered if space in the
members slru got tight. The first problem was that the signalling was
tied to values in the offsets slru, but members can advance much
faster. Thats especially a problem if old sessions had been around that
previously prevented the multixact horizon to increase. Secondly the
skipping logic doesn't work if the database was restarted after
autovacuum was triggered - that knowledge is not preserved across
restart. This is especially a problem because it's a common
panic-reaction to restart the database if it gets slow to
anti-wraparound vacuums.

Fix the first problem by separating the logic for members from
offsets. Trigger autovacuum whenever a multixact crosses a segment
boundary, as the current member offset increases in irregular values, so
we can't use a simple modulo logic as for offsets.  Add a stopgap for
the second problem, by signalling autovacuum whenver ERRORing out
because of boundaries.

Discussion: 20150608163707.GD20772@alap3.anarazel.de

Backpatch into 9.3, where it became more likely that multixacts wrap
around.
2015-06-21 18:57:28 +02:00
Andres Freund 90231cd518 Add missing check for wal_debug GUC.
9a20a9b2 added a new elog(), enabled when WAL_DEBUG is defined. The
other WAL_DEBUG dependant messages check for the wal_debug GUC, but this
one did not. While at it replace 'upto' with 'up to'.

Discussion: 20150610110253.GF3832@alap3.anarazel.de

Backpatch to 9.4, the first release containing 9a20a9b2.
2015-06-21 18:37:09 +02:00
Peter Eisentraut 103382abf8 PL/Perl: Add alternative expected file for Perl 5.22 2015-06-21 10:37:24 -04:00
Noah Misch f0a264a362 Fix failure to copy setlocale() return value.
POSIX permits setlocale() calls to invalidate any previous setlocale()
return values, but commit 5f538ad004
neglected to account for setlocale(LC_CTYPE, NULL) doing so.  The effect
was to set the LC_CTYPE environment variable to an unintended value.
pg_perm_setlocale() sets this variable to assist PL/Perl; without it,
Perl would undo PostgreSQL's locale settings.  The known-affected
configurations are 32-bit, release builds using Visual Studio 2012 or
Visual Studio 2013.  Visual Studio 2010 is unaffected, as were all
buildfarm-attested configurations.  In principle, this bug could leave
the wrong LC_CTYPE in effect after PL/Perl use, which could in turn
facilitate problems like corrupt tsvector datums.  No known platform
experiences that consequence, because PL/Perl on Windows does not use
this environment variable.

The bug has been user-visible, as early postmaster failure, on systems
with Windows ANSI code page set to CP936 for "Chinese (Simplified, PRC)"
and probably on systems using other multibyte code pages.
(SetEnvironmentVariable() rejects values containing character data not
valid under the Windows ANSI code page.)  Back-patch to 9.4, where the
faulty commit first appeared.

Reported by Didi Hu and 林鹏程.  Reviewed by Tom Lane, though this fix
strategy was not his first choice.
2015-06-20 12:09:29 -04:00
Noah Misch 1f2a378de4 Revert "Detect setlocale(LC_CTYPE, NULL) clobbering previous return values."
This reverts commit b76e76be46.  The
buildfarm yielded no related failures.
2015-06-20 12:08:48 -04:00
Alvaro Herrera 3c400a3f2b Fix thinko in comment (launcher -> worker) 2015-06-20 11:45:59 -03:00
Tom Lane 48913db887 In immediate shutdown, postmaster should not exit till children are gone.
This adjusts commit 82233ce7ea so that the
postmaster does not exit until all its child processes have exited, even
if the 5-second timeout elapses and we have to send SIGKILL.  There is no
great value in having the postmaster process quit sooner, and doing so can
mislead onlookers into thinking that the cluster is fully terminated when
actually some child processes still survive.

This effect might explain recent test failures on buildfarm member hamster,
wherein we failed to restart a cluster just after shutting it down with
"pg_ctl stop -m immediate".

I also did a bit of code review/beautification, including fixing a faulty
use of the Max() macro on a volatile expression.

Back-patch to 9.4.  In older branches, the postmaster never waited for
children to exit during immediate shutdowns, and changing that would be
too much of a behavioral change.
2015-06-19 14:23:39 -04:00
Alvaro Herrera da1a9d0f5b Clamp autovacuum launcher sleep time to 5 minutes
This avoids the problem that it might go to sleep for an unreasonable
amount of time in unusual conditions like the server clock moving
backwards an unreasonable amount of time.

(Simply moving the server clock forward again doesn't solve the problem
unless you wake up the autovacuum launcher manually, say by sending it
SIGHUP).

Per trouble report from Prakash Itnal in
https://www.postgresql.org/message-id/CAHC5u79-UqbapAABH2t4Rh2eYdyge0Zid-X=Xz-ZWZCBK42S0Q@mail.gmail.com

Analyzed independently by Haribabu Kommi and Tom Lane.
2015-06-19 12:44:36 -03:00
Tom Lane be87143fe9 Fix bogus range_table_mutator() logic for RangeTblEntry.tablesample.
Must make a copy of the TableSampleClause node; the previous coding
modified the input data structure in-place.

Petr Jelinek
2015-06-19 11:41:56 -04:00
Robert Haas ed16f73c57 Fix corner case in autovacuum-forcing logic for multixact wraparound.
Since find_multixact_start() relies on SimpleLruDoesPhysicalPageExist(),
and that function looks only at the on-disk state, it's possible for it
to fail to find a page that exists in the in-memory SLRU that has not
been written yet.  If that happens, SetOffsetVacuumLimit() will
erroneously decide to force emergency autovacuuming immediately.

We should probably fix find_multixact_start() to consider the data
cached in memory as well as on the on-disk state, but that's no excuse
for SetOffsetVacuumLimit() to be stupid about the case where it can
no longer read the value after having previously succeeded in doing so.

Report by Andres Freund.
2015-06-19 11:28:30 -04:00
Robert Haas 86e4751786 Add PASSWORD to tab completions for CREATE/ALTER ROLE/USER/GROUP.
Jeevan Chalke
2015-06-19 11:11:22 -04:00
Robert Haas ca3f43aa48 Change TAP test framework to not rely on having a chmod executable.
This might not work at all on Windows, and is not ever efficient.

Michael Paquier
2015-06-19 10:52:00 -04:00
Noah Misch b76e76be46 Detect setlocale(LC_CTYPE, NULL) clobbering previous return values.
POSIX permits setlocale() calls to invalidate any previous setlocale()
return values.  Commit 5f538ad004
neglected to account for that.  In advance of fixing that bug, switch to
failing hard on affected configurations.  This is a planned temporary
commit to assay buildfarm-represented configurations.
2015-06-17 08:13:33 -04:00
Andrew Dunstan 41d798a139 Fix comment in fmgr.h to refer to actual function used.
FunctionLookup() is long gone if it ever existed, and fmgr_info() is
what's now used, so the comments now reflect that.
2015-06-15 23:21:03 -04:00
Michael Meskes 94a484222c Check for out of memory when allocating sqlca.
Patch by Michael Paquier
2015-06-15 14:21:03 +02:00
Michael Meskes af0b49fc98 Fix memory leak in ecpglib's connect function.
Patch by Michael Paquier
2015-06-15 14:20:09 +02:00
Andrew Dunstan 2271d002d5 Fix "path" infrastructure bug affecting jsonb_set()
jsonb_set() and other clients of the setPathArray() utility function
could get spurious results when an array integer subscript is provided
that is not within the range of int.

To fix, ensure that the value returned by strtol() within setPathArray()
is within the range of int;  when it isn't, assume an invalid input in
line with existing, similar cases.  The path-orientated operators that
appeared in PostgreSQL 9.3 and 9.4 do not call setPathArray(), and
already independently take this precaution, so no change there.

Peter Geoghegan
2015-06-12 19:26:03 -04:00
Tom Lane ae58f1430a Fix failure to cover scalar-vs-rowtype cases in exec_stmt_return().
In commit 9e3ad1aac5 I modified plpgsql
to use exec_stmt_return's simple-variables fast path in more cases.
However, I overlooked that there are really two different return
conventions in use here, depending on whether estate->retistuple is true,
and the existing fast-path code had only bothered to handle one of them.
So trying to return a scalar in a function returning composite, or vice
versa, could lead to unexpected error messages (typically "cache lookup
failed for type 0") or to a null-pointer-dereference crash.

In the DTYPE_VAR case, we can just throw error if retistuple is true,
corresponding to what happens in the general-expression code path that was
being used previously.  (Perhaps someday both of these code paths should
attempt a coercion, but today is not that day.)

In the REC and ROW cases, just hand the problem to exec_eval_datum()
when not retistuple.  Also clean up the ROW coding slightly so it looks
more like exec_eval_datum().

The previous commit also caused exec_stmt_return_next() to be used in
more cases, but that code seems to be OK as-is.

Per off-list report from Serge Rielau.  This bug is new in 9.5 so no need
to back-patch.
2015-06-12 13:44:06 -04:00
Tom Lane b00982344a Improve error message and hint for ALTER COLUMN TYPE can't-cast failure.
We already tried to improve this once, but the "improved" text was rather
off-target if you had provided a USING clause.  Also, it seems helpful
to provide the exact text of a suggested USING clause, so users can just
copy-and-paste it when needed.  Per complaint from Keith Rarick and a
suggestion from Merlin Moncure.

Back-patch to 9.2 where the current wording was adopted.
2015-06-12 11:54:03 -04:00
Fujii Masao b5fe62038f Make postmaster restart archiver soon after it dies, even during recovery.
After the archiver dies, postmaster tries to start a new one immediately.
But previously this could happen only while server was running normally
even though archiving was enabled always (i.e., archive_mode was set to
always). So the archiver running during recovery could not restart soon
after it died. This is an oversight in commit ffd3774.

This commit changes reaper(), postmaster's signal handler to cleanup
after a child process dies, so that it tries to a new archiver even during
recovery if necessary.

Patch by me. Review by Alvaro Herrera.
2015-06-12 23:11:51 +09:00
Michael Meskes 96ad72d1c0 Fixed some memory leaks in ECPG.
Patch by Michael Paquier
2015-06-12 14:52:55 +02:00
Michael Meskes 82be1bf509 Fix intoasc() in Informix compat lib. This function used to be a noop.
Patch by Michael Paquier
2015-06-12 14:50:47 +02:00
Fujii Masao 091c02a958 Fix alphabetization in catalogs.sgml.
System catalogs and views should be listed alphabetically
in catalog.sgml, but only pg_file_settings view not.

This patch also fixes typos in pg_file_settings comments.
2015-06-12 12:59:29 +09:00
Fujii Masao cd3cff4778 Clean up useless mention of RMGRDESCSOURCES in pg_rewind Makefile.
RMGRDESCSOURCES is defined and used only in pg_xlogdump Makefile,
but pg_rewind Makefile mentioned it as extra files to remove in "make clean".
This patch removes that useless mention from pg_rewind Makefile.

Michael Paquier
2015-06-12 12:32:48 +09:00
Andrew Dunstan 908e234733 Rename jsonb - text[] operator to #- to avoid ambiguity.
Following recent discussion  on -hackers. The underlying function is
also renamed to jsonb_delete_path. The regression tests now don't need
ugly type casts to avoid the ambiguity, so they are also removed.

Catalog version bumped.
2015-06-11 10:06:58 -04:00
Fujii Masao 966c37fdb5 Fix some issues in pg_rewind.
* Remove invalid option character "N" from the third argument (valid option
string) of getopt_long().

* Use pg_free() or pfree() to free the memory allocated by pg_malloc() or
palloc() instead of always using free().

* Assume problem is no disk space if write() fails but doesn't set errno.

* Fix several typos.

Patch by me. Review by Michael Paquier.
2015-06-11 22:31:18 +09:00
Peter Eisentraut 385522c7dc Fix typo 2015-06-10 21:30:17 -04:00
Kevin Grittner 870681017a Fix typo in comment.
Backpatch to 9.4 to minimize possible conflicts.
2015-06-10 17:03:56 -05:00
Fujii Masao ea9c4c1e4a Fix typo in comment.
David Rowley
2015-06-10 15:26:02 +09:00
Tom Lane f6e9cbfd91 Report more information if pg_perm_setlocale() fails at startup.
We don't know why a few Windows users have seen this fail, but the
taciturnity of the error message certainly isn't helping debug it.
Let's at least find out which LC category isn't working.
2015-06-09 13:37:08 -04:00
Alvaro Herrera 94232c909d Fix typos
tablesapce -> tablespace
there -> their

These were introduced in 72d422a52, so no need to backpatch.
2015-06-08 15:37:42 -03:00
Fujii Masao 7abc685974 Refactor WAL segment copying code.
* Remove unused argument "dstfname" and related code from XLogFileCopy().

* Previously XLogFileCopy() returned a pstrdup'd string so that
InstallXLogFileSegment() used it later. Since the pstrdup'd string was never
free'd, there could be a risk of memory leak. It was almost harmless because
the startup process exited just after calling XLogFileCopy(), it existed.
This commit changes XLogFileCopy() so that it directly calls
InstallXLogFileSegment() and doesn't call pstrdup() at all. Which fixes that
memory leak problem.

* Extend InstallXLogFileSegment() so that the caller can specify the log level.
Which allows us to emit an error when InstallXLogFileSegment() fails a disk
file access like link() and rename(). Previously it was always logged with
LOG level and additionally needed to be logged with ERROR when we wanted
to treat it as an error.

Michael Paquier
2015-06-09 03:03:24 +09:00
Andres Freund d1b958218a Allow HotStandbyActiveInReplay() to be called in single user mode.
HotStandbyActiveInReplay, introduced in 061b079f, only allowed WAL
replay to happen in the startup process, missing the single user case.

This buglet is fairly harmless as it only causes problems when single
user mode in an assertion enabled build is used to replay a btree vacuum
record.

Backpatch to 9.2. 061b079f was backpatched further, but the assertion
was not.
2015-06-08 14:09:27 +02:00
Andrew Dunstan b81c7b4098 Desupport jsonb subscript deletion on objects
Supporting deletion of JSON pairs within jsonb objects using an
array-style integer subscript allowed for surprising outcomes.  This was
mostly due to the implementation-defined ordering of pairs within
objects for jsonb.

It also seems desirable to make jsonb integer subscript deletion
consistent with the 9.4 era general purpose integer subscripting
operator for jsonb (although that operator returns NULL when an object
is encountered, while we prefer here to throw an error).

Peter Geoghegan, following discussion on -hackers.
2015-06-07 20:46:00 -04:00
Tom Lane f3b5565dd4 Use a safer method for determining whether relcache init file is stale.
When we invalidate the relcache entry for a system catalog or index, we
must also delete the relcache "init file" if the init file contains a copy
of that rel's entry.  The old way of doing this relied on a specially
maintained list of the OIDs of relations present in the init file: we made
the list either when reading the file in, or when writing the file out.
The problem is that when writing the file out, we included only rels
present in our local relcache, which might have already suffered some
deletions due to relcache inval events.  In such cases we correctly decided
not to overwrite the real init file with incomplete data --- but we still
used the incomplete initFileRelationIds list for the rest of the current
session.  This could result in wrong decisions about whether the session's
own actions require deletion of the init file, potentially allowing an init
file created by some other concurrent session to be left around even though
it's been made stale.

Since we don't support changing the schema of a system catalog at runtime,
the only likely scenario in which this would cause a problem in the field
involves a "vacuum full" on a catalog concurrently with other activity, and
even then it's far from easy to provoke.  Remarkably, this has been broken
since 2002 (in commit 7863404417), but we had
never seen a reproducible test case until recently.  If it did happen in
the field, the symptoms would probably involve unexpected "cache lookup
failed" errors to begin with, then "could not open file" failures after the
next checkpoint, as all accesses to the affected catalog stopped working.
Recovery would require manually removing the stale "pg_internal.init" file.

To fix, get rid of the initFileRelationIds list, and instead consult
syscache.c's list of relations used in catalog caches to decide whether a
relation is included in the init file.  This should be a tad more efficient
anyway, since we're replacing linear search of a list with ~100 entries
with a binary search.  It's a bit ugly that the init file contents are now
so directly tied to the catalog caches, but in practice that won't make
much difference.

Back-patch to all supported branches.
2015-06-07 15:32:09 -04:00
Tom Lane 1497369e5d Get rid of a //-style comment.
Not sure how "//XXX" got into a committed patch in the first place,
as it's both content-free and against project style.  pgindent made a
bit of a hash of it, too.

Going forward, we should have at least one buildfarm member using
"gcc -ansi" to catch such things, at least till such time as we
decide the project target language isn't C90 any more.  I've turned
this option on on dromedary.
2015-06-05 17:04:07 -04:00
Tom Lane ac23b711dd Fix incorrect order of database-locking operations in InitPostgres().
We should set MyProc->databaseId after acquiring the per-database lock,
not beforehand.  The old way risked deadlock against processes trying to
copy or delete the target database, since they would first acquire the lock
and then wait for processes with matching databaseId to exit; that left a
window wherein an incoming process could set its databaseId and then block
on the lock, while the other process had the lock and waited in vain for
the incoming process to exit.

CountOtherDBBackends() would time out and fail after 5 seconds, so this
just resulted in an unexpected failure not a permanent lockup, but it's
still annoying when it happens.  A real-world example of a use-case is that
short-duration connections to a template database should not cause CREATE
DATABASE to fail.

Doing it in the other order should be fine since the contract has always
been that processes searching the ProcArray for a database ID must hold the
relevant per-database lock while searching.  Thus, this actually removes
the former race condition that required an assumption that storing to
MyProc->databaseId is atomic.

It's been like this for a long time, so back-patch to all active branches.
2015-06-05 13:22:27 -04:00
Robert Haas 068cfadf9e Cope with possible failure of the oldest MultiXact to exist.
Recent commits, mainly b69bf30b9b and
53bb309d2d, introduced mechanisms to
protect against wraparound of the MultiXact member space: the number
of multixacts that can exist at one time is limited to 2^32, but the
total number of members in those multixacts is also limited to 2^32,
and older code did not take care to enforce the second limit,
potentially allowing old data to be overwritten while it was still
needed.

Unfortunately, these new mechanisms failed to account for the fact
that the code paths in which they run might be executed during
recovery or while the cluster was in an inconsistent state.  Also,
they failed to account for the fact that users who used pg_upgrade
to upgrade a PostgreSQL version between 9.3.0 and 9.3.4 might have
might oldestMultiXid = 1 in the control file despite the true value
being larger.

To fix these problems, first, avoid unnecessarily examining the
mmembers of MultiXacts when the cluster is not known to be consistent.
TruncateMultiXact has done this for a long time, and this patch does
not fix that.  But the new calls used to prevent member wraparound
are not needed until we reach normal running, so avoid calling them
earlier.  (SetMultiXactIdLimit is actually called before InRecovery
is set, so we can't rely on that; we invent our own multixact-specific
flag instead.)

Second, make failure to look up the members of a MultiXact a non-fatal
error.  Instead, if we're unable to determine the member offset at
which wraparound would occur, postpone arming the member wraparound
defenses until we are able to do so.  If we're unable to determine the
member offset that should force autovacuum, force it continuously
until we are able to do so.  If we're unable to deterine the member
offset at which we should truncate the members SLRU, log a message and
skip truncation.

An important consequence of these changes is that anyone who does have
a bogus oldestMultiXid = 1 value in pg_control will experience
immediate emergency autovacuuming when upgrading to a release that
contains this fix.  The release notes should highlight this fact.  If
a user has no pg_multixact/offsets/0000 file, but has oldestMultiXid = 1
in the control file, they may wish to vacuum any tables with
relminmxid = 1 prior to upgrading in order to avoid an immediate
emergency autovacuum after the upgrade.  This must be done with a
PostgreSQL version 9.3.5 or newer and with vacuum_multixact_freeze_min_age
and vacuum_multixact_freeze_table_age set to 0.

This patch also adds an additional log message at each database server
startup, indicating either that protections against member wraparound
have been engaged, or that they have not.  In the latter case, once
autovacuum has advanced oldestMultiXid to a sane value, the message
indicating that the guards have been engaged will appear at the next
checkpoint.  A few additional messages have also been added at the DEBUG1
level so that the correct operation of this code can be properly audited.

Along the way, this patch fixes another, related bug in TruncateMultiXact
that has existed since PostgreSQL 9.3.0: when no MultiXacts exist at
all, the truncation code looks up NextMultiXactId, which doesn't exist
yet.  This can lead to TruncateMultiXact removing every file in
pg_multixact/offsets instead of keeping one around, as it should.
This in turn will cause the database server to refuse to start
afterwards.

Patch by me.  Review by Álvaro Herrera, Andres Freund, Noah Misch, and
Thomas Munro.
2015-06-05 09:31:57 -04:00
Tom Lane 1d27842519 Second try at stabilizing query plans in rowsecurity regression test.
This reverts commit 5cdf25e168,
which was almost immediately proven insufficient by the buildfarm.

On second thought, the tables involved are not large enough that
autovacuum or autoanalyze would notice them; what seems far more
likely to be the culprit is the database-wide "vacuum analyze"
in the concurrent gist test.  That thing has given us one headache
too many, so get rid of it in favor of targeted vacuuming of that
test's own tables only.
2015-06-04 16:42:23 -04:00
Tom Lane 1676e4381f Fix brin regression test so it actually tests cidr.
The problem noted in my previous commit was simpler than I thought:
we weren't getting an index plan because the column wasn't indexed.
2015-06-04 15:24:22 -04:00
Tom Lane 79454c696b Tighten the per-operator testing done in brin regression test.
Verify that the number of matches is exactly what it should be, not just
that it not be zero.  This should help us detect any environment-dependent
issues.

Also, verify that we're getting the expected type of scan plan (either
bitmap or seqscan as appropriate).  Right now, this is failing on the
cidrcol test cases, as shown in the output file.  I'll look into that
in a bit, but it seems good to commit this as-is temporarily to verify
that it behaves as expected on the buildfarm.
2015-06-04 14:39:52 -04:00
Tom Lane 78e72794a7 Fix brin "char" test to actually test what it meant to test.
Casting to char, without quotes, does not give the same results as casting
to "char".  That meant we were not testing the brin "char" paths at all,
since we ended up with a text operator not a "char" operator.
2015-06-04 13:50:32 -04:00
Tom Lane bac99475eb Stabilize results of brin regression test.
This test used seqscans on tenk1, with LIMIT, to build test data.
That works most of the time, but if the synchronized-seqscan logic
kicks in, we get varying test data.  This seems likely to explain
the erratic test failures on buildfarm member chipmunk, which uses
smaller-than-default shared_buffers.  To fix, add ORDER BY clauses to
force the ordering to be what it was implicitly being assumed to be.

Peter Geoghegan had noticed this with respect to one of the trouble
spots, though not the ones actually causing the chipmunk issue.
2015-06-04 13:46:34 -04:00
Tom Lane 5cdf25e168 Stabilize query plans in rowsecurity regression test.
Some recent buildfarm failures can be explained by supposing that
autovacuum or autoanalyze fired on the tables created by this test,
resulting in plan changes.  Do a proactive VACUUM ANALYZE on the
test's principal tables to try to forestall such changes.
2015-06-04 10:37:06 -04:00
Fujii Masao 232cd63b1f Remove -i/--ignore-version option from pg_dump, pg_dumpall and pg_restore.
The commit c22ed3d523 turned
the -i/--ignore-version options into no-ops and marked as deprecated.
Considering we shipped that in 8.4, it's time to remove all trace of
those switches, per discussion. We'd still have to wait a couple releases
before it'd be safe to use -i for something else, but it'd be a start.
2015-06-04 19:54:43 +09:00
Tom Lane 3b0f77601b Fix some questionable edge-case behaviors in add_path() and friends.
add_path_precheck was doing exact comparisons of path costs, but it really
needs to do them fuzzily to be sure it won't reject paths that could
survive add_path's comparisons.  (This can only matter if the initial cost
estimate is very close to the final one, but that turns out to often be
true.)

Also, it should ignore startup cost for this purpose if and only if
compare_path_costs_fuzzily would do so.  The previous coding always ignored
startup cost for parameterized paths, which is wrong as of commit
3f59be836c555fa6; it could result in improper early rejection of paths that
we care about for SEMI/ANTI joins.  It also always considered startup cost
for unparameterized paths, which is just as wrong though the only effect is
to waste planner cycles on paths that can't survive.  Instead, it should
consider startup cost only when directed to by the consider_startup/
consider_param_startup relation flags.

Likewise, compare_path_costs_fuzzily should have symmetrical behavior
for parameterized and unparameterized paths.  In this case, the best
answer seems to be that after establishing that total costs are fuzzily
equal, we should compare startup costs whether or not the consider_xxx
flags are on.  That is what it's always done for unparameterized paths,
so let's make the behavior for parameterized  paths match.

These issues were noted while developing the SEMI/ANTI join costing fix
of commit 3f59be836c, but we chose not to back-patch these fixes,
because they can cause changes in the planner's choices among
nearly-same-cost plans.  (There is in fact one minor change in plan choice
within the core regression tests.)  Destabilizing plan choices in back
branches without very clear improvements is frowned on, so we'll just fix
this in HEAD.
2015-06-03 18:02:39 -04:00
Tom Lane 3f59be836c Fix planner's cost estimation for SEMI/ANTI joins with inner indexscans.
When the inner side of a nestloop SEMI or ANTI join is an indexscan that
uses all the join clauses as indexquals, it can be presumed that both
matched and unmatched outer rows will be processed very quickly: for
matched rows, we'll stop after fetching one row from the indexscan, while
for unmatched rows we'll have an indexscan that finds no matching index
entries, which should also be quick.  The planner already knew about this,
but it was nonetheless charging for at least one full run of the inner
indexscan, as a consequence of concerns about the behavior of materialized
inner scans --- but those concerns don't apply in the fast case.  If the
inner side has low cardinality (many matching rows) this could make an
indexscan plan look far more expensive than it actually is.  To fix,
rearrange the work in initial_cost_nestloop/final_cost_nestloop so that we
don't add the inner scan cost until we've inspected the indexquals, and
then we can add either the full-run cost or just the first tuple's cost as
appropriate.

Experimentation with this fix uncovered another problem: add_path and
friends were coded to disregard cheap startup cost when considering
parameterized paths.  That's usually okay (and desirable, because it thins
the path herd faster); but in this fast case for SEMI/ANTI joins, it could
result in throwing away the desired plain indexscan path in favor of a
bitmap scan path before we ever get to the join costing logic.  In the
many-matching-rows cases of interest here, a bitmap scan will do a lot more
work than required, so this is a problem.  To fix, add a per-relation flag
consider_param_startup that works like the existing consider_startup flag,
but applies to parameterized paths, and set it for relations that are the
inside of a SEMI or ANTI join.

To make this patch reasonably safe to back-patch, care has been taken to
avoid changing the planner's behavior except in the very narrow case of
SEMI/ANTI joins with inner indexscans.  There are places in
compare_path_costs_fuzzily and add_path_precheck that are not terribly
consistent with the new approach, but changing them will affect planner
decisions at the margins in other cases, so we'll leave that for a
HEAD-only fix.

Back-patch to 9.3; before that, the consider_startup flag didn't exist,
meaning that the second aspect of the patch would be too invasive.

Per a complaint from Peter Holzer and analysis by Tomas Vondra.
2015-06-03 11:59:10 -04:00
Bruce Momjian ab959cc0ea pgindent: add typedef blog URL 2015-06-01 11:27:30 -04:00
Andrew Dunstan 50ab76d3c1 Avoid naming a variable "new", and remove bogus initializer.
Per gripe from Tom Lane.
2015-05-31 22:56:53 -04:00
Andrew Dunstan 28b29f7e44 Add a couple of missing JsonbValue type initialisers. 2015-05-31 22:51:58 -04:00
Andrew Dunstan 37def42245 Rename jsonb_replace to jsonb_set and allow it to add new values
The function is given a fourth parameter, which defaults to true. When
this parameter is true, if the last element of the path is missing
in the original json, jsonb_set creates it in the result and assigns it
the new value. If it is false then the function does nothing unless all
elements of the path are present, including the last.

Based on some original code from Dmitry Dolgov, heavily modified by me.

Catalog version bumped.
2015-05-31 20:34:10 -04:00
Bruce Momjian ac6f22957d pg_upgrade: add missing period in C comment 2015-05-29 17:44:19 -04:00
Tom Lane 1943c000b7 initdb -S should now have an explicit check that $PGDATA is valid.
The fsync code from the backend essentially assumes that somebody's already
validated PGDATA, at least to the extent of it being a readable directory.
That's safe enough for initdb's normal code path too, but "initdb -S"
doesn't have any other processing at all that touches the target directory.
To have reasonable error-case behavior, add a pg_check_dir call.
Per gripe from Peter E.
2015-05-29 17:02:58 -04:00
Tom Lane 57e1138bcc Remove special cases for ETXTBSY from new fsync'ing logic.
The argument that this is a sufficiently-expected case to be silently
ignored seems pretty thin.  Andres had brought it up back when we were
still considering that most fsync failures should be hard errors, and it
probably would be legit not to fail hard for ETXTBSY --- but the same is
true for EROFS and other cases, which is why we gave up on hard failures.
ETXTBSY is surely not a normal case, so logging the failure seems fine
from here.
2015-05-29 15:11:36 -04:00
Tom Lane 1c8c656b3c Check that all aliases of a built-in function have same leakproof property.
opr_sanity.sql has a test checking that relevant properties of built-in
functions match when the same C function is referenced by multiple pg_proc
entries.  The test neglected to check proleakproof, though, and when
I added that condition it exposed that xideqint4 hadn't been updated to
match xideq.  So fix that as well, and in consequence bump catversion.

This isn't very critical, so no need to worry about fixing back branches.
2015-05-29 13:26:21 -04:00
Tom Lane c07d8c963e Adjust initdb to also not consider fsync'ing failures fatal.
Make initdb's version of this logic look as much like the backend's
as possible.  This is much less critical than in the backend since not
so many people use "initdb -S", but we want the same corner-case error
handling in both cases.

Back-patch to 9.3 where initdb -S option was introduced.  Before that,
initdb only had to deal with freshly-created data directories, wherein
no failures should be expected.

Abhijit Menon-Sen
2015-05-29 13:05:16 -04:00
Tom Lane da33a3894e Revert exporting of internal GUC variable "data_directory".
This undoes a poorly-thought-out choice in commit 970a18687f, namely
to export guc.c's internal variable data_directory.  The authoritative
variable so far as C code is concerned is DataDir; there is no reason for
anything except specific bits of GUC code to look at the GUC variable.

After yesterday's commits fixing the fsync-on-restart patch, the only
remaining misuse of data_directory was in AlterSystemSetConfigFile(),
which would be much better off just using a relative path anyhow: it's
less code and it doesn't break if the DBA moves the data directory of a
running system, which is a case we've taken some pains over in the past.

This is mostly cosmetic, so no need for a back-patch (and I'd be hesitant
to remove a global variable in stable branches anyway).
2015-05-29 11:57:33 -04:00
Tom Lane d8179b001a Fix fsync-at-startup code to not treat errors as fatal.
Commit 2ce439f337 introduced a rather serious
regression, namely that if its scan of the data directory came across any
un-fsync-able files, it would fail and thereby prevent database startup.
Worse yet, symlinks to such files also caused the problem, which meant that
crash restart was guaranteed to fail on certain common installations such
as older Debian.

After discussion, we agreed that (1) failure to start is worse than any
consequence of not fsync'ing is likely to be, therefore treat all errors
in this code as nonfatal; (2) we should not chase symlinks other than
those that are expected to exist, namely pg_xlog/ and tablespace links
under pg_tblspc/.  The latter restriction avoids possibly fsync'ing a
much larger part of the filesystem than intended, if the user has left
random symlinks hanging about in the data directory.

This commit takes care of that and also does some code beautification,
mainly moving the relevant code into fd.c, which seems a much better place
for it than xlog.c, and making sure that the conditional compilation for
the pre_sync_fname pass has something to do with whether pg_flush_data
works.

I also relocated the call site in xlog.c down a few lines; it seems a
bit silly to be doing this before ValidateXLOGDirectoryStructure().

The similar logic in initdb.c ought to be made to match this, but that
change is noncritical and will be dealt with separately.

Back-patch to all active branches, like the prior commit.

Abhijit Menon-Sen and Tom Lane
2015-05-28 17:33:03 -04:00
Tom Lane 0381fefaa4 Fix pg_rewind's handling of top-level symlinks.
The previous coding suffered a null-pointer dereference if it found any
symlink at the top level of $PGDATA.  Fix that, and teach it to recurse
into a symlink for pg_xlog, but not anything else.

Per note from Abhijit Menon-Sen.
2015-05-28 12:44:39 -04:00
Tom Lane 32f628be74 Fix assorted inconsistencies in our calls of readlink().
Ensure that we null-terminate the result string (one place in pg_rewind).
Be paranoid about out-of-range results from readlink() (should not happen,
but there is no good reason for some call sites to be careful about it and
others not).  Consistently use the whole buffer, not sometimes one byte
less.  Ensure we emit an appropriate errcode() in all cases.  Spell the
error messages the same way.

The only serious bug here is the missing null-termination in pg_rewind,
which is new code, so no need for a back-patch.

Abhijit Menon-Sen and Tom Lane
2015-05-28 12:17:22 -04:00
Tom Lane f46edf479e Fix pg_get_functiondef() to print a function's LEAKPROOF property.
Seems to have been an oversight in the original leakproofness patch.
Per report and patch from Jeevan Chalke.

In passing, prettify some awkward leakproof-related code in AlterFunction.
2015-05-28 11:24:37 -04:00
Tom Lane aa9eac45ea Fix portability issue in isolationtester grammar.
specparse.y and specscanner.l used "string" as a token name.  Now, bison
likes to define each token name as a macro for the token code it assigns,
which means those names are basically off-limits for any other use within
the grammar file or included headers.  So names as generic as "string" are
dangerous.  This is what was causing the recent failures on protosciurus:
some versions of Solaris' sys/kstat.h use "string" as a field name.
With late-model bison we don't see this problem because the token macros
aren't defined till later (that is why castoroides didn't show the problem
even though it's on the same machine).  But protosciurus uses bison 1.875
which defines the token macros up front.

This land mine has been there from day one; we'd have found it sooner
except that protosciurus wasn't trying to run the isolation tests till
recently.

To fix, rename the token to "string_literal" which is hopefully less
likely to collide with names used by system headers.  Back-patch to
all branches containing the isolation tests.
2015-05-27 19:14:51 -04:00
Andrew Dunstan f41042cea0 Revert "Add all structured objects passed to pushJsonbValue piecewise."
This reverts commit 54547bd87f.

This appears to have been a thinko on my part. I will try to come up
wioth a better solution.
2015-05-26 22:54:55 -04:00
Andrew Dunstan 956cc4434c Revert "Simplify addJsonbToParseState()"
This reverts commit fba12c8c6c.

This relied on a commit that is also being reverted.
2015-05-26 22:54:11 -04:00
Tom Lane 1f303fd1be Suppress occasional failures in brin regression test.
brin.sql included a call of brin_summarize_new_values(), and expected
it to always report exactly 5 summarization events.  This failed sometimes
during parallel regression tests, as a consequence of the database-wide
VACUUM in gist.sql getting there first.  The most future-proof way
to avoid variation in the test results is to forget about using
brin_summarize_new_values() and just do a plain "VACUUM brintest",
which will exercise the same code anyway.

Having done that, there's no need for preventing autovacuum on brintest;
doing so just reduces the scope of test coverage, so let's not.
2015-05-26 14:11:12 -04:00
Andrew Dunstan fba12c8c6c Simplify addJsonbToParseState()
This function no longer needs to walk non-scalar structures passed to
it, following commit 54547bd87f.
2015-05-26 11:46:02 -04:00
Andrew Dunstan 54547bd87f Add all structured objects passed to pushJsonbValue piecewise.
Commit 9b74f32cdb did this for objects of
type jbvBinary, but in trying further to simplify some of the new jsonb
code I discovered that objects of type jbvObject or jbvArray passed as
WJB_ELEM or WJB_VALUE also caused problems. These too are now added
component by component.

Backpatch to 9.4.
2015-05-26 11:16:52 -04:00
Tom Lane 79f2b5d583 Fix valgrind's "unaddressable bytes" whining about BRIN code.
brin_form_tuple calculated an exact tuple size, then palloc'd and
filled just that much.  Later, brin_doinsert or brin_doupdate would
MAXALIGN the tuple size and tell PageAddItem that that was the size
of the tuple to insert.  If the original tuple size wasn't a multiple
of MAXALIGN, the net result would be that PageAddItem would memcpy
a few more bytes than the palloc request had been for.

AFAICS, this is totally harmless in the real world: the error is a
read overrun not a write overrun, and palloc would certainly have
rounded the request up to a MAXALIGN multiple internally, so there's
no chance of the memcpy fetching off the end of memory.  Valgrind,
however, is picky to the byte level not the MAXALIGN level.

Fix it by pushing the MAXALIGN step back to brin_form_tuple.  (The other
possible source of tuples in this code, brin_form_placeholder_tuple,
was already producing a MAXALIGN'd result.)

In passing, be a bit more paranoid about internal allocations in
brin_form_tuple.
2015-05-25 21:56:19 -04:00
Bruce Momjian 3503003eb7 pgindent: document location of "all" typedef lists 2015-05-25 16:53:51 -04:00
Alvaro Herrera cdbdc43827 Update README.tuplock
Multixact truncation is now handled differently, and this file hadn't
gotten the memo.

Per note from Amit Langote.  I didn't use his patch, though.

Also update the description of infomask bits, which weren't completely up
to date either.  This commit also propagates b01a4f6838 back to 9.3 and
9.4, which apparently I failed to do back then.
2015-05-25 15:09:05 -03:00
Andrew Dunstan 6739aa298b Clean up and simplify jsonb_concat code.
Some of this is made possible by commit
9b74f32cdb which lets pushJsonbValue
handle binary Jsonb values, meaning that clients no longer have to, and
some is just doing things in simpler and more straightforward ways.
2015-05-25 11:43:06 -04:00
Bruce Momjian 8339e70da6 pgindent: fix typo
Report by Michael Paquier
2015-05-25 08:08:41 -04:00
Heikki Linnakangas 12e6c5a6ca Fix rescan of IndexScan node with the new lossy GiST distance functions.
Must reset the "reached end" flag and reorder queue at rescan.

Per report from Regina Obe, bug #13349
2015-05-25 14:48:29 +03:00
Bruce Momjian 266b6984cd pgindent: more doc updates for skipping __asm__ files 2015-05-24 21:51:42 -04:00
Bruce Momjian befa3e648c Revert 9.5 pgindent changes to atomics directory files
This is because there are many __asm__ blocks there that pgindent messes
up.  Also configure pgindent to skip that directory in the future.
2015-05-24 21:45:01 -04:00
Tom Lane 2aa0476dc3 Manual cleanup of pgindent results.
Fix some places where pgindent did silly stuff, often because project
style wasn't followed to begin with.  (I've not touched the atomics
headers, though.)
2015-05-24 15:04:10 -04:00
Tom Lane 17b48a1a9f Rename pg_shdepend.c's typedef "objectType" to SharedDependencyObjectType.
The name objectType is widely used as a field name, and it's pure luck that
this conflict has not caused pgindent to go crazy before.  It messed up
pg_audit.c pretty good though.  Since pg_shdepend.c doesn't export this
typedef and only uses it in three places, changing that seems saner than
changing the field usages.

Back-patch because we're contemplating using the union of all branch
typedefs for future pgindent runs, so this won't fix anything if it
stays the same in back branches.
2015-05-24 13:03:45 -04:00
Tom Lane 23116d5437 Add a bit more commentary about regex's colormap tree data structure.
Per an off-list question from Piotr Stefaniak.
2015-05-24 12:40:38 -04:00
Tom Lane 91e79260f6 Remove no-longer-required function declarations.
Remove a bunch of "extern Datum foo(PG_FUNCTION_ARGS);" declarations that
are no longer needed now that PG_FUNCTION_INFO_V1(foo) provides that.

Some of these were evidently missed in commit e7128e8dbb, but others
were cargo-culted in in code added since then.  Possibly that can be blamed
in part on the fact that we'd not fixed relevant documentation examples,
which I've now done.
2015-05-24 12:20:23 -04:00
Bruce Momjian 807b9e0dff pgindent run for 9.5 2015-05-23 21:35:49 -04:00
Bruce Momjian 225892552b Update typedef file in preparation for pgindent run 2015-05-23 21:20:37 -04:00
Bruce Momjian 58affdfb88 Improve pgindent instructions regarding Perl backup files 2015-05-23 21:09:00 -04:00
Tom Lane f84c8601d6 Add error check for lossy distance functions in index-only scans.
Maybe we should actually support this, but for the moment let's just
throw an error if the opclass tries it.
2015-05-23 16:24:31 -04:00
Tom Lane 72809480d6 Fix incorrect snprintf() limit.
Typo in commit 7cbee7c0a.  No practical effect since the buffer should
never actually be overrun, but various compilers and static analyzers will
whine about it.

Petr Jelinek
2015-05-23 16:05:52 -04:00
Tom Lane 821b821a24 Still more fixes for lossy-GiST-distance-functions patch.
Fix confusion in documentation, substantial memory leakage if float8 or
float4 are pass-by-reference, and assorted comments that were obsoleted
by commit 98edd617f3.
2015-05-23 15:22:25 -04:00
Andres Freund 284bef2977 Fix yet another bug in ON CONFLICT rule deparsing.
Expand testing of rule deparsing a good bit, it's evidently needed.

Author: Peter Geoghegan, Andres Freund
Discussion: CAM3SWZQmXxZhQC32QVEOTYfNXJBJ_Q2SDENL7BV14Cq-zL0FLg@mail.gmail.com
2015-05-23 02:16:24 +02:00
Andres Freund 631d749007 Remove the new UPSERT command tag and use INSERT instead.
Previously, INSERT with ON CONFLICT DO UPDATE specified used a new
command tag -- UPSERT.  It was introduced out of concern that INSERT as
a command tag would be a misrepresentation for ON CONFLICT DO UPDATE, as
some affected rows may actually have been updated.

Alvaro Herrera noticed that the implementation of that new command tag
was incomplete; in subsequent discussion we concluded that having it
doesn't provide benefits that are in line with the compatibility breaks
it requires.

Catversion bump due to the removal of PlannedStmt->isUpsert.

Author: Peter Geoghegan
Discussion: 20150520215816.GI5885@postgresql.org
2015-05-23 00:58:45 +02:00
Tom Lane 49ad32d5d9 Fix recently-introduced crash in array_contain_compare().
Silly oversight in commit 1dc5ebc9077ab742079ce5dac9a6664248d42916:
when array2 is an expanded array, it might have array2->xpn.dnulls equal
to NULL, indicating the array is known null-free.  The code wasn't
expecting that, because it formerly always used deconstruct_array() which
always delivers a nulls array.

Per bug #13334 from Regina Obe.
2015-05-22 18:36:48 -04:00
Andrew Dunstan 5302760a50 Unpack jbvBinary objects passed to pushJsonbValue
pushJsonbValue was accepting jbvBinary objects passed as WJB_ELEM or
WJB_VALUE data. While this succeeded, when those objects were later
encountered in attempting to convert the result to Jsonb, errors
occurred. With this change we ghuarantee that a JSonbValue constructed
from calls to pushJsonbValue does not contain any jbvBinary objects.
This cures a problem observed with jsonb_delete.

This means callers of pushJsonbValue no longer need to perform this
unpacking themselves. A subsequent patch will perform some cleanup in
that area.

The error was not triggered by any 9.4 code, but this is a publicly
visible routine, and so the error could be exercised by third party
code, therefore backpatch to 9.4.

Bug report from Peter Geoghegan, fix by me.
2015-05-22 10:21:41 -04:00
Heikki Linnakangas 7cbee7c0a1 At promotion, don't leave behind a partial segment on the old timeline.
With commit de768844, a copy of the partial segment was archived with the
.partial suffix, but the original file was still left in pg_xlog, so it
didn't actually solve the problems with archiving the partial segment that
it was supposed to solve. With this patch, the partial segment is renamed
rather than copied, so we only archive it with the .partial suffix.

Also be more robust in detecting if the last segment is already being
archived. Previously I used XLogArchiveIsBusy() for that, but that's not
quite right. With archive_mode='always', there might be a .ready file for
it, and we don't want to rename it to .partial in that case.

The old segment is needed until we're fully committed to the new timeline,
i.e. until we've written the end-of-recovery WAL record and updated the
min recovery point and timeline in the control file. So move the renaming
later in the startup sequence, after all that's been done.
2015-05-22 11:04:33 +03:00
Tom Lane c5dd8ead40 More fixes for lossy-GiST-distance-functions patch.
Paul Ramsey reported that commit 35fcb1b3d0
induced a core dump on commuted ORDER BY expressions, because it was
assuming that the indexorderby expression could be found verbatim in the
relevant equivalence class, but it wasn't there.  We really don't need
anything that complicated anyway; for the data types likely to be used for
index ORDER BY operators in the foreseeable future, the exprType() of the
ORDER BY expression will serve fine.  (The case where we'd have to work
harder is where the ORDER BY expression's result is only binary-compatible
with the declared input type of the ordering operator; long before worrying
about that, one would need to get rid of GiST's hard-wired assumption that
said datatype is float8.)

Aside from fixing that crash and adding a regression test for the case,
I did some desultory code review:

nodeIndexscan.c was likewise overthinking how hard it ought to work to
identify the datatype of the ORDER BY expressions.

Add comments explaining how come nodeIndexscan.c can get away with
simplifying assumptions about NULLS LAST ordering and no backward scan.

Revert no-longer-needed changes of find_ec_member_for_tle(); while the
new definition was no worse than the old, it wasn't better either, and
it might cause back-patching pain.

Revert entirely bogus additions to genam.h.
2015-05-21 19:47:48 -04:00
Tom Lane d4b538ea36 Improve packing/alignment annotation for ItemPointerData.
We want this struct to be exactly a series of 3 int16 words, no more
and no less.  Historically, at least, some ARM compilers preferred to
pad it to 8 bytes unless coerced.  Our old way of doing that was just
to use __attribute__((packed)), but as pointed out by Piotr Stefaniak,
that does too much: it also licenses the compiler to give the struct
only byte-alignment.  We don't want that because it adds access overhead,
possibly quite significant overhead.  According to the GCC manual, what
we want requires also specifying __attribute__((align(2))).  It's not
entirely clear if all the relevant compilers accept this pragma as well,
but we can hope the buildfarm will tell us if not.  We can also add a
static assertion that should fire if the compiler padded the struct.

Since the combination of these pragmas should define exactly what we
want on any compiler that accepts them, let's try using them wherever
we think they exist, not only for __arm__.  (This is likely to expose
that the conditional definitions in c.h are inadequate, but finding
that out would be a good thing.)

The immediate motivation for this is that the current definition of
ExecRowMark allows its curCtid field to be misaligned.  It is not clear
whether there are any other uses of ItemPointerData with a similar hazard.
We could change the definition of ExecRowMark if this doesn't work, but
it would be far better to have a future-proof fix.

Piotr Stefaniak, some further hacking by me
2015-05-21 17:21:46 -04:00
Fujii Masao 85d0e661aa Make recovery_target_action = pause work.
Previously even if recovery_target_action was set to pause and
the recovery target was reached, the recovery could never be paused.
Because the setting of pause was *always* overridden with that of
shutdown unexpectedly. This override is valid and intentional
if hot_standby is not enabled because there is no way to resume
the paused recovery in this case and the setting of pause is
completely useless. But not if hot_standby is enabled.

This patch changes the code so that the setting of pause is overridden
with that of shutdown only when hot_standby is not enabled.

Bug reported by Andres Freund
2015-05-21 13:56:17 +09:00
Tom Lane a6a66bd647 Another typo fix.
In the spirit of the season.
2015-05-20 14:50:22 -04:00
Heikki Linnakangas fa60fb63e5 Fix more typos in comments.
Patch by CharSyam, plus a few more I spotted with grep.
2015-05-20 19:45:43 +03:00
Heikki Linnakangas 4fc72cc7bb Collection of typo fixes.
Use "a" and "an" correctly, mostly in comments. Two error messages were
also fixed (they were just elogs, so no translation work required). Two
function comments in pg_proc.h were also fixed. Etsuro Fujita reported one
of these, but I found a lot more with grep.

Also fix a few other typos spotted while grepping for the a/an typos.
For example, "consists out of ..." -> "consists of ...". Plus a "though"/
"through" mixup reported by Euler Taveira.

Many of these typos were in old code, which would be nice to backpatch to
make future backpatching easier. But much of the code was new, and I didn't
feel like crafting separate patches for each branch. So no backpatching.
2015-05-20 16:56:22 +03:00
Simon Riggs f6a54fefc2 Fix spelling in comment 2015-05-19 18:37:46 -04:00
Tom Lane 0c071936e9 Revert error-throwing wrappers for the printf family of functions.
This reverts commit 16304a0134, except
for its changes in src/port/snprintf.c; as well as commit
cac18a76bb which is no longer needed.

Fujii Masao reported that the previous commit caused failures in psql on
OS X, since if one exits the pager program early while viewing a query
result, psql sees an EPIPE error from fprintf --- and the wrapper function
thought that was reason to panic.  (It's a bit surprising that the same
does not happen on Linux.)  Further discussion among the security list
concluded that the risk of other such failures was far too great, and
that the one-size-fits-all approach to error handling embodied in the
previous patch is unlikely to be workable.

This leaves us again exposed to the possibility of the type of failure
envisioned in CVE-2015-3166.  However, that failure mode is strictly
hypothetical at this point: there is no concrete reason to believe that
an attacker could trigger information disclosure through the supposed
mechanism.  In the first place, the attack surface is fairly limited,
since so much of what the backend does with format strings goes through
stringinfo.c or psprintf(), and those already had adequate defenses.
In the second place, even granting that an unprivileged attacker could
control the occurrence of ENOMEM with some precision, it's a stretch to
believe that he could induce it just where the target buffer contains some
valuable information.  So we concluded that the risk of non-hypothetical
problems induced by the patch greatly outweighs the security risks.
We will therefore revert, and instead undertake closer analysis to
identify specific calls that may need hardening, rather than attempt a
universal solution.

We have kept the portion of the previous patch that improved snprintf.c's
handling of errors when it calls the platform's sprintf().  That seems to
be an unalloyed improvement.

Security: CVE-2015-3166
2015-05-19 18:19:38 -04:00
Andres Freund 9bc77c4519 Various fixes around ON CONFLICT for rule deparsing.
Neither the deparsing of the new alias for INSERT's target table, nor of
the inference clause was supported. Also fixup a typo in an error
message.

Add regression tests to test those code paths.

Author: Peter Geoghegan
2015-05-19 23:18:57 +02:00
Andres Freund 0740cbd759 Refactor ON CONFLICT index inference parse tree representation.
Defer lookup of opfamily and input type of a of a user specified opclass
until the optimizer selects among available unique indexes; and store
the opclass in the parse analyzed tree instead.  The primary reason for
doing this is that for rule deparsing it's easier to use the opclass
than the previous representation.

While at it also rename a variable in the inference code to better fit
it's purpose.

This is separate from the actual fixes for deparsing to make review
easier.
2015-05-19 21:21:27 +02:00
Heikki Linnakangas b48437d11b Fix off-by-one error in Assertion.
The point of the assertion is to ensure that the arrays allocated in stack
are large enough, but the check was one item short.

This won't matter in practice because MaxIndexTuplesPerPage is an
overestimate, so you can't have that many items on a page in reality.
But let's be tidy.

Spotted by Anastasia Lubennikova. Backpatch to all supported versions, like
the patch that added the assertion.
2015-05-19 19:25:01 +03:00
Tom Lane 0b28ea79c0 Avoid collation dependence in indexes of system catalogs.
No index in template0 should have collation-dependent ordering, especially
not indexes on shared catalogs.  For most textual columns we avoid this
issue by using type "name" (which sorts per strcmp()).  However there are a
few indexed columns that we'd prefer to use "text" for, and for that, the
default opclass text_ops is unsafe.  Fortunately, text_pattern_ops is safe
(it sorts per memcmp()), and it has no real functional disadvantage for our
purposes.  So change the indexes on pg_seclabel.provider and
pg_shseclabel.provider to use text_pattern_ops.

In passing, also mark pg_replication_origin.roname as using
text_pattern_ops --- for some reason it was labeled varchar_pattern_ops
which is just wrong, even though it accidentally worked.

Add regression test queries to catch future errors of these kinds.

We still can't do anything about the misdeclared pg_seclabel and
pg_shseclabel indexes in back branches :-(
2015-05-19 11:47:42 -04:00
Tom Lane afee04352b Revert "Change pg_seclabel.provider and pg_shseclabel.provider to type "name"."
This reverts commit b82a7be603.  There
is a better (less invasive) way to fix it, which I will commit next.
2015-05-19 10:40:04 -04:00
Peter Eisentraut 55c0da38be Message string improvements 2015-05-18 23:01:48 -04:00
Peter Eisentraut 0779f2ba2d Fix parse tree of DROP TRANSFORM and COMMENT ON TRANSFORM
The plain C string language name needs to be wrapped in makeString() so
that the parse tree is copyable.  This is detectable by
-DCOPY_PARSE_PLAN_TREES.  Add a test case for the COMMENT case.

Also make the quoting in the error messages more consistent.

discovered by Tom Lane
2015-05-18 22:55:14 -04:00
Tom Lane b82a7be603 Change pg_seclabel.provider and pg_shseclabel.provider to type "name".
These were "text", but that's a bad idea because it has collation-dependent
ordering.  No index in template0 should have collation-dependent ordering,
especially not indexes on shared catalogs.  There was general agreement
that provider names don't need to be longer than other identifiers, so we
can fix this at a small waste of table space by changing from text to name.

There's no way to fix the problem in the back branches, but we can hope
that security labels don't yet have widespread-enough usage to make it
urgent to fix.

There needs to be a regression sanity test to prevent us from making this
same mistake again; but before putting that in, we'll need to get rid of
similar brain fade in the recently-added pg_replication_origin catalog.

Note: for lack of a suitable testing environment, I've not really exercised
this change.  I trust the buildfarm will show up any mistakes.
2015-05-18 20:07:53 -04:00
Andres Freund e4942f7a56 Attach ON CONFLICT SET ... WHERE to the correct planstate.
The previous coding was a leftover from attempting to hang all the on
conflict logic onto modify table's child nodes. It appears to not have
actually caused problems except for explain.

Add test exercising the broken and some other code paths.

Author: Peter Geoghegan and Andres Freund
2015-05-19 01:55:10 +02:00
Tom Lane 4db485e75b Put back a backwards-compatible version of sampling support functions.
Commit 83e176ec18 removed the longstanding
support functions for block sampling without any consideration of the
impact this would have on third-party FDWs.  The new API is not notably
more functional for FDWs than the old, so forcing them to change doesn't
seem like a good thing.  We can provide the old API as a wrapper (more
or less) around the new one for a minimal amount of extra code.
2015-05-18 18:34:37 -04:00
Tom Lane f5916bb7b5 Recognize "REGRESS_OPTS += ..." syntax in MSVC build scripts.
Necessitated by commit b14cf229f4.
Per buildfarm.
2015-05-18 13:40:06 -04:00
Robert Haas 922de19ef2 Fix error message in pre_sync_fname.
The old one didn't include %m anywhere, and required extra
translation.

Report by Peter Eisentraut. Fix by me. Review by Tom Lane.
2015-05-18 12:53:54 -04:00
Noah Misch fd97bd411d Check return values of sensitive system library calls.
PostgreSQL already checked the vast majority of these, missing this
handful that nearly cannot fail.  If putenv() failed with ENOMEM in
pg_GSS_recvauth(), authentication would proceed with the wrong keytab
file.  If strftime() returned zero in cache_locale_time(), using the
unspecified buffer contents could lead to information exposure or a
crash.  Back-patch to 9.0 (all supported versions).

Other unchecked calls to these functions, especially those in frontend
code, pose negligible security concern.  This patch does not address
them.  Nonetheless, it is always better to check return values whose
specification provides for indicating an error.

In passing, fix an off-by-one error in strftime_win32()'s invocation of
WideCharToMultiByte().  Upon retrieving a value of exactly MAX_L10N_DATA
bytes, strftime_win32() would overrun the caller's buffer by one byte.
MAX_L10N_DATA is chosen to exceed the length of every possible value, so
the vulnerable scenario probably does not arise.

Security: CVE-2015-3166
2015-05-18 10:02:31 -04:00
Noah Misch 16304a0134 Add error-throwing wrappers for the printf family of functions.
All known standard library implementations of these functions can fail
with ENOMEM.  A caller neglecting to check for failure would experience
missing output, information exposure, or a crash.  Check return values
within wrappers and code, currently just snprintf.c, that bypasses the
wrappers.  The wrappers do not return after an error, so their callers
need not check.  Back-patch to 9.0 (all supported versions).

Popular free software standard library implementations do take pains to
bypass malloc() in simple cases, but they risk ENOMEM for floating point
numbers, positional arguments, large field widths, and large precisions.
No specification demands such caution, so this commit regards every call
to a printf family function as a potential threat.

Injecting the wrappers implicitly is a compromise between patch scope
and design goals.  I would prefer to edit each call site to name a
wrapper explicitly.  libpq and the ECPG libraries would, ideally, convey
errors to the caller rather than abort().  All that would be painfully
invasive for a back-patched security fix, hence this compromise.

Security: CVE-2015-3166
2015-05-18 10:02:31 -04:00
Noah Misch cac18a76bb Permit use of vsprintf() in PostgreSQL code.
The next commit needs it.  Back-patch to 9.0 (all supported versions).
2015-05-18 10:02:31 -04:00
Noah Misch b0ce385032 Prevent a double free by not reentering be_tls_close().
Reentering this function with the right timing caused a double free,
typically crashing the backend.  By synchronizing a disconnection with
the authentication timeout, an unauthenticated attacker could achieve
this somewhat consistently.  Call be_tls_close() solely from within
proc_exit_prepare().  Back-patch to 9.0 (all supported versions).

Benkocs Norbert Attila

Security: CVE-2015-3165
2015-05-18 10:02:31 -04:00
Heikki Linnakangas 8cc7a4c5fd Fix typo in comment.
Jim Nasby
2015-05-18 10:38:52 +03:00
Heikki Linnakangas 4df1328950 Put back stats-collector restarting code, removed accidentally.
Removed that code snippet accidentally in the archive_mode='always' patch.

Also, use varname-tags for archive_command in the docs.

Fujii Masao
2015-05-18 10:20:30 +03:00
Peter Eisentraut 382b479ab7 Add new files to nls.mk 2015-05-17 22:55:17 -04:00
Tom Lane 424661913c Fix failure to copy IndexScan.indexorderbyops in copyfuncs.c.
This oversight results in a crash at executor startup if the plan has
been copied.  outfuncs.c was missed as well.

While we could probably have taught both those files to cope with the
originally chosen representation of an Oid array, it would have been
painful, not least because there'd be no easy way to verify the array
length.  An Oid List is far easier to work with.  And AFAICS, there is
no particular notational benefit to using an array rather than a list
in the existing parts of the patch either.  So just change it to a list.

Error in commit 35fcb1b3d0, which is new,
so no need for back-patch.
2015-05-17 21:22:12 -04:00
Magnus Hagander 3b075e9d7b Fix typos in comments
Dmitriy Olshevskiy
2015-05-17 14:58:04 +02:00
Peter Eisentraut e6dc503445 Fix whitespace 2015-05-16 20:43:32 -04:00
Bruce Momjian 750ccaef29 pg_upgrade: no need to check for matching float8_pass_by_value
Report by Noah Misch
2015-05-16 15:27:14 -04:00
Tom Lane 26058bf0dc More portability fixing for bipartite_match.c.
<float.h> is required for isinf() on some platforms.  Per buildfarm.
2015-05-16 11:35:42 -04:00
Bruce Momjian 4c5e060049 pg_upgrade: force timeline 1 in the new cluster
Previously, this prevented promoted standby servers from being upgraded
because of a missing WAL history file.  (Timeline 1 doesn't need a
history file, and we don't copy WAL files anyway.)

Report by Christian Echerer(?), Alexey Klyukin

Backpatch through 9.0
2015-05-16 00:40:18 -04:00
Bruce Momjian fb694d959c pg_upgrade: only allow template0 to be non-connectable
This patch causes pg_upgrade to error out during its check phase if:

(1) template0 is marked connectable
or
(2) any other database is marked non-connectable

This is done because, in the first case, pg_upgrade would fail because
the pg_dumpall --globals restore would fail, and in the second case, the
database would not be restored, leading to data loss.

Report by Matt Landry (1), Stephen Frost (2)

Backpatch through 9.0
2015-05-16 00:10:03 -04:00
Tom Lane 12cc299c65 Avoid direct use of INFINITY.
It's not very portable.  Per buildfarm.
2015-05-15 22:15:01 -04:00
Andres Freund f3d3118532 Support GROUPING SETS, CUBE and ROLLUP.
This SQL standard functionality allows to aggregate data by different
GROUP BY clauses at once. Each grouping set returns rows with columns
grouped by in other sets set to NULL.

This could previously be achieved by doing each grouping as a separate
query, conjoined by UNION ALLs. Besides being considerably more concise,
grouping sets will in many cases be faster, requiring only one scan over
the underlying data.

The current implementation of grouping sets only supports using sorting
for input. Individual sets that share a sort order are computed in one
pass. If there are sets that don't share a sort order, additional sort &
aggregation steps are performed. These additional passes are sourced by
the previous sort step; thus avoiding repeated scans of the source data.

The code is structured in a way that adding support for purely using
hash aggregation or a mix of hashing and sorting is possible. Sorting
was chosen to be supported first, as it is the most generic method of
implementation.

Instead of, as in an earlier versions of the patch, representing the
chain of sort and aggregation steps as full blown planner and executor
nodes, all but the first sort are performed inside the aggregation node
itself. This avoids the need to do some unusual gymnastics to handle
having to return aggregated and non-aggregated tuples from underlying
nodes, as well as having to shut down underlying nodes early to limit
memory usage.  The optimizer still builds Sort/Agg node to describe each
phase, but they're not part of the plan tree, but instead additional
data for the aggregation node. They're a convenient and preexisting way
to describe aggregation and sorting.  The first (and possibly only) sort
step is still performed as a separate execution step. That retains
similarity with existing group by plans, makes rescans fairly simple,
avoids very deep plans (leading to slow explains) and easily allows to
avoid the sorting step if the underlying data is sorted by other means.

A somewhat ugly side of this patch is having to deal with a grammar
ambiguity between the new CUBE keyword and the cube extension/functions
named cube (and rollup). To avoid breaking existing deployments of the
cube extension it has not been renamed, neither has cube been made a
reserved keyword. Instead precedence hacking is used to make GROUP BY
cube(..) refer to the CUBE grouping sets feature, and not the function
cube(). To actually group by a function cube(), unlikely as that might
be, the function name has to be quoted.

Needs a catversion bump because stored rules may change.

Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas
    Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-16 03:46:31 +02:00
Tom Lane 9d366c1f3d Update time zone data files to tzdata release 2015d.
DST law changes in Egypt, Mongolia, Palestine.
Historical corrections for Canada and Chile.
Revised zone abbreviation for America/Adak (HST/HDT not HAST/HADT).
2015-05-15 19:35:29 -04:00
Alvaro Herrera b0b7be6133 Add BRIN infrastructure for "inclusion" opclasses
This lets BRIN be used with R-Tree-like indexing strategies.

Also provided are operator classes for range types, box and inet/cidr.
The infrastructure provided here should be sufficient to create operator
classes for similar datatypes; for instance, opclasses for PostGIS
geometries should be doable, though we didn't try to implement one.

(A box/point opclass was also submitted, but we ripped it out before
commit because the handling of floating point comparisons in existing
code is inconsistent and would generate corrupt indexes.)

Author: Emre Hasegeli.  Cosmetic changes by me
Review: Andreas Karlsson
2015-05-15 18:05:22 -03:00
Tom Lane 199f5973c5 Improve test for CONVERT() with GB18030 <-> UTF8.
Add a bit of coverage of high code points.

Arjen Nienhuis
2015-05-15 17:03:23 -04:00
Alvaro Herrera 26df7066cc Move strategy numbers to include/access/stratnum.h
For upcoming BRIN opclasses, it's convenient to have strategy numbers
defined in a single place.  Since there's nothing appropriate, create
it.  The StrategyNumber typedef now lives there, as well as existing
strategy numbers for B-trees (from skey.h) and R-tree-and-friends (from
gist.h).  skey.h is forced to include stratnum.h because of the
StrategyNumber typedef, but gist.h is not; extensions that currently
rely on gist.h for rtree strategy numbers might need to add a new

A few .c files can stop including skey.h and/or gist.h, which is a nice
side benefit.

Per discussion:
https://www.postgresql.org/message-id/20150514232132.GZ2523@alvh.no-ip.org

Authored by Emre Hasegeli and Álvaro.

(It's not clear to me why bootscanner.l has any #include lines at all.)
2015-05-15 17:03:16 -03:00
Simon Riggs 1e98fa0bf8 SQLStandard feature T613 Sampling now Supported 2015-05-15 15:51:31 -04:00
Tom Lane 66493dd7aa Fix uninitialized variable.
Per compiler warnings.
2015-05-15 15:45:28 -04:00
Tom Lane 8d3e0906df Extend GB18030 encoding conversion to cover full Unicode range.
Our previous code for GB18030 <-> UTF8 conversion only covered Unicode code
points up to U+FFFF, but the actual spec defines conversions for all code
points up to U+10FFFF.  That would be rather impractical as a lookup table,
but fortunately there is a simple algorithmic conversion between the
additional code points and the equivalent GB18030 byte patterns.  Make use
of the just-added callback facility in LocalToUtf/UtfToLocal to perform the
additional conversions.

Having created the infrastructure to do that, we can use the same code to
map certain linearly-related subranges of the Unicode space below U+FFFF,
allowing removal of the corresponding lookup table entries.  This more
than halves the lookup table size, which is a substantial savings;
utf8_and_gb18030.so drops from nearly a megabyte to about half that.

In support of doing that, replace ISO10646-GB18030.TXT with the data file
gb-18030-2000.xml (retrieved from
http://source.icu-project.org/repos/icu/data/trunk/charset/data/xml/ )
in which these subranges have been deleted from the simple lookup entries.

Per bug #12845 from Arjen Nienhuis.  The conversion code added here is
based on his proposed patch, though I whacked it around rather heavily.
2015-05-15 15:02:13 -04:00
Simon Riggs f6d208d6e5 TABLESAMPLE, SQL Standard and extensible
Add a TABLESAMPLE clause to SELECT statements that allows
user to specify random BERNOULLI sampling or block level
SYSTEM sampling. Implementation allows for extensible
sampling functions to be written, using a standard API.
Basic version follows SQLStandard exactly. Usable
concrete use cases for the sampling API follow in later
commits.

Petr Jelinek

Reviewed by Michael Paquier and Simon Riggs
2015-05-15 14:37:10 -04:00
Heikki Linnakangas 11a83bbedd Silence another create_index regression test failure.
More platform differences in the less-significant digits in output.

Per buildfarm member rover_firefly, still.
2015-05-15 21:24:23 +03:00
Tom Lane 07af523870 Fix outdated src/test/mb/ tests, and add a GB18030 test.
The expected-output files for these tests were broken by the recent
addition of a warning for hash indexes.  Update them.

Also add a test case for GB18030 encoding, similar to the other ones.
This is a pretty weak test, but it's better than nothing.
2015-05-15 13:47:42 -04:00
Heikki Linnakangas ffd37740ee Add archive_mode='always' option.
In 'always' mode, the standby independently archives all files it receives
from the primary.

Original patch by Fujii Masao, docs and review by me.
2015-05-15 18:55:24 +03:00
Heikki Linnakangas 9feaba28e2 Silence create_index regression test failure.
The expected output contained some floating point values which might get
rounded slightly differently on different platforms. The exact output isn't
very interesting in this test, so just round it.

Per buildfarm member rover_firefly.
2015-05-15 18:20:16 +03:00
Heikki Linnakangas 98edd617f3 Fix datatype confusion with the new lossy GiST distance functions.
We can only support a lossy distance function when the distance function's
datatype is comparable with the original ordering operator's datatype.
The distance function always returns a float8, so we are limited to float8,
and float4 (by a hard-coded cast of the float8 to float4).

In light of this limitation, it seems like a good idea to have a separate
'recheck' flag for the ORDER BY expressions, so that if you have a non-lossy
distance function, it still works with lossy quals. There are cases like
that with the build-in or contrib opclasses, but it's plausible.

There was a hidden assumption that the ORDER BY values returned by GiST
match the original ordering operator's return type, but there are plenty
of examples where that's not true, e.g. in btree_gist and pg_trgm. As long
as the distance function is not lossy, we can tolerate that and just not
return the distance to the executor (or rather, always return NULL). The
executor doesn't need the distances if there are no lossy results.

There was another little bug: the recheck variable was not initialized
before calling the distance function. That revealed the bigger issue,
as the executor tried to reorder tuples that didn't need reordering, and
that failed because of the datatype mismatch.
2015-05-15 18:09:31 +03:00
Tom Lane a868931fec Fix insufficiently-paranoid GB18030 encoding verifier.
The previous coding effectively only verified that the second byte of a
multibyte character was in the expected range; moreover, it wasn't careful
to make sure that the second byte even exists in the buffer before touching
it.  The latter seems unlikely to cause any real problems in the field
(in particular, it could never be a problem with null-terminated input),
but it's still a bug.

Since GB18030 is not a supported backend encoding, the only thing we'd
really be doing with GB18030 text is converting it to UTF8 in LocalToUtf,
which would fail anyway on any invalid character for lack of a match in
its lookup table.  So the only user-visible consequence of this change
should be that you'll get "invalid byte sequence for encoding" rather than
"character has no equivalent" for malformed GB18030 input.  However,
impending changes to the GB18030 conversion code will require these tighter
up-front checks to avoid producing bogus results.
2015-05-15 11:04:02 -04:00
Fujii Masao 458a07701e Support --verbose option in reindexdb.
Sawada Masahiko, reviewed by Fabrízio Mello
2015-05-15 21:45:55 +09:00
Heikki Linnakangas 35fcb1b3d0 Allow GiST distance function to return merely a lower-bound.
The distance function can now set *recheck = false, like index quals. The
executor will then re-check the ORDER BY expressions, and use a queue to
reorder the results on the fly.

This makes it possible to do kNN-searches on polygons and circles, which
don't store the exact value in the index, but just a bounding box.

Alexander Korotkov and me
2015-05-15 14:26:51 +03:00
Fujii Masao ecd222e770 Support VERBOSE option in REINDEX command.
When this option is specified, a progress report is printed as each index
is reindexed.

Per discussion, we agreed on the following syntax for the extensibility of
the options.

    REINDEX (flexible options) { INDEX | ... } name

Sawada Masahiko.
Reviewed by Robert Haas, Fabrízio Mello, Alvaro Herrera, Kyotaro Horiguchi,
Jim Nasby and me.

Discussion: CAD21AoA0pK3YcOZAFzMae+2fcc3oGp5zoRggDyMNg5zoaWDhdQ@mail.gmail.com
2015-05-15 20:09:57 +09:00
Tom Lane 7730f48ede Teach UtfToLocal/LocalToUtf to support algorithmic encoding conversions.
Until now, these functions have only supported encoding conversions using
lookup tables, which is fine as long as there's not too many code points
to convert.  However, GB18030 expects all 1.1 million Unicode code points
to be convertible, which would require a ridiculously-sized lookup table.
Fortunately, a large fraction of those conversions can be expressed through
arithmetic, ie the conversions are one-to-one in certain defined ranges.
To support that, provide a callback function that is used after consulting
the lookup tables.  (This patch doesn't actually change anything about the
GB18030 conversion behavior, just provide infrastructure for fixing it.)

Since this requires changing the APIs of UtfToLocal/LocalToUtf anyway,
take the opportunity to rearrange their argument lists into what seems
to me a saner order.  And beautify the call sites by using lengthof()
instead of error-prone sizeof() arithmetic.

In passing, also mark all the lookup tables used by these calls "const".
This moves an impressive amount of stuff into the text segment, at least
on my machine, and is safer anyhow.
2015-05-14 22:27:12 -04:00
Simon Riggs 83e176ec18 Separate block sampling functions
Refactoring ahead of tablesample patch

Requested and reviewed by Michael Paquier

Petr Jelinek
2015-05-15 04:02:54 +02:00
Bruce Momjian 5a3022fde0 pg_upgrade: make controldata checks more consistent
Also add missing float8_pass_by_value check.
2015-05-14 21:56:31 -04:00
Peter Eisentraut a486e35706 Add pg_settings.pending_restart column
with input from David G. Johnston, Robert Haas, Michael Paquier
2015-05-14 20:08:51 -04:00
Tom Lane 1dc5ebc907 Support "expanded" objects, particularly arrays, for better performance.
This patch introduces the ability for complex datatypes to have an
in-memory representation that is different from their on-disk format.
On-disk formats are typically optimized for minimal size, and in any case
they can't contain pointers, so they are often not well-suited for
computation.  Now a datatype can invent an "expanded" in-memory format
that is better suited for its operations, and then pass that around among
the C functions that operate on the datatype.  There are also provisions
(rudimentary as yet) to allow an expanded object to be modified in-place
under suitable conditions, so that operations like assignment to an element
of an array need not involve copying the entire array.

The initial application for this feature is arrays, but it is not hard
to foresee using it for other container types like JSON, XML and hstore.
I have hopes that it will be useful to PostGIS as well.

In this initial implementation, a few heuristics have been hard-wired
into plpgsql to improve performance for arrays that are stored in
plpgsql variables.  We would like to generalize those hacks so that
other datatypes can obtain similar improvements, but figuring out some
appropriate APIs is left as a task for future work.  (The heuristics
themselves are probably not optimal yet, either, as they sometimes
force expansion of arrays that would be better left alone.)

Preliminary performance testing shows impressive speed gains for plpgsql
functions that do element-by-element access or update of large arrays.
There are other cases that get a little slower, as a result of added array
format conversions; but we can hope to improve anything that's annoyingly
bad.  In any case most applications should see a net win.

Tom Lane, reviewed by Andres Freund
2015-05-14 12:08:49 -04:00
Robert Haas 61f68e0bed Fix comment.
Commit 78efd5c1ed overlooked this.

Report by Peter Geoghegan.
2015-05-13 15:27:41 -04:00
Robert Haas 78efd5c1ed Extend abbreviated key infrastructure to datum tuplesorts.
Andrew Gierth, reviewed by Peter Geoghegan and by me.
2015-05-13 14:36:26 -04:00