Commit Graph

25374 Commits

Author SHA1 Message Date
Heikki Linnakangas f7534296b4 Move SizeOfHeapNewCid next to xl_heap_new_cid struct.
They belong together, but the xl_heap_rewrite_mapping struct was wedged
in between.
2014-04-01 16:23:16 +03:00
Robert Haas 4bc15a8bfb Mark FastPathStrongRelationLocks volatile.
Otherwise, the compiler might decide to move modifications to data
within this structure outside the enclosing SpinLockAcquire /
SpinLockRelease pair, leading to shared memory corruption.

This may or may not explain a recent lmgr-related buildfarm failure
on prairiedog, but it needs to be fixed either way.
2014-03-31 14:32:12 -04:00
Robert Haas 066254cea1 Count buffers dirtied due to hints in pgBufferUsage.shared_blks_dirtied.
Previously, such buffers weren't counted, with the possible result that
EXPLAIN (BUFFERS) and pg_stat_statements would understate the true
number of blocks dirtied by an SQL statement.

Back-patch to 9.2, where this counter was introduced.

Amit Kapila
2014-03-31 13:06:26 -04:00
Robert Haas 3f0e4be453 Fix thinko in logical decoding code.
Andres Freund
2014-03-31 13:03:18 -04:00
Heikki Linnakangas 14d02f0bb3 Rewrite the way GIN posting lists are packed on a page, to reduce WAL volume.
Inserting (in retail) into the new 9.4 format GIN posting tree created much
larger WAL records than in 9.3. The previous strategy to WAL logging was
basically to log the whole page on each change, with the exception of
completely unmodified segments up to the first modified one. That was not
too bad when appending to the end of the page, as only the last segment had
to be WAL-logged, but per Fujii Masao's testing, even that produced 2x the
WAL volume that 9.3 did.

The new strategy is to keep track of changes to the posting lists in a more
fine-grained fashion, and also make the repacking" code smarter to avoid
decoding and re-encoding segments unnecessarily.
2014-03-31 15:23:50 +03:00
Heikki Linnakangas 0cfa34c25a Rename GinLogicValue to GinTernaryValue.
It's more descriptive. Also, get rid of the enum, and use #defines instead,
per Greg Stark's suggestion.
2014-03-31 10:26:38 +03:00
Bruce Momjian 9d66116444 psql: display "Replica Identity" only for FULL and NOTHING
INDEX is already displayed on the index, and we now exclude pg_catalog.
DEFAULT is not displayed.
2014-03-29 19:00:11 -04:00
Tom Lane 62215de292 Fix dumping of a materialized view that depends on a table's primary key.
It is possible for a view or materialized view to depend on a table's
primary key, if the view query relies on functional dependency to
abbreviate a GROUP BY list.  This is problematic for pg_dump since we
ordinarily want to dump view definitions in the pre-data section but
indexes in post-data.  pg_dump knows how to deal with this situation for
regular views, by breaking the view's ON SELECT rule apart from the view
proper.  But it had not been taught what to do about materialized views,
and in fact mistakenly dumped them as regular views in such cases, as
seen in bug #9616 from Jesse Denardo.

If we had CREATE OR REPLACE MATERIALIZED VIEW, we could fix this in a
manner analogous to what's done for regular views; but we don't yet,
and we'd not back-patch such a thing into 9.3 anyway.  As a hopefully-
temporary workaround, break the circularity by postponing the matview
into post-data altogether when this case occurs.
2014-03-29 17:34:00 -04:00
Noah Misch 8f5578d0f9 Revert "Secure Unix-domain sockets of "make check" temporary clusters."
About half of the buildfarm members use too-long directory names,
strongly suggesting that this approach is a dead end.
2014-03-29 03:12:00 -04:00
Noah Misch 31c6e54ec9 Secure Unix-domain sockets of "make check" temporary clusters.
Any OS user able to access the socket can connect as the bootstrap
superuser and in turn execute arbitrary code as the OS user running the
test.  Protect against that by placing the socket in the temporary data
directory, which has mode 0700 thanks to initdb.  Back-patch to 8.4 (all
supported versions).  The hazard remains wherever the temporary cluster
accepts TCP connections, notably on Windows.

Attempts to run "make check" from a directory with a long name will now
fail.  An alternative not sharing that problem was to place the socket
in a subdirectory of /tmp, but that is only secure if /tmp is sticky.
The PG_REGRESS_SOCK_DIR environment variable is available as a
workaround when testing from long directory paths.

As a convenient side effect, this lets testing proceed smoothly in
builds that override DEFAULT_PGSOCKET_DIR.  Popular non-default values
like /var/run/postgresql are often unwritable to the build user.

Security: CVE-2014-0067
2014-03-29 00:52:56 -04:00
Tom Lane 9613a1d98e Improve regression test for pg_filenode_relation().
Make it print the details in case there's a failure.

Andres Freund, slightly modified by me
2014-03-28 16:58:29 -04:00
Bruce Momjian e1827012ed Adjust getpwuid() fix commit to display errno string on failure
This adjusts patch 613c6d26bd.
2014-03-28 12:50:15 -04:00
Tom Lane a87c729153 Fix EquivalenceClass processing for nested append relations.
The original coding of EquivalenceClasses didn't foresee that appendrel
child relations might themselves be appendrels; but this is possible for
example when a UNION ALL subquery scans a table with inheritance children.
The oversight led to failure to optimize ordering-related issues very well
for the grandchild tables.  After some false starts involving explicitly
flattening the appendrel representation, we found that this could be fixed
easily by removing a few implicit assumptions about appendrel parent rels
not being children themselves.

Kyotaro Horiguchi and Tom Lane, reviewed by Noah Misch
2014-03-28 11:50:01 -04:00
Tom Lane b777be0d48 Un-break peer authentication.
Commit 613c6d26bd sloppily replaced a
lookup of the UID obtained from getpeereid() with a lookup of the
server's own user name, thus totally destroying peer authentication.
Revert.  Per report from Christoph Berg.

In passing, make sure get_user_name() zeroes *errstr on success on
Windows as well as non-Windows.  I don't think any callers actually
depend on this ATM, but we should be consistent across platforms.
2014-03-28 10:30:37 -04:00
Heikki Linnakangas e709ced153 Silence compiler warnings in new jsonb code.
Amit Kapila.
2014-03-27 08:53:44 +02:00
Andrew Dunstan 7e4d1600a6 Fix uninitialized variables in json's populate_record_worker().
Peter Geoghegan.
2014-03-26 18:20:56 -04:00
Tom Lane 2d5e0f07de Fix refcounting bug in PLy_modify_tuple().
We must increment the refcount on "plntup" as soon as we have the
reference, not sometime later.  Otherwise, if an error is thrown in
between, the Py_XDECREF(plntup) call in the PG_CATCH block removes a
refcount we didn't add, allowing the object to be freed even though
it's still part of the plpython function's parsetree.

This appears to be the cause of crashes seen on buildfarm member
prairiedog.  It's a bit surprising that we've not seen it fail repeatably
before, considering that the regression tests have been exercising the
faulty code path since 2009.

The real-world impact is probably minimal, since it's unlikely anyone would
be provoking the "TD["new"] is not a dictionary" error in production, and
that's the only case that is actually wrong.  Still, it's a bug affecting
the regression tests, so patch all supported branches.

In passing, remove dead variable "plstr", and demote "platt" to a local
variable inside the PG_TRY block, since we don't need to clean it up
in the PG_CATCH path.
2014-03-26 16:41:32 -04:00
Heikki Linnakangas c2a6724823 Pass more than the first XLogRecData entry to rm_desc, with WAL_DEBUG.
If you compile with WAL_DEBUG and enable it with wal_debug=on, we used to
only pass the first XLogRecData entry to the rm_desc routine. I think the
original assumprion was that the first XLogRecData entry contains all the
necessary information for the rm_desc routine, but that's a pretty shaky
assumption. At least standby_redo didn't get the memo.

To fix, piece together all the data in a temporary buffer, and pass that to
the rm_desc routine.

It's been like this forever, but the patch didn't apply cleanly to
back-branches. Probably wouldn't be hard to fix the conflicts, but it's
not worth the trouble.
2014-03-26 18:17:53 +02:00
Bruce Momjian b69c4e65be psql: update "replica identity" display for \d+
Display "replica identity" only for \d plus mode, exclude system schema
objects, and display all possible values, not just non-default,
non-index ones.
2014-03-26 11:13:17 -04:00
Andrew Dunstan f9c6d72cbf Cleanup around json_to_record/json_to_recordset
Set function parameter names and defaults. Add jsonb versions (which the
code already provided for so the actual new code is trivial). Add jsonb
regression tests and docs.

Bump catalog version (which I apparently forgot to do when jsonb was
committed).
2014-03-26 10:18:24 -04:00
Heikki Linnakangas 86cf41ed27 Fix 'recheck' flag in tsquery's GIN tri-consistent function.
It needs to be initialized, like in the boolean gin_tsquery_consistent
version.

Peter Geoghegan.
2014-03-26 10:15:35 +02:00
Andrew Dunstan fbc3def862 Tidy up the populate/to_record{set} code for json a bit.
In the process fix a small bug.
2014-03-25 21:20:54 -04:00
Fujii Masao 49638868f8 Don't forget to flush XLOG_PARAMETER_CHANGE record.
Backpatch to 9.0 where XLOG_PARAMETER_CHANGE record was instroduced.
2014-03-26 02:12:39 +09:00
Bruce Momjian 5db55c6bbc Remove wchar.c Asserts that were stricter than the main code
Assert errors were thrown for functions being passed invalid encodings,
while the main code handled it just fine.

Also document that libpq's PQclientEncoding() returns -1 for an encoding
lookup failure.

Per report from Peter Geoghegan
2014-03-24 15:59:38 -04:00
Bruce Momjian 1420f3a982 Fix ts_rank_cd() to ignore stripped lexemes
Previously, stripped lexemes got a default location and could be
considered if mixed with non-stripped lexemes.

BACKWARD INCOMPATIBILITY CHANGE
2014-03-24 14:37:16 -04:00
Heikki Linnakangas bb42e21be2 Change ginMergeItemPointers to return a palloc'd array.
That seems nicer than making it the caller's responsibility to pass a
suitable-sized array. All the callers were just palloc'ing an array anyway.
2014-03-24 18:44:40 +02:00
Heikki Linnakangas 2f3afc0979 Remove dead code and add comments.
'cbuffer' variable was left over from an earlier version of the patch to
rewrite the incomplete split handling.
2014-03-24 11:02:23 +02:00
Heikki Linnakangas 3ed249b741 Fix "the the" typos.
Erik Rijkers
2014-03-24 08:42:13 +02:00
Andrew Dunstan ab22b149c6 Do jsonb regression test input in the conventional way.
This should make the buildfarm happier.
2014-03-23 20:18:06 -04:00
Andrew Dunstan d9134d0a35 Introduce jsonb, a structured format for storing json.
The new format accepts exactly the same data as the json type. However, it is
stored in a format that does not require reparsing the orgiginal text in order
to process it, making it much more suitable for indexing and other operations.
Insignificant whitespace is discarded, and the order of object keys is not
preserved. Neither are duplicate object keys kept - the later value for a given
key is the only one stored.

The new type has all the functions and operators that the json type has,
with the exception of the json generation functions (to_json, json_agg etc.)
and with identical semantics. In addition, there are operator classes for
hash and btree indexing, and two classes for GIN indexing, that have no
equivalent in the json type.

This feature grew out of previous work by Oleg Bartunov and Teodor Sigaev, which
was intended to provide similar facilities to a nested hstore type, but which
in the end proved to have some significant compatibility issues.

Authors: Oleg Bartunov,  Teodor Sigaev, Peter Geoghegan and Andrew Dunstan.
Review: Andres Freund
2014-03-23 16:40:19 -04:00
Noah Misch 7cbe57c34d Offer triggers on foreign tables.
This covers all the SQL-standard trigger types supported for regular
tables; it does not cover constraint triggers.  The approach for
acquiring the old row mirrors that for view INSTEAD OF triggers.  For
AFTER ROW triggers, we spool the foreign tuples to a tuplestore.

This changes the FDW API contract; when deciding which columns to
populate in the slot returned from data modification callbacks, writable
FDWs will need to check for AFTER ROW triggers in addition to checking
for a RETURNING clause.

In support of the feature addition, refactor the TriggerFlags bits and
the assembly of old tuples in ModifyTable.

Ronan Dunklau, reviewed by KaiGai Kohei; some additional hacking by me.
2014-03-23 02:16:34 -04:00
Noah Misch 6115480c54 Improve comments about AfterTriggerBeginQuery() query level usage. 2014-03-23 02:15:52 -04:00
Noah Misch c31305de5f Address ccvalid/ccnoinherit in TupleDesc support functions.
equalTupleDescs() neglected both of these ConstrCheck fields, and
CreateTupleDescCopyConstr() neglected ccnoinherit.  At this time, the
only known behavior defect resulting from these omissions is constraint
exclusion disregarding a CHECK constraint validated by an ALTER TABLE
VALIDATE CONSTRAINT statement issued earlier in the same transaction.
Back-patch to 9.2, where these fields were introduced.
2014-03-23 02:13:43 -04:00
Heikki Linnakangas 4c0e97c2d5 Fix thinkos in GinLogicValue enum.
It was incorrectly declared as global variable, not an enum type, and
the comments for GIN_FALSE and GIN_TRUE were backwards.
2014-03-21 23:41:37 +01:00
Heikki Linnakangas dea6ed2c98 Fix build with LWLOCK_STATS or dtrace.
Also fix the name of the dtrace probe for LWLockAcquireOrWait(). The
function was renamed from LWLockWaitUntilFree to LWLockAqcuireOrWait, but
the dtrace probe was neglected.

Pointed out by Andres Freund and the buildfarm.
2014-03-21 23:26:34 +01:00
Bruce Momjian 1494931d73 Remove MinGW readdir/errno bug workaround fixed on 2003-10-10 2014-03-21 13:47:37 -04:00
Bruce Momjian 6f03927fce Properly check for readdir/closedir() failures
Clear errno before calling readdir() and handle old MinGW errno bug
while adding full test coverage for readdir/closedir failures.

Backpatch through 8.4.
2014-03-21 13:45:11 -04:00
Heikki Linnakangas 68a2e52bba Replace the XLogInsert slots with regular LWLocks.
The special feature the XLogInsert slots had over regular LWLocks is the
insertingAt value that was updated atomically with releasing backends
waiting on it. Add new functions to the LWLock API to do that, and replace
the slots with LWLocks. This reduces the amount of duplicated code.
(There's still some duplication, but at least it's all in lwlock.c now.)

Reviewed by Andres Freund.
2014-03-21 15:10:48 +01:00
Tom Lane af930e606a Again fix initialization of auto-tuned effective_cache_size.
The previous method was overly complex and underly correct; in particular,
by assigning the default value with PGC_S_OVERRIDE, it prevented later
attempts to change the setting in postgresql.conf, as noted by Jeff Janes.
We should just assign the default value with source PGC_S_DYNAMIC_DEFAULT,
which will have the desired priority relative to the boot_val as well as
user-set values.

There is still a gap in this method: if there's an explicit assignment of
effective_cache_size = -1 in the postgresql.conf file, and that assignment
appears before shared_buffers is assigned, the code will substitute 4 times
the bootstrap default for shared_buffers, and that value will then persist
(since it will have source PGC_S_FILE).  I don't see any very nice way
to avoid that though, and it's not a case to be expected in practice.
The existing comments in guc-file.l look forward to a redesign of the
DYNAMIC_DEFAULT mechanism; if that ever happens, we should consider this
case as one of the things we'd like to improve.
2014-03-20 12:58:30 -04:00
Bruce Momjian a4c8f14364 libpq: pass a memory allocation failure error up to PQconndefaults()
Previously user name memory allocation failures were ignored and the
default user name set to NULL.
2014-03-20 11:48:31 -04:00
Alvaro Herrera f88d4cfc9d Setup error context callback for transaction lock waits
With this in place, a session blocking behind another one because of
tuple locks will get a context line mentioning the relation name, tuple
TID, and operation being done on tuple.  For example:

LOG:  process 11367 still waiting for ShareLock on transaction 717 after 1000.108 ms
DETAIL:  Process holding the lock: 11366. Wait queue: 11367.
CONTEXT:  while updating tuple (0,2) in relation "foo"
STATEMENT:  UPDATE foo SET value = 3;

Most usefully, the new line is displayed by log entries due to
log_lock_waits, although of course it will be printed by any other log
message as well.

Author: Christian Kruse, some tweaks by Álvaro Herrera
Reviewed-by: Amit Kapila, Andres Freund, Tom Lane, Robert Haas
2014-03-19 15:10:36 -03:00
Tom Lane ea8c7e9054 Fix memory leak during regular expression execution.
For a regex containing backrefs, pg_regexec() might fail to free all the
sub-DFAs that were created during execution, resulting in a permanent
(session lifespan) memory leak.  Problem was introduced by me in commit
587359479a.  Per report from Sandro Santilli;
diagnosis by Greg Stark.
2014-03-19 11:09:24 -04:00
Fujii Masao fb1d92a9fa Some minor improvements to logical decoding document.
Also improve help message in pg_recvlogical.
2014-03-19 22:13:05 +09:00
Heikki Linnakangas 59a5ab3f42 Remove rm_safe_restartpoint machinery.
It is no longer used, none of the resource managers have multi-record
actions that would make it unsafe to perform a restartpoint.

Also don't allow rm_cleanup to write WAL records, it's also no longer
required. Move the call to rm_cleanup routines to make it more symmetric
with rm_startup.
2014-03-18 22:10:35 +02:00
Heikki Linnakangas 1d3b258cbe Fix misc typos in comments. 2014-03-18 21:05:18 +02:00
Robert Haas a3b30d4cfe Fix uninitialized variable.
Report from Andres Freund, but not his fix.
2014-03-18 14:54:35 -04:00
Heikki Linnakangas 40dae7ec53 Make the handling of interrupted B-tree page splits more robust.
Splitting a page consists of two separate steps: splitting the child page,
and inserting the downlink for the new right page to the parent. Previously,
we handled the case that you crash in between those steps with a cleanup
routine after the WAL recovery had finished, which finished the incomplete
split. However, that doesn't help if the page split is interrupted but the
database doesn't crash, so that you don't perform WAL recovery. That could
happen for example if you run out of disk space.

Remove the end-of-recovery cleanup step. Instead, when a page is split, the
left page is marked with a new INCOMPLETE_SPLIT flag, and when the downlink
is inserted to the parent, the flag is cleared again. If an insertion sees
a page with the flag set, it knows that the split was interrupted for some
reason, and inserts the missing downlink before proceeding.

I used the same approach to fix GIN and GiST split algorithms earlier. This
was the last WAL cleanup routine, so we could get rid of that whole
machinery now, but I'll leave that for a separate patch.

Reviewed by Peter Geoghegan.
2014-03-18 20:50:44 +02:00
Robert Haas 8bdd12bbf0 Add pg_recvlogical, a tool to receive data logical decoding data.
This is fairly basic at the moment, but it's at least useful for
testing and debugging, and possibly more.

Andres Freund
2014-03-18 12:25:14 -04:00
Robert Haas 250f8a7bbe Rewrite comment for shm_mq_receive_bytes.
The comment and the code diverged at some point before the initial
commit of this feature, and I failed to notice.

Noted by Tom Lane.
2014-03-18 11:53:28 -04:00
Tom Lane f7271c4427 Fix relcache reference leak in refresh_by_match_merge().
One path through the loop over indexes forgot to do index_close().  Rather
than adding a fourth call, restructure slightly so that there's only one.

In passing, get rid of an unnecessary syscache lookup: the pg_index struct
for the index is already available from its relcache entry.

Per report from YAMAMOTO Takashi, though this is a bit different from his
suggested patch.  This is new code in HEAD, so no need for back-patch.
2014-03-18 11:36:53 -04:00
Robert Haas 3bd261ca18 Improve shm_mq portability around MAXIMUM_ALIGNOF and sizeof(Size).
Revise the original decision to expose a uint64-based interface and
use Size everywhere possible.  Avoid assuming that MAXIMUM_ALIGNOF is
8, or making any assumption about the relationship between that value
and sizeof(Size).  If MAXIMUM_ALIGNOF is bigger, we'll now insert
padding after the length word; if it's smaller, we are now prepared
to read and write the length word in chunks.

Per discussion with Tom Lane.
2014-03-18 11:23:13 -04:00
Tom Lane 19f2d6cdae Fix pg_dumpall option parsing: -i doesn't take an argument.
This used to work properly, but got fat-fingered in commit
3dee636e04.  Per bug #9620 from
Nicolas Payart.
2014-03-18 10:38:25 -04:00
Fujii Masao e726e59dc4 Fix help message and document in pg_receivexlog.
Add SLOTNAME placeholder to --slot option in help message and
document.
2014-03-18 21:15:45 +09:00
Robert Haas 79a4d24f31 Make it easy to detach completely from shared memory.
The new function dsm_detach_all() can be used either by postmaster
children that don't wish to take any risk of accidentally corrupting
shared memory; or by forked children of regular backends with
the same need.  This patch also updates the postmaster children that
already do PGSharedMemoryDetach() to do dsm_detach_all() as well.

Per discussion with Tom Lane.
2014-03-18 07:58:53 -04:00
Tom Lane d70cf811f7 During index build, check and elog (not just Assert) for broken HOT chain.
The recently-fixed bug in WAL replay could result in not finding a parent
tuple for a heap-only tuple.  The existing code would either Assert or
generate an invalid index entry, neither of which is desirable.  Throw a
regular error instead.
2014-03-17 12:36:11 -04:00
Heikki Linnakangas d663d4399e Fix thinko: have trueTriConsistentFn return GIN_TRUE.
While we're at it, also improve comments in ginlogic.c.
2014-03-17 17:29:04 +02:00
Fujii Masao 2bccced110 Fix typos in comments.
Thom Brown
2014-03-17 20:47:28 +09:00
Fujii Masao 5c6d9fc4b2 Fix bug in clean shutdown of walsender that pg_receiving is connecting to.
On clean shutdown, walsender waits for all WAL to be replicated to a standby,
and exits. It determined whether that replication had been completed by
checking whether its sent location had been equal to a standby's flush
location. Unfortunately this condition never becomes true when the standby
such as pg_receivexlog which always returns an invalid flush location is
connecting to walsender, and then walsender waits forever.

This commit changes walsender so that it just checks a standby's write
location if a flush location is invalid.

Back-patch to 9.1 where enough infrastructure for this exists.
2014-03-17 20:37:50 +09:00
Magnus Hagander 02703ff227 Fix small typo in comment
Michael Paquier
2014-03-17 09:09:21 +01:00
Alvaro Herrera bd1154edec plperl: Fix memory leak in hek2cstr
Backpatch all the way back to 9.1, where it was introduced by commit
50d89d42.

Reported by Sergey Burladyan in #9223
Author: Alex Hunsaker
2014-03-16 23:22:21 -03:00
Peter Eisentraut 2861e8e9cb Make punctuation consistent 2014-03-16 21:47:35 -04:00
Peter Eisentraut e2b959478c Fix whitespace 2014-03-16 21:47:35 -04:00
Tom Lane f4051e363c Fix advertised dispsize for libpq's sslmode connection parameter.
"8" was correct back when "disable" was the longest allowed value, but
since "verify-full" was added, it should be "12".  Given the lack of
complaints, I wouldn't be surprised if nobody is actually using these
values ... but still, if they're in the API, they should be right.

Noticed while pursuing a different problem.  It's been wrong for quite
a long time, so back-patch to all supported branches.
2014-03-16 21:43:40 -04:00
Magnus Hagander 0294023a6b Cleanups from the remove-native-krb5 patch
krb_srvname is actually not available anymore as a parameter server-side, since
with gssapi we accept all principals in our keytab. It's still used in libpq for
client side specification.

In passing remove declaration of krb_server_hostname, where all the functionality
was already removed.

Noted by Stephen Frost, though a different solution than his suggestion
2014-03-16 15:22:45 +01:00
Tom Lane aba7f56779 Update time zone data files to tzdata release 2014a.
DST law changes in Fiji, Turkey; historical changes in Israel, Ukraine.
2014-03-15 13:36:07 -04:00
Heikki Linnakangas efada2b8e9 Fix race condition in B-tree page deletion.
In short, we don't allow a page to be deleted if it's the rightmost child
of its parent, but that situation can change after we check for it.

Problem
-------

We check that the page to be deleted is not the rightmost child of its
parent, and then lock its left sibling, the page itself, its right sibling,
and the parent, in that order. However, if the parent page is split after
the check but before acquiring the locks, the target page might become the
rightmost child, if the split happens at the right place. That leads to an
error in vacuum (I reproduced this by setting a breakpoint in debugger):

ERROR:  failed to delete rightmost child 41 of block 3 in index "foo_pkey"

We currently re-check that the page is still the rightmost child, and throw
the above error if it's not. We could easily just give up rather than throw
an error, but that approach doesn't scale to half-dead pages. To recap,
although we don't normally allow deleting the rightmost child, if the page
is the *only* child of its parent, we delete the child page and mark the
parent page as half-dead in one atomic operation. But before we do that, we
check that the parent can later be deleted, by checking that it in turn is
not the rightmost child of the grandparent (potentially recursing all the
way up to the root). But the same situation can arise there - the
grandparent can be split while we're not holding the locks. We end up with
a half-dead page that we cannot delete.

To make things worse, the keyspace of the deleted page has already been
transferred to its right sibling. As the README points out, the keyspace at
the grandparent level is "out-of-whack" until the half-dead page is deleted,
and if enough tuples with keys in the transferred keyspace are inserted, the
page might get split and a downlink might be inserted into the grandparent
that is out-of-order. That might not cause any serious problem if it's
transient (as the README ponders), but is surely bad if it stays that way.

Solution
--------

This patch changes the page deletion algorithm to avoid that problem. After
checking that the topmost page in the chain of to-be-deleted pages is not
the rightmost child of its parent, and then deleting the pages from bottom
up, unlink the pages from top to bottom. This way, the intermediate stages
are similar to the intermediate stages in page splitting, and there is no
transient stage where the keyspace is "out-of-whack". The topmost page in
the to-be-deleted chain doesn't have a downlink pointing to it, like a page
split before the downlink has been inserted.

This also allows us to get rid of the cleanup step after WAL recovery, if we
crash during page deletion. The deletion will be continued at next VACUUM,
but the tree is consistent for searches and insertions at every step.

This bug is old, all supported versions are affected, but this patch is too
big to back-patch (and changes the WAL record formats of related records).
We have not heard any reports of the bug from users, so clearly it's not
easy to bump into. Maybe backpatch later, after this has had some field
testing.

Reviewed by Kevin Grittner and Peter Geoghegan.
2014-03-14 16:07:19 +02:00
Tom Lane 6c461cb92f Prevent interrupts while reporting non-ERROR elog messages.
This should eliminate the risk of recursive entry to syslog(3), which
appears to be the cause of the hang reported in bug #9551 from James
Morton.

Arguably, the real problem here is auth.c's willingness to turn on
ImmediateInterruptOK while executing fairly wide swaths of backend code.
We may well need to work at narrowing the code ranges in which the
authentication_timeout interrupt is enabled.  For the moment, though,
this is a cheap and reasonably noninvasive fix for a field-reported
failure; the other approach would be complex and not necessarily
bug-free itself.

Back-patch to all supported branches.
2014-03-13 20:59:42 -04:00
Tom Lane f70a78bc1f Allow psql to print COPY command status in more cases.
Previously, psql would print the "COPY nnn" command status only for COPY
commands executed server-side.  Now it will print that for frontend copies
too (including \copy).  However, we continue to suppress the command status
for COPY TO STDOUT, since in that case the copy data has been routed to the
same place that the command status would go, and there is a risk of the
status line being mistaken for another line of COPY data.  Doing that would
break existing scripts, and it doesn't seem worth the benefit --- this case
seems fairly analogous to SELECT, for which we also suppress the command
status.

Kumar Rajeev Rastogi, with substantial review by Amit Khandekar
2014-03-13 13:49:03 -04:00
Tom Lane 7bae0284ee Avoid transaction-commit race condition while receiving a NOTIFY message.
Use TransactionIdIsInProgress, then TransactionIdDidCommit, to distinguish
whether a NOTIFY message's originating transaction is in progress,
committed, or aborted.  The previous coding could accept a message from a
transaction that was still in-progress according to the PGPROC array;
if the client were fast enough at starting a new transaction, it might fail
to see table rows added/updated by the message-sending transaction.  Which
of course would usually be the point of receiving the message.  We noted
this type of race condition long ago in tqual.c, but async.c overlooked it.

The race condition probably cannot occur unless there are multiple NOTIFY
senders in action, since an individual backend doesn't send NOTIFY signals
until well after it's done committing.  But if two senders commit in close
succession, it's certainly possible that we could see the second sender's
message within the race condition window while responding to the signal
from the first one.

Per bug #9557 from Marko Tiikkaja.  This patch is slightly more invasive
than what he proposed, since it removes the now-redundant
TransactionIdDidAbort call.

Back-patch to 9.0, where the current NOTIFY implementation was introduced.
2014-03-13 12:02:54 -04:00
Bruce Momjian 242c2737fb C comments: remove odd blank lines after #ifdef WIN32 lines
A few more
2014-03-13 01:42:24 -04:00
Bruce Momjian 886c0be3f6 C comments: remove odd blank lines after #ifdef WIN32 lines 2014-03-13 01:34:42 -04:00
Heikki Linnakangas a3115f0d9e Only WAL-log the modified portion in an UPDATE, if possible.
When a row is updated, and the new tuple version is put on the same page as
the old one, only WAL-log the part of the new tuple that's not identical to
the old. This saves significantly on the amount of WAL that needs to be
written, in the common case that most fields are not modified.

Amit Kapila, with a lot of back and forth with me, Robert Haas, and others.
2014-03-12 23:28:36 +02:00
Heikki Linnakangas 17d787a3b1 Items on GIN data pages are no longer always 6 bytes; update gincostestimate.
Also improve the comments a bit.
2014-03-12 20:52:22 +02:00
Fujii Masao 588fb50715 Show PIDs of lock holders and waiters in log_lock_waits log message.
Christian Kruse, reviewed by Kumar Rajeev Rastogi.
2014-03-13 03:26:47 +09:00
Robert Haas 336a578b8c Fix incorrect assertion about historical snapshots.
Also fix some nearby comments.

Andres Freund
2014-03-12 14:07:41 -04:00
Robert Haas 890194f14d Comment fixes related to logical decoding.
Andres Freund, per complaints by Peter Eisentraut.
2014-03-12 14:03:09 -04:00
Heikki Linnakangas c5608ea26a Allow opclasses to provide tri-valued GIN consistent functions.
With the GIN "fast scan" feature, GIN can skip items without fetching all
the keys for them, if it can prove that they don't match regardless of
those keys. So far, it has done the proving by calling the boolean
consistent function with all combinations of TRUE/FALSE for the unfetched
keys, but since that's O(n^2), it becomes unfeasible with more than a few
keys. We can avoid calling consistent with all the combinations, if we can
tell the operator class implementation directly which keys are unknown.

This commit includes a triConsistent function for the built-in array and
tsvector opclasses.

Alexander Korotkov, with some changes by me.
2014-03-12 17:51:30 +02:00
Heikki Linnakangas fecfc2b913 In WAL replay, restore GIN metapage unconditionally to avoid torn page.
We don't take a full-page image of the GIN metapage; instead, the WAL record
contains all the information required to reconstruct it from scratch. But
to avoid torn page hazards, we must re-initialize it from the WAL record
every time, even if it already has a greater LSN, similar to how normal full
page images are restored.

This was highly unlikely to cause any problems in practice, because the GIN
metapage is small. We rely on an update smaller than a 512 byte disk sector
to be atomic elsewhere, at least in pg_control. But better safe than sorry,
and this would be easy to overlook if more fields are added to the metapage
so that it's no longer small.

Reported by Noah Misch. Backpatch to all supported versions.
2014-03-12 10:04:57 +02:00
Tom Lane e85a5ffba8 Fix tracking of psql script line numbers during \copy from another place.
Commit 08146775ac changed do_copy() to
temporarily scribble on pset.cur_cmd_source.  That was a mighty ugly bit of
code in any case, but in particular it broke handleCopyIn's ability to tell
whether it was reading from the current script source file (in which case
pset.lineno should be incremented for each line of COPY data), or from
someplace else (in which case it shouldn't).  The former case still worked,
the latter not so much.  The visible effect was that line numbers reported
for errors in a script file would be wrong if there were an earlier \copy
that was reading anything other than inline-in-the-script-file data.

To fix, introduce another pset field that holds the file do_copy wants the
COPY code to use.  This is a little bit ugly, but less so than passing the
file down explicitly through several layers that aren't COPY-specific.

Extracted from a larger patch by Kumar Rajeev Rastogi; that patch also
changes printing of COPY command tags, which is not a bug fix and shouldn't
get back-patched.  This particular idea was from a suggestion by Amit
Khandekar, if I'm reading the thread correctly.

Back-patch to 9.2 where the faulty code was introduced.
2014-03-10 15:47:40 -04:00
Robert Haas 8722017bbc Allow dynamic shared memory segments to be kept until shutdown.
Amit Kapila, reviewed by Kyotaro Horiguchi, with some further
changes by me.
2014-03-10 14:04:47 -04:00
Robert Haas 5a991ef869 Allow logical decoding via the walsender interface.
In order for this to work, walsenders need the optional ability to
connect to a database, so the "replication" keyword now allows true
or false, for backward-compatibility, and the new value "database"
(which causes the "dbname" parameter to be respected).

walsender needs to loop not only when idle but also when sending
decoded data to the user and when waiting for more xlog data to decode.
This means that there are now three separate loops inside walsender.c;
although some refactoring has been done here, this is still a bit ugly.

Andres Freund, with contributions from Álvaro Herrera, and further
review by me.
2014-03-10 13:50:28 -04:00
Robert Haas cb9a0c7987 Teach on_exit_reset() to discard pending cleanups for dsm.
If a postmaster child invokes fork() and then calls on_exit_reset, that
should be sufficient to let it exit() without breaking anything, but
dynamic shared memory broke that by not updating on_exit_reset() to
discard callbacks registered with dynamic shared memory segments.

Per investigation of a complaint from Tom Lane.
2014-03-10 10:17:19 -04:00
Simon Riggs 77049443a1 Correct copy/pasto in comment for REPLICA IDENTITY 2014-03-09 09:05:16 +00:00
Bruce Momjian 5024044a20 C comments: improve description of relfilenode uniqueness
Report by Antonin Houska
2014-03-08 12:20:30 -05:00
Bruce Momjian 11d205e2bd pg_ctl: improve handling of invalid data directory
Return '4' and report a meaningful error message when a non-existent or
invalid data directory is passed.  Previously, pg_ctl would just report
the server was not running.

Patch by me and Amit Kapila
Report from Peter Eisentraut
2014-03-08 12:15:25 -05:00
Tom Lane ea177a3ba7 Remove unportable use of anonymous unions from reorderbuffer.h.
In b89e151054 I had assumed it was ok to use anonymous unions as
struct members, but while a longstanding extension in many compilers,
it's only been standardized in C11.

To fix, remove one of the anonymous unions which tried to hide some
implementation specific enum values and give the other a name. The
latter unfortunately requires changes in output plugins, but since the
feature has only been added a few days ago...

Andres Freund
2014-03-07 17:03:26 -05:00
Bruce Momjian 91d9de9751 fix ReplicationSlotsCountDBSlots for dropping unrelated databases
YAMAMOTO Takashi
2014-03-07 11:42:18 -05:00
Heikki Linnakangas 55566c9a74 Fix dangling smgr_owner pointer when a fake relcache entry is freed.
A fake relcache entry can "own" a SmgrRelation object, like a regular
relcache entry. But when it was free'd, the owner field in SmgrRelation
was not cleared, so it was left pointing to free'd memory.

Amazingly this apparently hasn't caused crashes in practice, or we would've
heard about it earlier. Andres found this with Valgrind.

Report and fix by Andres Freund, with minor modifications by me. Backpatch
to all supported versions.
2014-03-07 13:28:52 +02:00
Heikki Linnakangas ad7b48ea08 Avoid memcpy() with same source and destination address.
The behavior of that is undefined, although unlikely to lead to problems in
practice.

Found by running regression tests with Valgrind.
2014-03-07 13:14:33 +02:00
Tom Lane 7c31874945 Avoid getting more than AccessShareLock when deparsing a query.
In make_ruledef and get_query_def, we have long used AcquireRewriteLocks
to ensure that the querytree we are about to deparse is up-to-date and
the schemas of the underlying relations aren't changing.  Howwever, that
function thinks the query is about to be executed, so it acquires locks
that are stronger than necessary for the purpose of deparsing.  Thus for
example, if pg_dump asks to deparse a rule that includes "INSERT INTO t",
we'd acquire RowExclusiveLock on t.  That results in interference with
concurrent transactions that might for example ask for ShareLock on t.
Since pg_dump is documented as being purely read-only, this is unexpected.
(Worse, it used to actually be read-only; this behavior dates back only
to 8.1, cf commit ba4200246.)

Fix this by adding a parameter to AcquireRewriteLocks to tell it whether
we want the "real" execution locks or only AccessShareLock.

Report, diagnosis, and patch by Dean Rasheed.  Back-patch to all supported
branches.
2014-03-06 19:31:05 -05:00
Heikki Linnakangas a0c2fa9b5c isdigit() needs an unsigned char argument.
Per the C standard, the routine should be passed an int, with a value that's
representable as an unsigned char or EOF. Passing a signed char is wrong,
because a negative value is not representable as an unsigned char.
Unfortunately no compiler warns about that.
2014-03-06 21:40:10 +02:00
Heikki Linnakangas 94ae6ba74d Send keepalives from walsender even when busy sending WAL.
If walsender doesn't hear from the client for the time specified by
wal_sender_timeout, it will conclude the connection or client is dead, and
disconnect. When half of wal_sender_timeout has elapsed, it sends a ping
to the client, leaving it the remainig half of wal_sender_timeout to
respond. However, it only checked if half of wal_sender_timeout had elapsed
when it was about to sleep, so if it was busy sending WAL to the client for
long enough, it would not send the ping request in time. Then the client
would not know it needs to send a reply, and the walsender will disconnect
even though the client is still alive. Fix that.

Andres Freund, reviewed by Robert Haas, and some further changes by me.
Backpatch to 9.3. Earlier versions relied on the client to send the
keepalives on its own, and hence didn't have this problem.
2014-03-06 21:38:51 +02:00
Tom Lane bf4052faa1 Don't reject ROW_MARK_REFERENCE rowmarks for materialized views.
We should allow this so that matviews can be referenced in UPDATE/DELETE
statements in READ COMMITTED isolation level.  The requirement for that
is that a re-fetch by TID will see the same row version the query saw
earlier, which is true of matviews, so there's no reason for the
restriction.  Per bug #9398.

Michael Paquier, after a suggestion by me
2014-03-06 11:37:02 -05:00
Bruce Momjian 0024a3a3b6 C comment update: relfilenode is only unique with a tablespace
Report from Antonin Houska
2014-03-05 20:52:34 -05:00
Bruce Momjian b44fc39fce pg_dump: make argument combination error exit code consistent
Per report from Pavel Golub
2014-03-05 18:15:49 -05:00
Tom Lane f1ba94bcd9 Fix portability issues in recently added make_timestamp/make_interval code.
Explicitly reject infinity/NaN inputs, rather than just assuming that
something else will do it for us.  Per buildfarm.

While at it, make some over-parenthesized and under-legible code
more readable.
2014-03-05 16:42:18 -05:00
Tom Lane 8cf0ad1ea3 Add comment that ec_relids excludes "child" EquivalenceClass members.
This was already documented a few lines further down, but the comment
just beside the field declaration could be misleading.  Per gripe
from Kyotaro Horiguchi.
2014-03-05 16:00:33 -05:00
Robert Haas 406a1a9ef0 Fix some typos introduced by the logical decoding patch.
Erik Rijkers
2014-03-05 13:00:22 -05:00
Tom Lane 114b26c06f Remove unused field "evttype".
Apparent oversight in commit 3855968f.
2014-03-05 11:57:53 -05:00
Alvaro Herrera 2b4f2ab33d Remove the correct pgstat file on DROP DATABASE
We were unlinking the permanent file, not the non-permanent one.  But
since the stat collector already unlinks all permanent files on startup,
there was nothing for it to unlink.  The non-permanent file remained in
place, and was copied to the permanent directory on shutdown, so in
effect no file was ever dropped.

Backpatch to 9.3, where the issue was introduced by commit 187492b6c2.
Before that, there were no per-database files and thus no file to drop
on DROP DATABASE.

Per report from Thom Brown.

Author: Tomáš Vondra
2014-03-05 13:03:29 -03:00
Stephen Frost dd917bb793 Allocate fresh memory for post_opts/exec_path
Instead of having read_post_opts() depend on the memory allocated for
the config file (which is now getting free'd), pg_strdup() for
post_opts and exec_path (similar to how it's being done elsewhere).

Noted by Thom Brown.
2014-03-05 08:50:12 -05:00
Heikki Linnakangas 956685f82b Do wal_level and hot standby checks when doing crash-then-archive recovery.
CheckRequiredParameterValues() should perform the checks if archive recovery
was requested, even if we are going to perform crash recovery first.

Reported by Kyotaro HORIGUCHI. Backpatch to 9.2, like the crash-then-archive
recovery mode.
2014-03-05 14:48:14 +02:00
Heikki Linnakangas af246c37c0 Fix lastReplayedEndRecPtr calculation when starting from shutdown checkpoint.
When entering crash recovery followed by archive recovery, and the latest
checkpoint is a shutdown checkpoint, and there are no more WAL records to
replay before transitioning from crash to archive recovery, we would not
immediately allow read-only connections in hot standby mode even if we
could. That's because when starting from a shutdown checkpoint, we set
lastReplayedEndRecPtr incorrectly to the record before the checkpoint
record, instead of the checkpoint record itself. We don't run the redo
routine of the shutdown checkpoint record, but starting recovery from it
goes through the same motions, so it should be considered as replayed.

Reported by Kyotaro HORIGUCHI. All versions with hot standby are affected,
so backpatch to 9.0.
2014-03-05 13:51:19 +02:00
Stephen Frost eb933162cd Fix issues with pg_ctl
The new, small, free_readfile managed to have bug in it which could
cause it to try and free something it shouldn't, and fix the case
where it was being called with an invalid pointer leading to a
segfault.

Noted by Bruce, issues introduced and fixed by me.
2014-03-05 01:30:03 -05:00
Andrew Dunstan 3b5e03dca2 Provide a FORCE NULL option to COPY in CSV mode.
This forces an input field containing the quoted null string to be
returned as a NULL. Without this option, only unquoted null strings
behave this way. This helps where some CSV producers insist on quoting
every field, whether or not it is needed. The option takes a list of
fields, and only applies to those columns. There is an equivalent
column-level option added to file_fdw.

Ian Barwick, with some tweaking by Andrew Dunstan, reviewed by Payal
Singh.
2014-03-04 17:31:59 -05:00
Alvaro Herrera 84df54b22e Constructors for interval, timestamp, timestamptz
Author: Pavel Stěhule, editorialized somewhat by Álvaro Herrera
Reviewed-by: Tomáš Vondra, Marko Tiikkaja
With input from Fabrízio de Royes Mello, Jim Nasby
2014-03-04 15:09:43 -03:00
Robert Haas af2543e884 Allow VACUUM FULL/CLUSTER to bump freeze horizons even for pg_class.
pg_class is a special case for CLUSTER and VACUUM FULL, so although
commit 3cff1879f8 caused these
operations to advance relfrozenxid and relminmxid for all other
tables, it did not provide the same benefit for pg_class.  This
plugs that gap.

Andres Freund
2014-03-04 11:08:18 -05:00
Robert Haas 7e8db2dc42 Minor corrections to logical decoding patch. 2014-03-04 11:07:54 -05:00
Heikki Linnakangas 7558cc95d3 Error out on send failure in walsender loop.
I changed the loop in 9.3 to use "goto send_failure" instead of "break" on
errors, but I missed this one case. It was a relatively harmless bug: if
the flush fails once it will most likely fail again as soon as we try to
flush the output again. But it's a bug nevertheless.

Report and fix by Andres Freund.
2014-03-04 15:36:05 +02:00
Robert Haas b89e151054 Introduce logical decoding.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the effected tables.  The output format is controlled by a
so-called "output plugin"; an example is included.  To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.

Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.

Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Gheogegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Abhijit
Menon-Sen, Michael Paquier, Simon Riggs, Craig Ringer, and Steve
Singer.
2014-03-03 16:32:18 -05:00
Peter Eisentraut de94b47c0a Fix whitespace 2014-03-03 14:05:33 -05:00
Heikki Linnakangas f8ce16d0d2 Rename huge_tlb_pages to huge_pages, and improve docs.
Christian Kruse
2014-03-03 20:52:48 +02:00
Alvaro Herrera 9067310cc5 pg_dump et al: Add --if-exists option
This option makes pg_dump, pg_dumpall and pg_restore inject an IF EXISTS
clause to each DROP command they emit.  (In pg_dumpall, the clause is
not added to individual objects drops, but rather to the CREATE DATABASE
commands, as well as CREATE ROLE and CREATE TABLESPACE.)

This allows for a better user dump experience when using --clean in case
some objects do not already exist.  Per bug #7873 by Dave Rolsky.

Author: Pavel Stěhule
Reviewed-by: Jeevan Chalke, Álvaro Herrera, Josh Kupershmidt
2014-03-03 15:02:18 -03:00
Robert Haas 34c6d9611d Use a longer buffer in libpqrcv_startstreaming.
Because of the new SLOT clause in the START_REPLICATION command, it's
possible for the command to end up too long for the old maximum buffer
length.

Andres Freund
2014-03-03 07:24:52 -05:00
Robert Haas a8e9b86b5e Bump catversion.
The previous patch should have entailed a catversion bump, but I
forgot.
2014-03-03 07:22:20 -05:00
Robert Haas d83ee62231 Corrections to replication slots code and documentation.
Andres Freund, per a report from Vik Faering
2014-03-03 07:16:54 -05:00
Robert Haas ae95f5f74a Define LSNOID in pg_type.h.
Most other built-in types have a similarly-named constant, so this
type should probably have one, too.

Michael Paquier
2014-03-03 07:03:41 -05:00
Stephen Frost 5592ebac55 Another round of Coverity fixes
Additional non-security issues/improvements spotted by Coverity.

In backend/libpq, no sense trying to protect against port->hba being
NULL after we've already dereferenced it in the switch() statement.

Prevent against possible overflow due to 32bit arithmitic in
basebackup throttling (not yet released, so no security concern).

Remove nonsensical check of array pointer against NULL in procarray.c,
looks to be a holdover from 9.1 and earlier when there were pointers
being used but now it's just an array.

Remove pointer check-against-NULL in tsearch/spell.c as we had already
dereferenced it above (in the strcmp()).

Remove dead code from adt/orderedsetaggs.c, isnull is checked
immediately after each tuplesort_getdatum() call and if true we return,
so no point checking it again down at the bottom.

Remove recently added minor error-condition memory leak in pg_regress.
2014-03-03 03:18:51 -05:00
Stephen Frost b1aebbb6a8 Various Coverity-spotted fixes
A number of issues were identified by the Coverity scanner and are
addressed in this patch.  None of these appear to be security issues
and many are mostly cosmetic changes.

Short comments for each of the changes follows.

Correct the semi-colon placement in be-secure.c regarding SSL retries.
Remove a useless comparison-to-NULL in proc.c (value is dereferenced
  prior to this check and therefore can't be NULL).
Add checking of chmod() return values to initdb.
Fix a couple minor memory leaks in initdb.
Fix memory leak in pg_ctl- involves free'ing the config file contents.
Use an int to capture fgetc() return instead of an enum in pg_dump.
Fix minor memory leaks in pg_dump.
  (note minor change to convertOperatorReference()'s API)
Check fclose()/remove() return codes in psql.
Check fstat(), find_my_exec() return codes in psql.
Various ECPG memory leak fixes.
Check find_my_exec() return in ECPG.
Explicitly ignore pqFlush return in libpq error-path.
Change PQfnumber() to avoid doing an strdup() when no changes required.
Remove a few useless check-against-NULL's (value deref'd beforehand).
Check rmtree(), malloc() results in pg_regress.
Also check get_alternative_expectfile() return in pg_regress.
2014-03-01 22:14:14 -05:00
Tom Lane 9662143f0c Allow regex operations to be terminated early by query cancel requests.
The regex code didn't have any provision for query cancel; which is
unsurprising given its non-Postgres origin, but still problematic since
some operations can take a long time.  Introduce a callback function to
check for a pending query cancel or session termination request, and
call it in a couple of strategic spots where we can make the regex code
exit with an error indicator.

If we ever actually split out the regex code as a standalone library,
some additional work will be needed to let the cancel callback function
be specified externally to the library.  But that's straightforward
(certainly so by comparison to putting the locale-dependent character
classification logic on a similar arms-length basis), and there seems
no need to do it right now.

A bigger issue is that there may be more places than these two where
we need to check for cancels.  We can always add more checks later,
now that the infrastructure is in place.

Since there are known examples of not-terribly-long regexes that can
lock up a backend for a long time, back-patch to all supported branches.
I have hopes of fixing the known performance problems later, but adding
query cancel ability seems like a good idea even if they were all fixed.
2014-03-01 15:20:56 -05:00
Heikki Linnakangas d8a42b150f Remove bogus while-loop.
Commit abf5c5c9a4 added a bogus while-
statement after the for(;;)-loop. It went unnoticed in testing, because
it was dead code.

Report by KONDO Mitsumasa. Backpatch to 9.3. The commit that introduced
this was also applied to 9.2, but not the bogus while-loop part, because
the code in 9.2 looks quite different.
2014-02-28 13:33:41 +02:00
Alvaro Herrera ef5856fd9b Allow BASE_BACKUP to be throttled
A new MAX_RATE option allows imposing a limit to the network transfer
rate from the server side.  This is useful to limit the stress that
taking a base backup has on the server.

pg_basebackup is now able to specify a value to the server, too.

Author: Antonin Houska

Patch reviewed by Stefan Radomski, Andres Freund, Zoltán Böszörményi,
Fujii Masao, and Álvaro Herrera.
2014-02-27 18:55:57 -03:00
Alvaro Herrera 6bfa88acd3 Fix WAL replay of locking an updated tuple
We were resetting the tuple's HEAP_HOT_UPDATED flag as well as t_ctid on
WAL replay of a tuple-lock operation, which is incorrect when the tuple
is already updated.

Back-patch to 9.3.  The clearing of both header elements was there
previously, but since no update could be present on a tuple that was
being locked, it was harmless.

Bug reported by Peter Geoghegan and Greg Stark in
CAM3SWZTMQiCi5PV5OWHb+bYkUcnCk=O67w0cSswPvV7XfUcU5g@mail.gmail.com and
CAM-w4HPTOeMT4KP0OJK+mGgzgcTOtLRTvFZyvD0O4aH-7dxo3Q@mail.gmail.com
respectively; diagnosis by Andres Freund.
2014-02-27 11:13:39 -03:00
Heikki Linnakangas 00976f202c btbuild no longer calls _bt_doinsert(), update comment.
Peter Geoghegan
2014-02-26 18:49:04 +02:00
Jeff Davis 486ea0b19e Fix crash in json_to_record().
json_to_record() depends on get_call_result_type() for the tuple
descriptor of the record that should be returned, but in some cases
that cannot be determined. Add a guard to check if the tuple
descriptor has been properly resolved, similar to other callers of
get_call_result_type().

Also add guard for two other callers of get_call_result_type() in
jsonfuncs.c. Although json_to_record() is the only actual bug, it's a
good idea to follow convention.
2014-02-26 07:47:41 -08:00
Tom Lane fccebe421d Use SnapshotDirty rather than an active snapshot to probe index endpoints.
If there are lots of uncommitted tuples at the end of the index range,
get_actual_variable_range() ends up fetching each one and doing an MVCC
visibility check on it, until it finally hits a visible tuple.  This is
bad enough in isolation, considering that we don't need an exact answer
only an approximate one.  But because the tuples are not yet committed,
each visibility check does a TransactionIdIsInProgress() test, which
involves scanning the ProcArray.  When multiple sessions do this
concurrently, the ensuing contention results in horrid performance loss.
20X overall throughput loss on not-too-complicated queries is easy to
demonstrate in the back branches (though someone's made it noticeably
less bad in HEAD).

We can dodge the problem fairly effectively by using SnapshotDirty rather
than a normal MVCC snapshot.  This will cause the index probe to take
uncommitted tuples as good, so that we incur only one tuple fetch and test
even if there are many such tuples.  The extent to which this degrades the
estimate is debatable: it's possible the result is actually a more accurate
prediction than before, if the endmost tuple has become committed by the
time we actually execute the query being planned.  In any case, it's not
very likely that it makes the estimate a lot worse.

SnapshotDirty will still reject tuples that are known committed dead, so
we won't give bogus answers if an invalid outlier has been deleted but not
yet vacuumed from the index.  (Because btrees know how to mark such tuples
dead in the index, we shouldn't have a big performance problem in the case
that there are many of them at the end of the range.)  This consideration
motivates not using SnapshotAny, which was also considered as a fix.

Note: the back branches were using SnapshotNow instead of an MVCC snapshot,
but the problem and solution are the same.

Per performance complaints from Bartlomiej Romanski, Josh Berkus, and
others.  Back-patch to 9.0, where the issue was introduced (by commit
40608e7f94).
2014-02-25 16:04:06 -05:00
Robert Haas cf6aa68bbd Update a few comments to mention materialized views.
Etsuro Fujita
2014-02-25 13:40:12 -05:00
Robert Haas dd1a3bccca Show xid and xmin in pg_stat_activity and pg_stat_replication.
Christian Kruse, reviewed by Andres Freund and myself, with further
minor adjustments by me.
2014-02-25 12:34:04 -05:00
Robert Haas 278c94209b pg_basebackup: Skip only the *contents* of pg_replslot.
Include the directory itself.

Fujii Masao
2014-02-25 11:23:45 -05:00
Peter Eisentraut 32001ab0b7 Update and clarify ssl_ciphers default
- Write HIGH:MEDIUM instead of DEFAULT:!LOW:!EXP for clarity.
- Order 3DES last to work around inappropriate OpenSSL default.
- Remove !MD5 and @STRENGTH, because they are irrelevant.
- Add clarifying documentation.

Effectively, the new default is almost the same as the old one, but it
is arguably easier to understand and modify.

Author: Marko Kreen <markokr@gmail.com>
2014-02-24 20:30:28 -05:00
Bruce Momjian 848ae330a4 Increase work_mem and maintenance_work_mem defaults by 4x
New defaults are 4MB and 64MB.
2014-02-24 13:04:51 -05:00
Bruce Momjian 4bad548d98 psql: add separate \d display for disabled system triggers
Previously if you disabled all triggers, only user triggers would
show as disabled

Per report from Andres Freund
2014-02-24 12:44:55 -05:00
Bruce Momjian d613861b95 pg_dump: fix subtle memory leak in func and arg signature processing 2014-02-24 12:32:41 -05:00
Bruce Momjian 423f69ab64 Allow single-point polygons to be converted to circles
This allows finding the center of a single-point polygon and converting
it to a point.

Per report from Josef Grahn
2014-02-24 12:24:00 -05:00
Bruce Momjian 8457d0beca docs: document behavior of CHAR() comparisons with chars < space
Space trimming rather than space-padding causes unusual behavior, which
might not be standards-compliant.

Also remove recently-added now-redundant C comment.
2014-02-24 12:09:23 -05:00
Robert Haas 6615e77439 Use pg_lsn data type in pg_stat_replication, too.
Michael Paquier, per a suggestion from Andres Freund
2014-02-24 10:38:45 -05:00
Robert Haas bb818b53d4 Remove a couple of comments from the pg_lsn regression test.
Previously, one of these was a negative test case, but that got
changed along the way and the comments didn't get the memo.

Michael Paquier
2014-02-24 09:32:21 -05:00
Tom Lane 769065c1b2 Prefer pg_any_to_server/pg_server_to_any over pg_do_encoding_conversion.
A large majority of the callers of pg_do_encoding_conversion were
specifying the database encoding as either source or target of the
conversion, meaning that we can use the less general functions
pg_any_to_server/pg_server_to_any instead.

The main advantage of using the latter functions is that they can make use
of a cached conversion-function lookup in the common case that the other
encoding is the current client_encoding.  It's notationally cleaner too in
most cases, not least because of the historical artifact that the latter
functions use "char *" rather than "unsigned char *" in their APIs.

Note that pg_any_to_server will apply an encoding verification step in
some cases where pg_do_encoding_conversion would have just done nothing.
This seems to me to be a good idea at most of these call sites, though
it partially negates the performance benefit.

Per discussion of bug #9210.
2014-02-23 16:59:05 -05:00
Tom Lane 49c817eab7 Plug some more holes in encoding conversion.
Various places assume that pg_do_encoding_conversion() and
pg_server_to_any() will ensure encoding validity of their results;
but they failed to do so in the case that the source encoding is SQL_ASCII
while the destination is not.  We cannot perform any actual "conversion"
in that scenario, but we should still validate the string according to the
destination encoding.  Per bug #9210 from Digoal Zhou.

Arguably this is a back-patchable bug fix, but on the other hand adding
more enforcing of encoding checks might break existing applications that
were being sloppy.  On balance there doesn't seem to be much enthusiasm
for a back-patch, so fix in HEAD only.

While at it, remove some apparently-no-longer-needed provisions for
letting pg_do_encoding_conversion() "work" outside a transaction ---
if you consider it "working" to silently fail to do the requested
conversion.

Also, make a few cosmetic improvements in mbutils.c, notably removing
some Asserts that are certainly dead code since the variables they
assert aren't null are never null, even at process start.  (I think
this wasn't true at one time, but it is now.)
2014-02-23 15:22:50 -05:00
Peter Eisentraut fb05f3ce83 pg_basebackup: Add support for relocating tablespaces
Tablespaces can be relocated in plain backup mode by specifying one or
more -T olddir=newdir options.

Author: Steeve Lennmark <steevel@handeldsbanken.se>
Reviewed-by: Peter Eisentraut <peter_e@gmx.net>
2014-02-22 13:38:06 -05:00
Tom Lane 77585bce03 Do ScalarArrayOp estimation correctly when array is a stable expression.
Most estimation functions apply estimate_expression_value to see if they
can reduce an expression to a constant; the key difference is that it
allows evaluation of stable as well as immutable functions in hopes of
ending up with a simple Const node.  scalararraysel didn't get the memo
though, and neither did gincost_opexpr/gincost_scalararrayopexpr.  Fix
that, and remove a now-unnecessary estimate_expression_value step in the
subsidiary function scalararraysel_containment.

Per complaint from Alexey Klyukin.  Back-patch to 9.3.  The problem
goes back further, but I'm hesitant to change estimation behavior in
long-stable release branches.
2014-02-21 17:10:46 -05:00
Heikki Linnakangas 8f09ca436d Improve comment on setting data_checksum GUC.
There was an extra space there, and "fixed" wasn't very descriptive.
2014-02-20 10:58:30 +02:00
Tom Lane ae5266f259 Remove inappropriate EXPORTS line.
Looks like this gets added later ...
2014-02-19 21:08:50 -05:00
Tom Lane 4f5f485d10 Avoid using dllwrap to build pgevent in Mingw builds.
If this works, we can get rid of configure's support for locating dllwrap
... but let's see what the buildfarm says, first.

Hiroshi Inoue
2014-02-19 19:34:50 -05:00
Tom Lane 52acfd27f1 Fix some missing .gitignore and "make clean" items in ecpg.
Some of the files we optionally link in from elsewhere weren't ignored
and/or weren't cleaned up at "make clean".  Noted while testing on a
machine that needs our version of snprintf.c.
2014-02-19 18:50:48 -05:00
Robert Haas 6f289c2b7d Switch various builtin functions to use pg_lsn instead of text.
The functions in slotfuncs.c don't exist in any released version,
but the changes to xlogfuncs.c represent backward-incompatibilities.
Per discussion, we're hoping that the queries using these functions
are few enough and simple enough that this won't cause too much
breakage for users.

Michael Paquier, reviewed by Andres Freund and further modified
by me.
2014-02-19 11:37:43 -05:00
Robert Haas 694e3d139a Further code review for pg_lsn data type.
Change input function error messages to be more consistent with what is
done elsewhere.  Remove a bunch of redundant type casts, so that the
compiler will warn us if we screw up.  Don't pass LSNs by value on
platforms where a Datum is only 32 bytes, per buildfarm.  Move macros
for packing and unpacking LSNs to pg_lsn.h so that we can include
access/xlogdefs.h, to avoid an unsatisfied dependency on XLogRecPtr.
2014-02-19 10:06:59 -05:00
Robert Haas 844a28a9dd pg_lsn macro naming and type behavior revisions.
Change pg_lsn_mi so that it can return negative values when subtracting
LSNs, and clean up some perhaps ill-considered macro names.
2014-02-19 09:34:15 -05:00
Robert Haas 7d03a83f4d Add a pg_lsn data type, to represent an LSN.
Robert Haas and Michael Paquier
2014-02-19 08:35:23 -05:00
Tom Lane a222f7fda6 Remove broken code that tried to handle OVERLAPS with a single argument.
The SQL standard says that OVERLAPS should have a two-element row
constructor on each side.  The original coding of OVERLAPS support in
our grammar attempted to extend that by allowing a single-element row
constructor, which it internally duplicated ... or tried to, anyway.
But that code has certainly not worked since our List infrastructure was
rewritten in 2004, and I'm none too sure it worked before that.  As it
stands, it ends up building a List that includes itself, leading to
assorted undesirable behaviors later in the parser.

Even if it worked as intended, it'd be a bit evil because of the
possibility of duplicate evaluation of a volatile function that the user
had written only once.  Given the lack of documentation, test cases, or
complaints, let's just get rid of the idea and only support the standard
syntax.

While we're at it, improve the error cursor positioning for the
wrong-number-of-arguments errors, and inline the makeOverlaps() function
since it's only called in one place anyway.

Per bug #9227 from Joshua Yanovski.  Initial patch by Joshua Yanovski,
extended a bit by me.
2014-02-18 12:44:20 -05:00
Magnus Hagander 7f3e17b482 Disable RandomizedBaseAddress on MSVC builds
The ASLR in Windows 8/Windows 2012 can break PostgreSQL's shared memory. It
doesn't fail every time (which is explained by the Random part in ASLR), but
can fail with errors abut failing to reserve shared memory region.

MauMau, reviewed by Craig Ringer
2014-02-18 14:45:58 +01:00
Heikki Linnakangas 057152b37c Fix comment; checkpointer, not bgwriter, performs checkpoints since 9.2.
Amit Langote
2014-02-18 09:48:18 +02:00
Robert Haas 876f78d575 Fix capitalization in README.
Vik Fearing
2014-02-17 14:03:41 -05:00
Tom Lane 01824385ae Prevent potential overruns of fixed-size buffers.
Coverity identified a number of places in which it couldn't prove that a
string being copied into a fixed-size buffer would fit.  We believe that
most, perhaps all of these are in fact safe, or are copying data that is
coming from a trusted source so that any overrun is not really a security
issue.  Nonetheless it seems prudent to forestall any risk by using
strlcpy() and similar functions.

Fixes by Peter Eisentraut and Jozef Mlich based on Coverity reports.

In addition, fix a potential null-pointer-dereference crash in
contrib/chkpass.  The crypt(3) function is defined to return NULL on
failure, but chkpass.c didn't check for that before using the result.
The main practical case in which this could be an issue is if libc is
configured to refuse to execute unapproved hashing algorithms (e.g.,
"FIPS mode").  This ideally should've been a separate commit, but
since it touches code adjacent to one of the buffer overrun changes,
I included it in this commit to avoid last-minute merge issues.
This issue was reported by Honza Horak.

Security: CVE-2014-0065 for buffer overruns, CVE-2014-0066 for crypt()
2014-02-17 11:20:21 -05:00
Noah Misch 31400a6733 Predict integer overflow to avoid buffer overruns.
Several functions, mostly type input functions, calculated an allocation
size such that the calculation wrapped to a small positive value when
arguments implied a sufficiently-large requirement.  Writes past the end
of the inadvertent small allocation followed shortly thereafter.
Coverity identified the path_in() vulnerability; code inspection led to
the rest.  In passing, add check_stack_depth() to prevent stack overflow
in related functions.

Back-patch to 8.4 (all supported versions).  The non-comment hstore
changes touch code that did not exist in 8.4, so that part stops at 9.0.

Noah Misch and Heikki Linnakangas, reviewed by Tom Lane.

Security: CVE-2014-0064
2014-02-17 09:33:31 -05:00
Noah Misch 4318daecc9 Fix handling of wide datetime input/output.
Many server functions use the MAXDATELEN constant to size a buffer for
parsing or displaying a datetime value.  It was much too small for the
longest possible interval output and slightly too small for certain
valid timestamp input, particularly input with a long timezone name.
The long input was rejected needlessly; the long output caused
interval_out() to overrun its buffer.  ECPG's pgtypes library has a copy
of the vulnerable functions, which bore the same vulnerabilities along
with some of its own.  In contrast to the server, certain long inputs
caused stack overflow rather than failing cleanly.  Back-patch to 8.4
(all supported versions).

Reported by Daniel Schüssler, reviewed by Tom Lane.

Security: CVE-2014-0063
2014-02-17 09:33:31 -05:00
Robert Haas 5f173040e3 Avoid repeated name lookups during table and index DDL.
If the name lookups come to different conclusions due to concurrent
activity, we might perform some parts of the DDL on a different table
than other parts.  At least in the case of CREATE INDEX, this can be
used to cause the permissions checks to be performed against a
different table than the index creation, allowing for a privilege
escalation attack.

This changes the calling convention for DefineIndex, CreateTrigger,
transformIndexStmt, transformAlterTableStmt, CheckIndexCompatible
(in 9.2 and newer), and AlterTable (in 9.1 and older).  In addition,
CheckRelationOwnership is removed in 9.2 and newer and the calling
convention is changed in older branches.  A field has also been added
to the Constraint node (FkConstraint in 8.4).  Third-party code calling
these functions or using the Constraint node will require updating.

Report by Andres Freund.  Patch by Robert Haas and Andres Freund,
reviewed by Tom Lane.

Security: CVE-2014-0062
2014-02-17 09:33:31 -05:00
Noah Misch 537cbd35c8 Prevent privilege escalation in explicit calls to PL validators.
The primary role of PL validators is to be called implicitly during
CREATE FUNCTION, but they are also normal functions that a user can call
explicitly.  Add a permissions check to each validator to ensure that a
user cannot use explicit validator calls to achieve things he could not
otherwise achieve.  Back-patch to 8.4 (all supported versions).
Non-core procedural language extensions ought to make the same two-line
change to their own validators.

Andres Freund, reviewed by Tom Lane and Noah Misch.

Security: CVE-2014-0061
2014-02-17 09:33:31 -05:00
Noah Misch fea164a72a Shore up ADMIN OPTION restrictions.
Granting a role without ADMIN OPTION is supposed to prevent the grantee
from adding or removing members from the granted role.  Issuing SET ROLE
before the GRANT bypassed that, because the role itself had an implicit
right to add or remove members.  Plug that hole by recognizing that
implicit right only when the session user matches the current role.
Additionally, do not recognize it during a security-restricted operation
or during execution of a SECURITY DEFINER function.  The restriction on
SECURITY DEFINER is not security-critical.  However, it seems best for a
user testing his own SECURITY DEFINER function to see the same behavior
others will see.  Back-patch to 8.4 (all supported versions).

The SQL standards do not conflate roles and users as PostgreSQL does;
only SQL roles have members, and only SQL users initiate sessions.  An
application using PostgreSQL users and roles as SQL users and roles will
never attempt to grant membership in the role that is the session user,
so the implicit right to add or remove members will never arise.

The security impact was mostly that a role member could revoke access
from others, contrary to the wishes of his own grantor.  Unapproved role
member additions are less notable, because the member can still largely
achieve that by creating a view or a SECURITY DEFINER function.

Reviewed by Andres Freund and Tom Lane.  Reported, independently, by
Jonas Sundman and Noah Misch.

Security: CVE-2014-0060
2014-02-17 09:33:31 -05:00
Tom Lane fa1f0d7859 PGDLLIMPORT-ify MainLWLockArray, ProcDiePending, proc_exit_inprogress.
These are needed in HEAD to make assorted contrib modules build on Windows.
Now that all the MSVC and Mingw buildfarm members seem to be on the same
page about the need for them, we can have some confidence that future
problems of this ilk will be detected promptly; there seems nothing more
to be learned by delaying this fix further.

I chose to mark QueryCancelPending as well, since it's easy to imagine code
that wants to touch ProcDiePending also caring about QueryCancelPending.
2014-02-16 20:12:43 -05:00
Tom Lane a1c802712c Fix unportable coding in tarCreateHeader().
uid_t and gid_t might be wider than int on some platforms.
Per buildfarm member brolga.
2014-02-16 20:01:18 -05:00
Tom Lane 8d6e2d4abf Revert to using --enable-auto-import in Cygwin builds.
Disabling auto-import requires that all libraries we use be careful about
declspecs for exported variables; and it seems they aren't.  This means
that Cygwin will not give us useful info about missing PGDLLIMPORT markers;
but it's probably sufficient that MSVC and Mingw builds do.
2014-02-16 15:14:04 -05:00
Tom Lane a5cf60682e PGDLLIMPORT'ify DateStyle and IntervalStyle.
This is needed on Windows to support contrib/postgres_fdw.  Although it's
been broken since last March, we didn't notice until recently because there
were no active buildfarm members that complained about missing PGDLLIMPORT
marking.  Efforts are underway to improve that situation, in support of
which we're delaying fixing some other cases of global variables that
should be marked PGDLLIMPORT.  However, this case affects 9.3, so we
can't wait any longer to fix it.

I chose to mark DateOrder as well, though it's not strictly necessary
for postgres_fdw.
2014-02-16 12:37:07 -05:00
Tom Lane 56caaf195e On Windows, expect to find Tcl DLL in bin directory not lib directory.
Still another step in the continuing saga of trying to get
--disable-auto-import to work.

Hiroshi Inoue
2014-02-16 11:24:38 -05:00
Tom Lane 643f75ca9b Fix unportable coding in BackgroundWorkerStateChange().
PIDs aren't necessarily ints; our usual practice for printing them
is to explicitly cast to long.  Per buildfarm member rover_firefly.
2014-02-15 17:15:05 -05:00
Tom Lane f0ee42d59b Fix unportable coding in DetermineSleepTime().
We should not assume that struct timeval.tv_sec is a long, because
it ain't necessarily.  (POSIX says that it's a time_t, which might
well be 64 bits now or in the future; or for that matter might be
32 bits on machines with 64-bit longs.)  Per buildfarm member panther.

Back-patch to 9.3 where the dubious coding was introduced.
2014-02-15 17:09:50 -05:00
Tom Lane 60ff2fdd99 Centralize getopt-related declarations in a new header file pg_getopt.h.
We used to have externs for getopt() and its API variables scattered
all over the place.  Now that we find we're going to need to tweak the
variable declarations for Cygwin, it seems like a good idea to have
just one place to tweak.

In this commit, the variables are declared "#ifndef HAVE_GETOPT_H".
That may or may not work everywhere, but we'll soon find out.

Andres Freund
2014-02-15 14:31:30 -05:00
Bruce Momjian 32be1c8e90 Remove use of sscanf in pg_upgrade, and add C comment to pg_dump
Per report from Jackie Chang
2014-02-15 11:50:56 -05:00
Bruce Momjian a0d8947acb psql: Add C comment about gset_prefix being freed later 2014-02-15 00:09:40 -05:00
Tom Lane 1c5143a0b5 Ooops, forgot to remove solar87 and friends from src/timezone/Makefile.
Per buildfarm.
2014-02-14 23:20:08 -05:00
Tom Lane e04641f4b4 Update time zone data files to tzdata release 2013i.
DST law changes in Jordan; historical changes in Cuba.

Also, remove the zones Asia/Riyadh87, Asia/Riyadh88, and Asia/Riyadh89.
Per the upstream announcement:
    The files solar87, solar88, and solar89 are no longer distributed.
    They were a negative experiment -- that is, a demonstration that
    tz data can represent solar time only with some difficulty and error.
    Their presence in the distribution caused confusion, as Riyadh
    civil time was generally not solar time in those years.
2014-02-14 21:59:13 -05:00
Tom Lane 638b153f2a Fix fat-fingered makefile changes for pltcl.
I put the OBJS assignments in the wrong order.  Per buildfarm.
2014-02-14 17:10:53 -05:00
Tom Lane dcbf39774f In mingw builds, make our own import library for libtcl, too.
Per buildfarm results.
2014-02-14 13:13:06 -05:00
Tom Lane 02b61dd08f In mingw builds, make our own import library for libperl.
Borrow the method already used by plpython.  This is pretty ugly, but
it might fix the build failure exhibited by buildfarm member narwhal
since commit 846e91e022.

Hiroshi Inoue
2014-02-14 11:51:02 -05:00
Tom Lane a7983e989d Cosmetic improvements in plpython's make rule for libpython import library.
This build technique is remarkably ugly, but that doesn't mean it has
to be unreadable too.  Be a bit more liberal with the vertical whitespace,
and give the .def file a proper dependency, just in case.
2014-02-14 11:31:35 -05:00
Heikki Linnakangas 4d894b41cd Change the order that pg_xlog and WAL archive are polled for WAL segments.
If there is a WAL segment with same ID but different TLI present in both
the WAL archive and pg_xlog, prefer the one with higher TLI. Before this
patch, the archive was polled first, for all expected TLIs, and only if no
file was found was pg_xlog scanned. This was a change in behavior from 9.3,
which first scanned archive and pg_xlog for the highest TLI, then archive
and pg_xlog for the next highest TLI and so forth. This patch reverts the
behavior back to what it was in 9.2.

The reason for this is that if for example you try to do archive recovery
to timeline 2, which branched off timeline 1, but the WAL for timeline 2 is
not archived yet, we would replay past the timeline switch point on
timeline 1 using the archived files, before even looking timeline 2's files
in pg_xlog

Report and patch by Kyotaro Horiguchi. Backpatch to 9.3 where the behavior
was changed.
2014-02-14 15:15:09 +02:00
Peter Eisentraut 0f2ca0075c Fix typo
Stefan Kaltenbrunner
2014-02-13 21:50:43 -05:00
Bruce Momjian 9c57d11fca Add C comment about problems with CHAR() space trimming 2014-02-13 21:46:03 -05:00
Tom Lane b8f00a46bc Clean up error cases in psql's COPY TO STDOUT/FROM STDIN code.
Adjust handleCopyOut() to stop trying to write data once it's failed
one time.  For typical cases such as out-of-disk-space or broken-pipe,
additional attempts aren't going to do anything but waste time, and
in any case clean truncation of the output seems like a better behavior
than randomly dropping blocks in the middle.

Also remove dubious (and misleadingly documented) attempt to force our way
out of COPY_OUT state if libpq didn't do that.  If we did have a situation
like that, it'd be a bug in libpq and would be better fixed there, IMO.
We can hope that commit fa4440f516 took care
of any such problems, anyway.

Also fix longstanding bug in handleCopyIn(): PQputCopyEnd() only supports
a non-null errormsg parameter in protocol version 3, and will actively
fail if one is passed in version 2.  This would've made our attempts
to get out of COPY_IN state after a failure into infinite loops when
talking to pre-7.4 servers.

Back-patch the COPY_OUT state change business back to 9.2 where it was
introduced, and the other two fixes into all supported branches.
2014-02-13 18:45:58 -05:00
Alvaro Herrera 801c2dc72c Separate multixact freezing parameters from xid's
Previously we were piggybacking on transaction ID parameters to freeze
multixacts; but since there isn't necessarily any relationship between
rates of Xid and multixact consumption, this turns out not to be a good
idea.

Therefore, we now have multixact-specific freezing parameters:

vacuum_multixact_freeze_min_age: when to remove multis as we come across
them in vacuum (default to 5 million, i.e. early in comparison to Xid's
default of 50 million)

vacuum_multixact_freeze_table_age: when to force whole-table scans
instead of scanning only the pages marked as not all visible in
visibility map (default to 150 million, same as for Xids).  Whichever of
both which reaches the 150 million mark earlier will cause a whole-table
scan.

autovacuum_multixact_freeze_max_age: when for cause emergency,
uninterruptible whole-table scans (default to 400 million, double as
that for Xids).  This means there shouldn't be more frequent emergency
vacuuming than previously, unless multixacts are being used very
rapidly.

Backpatch to 9.3 where multixacts were made to persist enough to require
freezing.  To avoid an ABI break in 9.3, VacuumStmt has a couple of
fields in an unnatural place, and StdRdOptions is split in two so that
the newly added fields can go at the end.

Patch by me, reviewed by Robert Haas, with additional input from Andres
Freund and Tom Lane.
2014-02-13 19:36:31 -03:00
Tom Lane 44c2163302 Fix length checking for Unicode identifiers containing escapes (U&"...").
We used the length of the input string, not the de-escaped string, as
the trigger for NAMEDATALEN truncation.  AFAICS this would only result
in sometimes printing a phony truncation warning; but it's just luck
that there was no worse problem, since we were violating the API spec
for truncate_identifier().  Per bug #9204 from Joshua Yanovski.

This has been wrong since the Unicode-identifier support was added,
so back-patch to all supported branches.
2014-02-13 14:24:42 -05:00
Tom Lane fa4440f516 Improve libpq's error recovery for connection loss during COPY.
In pqSendSome, if the connection is already closed at entry, discard any
queued output data before returning.  There is no possibility of ever
sending the data, and anyway this corresponds to what we'd do if we'd
detected a hard error while trying to send().  This avoids possible
indefinite bloat of the output buffer if the application keeps trying
to send data (or even just keeps trying to do PQputCopyEnd, as psql
indeed will).

Because PQputCopyEnd won't transition out of PGASYNC_COPY_IN state
until it's successfully queued the COPY END message, and pqPutMsgEnd
doesn't distinguish a queuing failure from a pqSendSome failure,
this omission allowed an infinite loop in psql if the connection closure
occurred when we had at least 8K queued to send.  It might be worth
refactoring so that we can make that distinction, but for the moment
the other changes made here seem to offer adequate defenses.

To guard against other variants of this scenario, do not allow
PQgetResult to return a PGRES_COPY_XXX result if the connection is
already known dead.  Make sure it returns PGRES_FATAL_ERROR instead.

Per report from Stephen Frost.  Back-patch to all active branches.
2014-02-12 17:50:57 -05:00
Bruce Momjian 2fc80e8e83 Rename 'gmake' to 'make' in docs and recommended commands
This simplifies the docs and makes it easier to cut/paste command lines.
2014-02-12 17:29:19 -05:00
Tom Lane 6f2aead1ff In XLogReadBufferExtended, don't assume P_NEW yields consecutive pages.
In a database that's not yet reached consistency, it's possible that some
segments of a relation are not full-size but are not the last ones either.
Because of the way smgrnblocks() works, asking for a new page with P_NEW
will fill in the last not-full-size segment --- and if that makes it full
size, the apparent EOF of the relation will increase by more than one page,
so that the next P_NEW request will yield a page past the next consecutive
one.  This breaks the relation-extension logic in XLogReadBufferExtended,
possibly allowing a page update to be applied to some page far past where
it was intended to go.  This appears to be the explanation for reports of
table bloat on replication slaves compared to their masters, and probably
explains some corrupted-slave reports as well.

Fix the loop to check the page number it actually got, rather than merely
Assert()'ing that dead reckoning got it to the desired place.  AFAICT,
there are no other places that make assumptions about exactly which page
they'll get from P_NEW.

Problem identified by Greg Stark, though this is not the same as his
proposed patch.

It's been like this for a long time, so back-patch to all supported
branches.
2014-02-12 14:52:16 -05:00
Magnus Hagander 48870dd9f7 Add missing include, required on some platforms
Noted by the buildfarm and Andres Freund
2014-02-12 20:04:13 +01:00
Magnus Hagander 63ab2befe0 Kill pg_basebackup background process when exiting
If an error occurs in the foreground (backup) process of pg_basebackup,
and we exit in a controlled way, the background process (streaming
xlog process) would stay around and keep streaming.
2014-02-12 18:45:18 +01:00
Tom Lane 1c9acd5c86 Use --disable-auto-import linker switch in Mingw builds, too.
This is evidently the default on buildfarm member narwhal, but that
is a pretty ancient Mingw version, and there is reason to think that
more recent versions of GNU ld have this feature turned on by default.
Since we are trying to achieve consistency of link behavior across
all Windows toolchains, let's just make sure here.
2014-02-12 12:03:53 -05:00
Tom Lane 30657b796c Remove --enable-auto-import linker switch in Cygwin build.
This is expected to make it start failing when contrib modules
reference non-PGDLLIMPORT'ed global variables, as the other Windows
build methods do.  Aside from the value of consistency, the underlying
implementation of this switch is pretty ugly and not really something
we want to rely on if we have to use PGDLLIMPORT anyway for MSVC.
2014-02-12 11:53:07 -05:00
Tom Lane b23fd2d8b3 Tweak position of $(DLL_DEFFILE) in shared-library link commands.
Reading the GNU ld man page suggests that this is order-sensitive
and should go in front of library references.  Correction to commit
846e91e022.
2014-02-12 11:22:23 -05:00
Tom Lane a5eed4d770 Make gendef.pl emit DATA annotations for global variables.
This should make the MSVC build act more like builds for other platforms,
i.e. backend global variables will be automatically available to loadable
libraries without need for explicit PGDLLIMPORT marking.

Craig Ringer
2014-02-11 13:39:14 -05:00
Tom Lane 7a98d323df Flush a stray definition of $(DLLTOOL).
Even if this is needed, it'd be configure's responsibility to set it.
2014-02-11 12:59:48 -05:00
Tom Lane 846e91e022 Get rid of use of dlltool in Mingw builds.
We are almost completely out of the dlltool game, if this works.

Hiroshi Inoue
2014-02-11 12:56:20 -05:00
Tom Lane cba6ffaef3 Cygwin build fixes.
Get rid of use of dlltool for linking the main postgres executable.
dlltool is obsolete and we'd prefer to stop depending on it.

Also, include $(LDAP_LIBS_FE) in $(libpq_pgport).  (It's not clear that
this is really needed, or why it's not a linker bug if it is needed.
But reports are that it's needed on current Cygwin.)

We might want to back-patch this if it works, but first let's see
what the buildfarm thinks.

Marco Atzeri
2014-02-11 12:10:52 -05:00
Peter Eisentraut d3c4c47155 scripts: Remove newlines from end of generated SQL
This results in spurious empty lines in the server log.  Instead, add
the newlines only when printing out the --echo output.  In some cases,
this was already done, leading to two newlines being printed.  Clean
that up as well.

From: Fabrízio de Royes Mello <fabriziomello@gmail.com>
2014-02-10 21:47:19 -05:00
Tom Lane 2895415205 Don't generate plain-text HISTORY and src/test/regress/README anymore.
Providing this information as plain text was doubtless worth the trouble
ten years ago, but it seems likely that hardly anyone reads it in this
format anymore.  And the effort required to maintain these files (in the
form of extra-complex markup rules in the relevant parts of the SGML
documentation) is significant.  So, let's stop doing that and rely solely
on the other documentation formats.

Per discussion, the plain-text INSTALL instructions might still be worth
their keep, so we continue to generate that file.

Rather than remove HISTORY and src/test/regress/README from distribution
tarballs entirely, replace them with simple stub files that tell the reader
where to find the relevant documentation.  This is mainly to avoid possibly
breaking packaging recipes that expect these files to exist.

Back-patch to all supported branches, because simplifying the markup
requirements for release notes won't help much unless we do it in all
branches.
2014-02-10 20:48:04 -05:00
Heikki Linnakangas d699ba4134 Fix WakeupWaiters() to not wake up an exclusive locker unnecessarily.
WakeupWaiters() is supposed to wake up all LW_WAIT_UNTIL_FREE waiters of
the slot, but the loop incorrectly also woke up the first LW_EXCLUSIVE
waiter, if there was no LW_WAIT_UNTIL_FREE waiters in the queue.

Noted by Andres Freund. This code is new in 9.4, so no backpatching.
2014-02-10 15:18:18 +02:00
Heikki Linnakangas 6c2744f1d3 Use memmove() instead of memcpy() for copying overlapping regions.
In commit d2495f272c, I fixed this bug in
to_tsquery(), but missed the fact that plainto_tsquery() has the same bug.
2014-02-10 09:57:59 +02:00
Stephen Frost dfb1e9bdc0 Further pg_dump / ftello improvements
Make ftello error-checking consistent to all calls and remove a
bit of ftello-related code which has been #if 0'd out since 2001.

Note that we are not concerned with the ftello() call under
snprintf() failing as it is just building a string to call
exit_horribly() with; printing -1 in such a case is fine.
2014-02-09 18:28:14 -05:00
Stephen Frost 5e8e794e3b Focus on ftello result < 0 instead of errno
Rather than reset errno (or just hope that its cleared already),
check just the result of the ftello for < 0 to determine if there
was an issue.

Oversight by me, pointed out by Tom.
2014-02-09 13:29:36 -05:00
Magnus Hagander 8198a321c9 Limit pg_basebackup progress output to 1/second
This prevents pg_basebackup from generating excessive output when
dumping large clusters. The status is now updated once / second,
still making it possible to see that there is progress happening,
but limiting the total bandwidth.

Mika Eloranta, reviewed by Sawada Masahiko and Oskari Saarenmaa
2014-02-09 12:51:42 +01:00
Magnus Hagander 01025d80a1 Avoid printing uninitialized filename variable in verbose mode
When using verbose mode for pg_basebackup, in tar format sent to
stdout, we'd print an unitialized buffer as the filename.

Reported by Pontus Lundkvist
2014-02-09 12:05:14 +01:00
Stephen Frost cfa1b4a711 Minor pg_dump improvements
Improve pg_dump by checking results on various fgetc() calls which
previously were unchecked, ditto for ftello.  Also clean up a couple
of very minor memory leaks by waiting to allocate structures until
after the initial check(s).

Issues spotted by Coverity.
2014-02-08 21:25:47 -05:00
Peter Eisentraut 66c04c981d Mark some more variables as static or include the appropriate header
Detected by clang's -Wmissing-variable-declarations.

From: Andres Freund <andres@anarazel.de>
2014-02-08 21:21:46 -05:00
Heikki Linnakangas 6aa2bdf6a0 Initialize the entryRes array between each call to triConsistent.
The shimTriConstistentFn, which calls the opclass's consistent function with
all combinations of TRUE/FALSE for any MAYBE argument, modifies the entryRes
array passed by the caller. Change startScanKey to re-initialize it between
each call to accommodate that.

It's actually a bad habit by shimTriConsistentFn to modify its argument. But
the only caller that doesn't already re-initialize the entryRes array was
startScanKey, and it's easy for startScanKey to do so. Add a comment to
shimTriConsistentFn about that.

Note: this does not give a free pass to opclass-provided consistent
functions to modify the entryRes argument; shimTriConsistent assumes that
they don't, even though it does it itself.

While at it, refactor startScanKey to allocate the requiredEntries and
additionalEntries after it knows exactly how large they need to be. Saves a
little bit of memory, and looks nicer anyway.

Per complaint by Tom Lane, buildfarm and the pg_trgm regression test.
2014-02-07 18:53:31 +02:00
Heikki Linnakangas dbc649fd77 Speed up "rare & frequent" type GIN queries.
If you have a GIN query like "rare & frequent", we currently fetch all the
items that match either rare or frequent, call the consistent function for
each item, and let the consistent function filter out items that only match
one of the terms. However, if we can deduce that "rare" must be present for
the overall qual to be true, we can scan all the rare items, and for each
rare item, skip over to the next frequent item with the same or greater TID.
That greatly speeds up "rare & frequent" type queries.

To implement that, introduce the concept of a tri-state consistent function,
where the 3rd value is MAYBE, indicating that we don't know if that term is
present. Operator classes only provide a boolean consistent function, so we
simulate the tri-state consistent function by calling the boolean function
several times, with the MAYBE arguments set to all combinations of TRUE and
FALSE. Testing all combinations is only feasible for a small number of MAYBE
arguments, but it is envisioned that we'll provide a way for operator
classes to provide a native tri-state consistent function, which can be much
more efficient. But that is not included in this patch.

We were already using that trick to for lossy pages, calling the consistent
function with the lossy entry set to TRUE and FALSE. Now that we have the
tri-state consistent function, use it for lossy pages too.

Alexander Korotkov, with fair amount of refactoring by me.
2014-02-07 15:22:48 +02:00
Heikki Linnakangas e001030c27 Fix thinko in comment.
Amit Langote
2014-02-07 10:28:06 +02:00
Tom Lane 8de3e410fa In RelationClearRelation, postpone cache reload if !IsTransactionState().
We may process relcache flush requests during transaction startup or
shutdown.  In general it's not terribly safe to do catalog access at those
times, so the code's habit of trying to immediately revalidate unflushable
relcache entries is risky.  Although there are no field trouble reports
that are positively traceable to this, we have been able to demonstrate
failure of the assertions recently added in RelationIdGetRelation() and
SearchCatCache().  On the other hand, it seems safe to just postpone
revalidation of the cache entry until we're inside a valid transaction.
The one case where this is questionable is where we're exiting a
subtransaction and the outer transaction is holding the relcache entry open
--- but if we made any significant changes to the rel inside such a
subtransaction, we've got problems anyway.  There are mechanisms in place
to prevent that (to wit, locks for cross-session cases and
CheckTableNotInUse() for intra-session cases), so let's trust to those
mechanisms to keep us out of trouble.
2014-02-06 19:38:06 -05:00
Andrew Dunstan 45e1b6c4c4 Alphabeticize list in OBJS definition in utils/adt Makefile. 2014-02-06 12:11:49 -05:00
Tom Lane ddfc9cb054 Assert(IsTransactionState()) in RelationIdGetRelation().
Commit 42c80c696e added an
Assert(IsTransactionState()) in SearchCatCache(), to catch
any code that thought it could do a catcache lookup outside
transactions.  Extend the same idea to relcache lookups.
2014-02-06 11:28:13 -05:00
Peter Eisentraut f65233755c Fix whitespace 2014-02-05 23:12:51 -05:00
Tom Lane ac8bc3b6e4 Remove unnecessary relcache flushes after changing btree metapages.
These flushes were added in my commit d2896a9ed, which added the btree
logic that keeps a cached copy of the index metapage data in index relcache
entries.  The idea was to ensure that other backends would promptly update
their cached copies after a change.  However, this is not really necessary,
since _bt_getroot() has adequate defenses against believing a stale root
page link, and _bt_getrootheight() doesn't have to be 100% right.
Moreover, if it were necessary, a relcache flush would be an unreliable way
to do it, since the sinval mechanism believes that relcache flush requests
represent transactional updates, and therefore discards them on transaction
rollback.  Therefore, we might as well drop these flush requests and save
the time to rebuild the whole relcache entry after a metapage change.

If we ever try to support in-place truncation of btree indexes, it might
be necessary to revisit this issue so that _bt_getroot() can't get caught
by trying to follow a metapage link to a page that no longer exists.
A possible solution to that is to make use of an smgr, rather than
relcache, inval request to force other backends to discard their cached
metapages.  But for the moment this is not worth pursuing.
2014-02-05 13:43:46 -05:00
Peter Eisentraut 4e18236180 PL/Perl: Fix compiler warning
The code was assigning a (Datum) 0 to a void pointer.  That creates a
warning from clang 3.4.  It was probably a thinko to begin with.
2014-02-04 20:08:39 -05:00
Fujii Masao 489e6ac5a1 Fix comparison of an array of characters with zero to compare with '\0' instead.
Report from Andres Freund.
2014-02-04 10:59:39 +09:00
Tom Lane 0c2338abbb Fix lexing of U& sequences just before EOF.
Commit a5ff502fce was a brick shy of a load
in the backend lexer too, not just psql.  Per further testing of bug #9068.

In passing, improve related comments.
2014-02-03 19:47:57 -05:00
Tom Lane 0def2573c5 Fix *-qualification of named parameters in SQL-language functions.
Given a composite-type parameter named x, "$1.*" worked fine, but "x.*"
not so much.  This has been broken since named parameter references were
added in commit 9bff0780cf, so patch back
to 9.2.  Per bug #9085 from Hardy Falk.
2014-02-03 14:47:17 -05:00
Robert Haas 80353f3528 Adjust pg_sleep_for/pg_sleep_until to use clock_timestamp.
Otherwise, pg_sleep_until does the wrong thing in a multi-statement
transaction.

Julien Rouhaud
2014-02-03 14:33:43 -05:00
Andrew Dunstan d3ee45152b In json code, clean up temp memory contexts after processing.
Craig Ringer.
2014-02-03 10:40:12 -05:00
Fujii Masao 3e8554a54a Make pg_basebackup skip temporary statistics files.
The temporary statistics files don't need to be included in the backup
because they are always reset at the beginning of the archive recovery.
This patch changes pg_basebackup so that it skips all files located in
$PGDATA/pg_stat_tmp or the directory specified by stats_temp_directory
parameter.
2014-02-03 23:19:49 +09:00
Tom Lane 47aaebaac9 Switch in psql_scan() must cover all lexer states (except backslash cases).
Oversight in commit f7559c0101, which changed
UESCAPE lexing in psql.  Per bug #9068 from Manuel Gómez.
2014-02-02 18:59:34 -05:00
Tom Lane 46825d4978 Clean up some sloppy coding in repl_gram.y.
Remove unused copy-and-pasted macro definitions, and improve formatting
of recently-added productions.

I got interested in this because buildfarm member protosciurus has been
crashing in "bison repl_gram.y" since commit 858ec11.  It's a long shot
that this will fix that, though maybe the missing trailing semicolon
has something to do with it?  In any case, there's no need to approve
of dead code, nor of code whose formatting isn't even self-consistent
let alone consistent with what's around it.
2014-02-02 12:51:14 -05:00
Fujii Masao 0753bdb352 Add primary_slotname to recovery.conf.sample. 2014-02-03 00:41:50 +09:00
Fujii Masao 63be3b78f6 Fix typos in docs and comments.
Thom Brown
2014-02-02 10:28:18 +09:00
Andrew Dunstan 9abed7d1cb Fix makefile syntax. 2014-02-01 19:52:39 -05:00
Tom Lane 082c0dfa14 Fix some wide-character bugs in the text-search parser.
In p_isdigit and other character class test functions generated by the
p_iswhat macro, the code path for non-C locales with multibyte encodings
contained a bogus pointer cast that would accidentally fail to malfunction
if types wchar_t and wint_t have the same width.  Apparently that is true
on most platforms, but not on recent Cygwin releases.  Remove the cast,
as it seems completely unnecessary (I think it arose from a false analogy
to the need to cast to unsigned char when dealing with the <ctype.h>
functions).  Per bug #8970 from Marco Atzeri.

In the same functions, the code path for C locale with a multibyte encoding
simply ANDed each wide character with 0xFF before passing it to the
corresponding <ctype.h> function.  This could result in false positive
answers for some non-ASCII characters, so use a range test instead.
Noted by me while investigating Marco's complaint.

Also, remove some useless though not actually buggy maskings and casts
in the hand-coded p_isalnum and p_isalpha functions, which evidently
got tested a bit more carefully than the macro-generated functions.
2014-02-01 18:27:34 -05:00
Andrew Dunstan c8158a2eed fix whitespace 2014-02-01 16:30:26 -05:00
Tom Lane 214c7a4f0b Fix some more bugs in signal handlers and process shutdown logic.
WalSndKill was doing things exactly backwards: it should first clear
MyWalSnd (to stop signal handlers from touching MyWalSnd->latch),
then disown the latch, and only then mark the WalSnd struct unused by
clearing its pid field.

Also, WalRcvSigUsr1Handler and worker_spi_sighup failed to preserve
errno, which is surely a requirement for any signal handler.

Per discussion of recent buildfarm failures.  Back-patch as far
as the relevant code exists.
2014-02-01 16:21:23 -05:00
Andrew Dunstan 7e1531a450 Don't use deprecated dllwrap on Cygwin.
The preferred method is to use "cc -shared", and this allows binaries
to be rebased if required, unlike dllwrap.

Backpatch to 9.0 where we have buildfarm coverage.

There are still some issues with Cygwin, especially modern Cygwin, but
this helps us get closer to good support.

Marco Atzeri.
2014-02-01 16:08:33 -05:00
Andrew Dunstan d587298b80 Copy the libpq DLL to the bin directory on Mingw and Cygwin.
This has long been done by the MSVC build system, and has caused
confusion in the past when programs like psql have failed to start
because they can't find the DLL. If it's in the same directory as it now
will be they will find it.

Backpatch to all live branches.
2014-02-01 15:11:13 -05:00
Bruce Momjian d0ee93797d arrays: tighten checks for multi-dimensional input
Previously an input array string that started with a single-element
array dimension would then later accept a multi-dimensional segment.

BACKWARD INCOMPATIBILITY
2014-02-01 10:49:17 -05:00
Robert Haas 858ec11858 Introduce replication slots.
Replication slots are a crash-safe data structure which can be created
on either a master or a standby to prevent premature removal of
write-ahead log segments needed by a standby, as well as (with
hot_standby_feedback=on) pruning of tuples whose removal would cause
replication conflicts.  Slots have some advantages over existing
techniques, as explained in the documentation.

In a few places, we refer to the type of replication slots introduced
by this patch as "physical" slots, because forthcoming patches for
logical decoding will also have slots, but with somewhat different
properties.

Andres Freund and Robert Haas
2014-01-31 22:45:36 -05:00
Bruce Momjian 5168c76964 pg_restore: make help output plural for multi-enabled options
per report from Josh Kupershmidt
2014-01-31 22:29:01 -05:00
Robert Haas d1981719ad Clear MyProc and MyProcSignalState before they become invalid.
Evidence from buildfarm member crake suggests that the new test_shm_mq
module is routinely crashing the server due to the arrival of a SIGUSR1
after the shared memory segment has been unmapped.  Although processes
using the new dynamic background worker facilities are more likely to
receive a SIGUSR1 around this time, the problem is also possible on older
branches, so I'm back-patching the parts of this change that apply to
older branches as far as they apply.

It's already generally the case that code checks whether these pointers
are NULL before deferencing them, so the important thing is mostly to
make sure that they do get set to NULL before they become invalid.  But
in master, there's one case in procsignal_sigusr1_handler that lacks a
NULL guard, so add that.

Patch by me; review by Tom Lane.
2014-01-31 21:31:08 -05:00
Tom Lane 326e1d73c4 Disallow use of SSL v3 protocol in the server as well as in libpq.
Commit 820f08cabd claimed to make the server
and libpq handle SSL protocol versions identically, but actually the server
was still accepting SSL v3 protocol while libpq wasn't.  Per discussion,
SSL v3 is obsolete, and there's no good reason to continue to accept it.
So make the code really equivalent on both sides.  The behavior now is
that we use the highest mutually-supported TLS protocol version.

Marko Kreen, some comment-smithing by me
2014-01-31 17:51:18 -05:00
Bruce Momjian fc4ffba968 system catalogs: reorder pg_amproc entries into proper sections
Report form Antonin Houska
2014-01-31 16:04:18 -05:00
Bruce Momjian 290d2cb500 pgindent: add Perl comment 2014-01-31 14:46:00 -05:00
Bruce Momjian cad1e022b2 pgindent: add --list-of-typedefs option
Allows typedefs to be specified on the command line, per request from
Andrew.
2014-01-31 13:35:50 -05:00
Fujii Masao a87ae38be8 Add tab completion for ALTER TABLESPACE MOVE in psql. 2014-02-01 01:45:48 +09:00
Bruce Momjian 5ff47acf8f entab: add new options
Add new entab options to process only C comment whitespace after
periods, and to protect leading whitespace.
2014-01-31 11:05:21 -05:00
Bruce Momjian db98b31329 pgindent: preserve blank lines around #else/#endif
This requires a new version of pg_bsd_indent, version 1.3, to be
downloaded.
2014-01-30 22:40:05 -05:00
Robert Haas 760c770ff6 Add convenience functions pg_sleep_for and pg_sleep_until.
Vik Fearing, reviewed by Pavel Stehule and myself
2014-01-30 15:47:56 -05:00
Tom Lane 043f6ff05d Fix bogus handling of "postponed" lateral quals.
When pulling a "postponed" qual from a LATERAL subquery up into the quals
of an outer join, we must make sure that the postponed qual is included
in those seen by make_outerjoininfo().  Otherwise we might compute a
too-small min_lefthand or min_righthand for the outer join, leading to
"JOIN qualification cannot refer to other relations" failures from
distribute_qual_to_rels.  Subtler errors in the created plan seem possible,
too, if the extra qual would only affect join ordering constraints.

Per bug #9041 from David Leverton.  Back-patch to 9.3.
2014-01-30 14:51:16 -05:00
Bruce Momjian 146604ec43 Add checks for interval overflow/underflow
New checks include input, month/day/time internal adjustments, addition,
subtraction, multiplication, and negation.  Also adjust docs to
correctly specify interval size in bytes.

Report from Rok Kralj
2014-01-30 09:41:43 -05:00
Tom Lane 571addd729 Fix unsafe references to errno within error messaging logic.
Various places were supposing that errno could be expected to hold still
within an ereport() nest or similar contexts.  This isn't true necessarily,
though in some cases it accidentally failed to fail depending on how the
compiler chanced to order the subexpressions.  This class of thinko
explains recent reports of odd failures on clang-built versions, typically
missing or inappropriate HINT fields in messages.

Problem identified by Christian Kruse, who also submitted the patch this
commit is based on.  (I fixed a few issues in his patch and found a couple
of additional places with the same disease.)

Back-patch as appropriate to all supported branches.
2014-01-29 20:04:43 -05:00
Andrew Dunstan 120c5cc761 Silence compiler warnings about possibly unset variables.
They are in fact set in every case where they are needed, but the
compiler doesn't know that.

Per gripe from Tom Lane.
2014-01-29 18:54:14 -05:00
Andrew Dunstan 5e52e9d6d4 Forgot to bump catalog version for json_array_elements_text. 2014-01-29 16:38:31 -05:00
Robert Haas 9347baa5bb Include planning time in EXPLAIN ANALYZE output.
This doesn't work for prepared queries, but it's not too easy to get
the information in that case and there's some debate as to exactly
what the right thing to measure is, so just do this for now.

Andreas Karlsson, with slight doc changes by me.
2014-01-29 16:09:15 -05:00
Andrew Dunstan 5264d91541 Add json_array_elements_text function.
This was a notable omission from the json functions added in 9.3 and
there have been numerous complaints about its absence.

Laurence Rowe.
2014-01-29 15:39:01 -05:00
Heikki Linnakangas 699b1f40da Fix thinko in huge_tlb_pages patch.
We calculated the rounded-up size for the allocation, but then failed to
use the rounded-up value in the mmap() call. Oops.

Also, initialize allocsize, to silence warnings seen with some compilers,
as pointed out by Jeff Janes.
2014-01-29 21:33:56 +02:00
Heikki Linnakangas 626a120656 Further optimize GIN multi-key searches.
When skipping over some items in a posting tree, re-find the new location
by descending the tree from root, rather than walking the right links.
This can save a lot of I/O.

Heavily modified from Alexander Korotkov's fast scan patch.
2014-01-29 21:24:38 +02:00
Bruce Momjian 8440897b38 Fix pointer processing in new entab.c function 2014-01-29 13:31:11 -05:00