Commit Graph

16313 Commits

Author SHA1 Message Date
Alvaro Herrera 51c3e9fade Include <sys/select.h> where needed
<sys/select.h> is required by POSIX.1-2001 to get the prototype of
select(2), but nearly no systems enforce that because older standards
let you get away with including some other headers.  Recent OpenBSD
hacking has removed that frail touch of friendliness, however, which
broke some compiles; fix all the way back to 9.1 by adding the required
standard.  Only vacuumdb.c was reported to fail, but it seems easier to
fix the whole lot in a fell swoop.

Per bug #14334 by Sean Farrell.
2016-09-27 01:05:21 -03:00
Tom Lane fdc9186f7e Replace the built-in GIN array opclasses with a single polymorphic opclass.
We had thirty different GIN array opclasses sharing the same operators and
support functions.  That still didn't cover all the built-in types, nor
did it cover arrays of extension-added types.  What we want is a single
polymorphic opclass for "anyarray".  There were two missing features needed
to make this possible:

1. We have to be able to declare the index storage type as ANYELEMENT
when the opclass is declared to index ANYARRAY.  This just takes a few
more lines in index_create().  Although this currently seems of use only
for GIN, there's no reason to make index_create() restrict it to that.

2. We have to be able to identify the proper GIN compare function for
the index storage type.  This patch proceeds by making the compare function
optional in GIN opclass definitions, and specifying that the default btree
comparison function for the index storage type will be looked up when the
opclass omits it.  Again, that seems pretty generically useful.

Since the comparison function lookup is done in initGinState(), making
use of the second feature adds an additional cache lookup to GIN index
access setup.  It seems unlikely that that would be very noticeable given
the other costs involved, but maybe at some point we should consider
making GinState data persist longer than it now does --- we could keep it
in the index relcache entry, perhaps.

Rather fortuitously, we don't seem to need to do anything to get this
change to play nice with dump/reload or pg_upgrade scenarios: the new
opclass definition is automatically selected to replace existing index
definitions, and the on-disk data remains compatible.  Also, if a user has
created a custom opclass definition for a non-builtin type, this doesn't
break that, since CREATE INDEX will prefer an exact match to opcintype
over a match to ANYARRAY.  However, if there's anyone out there with
handwritten DDL that explicitly specifies _bool_ops or one of the other
replaced opclass names, they'll need to adjust that.

Tom Lane, reviewed by Enrique Meneses

Discussion: <14436.1470940379@sss.pgh.pa.us>
2016-09-26 14:52:44 -04:00
Tom Lane da6c4f6ca8 Refer to OS X as "macOS", except for the port name which is still "darwin".
We weren't terribly consistent about whether to call Apple's OS "OS X"
or "Mac OS X", and the former is probably confusing to people who aren't
Apple users.  Now that Apple has rebranded it "macOS", follow their lead
to establish a consistent naming pattern.  Also, avoid the use of the
ancient project name "Darwin", except as the port code name which does not
seem desirable to change.  (In short, this patch touches documentation and
comments, but no actual code.)

I didn't touch contrib/start-scripts/osx/, either.  I suspect those are
obsolete and due for a rewrite, anyway.

I dithered about whether to apply this edit to old release notes, but
those were responsible for quite a lot of the inconsistencies, so I ended
up changing them too.  Anyway, Apple's being ahistorical about this,
so why shouldn't we be?
2016-09-25 15:40:57 -04:00
Tom Lane 959ea7fa76 Remove useless code.
Apparent copy-and-pasteo in standby_desc_invalidations() had two
entries for msg->id == SHAREDINVALRELMAP_ID.

Aleksander Alekseev

Discussion: <20160923090814.GB1238@e733>
2016-09-23 10:44:50 -04:00
Tom Lane 8e6b4ee21f Don't trust CreateFileMapping() to clear the error code on success.
We must test GetLastError() even when CreateFileMapping() returns a
non-null handle.  If that value were left over from some previous system
call, we might be fooled into thinking the segment already existed.
Experimentation on Windows 7 suggests that CreateFileMapping() clears
the error code on success, but it is not documented to do so, so let's
not rely on that happening in all Windows releases.

Amit Kapila

Discussion: <20811.1474390987@sss.pgh.pa.us>
2016-09-23 10:09:52 -04:00
Tom Lane 49a91b88e6 Avoid using PostmasterRandom() for DSM control segment ID.
Commits 470d886c3 et al intended to fix the problem that the postmaster
selected the same "random" DSM control segment ID on every start.  But
using PostmasterRandom() for that destroys the intended property that the
delay between random_start_time and random_stop_time will be unpredictable.
(Said delay is probably already more predictable than we could wish, but
that doesn't mean that reducing it by a couple orders of magnitude is OK.)
Revert the previous patch and add a comment warning against misuse of
PostmasterRandom.  Fix the original problem by calling srandom() early in
PostmasterMain, using a low-security seed that will later be overwritten
by PostmasterRandom.

Discussion: <20789.1474390434@sss.pgh.pa.us>
2016-09-23 09:54:11 -04:00
Bruce Momjian 1ff0042165 C comment: fix function header comment
Fix for transformOnConflictClause().

Author: Tomonari Katsumata
2016-09-22 17:34:24 -04:00
Tom Lane 8023b5827f Remove nearly-unused SizeOfIptrData macro.
Past refactorings have removed all but one reference to SizeOfIptrData
(and that one place was in a pretty noncritical spot).  Since nobody's
complained, it seems probable that there are no supported compilers
that don't think sizeof(ItemPointerData) is 6.  If there are, we're
wasting MAXALIGN per heap tuple anyway, so it's rather silly to worry
about whether we can shave space in places like WAL records.

Pavan Deolasee

Discussion: <CABOikdOOawDda4hwLOT6zdA6MFfPLu3Z2YBZkX0JdayNS6JOeQ@mail.gmail.com>
2016-09-22 14:30:33 -04:00
Tom Lane 96dd77d349 Be sure to rewind the tuplestore read pointer in non-leader CTEScan nodes.
ExecInitCteScan supposed that it didn't have to do anything to the extra
tuplestore read pointer it gets from tuplestore_alloc_read_pointer.
However, it needs this read pointer to be positioned at the start of the
tuplestore, while tuplestore_alloc_read_pointer is actually defined as
cloning the current position of read pointer 0.  In normal situations
that accidentally works because we initialize the whole plan tree at once,
before anything gets read.  But it fails in an EvalPlanQual recheck, as
illustrated in bug #14328 from Dima Pavlov.  To fix, just forcibly rewind
the pointer after tuplestore_alloc_read_pointer.  The cost of doing so is
negligible unless the tuplestore is already in TSS_READFILE state, which
wouldn't happen in normal cases.  We could consider altering tuplestore's
API to make that case cheaper, but that would make for a more invasive
back-patch and it doesn't seem worth it.

This has been broken probably for as long as we've had CTEs, so back-patch
to all supported branches.

Discussion: <32468.1474548308@sss.pgh.pa.us>
2016-09-22 11:35:03 -04:00
Peter Eisentraut ebdf5bf7d1 Delay updating control file to "in production"
Move the updating of the control file to "in production" status until
the point where WAL writes are allowed.  Before, there could be a
significant gap between the control file update and write transactions
actually being allowed.  This makes it more reliable to use the control
status to verify the end of a promotion.

From: Michael Paquier <michael.paquier@gmail.com>
2016-09-21 12:00:00 -04:00
Peter Eisentraut c1dc51d484 pg_ctl: Detect current standby state from pg_control
pg_ctl used to determine whether a server was in standby mode by looking
for a recovery.conf file.  With this change, it instead looks into
pg_control, which is potentially more accurate.  There are also
occasional discussions about removing recovery.conf, so this removes one
dependency.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2016-09-21 12:00:00 -04:00
Robert Haas 470d886c32 Use PostmasterRandom(), not random(), for DSM control segment ID.
Otherwise, every startup gets the same "random" value, which is
definitely not what was intended.
2016-09-20 12:26:29 -04:00
Robert Haas 419113dfdc Retry DSM control segment creation if Windows indicates access denied.
Otherwise, attempts to run multiple postmasters running on the same
machine may fail, because Windows sometimes returns ERROR_ACCESS_DENIED
rather than ERROR_ALREADY_EXISTS when there is an existing segment.

Hitting this bug is much more likely because of another defect not
fixed by this patch, namely that dsm_postmaster_startup() uses
random() which returns the same value every time.  But that's not
a reason not to fix this.

Kyotaro Horiguchi and Amit Kapila, reviewed by Michael Paquier

Discussion: <CAA4eK1JyNdMeF-dgrpHozDecpDfsRZUtpCi+1AbtuEkfG3YooQ@mail.gmail.com>
2016-09-20 12:04:41 -04:00
Heikki Linnakangas 45310221a9 Fix outdated comments, GIST search queue is not an RBTree anymore.
The GiST search queue is implemented as a pairing heap rather than as
Red-Black Tree, since 9.5 (commit e7032610). I neglected these comments
in that commit.
2016-09-20 11:38:25 +03:00
Tom Lane d8c61c9765 Add debugging aid "bmsToString(Bitmapset *bms)".
This function has no direct callers at present, but it's convenient for
manual use in a debugger, rather than having to inspect memory and do
bit-counting in your head.

In passing, get rid of useless outBitmapset() wrapper around
_outBitmapset(); let's just export the function that does the work.
Likewise for outToken().

Ashutosh Bapat, tweaked a bit by me

Discussion: <CAFjFpRdiht8e1HTVirbubr4YzaON5iZTzFJjq909y4sU8M_6eA@mail.gmail.com>
2016-09-16 09:36:24 -04:00
Robert Haas 5225c66336 Clarify policy on marking inherited constraints as valid.
Amit Langote and Robert Haas
2016-09-15 17:24:54 -04:00
Heikki Linnakangas 5c6df67e0c Fix building with LibreSSL.
LibreSSL defines OPENSSL_VERSION_NUMBER to claim that it is version 2.0.0,
but it doesn't have the functions added in OpenSSL 1.1.0. Add autoconf
checks for the individual functions we need, and stop relying on
OPENSSL_VERSION_NUMBER.

Backport to 9.5 and 9.6, like the patch that broke this. In the
back-branches, there are still a few OPENSSL_VERSION_NUMBER checks left,
to check for OpenSSL 0.9.8 or 0.9.7. I left them as they were - LibreSSL
has all those functions, so they work as intended.

Per buildfarm member curculio.

Discussion: <2442.1473957669@sss.pgh.pa.us>
2016-09-15 22:52:51 +03:00
Robert Haas ffccee4736 Fix typo in comment.
Amit Langote
2016-09-15 11:35:57 -04:00
Tom Lane 5472ed3e9b Make min_parallel_relation_size's default value platform-independent.
The documentation states that the default value is 8MB, but this was
only true at BLCKSZ = 8kB, because the default was hard-coded as 1024.
Make the code match the docs by computing the default as 8MB/BLCKSZ.

Oversight in commit 75be66464, noted pursuant to a gripe from Peter E.

Discussion: <90634e20-097a-e4fd-67d5-fb2c42f0dd71@2ndquadrant.com>
2016-09-15 11:23:25 -04:00
Heikki Linnakangas 593d4e47db Support OpenSSL 1.1.0.
Changes needed to build at all:

- Check for SSL_new in configure, now that SSL_library_init is a macro.
- Do not access struct members directly. This includes some new code in
  pgcrypto, to use the resource owner mechanism to ensure that we don't
  leak OpenSSL handles, now that we can't embed them in other structs
  anymore.
- RAND_SSLeay() -> RAND_OpenSSL()

Changes that were needed to silence deprecation warnings, but were not
strictly necessary:

- RAND_pseudo_bytes() -> RAND_bytes().
- SSL_library_init() and OpenSSL_config() -> OPENSSL_init_ssl()
- ASN1_STRING_data() -> ASN1_STRING_get0_data()
- DH_generate_parameters() -> DH_generate_parameters()
- Locking callbacks are not needed with OpenSSL 1.1.0 anymore. (Good
  riddance!)

Also change references to SSLEAY_VERSION_NUMBER with OPENSSL_VERSION_NUMBER,
for the sake of consistency. OPENSSL_VERSION_NUMBER has existed since time
immemorial.

Fix SSL test suite to work with OpenSSL 1.1.0. CA certificates must have
the "CA:true" basic constraint extension now, or OpenSSL will refuse them.
Regenerate the test certificates with that. The "openssl" binary, used to
generate the certificates, is also now more picky, and throws an error
if an X509 extension is specified in "req_extensions", but that section
is empty.

Backpatch to all supported branches, per popular demand. In back-branches,
we still support OpenSSL 0.9.7 and above. OpenSSL 0.9.6 should still work
too, but I didn't test it. In master, we only support 0.9.8 and above.

Patch by Andreas Karlsson, with additional changes by me.

Discussion: <20160627151604.GD1051@msg.df7cb.de>
2016-09-15 14:42:29 +03:00
Heikki Linnakangas c99dd5bfed Fix and clarify comments on replacement selection.
These were modified by the patch to only use replacement selection for the
first run in an external sort.
2016-09-15 11:51:43 +03:00
Peter Eisentraut 656df624c0 Add overflow checks to money type input function
The money type input function did not have any overflow checks at all.
There were some regression tests that purported to check for overflow,
but they actually checked for the overflow behavior of the int8 type
before casting to money.  Remove those unnecessary checks and add some
that actually check the money input function.

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
2016-09-14 12:00:00 -05:00
Tom Lane 55c3391d1e Be pickier about converting between Name and Datum.
We were misapplying NameGetDatum() to plain C strings in some places.
This worked, because it was just a pointer cast anyway, but it's a type
cheat in some sense.  Use CStringGetDatum instead, and modify the
NameGetDatum macro so it won't compile if applied to something that's
not a pointer to NameData.  This should result in no changes to
generated code, but it is logically cleaner.

Mark Dilger, tweaked a bit by me

Discussion: <EFD8AC94-4C1F-40C1-A5EA-304080089C1B@gmail.com>
2016-09-13 17:17:48 -04:00
Tom Lane fdc79e1909 Fix executor/README to reflect disallowing SRFs in UPDATE.
The parenthetical comment here is obsoleted by commit a4c35ea1c.
Noted by Andres Freund.
2016-09-13 14:25:35 -04:00
Tom Lane a4c35ea1c2 Improve parser's and planner's handling of set-returning functions.
Teach the parser to reject misplaced set-returning functions during parse
analysis using p_expr_kind, in much the same way as we do for aggregates
and window functions (cf commit eaccfded9).  While this isn't complete
(it misses nesting-based restrictions), it's much better than the previous
error reporting for such cases, and it allows elimination of assorted
ad-hoc expression_returns_set() error checks.  We could add nesting checks
later if it seems important to catch all cases at parse time.

There is one case the parser will now throw error for although previous
versions allowed it, which is SRFs in the tlist of an UPDATE.  That never
behaved sensibly (since it's ill-defined which generated row should be
used to perform the update) and it's hard to see why it should not be
treated as an error.  It's a release-note-worthy change though.

Also, add a new Query field hasTargetSRFs reporting whether there are
any SRFs in the targetlist (including GROUP BY/ORDER BY expressions).
The parser can now set that basically for free during parse analysis,
and we can use it in a number of places to avoid expression_returns_set
searches.  (There will be more such checks soon.)  In some places, this
allows decontorting the logic since it's no longer expensive to check for
SRFs in the tlist --- so I made the checks parallel to the handling of
hasAggs/hasWindowFuncs wherever it seemed appropriate.

catversion bump because adding a Query field changes stored rules.

Andres Freund and Tom Lane

Discussion: <24639.1473782855@sss.pgh.pa.us>
2016-09-13 13:54:24 -04:00
Robert Haas 445a38aba2 Have heapam.h include lockdefs.h rather than lock.h.
lockdefs.h was only split from lock.h relatively recently, and
represents a minimal subset of the old lock.h.  heapam.h only needs
that smaller subset, so adjust it to include only that.  This requires
some corresponding adjustments elsewhere.

Peter Geoghegan
2016-09-13 09:21:35 -04:00
Simon Riggs 4068eb9918 Fix copy/pasto in file identification
Daniel Gustafsson
2016-09-12 09:01:58 +01:00
Simon Riggs fc3d4a44e9 Identify walsenders in pg_stat_activity
Following 8299471c37 walsender procs are now visible in pg_stat_activity.
Set query to ‘walsender’ for walsender procs to allow them to be identified.

Discussion:CAB7nPqS8c76KPSufK_HSDeYrbtg+zZ7D0EEkjeM6txSEuCB_jA@mail.gmail.com

Michael Paquier, issue raised by Fujii Masao, reviewed by Tom Lane
2016-09-12 08:57:14 +01:00
Simon Riggs c3c0d7bd70 Raise max setting of checkpoint_timeout to 1d
Previously checkpoint_timeout was capped at 3600s
New max setting is 86400s = 24h = 1d

Discussion: 32558.1454471895@sss.pgh.pa.us
2016-09-11 23:26:18 +01:00
Tom Lane 40b449ae84 Allow CREATE EXTENSION to follow extension update paths.
Previously, to update an extension you had to produce both a version-update
script and a new base installation script.  It's become more and more
obvious that that's tedious, duplicative, and error-prone.  This patch
attempts to improve matters by allowing the new base installation script
to be omitted.  CREATE EXTENSION will install a requested version if it
can find a base script and a chain of update scripts that will get there.
As in the existing update logic, shorter chains are preferred if there's
more than one possibility, with an arbitrary tie-break rule for chains
of equal length.

Also adjust the pg_available_extension_versions view to show such versions
as installable.

While at it, refactor the code so that CASCADE processing works for
extensions requested during ApplyExtensionUpdates().  Without this,
addition of a new requirement in an updated extension would require
creating a new base script, even if there was no other reason to do that.
(It would be easy at this point to add a CASCADE option to ALTER EXTENSION
UPDATE, to allow the same thing to happen during a manually-commanded
version update, but I have not done that here.)

Tom Lane, reviewed by Andres Freund

Discussion: <20160905005919.jz2m2yh3und2dsuy@alap3.anarazel.de>
2016-09-11 14:15:07 -04:00
Heikki Linnakangas 24598337c8 Implement binary heap replace-top operation in a smarter way.
In external sort's merge phase, we maintain a binary heap holding the next
tuple from each input tape. On each step, the topmost tuple is returned,
and replaced with the next tuple from the same tape. We were doing the
replacement by deleting the top node in one operation, and inserting the
next tuple after that. However, you can do a "replace-top" operation more
efficiently, in one "sift-up". A deletion will always walk the heap from
top to bottom, but in a replacement, we can stop as soon as we find the
right place for the new tuple. This is particularly helpful, if the tapes
are not in completely random order, so that the next tuple from a tape is
likely to land near the top of the heap.

Peter Geoghegan, reviewed by Claudio Freire, with some editing by me.

Discussion: <CAM3SWZRhBhiknTF_=NjDSnNZ11hx=U_SEYwbc5vd=x7M4mMiCw@mail.gmail.com>
2016-09-11 16:27:27 +03:00
Tom Lane ddc8893179 Fix miserable coding in pg_stat_get_activity().
Commit dd1a3bccc replaced a test on whether a subroutine returned a
null pointer with a test on whether &pointer->backendStatus was null.
This accidentally failed to fail, at least on common compilers, because
backendStatus is the first field in the struct; but it was surely trouble
waiting to happen.  Commit f91feba87 then messed things up further,
changing the logic to

	local_beentry = pgstat_fetch_stat_local_beentry(curr_backend);
	if (!local_beentry)
		continue;
	beentry = &local_beentry->backendStatus;
	if (!beentry)
	{

where the second "if" is now dead code, so that the intended behavior of
printing a row with "<backend information not available>" cannot occur.

I suspect this is all moot because pgstat_fetch_stat_local_beentry
will never actually return null in this function's usage, but it's still
very poor coding.  Repair back to 9.4 where the original problem was
introduced.
2016-09-10 13:49:04 -04:00
Tom Lane 24992c6db9 Rewrite PageIndexDeleteNoCompact into a form that only deletes 1 tuple.
The full generality of deleting an arbitrary number of tuples is no longer
needed, so let's save some code and cycles by replacing the original coding
with an implementation based on PageIndexTupleDelete.

We can always get back the old code from git if we need it again for new
callers (though I don't care for its willingness to mess with line pointers
it wasn't told to mess with).

Discussion: <552.1473445163@sss.pgh.pa.us>
2016-09-09 19:00:59 -04:00
Tom Lane 1a4be103a5 Convert PageAddItem into a macro to save a few cycles.
Nowadays this is just a backwards-compatibility wrapper around
PageAddItemExtended, so let's avoid the extra level of function call.
In addition, because pretty much all callers are passing constants
for the two bool arguments, compilers will be able to constant-fold
the conversion to a flags bitmask.

Discussion: <552.1473445163@sss.pgh.pa.us>
2016-09-09 18:17:07 -04:00
Tom Lane b1328d78f8 Invent PageIndexTupleOverwrite, and teach BRIN and GiST to use it.
PageIndexTupleOverwrite performs approximately the same function as
PageIndexTupleDelete (or PageIndexDeleteNoCompact) followed by PageAddItem
targeting the same item pointer offset.  But in the case where the new
tuple is the same size as the old, it avoids shuffling other data around on
the page, because the new tuple is placed where the old one was rather than
being appended to the end of the page.  This has been shown to provide a
substantial speedup for some GiST use-cases.

Also, this change allows some API simplifications: we can get rid of
the rather klugy and error-prone PAI_ALLOW_FAR_OFFSET flag for
PageAddItemExtended, since that was used only to cover a corner case
for BRIN that's better expressed by using PageIndexTupleOverwrite.

Note that this patch causes a rather subtle WAL incompatibility: the
physical page content change represented by certain WAL records is now
different than it was before, because while the tuples have the same
itempointer line numbers, the tuples themselves are in different places.
I have not bumped the WAL version number because I think it doesn't matter
unless you are trying to do bitwise comparisons of original and replayed
pages, and in any case we're early in a devel cycle and there will probably
be more WAL changes before v10 gets out the door.

There is probably room to make use of PageIndexTupleOverwrite in SP-GiST
and GIN too, but that is left for a future patch.

Andrey Borodin, reviewed by Anastasia Lubennikova, whacked around a bit
by me

Discussion: <CAJEAwVGQjGGOj6mMSgMwGvtFd5Kwe6VFAxY=uEPZWMDjzbn4VQ@mail.gmail.com>
2016-09-09 18:02:36 -04:00
Alvaro Herrera 5c609a742f Fix locking a tuple updated by an aborted (sub)transaction
When heap_lock_tuple decides to follow the update chain, it tried to
also lock any version of the tuple that was created by an update that
was subsequently rolled back.  This is pointless, since for all intents
and purposes that tuple exists no more; and moreover it causes
misbehavior, as reported independently by Marko Tiikkaja and Marti
Raudsepp: some SELECT FOR UPDATE/SHARE queries may fail to return
the tuples, and assertion-enabled builds crash.

Fix by having heap_lock_updated_tuple test the xmin and return success
immediately if the tuple was created by an aborted transaction.

The condition where tuples become invisible occurs when an updated tuple
chain is followed by heap_lock_updated_tuple, which reports the problem
as HeapTupleSelfUpdated to its caller heap_lock_tuple, which in turn
propagates that code outwards possibly leading the calling code
(ExecLockRows) to believe that the tuple exists no longer.

Backpatch to 9.3.  Only on 9.5 and newer this leads to a visible
failure, because of commit 27846f02c176; before that, heap_lock_tuple
skips the whole dance when the tuple is already locked by the same
transaction, because of the ancient HeapTupleSatisfiesUpdate behavior.
Still, the buggy condition may also exist in more convoluted scenarios
involving concurrent transactions, so it seems safer to fix the bug in
the old branches too.

Discussion:
	https://www.postgresql.org/message-id/CABRT9RC81YUf1=jsmWopcKJEro=VoeG2ou6sPwyOUTx_qteRsg@mail.gmail.com
	https://www.postgresql.org/message-id/48d3eade-98d3-8b9a-477e-1a8dc32a724d@joh.to
2016-09-09 15:54:29 -03:00
Tom Lane 984d0a14e8 In PageIndexTupleDelete, don't assume stored item lengths are MAXALIGNed.
PageAddItem stores the item length as-is.  It MAXALIGN's the amount of
space actually allocated for each tuple, but not the stored length.
PageRepairFragmentation, PageIndexMultiDelete, and PageIndexDeleteNoCompact
are all on board with this and MAXALIGN item lengths after fetching them.
But PageIndexTupleDelete expects the stored length to be a MAXALIGN
multiple already.  This accidentally works for existing index AMs because
they all maxalign their tuple sizes internally; but we don't do that for
heap tuples, and it shouldn't be a requirement for index tuples either.

So, sync PageIndexTupleDelete with the rest of bufpage.c by having it
maxalign the item size after fetching.

Also add a check that pd_special is maxaligned, to ensure that the test
"(offset + size) > phdr->pd_special" is still doing the right thing.
(If offset and pd_special are aligned, it doesn't matter whether size is.)
Again, this is in sync with the rest of the routines here, except for
PageAddItem which doesn't test because it doesn't actually do anything
for which pd_special alignment matters.

This shouldn't have any immediate functional impact; it just adds the
flexibility to use PageIndexTupleDelete on index tuples with non-aligned
lengths.

Discussion: <3814.1473366762@sss.pgh.pa.us>
2016-09-09 12:21:09 -04:00
Tom Lane 967a7b0fc9 Avoid reporting "cache lookup failed" for some user-reachable cases.
We have a not-terribly-thoroughly-enforced-yet project policy that internal
errors with SQLSTATE XX000 (ie, plain elog) should not be triggerable from
SQL.  record_in, domain_in, and PL validator functions all failed to meet
this standard, because they threw plain elog("cache lookup failed for XXX")
errors on bad OIDs, and those are all invokable from SQL.

For record_in, the best fix is to upgrade typcache.c (lookup_type_cache)
to throw a user-facing error for this case.  That seems consistent because
it was more than halfway there already, having user-facing errors for shell
types and non-composite types.  Having done that, tweak domain_in to rely
on the typcache to throw an appropriate error.  (This costs little because
InitDomainConstraintRef would fetch the typcache entry anyway.)

For the PL validator functions, we already have a single choke point at
CheckFunctionValidatorAccess, so just fix its error to be user-facing.

Dilip Kumar, reviewed by Haribabu Kommi

Discussion: <87wpxfygg9.fsf@credativ.de>
2016-09-09 09:20:34 -04:00
Simon Riggs ec253de1fd Fix corruption of 2PC recovery with subxacts
Reading 2PC state files during recovery was borked, causing corruptions during
recovery. Effect limited to servers with 2PC, subtransactions and
recovery/replication.

Stas Kelvich, reviewed by Michael Paquier and Pavan Deolasee
2016-09-09 11:55:12 +01:00
Andres Freund 45e191e3aa Improve scalability of md.c for large relations.
So far md.c used a linked list of segments. That proved to be a problem
when processing large relations, because every smgr.c/md.c level access
to a page incurred walking through a linked list of all preceding
segments. Thus making accessing pages O(#segments).

Replace the linked list of segments hanging off SMgrRelationData with an
array of opened segments. That allows O(1) access to individual
segments, if they've previously been opened.

Discussion: <20140331101001.GE13135@alap3.anarazel.de>
Reviewed-By: Peter Geoghegan, Tom Lane (in an older version)
2016-09-08 17:18:46 -07:00
Andres Freund 417fefaf08 Faster PageIsVerified() for the all zeroes case.
That's primarily useful for testing very large relations, using sparse
files.

Discussion: <20140331101001.GE13135@alap3.anarazel.de>
Reviewed-By: Peter Geoghegan
2016-09-08 17:02:43 -07:00
Andres Freund 769fd9d8e0 Fix mdtruncate() to close fd.c handle of deleted segments.
mdtruncate() forgot to FileClose() a segment's mdfd_vfd, when deleting
it. That lead to a fd.c handle to a truncated file being kept open until
backend exit.

The issue appears to have been introduced way back in 1a5c450f30,
before that the handle was closed inside FileUnlink().

The impact of this bug is limited - only VACUUM and ON COMMIT TRUNCATE
for temporary tables, truncate files in place (i.e. TRUNCATE itself is
not affected), and the relation has to be bigger than 1GB. The
consequences of a leaked fd.c handle aren't severe either.

Discussion: <20160908220748.oqh37ukwqqncbl3n@alap3.anarazel.de>
Backpatch: all supported releases
2016-09-08 16:51:09 -07:00
Simon Riggs 67c6bd1ca3 Fix minor memory leak in Standby startup
StandbyRecoverPreparedTransactions() leaked the buffer
used for two phase state file. This was leaked once
at startup and at every shutdown checkpoint seen.

Backpatch to 9.6

Stas Kelvich
2016-09-08 10:32:58 +01:00
Tom Lane 0ab9c56d0f Support renaming an existing value of an enum type.
Not much to be said about this patch: it does what it says on the tin.

In passing, rename AlterEnumStmt.skipIfExists to skipIfNewValExists
to clarify what it actually does.  In the discussion of this patch
we considered supporting other similar options, such as IF EXISTS
on the type as a whole or IF NOT EXISTS on the target name.  This
patch doesn't actually add any such feature, but it might happen later.

Dagfinn Ilmari Mannsåker, reviewed by Emre Hasegeli

Discussion: <CAO=2mx6uvgPaPDf-rHqG8=1MZnGyVDMQeh8zS4euRyyg4D35OQ@mail.gmail.com>
2016-09-07 16:11:56 -04:00
Tom Lane 4f405c8ef4 Add a HINT for client-vs-server COPY failure cases.
Users often get confused between COPY and \copy and try to use client-side
paths with COPY.  The server then cannot find the file (if remote), or sees
a permissions problem (if local), or some variant of that.  Emit a hint
about this in the most common cases.

In future we might want to expand the set of errnos for which the hint
gets printed, but be conservative for now.

Craig Ringer, reviewed by Christoph Berg and Tom Lane

Discussion: <CAMsr+YEqtD97qPEzQDqrCt5QiqPbWP_X4hmvy2pQzWC0GWiyPA@mail.gmail.com>
2016-09-06 23:55:55 -04:00
Peter Eisentraut 49eb0fd097 Add location field to DefElem
Add a location field to the DefElem struct, used to parse many utility
commands.  Update various error messages to supply error position
information.

To propogate the error position information in a more systematic way,
create a ParseState in standard_ProcessUtility() and pass that to
interested functions implementing the utility commands.  This seems
better than passing the query string and then reassembling a parse state
ad hoc, which violates the encapsulation of the ParseState type.

Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
2016-09-06 12:00:00 -04:00
Tom Lane f032722f86 Guard against possible memory allocation botch in batchmemtuples().
Negative availMemLessRefund would be problematic.  It's not entirely
clear whether the case can be hit in the code as it stands, but this
seems like good future-proofing in any case.  While we're at it,
insist that the value be not merely positive but not tiny, so as to
avoid doing a lot of repalloc work for little gain.

Peter Geoghegan

Discussion: <CAM3SWZRVkuUB68DbAkgw=532gW0f+fofKueAMsY7hVYi68MuYQ@mail.gmail.com>
2016-09-06 15:50:31 -04:00
Simon Riggs dcb12ce8d8 Fix VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL
lazy_truncate_heap() was waiting for
VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL, but in microseconds
not milliseconds as originally intended.

Found by code inspection.

Simon Riggs
2016-09-06 15:35:47 +01:00
Tom Lane 25794e841e Cosmetic code cleanup in commands/extension.c.
Some of the comments added by the CREATE EXTENSION CASCADE patch were
a bit sloppy, and I didn't care for redeclaring the same local variable
inside a nested block either.  No functional changes.
2016-09-05 18:53:33 -04:00
Tom Lane c54159d44c Make locale-dependent regex character classes work for large char codes.
Previously, we failed to recognize Unicode characters above U+7FF as
being members of locale-dependent character classes such as [[:alpha:]].
(Actually, the same problem occurs for large pg_wchar values in any
multibyte encoding, but UTF8 is the only case people have actually
complained about.)  It's impractical to get Spencer's original code to
handle character classes or ranges containing many thousands of characters,
because it insists on considering each member character individually at
regex compile time, whether or not the character will ever be of interest
at run time.  To fix, choose a cutoff point MAX_SIMPLE_CHR below which
we process characters individually as before, and deal with entire ranges
or classes as single entities above that.  We can actually make things
cheaper than before for chars below the cutoff, because the color map can
now be a simple linear array for those chars, rather than the multilevel
tree structure Spencer designed.  It's more expensive than before for
chars above the cutoff, because we must do a binary search in a list of
high chars and char ranges used in the regex pattern, plus call iswalpha()
and friends for each locale-dependent character class used in the pattern.
However, multibyte encodings are normally designed to give smaller codes
to popular characters, so that we can expect that the slow path will be
taken relatively infrequently.  In any case, the speed penalty appears
minor except when we have to apply iswalpha() etc. to high character codes
at runtime --- and the previous coding gave wrong answers for those cases,
so whether it was faster is moot.

Tom Lane, reviewed by Heikki Linnakangas

Discussion: <15563.1471913698@sss.pgh.pa.us>
2016-09-05 17:06:29 -04:00
Tom Lane 15bc038f9b Relax transactional restrictions on ALTER TYPE ... ADD VALUE.
To prevent possibly breaking indexes on enum columns, we must keep
uncommitted enum values from getting stored in tables, unless we
can be sure that any such column is new in the current transaction.

Formerly, we enforced this by disallowing ALTER TYPE ... ADD VALUE
from being executed at all in a transaction block, unless the target
enum type had been created in the current transaction.  This patch
removes that restriction, and instead insists that an uncommitted enum
value can't be referenced unless it belongs to an enum type created
in the same transaction as the value.  Per discussion, this should be
a bit less onerous.  It does require each function that could possibly
return a new enum value to SQL operations to check this restriction,
but there aren't so many of those that this seems unmaintainable.

Andrew Dunstan and Tom Lane

Discussion: <4075.1459088427@sss.pgh.pa.us>
2016-09-05 12:59:55 -04:00
Simon Riggs 016abf1fb8 Add debug check function LWLockHeldByMeInMode()
Tests whether my process holds a lock in given mode.
Add initial usage in MarkBufferDirty().

Thomas Munro
2016-09-05 10:38:08 +01:00
Simon Riggs d851bef2d6 Dirty replication slots when using sql interface
When pg_logical_slot_get_changes(...) sets confirmed_flush_lsn to the point at
which replay stopped, it doesn't dirty the replication slot.  So if the replay
didn't cause restart_lsn or catalog_xmin to change as well, this change will
not get written out to disk. Even on a clean shutdown.

If Pg crashes or restarts, a subsequent pg_logical_slot_get_changes(...) call
will see the same changes already replayed since it uses the slot's
confirmed_flush_lsn as the start point for fetching changes. The caller can't
specify a start LSN when using the SQL interface.

Mark the slot as dirty after reading changes using the SQL interface so that
users won't see repeated changes after a clean shutdown. Repeated changes still
occur when using the walsender interface or after an unclean shutdown.

Craig Ringer
2016-09-05 09:44:38 +01:00
Tom Lane b6182081be Remove duplicate code from ReorderBufferCleanupTXN().
Andres is apparently the only hacker who thinks this code is better as-is.
I (tgl) follow some of his logic, but the fact that it's setting off
warnings from static code analyzers seems like a sufficient reason to
put the complexity into a comment rather than the code.

Aleksander Alekseev

Discussion: <20160404190345.54d84ee8@fujitsu>
2016-09-04 20:49:44 -04:00
Heikki Linnakangas e21db14b8a Clarify the new Red-Black post-order traversal code a bit.
Coverity complained about the for(;;) loop, because it never actually
iterated. It was used just to be able to use "break" to exit it early. I
agree with Coverity, that's a bit confusing, so refactor the code to
use if-else instead.

While we're at it, use a local variable to hold the "current" node. That's
shorter and clearer than referring to "iter->last_visited" all the time.
2016-09-04 15:02:06 +03:00
Tom Lane 600dc4c0da Fix multiple bugs in numeric_poly_deserialize().
These were evidently introduced by yesterday's commit 9cca11c91,
which perhaps needs more review than it got.

Per report from Andreas Seltenreich and additional examination
of nearby code.

Report: <87oa45qfwq.fsf@credativ.de>
2016-09-03 14:18:55 -04:00
Tom Lane 60893786d5 Fix corrupt GIN_SEGMENT_ADDITEMS WAL records on big-endian hardware.
computeLeafRecompressWALData() tried to produce a uint16 WAL log field by
memcpy'ing the first two bytes of an int-sized variable.  That accidentally
works on little-endian hardware, but not at all on big-endian.  Replay then
thinks it's looking at an ADDITEMS action with zero entries, and reads the
first two bytes of the first TID therein as the next segno/action,
typically leading to "unexpected GIN leaf action" errors during replay.
Even if replay failed to crash, the resulting GIN index page would surely
be incorrect.  To fix, just declare the variable as uint16 instead.

Per bug #14295 from Spencer Thomason (much thanks to Spencer for turning
his problem into a self-contained test case).  This likely also explains
a previous report of the same symptom from Bernd Helmle.

Back-patch to 9.4 where the problem was introduced (by commit 14d02f0bb).

Discussion: <20160826072658.15676.7628@wrigleys.postgresql.org>
Possible-Report: <2DA7350F7296B2A142272901@eje.land.credativ.lan>
2016-09-03 13:28:53 -04:00
Simon Riggs 35250b6ad7 New recovery target recovery_target_lsn
Michael Paquier
2016-09-03 17:48:01 +01:00
Tom Lane 39b691f251 Don't require dynamic timezone abbreviations to match underlying time zone.
Previously, we threw an error if a dynamic timezone abbreviation did not
match any abbreviation recorded in the referenced IANA time zone entry.
That seemed like a good consistency check at the time, but it turns out
that a number of the abbreviations in the IANA database are things that
Olson and crew made up out of whole cloth.  Their current policy is to
remove such names in favor of using simple numeric offsets.  Perhaps
unsurprisingly, a lot of these made-up abbreviations have varied in meaning
over time, which meant that our commit b2cbced9e and later changes made
them into dynamic abbreviations.  So with newer IANA database versions
that don't mention these abbreviations at all, we fail, as reported in bug
#14307 from Neil Anderson.  It's worse than just a few unused-in-the-wild
abbreviations not working, because the pg_timezone_abbrevs view stops
working altogether (since its underlying function tries to compute the
whole view result in one call).

We considered deleting these abbreviations from our abbreviations list, but
the problem with that is that we can't stay ahead of possible future IANA
changes.  Instead, let's leave the abbreviations list alone, and treat any
"orphaned" dynamic abbreviation as just meaning the referenced time zone.
It will behave a bit differently than it used to, in that you can't any
longer override the zone's standard vs. daylight rule by using the "wrong"
abbreviation of a pair, but that's better than failing entirely.  (Also,
this solution can be interpreted as adding a small new feature, which is
that any abbreviation a user wants can be defined as referencing a time
zone name.)

Back-patch to all supported branches, since this problem affects all
of them when using tzdata 2016f or newer.

Report: <20160902031551.15674.67337@wrigleys.postgresql.org>
Discussion: <6189.1472820913@sss.pgh.pa.us>
2016-09-02 17:30:02 -04:00
Heikki Linnakangas ec136d19b2 Move code shared between libpq and backend from backend/libpq/ to common/.
When building libpq, ip.c and md5.c were symlinked or copied from
src/backend/libpq into src/interfaces/libpq, but now that we have a
directory specifically for routines that are shared between the server and
client binaries, src/common/, move them there.

Some routines in ip.c were only used in the backend. Keep those in
src/backend/libpq, but rename to ifaddr.c to avoid confusion with the file
that's now in common.

Fix the comment in src/common/Makefile to reflect how libpq actually links
those files.

There are two more files that libpq symlinks directly from src/backend:
encnames.c and wchar.c. I don't feel compelled to move those right now,
though.

Patch by Michael Paquier, with some changes by me.

Discussion: <69938195-9c76-8523-0af8-eb718ea5b36e@iki.fi>
2016-09-02 13:49:59 +03:00
Heikki Linnakangas 9cca11c915 Speed up SUM calculation in numeric aggregates.
This introduces a numeric sum accumulator, which performs better than
repeatedly calling add_var(). The performance comes from using wider digits
and delaying carry propagation, tallying positive and negative values
separately, and avoiding a round of palloc/pfree on every value. This
speeds up SUM(), as well as other standard aggregates like AVG() and
STDDEV() that also calculate a sum internally.

Reviewed-by: Andrey Borodin
Discussion: <c0545351-a467-5b76-6d46-4840d1ea8aa4@iki.fi>
2016-09-02 11:51:49 +03:00
Heikki Linnakangas 9f85784cae Support multiple iterators in the Red-Black Tree implementation.
While we don't need multiple iterators at the moment, the interface is
nicer and less dangerous this way.

Aleksander Alekseev, with some changes by me.
2016-09-02 08:39:39 +03:00
Tom Lane 6c03d981a6 Change API of ShmemAlloc() so it throws error rather than returning NULL.
A majority of callers seem to have believed that this was the API spec
already, because they omitted any check for a NULL result, and hence
would crash on an out-of-shared-memory failure.  The original proposal
was to just add such error checks everywhere, but that does nothing to
prevent similar omissions in future.  Instead, let's make ShmemAlloc()
throw the error (so we can remove the caller-side checks that do exist),
and introduce a new function ShmemAllocNoError() that has the previous
behavior of returning NULL, for the small number of callers that need
that and are prepared to do the right thing.  This also lets us remove
the rather wishy-washy behavior of printing a WARNING for out-of-shmem,
which never made much sense: either the caller has a strategy for
dealing with that, or it doesn't.  It's not ShmemAlloc's business to
decide whether a warning is appropriate.

The v10 release notes will need to call this out as a significant
source-code change.  It's likely that it will be a bug fix for
extension callers too, but if not, they'll need to change to using
ShmemAllocNoError().

This is nominally a bug fix, but the odds that it's fixing any live
bug are actually rather small, because in general the requests
being made by the unchecked callers were already accounted for in
determining the overall shmem size, so really they ought not fail.
Between that and the possible impact on extensions, no back-patch.

Discussion: <24843.1472563085@sss.pgh.pa.us>
2016-09-01 10:13:55 -04:00
Tom Lane 65a588b4c3 Try to fix portability issue in enum renumbering (again).
The hack embodied in commit 4ba61a487 no longer works after today's change
to allow DatumGetFloat4/Float4GetDatum to be inlined (commit 14cca1bf8).
Probably what's happening is that the faulty compilers are deciding that
the now-inlined assignment is a no-op and so they're not required to
round to float4 width.

We had a bunch of similar issues earlier this year in the degree-based
trig functions, and eventually settled on using volatile intermediate
variables as the least ugly method of forcing recalcitrant compilers
to do what the C standard says (cf commit 82311bcdd).  Let's see if
that method works here.

Discussion: <4640.1472664476@sss.pgh.pa.us>
2016-08-31 13:58:01 -04:00
Tom Lane 679226337a Remove no-longer-useful SSL-specific Port.count field.
Since we removed SSL renegotiation, there's no longer any reason to
keep track of the amount of data transferred over the link.

Daniel Gustafsson

Discussion: <FEA7F89C-ECDF-4799-B789-2F8DDCBA467F@yesql.se>
2016-08-31 09:24:19 -04:00
Heikki Linnakangas 14cca1bf8e Use static inline functions for float <-> Datum conversions.
Now that we are OK with using static inline functions, we can use them
to avoid function call overhead of pass-by-val versions of Float4GetDatum,
DatumGetFloat8, and Float8GetDatum. Those functions are only a few CPU
instructions long, but they could not be written into macros previously,
because we need a local union variable for the conversion.

I kept the pass-by-ref versions as regular functions. They are very simple
too, but they call palloc() anyway, so shaving a few instructions from the
function call doesn't seem so important there.

Discussion: <dbb82a4a-2c15-ba27-dd0a-009d2aa72b77@iki.fi>
2016-08-31 16:00:28 +03:00
Tom Lane 0e0f43d6fd Prevent starting a standalone backend with standby_mode on.
This can't really work because standby_mode expects there to be more
WAL arriving, which there will not ever be because there's no WAL
receiver process to fetch it.  Moreover, if standby_mode is on then
hot standby might also be turned on, causing even more strangeness
because that expects read-only sessions to be executing in parallel.
Bernd Helmle reported a case where btree_xlog_delete_get_latestRemovedXid
got confused, but rather than band-aiding individual problems it seems
best to prevent getting anywhere near this state in the first place.
Back-patch to all supported branches.

In passing, also fix some omissions of errcodes in other ereport's in
readRecoveryCommandFile().

Michael Paquier (errcode hacking by me)

Discussion: <00F0B2CEF6D0CEF8A90119D4@eje.credativ.lan>
2016-08-31 08:52:13 -04:00
Tom Lane 052cc223d5 Fix a bunch of places that called malloc and friends with no NULL check.
Where possible, use palloc or pg_malloc instead; otherwise, insert
explicit NULL checks.

Generally speaking, these are places where an actual OOM is quite
unlikely, either because they're in client programs that don't
allocate all that much, or they're very early in process startup
so that we'd likely have had a fork() failure instead.  Hence,
no back-patch, even though this is nominally a bug fix.

Michael Paquier, with some adjustments by me

Discussion: <CAB7nPqRu07Ot6iht9i9KRfYLpDaF2ZuUv5y_+72uP23ZAGysRg@mail.gmail.com>
2016-08-30 18:22:43 -04:00
Alvaro Herrera 8e1e3f958f Split hash.h → hash_xlog.h
Since the hash AM is going to be revamped to have WAL, this is a good
opportunity to clean up the include file a little bit to avoid including
a lot of extra stuff in the future.

Author: Amit Kapila
2016-08-29 18:55:49 -03:00
Heikki Linnakangas 9b7cd59af1 Remove support for OpenSSL versions older than 0.9.8.
OpenSSL officially only supports 1.0.1 and newer. Some OS distributions
still provide patches for 0.9.8, but anything older than that is not
interesting anymore. Let's simplify things by removing compatibility code.

Andreas Karlsson, with small changes by me.
2016-08-29 20:16:02 +03:00
Tom Lane cf34fdbbe1 Make AllocSetContextCreate throw an error for bad context-size parameters.
The previous behavior was to silently change them to something valid.
That obscured the bugs fixed in commit ea268cdc9, and generally seems
less useful than complaining.  Unlike the previous commit, though,
we'll do this in HEAD only --- it's a bit too late to be possibly
breaking third-party code in 9.6.

Discussion: <CA+TgmobNcELVd3QmLD3tx=w7+CokRQiC4_U0txjz=WHpfdkU=w@mail.gmail.com>
2016-08-29 09:29:26 -04:00
Fujii Masao bd082231ed Fix typos in comments. 2016-08-29 16:06:40 +09:00
Fujii Masao bab7823a49 Fix pg_xlogdump so that it handles cross-page XLP_FIRST_IS_CONTRECORD record.
Previously pg_xlogdump failed to dump the contents of the WAL file
if the file starts with the continuation WAL record which spans
more than one pages. Since pg_xlogdump assumed that the continuation
record always fits on a page, it could not find the valid WAL record to
start reading from in that case.

This patch changes pg_xlogdump so that it can handle a continuation
WAL record which crosses a page boundary and find the valid record
to start reading from.

Back-patch to 9.3 where pg_xlogdump was introduced.

Author: Pavan Deolasee
Reviewed-By: Michael Paquier and Craig Ringer
Discussion: CABOikdPsPByMiG6J01DKq6om2+BNkxHTPkOyqHM2a4oYwGKsqQ@mail.gmail.com
2016-08-29 14:34:58 +09:00
Tom Lane ea268cdc9a Add macros to make AllocSetContextCreate() calls simpler and safer.
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls
had typos in the context-sizing parameters.  While none of these led to
especially significant problems, they did create minor inefficiencies,
and it's now clear that expecting people to copy-and-paste those calls
accurately is not a great idea.  Let's reduce the risk of future errors
by introducing single macros that encapsulate the common use-cases.
Three such macros are enough to cover all but two special-purpose contexts;
those two calls can be left as-is, I think.

While this patch doesn't in itself improve matters for third-party
extensions, it doesn't break anything for them either, and they can
gradually adopt the simplified notation over time.

In passing, change TopMemoryContext to use the default allocation
parameters.  Formerly it could only be extended 8K at a time.  That was
probably reasonable when this code was written; but nowadays we create
many more contexts than we did then, so that it's not unusual to have a
couple hundred K in TopMemoryContext, even without considering various
dubious code that sticks other things there.  There seems no good reason
not to let it use growing blocks like most other contexts.

Back-patch to 9.6, mostly because that's still close enough to HEAD that
it's easy to do so, and keeping the branches in sync can be expected to
avoid some future back-patching pain.  The bugs fixed by these changes
don't seem to be significant enough to justify fixing them further back.

Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-08-27 17:50:38 -04:00
Tom Lane 26fa446da6 Add a nonlocalized version of the severity field to client error messages.
This has been requested a few times, but the use-case for it was never
entirely clear.  The reason for adding it now is that transmission of
error reports from parallel workers fails when NLS is active, because
pq_parse_errornotice() wrongly assumes that the existing severity field
is nonlocalized.  There are other ways we could have fixed that, but the
other options were basically kluges, whereas this way provides something
that's at least arguably a useful feature along with the bug fix.

Per report from Jakob Egger.  Back-patch into 9.6, because otherwise
parallel query is essentially unusable in non-English locales.  The
problem exists in 9.5 as well, but we don't want to risk changing
on-the-wire behavior in 9.5 (even though the possibility of new error
fields is specifically called out in the protocol document).  It may
be sufficient to leave the issue unfixed in 9.5, given the very limited
usefulness of pq_parse_errornotice in that version.

Discussion: <A88E0006-13CB-49C6-95CC-1A77D717213C@eggerapps.at>
2016-08-26 16:20:17 -04:00
Tom Lane 78dcd027e8 Fix potential memory leakage from HandleParallelMessages().
HandleParallelMessages leaked memory into the caller's context.  Since it's
called from ProcessInterrupts, there is basically zero certainty as to what
CurrentMemoryContext is, which means we could be leaking into long-lived
contexts.  Over the processing of many worker messages that would grow to
be a problem.  Things could be even worse than just a leak, if we happened
to service the interrupt while ErrorContext is current: elog.c thinks it
can reset that on its own whim, possibly yanking storage out from under
HandleParallelMessages.

Give HandleParallelMessages its own dedicated context instead, which we can
reset during each call to ensure there's no accumulation of wasted memory.

Discussion: <16610.1472222135@sss.pgh.pa.us>
2016-08-26 15:04:05 -04:00
Tom Lane 45a36e6853 Put static forward declarations in elog.c back into same order as code.
The guiding principle for the last few patches in this area apparently
involved throwing darts.

Cosmetic only, but back-patch to 9.6 because there is no reason for
9.6 and HEAD to diverge yet in this file.
2016-08-26 14:19:03 -04:00
Tom Lane 8529036b53 Fix assorted small bugs in ThrowErrorData().
Copy the palloc'd strings into the correct context, ie ErrorContext
not wherever the source ErrorData is.  This would be a large bug,
except that it appears that all catchers of thrown errors do either
EmitErrorReport or CopyErrorData before doing anything that would
cause transient memory contexts to be cleaned up.  Still, it's wrong
and it will bite somebody someday.

Fix failure to copy cursorpos and internalpos.

Utter the appropriate incantations involving recursion_depth, so that
we'll behave sanely if we get an error inside pstrdup.  (In general,
the body of this function ought to act like, eg, errdetail().)

Per code reading induced by Jakob Egger's report.
2016-08-26 14:15:47 -04:00
Tom Lane fbf28b6b52 Fix logic for adding "parallel worker" context line to worker errors.
The previous coding here was capable of adding a "parallel worker" context
line to errors that were not, in fact, returned from a parallel worker.
Instead of using an errcontext callback to add that annotation, just paste
it onto the message by hand; this looks uglier but is more reliable.

Discussion: <19757.1472151987@sss.pgh.pa.us>
2016-08-26 10:07:28 -04:00
Tom Lane ae4760d667 Fix small query-lifespan memory leak in bulk updates.
When there is an identifiable REPLICA IDENTITY index on the target table,
heap_update leaks the id_attrs bitmapset.  That's not many bytes, but it
adds up over enough rows, since the code typically runs in a query-lifespan
context.  Bug introduced in commit e55704d8b, which did a rather poor job
of cloning the existing use-pattern for RelationGetIndexAttrBitmap().

Per bug #14293 from Zhou Digoal.  Back-patch to 9.4 where the bug was
introduced.

Report: <20160824114320.15676.45171@wrigleys.postgresql.org>
2016-08-24 22:20:25 -04:00
Tom Lane 2c00fad286 Fix improper repetition of previous results from a hashed aggregate.
ExecReScanAgg's check for whether it could re-use a previously calculated
hashtable neglected the possibility that the Agg node might reference
PARAM_EXEC Params that are not referenced by its input plan node.  That's
okay if the Params are in upper tlist or qual expressions; but if one
appears in aggregate input expressions, then the hashtable contents need
to be recomputed when the Param's value changes.

To avoid unnecessary performance degradation in the case of a Param that
isn't within an aggregate input, add logic to the planner to determine
which Params are within aggregate inputs.  This requires a new field in
struct Agg, but fortunately we never write plans to disk, so this isn't
an initdb-forcing change.

Per report from Jeevan Chalke.  This has been broken since forever,
so back-patch to all supported branches.

Andrew Gierth, with minor adjustments by me

Report: <CAM2+6=VY8ykfLT5Q8vb9B6EbeBk-NGuLbT6seaQ+Fq4zXvrDcA@mail.gmail.com>
2016-08-24 14:38:12 -04:00
Tom Lane 71e006f031 Suppress compiler warnings in non-cassert builds.
With Asserts off, these variables are set but never used, resulting
in warnings from pickier compilers.  Fix that with our standard solution.
Per report from Jeff Janes.
2016-08-23 23:21:10 -04:00
Tom Lane 32909a57f9 Fix network_spgist.c build failures from missing AF_INET definition.
AF_INET is apparently defined in something that's pulled in automatically
on Linux, but the buildfarm says that's not true everywhere.  Comparing
to network_gist.c suggests that including <sys/socket.h> ought to fix it,
and the POSIX standard concurs.
2016-08-23 16:25:35 -04:00
Tom Lane 77e2906821 Create an SP-GiST opclass for inet/cidr.
This seems to offer significantly better search performance than the
existing GiST opclass for inet/cidr, at least on data with a wide mix
of network mask lengths.  (That may suggest that the data splitting
heuristics in the GiST opclass could be improved.)

Emre Hasegeli, with mostly-cosmetic adjustments by me

Discussion: <CAE2gYzxtth9qatW_OAqdOjykS0bxq7AYHLuyAQLPgT7H9ZU0Cw@mail.gmail.com>
2016-08-23 15:16:30 -04:00
Robert Haas 0fda682e54 Extend dsm API with a new function dsm_unpin_segment.
If you have previously pinned a segment and decide that you don't
actually want to keep it around until shutdown, this new API lets you
remove the pin.  This is pretty trivial except on Windows, where it
requires closing the duplicate handle that was used to implement the
pin.

Thomas Munro and Amit Kapila, reviewed by Amit Kapila and by me.
2016-08-23 14:32:23 -04:00
Robert Haas 19998730ae Remove duplicate function prototype.
Kyotaro Horiguchi
2016-08-23 13:44:18 -04:00
Tom Lane d2ddee63b4 Improve SP-GiST opclass API to better support unlabeled nodes.
Previously, the spgSplitTuple action could only create a new upper tuple
containing a single labeled node.  This made it useless for opclasses
that prefer to work with fixed sets of nodes (labeled or otherwise),
which meant that restrictive prefixes could not be used with such
node definitions.  Change the output field set for the choose() method
to allow it to specify any valid node set for the new upper tuple,
and to specify which of these nodes to place the modified lower tuple in.

In addition to its primary use for fixed node sets, this feature could
allow existing opclasses that use variable node sets to skip a separate
spgAddNode action when splitting a tuple, by setting up the node needed
for the incoming value as part of the spgSplitTuple action.  However, care
would have to be taken to add the extra node only when it would not make
the tuple bigger than before.  (spgAddNode can enlarge the tuple,
spgSplitTuple can't.)

This is a prerequisite for an upcoming SP-GiST inet opclass, but is
being committed separately to increase the visibility of the API change.

In passing, improve the documentation about the traverse-values feature
that was added by commit ccd6eb49a.

Emre Hasegeli, with cosmetic adjustments and documentation rework by me

Discussion: <CAE2gYzxtth9qatW_OAqdOjykS0bxq7AYHLuyAQLPgT7H9ZU0Cw@mail.gmail.com>
2016-08-23 12:10:34 -04:00
Robert Haas 86f31695f3 Add txid_current_ifassigned().
Add a variant of txid_current() that returns NULL if no transaction ID
is assigned.  This version can be used even on a standby server,
although it will always return NULL since no transaction IDs can be
assigned during recovery.

Craig Ringer, per suggestion from Jim Nasby.  Reviewed by Petr Jelinek
and by me.
2016-08-23 10:30:52 -04:00
Robert Haas ff36700c3b Remove duplicate word from comment.
Erik Rijkers
2016-08-23 10:05:13 -04:00
Tom Lane 7b405b3e04 Refactor some network.c code to create cidr_set_masklen_internal().
Merge several copies of "copy an inet value and adjust the mask length"
code to create a single, conveniently C-callable function.  This function
is exported for future use by inet SPGiST support, but it's good cleanup
anyway since we had three slightly-different-for-no-good-reason copies.

(Extracted from a larger patch, to separate new code from refactoring
of old code)

Emre Hasegeli
2016-08-23 09:39:54 -04:00
Robert Haas 008c4135cc Fix possible sorting error when aborting use of abbreviated keys.
Due to an error in the abbreviated key abort logic, the most recently
processed SortTuple could be incorrectly marked NULL, resulting in an
incorrect final sort order.

In the worst case, this could result in a corrupt btree index, which
would need to be rebuild using REINDEX.  However, abbrevation doesn't
abort very often, not all data types use it, and only one tuple would
end up in the wrong place, so the practical impact of this mistake may
be somewhat limited.

Report and patch by Peter Geoghegan.
2016-08-22 15:22:11 -04:00
Robert Haas af5743851d Improve header comment for LockHasWaitersRelation.
Dimitry Ivanov spotted a typo, and I added a bit of wordsmithing.
2016-08-22 11:53:20 -04:00
Tom Lane 8299471c37 Use LEFT JOINs in some system views in case referenced row doesn't exist.
In particular, left join to pg_authid so that rows in pg_stat_activity
don't disappear if the session's owning user has been dropped.
Also convert a few joins to pg_database to left joins, in the same spirit,
though that case might be harder to hit.  We were doing this in other
views already, so it was a bit inconsistent that these views didn't.

Oskari Saarenmaa, with some further tweaking by me

Discussion: <56E87CD8.60007@ohmu.fi>
2016-08-19 17:13:47 -04:00
Tom Lane 65a603e903 Guard against parallel-restricted functions in VALUES expressions.
Obvious brain fade in set_rel_consider_parallel().  Noticed it while
adjusting the adjacent RTE_FUNCTION case.

In 9.6, also make the code look more like what I just did in HEAD
by removing the unnecessary function_rte_parallel_ok subroutine
(it does nothing that expression_tree_walker wouldn't do).
2016-08-19 14:35:32 -04:00
Tom Lane da1c91631e Speed up planner's scanning for parallel-query hazards.
We need to scan the whole parse tree for parallel-unsafe functions.
If there are none, we'll later need to determine whether particular
subtrees contain any parallel-restricted functions.  The previous coding
retained no knowledge from the first scan, even though this is very
wasteful in the common case where the query contains only parallel-safe
functions.  We can bypass all of the later scans by remembering that fact.
This provides a small but measurable speed improvement when the case
applies, and shouldn't cost anything when it doesn't.

Patch by me, reviewed by Robert Haas

Discussion: <3740.1471538387@sss.pgh.pa.us>
2016-08-19 14:03:13 -04:00
Alvaro Herrera 6f79ae7fe5 reorderbuffer: preserve errno while reporting error
Clobbering errno during cleanup after an error is an oft-repeated, easy
to make mistake.  Deal with it here as everywhere else, by saving it
aside and restoring after cleanup, before ereport'ing.

In passing, add a missing errcode declaration in another ereport() call
in the same file, which I noticed while skimming the file looking for
similar problems.

Backpatch to 9.4, where this code was introduced.
2016-08-19 14:38:55 -03:00
Tom Lane a859e64003 Clean up another pre-ANSI-C-ism in regex code: get rid of pcolor typedef.
pcolor was used to represent function arguments that are nominally of
type color, but when using a pre-ANSI C compiler would be passed as the
promoted integer type.  We really don't need that anymore.
2016-08-19 13:31:10 -04:00
Tom Lane 6eefd2422e Remove typedef celt from the regex library, along with macro NOCELT.
The regex library used to have a notion of a "collating element" that was
distinct from a "character", but Henry Spencer never actually implemented
his planned support for multi-character collating elements, and the Tcl
crew ripped out most of the stubs for that years ago.  The only thing left
that distinguished the "celt" typedef from the "chr" typedef was that
"celt" was supposed to also be able to hold the not-a-character "NOCELT"
value.  However, NOCELT was not used anywhere after the MCCE stub removal
changes, which means there's no need for celt to be different from chr.
Removing the separate typedef simplifies matters and also removes a trap
for the unwary, in that celt is signed while chr may not be, so comparisons
could mean different things.  There's no bug there today because we
restrict CHR_MAX to be less than INT_MAX, but I think there may have been
such bugs before we did that, and there could be again if anyone ever
decides to fool with the range of chr.

This patch also removes assorted unnecessary casts to "chr" of values
that are already chrs.  Many of these seem to be leftover from days when
the code was compatible with pre-ANSI C.
2016-08-19 12:51:02 -04:00
Peter Eisentraut 1d2e73a3df Remove obsolete replacement system() on darwin
Per comment in the file, this was fixed around OS X 10.2.
2016-08-18 12:00:00 -04:00
Heikki Linnakangas fa878703f4 Refactor RandomSalt to handle salts of different lengths.
All we need is 4 bytes at the moment, for MD5 authentication. But in
upcomint patches for SCRAM authentication, SCRAM will need a salt of
different length. It's less scary for the caller to pass the buffer
length anyway, than assume a certain-sized output buffer.

Author: Michael Paquier
Discussion: <CAB7nPqQvO4sxLFeS9D+NM3wpy08ieZdAj_6e117MQHZAfxBFsg@mail.gmail.com>
2016-08-18 13:41:17 +03:00