Commit Graph

17562 Commits

Author SHA1 Message Date
Tom Lane 4d85c2900b Improve comments in vacuum_rel() and analyze_rel().
Remove obsolete references to get_rel_oids().  Avoid listing specific
relkinds in the comments, since we seem unable to keep such things
in sync with the code, and it's not all that helpful anyhow.

Noted by Michael Paquier, though I rewrote the comments a bit more.

Discussion: https://postgr.es/m/CAB7nPqTWiN9zwKTaOrsnKiGDChqRt7C1+CiiDk4N4OMn92rs6A@mail.gmail.com
2017-10-05 10:47:47 -04:00
Robert Haas 4b2ba1fe02 Fix typo.
Etsuro Fujita

Discussion: http://postgr.es/m/1b2e9ac7-b99a-2769-5e42-afdf62bfa7fa@lab.ntt.co.jp
2017-10-05 08:45:24 -04:00
Robert Haas c097b271e8 Fix more user-visible elog() calls.
Michael Paquier discovered that this could be triggered via SQL;
give a nicer message instead.

Patch by Michael Paquier, reviewed by Masahiko Sawada.

Discussion: http://postgr.es/m/CAB7nPqQtPg+LKKtzdKN26judHcvPZ0s1gNigzOT4j8CYuuuBYg@mail.gmail.com
2017-10-05 07:58:02 -04:00
Peter Eisentraut 036166f26e Document and use SPI_result_code_string()
A lot of semi-internal code just prints out numeric SPI error codes,
which is not very helpful.  We already have an API function to convert
the codes to a string, so let's make more use of that.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2017-10-04 22:14:21 -04:00
Peter Eisentraut 582bbcf37f Move SPI error reporting out of ri_ReportViolation()
These are two completely unrelated code paths, so it doesn't make sense
to pack them into one function.

Add attribute noreturn to ri_ReportViolation().

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2017-10-04 22:14:21 -04:00
Andres Freund 212e6f34d5 Replace binary search in fmgr_isbuiltin with a lookup array.
Turns out we have enough functions that the binary search is quite
noticeable in profiles.

Thus have Gen_fmgrtab.pl build a new mapping from a builtin function's
oid to an index in the existing fmgr_builtins array. That keeps the
additional memory usage at a reasonable amount.

Author: Andres Freund, with input from Tom Lane
Discussion: https://postgr.es/m/20170914065128.a5sk7z4xde5uy3ei@alap3.anarazel.de
2017-10-04 00:22:38 -07:00
Andres Freund 18f791ab2b Move genbki.pl's find_defined_symbol to Catalog.pm.
Will be used in Gen_fmgrtab.pl in a followup commit.
2017-10-04 00:11:36 -07:00
Tom Lane 11d8d72c27 Allow multiple tables to be specified in one VACUUM or ANALYZE command.
Not much to say about this; does what it says on the tin.

However, formerly, if there was a column list then the ANALYZE action was
implied; now it must be specified, or you get an error.  This is because
it would otherwise be a bit unclear what the user meant if some tables
have column lists and some don't.

Nathan Bossart, reviewed by Michael Paquier and Masahiko Sawada, with some
editorialization by me

Discussion: https://postgr.es/m/E061A8E3-5E3D-494D-94F0-E8A9B312BBFC@amazon.com
2017-10-03 18:53:44 -04:00
Tom Lane 45f9d08684 Fix race condition with unprotected use of a latch pointer variable.
Commit 597a87ccc introduced a latch pointer variable to replace use
of a long-lived shared latch in the shared WalRcvData structure.
This was not well thought out, because there are now hazards of the
pointer variable changing while it's being inspected by another
process.  This could obviously lead to a core dump in code like

	if (WalRcv->latch)
		SetLatch(WalRcv->latch);

and there's a more remote risk of a torn read, if we have any
platforms where reading/writing a pointer is not atomic.

An actual problem would occur only if the walreceiver process
exits (gracefully) while the startup process is trying to
signal it, but that seems well within the realm of possibility.

To fix, treat the pointer variable (not the referenced latch)
as being protected by the WalRcv->mutex spinlock.  There
remains a race condition that we could apply SetLatch to a
process latch that no longer belongs to the walreceiver, but
I believe that's harmless: at worst it'd cause an extra wakeup
of the next process to use that PGPROC structure.

Back-patch to v10 where the faulty code was added.

Discussion: https://postgr.es/m/22735.1507048202@sss.pgh.pa.us
2017-10-03 14:00:56 -04:00
Alvaro Herrera 89e434b59c Fix coding rules violations in walreceiver.c
1. Since commit b1a9bad9e7 we had pstrdup() inside a
spinlock-protected critical section; reported by Andreas Seltenreich.
Turn those into strlcpy() to stack-allocated variables instead.
Backpatch to 9.6.

2. Since commit 9ed551e0a4 we had a pfree() uselessly inside a
spinlock-protected critical section.  Tom Lane noticed in code review.
Move down.  Backpatch to 9.6.

3. Since commit 64233902d2 we had GetCurrentTimestamp() (a kernel
call) inside a spinlock-protected critical section.  Tom Lane noticed in
code review.  Move it up.  Backpatch to 9.2.

4. Since commit 1bb2558046 we did elog(PANIC) while holding spinlock.
Tom Lane noticed in code review.  Release spinlock before dying.
Backpatch to 9.2.

Discussion: https://postgr.es/m/87h8vhtgj2.fsf@ansel.ydns.eu
2017-10-03 14:58:25 +02:00
Andres Freund 0ba99c84e8 Replace most usages of ntoh[ls] and hton[sl] with pg_bswap.h.
All postgres internal usages are replaced, it's just libpq example
usages that haven't been converted. External users of libpq can't
generally rely on including postgres internal headers.

Note that this includes replacing open-coded byte swapping of 64bit
integers (using two 32 bit swaps) with a single 64bit swap.

Where it looked applicable, I have removed netinet/in.h and
arpa/inet.h usage, which previously provided the relevant
functionality. It's perfectly possible that I missed other reasons for
including those, the buildfarm will tell.

Author: Andres Freund
Discussion: https://postgr.es/m/20170927172019.gheidqy6xvlxb325@alap3.anarazel.de
2017-10-01 15:36:14 -07:00
Tom Lane c12d570fa1 Support arrays over domains.
Allowing arrays with a domain type as their element type was left un-done
in the original domain patch, but not for any very good reason.  This
omission leads to such surprising results as array_agg() not working on
a domain column, because the parser can't identify a suitable output type
for the polymorphic aggregate.

In order to fix this, first clean up the APIs of coerce_to_domain() and
some internal functions in parse_coerce.c so that we consistently pass
around a CoercionContext along with CoercionForm.  Previously, we sometimes
passed an "isExplicit" boolean flag instead, which is strictly less
information; and coerce_to_domain() didn't even get that, but instead had
to reverse-engineer isExplicit from CoercionForm.  That's contrary to the
documentation in primnodes.h that says that CoercionForm only affects
display and not semantics.  I don't think this change fixes any live bugs,
but it makes things more consistent.  The main reason for doing it though
is that now build_coercion_expression() receives ccontext, which it needs
in order to be able to recursively invoke coerce_to_target_type().

Next, reimplement ArrayCoerceExpr so that the node does not directly know
any details of what has to be done to the individual array elements while
performing the array coercion.  Instead, the per-element processing is
represented by a sub-expression whose input is a source array element and
whose output is a target array element.  This simplifies life in
parse_coerce.c, because it can build that sub-expression by a recursive
invocation of coerce_to_target_type().  The executor now handles the
per-element processing as a compiled expression instead of hard-wired code.
The main advantage of this is that we can use a single ArrayCoerceExpr to
handle as many as three successive steps per element: base type conversion,
typmod coercion, and domain constraint checking.  The old code used two
stacked ArrayCoerceExprs to handle type + typmod coercion, which was pretty
inefficient, and adding yet another array deconstruction to do domain
constraint checking seemed very unappetizing.

In the case where we just need a single, very simple coercion function,
doing this straightforwardly leads to a noticeable increase in the
per-array-element runtime cost.  Hence, add an additional shortcut evalfunc
in execExprInterp.c that skips unnecessary overhead for that specific form
of expression.  The runtime speed of simple cases is within 1% or so of
where it was before, while cases that previously required two levels of
array processing are significantly faster.

Finally, create an implicit array type for every domain type, as we do for
base types, enums, etc.  Everything except the array-coercion case seems
to just work without further effort.

Tom Lane, reviewed by Andrew Dunstan

Discussion: https://postgr.es/m/9852.1499791473@sss.pgh.pa.us
2017-09-30 13:40:56 -04:00
Tom Lane 19de0ab23c Fix inadequate locking during get_rel_oids().
get_rel_oids used to not take any relation locks at all, but that stopped
being a good idea with commit 3c3bb9933, which inserted a syscache lookup
into the function.  A concurrent DROP TABLE could now produce "cache lookup
failed", which we don't want to have happen in normal operation.  The best
solution seems to be to transiently take a lock on the relation named by
the RangeVar (which also makes the result of RangeVarGetRelid a lot less
spongy).  But we shouldn't hold the lock beyond this function, because we
don't want VACUUM to lock more than one table at a time.  (That would not
be a big problem right now, but it will become one after the pending
feature patch to allow multiple tables to be named in VACUUM.)

In passing, adjust vacuum_rel and analyze_rel to document that we don't
trust the passed RangeVar to be accurate, and allow the RangeVar to
possibly be NULL --- which it is anyway for a whole-database VACUUM,
though we accidentally didn't crash for that case.

The passed RangeVar is in fact inaccurate when dealing with a child
partition, as of v10, and it has been wrong for a whole long time in the
case of vacuum_rel() recursing to a TOAST table.  None of these things
present visible bugs up to now, because the passed RangeVar is in fact
only consulted for autovacuum logging, and in that particular context it's
always accurate because autovacuum doesn't let vacuum.c expand partitions
nor recurse to toast tables.  Still, this seems like trouble waiting to
happen, so let's nail the door at least partly shut.  (Further cleanup
is planned, in HEAD only, as part of the pending feature patch.)

Fix some sadly inaccurate/obsolete comments too.  Back-patch to v10.

Michael Paquier and Tom Lane

Discussion: https://postgr.es/m/25023.1506107590@sss.pgh.pa.us
2017-09-29 16:26:31 -04:00
Tom Lane 136ab7c5a5 Marginal improvement for generated code in execExprInterp.c.
Avoid the coding pattern "*op->resvalue = f();", as some compilers think
that requires them to evaluate "op->resvalue" before the function call.
Unless there are lots of free registers, this can lead to a useless
register spill and reload across the call.

I changed all the cases like this in ExecInterpExpr(), but didn't bother
in the out-of-line opcode eval subroutines, since those are presumably
not as performance-critical.

Discussion: https://postgr.es/m/2508.1506630094@sss.pgh.pa.us
2017-09-29 11:32:05 -04:00
Peter Eisentraut 5373bc2a08 Add background worker type
Add bgw_type field to background worker structure.  It is intended to be
set to the same value for all workers of the same type, so they can be
grouped in pg_stat_activity, for example.

The backend_type column in pg_stat_activity now shows bgw_type for a
background worker.  The ps listing also no longer calls out that a
process is a background worker but just show the bgw_type.  That way,
being a background worker is more of an implementation detail now that
is not shown to the user.  However, most log messages still refer to
'background worker "%s"'; otherwise constructing sensible and
translatable log messages would become tricky.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
2017-09-29 11:08:24 -04:00
Robert Haas 8b304b8b72 Remove replacement selection sort.
At the time replacement_sort_tuples was introduced, there were still
cases where replacement selection sort noticeably outperformed using
quicksort even for the first run.  However, those cases seem to have
evaporated as a result of further improvements made since that time
(and perhaps also advances in CPU technology).  So remove replacement
selection and the controlling GUC entirely.  This makes tuplesort.c
noticeably simpler and probably paves the way for further
optimizations someone might want to do later.

Peter Geoghegan, with review and testing by Tomas Vondra and me.

Discussion: https://postgr.es/m/CAH2-WzmmNjG_K0R9nqYwMq3zjyJJK+hCbiZYNGhAy-Zyjs64GQ@mail.gmail.com
2017-09-29 10:25:44 -04:00
Alvaro Herrera 20b6552242 Fix freezing of a dead HOT-updated tuple
Vacuum calls page-level HOT prune to remove dead HOT tuples before doing
liveness checks (HeapTupleSatisfiesVacuum) on the remaining tuples.  But
concurrent transaction commit/abort may turn DEAD some of the HOT tuples
that survived the prune, before HeapTupleSatisfiesVacuum tests them.
This happens to activate the code that decides to freeze the tuple ...
which resuscitates it, duplicating data.

(This is especially bad if there's any unique constraints, because those
are now internally violated due to the duplicate entries, though you
won't know until you try to REINDEX or dump/restore the table.)

One possible fix would be to simply skip doing anything to the tuple,
and hope that the next HOT prune would remove it.  But there is a
problem: if the tuple is older than freeze horizon, this would leave an
unfrozen XID behind, and if no HOT prune happens to clean it up before
the containing pg_clog segment is truncated away, it'd later cause an
error when the XID is looked up.

Fix the problem by having the tuple freezing routines cope with the
situation: don't freeze the tuple (and keep it dead).  In the cases that
the XID is older than the freeze age, set the HEAP_XMAX_COMMITTED flag
so that there is no need to look up the XID in pg_clog later on.

An isolation test is included, authored by Michael Paquier, loosely
based on Daniel Wood's original reproducer.  It only tests one
particular scenario, though, not all the possible ways for this problem
to surface; it be good to have a more reliable way to test this more
fully, but it'd require more work.
In message https://postgr.es/m/20170911140103.5akxptyrwgpc25bw@alvherre.pgsql
I outlined another test case (more closely matching Dan Wood's) that
exposed a few more ways for the problem to occur.

Backpatch all the way back to 9.3, where this problem was introduced by
multixact juggling.  In branches 9.3 and 9.4, this includes a backpatch
of commit e5ff9fefcd50 (of 9.5 era), since the original is not
correctable without matching the coding pattern in 9.5 up.

Reported-by: Daniel Wood
Diagnosed-by: Daniel Wood
Reviewed-by: Yi Wen Wong, Michaël Paquier
Discussion: https://postgr.es/m/E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com
2017-09-28 16:44:01 +02:00
Tom Lane 7769fc000a Fix behavior when converting a float infinity to numeric.
float8_numeric() and float4_numeric() failed to consider the possibility
that the input is an IEEE infinity.  The results depended on the
platform-specific behavior of sprintf(): on most platforms you'd get
something like

ERROR:  invalid input syntax for type numeric: "inf"

but at least on Windows it's possible for the conversion to succeed and
deliver a finite value (typically 1), due to a nonstandard output format
from sprintf and lack of syntax error checking in these functions.

Since our numeric type lacks the concept of infinity, a suitable conversion
is impossible; the best thing to do is throw an explicit error before
letting sprintf do its thing.

While at it, let's use snprintf not sprintf.  Overrunning the buffer
should be impossible if sprintf does what it's supposed to, but this
is cheap insurance against a stack smash if it doesn't.

Problem reported by Taiki Kondo.  Patch by me based on fix suggestion
from KaiGai Kohei.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/12A9442FBAE80D4E8953883E0B84E088C8C7A2@BPXM01GP.gisp.nec.co.jp
2017-09-27 17:05:53 -04:00
Tom Lane 28e0727076 Revert to 9.6 treatment of ALTER TYPE enumtype ADD VALUE.
This reverts commit 15bc038f9, along with the followon commits 1635e80d3
and 984c92074 that tried to clean up the problems exposed by bug #14825.
The result was incomplete because it failed to address parallel-query
requirements.  With 10.0 release so close upon us, now does not seem like
the time to be adding more code to fix that.  I hope we can un-revert this
code and add the missing parallel query support during the v11 cycle.

Back-patch to v10.

Discussion: https://postgr.es/m/20170922185904.1448.16585@wrigleys.postgresql.org
2017-09-27 16:14:43 -04:00
Tom Lane 9a50a93c7b Improve wording of error message added in commit 714805010.
Per suggestions from Peter Eisentraut and David Johnston.
Back-patch, like the previous commit.

Discussion: https://postgr.es/m/E1dv9jI-0006oT-Fn@gemulon.postgresql.org
2017-09-26 15:25:56 -04:00
Tom Lane 5ea96efaa0 Fix failure-to-read-man-page in commit 899bd785c.
posix_fallocate() is not quite a drop-in replacement for fallocate(),
because it is defined to return the error code as its function result,
not in "errno".  I (tgl) missed this because RHEL6's version seems
to set errno as well.  That is not the case on more modern Linuxen,
though, as per buildfarm results.

Aside from fixing the return-convention confusion, remove the test
for ENOSYS; we expect that glibc will mask that for posix_fallocate,
though it does not for fallocate.  Keep the test for EINTR, because
POSIX specifies that as a possible result, and buildfarm results
suggest that it can happen in practice.

Back-patch to 9.4, like the previous commit.

Thomas Munro

Discussion: https://postgr.es/m/1002664500.12301802.1471008223422.JavaMail.yahoo@mail.yahoo.com
2017-09-26 13:42:53 -04:00
Tom Lane 984c92074d Remove heuristic same-transaction test from check_safe_enum_use().
The blacklist mechanism added by the preceding commit directly fixes
most of the practical cases that the same-transaction test was meant
to cover.  What remains is use-cases like

	begin;
	create type e as enum('x');
	alter type e add value 'y';
	-- use 'y' somehow
	commit;

However, because the same-transaction test is heuristic, it fails on
small variants of that, such as renaming the type or changing its
owner.  Rather than try to explain the behavior to users, let's
remove it and just have a rule that the newly added value can't be
used before being committed, full stop.  Perhaps later it will be
worth the implementation effort and overhead to have a more accurate
test for type-was-created-in-this-transaction.  We'll wait for some
field experience with v10 before deciding to do that.

Back-patch to v10.

Discussion: https://postgr.es/m/20170922185904.1448.16585@wrigleys.postgresql.org
2017-09-26 13:14:46 -04:00
Tom Lane 1635e80d30 Use a blacklist to distinguish original from add-on enum values.
Commit 15bc038f9 allowed ALTER TYPE ADD VALUE to be executed inside
transaction blocks, by disallowing the use of the added value later
in the same transaction, except under limited circumstances.  However,
the test for "limited circumstances" was heuristic and could reject
references to enum values that were created during CREATE TYPE AS ENUM,
not just later.  This breaks the use-case of restoring pg_dump scripts
in a single transaction, as reported in bug #14825 from Balazs Szilfai.

We can improve this by keeping a "blacklist" table of enum value OIDs
created by ALTER TYPE ADD VALUE during the current transaction.  Any
visible-but-uncommitted value whose OID is not in the blacklist must
have been created by CREATE TYPE AS ENUM, and can be used safely
because it could not have a lifespan shorter than its parent enum type.

This change also removes the restriction that a renamed enum value
can't be used before being committed (unless it was on the blacklist).

Andrew Dunstan, with cosmetic improvements by me.
Back-patch to v10.

Discussion: https://postgr.es/m/20170922185904.1448.16585@wrigleys.postgresql.org
2017-09-26 13:14:46 -04:00
Peter Eisentraut ab28feae2b Handle heap rewrites better in logical replication
A FOR ALL TABLES publication naturally considers all base tables to be a
candidate for replication.  This includes transient heaps that are
created during a table rewrite during DDL.  This causes failures on the
subscriber side because it will not have a table like pg_temp_16386 to
receive data (and if it did, it would be the wrong table).

The prevent this problem, we filter out any tables that match this
naming pattern and match an actual table from FOR ALL TABLES
publications.  This is only a heuristic, meaning that user tables that
match that naming could accidentally be omitted.  A more robust solution
might require an explicit marking of such tables in pg_class somehow.

Reported-by: yxq <yxq@o2.pl>
Bug: #14785
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>
2017-09-26 10:13:43 -04:00
Robert Haas 22c5e73562 Remove lsn from HashScanPosData.
This was intended as infrastructure for weakening VACUUM's locking
requirements, similar to what was done for btree indexes in commit
2ed5b87f96.  However, for hash indexes,
it seems that the improvements which are possible are actually
extremely marginal.  Furthermore, performing the LSN cross-check will
end up skipping cleanup far more often than is necessary; we only care
about page modifications due to a VACUUM, but the LSN check will fail
if ANY modification has occurred.  So, rather than pressing forward
with that "optimization", just rip the LSN field out.

Patch by me, reviewed by Ashutosh Sharma and Amit Kapila

Discussion: http://postgr.es/m/CAA4eK1JxqqcuC5Un7YLQVhOYSZBS+t=3xqZuEkt5RyquyuxpwQ@mail.gmail.com
2017-09-26 09:16:45 -04:00
Robert Haas 79a4a665c0 Fix trivial mistake in README.
You might think I (Robert) could manage to count to five without
messing it up, but if you did, you would be wrong.

Amit Kapila

Discussion: http://postgr.es/m/CAA4eK1JxqqcuC5Un7YLQVhOYSZBS+t=3xqZuEkt5RyquyuxpwQ@mail.gmail.com
2017-09-26 09:01:43 -04:00
Tom Lane 899bd785c0 Avoid SIGBUS on Linux when a DSM memory request overruns tmpfs.
On Linux, shared memory segments created with shm_open() are backed by
swap files created in tmpfs.  If the swap file needs to be extended,
but there's no tmpfs space left, you get a very unfriendly SIGBUS trap.
To avoid this, force allocation of the full request size when we create
the segment.  This adds a few cycles, but none that we wouldn't expend
later anyway, assuming the request isn't hugely bigger than the actual
need.

Make this code #ifdef __linux__, because (a) there's not currently a
reason to think the same problem exists on other platforms, and (b)
applying posix_fallocate() to an FD created by shm_open() isn't very
portable anyway.

Back-patch to 9.4 where the DSM code came in.

Thomas Munro, per a bug report from Amul Sul

Discussion: https://postgr.es/m/1002664500.12301802.1471008223422.JavaMail.yahoo@mail.yahoo.com
2017-09-25 16:09:19 -04:00
Tom Lane 716ea626a8 Make construct_[md_]array return a valid empty array for zero-size input.
If construct_array() or construct_md_array() were given a dimension of
zero, they'd produce an array that contains no elements but has positive
dimension.  This violates a general expectation that empty arrays should
have ndims = 0; in particular, while arrays like this print as empty,
they don't compare equal to other empty arrays.

Up to now we've expected callers to avoid making such calls and instead
be careful to call construct_empty_array() if there would be no elements.
But this has always been an easily missed case, and we've repeatedly had to
fix callers to do it right.  In bug #14826, Erwin Brandstetter pointed out
yet another such oversight, in ts_lexize(); and a bit of examination of
other call sites found at least two more with similar issues.  So let's
fix the problem centrally and permanently by changing these two functions
to construct a proper zero-D empty array whenever the array would be empty.

This renders a few explicit calls of construct_empty_array() redundant,
but the only such place I found that really seemed worth changing was in
ExecEvalArrayExpr().

Although this fixes some very old bugs, no back-patch: the problem is
pretty minor and the risk of changing behavior seems to outweigh the
benefit in stable branches.

Discussion: https://postgr.es/m/20170923125723.1448.39412@wrigleys.postgresql.org
Discussion: https://postgr.es/m/20570.1506198383@sss.pgh.pa.us
2017-09-25 11:55:24 -04:00
Peter Eisentraut 6dda0998af Allow ICU to use SortSupport on Windows with UTF-8
There is no reason to ever prevent the use of SortSupport on Windows
when ICU locales are used.  We previously avoided SortSupport on Windows
with UTF-8 server encoding and a non C-locale due to restrictions in
Windows' libc functionality.

This is now considered to be a restriction in one platform's libc
collation provider, and not a more general platform restriction.

Reported-by: Peter Geoghegan <pg@bowt.ie>
2017-09-24 07:55:24 -04:00
Tom Lane 24541ffd78 ... and the very same bug in publicationListToArray().
Sigh.
2017-09-23 15:16:48 -04:00
Tom Lane 737639017c Fix bogus size calculation in strlist_to_textarray().
It's making an array of Datum, not an array of text *.  The mistake
is harmless since those are currently the same size, but it's still
wrong.
2017-09-23 15:01:59 -04:00
Tom Lane 335f3d04e4 Improve memory management in autovacuum.c.
Invoke vacuum(), as well as "work item" processing, in the PortalContext
that do_autovacuum() has manufactured, which will be reset before each
such invocation.  This ensures cleanup of any memory leaked by these
operations.  It also avoids the rather dangerous practice of calling
vacuum() in a context that vacuum() itself will destroy while it runs.
There's no known live bug there, but it's not hard to imagine introducing
one if we leave it like this.

Tom Lane, reviewed by Michael Paquier and Alvaro Herrera

Discussion: https://postgr.es/m/13849.1506114543@sss.pgh.pa.us
2017-09-23 13:28:16 -04:00
Peter Eisentraut 0c5803b450 Refactor new file permission handling
The file handling functions from fd.c were called with a diverse mix of
notations for the file permissions when they were opening new files.
Almost all files created by the server should have the same permissions
set.  So change the API so that e.g. OpenTransientFile() automatically
uses the standard permissions set, and OpenTransientFilePerm() is a new
function that takes an explicit permissions set for the few cases where
it is needed.  This also saves an unnecessary argument for call sites
that are just opening an existing file.

While we're reviewing these APIs, get rid of the FileName typedef and
use the standard const char * for the file name and mode_t for the file
mode.  This makes these functions match other file handling functions
and removes an unnecessary layer of mysteriousness.  We can also get rid
of a few casts that way.

Author: David Steele <david@pgmasters.net>
2017-09-23 10:16:18 -04:00
Peter Eisentraut aa6b7b72d9 Fix saving and restoring umask
In two cases, we set a different umask for some piece of code and
restore it afterwards.  But if the contained code errors out, the umask
is not restored.  So add TRY/CATCH blocks to fix that.
2017-09-22 17:10:36 -04:00
Andres Freund 791961f59b Add inline murmurhash32(uint32) function.
The function already existed in tidbitmap.c but more users requiring
fast hashing of 32bit ints are coming up.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
2017-09-22 13:38:42 -07:00
Robert Haas 6a2fa09c0c For wal_consistency_checking, mask page checksum as well as page LSN.
If the LSN is different, the checksum will be different, too.

Ashwin Agrawal, reviewed by Michael Paquier and Kuntal Ghosh

Discussion: http://postgr.es/m/CALfoeis5iqrAU-+JAN+ZzXkpPr7+-0OAGv7QUHwFn=-wDy4o4Q@mail.gmail.com
2017-09-22 14:28:22 -04:00
Robert Haas 7c75ef5715 hash: Implement page-at-a-time scan.
Commit 09cb5c0e7d added a similar
optimization to btree back in 2006, but nobody bothered to implement
the same thing for hash indexes, probably because they weren't
WAL-logged and had lots of other performance problems as well.  As
with the corresponding btree case, this eliminates the problem of
potentially needing to refind our position within the page, and cuts
down on pin/unpin traffic as well.

Ashutosh Sharma, reviewed by Alexander Korotkov, Jesper Pedersen,
Amit Kapila, and me.  Some final edits to comments and README by
me.

Discussion: http://postgr.es/m/CAE9k0Pm3KTx93K8_5j6VMzG4h5F+SyknxUwXrN-zqSZ9X8ZS3w@mail.gmail.com
2017-09-22 13:56:27 -04:00
Tom Lane ed87e19807 Mop-up for commit 85feb77aa0.
Adjust commentary in regc_pg_locale.c to remove mention of the possibility
of not having <wctype.h> functions, since we no longer consider that.

Eliminate duplicate code in wparser_def.c by generalizing the p_iswhat
macro to take a parameter saying what to return for non-ASCII chars
in C locale.  (That's not really a consequence of the
USE_WIDE_UPPER_LOWER-ectomy, but I noticed it while doing that.)
2017-09-22 11:35:12 -04:00
Tom Lane 85feb77aa0 Assume wcstombs(), towlower(), and sibling functions are always present.
These functions are required by SUS v2, which is our minimum baseline
for Unix platforms, and are present on all interesting Windows versions
as well.  Even our oldest buildfarm members have them.  Thus, we were not
testing the "!USE_WIDE_UPPER_LOWER" code paths, which explains why the bug
fixed in commit e6023ee7f escaped detection.  Per discussion, there seems
to be no more real-world value in maintaining this option.  Hence, remove
the configure-time tests for wcstombs() and towlower(), remove the
USE_WIDE_UPPER_LOWER symbol, and remove all the !USE_WIDE_UPPER_LOWER code.
There's not actually all that much of the latter, but simplifying the #if
nests is a win in itself.

Discussion: https://postgr.es/m/20170921052928.GA188913@rfd.leadboat.com
2017-09-22 11:00:58 -04:00
Peter Eisentraut e6023ee7fa Fix build with !USE_WIDE_UPPER_LOWER
The placement of the ifdef blocks in formatting.c was pretty bogus, so
the code failed to compile if USE_WIDE_UPPER_LOWER was not defined.

Reported-by: Peter Geoghegan <pg@bowt.ie>
Reported-by: Noah Misch <noah@leadboat.com>
2017-09-22 09:26:38 -04:00
Tom Lane 7148050105 Give a better error for duplicate entries in VACUUM/ANALYZE column list.
Previously, the code didn't think about this case and would just try to
analyze such a column twice.  That would fail at the point of inserting
the second version of the pg_statistic row, with obscure error messsages
like "duplicate key value violates unique constraint" or "tuple already
updated by self", depending on context and PG version.  We could allow
the case by ignoring duplicate column specifications, but it seems better
to reject it explicitly.

The bogus error messages seem like arguably a bug, so back-patch to
all supported versions.

Nathan Bossart, per a report from Michael Paquier, and whacked
around a bit by me.

Discussion: https://postgr.es/m/E061A8E3-5E3D-494D-94F0-E8A9B312BBFC@amazon.com
2017-09-21 18:13:11 -04:00
Andrew Dunstan 28ae524bbf Quieten warnings about unused variables
These variables are only ever written to in assertion-enabled builds,
and the latest Microsoft compilers complain about such variables in
non-assertion-enabled builds.

Apparently they don't worry so much about variables that are written to
but not read from, so most of our PG_USED_FOR_ASSERTS_ONLY variables
don't cause the problem.

Discussion: https://postgr.es/m/7800.1505950322@sss.pgh.pa.us
2017-09-21 08:41:14 -04:00
Robert Haas 9140cf8269 Associate partitioning information with each RelOptInfo.
This is not used for anything yet, but it is necessary infrastructure
for partition-wise join and for partition pruning without constraint
exclusion.

Ashutosh Bapat, reviewed by Amit Langote and with quite a few changes,
mostly cosmetic, by me.  Additional review and testing of this patch
series by Antonin Houska, Amit Khandekar, Rafia Sabih, Rajkumar
Raghuwanshi, Thomas Munro, and Dilip Kumar.

Discussion: http://postgr.es/m/CAFjFpRfneFG3H+F6BaiXemMrKF+FY-POpx3Ocy+RiH3yBmXSNw@mail.gmail.com
2017-09-20 23:39:13 -04:00
Tom Lane 7b86c2ac95 Improve dubious memory management in pg_newlocale_from_collation().
pg_newlocale_from_collation() used malloc() and strdup() directly,
which is generally not per backend coding style, and it didn't bother
to check for failure results, but would just SIGSEGV instead.  Also,
if one of the numerous error checks in the middle of the function
failed, the already-allocated memory would be leaked permanently.
Admittedly, it's not a lot of memory, but it could build up if this
function were called repeatedly for a bad collation.

The first two problems are easily cured by palloc'ing in TopMemoryContext
instead of calling libc directly.  We can fairly easily dodge the leakage
problem for the struct pg_locale_struct by filling in a temporary variable
and allocating permanent storage only once we reach the bottom of the
function.  It's harder to get rid of the potential leakage for ICU's copy
of the collcollate string, but at least that's only allocated after most
of the error checks; so live with that aspect.

Back-patch to v10 where this code came in, with one or another of the
ICU patches.
2017-09-20 13:52:36 -04:00
Robert Haas 57eebca03a Fix create_lateral_join_info to handle dead relations properly.
Commit 0a480502b0 broke it.

Report by Andreas Seltenreich.  Fix by Ashutosh Bapat.

Discussion: http://postgr.es/m/874ls2vrnx.fsf@ansel.ydns.eu
2017-09-20 10:20:10 -04:00
Robert Haas 7f3a3312ab Fix typo.
Thomas Munro

Discussion: http://postgr.es/m/CAEepm=2j-HAgnBUrAazwS0ry7Z_ihk+d7g+Ye3u99+6WbiGt_Q@mail.gmail.com
2017-09-20 10:07:53 -04:00
Peter Eisentraut be87b70b61 Sync process names between ps and pg_stat_activity
Remove gratuitous differences in the process names shown in
pg_stat_activity.backend_type and the ps output.

Reviewed-by: Takayuki Tsunakawa <tsunakawa.takay@jp.fujitsu.com>
2017-09-20 08:59:03 -04:00
Andres Freund fc49e24fa6 Make WAL segment size configurable at initdb time.
For performance reasons a larger segment size than the default 16MB
can be useful. A larger segment size has two main benefits: Firstly,
in setups using archiving, it makes it easier to write scripts that
can keep up with higher amounts of WAL, secondly, the WAL has to be
written and synced to disk less frequently.

But at the same time large segment size are disadvantageous for
smaller databases. So far the segment size had to be configured at
compile time, often making it unrealistic to choose one fitting to a
particularly load. Therefore change it to a initdb time setting.

This includes a breaking changes to the xlogreader.h API, which now
requires the current segment size to be configured.  For that and
similar reasons a number of binaries had to be taught how to recognize
the current segment size.

Author: Beena Emerson, editorialized by Andres Freund
Reviewed-By: Andres Freund, David Steele, Kuntal Ghosh, Michael
    Paquier, Peter Eisentraut, Robert Hass, Tushar Ahuja
Discussion: https://postgr.es/m/CAOG9ApEAcQ--1ieKbhFzXSQPw_YLmepaa4hNdnY5+ZULpt81Mw@mail.gmail.com
2017-09-19 22:03:48 -07:00
Tom Lane 2d484f9b05 Remove no-op GiST support functions in the core GiST opclasses.
The preceding patch allowed us to remove useless GiST support functions.
This patch actually does that for all the no-op cases in the core GiST
code.  This buys us whatever performance gain is to be had, and more
importantly exercises the preceding patch.

There remain no-op functions in the contrib GiST opclasses, but those
will take more work to remove.

Discussion: https://postgr.es/m/CAJEAwVELVx9gYscpE=Be6iJxvdW5unZ_LkcAaVNSeOwvdwtD=A@mail.gmail.com
2017-09-19 23:32:59 -04:00
Tom Lane d3a4f89d8a Allow no-op GiST support functions to be omitted.
There are common use-cases in which the compress and/or decompress
functions can be omitted, with the result being that we make no
data transformation when storing or retrieving index values.
Previously, you had to provide a no-op function anyway, but this
patch allows such opclass support functions to be omitted.

Furthermore, if the compress function is omitted, then the core code
knows that the stored representation is the same as the original data.
This means we can allow index-only scans without requiring a fetch
function to be provided either.  Previously you had to provide a
no-op fetch function if you wanted IOS to work.

This reportedly provides a small performance benefit in such cases,
but IMO the real reason for doing it is just to reduce the amount of
useless boilerplate code that has to be written for GiST opclasses.

Andrey Borodin, reviewed by Dmitriy Sarafannikov

Discussion: https://postgr.es/m/CAJEAwVELVx9gYscpE=Be6iJxvdW5unZ_LkcAaVNSeOwvdwtD=A@mail.gmail.com
2017-09-19 23:32:59 -04:00