Commit Graph

40285 Commits

Author SHA1 Message Date
Kevin Grittner 11e178d0dc Inline initial comparisons in TestForOldSnapshot()
Even with old_snapshot_threshold = -1 (which disables the "snapshot
too old" feature), performance regressions were seen at moderate to
high concurrency.  For example, a one-socket, four-core system
running 200 connections at saturation could see up to a 2.3%
regression, with larger regressions possible on NUMA machines.
By inlining the early (smaller, faster) tests in the
TestForOldSnapshot() function, the i7 case dropped to a 0.2%
regression, which could easily just be noise, and is clearly an
improvement.  Further testing will show whether more is needed.
2016-04-21 08:40:08 -05:00
Robert Haas 5b1f9ce1d9 postgres_fdw: Don't push down certain full joins.
If there's a filter condition on either side of a full outer join,
it is neither correct to attach it to the join's ON clause nor to
throw it into the toplevel WHERE clause.  Just don't push down the
join in that case.

To maximize the number of cases where we can still push down full
joins, push inner join conditions into the ON clause at the first
opportunity rather than postponing them to the top-level WHERE
clause.  This produces nicer SQL, anyway.

This bug was introduced in e4106b2528.

Ashutosh Bapat, per report from Rajkumar Raghuwanshi.
2016-04-20 23:54:19 -04:00
Tom Lane cbabb70f35 Honor PGCTLTIMEOUT environment variable for pg_regress' startup wait.
In commit 2ffa869620 we made pg_ctl recognize an environment variable
PGCTLTIMEOUT to set the default timeout for starting and stopping the
postmaster.  However, pg_regress uses pg_ctl only for the "stop" end of
that; it has bespoke code for starting the postmaster, and that code has
historically had a hard-wired 60-second timeout.  Further buildfarm
experience says it'd be a good idea if that timeout were also controlled
by PGCTLTIMEOUT, so let's make it so.  Like the previous patch, back-patch
to all active branches.

Discussion: <13969.1461191936@sss.pgh.pa.us>
2016-04-20 23:48:13 -04:00
Robert Haas b4e0f18382 Add pg_dump support for the new PARALLEL option for aggregates.
This was an oversight in commit 41ea0c2376.

Fabrízio de Royes Mello, per a report from Tushar Ahuja
2016-04-20 23:06:06 -04:00
Robert Haas 9c75e1a36b Forbid parallel Hash Right Join or Hash Full Join.
That won't work.  You'll get bogus null-extended rows.

Mithun Cy
2016-04-20 17:48:55 -04:00
Magnus Hagander cfb863f20a Update backup documentation for new APIs
This includes the rest of the documentation that was not included
in 7117685. A larger restructure would still be wanted, but with
this commit the documentation of the new features is complete.
2016-04-20 14:40:04 -04:00
Tom Lane bde361fef5 Fix memory leak and other bugs in ginPlaceToPage() & subroutines.
Commit 36a35c550a turned the interface between ginPlaceToPage and
its subroutines in gindatapage.c and ginentrypage.c into a royal mess:
page-update critical sections were started in one place and finished in
another place not even in the same file, and the very same subroutine
might return having started a critical section or not.  Subsequent patches
band-aided over some of the problems with this design by making things
even messier.

One user-visible resulting problem is memory leaks caused by the need for
the subroutines to allocate storage that would survive until ginPlaceToPage
calls XLogInsert (as reported by Julien Rouhaud).  This would not typically
be noticeable during retail index updates.  It could be visible in a GIN
index build, in the form of memory consumption swelling to several times
the commanded maintenance_work_mem.

Another rather nasty problem is that in the internal-page-splitting code
path, we would clear the child page's GIN_INCOMPLETE_SPLIT flag well before
entering the critical section that it's supposed to be cleared in; a
failure in between would leave the index in a corrupt state.  There were
also assorted coding-rule violations with little immediate consequence but
possible long-term hazards, such as beginning an XLogInsert sequence before
entering a critical section, or calling elog(DEBUG) inside a critical
section.

To fix, redefine the API between ginPlaceToPage() and its subroutines
by splitting the subroutines into two parts.  The "beginPlaceToPage"
subroutine does what can be done outside a critical section, including
full computation of the result pages into temporary storage when we're
going to split the target page.  The "execPlaceToPage" subroutine is called
within a critical section established by ginPlaceToPage(), and it handles
the actual page update in the non-split code path.  The critical section,
as well as the XLOG insertion call sequence, are both now always started
and finished in ginPlaceToPage().  Also, make ginPlaceToPage() create and
work in a short-lived memory context to eliminate the leakage problem.
(Since a short-lived memory context had been getting created in the most
common code path in the subroutines, this shouldn't cause any noticeable
performance penalty; we're just moving the overhead up one call level.)

In passing, fix a bunch of comments that had gone unmaintained throughout
all this klugery.

Report: <571276DD.5050303@dalibo.com>
2016-04-20 14:25:15 -04:00
Kevin Grittner a343e223a5 Revert no-op changes to BufferGetPage()
The reverted changes were intended to force a choice of whether any
newly-added BufferGetPage() calls needed to be accompanied by a
test of the snapshot age, to support the "snapshot too old"
feature.  Such an accompanying test is needed in about 7% of the
cases, where the page is being used as part of a scan rather than
positioning for other purposes (such as DML or vacuuming).  The
additional effort required for back-patching, and the doubt whether
the intended benefit would really be there, have indicated it is
best just to rely on developers to do the right thing based on
comments and existing usage, as we do with many other conventions.

This change should have little or no effect on generated executable
code.

Motivated by the back-patching pain of Tom Lane and Robert Haas
2016-04-20 08:31:19 -05:00
Tom Lane 4db0d2d2fe Improve regression tests for degree-based trigonometric functions.
Print the actual value of each function result that's expected to be exact,
rather than merely emitting a NULL if it's not right.  Although we print
these with extra_float_digits = 3, we should not trust that the platform
will produce a result visibly different from the expected value if it's off
only in the last place; hence, also include comparisons against the exact
values as before.  This is a bit bulkier and uglier than the previous
printout, but it will provide more information and be easier to interpret
if there's a test failure.

Discussion: <18241.1461073100@sss.pgh.pa.us>
2016-04-19 16:47:21 -04:00
Tom Lane a0382e2d7e Make partition-lock-release coding more transparent in BufferAlloc().
Coverity complained that oldPartitionLock was possibly dereferenced after
having been set to NULL.  That actually can't happen, because we'd only use
it if (oldFlags & BM_TAG_VALID) is true.  But nonetheless Coverity is
justified in complaining, because at line 1275 we actually overwrite
oldFlags, and then still expect its BM_TAG_VALID bit to be a safe guide to
whether to release the oldPartitionLock.  Thus, the code would be incorrect
if someone else had changed the buffer's BM_TAG_VALID flag meanwhile.
That should not happen, since we hold pin on the buffer throughout this
sequence, but it's starting to look like a rather shaky chain of logic.
And there's no need for such assumptions, because we can simply replace
the (oldFlags & BM_TAG_VALID) tests with (oldPartitionLock != NULL),
which has identical results and makes it plain to all comers that we don't
dereference a null pointer.  A small side benefit is that the range of
liveness of oldFlags is greatly reduced, possibly allowing the compiler
to save a register.

This is just cleanup, not an actual bug fix, so there seems no need
for a back-patch.
2016-04-18 18:05:56 -04:00
Tom Lane 75c24d0f74 Further reduce the number of semaphores used under --disable-spinlocks.
Per discussion, there doesn't seem to be much value in having
NUM_SPINLOCK_SEMAPHORES set to 1024: under any scenario where you are
running more than a few backends concurrently, you really had better have a
real spinlock implementation if you want tolerable performance.  And 1024
semaphores is a sizable fraction of the system-wide SysV semaphore limit
on many platforms.  Therefore, reduce this setting's default value to 128
to make it less likely to cause out-of-semaphores problems.
2016-04-18 13:33:06 -04:00
Fujii Masao 8ce8307bd4 Fix typo in docs.
Artur Zakirov
2016-04-18 13:35:21 +09:00
Peter Eisentraut d460c7cc0f doc: Document that sequences can also be extension configuration tables
From: Michael Paquier <michael.paquier@gmail.com>
2016-04-17 23:13:45 -04:00
Tom Lane 9603a32594 Avoid code duplication in \crosstabview.
In commit 6f0d6a507 I added a duplicate copy of psqlscanslash's identifier
downcasing code, but actually it's not hard to split that out as a callable
subroutine and avoid the duplication.
2016-04-17 11:37:58 -04:00
Tom Lane 4039c736eb Adjust spin.c's spinlock emulation so that 0 is not a valid spinlock value.
We've had repeated troubles over the years with failures to initialize
spinlocks correctly; see 6b93fcd14 for a recent example.  Most of the time,
on most platforms, such oversights can escape notice because all-zeroes is
the expected initial content of an slock_t variable.  The only platform
we have where the initialized state of an slock_t isn't zeroes is HPPA,
and that's practically gone in the wild.  To make it easier to catch such
errors without needing one of those, adjust the --disable-spinlocks code
so that zero is not a valid value for an slock_t for it.

In passing, remove a bunch of unnecessary #include's from spin.c;
commit daa7527afc removed all the intermodule coupling that
made them necessary.
2016-04-16 19:53:38 -04:00
Peter Eisentraut 5fdda1ceab doc: Change some "user" to "role" for consistency in the section
suggested by Johannes Choo
2016-04-16 12:54:56 -04:00
Peter Eisentraut efb25e56d8 doc: Markup improvement 2016-04-16 12:49:36 -04:00
Tom Lane c34df8a003 Disallow creation of indexes on system columns (except for OID).
Although OID acts pretty much like user data, the other system columns do
not, so an index on one would likely misbehave.  And it's pretty hard to
see a use-case for one, anyway.  Let's just forbid the case rather than
worry about whether it should be supported.

David Rowley
2016-04-16 12:11:41 -04:00
Stephen Frost 99f2f3c19a In recordExtensionInitPriv(), keep the scan til we're done with it
For reasons of sheer brain fade, we (I) was calling systable_endscan()
immediately after systable_getnext() and expecting the tuple returned
by systable_getnext() to still be valid.

That's clearly wrong.  Move the systable_endscan() down below the tuple
usage.

Discovered initially by Pavel Stehule and then also by Alvaro.

Add a regression test based on Alvaro's testing.
2016-04-15 21:57:15 -04:00
Peter Eisentraut d2de44c2ce doc: Add missing parentheses
From: Alexander Law <exclusion@gmail.com>
2016-04-15 20:44:10 -04:00
Peter Eisentraut c313687673 psql: Add new gettext trigger 2016-04-15 20:23:41 -04:00
Tom Lane 4447f0bcb6 Use less-generic names in matview.sql.
The original coding of this test used table and view names like "t",
"tv", "foo", etc.  This tended to interfere with doing simple manual
tests in the regression database; not to mention that it posed a
considerable risk of conflict with other regression test scripts.
Prefix these names with "mvtest_" to avoid such conflicts.

Also, change transiently-created role name to be "regress_xxx" per
discussions about being careful with regression-test role creation.
2016-04-15 13:04:17 -04:00
Tom Lane 8f1911d5e6 Fix possible crash in ALTER TABLE ... REPLICA IDENTITY USING INDEX.
Careless coding added by commit 07cacba983 could result in a crash
or a bizarre error message if someone tried to select an index on the
OID column as the replica identity index for a table.  Back-patch to 9.4
where the feature was introduced.

Discussion: CAKJS1f8TQYgTRDyF1_u9PVCKWRWz+DkieH=U7954HeHVPJKaKg@mail.gmail.com

David Rowley
2016-04-15 12:11:40 -04:00
Robert Haas da7d44b627 postgres_fdw: Clean up handling of system columns.
Previously, querying the xmin column of a single postgres_fdw foreign
table fetched the tuple length, xmax the typmod, and cmin or cmax the
composite type OID of the tuple.  However, when you queried several
such tables and the join got shipped to the remote side, these columns
ended up containing the remote values of the corresponding columns.
Both behaviors are rather unprincipled, the former for obvious reasons
and the latter because the remote values of these columns don't have
any local significance; our transaction IDs are in a different space
than those of the remote machine.  Clean this up by setting all of
these fields to 0 in both cases.  Also fix the handling of tableoid
to be sane.

Robert Haas and Ashutosh Bapat, reviewed by Etsuro Fujita.
2016-04-15 12:08:14 -04:00
Robert Haas 5702277ca9 Tweak EXPLAIN for parallel query to show workers launched.
The previous display was sort of confusing, because it didn't
distinguish between the number of workers that we planned to launch
and the number that actually got launched.  This has already confused
several people, so display both numbers and label them clearly.

Julien Rouhaud, reviewed by me.
2016-04-15 11:52:18 -04:00
Tom Lane 6b85d4ba9b Fix portability problem induced by commit a6f6b7819.
pg_xlogdump includes bufmgr.h.  With a compiler that emits code for
static inline functions even when they're unreferenced, that leads
to unresolved external references in the new static-inline version
of BufferGetPage().  So hide it with #ifndef FRONTEND, as we've done
for similar issues elsewhere.  Per buildfarm member pademelon.
2016-04-15 10:44:28 -04:00
Magnus Hagander ba8fe38f58 Fix typo in comment 2016-04-15 13:32:54 +02:00
Magnus Hagander cf086b1c2f Update helptext for vcregress.pl
This has clearly not been tracking the code changse for quite some time.

Michael Paquier, problem spotted by Kyotaro HORIGUCHI
2016-04-15 10:04:10 +02:00
Fujii Masao 36c1c91604 Make regression test for multiple synchronous standbys more stable.
The regression test checks whether the output of pg_stat_replication is
expected or not after changing synchronous_standby_names and reloading
the configuration file. Regarding this test logic, previously there was
a timing issue which made the test result unstable. That is,
pg_stat_replication could return unexpected result during small window
after the configuration file was reloaded before new setting value
took effect, and which made the test fail.

This commit changes the test logic so that it uses a loop with a timeout
to give some room for the test to pass. Now the test fails only when
pg_stat_replication keeps returning unexpected result for 30 seconds.

Michael Paquier
2016-04-15 13:58:14 +09:00
Tom Lane f0e766bd7f Fix memory leak in GIN index scans.
The code had a query-lifespan memory leak when encountering GIN entries
that have posting lists (rather than posting trees, ie, there are a
relatively small number of heap tuples containing this index key value).
With a suitable data distribution this could add up to a lot of leakage.
Problem seems to have been introduced by commit 36a35c550, so back-patch
to 9.4.

Julien Rouhaud
2016-04-15 00:02:26 -04:00
Tom Lane 6f0d6a5078 Rethink \crosstabview's argument parsing logic.
\crosstabview interpreted its arguments in an unusual way, including
doing case-insensitive matching of unquoted column names, which is
surely not the right thing.  Rip that out in favor of doing something
equivalent to the dequoting/case-folding rules used by other psql
commands.  To keep it simple, change the syntax so that the optional
sort column is specified as a separate argument, instead of the
also-quite-unusual syntax that attached it to the colH argument with
a colon.

Also, rework the error messages to be closer to project style.
2016-04-14 22:54:31 -04:00
Andres Freund 4b74c6a40e Make init_spin_delay() C89 compliant #2.
My previous attempt at doing so, in 80abbeba23, was not sufficient. While that
fixed the problem for bufmgr.c and lwlock.c , s_lock.c still has non-constant
expressions in the struct initializer, because the file/line/function
information comes from the caller of s_lock().

Give up on using a macro, and use a static inline instead.

Discussion: 4369.1460435533@sss.pgh.pa.us
2016-04-14 19:26:13 -07:00
Andres Freund 533cd2303a Remove trailing commas in enums.
These aren't valid C89. Found thanks to gcc's -Wc90-c99-compat. These
exist in differing places in most supported branches.
2016-04-14 19:25:16 -07:00
Andres Freund 7b16781228 Fix trivial typo. 2016-04-14 19:25:16 -07:00
Tom Lane 6a3d3965d6 Fix core dump in ReorderBufferRestoreChange on alignment-picky platforms.
When re-reading an update involving both an old tuple and a new tuple from
disk, reorderbuffer.c was careless about whether the new tuple is suitably
aligned for direct access --- in general, it isn't.  We'd missed seeing
this in the buildfarm because the contrib/test_decoding tests exercise this
code path only a few times, and by chance all of those cases have old
tuples with length a multiple of 4, which is usually enough to make the
access to the new tuple's t_len safe.  For some still-not-entirely-clear
reason, however, Debian's sparc build gets a bus error, as reported by
Christoph Berg; perhaps it's assuming 8-byte alignment of the pointer?

The lack of previous field reports is probably because you need all of
these conditions to trigger a crash: an alignment-picky platform (not
Intel), a transaction large enough to spill to disk, an update within
that xact that changes a primary-key field and has an odd-length old tuple,
and of course logical decoding tracing the transaction.

Avoid the alignment assumption by using memcpy instead of fetching t_len
directly, and add a test case that exposes the crash on picky platforms.
Back-patch to 9.4 where the bug was introduced.

Discussion: <20160413094117.GC21485@msg.credativ.de>
2016-04-14 19:42:21 -04:00
Tom Lane c2dc194bdb Adjust signature of walrcv_receive hook.
Commit 314cbfc5da redefined the signature of this hook as
typedef int (*walrcv_receive_type) (char **buffer, int *wait_fd);

But in fact the type of the "wait_fd" variable ought to be pgsocket,
which is what WaitLatchOrSocket expects, and which is necessary if
we want to be able to assign PGINVALID_SOCKET to it on Windows.
So fix that.
2016-04-14 13:49:37 -04:00
Tom Lane 994f112573 Adjust datatype of ReplicationState.acquired_by.
It was declared as "pid_t", which would be fine except that none of
the places that printed it in error messages took any thought for the
possibility that it's not equivalent to "int".  This leads to warnings
on some buildfarm members, and could possibly lead to actually wrong
error messages on those platforms.  There doesn't seem to be any very
good reason not to just make it "int"; it's only ever assigned from
MyProcPid, which is int.  If we want to cope with PIDs that are wider
than int, this is not the place to start.

Also, fix the comment, which seems to perhaps be a leftover from a time
when the field was only a bool?

Per buildfarm.  Back-patch to 9.5 which has same issue.
2016-04-14 12:18:09 -04:00
Tom Lane fda21aa05b Docs: clarify description of LIMIT/OFFSET behavior.
Section 7.6 was a tad confusing because it specified what LIMIT NULL
does, but neglected to do the same for OFFSET NULL, making this look
like perhaps a special case or a wrong restatement of the bit about
LIMIT ALL.  Wordsmith a bit while at it.  Per bug #14084.
2016-04-14 10:57:29 -04:00
Tom Lane 22989a8e34 Fix prototype of pgwin32_bind().
I (tgl) had copied-and-pasted this from pgwin32_accept(), failing to
notice that the third parameter should be "int" not "int *".

David Rowley
2016-04-14 09:44:21 -04:00
Tom Lane 92a30a7eb0 Fix broken dependency-mongering for index operator classes/families.
For a long time, opclasscmds.c explained that "we do not create a
dependency link to the AM [for an opclass or opfamily], because we don't
currently support DROP ACCESS METHOD".  Commit 473b932870 invented
DROP ACCESS METHOD, but it batted only 1 for 2 on adding the dependency
links, and 0 for 2 on updating the comments about the topic.

In passing, undo the same commit's entirely inappropriate decision to
blow away an existing index as a side-effect of create_am.sql.
2016-04-13 23:33:31 -04:00
Fujii Masao c8cb745323 Fix duplicated index entry in doc.
Commit cfe96ae corrected the name of pg_logical_emit_message()
in its index entry. But this typo fix caused duplicated index
entry because there was another index entry for the function.

Spotted by Tom Lane.
2016-04-14 11:17:41 +09:00
Stephen Frost bfed4ab824 Disallow SET SESSION AUTHORIZATION pg_*
As part of reserving the pg_* namespace for default roles and in line
with SET ROLE and other previous efforts, disallow settings the role
to a default/reserved role using SET SESSION AUTHORIZATION.

These checks and restrictions on what is allowed regarding default /
reserved roles are under debate, but it seems prudent to ensure that
the existing checks at least cover the intended cases while the
debate rages on.  On me to clean it up if the consensus decision is
to remove these checks.
2016-04-13 21:31:24 -04:00
Andres Freund be65eddd80 Add required database and origin filtering for logical messages.
Logical messages, added in 3fe3511d05, during decoding failed to filter
messages emitted in other databases and messages emitted "under" a
replication origin the output plugin isn't interested in.

Add tests to verify that both types of filtering actually work. While
touching message.sql remove hunk obsoleted by d25379e.

Bump XLOG_PAGE_MAGIC because xl_logical_message changed and because
3fe3511d05 had omitted doing so. 3fe3511d05 additionally didn't bump
catversion, but 7a542700d has done so since.

Author: Petr Jelinek
Reported-By: Andres Freund
Discussion: 20160406142513.wotqy3ba3kanr423@alap3.anarazel.de
2016-04-13 17:38:54 -07:00
Andres Freund 80abbeba23 Make init_spin_delay() C89 compliant and change stuck spinlock reporting.
The current definition of init_spin_delay (introduced recently in
48354581a) wasn't C89 compliant. It's not legal to refer to refer to
non-constant expressions, and the ptr argument was one.  This, as
reported by Tom, lead to a failure on buildfarm animal pademelon.

The pointer, especially on system systems with ASLR, isn't super helpful
anyway, though. So instead of making init_spin_delay into an inline
function, make s_lock_stuck() report the function name in addition to
file:line and change init_spin_delay() accordingly. While not a direct
replacement, the function name is likely more useful anyway (line
numbers are often hard to interpret in third party reports).

This also fixes what file/line number is reported for waits via
s_lock().

As PG_FUNCNAME_MACRO is now used outside of elog.h, move it to c.h.

Reported-By: Tom Lane
Discussion: 4369.1460435533@sss.pgh.pa.us
2016-04-13 17:00:53 -07:00
Tom Lane 6cead413bb Fix pg_dump so pg_upgrade'ing an extension with simple opfamilies works.
As reported by Michael Feld, pg_upgrade'ing an installation having
extensions with operator families that contain just a single operator class
failed to reproduce the extension membership of those operator families.
This caused no immediate ill effects, but would create problems when later
trying to do a plain dump and restore, because the seemingly-not-part-of-
the-extension operator families would appear separately in the pg_dump
output, and then would conflict with the families created by loading the
extension.  This has been broken ever since extensions were introduced,
and many of the standard contrib extensions are affected, so it's a bit
astonishing nobody complained before.

The cause of the problem is a perhaps-ill-considered decision to omit
such operator families from pg_dump's output on the grounds that the
CREATE OPERATOR CLASS commands could recreate them, and having explicit
CREATE OPERATOR FAMILY commands would impede loading the dump script into
pre-8.3 servers.  Whatever the merits of that decision when 8.3 was being
written, it looks like a poor tradeoff now.  We can fix the pg_upgrade
problem simply by removing that code, so that the operator families are
dumped explicitly (and then will be properly made to be part of their
extensions).

Although this fixes the behavior of future pg_upgrade runs, it does nothing
to clean up existing installations that may have improperly-linked operator
families.  Given the small number of complaints to date, maybe we don't
need to worry about providing an automated solution for that; anyone who
needs to clean it up can do so with manual "ALTER EXTENSION ADD OPERATOR
FAMILY" commands, or even just ignore the duplicate-opfamily errors they
get during a pg_restore.  In any case we need this fix.

Back-patch to all supported branches.

Discussion: <20228.1460575691@sss.pgh.pa.us>
2016-04-13 18:58:14 -04:00
Andres Freund 6b93fcd149 Avoid atomic operation in MarkLocalBufferDirty().
The recent patch to make Pin/UnpinBuffer lockfree in the hot
path (48354581a), accidentally used pg_atomic_fetch_or_u32() in
MarkLocalBufferDirty(). Other code operating on local buffers was
careful to only use pg_atomic_read/write_u32 which just read/write from
memory; to avoid unnecessary overhead.

On its own that'd just make MarkLocalBufferDirty() slightly less
efficient, but in addition InitLocalBuffers() doesn't call
pg_atomic_init_u32() - thus the spinlock fallback for the atomic
operations isn't initialized. That in turn caused, as reported by Tom,
buildfarm animal gaur to fail.  As those errors are actually useful
against this type of error, continue to omit - intentionally this time -
initialization of the atomic variable.

In addition, add an explicit note about only using pg_atomic_read/write
on local buffers's state to BufferDesc's description.

Reported-By: Tom Lane
Discussion: 1881.1460431476@sss.pgh.pa.us
2016-04-13 15:28:29 -07:00
Tom Lane 95ef43c430 Widen amount-to-flush arguments of FileWriteback and callers.
It's silly to define these counts as narrower than they might someday
need to be.  Also, I believe that the BLCKSZ * nflush calculation in
mdwriteback was capable of overflowing an int.
2016-04-13 18:12:06 -04:00
Tom Lane fa11a09fed Fix assorted portability issues with using msync() for data flushing.
Commit 428b1d6b29 introduced the use of
msync() for flushing dirty data from the kernel's file buffers.  Several
portability issues were overlooked, though:

* Not all implementations of mmap() think that nbytes == 0 means "map
the whole file".  To fix, use lseek() to find out the true length.
Fix callers of pg_flush_data to be aware that nbytes == 0 may result
in trashing the file's seek position.

* Not all implementations of mmap() will accept partial-page mmap
requests.  To fix, round down the length request to whatever sysconf()
says the page size is.  (I think this is OK from a portability standpoint,
because sysconf() is required by SUS v2, and we aren't trying to compile
this part on Windows anyway.  Buildfarm should let us know if not.)

* On 32-bit machines, the file size might exceed the available free
address space, or even exceed what will fit in size_t.  Check for
the latter explicitly to avoid passing a false request size to mmap().
If mmap fails, silently fall through to the next implementation method,
rather than bleating to the postmaster log and giving up.

* mmap'ing directories fails on some platforms, and even if it works,
msync'ing the directory is quite unlikely to help, as for that matter are
the other flush implementations.  In pre_sync_fname(), just skip flush
attempts on directories.

In passing, copy-edit the comments a bit.

Stas Kelvich and myself
2016-04-13 17:17:51 -04:00
Tom Lane 85e0047077 Improve documentation for \crosstabview.
Fix misleading syntax summary (there cannot be a space between colH and
scolH).  Provide a link from the existing crosstab() function's
documentation to \crosstabview.  Copy-edit the command's description.

Christoph Berg and Tom Lane
2016-04-13 11:49:47 -04:00
Robert Haas cbb2a812d7 Use PG_INT32_MIN instead of reiterating the constant.
Makes no difference, but it's cleaner this way.

Michael Paquier
2016-04-13 07:54:45 -04:00