Commit Graph

37874 Commits

Author SHA1 Message Date
Tom Lane cb66f495f5 Fix misuse of memcpy() in check_ip().
The previous coding copied garbage into a local variable, pretty much
ensuring that the intended test of an IPv6 connection address against a
promoted IPv4 address from pg_hba.conf would never match.  The lack of
field complaints likely indicates that nobody realized this was supposed
to work, which is unsurprising considering that no user-facing docs suggest
it should work.

In principle this could have led to a SIGSEGV due to reading off the end of
memory, but since the source address would have pointed to somewhere in the
function's stack frame, that's quite unlikely.  What led to discovery of
the bug is Hugo Osvaldo Barrera's report of a crash after an OS upgrade,
which is probably because he is now running a system in which memcpy raises
abort() upon detecting overlapping source and destination areas.  (You'd
have to additionally suppose some things about the stack frame layout to
arrive at this conclusion, but it seems plausible.)

This has been broken since the code was added, in commit f3aec2c7f5,
so back-patch to all supported branches.
2015-02-16 16:18:31 -05:00
Heikki Linnakangas c478959a00 Fix comment in libpq OpenSSL code about why a substitue BIO is used.
The comment was copy-pasted from the backend code along with the
implementation, but libpq has different reasons for using the BIO.
2015-02-16 23:05:20 +02:00
Heikki Linnakangas 1c2b7c0879 Restore the SSL_set_session_id_context() call to OpenSSL renegotiation.
This reverts the removal of the call in commit (272923a0). It turns out it
wasn't superfluous after all: without it, renegotiation fails if a client
certificate was used. The rest of the changes in that commit are still OK
and not reverted.

Per investigation of bug #12769 by Arne Scheffer, although this doesn't fix
the reported bug yet.
2015-02-16 22:34:32 +02:00
Tom Lane 9e3ad1aac5 Use fast path in plpgsql's RETURN/RETURN NEXT in more cases.
exec_stmt_return() and exec_stmt_return_next() have fast-path code for
handling a simple variable reference (i.e. "return var") without going
through the full expression evaluation machinery.  For some reason,
pl_gram.y was under the impression that this fast path only applied for
record/row variables; but in reality code for handling regular scalar
variables has been there all along.  Adjusting the logic to allow that
code to be used actually results in a net savings of code in pl_gram.y
(by eliminating some redundancy), and it buys a measurable though not
very impressive amount of speedup.

Noted while fooling with my expanded-array patch, wherein this makes a much
bigger difference because it enables returning an expanded array variable
without an extra flattening step.  But AFAICS this is a win regardless,
so commit it separately.
2015-02-16 15:28:48 -05:00
Heikki Linnakangas 2c75531a6c In the SSL test suite, use a root CA cert that won't expire (so quickly)
All the other certificates were created to be valid for 10000 days, because
we don't want to have to recreate them. But I missed the root CA cert, and
the pre-created certificates included in the repository expired in January.
Fix, and re-create all the certificates.
2015-02-16 22:11:43 +02:00
Tom Lane e983c4d1aa Rationalize the APIs of array element/slice access functions.
The four functions array_ref, array_set, array_get_slice, array_set_slice
have traditionally declared their array inputs and results as being of type
"ArrayType *".  This is a lie, and has been since Berkeley days, because
they actually also support "fixed-length array" types such as "name" and
"point"; not to mention that the inputs could be toasted.  These values
should be declared Datum instead to avoid confusion.  The current coding
already risks possible misoptimization by compilers, and it'll get worse
when "expanded" array representations become a valid alternative.

However, there's a fair amount of code using array_ref and array_set with
arrays that *are* known to be ArrayType structures, and there might be more
such places in third-party code.  Rather than cluttering those call sites
with PointerGetDatum/DatumGetArrayTypeP cruft, what I did was to rename the
existing functions to array_get_element/array_set_element, fix their
signatures, then reincarnate array_ref/array_set as backwards compatibility
wrappers.

array_get_slice/array_set_slice have no such constituency in the core code,
and probably not in third-party code either, so I just changed their APIs.
2015-02-16 12:23:58 -05:00
Fujii Masao cef30974de Correct the path of pg_lzcompress.c in doc.
Commit 40bede5 moved pg_lzcompress.c to src/common, but forgot to
update its path in doc. This commit fixes that oversight.
2015-02-16 14:50:13 +09:00
Tom Lane 08361cea2b Fix null-pointer-deref crash while doing COPY IN with check constraints.
In commit bf7ca15875 I introduced an
assumption that an RTE referenced by a whole-row Var must have a valid eref
field.  This is false for RTEs constructed by DoCopy, and there are other
places taking similar shortcuts.  Perhaps we should make all those places
go through addRangeTableEntryForRelation or its siblings instead of having
ad-hoc logic, but the most reliable fix seems to be to make the new code in
ExecEvalWholeRowVar cope if there's no eref.  We can reasonably assume that
there's no need to insert column aliases if no aliases were provided.

Add a regression test case covering this, and also verifying that a sane
column name is in fact available in this situation.

Although the known case only crashes in 9.4 and HEAD, it seems prudent to
back-patch the code change to 9.2, since all the ingredients for a similar
failure exist in the variant patch applied to 9.3 and 9.2.

Per report from Jean-Pierre Pelletier.
2015-02-15 23:26:45 -05:00
Andrew Dunstan c5b31e29f0 Minor housekeeping on JSON docs. 2015-02-15 17:08:37 -05:00
Peter Eisentraut 64cdbbc48c pg_regress: Write processed input/*.source into output dir
Before, it was writing the processed files into the input directory,
which is incorrect in a vpath build.
2015-02-14 21:33:41 -05:00
Tom Lane 80986e85aa Avoid returning undefined bytes in chkpass_in().
We can't really fix the problem that the result is defined to depend on
random(), so it is still going to fail the "unstable input conversion"
test in parse_type.c.  However, we can at least satify valgrind.  (It
looks like this code used to be valgrind-clean, actually, until somebody
did a careless s/strncpy/strlcpy/g on it.)

In passing, let's just make real sure that chkpass_out doesn't overrun
its output buffer.

No need for backpatch, I think, since this is just to satisfy debugging
tools.

Asif Naeem
2015-02-14 12:20:56 -05:00
Heikki Linnakangas 33e879c4e9 Fix broken #ifdef for __sparcv8
Rob Rowan. Backpatch to all supported versions, like the patch that added
the broken #ifdef.
2015-02-13 23:56:25 +02:00
Heikki Linnakangas 80788a431e Simplify waiting logic in reading from / writing to client.
The client socket is always in non-blocking mode, and if we actually want
blocking behaviour, we emulate it by sleeping and retrying. But we have
retry loops at different layers for reads and writes, which was confusing.
To simplify, remove all the sleeping and retrying code from the lower
levels, from be_tls_read and secure_raw_read and secure_raw_write, and put
all the logic in secure_read() and secure_write().
2015-02-13 21:46:14 +02:00
Heikki Linnakangas 272923a0a6 Simplify the way OpenSSL renegotiation is initiated in server.
At least in all modern versions of OpenSSL, it is enough to call
SSL_renegotiate() once, and then forget about it. Subsequent SSL_write()
and SSL_read() calls will finish the handshake.

The SSL_set_session_id_context() call is unnecessary too. We only have
one SSL context, and the SSL session was created with that to begin with.
2015-02-13 21:46:08 +02:00
Bruce Momjian dc01efa5cc pg_upgrade: improve checksum mismatch error message
Patch by Greg Sabino Mullane, slight adjustments by me
2015-02-11 22:22:26 -05:00
Bruce Momjian 056764b102 pg_upgrade: quote directory names in delete_old_cluster script
This allows the delete script to properly function when special
characters appear in directory paths, e.g. spaces.

Backpatch through 9.0
2015-02-11 22:06:04 -05:00
Bruce Momjian 866f3017a8 pg_upgrade: preserve freeze info for postgres/template1 dbs
pg_database.datfrozenxid and pg_database.datminmxid were not preserved
for the 'postgres' and 'template1' databases.  This could cause missing
clog file errors on access to user tables and indexes after upgrades in
these databases.

Backpatch through 9.0
2015-02-11 21:02:44 -05:00
Andres Freund 8785e6e378 Fix typo in logicaldecoding.sgml.
Author: Tatsuo Ishii

Backpatch to 9.4, where logicaldecoding was introduced.
2015-02-12 01:25:35 +01:00
Tom Lane 4f38a281a3 Fix missing PQclear() in libpqrcv_endstreaming().
This omission leaked one PGresult per WAL streaming cycle, which possibly
would never be enough to notice in the real world, but it's still a leak.

Per Coverity.  Back-patch to 9.3 where the error was introduced.
2015-02-11 19:20:49 -05:00
Tom Lane 58146d35de Fix minor memory leak in ident_inet().
We'd leak the ident_serv data structure if the second pg_getaddrinfo_all
(the one for the local address) failed.  This is not of great consequence
because a failure return here just leads directly to backend exit(), but
if this function is going to try to clean up after itself at all, it should
not have such holes in the logic.  Try to fix it in a future-proof way by
having all the failure exits go through the same cleanup path, rather than
"optimizing" some of them.

Per Coverity.  Back-patch to 9.2, which is as far back as this patch
applies cleanly.
2015-02-11 19:09:54 -05:00
Tom Lane 9179444d07 Fix more memory leaks in failure path in buildACLCommands.
We already had one go at this issue in commit d73b7f973d, but we
failed to notice that buildACLCommands also leaked several PQExpBuffers
along with a simply malloc'd string.  This time let's try to make the
fix a bit more future-proof by eliminating the separate exit path.

It's still not exactly critical because pg_dump will curl up and die on
failure; but since the amount of the potential leak is now several KB,
it seems worth back-patching as far as 9.2 where the previous fix landed.

Per Coverity, which evidently is smarter than clang's static analyzer.
2015-02-11 18:35:23 -05:00
Tom Lane 9feefedf9e Fix pg_dump's heuristic for deciding which casts to dump.
Back in 2003 we had a discussion about how to decide which casts to dump.
At the time pg_dump really only considered an object's containing schema
to decide what to dump (ie, dump whatever's not in pg_catalog), and so
we chose a complicated idea involving whether the underlying types were to
be dumped (cf commit a6790ce857).  But users
are allowed to create casts between built-in types, and we failed to dump
such casts.  Let's get rid of that heuristic, which has accreted even more
ugliness since then, in favor of just looking at the cast's OID to decide
if it's a built-in cast or not.

In passing, also fix some really ancient code that supposed that it had to
manufacture a dependency for the cast on its cast function; that's only
true when dumping from a pre-7.3 server.  This just resulted in some wasted
cycles and duplicate dependency-list entries with newer servers, but we
might as well improve it.

Per gripes from a number of people, most recently Greg Sabino Mullane.
Back-patch to all supported branches.
2015-02-10 22:38:15 -05:00
Tom Lane 1a179f36f7 Fix GEQO to not assume its join order heuristic always works.
Back in commit 400e2c9344 I rewrote GEQO's
gimme_tree function to improve its heuristic for modifying the given tour
into a legal join order.  In what can only be called a fit of hubris,
I supposed that this new heuristic would *always* find a legal join order,
and ripped out the old logic that allowed gimme_tree to sometimes fail.

The folly of this is exposed by bug #12760, in which the "greedy" clumping
behavior of merge_clump() can lead it into a dead end which could only be
recovered from by un-clumping.  We have no code for that and wouldn't know
exactly what to do with it if we did.  Rather than try to improve the
heuristic rules still further, let's just recognize that it *is* a
heuristic and probably must always have failure cases.  So, put back the
code removed in the previous commit to allow for failure (but comment it
a bit better this time).

It's possible that this code was actually fully correct at the time and
has only been broken by the introduction of LATERAL.  But having seen this
example I no longer have much faith in that proposition, so back-patch to
all supported branches.
2015-02-10 20:37:19 -05:00
Michael Meskes 1f393fc923 Fixed array handling in ecpg.
When ecpg was rewritten to the new protocol version not all variable types
were corrected. This patch rewrites the code for these types to fix that. It
also fixes the documentation to correctly tell the status of array handling.
2015-02-10 12:04:10 +01:00
Heikki Linnakangas 025c02420d Speed up CRC calculation using slicing-by-8 algorithm.
This speeds up WAL generation and replay. The new algorithm is
significantly faster with large inputs, like full-page images or when
inserting wide rows. It is slower with tiny inputs, i.e. less than 10 bytes
or so, but the speedup with longer inputs more than make up for that. Even
small WAL records at least have 24 byte header in the front.

The output is identical to the current byte-at-a-time computation, so this
does not affect compatibility. The new algorithm is only used for the
CRC-32C variant, not the legacy version used in tsquery or the
"traditional" CRC-32 used in hstore and ltree. Those are not as performance
critical, and are usually only applied over small inputs, so it seems
better to not carry around the extra lookup tables to speed up those rare
cases.

Abhijit Menon-Sen
2015-02-10 10:54:40 +02:00
Heikki Linnakangas cc761b170c Fix MSVC build.
When I moved pg_crc.c from src/port to src/common, I forgot to modify MSVC
build script accordingly.
2015-02-09 22:13:50 +02:00
Tom Lane bc4de01db3 Minor cleanup/code review for "indirect toast" stuff.
Fix some issues I noticed while fooling with an extension to allow an
additional kind of toast pointer.  Much of this is just comment
improvement, but there are a couple of actual bugs, which might or might
not be reachable today depending on what can happen during logical
decoding.  An example is that toast_flatten_tuple() failed to cover the
possibility of an indirection pointer in its input.  Back-patch to 9.4
just in case that is reachable now.

In HEAD, also correct some really minor issues with recent compression
reorganization, such as dangerously underparenthesized macros.
2015-02-09 12:30:52 -05:00
Heikki Linnakangas c619c2351f Move pg_crc.c to src/common, and remove pg_crc_tables.h
To get CRC functionality in a client program, you now need to link with
libpgcommon instead of libpgport. The CRC code has nothing to do with
portability, so libpgcommon is a better home. (libpgcommon didn't exist
when pg_crc.c was originally moved to src/port.)

Remove the possibility to get CRC functionality by just #including
pg_crc_tables.h. I'm not aware of any extensions that actually did that and
couldn't simply link with libpgcommon.

This also moves the pg_crc.h header file from src/include/utils to
src/include/common, which will require changes to any external programs
that currently does #include "utils/pg_crc.h". That seems acceptable, as
include/common is clearly the right home for it now, and the change needed
to any such programs is trivial.
2015-02-09 11:17:56 +02:00
Fujii Masao 40bede5477 Move pg_lzcompress.c to src/common.
The meta data of PGLZ symbolized by PGLZ_Header is removed, to make
the compression and decompression code independent on the backend-only
varlena facility. PGLZ_Header is being used to store some meta data
related to the data being compressed like the raw length of the uncompressed
record or some varlena-related data, making it unpluggable once PGLZ is
stored in src/common as it contains some backend-only code paths with
the management of varlena structures. The APIs of PGLZ are reworked
at the same time to do only compression and decompression of buffers
without the meta-data layer, simplifying its use for a more general usage.

On-disk format is preserved as well, so there is no incompatibility with
previous major versions of PostgreSQL for TOAST entries.

Exposing compression and decompression APIs of pglz makes possible its
use by extensions and contrib modules. Especially this commit is required
for upcoming WAL compression feature so that the WAL reader facility can
decompress the WAL data by using pglz_decompress.

Michael Paquier, reviewed by me.
2015-02-09 15:15:24 +09:00
Noah Misch 237795a7b4 Check DCH_MAX_ITEM_SIZ limits with <=, not <.
We reserve space for the full amount, not one less.  The affected checks
deal with localized month and day names.  Today's DCH_MAX_ITEM_SIZ value
would suffice for a 60-byte day name, while the longest known is the
49-byte mn_CN.utf-8 word for "Saturday."  Thus, the upshot of this
change is merely to avoid misdirecting future readers of the code; users
are not expected to see errors either way.
2015-02-06 23:39:52 -05:00
Noah Misch a7a4adcf8d Assert(PqCommReadingMsg) in pq_peekbyte().
Interrupting pq_recvbuf() can break protocol sync, so its callers all
deserve this assertion.  The one pq_peekbyte() caller suffices already.
2015-02-06 23:14:27 -05:00
Heikki Linnakangas ff16b40f8c Report WAL flush, not insert, position in replication IDENTIFY_SYSTEM
When beginning streaming replication, the client usually issues the
IDENTIFY_SYSTEM command, which used to return the current WAL insert
position. That's not suitable for the intended purpose of that field,
however. pg_receivexlog uses it to start replication from the reported
point, but if it hasn't been flushed to disk yet, it will fail. Change
IDENTIFY_SYSTEM to report the flush position instead.

Backpatch to 9.1 and above. 9.0 doesn't report any WAL position.
2015-02-06 11:26:50 +02:00
Michael Meskes 5ee5bc3873 This routine was calling ecpg_alloc to allocate to memory but did not
actually check the returned pointer allocated, potentially NULL which
could be the result of a malloc call.

Issue noted by Coverity, fixed by Michael Paquier <michael@otacoo.com>
2015-02-05 15:12:34 +01:00
Heikki Linnakangas d88976cfa1 Use a separate memory context for GIN scan keys.
It was getting tedious to track and release all the different things that
form a scan key. We were leaking at least the queryCategories array, and
possibly more, on a rescan. That was visible if a GIN index was used in a
nested loop join. This also protects from leaks in extractQuery method.

No backpatching, given the lack of complaints from the field. Maybe later,
after this has received more field testing.
2015-02-04 17:40:25 +02:00
Heikki Linnakangas 57fe246890 Fix reference-after-free when waiting for another xact due to constraint.
If an insertion or update had to wait for another transaction to finish,
because there was another insertion with conflicting key in progress,
we would pass a just-free'd item pointer to XactLockTableWait().

All calls to XactLockTableWait() and MultiXactIdWait() had similar issues.
Some passed a pointer to a buffer in the buffer cache, after already
releasing the lock. The call in EvalPlanQualFetch had already released the
pin too. All but the call in execUtils.c would merely lead to reporting a
bogus ctid, however (or an assertion failure, if enabled).

All the callers that passed HeapTuple->t_data->t_ctid were slightly bogus
anyway: if the tuple was updated (again) in the same transaction, its ctid
field would point to the next tuple in the chain, not the tuple itself.

Backpatch to 9.4, where the 'ctid' argument to XactLockTableWait was added
(in commit f88d4cfc)
2015-02-04 16:00:34 +02:00
Robert Haas 370b3a4618 pgcrypto: Code cleanup for decrypt_internal.
Remove some unnecessary null-tests, and replace a goto-label construct
with an "if" block.

Michael Paquier, reviewed by me.
2015-02-04 08:46:32 -05:00
Heikki Linnakangas c31b5d9ddf Fix memory leaks on OOM in ecpg.
These are fairly obscure cases, but let's keep Coverity happy.

Michael Paquier with some further fixes by me.
2015-02-04 14:55:30 +02:00
Andres Freund ff8ca3b04c Add missing float.h include to snprintf.c.
On windows _isnan() (which isnan() is redirected to in port/win32.h)
is declared in float.h, not math.h.

Per buildfarm animal currawong.

Backpatch to all supported branches.
2015-02-04 13:27:31 +01:00
Fujii Masao c036edb7b3 doc: Fix markup
Ian Barwick
2015-02-04 19:00:09 +09:00
Heikki Linnakangas 302262d521 Add dummy PQsslAttributes function for non-SSL builds.
All the other new SSL information functions had dummy versions in
be-secure.c, but I missed PQsslAttributes(). Oops. Surprisingly, the linker
did not complain about the missing function on most platforms represented in
the buildfarm, even though it is exported, except for a few Windows systems.
2015-02-04 09:13:15 +02:00
Andres Freund 3a54f4a494 Remove ill-conceived Assertion in ProcessClientWriteInterrupt().
It's perfectly fine to have blocked interrupts when
ProcessClientWriteInterrupt() is called. In fact it's commonly the
case when emitting error reports. And we deal with that correctly.

Even if that'd not be the case, it'd be a bad location for such a
assertion. Because ProcessClientWriteInterrupt() is only called when
the socket is blocked it's hard to hit.

Per Heikki and buildfarm animals nightjar and dunlin.
2015-02-03 23:52:15 +01:00
Andres Freund 2505ce0be0 Remove remnants of ImmediateInterruptOK handling.
Now that nothing sets ImmediateInterruptOK to true anymore, we can
remove all the supporting code.

Reviewed-By: Heikki Linnakangas
2015-02-03 23:25:47 +01:00
Andres Freund d06995710b Remove the option to service interrupts during PGSemaphoreLock().
The remaining caller (lwlocks) doesn't need that facility, and we plan
to remove ImmedidateInterruptOK entirely. That means that interrupts
can't be serviced race-free and portably anyway, so there's little
reason for keeping the feature.

Reviewed-By: Heikki Linnakangas
2015-02-03 23:25:00 +01:00
Andres Freund 6753333f55 Move deadlock and other interrupt handling in proc.c out of signal handlers.
Deadlock checking was performed inside signal handlers up to
now. While it's a remarkable feat to have made this work reliably,
it's quite complex to understand why that is the case. Partially it
worked due to the assumption that semaphores are signal safe - which
is not actually documented to be the case for sysv semaphores.

The reason we had to rely on performing this work inside signal
handlers is that semaphores aren't guaranteed to be interruptable by
signals on all platforms. But now that latches provide a somewhat
similar API, which actually has the guarantee of being interruptible,
we can avoid doing so.

Signalling between ProcSleep, ProcWakeup, ProcWaitForSignal and
ProcSendSignal is now done using latches. This increases the
likelihood of spurious wakeups. As spurious wakeup already were
possible and aren't likely to be frequent enough to be an actual
problem, this seems acceptable.

This change would allow for further simplification of the deadlock
checking, now that it doesn't have to run in a signal handler. But
even if I were motivated to do so right now, it would still be better
to do that separately. Such a cleanup shouldn't have to be reviewed a
the same time as the more fundamental changes in this commit.

There is one possible usability regression due to this commit. Namely
it is more likely than before that log_lock_waits messages are output
more than once.

Reviewed-By: Heikki Linnakangas
2015-02-03 23:24:38 +01:00
Andres Freund 6647248e37 Don't allow immediate interrupts during authentication anymore.
We used to handle authentication_timeout by setting
ImmediateInterruptOK to true during large parts of the authentication
phase of a new connection.  While that happens to work acceptably in
practice, it's not particularly nice and has ugly corner cases.

Previous commits converted the FE/BE communication to use latches and
implemented support for interrupt handling during both
send/recv. Building on top of that work we can get rid of
ImmediateInterruptOK during authentication, by immediately treating
timeouts during authentication as a reason to die. As die interrupts
are handled immediately during client communication that provides a
sensibly quick reaction time to authentication timeout.

Additionally add a few CHECK_FOR_INTERRUPTS() to some more complex
authentication methods. More could be added, but this already should
provides a reasonable coverage.

While it this overall increases the maximum time till a timeout is
reacted to, it greatly reduces complexity and increases
reliability. That seems like a overall win. If the increase proves to
be noticeable we can deal with those cases by moving to nonblocking
network code and add interrupt checking there.

Reviewed-By: Heikki Linnakangas
2015-02-03 22:54:48 +01:00
Tom Lane cec916f35b Remove unused "m" field in LSEG.
This field has been unreferenced since 1998, and does not appear in lseg
values stored on disk (since sizeof(lseg) is only 32 bytes according to
pg_type).  There was apparently some idea of maintaining it just in values
appearing in memory, but the bookkeeping required to make that work would
surely far outweigh the cost of recalculating the line's slope when needed.
Remove it to (a) simplify matters and (b) suppress some uninitialized-field
whining from Coverity.
2015-02-03 16:53:32 -05:00
Andres Freund 4fe384bd85 Process 'die' interrupts while reading/writing from the client socket.
Up to now it was impossible to terminate a backend that was trying to
send/recv data to/from the client when the socket's buffer was already
full/empty. While the send/recv calls itself might have gotten
interrupted by signals on some platforms, we just immediately retried.

That could lead to situations where a backend couldn't be terminated ,
after a client died without the connection being closed, because it
was blocked in send/recv.

The problem was far more likely to be hit when sending data than when
reading. That's because while reading a command from the client, and
during authentication, we processed interrupts immediately . That
primarily left COPY FROM STDIN as being problematic for recv.

Change things so that that we process 'die' events immediately when
the appropriate signal arrives. We can't sensibly react to query
cancels at that point, because we might loose sync with the client as
we could be in the middle of writing a message.

We don't interrupt writes if the write buffer isn't full, as indicated
by write() returning EWOULDBLOCK, as that would lead to fewer error
messages reaching clients.

Per discussion with Kyotaro HORIGUCHI and Heikki Linnakangas

Discussion: 20140927191243.GD5423@alap3.anarazel.de
2015-02-03 22:45:45 +01:00
Andres Freund 4f85fde8eb Introduce and use infrastructure for interrupt processing during client reads.
Up to now large swathes of backend code ran inside signal handlers
while reading commands from the client, to allow for speedy reaction to
asynchronous events. Most prominently shared invalidation and NOTIFY
handling. That means that complex code like the starting/stopping of
transactions is run in signal handlers...  The required code was
fragile and verbose, and is likely to contain bugs.

That approach also severely limited what could be done while
communicating with the client. As the read might be from within
openssl it wasn't safely possible to trigger an error, e.g. to cancel
a backend in idle-in-transaction state. We did that in some cases,
namely fatal errors, nonetheless.

Now that FE/BE communication in the backend employs non-blocking
sockets and latches to block, we can quite simply interrupt reads from
signal handlers by setting the latch. That allows us to signal an
interrupted read, which is supposed to be retried after returning from
within the ssl library.

As signal handlers now only need to set the latch to guarantee timely
interrupt processing, remove a fair amount of complicated & fragile
code from async.c and sinval.c.

We could now actually start to process some kinds of interrupts, like
sinval ones, more often that before, but that seems better done
separately.

This work will hopefully allow to handle cases like being blocked by
sending data, interrupting idle transactions and similar to be
implemented without too much effort.  In addition to allowing getting
rid of ImmediateInterruptOK, that is.

Author: Andres Freund
Reviewed-By: Heikki Linnakangas
2015-02-03 22:25:20 +01:00
Andres Freund 387da18874 Use a nonblocking socket for FE/BE communication and block using latches.
This allows to introduce more elaborate handling of interrupts while
reading from a socket.  Currently some interrupt handlers have to do
significant work from inside signal handlers, and it's very hard to
correctly write code to do so.  Generic signal handler limitations,
combined with the fact that we can't safely jump out of a signal
handler while reading from the client have prohibited implementation
of features like timeouts for idle-in-transaction.

Additionally we use the latch code to wait in a couple places where we
previously only had waiting code on windows as other platforms just
busy looped.

This can increase the number of systemcalls happening during FE/BE
communication. Benchmarks so far indicate that the impact isn't very
high, and there's room for optimization in the latch code. The chance
of cleaning up the usage of latches gives us, seem to outweigh the
risk of small performance regressions.

This commit theoretically can't used without the next patch in the
series, as WaitLatchOrSocket is not defined to be fully signal
safe. As we already do that in some cases though, it seems better to
keep the commits separate, so they're easier to understand.

Author: Andres Freund
Reviewed-By: Heikki Linnakangas
2015-02-03 22:03:48 +01:00
Tom Lane 778d498c7d Fix breakage in GEODEBUG debug code.
LINE doesn't have an "m" field (anymore anyway).  Also fix unportable
assumption that %x can print the result of pointer subtraction.

In passing, improve single_decode() in minor ways:
* Remove unnecessary leading-whitespace skip (strtod does that already).
* Make GEODEBUG message more intelligible.
* Remove entirely-useless test to see if strtod returned a silly pointer.
* Don't bother computing trailing-whitespace skip unless caller wants
  an ending pointer.

This has been broken since 261c7d4b65.
Although it's only debug code, might as well fix the 9.4 branch too.
2015-02-03 15:20:45 -05:00