Commit Graph

347 Commits

Author SHA1 Message Date
Tom Lane 511540dadf Move isolationtester's is-blocked query into C code for speed.
Commit 4deb41381 modified isolationtester's query to see whether a
session is blocked to also check for waits occurring in GetSafeSnapshot.
However, it did that in a way that enormously increased the query's
runtime under CLOBBER_CACHE_ALWAYS, causing the buildfarm members
that use that to run about four times slower than before, and in some
cases fail entirely.  To fix, push the entire logic into a dedicated
backend function.  This should actually reduce the CLOBBER_CACHE_ALWAYS
runtime from what it was previously, though I've not checked that.

In passing, expose a SQL function to check for safe-snapshot blockage,
comparable to pg_blocking_pids.  This is more or less free given the
infrastructure built to solve the other problem, so we might as well.

Thomas Munro

Discussion: https://postgr.es/m/20170407165749.pstcakbc637opkax@alap3.anarazel.de
2017-04-10 10:26:54 -04:00
Kevin Grittner 4deb413813 Add isolation test for SERIALIZABLE READ ONLY DEFERRABLE.
This improves code coverage and lays a foundation for testing
similar issues in a distributed environment.

Author: Thomas Munro <thomas.munro@enterprisedb.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
2017-04-05 10:04:36 -05:00
Andrew Gierth 64ae420b27 Repair test for vacuum reltuples fix.
Concurrent auto-analyze could be holding a snapshot, affecting the
removal of deleted row versions.  Remove the deletion to avoid this
happening.  Per buildfarm.

In passing, make the test independent of assumptions of physical row
order, just out of sheer paranoia.
2017-03-17 14:35:54 +00:00
Andrew Gierth 1914c5ea7d Avoid having vacuum set reltuples to 0 on non-empty relations in the
presence of page pins, which leads to serious estimation errors in the
planner.  This particularly affects small heavily-accessed tables,
especially where locking (e.g. from FK constraints) forces frequent
vacuums for mxid cleanup.

Fix by keeping separate track of pages whose live tuples were actually
counted vs. pages that were only scanned for freezing purposes.  Thus,
reltuples can only be set to 0 if all pages of the relation were
actually counted.

Backpatch to all supported versions.

Per bug #14057 from Nicolas Baccelli, analyzed by me.

Discussion: https://postgr.es/m/20160331103739.8956.94469@wrigleys.postgresql.org
2017-03-16 22:28:03 +00:00
Andres Freund 60f826c5e6 Improve isolation tests infrastructure.
Previously if a directory had both isolationtester and plain
regression tests, they couldn't be run in parallel, because they'd
access the same files/directories.  That, so far, only affected
contrib/test_decoding.

Rather than fix that locally in contrib/test_decoding, improve
pg_regress_isolation_[install]check to use separate resources from
plain regression tests.

That requires a minor change in pg_regress, namely that the
--outputdir is created if not already existing, that seems like good
idea anyway.

Use the improved helpers even where previously not used.

Author: Tom Lane and Andres Freund
Discussion: https://postgr.es/m/20170311194831.vm5ikpczq52c2drg@alap3.anarazel.de
2017-03-14 15:56:17 -07:00
Tom Lane 9722bb5757 Fix inclusions of postgres_fe.h from .h files.
We have a project policy that every .c file should start by including
postgres.h, postgres_fe.h, or c.h as appropriate; and then there is no
need for any .h file to explicitly include any of these.  Fix a few
headers that were violating this policy by including postgres_fe.h.

Discussion: https://postgr.es/m/CAEepm=2zCoeq3QxVwhS5DFeUh=yU6z81pbWMgfOB8OzyiBwxzw@mail.gmail.com
Discussion: https://postgr.es/m/11634.1488932128@sss.pgh.pa.us
2017-03-08 20:41:06 -05:00
Tom Lane 9e3755ecb2 Remove useless duplicate inclusions of system header files.
c.h #includes a number of core libc header files, such as <stdio.h>.
There's no point in re-including these after having read postgres.h,
postgres_fe.h, or c.h; so remove code that did so.

While at it, also fix some places that were ignoring our standard pattern
of "include postgres[_fe].h, then system header files, then other Postgres
header files".  While there's not any great magic in doing it that way
rather than system headers last, it's silly to have just a few files
deviating from the general pattern.  (But I didn't attempt to enforce this
globally, only in files I was touching anyway.)

I'd be the first to say that this is mostly compulsive neatnik-ism,
but over time it might save enough compile cycles to be useful.
2017-02-25 16:12:55 -05:00
Heikki Linnakangas 181bdb90ba Fix typos in comments.
Backpatch to all supported versions, where applicable, to make backpatching
of future fixes go more smoothly.

Josh Soref

Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com
2017-02-06 11:33:58 +02:00
Bruce Momjian 1d25779284 Update copyright via script for 2017 2017-01-03 13:48:53 -05:00
Tom Lane a6c0a5b6e8 Don't throw serialization errors for self-conflicts in INSERT ON CONFLICT.
A transaction that conflicts against itself, for example
	INSERT INTO t(pk) VALUES (1),(1) ON CONFLICT DO NOTHING;
should behave the same regardless of isolation level.  It certainly
shouldn't throw a serialization error, as retrying will not help.
We got this wrong due to the ON CONFLICT logic not considering the case,
as reported by Jason Dusek.

Core of this patch is by Peter Geoghegan (based on an earlier patch by
Thomas Munro), though I didn't take his proposed code refactoring for fear
that it might have unexpected side-effects.  Test cases by Thomas Munro
and myself.

Report: <CAO3NbwOycQjt2Oqy2VW-eLTq2M5uGMyHnGm=RNga4mjqcYD7gQ@mail.gmail.com>
Related-Discussion: <57EE93C8.8080504@postgrespro.ru>
2016-10-23 18:36:13 -04:00
Tom Lane 96dd77d349 Be sure to rewind the tuplestore read pointer in non-leader CTEScan nodes.
ExecInitCteScan supposed that it didn't have to do anything to the extra
tuplestore read pointer it gets from tuplestore_alloc_read_pointer.
However, it needs this read pointer to be positioned at the start of the
tuplestore, while tuplestore_alloc_read_pointer is actually defined as
cloning the current position of read pointer 0.  In normal situations
that accidentally works because we initialize the whole plan tree at once,
before anything gets read.  But it fails in an EvalPlanQual recheck, as
illustrated in bug #14328 from Dima Pavlov.  To fix, just forcibly rewind
the pointer after tuplestore_alloc_read_pointer.  The cost of doing so is
negligible unless the tuplestore is already in TSS_READFILE state, which
wouldn't happen in normal cases.  We could consider altering tuplestore's
API to make that case cheaper, but that would make for a more invasive
back-patch and it doesn't seem worth it.

This has been broken probably for as long as we've had CTEs, so back-patch
to all supported branches.

Discussion: <32468.1474548308@sss.pgh.pa.us>
2016-09-22 11:35:03 -04:00
Tom Lane 052cc223d5 Fix a bunch of places that called malloc and friends with no NULL check.
Where possible, use palloc or pg_malloc instead; otherwise, insert
explicit NULL checks.

Generally speaking, these are places where an actual OOM is quite
unlikely, either because they're in client programs that don't
allocate all that much, or they're very early in process startup
so that we'd likely have had a fork() failure instead.  Hence,
no back-patch, even though this is nominally a bug fix.

Michael Paquier, with some adjustments by me

Discussion: <CAB7nPqRu07Ot6iht9i9KRfYLpDaF2ZuUv5y_+72uP23ZAGysRg@mail.gmail.com>
2016-08-30 18:22:43 -04:00
Andres Freund 9595383bc6 Add alternative output for ON CONFLICT toast isolation test.
On some buildfarm animals the isolationtest added in 07ef0351 failed, as
the order in which processes are run after unlocking is not
guaranteed. Add an alternative output for that.

Discussion: <7969.1471484738@sss.pgh.pa.us>
Backpatch: 9.6, like the test in the aforementioned commit
2016-08-18 17:34:05 -07:00
Andres Freund 07ef035129 Fix deletion of speculatively inserted TOAST on conflict
INSERT ..  ON CONFLICT runs a pre-check of the possible conflicting
constraints before performing the actual speculative insertion.  In case
the inserted tuple included TOASTed columns the ON CONFLICT condition
would be handled correctly in case the conflict was caught by the
pre-check, but if two transactions entered the speculative insertion
phase at the same time, one would have to re-try, and the code for
aborting a speculative insertion did not handle deleting the
speculatively inserted TOAST datums correctly.

TOAST deletion would fail with "ERROR: attempted to delete invisible
tuple" as we attempted to remove the TOAST tuples using
simple_heap_delete which reasoned that the given tuples should not be
visible to the command that wrote them.

This commit updates the heap_abort_speculative() function which aborts
the conflicting tuple to use itself, via toast_delete, for deleting
associated TOAST datums.  Like before, the inserted toast rows are not
marked as being speculative.

This commit also adds a isolationtester spec test, exercising the
relevant code path. Unfortunately 9.5 cannot handle two waiting
sessions, and thus cannot execute this test.

Reported-By: Viren Negi, Oskari Saarenmaa
Author: Oskari Saarenmaa, edited a bit by me
Bug: #14150
Discussion: <20160519123338.12513.20271@wrigleys.postgresql.org>
Backpatch: 9.5, where ON CONFLICT was introduced
2016-08-17 17:03:36 -07:00
Tom Lane 18555b1323 Establish conventions about global object names used in regression tests.
To ensure that "make installcheck" can be used safely against an existing
installation, we need to be careful about what global object names
(database, role, and tablespace names) we use; otherwise we might
accidentally clobber important objects.  There's been a weak consensus that
test databases should have names including "regression", and that test role
names should start with "regress_", but we didn't have any particular rule
about tablespace names; and neither of the other rules was followed with
any consistency either.

This commit moves us a long way towards having a hard-and-fast rule that
regression test databases must have names including "regression", and that
test role and tablespace names must start with "regress_".  It's not
completely there because I did not touch some test cases in rolenames.sql
that test creation of special role names like "session_user".  That will
require some rethinking of exactly what we want to test, whereas the intent
of this patch is just to hit all the cases in which the needed renamings
are cosmetic.

There is no enforcement mechanism in this patch either, but if we don't
add one we can expect that the tests will soon be violating the convention
again.  Again, that's not such a cosmetic change and it will require
discussion.  (But I did use a quick-hack enforcement patch to find these
cases.)

Discussion: <16638.1468620817@sss.pgh.pa.us>
2016-07-17 18:42:43 -04:00
Alvaro Herrera 533e9c6b06 Avoid serializability errors when locking a tuple with a committed update
When key-share locking a tuple that has been not-key-updated, and the
update is a committed transaction, in some cases we raised
serializability errors:
    ERROR:  could not serialize access due to concurrent update

Because the key-share doesn't conflict with the update, the error is
unnecessary and inconsistent with the case that the update hasn't
committed yet.  This causes problems for some usage patterns, even if it
can be claimed that it's sufficient to retry the aborted transaction:
given a steady stream of updating transactions and a long locking
transaction, the long transaction can be starved indefinitely despite
multiple retries.

To fix, we recognize that HeapTupleSatisfiesUpdate can return
HeapTupleUpdated when an updating transaction has committed, and that we
need to deal with that case exactly as if it were a non-committed
update: verify whether the two operations conflict, and if not, carry on
normally.  If they do conflict, however, there is a difference: in the
HeapTupleBeingUpdated case we can just sleep until the concurrent
transaction is gone, while in the HeapTupleUpdated case this is not
possible and we must raise an error instead.

Per trouble report from Olivier Dony.

In addition to a couple of test cases that verify the changed behavior,
I added a test case to verify the behavior that remains unchanged,
namely that errors are raised when a update that modifies the key is
used.  That must still generate serializability errors.  One
pre-existing test case changes behavior; per discussion, the new
behavior is actually the desired one.

Discussion: https://www.postgresql.org/message-id/560AA479.4080807@odoo.com
  https://www.postgresql.org/message-id/20151014164844.3019.25750@wrigleys.postgresql.org

Backpatch to 9.3, where the problem appeared.
2016-07-15 14:17:20 -04:00
Tom Lane ad520ec4ac Use memmove() not memcpy() to slide some pointers down.
The previous coding here was formally undefined, though it seems to
accidentally work on most platforms in the buildfarm.  Caught by some
OpenBSD platforms in which libc contains an assertion check for
overlapping areas passed to memcpy().

Thomas Munro
2016-04-27 18:19:28 -04:00
Peter Eisentraut 339025c68f Replace printf format %i by %d
see also ce8d7bb644
2016-04-08 12:42:58 -04:00
Kevin Grittner fcff8a5751 Detect SSI conflicts before reporting constraint violations
While prior to this patch the user-visible effect on the database
of any set of successfully committed serializable transactions was
always consistent with some one-at-a-time order of execution of
those transactions, the presence of declarative constraints could
allow errors to occur which were not possible in any such ordering,
and developers had no good workarounds to prevent user-facing
errors where they were not necessary or desired.  This patch adds
a check for serialization failure ahead of duplicate key checking
so that if a developer explicitly (redundantly) checks for the
pre-existing value they will get the desired serialization failure
where the problem is caused by a concurrent serializable
transaction; otherwise they will get a duplicate key error.

While it would be better if the reads performed by the constraints
could count as part of the work of the transaction for
serialization failure checking, and we will hopefully get there
some day, this patch allows a clean and reliable way for developers
to work around the issue.  In many cases existing code will already
be doing the right thing for this to "just work".

Author: Thomas Munro, with minor editing of docs by me
Reviewed-by: Marko Tiikkaja, Kevin Grittner
2016-04-07 11:12:35 -05:00
Tom Lane 71404af2a2 Fix EvalPlanQual bug when query contains both locked and not-locked rels.
In commit afb9249d06, we (probably I) made ExecLockRows assign
null test tuples to all relations of the query while setting up to do an
EvalPlanQual recheck for a newly-updated locked row.  This was sheerest
brain fade: we should only set test tuples for relations that are lockable
by the LockRows node, and in particular empty test tuples are only sensible
for inheritance child relations that weren't the source of the current
tuple from their inheritance tree.  Setting a null test tuple for an
unrelated table causes it to return NULLs when it should not, as exhibited
in bug #14034 from Bronislav Houdek.  To add insult to injury, doing it the
wrong way required two loops where one would suffice; so the corrected code
is even a bit shorter and faster.

Add a regression test case based on his example, and back-patch to 9.5
where the bug was introduced.
2016-03-22 17:56:20 -04:00
Peter Eisentraut a40814d7aa Handle invalid libpq sockets in more places
Also, make error messages consistent.

From: Michael Paquier <michael.paquier@gmail.com>
2016-03-08 21:10:33 -05:00
Tom Lane 3d523564c5 Suppress scary-looking log messages from async-notify isolation test.
I noticed that the async-notify test results in log messages like these:

LOG:  could not send data to client: Broken pipe
FATAL:  connection to client lost

This is because it unceremoniously disconnects a client session that is
about to have some NOTIFY messages delivered to it.  Such log messages
during a regression test might well cause people to go looking for a
problem that doesn't really exist (it did cause me to waste some time that
way).  We can shut it up by adding an UNLISTEN command to session teardown.

Patch HEAD only; this doesn't seem significant enough to back-patch.
2016-02-29 19:29:19 -05:00
Alvaro Herrera 54638f5708 Make new isolationtester test more stable
The original coding of the test was relying too much on the ordering in
which backends are awakened once an advisory lock which they wait for is
released.  Change the code so that each backend uses its own advisory
lock instead, so that the output becomes stable.  Also add a few seconds
of sleep between lock releases, so that the test isn't broken in
overloaded buildfarm animals, as suggested by Tom Lane.

Per buildfarm members spoonbill and guaibasaurus.

Discussion: https://www.postgresql.org/message-id/19294.1456551587%40sss.pgh.pa.us
2016-02-29 16:34:56 -03:00
Andrew Dunstan 87cc6b57a9 Respect TEMP_CONFIG when pg_regress_check and friends are called
This reverts commit 9117985b6b in favor of
a more general solution.
2016-02-27 12:28:21 -05:00
Alvaro Herrera c9578135f7 Add isolationtester spec for old heapam.c bug
In 0e5680f473, I fixed a bug in heapam that caused spurious deadlocks
when multiple updates concurrently attempted to modify the old version
of an updated tuple whose new version was key-share locked.  I proposed
an isolationtester spec file that reproduced the bug, but back then
isolationtester wasn't mature enough to be able to run it.  Now that
38f8bdcac4 is in the tree, we can have this spec file too.

Discussion: https://www.postgresql.org/message-id/20141212205254.GC1768%40alvh.no-ip.org
2016-02-26 17:11:15 -03:00
Tom Lane 52f5d578d6 Create a function to reliably identify which sessions block which others.
This patch introduces "pg_blocking_pids(int) returns int[]", which returns
the PIDs of any sessions that are blocking the session with the given PID.
Historically people have obtained such information using a self-join on
the pg_locks view, but it's unreasonably tedious to do it that way with any
modicum of correctness, and the addition of parallel queries has pretty
much broken that approach altogether.  (Given some more columns in the view
than there are today, you could imagine handling parallel-query cases with
a 4-way join; but ugh.)

The new function has the following behaviors that are painful or impossible
to get right via pg_locks:

1. Correctly understands which lock modes block which other ones.

2. In soft-block situations (two processes both waiting for conflicting lock
modes), only the one that's in front in the wait queue is reported to
block the other.

3. In parallel-query cases, reports all sessions blocking any member of
the given PID's lock group, and reports a session by naming its leader
process's PID, which will be the pg_backend_pid() value visible to
clients.

The motivation for doing this right now is mostly to fix the isolation
tests.  Commit 38f8bdcac4 lobotomized
isolationtester's is-it-waiting query by removing its ability to recognize
nonconflicting lock modes, as a crude workaround for the inability to
handle soft-block situations properly.  But even without the lock mode
tests, the old query was excessively slow, particularly in
CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new
deadlock-hard test because the deadlock timeout elapses before they can
probe the waiting status of all eight sessions.  Replacing the pg_locks
self-join with use of pg_blocking_pids() is not only much more correct, but
a lot faster: I measure it at about 9X faster in a typical dev build with
Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds.  That should provide
enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the
test, without having to lengthen deadlock_timeout yet more and thus slow
down the test for everyone else.
2016-02-22 14:31:43 -05:00
Tom Lane e84e06d2b3 Increase deadlock_timeout some more in the deadlock-hard isolation test.
The previous value of 5s is inadequate for the buildfarm's
CLOBBER_CACHE_ALWAYS animals: they take long enough to do the is-it-waiting
queries that the timeout expires, allowing the database state to change,
before isolationtester is done looking.  Perhaps 10s will be enough.
(If it isn't, I'm inclined to reduce the number of sessions involved.)
2016-02-12 17:22:42 -05:00
Tom Lane dca369320f Revert "isolationtester: don't repeat the is-it-waiting query when retrying a step."
This mostly reverts commit 9c9782f066.
I left in the parts that rearranged removal of completed waiting steps;
but the idea of not rechecking a step's blocked-ness isn't working.
2016-02-12 17:12:23 -05:00
Tom Lane 3992188c2a Revert "Still further tweaking of deadlock isolation tests."
This reverts commit d03130d378.
That was dependent on an isolationtester.c change that now proves
to be broken; we will need to find another solution.
2016-02-12 17:02:59 -05:00
Tom Lane d03130d378 Still further tweaking of deadlock isolation tests.
It turns out that there is a second race condition in the new deadlock-hard
test: once the deadlock detector fires, it's uncertain whether step s7a8 or
step s8a1 will report first, because killing s8's transaction unblocks s7.
So far, s7 has only been seen to report first in CLOBBER_CACHE_ALWAYS
builds, but it's pretty reproducible there, and in theory it should
sometimes occur in normal builds too.  If s7 were a bit slower than usual,
that could also break the test, since the existing expected-file assumes
that we'll see s7a8 report the first time we check it after s8a1 completes.
To fix, add a post-lock delay to s7a8.
2016-02-12 14:19:57 -05:00
Tom Lane 9c9782f066 isolationtester: don't repeat the is-it-waiting query when retrying a step.
If we're retrying a step, then we already decided it was blocked on a lock,
and there's no need to recheck that.  The original coding of commit
38f8bdcac4 resulted in a large number of
is-it-waiting queries when dealing with multiple concurrently-blocked
sessions, which is fairly pointless and also results in test failures in
CLOBBER_CACHE_ALWAYS builds, where the is-it-waiting query is quite slow.

This definition also permits appending pg_sleep() calls to steps where it's
needed to control the order of finish of concurrent steps.  Before, that
did not work nicely because we'd decide that a step performing a sleep was
not blocked and hang up waiting for it to finish, rather than noticing the
completion of the concurrent step we're supposed to notice first.

In passing, revise handling of removal of completed waiting steps
to make it a bit less messy.
2016-02-12 14:10:36 -05:00
Tom Lane a361490806 Re-pgindent isolationtester.c.
Need to do some more hacking on this, and got annoyed that it's not
indent clean.
2016-02-12 13:36:13 -05:00
Peter Eisentraut 29b4b7bda6 Fix whitespace 2016-02-12 12:08:40 -05:00
Tom Lane caefc11ef6 Further tweaking of deadlock isolation tests.
The new deadlock-soft-2 test has a timing dependency too: it supposes
that isolationtester will detect step s1b as waiting before the deadlock
detector runs and grants it the lock.  Adjust deadlock_timeout to ensure
that that's true even in CLOBBER_CACHE_ALWAYS builds, where the wait
detection query is quite slow.  Per buildfarm member jaguarundi.
2016-02-11 23:21:33 -05:00
Tom Lane b11d07b6a3 Make new deadlock isolation test more reproducible.
The original formulation of 4c9864b9b4
was extremely timing-sensitive, because it arranged for the deadlock
detector to be running (and possibly unblocking the current query)
at almost exactly the same time as isolationtester would be probing
to see if the query is blocked.  The committed expected-file assumed
that the deadlock detection would finish first, but we see the opposite
on both fast and slow buildfarm animals.  Adjust the deadlock timeout
settings to make it predictable that isolationtester *will* see the
query as waiting before deadlock detection unblocks it.

I used a 5s timeout for the same reasons mentioned in
a7921f71a3.
2016-02-11 11:59:11 -05:00
Tom Lane d9dc2b4149 Code review for isolationtester changes.
Fix a few oversights in 38f8bdcac4982215beb9f65a19debecaf22fd470:
don't leak memory in run_permutation(), remember when we've issued
a cancel rather than issuing another one every 10ms,
fix some typos in comments.
2016-02-11 11:30:52 -05:00
Robert Haas 4c9864b9b4 Add some isolation tests for deadlock detection and resolution.
Previously, we had no test coverage for the deadlock detector.
2016-02-11 08:38:09 -05:00
Robert Haas 38f8bdcac4 Modify the isolation tester so that multiple sessions can wait.
This allows testing of deadlock scenarios.  Scenarios that would
previously have been considered invalid are now simply taken as a
scenario in which more than one backend will wait.
2016-02-11 08:36:30 -05:00
Robert Haas c9882c60f4 Specify permutations for isolation tests with "invalid" permutations.
This is a necessary prerequisite for forthcoming changes to allow deadlock
scenarios to be tested by the isolation tester.  It is also a good idea on
general principle, since these scenarios add no useful test coverage not
provided by other scenarios, but do to take time to execute.
2016-02-11 08:33:24 -05:00
Bruce Momjian ee94300446 Update copyright for 2016
Backpatch certain files through 9.1
2016-01-02 13:33:40 -05:00
Tom Lane 5884b92a84 Fix errors in commit a04bb65f70.
Not a lot of commentary needed here really.
2015-09-30 23:37:26 -04:00
Robert Haas 8bd42fe5c7 Remove unused expected-output file. 2015-08-14 23:13:13 -04:00
Robert Haas 43b4a16817 Reject isolation test specifications with duplicate step names.
alter-table-1.spec has such a case, so change one instance of step
rx1 to rx3 instead.
2015-08-14 22:10:46 -04:00
Tom Lane 6a1e14c62b Temporarily(?) remove BRIN isolation test.
Commit 2834855cb added a not-very-carefully-thought-out isolation test
to check a BRIN index bug fix.  The test depended on the availability
of the pageinspect contrib module, which meant it did not work in
several common testing scenarios such as "make check-world".  It's not
clear whether we want a core test depending on a contrib module like
that, but in any case, failing to deal with the possibility that the
module isn't present in the installation-under-test is not acceptable.

Remove that test pending some better solution.
2015-08-10 10:22:37 -04:00
Alvaro Herrera 2834855cb9 Fix BRIN to use SnapshotAny during summarization
For correctness of summarization results, it is critical that the
snapshot used during the summarization scan is able to see all tuples
that are live to all transactions -- including tuples inserted or
deleted by in-progress transactions.  Otherwise, it would be possible
for a transaction to insert a tuple, then idle for a long time while a
concurrent transaction executes summarization of the range: this would
result in the inserted value not being considered in the summary.
Previously we were trying to use a MVCC snapshot in conjunction with
adding a "placeholder" tuple in the index: the snapshot would see all
committed tuples, and the placeholder tuple would catch insertions by
any new inserters.  The hole is that prior insertions by transactions
that are still in progress by the time the MVCC snapshot was taken were
ignored.

Kevin Grittner reported this as a bogus error message during vacuum with
default transaction isolation mode set to repeatable read (because the
error report mentioned a function name not being invoked during), but
the problem is larger than that.

To fix, tweak IndexBuildHeapRangeScan to have a new mode that behaves
the way we need using SnapshotAny visibility rules.  This change
simplifies the BRIN code a bit, mainly by removing large comments that
were mistaken.  Instead, rely on the SnapshotAny semantics to provide
what it needs.  (The business about a placeholder tuple needs to remain:
that covers the case that a transaction inserts a a tuple in a page that
summarization already scanned.)

Discussion: https://www.postgresql.org/message-id/20150731175700.GX2441@postgresql.org

In passing, remove a couple of unused declarations from brin.h and
reword a comment to be proper English.  This part submitted by Kevin
Grittner.

Backpatch to 9.5, where BRIN was introduced.
2015-08-05 16:20:50 -03:00
Tom Lane 342a1ffa21 Add some test coverage of EvalPlanQual with non-locked tables.
A Salesforce colleague of mine griped that the regression tests don't
exercise EvalPlanQualFetchRowMarks() and allied routines.  Which is
a fair complaint.  Add test cases that go through the REFERENCE and COPY
code paths.  Unfortunately we don't have sufficient infrastructure right
now to exercise the FDW code path in the isolation tests, but this is
surely better than before.
2015-07-29 13:27:56 -04:00
Robert Haas a04bb65f70 Add new function pg_notification_queue_usage.
This tells you what fraction of NOTIFY's queue is currently filled.

Brendan Jurd, reviewed by Merlin Moncure and Gurjeet Singh.  A few
further tweaks by me.
2015-07-17 09:12:03 -04:00
Tom Lane aa9eac45ea Fix portability issue in isolationtester grammar.
specparse.y and specscanner.l used "string" as a token name.  Now, bison
likes to define each token name as a macro for the token code it assigns,
which means those names are basically off-limits for any other use within
the grammar file or included headers.  So names as generic as "string" are
dangerous.  This is what was causing the recent failures on protosciurus:
some versions of Solaris' sys/kstat.h use "string" as a field name.
With late-model bison we don't see this problem because the token macros
aren't defined till later (that is why castoroides didn't show the problem
even though it's on the same machine).  But protosciurus uses bison 1.875
which defines the token macros up front.

This land mine has been there from day one; we'd have found it sooner
except that protosciurus wasn't trying to run the isolation tests till
recently.

To fix, rename the token to "string_literal" which is hopefully less
likely to collide with names used by system headers.  Back-patch to
all branches containing the isolation tests.
2015-05-27 19:14:51 -04:00
Andres Freund 168d5805e4 Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows to specify an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using a
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint.  DO NOTHING avoids the
constraint violation, without touching the pre-existing row.  DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed.  The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.

This feature is often referred to as upsert.

This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert.  If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made.  If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.

To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.

Bumps catversion as stored rules change.

Author: Peter Geoghegan, with significant contributions from Heikki
    Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
    Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:43:10 +02:00
Peter Eisentraut dcae5facca Improve speed of make check-world
Before, make check-world would create a new temporary installation for
each test suite, which is slow and wasteful.  Instead, we now create one
test installation that is used by all test suites that are part of a
make run.

The management of the temporary installation is removed from pg_regress
and handled in the makefiles.  This allows for better control, and
unifies the code with that of test suites not run through pg_regress.

review and msvc support by Michael Paquier <michael.paquier@gmail.com>

more review by Fabien Coelho <coelho@cri.ensmp.fr>
2015-04-23 08:59:52 -04:00
Simon Riggs 35ecc24407 Add new test files for lock level patch 2015-04-05 12:03:58 -04:00
Simon Riggs 0ef0396ae1 Reduce lock levels of some trigger DDL and add FKs
Reduce lock levels to ShareRowExclusive for the following SQL
 CREATE TRIGGER (but not DROP or ALTER)
 ALTER TABLE ENABLE TRIGGER
 ALTER TABLE DISABLE TRIGGER
 ALTER TABLE … ADD CONSTRAINT FOREIGN KEY

Original work by Simon Riggs, extracted and refreshed by Andreas Karlsson
New test cases added by Andreas Karlsson
Reviewed by Noah Misch, Andres Freund, Michael Paquier and Simon Riggs
2015-04-05 11:37:08 -04:00
Tom Lane c480cb9d24 Fix use-of-already-freed-memory problem in EvalPlanQual processing.
Up to now, the "child" executor state trees generated for EvalPlanQual
rechecks have simply shared the ResultRelInfo arrays used for the original
execution tree.  However, this leads to dangling-pointer problems, because
ExecInitModifyTable() is all too willing to scribble on some fields of the
ResultRelInfo(s) even when it's being run in one of those child trees.
This trashes those fields from the perspective of the parent tree, because
even if the generated subtree is logically identical to what was in use in
the parent, it's in a memory context that will go away when we're done
with the child state tree.

We do however want to share information in the direction from the parent
down to the children; in particular, fields such as es_instrument *must*
be shared or we'll lose the stats arising from execution of the children.
So the simplest fix is to make a copy of the parent's ResultRelInfo array,
but not copy any fields back at end of child execution.

Per report from Manuel Kniep.  The added isolation test is based on his
example.  In an unpatched memory-clobber-enabled build it will reliably
fail with "ctid is NULL" errors in all branches back to 9.1, as a
consequence of junkfilter->jf_junkAttNo being overwritten with $7f7f.
This test cannot be run as-is before that for lack of WITH syntax; but
I have no doubt that some variant of this problem can arise in older
branches, so apply the code change all the way back.
2015-01-15 18:52:58 -05:00
Bruce Momjian 4baaf863ec Update copyright for 2015
Backpatch certain files through 9.0
2015-01-06 11:43:47 -05:00
Alvaro Herrera d5e3d1e969 Fix thinko in lock mode enum
Commit 0e5680f473 contained a thinko
mixing LOCKMODE with LockTupleMode.  This caused misbehavior in the case
where a tuple is marked with a multixact with at most a FOR SHARE lock,
and another transaction tries to acquire a FOR NO KEY EXCLUSIVE lock;
this case should block but doesn't.

Include a new isolation tester spec file to explicitely try all the
tuple lock combinations; without the fix it shows the problem:

    starting permutation: s1_begin s1_lcksvpt s1_tuplock2 s2_tuplock3 s1_commit
    step s1_begin: BEGIN;
    step s1_lcksvpt: SELECT * FROM multixact_conflict FOR KEY SHARE; SAVEPOINT foo;
    a

    1
    step s1_tuplock2: SELECT * FROM multixact_conflict FOR SHARE;
    a

    1
    step s2_tuplock3: SELECT * FROM multixact_conflict FOR NO KEY UPDATE;
    a

    1
    step s1_commit: COMMIT;

With the fixed code, step s2_tuplock3 blocks until session 1 commits,
which is the correct behavior.

All other cases behave correctly.

Backpatch to 9.3, like the commit that introduced the problem.
2015-01-04 15:48:29 -03:00
Tom Lane 2db576ba8c Fix corner case where SELECT FOR UPDATE could return a row twice.
In READ COMMITTED mode, if a SELECT FOR UPDATE discovers it has to redo
WHERE-clause checking on rows that have been updated since the SELECT's
snapshot, it invokes EvalPlanQual processing to do that.  If this first
occurs within a non-first child table of an inheritance tree, the previous
coding could accidentally re-return a matching row from an earlier,
already-scanned child table.  (And, to add insult to injury, I think this
could make it miss returning a row that should have been returned, if the
updated row that this happens on should still have passed the WHERE qual.)
Per report from Kyotaro Horiguchi; the added isolation test is based on his
test case.

This has been broken for quite awhile, so back-patch to all supported
branches.
2014-12-11 19:37:36 -05:00
Alvaro Herrera df630b0dd5 Implement SKIP LOCKED for row-level locks
This clause changes the behavior of SELECT locking clauses in the
presence of locked rows: instead of causing a process to block waiting
for the locks held by other processes (or raise an error, with NOWAIT),
SKIP LOCKED makes the new reader skip over such rows.  While this is not
appropriate behavior for general purposes, there are some cases in which
it is useful, such as queue-like tables.

Catalog version bumped because this patch changes the representation of
stored rules.

Reviewed by Craig Ringer (based on a previous attempt at an
implementation by Simon Riggs, who also provided input on the syntax
used in the current patch), David Rowley, and Álvaro Herrera.

Author: Thomas Munro
2014-10-07 17:23:34 -03:00
Alvaro Herrera 1c9701cfe5 Fix FOR UPDATE NOWAIT on updated tuple chains
If SELECT FOR UPDATE NOWAIT tries to lock a tuple that is concurrently
being updated, it might fail to honor its NOWAIT specification and block
instead of raising an error.

Fix by adding a no-wait flag to EvalPlanQualFetch which it can pass down
to heap_lock_tuple; also use it in EvalPlanQualFetch itself to avoid
blocking while waiting for a concurrent transaction.

Authors: Craig Ringer and Thomas Munro, tweaked by Álvaro
http://www.postgresql.org/message-id/51FB6703.9090801@2ndquadrant.com

Per Thomas Munro in the course of his SKIP LOCKED feature submission,
who also provided one of the isolation test specs.

Backpatch to 9.4, because that's as far back as it applies without
conflicts (although the bug goes all the way back).  To that branch also
backpatch Thomas Munro's new NOWAIT test cases, committed in master by
Heikki as commit 9ee16b49f0 .
2014-08-27 19:15:18 -04:00
Heikki Linnakangas 9ee16b49f0 Add regression tests for SELECT FOR UPDATE/SHARE NOWAIT.
Thomas Munro
2014-08-25 20:14:43 +03:00
Noah Misch ee9569e4df Finish adding file version information to installed Windows binaries.
In support of this, have the MSVC build follow GNU make in preferring
GNUmakefile over Makefile when a directory contains both.

Michael Paquier, reviewed by MauMau.
2014-08-18 22:59:53 -04:00
Tom Lane 71ed8b3ca7 Revert "Fix bogus %name-prefix option syntax in all our Bison files."
This reverts commit 45b7abe59e.

It turns out that the %name-prefix syntax without "=" does not work
at all in pre-2.4 Bison.  We are not prepared to make such a large
jump in minimum required Bison version just to suppress a warning
message in a version hardly any developers are using yet.
When 3.0 gets more popular, we'll figure out a way to deal with this.
In the meantime, BISONFLAGS=-Wno-deprecated is recommendable for
anyone using 3.0 who doesn't want to see the warning.
2014-05-28 19:21:01 -04:00
Tom Lane 45b7abe59e Fix bogus %name-prefix option syntax in all our Bison files.
%name-prefix doesn't use an "=" sign according to the Bison docs, but it
silently accepted one anyway, until Bison 3.0.  This was originally a
typo of mine in commit 012abebab1, and we
seem to have slavishly copied the error into all the other grammar files.

Per report from Vik Fearing; analysis by Peter Eisentraut.

Back-patch to all active branches, since somebody might try to build
a back branch with up-to-date tools.
2014-05-28 15:41:53 -04:00
Bruce Momjian 0a78320057 pgindent run for 9.4
This includes removing tabs after periods in C comments, which was
applied to back branches, so this change should not effect backpatching.
2014-05-06 12:12:18 -04:00
Heikki Linnakangas a692ee5870 Replace SYSTEMQUOTEs with Windows-specific wrapper functions.
It's easy to forget using SYSTEMQUOTEs when constructing command strings
for system() or popen(). Even if we fix all the places missing it now, it is
bound to be forgotten again in the future. Introduce wrapper functions that
do the the extra quoting for you, and get rid of SYSTEMQUOTEs in all the
callers.

We previosly used SYSTEMQUOTEs in all the hard-coded command strings, and
this doesn't change the behavior of those. But user-supplied commands, like
archive_command, restore_command, COPY TO/FROM PROGRAM calls, as well as
pgbench's \shell, will now gain an extra pair of quotes. That is desirable,
but if you have existing scripts or config files that include an extra
pair of quotes, those might need to be adjusted.

Reviewed by Amit Kapila and Tom Lane
2014-05-05 16:07:40 +03:00
Simon Riggs f14a6bbedb Isolation test files for ALTER TABLE patch 2014-04-06 11:44:24 -04:00
Simon Riggs e5550d5fec Reduce lock levels of some ALTER TABLE cmds
VALIDATE CONSTRAINT

CLUSTER ON
SET WITHOUT CLUSTER

ALTER COLUMN SET STATISTICS
ALTER COLUMN SET ()
ALTER COLUMN RESET ()

All other sub-commands use AccessExclusiveLock

Simon Riggs and Noah Misch

Reviews by Robert Haas and Andres Freund
2014-04-06 11:13:43 -04:00
Tom Lane 60ff2fdd99 Centralize getopt-related declarations in a new header file pg_getopt.h.
We used to have externs for getopt() and its API variables scattered
all over the place.  Now that we find we're going to need to tweak the
variable declarations for Cygwin, it seems like a good idea to have
just one place to tweak.

In this commit, the variables are declared "#ifndef HAVE_GETOPT_H".
That may or may not work everywhere, but we'll soon find out.

Andres Freund
2014-02-15 14:31:30 -05:00
Bruce Momjian 2fc80e8e83 Rename 'gmake' to 'make' in docs and recommended commands
This simplifies the docs and makes it easier to cut/paste command lines.
2014-02-12 17:29:19 -05:00
Bruce Momjian 7e04792a1c Update copyright for 2014
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
2014-01-07 16:05:30 -05:00
Alvaro Herrera 6eda3e9c27 isolationtester: Ensure stderr is unbuffered, too 2013-12-19 22:09:30 -03:00
Alvaro Herrera 73bcb76b77 Make stdout unbuffered
This ensures that all stdout output is flushed immediately, to match
stderr.  This eliminates the need for fflush(stdout) calls sprinkled all
over the place.

Per Daniel Wood in message 519A79C6.90308@salesforce.com
2013-12-19 17:26:27 -03:00
Alvaro Herrera 11ac4c73cb Don't ignore tuple locks propagated by our updates
If a tuple was locked by transaction A, and transaction B updated it,
the new version of the tuple created by B would be locked by A, yet
visible only to B; due to an oversight in HeapTupleSatisfiesUpdate, the
lock held by A wouldn't get checked if transaction B later deleted (or
key-updated) the new version of the tuple.  This might cause referential
integrity checks to give false positives (that is, allow deletes that
should have been rejected).

This is an easy oversight to have made, because prior to improved tuple
locks in commit 0ac5ad5134 it wasn't possible to have tuples created by
our own transaction that were also locked by remote transactions, and so
locks weren't even considered in that code path.

It is recommended that foreign keys be rechecked manually in bulk after
installing this update, in case some referenced rows are missing with
some referencing row remaining.

Per bug reported by Daniel Wood in
CAPweHKe5QQ1747X2c0tA=5zf4YnS2xcvGf13Opd-1Mq24rF1cQ@mail.gmail.com
2013-12-18 13:45:51 -03:00
Alvaro Herrera 312bde3d40 Fix improper abort during update chain locking
In 247c76a989, I added some code to do fine-grained checking of
MultiXact status of locking/updating transactions when traversing an
update chain.  There was a thinko in that patch which would have the
traversing abort, that is return HeapTupleUpdated, when the other
transaction is a committed lock-only.  In this case we should ignore it
and return success instead.  Of course, in the case where there is a
committed update, HeapTupleUpdated is the correct return value.

A user-visible symptom of this bug is that in REPEATABLE READ and
SERIALIZABLE transaction isolation modes spurious serializability errors
can occur:
  ERROR:  could not serialize access due to concurrent update

In order for this to happen, there needs to be a tuple that's key-share-
locked and also updated, and the update must abort; a subsequent
transaction trying to acquire a new lock on that tuple would abort with
the above error.  The reason is that the initial FOR KEY SHARE is seen
as committed by the new locking transaction, which triggers this bug.
(If the UPDATE commits, then the serialization error is correctly
reported.)

When running a query in READ COMMITTED mode, what happens is that the
locking is aborted by the HeapTupleUpdated return value, then
EvalPlanQual fetches the newest version of the tuple, which is then the
only version that gets locked.  (The second time the tuple is checked
there is no misbehavior on the committed lock-only, because it's not
checked by the code that traverses update chains; so no bug.) Only the
newest version of the tuple is locked, not older ones, but this is
harmless.

The isolation test added by this commit illustrates the desired
behavior, including the proper serialization errors that get thrown.

Backpatch to 9.3.
2013-12-05 17:47:51 -03:00
Alvaro Herrera 07aeb1fec5 Avoid resetting Xmax when it's a multi with an aborted update
HeapTupleSatisfiesUpdate can very easily "forget" tuple locks while
checking the contents of a multixact and finding it contains an aborted
update, by setting the HEAP_XMAX_INVALID bit.  This would lead to
concurrent transactions not noticing any previous locks held by
transactions that might still be running, and thus being able to acquire
subsequent locks they wouldn't be normally able to acquire.

This bug was introduced in commit 1ce150b7bb; backpatch this fix to 9.3,
like that commit.

This change reverts the change to the delete-abort-savept isolation test
in 1ce150b7bb, because that behavior change was caused by this bug.

Noticed by Andres Freund while investigating a different issue reported
by Noah Misch.
2013-12-05 12:21:55 -03:00
Bruce Momjian 86ef4796f5 build: pass EXTRA_REGRESS_OPTS to secondary regression tests
Christoph Berg
2013-12-04 10:14:45 -05:00
Alvaro Herrera 1ce150b7bb Don't TransactionIdDidAbort in HeapTupleGetUpdateXid
It is dangerous to do so, because some code expects to be able to see what's
the true Xmax even if it is aborted (particularly while traversing HOT
chains).  So don't do it, and instead rely on the callers to verify for
abortedness, if necessary.

Several race conditions and bugs fixed in the process.  One isolation test
changes the expected output due to these.

This also reverts commit c235a6a589, which is no longer necessary.

Backpatch to 9.3, where this function was introduced.

Andres Freund
2013-11-29 21:47:21 -03:00
Heikki Linnakangas 32ceba3ea7 Replace appendPQExpBuffer(..., <constant>) with appendPQExpBufferStr
Arguably makes the code a bit more readable, and might give a small
performance gain.

David Rowley
2013-11-18 18:34:51 +02:00
Kevin Grittner 7cb964acb7 Fix buffer overrun in isolation test program.
Commit 061b88c732 saved argv0 to a
global buffer without ensuring that it was zero terminated,
allowing references to it to overrun the buffer and access other
memory.  This probably would not have presented any security risk,
but could have resulted in very confusing failures if the path to
the executable was very long.

Reported by David Rowley
2013-11-15 08:27:42 -06:00
Robert Haas 061b88c732 Try again to make pg_isolation_regress work its build directory.
We can't search for the isolationtester binary until after we've set
up the environment, because otherwise when find_other_exec() tries
to invoke it with the -V option, it might fail for inability to
locate a working libpq.  So postpone that step.

Andres Freund
2013-11-12 11:23:47 -05:00
Robert Haas 9b4d52f209 Fix pg_isolation_regress to work outside its build directory.
This makes it possible to, for example, use the isolation tester to
test a contrib module.

Andres Freund
2013-11-08 14:40:41 -05:00
Tom Lane 2c66f9924c Replace pg_asprintf() with psprintf().
This eliminates an awkward coding pattern that's also unnecessarily
inconsistent with backend coding.  psprintf() is now the thing to
use everywhere.
2013-10-22 19:40:26 -04:00
Peter Eisentraut 5b6d08cd29 Add use of asprintf()
Add asprintf(), pg_asprintf(), and psprintf() to simplify string
allocation and composition.  Replacement implementations taken from
NetBSD.

Reviewed-by: Álvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Asif Naeem <anaeem.it@gmail.com>
2013-10-13 00:09:18 -04:00
Kevin Grittner 31a877f18b Allow drop-index-concurrently-1 test to run at any isolation level.
It previously reported failure at REPEATABLE READ and SERIALIZABLE
transaction isolation levels for make installcheck.
2013-10-08 16:55:12 -05:00
Alvaro Herrera 1310d4cab2 add multixact-no-deadlock to schedule 2013-10-04 15:52:58 -03:00
Alvaro Herrera 46d8954654 Make some isolationtester specs more complete
Also, make sure they pass on all transaction isolation levels.
2013-10-04 15:52:58 -03:00
Alvaro Herrera 4f0777ba0f isolationtester: Allow tuples to be returned in more places
Previously, isolationtester would forbid returning tuples in
session-specific teardown (but not global teardown), as well as in
global setup.  Allow these places to return tuples, too.
2013-10-04 14:54:55 -03:00
Bruce Momjian 9af4159fce pgindent run for release 9.3
This is the first run of the Perl-based pgindent script.  Also update
pgindent instructions.
2013-05-29 16:58:43 -04:00
Tom Lane faf4726c9f In isolationtester, retry after EINTR return from select(2).
Per report from Jaime Casanova.  Very curious that no one else has seen
this failure ... but the code is clearly wrong as-is.
2013-04-06 22:28:49 -04:00
Tom Lane 845d335a90 Minor robustness improvements for isolationtester.
Notice and complain about PQcancel() failures.  Also, don't dump core if
an error PGresult doesn't contain severity and message subfields, as it
might not if it was generated by libpq itself.  (We have a longstanding
TODO item to improve that, but in the meantime isolationtester had better
cope.)

I tripped across the latter item while investigating a trouble report on
buildfarm member spoonbill.  As for the former, there's no evidence that
PQcancel failure is actually involved in spoonbill's problem, but it still
seems like a bad idea to ignore an error return code.
2013-04-02 21:15:37 -04:00
Tom Lane a7921f71a3 Bump up timeout delays some more in timeouts isolation test.
The buildfarm members using -DCLOBBER_CACHE_ALWAYS still don't like this
test.  Some experimentation shows that on my machine, isolationtester's
query to check for "waiting" state takes 2 to 2.5 seconds to bind+execute
under -DCLOBBER_CACHE_ALWAYS.  Set the timeouts to 5 seconds to leave some
headroom for possibly-slower buildfarm critters.

Really we ought to fix the "waiting" query, which is not only horridly
slow but outright wrong in detail; and then maybe we can back off these
timeouts.  But right now I'm just trying to get the buildfarm green again.
2013-03-20 13:53:43 -04:00
Tom Lane 4c855750fc Increase timeout delays in new timeouts isolation test.
Buildfarm member friarbird doesn't like this test as-committed, evidently
because it's so slow that the test framework doesn't reliably notice that
the backend is waiting before the timeout goes off.  (This is not totally
surprising, since friarbird builds with -DCLOBBER_CACHE_ALWAYS.)  Increase
the timeout delay from 1 second to 2 in hopes of resolving that problem.
2013-03-17 23:01:20 -04:00
Tom Lane d43837d030 Add lock_timeout configuration parameter.
This GUC allows limiting the time spent waiting to acquire any one
heavyweight lock.

In support of this, improve the recently-added timeout infrastructure
to permit efficiently enabling or disabling multiple timeouts at once.
That reduces the performance hit from turning on lock_timeout, though
it's still not zero.

Zoltán Böszörményi, reviewed by Tom Lane,
Stephen Frost, and Hari Babu
2013-03-16 23:22:57 -04:00
Andrew Dunstan 63d283ecd0 Flush stderr and stdout in isolation tester.
This is a possibly vain attempt to fix a buffering issue
observed for some MSVC builds.
2013-02-27 19:13:07 -05:00
Alvaro Herrera ca5db759b8 isolationtester: add a few fflush(stderr) calls
The lack of them is causing failures in some BF members.

Per Andrew Dunstan.
2013-01-23 13:30:14 -03:00
Alvaro Herrera 0ac5ad5134 Improve concurrency of foreign key locking
This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE".  These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE".  UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.

Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.

The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid.  Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates.  This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed.  pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.

Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header.  This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.

Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This causes additional WAL-logging, also (there was previously a single
WAL record for a locked tuple; now there are as many as updated copies
of the tuple there exist.)

With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.

As a bonus, the old behavior that a subtransaction grabbing a stronger
tuple lock than the parent (sub)transaction held on a given tuple and
later aborting caused the weaker lock to be lost, has been fixed.

Many new spec files were added for isolation tester framework, to ensure
overall behavior is sane.  There's probably room for several more tests.

There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time in it.  Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.

This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
	AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
	1290721684-sup-3951@alvh.no-ip.org
	1294953201-sup-2099@alvh.no-ip.org
	1320343602-sup-2290@alvh.no-ip.org
	1339690386-sup-8927@alvh.no-ip.org
	4FE5FF020200002500048A3D@gw.wicourts.gov
	4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 12:04:59 -03:00
Bruce Momjian bd61a623ac Update copyrights for 2013
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
2013-01-01 17:15:01 -05:00
Tom Lane ef28e05ac5 Fix bogus handling of $(X) (i.e., ".exe") in isolationtester Makefile.
I'm not sure why commit 1eb1dde049 seems
to have made this start to fail on Cygwin when it never did before ---
but nonetheless, the coding was pretty bogus, and unlike the way we
handle $(X) anywhere else.  Per buildfarm.
2012-11-01 19:48:53 -04:00
Simon Riggs 160984c8c8 Isolation test for DROP INDEX CONCURRENTLY
for recent concurrent changes.

Abhijit Menon-Sen
2012-10-18 19:41:40 +01:00
Simon Riggs 5ad72cee7e Revert tests for drop index concurrently. 2012-10-18 15:27:12 +01:00
Simon Riggs 4e206744dc Add isolation tests for DROP INDEX CONCURRENTLY.
Backpatch to 9.2 to ensure bugs are fixed.

Abhijit Menon-Sen
2012-10-18 13:37:09 +01:00
Peter Eisentraut 8521d13194 Refactor flex and bison make rules
Numerous flex and bison make rules have appeared in the source tree
over time, and they are all virtually identical, so we can replace
them by pattern rules with some variables for customization.

Users of pgxs will also be able to benefit from this.
2012-10-11 06:57:04 -04:00
Kevin Grittner cdf91edba9 Fix serializable mode with index-only scans.
Serializable Snapshot Isolation used for serializable transactions
depends on acquiring SIRead locks on all heap relation tuples which
are used to generate the query result, so that a later delete or
update of any of the tuples can flag a read-write conflict between
transactions.  This is normally handled in heapam.c, with tuple level
locking.  Since an index-only scan avoids heap access in many cases,
building the result from the index tuple, the necessary predicate
locks were not being acquired for all tuples in an index-only scan.

To prevent problems with tuple IDs which are vacuumed and re-used
while the transaction still matters, the xmin of the tuple is part of
the tag for the tuple lock.  Since xmin is not available to the
index-only scan for result rows generated from the index tuples, it
is not possible to acquire a tuple-level predicate lock in such
cases, in spite of having the tid.  If we went to the heap to get the
xmin value, it would no longer be an index-only scan.  Rather than
prohibit index-only scans under serializable transaction isolation,
we acquire an SIRead lock on the page containing the tuple, when it
was not necessary to visit the heap for other reasons.

Backpatch to 9.2.

Kevin Grittner and Tom Lane
2012-09-04 21:13:11 -05:00
Kevin Grittner c63f309cca Allow isolation tests to specify multiple setup blocks.
Each setup block is run as a single PQexec submission, and some
statements such as VACUUM cannot be combined with others in such a
block.

Backpatch to 9.2.

Kevin Grittner and Tom Lane
2012-09-04 19:31:06 -05:00
Tom Lane 633f2fbd88 Update isolation tests' README file.
The directions explaining about running the prepared-transactions test
were not updated in commit ae55d9fbe3.
2012-08-08 12:02:07 -04:00
Andrew Dunstan a1e5705c9f Remove now unneeded results file for disabled prepared transactions case. 2012-07-20 16:30:34 -04:00
Andrew Dunstan ae55d9fbe3 Remove prepared transactions from main isolation test schedule.
There is no point in running this test when prepared transactions are disabled,
which is the default. New make targets that include the test are provided. This
will save some useless waste of cycles on buildfarm machines.

Backpatch to 9.1 where these tests were introduced.
2012-07-20 15:51:40 -04:00
Bruce Momjian 927d61eeff Run pgindent on 9.2 source tree in preparation for first 9.3
commit-fest.
2012-06-10 15:20:04 -04:00
Peter Eisentraut 8e5f4300fd Re-add "make check" target in src/test/isolation/Makefile
This effectively reverts 7886cc73ad,
which was done under the impression that isolationtester needs libpq,
which it no longer does (and never really did).
2012-03-02 22:11:57 +02:00
Peter Eisentraut 36a1a8c33d Don't link pg_isolation_regress with libpq
It's not necessary and can only create confusion about which libpq
installation should be used.

Also remove some dead code from the makefile that was apparently
copied from elsewhere.
2012-03-01 20:51:59 +02:00
Tom Lane 759d9d6769 Add simple tests of EvalPlanQual using the isolationtester infrastructure.
Much more could be done here, but at least now we have *some* automated
test coverage of that mechanism.  In particular this tests the writable-CTE
case reported by Phil Sorber.

In passing, remove isolationtester's arbitrary restriction on the number of
steps in a permutation list.  I used this so that a single spec file could
be used to run several related test scenarios, but there are other possible
reasons to want a step series that's not exactly a permutation.  Improve
documentation and fix a couple other nits as well.
2012-01-28 17:55:08 -05:00
Alvaro Herrera 7064fd0648 Detect invalid permutations in isolationtester
isolationtester is now able to continue running other permutations when
it detects that one of them is invalid, which is useful during initial
development of spec files.

Author: Alexander Shulgin
2012-01-14 19:36:39 -03:00
Alvaro Herrera d2a75837cc Avoid NULL pointer dereference in isolationtester 2012-01-14 19:01:32 -03:00
Alvaro Herrera 50363c8f86 Validate number of steps specified in permutation
A permutation that specifies more steps than defined causes
isolationtester to crash, so avoid that.  Using less steps than defined
should probably not be a problem, but no spec currently does that.
2012-01-11 18:48:59 -03:00
Peter Eisentraut f132824c24 Another fix for pg_regress: Replace exit_nicely() with exit() plus
atexit() hook
2012-01-02 23:29:16 +02:00
Bruce Momjian e126958c2e Update copyright notices for year 2012. 2012-01-01 18:01:58 -05:00
Alvaro Herrera e145891c98 Unbreak isolationtester on Win32
I broke it in a previous commit because I neglected to install the
necessary incantations to have getopt() work on Windows.

Per red blots in buildfarm.
2011-11-04 00:33:48 -02:00
Alvaro Herrera 7ed3605675 Implement a dry-run mode for isolationtester
This mode prints out the permutations that would be run by the given
spec file, in the same format used by the permutation lines in spec
files.  This helps in building new spec files.

Author: Alexander Shulgin, with some tweaks by me
2011-11-03 15:20:10 -02:00
Peter Eisentraut 654e1f96b0 Clean up whitespace and indentation in parser and scanner files
These are not touched by pgindent, so clean them up a bit manually.
2011-11-01 21:51:30 +02:00
Alvaro Herrera 90d8e8ff7e Add debugging aid in isolationtester 2011-10-24 22:14:22 -03:00
Alvaro Herrera bbd38af3a8 Remove dependency on error ordering in isolation tests
We now report errors reported by the just-unblocked and unblocking
transactions identically; this should fix relatively common buildfarm
failures reported by animals that are failing the "wrong" session.
2011-09-27 16:53:35 -03:00
Alvaro Herrera 1734992738 Fix typo 2011-09-27 16:50:27 -03:00
Alvaro Herrera 28190bacfd Add expected isolationtester output when prepared xacts are disabled
This was deemed unnecessary initially but in later discussion it was
agreed otherwise.

Original file from Kevin Grittner, allegedly from Dan Ports.
I had to clean up whitespace a bit per changes from Heikki.
2011-08-25 17:44:56 -03:00
Tom Lane 2e95f1f002 Add "%option warn" to all flex input files that lacked it.
This is recommended in the flex manual, and there seems no good reason
not to use it everywhere.
2011-08-25 13:55:57 -04:00
Alvaro Herrera f18795e7b7 Update FK alternative test output to new whitespace rules
With these changes, the isolation tests pass again on isolation levels
serializable and repeatable read.

Author: Kevin Grittner
2011-08-24 18:24:03 -03:00
Tom Lane 11c88e59a6 Explain max_prepared_transactions requirement in isolation tests' README.
Now that we have a test that requires nondefault settings to pass, it seems
like we'd better mention that detail in the directions about how to run the
tests.

Also do some very minor copy-editing.
2011-08-18 11:45:25 -04:00
Heikki Linnakangas af35737313 Add an SSI regression test that tests all interesting permutations in the
order of begin, prepare, and commit of three concurrent transactions that
have conflicts between them.

The test runs for a quite long time, and the expected output file is huge,
but this test caught some serious bugs during development, so seems
worthwhile to keep. The test uses prepared transactions, so it fails if the
server has max_prepared_transactions=0. Because of that, it's marked as
"ignore" in the schedule file.

Dan Ports
2011-08-18 17:09:58 +03:00
Heikki Linnakangas 62fd1afc55 Strip whitespace from SQL blocks in the isolation test suite. This is purely
cosmetic, it removes a lot of IMHO ugly whitespace from the expected output.
2011-08-18 17:09:58 +03:00
Alvaro Herrera c8dfc89232 Make isolationtester more robust on locked commands
Noah Misch diagnosed the buildfarm problems in the isolation tests
partly as failure to differentiate backends properly; the old code was
using backend IDs, which is not good enough because a new backend might
use an already used ID.  Use PIDs instead.

Also, the code was purposely careless about other concurrent activity,
because it isn't expected; and in fact, it doesn't affect the vast
majority of the time.  However, it can be observed that autovacuum can
block tables for long enough to cause sporadic failures.  The new code
accounts for that by ignoring locks held by processes not explicitly
declared in our spec file.

Author: Noah Misch
2011-07-19 14:22:42 -04:00
Alvaro Herrera d6db0e4e0e Increase deadlock_timeout to 100ms in FK isolation tests
The previous value of 20ms is dangerously close to the time actually
spent just waiting for the deadlock to happen, so on occasion it causes
the test to fail simply because the other session didn't get to run
early enough, not managing to cause the deadlock that needs to be
detected.  With this new value, it's expected that most machines on
normal load will be able to pass the test.

Author: Noah Misch
2011-07-19 13:07:16 -04:00
Alvaro Herrera a0eae1a2ee Add expected regress output on stricter isolation levels
These new files allow the new FK tests on isolationtester to pass on the
serializable and repeatable read isolation levels (which are untested
by the buildfarm).

Author: Kevin Grittner
Reviewed by Noah Misch
2011-07-19 12:43:16 -04:00
Alvaro Herrera d71197cd35 Set different deadlock_timeout on each session in new isolation tests
This provides deterministic deadlock-detection ordering for new
isolation tests, fixing the sporadic failures in them.

Author: Noah Misch
2011-07-15 18:43:33 -04:00
Alvaro Herrera 846af54dd5 Add support for blocked commands in isolationtester
This enables us to test that blocking commands (such as foreign keys
checks that conflict with some other lock) act as intended.  The set of
tests that this adds is pretty minimal, but can easily be extended by
adding new specs.

The intention is that this will serve as a basis for ensuring that
further tweaks of locking implementation preserve (or improve) existing
behavior.

Author: Noah Misch
2011-07-12 17:24:17 -04:00
Tom Lane 1568fa75bc Use single quotes in preference to double quotes for protecting pathnames.
Per recommendation from Peter.  Neither choice is bulletproof, but this
is the existing style and it does help prevent unexpected environment
variable substitution.
2011-06-15 21:45:23 -04:00
Tom Lane a61b6b7d18 Fix assorted issues with build and install paths containing spaces.
Apparently there is no buildfarm critter exercising this case after all,
because it fails in several places.  With this patch, build, install,
check-world, and installcheck-world pass for me on OS X.
2011-06-14 16:40:35 -04:00
Heikki Linnakangas 3103f9a77d The row-version chaining in Serializable Snapshot Isolation was still wrong.
On further analysis, it turns out that it is not needed to duplicate predicate
locks to the new row version at update, the lock on the version that the
transaction saw as visible is enough. However, there was a different bug in
the code that checks for dangerous structures when a new rw-conflict happens.
Fix that bug, and remove all the row-version chaining related code.

Kevin Grittner & Dan Ports, with some comment editorialization by me.
2011-05-30 20:47:17 +03:00
Tom Lane 446d5d32ae Grammar cleanup for src/test/isolation/README
Josh Kupershmidt
2011-05-24 18:52:15 -04:00
Andrew Dunstan b08ddf8c76 Use the right pgsql for isolation tests. 2011-05-22 17:58:26 -04:00
Andrew Dunstan 78b66cff72 Quote isolationtester command name so Windows will not think dot is the command. 2011-05-15 23:42:12 -04:00
Tom Lane 7886cc73ad Remove "make check" target in src/test/isolation/Makefile.
This doesn't work as expected because the isolationtester program requires
libpq to already be installed.  While it works when you've already installed
libpq, having to already have done "make install" defeats most of the point
of a check with a temp installation.  And there are weird corner cases if
the dynamic linker picks up an old libpq.so from system library directories.
Remove the target (or more precisely, make it print a helpful message) so
people don't expect the case to work.
2011-05-09 11:00:30 -04:00
Tom Lane eff223ffd7 Fix some portability issues in isolation regression test driver.
Remove random system #includes in favor of using postgres_fe.h.  (The
alternative to that is letting this module grow its own configuration
testing ability...)

Also fix the "make clean" target to actually clean things up.

Per local testing.
2011-05-08 19:45:00 -04:00
Bruce Momjian bf50caf105 pgindent run before PG 9.1 beta 1. 2011-04-10 11:42:00 -04:00
Heikki Linnakangas 74a09d9210 Fix bugs in the isolation tester flex rules.
Tom Lane pointed out that it was giving a warning: "-s option given but
default rule can be matched". That was because there was no rule to handle
newline in a quoted string. I made that throw an error.

Also, line number tracking was broken, giving incorrect line number on
error. Fixed that too.
2011-03-10 09:06:56 +02:00
Itagaki Takahiro 2d8de0a50b Cleanup copyright years and file names in the header comments of some files. 2011-03-10 15:05:33 +09:00
Tom Lane 174f65ab00 Fix some oversights in distprep and maintainer-clean targets.
At least two recent commits have apparently imagined that a comment in
a Makefile stating that something would be included in the distribution
tarball was sufficient to make it so.  They hadn't bothered to hook
into the upper maintainer-clean targets either.  Per bug #5923 from
Charles Johnson, in which it emerged that the 9.1alpha4 tarballs are
short a few files that should be there.
2011-03-10 00:04:05 -05:00
Alvaro Herrera 61cf7bcdf7 Fix isolation tester Makefile so that it runs in a VPATH build 2011-02-10 19:50:43 -03:00
Alvaro Herrera 289d730655 Fix the isolation tester compilation on VPATH builds 2011-02-10 19:31:39 -03:00
Heikki Linnakangas dafaa3efb7 Implement genuine serializable isolation level.
Until now, our Serializable mode has in fact been what's called Snapshot
Isolation, which allows some anomalies that could not occur in any
serialized ordering of the transactions. This patch fixes that using a
method called Serializable Snapshot Isolation, based on research papers by
Michael J. Cahill (see README-SSI for full references). In Serializable
Snapshot Isolation, transactions run like they do in Snapshot Isolation,
but a predicate lock manager observes the reads and writes performed and
aborts transactions if it detects that an anomaly might occur. This method
produces some false positives, ie. it sometimes aborts transactions even
though there is no anomaly.

To track reads we implement predicate locking, see storage/lmgr/predicate.c.
Whenever a tuple is read, a predicate lock is acquired on the tuple. Shared
memory is finite, so when a transaction takes many tuple-level locks on a
page, the locks are promoted to a single page-level lock, and further to a
single relation level lock if necessary. To lock key values with no matching
tuple, a sequential scan always takes a relation-level lock, and an index
scan acquires a page-level lock that covers the search key, whether or not
there are any matching keys at the moment.

A predicate lock doesn't conflict with any regular locks or with another
predicate locks in the normal sense. They're only used by the predicate lock
manager to detect the danger of anomalies. Only serializable transactions
participate in predicate locking, so there should be no extra overhead for
for other transactions.

Predicate locks can't be released at commit, but must be remembered until
all the transactions that overlapped with it have completed. That means that
we need to remember an unbounded amount of predicate locks, so we apply a
lossy but conservative method of tracking locks for committed transactions.
If we run short of shared memory, we overflow to a new "pg_serial" SLRU
pool.

We don't currently allow Serializable transactions in Hot Standby mode.
That would be hard, because even read-only transactions can cause anomalies
that wouldn't otherwise occur.

Serializable isolation mode now means the new fully serializable level.
Repeatable Read gives you the old Snapshot Isolation level that we have
always had.

Kevin Grittner and Dan Ports, reviewed by Jeff Davis, Heikki Linnakangas and
Anssi Kääriäinen
2011-02-08 00:09:08 +02:00