Commit Graph

36560 Commits

Author SHA1 Message Date
Alvaro Herrera 1a917ae861 Fix race when updating a tuple concurrently locked by another process
If a tuple is locked, and this lock is later upgraded either to an
update or to a stronger lock, and in the meantime some other process
tries to lock, update or delete the same tuple, it (the tuple) could end
up being updated twice, or having conflicting locks held.

The reason for this is that the second updater checks for a change in
Xmax value, or in the HEAP_XMAX_IS_MULTI infomask bit, after noticing
the first lock; and if there's a change, it restarts and re-evaluates
its ability to update the tuple.  But it neglected to check for changes
in lock strength or in lock-vs-update status when those two properties
stayed the same.  This would lead it to take the wrong decision and
continue with its own update, when in reality it shouldn't do so but
instead restart from the top.

This could lead to either an assertion failure much later (when a
multixact containing multiple updates is detected), or duplicate copies
of tuples.

To fix, make sure to compare the other relevant infomask bits alongside
the Xmax value and HEAP_XMAX_IS_MULTI bit, and restart from the top if
necessary.

Also, in the belt-and-suspenders spirit, add a check to
MultiXactCreateFromMembers that a multixact being created does not have
two or more members that are claimed to be updates.  This should protect
against other bugs that might cause similar bogus situations.

Backpatch to 9.3, where the possibility of multixacts containing updates
was introduced.  (In prior versions it was possible to have the tuple
lock upgraded from shared to exclusive, and an update would not restart
from the top; yet we're protected against a bug there because there's
always a sleep to wait for the locking transaction to complete before
continuing to do anything.  Really, the fact that tuple locks always
conflicted with concurrent updates is what protected against bugs here.)

Per report from Andrew Dunstan and Josh Berkus in thread at
http://www.postgresql.org/message-id/534C8B33.9050807@pgexperts.com

Bug analysis by Andres Freund.
2014-04-24 15:41:55 -03:00
Tom Lane d19bd29f07 Reset pg_stat_activity.xact_start during PREPARE TRANSACTION.
Once we've completed a PREPARE, our session is not running a transaction,
so its entry in pg_stat_activity should show xact_start as null, rather
than leaving the value as the start time of the now-prepared transaction.

I think possibly this oversight was triggered by faulty extrapolation
from the adjacent comment that says PrepareTransaction should not call
AtEOXact_PgStat, so tweak the wording of that comment.

Noted by Andres Freund while considering bug #10123 from Maxim Boguk,
although this error doesn't seem to explain that report.

Back-patch to all active branches.
2014-04-24 13:29:48 -04:00
Magnus Hagander b2c9b161b8 Properly build pg_recvlogical in the msvc build system
Michael Paquier
2014-04-24 09:31:29 +02:00
Tom Lane a0f9358149 Fix incorrect pg_proc.proallargtypes entries for two built-in functions.
pg_sequence_parameters() and pg_identify_object() have had incorrect
proallargtypes entries since 9.1 and 9.3 respectively.  This was mostly
masked by the correct information in proargtypes, but a few operations
such as pg_get_function_arguments() (and thus psql's \df display) would
show the wrong data types for these functions' input parameters.

In HEAD, fix the wrong info, bump catversion, and add an opr_sanity
regression test to catch future mistakes of this sort.

In the back branches, just fix the wrong info so that installations
initdb'd with future minor releases will have the right data.  We
can't force an initdb, and it doesn't seem like a good idea to add
a regression test that will fail on existing installations.

Andres Freund
2014-04-23 21:21:05 -04:00
Tom Lane f0fedfe82c Allow polymorphic aggregates to have non-polymorphic state data types.
Before 9.4, such an aggregate couldn't be declared, because its final
function would have to have polymorphic result type but no polymorphic
argument, which CREATE FUNCTION would quite properly reject.  The
ordered-set-aggregate patch found a workaround: allow the final function
to be declared as accepting additional dummy arguments that have types
matching the aggregate's regular input arguments.  However, we failed
to notice that this problem applies just as much to regular aggregates,
despite the fact that we had a built-in regular aggregate array_agg()
that was known to be undeclarable in SQL because its final function
had an illegal signature.  So what we should have done, and what this
patch does, is to decouple the extra-dummy-arguments behavior from
ordered-set aggregates and make it generally available for all aggregate
declarations.  We have to put this into 9.4 rather than waiting till
later because it slightly alters the rules for declaring ordered-set
aggregates.

The patch turned out a bit bigger than I'd hoped because it proved
necessary to record the extra-arguments option in a new pg_aggregate
column.  I'd thought we could just look at the final function's pronargs
at runtime, but that didn't work well for variadic final functions.
It's probably just as well though, because it simplifies life for pg_dump
to record the option explicitly.

While at it, fix array_agg() to have a valid final-function signature,
and add an opr_sanity test to notice future deviations from polymorphic
consistency.  I also marked the percentile_cont() aggregates as not
needing extra arguments, since they don't.
2014-04-23 19:17:41 -04:00
Peter Eisentraut 125ba2945a doc: Fix DocBook table column count declaration
This was broken in 26cd1d7d95.
2014-04-23 16:14:14 -04:00
Peter Eisentraut c18cc0034e ecpg: Add additional files to .gitignore
These are test files added by f917968537.
2014-04-23 13:30:36 -04:00
Heikki Linnakangas a4ad9afec2 Update obsolete comments.
We no longer have a TLI field in the page header.
2014-04-23 14:41:51 +03:00
Heikki Linnakangas 4a781f1e6c Fix typo, trance -> tranche, in docs.
Amit Langote
2014-04-23 13:00:08 +03:00
Heikki Linnakangas 8fbfbf1472 Fix typos in comment. 2014-04-23 12:56:41 +03:00
Heikki Linnakangas 4fafc4ecd9 Cleanup of new b-tree page deletion code.
When marking a branch as half-dead, a pointer to the top of the branch is
stored in the leaf block's hi-key. During normal operation, the high key
was left in place, and the block number was just stored in the ctid field
of the high key tuple, but in WAL replay, the high key was recreated as a
truncated tuple with zero columns. For the sake of easier debugging, also
truncate the tuple in normal operation, so that the page is identical
after WAL replay. Also, rename the 'downlink' field in the WAL record to
'topparent', as that seems like a more descriptive name. And make sure
it's set to invalid when unlinking the leaf page.
2014-04-23 10:19:54 +03:00
Tom Lane d26b042ce5 Fix documentation of FmgrInfo.fn_nargs.
Some ancient comments claimed that fn_nargs could be -1 to indicate a
variable number of input arguments; but this was never implemented, and
is at variance with what we ultimately did with "variadic" functions.
Update the comments.
2014-04-22 23:22:12 -04:00
Tom Lane c6a4ace5bf Fix broken logic in logical_heap_rewrite_flush_mappings().
It's blatantly obvious that commit 4d0d607a45
wasn't tested.  The leak's real enough, though.
2014-04-22 22:33:35 -04:00
Bruce Momjian cee850c403 revert 4d0d607a45
Revert due to contrib/test_decoding regression failure
2014-04-22 22:21:54 -04:00
Bruce Momjian 2362c2bd23 doc: adjust 9970443640 for "null string"
Report by Andrew Dunstan
2014-04-22 20:33:12 -04:00
Bruce Momjian 9970443640 doc: improve wording of COPY commit 7ec73783d8 2014-04-22 19:16:54 -04:00
Bruce Momjian 8506a607a3 doc: mention CREATE MATERIALIZED VIEW AS can be EXPLAINed
Patch by Amit Langote

Report by

Backpatch through
2014-04-22 18:38:14 -04:00
Bruce Momjian 26cd1d7d95 docs: add results for JSON operator examples
Patch by Sehrope Sarkuni
2014-04-22 18:19:07 -04:00
Bruce Momjian 19fa6161dd build: add EXTRA_REGRESS_OPTS to all pg_regress invocations
Patch by Christoph Berg
2014-04-22 18:13:10 -04:00
Bruce Momjian 72590b3a69 docs: clearify use of pg_database.datistemplate
Patch by Rajeev rastogi
2014-04-22 18:10:14 -04:00
Bruce Momjian 4d0d607a45 release memory used while flushing logical mappings
Patch by Ants Aasma
2014-04-22 18:05:44 -04:00
Bruce Momjian c27bf777cf doc: improve CREATE RULE event list
Patch by Fujii Masao

Report by Emanuel Calvo
2014-04-22 17:54:42 -04:00
Bruce Momjian 2985e16031 regression test: fix hot standby tests by using repeatable read
Serializable transactions won't work on a Hot Standby.  Also fix
VACUUM/ANALYZE label mixup.

Patch by Martín Marqués
2014-04-22 17:23:58 -04:00
Bruce Momjian 7ec73783d8 copy: update docs for FORCE_NULL and FORCE_NOT_NULL combination
Also update regression tests

Patch by Michael Paquier
2014-04-22 16:06:37 -04:00
Heikki Linnakangas 4a5d55ec2b Fix bug in the new B-tree incomplete-split code.
Forgot to update LSN of left sibling's page, when creating a new root.
I fixed this for regular insertions and page splits earlier, but missed
new root creation.
2014-04-22 22:40:44 +03:00
Heikki Linnakangas 45e67a2ad7 Fix Gin README.
The README incorrectly claimed that GIN posting tree pages contain an array
of uncompressed items in addition to compressed posting lists. Earlier
versions of the GIN posting list compression patch worked that way, but not
the one that was committed.
2014-04-22 22:39:50 +03:00
Peter Eisentraut 80ce90b9c4 doc: Improve "replication slot" index entries
Now that we have accumulated two different "replication slot" concepts,
make the index entries consistent.
2014-04-22 15:22:10 -04:00
Heikki Linnakangas 77fe2b6d79 Fix bug in new B-tree page deletion code.
When modifying a page, must hold an exclusive lock. A shared lock is
obviously not good enough.
2014-04-22 15:34:54 +03:00
Heikki Linnakangas 7e30c186da Retain original physical order of tuples in redo of b-tree splits.
It makes no difference to the system, but minimizing the differences
between a master and standby makes debugging simpler.
2014-04-22 13:03:37 +03:00
Heikki Linnakangas 7d98054f0d Fix rm_desc routine of b-tree page delete records.
A couple of typos from my refactoring of the page deletion patch.
2014-04-22 13:02:52 +03:00
Heikki Linnakangas 8d34f68628 Avoid transient bogus page contents when creating a sequence.
Don't use simple_heap_insert to insert the tuple to a sequence relation.
simple_heap_insert creates a heap insertion WAL record, and replaying that
will create a regular heap page without the special area containing the
sequence magic constant, which is wrong for a sequence. That was not a bug
because we always created a sequence WAL record after that, and replaying
that overwrote the bogus heap page, and the transient state could never be
seen by another backend because it was only done when creating a new
sequence relation. But it's simpler and cleaner to avoid that in the first
place.
2014-04-22 10:40:23 +03:00
Tom Lane 78a3c9b6a5 pg_stat_statements forgot to let previous occupant of hook get control too.
pgss_post_parse_analyze() neglected to pass the call on to any earlier
occupant of the post_parse_analyze_hook.  There are no other users of that
hook in contrib/, and most likely none in the wild either, so this is
probably just a latent bug.  But it's a bug nonetheless, so back-patch
to 9.2 where this code was introduced.
2014-04-21 13:28:07 -04:00
Robert Haas 602b27ab8e Fix another typo.
Etsuro Fujita
2014-04-20 16:32:57 +02:00
Robert Haas fab6170cab Fix typo.
Etsuro Fujita
2014-04-20 16:30:55 +02:00
Bruce Momjian 012025f9ae doc: CREATE DATABASE doesn't copy template database-level config params
Report by Alexey Bashtanov
2014-04-19 15:26:49 -04:00
Bruce Momjian 0e8beed515 doc: mention archive_command and recovery_command are exec'ed locally
Report by Craig Ringer
2014-04-19 14:59:47 -04:00
Bruce Momjian 4353d1809f docs: tablespaces cannot be accessed independently
Mention impossibility of moving tablespaces, backing them up
independently, or the inadvisability of placing them on temporary
file systems.

Patch by Craig Ringer, adjustments by Ian Lawrence Warwick and me
2014-04-19 10:52:49 -04:00
Bruce Momjian 13ecb822e8 libpq: have PQconnectdbParams() and PQpingParams accept "" as default
Previously, these functions treated "" optin values as defaults in some
ways, but not in others, like when comparing to .pgpass.  Also, add
documentation to clarify that now "" and NULL use defaults, like
PQsetdbLogin() has always done.

BACKWARD INCOMPATIBILITY

Patch by Adrian Vondendriesch, docs by me

Report by Jeff Janes
2014-04-19 08:41:51 -04:00
Magnus Hagander 66b1084e2c Fix typo
Amit Langote
2014-04-18 12:49:54 +02:00
Peter Eisentraut e7128e8dbb Create function prototype as part of PG_FUNCTION_INFO_V1 macro
Because of gcc -Wmissing-prototypes, all functions in dynamically
loadable modules must have a separate prototype declaration.  This is
meant to detect global functions that are not declared in header files,
but in cases where the function is called via dfmgr, this is redundant.
Besides filling up space with boilerplate, this is a frequent source of
compiler warnings in extension modules.

We can fix that by creating the function prototype as part of the
PG_FUNCTION_INFO_V1 macro, which such modules have to use anyway.  That
makes the code of modules cleaner, because there is one less place where
the entry points have to be listed, and creates an additional check that
functions have the right prototype.

Remove now redundant prototypes from contrib and other modules.
2014-04-18 00:03:19 -04:00
Tom Lane 0156315823 Fix unused-variable warning on Windows.
Introduced in 585bca39: msgid is not used in the Windows code path.

Also adjust comments a tad (mostly to keep pgindent from messing it up).

David Rowley
2014-04-17 16:12:24 -04:00
Bruce Momjian 9fe55259fd pgcrypto: fix memset() calls that might be optimized away
Specifically, on-stack memset() might be removed, so:

	* Replace memset() with px_memset()
	* Add px_memset to copy_crlf()
	* Add px_memset to pgp-s2k.c

Patch by Marko Kreen

Report by PVS-Studio

Backpatch through 8.4.
2014-04-17 12:37:53 -04:00
Bruce Momjian 83defef8c7 report stat() error in trigger file check
Permissions might prevent the existence of the trigger file from being
checked.

Per report from Andres Freund
2014-04-17 11:55:57 -04:00
Bruce Momjian c1275cf741 pg_upgrade: throw an error for non-existent tablespace directories
Non-existent tablespace directory references can occur if user
tablespaces are created inside data directories and the data directory
is renamed in preparation for running pg_upgrade, and the symbolic links
are not updated.

Backpatch to 9.3.
2014-04-17 11:42:21 -04:00
Bruce Momjian 52e757420f docs: adjustments for streaming standbys that disconnect frequently
Document problems when disconnection causes loss of hot_standby_feedback
and suggest adjusting max_standby_archive_delay and
max_standby_streaming_delay.

Initial patch by Marko Tiikkaja, adjustments by me
2014-04-17 10:52:48 -04:00
Heikki Linnakangas 2a8e1ac598 Set the all-visible flag on heap page before writing WAL record, not after.
If we set the all-visible flag after writing WAL record, and XLogInsert
takes a full-page image of the page, the image would not include the flag.
We will then proceed to set the VM bit, which would then be set without the
corresponding all-visible flag on the heap page.

Found by comparing page images on master and standby, after writing/replaying
each WAL record. (There is still a discrepancy: the all-visible flag won't
be set after replaying the HEAP_CLEAN record, even though it is set in the
master. However, it will be set when replaying the HEAP2_VISIBLE record and
setting the VM bit, so the all-visible flag and VM bit are always consistent
on the standby, even though they are momentarily out-of-sync with master)

Backpatch to 9.3 where this code was introduced.
2014-04-17 17:47:50 +03:00
Tom Lane 5f86cbd714 Rename EXPLAIN ANALYZE's "total runtime" output to "execution time".
Now that EXPLAIN also outputs a "planning time" measurement, the use of
"total" here seems rather confusing: it sounds like it might include the
planning time which of course it doesn't.  Majority opinion was that
"execution time" is a better label, so we'll call it that.

This should be noted as a backwards incompatibility for tools that examine
EXPLAIN ANALYZE output.

In passing, I failed to resist the temptation to do a little editing on the
materialized-view example affected by this change.
2014-04-16 20:48:59 -04:00
Bruce Momjian e183d11262 docs: properly document psql auto encoding mode
In psql, both stdin and stdout must be terminals to get a client
encoding of 'auto'.

Patch by Albe Laurenz

Backpatch to 9.3.
2014-04-16 19:53:42 -04:00
Bruce Momjian 5d305d86bd libpq: use pgsocket for socket values, for portability
Previously, 'int' was used for socket values in libpq, but socket values
are unsigned on Windows.  This is a style correction.

Initial patch and previous PGINVALID_SOCKET initial patch by Joel
Jacobson, modified by me

Report from PVS-Studio
2014-04-16 19:46:51 -04:00
Bruce Momjian be5f7fff47 doc: move min_recovery_apply_delay into the right section
Patch by Fujii Masao
2014-04-16 19:15:16 -04:00