2027 Commits

Author SHA1 Message Date
Michael Paquier 1fb6f62a84 Fix typos in various places
Author: Andrea Gelmini
Reviewed-by: Michael Paquier, Justin Pryzby
Discussion: https://postgr.es/m/20190528181718.GA39034@glet
2019-06-03 13:44:03 +09:00
Alvaro Herrera d890fa812d Make one message just like all its siblings. 2019-05-28 23:44:22 -04:00
Thomas Munro 7988cb446d Fix typos.
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CA%2BhUKGJFWXmtYo6Frd77RR8YXCHz7hJ2mRy5aHV%3D7fJOqDnBHA%40mail.gmail.com
2019-05-24 12:00:59 +12:00
Tom Lane 8255c7a5ee Phase 2 pgindent run for v12.
Switch to 2.1 version of pg_bsd_indent.  This formats
multiline function declarations "correctly", that is with
additional lines of parameter declarations indented to match
where the first line's left parenthesis is.

Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
2019-05-22 13:04:48 -04:00
Tom Lane be76af171c Initial pgindent run for v12.
This is still using the 2.0 version of pg_bsd_indent.
I thought it would be good to commit this separately,
so as to document the differences between 2.0 and 2.1 behavior.

Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
2019-05-22 12:55:34 -04:00
Tom Lane 93f03dad82 Make BufFileCreateTemp() ensure that temp tablespaces are set up.
If PrepareTempTablespaces() has never been called in the current
transaction, OpenTemporaryFile() will fall back to using the default
tablespace, which is a bug if the user wanted temp files placed elsewhere.
gistInitBuildBuffers() appears to have this disease already, and it
seems like an easy trap for future coders to fall into.

We discussed other ways to close this gap, but none of them are prettier
or more reliable than just having BufFileCreateTemp do it.  In particular,
having fd.c do this creates layering issues that we could do without.

Per suggestion from Melanie Plageman.  Arguably this is a bug fix, but
nobody seems very excited about back-patching, so change in HEAD only.

Discussion: https://postgr.es/m/CAAKRu_YwzjuGAmmaw4-8XO=OVFGR1QhY_Pq-t3wjb9ribBJb_Q@mail.gmail.com
2019-05-18 13:51:16 -04:00
Andres Freund 7f44ede594 tableam: Don't assume that every AM uses md.c style storage.
Previously various parts of the code routed size requests through
RelationGetNumberOfBlocks[InFork]. That works if md.c is used by the
AM, but not otherwise.

Add a tableam callback to return the size of the table. As not every
AM will use postgres' BLCKSZ, have it return bytes, and have
RelationGetNumberOfBlocksInFork() round the byte size up into blocks.

To allow code outside of the AM to determine the actual relation size,
map InvalidForkNumber to the total size of a relation, as not every AM
might just need the postgres defined forks.

A few users of RelationGetNumberOfBlocks() ought to be converted away
from that. One case, the use of it to determine whether a tid is
valid, will be fixed in a follow up commit. Others will have to wait
for v13.

Author: Andres Freund
Discussion: https://postgr.es/m/20190423225201.3bbv6tbqzkb5w7cw@alap3.anarazel.de
2019-05-17 18:56:47 -07:00
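The byte-to-block rounding that commit 7f44ede594 describes for RelationGetNumberOfBlocksInFork() can be illustrated with a minimal sketch. The names `SKETCH_BLCKSZ` and `sketch_bytes_to_blocks` are hypothetical and the 8192-byte block size is an assumption; this is not the code from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed block size for this sketch; PostgreSQL's BLCKSZ is fixed at build time. */
#define SKETCH_BLCKSZ 8192u

/*
 * Convert a size in bytes into a count of blocks, rounding up so that a
 * partially filled trailing block is still counted.
 */
static uint64_t
sketch_bytes_to_blocks(uint64_t nbytes)
{
	/* Guard against overflow in the addition below. */
	assert(nbytes <= UINT64_MAX - (SKETCH_BLCKSZ - 1));
	return (nbytes + (SKETCH_BLCKSZ - 1)) / SKETCH_BLCKSZ;
}
```

An AM that reports sizes in bytes can then be queried by block-oriented callers without either side assuming md.c-style storage.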
Peter Geoghegan ae7291acbc Standardize ItemIdData terminology.
The term "item pointer" should not be used to refer to ItemIdData
variables, since that is needlessly ambiguous.  Only
ItemPointerData/ItemPointer variables should be called item pointers.

To fix, establish the convention that ItemIdData variables should always
be referred to either as "item identifiers" or "line pointers".  The
term "item identifier" already predominates in docs and translatable
messages, and so should be the preferred alternative there.

Discussion: https://postgr.es/m/CAH2-Wz=c=MZQjUzde3o9+2PLAPuHTpVZPPdYxN=E4ndQ2--8ew@mail.gmail.com
2019-05-13 15:53:39 -07:00
Thomas Munro 47a338cfcd Fix SxactGlobalXmin tracking.
Commit bb16aba50 broke the code that maintains SxactGlobalXmin.  It
could get stuck when a well-timed READ ONLY transaction runs.  If
SxactGlobalXmin stops advancing, transactions on the
FinishedSerializableTransactions queue are never cleaned up, so
resources are effectively leaked.  Revert that hunk of the commit.

Also revert another similar hunk that was probably harmless, but
unnecessary and unjustified, relating to the DOOMED flag in case of
RO_SAFE early release.

Author: Thomas Munro
Reported-by: Tom Lane
Discussion: https://postgr.es/m/16170.1557251214%40sss.pgh.pa.us
2019-05-09 20:32:26 +12:00
Amit Kapila 7db0cde6b5 Revert "Avoid the creation of the free space map for small heap relations".
This feature was using a process local map to track the first few blocks
in the relation.  The map was reset each time we got a block with enough
free space.  It was discussed that it would be better to track this map on
a per-relation basis in relcache and then invalidate it whenever
vacuum frees up some space in the page or when the FSM is created.  The new
design would be better both in terms of API design and performance.

List of commits reverted, in reverse chronological order:

06c8a5090e  Improve code comments in b0eaa4c51b.
13e8643bfc  During pg_upgrade, conditionally skip transfer of FSMs.
6f918159a9  Add more tests for FSM.
9c32e4c350  Clear the local map when not used.
29d108cdec  Update the documentation for FSM behavior.
08ecdfe7e5  Make FSM test portable.
b0eaa4c51b  Avoid creation of the free space map for small heap relations.

Discussion: https://postgr.es/m/20190416180452.3pm6uegx54iitbt5@alap3.anarazel.de
2019-05-07 09:30:24 +05:30
Fujii Masao 978b032d1f Fix function names in comments.
Commit 3eb77eba5a renamed some functions, but forgot to
update some comments referring to those functions.
This commit fixes those function names in the comments.

Kyotaro Horiguchi
2019-04-25 23:43:48 +09:00
Alvaro Herrera 0a999e1290 Unify error messages
... for translatability purposes.
2019-04-24 09:26:13 -04:00
Michael Paquier 47ac2033d4 Simplify some ERROR paths clearing wait events and transient files
Transient files and wait events normally get cleaned up when seeing an
exception (be it in the context of a transaction for a backend or
another process like the checkpointer), hence there is little point in
complicating error code paths to do this work.  This shaves a bit of
code, and removes some extra handling with errno which needed to be
preserved during the cleanup steps done.

Reported-by: Masahiko Sawada
Author: Michael Paquier
Reviewed-by: Tom Lane, Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoDhHYVq5KkXfkaHhmjA-zJYj-e4teiRAJefvXuKJz1tKQ@mail.gmail.com
2019-04-17 09:51:45 +09:00
Noah Misch c098509927 Consistently test for in-use shared memory.
postmaster startup scrutinizes any shared memory segment recorded in
postmaster.pid, exiting if that segment matches the current data
directory and has an attached process.  When the postmaster.pid file was
missing, a starting postmaster used weaker checks.  Change to use the
same checks in both scenarios.  This increases the chance of a startup
failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1
postmaster.pid` && rm postmaster.pid && pg_ctl -w start".  A postmaster
will no longer stop if shmat() of an old segment fails with EACCES.  A
postmaster will no longer recycle segments pertaining to other data
directories.  That's good for production, but it's bad for integration
tests that crash a postmaster and immediately delete its data directory.
Such a test now leaks a segment indefinitely.  No "make check-world"
test does that.  win32_shmem.c already avoided all these problems.  In
9.6 and later, enhance PostgresNode to facilitate testing.  Back-patch
to 9.4 (all supported versions).

Reviewed (in earlier versions) by Daniel Gustafsson and Kyotaro HORIGUCHI.

Discussion: https://postgr.es/m/20190408064141.GA2016666@rfd.leadboat.com
2019-04-12 22:36:38 -07:00
Noah Misch 82150a05be Revert "Consistently test for in-use shared memory."
This reverts commits 2f932f71d9,
16ee6eaf80 and
6f0e190056.  The buildfarm has revealed
several bugs.  Back-patch like the original commits.

Discussion: https://postgr.es/m/20190404145319.GA1720877@rfd.leadboat.com
2019-04-05 00:00:52 -07:00
Thomas Munro 794c543b17 Fix bugs in mdsyncfiletag().
Commit 3eb77eba moved a _mdfd_getseg() call from mdsync() into a new
callback function mdsyncfiletag(), but didn't get the arguments quite
right.  Without the EXTENSION_DONT_CHECK_SIZE flag we fail to open a
segment if lower-numbered segments have been truncated, and it wants
a block number rather than a segment number.

While comparing with the older coding, also remove an unnecessary
clobbering of errno, and adjust the code in mdunlinkfiletag() to
resemble the original code from mdpostckpt() more closely instead
of using an unnecessary call to smgropen().

Author: Thomas Munro
Discussion: https://postgr.es/m/CA%2BhUKGL%2BYLUOA0eYiBXBfwW%2BbH5kFgh94%3DgQH0jHEJ-t5Y91wQ%40mail.gmail.com
2019-04-05 17:41:58 +13:00
Thomas Munro 3eb77eba5a Refactor the fsync queue for wider use.
Previously, md.c and checkpointer.c were tightly integrated so that
fsync calls could be handed off and processed in the background.
Introduce a system of callbacks and file tags, so that other modules
can hand off fsync work in the same way.

For now only md.c uses the new interface, but other users are being
proposed.  Since there may be use cases that are not strictly SMGR
implementations, use a new function table for sync handlers rather
than extending the traditional SMGR one.

Instead of using a bitmapset of segment numbers for each RelFileNode
in the checkpointer's hash table, make the segment number part of the
key.  This requires sending explicit "forget" requests for every
segment individually when relations are dropped, but suits the file
layout schemes of proposed future users better (ie sparse or high
segment numbers).

Author: Shawn Debnath and Thomas Munro
Reviewed-by: Thomas Munro, Andres Freund
Discussion: https://postgr.es/m/CAEepm=2gTANm=e3ARnJT=n0h8hf88wqmaZxk0JYkxw+b21fNrw@mail.gmail.com
2019-04-04 23:38:38 +13:00
Noah Misch 2f932f71d9 Consistently test for in-use shared memory.
postmaster startup scrutinizes any shared memory segment recorded in
postmaster.pid, exiting if that segment matches the current data
directory and has an attached process.  When the postmaster.pid file was
missing, a starting postmaster used weaker checks.  Change to use the
same checks in both scenarios.  This increases the chance of a startup
failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1
postmaster.pid` && rm postmaster.pid && pg_ctl -w start".  A postmaster
will no longer recycle segments pertaining to other data directories.
That's good for production, but it's bad for integration tests that
crash a postmaster and immediately delete its data directory.  Such a
test now leaks a segment indefinitely.  No "make check-world" test does
that.  win32_shmem.c already avoided all these problems.  In 9.6 and
later, enhance PostgresNode to facilitate testing.  Back-patch to 9.4
(all supported versions).

Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI.

Discussion: https://postgr.es/m/20130911033341.GD225735@tornado.leadboat.com
2019-04-03 17:03:46 -07:00
Alvaro Herrera e8abf97af7 Prevent use of uninitialized variable
Per buildfarm member longfin.
2019-04-02 16:03:26 -03:00
Alvaro Herrera ab0dfc961b Report progress of CREATE INDEX operations
This uses the progress reporting infrastructure added by c16dc1aca5,
adding support for CREATE INDEX and CREATE INDEX CONCURRENTLY.

There are two pieces to this: one is index-AM-agnostic, and the other is
AM-specific.  The latter is fairly elaborate for btrees, including
reportage for parallel index builds and the separate phases that btree
index creation uses; other index AMs, which are much simpler in their
building procedures, have simplistic reporting only, but that seems
sufficient, at least for non-concurrent builds.

The index-AM-agnostic part is fairly complete, providing insight into
the CONCURRENTLY wait phases as well as block-based progress during the
index validation table scan.  (The index validation index scan requires
patching each AM, which has not been included here.)

Reviewers: Rahila Syed, Pavan Deolasee, Tatsuro Yamada
Discussion: https://postgr.es/m/20181220220022.mg63bhk26zdpvmcj@alvherre.pgsql
2019-04-02 15:18:08 -03:00
Thomas Munro 2fc7af5e96 Add basic infrastructure for 64 bit transaction IDs.
Instead of inferring epoch progress from xids and checkpoints,
introduce a 64 bit FullTransactionId type and use it to track xid
generation.  This fixes an unlikely bug where the epoch is reported
incorrectly if the range of active xids wraps around more than once
between checkpoints.

The only user-visible effect of this commit is to correct the epoch
used by txid_current() and txid_status(), also visible with
pg_controldata, in those rare circumstances.  It also creates some
basic infrastructure so that later patches can use 64 bit
transaction IDs in more places.

The new type is a struct that we pass by value, as a form of strong
typedef.  This prevents the sort of accidental confusion between
TransactionId and FullTransactionId that would be possible if we
were to use a plain old uint64.

Author: Thomas Munro
Reported-by: Amit Kapila
Reviewed-by: Andres Freund, Tom Lane, Heikki Linnakangas
Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com
2019-03-28 18:12:20 +13:00
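The "strong typedef" idea in commit 2fc7af5e96 can be sketched as follows: wrapping the 64-bit value in a struct makes the compiler reject accidental mixing with a plain 32-bit xid. All `Sketch*` and `sketch_*` names are hypothetical, and placing the epoch in the high 32 bits is an assumption made for this illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit transaction ID. */
typedef uint32_t SketchXid;

/*
 * Hypothetical stand-in for a 64-bit transaction ID.  Because it is a
 * struct, it cannot be silently assigned to or from an integer type.
 */
typedef struct SketchFullXid
{
	uint64_t	value;
} SketchFullXid;

/* Build a 64-bit ID from an epoch (high half) and a 32-bit xid (low half). */
static inline SketchFullXid
sketch_full_xid_from_epoch_and_xid(uint32_t epoch, SketchXid xid)
{
	SketchFullXid result;

	result.value = ((uint64_t) epoch << 32) | xid;
	return result;
}

/* Extract the epoch (high 32 bits). */
static inline uint32_t
sketch_epoch_from_full_xid(SketchFullXid fxid)
{
	return (uint32_t) (fxid.value >> 32);
}

/* Extract the 32-bit xid (low 32 bits). */
static inline SketchXid
sketch_xid_from_full_xid(SketchFullXid fxid)
{
	return (SketchXid) fxid.value;
}
```

With a plain uint64, passing a 32-bit xid where a 64-bit one is expected would compile without complaint; with the struct it is a type error.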
Tomas Vondra 6ca015f9f0 Track unowned relations in doubly-linked list
Relations dropped in a single transaction are tracked in a list of
unowned relations.  With a large number of dropped relations this resulted
in poor performance at the end of a transaction, when the relations are
removed from the singly linked list one by one.

Commit b4166911 attempted to address this issue (particularly when it
happens during recovery) by removing the relations in a reverse order,
resulting in O(1) lookups in the list of unowned relations.  This did
not work reliably, though, and it was possible to trigger the O(N^2)
behavior in various ways.

Instead of trying to remove the relations in a specific order with
respect to the linked list, which seems rather fragile, switch to a
regular doubly linked list.  That allows us to remove relations cheaply no
matter where in the list they are.

As b4166911 was a bugfix, backpatched to all supported versions, do the
same thing here.

Reviewed-by: Alvaro Herrera
Discussion: https://www.postgresql.org/message-id/flat/80c27103-99e4-1d0c-642c-d9f3b94aaa0a%402ndquadrant.com
Backpatch-through: 9.4
2019-03-27 02:39:39 +01:00
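Why a doubly linked list removes the ordering problem described in commit 6ca015f9f0 can be shown with a minimal sketch. Each node knows both neighbours, so unlinking never requires a scan from the head. The `SketchNode` type and `sketch_list_*` functions are hypothetical and are not PostgreSQL's own list API.

```c
#include <stddef.h>

/* Hypothetical list node; the list is circular, with the head as sentinel. */
typedef struct SketchNode
{
	struct SketchNode *prev;
	struct SketchNode *next;
} SketchNode;

/* An empty list is a head that points to itself in both directions. */
static void
sketch_list_init(SketchNode *head)
{
	head->prev = head;
	head->next = head;
}

/* Append a node just before the sentinel, i.e. at the tail. */
static void
sketch_list_push_tail(SketchNode *head, SketchNode *node)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Unlink a node in O(1), regardless of its position in the list. */
static void
sketch_list_delete(SketchNode *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = NULL;
	node->next = NULL;
}
```

With a singly linked list, the same removal needs the predecessor, which costs a walk from the head unless the removal order happens to match the list order.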
Peter Eisentraut 481018f280 Add macro to cast away volatile without allowing changes to underlying type
This adds unvolatize(), which works just like unconstify() but for volatile.

Discussion: https://www.postgresql.org/message-id/flat/7a5cbea7-b8df-e910-0f10-04014bcad701%402ndquadrant.com
2019-03-25 09:37:03 +01:00
Amit Kapila 06c8a5090e Improve code comments in b0eaa4c51b.
Author: John Naylor
Discussion: https://postgr.es/m/CACPNZCswjyGJxTT=mxHgK=Z=mJ9uJ4WEx_UO=bNwpR_i0EaHHg@mail.gmail.com
2019-03-16 06:55:56 +05:30
Thomas Munro bb16aba50c Enable parallel query with SERIALIZABLE isolation.
Previously, the SERIALIZABLE isolation level prevented parallel query
from being used.  Allow the two features to be used together by
sharing the leader's SERIALIZABLEXACT with parallel workers.

An extra per-SERIALIZABLEXACT LWLock is introduced to make it safe to
share, and new logic is introduced to coordinate the early release
of the SERIALIZABLEXACT required for the SXACT_FLAG_RO_SAFE
optimization, as follows:

The first backend to observe the SXACT_FLAG_RO_SAFE flag (set by
some other transaction) will 'partially release' the SERIALIZABLEXACT,
meaning that the conflicts and locks it holds are released, but the
SERIALIZABLEXACT itself will remain active because other backends
might still have a pointer to it.

Whenever any backend notices the SXACT_FLAG_RO_SAFE flag, it clears
its own MySerializableXact variable and frees local resources so that
it can skip SSI checks for the rest of the transaction.  In the
special case of the leader process, it transfers the SERIALIZABLEXACT
to a new variable SavedSerializableXact, so that it can be completely
released at the end of the transaction after all workers have exited.

Remove the serializable_okay flag added to CreateParallelContext() by
commit 9da0cc35, because it's now redundant.

Author: Thomas Munro
Reviewed-by: Haribabu Kommi, Robert Haas, Masahiko Sawada, Kevin Grittner
Discussion: https://postgr.es/m/CAEepm=0gXGYhtrVDWOTHS8SQQy_=S9xo+8oCxGLWZAOoeJ=yzQ@mail.gmail.com
2019-03-15 17:47:04 +13:00
Alvaro Herrera af38498d4c Move hash_any prototype from access/hash.h to utils/hashutils.h
... as well as its implementation from backend/access/hash/hashfunc.c to
backend/utils/hash/hashfn.c.

access/hash is the place for the hash index AM, not really appropriate
for generic facilities, which is what hash_any is; having things the old
way meant that anything using hash_any had to include the AM's include
file, pointlessly polluting its namespace with unrelated, unnecessary
cruft.

Also move the HTEqual strategy number to access/stratnum.h from
access/hash.h.

To avoid breaking third-party extension code, add an #include
"utils/hashutils.h" to access/hash.h.  (An easily removed line by
committers who enjoy their asbestos suits to protect them from angry
extension authors.)

Discussion: https://postgr.es/m/201901251935.ser5e4h6djt2@alvherre.pgsql
2019-03-11 13:17:50 -03:00
Magnus Hagander 6b9e875f72 Track block level checksum failures in pg_stat_database
This adds a column that counts how many checksum failures have occurred
on files belonging to a specific database. Both checksum failures
during normal backend processing and those created when a base backup
detects a checksum failure are counted.

Author: Magnus Hagander
Reviewed by: Julien Rouhaud
2019-03-09 10:47:30 -08:00
Michael Paquier 82a5649fb9 Tighten use of OpenTransientFile and CloseTransientFile
This fixes two sets of issues related to the use of transient files in
the backend:
1) OpenTransientFile() has been used in some code paths with read-write
flags while read-only is sufficient, so switch those calls to be
read-only where necessary.  These have been reported by Joe Conway.
2) When opening transient files, it is up to the caller to close the
file descriptors opened.  In error code paths, CloseTransientFile() gets
called to clean up things before issuing an error.  However in normal
exit paths, a lot of callers of CloseTransientFile() never actually
reported errors, which could leave a file descriptor open without
knowing about it.  This is an issue I complained about a couple of
times, but never had the courage to write and submit a patch, so here we
go.

Note that one frontend code path is also affected by this commit: an
error is now issued when fetching control file data, so that backend and
frontend are treated consistently.

Reported-by: Joe Conway, Michael Paquier
Author: Michael Paquier
Reviewed-by: Álvaro Herrera, Georgios Kokolatos, Joe Conway
Discussion: https://postgr.es/m/20190301023338.GD1348@paquier.xyz
Discussion: https://postgr.es/m/c49b69ec-e2f7-ff33-4f17-0eaa4f2cef27@joeconway.com
2019-03-09 08:50:55 +09:00
Thomas Munro 91595f9d49 Drop the vestigial "smgr" type.
Before commit 3fa2bb31 this type appeared in the catalogs to
select which of several block storage mechanisms each relation
used.

New features under development propose to revive the concept of
different block storage managers for new kinds of data accessed
via bufmgr.c, but don't need to put references to them in the
catalogs.  So, avoid useless maintenance work on this type by
dropping it.  Update some regression tests that were referencing
it where any type would do.

Discussion: https://postgr.es/m/CA%2BhUKG%2BDE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo%2B83vvw%40mail.gmail.com
2019-03-07 15:44:04 +13:00
Peter Eisentraut 278584b526 Remove volatile from latch API
This was no longer useful since the latch functions use memory
barriers already, which are also compiler barriers, and volatile does
not help with cross-process access.

Discussion: https://www.postgresql.org/message-id/flat/20190218202511.qsfpuj5sy4dbezcw%40alap3.anarazel.de#18783c27d73e9e40009c82f6e0df0974
2019-03-04 11:30:41 +01:00
Amit Kapila 9c32e4c350 Clear the local map when not used.
After commit b0eaa4c51b, we use a local map of pages to find the required
space for small relations.  We do clear this map when we have found a block
with enough free space, when we extend the relation, or on transaction
abort so that it can be used next time.  However, we failed to clear it when
we didn't find any pages to try from the map, which led to an assertion
failure when we later tried to use it after relation extension.

In passing, I have improved some comments in this area.

Reported-by: Tom Lane based on buildfarm results
Author: Amit Kapila
Reviewed-by: John Naylor
Tested-by: Kuntal Ghosh
Discussion: https://postgr.es/m/32368.1551114120@sss.pgh.pa.us
2019-03-01 07:38:47 +05:30
Michael Paquier effe7d9552 Make release of 2PC identifier and locks consistent in COMMIT PREPARED
When preparing a transaction in two-phase commit, a dummy PGPROC entry
holding the GID used for the transaction is registered, which gets
released once COMMIT PREPARED is run.  Prior to releasing its shared memory
state, all the locks taken in the prepared transaction are released
using a dedicated set of callbacks (pgstat and multixact having similar
callbacks), which may cause the locks to be released before the GID is
set free.

Hence, there is a small window where lock conflicts could happen, for
example:
- Transaction A releases its locks, still holding its GID in shared
memory.
- Transaction B held a lock which conflicted with locks of transaction
A.
- Transaction B continues its processing, reusing the same GID as
transaction A.
- Transaction B fails because of a conflicting GID, already in use by
transaction A.

This commit changes the shared memory state release so that post-commit
callbacks and predicate lock cleanup happen consistently with the shared
memory state cleanup for the dummy PGPROC entry.  The race window is
small and 2PC had this issue from the start, so no backpatch is done.
On top of that, the fixes discussed involved ABI breakages, which are not
welcome in stable branches.

Reported-by: Oleksii Kliukin, Ildar Musin
Diagnosed-by: Oleksii Kliukin, Ildar Musin
Author: Michael Paquier
Reviewed-by: Masahiko Sawada, Oleksii Kliukin
Discussion: https://postgr.es/m/BF9B38A4-2BFF-46E8-BA87-A2D00A8047A6@hintbits.com
2019-02-25 14:19:34 +09:00
Thomas Munro f16735d80d Tolerate EINVAL when calling fsync() on a directory.
Previously, we tolerated EBADF as a way for the operating system to
indicate that it doesn't support fsync() on a directory.  Tolerate
EINVAL too, for older versions of Linux CIFS.

Bug #15636.  Back-patch all the way.

Reported-by: John Klann
Discussion: https://postgr.es/m/15636-d380890dafd78fc6@postgresql.org
2019-02-24 23:50:20 +13:00
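The tolerance rule from commit f16735d80d can be sketched as a small helper that treats EBADF and EINVAL from fsync() on a directory as "not supported here" rather than as a failure. The function names are hypothetical, and the error handling is simplified compared with what a server would do.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Errors taken to mean "this OS or filesystem cannot fsync a directory". */
static int
sketch_dir_fsync_errno_is_tolerable(int err)
{
	return err == EBADF || err == EINVAL;
}

/*
 * Flush a directory to disk.  Returns 0 on success or when the failure is
 * one of the tolerated errors; returns -1 with errno set otherwise.
 */
static int
sketch_fsync_dir(const char *path)
{
	int			fd;
	int			rc;
	int			saved_errno;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	rc = fsync(fd);
	saved_errno = errno;		/* close() may overwrite errno */
	close(fd);

	if (rc != 0 && !sketch_dir_fsync_errno_is_tolerable(saved_errno))
	{
		errno = saved_errno;
		return -1;
	}
	return 0;
}
```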
Thomas Munro 483520eca4 Tolerate ENOSYS failure from sync_file_range().
One unintended consequence of commit 9ccdd7f6 was that Windows WSL
users started getting a panic whenever we tried to initiate data
flushing with sync_file_range(), because WSL does not implement that
system call.  Previously, they got a stream of periodic warnings,
which was also undesirable but at least ignorable.

Prevent the panic by handling ENOSYS specially and skipping the panic
promotion with data_sync_elevel().  Also suppress future attempts
after the first such failure so that the pre-existing problem of
noisy warnings is improved.

Back-patch to 9.6 (older branches were not affected in this way by
9ccdd7f6).

Author: Thomas Munro and James Sewell
Tested-by: James Sewell
Reported-by: Bruce Klein
Discussion: https://postgr.es/m/CA+mCpegfOUph2U4ZADtQT16dfbkjjYNJL1bSTWErsazaFjQW9A@mail.gmail.com
2019-02-24 22:37:20 +13:00
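The "handle ENOSYS once, then stop trying" behaviour from commit 483520eca4 can be sketched like this. To keep the example portable, the flush call is passed in as a function pointer, and `sketch_stub_enosys` stands in for sync_file_range() on a platform that does not implement it. Every name here is hypothetical, and the real code additionally decides which log level to use.

```c
#include <errno.h>
#include <stdbool.h>

typedef enum SketchFlushResult
{
	SKETCH_FLUSH_OK,			/* flush call succeeded */
	SKETCH_FLUSH_SKIPPED,		/* call unsupported; not treated as an error */
	SKETCH_FLUSH_FAILED			/* real failure; caller should report it */
} SketchFlushResult;

typedef int (*SketchFlushFn) (int fd);

/* Set after the first ENOSYS so later calls skip the system call entirely. */
static bool sketch_flush_unsupported = false;

/* Counts how often the stub below was actually invoked. */
static int	sketch_stub_calls = 0;

/* Stand-in for a flush system call that the platform does not implement. */
static int
sketch_stub_enosys(int fd)
{
	(void) fd;
	sketch_stub_calls++;
	errno = ENOSYS;
	return -1;
}

static SketchFlushResult
sketch_flush(int fd, SketchFlushFn flush_fn)
{
	if (sketch_flush_unsupported)
		return SKETCH_FLUSH_SKIPPED;

	if (flush_fn(fd) == 0)
		return SKETCH_FLUSH_OK;

	if (errno == ENOSYS)
	{
		/* Remember the result so we neither escalate nor retry. */
		sketch_flush_unsupported = true;
		return SKETCH_FLUSH_SKIPPED;
	}
	return SKETCH_FLUSH_FAILED;
}
```

Only the first call reaches the stub; subsequent calls return early, which is what suppresses the repeated warnings mentioned in the commit message.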
Thomas Munro 0b55aaacec Fix race in dsm_unpin_segment() when handles are reused.
Teach dsm_unpin_segment() to skip segments that are in the process
of being destroyed by another backend, when searching for a handle.
Such a segment cannot possibly be the one we are looking for, even
if its handle matches.  Another slot might hold a recently created
segment that has the same handle value by coincidence, and we need
to keep searching for that one.

The bug caused rare "cannot unpin a segment that is not pinned"
errors on 10 and 11.  Similar to commit 6c0fb941 for dsm_attach().

Back-patch to 10, where dsm_unpin_segment() landed.

Author: Thomas Munro
Reported-by: Justin Pryzby
Tested-by: Justin Pryzby (along with other recent DSA/DSM fixes)
Discussion: https://postgr.es/m/20190216023854.GF30291@telsasoft.com
2019-02-18 09:58:29 +13:00
Thomas Munro 6c0fb94189 Fix race in dsm_attach() when handles are reused.
DSM handle values can be reused as soon as the underlying shared memory
object has been destroyed.  That means that for a brief moment we
might have two DSM slots with the same handle.  While trying to attach,
if we encounter a slot with refcnt == 1, meaning that it is currently
being destroyed, we should continue our search in case the same handle
exists in another slot.

The race manifested as a rare "dsa_area could not attach to segment"
error, and was more likely in 10 and 11 due to the lack of distinct
seed for random() in parallel workers.  It was made very unlikely in
master by commit 197e4af9, and older releases don't usually create
new DSM segments in background workers so it was also unlikely there.

This fixes the root cause of bug report #15585, in which the error
could also sometimes result in a self-deadlock in the error path.
It's not yet clear if further changes are needed to avoid that failure
mode.

Back-patch to 9.4, where dsm.c arrived.

Author: Thomas Munro
Reported-by: Justin Pryzby, Sergei Kornilov
Discussion: https://postgr.es/m/20190207014719.GJ29720@telsasoft.com
Discussion: https://postgr.es/m/15585-324ff6a93a18da46@postgresql.org
2019-02-15 14:05:09 +13:00
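The search rule from commit 6c0fb94189 can be sketched as a loop that keeps going past a slot whose handle matches but which is being destroyed. The `SketchSlot` layout and function name are hypothetical. The meaning of refcnt == 1 comes from the commit message, while treating refcnt == 0 as "unused" is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control slot for one shared memory segment. */
typedef struct SketchSlot
{
	uint32_t	handle;
	uint32_t	refcnt;			/* 0 = unused (assumed), 1 = being destroyed */
} SketchSlot;

/*
 * Return the index of the live slot with the given handle, or -1 if none.
 * A slot that is unused or being destroyed is skipped even when its handle
 * matches, because a newer segment may be reusing the same handle value in
 * another slot.
 */
static int
sketch_find_live_slot(const SketchSlot *slots, size_t nslots, uint32_t handle)
{
	size_t		i;

	for (i = 0; i < nslots; i++)
	{
		if (slots[i].refcnt <= 1)
			continue;			/* unused or going away: keep searching */
		if (slots[i].handle != handle)
			continue;
		return (int) i;
	}
	return -1;
}
```

Stopping at the first handle match instead would reproduce the race: the caller would find the dying slot and give up, although a live segment with the same handle exists further along.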
Michael Paquier ea92368cd1 Move max_wal_senders out of max_connections for connection slot handling
Since its introduction, max_wal_senders is counted as part of
max_connections when it comes to defining how many connection slots can be
used for replication connections with a WAL sender context.  This can
lead to confusion for some users, as it could be possible to block a
base backup or replication from happening because other backend sessions
are already taken for other purposes by an application, and
superuser-only connection slots are not a correct solution to handle
that case.

This commit makes max_wal_senders independent of max_connections for its
handling of PGPROC entries in ProcGlobal, meaning that connection slots
for WAL senders are handled using their own free queue, like autovacuum
workers and bgworkers.

One compatibility issue that this change creates is that a standby now
requires a value of max_wal_senders at least equal to that of its
primary.  So, if a standby is configured with a value of
max_wal_senders lower than that, this could break failovers.
Normally this should not be an issue though, as the settings of a
standby are inherited from its primary, since postgresql.conf normally
gets copied as part of a base backup, so parameters would be consistent.

Author: Alexander Kukushkin
Reviewed-by: Kyotaro Horiguchi, Petr Jelínek, Masahiko Sawada, Oleksii
Kliukin
Discussion: https://postgr.es/m/CAFh8B=nBzHQeYAu0b8fjK-AF1X4+_p6GRtwG+cCgs6Vci2uRuQ@mail.gmail.com
2019-02-12 10:07:56 +09:00
Amit Kapila b0eaa4c51b Avoid creation of the free space map for small heap relations, take 2.
Previously, all heaps had FSMs. For very small tables, this means that the
FSM took up more space than the heap did. This is wasteful, so now we
refrain from creating the FSM for heaps with 4 pages or fewer. If the last
known target block has insufficient space, we still try to insert into some
other page before giving up and extending the relation, since doing
otherwise leads to table bloat. Testing showed that trying every page
penalized performance slightly, so we compromise and try every other page.
This way, we visit at most two pages. Any pages with wasted free space
become visible at next relation extension, so we still control table bloat.
As a bonus, directly attempting one or two pages can even be faster than
consulting the FSM would have been.

Once the FSM is created for a heap we don't remove it even if somebody
deletes all the rows from the corresponding relation.  We don't think it is
a useful optimization as it is quite likely that relation will again grow
to the same size.

Author: John Naylor, Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Mithun C Y
Discussion: https://www.postgresql.org/message-id/CAJVSVGWvB13PzpbLEecFuGFc5V2fsO736BsdTakPiPAcdMM5tQ@mail.gmail.com
2019-02-04 07:49:15 +05:30
Thomas Munro f1bebef60e Add shared_memory_type GUC.
Since 9.3 we have used anonymous shared mmap for our main shared memory
region, except in EXEC_BACKEND builds.  Provide a GUC so that users
can opt for System V shared memory once again, like in 9.2 and earlier.

A later patch proposes to add huge/large page support for AIX, which
requires System V shared memory and provided the motivation to revive
this possibility.  It may also be useful on some BSDs.

Author: Andres Freund (revived and documented by Thomas Munro)
Discussion: https://postgr.es/m/HE1PR0202MB28126DB4E0B6621CC6A1A91286D90%40HE1PR0202MB2812.eurprd02.prod.outlook.com
Discussion: https://postgr.es/m/2AE143D2-87D3-4AD1-AC78-CE2258230C05%40FreeBSD.org
2019-02-03 12:47:26 +01:00
Amit Kapila a23676503b Revert "Avoid creation of the free space map for small heap relations."
This reverts commit ac88d2962a.
2019-01-28 11:31:44 +05:30
Amit Kapila ac88d2962a Avoid creation of the free space map for small heap relations.
Previously, all heaps had FSMs. For very small tables, this means that the
FSM took up more space than the heap did. This is wasteful, so now we
refrain from creating the FSM for heaps with 4 pages or fewer. If the last
known target block has insufficient space, we still try to insert into some
other page before giving up and extending the relation, since doing
otherwise leads to table bloat. Testing showed that trying every page
penalized performance slightly, so we compromise and try every other page.
This way, we visit at most two pages. Any pages with wasted free space
become visible at next relation extension, so we still control table bloat.
As a bonus, directly attempting one or two pages can even be faster than
consulting the FSM would have been.

Once the FSM is created for a heap we don't remove it even if somebody
deletes all the rows from the corresponding relation.  We don't think it is
a useful optimization as it is quite likely that relation will again grow
to the same size.

Author: John Naylor with design inputs and some code contribution by Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Mithun C Y
Discussion: https://www.postgresql.org/message-id/CAJVSVGWvB13PzpbLEecFuGFc5V2fsO736BsdTakPiPAcdMM5tQ@mail.gmail.com
2019-01-28 08:14:06 +05:30
Amit Kapila d66e3664b8 In bootstrap mode, don't allow the creation of files if they don't already
exist.

In commits b9d01fe288 and 3908473c80, we added some code that
allowed the creation of files during mdopen, even if they didn't exist,
in bootstrap mode.  The later commit obviates the need for that code.

This was harmless code until now, but with an upcoming feature that avoids
creating the FSM for small tables, it would needlessly create FSM
files.

Author: John Naylor
Reviewed-by: Amit Kapila
Discussion: https://www.postgresql.org/message-id/CAJVSVGWvB13PzpbLEecFuGFc5V2fsO736BsdTakPiPAcdMM5tQ@mail.gmail.com
	    https://www.postgresql.org/message-id/CAA4eK1KsET6sotf+rzOTQfb83pzVEzVhbQi1nxGFYVstVWXUGw@mail.gmail.com
2019-01-28 07:52:51 +05:30
Andres Freund c91560defc Move remaining code from tqual.[ch] to heapam.h / heapam_visibility.c.
Given these routines are heap specific, and that there will be more
generic visibility support via the table AM, it makes sense to move the
prototypes to heapam.h (routines like HeapTupleSatisfiesVacuum will
not be exposed in a generic fashion, because they are too storage
specific).

Similarly, the code in tqual.c is specific to heap, so moving it into
access/heap/ makes sense.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-01-21 17:07:10 -08:00
Andres Freund e7cc78ad43 Remove superfluous tqual.h includes.
Most of these had been obsoleted by 568d4138c / the SnapshotNow
removal.

This is in preparation for moving most of tqual.[ch] into either
snapmgr.h or heapam.h, which in turn is in preparation for pluggable
table AMs.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-01-21 12:15:02 -08:00
Andres Freund e0c4ec0728 Replace uses of heap_open et al with the corresponding table_* function.
Author: Andres Freund
Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de
2019-01-21 10:51:37 -08:00
Andres Freund 111944c5ee Replace heapam.h includes with {table, relation}.h where applicable.
A lot of files only included heapam.h for relation_open, heap_open etc
- replace the heapam.h include in those files with the narrower
header.

Author: Andres Freund
Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de
2019-01-21 10:51:37 -08:00
Michael Paquier 5d59a6c5ea Fix grammar mistakes in md.c
Author: Kirk Jamison
Discussion: https://postgr.es/m/D09B13F772D2274BB348A310EE3027C640AC54@g01jpexmbkw24
2019-01-10 09:36:25 +09:00
Bruce Momjian 97c39498e5 Update copyright for 2019
Backpatch-through: certain files through 9.4
2019-01-02 12:44:25 -05:00
Michael Paquier 1707a0d2aa Remove configure switch --disable-strong-random
This removes a portion of infrastructure introduced by fe0a0b5 to allow
compilation of Postgres in environments where no strong random source is
available, meaning that there is no linking to OpenSSL and no
/dev/urandom (Windows having its own CryptoAPI).  No systems shipped
this century lack /dev/urandom, and the buildfarm is actually not
testing this switch at all, so just remove it.  In particular, this
simplifies some backend code which included a fallback implementation
using shared memory, and removes a set of alternate regression output
files from pgcrypto.

Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/20181230063219.GG608@paquier.xyz
2019-01-01 20:05:51 +09:00
Peter Geoghegan 1a990b207b Have BufFileSize() ereport() on FileSize() failure.
Move the responsibility for checking for and reporting a failure from
the only current BufFileSize() caller, logtape.c, to BufFileSize()
itself.  Code within buffile.c is generally responsible for interfacing
with fd.c to report irrecoverable failures.  This seems like a
convention that's worth sticking to.

Reorganizing things this way makes it easy to make the error message
raised in the event of BufFileSize() failure descriptive of the
underlying problem.  We're now clear on the distinction between
temporary file name and BufFile name, and can show errno, confident that
its value actually relates to the error being reported.  In passing, an
existing, similar buffile.c ereport() + errcode_for_file_access() site
is changed to follow the same conventions.

The API of the function BufFileSize() is changed by this commit, despite
already being in a stable release (Postgres 11).  This seems acceptable,
since the BufFileSize() ABI was changed by commit aa55183042, which
hasn't made it into a point release yet.  Besides, it's difficult to
imagine a third party BufFileSize() caller not just raising an error
anyway, since BufFile state should be considered corrupt when
BufFileSize() fails.
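
The convention argued for above, where the lower layer reports the failure itself so that callers need no error check, can be illustrated with a small standalone sketch. It is not the buffile.c code: checked_file_size is a hypothetical name, standard C stdio stands in for fd.c, and fprintf() plus exit() stand in for ereport(ERROR).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the size of 'fp' in bytes, or report the
 * failure and exit.  Both names are passed in so the message can tell the
 * underlying file apart from the logical buffer file.  As a side effect,
 * the file position is left at end of file.
 */
long
checked_file_size(FILE *fp, const char *file_name, const char *buffile_name)
{
	long		size = -1;

	assert(fp != NULL);

	errno = 0;
	if (fseek(fp, 0L, SEEK_END) != 0 || (size = ftell(fp)) < 0)
	{
		/* Save errno right away so it still describes the failed call. */
		int			saved_errno = errno;

		fprintf(stderr,
				"could not determine size of file \"%s\" of buffer file \"%s\": %s\n",
				file_name, buffile_name, strerror(saved_errno));
		exit(EXIT_FAILURE);
	}
	return size;
}
```

A caller can then use the returned size unconditionally, which mirrors the point made above that a failed size lookup leaves nothing sensible for the caller to do except raise an error.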

Per complaint from Tom Lane.

Discussion: https://postgr.es/m/26974.1540826748@sss.pgh.pa.us
Backpatch: 11-, where shared BufFiles were introduced.
2018-11-28 14:42:54 -08:00