Commit Graph

32352 Commits

Author SHA1 Message Date
Tom Lane 3cb646264e Use a ResourceOwner to track buffer pins in all cases.
Historically, we've allowed auxiliary processes to take buffer pins without
tracking them in a ResourceOwner.  However, that creates problems for error
recovery.  In particular, we've seen multiple reports of assertion crashes
in the startup process when it gets an error while holding a buffer pin,
as for example if it gets ENOSPC during a write.  In a non-assert build,
the process would simply exit without releasing the pin at all.  We've
gotten away with that so far just because a failure exit of the startup
process translates to a database crash anyhow; but any similar behavior
in other aux processes could result in stuck pins and subsequent problems
in vacuum.

To improve this, institute a policy that we must *always* have a resowner
backing any attempt to pin a buffer, which we can enforce just by removing
the previous special-case code in resowner.c.  Add infrastructure to make
it easy to create a process-lifespan AuxProcessResourceOwner and clear
out its contents at appropriate times.  Replace existing ad-hoc resowner
management in bgwriter.c and other aux processes with that.  (Thus, while
the startup process gains a resowner where it had none at all before, some
other aux process types are replacing an ad-hoc resowner with this code.)
Also use the AuxProcessResourceOwner to manage buffer pins taken during
StartupXLOG and ShutdownXLOG, even when those are being run in a bootstrap
process or a standalone backend rather than a true auxiliary process.

In passing, remove some other ad-hoc resource owner creations that had
gotten cargo-culted into various other places.  As far as I can tell
that was all unnecessary, and if it had been necessary it was incomplete,
due to lacking any provision for clearing those resowners later.
(Also worth noting in this connection is that a process that hasn't called
InitBufferPoolBackend has no business accessing buffers; so there's more
to do than just add the resowner if we want to touch buffers in processes
not covered by this patch.)

Although this fixes a very old bug, no back-patch, because there's no
evidence of any significant problem in non-assert builds.

Patch by me, pursuant to a report from Justin Pryzby.  Thanks to
Robert Haas and Kyotaro Horiguchi for reviews.

Discussion: https://postgr.es/m/20180627233939.GA10276@telsasoft.com
2018-07-18 12:15:16 -04:00
Heikki Linnakangas 6b387179ba Fix misc typos, mostly in comments.
A collection of typos I happened to spot while reading code, as well as
grepping for common mistakes.

Backpatch to all supported versions, as applicable, to avoid conflicts
when backporting other commits in the future.
2018-07-18 16:17:32 +03:00
Michael Paquier 94019c879a Fix more portability issues with casts to Size when using off_t
This should tame the beast, as there are no other places where off_t is
used in the new error messages.

Reported again by longfin, which complained about walsender.c while I
spotted the other two ones while double-checking.
2018-07-18 09:51:53 +09:00
Michael Paquier 8bd064f0c7 Fix casting in error message for two-phase file
This error from from 811b6e3, which causes compilation warnings with OSX
10.3 and clang.

Reported by Tom Lane, via buildfarm member longfin.
2018-07-18 09:11:34 +09:00
Michael Paquier 811b6e36a9 Rework error messages around file handling
Some error messages related to file handling are using the code path
context to define their state.  For example, 2PC-related errors are
referring to "two-phase status files", or "relation mapping file" is
used for catalog-to-filenode mapping, however those prove to be
difficult to translate, and are not more helpful than just referring to
the path of the file being worked on.  So simplify all those error
messages by just referring to files with their path used.  In some
cases, like the manipulation of WAL segments, the context is actually
helpful so those are kept.

Calls to the system function read() have also been rather inconsistent
with their error handling sometimes not reporting the number of bytes
read, and some other code paths trying to use an errno which has not
been set.  The in-core functions are using a more consistent pattern
with this patch, which checks for both errno if set or if an
inconsistent read is happening.

So as to care about pluralization when reading an unexpected number of
byte(s), "could not read: read %d of %zu" is used as error message, with
%d field being the output result of read() and %zu the expected size.
This simplifies the work of translators with less variations of the same
message.

Author: Michael Paquier
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/20180520000522.GB1603@paquier.xyz
2018-07-18 08:01:23 +09:00
Alvaro Herrera 1c04d4beea Revise BuildIndexValueDescription to simplify it
Getting a pg_index tuple from syscache when the open index relation is
available is pointless -- just use the one from relcache.

Noticed while reviewing code for cb9db2ab06.

No backpatch.
2018-07-16 20:20:15 -04:00
Alvaro Herrera cb9db2ab06 Fix ALTER TABLE...SET STATS error message for included columns
The existing error message was complaining that the column is not an
expression, which is not correct.  Introduce a suitable wording
variation and a test.

Co-authored-by: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/20180628182803.e4632d5a.nagata@sraoss.co.jp
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
2018-07-16 20:00:39 -04:00
Alvaro Herrera e353389d24 Fix partition pruning with IS [NOT] NULL clauses
The original code was unable to prune partitions that could not possibly
contain NULL values, when the query specified less than all columns in a
multicolumn partition key.  Reorder the if-tests so that it is, and add
more commentary and regression tests.

Reported-by: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
Co-authored-by: Dilip Kumar <dilipbalaut@gmail.com>
Co-authored-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
Reviewed-by: amul sul <sulamul@gmail.com>
Discussion: https://postgr.es/m/CAFjFpRc7qjLUfXLVBBC_HAnx644sjTYM=qVoT3TJ840HPbsTXw@mail.gmail.com
2018-07-16 18:38:59 -04:00
Robert Haas 32df1c9afa Add subtransaction handling for table synchronization workers.
Since the old logic was completely unaware of subtransactions, a
change made in a subsequently-aborted subtransaction would still cause
workers to be stopped at toplevel transaction commit.  Fix that by
managing a stack of worker lists rather than just one.

Amit Khandekar and Robert Haas

Discussion: http://postgr.es/m/CAJ3gD9eaG_mWqiOTA2LfAug-VRNn1hrhf50Xi1YroxL37QkZNg@mail.gmail.com
2018-07-16 17:33:22 -04:00
Peter Eisentraut f7cb2842bf Add plan_cache_mode setting
This allows overriding the choice of custom or generic plan.

Author: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRAGLaiEm8ur5DWEBo7qHRWTk9HxkuUAz00CZZtJj-LkCA%40mail.gmail.com
2018-07-16 13:35:41 +02:00
Peter Eisentraut a06e56b247 doc: Update redirecting links
Update links that resulted in redirects.  Most are changes from http to
https, but there are also some other minor edits.  (There are still some
redirects where the target URL looks less elegant than the one we
currently have.  I have left those as is.)
2018-07-16 10:48:05 +02:00
Tom Lane 1007b0a126 Fix hashjoin costing mistake introduced with inner_unique optimization.
In final_cost_hashjoin(), commit 9c7f5229a allowed inner_unique cases
to follow a code path previously used only for SEMI/ANTI joins; but it
neglected to fix an if-test within that path that assumed SEMI and ANTI
were the only possible cases.  This resulted in a wrong value for
hashjointuples, and an ensuing bad cost estimate, for inner_unique normal
joins.  Fortunately, for inner_unique normal joins we can assume the number
of joined tuples is the same as for a SEMI join; so there's no need for
more code, we just have to invert the test to check for ANTI not SEMI.

It turns out that in two contrib tests in which commit 9c7f5229a
changed the plan expected for a query, the change was actually wrong
and induced by this estimation error, not by any real improvement.
Hence this patch also reverts those changes.

Per report from RK Korlapati.  Backpatch to v10 where the error was
introduced.

David Rowley

Discussion: https://postgr.es/m/CA+SNy03bhq0fodsfOkeWDCreNjJVjsdHwUsb7AG=jpe0PtZc_g@mail.gmail.com
2018-07-14 11:59:12 -04:00
Peter Eisentraut 333224c99e Update documentation editor setup instructions
Now that the documentation sources are in XML rather than SGML, some of
the documentation about the editor, or more specifically Emacs, setup
needs updating.  The updated instructions recommend using nxml-mode,
which works mostly out of the box, with some small tweaks in
emacs.samples and .dir-locals.el.

Also remove some obsolete stuff in .dir-locals.el.  I did, however,
leave the sgml-mode settings in there so that someone using Emacs
without emacs.samples gets those settings when editing a *.sgml file.
2018-07-13 21:23:41 +02:00
Tom Lane 4984784f83 Fix crash in json{b}_populate_recordset() and json{b}_to_recordset().
As of commit 37a795a60, populate_recordset_worker() tried to pass back
(as rsi.setDesc) a tupdesc that it also had cached in its fn_extra.
But the core executor would free the passed-back tupdesc, risking a
crash if the function were called again in the same query.  The safest
and least invasive way to fix that is to make an extra tupdesc copy
to pass back.

While at it, I failed to resist the temptation to get rid of unnecessary
get_fn_expr_argtype() calls here and in populate_record_worker().

Per report from Dmitry Dolgov; thanks to Michael Paquier and
Andrew Gierth for investigation and discussion.

Discussion: https://postgr.es/m/CA+q6zcWzN9ztCfR47ZwgTr1KLnuO6BAY6FurxXhovP4hxr+yOQ@mail.gmail.com
2018-07-13 14:16:54 -04:00
Alvaro Herrera 93ad00c968 Dump foreign keys on partitioned tables
The patch that ended up as commit 3de241dba8 ("Foreign keys on
partitioned tables") lacked pg_dump tests, so the pg_dump code that was
there to support it inadvertently stopped working when in later
development I modified the backend code not to emit pg_trigger rows for
the partitioned table itself.

Bug analysis and code fix is by Michaël.  I (Álvaro) added the test.

Reported-by: amul sul <sulamul@gmail.com>
Co-authored-by: Michaël Paquier <michael@paquier.xyz>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CAAJ_b94n=UsNVhgs97vCaWEZAMe-tGDRVuZ73oePQH=eaJKGSA@mail.gmail.com
2018-07-13 13:13:26 -04:00
Heikki Linnakangas 42f70cd9c3 Improve performance of tuple conversion map generation
Previously convert_tuples_by_name_map naively performed a search of each
outdesc column starting at the first column in indesc and searched each
indesc column until a match was found.  When partitioned tables had many
columns this could result in slow generation of the tuple conversion maps.
For INSERT and UPDATE statements that touched few rows, this could mean a
very large overhead indeed.

We can do a bit better with this loop.  It's quite likely that the columns
in partitioned tables and their partitions are in the same order, so it
makes sense to start searching for each column outer column at the inner
column position 1 after where the previous match was found (per idea from
Alexander Kuzmenkov). This makes the best case search O(N) instead of
O(N^2).  The worst case is still O(N^2), but it seems unlikely that would
happen.

Likewise, in the planner, make_inh_translation_list's search for the
matching column could often end up falling back on an O(N^2) type search.
This commit also improves that by first checking the column that follows
the previous match, instead of the column with the same attnum.  If we
fail to match here we fallback on the syscache's hashtable lookup.

Author: David Rowley
Reviewed-by: Alexander Kuzmenkov
Discussion: https://www.postgresql.org/message-id/CAKJS1f9-wijVgMdRp6_qDMEQDJJ%2BA_n%3DxzZuTmLx5Fz6cwf%2B8A%40mail.gmail.com
2018-07-13 19:54:05 +03:00
Tom Lane 130beba36d Fix inadequate buffer locking in FSM and VM page re-initialization.
When reading an existing FSM or VM page that was found to be corrupt by the
buffer manager, the code applied PageInit() to reinitialize the page, but
did so without any locking.  There is thus a hazard that two backends might
concurrently do PageInit, which in itself would still be OK, but the slower
one might then zero over subsequent data changes applied by the faster one.
Even that is unlikely to be fatal; but it's not desirable, so add locking
to prevent it.

This does not add any locking overhead in the normal code path where the
page is OK.  It's not immediately obvious that that's safe, but I believe
it is, for reasons explained in the added comments.

Problem noted by R P Asim.  It's been like this for a long time, so
back-patch to all supported branches.

Discussion: https://postgr.es/m/CANXE4Te4G0TGq6cr0-TvwP0H4BNiK_-hB5gHe8mF+nz0mcYfMQ@mail.gmail.com
2018-07-13 11:53:10 -04:00
Peter Eisentraut 3884072329 Prohibit transaction commands in security definer procedures
Starting and aborting transactions in security definer procedures
doesn't work.  StartTransaction() insists that the security context
stack is empty, so this would currently cause a crash, and
AbortTransaction() resets it.  This could be made to work by
reorganizing the code, but right now we just prohibit it.

Reported-by: amul sul <sulamul@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b96Gupt_LFL7uNyy3c50-wbhA68NUjiK5%3DrF6_w%3Dpq_T%3DQ%40mail.gmail.com
2018-07-13 10:41:32 +02:00
Peter Eisentraut 1f4ec89459 Remove obsolete documentation build tools for Windows
The scripts and instructions have been nonfunctional at least since
PostgreSQL 10 (commit 510074f9f0) and
nobody has stepped up to fix them.  So right now just remove them until
someone wants to resurrect them.

Discussion: https://www.postgresql.org/message-id/flat/B74C0219-6BA9-46E1-A524-5B9E8CD3BDB3%40yesql.se
2018-07-13 10:01:04 +02:00
Thomas Munro e8d9caa436 Accept invalidation messages in InitializeSessionUserId().
If the authentication method modified the system catalogs through a
separate database connection (say, to create a new role on the fly),
make sure syscache sees the changes before we try to find the user.

Author: Thomas Munro
Reviewed-by: Tom Lane, Andres Freund
Discussion: https://postgr.es/m/CAEepm%3D3_h0_cgmz5PMyab4xk_OFrg6G5VCN%3DnF4chFXM9iFOqA%40mail.gmail.com
2018-07-13 16:19:15 +12:00
Thomas Munro 387a5cfb94 Add pg_dump --on-conflict-do-nothing option.
When dumping INSERT statements, optionally add ON CONFLICT DO NOTHING.

Author: Surafel Temesgen
Reviewed-by: Takeshi Ideriha, Nico Williams, Dilip Kumar
Discussion: https://postgr.es/m/CALAY4q-PQ9cOEzs2%2BQHK5ObfF_4QbmBaYXbZx6BGGN66Q-n8FA%40mail.gmail.com
2018-07-13 13:57:03 +12:00
Michael Paquier ce89ad0fa0 Fix argument of pg_create_logical_replication_slot for slot name
All attributes and arguments using a slot name map to the data type
"name", but this function has been using "text".  This is cosmetic, as
even if text is used then the slot name would be truncated to 64
characters anyway and stored as such.  The documentation already said
so and the function already assumed that the argument was of this type
when fetching its value.

Bump catalog version.

Author: Sawada Masahiko
Discussion: https://postgr.es/m/CAD21AoADYz_-eAqH5AVFaCaojcRgwpo9PW=u8kgTMys63oB8Cw@mail.gmail.com
2018-07-13 09:32:12 +09:00
Michael Paquier 5fc1008e8a Clean up temporary WAL segments after an instance crash
Temporary WAL segments are created in pg_wal and named as xlogtemp.pid
before being renamed to the real deal when creating a new segment.  If
an instance crashes after the temporary segment is created and before
the rename is done, then the server would finish with unremovable data.

After an instance crash, scan pg_wal and remove any such segments.  With
repetitive unlucky crashes this would contribute to disk bloat and
presents risks of ENOSPC especially with max_wal_size close to the
maximum allowed.

Author: Michael Paquier
Reviewed-by: Yugo Nagata, Heikki Linnakangas
Discussion: https://postgr.es/m/20180514054955.GF1528@paquier.xyz
2018-07-13 06:43:20 +09:00
Peter Eisentraut 5e6e2c8773 Reset shmem_exit_inprogress after shmem_exit()
In ad9a274778, shmem_exit_inprogress was
introduced.  But we need to reset it after shmem_exit(), because unlike
the similar proc_exit(), shmem_exit() can also be called for cleanup
when the process will not exit.

Reported-by: Andrew Gierth <andrew@tao11.riddles.org.uk>
2018-07-12 20:22:17 +02:00
Alvaro Herrera cd073d8f70 Fix FK checks of TRUNCATE involving partitioned tables
When truncating a table that is referenced by foreign keys in
partitioned tables, the check to ensure the referencing table are also
truncated spuriously failed.  This is because it was relying on
relhastriggers as a proxy for the table having FKs, and that's wrong for
partitioned tables.  Fix it to consider such tables separately.  There
may be a better way ... but this code is pretty inefficient already.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Michael Paquiër <michael@paquier.xyz>
Discussion: https://postgr.es/m/20180711000624.zmeizicibxeehhsg@alvherre.pgsql
2018-07-12 12:09:08 -04:00
Peter Eisentraut 8e599897ca Improve two error messages 2018-07-12 12:35:59 +02:00
Peter Eisentraut ecd6b4342a Add regression test for system catalog toast tables
For the moment, this just records which system catalogs have toast
tables right now.  Future patches will possibly change that set.

from Tom Lane via Joe Conway

Discussion: https://www.postgresql.org/message-id/flat/84ddff04-f122-784b-b6c5-3536804495f8@joeconway.com/
2018-07-12 12:31:49 +02:00
Amit Kapila 40ca70ebcc Allow using the updated tuple while moving it to a different partition.
An update that causes the tuple to be moved to a different partition was
missing out on re-constructing the to-be-updated tuple, based on the latest
tuple in the update chain.  Instead, it's simply deleting the latest tuple
and inserting a new tuple in the new partition based on the old tuple.
Commit 2f17844104 didn't consider this case, so some of the updates were
getting lost.

In passing, change the argument order for output parameter in ExecDelete
and add some commentary about it.

Reported-by: Pavan Deolasee
Author: Amit Khandekar, with minor changes by me
Reviewed-by: Dilip Kumar, Amit Kapila and Alvaro Herrera
Backpatch-through: 11
Discussion: https://postgr.es/m/CAJ3gD9fRbEzDqdeDq1jxqZUb47kJn+tQ7=Bcgjc8quqKsDViKQ@mail.gmail.com
2018-07-12 12:51:39 +05:30
Michael Paquier edc6b41bd4 Rename VACOPT_NOWAIT to VACOPT_SKIP_LOCKED
When it comes to SELECT ... FOR or LOCK, NOWAIT means to not wait for
something to happen, and issue an error.  SKIP LOCKED means to not wait
for something to happen but to move on without issuing an error.  The
internal option of autovacuum and autoanalyze mentioned above, used only
when wraparound is not involved was named NOWAIT, but behaves like SKIP
LOCKED which is confusing.

Author: Nathan Bossart
Discussion: https://postgr.es/m/20180307050345.GA3095@paquier.xyz
2018-07-12 14:28:28 +09:00
Michael Paquier 6551f3daa2 Add assertion in expand_vacuum_rel() for non-autovacuum path
The code path where the assertion is added helps to check that
autovacuum always includes a relation OID when doing a vacuum on it.
Extracted from a larger patch set to add support for SKIP LOCKED with
manual VACUUM commands.

Author: Nathan Bossart
Discussion: https://postgr.es/m/9EF7EBE4-720D-4CF1-9D0E-4403D7E92990@amazon.com
2018-07-12 13:50:17 +09:00
Michael Paquier 9a7b7adc13 Make logical WAL sender report streaming state appropriately
WAL senders sending logically-decoded data fail to properly report in
"streaming" state when starting up, hence as long as one extra record is
not replayed, such WAL senders would remain in a "catchup" state, which
is inconsistent with the physical cousin.

This can be easily reproduced by for example using pg_recvlogical and
restarting the upstream server.  The TAP tests have been slightly
modified to detect the failure and strengthened so as future tests also
make sure that a node is in streaming state when waiting for its
catchup.

Backpatch down to 9.4 where this code has been introduced.

Reported-by: Sawada Masahiko
Author: Simon Riggs, Sawada Masahiko
Reviewed-by: Petr Jelinek, Michael Paquier, Vaishnavi Prabakaran
Discussion: https://postgr.es/m/CAD21AoB2ZbCCqOx=bgKMcLrAvs1V0ZMqzs7wBTuDySezTGtMZA@mail.gmail.com
2018-07-12 10:19:35 +09:00
Tom Lane 39a96512b3 Mark built-in btree comparison functions as leakproof where it's safe.
Generally, if the comparison operators for a datatype or pair of datatypes
are leakproof, the corresponding btree comparison support function can be
considered so as well.  But we had not originally worried about marking
support functions as leakproof, reasoning that they'd not likely be used in
queries so the marking wouldn't matter.  It turns out there's at least one
place where it does matter: calc_arraycontsel() finds the target datatype's
default btree comparison function and tries to use that to estimate
selectivity, but it will be blocked in some cases if the function isn't
leakproof.  This leads to unnecessarily poor selectivity estimates and bad
plans, as seen in bug #15251.

Hence, run around and apply proleakproof markings where the corresponding
btree comparison operators are leakproof.  (I did eyeball each function
to verify that it wasn't doing anything surprising, too.)

This isn't a full solution to bug #15251, and it's not back-patchable
because of the need for a catversion bump.  A more useful response probably
is to consider whether we can check permissions on the parent table instead
of the child.  However, this change will help in some cases where that
won't, and it's easy enough to do in HEAD, so let's do so.

Discussion: https://postgr.es/m/3876.1531261875@sss.pgh.pa.us
2018-07-11 18:47:31 -04:00
Tom Lane 57cd2b6e6d Fix create_scan_plan's handling of sortgrouprefs for physical tlists.
We should only run apply_pathtarget_labeling_to_tlist if CP_LABEL_TLIST
was specified, because only in that case has use_physical_tlist checked
that the labeling will succeed; otherwise we may get an "ORDER/GROUP BY
expression not found in targetlist" error.  (This subsumes the previous
test about gating_clauses, because we reset "flags" to zero earlier
if there are gating clauses to apply.)

The only known case in which a failure can occur is with a ProjectSet
path directly atop a table scan path, although it seems likely that there
are other cases or will be such in future.  This means that the failure
is currently only visible in the v10 branch: 9.6 didn't have ProjectSet,
while in v11 and HEAD, apply_scanjoin_target_to_paths for some weird
reason is using create_projection_path not apply_projection_to_path,
masking the problem because there's a ProjectionPath in between.

Nonetheless this code is clearly wrong on its own terms, so back-patch
to 9.6 where this logic was introduced.

Per report from Regina Obe.

Discussion: https://postgr.es/m/001501d40f88$75186950$5f493bf0$@pcorp.us
2018-07-11 15:25:28 -04:00
Tom Lane ff4f889164 Fix bugs with degenerate window ORDER BY clauses in GROUPS/RANGE mode.
nodeWindowAgg.c failed to cope with the possibility that no ordering
columns are defined in the window frame for GROUPS mode or RANGE OFFSET
mode, leading to assertion failures or odd errors, as reported by Masahiko
Sawada and Lukas Eder.  In RANGE OFFSET mode, an ordering column is really
required, so add an Assert about that.  In GROUPS mode, the code would
work, except that the node initialization code wasn't in sync with the
execution code about when to set up tuplestore read pointers and spare
slots.  Fix the latter for consistency's sake (even though I think the
changes described below make the out-of-sync cases unreachable for now).

Per SQL spec, a single ordering column is required for RANGE OFFSET mode,
and at least one ordering column is required for GROUPS mode.  The parser
enforced the former but not the latter; add a check for that.

We were able to reach the no-ordering-column cases even with fully spec
compliant queries, though, because the planner would drop partitioning
and ordering columns from the generated plan if they were redundant with
earlier columns according to the redundant-pathkey logic, for instance
"PARTITION BY x ORDER BY y" in the presence of a "WHERE x=y" qual.
While in principle that's an optimization that could save some pointless
comparisons at runtime, it seems unlikely to be meaningful in the real
world.  I think this behavior was not so much an intentional optimization
as a side-effect of an ancient decision to construct the plan node's
ordering-column info by reverse-engineering the PathKeys of the input
path.  If we give up redundant-column removal then it takes very little
code to generate the plan node info directly from the WindowClause,
ensuring that we have the expected number of ordering columns in all
cases.  (If anyone does complain about this, the planner could perhaps
be taught to remove redundant columns only when it's safe to do so,
ie *not* in RANGE OFFSET mode.  But I doubt anyone ever will.)

With these changes, the WindowAggPath.winpathkeys field is not used for
anything anymore, so remove it.

The test cases added here are not actually very interesting given the
removal of the redundant-column-removal logic, but they would represent
important corner cases if anyone ever tries to put that back.

Tom Lane and Masahiko Sawada.  Back-patch to v11 where RANGE OFFSET
and GROUPS modes were added.

Discussion: https://postgr.es/m/CAD21AoDrWqycq-w_+Bx1cjc+YUhZ11XTj9rfxNiNDojjBx8Fjw@mail.gmail.com
Discussion: https://postgr.es/m/153086788677.17476.8002640580496698831@wrigleys.postgresql.org
2018-07-11 12:07:20 -04:00
Alexander Korotkov edf59c40dd Fix more wrong paths in header comments
It appears that there are more files, whose header comment paths are
wrong.  So, fix those paths.  No backpatching per proposal of Tom Lane.

Discussion: https://postgr.es/m/CAPpHfdsJyYbOj59MOQL%2B4XxdcomLSLfLqBtAvwR%2BpsCqj3ELdQ%40mail.gmail.com
2018-07-11 17:57:04 +03:00
Alvaro Herrera f2c587067a Rethink how to get float.h in old Windows API for isnan/isinf
We include <float.h> in every place that needs isnan(), because MSVC
used to require it.  However, since MSVC 2013 that's no longer necessary
(cf. commit cec8394b5c), so we can retire the inclusion to a
version-specific stanza in win32_port.h, where it doesn't need to
pollute random .c files.  The header is of course still needed in a few
places for other reasons.

I (Álvaro) removed float.h from a few more files than in Emre's original
patch.  This doesn't break the build in my system, but we'll see what
the buildfarm has to say about it all.

Author: Emre Hasegeli
Discussion: https://postgr.es/m/CAE2gYzyc0+5uG+Cd9-BSL7NKC8LSHLNg1Aq2=8ubjnUwut4_iw@mail.gmail.com
2018-07-11 09:11:48 -04:00
Alexander Korotkov a01d0fa1d8 Fix wrong file path in header comment
Header comment of shm_mq.c was mistakenly specifying path to shm_mq.h.
It was introduced in ec9037df.  So, theoretically it could be
backpatched to 9.4, but it doesn't seem to worth it.
2018-07-11 13:16:46 +03:00
Thomas Munro f98b8476cd Use signals for postmaster death on FreeBSD.
Use FreeBSD 11.2's new support for detecting parent process death to
make PostmasterIsAlive() very cheap, as was done for Linux in an
earlier commit.

Author: Thomas Munro
Discussion: https://postgr.es/m/7261eb39-0369-f2f4-1bb5-62f3b6083b5e@iki.fi
2018-07-11 13:14:07 +12:00
Thomas Munro 9f09529952 Use signals for postmaster death on Linux.
Linux provides a way to ask for a signal when your parent process dies.
Use that to make PostmasterIsAlive() very cheap.

Based on a suggestion from Andres Freund.

Author: Thomas Munro, Heikki Linnakangas
Reviewed-By: Michael Paquier
Discussion: https://postgr.es/m/7261eb39-0369-f2f4-1bb5-62f3b6083b5e%40iki.fi
Discussion: https://postgr.es/m/20180411002643.6buofht4ranhei7k%40alap3.anarazel.de
2018-07-11 12:47:06 +12:00
Michael Paquier 56a7147213 Block replication slot advance for these not yet reserving WAL
Such replication slots are physical slots freshly created without WAL
being reserved, which is the default behavior, which have not been used
yet as WAL consumption resources to retain WAL.  This prevents advancing
a slot to a position older than any WAL available, which could falsify
calculations for WAL segment recycling.

This also cleans up a bit the code, as ReplicationSlotRelease() would be
called on ERROR, and improves error messages.

Reported-by: Kyotaro Horiguchi
Author: Michael Paquier
Reviewed-by: Andres Freund, Álvaro Herrera, Kyotaro Horiguchi
Discussion: https://postgr.es/m/20180626071305.GH31353@paquier.xyz
2018-07-11 08:56:24 +09:00
Alvaro Herrera b6e3a3a492 Better handle pseudotypes as partition keys
We fail to handle polymorphic types properly when they are used as
partition keys: we were unnecessarily adding a RelabelType node on top,
which confuses code examining the nodes.  In particular, this makes
predtest.c-based partition pruning not to work, and ruleutils.c to emit
expressions that are uglier than needed.  Fix it by not adding RelabelType
when not needed.

In master/11 the new pruning code is separate so it doesn't suffer from
this problem, since we already fixed it (in essentially the same way) in
e5dcbb88a1, which also added a few tests; back-patch those tests to
pg10 also.  But since UPDATE/DELETE still uses predtest.c in pg11, this
change improves partitioning for those cases too.  Add tests for this.
The ruleutils.c behavior change is relevant in pg11/master too.

Co-authored-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/54745d13-7ed4-54ac-97d8-ea1eec95ae25@lab.ntt.co.jp
2018-07-10 15:19:40 -04:00
Peter Eisentraut bcbd940806 Remove dynamic_shared_memory_type=none
PostgreSQL nowadays offers some kind of dynamic shared memory feature on
all supported platforms.  Having the choice of "none" prevents us from
relying on DSM in core features.  So this patch removes the choice of
"none".

Author: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
2018-07-10 18:35:24 +02:00
Heikki Linnakangas 17b715c634 Add test case for EEOP_INNER_SYSVAR/EEOP_OUTER_SYSVAR executor opcodes.
The EEOP_INNER_SYSVAR and EEOP_OUTER_SYSVAR executor opcodes are not
exercised by normal queries, because setrefs.c will resolve the references
to system columns in the scan nodes already. Join nodes refer to them by
their position in the child node's target list, like user columns.

The only place where those opcodes are used, is in evaluating a trigger's
WHEN condition that references system columns. Trigger evaluation abuses
the INNER/OUTER Vars to refer to the OLD and NEW tuples. The code to handle
the opcodes is pretty straightforward, but it seems like a good idea to
have some test coverage for them, anyway, so that they don't get removed or
broken by accident.

Author: Ashutosh Bapat, with some changes by me.
Discussion: https://www.postgresql.org/message-id/CAFjFpRerUFX=T0nSnCoroXAJMoo-xah9J+pi7+xDUx86PtQmew@mail.gmail.com
2018-07-10 16:16:14 +03:00
Peter Eisentraut 1486f7f981 Fix typos 2018-07-10 11:14:53 +02:00
Michael Paquier 8a00b96aa9 Add pg_rewind --no-sync
This is an option consistent with what pg_dump and pg_basebackup provide
which is useful for leveraging the I/O effort when testing things, not
to be used in a production environment.

Author: Michael Paquier
Reviewed-by: Heikki Linnakangas
Discussion: https://postgr.es/m/20180325122607.GB3707@paquier.xyz
2018-07-10 08:51:10 +09:00
Michael Paquier 9a4059d4ff Simplify logic to sync target directory at the end of pg_rewind
The previous sync logic relied on looking for and then launching
externally initdb -S, which is a simple wrapper on top of fsync_pgdata.
There is nothing preventing pg_rewind to directly call this routine, so
remove the dependency to initdb and just call it directly.

Author: Michael Paquier
Reviewed-by: Heikki Linnakangas
Discussion: https://postgr.es/m/20180325122607.GB3707@paquier.xyz
2018-07-10 08:39:27 +09:00
Tom Lane 0905fe8911 Avoid emitting a bogus WAL record when recycling an all-zero btree page.
Commit fafa374f2 caused _bt_getbuf() to possibly emit a WAL record for
a page that it was about to recycle.  However, it failed to distinguish
all-zero pages from dead pages, which is important because only the
latter have valid btpo.xact values, or indeed any special space at all.
Recycling an all-zero page with XLogStandbyInfoActive() enabled therefore
led to an Assert failure, or to emission of a WAL record containing a
bogus cutoff XID, which might lead to unnecessary query cancellations
on hot standby servers.

Per reports from Antonin Houska and 自己.  Amit Kapila was first to
propose this fix, and Robert Haas, myself, and Kyotaro Horiguchi
reviewed it at various times.

This is an old bug, so back-patch to all supported branches.

Discussion: https://postgr.es/m/2628.1474272158@localhost
Discussion: https://postgr.es/m/48875502.f4a0.1635f0c27b0.Coremail.zoulx1982@163.com
2018-07-09 19:26:19 -04:00
Alvaro Herrera a22445ff0b Flip argument order in XLogSegNoOffsetToRecPtr
Commit fc49e24fa6 added an input argument after the existing output
argument.  Flip those.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20180708182345.imdgovmkffgtihhk@alvherre.pgsql
2018-07-09 14:33:38 -04:00
Peter Eisentraut ec67b89816 Add UtilityReturnsTuples() support for CALL
This ensures that prepared statements for CALL can return tuples.
2018-07-09 13:58:08 +02:00
Michael Paquier cbc55da556 Rework order of end-of-recovery actions to delay timeline history write
A critical failure in some of the end-of-recovery actions before the
end-of-recovery record is written can cause PostgreSQL to react
inconsistently with the rest of the cluster in the event of a crash
before the final record is written.  Two such failures are for example
an error while processing a two-phase state files or when operating on
recovery.conf.  With this commit, the failures are still considered
FATAL, but the write of the timeline history file is delayed as much as
possible so as the window between the moment the file is written and the
end-of-recovery record is generated gets minimized. This way, in the
event of a crash or a failure, the new timeline decided at promotion
will not seem taken by other nodes in the cluster.  It is not really
possible to reduce to zero this window, hence one could still see
failures if a crash happens between the history file write and the
end-of-recovery record, so any future code should be careful when
adding new end-of-recovery actions.  The original report from Magnus
Hagander mentioned a renamed recovery.conf as original end-of-recovery
failure which caused a timeline to be seen as taken but the subsequent
processing on the now-missing recovery.conf cause the startup process to
issue stop on FATAL, which at follow-up startup made the system
inconsistent because of on-disk changes which already happened.

Processing of two-phase state files still needs some work as corrupted
entries are simply ignored now.  This is left as a future item and this
commit fixes the original complain.

Reported-by: Magnus Hagander
Author: Heikki Linnakangas
Reviewed-by: Alexander Korotkov, Michael Paquier, David Steele
Discussion: https://postgr.es/m/CABUevEz09XY2EevA2dLjPCY-C5UO4Hq=XxmXLmF6ipNFecbShQ@mail.gmail.com
2018-07-09 10:22:34 +09:00
Peter Geoghegan e915fed291 Correct obsolete unique index insertion comment.
Commit bc292937ae failed to update a comment about unique index
checking.  _bt_insertonpg() is no longer responsible for finding an
insertion location while preventing conflicting insertions.
2018-07-08 10:50:13 -07:00
Michael Paquier 677da8c15d Use access() to check file existence in GetNewRelFileNode()
Previous code used BasicOpenFile() and close() just to check for a file
collision, while there is no need to hold open a file descriptor but
that's an overkill here.

Author: Paul Guo
Reviewed-by: Peter Eisentraut, Michael Paquier
Discussion: https://postgr.es/m/CABQrizcUtiHaquxK=d4etBX8GF9kbZB50Nt1gO9_aN-e9SptyQ@mail.gmail.com
2018-07-08 18:53:20 +09:00
Peter Eisentraut 0903bbdad2 Add separate error message for procedure does not exist
While we probably don't want to split up all error messages into
function and procedure variants, this one is a very prominent one, so
it's helpful to be more specific here.
2018-07-07 11:17:04 +02:00
Peter Eisentraut 2e78c5b522 Fix assert in nested SQL procedure call
When executing CALL in PL/pgSQL, we need to set a snapshot before
invoking the to-be-called procedure.  Otherwise, the to-be-called
procedure might end up running without a snapshot.  For LANGUAGE SQL
procedures, this would result in an assertion failure.  (For most other
languages, this is usually not a problem, because those use SPI and SPI
sets snapshots in most cases.)  Setting the snapshot restores the
behavior of how CALL worked when it was handled as a generic SQL
statement in PL/pgSQL (exec_stmt_execsql()).

This change revealed another problem:  In SPI_commit(), we popped the
active snapshot before committing the transaction, to avoid "snapshot %p
still active" errors.  However, there is no particular reason why only
at most one snapshot should be on the stack.  So change this to pop all
active snapshots instead of only one.
2018-07-06 23:25:44 +02:00
Peter Eisentraut e34ec13620 Allow CALL with polymorphic type arguments
In order to be able to resolve polymorphic types, we need to set fn_expr
before invoking the procedure.
2018-07-06 22:40:16 +02:00
Alvaro Herrera 0ce5cf2ef2 Allow replication slots to be dropped in single-user mode
Starting with commit 9915de6c1c, replication slot drop uses a
condition variable sleep to wait until the current user of the slot goes
away.  This is more user friendly than the previous behavior of erroring
out if the slot is in use, but it fails with a not-for-user-consumption
error message in single-user mode; plus, if you're using single-user
mode because you don't want to start the server in the regular mode
(say, disk is full and WAL won't recycle because of the slot), it's
inconvenient.

Fix by skipping the cond variable sleep in single-user mode, since
there can't be anybody to wait for anyway.

Reported-by: tushar <tushar.ahuja@enterprisedb.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/3b2f809f-326c-38dd-7a9e-897f957a4eb1@enterprisedb.com
2018-07-06 16:38:30 -04:00
Andrew Dunstan 8fb68aa265 Print DEBUG2 like that rather than as DEBUG
DEBUG is an alias for DEBUG2, but we want DEBUG2 to show in the settings
no matter how it was spelled.

Takeshi Ideriha

Discussion: https://postgr.es/m/4E72940DA2BF16479384A86D54D0988A5678EC03@G01JPEXMBKW04
2018-07-06 07:29:12 -04:00
Jeff Davis 4513d3a4be Add test for partitionwise join involving default partition.
Author: Rajkumar Raghuwanshi
Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/CAKcux6ky5YeZAY74qSh-ayPZZEQchz092g71iXXbC0%2BE3xoscA%40mail.gmail.com
Discussion: https://postgr.es/m/CAKcux6kOQ85Xtzxu3tM1mR7Vk%3D7Z2e4rG7dL1iMZqPgLMpxQYg%40mail.gmail.com
2018-07-05 18:56:12 -07:00
Alvaro Herrera 3ca966c06f logical decoding: beware of an unset specinsert change
Coverity complains that there is no protection in the code (at least in
non-assertion-enabled builds) against speculative insertion failing to
follow the expected protocol.  Add an elog(ERROR) for the case.
2018-07-05 17:42:37 -04:00
Peter Eisentraut f61988d160 Fix typo 2018-07-05 08:31:40 +02:00
Michael Paquier 3c64dcb1e3 Prevent references to invalid relation pages after fresh promotion
If a standby crashes after promotion before having completed its first
post-recovery checkpoint, then the minimal recovery point which marks
the LSN position where the cluster is able to reach consistency may be
set to a position older than the first end-of-recovery checkpoint while
all the WAL available should be replayed.  This leads to the instance
thinking that it contains inconsistent pages, causing a PANIC and a hard
instance crash even if all the WAL available has not been replayed for
certain sets of records replayed.  When in crash recovery,
minRecoveryPoint is expected to always be set to InvalidXLogRecPtr,
which forces the recovery to replay all the WAL available, so this
commit makes sure that the local copy of minRecoveryPoint from the
control file is initialized properly and stays as it is while crash
recovery is performed.  Once switching to archive recovery or if crash
recovery finishes, then the local copy minRecoveryPoint can be safely
updated.

Pavan Deolasee has reported and diagnosed the failure in the first
place, and the base fix idea to rely on the local copy of
minRecoveryPoint comes from Kyotaro Horiguchi, which has been expanded
into a full-fledged patch by me.  The test included in this commit has
been written by Álvaro Herrera and Pavan Deolasee, which I have modified
to make it faster and more reliable with sleep phases.

Backpatch down to all supported versions where the bug appears, aka 9.3
which is where the end-of-recovery checkpoint is not run by the startup
process anymore.  The test gets easily supported down to 10, still it
has been tested on all branches.

Reported-by: Pavan Deolasee
Diagnosed-by: Pavan Deolasee
Reviewed-by: Pavan Deolasee, Kyotaro Horiguchi
Author: Michael Paquier, Kyotaro Horiguchi, Pavan Deolasee, Álvaro
Herrera
Discussion: https://postgr.es/m/CABOikdPOewjNL=05K5CbNMxnNtXnQjhTx2F--4p4ruorCjukbA@mail.gmail.com
2018-07-05 10:46:18 +09:00
Andres Freund 249126e761 Use context with correct lifetime in hypothetical_dense_rank_final.
The query lifetime expression context created in
hypothetical_dense_rank_final() was buggily allocated in the calling
memory context. I (Andres) broke that in bf6c614a2f.

Reported-By: Rajkumar Raghuwanshi
Author: Amit Langote
Discussion:  https://postgr.es/m/CAKcux6kmzWmur5HhA_aU6gYVFu0RLQdgJJ+aC9SLdcOvBSrpfA@mail.gmail.com
Backpatch: 11-
2018-07-04 17:36:01 -07:00
Andres Freund 3a01f68e35 Check for interrupts inside the nbtree page deletion code.
When deleting pages the nbtree code has to walk through siblings of a
tree node. When those sibling links are corrupted that can lead to
endless loops - which are currently not interruptible.  This is
especially problematic if autovacuum is repeatedly blocked on such
indexes, as it can be hard to get out of that situation without
resorting to single user mode.

Thus add interrupt checks to appropriate places in such
loops. Unfortunately in one of the cases it's it's not easy to do so.

Between 9.3 and 9.4 the page deletion (and page split) code changed
significantly. Before it was significantly less robust against
interruptions. Therefore don't backpatch to 9.3.

Author: Andres Freund
Discussion: https://postgr.es/m/20180627191629.wkunw2qbibnvlz53@alap3.anarazel.de
Backpatch: 9.4-
2018-07-04 14:58:25 -07:00
Fujii Masao b41669118c Improve the performance of relation deletes during recovery.
When multiple relations are deleted at the same transaction,
the files of those relations are deleted by one call to smgrdounlinkall(),
which leads to scan whole shared_buffers only one time. OTOH,
previously, during recovery, smgrdounlink() (not smgrdounlinkall()) was
called for each file to delete, which led to scan shared_buffers
multiple times. Obviously this could cause to increase the WAL replay
time very much especially when shared_buffers was huge.

To alleviate this situation, this commit changes the recovery so that
it also calls smgrdounlinkall() only one time to delete multiple
relation files.

This is just fix for oversight of commit 279628a0a7, not new feature.
So, per discussion on pgsql-hackers, we concluded to backpatch this
to all supported versions.

Author: Fujii Masao
Reviewed-by: Michael Paquier, Andres Freund, Thomas Munro, Kyotaro Horiguchi, Takayuki Tsunakawa
Discussion: https://postgr.es/m/CAHGQGwHVQkdfDqtvGVkty+19cQakAydXn1etGND3X0PHbZ3+6w@mail.gmail.com
2018-07-05 02:23:46 +09:00
Michael Paquier fc057b2b8f Remove dead code for temporary relations in partition planning
Since recent commit 1c7c317c, temporary relations cannot be mixed with
permanent relations within the same partition tree, and the same counts
for temporary relations created by other sessions, which the planner
simply discarded.  Instead be paranoid and issue an error, as those
should be blocked at definition time, at least for now.

At the same time, a test case is added to stress what has been moved
when expand_partitioned_rtentry gets called recursively but bumps on a
partitioned relation with no partitions which should be handled the same
way as the non-inheritance case.  This code may be reworked in a close
future, and covering this code path will limit surprises.

Reported-by: David Rowley
Author: David Rowley
Reviewed-by: Amit Langote, Robert Haas, Michael Paquier
Discussion: https://postgr.es/m/CAKJS1f_HyV1txn_4XSdH5EOhBMYaCwsXyAj6bHXk9gOu4JKsbw@mail.gmail.com
2018-07-04 10:37:40 +09:00
Peter Eisentraut 2c059c86ba Add $Test::Builder::Level to pgbench test functions
same as c4309f4aee
2018-07-03 23:22:01 +02:00
Peter Eisentraut 6837078687 Correct comment 2018-07-03 19:55:00 +02:00
Michael Paquier c55de5e512 Add wait event for fsync of WAL segments
This has been visibly a forgotten spot in the first implementation of
wait events for I/O added by 249cf07, and what has been missing is a
fsync call for WAL segments which is a wrapper reacting on the value of
GUC wal_sync_method.

Reported-by: Konstantin Knizhnik
Author: Konstantin Knizhnik
Reviewed-by: Craig Ringer, Michael Paquier
Discussion: https://postgr.es/m/4a243897-0ad8-f471-aa40-242591f2476e@postgrespro.ru
2018-07-02 22:19:46 +09:00
Michael Paquier c072e80337 Correct function name in comment of logical decoding code
Reported-by: Dave Cramer
Author: Euler Taveira
Discussion: https://postgr.es/m/CADK3HHKnPGJDLhjOFBY6+70Wd14iEH8c2GKw7UrOuUHp_GNFrA@mail.gmail.com
2018-07-02 13:30:12 +09:00
Peter Eisentraut 7bdea62635 Fix libpq example programs
When these programs call pg_catalog.set_config, they need to check for
PGRES_TUPLES_OK instead of PGRES_COMMAND_OK.  Fix for
5770172cb0.

Reported-by: Ideriha, Takeshi <ideriha.takeshi@jp.fujitsu.com>
2018-07-01 14:06:40 +02:00
Andrew Dunstan 56b4da8c9d Use more modern instructions for creating a new dev cycle 2018-07-01 07:55:05 -04:00
Michael Paquier 9994013ff3 Add tests for inheritance trees mixing permanent and temporary relations
While working on 1c7c317c and related things, which has clarified the
use of partitions with temporary tables, I have noticed that there could
be better coverage for inheritance trees mixing temporary and permanent
relations.  A lot of cross-checks happen in MergeAttributes() which is
not designed for this purpose, so the tests added in this commit will
make sure that any kind of future refactoring will limit the amount of
compatibility breakage.

Author: Michael Paquier
Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/20180619022131.GE3314@paquier.xyz
2018-07-01 20:20:06 +09:00
Peter Eisentraut c4309f4aee Use $Test::Builder::Level in TAP test functions
In TAP test functions, that is, those that produce test results, locally
increment $Test::Builder::Level.  This has the effect that test failures
are reported at the callers location rather than somewhere in the test
support libraries.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
2018-07-01 12:58:32 +02:00
Andrew Dunstan feced1387f Stamp HEAD as 12devel
Let the hacking begin ...
2018-06-30 12:47:59 -04:00
Andrew Dunstan d842139099 perltidy run prior to branching 2018-06-30 12:28:55 -04:00
Andrew Dunstan 1e9c858090 pgindent run prior to branching 2018-06-30 12:25:49 -04:00
Andrew Dunstan 2c64d20048 Update typedefs list 2018-06-30 12:07:27 -04:00
Alvaro Herrera bc87f22ef6 psql: show cloned triggers in partitions
In a partition, row triggers that had been cloned from their parent
partitioned table would not be listed at all in psql's \d, which could
surprise users, per insistent complaint from Ashutosh Bapat (though his
aim was elsewhere).  The simplest possible fix, suggested by Peter
Eisentraut, seems to be to list triggers marked as internal if they have
a row in pg_depend that points to some other trigger.

Author: Álvaro Herrera
Discussion: https://postgr.es/m/20180618165910.p26vhk7dpq65ix54@alvherre.pgsql
2018-06-29 11:53:22 -04:00
Alvaro Herrera 41372071df Fix crash when ALTER TABLE recreates indexes on partitions
The skip_build flag was not being passed correctly when recursing to
indexes on partitions, leading to attempts to rebuild indexes when they
were not yet ready to be rebuilt.

Reported-by: Rajkumar Raghuwanshi
Discussion: https://postgr.es/m/CAKcux6mxNCGsgATwf5CGMF8g4WSupCXicCVMeKUTuWbyxHOMsQ@mail.gmail.com
2018-06-29 11:49:30 -04:00
Michael Paquier dad335b89f Replace search.cpan.org with metacpan.org
search.cpan.org has been EOL'd, with metacpan.org being the official
replacement to which URLs now redirect.  Update links to match the new
URL. Also update links to CPAN to use https as it will redirect from
http.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/B74C0219-6BA9-46E1-A524-5B9E8CD3BDB3@yesql.se
2018-06-29 22:02:20 +09:00
Michael Paquier dad5f8a3d5 Make capitalization of term "OpenSSL" more consistent
This includes code comments and documentation.  No backpatch as this is
cosmetic even if there are documentation changes which are user-facing.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/BB89928E-2BC7-489E-A5E4-6D204B3954CF@yesql.se
2018-06-29 09:45:44 +09:00
Alvaro Herrera f5545287dc Fix typo in comment
Author: Amit Langote
Discussion: https://postgr.es/m/b23dc88b-df41-ef07-22c5-12f77cf73b57@lab.ntt.co.jp
2018-06-27 15:40:28 -04:00
Amit Kapila 2e61c50785 Fix thinko in comments.
A slot can not be stored in a tuple but it's vice versa.

Reported-by: Ashutosh Bapat
Author: Ashutosh Bapat
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAFjFpRcHhNhXdegyJv3KKDWrwO1_NB_KYZM_ZSDeMOZaL1A5jQ@mail.gmail.com
2018-06-27 18:05:24 +05:30
Andres Freund 42121790ca Change pqformat.h's integer handling functions to take unsigned integers.
As added in 1de09ad8eb the new functions
all accept signed integers as parameters. That's not great, because
it's perfectly reasonable to call them with unsigned parameters.
Unfortunately unsigned to signed conversion is not well defined, when
exceeding the range of the signed value.  That's presently not a
practical issue in postgres (among other reasons because we force
gcc's hand with -fwrapv).  But it's clearly not quite right.

Thus change the signatures to accept unsigned integers instead, signed
to unsigned conversion is always well defined. Also change the
backward compat pq_sendint() - while it's deprecated it seems better
to be consistent.

Per discussion between Andrew Gierth, Michael Paquier, Alvaro Herrera
and Tom Lane.

Reported-By: Andrew Gierth
Author: Andres Freund
Discussion: https://postgr.es/m/87r2m10zm2.fsf@news-spur.riddles.org.uk
2018-06-26 23:40:32 -07:00
Andres Freund 986070872f Remove duplicated return statement from llvmjit code.
The duplicated return clearly doesn't make sense / isn't
reachable. Likely introduced by me (Andres), while revising the code.

Author: Rushabh Lathia
Discussion: https://postgr.es/m/CAGPqQf2raxWOcbuTP36M1rEF3=Rfo7oD29K3psdyHMeE5swBRg@mail.gmail.com
2018-06-26 23:16:50 -07:00
Peter Eisentraut 0fcf5e0e6e Fix whitespace 2018-06-27 08:03:54 +02:00
Amit Kapila 8121ab88e7 Cosmetic improvements for faster column addition.
Changed the name of few structure members for the sake of clarity and
removed spurious whitespace.

Reported-by: Amit Kapila
Author: Amit Kapila, based on suggestion by Andrew Dunstan
Reviewed-by: Alvaro Herrera
Discussion: https://postgr.es/m/CAA4eK1K2znsFpC+NQ9A4vxT4uDxADN4RmvHX0L6Y=aHVo9gB4Q@mail.gmail.com
2018-06-27 08:16:13 +05:30
Alvaro Herrera f49a80c481 Fix "base" snapshot handling in logical decoding
Two closely related bugs are fixed.  First, xmin of logical slots was
advanced too early.  During xl_running_xacts processing, xmin of the
slot was set to the oldest running xid in the record, but that's wrong:
actually, snapshots which will be used for not-yet-replayed transactions
might consider older txns as running too, so we need to keep xmin back
for them.  The problem wasn't noticed earlier because DDL which allows
to delete tuple (set xmax) while some another not-yet-committed
transaction looks at it is pretty rare, if not unique: e.g. all forms of
ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock
conflicting with any inserts. The included test case (test_decoding's
oldest_xmin) uses ALTER of a composite type, which doesn't have such
interlocking.

To deal with this, we must be able to quickly retrieve oldest xmin
(oldest running xid among all assigned snapshots) from ReorderBuffer. To
fix, add another list of ReorderBufferTXNs to the reorderbuffer, where
transactions are sorted by base-snapshot-LSN.  This is slightly
different from the existing (sorted by first-LSN) list, because a
transaction can have an earlier LSN but a later Xmin, if its first
record does not obtain an xmin (eg. xl_xact_assignment).  Note this new
list doesn't fully replace the existing txn list: we still need that one
to prevent WAL recycling.

The second issue concerns SnapBuilder snapshots and subtransactions.
SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a
transaction that is known to be a subtxn, which is good in the common
case that the top-level transaction already has one (no point in doing
so), but a bug otherwise.  To fix, arrange to transfer the snapshot from
the subtxn to its top-level txn as soon as the kinship gets known.
test_decoding's snapshot_transfer verifies this.

Also, fix a minor memory leak: refcount of toplevel's old base snapshot
was not decremented when the snapshot is transferred from child.

Liberally sprinkle code comments, and rewrite a few existing ones.  This
part is my (Álvaro's) contribution to this commit, as I had to write all
those comments in order to understand the existing code and Arseny's
patch.

Reported-by: Arseny Sher <a.sher@postgrespro.ru>
Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad
2018-06-26 16:48:10 -04:00
Alexander Korotkov 4d54543efa Fix upper limit for vacuum_cleanup_index_scale_factor
6ca33a88 sets upper limit for vacuum_cleanup_index_scale_factor to
DBL_MAX.  DBL_MAX appears to be platform-dependent. That causes
many buildfarm animals to fail, because we check boundaries of
vacuum_cleanup_index_scale_factor in regression tests.

This commit changes upper limit from DBL_MAX to just "large enough"
limit, which was arbitrary selected as 1e10.

Author: Alexander Korotkov
Reported-by: Tom Lane, Darafei Praliaskouski
Discussion: https://postgr.es/m/CAPpHfdvewmr4PcpRjrkstoNn1n2_6dL-iHRB21CCfZ0efZdBTg%40mail.gmail.com
Discussion: https://postgr.es/m/CAC8Q8tLYFOpKNaPS_E7V8KtPdE%3D_TnAn16t%3DA3LuL%3DXjfOO-BQ%40mail.gmail.com
2018-06-26 21:55:59 +03:00
Peter Geoghegan aefb0a382c Correct a comment on logtape.c's leader tape.
randomAccess parallel tuplesorts are disallowed because the leader would
try to write to its own leader tape, not because the leader would try to
write to a worker tape directly.

Cleanup from commit 9da0cc3528.
2018-06-26 11:16:20 -07:00
Peter Geoghegan cdc2693a11 Remove obsolete comment block in nbtsort.c.
Building a new nbtree index through incremental insertions would always
be slower than our actual approach of sorting using tuplesort,
assembling leaf pages from tuplesort output, and writing and WAL-logging
whole pages.  Remove a comment block from the Berkeley days claiming
that incremental insertions might be slightly faster with presorted
input.

Discussion: https://postgr.es/m/CAH2-WzmKs4mLAoFgJ3yHMRYc849efc=dw+pNRb3NEog2oJoCNw@mail.gmail.com
2018-06-26 10:08:44 -07:00
Alvaro Herrera 040da42367 Enable failure to rename a partitioned index
Concurrently with partitioned index development (commit 8b08f7d482),
the code to handle failure to rename indexes was refactored (commit
8b9e9644dc).  Turns out that that particular case was untested, which
naturally led it to be broken.  Add tests and the missing code line.

Co-authored-by: David Rowley <dgrowley@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>
Discussion: https://postgr.es/m/CAKcux6mfYMS3OX0ywjOiWiGSEKhJf-1zdeTceHFbd0mScUzU5A@mail.gmail.com
2018-06-26 11:54:45 -04:00
Alvaro Herrera 7d872c91a3 Allow direct lookups of AppendRelInfo by child relid
find_appinfos_by_relids had quite a large overhead when the number of
items in the append_rel_list was high, as it had to trawl through the
append_rel_list looking for AppendRelInfos belonging to the given
childrelids.  Since there can only be a single AppendRelInfo for each
child rel, it seems much better to store an array in PlannerInfo which
indexes these by child relid, making the function O(1) rather than O(N).
This function was only called once inside the planner, so just replace
that call with a lookup to the new array.  find_childrel_appendrelinfo
is now unused and thus removed.

This fixes a planner performance regression new to v11 reported by
Thomas Reiss.

Author: David Rowley
Reported-by: Thomas Reiss
Reviewed-by: Ashutosh Bapat
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/94dd7a4b-5e50-0712-911d-2278e055c622@dalibo.com
2018-06-26 10:35:26 -04:00
Alexander Korotkov 6ca33a885b Increase upper limit for vacuum_cleanup_index_scale_factor
Upper limits for vacuum_cleanup_index_scale_factor GUC and reloption
were initially set to 100.0 in 857f9c36.  However, after further
discussion, it appears that some users like to disable B-tree cleanup
index scan completely (assuming there are no deleted pages).

vacuum_cleanup_index_scale_factor is used barely to protect against
stalled index statistics.  And after detailed consideration it appears
that risk of stalled index statistics is low.  And it would be nice to
allow advanced users setting higher values of
vacuum_cleanup_index_scale_factor.  So, set upper limit for these
GUC and reloption to DBL_MAX.

Author: Alexander Korotkov
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAC8Q8tJCb%3DgxhzcV7T6ctx7PY-Ux1oA-AsTJc6cAVNsQiYcCzA%40mail.gmail.com
2018-06-26 15:00:51 +03:00
Peter Eisentraut c9301deb9b Reword SPI_ERROR_TRANSACTION errors in PL/pgSQL
The previous message for SPI_ERROR_TRANSACTION claimed "cannot begin/end
transactions in PL/pgSQL", but that is no longer true.  Nevertheless,
the error can still happen, so reword the messages.  The error cases in
exec_prepare_plan() could never happen, so remove them.
2018-06-26 11:38:46 +02:00
Thomas Munro a40cff8956 Move RecoveryLockList into a hash table.
Standbys frequently need to release all locks held by a given xid.
Instead of searching one big list linearly, let's create one list
per xid and put them in a hash table, so we can find what we need
in O(1) time.

Earlier analysis and a prototype were done by David Rowley, though
this isn't his patch.

Back-patch all the way.

Author: Thomas Munro
Diagnosed-by: David Rowley, Andres Freund
Reviewed-by: Andres Freund, Tom Lane, Robert Haas
Discussion: https://postgr.es/m/CAEepm%3D1mL0KiQ2KJ4yuPpLGX94a4Ns_W6TL4EGRouxWibu56pA%40mail.gmail.com
Discussion: https://postgr.es/m/CAKJS1f9vJ841HY%3DwonnLVbfkTWGYWdPN72VMxnArcGCjF3SywA%40mail.gmail.com
2018-06-26 18:45:45 +12:00
Michael Paquier c672d709b0 Fix description and documentation related to pg_restore --no-comments
These descriptions have been referring to object dump, but a restore
operation is done.

Reported-by: Andrey Lizenko
Author: Andrey Lizenko
Discussion: https://postgr.es/m/152992021588.1268.16786093506650391435@wrigleys.postgresql.org
2018-06-26 14:57:53 +09:00
Michael Paquier d08c3d5197 Correct handling of fsync failures with tar mode of walmethods.c
This file has been missing the fact that it needs to report back to
callers a proper failure on fsync calls.  I have spotted the one in
tar_finish() while Kuntal has spotted the one in tar_close().

Backpatch down to 10 where this code has been introduced.

Reported by: Michael Paquier, Kuntal Ghosh
Author: Michael Paquier
Reviewed-by: Kuntal Ghosh, Magnus Hagander
Discussion: https://postgr.es/m/20180625024356.GD1146@paquier.xyz
2018-06-26 09:41:58 +09:00
Alvaro Herrera 322548a8ab Update obsolete comments
Commit 9fab40ad32 removed some pre-allocating logic in
reorderbuffer.c, but left outdated comments in place.  Repair.

Author: Álvaro Herrera
2018-06-25 15:45:48 -04:00
Alvaro Herrera 1d4e5edc1d Stamp 11beta2. 2018-06-25 11:09:49 -04:00
Peter Eisentraut 299addd592 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 884f33d735870f94357820800840af3e93ff4628
2018-06-25 12:37:18 +02:00
Michael Paquier 6cb3372411 Address set of issues with errno handling
System calls mixed up in error code paths are causing two issues which
several code paths have not correctly handled:
1) For write() calls, sometimes the system may return less bytes than
what has been written without errno being set.  Some paths were careful
enough to consider that case, and assumed that errno should be set to
ENOSPC, other calls missed that.
2) errno generated by a system call is overwritten by other system calls
which may succeed once an error code path is taken, causing what is
reported to the user to be incorrect.

This patch uses the brute-force approach of correcting all those code
paths.  Some refactoring could happen in the future, but this is let as
future work, which is not targeted for back-branches anyway.

Author: Michael Paquier
Reviewed-by: Ashutosh Sharma
Discussion: https://postgr.es/m/20180622061535.GD5215@paquier.xyz
2018-06-25 11:19:05 +09:00
Andrew Dunstan 123efbccea Mark binary_upgrade_set_missing_value as parallel_unsafe
per buildfarm.

Bump catalog version again although in practice nobody is going to use
this in a parallel query.
2018-06-23 08:43:05 -04:00
Alvaro Herrera 475be5e790 When index recurses to a partition, map columns numbers
Two out of three code paths were mapping column numbers correctly if a
partition had different column numbers than parent table, but the most
commonly used one (recursing in CREATE INDEX to a new index on a
partition) failed to map attribute numbers in expressions.  Oddly
enough, attnums in WHERE clauses are already handled correctly
everywhere.

Reported-by: Amit Langote
Author: Amit Langote
Discussion: https://postgr.es/m/dce1fda4-e0f0-94c9-6abb-f5956a98c057@lab.ntt.co.jp
Reviewed-by: Álvaro Herrera
2018-06-22 16:45:48 -04:00
Robert Haas c6f28af5d7 Avoid generating bogus paths with partitionwise aggregate.
Previously, if some or all partitions had no partially aggregated path,
we would still try to generate a partially aggregated path for the
parent, leading to assertion failures or wrong answers.

Report by Rajkumar Raghuwanshi.  Patch by Jeevan Chalke, reviewed
by Ashutosh Bapat.  A few changes by me.

Discussion: http://postgr.es/m/CAKcux6=q4+Mw8gOOX16ef6ZMFp9Cve7KWFstUsrDa4GiFaXGUQ@mail.gmail.com
2018-06-22 09:20:19 -04:00
Andrew Dunstan 2448adf29c Allow for pg_upgrade of attributes with missing values
Commit 16828d5c02 neglected to do this, so upgraded databases would
silently get null instead of the specified default in rows without the
attribute defined.

A new binary upgrade function is provided to perform this and pg_dump is
adjusted to output a call to the function if required in binary upgrade
mode.

Also included is code to drop missing attribute values for dropped
columns. That way if the type is later dropped the missing value won't
have a dangling reference to the type.

Finally the regression tests are adjusted to ensure that there is a row
with a missing value so that this code is exercised in upgrade testing.

Catalog version unfortunately bumped.

Regression test changes from Tom Lane.
Remainder from me, reviewed by Tom Lane, Andres Freund, Alvaro Herrera

Discussion: https://postgr.es/m/19987.1529420110@sss.pgh.pa.us
2018-06-22 08:42:36 -04:00
Alexander Korotkov 9a994e37e0 Fixes for vacuum_cleanup_index_scale_factor GUC option
vacuum_cleanup_index_scale_factor was located in autovacuum group of
GUCs.  However, it affects not only autovacuum, but also manually run
VACUUM.  It appears that "client connection defaults" group of GUCs
is more appropriate for vacuum_cleanup_index_scale_factor, because
vacuum_*_age options are already located there.

Also, vacuum_cleanup_index_scale_factor was missed in
postgresql.conf.sample.  So, add it there with appropriate comment.

Author: Masahiko Sawada with minor editorization by me
Discussion: https://postgr.es/m/CAD21AoArsoXMLKudXSKN679FRzs6oubEchM53bHwn8Tp%3D2boNg%40mail.gmail.com
2018-06-22 12:26:21 +03:00
Michael Paquier 0aa5e65ab4 Fix typo in comment of commit_ts.c for incorrect reference to CLOG
Author: Shao Bret
2018-06-22 13:30:26 +09:00
Amit Kapila 98d476a965 Improve coding pattern in Parallel Append code.
The create_append_path code didn't consider that list_concat will
modify it's first argument leading to inconsistent traversal of
resulting list.  In practice, it won't lead to any user-visible bug
but changing it for making the code behave consistently.

Reported-by: Tom Lane
Author: Tom Lane
Reviewed-by: Amit Khandekar and Amit Kapila
Discussion: https://postgr.es/m/32365.1528994120@sss.pgh.pa.us
2018-06-22 08:43:36 +05:30
Tom Lane ec4719cd15 Fix partial aggregation for variance(int4) and related aggregates.
A typo in numeric_poly_combine caused bogus results for queries using
it, but of course would only manifest if parallel aggregation is
performed.  Reported by Rajkumar Raghuwanshi.

David Rowley did the diagnosis and the fix; I editorialized rather
heavily on his regression test additions.

Back-patch to v10 where the breakage was introduced (by 9cca11c91).

Discussion: https://postgr.es/m/CAKcux6nU4E2x8nkSBpLOT2DPvQ5LviJ3SGyAN6Sz7qDH4G4+Pw@mail.gmail.com
2018-06-21 16:18:39 -04:00
Alvaro Herrera e474c2b7e4 Set correct context for XPath evaluation
According to the SQL standard, the context of XMLTABLE's XPath
row_expression is the document node of the XML input document, not the
root node.  This becomes visible when a relative path rather than
absolute is used as row expression.  Absolute paths is what was used in
original tests and docs (and the most common form used in examples
throughout the interwebs), which explains why this wasn't noticed
before.

Other functions such as xpath() and xpath_exists() also have this
problem.  While not specified by the SQL standard, it would be pretty
odd to leave those functions to behave differently than XMLTABLE, so
change them too.  However, this is a backwards-incompatible change.

No backpatch, out of fear of breaking code depending on the original
broken behavior.

Author: Markus Winand
Reported-By: Markus Winand
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/0684A598-002C-42A2-AE12-F024A324EAE4@winand.at
2018-06-21 15:56:11 -04:00
Tom Lane 425b4c082c Improve requirements documentation for ldap test suite.
Text by me; data contributed by me, Thomas Munro, Michael Paquier.

Discussion: https://postgr.es/m/20180521013425.GA4476@paquier.xyz
2018-06-21 12:37:21 -04:00
Tom Lane 07e5a21352 Fix mishandling of sortgroupref labels while splitting SRF targetlists.
split_pathtarget_at_srfs() neglected to worry about sortgroupref labels
in the intermediate PathTargets it constructs.  I think we'd supposed
that their labeling didn't matter, but it does at least for the case that
GroupAggregate/GatherMerge nodes appear immediately under the ProjectSet
step(s).  This results in "ERROR: ORDER/GROUP BY expression not found in
targetlist" during create_plan(), as reported by Rajkumar Raghuwanshi.

To fix, make this logic track the sortgroupref labeling of expressions,
not just their contents.  This also restores the pre-v10 behavior that
separate GROUP BY expressions will be kept distinct even if they are
textually equal().

Discussion: https://postgr.es/m/CAKcux6=1_Ye9kx8YLBPmJs_xE72PPc6vNi5q2AOHowMaCWjJ2w@mail.gmail.com
2018-06-21 10:58:42 -04:00
Alvaro Herrera 9cd929d360 Update expected XML output with disabled XML
Should have been done in previous commit.
2018-06-20 13:05:44 -04:00
Alvaro Herrera b7f0be9a7e Accept TEXT and CDATA nodes in XMLTABLE's column_expression.
Column expressions that match TEXT or CDATA nodes must return the
contents of the nodes themselves, not the content of non-existing
children (i.e. the empty string).

Author: Markus Winand
Reported-by: Markus Winand
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/0684A598-002C-42A2-AE12-F024A324EAE4@winand.at
2018-06-20 12:58:12 -04:00
Magnus Hagander 3adcad4558 Add missing include
Per buildfarm
2018-06-20 18:19:35 +02:00
Alvaro Herrera 8f97af60d1 Consistently use the term 'partitioned rel' in partprune comments
We were using 'partition rel' in a few places, which is quite confusing.

Author: Amit Langote
Reviewed-by: David Rowley
Reviewed-by: Michaël Paquier
Discussion: https://postgr.es/m/fd256561-31a2-4b7e-cd84-d8241e7ebc3f@lab.ntt.co.jp
2018-06-20 11:43:01 -04:00
Magnus Hagander 741ee9dc81 Support long option for --pgdata in pg_verify_checksums
Author: Daniel Gustafsson <daniel@yesql.se>
2018-06-20 14:33:48 +02:00
Amit Kapila 403318b71f Don't consider parallel append for parallel unsafe paths.
Commit ab72716778 allowed Parallel Append paths to be generated for a
relation that is not parallel safe.  Prevent that from happening.

Initial analysis by Tom Lane.

Reported-by: Rajkumar Raghuwanshi
Author: Amit Kapila and Rajkumar Raghuwanshi
Reviewed-by: Amit Khandekar and Robert Haas
Discussion:https://postgr.es/m/CAKcux6=tPJ6nJ08r__nU_pmLQiC0xY15Fn0HvG1Cprsjdd9s_Q@mail.gmail.com
2018-06-20 07:51:42 +05:30
Michael Paquier 1c7c317cd9 Clarify use of temporary tables within partition trees
Since their introduction, partition trees have been a bit lossy
regarding temporary relations.  Inheritance trees respect the following
patterns:
1) a child relation can be temporary if the parent is permanent.
2) a child relation can be temporary if the parent is temporary.
3) a child relation cannot be permanent if the parent is temporary.
4) The use of temporary relations also imply that when both parent and
child need to be from the same sessions.

Partitions share many similar patterns with inheritance, however the
handling of the partition bounds make the situation a bit tricky for
case 1) as the partition code bases a lot of its lookup code upon
PartitionDesc which does not really look after relpersistence.  This
causes for example a temporary partition created by session A to be
visible by another session B, preventing this session B to create an
extra partition which overlaps with the temporary one created by A with
a non-intuitive error message.  There could be use-cases where mixing
permanent partitioned tables with temporary partitions make sense, but
that would be a new feature.  Partitions respect 2), 3) and 4) already.

It is a bit depressing to see those error checks happening in
MergeAttributes() whose purpose is different, but that's left as future
refactoring work.

Back-patch down to 10, which is where partitioning has been introduced,
except that default partitions do not apply there.  Documentation also
includes limitations related to the use of temporary tables with
partition trees.

Reported-by: David Rowley
Author: Amit Langote, Michael Paquier
Reviewed-by: Ashutosh Bapat, Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/CAKJS1f94Ojk0og9GMkRHGt8wHTW=ijq5KzJKuoBoqWLwSVwGmw@mail.gmail.com
2018-06-20 10:42:25 +09:00
Tom Lane c992dca26e Clarify the README files for the various separate TAP-based test suites.
Explain the difference between "make check" and "make installcheck".
Mention the need for --enable-tap-tests (only some of these did so
before).  Standardize their wording about how to run the tests.
2018-06-19 19:30:50 -04:00
Bruce Momjian 9bab9cb36a README: add URLs for openldap installation
Reported-by: Michael Paquier

Discussion: https://postgr.es/m/20180521013425.GA4476@paquier.xyz

Backpatch-through: head
2018-06-19 15:52:17 -04:00
Michael Paquier bde64eb610 Track new configure flags introduced for version 11 in pg_config.h.win32
The following set of flags mainly matter when building Postgres code
with MSVC and those have been forgotten with latest developments:
- HAVE_LDAP_INITIALIZE, added by 35c0754f, and marked as disabled.
ldap_initialize() is a non-standard extension that provides a way to use
"ldaps" with OpenLDAP, but it is not supported on Windows, and instead
the non-standard ldap_sslinit() is used if WIN32 is defined.  Per input
from Thomas Munro.
- HAVE_X509_GET_SIGNATURE_NID, added by 054e8c6c, which is used by
SCRAM's channel binding tls-server-end-point.  Having this flag disabled
would cause this channel binding type to be unsupported for Windows
builds.
- HAVE_SSL_CLEAR_OPTIONS, added recently as of a364dfa4 to disable SSL
compression.
- HAVE_ASN1_STRING_GET0_DATA, added by 5c6df67, which is used to track
a new compatibility with OpenSSL 1.1.0.  This was missing from
pg_config.win32.h and is not enabled by default.  HAVE_BIO_GET_DATA,
HAVE_OPENSSL_INIT_SSL and HAVE_BIO_METH_NEW gain the same treatment.

The second and third flags are enabled with this commit, which raises
the bar of OpenSSL support to 1.0.2 on Windows as a minimum.  As this
is the LTS (long-time support) version of OpenSSL community and knowing
that all recent installers referred by OpenSSL upstream don't have
anymore 1.0.1 or older, we could live with that requirement.  In order
to allow the code to compile with OpenSSL 1.1.0, all the flags mentioned
above need to be enabled in pg_config.h.win32.

Author: Michael Paquier
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/20180529211559.GF6632@paquier.xyz
2018-06-19 09:00:33 +09:00
Tom Lane 3a382983d1 Allow plperl_sv_to_datum to look through scalar refs.
There seems little reason for the policy of throwing error if we
find a ref to something other than a hash or array.   Recursively
look through the ref, instead.  This makes the behavior in non-transform
cases comparable to what was already instantiated for jsonb_plperl.

Note that because we invoke any available transform function before
considering the ref case, it's up to each transform function whether
it wants to play along with this behavior or do something different.

Because the previous behavior was just to throw a useless error,
this seems unlikely to create any compatibility issues.  Still, given
the lack of field complaints so far, seems best not to back-patch.

Discussion: https://postgr.es/m/28336.1528393969@sss.pgh.pa.us
2018-06-18 15:31:57 -04:00
Tom Lane 45e98ee730 Remove obsolete prohibition on function name matching a column name.
ProcedureCreate formerly threw an error if the function to be created
has one argument of composite type and the function name matches some
column of the composite type.  This was a (very non-bulletproof) defense
against creating situations where f(x) and x.f are ambiguous.  But we
don't really need such a defense in the wake of commit b97a3465d, which
allows us to deal with such situations fairly cleanly.  This behavior
also created a dump-and-reload hazard, since a function might be
rejected if a conflicting column name had been added to the input
composite type later.  Hence, let's just drop the check.

Discussion: https://postgr.es/m/CAOW5sYa3Wp7KozCuzjOdw6PiOYPi6D=VvRybtH2S=2C0SVmRmA@mail.gmail.com
2018-06-18 11:57:33 -04:00
Tom Lane b97a3465d7 Consider syntactic form when disambiguating function vs column reference.
Postgres has traditionally considered the syntactic forms f(x) and x.f
to be equivalent, allowing tricks such as writing a function and then
using it as though it were a computed-on-demand column.  However, our
behavior when both interpretations are feasible left something to be
desired: we always chose the column interpretation.  This could lead
to very surprising results, as in a recent bug report from Neil Conway.
It also created a dump-and-reload hazard, since what was a function
call in a dumped view could get interpreted as a column reference
at reload, if a matching column name had been added to the underlying
table since the view was created.

What seems better, in ambiguous situations, is to prefer the choice
matching the syntactic form of the reference.  This seems much less
astonishing in general, and it fixes the dump/reload hazard.

Although this could be called a bug fix, there have been few complaints
and there's some small risk of breaking applications that depend on the
old behavior, so no back-patch.  It does seem reasonable to slip it
into v11, though.

Discussion: https://postgr.es/m/CAOW5sYa3Wp7KozCuzjOdw6PiOYPi6D=VvRybtH2S=2C0SVmRmA@mail.gmail.com
2018-06-18 11:39:33 -04:00
Thomas Munro 4c8156d871 Add PGTYPESchar_free() to avoid cross-module problems on Windows.
On Windows, it is sometimes important for corresponding malloc() and
free() calls to be made from the same DLL, since some build options can
result in multiple allocators being active at the same time.  For that
reason we already provided PQfreemem().  This commit adds a similar
function for freeing string results allocated by the pgtypes library.

Author: Takayuki Tsunakawa
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8AD5D6%40G01JPEXMBYT05
2018-06-18 18:33:53 +12:00
Michael Paquier 70b4f82a4b Prevent hard failures of standbys caused by recycled WAL segments
When a standby's WAL receiver stops reading WAL from a WAL stream, it
writes data to the current WAL segment without having priorily zero'ed
the page currently written to, which can cause the WAL reader to read
junk data from a past recycled segment and then it would try to get a
record from it.  While sanity checks in place provide most of the
protection needed, in some rare circumstances, with chances increasing
when a record header crosses a page boundary, then the startup process
could fail violently on an allocation failure, as follows:
FATAL:  invalid memory alloc request size XXX

This is confusing for the user and also unhelpful as this requires in
the worst case a manual restart of the instance, impacting potentially
the availability of the cluster, and this also makes WAL data look like
it is in a corrupted state.

The chances of seeing failures are higher if the connection between the
standby and its root node is unstable, causing WAL pages to be written
in the middle.  A couple of approaches have been discussed, like
zero-ing  new WAL pages within the WAL receiver itself but this has the
disadvantage of impacting performance of any existing instances as this
breaks the sequential writes done by the WAL receiver.  This commit
deals with the problem with a more simple approach, which has no
performance impact without reducing the detection of the problem: if a
record is found with a length higher than 1GB for backends, then do not
try any allocation and report a soft failure which will force the
standby to retry reading WAL.  It could be possible that the allocation
call passes and that an unnecessary amount of memory is allocated,
however follow-up checks on records would just fail, making this
allocation short-lived anyway.

This patch owes a great deal to Tsunakawa Takayuki for reporting the
failure first, and then discussing a couple of potential approaches to
the problem.

Backpatch down to 9.5, which is where palloc_extended has been
introduced.

Reported-by: Tsunakawa Takayuki
Reviewed-by: Tsunakawa Takayuki
Author: Michael Paquier
Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8B57AD@G01JPEXMBYT05
2018-06-18 10:43:27 +09:00
Tom Lane 9b53d96684 Suppress -Wshift-negative-value warnings.
Clean up four places that result in compiler warnings when using recent
gcc with this warning class enabled (as seen on buildfarm members
calliphoridae, skink, and others).  In all these places, this is purely
cosmetic, because the shift distance could not be large enough to risk
a change of sign, so there's no chance of implementation-dependent
behavior.  Still, it's easy enough to avoid the warning by casting the
shifted value to unsigned, so let's do that.

Patch HEAD only, this isn't worth a back-patch.
2018-06-17 16:15:11 -04:00
Tom Lane 6b74f5eaad Avoid unnecessary use of strncpy in a couple of places in ecpg.
Use of strncpy with a length limit based on the source, rather than
the destination, is non-idiomatic and draws warnings from gcc 8.
Replace with memcpy, which does exactly the same thing in these cases,
but with less chance for confusion.

Backpatch to all supported branches.

Discussion: https://postgr.es/m/21789.1529170195@sss.pgh.pa.us
2018-06-16 14:58:11 -04:00
Tom Lane 5d923eb29b Use snprintf not sprintf in pg_waldump's timestamptz_to_str.
This could only cause an issue if strftime returned a ridiculously
long timezone name, which seems unlikely; and it wouldn't qualify
as a security problem even then, since pg_waldump (nee pg_xlogdump)
is a debug tool not part of the server.  But gcc 8 has started issuing
warnings about it, so let's use snprintf and be safe.

Backpatch to 9.3 where this code was added.

Discussion: https://postgr.es/m/21789.1529170195@sss.pgh.pa.us
2018-06-16 14:45:47 -04:00
Tom Lane 0dcf68e5a1 Fix some minor error-checking oversights in ParseFuncOrColumn().
Recent additions to ParseFuncOrColumn to make it reject non-procedure
functions in CALL were neither adequate nor documented.  Reorganize
the code to ensure uniform results for all the cases that should be
rejected.  Also, use ERRCODE_WRONG_OBJECT_TYPE for this case as well
as the converse case of a procedure in a non-CALL context.  The
original coding used ERRCODE_UNDEFINED_FUNCTION which seems wrong,
and is certainly inconsistent with the adjacent wrong-kind-of-routine
errors.

This reorganization also causes the checks for aggregate decoration with
a non-aggregate function to be made in the FUNCDETAIL_COERCION case;
that they were not is a long-standing oversight.

Discussion: https://postgr.es/m/14497.1529089235@sss.pgh.pa.us
2018-06-16 14:11:14 -04:00
Simon Riggs 15378c1a15 Remove AELs from subxids correctly on standby
Issues relate only to subtransactions that hold AccessExclusiveLocks
when replayed on standby.

Prior to PG10, aborting subtransactions that held an
AccessExclusiveLock failed to release the lock until top level commit or
abort. 49bff5300d fixed that.

However, 49bff5300d also introduced a similar bug where subtransaction
commit would fail to release an AccessExclusiveLock, leaving the lock to
be removed sometimes early and sometimes late. This commit fixes
that bug also. Backpatch to PG10 needed.

Tested by observation. Note need for multi-node isolationtester to improve
test coverage for this and other HS cases.

Reported-by: Simon Riggs
Author: Simon Riggs
2018-06-16 14:03:29 +01:00
Tatsuo Ishii 1cfdb1cb0e Fix memory leak in BufFileCreateShared().
Also this commit unifies some duplicated code in makeBufFile() and
BufFileOpenShared() into new function makeBufFileCommon().

Author: Antonin Houska
Reviewed-By: Thomas Munro, Tatsuo Ishii
Discussion: https://postgr.es/m/16139.1529049566%40localhost
2018-06-16 14:21:08 +09:00
Alvaro Herrera ff03112bdc Fix off-by-one bug in XactLogCommitRecord
Commit 1eb6d6527a introduced zeroed alignment bytes in the GID field
of commit/abort WAL records.  Fixup commit cf5a189059 later changed
that representation into a regular cstring with a single terminating
zero byte, but it also introduced an off-by-one mistake.  Fix that.

Author: Nikhil Sontakke
Reported-by: Nikhil Sontakke
Discussion: https://postgr.es/m/CAMGcDxey6dG1DP34_tJMoWPcp5sPJUAL4K5CayUUXLQSx2GQpA@mail.gmail.com
2018-06-15 15:00:41 -04:00
Tatsuo Ishii 969274d813 Fix memory leak.
Memory is allocated twice for "file" and "files" variables in
BufFileOpenShared().

Author: Antonin Houska
Discussion: https://postgr.es/m/11329.1529045692%40localhost
2018-06-15 16:32:59 +09:00
Alvaro Herrera 74da7cda31 Fail BRIN control functions during recovery explicitly
They already fail anyway, but prior to this patch they raise an ugly
error message about a lock that cannot be acquired.  This just improves
the message.

Author: Masahiko Sawada
Reported-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoBZau4g4_NUf3BKNd=CdYK+xaPdtJCzvOC1TxGdTiJx_Q@mail.gmail.com
Reviewed-by: Kuntal Ghosh, Alexander Korotkov, Simon Riggs, Michaël Paquier, Álvaro Herrera
2018-06-14 12:51:32 -04:00
Simon Riggs dc878ffedf Remove spurious code comments in standby related code
GetRunningTransactionData() suggested that subxids were not worth
optimizing away if overflowed, yet they have already been removed
for that case.

Changes to LogAccessExclusiveLock() API forgot to remove the
prior comment when it was copied to LockAcquire().
2018-06-14 12:17:51 +01:00
Simon Riggs 802bde87ba Remove cut-off bug from RunningTransactionData
32ac7a118f tried to fix a Hot Standby issue
reported by Greg Stark, but in doing so caused
a different bug to appear, noted by Andres Freund.

Revoke the core changes from 32ac7a118f,
leaving in its place a minor change in code
ordering and comments to explain for the future.
2018-06-14 12:02:41 +01:00
Tom Lane 91781335ed Code review for match_clause_to_partition_key().
Fix inconsistent decisions about NOMATCH vs UNSUPPORTED result codes.
If we're going to cater for partkeys that have the same expression and
different collations, surely we should also support partkeys with the
same expression and different opclasses.

Clean up shaky handling of commuted opclauses, eg checking the wrong
operator to see what its negator is.  This wouldn't cause any actual
bugs given a sane opclass definition, but it doesn't seem helpful to
expend more code to be less correct.

Improve handling of null elements in ScalarArrayOp arrays: in the
"op ALL" case, we can conclude they result in an unsatisfiable clause.

Minor cosmetic changes and comment improvements.
2018-06-13 16:10:30 -04:00
Tom Lane 19832753f1 Fix some ill-chosen names for globally-visible partition support functions.
"compute_hash_value" is particularly gratuitously generic, but IMO
all of these ought to have names clearly related to partitioning.
2018-06-13 13:18:02 -04:00
Tom Lane e23bae82cf Fix up run-time partition pruning's use of relcache's partition data.
The previous coding saved pointers into the partitioned table's relcache
entry, but then closed the relcache entry, causing those pointers to
nominally become dangling.  Actual trouble would be seen in the field
only if a relcache flush occurred mid-query, but that's hardly out of
the question.

While we could fix this by copying all the data in question at query
start, it seems better to just hold the relcache entry open for the
whole query.

While at it, improve the handling of support-function lookups: do that
once per query not once per pruning test.  There's still something to be
desired here, in that we fail to exploit the possibility of caching data
across queries in the fn_extra fields of the relcache's FmgrInfo structs,
which could happen if we just used those structs in-place rather than
copying them.  However, combining that with the possibility of per-query
lookups of cross-type comparison functions seems to require changes in the
APIs of a lot of the pruning support functions, so it's too invasive to
consider as part of this patch.  A win would ensue only for complex
partition key data types (e.g. arrays), so it may not be worth the
trouble.

David Rowley and Tom Lane

Discussion: https://postgr.es/m/17850.1528755844@sss.pgh.pa.us
2018-06-13 12:03:26 -04:00
Andrew Dunstan e3eb8be77e Exclude files in .git from list of perl files
The .git directory might contain perl files, as hooks, for example.
Since we have no control over these they should be excluded from things
like our perlcritic checks.

Per offline report from Mike Blackwell.
2018-06-12 14:54:43 -04:00
Andres Freund a54e1f1587 Fix bugs in vacuum of shared rels, by keeping their relcache entries current.
When vacuum processes a relation it uses the corresponding relcache
entry's relfrozenxid / relminmxid as a cutoff for when to remove
tuples etc. Unfortunately for nailed relations (i.e. critical system
catalogs) bugs could frequently lead to the corresponding relcache
entry being stale.

This set of bugs could cause actual data corruption as vacuum would
potentially not remove the correct row versions, potentially reviving
them at a later point.  After 699bf7d05c some corruptions in this vein
were prevented, but the additional error checks could also trigger
spuriously. Examples of such errors are:
  ERROR: found xmin ... from before relfrozenxid ...
and
  ERROR: found multixact ... from before relminmxid ...
To be caused by this bug the errors have to occur on system catalog
tables.

The two bugs are:

1) Invalidations for nailed relations were ignored, based on the
   theory that the relcache entry for such tables doesn't
   change. Which is largely true, except for fields like relfrozenxid
   etc.  This means that changes to relations vacuumed in other
   sessions weren't picked up by already existing sessions.  Luckily
   autovacuum doesn't have particularly longrunning sessions.

2) For shared *and* nailed relations, the shared relcache init file
   was never invalidated while running.  That means that for such
   tables (e.g. pg_authid, pg_database) it's not just already existing
   sessions that are affected, but even new connections are as well.
   That explains why the reports usually were about pg_authid et. al.

To fix 1), revalidate the rd_rel portion of a relcache entry when
invalid. This implies a bit of extra complexity to deal with
bootstrapping, but it's not too bad.  The fix for 2) is simpler,
simply always remove both the shared and local init files.

Author: Andres Freund
Reviewed-By: Alvaro Herrera
Discussion:
    https://postgr.es/m/20180525203736.crkbg36muzxrjj5e@alap3.anarazel.de
    https://postgr.es/m/CAMa1XUhKSJd98JW4o9StWPrfS=11bPgG+_GDMxe25TvUY4Sugg@mail.gmail.com
    https://postgr.es/m/CAKMFJucqbuoDRfxPDX39WhA3vJyxweRg_zDVXzncr6+5wOguWA@mail.gmail.com
    https://postgr.es/m/CAGewt-ujGpMLQ09gXcUFMZaZsGJC98VXHEFbF-tpPB0fB13K+A@mail.gmail.com
Backpatch: 9.3-
2018-06-12 11:13:21 -07:00
Peter Eisentraut 8a07ebb3c1 Convert debug message from ereport to elog 2018-06-12 11:33:39 -04:00
Tom Lane bdc643e5e4 Fix access to just-closed relcache entry.
It might be impossible for this to cause a problem in non-debug builds,
since there'd be no opportunity for the relcache entry to get recycled
before the fetch.  It blows up nicely with -DRELCACHE_FORCE_RELEASE plus
valgrind, though.

Evidently introduced by careless refactoring in commit f0e44751d.
Back-patch accordingly.

Discussion: https://postgr.es/m/27543.1528758304@sss.pgh.pa.us
2018-06-11 19:18:04 -04:00
Michael Paquier f8795d2ec8 Fix oversight from 9e149c8 with spin-lock handling
Calling an external function while a pin-lock is held is a bad idea as
those are designed to be short-lived.  The stress of a first commit into
a large git history may contribute to that.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/20180611164952.vmxdpdpirdtkdsz6@alap3.anarazel.de
2018-06-12 06:52:34 +09:00
Tom Lane 69025c5a07 Improve ExecFindInitialMatchingSubPlans's subplan renumbering logic.
We don't need two passes if we scan child partitions before parents,
as that way the children's present_parts are up to date before they're
needed.  I (tgl) think there's actually a bug being fixed here, for the
case of an intermediate partitioned table with no direct leaf children,
but haven't attempted to construct a test case to prove it.

David Rowley

Discussion: https://postgr.es/m/CAKJS1f-6GODRNgEtdPxCnAPme2h2hTztB6LmtfdmcYAAOE0kQg@mail.gmail.com
2018-06-11 17:35:53 -04:00
Tom Lane 4e23236403 Improve commentary about run-time partition pruning data structures.
No code changes except for a couple of new Asserts.

David Rowley and Tom Lane

Discussion: https://postgr.es/m/CAKJS1f-6GODRNgEtdPxCnAPme2h2hTztB6LmtfdmcYAAOE0kQg@mail.gmail.com
2018-06-11 17:35:53 -04:00
Peter Eisentraut e5d11b91e4 Adjust error message
Makes it look more similar to other ones, and avoids the need for
pluralization.
2018-06-11 17:19:11 -04:00