postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2024-07-31 03:43:22 +02:00

Author	SHA1	Message	Date
Tom Lane	2f48ede080	Avoid using a cursor in plpgsql's RETURN QUERY statement. plpgsql has always executed the query given in a RETURN QUERY command by opening it as a cursor and then fetching a few rows at a time, which it turns around and dumps into the function's result tuplestore. The point of this was to keep from blowing out memory with an oversized SPITupleTable result (note that while a tuplestore can spill tuples to disk, SPITupleTable cannot). However, it's rather inefficient, both because of extra data copying and because of executor entry/exit overhead. In recent versions, a new performance problem has emerged: use of a cursor prevents use of a parallel plan for the executed query. We can improve matters by skipping use of a cursor and having the executor push result tuples directly into the function's result tuplestore. However, a moderate amount of new infrastructure is needed to make that idea work: * We can use the existing tstoreReceiver.c DestReceiver code to funnel executor output to the tuplestore, but it has to be extended to support plpgsql's requirement for possibly applying a tuple conversion map. * SPI needs to be extended to allow use of a caller-supplied DestReceiver instead of its usual receiver that puts tuples into a SPITupleTable. Two new API calls are needed to handle both the RETURN QUERY and RETURN QUERY EXECUTE cases. I also felt that I didn't want these new API calls to use the legacy method of specifying query parameter values with "char" null flags (the old ' '/'n' convention); rather they should accept ParamListInfo objects containing the parameter type and value info. This required a bit of additional new infrastructure since we didn't yet have any parse analysis callback that would interpret $N parameter symbols according to type data supplied in a ParamListInfo. There seems to be no harm in letting makeParamList install that callback by default, rather than leaving a new ParamListInfo's parserSetup hook as NULL. 
(Indeed, as of HEAD, I couldn't find anyplace that was using the parserSetup field at all; plpgsql was using parserSetupArg for its own purposes, but parserSetup seemed to be write-only.) We can actually get plpgsql out of the business of using legacy null flags altogether, and using ParamListInfo instead of its ad-hoc PreparedParamsData structure; but this requires inventing one more SPI API call that can replace SPI_cursor_open_with_args. That seems worth doing, though. SPI_execute_with_args and SPI_cursor_open_with_args are now unused anywhere in the core PG distribution. Perhaps someday we could deprecate/remove them. But cleaning up the crufty bits of the SPI API is a task for a different patch. Per bug #16040 from Jeremy Smith. This is unfortunately too invasive to consider back-patching. Patch by me; thanks to Hamid Akhtar for review. Discussion: https://postgr.es/m/16040-eaacad11fecfb198@postgresql.org	2020-06-12 12:14:32 -04:00
Michael Paquier	aaf8c99050	Fix typos and some format mistakes in comments Author: Justin Pryzby Discussion: https://postgr.es/m/20200612023709.GC14879@telsasoft.com	2020-06-12 21:05:10 +09:00
Peter Eisentraut	ffd2582297	Make more use of RELKIND_HAS_STORAGE() Make use of RELKIND_HAS_STORAGE() where appropriate, instead of listing out the relkinds individually. No behavior change intended. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/7a22bf51-2480-d999-1794-191ba67ff47c%402ndquadrant.com	2020-06-12 09:10:26 +02:00
Thomas Munro	7aa4fb5925	Improve comments for [Heap]CheckForSerializableConflictOut(). Rewrite the documentation of these functions, in light of recent bug fix commit `5940ffb2`. Back-patch to 13 where the check-for-conflict-out code was split up into AM-specific and generic parts, and new documentation was added that now looked wrong. Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/db7b729d-0226-d162-a126-8a8ab2dc4443%40jepsen.io	2020-06-12 10:55:38 +12:00
Tom Lane	77a3be32f7	Fix mishandling of NaN counts in numeric_[avg_]combine. When merging two NumericAggStates, the code missed adding the new state's NaNcount unless its N was also nonzero; since those counts are independent, this is wrong. This would only have visible effect if some partial aggregate scans found only NaNs while earlier ones found only non-NaNs; then we could end up falsely deciding that there were no NaNs and fail to return a NaN final result as expected. That's pretty improbable, so it's no surprise this hasn't been reported from the field. Still, it's a bug. I didn't try to produce a regression test that would show the bug, but I did notice that these functions weren't being reached at all in our regression tests, so I improved the tests to at least exercise them. With these additions, I see pretty complete code coverage on the aggregation-related functions in numeric.c. Back-patch to 9.6 where this code was introduced. (I only added the improved test case as far back as v10, though, since the relevant part of aggregates.sql isn't there at all in 9.6.)	2020-06-11 17:38:42 -04:00
Jeff Davis	92c58fd948	Rework HashAgg GUCs. Eliminate enable_groupingsets_hash_disk, which was primarily useful for testing grouping sets that use HashAgg and spill. Instead, hack the table stats to convince the planner to choose hashed aggregation for grouping sets that will spill to disk. Suggested by Melanie Plageman. Rename enable_hashagg_disk to hashagg_avoid_disk_plan, and invert the meaning of on/off. The new name indicates more strongly that it only affects the planner. Also, the word "avoid" is less definite, which should avoid surprises when HashAgg still needs to use the disk. Change suggested by Justin Pryzby, though I chose a different GUC name. Discussion: https://postgr.es/m/CAAKRu_aisiENMsPM2gC4oUY1hHG3yrCwY-fXUg22C6_MJUwQdA%40mail.gmail.com Discussion: https://postgr.es/m/20200610021544.GA14879@telsasoft.com Backpatch-through: 13	2020-06-11 12:57:43 -07:00
Peter Geoghegan	5940ffb221	Avoid update conflict out serialization anomalies. SSI's HeapCheckForSerializableConflictOut() test failed to correctly handle conditions involving a concurrently inserted tuple which is later concurrently updated by a separate transaction. A SELECT statement that called HeapCheckForSerializableConflictOut() could end up using the same XID (updater's XID) for both the original tuple, and the successor tuple, missing the XID of the xact that created the original tuple entirely. This only happened when neither tuple from the chain was visible to the transaction's MVCC snapshot. The observable symptoms of this bug were subtle. A pair of transactions could commit, with the later transaction failing to observe the effects of the earlier transaction (because of the confusion created by the update to the non-visible row). This bug dates all the way back to commit `dafaa3ef`, which added SSI. To fix, make sure that we check the xmin of concurrently inserted tuples that happen to also have been updated concurrently. Author: Peter Geoghegan Reported-By: Kyle Kingsbury Reviewed-By: Thomas Munro Discussion: https://postgr.es/m/db7b729d-0226-d162-a126-8a8ab2dc4443@jepsen.io Backpatch: All supported versions	2020-06-11 10:09:47 -07:00
Peter Eisentraut	3fbd4bb6f4	Refactor DROP LANGUAGE grammar Fold it into the generic DropStmt. Discussion: https://www.postgresql.org/message-id/flat/163c00a5-f634-ca52-fc7c-0e53deda8735%402ndquadrant.com	2020-06-11 11:18:15 +02:00
Peter Eisentraut	5333e014ab	Remove deprecated syntax from CREATE/DROP LANGUAGE Remove the option to specify the language name as a single-quoted string. This has been obsolete since `ee8ed85da3`. Removing it allows better grammar refactoring. The syntax of the CREATE FUNCTION LANGUAGE clause is not changed. Discussion: https://www.postgresql.org/message-id/flat/163c00a5-f634-ca52-fc7c-0e53deda8735%402ndquadrant.com	2020-06-11 10:26:12 +02:00
Peter Eisentraut	c4325cefba	Fold AlterForeignTableStmt into AlterTableStmt All other relation types are handled by AlterTableStmt, so it's unnecessary to make a different statement for foreign tables. Discussion: https://www.postgresql.org/message-id/flat/163c00a5-f634-ca52-fc7c-0e53deda8735%402ndquadrant.com	2020-06-11 08:21:24 +02:00
Peter Eisentraut	c2bd1fec32	Remove redundant grammar symbols access_method, database_name, and index_name are all just name, and they are not used consistently for their alleged purpose, so remove them. They have been around since ancient times but have no current reason for existing. Removing them can simplify future grammar refactoring. Discussion: https://www.postgresql.org/message-id/flat/163c00a5-f634-ca52-fc7c-0e53deda8735%402ndquadrant.com	2020-06-10 22:58:46 +02:00
Peter Eisentraut	c7eab0e97e	Change default of password_encryption to scram-sha-256 Also, the legacy values on/true/yes/1 for password_encryption that mapped to md5 are removed. The only valid values are now scram-sha-256 and md5. Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org> Discussion: https://www.postgresql.org/message-id/flat/d5b0ad33-7d94-bdd1-caac-43a1c782cab2%402ndquadrant.com	2020-06-10 16:42:55 +02:00
Peter Eisentraut	5a4ada71a8	Update description of parameter password_encryption The previous description string still described the pre-PostgreSQL 10 (pre `eb61136dc7`) behavior of selecting between encrypted and unencrypted, but it is now choosing between encryption algorithms.	2020-06-10 11:57:41 +02:00
Amit Kapila	c5c000b103	Fix ReorderBuffer memory overflow check. Commit `cec2edfa78` introduced logical_decoding_work_mem to limit ReorderBuffer memory usage. We spill the changes once the memory occupied by changes exceeds logical_decoding_work_mem. There was an assumption in the code that by evicting the largest (sub)transaction we will come under the memory limit as the selected transaction will be at least as large as the most recent change (which caused us to go over the memory limit). However, that is not true because a user can reduce the logical_decoding_work_mem to a smaller value before the most recent change. We fix it by continuing to evict transactions until we are under the memory limit. Reported-by: Fujii Masao Author: Amit Kapila Reviewed-by: Fujii Masao Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/2b7ba291-22e0-a187-d167-9e5309a3458d@oss.nttdata.com	2020-06-10 10:20:10 +05:30
Peter Eisentraut	350f47786c	Spelling adjustments similar to `0fd2a79a63`	2020-06-09 10:41:41 +02:00
Peter Eisentraut	b1d32d3e32	Unify drop-by-OID functions There are a number of Remove${Something}ById() functions that are essentially identical in structure and only different in which catalog they are working on. Refactor this to be one generic function. The information about which oid column, index, etc. to use was already available in ObjectProperty for most catalogs, in a few cases it was easily added. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/331d9661-1743-857f-1cbb-d5728bcd62cb%402ndquadrant.com	2020-06-09 09:39:46 +02:00
David Rowley	b27c90bbe4	Fix invalid function references in a few comments These appear to have been forgotten when the functions were renamed in `1fd687a03`. Backpatch-through: 13, where the functions were renamed	2020-06-09 18:43:15 +12:00
Jeff Davis	1b2c29469a	Fix HashAgg regression from choosing too many initial buckets. Diagnosis by Andres. Reported-by: Pavel Stehule Discussion: https://postgr.es/m/CAFj8pRDLVakD5Aagt3yZeEQeTeEWaS3YE5h8XC3Q3qJ6TYkc2Q%40mail.gmail.com Backpatch-through: 13	2020-06-08 21:04:16 -07:00
Peter Eisentraut	cbcc8726bb	Update snowball Update to snowball tag v2.0.0. Major changes are new stemmers for Basque, Catalan, and Hindi. Discussion: https://www.postgresql.org/message-id/flat/a8eeabd6-2be1-43fe-401e-a97594c38478%402ndquadrant.com	2020-06-08 08:07:15 +02:00
Thomas Munro	57cb806308	Fix locking bugs that could corrupt pg_control. The redo routines for XLOG_CHECKPOINT_{ONLINE,SHUTDOWN} must acquire ControlFileLock before modifying ControlFile->checkPointCopy, or the checkpointer could write out a control file with a bad checksum. Likewise, XLogReportParameters() must acquire ControlFileLock before modifying ControlFile and calling UpdateControlFile(). Back-patch to all supported releases. Author: Nathan Bossart <bossartn@amazon.com> Author: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/70BF24D6-DC51-443F-B55A-95735803842A%40amazon.com	2020-06-08 13:57:24 +12:00
Michael Paquier	879ad9f90e	Fix crash in WAL sender when starting physical replication Since database connections can be used with WAL senders in 9.4, it is possible to use physical replication. This commit fixes a crash when starting physical replication with a WAL sender using a database connection, caused by the refactoring done in `850196b`. There have been discussions about forbidding the use of physical replication in a database connection, but this is left for later, taking care only of the crash new to 13. While on it, add a test to check for a failure when attempting logical replication if the WAL sender does not have a database connection. This part is extracted from a larger patch by Kyotaro Horiguchi. Reported-by: Vladimir Sitnikov Author: Michael Paquier, Kyotaro Horiguchi Reviewed-by: Kyotaro Horiguchi, Álvaro Herrera Discussion: https://postgr.es/m/CAB=Je-GOWMj1PTPkeUhjqQp-4W3=nW-pXe2Hjax6rJFffB5_Aw@mail.gmail.com Backpatch-through: 13	2020-06-08 10:12:24 +09:00
Tom Lane	b5d69b7c22	pgindent run prior to branching v13. pgperltidy and reformat-dat-files too, though those didn't find anything to change.	2020-06-07 16:57:08 -04:00
Jeff Davis	1fbb6c93df	Fix platform-specific performance regression in logtape.c. Commit `24d85952` made a change that indirectly caused a performance regression by triggering a change in the way GCC optimizes memcpy() on some platforms. The behavior seemed to contradict a GCC document, so I filed a report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95556 This patch implements a narrow workaround which eliminates the regression I observed. The workaround is benign enough that it seems unlikely to cause a different regression on another platform. Discussion: https://postgr.es/m/99b2eab335c1592c925d8143979c8e9e81e1575f.camel@j-davis.com	2020-06-07 09:25:55 -07:00
Peter Eisentraut	0fd2a79a63	Spelling adjustments	2020-06-07 15:06:51 +02:00
Peter Eisentraut	f4c88ce1a2	Formatting and punctuation improvements in postgresql.conf.sample	2020-06-07 14:35:12 +02:00
Tom Lane	0c882e52a8	Improve ineq_histogram_selectivity's behavior for non-default orderings. ineq_histogram_selectivity() can be invoked in situations where the ordering we care about is not that of the column's histogram. We could be considering some other collation, or even more drastically, the query operator might not agree at all with what was used to construct the histogram. (We'll get here for anything using scalarineqsel-based estimators, so that's quite likely to happen for extension operators.) Up to now we just ignored this issue and assumed we were dealing with an operator/collation whose sort order exactly matches the histogram, possibly resulting in junk estimates if the binary search gets confused. It's past time to improve that, since the use of nondefault collations is increasing. What we can do is verify that the given operator and collation match what's recorded in pg_statistic, and use the existing code only if so. When they don't match, instead execute the operator against each histogram entry, and take the fraction of successes as our selectivity estimate. This gives an estimate that is probably good to about 1/histogram_size, with no assumptions about ordering. (The quality of the estimate is likely to degrade near the ends of the value range, since the two orderings probably don't agree on what is an extremal value; but this is surely going to be more reliable than what we did before.) At some point we might further improve matters by storing more than one histogram calculated according to different orderings. But this code would still be good fallback logic when no matches exist, so that is not an argument for not doing this. While here, also improve get_variable_range() to deal more honestly with non-default collations. 
This isn't back-patchable, because it requires adding another argument to ineq_histogram_selectivity, and because it might have significant impact on the estimation results for extension operators relying on scalarineqsel --- mostly for the better, one hopes, but in any case destabilizing plan choices in back branches is best avoided. Per investigation of a report from James Lucas. Discussion: https://postgr.es/m/CAAFmbbOvfi=wMM=3qRsPunBSLb8BFREno2oOzSBS=mzfLPKABw@mail.gmail.com	2020-06-05 16:55:27 -04:00
Tom Lane	044c99bc56	Use query collation, not column's collation, while examining statistics. Commit `5e0928005` changed the planner so that, instead of blindly using DEFAULT_COLLATION_OID when invoking operators for selectivity estimation, it would use the collation of the column whose statistics we're considering. This was recognized as still being not quite the right thing, but it seemed like a good incremental improvement. However, shortly thereafter we introduced nondeterministic collations, and that creates cases where operators can fail if they're passed the wrong collation. We don't want planning to fail in cases where the query itself would work, so this means that we must use the query's collation when invoking operators for estimation purposes. The only real problem this creates is in ineq_histogram_selectivity, where the binary search might produce a garbage answer if we perform comparisons using a different collation than the column's histogram is ordered with. However, when the query's collation is significantly different from the column's default collation, the estimate we previously generated would be pretty irrelevant anyway; so it's not clear that this will result in noticeably worse estimates in practice. (A follow-on patch will improve this situation in HEAD, but it seems too invasive for back-patch.) The patch requires changing the signatures of mcv_selectivity and allied functions, which are exported and very possibly are used by extensions. In HEAD, I just did that, but an API/ABI break of this sort isn't acceptable in stable branches. Therefore, in v12 the patch introduces "mcv_selectivity_ext" and so on, with signatures matching HEAD, and makes the old functions into wrappers that assume DEFAULT_COLLATION_OID should be used. That does not match the prior behavior, but it should avoid risk of failure in most cases. (In practice, I think most extension datatypes aren't collation-aware, so the change probably doesn't matter to them.) 
Per report from James Lucas. Back-patch to v12 where the problem was introduced. Discussion: https://postgr.es/m/CAAFmbbOvfi=wMM=3qRsPunBSLb8BFREno2oOzSBS=mzfLPKABw@mail.gmail.com	2020-06-05 16:18:50 -04:00
Michael Paquier	1127f0e392	Preserve pg_index.indisreplident across REINDEX CONCURRENTLY If the flag value is lost, logical decoding would work the same way as REPLICA IDENTITY NOTHING, meaning that old tuple values would no longer be included in the changes produced by logical decoding. Author: Michael Paquier Reviewed-by: Euler Taveira Discussion: https://postgr.es/m/20200603065340.GK89559@paquier.xyz Backpatch-through: 12	2020-06-05 10:26:02 +09:00
Tom Lane	a9632830bb	Reject "23:59:60.nnn" in datetime input. It's intentional that we don't allow values greater than 24 hours, while we do allow "24:00:00" as well as "23:59:60" as inputs. However, the range check was miscoded in such a way that it would accept "23:59:60.nnn" with a nonzero fraction. For time or timetz, the stored result would then be greater than "24:00:00" which would fail dump/reload, not to mention possibly confusing other operations. Fix by explicitly calculating the result and making sure it does not exceed 24 hours. (This calculation is redundant with what will happen later in tm2time or tm2timetz. Maybe someday somebody will find that annoying enough to justify refactoring to avoid the duplication; but that seems too invasive for a back-patched bug fix, and the cost is probably unmeasurable anyway.) Note that this change also rejects such input as the time portion of a timestamp(tz) value. Back-patch to v10. The bug is far older, but to change this pre-v10 we'd need to ensure that the logic behaves sanely with float timestamps, which is possibly nontrivial due to roundoff considerations. Doesn't really seem worth troubling with. Per report from Christoph Berg. Discussion: https://postgr.es/m/20200520125807.GB296739@msg.df7cb.de	2020-06-04 16:42:23 -04:00
Michael Paquier	3fa44a3004	Fix comment in be-secure-openssl.c Since `573bd08`, hardcoded DH parameters have been moved to a different file, making the comment on top of load_dh_buffer() incorrect. Author: Daniel Gustafsson Discussion: https://postgr.es/m/D9492CCB-9A91-4181-A847-1779630BE2A7@yesql.se	2020-06-04 13:02:59 +09:00
Michael Paquier	c1669fd581	Fix instance of elog() called while holding a spinlock This broke the project rule to not call any complex code while a spinlock is held. Issue introduced by `b89e151`. Discussion: https://postgr.es/m/20200602.161518.1399689010416646074.horikyota.ntt@gmail.com Backpatch-through: 9.5	2020-06-04 10:17:49 +09:00
Tom Lane	f88bd3139f	Don't call palloc() while holding a spinlock, either. Fix some more violations of the "only straight-line code inside a spinlock" rule. These are hazardous not only because they risk holding the lock for an excessively long time, but because it's possible for palloc to throw elog(ERROR), leaving a stuck spinlock behind. copy_replication_slot() had two separate places that did pallocs while holding a spinlock. We can make the code simpler and safer by copying the whole ReplicationSlot struct into a local variable while holding the spinlock, and then referencing that copy. (While that's arguably more cycles than we really need to spend holding the lock, the struct isn't all that big, and this way seems far more maintainable than copying fields piecemeal. Anyway this is surely much cheaper than a palloc.) That bug goes back to v12. InvalidateObsoleteReplicationSlots() not only did a palloc while holding a spinlock, but for extra sloppiness then leaked the memory --- probably for the lifetime of the checkpointer process, though I didn't try to verify that. Fortunately that silliness is new in HEAD. pg_get_replication_slots() had a cosmetic violation of the rule, in that it only assumed it's safe to call namecpy() while holding a spinlock. Still, that's a hazard waiting to bite somebody, and there were some other cosmetic coding-rule violations in the same function, so clean it up. I back-patched this as far as v10; the code exists before that but it looks different, and this didn't seem important enough to adapt the patch further back. Discussion: https://postgr.es/m/20200602.161518.1399689010416646074.horikyota.ntt@gmail.com	2020-06-03 12:36:23 -04:00
Fujii Masao	caa3c4242c	Don't call elog() while holding spinlock. Previously UpdateSpillStats() called elog(DEBUG2) while holding the spinlock even though the local variables that the elog() accesses don't need to be protected by the lock. Since spinlocks are intended for very short-term locks, they should not be used when calling elog(DEBUG2). So this commit moves that elog() out of spinlock period. Author: Kyotaro Horiguchi Reviewed-by: Amit Kapila and Fujii Masao Discussion: https://postgr.es/m/20200602.161518.1399689010416646074.horikyota.ntt@gmail.com	2020-06-02 19:21:04 +09:00
Peter Eisentraut	42181b1015	Use correct and consistent unit abbreviation	2020-06-01 21:18:36 +02:00
Michael Paquier	ce1c5b9ae8	Fix use-after-release mistake in currtid() and currtid2() for views This issue has been present since the introduction of this code as of `a3519a2` from 2002, and has been found by buildfarm member prion that uses RELCACHE_FORCE_RELEASE via the tests introduced recently in `e786be5`. Discussion: https://postgr.es/m/20200601022055.GB4121@paquier.xyz Backpatch-through: 9.5	2020-06-01 14:41:18 +09:00
Michael Paquier	e786be5fcb	Fix crashes with currtid() and currtid2() A relation that has no storage initializes rd_tableam to NULL, which caused those two functions to crash because of a pointer dereference. Note that in 11 and older versions, this has always failed with a confusing error "could not open file". These two functions are used by the Postgres ODBC driver, which requires them only when connecting to a backend strictly older than 8.1. When connected to 8.2 or a newer version, the driver uses a RETURNING clause instead whose support has been added in 8.2, so it should be possible to just remove both functions in the future. This is left as an issue to address later. While on it, add more regression tests for those functions as we never really had coverage for them, and for aggregates of TIDs. Reported-by: Jaime Casanova, via sqlsmith Author: Michael Paquier Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/CAJGNTeO93u-5APMga6WH41eTZ3Uee9f3s8dCpA-GSSqNs1b=Ug@mail.gmail.com Backpatch-through: 12	2020-06-01 10:32:06 +09:00
Tomas Vondra	4cad2534da	Use CP_SMALL_TLIST for hash aggregate Commit `1f39bce021` added disk-based hash aggregation, which may spill incoming tuples to disk. It however did not request projection to make the tuples as narrow as possible, which may mean having to spill much more data than necessary (increasing I/O, pushing other stuff from page cache, etc.). This adds CP_SMALL_TLIST in places that may use hash aggregation - we do that only for AGG_HASHED. It's unnecessary for AGG_SORTED, because that either uses explicit Sort (which already does projection) or pre-sorted input (which does not need spilling to disk). Author: Tomas Vondra Reviewed-by: Jeff Davis Discussion: https://postgr.es/m/20200519151202.u2p2gpiawoaznsv2%40development	2020-05-31 14:43:13 +02:00
Andres Freund	6a4a335b84	llvmjit: Fix building against LLVM 11 by removing unnecessary include. LLVM has removed this header, in the branch that will become llvm 11. But as it turns out we didn't actually need it, so just remove it. Author: Jesse Zhang <sbjesse@gmail.com> Discussion: https://postgr.es/m/CAGf+fX7bvtP0YXMu7pOsu_NwhxW6dArTkxb=jt7M2-UJkyJ_3g@mail.gmail.com Backpatch: 11, where JIT support using llvm was introduced.	2020-05-28 15:24:28 -07:00
Joe Conway	887cdff4dc	Add CHECK_FOR_INTERRUPTS() to the repeat() function The repeat() function loops for potentially a long time without ever checking for interrupts. This prevents, for example, a query cancel from interrupting until the work is all done. Fix by inserting a CHECK_FOR_INTERRUPTS() into the loop. Backpatch to all supported versions. Discussion: https://www.postgresql.org/message-id/flat/8692553c-7fe8-17d9-cbc1-7cddb758f4c6%40joeconway.com	2020-05-28 13:19:00 -04:00
Heikki Linnakangas	5b1c61e8b8	Add missing error code to "cannot attach index ..." error. ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE was used in an ereport with the same message but different errdetail a few lines earlier, so use that here as well. Backpatch-through: 11	2020-05-28 12:37:00 +03:00
Michael Paquier	55ca50deb8	Fix some mentions to memory units in postgresql.conf.sample The default unit for max_slot_wal_keep_size is megabytes. While on it, also change temp_file_limit to use a more consistent wording. Reported-by: Jeff Janes, Fujii Masao Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/CAMkU=1wWZhhjpwRFKJ9waQGxxROeC0P6UqPvb90fAaGz7dhoHA@mail.gmail.com	2020-05-28 15:39:05 +09:00
Jeff Davis	896ddf9b3c	Avoid fragmentation of logical tapes when writing concurrently. Disk-based HashAgg relies on writing to multiple tapes concurrently. Avoid fragmentation of the tapes' blocks by preallocating many blocks for a tape at once. No file operations are performed during preallocation; only the block numbers are reserved. Reviewed-by: Tomas Vondra Discussion: https://postgr.es/m/20200519151202.u2p2gpiawoaznsv2%40development	2020-05-26 16:49:43 -07:00
Peter Eisentraut	add4211600	Add lcov exclusion markers to jsonpath scanner This was done for all scanners in `4211673622` but not added to the new one.	2020-05-26 14:09:36 +02:00
Bruce Momjian	ac5852fb30	gss: add missing references to hostgssenc and hostnogssenc These were missed when these were added to pg_hba.conf in PG 12; updates docs and pg_hba.conf.sample. Reported-by: Arthur Nascimento Bug: 16380 Discussion: https://postgr.es/m/20200421182736.GG19613@momjian.us Backpatch-through: 12	2020-05-25 20:19:28 -04:00
Noah Misch	587322de36	Reconcile nodes/*funcs.c. The stmt_len changes do not affect behavior. LimitPath has no other support functions, so that part changes only debugging output.	2020-05-25 16:23:48 -07:00
Michael Paquier	a995b371ae	Add missing invocations to object access hooks The following commands have been missing calls to object access hooks InvokeObjectPost{Create|Alter}Hook normally applied to all commands: - ALTER RULE RENAME TO - ALTER USER MAPPING - CREATE ACCESS METHOD - CREATE STATISTICS Thanks also to Robert Haas for the discussion. Author: Mark Dilger Reviewed-by: Álvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/435CD295-F409-44E0-91EC-DF32C7AFCD76@enterprisedb.com	2020-05-23 14:03:04 +09:00
Alvaro Herrera	c99cec96b8	Fix two typos in a comment They were introduced in 898e5e3290a7; backpatch to 12.	2020-05-22 17:39:16 -04:00
Peter Eisentraut	574925bfd0	Remove unnecessary cast Probably copied from nearby calls where it is necessary. But this one also casts away constness, so it was doubly annoying.	2020-05-22 10:36:49 +02:00
Etsuro Fujita	bb2ae6fa47	Adjust indentation in src/backend/optimizer/README. The previous indentation of optimizer functions was unclear; adjust the indentation dashes so that a deeper level of indentation indicates that the outer optimizer function calls the inner one. Author: Richard Guo, with additional change by me Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/CAMbWs4-U-ogzpchGsP2BBMufCss1hktm%2B%2BeTJK_dUC196pw0cQ%40mail.gmail.com	2020-05-22 15:45:00 +09:00
Noah Misch	3350fb5d1f	Clear some style deviations.	2020-05-21 08:31:16 -07:00
Tom Lane	c7d65a252c	part_strategy does not need its very own keyword classification. This should be plain old ColId. Making it so makes the grammar less complicated, and makes the compiled tables a kilobyte or so smaller (likely because they don't have to deal with a keyword classification that's not used anyplace else).	2020-05-19 20:09:59 -04:00
Peter Geoghegan	67b0b2dbf9	Reconsider nbtree page deletion assertion. Commit `624686abcf` added an assertion that verified that _bt_search successfully relocated the leaf page undergoing deletion. Page deletion cannot deal with the case where the descent stack is to the right of the page, so this seemed critical (deletion can only handle the case where the descent stack is to the left of the leaf/target page). However, the assertion went a bit too far. Since only a buffer pin is held on the leaf page throughout the call to _bt_search, nothing guarantees that it can't have split during this small window. And if it does actually split, _bt_search may end up "relocating" a page to the right of the original target leaf page. This scenario seems extremely unlikely, but it must still be considered. Remove the assertion, and document how we cope in this scenario.	2020-05-19 15:04:34 -07:00
Alvaro Herrera	c301c2e739	WITH TIES: number of rows is optional and defaults to one FETCH FIRST .. ONLY implements this correctly, but we neglected to include it for FETCH FIRST .. WITH TIES in commit `357889eb17`. Author: Vik Fearing Discussion: https://postgr.es/m/6aa690ef-551d-e24f-2690-c38c2442947c@postgresfriends.org	2020-05-18 19:28:46 -04:00
Peter Eisentraut	ac449d8801	Translation updates Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 031ca65d7825c3e539a3e62ea9d6630af12e6b6b	2020-05-18 12:49:30 +02:00
Magnus Hagander	a01debe3db	Fix typos in README Author: Daniel Gustafsson	2020-05-18 11:55:35 +02:00
Amit Kapila	7e041b0c1d	Fix comment in slot.c. Reported-by: Sawada Masahiko Author: Sawada Masahiko Reviewed-by: Amit Kapila Backpatch-through: 9.5 Discussion: https://postgr.es/m/CA+fd4k4Ws7M7YQ8PqSym5WB1y75dZeBTd1sZJUQdfe0KJQ-iSA@mail.gmail.com	2020-05-18 07:53:26 +05:30
Tom Lane	3048898e73	Mop-up for wait event naming issues. Synchronize the event names for parallel hash join waits with other event names, by getting rid of the slashes and dropping "-ing" suffixes. Rename ClogGroupUpdate to XactGroupUpdate, to match the new SLRU name. Move the ProcSignalBarrier event to the IPC category; it doesn't belong under IO. Also a bit more wordsmithing in the wait event documentation tables. Discussion: https://postgr.es/m/4505.1589640417@sss.pgh.pa.us	2020-05-16 21:00:11 -04:00
Michael Paquier	2c8dd05d6c	Make pg_stat_wal_receiver consistent with the WAL receiver's shmem info `d140f2f3` has renamed receivedUpto to flushedUpto, and has added writtenUpto to the WAL receiver's shared memory information, but pg_stat_wal_receiver was not consistent with that. This commit renames received_lsn to flushed_lsn, and adds a new column called written_lsn. Bump catalog version. Author: Michael Paquier Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/20200515090817.GA212736@paquier.xyz	2020-05-17 09:22:07 +09:00
Tom Lane	fa27dd40d5	Run pgindent with new pg_bsd_indent version 2.1.1. Thomas Munro fixed a longstanding annoyance in pg_bsd_indent, that it would misformat lines containing IsA() macros on the assumption that the IsA() call should be treated like a cast. This improves some other cases involving field/variable names that match typedefs, too. The only places that get worse are a couple of uses of the OpenSSL macro STACK_OF(); we'll gladly take that trade-off. Discussion: https://postgr.es/m/20200114221814.GA19630@alvherre.pgsql	2020-05-16 11:54:51 -04:00
Tom Lane	e02ad575d8	Final pgindent run with pg_bsd_indent version 2.1. This is just to provide a clean basis for comparison of the results of the new version. I did fix a typo that crept into `242dfcbaf`. Discussion: https://postgr.es/m/20200114221814.GA19630@alvherre.pgsql	2020-05-16 11:49:14 -04:00
Michael Paquier	7ccb2f54d9	Fix assertion with relation using REPLICA IDENTITY FULL in subscriber In a logical replication subscriber, a table using REPLICA IDENTITY FULL which has a primary key would try to use the primary key's index available to scan for a tuple, but an assertion only assumed as correct the case of an index associated to REPLICA IDENTITY USING INDEX. This commit corrects the assertion so that the use of a primary key index is a valid case. Reported-by: Dilip Kumar Analyzed-by: Dilip Kumar Author: Euler Taveira Reviewed-by: Michael Paquier, Masahiko Sawada Discussion: https://postgr.es/m/CAFiTN-u64S5bUiPL1q5kwpHNd0hRnf1OE-bzxNiOs5zo84i51w@mail.gmail.com Backpatch-through: 10	2020-05-16 18:15:18 +09:00
Tom Lane	474e7da648	Change locktype "speculative token" to "spectoken". It's just weird that this name wasn't chosen to look like an identifier. The suspicion that it wasn't thought about too hard is reinforced by the fact that it wasn't documented in the pg_locks view (until I did so, a day or two back). Update, and add a comment reminding future adjusters of this array to fix the docs too. Do some desultory wordsmithing on various entries in the wait events tables. Discussion: https://postgr.es/m/24595.1589326879@sss.pgh.pa.us	2020-05-15 21:47:34 -04:00
Alvaro Herrera	1d3743023e	Fix walsender error cleanup code In commit `850196b610` I (Álvaro) failed to handle the case of walsender shutting down on an error before setting up its 'xlogreader' pointer; the error handling code dereferences the pointer, causing a crash. Fix by testing the pointer before trying to dereference it. Kyotaro authored the code fix; I adopted Nathan's test case to be used by the TAP tests and added the necessary PostgresNode change. Reported-by: Nathan Bossart <bossartn@amazon.com> Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/C04FC24E-903D-4423-B312-6910E4D846E5@amazon.com	2020-05-15 20:00:52 -04:00
Tom Lane	14a9101091	Drop the redundant "Lock" suffix from LWLock wait event names. This was mostly confusing, especially since some wait events in this class had the suffix and some did not. While at it, stop exposing MainLWLockNames[] as a globally visible name; any code using that directly is almost certainly wrong, as its name has been misleading for some time. (GetLWLockIdentifier() is what to use instead.) Discussion: https://postgr.es/m/28683.1589405363@sss.pgh.pa.us	2020-05-15 19:55:56 -04:00
Tom Lane	8048404939	Fix bogus initialization of replication origin shared memory state. The previous coding zeroed out offsetof(ReplicationStateCtl, states) more bytes than it was entitled to, as a consequence of starting the zeroing from the wrong pointer (or, if you prefer, using the wrong calculation of how much to zero). It's unsurprising that this has not caused any reported problems, since it can be expected that the newly-allocated block is at the end of what we've used in shared memory, and we always make the shmem block substantially bigger than minimally necessary. Nonetheless, this is wrong and it could bite us someday; plus it's a dangerous model for somebody to copy. This dates back to the introduction of this code (commit `5aa235042`), so back-patch to all supported branches.	2020-05-15 19:05:39 -04:00
Tom Lane	36ac359d36	Rename assorted LWLock tranches. Choose names that fit into the conventions for wait event names (particularly, that multi-word names are in the style MultiWordName) and hopefully convey more information to non-hacker users than the previous names did. Also rename SerializablePredicateLockListLock to SerializablePredicateListLock; the old name was long enough to cause table formatting problems, plus the double occurrence of "Lock" seems confusing/error-prone. Also change a couple of particularly opaque LWLock field names. Discussion: https://postgr.es/m/28683.1589405363@sss.pgh.pa.us	2020-05-15 18:11:07 -04:00
Alvaro Herrera	a0ab4f4909	Add comments linking pg_strftime to timestamptz_to_str	2020-05-15 18:05:34 -04:00
Alvaro Herrera	242dfcbafa	Avoid killing btree items that are already dead _bt_killitems marks btree items dead when a scan leaves the page where they live, but it does so with only share lock (to improve concurrency). This was historically okay, since killing a dead item has no consequences. However, with the advent of data checksums and wal_log_hints, this action incurs a WAL full-page-image record of the page. Multiple concurrent processes would write the same page several times, leading to WAL bloat. The probability of this happening can be reduced by only killing items if they're not already dead, so change the code to do that. The problem could be eliminated completely by having _bt_killitems upgrade to exclusive lock upon seeing a killable item, but that would reduce concurrency so it's considered a cure worse than the disease. Backpatch all the way back to 9.5, since wal_log_hints was introduced in 9.4. Author: Masahiko Sawada <masahiko.sawada@2ndquadrant.com> Discussion: https://postgr.es/m/CA+fd4k6PeRj2CkzapWNrERkja5G0-6D-YQiKfbukJV+qZGFZ_Q@mail.gmail.com	2020-05-15 16:50:34 -04:00
Tom Lane	5da14938f7	Rename SLRU structures and associated LWLocks. Originally, the names assigned to SLRUs had no purpose other than being shmem lookup keys, so not a lot of thought went into them. As of v13, though, we're exposing them in the pg_stat_slru view and the pg_stat_reset_slru function, so it seems advisable to take a bit more care. Rename them to names based on the associated on-disk storage directories (which fortunately we did think about, to some extent; since those are also visible to DBAs, consistency seems like a good thing). Also rename the associated LWLocks, since those names are likewise user-exposed now as wait event names. For the most part I only touched symbols used in the respective modules' SimpleLruInit() calls, not the names of other related objects. This renaming could have been taken further, and maybe someday we will do so. But for now it seems undesirable to change the names of any globally visible functions or structs, so some inconsistency is unavoidable. (But I did terminate "oldserxid" with prejudice, as I found that name both unreadable and not descriptive of the SLRU's contents.) Table 27.12 needs re-alphabetization now, but I'll leave that till after the other LWLock renamings I have in mind. Discussion: https://postgr.es/m/28683.1589405363@sss.pgh.pa.us	2020-05-15 14:28:25 -04:00
Amit Kapila	a9cf48a4cf	Make COPY TO keep locks until the transaction end. COPY TO released the ACCESS SHARE lock immediately when it was done rather than holding on to it until the end of the transaction. This breaks the case where a REPEATABLE READ transaction could see an empty table if it repeats a COPY statement and somebody truncated the table in the meantime. Before `4dded12faa` the lock was also released after COPY FROM, but the commit failed to notice the irregularity in COPY TO. This is old behavior but doesn't seem important enough to backpatch. Author: Laurenz Albe, based on suggestion by Robert Haas and Tom Lane Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/7bcfc39d4176faf85ab317d0c26786953646a411.camel@cybertec.at	2020-05-15 08:10:00 +05:30
Michael Paquier	ff87fabef2	Remove duplicated comment block in event_trigger.c The reasons why event triggers are disabled in standalone mode are documented in the code path of ddl_command_start, and other places checking if standalone mode is enabled or not mention to refer to the comment for ddl_command_start, except for table_rewrite that duplicated the same explanation. Reported-by: David G. Johnston Discussion: https://postgr.es/m/CAKFQuwYqHtXpvr2mBJRwH9f+Y5y1GXw3rhbaAu0Dk2MoNevsmA@mail.gmail.com	2020-05-15 08:19:30 +09:00
Tom Lane	5cbfce562f	Initial pgindent and pgperltidy run for v13. Includes some manual cleanup of places that pgindent messed up, most of which weren't per project style anyway. Notably, it seems some people didn't absorb the style rules of commit `c9d297751`, because there were a bunch of new occurrences of function calls with a newline just after the left paren, all with faulty expectations about how the rest of the call would get indented.	2020-05-14 13:06:50 -04:00
Tom Lane	29c3e2dd5a	Collect built-in LWLock tranche names statically, not dynamically. There is little point in using the LWLockRegisterTranche mechanism for built-in tranche names. It wastes cycles, it creates opportunities for bugs (since failing to register a tranche name is a very hard-to-detect problem), and the lack of any centralized list of names encourages sloppy nonconformity in name choices. Moreover, since we have a centralized list of the tranches anyway in enum BuiltinTrancheIds, we're certainly not buying any flexibility in return for these disadvantages. Hence, nuke all the backend-internal LWLockRegisterTranche calls, and instead provide a const array of the builtin tranche names. (I have in mind to change a bunch of these names shortly, but this patch is just about getting them into one place.) Discussion: https://postgr.es/m/9056.1589419765@sss.pgh.pa.us	2020-05-14 11:10:31 -04:00
Heikki Linnakangas	e8abf585ab	Move check for fsync=off so that pendingOps still gets cleared. Commit `3eb77eba5a` moved the loop and refactored it, and inadvertently changed the effect of fsync=off so that it also skipped removing entries from the pendingOps table. That was not intentional, and leads to an assertion failure if you turn fsync on while the server is running and reload the config. Backpatch-through: 12- Reviewed-By: Thomas Munro Discussion: https://www.postgresql.org/message-id/3cbc7f4b-a5fa-56e9-9591-c886deb07513%40iki.fi	2020-05-14 08:39:26 +03:00
Amit Kapila	a169155453	Fix the MSVC build for versions 2015 and later. Visual Studio 2015 and later versions should still be able to do the same as Visual Studio 2012, but the declaration of locale_name is missing in _locale_t, causing the code compilation to fail, hence this falls back instead on to enumerating all system locales by using EnumSystemLocalesEx to find the required locale name. If the input argument is in Unix-style then we can get ISO Locale name directly by using GetLocaleInfoEx() with LCType as LOCALE_SNAME. In passing, change the documentation references of the now obsolete links. Note that this problem occurs only with NLS enabled builds. Author: Juan José Santamaría Flecha, Davinder Singh and Amit Kapila Reviewed-by: Ranier Vilela and Amit Kapila Backpatch-through: 9.5 Discussion: https://postgr.es/m/CAHzhFSFoJEWezR96um4-rg5W6m2Rj9Ud2CNZvV4NWc9tXV7aXQ@mail.gmail.com	2020-05-14 09:24:33 +05:30
Tom Lane	7fd89f4d7a	Fix async.c to not register any SLRU stats counts in the postmaster. Previously, AsyncShmemInit forcibly initialized the first page of the async SLRU, to save dealing with that case in asyncQueueAddEntries. But this is a poor tradeoff, since many installations do not ever use NOTIFY; for them, expending those cycles in AsyncShmemInit is a complete waste. Besides, this only saves a couple of instructions in asyncQueueAddEntries, which hardly seems likely to be measurable. The real reason to change this now, though, is that now that we track SLRU access stats, the existing code is causing the postmaster to accumulate some access counts, which then get inherited into child processes by fork(), messing up the statistics. Delaying the initialization into the first child that does a NOTIFY fixes that. Hence, we can revert `f3d23d83e`, which was an incorrect attempt at fixing that issue. Also, add an Assert to pgstat.c that should catch any future errors of the same sort. Discussion: https://postgr.es/m/8367.1589391884@sss.pgh.pa.us	2020-05-13 22:48:26 -04:00
Alvaro Herrera	17cc133f01	Dial back -Wimplicit-fallthrough to level 3 The additional pain from level 4 is excessive for the gain. Also revert all the source annotation changes to their original wordings, to avoid back-patching pain. Discussion: https://postgr.es/m/31166.1589378554@sss.pgh.pa.us	2020-05-13 15:31:14 -04:00
Tom Lane	81ca868630	Improve management of SLRU statistics collection. Instead of re-identifying which statistics bucket to use for a given SLRU on every counter increment, do it once during shmem initialization. This saves a fair number of cycles, and there's no real cost because we could not have a bucket assignment that varies over time or across backends anyway. Also, get rid of the ill-considered decision to let pgstat.c pry directly into SLRU's shared state; it's cleaner just to have slru.c pass the stats bucket number. In consequence of these changes, there's no longer any need to store an SLRU's LWLock tranche info in shared memory, so get rid of that, making this a net reduction in shmem consumption. (That partly reverts fe702a7b3.) This is basically code review for `28cac71bd`, so I also cleaned up some comments, removed a dangling extern declaration, fixed some things that should be static and/or const, etc. Discussion: https://postgr.es/m/3618.1589313035@sss.pgh.pa.us	2020-05-13 13:08:23 -04:00
Alvaro Herrera	850196b610	Adjust walsender usage of xlogreader, simplify APIs * Have both physical and logical walsender share a 'xlogreader' state struct for tracking state. This replaces the existing globals sendSeg and sendCxt. * Change WALRead not to receive XLogReaderState->seg and ->segcxt as separate arguments anymore; just use the ones from 'state'. This is made possible by the above change. * have the XLogReader segment_open contract require the callbacks to install the file descriptor in the state struct themselves instead of returning it. xlogreader was already ignoring any possible failed return from the callbacks, relying solely on them never returning. (This point is not altogether excellent, as it means the callbacks have to know more of XLogReaderState; but to really improve on that we would have to pass back error info from the callbacks to xlogreader. And the complexity would not be saved but instead just transferred to the callers of WALRead, which would have to learn how to throw errors from the open_segment callback in addition of, as currently, from pg_pread.) * segment_open no longer receives the 'segcxt' as a separate argument, since it's part of the XLogReaderState argument. Per comments from Kyotaro Horiguchi. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20200511203336.GA9913@alvherre.pgsql	2020-05-13 12:17:08 -04:00
Fujii Masao	043e3e0401	Use proper GetDatum function in pg_stat_get_slru(). This commit changes pg_stat_get_slru() so that it uses TimestampTzGetDatum() for stats_reset field because that field stores the timestamp with time zone value. Previously Int64GetDatum() was used. Author: Fujii Masao Reviewed-by: Tomas Vondra Discussion: https://postgr.es/m/b8784fe6-1401-ab35-aa14-d57b5bb8e312@oss.nttdata.com	2020-05-13 22:20:37 +09:00
Fujii Masao	f3d23d83ef	Initialize SLRU stats entries to zero. Previously since SLRUStats was not initialized, SLRU stats counters could begin with non-zero value. Which could lead to incorrect results in pg_stat_slru view. Author: Fujii Masao Reviewed-by: Tomas Vondra Discussion: https://postgr.es/m/976bbb73-a112-de3c-c488-b34b64609793@oss.nttdata.com	2020-05-13 22:19:25 +09:00
Alvaro Herrera	3e9744465d	Add -Wimplicit-fallthrough to CFLAGS and CXXFLAGS Use it at level 4, a bit more restrictive than the default level, and tweak our commanding comments to FALLTHROUGH. (However, leave zic.c alone, since it's external code; to avoid the warnings that would appear there, change CFLAGS for that file in the Makefile.) Author: Julien Rouhaud <rjuju123@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/20200412081825.qyo5vwwco3fv4gdo@nol Discussion: https://postgr.es/m/flat/E1fDenm-0000C8-IJ@gemulon.postgresql.org	2020-05-12 16:07:30 -04:00
Tomas Vondra	6a918c3ac8	Rework EXPLAIN format for incremental sort The explain format used by incremental sort was somewhat inconsistent with other nodes, making it harder to parse and understand. This commit addresses that by - adding an extra space to better separate groups of values - using colons instead of equal signs to separate key/value - properly capitalizing first letter of a key - using separate lines for full and pre-sorted groups These changes were proposed by Justin Pryzby and mostly copy the final explain format used to report WAL usage. Author: Justin Pryzby Reviewed-by: James Coleman Discussion: https://postgr.es/m/20200419023625.GP26953@telsasoft.com	2020-05-12 20:04:39 +02:00
Tomas Vondra	1a40d37a9f	Fix typos and improve incremental sort comments Author: Justin Pryzby, James Coleman Discussion: https://postgr.es/m/20200419023625.GP26953@telsasoft.com	2020-05-12 19:37:13 +02:00
Etsuro Fujita	2793bbe75e	Remove unnecessary #include. My oversight in commit `c8434d64c`.	2020-05-12 19:55:55 +09:00
Michael Paquier	078c9cd258	Fix comment in xlogutils.c The existing callers of XLogReadDetermineTimeline() performing recovery need to check a replay LSN position when determining on which timeline to read a WAL page. A portion of the comment describing this function said exactly that, while referring to a routine for fetching a write LSN, something not available in recovery. Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20200511.101619.2043820539323292957.horikyota.ntt@gmail.com	2020-05-12 14:43:57 +09:00
Peter Geoghegan	624686abcf	Adjust "root of to-be-deleted subtree" function. Restructure the function that locates the root of the to-be-deleted subtree during nbtree page deletion. Handle the conditions that make page deletion unsafe in a slightly more uniform way, and acknowledge the fact that the behavior with incomplete splits on internal pages is different (as pointed out in the nbtree README as of commit `35bc0ec7`). Also invent new terminology that avoids ambiguity around which pages are about to be deleted. Consistently use the term "to-be-deleted subtree", not the ambiguous term "branch". We were calling the subtree parent page the "top parent page", but that was quite misleading. The top parent page usually refers to a page unlinked from its siblings and marked deleted (during the second stage of page deletion). There was one kind of top parent page that we merely removed a downlink from, and another kind of top parent page that we actually marked deleted. Eliminate the ambiguity by inventing a new term ("subtree parent page") that refers to the former kind of page only.	2020-05-11 11:01:07 -07:00
Alvaro Herrera	a8be5364ac	Fix obsolete references to "XLogRead" The one in xlogreader.h was pointed out by Antonin Houska; I (Álvaro) noticed the others by grepping. Author: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/28250.1589186654@antos	2020-05-11 12:46:41 -04:00
Peter Eisentraut	7a9c9ce641	Translation updates Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 80d8f54b3c5533ec036404bd3c3b24ff4825d037	2020-05-11 13:14:32 +02:00
Michael Paquier	e111c9f90a	Remove smgrdounlink() in smgr.c from the code tree The last caller of this routine was removed in `b416691`, and as a wise man said one day, dead code tends to silently break. Per discussion between Fujii Masao, Peter Geoghegan, Vignesh C and me. Reported-by: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wz=sg5H8-vG4d5UmAofdcRMpeTDt2K-NUWp4GSfhenRGAQ@mail.gmail.com	2020-05-10 10:58:54 +09:00
Tomas Vondra	ebeb3dea77	Simplify show_incremental_sort_info a bit Incremental sort always processes at least one full group before switching to prefix groups, so it's enough to check just the number of full groups. There was no risk of division by zero due to the extra condition, but it made the code harder to understand. Reported-by: Ranier Vilela Discussion: https://postgr.es/m/CAEudQAp+7qoS92-4V1vLChpdY3vEkLCbf+gye6P-4cirE-0z0A@mail.gmail.com	2020-05-09 19:41:42 +02:00
Tomas Vondra	9155b4be9a	Do not reset bounded before incremental sort rescan ExecReScanIncrementalSort was resetting bounded=false, which means the optimization would be disabled on all rescans. This happens because ExecSetTupleBound is called before the rescan, not after it. Author: James Coleman Reviewed-by: Tomas Vondra Discussion: https://postgr.es/m/20200414065336.GI1492@paquier.xyz	2020-05-09 19:41:36 +02:00
Tomas Vondra	c442722648	Fix handling of REWIND/MARK/BACKWARD in incremental sort The executor flags were not handled entirely correctly, although the bugs were mostly harmless and it was mostly comment inaccuracy. We don't need to strip any of the flags for child nodes. Incremental sort does not support backward scans or mark/restore, so MARK/BACKWARDS flags should not be possible. So we simply ensure this using an assert, and we don't bother removing them when initializing the child node. With REWIND it's a bit less clear - incremental sort does not support REWIND, but there is no way to signal this - it's legal to just ignore the flag. We however continue passing the flag to child nodes, because they might be able to leverage it. Reported-by: Michael Paquier Author: James Coleman Reviewed-by: Tomas Vondra Discussion: https://postgr.es/m/20200414065336.GI1492@paquier.xyz	2020-05-09 19:41:18 +02:00
Alvaro Herrera	b060dbe000	Rework XLogReader callback system Code review for `0dc8ead463`, prompted by a bug closed by `91c40548d5`. XLogReader's system for opening and closing segments had gotten too complicated, with callbacks being passed at both the XLogReaderAllocate level (read_page) as well as at the WALRead level (segment_open). This was confusing and hard to follow, so restructure things so that these callbacks are passed together at XLogReaderAllocate time, and add another callback to the set (segment_close) to make it a coherent whole. Also, ensure XLogReaderState is an argument to all the callbacks, so that they can grab at the ->private data if necessary. Document the whole arrangement more clearly. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20200422175754.GA19858@alvherre.pgsql	2020-05-08 15:40:11 -04:00
Peter Eisentraut	086ffddf36	Fix several DDL issues of generated columns versus inheritance Several combinations of generated columns and inheritance in CREATE TABLE were not handled correctly. Specifically: - Disallow a child column specifying a generation expression if the parent column is a generated column. The child column definition must be unadorned and the parent column's generation expression will be copied. - Prohibit a child column of a generated parent column specifying default values or identity. - Allow a child column of a not-generated parent column specifying itself as a generated column. This previously did not work, but it was possible to arrive at the state via other means (involving ALTER TABLE), so it seems sensible to support it. Add tests for each case. Also add documentation about the rules involving generated columns and inheritance. Discussion: https://www.postgresql.org/message-id/flat/15830.1575468847%40sss.pgh.pa.us https://www.postgresql.org/message-id/flat/2678bad1-048f-519a-ef24-b12962f41807%40enterprisedb.com https://www.postgresql.org/message-id/flat/CAJvUf_u4h0DxkCMCeEKAWCuzGUTnDP-G5iVmSwxLQSXn0_FWNQ%40mail.gmail.com	2020-05-08 11:31:57 +02:00
Peter Eisentraut	501e41dd3c	Propagate ALTER TABLE ... SET STORAGE to indexes When creating a new index, the attstorage setting of the table column is copied to regular (non-expression) index columns. But a later ALTER TABLE ... SET STORAGE is not propagated to indexes, thus creating an inconsistent and undumpable state. Discussion: https://www.postgresql.org/message-id/flat/9765d72b-37c0-06f5-e349-2a580aafd989%402ndquadrant.com	2020-05-08 08:39:17 +02:00
Fujii Masao	f2ff203596	Report missing wait event for timeline history file. TimelineHistoryRead and TimelineHistoryWrite wait events are reported during waiting for a read and write of a timeline history file, respectively. However, previously, TimelineHistoryRead wait event was not reported while readTimeLineHistory() was reading a timeline history file. Also TimelineHistoryWrite was not reported while writeTimeLineHistory() was writing one line with the details of the timeline split, at the end. This commit fixes these issues. Back-patch to v10 where wait events for a timeline history file were added. Author: Masahiro Ikeda Reviewed-by: Michael Paquier, Fujii Masao Discussion: https://postgr.es/m/d11b0c910b63684424e06772eb844ab5@oss.nttdata.com	2020-05-08 10:36:40 +09:00
Peter Geoghegan	cd8c73a38a	Refactor nbtree deletion INCOMPLETE_SPLIT check. Factor out code common to _bt_lock_branch_parent() and _bt_pagedel() into a new utility function. This new function is used to check that the left sibling of a deletion target page does not have the INCOMPLETE_SPLIT page flag set. If it is set then deletion is unsafe; there won't be a usable pivot tuple (with a downlink) in the parent page that points to the deletion target page. The page deletion algorithm is not prepared to deal with that. Also restructure an existing, related utility function that checks if the right sibling of the target page has the ISHALFDEAD page flag set. This organization highlights the symmetry between the two cases. The goal is to make the design of page deletion clearer. Both functions involve a sibling page with a flag that indicates that there was an interrupted operation (a page split or a page deletion) that resulted in a page pointed to by sibling pages, but not pointed to in the parent. And, both functions indicate if page deletion is unsafe due to the absence of a particular downlink in the parent page.	2020-05-07 16:08:54 -07:00
Tom Lane	db89f0e3a4	Fix YA text phrase search bug. checkcondition_str() failed to report multiple matches for a prefix pattern correctly: it would dutifully merge the match positions, but then after exiting that loop, if the last prefix-matching word had had no suitable positions, it would report there were no matches. The upshot would be failing to recognize a match that the query should match. It looks like you need all of these conditions to see the bug: * a phrase search (else we don't ask for match position details) * a prefix search item (else we don't get to this code) * a weight restriction (else checkclass_str won't fail) Noted while investigating a problem report from Pavel Borisov, though this is distinct from the issue he was on about. Back-patch to 9.6 where phrase search was added.	2020-05-07 15:59:51 -04:00
Alvaro Herrera	5be594caf8	Heed lock protocol in DROP OWNED BY We were acquiring object locks then deleting objects one by one, instead of acquiring all object locks first, ignoring those that did not exist, and then deleting all objects together. The latter is the correct protocol to use, and what this commit changes the code to do. Failing to follow that leads to "cache lookup failed for relation XYZ" error reports when DROP OWNED runs concurrently with other DDL -- for example, a session termination that removes some temp tables. Author: Álvaro Herrera Reported-by: Mithun Chicklore Yogendra (Mithun CY) Reviewed-by: Ahsan Hadi, Tom Lane Discussion: https://postgr.es/m/CADq3xVZTbzK4ZLKq+dn_vB4QafXXbmMgDP3trY-GuLnib2Ai1w@mail.gmail.com	2020-05-06 12:29:41 -04:00
Tom Lane	46da7bf671	Fix severe memory leaks in GSSAPI encryption support. Both the backend and libpq leaked buffers containing encrypted data to be transmitted, so that the process size would grow roughly as the total amount of data sent. There were also far-less-critical leaks of the same sort in GSSAPI session establishment. Oversight in commit `b0b39f72b`, which I failed to notice while reviewing the code in `2c0cdc818`. Per complaint from pmc@citylink. Back-patch to v12 where this code was introduced. Discussion: https://postgr.es/m/20200504115649.GA77072@gate.oper.dinoex.org	2020-05-05 13:10:17 -04:00
Amit Kapila	69bfaf2e1d	Change the display of WAL usage statistics in Explain. In commit `33e05f89c5`, we have added the option to display WAL usage statistics in Explain and auto_explain. The display format used two spaces between each field which is inconsistent with Buffer usage statistics which is using one space between each field. Change the format to make WAL usage statistics consistent with Buffer usage statistics. This commit also changed the usage of "full page writes" to "full page images" for WAL usage statistics to make it consistent with other parts of code and docs. Author: Julien Rouhaud, Amit Kapila Reviewed-by: Justin Pryzby, Kyotaro Horiguchi and Amit Kapila Discussion: https://postgr.es/m/CAB-hujrP8ZfUkvL5OYETipQwA=e3n7oqHFU=4ZLxWS_Cza3kQQ@mail.gmail.com	2020-05-05 08:00:53 +05:30
Alexander Korotkov	9f87ae38ea	Fix typo in comment Reported-by: Oleg Bartunov	2020-05-03 12:19:31 +03:00
Peter Geoghegan	9dc7251417	Refactor btvacuumpage(). Remove one of the arguments to btvacuumpage(), and give up on the idea that it's a recursive function. We now use the term "backtracking" to refer to the case where an earlier block must be visited to make sure no tuples that need to be removed were missed. Advertising btvacuumpage() as a recursive function was unhelpful. In reality the function always simulates recursion with a loop (it doesn't actually call itself). This wasn't just necessary as a precaution (per the comments mentioning tail recursion), though. There is no reliable natural limit on the number of times we can backtrack. There are important behavioral differences when "recursing"/backtracking, mostly related to page deletion. We don't perform page deletion when backtracking due to the extra complexity. And when we recurse, we're not performing a physical order scan anymore, so we expect fairly different conditions to hold for the page. Structuring the code like this makes it clearer how _bt_pagedel() cooperates with btvacuumpage() and btvacuumscan() (as established in commit `b0229f26` and commit `73a076b0`). Author: Peter Geoghegan Reviewed-By: Masahiko Sawada Discussion: https://postgr.es/m/CAH2-WzmRGMDWiLMcb+zagG9652PboNN4Gfcq1Gc_wJL6A716MA@mail.gmail.com	2020-05-02 14:04:33 -07:00
Stephen Frost	b68a560f8e	Fix GSS client to non-GSS server connection If the client is compiled with GSSAPI support and tries to start up GSS with the server, but the server is not compiled with GSSAPI support, we would mistakenly end up falling through to call ProcessStartupPacket with secure_done = true, but the client might then try to perform SSL, which the backend wouldn't understand and we'd end up failing the connection with: FATAL: unsupported frontend protocol 1234.5679: server supports 2.0 to 3.0 Fix by arranging to track ssl_done independently from gss_done, instead of trying to use the same boolean for both. Author: Andrew Gierth Discussion: https://postgr.es/m/87h82kzwqn.fsf@news-spur.riddles.org.uk Backpatch: 12-, where GSSAPI encryption was added.	2020-05-02 11:39:26 -04:00
Tomas Vondra	d5d09692ea	Remove superfluous memset from pgstat_recv_resetslrucounter The extra memset meant pg_stat_reset_slru() always reset all the entries even when reset of a single entry was requested, but the timestamp was left uninitialized. Reported-by: Atsushi Torikoshi Discussion: https://postgr.es/m/CACZ0uYFe16pjZxQYaTn53mspyM7dgMPYL3DJLjjPw69GMCC2Ow%40mail.gmail.com	2020-05-02 15:30:10 +02:00
Tomas Vondra	60fbb4d762	Simplify cost_incremental_sort a bit Commit `de0dc1a847` added code to cost_incremental_sort to handle varno 0. Explicitly removing the RelabelType is not really necessary, because the pull_varnos handles that just fine, which simplifies the code a bit. Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs4_3_D2J5XxOuw68hvn0-gJsw9FXNSGcZka9aTymn9UJ8A%40mail.gmail.com Discussion: https://postgr.es/m/20200411214639.GK2228%40telsasoft.com	2020-05-02 01:33:51 +02:00
Tomas Vondra	2e08d314ed	Remove pg_xact entry from SLRU stats The "pg_xact" entry was duplicate with "clog" and was added by mistake. Reported-by: Fujii Masao Discussion: https://postgr.es/m/20200119143707.gyinppnigokesjok@development	2020-05-02 00:36:25 +02:00
Tom Lane	0da06d9faf	Get rid of trailing semicolons in C macro definitions. Writing a trailing semicolon in a macro is almost never the right thing, because you almost always want to write a semicolon after each macro call instead. (Even if there was some reason to prefer not to, pgindent would probably make a hash of code formatted that way; so within PG the rule should basically be "don't do it".) Thus, if we have a semi inside the macro, the compiler sees "something;;". Much of the time the extra empty statement is harmless, but it could lead to mysterious syntax errors at call sites. In perhaps an overabundance of neatnik-ism, let's run around and get rid of the excess semicolons wherever possible. The only thing worse than a mysterious syntax error is a mysterious syntax error that only happens in the back branches; therefore, backpatch these changes where relevant, which is most of them because most of these mistakes are old. (The lack of reported problems shows that this is largely a hypothetical issue, but still, it could bite us in some future patch.) John Naylor and Tom Lane Discussion: https://postgr.es/m/CACPNZCs0qWTqJ2QUSGJ07B7uvAvzMb-KbG2q+oo+J3tsWN5cqw@mail.gmail.com	2020-05-01 17:28:00 -04:00
Peter Geoghegan	69cf853fe7	Clear up issue with FSM and oldest btpo.xact. On further reflection, code comments added by commit `b0229f26` slightly misrepresented how we determine the oldest btpo.xact for the index. btvacuumpage() does not treat the btpo.xact of a page that it put in the FSM as a candidate to be the oldest deleted page (the delete-marked page that has the oldest btpo.xact XID among all pages encountered). The definition of a deleted page for the purposes of the btpo.xact calculation is different from the definition used by the bulk delete statistics. The bulk delete statistics don't distinguish between pages that were deleted by the current VACUUM, pages deleted by a previous VACUUM operation but not yet recyclable/reusable, and pages that are reusable (though reusable pages are counted separately). Backpatch: 11-, just like commit `b0229f26`.	2020-05-01 12:19:44 -07:00
Peter Geoghegan	4e21f8b633	Reorder function prototypes for consistency.	2020-05-01 10:03:38 -07:00
Peter Geoghegan	73a076b03f	Fix undercounting in VACUUM VERBOSE output. The logic for determining how many nbtree pages in an index are deleted pages sometimes undercounted pages. Pages that were deleted by the current VACUUM operation (as opposed to some previous VACUUM operation whose deleted pages have yet to be reused) were sometimes overlooked. The final count is exposed to users through VACUUM VERBOSE's "%u index pages have been deleted" output. btvacuumpage() avoided double-counting when _bt_pagedel() deleted more than one page by assuming that only one page was deleted, and that the additional deleted pages would get picked up during a future call to btvacuumpage() by the same VACUUM operation. _bt_pagedel() can legitimately delete pages that the btvacuumscan() scan will not visit again, though, so that assumption was slightly faulty. Fix the accounting by teaching _bt_pagedel() about its caller's requirements. It now only reports on pages that it knows btvacuumscan() won't visit again (including the current btvacuumpage() page), so everything works out in the end. This bug has been around forever. Only backpatch to v11, though, to keep _bt_pagedel() in sync on the branches that have today's bugfix commit `b0229f26da`. Note that this commit changes the signature of _bt_pagedel(), just like commit `b0229f26da`. Author: Peter Geoghegan Reviewed-By: Masahiko Sawada Discussion: https://postgr.es/m/CAH2-WzkrXBcMQWAYUJMFTTvzx_r4q=pYSjDe07JnUXhe+OZnJA@mail.gmail.com Backpatch: 11-	2020-05-01 09:51:09 -07:00
Peter Geoghegan	b0229f26da	Fix bug in nbtree VACUUM "skip full scan" feature. Commit `857f9c36cd` (which taught nbtree VACUUM to skip a scan of the index from btcleanup in situations where it doesn't seem worth it) made VACUUM maintain the oldest btpo.xact among all deleted pages for the index as a whole. It failed to handle all the details surrounding pages that are deleted by the current VACUUM operation correctly (though pages deleted by some previous VACUUM operation were processed correctly). The most immediate problem was that the special area of the page was examined without a buffer pin at one point. More fundamentally, the handling failed to account for the full range of _bt_pagedel() behaviors. For example, _bt_pagedel() sometimes deletes internal pages in passing, as part of deleting an entire subtree with btvacuumpage() caller's page as the leaf level page. The original leaf page passed to _bt_pagedel() might not be the page that it deletes first in cases where deletion can take place. It's unclear how disruptive this bug may have been, or what symptoms users might want to look out for. The issue was spotted during unrelated code review. To fix, push down the logic for maintaining the oldest btpo.xact to _bt_pagedel(). btvacuumpage() is now responsible for pages that were fully deleted by a previous VACUUM operation, while _bt_pagedel() is now responsible for pages that were deleted by the current VACUUM operation (this includes half-dead pages from a previous interrupted VACUUM operation that become fully deleted in _bt_pagedel()). Note that _bt_pagedel() should never encounter an existing deleted page. This commit theoretically breaks the ABI of a stable release by changing the signature of _bt_pagedel(). However, if any third party extension is actually affected by this, then it must already be completely broken (since there are numerous assumptions made in _bt_pagedel() that cannot be met outside of VACUUM). 
It seems highly unlikely that such an extension actually exists, in any case. Author: Peter Geoghegan Reviewed-By: Masahiko Sawada Discussion: https://postgr.es/m/CAH2-WzkrXBcMQWAYUJMFTTvzx_r4q=pYSjDe07JnUXhe+OZnJA@mail.gmail.com Backpatch: 11-, where the "skip full scan" feature was introduced.	2020-05-01 08:39:52 -07:00
Peter Geoghegan	dd1f645cc8	Fix AddressSanitizer use-after-scope complaint. XLogRegisterBufData() does not copy data pointed to by caller's pointer argument. Oversight in commit `0d861bbb70`. Author: Peter Eisentraut Reported-By: Peter Eisentraut Discussion: https://postgr.es/m/21800dbe-a13e-22f7-d423-b81db9d249f5@2ndquadrant.com	2020-04-30 12:31:56 -07:00
Peter Eisentraut	eb892102e0	Make SQL/JSON error code names match SQL standard see also `a00c53b0cb`	2020-04-30 09:34:54 +02:00
Peter Geoghegan	ab2343d4cb	Remove redundant _bt_killitems() buffer check. _bt_getbuf() cannot return an invalid buffer. Oversight in commit `2ed5b87f96`.	2020-04-29 18:17:49 -07:00
Michael Paquier	e30b0b5cfa	Fix check for conflicting SSL min/max protocol settings Commit `79dfa8a` has introduced a check to catch when the minimum protocol version was set higher than the maximum version, however an error was getting generated when both bounds are set even if they are able to work, causing a backend to not use a new SSL context but keep the old one. Author: Daniel Gustafsson Discussion: https://postgr.es/m/14BFD060-8C9D-43B4-897D-D5D9AA6FC92B@yesql.se	2020-04-30 08:14:02 +09:00
Alvaro Herrera	1816a1c6ff	Fix checkpoint signalling Checkpointer uses its MyLatch to wake up when a checkpoint request is received. But before commit `c655077639` the latch was not used for anything else, so the code could just go to sleep after each loop without rechecking the sleeping condition. That commit added a separate ResetLatch in its code path[1], which can cause a checkpoint to go unnoticed for potentially a long time. Fix by skipping sleep if any checkpoint flags are set. Also add a test to verify this; authored by Kyotaro Horiguchi. [1] CreateCheckPoint -> InvalidateObsoleteReplicationSlots -> ConditionVariableTimeSleep Report and diagnosis by Kyotaro Horiguchi. Co-authored-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20200408.141956.891237856186513376.horikyota.ntt@gmail.com	2020-04-29 18:46:42 -04:00
Alvaro Herrera	d0abe78d84	Check slot->restart_lsn validity in a few more places Lack of these checks could cause visible misbehavior, including assertion failures. This was missed in commit `c655077639`, whereby restart_lsn becomes invalid when the size limit is exceeded. Also reword some existing error messages, and add errdetail(), so that the reported errors all match in spirit. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20200408.093710.447591748588426656.horikyota.ntt@gmail.com	2020-04-28 20:39:04 -04:00
Peter Eisentraut	6baa17fbd1	Add missing gettext triggers Some translatable strings have been moved to scanner_yyerror(), so we need to add that, too.	2020-04-28 13:35:40 +02:00
Alexander Korotkov	ef11051bbe	Fix definition of pg_statio_all_tables view The pg_statio_all_tables view appears to have the wrong grouping. As a result, the numbers of toast index blocks read and hit were multiplied by the number of table indexes. This commit fixes the view definition. Backpatching this appears difficult. We don't have a mechanism to patch a system catalog of existing instances in a minor upgrade. We can write a release notes instruction to do this manually. But per discussion this is probably not so critical a bug as to warrant such an intrusive fix. Reported-by: Andrei Zubkov Discussion: https://postgr.es/m/CAPpHfdtMYkkNudLMG9G0dxX_B%3Dn5sfKzOyxxrvWYtSicaGW0Lw%40mail.gmail.com	2020-04-28 11:30:33 +03:00
Tom Lane	e81e5741a6	Fix full text search to handle NOT above a phrase search correctly. Queries such as '!(foo<->bar)' failed to find matching rows when implemented as a GiST or GIN index search. That's because of failing to handle phrase searches as tri-valued when considering a query without any position information for the target tsvector. We can only say that the phrase operator might match, not that it does match; and therefore its NOT also might match. The previous coding incorrectly inverted the approximate phrase result to decide that there was certainly no match. To fix, we need to make TS_phrase_execute return a real ternary result, and then bubble that up accurately in TS_execute. As long as we have to do that anyway, we can simplify the baroque things TS_phrase_execute was doing internally to manage tri-valued searching with only a bool as explicit result. For now, I left the externally-visible result of TS_execute as a plain bool. There do not appear to be any outside callers that need to distinguish a three-way result, given that they passed in a flag saying what to do in the absence of position data. This might need to change someday, but we wouldn't want to back-patch such a change. Although tsginidx.c has its own TS_execute_ternary implementation for use at upper index levels, that sadly managed to get this case wrong as well :-(. Fixing it is a lot easier fortunately. Per bug #16388 from Charles Offenbacher. Back-patch to 9.6 where phrase search was introduced. Discussion: https://postgr.es/m/16388-98cffba38d0b7e6e@postgresql.org	2020-04-27 12:21:04 -04:00
Michael Paquier	641b76d9d1	Fix some typos Author: Justin Pryzby Discussion: https://postgr.es/m/20200408165653.GF2228@telsasoft.com	2020-04-27 14:59:36 +09:00
Peter Eisentraut	f057980149	Fix typo from `303640199d`	2020-04-26 13:48:33 +02:00
Peter Geoghegan	7154aa16a6	Fix another minor page deletion buffer lock issue. Avoid accessing the leaf page's top parent tuple without a buffer lock held during the second phase of nbtree page deletion. The old approach was safe, though only because VACUUM never drops its buffer pin (and because only VACUUM itself can modify a half-dead page). Even still, it seems like a good idea to be strict here. Tighten things up by copying the top parent page's block number to a local variable before releasing the buffer lock on the leaf page -- not after. This is a follow-up to commit `fa7ff642`, which fixed a similar issue in the first phase of nbtree page deletion. Update some related comments in passing. Discussion: https://postgr.es/m/CAH2-WzkLgyN3zBvRZ1pkNJThC=xi_0gpWRUb_45eexLH1+k2_Q@mail.gmail.com	2020-04-25 16:45:20 -07:00
Peter Geoghegan	fa7ff642c2	Fix minor nbtree page deletion buffer lock issue. Avoid accessing the deletion target page's special area during nbtree page deletion at a point where there is no buffer lock held. This issue was detected by a patch that extends Valgrind's memcheck tool to mark nbtree pages that are unsafe to access (due to not having a buffer lock or buffer pin) as NOACCESS. We do hold a buffer pin at this point, and only access the special area, so the old approach was safe. Even still, it seems like a good idea to tighten up the rules in this area. There is no reason to not simply insist on always holding a buffer lock (not just a pin) when accessing nbtree pages. Update some related comments in passing. Discussion: https://postgr.es/m/CAH2-WzkLgyN3zBvRZ1pkNJThC=xi_0gpWRUb_45eexLH1+k2_Q@mail.gmail.com	2020-04-25 14:17:02 -07:00
Noah Misch	f246ea3b2a	In caught-up logical walsender, sleep only in WalSndWaitForWal(). Before sleeping, WalSndWaitForWal() sends a keepalive if MyWalSnd->write < sentPtr. When the latest physical LSN yields no logical replication messages (a common case), that keepalive elicits a reply. Processing the reply updates pg_stat_replication.replay_lsn. WalSndLoop() lacks that; when WalSndLoop() slept, replay_lsn advancement could stall until wal_receiver_status_interval elapsed. This sometimes stalled src/test/subscription/t/001_rep_changes.pl for up to 10s. Reviewed by Fujii Masao and Michael Paquier. Discussion: https://postgr.es/m/20200418070142.GA1075445@rfd.leadboat.com	2020-04-25 10:18:12 -07:00
Noah Misch	72a3dc321d	Revert "When WalSndCaughtUp, sleep only in WalSndWaitForWal()." This reverts commit `4216858122`. It caused idle physical walsenders to busy-wait, as reported by Fujii Masao. Discussion: https://postgr.es/m/20200417054146.GA1061007@rfd.leadboat.com	2020-04-25 10:17:26 -07:00
Andrew Gierth	d9a4cce29d	Fix error case for CREATE ROLE ... IN ROLE. CreateRole() was passing a Value node, not a RoleSpec node, for the newly-created role name when adding the role as a member of existing roles for the IN ROLE syntax. This mistake went unnoticed because the node in question is used only for error messages and is not accessed on non-error paths. In older pg versions (such as 9.5 where this was found), this results in an "unexpected node type" error in place of the real error. That node type check was removed at some point, after which the code would accidentally fail to fail on 64-bit platforms (on which accessing the Value node as if it were a RoleSpec would be mostly harmless) or give an "unexpected role type" error on 32-bit platforms. Fix the code to pass the correct node type, and add an lfirst_node assertion just in case. Per report on irc from user m1chelangelo. Backpatch all the way, because this error has been around for a long time.	2020-04-25 05:09:30 +01:00
Tom Lane	baf17ad9df	Repair performance regression in information_schema.triggers view. Commit `32ff26911` introduced use of rank() into the triggers view to calculate the spec-mandated action_order column. As written, this prevents query constraints on the table-name column from being pushed below the window aggregate step. That's bad for performance of this typical usage pattern, since the view now has to be evaluated for all tables not just the one(s) the user wants to see. It's also the cause of some recent buildfarm failures, in which trying to evaluate the view rows for triggers in process of being dropped resulted in "cache lookup failed for function NNN" errors. Those rows aren't of interest to the test script doing the query, but the filter that would eliminate them is being applied too late. None of this happened before the rank() call was there, so it's a regression compared to v10 and before. We can improve matters by changing the rank() call so that instead of partitioning by OIDs, it partitions by nspname and relname, casting those to sql_identifier so that they match the respective view output columns exactly. The planner has enough intelligence to know that constraints on partitioning columns are safe to push down, so this eliminates the performance problem and the regression test failure risk. We could make the other partitioning columns match view outputs as well, but it'd be more complicated and the performance benefits are questionable. Side note: as this stands, the planner will push down constraints on event_object_table and trigger_schema, but not on event_object_schema, because it checks for ressortgroupref matches not expression equivalence. That might be worth improving someday, but it's not necessary to fix the immediate concern. Back-patch to v11 where the rank() call was added. 
Ordinarily we'd not change information_schema in released branches, but the test failure has been seen in v12 and presumably could happen in v11 as well, so we need to do this to keep the buildfarm happy. The change is harmless so far as users are concerned. Some might wish to apply it to existing installations if performance of this type of query is of concern, but those who don't are no worse off. I bumped catversion in HEAD as a pro forma matter (there's no catalog incompatibility that would really require a re-initdb). Obviously that can't be done in the back branches. Discussion: https://postgr.es/m/5891.1587594470@sss.pgh.pa.us	2020-04-24 12:02:36 -04:00
Michael Paquier	4e87c4836a	Fix handling of WAL segments ready to be archived during crash recovery `78ea8b5` has fixed an issue related to the recycling of WAL segments on standbys depending on archive_mode. However, it has introduced a regression with the handling of WAL segments ready to be archived during crash recovery, causing those files to be recycled without getting archived. This commit fixes the regression by tracking in shared memory if a live cluster is either in crash recovery or archive recovery as the handling of WAL segments ready to be archived is different in both cases (those WAL segments should not be removed during crash recovery), and by using this new shared memory state to decide if a segment can be recycled or not. Previously, it was not possible to know if a cluster was in crash recovery or archive recovery as the shared state was able to track only if recovery was happening or not, leading to the problem. A set of TAP tests is added to close the gap here, making sure that WAL segments ready to be archived are correctly handled when a cluster is in archive or crash recovery with archive_mode set to "on" or "always", for both standby and primary. Reported-by: Benoît Lobréau Author: Jehan-Guillaume de Rorthais Reviewed-by: Kyotaro Horiguchi, Fujii Masao, Michael Paquier Discussion: https://postgr.es/m/20200331172229.40ee00dc@firost Backpatch-through: 9.5	2020-04-24 08:48:28 +09:00
Tom Lane	3436c5e283	Remove ACLDEBUG #define and associated code. In the footsteps of `aaf069aa3`, remove ACLDEBUG, which was the only other remaining undocumented symbol in pg_config_manual.h. The fact that nobody had bothered to document it in seventeen years is a good clue to its usefulness. In practice, none of the tracing logic it enabled would be of any value without additional effort. Discussion: https://postgr.es/m/6631.1587565046@sss.pgh.pa.us	2020-04-23 15:38:04 -04:00
Tom Lane	ee88ef55db	Remove useless (and broken) logging logic in memory context functions. Nobody really uses this stuff, especially not since we created valgrind-based infrastructure that does the same thing better. It is thus unsurprising that the generation.c and slab.c versions were actually broken. Rather than fix 'em, let's just remove 'em. Alexander Lakhin Discussion: https://postgr.es/m/8936216c-3492-3f6e-634b-d638fddc5f91@gmail.com	2020-04-23 15:27:37 -04:00
Robert Haas	3989dbdf12	Rename exposed identifiers to say "backup manifest". Function names declared "extern" now use BackupManifest in the name rather than just Manifest, and data types use backup_manifest rather than just manifest. Per note from Michael Paquier. Discussion: http://postgr.es/m/20200418125713.GG350229@paquier.xyz	2020-04-23 08:44:06 -04:00
Andres Freund	299298bc87	Fix transient memory leak for SRFs in FROM. In `a9c35cf85c` I changed ExecMakeTableFunctionResult() to dynamically allocate the FunctionCallInfo used to call the SRF. Unfortunately I did not account for the fact that the surrounding memory context has query lifetime, leading to a leak till the end of the query. In most cases the leak is fairly inconsequential, but if the FunctionScan is done many times in the query, the leak can add up. This happens e.g. if the function scan is on the inner side of a nested loop, due to a lateral join. EXPLAIN SELECT sum(f) FROM generate_series(1, 100000000) g(i), generate_series(i, i+1) f; quickly shows the leak. Instead of explicitly freeing the FunctionCallInfo it seems better to make sure all the per-set temporary state in ExecMakeTableFunctionResult() is cleaned up wholesale. Currently that's probably just the FunctionCallInfo allocation, but since there's some initialization work, and since there's already an appropriate context, this seems like a more robust approach. Bug: #16112 Reported-By: Ben Cornett Author: Andres Freund Reviewed-By: Tom Lane Discussion: https://postgr.es/m/16112-4448bbf55a404189%40postgresql.org Backpatch: 12, `a9c35cf85c`	2020-04-22 19:53:06 -07:00
Tomas Vondra	de0dc1a847	Fix cost_incremental_sort for expressions with varno 0 When estimating the number of pre-sorted groups in cost_incremental_sort we must not pass Vars with varno 0 to estimate_num_groups, which would cause failures in find_base_rel. This may happen when sorting output of set operations, thanks to generate_append_tlist. Unlike recurse_set_operations we can't easily access the original target list, so if we find any Vars with varno 0, we fall back to the default estimate DEFAULT_NUM_DISTINCT. Reported-by: Justin Pryzby Discussion: https://postgr.es/m/20200411214639.GK2228%40telsasoft.com	2020-04-23 00:15:24 +02:00
David Rowley	9f2c4edec2	Remove bogus Assert in foreign key cloning code This Assert was trying to ensure that the number of columns in the foreign key being cloned was the same number of attributes in the parentRel. Of course, it's perfectly valid to have columns in the table which are not part of the foreign key constraint. It appears that this Assert was misunderstanding that. Reported-by: Rajkumar Raghuwanshi Reviewed-by: amul sul Discussion: https://postgr.es/m/CAKcux6=z1dtiWw5BOpqDx-U6KTiq+zD0Y2m810zUtWL+giVXWA@mail.gmail.com	2020-04-22 22:12:19 +12:00
Peter Eisentraut	aaf069aa34	Remove HEAPDEBUGALL This has been broken since PostgreSQL 12 and was probably never really used. PostgreSQL 12 added an analogous HEAPAMSLOTDEBUGALL, which still works right now, but it's also not very useful, so remove that as well. Discussion: https://www.postgresql.org/message-id/flat/645c0646-4218-d4c3-409a-a7003a0c108d%402ndquadrant.com	2020-04-22 08:35:33 +02:00
Tom Lane	d12bdba77b	Fix possible crash during FATAL exit from reindexing. index.c supposed that it could just use a PG_TRY block to clean up the state associated with an active REINDEX operation. However, that code doesn't run if we do a FATAL exit --- for example, due to a SIGTERM shutdown signal --- while the REINDEX is happening. And that state does get consulted during catalog accesses, which makes it problematic if we do any catalog accesses during shutdown --- for example, to clean up any temp tables created in the session. If this combination of circumstances occurred, we could find ourselves trying to access already-freed memory. In debug builds that'd fairly reliably cause an assertion failure. In production we might often get away with it, but with some bad luck it could cause a core dump. Another possible bad outcome is an erroneous conclusion that an index-to-be-accessed is being reindexed; but it looks like that would be unlikely to have any consequences worse than failing to drop temp tables right away. (They'd still get dropped by the next session that uses that temp schema.) To fix, get rid of the use of PG_TRY here, and instead hook into the transaction abort mechanisms to clean up reindex state. Per bug #16378 from Alexander Lakhin. This has been wrong for a very long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/16378-7a70ca41b3ec2009@postgresql.org	2020-04-21 15:58:42 -04:00
Tom Lane	5836d32655	Fix minor violations of FunctionCallInvoke usage protocol. Working on commit `1c455078b` led me to check through FunctionCallInvoke call sites to see if every one was being honest about (a) making sure that fcinfo.isnull is initially false, and (b) checking its state after the call. Sure enough, I found some violations. The main one is that finalize_partialaggregate re-used serialfn_fcinfo without resetting isnull, even though it clearly intends to cater for serialfns that return NULL. There would only be an issue with a non-strict serialfn, since it's unlikely that a serialfn would return NULL for non-null input. We have no non-strict serialfns in core, and there may be none in the wild either, which would account for the lack of complaints. Still, it's clearly wrong, so back-patch that fix to 9.6 where finalize_partialaggregate was introduced. Also, arrayfuncs.c and rowtypes.c contained various callers that were not bothering to check for result nulls. While what's being called is a comparison or hash function that probably shouldn't return null, that's a lousy excuse for not having any check at all. There are existing places that just Assert(!fcinfo->isnull) in comparable situations, so I added that to the places that were calling btree comparison or hash support functions. In the places calling boolean-returning equality functions, it's quite cheap to have them treat isnull as FALSE, so make those places do that. Also remove some "locfcinfo->isnull = false" assignments that are unnecessary given the assumption that no previous call returned null. These changes seem like mostly neatnik-ism or debugging support, so I didn't back-patch.	2020-04-21 14:23:53 -04:00
Alvaro Herrera	afccd76f1c	Fix detaching partitions with cloned row triggers When a partition is detached, any triggers that had been cloned from its parent were not properly disentangled from its parent triggers. This resulted in triggers that could not be dropped because they depended on the trigger in the no-longer-parent table: ALTER TABLE t DETACH PARTITION t1; DROP TRIGGER trig ON t1; ERROR: cannot drop trigger trig on table t1 because trigger trig on table t requires it HINT: You can drop trigger trig on table t instead. Moreover the table can no longer be re-attached to its parent, because the trigger name is already taken: ALTER TABLE t ATTACH PARTITION t1 FOR VALUES FROM (1)TO(2); ERROR: trigger "trig" for relation "t1" already exists The former is a bug introduced in commit `86f575948c`. (The latter is not necessarily a bug, but it makes the bug more uncomfortable.) To avoid the complexity that would be needed to tell whether the trigger has a local definition that has to be merged with the one coming from the parent table, establish the behavior that the trigger is removed when the table is detached. Backpatch to pg11. Author: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/flat/20200408152412.GZ2228@telsasoft.com	2020-04-21 13:57:00 -04:00
Peter Geoghegan	1542e16f2c	Consider outliers in split interval calculation. Commit `0d861bbb`, which introduced deduplication to nbtree, added some logic to take large posting list tuples into account when choosing a split point. We subtract firstright posting list overhead from the projected new high key size when calculating leftfree/rightfree values for an affected candidate split point. Posting list tuples aren't special to nbtsplitloc.c, but taking them into account like this makes a huge difference in practice. Posting list tuples are frequently tuple size outliers. However, commit `0d861bbb` missed a closely related issue: split interval itself is calculated based on the assumption that tuples on the page being split are roughly equisized. That assumption was acceptable back when commit `fab25024` taught the logic for choosing a split point about suffix truncation, but it's pretty questionable now that very large tuple sizes are common. This oversight led to unbalanced page splits in low cardinality multi-column indexes when deduplication was used: page splits that don't give sufficient weight to how unbalanced the split is when the interval happens to include some large posting list tuples (and when most other tuples on the page are not so large). Nail this down by calculating an initial split interval in a way that's attuned to the actual cost that we want to keep under control (not a fuzzy proxy for the cost): apply a leftfree + rightfree evenness test to each candidate split point that actually gets included in the split interval (for the default strategy). This replaces logic that used a percentage of all legal split points for the page as the basis of the initial split interval. Discussion: https://postgr.es/m/CAH2-WznJt5aT2uUB2Bs+JBLdwe0XTX67+xeLFcaNvCKxO=QBVQ@mail.gmail.com	2020-04-21 09:59:24 -07:00
Tom Lane	1c455078b0	Allow matchingsel() to be used with operators that might return NULL. Although selfuncs.c will never call a target operator with null inputs, some functions might return null anyway. The existing coding will fail if that happens (since FunctionCall2Coll will punt), which seems undesirable given that matchingsel() has such a broad range of potential applicability --- in fact, we already have a problem because we apply it to jsonb_path_exists_opr, which can return null. Hence, rejigger the underlying functions mcv_selectivity and histogram_selectivity to cope, treating a null result as false. While we are at it, we can move the InitFunctionCallInfoData overhead out of the inner loops, which isn't a huge number of cycles but might save something considering we are likely calling functions as cheap as int4eq(). Plus, the number of loop cycles to be expected is much more than it was when this code was written, since typical settings of default_statistics_target are higher. In view of that consideration, let's apply the same change to var_eq_const, eqjoinsel_inner, and eqjoinsel_semi. We do not expect equality functions to ever return null for non-null inputs (and certainly that code has been that way a long time without complaints), but the cycle savings seem attractive, especially in the eqjoinsel loops where there's potentially an O(N^2) savings. Similar code exists in ineq_histogram_selectivity and get_variable_range, but I forbore from changing those for now. The performance argument for changing ineq_histogram_selectivity is really weak anyway, since that will only iterate log2(N) times. Nikita Glukhov and Tom Lane Discussion: https://postgr.es/m/9d3b0959-95d6-c37e-2c0b-287bcfe5c705@postgrespro.ru	2020-04-21 12:56:55 -04:00
Tom Lane	9d25e1aa31	Clean up cpluspluscheck violation. "operator" is a reserved word in C++, so per project conventions, don't use it as an identifier in header files. My oversight in commit `a80818605`.	2020-04-21 11:21:15 -04:00
Robert Haas	079ac29d4d	Move the server's backup manifest code to a separate file. basebackup.c is already a pretty big and complicated file, so it makes more sense to keep the backup manifest support routines in a separate file, for clarity and ease of maintenance. Discussion: http://postgr.es/m/CA+TgmoavRak5OdP76P8eJExDYhPEKWjMb0sxW7dF01dWFgE=uA@mail.gmail.com	2020-04-20 14:38:15 -04:00
Alvaro Herrera	5fc703946b	Add ALTER .. NO DEPENDS ON. Commit `f2fcad27d5` (9.6 era) added the ability to mark objects as dependent on an extension, but forgot to add a way for such dependencies to be removed. This commit fixes that oversight. Strictly speaking this should be backpatched to 9.6, but due to lack of demand we're not doing so at this time. Discussion: https://postgr.es/m/20200217225333.GA30974@alvherre.pgsql Reviewed-by: ahsan hadi <ahsan.hadi@gmail.com> Reviewed-by: Ibrar Ahmed <ibrar.ahmad@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>	2020-04-20 13:42:12 -04:00
Magnus Hagander	7e4e574744	Allow pg_read_all_stats to access all stats views again. The views pg_stat_progress_* had not gotten the memo that pg_read_all_stats is supposed to be able to read all statistics. Also make a pass over all text-returning pg_stat_xyz functions that could return "insufficient privilege" and make sure they also respect pg_read_all_stats. Reported-by: Andrey M. Borodin Reviewed-by: Andrey M. Borodin, Kyotaro Horiguchi Discussion: https://postgr.es/m/13145F2F-8458-4977-9D2D-7B2E862E5722@yandex-team.ru	2020-04-20 12:53:40 +02:00
Jeff Davis	0cacb2b79d	Fix missing pfree() in logtape.c, missed by `24d85952`.	2020-04-19 10:33:06 -07:00
Tom Lane	f332241a60	Fix race conditions in synchronous standby management. We have repeatedly seen the buildfarm reach the Assert(false) in SyncRepGetSyncStandbysPriority. This apparently is due to failing to consider the possibility that the sync_standby_priority values in shared memory might be inconsistent; but they will be whenever only some of the walsenders have updated their values after a change in the synchronous_standby_names setting. That function is vastly too complex for what it does, anyway, so rewriting it seems better than trying to apply a band-aid fix. Furthermore, the API of SyncRepGetSyncStandbys is broken by design: it returns a list of WalSnd array indexes, but there is nothing guaranteeing that the contents of the WalSnd array remain stable. Thus, if some walsender exits and then a new walsender process takes over that WalSnd array slot, a caller might make use of WAL position data that it should not, potentially leading to incorrect decisions about whether to release transactions that are waiting for synchronous commit. To fix, replace SyncRepGetSyncStandbys with a new function SyncRepGetCandidateStandbys that copies all the required data from shared memory while holding the relevant mutexes. If the associated walsender process then exits, this data is still safe to make release decisions with, since we know that that much WAL was sent to a valid standby server. This incidentally means that we no longer need to treat sync_standby_priority as protected by the SyncRepLock rather than the per-walsender mutex. SyncRepGetSyncStandbys is no longer used by the core code, so remove it entirely in HEAD. However, it seems possible that external code is relying on that function, so do not remove it from the back branches. Instead, just remove the known-incorrect Assert. 
When the bug occurs, the function will return a too-short list, which callers should treat as meaning there are not enough sync standbys, which seems like a reasonably safe fallback until the inconsistent state is resolved. Moreover it's bug-compatible with what has been happening in non-assert builds. We cannot do anything about the walsender-replacement race condition without an API/ABI break. The bogus assertion exists back to 9.6, but 9.6 is sufficiently different from the later branches that the patch doesn't apply at all. I chose to just remove the bogus assertion in 9.6, feeling that the probability of a bad outcome from the walsender-replacement race condition is too low to justify rewriting the whole patch for 9.6. Discussion: https://postgr.es/m/21519.1585272409@sss.pgh.pa.us	2020-04-18 14:02:44 -04:00
David Rowley	3cb02e307e	Fix possible crash with GENERATED ALWAYS columns. In some corner cases, this could also lead to corrupted values being included in the tuple. Users who are concerned that they are affected by this should first upgrade and then perform a base backup of their database and restore onto an off-line server. They should then query each table with generated columns to ensure there are no rows where the generated expression does not match a newly calculated version of the GENERATED ALWAYS expression. If no crashes occur and no rows are returned, they are not affected. Fixes bug #16369. Reported-by: Cameron Ezell Discussion: https://postgr.es/m/16369-5845a6f1bef59884@postgresql.org Backpatch-through: 12 (where GENERATED ALWAYS columns were added.)	2020-04-18 14:10:37 +12:00
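
The verification step described in commit `3cb02e307e` (compare each stored generated value against a freshly computed one, and report the rows that differ) can be illustrated with a minimal sketch. This is not code from the commit. It simulates the check in plain Python on in-memory rows, and the table layout, the column names (`price`, `quantity`, `total`) and the helper `find_mismatched_rows` are all hypothetical. Against a real server the same comparison would presumably be written in SQL, for example with a `WHERE total IS DISTINCT FROM (price * quantity)` filter so that NULLs compare sensibly.

```python
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]


def find_mismatched_rows(
    rows: Iterable[Row],
    generated_column: str,
    expression: Callable[[Row], Any],
) -> List[Row]:
    """Return the rows whose stored generated value differs from a recomputation.

    An empty result means no stored value disagrees with its expression.
    """
    mismatched = []
    for row in rows:
        expected = expression(row)
        # Python treats None == None as equal, which mirrors the intent of
        # SQL's IS DISTINCT FROM (two NULLs are not considered different).
        if row[generated_column] != expected:
            mismatched.append(row)
    return mismatched


def total_expr(row: Row) -> Any:
    """Hypothetical generated expression: total = price * quantity (NULL-propagating)."""
    if row["price"] is None or row["quantity"] is None:
        return None
    return row["price"] * row["quantity"]


# Hypothetical table contents; row 2 stands in for a corrupted stored value.
rows = [
    {"id": 1, "price": 10, "quantity": 3, "total": 30},
    {"id": 2, "price": 5, "quantity": 4, "total": 99},
    {"id": 3, "price": 7, "quantity": None, "total": None},
]

bad = find_mismatched_rows(rows, "total", total_expr)
print([r["id"] for r in bad])
```

As the commit message advises, a check like this is best run against a restored copy on an off-line server rather than on the production database.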

20734 Commits