Commit Graph

20459 Commits

Thomas Munro
db2687d1f3 Optimize PredicateLockTuple().
PredicateLockTuple() has a fast exit if the tuple was written by the current
transaction, as in that case it already has a lock.  This check can be
performed using TransactionIdIsCurrentTransactionId() instead of
SubTransGetTopmostTransaction(), to avoid any chance of having to hit the
disk.

Author: Ashwin Agrawal, based on a suggestion from Andres Freund
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
2019-11-11 17:06:59 +13:00
Thomas Munro
695c5977c8 Optimize TransactionIdIsCurrentTransactionId().
If the passed in xid is the current top transaction, we can do a fast
check and exit early.  This should work well for the current heap but
also works very well for proposed AMs that don't use a separate xid
for subtransactions.

Author: Ashwin Agrawal, based on a suggestion from Andres Freund
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
2019-11-11 16:33:04 +13:00
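
A minimal sketch of the fast path described above, assuming the usual backend APIs (GetTopTransactionIdIfAny, TransactionIdEquals); an illustration only, not the committed code:

    #include "postgres.h"
    #include "access/transam.h"
    #include "access/xact.h"

    /*
     * Sketch only: answer "is this xid my own top-level transaction?"
     * without consulting any subtransaction state.  The actual
     * optimization lives inside TransactionIdIsCurrentTransactionId().
     */
    static bool
    xid_is_my_top_xact(TransactionId xid)
    {
        TransactionId topxid = GetTopTransactionIdIfAny();

        return TransactionIdIsValid(topxid) &&
            TransactionIdEquals(topxid, xid);
    }
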
Amit Kapila
9fab25c6cd Rearrange dropdb() to avoid errors after allowing other sessions to exit.
During Drop Database, it is better to error out before allowing other
sessions to exit and forcefully terminating autovacuum workers.  All the
other error checks except the one for subscriptions were already being done
before that point.

Author: Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1+qhLkCYG2oy9xug9ur_j=G2wQNRYAyd+-kZfZ1z42pLw@mail.gmail.com
2019-11-11 07:42:45 +05:30
Peter Eisentraut
1c60e40ad5 Fix negative bitmapset member not allowed error in logical replication
This happens when we add a replica identity column on a subscriber
that does not yet exist on the publisher, according to the mapping
maintained by the subscriber.  Code that checks whether the target
relation on the subscriber is updatable would check the replica
identity attribute bitmap with a column number -1, which would result
in an error.  To fix, skip such columns in the bitmap lookup and
consider the relation not updatable.  The result is consistent with
the rule that the replica identity columns on the subscriber must be a
subset of those on the publisher, since if the column doesn't exist on
the publisher, the column set on the subscriber can't be a subset.

Reported-by: Tim Clarke <tim.clarke@minerva.info>
Analyzed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://www.postgresql.org/message-id/flat/a9139c29-7ddd-973b-aa7f-71fed9c38d75%40minerva.info
2019-11-09 08:35:44 +01:00
Andres Freund
aae50236e4 Pass ItemPointer not HeapTuple to IndexBuildCallback.
Not all AMs use HeapTuples internally, making it inconvenient to pass
a HeapTuple. As the index callbacks really only need the TID, not the
full tuple, modify the callback to take only an ItemPointer.

Author: Ashwin Agrawal
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CALfoeis6=8ehuR=VNtHvj3z16cYfCwPdTcpaxU+sfSUJ5QgR3g@mail.gmail.com
2019-11-08 11:49:29 -08:00
Alvaro Herrera
71a8a4f6e3 Add backtrace support for error reporting
Add some support for automatically showing backtraces in certain error
situations in the server.  Backtraces are shown on assertion failure;
also, a new setting backtrace_functions can be set to a list of C
function names, and all ereport()s and elog()s from the mentioned
functions will have backtraces generated.  Finally, the function
errbacktrace() can be manually added to an ereport() call to generate a
backtrace for that call.

Authors: Peter Eisentraut, Álvaro Herrera
Discussion: https://postgr.es/m//5f48cb47-bf1e-05b6-7aae-3bf2cd01586d@2ndquadrant.com
Discussion: https://postgr.es/m/CAMsr+YGL+yfWE=JvbUbnpWtrRZNey7hJ07+zT4bYJdVp4Szdrg@mail.gmail.com
2019-11-08 15:44:20 -03:00
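
A hedged illustration of errbacktrace() added to an ereport() call, as described above; the error code and message here are made up for the example:

    ereport(ERROR,
            (errcode(ERRCODE_INTERNAL_ERROR),
             errmsg("unexpected state reached"),
             errbacktrace()));
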
Peter Eisentraut
3dcffb381c Fix gratuitous error message variation 2019-11-08 18:37:17 +01:00
Peter Eisentraut
b85e43feb3 More precise errors from initial pg_control check
Use a separate error message for invalid checkpoint location and
invalid state instead of just "invalid data" for both.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/20191107041630.GK1768@paquier.xyz
2019-11-08 08:03:16 +01:00
Peter Geoghegan
e86c8ef243 Use "low key" terminology in nbtsort.c.
nbtree index builds once stashed the "minimum key" for a page, which was
used as the basis of the pivot tuple that gets placed in the next level
up (i.e. the tuple that stores the downlink to the page in question).
It doesn't quite work that way anymore, so the "minimum key" terminology
now seems misleading (these days the minimum key is actually a straight
copy of the high key from the left sibling, which is a distinct thing in
subtle but important ways).  Rename this concept to "low key".  This
name is a lot clearer given that there is now a sharp distinction
between pivot and non-pivot tuples.  Also remove comments that describe
obsolete details about how the minimum key concept used to work.

Rather than generating the minus infinity item for the leftmost page on
a level by copying the new item and truncating that copy, simply
allocate a small buffer.  The old approach confusingly created the
impression that the new item had some kind of significance.  This was
another artifact of how things used to work before commits 8224de4f and
dd299df8.
2019-11-07 17:12:09 -08:00
Alvaro Herrera
b4bcc6bfdf Fix SET CONSTRAINTS .. DEFERRED on partitioned tables
SET CONSTRAINTS ... DEFERRED failed on partitioned tables, because of a
sanity check that ensures that the affected constraints have triggers.
On partitioned tables, the triggers are in the leaf partitions, not in
the partitioned relations themselves, so the sanity check fails.
Removing the sanity check solves the problem, because the code needed to
support the case is already there.

Backpatch to 11.

Note: deferred unique constraints are not affected by this bug, because
they do have triggers in the parent partitioned table.  I did not add a
test for this scenario.

Discussion: https://postgr.es/m/20191105212915.GA11324@alvherre.pgsql
2019-11-07 13:59:24 -03:00
Tom Lane
a7145f6bc8 Fix integer-overflow edge case detection in interval_mul and pgbench.
This patch adopts the overflow check logic introduced by commit cbdb8b4c0
into two more places.  interval_mul() failed to notice if it computed a
new microseconds value that was one more than INT64_MAX, and pgbench's
double-to-int64 logic had the same sorts of edge-case problems that
cbdb8b4c0 fixed in the core code.

To make this easier to get right in future, put the guts of the checks
into new macros in c.h, and add commentary about how to use the macros
correctly.

Back-patch to all supported branches, as we did with the previous fix.

Yuya Watari

Discussion: https://postgr.es/m/CAJ2pMkbkkFw2hb9Qb1Zj8d06EhWAQXFLy73St4qWv6aX=vqnjw@mail.gmail.com
2019-11-07 11:22:58 -05:00
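
A small sketch of the double-to-int64 edge case mentioned above; the helper name is invented here for illustration, while the committed checks live in macros in c.h:

    #include <math.h>
    #include <stdbool.h>
    #include <stdint.h>

    /*
     * INT64_MAX (2^63 - 1) is not exactly representable as a double, so a
     * naive "d <= (double) INT64_MAX" can accept a value that is really
     * 2^63.  Comparing against -((double) INT64_MIN), which is exactly
     * 2^63, avoids that off-by-one at the top of the range.
     */
    static bool
    double_fits_in_int64(double d)
    {
        return !isnan(d) &&
            d >= (double) INT64_MIN &&
            d < -((double) INT64_MIN);
    }
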
Peter Eisentraut
581a55889b Fix nested error handling in PG_FINALLY
We need to pop the error stack before running the user-supplied
PG_FINALLY code.  Otherwise an error in the cleanup code would end up
at the same sigsetjmp() invocation and result in an infinite error
handling loop.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/95a822c3-728b-af0e-d7e5-71890507ae0c%402ndquadrant.com
2019-11-07 09:56:47 +01:00
Fujii Masao
a0c96856e8 Fix assertion failure when running pg_waldump -s.
If there is a WAL page that a continuation WAL record just fits within
(i.e., the continuation record ends exactly at the end of the page) and
an LSN in such a page is specified with the -s option, pg_waldump previously
caused an assertion failure. The cause of this assertion failure was that
XLogFindNextRecord(), which pg_waldump -s calls, mishandled
such a special WAL page.

This commit changes XLogFindNextRecord() so that it can handle
such a WAL page correctly.

Back-patch to all supported versions.

Author: Andrey Lepikhov
Reviewed-by: Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/99303554-5dd5-06e6-f943-b3005ccd6edd@postgrespro.ru
2019-11-07 16:31:36 +09:00
Thomas Munro
7815e7efdb Add reusable routine for making arrays unique.
Introduce qunique() and qunique_arg(), which can be used after qsort()
and qsort_arg() respectively to remove duplicate values.  Use it where
appropriate.

Author: Thomas Munro
Reviewed-by: Tom Lane (in an earlier version)
Discussion: https://postgr.es/m/CAEepm%3D2vmFTNpAmwbGGD2WaryM6T3hSDVKQPfUwjdD_5XY6vAA%40mail.gmail.com
2019-11-07 17:00:48 +13:00
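
A short usage sketch, assuming the qsort()-style signature introduced here (array, element count, element width, comparator); illustration only:

    #include <stdlib.h>
    #include "lib/qunique.h"

    static int
    cmp_int(const void *a, const void *b)
    {
        int     av = *(const int *) a;
        int     bv = *(const int *) b;

        return (av > bv) - (av < bv);
    }

    /* Sort, then collapse adjacent duplicates; returns the new length. */
    static size_t
    sort_unique_ints(int *vals, size_t n)
    {
        qsort(vals, n, sizeof(int), cmp_int);
        return qunique(vals, n, sizeof(int), cmp_int);
    }
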
Michael Paquier
3feb6ace7c Check for errors after SPI_execute() in xml.c
SPI gets used to build a list of relation OIDs for XML object
generation, and one code path building such a list uses SPI_execute()
without checking the errors it may produce.  So fix that.

Author: Mark Dilger
Reviewed-by: Michael Paquier, Pavel Stehule
Discussion: https://postgr.es/m/17d30445-4862-7917-170f-84328dcd292d@gmail.com
2019-11-07 11:13:31 +09:00
Tomas Vondra
6e3e6cc0e8 Allow sampling of statements depending on duration
This allows logging a sample of statements, without incurring excessive
log traffic (which may impact performance).  This can be useful when
analyzing workloads with lots of short queries.

The sampling is configured using two new GUC parameters:

 * log_min_duration_sample - minimum required statement duration

 * log_statement_sample_rate - sample rate (0.0 - 1.0)

Only statements with duration exceeding log_min_duration_sample are
considered for sampling. To enable sampling, both those GUCs have to
be set correctly.

The existing log_min_duration_statement GUC has a higher priority, i.e.
statements with duration exceeding log_min_duration_statement will always
be logged, irrespective of how the sampling is configured. This means that
only configurations with

  log_min_duration_sample < log_min_duration_statement

actually sample the statements, instead of logging everything.

Author: Adrien Nayrat
Reviewed-by: David Rowley, Vik Fearing, Tomas Vondra
Discussion: https://postgr.es/m/bbe0a1a8-a8f7-3be2-155a-888e661cc06c@anayrat.info
2019-11-06 19:11:07 +01:00
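
A simplified sketch of the resulting decision; the function and parameter names are invented for illustration, and this is not the committed postgres.c logic:

    #include <stdbool.h>
    #include <stdlib.h>

    /*
     * Sketch: statements over log_min_duration_statement are always logged;
     * statements over log_min_duration_sample are logged with probability
     * log_statement_sample_rate; -1 means "disabled".
     */
    static bool
    should_log_statement(double ms, double min_statement_ms,
                         double min_sample_ms, double sample_rate)
    {
        if (min_statement_ms >= 0 && ms >= min_statement_ms)
            return true;            /* always logged, overrides sampling */
        if (min_sample_ms >= 0 && ms >= min_sample_ms)
            return drand48() < sample_rate;     /* sampled */
        return false;
    }
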
Tom Lane
22e44e8dbc Minor code review for tuple slot rewrite.
Avoid creating transiently-inconsistent slot states where possible,
by not setting TTS_FLAG_SHOULDFREE until after the slot actually has
a free'able tuple pointer, and by making sure that we reset tts_nvalid
and related derived state before we replace the tuple contents.  This
would only matter if something were to examine the slot after we'd
suffered some kind of error (e.g. out of memory) while manipulating
the slot.  We typically don't do that, so these changes might just be
cosmetic --- but even if so, it seems like good future-proofing.

Also remove some redundant Asserts, and add a couple for consistency.

Back-patch to v12 where all this code was rewritten.

Discussion: https://postgr.es/m/16095-c3ff2e5283b8dba5@postgresql.org
2019-11-06 12:00:17 -05:00
Tom Lane
ff43b3e88e Sync our DTrace infrastructure with c.h's definition of type bool.
Since commit d26a810eb, we've defined bool as being either _Bool from
<stdbool.h>, or "unsigned char"; but that commit overlooked the fact
that probes.d has "#define bool char".  For consistency, make it say
"unsigned char" instead.  This should be strictly a cosmetic change,
but it seems best to be in sync.

Formally, in the now-normal case where we're using <stdbool.h>, it'd
be better to write "#define bool _Bool".  However, then we'd need
some build infrastructure to inject that configuration choice into
probes.d, and it doesn't seem worth the trouble.  We only use
<stdbool.h> if sizeof(_Bool) is 1, so having DTrace think that
bool parameters are "unsigned char" should be close enough.

Back-patch to v12 where d26a810eb came in.

Discussion: https://postgr.es/m/CAA4eK1LmaKO7Du9M9Lo=kxGU8sB6aL8fa3sF6z6d5yYYVe3BuQ@mail.gmail.com
2019-11-06 11:11:40 -05:00
Peter Eisentraut
d40abd5fcf Fix memory allocation mistake
The previous code was allocating more memory than necessary because
the formula used the wrong data type.

Reported-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://www.postgresql.org/message-id/20191105172918.3e32a446@firost
2019-11-06 14:20:29 +01:00
Peter Eisentraut
5b7ba75f7f Remove unused function argument
The cache_plan argument to ri_PlanCheck has not been used since
e8c9fd5fdf.

Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/ec8a8b45-a30b-9193-cd4b-985d60d1497e%402ndquadrant.com
2019-11-06 08:19:27 +01:00
Michael Paquier
5f6b1eb0cf Fix timestamp of sent message for write context in logical decoding
When sending data for logical decoding using the streaming replication
protocol via a WAL sender, space for the timestamp of the sent write
message is reserved at the beginning of the message when preparing for the
write, and the timestamp is supposed to be computed when the message is
ready to be sent.

However, the timestamp was getting computed after sending the message.  This
impacts anything using logical decoding, causing for example logical
replication to report mostly NULL for last_msg_send_time in
pg_stat_subscription.

This commit makes sure that the timestamp is computed before sending the
message.  This is wrong since 5a991ef, so backpatch down to 9.4.

Author: Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1z=WMn8jt7iEdC5sYNaPgAgOASb_OW5JYv-vMdYaJSL-w@mail.gmail.com
Backpatch-through: 9.4
2019-11-06 16:12:21 +09:00
Andrew Gierth
a9056cc637 Request small targetlist for input to WindowAgg.
WindowAgg will potentially store large numbers of input rows into
tuplestores to allow access to other rows in the frame. If the input
is coming via an explicit Sort node, then unneeded columns will
already have been discarded (since Sort requests a small tlist); but
there are idioms like COUNT(*) OVER () that result in the input not
being sorted at all, and cases where the input is being sorted by some
means other than a Sort; if we don't request a small tlist, then
WindowAgg's storage requirement is inflated by the unneeded columns.

Backpatch back to 9.6, where the current tlist handling was added.
(Prior to that, WindowAgg would always use a small tlist.)

Discussion: https://postgr.es/m/87a7ator8n.fsf@news-spur.riddles.org.uk
2019-11-06 04:13:30 +00:00
Fujii Masao
979766c0af Correct the command tags for ALTER ... RENAME COLUMN.
Previously ALTER MATERIALIZED VIEW / FOREIGN TABLE ... RENAME COLUMN ...
returned "ALTER TABLE" as a command tag. This commit fixes them so that
they return "ALTER MATERIALIZED VIEW" and "ALTER FOREIGN TABLE" as
command tags, respectively.

This issue exists in all supported versions, but we don't back-patch this
because it's not enough of a bug to justify taking any compatibility risks for.
Otherwise, the back-patch could cause a minor version update to break,
for example, existing event trigger functions using TG_TAG.

Author: Fujii Masao
Reviewed-by: Ibrar Ahmed
Discussion: https://postgr.es/m/CAHGQGwGUaC03FFdTFoHsCuDrrNvFvNVQ6xyd40==P25WvuBJjg@mail.gmail.com
2019-11-06 12:54:17 +09:00
Andres Freund
26aaf97b68 Make StringInfo available to frontend code.
There are plenty of places in frontend code that could benefit from a
string buffer implementation. Some because it yields simpler and
faster code, and some others because of the desire to share code
between backend and frontend.

While there is a string buffer implementation available to frontend
code, libpq's PQExpBuffer, it is clunkier than stringinfo, it
introduces a libpq dependency, doesn't allow for sharing between
frontend and backend code, and has a higher API/ABI stability
requirement due to being exposed via libpq.

Therefore it seems best to just make StringInfo usable by
frontend code. There's not much to do for that, except for rewriting
two subsequent elog/ereport calls into other types of error
reporting, and deciding on a maximum string length.

For the maximum string size I decided to privately define MaxAllocSize
to the same value as used in the backend. It seems likely that we'll
want to reconsider this for both backend and frontend code in the not
too far away future.

For now I've left stringinfo.h in lib/, rather than common/, to reduce
the likelihood of unnecessary breakage. We could alternatively decide
to provide a redirecting stringinfo.h in lib/, or just not provide
compatibility.

Author: Andres Freund
Reviewed-By: Kyotaro Horiguchi, Daniel Gustafsson
Discussion: https://postgr.es/m/20190920051857.2fhnvhvx4qdddviz@alap3.anarazel.de
2019-11-05 14:56:40 -08:00
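
A minimal sketch of what frontend code can now do with StringInfo; the function name and message below are made up for the example:

    #include "postgres_fe.h"
    #include "lib/stringinfo.h"

    static void
    report_relation(const char *relname, int nrows)
    {
        StringInfoData buf;

        initStringInfo(&buf);
        appendStringInfo(&buf, "relation \"%s\" has %d rows", relname, nrows);

        printf("%s\n", buf.data);
        pfree(buf.data);
    }
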
Andres Freund
01368e5d9d Split all OBJS style lines in makefiles into one-line-per-entry style.
When maintaining or merging patches, one of the most common sources
of conflicts is the list of objects in makefiles.  Especially when
the split across lines has been changed on both sides, which is
somewhat common due to attempting to stay below 80 columns, those
conflicts are unnecessarily laborious to resolve.

By splitting, and alphabetically sorting, OBJS style lines into one
object per line, conflicts should be less frequent, and easier to
resolve when they still occur.

Author: Andres Freund
Discussion: https://postgr.es/m/20191029200901.vww4idgcxv74cwes@alap3.anarazel.de
2019-11-05 14:41:07 -08:00
Tom Lane
66c61c81b9 Tweak some authentication debug messages to follow project style.
Avoid initial capital, since that's not how we do it.

Discussion: https://postgr.es/m/CACP=ajbrFFYUrLyJBLV8=q+eNCapa1xDEyvXhMoYrNphs-xqPw@mail.gmail.com
2019-11-05 14:29:08 -05:00
Tom Lane
3affe76ef8 Avoid logging complaints about abandoned connections when using PAM.
For a long time (since commit aed378e8d) we have had a policy to log
nothing about a connection if the client disconnects when challenged
for a password.  This is because libpq-using clients will typically
do that, and then come back for a new connection attempt once they've
collected a password from their user, so that logging the abandoned
connection attempt will just result in log spam.  However, this did
not work well for PAM authentication: the bottom-level function
pam_passwd_conv_proc() was on board with it, but we logged messages
at higher levels anyway, for lack of any reporting mechanism.
Add a flag and tweak the logic so that the case is silent, as it is
for other password-using auth mechanisms.

Per complaint from Yoann La Cancellera.  It's been like this for awhile,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/CACP=ajbrFFYUrLyJBLV8=q+eNCapa1xDEyvXhMoYrNphs-xqPw@mail.gmail.com
2019-11-05 14:27:37 -05:00
Tom Lane
a30531c5c8 Fix "unexpected relkind" error when denying permissions on toast tables.
get_relkind_objtype, and hence get_object_type, failed when applied to a
toast table.  This is not a good thing, because it prevents reporting of
perfectly legitimate permissions errors.  (At present, these functions
are in fact *only* used to determine the ObjectType argument for
aclcheck_error() calls.)  It seems best to have them fall back to returning
OBJECT_TABLE in every case where they can't determine an object type
for a pg_class entry, so do that.

In passing, make some edits to alter.c to make it more obvious that
those calls of get_object_type() are used only for error reporting.
This might save a few cycles in the non-error code path, too.

Back-patch to v11 where this issue originated.

John Hsu, Michael Paquier, Tom Lane

Discussion: https://postgr.es/m/C652D3DF-2B0C-4128-9420-FB5379F6B1E4@amazon.com
2019-11-05 13:40:37 -05:00
Tom Lane
529ebb20aa Generate EquivalenceClass members for partitionwise child join rels.
Commit d25ea0127 got rid of what I thought were entirely unnecessary
derived child expressions in EquivalenceClasses for EC members that
mention multiple baserels.  But it turns out that some of the child
expressions that code created are necessary for partitionwise joins,
else we fail to find matching pathkeys for Sort nodes.  (This happens
only for certain shapes of the resulting plan; it may be that
partitionwise aggregation is also necessary to show the failure,
though I'm not sure of that.)

Reverting that commit entirely would be quite painful performance-wise
for large partition sets.  So instead, add code that explicitly
generates child expressions that match only partitionwise child join
rels we have actually generated.

Per report from Justin Pryzby.  (Amit Langote noticed the problem
earlier, though it's not clear if he recognized then that it could
result in a planner error, not merely failure to exploit partitionwise
join, in the code as-committed.)  Back-patch to v12 where commit
d25ea0127 came in.

Amit Langote, with lots of kibitzing from me

Discussion: https://postgr.es/m/CA+HiwqG2WVUGmLJqtR0tPFhniO=H=9qQ+Z3L_ZC+Y3-EVQHFGg@mail.gmail.com
Discussion: https://postgr.es/m/20191011143703.GN10470@telsasoft.com
2019-11-05 11:42:24 -05:00
Michael Paquier
3534fa2233 Refactor code building relation options
Historically, the code to build relation options has been shaped the
same way in multiple code paths: a set of input datums is parsed against
a static table, which is then filled with the option values.  This
introduces a new common routine in reloptions.c to do most of the
legwork for the in-core code paths.

Author: Amit Langote
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CA+HiwqGsoSn_uTPPYT19WrtR7oYpYtv4CdS0xuedTKiHHWuk_g@mail.gmail.com
2019-11-05 09:17:05 +09:00
Tom Lane
ec28808ba8 Fix ginEntryInsert's counting of GIN leaf tuples.
As the code stands, nEntries counts the number of ginEntryInsert()
calls, so that's what you end up with at the end of a GIN index build.
However, ginvacuumcleanup() recomputes nEntries as the number of
surviving leaf tuples, and that's generally consistent with the way that
gincostestimate() uses the value.  So let's clearly define nEntries
as the number of leaf tuples, and therefore adjust ginEntryInsert() to
increment it only when we make a new one, not when we add TIDs into an
existing tuple or posting tree.

In practice this inconsistency probably has little impact, so I don't
feel a need to back-patch.

Insung Moon and Keisuke Kuroda

Discussion: https://postgr.es/m/CAEMmqBuH_O-oXL+3_ArQ6F5cJ7kXVow2SGQB3HRacku_T+xkmA@mail.gmail.com
2019-11-04 14:16:42 -05:00
Peter Eisentraut
a63c84e59a Fix some compiler warnings on older compilers
Some older compilers appear to not understand the recently introduced
PG_FINALLY code structure that well in some circumstances and complain
about possibly uninitialized variables.  So to fix, initialize the
variables explicitly in the cases complained about.

Discussion: https://www.postgresql.org/message-id/flat/95a822c3-728b-af0e-d7e5-71890507ae0c%402ndquadrant.com
2019-11-04 11:07:32 +01:00
Peter Eisentraut
8557a6f10c Catch invalid typlens in a couple of places
Rearrange the logic in record_image_cmp() and datum_image_eq() to
error out on unexpected typlens (either not supported there or
completely invalid due to corruption).  Barring corruption, this is
not possible today but it seems more future-proof and robust to fix
this.

Reported-by: Peter Geoghegan <pg@bowt.ie>
2019-11-04 09:08:15 +01:00
Tom Lane
db27b60f07 Suppress warning from older compilers.
Commit 8af1624e3 introduced a warning about possibly returning
without a value, on compilers that don't realize that ereport(ERROR)
doesn't return.  Tweak the code to avoid that.

Per buildfarm.  Back-patch to 9.6, like the aforesaid commit.
2019-11-03 16:10:23 -05:00
Tom Lane
8af1624e3f Validate ispell dictionaries more carefully.
Using incorrect, or just mismatched, dictionary and affix files
could result in a crash, due to failure to cross-check offsets
obtained from the file.  Add necessary validation, as well as
some Asserts for future-proofing.

Per bug #16050 from Alexander Lakhin.  Back-patch to 9.6 where the
problem was introduced.

Arthur Zakirov, per initial investigation by Tomas Vondra

Discussion: https://postgr.es/m/16050-024ae722464ab604@postgresql.org
Discussion: https://postgr.es/m/20191013012610.2p2fp3zzpoav7jzf@development
2019-11-02 16:45:32 -04:00
Michael Paquier
dc816e5815 Fix failure when creating cloned indexes for a partition
When using CREATE TABLE for a new partition, the partitioned indexes of
the parent are created automatically in a fashion similar to
LIKE ... INCLUDING INDEXES.  The new partition and its parent use a mapping
for attribute
numbers for this operation, and while the mapping was correctly built,
its length was defined as the number of attributes of the newly-created
child, and not the parent.  If the parent includes dropped columns, this
could cause failures.

This is wrong since 8b08f7d which has introduced the concept of
partitioned indexes, so backpatch down to 11.

Reported-by: Wyatt Alt
Author: Michael Paquier
Reviewed-by: Amit Langote
Discussion: https://postgr.es/m/CAGem3qCcRmhbs4jYMkenYNfP2kEusDXvTfw-q+eOhM0zTceG-g@mail.gmail.com
Backpatch-through: 11
2019-11-02 14:16:04 +09:00
Michael Paquier
e174f699c4 Add some assertions in syncrep.c
A couple of routines assume that the LWLock SyncRepLock needs to be
taken, so add a couple of assertions to be sure of that.  Also, when
waiting for a given LSN at transaction commit, the code assumed that the
syncrep queue cleanup happens with interrupts held, but it never actually
checked for that.

Author: Michael Paquier
Reviewed-by: Fujii Masao, Kyotaro Horiguchi, Dongming Liu
Discussion: https://postgr.es/m/a0806273-8bbb-43b3-bbe1-c45a58f6ae21.lingce.ldm@alibaba-inc.com
2019-11-01 22:51:05 +09:00
Michael Paquier
20345197ff Fix race condition at backend exit when deleting element in syncrep queue
When a backend exits, it gets deleted from the syncrep queue if present.
The queue was checked without SyncRepLock taken in exclusive mode, so it
would have been possible for a backend to remove itself after a WAL
sender already did the job.  Fix this issue based on a suggestion from
Fujii Masao, by first checking the queue without the lock.  Then, if the
backend is present in the queue, take the lock and perform an additional
lookup check before doing the element deletion.

Author: Dongming Liu
Reviewed-by: Kyotaro Horiguchi, Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/a0806273-8bbb-43b3-bbe1-c45a58f6ae21.lingce.ldm@alibaba-inc.com
Backpatch-through: 9.4
2019-11-01 22:38:32 +09:00
Peter Eisentraut
604bd36711 PG_FINALLY
This gives an alternative way of catching exceptions, for the common
case where the cleanup code is the same in the error and non-error
cases.  So instead of

    PG_TRY();
    {
        ... code that might throw ereport(ERROR) ...
    }
    PG_CATCH();
    {
        cleanup();
        PG_RE_THROW();
    }
    PG_END_TRY();
    cleanup();

one can write

    PG_TRY();
    {
        ... code that might throw ereport(ERROR) ...
    }
    PG_FINALLY();
    {
        cleanup();
    }
    PG_END_TRY();

Discussion: https://www.postgresql.org/message-id/flat/95a822c3-728b-af0e-d7e5-71890507ae0c%402ndquadrant.com
2019-11-01 11:18:03 +01:00
Peter Eisentraut
7302514088 Add const qualifiers to internal range type APIs
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/dc9b45fa-b950-fadc-4751-85d6f729df55%402ndquadrant.com
2019-10-31 07:48:21 +01:00
Michael Paquier
f921ea624e Fix typo in comment of syncrep.c
Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20191030.123428.18823202335157111.horikyota.ntt@gmail.com
2019-10-31 10:22:24 +09:00
Peter Eisentraut
c5e1df951d Remove one use of IDENT_USERNAME_MAX
IDENT_USERNAME_MAX is the maximum length of the information returned
by an ident server, per RFC 1413.  Using it as the buffer size in peer
authentication is inappropriate.  It was done here because of the
historical relationship between peer and ident authentication.  To
reduce confusion between the two authentication methods and disentangle
their code, use a dynamically allocated buffer instead.

Discussion: https://www.postgresql.org/message-id/flat/c798fba5-8b71-4f27-c78e-37714037ea31%402ndquadrant.com
2019-10-30 11:18:00 +01:00
Peter Eisentraut
5cc1e64fb6 Update code comments about peer authentication
For historical reasons, the functions for peer authentication were
grouped under ident authentication.  But they are really completely
separate, so give them their own section headings.
2019-10-30 09:13:39 +01:00
Michael Paquier
6ca86bb7e9 Fix typos in the code
Author: Vignesh C
Reviewed-by: Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/CALDaNm0ni+GAOe4+fbXiOxNrVudajMYmhJFtXGX-zBPoN8ixhw@mail.gmail.com
2019-10-30 10:03:00 +09:00
Michael Paquier
d80be6f2f6 Fix handling of pg_class.relispartition at swap phase in REINDEX CONCURRENTLY
When cancelling REINDEX CONCURRENTLY after swapping the old and new
indexes (for example interruption at step 5), the old index remains
around and is marked as invalid.  The old index should also be manually
droppable to clean up the parent relation from any invalid indexes still
remaining.  When reindexing a partition index, pg_class.relispartition was
not getting updated, causing the index to not be droppable, as DROP INDEX
would look for dependencies in a partition tree which do not exist
anymore after the swap phase is done.

The fix here is simple: when swapping the old and new indexes, make sure
that pg_class.relispartition is correctly switched, similarly to what is
done for the index name.

Reported-by: Justin Pryzby
Author: Michael Paquier
Discussion: https://postgr.es/m/20191015164047.GA22729@telsasoft.com
Backpatch-through: 12
2019-10-29 11:08:09 +09:00
Tom Lane
8b7a0f1d11 Allow extracting fields from a ROW() expression in more cases.
Teach get_expr_result_type() to manufacture a tuple descriptor directly
from a RowExpr node.  If the RowExpr has type RECORD, this is the only
way to get a tupdesc for its result, since even if the rowtype has been
blessed, we don't have its typmod available at this point.  (If the
RowExpr has some named composite type, we continue to let the existing
code handle it, since the RowExpr might well not have the correct column
names embedded in it.)

This fixes assorted corner cases illustrated by the added regression
tests.

Discussion: https://postgr.es/m/10872.1572202006@sss.pgh.pa.us
2019-10-28 15:08:24 -04:00
Tom Lane
bd1ef5799b Handle empty-string edge cases correctly in strpos().
Commit 9556aa01c rearranged the innards of text_position() in a way
that would make it not work for empty search strings.  Which is fine,
because all callers of that code special-case an empty pattern in
some way.  However, the primary use-case (text_position itself) got
special-cased incorrectly: historically it's returned 1 not 0 for
an empty search string.  Restore the historical behavior.

Per complaint from Austin Drenski (via Shay Rojansky).
Back-patch to v12 where it got broken.

Discussion: https://postgr.es/m/CADT4RqAz7oN4vkPir86Kg1_mQBmBxCp-L_=9vRpgSNPJf0KRkw@mail.gmail.com
2019-10-28 12:21:13 -04:00
Michael Paquier
68ac9cf249 Fix dependency handling at swap phase of REINDEX CONCURRENTLY
When swapping the dependencies of the old and new indexes, the code has
been correctly switching all links in pg_depend from the old to the new
index for both referencing and referenced entries.  However it forgot
the fact that the new index may itself have existing entries in
pg_depend, like references to the parent table attributes.  This
resulted in duplicated entries in pg_depend after running REINDEX
CONCURRENTLY.

Fix this problem by removing any existing entries in pg_depend on the
new index before switching the dependencies of the old index to the new
one.  More regression tests are added to check the consistency of
entries in pg_depend for indexes, including partition indexes.

Author: Michael Paquier
Discussion: https://postgr.es/m/20191025064318.GF8671@paquier.xyz
Backpatch-through: 12
2019-10-28 11:57:31 +09:00
Michael Paquier
51970fa8df Fix initialization of fake LSN for unlogged relations
9155580 changed the value of the first fake LSN for unlogged
relations from 1 to FirstNormalUnloggedLSN (aka 1000), since GiST requires a
non-zero LSN on some pages for its interlocking logic to work.  However, its
value was still initialized to 1 at the beginning of recovery or
after running pg_resetwal.  This fixes the initialization for both code
paths.

Author: Takayuki Tsunakawa
Reviewed-by: Dilip Kumar, Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/OSBPR01MB2503CE851940C17DE44AE3D9FE6F0@OSBPR01MB2503.jpnprd01.prod.outlook.com
Backpatch-through: 12
2019-10-27 13:54:12 +09:00
Peter Eisentraut
2fc2a88e67 Remove obsolete information schema tables
Remove SQL_LANGUAGES, which was eliminated in SQL:2008, and
SQL_PACKAGES and SQL_SIZING_PROFILES, which were eliminated in
SQL:2011.  Since they were dropped by the SQL standard, the
information in them was no longer updated and therefore no longer
useful.

This also removes the feature-package association information in
sql_feature_packages.txt, but for the time being we are keeping the
information about which features are in the Core package (that is, mandatory
SQL features).  Maybe at some point someone will want to invent a way to
store that information that does not involve using the "package" mechanism
anymore.

Discussion: https://www.postgresql.org/message-id/flat/91334220-7900-071b-9327-0c6ecd012017%402ndquadrant.com
2019-10-25 21:37:14 +02:00
Tom Lane
22f6f2c1cc Improve management of statement timeouts.
Commit f8e5f156b added private state in postgres.c to track whether
a statement timeout is running.  This seems like bad design to me;
timeout.c's private state should be the single source of truth about
that.  We already fixed one bug associated with failure to keep those
states in sync (cf. be42015fc), and I've got little faith that we
won't find more in future.  So get rid of postgres.c's local variable
by exposing a way to ask timeout.c whether a timeout is running.
(Obviously, such an inquiry is subject to race conditions, but it
seems fine for the purpose at hand.)

To make get_timeout_active() as cheap as possible, add a flag in
the per-timeout struct showing whether that timeout is active.
This allows some small savings elsewhere in timeout.c, mainly
elimination of unnecessary searches of the active_timeouts array.

While at it, fix enable_statement_timeout to not call disable_timeout
when statement_timeout is 0 and the timeout is not running.  This
avoids a useless deschedule-and-reschedule-timeouts cycle, which
represents a significant savings (at least one kernel call) when
there is any other active timeout.  Right now, there usually isn't,
but there are proposals around to change that.

Discussion: https://postgr.es/m/16035-456e6e69ebfd4374@postgresql.org
2019-10-25 11:41:16 -04:00
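
A hedged sketch of the caller-side shape this enables (not the committed postgres.c code):

    #include "utils/timeout.h"

    static void
    maybe_cancel_statement_timeout(void)
    {
        /* Only touch the timeout machinery if the timeout is running. */
        if (get_timeout_active(STATEMENT_TIMEOUT))
            disable_timeout(STATEMENT_TIMEOUT, false);
    }
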
Tom Lane
2b2bacdca0 Reset statement_timeout between queries of a multi-query string.
Historically, we started the timer (if StatementTimeout > 0) at the
beginning of a simple-Query message and usually let it run until the
end, so that the timeout limit applied to the entire query string,
and intra-string changes of the statement_timeout GUC had no effect.
But, confusingly, a COMMIT within the string would reset the state
and allow a fresh timeout cycle to start with the current setting.

Commit f8e5f156b changed the behavior of statement_timeout for extended
query protocol, and as an apparently-unintended side effect, a change in
the statement_timeout GUC during a multi-statement simple-Query message
might have an effect immediately --- but only if it was going from
"disabled" to "enabled".

This is all pretty confusing, not to mention completely undocumented.
Let's change things so that the timeout is always reset between queries
of a multi-query string, whether they're transaction control commands
or not.  Thus the active timeout setting is applied to each query in
the string, separately.  This costs a few more cycles if statement_timeout
is active, but it provides much more intuitive behavior, especially if one
changes statement_timeout in one of the queries of the string.

Also, add something to the documentation to explain all this.

Per bug #16035 from Raj Mohite.  Although this is a bug fix, I'm hesitant
to back-patch it; conceivably somebody has worked out the old behavior
and is depending on it.  (But note that this change should make the
behavior less restrictive in most cases, since the timeout will now
be applied to shorter segments of code.)

Discussion: https://postgr.es/m/16035-456e6e69ebfd4374@postgresql.org
2019-10-25 11:15:50 -04:00
Michael Paquier
8270a0d9a9 Handle interrupts within a transaction context in REINDEX CONCURRENTLY
Phases 2 (building the new index) and 3 (validating the new index)
checked for interrupts outside a transaction context, with the consequence
that session-level locks taken on the parent relation and on the old and new
indexes being processed were not released.  This could for example
be triggered with statement_timeout and bad timing, and would issue
confusing error messages when shutting down the session still holding
the locks (note that an assertion failure would be triggered first), on
top of more issues with concurrent sessions trying to take a lock that
would interfere with the SHARE UPDATE EXCLUSIVE locks held here.

This moves all the interruption checks inside a transaction context.
Note that I have manually tested all interruptions to make sure that
invalid indexes can be cleaned up properly.  Partition indexes still
have issues on their own with some missing dependency handling, which
will be dealt with in a follow-up patch.

Reported-by: Justin Pryzby
Author: Michael Paquier
Discussion: https://postgr.es/m/20191013025145.GC4475@telsasoft.com
Backpatch-through: 12
2019-10-25 10:20:08 +09:00
Fujii Masao
3b0c59ac1c Fix typo in xlog.c.
Author: Fujii Masao
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAHGQGwH7dtYvOZZ8c0AG5AJwH5pfiRdKaCptY1_RdHy0HYeRfQ@mail.gmail.com
2019-10-24 14:13:36 +09:00
Michael Paquier
5d3500da72 Acquire properly session-level lock on new index in REINDEX CONCURRENTLY
In the first transaction run for REINDEX CONCURRENTLY, a thinko in the
existing logic caused two session locks to be taken on the old index,
causing the session lock on the newly-created index to be missed.  This
made possible concurrent DDL commands (like ALTER INDEX) on the new
index while REINDEX CONCURRENTLY was processing from the point where the
first internal transaction committed.

This issue has been discovered while digging into another bug.

Author: Michael Paquier
Discussion: https://postgr.es/m/20191021074323.GB1869@paquier.xyz
Backpatch-through: 12
2019-10-23 15:04:48 +09:00
Michael Paquier
e3db3f829f Clean up properly error_context_stack in autovacuum worker on exception
Any callback set would have no meaning in the context of an exception.
As an autovacuum worker exits quickly in this context, this could be
only an issue within EmitErrorReport(), where the elog hook is for
example called.  That's unlikely to going to be a problem, but let's be
clean and consistent with other code paths handling exceptions.  This is
present since 2909419, which introduced autovacuum.

Author: Ashwin Agrawal
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CALfoeisM+_+dgmAdAOHAu0k-ZpEHHqSSG=GRf3pKJGm8OqWX0w@mail.gmail.com
Backpatch-through: 9.4
2019-10-23 10:25:06 +09:00
Peter Eisentraut
f86f46d091 Fix comment
The last argument of smgrextend() was renamed from isTemp to skipFsync
in debcec7dc3, but the comments at two
call sites were not updated.
2019-10-22 09:58:20 +02:00
Alexander Korotkov
52ad1e6599 Refactor jsonpath's compareDatetime()
This commit refactors some ridiculous coding in compareDatetime().  Also, it
provides correct cross-datatype comparison even when one of the values
overflows during the cast.  That eliminates the dilemma of whether we should
suppress overflow errors during the cast.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/32308.1569455803%40sss.pgh.pa.us
Discussion: https://postgr.es/m/a5629d0c-8162-7559-16aa-0c8390d6ba5f%40postgrespro.ru
Author: Nikita Glukhov, Alexander Korotkov
2019-10-21 23:07:07 +03:00
Alexander Korotkov
a6888fde7f Refactor timestamp2timestamptz_opt_error()
While casting from timestamp to timestamptz we do timestamp2tm() then
tm2timestamp().  This commit eliminates call to tm2timestamp().  Instead, it
directly applies timezone offset to the original timestamp value.  That makes
upcoming datetime overflow handling in jsonpath easier.  That should also save
us some CPU cycles.

Discussion: https://postgr.es/m/CAPpHfdvRPRh_mTGar5WmDeRZ%3DU5dOXHdxspYYD%3D76m3knNGjXA%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Tom Lane
2019-10-21 23:07:07 +03:00
Etsuro Fujita
80831bcdbe Update obsolete comment.
Commit b52b7dc25, which moved code creating PartitionBoundInfo in
RelationBuildPartitionDesc() in partcache.c (relocated to partdesc.c
afterwards) to partbounds.c, should have updated this, but didn't.

Author: Etsuro Fujita
Reviewed-by: Alvaro Herrera
Backpatch-through: 12
Discussion: https://postgr.es/m/CAPmGK16Uxr%3DPatiGyaRwiQVLB7Y-GqbkK3AxRLVYzU0Czv%3DsEw%40mail.gmail.com
2019-10-21 17:30:00 +09:00
Amit Kapila
70a6c37d52 Fix memory leak introduced in commit 7df159a620.
We memorize all internal and empty leaf pages in the 1st vacuum stage for
gist indexes.  They are used in the 2nd stage, to delete all the empty
pages.  There was a memory context page_set_context for this purpose, but
we never used it.

Reported-by: Amit Kapila
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Backpatch-through: 12, where it got introduced
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
2019-10-21 08:57:32 +05:30
Peter Eisentraut
5d3587d14b Fix most -Wundef warnings
In some cases #if was used instead of #ifdef in an inconsistent style.
Cleaning this up also helps when analyzing cases like
38d8dce61f where this makes a
difference.

There are no behavior changes here, but the change in pg_bswap.h would
prevent possible accidental misuse by third-party code.

Discussion: https://www.postgresql.org/message-id/flat/3b615ca5-c595-3f1d-fdf7-a429e564f614%402ndquadrant.com
2019-10-19 18:31:38 +02:00
Noah Misch
48cc59ed24 Use standard compare_exchange loop style in ProcArrayGroupClearXid().
Besides style, this might improve performance in the contended case.

Reviewed by Amit Kapila.

Discussion: https://postgr.es/m/20191015035348.GA4166224@rfd.leadboat.com
2019-10-18 20:21:10 -07:00
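
For reference, the standard compare-exchange loop style looks roughly like this sketch using the port/atomics.h API (not the ProcArrayGroupClearXid() code itself):

    #include "postgres.h"
    #include "port/atomics.h"

    static void
    atomic_set_flag_bits(pg_atomic_uint32 *v, uint32 flags)
    {
        uint32      oldval = pg_atomic_read_u32(v);

        for (;;)
        {
            uint32      newval = oldval | flags;

            /* On failure, oldval is refreshed with the current value. */
            if (pg_atomic_compare_exchange_u32(v, &oldval, newval))
                break;
        }
    }
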
Michael Paquier
f25968c496 Remove last traces of heap_open/close in the tree
Since pluggable storage has been introduced, those two routines have
been replaced by table_open/close, with some compatibility macros still
present to allow extensions to compile correctly with v12.

Some code paths using the old routines still remained, so replace them.
Based on the discussion done, the consensus reached is that it is better
to remove those compatibility macros so as nothing new uses the old
routines, so remove also the compatibility macros.

Discussion: https://postgr.es/m/20191017014706.GF5605@paquier.xyz
2019-10-19 11:18:15 +09:00
Fujii Masao
ec1259e880 Fix failure of archive recovery with recovery_min_apply_delay enabled.
The recovery_min_apply_delay parameter is intended for use with streaming
replication deployments. However, the documentation clearly explains that
the parameter will be honored in all cases if it's specified. So it should
take effect even in archive recovery. But, previously, archive recovery
with recovery_min_apply_delay enabled always failed, and caused an assertion
failure if --enable-cassert was used.

The cause of this problem is that the ownership of recoveryWakeupLatch,
which recovery_min_apply_delay uses, was taken only when standby mode
is requested. So an unowned latch could be used in archive recovery,
which caused the failure.

This commit changes the recovery code so that the ownership of
recoveryWakeupLatch is taken even in archive recovery, which prevents
archive recovery with recovery_min_apply_delay from failing.

Back-patch to v9.4 where recovery_min_apply_delay was added.

Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com
2019-10-18 22:32:18 +09:00
Fujii Masao
9b95a36be8 Make crash recovery ignore recovery_min_apply_delay setting.
In v11 or before, this setting could not take effect in crash recovery
because it's specified in recovery.conf and crash recovery always
starts without recovery.conf. But commit 2dedf4d9a8 integrated
recovery.conf into postgresql.conf, which unexpectedly allowed
this setting to take effect even in crash recovery. This is definitely
not good behavior.

To fix the issue, this commit makes crash recovery always ignore
recovery_min_apply_delay setting.

Back-patch to v12 where the issue was added.

Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAHGQGwEyD6HdZLfdWc+95g=VQFPR4zQL4n+yHxQgGEGjaSVheQ@mail.gmail.com
Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
2019-10-18 22:24:18 +09:00
Alvaro Herrera
89403ed228 Fix typo
Apparently while this code was being developed,
ReindexRelationConcurrently operated on multiple relations.  The version
that was ultimately pushed doesn't, so this comment's use of plural is
inaccurate.
2019-10-18 14:49:39 +02:00
Alvaro Herrera
d2efb90dba Update comments about progress reporting by index_drop
Michaël Paquier complained that index_drop is requesting progress
reporting for non-obvious reasons, so let's add a comment to explain
why.

Discussion: https://postgr.es/m/20191017010412.GH2602@paquier.xyz
2019-10-18 07:23:05 -03:00
Michael Paquier
3f60f690fa Fix timeout handling in logical replication worker
The timestamp tracking the last moment a message is received in a
logical replication worker was initialized in each iteration of the loop
checking if a message was received or not, causing wal_receiver_timeout to be
ignored in basically any logical replication deployment.  This also broke the
ping sent to the server when reaching half of wal_receiver_timeout.

This simply moves the initialization of the timestamp out of the apply
loop to the beginning of LogicalRepApplyLoop().

Reported-by: Jehan-Guillaume De Rorthais
Author: Julien Rouhaud
Discussion: https://postgr.es/m/CAOBaU_ZHESFcWva8jLjtZdCLspMj7vqaB2k++rjHLY897ZxbYw@mail.gmail.com
Backpatch-through: 10
2019-10-18 14:26:29 +09:00
Alvaro Herrera
38ddeab13b Fix minor bug in logical-replication walsender shutdown
The logical walsender should exit when it catches up with sending WAL during
shutdown; but there was a rare corner case in which it failed to do so,
because of a race condition that put it back to waiting for more WAL
instead -- and since there wasn't any, it would not shut down immediately.
It would only continue the shutdown when wal_sender_timeout terminated the
sleep, causing annoying waits during the shutdown procedure.  Restructure the
code so that we no longer forget to set WalSndCaughtUp in that case.

This was an oversight in commit c6c333436.

Backpatch all the way down to 9.4.

Author: Craig Ringer, Álvaro Herrera
Discussion: https://postgr.es/m/CAMsr+YEuz4XwZX_QmnX_-2530XhyAmnK=zCmicEnq1vLr0aZ-g@mail.gmail.com
2019-10-17 15:06:06 +02:00
Thomas Munro
3c8c55dd54 When restoring GUCs in parallel workers, show an error context.
Otherwise it can be hard to see where an error is coming from, when
the parallel worker sets all the GUCs that it received from the
leader.  Bug #15726.  Back-patch to 9.5, where RestoreGUCState()
appeared.

Reported-by: Tiago Anastacio
Reviewed-by: Daniel Gustafsson, Tom Lane
Discussion: https://postgr.es/m/15726-6d67e4fa14f027b3%40postgresql.org
2019-10-17 13:47:01 +13:00
Thomas Munro
6bda2af039 Fix bug that could try to freeze running multixacts.
Commits 801c2dc7 and 801c2dc7 made it possible for vacuum to
try to freeze a multixact that is still running.  That was
prevented by a check, but raised an error.  Repair.

Back-patch all the way.

Author: Nathan Bossart, Jeremy Schneider
Reported-by: Jeremy Schneider
Reviewed-by: Jim Nasby, Thomas Munro
Discussion: https://postgr.es/m/DAFB8AFF-2F05-4E33-AD7F-FF8B0F760C17%40amazon.com
2019-10-17 09:59:21 +13:00
Alvaro Herrera
0d21f919eb Fix crash when reporting CREATE INDEX progress
A race condition can make us try to dereference a NULL pointer to the
PGPROC struct of a process that's already finished.  That results in
crashes during REINDEX CONCURRENTLY and CREATE INDEX CONCURRENTLY.

This was introduced in ab0dfc961b, so backpatch to pg12.

Reported by: Justin Pryzby
Reviewed-by: Michaël Paquier
Discussion: https://postgr.es/m/20191012004446.GT10470@telsasoft.com
2019-10-16 14:51:34 +02:00
Michael Paquier
1de4fd1092 Refresh some incorrect links in pg_crc.c/h
Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm0LPk9vTGTBPBRv0=fX=94o4r6-DuBbHNeCN2AH5bufLw@mail.gmail.com
2019-10-16 15:10:14 +09:00
Thomas Munro
d5ac14f9cc Use libc version as a collation version on glibc systems.
Using glibc's version string to detect potential collation definition
changes is not 100% reliable, but it's better than nothing.  Currently
this affects only collations explicitly provided by "libc".  More work
will be needed to handle the default collation.

Author: Thomas Munro, based on a suggestion from Christoph Berg
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/4b76c6d4-ae5e-0dc6-7d0d-b5c796a07e34%402ndquadrant.com
2019-10-16 17:28:24 +13:00
Andres Freund
cef82eda14 Fix CLUSTER on expression indexes.
Since the introduction of different slot types, in 1a0586de36, we
create a virtual slot in tuplesort_begin_cluster(). While that looks
right, it unfortunately doesn't actually work, as ExecStoreHeapTuple()
is used to store tuples in the slot. Unfortunately no regression tests
for CLUSTER on expression indexes existed so far.

Fix the slot type, and add bare bones tests for CLUSTER on expression
indexes.

Reported-By: Justin Pryzby
Author: Andres Freund
Discussion: https://postgr.es/m/20191011210320.GS10470@telsasoft.com
Backpatch: 12, like 1a0586de36
2019-10-15 10:40:13 -07:00
Peter Eisentraut
bdb839cbde Update unicode.org URLs
Use https, consistent host name, remove references to ftp.  Also
update the URLs for CLDR, which has moved from Trac to GitHub.
2019-10-13 22:10:38 +02:00
Tom Lane
9abb2bfc04 In the postmaster, rely on the signal infrastructure to block signals.
POSIX sigaction(2) can be told to block a set of signals while a
signal handler executes.  Make use of that instead of manually
blocking and unblocking signals in the postmaster's signal handlers.
This should save a few cycles, and it also prevents recursive
invocation of signal handlers when many signals arrive in close
succession.  We have seen buildfarm failures that seem to be due to
postmaster stack overflow caused by such recursion (exacerbated by
a Linux PPC64 kernel bug).

This doesn't change anything about the way that it works on Windows.
Somebody might consider adjusting port/win32/signal.c to let it work
similarly, but I'm not in a position to do that.

For the moment, just apply to HEAD.  Possibly we should consider
back-patching this, but it'd be good to let it age awhile first.

Discussion: https://postgr.es/m/14878.1570820201@sss.pgh.pa.us
2019-10-13 15:48:26 -04:00
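
A small POSIX sketch of the mechanism described above (illustrative only, not the postmaster's actual handler setup):

    #include <signal.h>

    static void
    install_handler(int signo, void (*handler) (int))
    {
        struct sigaction act;

        act.sa_handler = handler;
        sigfillset(&act.sa_mask);   /* block other signals while the handler runs */
        act.sa_flags = SA_RESTART;
        sigaction(signo, &act, NULL);
    }
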
Michael Paquier
1df5875d39 Fix dependency handling of column drop with partitioned tables
When dropping a column on a partitioned table which has one or more
partitioned indexes, the operation was failing as dependencies with
partitioned indexes using the column dropped were not getting removed in
a way consistent with the columns involved across all the relations part
of an inheritance tree.

This commit refactors the code executing the column drop so that all the
columns to remove across an inheritance tree are gathered first, and
dropped all at the end.  This way, we let the dependency machinery sort
out by itself the deletion of all the columns with the partitioned
indexes across a partition tree.

This issue has been introduced by 1d92a0c, so backpatch down to
REL_12_STABLE.

Author: Amit Langote, Michael Paquier
Reviewed-by: Álvaro Herrera, Ashutosh Sharma
Discussion: https://postgr.es/m/CA+HiwqE9kuBsZ3b5pob2-cvE8ofzPWs-og+g8bKKGnu6b4-yTQ@mail.gmail.com
Backpatch-through: 12
2019-10-13 17:51:55 +09:00
Peter Eisentraut
b4675a8ae2 Fix use of term "verifier"
Within the context of SCRAM, "verifier" has a specific meaning in the
protocol, per RFCs.  The existing code used "verifier" differently, to
mean whatever is or would be stored in pg_authid.rolpassword.

Fix this by using the term "secret" for this, following RFC 5803.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/be397b06-6e4b-ba71-c7fb-54cae84a7e18%402ndquadrant.com
2019-10-12 21:41:59 +02:00
Fujii Masao
20961ceaf0 Make crash recovery ignore restore_command and recovery_end_command settings.
In v11 or before, those settings could not take effect in crash recovery
because they are specified in recovery.conf and crash recovery always
starts without recovery.conf. But commit 2dedf4d9a8 integrated
recovery.conf into postgresql.conf, which unexpectedly allowed
those settings to take effect even in crash recovery. This is definitely
not good behavior.

To fix the issue, this commit makes crash recovery always ignore
restore_command and recovery_end_command settings.

Back-patch to v12 where the issue was added.

Author: Fujii Masao
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
2019-10-11 15:47:59 +09:00
Andres Freund
93765bd956 Fix table rewrites that include a column without a default.
In c2fe139c20 I made ATRewriteTable() use tuple slots. Unfortunately
I did not notice that columns can be added in a rewrite that do not
have a default, when another column is added/altered requiring one.

Initialize columns to NULL again, and add tests.

Bug: #16038
Reported-By: anonymous
Author: Andres Freund
Discussion: https://postgr.es/m/16038-5c974541f2bf6749@postgresql.org
Backpatch: 12, where the bug was introduced in c2fe139c20
2019-10-09 22:00:50 -07:00
Peter Eisentraut
50518ec296 Revert "Use libc version as a collation version on glibc systems."
This reverts commit 9f90b1d08d.

This needs some refinements in the pg_dump and pg_upgrade tests.
2019-10-09 21:36:01 +02:00
Peter Eisentraut
9f90b1d08d Use libc version as a collation version on glibc systems.
Using glibc's version number to detect potential collation definition
changes is not 100% reliable, but it's better than nothing.

Author: Thomas Munro
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/4b76c6d4-ae5e-0dc6-7d0d-b5c796a07e34%402ndquadrant.com
2019-10-09 21:17:47 +02:00
Michael Paquier
b8e19b932a Flush logical mapping files with fd opened for read/write at checkpoint
The file descriptor was opened read-only to fsync a regular file,
which would cause EBADFD errors on some platforms.

This is similar to the recent fix done by a586cc4b (which was broken by
me with 82a5649), except that I noticed this issue while monitoring the
backend code for similar mistakes.  Backpatch to 9.4, as this has been
an issue since logical decoding was introduced in b89e151.

Author: Michael Paquier
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/20191006045548.GA14532@paquier.xyz
Backpatch-through: 9.4
2019-10-09 13:30:43 +09:00
Peter Eisentraut
38d8dce61f Remove some code for old unsupported versions of MSVC
As of d9dd406fe2, we require MSVC 2013,
which means _MSC_VER >= 1800.  This means that conditionals about
older versions of _MSC_VER can be removed or simplified.

The previous code was also, in some cases, incorrectly handling MinGW
(where _MSC_VER is not defined at all), such as in pg_ctl.c and
win32_port.h, leading to some compiler warnings.  This should now be
handled better.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
2019-10-08 10:50:54 +02:00
Michael Paquier
a7471bd85c Update some outdated links about XLC and UNIX specification
Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm3Dy=dTdx8UCVw=DWbzLzmRUC1dkq45=heOZDUg3U_PtA@mail.gmail.com
2019-10-08 14:31:30 +09:00
Tom Lane
3887e9455f Check for too many postmaster children before spawning a bgworker.
The postmaster's code path for spawning a bgworker neglected to check
whether we already have the max number of live child processes.  That's
a bit hard to hit, since it would necessarily be a transient condition;
but if we do, AssignPostmasterChildSlot() fails causing a postmaster
crash, as seen in a report from Bhargav Kamineni.

To fix, invoke canAcceptConnections() in the bgworker code path, as we
do in the other code paths that spawn children.  Since we don't want
the same pmState tests in this case, add a child-process-type parameter
to canAcceptConnections() so that it can know what to do.

Back-patch to 9.5.  In principle the same hazard exists in 9.4, but the
code is enough different that this patch wouldn't quite fix it there.
Given the tiny usage of bgworkers in that branch it doesn't seem worth
creating a variant patch for it.

Discussion: https://postgr.es/m/18733.1570382257@sss.pgh.pa.us
2019-10-07 12:39:09 -04:00
Tom Lane
ac12ab06a9 Avoid trying to release a List's initial allocation via repalloc().
Commit 1cff1b95a included some code that supposed it could repalloc()
a memory chunk to a smaller size without risk of the chunk moving.
That was not a great idea, because it depended on undocumented behavior
of AllocSetRealloc, which commit c477f3e44 changed thereby breaking it.
(Not to mention that this code ought to work with other memory context
types, which might not work the same...)  So get rid of the repalloc
calls, and instead just wipe the now-unused ListCell array and/or tell
Valgrind it's NOACCESS, as if we'd freed it.

In cases where the initial list allocation had been quite large, this
could represent an annoying waste of space.  In principle we could
ameliorate that by allocating the initial cell array separately when
it exceeds some threshold.  But that would complicate new_list() which
is hot code, and the returns would materialize only in narrow cases.
On balance I don't think it'd be worth it.

Discussion: https://postgr.es/m/17059.1570208426@sss.pgh.pa.us
2019-10-06 12:06:30 -04:00
Tomas Vondra
36425ece5d Change MemoryContextMemAllocated to return Size
Commit f2369bc610 switched most of the memory accounting from int64 to
Size, but it forgot to change the MemoryContextMemAllocated return type.
So this fixes that omission.

Discussion: https://www.postgresql.org/message-id/11238.1570200198%40sss.pgh.pa.us
2019-10-05 20:49:39 +02:00
Andres Freund
d986d4e87f Fix crash caused by EPQ happening with a before update trigger present.
When ExecBRUpdateTriggers()'s GetTupleForTrigger() follows an EPQ
chain the former needs to run the result tuple through the junkfilter
again, and update the slot containing the new version of the tuple to
contain that new version. The input tuple may already be in the
junkfilter's output slot, which used to be OK - we don't need the
previous version anymore. Unfortunately ff11e7f4b9 started to use
ExecCopySlot() to update newslot, and ExecCopySlot() doesn't support
copying a slot into itself, leading to a slot in a corrupt
state, which then can cause crashes or other symptoms.

Fix this by skipping the ExecCopySlot() when copying into itself.

While we could have easily made ExecCopySlot() handle that case, it
seems better to instead add an assert forbidding it. As the goal
of copying might be to make the contents of one slot independent of
another, handling it silently seems failure-prone.

A follow-up commit will add tests for the obviously under-covered
combination of EPQ and triggers. Done as a separate commit as it might
make sense to backpatch them further than this bug.

Also clean up some confusing variable names for slots in
ExecBRDeleteTriggers() and ExecBRUpdateTriggers().

Bug: #16036
Reported-By: Антон Власов
Author: Andres Freund
Discussion: https://postgr.es/m/16036-28184c90d952fb7f@postgresql.org
Backpatch: 12-, where ff11e7f4b9 was merged
2019-10-04 13:50:49 -07:00
Andres Freund
a586cc4b6c Use a fd opened for read/write when syncing slots during startup, take 2.
Cribbing from dfbaed4597:
    Some operating systems, including the reporter's windows, return EBADFD
    or similar when fsync() is invoked on a O_RDONLY file descriptor.
    Unfortunately RestoreSlotFromDisk() does exactly that; which causes
    failures after restarts in at least some scenarios.

    If you hit the bug the error message will be something like
    ERROR: could not fsync file "pg_replslot/$name/state": Bad file descriptor

    Simply use O_RDWR instead of O_RDONLY when opening the relevant file
    descriptor to fix the bug.

Unfortunately this fix was undone in 82a5649fb9. Re-apply, and add a
comment.

Bug: #16039
Reported-By: Hans Buschmann
Author: Andres Freund
Discussion: https://postgr.es/m/16039-196fc97cc05e141c@postgresql.org
Backpatch: 12-, as 82a5649fb9
2019-10-04 13:34:28 -07:00
Robert Haas
2e8b6bfa90 Rename some toasting functions based on whether they are heap-specific.
The old names for the attribute-detoasting functions included
the word "heap," which seems outdated now that the heap is only one of
potentially many table access methods.

On the other hand, toast_insert_or_update and toast_delete are
heap-specific, so rename them by adding "heap_" as a prefix.

Not all of the work of making the TOAST system fully accessible to AMs
other than the heap is done yet, but there seems to be little harm in
getting this renaming out of the way now. Commit
8b94dab066 already divided up the
functions among various files partially according to whether it was
intended that they should be heap-specific or AM-agnostic, so this is
just clarifying the division contemplated by that commit.

Patch by me, reviewed and tested by Prabhat Sabu, Thomas Munro,
Andres Freund, and Álvaro Herrera.

Discussion: http://postgr.es/m/CA+TgmoZv-=2iWM4jcw5ZhJeL18HF96+W1yJeYrnGMYdkFFnEpQ@mail.gmail.com
2019-10-04 14:24:46 -04:00
Tom Lane
61aa9f544a Fix bitshiftright()'s zero-padding some more.
Commit 5ac0d9360 failed to entirely fix bitshiftright's habit of
leaving one-bits in the pad space that should be all zeroes,
because in a moment of sheer brain fade I'd concluded that only
the code path used for not-a-multiple-of-8 shift distances needed
to be fixed.  Of course, a multiple-of-8 shift distance can also
cause the problem, so we need to forcibly zero the extra bits
in both cases.

Per bug #16037 from Alexander Lakhin.  As before, back-patch to all
supported branches.

Discussion: https://postgr.es/m/16037-1d1ebca564db54f4@postgresql.org
2019-10-04 10:34:40 -04:00
Tomas Vondra
f2369bc610 Use Size instead of int64 to track allocated memory
Commit 5dd7fc1519 added block-level memory accounting, but used an int64
variable to track the amount of allocated memory. That is incorrect, because
we have Size for exactly these purposes. It was mostly harmless until
c477f3e449, which changed how we handle repalloc() when downsizing a chunk.
Previously we ignored these cases and just kept using the original chunk, but
now we need to update the accounting, and the code was doing this:

    context->mem_allocated += blksize - oldblksize;

Both blksize and oldblksize are Size (so unsigned), which means the subtraction
underflows, producing a very high positive value. On 64-bit platforms (where
Size has the same width as mem_allocated) this happens to work because the
result wraps to the right value, but on (some) 32-bit platforms it fails.

This commit fixes two things: it changes mem_allocated (and related variables)
to Size, and it splits the update into two separate steps, to prevent any
underflow.
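
The arithmetic can be reproduced outside the server. A minimal standalone
sketch (hypothetical block sizes; here Size is just a typedef for size_t,
not PostgreSQL's actual definition):

    #include <stdio.h>
    #include <stddef.h>

    typedef size_t Size;            /* stand-in for PostgreSQL's Size */

    int main(void)
    {
        Size oldblksize = 8192;     /* original block size */
        Size blksize = 1024;        /* block downsized by repalloc() */
        Size mem_allocated = 8192;  /* accounting counter */

        /* Buggy single-step update: the unsigned subtraction underflows. */
        Size delta = blksize - oldblksize;
        printf("blksize - oldblksize = %zu (underflowed)\n", delta);

        /* Fixed two-step update: no intermediate negative value is needed. */
        mem_allocated -= oldblksize;
        mem_allocated += blksize;
        printf("mem_allocated = %zu\n", mem_allocated);   /* prints 1024 */
        return 0;
    }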

Discussion: https://www.postgresql.org/message-id/15151.1570163761%40sss.pgh.pa.us
2019-10-04 16:10:56 +02:00
Robert Haas
967e276e9f Remove AtSubStart_Notify.
Allocate notify-related state lazily instead. This makes trivial
subtransactions noticeably faster.

Patch by me, reviewed and tested by Dilip Kumar, Kyotaro Horiguchi,
and Jeevan Ladhe.

Discussion: https://postgr.es/m/CA+TgmobE1J22S1eC-6N-je9LgrcwZypkwp+zH6JXo9mc=4Nk3A@mail.gmail.com
2019-10-04 08:19:25 -04:00
Tom Lane
8e10405c74 Avoid unnecessary out-of-memory errors during encoding conversion.
Encoding conversion uses the very simplistic rule that the output
can't be more than 4X longer than the input, and palloc's a buffer
of that size.  This results in failure to convert any string longer
than 1/4 GB, which is becoming an annoying limitation.

As a band-aid to improve matters, allow the allocated output buffer
size to exceed 1GB.  We still insist that the final result fit into
MaxAllocSize (1GB), though.  Perhaps it'd be safe to relax that
restriction, but it'd require close analysis of all callers, which
is daunting (not least because external modules might call these
functions).  For the moment, this should allow a 2X to 4X improvement
in the longest string we can convert, which is a useful gain in
return for quite a simple patch.

Also, once we have successfully converted a long string, repalloc
the output down to the actual string length, returning the excess
to the malloc pool.  This seems worth doing since we can usually
expect to give back several MB if we take this path at all.

This still leaves much to be desired, most notably that the assumption
that MAX_CONVERSION_GROWTH == 4 is very fragile, and yet we have no
guard code verifying that the output buffer isn't overrun.  Fixing
that would require significant changes in the encoding conversion
APIs, so it'll have to wait for some other day.

The present patch seems safely back-patchable, so patch all supported
branches.

Alvaro Herrera and Tom Lane

Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql
Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us
2019-10-03 17:34:25 -04:00
Tom Lane
c477f3e449 Allow repalloc() to give back space when a large chunk is downsized.
Up to now, if you resized a large (>8K) palloc chunk down to a smaller
size, aset.c made no attempt to return any space to the malloc pool.
That's unpleasant if a really large allocation is resized to a
significantly smaller size.  I think no such cases existed when this
code was designed, and I'm not sure whether they're common even yet,
but an upcoming fix to encoding conversion will certainly create such
cases.  Therefore, fix AllocSetRealloc so that it gives realloc()
a chance to do something with the block.  This doesn't noticeably
increase complexity, we mostly just have to change the order in which
the cases are considered.

Back-patch to all supported branches.

Discussion: https://postgr.es/m/20190816181418.GA898@alvherre.pgsql
Discussion: https://postgr.es/m/3614.1569359690@sss.pgh.pa.us
2019-10-03 13:56:26 -04:00
Andrew Gierth
b7a1c5539a Selectively include window frames in expression walks/mutates.
query_tree_walker and query_tree_mutator were skipping the
windowClause of the query, without regard for the fact that the
startOffset and endOffset in a WindowClause node are expression trees
that need to be processed. This was an oversight in commit ec4be2ee6
from 2010 which added the expression fields; the main symptom is that
function parameters in window frame clauses don't work in inlined
functions.

Fix (as conservatively as possible since this needs to not break
existing out-of-tree callers) and add tests.

Backpatch all the way, since this has been broken since 9.0.

Per report from Alastair McKinley; fix by me with kibitzing and review
from Tom Lane.

Discussion: https://postgr.es/m/DB6PR0202MB2904E7FDDA9D81504D1E8C68E3800@DB6PR0202MB2904.eurprd02.prod.outlook.com
2019-10-03 10:54:52 +01:00
Michael Paquier
df86e52cac Remove temporary WAL and history files at the end of archive recovery
cbc55da has reworked the order of some actions at the end of archive
recovery.  Unfortunately this overlooked the fact that the startup
process needs to remove RECOVERYXLOG (for temporary WAL segment newly
recovered from archives) and RECOVERYHISTORY (for temporary history
file) at this step, leaving the files around even after recovery ended.

Backpatch to 9.5, like the previous commit.

Author: Sawada Masahiko
Reviewed-by: Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/CAD21AoBO_eDQub6zojFnWtnmutRBWvYf7=cW4Hsqj+U_R26w3Q@mail.gmail.com
Backpatch-through: 9.5
2019-10-02 15:53:07 +09:00
Michael Paquier
9555cc8d2b Revert hooks for session start and end, take two
The location of the session end hook was chosen so that modules can do
their own transactions; however, any attempt to use any subsystem which
went through before_shmem_exit() would cause issues, limiting the
pluggability of the hook.

Per discussion with Tom Lane and Andres Freund.

Discussion: https://postgr.es/m/18722.1569906636@sss.pgh.pa.us
2019-10-02 09:55:27 +09:00
Tomas Vondra
fa2fe04bf1 Mark two variables in aset.c with PG_USED_FOR_ASSERTS_ONLY
This fixes two compiler warnings about unused variables in non-assert builds,
introduced by 5dd7fc1519.
2019-10-01 14:39:06 +02:00
Tomas Vondra
11a078cf87 Optimize partial TOAST decompression
Commit 4d0e994eed added support for partial TOAST decompression, so the
decompression is interrupted after producing the requested prefix. For
prefixes and slices near the beginning of the entry, this may save a lot
of decompression work.

That however only deals with decompression - the whole compressed entry
was still fetched and re-assembled, even though the decompression used
only a small fraction of it. This commit improves that by computing how
much compressed data may be needed to decompress the requested prefix,
and then fetching only the necessary part.

We always need to fetch a bit more compressed data than the requested
(uncompressed) prefix, because the prefix may not be compressible at all
and pglz itself adds a bit of overhead. That means this optimization is
most effective when the requested prefix is much smaller than the whole
compressed entry.
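
A minimal sketch of the fetch-size computation described above (the overhead
constant and function name are hypothetical, for illustration only; the real
bound comes from pglz):

    #include <stdio.h>
    #include <stddef.h>

    /* How many compressed bytes might be needed to decompress the first
     * 'prefix' uncompressed bytes?  The prefix may be incompressible and
     * the compressor adds a small per-datum overhead, so fetch a little
     * more, capped at the full compressed size. */
    #define COMPRESSOR_OVERHEAD 4          /* assumed constant */

    static size_t compressed_bytes_needed(size_t prefix, size_t compressed_size)
    {
        size_t need = prefix + COMPRESSOR_OVERHEAD;
        return need < compressed_size ? need : compressed_size;
    }

    int main(void)
    {
        printf("%zu\n", compressed_bytes_needed(100, 100000));  /* 104 */
        printf("%zu\n", compressed_bytes_needed(100, 50));      /* 50 */
        return 0;
    }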

Author: Binguo Bao
Reviewed-by: Andrey Borodin, Tomas Vondra, Paul Ramsey
Discussion: https://www.postgresql.org/message-id/flat/CAL-OGkthU9Gs7TZchf5OWaL-Gsi=hXqufTxKv9qpNG73d5na_g@mail.gmail.com
2019-10-01 14:28:28 +02:00
Michael Paquier
e788bd924c Add hooks for session start and session end, take two
These hooks can be used in loadable modules.  A simple test module is
included.

The first attempt was done with cd8ce3a but we lacked handling for
NO_INSTALLCHECK in the MSVC scripts (problem solved afterwards by
431f1599) so the buildfarm got angry.  This also fixes a couple of
issues noticed upon review compared to the first attempt, so the code
has slightly changed, resulting in a simpler test module.

Author: Fabrízio de Royes Mello, Yugo Nagata
Reviewed-by: Andrew Dunstan, Michael Paquier, Aleksandr Parfenov
Discussion: https://postgr.es/m/20170720204733.40f2b7eb.nagata@sraoss.co.jp
Discussion: https://postgr.es/m/20190823042602.GB5275@paquier.xyz
2019-10-01 12:15:25 +09:00
Tomas Vondra
5dd7fc1519 Add transparent block-level memory accounting
Adds accounting of memory allocated in a memory context. Compared to
various ad hoc solutions, the main advantage is that the accounting is
transparent and does not require direct control over allocations (this
matters for use cases where the allocations happen in user code, like
for example aggregate states allocated in transition functions).

To reduce overhead, the accounting happens at the block level (not for
individual chunks) and only the context immediately owning the block is
updated. When inquiring about amount of memory allocated in a context,
we have to recursively walk all children contexts.

This "lazy" accounting works well for cases with relatively small number
of contexts in the relevant subtree and/or with infrequent inquiries.
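
A minimal standalone sketch of this lazy scheme (hypothetical struct layout,
not PostgreSQL's MemoryContextData):

    #include <stdio.h>
    #include <stddef.h>
    #include <stdbool.h>

    typedef struct Ctx
    {
        size_t      mem_allocated;   /* blocks owned directly by this context */
        struct Ctx *firstchild;
        struct Ctx *nextsibling;
    } Ctx;

    /* Called when a block is allocated or freed for 'ctx'; only this
     * context is updated, parents are not touched (the "lazy" part). */
    static void account_block(Ctx *ctx, size_t blksize, bool alloc)
    {
        if (alloc)
            ctx->mem_allocated += blksize;
        else
            ctx->mem_allocated -= blksize;
    }

    /* Inquiring about a context has to walk its whole subtree. */
    static size_t mem_allocated_recurse(const Ctx *ctx)
    {
        size_t total = ctx->mem_allocated;
        for (const Ctx *c = ctx->firstchild; c != NULL; c = c->nextsibling)
            total += mem_allocated_recurse(c);
        return total;
    }

    int main(void)
    {
        Ctx child = {0, NULL, NULL};
        Ctx top = {0, &child, NULL};
        account_block(&top, 8192, true);
        account_block(&child, 4096, true);
        printf("total = %zu\n", mem_allocated_recurse(&top));  /* 12288 */
        return 0;
    }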

Author: Jeff Davis
Reviewed-by: Tomas Vondra, Melanie Plageman, Soumyadeep Chakraborty
Discussion: https://www.postgresql.org/message-id/flat/027a129b8525601c6a680d27ce3a7172dab61aab.camel@j-davis.com
2019-10-01 03:13:39 +02:00
Andres Freund
36d22dd95b Don't generate EEOP_*_FETCHSOME operations for slots known to be virtual.
That avoids unnecessary work during both interpreted execution and
JIT-compiled expression evaluation. Both benefit from fewer expression
steps needing to be processed, and for interpreted execution there now is
a fastpath dedicated to just fetching a value from a virtual
slot. That's e.g. beneficial for hashjoins over nodes that perform
projections, as the hashed columns are currently fetched individually.

Author: Soumyadeep Chakraborty, Andres Freund
Discussion: https://postgr.es/m/CAE-ML+9OKSN71+mHtfMD-L24oDp8dGTfaVjDU6U+j+FNAW5kRQ@mail.gmail.com
2019-09-30 16:06:16 -07:00
Andres Freund
34c9c53bb0 Reduce code duplication for ExecJust*Var operations.
This is mainly in preparation for adding further fastpath evaluation
routines.

Also reorder ExecJust*Var functions to be consistent with the order in
which they're used.

Author: Andres Freund
Discussion: https://postgr.es/m/CAE-ML+9OKSN71+mHtfMD-L24oDp8dGTfaVjDU6U+j+FNAW5kRQ@mail.gmail.com
2019-09-30 15:32:00 -07:00
Fujii Masao
7acf8a876b Make crash recovery ignore recovery target settings.
In v11 or before, recovery target settings could not take effect in
crash recovery because they are specified in recovery.conf and
crash recovery always starts without recovery.conf. But commit
2dedf4d9a8 integrated recovery.conf into postgresql.conf, which
unexpectedly allowed recovery target settings to take effect
even in crash recovery. This is definitely not good behavior.

To fix the issue, this commit makes crash recovery always ignore
recovery target settings.

Back-patch to v12.

Author: Peter Eisentraut
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/e445616d-023e-a268-8aa1-67b8b335340c@pgmasters.net
2019-09-30 10:18:15 +09:00
Andres Freund
ac88807f9b jit: Re-allow JIT compilation of execGrouping.c hashtable comparisons.
In the course of 5567d12ce0, 356687bd8 and 317ffdfeaa, I changed
BuildTupleHashTable[Ext]'s call to ExecBuildGroupingEqual to not pass
in the parent node, but NULL. Which in turn prevents the tuple
equality comparator from being JIT compiled.  While that fixes
bug #15486, it is not actually necessary after all of the above commits,
as we don't re-build the comparator when using the new
BuildTupleHashTableExt() interface (as the content of the hashtable
are reset, but the TupleHashTable itself is not).

Therefore re-allow jit compilation for callers that use
BuildTupleHashTableExt with a separate context for "metadata" and
content.

As in the previous commit, there's ongoing work to make this easier to
test to prevent such regressions in the future, but that
infrastructure is not going to be backpatchable.

The performance impact of not JIT compiling hashtable equality
comparators can be substantial e.g. for aggregation queries that
aggregate a lot of input rows to few output rows (when there are a lot
of output groups, there will be fewer comparisons).

Author: Andres Freund
Discussion: https://postgr.es/m/20190927072053.njf6prdl3vb7y7qb@alap3.anarazel.de
Backpatch: 11, just as 5567d12ce0
2019-09-29 16:24:32 -07:00
Andres Freund
97e971ee05 Fix determination when slot types for upper executor nodes are fixed.
For many queries, the fact that the tuple descriptor from the lower
node was not taken into account when determining whether the type of a
slot is fixed led to tuple deforming for such upper nodes not being
JIT accelerated.

I broke this in 675af5c01e.

There is ongoing work to enable writing regression tests for related
behavior (including a patch that would have detected this
regression), by optionally showing such details in EXPLAIN. But as it
seems unlikely that that will be suitable for stable branches, just
merge the fix for now.

While it's fairly close to the 12 release window, the fact that 11
continues to perform JITed tuple deforming in these cases, that
there are still cases where we do so in 12, and the fact that the
performance regression can be sizable, weigh in favor of fixing it
now.

Author: Andres Freund
Discussion: https://postgr.es/m/20190927072053.njf6prdl3vb7y7qb@alap3.anarazel.de
Backpatch: 12-, where 675af5c01e was merged.
2019-09-29 15:46:17 -07:00
Peter Eisentraut
4e6f101e92 Fix compilation with older OpenSSL versions
Some older OpenSSL versions (0.9.8 branch) define TLS*_VERSION macros
but not the corresponding SSL_OP_NO_* macro, which causes the code for
handling ssl_min_protocol_version/ssl_max_protocol_version to fail to
compile.  To fix, add more #ifdefs and error handling.

Reported-by: Victor Wagner <vitus@wagner.pp.ru>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/20190924101859.09383b4f%40fafnir.local.vm
2019-09-28 22:49:01 +02:00
Michael Paquier
55282fa20f Remove code relevant to OpenSSL 0.9.6 in be/fe-secure-openssl.c
HEAD supports OpenSSL 0.9.8 and newer versions, and this code likely got
forgotten as its surrounding comments mention an incorrect version
number.

Author: Michael Paquier
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/20190927032311.GB8485@paquier.xyz
2019-09-28 15:22:49 +09:00
Andres Freund
3f6b3be39c Silence -Wmaybe-uninitialized compiler warnings in dbcommands.c.
When compiling postgres using gcc -O3, there are false-positive
warnings about the now initialized variables. Silence them.

Author: Peter Eisentraut, Andres Freund
Discussion: https://postgr.es/m/15fb2350-b8b8-e188-278f-0b34fdee5210@2ndquadrant.com
2019-09-27 14:14:30 -07:00
Andres Freund
c967e13f40 Fix implicit-fallthrough compiler warning introduced in 6dda292d4d.
For some reason at least gcc-9 warns about the fallthrough, even
though it otherwise recognizes that elog(ERROR, ...) doesn't return.

Author: Andres Freund
2019-09-27 10:29:25 -07:00
Michael Paquier
fbfa566488 Fix lockmode initialization for custom relation options
The code was enforcing AccessExclusiveLock for all custom relation
options, which is incorrect as the APIs allow a custom lock level to be
set.

While on it, fix a couple of inconsistencies in the tests and the README
of dummy_index_am.

Oversights in commit 773df88.

Discussion: https://postgr.es/m/20190925234152.GA2115@paquier.xyz
2019-09-27 09:31:20 +09:00
Michael Paquier
6e22813b2d Fix comment in xlogreader.c
This was introduced by 709d003, which moved readSegNo, readOff
and readPageTLI into a new structure called WALOpenSegment that is
initialized separately.

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20190926.110809.248342687.horikyota.ntt@gmail.com
2019-09-26 11:53:37 +09:00
Alexander Korotkov
7881bb14f4 Correctly cast types to Datum and back in compareDatetime()
Discussion: https://postgr.es/m/CAPpHfdteFKW6MLpXM4md99m55YAuXs0n9_P2wiTq_EmG09doUA%40mail.gmail.com
2019-09-26 02:09:01 +03:00
Tom Lane
b81a9c2fc5 Fix handling of GENERATED columns in CREATE TABLE LIKE INCLUDING DEFAULTS.
LIKE INCLUDING DEFAULTS tried to copy the attrdef expression without
copying the state of the attgenerated column.  This is in fact wrong,
because GENERATED and DEFAULT expressions are not the same kind of animal;
one can contain Vars and the other not.  We *must* copy attgenerated
when we're copying the attrdef expression.  Rearrange the if-tests
so that the expression is copied only when the correct one of
INCLUDING DEFAULTS and INCLUDING GENERATED has been specified.

Per private report from Manuel Rigger.

Tom Lane and Peter Eisentraut
2019-09-25 17:30:42 -04:00
Alexander Korotkov
bffe1bd684 Implement jsonpath .datetime() method
This commit implements the jsonpath .datetime() method as specified in the
SQL/JSON standard.  There are no-argument and single-argument versions of
this method.  The no-argument version selects the first ISO datetime format
matching the input string.  The single-argument version accepts a template
string as its argument.

In addition to the .datetime() method itself, this commit also implements
comparison of the resulting date and time values.  There is some difficulty
because the existing jsonb_path_*() functions are immutable, while comparison
of timezoned and non-timezoned types involves the current timezone.  First,
the current timezone can be changed within a session.  Moreover, timezones
themselves are not immutable and can be updated.  This is why we let the
existing immutable functions throw errors on such non-immutable comparisons.
At the same time, this commit provides jsonb_path_*_tz() functions, which are
stable and support operations involving timezones.  As new functions are
added to the system catalog, catversion is bumped.

Support for the .datetime() method was the only blocker preventing T832 from
being marked as supported.  sql_features.txt is updated correspondingly.

Extracted from original patch by Nikita Glukhov, Teodor Sigaev, Oleg Bartunov.
Heavily revised by me.  Comments were adjusted by Liudmila Mantrova.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Alexander Korotkov, Nikita Glukhov, Teodor Sigaev, Oleg Bartunov, Liudmila Mantrova
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-25 22:51:51 +03:00
Alexander Korotkov
6dda292d4d Allow datetime values in JsonbValue
The SQL/JSON standard allows manipulation of datetime values, so it appears
convenient to allow datetime values to be represented in the JsonbValue
struct.  These datetime values are allowed for temporary representation only.
During serialization, datetime values are converted into strings.

SQL/JSON requires timestamps with timezone to be written with the same
timezone offset as they were parsed with.  This is why we allow storage of
the timezone offset in the JsonbValue struct.  For the same reason, a
timezone offset argument is added to the JsonEncodeDateTime() function.

Extracted from original patch by Nikita Glukhov, Teodor Sigaev, Oleg Bartunov.
Revised by me.  Comments were adjusted by Liudmila Mantrova.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Nikita Glukhov, Teodor Sigaev, Oleg Bartunov, Alexander Korotkov, Liudmila Mantrova
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-25 22:51:51 +03:00
Alexander Korotkov
5bc450629b Error suppression support for upcoming jsonpath .datetime() method
Add support for error suppression in some date and time manipulation
functions, as required for jsonpath .datetime() method support.  This commit
doesn't use PG_TRY()/PG_CATCH() to implement that.  Instead, it provides
internal versions of the date and time functions used, which support error
suppression.

Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Alexander Korotkov, Nikita Glukhov
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-25 22:51:51 +03:00
Alexander Korotkov
66c74f8b6e Implement parse_datetime() function
This commit adds the parse_datetime() function, which implements datetime
parsing with the extended features demanded by the upcoming jsonpath
.datetime() method:

 * Dynamic type identification based on template string,
 * Support for standard-conforming 'strict' mode,
 * Timezone offset is returned as a separate value.

Extracted from original patch by Nikita Glukhov, Teodor Sigaev, Oleg Bartunov.
Revised by me.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Nikita Glukhov, Teodor Sigaev, Oleg Bartunov, Alexander Korotkov
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-25 22:51:51 +03:00
Alexander Korotkov
1a950f37d0 Implement standard datetime parsing mode
SQL Standard 2016 defines rules for handling separators in datetime template
strings, which differ from the to_date()/to_timestamp() rules.  The standard
allows only a small set of separators and requires strict matching for them.

The standard applies to the jsonpath .datetime() method and the
CAST (... FORMAT ...) SQL clause.  We're not going to change the handling of
separators in the existing to_date()/to_timestamp() functions, because their
current behavior is familiar to users.  The standard behavior is now
available via a special flag, which will be used in the upcoming .datetime()
jsonpath method.

Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Alexander Korotkov
2019-09-25 22:51:29 +03:00
Alvaro Herrera
773df883e8 Support reloptions of enum type
All our current in core relation options of type string (not many,
admittedly) behave in reality like enums.  But after seeing an
implementation for enum reloptions, it's clear that strings are messier,
so introduce the new reloption type.  Switch all string options to be
enums instead.

Fortunately we have a recently introduced test module for reloptions, so
we don't lose coverage of string reloptions, which may still be used by
third-party modules.

Authors: Nikolay Shaplov, Álvaro Herrera
Reviewed-by: Nikita Glukhov, Aleksandr Parfenov
Discussion: https://postgr.es/m/43332102.S2V5pIjXRx@x200m
2019-09-25 15:56:52 -03:00
Michael Paquier
69f9410807 Allow definition of lock mode for custom reloptions
Relation options can define a lock mode other than AccessExclusiveLock
since 47167b7, but modules defining custom relation options did not
really have a way to enforce that.  Correct that by extending the
current API set so that modules can define a custom lock mode.

Author: Michael Paquier
Reviewed-by: Kuntal Ghosh
Discussion: https://postgr.es/m/20190920013831.GD1844@paquier.xyz
2019-09-25 10:13:52 +09:00
Michael Paquier
736b84eede Fix failure with lock mode used for custom relation options
In-core relation options can use a custom lock mode since 47167b7, which
lowered the lock level used for some autovacuum parameters.  However,
it forgot to consider custom relation options.  This causes failures
with ALTER TABLE SET when changing a custom relation option, as its lock
is not defined.  The existing APIs to define a custom reloption do not
allow a custom lock mode to be defined, so enforce its initialization to
AccessExclusiveLock, which should be safe enough in all cases.  An
upcoming patch will extend the existing APIs to allow a custom lock mode
to be defined.

The problem can be reproduced with bloom indexes, so add a test there.

Reported-by: Nikolay Shaplov
Analyzed-by: Thomas Munro, Michael Paquier
Author: Michael Paquier
Reviewed-by: Kuntal Ghosh
Discussion: https://postgr.es/m/20190920013831.GD1844@paquier.xyz
Backpatch-through: 9.6
2019-09-25 10:07:23 +09:00
Alexander Korotkov
90c0987258 Fix bug in pairingheap_SpGistSearchItem_cmp()
Our item contains only so->numberOfNonNullOrderBys of distances.  Reflect that
in the loop upper bound.

Discussion: https://postgr.es/m/53536807-784c-e029-6e92-6da802ab8d60%40postgrespro.ru
Author: Nikita Glukhov
Backpatch-through: 12
2019-09-25 01:47:36 +03:00
Alvaro Herrera
709d003fbd Rework WAL-reading supporting structs
The state-tracking of WAL reading in various places was pretty messy,
mostly because the ancient physical-replication WAL reading code wasn't
using the XLogReader abstraction.  This led to some untidy code.  Make
it prettier by creating two additional supporting structs,
WALSegmentContext and WALOpenSegment which keep track of WAL-reading
state.  This makes code cleaner, as well as supports more future
cleanup.

Author: Antonin Houska
Reviewed-by: Álvaro Herrera and (older versions) Robert Haas
Discussion: https://postgr.es/m/14984.1554998742@spoje.net
2019-09-24 16:39:53 -03:00
Tom Lane
a9ae99d019 Prevent bogus pullup of constant-valued functions returning composite.
Fix an oversight in commit 7266d0997: as it stood, the code failed
when a function-in-FROM returns composite and can be simplified
to a composite constant.

For the moment, just test for composite result and abandon pullup
if we see one.  To make it actually work, we'd have to decompose
the composite constant into per-column constants; which is surely
do-able, but I'm not convinced it's worth the code space.

Per report from Raúl Marín Rodríguez.

Discussion: https://postgr.es/m/CAM6_UM4isP+buRA5sWodO_MUEgutms-KDfnkwGmryc5DGj9XuQ@mail.gmail.com
2019-09-24 12:11:32 -04:00
Fujii Masao
6d05086c0a Speedup truncations of relation forks.
When a relation is truncated, shared_buffers needs to be scanned
so that any buffers for the relation forks are invalidated in it.
Previously, shared_buffers was scanned once per relation fork, i.e.,
MAIN, FSM and VM, when VACUUM truncated off any empty pages
at the end of the relation or TRUNCATE truncated the relation in place.
Since shared_buffers needed to be scanned multiple times,
it could take a long time to finish those commands, especially
when shared_buffers was large.

This commit changes the logic so that shared_buffers is scanned only
once for those three relation forks.

Author: Kirk Jamison
Reviewed-by: Masahiko Sawada, Thomas Munro, Alvaro Herrera, Takayuki Tsunakawa and Fujii Masao
Discussion: https://postgr.es/m/D09B13F772D2274BB348A310EE3027C64E2067@g01jpexmbkw24
2019-09-24 17:31:26 +09:00
Andres Freund
30d1379658 Fix ExprState's tag to be of type NodeTag rather than Node.
This appears to have been an oversight in b8d7f053c5. As it's
effectively harmless, though confusing, only fix in master.

Author: Andres Freund
2019-09-23 15:28:13 -07:00
Peter Eisentraut
887248e97e Message style fixes 2019-09-23 13:38:39 +02:00
Tom Lane
5ac0d93600 Fix failure to zero-pad the result of bitshiftright().
If the bitstring length is not a multiple of 8, we'd shift the
rightmost bits into the pad space, which must be zeroes --- bit_cmp,
for one, depends on that.  This'd lead to the result failing to
compare equal to what it should compare equal to, as reported in
bug #16013 from Daryl Waycott.
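
A minimal standalone illustration of the pad-space issue (hypothetical
two-byte bit string of length 12; not the varbit.c code itself):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint8_t bits[2] = {0xAB, 0xC0};   /* 12-bit value 0xABC, 4 pad bits */
        int bitlen = 12;

        /* Shift right by 8 bits by moving whole bytes. */
        bits[1] = bits[0];
        bits[0] = 0;

        /* One-bits have now been shifted into the pad space ... */
        printf("before re-padding: %02X %02X\n", bits[0], bits[1]);

        /* ... so the pad bits of the last byte must be forced to zero,
         * or byte-wise comparisons like bit_cmp() will misbehave. */
        int padbits = (8 - bitlen % 8) % 8;          /* 4 pad bits here */
        bits[1] &= (uint8_t) (0xFF << padbits);
        printf("after re-padding:  %02X %02X\n", bits[0], bits[1]);
        return 0;
    }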

This is, if memory serves, not the first such bug in the bitstring
functions.  In hopes of making it the last one, do a bit more work
than minimally necessary to fix the bug:

* Add assertion checks to bit_out() and varbit_out() to complain if
they are given incorrectly-padded input.  This will improve the
odds that manual testing of any new patch finds problems.

* Encapsulate the padding-related logic in macros to make it
easier to use.

Also, remove unnecessary padding logic from bit_or() and bitxor().
Somebody had already noted that we need not re-pad the result of
bit_and() since the inputs are required to be the same length,
but failed to extrapolate that to the other two.

Also, move a comment block that was once near the head of varbit.c
(but people kept putting other stuff in front of it) into the file
header block.

Note for the release notes: if anyone has inconsistent data as a
result of saving the output of bitshiftright() in a table, it's
possible to fix it with something like
UPDATE mytab SET bitcol = ~(~bitcol) WHERE bitcol != ~(~bitcol);

This has been broken since day one, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/16013-c2765b6996aacae9@postgresql.org
2019-09-22 17:45:59 -04:00
Tom Lane
0a2f894c3c Fix typo in tts_virtual_copyslot.
The code used the destination slot's natts where it intended to
use the source slot's natts.  Adding an Assert shows that there
is no case in "make check-world" where these counts are different,
so maybe this is a harmless bug, but it's still a bug.

Takayuki Tsunakawa

Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1FD34C0E@G01JPEXMBYT05
2019-09-22 14:21:07 -04:00
Tom Lane
51004c7172 Make some efficiency improvements in LISTEN/NOTIFY.
Move the responsibility for advancing the NOTIFY queue tail pointer
from the listener(s) to the notification sender, and only have the
sender do it once every few queue pages, rather than after every batch
of notifications as at present.  This reduces the number of times we
execute asyncQueueAdvanceTail, and reduces contention when there are
multiple listeners (since that function requires exclusive lock).
This change relies on the observation that we don't really need the tail
pointer to be exactly up-to-date.  It's certainly not necessary to
attempt to release disk space more often than once per SLRU segment.
The only other usage of the tail pointer is that an incoming listener,
if it's the only listener in its database, will need to scan the queue
forward from the tail; but that's surely a less performance-critical
path than routine sending and receiving of notifies.  We compromise by
advancing the tail pointer after every 4 pages of output, so that it
shouldn't get more than a few pages behind.

Also, when sending signals to other backends after adding notify
message(s) to the queue, recognize that only backends in our own
database are going to care about those messages, so only such
backends really need to be awakened promptly.  Backends in other
databases should get kicked if they're well behind on reading the
queue, else they'll hold back the global tail pointer; but wakening
them for every single message is pointless.  This change can
substantially reduce signal traffic if listeners are spread among
many databases.  It won't help for the common case of only a single
active database, but the extra check costs very little.

Martijn van Oosterhout, with some adjustments by me

Discussion: https://postgr.es/m/CADWG95vtRBFDdrx1JdT1_9nhOFw48KaeTev6F_LtDQAFVpSPhA@mail.gmail.com
Discussion: https://postgr.es/m/CADWG95uFj8rLM52Er80JnhRsTbb_AqPP1ANHS8XQRGbqLrU+jA@mail.gmail.com
2019-09-22 11:46:29 -04:00
Tom Lane
c160b8928c Straighten out leakproofness markings on text comparison functions.
Since we introduced the idea of leakproof functions, texteq and textne
were marked leakproof but their sibling text comparison functions were
not.  This inconsistency seemed justified because texteq/textne just
relied on memcmp() and so could easily be seen to be leakproof, while
the other comparison functions are far more complex and indeed can
throw input-dependent errors.

However, that argument crashed and burned with the addition of
nondeterministic collations, because now texteq/textne may invoke
the exact same varstr_cmp() infrastructure as the rest.  It makes no
sense whatever to give them different leakproofness markings.

After a certain amount of angst we've concluded that it's all right
to consider varstr_cmp() to be leakproof, mostly because the other
choice would be disastrous for performance of many queries where
leakproofness matters.  The input-dependent errors should only be
reachable for corrupt input data, or so we hope anyway; certainly,
if they are reachable in practice, we've got problems with requirements
as basic as maintaining a btree index on a text column.

Hence, run around to all the SQL functions that derive from varstr_cmp()
and mark them leakproof.  This should result in a useful gain in
flexibility/performance for queries in which non-leakproofness degrades
the efficiency of the query plan.

Back-patch to v12 where nondeterministic collations were added.
While this isn't an essential bug fix given the determination
that varstr_cmp() is leakproof, we might as well apply it now that
we've been forced into a post-beta4 catversion bump.

Discussion: https://postgr.es/m/31481.1568303470@sss.pgh.pa.us
2019-09-21 16:56:30 -04:00
Tom Lane
2810396312 Fix up handling of nondeterministic collations with pattern_ops opclasses.
text_pattern_ops and its siblings can't be used with nondeterministic
collations, because they use the text_eq operator which will not behave
as bitwise equality if applied with a nondeterministic collation.  The
initial implementation of that restriction was to insert a run-time test
in the related comparison functions, but that is inefficient, may throw
misleading errors, and will throw errors in some cases that would work.
It seems sufficient to just prevent the combination during CREATE INDEX,
so do that instead.

Lacking any better way to identify the opclasses involved, we need to
hard-wire tests for them, which requires hand-assigned values for their
OIDs, which forces a catversion bump because they previously had OIDs
that would be assigned automatically.  That's slightly annoying in the
v12 branch, but fortunately we're not at rc1 yet, so just do it.

Back-patch to v12 where nondeterministic collations were added.

In passing, run make reformat-dat-files, which found some unrelated
whitespace issues (slightly different ones in HEAD and v12).

Peter Eisentraut, with small corrections by me

Discussion: https://postgr.es/m/22566.1568675619@sss.pgh.pa.us
2019-09-21 16:29:17 -04:00
Alvaro Herrera
1a2983231d Split out code into new getKeyJsonValueFromContainer()
The new function stashes its output value in a JsonbValue that can be
passed in by the caller, which enables some callers to pass
stack-allocated structs -- saving palloc cycles.  It also allows some
callers that know they are handling a jsonb object to use this new jsonb
object-specific API, instead of going through generic container
findJsonbValueFromContainer.

Author: Nikita Glukhov
Discussion: https://postgr.es/m/7c417f90-f95f-247e-ba63-d95e39c0ad14@postgrespro.ru
2019-09-20 20:18:11 -03:00
Alvaro Herrera
dbb9aeda99 Optimize get_jsonb_path_all avoiding an iterator
Instead of creating an iterator object at each step down the JSONB
object/array, we can just examine its object/array flags, which is
faster.  Also, use the recently introduced JsonbValueAsText instead of
open-coding the same thing, for code simplicity.

Author: Nikita Glukhov
Discussion: https://postgr.es/m/7c417f90-f95f-247e-ba63-d95e39c0ad14@postgrespro.ru
2019-09-20 19:31:32 -03:00
Alvaro Herrera
abb014a631 Refactor code into new JsonbValueAsText, and use it more
jsonb_object_field_text and jsonb_array_element_text both contained
identical copies of this code, so extract that into new routine
JsonbValueAsText.  This can also be used in other places, to measurable
performance benefit: the jsonb_each() and jsonb_array_elements()
functions can use it for outputting text forms instead of their less
efficient current implementation (because we no longer need to build
an intermediate jsonb representation of each value).

Author: Nikita Glukhov
Discussion: https://postgr.es/m/7c417f90-f95f-247e-ba63-d95e39c0ad14@postgrespro.ru
2019-09-20 19:30:16 -03:00
Tom Lane
e56cad84d5 Fix some minor spec-compliance issues in jsonpath lexer.
Although the SQL/JSON tech report makes reference to ECMAScript which
allows both single- and double-quoted strings, all the rest of the
report speaks only of double-quoted string literals in jsonpaths.
That's more compatible with JSON itself; moreover single-quoted strings
are hard to use inside a jsonpath that is itself a single-quoted SQL
literal.  So guess that the intent is to allow only double-quoted
literals, and remove lexer support for single-quoted literals.
It'll be less painful to add this again later if we're wrong, than to
remove a shipped feature.

Also, adjust the lexer so that unrecognized backslash sequences are
treated as just meaning the escaped character, not as errors.  This
change has much better support in the standards, as JSON, JavaScript
and ECMAScript all make it plain that that's what's supposed to
happen.

Back-patch to v12.

Discussion: https://postgr.es/m/CAPpHfdvDci4iqNF9fhRkTqhe-5_8HmzeLt56drH%2B_Rv2rNRqfg@mail.gmail.com
2019-09-20 14:22:58 -04:00
Alvaro Herrera
d1b0007639 Fix progress report of REINDEX INDEX
I (Álvaro) broke that in commit 6212276e43 -- forgot to set the
necessary flag.  Repair.

Author: Amit Langote
Discussion: https://postgr.es/m/CA+HiwqEaM2tV5awKhP1vSbgjQe_uXVU15Oi4sTgwgempwMiT8g@mail.gmail.com
2019-09-20 12:56:00 -03:00
Alexander Korotkov
8c8a267201 Fix freeing old values in index_store_float8_orderby_distances()
6cae9d2c10 added an error in freeing old values in the
index_store_float8_orderby_distances() function.  It looks for the old value
in scan->xs_orderbynulls[i] after setting a new value there.
This commit fixes that.  It also removes the short-circuit handling of the
distances == NULL situation.  Now distances == NULL will be treated the same
way as an array with all null distances.  That is, previous values will be
freed, if any.

Reported-by: Tom Lane, Nikita Glukhov
Discussion: https://postgr.es/m/CAPpHfdu2wcoAVAm3Ek66rP%3Duo_C-D84%2B%2Buf1VEcbyi_caBXWCA%40mail.gmail.com
Discussion: https://postgr.es/m/426580d3-a668-b9d1-7b8e-f74d1a6524e0%40postgrespro.ru
Backpatch-through: 12
2019-09-20 01:19:08 +03:00
Alexander Korotkov
6cae9d2c10 Improve handling of NULLs in KNN-GiST and KNN-SP-GiST
This commit improves matters in two ways:

 * It removes the ugliness of 02f90879e7, which stored distance values and
   null flags in two separate arrays after the GISTSearchItem struct.  Instead
   we pack both the distance value and the null flag into an
   IndexOrderByDistance struct.  The alignment overhead should be negligible,
   because we typically deal with at most a few "col op const" expressions in
   the ORDER BY clause.
 * It fixes the handling of "col op NULL" expressions in KNN-SP-GiST.  Now,
   these expressions are not passed to support functions, which can't deal
   with them.  Instead, a NULL result is implicitly assumed.  In the future we
   may decide to teach support functions to deal with NULL arguments, but the
   current solution is a bugfix suitable for backpatching.

Reported-by: Nikita Glukhov
Discussion: https://postgr.es/m/826f57ee-afc7-8977-c44c-6111d18b02ec%40postgrespro.ru
Author: Nikita Glukhov
Reviewed-by: Alexander Korotkov
Backpatch-through: 9.4
2019-09-19 21:48:39 +03:00
Peter Eisentraut
e1c8743e6c GSSAPI error message improvements
Make the error messages around GSSAPI encryption a bit clearer.  Tweak
some messages to avoid plural problems.

Also make a code change for clarity.  Using "conf" for "confidential"
is quite confusing.  Using "conf_state" is perhaps not much better but
that's what the GSSAPI documentation uses, so there is at least some
hope of understanding it.
2019-09-19 15:09:49 +02:00
Fujii Masao
33a94bae60 Remove unused smgrdounlinkfork() function.
smgrdounlinkfork() became dead code as the result of commit ece01aae47,
but it was left in place just in case we want it someday. However no users
have appeared in 7 years, so it's time to remove this unused function.

Author: Kirk Jamison
Discussion: https://www.postgresql.org/message-id/D09B13F772D2274BB348A310EE3027C64E2067@g01jpexmbkw24
2019-09-18 21:05:33 +09:00
Peter Eisentraut
48770492c3 Add some const decorations to array constants
Author: Mark G <markg735@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAEeOP_YFVeFjq4zDZLDQbLSRFxBiTpwBQHxCNgGd%2Bp5VztTXyQ%40mail.gmail.com
2019-09-17 22:03:00 +02:00
Tom Lane
d5b90cd648 Fix bogus handling of XQuery regex option flags.
The SQL spec defers to XQuery to define what the option flags are
for LIKE_REGEX patterns.  XQuery says that:
* 's' allows the dot character to match newlines, which by
  default it will not;
* 'm' allows ^ and $ to match at newlines, not only at the
  start/end of the whole string.
Thus, these are *not* inverses as they are for the similarly-named
POSIX options, and neither one corresponds to the POSIX 'n' option.
Fortunately, Spencer's library does expose these two behaviors as
separately twiddlable flags, so we just have to fix the mapping from
JSP flag bits to REG flag bits.  I also chose to rename the symbol
for 's' to DOTALL, to make it clearer that it's not the inverse
of MLINE.

Also, XQuery says that if the 'q' flag "is used together with the m, s,
or x flag, that flag has no effect".  I read this as saying that 'q'
overrides the other flags; whoever wrote our code seems to have read
it backwards.

Lastly, while XQuery's 'x' flag is related to what Spencer's code
does for REG_EXPANDED, it's not the same or a subset.  It seems best
to treat XQuery's 'x' as unimplemented for now.  Maybe later we can
expand our regex code to offer 'x'-style parsing as a separate option.

While at it, refactor the jsonpath code so that (a) there's only
one copy of the flag transformation logic not two, and (b) the
processing of flags is independent of the order in which the flags
are written.

We need some documentation updates to go with this, but I'll
tackle that separately.

Back-patch to v12 where this code originated.

Discussion: https://postgr.es/m/CAPpHfdvDci4iqNF9fhRkTqhe-5_8HmzeLt56drH%2B_Rv2rNRqfg@mail.gmail.com
Reference: https://www.w3.org/TR/2017/REC-xpath-functions-31-20170321/#flags
2019-09-17 15:39:51 -04:00
Peter Eisentraut
a25221f53c Remove mingwcompat.c
We believe that the issues that this was working around have been
fixed in MinGW more than 5 years ago, so this isn't necessary anymore.

Discussion: https://www.postgresql.org/message-id/flat/20190719050830.GK1859%40paquier.xyz
2019-09-17 11:34:28 +02:00
Alexander Korotkov
b64b857f50 Support for SSSSS datetime format pattern
SQL Standard 2016 defines the SSSSS format pattern for seconds past midnight
in the jsonpath .datetime() method and the CAST (... FORMAT ...) SQL clause.
Our datetime parsing engine currently supports it under the name SSSS.

This commit adds SSSSS as an alias for SSSS.  The alias is added for the
upcoming jsonpath .datetime() method, but it is also supported in to_date()/
to_timestamp() as a positive side effect.

Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Nikita Glukhov, Alexander Korotkov
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-16 21:14:56 +03:00
Alexander Korotkov
d589f94460 Support for FF1-FF6 datetime format patterns
SQL Standard 2016 defines FF1-FF9 format patterns for fractions of seconds in
the jsonpath .datetime() method and the CAST (... FORMAT ...) SQL clause.  The
parsing engine of the upcoming .datetime() method will be shared with
to_date()/to_timestamp().

This patch implements the FF1-FF6 format patterns for the upcoming jsonpath
.datetime() method.  The to_date()/to_timestamp() functions will also get
support for these format patterns as a positive side effect.  FF7-FF9 are not
supported due to the lack of precision in our internal timestamp
representation.

Extracted from original patch by Nikita Glukhov, Teodor Sigaev, Oleg Bartunov.
Heavily revised by me.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Discussion: https://postgr.es/m/CAPpHfdsZgYEra_PeCLGNoXOWYx6iU-S3wF8aX0ObQUcZU%2B4XTw%40mail.gmail.com
Author: Nikita Glukhov, Teodor Sigaev, Oleg Bartunov, Alexander Korotkov
Reviewed-by: Anastasia Lubennikova, Peter Eisentraut
2019-09-16 21:14:32 +03:00
Tom Lane
d812257809 Fix bogus sizeof calculations.
Noted by Coverity.  Typo in 27cc7cd2b, so back-patch to v12
as that was.
2019-09-15 11:51:57 -04:00
Tom Lane
b360e0fcd7 Make tuplesort_set_bound() assertions more comprehensible, hopefully.
Add the comments that I griped were missing.  Also re-order tests
so that parallelism-related tests aren't randomly separated from
each other.

Discussion: https://postgr.es/m/CAAaqYe9GD__4Crm=ddz+-XXcNhfY_V5gFYdLdmkFNq=2VHO56Q@mail.gmail.com
2019-09-13 16:57:07 -04:00
Alvaro Herrera
bac2fae05c logical decoding: process ASSIGNMENT during snapshot build
Most WAL records are ignored in early SnapBuild snapshot build phases.
But it's critical to process some of them, so that later messages have
the correct transaction state after the snapshot is completely built; in
particular, XLOG_XACT_ASSIGNMENT messages are critical in order for
sub-transactions to be correctly assigned to their parent transactions;
otherwise at least one assert misbehaves, as reported by Ildar Musin.

Diagnosed-by: Masahiko Sawada
Author: Masahiko Sawada
Discussion: https://postgr.es/m/CAONYFtOv+Er1p3WAuwUsy1zsCFrSYvpHLhapC_fMD-zNaRWxYg@mail.gmail.com
2019-09-13 16:36:28 -03:00
Alvaro Herrera
6212276e43 Fix progress reporting of CLUSTER / VACUUM FULL
The progress state was being clobbered once the first index completed
being rebuilt, causing the final phases of the operation not to show
anything in the progress view.  This was inadvertently broken in
03f9e5cba0, which added progress tracking for REINDEX.

(The reason this bugfix is this small is that I had already noticed this
problem when writing monitoring for CREATE INDEX, and had already worked
around it, as can be seen in discussion starting at
https://postgr.es/m/20190329150218.GA25010@alvherre.pgsql Fixing the
problem is just a matter of fixing one place touched by the REINDEX
monitoring.)

Reported by: Álvaro Herrera
Author: Álvaro Herrera
Discussion: https://postgr.es/m/20190801184333.GA21369@alvherre.pgsql
2019-09-13 14:54:26 -03:00
Peter Geoghegan
3b6b54f178 Fix nbtree page split rmgr desc routine.
Include newitemoff in rmgr desc output for nbtree page split records.
In passing, correct an obsolete comment that claimed that newitemoff is
only logged for _L variant nbtree page split WAL records.

Both issues were oversights in commit 2c03216d83, which revamped the
WAL format.

Author: Peter Geoghegan
Backpatch: 9.5-, where the WAL format was revamped.
2019-09-12 15:45:08 -07:00
Peter Geoghegan
1b9becd43c Remove redundant _bt_truncate() comment paragraph. 2019-09-12 09:51:27 -07:00
Alvaro Herrera
bc98e1ea64 Merge two assertions to make comment clearer
Authored by Tom Lane, after a gripe from James Coleman.

Discussion: https://postgr.es/m/CAAaqYe9GD__4Crm=ddz+-XXcNhfY_V5gFYdLdmkFNq=2VHO56Q@mail.gmail.com
2019-09-12 10:37:04 -03:00
Tom Lane
9a86f03b4e Rearrange postmaster's startup sequence for better syslogger results.
This is a second try at what commit 57431a911 tried to do, namely,
launch the syslogger before we open postmaster sockets so that our
messages about the sockets end up in the syslogger files.  That
commit fell foul of a bunch of subtle issues caused by trying to
launch a postmaster child process before creating shared memory.
Rather than messing with that interaction, let's postpone opening
the sockets till after we launch the syslogger.

This would not have been terribly safe before commit 7de19fbc0,
because we relied on socket opening to detect whether any competing
postmasters were using the same port number.  But now that we choose
IPC keys without regard to the port number, there's no interaction
to worry about.

Also delay creation of the external PID file (if requested) till after
the sockets are open, since external code could plausibly be relying
on that ordering of events.  And postpone most of the work of
RemovePgTempFiles() so that that potentially-slow processing still
happens after we make the external PID file.  We have to be a bit
careful about that last though: as noted in the discussion subsequent to
bug #15804, EXEC_BACKEND builds still have to clear the parameter-file
temp dir before launching the syslogger.

Patch by me; thanks to Michael Paquier for review/testing.

Discussion: https://postgr.es/m/15804-3721117bf40fb654@postgresql.org
2019-09-11 11:43:01 -04:00
Tomas Vondra
d06215d03b Allow setting statistics target for extended statistics
When building statistics, we need to decide how many rows to sample and
how accurate the resulting statistics should be. Until now, it was not
possible to explicitly define the statistics target for extended statistics
objects; the value was always computed from the per-attribute targets
with a fallback to the system-wide default statistics target.

That's a bit inconvenient, as it ties together the statistics target set
for per-column and extended statistics. In some cases it may be useful
to require a larger sample / higher accuracy for extended statistics (or the
other way around), but with this approach that's not possible.

So this commit introduces a new command, allowing the statistics target to be
specified for individual extended statistics objects, overriding the value
derived from per-attribute targets (and the system default).

  ALTER STATISTICS stat_name SET STATISTICS target_value;

When determining statistics target for an extended statistics object we
first look at this explicitly set value. When this value is -1, we fall
back to the old formula, looking at the per-attribute targets first and
then the system default. This means the behavior is backwards compatible
with older PostgreSQL releases.

Author: Tomas Vondra
Discussion: https://postgr.es/m/20190618213357.vli3i23vpkset2xd@development
Reviewed-by: Kirk Jamison, Dean Rasheed
2019-09-11 00:25:51 +02:00
Tom Lane
bca6e64354 Reduce overhead of scanning the backend[] array in LISTEN/NOTIFY.
Up to now, async.c scanned its whole array of per-backend state
whenever it needed to find listening backends.  That's expensive
if MaxBackends is large, so extend the data structure with list
links that thread the active entries together.

A downside of this change is that asyncQueueUnregister (unregister
a listening backend at backend exit) now requires exclusive not shared
lock, and it can take awhile if there are many other listening
backends.  We could improve the latter issue by using a doubly- not
singly-linked list, but it's probably not worth the storage space;
typical usage patterns for LISTEN/NOTIFY have fairly long-lived
listeners.

In return for that, Exec_ListenPreCommit (initially register a
listening backend), SignalBackends, and asyncQueueAdvanceTail
get significantly faster when MaxBackends is much larger than
the number of listening backends.  If most of the potential
backend slots are listening, we don't win, but that's a case
where the actual interprocess-signal overhead is going to swamp
these considerations anyway.

Martijn van Oosterhout, hacked a bit more by me

Discussion: https://postgr.es/m/CADWG95vtRBFDdrx1JdT1_9nhOFw48KaeTev6F_LtDQAFVpSPhA@mail.gmail.com
2019-09-10 18:15:20 -04:00
Peter Geoghegan
55d015bde0 Add _bt_binsrch() scantid assertion to nbtree.
Assert that _bt_binsrch() binary searches with scantid set in insertion
scankey cannot be performed on leaf pages.  Leaf-level binary searches
where scantid is set must use _bt_binsrch_insert() instead.

_bt_binsrch_insert() is likely to have additional responsibilities in
the future, such as searching within GIN-style posting lists using
scantid.  It seems like a good idea to tighten things up now.
2019-09-09 11:41:19 -07:00
Andres Freund
27cc7cd2bc Reorder EPQ work, to fix rowmark related bugs and improve efficiency.
In ad0bda5d24 I changed the EvalPlanQual machinery to store
substitution tuples in slot, instead of using plain HeapTuples. The
main motivation for that was that using HeapTuples will be inefficient
for future tableams.  But it turns out that that conversion was buggy
for non-locking rowmarks - the wrong tuple descriptor was used to
create the slot.

As a secondary issue, 5db6df0c0 changed ExecLockRows() to begin EPQ
earlier, to allow fetching the locked rows directly into the EPQ
slots, instead of having to copy tuples around. Unfortunately, as Tom
complained, that forces some expensive initialization to happen
earlier.

As a third issue, the test coverage for EPQ was clearly insufficient.

Fixing the first issue is unfortunately not trivial: Non-locked row
marks were fetched at the start of EPQ, and we don't have the type
information for the rowmarks available at that point. While we could
change that, it's not easy. It might be worthwhile to change that at
some point, but to fix this bug, it seems better to delay fetching
non-locking rowmarks until they're actually needed, rather than fetching
them eagerly. They're referenced at most once, and in cases where EPQ
fails, might never be referenced. Fetching them when needed also
increases locality a bit.

To be able to fetch rowmarks during execution, rather than
initialization, we need to be able to access the active EPQState, as
that contains necessary data. To do so move EPQ related data from
EState to EPQState, and, only for EStates created as part of EPQ,
reference the associated EPQState from EState.

To fix the second issue, change EPQ initialization to allow
EvalPlanQualSlot() to be used before EvalPlanQualBegin() (but
obviously still requiring EvalPlanQualInit() to have been done).

As these changes made struct EState harder to understand, e.g. by
adding multiple EStates, significantly reorder the members, and add a
lot more comments.

Also add a few more EPQ tests, including one that fails for the first
issue above. More is needed.

Reported-By: yi huang
Author: Andres Freund
Reviewed-By: Tom Lane
Discussion:
    https://postgr.es/m/CAHU7rYZo_C4ULsAx_LAj8az9zqgrD8WDd4hTegDTMM1LMqrBsg@mail.gmail.com
    https://postgr.es/m/24530.1562686693@sss.pgh.pa.us
Backpatch: 12-, where the EPQ changes were introduced
2019-09-09 05:14:11 -07:00
Alexander Korotkov
7e04160390 Fix handling of non-key columns in get_index_column_opclass()
f2e40380 introduced support for non-key attributes in GiST indexes.  If
get_index_column_opclass() is then asked by gistproperty() to get the opclass
of a non-key column, it returns garbage past the end of the oidvector value.
This commit fixes that by making get_index_column_opclass() return InvalidOid
in this case.

Discussion: https://postgr.es/m/20190902231948.GA5343%40alvherre.pgsql
Author: Nikita Glukhov, Alexander Korotkov
Backpatch-through: 12
2019-09-09 13:50:12 +03:00
Tom Lane
1192e3fb54 Fix RelationIdGetRelation calls that weren't bothering with error checks.
Some of these are quite old, but that doesn't make them not bugs.
We'd rather report a failure via elog than SIGSEGV.

While at it, uniformly spell the error check as !RelationIsValid(rel)
rather than a bare rel == NULL test.  The machine code is the same
but it seems better to be consistent.

Coverity complained about this today, not sure why, because the
mistake is in fact old.
2019-09-08 17:00:50 -04:00
Alexander Korotkov
02f90879e7 Fix handling of NULL distances in KNN-GiST
In order to implement NULLS LAST semantics, GiST previously assumed the distance
to a NULL value to be Inf.  However, our distance functions can return Inf and
NaN for non-null values.  In such cases, the NULLS LAST semantics appear to be
broken.  This commit fixes that by introducing a separate array of null flags
for distances.

Backpatch to all supported versions.

Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com
Author: Alexander Korotkov
Backpatch-through: 9.4
2019-09-08 22:08:12 +03:00
Alexander Korotkov
e5d8f35961 Fix handling Inf and Nan values in GiST pairing heap comparator
Previously, plain float comparison was used in the GiST pairing heap.  Such
comparison doesn't provide proper ordering for value sets containing Inf and NaN
values.  This commit fixes that by using float8_cmp_internal().  Note that there
is a remaining problem with NULL distances, which are represented as Inf in the
pairing heap.  That will be fixed in a subsequent commit.

Backpatch to all supported versions.

Reported-by: Andrey Borodin
Discussion: https://postgr.es/m/CAPpHfdsNvNdA0DBS%2BwMpFrgwT6C3-q50sFVGLSiuWnV3FqOJuQ%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Heikki Linnakangas
Backpatch-through: 9.4
2019-09-08 22:08:12 +03:00
Peter Eisentraut
862ef372d6 Fix behavior of AND CHAIN outside of explicit transaction blocks
When using COMMIT AND CHAIN or ROLLBACK AND CHAIN not in an explicit
transaction block, the previous implementation would leave a
transaction block active in the ROLLBACK case but not the COMMIT case.
To fix for now, error out when using these commands not in an explicit
transaction block.  This restriction could be lifted if a sensible
definition and implementation is found.
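
As a rough sketch of the behavior now enforced (the table is hypothetical):

  CREATE TABLE t (x int);
  BEGIN;
  INSERT INTO t VALUES (1);
  COMMIT AND CHAIN;    -- OK: commits and immediately starts a new transaction block
  ROLLBACK;            -- ends the chained transaction
  COMMIT AND CHAIN;    -- not in an explicit transaction block: now raises an error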

Bug: #15977
Author: fn ln <emuser20140816@gmail.com>
Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
2019-09-08 16:23:03 +02:00
Tom Lane
db43831899 Avoid using INFO elevel for what are fundamentally debug messages.
Commit 6f6b99d13 stuck an INFO message into the fast path for
checking partition constraints, for no very good reason except
that it made it easy for the regression tests to verify that
that path was taken.  Assorted later patches did likewise,
increasing the unsuppressable-chatter level from ALTER TABLE
even more.  This isn't good for the user experience, so let's
drop these messages down to DEBUG1 where they belong.  So as
not to have a loss of test coverage, create a TAP test that
runs the relevant queries with client_min_messages = DEBUG1
and greps for the expected messages.

This testing method is a bit brute-force --- in particular,
it duplicates the execution of a fair amount of the core
create_table and alter_table tests.  We experimented with
other solutions, but running any significant amount of
standard testing with client_min_messages = DEBUG1 seems
to have a lot of output-stability pitfalls, cf commits
bbb96c370 and 5655565c0.  Possibly at some point we'll look
into whether we can reduce the amount of test duplication.

Backpatch into v12, because some of these messages are new
in v12 and we don't really want to ship it that way.

Sergei Kornilov

Discussion: https://postgr.es/m/81911511895540@web58j.yandex.ru
Discussion: https://postgr.es/m/4859321552643736@myt5-02b80404fd9e.qloud-c.yandex.net
2019-09-07 19:03:11 -04:00
Tom Lane
ca70bdaefe Fix issues around strictness of SIMILAR TO.
As a result of some long-ago quick hacks, the SIMILAR TO operator
and the corresponding flavor of substring() interpreted "ESCAPE NULL"
as selecting the default escape character '\'.  This is both
surprising and not per spec: the standard is clear that these
functions should return NULL for NULL input.
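
A minimal example of the behavioral change described above:

  SELECT 'abc' SIMILAR TO 'a%';              -- true, default escape character '\'
  SELECT 'abc' SIMILAR TO 'a%' ESCAPE NULL;  -- previously treated as '\'; now returns NULL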

Additionally, because of inconsistency of the strictness markings
of 3-argument substring() and similar_escape(), the planner could not
inline the SQL definition of substring(), resulting in a substantial
performance penalty compared to the underlying POSIX substring()
function.

The simplest fix for this would be to change the strictness marking
of similar_escape(), but if we do that we risk breaking existing views
that depend on that function.  Hence, leave similar_escape() as-is
as a compatibility function, and instead invent a new function
similar_to_escape() that comes in two strict variants.

There are a couple of other behaviors in this area that are also
not per spec, but they are documented and seem generally at least
as sane as the spec's definition, so leave them alone.  But improve
the documentation to describe them fully.

Patch by me; thanks to Álvaro Herrera and Andrew Gierth for review
and discussion.

Discussion: https://postgr.es/m/14047.1557708214@sss.pgh.pa.us
2019-09-07 14:21:59 -04:00
Robert Haas
bd124996ef Create an API for inserting and deleting rows in TOAST tables.
This moves much of the non-heap-specific logic from toast_delete and
toast_insert_or_update into helper functions accessible via a new
header, toast_helper.h.  Using the functions in this module, a table
AM can implement creation and deletion of TOAST table rows with
much less code duplication than was possible heretofore.  Some
table AMs won't want to use the TOAST logic at all, but for those
that do this will make that easier.

Patch by me, reviewed and tested by Prabhat Sabu, Thomas Munro,
Andres Freund, and Álvaro Herrera.

Discussion: http://postgr.es/m/CA+TgmoZv-=2iWM4jcw5ZhJeL18HF96+W1yJeYrnGMYdkFFnEpQ@mail.gmail.com
2019-09-06 10:38:51 -04:00
Robert Haas
286af0ce12 When performing a base backup, check for read errors.
The old code didn't differentiate between a read error and a
concurrent truncation. fread reports both of these by returning 0;
you have to use feof() or ferror() to distinguish between them,
which this code did not do.

It might be a better idea to use read() rather than fread() here,
so that we can display a less-generic error message, but I'm not
sure that would qualify as a back-patchable bug fix, so just do
this much for now.

Jeevan Chalke, reviewed by Jeevan Ladhe and by me.

Discussion: http://postgr.es/m/CA+TgmobG4ywMzL5oQq2a8YKp8x2p3p1LOMMcGqpS7aekT9+ETA@mail.gmail.com
2019-09-06 08:22:32 -04:00
Fujii Masao
946647f845 Make pg_promote() detect postmaster death while waiting for promotion to end.
Previously, even if the postmaster died and WaitLatch() woke up with that event
while pg_promote() was waiting for the standby promotion to finish,
pg_promote() did nothing special and kept waiting until the timeout occurred.
This could cause a busy loop.

This patch makes pg_promote() return false immediately when the postmaster
dies, to avoid such a busy loop.
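
For reference, a sketch of the call whose wait loop is fixed here (run on a
standby; the arguments shown are the defaults):

  SELECT pg_promote(true, 60);  -- wait up to 60 seconds; now returns false
                                -- promptly if the postmaster dies meanwhile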

Back-patch to v12 where pg_promote() was added.

Author: Fujii Masao
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAHGQGwEs9ROgSp+QF+YdDU+xP8W=CY1k-_Ov-d_Z3JY+to3eXA@mail.gmail.com
2019-09-06 14:27:25 +09:00
Tom Lane
7de19fbc0b Use data directory inode number, not port, to select SysV resource keys.
This approach provides a much tighter binding between a data directory
and the associated SysV shared memory block (and SysV or named-POSIX
semaphores, if we're using those).  Key collisions are still possible,
but only between data directories stored on different filesystems,
so the situation should be negligible in practice.  More importantly,
restarting the postmaster with a different port number no longer
risks failing to identify a relevant shared memory block, even when
postmaster.pid has been removed.  A standalone backend is likewise
much more certain to detect conflicting leftover backends.

(In the longer term, we might now think about deprecating the port as
a cluster-wide value, so that one postmaster could support sockets
with varying port numbers.  But that's for another day.)

The hazards fixed here apply only on Unix systems; our Windows code
paths already use identifiers derived from the data directory path
name rather than the port.

src/test/recovery/t/017_shm.pl, which intends to test key-collision
cases, has been substantially rewritten since it can no longer use
two postmasters with identical port numbers to trigger the case.
Instead, use Perl's IPC::SharedMem module to create a conflicting
shmem segment directly.  The test script will be skipped if that
module is not available.  (This means that some older buildfarm
members won't run it, but I don't think that that results in any
meaningful coverage loss.)

Patch by me; thanks to Noah Misch and Peter Eisentraut for discussion
and review.

Discussion: https://postgr.es/m/16908.1557521200@sss.pgh.pa.us
2019-09-05 13:31:46 -04:00
Robert Haas
8b94dab066 Split tuptoaster.c into three separate files.
detoast.c/h contain functions required to detoast a datum, partially
or completely, plus a few other utility functions for examining the
size of toasted datums.

toast_internals.c/h contain functions that are used internally to the
TOAST subsystem but which (mostly) do not need to be accessed from
outside.

heaptoast.c/h contains code that is intrinsically specific to the
heap AM, either because it operates on HeapTuples or is based on the
layout of a heap page.

detoast.c and toast_internals.c are placed in
src/backend/access/common rather than src/backend/access/heap.  At
present, both files still have dependencies on the heap, but that will
be improved in a future commit.

Patch by me, reviewed and tested by Prabhat Sabu, Thomas Munro,
Andres Freund, and Álvaro Herrera.

Discussion: http://postgr.es/m/CA+TgmoZv-=2iWM4jcw5ZhJeL18HF96+W1yJeYrnGMYdkFFnEpQ@mail.gmail.com
2019-09-05 13:15:10 -04:00
Peter Eisentraut
74a308cf52 Use explicit_bzero
Use the explicit_bzero() function in places where it is important that
security information such as passwords is cleared from memory.  There
might be other places where it could be useful; this is just an
initial collection.

For platforms that don't have explicit_bzero(), provide various
fallback implementations.  (explicit_bzero() itself isn't standard,
but as Linux/glibc, FreeBSD, and OpenBSD have it, it's the most common
spelling, so it makes sense to make that the invocation point.)

Discussion: https://www.postgresql.org/message-id/flat/42d26bde-5d5b-c90d-87ae-6cab875f73be%402ndquadrant.com
2019-09-05 08:30:42 +02:00
Michael Paquier
ae060a52b2 Fix thinko when ending progress report for a backend
The logic for ending progress reporting for a backend entry, introduced by
b6fb647, causes callers of pgstat_progress_end_command() to do some extra
work when track_activities is enabled, as the progress fields are reset in
the backend entry even if no command was started for reporting.

This commit resets the fields only if a command is registered for progress
reporting, and only if track_activities is enabled.

Author: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoCry_vJ0E-m5oxJXGL3pnos-xYGCzF95rK5Bbi3Uf-rpA@mail.gmail.com
Backpatch-through: 9.6
2019-09-04 15:46:37 +09:00
Alvaro Herrera
25dcc9d35d Make XLogReaderInvalReadState static
This function is only used by xlogreader.c itself, so there's no need to
export it.  It was introduced by commit 3b02ea4f07 with the apparent
intention that it could be used externally, but I couldn't find any
external code calling it.

I (Álvaro) couldn't resist the urge to sort nearby function prototypes
properly while at it.

Author: Antonin Houska
Discussion: https://postgr.es/m/14984.1554998742@spoje.net
2019-09-03 17:41:43 -04:00
Alvaro Herrera
fe66125974 Remove 'msg' parameter from convert_tuples_by_name
The message was included as a parameter when this function was added in
dcb2bda9b7, but I don't think it has ever served any useful purpose.
Let's stop spreading it pointlessly.

Reviewed by Amit Langote and Peter Eisentraut.

Discussion: https://postgr.es/m/20190806224728.GA17233@alvherre.pgsql
2019-09-03 14:47:29 -04:00
Peter Eisentraut
396e4afdbc Better error messages for short reads/writes in SLRU
This avoids getting a

    Could not read from file ...: Success.

for a short read or write (since errno is not set in that case).
Instead, report more specific error messages.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/5de61b6b-8be9-7771-0048-860328efe027%402ndquadrant.com
2019-09-03 08:30:21 +02:00
Michael Paquier
3a54eb1a38 Fix memory leak with lower, upper and initcap with ICU-provided collations
The leak happens in str_tolower, str_toupper and str_initcap, which are
used in several places including their equivalent SQL-level functions,
and can only be triggered when using an ICU-provided collation when
converting the input string.

b615920 fixed a similar leak.  Backpatch down to 10, where ICU collations
were introduced.

Author: Konstantin Knizhnik
Discussion: https://postgr.es/m/94c0ad0a-cbc2-e4a3-7829-2bdeaf9146db@postgrespro.ru
Backpatch-through: 10
2019-09-03 12:30:53 +09:00
Tom Lane
f63a5ead9d Avoid touching replica identity index in ExtractReplicaIdentity().
In what seems like a fit of misplaced optimization,
ExtractReplicaIdentity() accessed the relation's replica-identity
index without taking any lock on it.  Usually, the surrounding query
already holds some lock so this is safe enough ... but in the case
of a previously-planned delete, there might be no existing lock.
Given a suitable test case, this is exposed in v12 and HEAD by an
assertion added by commit b04aeb0a0.

The whole thing's rather poorly thought out anyway; rather than
looking directly at the index, we should use the index-attributes
bitmap that's held by the parent table's relcache entry, as the
caller functions do.  This is more consistent and likely a bit
faster, since it avoids a cache lookup.  Hence, change to doing it
that way.

While at it, rather than blithely assuming that the identity
columns are non-null (with catastrophic results if that's wrong),
add assertion checks that they aren't null.  Possibly those should
be actual test-and-elog, but I'll leave it like this for now.

In principle, this is a bug that's been there since this code was
introduced (in 9.4).  In practice, the risk seems quite low, since
we do have a lock on the index's parent table, so concurrent
changes to the index's catalog entries seem unlikely.  Given the
precedent that commit 9c703c169 wasn't back-patched, I won't risk
back-patching this further than v12.

Per report from Hadi Moshayedi.

Discussion: https://postgr.es/m/CAK=1=Wrek44Ese1V7LjKiQS-Nd-5LgLi_5_CskGbpggKEf3tKQ@mail.gmail.com
2019-09-02 16:10:37 -04:00
Heikki Linnakangas
bde7493d10 Fix overflow check and comment in GIN posting list encoding.
The comment did not match what the code actually did for integers with
the 43rd bit set. You get an integer like that, if you have a posting
list with two adjacent TIDs that are more than 2^31 blocks apart.
According to the comment, we would store that in 6 bytes, with no
continuation bit on the 6th byte, but in reality, the code encodes it
using 7 bytes, with a continuation bit on the 6th byte as normal.

The decoding routine also handled these 7-byte integers correctly, except
for an overflow check that assumed that one integer needs at most 6 bytes.
Fix the overflow check, and fix the comment to match what the code
actually does. Also fix the comment that claimed that there are 17 unused
bits in the 64-bit representation of an item pointer. In reality, there
are 64-32-11=21.

Fitting any item pointer into max 6 bytes was an important property when
this was written, because in the old pre-9.4 format, item pointers were
stored as plain arrays, with 6 bytes for every item pointer. The maximum
of 6 bytes per integer in the new format guaranteed that we could convert
any page from the old format to the new format after upgrade, so that the
new format was never larger than the old format. But we hardly need to
worry about that anymore, and running into that problem during upgrade,
where an item pointer is expanded from 6 to 7 bytes such that the data
doesn't fit on a page anymore, is implausible in practice anyway.

Backpatch to all supported versions.

This also includes a little test module to test these large distances
between item pointers, without requiring a 16 TB table. It is not
backpatched, I'm including it more for the benefit of future development
of new posting list formats.

Discussion: https://www.postgresql.org/message-id/33bfc20a-5c86-f50c-f5a5-58e9925d05ff%40iki.fi
Reviewed-by: Masahiko Sawada, Alexander Korotkov
2019-08-28 12:55:33 +03:00
Thomas Munro
720b59b55b Avoid catalog lookups in RelationAllowsEarlyPruning().
RelationAllowsEarlyPruning() performed a catalog scan, but is used
in two contexts where that was a bad idea:

1.  In heap_page_prune_opt(), which runs very frequently in some large
    scans.  This caused major performance problems in a field report
    that was easy to reproduce.

2.  In TestForOldSnapshot(), which runs while we hold a buffer content
    lock.  It's not clear if this was guaranteed to be free of buffer
    deadlock risk.

The check was introduced in commit 2cc41acd8 and defended against a
real problem: 9.6's hash indexes have no page LSN and so we can't
allow early pruning (ie the snapshot-too-old feature).  We can remove
the check from all later releases though: hash indexes are now logged,
and there is no way to create UNLOGGED indexes on regular logged
tables.

If a future release allows such a combination, it might need to put
a similar check in place, but it'll need some more thought.

Back-patch to 10.

Author: Thomas Munro
Reviewed-by: Tom Lane, who spotted the second problem
Discussion: https://postgr.es/m/CA%2BhUKGKT8oTkp5jw_U4p0S-7UG9zsvtw_M47Y285bER6a2gD%2Bg%40mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1%2BWy%2BN4eE5zPm765h68LrkWc3Biu_8rzzi%2BOYX4j%2BiHRw%40mail.gmail.com
2019-08-28 16:18:29 +12:00
Peter Geoghegan
b8b3a276d4 Remove obsolete nbtree page deletion comment.
Commit efada2b8e9, which made the nbtree page deletion algorithm more
robust, removed the concept of a half-dead internal page.  Remove a
comment about half dead parent pages that was overlooked.
2019-08-27 14:01:43 -07:00
Tom Lane
6e42130568 Reject empty names and recursion in config-file include directives.
An empty file name or subdirectory name leads join_path_components() to
just produce the parent directory name, which leads to weird failures or
recursive inclusions.  Let's throw a specific error for that.  It takes
only slightly more code to detect all-blank names, so do so.

Also, detect direct recursion, ie a file calling itself.  As coded
this will also detect recursion via "include_dir '.'", which is
perhaps more likely than explicitly including the file itself.

Detecting indirect recursion would require API changes for guc-file.l
functions, which seems not worth it since extensions might call them.
The nesting depth limit will catch such cases eventually, just not
with such an on-point error message.

In passing, adjust the example usages in postgresql.conf.sample
to perhaps eliminate the problem at the source: there's no reason
for the examples to suggest that an empty value is valid.

Per a trouble report from Brent Bates.  Back-patch to 9.5; the
issue is old, but the code in 9.4 is enough different that the
patch doesn't apply easily, and it doesn't seem worth the trouble
to fix there.

Ian Barwick and Tom Lane

Discussion: https://postgr.es/m/8c8bcbca-3bd9-dc6e-8986-04a5abdef142@2ndquadrant.com
2019-08-27 14:44:26 -04:00
Tom Lane
ee32782395 Fix postmaster state machine to handle dead_end child crashes better.
A report from Alvaro Herrera shows that if we're in PM_STARTUP
state, and we spawn a dead_end child to reject some incoming
connection request, and that child dies with an unexpected exit
code, the postmaster does not respond well.  We correctly send
SIGQUIT to the startup process, but then:

* if the startup process exits with nonzero exit code, as expected,
we thought that that indicated a crash and aborted startup.

* if the startup process exits with zero exit code, which is possible
due to the inherent race condition, we'd advance to PM_RUN state
which is fine --- but the code forgot that AbortStartTime would be
nonzero in this situation.  We'd either die on the Asserts saying
that it was zero, or perhaps misbehave later on.  (A quick look
suggests that the only misbehavior might be busy-waiting due to
DetermineSleepTime doing the wrong thing.)

To fix the first point, adjust the state-machine logic to recognize
that a nonzero exit code is expected after sending SIGQUIT, and have
it transition to a state where we can restart the startup process.
To fix the second point, change the Asserts to clear the variable
rather than just claiming it should be clear already.

Perhaps we could improve this further by not treating a crash of
a dead_end child as a reason for panic'ing the database.  However,
since those child processes are connected to shared memory, that
seems a bit risky.  There are few good reasons for a dead_end child
to report failure anyway (the cause of this in Alvaro's report is
quite unclear).  On balance, therefore, a minimal fix seems best.

This is an oversight in commit 45811be94.  While that was back-patched,
I'm hesitant to back-patch this change.  The lack of reasons for a
dead_end child to fail suggests that the case should be very rare in
the field, which squares with the lack of reports; so it seems like
this might not be worth the risk of introducing new issues.  In any
case we can let it bake awhile in HEAD before considering a back-patch.

Discussion: https://postgr.es/m/20190615160950.GA31378@alvherre.pgsql
2019-08-26 15:59:44 -04:00
Thomas Munro
f493d98c16 Don't rely on llvm::make_unique.
Bleeding-edge LLVM has stopped supplying replacements for various
C++14 library features, for people on older C++ versions.  Since we're
not ready to require C++14 yet, just use plain old new instead of
make_unique.  As revealed by buildfarm animal seawasp.

Back-patch to 11.

Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/CA%2BhUKGJWG7unNqmkxg7nC5o3o-0p2XP6co4r%3D9epqYMm8UY4Mw%40mail.gmail.com
2019-08-25 14:45:51 +12:00
Peter Geoghegan
867d25ccb4 Explain subtlety in nbtree locking protocol.
The Postgres approach to coupling locks during an ascent of the tree is
slightly different to the approach taken by Lehman and Yao.  Add a new
paragraph to the "Differences to the Lehman & Yao algorithm" section of
the nbtree README that explains the similarities and differences.
2019-08-23 20:24:49 -07:00
Peter Eisentraut
21e60fa8fe Update SQL conformance information
T612 has been fully supported since the major window function
enhancements in PostgreSQL 11, but it wasn't updated at the time.
2019-08-22 15:36:30 +02:00
Peter Eisentraut
a00c53b0cb Make SQL/JSON error code names match SQL standard
There were some minor differences that didn't seem necessary.

Discussion: https://www.postgresql.org/message-id/flat/86b67eef-bb26-c97d-3e35-64f1fbd4f9fe%402ndquadrant.com
2019-08-22 10:45:38 +02:00
Peter Geoghegan
091bd6befc Update comments on nbtree stack struct.
Adjust the struct comment that describes how page splits use their
descent stack to cascade up the tree from the leaf level.

In passing, fix up some unrelated nbtree comments that had typos or were
obsolete.
2019-08-21 13:50:27 -07:00
Peter Eisentraut
c45643d618 Remove configure detection of crypt()
crypt() hasn't been needed since support for crypt password authentication was
removed from PostgreSQL, so these configure checks are no longer necessary.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/21f88934-f00c-27f6-a9d8-7ea06d317781%402ndquadrant.com
2019-08-21 21:36:54 +02:00
Alvaro Herrera
8f75e8e446 Fix typo
In early development patches, "replication origins" were called "identifiers";
almost everything was renamed, but these references to the old terminology
went unnoticed.

Reported-by: Craig Ringer
2019-08-21 11:12:44 -04:00
Peter Eisentraut
db1f28917b Clean up some SCRAM attribute processing
Correct the comment for read_any_attr().  Give a clearer error message
when parsing at the end of the string, when the client-final-message
does not contain a "p" attribute (for some reason).

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/2fb8a15b-de35-682d-a77b-edcc9c52fa12%402ndquadrant.com
2019-08-20 22:33:06 +02:00
Alvaro Herrera
f8cf524da1 Fix bogus comment
Author: Alexander Lakhin
Discussion: https://postgr.es/m/20190819072244.GE18166@paquier.xyz
2019-08-20 16:04:09 -04:00
Tom Lane
e136a0d8ca Restore json{b}_populate_record{set}'s ability to take type info from AS.
If the record argument is NULL and has no declared type more concrete
than RECORD, we can't extract useful information about the desired
rowtype from it.  In this case, see if we're in FROM with an AS clause,
and if so extract the needed rowtype info from AS.
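
A minimal sketch of the restored usage (the column list is illustrative):

  SELECT *
    FROM json_populate_record(NULL::record, '{"a": 1, "b": "two"}')
         AS t(a int, b text);   -- rowtype is again taken from the AS clause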

It worked like this before v11, but commit 37a795a60 removed the
behavior, reasoning that it was undocumented, inefficient, and utterly
not self-consistent.  If you want to take type info from an AS clause,
you should be using the json_to_record() family of functions not the
json_populate_record() family.  Also, it was already the case that
the "populate" functions would fail for a null-valued RECORD input
(with an unfriendly "record type has not been registered" error)
when there wasn't an AS clause at hand, and it wasn't obvious that
that behavior wasn't OK when there was one.  However, it emerges
that some people were depending on this to work, and indeed the
rather off-point error message you got if you left off AS encouraged
slapping on AS without switching to the json_to_record() family.

Hence, put back the fallback behavior of looking for AS.  While at it,
improve the run-time error you get when there's no place to obtain type
info; we can do a lot better than "record type has not been registered".
(We can't, unfortunately, easily improve the parse-time error message
that leads people down this path in the first place.)

While at it, I refactored the code a bit to avoid duplicating the
same logic in several different places.

Per bug #15940 from Jaroslav Sivy.  Back-patch to v11 where the
current coding came in.  (The pre-v11 deficiencies in this area
aren't regressions, so we'll leave those branches alone.)

Patch by me, based on preliminary analysis by Dmitry Dolgov.

Discussion: https://postgr.es/m/15940-2ab76dc58ffb85b6@postgresql.org
2019-08-19 18:01:09 -04:00
Michael Paquier
c96581abe4 Fix inconsistencies and typos in the tree, take 11
This fixes various typos in docs and comments, and removes some orphaned
definitions.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/5da8e325-c665-da95-21e0-c8a99ea61fbf@gmail.com
2019-08-19 16:21:39 +09:00
Tom Lane
927f34ce8a Avoid conflicts with library versions of inet_net_ntop() and friends.
Prefix inet_net_ntop and sibling routines with "pg_" to ensure that
they aren't mistaken for C-library functions.  This fixes warnings
from cpluspluscheck on some platforms, and should help reduce reader
confusion everywhere, since our functions aren't exactly interchangeable
with the library versions (they may have different ideas about address
family codes).

This shouldn't be fixing any actual bugs, unless somebody's linker
is misbehaving, so no need to back-patch.

Discussion: https://postgr.es/m/20518.1559494394@sss.pgh.pa.us
2019-08-18 19:27:23 -04:00
Tom Lane
232720be9b Fix incidental warnings from cpluspluscheck.
Remove use of "register" keyword in hashfn.c.  It's obsolescent
according to recent C++ compilers, and no modern C compiler pays
much attention to it either.

Also fix one cosmetic warning about signed vs unsigned comparison.

Discussion: https://postgr.es/m/20518.1559494394@sss.pgh.pa.us
2019-08-18 19:01:40 -04:00
Tom Lane
4d4c66addf Disallow changing an inherited column's type if not all parents changed.
If a table inherits from multiple unrelated parents, we must disallow
changing the type of a column inherited from multiple such parents, else
it would be out of step with the other parents.  However, it's possible
for the column to ultimately be inherited from just one common ancestor,
in which case a change starting from that ancestor should still be
allowed.  (I would not be excited about preserving that option, were
it not that we have regression test cases exercising it already ...)
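
A hedged sketch of the distinction, with hypothetical tables:

  CREATE TABLE gp (f1 int);
  CREATE TABLE p1 () INHERITS (gp);
  CREATE TABLE p2 () INHERITS (gp);
  CREATE TABLE c  () INHERITS (p1, p2);         -- f1 reaches c via both parents
  ALTER TABLE p1 ALTER COLUMN f1 TYPE bigint;   -- disallowed: c would diverge from p2
  ALTER TABLE gp ALTER COLUMN f1 TYPE bigint;   -- still allowed: single common ancestor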

It's slightly annoying that this patch looks different from the logic
with the same end goal in renameatt(), and more annoying that it
requires an extra syscache lookup to make the test.  However, the
recursion logic is quite different in the two functions, and a
back-patched bug fix is no place to be trying to unify them.

Per report from Manuel Rigger.  Back-patch to 9.5.  The bug exists in
9.4 too (and doubtless much further back); but the way the recursion
is done in 9.4 is a good bit different, so that substantial refactoring
would be needed to fix it in 9.4.  I'm disinclined to do that, or risk
introducing new bugs, for a bug that has escaped notice for this long.

Discussion: https://postgr.es/m/CA+u7OA4qogDv9rz1HAb-ADxttXYPqQdUdPY_yd4kCzywNxRQXA@mail.gmail.com
2019-08-18 17:11:57 -04:00
Andres Freund
f7db0ac7d5 Add default_table_access_method to postgresql.conf.sample.
Reported-By: Heikki Linnakangas
Author: Michael Paquier
Discussion: https://postgr.es/m/d6ffbebb-a0d2-181c-811d-b029b2225ed7@iki.fi
Backpatch: 12-, where pluggable table access methods were introduced
2019-08-16 15:24:22 -07:00
Andres Freund
fb3b098fe8 Remove fmgr.h includes from headers that don't really need it.
Most of the fmgr.h includes were obsoleted by 352a24a1f9. A
few others can be obsoleted using the underlying struct type in an
implementation detail.

Author: Andres Freund
Discussion: https://postgr.es/m/20190803193733.g3l3x3o42uv4qj7l@alap3.anarazel.de
2019-08-16 10:35:31 -07:00
Andres Freund
6a04d345fd Don't include utils/array.h from acl.h.
For most uses of acl.h, the details of how "Acl" looks internally
are irrelevant. It might make sense to move a lot of the
implementation details into a separate header at a later point.

The main motivation of this change is to avoid including fmgr.h (via
array.h, which needs it for exposed structs) in a lot of files that
otherwise don't need it. A subsequent commit will remove the fmgr.h
include from a lot of files.

Directly include utils/array.h and utils/expandeddatum.h from the
files that need them, but previously included them indirectly, via
acl.h.

Author: Andres Freund
Discussion: https://postgr.es/m/20190803193733.g3l3x3o42uv4qj7l@alap3.anarazel.de
2019-08-16 10:33:30 -07:00
Etsuro Fujita
076e9d4209 Remove useless bms_free() calls in build_child_join_rel().
These seem to be leftovers from the original partitionwise-join patch,
perhaps.

Discussion: https://postgr.es/m/CAPmGK145YiMTPRnvev1dLz8na_-0aZ=Xyqn8f2QsJFBUTObNow@mail.gmail.com
2019-08-16 14:35:55 +09:00
Tom Lane
fe9b7b2fe5 Fix plpgsql to re-look-up composite type names at need.
Commit 4b93f5799 rearranged things in plpgsql to make it cope better with
composite types changing underneath it intra-session.  However, I failed to
consider the case of a composite type being dropped and recreated entirely.
In my defense, the previous coding didn't consider that possibility at all
either --- but it would accidentally work so long as you didn't change the
type's field list, because the built-at-compile-time list of component
variables would then still match the type's new definition.  The new
coding, however, occasionally tries to re-look-up the type by OID, and
then fails to find the dropped type.
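
A minimal sketch of the drop-and-recreate scenario this fixes (names are
illustrative):

  CREATE TYPE comp AS (a int, b text);
  CREATE FUNCTION f() RETURNS text LANGUAGE plpgsql AS
  $$ DECLARE v comp; BEGIN v.a := 1; v.b := 'two'; RETURN v.b; END $$;
  SELECT f();                             -- plpgsql caches the composite type
  DROP TYPE comp;
  CREATE TYPE comp AS (a int, b text);    -- same definition, new OID
  SELECT f();                             -- previously failed to find the dropped type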

To fix this, we need to save the TypeName struct, and then redo the type
OID lookup from that.  Of course that's expensive, so we don't want to do
it every time we need the type OID.  This can be fixed in the same way that
4b93f5799 dealt with changes to composite types' definitions: keep an eye
on the type's typcache entry to see if its tupledesc has been invalidated.
(Perhaps, at some point, this mechanism should be generalized so it can
work for non-composite types too; but for now, plpgsql only tries to
cope with intra-session redefinitions of composites.)

I'm slightly hesitant to back-patch this into v11, because it changes
the contents of struct PLpgSQL_type as well as the signature of
plpgsql_build_datatype(), so in principle it could break code that is
poking into the innards of plpgsql.  However, the only popular extension
of that ilk is pldebugger, and it doesn't seem to be affected.  Since
this is a regression for people who were relying on the old behavior,
it seems worth taking the small risk of causing compatibility issues.

Per bug #15913 from Daniel Fiori.  Back-patch to v11 where 4b93f5799
came in.

Discussion: https://postgr.es/m/15913-a7e112e16dedcffc@postgresql.org
2019-08-15 15:21:47 -04:00
Tom Lane
bb5ae8f6c4 Use a hash table to de-duplicate NOTIFY events faster.
Previously, async.c got rid of duplicate notifications by scanning
the list of pending events to compare each one to the proposed new
event.  This works okay for very small numbers of distinct events,
but degrades as O(N^2) for many events.  We can improve matters by
using a hash table to probe for duplicates.  So as not to add a
lot of overhead for the simple cases that the code did handle well
before, create the hash table only once a (sub)transaction has
queued more than 16 distinct notify events.

A downside is that we now have to do per-event work to propagate
a successful subtransaction's notify events up to its parent.
(But this isn't significant unless the subtransaction had many
events, in which case the O(N^2) behavior would have been in
play already, so we still come out ahead.)

We can make some lemonade out of this lemon, though: since we must
examine each event anyway, it's now possible to de-duplicate events
fully, rather than skipping that for events merged up from
subtransactions.  Hence, remove the old weasel wording in notify.sgml
about whether de-duplication happens or not, and adjust the test
case in async-notify.spec that exhibited the old behavior.
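
A sketch of the user-visible effect on de-duplication (channel and payload are
arbitrary):

  LISTEN ch;
  BEGIN;
  NOTIFY ch, 'payload';
  SAVEPOINT s;
  NOTIFY ch, 'payload';      -- merged up from the subtransaction on release
  RELEASE SAVEPOINT s;
  COMMIT;                    -- with full de-duplication, one notification is delivered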

While at it, rearrange the definition of struct Notification to make
it more compact and require just one palloc per event, rather than
two or three.  This saves space when there are a lot of events,
in fact more than enough to buy back the space needed for the hash
table.

Patch by me, based on discussions around a different patch
submitted by Filip Rembiałkowski.

Discussion: https://postgr.es/m/17822.1564186806@sss.pgh.pa.us
2019-08-15 12:22:12 -04:00
Tom Lane
f1bf619acd Fix ALTER SYSTEM to cope with duplicate entries in postgresql.auto.conf.
ALTER SYSTEM itself normally won't make duplicate entries (although
up till this patch, it was possible to confuse it by writing case
variants of a GUC's name).  However, if some external tool has appended
entries to the file, that could result in duplicate entries for a single
GUC name.  In such a situation, ALTER SYSTEM did exactly the wrong thing,
because it replaced or removed only the first matching entry, leaving
the later one(s) still there and hence still determining the active value.

This patch fixes that by making ALTER SYSTEM sweep through the file and
remove all matching entries, then (if not ALTER SYSTEM RESET) append the
new setting to the end.  This means entries will be in order of last
setting rather than first setting, but that shouldn't hurt anything.

Also, make the comparisons case-insensitive so that the right things
happen if you do, say, ALTER SYSTEM SET "TimeZone" = 'whatever'.
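
A brief sketch of the now-consistent behavior:

  ALTER SYSTEM SET "TimeZone" = 'UTC';   -- matches existing "timezone" entries case-insensitively
  ALTER SYSTEM RESET timezone;           -- removes every matching entry from postgresql.auto.conf
  SELECT pg_reload_conf();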

This has been broken since ALTER SYSTEM was invented, so back-patch
to all supported branches.

Ian Barwick, with minor mods by me

Discussion: https://postgr.es/m/aed6cc9f-98f3-2693-ac81-52bb0052307e@2ndquadrant.com
2019-08-14 15:09:42 -04:00
Peter Geoghegan
9c02cf5661 Remove block number field from nbtree stack.
The initial value of the nbtree stack downlink block number field
recorded during an initial descent of the tree wasn't actually used.
Both _bt_getstackbuf() callers overwrote the value with their own value.

Remove the block number field from the stack struct, and add a child
block number argument to _bt_getstackbuf() in its place.  This makes the
overall design of _bt_getstackbuf() clearer.

Author: Peter Geoghegan
Reviewed-By: Anastasia Lubennikova
Discussion: https://postgr.es/m/CAH2-Wzmx+UbXt2YNOUCZ-a04VdXU=S=OHuAuD7Z8uQq-PXTYUg@mail.gmail.com
2019-08-14 11:32:35 -07:00
Peter Eisentraut
fded4773eb initdb: Remove obsolete locale handling
The method of passing LC_COLLATE and LC_CTYPE to the backend during
initdb is obsolete as of 61d9674988.
This can all be removed.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/eeaf2f99-a1a6-8aca-3f43-9ab0b2fb112a%402ndquadrant.com
2019-08-14 06:51:13 +02:00
Peter Geoghegan
68ef887842 Remove obsolete nbtree README commentary.
Commit d2086b08b0 removed almost all cases where nbtree must release a
read buffer lock and acquire a write buffer lock instead, so remaining
cases in which that's still necessary are not notable enough to appear
in the nbtree README.

More importantly, holding on to a buffer pin in cases where nbtree must
trade a read lock for a write lock is very unlikely to save any I/O.
This seems to have been a long overlooked throwback to a time when
nbtree cared about write-ordering dependencies, and performed
synchronous buffer writes.  It hasn't worked that way in many years.
2019-08-13 17:16:44 -07:00
Peter Geoghegan
af0ba49809 Use PageIndexTupleOverwrite() within nbtree.
Use the PageIndexTupleOverwrite() bufpage.c routine within nbtree
instead of deleting a tuple and re-inserting its replacement.  This
makes the intent of affected code slightly clearer.  It also makes
CREATE INDEX slightly faster, since there is no longer a need to shift
every leaf page's line pointer array back and forth during index builds.

Author: Peter Geoghegan, Anastasia Lubennikova
Reviewed-By: Anastasia Lubennikova
Discussion: https://postgr.es/m/CAH2-Wz=Zk=B9+Vwm376WuO7YTjFc2SSskifQm4Nme3RRRPtOSQ@mail.gmail.com
2019-08-13 11:54:26 -07:00
Alvaro Herrera
815ef2f568 Don't constraint-exclude partitioned tables as much
We only need to invoke constraint exclusion on partitioned tables when
they are a partition, and they themselves contain a default partition;
it's not necessary otherwise, and it's expensive, so avoid it.  Also, we
were trying once for each clause separately, but we can do it for all
the clauses at once.

While at it, centralize setting of RelOptInfo->partition_qual instead of
computing it in slightly different ways in different places.

Per complaints from Simon Riggs about 4e85642d935e; reviewed by Yuzuko
Hosoya, Kyotaro Horiguchi.

Author: Amit Langote.  I (Álvaro) again mangled the patch somewhat.
Discussion: https://postgr.es/m/CANP8+j+tMCY=nEcQeqQam85=uopLBtX-2vHiLD2bbp7iQQUKpA@mail.gmail.com
2019-08-13 10:26:04 -04:00
Michael Paquier
66bde49d96 Fix inconsistencies and typos in the tree, take 10
This addresses some issues with unnecessary code comments, fixes various
typos in docs and comments, and removes some orphaned structures and
definitions.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/9aabc775-5494-b372-8bcb-4dfc0bd37c68@gmail.com
2019-08-13 13:53:41 +09:00
Tom Lane
03c811a483 Fix planner's test for case-foldable characters in ILIKE with ICU.
As coded, the ICU-collation path in pattern_char_isalpha() failed
to consider regular ASCII letters to be case-varying.  This led to
like_fixed_prefix treating too much of an ILIKE pattern as being a
fixed prefix, so that indexscans derived from an ILIKE clause might
miss entries that they should find.

Per bug #15892 from James Inform.  This is an oversight in the original
ICU patch (commit eccfef81e), so back-patch to v10 where that came in.

Discussion: https://postgr.es/m/15892-e5d2bea3e8a04a1b@postgresql.org
2019-08-12 13:15:47 -04:00
Tom Lane
3c926587b5 Remove EState.es_range_table_array.
Now that list_nth is O(1), there's no good reason to maintain a
separate array of RTE pointers rather than indexing into
estate->es_range_table.  Deleting the array doesn't save all that
much either; but just on cleanliness grounds, it's better not to
have duplicate representations of the identical information.

Discussion: https://postgr.es/m/14960.1565384592@sss.pgh.pa.us
2019-08-12 11:58:35 -04:00
Tom Lane
5ee190f8ec Rationalize use of list_concat + list_copy combinations.
In the wake of commit 1cff1b95a, the result of list_concat no longer
shares the ListCells of the second input.  Therefore, we can replace
"list_concat(x, list_copy(y))" with just "list_concat(x, y)".

To improve call sites that were list_copy'ing the first argument,
or both arguments, invent "list_concat_copy()" which produces a new
list sharing no ListCells with either input.  (This is a bit faster
than "list_concat(list_copy(x), y)" because it makes the result list
the right size to start with.)

In call sites that were not list_copy'ing the second argument, the new
semantics mean that we are usually leaking the second List's storage,
since typically there is no remaining pointer to it.  We considered
inventing another list_copy variant that would list_free the second
input, but concluded that for most call sites it isn't worth worrying
about, given the relative compactness of the new List representation.
(Note that in cases where such leakage would happen, the old code
already leaked the second List's header; so we're only discussing
the size of the leak not whether there is one.  I did adjust two or
three places that had been troubling to free that header so that
they manually free the whole second List.)

Patch by me; thanks to David Rowley for review.

Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-08-12 11:20:18 -04:00
Alexander Korotkov
251c8e39bc Fix string comparison in jsonpath
Take into account pg_server_to_any() may return input string "as is".

Reported-by: Andrew Dunstan, Thomas Munro
Discussion: https://postgr.es/m/0ed83a33-d900-466a-880a-70ef456c721f%402ndQuadrant.com
Author: Alexander Korotkov, Thomas Munro
Backpatch-through: 12
2019-08-12 06:26:13 +03:00
Alexander Korotkov
d54ceb9e17 Adjust string comparison in jsonpath
We have implemented jsonpath string comparison using the default database locale.
However, the standard requires us to compare Unicode codepoints.  This commit
implements that, but for performance reasons we still use per-byte comparison
for the "==" operator.  Thus, for consistency, other comparison operators do
per-byte comparison if the Unicode codepoints appear to be equal.

In some edge cases, when the same Unicode codepoints have different binary
representations in the database encoding, we diverge from the standard to
achieve better performance of the "==" operator.  In the future, to implement
strict standard conformance, we can normalize input JSON strings.
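
For illustration, a small example of a jsonpath string comparison affected by
this (ordering is now by Unicode codepoint):

  SELECT jsonb_path_exists('["apple"]', '$[*] ? (@ < "banana")');   -- true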

Original patch was written by Nikita Glukhov, rewritten by me.

Reported-by: Markus Winand
Discussion: https://postgr.es/m/8B7FA3B4-328D-43D7-95A8-37B8891B8C78%40winand.at
Author: Nikita Glukhov, Alexander Korotkov
Backpatch-through: 12
2019-08-11 22:54:53 +03:00
Tom Lane
cabe0f298e Fix "ANALYZE t, t" inside a transaction block.
This failed with either "tuple already updated by self" or "duplicate
key value violates unique constraint", depending on whether the table
had previously been analyzed or not.  The reason is that ANALYZE tried
to insert or update the same pg_statistic rows twice, and there was no
CommandCounterIncrement between.  So add one.  The same case works fine
outside a transaction block, because then there's a whole transaction
boundary between, as a consequence of the way VACUUM works.
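
The failing case, for reference (using the regression-test table name):

  CREATE TABLE vactst (i int);
  INSERT INTO vactst VALUES (1);
  BEGIN;
  ANALYZE vactst, vactst;   -- previously failed with one of the errors described above
  COMMIT;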

This issue has been latent all along, but the problem was unreachable
before commit 11d8d72c2 added the ability to specify multiple tables
in ANALYZE.  We could, perhaps, alternatively fix it by adding code to
de-duplicate the list of VacuumRelations --- but that would add a
lot of overhead to work around dumb commands, so it's not attractive.

Per bug #15946 from Yaroslav Schekin.  Back-patch to v11.

(Note: in v11 I also back-patched the test added by commit 23224563d;
otherwise the problem doesn't manifest in the test I added, because
"vactst" is empty when the tests for multiple ANALYZE targets are
reached.  That seems like not a very good thing anyway, so I did this
rather than rethinking the choice of test case.)

Discussion: https://postgr.es/m/15946-5c7570a2884a26cf@postgresql.org
2019-08-10 11:30:11 -04:00
Peter Geoghegan
d8cd68c8d4 Rename tuplesort.c's SortTuple.tupindex field.
Rename the "tupindex" field from tuplesort.c's SortTuple struct to
"srctape", since it can only ever be used to store a source/input tape
number when merging external sort runs.  This has been the case since
commit 8b304b8b72, which removed replacement selection sort from
tuplesort.c.
2019-08-09 17:06:45 -07:00
Tom Lane
0662eb6219 Fix SIGSEGV in pruning for ScalarArrayOp with constant-null array.
Not much to be said here: commit 9fdb675fc should have checked
constisnull, didn't.
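
A sketch of the query shape that crashed (hypothetical partitioned table):

  CREATE TABLE parted (k int) PARTITION BY LIST (k);
  CREATE TABLE parted_1 PARTITION OF parted FOR VALUES IN (1);
  SELECT * FROM parted WHERE k = ANY (NULL::int[]);   -- pruning previously hit the SIGSEGV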

Per report from Piotr Włodarczyk.  Back-patch to v11 where
bug was introduced.

Discussion: https://postgr.es/m/CAP-dhMr+vRpwizEYjUjsiZ1vwqpohTm+3Pbdt6Pr7FEgPq9R0Q@mail.gmail.com
2019-08-09 13:20:28 -04:00
Tom Lane
1661a40505 Cosmetic improvements in setup of planner's per-RTE arrays.
Merge setup_append_rel_array into setup_simple_rel_arrays.  There's no
particularly good reason to keep them separate, and it's inconsistent
with the lack of separation in expand_planner_arrays.  The only apparent
benefit was that the fast path for trivial queries in query_planner()
doesn't need to set up the append_rel_array; but all we're saving there
is an if-test and NULL assignment, which surely ought to be negligible.

Also improve some obsolete comments.

Discussion: https://postgr.es/m/17220.1565301350@sss.pgh.pa.us
2019-08-09 12:33:43 -04:00
Michael Paquier
b8f2da0ac5 Refactor logic to remove trailing CR/LF characters from strings
b654714 reworked the way trailing CR/LF characters are removed from
strings.  This commit introduces a new routine in common/string.c and
refactors the code so that the logic is, for the most part, in a single place.

Author: Michael Paquier
Reviewed-by: Bruce Momjian
Discussion: https://postgr.es/m/20190801031820.GF29334@paquier.xyz
2019-08-09 11:05:14 +09:00
Peter Geoghegan
28b901f73a Update obsolete tuplesort READTUP() comment.
READTUP() routines do not and cannot use the resettable "tuplecontext"
memory context, since it is deleted when merging begins.  Update an
obsolete comment that claimed otherwise.  This was an oversight in
commit e94568ecc1.

In passing, fix an unrelated tuplesort typo.
2019-08-08 13:20:44 -07:00
Alvaro Herrera
e1f4c481b9 Remove unnecessary #include <limits.h>
This include was probably copied from tuplestore.c, but it's not needed.

Extracted from a larger patch submitted by vignesh C <vignesh21@gmail.com>

Discussion: https://postgr.es/m/CALDaNm1B9naPDTm3ox1m_yZvOm3KA5S4kZQSWWAeLHAQ=3gV1Q@mail.gmail.com
2019-08-07 16:55:31 -04:00
Alvaro Herrera
12afc7145c Add comment on no default partition with hash partitioning
Discussion: https://postgr.es/m/20190806222735.GA9535@alvherre.pgsql
2019-08-07 12:27:47 -04:00
Alvaro Herrera
4e85642d93 Apply constraint exclusion more generally in partitioning
We were applying constraint exclusion on the partition constraint when
generating pruning steps for a clause, but only for the rather
restricted situation of them being boolean OR operators; however it is
possible to have differently shaped clauses that also benefit from
constraint exclusion.  This applies particularly to the default
partition since their constraints are in essence a long list of OR'ed
subclauses ... but it applies to other cases too.  So in certain cases
we're scanning partitions that we don't need to.

Remove the specialized code in OR clauses, and add a generally
applicable test of the clause refuting the partition constraint; mark
the whole pruning operation as contradictory if it hits.

This has the unwanted side-effect of testing some (most? all?)
constraints more than once if constraint_exclusion=on.  That seems
unavoidable as far as I can tell without some additional work, but
that's not the recommended setting for that parameter anyway.
However, because this imposes additional processing cost for all
queries using partitioned tables, I decided not to backpatch this
change.

Author: Amit Langote, Yuzuko Hosoya, Álvaro Herrera
Reviewers: Shawn Wang, Thibaut Madeleine, Yoshikazu Imai, Kyotaro
Horiguchi; they were also uncredited reviewers for commit 489247b0e6.
Discussion: https://postgr.es/m/9bb31dfe-b0d0-53f3-3ea6-e64b811424cf@lab.ntt.co.jp
2019-08-07 12:21:54 -04:00
Etsuro Fujita
68343b4ad7 Fix typos in comments. 2019-08-07 19:05:17 +09:00
Heikki Linnakangas
1169fcf129 Fix predicate-locking of HOT updated rows.
In serializable mode, heap_hot_search_buffer() incorrectly acquired a
predicate lock on the root tuple, not the returned tuple that satisfied
the visibility checks. As explained in README-SSI, the predicate lock does
not need to be copied or extended to other tuple versions, but for that to
work, the correct, visible, tuple version must be locked in the first
place.

The original SSI commit had this bug in it, but it was fixed back in 2013,
in commit 81fbbfe335. But unfortunately, it was reintroduced a few months
later in commit b89e151054. Wising up from that, add a regression test
to cover this, so that it doesn't get reintroduced again. Also, move the
code that sets 't_self', so that it happens at the same time that the
other HeapTuple fields are set, to make it clearer that all the code in
the loop operates on the "current" tuple in the chain, not the root tuple.

Bug spotted by Andres Freund, analysis and original fix by Thomas Munro,
test case and some additional changes to the fix by Heikki Linnakangas.
Backpatch to all supported versions (9.4).

Discussion: https://www.postgresql.org/message-id/20190731210630.nqhszuktygwftjty%40alap3.anarazel.de
2019-08-07 12:40:49 +03:00
Michael Paquier
64579be64a Fix some incorrect parsing of time with time zone strings
When parsing a timetz string with a dynamic timezone abbreviation or with no
timezone specified, it was possible to generate incorrect timestamps
based on a date using uninitialized variables if the input
string did not fully specify a date to parse.  This is already checked
when a full timezone spec is included in the input string, but the two
other cases mentioned above missed the same checks.

This gets fixed by generating an error as this input is invalid, or in
short when a date is not fully specified.

Valgrind was complaining about this problem.

Bug: #15910
Author: Alexander Lakhin
Discussion: https://postgr.es/m/15910-2eba5106b9aa0c61@postgresql.org
Backpatch-through: 9.4
2019-08-07 18:16:31 +09:00
Michael Paquier
75c1921cd6 Adjust tuple data lookup logic in multi-insert logical decoding
As of now, logical decoding of a multi-insert has been scanning all
xl_multi_insert_tuple entries only if XLH_INSERT_CONTAINS_NEW_TUPLE was
getting set in the record.  This is not an issue on HEAD as multi-insert
records are not used for system catalogs, but the logical decoding logic
includes all the code necessary to handle that properly, except that the
code failed to iterate correctly over all xl_multi_insert_tuple entries
when the flag is not set.  Hence, when trying to use multi-insert for
system catalogs, an assertion would be triggered.

An upcoming patch is going to make use of multi-insert for system
catalogs, and this fixes the logic to make sure that all entries are
scanned correctly without softening the existing assertions.

Reported-by: Daniel Gustafsson
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/CBFFD532-C033-49EB-9A5A-F67EAEE9EB0B@yesql.se
2019-08-07 10:28:16 +09:00
Michael Paquier
940c8b01b0 Fix typo in pathnode.c
Author: Amit Langote
Discussion: https://postgr.es/m/CA+HiwqFhZ6ABoz-i=JZ5wMMyz-orx4asjR0og9qBtgEwOww6Yg@mail.gmail.com
2019-08-06 18:11:02 +09:00
Tom Lane
4766dce0dd Fix choice of comparison operators for cross-type hashed subplans.
Commit bf6c614a2 rearranged the lookup of the comparison operators
needed in a hashed subplan, and in so doing, broke the cross-type
case: it caused the original LHS-vs-RHS operator to be used to compare
hash table entries too (which of course are all of the RHS type).
This leads to C functions being passed a Datum that is not of the
type they expect, with the usual hazards of crashes and unauthorized
server memory disclosure.

For the set of hashable cross-type operators present in v11 core
Postgres, this bug is nearly harmless on 64-bit machines, which
may explain why it escaped earlier detection.  But it is a live
security hazard on 32-bit machines; and of course there may be
extensions that add more hashable cross-type operators, which
would increase the risk.

Reported by Andreas Seltenreich.  Back-patch to v11 where the
problem came in.

Security: CVE-2019-10209
2019-08-05 11:20:31 -04:00
Noah Misch
ffa2d37e5f Require the schema qualification in pg_temp.type_name(arg).
Commit aa27977fe2 introduced this
restriction for pg_temp.function_name(arg); do likewise for types
created in temporary schemas.  Programs that this breaks should add
"pg_temp." schema qualification or switch to arg::type_name syntax.
Back-patch to 9.4 (all supported versions).

Reviewed by Tom Lane.  Reported by Tom Lane.

Security: CVE-2019-10208
2019-08-05 07:48:41 -07:00
Michael Paquier
a76cfba663 Add safeguards in LSN, numeric and float calculation for custom errors
Those data types use parsing and/or calculation wrapper routines which
can generate some generic error messages in the event of a failure.  The
caller of these routines can also pass a pointer variable settable by
the routine to track if an error has happened, letting the caller decide
what to do in the event of an error and what error message to generate.

Those routines have been lax about initializing the tracking flag, which
can be confusing when reading the code, so add some safeguards against
calls of these parsing routines that could lead to a dubious result.

The LSN parsing gains an assertion to make sure that the tracking flag
is set, while numeric and float paths initialize the flag to a saner
state.

Author: Jeevan Ladhe
Reviewed-by: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/CAOgcT0NOM9oR0Hag_3VpyW0uF3iCU=BDUFSPfk9JrWXRcWQHqw@mail.gmail.com
2019-08-05 15:35:16 +09:00
Michael Paquier
8548ddc61b Fix inconsistencies and typos in the tree, take 9
This addresses more issues with code comments, variable names and
unreferenced variables.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/7ab243e0-116d-3e44-d120-76b3df7abefd@gmail.com
2019-08-05 12:14:58 +09:00
Tomas Vondra
75506195da Revert "Add log_statement_sample_rate parameter"
This reverts commit 88bdbd3f74.

As committed, statement sampling used the existing duration threshold
(log_min_duration_statement) when deciding which statements to sample.
The issue is that even the longest statements are subject to sampling,
and so may not end up logged. An improvement was proposed, introducing
a second duration threshold, but it would not be backwards compatible.
So we've decided to revert this feature - the separate threshold should
be part of the feature itself.

Discussion: https://postgr.es/m/CAFj8pRDS8tQ3Wviw9%3DAvODyUciPSrGeMhJi_WPE%2BEB8%2B4gLL-Q%40mail.gmail.com
2019-08-04 23:38:27 +02:00
Tomas Vondra
4f9ed8f3c5 Revert "Silence compiler warning"
This reverts commit 9dc1225855.

As committed, statement sampling used the existing duration threshold
(log_min_duration_statement) when deciding which statements to sample.
The issue is that even the longest statements are subject to sampling,
and so may not end up logged. An improvement was proposed, introducing
a second duration threshold, but it would not be backwards compatible.
So we've decided to revert this feature - the separate threshold should
be part of the feature itself.

Discussion: https://postgr.es/m/CAFj8pRDS8tQ3Wviw9%3DAvODyUciPSrGeMhJi_WPE%2BEB8%2B4gLL-Q%40mail.gmail.com
2019-08-04 23:38:19 +02:00
Alvaro Herrera
489247b0e6 Improve pruning of a default partition
When querying a partitioned table containing a default partition, we
were wrongly deciding to include it in the scan too early in the
process, failing to exclude it in some cases.  If we reinterpret the
PruneStepResult.scan_default flag slightly, we can do a better job at
detecting that it can be excluded.  The change is that we avoid setting
the flag for that pruning step unless the step absolutely requires the
default partition to be scanned (in contrast with the previous
arrangement, which was to set it unless the step was able to prune it).
So get_matching_partitions() must explicitly check the partition that
each returned bound value corresponds to in order to determine whether
the default one needs to be included, rather than relying on the flag
from the final step result.

Author: Yuzuko Hosoya <hosoya.yuzuko@lab.ntt.co.jp>
Reviewed-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Discussion: https://postgr.es/m/00e601d4ca86$932b8bc0$b982a340$@lab.ntt.co.jp
2019-08-04 11:18:45 -04:00
Michael Paquier
69edf4f880 Refactor BuildIndexInfo() with the new makeIndexInfo()
This portion of the code was overlooked in 7cce159, which introduced a
new routine to build this node; this finishes unifying the places where
IndexInfo is initialized.

Author: Michael Paquier
Discussion: https://postgr.es/m/20190801041322.GA3435@paquier.xyz
2019-08-04 11:18:57 +09:00
Andres Freund
2abd7ae9b2 Fix representation of hash keys in Hash/HashJoin nodes.
In 5f32b29c18 I changed the creation of HashState.hashkeys to
actually use HashState as the parent (instead of HashJoinState, which
was incorrect, as they were executed below HashState), to fix the
problem of hashkeys expressions otherwise relying on slot types
appropriate for HashJoinState, rather than HashState as would be
correct. That reliance was only introduced in 12, which is why it
previously worked to use HashJoinState as the parent (although I'd be
unsurprised if there were problematic cases).

Unfortunately that's not a sufficient solution, because before this
commit, the to-be-hashed expressions referenced inner/outer as
appropriate for the HashJoin, not Hash. That didn't have obvious bad
consequences, because the slots containing the tuples were put into
ecxt_innertuple when hashing a tuple for HashState (even though Hash
doesn't have an inner plan).

There are less common cases where this can cause visible problems
however (rather than just confusion when inspecting such executor
trees). E.g. "ERROR: bogus varno: 65000", when explaining queries
containing a HashJoin where the subsidiary Hash node's hash keys
reference a subplan. While normally hashkeys aren't displayed by
EXPLAIN, if one of those expressions references a subplan, that
subplan may be printed as part of the Hash node - which then failed
because an inner plan was referenced, and Hash doesn't have that.

It seems quite possible that there's other broken cases, too.

Fix the problem by properly splitting the expression for the HashJoin
and Hash nodes at plan time, and have them reference the proper
subsidiary node. While other workarounds are possible, fixing this
correctly seems easy enough. It was a pretty ugly hack to have
ExecInitHashJoin put the expression into the already initialized
HashState, in the first place.

I decided to not just split inner/outer hashkeys inside
make_hashjoin(), but also to separate out hashoperators and
hashcollations at plan time. Otherwise we would have ended up having
two very similar loops, one at plan time and the other during executor
startup. The work seems to more appropriately belong to plan time,
anyway.

Reported-By: Nikita Glukhov, Alexander Korotkov
Author: Andres Freund
Reviewed-By: Tom Lane, in an earlier version
Discussion: https://postgr.es/m/CAPpHfdvGVegF_TKKRiBrSmatJL2dR9uwFCuR+teQ_8tEXU8mxg@mail.gmail.com
Backpatch: 12-
2019-08-02 00:02:46 -07:00
Tom Lane
7266d0997d Allow functions-in-FROM to be pulled up if they reduce to constants.
This allows simplification of the plan tree in some common usage
patterns: we can get rid of a join to the function RTE.

In principle we could pull up any immutable expression, but restricting
it to Consts avoids the risk that multiple evaluations of the expression
might cost more than we can save.  (Possibly this could be improved in
future --- but we've more or less promised people that putting a function
in FROM guarantees single evaluation, so we'd have to tread carefully.)

To do this, we need to rearrange when eval_const_expressions()
happens for expressions in function RTEs.  I moved it to
inline_set_returning_functions(), which already has to iterate over
every function RTE, and in consequence renamed that function to
preprocess_function_rtes().  A useful consequence is that
inline_set_returning_function() no longer has to do this for itself,
simplifying that code.

In passing, break out pull_up_simple_subquery's code that knows where
everything that needs pullup_replace_vars() processing is, so that
the new pull_up_constant_function() routine can share it.  We'd
gotten away with one-and-a-half copies of that code so far, since
pull_up_simple_values() could assume that a lot of cases didn't apply
to it --- but I don't think pull_up_constant_function() can make any
simplifying assumptions.  Might as well make pull_up_simple_values()
use it too.

(Possibly this refactoring should go further: maybe we could share
some of the code to fill in the pullup_replace_vars_context struct?
For now, I left it that the callers fill that completely.)

Note: the one existing test case that this patch changes has to be
changed because inlining its function RTEs would destroy the point
of the test, namely to check join order.

Alexander Kuzmenkov and Aleksandr Parfenov, reviewed by
Antonin Houska and Anastasia Lubennikova, and whacked around
some more by me

Discussion: https://postgr.es/m/402356c32eeb93d4fed01f66d6c7fe2d@postgrespro.ru
2019-08-01 18:50:22 -04:00
Peter Geoghegan
71dcd74386 Add sort support routine for the inet data type.
Add sort support for inet, including support for abbreviated keys.
Testing has shown that this reduces the time taken to sort medium to
large inet/cidr inputs by ~50-60% in realistic cases.

Author: Brandur Leach
Reviewed-By: Peter Geoghegan, Edmund Horner
Discussion: https://postgr.es/m/CABR_9B-PQ8o2MZNJ88wo6r-NxW2EFG70M96Wmcgf99G6HUQ3sw@mail.gmail.com
2019-08-01 09:34:14 -07:00
Tom Lane
da9456d22a Add an isolation test to exercise parallel-worker deadlock resolution.
Commit a1c1af2a1 added logic in the deadlock checker to handle lock
grouping, but it was very poorly tested, as evidenced by the bug
fixed in 3420851a2.  Add a test case that exercises that a bit better
(and catches the bug --- if you revert 3420851a2, this will hang).

Since it's pretty hard to get parallel workers to take exclusive
regular locks that their parents don't already have, this test operates
by creating a deadlock among advisory locks taken in parallel workers.
To make that happen, we must override the parallel-safety labeling of
the advisory-lock functions, which we do by putting them in mislabeled,
non-inlinable wrapper functions.

We also have to remove the redundant PreventAdvisoryLocksInParallelMode
checks in lockfuncs.c.  That seems fine though; if some user accidentally
does what this test is intentionally doing, not much harm will ensue.
(If there are any remaining bugs that are reachable that way, they're
probably reachable in other ways too.)

Discussion: https://postgr.es/m/3243.1564437314@sss.pgh.pa.us
2019-08-01 11:50:00 -04:00
Peter Eisentraut
fd6ec93bf8 Add error codes to some corruption log messages
In some cases we were using elog(ERROR) even though the corruption is
certain and a specific error code, ERRCODE_DATA_CORRUPTED or
ERRCODE_INDEX_CORRUPTED, can be given.

Author: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://www.postgresql.org/message-id/flat/25F6C686-6442-4A6B-BAF8-A6F7B84B16DE@yandex-team.ru
2019-08-01 11:15:26 +02:00
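A minimal sketch of the reporting pattern this commit moves toward, assuming a hypothetical backend helper with a Relation and block number at hand; the message text and function name are illustrative only:

```c
#include "postgres.h"
#include "storage/block.h"
#include "utils/rel.h"

/* Hypothetical helper: report certain index corruption with a specific
 * SQLSTATE instead of the generic internal-error default. */
static void
report_index_corruption(Relation rel, BlockNumber blkno)
{
	ereport(ERROR,
			(errcode(ERRCODE_INDEX_CORRUPTED),
			 errmsg("index \"%s\" is corrupted at block %u",
					RelationGetRelationName(rel), blkno)));
}
```
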
Andres Freund
870b1d6800 Remove superfluous newlines in function prototypes.
These were introduced by pgindent due to a fix for broken
indentation (cf. 8255c7a5ee).  Previously the mis-indentation of
function prototypes was creatively used to reduce indentation in a few
places.

As that formatting only exists in master and REL_12_STABLE, it seems
better to fix it in both, rather than having some odd indentation in
v12 that somebody might copy for future patches or such.

Author: Andres Freund
Discussion: https://postgr.es/m/20190728013754.jwcbe5nfyt3533vx@alap3.anarazel.de
Backpatch: 12-
2019-07-31 00:05:21 -07:00
Andres Freund
6384e87be2 Remove superfluous semicolon.
Author: Andres Freund
2019-07-30 22:09:53 -07:00
Heikki Linnakangas
a29834beb1 Allow table AM's to use rd_amcache, too.
The rd_amcache allows an index AM to cache arbitrary information in a
relcache entry. This commit moves the cleanup of rd_amcache so that it
can also be used by table AMs. Nothing takes advantage of that yet, but
I'm sure it'll come handy for anyone writing new table AMs.

Backpatch to v12, where table AM interface was introduced.

Reviewed-by: Julien Rouhaud
2019-07-30 21:43:27 +03:00
Tomas Vondra
14ef15a222 Don't build extended statistics on inheritance trees
When performing ANALYZE on inheritance trees, we collect two samples for
each relation - one for the relation alone, and one for the inheritance
subtree (relation and its child relations). And then we build statistics
on each sample, so for each relation we get two sets of statistics.

For regular (per-column) statistics this works fine, because the catalog
includes a flag differentiating statistics built from those two samples.
But we don't have such flag in the extended statistics catalogs, and we
ended up updating the same row twice, triggering this error:

  ERROR:  tuple already updated by self

The simplest solution is to disable extended statistics on inheritance
trees, which is what this commit is doing. In the future we may need to
do something similar to per-column statistics, but that requires adding a
flag to the catalog - and that's not backpatchable. Moreover, the current
selectivity estimation code only works with individual relations, so
building statistics on inheritance trees would be pointless anyway.

Author: Tomas Vondra
Backpatch-to: 10-
Discussion: https://postgr.es/m/20190618231233.GA27470@telsasoft.com
Reported-by: Justin Pryzby
2019-07-30 19:47:33 +02:00
Tom Lane
3420851a2c Fix busted logic for parallel lock grouping in TopoSort().
A "break" statement erroneously left behind by commit a1c1af2a1
caused TopoSort to do the wrong thing if a lock's wait list
contained multiple members of the same locking group.

Because parallel workers don't normally need any locks not already
taken by their leader, this is very hard --- maybe impossible ---
to hit in production.  Still, if it did happen, the queries involved
in an otherwise-resolvable deadlock would block until canceled.

In addition to removing the bogus "break", add an Assert showing
that the conflicting uses of the beforeConstraints[] array (for both
counts and flags) don't overlap, and add some commentary explaining
why not; because it's not obvious without explanation, IMHO.

Original report and patch from Rui Hai Jiang; additional assert
and commentary by me.  Back-patch to 9.6 where the bug came in.

Discussion: https://postgr.es/m/CAEri+mLd3bpHLyW+a9pSe1y=aEkeuJpwBSwvo-+m4n7-ceRmXw@mail.gmail.com
2019-07-29 18:49:04 -04:00
Michael Paquier
eb43f3d193 Fix inconsistencies and typos in the tree
This is numbered take 8, and addresses again a set of issues with code
comments, variable names and unreferenced variables.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/b137b5eb-9c95-9c2f-586e-38aba7d59788@gmail.com
2019-07-29 12:28:30 +09:00
Michael Paquier
7cce159349 Fix handling of expressions and predicates in REINDEX CONCURRENTLY
When copying the definition of an index rebuilt concurrently for the new
entry, the index information was taken directly from the old index using
the relation cache.  In this case, predicates and expressions have
some post-processing to prepare things for the planner, which loses some
information including the collations added in any of them.

This inconsistency can cause issues when attempting for example a table
rewrite, and makes the new indexes rebuilt concurrently inconsistent
with the old entries.

In order to fix the problem, fetch expressions and predicates directly
from the catalog of the old entry, and fill in IndexInfo for the new
index with that.  This makes the process more consistent with
DefineIndex(), and the code is refactored with the addition of a routine
to create an IndexInfo node.

Reported-by: Manuel Rigger
Author: Michael Paquier
Discussion: https://postgr.es/m/CA+u7OA5Hp0ra235F3czPom_FyAd-3+XwSJmX95r1+sRPOJc9VQ@mail.gmail.com
Backpatch-through: 12
2019-07-29 09:58:49 +09:00
Thomas Munro
a2a777d011 Avoid macro clash with LLVM 9.
Early previews of LLVM 9 reveal that our Min() macro causes compiler
errors in LLVM headers reached by the #include directives in
llvmjit_inline.cpp.  Let's just undefine it.  Per buildfarm animal
seawasp.  Back-patch to 11.

Reviewed-by: Fabien Coelho, Tom Lane
Discussion: https://postgr.es/m/20190606173216.GA6306%40alvherre.pgsql
2019-07-29 10:23:55 +12:00
Michael Paquier
b7a82317b6 Fix typo in fd.c
The frontend version of walkdir() is defined in file_utils.c, and not
initdb.c.

Author: Sehrope Sarkuni
Discussion: https://postgr.es/m/CAH7T-artawnBt4=KODNCD8Mt2ZX4CCjJT8c=_=950xjutcRZ4Q@mail.gmail.com
2019-07-28 16:21:53 +09:00
Tom Lane
8ab66081ca Tweak our special-case logic for the IANA "Factory" timezone.
pg_timezone_names() tries to avoid showing the "Factory" zone in
the view, mainly because that has traditionally had a very long
"abbreviation" such as "Local time zone must be set--see zic manual page",
so that showing it messes up psql's formatting of the whole view.
Since tzdb version 2016g, IANA instead uses the abbreviation "-00",
which is sane enough that there's no reason to discriminate against it.

On the other hand, it emerges that FreeBSD and possibly other packagers
are so wedded to backwards compatibility that they hack the IANA data
to keep the old spelling --- and not just that old spelling, but even
older spellings that IANA used back in the stone age.  This caused the
filter logic to fail to suppress "Factory" at all on such platforms,
though the formatting problem is definitely real in that case.

To solve both problems, get rid of the hard-wired assumption about
exactly what Factory's abbreviation is, and instead reject abbreviations
exceeding 31 characters.  This will allow Factory to appear in the view
if and only if it's using the modern abbreviation.

In passing, simplify the code we add to zic.c to support "zic -P"
to remove its now-obsolete hacks to not print the Factory zone's
abbreviation.  Unlike pg_timezone_names(), there's no reason for
that code to support old/nonstandard timezone data.

Since we generally prefer to keep timezone-related behavior the
same in all branches, and since this is arguably a bug fix,
back-patch to all supported branches.

Discussion: https://postgr.es/m/3961.1564086915@sss.pgh.pa.us
2019-07-26 13:07:08 -04:00
Tom Lane
b9d2c5c7ac Fix loss of fractional digits for large values in cash_numeric().
Money values exceeding about 18 digits (depending on lc_monetary)
could be inaccurately converted to numeric, due to select_div_scale()
deciding it didn't need to compute any fractional digits.  Force
its hand by setting the dscale of one division input to equal the
number of fractional digits we need.

In passing, rearrange the logic to not do useless work in locales
where money values are considered integral.

Per bug #15925 from Slawomir Chodnicki.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/15925-da9953e2674bb5c8@postgresql.org
2019-07-26 11:59:00 -04:00
Andres Freund
af3deff3f2 Fix slot type handling for Agg nodes performing internal sorts.
Since 15d8f8312 we assert that - and since 7ef04e4d2c, 4da597edf1
rely on - the slot type for an expression's
ecxt_{outer,inner,scan}tuple not changing, unless explicitly flagged
as such. That allows us to either skip deforming (for a virtual tuple
slot) or optimize the code for JIT accelerated deforming
appropriately (for other known slot types).

This assumption was sometimes violated for grouping sets, when
nodeAgg.c internally uses tuplesorts, and the child node doesn't
return a TTSOpsMinimalTuple type slot. Detect that case, and flag that
the outer slot might not be "fixed".

It's probably worthwhile to optimize this further in the future, and
more granularly determine whether the slot is fixed. As we already
instantiate per-phase transition and equal expressions, we could
cheaply set the slot type appropriately for each phase.  But that's a
separate change from this bugfix.

This commit does include a very minor optimization by avoiding the
creation of a slot for handling tuplesorts, if no such sorts are
performed. Previously we created that slot unnecessarily in the common
case of computing all grouping sets via hashing. The code looked too
confusing without that, as the conditions for needing a sort slot and
flagging that the slot type isn't fixed, are the same.

Reported-By: Ashutosh Sharma
Author: Andres Freund
Discussion: https://postgr.es/m/CAE9k0PmNaMD2oHTEAhRyxnxpaDaYkuBYkLa1dpOpn=RS0iS2AQ@mail.gmail.com
Backpatch: 12-, where the bug was introduced in 15d8f8312
2019-07-25 14:28:55 -07:00
Tom Lane
b654714f9b Fix failures to ignore \r when reading Windows-style newlines.
libpq failed to ignore Windows-style newlines in connection service files.
This normally wasn't a problem on Windows itself, because fgets() would
convert \r\n to just \n.  But if libpq were running inside a program that
changes the default fopen mode to binary, it would see the \r's and think
they were data.  In any case, it's project policy to ignore \r in text
files unconditionally, because people sometimes try to use files with
DOS-style newlines on Unix machines, where the C library won't hide that
from us.

Hence, adjust parseServiceFile() to ignore \r as well as \n at the end of
the line.  In HEAD, go a little further and make it ignore all trailing
whitespace, to match what it's always done with leading whitespace.

In HEAD, also run around and fix up everyplace where we have
newline-chomping code to make all those places look consistent and
uniformly drop \r.  It is not clear whether any of those changes are
fixing live bugs.  Most of the non-cosmetic changes are in places that
are reading popen output, and the jury is still out as to whether popen
on Windows can return \r\n.  (The Windows-specific code in pipe_read_line
seems to think so, but our lack of support for this elsewhere suggests
maybe it's not a problem in practice.)  Hence, I desisted from applying
those changes to back branches, except in run_ssl_passphrase_command()
which is new enough and little-tested enough that we'd probably not have
heard about any problems there.

Tom Lane and Michael Paquier, per bug #15827 from Jorge Gustavo Rocha.
Back-patch the parseServiceFile() change to all supported branches,
and the run_ssl_passphrase_command() change to v11 where that was added.

Discussion: https://postgr.es/m/15827-e6ba53a3a7ed543c@postgresql.org
2019-07-25 12:11:17 -04:00
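The newline-chomping the commit describes amounts to a small loop; a sketch of the idiom, as generic C with a hypothetical helper name:

```c
#include <string.h>

/* Strip the trailing newline, accepting both Unix (\n) and Windows (\r\n)
 * line endings; in text files \r is ignored unconditionally. */
static void
chomp_line(char *line)
{
	size_t		len = strlen(line);

	while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
		line[--len] = '\0';
}
```
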
Andres Freund
ecbdd00934 Fix system column accesses in ON CONFLICT ... RETURNING.
After 277cb78983 ON CONFLICT ... SET ... RETURNING failed with
ERROR:  virtual tuple table slot does not have system attributes
when taking the update path, as the slot used to insert into the
table (and then process RETURNING) was defined to be a virtual slot in
that commit. Virtual slots don't support system columns except for
tableoid and ctid, as the other system columns are AM dependent.

Fix that by using a slot of the table's type. Add tests for system
column accesses in ON CONFLICT ...  RETURNING.

Reported-By: Roby, bisected to the relevant commit by Jeff Janes
Author: Andres Freund
Discussion: https://postgr.es/m/73436355-6432-49B1-92ED-1FE4F7E7E100@finefun.com.au
Backpatch: 12-, where the bug was introduced in 277cb78983
2019-07-24 18:45:58 -07:00
Heikki Linnakangas
6655a7299d Use full 64-bit XID for checking if a deleted GiST page is old enough.
Otherwise, after a deleted page gets even older, it becomes unrecyclable
again. B-tree has the same problem, and has had since time immemorial,
but let's at least fix this in GiST, where this is new.

Backpatch to v12, where GiST page deletion was introduced.

Reviewed-by: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/835A15A5-F1B4-4446-A711-BF48357EB602%40yandex-team.ru
2019-07-24 20:24:07 +03:00
Heikki Linnakangas
9eb5607e69 Refactor checks for deleted GiST pages.
The explicit check in gistScanPage() isn't currently really necessary, as
a deleted page is always empty, so the loop would fall through without
doing anything, anyway. But it's a marginal optimization, and it gives a
nice place to attach a comment to explain how it works.

Backpatch to v12, where GiST page deletion was introduced.

Reviewed-by: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/835A15A5-F1B4-4446-A711-BF48357EB602%40yandex-team.ru
2019-07-24 20:24:05 +03:00
Alvaro Herrera
5562272a42 Check that partitions are not in use when dropping constraints
If the user creates a deferred constraint in a partition, and in a
transaction they cause the constraint's trigger execution to be deferred
until commit time *and* drop the constraint, then when commit time comes
the queued trigger will fail to run because the trigger object will have
been dropped.

This is explained because when a constraint gets dropped in a
partitioned table, the recursion to drop the ones in partitions is done
by the dependency mechanism, not by ALTER TABLE traversing the recursion
tree as in all other cases.  In the non-partitioned case, this problem
is avoided by checking that the table is not "in use" by alter-table;
other alter-table subcommands that recurse to partitions do that check
for each partition.  But the dependency mechanism doesn't have a way to
do that.  Fix the problem by applying the same check to all partitions
during ALTER TABLE's "prep" phase, which correctly raises the necessary
error.

Reported-by: Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>
Discussion: https://postgr.es/m/CAKcux6nZiO9-eEpr1ZD84bT1mBoVmeZkfont8iSpcmYrjhGWgA@mail.gmail.com
2019-07-23 17:22:15 -04:00
Peter Eisentraut
06140c201b Add CREATE DATABASE LOCALE option
This sets both LC_COLLATE and LC_CTYPE with one option.  Similar
behavior is already supported in initdb, CREATE COLLATION, and
createdb.

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
Discussion: https://www.postgresql.org/message-id/flat/d9d5043a-dc70-da8a-0166-1e218e6e34d4%402ndquadrant.com
2019-07-23 14:47:24 +02:00
Tom Lane
a0555ddab9 Install dependencies to prevent dropping partition key columns.
The logic in ATExecDropColumn that rejects dropping partition key
columns is quite an inadequate defense, because it doesn't execute
in cases where a column needs to be dropped due to cascade from
something that only the column, not the whole partitioned table,
depends on.  That leaves us with a badly broken partitioned table;
even an attempt to load its relcache entry will fail.

We really need to have explicit pg_depend entries that show that the
column can't be dropped without dropping the whole table.  Hence,
add those entries.  In v12 and HEAD, bump catversion to ensure that
partitioned tables will have such entries.  We can't do that in
released branches of course, so in v10 and v11 this patch affords
protection only to partitioned tables created after the patch is
installed.  Given the lack of field complaints (this bug was found
by fuzz-testing not by end users), that's probably good enough.

In passing, fix ATExecDropColumn and ATPrepAlterColumnType
messages to be more specific about which partition key column
they're complaining about.

Per report from Manuel Rigger.  Back-patch to v10 where partitioned
tables were added.

Discussion: https://postgr.es/m/CA+u7OA4JKCPFrdrAbOs7XBiCyD61XJxeNav4LefkSmBLQ-Vobg@mail.gmail.com
Discussion: https://postgr.es/m/31920.1562526703@sss.pgh.pa.us
2019-07-22 14:55:40 -04:00
David Rowley
1e6a759838 Use appendBinaryStringInfo in more places where the length is known
When we already know the length that we're going to append, then it
makes sense to use appendBinaryStringInfo instead of
appendStringInfoString so that the append can be performed with a simple
memcpy() using a known length rather than having to first perform a
strlen() call to obtain the length.

Discussion: https://postgr.es/m/CAKJS1f8+FRAM1s5+mAa3isajeEoAaicJ=4e0WzrH3tAusbbiMQ@mail.gmail.com
2019-07-23 00:14:11 +12:00
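A sketch of the substitution described above, with illustrative buffer contents; when the length is already known, the binary variant appends with a single memcpy() and no strlen():

```c
#include "postgres.h"
#include "lib/stringinfo.h"

static void
append_known_length(StringInfo buf)
{
	/* length known in advance: no strlen() needed */
	appendBinaryStringInfo(buf, "SELECT ", 7);

	/* when the length is not known, appendStringInfoString() is fine */
	appendStringInfoString(buf, "relname");
}
```
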
Peter Eisentraut
19781729f7 Make identity sequence management more robust
Some code could get confused when certain catalog state involving both
identity and serial sequences was present, perhaps during an attempt
to upgrade the latter to the former.  Specifically, dropping the
default of a serial column maintains the ownership of the sequence by
the column, and so it would then be possible to afterwards make the
column an identity column that would now own two sequences.  This
causes the code that looks up the identity sequence to error out,
making the new identity column inoperable until the ownership of the
previous sequence is released.

To fix this, make the identity sequence lookup only consider sequences
with the appropriate dependency type for an identity sequence, so it
only ever finds one (unless something else is broken).  In the above
example, the old serial sequence would then be ignored.  Reorganize
the various owned-sequence-lookup functions a bit to make this
clearer.

Reported-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://www.postgresql.org/message-id/flat/470c54fc8590be4de0f41b0d295fd6390d5e8a6c.camel@cybertec.at
2019-07-22 12:07:10 +02:00
David Rowley
efdcca55a3 Make better use of the new List implementation in a couple of places
In nodeAppend.c and nodeMergeAppend.c there were some foreach loops which
looped over the list of subplans and only performed any work if the
subplan index was found in a Bitmapset.  With the old linked list
implementation of List, this form made sense as accessing the Nth list
element was O(N).  However, thanks to 1cff1b95a we now have array-based
lists, so accessing the Nth element has become O(1).

Here we make the most of the O(1) lookups and just loop over the set
members of the Bitmapset with bms_next_member().  This performs slightly
better when a small number of the list items are in the Bitmapset.  Micro
benchmarks show that when the Bitmapset contains all or most of the list
items then the new code is ever so slightly slower.  In practice, the cost
is so small that it's drowned out by various other things such as locking
the relations belonging to each subplan, etc.

The primary goal here is to leave around better code examples that
benefit from the new list implementation.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAKJS1f8ZcsLVgkF4wOfRyMYTcPgLFiUAOedFC+U2vK_aFZk-BA@mail.gmail.com
2019-07-22 19:03:12 +12:00
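A sketch of the loop shape the commit describes, assuming hypothetical variable names standing in for the executor-state fields; each set bit in the Bitmapset selects a subplan, fetched with an O(1) list_nth() lookup:

```c
#include "postgres.h"
#include "nodes/bitmapset.h"
#include "nodes/pg_list.h"
#include "nodes/plannodes.h"

/* Iterate over the valid subplan indexes only, instead of walking the
 * whole subplan list and testing membership for each element. */
static void
init_valid_subplans(List *subplans, Bitmapset *validsubplans)
{
	int			i = -1;

	while ((i = bms_next_member(validsubplans, i)) >= 0)
	{
		Plan	   *subplan = (Plan *) list_nth(subplans, i);

		/* ... initialize the subplan ... */
		(void) subplan;
	}
}
```
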
Michael Paquier
23bccc823d Fix inconsistencies and typos in the tree
This is numbered take 7, and addresses a set of issues with code
comments, variable names and unreferenced variables.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/dff75442-2468-f74f-568c-6006e141062f@gmail.com
2019-07-22 10:01:50 +09:00
David Rowley
e1a0f6a983 Adjust overly strict Assert
3373c7155 changed how we determine EquivalenceClasses for relations and
added an Assert to ensure that all relations mentioned in each EC's
ec_relids were RELOPT_BASEREL rels.  However, the join removal code may
remove a LEFT JOIN, and since it does not clean up EC members belonging
to the removed relations, it can leave RELOPT_DEADREL rels in ec_relids.

Fix this by adjusting the Assert to allow RELOPT_DEADREL rels too.

Reported-by: sqlsmith via Andreas Seltenreich
Discussion: https://postgr.es/m/87y30r8sls.fsf@ansel.ydns.eu
2019-07-22 10:29:41 +12:00
Tom Lane
330cafdfaa Remove no-longer-helpful reliance on fixed-size local array.
Coverity complained about this code, apparently because it uses a local
array of size FUNC_MAX_ARGS without a guard that the input argument list
is no longer than that.  (Not sure why it complained today, since this
code's been the same for a long time; possibly it re-analyzed everything
the List API change touched?)

Rather than add a guard, though, let's just get rid of the local array
altogether.  It was only there to avoid list_nth() calls, and those are
no longer expensive.
2019-07-21 11:42:11 -04:00
David Rowley
3373c71553 Speed up finding EquivalenceClasses for a given set of rels
Previously in order to determine which ECs a relation had members in, we
had to loop over all ECs stored in PlannerInfo's eq_classes and check if
ec_relids mentioned the relation.  For the most part, this was fine, as
generally, unless queries were fairly complex, the overhead of performing
the lookup would not have been that significant.  However, when queries
contained large numbers of joins and ECs, the overhead to find the set of
classes matching a given set of relations could become a significant
portion of the overall planning effort.

Here we allow a much more efficient method to access the ECs which match a
given relation or set of relations.  A new Bitmapset field in RelOptInfo
now exists to store the indexes into PlannerInfo's eq_classes list in
which each relation is mentioned.  This allows very fast lookups to find all
ECs belonging to a single relation.  When we need to lookup ECs belonging
to a given pair of relations, we can simply bitwise-AND the Bitmapsets from
each relation and use the result to perform the lookup.

We also take the opportunity to write a new implementation of
generate_join_implied_equalities which makes use of the new indexes.
generate_join_implied_equalities_for_ecs must remain as is as it can be
given a custom list of ECs, which we can't easily determine the indexes of.

This was originally intended to fix the performance penalty of looking up
foreign keys matching a join condition which was introduced by 100340e2d.
However, we're speeding up much more than just that here.

Author: David Rowley, Tom Lane
Reviewed-by: Tom Lane, Tomas Vondra
Discussion: https://postgr.es/m/6970.1545327857@sss.pgh.pa.us
2019-07-21 17:30:58 +12:00
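A sketch of the two-relation lookup the commit enables; the RelOptInfo field name (eclass_indexes here) and the helper name are assumptions for illustration, but the bitwise-AND-then-iterate shape is what the commit describes:

```c
#include "postgres.h"
#include "nodes/bitmapset.h"
#include "nodes/pathnodes.h"
#include "nodes/pg_list.h"

/* The ECs mentioning both sides of a join are the intersection of the two
 * per-relation index sets; each set bit indexes root->eq_classes. */
static void
walk_join_ecs(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
{
	Bitmapset  *matching = bms_intersect(rel1->eclass_indexes,
										 rel2->eclass_indexes);
	int			i = -1;

	while ((i = bms_next_member(matching, i)) >= 0)
	{
		EquivalenceClass *ec = (EquivalenceClass *) list_nth(root->eq_classes, i);

		/* ... generate implied equalities from ec ... */
		(void) ec;
	}
}
```
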
Tomas Vondra
a63378a03e Use column collation for extended statistics
The current extended statistics code was a bit confused about which collation
to use.  When building the statistics, the collations defined as default
for the data types were used (since commit 5e0928005).  The MCV code was
however using the column collations for MCV serialization, and then
DEFAULT_COLLATION_OID when computing estimates. So overall the code was
using all three possible options, inconsistently.

This uses the column collation everywhere, which makes it consistent with
what 5e0928005 did for regular stats.  We do not, however, track the
collations in a catalog, because we can derive them from column-level
information.  This may need to change in the future, e.g. after allowing
statistics on expressions.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12
2019-07-20 16:37:37 +02:00
Tomas Vondra
e38a55ba46 Rework examine_opclause_expression to use varonleft
The examine_opclause_expression function needs to return information on
which side of the operator we found the Var, but the variable was called
"isgt" which is rather misleading (it assumes the operator is either
less-than or greater-than, but it may be equality or something else).
Other places in the planner use a variable called "varonleft" for this
purpose, so just adopt the same convention here.

The code also assumed we don't care about this flag for equality, as
(Var = Const) and (Const = Var) should be the same thing. But that does
not work for cross-type operators, in which case we need to pass the
parameters to the procedure in the right order. So just use the same
code for all types of expressions.

This means we don't need to care about the selectivity estimation
function anymore, at least not in this code. We should only get the
supported cases here (thanks to statext_is_compatible_clause).

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12
2019-07-20 16:37:30 +02:00
Jeff Davis
b538c90b1b Fix error in commit e6feef57.
I was careless passing a datum directly to DATE_NOT_FINITE without
calling DatumGetDateADT() first.

Backpatch-through: 9.4
2019-07-18 17:04:50 -07:00
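A sketch of the corrected call pattern; the wrapper function is hypothetical, but it shows the point of the fix: convert the Datum with DatumGetDateADT() before testing it with DATE_NOT_FINITE():

```c
#include "postgres.h"
#include "utils/date.h"

/* The raw Datum must not be passed to DATE_NOT_FINITE directly. */
static bool
date_bound_is_finite(Datum bound_datum)
{
	DateADT		d = DatumGetDateADT(bound_datum);

	return !DATE_NOT_FINITE(d);
}
```
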
Michael Paquier
70a33b2109 Fix typo in mvdistinct.c
Noticed while browsing the code.
2019-07-19 08:50:14 +09:00
Jeff Davis
e6feef571a Fix daterange canonicalization for +/- infinity.
The values 'infinity' and '-infinity' are a part of the DATE type
itself, so a bound of the date 'infinity' is not the same as an
unbounded/infinite range. However, it is still wrong to try to
canonicalize such values, because adding or subtracting one has no
effect. Fix by treating 'infinity' and '-infinity' the same as
unbounded ranges for the purposes of canonicalization (but not other
purposes).

Backpatch to all versions because it is inconsistent with the
documented behavior. Note that this could be an incompatibility for
applications relying on the behavior contrary to the documentation.

Author: Laurenz Albe
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/77f24ea19ab802bc9bc60ddbb8977ee2d646aec1.camel%40cybertec.at
Backpatch-through: 9.4
2019-07-18 13:41:10 -07:00
Peter Geoghegan
d004147eb3 Fix nbtree metapage cache upgrade bug.
Commit 857f9c36cd, which taught nbtree VACUUM to avoid unnecessary
index scans, bumped the nbtree version number from 2 to 3, while adding
the ability for nbtree indexes to be upgraded on-the-fly.  Various
assertions that assumed that an nbtree index was always on version 2 had
to be changed to accept any supported version (version 2 or 3 on
Postgres 11).

However, a few assertions were missed in the initial commit, all of
which were in code paths that cache a local copy of the metapage
metadata, where the index had been expected to be on the current version
(no longer version 2) as a generic sanity check.  Rather than simply
update the assertions, follow-up commit 0a64b45152 intentionally made
the metapage caching code update the per-backend cached metadata version
without changing the on-disk version at the same time.  This could even
happen when the planner needed to determine the height of a B-Tree for
costing purposes.  The assertions only fail on Postgres v12 when
upgrading from v10, because they were adjusted to use the authoritative
shared memory metapage by v12's commit dd299df8.

To fix, remove the cache-only upgrade mechanism entirely, and update the
assertions themselves to accept any supported version (go back to using
the cached version in v12).  The fix is almost a full revert of commit
0a64b45152 on the v11 branch.

VACUUM only considers the authoritative metapage, and never bothers with
a locally cached version, whereas everywhere else isn't interested in
the metapage fields that were added by commit 857f9c36cd.  It seems
unlikely that this bug has affected any user on v11.

Reported-By: Christoph Berg
Bug: #15896
Discussion: https://postgr.es/m/15896-5b25e260fdb0b081%40postgresql.org
Backpatch: 11-, where VACUUM was taught to avoid unnecessary index scans.
2019-07-18 13:22:56 -07:00
Tom Lane
bc8393cf27 Further adjust SPITupleTable to provide a public row-count field.
Now that commit fec0778c8 drew a clear line between public and private
fields in SPITupleTable, it seems pretty silly that the count of valid
tuples isn't on the public side of that line.  The reason why not was
that there wasn't such a count.  For reasons lost in the mists of time,
spi.c preferred to keep a count of remaining free entries in the array.
But that seems pretty pointless: it's unlike the way we handle similar
code everywhere else, and it involves extra subtractions that surely
outweigh having to do a comparison rather than test-for-zero to check
for array-full.

Hence, rearrange so that this code does the expansible array logic
the same as everywhere else, with a count of valid entries alongside
the allocated array length.  And document the count as public.

I looked for core-code callers where it would make sense to start
relying on tuptable->numvals rather than the separate SPI_processed
variable.  Right now there don't seem to be places where it'd be
a win to do so without more code restructuring than I care to
undertake today.  In principle, though, having SPITupleTables be
fully self-contained should be helpful down the line.

Discussion: https://postgr.es/m/16852.1563395722@sss.pgh.pa.us
2019-07-18 10:37:13 -04:00
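A sketch of how an SPI caller can now use the public row count directly, assuming SPI_connect() has already been called; the query text is illustrative only:

```c
#include "postgres.h"
#include "executor/spi.h"

/* Loop over the result rows using the tuple table's own numvals field,
 * rather than the separate SPI_processed counter. */
static void
scan_result(void)
{
	if (SPI_execute("SELECT relname FROM pg_class", true, 0) != SPI_OK_SELECT)
		elog(ERROR, "SPI_execute failed");

	for (uint64 i = 0; i < SPI_tuptable->numvals; i++)
	{
		HeapTuple	tup = SPI_tuptable->vals[i];

		/* ... inspect tup using SPI_tuptable->tupdesc ... */
		(void) tup;
	}
}
```
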
Tomas Vondra
7d24f6a490 Simplify bitmap updates in multivariate MCV code
When evaluating clauses on a multivariate MCV list, we build a bitmap
tracking how the clauses match each item of the MCV list.  When updating
the bitmap we need to consider the current value (tracking how the item
matches preceding clauses), match for the current clause and whether the
clauses are connected by AND or OR.

Until now the logic was copied in every place that updates the bitmap, which
was not quite readable.  So just move it to a separate function and call
it where needed.

Backpatch to 12, where the code was introduced. While not a bugfix, this
should make maintenance and future backpatches easier.

Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
2019-07-18 11:29:38 +02:00
Tomas Vondra
e4deae7396 Fix handling of NULLs in MCV items and constants
There were two issues in how the extended statistics handled NULL values
in opclauses. Firstly, the code was oblivious to the possibility that
Const may be NULL (constisnull=true) in which case the constvalue is
undefined. We need to treat this as a mismatch, and not call the proc.

Secondly, the MCV item itself may contain NULL values too - the code
already did check that, and updated the match bitmap accordingly, but
failed to ensure we won't call the operator procedure anyway. It did
work for AND-clauses, because in that case false in the bitmap stops
evaluation of further clauses. But for OR-clauses it was not easy to
get incorrect estimates or even trigger a crash.

This fixes both issues by extending the existing check so that it looks
at constisnull too, and making sure it skips calling the procedure.

Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
2019-07-18 11:29:38 +02:00
Tomas Vondra
e8b6ae2130 Fix handling of opclauses in extended statistics
We expect opclauses to have exactly one Var and one Const, but the code
was checking the Const by calling is_pseudo_constant_clause() which is
incorrect - we need a proper constant.

Fixed by using plain IsA(x,Const) to check the type of the node. We need to
do these checks in two places, so move it into a separate function that
can be called in both places.

Reported by Andreas Seltenreich, based on crash reported by sqlsmith.

Backpatch to v12, where this code was introduced.

Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12
2019-07-18 11:29:38 +02:00
Tomas Vondra
a4303a078c Remove unnecessary TYPECACHE_GT_OPR lookup
The TYPECACHE_GT_OPR lookup is not needed (it used to be, in an older
version of the MCV code), but the compiler failed to detect this because
the result was used in a fmgr_info() call, populating a FmgrInfo entry.

Backpatch to v12, where this code was introduced.

Discussion: https://postgr.es/m/8736jdhbhc.fsf%40ansel.ydns.eu
Backpatch-to: 12
2019-07-18 11:29:38 +02:00
Tom Lane
d97b714a21 Avoid using lcons and list_delete_first where it's easy to do so.
Formerly, lcons was about the same speed as lappend, but with the new
List implementation, that's not so; with a long List, data movement
imposes an O(N) cost on lcons and list_delete_first, but not lappend.

Hence, invent list_delete_last with semantics parallel to
list_delete_first (but O(1) cost), and change various places to use
lappend and list_delete_last where this can be done without much
violence to the code logic.

There are quite a few places that construct result lists using lcons not
lappend.  Some have semantic rationales for that; I added comments about
it to a couple that didn't have them already.  In many such places though,
I think the coding is that way only because back in the dark ages lcons
was faster than lappend.  Hence, switch to lappend where this can be done
without causing semantic changes.

In ExecInitExprRec(), this results in aggregates and window functions that
are in the same plan node being executed in a different order than before.
Generally, the executions of such functions ought to be independent of
each other, so this shouldn't result in visibly different query results.
But if you push it, as one regression test case does, you can show that
the order is different.  The new order seems saner; it's closer to
the order of the functions in the query text.  And we never documented
or promised anything about this, anyway.

Also, in gistfinishsplit(), don't bother building a reverse-order list;
it's easy now to iterate backwards through the original list.

It'd be possible to go further towards removing uses of lcons and
list_delete_first, but it'd require more extensive logic changes,
and I'm not convinced it's worth it.  Most of the remaining uses
deal with queues that probably never get long enough to be worth
sweating over.  (Actually, I doubt that any of the changes in this
patch will have measurable performance effects either.  But better
to have good examples than bad ones in the code base.)

Patch by me, thanks to David Rowley and Daniel Gustafsson for review.

Discussion: https://postgr.es/m/21272.1563318411@sss.pgh.pa.us
2019-07-17 11:15:34 -04:00
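A sketch of the push/pop-at-the-tail idiom the commit recommends for stack-like Lists; the helper names are hypothetical:

```c
#include "postgres.h"
#include "nodes/pg_list.h"

/* Push and pop at the end of the List: both are cheap with the array-based
 * implementation, whereas lcons()/list_delete_first() must shift every
 * remaining element. */
static List *
stack_push(List *stack, void *item)
{
	return lappend(stack, item);
}

static void *
stack_pop(List **stack)
{
	void	   *top = llast(*stack);

	*stack = list_delete_last(*stack);
	return top;
}
```
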
Thomas Munro
dfd0121dc7 Move some md.c-specific logic from smgr.c to md.c.
Potential future SMGR implementations may not want to create
tablespace directories when creating an SMGR relation.  Move that
logic to mdcreate().  Move the initialization of md-specific
data structures from smgropen() to a new callback mdopen().

Author: Thomas Munro
Reviewed-by: Shawn Debnath (as part of an earlier patch set)
Discussion: https://postgr.es/m/CA%2BhUKG%2BOZqOiOuDm5tC5DyQZtJ3FH4%2BFSVMqtdC4P1atpJ%2Bqhg%40mail.gmail.com
2019-07-17 15:00:22 +12:00
Tom Lane
3093eb2b83 Fix thinko in construction of old_conpfeqop list.
This should lappend the OIDs, not lcons them; the existing code produced
a list in reversed order.  This is harmless for single-key FKs or FKs
where all the key columns are of the same type, which probably explains
how it went unnoticed.  But if those conditions are not met,
ATAddForeignKeyConstraint would make the wrong decision about whether an
existing FK needs to be revalidated.  I think it would almost always err
in the safe direction by revalidating a constraint that didn't need it.
You could imagine scenarios where the pfeqop check was fooled by
swapping the types of two FK columns in one ALTER TABLE, but that case
would probably be rejected by other tests, so it might be impossible to
get to the worst-case scenario where an FK should be revalidated and
isn't.  (And even then, it's likely to be fine, unless there are weird
inconsistencies in the equality behavior of the replacement types.)
However, this is a performance bug at least.

Noted while poking around to see whether lcons calls could be converted
to lappend.

This bug is old, dating to commit cb3a7c2b9, so back-patch to all
supported branches.
2019-07-16 18:17:47 -04:00
Tom Lane
c245776906 Remove lappend_cell...() family of List functions.
It seems worth getting rid of these functions because they require the
caller to retain a ListCell pointer into a List that it's modifying,
which is a dangerous practice with the new List implementation.
(The only other List-modifying function that takes a ListCell pointer
as input is list_delete_cell, which nowadays is preferentially used
via the constrained API foreach_delete_current.)

There was only one remaining caller of these functions after commit
2f5b8eb5a, and that was some fairly ugly GEQO code that can be much
more clearly expressed using a list-index variable and list_insert_nth.
Hence, rewrite that code, and remove the functions.

Discussion: https://postgr.es/m/26193.1563228600@sss.pgh.pa.us
2019-07-16 13:12:24 -04:00
Tom Lane
2f5b8eb5a2 Clean up some ad-hoc code for sorting and de-duplicating Lists.
heap.c and relcache.c contained nearly identical copies of logic
to insert OIDs into an OID list while preserving the list's OID
ordering (and rejecting duplicates, in one case but not the other).

The comments argue that this is faster than qsort for small numbers
of OIDs, which is at best unproven, and seems even less likely to be
true now that lappend_cell_oid has to move data around.  In any case
it's ugly and hard-to-follow code, and if we do have a lot of OIDs
to consider, it's O(N^2).

Hence, replace with simply lappend'ing OIDs to a List, then list_sort
the completed List, then remove adjacent duplicates if necessary.
This is demonstrably O(N log N) and it's much simpler for the
callers.  It's possible that this would be somewhat inefficient
if there were a very large number of duplicates, but that seems
unlikely in the existing usage.

This adds list_deduplicate_oid and list_oid_cmp infrastructure
to list.c.  I didn't bother with equivalent functionality for
integer or pointer Lists, but such could always be added later
if we find a use for it.

Discussion: https://postgr.es/m/26193.1563228600@sss.pgh.pa.us
2019-07-16 12:04:06 -04:00
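A sketch of the simplified pattern, using the list_sort/list_oid_cmp/list_deduplicate_oid helpers this commit mentions; the wrapper function is hypothetical:

```c
#include "postgres.h"
#include "nodes/pg_list.h"

/* Collect OIDs in whatever order they arrive, then sort once and strip
 * adjacent duplicates: O(N log N) overall. */
static List *
finish_oid_list(List *oids)
{
	list_sort(oids, list_oid_cmp);
	list_deduplicate_oid(oids);
	return oids;
}
```
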
Tom Lane
569ed7f483 Redesign the API for list sorting (list_qsort becomes list_sort).
In the wake of commit 1cff1b95a, the obvious way to sort a List
is to apply qsort() directly to the array of ListCells.  list_qsort
was building an intermediate array of pointers-to-ListCells, which
we no longer need, but getting rid of it forces an API change:
the comparator functions need to do one less level of indirection.

Since we're having to touch the callers anyway, let's do two additional
changes: sort the given list in-place rather than making a copy (as
none of the existing callers have any use for the copying behavior),
and rename list_qsort to list_sort.  It was argued that the old name
exposes more about the implementation than it should, which I find
pretty questionable, but a better reason to rename it is to be sure
we get the attention of any external callers about the need to fix
their comparator functions.

While we're at it, change four existing callers of qsort() to use
list_sort instead; previously, they all had local reinventions
of list_qsort, ie build-an-array-from-a-List-and-qsort-it.
(There are some other places where changing to list_sort perhaps
would be worthwhile, but they're less obviously wins.)

Discussion: https://postgr.es/m/29361.1563220190@sss.pgh.pa.us
2019-07-16 11:51:44 -04:00
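A sketch of the new comparator shape: the callback receives the ListCells directly, with no extra level of indirection, and list_sort() sorts the given list in place:

```c
#include "postgres.h"
#include "nodes/pg_list.h"

/* Comparator for an integer List under the new API. */
static int
int_cell_cmp(const ListCell *a, const ListCell *b)
{
	int			va = lfirst_int(a);
	int			vb = lfirst_int(b);

	return (va > vb) - (va < vb);
}

static void
sort_int_list(List *ints)
{
	list_sort(ints, int_cell_cmp);	/* in place, no copy made */
}
```
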
Michael Paquier
0896ae561b Fix inconsistencies and typos in the tree
This is numbered take 7, and addresses a set of issues around:
- Fixes for typos and incorrect reference names.
- Removal of unneeded comments.
- Removal of unreferenced functions and structures.
- Fixes regarding variable name consistency.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/10bfd4ac-3e7c-40ab-2b2e-355ed15495e8@gmail.com
2019-07-16 13:23:53 +09:00
Tom Lane
4c3d05d875 Remove dead code.
These memory context switches are useless in the wake of commit
1cff1b95a.  Noted by Jesper Pedersen.

Discussion: https://postgr.es/m/f078ce63-9e04-0f3e-d200-d7ee66279abe@redhat.com
2019-07-15 23:27:13 -04:00
Peter Geoghegan
bfdbac2ab3 Correct nbtsplitloc.c comment.
The logic just added by commit e3899ffd falls back on a 50:50 page split
in the event of a new item that's just to the right of our provisional
"many duplicates" split point.  Fix a comment that incorrectly claimed
that the new item had to be just to the left of our provisional split
point.

Backpatch: 12-, just like commit e3899ffd.
2019-07-15 14:35:06 -07:00
Peter Geoghegan
e3899ffd8b Fix pathological nbtree split point choice issue.
Specific ever-decreasing insertion patterns could cause successive
unbalanced nbtree page splits.  Problem cases involve a large group of
duplicates to the left, and ever-decreasing insertions to the right.

To fix, detect the situation by considering the newitem offset before
performing a split using nbtsplitloc.c's "many duplicates" strategy.  If
the new item was inserted just to the right of our provisional "many
duplicates" split point, infer ever-decreasing insertions and fall back
on a 50:50 (space delta optimal) split.  This seems to barely affect
cases that already had acceptable space utilization.

An alternative fix also seems possible.  Instead of changing
nbtsplitloc.c split choice logic, we could instead teach _bt_truncate()
to generate a new value for new high keys by interpolating from the
lastleft and firstright key values.  That would certainly be a more
elegant fix, but it isn't suitable for backpatching.

Discussion: https://postgr.es/m/CAH2-WznCNvhZpxa__GqAa1fgQ9uYdVc=_apArkW2nc-K3O7_NA@mail.gmail.com
Backpatch: 12-, where the nbtree page split enhancements were introduced.
2019-07-15 13:19:13 -07:00
Tom Lane
1cff1b95ab Represent Lists as expansible arrays, not chains of cons-cells.
Originally, Postgres Lists were a more or less exact reimplementation of
Lisp lists, which consist of chains of separately-allocated cons cells,
each having a value and a next-cell link.  We'd hacked that once before
(commit d0b4399d8) to add a separate List header, but the data was still
in cons cells.  That makes some operations -- notably list_nth() -- O(N),
and it's bulky because of the next-cell pointers and per-cell palloc
overhead, and it's very cache-unfriendly if the cons cells end up
scattered around rather than being adjacent.

In this rewrite, we still have List headers, but the data is in a
resizable array of values, with no next-cell links.  Now we need at
most two palloc's per List, and often only one, since we can allocate
some values in the same palloc call as the List header.  (Of course,
extending an existing List may require repalloc's to enlarge the array.
But this involves just O(log N) allocations not O(N).)

Of course this is not without downsides.  The key difficulty is that
addition or deletion of a list entry may now cause other entries to
move, which it did not before.

For example, that breaks foreach() and sister macros, which historically
used a pointer to the current cons-cell as loop state.  We can repair
those macros transparently by making their actual loop state be an
integer list index; the exposed "ListCell *" pointer is no longer state
carried across loop iterations, but is just a derived value.  (In
practice, modern compilers can optimize things back to having just one
loop state value, at least for simple cases with inline loop bodies.)
In principle, this is a semantics change for cases where the loop body
inserts or deletes list entries ahead of the current loop index; but
I found no such cases in the Postgres code.

The change is not at all transparent for code that doesn't use foreach()
but chases lists "by hand" using lnext().  The largest share of such
code in the backend is in loops that were maintaining "prev" and "next"
variables in addition to the current-cell pointer, in order to delete
list cells efficiently using list_delete_cell().  However, we no longer
need a previous-cell pointer to delete a list cell efficiently.  Keeping
a next-cell pointer doesn't work, as explained above, but we can improve
matters by changing such code to use a regular foreach() loop and then
using the new macro foreach_delete_current() to delete the current cell.
(This macro knows how to update the associated foreach loop's state so
that no cells will be missed in the traversal.)

There remains a nontrivial risk of code assuming that a ListCell *
pointer will remain good over an operation that could now move the list
contents.  To help catch such errors, list.c can be compiled with a new
define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents
whenever that could possibly happen.  This makes list operations
significantly more expensive so it's not normally turned on (though it
is on by default if USE_VALGRIND is on).

There are two notable API differences from the previous code:

* lnext() now requires the List's header pointer in addition to the
current cell's address.

* list_delete_cell() no longer requires a previous-cell argument.

These changes are somewhat unfortunate, but on the other hand code using
either function needs inspection to see if it is assuming anything
it shouldn't, so it's not all bad.

Programmers should be aware of these significant performance changes:

* list_nth() and related functions are now O(1); so there's no
major access-speed difference between a list and an array.

* Inserting or deleting a list element now takes time proportional to
the distance to the end of the list, due to moving the array elements.
(However, it typically *doesn't* require palloc or pfree, so except in
long lists it's probably still faster than before.)  Notably, lcons()
used to be about the same cost as lappend(), but that's no longer true
if the list is long.  Code that uses lcons() and list_delete_first()
to maintain a stack might usefully be rewritten to push and pop at the
end of the list rather than the beginning.

* There are now list_insert_nth...() and list_delete_nth...() functions
that add or remove a list cell identified by index.  These have the
data-movement penalty explained above, but there's no search penalty.

* list_concat() and variants now copy the second list's data into
storage belonging to the first list, so there is no longer any
sharing of cells between the input lists.  The second argument is
now declared "const List *" to reflect that it isn't changed.

This patch just does the minimum needed to get the new implementation
in place and fix bugs exposed by the regression tests.  As suggested
by the foregoing, there's a fair amount of followup work remaining to
do.

Also, the ENABLE_LIST_COMPAT macros are finally removed in this
commit.  Code using those should have been gone a dozen years ago.

Patch by me; thanks to David Rowley, Jesper Pedersen, and others
for review.

Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
2019-07-15 13:41:58 -04:00
Thomas Munro
67b9b3ca32 Provide XLogRecGetFullXid().
In order to be able to work with FullTransactionId values during replay
without increasing the size of the WAL, infer the epoch.  In general we
can't do that safely, but during replay we can because we know that
nextFullXid can't advance concurrently.

Prevent frontend code from seeing this new function, due to the above
restriction.  Perhaps in future it will be possible to extract the value
entirely from independent WAL records, and then this restriction can be
lifted.
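
A hedged sketch of the epoch inference, assuming the record's xid cannot be
newer than nextFullXid during replay (everything except the transam.h macros
is illustrative):

    #include "postgres.h"
    #include "access/transam.h"

    /* Reconstruct a FullTransactionId for an xid read from a WAL record. */
    static FullTransactionId
    infer_full_xid(TransactionId xid, FullTransactionId next_fxid)
    {
        uint32      epoch = EpochFromFullTransactionId(next_fxid);

        /* An xid numerically above nextXid must belong to the previous epoch. */
        if (xid > XidFromFullTransactionId(next_fxid))
            epoch--;

        return FullTransactionIdFromEpochAndXid(epoch, xid);
    }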

Author: Thomas Munro, based on earlier code from Andres Freund
Discussion: https://postgr.es/m/CA%2BhUKG%2BmLmuDjMi6o1dxkKvGRL56Y2Rz%2BiXAcrZV03G9ZuFQ8Q%40mail.gmail.com
2019-07-15 17:04:29 +12:00
Peter Eisentraut
5925e55498 Add gen_random_uuid function
This adds a built-in function to generate UUIDs.

PostgreSQL hasn't had a built-in function to generate a UUID yet,
relying on external modules such as uuid-ossp and pgcrypto to provide
one.  Now that we have a strong random number generator built-in, we
can easily provide a version 4 (random) UUID generation function.

This patch takes the existing function gen_random_uuid() from pgcrypto
and makes it a built-in function.  The pgcrypto implementation now
internally redirects to the built-in one.

Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
Discussion: https://www.postgresql.org/message-id/6a65610c-46fc-2323-6b78-e8086340a325@2ndquadrant.com
2019-07-14 14:30:27 +02:00
Alexander Korotkov
c085e1c1cb Add support for <-> (box, point) operator to GiST box_ops
Index-based calculation of this operator is exact.  So, the signature of
the gist_bbox_distance() function is changed so that the caller is
responsible for setting the *recheck flag.

Discussion: https://postgr.es/m/f71ba19d-d989-63b6-f04a-abf02ad9345d%40postgrespro.ru
Author: Nikita Glukhov
Reviewed-by: Tom Lane, Alexander Korotkov
2019-07-14 15:09:15 +03:00
Alexander Korotkov
6254c55f81 Add missing commutators for distance operators
Some of the <-> operators between geometric types were missing their
commutators.  This commit adds them.  The motivation is upcoming kNN
support for some of those operators.

Discussion: https://postgr.es/m/f71ba19d-d989-63b6-f04a-abf02ad9345d%40postgrespro.ru
Author: Nikita Glukhov
Reviewed-by: Tom Lane, Alexander Korotkov
2019-07-14 14:55:01 +03:00
Thomas Munro
b91dd9de5e Forward received condition variable signals on cancel.
After a process decides not to wait for a condition variable, it can
still consume a signal before it reaches ConditionVariableCancelSleep().
In that case, pass the signal on to another waiter if possible, so that
a signal doesn't go missing when there is another process ready to
receive it.

Author: Thomas Munro
Reviewed-by: Shawn Debnath
Discussion: https://postgr.es/m/CA%2BhUKGLQ_RW%2BXs8znDn36e-%2Bmq2--zrPemBqTQ8eKT-VO1OF4Q%40mail.gmail.com
2019-07-13 14:50:18 +12:00
Thomas Munro
1321509fa4 Introduce timed waits for condition variables.
Provide ConditionVariableTimedSleep(), like ConditionVariableSleep()
but with a timeout argument.
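
A hedged usage sketch (the shared structure, the predicate, and the wait
event are placeholders):

    /* Wait up to 10 seconds for the condition to become true, then give up. */
    ConditionVariablePrepareToSleep(&shared->cv);
    while (!condition_is_satisfied(shared))
    {
        if (ConditionVariableTimedSleep(&shared->cv, 10000 /* msec */,
                                        WAIT_EVENT_MQ_INTERNAL))
            break;              /* timed out */
    }
    ConditionVariableCancelSleep();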

Author: Shawn Debnath
Reviewed-by: Kyotaro Horiguchi, Thomas Munro
Discussion: https://postgr.es/m/eeb06007ccfe46e399df6af18bfcd15a@EX13D05UWC002.ant.amazon.com
2019-07-13 13:51:05 +12:00
Thomas Munro
b31fbe852c Warn if wal_level is too low when creating a publication.
Provide a hint to users that they need to increase wal_level before
subscriptions can work.
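
A hedged sketch of the kind of check and hint involved (exact wording and
error code are illustrative):

    if (wal_level < WAL_LEVEL_LOGICAL)
        ereport(WARNING,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("wal_level is insufficient to publish logical changes"),
                 errhint("Set wal_level to logical before creating subscriptions.")));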

Author: Lucas Viecelli, with some adjustments by Thomas Munro
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAPjy-57rn5Y9g4e5u--eSOP-7P4QrE9uOZmT2ZcUebF8qxsYhg%40mail.gmail.com
2019-07-13 10:35:34 +12:00
Tom Lane
d3751adcf1 Fix get_actual_variable_range() to cope with broken HOT chains.
Commit 3ca930fc3 modified get_actual_variable_range() to use a new
"SnapshotNonVacuumable" snapshot type for selecting tuples that it
would consider valid.  However, because that snapshot type can accept
recently-dead tuples, this caused a bug when using a recently-created
index: we might accept a recently-dead tuple that is an early member
of a broken HOT chain and does not actually match the index entry.
Then, the data extracted from the heap tuple would not necessarily be
an endpoint value of the column; it could even be NULL, leading to
get_actual_variable_range() itself reporting "found unexpected null
value in index".  Even without an error, this could lead to poor
plan choices due to an erroneous notion of the endpoint value.

We can improve matters by changing the code to use the index-only
scan technique (which didn't exist when get_actual_variable_range was
originally written).  If any of the tuples in a HOT chain are live
enough to satisfy SnapshotNonVacuumable, we take the data from the
index entry, ignoring what is in the heap.  This fixes the problem
without changing the live-vs-dead-tuple behavior from what was
intended by commit 3ca930fc3.

A side benefit is that for static tables we might not have to touch
the heap at all (when the extremal value is in an all-visible page).
In addition, we can save some overhead by not having to create a
complete ExecutorState, and we don't need to run FormIndexDatum,
avoiding more cycles as well as the possibility of failure for
indexes on expressions.  (I'm not sure that this code would ever
be used to determine the extreme value of an expression, in the
current state of the planner; but it's definitely possible that
lower-order columns of the selected index could be expressions.
So one could construct perhaps-artificial examples in which the
old code unexpectedly failed due to trying to compute an
expression's value for a now-dead row.)

Per report from Manuel Rigger.  Back-patch to v11 where commit
3ca930fc3 came in.

Discussion: https://postgr.es/m/CA+u7OA7W4NWEhCvftdV6_8bbm2vgypi5nuxfnSEJQqVKFSUoMg@mail.gmail.com
2019-07-12 16:24:59 -04:00
David Rowley
cfde234939 Fix RANGE partition pruning with multiple boolean partition keys
match_clause_to_partition_key incorrectly would return
PARTCLAUSE_UNSUPPORTED if a bool qual could not be matched to the current
partition key.  This was a problem, as it caused the calling function to
discard the qual and not try to match it to any other partition key.  If
there was another partition key which did match this qual, then the qual
would not be checked again and we could fail to prune some partitions.

The worst this could do was to cause partitions not to be pruned when they
could have been, so there was no danger of incorrect query results here.

Fix this by changing match_boolean_partition_clause to have it return a
PartClauseMatchStatus rather than a boolean value.  This allows it to
communicate if the qual is unsupported or if it just does not match this
particular partition key; previously these two cases were treated the
same.  Now, if match_clause_to_partition_key is unable to match the qual
to any other qual type then we can simply return the value from the
match_boolean_partition_clause call so that the calling function properly
treats the qual as either unmatched or unsupported.
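
An abridged, hedged sketch of the distinction (the real enum has additional
members):

    typedef enum PartClauseMatchStatus
    {
        PARTCLAUSE_NOMATCH,         /* try the qual against another partition key */
        PARTCLAUSE_MATCH_CLAUSE,    /* matched this key; usable for pruning */
        PARTCLAUSE_UNSUPPORTED      /* give up on this qual entirely */
    } PartClauseMatchStatus;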

Reported-by: Rares Salcudean
Reviewed-by: Amit Langote
Backpatch-through: 11 where partition pruning was introduced
Discussion: https://postgr.es/m/CAHp_FN2xwEznH6oyS0hNTuUUZKp5PvegcVv=Co6nBXJ+mC7Y5w@mail.gmail.com
2019-07-12 19:12:38 +12:00
Tom Lane
b5810de3f4 Reduce memory consumption for multi-statement query strings.
Previously, exec_simple_query always ran parse analysis, rewrite, and
planning in MessageContext, allowing all the data generated thereby
to persist until the end of processing of the whole query string.
That's fine for single-command strings, but if a client sends many
commands in a single simple-Query message, this strategy could result
in annoying memory bloat, as complained of by Andreas Seltenreich.

To fix, create a child context to do this work in, and reclaim it
after each command.  But we only do so for parsetrees that are not
last in their query string.  That avoids adding any memory management
overhead for the typical case of a single-command string.  Memory
allocated for the last parsetree would be freed immediately after
finishing the command string anyway.
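
A hedged sketch of the pattern, with the per-command work bracketed by a
short-lived child of MessageContext (details are illustrative):

    MemoryContext per_parsetree_context =
        AllocSetContextCreate(MessageContext,
                              "per-parsetree message context",
                              ALLOCSET_DEFAULT_SIZES);
    MemoryContext oldcontext = MemoryContextSwitchTo(per_parsetree_context);

    /* ... parse analysis, rewrite, and planning for this parsetree ... */

    MemoryContextSwitchTo(oldcontext);
    /* ... execute the command ... */
    MemoryContextDelete(per_parsetree_context);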

Similarly, adjust extension.c's execute_sql_string() to reclaim memory
after each command.  In that usage, multi-command strings are the norm,
so it's a bit surprising that no one has yet complained of bloat ---
especially since the bloat extended to whatever data ProcessUtility
execution might leak.

Amit Langote, reviewed by Julien Rouhaud

Discussion: https://postgr.es/m/87ftp6l2qr.fsf@credativ.de
2019-07-10 14:32:38 -04:00
Michael Paquier
fa19a08d71 Fix variable initialization when using buffering build with GiST
This can cause valgrind to complain, as the flag marking a buffer as a
temporary copy was not getting initialized.

While on it, fill newly-created buffer pages with zeros.  This does
not matter when loading a block from a temporary file, but it makes the
push of an index tuple into a new buffer page safer.

This has been introduced by 1d27dcf, so backpatch all the way down to
9.4.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/15899-0d24fb273b3dd90c@postgresql.org
Backpatch-through: 9.4
2019-07-10 15:14:54 +09:00
David Rowley
f7c830f1ab Fix missing calls to table_finish_bulk_insert during COPY, take 2
86b85044e abstracted calls to heap functions in COPY FROM to support a
generic table AM.  However, when performing a copy into a partitioned
table, this commit neglected to call table_finish_bulk_insert for each
partition.  Before 86b85044e, when we always called the heap functions,
there was no need to call heapam_finish_bulk_insert for partitions since
it only did any work when performing a copy without WAL.  For partitioned
tables, this was unsupported anyway, so there was no issue.  With
pluggable storage, we can't make any assumptions about what the table AM
might want to do in its equivalent function, so we'd better ensure we
always call table_finish_bulk_insert for each partition that's received a row.

For now, we make the table_finish_bulk_insert call whenever we evict a
CopyMultiInsertBuffer out of the CopyMultiInsertInfo.  This does mean
that it's possible that we call table_finish_bulk_insert multiple times
per partition, which is not a problem other than being an inefficiency.
Improving this requires a more invasive patch, so let's leave that for
another day.

This also changes things so that we no longer needlessly call
table_finish_bulk_insert when performing a COPY FROM for a non-partitioned
table when not using multi-inserts.

Reported-by: Robert Haas
Backpatch-through: 12
Discussion: https://postgr.es/m/CA+TgmoYK=6BpxiJ0tN-p9wtH0BTAfbdxzHhwou0mdud4+BkYuQ@mail.gmail.com
2019-07-10 16:03:04 +12:00
Thomas Munro
f5825853e3 Pass QueryEnvironment down to EvalPlanQual's EState.
Otherwise the executor can't see trigger transition tables during
EPQ evaluation.  Fixes bug #15900 and almost certainly also #15720.
Back-patch to 10, where trigger transition tables landed.

Author: Alex Aktsipetrov
Reviewed-by: Thomas Munro, Tom Lane
Discussion: https://postgr.es/m/15900-bc482754fe8d7415%40postgresql.org
Discussion: https://postgr.es/m/15720-38c2b29e5d720187%40postgresql.org
2019-07-10 10:15:32 +12:00
Alvaro Herrera
2c84ea6cf9 Propagate trigger arguments to partitions
We were creating the cloned triggers with an empty list of arguments,
losing the ones that had been specified by the user when creating the
trigger in the partitioned table.  Repair.

This was forgotten in commit 86f575948c.

Author: Patrick McHardy
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/20190709130027.amr2cavjvo7rdvac@access1.trash.net
Discussion: https://postgr.es/m/15752-123bc90287986de4@postgresql.org
2019-07-09 17:16:36 -04:00
Bruce Momjian
ba09342518 Adjust ssl_ciphers to be specific to OpenSSL
Syntax is OpenSSL-specific, so only use it for OpenSSL.

Discussion: https://postgr.es/m/8232E273-7B25-47F4-B0E7-3D4264106F82@yesql.se

Author: Daniel Gustafsson

Backpatch-through: head
2019-07-08 19:39:48 -04:00
Robert Haas
554106b116 tableam: Provide helper functions for relation sizing.
Most block-based table AMs will need the exact same implementation of
the relation_size callback as the heap, and if they use a standard
page layout, they will likely need an implementation of the
relation_estimate_size callback that is very similar to that of the
heap.  Rearrange to facilitate code reuse.

Patch by me, reviewed by Michael Paquier, Daniel Gustafsson, and
Álvaro Herrera.

Discussion: http://postgr.es/m/CA+TgmoZ6DBPnP1E-vRpQZUJQijJFD54F+SR_pxGiAAS-MyrigA@mail.gmail.com
2019-07-08 14:51:53 -04:00
Michael Paquier
6b8548964b Fix inconsistencies in the code
This addresses a couple of issues in the code:
- Typos and inconsistencies in comments and function declarations.
- Removal of unreferenced function declarations.
- Removal of unnecessary compile flags.
- A cleanup error in regressplans.sh.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/0c991fdf-2670-1997-c027-772a420c4604@gmail.com
2019-07-08 13:15:09 +09:00
Peter Eisentraut
7e9a4c5c3d Use consistent style for checking return from system calls
Use

    if (something() != 0)
        error ...

instead of just

    if (something)
        error ...

The latter is not incorrect, but it's a bit confusing and not the
common style.

Discussion: https://www.postgresql.org/message-id/flat/5de61b6b-8be9-7771-0048-860328efe027%402ndquadrant.com
2019-07-07 15:28:49 +02:00
Amit Kapila
78d41f6c9b Add missing assertions for required table am callbacks.
Reported-by: Ashwin Agrawal
Author: Ashwin Agrawal
Reviewed-by: Amit Kapila
Backpatch-through: 12, where it was introduced
Discussion: https://postgr.es/m/CALfoeisgdZhYDrJOukaBzvXfJOK2FQ0szVMK7dzmcy6w93iDUA@mail.gmail.com
2019-07-06 11:41:23 +05:30
Tom Lane
0ab1a2e39b Remove dead encoding-conversion functions.
The code for conversions SQL_ASCII <-> MULE_INTERNAL and
SQL_ASCII <-> UTF8 was unreachable, because we long ago changed
the wrapper functions pg_do_encoding_conversion() et al so that
they have hard-wired behaviors for conversions involving SQL_ASCII.
(At least some of those fast paths date back to 2002, though it
looks like we may not have been totally consistent about this until
later.)  Given the lack of complaints, nobody is dissatisfied with
this state of affairs.  Hence, let's just remove the unreachable code.

Also, change CREATE CONVERSION so that it rejects attempts to
define such conversions.  Since we consider that SQL_ASCII represents
lack of knowledge about the encoding in use, such a conversion would
be semantically dubious even if it were reachable.

Adjust a couple of regression test cases that had randomly decided
to rely on these conversion functions rather than any other ones.

Discussion: https://postgr.es/m/41163.1559156593@sss.pgh.pa.us
2019-07-05 14:17:27 -04:00
Tomas Vondra
ef777cb093 Remove unused variable in statext_mcv_serialize()
The itemlen variable used to be referenced in multiple places, but since
reworking the serialization code it's used only in one assert. Fixed by
removing the variable and calling the macro from the assert directly.

Backpatch to 12, where this code was introduced.

Reported-by: Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1zc_ovH9NZd_9ovuiEWkF9yX06URUDdXCmgDydf-bqB5A@mail.gmail.com
2019-07-05 18:51:56 +02:00
Thomas Munro
e8fdcacc6c Improve comment in postgresql.conf.sample.
The Unix manual section that "man tcp" appears in varies, so let's
just leave it out of the command to run.
2019-07-05 21:03:51 +12:00
Michael Paquier
313f87a171 Add min() and max() aggregates for pg_lsn
This is useful for monitoring, when it comes for example to calculations
of WAL retention with replication slots and delays with a set of
standbys.

Bump catalog version.

Author: Fabrízio de Royes Mello
Reviewed-by: Surafel Temesgen
Discussion: https://postgr.es/m/CAFcNs+oc8ZoHhowA4rR1GGCgG8QNgK_TOwPRVYQo5rYy8_PXzA@mail.gmail.com
2019-07-05 12:21:11 +09:00
Tomas Vondra
08aa131c7a Simplify pg_mcv_list (de)serialization
The serialization format of multivariate MCV lists included alignment in
order to allow direct access to part of the serialized data, but despite
multiple fixes (see for example commits d85e0f366a and ea4e1c0e8f) this
proved to be problematic.

This commit abandons alignment in the serialized format, and just copies
everything during deserialization.  We now also track amount of memory
needed after deserialization (including alignment), which allows us to
deserialize the MCV list in a single pass.

Bump catversion, as this affects contents of pg_statistic_ext_data.

Backpatch to 12, where multi-column MCV lists were introduced.

Author: Tomas Vondra
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/2201.1561521148@sss.pgh.pa.us
2019-07-05 01:32:49 +02:00
Tomas Vondra
4d66285adc Fix pg_mcv_list_items() to produce text[]
The function pg_mcv_list_items() returns values stored in MCV items. The
items may contain columns with different data types, so the function was
generating text array-like representation, but in an ad-hoc way without
properly escaping various characters etc.

Fixed by simply building a text[] array, which also makes it easier to
use from queries etc.

Requires changes to pg_proc entry, so bump catversion.

Backpatch to 12, where multi-column MCV lists were introduced.

Author: Tomas Vondra
Reviewed-by: Dean Rasheed
Discussion: https://postgr.es/m/20190618205920.qtlzcu73whfpfqne@development
2019-07-05 01:32:46 +02:00
Tomas Vondra
e365a581c2 Speed-up build of MCV lists with many distinct values
When building multi-column MCV lists, we compute base frequency for each
item, i.e. a product of per-column frequencies for values from the item.
As a value may be in multiple groups, the code was scanning the whole
array of groups while adding items to the MCV list.  This works fine as
long as the number of distinct groups is small, but it's easy to trigger
O(N^2) behavior, especially after increasing the statistics target.

This commit precomputes frequencies for values in all columns, so that
when computing the base frequency it's enough to make a simple bsearch
lookup in the array.

Backpatch to 12, where multi-column MCV lists were introduced.

Discussion: https://postgr.es/m/20190618205920.qtlzcu73whfpfqne@development
2019-07-05 01:32:33 +02:00
Peter Eisentraut
6a1cd8b923 Unwind some workarounds for lack of portable int64 format specifier
Because there is no portable int64/uint64 format specifier and we
can't stick macros like INT64_FORMAT into the middle of a translatable
string, we have been using various workarounds that put the number to
be printed into a string buffer first.  Now that we always use our own
sprintf(), we can rely on %lld and %llu to work, so we can use those.

This patch undoes this workaround in a few places where it was
egregiously verbose.
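
A hedged before/after sketch (the message wording and variable are
illustrative):

    char        buf[64];

    /* Before: stage the number in a buffer, since INT64_FORMAT can't be embedded. */
    snprintf(buf, sizeof(buf), INT64_FORMAT, filesize);
    ereport(ERROR, (errmsg("file is too large: %s bytes", buf)));

    /* After: rely on our own snprintf understanding %lld. */
    ereport(ERROR, (errmsg("file is too large: %lld bytes", (long long) filesize)));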

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/CAH2-Wz%3DWbNxc5ob5NJ9yqo2RMJ0q4HXDS30GVCobeCvC9A1L9A%40mail.gmail.com
2019-07-04 17:01:43 +02:00
Peter Eisentraut
7b925e1270 Sync our Snowball stemmer dictionaries with current upstream
The main change is a new stemmer for Greek.  There are minor changes
in the Danish and French stemmers.

Author: Panagiotis Mavrogiorgos <pmav99@gmail.com>
2019-07-04 13:26:48 +02:00
Peter Eisentraut
dedb6e0143 Clean up whitespace a bit 2019-07-04 13:26:48 +02:00
Michael Paquier
cfc40d384a Introduce safer encoding and decoding routines for base64.c
This is a follow-up refactoring after 09ec55b and b674211, which has
proved that the encoding and decoding routines used by SCRAM have a
poor interface when it comes to check after buffer overflows.  This adds
an extra argument in the shape of the length of the result buffer for
each routine, which is used for overflow checks when encoding or
decoding an input string.  The original idea comes from Tom Lane.

As a result of that, the encoding routine can now fail, so all its
callers are adjusted to generate proper error messages in case of
problems.

On failure, the result buffer gets zeroed.
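
A hedged usage sketch of the length-checked interface (the error wording is
illustrative):

    int         enclen = pg_b64_enc_len(srclen);
    char       *encoded = palloc(enclen + 1);
    int         written = pg_b64_encode(src, srclen, encoded, enclen);

    if (written < 0)
        elog(ERROR, "could not encode data with base64");
    encoded[written] = '\0';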

Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/20190623132535.GB1628@paquier.xyz
2019-07-04 16:08:09 +09:00
David Rowley
8abc13a889 Use appendStringInfoString and appendPQExpBufferStr where possible
This changes various places where appendPQExpBuffer was used in places
where it was possible to use appendPQExpBufferStr, and likewise for
appendStringInfo and appendStringInfoString.  This is really just a
stylistic improvement, but there are also small performance gains to be
had from doing this.
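
A hedged sketch of the substitution being made (buffer and text are
illustrative):

    /* Before: a constant string goes through the format-string machinery. */
    appendPQExpBuffer(&buf, "ORDER BY 1");

    /* After: plain string append, no format parsing needed. */
    appendPQExpBufferStr(&buf, "ORDER BY 1");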

Discussion: http://postgr.es/m/CAKJS1f9P=M-3ULmPvr8iCno8yvfDViHibJjpriHU8+SXUgeZ=w@mail.gmail.com
2019-07-04 13:01:13 +12:00
David Rowley
a5be4062f7 Don't remove surplus columns from GROUP BY for inheritance parents
d4c3a156c added code to remove columns that were not part of a table's
PRIMARY KEY constraint from the GROUP BY clause when all the primary key
columns were present in the group by.  This is fine to do since we know
that there will only be one row per group coming from this relation.
However, the logic failed to consider inheritance parent relations.  These
can have child relations without a primary key, but even if they did, they
could duplicate one of the parent's rows or one from another child
relation.  In this case, those additional GROUP BY columns are required.

Fix this by disabling the optimization for inheritance parent tables.
In v11 and beyond, partitioned tables are fine since partitions cannot
overlap and before v11 partitioned tables could not have a primary key.

Reported-by: Manuel Rigger
Discussion: http://postgr.es/m/CA+u7OA7VLKf_vEr6kLF3MnWSA9LToJYncgpNX2tQ-oWzYCBQAw@mail.gmail.com
Backpatch-through: 9.6
2019-07-03 23:44:54 +12:00
Peter Geoghegan
66c5bd3a6f Remove obsolete nbtree "get root" comment.
Remove a very old Berkeley era comment that doesn't seem to have
anything to do with the current locking considerations within
_bt_getroot().

Discussion: https://postgr.es/m/CAH2-WzmA2H+rL-xxF5o6QhMD+9x6cJTnz2Mr3Li_pbPBmqoTBQ@mail.gmail.com
2019-07-01 22:28:08 -07:00
Tom Lane
9e1c9f9594 pgindent run prior to branching v12.
pgperltidy and reformat-dat-files too, though the latter didn't
find anything to change.
2019-07-01 12:37:52 -04:00
David Rowley
f5db56fc4d Revert fix missing call to table_finish_bulk_insert during COPY
This reverts commits 4de60244e and b2d69806d. Further thought is
required to make this work properly.
2019-07-02 03:44:56 +12:00
David Rowley
b2d69806d8 Remove surplus call to table_finish_bulk_insert
4de60244e added the call to table_finish_bulk_insert to the
CopyMultiInsertBufferCleanup function.  We use a CopyMultiInsertBuffer even
for non-partitioned tables, so having the cleanup do that meant we would
call table_finish_bulk_insert twice when performing COPY FROM with
a non-partitioned table.

Here we can just remove the direct call in CopyFrom and let
CopyMultiInsertBufferCleanup handle the call instead.
2019-07-02 03:07:15 +12:00
David Rowley
4de60244e2 Fix missing call to table_finish_bulk_insert during COPY
86b85044e abstracted calls to heap functions in COPY FROM to support a
generic table AM.  However, when performing a copy into a partitioned
table, this commit neglected to call table_finish_bulk_insert for each
partition.  Before 86b85044e, when we always called the heap functions,
there was no need to call heapam_finish_bulk_insert for partitions since
it only did any work when performing a copy without WAL. For partitioned
tables, this was unsupported anyway, so there was no issue. With pluggable
storage, we can't make any assumptions about what the table AM might want
to do in its equivalent function, so we'd better ensure we always call
table_finish_bulk_insert for each partition that's received a row.

For now, we make the table_finish_bulk_insert call whenever we evict a
CopyMultiInsertBuffer out of the CopyMultiInsertInfo.  This does mean
that it's possible that we call table_finish_bulk_insert multiple times
per partition, which is not a problem other than being an inefficiency.
Improving this requires a more invasive patch, so let's leave that for
another day.

In passing, move the table_finish_bulk_insert for the target of the COPY
command so that it's only called when we're actually performing bulk
inserts.  We don't need to call this when inserting 1 row at a time.

Reported-by: Robert Haas
Discussion: https://postgr.es/m/CA+TgmoYK=6BpxiJ0tN-p9wtH0BTAfbdxzHhwou0mdud4+BkYuQ@mail.gmail.com
2019-07-02 01:23:26 +12:00
Peter Eisentraut
1b29e990e3 Add missing serial commas 2019-07-01 13:07:14 +02:00
Michael Paquier
c74d49d41c Fix many typos and inconsistencies
Author: Alexander Lakhin
Discussion: https://postgr.es/m/af27d1b3-a128-9d62-46e0-88f424397f44@gmail.com
2019-07-01 10:00:23 +09:00
Noah Misch
459c3cdb4a Don't read fields of a misaligned ExpandedObjectHeader or AnyArrayType.
UBSan complains about this.  Instead, cast to a suitable type requiring
only 4-byte alignment.  DatumGetAnyArrayP() already assumes one can cast
between AnyArrayType and ArrayType, so this doesn't introduce a new
assumption.  Back-patch to 9.5, where AnyArrayType was introduced.

Reviewed by Tom Lane.

Discussion: https://postgr.es/m/20190629210334.GA1244217@rfd.leadboat.com
2019-06-30 17:34:17 -07:00
Andrew Gierth
da53be23d1 Repair logic for reordering grouping sets optimization.
The logic in reorder_grouping_sets to order grouping set elements to
match a pre-specified sort ordering was defective, resulting in
unnecessary sort nodes (though the query output would still be
correct). Repair, simplifying the code a little, and add a test.

Per report from Richard Guo, though I didn't use their patch. Original
bug seems to have been my fault.

Backpatch back to 9.5 where grouping sets were introduced.

Discussion: https://postgr.es/m/CAN_9JTzyjGcUjiBHxLsgqfk7PkdLGXiM=pwM+=ph2LsWw0WO1A@mail.gmail.com
2019-06-30 23:49:13 +01:00
Peter Eisentraut
2e810508f6 Fix breakage introduced in pg_lsn_in()
Using PG_RETURN_LSN() from non-fmgr pg_lsn_in_internal() happened to
work on some platforms, but should just be a plain "return".
2019-06-30 13:25:33 +02:00
Peter Eisentraut
21f428ebde Don't call data type input functions in GUC check hooks
Instead of calling pg_lsn_in() in check_recovery_target_lsn and
timestamptz_in() in check_recovery_target_time, reorganize the
respective code so that we don't raise any errors in the check hooks.
The previous code tried to use PG_TRY/PG_CATCH to handle errors in a
way that is not safe, so now the code contains no ereport() calls and
can operate safely within the GUC error handling system.

Moreover, since the interpretation of the recovery_target_time string
may depend on the time zone, we cannot do the final processing of that
string until all the GUC processing is done.  Instead,
check_recovery_target_time() now does some parsing for syntax
checking, but the actual conversion to a timestamptz value is done
later in the recovery code that uses it.

Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/20190611061115.njjwkagvxp4qujhp%40alap3.anarazel.de
2019-06-30 10:27:43 +02:00
Peter Eisentraut
666cbae16d Remove explicit error handling for obsolete date/time values
The date/time values 'current', 'invalid', and 'undefined' were
removed a long time ago, but the code still contains explicit error
handling for the transition.  To simplify the code and avoid having to
handle these values everywhere, just remove the recognition of these
tokens altogether now.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
2019-06-30 10:27:35 +02:00
Tom Lane
54100f5c60 Add an enforcement mechanism for global object names in regression tests.
In commit 18555b132 we tentatively established a rule that regression
tests should use names containing "regression" for databases, and names
starting with "regress_" for all other globally-visible object names, so
as to circumscribe the side-effects that "make installcheck" could have
on an existing installation.

This commit adds a simple enforcement mechanism for that rule: if the code
is compiled with ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS defined, it
will emit a warning (not an error) whenever a database, role, tablespace,
subscription, or replication origin name is created that doesn't obey the
rule.  Running one or more buildfarm members with that symbol defined
should be enough to catch new violations, at least in the regular
regression tests.  Most TAP tests wouldn't notice such warnings, but
that's actually fine because TAP tests don't execute against an existing
server anyway.

Since it's already the case that running src/test/modules/ tests in
installcheck mode is deprecated, we can use that as a home for tests
that seem unsafe to run against an existing server, such as tests that
might have side-effects on existing roles.  Document that (though this
commit doesn't in itself make it any less safe than before).

Update regress.sgml to define these restrictions more clearly, and
to clean up assorted lack-of-up-to-date-ness in its descriptions of
the available regression tests.

Discussion: https://postgr.es/m/16638.1468620817@sss.pgh.pa.us
2019-06-29 11:34:00 -04:00
Tom Lane
a1e61badf9 Disallow user-created replication origins named "pg_xxx".
Since we generate such names internally, it seems like a good idea
to have a policy of disallowing them for user use, as we do for many
other object types.  Otherwise attempts to use them will randomly
fail due to collisions with internally-generated names.

Discussion: https://postgr.es/m/3606.1561747369@sss.pgh.pa.us
2019-06-29 10:30:08 -04:00
Michael Paquier
c0faa72750 Remove unnecessary header from be-secure-gssapi.c
libpq/libpq-be.h is included by libpq/libpq.h so there is no need to
explicitly include it separately.

Author: Daniel Gustafsson
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/A4852E46-9ED1-4861-A23B-22A83E34A084@yesql.se
2019-06-29 11:17:37 +09:00
Alvaro Herrera
23cccb17fe Fix for dropped columns in a partitioned table's default partition
We forgot to map column numbers to/from the default partition for
various operations, leading to valid cases failing with spurious
errors, such as
ERROR:  attribute N of type some_partition has been dropped

It was also possible that the search for conflicting rows in the default
partition when attaching another partition would fail to detect some.
Secondarily, it was also possible that such a search should be skipped
(because the constraint was implied) but wasn't.

Fix all this by mapping column numbers when necessary.

Reported by: Daniel Wilches
Author: Amit Langote
Discussion: https://postgr.es/m/15873-8c61945d6b3ef87c@postgresql.org
2019-06-28 14:51:08 -04:00
Thomas Munro
74b7cc8c02 Fix misleading comment in nodeIndexonlyscan.c.
The stated reason for acquiring predicate locks on heap pages hasn't
existed since commit c01262a8, so fix the comment.  Perhaps in a later
release we'll also be able to change the code to use tuple locks.

Back-patch all the way.

Reviewed-by: Ashwin Agrawal
Discussion: https://postgr.es/m/CAEepm%3D2GK3FVdnt5V3d%2Bh9njWipCv_fNL%3DwjxyUhzsF%3D0PcbNg%40mail.gmail.com
2019-06-28 17:13:08 +12:00
Tomas Vondra
69fd82fedd Update reference to sampling algorithm in analyze.c
Commit 83e176ec1 moved row sampling functions from analyze.c to
utils/misc/sampling.c, but failed to update comment referring to
the sampling algorithm from Jeff Vitter's paper. Correct the
comment by pointing to utils/misc/sampling.c.

Author: Etsuro Fujita
Discussion: https://postgr.es/m/CAPmGK154gp%2BQd%3DcorQOv%2BPmbyVyZBjp_%2Bhb766UJeD1e_ie6XQ%40mail.gmail.com
2019-06-27 18:01:54 +02:00
Alvaro Herrera
050098b14e Fix use-after-free introduced in 55ed3defc9
Evidenced by failure under RELCACHE_FORCE_RELEASE (buildfarm member
prion).

Author: Amit Langote
Discussion: https://postgr.es/m/CA+HiwqGV=k_Eh4jBiQw66ivvdG+EUkrEYeHTYL1SvDj_YOYV0g@mail.gmail.com
2019-06-27 11:57:10 -04:00
Peter Eisentraut
f2f0082ef5 Update comment
Function was renamed/replaced in
c2fe139c20 but the header comment was
not updated.
2019-06-27 15:57:14 +02:00
Alvaro Herrera
55ed3defc9 Fix partitioned index creation with foreign partitions
When a partitioned table contains foreign tables as partitions, it is
not possible to implement unique or primary key indexes -- but when
regular indexes are created, there is no reason to do anything other
than ignoring such partitions.  We were raising errors upon encountering
the foreign partitions, which is unfriendly and doesn't protect against
any actual problems.

Relax this restriction so that index creation is allowed on partitioned
tables containing foreign partitions, becoming a no-op on them.  (We may
later want to redefine this so that the FDW is told to create the
indexes on the foreign side.)  This applies to CREATE INDEX, as well as
ALTER TABLE / ATTACH PARTITION and CREATE TABLE / PARTITION OF.

Backpatch to 11, where indexes on partitioned tables were introduced.

Discussion: https://postgr.es/m/15724-d5a58fa9472eef4f@postgresql.org
Author: Álvaro Herrera
Reviewed-by: Amit Langote
2019-06-26 18:38:51 -04:00
Michael Paquier
ce59b75d44 Add toast-level reloption for vacuum_index_cleanup
a96c41f has introduced the option for heap, but it still lacked the
variant to control the behavior for toast relations.

While on it, refactor the tests so that they stress more scenarios with
the various values that vacuum_index_cleanup can use.  It would be
useful to couple those tests with pageinspect to check that pages are
actually cleaned up, but this is left for later.

Author: Masahiko Sawada, Michael Paquier
Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAD21AoCqs8iN04RX=i1KtLSaX5RrTEM04b7NHYps4+rqtpWNEg@mail.gmail.com
2019-06-25 09:09:27 +09:00
Tom Lane
ccfcc8fdbd Purely-cosmetic adjustments in tablecmds.c.
Move ATExecAlterColumnGenericOptions away from where it was unthinkingly
dropped, in the middle of a lot of ALTER COLUMN TYPE code.  I don't have
any high principles about where to put it instead, so let's just put it
after ALTER COLUMN TYPE and before ALTER OWNER, matching existing
decisions about how to order related code stanzas.

Also add the minimal function header comment that the original author
was too cool to bother with.

Along the way, upgrade header comments for nearby ALTER COLUMN TYPE
functions.

Discussion: https://postgr.es/m/14787.1561403130@sss.pgh.pa.us
2019-06-24 17:19:37 -04:00
Tom Lane
f946a40914 Further fix ALTER COLUMN TYPE's handling of indexes and index constraints.
This patch reverts all the code changes of commit e76de8861, which turns
out to have been seriously misguided.  We can't wait till later to compute
the definition string for an index; we must capture that before applying
the data type change for any column it depends on, else ruleutils.c will
deliver wrong/misleading results.  (This fine point was documented
nowhere, of course.)

I'd also managed to forget that ATExecAlterColumnType executes once per
ALTER COLUMN TYPE clause, not once per statement; which resulted in the
code being basically completely broken for any case in which multiple ALTER
COLUMN TYPE clauses are applied to a table having non-constraint indexes
that must be rebuilt.  Through very bad luck, none of the existing test
cases nor the ones added by e76de8861 caught that, but of course it was
soon found in the field.

The previous patch also had an implicit assumption that if a constraint's
index had a dependency on a table column, so would the constraint --- but
that isn't actually true, so it didn't fix such cases.

Instead of trying to delete unneeded index dependencies later, do the
is-there-a-constraint lookup immediately on seeing an index dependency,
and switch to remembering the constraint if so.  In the unusual case of
multiple column dependencies for a constraint index, this will result in
duplicate constraint lookups, but that's not that horrible compared to all
the other work that happens here.  Besides, such cases did not work at all
before, so it's hard to argue that they're performance-critical for anyone.

Per bug #15865 from Keith Fiske.  As before, back-patch to all supported
branches.

Discussion: https://postgr.es/m/15865-17940eacc8f8b081@postgresql.org
2019-06-24 16:43:21 -04:00
Peter Eisentraut
12e037e209 Upgrade internal error message to external
As part of REINDEX CONCURRENTLY, this formerly internal-only error
message becomes potentially user-visible (see regression tests), so
change from errmsg_internal() to errmsg(), and update comment.
2019-06-24 10:39:12 +02:00
Noah Misch
9a81c9fa3f Don't call PG_RETURN_BOOL() in a function not returning Datum.
This code is new in v12, and the defect probably was not user-visible.
2019-06-23 12:02:19 -07:00
Dean Rasheed
d7f8d26d9f Add security checks to the multivariate MCV estimation code.
The multivariate MCV estimation code may run user-defined operators on
the values in the MCV list, which means that those operators may
potentially leak the values from the MCV list. Guard against leaking
data to unprivileged users by checking that the user has SELECT
privileges on the table or all of the columns referred to by the
statistics.

Additionally, if there are any securityQuals on the RTE (either due to
RLS policies on the table, or accessing the table via a security
barrier view), not all rows may be visible to the current user, even
if they have table or column privileges. Thus we further insist that
the operator be leakproof in this case.

Dean Rasheed, reviewed by Tomas Vondra.

Discussion: https://postgr.es/m/CAEZATCUhT9rt7Ui=Vdx4N==VV5XOK5dsXfnGgVOz_JhAicB=ZA@mail.gmail.com
2019-06-23 18:50:08 +01:00
Thomas Munro
89ff7c08ee Remove unnecessary comment.
Author: Vik Fearing
Discussion: https://postgr.es/m/150d3e9f-c7ec-3fb3-4fdb-def47c4144af%402ndquadrant.com
2019-06-23 22:19:59 +12:00
Thomas Munro
25b93a2967 Remove obsolete comments about semaphores from proc.c.
Commit 6753333f switched from a semaphore-based wait to a latch-based
wait for ProcSleep()/ProcWakeup(), but left behind some stray references
to semaphores.

Back-patch to 9.5.

Reviewed-by: Daniel Gustafsson, Michael Paquier
Discussion: https://postgr.es/m/CA+hUKGLs5H6zhmgTijZ1OaJvC1sG0=AFXc1aHuce32tKiQrdEA@mail.gmail.com
2019-06-21 10:57:07 +12:00
Michael Paquier
20e1cc898d Rework some error strings for REINDEX CONCURRENTLY with system catalogs
This makes the whole user experience more consistent when bumping into
failures, and more in line with the rewording done via 508300e.

Author: Michael Paquier
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/20190514153252.GA22168@alvherre.pgsql
2019-06-20 13:28:12 +09:00
Alexander Korotkov
261a5c1928 Support 'q' flag in jsonpath 'like_regex' predicate
SQL/JSON standard defines that jsonpath 'like_regex' predicate should support
the same set of flags as XQuery/XPath.  It appears that implementation of 'q'
flag was missed.  This commit fixes that.

Discussion: https://postgr.es/m/CAPpHfdtyfPsxLYiTjp5Ov8T5xGsB5t3CwE5%2B3PS%3DLLwA%2BxTJog%40mail.gmail.com
Author: Nikita Glukhov, Alexander Korotkov
2019-06-19 22:41:57 +03:00
Peter Eisentraut
d8594d123c Update list of combining characters
The list of combining characters to ignore for calculating the display
width of a string (used for example by psql) was wildly outdated and
incorrect.

Discussion: https://www.postgresql.org/message-id/flat/bbb19114-af1e-513b-08a9-61272794bd5c%402ndquadrant.com
2019-06-19 21:35:41 +02:00
Magnus Hagander
66013fe730 Fix typo
Author: Daniel Gustafsson
2019-06-19 14:59:26 +02:00
Michael Paquier
3c28fd2281 Fix description of WAL record XLOG_BTREE_META_CLEANUP
This record uses one metadata buffer and registers some data associated
with the buffer, but when parsing the record for its description the code
accessed the record data directly, of which there is none.  This usually
leads to an incorrect description, but can also cause crashes, as in
pg_waldump.  Instead, fix things so that the parsing uses the data
associated with the metadata block.

This is an oversight from 3d92796, so backpatch down to 11.

Author: Michael Paquier
Discussion: https://postgr.es/m/20190617013059.GA3153@paquier.xyz
Backpatch-through: 11
2019-06-19 11:02:19 +09:00
Andres Freund
23224563d9 Fix memory corruption/crash in ANALYZE.
This fixes an embarrassing oversight I (Andres) made in 737a292b,
namely missing two place where liverows/deadrows were used when
converting those variables to pointers, leading to incrementing the
pointer, rather than the value.
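
A hedged illustration of the bug class (the variables are hypothetical):

    double      live_count = 0;
    double     *liverows = &live_count;

    liverows++;         /* wrong: advances the pointer, live_count stays 0 */
    (*liverows)++;      /* right: increments the value the pointer targets */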

It's actually not that easy to trigger a crash: one needs tuples
deleted by the current transaction, followed by a tuple deleted in
another session, all in one page. Which is presumably why this hasn't
been noticed before.

Reported-By: Steve Singer
Author: Steve Singer
Discussion: https://postgr.es/m/c7988239-d42c-ddc4-41db-171b23b35e4f@ssinger.info
2019-06-18 15:51:04 -07:00
Alvaro Herrera
8b21b416ed Avoid spurious deadlocks when upgrading a tuple lock
This puts back reverted commit de87a084c0, with some bug fixes.

When two (or more) transactions are waiting for transaction T1 to release a
tuple-level lock, and transaction T1 upgrades its lock to a higher level, a
spurious deadlock can be reported among the waiting transactions when T1
finishes.  The simplest example case seems to be:

T1: select id from job where name = 'a' for key share;
Y: select id from job where name = 'a' for update; -- starts waiting for T1
Z: select id from job where name = 'a' for key share;
T1: update job set name = 'b' where id = 1;
Z: update job set name = 'c' where id = 1; -- starts waiting for T1
T1: rollback;

At this point, transaction Y is rolled back on account of a deadlock: Y
holds the heavyweight tuple lock and is waiting for the Xmax to be released,
while Z holds part of the multixact and tries to acquire the heavyweight
lock (per protocol) and goes to sleep; once T1 releases its part of the
multixact, Z is awakened only to be put back to sleep on the heavyweight
lock that Y is holding while sleeping.  Kaboom.

This can be avoided by having Z skip the heavyweight lock acquisition.  As
far as I can see, the biggest downside is that if there are multiple Z
transactions, the order in which they resume after T1 finishes is not
guaranteed.

Backpatch to 9.6.  The patch applies cleanly on 9.5, but the new tests don't
work there (because isolationtester is not smart enough), so I'm not going
to risk it.

Author: Oleksii Kliukin
Discussion: https://postgr.es/m/B9C9D7CD-EB94-4635-91B6-E558ACEC0EC3@hintbits.com
Discussion: https://postgr.es/m/2815.1560521451@sss.pgh.pa.us
2019-06-18 18:23:16 -04:00
Thomas Munro
aca127c105 Prevent Parallel Hash Join for JOIN_UNIQUE_INNER.
WHERE EXISTS (...) queries cannot be executed by Parallel Hash Join
with jointype JOIN_UNIQUE_INNER, because there is no way to make a
partial plan totally unique.  The consequence of allowing such plans
was duplicate results from some EXISTS queries.

Back-patch to 11.  Bug #15857.

Author: Thomas Munro
Reviewed-by: Tom Lane
Reported-by: Vladimir Kriukov
Discussion: https://postgr.es/m/15857-d1ba2a64bce0795e%40postgresql.org
2019-06-19 01:25:57 +12:00
Peter Eisentraut
91acff7a53 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 1a710c413ce4c4cd081843e563cde256bb95f490
2019-06-17 15:30:20 +02:00
Michael Paquier
09ec55b933 Fix buffer overflow when parsing SCRAM verifiers in backend
Any authenticated user can overflow a stack-based buffer by changing the
user's own password to a purpose-crafted value.  This often suffices to
execute arbitrary code as the PostgreSQL operating system account.

This fix is contributed by multiple folks, based on an initial analysis
from Tom Lane.  This issue has been introduced by 68e61ee, so it was
possible to make use of it at authentication time.  It became easier to
trigger after ccae190, which made the SCRAM parsing more
strict when changing a password, in the case where the client passes
down a verifier already hashed using SCRAM.  Back-patch to v10 where
SCRAM has been introduced.

Reported-by: Alexander Lakhin
Author: Jonathan Katz, Heikki Linnakangas, Michael Paquier
Security: CVE-2019-10164
Backpatch-through: 10
2019-06-17 21:48:17 +09:00
Michael Paquier
3412030205 Fix more typos and inconsistencies in the tree
Author: Alexander Lakhin
Discussion: https://postgr.es/m/0a5419ea-1452-a4e6-72ff-545b1a5a8076@gmail.com
2019-06-17 16:13:16 +09:00
Alvaro Herrera
9d20b0ec8f Revert "Avoid spurious deadlocks when upgrading a tuple lock"
This reverts commits 3da73d6839 and de87a084c0.

This code has some tricky corner cases that I'm not sure are correct and
not properly tested anyway, so I'm reverting the whole thing for next
week's releases (reintroducing the deadlock bug that we set to fix).
I'll try again afterwards.

Discussion: https://postgr.es/m/E1hbXKQ-0003g1-0C@gemulon.postgresql.org
2019-06-16 22:24:21 -04:00
Tom Lane
6973b058bc Further fix privileges on pg_statistic_ext[_data].
We don't need to restrict column privileges on pg_statistic_ext;
all of that data is OK to read publicly.  What we *do* need to do,
which was overlooked by 6cbfb784c, is revoke public read access on
pg_statistic_ext_data; otherwise we still have the same security
hole we started with.

Catversion bump to ensure that installations calling themselves
beta2 will have this fix.

Diagnosis/correction by Dean Rasheed and Tomas Vondra, but I'm
going to go ahead and push this fix ASAP so we get more buildfarm
cycles on it.

Discussion: https://postgr.es/m/8833.1560647898@sss.pgh.pa.us
2019-06-16 11:00:23 -04:00
Tomas Vondra
fc8cf3df47 Fix privileges on pg_statistic_ext.tableoid
The GRANT in system_views allowed SELECT privileges on various columns in
the pg_statistic_ext catalog, but tableoid was not included in the list.
That made pg_dump fail because it's accessing this column when building
the list of extended statistics to dump.

Discussion: https://postgr.es/m/8833.1560647898%40sss.pgh.pa.us
2019-06-16 12:12:16 +02:00
Tomas Vondra
aa087ec64f Add pg_stats_ext view for extended statistics
Regular per-column statistics are stored in the pg_statistic catalog, which
is however rather difficult to read, so we also have the pg_stats view with
a human-readable version of the data.

For extended statistics the catalog was fairly easy to read, so we did
not have such a human-readable view so far.  Commit 9b6babfa2d however did
split the catalog into two, which makes querying harder.  Furthermore,
we want to show the multi-column MCV list in a way similar to per-column
stats (and not as a bytea value).

This commit introduces pg_stats_ext view, joining the two catalogs and
massaging the data to produce human-readable output similar to pg_stats.
It also considers RLS and access privileges - the data is shown only when
the user has access to all columns the extended statistic is defined on.

Bumped CATVERSION due to adding new system view.

Author: Dean Rasheed, with improvements by me
Reviewed-by: Dean Rasheed, John Naylor
Discussion: https://postgr.es/m/CAEZATCUhT9rt7Ui%3DVdx4N%3D%3DVV5XOK5dsXfnGgVOz_JhAicB%3DZA%40mail.gmail.com
2019-06-16 01:20:39 +02:00
Tomas Vondra
6cbfb784c3 Rework the pg_statistic_ext catalog
Since extended statistics were introduced in PostgreSQL 10, there was a
single catalog pg_statistic_ext storing both the definitions and the built
statistics.  That's however problematic when a user is supposed to have
access only to the definitions, but not to user data.

Consider for example pg_dump on a database with RLS enabled - if the
pg_statistic_ext catalog respects RLS (which it should, if it contains
user data), pg_dump would not see any records and the result would not
define any extended statistics.  That would be a surprising behavior.

Until now this was not a pressing issue, because the existing types of
extended statistic (functional dependencies and ndistinct coefficients)
do not include any user data directly.  This changed with introduction
of MCV lists, which do include most common combinations of values.

The easiest way to fix this is to split the pg_statistic_ext catalog
into two - one for definitions, one for the built statistic values.
The new catalog is called pg_statistic_ext_data, and we're maintaining
a 1:1 relationship with the old catalog - either there are matching
records in both catalogs, or neither of them.

Bumped CATVERSION due to changing system catalog definitions.

Author: Dean Rasheed, with improvements by me
Reviewed-by: Dean Rasheed, John Naylor
Discussion: https://postgr.es/m/CAEZATCUhT9rt7Ui%3DVdx4N%3D%3DVV5XOK5dsXfnGgVOz_JhAicB%3DZA%40mail.gmail.com
2019-06-16 01:20:31 +02:00
Alvaro Herrera
3da73d6839 Silence compiler warning
Introduced in de87a084c0.
2019-06-14 11:33:40 -04:00
Michael Paquier
f43608bda2 Fix typos and inconsistencies in code comments
Author: Alexander Lakhin
Discussion: https://postgr.es/m/dec6aae8-2d63-639f-4d50-20e229fb83e3@gmail.com
2019-06-14 09:34:34 +09:00
Tom Lane
d25ea01275 Avoid combinatorial explosion in add_child_rel_equivalences().
If an EquivalenceClass member expression includes variables from
multiple appendrels, then instead of producing one substituted
expression per child relation as intended, we'd create additional
child expressions for combinations of children of different appendrels.
This happened because the child expressions generated while considering
the first appendrel were taken as sources during substitution of the
second appendrel, and so on.  The extra expressions are useless, and are
harmless unless there are too many of them --- but if you have several
appendrels with a thousand or so members each, it gets bad fast.

To fix, consider only original (non-em_is_child) EC members as candidates
to be expanded.  This requires the ability to substitute directly from a
top parent relation's Vars to those of an indirect descendant relation,
but we already have that in adjust_appendrel_attrs_multilevel().

Per bug #15847 from Feike Steenbergen.  This is a longstanding misbehavior,
but it's only worth worrying about when there are more appendrel children
than we've historically considered wise to use.  So I'm not going to take
the risk of back-patching this.

Discussion: https://postgr.es/m/15847-ea3734094bf8ae61@postgresql.org
2019-06-13 18:10:20 -04:00
Alvaro Herrera
de87a084c0 Avoid spurious deadlocks when upgrading a tuple lock
When two (or more) transactions are waiting for transaction T1 to release a
tuple-level lock, and transaction T1 upgrades its lock to a higher level, a
spurious deadlock can be reported among the waiting transactions when T1
finishes.  The simplest example case seems to be:

T1: select id from job where name = 'a' for key share;
Y: select id from job where name = 'a' for update; -- starts waiting for T1
Z: select id from job where name = 'a' for key share;
T1: update job set name = 'b' where id = 1;
Z: update job set name = 'c' where id = 1; -- starts waiting for T1
T1: rollback;

At this point, transaction Y is rolled back on account of a deadlock: Y
holds the heavyweight tuple lock and is waiting for the Xmax to be released,
while Z holds part of the multixact and tries to acquire the heavyweight
lock (per protocol) and goes to sleep; once T1 releases its part of the
multixact, Z is awakened only to be put back to sleep on the heavyweight
lock that Y is holding while sleeping.  Kaboom.

This can be avoided by having Z skip the heavyweight lock acquisition.  As
far as I can see, the biggest downside is that if there are multiple Z
transactions, the order in which they resume after T1 finishes is not
guaranteed.

Backpatch to 9.6.  The patch applies cleanly on 9.5, but the new tests don't
work there (because isolationtester is not smart enough), so I'm not going
to risk it.

Author: Oleksii Kliukin
Discussion: https://postgr.es/m/B9C9D7CD-EB94-4635-91B6-E558ACEC0EC3@hintbits.com
2019-06-13 17:28:24 -04:00
Tom Lane
3d99a81397 Fix incorrect printing of queries with duplicated join names.
Given a query in which multiple JOIN nodes used the same alias
(which'd necessarily be in different sub-SELECTs), ruleutils.c
would assign the JOIN nodes distinct aliases for clarity ...
but then it forgot to print the modified aliases when dumping
the JOIN nodes themselves.  This results in a dump/reload hazard
for views, because the emitted query is flat-out incorrect:
Vars will be printed with table names that have no referent.

This has been wrong for a long time, so back-patch to all supported
branches.

Philip Dubé

Discussion: https://postgr.es/m/CY4PR2101MB080246F2955FF58A6ED1FEAC98140@CY4PR2101MB0802.namprd21.prod.outlook.com
2019-06-12 19:43:08 -04:00
Tom Lane
e76de88615 Fix ALTER COLUMN TYPE failure with a partial exclusion constraint.
ATExecAlterColumnType failed to consider the possibility that an index
that needs to be rebuilt might be a child of a constraint that needs to be
rebuilt.  We missed this so far because usually a constraint index doesn't
have a direct dependency on its table, just on the constraint object.
But if there's a WHERE clause, then dependency analysis of the WHERE
clause results in direct dependencies on the column(s) mentioned in WHERE.
This led to trying to drop and rebuild both the constraint and its
underlying index.

In v11/HEAD, we successfully drop both the index and the constraint,
and then try to rebuild both, and of course the second rebuild hits a
duplicate-index-name problem.  Before v11, it fails with obscure messages
about a missing relation OID, due to trying to drop the index twice.

This is essentially the same kind of problem noted in commit
20bef2c31: the possible dependency linkages are broader than what
ATExecAlterColumnType was designed for.  It was probably OK when
written, but it's certainly been broken since the introduction of
partial exclusion constraints.  Fix by adding an explicit check
for whether any of the indexes-to-be-rebuilt belong to any of the
constraints-to-be-rebuilt, and ignoring any that do.

In passing, fix a latent bug introduced by commit 8b08f7d48: in
get_constraint_index() we must "continue" not "break" when rejecting
a relation of a wrong relkind.  This is harmless today because we don't
expect that code path to be taken anyway; but if there ever were any
relations to be ignored, the existing coding would have an extremely
undesirable dependency on the order of pg_depend entries.

Also adjust a couple of obsolete comments.

Per bug #15835 from Yaroslav Schekin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/15835-32d9b7a76c06a7a9@postgresql.org
2019-06-12 12:29:39 -04:00
Michael Paquier
ceac4505d3 Fix handling of COMMENT for domain constraints
For a non-superuser, changing a comment on a domain constraint was
leading to a cache lookup failure as the code tried to perform the
ownership lookup on the constraint OID itself, thinking that it was a
type, but this check needs to happen on the type the domain constraint
relies on.  As the type a domain constraint relies on can be guessed
directly based on the constraint OID, first fetch its type OID and
perform the ownership check on it.

This is broken since 7eca575, which has split the handling of comments
for table constraints and domain constraints, so back-patch down to
9.5.

Reported-by: Clemens Ladisch
Author: Daniel Gustafsson, Michael Paquier
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/15833-808e11904835d26f@postgresql.org
Backpatch-through: 9.5
2019-06-12 11:30:11 +09:00
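For reference, the affected operation looks like this (the domain and constraint names here are made up for illustration):

    CREATE DOMAIN posint AS integer
        CONSTRAINT posint_check CHECK (VALUE > 0);

    -- before the fix, this failed with a cache lookup error for a
    -- non-superuser owning the domain
    COMMENT ON CONSTRAINT posint_check ON DOMAIN posint IS 'must be positive';
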
Tom Lane
6f34fcbbd5 Fix conversion of JSON strings to JSON output columns in json_to_record().
json_to_record(), when an output column is declared as type json or jsonb,
should emit the corresponding field of the input JSON object.  But it got
this slightly wrong when the field is just a string literal: it failed to
escape the contents of the string.  That typically resulted in syntax
errors if the string contained any double quotes or backslashes.

jsonb_to_record() handles such cases correctly, but I added corresponding
test cases for it too, to prevent future backsliding.

Improve the documentation, as it provided only a very hand-wavy
description of the conversion rules used by these functions.

Per bug report from Robert Vollmert.  Back-patch to v10 where the
error was introduced (by commit cf35346e8).

Note that PG 9.4 - 9.6 also get this case wrong, but differently so:
they feed the de-escaped contents of the string literal to json[b]_in.
That behavior is less obviously wrong, so possibly it's being depended on
in the field, so I won't risk trying to make the older branches behave
like the newer ones.

Discussion: https://postgr.es/m/D6921B37-BD8E-4664-8D5F-DB3525765DCD@vllmrt.net
2019-06-11 13:33:22 -04:00
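A small example of the case being fixed, assuming a json output column whose source field is a plain JSON string containing characters that need escaping:

    -- the "a" field is a JSON string containing double quotes; with the output
    -- column declared as json, the value must be re-escaped on output
    SELECT a
    FROM json_to_record('{"a": "he said \"hi\""}') AS r(a json);
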
Andres Freund
fff2a7d7bd Don't access catalogs to validate GUCs when not connected to a DB.
Vignesh found this bug in the check function for
default_table_access_method's check hook, but that was just copied
from older GUCs. Investigation by Michael and me then found the bug in
further places.

When not connected to a database (e.g. in a walsender connection), we
cannot perform (most) GUC checks that need database access. Even when
only shared tables are needed, unless they're
nailed (c.f. RelationCacheInitializePhase2()), they cannot be accessed
without pg_class etc. being present.

Fix by extending the existing IsTransactionState() checks to also
check for MyDatabaseOid.

Reported-By: Vignesh C, Michael Paquier, Andres Freund
Author: Vignesh C, Andres Freund
Discussion: https://postgr.es/m/CALDaNm1KXK9gbZfY-p_peRFm_XrBh1OwQO1Kk6Gig0c0fVZ2uw%40mail.gmail.com
Backpatch: 9.4-
2019-06-10 23:34:50 -07:00
Noah Misch
44982e7d09 Reconcile nodes/*funcs.c with PostgreSQL 12 work.
One would have needed out-of-tree code to observe the defects.  Remove
unreferenced fields instead of completing their support functions.
Since in-tree code can't reach _readIntoClause(), no catversion bump.
2019-06-09 14:00:36 -07:00
Noah Misch
31d250e049 Update stale comments, and fix comment typos. 2019-06-08 10:12:26 -07:00
Amit Kapila
92c4abc736 Fix assorted inconsistencies.
There were a number of issues in the recent commits which include typos,
code and comments mismatch, leftover function declarations.  Fix them.

Reported-by: Alexander Lakhin
Author: Alexander Lakhin, Amit Kapila and Amit Langote
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/ef0c0232-0c1d-3a35-63d4-0ebd06e31387@gmail.com
2019-06-08 08:16:38 +05:30
Michael Paquier
35b2d4bc0e Move be-gssapi-common.h into src/include/libpq/
The file has been introduced in src/backend/libpq/ as of b0b39f72, but
all backend-side headers of libpq are located in src/include/libpq/.
Note that the identification path on top of the file referred to
src/include/libpq/ from the start.

Author: Michael Paquier
Reviewed-by: Stephen Frost
Discussion: https://postgr.es/m/20190607043415.GE1736@paquier.xyz
2019-06-08 09:59:02 +09:00
Alvaro Herrera
a36c84c3e4 Fix default_tablespace usage for partitioned tables
In commit 87259588d0 I (Álvaro) tried to rationalize the determination
of tablespace to use for partitioned tables, but failed to handle the
default_tablespace case.  Repair and add proper tests.

Author: Amit Langote, Rushabh Lathia
Reported-by: Rushabh Lathia
Reviewed-by: Amit Langote, Álvaro Herrera
Discussion: https://postgr.es/m/CAGPqQf0cYjm1=rjxk_6gU0SjUS70=yFUAdCJLwWzh9bhNJnyVg@mail.gmail.com
2019-06-07 00:44:17 -04:00
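The scenario being repaired looks roughly like this (the tablespace and table names are hypothetical, and the tablespace must already exist):

    SET default_tablespace = 'ts1';

    -- before the fix, the interaction between default_tablespace and the
    -- (storage-less) partitioned table was not handled consistently
    CREATE TABLE measurements (logdate date, reading int)
        PARTITION BY RANGE (logdate);

    CREATE TABLE measurements_2019 PARTITION OF measurements
        FOR VALUES FROM ('2019-01-01') TO ('2020-01-01');
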
Amit Kapila
d8261595bc Fix inconsistency in comments atop ExecParallelEstimate.
When this code was initially introduced in commit d1b7c1ff, the structure
used was SharedPlanStateInstrumentation, but later when it got changed to
Instrumentation structure in commit b287df70, we forgot to update the
comment.

Reported-by: Wu Fei
Author: Wu Fei
Reviewed-by: Amit Kapila
Backpatch-through: 9.6
Discussion: https://postgr.es/m/52E6E0843B9D774C8C73D6CF64402F0562215EB2@G08CNEXMBPEKD02.g08.fujitsu.local
2019-06-07 05:23:52 +05:30
Alvaro Herrera
e8bdea58f9 Fix message style
Mark one message not for translation, and prefer "cannot" over "may
not", per commentary from Robert Haas.

Discussion: https://postgr.es/m/20190430145813.GA29872@alvherre.pgsql
2019-06-06 12:57:57 -04:00
Heikki Linnakangas
cd96389d71 Fix confusion on different kinds of slots in IndexOnlyScans.
We used the same slot to store a tuple from the index, and to store a
tuple from the table. That's not OK. It worked with the heap, because
heapam_getnextslot() stores a HeapTuple to the slot, and doesn't care how
large the tts_values/nulls arrays are. But when I played with a toy table
AM implementation that used a virtual tuple, it caused memory overruns.

In passing, tidy up comments on the ioss_PscanLen fields.
2019-06-06 09:46:52 +03:00
David Rowley
e24a815c1c Fix confusing NOTICE text in REINDEX CONCURRENTLY
When performing REINDEX TABLE CONCURRENTLY, if all of the table's indexes
could not be reindexed, a NOTICE message claimed that the table had no
indexes.  This was confusing, so let's change the NOTICE text to something
less confusing.

In passing, also mention in the comment before ReindexRelationConcurrently
that materialized views are supported too and also explain what the return
value of the function means.

Author: Ashwin Agrawal
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CALfoeithHvi13p_VyR8kt9o6Pa7Z=Smi6Nfc2anHnQx5Lj8bTQ@mail.gmail.com
2019-06-05 21:05:41 +12:00
David Rowley
56b3b38382 Fix incorrect index behavior in COPY FROM with partitioned tables
86b85044e rewrote how COPY FROM works to allow multiple tuple buffers to
exist at once, thus allowing multi-inserts to be used in more cases with
partitioned tables.  That commit neglected to update the estate's
es_result_relation_info when flushing the insert buffer to the partition,
making it possible for the index tuples to be added into an index on the
wrong partition.

Fix this and also add an Assert in ExecInsertIndexTuples to help ensure
that we never make this mistake again.

Reported-by: Haruka Takatsuka
Author: Ashutosh Sharma
Discussion: https://postgr.es/m/15832-b1bf336a4ee246b5@postgresql.org
2019-06-05 18:28:38 +12:00
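As a sketch, the affected pattern is a COPY that routes rows into multiple partitions that carry their own indexes (names and the data file path are hypothetical):

    CREATE TABLE events (id int, payload text) PARTITION BY LIST (id);
    CREATE TABLE events_1 PARTITION OF events FOR VALUES IN (1);
    CREATE TABLE events_2 PARTITION OF events FOR VALUES IN (2);
    CREATE INDEX ON events_1 (payload);
    CREATE INDEX ON events_2 (payload);

    -- rows for different partitions are buffered and flushed in batches;
    -- the bug could attach index entries to the wrong partition's index
    COPY events FROM '/tmp/events.csv' WITH (FORMAT csv);
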
Michael Paquier
f7e954ad1c Rework code using list_delete_cell() in MergeAttributes
When merging two attributes, we are sure that at least one remains.
However, when deleting one element in the attribute list we may finish
with an empty list returned as NIL by list_delete_cell(), but the code
failed to track that, which is not project style.  Adjust the call so that
we check for an empty list, and make use of it in an assertion.

This has been introduced by e7b3349, when adding support for CREATE
TABLE OF.

Author: Mark Dilger
Reviewed-by: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/CAE-h2TpPDqSWgOvfvSziOaMngMPwW+QZcmPpY8hQ_KOJ2+3hXQ@mail.gmail.com
2019-06-05 15:01:14 +09:00
Peter Eisentraut
c880096dc1 Add command column to pg_stat_progress_create_index
This allows determining which command is running, similar to
pg_stat_progress_cluster.

Discussion: https://www.postgresql.org/message-id/flat/f0e56b3b-74b7-6cbc-e207-a5ed6bee18dc%402ndquadrant.com
2019-06-04 09:29:02 +02:00
Tom Lane
eaf0292c3b Fix unsafe memory management in CloneRowTriggersToPartition().
It's not really supported to call systable_getnext() in a different
memory context than systable_beginscan() was called in, and it's
*definitely* not safe to do so and then reset that context between
calls.  I'm not very clear on how this code survived
CLOBBER_CACHE_ALWAYS testing ... but Alexander Lakhin found a case
that would crash it pretty reliably.

Per bug #15828.  Fix, and backpatch to v11 where this code came in.

Discussion: https://postgr.es/m/15828-f6ddd7df4852f473@postgresql.org
2019-06-03 16:59:26 -04:00
Peter Eisentraut
05d36b68ed Update SQL conformance information about JSON path
Reviewed-by: Oleg Bartunov <obartunov@postgrespro.ru>
2019-06-03 21:36:04 +02:00
Michael Paquier
1fb6f62a84 Fix typos in various places
Author: Andrea Gelmini
Reviewed-by: Michael Paquier, Justin Pryzby
Discussion: https://postgr.es/m/20190528181718.GA39034@glet
2019-06-03 13:44:03 +09:00
Alvaro Herrera
d22f885f89 Fix double-phrase typo in message
New in 147e3722f7.
2019-05-31 10:08:37 -04:00
Tomas Vondra
fe415ff104 Make error logging in extended statistics more consistent
Most errors reported in extended statistics are internal issues, and so
should use elog(). The MCV list code was already following this rule, but
the functional dependencies and ndistinct coefficients were using a mix
of elog() and ereport(). Fix this by changing most places to elog(), with
the exception of input functions.

This is a mostly cosmetic change; it makes life a little bit easier
for translators, as elog() messages are not translated. So backpatch to
PostgreSQL 10, where extended statistics were introduced.

Author: Tomas Vondra
Backpatch-through: 10 where extended statistics were added
Discussion: https://postgr.es/m/20190503154404.GA7478@alvherre.pgsql
2019-05-30 17:03:36 +02:00
Alvaro Herrera
d890fa812d Make one message just like all its siblings. 2019-05-28 23:44:22 -04:00
Alvaro Herrera
a100974751 Fix typo in message
I introduced the typo in source code in the course of 75445c1515.
Repair.
2019-05-28 17:36:14 -04:00
Amit Kapila
9679345f3c Fix typos.
Reported-by: Alexander Lakhin
Author: Alexander Lakhin
Reviewed-by: Amit Kapila and Tom Lane
Discussion: https://postgr.es/m/7208de98-add8-8537-91c0-f8b089e2928c@gmail.com
2019-05-26 18:28:18 +05:30
Thomas Munro
4c9210f34c Update copyright year.
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CA%2BhUKGJFWXmtYo6Frd77RR8YXCHz7hJ2mRy5aHV%3D7fJOqDnBHA%40mail.gmail.com
2019-05-24 12:03:32 +12:00
Thomas Munro
7988cb446d Fix typos.
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CA%2BhUKGJFWXmtYo6Frd77RR8YXCHz7hJ2mRy5aHV%3D7fJOqDnBHA%40mail.gmail.com
2019-05-24 12:00:59 +12:00
Andres Freund
73b8c3bd28 tableam: Rename wrapper functions to match callback names.
Some of the wrapper functions didn't match the callback names. Many of
them due to staying "consistent" with historic naming of the wrapped
functionality. We decided that for most cases it's more important to
be for tableam to be consistent going forward, than with the past.

The one exception is beginscan/endscan/...  because it'd have looked
odd to have systable_beginscan/endscan/... with a different naming
scheme, and changing the systable_* APIs would have caused way too
much churn (including breaking a lot of external users).

Author: Ashwin Agrawal, with some small additions by Andres Freund
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/CALfoeiugyrXZfX7n0ORCa4L-m834dzmaE8eFdbNR6PMpetU4Ww@mail.gmail.com
2019-05-23 16:32:36 -07:00
Andrew Gierth
44e95b5728 Fix array size allocation for HashAggregate hash keys.
When there were duplicate columns in the hash key list, the array
sizes could be miscomputed, resulting in access off the end of the
array. Adjust the computation to ensure the array is always large
enough.

(I considered whether the duplicates could be removed in planning, but
I can't rule out the possibility that duplicate columns might have
different hash functions assigned. Simpler to just make sure it works
at execution time regardless.)

Bug apparently introduced in fc4b3dea2 as part of narrowing down the
tuples stored in the hashtable. Reported by Colm McHugh of Salesforce,
though I didn't use their patch. Backpatch back to version 10 where
the bug was introduced.

Discussion: https://postgr.es/m/CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com
2019-05-23 15:26:01 +01:00
Tom Lane
db6e2b4c52 Initial pgperltidy run for v12.
Make all the perl code look nice, too (for some value of "nice").
2019-05-22 13:36:19 -04:00
Tom Lane
8255c7a5ee Phase 2 pgindent run for v12.
Switch to 2.1 version of pg_bsd_indent.  This formats
multiline function declarations "correctly", that is with
additional lines of parameter declarations indented to match
where the first line's left parenthesis is.

Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com
2019-05-22 13:04:48 -04:00
Tom Lane
be76af171c Initial pgindent run for v12.
This is still using the 2.0 version of pg_bsd_indent.
I thought it would be good to commit this separately,
so as to document the differences between 2.0 and 2.1 behavior.

Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us
2019-05-22 12:55:34 -04:00
Peter Eisentraut
66a4bad83a Convert ExecComputeStoredGenerated to use tuple slots
This code was still using the old style of forming a heap tuple rather
than using tuple slots.  This would be less efficient if a non-heap
access method was used.  And using tuple slots is actually quite a bit
faster when using heap as well.

Also add some test cases for generated columns with null values and
with varlena values.  This lack of coverage was discovered while
working on this patch.

Discussion: https://www.postgresql.org/message-id/flat/20190331025744.ugbsyks7czfcoksd%40alap3.anarazel.de
2019-05-22 18:41:53 +02:00
Tom Lane
166f69f769 Fix O(N^2) performance issue in pg_publication_tables view.
The original coding of this view relied on a correlated IN sub-query.
Our planner is not very bright about correlated sub-queries, and even
if it were, there's no way for it to know that the output of
pg_get_publication_tables() is duplicate-free, making the de-duplicating
semantics of IN unnecessary.  Hence, rewrite as a LATERAL sub-query.
This provides circa 100X speedup for me with a few hundred published
tables (the whole regression database), and things would degrade as
roughly O(published_relations * all_relations) beyond that.

Because the rules.out expected output changes, force a catversion bump.
Ordinarily we might not want to do that post-beta1; but we already know
we'll be doing a catversion bump before beta2 to fix pg_statistic_ext
issues, so it's pretty much free to fix it now instead of waiting for v13.

Per report and fix suggestion from PegoraroF10.

Discussion: https://postgr.es/m/1551385426763-0.post@n3.nabble.com
2019-05-22 11:47:02 -04:00
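The reshaping is roughly from a correlated IN sub-query to a LATERAL function call; a simplified sketch, assuming pg_get_publication_tables() returns a set of relation OIDs exposed here under the column alias relid:

    -- old shape: correlated IN, de-duplication the planner cannot optimize away
    SELECT c.relname
    FROM pg_publication p, pg_class c
    WHERE c.oid IN (SELECT relid FROM pg_get_publication_tables(p.pubname));

    -- new shape: LATERAL sub-query, no IN semantics needed
    SELECT c.relname
    FROM pg_publication p,
         LATERAL pg_get_publication_tables(p.pubname) gpt(relid),
         pg_class c
    WHERE c.oid = gpt.relid;
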
Robert Haas
1171d7d585 tableam: Move heap-specific logic from needs_toast_table below tableam.
This allows table AMs to completely suppress TOAST table creation, or
to modify the conditions under which they are created.

Patch by me.  Reviewed by Andres Freund.

Discussion: http://postgr.es/m/CA+Tgmoa4O2n=yphqD2pERUnYmUO84bH1SqMsA-nSxBGsZ7gWfA@mail.gmail.com
2019-05-21 11:57:13 -04:00
Fujii Masao
b8e2170e40 Fix comment for issue_xlog_fsync().
"segno" is the argument for the function, not "log" and "seg".

Author: Antonin Houska
Discussion: https://postgr.es/m/11863.1558361020@spoje.net
2019-05-21 00:44:00 +09:00
Fujii Masao
fc7c281f87 Make VACUUM accept 1 and 0 as a boolean value.
Commit 41b54ba78e allowed existing VACUUM options to take a boolean
argument. It's documented that valid boolean values that VACUUM can
accept are true, false, on, off, 1, and 0. But previously the parser
failed to accept 1 and 0 as a boolean value in VACUUM syntax because
of a lack of NumericOnly clause for vac_analyze_option_arg in gram.y.

This commit adds such a NumericOnly clause so that VACUUM options
can also take 1 and 0 as boolean values.

Discussion: https://postgr.es/m/CAHGQGwGYg82A8UCQxZe7Zn9MnyUBGdyB=1CNpKF3jBny+RbyfA@mail.gmail.com
2019-05-21 00:23:16 +09:00
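For illustration, with this change both spellings below are accepted (the table name is hypothetical):

    VACUUM (ANALYZE 1, VERBOSE 0) my_table;
    -- equivalent to
    VACUUM (ANALYZE true, VERBOSE false) my_table;
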
Peter Eisentraut
3c439a58df Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: a20bf6b8a5b4e32450967055eb5b07cee4704edd
2019-05-20 16:00:53 +02:00
Andres Freund
fb504c5e4b Remove outdated comment in copy.c. 2019-05-19 20:47:54 -07:00
Andres Freund
2657283256 Minimally fix partial aggregation for aggregates that don't have one argument.
For partial aggregation combine steps,
AggStatePerTrans->numTransInputs was set to the transition function's
number of inputs, rather than the combine function's number of
inputs (always 1).

That led partial aggregates with strict combine functions to
wrongly check for NOT NULL input as required by strictness. When the
aggregate wasn't exactly passed one argument, the strictness check was
either omitted (in the 0 args case) or too many arguments were
checked. In the latter case we'd read beyond the end of
FunctionCallInfoData->args (only in master).

AggStatePerTrans->numTransInputs actually has been wrong since
9.6, where partial aggregates were added. But it turns out to not be
an active problem in 9.6 and 10, because numTransInputs wasn't used at
all for combine functions: Before c253b722f6 there simply was no NULL
check for the input to strict trans functions, and after that the
check was simply hardcoded for the right offset in fcinfo, as it's
done by code specific to combine functions.

In bf6c614a2f (11) the strictness check was generalized, with common
code doing the strictness checks for both plain and combine transition
functions, based on numTransInputs. For combine functions this led to
not emitting an expression step to check for strict input in the 0
arguments case, and in the > 1 arguments case, we'd check too many
arguments.  Due to the fact that the relevant fcinfo->isnull[2..] was
always zero-initialized (more or less by accident, by being part of
the AggStatePerTrans struct, which is palloc0'ed), there was no
observable damage in the latter case before a9c35cf85c; we just
checked too many array elements.

Due to the changes in a9c35cf85c, the > 1 argument bug became visible,
because these days fcinfo is a) dynamically allocated without being
zeroed b) exactly the length required for the number of specified
arguments (hardcoded to 2 in this case).

This commit only contains a fairly minimal fix, setting numTransInputs
to a hardcoded 1 when building a pertrans for a combine function. It
seems likely that we'll want to clean this up further (e.g. the
arguments build_pertrans_for_aggref() aren't particularly meaningful
for combine functions). But the wrap date for 12 beta1 is coming up
fast, so it seems good to have a minimal fix in place.

Backpatch to 11. While AggStatePerTrans->numTransInputs was set
wrongly before that, the value was not used for combine functions.

Reported-By: Rajkumar Raghuwanshi
Diagnosed-By: Kyotaro Horiguchi, Jeevan Chalke, Andres Freund, David Rowley
Author: David Rowley, Kyotaro Horiguchi, Andres Freund
Discussion: https://postgr.es/m/CAKcux6=uZEyWyLw0N7HtR9OBc-sWEFeByEZC7t-KDf15FKxVew@mail.gmail.com
2019-05-19 18:01:06 -07:00
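The affected shape is parallel (partial) aggregation of aggregates that don't take exactly one argument, e.g. zero-argument count(*) or two-argument corr(); a hypothetical way to coax such a plan for testing (table and column names are made up):

    -- encourage a parallel plan so partial aggregates and combine functions run
    SET parallel_setup_cost = 0;
    SET parallel_tuple_cost = 0;
    SET min_parallel_table_scan_size = 0;

    -- count(*) has zero arguments, corr() has two; both go through the
    -- combine-function path touched by this commit
    SELECT count(*), corr(a, b) FROM measurements;
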
Andres Freund
c3b23ae457 Don't take predicate locks for analyze scans, refactor scan option passing.
Before this commit, when ANALYZE was run on a table and serializable
was used (either by virtue of an explicit BEGIN TRANSACTION ISOLATION
LEVEL SERIALIZABLE, or default_transaction_isolation being set to
serializable) a null pointer dereference led to a crash.

The analyze scan doesn't need a snapshot (nor predicate locking), but
before this commit a scan only contained information about being a
bitmap or sample scan.

Refactor the option passing to the scan_begin callback to use a
bitmask instead. Alternatively we could have added a new boolean
parameter, but that seems harder to read. Even before this issue
various people (Heikki, Tom, Robert) suggested doing so.

These changes don't change the scan APIs outside of tableam. The flags
argument could be exposed, but it's not necessary to fix this
problem. Also the wrapper table_beginscan* functions encapsulate most
of that complexity.

After these changes, fixing the bug is trivial: just don't acquire a
predicate lock for analyze-style scans. That was already done for
bitmap heap scans.  Add an assert that a snapshot is passed when
acquiring the predicate lock, so this kind of bug doesn't require
running with serializable.

Also add a comment about sample scans currently requiring predicate
locking the entire relation, which previously wasn't remarked upon.

Reported-By: Joe Wildish
Author: Andres Freund
Discussion:
    https://postgr.es/m/4EA80A20-E9BF-49F1-9F01-5B66CAB21453@elusive.cx
    https://postgr.es/m/20190411164947.nkii4gaeilt4bui7@alap3.anarazel.de
    https://postgr.es/m/20190518203102.g7peu2fianukjuxm@alap3.anarazel.de
2019-05-19 15:10:28 -07:00
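The crashing scenario was simply an ANALYZE run under serializable isolation, for example (table name hypothetical):

    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    ANALYZE my_table;   -- previously dereferenced a NULL snapshot while taking predicate locks
    COMMIT;
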
Tom Lane
8334515529 Revert "postmaster: Start syslogger earlier".
This commit reverts 57431a911d.

While that's still a good idea in the abstract, we found out
that there are multiple crasher bugs in it on Windows builds,
making the logging_collector option unusable on Windows.
There's no time left to fix these issues before 12beta1,
so revert the patch to allow Windows beta testing to proceed.
We'll try again at some future date.

Per bug #15804 from Yulian Khodorkovskiy and additional
investigation by Michael Paquier.

Discussion: https://postgr.es/m/15804-3721117bf40fb654@postgresql.org
2019-05-19 11:14:23 -04:00
Alexander Korotkov
da24961e9e Fix declarations of couple jsonpath functions
Make jsonb_path_query_array() and jsonb_path_query_first() use
PG_FUNCTION_ARGS macro instead of its expansion.
2019-05-19 07:45:42 +03:00
Tom Lane
93f03dad82 Make BufFileCreateTemp() ensure that temp tablespaces are set up.
If PrepareTempTablespaces() has never been called in the current
transaction, OpenTemporaryFile() will fall back to using the default
tablespace, which is a bug if the user wanted temp files placed elsewhere.
gistInitBuildBuffers() appears to have this disease already, and it
seems like an easy trap for future coders to fall into.

We discussed other ways to close this gap, but none of them are prettier
or more reliable than just having BufFileCreateTemp do it.  In particular,
having fd.c do this creates layering issues that we could do without.

Per suggestion from Melanie Plageman.  Arguably this is a bug fix, but
nobody seems very excited about back-patching, so change in HEAD only.

Discussion: https://postgr.es/m/CAAKRu_YwzjuGAmmaw4-8XO=OVFGR1QhY_Pq-t3wjb9ribBJb_Q@mail.gmail.com
2019-05-18 13:51:16 -04:00
Tom Lane
d307954a7d "A void function may not return a value".
Per buildfarm.
2019-05-18 00:40:39 -04:00
Andres Freund
147e3722f7 tableam: Avoid relying on relation size to determine validity of tids.
Instead add a tableam callback to do so. To avoid adding per-validation
overhead, pass a scan to tuple_tid_valid. In heap's case we'd otherwise
have incurred a RelationGetNumberOfBlocks() call for each tid - which'd
have added noticeable overhead to nodeTidscan.c.

Author: Andres Freund
Reviewed-By: Ashwin Agrawal
Discussion: https://postgr.es/m/20190515185447.gno2jtqxyktylyvs@alap3.anarazel.de
2019-05-17 18:56:55 -07:00
Andres Freund
7f44ede594 tableam: Don't assume that every AM uses md.c style storage.
Previously various parts of the code routed size requests through
RelationGetNumberOfBlocks[InFork]. That works if md.c is used by the
AM, but not otherwise.

Add a tableam callback to return the size of the table. As not every
AM will use postgres' BLCKSZ, have it return bytes, and have
RelationGetNumberOfBlocksInFork() round the byte size up into blocks.

To allow code outside of the AM to determine the actual relation size,
map InvalidForkNumber to the total size of a relation, as not every AM
might just need the postgres-defined forks.

A few users of RelationGetNumberOfBlocks() ought to be converted away
from that. One case, the use of it to determine whether a tid is
valid, will be fixed in a follow up commit. Others will have to wait
for v13.

Author: Andres Freund
Discussion: https://postgr.es/m/20190423225201.3bbv6tbqzkb5w7cw@alap3.anarazel.de
2019-05-17 18:56:47 -07:00
Tom Lane
6630ccad7a Restructure creation of run-time pruning steps.
Previously, gen_partprune_steps() always built executor pruning steps
using all suitable clauses, including those containing PARAM_EXEC
Params.  This meant that the pruning steps were only completely safe
for executor run-time (scan start) pruning.  To prune at executor
startup, we had to ignore the steps involving exec Params.  But this
doesn't really work in general, since there may be logic changes
needed as well --- for example, pruning according to the last operator's
btree strategy is the wrong thing if we're not applying that operator.
The rules embodied in gen_partprune_steps() and its minions are
sufficiently complicated that tracking their incremental effects in
other logic seems quite impractical.

Short of a complete redesign, the only safe fix seems to be to run
gen_partprune_steps() twice, once to create executor startup pruning
steps and then again for run-time pruning steps.  We can save a few
cycles however by noting during the first scan whether we rejected
any clauses because they involved exec Params --- if not, we don't
need to do the second scan.

In support of this, refactor the internal APIs in partprune.c to make
more use of passing information in the GeneratePruningStepsContext
struct, rather than as separate arguments.

This is, I hope, the last piece of our response to a bug report from
Alan Jackson.  Back-patch to v11 where this code came in.

Discussion: https://postgr.es/m/FAD28A83-AC73-489E-A058-2681FA31D648@tvsquared.com
2019-05-17 19:44:34 -04:00
Alvaro Herrera
75445c1515 More message style fixes
Discussion: https://postgr.es/m/20190515183005.GA26486@alvherre.pgsql
2019-05-16 19:14:31 -04:00
Peter Geoghegan
3f58cc6dd8 Remove extra nbtree half-dead internal page check.
It's not safe for nbtree VACUUM to attempt to delete a target page whose
right sibling is already half-dead, since that would fail the
cross-check when VACUUM attempts to re-find a downlink to the right
sibling in the parent page.  Logic to prevent this from happening was
added by commit 8da3183780, which addressed a bug in the overhaul of
page deletion that went into PostgreSQL 9.4 (commit efada2b8e9).
VACUUM was made to check the right sibling page, and back off when it
happened to be half-dead already.

However, it is only truly necessary to do the right sibling check on the
leaf level, since that transitively determines if the deletion target's
parent's right sibling page is itself undergoing deletion.  Remove the
internal page level check, and add a comment explaining why the leaf
level check alone suffices.

The extra check is also unnecessary due to the fact that internal pages
that are marked half-dead are generally considered corrupt.  Commit
efada2b8e9 established the principle that there should never be
half-dead internal pages (internal pages pending deletion are possible,
but that status is never directly represented in the internal page).
VACUUM will complain about corruption when it encounters half-dead
internal pages, so VACUUM is bound to raise an error one way or another
when an nbtree index has a half-dead internal page (contrib/amcheck will
also report that the page is corrupt).

It's possible that a pg_upgrade'd 9.3 database will still have half-dead
internal pages, so it may seem like there is an argument for leaving the
check in place to reliably get a cleaner error message that advises the
user to REINDEX.  However, leaf pages are also deleted in the first
phase of deletion prior to PostgreSQL 9.4, so I believe we won't even
attempt to re-find the parent page anyway (we won't have the fully
deleted leaf page as the right sibling of our target page, so we won't
even try to find a downlink for it).

Discussion: https://postgr.es/m/CAH2-Wzm_ntmqJjWLRyKzimFmFvk+BnVAvUpaA4s1h9Ja58woaQ@mail.gmail.com
2019-05-16 15:11:58 -07:00
Tom Lane
3922f10646 Fix bogus logic for combining range-partitioned columns during pruning.
gen_prune_steps_from_opexps's notion of how to do this was overly
complicated and underly correct.

Per discussion of a report from Alan Jackson (though this fixes only one
aspect of that problem).  Back-patch to v11 where this code came in.

Amit Langote

Discussion: https://postgr.es/m/FAD28A83-AC73-489E-A058-2681FA31D648@tvsquared.com
2019-05-16 16:25:43 -04:00
Tom Lane
4b1fcb43d0 Fix partition pruning to treat stable comparison operators properly.
Cross-type comparison operators in a btree or hash opclass might be
only stable not immutable (this is true of timestamp vs. timestamptz
for example).  partprune.c ignored this possibility and would perform
plan-time pruning with them anyway, possibly leading to wrong answers
if the environment changed between planning and execution.

To fix, teach gen_partprune_steps() to do things differently when
creating plan-time pruning steps vs. run-time pruning steps.
analyze_partkey_exprs() also needs an extra check, which is rather
annoying but now is not the time to restructure things enough to
avoid that.

While at it, simplify the logic for the plan-time case a little
by insisting that the comparison value be a Const and nothing else.
This relies on the assumption that eval_const_expressions will have
reduced any immutable expression to a Const; which is not quite
100% true, but certainly any case that comes up often enough to be
interesting should have simplification logic there.

Also improve a bunch of inadequate/obsolete/wrong comments.

Per discussion of a report from Alan Jackson (though this fixes only one
aspect of that problem).  Back-patch to v11 where this code came in.

David Rowley, with some further hacking by me

Discussion: https://postgr.es/m/FAD28A83-AC73-489E-A058-2681FA31D648@tvsquared.com
2019-05-16 11:58:21 -04:00
Peter Geoghegan
489e431ba5 Remove obsolete nbtree insertion comment.
Remove a Berkeley-era comment above _bt_insertonpg() that admonishes the
reader to grok Lehman and Yao's paper before making any changes.  This
made a certain amount of sense back when _bt_insertonpg() was
responsible for most of the things that are now spread across
_bt_insertonpg(), _bt_findinsertloc(), _bt_insert_parent(), and
_bt_split(), but it doesn't work like that anymore.

I believe that this comment alludes to the need to "couple" or "crab"
buffer locks as we ascend the tree as page splits cascade upwards.  The
nbtree README already explains this in detail, which seems sufficient.
Besides, the changes to page splits made by commit 40dae7ec53 altered
the exact details of how buffer locks are retained during splits; Lehman
and Yao's original algorithm seems to release the lock on the left child
page/buffer slightly earlier than _bt_insertonpg()/_bt_insert_parent()
can.
2019-05-15 16:53:11 -07:00
Peter Geoghegan
7505da2f45 Reverse order of newitem nbtree candidate splits.
Commit fab25024, which taught nbtree to choose candidate split points
more carefully, had _bt_findsplitloc() record all possible split points
in an initial pass over a page that is about to be split.  The order
that candidate split points were processed and stored in was assumed to
match the offset number order of split points on an imaginary version of
the page that contains the same items as the original, but also fits
newitem (the item that provoked the split precisely because it didn't
fit).

However, the order of split points in the final array was not quite what
was expected: the split point that makes newitem the firstright item
came after the split point that makes newitem the lastleft item -- not
before.  As a result, _bt_findsplitloc() could get confused about the
leftmost and rightmost tuples among all possible split points recorded
for the page.  This seems to have no appreciable impact on the quality
of the final split point chosen by _bt_findsplitloc(), but it's still
wrong.

To fix, switch the order in which newitem candidate splits are recorded.
This also makes it possible to describe candidate split points in
terms of which pair of adjoining tuples enclose the split point within
_bt_findsplitloc(), making it clearer why it's generally safe for
_bt_split() to expect lastleft and firstright tuples.
2019-05-15 12:22:07 -07:00
Andres Freund
aa4b8c61d2 Handle table_complete_speculative's succeeded argument as documented.
For some reason both the callsite and the implementation for heapam had
the meaning inverted (i.e. succeeded == true was passed in case of
conflict). That's confusing.

I (Andres) briefly pondered whether it'd be better to rename
table_complete_speculative's argument to 'bool specConflict' or such,
but decided not to. The 'complete' in the function name for me makes
`succeeded` sound a bit better.

Reported-By: Ashwin Agrawal, Melanie Plageman, Heikki Linnakangas
Discussion:
   https://postgr.es/m/CALfoeitk7-TACwYv3hCw45FNPjkA86RfXg4iQ5kAOPhR+F1Y4w@mail.gmail.com
   https://postgr.es/m/97673451-339f-b21e-a781-998d06b1067c@iki.fi
2019-05-14 12:19:32 -07:00
Tom Lane
7c850320d8 Fix SQL-style substring() to have spec-compliant greediness behavior.
SQL's regular-expression substring() function is defined to have a
pattern argument that's separated into three subpatterns by escape-
double-quote markers; the function result is the part of the input
matching the second subpattern.  The standard makes it clear that
if there is ambiguity about how to match the input to the subpatterns,
the first and third subpatterns should be taken to match the smallest
possible amount of text (i.e., they're "non greedy", in the terms of
our regex code).  We were not doing it that way: the first subpattern
would eat the largest possible amount of text, causing the function
result to be shorter than what the spec requires.

Fix that by attaching explicit greediness quantifiers to the
subpatterns.  (This depends on the regex fix in commit 8a29ed053;
before that, this didn't reliably change the regex engine's behavior.)

Also, by adding parentheses around each subpattern, we ensure that
"|" (OR) in the subpatterns behave sanely.  Previously, "|" in the
first or third subpatterns didn't work.

This patch also makes the function throw error if you write more than
two escape-double-quote markers, and do something sane if you write
just one, and document that behavior.  Previously, an odd number of
markers led to a confusing complaint about unbalanced parentheses,
while extra pairs of markers were just ignored.  (Note that the spec
requires exactly two markers, but we've historically allowed there
to be none, and this patch preserves the old behavior for that case.)

In passing, adjust some substring() test cases that didn't really
prove what they said they were testing for: they used patterns
that didn't match the data string, so that the output would be
NULL whether or not the function was really strict.

Although this is certainly a bug fix, changing the behavior in back
branches seems undesirable: applications could perhaps be depending on
the old behavior, since it's not obviously wrong unless you read the
spec very closely.  Hence, no back-patch.

Discussion: https://postgr.es/m/5bb27a41-350d-37bf-901e-9d26f5592dd0@charter.net
2019-05-14 11:27:31 -04:00
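For reference, the SQL-style form splits the pattern into three parts with escape-double-quote markers; an example in the style of the documentation (the result is the match of the middle part):

    -- pattern parts: '%' .. 'o_a' .. '_', with '#' as the escape character;
    -- the spec requires the first and third parts to match as little as possible
    SELECT substring('Thomas' FROM '%#"o_a#"_' FOR '#');   -- returns 'oma'
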
Tom Lane
fb489e4b31 In bootstrap mode, use default signal handling for SIGINT etc.
Previously, the code pointed the standard process-termination signals
to postgres.c's die().  That would typically result in an attempt to
execute a transaction abort, which is not possible in bootstrap mode,
leading to PANIC.  This choice seems to be a leftover from an old code
structure in which the same signal-assignment code was used for many
sorts of auxiliary processes, including interactive standalone
backends.  It's not very sensible for bootstrap mode, which has no
interest in either interactivity or continuing after an error.  We can
get better behavior with less effort by just letting normal process
termination happen, after which the parent initdb process will clean up.

This is basically cosmetic in any case, since initdb will react the
same way whether bootstrap dies on a signal or abort().  Given the
lack of previous complaints, I don't feel a need to back-patch,
even though the behavior is old.

Discussion: https://postgr.es/m/3850b11a.5121.16aaf827e4a.Coremail.thunder1@126.com
2019-05-14 10:22:28 -04:00
Peter Eisentraut
037165ca95 Update SQL features/conformance information to SQL:2016 2019-05-14 15:44:37 +02:00
Peter Eisentraut
eb3a1376c9 Update information_schema for SQL:2016
This is mainly a light renumbering to match the sections in the
standard.
2019-05-14 15:44:37 +02:00
Heikki Linnakangas
22251686f0 Detect internal GiST page splits correctly during index build.
As we descend the GiST tree during insertion, we modify any downlinks on
the way down to include the new tuple we're about to insert (if they don't
cover it already). Modifying an existing downlink might cause an internal
page to split, if the new downlink tuple is larger than the old one. If
that happens, we need to back up to the parent and re-choose a page to
insert to. We used to detect that situation, thanks to the NSN-LSN
interlock normally used to detect concurrent page splits, but that got
broken by commit 9155580fd5. With that commit, we now use a dummy constant
LSN value for every page during index build, so the LSN-NSN interlock no
longer works. I thought that was OK because there can't be any other
backends modifying the index during index build, but missed that the
insertion itself can modify the page we're inserting to. The consequence
was that we would sometimes insert the new tuple to an incorrect page, one
whose downlink doesn't cover the new tuple.

To fix, add a flag to the stack that keeps track of the state while
descending the tree, to indicate that a page was split, and that we need to
retry the descent from the parent.

Thomas Munro first reported that the contrib/intarray regression test was
failing occasionally on the buildfarm after commit 9155580fd5. The failure
was intermittent, because the gistchoose() function is not deterministic,
and would only occasionally create the right circumstances for this bug to
cause the failure.

Patch by Anastasia Lubennikova, with some changes by me to make it work
correctly also when the internal page split causes the "grandparent"
to be split.

Discussion: https://www.postgresql.org/message-id/CA%2BhUKGJRzLo7tZExWfSbwM3XuK7aAK7FhdBV0FLkbUG%2BW0v0zg%40mail.gmail.com
2019-05-14 13:18:44 +03:00
Heikki Linnakangas
e95d550bbb Fix comment on when HOT update is possible.
The conditions listed in this comment have changed several times, and at
some point the thing that the "if so" referred to was negated.

The text was OK up to 9.6. It was differently wrong in v10, v11 and
master, so fix in all those versions.
2019-05-14 13:06:28 +03:00
Etsuro Fujita
7d9eca59cf Fix typo. 2019-05-14 16:05:37 +09:00
Michael Paquier
7e19929ea2 Fix duplicated words in comments
Author: Stephen Amell
Discussion: https://postgr.es/m/539fa271-21b3-777e-a468-d96cffe9c768@gmail.com
2019-05-14 09:37:35 +09:00
Peter Geoghegan
ae7291acbc Standardize ItemIdData terminology.
The term "item pointer" should not be used to refer to ItemIdData
variables, since that is needlessly ambiguous.  Only
ItemPointerData/ItemPointer variables should be called item pointers.

To fix, establish the convention that ItemIdData variables should always
be referred to either as "item identifiers" or "line pointers".  The
term "item identifier" already predominates in docs and translatable
messages, and so should be the preferred alternative there.

Discussion: https://postgr.es/m/CAH2-Wz=c=MZQjUzde3o9+2PLAPuHTpVZPPdYxN=E4ndQ2--8ew@mail.gmail.com
2019-05-13 15:53:39 -07:00
Tom Lane
32ebb35128 Fix logical replication's ideas about which type OIDs are built-in.
Only hand-assigned type OIDs should be presumed to match across different
PG servers; those assigned during genbki.pl or during initdb are likely
to change due to addition or removal of unrelated objects.

This means that the cutoff should be FirstGenbkiObjectId (in HEAD)
or FirstBootstrapObjectId (before that), not FirstNormalObjectId.
Compare postgres_fdw's is_builtin() test.

It's likely that this error has no observable consequence in a
normally-functioning system, since ATM the only affected type OIDs are
system catalog rowtypes and information_schema types, which would not
typically be interesting for logical replication.  But you could
probably break it if you tried hard, so back-patch.

Discussion: https://postgr.es/m/15150.1557257111@sss.pgh.pa.us
2019-05-13 17:23:00 -04:00
Tom Lane
e34ee993fb Improve commentary about hack in is_publishable_class().
The FirstNormalObjectId test here is a kluge that needs to go away,
but the only substitute we can think of is to add a column to pg_class,
which will take more work than can be handled right now.  Add some
commentary in the meanwhile.

Discussion: https://postgr.es/m/15150.1557257111@sss.pgh.pa.us
2019-05-13 17:05:48 -04:00
Peter Geoghegan
9b42e71376 Don't leave behind junk nbtree pages during split.
Commit 8fa30f906b reduced the elevel of a number of "can't happen"
_bt_split() errors from PANIC to ERROR.  At the same time, the new right
page buffer for the split could continue to be acquired well before the
critical section.  This was possible because it was relatively
straightforward to make sure that _bt_split() could not throw an error,
with a few specific exceptions.  The exceptional cases were safe because
they involved specific, well understood errors, making it possible to
consistently zero the right page before actually raising an error using
elog().  There was no danger of leaving around a junk page, provided
_bt_split() stuck to this coding rule.

Commit 8224de4f, which introduced INCLUDE indexes, added code to make
_bt_split() truncate away non-key attributes.  This happened at a point
that broke the rule around zeroing the right page in _bt_split().  If
truncation failed (perhaps due to palloc() failure), that would result
in an errant right page buffer with junk contents.  This could confuse
VACUUM when it attempted to delete the page, and should be avoided on
general principle.

To fix, reorganize _bt_split() so that truncation occurs before the new
right page buffer is even acquired.  A junk page/buffer will not be left
behind if _bt_nonkey_truncate()/_bt_truncate() raise an error.

Discussion: https://postgr.es/m/CAH2-WzkcWT_-NH7EeL=Az4efg0KCV+wArygW8zKB=+HoP=VWMw@mail.gmail.com
Backpatch: 11-, where INCLUDE indexes were introduced.
2019-05-13 10:27:59 -07:00
Michael Paquier
1171dbde2d Fix incorrect return value in JSON equality function for scalars
equalsJsonbScalarValue() uses a boolean as its return type; however, for one
code path -1 gets returned, which is confusing.  The origin of the
confusion is evidently that this code got copy-pasted from
compareJsonbScalarValue() when it was introduced in d1d50bf.

No backpatch, as this is only cosmetic.

Author: Rikard Falkeborn
Discussion: https://postgr.es/m/CADRDgG7mJnek6HNW13f+LF6V=6gag9PM+P7H5dnyWZAv49aBGg@mail.gmail.com
2019-05-13 09:11:50 +09:00
Tom Lane
8a29ed0530 Fix misoptimization of "{1,1}" quantifiers in regular expressions.
A bounded quantifier with m = n = 1 might be thought a no-op.  But
according to our documentation (which traces back to Henry Spencer's
original man page) it still imposes greediness, or non-greediness in the
case of the non-greedy variant "{1,1}?", on whatever it's attached to.

This turns out not to work though, because parseqatom() optimizes away
the m = n = 1 case without regard for whether it's supposed to change
the greediness of the argument RE.

We can fix this by just not applying the optimization when the greediness
needs to change; the subsequent general cases handle it fine.

The three cases in which we can still apply the optimization are
(a) no quantifier, or quantifier does not impose a preference;
(b) atom has no greediness property, implying it cannot match a
variable amount of text anyway; or
(c) quantifier's greediness is same as atom's.
Note that in most cases where one of these applies, we'd have exited
earlier in the "not a messy case" fast path.  I think it's now only
possible to get to the optimization when the atom involves capturing
parentheses or a non-top-level backref.

Back-patch to all supported branches.  I'd ordinarily be hesitant to
put a subtle behavioral change into back branches, but in this case
it's very hard to see a reason why somebody would write "{1,1}?" unless
they're trying to get the documented change-of-greediness behavior.

Discussion: https://postgr.es/m/5bb27a41-350d-37bf-901e-9d26f5592dd0@charter.net
2019-05-12 18:53:38 -04:00
Noah Misch
d02768ddd1 Fail pgwin32_message_to_UTF16() for SQL_ASCII messages.
The function had been interpreting SQL_ASCII messages as UTF8, throwing
an error when they were invalid UTF8.  The new behavior is consistent
with pg_do_encoding_conversion().  This affects LOG_DESTINATION_STDERR
and LOG_DESTINATION_EVENTLOG, which will send untranslated bytes to
write() and ReportEventA().  On buildfarm member bowerbird, enabling
log_connections caused an error whenever the role name was not valid
UTF8.  Back-patch to 9.4 (all supported versions).

Discussion: https://postgr.es/m/20190512015615.GD1124997@rfd.leadboat.com
2019-05-12 10:33:05 -07:00
Tom Lane
85ccb6899c Rearrange pgstat_bestart() to avoid failures within its critical section.
We long ago decided to design the shared PgBackendStatus data structure to
minimize the cost of writing status updates, which means that writers just
have to increment the st_changecount field twice.  That isn't hooked into
any sort of resource management mechanism, which means that if something
were to throw error between the two increments, the st_changecount field
would be left odd indefinitely.  That would cause readers to lock up.
Now, since it's also a bad idea to leave the field odd for longer than
absolutely necessary (because readers will spin while we have it set),
the expectation was that we'd treat these segments like spinlock critical
sections, with only short, more or less straight-line, code in them.

That was fine as originally designed, but commit 9029f4b37 broke it
by inserting a significant amount of non-straight-line code into
pgstat_bestart(), code that is very capable of throwing errors, not to
mention taking a significant amount of time during which readers will spin.
We have a report from Neeraj Kumar of readers actually locking up, which
I suspect was due to an encoding conversion error in X509_NAME_to_cstring,
though conceivably it was just a garden-variety OOM failure.

Subsequent commits have loaded even more dubious code into pgstat_bestart's
critical section (and commit fc70a4b0d deserves some kind of booby prize
for managing to miss the critical section entirely, although the negative
consequences seem minimal given that the PgBackendStatus entry should be
seen by readers as inactive at that point).

The right way to fix this mess seems to be to compute all these values
into a local copy of the process' PgBackendStatus struct, and then just
copy the data back within the critical section proper.  This plan can't
be implemented completely cleanly because of the struct's heavy reliance
on out-of-line strings, which we must initialize separately within the
critical section.  But still, the critical section is far smaller and
safer than it was before.

In hopes of forestalling future errors of the same ilk, rename the
macros for st_changecount management to make it more apparent that
the writer-side macros create a critical section.  And to prevent
the worst consequences if we nonetheless manage to mess it up anyway,
adjust those macros so that they really are a critical section, ie
they now bump CritSectionCount.  That doesn't add much overhead, and
it guarantees that if we do somehow throw an error while the counter
is odd, it will lead to PANIC and a database restart to reset shared
memory.

Back-patch to 9.5 where the problem was introduced.

In HEAD, also fix an oversight in commit b0b39f72b: it failed to teach
pgstat_read_current_status to copy st_gssstatus data from shared memory to
local memory.  Hence, subsequent use of that data within the transaction
would potentially see changing data that it shouldn't see.

Discussion: https://postgr.es/m/CAPR3Wj5Z17=+eeyrn_ZDG3NQGYgMEOY6JV6Y-WRRhGgwc16U3Q@mail.gmail.com
2019-05-11 21:27:29 -04:00
Tom Lane
610747d86e Cope with EINVAL and EIDRM shmat() failures in PGSharedMemoryAttach.
There's a very old race condition in our code to see whether a pre-existing
shared memory segment is still in use by a conflicting postmaster: it's
possible for the other postmaster to remove the segment in between our
shmctl() and shmat() calls.  It's a narrow window, and there's no risk
unless both postmasters are using the same port number, but that's possible
during parallelized "make check" tests.  (Note that while the TAP tests
take some pains to choose a randomized port number, pg_regress doesn't.)
If it does happen, we treated that as an unexpected case and errored out.

To fix, allow EINVAL to be treated as segment-not-present, and the same
for EIDRM on Linux.  AFAICS, the considerations here are basically
identical to the checks for acceptable shmctl() failures, so I documented
and coded it that way.

While at it, adjust PGSharedMemoryAttach's API to remove its undocumented
dependency on UsedShmemSegAddr in favor of passing the attach address
explicitly.  This makes it easier to be sure we're using a null shmaddr
when probing for segment conflicts (thus avoiding questions about what
EINVAL means).  I don't think there was a bug there, but it required
fragile assumptions about the state of UsedShmemSegAddr during
PGSharedMemoryIsInUse.

Commit c09850992 may have made this failure more probable by applying
the conflicting-segment tests more often.  Hence, back-patch to all
supported branches, as that was.

Discussion: https://postgr.es/m/22224.1557340366@sss.pgh.pa.us
2019-05-10 14:56:41 -04:00
Michael Paquier
508300e2e1 Improve and fix some error handling for REINDEX INDEX/TABLE CONCURRENTLY
This improves the user experience when it comes to restricting several
flavors of REINDEX CONCURRENTLY.  First, for INDEX, remove a restriction
on shared relations, as we already check for catalog relations.  Then,
for TABLE, add a proper error message when attempting to run the command
on system catalogs.  The code path of CREATE INDEX CONCURRENTLY already
complains about that, but if a REINDEX is issued then the error
generated is confusing.

While on it, add more tests to check restrictions on catalog indexes and
on toast table/index for catalogs.  Some error messages are improved,
with wording suggestions coming from Tom Lane.

Reported-by: Tom Lane
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/23694.1556806002@sss.pgh.pa.us
2019-05-10 08:18:46 +09:00
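To illustrate the restriction being tightened (the second statement uses a hypothetical user index):

    -- system catalogs cannot be reindexed concurrently; this now fails with a
    -- clear error message instead of a confusing one
    REINDEX TABLE CONCURRENTLY pg_catalog.pg_class;

    -- ordinary user relations are unaffected
    REINDEX INDEX CONCURRENTLY my_table_idx;
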
Tom Lane
24c19e9f66 Repair issues with faulty generation of merge-append plans.
create_merge_append_plan failed to honor the CP_EXACT_TLIST flag:
it would generate the expected targetlist but then it felt free to
add resjunk sort targets to it.  This demonstrably leads to assertion
failures in v11 and HEAD, and it's probably just accidental that we
don't see the same in older branches.  I've not looked into whether
there would be any real-world consequences in non-assert builds.
In HEAD, create_append_plan has sprouted the same problem, so fix
that too (although we do not have any test cases that seem able to
reach that bug).  This is an oversight in commit 3fc6e2d7f which
invented the CP_EXACT_TLIST flag, so back-patch to 9.6 where that
came in.

convert_subquery_pathkeys would create pathkeys for subquery output
values if they match any EquivalenceClass known in the outer query
and are available in the subquery's syntactic targetlist.  However,
the second part of that condition is wrong, because such values might
not appear in the subquery relation's reltarget list, which would
mean that they couldn't be accessed above the level of the subquery
scan.  We must check that they appear in the reltarget list, instead.
This can lead to dropping knowledge about the subquery's sort
ordering, but I believe it's okay, because any sort key that the
outer query actually has any interest in would appear in the
reltarget list.

This second issue is of very long standing, but right now there's no
evidence that it causes observable problems before 9.6, so I refrained
from back-patching further than that.  We can revisit that choice if
somebody finds a way to make it cause problems in older branches.
(Developing useful test cases for these issues is really problematic;
fixing convert_subquery_pathkeys removes the only known way to exhibit
the create_merge_append_plan bug, and neither of the test cases added
by this patch causes a problem in all branches, even when considering
the issues separately.)

The second issue explains bug #15795 from Suresh Kumar R ("could not
find pathkey item to sort" with nested DISTINCT queries).  I stumbled
across the first issue while investigating that.

Discussion: https://postgr.es/m/15795-fadb56c8e44ee73c@postgresql.org
2019-05-09 16:53:05 -04:00
Etsuro Fujita
edbcbe277d postgres_fdw: Fix cost estimation for aggregate pushdown.
In commit 7012b132d0, which added support for aggregate pushdown in
postgres_fdw, the expense of evaluating the final scan/join target
computed by make_group_input_target() was not accounted for at all in
costing aggregate pushdown paths with local statistics.  The right fix
for this would be to have a separate upper stage to adjust the final
scan/join relation (see comments for apply_scanjoin_target_to_paths());
but for now, fix by adding the tlist eval cost when costing aggregate
pushdown paths with local statistics.

Apply this to HEAD only to avoid destabilizing existing plan choices.

Author: Etsuro Fujita
Reviewed-By: Antonin Houska
Discussion: https://postgr.es/m/5C66A056.60007%40lab.ntt.co.jp
2019-05-09 18:39:23 +09:00
Thomas Munro
47a338cfcd Fix SxactGlobalXmin tracking.
Commit bb16aba50 broke the code that maintains SxactGlobalXmin.  It
could get stuck when a well-timed READ ONLY transaction runs.  If
SxactGlobalXmin stops advancing, transactions on the
FinishedSerializableTransactions queue are never cleaned up, so
resources are effectively leaked.  Revert that hunk of the commit.

Also revert another similar hunk that was probably harmless, but
unnecessary and unjustified, relating to the DOOMED flag in case of
RO_SAFE early release.

Author: Thomas Munro
Reported-by: Tom Lane
Discussion: https://postgr.es/m/16170.1557251214%40sss.pgh.pa.us
2019-05-09 20:32:26 +12:00
Tom Lane
2d7d946cd3 Clean up the behavior and API of catalog.c's is-catalog-relation tests.
The right way for IsCatalogRelation/Class to behave is to return true
for OIDs less than FirstBootstrapObjectId (not FirstNormalObjectId),
without any of the ad-hoc fooling around with schema membership.

The previous code was wrong because (1) it claimed that
information_schema tables were not catalog relations but their toast
tables were, which is silly; and (2) if you dropped and recreated
information_schema, which is a supported operation, the behavior
changed.  That's even sillier.  With this definition, "catalog
relations" are exactly the ones traceable to the postgres.bki data,
which seems like what we want.

With this simplification, we don't actually need access to the pg_class
tuple to identify a catalog relation; we only need its OID.  Hence,
replace IsCatalogClass with "IsCatalogRelationOid(oid)".  But keep
IsCatalogRelation as a convenience function.

This allows fixing some arguably-wrong semantics in contrib/sepgsql and
ReindexRelationConcurrently, which were using an IsSystemNamespace test
where what they really should be using is IsCatalogRelationOid.  The
previous coding failed to protect toast tables of system catalogs, and
also was not on board with the general principle that user-created tables
do not become catalogs just by virtue of being renamed into pg_catalog.
We can also get rid of a messy hack in ReindexMultipleTables.

While we're at it, also rename IsSystemNamespace to IsCatalogNamespace,
because the previous name invited confusion with the more expansive
semantics used by IsSystemRelation/Class.

Also improve the comments in catalog.c.

There are a few remaining places in replication-related code that are
special-casing OIDs below FirstNormalObjectId.  I'm inclined to think
those are wrong too, and if there should be any special case it should
just extend to FirstBootstrapObjectId.  But first we need to debate
whether a FOR ALL TABLES publication should include information_schema.

Discussion: https://postgr.es/m/21697.1557092753@sss.pgh.pa.us
Discussion: https://postgr.es/m/15150.1557257111@sss.pgh.pa.us
2019-05-08 23:27:38 -04:00
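
As a rough illustration of the OID-based rule described above, here is a
minimal sketch, assuming only the usual constant from access/transam.h; it
is not the committed code:

    #include "postgres.h"
    #include "access/transam.h"     /* FirstBootstrapObjectId */

    /*
     * Sketch: a "catalog relation" is exactly one whose OID was assigned
     * from the postgres.bki data, i.e. below FirstBootstrapObjectId.
     */
    static bool
    sketch_IsCatalogRelationOid(Oid relid)
    {
        return relid < FirstBootstrapObjectId;
    }
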
Peter Geoghegan
d95e36dc38 Remove obsolete nbtree split REDO routine comment.
Commit dd299df818, which added suffix truncation to nbtree, simplified
the WAL record format used by page splits.  It became necessary to
explicitly WAL-log the new high key for the left half of a split in all
cases, which relieved the REDO routine from having to reconstruct a new
high key for the left page by copying the first item from the right
page.  Remove a comment that referred to the previous practice.
2019-05-08 12:47:20 -07:00
Alvaro Herrera
61639816b8 Fix error messages
Some messages related to foreign servers were reporting the server name
without quotes, or not at all; our style is to have all names be quoted,
and the server name already appears quoted in a few other messages, so
just add quotes and make them all consistent.

Remove an extra "s" in other messages (typos introduced by myself in
f56f8f8da6).
2019-05-08 13:20:16 -04:00
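
For reference, the message style being aligned with here quotes object
names, along these lines (a sketch; the variable and message text are
illustrative, not taken from the patch):

    const char *servername = "myserver";    /* hypothetical name */

    /* the name is wrapped in escaped double quotes in the format string */
    ereport(ERROR,
            (errcode(ERRCODE_UNDEFINED_OBJECT),
             errmsg("server \"%s\" does not exist", servername)));
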
Peter Eisentraut
add85ead4a Fix table lock levels for REINDEX INDEX CONCURRENTLY
REINDEX CONCURRENTLY locks tables with ShareUpdateExclusiveLock rather
than the ShareLock used by a plain REINDEX.  However,
RangeVarCallbackForReindexIndex() was not updated for that and still
used the ShareLock only.  This would lead to lock upgrades later,
leading to possible deadlocks.

Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/20190430151735.wi52sxjvxsjvaxxt%40alap3.anarazel.de
2019-05-08 14:30:23 +02:00
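
In sketch form, the callback has to pick the same lock level as the main
REINDEX code path; something like the following, where "concurrent" is a
hypothetical flag:

    /*
     * Take ShareUpdateExclusiveLock up front for the concurrent case;
     * starting with ShareLock and upgrading later can deadlock.
     */
    LOCKMODE    lockmode = concurrent ? ShareUpdateExclusiveLock : ShareLock;
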
Heikki Linnakangas
256be1050c Remove leftover reference to old "flat file" mechanism in a comment.
The flat file mechanism was removed in PostgreSQL 9.0.
2019-05-08 09:32:34 +03:00
Peter Geoghegan
d65b5ccad6 Correct obsolete nbtsort.c minimum key comment.
It is no longer possible under any circumstances for nbtree code to
reconstruct a strict lower bound key (parent page's pivot tuple key) for
a right sibling page by retrieving the first item in the right sibling
page.
2019-05-07 21:42:12 -07:00
Alexander Korotkov
29ceacc3f9 Improve error reporting in jsonpath
This commit contains multiple improvements to error reporting in jsonpath
including, but not limited to, getting rid of the following things:

 * definition of error messages in macros,
 * errdetail() when valuable information could fit into errmsg(),
 * the word "singleton", which is not properly explained anywhere,
 * line breaks in error messages.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/14890.1555523005%40sss.pgh.pa.us
Author: Alexander Korotkov
Reviewed-by: Tom Lane
2019-05-08 01:02:59 +03:00
Fujii Masao
b84dbc8eb8 Add TRUNCATE parameter to VACUUM.
This commit adds a new parameter to the VACUUM command, TRUNCATE,
which specifies that VACUUM should attempt to truncate off
any empty pages at the end of the table and allow the disk space
for the truncated pages to be returned to the operating system.

This parameter, if specified, overrides the vacuum_truncate
reloption. If neither the reloption nor the VACUUM option is
used, the default is true, as before.

Author: Fujii Masao
Reviewed-by: Julien Rouhaud, Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoD+qtrSDL=GSma4Wd3kLYLeRC0hPna-YAdkDeV4z156vg@mail.gmail.com
2019-05-08 02:10:33 +09:00
Magnus Hagander
98719af6c2 Fix typos and clarify a comment
Author: Daniel Gustafsson <daniel@yesql.se>
2019-05-07 18:26:09 +02:00
Tom Lane
8d0ddccec6 Avoid "invalid memory alloc request size" while reading pg_stat_activity.
On a 64-bit machine, if you set track_activity_query_size and
max_connections such that their product exceeds 1GB, shared memory
setup will still succeed (given enough RAM), but attempts to read
pg_stat_activity fail with "invalid memory alloc request size".
Work around that by using MemoryContextAllocHuge to allocate the
local copy of the activity strings.  Using the "huge" API costs us
nothing extra in normal cases, and it seems better than throwing
an error and/or explaining to people why they can't do this.

This situation seems insanely profligate today, but who knows what
people will consider normal in ten or twenty years?  So let's fix it
in HEAD but not worry about a back-patch.

Per report from James Tomson.

Discussion: https://postgr.es/m/1CFDCCD6-B268-48D8-85C8-400D2790B2C3@pushd.com
2019-05-07 11:41:37 -04:00
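
A self-contained sketch of the allocation change, with made-up sizes
standing in for max_connections times track_activity_query_size:

    #include "postgres.h"
    #include "utils/memutils.h"

    static char *
    alloc_local_activity_sketch(void)
    {
        Size    nslots = 2000;              /* made-up backend count */
        Size    per_slot = 1024 * 1024;     /* 1 MB of query text per backend */
        Size    total = nslots * per_slot;  /* about 2 GB, beyond MaxAllocSize */

        /*
         * palloc() would fail here with "invalid memory alloc request size";
         * MemoryContextAllocHuge() has no such ~1 GB cap.
         */
        return MemoryContextAllocHuge(CurrentMemoryContext, total);
    }
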
Amit Kapila
7db0cde6b5 Revert "Avoid the creation of the free space map for small heap relations".
This feature was using a process local map to track the first few blocks
in the relation.  The map was reset each time we got a block with enough
freespace.  It was discussed that it would be better to track this map on
a per-relation basis in relcache and then invalidate the same whenever
vacuum frees up some space in the page or when FSM is created.  The new
design would be better both in terms of API design and performance.

List of commits reverted, in reverse chronological order:

06c8a5090e  Improve code comments in b0eaa4c51b.
13e8643bfc  During pg_upgrade, conditionally skip transfer of FSMs.
6f918159a9  Add more tests for FSM.
9c32e4c350  Clear the local map when not used.
29d108cdec  Update the documentation for FSM behavior..
08ecdfe7e5  Make FSM test portable.
b0eaa4c51b  Avoid creation of the free space map for small heap relations.

Discussion: https://postgr.es/m/20190416180452.3pm6uegx54iitbt5@alap3.anarazel.de
2019-05-07 09:30:24 +05:30
Dean Rasheed
a0905056fd Use checkAsUser for selectivity estimator checks, if it's set.
In examine_variable() and examine_simple_variable(), when checking the
user's table and column privileges to determine whether to grant
access to the pg_statistic data, use checkAsUser for the privilege
checks, if it's set. This will be the case if we're accessing the
table via a view, to indicate that we should perform privilege checks
as the view owner rather than the current user.

This change makes this planner check consistent with the check in the
executor, so the planner will be able to make use of statistics if the
table is accessible via the view. This fixes a performance regression
introduced by commit e2d4ef8de8, which affects queries against
non-security barrier views in the case where the user doesn't have
privileges on the underlying table, but the view owner does.

Note that it continues to provide the same safeguards controlling
access to pg_statistic for direct table access (in which case
checkAsUser won't be set) and for security barrier views, because of
the nearby checks on rte->security_barrier and rte->securityQuals.

Back-patch to all supported branches because e2d4ef8de8 was.

Dean Rasheed, reviewed by Jonathan Katz and Stephen Frost.
2019-05-06 11:54:32 +01:00
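
The core of the change is to substitute checkAsUser for the current user in
the ACL check, roughly as in this sketch (the surrounding planner details
are assumptions, not a quotation of the patch):

    Oid     userid;

    /* when the rel is reached through a view, check as the view owner */
    userid = OidIsValid(rte->checkAsUser) ? rte->checkAsUser : GetUserId();

    if (pg_class_aclcheck(rte->relid, userid, ACL_SELECT) == ACLCHECK_OK)
        vardata->acl_ok = true;     /* pg_statistic data may be consulted */
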
Dean Rasheed
1aebfbea83 Fix security checks for selectivity estimation functions with RLS.
In commit e2d4ef8de8, security checks were added to prevent
user-supplied operators from running over data from pg_statistic
unless the user has table or column privileges on the table, or the
operator is leakproof. For a table with RLS, however, checking for
table or column privileges is insufficient, since that does not
guarantee that the user has permission to view all of the column's
data.

Fix this by also checking for securityQuals on the RTE, and insisting
that the operator be leakproof if there are any. Thus the
leakproofness check will only be skipped if there are no securityQuals
and the user has table or column privileges on the table -- i.e., only
if we know that the user has access to all the data in the column.

Back-patch to 9.5 where RLS was added.

Dean Rasheed, reviewed by Jonathan Katz and Stephen Frost.

Security: CVE-2019-10130
2019-05-06 11:38:43 +01:00
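
A simplified sketch of the resulting rule; the variables opr_oid and
user_has_privilege are hypothetical, and the committed logic is spread
across examine_variable() and its callers:

    bool    ok_to_use_stats;

    if (rte->securityQuals != NIL)
        /* RLS or security-barrier quals present: operator must be leakproof */
        ok_to_use_stats = get_func_leakproof(get_opcode(opr_oid));
    else
        /* no securityQuals: the table/column privilege check is enough */
        ok_to_use_stats = user_has_privilege;
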
Tom Lane
bd5e8b627b Bring pg_nextoid()'s error messages into line with message style guide.
Noticed while reviewing nearby code.  Given all the disclaimers about
this not being meant as user-facing code, I wonder whether we should
make these non-translatable?  But in any case there's little excuse
for them not to be good English.
2019-05-05 17:06:53 -04:00
Tom Lane
9691aa72e2 Fix style violations in syscache lookups.
Project style is to check the success of SearchSysCacheN and friends
by applying HeapTupleIsValid to the result.  A tiny minority of calls
creatively did it differently.  Bring them into line with the rest.

This is just cosmetic, since HeapTupleIsValid is indeed just a null
check at the moment ... but that may not be true forever, and in any
case it puts a mental burden on readers who may wonder why these
call sites are not like the rest.

Back-patch to v11 just to keep the branches in sync.  (The bulk of these
errors seem to have originated in v11 or v12, though a few are old.)

Per searching to see if anyplace else had made the same error
repaired in 62148c352.
2019-05-05 13:10:07 -04:00
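
For reference, the project-standard pattern being enforced looks like this
(the RELOID lookup is just an example; relid is assumed to be in scope):

    HeapTuple   tup;

    tup = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
    if (!HeapTupleIsValid(tup))
        elog(ERROR, "cache lookup failed for relation %u", relid);
    /* ... examine the tuple ... */
    ReleaseSysCache(tup);
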
Tom Lane
62148c3520 Add check for syscache lookup failure in update_relispartition().
Omitted in commit 05b38c7e6 (though it looks like the original blame
belongs to 9e9befac4).  A failure is admittedly unlikely, but if it
did happen, SIGSEGV is not the approved method of reporting it.

Per Coverity.  Back-patch to v11 where the broken code originated.
2019-05-05 12:44:32 -04:00
Peter Geoghegan
7b37f4b02e Correct more obsolete nbtree page split comments.
Commit 3f342839 corrected obsolete comments about buffer locks at the
main _bt_insert_parent() call site, but missed similar obsolete comments
above _bt_insert_parent() itself.  Both sets of comments were rendered
obsolete by commit 40dae7ec53, which made the nbtree page split
algorithm more robust.  Fix the comments that were missed the first time
around now.

In passing, refine a related _bt_insert_parent() comment about
re-finding the parent page to insert new downlink.
2019-05-03 13:34:45 -07:00
Tom Lane
f884dca495 Remove RelationSetIndexList().
In the wake of commit f912d7dec, RelationSetIndexList isn't used any
more.  It was always a horrid wart, so getting rid of it is very nice.
We can also convert rd_indexvalid back to a plain boolean.

Discussion: https://postgr.es/m/28926.1556664156@sss.pgh.pa.us
2019-05-03 10:26:14 -04:00
Tom Lane
f912d7dec2 Fix reindexing of pg_class indexes some more.
Commits 3dbb317d3 et al failed under CLOBBER_CACHE_ALWAYS testing.
Investigation showed that to reindex pg_class_oid_index, we must
suppress accesses to the index (via SetReindexProcessing) before we call
RelationSetNewRelfilenode, or at least before we do CommandCounterIncrement
therein; otherwise, relcache reloads happening within the CCI may try to
fetch pg_class rows using the index's new relfilenode value, which is as
yet an empty file.

Of course, the point of 3dbb317d3 was that that ordering didn't work
either, because then RelationSetNewRelfilenode's own update of the index's
pg_class row cannot access the index, should it need to.

There are various ways we might have got around that, but Andres Freund
came up with a brilliant solution: for a mapped index, we can really just
skip the pg_class update altogether.  The only fields it was actually
changing were relpages etc, but it was just setting them to zeroes which
is useless make-work.  (Correct new values will be installed at the end
of index build.)  All pg_class indexes are mapped and probably always will
be, so this eliminates the problem by removing work rather than adding it,
always a pleasant outcome.  Having taught RelationSetNewRelfilenode to do
it that way, we can revert the code reordering in reindex_index.  (But
I left the moved setup code where it was; there seems no reason why it
has to run without use of the old index.  If you're trying to fix a
busted pg_class index, you'll have had to disable system index use
altogether to get this far.)

Moreover, this means we don't need RelationSetIndexList at all, because
reindex_relation's hacking to make "REINDEX TABLE pg_class" work is
likewise now unnecessary.  We'll leave that code in place in the back
branches, but a follow-on patch will remove it in HEAD.

In passing, do some minor cleanup for commit 5c1560606 (in HEAD only),
notably removing a duplicate newrnode assignment.

Patch by me, using a core idea due to Andres Freund.  Back-patch to all
supported branches, as 3dbb317d3 was.

Discussion: https://postgr.es/m/28926.1556664156@sss.pgh.pa.us
2019-05-02 19:11:28 -04:00
Alvaro Herrera
2bf372a4ae heap_prepare_freeze_tuple: Simplify coding
Commit d2599ecfcc introduced some contorted, confusing code: readers
would think that it's possible for HeapTupleHeaderGetXmin to return a
non-frozen value for some frozen tuples, which would be disastrous.
There's no actual bug, but it seems better to make it clearer.

Per gripe from Tom Lane and Andres Freund.
Discussion: https://postgr.es/m/30116.1555430496@sss.pgh.pa.us
2019-05-02 16:14:08 -04:00
Peter Geoghegan
6dd86c269d Fix nbtsort.c's page space accounting.
Commit dd299df818, which made heap TID a tiebreaker nbtree index
column, introduced new rules on page space management to make suffix
truncation safe.  In general, suffix truncation needs to have a small
amount of extra space available on the new left page when splitting a
leaf page.  This is needed in case it turns out that truncation cannot
even "truncate away the heap TID column", resulting in a
larger-than-firstright leaf high key with an explicit heap TID
representation.

Despite all this, CREATE INDEX/nbtsort.c did not account for the
possible need for extra heap TID space on leaf pages when deciding
whether or not a new item could fit on the current page.  This could lead to
"failed to add item to the index page" errors when CREATE
INDEX/nbtsort.c tried to finish off a leaf page that lacked space for a
larger-than-firstright leaf high key (it only had space for firstright
tuple, which was just short of what was needed following "truncation").

Several conditions needed to be met all at once for CREATE INDEX to
fail.  The problem was in the hard limit on what will fit on a page,
which tends to be masked by the soft fillfactor-wise limit.  The easiest
way to recreate the problem seems to be a CREATE INDEX on a low
cardinality text column, with tuples that are of non-uniform width,
using a fillfactor of 100.

To fix, bring nbtsort.c in line with nbtsplitloc.c, which already
pessimistically assumes that all leaf page splits will have high keys
that have a heap TID appended.

Reported-By: Andreas Joseph Krogh
Discussion: https://postgr.es/m/VisenaEmail.c5.3ee7fe277d514162.16a6d785bea@tc7-visena
2019-05-02 12:33:35 -07:00
Robert Haas
dd69597988 Fix some problems with VACUUM (INDEX_CLEANUP FALSE).
The new nleft_dead_tuples and nleft_dead_itemids fields are confusing
and do not seem like the correct way forward.  One of them is tested
via an assertion that can fail, as it has already done on buildfarm
member topminnow.  Remove the assertion and the fields.

Change the logic for the case where a tuple is not initially pruned
by heap_page_prune but later diagnosed HEAPTUPLE_DEAD by
HeapTupleSatisfiesVacuum.  Previously, tupgone = true was set in
that case, which leads to treating the tuple as one that will be
removed.  In a normal vacuum, that's OK, because we'll remove
index entries for it and then the second heap pass will remove the
tuple itself, but when index cleanup is disabled, those things
don't happen, so we must instead treat it as a recently-dead
tuple that we have voluntarily chosen to keep.

Report and analysis by Tom Lane.  This patch loosely based on one
from Masahiko Sawada, but I changed most of it.
2019-05-02 10:07:13 -04:00
Magnus Hagander
659e53498c Fix union for pgstat message types
The message types for temp files and for checksum failures were missing
from the union.  Due to the coding style used, there was no compiler error
when this happened.  So change the code to actively use the union, thereby
producing a compiler error if the same mistake happens again, as suggested
by Tom Lane.

Author: Julien Rouhaud
Reported-By: Tomas Vondra
Discussion: https://postgr.es/m/20190430163328.zd4rrlnbvgaqlcdz@development
2019-05-01 12:30:44 +02:00
Andres Freund
4b40d40b30 Fix unused variable compiler warning in !debug builds.
Introduced in 3dbb317d3.  Fix by using the new local variable in more
places.

Reported-By: Bruce Momjian (off-list)
Backpatch: 9.4-, like 3dbb317d3
2019-04-30 17:45:32 -07:00
Andres Freund
a0b5bb6e02 Improve comment spelling and style in llvmjit_deform.c.
Author: Justin Pryzby
Discussion:
    https://postgr.es/m/20190408141828.GE10080@telsasoft.com
    https://postgr.es/m/20181127184133.GM10913@telsasoft.com
2019-04-30 16:20:07 -07:00
Andres Freund
3a48005b00 Improve code inferring length of bitmap for JITed tuple deforming.
While discussing comment improvements (see next commit) by Justin
Pryzby, Tom complained about a few details of the logic to infer the
length of the NULL bitmap when building the JITed tuple deforming
function.  That bitmap makes it possible to avoid checking the tuple header's
natts, a check which often causes a pipeline stall.

Improvements:
a) As long as missing columns aren't taken into account, we can
   continue to infer the length of the NULL bitmap from NOT NULL
   columns following it. Previously we stopped at the first missing
   column.  It's unlikely to matter much in practice, but the
   alternative would have been to document why we stop.
b) For robustness reasons it seems better to also check against
   attisdropped - RemoveAttributeById() sets attnotnull to false, but
   an additional check is trivial.
c) Improve related comments

Discussion: https://postgr.es/m/20637.1555957068@sss.pgh.pa.us
Backpatch: -
2019-04-30 16:20:07 -07:00
Tom Lane
e03ff73969 Clean up handling of constraint_exclusion and enable_partition_pruning.
The interaction of these parameters was a bit confused/confusing,
and in fact v11 entirely misses the opportunity to apply partition
constraints when a partition is accessed directly (rather than
indirectly from its parent).

In HEAD, establish the principle that enable_partition_pruning controls
partition pruning and nothing else.  When accessing a partition via its
parent, we do partition pruning (if enabled by enable_partition_pruning)
and then there is no need to consider partition constraints in the
constraint_exclusion logic.  When accessing a partition directly, its
partition constraints are applied by the constraint_exclusion logic,
only if constraint_exclusion = on.

In v11, we can't have such a clean division of these GUCs' effects,
partly because we don't want to break compatibility too much in a
released branch, and partly because the clean coding requires
inheritance_planner to have applied partition pruning to a partitioned
target table, which it doesn't in v11.  However, we can tweak things
enough to cover the missed case, which seems like a good idea since
it's potentially a performance regression from v10.  This patch keeps
v11's previous behavior in which enable_partition_pruning overrides
constraint_exclusion for an inherited target table, though.

In HEAD, also teach relation_excluded_by_constraints that it's okay to use
inheritable constraints when trying to prune a traditional inheritance
tree.  This might not be thought worthy of effort given that that feature
is semi-deprecated now, but we have enough infrastructure that it only
takes a couple more lines of code to do it correctly.

Amit Langote and Tom Lane

Discussion: https://postgr.es/m/9813f079-f16b-61c8-9ab7-4363cab28d80@lab.ntt.co.jp
Discussion: https://postgr.es/m/29069.1555970894@sss.pgh.pa.us
2019-04-30 15:03:50 -04:00
Alvaro Herrera
9f8b717a80 Message style fixes 2019-04-30 10:33:37 -04:00
Alvaro Herrera
9a83afecb7 Widen tuple counter variables from long to int64
Mistake in ab0dfc961b6a; progress reporting would have wrapped around
for indexes created with more than 2^31 tuples.

Reported-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-Wz=WbNxc5ob5NJ9yqo2RMJ0q4HXDS30GVCobeCvC9A1L9A@mail.gmail.com
2019-04-30 10:27:38 -04:00
Andres Freund
3dbb317d32 Fix potential assertion failure when reindexing a pg_class index.
When reindexing individual indexes on pg_class it was possible to
trigger an assertion failure:
TRAP: FailedAssertion("!(!ReindexIsProcessingIndex(((index)->rd_id)))

That's because reindex_index() called SetReindexProcessing() - which
enables an assert ensuring no index insertions happen into the index
- before calling RelationSetNewRelfilenode().  That is not correct for
indexes on pg_class, because RelationSetNewRelfilenode() updates the
relevant pg_class row, which needs to update the indexes.

There are two reasons this wasn't noticed earlier.  Firstly, the bug
doesn't trigger when reindexing all of pg_class, as reindex_relation
has code "hiding" all yet-to-be-reindexed indexes.  Secondly, the bug
only triggers when the update to pg_class doesn't turn out to be a
HOT update - otherwise there's no index insertion to trigger the
bug. Most of the time there's enough space, making this bug hard to
trigger.

To fix, move RelationSetNewRelfilenode() to before the
SetReindexProcessing() (and, together with some other code, to outside
of the PG_TRY()).

To make sure the error checking intended by SetReindexProcessing() is
more robust, modify CatalogIndexInsert() to check
ReindexIsProcessingIndex() even when the update is a HOT update.

Also add a few regression tests for REINDEXing of system catalogs.

The last two improvements would have prevented some of the issues
fixed in 5c1560606d from being introduced in the first place.

Reported-By: Michael Paquier
Diagnosed-By: Tom Lane and Andres Freund
Author: Andres Freund
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/20190418011430.GA19133@paquier.xyz
Backpatch: 9.4-, the bug is present in all branches
2019-04-29 19:42:08 -07:00
Andres Freund
5c1560606d Fix several recently introduced issues around handling new relation forks.
Most of these stem from d25f519107 "tableam: relation creation, VACUUM
FULL/CLUSTER, SET TABLESPACE.".

1) To pass data to the relation_set_new_filenode()
   RelationSetNewRelfilenode() was made to update RelationData.rd_rel
   directly. That's not OK however, as it makes the relcache entries
   temporarily inconsistent. Which among other scenarios is a problem
   if a REINDEX targets an index on pg_class - the
   CatalogTupleUpdate() in RelationSetNewRelfilenode().  Presumably
   that was introduced because other places in the code do so - while
   those aren't "good practice" they don't appear to be actively
   buggy (e.g. because system tables may not be targeted).

   I (Andres) should have caught this while reviewing and significantly
   evolving the code in that commit, mea culpa.

   Fix that by instead passing in the new RelFileNode as separate
   argument to relation_set_new_filenode() and rely on the relcache to
   update the catalog entry. Also revert that the
   RelationMapUpdateMap() call was changed to immediate, and undo some
   other more unnecessary changes.

2) Document that the relation_set_new_filenode cannot rely on the
   whole relcache entry to be valid. It might be worthwhile to
   refactor the code to never have to rely on that, but given the way
   heap_create() is currently coded, that'd be a large change.

3) ATExecSetTableSpace() shouldn't do FlushRelationBuffers() itself. A
   table AM might not use shared buffers at all. Move to
   index_copy_data() and heapam_relation_copy_data().

4) heapam_relation_set_new_filenode() previously sometimes accessed
   rel->rd_rel->relpersistence rather than the `persistence`
   argument. Code movement mistake.

5) Previously heapam_relation_set_new_filenode() re-opened the smgr
   relation to create the init fork, if necessary.  Instead have
   RelationCreateStorage() return the SMgrRelation and use it to
   create the init fork.

6) Add a note about the danger of modifying the relcache directly to
   ATExecSetTableSpace() - it's currently not a bug because there's a
   check ERRORing for catalog tables.

Regression tests and assertion improvements that together trigger the
bug described in 1) will be added in a later commit, as there is a
related bug on all branches.

Reported-By: Michael Paquier
Diagnosed-By: Tom Lane and Andres Freund
Author: Andres Freund
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/20190418011430.GA19133@paquier.xyz
2019-04-29 19:28:05 -07:00
Peter Geoghegan
9ee7414ed0 Remove obsolete _bt_insert_parent() comment.
Remove a comment that refers to a coding practice that was fully removed
by commit a8b8f4db, which introduced MarkBufferDirty().  It looks like
the comment was even obsolete before then, since it concerns
write-ordering dependencies with synchronous buffer writes.
2019-04-29 14:14:38 -07:00
Tom Lane
a1a789eb5a In walreceiver, don't try to do ereport() in a signal handler.
This is quite unsafe, even for the case of ereport(FATAL) where we won't
return control to the interrupted code, and despite this code's use of
a flag to restrict the areas where we'd try to do it.  It's possible
for example that we interrupt malloc or free while that's holding a lock
that's meant to protect against cross-thread interference.  Then, any
attempt to do malloc or free within ereport() will result in a deadlock,
preventing the walreceiver process from exiting in response to SIGTERM.
We hypothesize that this explains some hard-to-reproduce failures seen
in the buildfarm.

Hence, get rid of the immediate-exit code in WalRcvShutdownHandler,
as well as the logic associated with WalRcvImmediateInterruptOK.
Instead, we need to take care that potentially-blocking operations
in the walreceiver's data transmission logic (libpqwalreceiver.c)
will respond reasonably promptly to the process's latch becoming
set and then call ProcessWalRcvInterrupts.  Much of the needed code
for that was already present in libpqwalreceiver.c.  I refactored
things a bit so that all the uses of PQgetResult use latch-aware
waiting, but didn't need to do much more.

These changes should be enough to ensure that libpqwalreceiver.c
will respond promptly to SIGTERM whenever it's waiting to receive
data.  In principle, it could block for a long time while waiting
to send data too, and this patch does nothing to guard against that.
I think that that hazard is mostly theoretical though: such blocking
should occur only if we fill the kernel's data transmission buffers,
and we don't generally send enough data to make that happen without
waiting for input.  If we find out that the hazard isn't just
theoretical, we could fix it by using PQsetnonblocking, but that
would require more ticklish changes than I care to make now.

This is a bug fix, but it seems like too big a change to push into
the back branches without much more testing than there's time for
right now.  Perhaps we'll back-patch once we have more confidence
in the change.

Patch by me; thanks to Thomas Munro for review.

Discussion: https://postgr.es/m/20190416070119.GK2673@paquier.xyz
2019-04-29 12:26:07 -04:00
Peter Eisentraut
cd3e27464c Fix potential catalog corruption with temporary identity columns
If a temporary table with an identity column and ON COMMIT DROP is
created in a single-statement transaction (not useful, but allowed),
it would leave the catalog corrupted.  We need to add a
CommandCounterIncrement() so that PreCommit_on_commit_actions() sees
the created dependency between table and sequence and can clean it
up.

The analogous and more useful case of doing this in a transaction
block already runs some CommandCounterIncrement() before it gets to
the on-commit cleanup, so it wasn't a problem in practical use.

Several locations for placing the new CommandCounterIncrement() call
were discussed.  This patch places it at the end of
standard_ProcessUtility().  That would also help if other commands
were to create catalog entries that some on-commit action would like
to see.

Bug: #15631
Reported-by: Serge Latyntsev <dnsl48@gmail.com>
Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
2019-04-29 08:49:03 +02:00
Noah Misch
90e7f31773 Use preprocessor conditions compatible with Emacs indent.
Emacs wrongly indented hundreds of subsequent lines.
2019-04-28 12:56:53 -07:00
Tom Lane
e481d26285 Clean up minor warnings from buildfarm.
Be more consistent about use of XXXGetDatum macros in new jsonpath
code.  This is mostly to avoid having code that looks randomly
different from everyplace else that's doing the exact same thing.

In pg_regress.c, avoid an unreferenced-function warning from
compilers that don't understand pg_attribute_unused().  Putting
the function inside the same #ifdef as its only caller is more
straightforward coding anyway.

In be-secure-openssl.c, avoid use of pg_attribute_unused() on a label.
That's pretty creative, but there's no good reason to suppose that
it's portable, and there's absolutely no need to use goto's here in the
first place.  (This wasn't actually causing any buildfarm complaints,
but it's new code in v12 so it has no portability track record.)
2019-04-28 12:45:55 -04:00
Tom Lane
c01eb619a8 Apply stopgap fix for bug #15672.
Fix DefineIndex so that it doesn't attempt to pass down a to-be-reused
index relfilenode to a child index creation, and fix TryReuseIndex
to not think that reuse is sensible for a partitioned index.

In v11, this fixes a problem where ALTER TABLE on a partitioned table
could assign the same relfilenode to several different child indexes,
causing very nasty catalog corruption --- in fact, attempting to DROP
the partitioned table then leads not only to a database crash, but to
inability to restart because the same crash will recur during WAL replay.

Either of these two changes would be enough to prevent the failure, but
since neither action could possibly be sane, let's put in both changes
for future-proofing.

In HEAD, no such bug manifests, but that's just an accidental consequence
of having changed the pg_class representation of partitioned indexes to
have relfilenode = 0.  Both of these changes still seem like smart
future-proofing.

This is only a stop-gap because the code for ALTER TABLE on a partitioned
table with a no-op type change still leaves a great deal to be desired.
As the added regression tests show, it gets things wrong for comments on
child indexes/constraints, and it is regenerating child indexes it doesn't
have to.  However, fixing those problems will take more work which may not
get back-patched into v11.  We need a fix for the corruption problem now.

Per bug #15672 from Jianing Yang.

Patch by me, regression test cases based on work by Amit Langote,
who also did a lot of the investigative work.

Discussion: https://postgr.es/m/15672-b9fa7db32698269f@postgresql.org
2019-04-26 17:18:07 -04:00
Alvaro Herrera
05b38c7e63 Fix partitioned index attachment
When an existing index in a partition is attached to a new index on
its parent, we forgot to set the "relispartition" flag correctly, which
meant that it was not possible to find the index in various operations,
such as adding a foreign key constraint that references that partitioned
table.  One of four places that was assigning the parent index was
forgetting to do that, so fix by shifting responsibility of updating the
flag to the routine that changes the parent.

Author: Amit Langote, Álvaro Herrera
Reported-by: Hubert "depesz" Lubaczewski
Discussion: https://postgr.es/m/CA+HiwqHMsRtRYRWYTWavKJ8x14AFsv7bmAV46mYwnfD3vy8goQ@mail.gmail.com
2019-04-25 11:22:29 -04:00
Fujii Masao
978b032d1f Fix function names in comments.
Commit 3eb77eba5a renamed some functions, but forgot to
update some comments referencing those functions.
This commit fixes those function names in the comments.

Kyotaro Horiguchi
2019-04-25 23:43:48 +09:00
Alvaro Herrera
87259588d0 Fix tablespace inheritance for partitioned rels
Commit ca4103025d left a few loose ends.  The most important one
(broken pg_dump output) is already fixed by virtue of commit
3b23552ad8, but some things remained:

* When ALTER TABLE rewrites tables, the indexes must remain in the
  tablespace they were originally in.  This didn't work because
  index recreation during ALTER TABLE runs manufactured SQL (yuck),
  which runs afoul of default_tablespace in competition with the parent
  relation tablespace.  To fix, reset default_tablespace to the empty
  string temporarily, and add the TABLESPACE clause as appropriate.

* Setting a partitioned rel's tablespace to the database default is
  confusing; if it worked, it would direct the partitions to that
  tablespace regardless of default_tablespace.  But in reality it does
  not work, and making it work is a larger project.  Therefore, throw
  an error when this condition is detected, to alert the unwary.

Add some docs and tests, too.

Author: Álvaro Herrera
Discussion: https://postgr.es/m/CAKJS1f_1c260nOt_vBJ067AZ3JXptXVRohDVMLEBmudX1YEx-A@mail.gmail.com
2019-04-25 10:31:32 -04:00
Tom Lane
0fae846232 Fix some minor postmaster-state-machine issues.
In sigusr1_handler, don't ignore PMSIGNAL_ADVANCE_STATE_MACHINE based
on pmState.  The restriction is unnecessary (PostmasterStateMachine
should work in any state), not future-proof (since it makes too many
assumptions about why the signal might be sent), and broken even today
because a race condition can make it necessary to respond to the signal
in PM_WAIT_READONLY state.  The race condition seems unlikely, but
if it did happen, a hot-standby postmaster could fail to shut down
after receiving a smart-shutdown request.

In MaybeStartWalReceiver, don't clear the WalReceiverRequested flag
if the fork attempt fails.  Leaving it set allows us to try
again in future iterations of the postmaster idle loop.  (The startup
process would eventually send a fresh request signal, but this change
may allow us to retry the fork sooner.)

Remove an obsolete comment and unnecessary test in
PostmasterStateMachine's handling of PM_SHUTDOWN_2 state.  It's not
possible to have a live walreceiver in that state, and AFAICT has not
been possible since commit 5e85315ea.  This isn't a live bug, but the
false comment is quite confusing to readers.

In passing, rearrange sigusr1_handler's CheckPromoteSignal tests so that
we don't uselessly perform stat() calls that we're going to ignore the
results of.

Add some comments clarifying the behavior of MaybeStartWalReceiver;
I very nearly rearranged it in a way that'd reintroduce the race
condition fixed in e5d494d78.  Mea culpa for not commenting that
properly at the time.

Back-patch to all supported branches.  The PMSIGNAL_ADVANCE_STATE_MACHINE
change is the only one of even minor significance, but we might as well
keep this code in sync across branches.

Discussion: https://postgr.es/m/9001.1556046681@sss.pgh.pa.us
2019-04-24 14:15:44 -04:00
Alvaro Herrera
0a999e1290 Unify error messages
... for translatability purposes.
2019-04-24 09:26:13 -04:00
Andres Freund
fdc7efcc30 Allow pg_class xid & multixid horizons to not be set.
This allows table AMs that don't need these horizons. This was already
documented in the tableam relation_set_new_filenode callback, but an
assert prevented it from actually working (the test AM code contained
the change itself). Defang the asserts in the general code, and move
the stronger ones into heap AM.

Relatedly, after CLUSTER/VACUUM, we'd always assign a relfrozenxid /
relminmxid. Change the table_relation_copy_for_cluster() interface to
allow the AM to overwrite the horizons that get set on the pg_class
entry.  This'd also in the future allow AMs like heap to compute a
relfrozenxid during rewrite that's the table's actual minimum rather
than a pre-determined value.  Arguably it'd have been better to move
the whole computation / setting of those values into the callback, but
it seems likely that for other reasons it'd be better to be able to
use one value to vacuum/cluster multiple tables (e.g. a toast's
horizon shouldn't be different than the table's).

Reported-By: Heikki Linnakangas
Author: Andres Freund
Discussion: https://postgr.es/m/9a7fb9cc-2419-5db7-8840-ddc10c93f122@iki.fi
2019-04-23 21:42:12 -07:00
Tom Lane
7ad1cd31bf Repair assorted issues in locale data extraction.
cache_locale_time (extraction of LC_TIME-related info) had never been
taught the lessons we previously learned about extraction of info related
to LC_MONETARY and LC_NUMERIC.  Specifically, commit 95a777c61 taught
PGLC_localeconv() that data coming out of localeconv() was in an encoding
determined by the relevant locale, but we didn't realize that there's a
similar issue with strftime().  And commit a4930e7ca hardened
PGLC_localeconv() against errors occurring partway through, but failed
to do likewise for cache_locale_time().  So, rearrange the latter
function to perform encoding conversion and not risk failure while
it's got the locales set to temporary values.

This time around I also changed PGLC_localeconv() to treat it as FATAL
if it can't restore the previous settings of the locale values.  There
is no reason (except possibly OOM) for that to fail, and proceeding with
the wrong locale values seems like a seriously bad idea --- especially
on Windows where we have to also temporarily change LC_CTYPE.  Also,
protect against the possibility that we can't identify the codeset
reported for LC_MONETARY or LC_NUMERIC; rather than just failing,
try to validate the data without conversion.

The user-visible symptom this fixes is that if LC_TIME is set to a locale
name that implies an encoding different from the database encoding,
non-ASCII localized day and month names would be retrieved in the wrong
encoding, leading to either unexpected encoding-conversion error reports
or wrong output from to_char().  The other possible failure modes are
unlikely enough that we've not seen reports of them, AFAIK.

The encoding conversion problems do not manifest on Windows, since
we'd already created special-case code to handle that issue there.

Per report from Juan José Santamaría Flecha.  Back-patch to all
supported versions.

Juan José Santamaría Flecha and Tom Lane

Discussion: https://postgr.es/m/CAC+AXB22So5aZm2vZe+MChYXec7gWfr-n-SK-iO091R0P_1Tew@mail.gmail.com
2019-04-23 18:51:30 -04:00
Peter Geoghegan
9b10926263 Prevent O(N^2) unique index insertion edge case.
Commit dd299df8 made nbtree treat heap TID as a tiebreaker column,
establishing the principle that there is only one correct location (page
and page offset number) for every index tuple, no matter what.
Insertions of tuples into non-unique indexes proceed as if heap TID
(scan key's scantid) is just another user-attribute value, but
insertions into unique indexes are more delicate.  The TID value in
scantid must initially be omitted to ensure that the unique index
insertion visits every leaf page that duplicates could be on.  The
scantid is set once again after unique checking finishes successfully,
which can force _bt_findinsertloc() to step right one or more times, to
locate the leaf page that the new tuple must be inserted on.

Stepping right within _bt_findinsertloc() was assumed to occur no more
frequently than stepping right within _bt_check_unique(), but there was
one important case where that assumption was incorrect: inserting a
"duplicate" with NULL values.  Since _bt_check_unique() didn't do any
real work in this case, it wasn't appropriate for _bt_findinsertloc() to
behave as if it was finishing off a conventional unique insertion, where
any existing physical duplicate must be dead or recently dead.
_bt_findinsertloc() might have to grovel through a substantial portion
of all of the leaf pages in the index to insert a single tuple, even
when there were no dead tuples.

To fix, treat insertions of tuples with NULLs into a unique index as if
they were insertions into a non-unique index: never unset scantid before
calling _bt_search() to descend the tree, and bypass _bt_check_unique()
entirely.  _bt_check_unique() is no longer responsible for incoming
tuples with NULL values.

Discussion: https://postgr.es/m/CAH2-Wzm08nr+JPx4jMOa9CGqxWYDQ-_D4wtPBiKghXAUiUy-nQ@mail.gmail.com
2019-04-23 10:33:57 -07:00
Tom Lane
f4a3fdfbdc Avoid order-of-execution problems with ALTER TABLE ADD PRIMARY KEY.
Up to now, DefineIndex() was responsible for adding attnotnull constraints
to the columns of a primary key, in any case where it hadn't been
convenient for transformIndexConstraint() to mark those columns as
is_not_null.  It (or rather its minion index_check_primary_key) did this
by executing an ALTER TABLE SET NOT NULL command for the target table.

The trouble with this solution is that if we're creating the index due
to ALTER TABLE ADD PRIMARY KEY, and the outer ALTER TABLE has additional
sub-commands, the inner ALTER TABLE's operations executed at the wrong
time with respect to the outer ALTER TABLE's operations.  In particular,
the inner ALTER would perform a validation scan at a point where the
table's storage might be inconsistent with its catalog entries.  (This is
on the hairy edge of being a security problem, but AFAICS it isn't one
because the inner scan would only be interested in the tuples' null
bitmaps.)  This can result in unexpected failures, such as the one seen
in bug #15580 from Allison Kaptur.

To fix, let's remove the attempt to do SET NOT NULL from DefineIndex(),
reducing index_check_primary_key's role to verifying that the columns are
already not null.  (It shouldn't ever see such a case, but it seems wise
to keep the check for safety.)  Instead, make transformIndexConstraint()
generate ALTER TABLE SET NOT NULL subcommands to be executed ahead of
the ADD PRIMARY KEY operation in every case where it can't force the
column to be created already-not-null.  This requires only minor surgery
in parse_utilcmd.c, and it makes for a much more satisfying spec for
transformIndexConstraint(): it's no longer having to take it on faith
that someone else will handle addition of NOT NULL constraints.

To make that work, we have to move the execution of AT_SetNotNull into
an ALTER pass that executes ahead of AT_PASS_ADD_INDEX.  I moved it to
AT_PASS_COL_ATTRS, and put that after AT_PASS_ADD_COL to avoid failure
when the column is being added in the same command.  This incidentally
fixes a bug in the only previous usage of AT_PASS_COL_ATTRS, for
AT_SetIdentity: it didn't work either for a newly-added column.

Playing around with this exposed a separate bug in ALTER TABLE ONLY ...
ADD PRIMARY KEY for partitioned tables.  The intent of the ONLY modifier
in that context is to prevent doing anything that would require holding
lock for a long time --- but the implied SET NOT NULL would recurse to
the child partitions, and do an expensive validation scan for any child
where the column(s) were not already NOT NULL.  To fix that, invent a
new ALTER subcommand AT_CheckNotNull that just insists that a child
column be already NOT NULL, and apply that, not AT_SetNotNull, when
recursing to children in this scenario.  This results in a slightly laxer
definition of ALTER TABLE ONLY ... SET NOT NULL for partitioned tables,
too: that command will now work as long as all children are already NOT
NULL, whereas before it just threw up its hands if there were any
partitions.

In passing, clean up the API of generateClonedIndexStmt(): remove a
useless argument, ensure that the output argument is not left undefined,
update the header comment.

A small side effect of this change is that no-such-column errors in ALTER
TABLE ADD PRIMARY KEY now produce a different message that includes the
table name, because they are now detected by the SET NOT NULL step which
has historically worded its error that way.  That seems fine to me, so
I didn't make any effort to avoid the wording change.

The basic bug #15580 is of very long standing, and these other bugs
aren't new in v12 either.  However, this is a pretty significant change
in the way ALTER TABLE ADD PRIMARY KEY works.  On balance it seems best
not to back-patch, at least not till we get some more confidence that
this patch has no new bugs.

Patch by me, but thanks to Jie Zhang for a preliminary version.

Discussion: https://postgr.es/m/15580-d1a6de5a3d65da51@postgresql.org
Discussion: https://postgr.es/m/1396E95157071C4EBBA51892C5368521017F2E6E63@G08CNEXMBPEKD02.g08.fujitsu.local
2019-04-23 12:25:27 -04:00
Tom Lane
c06e3550dc Don't request pretty-printed output from xmlNodeDump().
xml.c passed format = 1 to xmlNodeDump(), resulting in sometimes getting
extra whitespace (newlines + spaces) in the output.  We don't really want
that, first because whitespace might be semantically significant in some
XML uses, and second because it happens only very inconsistently.  Only
one case in our regression tests is affected.

This potentially affects the results of xpath() and the XMLTABLE construct,
when emitting nodeset values.

Note that the older code in contrib/xml2 doesn't do this; it seems
to have been an aboriginal bad decision in commit ea3b212fe.

While this definitely seems like a bug to me, the small number of
complaints to date argues against back-patching a behavioral change.
Hence, fix in HEAD only, at least for now.

Per report from Jean-Marc Voillequin.

Discussion: https://postgr.es/m/1EC8157EB499BF459A516ADCF135ADCE3A23A9CA@LON-WGMSX712.ad.moodys.net
2019-04-23 10:51:07 -04:00
Michael Paquier
ccae190b91 Fix detection of passwords hashed with MD5 or SCRAM-SHA-256
This commit fixes a couple of issues related to the way password
verifiers hashed with MD5 or SCRAM-SHA-256 are detected, leading to
being able to store in catalogs passwords which do not follow the
supported hash formats:
- An MD5-hashed entry was checked based on whether its header uses "md5" and
if the string length matches what is expected.  Unfortunately the code
never checked if the hash only used hexadecimal characters, as reported
by Tom Lane.
- A SCRAM-hashed entry was checked based on only its header, which
should be "SCRAM-SHA-256$", but it never checked for any fields
afterwards, as reported by Jonathan Katz.

Backpatch down to v10, which is where SCRAM has been introduced, and
where password verifiers in plain format have been removed.

Author: Jonathan Katz
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/016deb6b-1f0a-8e9f-1833-a8675b170aa9@postgresql.org
Backpatch-through: 10
2019-04-23 15:43:21 +09:00
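
For the MD5 case, a verifier is the string "md5" followed by exactly 32
lowercase hex digits, so a tightened format check can be sketched in plain C
like this (an illustration, not the committed function):

    #include <stdbool.h>
    #include <string.h>

    /* "md5" + 32 lowercase hex characters = 35 bytes */
    static bool
    looks_like_md5_verifier(const char *shadow_pass)
    {
        return strlen(shadow_pass) == 35 &&
               strncmp(shadow_pass, "md5", 3) == 0 &&
               strspn(shadow_pass + 3, "0123456789abcdef") == 32;
    }
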
Andres Freund
b5f58cf213 Convert gist to compute page level xid horizon on primary.
Due to parallel development, gist added the missing conflict
information in c952eae52a, while 558a9165e0 moved that computation
to the primary for the index types that already had it.  Thus adapt
gist to also compute on the primary, using
index_compute_xid_horizon_for_tuples() instead of its own copy of the
logic.

This also adds pg_waldump support for XLOG_GIST_DELETE records, which
previously was not properly present.

Bumps WAL version.

Author: Andres Freund
Discussion: https://postgr.es/m/20190406050243.bszosdg4buvabfrt@alap3.anarazel.de
2019-04-22 14:28:30 -07:00
Tomas Vondra
d08c44f7a4 Fix mvdistinct and dependencies size calculations
The formulas used to calculate size while (de)serializing mvndistinct
and functional dependencies were based on offsetof() of the structs.  But
that is incorrect, because the structures are not copied wholesale; we
copy the individual fields directly.

At the moment this works fine, because there is no alignment padding
on any platform we support. But it might break if we ever added some
fields into any of the structs, for example. It's also confusing.

Fixed by reworking the macros to directly sum sizes of serialized
fields.  The macros are now useful only for serialization, so there is
no point in keeping them in the public header file. So make them
private by moving them to the .c files.

Also adds a couple more asserts to check the serialization, and fixes
an incorrect allocation of MVDependency instead of (MVDependency *).

Reported-By: Tom Lane
Discussion: https://postgr.es/m/29785.1555365602@sss.pgh.pa.us
2019-04-21 20:23:34 +02:00
Andres Freund
b8b94ea129 Fix slot type issue for fuzzy distance index scan over out-of-core table AM.
For amcanreorderby scans, nodeIndexscan.c's reorder queue holds
heap tuples, but the underlying table likely does not. Before this fix
we'd return different types of slots, depending on whether the tuple
came from the reorder queue, or from the index + table.

While that could be fixed by signalling that the node doesn't return a
fixed type of slot, it seems better to instead remove the separate
slot for the reorder queue, and use ExecForceStoreHeapTuple() to store
tuples from the queue. It's not particularly common to need
reordering, after all.

This reverts most of the iss_ReorderQueueSlot related changes to
nodeIndexscan.c made in 1a0586de36, except that now
ExecForceStoreHeapTuple() is used instead of ExecStoreHeapTuple().

Noticed when testing zheap against the in-core version of tableam.

Author: Andres Freund
2019-04-19 11:42:37 -07:00
Andres Freund
88e6ad3054 Fix two memory leaks around force-storing tuples in slots.
As reported by Tom, when ExecStoreMinimalTuple() had to perform a
conversion to store the minimal tuple in the slot, it forgot to
respect the shouldFree flag, and leaked the tuple into the current
memory context if true.  Fix that by freeing the tuple in that case.

Looking at the relevant code made me (Andres) realize that not having
the shouldFree parameter to ExecForceStoreHeapTuple() was a bad
idea. Some callers had to locally implement the necessary logic, and
in one case it was missing, creating a potential per-group leak in
non-hashed aggregation.

The choice to not free the tuple in ExecComputeStoredGenerated() is
not pretty, but not introduced by this commit - I'll start a separate
discussion about it.

Reported-By: Tom Lane
Discussion: https://postgr.es/m/366.1555382816@sss.pgh.pa.us
2019-04-19 11:39:56 -07:00
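
The convention the fix restores is the usual shouldFree dance, sketched here
in isolation (slot is a hypothetical TupleTableSlot):

    bool        shouldFree;
    HeapTuple   tuple;

    tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree);

    /* ... hand the tuple to code that copies it into another slot ... */

    if (shouldFree)
        heap_freetuple(tuple);  /* otherwise the converted copy leaks per call */
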
Tom Lane
4d5840cea9 Fix problems with auto-held portals.
HoldPinnedPortals() did things in the wrong order: it must not mark
a portal autoHeld until it's been successfully held.  Otherwise,
a failure while persisting the portal results in a server crash
because we think the portal is in a good state when it's not.

Also add a check that portal->status is READY before attempting to
hold a pinned portal.  We have such a check before the only other
use of HoldPortal(), so it seems unwise not to check it here.

Lastly, rethink the responsibility for where to call HoldPinnedPortals.
The comment for it imagined that it was optional for any individual PL
to call it or not, but that cannot be the case: if some outer level of
procedure has a pinned portal, failing to persist it when an inner
procedure commits is going to be trouble.  Let's have SPI do it instead
of the individual PLs.  That's not a complete solution, since in theory
a PL might not be using SPI to perform commit/rollback, but such a PL
is going to have to be aware of lots of related requirements anyway.
(This change doesn't cause an API break for any external PLs that might
be calling HoldPinnedPortals per the old regime, because calling it
twice during a commit or rollback sequence won't hurt.)

Per bug #15703 from Julian Schauder.  Back-patch to v11 where this code
came in.

Discussion: https://postgr.es/m/15703-c12c5bc0ea34ba26@postgresql.org
2019-04-19 11:20:37 -04:00
Michael Paquier
148266fa35 Fix collection of typos and grammar mistakes in docs and comments
Author: Justin Pryzby
Discussion: https://postgr.es/m/20190330224333.GQ5815@telsasoft.com
2019-04-19 16:57:40 +09:00
Andres Freund
75e03eabea Fix potential use-after-free for BEFORE UPDATE row triggers on non-core AMs.
When such a trigger returns the old row version, it naturally gets
stored in the slot for the trigger result. When a table AM doesn't
store HeapTuples internally, ExecBRUpdateTriggers() frees the old row
version passed to triggers - but before this fix it might still be
referenced by the slot holding the new tuple.

Noticed when running the out-of-core zheap AM against the in-core
version of tableam.

Author: Andres Freund
2019-04-18 17:53:54 -07:00
Peter Eisentraut
bb385c4fb0 Fix handling of temp and unlogged tables in FOR ALL TABLES publications
If a FOR ALL TABLES publication exists, temporary and unlogged tables
are ignored for publishing changes.  But CheckCmdReplicaIdentity()
would still check in that case that such a table has a replica
identity set before accepting updates.  To fix, have
GetRelationPublicationActions() return that such a table publishes no
actions.

Discussion: https://www.postgresql.org/message-id/f3f151f7-c4dd-1646-b998-f60bd6217dd3@2ndquadrant.com
2019-04-18 08:55:55 +02:00
Bruce Momjian
fb9c475597 postgresql.conf.sample: add proper defaults for include actions
Previously, include actions include_dir, include_if_exists, and include
listed commented-out values which were not the defaults, which is
inconsistent with other entries.  Instead, replace them with '', which
is the default value.

Reported-by: Emanuel Araújo

Discussion: https://postgr.es/m/CAMuTAkYMx6Q27wpELDR3_v9aG443y7ZjeXu15_+1nGUjhMWOJA@mail.gmail.com

Backpatch-through: 9.4
2019-04-17 18:12:10 -04:00
Tom Lane
8cde7f4948 Fix assorted minor bogosity in GSSAPI transport error messages.
I noted that some buildfarm members were complaining about %ld being
used to format values that are (probably) declared size_t.  Use %zu
instead, and insert a cast just in case some versions of the GSSAPI
API declare the length field differently.  While at it, clean up
gratuitous differences in wording of equivalent messages, show
the complained-of length in all relevant messages not just some,
include trailing newline where needed, adjust random deviations
from project-standard code layout and message style, etc.
2019-04-17 17:06:50 -04:00
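
The portability point is narrow: size_t must be printed with %zu, since %ld
has the wrong width on LLP64 platforms such as 64-bit Windows. A tiny
sketch, with made-up message text:

    size_t  len = 70000;        /* e.g. a GSSAPI buffer length */

    /*
     * %zu is the conversion for size_t; the cast guards against the
     * library declaring the length field with some other integer type.
     */
    ereport(COMMERROR,
            (errmsg("oversize GSSAPI packet: %zu bytes", (size_t) len)));
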
Tom Lane
b4f96d69ad Minor jsonpath fixes.
Restore missed "make clean" rule, fix misspelling.

John Naylor

Discussion: https://postgr.es/m/CACPNZCt5B8jDCCGQiFoSuqmg-za_NCy4QDioBTLaNRih9+-bXg@mail.gmail.com
2019-04-17 13:37:00 -04:00
Magnus Hagander
252b707bc4 Return NULL for checksum failures if checksums are not enabled
Returning 0 could falsely indicate that there is no problem. NULL
correctly indicates that there is no information about potential
problems.

Also return 0 as numbackends instead of NULL for shared objects (as no
connection can be made to a shared object, only to a database).

Author: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
2019-04-17 13:51:48 +02:00
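
In C terms the change amounts to returning SQL NULL instead of zero when
checksums are off, along these lines (a sketch with a hypothetical counter,
not the committed function body):

    #include "postgres.h"
    #include "fmgr.h"
    #include "access/xlog.h"        /* DataChecksumsEnabled() */

    Datum
    pg_stat_get_db_checksum_failures_sketch(PG_FUNCTION_ARGS)
    {
        int64   failures = 0;       /* would come from the stats entry */

        if (!DataChecksumsEnabled())
            PG_RETURN_NULL();       /* "no information", not "zero failures" */

        PG_RETURN_INT64(failures);
    }
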
Michael Paquier
9010156445 Fix thinko introduced by 82a5649 in slot.c
When saving a replication slot, failing to close the temporary path used
to save the slot information is considered as a failure and reported as
such.  However the code forgot to leave immediately as other failure
paths do.

Noticed while looking up at this area of the code for another patch.
2019-04-17 10:01:22 +09:00
Michael Paquier
47ac2033d4 Simplify some ERROR paths clearing wait events and transient files
Transient files and wait events normally get cleaned up when seeing an
exception (be it in the context of a transaction for a backend or
another process like the checkpointer), hence there is little point in
complicating error code paths to do this work.  This shaves a bit of
code, and removes some extra handling of errno, which needed to be
preserved during the cleanup steps.

Reported-by: Masahiko Sawada
Author: Michael Paquier
Reviewed-by: Tom Lane, Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoDhHYVq5KkXfkaHhmjA-zJYj-e4teiRAJefvXuKJz1tKQ@mail.gmail.com
2019-04-17 09:51:45 +09:00
Michael Paquier
a6dcf9df4d Rework handling of invalid indexes with REINDEX CONCURRENTLY
Per discussion with others, allowing REINDEX INDEX CONCURRENTLY to work
for invalid indexes when working directly on them can have a lot of
value to unlock situations with invalid indexes without having to use a
dance involving DROP INDEX followed by an extra CREATE INDEX
CONCURRENTLY (which would not work for indexes with constraint
dependency anyway).  This also does not create extra bloat on the
relation involved as this works on individual indexes, so let's enable
it.

Note that REINDEX TABLE CONCURRENTLY still bypasses invalid indexes as
we don't want to bloat the number of indexes defined on a relation in
the event of multiple and successive failures of REINDEX CONCURRENTLY.

More regression tests are added to cover those behaviors, using an
invalid index created with CREATE INDEX CONCURRENTLY.

Reported-by: Dagfinn Ilmari Mannsåker, Álvaro Herrera
Author: Michael Paquier
Reviewed-by: Peter Eisentraut, Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/20190411134947.GA22043@alvherre.pgsql
2019-04-17 09:33:51 +09:00
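For illustration, the unlocked workflow described above, with a hypothetical
index left invalid by a failed CREATE INDEX CONCURRENTLY:

    -- a failed CREATE INDEX CONCURRENTLY leaves an invalid index behind
    CREATE INDEX CONCURRENTLY tab_col_idx ON tab (col);   -- assume this failed
    -- it can now be rebuilt in place, instead of the DROP INDEX +
    -- CREATE INDEX CONCURRENTLY dance
    REINDEX INDEX CONCURRENTLY tab_col_idx;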
Michael Paquier
5ed4b123b6 Remove duplicate assignment when initializing logical decoder context
The private data in the WAL reader is already getting set when
allocating it.

Author: Antonin Houska
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/30563.1555329094@localhost
2019-04-16 15:08:38 +09:00
Tomas Vondra
3824ca30d1 Fix pg_mcv_list deserialization
The memcpy() was copying type OIDs in the wrong direction, so the
deserialized MCV list always had them as 0. This is mostly harmless
except when printing the data in pg_mcv_list_items(), in which case
it reported

    ERROR:  cache lookup failed for type 0

Also added a simple regression test for pg_mcv_list_items() function,
printing a single-item MCV list.

Reported-By: Dean Rasheed
Discussion: https://postgr.es/m/CAEZATCX6T0iDTTZrqyec4Cd6b4yuL7euu4=rQRXaVBAVrUi1Cg@mail.gmail.com
2019-04-16 00:01:39 +02:00
Tom Lane
4b40e44f07 Fix failure with textual partition hash keys.
Commit 5e1963fb7 overlooked two places in partbounds.c that now
need to pass a collation identifier to the hash functions for
a partition key column.

Amit Langote, per report from Jesper Pedersen

Discussion: https://postgr.es/m/a620f85a-42ab-e0f3-3337-b04b97e2e2f5@redhat.com
2019-04-15 16:47:09 -04:00
Alexander Korotkov
1e87198182 Fix division by zero in _bt_vacuum_needs_cleanup()
Checks inside _bt_vacuum_needs_cleanup() allow division by zero to happen when
metad->btm_last_cleanup_num_heap_tuples == 0.  This commit adjusts the
expression so that no division by zero might happen.

Reported-by: Piotr Stefaniak
Discussion: https://postgr.es/m/DB8PR03MB5931C41F7787A95313F08322F22A0%40DB8PR03MB5931.eurprd03.prod.outlook.com
Reviewed-by: Masahiko Sawada
Backpatch-through: 11
2019-04-15 20:20:43 +03:00
Etsuro Fujita
3a45321a49 Fix thinko in ExecCleanupTupleRouting().
Commit 3f2393edef changed ExecCleanupTupleRouting() so that it skipped
cleaning up subplan resultrels before calling EndForeignInsert(), but
that would cause an issue: when those resultrels were foreign tables,
the FDWs would fail to shut down.  Repair by skipping it after calling
EndForeignInsert() as before.

Author: Etsuro Fujita
Reviewed-by: David Rowley and Amit Langote
Discussion: https://postgr.es/m/5CAF3B8F.2090905@lab.ntt.co.jp
2019-04-15 19:01:09 +09:00
Peter Eisentraut
abb9c63b2c Unbreak index optimization for LIKE on bytea
The same code is used to handle both text and bytea, but bytea is not
collation-aware, so we shouldn't call get_collation_isdeterministic()
in that case, since that will error out with an invalid collation.

Reported-by: Jeevan Chalke <jeevan.chalke@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAM2%2B6%3DWaf3qJ1%3DyVTUH8_yG-SC0xcBMY%2BSFLhvKKNnWNXSUDBw%40mail.gmail.com
2019-04-15 09:29:17 +02:00
Michael Paquier
c34677fdaa Fix SHOW ALL command for non-superusers with replication connection
Since Postgres 10, SHOW commands can be triggered with replication
connections in a WAL sender context; however, this missed that a
transaction context is needed for syscache lookups.  This commit makes
sure that the syscache lookups can happen correctly by setting a
transaction context when running SHOW commands in a WAL sender.

Superuser-only parameters can be displayed using SHOW commands not only
to superusers, but also to members of system role pg_read_all_settings,
which requires a syscache lookup to check whether the connected role is
a member of this system role; without a transaction context, that lookup
crashes the instance.  Superusers do not need to check the syscache, so
this case already worked correctly.

New tests are added to cover this issue.

Reported-by: Alexander Kukushkin
Author: Michael Paquier
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/15734-2daa8761eeed8e20@postgresql.org
Backpatch-through: 10
2019-04-15 12:34:32 +09:00
Tom Lane
5f1433ac5e Prevent memory leaks associated with relcache rd_partcheck structures.
The original coding of generate_partition_qual() just copied the list
of predicate expressions into the global CacheMemoryContext, making it
effectively impossible to clean up when the owning relcache entry is
destroyed --- the relevant code in RelationDestroyRelation() only managed
to free the topmost List header :-(.  This resulted in a session-lifespan
memory leak whenever a table partition's relcache entry is rebuilt.
Fortunately, that's not normally a large data structure, and rebuilds
shouldn't occur all that often in production situations; but this is
still a bug worth fixing back to v10 where the code was introduced.

To fix, put the cached expression tree into its own small memory context,
as we do with other complicated substructures of relcache entries.
Also, deal more honestly with the case that a partition has an empty
partcheck list; while that probably isn't a case that's very interesting
for production use, it's legal.

In passing, clarify comments about how partitioning-related relcache
data structures are managed, and add some Asserts that we're not leaking
old copies when we overwrite these data fields.

Amit Langote and Tom Lane

Discussion: https://postgr.es/m/7961.1552498252@sss.pgh.pa.us
2019-04-13 13:22:26 -04:00
Noah Misch
c098509927 Consistently test for in-use shared memory.
postmaster startup scrutinizes any shared memory segment recorded in
postmaster.pid, exiting if that segment matches the current data
directory and has an attached process.  When the postmaster.pid file was
missing, a starting postmaster used weaker checks.  Change to use the
same checks in both scenarios.  This increases the chance of a startup
failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1
postmaster.pid` && rm postmaster.pid && pg_ctl -w start".  A postmaster
will no longer stop if shmat() of an old segment fails with EACCES.  A
postmaster will no longer recycle segments pertaining to other data
directories.  That's good for production, but it's bad for integration
tests that crash a postmaster and immediately delete its data directory.
Such a test now leaks a segment indefinitely.  No "make check-world"
test does that.  win32_shmem.c already avoided all these problems.  In
9.6 and later, enhance PostgresNode to facilitate testing.  Back-patch
to 9.4 (all supported versions).

Reviewed (in earlier versions) by Daniel Gustafsson and Kyotaro HORIGUCHI.

Discussion: https://postgr.es/m/20190408064141.GA2016666@rfd.leadboat.com
2019-04-12 22:36:38 -07:00
Magnus Hagander
77bd49adba Show shared object statistics in pg_stat_database
This adds a row to the pg_stat_database view with datoid 0 and datname
NULL for those objects that are not in a database. This was added
particularly for checksums, but we were already tracking more statistics
for these objects, just not returning them.

Also add a checksum_last_failure column that holds the timestamptz of
the last checksum failure that occurred in a database (or in a
non-database file), if any.

Author: Julien Rouhaud <rjuju123@gmail.com>
2019-04-12 14:04:50 +02:00
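A hedged illustration of reading the new information; the datid and
checksum_failures column names are assumed from the current view definition,
while the shared-objects row and checksum_last_failure come from the commit
above:

    -- the shared-objects row has datname NULL
    SELECT datid, datname, numbackends,
           checksum_failures, checksum_last_failure
    FROM pg_stat_database
    WHERE datname IS NULL;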
Peter Eisentraut
ef6f30fe77 Fix REINDEX CONCURRENTLY of partitions
In case of a partition index, when swapping the old and new index, we
also need to attach the new index as a partition and detach the old
one.  Also, to handle partition indexes, we not only need to change
dependencies referencing the index, but also dependencies of the index
referencing something else.  The previous code did this only
specifically for a constraint, but we also need to do this for
partitioned indexes.  So instead write a generic function that does it
for all dependencies.

Author: Michael Paquier <michael@paquier.xyz>
Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/DF4PR8401MB11964EDB77C860078C343BEBEE5A0%40DF4PR8401MB1196.NAMPRD84.PROD.OUTLOOK.COM#154df1fedb735190a773481765f7b874
2019-04-12 08:36:05 +02:00
Thomas Munro
f7feb020c3 Fix GetNewTransactionId()'s interaction with xidVacLimit.
Commit ad308058 switched to returning a FullTransactionId, but
failed to load the potentially updated value in the case where
xidVacLimit is reached and we release and reacquire the lock.
Repair, closing bug #15727.

While reviewing that commit, also fix the size computation used
by EstimateTransactionStateSize() and switch to the mul_size()
macro traditionally used in such expressions.

Author: Thomas Munro
Reported-by: Roman Zharkov
Discussion: https://postgr.es/m/15727-0be246e7d852d229%40postgresql.org
2019-04-12 16:47:50 +12:00
Michael Paquier
d87ab88686 Fix typos in reloptions.c
Author: Kirk Jamison
Discussion: https://postgr.es/m/D09B13F772D2274BB348A310EE3027C6493463@g01jpexmbkw24
2019-04-12 12:56:38 +09:00
Michael Paquier
d527fda621 Fix more strcmp() calls using boolean-like comparisons for result checks
Such calls can confuse the reader as strcmp() uses an integer as result.
The places patched here have been spotted by Thomas Munro, David Rowley
and myself.

Author: Michael Paquier
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/20190411021946.GG2728@paquier.xyz
2019-04-12 10:16:49 +09:00
Tom Lane
4cae471d1b Fix backwards test in operator_precedence_warning logic.
Warnings about unary minus might have been wrong.  It's a bit
surprising that nobody noticed yet ... probably the precedence-warning
feature hasn't really been used much in the field.

Rikard Falkeborn

Discussion: https://postgr.es/m/CADRDgG6fzA8A2oeygUw4=o7ywo4kvz26NxCSgpq22nMD73Bx4Q@mail.gmail.com
2019-04-10 19:02:21 -04:00
Amit Kapila
bdf35744bd Avoid counting transaction stats for parallel worker cooperating
transaction.

The transaction that is initiated by the parallel worker to cooperate
with the actual transaction started by the main backend to complete the
query execution should not be counted as a separate transaction.  The
other internal transactions started and committed by the parallel worker
are still counted as separate transactions, as that is what we do in
other places like autovacuum.

This will partially fix the bloat in transaction stats due to additional
transactions performed by parallel workers.  For a complete fix, we need to
decide how we want to show all the transactions that are started internally
for various operations, and that is a matter for a separate patch.

Reported-by: Haribabu Kommi
Author: Haribabu Kommi
Reviewed-by: Amit Kapila, Jamison Kirk and Rahila Syed
Backpatch-through: 9.6
Discussion: https://postgr.es/m/CAJrrPGc9=jKXuScvNyQ+VNhO0FZk7LLAShAJRyZjnedd2D61EQ@mail.gmail.com
2019-04-10 08:24:15 +05:30
Thomas Munro
255044889d Fix typos. 2019-04-10 09:21:06 +12:00
Tom Lane
9476131278 Prevent inlining of multiply-referenced CTEs with outer recursive refs.
This has to be prevented because inlining would result in multiple
self-references, which we don't support (and in fact that's disallowed
by the SQL spec, see statements about linearly vs. nonlinearly
recursive queries).  Bug fix for commit 608b167f9.

Per report from Yaroslav Schekin (via Andrew Gierth)

Discussion: https://postgr.es/m/87wolmg60q.fsf@news-spur.riddles.org.uk
2019-04-09 15:47:35 -04:00
Alvaro Herrera
4dba0f6dc4 Fix typo 2019-04-09 13:00:12 -04:00
Noah Misch
ba3fb5d4fb Define WIN32_STACK_RLIMIT throughout win32 and cygwin builds.
The MSVC build system already did this, and commit
617dc6d299 used it in a second file.
Back-patch to 9.4, like that commit.

Discussion: https://postgr.es/m/CAA8=A7_1SWc3+3Z=-utQrQFOtrj_DeohRVt7diA2tZozxsyUOQ@mail.gmail.com
2019-04-09 08:25:39 -07:00
Peter Eisentraut
9efe068e48 Replace tabs with spaces in one .sql file
Let's at least keep this consistent within the same file.
2019-04-09 15:54:37 +02:00
Heikki Linnakangas
16954e22e2 Fix example in comment.
Author: Adrien Nayrat
2019-04-09 08:33:42 +03:00
Noah Misch
617dc6d299 Avoid "could not reattach" by providing space for concurrent allocation.
We've long had reports of intermittent "could not reattach to shared
memory" errors on Windows.  Buildfarm member dory fails that way when
PGSharedMemoryReAttach() execution overlaps with creation of a thread
for the process's "default thread pool".  Fix that by providing a second
region to receive asynchronous allocations that would otherwise intrude
into UsedShmemSegAddr.  In pgwin32_ReserveSharedMemoryRegion(), stop
trying to free reservations landing at incorrect addresses; the caller's
next step has been to terminate the affected process.  Back-patch to 9.4
(all supported versions).

Reviewed by Tom Lane.  He also did much of the prerequisite research;
see commit bcbf2346d6.

Discussion: https://postgr.es/m/20190402135442.GA1173872@rfd.leadboat.com
2019-04-08 21:39:00 -07:00
Tom Lane
45f8eaa8e3 Fix improper interaction of FULL JOINs with lateral references.
join_is_legal() needs to reject forming certain outer joins in cases
where that would lead the planner down a blind alley.  However, it
mistakenly supposed that the way to handle full joins was to treat them
as applying the same constraints as for left joins, only to both sides.
That doesn't work, as shown in bug #15741 from Anthony Skorski: given
a lateral reference out of a join that's fully enclosed by a full join,
the code would fail to believe that any join ordering is legal, resulting
in errors like "failed to build any N-way joins".

However, we don't really need to consider full joins at all for this
purpose, because we effectively force them to be evaluated in syntactic
order, and that order is always legal for lateral references.  Hence,
get rid of this broken logic for full joins and just ignore them instead.

This seems to have been an oversight in commit 7e19db0c0.
Back-patch to all supported branches, as that was.

Discussion: https://postgr.es/m/15741-276f1f464b3f40eb@postgresql.org
2019-04-08 16:09:26 -04:00
Tom Lane
a8cb8f1246 Fix EvalPlanQualStart to handle partitioned result rels correctly.
The es_root_result_relations array needs to be shallow-copied in the
same way as the main es_result_relations array, else EPQ rechecks on
partitioned result relations fail, as seen in bug #15677 from
Norbert Benkocs.

Amit Langote, isolation test case added by me

Discussion: https://postgr.es/m/15677-0bf089579b4cd02d@postgresql.org
Discussion: https://postgr.es/m/19321.1554567786@sss.pgh.pa.us
2019-04-08 12:20:22 -04:00
Fujii Masao
119dcfad98 Add vacuum_truncate reloption.
vacuum_truncate controls whether vacuum tries to truncate off
any empty pages at the end of the table. Previously vacuum always
tried to do the truncation. However, the truncation could cause
some problems; for example, ACCESS EXCLUSIVE lock needs to
be taken on the table during the truncation and can cause
query cancellations on the standby even if hot_standby_feedback
is true. Setting this reloption to false can be helpful to avoid
such problems.

Author: Tsunakawa Takayuki
Reviewed-By: Julien Rouhaud, Masahiko Sawada, Michael Paquier, Kirk Jamison and Fujii Masao
Discussion: https://postgr.es/m/CAHGQGwE5UqFqSq1=kV3QtTUtXphTdyHA-8rAj4A=Y+e4kyp3BQ@mail.gmail.com
2019-04-08 16:43:57 +09:00
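A small sketch of the vacuum_truncate reloption above, on a hypothetical table:

    -- skip end-of-table truncation (and its ACCESS EXCLUSIVE lock) here
    ALTER TABLE measurements SET (vacuum_truncate = off);
    -- restore the default behavior
    ALTER TABLE measurements RESET (vacuum_truncate);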
Andres Freund
4c9e1bd0a3 Reset memory context once per tuple in validateForeignKeyConstraint.
When using tableam, ExecFetchSlotHeapTuple() might return a separately
allocated tuple. We could use the shouldFree argument to explicitly
free it, but it seems more robust to protect it with a memory context
that is reset once per tuple.

Also add a CHECK_FOR_INTERRUPTS() after each tuple. It's likely that
each AM has (heap does) a CFI somewhere in the relevant path, but it
seems more robust to have one in validateForeignKeyConstraint()
itself.

Note that this only affects the cases that couldn't be optimized to be
verified with a query.

Author: Andres Freund
Reviewed-By: Tom Lane (in an earlier version)
Discussion:
    https://postgr.es/m/19030.1554574075@sss.pgh.pa.us
    https://postgr.es/m/CAKJS1f_SHKcPYMsi39An5aUjhAcEMZb6Cx1Sj1QWEWSiKJkBVQ@mail.gmail.com
    https://postgr.es/m/20180711185628.mrvl46bjgk2uxoki@alap3.anarazel.de
2019-04-07 22:42:42 -07:00
Andres Freund
41f5e04aec Fix a number of issues around modifying a previously updated row.
This commit fixes three, unfortunately related, issues:

1) Since 5db6df0c01, the introduction of DML via tableam, it was
   possible to trigger "ERROR: unexpected table_lock_tuple status: 1"
   when updating a row that was previously updated in the same
   transaction - but only when that row had first been updated in a
   concurrent transaction (and READ COMMITTED was used). The reason
   was that that case simply wasn't expected. Fixing that led to:

2) Even before the above commit, there were error checks (introduced
   in 6868ed7491) preventing a row being updated by different
   commands within the same statement (say in a function called by an
   UPDATE) - but that check wasn't performed when the row was first
   updated in a concurrent transaction - instead the second update was
   silently skipped in that case. After this change we throw the same
   error as we'd without the concurrent transaction.

3) The error messages (introduced in 6868ed7491) preventing such
   updates emitted the same error message for both DELETE and
   UPDATE ("tuple to be updated was already modified by an operation
   triggered by the current command"). While that could be changed
   separately, it made it hard to write tests that verify the correct
   behavior of the code.

This commit changes heap's implementation of table_lock_tuple() to
return TM_SelfModified instead of TM_Invisible (previously loosely
modeled after EvalPlanQualFetch), and teaches nodeModifyTable.c to
handle that in response to table_lock_tuple() and not just in response
to table_(delete|update).

Additionally it fixes the wrong error message (see 3 above). The
comment for table_lock_tuple() is also adjusted to state that
TM_Deleted won't return information in TM_FailureData - it'll not
always be available.

This also adds tests to ensure that DELETE/UPDATE correctly error out
when affecting a row that concurrently was modified by another
transaction.

Author: Andres Freund
Reported-By: Tom Lane, when investigating a bug fix to another bug
    by Amit Langote
Discussion: https://postgr.es/m/19321.1554567786@sss.pgh.pa.us
2019-04-07 22:14:47 -07:00
Tom Lane
80a96e066e Avoid fetching past the end of the indoption array.
pg_get_indexdef_worker carelessly fetched indoption entries even for
non-key index columns that don't have one.  99.999% of the time this
would be harmless, since the code wouldn't examine the value ... but
some fine day this will be a fetch off the end of memory, resulting
in SIGSEGV.

Detected through valgrind testing.  Odd that the buildfarm's valgrind
critters haven't noticed.
2019-04-07 18:19:16 -04:00
Tom Lane
159970bcad Clean up side-effects of commits ab5fcf2b0 et al.
Before those commits, partitioning-related code in the executor could
assume that ModifyTableState.resultRelInfo[] contains only leaf partitions.
However, now a fully-pruned update results in a dummy ModifyTable that
references the root partitioned table, and that breaks some stuff.

In v11, this led to an assertion or core dump in the tuple routing code.
Fix by disabling tuple routing, since we don't need that anyway.
(I chose to do that in HEAD as well for safety, even though the problem
doesn't manifest in HEAD as it stands.)

In v10, this confused ExecInitModifyTable's decision about whether it
needed to close the root table.  But we can get rid of that altogether
by being smarter about where to find the root table.

Note that since the referenced commits haven't shipped yet, this
isn't fixing any bug the field has seen.

Amit Langote, per a report from me

Discussion: https://postgr.es/m/20710.1554582479@sss.pgh.pa.us
2019-04-07 12:54:22 -04:00
Peter Eisentraut
03f9e5cba0 Report progress of REINDEX operations
This uses the same infrastructure that the CREATE INDEX progress
reporting uses.  Add a column to pg_stat_progress_create_index to
report the OID of the index being worked on.  This was not necessary
for CREATE INDEX, but it's useful for REINDEX.

Also edit the phase descriptions a bit to be more consistent with the
source code comments.

Discussion: https://www.postgresql.org/message-id/ef6a6757-c36a-9e81-123f-13b19e36b7d7%402ndquadrant.com
2019-04-07 12:35:29 +02:00
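A sketch of watching a REINDEX from another session; the column holding the
OID of the index being worked on is assumed to be named index_relid:

    SELECT pid, relid::regclass AS table_name,
           index_relid::regclass AS index_name,
           phase, blocks_done, blocks_total
    FROM pg_stat_progress_create_index;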
Peter Eisentraut
106f2eb664 Cast pg_stat_progress_cluster.cluster_index_relid to oid
It's tracked internally as bigint, but when presented to the user it
should be oid.
2019-04-07 10:31:32 +02:00
Tom Lane
46e3442c9e Fix failures in validateForeignKeyConstraint's slow path.
The foreign-key-checking loop in ATRewriteTables failed to ignore
relations without storage (e.g., partitioned tables), unlike the
initial loop.  This accidentally worked as long as RI_Initial_Check
succeeded, which it does in most practical cases (including all the
ones exercised in the existing regression tests :-().  However, if
that failed, as for instance when there are permissions issues,
then we entered the slow fire-the-trigger-on-each-tuple path.
And that would try to read from the referencing relation, and fail
if it lacks storage.

A second problem, recently introduced in HEAD, was that this loop
had been broken by sloppy refactoring for the tableam API changes.

Repair both issues, and add a regression test case so we have some
coverage on this code path.  Back-patch as needed to v11.

(It looks like this code could do with additional bulletproofing,
but let's get a working test case in place first.)

Hadi Moshayedi, Tom Lane, Andres Freund

Discussion: https://postgr.es/m/CAK=1=WrnNmBbe5D9sm3t0a6dnAq3cdbF1vXY816j1wsMqzC8bw@mail.gmail.com
Discussion: https://postgr.es/m/19030.1554574075@sss.pgh.pa.us
Discussion: https://postgr.es/m/20190325180405.jytoehuzkeozggxx%40alap3.anarazel.de
2019-04-06 15:09:09 -04:00
Michael Paquier
249d649996 Add support for TCP user timeout in libpq and the backend server
Similarly to the set of parameters for keepalive, a connection parameter
for libpq is added as well as a backend GUC, called tcp_user_timeout.

Increasing the TCP user timeout is useful to allow a connection to
survive extended periods without end-to-end connectivity, and decreasing
it allows applications to fail faster.  By default, the parameter is 0,
which makes the connection use the system default, and it follows a logic
close to the keepalive parameters in its handling.  When connecting
through a Unix-domain socket, the parameters have no effect.

Author: Ryohei Nagaura
Reviewed-by: Fabien Coelho, Robert Haas, Kyotaro Horiguchi, Kirk
Jamison, Mikalai Keida, Takayuki Tsunakawa, Andrei Yahorau
Discussion: https://postgr.es/m/EDA4195584F5064680D8130B1CA91C45367328@G01JPEXMBYT04
2019-04-06 15:23:37 +09:00
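For illustration, assuming tcp_user_timeout takes milliseconds like the
keepalive parameters, a 10-second timeout could look like this:

    -- server side; 0 keeps the operating system default
    ALTER SYSTEM SET tcp_user_timeout = 10000;
    SELECT pg_reload_conf();
    -- libpq side (connection string, not SQL):
    --   psql "host=db.example.com dbname=app tcp_user_timeout=10000"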
Tom Lane
959d00e9db Use Append rather than MergeAppend for scanning ordered partitions.
If we need ordered output from a scan of a partitioned table, but
the ordering matches the partition ordering, then we don't need to
use a MergeAppend to combine the pre-ordered per-partition scan
results: a plain Append will produce the same results.  This
both saves useless comparison work inside the MergeAppend proper,
and allows us to start returning tuples after starting up just
the first child node, not all of them.

However, all is not peaches and cream, because if some of the
child nodes have high startup costs then there will be big
discontinuities in the tuples-returned-versus-elapsed-time curve.
The planner's cost model cannot handle that (yet, anyway).
If we model the Append's startup cost as being just the first
child's startup cost, we may drastically underestimate the cost
of fetching slightly more tuples than are available from the first
child.  Since we've had bad experiences with over-optimistic choices
of "fast start" plans for ORDER BY LIMIT queries, that seems scary.
As a klugy workaround, set the startup cost estimate for an ordered
Append to be the sum of its children's startup costs (as MergeAppend
would).  This doesn't really describe reality, but it's less likely
to cause a bad plan choice than an underestimated startup cost would.
In practice, the cases where we really care about this optimization
will have child plans that are IndexScans with zero startup cost,
so that the overly conservative estimate is still just zero.

David Rowley, reviewed by Julien Rouhaud and Antonin Houska

Discussion: https://postgr.es/m/CAKJS1f-hAqhPLRk_RaSFTgYxd=Tz5hA7kQ2h4-DhJufQk8TGuw@mail.gmail.com
2019-04-05 19:20:43 -04:00
Alvaro Herrera
9f06d79ef8 Add facility to copy replication slots
This allows the user to create duplicates of existing replication slots,
either logical or physical, and even changing properties such as whether
they are temporary or the output plugin used.

There are multiple uses for this, such as initializing multiple replicas
using the slot for one base backup; investigating logical
replication issues; and selecting a different output plugin.

Author: Masahiko Sawada
Reviewed-by: Michael Paquier, Andres Freund, Petr Jelinek
Discussion: https://postgr.es/m/CAD21AoAm7XX8y_tOPP6j4Nzzch12FvA1wPqiO690RCk+uYVstg@mail.gmail.com
2019-04-05 18:05:18 -03:00
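A hedged sketch, assuming the facility above is exposed through SQL functions
named pg_copy_physical_replication_slot and pg_copy_logical_replication_slot
(slot names hypothetical):

    -- duplicate a physical slot
    SELECT pg_copy_physical_replication_slot('base_backup_slot', 'replica2_slot');
    -- duplicate a logical slot, switching to a different output plugin
    SELECT pg_copy_logical_replication_slot('debug_slot', 'debug_slot_copy',
                                            false, 'test_decoding');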
Thomas Munro
de2b38419c Wake up interested backends when a checkpoint fails.
Commit c6c9474a switched to condition variables instead of sleep
loops to notify backends of checkpoint start and stop, but forgot
to broadcast in case of checkpoint failure.

Author: Thomas Munro
Discussion: https://postgr.es/m/CA%2BhUKGJKbCd%2B_K%2BSEBsbHxVT60SG0ivWHHAdvL0bLTUt2xpA2w%40mail.gmail.com
2019-04-06 09:31:48 +13:00
Peter Eisentraut
edda32ee25 Fix compiler warning
Rewrite get_attgenerated() to avoid compiler warning if the compiler
does not recognize that elog(ERROR) does not return.

Reported-by: David Rowley <david.rowley@2ndquadrant.com>
2019-04-05 09:23:07 +02:00
Noah Misch
82150a05be Revert "Consistently test for in-use shared memory."
This reverts commits 2f932f71d9,
16ee6eaf80 and
6f0e190056.  The buildfarm has revealed
several bugs.  Back-patch like the original commits.

Discussion: https://postgr.es/m/20190404145319.GA1720877@rfd.leadboat.com
2019-04-05 00:00:52 -07:00
Thomas Munro
794c543b17 Fix bugs in mdsyncfiletag().
Commit 3eb77eba moved a _mdfd_getseg() call from mdsync() into a new
callback function mdsyncfiletag(), but didn't get the arguments quite
right.  Without the EXTENSION_DONT_CHECK_SIZE flag we fail to open a
segment if lower-numbered segments have been truncated, and it wants
a block number rather than a segment number.

While comparing with the older coding, also remove an unnecessary
clobbering of errno, and adjust the code in mdunlinkfiletag() to
resemble the original code from mdpostckpt() more closely instead
of using an unnecessary call to smgropen().

Author: Thomas Munro
Discussion: https://postgr.es/m/CA%2BhUKGL%2BYLUOA0eYiBXBfwW%2BbH5kFgh94%3DgQH0jHEJ-t5Y91wQ%40mail.gmail.com
2019-04-05 17:41:58 +13:00
Andres Freund
57a7a3adfe Remove unused struct member, enforce multi_insert callback presence.
Author: David Rowley, Andres Freund
Discussion: https://postgr.es/m/CAKJS1f9=9phmm66diAji4gvHnWSrK7BGFoNct+mEUT_c8pPOjw@mail.gmail.com
2019-04-04 17:39:39 -07:00
Andres Freund
ea97e440b8 Harden tableam against nonexistent / wrong kind of AMs.
Previously it was allowed to set default_table_access_method to an
empty string. That makes sense for default_tablespace, where that was
copied from, as it signals falling back to the database's default
tablespace. As there is no equivalent for table AMs, forbid that.

Also make sure to throw a usable error when creating a table using an
index AM, by using get_am_type_oid() to implement get_table_am_oid()
instead of a separate copy. Previously we'd error out only later, in
GetTableAmRoutine().

Thirdly remove GetTableAmRoutineByAmId() - it was only used in an
earlier version of 8586bf7ed8.

Add tests for the above (some for index AMs as well).
2019-04-04 17:39:39 -07:00
Andres Freund
86b85044e8 tableam: Add table_multi_insert() and revamp/speed-up COPY FROM buffering.
This adds table_multi_insert(), and converts COPY FROM, the only user
of heap_multi_insert, to it.

A simple conversion of COPY FROM to use slots would have yielded a
slowdown when inserting into a partitioned table for some
workloads. Different partitions might need different slots (both slot
types and their descriptors), and dropping / creating slots when
there's constant partition changes is measurable.

Thus instead revamp the COPY FROM buffering for partitioned tables to
allow buffering inserts into multiple tables, flushing only when
limits are reached across all partition buffers. By only dropping
slots when there've been inserts into too many different partitions,
the aforementioned overhead is gone. By allowing larger batches, even
when there are frequent partition changes, we actually speed such cases
up significantly.

By using slots, COPY of very narrow rows into unlogged / temporary
tables might slow down very slightly (due to the indirect function calls).

Author: David Rowley, Andres Freund, Haribabu Kommi
Discussion:
    https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
    https://postgr.es/m/20190327054923.t3epfuewxfqdt22e@alap3.anarazel.de
2019-04-04 16:28:18 -07:00
Tom Lane
9c703c169a Make queries' locking of indexes more consistent.
The assertions added by commit b04aeb0a0 exposed that there are some
code paths wherein the executor will try to open an index without
holding any lock on it.  We do have some lock on the index's table,
so it seems likely that there's no fatal problem with this (for
instance, the index couldn't get dropped from under us).  Still,
it's bad practice and we should fix it.

To do so, remove the optimizations in ExecInitIndexScan and friends
that tried to avoid taking a lock on an index belonging to a target
relation, and just take the lock always.  In non-bug cases, this
will result in no additional shared-memory access, since we'll find
in the local lock table that we already have a lock of the desired
type; hence, no significant performance degradation should occur.

Also, adjust the planner and executor so that the type of lock taken
on an index is always identical to the type of lock taken for its table,
by relying on the recently added RangeTblEntry.rellockmode field.
This avoids some corner cases where that might not have been true
before (possibly resulting in extra locking overhead), and prevents
future maintenance issues from having multiple bits of logic that
all needed to be in sync.  In addition, this change removes all core
calls to ExecRelationIsTargetRelation, which avoids a possible O(N^2)
startup penalty for queries with large numbers of target relations.
(We'd probably remove that function altogether, were it not that we
advertise it as something that FDWs might want to use.)

Also adjust some places in selfuncs.c to not take any lock on indexes
they are transiently opening, since we can assume that plancat.c
did that already.

In passing, change gin_clean_pending_list() to take RowExclusiveLock
not AccessShareLock on its target index.  Although it's not clear that
that's actually a bug, it seemed very strange for a function that's
explicitly going to modify the index to use only AccessShareLock.

David Rowley, reviewed by Julien Rouhaud and Amit Langote,
a bit of further tweaking by me

Discussion: https://postgr.es/m/19465.1541636036@sss.pgh.pa.us
2019-04-04 15:12:58 -04:00
Robert Haas
a96c41feec Allow VACUUM to be run with index cleanup disabled.
This commit adds a new reloption, vacuum_index_cleanup, which
controls whether index cleanup is performed for a particular
relation by default.  It also adds a new option to the VACUUM
command, INDEX_CLEANUP, which can be used to override the
reloption.  If neither the reloption nor the VACUUM option is
used, the default is true, as before.

Masahiko Sawada, reviewed and tested by Nathan Bossart, Alvaro
Herrera, Kyotaro Horiguchi, Darafei Praliaskouski, and me.
The wording of the documentation is mostly due to me.

Discussion: http://postgr.es/m/CAD21AoAt5R3DNUZSjOoXDUY=naYPUOuffVsRzuTYMz29yLzQCA@mail.gmail.com
2019-04-04 15:04:43 -04:00
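An illustrative sketch of both knobs, on a hypothetical table:

    -- per-table default: skip index cleanup during vacuum
    ALTER TABLE events SET (vacuum_index_cleanup = false);
    -- one-off override of the reloption for a single run
    VACUUM (INDEX_CLEANUP TRUE, VERBOSE) events;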
Peter Geoghegan
74eb2176bf Invalidate binary search bounds consistently.
_bt_check_unique() failed to invalidate binary search bounds in the
event of a live conflict following commit e5adcb78.  This resulted in
problems after waiting for the conflicting xact to commit or abort.  The
subsequent call to _bt_check_unique() would restore the initial binary
search bounds, rather than starting a new search.  Fix by explicitly
invalidating bounds when it becomes clear that there is a live conflict
that insertion will have to wait to resolve.

Ashutosh Sharma, with a few additional tweaks by me.

Author: Ashutosh Sharma
Reported-By: Ashutosh Sharma
Diagnosed-By: Ashutosh Sharma
Discussion: https://postgr.es/m/CAE9k0PnQp-qr-UYKMSCzdC2FBzdE4wKP41hZrZvvP26dKLonLg@mail.gmail.com
2019-04-04 09:38:08 -07:00
Thomas Munro
3eb77eba5a Refactor the fsync queue for wider use.
Previously, md.c and checkpointer.c were tightly integrated so that
fsync calls could be handed off and processed in the background.
Introduce a system of callbacks and file tags, so that other modules
can hand off fsync work in the same way.

For now only md.c uses the new interface, but other users are being
proposed.  Since there may be use cases that are not strictly SMGR
implementations, use a new function table for sync handlers rather
than extending the traditional SMGR one.

Instead of using a bitmapset of segment numbers for each RelFileNode
in the checkpointer's hash table, make the segment number part of the
key.  This requires sending explicit "forget" requests for every
segment individually when relations are dropped, but suits the file
layout schemes of proposed future users better (ie sparse or high
segment numbers).

Author: Shawn Debnath and Thomas Munro
Reviewed-by: Thomas Munro, Andres Freund
Discussion: https://postgr.es/m/CAEepm=2gTANm=e3ARnJT=n0h8hf88wqmaZxk0JYkxw+b21fNrw@mail.gmail.com
2019-04-04 23:38:38 +13:00
Noah Misch
6f0e190056 Silence -Wimplicit-fallthrough in sysv_shmem.c.
Commit 2f932f71d9 added code that elicits
a warning on buildfarm member flaviventris.  Back-patch to 9.4, like
that commit.

Reported by Andres Freund.

Discussion: https://postgr.es/m/20190404020057.galelv7by75ekqrh@alap3.anarazel.de
2019-04-03 23:23:35 -07:00
Noah Misch
ab9ed9be23 Assert that pgwin32_signal_initialize() has been called early enough.
Before the pgwin32_signal_initialize() call, the backend version of
pg_usleep() has no effect.  No in-tree code falls afoul of that today,
but temporary commit 23078689a9 did so.

Discussion: https://postgr.es/m/20190402135442.GA1173872@rfd.leadboat.com
2019-04-03 17:11:16 -07:00
Noah Misch
2f932f71d9 Consistently test for in-use shared memory.
postmaster startup scrutinizes any shared memory segment recorded in
postmaster.pid, exiting if that segment matches the current data
directory and has an attached process.  When the postmaster.pid file was
missing, a starting postmaster used weaker checks.  Change to use the
same checks in both scenarios.  This increases the chance of a startup
failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1
postmaster.pid` && rm postmaster.pid && pg_ctl -w start".  A postmaster
will no longer recycle segments pertaining to other data directories.
That's good for production, but it's bad for integration tests that
crash a postmaster and immediately delete its data directory.  Such a
test now leaks a segment indefinitely.  No "make check-world" test does
that.  win32_shmem.c already avoided all these problems.  In 9.6 and
later, enhance PostgresNode to facilitate testing.  Back-patch to 9.4
(all supported versions).

Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI.

Discussion: https://postgr.es/m/20130911033341.GD225735@tornado.leadboat.com
2019-04-03 17:03:46 -07:00
Tomas Vondra
ea569d64ac Add SETTINGS option to EXPLAIN, to print modified settings.
Query planning is affected by a number of configuration options, and it
may be crucial to know which of those options were set to non-default
values.  With this patch you can say EXPLAIN (SETTINGS ON) to include
that information in the query plan.  Only options affecting planning,
with values different from the built-in default are printed.

This patch also adds auto_explain.log_settings option, providing the
same capability in auto_explain module.

Author: Tomas Vondra
Reviewed-by: Rafia Sabih, John Naylor
Discussion: https://postgr.es/m/e1791b4c-df9c-be02-edc5-7c8874944be0@2ndquadrant.com
2019-04-04 00:04:31 +02:00
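A small example, assuming a session where a planner-affecting GUC differs from
its built-in default:

    SET enable_seqscan = off;   -- non-default, affects planning
    EXPLAIN (SETTINGS ON) SELECT * FROM accounts WHERE id = 1;
    -- the text-format plan then ends with a line along the lines of:
    --   Settings: enable_seqscan = 'off'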
Alvaro Herrera
d1f04b96b9 Tweak docs for log_statement_sample_rate
Author: Justin Pryzby, partly after a suggestion from Masahiko Sawada
Discussion: https://postgr.es/m/20190328135918.GA27808@telsasoft.com
Discussion: https://postgr.es/m/CAD21AoB9+y8N4+Fan-ne-_7J5yTybPttxeVKfwUocKp4zT1vNQ@mail.gmail.com
2019-04-03 18:56:56 -03:00
Alvaro Herrera
799e220346 Log all statements from a sample of transactions
This is useful to obtain a view of the different transaction types in an
application, regardless of the durations of the statements each runs.

Author: Adrien Nayrat
Reviewed-by: Masahiko Sawada, Hayato Kuroda, Andres Freund
2019-04-03 18:43:59 -03:00
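For illustration, assuming the GUC added here is log_transaction_sample_rate
(a fraction between 0 and 1):

    -- log every statement of roughly 1% of transactions
    ALTER SYSTEM SET log_transaction_sample_rate = 0.01;
    SELECT pg_reload_conf();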
Tomas Vondra
c50b3158bf Reduce overhead of pg_mcv_list (de)serialization
Commit ea4e1c0e8f resolved issues with memory alignment in serialized
pg_mcv_list values, but it required copying data to/from the varlena
buffer during serialization and deserialization.  As the MCV lists may
be fairly large, the overhead (memory consumption, CPU usage) can get
rather significant too.

This change tweaks the serialization format so that the alignment is
correct with respect to the varlena value, and so the parts may be
accessed directly without copying the data.

Catversion bump, as it affects existing pg_statistic_ext data.
2019-04-03 21:23:40 +02:00
Stephen Frost
b0b39f72b9 GSSAPI encryption support
On both the frontend and backend, prepare for GSSAPI encryption
support by moving common code for error handling into a separate file.
Fix a TODO for handling multiple status messages in the process.
Eliminate the OIDs, which have not been needed for some time.

Add frontend and backend encryption support functions.  Keep the
context initiation for authentication-only separate on both the
frontend and backend in order to avoid concerns about changing the
requested flags to include encryption support.

In postmaster, pull GSSAPI authorization checking into a shared
function.  Also share the initiator name between the encryption and
non-encryption codepaths.

For HBA, add "hostgssenc" and "hostnogssenc" entries that behave
similarly to their SSL counterparts.  "hostgssenc" requires either
"gss", "trust", or "reject" for its authentication.

Similarly, add a "gssencmode" parameter to libpq.  Supported values are
"disable", "require", and "prefer".  Notably, negotiation will only be
attempted if credentials can be acquired.  Move credential acquisition
into its own function to support this behavior.

Add a simple pg_stat_gssapi view similar to pg_stat_ssl, for monitoring
if GSSAPI authentication was used, what principal was used, and if
encryption is being used on the connection.

Finally, add documentation for everything new, and update existing
documentation on connection security.

Thanks to Michael Paquier for the Windows fixes.

Author: Robbie Harwood, with changes to the read/write functions by me.
Reviewed in various forms and at different times by: Michael Paquier,
   Andres Freund, David Steele.
Discussion: https://www.postgresql.org/message-id/flat/jlg1tgq1ktm.fsf@thriss.redhat.com
2019-04-03 15:02:33 -04:00
Alvaro Herrera
5f6fc34af5 Copy name when cloning FKs recurses to partitions
We were passing a string owned by a syscache entry, which was released
before recursing.  Fix by pstrdup'ing the string.

Per buildfarm member prion.
2019-04-03 15:35:54 -03:00
Alvaro Herrera
f56f8f8da6 Support foreign keys that reference partitioned tables
Previously, while primary keys could be made on partitioned tables, it
was not possible to define foreign keys that reference those primary
keys.  Now it is possible to do that.

Author: Álvaro Herrera
Reviewed-by: Amit Langote, Jesper Pedersen
Discussion: https://postgr.es/m/20181102234158.735b3fevta63msbj@alvherre.pgsql
2019-04-03 14:40:21 -03:00
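A minimal sketch of the newly allowed case, with hypothetical tables:

    CREATE TABLE customers (
        id int PRIMARY KEY
    ) PARTITION BY RANGE (id);
    CREATE TABLE customers_1 PARTITION OF customers
        FOR VALUES FROM (1) TO (1000);

    -- previously rejected: a foreign key referencing a partitioned table
    CREATE TABLE orders (
        order_id    int PRIMARY KEY,
        customer_id int REFERENCES customers (id)
    );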
Heikki Linnakangas
9155580fd5 Generate less WAL during GiST, GIN and SP-GiST index build.
Instead of WAL-logging every modification during the build separately,
first build the index without any WAL-logging, and make a separate pass
through the index at the end, to write all pages to the WAL. This
significantly reduces the amount of WAL generated, and is usually also
faster, despite the extra I/O needed for the extra scan through the index.
WAL generated this way is also faster to replay.

For GiST, the LSN-NSN interlock makes this a little tricky. All pages must
be marked with a valid (i.e. non-zero) LSN, so that the parent-child
LSN-NSN interlock works correctly. We now use magic value 1 for that during
index build. Change the fake LSN counter to begin from 1000, so that 1 is
safely smaller than any real or fake LSN. 2 would've been enough for our
purposes, but let's reserve a bigger range, in case we need more special
values in the future.

Author: Anastasia Lubennikova, Andrey V. Lepikhov
Reviewed-by: Heikki Linnakangas, Dmitry Dolgov
2019-04-03 17:03:15 +03:00
Alvaro Herrera
5f768045a1 Correctly initialize newly added struct member
Valgrind was rightly complaining that IndexVacuumInfo->report_progress
(added by commit ab0dfc961b) was not being initialized in some code
paths.  Repair.

Per buildfarm member lousyjack.
2019-04-03 09:58:47 -03:00
Alvaro Herrera
e8abf97af7 Prevent use of uninitialized variable
Per buildfarm member longfin.
2019-04-02 16:03:26 -03:00
Alvaro Herrera
ab0dfc961b Report progress of CREATE INDEX operations
This uses the progress reporting infrastructure added by c16dc1aca5,
adding support for CREATE INDEX and CREATE INDEX CONCURRENTLY.

There are two pieces to this: one is index-AM-agnostic, and the other is
AM-specific.  The latter is fairly elaborate for btrees, including
reportage for parallel index builds and the separate phases that btree
index creation uses; other index AMs, which are much simpler in their
building procedures, have simplistic reporting only, but that seems
sufficient, at least for non-concurrent builds.

The index-AM-agnostic part is fairly complete, providing insight into
the CONCURRENTLY wait phases as well as block-based progress during the
index validation table scan.  (The index validation index scan requires
patching each AM, which has not been included here.)

Reviewers: Rahila Syed, Pavan Deolasee, Tatsuro Yamada
Discussion: https://postgr.es/m/20181220220022.mg63bhk26zdpvmcj@alvherre.pgsql
2019-04-02 15:18:08 -03:00
Stephen Frost
4d0e994eed Add support for partial TOAST decompression
When asked for a slice of a TOAST entry, decompress enough to return the
slice instead of decompressing the entire object.

For use cases where the slice is at, or near, the beginning of the entry,
this avoids a lot of unnecessary decompression work.

This changes the signature of pglz_decompress() by adding a boolean to
indicate if it's ok for the call to finish before consuming all of the
source or destination buffers.

Author: Paul Ramsey
Reviewed-By: Rafia Sabih, Darafei Praliaskouski, Regina Obe
Discussion: https://postgr.es/m/CACowWR07EDm7Y4m2kbhN_jnys%3DBBf9A6768RyQdKm_%3DNpkcaWg%40mail.gmail.com
2019-04-02 12:35:32 -04:00
Etsuro Fujita
d50d172e51 postgres_fdw: Perform the (FINAL, NULL) upperrel operations remotely.
The upper-planner pathification allows FDWs to arrange to push down
different types of upper-stage operations to the remote side.  This
commit teaches postgres_fdw to do it for the (FINAL, NULL) upperrel,
which is responsible for doing LockRows, LIMIT, and/or ModifyTable.
This provides the ability for postgres_fdw to handle SELECT commands
so that it 1) skips the LockRows step (if any) (note that this is
safe since it performs early locking) and 2) pushes down the LIMIT
and/or OFFSET restrictions (if any) to the remote side.  This doesn't
handle the INSERT/UPDATE/DELETE cases.

Author: Etsuro Fujita
Reviewed-By: Antonin Houska and Jeff Janes
Discussion: https://postgr.es/m/87pnz1aby9.fsf@news-spur.riddles.org.uk
2019-04-02 20:30:45 +09:00
Etsuro Fujita
aef65db676 Refactor create_limit_path() to share cost adjustment code with FDWs.
This is in preparation for an upcoming commit.

Author: Etsuro Fujita
Reviewed-By: Antonin Houska and Jeff Janes
Discussion: https://postgr.es/m/87pnz1aby9.fsf@news-spur.riddles.org.uk
2019-04-02 19:55:12 +09:00
Dean Rasheed
e2d28c0f40 Perform RLS subquery checks as the right user when going via a view.
When accessing a table with RLS via a view, the RLS checks are
performed as the view owner. However, the code neglected to propagate
that to any subqueries in the RLS checks. Fix that by calling
setRuleCheckAsUser() for all RLS policy quals and withCheckOption
checks for RTEs with RLS.

Back-patch to 9.5 where RLS was added.

Per bug #15708 from daurnimator.

Discussion: https://postgr.es/m/15708-d65cab2ce9b1717a@postgresql.org
2019-04-02 08:13:59 +01:00
Thomas Munro
475861b261 Add wal_recycle and wal_init_zero GUCs.
On at least ZFS, it can be beneficial to create new WAL files every
time and not to bother zero-filling them.  Since it's not clear which
other filesystems might benefit from one or both of those things,
add individual GUCs to control those two behaviors independently and
make only very general statements in the docs.

Author: Jerry Jelinek, with some adjustments by Thomas Munro
Reviewed-by: Alvaro Herrera, Andres Freund, Tomas Vondra, Robert Haas and others
Discussion: https://postgr.es/m/CACPQ5Fo00QR7LNAcd1ZjgoBi4y97%2BK760YABs0vQHH5dLdkkMA%40mail.gmail.com
2019-04-02 14:37:14 +13:00
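For illustration, on a filesystem such as ZFS both new boolean GUCs (on by
default) could be turned off:

    -- do not zero-fill newly created WAL files
    ALTER SYSTEM SET wal_init_zero = off;
    -- create new WAL files instead of recycling old ones
    ALTER SYSTEM SET wal_recycle = off;
    SELECT pg_reload_conf();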
Andres Freund
d45e401586 tableam: Add table_finish_bulk_insert().
This replaces the previous calls of heap_sync() in places using
bulk-insert. By passing in the flags used for bulk-insert the AM can
decide (first at insert time and then during the finish call) which of
the optimizations apply to it, and what operations are necessary to
finish a bulk insert operation.

Also change HEAP_INSERT_* flags to TABLE_INSERT, and rename hi_options
to ti_options.

These changes are made even in copy.c, which hasn't yet been converted
to tableam. There's no harm in doing so.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-04-01 14:41:42 -07:00
Thomas Munro
4fd05bb55b Fix deadlock in heap_compute_xid_horizon_for_tuples().
We can't call code that uses syscache while we hold buffer locks
on a catalog relation.  If passed such a relation, just fall back
to the general effective_io_concurrency GUC rather than trying to
look up the containing tablespace's IO concurrency setting.

We might find a better way to control prefetching in follow-up
work, but for now this is enough to avoid the deadlock introduced
by commit 558a9165e0.

Reviewed-by: Andres Freund
Diagnosed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CA%2BhUKGLCwPF0S4Mk7S8qw%2BDK0Bq65LueN9rofAA3HHSYikW-Zw%40mail.gmail.com
Discussion: https://postgr.es/m/962831d8-c18d-180d-75fb-8b842e3a2742%40chrullrich.net
2019-04-02 09:29:49 +13:00
Tom Lane
12d46ac392 Improve documentation about our XML functionality.
Add a section explaining how our XML features depart from current
versions of the SQL standard.  Update and clarify the descriptions
of some XML functions.

Chapman Flack, reviewed by Ryan Lambert

Discussion: https://postgr.es/m/5BD1284C.1010305@anastigmatix.net
Discussion: https://postgr.es/m/5C81F8C0.6090901@anastigmatix.net
Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com
2019-04-01 16:20:22 -04:00
Tom Lane
b2b819019f Add volatile qualifier missed in commit 2e616dee9.
Noted by Pavel Stehule

Discussion: https://postgr.es/m/CAFj8pRAaGO5FX7bnP3E=mRssoK8y5T78x7jKy-vDiyS68L888Q@mail.gmail.com
2019-04-01 14:37:25 -04:00
Peter Eisentraut
cc8d415117 Unified logging system for command-line programs
This unifies the various ad hoc logging (message printing, error
printing) systems used throughout the command-line programs.

Features:

- Program name is automatically prefixed.

- Message string does not end with newline.  This removes a common
  source of inconsistencies and omissions.

- Additionally, a final newline is automatically stripped, simplifying
  use of PQerrorMessage() etc., another common source of mistakes.

- I converted error message strings to use %m where possible.

- As a result of the above several points, more translatable message
  strings can be shared between different components and between
  frontends and backend, without gratuitous punctuation or whitespace
  differences.

- There is support for setting a "log level".  This is not meant to be
  user-facing, but can be used internally to implement debug or
  verbose modes.

- Lazy argument evaluation, so no significant overhead if logging at
  some level is disabled.

- Some color in the messages, similar to gcc and clang.  Set
  PG_COLOR=auto to try it out.  Some colors are predefined, but can be
  customized by setting PG_COLORS.

- Common files (common/, fe_utils/, etc.) can handle logging much more
  simply by just using one API without worrying too much about the
  context of the calling program, requiring callbacks, or having to
  pass "progname" around everywhere.

- Some programs called setvbuf() to make sure that stderr is
  unbuffered, even on Windows.  But not all programs did that.  This
  is now done centrally.

Soft goals:

- Reduces vertical space use and visual complexity of error reporting
  in the source code.

- Encourages more deliberate classification of messages.  For example,
  in some cases it wasn't clear without analyzing the surrounding code
  whether a message was meant as an error or just an info.

- Concepts and terms are vaguely aligned with popular logging
  frameworks such as log4j and Python logging.

This is all just about printing stuff out.  Nothing affects program
flow (e.g., fatal exits).  The uses are just too varied to do that.
Some existing code had wrappers that do some kind of print-and-exit,
and I adapted those.

I tried to keep the output mostly the same, but there is a lot of
historical baggage to unwind and special cases to consider, and I
might not always have succeeded.  One significant change is that
pg_rewind used to write all error messages to stdout.  That is now
changed to stderr.

Reviewed-by: Donald Dong <xdong@csumb.edu>
Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru>
Discussion: https://www.postgresql.org/message-id/flat/6a609b43-4f57-7348-6480-bd022f924310@2ndquadrant.com
2019-04-01 20:01:35 +02:00
Alexander Korotkov
b4cc19ab01 Throw error in jsonb_path_match() when result is not single boolean
jsonb_path_match() checks whether a jsonb document matches a jsonpath query.
Therefore, the jsonpath query should return a single boolean.  Currently, if
the result of the jsonpath is not a single boolean, NULL is returned regardless
of whether silent mode is on or off.  But that appears to be wrong when silent
mode is off.  This
commit makes jsonb_path_match() throw an error in this case.

Author: Nikita Glukhov
2019-04-01 18:09:20 +03:00
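A sketch of the behavior change, assuming the
jsonb_path_match(target, path, vars, silent) signature:

    -- the path is a predicate yielding a single boolean: fine
    SELECT jsonb_path_match('{"a": 1}', '$.a == 1');          -- true
    -- the path yields a value, not a boolean: now an error...
    SELECT jsonb_path_match('{"a": 1}', '$.a');
    -- ...unless silent mode is requested explicitly, which still gives NULL
    SELECT jsonb_path_match('{"a": 1}', '$.a', '{}', true);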
Alexander Korotkov
2e643501e5 Restrict some cases in parsing numerics in jsonpath
Jsonpath now accepts integers with leading zeroes and floats starting with
a dot.  However, the SQL standard requires following the JSON specification,
which allows neither of these.  Our json[b] datatypes also restrict that.
So, restrict it in jsonpath altogether.

Author: Nikita Glukhov
2019-04-01 18:09:09 +03:00
Alexander Korotkov
0a02e2ae02 GIN support for @@ and @? jsonpath operators
This commit makes the existing GIN operator classes jsonb_ops and jsonb_path_ops
support the "jsonb @@ jsonpath" and "jsonb @? jsonpath" operators.  The basic
idea is to extract statements of the following form out of the jsonpath.

 key1.key2. ... .keyN = const

The rest of jsonpath is rechecked from heap.

Catversion is bumped.

Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com
Author: Nikita Glukhov, Alexander Korotkov
Reviewed-by: Jonathan Katz, Pavel Stehule
2019-04-01 18:08:52 +03:00
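A hedged sketch of index usage with the operators named above (table and data
hypothetical):

    CREATE TABLE docs (body jsonb);
    CREATE INDEX docs_body_idx ON docs USING gin (body jsonb_path_ops);

    -- both operators can now be accelerated by the GIN index
    SELECT * FROM docs WHERE body @? '$.tags[*] ? (@ == "urgent")';
    SELECT * FROM docs WHERE body @@ '$.status == "open"';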
Peter Eisentraut
7241911782 Catch syntax error in generated column definition
The syntax

    GENERATED BY DEFAULT AS (expr)

is not allowed but we have to accept it in the grammar to avoid
shift/reduce conflicts because of the similar syntax for identity
columns.  The existing code just ignored this, incorrectly.  Add an
explicit error check and a bespoke error message.

Reported-by: Justin Pryzby <pryzby@telsasoft.com>
2019-04-01 10:46:37 +02:00
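For illustration, the accepted form next to the now-rejected one (hypothetical
table; exact error wording not shown):

    -- allowed: stored generated column
    CREATE TABLE items (
        price numeric,
        qty   numeric,
        total numeric GENERATED ALWAYS AS (price * qty) STORED
    );
    -- the GENERATED BY DEFAULT AS (expr) form quoted above now fails with
    -- an explicit error instead of being silently ignored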
Michael Paquier
4ae7f02b03 Fix thinko in allocation call during MVC list deserialization
Spotted by Coverity.
2019-04-01 14:16:27 +09:00
Noah Misch
5a907404b5 Update HINT for pre-existing shared memory block.
One should almost always terminate an old process, not use a manual
removal tool like ipcrm.  Removal of the ipcclean script eleven years
ago (39627b1ae6) and its non-replacement
corroborate that manual shm removal is now a niche goal.  Back-patch to
9.4 (all supported versions).

Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI.

Discussion: https://postgr.es/m/20180812064815.GB2301738@rfd.leadboat.com
2019-03-31 19:32:48 -07:00
Andres Freund
bfbcad478f tableam: bitmap table scan.
This moves bitmap heap scan support to below an optional tableam
callback. It's optional as the whole concept of bitmap heapscans is
fairly block specific.

This basically moves the work previously done in bitgetpage() into the
new scan_bitmap_next_block callback, and the direct poking into the
buffer done in BitmapHeapNext() into the new scan_bitmap_next_tuple()
callback.

The abstraction is currently somewhat leaky because
nodeBitmapHeapscan.c's prefetching and visibilitymap based logic
remains - it's likely that we'll later have to move more into the
AM. But it's not trivial to do so without introducing a significant
amount of code duplication between the AMs, so that's a project for
later.

Note that now nodeBitmapHeapscan.c and the associated node types are a
bit misnamed. But it's not clear whether renaming wouldn't be a cure
worse than the disease. Either way, that'd be best done in a separate
commit.

Author: Andres Freund
Reviewed-By: Robert Haas (in an older version)
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-03-31 18:37:57 -07:00
Andres Freund
73c954d248 tableam: sample scan.
This moves sample scan support to below tableam. It's not optional as
there is, in contrast to e.g. bitmap heap scans, no alternative way to
perform tablesample queries. If an AM can't deal with the block based
API, it will have to throw an ERROR.

The tableam callbacks for this are block based, but given the current
TsmRoutine interface, that seems to be required.

The new interface doesn't require TsmRoutines to perform visibility
checks anymore - that requires the TsmRoutine to know details about
the AM, which we want to avoid.  To continue to allow taking the
returned number of tuples into account, SampleScanState now has a donetuples
field (which previously e.g. existed in SystemRowsSamplerData), which
is only incremented after the visibility check succeeds.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-03-31 18:37:57 -07:00
Andres Freund
4bb50236eb tableam: Formatting and other minor cleanups.
The superfluous heapam_xlog.h includes were reported by Peter
Geoghegan.
2019-03-31 18:16:53 -07:00
Peter Geoghegan
76a39f2295 Fix nbtree high key "continuescan" row compare bug.
Commit 29b64d1d mishandled skipping over truncated high key attributes
during row comparisons.  The row comparison key matching loop would loop
forever when a truncated attribute was encountered for a row compare
subkey.  Fix by following the example of other code in the loop: advance
the current subkey, or break out of the loop when the last subkey is
reached.

Add test coverage for the relevant _bt_check_rowcompare() code path.
The new test case is somewhat tied to nbtree implementation details,
which isn't ideal, but seems unavoidable.
2019-03-31 17:24:04 -07:00
Tom Lane
9fd4de119c Compute root->qual_security_level in a less random place.
We can set this up once and for all in subquery_planner's initial survey
of the flattened rangetable, rather than incrementally adjusting it in
build_simple_rel.  The previous approach made it rather hard to reason
about exactly when the value would be available, and we were definitely
using it in some places before the final value was computed.

Noted while fooling around with Amit Langote's patch to delay creation
of inheritance child rels.  That didn't break this code, but it made it
even more fragile, IMO.
2019-03-31 13:47:41 -04:00
Michael Paquier
2aa6e331ea Skip redundant anti-wraparound vacuums
An anti-wraparound vacuum has to be by definition aggressive as it needs
to work on all the pages of a relation.  However it can happen that due
to some concurrent activity an anti-wraparound vacuum is marked as
non-aggressive, which makes it redundant with a previous run, and
it is actually useless, as an anti-wraparound vacuum should process all
the pages of a relation.  This commit makes such vacuums be skipped.

A non-aggressive anti-wraparound vacuum can be triggered easily by mixing
low values of autovacuum_freeze_max_age (to control anti-wraparound) and
autovacuum_freeze_table_age (to control the aggressiveness).

28a8fa9 has added some extra logging printing all the possible
combinations of anti-wraparound and aggressive vacuums; this now gets
simplified, as a non-aggressive anti-wraparound vacuum gets
skipped.

Per discussion mainly between Andrew Dunstan, Robert Haas, Álvaro
Herrera, Kyotaro Horiguchi, Masahiko Sawada, and myself.

Author: Kyotaro Horiguchi, Michael Paquier
Reviewed-by: Andrew Dunstan, Álvaro Herrera
Discussion: https://postgr.es/m/20180914153554.562muwr3uwujno75@alvherre.pgsql
2019-03-31 22:59:12 +09:00
Andres Freund
696d78469f tableam: Move heap specific logic from estimate_rel_size below tableam.
This just moves the table/matview[/toast] determination of relation
size to a callback, and uses a copy of the existing logic to implement
that callback for heap.

It probably would make sense to also move the index specific logic
into a callback, so the metapage handling (and probably more) can be
index specific. But that's a separate task.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-03-30 19:26:36 -07:00
Andres Freund
737a292b5d tableam: VACUUM and ANALYZE support.
This is a relatively straightforward move of the current
implementation to sit below tableam. As the current analyze sampling
implementation is pretty inherently block based, the tableam analyze
interface is as well. It might make sense to generalize that at some
point, but that seems like a larger project that shouldn't be
undertaken at the same time as the introduction of tableam.

Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
2019-03-30 19:25:58 -07:00
Tomas Vondra
0f5493fdf1 Fix typo
Author: John Naylor
2019-03-31 03:29:58 +02:00