postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	b4f96d69ad	Minor jsonpath fixes. Restore missed "make clean" rule, fix misspelling. John Naylor Discussion: https://postgr.es/m/CACPNZCt5B8jDCCGQiFoSuqmg-za_NCy4QDioBTLaNRih9+-bXg@mail.gmail.com	2019-04-17 13:37:00 -04:00
Magnus Hagander	252b707bc4	Return NULL for checksum failures if checksums are not enabled Returning 0 could falsely indicate that there is no problem. NULL correctly indicates that there is no information about potential problems. Also return 0 as numbackends instead of NULL for shared objects (as no connection can be made to a shared object only). Author: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net>	2019-04-17 13:51:48 +02:00
Michael Paquier	9010156445	Fix thinko introduced by `82a5649` in slot.c When saving a replication slot, failing to close the temporary path used to save the slot information is considered as a failure and reported as such. However the code forgot to leave immediately as other failure paths do. Noticed while looking up at this area of the code for another patch.	2019-04-17 10:01:22 +09:00
Michael Paquier	47ac2033d4	Simplify some ERROR paths clearing wait events and transient files Transient files and wait events get normally cleaned up when seeing an exception (be it in the context of a transaction for a backend or another process like the checkpointer), hence there is little point in complicating error code paths to do this work. This shaves a bit of code, and removes some extra handling with errno which needed to be preserved during the cleanup steps done. Reported-by: Masahiko Sawada Author: Michael Paquier Reviewed-by: Tom Lane, Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoDhHYVq5KkXfkaHhmjA-zJYj-e4teiRAJefvXuKJz1tKQ@mail.gmail.com	2019-04-17 09:51:45 +09:00
Michael Paquier	a6dcf9df4d	Rework handling of invalid indexes with REINDEX CONCURRENTLY Per discussion with others, allowing REINDEX INDEX CONCURRENTLY to work for invalid indexes when working directly on them can have a lot of value to unlock situations with invalid indexes without having to use a dance involving DROP INDEX followed by an extra CREATE INDEX CONCURRENTLY (which would not work for indexes with constraint dependency anyway). This also does not create extra bloat on the relation involved as this works on individual indexes, so let's enable it. Note that REINDEX TABLE CONCURRENTLY still bypasses invalid indexes as we don't want to bloat the number of indexes defined on a relation in the event of multiple and successive failures of REINDEX CONCURRENTLY. More regression tests are added to cover those behaviors, using an invalid index created with CREATE INDEX CONCURRENTLY. Reported-by: Dagfinn Ilmari Mannsåker, Álvaro Herrera Author: Michael Paquier Reviewed-by: Peter Eisentraut, Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/20190411134947.GA22043@alvherre.pgsql	2019-04-17 09:33:51 +09:00
Michael Paquier	5ed4b123b6	Remove duplicate assignment when initializing logical decoder context The private data in the WAL reader is already getting set when allocating it. Author: Antonin Houska Reviewed-by: Tom Lane Discussion: https://postgr.es/m/30563.1555329094@localhost	2019-04-16 15:08:38 +09:00
Noah Misch	e12a472612	Don't write to stdin of a test process that could have already exited. Instead, close that stdin. Per buildfarm member conchuela. Back-patch to 9.6, where the test was introduced. Discussion: https://postgr.es/m/26478.1555373328@sss.pgh.pa.us	2019-04-15 18:13:44 -07:00
Tom Lane	dde7fb7836	Use [FLEXIBLE_ARRAY_MEMBER] not [1] in MultiSortSupportData. This struct seems to have not gotten the word about preferred coding style for variable-length arrays.	2019-04-15 19:32:44 -04:00
Tomas Vondra	dbb984128e	Convert pre-existing stats_ext tests to new style The regression tests added in commit `7300a69950` test cardinality estimates using a function that extracts the interesting pieces from the EXPLAIN output, instead of testing the whole plan. That seems both easier to understand and less fragile, so this applies the same approach to pre-existing tests of ndistinct coefficients and functional dependencies. Discussion: https://postgr.es/m/dfdac334-9cf2-2597-fb27-f0fb3753f435@2ndquadrant.com	2019-04-16 00:02:22 +02:00
Tomas Vondra	3824ca30d1	Fix pg_mcv_list deserialization The memcpy() was copying type OIDs in the wrong direction, so the deserialized MCV list always had them as 0. This is mostly harmless except when printing the data in pg_mcv_list_items(), in which case it reported ERROR: cache lookup failed for type 0 Also added a simple regression test for pg_mcv_list_items() function, printing a single-item MCV list. Reported-By: Dean Rasheed Discussion: https://postgr.es/m/CAEZATCX6T0iDTTZrqyec4Cd6b4yuL7euu4=rQRXaVBAVrUi1Cg@mail.gmail.com	2019-04-16 00:01:39 +02:00
Tom Lane	4b40e44f07	Fix failure with textual partition hash keys. Commit `5e1963fb7` overlooked two places in partbounds.c that now need to pass a collation identifier to the hash functions for a partition key column. Amit Langote, per report from Jesper Pedersen Discussion: https://postgr.es/m/a620f85a-42ab-e0f3-3337-b04b97e2e2f5@redhat.com	2019-04-15 16:47:09 -04:00
Tom Lane	47169c2550	Avoid possible regression test instability in timestamp.sql. Concurrent autovacuum could result in a change in the order of the live rows in timestamp_tbl. While this would not happen with the default autovacuum parameters, it's fairly easy to hit if autovacuum_vacuum_threshold is made small enough to allow autovac to decide to process this table. That's a stumbling block for trying to exercise autovacuum aggressively using the core regression tests. To fix, replace an unqualified DELETE with a TRUNCATE. There's a similar DELETE just above (and no order-sensitive queries between), so this doesn't lose any test coverage and might indeed be argued to improve it. Discussion: https://postgr.es/m/17428.1555348950@sss.pgh.pa.us	2019-04-15 16:20:01 -04:00
Alexander Korotkov	1e87198182	Fix division by zero in _bt_vacuum_needs_cleanup() Checks inside _bt_vacuum_needs_cleanup() allow division by zero to happen when metad->btm_last_cleanup_num_heap_tuples == 0. This commit adjusts the expression so that no division by zero might happen. Reported-by: Piotr Stefaniak Discussion: https://postgr.es/m/DB8PR03MB5931C41F7787A95313F08322F22A0%40DB8PR03MB5931.eurprd03.prod.outlook.com Reviewed-by: Masahiko Sawada Backpatch-through: 11	2019-04-15 20:20:43 +03:00
Etsuro Fujita	3a45321a49	Fix thinko in ExecCleanupTupleRouting(). Commit `3f2393edef` changed ExecCleanupTupleRouting() so that it skipped cleaning up subplan resultrels before calling EndForeignInsert(), but that would cause an issue: when those resultrels were foreign tables, the FDWs would fail to shut down. Repair by skipping it after calling EndForeignInsert() as before. Author: Etsuro Fujita Reviewed-by: David Rowley and Amit Langote Discussion: https://postgr.es/m/5CAF3B8F.2090905@lab.ntt.co.jp	2019-04-15 19:01:09 +09:00
Peter Eisentraut	abb9c63b2c	Unbreak index optimization for LIKE on bytea The same code is used to handle both text and bytea, but bytea is not collation-aware, so we shouldn't call get_collation_isdeterministic() in that case, since that will error out with an invalid collation. Reported-by: Jeevan Chalke <jeevan.chalke@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/CAM2%2B6%3DWaf3qJ1%3DyVTUH8_yG-SC0xcBMY%2BSFLhvKKNnWNXSUDBw%40mail.gmail.com	2019-04-15 09:29:17 +02:00
Michael Paquier	c34677fdaa	Fix SHOW ALL command for non-superusers with replication connection Since Postgres 10, SHOW commands can be triggered with replication connections in a WAL sender context, however it missed that a transaction context is needed for syscache lookups. This commit makes sure that the syscache lookups can happen correctly by setting a transaction context when running SHOW commands in a WAL sender. Superuser-only parameters can be displayed using SHOW commands not only to superusers, but also to members of system role pg_read_all_settings, which requires a syscache lookup to check if the connected role is a member of this system role or not, or the instance crashes. Superusers do not need to check the syscache so it worked correctly in this case. New tests are added to cover this issue. Reported-by: Alexander Kukushkin Author: Michael Paquier Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/15734-2daa8761eeed8e20@postgresql.org Backpatch-through: 10	2019-04-15 12:34:32 +09:00
Noah Misch	4ab02e8156	Test both 0.0.0.0 and 127.0.0.x addresses to find a usable port. Commit `c098509927` changed PostgresNode::get_new_node() to probe 0.0.0.0 instead of 127.0.0.1, but the new test was less effective for Windows native Perl. This increased the failure rate of buildfarm members bowerbird and jacana. Instead, test 0.0.0.0 and concrete addresses. This restores the old level of defense, but the algorithm is still subject to its longstanding time of check to time of use race condition. Back-patch to 9.6, like the previous change. Discussion: https://postgr.es/m/GrdLgAdUK9FdyZg8VIcTDKVOkys122ZINEb3CjjoySfGj2KyPiMKTh1zqtRp0TAD7FJ27G-OBB3eplxIB5GhcQH5o8zzGZfp0MuJaXJxVxk=@yesql.se	2019-04-14 20:02:19 -07:00
Michael Paquier	d9f543e9e9	Switch TAP tests of pg_rewind to use non-superuser role, take two Up to now the tests of pg_rewind have been using a superuser for all its tests (which is the default of many tests actually, and something that ought to be reviewed) when involving an online source server, still it is possible to use a non-superuser role to do that as long as this role is granted permissions to execute all the source-side functions used for the rewind. This is possible since v11, and was already documented as of `bfc8068`. PostgresNode::init is extended so as callers of this routine can add extra options to configure the authentication of a new node, which gets used by this commit, and allows the tests to work properly on Windows where SSPI is used. This will allow to catch up easily any change in pg_rewind if the tool begins to use more backend-side functions, so as the properties introduced by v11 are kept. Per suggestion from Peter Eisentraut. Author: Michael Paquier Reviewed-by: Magnus Hagander Discussion: https://postgr.es/m/20190411041336.GM2728@paquier.xyz	2019-04-14 18:47:51 +09:00
Noah Misch	9daefff122	MSYS: Translate REGRESS_SHLIB to a Windows file name. Per buildfarm member jacana. Back-patch to v11; earlier branches skip the affected test under msys. Discussion: https://postgr.es/m/GrdLgAdUK9FdyZg8VIcTDKVOkys122ZINEb3CjjoySfGj2KyPiMKTh1zqtRp0TAD7FJ27G-OBB3eplxIB5GhcQH5o8zzGZfp0MuJaXJxVxk=@yesql.se	2019-04-14 00:42:34 -07:00
Noah Misch	947a35014f	When Perl "kill(9, ...)" fails, try "pg_ctl kill". Per buildfarm member jacana, the former fails under msys Perl 5.8.8. Back-patch to 9.6, like the code in question. Discussion: https://postgr.es/m/GrdLgAdUK9FdyZg8VIcTDKVOkys122ZINEb3CjjoySfGj2KyPiMKTh1zqtRp0TAD7FJ27G-OBB3eplxIB5GhcQH5o8zzGZfp0MuJaXJxVxk=@yesql.se	2019-04-13 11:09:27 -07:00
Tom Lane	5f1433ac5e	Prevent memory leaks associated with relcache rd_partcheck structures. The original coding of generate_partition_qual() just copied the list of predicate expressions into the global CacheMemoryContext, making it effectively impossible to clean up when the owning relcache entry is destroyed --- the relevant code in RelationDestroyRelation() only managed to free the topmost List header :-(. This resulted in a session-lifespan memory leak whenever a table partition's relcache entry is rebuilt. Fortunately, that's not normally a large data structure, and rebuilds shouldn't occur all that often in production situations; but this is still a bug worth fixing back to v10 where the code was introduced. To fix, put the cached expression tree into its own small memory context, as we do with other complicated substructures of relcache entries. Also, deal more honestly with the case that a partition has an empty partcheck list; while that probably isn't a case that's very interesting for production use, it's legal. In passing, clarify comments about how partitioning-related relcache data structures are managed, and add some Asserts that we're not leaking old copies when we overwrite these data fields. Amit Langote and Tom Lane Discussion: https://postgr.es/m/7961.1552498252@sss.pgh.pa.us	2019-04-13 13:22:26 -04:00
Noah Misch	c098509927	Consistently test for in-use shared memory. postmaster startup scrutinizes any shared memory segment recorded in postmaster.pid, exiting if that segment matches the current data directory and has an attached process. When the postmaster.pid file was missing, a starting postmaster used weaker checks. Change to use the same checks in both scenarios. This increases the chance of a startup failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1 postmaster.pid` && rm postmaster.pid && pg_ctl -w start". A postmaster will no longer stop if shmat() of an old segment fails with EACCES. A postmaster will no longer recycle segments pertaining to other data directories. That's good for production, but it's bad for integration tests that crash a postmaster and immediately delete its data directory. Such a test now leaks a segment indefinitely. No "make check-world" test does that. win32_shmem.c already avoided all these problems. In 9.6 and later, enhance PostgresNode to facilitate testing. Back-patch to 9.4 (all supported versions). Reviewed (in earlier versions) by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20190408064141.GA2016666@rfd.leadboat.com	2019-04-12 22:36:38 -07:00
Michael Paquier	db8db624e8	Revert "Switch TAP tests of pg_rewind to use a role with minimal permissions" This reverts commit `d4e2a84`, which added a new user with limited permissions to run the TAP tests of pg_rewind. Buildfarm machine members on Windows jacana and bowerbird have been complaining about that, the new role not being able to run the rewind because SSPI is not configured to allow it. Fixing the test requires passing down directly the new user to pg_regress with --create-role so as SSPI can work properly. Reported-by: Andrew Dunstan Discussion: https://postgr.es/m/3cd43d33-f415-cc41-ade3-7230ab15b2c9@2ndQuadrant.com	2019-04-13 13:20:21 +09:00
Magnus Hagander	77bd49adba	Show shared object statistics in pg_stat_database This adds a row to the pg_stat_database view with datoid 0 and datname NULL for those objects that are not in a database. This was added particularly for checksums, but we were already tracking more satistics for these objects, just not returning it. Also add a checksum_last_failure column that holds the timestamptz of the last checksum failure that occurred in a database (or in a non-dataabase file), if any. Author: Julien Rouhaud <rjuju123@gmail.com>	2019-04-12 14:04:50 +02:00
Peter Eisentraut	ef6f30fe77	Fix REINDEX CONCURRENTLY of partitions In case of a partition index, when swapping the old and new index, we also need to attach the new index as a partition and detach the old one. Also, to handle partition indexes, we not only need to change dependencies referencing the index, but also dependencies of the index referencing something else. The previous code did this only specifically for a constraint, but we also need to do this for partitioned indexes. So instead write a generic function that does it for all dependencies. Author: Michael Paquier <michael@paquier.xyz> Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/DF4PR8401MB11964EDB77C860078C343BEBEE5A0%40DF4PR8401MB1196.NAMPRD84.PROD.OUTLOOK.COM#154df1fedb735190a773481765f7b874	2019-04-12 08:36:05 +02:00
Thomas Munro	f7feb020c3	Fix GetNewTransactionId()'s interaction with xidVacLimit. Commit `ad308058` switched to returning a FullTransactionId, but failed to load the potentially updated value in the case where xidVacLimit is reached and we release and reacquire the lock. Repair, closing bug #15727. While reviewing that commit, also fix the size computation used by EstimateTransactionStateSize() and switch to the mul_size() macro traditionally used in such expressions. Author: Thomas Munro Reported-by: Roman Zharkov Discussion: https://postgr.es/m/15727-0be246e7d852d229%40postgresql.org	2019-04-12 16:47:50 +12:00
Michael Paquier	d87ab88686	Fix typos in reloptions.c Author: Kirk Jamison Discussion: https://postgr.es/m/D09B13F772D2274BB348A310EE3027C6493463@g01jpexmbkw24	2019-04-12 12:56:38 +09:00
Michael Paquier	d4e2a843e6	Switch TAP tests of pg_rewind to use a role with minimal permissions Up to now the tests of pg_rewind have been using a superuser for all the tests (which is the default of many tests actually, and something that ought to be reviewed) when involving an online source server, still it is possible to use a non-superuser role to do that as long as this role is granted permissions to execute all the source-side functions used for the rewind. This is possible since v11, and was already documented as of `bfc8068`. This will allow to catch up easily any change in pg_rewind if the tool begins to use more backend-side functions, so as the properties introduced by v11 are kept. Per suggestion from Peter Eisentraut. Author: Michael Paquier Reviewed-by: Magnus Hagander Discussion: https://postgr.es/m/20190411041336.GM2728@paquier.xyz	2019-04-12 10:46:43 +09:00
Michael Paquier	d527fda621	Fix more strcmp() calls using boolean-like comparisons for result checks Such calls can confuse the reader as strcmp() uses an integer as result. The places patched here have been spotted by Thomas Munro, David Rowley and myself. Author: Michael Paquier Reviewed-by: David Rowley Discussion: https://postgr.es/m/20190411021946.GG2728@paquier.xyz	2019-04-12 10:16:49 +09:00
Tom Lane	798070ec05	Re-order some regression test scripts for more parallelism. Move the strings, numerology, insert, insert_conflict, select and errors tests to be parts of nearby parallel groups, instead of executing by themselves. (Moving "select" required adjusting the constraints test, which uses a table named "tmp" as select also does. There don't seem to be any other conflicts.) Move psql and stats_ext to the next parallel group, where the rules test also has a long runtime. To make it safe to run stats_ext in parallel with rules, I adjusted the latter to only dump views/rules from the pg_catalog and public schemas, which was what it was doing anyway. stats_ext makes some views in a transient schema, which now will not affect rules. Reorder serial_schedule to match parallel_schedule. Discussion: https://postgr.es/m/735.1554935715@sss.pgh.pa.us	2019-04-11 18:16:50 -04:00
Tom Lane	5874c70557	Speed up sort-order-comparison tests in create_index_spgist. This test script verifies that KNN searches of an SP-GiST index produce the same sort order as a seqscan-and-sort. The FULL JOINs used for that are exceedingly slow, however. Investigation shows that the problem is that the initial join is on the rank() values, and we have a lot of duplicates due to the data set containing 1000 duplicate points. We're therefore going to produce 1000000 join rows that have to be thrown away again by the join filter. We can improve matters by using row_number() instead of rank(), so that the initial join keys are unique. The catch is that that makes the results sensitive to the sorting of rows with equal distances from the reference point. That doesn't matter for the actually-equal points, but as luck would have it, the data set also contains two distinct points that have identical distances to the origin. So those two rows could legitimately appear in either order, causing unwanted output from the check queries. However, it doesn't seem like it's the job of this test to check whether the <-> operator correctly computes distances; its charter is just to verify that SP-GiST emits the values in distance order. So we can dodge the indeterminacy problem by having the check only compare row numbers and distances not the actual point values. This change reduces the run time of create_index_spgist by a good three-quarters, on my machine, with ensuing beneficial effects on the runtime of create_index (thanks to interactions with CREATE INDEX CONCURRENTLY tests in the latter). I see a net improvement of more than 2X in the runtime of their parallel test group. Discussion: https://postgr.es/m/735.1554935715@sss.pgh.pa.us	2019-04-11 17:01:35 -04:00
Tom Lane	385d396b80	Split up a couple of long-running regression test scripts. The point of this change is to increase the potential for parallelism while running the core regression tests. Most people these days are using parallel testing modes on multi-core machines, so we might as well try a bit harder to keep multiple cores busy. Hence, a test that runs much longer than others in its parallel group is a candidate to be sub-divided. In this patch, create_index.sql and join.sql are split up. I haven't changed the content of the tests in any way, just moved them. I moved create_index.sql's SP-GiST-related tests into a new script create_index_spgist, and moved its btree multilevel page deletion test over to the existing script btree_index. (btree_index is a more natural home for that test, and it's shorter than others in its parallel group, so this doesn't hurt total runtime of that group.) There might be room for more aggressive splitting of create_index, but this is enough to improve matters considerably. Likewise, I moved join.sql's "exercises for the hash join code" into a new file join_hash. Those exercises contributed three-quarters of the script's runtime. Which might well be excessive ... but for the moment, I'm satisfied with shoving them into a different parallel group, where they can share runtime with the roughly-equally-lengthy gist test. (Note for anybody following along at home: there are interesting interactions between the runtimes of create_index and anything running in parallel with it, because the tests of CREATE INDEX CONCURRENTLY in that file will repeatedly block waiting for concurrent transactions to commit. As committed in this patch, create_index and create_index_spgist have roughly equal runtimes, but that's mostly an artifact of forced synchronization of the CONCURRENTLY tests; when run serially, create_index is much faster. A followup patch will reduce the runtime of create_index_spgist and thereby also create_index.) Discussion: https://postgr.es/m/735.1554935715@sss.pgh.pa.us	2019-04-11 16:15:54 -04:00
Tom Lane	6726d8d476	Move plpgsql error-trapping tests to a new module-specific test file. The test for statement timeout has a 2-second timeout, which was only moderately annoying when it was written, but nowadays it contributes a pretty significant chunk of the elapsed time needed to run the core regression tests on a fast machine. We can improve this situation by pushing the test into a plpgsql-specific test file instead of having it in a core regression test. That's a clean win when considering just the core tests. Even when considering check-world or a buildfarm test run, we should come out ahead because the core tests get run more times in those sequences. Furthermore, since the plpgsql tests aren't currently parallelized, it seems likely that the timing problems reflected in commit `f1e671a0b` (which increased that timeout from 1 sec to 2) will be much less severe in this context. Hence, let's try cutting the timeout back to 1 second in hopes of a further win for check-world. We can undo that if buildfarm experience proves it to be a bad idea. To give the new test file some modicum of intellectual coherency, I moved the surrounding tests related to error-trapping along with the statement timeout test proper. Those other tests don't run long enough to have any particular bearing on test-runtime considerations. The tests are the same as before, except with minor adjustments to not depend on an externally-created table. Discussion: https://postgr.es/m/735.1554935715@sss.pgh.pa.us	2019-04-11 15:09:28 -04:00
Michael Meskes	ed16ba3248	Fix off-by-one check that can lead to a memory overflow in ecpg. Patch by Liu Huailing <liuhuailing@cn.fujitsu.com>	2019-04-11 20:56:17 +02:00
Tom Lane	4aaa3b5cf1	Remove duplicative polygon SP-GiST sequencing test. Code coverage comparisons confirm that the tests using quad_poly_tbl_ord_seq1/quad_poly_tbl_ord_idx1 hit no code paths not also covered by the similar tests using quad_poly_tbl_ord_seq2/quad_poly_tbl_ord_idx2. Since these test cases are pretty expensive, they need to contribute more than zero benefit. In passing, make quad_poly_tbl_ord_seq2 a temp table, since there seems little reason to keep it around after the test. Discussion: https://postgr.es/m/735.1554935715@sss.pgh.pa.us	2019-04-11 14:35:47 -04:00
Tom Lane	f72d9a5e7d	Remove redundant and ineffective test for btree insertion fast path. indexing.sql's test for this feature was added along with the feature in commit `2b2727343`. However, shortly later that test was rendered ineffective by commit `074251db6`, which limited when the optimization would be applied, so that the test didn't test it. Since then, commit `dd299df81` added new tests (in btree_index.sql) that actually do test the feature. Code coverage comparisons confirm that this test sequence adds no meaningful coverage, and it's rather expensive, accounting for nearly half of the runtime of indexing.sql according to my measurements. So let's remove it. Per advice from Peter Geoghegan. Discussion: https://postgr.es/m/735.1554935715@sss.pgh.pa.us	2019-04-11 13:15:59 -04:00
Alvaro Herrera	65d857d92c	Fix declaration after statement This style is frowned upon. I inadvertently introduced one in commit `fe0e0b4fc7`. (My compiler does not complain about it, even though -Wdeclaration-after-statement is specified. Weird.) Author: Masahiko Sawada	2019-04-10 22:31:40 -04:00
Tom Lane	4cae471d1b	Fix backwards test in operator_precedence_warning logic. Warnings about unary minus might have been wrong. It's a bit surprising that nobody noticed yet ... probably the precedence-warning feature hasn't really been used much in the field. Rikard Falkeborn Discussion: https://postgr.es/m/CADRDgG6fzA8A2oeygUw4=o7ywo4kvz26NxCSgpq22nMD73Bx4Q@mail.gmail.com	2019-04-10 19:02:21 -04:00
Peter Eisentraut	765525c8c2	pg_restore: Make not verbose by default This was accidentally changed in `cc8d415117`. Reported-by: Christoph Berg <myon@debian.org>	2019-04-10 11:45:00 +02:00
Amit Kapila	bdf35744bd	Avoid counting transaction stats for parallel worker cooperating transaction. The transaction that is initiated by the parallel worker to cooperate with the actual transaction started by the main backend to complete the query execution should not be counted as a separate transaction. The other internal transactions started and committed by the parallel worker are still counted as separate transactions as we that is what we do in other places like autovacuum. This will partially fix the bloat in transaction stats due to additional transactions performed by parallel workers. For a complete fix, we need to decide how we want to show all the transactions that are started internally for various operations and that is a matter of separate patch. Reported-by: Haribabu Kommi Author: Haribabu Kommi Reviewed-by: Amit Kapila, Jamison Kirk and Rahila Syed Backpatch-through: 9.6 Discussion: https://postgr.es/m/CAJrrPGc9=jKXuScvNyQ+VNhO0FZk7LLAShAJRyZjnedd2D61EQ@mail.gmail.com	2019-04-10 08:24:15 +05:30
Thomas Munro	d614aae02e	Improve comment in sync.h. Per off-list complaint from Andres Freund.	2019-04-10 12:49:49 +12:00
Thomas Munro	255044889d	Fix typos.	2019-04-10 09:21:06 +12:00
Tom Lane	9476131278	Prevent inlining of multiply-referenced CTEs with outer recursive refs. This has to be prevented because inlining would result in multiple self-references, which we don't support (and in fact that's disallowed by the SQL spec, see statements about linearly vs. nonlinearly recursive queries). Bug fix for commit `608b167f9`. Per report from Yaroslav Schekin (via Andrew Gierth) Discussion: https://postgr.es/m/87wolmg60q.fsf@news-spur.riddles.org.uk	2019-04-09 15:47:35 -04:00
Alvaro Herrera	4dba0f6dc4	Fix typo	2019-04-09 13:00:12 -04:00
Alvaro Herrera	fe0e0b4fc7	Fix memory leak in pgbench Commit `25ee70511e` introduced a memory leak in pgbench: some PGresult structs were not being freed during error bailout, because we're now doing more PQgetResult() calls than previously. Since there's more cleanup code outside the discard_response() routine than in it, refactor the cleanup code, removing the routine. This has little effect currently, since we abandon processing after hitting errors, but if we ever get further pgbench features (such as testing for serializable transactions), it'll matter. Per Coverity. Reviewed-by: Michaël Paquier	2019-04-09 12:46:34 -04:00
Tom Lane	a2418f9e23	Test some more cases with partitioned tables in EvalPlanQual. We weren't testing anything involving EPQ on UPDATEs that move tuples into different partitions. Depending on the implementation, it might be that these cases aren't actually very interesting ... but given our thin coverage of EPQ in general, I think it's a good idea to have a test case. Amit Langote, minor tweak by me Discussion: https://postgr.es/m/7889df35-ad1a-691a-00e3-4d4b18f364e3@lab.ntt.co.jp	2019-04-09 11:43:03 -04:00
Noah Misch	ba3fb5d4fb	Define WIN32_STACK_RLIMIT throughout win32 and cygwin builds. The MSVC build system already did this, and commit `617dc6d299` used it in a second file. Back-patch to 9.4, like that commit. Discussion: https://postgr.es/m/CAA8=A7_1SWc3+3Z=-utQrQFOtrj_DeohRVt7diA2tZozxsyUOQ@mail.gmail.com	2019-04-09 08:25:39 -07:00
Peter Eisentraut	9efe068e48	Replace tabs with spaces in one .sql file Let's at least keep this consistent within the same file.	2019-04-09 15:54:37 +02:00
Heikki Linnakangas	16954e22e2	Fix example in comment. Author: Adrien Nayrat	2019-04-09 08:33:42 +03:00
Noah Misch	617dc6d299	Avoid "could not reattach" by providing space for concurrent allocation. We've long had reports of intermittent "could not reattach to shared memory" errors on Windows. Buildfarm member dory fails that way when PGSharedMemoryReAttach() execution overlaps with creation of a thread for the process's "default thread pool". Fix that by providing a second region to receive asynchronous allocations that would otherwise intrude into UsedShmemSegAddr. In pgwin32_ReserveSharedMemoryRegion(), stop trying to free reservations landing at incorrect addresses; the caller's next step has been to terminate the affected process. Back-patch to 9.4 (all supported versions). Reviewed by Tom Lane. He also did much of the prerequisite research; see commit `bcbf2346d6`. Discussion: https://postgr.es/m/20190402135442.GA1173872@rfd.leadboat.com	2019-04-08 21:39:00 -07:00
Andres Freund	6421011ea2	tableam: comment and formatting fixes. Author: Heikki Linnakangas Discussion: https://postgr.es/m/9a7fb9cc-2419-5db7-8840-ddc10c93f122@iki.fi	2019-04-08 16:24:36 -07:00
Tom Lane	45f8eaa8e3	Fix improper interaction of FULL JOINs with lateral references. join_is_legal() needs to reject forming certain outer joins in cases where that would lead the planner down a blind alley. However, it mistakenly supposed that the way to handle full joins was to treat them as applying the same constraints as for left joins, only to both sides. That doesn't work, as shown in bug #15741 from Anthony Skorski: given a lateral reference out of a join that's fully enclosed by a full join, the code would fail to believe that any join ordering is legal, resulting in errors like "failed to build any N-way joins". However, we don't really need to consider full joins at all for this purpose, because we effectively force them to be evaluated in syntactic order, and that order is always legal for lateral references. Hence, get rid of this broken logic for full joins and just ignore them instead. This seems to have been an oversight in commit `7e19db0c0`. Back-patch to all supported branches, as that was. Discussion: https://postgr.es/m/15741-276f1f464b3f40eb@postgresql.org	2019-04-08 16:09:26 -04:00
Tom Lane	a8cb8f1246	Fix EvalPlanQualStart to handle partitioned result rels correctly. The es_root_result_relations array needs to be shallow-copied in the same way as the main es_result_relations array, else EPQ rechecks on partitioned result relations fail, as seen in bug #15677 from Norbert Benkocs. Amit Langote, isolation test case added by me Discussion: https://postgr.es/m/15677-0bf089579b4cd02d@postgresql.org Discussion: https://postgr.es/m/19321.1554567786@sss.pgh.pa.us	2019-04-08 12:20:22 -04:00
Fujii Masao	119dcfad98	Add vacuum_truncate reloption. vacuum_truncate controls whether vacuum tries to truncate off any empty pages at the end of the table. Previously vacuum always tried to do the truncation. However, the truncation could cause some problems; for example, ACCESS EXCLUSIVE lock needs to be taken on the table during the truncation and can cause the query cancellation on the standby even if hot_standby_feedback is true. Setting this reloption to false can be helpful to avoid such problems. Author: Tsunakawa Takayuki Reviewed-By: Julien Rouhaud, Masahiko Sawada, Michael Paquier, Kirk Jamison and Fujii Masao Discussion: https://postgr.es/m/CAHGQGwE5UqFqSq1=kV3QtTUtXphTdyHA-8rAj4A=Y+e4kyp3BQ@mail.gmail.com	2019-04-08 16:43:57 +09:00
Andres Freund	4c9e1bd0a3	Reset memory context once per tuple in validateForeignKeyConstraint. When using tableam ExecFetchSlotHeapTuple() might return a separately allocated tuple. We could use the shouldFree argument to explicitly free it, but it seems more robust to to protect Also add a CHECK_FOR_INTERRUPTS() after each tuple. It's likely that each AM has (heap does) a CFI somewhere in the relevant path, but it seems more robust to have one in validateForeignKeyConstraint() itself. Note that this only affects the cases that couldn't be optimized to be verified with a query. Author: Andres Freund Reviewed-By: Tom Lane (in an earlier version) Discussion: https://postgr.es/m/19030.1554574075@sss.pgh.pa.us https://postgr.es/m/CAKJS1f_SHKcPYMsi39An5aUjhAcEMZb6Cx1Sj1QWEWSiKJkBVQ@mail.gmail.com https://postgr.es/m/20180711185628.mrvl46bjgk2uxoki@alap3.anarazel.de	2019-04-07 22:42:42 -07:00
Andres Freund	41f5e04aec	Fix a number of issues around modifying a previously updated row. This commit fixes three, unfortunately related, issues: 1) Since `5db6df0c01`, the introduction of DML via tableam, it was possible to trigger "ERROR: unexpected table_lock_tuple status: 1" when updating a row that was previously updated in the same transaction - but only when the previously updated row was before updated in a concurrent transaction (and READ COMMITTED was used). The reason for that was that that case simply wasn't expected. Fixing that lead to: 2) Even before the above commit, there were error checks (introduced in `6868ed7491`) preventing a row being updated by different commands within the same statement (say in a function called by an UPDATE) - but that check wasn't performed when the row was first updated in a concurrent transaction - instead the second update was silently skipped in that case. After this change we throw the same error as we'd without the concurrent transaction. 3) The error messages (introduced in `6868ed7491`) preventing such updates emitted the same error message for both DELETE and UPDATE ("tuple to be updated was already modified by an operation triggered by the current command"). While that could be changed separately, it made it hard to write tests that verify the correct correct behavior of the code. This commit changes heap's implementation of table_lock_tuple() to return TM_SelfModified instead of TM_Invisible (previously loosely modeled after EvalPlanQualFetch), and teaches nodeModifyTable.c to handle that in response to table_lock_tuple() and not just in response to table_(delete\|update). Additionally it fixes the wrong error message (see 3 above). The comment for table_lock_tuple() is also adjusted to state that TM_Deleted won't return information in TM_FailureData - it'll not always be available. This also adds tests to ensure that DELETE/UPDATE correctly error out when affecting a row that concurrently was modified by another transaction. Author: Andres Freund Reported-By: Tom Lane, when investigating a bug bug fix to another bug by Amit Langote Discussion: https://postgr.es/m/19321.1554567786@sss.pgh.pa.us	2019-04-07 22:14:47 -07:00
Michael Paquier	964bae4d84	Add more tests for partition tuple routing with dropped attributes As bug #15733 has proved, we are lacking coverage for partition tuple routing with dropped attributes when involving three levels of partitioning or more. There was only an active bug in this area for v11, and HEAD is proving to handle those scenarios fine, still it lacked some coverage for the previous problem. Author: Amit Langote, Michael Paquier Discussion: https://postgr.es/m/15733-7692379e310b80ec@postgresql.org	2019-04-08 13:44:55 +09:00
Tom Lane	80a96e066e	Avoid fetching past the end of the indoption array. pg_get_indexdef_worker carelessly fetched indoption entries even for non-key index columns that don't have one. 99.999% of the time this would be harmless, since the code wouldn't examine the value ... but some fine day this will be a fetch off the end of memory, resulting in SIGSEGV. Detected through valgrind testing. Odd that the buildfarm's valgrind critters haven't noticed.	2019-04-07 18:19:16 -04:00
Alvaro Herrera	1c5d9270e3	psql \dP: list partitioned tables and indexes The new command lists partitioned relations (tables and/or indexes), possibly with their sizes, possibly including partitioned partitions; their parents (if not top-level); if indexes show the tables they belong to; and their descriptions. While there are various possible improvements to this, having it in this form is already a great improvement over not having any way to obtain this report. Author: Pavel Stěhule, with help from Mathias Brossard, Amit Langote and Justin Pryzby. Reviewed-by: Amit Langote, Mathias Brossard, Melanie Plageman, Michaël Paquier, Álvaro Herrera	2019-04-07 15:07:21 -04:00
Tom Lane	159970bcad	Clean up side-effects of commits `ab5fcf2b0` et al. Before those commits, partitioning-related code in the executor could assume that ModifyTableState.resultRelInfo[] contains only leaf partitions. However, now a fully-pruned update results in a dummy ModifyTable that references the root partitioned table, and that breaks some stuff. In v11, this led to an assertion or core dump in the tuple routing code. Fix by disabling tuple routing, since we don't need that anyway. (I chose to do that in HEAD as well for safety, even though the problem doesn't manifest in HEAD as it stands.) In v10, this confused ExecInitModifyTable's decision about whether it needed to close the root table. But we can get rid of that altogether by being smarter about where to find the root table. Note that since the referenced commits haven't shipped yet, this isn't fixing any bug the field has seen. Amit Langote, per a report from me Discussion: https://postgr.es/m/20710.1554582479@sss.pgh.pa.us	2019-04-07 12:54:22 -04:00
Peter Eisentraut	03f9e5cba0	Report progress of REINDEX operations This uses the same infrastructure that the CREATE INDEX progress reporting uses. Add a column to pg_stat_progress_create_index to report the OID of the index being worked on. This was not necessary for CREATE INDEX, but it's useful for REINDEX. Also edit the phase descriptions a bit to be more consistent with the source code comments. Discussion: https://www.postgresql.org/message-id/ef6a6757-c36a-9e81-123f-13b19e36b7d7%402ndquadrant.com	2019-04-07 12:35:29 +02:00
Peter Eisentraut	106f2eb664	Cast pg_stat_progress_cluster.cluster_index_relid to oid It's tracked internally as bigint, but when presented to the user it should be oid.	2019-04-07 10:31:32 +02:00
Tom Lane	46e3442c9e	Fix failures in validateForeignKeyConstraint's slow path. The foreign-key-checking loop in ATRewriteTables failed to ignore relations without storage (e.g., partitioned tables), unlike the initial loop. This accidentally worked as long as RI_Initial_Check succeeded, which it does in most practical cases (including all the ones exercised in the existing regression tests :-(). However, if that failed, as for instance when there are permissions issues, then we entered the slow fire-the-trigger-on-each-tuple path. And that would try to read from the referencing relation, and fail if it lacks storage. A second problem, recently introduced in HEAD, was that this loop had been broken by sloppy refactoring for the tableam API changes. Repair both issues, and add a regression test case so we have some coverage on this code path. Back-patch as needed to v11. (It looks like this code could do with additional bulletproofing, but let's get a working test case in place first.) Hadi Moshayedi, Tom Lane, Andres Freund Discussion: https://postgr.es/m/CAK=1=WrnNmBbe5D9sm3t0a6dnAq3cdbF1vXY816j1wsMqzC8bw@mail.gmail.com Discussion: https://postgr.es/m/19030.1554574075@sss.pgh.pa.us Discussion: https://postgr.es/m/20190325180405.jytoehuzkeozggxx%40alap3.anarazel.de	2019-04-06 15:09:09 -04:00
Michael Paquier	249d649996	Add support TCP user timeout in libpq and the backend server Similarly to the set of parameters for keepalive, a connection parameter for libpq is added as well as a backend GUC, called tcp_user_timeout. Increasing the TCP user timeout is useful to allow a connection to survive extended periods without end-to-end connection, and decreasing it allows application to fail faster. By default, the parameter is 0, which makes the connection use the system default, and follows a logic close to the keepalive parameters in its handling. When connecting through a Unix-socket domain, the parameters have no effect. Author: Ryohei Nagaura Reviewed-by: Fabien Coelho, Robert Haas, Kyotaro Horiguchi, Kirk Jamison, Mikalai Keida, Takayuki Tsunakawa, Andrei Yahorau Discussion: https://postgr.es/m/EDA4195584F5064680D8130B1CA91C45367328@G01JPEXMBYT04	2019-04-06 15:23:37 +09:00
Tom Lane	959d00e9db	Use Append rather than MergeAppend for scanning ordered partitions. If we need ordered output from a scan of a partitioned table, but the ordering matches the partition ordering, then we don't need to use a MergeAppend to combine the pre-ordered per-partition scan results: a plain Append will produce the same results. This both saves useless comparison work inside the MergeAppend proper, and allows us to start returning tuples after istarting up just the first child node not all of them. However, all is not peaches and cream, because if some of the child nodes have high startup costs then there will be big discontinuities in the tuples-returned-versus-elapsed-time curve. The planner's cost model cannot handle that (yet, anyway). If we model the Append's startup cost as being just the first child's startup cost, we may drastically underestimate the cost of fetching slightly more tuples than are available from the first child. Since we've had bad experiences with over-optimistic choices of "fast start" plans for ORDER BY LIMIT queries, that seems scary. As a klugy workaround, set the startup cost estimate for an ordered Append to be the sum of its children's startup costs (as MergeAppend would). This doesn't really describe reality, but it's less likely to cause a bad plan choice than an underestimated startup cost would. In practice, the cases where we really care about this optimization will have child plans that are IndexScans with zero startup cost, so that the overly conservative estimate is still just zero. David Rowley, reviewed by Julien Rouhaud and Antonin Houska Discussion: https://postgr.es/m/CAKJS1f-hAqhPLRk_RaSFTgYxd=Tz5hA7kQ2h4-DhJufQk8TGuw@mail.gmail.com	2019-04-05 19:20:43 -04:00
Alvaro Herrera	9f06d79ef8	Add facility to copy replication slots This allows the user to create duplicates of existing replication slots, either logical or physical, and even changing properties such as whether they are temporary or the output plugin used. There are multiple uses for this, such as initializing multiple replicas using the slot for one base backup; when doing investigation of logical replication issues; and to select a different output plugins. Author: Masahiko Sawada Reviewed-by: Michael Paquier, Andres Freund, Petr Jelinek Discussion: https://postgr.es/m/CAD21AoAm7XX8y_tOPP6j4Nzzch12FvA1wPqiO690RCk+uYVstg@mail.gmail.com	2019-04-05 18:05:18 -03:00
Thomas Munro	de2b38419c	Wake up interested backends when a checkpoint fails. Commit `c6c9474a` switched to condition variables instead of sleep loops to notify backends of checkpoint start and stop, but forgot to broadcast in case of checkpoint failure. Author: Thomas Munro Discussion: https://postgr.es/m/CA%2BhUKGJKbCd%2B_K%2BSEBsbHxVT60SG0ivWHHAdvL0bLTUt2xpA2w%40mail.gmail.com	2019-04-06 09:31:48 +13:00
Tom Lane	478cacb50e	Ensure consistent name matching behavior in processSQLNamePattern(). Prior to v12, if you used a collation-sensitive regex feature in a pattern handled by processSQLNamePattern() (for instance, \d '\\w+' in psql), the behavior you got matched the database's default collation. Since commit `586b98fdf` you'd usually get C-collation behavior, because the catalog "name"-type columns are now marked as COLLATE "C". Add explicit COLLATE specifications to restore the prior behavior. (Note for whoever writes the v12 release notes: the need for this shows that while `586b98fdf` preserved pre-v12 behavior of "name" columns for simple comparison operators, it changed the behavior of regex operators on those columns. Although this patch fixes it for pattern matches generated by our own tools, user-written queries will still be affected. So we'd better mention this issue as a compatibility item.) Daniel Vérité Discussion: https://postgr.es/m/701e51f0-0ec0-4e70-a365-1958d66dd8d2@manitou-mail.org	2019-04-05 12:59:57 -04:00
Peter Eisentraut	edda32ee25	Fix compiler warning Rewrite get_attgenerated() to avoid compiler warning if the compiler does not recognize that elog(ERROR) does not return. Reported-by: David Rowley <david.rowley@2ndquadrant.com>	2019-04-05 09:23:07 +02:00
Noah Misch	82150a05be	Revert "Consistently test for in-use shared memory." This reverts commits `2f932f71d9`, `16ee6eaf80` and `6f0e190056`. The buildfarm has revealed several bugs. Back-patch like the original commits. Discussion: https://postgr.es/m/20190404145319.GA1720877@rfd.leadboat.com	2019-04-05 00:00:52 -07:00
Thomas Munro	794c543b17	Fix bugs in mdsyncfiletag(). Commit `3eb77eba` moved a _mdfd_getseg() call from mdsync() into a new callback function mdsyncfiletag(), but didn't get the arguments quite right. Without the EXTENSION_DONT_CHECK_SIZE flag we fail to open a segment if lower-numbered segments have been truncated, and it wants a block number rather than a segment number. While comparing with the older coding, also remove an unnecessary clobbering of errno, and adjust the code in mdunlinkfiletag() to ressemble the original code from mdpostckpt() more closely instead of using an unnecessary call to smgropen(). Author: Thomas Munro Discussion: https://postgr.es/m/CA%2BhUKGL%2BYLUOA0eYiBXBfwW%2BbH5kFgh94%3DgQH0jHEJ-t5Y91wQ%40mail.gmail.com	2019-04-05 17:41:58 +13:00
Stephen Frost	c46c85d459	Handle errors during GSSAPI startup better There was some confusion over the format of the error message returned from the server during GSSAPI startup; specifically, it was expected that a length would be returned when, in reality, at this early stage in the startup sequence, no length is returned from the server as part of an error message. Correct the client-side code for dealing with error messages sent by the server during startup by simply reading what's available into our buffer, after we've discovered it's an error message, and then reporting back what was returned. In passing, also add in documentation of the environment variable PGGSSENCMODE which was missed previously, and adjust the code to look for the PGGSSENCMODE variable (the environment variable change was missed in the prior GSSMODE -> GSSENCMODE commit). Error-handling issue discovered by Peter Eisentraut, the rest were items discovered during testing of the error handling.	2019-04-04 22:52:42 -04:00
Andres Freund	57a7a3adfe	Remove unused struct member, enforce multi_insert callback presence. Author: David Rowley, Andres Freund Discussion: https://postgr.es/m/CAKJS1f9=9phmm66diAji4gvHnWSrK7BGFoNct+mEUT_c8pPOjw@mail.gmail.com	2019-04-04 17:39:39 -07:00
Andres Freund	ea97e440b8	Harden tableam against nonexistant / wrong kind of AMs. Previously it was allowed to set default_table_access_method to an empty string. That makes sense for default_tablespace, where that was copied from, as it signals falling back to the database's default tablespace. As there is no equivalent for table AMs, forbid that. Also make sure to throw a usable error when creating a table using an index AM, by using get_am_type_oid() to implement get_table_am_oid() instead of a separate copy. Previously we'd error out only later, in GetTableAmRoutine(). Thirdly remove GetTableAmRoutineByAmId() - it was only used in an earlier version of `8586bf7ed8`. Add tests for the above (some for index AMs as well).	2019-04-04 17:39:39 -07:00
Peter Geoghegan	344b7e11bb	Add test coverage for rootdescend verification. Commit `c1afd175`, which added support for rootdescend verification to amcheck, added only minimal regression test coverage. Address this by making sure that rootdescend verification is run on a multi-level index. In passing, simplify some of the regression tests that exercise multi-level nbtree page deletion. Both issues spotted while rereviewing coverage of the nbtree patch series using gcov.	2019-04-04 17:25:35 -07:00
Andres Freund	86b85044e8	tableam: Add table_multi_insert() and revamp/speed-up COPY FROM buffering. This adds table_multi_insert(), and converts COPY FROM, the only user of heap_multi_insert, to it. A simple conversion of COPY FROM use slots would have yielded a slowdown when inserting into a partitioned table for some workloads. Different partitions might need different slots (both slot types and their descriptors), and dropping / creating slots when there's constant partition changes is measurable. Thus instead revamp the COPY FROM buffering for partitioned tables to allow to buffer inserts into multiple tables, flushing only when limits are reached across all partition buffers. By only dropping slots when there've been inserts into too many different partitions, the aforementioned overhead is gone. By allowing larger batches, even when there are frequent partition changes, we actuall speed such cases up significantly. By using slots COPY of very narrow rows into unlogged / temporary might slow down very slightly (due to the indirect function calls). Author: David Rowley, Andres Freund, Haribabu Kommi Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de https://postgr.es/m/20190327054923.t3epfuewxfqdt22e@alap3.anarazel.de	2019-04-04 16:28:18 -07:00
Tom Lane	7bac3acab4	Add a "SQLSTATE-only" error verbosity option to libpq and psql. This is intended for use mostly in test scripts for external tools, which could do without cross-PG-version variations in error message wording. Of course, the SQLSTATE isn't guaranteed stable either, but it should be more so than the error message text. Note: there's a bit of an ABI change for libpq here, but it seems OK because if somebody compiles against a newer version of libpq-fe.h, and then tries to pass PQERRORS_SQLSTATE to PQsetErrorVerbosity() of an older libpq library, it will be accepted and then act like PQERRORS_DEFAULT, thanks to the way the tests in pqBuildErrorMessage3 have historically been phrased. That seems acceptable. Didier Gautheron, reviewed by Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/CAJRYxuKyj4zA+JGVrtx8OWAuBfE-_wN4sUMK4H49EuPed=mOBw@mail.gmail.com	2019-04-04 17:22:02 -04:00
Alvaro Herrera	413ccaa74d	pg_restore: Require "-f -" to mean stdout The previous convention that stdout was selected by default when nothing is specified was just too error-prone. After a suggestion from Andrew Gierth. Author: Euler Taveira Reviewed-by: Yoshikazu Imai, José Arthur Benetasso Villanova Discussion: https://postgr.es/m/87sgwrmhdv.fsf@news-spur.riddles.org.uk	2019-04-04 16:54:15 -03:00
Tom Lane	9c703c169a	Make queries' locking of indexes more consistent. The assertions added by commit `b04aeb0a0` exposed that there are some code paths wherein the executor will try to open an index without holding any lock on it. We do have some lock on the index's table, so it seems likely that there's no fatal problem with this (for instance, the index couldn't get dropped from under us). Still, it's bad practice and we should fix it. To do so, remove the optimizations in ExecInitIndexScan and friends that tried to avoid taking a lock on an index belonging to a target relation, and just take the lock always. In non-bug cases, this will result in no additional shared-memory access, since we'll find in the local lock table that we already have a lock of the desired type; hence, no significant performance degradation should occur. Also, adjust the planner and executor so that the type of lock taken on an index is always identical to the type of lock taken for its table, by relying on the recently added RangeTblEntry.rellockmode field. This avoids some corner cases where that might not have been true before (possibly resulting in extra locking overhead), and prevents future maintenance issues from having multiple bits of logic that all needed to be in sync. In addition, this change removes all core calls to ExecRelationIsTargetRelation, which avoids a possible O(N^2) startup penalty for queries with large numbers of target relations. (We'd probably remove that function altogether, were it not that we advertise it as something that FDWs might want to use.) Also adjust some places in selfuncs.c to not take any lock on indexes they are transiently opening, since we can assume that plancat.c did that already. In passing, change gin_clean_pending_list() to take RowExclusiveLock not AccessShareLock on its target index. Although it's not clear that that's actually a bug, it seemed very strange for a function that's explicitly going to modify the index to use only AccessShareLock. David Rowley, reviewed by Julien Rouhaud and Amit Langote, a bit of further tweaking by me Discussion: https://postgr.es/m/19465.1541636036@sss.pgh.pa.us	2019-04-04 15:12:58 -04:00
Robert Haas	a96c41feec	Allow VACUUM to be run with index cleanup disabled. This commit adds a new reloption, vacuum_index_cleanup, which controls whether index cleanup is performed for a particular relation by default. It also adds a new option to the VACUUM command, INDEX_CLEANUP, which can be used to override the reloption. If neither the reloption nor the VACUUM option is used, the default is true, as before. Masahiko Sawada, reviewed and tested by Nathan Bossart, Alvaro Herrera, Kyotaro Horiguchi, Darafei Praliaskouski, and me. The wording of the documentation is mostly due to me. Discussion: http://postgr.es/m/CAD21AoAt5R3DNUZSjOoXDUY=naYPUOuffVsRzuTYMz29yLzQCA@mail.gmail.com	2019-04-04 15:04:43 -04:00
Peter Geoghegan	74eb2176bf	Invalidate binary search bounds consistently. _bt_check_unique() failed to invalidate binary search bounds in the event of a live conflict following commit `e5adcb78`. This resulted in problems after waiting for the conflicting xact to commit or abort. The subsequent call to _bt_check_unique() would restore the initial binary search bounds, rather than starting a new search. Fix by explicitly invalidating bounds when it becomes clear that there is a live conflict that insertion will have to wait to resolve. Ashutosh Sharma, with a few additional tweaks by me. Author: Ashutosh Sharma Reported-By: Ashutosh Sharma Diagnosed-By: Ashutosh Sharma Discussion: https://postgr.es/m/CAE9k0PnQp-qr-UYKMSCzdC2FBzdE4wKP41hZrZvvP26dKLonLg@mail.gmail.com	2019-04-04 09:38:08 -07:00
Stephen Frost	87e16db5eb	Move the be_gssapi_get_* prototypes The be_gssapi_get_* prototypes were put close to similar ones for SSL- but a bit too close since that meant they ended up only being included for SSL-enabled builds. Move those to be under ENABLE_GSS instead. Pointed out by Tom.	2019-04-04 11:11:46 -04:00
Thomas Munro	3eb77eba5a	Refactor the fsync queue for wider use. Previously, md.c and checkpointer.c were tightly integrated so that fsync calls could be handed off and processed in the background. Introduce a system of callbacks and file tags, so that other modules can hand off fsync work in the same way. For now only md.c uses the new interface, but other users are being proposed. Since there may be use cases that are not strictly SMGR implementations, use a new function table for sync handlers rather than extending the traditional SMGR one. Instead of using a bitmapset of segment numbers for each RelFileNode in the checkpointer's hash table, make the segment number part of the key. This requires sending explicit "forget" requests for every segment individually when relations are dropped, but suits the file layout schemes of proposed future users better (ie sparse or high segment numbers). Author: Shawn Debnath and Thomas Munro Reviewed-by: Thomas Munro, Andres Freund Discussion: https://postgr.es/m/CAEepm=2gTANm=e3ARnJT=n0h8hf88wqmaZxk0JYkxw+b21fNrw@mail.gmail.com	2019-04-04 23:38:38 +13:00
Noah Misch	6f0e190056	Silence -Wimplicit-fallthrough in sysv_shmem.c. Commit `2f932f71d9` added code that elicits a warning on buildfarm member flaviventris. Back-patch to 9.4, like that commit. Reported by Andres Freund. Discussion: https://postgr.es/m/20190404020057.galelv7by75ekqrh@alap3.anarazel.de	2019-04-03 23:23:35 -07:00
Noah Misch	16ee6eaf80	Make src/test/recovery/t/017_shm.pl safe for concurrent execution. Buildfarm members idiacanthus and komodoensis, which share a host, both executed this test in the same second. That failed. Back-patch to 9.6, where the test first appeared. Discussion: https://postgr.es/m/20190404020543.GA1319573@rfd.leadboat.com	2019-04-03 23:16:37 -07:00
Michael Paquier	92c76021ae	Improve readability of some tests in strings.sql `c251336` has added some tests to check if a toast relation should be empty or not, hardcoding the toast relation name when calling pg_relation_size(). pg_class.reltoastrelid offers the same information, so simplify the tests to use that. Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/20190403065949.GH3298@paquier.xyz	2019-04-04 10:24:56 +09:00
Andres Freund	b73c3a1196	tableam: basic documentation. This adds documentation about the user oriented parts of table access methods (i.e. the default_table_access_method GUC and the USING clause for CREATE TABLE etc), adds a basic chapter about the table access method interface, and adds a note to storage.sgml that it's contents don't necessarily apply for non-builtin AMs. Author: Haribabu Kommi and Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-04-03 17:40:29 -07:00
Noah Misch	ab9ed9be23	Assert that pgwin32_signal_initialize() has been called early enough. Before the pgwin32_signal_initialize() call, the backend version of pg_usleep() has no effect. No in-tree code falls afoul of that today, but temporary commit `23078689a9` did so. Discussion: https://postgr.es/m/20190402135442.GA1173872@rfd.leadboat.com	2019-04-03 17:11:16 -07:00
Noah Misch	f433394e48	Handle USE_MODULE_DB for all tests able to use an installed postmaster. When $(MODULES) and $(MODULE_big) are empty, derive the database name from the first element of $(REGRESS) instead of using a constant string. When deriving the database name from $(MODULES), use its first element instead of the entire list; the earlier approach would fail if any multi-module directory had $(REGRESS) tests. Treat isolation suites and src/pl correspondingly. Under USE_MODULE_DB=1, installcheck-world and check-world no longer reuse any database name in a given postmaster. Buildfarm members axolotl, mandrill and frogfish saw spurious "is being accessed by other users" failures that would not have happened without database name reuse. (The CountOtherDBBackends() 5s deadline expired during DROP DATABASE; a backend for an earlier test suite had used the same database name and had not yet exited.) Back-patch to 9.4 (all supported versions), except bits pertaining to isolation suites. Concept reviewed by Andrew Dunstan, Andres Freund and Tom Lane. Discussion: https://postgr.es/m/20190401135213.GE891537@rfd.leadboat.com	2019-04-03 17:06:01 -07:00
Noah Misch	2f932f71d9	Consistently test for in-use shared memory. postmaster startup scrutinizes any shared memory segment recorded in postmaster.pid, exiting if that segment matches the current data directory and has an attached process. When the postmaster.pid file was missing, a starting postmaster used weaker checks. Change to use the same checks in both scenarios. This increases the chance of a startup failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1 postmaster.pid` && rm postmaster.pid && pg_ctl -w start". A postmaster will no longer recycle segments pertaining to other data directories. That's good for production, but it's bad for integration tests that crash a postmaster and immediately delete its data directory. Such a test now leaks a segment indefinitely. No "make check-world" test does that. win32_shmem.c already avoided all these problems. In 9.6 and later, enhance PostgresNode to facilitate testing. Back-patch to 9.4 (all supported versions). Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20130911033341.GD225735@tornado.leadboat.com	2019-04-03 17:03:46 -07:00
Tomas Vondra	ea569d64ac	Add SETTINGS option to EXPLAIN, to print modified settings. Query planning is affected by a number of configuration options, and it may be crucial to know which of those options were set to non-default values. With this patch you can say EXPLAIN (SETTINGS ON) to include that information in the query plan. Only options affecting planning, with values different from the built-in default are printed. This patch also adds auto_explain.log_settings option, providing the same capability in auto_explain module. Author: Tomas Vondra Reviewed-by: Rafia Sabih, John Naylor Discussion: https://postgr.es/m/e1791b4c-df9c-be02-edc5-7c8874944be0@2ndquadrant.com	2019-04-04 00:04:31 +02:00
Alvaro Herrera	d1f04b96b9	Tweak docs for log_statement_sample_rate Author: Justin Pryzby, partly after a suggestion from Masahiko Sawada Discussion: https://postgr.es/m/20190328135918.GA27808@telsasoft.com Discussion: https://postgr.es/m/CAD21AoB9+y8N4+Fan-ne-_7J5yTybPttxeVKfwUocKp4zT1vNQ@mail.gmail.com	2019-04-03 18:56:56 -03:00
Alvaro Herrera	799e220346	Log all statements from a sample of transactions This is useful to obtain a view of the different transaction types in an application, regardless of the durations of the statements each runs. Author: Adrien Nayrat Reviewed-by: Masahiko Sawada, Hayato Kuroda, Andres Freund	2019-04-03 18:43:59 -03:00
Tom Lane	d8c0bd9fef	Remove now-unnecessary thread pointer arguments in pgbench. Not required after nuking the zipfian thread-local cache. Also add a comment about hazardous pointer punning in threadRun(), and avoid using "thread" to refer to the threads array as a whole. Fabien Coelho and Tom Lane, per suggestion from Alvaro Herrera Discussion: https://postgr.es/m/alpine.DEB.2.21.1904032126060.7997@lancre	2019-04-03 17:16:09 -04:00
Tomas Vondra	c50b3158bf	Reduce overhead of pg_mcv_list (de)serialization Commit `ea4e1c0e8f` resolved issues with memory alignment in serialized pg_mcv_list values, but it required copying data to/from the varlena buffer during serialization and deserialization. As the MCV lits may be fairly large, the overhead (memory consumption, CPU usage) can get rather significant too. This change tweaks the serialization format so that the alignment is correct with respect to the varlena value, and so the parts may be accessed directly without copying the data. Catversion bump, as it affects existing pg_statistic_ext data.	2019-04-03 21:23:40 +02:00
Stephen Frost	b0b39f72b9	GSSAPI encryption support On both the frontend and backend, prepare for GSSAPI encryption support by moving common code for error handling into a separate file. Fix a TODO for handling multiple status messages in the process. Eliminate the OIDs, which have not been needed for some time. Add frontend and backend encryption support functions. Keep the context initiation for authentication-only separate on both the frontend and backend in order to avoid concerns about changing the requested flags to include encryption support. In postmaster, pull GSSAPI authorization checking into a shared function. Also share the initiator name between the encryption and non-encryption codepaths. For HBA, add "hostgssenc" and "hostnogssenc" entries that behave similarly to their SSL counterparts. "hostgssenc" requires either "gss", "trust", or "reject" for its authentication. Similarly, add a "gssencmode" parameter to libpq. Supported values are "disable", "require", and "prefer". Notably, negotiation will only be attempted if credentials can be acquired. Move credential acquisition into its own function to support this behavior. Add a simple pg_stat_gssapi view similar to pg_stat_ssl, for monitoring if GSSAPI authentication was used, what principal was used, and if encryption is being used on the connection. Finally, add documentation for everything new, and update existing documentation on connection security. Thanks to Michael Paquier for the Windows fixes. Author: Robbie Harwood, with changes to the read/write functions by me. Reviewed in various forms and at different times by: Michael Paquier, Andres Freund, David Steele. Discussion: https://www.postgresql.org/message-id/flat/jlg1tgq1ktm.fsf@thriss.redhat.com	2019-04-03 15:02:33 -04:00
Alvaro Herrera	5f6fc34af5	Copy name when cloning FKs recurses to partitions We were passing a string owned by a syscache entry, which was released before recursing. Fix by pstrdup'ing the string. Per buildfarm member prion.	2019-04-03 15:35:54 -03:00
Alvaro Herrera	f56f8f8da6	Support foreign keys that reference partitioned tables Previously, while primary keys could be made on partitioned tables, it was not possible to define foreign keys that reference those primary keys. Now it is possible to do that. Author: Álvaro Herrera Reviewed-by: Amit Langote, Jesper Pedersen Discussion: https://postgr.es/m/20181102234158.735b3fevta63msbj@alvherre.pgsql	2019-04-03 14:40:21 -03:00
Heikki Linnakangas	9155580fd5	Generate less WAL during GiST, GIN and SP-GiST index build. Instead of WAL-logging every modification during the build separately, first build the index without any WAL-logging, and make a separate pass through the index at the end, to write all pages to the WAL. This significantly reduces the amount of WAL generated, and is usually also faster, despite the extra I/O needed for the extra scan through the index. WAL generated this way is also faster to replay. For GiST, the LSN-NSN interlock makes this a little tricky. All pages must be marked with a valid (i.e. non-zero) LSN, so that the parent-child LSN-NSN interlock works correctly. We now use magic value 1 for that during index build. Change the fake LSN counter to begin from 1000, so that 1 is safely smaller than any real or fake LSN. 2 would've been enough for our purposes, but let's reserve a bigger range, in case we need more special values in the future. Author: Anastasia Lubennikova, Andrey V. Lepikhov Reviewed-by: Heikki Linnakangas, Dmitry Dolgov	2019-04-03 17:03:15 +03:00
Alvaro Herrera	5f768045a1	Correctly initialize newly added struct member Valgrind was rightly complaining that IndexVacuumInfo->report_progress (added by commit `ab0dfc961b`) was not being initialized in some code paths. Repair. Per buildfarm member lousyjack.	2019-04-03 09:58:47 -03:00
Alvaro Herrera	e8abf97af7	Prevent use of uninitialized variable Per buildfarm member longfin.	2019-04-02 16:03:26 -03:00
Alvaro Herrera	11074f26bc	Update expected output for modified catalog definition Pilot error in previous commit	2019-04-02 15:45:45 -03:00
Alvaro Herrera	ab0dfc961b	Report progress of CREATE INDEX operations This uses the progress reporting infrastructure added by `c16dc1aca5`, adding support for CREATE INDEX and CREATE INDEX CONCURRENTLY. There are two pieces to this: one is index-AM-agnostic, and the other is AM-specific. The latter is fairly elaborate for btrees, including reportage for parallel index builds and the separate phases that btree index creation uses; other index AMs, which are much simpler in their building procedures, have simplistic reporting only, but that seems sufficient, at least for non-concurrent builds. The index-AM-agnostic part is fairly complete, providing insight into the CONCURRENTLY wait phases as well as block-based progress during the index validation table scan. (The index validation index scan requires patching each AM, which has not been included here.) Reviewers: Rahila Syed, Pavan Deolasee, Tatsuro Yamada Discussion: https://postgr.es/m/20181220220022.mg63bhk26zdpvmcj@alvherre.pgsql	2019-04-02 15:18:08 -03:00
Stephen Frost	4d0e994eed	Add support for partial TOAST decompression When asked for a slice of a TOAST entry, decompress enough to return the slice instead of decompressing the entire object. For use cases where the slice is at, or near, the beginning of the entry, this avoids a lot of unnecessary decompression work. This changes the signature of pglz_decompress() by adding a boolean to indicate if it's ok for the call to finish before consuming all of the source or destination buffers. Author: Paul Ramsey Reviewed-By: Rafia Sabih, Darafei Praliaskouski, Regina Obe Discussion: https://postgr.es/m/CACowWR07EDm7Y4m2kbhN_jnys%3DBBf9A6768RyQdKm_%3DNpkcaWg%40mail.gmail.com	2019-04-02 12:35:32 -04:00
Etsuro Fujita	d50d172e51	postgres_fdw: Perform the (FINAL, NULL) upperrel operations remotely. The upper-planner pathification allows FDWs to arrange to push down different types of upper-stage operations to the remote side. This commit teaches postgres_fdw to do it for the (FINAL, NULL) upperrel, which is responsible for doing LockRows, LIMIT, and/or ModifyTable. This provides the ability for postgres_fdw to handle SELECT commands so that it 1) skips the LockRows step (if any) (note that this is safe since it performs early locking) and 2) pushes down the LIMIT and/or OFFSET restrictions (if any) to the remote side. This doesn't handle the INSERT/UPDATE/DELETE cases. Author: Etsuro Fujita Reviewed-By: Antonin Houska and Jeff Janes Discussion: https://postgr.es/m/87pnz1aby9.fsf@news-spur.riddles.org.uk	2019-04-02 20:30:45 +09:00
Etsuro Fujita	aef65db676	Refactor create_limit_path() to share cost adjustment code with FDWs. This is in preparation for an upcoming commit. Author: Etsuro Fujita Reviewed-By: Antonin Houska and Jeff Janes Discussion: https://postgr.es/m/87pnz1aby9.fsf@news-spur.riddles.org.uk	2019-04-02 19:55:12 +09:00
Dean Rasheed	e2d28c0f40	Perform RLS subquery checks as the right user when going via a view. When accessing a table with RLS via a view, the RLS checks are performed as the view owner. However, the code neglected to propagate that to any subqueries in the RLS checks. Fix that by calling setRuleCheckAsUser() for all RLS policy quals and withCheckOption checks for RTEs with RLS. Back-patch to 9.5 where RLS was added. Per bug #15708 from daurnimator. Discussion: https://postgr.es/m/15708-d65cab2ce9b1717a@postgresql.org	2019-04-02 08:13:59 +01:00
Michael Paquier	280e5f1405	Add progress reporting to pg_checksums This adds a new option to pg_checksums called -P/--progress, showing every second some information about the computation state of an operation for --check and --enable (--disable only updates the control file and is quick). This requires a pre-scan of the data folder so as the total size of checksummable items can be calculated, and then it gets compared to the amount processed. Similarly to what is done for pg_rewind and pg_basebackup, the information printed in the progress report consists of the current amount of data computed and the total amount of data to compute. This could be extended later on. Author: Michael Banck, Bernd Helmle Reviewed-by: Fabien Coelho, Michael Paquier Discussion: https://postgr.es/m/1535719851.1286.17.camel@credativ.de	2019-04-02 10:58:07 +09:00
Thomas Munro	475861b261	Add wal_recycle and wal_init_zero GUCs. On at least ZFS, it can be beneficial to create new WAL files every time and not to bother zero-filling them. Since it's not clear which other filesystems might benefit from one or both of those things, add individual GUCs to control those two behaviors independently and make only very general statements in the docs. Author: Jerry Jelinek, with some adjustments by Thomas Munro Reviewed-by: Alvaro Herrera, Andres Freund, Tomas Vondra, Robert Haas and others Discussion: https://postgr.es/m/CACPQ5Fo00QR7LNAcd1ZjgoBi4y97%2BK760YABs0vQHH5dLdkkMA%40mail.gmail.com	2019-04-02 14:37:14 +13:00
Andres Freund	d45e401586	tableam: Add table_finish_bulk_insert(). This replaces the previous calls of heap_sync() in places using bulk-insert. By passing in the flags used for bulk-insert the AM can decide (first at insert time and then during the finish call) which of the optimizations apply to it, and what operations are necessary to finish a bulk insert operation. Also change HEAP_INSERT_* flags to TABLE_INSERT, and rename hi_options to ti_options. These changes are made even in copy.c, which hasn't yet been converted to tableam. There's no harm in doing so. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-04-01 14:41:42 -07:00
Tom Lane	26a76cb640	Restrict pgbench's zipfian parameter to ensure good performance. Remove the code that supported zipfian distribution parameters less than 1.0, as it had undocumented performance hazards, and it's not clear that the case is useful enough to justify either fixing or documenting those hazards. Also, since the code path for parameter > 1.0 could perform badly for values very close to 1.0, establish a minimum allowed value of 1.001. This solution seems superior to the previous vague documentation warning about small values not performing well. Fabien Coelho, per a gripe from Tomas Vondra Discussion: https://postgr.es/m/b5e172e9-ad22-48a3-86a3-589afa20e8f7@2ndquadrant.com	2019-04-01 17:37:34 -04:00
Thomas Munro	4fd05bb55b	Fix deadlock in heap_compute_xid_horizon_for_tuples(). We can't call code that uses syscache while we hold buffer locks on a catalog relation. If passed such a relation, just fall back to the general effective_io_concurrency GUC rather than trying to look up the containing tablespace's IO concurrency setting. We might find a better way to control prefetching in follow-up work, but for now this is enough to avoid the deadlock introduced by commit `558a9165e0`. Reviewed-by: Andres Freund Diagnosed-by: Peter Geoghegan Discussion: https://postgr.es/m/CA%2BhUKGLCwPF0S4Mk7S8qw%2BDK0Bq65LueN9rofAA3HHSYikW-Zw%40mail.gmail.com Discussion: https://postgr.es/m/962831d8-c18d-180d-75fb-8b842e3a2742%40chrullrich.net	2019-04-02 09:29:49 +13:00
Tom Lane	12d46ac392	Improve documentation about our XML functionality. Add a section explaining how our XML features depart from current versions of the SQL standard. Update and clarify the descriptions of some XML functions. Chapman Flack, reviewed by Ryan Lambert Discussion: https://postgr.es/m/5BD1284C.1010305@anastigmatix.net Discussion: https://postgr.es/m/5C81F8C0.6090901@anastigmatix.net Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com	2019-04-01 16:20:22 -04:00
Tom Lane	b2b819019f	Add volatile qualifier missed in commit `2e616dee9`. Noted by Pavel Stehule Discussion: https://postgr.es/m/CAFj8pRAaGO5FX7bnP3E=mRssoK8y5T78x7jKy-vDiyS68L888Q@mail.gmail.com	2019-04-01 14:37:25 -04:00
Peter Eisentraut	cc8d415117	Unified logging system for command-line programs This unifies the various ad hoc logging (message printing, error printing) systems used throughout the command-line programs. Features: - Program name is automatically prefixed. - Message string does not end with newline. This removes a common source of inconsistencies and omissions. - Additionally, a final newline is automatically stripped, simplifying use of PQerrorMessage() etc., another common source of mistakes. - I converted error message strings to use %m where possible. - As a result of the above several points, more translatable message strings can be shared between different components and between frontends and backend, without gratuitous punctuation or whitespace differences. - There is support for setting a "log level". This is not meant to be user-facing, but can be used internally to implement debug or verbose modes. - Lazy argument evaluation, so no significant overhead if logging at some level is disabled. - Some color in the messages, similar to gcc and clang. Set PG_COLOR=auto to try it out. Some colors are predefined, but can be customized by setting PG_COLORS. - Common files (common/, fe_utils/, etc.) can handle logging much more simply by just using one API without worrying too much about the context of the calling program, requiring callbacks, or having to pass "progname" around everywhere. - Some programs called setvbuf() to make sure that stderr is unbuffered, even on Windows. But not all programs did that. This is now done centrally. Soft goals: - Reduces vertical space use and visual complexity of error reporting in the source code. - Encourages more deliberate classification of messages. For example, in some cases it wasn't clear without analyzing the surrounding code whether a message was meant as an error or just an info. - Concepts and terms are vaguely aligned with popular logging frameworks such as log4j and Python logging. This is all just about printing stuff out. Nothing affects program flow (e.g., fatal exits). The uses are just too varied to do that. Some existing code had wrappers that do some kind of print-and-exit, and I adapted those. I tried to keep the output mostly the same, but there is a lot of historical baggage to unwind and special cases to consider, and I might not always have succeeded. One significant change is that pg_rewind used to write all error messages to stdout. That is now changed to stderr. Reviewed-by: Donald Dong <xdong@csumb.edu> Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru> Discussion: https://www.postgresql.org/message-id/flat/6a609b43-4f57-7348-6480-bd022f924310@2ndquadrant.com	2019-04-01 20:01:35 +02:00
Alexander Korotkov	b4cc19ab01	Throw error in jsonb_path_match() when result is not single boolean jsonb_path_match() checks if jsonb document matches jsonpath query. Therefore, jsonpath query should return single boolean. Currently, if result of jsonpath is not a single boolean, NULL is returned independently whether silent mode is on or off. But that appears to be wrong when silent mode is off. This commit makes jsonb_path_match() throw an error in this case. Author: Nikita Glukhov	2019-04-01 18:09:20 +03:00
Alexander Korotkov	2e643501e5	Restrict some cases in parsing numerics in jsonpath Jsonpath now accepts integers with leading zeroes and floats starting with a dot. However, SQL standard requires to follow JSON specification, which doesn't allow none of these cases. Our json[b] datatypes also restrict that. So, restrict it in jsonpath altogether. Author: Nikita Glukhov	2019-04-01 18:09:09 +03:00
Alexander Korotkov	0a02e2ae02	GIN support for @@ and @? jsonpath operators This commit makes existing GIN operator classes jsonb_ops and json_path_ops support "jsonb @@ jsonpath" and "jsonb @? jsonpath" operators. Basic idea is to extract statements of following form out of jsonpath. key1.key2. ... .keyN = const The rest of jsonpath is rechecked from heap. Catversion is bumped. Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com Author: Nikita Glukhov, Alexander Korotkov Reviewed-by: Jonathan Katz, Pavel Stehule	2019-04-01 18:08:52 +03:00
Peter Eisentraut	7241911782	Catch syntax error in generated column definition The syntax GENERATED BY DEFAULT AS (expr) is not allowed but we have to accept it in the grammar to avoid shift/reduce conflicts because of the similar syntax for identity columns. The existing code just ignored this, incorrectly. Add an explicit error check and a bespoke error message. Reported-by: Justin Pryzby <pryzby@telsasoft.com>	2019-04-01 10:46:37 +02:00
Michael Paquier	4ae7f02b03	Fix thinko in allocation call during MVC list deserialization Spotted by Coverity.	2019-04-01 14:16:27 +09:00
Noah Misch	5a907404b5	Update HINT for pre-existing shared memory block. One should almost always terminate an old process, not use a manual removal tool like ipcrm. Removal of the ipcclean script eleven years ago (`39627b1ae6`) and its non-replacement corroborate that manual shm removal is now a niche goal. Back-patch to 9.4 (all supported versions). Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20180812064815.GB2301738@rfd.leadboat.com	2019-03-31 19:32:48 -07:00
Andres Freund	bfbcad478f	tableam: bitmap table scan. This moves bitmap heap scan support to below an optional tableam callback. It's optional as the whole concept of bitmap heapscans is fairly block specific. This basically moves the work previously done in bitgetpage() into the new scan_bitmap_next_block callback, and the direct poking into the buffer done in BitmapHeapNext() into the new scan_bitmap_next_tuple() callback. The abstraction is currently somewhat leaky because nodeBitmapHeapscan.c's prefetching and visibilitymap based logic remains - it's likely that we'll later have to move more into the AM. But it's not trivial to do so without introducing a significant amount of code duplication between the AMs, so that's a project for later. Note that now nodeBitmapHeapscan.c and the associated node types are a bit misnamed. But it's not clear whether renaming wouldn't be a cure worse than the disease. Either way, that'd be best done in a separate commit. Author: Andres Freund Reviewed-By: Robert Haas (in an older version) Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-31 18:37:57 -07:00
Andres Freund	73c954d248	tableam: sample scan. This moves sample scan support to below tableam. It's not optional as there is, in contrast to e.g. bitmap heap scans, no alternative way to perform tablesample queries. If an AM can't deal with the block based API, it will have to throw an ERROR. The tableam callbacks for this are block based, but given the current TsmRoutine interface, that seems to be required. The new interface doesn't require TsmRoutines to perform visibility checks anymore - that requires the TsmRoutine to know details about the AM, which we want to avoid. To continue to allow taking the returned number of tuples account SampleScanState now has a donetuples field (which previously e.g. existed in SystemRowsSamplerData), which is only incremented after the visibility check succeeds. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-31 18:37:57 -07:00
Andres Freund	4bb50236eb	tableam: Formatting and other minor cleanups. The superflous heapam_xlog.h includes were reported by Peter Geoghegan.	2019-03-31 18:16:53 -07:00
Peter Geoghegan	76a39f2295	Fix nbtree high key "continuescan" row compare bug. Commit `29b64d1d` mishandled skipping over truncated high key attributes during row comparisons. The row comparison key matching loop would loop forever when a truncated attribute was encountered for a row compare subkey. Fix by following the example of other code in the loop: advance the current subkey, or break out of the loop when the last subkey is reached. Add test coverage for the relevant _bt_check_rowcompare() code path. The new test case is somewhat tied to nbtree implementation details, which isn't ideal, but seems unavoidable.	2019-03-31 17:24:04 -07:00
Tom Lane	8fba397f0c	Add test case exercising formerly-unreached code in inheritance_planner. There was some debate about whether the code I'd added to remap AppendRelInfos obtained from the initial SELECT planning run is actually necessary. Add a test case demonstrating that it is. Discussion: https://postgr.es/m/23831.1553873385@sss.pgh.pa.us	2019-03-31 15:49:06 -04:00
Tom Lane	9fd4de119c	Compute root->qual_security_level in a less random place. We can set this up once and for all in subquery_planner's initial survey of the flattened rangetable, rather than incrementally adjusting it in build_simple_rel. The previous approach made it rather hard to reason about exactly when the value would be available, and we were definitely using it in some places before the final value was computed. Noted while fooling around with Amit Langote's patch to delay creation of inheritance child rels. That didn't break this code, but it made it even more fragile, IMO.	2019-03-31 13:47:41 -04:00
Michael Paquier	2aa6e331ea	Skip redundant anti-wraparound vacuums An anti-wraparound vacuum has to be by definition aggressive as it needs to work on all the pages of a relation. However it can happen that due to some concurrent activity an anti-wraparound vacuum is marked as non-aggressive, which makes it redundant with a previous run, and it is actually useless as an anti-wraparound vacuum should process all the pages of a relation. This commit makes such vacuums to be skipped. An anti-wraparound vacuum not aggressive can be found easily by mixing low values of autovacuum_freeze_max_age (to control anti-wraparound) and autovacuum_freeze_table_age (to control the aggressiveness). `28a8fa9` has added some extra logging printing all the possible combinations of anti-wraparound and aggressive vacuums, which now gets simplified as an anti-wraparound vacuum also non-aggressive gets skipped. Per discussion mainly between Andrew Dunstan, Robert Haas, Álvaro Herrera, Kyotaro Horiguchi, Masahiko Sawada, and myself. Author: Kyotaro Horiguchi, Michael Paquier Reviewed-by: Andrew Dunstan, Álvaro Herrera Discussion: https://postgr.es/m/20180914153554.562muwr3uwujno75@alvherre.pgsql	2019-03-31 22:59:12 +09:00
Andrew Dunstan	47b3c26642	Have pg_upgrade's Makefile honor NO_TEMP_INSTALL Backpatch to 9.5, when pg_upgrade's location changed. Discussion: https://postgr.es/m/5506b8fa-7dad-8483-053c-7ca7ef04f01a@2ndQuadrant.com	2019-03-31 08:19:05 -04:00
Andres Freund	696d78469f	tableam: Move heap specific logic from estimate_rel_size below tableam. This just moves the table/matview[/toast] determination of relation size to a callback, and uses a copy of the existing logic to implement that callback for heap. It probably would make sense to also move the index specific logic into a callback, so the metapage handling (and probably more) can be index specific. But that's a separate task. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-30 19:26:36 -07:00
Andres Freund	737a292b5d	tableam: VACUUM and ANALYZE support. This is a relatively straightforward move of the current implementation to sit below tableam. As the current analyze sampling implementation is pretty inherently block based, the tableam analyze interface is as well. It might make sense to generalize that at some point, but that seems like a larger project that shouldn't be undertaken at the same time as the introduction of tableam. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-30 19:25:58 -07:00
Tomas Vondra	0f5493fdf1	Fix typo Author: John Naylor	2019-03-31 03:29:58 +02:00
Tom Lane	428b260f87	Speed up planning when partitions can be pruned at plan time. Previously, the planner created RangeTblEntry and RelOptInfo structs for every partition of a partitioned table, even though many of them might later be deemed uninteresting thanks to partition pruning logic. This incurred significant overhead when there are many partitions. Arrange to postpone creation of these data structures until after we've processed the query enough to identify restriction quals for the partitioned table, and then apply partition pruning before not after creation of each partition's data structures. In this way we need not open the partition relations at all for partitions that the planner has no real interest in. For queries that can be proven at plan time to access only a small number of partitions, this patch improves the practical maximum number of partitions from under 100 to perhaps a few thousand. Amit Langote, reviewed at various times by Dilip Kumar, Jesper Pedersen, Yoshikazu Imai, and David Rowley Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp	2019-03-30 18:58:55 -04:00
Tomas Vondra	ad3107b973	Fix compiler warnings in multivariate MCV code Compiler warnings were observed on gcc 3.4.6 (on gaur). The assert is unnecessary, as the indexes are uint16 and so always >= 0. Reported-by: Tom Lane	2019-03-30 18:43:16 +01:00
Tomas Vondra	ea4e1c0e8f	Additional fixes of memory alignment in pg_mcv_list code Commit `d85e0f366a` tried to fix memory alignment issues in serialization and deserialization of pg_mcv_list values, but it was a few bricks shy. The arrays of uint16 indexes in serialized items was not aligned, and the both the values and isnull flags were using the same pointer. Per investigation by Tom Lane on gaur.	2019-03-30 18:34:59 +01:00
Tom Lane	7ad6498fd5	Avoid crash in partitionwise join planning under GEQO. While trying to plan a partitionwise join, we may be faced with cases where one or both input partitions for a particular segment of the join have been pruned away. In HEAD and v11, this is problematic because earlier processing didn't bother to make a pruned RelOptInfo fully valid. With an upcoming patch to make partition pruning more efficient, this'll be even more problematic because said RelOptInfo won't exist at all. The existing code attempts to deal with this by retroactively making the RelOptInfo fully valid, but that causes crashes under GEQO because join planning is done in a short-lived memory context. In v11 we could probably have fixed this by switching to the planner's main context while fixing up the RelOptInfo, but that idea doesn't scale well to the upcoming patch. It would be better not to mess with the base-relation data structures during join planning, anyway --- that's just a recipe for order-of-operations bugs. In many cases, though, we don't actually need the child RelOptInfo, because if the input is certainly empty then the join segment's result is certainly empty, so we can skip making a join plan altogether. (The existing code ultimately arrives at the same conclusion, but only after doing a lot more work.) This approach works except when the pruned-away partition is on the nullable side of a LEFT, ANTI, or FULL join, and the other side isn't pruned. But in those cases the existing code leaves a lot to be desired anyway --- the correct output is just the result of the unpruned side of the join, but we were emitting a useless outer join against a dummy Result. Pending somebody writing code to handle that more nicely, let's just abandon the partitionwise-join optimization in such cases. When the modified code skips making a join plan, it doesn't make a join RelOptInfo either; this requires some upper-level code to cope with nulls in part_rels[] arrays. We would have had to have that anyway after the upcoming patch. Back-patch to v11 since the crash is demonstrable there. Discussion: https://postgr.es/m/8305.1553884377@sss.pgh.pa.us	2019-03-30 12:48:32 -04:00
Peter Eisentraut	fc22b6623b	Generated columns This is an SQL-standard feature that allows creating columns that are computed from expressions rather than assigned, similar to a view or materialized view but on a column basis. This implements one kind of generated column: stored (computed on write). Another kind, virtual (computed on read), is planned for the future, and some room is left for it. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/b151f851-4019-bdb1-699e-ebab07d2f40a@2ndquadrant.com	2019-03-30 08:15:57 +01:00
Peter Eisentraut	6b8b5364dd	Small code simplification for REINDEX CONCURRENTLY This was left over from an earlier code structure.	2019-03-30 07:16:24 +01:00
Peter Geoghegan	9c7fb7e6d8	Tweak some nbtree-related code comments.	2019-03-29 12:29:05 -07:00
Tomas Vondra	d85e0f366a	Fix memory alignment in pg_mcv_list serialization Blind attempt at fixing ia64, hppa an sparc builds. The serialized representation of MCV lists did not enforce proper memory alignment for internal fields, resulting in deserialization issues on platforms that are more sensitive to this (ia64, sparc and hppa). This forces a catalog version bump, because the layout of serialized pg_mcv_list changes. Broken since `7300a699`.	2019-03-29 19:06:38 +01:00
Andres Freund	d3a5fc17eb	Show table access methods as such in psql's \dA. Previously we didn't display a type for table access methods. Author: Haribabu Kommi Discussion: CAJrrPGeeYOqP3hkZyohDx_8dot4zvPuPMDBmhJ=iC85cTBNeYw@mail.gmail.com	2019-03-29 08:59:40 -07:00
Andres Freund	ffa8444ce4	tableam: Comment fixes. Author: Haribabu Kommi Discussion: CAJrrPGeeYOqP3hkZyohDx_8dot4zvPuPMDBmhJ=iC85cTBNeYw@mail.gmail.com	2019-03-29 08:17:26 -07:00
Robert Haas	41b54ba78e	Allow existing VACUUM options to take a Boolean argument. This makes VACUUM work more like EXPLAIN already does without changing the meaning of any commands that already work. It is intended to facilitate the addition of future VACUUM options that may take non-Boolean parameters or that default to false. Masahiko Sawada, reviewed by me. Discussion: http://postgr.es/m/CA+TgmobpYrXr5sUaEe_T0boabV0DSm=utSOZzwCUNqfLEEm8Mw@mail.gmail.com Discussion: http://postgr.es/m/CAD21AoBaFcKBAeL5_++j+Vzir2vBBcF4juW7qH8b3HsQY=Q6+w@mail.gmail.com	2019-03-29 08:22:49 -04:00
Robert Haas	c900c15269	Warn more strongly about the dangers of exclusive backup mode. Especially, warn about the hazards of mishandling the backup_label file. Adjust a couple of server messages to be more clear about the hazards associated with removing backup_label files, too. David Steele and Robert Haas, reviewed by Laurenz Albe, Martín Marqués, Peter Eisentraut, and Magnus Hagander. Discussion: http://postgr.es/m/7d85c387-000e-16f0-e00b-50bf83c22127@pgmasters.net	2019-03-29 08:15:16 -04:00
Peter Eisentraut	bb76134b08	Fix incorrect code in new REINDEX CONCURRENTLY code The previous code was adding pointers to transient variables to a list, but by the time the list was read, the variable might be gone, depending on the compiler. Fix it by making copies in the proper memory context.	2019-03-29 10:53:40 +01:00
Peter Eisentraut	5dc92b844e	REINDEX CONCURRENTLY This adds the CONCURRENTLY option to the REINDEX command. A REINDEX CONCURRENTLY on a specific index creates a new index (like CREATE INDEX CONCURRENTLY), then renames the old index away and the new index in place and adjusts the dependencies, and then drops the old index (like DROP INDEX CONCURRENTLY). The REINDEX command also has the capability to run its other variants (TABLE, DATABASE) with the CONCURRENTLY option (but not SYSTEM). The reindexdb command gets the --concurrently option. Author: Michael Paquier, Andreas Karlsson, Peter Eisentraut Reviewed-by: Andres Freund, Fujii Masao, Jim Nasby, Sergei Kornilov Discussion: https://www.postgresql.org/message-id/flat/60052986-956b-4478-45ed-8bd119e9b9cf%402ndquadrant.com#74948a1044c56c5e817a5050f554ddee	2019-03-29 08:26:33 +01:00
Andres Freund	d25f519107	tableam: relation creation, VACUUM FULL/CLUSTER, SET TABLESPACE. This moves the responsibility for: - creating the storage necessary for a relation, including creating a new relfilenode for a relation with existing storage - non-transactional truncation of a relation - VACUUM FULL / CLUSTER's rewrite of a table below tableam. This is fairly straight forward, with a bit of complexity smattered in to move the computation of xid / multixid horizons below the AM, as they don't make sense for every table AM. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-28 20:01:43 -07:00
Thomas Munro	7e69323bf7	Fix typo. Author: Masahiko Sawada	2019-03-29 10:03:58 +13:00
Andres Freund	46bcd2af18	Fix a few comment copy & pastos.	2019-03-28 13:42:37 -07:00
Tomas Vondra	62bf0fb35c	Fix deserialization of pg_mcv_list values There were multiple issues in deserialization of pg_mcv_list values. Firstly, the data is loaded from syscache, but the deserialization was performed after ReleaseSysCache(), at which point the data might have already disappeared. Fixed by moving the calls in statext_mcv_load, and using the same NULL-handling code as existing stats. Secondly, the deserialized representation used pointers into the serialized representation. But that is also unsafe, because the data may disappear at any time. Fixed by reworking and simplifying the deserialization code to always copy all the data. And thirdly, when deserializing values for types passed by value, the code simply did memcpy(d,s,typlen) which however does not work on bigendian machines. Fixed by using fetch_att/store_att_byval.	2019-03-28 20:03:14 +01:00
Thomas Munro	ad308058cc	Use FullTransactionId for the transaction stack. Provide GetTopFullTransactionId() and GetCurrentFullTransactionId(). The intended users of these interfaces are access methods that use xids for visibility checks but don't want to have to go back and "freeze" existing references some time later before the 32 bit xid counter wraps around. Use a new struct to serialize the transaction state for parallel query, because FullTransactionId doesn't fit into the previous serialization scheme very well. Author: Thomas Munro Reviewed-by: Heikki Linnakangas Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com	2019-03-28 18:24:43 +13:00
Thomas Munro	2fc7af5e96	Add basic infrastructure for 64 bit transaction IDs. Instead of inferring epoch progress from xids and checkpoints, introduce a 64 bit FullTransactionId type and use it to track xid generation. This fixes an unlikely bug where the epoch is reported incorrectly if the range of active xids wraps around more than once between checkpoints. The only user-visible effect of this commit is to correct the epoch used by txid_current() and txid_status(), also visible with pg_controldata, in those rare circumstances. It also creates some basic infrastructure so that later patches can use 64 bit transaction IDs in more places. The new type is a struct that we pass by value, as a form of strong typedef. This prevents the sort of accidental confusion between TransactionId and FullTransactionId that would be possible if we were to use a plain old uint64. Author: Thomas Munro Reported-by: Amit Kapila Reviewed-by: Andres Freund, Tom Lane, Heikki Linnakangas Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com	2019-03-28 18:12:20 +13:00
Andres Freund	2a96909a4a	tableam: Support for an index build's initial table scan(s). To support building indexes over tables of different AMs, the scans to do so need to be routed through the table AM. While moving a fair amount of code, nearly all the changes are just moving code to below a callback. Currently the range based interface wouldn't make much sense for non block based table AMs. But that seems aceptable for now. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-27 19:59:06 -07:00
Tomas Vondra	a63b29a1de	Minor improvements for the multivariate MCV lists The MCV build should always call get_mincount_for_mcv_list(), as the there is no other logic to decide whether the MCV list represents all the data. So just remove the (ngroups > nitems) condition. Also, when building MCV lists, the number of items was limited by the statistics target (i.e. up to 10000). But when deserializing the MCV list, a different value (8192) was used to check the input, causing an error. Simply ensure that the same value is used in both places. This should have been included in `7300a69950`, but I forgot to include it in that commit.	2019-03-27 20:07:41 +01:00
Tomas Vondra	7300a69950	Add support for multivariate MCV lists Introduce a third extended statistic type, supported by the CREATE STATISTICS command - MCV lists, a generalization of the statistic already built and used for individual columns. Compared to the already supported types (n-distinct coefficients and functional dependencies), MCV lists are more complex, include column values and allow estimation of much wider range of common clauses (equality and inequality conditions, IS NULL, IS NOT NULL etc.). Similarly to the other types, a new pseudo-type (pg_mcv_list) is used. Author: Tomas Vondra Reviewed-by: Dean Rasheed, David Rowley, Mark Dilger, Alvaro Herrera Discussion: https://postgr.es/m/dfdac334-9cf2-2597-fb27-f0fb3753f435@2ndquadrant.com	2019-03-27 18:32:18 +01:00
Tom Lane	333ed246c6	Avoid passing query tlist around separately from root->processed_tlist. In the dim past, the planner kept the fully-processed version of the query targetlist (the result of preprocess_targetlist) in grouping_planner's local variable "tlist", and only grudgingly passed it to individual other routines as needed. Later we discovered a need to still have it available after grouping_planner finishes, and invented the root->processed_tlist field for that purpose, but it wasn't used internally to grouping_planner; the tlist was still being passed around separately in the same places as before. Now comes a proposed patch to allow appendrel expansion to add entries to the processed tlist, well after preprocess_targetlist has finished its work. To avoid having to pass around the tlist explicitly, it's proposed to allow appendrel expansion to modify root->processed_tlist. That makes aliasing the tlist with assorted parameters and local variables really scary. It would accidentally work as long as the tlist is initially nonempty, because then the List header won't move around, but it's not exactly hard to think of ways for that to break. Aliased values are poor programming practice anyway. Hence, get rid of local variables and parameters that can be identified with root->processed_tlist, in favor of just using that field directly. And adjust comments to match. (Some of the new comments speak as though it's already possible for appendrel expansion to modify the tlist; that's not true yet, but will happen in a later patch.) Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp	2019-03-27 12:57:49 -04:00
Alvaro Herrera	9938d11633	pgbench: doExecuteCommand -> executeMetaCommand The new function is only in charge of meta commands, not SQL commands. This change makes the code a little clearer: now all the state changes are effected by advanceConnectionState. It also removes one indent level, which makes the diff look bulkier than it really is. Author: Fabien Coelho Reviewed-by: Kirk Jamison Discussion: https://postgr.es/m/alpine.DEB.2.21.1811240904500.12627@lancre	2019-03-27 12:21:02 -03:00
Tom Lane	a51cc7e9e6	Suppress uninitialized-variable warning. Apparently Andres' compiler is smart enough to see that hpage must be initialized before use ... but mine isn't.	2019-03-27 11:10:42 -04:00
Michael Paquier	ecfed4a122	Improve error handling of column references in expression transformation Column references are not allowed in default expressions and partition bound expressions, and are restricted as such once the transformation of their expressions is done. However, trying to use more complex column references can lead to confusing error messages. For example, trying to use a two-field column reference name for default expressions and partition bounds leads to "missing FROM-clause entry for table", which makes no sense in their respective context. In order to make the errors generated more useful, this commit adds more verbose messages when transforming column references depending on the context. This has a little consequence though: for example an expression using an aggregate with a column reference as argument would cause an error to be generated for the column reference, while the aggregate was the problem reported before this commit because column references get transformed first. The confusion exists for default expressions for a long time, and the problem is new as of v12 for partition bounds. Still per the lack of complaints on the matter no backpatch is done. The patch has been written by Amit Langote and me, and Tom Lane has provided the improvement of the documentation for default expressions on the CREATE TABLE page. Author: Amit Langote, Michael Paquier Reviewed-by: Tom Lane Discussion: https://postgr.es/m/20190326020853.GM2558@paquier.xyz	2019-03-27 21:04:25 +09:00
Thomas Munro	d2fd7f74ee	Fix off-by-one error in txid_status(). The transaction ID returned by GetNextXidAndEpoch() is in the future, so we can't attempt to access its status or we might try to read a CLOG page that doesn't exist. The > vs >= confusion probably stemmed from the choice of a variable name containing the word "last" instead of "next", so fix that too. Back-patch to 10 where the function arrived. Author: Thomas Munro Discussion: https://postgr.es/m/CA%2BhUKG%2Buua_BV5cyfsioKVN2d61Lukg28ECsWTXKvh%3DBtN2DPA%40mail.gmail.com	2019-03-27 21:30:04 +13:00
Michael Paquier	1983af8e89	Switch some palloc/memset calls to palloc0 Some code paths have been doing some allocations followed by an immediate memset() to initialize the allocated area with zeros, this is a bit overkill as there are already interfaces to do both things in one call. Author: Daniel Gustafsson Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/vN0OodBPkKs7g2Z1uyk3CUEmhdtspHgYCImhlmSxv1Xn6nY1ZnaaGHL8EWUIQ-NEv36tyc4G5-uA3UXUF2l4sFXtK_EQgLN1hcgunlFVKhA=@yesql.se	2019-03-27 12:02:50 +09:00
Michael Paquier	5bde1651bb	Switch function current_schema[s]() to be parallel-unsafe When invoked for the first time in a session, current_schema() and current_schemas() can finish by creating a temporary schema. Currently those functions are parallel-safe, however if for a reason or another they get launched across multiple parallel workers, they would fail when attempting to create a temporary schema as temporary contexts are not supported in this case. The original issue has been spotted by buildfarm members crake and lapwing, after commit `c5660e0` has introduced the first regression tests based on current_schema() in the tree. After that, `396676b` has introduced a workaround to avoid parallel plans but that was not completely right either. Catversion is bumped. Author: Michael Paquier Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/20190118024618.GF1883@paquier.xyz	2019-03-27 11:35:12 +09:00
Tomas Vondra	6ca015f9f0	Track unowned relations in doubly-linked list Relations dropped in a single transaction are tracked in a list of unowned relations. With large number of dropped relations this resulted in poor performance at the end of a transaction, when the relations are removed from the singly linked list one by one. Commit `b4166911` attempted to address this issue (particularly when it happens during recovery) by removing the relations in a reverse order, resulting in O(1) lookups in the list of unowned relations. This did not work reliably, though, and it was possible to trigger the O(N^2) behavior in various ways. Instead of trying to remove the relations in a specific order with respect to the linked list, which seems rather fragile, switch to a regular doubly linked. That allows us to remove relations cheaply no matter where in the list they are. As `b4166911` was a bugfix, backpatched to all supported versions, do the same thing here. Reviewed-by: Alvaro Herrera Discussion: https://www.postgresql.org/message-id/flat/80c27103-99e4-1d0c-642c-d9f3b94aaa0a%402ndquadrant.com Backpatch-through: 9.4	2019-03-27 02:39:39 +01:00
Andres Freund	558a9165e0	Compute XID horizon for page level index vacuum on primary. Previously the xid horizon was only computed during WAL replay. That had two major problems: 1) It relied on knowing what the table pointed to looks like. That was easy enough before the introducing of tableam (we knew it had to be heap, although some trickery around logging the heap relfilenodes was required). But to properly handle table AMs we need per-database catalog access to look up the AM handler, which recovery doesn't allow. 2) Not knowing the xid horizon also makes it hard to support logical decoding on standbys. When on a catalog table, we need to be able to conflict with slots that have an xid horizon that's too old. But computing the horizon by visiting the heap only works once consistency is reached, but we always need to be able to detect conflicts. There's also a secondary problem, in that the current method performs redundant work on every standby. But that's counterbalanced by potentially computing the value when not necessary (either because there's no standby, or because there's no connected backends). Solve 1) and 2) by moving computation of the xid horizon to the primary and by involving tableam in the computation of the horizon. To address the potentially increased overhead, increase the efficiency of the xid horizon computation for heap by sorting the tids, and eliminating redundant buffer accesses. When prefetching is available, additionally perform prefetching of buffers. As this is more of a maintenance task, rather than something routinely done in every read only query, we add an arbitrary 10 to the effective concurrency - thereby using IO concurrency, when not globally enabled. That's possibly not the perfect formula, but seems good enough for now. Bumps WAL format, as latestRemovedXid is now part of the records, and the heap's relfilenode isn't anymore. Author: Andres Freund, Amit Khandekar, Robert Haas Reviewed-By: Robert Haas Discussion: https://postgr.es/m/20181212204154.nsxf3gzqv3gesl32@alap3.anarazel.de https://postgr.es/m/20181214014235.dal5ogljs3bmlq44@alap3.anarazel.de https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-26 16:52:54 -07:00
Alvaro Herrera	126d631222	Fix partitioned index creation bug with dropped columns ALTER INDEX .. ATTACH PARTITION fails if the partitioned table where the index is defined contains more dropped columns than its partition, with this message: ERROR: incorrect attribute map The cause was that one caller of CompareIndexInfo was passing the number of attributes of the partition rather than the parent, which confused the length check. Repair. This can cause pg_upgrade to fail when used on such a database. Leave some more objects around after regression tests, so that the case is detected by pg_upgrade test suite. Remove some spurious empty lines noticed while looking for other cases of the same problem. Discussion: https://postgr.es/m/20190326213924.GA2322@alvherre.pgsql	2019-03-26 20:19:28 -03:00
Tom Lane	53bcf5e3db	Build "other rels" of appendrel baserels in a separate step. Up to now, otherrel RelOptInfos were built at the same time as baserel RelOptInfos, thanks to recursion in build_simple_rel(). However, nothing in query_planner's preprocessing cares at all about otherrels, only baserels, so we don't really need to build them until just before we enter make_one_rel. This has two benefits: * create_lateral_join_info did a lot of extra work to propagate lateral-reference information from parents to the correct children. But if we delay creation of the children till after that, it's trivial (and much harder to break, too). * Since we have all the restriction quals correctly assigned to parent appendrels by this point, it'll be possible to do plan-time pruning and never make child RelOptInfos at all for partitions that can be pruned away. That's not done here, but will be later on. Amit Langote, reviewed at various times by Dilip Kumar, Jesper Pedersen, Yoshikazu Imai, and David Rowley Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp	2019-03-26 18:21:10 -04:00
Tom Lane	8994cc6ffc	Add ORDER BY to more ICU regression test cases. Commit `c77e12208` didn't fully fix the problem. Per buildfarm and local testing.	2019-03-26 17:46:04 -04:00
Tom Lane	7c366ac969	Fix oversight in data-type change for autovacuum_vacuum_cost_delay. Commit `caf626b2c` missed that the relevant reloptions entry needs to be moved from the intRelOpts[] array to realRelOpts[]. Somewhat surprisingly, it seems to work anyway, perhaps because the desired default and limit values are all integers. We ought to have either a simpler data structure or better cross-checking here, but that's for another patch. Nikolay Shaplov Discussion: https://postgr.es/m/4861742.12LTaSB3sv@x200m	2019-03-26 13:32:38 -04:00
Alvaro Herrera	1d21ba8a9b	psql: Schema-qualify typecast in one \d query Bug introduced in my commit `bc87f22ef6`	2019-03-26 13:06:41 -03:00
Tom Lane	e8d5dd6be7	Get rid of duplicate child RTE for a partitioned table. We've been creating duplicate RTEs for partitioned tables just because we do so for regular inheritance parent tables. But unlike regular-inheritance parents which are themselves regular tables and thus need to be scanned, partitioned tables don't need the extra RTE. This makes the conditions for building a child RTE the same as those for building an AppendRelInfo, allowing minor simplification in expand_single_inheritance_child. Since the planner's actual processing is driven off the AppendRelInfo list, nothing much changes beyond that, we just have one fewer useless RTE entry. Amit Langote, reviewed and hacked a bit by me Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp	2019-03-26 12:03:27 -04:00
Alvaro Herrera	1af25ca0c2	Improve psql's \d display of foreign key constraints When used on a partition containing foreign keys coming from one of its ancestors, \d would (rather unhelpfully) print the details about the pg_constraint row in the partition. This becomes a bit frustrating when the user tries things like dropping the FK in the partition; instead, show the details for the foreign key on the table where it is defined. Also, when a table is referenced by a foreign key on a partitioned table, we would show multiple "Referenced by" lines, one for each partition, which gets unwieldy pretty fast. Modify that so that it shows only one line for the ancestor partitioned table where the FK is defined. Discussion: https://postgr.es/m/20181204143834.ym6euxxxi5aeqdpn@alvherre.pgsql Reviewed-by: Tom Lane, Amit Langote, Peter Eisentraut	2019-03-26 11:14:34 -03:00
Peter Eisentraut	c8c885b7a5	Fix misplaced const These instances were apparently trying to carry the const qualifier from the arguments through the complex casts, but for that the const qualifier was misplaced.	2019-03-26 09:23:08 +01:00
Andres Freund	2ac1b2b175	Remove heap_hot_search(). After `71bdc99d0d`, "tableam: Add helper for indexes to check if a corresponding table tuples exist." there's no in-core user left. As there's unlikely to be an external user, and such an external user could easily be adjusted to use table_index_fetch_tuple_check(), remove heap_hot_search(). Per complaint from Peter Geoghegan Author: Andres Freund Discussion: https://postgr.es/m/CAH2-Wzn0Oq4ftJrTqRAsWy2WGjv0QrJcwoZ+yqWsF_Z5vjUBFw@mail.gmail.com	2019-03-25 19:04:41 -07:00
Michael Paquier	cdde886d36	Fix crash when using partition bound expressions Since `7c079d7`, partition bounds are able to use generalized expression syntax when processed, treating "minvalue" and "maxvalue" as specific cases as they get passed down for transformation as a column references. The checks for infinite bounds in range expressions have been lax though, causing crashes when trying to use column reference names with more than one field. Here is an example causing a crash: CREATE TABLE list_parted (a int) PARTITION BY LIST (a); CREATE TABLE part_list_crash PARTITION OF list_parted FOR VALUES IN (somename.somename); Note that the creation of the second relation should fail as partition bounds cannot have column references in their expressions, so when finding an expression which does not match the expected infinite bounds, then this commit lets the generic transformation machinery check after it. The error message generated in this case references as well a missing RTE, which is confusing. This problem will be treated separately as it impacts as well default expressions for some time, and for now only the cases where a crash can happen are fixed. While on it, extend the set of regression tests in place for list partition bounds and add an extra set for range partition bounds. Reported-by: Alexander Lakhin Author: Michael Paquier Reviewed-by: Amit Langote Discussion: https://postgr.es/m/15668-0377b1981aa1a393@postgresql.org	2019-03-26 10:09:14 +09:00
Andres Freund	2e3da03e9e	tableam: Add table_get_latest_tid, to wrap heap_get_latest_tid. This primarily is to allow WHERE CURRENT OF to continue to work as it currently does. It's not clear to me that these semantics make sense for every AM, but it works for the in-core heap, and the out of core zheap. We can refine it further at a later point if necessary. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-25 17:14:48 -07:00
Andres Freund	71bdc99d0d	tableam: Add helper for indexes to check if a corresponding table tuples exist. This is, likely exclusively, useful to verify that conflicts detected in a unique index are with live tuples, rather than dead ones. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-25 16:52:55 -07:00
Thomas Munro	aa1419e63f	Add MacPorts support to src/test/ldap tests. Previously the test knew how to find an OpenLDAP installation at the paths used by Homebrew. Add the MacPorts paths too. Author: Thomas Munro Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CA%2BhUKGKrjGS7sO4jc53gp3qipCtEvThtdP_%3DzoixgX5ZBq4Nbw%40mail.gmail.com	2019-03-26 11:44:18 +13:00
Tom Lane	f7111f72d2	Improve planner's selectivity estimates for inequalities on CTID. We were getting just DEFAULT_INEQ_SEL for comparisons such as "ctid >= constant", but it's possible to do a lot better if we don't mind some assumptions about the table's tuple density being reasonably uniform. There are already assumptions much like that elsewhere in the planner, so that hardly seems like much of an objection. Extracted from a patch set that also proposes to introduce a special executor node type for such queries. Not sure if that's going to make it into v12, but improving the selectivity estimate is useful independently of that. Edmund Horner, reviewed by David Rowley Discussion: https://postgr.es/m/CAMyN-kB-nFTkF=VA_JPwFNo08S0d-Yk0F741S2B7LDmYAi8eyA@mail.gmail.com	2019-03-25 18:42:52 -04:00
Tom Lane	8edd0e7946	Suppress Append and MergeAppend plan nodes that have a single child. If there's only one child relation, the Append or MergeAppend isn't doing anything useful, and can be elided. It does have a purpose during planning though, which is to serve as a buffer between parent and child Var numbering. Therefore we keep it all the way through to setrefs.c, and get rid of it only after fixing references in the plan level(s) above it. This works largely the same as setrefs.c's ancient hack to get rid of no-op SubqueryScan nodes, and can even share some code with that. Note the change to make setrefs.c use apply_tlist_labeling rather than ad-hoc code. This has the effect of propagating the child's resjunk and ressortgroupref labels, which formerly weren't propagated when removing a SubqueryScan. Doing that is demonstrably necessary for the [Merge]Append cases, and seems harmless for SubqueryScan, if only because trivial_subqueryscan is afraid to collapse cases where the resjunk marking differs. (I suspect that restriction could now be removed, though it's unclear that it'd make any new matches possible, since the outer query can't have references to a child resjunk column.) David Rowley, reviewed by Alvaro Herrera and Tomas Vondra Discussion: https://postgr.es/m/CAKJS1f_7u8ATyJ1JGTMHFoKDvZdeF-iEBhs+sM_SXowOr9cArg@mail.gmail.com	2019-03-25 15:42:35 -04:00
Peter Geoghegan	f21668f328	Add "split after new tuple" nbtree optimization. Add additional heuristics to the algorithm for locating an optimal split location. New logic identifies localized monotonically increasing values in indexes with multiple columns. When this insertion pattern is detected, page splits split just after the new item that provoked a page split (or apply leaf fillfactor in the style of a rightmost page split). This optimization is a variation of the long established leaf fillfactor optimization used during rightmost page splits. 50/50 page splits are only appropriate with a pattern of truly random insertions, where the average space utilization ends up at 65% - 70%. Without this patch, affected cases have leaf pages that are no more than about 50% full on average. Future insertions can never make use of the free space left behind. With this patch, affected cases have leaf pages that are about 90% full on average (assuming a fillfactor of 90). Localized monotonically increasing insertion patterns are presumed to be fairly common in real-world applications. There is a fair amount of anecdotal evidence for this. Both pg_depend system catalog indexes (pg_depend_depender_index and pg_depend_reference_index) are at least 20% smaller after the regression tests are run when the optimization is available. Furthermore, many of the indexes created by a fair use implementation of TPC-C for Postgres are consistently about 40% smaller when the optimization is available. Note that even pg_upgrade'd v3 indexes make use of this optimization. Author: Peter Geoghegan Reviewed-By: Heikki Linnakangas Discussion: https://postgr.es/m/CAH2-WzkpKeZJrXvR_p7VSY1b-s85E3gHyTbZQzR0BkJ5LrWF_A@mail.gmail.com	2019-03-25 09:44:25 -07:00
Tom Lane	f7ff0ae842	Further code review for new integerset code. Mostly cosmetic adjustments, but I added a more reliable method of detecting whether an iteration is in progress.	2019-03-25 12:23:48 -04:00
Heikki Linnakangas	9f75e37723	Refactor code to print pgbench progress reports. threadRun() function is very long and deeply-nested. Extract the code to print progress reports to a separate function, to make it slightly easier to read. Author: Fabien Coelho Discussion: https://www.postgresql.org/message-id/alpine.DEB.2.21.1903101225270.17271%40lancre	2019-03-25 18:07:29 +02:00
Robert Haas	5857be907d	Fix use of wrong datatype with sizeof(). OID and int are the same size, but they are not the same thing. David Rowley Discussion: http://postgr.es/m/CAKJS1f_MhS++XngkTvWL9X1v8M5t-0N0B-R465yHQY=TmNV0Ew@mail.gmail.com	2019-03-25 11:28:06 -04:00
Alvaro Herrera	25ee70511e	pgbench: Remove \cset Partial revert of commit `6260cc550b`, "pgbench: add \cset and \gset commands". While \gset is widely considered a useful and necessary tool for user- defined benchmarks, \cset does not have as much value, and its implementation was considered "not to be up to project standards" (though I, Álvaro, can't quite understand exactly how). Therefore, remove \cset. Author: Fabien Coelho Discussion: https://postgr.es/m/alpine.DEB.2.21.1903230716030.18811@lancre Discussion: https://postgr.es/m/201901101900.mv7zduch6sad@alvherre.pgsql	2019-03-25 12:16:07 -03:00
Robert Haas	6f97457e0d	Add progress reporting for CLUSTER and VACUUM FULL. This uses the same progress reporting infrastructure added in commit `c16dc1aca5` and extends it to these additional cases. We lack the ability to track the internal progress of sorts and index builds so the information reported is coarse-grained for some parts of the operation, but it still seems like a significant improvement over having nothing at all. Tatsuro Yamada, reviewed by Thomas Munro, Masahiko Sawada, Michael Paquier, Jeff Janes, Alvaro Herrera, Rafia Sabih, and by me. A fair amount of polishing also by me. Discussion: http://postgr.es/m/59A77072.3090401@lab.ntt.co.jp	2019-03-25 10:59:04 -04:00
Alexander Korotkov	1d88a75c42	Get rid of backtracking in jsonpath_scan.l Non-backtracking flex parsers work faster than backtracking ones. So, this commit gets rid of backtracking in jsonpath_scan.l. That required explicit handling of some cases as well as manual backtracking for some cases. More regression tests for numerics are added. Discussion: https://mail.google.com/mail/u/0?ik=a20b091faa&view=om&permmsgid=msg-f%3A1628425344167939063 Author: John Naylor, Nikita Gluknov, Alexander Korotkov	2019-03-25 15:43:56 +03:00
Alexander Korotkov	8b17298f0b	Cosmetic changes for jsonpath_gram.y and jsonpath_scan.l This commit include formatting improvements, renamings and comments. Also, it makes jsonpath_scan.l be more uniform with other our lexers. Firstly, states names are renamed to more short alternatives. Secondly, <INITIAL> prefix removed from the rules. Corresponding rules are moved to the tail, so they would anyway work only in initial state. Author: Alexander Korotkov Reviewed-by: John Naylor	2019-03-25 15:42:51 +03:00
Heikki Linnakangas	d303122eab	Clean up the Simple-8b encoder code. Coverity complained that simple8b_encode() might read beyond the end of the 'diffs' array, in the loop to encode the integers. That was a false positive, because we never get into the loop in modes 0 or 1, and the array is large enough for all the other modes. But I admit it's very subtle, so it's not surprising that Coverity didn't see it, and it's not very obvious to humans either. Refactor it, so that the second loop re-computes the differences, instead of carrying them over from the first loop in the 'diffs' array. This way, the 'diffs' array is not needed anymore. It makes no measurable difference in performance, and seems more straightforward this way. Also, improve the comments in simple8b_encode(): fix the comment about its return value that was flat-out wrong, and explain the condition when it returns EMPTY_CODEWORD better. In the passing, move the 'selector' from the codeword's low bits to the high bits. It doesn't matter much, but looking at the original paper, and googling around for other Simple-8b implementations, that's how it's usually done. Per Coverity, and Tom Lane's report off-list.	2019-03-25 11:39:51 +02:00
Peter Eisentraut	148cf5f462	Align timestamps in pg_regress output This way the timestamps line up in a mix of "ok" and "FAILED" output. Author: Christoph Berg <christoph.berg@credativ.de> Discussion: https://www.postgresql.org/message-id/20190321115059.GF2687%40msg.df7cb.de	2019-03-25 10:02:24 +01:00
Peter Eisentraut	481018f280	Add macro to cast away volatile without allowing changes to underlying type This adds unvolatize(), which works just like unconstify() but for volatile. Discussion: https://www.postgresql.org/message-id/flat/7a5cbea7-b8df-e910-0f10-04014bcad701%402ndquadrant.com	2019-03-25 09:37:03 +01:00
Andres Freund	9a8ee1dc65	tableam: Add and use table_fetch_row_version(). This is essentially the tableam version of heapam_fetch(), i.e. fetching a tuple identified by a tid, performing visibility checks. Note that this different from table_index_fetch_tuple(), which is for index lookups. It therefore has to handle a tid pointing to an earlier version of a tuple if the AM uses an optimization like heap's HOT. Add comments to that end. This commit removes the stats_relation argument from heap_fetch, as it's been unused for a long time. Author: Andres Freund Reviewed-By: Haribabu Kommi Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-25 00:17:59 -07:00
Peter Eisentraut	c77e12208c	Add ORDER BY to regression test case Apparently, the output order is different on different endianness, per build farm member snapper.	2019-03-25 08:15:38 +01:00
Andres Freund	919e48b943	tableam: Use in CREATE TABLE AS and CREATE MATERIALIZED VIEW. Previously those directly performed a heap_insert(). Use table_insert() instead. The input slot of those routines is not of the target relation - we could fix that by copying if necessary, but that'd not be beneficial for performance. As those codepaths don't access any AM specific tuple fields (say xmin/xmax), there's no need to use an AM specific slot. Author: Andres Freund Reviewed-By: Haribabu Kommi Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-24 18:58:37 -07:00
Tom Lane	940311e4bb	Un-hide most cascaded-drop details in regression test results. Now that the ordering of DROP messages ought to be stable everywhere, we should not need these kluges of hiding DETAIL output just to avoid unstable ordering. Hiding it's not great for test coverage, so let's undo that where possible. In a small number of places, it's necessary to leave it in, for example because the output might include a variable pg_temp_nnn schema name. I also left things alone in places where the details would depend on other regression test scripts, e.g. plpython_drop.sql. Perhaps buildfarm experience will show this to be a bad idea, but if so I'd like to know why. Discussion: https://postgr.es/m/E1h6eep-0001Mw-Vd@gemulon.postgresql.org	2019-03-24 19:15:37 -04:00
Tom Lane	af6550d344	Sort dependent objects before reporting them in DROP ROLE. Commit `8aa9dd74b` didn't quite finish the job in this area after all, because DROP ROLE has a code path distinct from DROP OWNED BY, and it was still reporting dependent objects in whatever order the index scan returned them in. Buildfarm experience shows that index ordering of equal-keyed objects is significantly less stable than before in the wake of using heap TIDs as tie-breakers. So if we try to hide the unstable ordering by suppressing DETAIL reports, we're just going to end up having to do that for every DROP that reports multiple objects. That's not great from a coverage or problem-detection standpoint, and it's something we'll inevitably forget in future patches, leading to more iterations of fixing-an- unstable-result. So let's just bite the bullet and sort here too. Discussion: https://postgr.es/m/E1h6eep-0001Mw-Vd@gemulon.postgresql.org	2019-03-24 18:17:53 -04:00
Peter Geoghegan	59ab3be9e4	Remove dead code from nbtsplitloc.c. It doesn't make sense to consider the possibility that there will only be one candidate split point when choosing among split points to find the split with the lowest penalty. This is a vestige of an earlier version of the patch that became commit `fab25024`. Issue spotted while rereviewing coverage of the nbtree patch series using gcov.	2019-03-24 12:28:58 -07:00
Michael Paquier	276d2e6c2d	Make current_logfiles use permissions assigned to files in data directory Since its introduction in `19dc233c`, current_logfiles has been assigned the same permissions as a log file, which can be enforced with log_file_mode. This setup can lead to incompatibility problems with group access permissions as current_logfiles is not located in the log directory, but at the root of the data folder. Hence, if group permissions are used but log_file_mode is more restrictive, a backup with a user in the group having read access could fail even if the log directory is located outside of the data folder. Per discussion with the folks mentioned below, we have concluded that current_logfiles should not be treated as a log file as it only stores metadata related to log files, and that it should use the same permissions as all other files in the data directory. This solution has the merit to be simple and fixes all the interaction problems between group access and log_file_mode. Author: Haribabu Kommi Reviewed-by: Stephen Frost, Robert Haas, Tom Lane, Michael Paquier Discussion: https://postgr.es/m/CAJrrPGcEotF1P7AWoeQyD3Pqr-0xkQg_Herv98DjbaMj+naozw@mail.gmail.com Backpatch-through: 11, where group access has been added.	2019-03-24 21:00:35 +09:00
Peter Eisentraut	280a408b48	Transaction chaining Add command variants COMMIT AND CHAIN and ROLLBACK AND CHAIN, which start new transactions with the same transaction characteristics as the just finished one, per SQL standard. Support for transaction chaining in PL/pgSQL is also added. This functionality is especially useful when running COMMIT in a loop in PL/pgSQL. Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr> Discussion: https://www.postgresql.org/message-id/flat/28536681-324b-10dc-ade8-ab46f7645a5a@2ndquadrant.com	2019-03-24 11:33:02 +01:00
Andres Freund	b2db277057	Remove spurious return. Per buildfarm member anole. Author: Andres Freund	2019-03-23 21:09:39 -07:00
Andres Freund	5db6df0c01	tableam: Add tuple_{insert, delete, update, lock} and use. This adds new, required, table AM callbacks for insert/delete/update and lock_tuple. To be able to reasonably use those, the EvalPlanQual mechanism had to be adapted, moving more logic into the AM. Previously both delete/update/lock call-sites and the EPQ mechanism had to have awareness of the specific tuple format to be able to fetch the latest version of a tuple. Obviously that needs to be abstracted away. To do so, move the logic that find the latest row version into the AM. lock_tuple has a new flag argument, TUPLE_LOCK_FLAG_FIND_LAST_VERSION, that forces it to lock the last version, rather than the current one. It'd have been possible to do so via a separate callback as well, but finding the last version usually also necessitates locking the newest version, making it sensible to combine the two. This replaces the previous use of EvalPlanQualFetch(). Additionally HeapTupleUpdated, which previously signaled either a concurrent update or delete, is now split into two, to avoid callers needing AM specific knowledge to differentiate. The move of finding the latest row version into tuple_lock means that encountering a row concurrently moved into another partition will now raise an error about "tuple to be locked" rather than "tuple to be updated/deleted" - which is accurate, as that always happens when locking rows. While possible slightly less helpful for users, it seems like an acceptable trade-off. As part of this commit HTSU_Result has been renamed to TM_Result, and its members been expanded to differentiated between updating and deleting. HeapUpdateFailureData has been renamed to TM_FailureData. The interface to speculative insertion is changed so nodeModifyTable.c does not have to set the speculative token itself anymore. Instead there's a version of tuple_insert, tuple_insert_speculative, that performs the speculative insertion (without requiring a flag to signal that fact), and the speculative insertion is either made permanent with table_complete_speculative(succeeded = true) or aborted with succeeded = false). Note that multi_insert is not yet routed through tableam, nor is COPY. Changing multi_insert requires changes to copy.c that are large enough to better be done separately. Similarly, although simpler, CREATE TABLE AS and CREATE MATERIALIZED VIEW are also only going to be adjusted in a later commit. Author: Andres Freund and Haribabu Kommi Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de https://postgr.es/m/20190313003903.nwvrxi7rw3ywhdel@alap3.anarazel.de https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql	2019-03-23 19:55:57 -07:00
Tom Lane	f778e537a0	Remove inadequate check for duplicate "xml" PI. I failed to think about PIs starting with "xml". We don't really need this check at all, so just take it out. Oversight in commit `8d1dadb25` et al.	2019-03-23 17:40:19 -04:00
Tom Lane	4870dce37f	Ensure xmloption = content while restoring pg_dump output. In combination with the previous commit, this ensures that valid XML data can always be dumped and reloaded, whether it is "document" or "content". Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com	2019-03-23 16:51:37 -04:00
Tom Lane	8d1dadb25b	Accept XML documents when xmloption = content, as required by SQL:2006+. Previously we were using the SQL:2003 definition, which doesn't allow this, but that creates a serious dump/restore gotcha: there is no setting of xmloption that will allow all valid XML data. Hence, switch to the 2006 definition. Since libxml doesn't accept <!DOCTYPE> directives in the mode we use for CONTENT parsing, the implementation is to detect <!DOCTYPE> in the input and switch to DOCUMENT parsing mode. This should not cost much, because <!DOCTYPE> should be close to the front of the input if it's there at all. It's possible that this causes the error messages for malformed input to be slightly different than they were before, if said input includes <!DOCTYPE>; but that does not seem like a big problem. In passing, buy back a few cycles in parsing of large XML documents by not doing strlen() of the whole input in parse_xml_decl(). Back-patch because dump/restore failures are not nice. This change shouldn't break any cases that worked before, so it seems safe to back-patch. Chapman Flack (revised a bit by me) Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com	2019-03-23 16:51:37 -04:00
Peter Geoghegan	05f110cc0b	Suppress DETAIL output from an event_trigger test. Suppress 3 lines of unstable DETAIL output from a DROP ROLE statement in event_trigger.sql. This is further cleanup for commit `dd299df8`. Note that the event_trigger test instability issue is very similar to the recently suppressed foreign_data test instability issue. Both issues involve DETAIL output for a DROP ROLE statement that needed to be changed as part of `dd299df8`. Per buildfarm member macaque.	2019-03-23 13:49:53 -07:00
Peter Geoghegan	29b64d1de7	Add nbtree high key "continuescan" optimization. Teach nbtree forward index scans to check the high key before moving to the right sibling page in the hope of finding that it isn't actually necessary to do so. The new check may indicate that the scan definitely cannot find matching tuples to the right, ending the scan immediately. We already opportunistically force a similar "continuescan orientated" key check of the final non-pivot tuple when it's clear that it cannot be returned to the scan due to being dead-to-all. The new high key check is complementary. The new approach for forward scans is more effective than checking the final non-pivot tuple, especially with composite indexes and non-unique indexes. The improvements to the logic for picking a split point added by commit `fab25024` make it likely that relatively dissimilar high keys will appear on a page. A distinguishing key value that can only appear on non-pivot tuples on the right sibling page will often be present in leaf page high keys. Since forcing the final item to be key checked no longer makes any difference in the case of forward scans, the existing extra key check is now only used for backwards scans. Backward scans continue to opportunistically check the final non-pivot tuple, which is actually the first non-pivot tuple on the page (not the last). Note that even pg_upgrade'd v3 indexes make use of this optimization. Author: Peter Geoghegan, Heikki Linnakangas Reviewed-By: Heikki Linnakangas Discussion: https://postgr.es/m/CAH2-WzkOmUduME31QnuTFpimejuQoiZ-HOf0pOWeFZNhTMctvA@mail.gmail.com	2019-03-23 11:01:53 -07:00
Michael Paquier	4ba96d1b82	Improve format of code and some error messages in pg_checksums This makes the code more consistent with the surroundings. Author: Fabrízio de Royes Mello Discussion: https://postgr.es/m/CAFcNs+pXb_35r5feMU3-dWsWxXU=Yjq+spUsthFyGFbT0QcaKg@mail.gmail.com	2019-03-23 21:56:43 +09:00
Tom Lane	fb50d3f03f	Add unreachable "break" to satisfy -Wimplicit-fallthrough. gcc is a bit pickier about this than perhaps it should be. Discussion: https://postgr.es/m/E1h6zzT-0003ft-DD@gemulon.postgresql.org	2019-03-23 01:32:58 -04:00
Andres Freund	cdcffe2263	Expand EPQ tests for UPDATEs and DELETEs Previously there was basically no coverage for UPDATEs encountering deleted rows, and no coverage for DELETE having to perform EPQ. That's problematic for an upcoming commit in which EPQ is tought to integrate with tableams. Also, there was no test for UPDATE to encounter a row UPDATEd into another partition. Author: Andres Freund	2019-03-22 19:55:23 -07:00
Michael Paquier	e0090c8690	Add option -N/--no-sync to pg_checksums This is an option consistent with what pg_dump, pg_rewind and pg_basebackup provide which is useful for leveraging the I/O effort when testing things, not to be used in a production environment. Author: Michael Paquier Reviewed-by: Michael Banck, Fabien Coelho, Sergei Kornilov Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de	2019-03-23 08:37:36 +09:00
Peter Eisentraut	7b084b3831	Revert "Add gitignore entries for jsonpath_gram.h" This reverts commit `4e274a043f`. These files aren't actually built anymore since `550b9d26f`.	2019-03-23 00:19:34 +01:00
Michael Paquier	ed308d7837	Add options to enable and disable checksums in pg_checksums An offline cluster can now work with more modes in pg_checksums: - --enable enables checksums in a cluster, updating all blocks with a correct checksum, and updating the control file at the end. - --disable disables checksums in a cluster, updating only the control file. - --check is an extra option able to verify checksums for a cluster, and the default used if no mode is specified. When running --enable or --disable, the data folder gets fsync'd for durability, and then it is followed by a control file update and flush to keep the operation consistent should the tool be interrupted, killed or the host unplugged. If no mode is specified in the options, then --check is used for compatibility with older versions of pg_checksums (named pg_verify_checksums in v11 where it was introduced). Author: Michael Banck, Michael Paquier Reviewed-by: Fabien Coelho, Magnus Hagander, Sergei Kornilov Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de	2019-03-23 08:12:55 +09:00
Peter Eisentraut	87914e708a	Make subscription collation test work independent of locale We need to set the database to UTF8 encoding so that the test can use Unicode escapes.	2019-03-22 23:33:31 +01:00
Peter Eisentraut	4e274a043f	Add gitignore entries for jsonpath_gram.h	2019-03-22 23:19:30 +01:00
Tom Lane	734308a220	Rearrange make_partitionedrel_pruneinfo to avoid work when we can't prune. Postpone most of the effort of constructing PartitionedRelPruneInfos until after we have found out whether run-time pruning is needed at all. This costs very little duplicated effort (basically just an extra find_base_rel() call per partition) and saves quite a bit when we can't do run-time pruning. Also, merge the first loop (for building relid_subpart_map) into the second loop, since we don't need the map to be valid during that loop. Amit Langote Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp	2019-03-22 14:56:12 -04:00
Peter Geoghegan	09963cedce	Go back to suppressing foreign_data DETAIL test output. This is almost a straight revert of commit `fff518d`, which itself was a revert of `7d3bf73ac`. It turns out that commit `8aa9dd74`, which sorted dependent objects before deletion in DROP OWNED BY, was not sufficient to make all remaining unstable DETAIL output stable. Unstable DETAIL output from DROP ROLE was not affected, because that happens to use a different code path. It doesn't seem worthwhile to fix the other code path at this time. Discussion: https://postgr.es/m/6226.1553274783@sss.pgh.pa.us	2019-03-22 11:34:28 -07:00
Tom Lane	c8151e6423	Don't copy PartitionBoundInfo in set_relation_partition_info. I (tgl) remain dubious that it's a good idea for PartitionDirectory to hold a pin on a relcache entry throughout planning, rather than copying the data or using some kind of refcount scheme. However, it's certainly the responsibility of the PartitionDirectory code to ensure that what it's handing back is a stable data structure, not that of its caller. So this is a pretty clear oversight in commit `898e5e329`, and one that can cost a lot of performance when there are many partitions. Amit Langote (extracted from a much larger patch set) Discussion: https://postgr.es/m/CA+TgmoY3bRmGB6-DUnoVy5fJoreiBJ43rwMrQRCdPXuKt4Ykaw@mail.gmail.com Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp	2019-03-22 14:16:58 -04:00
Heikki Linnakangas	b5fd4972a3	Fix yet more portability bugs in integerset and its tests. There were more large constants that needed UINT64CONST. And one variable was declared as "int", when it needed to be uint64. These bugs were only visible on 32-bit systems; clearly I should've tested on one, given that this code does a lot of work with 64-bit integers. Also, in the test "huge distances" test, the code created some values with random distances between them, but the test logic didn't take into account the possibility that the random distance was exactly 1. That never actually happens with the seed we're using, but let's be tidy.	2019-03-22 17:59:19 +02:00
Peter Eisentraut	638db07814	Fix ICU tests for older ICU versions Change the tests to use old-style ICU locale specifications so that they can run on older ICU versions.	2019-03-22 14:40:56 +01:00
Heikki Linnakangas	c477c68c8f	More portability fixes for integerset tests. Use UINT64CONST for large constants.	2019-03-22 14:57:35 +02:00
Heikki Linnakangas	32f8ddf7e1	Make printf format strings in test_integerset portable. Use UINT64_FORMAT for printing uint64s.	2019-03-22 14:42:33 +02:00
Heikki Linnakangas	608c5f4347	Make the integerset test more verbose. Buildfarm member 'woodlouse' failed one of the tests, and I'm not sure which test failed. Better to print the names of the tests, so that it will appear in the regression.diffs on failure.	2019-03-22 14:32:53 +02:00
Heikki Linnakangas	d1b9ee4e44	Fix bug in the GiST vacuum's 2nd stage. We mustn't assume that the IndexVacuumInfo pointer passed to bulkdelete() stage is still valid in the vacuumcleanup() stage. Per very pink buildfarm.	2019-03-22 14:11:46 +02:00
Heikki Linnakangas	7df159a620	Delete empty pages during GiST VACUUM. To do this, we scan GiST two times. In the first pass we make note of empty leaf pages and internal pages. At second pass we scan through internal pages, looking for downlinks to the empty pages. Deleting internal pages is still not supported, like in nbtree, the last child of an internal page is never deleted. That means that if you have a workload where new keys are always inserted to different area than where old keys are removed, the index will still grow without bound. But the rate of growth will be an order of magnitude slower than before. Author: Andrey Borodin Discussion: https://www.postgresql.org/message-id/B1E4DF12-6CD3-4706-BDBD-BF3283328F60@yandex-team.ru	2019-03-22 13:21:45 +02:00
Heikki Linnakangas	df816f6ad5	Add IntegerSet, to hold large sets of 64-bit ints efficiently. The set is implemented as a B-tree, with a compact representation at leaf items, using Simple-8b algorithm, so that clusters of nearby values use less memory. The IntegerSet isn't used for anything yet, aside from the test code, but we have two patches in the works that would benefit from this: A patch to allow GiST vacuum to delete empty pages, and a patch to reduce heap VACUUM's memory usage, by storing the list of dead TIDs more efficiently and lifting the 1 GB limit on its size. This includes a unit test module, in src/test/modules/test_integerset. It can be used to verify correctness, as a regression test, but if you run it manully, it can also print memory usage and execution time of some of the tests. Author: Heikki Linnakangas, Andrey Borodin Reviewed-by: Julien Rouhaud Discussion: https://www.postgresql.org/message-id/b5e82599-1966-5783-733c-1a947ddb729f@iki.fi	2019-03-22 13:21:45 +02:00
Peter Eisentraut	5e1963fb76	Collations with nondeterministic comparison This adds a flag "deterministic" to collations. If that is false, such a collation disables various optimizations that assume that strings are equal only if they are byte-wise equal. That then allows use cases such as case-insensitive or accent-insensitive comparisons or handling of strings with different Unicode normal forms. This functionality is only supported with the ICU provider. At least glibc doesn't appear to have any locales that work in a nondeterministic way, so it's not worth supporting this for the libc provider. The term "deterministic comparison" in this context is from Unicode Technical Standard #10 (https://unicode.org/reports/tr10/#Deterministic_Comparison). This patch makes changes in three areas: - CREATE COLLATION DDL changes and system catalog changes to support this new flag. - Many executor nodes and auxiliary code are extended to track collations. Previously, this code would just throw away collation information, because the eventually-called user-defined functions didn't use it since they only cared about equality, which didn't need collation information. - String data type functions that do equality comparisons and hashing are changed to take the (non-)deterministic flag into account. For comparison, this just means skipping various shortcuts and tie breakers that use byte-wise comparison. For hashing, we first need to convert the input string to a canonical "sort key" using the ICU analogue of strxfrm(). Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com	2019-03-22 12:12:43 +01:00
Michael Paquier	2ab6d28d23	Fix crash with pg_partition_root Trying to call the function with the top-most parent of a partition tree was leading to a crash. In this case the correct result is to return the top-most parent itself. Reported-by: Álvaro Herrera Author: Michael Paquier Reviewed-by: Amit Langote Discussion: https://postgr.es/m/20190322032612.GA323@alvherre.pgsql	2019-03-22 17:27:38 +09:00
Peter Geoghegan	fff518d051	Revert "Suppress DETAIL output from a foreign_data test." This should be superseded by commit `8aa9dd74`.	2019-03-21 15:33:13 -07:00
Alvaro Herrera	03ae9d59bd	Catversion bump announced in previous commit but forgotten	2019-03-21 18:43:41 -03:00
Alvaro Herrera	7e7c57bbb2	Fix dependency recording bug for partitioned PKs When DefineIndex recurses to create constraints on partitions, it needs to use the value returned by index_constraint_create to set up partition dependencies. However, in the course of fixing the DEPENDENCY_INTERNAL_AUTO mess, commit `1d92a0c9f7` introduced some code to that function that clobbered the return value, causing the recorded OID to be of the wrong object. Close examination of pg_depend after creating the tables leads to indescribable objects :-( My sin (in commit `bdc3d7fa23`, while preparing for DDL deparsing in event triggers) was to use a variable name for the return value that's typically used for throwaway objects in dependency-setting calls ("referenced"). Fix by changing the variable names to match extended practice (the return value is "myself" rather than "referenced".) The pg_upgrade test notices the problem (in an indirect way: the pg_dump outputs are in different order), but only if you create the objects in a specific way that wasn't being used in the existing tests. Add a stanza to leave some objects around that shows the bug. Catversion bump because preexisting databases might have bogus pg_depend entries. Discussion: https://postgr.es/m/20190318204235.GA30360@alvherre.pgsql	2019-03-21 18:34:29 -03:00
Tom Lane	bfb456c1b9	Improve error reporting for DROP FUNCTION/PROCEDURE/AGGREGATE/ROUTINE. These commands allow the argument type list to be omitted if there is just one object that matches by name. However, if that syntax was used with DROP IF EXISTS and there was more than one match, you got a "function ... does not exist, skipping" notice message rather than a truthful complaint about the ambiguity. This was basically due to poor factorization and a rats-nest of logic, so refactor the relevant lookup code to make it cleaner. Note that this amounts to narrowing the scope of which sorts of error conditions IF EXISTS will bypass. Per discussion, we only intend it to skip no-such-object cases, not multiple-possible-matches cases. Per bug #15572 from Ash Marath. Although this definitely seems like a bug, it's not clear that people would thank us for changing the behavior in minor releases, so no back-patch. David Rowley, reviewed by Julien Rouhaud and Pavel Stehule Discussion: https://postgr.es/m/15572-ed1b9ed09503de8a@postgresql.org	2019-03-21 11:52:08 -04:00
Thomas Munro	0f086f84ad	Add DNS SRV support for LDAP server discovery. LDAP servers can be advertised on a network with RFC 2782 DNS SRV records. The OpenLDAP command-line tools automatically try to find servers that way, if no server name is provided by the user. Teach PostgreSQL to do the same using OpenLDAP's support functions, when building with OpenLDAP. For now, we assume that HAVE_LDAP_INITIALIZE (an OpenLDAP extension available since OpenLDAP 2.0 and also present in Apple LDAP) implies that you also have ldap_domain2hostlist() (which arrived in the same OpenLDAP version and is also present in Apple LDAP). Author: Thomas Munro Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/CAEepm=2hAnSfhdsd6vXsM6VZVN0br-FbAZ-O+Swk18S5HkCP=A@mail.gmail.com	2019-03-21 15:28:17 +13:00
Tom Lane	8aa9dd74b3	Sort the dependent objects before deletion in DROP OWNED BY. This finishes a task we left undone in commit `f1ad067fc`, by extending the delete-in-descending-OID-order rule to deletions triggered by DROP OWNED BY. We've coped with machine-dependent deletion orders one time too many, and the new issues caused by Peter G's recent nbtree hacking seem like the last straw. Discussion: https://postgr.es/m/E1h6eep-0001Mw-Vd@gemulon.postgresql.org	2019-03-20 18:06:29 -04:00
Alvaro Herrera	a6da004715	Add index_get_partition convenience function This new function simplifies some existing coding, as well as supports future patches. Discussion: https://postgr.es/m/201901222145.t6wws6t6vrcu@alvherre.pgsql Reviewed-by: Amit Langote, Jesper Pedersen	2019-03-20 18:18:50 -03:00
Peter Geoghegan	3d0dcc5c7f	Fix spurious compiler warning in nbtxlog.c. Cleanup from commit `dd299df8`. Per complaint from Tom Lane.	2019-03-20 14:04:35 -07:00
Peter Geoghegan	7d3bf73ac4	Suppress DETAIL output from a foreign_data test. Unstable sort order related to changes to nbtree from commit `dd299df8` can cause two lines of DETAIL output to be in opposite-of-expected order. Suppress the output using the same VERBOSITY hack that is used elsewhere in the foreign_data tests. Note that the same foreign_data.out DETAIL output was mechanically updated by commit `dd299df8`. Only a few such changes were required, though. Per buildfarm member batfish. Discussion: https://postgr.es/m/CAH2-WzkCQ_MtKeOpzozj7QhhgP1unXsK8o9DMAFvDqQFEPpkYQ@mail.gmail.com	2019-03-20 13:38:38 -07:00
Alvaro Herrera	815b20ae0c	Restore RI trigger sanity check I unnecessarily removed this check in `3de241dba8` because I misunderstood what the final representation of constraints across a partitioning hierarchy was to be. Put it back (in both branches). Discussion: https://postgr.es/m/201901222145.t6wws6t6vrcu@alvherre.pgsql	2019-03-20 17:28:43 -03:00
Peter Geoghegan	fab2502433	Consider secondary factors during nbtree splits. Teach nbtree to give some consideration to how "distinguishing" candidate leaf page split points are. This should not noticeably affect the balance of free space within each half of the split, while still making suffix truncation truncate away significantly more attributes on average. The logic for choosing a leaf split point now uses a fallback mode in the case where the page is full of duplicates and it isn't possible to find even a minimally distinguishing split point. When the page is full of duplicates, the split should pack the left half very tightly, while leaving the right half mostly empty. Our assumption is that logical duplicates will almost always be inserted in ascending heap TID order with v4 indexes. This strategy leaves most of the free space on the half of the split that will likely be where future logical duplicates of the same value need to be placed. The number of cycles added is not very noticeable. This is important because deciding on a split point takes place while at least one exclusive buffer lock is held. We avoid using authoritative insertion scankey comparisons to save cycles, unlike suffix truncation proper. We use a faster binary comparison instead. Note that even pg_upgrade'd v3 indexes make use of these optimizations. Benchmarking has shown that even v3 indexes benefit, despite the fact that suffix truncation will only truncate non-key attributes in INCLUDE indexes. Grouping relatively similar tuples together is beneficial in and of itself, since it reduces the number of leaf pages that must be accessed by subsequent index scans. Author: Peter Geoghegan Reviewed-By: Heikki Linnakangas Discussion: https://postgr.es/m/CAH2-WzmmoLNQOj9mAD78iQHfWLJDszHEDrAzGTUMG3mVh5xWPw@mail.gmail.com	2019-03-20 10:12:19 -07:00
Peter Geoghegan	dd299df818	Make heap TID a tiebreaker nbtree index column. Make nbtree treat all index tuples as having a heap TID attribute. Index searches can distinguish duplicates by heap TID, since heap TID is always guaranteed to be unique. This general approach has numerous benefits for performance, and is prerequisite to teaching VACUUM to perform "retail index tuple deletion". Naively adding a new attribute to every pivot tuple has unacceptable overhead (it bloats internal pages), so suffix truncation of pivot tuples is added. This will usually truncate away the "extra" heap TID attribute from pivot tuples during a leaf page split, and may also truncate away additional user attributes. This can increase fan-out, especially in a multi-column index. Truncation can only occur at the attribute granularity, which isn't particularly effective, but works well enough for now. A future patch may add support for truncating "within" text attributes by generating truncated key values using new opclass infrastructure. Only new indexes (BTREE_VERSION 4 indexes) will have insertions that treat heap TID as a tiebreaker attribute, or will have pivot tuples undergo suffix truncation during a leaf page split (on-disk compatibility with versions 2 and 3 is preserved). Upgrades to version 4 cannot be performed on-the-fly, unlike upgrades from version 2 to version 3. contrib/amcheck continues to work with version 2 and 3 indexes, while also enforcing stricter invariants when verifying version 4 indexes. These stricter invariants are the same invariants described by "3.1.12 Sequencing" from the Lehman and Yao paper. A later patch will enhance the logic used by nbtree to pick a split point. This patch is likely to negatively impact performance without smarter choices around the precise point to split leaf pages at. Making these two mostly-distinct sets of enhancements into distinct commits seems like it might clarify their design, even though neither commit is particularly useful on its own. The maximum allowed size of new tuples is reduced by an amount equal to the space required to store an extra MAXALIGN()'d TID in a new high key during leaf page splits. The user-facing definition of the "1/3 of a page" restriction is already imprecise, and so does not need to be revised. However, there should be a compatibility note in the v12 release notes. Author: Peter Geoghegan Reviewed-By: Heikki Linnakangas, Alexander Korotkov Discussion: https://postgr.es/m/CAH2-WzkVb0Kom=R+88fDFb=JSxZMFvbHVC6Mn9LJ2n=X=kS-Uw@mail.gmail.com	2019-03-20 10:04:01 -07:00
Peter Geoghegan	e5adcb789d	Refactor nbtree insertion scankeys. Use dedicated struct to represent nbtree insertion scan keys. Having a dedicated struct makes the difference between search type scankeys and insertion scankeys a lot clearer, and simplifies the signature of several related functions. This is based on a suggestion by Andrey Lepikhov. Streamline how unique index insertions cache binary search progress. Cache the state of in-progress binary searches within _bt_check_unique() for later instead of having callers avoid repeating the binary search in an ad-hoc manner. This makes it easy to add a new optimization: _bt_check_unique() now falls out of its loop immediately in the common case where it's already clear that there couldn't possibly be a duplicate. The new _bt_check_unique() scheme makes it a lot easier to manage cached binary search effort afterwards, from within _bt_findinsertloc(). This is needed for the upcoming patch to make nbtree tuples unique by treating heap TID as a final tiebreaker column. Unique key binary searches need to restore lower and upper bounds. They cannot simply continue to use the >= lower bound as the offset to insert at, because the heap TID tiebreaker column must be used in comparisons for the restored binary search (unlike the original _bt_check_unique() binary search, where scankey's heap TID column must be omitted). Author: Peter Geoghegan, Heikki Linnakangas Reviewed-By: Heikki Linnakangas, Andrey Lepikhov Discussion: https://postgr.es/m/CAH2-WzmE6AhUdk9NdWBf4K3HjWXZBX3+umC7mH7+WDrKcRtsOw@mail.gmail.com	2019-03-20 09:30:57 -07:00
Alexander Korotkov	550b9d26f8	Get rid of jsonpath_gram.h and jsonpath_scanner.h Jsonpath grammar and scanner are both quite small. It doesn't worth complexity to compile them separately. This commit makes grammar and scanner be compiled at once. Therefore, jsonpath_gram.h and jsonpath_gram.h are no longer needed. This commit also does some reorganization of code in jsonpath_gram.y. Discussion: https://postgr.es/m/d47b2023-3ecb-5f04-d253-d557547cf74f%402ndQuadrant.com	2019-03-20 11:13:34 +03:00
Alexander Korotkov	641fde2523	Remove ambiguity for jsonb_path_match() and jsonb_path_exists() There are 2-arguments and 4-arguments versions of jsonb_path_match() and jsonb_path_exists(). But 4-arguments versions have optional 3rd and 4th arguments, that leads to ambiguity. In the same time 2-arguments versions are needed only for @@ and @? operators. So, rename 2-arguments versions to remove the ambiguity. Catversion is bumped.	2019-03-20 10:30:56 +03:00
Tom Lane	1f39a1c064	Restructure libpq's handling of send failures. Originally, if libpq got a failure (e.g., ECONNRESET) while trying to send data to the server, it would just report that and wash its hands of the matter. It was soon found that that wasn't a very pleasant way of coping with server-initiated disconnections, so we introduced a hack (pqHandleSendFailure) in the code that sends queries to make it peek ahead for server error reports before reporting the send failure. It now emerges that related cases can occur during connection setup; in particular, as of TLS 1.3 it's unsafe to assume that SSL connection failures will be reported by SSL_connect rather than during our first send attempt. We could have fixed that in a hacky way by applying pqHandleSendFailure after a startup packet send failure, but (a) pqHandleSendFailure explicitly disclaims suitability for use in any state except query startup, and (b) the problem still potentially exists for other send attempts in libpq. Instead, let's fix this in a more general fashion by eliminating pqHandleSendFailure altogether, and instead arranging to postpone all reports of send failures in libpq until after we've made an attempt to read and process server messages. The send failure won't be reported at all if we find a server message or detect input EOF. (Note: this removes one of the reasons why libpq typically overwrites, rather than appending to, conn->errorMessage: pqHandleSendFailure needed that behavior so that the send failure report would be replaced if we got a server message or read failure report. Eventually I'd like to get rid of that overwrite behavior altogether, but today is not that day. For the moment, pqSendSome is assuming that its callees will overwrite not append to conn->errorMessage.) Possibly this change should get back-patched someday; but it needs testing first, so let's not consider that till after v12 beta. Discussion: https://postgr.es/m/CAEepm=2n6Nv+5tFfe8YnkUm1fXgvxR0Mm1FoD+QKG-vLNGLyKg@mail.gmail.com	2019-03-19 16:20:28 -04:00
Alexander Korotkov	5e28b778bf	Rename typedef in jsonpath_gram.y from "string" to "JsonPathString" Reason is the same as in `75c57058b0`.	2019-03-19 21:01:10 +03:00
Peter Geoghegan	1009920aaa	Tweak nbtsearch.c function prototype order. nbtsearch.c's static function prototypes were slightly out of order. Make the order consistent with static function definition order.	2019-03-19 09:59:05 -07:00
Tom Lane	0dfe3d0ef5	Make checkpoint requests more robust. Commit `6f6a6d8b1` introduced a delay of up to 2 seconds if we're trying to request a checkpoint but the checkpointer hasn't started yet (or, much less likely, our kill() call fails). However buildfarm experience shows that that's not quite enough for slow or heavily-loaded machines. There's no good reason to assume that the checkpointer won't start eventually, so we may as well make the timeout much longer, say 60 sec. However, if the caller didn't say CHECKPOINT_WAIT, it seems like a bad idea to be waiting at all, much less for as long as 60 sec. We can remove the need for that, and make this whole thing more robust, by adjusting the code so that the existence of a pending checkpoint request is clear from the contents of shared memory, and making sure that the checkpointer process will notice it at startup even if it did not get a signal. In this way there's no need for a non-CHECKPOINT_WAIT call to wait at all; if it can't send the signal, it can nonetheless assume that the checkpointer will eventually service the request. A potential downside of this change is that "kill -INT" on the checkpointer process is no longer enough to trigger a checkpoint, should anyone be relying on something so hacky. But there's no obvious reason to do it like that rather than issuing a plain old CHECKPOINT command, so we'll assume that nobody is. There doesn't seem to be a way to preserve this undocumented quasi-feature without introducing race conditions. Since a principal reason for messing with this is to prevent intermittent buildfarm failures, back-patch to all supported branches. Discussion: https://postgr.es/m/27830.1552752475@sss.pgh.pa.us	2019-03-19 12:49:27 -04:00
Peter Eisentraut	28988a84cf	Reorder LOCALLOCK structure members to compact the size Save 8 bytes (on x86-64) by filling up padding holes. Author: Takayuki Tsunakawa <tsunakawa.takay@jp.fujitsu.com> Discussion: https://www.postgresql.org/message-id/20190219001639.ft7kxir2iz644alf@alap3.anarazel.de	2019-03-19 14:07:08 +01:00
Alexander Korotkov	75c57058b0	Rename typedef in jsonpath_scan.l from "keyword" to "JsonPathKeyword" Typedef name should be both unique and non-intersect with variable names across all the sources. That makes both pg_indent and debuggers happy. Discussion: https://postgr.es/m/23865.1552936099%40sss.pgh.pa.us	2019-03-19 13:40:55 +03:00
Peter Eisentraut	590a87025b	Ignore attempts to add TOAST table to shared or catalog tables Running ALTER TABLE on any table will check if a TOAST table needs to be added. On shared tables, this would previously fail, thus effectively disabling ALTER TABLE for those tables. On (non-shared) system catalogs, on the other hand, it would add a TOAST table, even though we don't really want TOAST tables on some system catalogs. In some cases, it would also fail with an error "AccessExclusiveLock required to add toast table.", depending on what locks the ALTER TABLE actions had already taken. So instead, just ignore attempts to add TOAST tables to such tables, outside of bootstrap mode, pretending they don't need one. This allows running ALTER TABLE on such tables without messing up the TOAST situation. Legitimate uses for ALTER TABLE on system catalogs include setting reloptions (say, fillfactor or autovacuum settings). (All this still requires allow_system_table_mods, which is independent of this.) Discussion: https://www.postgresql.org/message-id/flat/e49f825b-fb25-0bc8-8afc-d5ad895c7975@2ndquadrant.com	2019-03-19 11:15:50 +01:00
Peter Eisentraut	e537ac5182	Fix whitespace	2019-03-19 10:28:34 +01:00
Peter Eisentraut	1f050c08f9	Fix bug in support for collation attributes on older ICU versions Unrecognized attribute names are supposed to be ignored. But the code would error out on an unrecognized attribute value even if it did not recognize the attribute name. So unrecognized attributes wouldn't really be ignored unless the value happened to be one that matched a recognized value. This would break some important cases where the attribute would be processed by ucol_open() directly. Fix that and add a test case. The restructured code should also avoid compiler warnings about initializing a UColAttribute value to -1, because the type might be an unsigned enum. (reported by Andres Freund)	2019-03-19 09:37:46 +01:00

... 3 4 5 6 7 ...

33864 Commits