postgresql

Commit Graph

Author	SHA1	Message	Date
Robert Haas	dc1057fcd8	Prevent generation of bogus subquery scan paths. Commit `0927d2f46d` didn't check that consider_parallel was set for the target relation or account for the possibility that required_outer might be non-empty. To prevent future bugs of this ilk, add some assertions to add_partial_path and do a bit of future-proofing of the code recently added to recurse_set_operations. Report by Andreas Seltenreich. Patch by Jeevan Chalke. Review by Amit Kapila and by me. Discussion: http://postgr.es/m/CAM2+6=U+9otsyF2fYB8x_2TBeHTR90itarqW=qAEjN-kHaC7kw@mail.gmail.com	2018-04-25 15:25:55 -04:00
Tom Lane	f04d4ac919	Reindent Perl files with perltidy version 20170521. Discussion: https://postgr.es/m/CABUevEzK3cNiHZQ18f5tK0guoT+cN_jWeVzhYYxY=r+1Q3SmoA@mail.gmail.com	2018-04-25 14:00:19 -04:00
Alvaro Herrera	bd4aad3239	Update ExecInitPartitionInfo comment Remove the words "if not already done." This obsolete wording corresponds to an early development version of what became `edd44738bc`. Author: Etsuro Fujita Reviewed-by: Amit Langote Discussion: https://postgr.es/m/5ADF117B.5030606@lab.ntt.co.jp	2018-04-24 23:00:48 -03:00
Alvaro Herrera	1957f8dabf	Initialize ExprStates once in run-time partition pruning Instead of doing ExecInitExpr every time a Param needs to be evaluated in run-time partition pruning, do it once during run-time pruning set-up and cache the exprstate in PartitionPruneContext, saving a lot of work. Author: David Rowley Reviewed-by: Amit Langote, Álvaro Herrera Discussion: https://postgr.es/m/CAKJS1f8-x+q-90QAPDu_okhQBV4DPEtPz8CJ=m0940GyT4DA4w@mail.gmail.com	2018-04-24 14:03:10 -03:00
Alvaro Herrera	055fb8d33d	Add GUC enable_partition_pruning This controls both plan-time and execution-time new-style partition pruning. While finer-grain control is possible (maybe using an enum GUC instead of boolean), there doesn't seem to be much need for that. This new parameter controls partition pruning for all queries: trivially, SELECT queries that affect partitioned tables are naturally under its control since they are using the new technology. However, while UPDATE/DELETE queries do not use the new code, we make the new GUC control their behavior also (stealing control from constraint_exclusion), because it is more natural, and it leads to a more natural transition to the future in which those queries will also use the new pruning code. Constraint exclusion still controls pruning for regular inheritance situations (those not involving partitioned tables). Author: David Rowley Review: Amit Langote, Ashutosh Bapat, Justin Pryzby, David G. Johnston Discussion: https://postgr.es/m/CAKJS1f_0HwsxJG9m+nzU+CizxSdGtfe6iF_ykPYBiYft302DCw@mail.gmail.com	2018-04-23 17:57:43 -03:00
Tom Lane	4df58f7ed7	Fix handling of partition bounds for boolean partitioning columns. Previously, you could partition by a boolean column as long as you spelled the bound values as string literals, for instance FOR VALUES IN ('t'). The trouble with this is that ruleutils.c printed that as FOR VALUES IN (TRUE), which is reasonable syntax but wasn't accepted by the grammar. That results in dump-and-reload failures for such cases. Apply a minimal fix that just causes TRUE and FALSE to be converted to strings 'true' and 'false'. This is pretty grotty, but it's too late for a more principled fix in v11 (to say nothing of v10). We should revisit the whole issue of how partition bound values are parsed for v12. Amit Langote Discussion: https://postgr.es/m/e05c5162-1103-7e37-d1ab-6de3e0afaf70@lab.ntt.co.jp	2018-04-23 15:29:11 -04:00
Peter Eisentraut	df044026fc	Fix typo in logical truncate replication This could result in some misbehavior in a cascading replication setup.	2018-04-23 13:38:22 -04:00
Alvaro Herrera	bc972072a3	Add missing pstrdup Lifetime of the input string is not right, so create a separate copy. Author: Amit Langote Discussion: https://postgr.es/m/a2773420-50d1-0a42-3396-fe42b0921134@lab.ntt.co.jp	2018-04-23 12:11:41 -03:00
Alvaro Herrera	dfce1f9e4e	Remove useless default clause in switch The switch covers all values of the enum driver variable, so having a default: clause is useless, even if it's only to do Assert(false).	2018-04-23 12:11:41 -03:00
Teodor Sigaev	a5ab8928d7	Make bms_prev_member work correctly with a 64 bit bitmapword `5c067521` had erroneously coded bms_prev_member assuming that a bitmapword would always hold 32 bits and started its search on what it thought was the highest 8 bits of the word. This was not the case if bitmapwords were 64 bits. In passing add a test to exercise this function a little. Previously there was no coverage at all. David Rowley	2018-04-23 17:59:17 +03:00
Teodor Sigaev	6db4b49986	Fix wrong validation of top-parent pointer during page deletion in Btree. After introducing usage of t_tid of inner or page high key for storing number of attributes of tuple, validation of tuple's ItemPointer with ItemPointerIsValid becomes incorrect; only the block number of the ItemPointer needs to be validated. Missing this causes an incorrect page deletion; fix that. A test is added. BTW, current contrib/amcheck doesn't fail on an index corrupted in this way. Also introduce BTreeTupleGetTopParent/BTreeTupleSetTopParent macros to improve code readability and to avoid possible confusion with page high key: high key is used to store top-parent link for branch to remove. Bug found by Michael Paquier, but bug doesn't exist in previous versions because t_tid was set to P_HIKEY. Author: Teodor Sigaev Reviewer: Peter Geoghegan Discussion: https://www.postgresql.org/message-id/flat/20180419052436.GA16000%40paquier.xyz	2018-04-23 15:55:10 +03:00
Tom Lane	a66c03f698	Add missing "static" marker. Per pademelon.	2018-04-21 11:21:18 -04:00
Stephen Frost	a0fefbcb71	Fix a couple minor typos In commit `f0e4475`, GetIndexOpClass was renamed to ResolveOpClass, but the comment in typecmds.c didn't get the memo. In objectaddress.c, missing 'of' in a comment. Both noticed by Vik Fearing, patch is mine though.	2018-04-20 19:04:54 -04:00
Tom Lane	b1b71f1658	Fix race conditions when an event trigger is added concurrently with DDL. EventTriggerTableRewrite crashed if there were table_rewrite triggers present, but there had not been when the calling command started. EventTriggerDDLCommandEnd called ddl_command_end triggers if present, even if there had been no such triggers when the calling command started, which would lead to a failure in pg_event_trigger_ddl_commands. In both cases, fix by doing nothing; it's better to wait till the next command when things will be properly initialized. In passing, remove an elog(DEBUG1) call that might have seemed interesting four years ago but surely isn't today. We found this because of intermittent failures in the buildfarm. Thanks to Alvaro Herrera and Andrew Gierth for analysis. Back-patch to 9.5; some of this code exists before that, but the specific hazards we need to guard against don't. Discussion: https://postgr.es/m/5767.1523995174@sss.pgh.pa.us	2018-04-20 17:15:31 -04:00
Tom Lane	ec38dcd363	Tweak a couple of planner APIs to save recalculating join relids. Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se	2018-04-20 16:00:47 -04:00
Tom Lane	c792c7db41	Change more places to be less trusting of RestrictInfo.is_pushed_down. On further reflection, commit `e5d83995e` didn't go far enough: pretty much everywhere in the planner that examines a clause's is_pushed_down flag ought to be changed to use the more complicated behavior where we also check the clause's required_relids. Otherwise we could make incorrect decisions about whether, say, a clause is safe to use as a hash clause. Some (many?) of these places are safe as-is, either because they are never reached while considering a parameterized path, or because there are additional checks that would reject a pushed-down clause anyway. However, it seems smarter to just code them all the same way rather than rely on easily-broken reasoning of that sort. In support of that, invent a new macro RINFO_IS_PUSHED_DOWN that should be used in place of direct tests on the is_pushed_down flag. Like the previous patch, back-patch to all supported branches. Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se	2018-04-20 15:19:16 -04:00
Tom Lane	68c23cba34	Improve consistency of comments in system catalog headers. Use the term "system catalog" rather than "system relation" in assorted places where it's clearly referring to a table rather than, say, an index. Use more natural word order in the header boilerplate, improve some of the one-liner catalog descriptions, and fix assorted random deviations from the normal boilerplate. All purely neatnik-ism, but why not. John Naylor, some additional cleanup by me Discussion: https://postgr.es/m/CAJVSVGUeJmFB3h-NJ18P32NPa+kzC165nm7GSoGHfPaN80Wxcw@mail.gmail.com	2018-04-19 17:14:09 -04:00
Tom Lane	e5d83995e9	Fix incorrect handling of join clauses pushed into parameterized paths. In some cases a clause attached to an outer join can be pushed down into the outer join's RHS even though the clause is not degenerate --- this can happen if we choose to make a parameterized path for the RHS. If the clause ends up attached to a lower outer join, we'd misclassify it as being a "join filter" not a plain "filter" condition at that node, leading to wrong query results. To fix, teach extract_actual_join_clauses to examine each join clause's required_relids, not just its is_pushed_down flag. (The latter now seems vestigial, or at least in need of rethinking, but we won't do anything so invasive as redefining it in a bug-fix patch.) This has been wrong since we introduced parameterized paths in 9.2, though it's evidently hard to hit given the lack of previous reports. The test case used here involves a lateral function call, and I think that a lateral reference may be required to get the planner to select a broken plan; though I wouldn't swear to that. In any case, even if LATERAL is needed to trigger the bug, it still affects all supported branches, so back-patch to all. Per report from Andreas Karlsson. Thanks to Andrew Gierth for preliminary investigation. Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se	2018-04-19 15:49:30 -04:00
Alvaro Herrera	79b2e52615	Remove quick path in ExecInitPartitionInfo for equal tupdescs I added this "optimization" on top of Amit Langote's `158b7bc6d7`, but the quick path is never taken because the partition uses a different pg_type oid than its parent table (causing equalTupleDescs to return false). Changing that requires more analysis and is considered too dangerous at this point in the cycle, so revert it. We might make it work someday, but not for pg11. Discussion: https://postgr.es/m/825031be-942c-8c24-6163-13c27f217a3d@lab.ntt.co.jp	2018-04-19 16:46:53 -03:00
Alvaro Herrera	2d625176c0	Plural of modulus is moduli	2018-04-19 12:39:13 -03:00
Alvaro Herrera	e5dcbb88a1	Rework code to determine partition pruning procedure Amit Langote reported that partition prune was unable to work with arrays, enums, etc, which led him to research the appropriate way to match query clauses to partition keys: instead of searching for an exact match of the expression's type, it is better to rely on the fact that the expression qual has already been resolved to a specific operator, and that the partition key is linked to a specific operator family. With that info, it's possible to figure out the strategy and comparison function to use for the pruning clause in a manner that works reliably for pseudo-types also. Include new test cases that demonstrate pruning where pseudotypes are involved. Author: Amit Langote, Álvaro Herrera Discussion: https://postgr.es/m/2b02f1e9-9812-9c41-972d-517bdc0f815d@lab.ntt.co.jp	2018-04-19 12:01:37 -03:00
Teodor Sigaev	f97f0c921a	Adjust _bt_insertonpg() comments Remove an obsolete reference to the 'afteritem' argument, which was removed by commit `bc292937`. Add a comment that clarifies how _bt_insertonpg() indirectly handles the insertion of high key items. Author: Peter Geoghegan	2018-04-19 11:08:45 +03:00
Teodor Sigaev	3d927961ae	Handle XLOG_BTREE_META_CLEANUP in btree_desc() and btree_identify() New WAL record XLOG_BTREE_META_CLEANUP introduced in `857f9c36` has no handling in btree_desc() and btree_identify(). This patch implements corresponding handling. Alexander Korotkov	2018-04-19 09:27:56 +03:00
Teodor Sigaev	075aade436	Adjust INCLUDE index truncation comments and code. Add several assertions that ensure that we're dealing with a pivot tuple without non-key attributes where that's expected. Also, remove the assertion within _bt_isequal(), restoring the v10 function signature. A similar check will be performed for the page highkey within _bt_moveright() in most cases. Also avoid dropping all objects within regression tests, to increase pg_dump test coverage for INCLUDE indexes. Rather than using infrastructure that's generally intended to be used with reference counted heap tuple descriptors during truncation, use the same function that was introduced to store flat TupleDescs in shared memory (we use a temp palloc'd buffer). This isn't strictly necessary, but seems more future-proof than the old approach. It also lets us avoid including rel.h within indextuple.c, which was arguably a modularity violation. Also, we now call index_deform_tuple() with the truncated TupleDesc, not the source TupleDesc, since that's more robust, and saves a few cycles. In passing, fix a memory leak by pfree'ing truncated pivot tuple memory during CREATE INDEX. Also pfree during a page split, just to be consistent. Refactor _bt_check_natts() to be more readable. Author: Peter Geoghegan with some editorization by me Reviewed by: Alexander Korotkov, Teodor Sigaev Discussion: https://www.postgresql.org/message-id/CAH2-Wz%3DkCWuXeMrBCopC-tFs3FbiVxQNjjgNKdG2sHxZ5k2y3w%40mail.gmail.com	2018-04-19 08:45:58 +03:00
Tom Lane	5372c2c841	Improve error detection/reporting in Catalog.pm and genbki.pl. Clean up error messages relating to mistakes in .dat files: make sure they provide the .dat file name and line number, not the place in the Perl script that's reporting the problem. Adopt more uniform message phrasing, too. Make genbki.pl spit up on unrecognized field names in the input hashes. Previously, it just silently ignored such fields, which could make a misspelled field name into a very hard-to-decipher problem. (This is in genbki.pl, not Catalog.pm, because we don't want reformat_dat_file.pl to complain about unrecognized fields. We'd rather it silently dropped them, to facilitate removing unwanted fields after a reorganization.)	2018-04-18 18:17:02 -04:00
Tom Lane	1dec82068b	Better fix for deadlock hazard in CREATE INDEX CONCURRENTLY. Commit `54eff5311` did not account for the possibility that we'd have a transaction snapshot due to default_transaction_isolation being set high enough to require one. The transaction snapshot is enough to hold back our advertised xmin and thus risk deadlock anyway. The only way to get rid of that snap is to start a new transaction, so let's do that instead. Also throw in an assert checking that we really have gotten to a state where no xmin is being advertised. Back-patch to 9.4, like the previous commit. Discussion: https://postgr.es/m/CAMkU=1ztk3TpQdcUNbxq93pc80FrXUjpDWLGMeVBDx71GHNwZQ@mail.gmail.com	2018-04-18 12:07:37 -04:00
Tom Lane	55d26ff638	Rationalize handling of single and double quotes in bootstrap data. Change things around so that proper quoting of values interpolated into the BKI data by initdb is the responsibility of initdb, not something we half-heartedly handle by putting double quotes into the raw BKI data. (Note: experimentation shows that it still doesn't work to put a double quote into the initial superuser username, but that's the fault of inadequate quoting while interpolating the name into SQL scripts; the BKI aspect of it works fine now.) Having done that, we can remove the special-case handling of values that look like "something" from genbki.pl, and instead teach it to escape double --- and single --- quotes properly. This removes the nowhere-documented need to treat those specially in the BKI source data; whatever you write will be passed through unchanged into the inserted data value, modulo Perl's rules about single-quoted strings. Add documentation explaining the (pre-existing) handling of backslashes in the BKI data. Per an earlier discussion with John Naylor. Discussion: https://postgr.es/m/CAJVSVGUNao=-Q2-vAN3PYcdF5tnL5JAHwGwzZGuYHtq+Mk_9ng@mail.gmail.com	2018-04-17 19:53:50 -04:00
Tom Lane	9ffcccdb95	Rationalize handling of array type names in bootstrap data. Formerly, Catalog.pm turned a C array type declaration in the catalog header files into a SQL type, e.g., 'foo[]'. Along the way, genbki.pl turned this into '_foo' for the purpose of type lookups, but wrote 'foo[]' to postgres.bki. During bootstrap, bootscanner.l had to have a special case rule to tokenize this, and then MapArrayTypeName() would turn 'foo[]' into '_foo' one more time. This seems unnecessarily complicated, especially since nobody cares that much about the readability of postgres.bki. Instead, make Catalog.pm convert the C declaration into '_foo' to start with, and preserve that representation of the type name throughout bootstrap data processing. Then rip out the special-case code in bootscanner.l and bootstrap.c. This changes postgres.bki to the extent that array fields are now declared like proconfig = _text , rather than proconfig = text[] , No documentation update, since the SGML docs didn't mention any of this in the first place, and it's all pretty transparent to writers of catalog header files anyway. John Naylor Discussion: https://postgr.es/m/CAJVSVGUNao=-Q2-vAN3PYcdF5tnL5JAHwGwzZGuYHtq+Mk_9ng@mail.gmail.com	2018-04-17 18:29:11 -04:00
Tom Lane	e90d4ddc63	Simplify genbki.pl's data quoting rules. During the bootstrap data format conversion, it seemed important for verifiability's sake that the generated postgres.bki file stayed the same as before. That resulted in adding a bunch of ad-hoc rules about when to quote emitted data values, to match previous manual decisions that had often quoted values unnecessarily. Now that the conversion is complete, it seems fine to remove all those ad-hoc rules. The net actual effect on the current contents of postgres.bki is that some fields that had been quoted despite containing only digits or only "-" lose their unnecessary quotes. Also, now that genbki.pl will always quote values containing a backslash, there's no need for bootscanner.l to allow unquoted octal escapes; so simplify its production for "id" by removing that possibility. John Naylor, slightly modified by me Discussion: https://postgr.es/m/CAJVSVGUNao=-Q2-vAN3PYcdF5tnL5JAHwGwzZGuYHtq+Mk_9ng@mail.gmail.com	2018-04-17 18:10:16 -04:00
Heikki Linnakangas	cf5a189059	Fix confusion on the padding of GIDs in commit and abort records. Review of commit 1eb6d652: It's pointless to add padding to the GID fields, when the code that follows assumes that there is no alignment, and uses memcpy(). Remove the pointless padding. Update comments to note the new fields in the WAL records. Reviewed-by: Michael Paquier Discussion: https://www.postgresql.org/message-id/33b787bf-dc20-1161-54e9-3f3b607bf59d%40iki.fi	2018-04-17 16:10:42 -04:00
Alvaro Herrera	b7e2cbc5b4	Update Append's idea of first_partial_plan It turns out that after runtime partition pruning, Append's first_partial_plan does not accurately represent partial plans to run, if any of those got pruned. This could limit participation of workers in some partial subplans, if other subplans got pruned. Fix it by keeping an index of the first valid partial subplan in the state node, determined at execnode Init time. Author: David Rowley, with cosmetic changes by me. Discussion: https://postgr.es/m/CAKJS1f8o2Yd=rOP=Et3A0FWgF+gSAOkFSU6eNhnGzTPV7nN8sQ@mail.gmail.com	2018-04-17 16:25:02 -03:00
Heikki Linnakangas	55101549d5	Fix a few typos in comments and variable names. Author: Michael Paquier Discussion: https://www.postgresql.org/message-id/20180411075223.GB19732%40paquier.xyz	2018-04-17 11:54:57 -04:00
Tatsuo Ishii	03030512d1	Add more infinite recursion detection while locking a view. Also add regression test cases for detecting infinite recursion in locking view tests. Some documentation enhancements. Patch by Yugo Nagata.	2018-04-17 16:59:17 +09:00
Tom Lane	b15e8f71db	Fix broken collation-aware searches in SP-GiST text opclass. spg_text_leaf_consistent() supposed that it should compare only Min(querylen, entrylen) bytes of the two strings, and then deal with any excess bytes in one string or the other by assuming the longer string is greater if the prefixes are equal. Quite aside from the fact that that's just wrong in some locales (e.g., 'ch' is not less than 'd' in cs_CZ), it also risked passing incomplete multibyte characters to strcoll(), with ensuing bad results. Instead, just pass the full strings to varstr_cmp, and let it decide what to do about unequal-length strings. Fortunately, this error doesn't imply any index corruption, it's just that searches might return the wrong set of entries. Per report from Emre Hasegeli, though this is not his patch. Thanks to Peter Geoghegan for review and discussion. This code was born broken, so back-patch to all supported branches. In HEAD, I failed to resist the temptation to do a bit of cosmetic cleanup/pgindent'ing on `710d90da1`, too. Discussion: https://postgr.es/m/CAE2gYzzb6K51VnTq5i5p52z+j9p2duEa-K1T3RrC_GQEynAKEg@mail.gmail.com	2018-04-16 16:06:58 -04:00
Alvaro Herrera	158b7bc6d7	Ignore whole-rows in INSERT/CONFLICT with partitioned tables We had an Assert() preventing whole-row expressions from being used in the SET clause of INSERT ON CONFLICT, but it seems unnecessary, given some tests, so remove it. Add a new test to exercise the case. Still at ExecInitPartitionInfo, we used map_partition_varattnos (which constructs an attribute map, then calls map_variable_attnos) using the same two relations many times in different expressions and with different parameters. Constructing the map over and over is a waste. To avoid this repeated work, construct the map once, and use map_variable_attnos() directly instead. Author: Amit Langote, per comments by me (Álvaro) Discussion: https://postgr.es/m/20180326142016.m4st5e34chrzrknk@alvherre.pgsql	2018-04-16 15:52:28 -03:00
Tom Lane	f8a187bdba	Clean up callers of JsonbIteratorNext(). Coverity complained about the lack of a check on the return value in parse_jsonb_index_flags' last call of JsonbIteratorNext. Seems like a reasonable gripe to me, especially since the code is depending on that being WJB_DONE to not leak memory, so add a check. In passing, improve a couple other places where the result was being ignored, either by adding an assert or at least a cast to void. Also, don't spell "WJB_DONE" as "0". That's horrid coding style, and it wasn't consistent either.	2018-04-15 12:40:01 -04:00
Magnus Hagander	33cedf1474	Don't attempt to verify checksums on new pages Teach both base backups and pg_verify_checksums that if a page is new, it does not have a checksum yet, so it shouldn't be verified. Noted by Tomas Vondra, review by David Steele.	2018-04-15 14:05:56 +02:00
Tom Lane	49ac4039b2	Simplify view-expansion code in rewriteHandler.c. In the wake of commit `50c6bb022`, it's not necessary for ApplyRetrieveRule to have a forUpdatePushedDown parameter. By the time control gets here for any given view-referencing RTE, we should already have pushed down the effects of any FOR UPDATE/SHARE clauses affecting the view from outer query levels. Hence if we don't find a RowMarkClause at the current query level, that's sufficient proof that there is no outer one either. This in turn means we need no forUpdatePushedDown parameter for fireRIRrules. I wonder whether we oughtn't also revert commit `cba2d2717`, since it now seems likely that that was band-aiding around the bad effects of doing FOR UPDATE pushdown and view expansion in the wrong order. However, in the absence of evidence that the current coding of markQueryForLocking is actually buggy (i.e. missing RTEs it ought to mark), it seems best to leave it alone. Discussion: https://postgr.es/m/24db7b8f-3de5-e25f-7ab9-d8848351d42c@gmail.com	2018-04-14 21:01:03 -04:00
Alvaro Herrera	da6f3e45dd	Reorganize partitioning code There's been a massive addition of partitioning code in PostgreSQL 11, with little oversight on its placement, resulting in a catalog/partition.c with poorly defined boundaries and responsibilities. This commit tries to set a couple of distinct modules to separate things a little bit. There are no code changes here, only code movement. There are three new files: src/backend/utils/cache/partcache.c src/include/partitioning/partdefs.h src/include/utils/partcache.h The previous arrangement of #including catalog/partition.h almost everywhere is no more. Authors: Amit Langote and Álvaro Herrera Discussion: https://postgr.es/m/98e8d509-790a-128c-be7f-e48a5b2d8d97@lab.ntt.co.jp https://postgr.es/m/11aa0c50-316b-18bb-722d-c23814f39059@lab.ntt.co.jp https://postgr.es/m/143ed9a4-6038-76d4-9a55-502035815e68@lab.ntt.co.jp https://postgr.es/m/20180413193503.nynq7bnmgh6vs5vm@alvherre.pgsql	2018-04-14 21:12:14 -03:00
Tom Lane	50c6bb0224	Fix enforcement of SELECT FOR UPDATE permissions with nested views. SELECT FOR UPDATE on a view should require UPDATE (as well as SELECT) permissions on the view, and then the view's owner needs those same permissions against the relations it references, and so on all the way down to base tables. But ApplyRetrieveRule did things in the wrong order, resulting in failure to mark intermediate view levels as needing UPDATE permission. Thus for example, if user A creates a table T and an updatable view V1 on T, then grants only SELECT permissions on V1 to user B, B could create a second view V2 on V1 and then would be allowed to perform SELECT FOR UPDATE via V2 (since V1 wouldn't be checked for UPDATE permissions). To fix, just switch the order of expanding sub-views and marking referenced objects as needing UPDATE permission. I think additional simplifications are now possible, but that's distinct from the bug fix proper. This is certainly a security issue, but the consequences are pretty minor (just the ability to lock rows that shouldn't be lockable). Against that we have a small risk of breaking applications that are working as-desired, since nested views have behaved this way since such cases worked at all. On balance I'm inclined not to back-patch. Per report from Alexander Lakhin. Discussion: https://postgr.es/m/24db7b8f-3de5-e25f-7ab9-d8848351d42c@gmail.com	2018-04-14 15:38:09 -04:00
Peter Eisentraut	e013288a65	Improve code comments As of `0c2c81b403`, the replication parameter in libpq is no longer "deliberately undocumented".	2018-04-14 10:04:36 -04:00
Peter Eisentraut	a8677e3ff6	Support named and default arguments in CALL We need to call expand_function_arguments() to expand named and default arguments. In PL/pgSQL, we also need to deal with named and default INOUT arguments when receiving the output values into variables. Author: Pavel Stehule <pavel.stehule@gmail.com>	2018-04-14 09:13:53 -04:00
Andrew Dunstan	7c44c46deb	Prevent segfault in expand_tuple with no missing values Commit `16828d5c` forgot to check that it had a set of missing values before trying to retrieve a value from it. An additional query to add coverage for this code is added to the regression test. Per bug report from Andreas Seltenreich.	2018-04-13 16:43:33 -04:00
Tom Lane	8bf358c18e	Improve regression test coverage for src/backend/tsearch/spell.c. In passing, throw an error if the AF count is too small, rather than just silently discarding extra affix entries. Note that the new regression test cases require installing the updated src/backend/tsearch/dicts files. Arthur Zakirov Discussion: https://postgr.es/m/20180413113447.GA32474@zakirov.localdomain	2018-04-13 13:49:52 -04:00
Tom Lane	65a69dfa08	Fix bogus affix-merging code. NISortAffixes() compared successive compound affixes incorrectly, thus possibly failing to merge identical affixes, or (less likely) merging ones that shouldn't be merged. The user-visible effects of this are unclear, to me anyway. Per bug #15150 from Alexander Lakhin. It's been broken for a long time, so back-patch to all supported branches. Arthur Zakirov Discussion: https://postgr.es/m/152353327780.31225.13445405496721177988@wrigleys.postgresql.org	2018-04-12 18:39:51 -04:00
Alvaro Herrera	b8ca984b2c	Revert lowering of lock level for ATTACH PARTITION I lowered the lock level for partitions being scanned from AccessExclusive to ShareLock in the course of `72cf7f310c`, but that was bogus, as pointed out by Robert Haas. Revert that bit. Doing this is possible, but requires more work. Discussion: https://postgr.es/m/CA+TgmobV7Nfmqv+TZXcdSsb9Bjc-OL-Anv6BNmCbfJVZLYPE4Q@mail.gmail.com	2018-04-12 16:53:27 -03:00
Alvaro Herrera	181ccbb5e4	Add comment about default partition in check_new_partition_bound The intention of the test is not immediately obvious, so we need this much.	2018-04-12 16:52:29 -03:00
Alvaro Herrera	a4d56f583e	Use the right memory context for partkey's FmgrInfo We were using CurrentMemoryContext to put the partsupfunc fmgr_info into, which isn't right, because we want the PartitionKey as a whole to be in the isolated Relation->rd_partkeycxt context. This can cause a crash with user-defined support functions in the operator classes used by partitioning keys. (Maybe this can cause problems with core-supplied opclasses too, not sure.) This is demonstrably broken in Postgres 10, too, but the initial proposed fix runs afoul of a problem discussed back when `8a0596cb65` ("Get rid of copy_partition_key") reorganized that code: namely that it is possible to jump out of RelationBuildPartitionKey because of some error and leave a dangling memory context child of CacheMemoryContext. Also, while reviewing this I noticed that the removed-in-pg11 copy_partition_key was doing something wrong, unfixed in pg10, namely doing memcpy() on the FmgrInfo, which is bogus (should be doing fmgr_info_copy). Therefore, in branch pg10, the sane fix seems to be to backpatch both the aforementioned `8a0596cb65` and its followup `be2343221f` ("Protect against hypothetical memory leaks in RelationGetPartitionKey"), so do that, then apply the fmgr_info memcxt bugfix on top. Add a test case exercising btree-based custom operator classes, which causes a crash prior to this fix. This is not a security problem, because in order to create an operator class you need superuser privileges anyway. Authors: Álvaro Herrera and Amit Langote Reported and diagnosed by: Amit Langote Discussion: https://postgr.es/m/3041e853-b1dd-a0c6-ff21-7cc5633bffd0@lab.ntt.co.jp	2018-04-12 15:08:10 -03:00
Teodor Sigaev	524054598f	Fix interference between covering indexes and partitioned tables The bug is caused by the original IndexStmt that DefineIndex receives being overwritten when processing the INCLUDE columns. Use a separate list of index params to propagate to child tables. Add tests covering this case. Amit Langote and Alexander Korotkov. Re-commit `5c6110c6a9` because it discovered a bug fixed in `c266ed31a8` Discussion: https://www.postgresql.org/message-id/CAJGNTeO%3DBguEyG8wxMpU_Vgvg3nGGzy71zUQ0RpzEn_mb0bSWA%40mail.gmail.com	2018-04-12 17:25:13 +03:00
Teodor Sigaev	c266ed31a8	Cleanup covering infrastructure - Explicitly forbid opclass, collation and indoptions (like DESC/ASC etc) for including columns. Throw an error if the user specifies them. - Truncate storage arrays for such attributes to store only key attributes, and add assertion checks. - Do not check opfamily and collation for including columns in CompareIndexInfo() Discussion: https://www.postgresql.org/message-id/5ee72852-3c4e-ee35-e2ed-c1d053d45c08@sigaev.ru	2018-04-12 16:37:22 +03:00
Simon Riggs	08ea7a2291	Revert MERGE patch This reverts commits `d204ef6377`, `83454e3c2b` and a few more commits thereafter (complete list at the end) related to MERGE feature. While the feature was fully functional, with sufficient test coverage and necessary documentation, it was felt that some parts of the executor and parse-analyzer can use a different design and it wasn't possible to do that in the available time. So it was decided to revert the patch for PG11 and retry again in the future. Thanks again to all reviewers and bug reporters. List of commits reverted, in reverse chronological order: `f1464c5380` Improve parse representation for MERGE `ddb4158579` MERGE syntax diagram correction `530e69e59b` Allow cpluspluscheck to pass by renaming variable `01b88b4df5` MERGE minor errata `3af7b2b0d4` MERGE fix variable warning in non-assert builds `a5d86181ec` MERGE INSERT allows only one VALUES clause `4b2d44031f` MERGE post-commit review `4923550c20` Tab completion for MERGE `aa3faa3c7a` WITH support in MERGE `83454e3c2b` New files for MERGE `d204ef6377` MERGE SQL Command following SQL:2016 Author: Pavan Deolasee Reviewed-by: Michael Paquier	2018-04-12 11:22:56 +01:00
Teodor Sigaev	c9c875a28f	Rename IndexInfo.ii_KeyAttrNumbers array Rename ii_KeyAttrNumbers to ii_IndexAttrNumbers to prevent confusion with ii_NumIndexAttrs/ii_NumIndexKeyAttrs. ii_IndexAttrNumbers contains all attributes including "including" columns, not only key attribute. Discussion: https://www.postgresql.org/message-id/13123421-1d52-d0e4-c95c-6d69011e0595%40sigaev.ru	2018-04-12 13:02:45 +03:00
Alvaro Herrera	9e9befac4a	Set relispartition correctly for index partitions Oversight in commit 8b08f7d4820f: pg_class.relispartition was not being set for index partitions, which is a bit odd, and was also causing the code to unnecessarily call has_superclass() when simply checking the flag was enough. Author: Álvaro Herrera Reported-by: Amit Langote Discussion: https://postgr.es/m/12085bc4-0bc6-0f3a-4c43-57fe0681772b@lab.ntt.co.jp	2018-04-11 21:27:12 -03:00
Tom Lane	d1e9079295	Ignore nextOid when replaying an ONLINE checkpoint. The nextOid value is from the start of the checkpoint and may well be stale compared to values from more recent XLOG_NEXTOID records. Previously, we adopted it anyway, allowing the OID counter to go backwards during a crash. While this should be harmless, it contributed to the severity of the bug fixed in commit `0408e1ed5`, by allowing duplicate TOAST OIDs to be assigned immediately following a crash. Without this error, that issue would only have arisen when TOAST objects just younger than a multiple of 2^32 OIDs were deleted and then not vacuumed in time to avoid a conflict. Pavan Deolasee Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com	2018-04-11 18:11:29 -04:00
Tom Lane	0408e1ed59	Do not select new object OIDs that match recently-dead entries. When selecting a new OID, we take care to avoid picking one that's already in use in the target table, so as not to create duplicates after the OID counter has wrapped around. However, up to now we used SnapshotDirty when scanning for pre-existing entries. That ignores committed-dead rows, so that we could select an OID matching a deleted-but-not-yet-vacuumed row. While that mostly worked, it has two problems: * If recently deleted, the dead row might still be visible to MVCC snapshots, creating a risk for duplicate OIDs when examining the catalogs within our own transaction. Such duplication couldn't be visible outside the object-creating transaction, though, and we've heard few if any field reports corresponding to such a symptom. * When selecting a TOAST OID, deleted toast rows definitely are visible to SnapshotToast, and will remain so until vacuumed away. This leads to a conflict that will manifest in errors like "unexpected chunk number 0 (expected 1) for toast value nnnnn". We've been seeing reports of such errors from the field for years, but the cause was unclear before. The fix is simple: just use SnapshotAny to search for conflicting rows. This results in a slightly longer window before object OIDs can be recycled, but that seems unlikely to create any large problems. Pavan Deolasee Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com	2018-04-11 17:41:22 -04:00
Heikki Linnakangas	811969b218	Allocate enough shared string memory for stats of auxiliary processes. This fixes a bug whereby the st_appname, st_clienthostname, and st_activity_raw fields for auxiliary processes point beyond the end of their respective shared memory segments. As a result, the application_name of a backend might show up as the client hostname of an auxiliary process. Backpatch to v10, where this bug was introduced, when the auxiliary processes were added to the array. Author: Edmund Horner Reviewed-by: Michael Paquier Discussion: https://www.postgresql.org/message-id/CAMyN-kA7aOJzBmrYFdXcc7Z0NmW%2B5jBaf_m%3D_-77uRNyKC9r%3DA%40mail.gmail.com	2018-04-11 23:39:49 +03:00
Heikki Linnakangas	a820b4c329	Make local copy of client hostnames in backend status array. The other strings, application_name and query string, were snapshotted to local memory in pgstat_read_current_status(), but we forgot to do that for client hostnames. As a result, the client hostname would appear to change in the local copy, if the client disconnected. Backpatch to all supported versions. Author: Edmund Horner Reviewed-by: Michael Paquier Discussion: https://www.postgresql.org/message-id/CAMyN-kA7aOJzBmrYFdXcc7Z0NmW%2B5jBaf_m%3D_-77uRNyKC9r%3DA%40mail.gmail.com	2018-04-11 23:39:48 +03:00
Alvaro Herrera	72cf7f310c	Fix ALTER TABLE .. ATTACH PARTITION ... DEFAULT If the table being attached contained values that contradict the default partition's partition constraint, it would fail to complain, because CommandCounterIncrement changes in `4dba331cb3` coupled with some bogus coding in the existing ValidatePartitionConstraints prevented the partition constraint from being validated after all -- or rather, it caused to constraint to become an empty one, always succeeding. Fix by not re-reading the OID of the default partition in ATExecAttachPartition. To forestall similar problems, revise the existing code: * rename routine from ValidatePartitionConstraints() to QueuePartitionConstraintValidation, to better represent what it actually does. * add an Assert() to make sure that when queueing a constraint for a partition we're not overwriting a constraint previously queued. * add an Assert() that we don't try to invoke the special-purpose validation of the default partition when attaching the default partition itself. While at it, change some loops to obtain partition OIDs from partdesc->oids rather than find_all_inheritors; reduce the lock level of partitions being scanned from AccessExclusiveLock to ShareLock; rewrite QueuePartitionConstraintValidation in a recursive fashion rather than repetitive. Author: Álvaro Herrera. Tests written by Amit Langote Reported-by: Rushabh Lathia Diagnosed-by: Kyotaro HORIGUCHI, who also provided the initial fix. Reviewed-by: Kyotaro HORIGUCHI, Amit Langote, Jeevan Ladhe Discussion: https://postgr.es/m/CAGPqQf0W+v-Ci_qNV_5R3A=Z9LsK4+jO7LzgddRncpp_rrnJqQ@mail.gmail.com	2018-04-11 15:32:46 -03:00
Teodor Sigaev	92899992e1	Temporary revert `5c6110c6a9` It discovers one more bug in CompareIndexInfo(), should be fixed first.	2018-04-11 19:32:19 +03:00
Teodor Sigaev	5c6110c6a9	Fix interference between covering indexes and partitioned tables The bug is caused due to the original IndexStmt that DefineIndex receives being overwritten when processing the INCLUDE columns. Use separate list of index params to propagate to child tables. Add tests covering this case. Amit Langote and Alexander Korotkov.	2018-04-11 16:44:26 +03:00
Andrew Dunstan	8716b264ed	minor comment fixes in nbtinsert.c	2018-04-10 18:36:40 -04:00
Tom Lane	231bcd0803	Fix incorrect close() call in dsm_impl_mmap(). One improbable error-exit path in this function used close() where it should have used CloseTransientFile(). This is unlikely to be hit in the field, and I think the consequences wouldn't be awful (just an elog(LOG) bleat later). But a bug is a bug, so back-patch to 9.4 where this code came in. Pan Bian Discussion: https://postgr.es/m/152056616579.4966.583293218357089052@wrigleys.postgresql.org	2018-04-10 18:34:54 -04:00
Andrew Dunstan	074251db67	Adjustments to the btree fastpath optimization. This optimization was introduced in commit `2b272734`. The changes include some additional comments and documentation, and also these more substantive changes: . ensure the optimization is only applied on the leaf node of a tree whose root is on level 2 or more. It's of little value on small trees. . Delay calling RelationSetTargetBlock() until after the critical section of _bt_insertonpg . ensure the optimization is also applied to unlogged tables. Pavan Deolasee and Peter Geoghegan with some very light editing from me. Discussion: https://postgr.es/m/CABOikdO8jhRarNC60nZLktZYhxt+TK8z_V97+Ny499YQdyAfug@mail.gmail.com	2018-04-10 18:21:03 -04:00
Alvaro Herrera	15a8f8caad	Fix IndexOnlyScan counter for heap fetches in parallel mode The HeapFetches counter was using a simple value in IndexOnlyScanState, which fails to propagate values from parallel workers; so the counts are wrong when IndexOnlyScan runs in parallel. Move it to Instrumentation, like all the other counters. While at it, change INSERT ON CONFLICT conflicting tuple counter to use the new ntuples2 instead of nfiltered2, which is a blatant misuse. Discussion: https://postgr.es/m/20180409215851.idwc75ct2bzi6tea@alvherre.pgsql	2018-04-10 15:56:15 -03:00
Heikki Linnakangas	29d7ebf51e	Fix comment on B-tree insertion fastpath condition. The comment earlier in the function correctly states "and the insertion key is strictly greater than the first key in this page". That is what we check here, not "greater than or equal".	2018-04-10 16:57:19 +03:00
Tom Lane	3b8f6e75f3	Fix partial-build problems introduced by having more generated headers. Commit `372728b0d` created some problems for usages like building a subdirectory without having first done "make all" at the top level, or for proceeding directly to "make install" without "make all". The only reasonably clean way to fix this seems to be to force the submake-generated-headers rule to fire in any "make all" or "make install" command anywhere in the tree. To avoid lots of redundant work, as well as parallel make jobs possibly clobbering each others' output, we still need to be sure that the rule fires only once in a recursive build. For that, adopt the same MAKELEVEL hack previously used for "temp-install". But try to document it a bit better. The submake-errcodes mechanism previously used in src/port/ and src/common/ is subsumed by this, so we can get rid of those special cases. It was inadequate for src/common/ anyway after the aforesaid commit, and it always risked parallel attempts to build errcodes.h. Discussion: https://postgr.es/m/E1f5FAB-0006LU-MB@gemulon.postgresql.org	2018-04-09 16:42:10 -04:00
Alvaro Herrera	468abb8f7a	Fix incorrect logic for choosing the next Parallel Append subplan In `499be013de` support for pruning unneeded Append subnodes was added. The logic in that commit was not correctly checking if the next subplan was in fact a valid subplan. This could cause parallel workers processes to be given a subplan to work on which didn't require any work. Per code review following an otherwise unexplained regression failure in buildfarm member Pademelon. (We haven't been able to reproduce the failure, so this is a bit of a blind fix in terms of whether it'll actually fix it; but it is a clear bug nonetheless). In passing, also add a comment to explain what first_partial_plan means. Author: David Rowley Discussion: https://postgr.es/m/CAKJS1f_E5r05hHUVG3UmCQJ49DGKKHtN=SHybD44LdzBn+CJng@mail.gmail.com	2018-04-09 17:23:49 -03:00
Tom Lane	a65e17bd6f	Reduce chattiness of genbki.pl and Gen_fmgrtab.pl. Make these scripts emit just one log message when they run, not one per output file. The latter is way too verbose in the wake of commit `372728b0d`. The specific wording used is what already existed in the MSVC scripts. John Naylor Discussion: https://postgr.es/m/11103.1523208822@sss.pgh.pa.us	2018-04-09 15:01:10 -04:00
Tom Lane	af1a949109	Further cleanup of client dependencies on src/include/catalog headers. In commit `9c0a0de4c`, I'd failed to notice that catalog/catalog.h should also be considered a frontend-unsafe header, because it includes (and needs) the full form of pg_class.h, not to mention relcache.h. However, various frontend code was depending on it to get TABLESPACE_VERSION_DIRECTORY, so refactoring of some sort is called for. The cleanest answer seems to be to move TABLESPACE_VERSION_DIRECTORY, as well as the OIDCHARS symbol, to common/relpath.h. Do that, and mop up inclusions as necessary. (I found that quite a few current users of catalog/catalog.h don't seem to need it at all anymore, apparently as a result of the refactorings that created common/relpath.[hc]. And initdb.c needed it only as a route to pg_class_d.h.) Discussion: https://postgr.es/m/6629.1523294509@sss.pgh.pa.us	2018-04-09 14:39:58 -04:00
Magnus Hagander	a228cc13ae	Revert "Allow on-line enabling and disabling of data checksums" This reverts the backend sides of commit `1fde38beaa`. I have, at least for now, left the pg_verify_checksums tool in place, as this tool can be very valuable without the rest of the patch as well, and since it's a read-only tool that only runs when the cluster is down it should be a lot safer.	2018-04-09 19:03:42 +02:00
Alvaro Herrera	d7a95f06a1	Minor comment updates Fix a couple of typos, and update a comment about why we set a BMS to NULL. Author: David Rowley Discussion: http://postgr.es/m/CAKJS1f-tux=KdUz6ENJ9GHM_V2qgxysadYiOyQS9Ko9PTteVhQ@mail.gmail.com	2018-04-09 11:17:35 -03:00
Alvaro Herrera	7ba6ee815d	Add missed bms_copy() in perform_pruning_combine_step We were initializing a BMS to merely reference an existing one, which would cause a double-free (and a crash) when the recursive algorithm tried to intersect it with an empty one. Fix it by creating a copy at initialization time. Reported-by: sqlsmith (by way of Andreas Seltenreich) Author: Amit Langote Discussion: https://postgr.es/m/87in923lyw.fsf@ansel.ydns.eu	2018-04-09 10:54:28 -03:00
Heikki Linnakangas	2c19ea863a	Fix typo in comment. Author: Kyotaro Horiguchi	2018-04-09 14:20:13 +03:00
Tom Lane	b3b7f7898f	Fix additional breakage in covering-index patch. CheckIndexCompatible() misused ComputeIndexAttrs() by not bothering to fill ii_NumIndexAttrs and ii_NumIndexKeyAttrs in the passed IndexInfo. Omission of ii_NumIndexAttrs was previously unimportant, but now this matters because ComputeIndexAttrs depends on ii_NumIndexKeyAttrs to decide how many columns it needs to report on. (BTW, the fact that this oversight wasn't detected earlier implies that we have no regression test verifying whether CheckIndexCompatible ever succeeds. Bad dog. Not the job of this patch to fix it, though.) Also, change the API of ComputeIndexAttrs so that it fills the opclass output array for all column positions, as it does for the options output array; positions for non-key index columns are filled with zeroes. This isn't directly fixing any bug, but it seems like a good idea. Per valgrind failure reports from buildfarm. Alexander Korotkov, tweaked a bit by me Discussion: https://postgr.es/m/CAPpHfduWrysrT-qAhn+3Ea5+Mg6Vhc-oA6o2Z-hRCPRdvf3tiw@mail.gmail.com	2018-04-08 17:23:39 -04:00
Tom Lane	cefa387153	Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers. Traditionally, include/catalog/pg_foo.h contains extern declarations for functions in backend/catalog/pg_foo.c, in addition to its function as the authoritative definition of the pg_foo catalog's rowtype. In some cases, we'd been forced to split out those extern declarations into separate pg_foo_fn.h headers so that the catalog definitions could be #include'd by frontend code. That problem is gone as of commit `9c0a0de4c`, so let's undo the splits to make things less confusing. Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us	2018-04-08 14:35:29 -04:00
Tom Lane	372728b0d4	Replace our traditional initial-catalog-data format with a better design. Historically, the initial catalog data to be installed during bootstrap has been written in DATA() lines in the catalog header files. This had lots of disadvantages: the format was badly underdocumented, it was very difficult to edit the data in any mechanized way, and due to the lack of any abstraction the data was verbose, hard to read/understand, and easy to get wrong. Hence, move this data into separate ".dat" files and represent it in a way that can easily be read and rewritten by Perl scripts. The new format is essentially "key => value" for each column; while it's a bit repetitive, explicit labeling of each value makes the data far more readable and less error-prone. Provide a way to abbreviate entries by omitting field values that match a specified default value for their column. This allows removal of a large amount of repetitive boilerplate and also lowers the barrier to adding new columns. Also teach genbki.pl how to translate symbolic OID references into numeric OIDs for more cases than just "regproc"-like pg_proc references. It can now do that for regprocedure-like references (thus solving the problem that regproc is ambiguous for overloaded functions), operators, types, opfamilies, opclasses, and access methods. Use this to turn nearly all OID cross-references in the initial data into symbolic form. This represents a very large step forward in readability and error resistance of the initial catalog data. It should also reduce the difficulty of renumbering OID assignments in uncommitted patches. Also, solve the longstanding problem that frontend code that would like to use OID macros and other information from the catalog headers often had difficulty with backend-only code in the headers. To do this, arrange for all generated macros, plus such other declarations as we deem fit, to be placed in "derived" header files that are safe for frontend inclusion. 
(Once clients migrate to using these pg_*_d.h headers, it will be possible to get rid of the pg_*_fn.h headers, which only exist to quarantine code away from clients. That is left for follow-on patches, however.) The now-automatically-generated macros include the Anum_xxx and Natts_xxx constants that we used to have to update by hand when adding or removing catalog columns. Replace the former manual method of generating OID macros for pg_type entries with an automatic method, ensuring that all built-in types have OID macros. (But note that this patch does not change the way that OID macros for pg_proc entries are built and used. It's not clear that making that match the other catalogs would be worth extra code churn.) Add SGML documentation explaining what the new data format is and how to work with it. Despite being a very large change in the catalog headers, there is no catversion bump here, because postgres.bki and related output files haven't changed at all. John Naylor, based on ideas from various people; review and minor additional coding by me; previous review by Alvaro Herrera Discussion: https://postgr.es/m/CAJVSVGWO48JbbwXkJz_yBFyGYW-M9YWxnPdxJBUosDC9ou_F0Q@mail.gmail.com	2018-04-08 13:17:27 -04:00
Teodor Sigaev	02f3e558f2	match_clause_to_index should check only key columns Alexander Korotkov per gripe from Tom Lane noticed on valgrind-enabled buildfarm members	2018-04-08 19:58:15 +03:00
Teodor Sigaev	34602b0a1d	Remove unused variable in non-assert-enabled build Use field of structure in Assert directly Jeff Janes	2018-04-08 19:30:38 +03:00
Andrew Gierth	49b0e300f7	Support index INCLUDE in the AM properties interface. This rectifies an oversight in commit `8224de4f4`, by adding a new property 'can_include' for pg_indexam_has_property, and adjusting the results of pg_index_column_has_property to give more appropriate results for INCLUDEd columns.	2018-04-08 06:02:05 +01:00
Stephen Frost	2b74022473	Fix EXEC BACKEND + Windows builds for group privs Under EXEC BACKEND we also need to be going through the group privileges setup since we do support that on Unixy systems, so add that to SubPostmasterMain(). Under Windows, we need to simply return true from GetDataDirectoryCreatePerm(), but that wasn't happening due to a missing #else clause. Per buildfarm.	2018-04-07 19:01:43 -04:00
Stephen Frost	c37b3d08ca	Allow group access on PGDATA Allow the cluster to be optionally init'd with read access for the group. This means a relatively non-privileged user can perform a backup of the cluster without requiring write privileges, which enhances security. The mode of PGDATA is used to determine whether group permissions are enabled for directory and file creates. This method was chosen as it's simple and works well for the various utilities that write into PGDATA. Changing the mode of PGDATA manually will not automatically change the mode of all the files contained therein. If the user would like to enable group access on an existing cluster then changing the mode of all the existing files will be required. Note that pg_upgrade will automatically change the mode of all migrated files if the new cluster is init'd with the -g option. Tests are included for the backend and all the utilities which operate on the PG data directory to ensure that the correct mode is set based on the data directory permissions. Author: David Steele <david@pgmasters.net> Reviewed-By: Michael Paquier, with discussion amongst many others. Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net	2018-04-07 17:45:39 -04:00
Stephen Frost	da9b580d89	Refactor dir/file permissions Consolidate directory and file create permissions for tools which work with the PG data directory by adding a new module (common/file_perm.c) that contains variables (pg_file_create_mode, pg_dir_create_mode) and constants to initialize them (0600 for files and 0700 for directories). Convert mkdir() calls in the backend to MakePGDirectory() if the original call used default permissions (always the case for regular PG directories). Add tests to make sure permissions in PGDATA are set correctly by the tools which modify the PG data directory. Authors: David Steele <david@pgmasters.net>, Adam Brightwell <adam.brightwell@crunchydata.com> Reviewed-By: Michael Paquier, with discussion amongst many others. Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net	2018-04-07 17:45:39 -04:00
Alvaro Herrera	499be013de	Support partition pruning at execution time Existing partition pruning is only able to work at plan time, for query quals that appear in the parsed query. This is good but limiting, as there can be parameters that appear later that can be usefully used to further prune partitions. This commit adds support for pruning subnodes of Append which cannot possibly contain any matching tuples, during execution, by evaluating Params to determine the minimum set of subnodes that can possibly match. We support more than just simple Params in WHERE clauses. Support additionally includes: 1. Parameterized Nested Loop Joins: The parameter from the outer side of the join can be used to determine the minimum set of inner side partitions to scan. 2. Initplans: Once an initplan has been executed we can then determine which partitions match the value from the initplan. Partition pruning is performed in two ways. When Params external to the plan are found to match the partition key we attempt to prune away unneeded Append subplans during the initialization of the executor. This allows us to bypass the initialization of non-matching subplans meaning they won't appear in the EXPLAIN or EXPLAIN ANALYZE output. For parameters whose value is only known during the actual execution then the pruning of these subplans must wait. Subplans which are eliminated during this stage of pruning are still visible in the EXPLAIN output. In order to determine if pruning has actually taken place, the EXPLAIN ANALYZE must be viewed. If a certain Append subplan was never executed due to the elimination of the partition then the execution timing area will state "(never executed)". Whereas, if, for example in the case of parameterized nested loops, the number of loops stated in the EXPLAIN ANALYZE output for certain subplans may appear lower than others due to the subplan having been scanned fewer times. 
This is due to the list of matching subnodes having to be evaluated whenever a parameter which was found to match the partition key changes. This commit required some additional infrastructure that permits the building of a data structure which is able to perform the translation of the matching partition IDs, as returned by get_matching_partitions, into the list index of a subpaths list, as exist in node types such as Append, MergeAppend and ModifyTable. This allows us to translate a list of clauses into a Bitmapset of all the subpath indexes which must be included to satisfy the clause list. Author: David Rowley, based on an earlier effort by Beena Emerson Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi, Jesper Pedersen Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com	2018-04-07 17:54:39 -03:00
Alvaro Herrera	5c0675215e	Add bms_prev_member function This works very much like the existing bms_last_member function, only it traverses through the Bitmapset in the opposite direction from the most significant bit down to the least significant bit. A special prevbit value of -1 may be used to have the function determine the most significant bit. This is useful for starting a loop. When there are no members less than prevbit, the function returns -2 to indicate there are no more members. Author: David Rowley Discussion: https://postgr.es/m/CAKJS1f-K=3d5MDASNYFJpUpc20xcBnAwNC1-AOeunhn0OtkWbQ@mail.gmail.com	2018-04-07 17:54:39 -03:00
Andres Freund	f16241bef7	Raise error when affecting tuple moved into different partition. When an update moves a row between partitions (supported since `2f17844104`), our normal logic for following update chains in READ COMMITTED mode doesn't work anymore. Cross partition updates are modeled as a delete from the old and insert into the new partition. No ctid chain exists across partitions, and there's no convenient space to introduce that link. Not throwing an error in a partitioned context when one would have been thrown without partitioning is obviously problematic. This commit introduces infrastructure to detect when a tuple has been moved, not just plainly deleted. That allows to throw an error when encountering a deletion that's actually a move, while attempting to follow a ctid chain. The row deleted as part of a cross partition update is marked by pointing its t_ctid to an invalid block, instead of self as a normal update would. That was deemed to be the least invasive and most future proof way to represent the knowledge, given how few infomask bits are there to be recycled (there's also some locking issues with using infomask bits). External code following ctid chains should be updated to check for moved tuples. The most likely consequence of not doing so is a missed error. Author: Amul Sul, editorialized by me Reviewed-By: Amit Kapila, Pavan Deolasee, Andres Freund, Robert Haas Discussion: http://postgr.es/m/CAAJ_b95PkwojoYfz0bzXU8OokcTVGzN6vYGCNVUukeUDrnF3dw@mail.gmail.com	2018-04-07 13:24:27 -07:00
Teodor Sigaev	8224de4f42	Indexes with INCLUDE columns and their support in B-tree This patch introduces INCLUDE clause to index definition. This clause specifies a list of columns which will be included as a non-key part in the index. The INCLUDE columns exist solely to allow more queries to benefit from index-only scans. Also, such columns don't need to have appropriate operator classes. Expressions are not supported as INCLUDE columns since they cannot be used in index-only scans. Index access methods supporting INCLUDE are indicated by amcaninclude flag in IndexAmRoutine. For now, only B-tree indexes support INCLUDE clause. In B-tree indexes INCLUDE columns are truncated from pivot index tuples (tuples located in non-leaf pages and high keys). Therefore, B-tree indexes now might have variable number of attributes. This patch also provides generic facility to support that: pivot tuples contain number of their attributes in t_tid.ip_posid. Free 13th bit of t_info is used for indicating that. This facility will simplify further support of index suffix truncation. The changes of above are backward-compatible, pg_upgrade doesn't need special handling of B-tree indexes for that. Bump catalog version Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes, David Rowley, Alexander Korotkov Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru	2018-04-07 23:00:39 +03:00
Teodor Sigaev	1c1791e000	Add json(b)_to_tsvector function Jsonb has a complex nature so there isn't best-for-everything way to convert it to tsvector for full text search. Current to_tsvector(json(b)) suggests to convert only string values, but it's possible to index keys, numerics and even booleans value. To solve that json(b)_to_tsvector has a second required argument contained a list of desired types of json fields. Second argument is a jsonb scalar or array right now with possibility to add new options in a future. Bump catalog version Author: Dmitry Dolgov with some editorization by me Reviewed by: Teodor Sigaev Discussion: https://www.postgresql.org/message-id/CA+q6zcXJQbS1b4kJ_HeAOoOc=unfnOrUEL=KGgE32QKDww7d8g@mail.gmail.com	2018-04-07 20:58:03 +03:00
Peter Eisentraut	039eb6e92f	Logical replication support for TRUNCATE Update the built-in logical replication system to make use of the previously added logical decoding for TRUNCATE support. Add the required truncate callback to pgoutput and a new logical replication protocol message. Publications get a new attribute to determine whether to replicate truncate actions. When updating a publication via pg_dump from an older version, this is not set, thus preserving the previous behavior. Author: Simon Riggs <simon@2ndquadrant.com> Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it> Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2018-04-07 11:34:11 -04:00
Peter Eisentraut	5dfd1e5a66	Logical decoding of TRUNCATE Add a new WAL record type for TRUNCATE, which is only used when wal_level >= logical. (For physical replication, TRUNCATE is already replicated via SMGR records.) Add new callback for logical decoding output plugins to receive TRUNCATE actions. Author: Simon Riggs <simon@2ndquadrant.com> Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it> Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2018-04-07 11:34:10 -04:00
Teodor Sigaev	b508a56f2f	Predicate locking in hash indexes. Hash index searches acquire predicate locks on the primary page of a bucket. It acquires a lock on both the old and new buckets for scans that happen concurrently with page splits. During a bucket split, a predicate lock is copied from the primary page of an old bucket to the primary page of a new bucket. Author: Shubham Barai, Amit Kapila Reviewed by: Amit Kapila, Alexander Korotkov, Thomas Munro Discussion: https://www.postgresql.org/message-id/flat/CALxAEPvNsM2GTiXdRgaaZ1Pjd1bs+sxfFsf7Ytr+iq+5JJoYXA@mail.gmail.com	2018-04-07 16:59:14 +03:00
Alvaro Herrera	971d7ddbe1	Document partprune.c a little better Author: Amit Langote Reviewed-by: Álvaro Herrera, David Rowley Discussion: https://postgr.es/m/CA+HiwqGzq4D6z=8R0AP+XhbTFCQ-4Ct+t2ekqjE9Fpm84_JUGg@mail.gmail.com	2018-04-07 10:35:38 -03:00
Andres Freund	8c3debbbf6	Fix and improve pg_atomic_flag fallback implementation. The atomics fallback implementation for pg_atomic_flag was broken, returning the inverted value from pg_atomic_test_set_flag(). This was unnoticed because a) atomic flags were unused until recently b) the test code wasn't run when the fallback implementation was in use (because it didn't allow to test for some edge cases). Fix the bug, and improve the fallback so it has the same behaviour as the non-fallback implementation in the problematic edge cases. That breaks ABI compatibility in the back branches when fallbacks are in use, but given they were broken until now... Author: Andres Freund Reported-by: Daniel Gustafsson Discussion: https://postgr.es/m/FB948276-7B32-4B77-83E6-D00167F8EEB4@yesql.se https://postgr.es/m/20180406233854.uni2h3mbnveczl32@alap3.anarazel.de Backpatch: 9.5-, where the atomics abstraction was introduced.	2018-04-06 19:55:32 -07:00
Robert Haas	47cb9ca49a	Fix possible failure in parallel index build. Report and proposed fix by David Rowley, put in patch form by Peter Geoghegan. Discussion: http://postgr.es/m/CAKJS1f91kq1wfYR8rnRRfKtxyhU2woEA+=whd640UxMyU+O0EQ@mail.gmail.com	2018-04-06 19:28:48 -04:00
Robert Haas	3d956d9562	Allow insert and update tuple routing and COPY for foreign tables. Also enable this for postgres_fdw. Etsuro Fujita, based on an earlier patch by Amit Langote. The larger patch series of which this is a part has been reviewed by Amit Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost, and me. Minor documentation changes to the final version by me. Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@lab.ntt.co.jp	2018-04-06 19:22:03 -04:00
Alvaro Herrera	9fdb675fc5	Faster partition pruning Add a new module backend/partitioning/partprune.c, implementing a more sophisticated algorithm for partition pruning. The new module uses each partition's "boundinfo" for pruning instead of constraint exclusion, based on an idea proposed by Robert Haas of a "pruning program": a list of steps generated from the query quals which are run iteratively to obtain a list of partitions that must be scanned in order to satisfy those quals. At present, this targets planner-time partition pruning, but there exist further patches to apply partition pruning at execution time as well. This commit also moves some definitions from include/catalog/partition.h to a new file include/partitioning/partbounds.h, in an attempt to rationalize partitioning related code. Authors: Amit Langote, David Rowley, Dilip Kumar Reviewers: Robert Haas, Kyotaro Horiguchi, Ashutosh Bapat, Jesper Pedersen. Discussion: https://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp	2018-04-06 16:44:05 -03:00
Stephen Frost	11523e860f	Support new default roles with adminpack This provides a newer version of adminpack which works with the newly added default roles to support GRANT'ing to non-superusers access to read and write files, along with related functions (unlinking files, getting file length, renaming/removing files, scanning the log file directory) which are supported through adminpack. Note that new versions of the functions are required because an environment might have an updated version of the library but still have the old adminpack 1.0 catalog definitions (where EXECUTE is GRANT'd to PUBLIC for the functions). This patch also removes the long-deprecated alternative names for functions that adminpack used to include and which are now included in the backend, in adminpack v1.1. Applications using the deprecated names should be updated to use the backend functions instead. Existing installations which continue to use adminpack v1.0 should continue to function until/unless adminpack is upgraded. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net	2018-04-06 14:47:10 -04:00
Stephen Frost	0fdc8495bf	Add default roles for file/program access This patch adds new default roles named 'pg_read_server_files', 'pg_write_server_files', 'pg_execute_server_program' which allow an administrator to GRANT to a non-superuser role the ability to access server-side files or run programs through PostgreSQL (as the user the database is running as). Having one of these roles allows a non-superuser to use server-side COPY to read, write, or with a program, and to use file_fdw (if installed by a superuser and GRANT'd USAGE on it) to read from files or run a program. The existing misc file functions are also changed to allow a user with the 'pg_read_server_files' default role to read any files on the filesystem, matching the privileges given to that role through COPY and file_fdw from above. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net	2018-04-06 14:47:10 -04:00
Stephen Frost	e79350fef2	Remove explicit superuser checks in favor of ACLs This removes the explicit superuser checks in the various file-access functions in the backend, specifically pg_ls_dir(), pg_read_file(), pg_read_binary_file(), and pg_stat_file(). Instead, EXECUTE is REVOKE'd from public for these, meaning that only a superuser is able to run them by default, but access to them can be GRANT'd to other roles. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net	2018-04-06 14:47:10 -04:00
Peter Eisentraut	94c1f9ba11	Add memory context identifier to portal context Discussion: https://www.postgresql.org/message-id/6421.1522194949@sss.pgh.pa.us	2018-04-06 12:37:54 -04:00
Peter Eisentraut	bbca77623f	Rename MemoryContextCopySetIdentifier() for clarity MemoryContextCopySetIdentifier -> MemoryContextCopyAndSetIdentifier Discussion: https://www.postgresql.org/message-id/6421.1522194949@sss.pgh.pa.us	2018-04-06 12:37:54 -04:00
Robert Haas	cfbecf8100	Enforce child constraints during COPY TO a partitioned table. The previous coding inadvertently checked the constraints for the partitioned table rather than the target partition, which could lead to data in a partition that fails to satisfy some constraint on that partition. This problem seems to date back to when table partitioning was introduced; prior to that, there was only one target table for a COPY, so the problem didn't occur, and the code just didn't get updated. Etsuro Fujita, reviewed by Amit Langote and Ashutosh Bapat Discussion: https://postgr.es/message-id/5ABA4074.1090500%40lab.ntt.co.jp	2018-04-06 11:42:28 -04:00
Peter Eisentraut	bcf79b5bb6	Split the SetSubscriptionRelState function into two We don't actually need the insert-or-update logic, so it's clearer to have separate functions for the inserting and updating. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>	2018-04-06 10:00:26 -04:00
Peter Eisentraut	c25304a945	Improve messaging during logical replication worker startup In case the subscription is removed before the worker is fully started, give a specific error message instead of the generic "cache lookup" error. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>	2018-04-06 09:07:09 -04:00
Simon Riggs	f1464c5380	Improve parse representation for MERGE Separation of parser data structures from executor, as requested by Tom Lane. Further improvements possible. While there, implement error for multiple VALUES clauses via parser to allow line number of error, as requested by Andres Freund. Author: Pavan Deolasee Discussion: https://www.postgresql.org/message-id/CABOikdPpqjectFchg0FyTOpsGXyPoqwgC==OLKWuxgBOsrDDZw@mail.gmail.com	2018-04-06 09:38:59 +01:00
Magnus Hagander	1fde38beaa	Allow on-line enabling and disabling of data checksums This makes it possible to turn checksums on in a live cluster, without the previous need for dump/reload or logical replication (and to turn it off). Enabling checksums starts a background process in the form of a launcher/worker combination that goes through the entire database and recalculates checksums on each and every page. Only when all pages have been checksummed are they fully enabled in the cluster. Any failure of the process will revert to checksums off and the process has to be started. This adds a new WAL record that indicates the state of checksums, so the process works across replicated clusters. Authors: Magnus Hagander and Daniel Gustafsson Review: Tomas Vondra, Michael Banck, Heikki Linnakangas, Andrey Borodin	2018-04-05 22:04:48 +02:00
Simon Riggs	530e69e59b	Allow cpluspluscheck to pass by renaming variable Use of a C++ keyword as a function name caused problems Reported-by: Álvaro Herrera	2018-04-05 20:06:02 +01:00
Magnus Hagander	eed1ce72e1	Allow background workers to bypass datallowconn This adds a "flags" field to BackgroundWorkerInitializeConnection() and BackgroundWorkerInitializeConnectionByOid(). For now only one flag, BGWORKER_BYPASS_ALLOWCONN, is defined, which allows the worker to ignore datallowconn.	2018-04-05 19:02:45 +02:00
Teodor Sigaev	1664ae1978	Add websearch_to_tsquery Error-tolerant conversion function with web-like syntax for search query, it simplifies constraining search engine with close to habitual interface for users. Bump catalog version Authors: Victor Drobny, Dmitry Ivanov with editorization by me Reviewed by: Aleksander Alekseev, Tomas Vondra, Thomas Munro, Aleksandr Parfenov Discussion: https://www.postgresql.org/message-id/flat/fe931111ff7e9ad79196486ada79e268@postgrespro.ru	2018-04-05 19:55:11 +03:00
Teodor Sigaev	0a64b45152	Fix handling of non-upgraded B-tree metapages `857f9c36` bumps B-tree metapage version while upgrade is performed "on the fly" when needed. However, some asserts fired when old version metapage was cached to rel->rd_amcache. Despite new metadata fields are never used from rel->rd_amcache, that needs to be fixed. This patch introduces metadata upgrade during its caching, which fills unavailable fields with their default values. contrib/pageinspect is also patched to handle non-upgraded metapages in the same way. Author: Alexander Korotkov	2018-04-05 17:56:00 +03:00
Simon Riggs	01b88b4df5	MERGE minor errata	2018-04-05 13:19:13 +01:00
Simon Riggs	3af7b2b0d4	MERGE fix variable warning in non-assert builds Author: Jesper Pedersen	2018-04-05 13:02:29 +01:00
Teodor Sigaev	17d8beb4f5	Remove unused vars and mark assert-only vars Kyotaro HORIGUCHI	2018-04-05 13:16:15 +03:00
Teodor Sigaev	51e6562324	Fix typo Masahiko Sawada	2018-04-05 13:04:18 +03:00
Simon Riggs	4b2d44031f	MERGE post-commit review Review comments from Andres Freund * Consolidate code into AfterTriggerGetTransitionTable() * Rename nodeMerge.c to execMerge.c * Rename nodeMerge.h to execMerge.h * Move MERGE handling in ExecInitModifyTable() into a execMerge.c ExecInitMerge() * Move mt_merge_subcommands flags into execMerge.h * Rename opt_and_condition to opt_merge_when_and_condition * Wordsmith various comments Author: Pavan Deolasee Reviewer: Simon Riggs	2018-04-05 09:54:07 +01:00
Andrew Gierth	1fd8690668	Install errcodes.txt for use by extensions. Maintainers of out-of-tree PLs typically need access to the set of error codes. To avoid the need to duplicate that information in some form in PL source trees, provide errcodes.txt as part of a server installation. Thomas Munro, based on a suggestion from Andrew Gierth Discussion: https://postgr.es/m/87woykk7mu.fsf%40news-spur.riddles.org.uk	2018-04-05 04:05:40 +01:00
Alvaro Herrera	7d7c99790b	Restore erroneously removed ONLY from PK check This is a blind fix, since I don't have SE-Linux to verify it. Per unwanted change in rhinoceros, running sepgsql tests. Noted by Tom Lane. Discussion: https://postgr.es/m/32347.1522865050@sss.pgh.pa.us	2018-04-04 16:38:11 -03:00
Tom Lane	1383e2a1a9	Improve FSM management for BRIN indexes. BRIN indexes like to propagate additions of free space into the upper pages of their free space maps as soon as the new space is known, even when it's just on one individual index page. Previously this required calling FreeSpaceMapVacuum, which is quite an expensive thing if the map is large. Use the FreeSpaceMapVacuumRange function recently added by commit `c79f6df75` to reduce the amount of work done for this purpose. Fix a couple of places that neglected to do the upper-page vacuuming at all after recording new free space. If the policy is to be that BRIN should do that, it should do it everywhere. Do RecordPageWithFreeSpace unconditionally in brin_page_cleanup, and do FreeSpaceMapVacuum unconditionally in brin_vacuum_scan. Because of the FSM's imprecise storage of free space, the old complications here seldom bought anything, they just slowed things down. This approach also provides a predictable path for FSM corruption to be repaired. Remove premature RecordPageWithFreeSpace call in brin_getinsertbuffer where it's about to return an extended page to the caller. The caller should do that, instead, after it's inserted its new tuple. Fix the one caller that forgot to do so. Simplify logic in brin_doupdate's same-page-update case by postponing brin_initialize_empty_new_buffer to after the critical section; I see little point in doing it before. Avoid repeat calls of RelationGetNumberOfBlocks in brin_vacuum_scan. Avoid duplicate BufferGetBlockNumber and BufferGetPage calls in a couple of places where we already had the right values. Move a BRIN_elog debug logging call out of a critical section; that's pretty unsafe and I don't think it buys us anything to not wait till after the critical section. Move the "extended = false" step in brin_getinsertbuffer into the routine's main loop. 
There's no actual bug there, since the loop can't iterate with extended still true, but it doesn't seem very future-proof as coded; and it's certainly not documented as a loop invariant. This is all from follow-on investigation inspired by commit `c79f6df75`. Discussion: https://postgr.es/m/5801.1522429460@sss.pgh.pa.us	2018-04-04 14:26:04 -04:00
Alvaro Herrera	3de241dba8	Foreign keys on partitioned tables Author: Álvaro Herrera Discussion: https://postgr.es/m/20171231194359.cvojcour423ulha4@alvherre.pgsql Reviewed-by: Peter Eisentraut	2018-04-04 14:02:49 -03:00
Teodor Sigaev	857f9c36cd	Skip full index scan during cleanup of B-tree indexes when possible Vacuum of index consists of two stages: multiple (zero or more) ambulkdelete calls and one amvacuumcleanup call. When workload on particular table is append-only, then autovacuum isn't intended to touch this table. However, user may run vacuum manually in order to fill visibility map and get benefits of index-only scans. Then ambulkdelete wouldn't be called for indexes of such table (because no heap tuples were deleted), only amvacuumcleanup would be called. In this case, amvacuumcleanup would perform full index scan for two objectives: put recyclable pages into free space map and update index statistics. This patch allows btvacuumcleanup to skip full index scan when two conditions are satisfied: no pages are going to be put into free space map and index statistics isn't stalled. In order to check first condition, we store oldest btpo_xact in the meta-page. When it precedes RecentGlobalXmin, then there are some recyclable pages. In order to check second condition we store number of heap tuples observed during previous full index scan by cleanup. If fraction of newly inserted tuples is less than vacuum_cleanup_index_scale_factor, then statistics isn't considered to be stalled. vacuum_cleanup_index_scale_factor can be defined as both reloption and GUC (default). This patch bumps B-tree meta-page version. Upgrade of meta-page is performed "on the fly": during VACUUM meta-page is rewritten with new version. No special handling in pg_upgrade is required. Author: Masahiko Sawada, Alexander Korotkov Review by: Peter Geoghegan, Kyotaro Horiguchi, Alexander Korotkov, Yura Sokolov Discussion: https://www.postgresql.org/message-id/flat/CAD21AoAX+d2oD_nrd9O2YkpzHaFr=uQeGr9s1rKC3O4ENc568g@mail.gmail.com	2018-04-04 19:29:00 +03:00
Alvaro Herrera	851f4b4e14	Don't clone internal triggers to partitions Trigger cloning to partitions was supposed to occur for user-visible triggers only, but during development the protection that prevented it from occurring to internal triggers was lost. Reinstate it, as well as add a test case to ensure internal triggers (in the tested case, triggers implementing a deferred unique constraint) are not cloned. Without the code fix, the partitions in the test end up with different numbers of triggers, which is clearly wrong ... Bug in `86f575948c`. Discussion: https://postgr.es/m/20180403214903.ozfagwjcpk337uw7@alvherre.pgsql	2018-04-03 19:08:25 -03:00
Alvaro Herrera	cd5005bc12	Pass correct TupDesc to ri_NullCheck() in Assert Previous coding was passing the wrong table's tuple descriptor, which accidentally fails to fail because no existing test case exercises a foreign key in which the referenced attributes are further to the right of the referencing attributes. Add a test so that further breakage is visible. This got broken in `16828d5c02`. Discussion: https://postgr.es/m/20180403204723.fqte755nukgm42uf@alvherre.pgsql	2018-04-03 18:04:50 -03:00
Tom Lane	dddfc4cb2e	Prevent accidental linking of system-supplied copies of libpq.so etc. We were being careless in some places about the order of -L switches in link command lines, such that -L switches referring to external directories could come before those referring to directories within the build tree. This made it possible to accidentally link a system-supplied library, for example /usr/lib/libpq.so, in place of the one built in the build tree. Hilarity ensued, the more so the older the system-supplied library is. To fix, break LDFLAGS into two parts, a sub-variable LDFLAGS_INTERNAL and the main LDFLAGS variable, both of which are "recursively expanded" so that they can be incrementally adjusted by different makefiles. Establish a policy that -L switches for directories in the build tree must always be added to LDFLAGS_INTERNAL, while -L switches for external directories must always be added to LDFLAGS. This is sufficient to ensure a safe search order. For simplicity, we typically also put -l switches for the respective libraries into those same variables. (Traditional make usage would have us put -l switches into LIBS, but cleaning that up is a project for another day, as there's no clear need for it.) This turns out to also require separating SHLIB_LINK into two variables, SHLIB_LINK and SHLIB_LINK_INTERNAL, with a similar rule about which switches go into which variable. And likewise for PG_LIBS. Although this change might appear to affect external users of pgxs.mk, I think it doesn't; they shouldn't have any need to touch the _INTERNAL variables. In passing, tweak src/common/Makefile so that the value of CPPFLAGS recorded in pg_config lacks "-DFRONTEND" and the recorded value of LDFLAGS lacks "-L../../../src/common". Both of those things are mistakes, apparently introduced during prior code rearrangements, as old versions of pg_config don't print them. 
In general we don't want anything that's specific to the src/common subdirectory to appear in those outputs. This is certainly a bug fix, but in view of the lack of field complaints, I'm unsure whether it's worth the risk of back-patching. In any case it seems wise to see what the buildfarm makes of it first. Discussion: https://postgr.es/m/25214.1522604295@sss.pgh.pa.us	2018-04-03 16:26:05 -04:00
Bruce Momjian	242408dbef	C comment: mention null handling in BuildTupleFromCStrings() Discussion: https://postgr.es/m/CAFjFpRcF-wNbe0w-m3NpkEwr9shmOZ=GoESOzd2Wog9h55J8sA@mail.gmail.com Author: Ashutosh Bapat	2018-04-03 14:01:14 -04:00
Teodor Sigaev	710d90da1f	Add prefix operator for TEXT type. The prefix operator along with SP-GiST indexes can be used as an alternative for LIKE 'word%' commands and it doesn't have a limitation of string/prefix length as B-Tree has. Bump catalog version Author: Ildus Kurbangaliev with some editorization by me Review by: Arthur Zakirov, Alexander Korotkov, and me Discussion: https://www.postgresql.org/message-id/flat/20180202180327.222b04b3@wp.localdomain	2018-04-03 19:46:45 +03:00
Magnus Hagander	10d62d1065	Properly use INT64_FORMAT in output Per buildfarm animal prairiedog, suggestion solution from Tom.	2018-04-03 16:39:29 +02:00
Magnus Hagander	a08dc71195	Fix for checksum validation patch Reorder the check for non-BLCKSZ size reads to make sure we don't abort sending the file in this case. Missed in the previous commit.	2018-04-03 13:57:49 +02:00
Magnus Hagander	4eb77d50c2	Validate page level checksums in base backups When base backups are run over the replication protocol (for example using pg_basebackup), verify the checksums of all data blocks if checksums are enabled. If checksum failures are encountered, log them as warnings but don't abort the backup. This becomes the default behaviour in pg_basebackup (provided checksums are enabled on the server), so add a switch (-k) to disable the checks if necessary. Author: Michael Banck Reviewed-By: Magnus Hagander, David Steele Discussion: https://postgr.es/m/20180228180856.GE13784@nighthawk.caipicrew.dd-dns.de	2018-04-03 13:47:16 +02:00
Simon Riggs	aa3faa3c7a	WITH support in MERGE Author: Peter Geoghegan Recursive support removed, no tests Docs added by me	2018-04-03 12:13:59 +01:00
Simon Riggs	83454e3c2b	New files for MERGE	2018-04-03 10:22:21 +01:00
Simon Riggs	d204ef6377	MERGE SQL Command following SQL:2016 MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows, a task that would otherwise require multiple PL statements. e.g. MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular and partitioned tables, including column and row security enforcement, as well as support for row, statement and transition triggers. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used statically from PL/pgSQL. MERGE does not yet support inheritance, write rules, RETURNING clauses, updatable views or foreign tables. MERGE follows SQL Standard per the most recent SQL:2016. Includes full tests and documentation, including full isolation tests to demonstrate the concurrent behavior. This version written from scratch in 2017 by Simon Riggs, using docs and tests originally written in 2009. Later work from Pavan Deolasee has been both complex and deep, leaving the lead author credit now in his hands. Extensive discussion of concurrency from Peter Geoghegan, with thanks for the time and effort contributed. Various issues reported via sqlsmith by Andreas Seltenreich Authors: Pavan Deolasee, Simon Riggs Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com	2018-04-03 09:28:16 +01:00
Simon Riggs	aa5877bb26	Revert "MERGE SQL Command following SQL:2016" This reverts commit `e6597dc353`.	2018-04-02 21:36:38 +01:00
Simon Riggs	7cf8a5c302	Revert "Modified files for MERGE" This reverts commit `354f13855e`.	2018-04-02 21:34:15 +01:00
Simon Riggs	354f13855e	Modified files for MERGE	2018-04-02 21:12:47 +01:00
Simon Riggs	e6597dc353	MERGE SQL Command following SQL:2016 MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows, a task that would otherwise require multiple PL statements. e.g. MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular and partitioned tables, including column and row security enforcement, as well as support for row, statement and transition triggers. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used statically from PL/pgSQL. MERGE does not yet support inheritance, write rules, RETURNING clauses, updatable views or foreign tables. MERGE follows SQL Standard per the most recent SQL:2016. Includes full tests and documentation, including full isolation tests to demonstrate the concurrent behavior. This version written from scratch in 2017 by Simon Riggs, using docs and tests originally written in 2009. Later work from Pavan Deolasee has been both complex and deep, leaving the lead author credit now in his hands. Extensive discussion of concurrency from Peter Geoghegan, with thanks for the time and effort contributed. Various issues reported via sqlsmith by Andreas Seltenreich Authors: Pavan Deolasee, Simon Riggs Reviewers: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com	2018-04-02 21:04:35 +01:00
Tom Lane	b01f32c313	Fix some dubious WAL-parsing code. Coverity complained about possible buffer overrun in two places added by commit `1eb6d6527`, and AFAICS it's reasonable to worry: even granting that the WAL originator properly truncated the commit GID to GIDSIZE, we should not really bet our lives on that having the same value as it does in the current build. Hence, use strlcpy() not strcpy(), and adjust the pointer advancement logic to be sure we skip over the whole source string even if strlcpy() truncated it.	2018-04-02 13:46:21 -04:00
Peter Eisentraut	2764d5dcfa	Make be-secure-common.c more consistent for future SSL implementations Recent commit `8a3d9425` has introduced be-secure-common.c, which is aimed at including backend-side APIs that can be used by any SSL implementation. The purpose is similar to fe-secure-common.c for the frontend-side APIs. However, this has forgotten to include check_ssl_key_file_permissions() in the move, which causes a double dependency between be-secure.c and be-secure-openssl.c. Refactor the code in a more logical way. This also puts into light an API which is usable by future SSL implementations for permissions on SSL key files. Author: Michael Paquier <michael@paquier.xyz>	2018-04-02 11:37:40 -04:00
Robert Haas	7e0d64c7a5	postgres_fdw: Push down partition-wise aggregation. Since commit `7012b132d0`, postgres_fdw has been able to push down the toplevel aggregation operation to the remote server. Commit `e2f1eb0ee3` made it possible to break down the toplevel aggregation into one aggregate per partition. This commit lets postgres_fdw push down aggregation in that case just as it does at the top level. In order to make this work, this commit adds an additional argument to the GetForeignUpperPaths FDW API. A matching argument is added to the signature for create_upper_paths_hook. Third-party code using either of these will need to be updated. Also adjust create_foreignscan_plan() so that it picks up the correct set of relids in this case. Jeevan Chalke, reviewed by Ashutosh Bapat and by me and with some adjustments by me. The larger patch series of which this patch is a part was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, and Rafia Sabih. Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com Discussion: http://postgr.es/m/CAM2+6=XPWujjmj5zUaBTGDoB38CemwcPmjkRy0qOcsQj_V+2sQ@mail.gmail.com	2018-04-02 10:51:50 -04:00
Tom Lane	0b11a674fb	Fix a boatload of typos in C comments. Justin Pryzby Discussion: https://postgr.es/m/20180331105640.GK28454@telsasoft.com	2018-04-01 15:01:28 -04:00
Andres Freund	686d399f2b	Fix non-portable use of round(). round() is from C99. Use rint() instead. There are behavioral differences between round() and rint(), but they should not matter to the Bloom filter optimal_k() function. We already assume POSIX behavior for rint(), so there is no question of rint() not using "rounds towards nearest" as its rounding mode. Cleanup from commit `51bc271790`. Per buildfarm member thrips. Author: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wzn76eCGUonARy-wrVtMHsf+4cvbK_oJAWTLfORTU5ki0w@mail.gmail.com	2018-03-31 20:26:47 -07:00
Andres Freund	51bc271790	Add Bloom filter implementation. A Bloom filter is a space-efficient, probabilistic data structure that can be used to test set membership. Callers will sometimes incur false positives, but never false negatives. The rate of false positives is a function of the total number of elements and the amount of memory available for the Bloom filter. Two classic applications of Bloom filters are cache filtering, and data synchronization testing. Any user of Bloom filters must accept the possibility of false positives as a cost worth paying for the benefit in space efficiency. This commit adds a test harness extension module, test_bloomfilter. It can be used to get a sense of how the Bloom filter implementation performs under varying conditions. This is infrastructure for the upcoming "heapallindexed" amcheck patch, which verifies the consistency of a heap relation against one of its indexes. Author: Peter Geoghegan Reviewed-By: Andrey Borodin, Michael Paquier, Thomas Munro, Andres Freund Discussion: https://postgr.es/m/CAH2-Wzm5VmG7cu1N-H=nnS57wZThoSDQU+F5dewx3o84M+jY=g@mail.gmail.com	2018-03-31 17:49:41 -07:00
Andrew Dunstan	ed69864350	Small cleanups in fast default code. Problems identified by Andres Freund and Haribabu Kommi	2018-04-01 08:16:18 +09:30
Andres Freund	a4ebbd2752	Remove PARTIAL_LINKING build mode. In `9956ddc191`, ten years ago, the current objfile.txt based linking model was introduced. It's time to retire the old SUBSYS.o based model. This primarily is pertinent because the bitcode files for LLVM based inlining are not produced when using PARTIAL_LINKING. It does not seem worth to fix PARTIAL_LINKING to support that. Author: Andres Freund Discussion: https://postgr.es/m/20180121204356.d5oeu34jetqhmdv2@alap3.anarazel.de	2018-03-30 17:33:04 -07:00
Tatsuo Ishii	1b26bd4089	Fix bug with view locking code. LockViewRecurse() obtains view relation using heap_open() and passes it to get_view_query() to get view info. It immediately closes the relation then uses the returned view info by calling LockViewRecurse_walker(). Since get_view_query() returns a pointer within the relcache, the relcache should be kept until LockViewRecurse_walker() returns. Otherwise the relation could point to a garbage memory area. Fix is moving the heap_close() call after LockViewRecurse_walker(). Problem reported by Tom Lane (buildfarm is unhappy, especially prion since it enables -DRELCACHE_FORCE_RELEASE cpp flag), fix by me.	2018-03-31 09:26:43 +09:00
Andres Freund	3e256e5506	Add SKIP_LOCKED option to RangeVarGetRelidExtended(). This will be used for VACUUM (SKIP LOCKED). Author: Nathan Bossart Reviewed-By: Michael Paquier and Andres Freund Discussion: https://postgr.es/m/20180306005349.b65whmvj7z6hbe2y@alap3.anarazel.de	2018-03-30 17:05:16 -07:00
Andres Freund	d87510a524	Combine options for RangeVarGetRelidExtended() into a flags argument. A followup patch will add a SKIP_LOCKED option. To avoid introducing ever more arguments, breaking existing callers each time, introduce a flags argument. This'll no doubt break a few external users... Also change the MISSING_OK behaviour so a DEBUG1 debug message is emitted when a relation is not found. Author: Nathan Bossart Reviewed-By: Michael Paquier and Andres Freund Discussion: https://postgr.es/m/20180306005349.b65whmvj7z6hbe2y@alap3.anarazel.de	2018-03-30 17:05:16 -07:00
Fujii Masao	9a895462d9	Enhance pg_stat_wal_receiver view to display host and port of sender server. Previously there was no way in the standby side to find out the host and port of the sender server that the walreceiver was currently connected to when multiple hosts and ports were specified in primary_conninfo. For that purpose, this patch adds sender_host and sender_port columns into pg_stat_wal_receiver view. They report the host and port that the active replication connection currently uses. Bump catalog version. Author: Haribabu Kommi Reviewed-by: Michael Paquier and me Discussion: https://postgr.es/m/CAJrrPGcV_aq8=cdqkFhVDJKEnDQ70yRTTdY9RODzMnXNrCz2Ow@mail.gmail.com	2018-03-31 07:51:22 +09:00
Tom Lane	4a33bb59df	Ensure that WAL pages skipped by a forced WAL switch are zero-filled. In the previous coding, skipped pages were mostly zeroes, but they still had valid WAL page headers. That makes them very much less compressible than an unbroken string of zeroes would be --- about 10X worse for bzip2 compression, for instance. We don't need those headers, so tweak the logic so that we zero them out. Chapman Flack, reviewed by Daniel Gustafsson Discussion: https://postgr.es/m/579297F8.7020107@anastigmatix.net	2018-03-30 16:18:18 -04:00
Tom Lane	e5eb4fa873	Remove obsolete SLRU wrapping and warnings from predicate.c. When SSI was developed, slru.c was limited to segment files with names in the range 0000-FFFF. This didn't allow enough space for predicate.c to store every possible XID when spilling old transactions to disk, so it would wrap around sooner and print warnings. Since commits `638cf09e` and `73c986ad` increased the number of segment files slru.c could manage, that behavior is unnecessary. Therefore remove that code. Also remove the macro OldSerXidSegment, which has been unused since `4cd3fb6e`. Thomas Munro, reviewed by Anastasia Lubennikova Discussion: https://postgr.es/m/CAEepm=3XfsTSxgEbEOmxu0QDiXy0o18NUg2nC89JZcCGE+XFPA@mail.gmail.com	2018-03-30 15:11:39 -04:00
Tom Lane	1bb9e731e1	Improve out-of-memory error reports by including memory context name. Add the target context's name to the errdetail field of "out of memory" errors in mcxt.c. Per discussion, this seems likely to be useful to help narrow down the cause of a reported failure, and it costs little. Also, now that context names are required to be compile-time constants in all cases, there's little reason to be concerned about security issues from exposing these names to users. (Because of such concerns, we are not including the context "ident" field.) In passing, add unlikely() markers to the allocation-failed tests, just to be sure the compiler is on the right page about that. Also, in palloc and friends, copy CurrentMemoryContext into a local variable, as that's almost surely cheaper to reference than a global. Discussion: https://postgr.es/m/1099.1522285628@sss.pgh.pa.us	2018-03-30 13:53:33 -04:00
Tom Lane	c79f6df75d	Do index FSM vacuuming sooner. In btree and SP-GiST indexes, move the responsibility for calling IndexFreeSpaceMapVacuum from the vacuumcleanup phase to the bulkdelete phase, and do it if and only if we found some pages that could be put into FSM. As in commit `851a26e26`, the idea is to make free pages visible to FSM searchers sooner when vacuuming very large tables (large enough to need multiple bulkdelete scans). This adds more redundant work than that commit did, since we have to scan the entire index FSM each time rather than being able to localize what needs to be updated; but it still seems worthwhile. However, we can buy something back by not touching the FSM at all when there are no pages that can be put in it. That will result in slower recovery from corrupt upper FSM pages in such a scenario, but it doesn't seem like that's a case we need to optimize for. Hash indexes don't use FSM at all. GIN, GiST, and bloom indexes update FSM during the vacuumcleanup phase not bulkdelete, so that doing something comparable to this would be a much more invasive change, and it's not clear it's worth it. BRIN indexes do things sufficiently differently that this change doesn't apply to them, either. Claudio Freire, reviewed by Masahiko Sawada and Jing Wang, some additional tweaks by me Discussion: https://postgr.es/m/CAGTBQpYR0uJCNTt3M5GOzBRHo+-GccNO1nCaQ8yEJmZKSW5q1A@mail.gmail.com	2018-03-30 11:48:20 -04:00
Robert Haas	96030f9a48	Don't call IS_DUMMY_REL() when cheapest_total_path might be junk. Unlike the previous coding, this might result in a Gather per Append subplan when the target list is parallel-restricted, but such a plan is probably worth considering in that case, since a single Gather on top of the entire Append is impossible. Per Andres Freund and the buildfarm. Discussion: http://postgr.es/m/20180330050351.bmxx4cdtz67czjda@alap3.anarazel.de	2018-03-30 11:40:41 -04:00
Teodor Sigaev	43d1ed60fd	Predicate locking in GIN index Predicate locks are used on a per-page basis only if fastupdate = off; otherwise a predicate lock on the pending list would effectively lock the whole index, so to reduce locking overhead we just lock the relation. Entry and posting trees are essentially B-trees, so locks are acquired on leaf pages only. Author: Shubham Barai with some editorization by me and Dmitry Ivanov Review by: Alexander Korotkov, Dmitry Ivanov, Fedor Sigaev Discussion: https://www.postgresql.org/message-id/flat/CALxAEPt5sWW+EwTaKUGFL5_XFcZ0MuGBcyJ70oqbWqr42YKR8Q@mail.gmail.com	2018-03-30 14:23:17 +03:00
Magnus Hagander	019fa576ca	Fix typo in comment Author: Michael Paquier <michael@paquier.xyz>	2018-03-30 12:35:13 +02:00
Tatsuo Ishii	34c20de4d0	Allow to lock views. Now all tables used in view definitions can be recursively locked by a LOCK command. Author: Yugo Nagata Reviewed by Robert Haas, Thomas Munro and me. Discussion: https://postgr.es/m/20171011183629.eb2817b3.nagata%40sraoss.co.jp	2018-03-30 09:18:02 +09:00
Andres Freund	fb60478011	Improve JIT docs. Author: John Naylor and Andres Freund Discussion: https://postgr.es/m/CAJVSVGUs-VcwSY7-Kx-GQe__8hvWuA4Uhyf3gxoMXeiZqebE9g@mail.gmail.com	2018-03-29 16:13:40 -07:00
Robert Haas	c1de1a3a8b	Remove 'target' from GroupPathExtraData. It's not needed. Jeevan Chalke Discussion: http://postgr.es/m/CAM2+6=XPWujjmj5zUaBTGDoB38CemwcPmjkRy0qOcsQj_V+2sQ@mail.gmail.com	2018-03-29 16:17:18 -04:00
Robert Haas	11cf92f6e2	Rewrite the code that applies scan/join targets to paths. If the toplevel scan/join target list is parallel-safe, postpone generating Gather (or Gather Merge) paths until after the toplevel has been adjusted to return it. This (correctly) makes queries with expensive functions in the target list more likely to choose a parallel plan, since the cost of the plan now reflects the fact that the evaluation will happen in the workers rather than the leader. The original complaint about this problem was from Jeff Janes. If the toplevel scan/join relation is partitioned, recursively apply the changes to all partitions. This sometimes allows us to get rid of Result nodes, because Append is not projection-capable but its children may be. It also cleans up what appears to be incorrect SRF handling from commit e2f1eb0ee30d144628ab523432320f174a2c8966: the old code had no knowledge of SRFs for child scan/join rels. Because we now use create_projection_path() in some cases where we formerly used apply_projection_to_path(), this changes the ordering of columns in some queries generated by postgres_fdw. Update regression outputs accordingly. Patch by me, reviewed by Amit Kapila and by Ashutosh Bapat. Other fixes for this problem (substantially different from this version) were reviewed by Dilip Kumar, Amit Khandekar, and Marina Polyakova. Discussion: http://postgr.es/m/CAMkU=1ycXNipvhWuweUVpKuyu6SpNjF=yHWu4c4US5JgVGxtZQ@mail.gmail.com	2018-03-29 15:49:31 -04:00
Robert Haas	3f90ec8597	Postpone generate_gather_paths for topmost scan/join rel. Don't call generate_gather_paths for the topmost scan/join relation when it is initially populated with paths. Instead, do the work in grouping_planner. By itself, this gains nothing; in fact it loses slightly because we end up calling set_cheapest() for the topmost scan/join rel twice rather than once. However, it paves the way for a future commit which will postpone generate_gather_paths for the topmost scan/join relation even further, allowing more accurate costing of parallel paths. Amit Kapila and Robert Haas. Earlier versions of this patch (which differed substantially) were reviewed by Dilip Kumar, Amit Khandekar, Marina Polyakova, and Ashutosh Bapat.	2018-03-29 15:40:40 -04:00
Robert Haas	d7c19e62a8	Teach create_projection_plan to omit projection where possible. We sometimes insert a ProjectionPath into a plan tree when projection is not strictly required. The existing code already arranges to avoid emitting a Result node when the ProjectionPath's subpath can perform the projection itself, but previously it didn't consider the possibility that the parent node might not actually require the projection to be performed at all. Skipping projection when it's not required can not only avoid Result nodes that aren't needed, but also avoid losing the "physical tlist" optimization unnecessarily. Patch by me, reviewed by Amit Kapila. Discussion: http://postgr.es/m/CA+TgmoakT5gmahbPWGqrR2nAdFOMAOnOXYoWHRdVfGWs34t6_A@mail.gmail.com	2018-03-29 15:37:48 -04:00
Bruce Momjian	20b4323bd1	C comments: "a" <--> "an" corrections Reported-by: Michael Paquier, Abhijit Menon-Sen Discussion: https://postgr.es/m/20180305045854.GB2266@paquier.xyz Author: Michael Paquier, Abhijit Menon-Sen, me	2018-03-29 15:18:53 -04:00
Bruce Momjian	3282c4c136	README change: update for hash access method Reported-by: Thomas Munro, Justin Pryzby Discussion: https://postgr.es/m/CAEepm=1_682z-09DNHj4GkCJAqWK-D6h9Oq5ea84T1oqq1-Utg@mail.gmail.com	2018-03-29 14:38:39 -04:00
Tom Lane	2b1759e267	Remove unnecessary BufferGetPage() calls in fsm_vacuum_page(). Just noticed that these were quite redundant, since we're holding the page address in a local variable anyway, and we have pin on the buffer throughout. Also improve a comment.	2018-03-29 12:44:19 -04:00
Tom Lane	a063baaced	Remove UpdateFreeSpaceMap(), use FreeSpaceMapVacuumRange() instead. FreeSpaceMapVacuumRange has the same effect, is more efficient if many pages are involved, and makes fewer assumptions about how it's used. Notably, Claudio Freire pointed out that UpdateFreeSpaceMap could fail if the specified freespace value isn't the maximum possible. This isn't a problem for the single existing user, but the function represents an attractive nuisance IMO, because it's named as though it were a general-purpose update function and its limitations are undocumented. In any case we don't need multiple ways to get the same result. In passing, do some code review and cleanup in RelationAddExtraBlocks. In particular, I see no excuse for it to omit the PageIsNew safety check that's done in the mainline extension path in RelationGetBufferForTuple. Discussion: https://postgr.es/m/CAGTBQpYR0uJCNTt3M5GOzBRHo+-GccNO1nCaQ8yEJmZKSW5q1A@mail.gmail.com	2018-03-29 12:22:44 -04:00
Bruce Momjian	bc0021ef09	C comment: fix wording about shared memory message queue Reported-by: Tels Discussion: https://postgr.es/m/e66e05bc55f5ce904e361ad17a3395ae.squirrel@sm.webmail.pair.com	2018-03-29 12:18:42 -04:00
Tom Lane	851a26e266	While vacuuming a large table, update upper-level FSM data every so often. VACUUM updates leaf-level FSM entries immediately after cleaning the corresponding heap blocks. fsmpage.c updates the intra-page search trees on the leaf-level FSM pages when this happens, but it does not touch the upper-level FSM pages, so that the released space might not actually be findable by searchers. Previously, updating the upper-level pages happened only at the conclusion of the VACUUM run, in a single FreeSpaceMapVacuum() call. This is bad because the VACUUM might get canceled before ever reaching that point, so that from the point of view of searchers no space has been freed at all, leading to table bloat. We can improve matters by updating the upper pages immediately after each cycle of index-cleaning and heap-cleaning, processing just the FSM pages corresponding to the range of heap blocks we have now fully cleaned. This adds a small amount of extra work, since the FSM pages leading down to each range boundary will be touched twice, but it's pretty negligible compared to everything else going on in a large VACUUM. If there are no indexes, VACUUM doesn't work in cycles but just cleans each heap page on first visit. In that case we just arbitrarily update upper FSM pages after each 8GB of heap. That maintains the goal of not letting all this work slide until the very end, and it doesn't seem worth expending extra complexity on a case that so seldom occurs in practice. In either case, the FSM is fully up to date before any attempt is made to truncate the relation, so that the most likely scenario for VACUUM cancellation no longer results in out-of-date upper FSM pages. When we do successfully truncate, adjusting the FSM to reflect that is now fully handled within FreeSpaceMapTruncateRel. Claudio Freire, reviewed by Masahiko Sawada and Jing Wang, some additional tweaks by me Discussion: https://postgr.es/m/CAGTBQpYR0uJCNTt3M5GOzBRHo+-GccNO1nCaQ8yEJmZKSW5q1A@mail.gmail.com	2018-03-29 11:29:54 -04:00
Teodor Sigaev	c0cbe00fee	Add casts from jsonb Add explicit cast from scalar jsonb to all numeric and bool types. It would be better to have cast from scalar jsonb to text too but there is already a cast from jsonb to text as just text representation of json. There is no way to have two different casts for the same type's pair. Bump catalog version Author: Anastasia Lubennikova with editorization by Nikita Glukhov and me Review by: Aleksander Alekseev, Nikita Glukhov, Darafei Praliaskouski Discussion: https://www.postgresql.org/message-id/flat/0154d35a-24ae-f063-5273-9ffcdf1c7f2e@postgrespro.ru	2018-03-29 16:33:56 +03:00
Magnus Hagander	669820a3d9	Fix typo in comment Arthur Zakirov, confirmed by Thomas Munro	2018-03-29 11:42:32 +02:00
Peter Eisentraut	056a5a3f63	Allow committing inside cursor loop Previously, committing or aborting inside a cursor loop was prohibited because that would close and remove the cursor. To allow that, automatically convert such cursors to holdable cursors so they survive commits or rollbacks. Portals now have a new state "auto-held", which means they have been converted automatically from pinned. An auto-held portal is kept on transaction commit or rollback, but is still removed when returning to the main loop on error. This supports all languages that have cursor loop constructs: PL/pgSQL, PL/Python, PL/Perl. Reviewed-by: Ildus Kurbangaliev <i.kurbangaliev@postgrespro.ru>	2018-03-28 19:03:26 -04:00
Bruce Momjian	a2894cce54	C comment: fix typo, log -> lag Reported-by: atorikoshi Discussion: https://postgr.es/m/b61f2ab9-c0e0-d33d-ce3f-42a228025681@lab.ntt.co.jp Author: atorikoshi	2018-03-28 18:23:47 -04:00
Andres Freund	a0a08c1d85	Fix mistakes in the just added JIT docs. Reported-By: Lukas Fittl Author: Andres Freund	2018-03-28 15:07:08 -07:00
Andres Freund	e6c039d13e	Add documentation for the JIT feature. As promised in earlier commits, this adds documentation about the new build options, the new GUCs, about the planner logic when JIT is used, and the benefits of JIT in general. Also adds a more implementation oriented README. I'm sure we're going to want to expand this further, but I think this is a reasonable start. Author: Andres Freund, with contributions by Thomas Munro Reviewed-By: Thomas Munro Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2018-03-28 14:22:42 -07:00
Andres Freund	1f0c6a9e7d	Add EXPLAIN support for JIT. This just shows a few details about JITing, e.g. how many functions have been JITed, and how long that took. To avoid noise in regression tests with functions sometimes being JITed in --with-llvm builds, disable display when COSTS OFF is specified. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2018-03-28 13:26:51 -07:00
Andres Freund	9370462e9a	Add inlining support to LLVM JIT provider. This provides infrastructure to allow JITed code to inline code implemented in C. This e.g. can be postgres internal functions or extension code. This already speeds up long running queries, by allowing the LLVM optimizer to optimize across function boundaries. The optimization potential currently doesn't reach its full potential because LLVM cannot optimize the FunctionCallInfoData argument fully away, because it's allocated on the heap rather than the stack. Fixing that is beyond what's realistic for v11. To be able to do that, use CLANG to convert C code to LLVM bitcode, and have LLVM build a summary for it. That bitcode can then be used to inline functions at runtime. For that the bitcode needs to be installed. Postgres bitcode goes into $pkglibdir/bitcode/postgres, extensions go into equivalent directories. PGXS has been modified so that happens automatically if postgres has been compiled with LLVM support. Currently this isn't the fastest inline implementation, modules are reloaded from disk during inlining. That's to work around an apparent LLVM bug, triggering an apparently spurious error in LLVM assertion enabled builds. Once that is resolved we can remove the superfluous read from disk. Docs will follow in a later commit containing docs for the whole JIT feature. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2018-03-28 13:19:08 -07:00
Fujii Masao	266b6acb31	Make pg_rewind skip files and directories that are removed during server start. The target cluster that was rewound needs to perform recovery from the checkpoint created at failover, which leads it to remove or recreate some files and directories that may have been copied from the source cluster. So pg_rewind can skip synchronizing such files and directories, and which reduces the amount of data transferred during a rewind without changing the usefulness of the operation. Author: Michael Paquier Reviewed-by: Anastasia Lubennikova, Stephen Frost and me Discussion: https://postgr.es/m/20180205071022.GA17337@paquier.xyz	2018-03-29 04:56:52 +09:00
Peter Eisentraut	d92bc83c48	PL/pgSQL: Nested CALL with transactions So far, a nested CALL or DO in PL/pgSQL would not establish a context where transaction control statements were allowed. This fixes that by handling CALL and DO specially in PL/pgSQL, passing the atomic/nonatomic execution context through and doing the required management around transaction boundaries. Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com>	2018-03-28 13:31:27 -04:00
Tom Lane	c2d4eb1b1f	Fix actual and potential double-frees around tuplesort usage. tuplesort_gettupleslot() passed back tuples allocated in the tuplesort's own memory context, even when the caller was responsible to free them. This created a double-free hazard, because some callers might destroy the tuplesort object (via tuplesort_end) before trying to clean up the last returned tuple. To avoid this, change the API to specify that the tuple is allocated in the caller's memory context. v10 and HEAD already did things that way, but in 9.5 and 9.6 this is a live bug that can demonstrably cause crashes with some grouping-set usages. In 9.5 and 9.6, this requires doing an extra tuple copy in some cases, which is unfortunate. But the amount of refactoring needed to avoid it seems excessive for a back-patched change, especially since the cases where an extra copy happens are less performance-critical. Likewise change tuplesort_getdatum() to return pass-by-reference Datums in the caller's context not the tuplesort's context. There seem to be no live bugs among its callers, but clearly the same sort of situation could happen in future. For other tuplesort fetch routines, continue to allocate the memory in the tuplesort's context. This is a little inconsistent with what we now do for tuplesort_gettupleslot() and tuplesort_getdatum(), but that's preferable to adding new copy overhead in the back branches where it's clearly unnecessary. These other fetch routines provide the weakest possible guarantees about tuple memory lifespan from v10 on, anyway, so this actually seems more consistent overall. Adjust relevant comments to reflect these API redefinitions. Arguably, we should change the pre-9.5 branches as well, but since there are no known failure cases there, it seems not worth the risk. Peter Geoghegan, per report from Bernd Helmle. Reviewed by Kyotaro Horiguchi; thanks also to Andreas Seltenreich for extracting a self-contained test case. Discussion: https://postgr.es/m/1512661638.9720.34.camel@oopsware.de	2018-03-28 13:26:57 -04:00
Simon Riggs	1eb6d6527a	Store 2PC GID in commit/abort WAL recs for logical decoding Store GID of 2PC in commit/abort WAL records when wal_level = logical. This allows logical decoding to send the SAME gid to subscribers across restarts of logical replication. Track replica origin replay progress for 2PC. (Edited from patch 0003 in the logical decoding 2PC series.) Authors: Nikhil Sontakke, Stas Kelvich Reviewed-by: Simon Riggs, Andres Freund	2018-03-28 17:42:50 +01:00
Andres Freund	f4f5845b31	Quick adaption of JIT tuple deforming to the fast default patch. Instead of using memset to set tts_isnull, call the new slot_getmissingattrs(). Also fix a bug (= instead of >=) in the code generation. Normally = is correct, but when repeatedly deforming fields not in a tuple (e.g. deform up to natts + 1 and then natts + 2) >= is needed. Discussion: https://postgr.es/m/20180328010053.i2qvsuuusst4lgmc@alap3.anarazel.de	2018-03-27 21:03:10 -07:00
Andrew Dunstan	16828d5c02	Fast ALTER TABLE ADD COLUMN with a non-NULL default Currently adding a column to a table with a non-NULL default results in a rewrite of the table. For large tables this can be both expensive and disruptive. This patch removes the need for the rewrite as long as the default value is not volatile. The default expression is evaluated at the time of the ALTER TABLE and the result stored in a new column (attmissingval) in pg_attribute, and a new column (atthasmissing) is set to true. Any existing row when fetched will be supplied with the attmissingval. New rows will have the supplied value or the default and so will never need the attmissingval. Any time the table is rewritten all the atthasmissing and attmissingval settings for the attributes are cleared, as they are no longer needed. The most visible code change from this is in heap_attisnull, which acquires a third TupleDesc argument, allowing it to detect a missing value if there is one. In many cases where it is known that there will not be any (e.g. catalog relations) NULL can be passed for this argument. Andrew Dunstan, heavily modified from an original patch from Serge Rielau. Reviewed by Tom Lane, Andres Freund, Tomas Vondra and David Rowley. Discussion: https://postgr.es/m/31e2e921-7002-4c27-59f5-51f08404c858@2ndQuadrant.com	2018-03-28 10:43:52 +10:30
Tom Lane	442accc3fe	Allow memory contexts to have both fixed and variable ident strings. Originally, we treated memory context names as potentially variable in all cases, and therefore always copied them into the context header. Commit `9fa6f00b1` rethought this a little bit and invented a distinction between fixed and variable names, skipping the copy step for the former. But we can make things both simpler and more useful by instead allowing there to be two parts to a context's identification, a fixed "name" and an optional, variable "ident". The name supplied in the context create call is now required to be a compile-time-constant string in all cases, as it is never copied but just pointed to. The "ident" string, if wanted, is supplied later. This is needed because typically we want the ident to be stored inside the context so that it's cleaned up automatically on context deletion; that means it has to be copied into the context before we can set the pointer. The cost of this approach is basically just an additional pointer field in struct MemoryContextData, which isn't much overhead, and is bought back entirely in the AllocSet case by not needing a headerSize field anymore, since we no longer have to cope with variable header length. In addition, we can simplify the internal interfaces for memory context creation still further, saving a few cycles there. And it's no longer true that a custom identifier disqualifies a context from participating in aset.c's freelist scheme, so possibly there's some win on that end. All the places that were using non-compile-time-constant context names are adjusted to put the variable info into the "ident" instead. This allows more effective identification of those contexts in many cases; for example, subsidiary contexts of relcache entries are now identified by both type (e.g. "index info") and relname, where before you got only one or the other. Contexts associated with PL function cache entries are now identified more fully and uniformly, too. I also arranged for plancache contexts to use the query source string as their identifier. This is basically free for CachedPlanSources, as they contained a copy of that string already. We pay an extra pstrdup to do it for CachedPlans. That could perhaps be avoided, but it would make things more fragile (since the CachedPlanSource is sometimes destroyed first). I suspect future improvements in error reporting will require CachedPlans to have a copy of that string anyway, so it's not clear that it's worth moving mountains to avoid it now. This also changes the APIs for context statistics routines so that the context-specific routines no longer assume that output goes straight to stderr, nor do they know all details of the output format. This is useful immediately to reduce code duplication, and it also allows for external code to do something with stats output that's different from printing to stderr. The reason for pushing this now rather than waiting for v12 is that it rethinks some of the API changes made by commit `9fa6f00b1`. Seems better for extension authors to endure just one round of API changes not two. Discussion: https://postgr.es/m/CAB=Je-FdtmFZ9y9REHD7VsSrnCkiBhsA4mdsLKSPauwXtQBeNA@mail.gmail.com	2018-03-27 16:46:51 -04:00
Simon Riggs	c203d6cf81	Allow HOT updates for some expression indexes If the value of an index expression is unchanged after UPDATE, allow HOT updates where previously we disallowed them, giving a significant performance boost in those cases. Particularly useful for indexes such as JSON->>field where the JSON value changes but the indexed value does not. Submitted as "surjective indexes" patch, now enabled by use of new "recheck_on_update" parameter. Author: Konstantin Knizhnik Reviewer: Simon Riggs, with much wordsmithing and some cleanup	2018-03-27 19:57:02 +01:00
Teodor Sigaev	920a5e500a	Skip temp tables from basebackup. Do not store temp tables in basebackup; they will not be visible anyway, so there is no reason to store them. Author: David Steel Reviewed by: me Discussion: https://www.postgresql.org/message-id/flat/5ea4d26a-a453-c1b7-eff9-5a3ef8f8aceb@pgmasters.net	2018-03-27 16:14:40 +03:00
Teodor Sigaev	3ad55863e9	Add predicate locking for GiST Add page-level predicate locking, due to gist's code organization, patch seems close to trivial: add check before page changing, add predicate lock before page scanning. Although choosing right place to check is not simple: it should not be called during index build, it should support insertion of new downlink and so on. Author: Shubham Barai with editorization by me and Alexander Korotkov Reviewed by: Alexander Korotkov, Andrey Borodin, me Discussion: https://www.postgresql.org/message-id/flat/CALxAEPtdcANpw5ePU3LvnTP8HCENFw6wygupQAyNBgD-sG3h0g@mail.gmail.com	2018-03-27 15:43:19 +03:00
Andres Freund	4b9094eb6e	Adapt to LLVM 7+ Orc API changes. This is mostly done to be able to validate features and fixes submitted to LLVM. Given the size of these changes that seems acceptable. Author: Andres Freund	2018-03-26 16:04:53 -07:00
Andres Freund	071371bc43	LLVMJIT: Free created module in LLVM < 5. Due to the differing APIs between versions, I forgot to deallocate the generated module in older LLVM versions, leading to a memory leak. Author: Andres Freund	2018-03-26 16:04:39 -07:00
Andres Freund	96b5eac918	Correct some typos in the new JIT code. Author: Thomas Munro	2018-03-26 12:58:17 -07:00
Andres Freund	32af96b2b1	JIT tuple deforming in LLVM JIT provider. Performing JIT compilation for deforming gains performance benefits over unJITed deforming from compile-time knowledge of the tuple descriptor. Fixed column widths, NOT NULLness, etc can be taken advantage of. Right now the JITed deforming is only used when deforming tuples as part of expression evaluation (and obviously only if the descriptor is known). It's likely to be beneficial in other cases, too. By default tuple deforming is JITed whenever an expression is JIT compiled. There's a separate boolean GUC controlling it, but that's expected to be primarily useful for development and benchmarking. Docs will follow in a later commit containing docs for the whole JIT feature. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2018-03-26 12:57:19 -07:00
Alvaro Herrera	530bcf7581	Fix thinko in comment The listed numbers disagreed with the ones being used in the symbols; but instead of just fixing the numbers in the comment, use the symbolic name instead, which seems clearer. This has been wrong all along, so apply back to 9.5 where BRIN was introduced. Reported-by: Tomas Vondra Discussion: https://postgr.es/m/5ff514f2-8b1e-6366-b11c-8e2ed442562d@2ndquadrant.com	2018-03-26 12:03:42 -03:00
Alvaro Herrera	555ee77a96	Handle INSERT .. ON CONFLICT with partitioned tables Commit `eb7ed3f306` enabled unique constraints on partitioned tables, but one thing that was not working properly is INSERT/ON CONFLICT. This commit introduces a new node that keeps state related to the ON CONFLICT clause per partition, and fills it when that partition is about to be used for tuple routing. Author: Amit Langote, Álvaro Herrera Reviewed-by: Etsuro Fujita, Pavan Deolasee Discussion: https://postgr.es/m/20180228004602.cwdyralmg5ejdqkq@alvherre.pgsql	2018-03-26 10:43:54 -03:00
Alvaro Herrera	1b89c2188b	Fix typo	2018-03-26 09:56:41 -03:00
Andrew Dunstan	2b27273435	Optimize btree insertions for common case of increasing values Remember the last page of an index insert if it's the rightmost leaf page. If the next entry belongs on and can fit in the remembered page, insert the new entry there as long as we can get a lock on the page. Otherwise, fall back on the more expensive method of searching for the right place to insert the entry. This provides a performance improvement for the common case where an index entry is for monotonically increasing or nearly monotonically increasing value such as an identity field or a current timestamp. Pavan Deolasee Reviewed by Claudio Freire, Simon Riggs and Peter Geoghegan Discussion: https://postgr.es/m/CABOikdM9DrupjyKZZFM5k8-0RCDs1wk6JzEkg7UgSW6QzOwMZw@mail.gmail.com	2018-03-26 22:39:24 +10:30
Tom Lane	d0c0c89453	Fix unsafe extraction of the OID part of a relation filename. Commit `8694cc96b` did this randomly differently from other callers of parse_filename_for_nontemp_relation(). Perhaps unsurprisingly, the randomly different way is wrong; it fails to ensure the extracted string is null-terminated. Per buildfarm member skink. Discussion: https://postgr.es/m/14453.1522001792@sss.pgh.pa.us	2018-03-25 15:15:40 -04:00
Tom Lane	3a2cb59887	Remove useless if-test. Coverity complained that this check is pointless, and it's right. There is no case where we'd call ExecutorStart with a null plannedstmt, and if we did, it'd have crashed before here. Thinko in commit `cc415a56d`.	2018-03-25 14:54:16 -04:00
Peter Eisentraut	52f3a9d6a3	Small refactoring Put the "atomic" argument of ExecuteDoStmt() and ExecuteCallStmt() into a variable instead of repeating the formula.	2018-03-23 17:18:22 -04:00
Tom Lane	4b538727e2	Fix make rules that generate multiple output files. For years, our makefiles have correctly observed that "there is no correct way to write a rule that generates two files". However, what we did is to provide empty rules that "generate" the secondary output files from the primary one, and that's not right either. Depending on the details of the creating process, the primary file might end up timestamped later than one or more secondary files, causing subsequent make runs to consider the secondary file(s) out of date. That's harmless in a plain build, since make will just re-execute the empty rule and nothing happens. But it's fatal in a VPATH build, since make will expect the secondary file to be rebuilt in the build directory. This would manifest as "file not found" failures during VPATH builds from tarballs, if we were ever unlucky enough to ship a tarball with apparently out-of-date secondary files. (It's not clear whether that has ever actually happened, but it definitely could.) To ensure that secondary output files have timestamps >= their primary's, change our makefile convention to be that we provide a "touch $@" action not an empty rule. Also, make sure that this rule actually gets invoked during a distprep run, else the hazard remains. It's been like this a long time, so back-patch to all supported branches. In HEAD, I skipped the changes in src/backend/catalog/Makefile, because those rules are due to get replaced soon in the bootstrap data format patch, and there seems no need to create a merge issue for that patch. If for some reason we fail to land that patch in v11, we'll need to back-fill the changes in that one makefile from v10. Discussion: https://postgr.es/m/18556.1521668179@sss.pgh.pa.us	2018-03-23 13:46:00 -04:00
Teodor Sigaev	8694cc96b5	Exclude unlogged tables from base backups Exclude unlogged tables from base backups entirely, except for the init fork, which marks a created unlogged table. The next question is whether to also skip temp tables in backups, but that's a story for a separate patch. Author: David Steele Review by: Adam Brightwell, Masahiko Sawada Discussion: https://www.postgresql.org/message-id/flat/04791bab-cb04-ba43-e9c0-664a4c1ffb2c@pgmasters.net	2018-03-23 19:14:12 +03:00
Alvaro Herrera	86f575948c	Allow FOR EACH ROW triggers on partitioned tables Previously, FOR EACH ROW triggers were not allowed in partitioned tables. Now we allow AFTER triggers on them, and on trigger creation we cascade to create an identical trigger in each partition. We also clone the triggers to each partition that is created or attached later. This means that deferred unique keys are allowed on partitioned tables, too. Author: Álvaro Herrera Reviewed-by: Peter Eisentraut, Simon Riggs, Amit Langote, Robert Haas, Thomas Munro Discussion: https://postgr.es/m/20171229225319.ajltgss2ojkfd3kp@alvherre.pgsql	2018-03-23 10:48:22 -03:00
Andres Freund	2111a48a0c	Adapt expression JIT to stdbool.h introduction. The LLVM JIT provider uses clang to synchronize types between normal C code and runtime generated code. Clang represents stdbool.h style booleans in return values & parameters differently from booleans stored in variables. Thus the expression compilation code from `2a0faed9d` needs to be adapted to `9a95a77d9`. Instead of hardcoding i8 as the type for booleans (which already was wrong on some edge case platforms!), use postgres' notion of a boolean as used for storage and for parameters. Per buildfarm animal xenodermus. Author: Andres Freund	2018-03-22 22:15:51 -07:00
Peter Eisentraut	9a95a77d9d	Use stdbool.h if suitable Using the standard bool type provided by C allows some recent compilers and debuggers to give better diagnostics. Also, some extension code and third-party headers are increasingly pulling in stdbool.h, so it's probably saner if everyone uses the same definition. But PostgreSQL code is not prepared to handle bool of a size other than 1, so we keep our own old definition if we encounter a stdbool.h with a bool of a different size. (Among current build farm members, this only applies to old macOS versions on PowerPC.) To check that the used bool is of the right size, add a static assertion about the size of GinTernaryValue vs bool. This is currently the only place that assumes that bool and char are of the same size. Discussion: https://www.postgresql.org/message-id/flat/3a0fe7e1-5ed1-414b-9230-53bbc0ed1f49@2ndquadrant.com	2018-03-22 20:42:25 -04:00
Andres Freund	2a0faed9d7	Add expression compilation support to LLVM JIT provider. In addition to the interpretation of expressions (which back evaluation of WHERE clauses, target list projection, aggregates transition values etc) support compiling expressions to native code, using the infrastructure added in earlier commits. To avoid duplicating a lot of code, only support emitting code for cases that are likely to be performance critical. For expression steps that aren't deemed that, use the existing interpreter. The generated code isn't great - some architectural changes are required to address that. But this already yields a significant speedup for some analytics queries, particularly with WHERE clauses filtering a lot, or computing multiple aggregates. Author: Andres Freund Tested-By: Thomas Munro Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de Disable JITing for VALUES() nodes. VALUES() nodes are only ever executed once. This is primarily helpful for debugging, when forcing JITing even for cheap queries. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2018-03-22 14:45:59 -07:00
Andres Freund	fb46ac26fe	Expand list of synchronized types and functions in LLVM JIT provider. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2018-03-22 14:45:59 -07:00
Tom Lane	feb8254518	Improve style guideline compliance of assorted error-report messages. Per the project style guide, details and hints should have leading capitalization and end with a period. On the other hand, errcontext should not be capitalized and should not end with a period. To support well formatted error contexts in dblink, extend dblink_res_error() to take a format+arguments rather than a hardcoded string. Daniel Gustafsson Discussion: https://postgr.es/m/B3C002C8-21A0-4F53-A06E-8CAB29FCF295@yesql.se	2018-03-22 17:33:10 -04:00
Robert Haas	88ba0ae2aa	Consider Parallel Append of partial paths for UNION [ALL]. Without this patch, we can implement a UNION or UNION ALL as an Append where Gather appears beneath one or more of the Append branches, but this lets us put the Gather node on top, with a partial path for each relation underneath. There is considerably more work that could be done to improve planning in this area, but that will probably need to wait for a future release. Patch by me, reviewed and tested by Ashutosh Bapat and Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CA+TgmoaLRAOqHmMZx=ESM3VDEPceg+-XXZsRXQ8GtFJO_zbMSw@mail.gmail.com	2018-03-22 16:09:28 -04:00
Tom Lane	7c91a0364f	Sync up our various ways of estimating pg_class.reltuples. VACUUM thought that reltuples represents the total number of tuples in the relation, while ANALYZE counted only live tuples. This can cause "flapping" in the value when background vacuums and analyzes happen separately. The planner's use of reltuples essentially assumes that it's the count of live (visible) tuples, so let's standardize on having it mean live tuples. Another issue is that the definition of "live tuple" isn't totally clear; what should be done with INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples? ANALYZE's choices in this regard are made on the assumption that if the originating transaction commits at all, it will happen after ANALYZE finishes, so we should ignore the effects of the in-progress transaction --- unless it is our own transaction, and then we should count it. Let's propagate this definition into VACUUM, too. Likewise propagate this definition into CREATE INDEX, and into contrib/pgstattuple's pgstattuple_approx() function. Tomas Vondra, reviewed by Haribabu Kommi, some corrections by me Discussion: https://postgr.es/m/16db4468-edfa-830a-f921-39a50498e77e@2ndquadrant.com	2018-03-22 15:47:41 -04:00
Andres Freund	cc415a56d0	Basic planner and executor integration for JIT. This adds simple cost based plan time decision about whether JIT should be performed. jit_above_cost, jit_optimize_above_cost are compared with the total cost of a plan, and if the cost is above them JIT is performed / optimization is performed respectively. For that PlannedStmt and EState have a jitFlags (es_jit_flags) field that stores information about what JIT operations should be performed. EState now also has a new es_jit field, which can store a JitContext. When there are no errors the context is released in standard_ExecutorEnd(). It is likely that the default values for jit_[optimize_]above_cost will need to be adapted further, but in my test these values seem to work reasonably. Author: Andres Freund, with feedback by Peter Eisentraut Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2018-03-22 11:51:58 -07:00
Andres Freund	250bca7fc1	Debugging and profiling support for LLVM JIT provider. This currently requires patches to the LLVM codebase to be effective (submitted upstream), the GUCs are available without those patches however. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2018-03-22 11:07:55 -07:00
Andres Freund	b96d550eb0	Support for optimizing and emitting code in LLVM JIT provider. This commit introduces the ability to actually generate code using LLVM. In particular, this adds: - Ability to emit code both in heavily optimized and largely unoptimized fashion - Batching facility to allow functions to be defined in small increments, but optimized and emitted in executable form in larger batches (for performance and memory efficiency) - Type and function declaration synchronization between runtime generated code and normal postgres code. This is critical to be able to access struct fields etc. - Developer oriented jit_dump_bitcode GUC, for inspecting / debugging the generated code. - per JitContext statistics of number of functions, time spent generating code, optimizing, and emitting it. This will later be employed for EXPLAIN support. This commit doesn't yet contain any code actually generating functions. That'll follow in later commits. Documentation for GUCs added, and for JIT in general, will be added in later commits. Author: Andres Freund, with contributions by Pierre Ducroquet Testing-By: Thomas Munro, Peter Eisentraut Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2018-03-22 11:05:22 -07:00
Robert Haas	2fe6336e2d	Avoid creating a TOAST table for a partitioned table. It's useless. Amit Langote Discussion: http://postgr.es/m/b4c9dee6-d134-49b8-79c4-07fbd7c3b898@lab.ntt.co.jp	2018-03-22 13:49:38 -04:00
Robert Haas	8a8c4f3b32	Fix typo in comment. Michael Paquier Discussion: http://postgr.es/m/20180205071404.GB17337@paquier.xyz	2018-03-22 13:36:14 -04:00
Tom Lane	649f179250	Fix tuple counting in SP-GiST index build. Count the number of tuples in the index honestly, instead of assuming that it's the same as the number of tuples in the heap. (It might be different if the index is partial.) Back-patch to all supported versions. Tomas Vondra Discussion: https://postgr.es/m/3b3d8eac-c709-0d25-088e-b98339a1b28a@2ndquadrant.com	2018-03-22 13:24:05 -04:00
Robert Haas	7de4a1bcc5	Call pgstat_report_activity() in parallel CREATE INDEX workers. Also set debug_query_string. Oversight in commit `9da0cc3528` Peter Geoghegan, per a report by Phil Florent. Discussion: https://postgr.es/m/CAH2-Wzmf-34hD4n40uTuE-ZY9P5c%2BmvhFbCdQfN%3DKrKiVm3j3A%40mail.gmail.com	2018-03-22 13:15:03 -04:00
Robert Haas	e2f1eb0ee3	Implement partition-wise grouping/aggregation. If the partition keys of input relation are part of the GROUP BY clause, all the rows belonging to a given group come from a single partition. This allows aggregation/grouping over a partitioned relation to be broken down into aggregation/grouping on each partition. This should be no worse, and often better, than the normal approach. If the GROUP BY clause does not contain all the partition keys, we can still perform partial aggregation for each partition and then finalize aggregation after appending the partial results. This is less certain to be a win, but it's still useful. Jeevan Chalke, Ashutosh Bapat, Robert Haas. The larger patch series of which this patch is a part was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, and Rafia Sabih. Discussion: http://postgr.es/m/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com	2018-03-22 12:49:48 -04:00
Dean Rasheed	b5db1d93d2	Improve ANALYZE's strategy for finding MCVs. Previously, a value was included in the MCV list if its frequency was 25% larger than the estimated average frequency of all nonnull values in the table. For uniform distributions, that can lead to values being included in the MCV list and significantly overestimated on the basis of relatively few (sometimes just 2) instances being seen in the sample. For non-uniform distributions, it can lead to too few values being included in the MCV list, since the overall average frequency may be dominated by a small number of very common values, while the remaining values may still have a large spread of frequencies, causing both substantial overestimation and underestimation of the remaining values. Furthermore, increasing the statistics target may have little effect because the overall average frequency will remain relatively unchanged. Instead, populate the MCV list with the largest set of common values that are statistically significantly more common than the average frequency of the remaining values. This takes into account the variance of the sample counts, which depends on the counts themselves and on the proportion of the table that was sampled. As a result, it constrains the relative standard error of estimates based on the frequencies of values in the list, reducing the chances of too many values being included. At the same time, it allows more values to be included, since the MCVs need only be more common than the remaining non-MCVs, rather than the overall average. Thus it tends to produce fewer MCVs than the previous code for uniform distributions, and more for non-uniform distributions, reducing estimation errors in both cases. In addition, the algorithm responds better to increasing the statistics target, allowing more values to be included in the MCV list when more of the table is sampled. Jeff Janes, substantially modified by me. Reviewed by John Naylor and Tomas Vondra. Discussion: https://postgr.es/m/CAMkU=1yvdGvW9TmiLAhz2erFnvnPFYHbOZuO+a=4DVkzpuQ2tw@mail.gmail.com	2018-03-22 09:37:36 +00:00
Andres Freund	31bc604e0b	Add file containing extensions of the LLVM C API. Author: Andres Freund Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2018-03-21 19:44:17 -07:00
Andres Freund	432bb9e04d	Basic JIT provider and error handling infrastructure. This commit introduces: 1) JIT provider abstraction, which allows JIT functionality to be implemented in separate shared libraries. That's desirable because it allows installing JIT support as a separate package, and because it allows experimentation with different forms of JITing. 2) JITContexts which can be, using functions introduced in follow up commits, used to emit JITed functions, and have them be cleaned up on error. 3) The outline of a LLVM JIT provider, which will be fleshed out in subsequent commits. Documentation for GUCs added, and for JIT in general, will be added in later commits. Author: Andres Freund, with architectural input from Jeff Davis Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2018-03-21 19:28:28 -07:00
Tom Lane	846b5a5257	Prevent extensions from creating custom GUCs that are GUC_LIST_QUOTE. Pending some solution for the problems noted in commit `742869946`, disallow dynamic creation of GUC_LIST_QUOTE variables. If there are any extensions out there using this feature, they'd not be happy for us to start enforcing this rule in minor releases, so this is a HEAD-only change. The previous commit didn't make things any worse than they already were for such cases. Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz	2018-03-21 20:11:07 -04:00
Tom Lane	742869946f	Fix mishandling of quoted-list GUC values in pg_dump and ruleutils.c. Code that prints out the contents of setconfig or proconfig arrays in SQL format needs to handle GUC_LIST_QUOTE variables differently from other ones, because for those variables, flatten_set_variable_args() already applied a layer of quoting. The value can therefore safely be printed as-is, and indeed must be, or flatten_set_variable_args() will muck it up completely on reload. For all other GUC variables, it's necessary and sufficient to quote the value as a SQL literal. We'd recognized the need for this long ago, but mis-analyzed the need slightly, thinking that all GUC_LIST_INPUT variables needed the special treatment. That's actually wrong, since a valid value of a LIST variable might include characters that need quoting, although no existing variables accept such values. More to the point, we hadn't made any particular effort to keep the various places that deal with this up-to-date with the set of variables that actually need special treatment, meaning that we'd do the wrong thing with, for example, temp_tablespaces values. This affects dumping of SET clauses attached to functions, as well as ALTER DATABASE/ROLE SET commands. In ruleutils.c we can fix it reasonably honestly by exporting a guc.c function that allows discovering the flags for a given GUC variable. But pg_dump doesn't have easy access to that, so continue the old method of having a hard-wired list of affected variable names. At least we can fix it to have just one list not two, and update the list to match current reality. A remaining problem with this is that it only works for built-in GUC variables. pg_dump's list obviously knows nothing of third-party extensions, and even the "ask guc.c" method isn't bulletproof since the relevant extension might not be loaded. There's no obvious solution to that, so for now, we'll just have to discourage extension authors from inventing custom GUCs that need GUC_LIST_QUOTE. This has been busted for a long time, so back-patch to all supported branches. Michael Paquier and Tom Lane, reviewed by Kyotaro Horiguchi and Pavel Stehule Discussion: https://postgr.es/m/20180111064900.GA51030@paquier.xyz	2018-03-21 20:03:28 -04:00
Tom Lane	0f0deb7194	Improve predtest.c's handling of cases with NULL-constant inputs. Currently, if operator_predicate_proof() is given an operator clause like "something op NULL", it just throws up its hands and reports it can't prove anything. But we can often do better than that, if the operator is strict, because then we know that the clause returns NULL overall. Depending on whether we're trying to prove or refute something, and whether we need weak or strong semantics for NULL, this may be enough to prove the implication, especially when we rely on the standard rule that "false implies anything". In particular, this lets us do something useful with questions like "does X IN (1,3,5,NULL) imply X <= 5?" The null entry in the IN list can effectively be ignored for this purpose, but the proof rules were not previously smart enough to deduce that. This patch is by me, but it owes something to previous work by Amit Langote to try to solve problems of the form mentioned. Thanks also to Emre Hasegeli and Ashutosh Bapat for review. Discussion: https://postgr.es/m/3bad48fc-f257-c445-feeb-8a2b2fb622ba@lab.ntt.co.jp	2018-03-21 18:30:46 -04:00
Alvaro Herrera	56163004b8	Fix relcache handling of the 'default' partition My commit `4dba331cb3` that moved around CommandCounterIncrement calls in partitioning DDL code unearthed a problem with the relcache handling for the 'default' partition: the construction of a correct relcache entry for the partitioned table was at the mercy of lack of CCI calls in non-trivial amounts of code. This was prone to creating problems later on, as the code develops. This was visible as a test failure in a compile with RELCACHE_FORCE_RELEASE (buildfarm member prion). The problem is that after the mentioned commit it was possible to create a relcache entry that had incomplete information regarding the default partition because I introduced a CCI between adding the catalog entries for the default partition (StorePartitionBound) and the update of pg_partitioned_table entry for its parent partitioned table (update_default_partition_oid). It seems the best fix is to move the latter so that it occurs inside the former; the purposeful lack of intervening CCI should be more obvious, and harder to break. I also remove a check in RelationBuildPartitionDesc that returns NULL if the key is not set. I couldn't find any place that needs this hack anymore; probably it was required because of bugs that have since been fixed. Fix a few typos I noticed while reviewing the code involved. Discussion: https://postgr.es/m/20180320182659.nyzn3vqtjbbtfgwq@alvherre.pgsql	2018-03-21 12:03:35 -03:00
Peter Eisentraut	325f2ec555	Handle heap rewrites even better in logical decoding Logical decoding should not publish anything about tables created as part of a heap rewrite during DDL. Those tables don't exist externally, so consumers of logical decoding cannot do anything sensible with that information. In `ab28feae2b`, we worked around this for built-in logical replication, but that was a hack. This is a more proper fix: We mark such transient heaps using the new field pg_class.relwrite, linking to the original relation OID. By default, we ignore them in logical decoding before they get to the output plugin. Optionally, a plugin can register their interest in getting such changes, if they handle DDL specially, in which case the new field will help them get information about the actual table. Reviewed-by: Craig Ringer <craig@2ndquadrant.com>	2018-03-21 09:15:04 -04:00
Andrew Gierth	d2d79887ea	Repair crash with unsortable grouping sets. If there were multiple grouping sets, none of them empty, all of which were unsortable, then an oversight in consider_groupingsets_paths led to a null pointer dereference. Fix, and add a regression test for this case. Per report from Dang Minh Huong, though I didn't use their patch. Backpatch to 10.x where hashed grouping sets were added.	2018-03-21 11:39:28 +00:00
Andres Freund	4c0000b839	Handle EEOP_FUNCEXPR_[STRICT_]FUSAGE out of line. This isn't a very common op, and it doesn't seem worth duplicating for JIT. Author: Andres Freund	2018-03-20 17:32:21 -07:00
Robert Haas	94150513ec	Don't pass the grouping target around unnecessarily. Since commit `4f15e5d09d` made grouped_rel set reltarget, a variety of other functions can just get it from grouped_rel instead of having to pass it around explicitly. Simplify accordingly. Patch by me, reviewed by Ashutosh Bapat. Discussion: http://postgr.es/m/CA+TgmoZ+ZJTVad-=vEq393N99KTooxv9k7M+z73qnTAqkb49BQ@mail.gmail.com	2018-03-20 11:37:43 -04:00
Robert Haas	b5996c2791	Determine grouping strategies in create_grouping_paths. Partition-wise aggregate will call create_ordinary_grouping_paths multiple times and we don't want to redo this work every time; have the caller do it instead and pass the details down. Patch by me, reviewed by Ashutosh Bapat. Discussion: http://postgr.es/m/CA+TgmoY7VYYn9a7YHj1nJL6zj6BkHmt4K-un9LRmXkyqRZyynA@mail.gmail.com	2018-03-20 11:31:06 -04:00
Robert Haas	4f15e5d09d	Defer creation of partially-grouped relation until it's needed. This avoids unnecessarily creating a RelOptInfo for which we have no actual need. This idea is from Ashutosh Bapat, who wrote a very different patch to accomplish a similar goal. It will be more important if and when we get partition-wise aggregate, since then there could be many partially grouped relations all of which could potentially be unnecessary. In passing, this sets the grouping relation's reltarget, which wasn't done previously but makes things simpler for this refactoring. Along the way, adjust things so that add_paths_to_partial_grouping_rel, now renamed create_partial_grouping_paths, does not perform the Gather or Gather Merge steps to generate non-partial paths from partial paths; have the caller do it instead. This is again for the convenience of partition-wise aggregate, which wants to inject additional partial paths after the partial paths are created and before we decide which ones to Gather/Gather Merge. This might seem like a separate change, but it's actually pretty closely entangled; I couldn't really see much value in separating it and having to change some things twice. Patch by me, reviewed by Ashutosh Bapat. Discussion: http://postgr.es/m/CA+TgmoZ+ZJTVad-=vEq393N99KTooxv9k7M+z73qnTAqkb49BQ@mail.gmail.com	2018-03-20 11:18:04 -04:00
Alvaro Herrera	4dba331cb3	Fix CommandCounterIncrement in partition-related DDL It makes sense to do the CCIs in the places that do catalog updates, rather than before the places that error out because the former ones fail to do it. In particular, it looks like StorePartitionBound() and IndexSetParentIndex() ought to make their own CCIs. Per review comments from Peter Eisentraut for row-level triggers on partitioned tables. Discussion: https://postgr.es/m/20171229225319.ajltgss2ojkfd3kp@alvherre.pgsql	2018-03-20 11:19:41 -03:00
Tom Lane	467963c3e9	Prevent query-lifespan memory leakage of SP-GiST traversal values. The original coding of the SP-GiST scan traversalValue feature (commit `ccd6eb49a`) arranged for traversal values to be stored in the query's main executor context. That's fine if there's only one index scan per query, but if there are many, we have a memory leak as successive scans create new traversal values. Fix it by creating a separate memory context for traversal values, which we can reset during spgrescan(). Back-patch to 9.6 where this code was introduced. In principle, adding the traversalCxt field to SpGistScanOpaqueData creates an ABI break in the back branches. But I (tgl) have little sympathy for extensions including spgist_private.h, so I'm not very worried about that. Alternatively we could stick the new field at the end of the struct in back branches, but that has its own downsides. Anton Dignös, reviewed by Alexander Kuzmenkov Discussion: https://postgr.es/m/CALNdv1jb6y2Te-m8xHLxLX12RsBmZJ1f4hESX7J0HjgyOhA9eA@mail.gmail.com	2018-03-19 23:59:30 -04:00
Peter Eisentraut	13c7c65ec9	Add missing break	2018-03-19 19:45:51 -04:00
Tom Lane	6497a18e6c	Fix some corner-case issues in REFRESH MATERIALIZED VIEW CONCURRENTLY. refresh_by_match_merge() has some issues in the way it builds a SQL query to construct the "diff" table: 1. It doesn't require the selected unique index(es) to be indimmediate. 2. It doesn't pay attention to the particular equality semantics enforced by a given index, but just assumes that they must be those of the column datatype's default btree opclass. 3. It doesn't check that the indexes are btrees. 4. It's insufficiently careful to ensure that the parser will pick the intended operator when parsing the query. (This would have been a security bug before CVE-2018-1058.) 5. It's not careful about indexes on system columns. The way to fix #4 is to make use of the existing code in ri_triggers.c for generating an arbitrary binary operator clause. I chose to move that to ruleutils.c, since that seems a more reasonable place to be exporting such functionality from than ri_triggers.c. While #1, #3, and #5 are just latent given existing feature restrictions, and #2 doesn't arise in the core system for lack of alternate opclasses with different equality behaviors, #4 seems like an issue worth back-patching. That's the bulk of the change anyway, so just back-patch the whole thing to 9.4 where this code was introduced. Discussion: https://postgr.es/m/13836.1521413227@sss.pgh.pa.us	2018-03-19 18:50:05 -04:00
Tom Lane	6fbd5cce22	Fix performance hazard in REFRESH MATERIALIZED VIEW CONCURRENTLY. Jeff Janes discovered that commit `7ca25b7de` made one of the queries run by REFRESH MATERIALIZED VIEW CONCURRENTLY perform badly. The root cause is bad cardinality estimation for correlated quals, but a principled solution to that problem is some way off, especially since the planner lacks any statistics about whole-row variables. Moreover, in non-error cases this query produces no rows, meaning it must be run to completion; but use of LIMIT 1 encourages the planner to pick a fast-start, slow-completion plan, exactly not what we want. Remove the LIMIT clause, and instead rely on the count parameter we pass to SPI_execute() to prevent excess work if the query does return some rows. While we've heard no field reports of planner misbehavior with this query, it could be that people are having performance issues that haven't reached the level of pain needed to cause a bug report. In any case, that LIMIT clause can't possibly do anything helpful with any existing version of the planner, and it demonstrably can cause bad choices in some cases, so back-patch to 9.4 where the code was introduced. Thomas Munro Discussion: https://postgr.es/m/CAMkU=1z-JoGymHneGHar1cru4F1XDfHqJDzxP_CtK5cL3DOfmg@mail.gmail.com	2018-03-19 17:23:21 -04:00
Alvaro Herrera	ee0a1fc84e	Remove unnecessary members from ModifyTableState and ExecInsert These values can be obtained from the ModifyTable node which is already a part of both the ModifyTableState and ExecInsert. Author: Álvaro Herrera, Amit Langote Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/20180316151303.rml2p5wffn3o6qy6@alvherre.pgsql	2018-03-19 18:09:43 -03:00
Alvaro Herrera	839a8eb2b3	Expand comment a little bit The previous commit removed a comment that was a bit more verbose than its replacement.	2018-03-19 18:01:27 -03:00
Alvaro Herrera	6666ee49f4	Fix state reversal after partition tuple routing We make some changes to ModifyTableState and the EState it uses whenever we route tuples to partitions; but we weren't restoring properly in all cases, possibly causing crashes when partitions with different tuple descriptors are targeted by tuples inserted in the same command. Refactor some code, creating ExecPrepareTupleRouting, to encapsulate the needed state changing logic, and have it invoked one level above its current place (ie. put it in ExecModifyTable instead of ExecInsert); this makes it all more readable. Add a test case to exercise this. We don't support having views as partitions; and since only views can have INSTEAD OF triggers, there is no point in testing for INSTEAD OF when processing insertions into a partitioned table. Remove code that appears to support this (but which is actually never relevant.) In passing, fix location of some very confusing comments in ModifyTableState. Reported-by: Amit Langote Author: Etsuro Fujita, Amit Langote Discussion: https://postgr.es/m/0473bf5c-57b1-f1f7-3d58-455c2230bc5f@lab.ntt.co.jp	2018-03-19 17:45:53 -03:00
Robert Haas	c596fadbfe	Generate a separate upper relation for each stage of setop planning. Commit `3fc6e2d7f5` made setop planning stages return paths rather than plans, but all such paths were loosely associated with a single RelOptInfo, and only the final path was added to the RelOptInfo. Even at the time, it was foreseen that this should be changed, because there is otherwise no good way for a single stage of setop planning to return multiple paths. With this patch, each stage of set operation planning now creates a separate RelOptInfo; these are distinguished by using appropriate relid sets. Note that this patch does nothing whatsoever about actually returning multiple paths for the same set operation; it just makes it possible for a future patch to do so. Along the way, adjust things so that create_upper_paths_hook is called for each of these new RelOptInfos rather than just once, since that might be useful to extensions using that hook. It might be good to provide an FDW API here as well, but I didn't try to do that for now. Patch by me, reviewed and tested by Ashutosh Bapat and Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CA+TgmoaLRAOqHmMZx=ESM3VDEPceg+-XXZsRXQ8GtFJO_zbMSw@mail.gmail.com	2018-03-19 11:55:38 -04:00
Robert Haas	49525c4630	Rewrite recurse_union_children to iterate, rather than recurse. Also, rename it to plan_union_children, since the old name wasn't very descriptive. This results in a small net reduction in code, seems at least to me to be easier to understand, and saves space on the process stack. Patch by me, reviewed and tested by Ashutosh Bapat and Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CA+TgmoaLRAOqHmMZx=ESM3VDEPceg+-XXZsRXQ8GtFJO_zbMSw@mail.gmail.com	2018-03-19 11:54:56 -04:00
Magnus Hagander	71cce90ee9	Fix typo in comment Author: Daniel Gustafsson <daniel@yesql.se>	2018-03-19 10:45:44 +01:00
Tom Lane	8f5ac44043	Fix WHERE CURRENT OF when the referenced cursor uses an index-only scan. "UPDATE/DELETE WHERE CURRENT OF cursor_name" failed, with an error message like "cannot extract system attribute from virtual tuple", if the cursor was using an index-only scan for the target table. Fix it by digging the current TID out of the indexscan state. It seems likely that the same failure could occur for CustomScan plans and perhaps some FDW plan types, so that leaving this to be treated as an internal error with an obscure message isn't as good an idea as it first seemed. Hence, add a bit of heaptuple.c infrastructure to let us deliver a more on-topic message. I chose to make the message match what you get for the case where execCurrentOf can't identify the target scan node at all, "cursor "foo" is not a simply updatable scan of table "bar"". Perhaps it should be different, but we can always adjust that later. In the future, it might be nice to provide hooks that would let custom scan providers and/or FDWs deal with this in other ways; but that's not a suitable topic for a back-patchable bug fix. It's been like this all along, so back-patch to all supported branches. Yugo Nagata and Tom Lane Discussion: https://postgr.es/m/20180201013349.937dfc5f.nagata@sraoss.co.jp	2018-03-17 14:59:49 -04:00
Peter Eisentraut	8a3d942529	Add ssl_passphrase_command setting This allows specifying an external command for prompting for or otherwise obtaining passphrases for SSL key files. This is useful because in many cases there is no TTY easily available during service startup. Also add a setting ssl_passphrase_command_supports_reload, which allows supporting SSL configuration reload even if SSL files need passphrases. Reviewed-by: Daniel Gustafsson <daniel@yesql.se>	2018-03-17 08:28:51 -04:00
Andres Freund	7a50bb690b	Add 'unit' parameter to ExplainProperty{Integer,Float}. This allows deduplicating some existing code, but mainly avoids some duplication in upcoming commits. In passing, fix variable names indicating wrong unit (seconds instead of ms). Author: Andres Freund Discussion: https://postgr.es/m/20180314002740.cah3mdsonz5mxney@alap3.anarazel.de	2018-03-16 23:16:04 -07:00
Andres Freund	f3e4b95edb	Make ExplainPropertyInteger accept 64bit input, remove *Long variant. 'long' is not a useful type across platforms, as it's 32bit on 32 bit platforms, and even on some 64bit platforms (e.g. windows) it's still only 32bits wide. As ExplainPropertyInteger should never be performance critical, change it to accept a 64bit argument and remove ExplainPropertyLong. Author: Andres Freund Discussion: https://postgr.es/m/20180314164832.n56wt7zcbpzi6zxe@alap3.anarazel.de	2018-03-16 23:13:12 -07:00
Tom Lane	9e17bdb8a5	Fix query-lifespan memory leakage in repeatedly executed hash joins. ExecHashTableCreate allocated some memory that wasn't freed by ExecHashTableDestroy, specifically the per-hash-key function information. That's not a huge amount of data, but if one runs a query that repeats a hash join enough times, it builds up. Fix by arranging for the data in question to be kept in the hashtable's hashCxt instead of leaving it "loose" in the query-lifespan executor context. (This ensures that we'll also clean up anything that the hash functions allocate in fn_mcxt.) Per report from Amit Khandekar. It's been like this forever, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAJ3gD9cFofAWGvcxLOxDHC=B0hjtW8yGmUsF2hdGh97CM38=7g@mail.gmail.com	2018-03-16 16:03:45 -04:00
Peter Eisentraut	4120864b9e	Change transaction state debug strings to match enum symbols In some cases, these were different for no apparent reason, making debugging unnecessarily mysterious. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2018-03-16 13:18:06 -04:00
Peter Eisentraut	81148856b0	Improve savepoint error messages Include the savepoint name in the error message and rephrase it a bit to match common style. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2018-03-16 13:18:06 -04:00
Peter Eisentraut	ec87efde8d	Simplify parse representation of savepoint commands Instead of embedding the savepoint name in a list and then requiring complex code to unpack it, just add another struct field to store it directly. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2018-03-16 13:18:06 -04:00
Peter Eisentraut	04700b685f	Rename TransactionChain functions We call this thing a "transaction block" everywhere except in a few functions, where it is mysteriously called a "transaction chain". In the SQL standard, a transaction chain is something different. So rename these functions to match the common terminology. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2018-03-16 13:18:06 -04:00
Peter Eisentraut	8d47a90862	Update function comments After `a6542a4b68`, some function comments were misplaced. Fix that. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2018-03-16 13:18:05 -04:00
Tom Lane	877cdf11ea	Mop-up for letting VOID-returning SQL functions end with a SELECT. Part of the intent in commit `fd1a421fe` was to allow SQL functions that are declared to return VOID to contain anything, including an unrelated final SELECT, the same as SQL-language procedures can. However, the planner's inlining logic didn't get that memo. Fix it, and add some regression tests covering this area, since evidently we had none. In passing, clean up some typos in comments in create_function_3.sql, and get rid of its none-too-safe assumption that DROP CASCADE notice output is immutably ordered. Per report from Prabhat Sahu. Discussion: https://postgr.es/m/CANEvxPqxAj6nNHVcaXxpTeEFPmh24Whu+23emgjiuKrhJSct0A@mail.gmail.com	2018-03-16 12:48:13 -04:00
Robert Haas	1466bcfa4a	Split create_grouping_paths into degenerate and non-degenerate cases. There's no functional change here, or at least I hope there isn't, just code rearrangement. The rearrangement is motivated by partition-wise aggregate, which doesn't need to consider the degenerate case but wants to reuse the logic for the ordinary case. Based loosely on a patch from Ashutosh Bapat and Jeevan Chalke, but I whacked it around pretty heavily. The larger patch series of which this patch is a part was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, Rafia Sabih, and me. Discussion: http://postgr.es/m/CAFjFpRewpqCmVkwvq6qrRjmbMDpN0CZvRRzjd8UvncczA3Oz1Q@mail.gmail.com	2018-03-15 14:43:58 -04:00
Peter Eisentraut	3a4b891964	Fix more format truncation issues Fix the warnings created by the compiler warning options -Wformat-overflow=2 -Wformat-truncation=2, supported since GCC 7. This is a more aggressive variant of the fixes in `6275f5d28a`, which GCC 7 warned about by default. The issues are all harmless, but some dubious coding patterns are cleaned up. One issue that is of external interest is that BGW_MAXLEN is increased from 64 to 96. Apparently, the old value would cause the bgw_name of logical replication workers to be truncated in some circumstances. But this doesn't actually add those warning options. It appears that the warnings depend a bit on compilation and optimization options, so it would be annoying to have to keep up with that. This is more of a once-in-a-while cleanup. Reviewed-by: Michael Paquier <michael@paquier.xyz>	2018-03-15 11:41:42 -04:00
Robert Haas	648a6c7bd8	Pass additional arguments to a couple of grouping-related functions. get_number_of_groups() and make_partial_grouping_target() currently fish information directly out of the PlannerInfo; in the former case, the target list, and in the latter case, the HAVING qual. This works fine if there's only one grouping relation, but if the pending patch for partition-wise aggregate gets committed, we'll have multiple grouping relations and must therefore use appropriately translated versions of these values for each one. To make that simpler, pass the values to be used as arguments. Jeevan Chalke. The larger patch series of which this patch is a part was also reviewed and tested by Antonin Houska, Rajkumar Raghuwanshi, David Rowley, Dilip Kumar, Konstantin Knizhnik, Pascal Legrand, Rafia Sabih, and me. Discussion: http://postgr.es/m/CAM2+6=UqFnFUypOvLdm5TgC+2M=-E0Q7_LOh0VDFFzmk2BBPzQ@mail.gmail.com Discussion: http://postgr.es/m/CAM2+6=W+L=C4yBqMrgrfTfNtbtmr4T53-hZhwbA2kvbZ9VMrrw@mail.gmail.com	2018-03-15 11:33:52 -04:00


18474 Commits