postgresql

Commit Graph

Author	SHA1	Message	Date
Alvaro Herrera	3e9744465d	Add -Wimplicit-fallthrough to CFLAGS and CXXFLAGS Use it at level 4, a bit more restrictive than the default level, and tweak our commanding comments to FALLTHROUGH. (However, leave zic.c alone, since it's external code; to avoid the warnings that would appear there, change CFLAGS for that file in the Makefile.) Author: Julien Rouhaud <rjuju123@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/20200412081825.qyo5vwwco3fv4gdo@nol Discussion: https://postgr.es/m/flat/E1fDenm-0000C8-IJ@gemulon.postgresql.org	2020-05-12 16:07:30 -04:00
Bruce Momjian	7559d8ebfa	Update copyrights for 2020 Backpatch-through: update all files in master, backpatch legal files through 9.4	2020-01-01 12:21:45 -05:00
Andres Freund	6a04d345fd	Don't include utils/array.h from acl.h. For most uses of acl.h the details of how "Acl" internally looks like are irrelevant. It might make sense to move a lot of the implementation details into a separate header at a later point. The main motivation of this change is to avoid including fmgr.h (via array.h, which needs it for exposed structs) in a lot of files that otherwise don't need it. A subsequent commit will remove the fmgr.h include from a lot of files. Directly include utils/array.h and utils/expandeddatum.h from the files that need them, but previously included them indirectly, via acl.h. Author: Andres Freund Discussion: https://postgr.es/m/20190803193733.g3l3x3o42uv4qj7l@alap3.anarazel.de	2019-08-16 10:33:30 -07:00
Alvaro Herrera	815ef2f568	Don't constraint-exclude partitioned tables as much We only need to invoke constraint exclusion on partitioned tables when they are a partition, and they themselves contain a default partition; it's not necessary otherwise, and it's expensive, so avoid it. Also, we were trying once for each clause separately, but we can do it for all the clauses at once. While at it, centralize setting of RelOptInfo->partition_qual instead of computing it in slightly different ways in different places. Per complaints from Simon Riggs about 4e85642d935e; reviewed by Yuzuko Hosoya, Kyotaro Horiguchi. Author: Amit Langote. I (Álvaro) again mangled the patch somewhat. Discussion: https://postgr.es/m/CANP8+j+tMCY=nEcQeqQam85=uopLBtX-2vHiLD2bbp7iQQUKpA@mail.gmail.com	2019-08-13 10:26:04 -04:00
Michael Paquier	66bde49d96	Fix inconsistencies and typos in the tree, take 10 This addresses some issues with unnecessary code comments, fixes various typos in docs and comments, and removes some orphaned structures and definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/9aabc775-5494-b372-8bcb-4dfc0bd37c68@gmail.com	2019-08-13 13:53:41 +09:00
Tom Lane	5ee190f8ec	Rationalize use of list_concat + list_copy combinations. In the wake of commit `1cff1b95a`, the result of list_concat no longer shares the ListCells of the second input. Therefore, we can replace "list_concat(x, list_copy(y))" with just "list_concat(x, y)". To improve call sites that were list_copy'ing the first argument, or both arguments, invent "list_concat_copy()" which produces a new list sharing no ListCells with either input. (This is a bit faster than "list_concat(list_copy(x), y)" because it makes the result list the right size to start with.) In call sites that were not list_copy'ing the second argument, the new semantics mean that we are usually leaking the second List's storage, since typically there is no remaining pointer to it. We considered inventing another list_copy variant that would list_free the second input, but concluded that for most call sites it isn't worth worrying about, given the relative compactness of the new List representation. (Note that in cases where such leakage would happen, the old code already leaked the second List's header; so we're only discussing the size of the leak not whether there is one. I did adjust two or three places that had been troubling to free that header so that they manually free the whole second List.) Patch by me; thanks to David Rowley for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us	2019-08-12 11:20:18 -04:00
Tom Lane	0662eb6219	Fix SIGSEGV in pruning for ScalarArrayOp with constant-null array. Not much to be said here: commit `9fdb675fc` should have checked constisnull, didn't. Per report from Piotr Włodarczyk. Back-patch to v11 where bug was introduced. Discussion: https://postgr.es/m/CAP-dhMr+vRpwizEYjUjsiZ1vwqpohTm+3Pbdt6Pr7FEgPq9R0Q@mail.gmail.com	2019-08-09 13:20:28 -04:00
Alvaro Herrera	4e85642d93	Apply constraint exclusion more generally in partitioning We were applying constraint exclusion on the partition constraint when generating pruning steps for a clause, but only for the rather restricted situation of them being boolean OR operators; however it is possible to have differently shaped clauses that also benefit from constraint exclusion. This applies particularly to the default partition since their constraints are in essence a long list of OR'ed subclauses ... but it applies to other cases too. So in certain cases we're scanning partitions that we don't need to. Remove the specialized code in OR clauses, and add a generally applicable test of the clause refuting the partition constraint; mark the whole pruning operation as contradictory if it hits. This has the unwanted side-effect of testing some (most? all?) constraints more than once if constraint_exclusion=on. That seems unavoidable as far as I can tell without some additional work, but that's not the recommended setting for that parameter anyway. However, because this imposes additional processing cost for all queries using partitioned tables, I decided not to backpatch this change. Author: Amit Langote, Yuzuko Hosoya, Álvaro Herrera Reviewers: Shawn Wang, Thibaut Madeleine, Yoshikazu Imai, Kyotaro Horiguchi; they were also uncredited reviewers for commit `489247b0e6`. Discussion: https://postgr.es/m/9bb31dfe-b0d0-53f3-3ea6-e64b811424cf@lab.ntt.co.jp	2019-08-07 12:21:54 -04:00
Alvaro Herrera	489247b0e6	Improve pruning of a default partition When querying a partitioned table containing a default partition, we were wrongly deciding to include it in the scan too early in the process, failing to exclude it in some cases. If we reinterpret the PruneStepResult.scan_default flag slightly, we can do a better job at detecting that it can be excluded. The change is that we avoid setting the flag for that pruning step unless the step absolutely requires the default partition to be scanned (in contrast with the previous arrangement, which was to set it unless the step was able to prune it). So get_matching_partitions() must explicitly check the partition that each returned bound value corresponds to in order to determine whether the default one needs to be included, rather than relying on the flag from the final step result. Author: Yuzuko Hosoya <hosoya.yuzuko@lab.ntt.co.jp> Reviewed-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> Discussion: https://postgr.es/m/00e601d4ca86$932b8bc0$b982a340$@lab.ntt.co.jp	2019-08-04 11:18:45 -04:00
Tom Lane	1cff1b95ab	Represent Lists as expansible arrays, not chains of cons-cells. Originally, Postgres Lists were a more or less exact reimplementation of Lisp lists, which consist of chains of separately-allocated cons cells, each having a value and a next-cell link. We'd hacked that once before (commit `d0b4399d8`) to add a separate List header, but the data was still in cons cells. That makes some operations -- notably list_nth() -- O(N), and it's bulky because of the next-cell pointers and per-cell palloc overhead, and it's very cache-unfriendly if the cons cells end up scattered around rather than being adjacent. In this rewrite, we still have List headers, but the data is in a resizable array of values, with no next-cell links. Now we need at most two palloc's per List, and often only one, since we can allocate some values in the same palloc call as the List header. (Of course, extending an existing List may require repalloc's to enlarge the array. But this involves just O(log N) allocations not O(N).) Of course this is not without downsides. The key difficulty is that addition or deletion of a list entry may now cause other entries to move, which it did not before. For example, that breaks foreach() and sister macros, which historically used a pointer to the current cons-cell as loop state. We can repair those macros transparently by making their actual loop state be an integer list index; the exposed "ListCell *" pointer is no longer state carried across loop iterations, but is just a derived value. (In practice, modern compilers can optimize things back to having just one loop state value, at least for simple cases with inline loop bodies.) In principle, this is a semantics change for cases where the loop body inserts or deletes list entries ahead of the current loop index; but I found no such cases in the Postgres code. The change is not at all transparent for code that doesn't use foreach() but chases lists "by hand" using lnext().
The largest share of such code in the backend is in loops that were maintaining "prev" and "next" variables in addition to the current-cell pointer, in order to delete list cells efficiently using list_delete_cell(). However, we no longer need a previous-cell pointer to delete a list cell efficiently. Keeping a next-cell pointer doesn't work, as explained above, but we can improve matters by changing such code to use a regular foreach() loop and then using the new macro foreach_delete_current() to delete the current cell. (This macro knows how to update the associated foreach loop's state so that no cells will be missed in the traversal.) There remains a nontrivial risk of code assuming that a ListCell pointer will remain good over an operation that could now move the list contents. To help catch such errors, list.c can be compiled with a new define symbol DEBUG_LIST_MEMORY_USAGE that forcibly moves list contents whenever that could possibly happen. This makes list operations significantly more expensive so it's not normally turned on (though it is on by default if USE_VALGRIND is on). There are two notable API differences from the previous code: * lnext() now requires the List's header pointer in addition to the current cell's address. * list_delete_cell() no longer requires a previous-cell argument. These changes are somewhat unfortunate, but on the other hand code using either function needs inspection to see if it is assuming anything it shouldn't, so it's not all bad. Programmers should be aware of these significant performance changes: * list_nth() and related functions are now O(1); so there's no major access-speed difference between a list and an array. * Inserting or deleting a list element now takes time proportional to the distance to the end of the list, due to moving the array elements. (However, it typically doesn't require palloc or pfree, so except in long lists it's probably still faster than before.) 
Notably, lcons() used to be about the same cost as lappend(), but that's no longer true if the list is long. Code that uses lcons() and list_delete_first() to maintain a stack might usefully be rewritten to push and pop at the end of the list rather than the beginning. * There are now list_insert_nth...() and list_delete_nth...() functions that add or remove a list cell identified by index. These have the data-movement penalty explained above, but there's no search penalty. * list_concat() and variants now copy the second list's data into storage belonging to the first list, so there is no longer any sharing of cells between the input lists. The second argument is now declared "const List *" to reflect that it isn't changed. This patch just does the minimum needed to get the new implementation in place and fix bugs exposed by the regression tests. As suggested by the foregoing, there's a fair amount of followup work remaining to do. Also, the ENABLE_LIST_COMPAT macros are finally removed in this commit. Code using those should have been gone a dozen years ago. Patch by me; thanks to David Rowley, Jesper Pedersen, and others for review. Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us	2019-07-15 13:41:58 -04:00
David Rowley	cfde234939	Fix RANGE partition pruning with multiple boolean partition keys match_clause_to_partition_key incorrectly would return PARTCLAUSE_UNSUPPORTED if a bool qual could not be matched to the current partition key. This was a problem, as it causes the calling function to discard the qual and not try to match it to any other partition key. If there was another partition key which did match this qual, then the qual would not be checked again and we could fail to prune some partitions. The worst this could do was to cause partitions not to be pruned when they could have been, so there was no danger of incorrect query results here. Fix this by changing match_boolean_partition_clause to have it return a PartClauseMatchStatus rather than a boolean value. This allows it to communicate if the qual is unsupported or if it just does not match this particular partition key, previously these two cases were treated the same. Now, if match_clause_to_partition_key is unable to match the qual to any other qual type then we can simply return the value from the match_boolean_partition_clause call so that the calling function properly treats the qual as either unmatched or unsupported. Reported-by: Rares Salcudean Reviewed-by: Amit Langote Backpatch-through: 11 where partition pruning was introduced Discussion: https://postgr.es/m/CAHp_FN2xwEznH6oyS0hNTuUUZKp5PvegcVv=Co6nBXJ+mC7Y5w@mail.gmail.com	2019-07-12 19:12:38 +12:00
Tom Lane	8255c7a5ee	Phase 2 pgindent run for v12. Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com	2019-05-22 13:04:48 -04:00
Tom Lane	6630ccad7a	Restructure creation of run-time pruning steps. Previously, gen_partprune_steps() always built executor pruning steps using all suitable clauses, including those containing PARAM_EXEC Params. This meant that the pruning steps were only completely safe for executor run-time (scan start) pruning. To prune at executor startup, we had to ignore the steps involving exec Params. But this doesn't really work in general, since there may be logic changes needed as well --- for example, pruning according to the last operator's btree strategy is the wrong thing if we're not applying that operator. The rules embodied in gen_partprune_steps() and its minions are sufficiently complicated that tracking their incremental effects in other logic seems quite impractical. Short of a complete redesign, the only safe fix seems to be to run gen_partprune_steps() twice, once to create executor startup pruning steps and then again for run-time pruning steps. We can save a few cycles however by noting during the first scan whether we rejected any clauses because they involved exec Params --- if not, we don't need to do the second scan. In support of this, refactor the internal APIs in partprune.c to make more use of passing information in the GeneratePruningStepsContext struct, rather than as separate arguments. This is, I hope, the last piece of our response to a bug report from Alan Jackson. Back-patch to v11 where this code came in. Discussion: https://postgr.es/m/FAD28A83-AC73-489E-A058-2681FA31D648@tvsquared.com	2019-05-17 19:44:34 -04:00
Tom Lane	3922f10646	Fix bogus logic for combining range-partitioned columns during pruning. gen_prune_steps_from_opexps's notion of how to do this was overly complicated and underly correct. Per discussion of a report from Alan Jackson (though this fixes only one aspect of that problem). Back-patch to v11 where this code came in. Amit Langote Discussion: https://postgr.es/m/FAD28A83-AC73-489E-A058-2681FA31D648@tvsquared.com	2019-05-16 16:25:43 -04:00
Tom Lane	4b1fcb43d0	Fix partition pruning to treat stable comparison operators properly. Cross-type comparison operators in a btree or hash opclass might be only stable not immutable (this is true of timestamp vs. timestamptz for example). partprune.c ignored this possibility and would perform plan-time pruning with them anyway, possibly leading to wrong answers if the environment changed between planning and execution. To fix, teach gen_partprune_steps() to do things differently when creating plan-time pruning steps vs. run-time pruning steps. analyze_partkey_exprs() also needs an extra check, which is rather annoying but now is not the time to restructure things enough to avoid that. While at it, simplify the logic for the plan-time case a little by insisting that the comparison value be a Const and nothing else. This relies on the assumption that eval_const_expressions will have reduced any immutable expression to a Const; which is not quite 100% true, but certainly any case that comes up often enough to be interesting should have simplification logic there. Also improve a bunch of inadequate/obsolete/wrong comments. Per discussion of a report from Alan Jackson (though this fixes only one aspect of that problem). Back-patch to v11 where this code came in. David Rowley, with some further hacking by me Discussion: https://postgr.es/m/FAD28A83-AC73-489E-A058-2681FA31D648@tvsquared.com	2019-05-16 11:58:21 -04:00
Tom Lane	428b260f87	Speed up planning when partitions can be pruned at plan time. Previously, the planner created RangeTblEntry and RelOptInfo structs for every partition of a partitioned table, even though many of them might later be deemed uninteresting thanks to partition pruning logic. This incurred significant overhead when there are many partitions. Arrange to postpone creation of these data structures until after we've processed the query enough to identify restriction quals for the partitioned table, and then apply partition pruning before not after creation of each partition's data structures. In this way we need not open the partition relations at all for partitions that the planner has no real interest in. For queries that can be proven at plan time to access only a small number of partitions, this patch improves the practical maximum number of partitions from under 100 to perhaps a few thousand. Amit Langote, reviewed at various times by Dilip Kumar, Jesper Pedersen, Yoshikazu Imai, and David Rowley Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp	2019-03-30 18:58:55 -04:00
Robert Haas	5857be907d	Fix use of wrong datatype with sizeof(). OID and int are the same size, but they are not the same thing. David Rowley Discussion: http://postgr.es/m/CAKJS1f_MhS++XngkTvWL9X1v8M5t-0N0B-R465yHQY=TmNV0Ew@mail.gmail.com	2019-03-25 11:28:06 -04:00
Tom Lane	734308a220	Rearrange make_partitionedrel_pruneinfo to avoid work when we can't prune. Postpone most of the effort of constructing PartitionedRelPruneInfos until after we have found out whether run-time pruning is needed at all. This costs very little duplicated effort (basically just an extra find_base_rel() call per partition) and saves quite a bit when we can't do run-time pruning. Also, merge the first loop (for building relid_subpart_map) into the second loop, since we don't need the map to be valid during that loop. Amit Langote Discussion: https://postgr.es/m/9d7c5112-cb99-6a47-d3be-cf1ee6862a1d@lab.ntt.co.jp	2019-03-22 14:56:12 -04:00
Peter Eisentraut	5e1963fb76	Collations with nondeterministic comparison This adds a flag "deterministic" to collations. If that is false, such a collation disables various optimizations that assume that strings are equal only if they are byte-wise equal. That then allows use cases such as case-insensitive or accent-insensitive comparisons or handling of strings with different Unicode normal forms. This functionality is only supported with the ICU provider. At least glibc doesn't appear to have any locales that work in a nondeterministic way, so it's not worth supporting this for the libc provider. The term "deterministic comparison" in this context is from Unicode Technical Standard #10 (https://unicode.org/reports/tr10/#Deterministic_Comparison). This patch makes changes in three areas: - CREATE COLLATION DDL changes and system catalog changes to support this new flag. - Many executor nodes and auxiliary code are extended to track collations. Previously, this code would just throw away collation information, because the eventually-called user-defined functions didn't use it since they only cared about equality, which didn't need collation information. - String data type functions that do equality comparisons and hashing are changed to take the (non-)deterministic flag into account. For comparison, this just means skipping various shortcuts and tie breakers that use byte-wise comparison. For hashing, we first need to convert the input string to a canonical "sort key" using the ICU analogue of strxfrm(). Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com	2019-03-22 12:12:43 +01:00
Robert Haas	898e5e3290	Allow ATTACH PARTITION with only ShareUpdateExclusiveLock. We still require AccessExclusiveLock on the partition itself, because otherwise an insert that violates the newly-imposed partition constraint could be in progress at the same time that we're changing that constraint; only the lock level on the parent relation is weakened. To make this safe, we have to cope with (at least) three separate problems. First, relevant DDL might commit while we're in the process of building a PartitionDesc. If so, find_inheritance_children() might see a new partition while the RELOID system cache still has the old partition bound cached, and even before invalidation messages have been queued. To fix that, if we see that the pg_class tuple seems to be missing or to have a null relpartbound, refetch the value directly from the table. We can't get the wrong value, because DETACH PARTITION still requires AccessExclusiveLock throughout; if we ever want to change that, this will need more thought. In testing, I found it quite difficult to hit even the null-relpartbound case; the race condition is extremely tight, but the theoretical risk is there. Second, successive calls to RelationGetPartitionDesc might not return the same answer. The query planner will get confused if looking up the PartitionDesc for a particular relation does not return a consistent answer for the entire duration of query planning. Likewise, query execution will get confused if the same relation seems to have a different PartitionDesc at different times. Invent a new PartitionDirectory concept and use it to ensure consistency. This ensures that a single invocation of either the planner or the executor sees the same view of the PartitionDesc from beginning to end, but it does not guarantee that the planner and the executor see the same view.
Since this allows pointers to old PartitionDesc entries to survive even after a relcache rebuild, also postpone removing the old PartitionDesc entry until we're certain no one is using it. For the most part, it seems to be OK for the planner and executor to have different views of the PartitionDesc, because the executor will just ignore any concurrently added partitions which were unknown at plan time; those partitions won't be part of the inheritance expansion, but invalidation messages will trigger replanning at some point. Normally, this happens by the time the very next command is executed, but if the next command acquires no locks and executes a prepared query, it can manage not to notice until a new transaction is started. We might want to tighten that up, but it's material for a separate patch. There would still be a small window where a query that started just after an ATTACH PARTITION command committed might fail to notice its results -- but only if the command starts before the commit has been acknowledged to the user. All in all, the warts here around serializability seem small enough to be worth accepting for the considerable advantage of being able to add partitions without a full table lock. Although in general the consequences of new partitions showing up between planning and execution are limited to the query not noticing the new partitions, run-time partition pruning will get confused in that case, so that's the third problem that this patch fixes. Run-time partition pruning assumes that indexes into the PartitionDesc are stable between planning and execution. So, add code so that if new partitions are added between plan time and execution time, the indexes stored in the subplan_map[] and subpart_map[] arrays within the plan's PartitionedRelPruneInfo get adjusted accordingly. 
There does not seem to be a simple way to generalize this scheme to cope with partitions that are removed, mostly because they could then get added back again with different bounds, but it works OK for added partitions. This code does not try to ensure that every backend participating in a parallel query sees the same view of the PartitionDesc. That currently doesn't matter, because we never pass PartitionDesc indexes between backends. Each backend will ignore the concurrently added partitions which it notices, and it doesn't matter if different backends are ignoring different sets of concurrently added partitions. If in the future that matters, for example because we allow writes in parallel query and want all participants to do tuple routing to the same set of partitions, the PartitionDirectory concept could be improved to share PartitionDescs across backends. There is a draft patch to serialize and restore PartitionDescs on the thread where this patch was discussed, which may be a useful place to start. Patch by me. Thanks to Alvaro Herrera, David Rowley, Simon Riggs, Amit Langote, and Michael Paquier for discussion, and to Alvaro Herrera for some review. Discussion: http://postgr.es/m/CA+Tgmobt2upbSocvvDej3yzokd7AkiT+PvgFH+a9-5VV1oJNSQ@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoZE0r9-cyA-aY6f8WFEROaDLLL7Vf81kZ8MtFCkxpeQSw@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoY13KQZF-=HNTrt9UYWYx3_oYOQpu9ioNT49jGgiDpUEA@mail.gmail.com	2019-03-07 11:13:12 -05:00
Tom Lane	f09346a9c6	Refactor planner's header files. Create a new header optimizer/optimizer.h, which exposes just the planner functions that can be used "at arm's length", without need to access Paths or the other planner-internal data structures defined in nodes/relation.h. This is intended to provide the whole planner API seen by most of the rest of the system; although FDWs still need to use additional stuff, and more thought is also needed about just what selfuncs.c should rely on. The main point of doing this now is to limit the amount of new #include baggage that will be needed by "planner support functions", which I expect to introduce later, and which will be in relevant datatype modules rather than anywhere near the planner. This commit just moves relevant declarations into optimizer.h from other header files (a couple of which go away because everything got moved), and adjusts #include lists to match. There's further cleanup that could be done if we want to decide that some stuff being exposed by optimizer.h doesn't belong in the planner at all, but I'll leave that for another day. Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us	2019-01-29 15:48:51 -05:00
Tom Lane	a1b8c41e99	Make some small planner API cleanups. Move a few very simple node-creation and node-type-testing functions from the planner's clauses.c to nodes/makefuncs and nodes/nodeFuncs. There's nothing planner-specific about them, as evidenced by the number of other places that were using them. While at it, rename and_clause() etc to is_andclause() etc, to clarify that they are node-type-testing functions not node-creation functions. And use "static inline" implementations for the shortest ones. Also, modify flatten_join_alias_vars() and some subsidiary functions to take a Query not a PlannerInfo to define the join structure that Vars should be translated according to. They were only using the "parse" field of the PlannerInfo anyway, so this just requires removing one level of indirection. The advantage is that now parse_agg.c can use flatten_join_alias_vars() without the horrid kluge of creating an incomplete PlannerInfo, which will allow that file to be decoupled from relation.h in a subsequent patch. Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us	2019-01-29 15:26:44 -05:00
Alvaro Herrera	b60c397599	Move inheritance expansion code into its own file This commit moves expand_inherited_tables and underlings from optimizer/prep/prepunion.c to optimizer/utils/inherit.c. Also, all of the AppendRelInfo-based expression manipulation routines are moved to optimizer/utils/appendinfo.c. No functional code changes. One exception is the introduction of make_append_rel_info, but that's still just moving around code. Also, stop including <limits.h> in prepunion.c, which no longer needs it since `3fc6e2d7f5`. I (Álvaro) noticed this because Amit was copying that to inherit.c, which likewise doesn't need it. Author: Amit Langote Discussion: https://postgr.es/m/3be67028-a00a-502c-199a-da00eec8fb6e@lab.ntt.co.jp	2019-01-10 14:54:31 -03:00
Bruce Momjian	97c39498e5	Update copyright for 2019 Backpatch-through: certain files through 9.4	2019-01-02 12:44:25 -05:00
Michael Paquier	170dccc69d	Fix incorrect routine name reference in partprune.c Author: Yuzuko Hosoya Discussion: https://postgr.es/m/00ac01d4774c$7feac860$7fc05920$@lab.ntt.co.jp	2018-11-08 20:14:16 +09:00
Tom Lane	9ddef36278	Centralize executor's opening/closing of Relations for rangetable entries. Create an array estate->es_relations[] paralleling the es_range_table, and store references to Relations (relcache entries) there, so that any given RT entry is opened and closed just once per executor run. Scan nodes typically still call ExecOpenScanRelation, but ExecCloseScanRelation is no more; relation closing is now done centrally in ExecEndPlan. This is slightly more complex than one would expect because of the interactions with relcache references held in ResultRelInfo nodes. The general convention is now that ResultRelInfo->ri_RelationDesc does not represent a separate relcache reference and so does not need to be explicitly closed; but there is an exception for ResultRelInfos in the es_trig_target_relations list, which are manufactured by ExecGetTriggerResultRel and have to be cleaned up by ExecCleanUpTriggerState. (That much was true all along, but these ResultRelInfos are now more different from others than they used to be.) To allow the partition pruning logic to make use of es_relations[] rather than having its own relcache references, adjust PartitionedRelPruneInfo to store an RT index rather than a relation OID. Amit Langote, reviewed by David Rowley and Jesper Pedersen, some mods by me Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp	2018-10-04 14:03:42 -04:00
Michael Paquier	9226a3b89b	Remove duplicated words split across lines in comments This has been detected using some interesting tricks with sed, and the method used is described in detail in the discussion below. Author: Justin Pryzby Discussion: https://postgr.es/m/20180908013109.GB15350@telsasoft.com	2018-09-08 12:24:19 -07:00
Thomas Munro	18e586741b	Fix typos. Author: David Rowley Discussion: https://postgr.es/m/CAKJS1f8du35u5DprpykWvgNEScxapbWYJdHq%2Bz06Wj3Y2KFPbw%40mail.gmail.com	2018-08-27 09:32:59 +12:00
Tom Lane	59ef49d26d	Remove bogus Assert in make_partitionedrel_pruneinfo(). This Assert thought that a given rel couldn't be both leaf and non-leaf, but it turns out that in some unusual plan trees that's wrong, so remove it. The lack of testing for cases like that is quite concerning --- there is little reason for confidence that there aren't other bugs in the area. But developing a stable test case seems rather difficult, and in any case we don't need this Assert. David Rowley Discussion: https://postgr.es/m/CAJGNTeOkdk=UVuMugmKL7M=owgt4nNr1wjxMg1F+mHsXyLCzFA@mail.gmail.com	2018-08-08 20:02:32 -04:00
Tom Lane	11e22e486d	Match RelOptInfos by relids not pointer equality. Commit `1c2cb2744` added some code that tried to detect whether two RelOptInfos were the "same" rel by pointer comparison; but it turns out that inheritance_planner breaks that, through its shenanigans with copying some relations forward into new subproblems. Compare relid sets instead. Add a regression test case to exercise this area. Problem reported by Rushabh Lathia; diagnosis and fix by Amit Langote, modified a bit by me. Discussion: https://postgr.es/m/CAGPqQf3anJGj65bqAQ9edDr8gF7qig6_avRgwMT9MsZ19COUPw@mail.gmail.com	2018-08-08 11:44:50 -04:00
Tom Lane	1c2cb2744b	Fix run-time partition pruning for appends with multiple source rels. The previous coding here supposed that if run-time partitioning applied to a particular Append/MergeAppend plan, then all child plans of that node must be members of a single partitioning hierarchy. This is totally wrong, since an Append could be formed from a UNION ALL: we could have multiple hierarchies sharing the same Append, or child plans that aren't part of any hierarchy. To fix, restructure the related plan-time and execution-time data structures so that we can have a separate list or array for each partitioning hierarchy. Also track subplans that are not part of any hierarchy, and make sure they don't get pruned. Per reports from Phil Florent and others. Back-patch to v11, since the bug originated there. David Rowley, with a lot of cosmetic adjustments by me; thanks also to Amit Langote for review. Discussion: https://postgr.es/m/HE1PR03MB17068BB27404C90B5B788BCABA7B0@HE1PR03MB1706.eurprd03.prod.outlook.com	2018-08-01 19:42:52 -04:00
Alvaro Herrera	d25d45e4d9	Verify range bounds to bms_add_range when necessary Now that the bms_add_range boundary protections are gone, some alternative ones are needed in a few places. Author: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> Discussion: https://postgr.es/m/3437ccf8-a144-55ff-1e2f-fc16b437823b@lab.ntt.co.jp	2018-07-30 18:45:39 -04:00
Alvaro Herrera	e353389d24	Fix partition pruning with IS [NOT] NULL clauses The original code was unable to prune partitions that could not possibly contain NULL values, when the query specified less than all columns in a multicolumn partition key. Reorder the if-tests so that it is, and add more commentary and regression tests. Reported-by: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> Co-authored-by: Dilip Kumar <dilipbalaut@gmail.com> Co-authored-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> Reviewed-by: amul sul <sulamul@gmail.com> Discussion: https://postgr.es/m/CAFjFpRc7qjLUfXLVBBC_HAnx644sjTYM=qVoT3TJ840HPbsTXw@mail.gmail.com	2018-07-16 18:38:59 -04:00
Alvaro Herrera	8f97af60d1	Consistently use the term 'partitioned rel' in partprune comments We were using 'partition rel' in a few places, which is quite confusing. Author: Amit Langote Reviewed-by: David Rowley Reviewed-by: Michaël Paquier Discussion: https://postgr.es/m/fd256561-31a2-4b7e-cd84-d8241e7ebc3f@lab.ntt.co.jp	2018-06-20 11:43:01 -04:00
Tom Lane	91781335ed	Code review for match_clause_to_partition_key(). Fix inconsistent decisions about NOMATCH vs UNSUPPORTED result codes. If we're going to cater for partkeys that have the same expression and different collations, surely we should also support partkeys with the same expression and different opclasses. Clean up shaky handling of commuted opclauses, eg checking the wrong operator to see what its negator is. This wouldn't cause any actual bugs given a sane opclass definition, but it doesn't seem helpful to expend more code to be less correct. Improve handling of null elements in ScalarArrayOp arrays: in the "op ALL" case, we can conclude they result in an unsatisfiable clause. Minor cosmetic changes and comment improvements.	2018-06-13 16:10:30 -04:00
Tom Lane	19832753f1	Fix some ill-chosen names for globally-visible partition support functions. "compute_hash_value" is particularly gratuitously generic, but IMO all of these ought to have names clearly related to partitioning.	2018-06-13 13:18:02 -04:00
Tom Lane	e23bae82cf	Fix up run-time partition pruning's use of relcache's partition data. The previous coding saved pointers into the partitioned table's relcache entry, but then closed the relcache entry, causing those pointers to nominally become dangling. Actual trouble would be seen in the field only if a relcache flush occurred mid-query, but that's hardly out of the question. While we could fix this by copying all the data in question at query start, it seems better to just hold the relcache entry open for the whole query. While at it, improve the handling of support-function lookups: do that once per query not once per pruning test. There's still something to be desired here, in that we fail to exploit the possibility of caching data across queries in the fn_extra fields of the relcache's FmgrInfo structs, which could happen if we just used those structs in-place rather than copying them. However, combining that with the possibility of per-query lookups of cross-type comparison functions seems to require changes in the APIs of a lot of the pruning support functions, so it's too invasive to consider as part of this patch. A win would ensue only for complex partition key data types (e.g. arrays), so it may not be worth the trouble. David Rowley and Tom Lane Discussion: https://postgr.es/m/17850.1528755844@sss.pgh.pa.us	2018-06-13 12:03:26 -04:00
Tom Lane	4e23236403	Improve commentary about run-time partition pruning data structures. No code changes except for a couple of new Asserts. David Rowley and Tom Lane Discussion: https://postgr.es/m/CAKJS1f-6GODRNgEtdPxCnAPme2h2hTztB6LmtfdmcYAAOE0kQg@mail.gmail.com	2018-06-11 17:35:53 -04:00
Tom Lane	be3d90026a	Fix run-time partition pruning code to handle NULL values properly. The previous coding just ignored pruning constraints that compare a partition key to a null-valued expression. This is silly, since really what we can do there is conclude that all partitions are rejected: the pruning operator is known strict so the comparison must always fail. This also fixes the logic to not ignore constisnull for a Const comparison value. That's probably an unreachable case, since the planner would normally have simplified away a strict operator with a constant-null input. But this code has no business assuming that. David Rowley, per a gripe from me Discussion: https://postgr.es/m/26279.1528670981@sss.pgh.pa.us	2018-06-11 12:08:15 -04:00
Tom Lane	321f648a31	Assorted cosmetic cleanup of run-time-partition-pruning code. Use "subplan" rather than "subnode" to refer to the child plans of a partitioning Append; this seems a bit more specific and hence clearer. Improve assorted comments. No non-cosmetic changes. David Rowley and Tom Lane Discussion: https://postgr.es/m/CAFj8pRBjrufA3ocDm8o4LPGNye9Y+pm1b9kCwode4X04CULG3g@mail.gmail.com	2018-06-10 18:24:34 -04:00
Tom Lane	73b7f48f78	Improve run-time partition pruning to handle any stable expression. The initial coding of the run-time-pruning feature only coped with cases where the partition key(s) are compared to Params. That is a bit silly; we can allow it to work with any non-Var-containing stable expression, as long as we take special care with expressions containing PARAM_EXEC Params. The code is hardly any longer this way, and it's considerably clearer (IMO at least). Per gripe from Pavel Stehule. David Rowley, whacked around a bit by me Discussion: https://postgr.es/m/CAFj8pRBjrufA3ocDm8o4LPGNye9Y+pm1b9kCwode4X04CULG3g@mail.gmail.com	2018-06-10 15:22:32 -04:00
Alvaro Herrera	d758d9702e	Fix assorted partition pruning bugs match_clause_to_partition_key failed to consider COERCION_PATH_ARRAYCOERCE cases in scalar-op-array expressions, so it was possible to crash the server easily. To handle this case properly (ie. prune partitions) we would need to run a bit of executor code during planning. Maybe it can be improved, but for now let's just not crash. Add a test case that used to trigger the crash. Author: Michaël Paquier match_clause_to_partition_key failed to indicate that operators that don't have a commutator in a btree opclass are unsupported. It is possible for this to cause a crash later if such an operator is used in a scalar-op-array expression. Add a test case that used to trigger the crash. Author: Amit Langote One caller of gen_partprune_steps_internal in match_clause_to_partition_key was too optimistic about the former never returning an empty step list. Rid it of its innocence. (Having fixed the bug above, I no longer know how to exploit this, so no test case for it, but it remained a bug.) Revise code flow a little bit, for succinctness. Author: Álvaro Herrera Reported-by: Marina Polyakova Reviewed-by: Michaël Paquier Reviewed-by: Amit Langote Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/ff8f9bfa485ff961d6bb43e54120485b@postgrespro.ru	2018-05-09 11:27:04 -03:00
Alvaro Herrera	d1e2cac5ff	Make gen_partprune_steps static There's no need to export this function, so don't. Michaël didn't actually write the patch, but we list him as first author because with a trivial one like this, intellectual authorship is as important (if not more) as bit shovelling. Author: Michaël Paquier, Amit Langote Discussion: https://postgr.es/m/c91299c4-199b-0f16-339b-a29d6d2a39ee@lab.ntt.co.jp	2018-05-09 10:40:25 -03:00
Alvaro Herrera	c775fb9e18	Remove useless 'default' clause Author: Michael Paquier Reviewed-by: Amit Langote Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/20180424012042.GD1570@paquier.xyz Discussion: https://postgr.es/m/20180509061039.GC11897@paquier.xyz	2018-05-09 10:33:55 -03:00
Tom Lane	bdf46af748	Post-feature-freeze pgindent run. Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us	2018-04-26 14:47:16 -04:00
Alvaro Herrera	1957f8dabf	Initialize ExprStates once in run-time partition pruning Instead of doing ExecInitExpr every time a Param needs to be evaluated in run-time partition pruning, do it once during run-time pruning set-up and cache the exprstate in PartitionPruneContext, saving a lot of work. Author: David Rowley Reviewed-by: Amit Langote, Álvaro Herrera Discussion: https://postgr.es/m/CAKJS1f8-x+q-90QAPDu_okhQBV4DPEtPz8CJ=m0940GyT4DA4w@mail.gmail.com	2018-04-24 14:03:10 -03:00
Alvaro Herrera	dfce1f9e4e	Remove useless default clause in switch The switch covers all values of the enum driver variable, so having a default: clause is useless, even if it's only to do Assert(false).	2018-04-23 12:11:41 -03:00
Alvaro Herrera	e5dcbb88a1	Rework code to determine partition pruning procedure Amit Langote reported that partition prune was unable to work with arrays, enums, etc, which led him to research the appropriate way to match query clauses to partition keys: instead of searching for an exact match of the expression's type, it is better to rely on the fact that the expression qual has already been resolved to a specific operator, and that the partition key is linked to a specific operator family. With that info, it's possible to figure out the strategy and comparison function to use for the pruning clause in a manner that works reliably for pseudo-types also. Include new test cases that demonstrate pruning where pseudotypes are involved. Author: Amit Langote, Álvaro Herrera Discussion: https://postgr.es/m/2b02f1e9-9812-9c41-972d-517bdc0f815d@lab.ntt.co.jp	2018-04-19 12:01:37 -03:00
Alvaro Herrera	7ba6ee815d	Add missed bms_copy() in perform_pruning_combine_step We were initializing a BMS to merely reference an existing one, which would cause a double-free (and a crash) when the recursive algorithm tried to intersect it with an empty one. Fix it by creating a copy at initialization time. Reported-by: sqlsmith (by way of Andreas Seltenreich) Author: Amit Langote Discussion: https://postgr.es/m/87in923lyw.fsf@ansel.ydns.eu	2018-04-09 10:54:28 -03:00
Alvaro Herrera	499be013de	Support partition pruning at execution time Existing partition pruning is only able to work at plan time, for query quals that appear in the parsed query. This is good but limiting, as there can be parameters that appear later that can be usefully used to further prune partitions. This commit adds support for pruning subnodes of Append which cannot possibly contain any matching tuples, during execution, by evaluating Params to determine the minimum set of subnodes that can possibly match. We support more than just simple Params in WHERE clauses. Support additionally includes: 1. Parameterized Nested Loop Joins: The parameter from the outer side of the join can be used to determine the minimum set of inner side partitions to scan. 2. Initplans: Once an initplan has been executed we can then determine which partitions match the value from the initplan. Partition pruning is performed in two ways. When Params external to the plan are found to match the partition key we attempt to prune away unneeded Append subplans during the initialization of the executor. This allows us to bypass the initialization of non-matching subplans meaning they won't appear in the EXPLAIN or EXPLAIN ANALYZE output. For parameters whose value is only known during the actual execution then the pruning of these subplans must wait. Subplans which are eliminated during this stage of pruning are still visible in the EXPLAIN output. In order to determine if pruning has actually taken place, the EXPLAIN ANALYZE must be viewed. If a certain Append subplan was never executed due to the elimination of the partition then the execution timing area will state "(never executed)". Whereas, if, for example in the case of parameterized nested loops, the number of loops stated in the EXPLAIN ANALYZE output for certain subplans may appear lower than others due to the subplan having been scanned fewer times. This is due to the list of matching subnodes having to be evaluated whenever a parameter which was found to match the partition key changes. This commit required some additional infrastructure that permits the building of a data structure which is able to perform the translation of the matching partition IDs, as returned by get_matching_partitions, into the list index of a subpaths list, as exist in node types such as Append, MergeAppend and ModifyTable. This allows us to translate a list of clauses into a Bitmapset of all the subpath indexes which must be included to satisfy the clause list. Author: David Rowley, based on an earlier effort by Beena Emerson Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi, Jesper Pedersen Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com	2018-04-07 17:54:39 -03:00

52 Commits