Commit Graph

1723 Commits

Author SHA1 Message Date
David Rowley 1349d2790b Improve performance of ORDER BY / DISTINCT aggregates
ORDER BY / DISTINCT aggregates have, since they were implemented in
Postgres, been executed by always performing a sort in nodeAgg.c to sort
the tuples in the current group into the correct order before calling the
transition function on the sorted tuples.  This was not ideal, as often
there might be an index that could have provided pre-sorted input and
allowed the transition functions to be called as the rows come in, rather
than having to store them in a tuplestore in order to sort them once all
the tuples for the group have arrived.

Here we change the planner so it requests a path with a sort order which
suits the largest number of ORDER BY / DISTINCT aggregate functions, and
add new code to the executor to allow it to support the processing of
ORDER BY / DISTINCT aggregates where the tuples are already sorted in the
correct order.

Since there can be many ORDER BY / DISTINCT aggregates in any given query
level, it's very possible that we can't find an order that suits all of
these aggregates.  The sort order that the planner chooses is simply the
one that suits the most aggregate functions.  We take the most strictly
sorted variation of each order and see how many aggregate functions can
use that, then we try again with the order of the remaining aggregates to
see if another order would suit more aggregate functions.  For example:

SELECT agg(a ORDER BY a),agg2(a ORDER BY a,b) ...

would request the sort order to be {a, b} because {a} is a subset of the
sort order of {a, b}, but

SELECT agg(a ORDER BY a),agg2(a ORDER BY c) ...

would just pick a plan ordered by {a} (we give precedence to aggregates
which are earlier in the targetlist).

SELECT agg(a ORDER BY a),agg2(a ORDER BY b),agg3(a ORDER BY b) ...

would choose to order by {b} since two aggregates suit that vs just one
that requires input ordered by {a}.

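As an illustrative sketch (the table and index names here are
hypothetical), an index covering both the grouping and ordering columns
can now provide pre-sorted input for the aggregate, avoiding the extra
sort step:

CREATE INDEX t_a_b_idx ON t (a, b);

SELECT a, array_agg(b ORDER BY b)
FROM t
GROUP BY a;
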
Author: David Rowley
Reviewed-by: Ronan Dunklau, James Coleman, Ranier Vilela, Richard Guo, Tom Lane
Discussion: https://postgr.es/m/CAApHDvpHzfo92%3DR4W0%2BxVua3BUYCKMckWAmo-2t_KiXN-wYH%3Dw%40mail.gmail.com
2022-08-02 23:11:45 +12:00
Tom Lane af119e08fd Dump more fields when dumping planner internal data structures.
Commit 964d01ae9 marked a lot of fields as read_write_ignore
to stay consistent with what was dumped by the manually-maintained
outfuncs.c code.  However, it seems that a pretty fair number
of those omissions were either flat-out oversights, or shortcuts
taken because hand-written code seemed like it'd be too much trouble.
Let's upgrade things where it seems to make sense to dump.

To do this, we need to add support to gen_node_support.pl and
outfuncs.c for variable-length arrays of Node pointers.  That's
pretty straightforward given the model of the existing code
for arrays of scalars, but I found I needed to tighten the
type-recognizing regexes in gen_node_support.pl.  (As they
stood, they mistook "foo **" for "foo *".  Make sure they're
all fully anchored to prevent additional problems.)

The main thing left un-done here is that a lot of partitioning-related
structs are still not dumped, because they are bare structs not Nodes.
I'm not sure about the wisdom of that choice ... but changing it would
be fairly invasive, so it probably requires more justification than
just making planner node dumps more complete.

Discussion: https://postgr.es/m/1295668.1658258637@sss.pgh.pa.us
2022-07-20 13:54:30 -04:00
Tom Lane d6a3aeb9a3 Convert planner's AggInfo and AggTransInfo structs to proper Nodes.
This is mostly just to get outfuncs.c support for them, so that
the agginfos and aggtransinfos lists can be dumped when dumping
the contents of PlannerInfo.

While here, improve some related comments; notably, clean up
obsolete comments left over from when preprocess_minmax_aggregates
had to make its own scan of the query tree.

Discussion: https://postgr.es/m/742479.1658160504@sss.pgh.pa.us
2022-07-19 12:29:37 -04:00
Tom Lane 7c0eb3c622 Tighten up parsing logic in gen_node_support.pl.
Teach this script to handle function pointer fields honestly.
Previously they were just silently ignored, but that's not likely to
be a behavior we can accept indefinitely.  This mostly entails fixing
it so that a field declaration spanning multiple lines can be parsed,
because we have a bunch of such fields that're laid out that way.
But that's a good improvement in its own right.

With that change and a minor regex adjustment, the only struct it
fails to parse in the node-defining headers is A_Const, because
of the embedded union.  The path of least resistance is to move
that union declaration outside the struct.

Having done those things, we can make it error out if it finds
any within-struct syntax it doesn't understand, which seems like
a pretty important property for robustness.

This commit doesn't change the output files at all; it's just in
the way of future-proofing.

Discussion: https://postgr.es/m/2593369.1657759779@sss.pgh.pa.us
2022-07-14 09:04:23 -04:00
Tom Lane ff33a8c887 Remove artificial restrictions on which node types have out/read funcs.
The initial version of gen_node_support.pl manually excluded most
utility statement node types from having out/read support, and
also some raw-parse-tree-only node types.  That was mostly to keep
the output comparable to the old hand-maintained code.  We'd like
to have out/read support for utility statements, for debugging
purposes and so that they can be included in new-style SQL functions;
so it's time to lift that restriction.

Most if not all of the previously-excluded raw-parse-tree-only node
types can appear in expression subtrees of utility statements, so
they have to be handled too.

We don't quite have full read support yet; certain custom_read_write
node types need to have their handwritten read functions implemented
before that will work.

Doing this allows us to drop the previous hack in _outQuery to not
dump the utilityStmt field in most cases, which means we no longer
need manually-maintained out/read functions for Query, so get rid
of those in favor of auto-generating them.

Fix a couple of omissions in gen_node_support.pl that are exposed
through having to handle more node types.

catversion bump forced because somebody was sloppy about the field
order in the manually-maintained Query out/read functions.
(Committers should note that almost all changes in parsenodes.h
are now grounds for a catversion bump.)
2022-07-13 11:48:17 -04:00
Peter Eisentraut 784cedda06 Allow specifying STORAGE attribute for a new table
Previously, the STORAGE specification was only available in ALTER
TABLE.  This makes it available in CREATE TABLE as well.

Also make the code and the documentation for STORAGE and COMPRESSION
attributes consistent.

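For example (an illustrative sketch of the new syntax):

CREATE TABLE t (
    a text STORAGE EXTERNAL,
    b text COMPRESSION pglz
);
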
Author:	Teodor Sigaev <teodor@sigaev.ru>
Author: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: wenjing zeng <wjzeng2012@gmail.com>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/de83407a-ae3d-a8e1-a788-920eb334f25b@sigaev.ru
2022-07-13 12:21:45 +02:00
David Rowley 4cc832f94a Tidy up code in get_cheapest_group_keys_order()
There are a few things that we could do a little better within
get_cheapest_group_keys_order():

1. We should be using list_free() rather than pfree() on a List.

2. We should use for_each_from() instead of manually coding a for loop to
skip the first n elements of a List.

3. list_truncate(list_copy(...), n) is not a great way to copy the first n
elements of a list. Let's invent list_copy_head() for that.  That way we
don't need to copy the entire list just to truncate it directly
afterwards.

4. We can simplify finding the cheapest cost by setting the cheapest cost
variable to DBL_MAX.  That allows us to skip special-casing the initial
iteration of the loop.

Author: David Rowley
Discussion: https://postgr.es/m/CAApHDvrGyL3ft8waEkncG9y5HDMu5TFFJB1paoTC8zi9YK97Nw@mail.gmail.com
Backpatch-through: 15, where get_cheapest_group_keys_order was added.
2022-07-13 14:02:20 +12:00
Tom Lane eea9fa9b25 Add defenses against unexpected changes in the NodeTag enum list.
Having different build systems producing different contents of the
NodeTag enum would be catastrophic for extension ABI stability.
But that ordering depends on the order in which gen_node_support.pl
processes its input files.  It seems too fragile to let the Makefiles,
MSVC build scripts, and soon meson build scripts all set this order
independently.  As a klugy but serviceable solution, put a canonical
copy of the file list into gen_node_support.pl itself, and check that
against the files given on the command line.

Also, while it's fine to add and delete node tags during development,
we must not let the assigned NodeTag values change unexpectedly in
stable branches.  Add a cross-check that can be enabled when a branch
is forked off (or later, but that is a time when we're unlikely to
miss doing it).  It just checks that the last auto-assigned number
doesn't change, which is simplistic but will catch the most likely
sorts of mistakes.

From time to time we do need to add a node tag in a stable branch.
To support doing that without changing the branch's auto-assigned
tag numbers, invent pg_node_attr(nodetag_number(VALUE)) which can
be used to give such a node a hand-assigned tag above the last
auto-assigned one.

Discussion: https://postgr.es/m/1249010.1657574337@sss.pgh.pa.us
2022-07-12 11:22:52 -04:00
Tom Lane ca187d7455 Invent nodetag_only attribute for Nodes.
This allows explaining gen_node_support.pl's handling of execnodes.h
and some other input files as being a shortcut for explicit marking
of all their node declarations as pg_node_attr(nodetag_only).
I foresee that someday we might need to be more fine-grained about
that, and this change provides the infrastructure needed to do so.
For now, it just allows removal of the script's klugy special case
for CallContext and InlineCodeBlock.

Discussion: https://postgr.es/m/75063.1657410615@sss.pgh.pa.us
2022-07-12 10:46:58 -04:00
Tom Lane 3cd0ac9878 Doc: rearrange high-level commentary about node support coverage.
copyfuncs.c and friends no longer seem like great places to put
high-level remarks about what's covered and what isn't.  Move that
material to backend/nodes/README and other more-prominent places.
Add back (versions of) some remarks that disappeared in 2be87f092.

Discussion: https://postgr.es/m/3843645.1657385930@sss.pgh.pa.us
2022-07-09 15:10:15 -04:00
Tom Lane 8c73c11a0d Mark Scan as an abstract node type, too.
On further review, this one is never instantiated either.
2022-07-09 13:58:06 -04:00
Tom Lane b4f79d278f Mark PlanState as an abstract node type.
In the same vein as commit 251154beb, make it clear that we never
instantiate PlanState.

Also mark MemoryContextData as abstract.  This has no effect right now,
since memnodes.h isn't one of the files fed to gen_node_support.pl.
But it seems like good documentation and future-proofing.
2022-07-09 13:35:37 -04:00
Peter Eisentraut 2be87f092a Remove code sections obsoleted by node support automation
This removes the code sections that were ifdef'ed out by
964d01ae90.
2022-07-09 15:06:01 +02:00
Peter Eisentraut 3e44aee3ce Move a comment
Move a comment from the to-be-deleted section of nodes.h to where it
might still be useful.
2022-07-09 13:22:46 +02:00
Peter Eisentraut 964d01ae90 Automatically generate node support functions
Add a script to automatically generate the node support functions
(copy, equal, out, and read, as well as the node tags enum) from the
struct definitions.

For each of the four node support files, it creates two include files,
e.g., copyfuncs.funcs.c and copyfuncs.switch.c, to include in the main
file.  All the scaffolding of the main file stays in place.

I have tried to mostly make the coverage of the output match what is
currently there.  For example, one could now do out/read coverage of
utility statement nodes, but I have manually excluded those for now.
The reason is mainly that it's easier to diff the before and after,
and adding a bunch of stuff like this might require a separate
analysis and review.

Subtyping (TidScan -> Scan) is supported.

For the hard cases, you can just write a manual function and exclude
generating one.  For the not so hard cases, there is a way of
annotating struct fields to get special behaviors.  For example,
pg_node_attr(equal_ignore) has the field ignored in equal functions.

(In this patch, I have only ifdef'ed out the code that could be removed,
mainly so that it won't constantly have merge conflicts.  It will be
deleted in a separate patch.  All the code comments that are worth
keeping from those sections have already been moved to the header
files where the structs are defined.)

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce%40enterprisedb.com
2022-07-09 08:53:59 +02:00
Peter Eisentraut 251154bebe Remove T_Join and T_Plan
These are abstract node types that don't need to have a node tag
defined.

Discussion: https://www.postgresql.org/message-id/2592455.1657140387%40sss.pgh.pa.us
2022-07-08 10:40:44 +02:00
Peter Eisentraut 3140f08985 Reformat some more node comments
Reformat some more comments in node field definitions to avoid long
lines.  Similar to 835d476fd2, based on
additional per-field annotations that will be required.

Discussion: https://www.postgresql.org/message-id/c5906b07-220a-a3d4-8ff3-8ee593009424@enterprisedb.com
2022-07-08 09:22:27 +02:00
Robert Haas b0a55e4329 Change internal RelFileNode references to RelFileNumber or RelFileLocator.
We have been using the term RelFileNode to refer to either (1) the
integer that is used to name the sequence of files for a certain relation
within the directory set aside for that tablespace/database combination;
or (2) that value plus the OIDs of the tablespace and database; or
occasionally (3) the whole series of files created for a relation
based on those values. Using the same name for more than one thing is
confusing.

Replace RelFileNode with RelFileNumber when we're talking about just the
single number, i.e. (1) from above, and with RelFileLocator when we're
talking about all the things that are needed to locate a relation's files
on disk, i.e. (2) from above. In the places where we refer to (3) as
a relfilenode, instead refer to "relation storage".

Since there is a ton of SQL code in the world that knows about
pg_class.relfilenode, don't change the name of that column, or of other
SQL-facing things that derive their name from it.

On the other hand, do adjust closely-related internal terminology. For
example, the structure member names dbNode and spcNode appear to be
derived from the fact that the structure itself was called RelFileNode,
so change those to dbOid and spcOid. Likewise, various variables with
names like rnode and relnode get renamed appropriately, according to
how they're being used in context.

Hopefully, this is clearer than before. It is also preparation for
future patches that intend to widen the relfilenumber fields from its
current width of 32 bits. Variables that store a relfilenumber are now
declared as type RelFileNumber rather than type Oid; right now, these
are the same, but that can now more easily be changed.

Dilip Kumar, per an idea from me. Reviewed also by Andres Freund.
I fixed some whitespace issues, changed a couple of words in a
comment, and made one other minor correction.

Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com
Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com
Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com
2022-07-06 11:39:09 -04:00
Alvaro Herrera f10a025cfe Implement List support for TransactionId
Use it for RelationSyncEntry->streamed_txns, which is currently using an
integer list.

The API support is not complete, not because it is hard to write but
because it's unclear that it's worth the code space, there being so
little use of XID lists.

Discussion: https://postgr.es/m/202205130830.g5ntonhztspb@alvherre.pgsql
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
2022-07-04 14:52:12 +02:00
Peter Eisentraut 835d476fd2 Reformat some node comments
Reformat some comments in node field definitions to avoid long lines.

This makes room for per-field annotations in a future patch to
generate node support functions automatically.

Discussion: https://www.postgresql.org/message-id/c5906b07-220a-a3d4-8ff3-8ee593009424@enterprisedb.com
2022-07-02 12:47:15 +02:00
David Rowley c4a4e760f6 Fix incorrect comments for Memoize struct
Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/0635f5aa-4973-8dc2-4e4e-df9fd5778a65@enterprisedb.com
Backpatch-through: 14, where Memoize was added
2022-05-19 17:14:23 +12:00
Tom Lane 3ab9a63cb6 Rename JsonIsPredicate.value_type, fix JSON backend/nodes/ infrastructure.
I started out with the intention to rename value_type to item_type to
avoid a collision with a typedef name that appears on some platforms.

Along the way, I noticed that the adjacent field "format" was not being
correctly handled by the backend/nodes/ infrastructure functions:
copyfuncs.c erroneously treated it as a scalar, while equalfuncs,
outfuncs, and readfuncs omitted handling it at all.  This looks like
it might be cosmetic at the moment because the field is always NULL
after parse analysis; but that's likely a bug in itself, and the code's
certainly not very future-proof.  Let's fix it while we can still do so
without forcing an initdb on beta testers.

Further study found a few other inconsistencies in the backend/nodes/
infrastructure for the recently-added JSON node types, so fix those too.

catversion bumped because of potential change in stored rules.

Discussion: https://postgr.es/m/526703.1652385613@sss.pgh.pa.us
2022-05-13 11:40:08 -04:00
Tom Lane 23e7b38bfe Pre-beta mechanical code beautification.
Run pgindent, pgperltidy, and reformat-dat-files.
I manually fixed a couple of comments that pgindent uglified.
2022-05-12 15:17:30 -04:00
Peter Eisentraut dc2be6ed47 Remove JsonPathSpec typedef
It doesn't seem very useful, and it's a bit in the way of the planned
node support automation.

Discussion: https://www.postgresql.org/message-id/202204191140.3wsbevfhqmu3@alvherre.pgsql
2022-05-04 17:36:31 +02:00
Peter Eisentraut d47a11da9e Add missing enum tag in enum used in nodes
Similar to 983bdc4fac.

Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/202204191140.3wsbevfhqmu3@alvherre.pgsql
2022-05-04 17:34:22 +02:00
Tom Lane 92e7a53752 Remove inadequate assertion check in CTE inlining.
inline_cte() expected to find exactly as many references to the
target CTE as its cterefcount indicates.  While that should be
accurate for the tree as emitted by the parser, there are some
optimizations that occur upstream of here that could falsify it,
notably removal of unused subquery output expressions.

Trying to make the accounting 100% accurate seems expensive and
doomed to future breakage.  It's not really worth it, because
all this code is protecting is downstream assumptions that every
referenced CTE has a plan.  Let's convert those assertions to
regular test-and-elog just in case there's some actual problem,
and then drop the failing assertion.

Per report from Tomas Vondra (thanks also to Richard Guo for
analysis).  Back-patch to v12 where the faulty code came in.

Discussion: https://postgr.es/m/29196a1e-ed47-c7ca-9be2-b1c636816183@enterprisedb.com
2022-04-21 17:58:52 -04:00
Robert Haas 8ec569479f Apply PGDLLIMPORT markings broadly.
Up until now, we've had a policy of only marking certain variables
in the PostgreSQL header files with PGDLLIMPORT, but now we've
decided to mark them all. This means that extensions running on
Windows should no longer operate at a disadvantage as compared to
extensions running on Linux: if the variable is present in a header
file, it should be accessible.

Discussion: http://postgr.es/m/CA+TgmoYanc1_FSfimhgiWSqVyP5KKmh5NP2BWNwDhO8Pg2vGYQ@mail.gmail.com
2022-04-08 08:16:38 -04:00
David Rowley 9d9c02ccd1 Teach planner and executor about monotonic window funcs
Window functions such as row_number() always return a value higher than
the previously returned value for tuples in any given window partition.

Traditionally, queries such as:

SELECT * FROM (
   SELECT *, row_number() over (order by c) rn
   FROM t
) t WHERE rn <= 10;

were executed fairly inefficiently.  Neither the query planner nor the
executor knew that once rn made it to 11 that nothing further would match
the outer query's WHERE clause.  It would blindly continue until all
tuples were exhausted from the subquery.

Here we implement means to make the above execute more efficiently.

This is done by way of adding a pg_proc.prosupport function to several of
the built-in window functions and adding supporting code to allow the
support function to inform the planner if the window function is
monotonically increasing, monotonically decreasing, both or neither.  The
planner is then able to make use of that information and possibly allow
the executor to short-circuit execution by way of adding a "run condition"
to the WindowAgg to allow it to determine if some of its execution work
can be skipped.

This "run condition" is not like a normal filter.  These run conditions
are only built using quals comparing values to monotonic window functions.
For monotonically increasing functions, quals making use of the btree
operators for <, <= and = can be used (assuming the window function column
is on the left).  Once such a condition becomes false, a monotonically
increasing function can never make it subsequently true again.  For
monotonically decreasing functions, the >, >= and = btree operators for
the given type can be used for run conditions.

The best-case situation for this is when there is a single WindowAgg node
without a PARTITION BY clause.  Here when the run condition becomes false
the WindowAgg node can simply return NULL.  No more tuples will ever match
the run condition.  It's a little more complex when there is a PARTITION
BY clause.  In this case, we cannot return NULL as we must still process
other partitions.  To speed this case up we pull tuples from the outer
plan to check if they're from the same partition and simply discard them
if they are.  When we find a tuple belonging to another partition we start
processing as normal again until the run condition becomes false or we run
out of tuples to process.

When there are multiple WindowAgg nodes to evaluate then this complicates
the situation.  For intermediate WindowAggs we must ensure we always
return all tuples to the calling node.  Any filtering done could lead to
incorrect results in WindowAgg nodes above.  For all intermediate nodes,
we can still save some work when the run condition becomes false.  We've
no need to evaluate the WindowFuncs anymore.  Other WindowAgg nodes cannot
reference the value of these and these tuples will not appear in the final
result anyway.  The savings here are small in comparison to what can be
saved in the top-level WindowAgg, but still worthwhile.

Intermediate WindowAgg nodes never filter out tuples, but here we change
WindowAgg so that the top-level WindowAgg filters out tuples that don't
match the intermediate WindowAgg node's run condition.  Such filters
appear in the "Filter" clause in EXPLAIN for the top-level WindowAgg node.

Here we add prosupport functions to allow the above to work for:
row_number(), rank(), dense_rank(), count(*) and count(expr).  It appears
technically possible to do the same for min() and max(); however, it seems
unlikely to be useful enough, so that's not done here.

Bump catversion

Author: David Rowley
Reviewed-by: Andy Fan, Zhihong Yu
Discussion: https://postgr.es/m/CAApHDvqvp3At8++yF8ij06sdcoo1S_b2YoaT9D4Nf+MObzsrLQ@mail.gmail.com
2022-04-08 10:34:36 +12:00
Tomas Vondra 2c7ea57e56 Revert "Logical decoding of sequences"
This reverts a sequence of commits, implementing features related to
logical decoding and replication of sequences:

 - 0da92dc530
 - 80901b3291
 - b779d7d8fd
 - d5ed9da41d
 - a180c2b34d
 - 75b1521dae
 - 2d2232933b
 - 002c9dd97a
 - 05843b1aa4

The implementation has issues, mostly due to combining transactional and
non-transactional behavior of sequences. It's not clear how this could
be fixed, but it'll require reworking a significant part of the patch.

Discussion: https://postgr.es/m/95345a19-d508-63d1-860a-f5c2f41e8d40@enterprisedb.com
2022-04-07 20:06:36 +02:00
Tom Lane a0ffa885e4 Allow granting SET and ALTER SYSTEM privileges on GUC parameters.
This patch allows "PGC_SUSET" parameters to be set by non-superusers
if they have been explicitly granted the privilege to do so.
The privilege to perform ALTER SYSTEM SET/RESET on a specific parameter
can also be granted.
Such privileges are cluster-wide, not per database.  They are tracked
in a new shared catalog, pg_parameter_acl.

Granting and revoking these new privileges works as one would expect.
One caveat is that PGC_USERSET GUCs are unaffected by the SET privilege
--- one could wish that those were handled by a revocable grant to
PUBLIC, but they are not, because we couldn't make it robust enough
for GUCs defined by extensions.

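For example (illustrative; the role names are hypothetical):

GRANT SET ON PARAMETER log_min_duration_statement TO app_admin;
GRANT ALTER SYSTEM ON PARAMETER autovacuum TO ops;
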
Mark Dilger, reviewed at various times by Andrew Dunstan, Robert Haas,
Joshua Brindle, and myself

Discussion: https://postgr.es/m/3D691E20-C1D5-4B80-8BA5-6BEB63AF3029@enterprisedb.com
2022-04-06 13:24:33 -04:00
Etsuro Fujita c2bb02bc2e Allow asynchronous execution in more cases.
In commit 27e1f1456, create_append_plan() only allowed the subplan
created from a given subpath to be executed asynchronously when it was
an async-capable ForeignPath.  To extend coverage, this patch handles
cases when the given subpath includes some other Path types as well that
can be omitted in the plan processing, such as a ProjectionPath directly
atop an async-capable ForeignPath, allowing asynchronous execution in
partitioned-scan/partitioned-join queries with non-Var tlist expressions
and more UNION queries.

Andrey Lepikhov and Etsuro Fujita, reviewed by Alexander Pyhalov and
Zhihong Yu.

Discussion: https://postgr.es/m/659c37a8-3e71-0ff2-394c-f04428c76f08%40postgrespro.ru
2022-04-06 15:45:00 +09:00
Andrew Dunstan fadb48b00e PLAN clauses for JSON_TABLE
These clauses allow the user to specify how data from nested paths are
joined, allowing considerable freedom in shaping the tabular output of
JSON_TABLE.

PLAN DEFAULT allows the user to specify the global strategies when
dealing with sibling or child nested paths. This is often sufficient to
achieve the necessary goal, and is considerably simpler than the full
PLAN clause, which allows the user to specify the strategy to be used
for each named nested path.

Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/7e2cb85d-24cf-4abb-30a5-1a33715959bd@postgrespro.ru
2022-04-05 14:17:08 -04:00
Andrew Dunstan 4e34747c88 JSON_TABLE
This feature allows jsonb data to be treated as a table and thus used in
a FROM clause like other tabular data. Data can be selected from the
jsonb using jsonpath expressions, and hoisted out of nested structures
in the jsonb to form multiple rows, more or less like an outer join.

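For example (an illustrative sketch; the my_films table and its jsonb
column js are hypothetical):

SELECT jt.*
FROM my_films,
     JSON_TABLE(js, '$.favorites[*]'
                COLUMNS (id FOR ORDINALITY,
                         kind text PATH '$.kind')) AS jt;
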
Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu (whose
name I previously misspelled), Himanshu Upadhyaya, Daniel Gustafsson,
Justin Pryzby.

Discussion: https://postgr.es/m/7e2cb85d-24cf-4abb-30a5-1a33715959bd@postgrespro.ru
2022-04-04 16:03:47 -04:00
Andrew Dunstan 49082c2cc3 RETURNING clause for JSON() and JSON_SCALAR()
This patch is extracted from a larger patch that allowed setting the
default returned value from these functions to json or jsonb. That had
problems, but this piece of it is fine. For these functions only json or
jsonb can be specified in the RETURNING clause.

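As a minimal sketch of the grammar added here:

SELECT JSON('{"a": 1}' RETURNING jsonb);
SELECT JSON_SCALAR(42 RETURNING jsonb);
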
Extracted from an original patch from Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-31 15:45:24 -04:00
Tomas Vondra db0d67db24 Optimize order of GROUP BY keys
When evaluating a query with a multi-column GROUP BY clause using sort,
the cost may be heavily dependent on the order in which the keys are
compared when building the groups. Grouping does not imply any ordering,
so we're allowed to compare the keys in arbitrary order, and a Hash Agg
leverages this. But for Group Agg, we simply compared keys in the order
specified in the query. This commit explores alternative orderings of
the keys, trying to find a cheaper one.

In principle, we might generate grouping paths for all permutations of
the keys, and leave the rest to the optimizer. But that might get very
expensive, so we try to pick only a couple interesting orderings based
on both local and global information.

When planning the grouping path, we explore statistics (number of
distinct values, cost of the comparison function) for the keys and
reorder them to minimize comparison costs. Intuitively, it may be better
to perform more expensive comparisons (for complex data types etc.)
last, because maybe the cheaper comparisons will be enough. Similarly,
the higher the cardinality of a key, the lower the probability we'll
need to compare more keys. The patch generates and costs various
orderings, picking the cheapest ones.

The ordering of group keys may interact with other parts of the query,
some of which may not be known while planning the grouping. E.g. there
may be an explicit ORDER BY clause, or some other ordering-dependent
operation, higher up in the query, and using the same ordering may allow
using either incremental sort or even eliminate the sort entirely.

The patch generates orderings and picks those minimizing the comparison
cost (for various pathkeys), and then adds orderings that might be
useful for operations higher up in the plan (ORDER BY, etc.). Finally,
it always keeps the ordering specified in the query, on the assumption
the user might have additional insights.

This introduces a new GUC enable_group_by_reordering, so that the
optimization may be disabled if needed.

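For example, the optimization may be disabled with:

SET enable_group_by_reordering = off;
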
The original patch was proposed by Teodor Sigaev, and later improved and
reworked by Dmitry Dolgov. Reviews by a number of people, including me,
Andrey Lepikhov, Claudio Freire, Ibrar Ahmed and Zhihong Yu.

Author: Dmitry Dolgov, Teodor Sigaev, Tomas Vondra
Reviewed-by: Tomas Vondra, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed, Zhihong Yu
Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru
Discussion: https://postgr.es/m/CA%2Bq6zcW_4o2NC0zutLkOJPsFt80megSpX_dVRo6GK9PC-Jx_Ag%40mail.gmail.com
2022-03-31 01:13:33 +02:00
Andrew Dunstan 606948b058 SQL JSON functions
This patch introduces three SQL standard JSON functions:

JSON() (incorrectly mentioned in my commit message for f4fb45d15c)
JSON_SCALAR()
JSON_SERIALIZE()

JSON() produces json values from text, bytea, json or jsonb values, and
has facilities for handling duplicate keys.
JSON_SCALAR() produces a json value from any scalar sql value, including
json and jsonb.
JSON_SERIALIZE() produces text or bytea from input which contains or
represents json or jsonb.

For the most part these functions don't add any significant new
capabilities, but they will be of use to users wanting standard
compliant JSON handling.

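For example (illustrative):

SELECT JSON_SCALAR(123.45);         -- 123.45
SELECT JSON_SCALAR('a string');     -- "a string"
SELECT JSON_SERIALIZE('{"a": 1}');  -- returns text by default
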
Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-30 16:30:37 -04:00
Andrew Dunstan 1a36bc9dba SQL/JSON query functions
This introduces the SQL/JSON functions for querying JSON data using
jsonpath expressions. The functions are:

JSON_EXISTS()
JSON_QUERY()
JSON_VALUE()

All of these functions only operate on jsonb. The workaround for now is
to cast the argument to jsonb.

JSON_EXISTS() tests if the jsonpath expression applied to the jsonb
value yields any values. JSON_VALUE() must return a single value, and an
error occurs if it tries to return multiple values. JSON_QUERY() must
return a json object or array, and there are various WRAPPER options for
handling scalar or multi-value results. Both these functions have
options for handling EMPTY and ERROR conditions.

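For example (illustrative):

SELECT JSON_EXISTS('{"a": [1, 2, 3]}'::jsonb, '$.a[2]');  -- true
SELECT JSON_VALUE('{"a": [1, 2, 3]}'::jsonb, '$.a[0]');   -- 1
SELECT JSON_QUERY('{"a": [1, 2, 3]}'::jsonb, '$.a');      -- [1, 2, 3]
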
Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-29 16:57:13 -04:00
Andrew Dunstan 33a377608f IS JSON predicate
This patch introduces the SQL standard IS JSON predicate. It operates
on text and bytea values representing JSON as well as on the json and
jsonb types. Each test has an IS and IS NOT variant. The tests are:

IS JSON [VALUE]
IS JSON ARRAY
IS JSON OBJECT
IS JSON SCALAR
IS JSON WITH | WITHOUT UNIQUE KEYS

These are mostly self-explanatory, but note that IS JSON WITHOUT UNIQUE
KEYS is true whenever IS JSON is true, and IS JSON WITH UNIQUE KEYS is
true whenever IS JSON is true, except when IS JSON OBJECT is true and there
are duplicate keys (which is never the case when applied to jsonb values).

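For example (illustrative):

SELECT '[1, 2, 3]' IS JSON ARRAY;                    -- true
SELECT '{"a": 1, "a": 2}' IS JSON WITH UNIQUE KEYS;  -- false
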
Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-28 15:37:08 -04:00
Alvaro Herrera 7103ebb7aa Add support for MERGE SQL command
MERGE performs actions that modify rows in the target table using a
source table or query. MERGE provides a single SQL statement that can
conditionally INSERT/UPDATE/DELETE rows -- a task that would otherwise
require multiple PL statements.  For example,

MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
  UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
  INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
  DO NOTHING;

MERGE works with regular tables, partitioned tables and inheritance
hierarchies, including column and row security enforcement, as well as
support for row and statement triggers and transition tables therein.

MERGE is optimized for OLTP and is parameterizable, though also useful
for large scale ETL/ELT. MERGE is not intended to be used in preference
to existing single SQL commands for INSERT, UPDATE or DELETE since there
is some overhead.  MERGE can be used from PL/pgSQL.

MERGE does not support targeting updatable views or foreign tables, and
RETURNING clauses are not allowed either.  These limitations are likely
fixable with sufficient effort.  Rewrite rules are also not supported,
but it's not clear that we'd want to support them.

Author: Pavan Deolasee <pavan.deolasee@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Simon Riggs <simon.riggs@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions)
Reviewed-by: Peter Geoghegan <pg@bowt.ie> (earlier versions)
Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
Discussion: https://postgr.es/m/20201231134736.GA25392@alvherre.pgsql
2022-03-28 16:47:48 +02:00
Andrew Dunstan f4fb45d15c SQL/JSON constructors
This patch introduces the SQL/JSON standard constructors for JSON:

JSON()
JSON_ARRAY()
JSON_ARRAYAGG()
JSON_OBJECT()
JSON_OBJECTAGG()

For the most part these functions provide facilities that mimic
existing json/jsonb functions. However, they also offer some useful
additional functionality. In addition to text input, the JSON() function
accepts bytea input, which it will decode and construct a json value from.
The other functions provide useful options for handling duplicate keys
and null values.

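For example (illustrative):

SELECT JSON_ARRAY(1, 2, 'three');
SELECT JSON_OBJECT('code': 'P123', 'title': NULL ABSENT ON NULL);
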
This series of patches will be followed by a consolidated documentation
patch.

Nikita Glukhov

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-27 17:03:34 -04:00
Andrew Dunstan f79b803dcc Common SQL/JSON clauses
This introduces some of the building blocks used by the SQL/JSON
constructor and query functions. Specifically, it provides node
executor and grammar support for the FORMAT JSON [ENCODING foo]
clause, and values decorated with it, and for the RETURNING clause.

The following SQL/JSON patches will leverage these.

Nikita Glukhov (who probably deserves an award for perseverance).

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu,
Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-27 17:03:33 -04:00
Tomas Vondra 923def9a53 Allow specifying column lists for logical replication
This allows specifying an optional column list when adding a table to
logical replication. The column list may be specified after the table
name, enclosed in parentheses. Columns not included in this list are not
sent to the subscriber, allowing the schema on the subscriber to be a
subset of the publisher schema.

For UPDATE/DELETE publications, the column list needs to cover all
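For example (illustrative; the table and column names are hypothetical):

CREATE PUBLICATION users_pub FOR TABLE users (user_id, firstname);
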
REPLICA IDENTITY columns. For INSERT publications, the column list is
arbitrary and may omit some REPLICA IDENTITY columns. Furthermore, if
the table uses REPLICA IDENTITY FULL, column list is not allowed.

The column list can contain only simple column references. Complex
expressions, function calls etc. are not allowed. This restriction could
be relaxed in the future.

During the initial table synchronization, only columns included in the
column list are copied to the subscriber. If the subscription has
several publications, containing the same table with different column
lists, columns specified in any of the lists will be copied.

This means all columns are replicated if the table has no column list
at all (which is treated as a column list containing all columns), or when
one of the publications is defined as FOR ALL TABLES (possibly IN SCHEMA
that matches the schema of the table).

For partitioned tables, publish_via_partition_root determines whether
the column list for the root or the leaf relation will be used. If the
parameter is 'false' (the default), the list defined for the leaf
relation is used. Otherwise, the column list for the root partition
will be used.

Psql commands \dRp+ and \d <table-name> now display any column lists.

Author: Tomas Vondra, Alvaro Herrera, Rahila Syed
Reviewed-by: Peter Eisentraut, Alvaro Herrera, Vignesh C, Ibrar Ahmed,
Amit Kapila, Hou zj, Peter Smith, Wang wei, Tang, Shi yu
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com
2022-03-26 01:01:27 +01:00
Tomas Vondra 75b1521dae Add decoding of sequences to built-in replication
This commit adds support for decoding of sequences to the built-in
replication (the infrastructure was added by commit 0da92dc530).

The syntax and behavior mostly mimics handling of tables, i.e. a
publication may be defined as FOR ALL SEQUENCES (replicating all
sequences in a database), FOR ALL SEQUENCES IN SCHEMA (replicating
all sequences in a particular schema) or individual sequences.

To publish sequence modifications, the publication has to include
'sequence' action. The protocol is extended with a new message,
describing sequence increments.

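For example (a sketch based on the syntax described above; note that this
feature was subsequently reverted by 2c7ea57e56):

CREATE PUBLICATION seq_pub FOR ALL SEQUENCES;
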
A new system view pg_publication_sequences lists all the sequences
added to a publication, both directly and indirectly. Various psql
commands (\d and \dRp) are improved to also display publications
including a given sequence, or sequences included in a publication.

Author: Tomas Vondra, Cary Huang
Reviewed-by: Peter Eisentraut, Amit Kapila, Hannu Krosing, Andres
             Freund, Petr Jelinek
Discussion: https://postgr.es/m/d045f3c2-6cfb-06d3-5540-e63c320df8bc@enterprisedb.com
Discussion: https://postgr.es/m/1710ed7e13b.cd7177461430746.3372264562543607781@highgo.ca
2022-03-24 18:49:27 +01:00
Andrew Dunstan 1460fc5942 Revert "Common SQL/JSON clauses"
This reverts commit 865fe4d5df.

This has caused issues with a significant number of buildfarm members
2022-03-22 19:56:14 -04:00
Andrew Dunstan 865fe4d5df Common SQL/JSON clauses
This introduces some of the building blocks used by the SQL/JSON
constructor and query functions. Specifically, it provides node
executor and grammar support for the FORMAT JSON [ENCODING foo]
clause, and values decorated with it, and for the RETURNING clause.

The following SQL/JSON patches will leverage these.

Nikita Glukhov (who probably deserves an award for perseverance).

Reviewers have included (in no particular order) Andres Freund, Alexander
Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zhihong Yu and
Himanshu Upadhyaya.

Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru
2022-03-22 17:32:54 -04:00
Andrew Dunstan d11e84ea46 Add String object access hooks
This caters for cases where the access is to an object identified by
name rather than Oid.

The first user of these is the GUC access controls.

Joshua Brindle and Mark Dilger

Discussion: https://postgr.es/m/47F87A0E-C0E5-43A6-89F6-D403F2B45175@enterprisedb.com
2022-03-22 10:28:31 -04:00
Amit Kapila 208c5d65bb Add ALTER SUBSCRIPTION ... SKIP.
This feature allows skipping the transaction on subscriber nodes.

If incoming change violates any constraint, logical replication stops
until it's resolved. Currently, users need to either manually resolve the
conflict by updating a subscriber-side database or by using function
pg_replication_origin_advance() to skip the conflicting transaction. This
commit introduces a simpler way to skip the conflicting transactions.

The user can specify LSN by ALTER SUBSCRIPTION ... SKIP (lsn = XXX),
which allows the apply worker to skip the transaction finished at the
specified LSN. The apply worker skips all data modification changes within
the transaction.

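For example (illustrative; the subscription name and LSN are
hypothetical):

ALTER SUBSCRIPTION mysub SKIP (lsn = '0/14C0378');
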
Author: Masahiko Sawada
Reviewed-by: Takamichi Osumi, Hou Zhijie, Peter Eisentraut, Amit Kapila, Shi Yu, Vignesh C, Greg Nancarrow, Haiying Tang, Euler Taveira
Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com
2022-03-22 07:11:19 +05:30
Alvaro Herrera ba9a7e3921 Enforce foreign key correctly during cross-partition updates
When an update on a partitioned table referenced in foreign key
constraints causes a row to move from one partition to another,
the fact that the move is implemented as a delete followed by an insert
on the target partition causes the foreign key triggers to have
surprising behavior.  For example, a given foreign key's delete trigger
which implements the ON DELETE CASCADE clause of that key will delete
any referencing rows when triggered for that internal DELETE, although
it should not, because the referenced row is simply being moved from one
partition of the referenced root partitioned table into another, not
being deleted from it.

This commit teaches trigger.c to skip queuing such delete trigger events
on the leaf partitions in favor of an UPDATE event fired on the root
target relation.  Doing so is sensible because both the old and the new
tuple "logically" belong to the root relation.

The after trigger event queuing interface now allows passing the source
and the target partitions of a particular cross-partition update when
registering the update event for the root partitioned table.  Along with
the two ctids of the old and the new tuple, the after trigger event now
also stores the OIDs of those partitions. The tuples fetched from the
source and the target partitions are converted into the root table
format, if necessary, before they are passed to the trigger function.

The implementation currently has a limitation that only the foreign keys
pointing into the query's target relation are considered, not those of
its sub-partitioned partitions.  That seems like a reasonable
limitation, because it sounds rare to have distinct foreign keys
pointing to sub-partitioned partitions instead of to the root table.

This misbehavior stems from commit f56f8f8da6 (which added support for
foreign keys to reference partitioned tables) not paying sufficient
attention to commit 2f17844104 (which had introduced cross-partition
updates a year earlier).  Even though the former commit goes back to
Postgres 12, we're not backpatching this fix at this time for fear of
destabilizing things too much, and because there are a few ABI breaks in
it that we'd have to work around in older branches.  It also depends on
commit f4566345cf, which had its own share of backpatchability issues
as well.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Eduard Català <eduard.catala@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqFvkBCmfwkQX_yBqv2Wz8ugUGiBDxum8=WvVbfU1TXaNg@mail.gmail.com
Discussion: https://postgr.es/m/CAL54xNZsLwEM1XCk5yW9EqaRzsZYHuWsHQkA2L5MOSKXAwviCQ@mail.gmail.com
2022-03-20 18:43:40 +01:00
Tom Lane d7b5c071dd Don't bother to attach column name lists to RowExprs of named types.
If a RowExpr is marked as returning a named composite type, we aren't
going to consult its colnames list; we'll use the attribute names
shown for the type in pg_attribute.  Hence, skip storing that list,
to save a few nanoseconds when copying the expression tree around.

Discussion: https://postgr.es/m/2950001.1638729947@sss.pgh.pa.us
2022-03-17 18:25:44 -04:00
Amit Kapila 52e4f0cd47 Allow specifying row filters for logical replication of tables.
This feature adds row filtering for publication tables. When a publication
is defined or modified, an optional WHERE clause can be specified. Rows
that don't satisfy this WHERE clause will be filtered out. This allows a
set of tables to be partially replicated. The row filter is per table. A
new row filter can be added simply by specifying a WHERE clause after the
table name. The WHERE clause must be enclosed by parentheses.

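For example (illustrative; the table and column names are hypothetical):

CREATE PUBLICATION active_customers FOR TABLE customers
    WHERE (active IS TRUE);
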
The row filter WHERE clause for a table added to a publication that
publishes UPDATE and/or DELETE operations must contain only columns that
are covered by REPLICA IDENTITY. The row filter WHERE clause for a table
added to a publication that publishes INSERT can use any column. If the
row filter evaluates to NULL, it is regarded as "false". The WHERE clause
only allows simple expressions that don't have user-defined functions,
user-defined operators, user-defined types, user-defined collations,
non-immutable built-in functions, or references to system columns. These
restrictions could be addressed in the future.

If you choose to do the initial table synchronization, only data that
satisfies the row filters is copied to the subscriber. If the subscription
has several publications in which a table has been published with
different WHERE clauses, rows that satisfy ANY of the expressions will be
copied. If a subscriber is a pre-15 version, the initial table
synchronization won't use row filters even if they are defined in the
publisher.

The row filters are applied before publishing the changes. If the
subscription has several publications in which the same table has been
published with different filters (for the same publish operation), those
expressions get OR'ed together so that rows satisfying any of the
expressions will be replicated.

This means all the other filters become redundant if (a) one of the
publications has no filter at all, (b) one of the publications was created
using FOR ALL TABLES, or (c) one of the publications was created using
FOR ALL TABLES IN SCHEMA and the table belongs to that same schema.

If your publication contains a partitioned table, the publication
parameter publish_via_partition_root determines if it uses the partition's
row filter (if the parameter is false, the default) or the root
partitioned table's row filter.

Psql commands \dRp+ and \d <table-name> will display any row filters.

Author: Hou Zhijie, Euler Taveira, Peter Smith, Ajin Cherian
Reviewed-by: Greg Nancarrow, Haiying Tang, Amit Kapila, Tomas Vondra, Dilip Kumar, Vignesh C, Alvaro Herrera, Andres Freund, Wei Wang
Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com
2022-02-22 08:11:50 +05:30