postgresql

Commit Graph

Author	SHA1	Message	Date
Peter Eisentraut	0d21d4b9bc	Add standard collation UNICODE This adds a new predefined collation named UNICODE, which sorts by the default Unicode collation algorithm specifications, per SQL standard. This only works if ICU support is built. Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://www.postgresql.org/message-id/flat/1293e382-2093-a2bf-a397-c04e8f83d3c2@enterprisedb.com	2023-03-10 13:35:43 +01:00
Peter Eisentraut	30a53b7929	Allow tailoring of ICU locales with custom rules This exposes the ICU facility to add custom collation rules to a standard collation. New options are added to CREATE COLLATION, CREATE DATABASE, createdb, and initdb to set the rules. Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Discussion: https://www.postgresql.org/message-id/flat/821c71a4-6ef0-d366-9acf-bb8e367f739f@enterprisedb.com	2023-03-08 16:56:37 +01:00
Michael Paquier	b8da37b3ad	Rework pg_input_error_message(), now renamed pg_input_error_info() pg_input_error_info() is now a SQL function able to return a row with more than just the error message generated for incorrect data type inputs when these are able to handle soft failures, returning more contents of ErrorData, as of: - The error message (same as before). - The error detail, if set. - The error hint, if set. - SQL error code. All the regression tests that relied on pg_input_error_message() are updated to reflect the effects of the rename. Per discussion with Tom Lane and Andrew Dunstan. Author: Nathan Bossart Discussion: https://postgr.es/m/139a68e1-bd1f-a9a7-b5fe-0be9845c6311@dunslane.net	2023-02-28 08:04:13 +09:00
Peter Eisentraut	2ddab010c2	Implement ANY_VALUE aggregate SQL:2023 defines an ANY_VALUE aggregate whose purpose is to emit an implementation-dependent (i.e. non-deterministic) value from the aggregated rows. Author: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5cff866c-10a8-d2df-32cb-e9072e6b04a2@postgresfriends.org	2023-02-22 09:33:07 +01:00
Andres Freund	a9c70b46db	Add pg_stat_io view, providing more detailed IO statistics Builds on `28e626bde0` and `f30d62c2fc`. See the former for motivation. Rows of the view show IO operations for a particular backend type, IO target object, IO context combination (e.g. a client backend's operations on permanent relations in shared buffers) and each column in the view is the total number of IO Operations done (e.g. writes). So a cell in the view would be, for example, the number of blocks of relation data written from shared buffers by client backends since the last stats reset. In anticipation of tracking WAL IO and non-block-oriented IO (such as temporary file IO), the "op_bytes" column specifies the unit of the "reads", "writes", and "extends" columns for a given row. Rows for combinations of IO operation, backend type, target object and context that never occur, are ommitted entirely. For example, checkpointer will never operate on temporary relations. Similarly, if an IO operation never occurs for such a combination, the IO operation's cell will be null, to distinguish from 0 observed IO operations. For example, bgwriter should not perform reads. Note that some of the cells in the view are redundant with fields in pg_stat_bgwriter (e.g. buffers_backend). For now, these have been kept for backwards compatibility. Bumps catversion. Author: Melanie Plageman <melanieplageman@gmail.com> Author: Samay Sharma <smilingsamay@gmail.com> Reviewed-by: Maciek Sakrejda <m.sakrejda@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de	2023-02-11 09:52:15 -08:00
Tom Lane	2489d76c49	Make Vars be outer-join-aware. Traditionally we used the same Var struct to represent the value of a table column everywhere in parse and plan trees. This choice predates our support for SQL outer joins, and it's really a pretty bad idea with outer joins, because the Var's value can depend on where it is in the tree: it might go to NULL above an outer join. So expression nodes that are equal() per equalfuncs.c might not represent the same value, which is a huge correctness hazard for the planner. To improve this, decorate Var nodes with a bitmapset showing which outer joins (identified by RTE indexes) may have nulled them at the point in the parse tree where the Var appears. This allows us to trust that equal() Vars represent the same value. A certain amount of klugery is still needed to cope with cases where we re-order two outer joins, but it's possible to make it work without sacrificing that core principle. PlaceHolderVars receive similar decoration for the same reason. In the planner, we include these outer join bitmapsets into the relids that an expression is considered to depend on, and in consequence also add outer-join relids to the relids of join RelOptInfos. This allows us to correctly perceive whether an expression can be calculated above or below a particular outer join. This change affects FDWs that want to plan foreign joins. They must follow suit when labeling foreign joins in order to match with the core planner, but for many purposes (if postgres_fdw is any guide) they'd prefer to consider only base relations within the join. To support both requirements, redefine ForeignScan.fs_relids as base+OJ relids, and add a new field fs_base_relids that's set up by the core planner. Large though it is, this commit just does the minimum necessary to install the new mechanisms and get check-world passing again. Follow-up patches will perform some cleanup. (The README additions and comments mention some stuff that will appear in the follow-up.) Patch by me; thanks to Richard Guo for review. Discussion: https://postgr.es/m/830269.1656693747@sss.pgh.pa.us	2023-01-30 13:16:20 -05:00
Tom Lane	3cece34be8	Remove special outfuncs/readfuncs handling of RangeVar.catalogname. Historically we skipped writing/reading this field, but that no longer works under WRITE_READ_PARSE_PLAN_TREES since we expanded the coverage of that option to include utility commands (`787102b56`). Remove the special case and just treat this field normally. Bump catversion out of an abundance of caution --- I do not think we currently ever store RangeVar nodes in the catalogs, but perhaps I'm wrong. Per report from Pavel Stehule. Discussion: https://postgr.es/m/CAFj8pRAYvYu-qU7-NieqRRyaQZk-yr3UjtHQ2LR62PS9M1dZMA@mail.gmail.com	2023-01-23 13:33:19 -05:00
David Rowley	16fd03e956	Allow parallel aggregate on string_agg and array_agg This adds combine, serial and deserial functions for the array_agg() and string_agg() aggregate functions, thus allowing these aggregates to partake in partial aggregations. This allows both parallel aggregation to take place when these aggregates are present and also allows additional partition-wise aggregation plan shapes to include plans that require additional aggregation once the partially aggregated results from the partitions have been combined. Author: David Rowley Reviewed-by: Andres Freund, Tomas Vondra, Stephen Frost, Tom Lane Discussion: https://postgr.es/m/CAKJS1f9sx_6GTcvd6TMuZnNtCh0VhBzhX6FZqw17TgVFH-ga_A@mail.gmail.com	2023-01-23 17:35:01 +13:00
Robert Haas	557890920d	Bump catversion for `6e2775e4d4`. It creates a new predefined role.	2023-01-20 16:37:26 -05:00
Robert Haas	6e2775e4d4	Add new GUC reserved_connections. This provides a way to reserve connection slots for non-superusers. The slots reserved via the new GUC are available only to users who have the new predefined role pg_use_reserved_connections. superuser_reserved_connections remains as a final reserve in case reserved_connections has been exhausted. Patch by Nathan Bossart. Reviewed by Tushar Ahuja and by me. Discussion: http://postgr.es/m/20230119194601.GA4105788@nathanxps13	2023-01-20 15:39:13 -05:00
Tom Lane	47bb9db759	Get rid of the "new" and "old" entries in a view's rangetable. The rule system needs "old" and/or "new" pseudo-RTEs in rule actions that are ON INSERT/UPDATE/DELETE. Historically it's put such entries into the ON SELECT rules of views as well, but those are really quite vestigial. The only thing we've used them for is to carry the view's relid forward to AcquireExecutorLocks (so that we can re-lock the view to verify it hasn't changed before re-using a plan) and to carry its relid and permissions data forward to execution-time permissions checks. What we can do instead of that is to retain these fields of the RTE_RELATION RTE for the view even after we convert it to an RTE_SUBQUERY RTE. This requires a tiny amount of extra complication in the planner and AcquireExecutorLocks, but on the other hand we can get rid of the logic that moves that data from one place to another. The principal immediate benefit of doing this, aside from a small saving in the pg_rewrite data for views, is that these pseudo-RTEs no longer trigger ruleutils.c's heuristic about qualifying variable names when the rangetable's length is more than 1. That results in quite a number of small simplifications in regression test outputs, which are all to the good IMO. Bump catversion because we need to dump a few more fields of RTE_SUBQUERY RTEs. While those will always be zeroes anyway in stored rules (because we'd never populate them until query rewrite) they are useful for debugging, and it seems like we'd better make sure to transmit such RTEs accurately in plans sent to parallel workers. I don't think the executor actually examines these fields after startup, but someday it might. This is a second attempt at committing `1b4d280ea`. The difference from the first time is that now we can add some filtering rules to AdjustUpgrade.pm to allow cross-version upgrade testing to pass despite all the cosmetic changes in CREATE VIEW outputs. Amit Langote (filtering rules by me) Discussion: https://postgr.es/m/CA+HiwqEf7gPN4Hn+LoZ4tP2q_Qt7n3vw7-6fJKOf92tSEnX6Gg@mail.gmail.com Discussion: https://postgr.es/m/891521.1673657296@sss.pgh.pa.us	2023-01-18 13:23:57 -05:00
Amit Kapila	d540a02a72	Display the leader apply worker's PID for parallel apply workers. Add leader_pid to pg_stat_subscription. leader_pid is the process ID of the leader apply worker if this process is a parallel apply worker. If this field is NULL, it indicates that the process is a leader apply worker or a synchronization worker. The new column makes it easier to distinguish parallel apply workers from other kinds of workers and helps to identify the leader for the parallel workers corresponding to a particular subscription. Additionally, update the leader_pid column in pg_stat_activity as well to display the PID of the leader apply worker for parallel apply workers. Author: Hou Zhijie Reviewed-by: Peter Smith, Sawada Masahiko, Amit Kapila, Shveta Mallik Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com	2023-01-18 09:03:12 +05:30
Amit Kapila	b7ae039536	Ignore dropped and generated columns from the column list. We don't allow different column lists for the same table in the different publications of the single subscription. A publication with a column list except for dropped and generated columns should be considered the same as a publication with no column list (which implicitly includes all columns as part of the columns list). However, as we were not excluding the dropped and generated columns from the column list combining such publications leads to an error "cannot use different column lists for table ...". We decided not to backpatch this fix as there is a risk of users seeing this as a behavior change and also we didn't see any field report of this case. Author: Shi yu Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/OSZPR01MB631091CCBC56F195B1B9ACB0FDFE9@OSZPR01MB6310.jpnprd01.prod.outlook.com	2023-01-13 14:49:23 +05:30
Tom Lane	f0e6d6d3c9	Revert "Get rid of the "new" and "old" entries in a view's rangetable." This reverts commit `1b4d280ea1`. It's broken the buildfarm members that run cross-version-upgrade tests, because they're not prepared to deal with cosmetic differences between CREATE VIEW commands emitted by older servers and HEAD. Even if we had a solution to that, which we don't, it'd take some time to roll it out to the affected animals. This improvement isn't valuable enough to justify addressing that problem on an emergency basis, so revert it for now.	2023-01-11 23:01:22 -05:00
Tom Lane	1b4d280ea1	Get rid of the "new" and "old" entries in a view's rangetable. The rule system needs "old" and/or "new" pseudo-RTEs in rule actions that are ON INSERT/UPDATE/DELETE. Historically it's put such entries into the ON SELECT rules of views as well, but those are really quite vestigial. The only thing we've used them for is to carry the view's relid forward to AcquireExecutorLocks (so that we can re-lock the view to verify it hasn't changed before re-using a plan) and to carry its relid and permissions data forward to execution-time permissions checks. What we can do instead of that is to retain these fields of the RTE_RELATION RTE for the view even after we convert it to an RTE_SUBQUERY RTE. This requires a tiny amount of extra complication in the planner and AcquireExecutorLocks, but on the other hand we can get rid of the logic that moves that data from one place to another. The principal immediate benefit of doing this, aside from a small saving in the pg_rewrite data for views, is that these pseudo-RTEs no longer trigger ruleutils.c's heuristic about qualifying variable names when the rangetable's length is more than 1. That results in quite a number of small simplifications in regression test outputs, which are all to the good IMO. Bump catversion because we need to dump a few more fields of RTE_SUBQUERY RTEs. While those will always be zeroes anyway in stored rules (because we'd never populate them until query rewrite) they are useful for debugging, and it seems like we'd better make sure to transmit such RTEs accurately in plans sent to parallel workers. I don't think the executor actually examines these fields after startup, but someday it might. Amit Langote Discussion: https://postgr.es/m/CA+HiwqEf7gPN4Hn+LoZ4tP2q_Qt7n3vw7-6fJKOf92tSEnX6Gg@mail.gmail.com	2023-01-11 19:41:09 -05:00
Tom Lane	38d81760c4	Invent random_normal() to provide normally-distributed random numbers. There is already a version of this in contrib/tablefunc, but it seems sufficiently widely useful to justify having it in core. Paul Ramsey Discussion: https://postgr.es/m/CACowWR0DqHAvOKUCNxTrASFkWsDLqKMd6WiXvVvaWg4pV1BMnQ@mail.gmail.com	2023-01-09 12:44:00 -05:00
Amit Kapila	216a784829	Perform apply of large transactions by parallel workers. Currently, for large transactions, the publisher sends the data in multiple streams (changes divided into chunks depending upon logical_decoding_work_mem), and then on the subscriber-side, the apply worker writes the changes into temporary files and once it receives the commit, it reads from those files and applies the entire transaction. To improve the performance of such transactions, we can instead allow them to be applied via parallel workers. In this approach, we assign a new parallel apply worker (if available) as soon as the xact's first stream is received and the leader apply worker will send changes to this new worker via shared memory. The parallel apply worker will directly apply the change instead of writing it to temporary files. However, if the leader apply worker times out while attempting to send a message to the parallel apply worker, it will switch to "partial serialize" mode - in this mode, the leader serializes all remaining changes to a file and notifies the parallel apply workers to read and apply them at the end of the transaction. We use a non-blocking way to send the messages from the leader apply worker to the parallel apply to avoid deadlocks. We keep this parallel apply assigned till the transaction commit is received and also wait for the worker to finish at commit. This preserves commit ordering and avoid writing to and reading from files in most cases. We still need to spill if there is no worker available. This patch also extends the SUBSCRIPTION 'streaming' parameter so that the user can control whether to apply the streaming transaction in a parallel apply worker or spill the change to disk. The user can set the streaming parameter to 'on/off', or 'parallel'. The parameter value 'parallel' means the streaming will be applied via a parallel apply worker, if available. The parameter value 'on' means the streaming transaction will be spilled to disk. The default value is 'off' (same as current behaviour). In addition, the patch extends the logical replication STREAM_ABORT message so that abort_lsn and abort_time can also be sent which can be used to update the replication origin in parallel apply worker when the streaming transaction is aborted. Because this message extension is needed to support parallel streaming, parallel streaming is not supported for publications on servers < PG16. Author: Hou Zhijie, Wang wei, Amit Kapila with design inputs from Sawada Masahiko Reviewed-by: Sawada Masahiko, Peter Smith, Dilip Kumar, Shi yu, Kuroda Hayato, Shveta Mallik Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com	2023-01-09 07:52:45 +05:30
Bruce Momjian	c8e1ba736b	Update copyright for 2023 Backpatch-through: 11	2023-01-02 15:00:37 -05:00
David Rowley	b5aff92557	Fix recent accidental omission in pg_proc.dat `ed1a88dda` added support functions for the ntile(), percent_rank() and cume_dist() window functions but neglected to actually add these support functions to the pg_proc entry for the corresponding window function. Also, take this opportunity to add these window functions to one of the regression tests added in `ed1a88dda` to give the support functions a little bit of exercise. If I'd done that in the first place then the omission would have been more obvious. Bump the catversion, again.	2022-12-24 13:18:35 +13:00
Michael Paquier	13e0d7a603	Rename pg_dissect_walfile_name() to pg_split_walfile_name() The former name was discussed as being confusing, so use "split", as per a suggestion from Magnus Hagander. While on it, one of the output arguments is renamed from "segno" to "segment_number", as per a suggestion from Kyotaro Horiguchi. The documentation is updated to reflect all these changes. Bump catalog version. Author: Bharath Rupireddy, Michael Paquier Discussion: https://postgr.es/m/CABUevEytQVaOOhGdoh0D7hGwe3fuKcRF6NthsSW7ww04EmtFgQ@mail.gmail.com	2022-12-23 09:15:01 +09:00
David Rowley	ed1a88ddac	Allow window functions to adjust their frameOptions WindowFuncs such as row_number() don't care if it's called with ROWS UNBOUNDED PRECEDING AND CURRENT ROW or with RANGE UNBOUNDED PRECEDING AND CURRENT ROW. The latter is less efficient as the RANGE option requires that the executor check for peer rows, so using the ROW option instead would cause less overhead. Because RANGE is part of the default frame options for WindowClauses, it means WindowAgg is, by default, working much harder than it needs to for window functions where the ROWS / RANGE option has no effect on the window function's result. On a test query from the discussion thread, a performance improvement of 344% was seen by using ROWS instead of RANGE. Here we add a new support function node type to allow support functions to be called for window functions so that the most optimal version of the frame options can be set. The planner has been adjusted so that the frame options are changed only if all window functions sharing the same window clause agree on what the optimized frame options are. Here we give the ability for row_number(), rank(), dense_rank(), percent_rank(), cume_dist() and ntile() to alter their WindowClause's frameOptions. Reviewed-by: Vik Fearing, Erwin Brandstetter, Zhihong Yu Discussion: https://postgr.es/m/CAGHENJ7LBBszxS+SkWWFVnBmOT2oVsBhDMB1DFrgerCeYa_DyA@mail.gmail.com Discussion: https://postgr.es/m/CAApHDvohAKEtTXxq7Pc-ic2dKT8oZfbRKeEJP64M0B6+S88z+A@mail.gmail.com	2022-12-23 12:43:52 +13:00
Andrew Dunstan	8284cf5f74	Add copyright notices to meson files Discussion: https://postgr.es/m/222b43a5-2fb3-2c1b-9cd0-375d376c8246@dunslane.net	2022-12-20 07:54:39 -05:00
Michael Paquier	cca1863489	Add pg_dissect_walfile_name() This function takes in input a WAL segment name and returns a tuple made of the segment sequence number (dependent on the WAL segment size of the cluster) and its timeline, as of a thin SQL wrapper around the existing XLogFromFileName(). This function has multiple usages, like being able to compile a LSN from a file name and an offset, or finding the timeline of a segment without having to do to some maths based on the first eight characters of the segment. Bump catalog version. Author: Bharath Rupireddy Reviewed-by: Nathan Bossart, Kyotaro Horiguchi, Maxim Orlov, Michael Paquier Discussion: https://postgr.es/m/CALj2ACWV=FCddsxcGbVOA=cvPyMr75YCFbSQT6g4KDj=gcJK4g@mail.gmail.com	2022-12-20 13:36:27 +09:00
Robert Haas	10ea0f924a	Expose some information about backend subxact status. A new function pg_stat_get_backend_subxact() can be used to get information about the number of subtransactions in the cache of a particular backend and whether that cache has overflowed. This can be useful for tracking down performance problems that can result from overflowed snapshots. Dilip Kumar, reviewed by Zhihong Yu, Nikolay Samokhvalov, Justin Pryzby, Nathan Bossart, Ashutosh Sharma, Julien Rouhaud. Additional design comments from Andres Freund, Tom Lane, Bruce Momjian, and David G. Johnston. Discussion: http://postgr.es/m/CAFiTN-ut0uwkRJDQJeDPXpVyTWD46m3gt3JDToE02hTfONEN=Q@mail.gmail.com	2022-12-19 14:43:09 -05:00
Peter Eisentraut	75f49221c2	Static assertions cleanup Because we added StaticAssertStmt() first before StaticAssertDecl(), some uses as well as the instructions in c.h are now a bit backwards from the "native" way static assertions are meant to be used in C. This updates the guidance and moves some static assertions to better places. Specifically, since the addition of StaticAssertDecl(), we can put static assertions at the file level. This moves a number of static assertions out of function bodies, where they might have been stuck out of necessity, to perhaps better places at the file level or in header files. Also, when the static assertion appears in a position where a declaration is allowed, then using StaticAssertDecl() is more native than StaticAssertStmt(). Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/941a04e7-dd6f-c0e4-8cdf-a33b3338cbda%40enterprisedb.com	2022-12-15 10:10:32 +01:00
Jeff Davis	60684dd834	Add grantable MAINTAIN privilege and pg_maintain role. Allows VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZED VIEW, CLUSTER, and LOCK TABLE. Effectively reverts `4441fc704d`. Instead of creating separate privileges for VACUUM, ANALYZE, and other maintenance commands, group them together under a single MAINTAIN privilege. Author: Nathan Bossart Discussion: https://postgr.es/m/20221212210136.GA449764@nathanxps13 Discussion: https://postgr.es/m/45224.1670476523@sss.pgh.pa.us	2022-12-13 17:33:28 -08:00
Alvaro Herrera	840ff5f451	Get rid of recursion-marker values in enum AlterTableType During ALTER TABLE execution, when prep-time handling of subcommands of certain types determine that execution-time handling requires recursion, they signal this by changing the subcommand type to a special value. This can be done in a simpler way by using a separate flag introduced by commit `ec0925c22a`, so do that. Catversion bumped. It's not clear to me that ALTER TABLE subcommands are stored anywhere in catalogs (CREATE FUNCTION rejects it in BEGIN ATOMIC function bodies), but we do have both write and read support for them, so be safe. Discussion: https://postgr.es/m/20220929090033.zxuaezcdwh2fgfjb@alvherre.pgsql	2022-12-12 11:13:26 +01:00
Tom Lane	1939d26282	Add test scaffolding for soft error reporting from input functions. pg_input_is_valid() returns boolean, while pg_input_error_message() returns the primary error message if the input is bad, or NULL if the input is OK. The main reason for having two functions is so that we can test both the details-wanted and the no-details-wanted code paths. Although these are primarily designed with testing in mind, it could well be that they'll be useful to end users as well. This patch is mostly by me, but it owes very substantial debt to earlier work by Nikita Glukhov, Andrew Dunstan, and Amul Sul. Thanks to Andres Freund for review. Discussion: https://postgr.es/m/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru	2022-12-09 10:08:44 -05:00
Alexander Korotkov	096dd80f3c	Add USER SET parameter values for pg_db_role_setting The USER SET flag specifies that the variable should be set on behalf of an ordinary role. That lets ordinary roles set placeholder variables, which permission requirements are not known yet. Such a value wouldn't be used if the variable finally appear to require superuser privileges. The new flags are stored in the pg_db_role_setting.setuser array. Catversion is bumped. This commit is inspired by the previous work by Steve Chavez. Discussion: https://postgr.es/m/CAPpHfdsLd6E--epnGqXENqLP6dLwuNZrPMcNYb3wJ87WR7UBOQ%40mail.gmail.com Author: Alexander Korotkov, Steve Chavez Reviewed-by: Pavel Borisov, Steve Chavez	2022-12-09 13:12:20 +03:00
Alvaro Herrera	a61b1f7482	Rework query relation permission checking Currently, information about the permissions to be checked on relations mentioned in a query is stored in their range table entries. So the executor must scan the entire range table looking for relations that need to have permissions checked. This can make the permission checking part of the executor initialization needlessly expensive when many inheritance children are present in the range range. While the permissions need not be checked on the individual child relations, the executor still must visit every range table entry to filter them out. This commit moves the permission checking information out of the range table entries into a new plan node called RTEPermissionInfo. Every top-level (inheritance "root") RTE_RELATION entry in the range table gets one and a list of those is maintained alongside the range table. This new list is initialized by the parser when initializing the range table. The rewriter can add more entries to it as rules/views are expanded. Finally, the planner combines the lists of the individual subqueries into one flat list that is passed to the executor for checking. To make it quick to find the RTEPermissionInfo entry belonging to a given relation, RangeTblEntry gets a new Index field 'perminfoindex' that stores the corresponding RTEPermissionInfo's index in the query's list of the latter. ExecutorCheckPerms_hook has gained another List * argument; the signature is now: typedef bool (ExecutorCheckPerms_hook_type) (List rangeTable, List *rtePermInfos, bool ereport_on_violation); The first argument is no longer used by any in-core uses of the hook, but we leave it in place because there may be other implementations that do. Implementations should likely scan the rtePermInfos list to determine which operations to allow or deny. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqGjJDmUhDSfv-U2qhKJjt9ST7Xh9JXC_irsAQ1TAUsJYg@mail.gmail.com	2022-12-06 16:09:24 +01:00
Alvaro Herrera	ec38694894	Move PartitioPruneInfo out of plan nodes into PlannedStmt The planner will now add a given PartitioPruneInfo to PlannedStmt.partPruneInfos instead of directly to the Append/MergeAppend plan node. What gets set instead in the latter is an index field which points to the list element of PlannedStmt.partPruneInfos containing the PartitioPruneInfo belonging to the plan node. A later commit will make AcquireExecutorLocks() do the initial partition pruning to determine a minimal set of partitions to be locked when validating a plan tree and it will need to consult the PartitioPruneInfos referenced therein to do so. It would be better for the PartitioPruneInfos to be accessible directly than requiring a walk of the plan tree to find them, which is easier when it can be done by simply iterating over PlannedStmt.partPruneInfos. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com	2022-12-01 12:56:21 +01:00
Alvaro Herrera	8f2e74bf87	Bump catalog version for previous commit	2022-11-30 12:09:13 +01:00
Andrew Dunstan	4441fc704d	Provide non-superuser predefined roles for vacuum and analyze This provides two new predefined roles: pg_vacuum_all_tables and pg_analyze_all_tables. Roles which have been granted these roles can perform vacuum or analyse respectively on any or all tables as if they were a superuser. This removes the need to grant superuser privilege to roles just so they can perform vacuum and/or analyze. Nathan Bossart Reviewed by: Bharath Rupireddy, Kyotaro Horiguchi, Stephen Frost, Robert Haas, Mark Dilger, Tom Lane, Corey Huinker, David G. Johnston, Michael Paquier. Discussion: https://postgr.es/m/20220722203735.GB3996698@nathanxps13	2022-11-28 12:08:14 -05:00
Michael Paquier	a54b658ce7	Add support for file inclusions in HBA and ident configuration files pg_hba.conf and pg_ident.conf gain support for three record keywords: - "include", to include a file. - "include_if_exists", to include a file, ignoring it if missing. - "include_dir", to include a directory of files. These are classified by name (C locale, mostly) and need to be prefixed by ".conf", hence following the same rules as GUCs. This commit relies on the refactoring pieces done in `efc9816`, `ad6c528`, `783e8c6` and `1b73d0b`, adding a small wrapper to build a list of TokenizedAuthLines (tokenize_include_file), and the code is shaped to offer some symmetry with what is done for GUCs with the same options. pg_hba_file_rules and pg_ident_file_mappings gain a new field called file_name, to track from which file a record is located, taking advantage of the addition of rule_number in `c591300` to offer an organized view of the HBA or ident records loaded. Bump catalog version. Author: Julien Rouhaud Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud	2022-11-24 13:51:34 +09:00
Andrew Dunstan	7b378237aa	Expand AclMode to 64 bits We're running out of bits for new permissions. This change doubles the number of permissions we can accomodate from 16 to 32, so the forthcoming new ones for vacuum/analyze don't exhaust the pool. Nathan Bossart Reviewed by: Bharath Rupireddy, Kyotaro Horiguchi, Stephen Frost, Robert Haas, Mark Dilger, Tom Lane, Corey Huinker, David G. Johnston, Michael Paquier. Discussion: https://postgr.es/m/20220722203735.GB3996698@nathanxps13	2022-11-23 14:43:16 -05:00
Michael Paquier	f193883fc9	Replace SQLValueFunction by COERCE_SQL_SYNTAX This switch impacts 9 patterns related to a SQL-mandated special syntax for function calls: - LOCALTIME [ ( typmod ) ] - LOCALTIMESTAMP [ ( typmod ) ] - CURRENT_TIME [ ( typmod ) ] - CURRENT_TIMESTAMP [ ( typmod ) ] - CURRENT_DATE Five new entries are added to pg_proc to compensate the removal of SQLValueFunction to provide backward-compatibility and making this change transparent for the end-user (for example for the attribute generated when a keyword is specified in a SELECT or in a FROM clause without an alias, or when specifying something else than an Iconst to the parser). The parser included a set of checks coming from the files in charge of holding the C functions used for the SQLValueFunction calls (as of transformSQLValueFunction()), which are now moved within each function's execution path, so this reduces the dependencies between the execution and the parsing steps. As of this change, all the SQL keywords use the same paths for their work, relying only on COERCE_SQL_SYNTAX. Like `fb32748`, no performance difference has been noticed, while the perf profiles get reduced with ExecEvalSQLValueFunction() gone. Bump catalog version. Reviewed-by: Corey Huinker, Ted Yu Discussion: https://postgr.es/m/YzaG3MoryCguUOym@paquier.xyz	2022-11-21 18:31:59 +09:00
Michael Paquier	fb32748e32	Switch SQLValueFunction on "name" to use COERCE_SQL_SYNTAX This commit changes six SQL keywords to use COERCE_SQL_SYNTAX rather than relying on SQLValueFunction: - CURRENT_ROLE - CURRENT_USER - USER - SESSION_USER - CURRENT_CATALOG - CURRENT_SCHEMA Among the six, "user", "current_role" and "current_catalog" require specific SQL functions to allow ruleutils.c to map them to the SQL keywords these require when using COERCE_SQL_SYNTAX. Having pg_proc.proname match with the keyword ensures that the compatibility remains the same when projecting any of these keywords in a FROM clause to an attribute name when an alias is not specified. This is covered by the tests added in `2e0d80c`, making sure that a correct mapping happens with each SQL keyword. The three others (current_schema, session_user and current_user) already have pg_proc entries for this job, so this brings more consistency between the way such keywords are treated in the parser, the executor and ruleutils.c. SQLValueFunction is reduced to half its contents after this change, simplifying its logic a bit as there is no need to enforce a C collation anymore for the entries returning a name as a result. I have made a few performance tests, with a million-ish calls to these keywords without seeing a difference in run-time or in perf profiles (ExecEvalSQLValueFunction() is removed from the profiles). The remaining SQLValueFunctions are now related to timestamps and dates. Bump catalog version. Reviewed-by: Corey Huinker Discussion: https://postgr.es/m/YzaG3MoryCguUOym@paquier.xyz	2022-11-20 10:58:28 +09:00
Joe Conway	ed1d3132d2	Fix catversion Commit `2fb6154fc` didn't quite get the catversion correct per usual norms. Fix it. Reported by Rishu Bagga.	2022-11-19 17:55:52 -05:00
Robert Haas	2fb6154fcd	Fix typos and bump catversion. Typos reported by Álvaro Herrera and Erik Rijkers. Catversion bump for `3d14e171e9` was inadvertently omitted.	2022-11-18 16:16:21 -05:00
Robert Haas	3d14e171e9	Add a SET option to the GRANT command. Similar to how the INHERIT option controls whether or not the permissions of the granted role are automatically available to the grantee, the new SET permission controls whether or not the grantee may use the SET ROLE command to assume the privileges of the granted role. In addition, the new SET permission controls whether or not it is possible to transfer ownership of objects to the target role or to create new objects owned by the target role using commands such as CREATE DATABASE .. OWNER. We could alternatively have made this controlled by the INHERIT option, or allow it when either option is given. An advantage of this approach is that if you are granted a predefined role with INHERIT TRUE, SET FALSE, you can't go and create objects owned by that role. The underlying theory here is that the ability to create objects as a target role is not a privilege per se, and thus does not depend on whether you inherit the target role's privileges. However, it's surely something you could do anyway if you could SET ROLE to the target role, and thus making it contingent on whether you have that ability is reasonable. Design review by Nathan Bossat, Wolfgang Walther, Jeff Davis, Peter Eisentraut, and Stephen Frost. Discussion: http://postgr.es/m/CA+Tgmob+zDSRS6JXYrgq0NWdzCXuTNzT5eK54Dn2hhgt17nm8A@mail.gmail.com	2022-11-18 12:32:56 -05:00
Tom Lane	533e02e927	Fix volatility marking of timestamptz_trunc_zone. It's safe to mark this as immutable, because it does not depend on the timezone GUC setting. Oversight in commit `600b04d6b`. (There's an argument that timezone definitions do change from time to time, but we have not worried about that in marking other timestamp-related functions; for example AT TIME ZONE has always been considered immutable. The situation is no worse than our problems with time-varying locales, surely.) Przemysław Sztoch Discussion: https://postgr.es/m/eaa3fabe-50fc-bbe8-b096-ce62ddadab85@sztoch.pl	2022-11-12 13:29:52 -05:00
Michael Paquier	c591300a8f	Add rule_number to pg_hba_file_rules and map_number to pg_ident_file_mappings These numbers are strictly-monotone identifiers assigned to each rule of pg_hba_file_rules and each map of pg_ident_file_mappings when loading the HBA and ident configuration files, indicating the order in which they are checked at authentication time, until a match is found. With only one file loaded currently, this is equivalent to the line numbers assigned to the entries loaded if one wants to know their order, but this becomes mandatory once the inclusion of external files is added to the HBA and ident files to be able to know in which order the rules and/or maps are applied at authentication. Note that NULL is used when a HBA or ident entry cannot be parsed or validated, aka when an error exists, contrary to the line number. Bump catalog version. Author: Julien Rouhaud Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud	2022-10-26 15:22:15 +09:00
Tom Lane	8272749e8c	Record dependencies of a cast on other casts that it requires. When creating a cast that uses a conversion function, we've historically allowed the input and result types to be binary-compatible with the function's input and result types, rather than necessarily being identical. This means that the new cast is logically dependent on the binary-compatible cast or casts that it references: if those are defined by pg_cast entries, and you try to restore the new cast without having defined them, it'll fail. Hence, we should make pg_depend entries to record these dependencies so that pg_dump knows that there is an ordering requirement. This is not the only place where we allow such shortcuts; aggregate functions for example are similarly lax, and in principle should gain similar dependencies. However, for now it seems sufficient to fix the cast-versus-cast case, as pg_dump's other ordering heuristics should keep it out of trouble for other object types. Per report from David Turoň; thanks also to Robert Haas for preliminary investigation. I considered back-patching, but seeing that this issue has existed for many years without previous reports, it's not clear it's worth the trouble. Moreover, back-patching wouldn't be enough to ensure that the new pg_depend entries exist in existing databases anyway. Discussion: https://postgr.es/m/OF0A160F3E.578B15D1-ONC12588DA.003E4857-C12588DA.0045A428@notes.linuxbox.cz	2022-10-17 14:02:05 -04:00
Andres Freund	c037471832	pgstat: Track time of the last scan of a relation It can be useful to know when a relation has last been used, e.g., when evaluating whether an index is still required. It was already possible to infer the time of the last usage by tracking, e.g., pg_stat_all_indexes.idx_scan over time. But far from everybody does so. To make it easier to detect the last time a relation has been scanned, track that time in each relation's pgstat entry. To minimize overhead a) the timestamp is updated only when the backend pending stats entry is flushed to shared stats b) the last transaction's stop timestamp is used as the timestamp. Bumps catalog and stats format versions. Author: Dave Page <dpage@pgadmin.org> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Bruce Momjian <bruce@momjian.us> Reviewed-by: Vik Fearing <vik@postgresfriends.org> Discussion: https://postgr.es/m/CA+OCxozrVHNFVEPkweUHMZje+t1tfY816d9MZYc6eZwOOusOaQ@mail.gmail.com	2022-10-14 11:11:34 -07:00
Michael Paquier	0823d061b0	Introduce SYSTEM_USER SYSTEM_USER is a reserved keyword of the SQL specification that, roughly described, is aimed at reporting some information about the system user who has connected to the database server. It may include implementation-specific information about the means by the user connected, like an authentication method. This commit implements SYSTEM_USER as of auth_method:identity, where "auth_method" is a keyword about the authentication method used to log into the server (like peer, md5, scram-sha-256, gss, etc.) and "identity" is the authentication identity as introduced by `9afffcb` (peer sets authn to the OS user name, gss to the user principal, etc.). This format has been suggested by Tom Lane. Note that thanks to `d951052`, SYSTEM_USER is available to parallel workers. Bump catalog version. Author: Bertrand Drouvot Reviewed-by: Jacob Champion, Joe Conway, Álvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/7e692b8c-0b11-45db-1cad-3afc5b57409f@amazon.com	2022-09-29 15:05:40 +09:00
Robert Haas	a448e49bcb	Revert 56-bit relfilenode change and follow-up commits. There are still some alignment-related failures in the buildfarm, which might or might not be able to be fixed quickly, but I've also just realized that it increased the size of many WAL records by 4 bytes because a block reference contains a RelFileLocator. The effect of that hasn't been studied or discussed, so revert for now.	2022-09-28 09:55:28 -04:00
Robert Haas	05d4cbf9b6	Increase width of RelFileNumbers from 32 bits to 56 bits. RelFileNumbers are now assigned using a separate counter, instead of being assigned from the OID counter. This counter never wraps around: if all 2^56 possible RelFileNumbers are used, an internal error occurs. As the cluster is limited to 2^64 total bytes of WAL, this limitation should not cause a problem in practice. If the counter were 64 bits wide rather than 56 bits wide, we would need to increase the width of the BufferTag, which might adversely impact buffer lookup performance. Also, this lets us use bigint for pg_class.relfilenode and other places where these values are exposed at the SQL level without worrying about overflow. This should remove the need to keep "tombstone" files around until the next checkpoint when relations are removed. We do that to keep RelFileNumbers from being recycled, but now that won't happen anyway. However, this patch doesn't actually change anything in this area; it just makes it possible for a future patch to do so. Dilip Kumar, based on an idea from Andres Freund, who also reviewed some earlier versions of the patch. Further review and some wordsmithing by me. Also reviewed at various points by Ashutosh Sharma, Vignesh C, Amul Sul, Álvaro Herrera, and Tom Lane. Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com	2022-09-27 13:25:21 -04:00
Robert Haas	2f47715cc8	Move RelFileNumber declarations to common/relpath.h. Previously, these were declared in postgres_ext.h, but they are not needed nearly so widely as the OID declarations, so that doesn't necessarily make sense. Also, because postgres_ext.h is included before most of c.h has been processed, the previous location creates some problems for a pending patch. Patch by me, reviewed by Dilip Kumar. Discussion: http://postgr.es/m/CA+TgmoYc8oevMqRokZQ4y_6aRn-7XQny1JBr5DyWR_jiFtONHw@mail.gmail.com	2022-09-27 12:01:57 -04:00
Peter Eisentraut	c07785d458	catversion bump for `8999f5ed3c`	2022-09-26 15:56:47 +02:00
Andres Freund	e6927270cd	meson: Add initial version of meson based build system Autoconf is showing its age, fewer and fewer contributors know how to wrangle it. Recursive make has a lot of hard to resolve dependency issues and slow incremental rebuilds. Our home-grown MSVC build system is hard to maintain for developers not using Windows and runs tests serially. While these and other issues could individually be addressed with incremental improvements, together they seem best addressed by moving to a more modern build system. After evaluating different build system choices, we chose to use meson, to a good degree based on the adoption by other open source projects. We decided that it's more realistic to commit a relatively early version of the new build system and mature it in tree. This commit adds an initial version of a meson based build system. It supports building postgres on at least AIX, FreeBSD, Linux, macOS, NetBSD, OpenBSD, Solaris and Windows (however only gcc is supported on aix, solaris). For Windows/MSVC postgres can now be built with ninja (faster, particularly for incremental builds) and msbuild (supporting the visual studio GUI, but building slower). Several aspects (e.g. Windows rc file generation, PGXS compatibility, LLVM bitcode generation, documentation adjustments) are done in subsequent commits requiring further review. Other aspects (e.g. not installing test-only extensions) are not yet addressed. When building on Windows with msbuild, builds are slower when using a visual studio version older than 2019, because those versions do not support MultiToolTask, required by meson for intra-target parallelism. The plan is to remove the MSVC specific build system in src/tools/msvc soon after reaching feature parity. However, we're not planning to remove the autoconf/make build system in the near future. Likely we're going to keep at least the parts required for PGXS to keep working around until all supported versions build with meson. Some initial help for postgres developers is at https://wiki.postgresql.org/wiki/Meson With contributions from Thomas Munro, John Naylor, Stone Tickle and others. Author: Andres Freund <andres@anarazel.de> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Reviewed-By: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Discussion: https://postgr.es/m/20211012083721.hvixq4pnh2pixr3j@alap3.anarazel.de	2022-09-21 22:37:17 -07:00
Jeff Davis	bb44a6ba48	Improve comment for OAT_POST_CREATE. Clarify that the command counter may or may not have been incremented. We may want to change the behavior to be more consistent, but until that time, at least improve the comment. Discussion: https://postgr.es/m/CAHoZxqvN2eoic_CvjsAvpryyLyA2xG8JmsyMtKFFJz_1oFhfOg%40mail.gmail.com Reported-by: Mary Xu	2022-09-20 10:52:01 -07:00
Peter Geoghegan	bfcf1b3480	Harmonize parameter names in storage and AM code. Make sure that function declarations use names that exactly match the corresponding names from function definitions in storage, catalog, access method, executor, and logical replication code, as well as in miscellaneous utility/library code. Like other recent commits that cleaned up function parameter names, this commit was written with help from clang-tidy. Later commits will do the same for other parts of the codebase. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com	2022-09-19 19:18:36 -07:00
Tom Lane	20b6847176	Fix new pg_publication_tables query. The addition of published column names forgot to filter on attisdropped, leading to cases where you could see "........pg.dropped.1........" or the like as a reportedly-published column. While we're here, rewrite the new subquery to get a more efficient plan for it. Hou Zhijie, per report from Jaime Casanova. Back-patch to v15 where the bug was introduced. (Sadly, this means we need a post-beta4 catversion bump before beta4 has even hit the streets. I see no good alternative though.) Discussion: https://postgr.es/m/Yxa1SU4nH2HfN3/i@ahch-to	2022-09-06 18:00:32 -04:00
Tom Lane	ff720a597c	Fix planner to consider matches to boolean columns in extension indexes. The planner has to special-case indexes on boolean columns, because what we need for an indexscan on such a column is a qual of the shape of "boolvar = pseudoconstant". For plain bool constants, previous simplification will have reduced this to "boolvar" or "NOT boolvar", and we have to reverse that if we want to make an indexqual. There is existing code to do so, but it only fires when the index's opfamily is BOOL_BTREE_FAM_OID or BOOL_HASH_FAM_OID. Thus extension AMs, or extension opclasses such as contrib/btree_gin, are out in the cold. The reason for hard-wiring the set of relevant opfamilies was mostly to avoid a catalog lookup in a hot code path. We can improve matters while not taking much of a performance hit by relying on the hard-wired set when the opfamily OID is visibly built-in, and only checking the catalogs when dealing with an extension opfamily. While here, rename IsBooleanOpfamily to IsBuiltinBooleanOpfamily to remind future users of that macro of its limitations. At some point we might want to make indxpath.c's improved version of the test globally accessible, but it's not presently needed elsewhere. Zongliang Quan and Tom Lane Discussion: https://postgr.es/m/f293b91d-1d46-d386-b6bb-4b06ff5c667b@yeah.net	2022-09-02 17:01:51 -04:00
Andrew Dunstan	2f2b18bd3f	Revert SQL/JSON features The reverts the following and makes some associated cleanups: commit f79b803dc: Common SQL/JSON clauses commit f4fb45d15: SQL/JSON constructors commit 5f0adec25: Make STRING an unreserved_keyword. commit 33a377608: IS JSON predicate commit 1a36bc9db: SQL/JSON query functions commit 606948b05: SQL JSON functions commit 49082c2cc: RETURNING clause for JSON() and JSON_SCALAR() commit 4e34747c8: JSON_TABLE commit fadb48b00: PLAN clauses for JSON_TABLE commit 2ef6f11b0: Reduce running time of jsonb_sqljson test commit 14d3f24fa: Further improve jsonb_sqljson parallel test commit a6baa4bad: Documentation for SQL/JSON features commit b46bcf7a4: Improve readability of SQL/JSON documentation. commit 112fdb352: Fix finalization for json_objectagg and friends commit fcdb35c32: Fix transformJsonBehavior commit 4cd8717af: Improve a couple of sql/json error messages commit f7a605f63: Small cleanups in SQL/JSON code commit 9c3d25e17: Fix JSON_OBJECTAGG uniquefying bug commit a79153b7a: Claim SQL standard compliance for SQL/JSON features commit a1e7616d6: Rework SQL/JSON documentation commit 8d9f9634e: Fix errors in copyfuncs/equalfuncs support for JSON node types. commit 3c633f32b: Only allow returning string types or bytea from json_serialize commit 67b26703b: expression eval: Fix EEOP_JSON_CONSTRUCTOR and EEOP_JSONEXPR size. The release notes are also adjusted. Backpatch to release 15. Discussion: https://postgr.es/m/40d2c882-bcac-19a9-754d-4299e1d87ac7@postgresql.org	2022-09-01 17:07:14 -04:00
Robert Haas	e3ce2de09d	Allow grant-level control of role inheritance behavior. The GRANT statement can now specify WITH INHERIT TRUE or WITH INHERIT FALSE to control whether the member inherits the granted role's permissions. For symmetry, you can now likewise write WITH ADMIN TRUE or WITH ADMIN FALSE to turn ADMIN OPTION on or off. If a GRANT does not specify WITH INHERIT, the behavior based on whether the member role is marked INHERIT or NOINHERIT. This means that if all roles are marked INHERIT or NOINHERIT before any role grants are performed, the behavior is identical to what we had before; otherwise, it's different, because ALTER ROLE [NO]INHERIT now only changes the default behavior of future grants, and has no effect on existing ones. Patch by me. Reviewed and testing by Nathan Bossart and Tushar Ahuja, with design-level comments from various others. Discussion: http://postgr.es/m/CA+Tgmoa5Sf4PiWrfxA=sGzDKg0Ojo3dADw=wAHOhR9dggV=RmQ@mail.gmail.com	2022-08-25 10:06:02 -04:00
Robert Haas	ce6b672e44	Make role grant system more consistent with other privileges. Previously, membership of role A in role B could be recorded in the catalog tables only once. This meant that a new grant of role A to role B would overwrite the previous grant. For other object types, a new grant of permission on an object - in this case role A - exists along side the existing grant provided that the grantor is different. Either grant can be revoked independently of the other, and permissions remain so long as at least one grant remains. Make role grants work similarly. Previously, when granting membership in a role, the superuser could specify any role whatsoever as the grantor, but for other object types, the grantor of record must be either the owner of the object, or a role that currently has privileges to perform a similar GRANT. Implement the same scheme for role grants, treating the bootstrap superuser as the role owner since roles do not have owners. This means that attempting to revoke a grant, or admin option on a grant, can now fail if there are dependent privileges, and that CASCADE can be used to revoke these. It also means that you can't grant ADMIN OPTION on a role back to a user who granted it directly or indirectly to you, similar to how you can't give WITH GRANT OPTION on a privilege back to a role which granted it directly or indirectly to you. Previously, only the superuser could specify GRANTED BY with a user other than the current user. Relax that rule to allow the grantor to be any role whose privileges the current user posseses. This doesn't improve compatibility with what we do for other object types, where support for GRANTED BY is entirely vestigial, but it makes this feature more usable and seems to make sense to change at the same time we're changing related behaviors. Along the way, fix "ALTER GROUP group_name ADD USER user_name" to require the same privileges as "GRANT group_name TO user_name". Previously, CREATEROLE privileges were sufficient for either, but only the former form was permissible with ADMIN OPTION on the role. Now, either CREATEROLE or ADMIN OPTION on the role suffices for either spelling. Patch by me, reviewed by Stephen Frost. Discussion: http://postgr.es/m/CA+TgmoaFr-RZeQ+WoQ5nKPv97oT9+aDgK_a5+qWHSgbDsMp1Vg@mail.gmail.com	2022-08-22 11:35:17 -04:00
Robert Haas	9288c2e6f8	Bump catversion for `6566133c5f` Omission noted by Tom Lane.	2022-08-18 15:10:06 -04:00
Robert Haas	6566133c5f	Ensure that pg_auth_members.grantor is always valid. Previously, "GRANT foo TO bar" or "GRANT foo TO bar GRANTED BY baz" would record the OID of the grantor in pg_auth_members.grantor, but that role could later be dropped without modifying or removing the pg_auth_members record. That's not great, because we typically try to avoid dangling references in catalog data. Now, a role grant depends on the grantor, and the grantor can't be dropped without removing the grant or changing the grantor. "DROP OWNED BY" will remove the grant, just as it does for other kinds of privileges. "REASSIGN OWNED BY" will not, again just like what we do in other cases involving privileges. pg_auth_members now has an OID column, because that is needed in order for dependencies to work. It also now has an index on the grantor column, because otherwise dropping a role would require a sequential scan of the entire table to see whether the role's OID is in use as a grantor. That probably wouldn't be too large a problem in practice, but it seems better to have an index just in case. A follow-on patch is planned with the goal of more thoroughly rationalizing the behavior of role grants. This patch is just trying to do enough to make sure that the data we store in the catalogs is at some basic level valid. Patch by me, reviewed by Stephen Frost Discussion: http://postgr.es/m/CA+TgmoaFr-RZeQ+WoQ5nKPv97oT9+aDgK_a5+qWHSgbDsMp1Vg@mail.gmail.com	2022-08-18 13:13:02 -04:00
Tom Lane	b9b21acc76	In extensions, don't replace objects not belonging to the extension. Previously, if an extension script did CREATE OR REPLACE and there was an existing object not belonging to the extension, it would overwrite the object and adopt it into the extension. This is problematic, first because the overwrite is probably unintentional, and second because we didn't change the object's ownership. Thus a hostile user could create an object in advance of an expected CREATE EXTENSION command, and would then have ownership rights on an extension object, which could be modified for trojan-horse-type attacks. Hence, forbid CREATE OR REPLACE of an existing object unless it already belongs to the extension. (Note that we've always forbidden replacing an object that belongs to some other extension; only the behavior for previously-free-standing objects changes here.) For the same reason, also fail CREATE IF NOT EXISTS when there is an existing object that doesn't belong to the extension. Our thanks to Sven Klemm for reporting this problem. Security: CVE-2022-2625	2022-08-08 11:12:31 -04:00
David Rowley	1349d2790b	Improve performance of ORDER BY / DISTINCT aggregates ORDER BY / DISTINCT aggreagtes have, since implemented in Postgres, been executed by always performing a sort in nodeAgg.c to sort the tuples in the current group into the correct order before calling the transition function on the sorted tuples. This was not great as often there might be an index that could have provided pre-sorted input and allowed the transition functions to be called as the rows come in, rather than having to store them in a tuplestore in order to sort them once all the tuples for the group have arrived. Here we change the planner so it requests a path with a sort order which supports the most amount of ORDER BY / DISTINCT aggregate functions and add new code to the executor to allow it to support the processing of ORDER BY / DISTINCT aggregates where the tuples are already sorted in the correct order. Since there can be many ORDER BY / DISTINCT aggregates in any given query level, it's very possible that we can't find an order that suits all of these aggregates. The sort order that the planner chooses is simply the one that suits the most aggregate functions. We take the most strictly sorted variation of each order and see how many aggregate functions can use that, then we try again with the order of the remaining aggregates to see if another order would suit more aggregate functions. For example: SELECT agg(a ORDER BY a),agg2(a ORDER BY a,b) ... would request the sort order to be {a, b} because {a} is a subset of the sort order of {a,b}, but; SELECT agg(a ORDER BY a),agg2(a ORDER BY c) ... would just pick a plan ordered by {a} (we give precedence to aggregates which are earlier in the targetlist). SELECT agg(a ORDER BY a),agg2(a ORDER BY b),agg3(a ORDER BY b) ... would choose to order by {b} since two aggregates suit that vs just one that requires input ordered by {a}. Author: David Rowley Reviewed-by: Ronan Dunklau, James Coleman, Ranier Vilela, Richard Guo, Tom Lane Discussion: https://postgr.es/m/CAApHDvpHzfo92%3DR4W0%2BxVua3BUYCKMckWAmo-2t_KiXN-wYH%3Dw%40mail.gmail.com	2022-08-02 23:11:45 +12:00
Amit Kapila	6b24d3f9cc	Move common catalog cache access routines to lsyscache.c In passing, move pg_relation_is_publishable next to similar functions. Suggested-by: Alvaro Herrera Author: Amit Kapila Reviewed-by: Hou Zhijie Discussion: https://postgr.es/m/CAHut+PupQ5UW9A9ut0Yjt21J9tHhx958z5L0k8-9hTYf_NYqxA@mail.gmail.com	2022-08-02 10:47:22 +05:30
John Naylor	c689baa158	Fix comment in pg_db_role_setting.h Noted by Japin Li Discussion: https://www.postgresql.org/message-id/MEYP282MB16691ACEDBC94161CF4BA1CCB69A9%40MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM	2022-08-02 11:49:37 +07:00
Tom Lane	283129e325	Support pg_read_[binary_]file (filename, missing_ok). There wasn't an especially nice way to read all of a file while passing missing_ok = true. Add an additional overloaded variant to support that use-case. While here, refactor the C code to avoid a rats-nest of PG_NARGS checks, instead handling the argument collection in the outer wrapper functions. It's a bit longer this way, but far more straightforward. (Upon looking at the code coverage report for genfile.c, I was impelled to also add a test case for pg_stat_file() -- tgl) Kyotaro Horiguchi Discussion: https://postgr.es/m/20220607.160520.1984541900138970018.horikyota.ntt@gmail.com	2022-07-29 15:38:49 -04:00
Robert Haas	5f858dd3be	Bump catversion for commit `d8cd0c6c95`. The catalog contents haven't changed, but it's good to make clear that initdb is required. Changing RELMAPPER_FILEMAGIC would be more appropriate, but that doesn't actually produce a useful diagnostic, so cheat by doing this instead. Discussion: http://postgr.es/m/20220727171939.6ixixqcjt5riil2o@alvherre.pgsql	2022-07-27 16:18:21 -04:00
Michael Paquier	ce3049b021	Refactor code in charge of grabbing the relations of a subscription GetSubscriptionRelations() and GetSubscriptionNotReadyRelations() share mostly the same code, which scans pg_subscription_rel and fetches all the relations of a given subscription. The only difference is that the second routine looks for all the relations not in a ready state. This commit refactors the code to use a single routine, shaving a bit of code. Author: Vignesh C Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Michael Paquier, Peter Smith Discussion: https://postgr.es/m/CALDaNm0eW-9g4G_EzHebnFT5zZoasWCS_EzZQ5BgnLZny9S=pg@mail.gmail.com	2022-07-27 19:50:06 +09:00
Amit Kapila	366283961a	Allow users to skip logical replication of data having origin. This patch adds a new SUBSCRIPTION parameter "origin". It specifies whether the subscription will request the publisher to only send changes that don't have an origin or send changes regardless of origin. Setting it to "none" means that the subscription will request the publisher to only send changes that have no origin associated. Setting it to "any" means that the publisher sends changes regardless of their origin. The default is "any". Usage: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres port=9999' PUBLICATION pub1 WITH (origin = none); This can be used to avoid loops (infinite replication of the same data) among replication nodes. This feature allows filtering only the replication data originating from WAL but for initial sync (initial copy of table data) we don't have such a facility as we can only distinguish the data based on origin from WAL. As a follow-up patch, we are planning to forbid the initial sync if the origin is specified as none and we notice that the publication tables were also replicated from other publishers to avoid duplicate data or loops. We forbid to allow creating origin with names 'none' and 'any' to avoid confusion with the same name options. Author: Vignesh C, Amit Kapila Reviewed-By: Peter Smith, Amit Kapila, Dilip Kumar, Shi yu, Ashutosh Bapat, Hayato Kuroda Discussion: https://postgr.es/m/CALDaNm0gwjY_4HFxvvty01BOT01q_fJLKQ3pWP9=9orqubhjcQ@mail.gmail.com	2022-07-21 08:47:38 +05:30
Tom Lane	2d04277121	Make serialization of Nodes' scalar-array fields more robust. When the ability to print variable-length-array fields was first added to outfuncs.c, there was no corresponding read capability, as it was used only for debug dumps of planner-internal Nodes. Not a lot of thought seems to have been put into the output format: it's just the space-separated array elements and nothing else. Later such fields appeared in Plan nodes, and still later we grew read support so that Plans could be transferred to parallel workers, but the original text format wasn't rethought. It seems inadequate to me because (a) no cross-check is possible that we got the right number of array entries, (b) we can't tell the difference between a NULL pointer and a zero-length array, and (c) except for WRITE_INDEX_ARRAY, we'd crash if a non-zero length is specified when the pointer is NULL, a situation that can arise in some fields that we currently conveniently avoid printing. Since we're currently in a campaign to make the Node infrastructure generally more it-just-works-without-thinking-about-it, now seems like a good time to improve this. Let's adopt a format similar to that used for Lists, that is "<>" for a NULL pointer or "(item item item)" otherwise. Also retool the code to not have so many copies of the identical logic. I bumped catversion out of an abundance of caution, although I think that we don't use any such array fields in Nodes that can get into the catalogs. Discussion: https://postgr.es/m/1528424.1658272135@sss.pgh.pa.us	2022-07-20 13:04:33 -04:00
Tom Lane	ff33a8c887	Remove artificial restrictions on which node types have out/read funcs. The initial version of gen_node_support.pl manually excluded most utility statement node types from having out/read support, and also some raw-parse-tree-only node types. That was mostly to keep the output comparable to the old hand-maintained code. We'd like to have out/read support for utility statements, for debugging purposes and so that they can be included in new-style SQL functions; so it's time to lift that restriction. Most if not all of the previously-excluded raw-parse-tree-only node types can appear in expression subtrees of utility statements, so they have to be handled too. We don't quite have full read support yet; certain custom_read_write node types need to have their handwritten read functions implemented before that will work. Doing this allows us to drop the previous hack in _outQuery to not dump the utilityStmt field in most cases, which means we no longer need manually-maintained out/read functions for Query, so get rid of those in favor of auto-generating them. Fix a couple of omissions in gen_node_support.pl that are exposed through having to handle more node types. catversion bump forced because somebody was sloppy about the field order in the manually-maintained Query out/read functions. (Committers should note that almost all changes in parsenodes.h are now grounds for a catversion bump.)	2022-07-13 11:48:17 -04:00
Peter Eisentraut	964d01ae90	Automatically generate node support functions Add a script to automatically generate the node support functions (copy, equal, out, and read, as well as the node tags enum) from the struct definitions. For each of the four node support files, it creates two include files, e.g., copyfuncs.funcs.c and copyfuncs.switch.c, to include in the main file. All the scaffolding of the main file stays in place. I have tried to mostly make the coverage of the output match what is currently there. For example, one could now do out/read coverage of utility statement nodes, but I have manually excluded those for now. The reason is mainly that it's easier to diff the before and after, and adding a bunch of stuff like this might require a separate analysis and review. Subtyping (TidScan -> Scan) is supported. For the hard cases, you can just write a manual function and exclude generating one. For the not so hard cases, there is a way of annotating struct fields to get special behaviors. For example, pg_node_attr(equal_ignore) has the field ignored in equal functions. (In this patch, I have only ifdef'ed out the code to could be removed, mainly so that it won't constantly have merge conflicts. It will be deleted in a separate patch. All the code comments that are worth keeping from those sections have already been moved to the header files where the structs are defined.) Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce%40enterprisedb.com	2022-07-09 08:53:59 +02:00
Peter Eisentraut	251154bebe	Remove T_Join and T_Plan These are abstract node types that don't need to have a node tag defined. Discussion: https://www.postgresql.org/message-id/2592455.1657140387%40sss.pgh.pa.us	2022-07-08 10:40:44 +02:00
Robert Haas	b0a55e4329	Change internal RelFileNode references to RelFileNumber or RelFileLocator. We have been using the term RelFileNode to refer to either (1) the integer that is used to name the sequence of files for a certain relation within the directory set aside for that tablespace/database combination; or (2) that value plus the OIDs of the tablespace and database; or occasionally (3) the whole series of files created for a relation based on those values. Using the same name for more than one thing is confusing. Replace RelFileNode with RelFileNumber when we're talking about just the single number, i.e. (1) from above, and with RelFileLocator when we're talking about all the things that are needed to locate a relation's files on disk, i.e. (2) from above. In the places where we refer to (3) as a relfilenode, instead refer to "relation storage". Since there is a ton of SQL code in the world that knows about pg_class.relfilenode, don't change the name of that column, or of other SQL-facing things that derive their name from it. On the other hand, do adjust closely-related internal terminology. For example, the structure member names dbNode and spcNode appear to be derived from the fact that the structure itself was called RelFileNode, so change those to dbOid and spcOid. Likewise, various variables with names like rnode and relnode get renamed appropriately, according to how they're being used in context. Hopefully, this is clearer than before. It is also preparation for future patches that intend to widen the relfilenumber fields from its current width of 32 bits. Variables that store a relfilenumber are now declared as type RelFileNumber rather than type Oid; right now, these are the same, but that can now more easily be changed. Dilip Kumar, per an idea from me. Reviewed also by Andres Freund. I fixed some whitespace issues, changed a couple of words in a comment, and made one other minor correction. Discussion: http://postgr.es/m/CA+TgmoamOtXbVAQf9hWFzonUo6bhhjS6toZQd7HZ-pmojtAmag@mail.gmail.com Discussion: http://postgr.es/m/CA+Tgmobp7+7kmi4gkq7Y+4AM9fTvL+O1oQ4-5gFTT+6Ng-dQ=g@mail.gmail.com Discussion: http://postgr.es/m/CAFiTN-vTe79M8uDH1yprOU64MNFE+R3ODRuA+JWf27JbhY4hJw@mail.gmail.com	2022-07-06 11:39:09 -04:00
Robert Haas	b9eb0ff09e	Rename pg_checkpointer predefined role to pg_checkpoint. This is more consistent with how other predefined roles that confer specific privileges are named. Nathan Bosart Discussion: http://postgr.es/m/CA+TgmoatH7+yYe+A8uJFNogg3VUDtFE6c-77yHAY8TRWR7oqyw@mail.gmail.com	2022-07-05 13:31:42 -04:00
Peter Eisentraut	84ad713cf8	Add result_types column to pg_prepared_statements view Containing the types of the columns returned by the prepared statement. Prompted by question from IRC user mlvzk. Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/flat/871qwpo7te.fsf@wibble.ilmari.org	2022-07-05 07:23:32 +02:00
Tom Lane	82d0ffae32	pgindent run prior to branching v15. pgperltidy and reformat-dat-files too. Not many changes.	2022-06-30 11:03:03 -04:00
Amit Kapila	0ff20288e1	Extend pg_publication_tables to display column list and row filter. Commit `923def9a53` and `52e4f0cd47` allowed to specify column lists and row filters for publication tables. This commit extends the pg_publication_tables view and pg_get_publication_tables function to display that information. This information will be useful to users and we also need this for the later commit that prohibits combining multiple publications with different column lists for the same table. Author: Hou Zhijie Reviewed By: Amit Kapila, Alvaro Herrera, Shi Yu, Takamichi Osumi Discussion: https://postgr.es/m/202204251548.mudq7jbqnh7r@alvherre.pgsql	2022-05-19 08:20:55 +05:30
Tom Lane	3ab9a63cb6	Rename JsonIsPredicate.value_type, fix JSON backend/nodes/ infrastructure. I started out with the intention to rename value_type to item_type to avoid a collision with a typedef name that appears on some platforms. Along the way, I noticed that the adjacent field "format" was not being correctly handled by the backend/nodes/ infrastructure functions: copyfuncs.c erroneously treated it as a scalar, while equalfuncs, outfuncs, and readfuncs omitted handling it at all. This looks like it might be cosmetic at the moment because the field is always NULL after parse analysis; but that's likely a bug in itself, and the code's certainly not very future-proof. Let's fix it while we can still do so without forcing an initdb on beta testers. Further study found a few other inconsistencies in the backend/nodes/ infrastructure for the recently-added JSON node types, so fix those too. catversion bumped because of potential change in stored rules. Discussion: https://postgr.es/m/526703.1652385613@sss.pgh.pa.us	2022-05-13 11:40:08 -04:00
Tom Lane	c2f436151e	Do pre-release housekeeping on catalog data. Run renumber_oids.pl to move high-numbered OIDs down, as per pre-beta tasks specified by RELEASE_CHANGES. For reference, the command was ./renumber_oids.pl --first-mapped-oid 8000 --target-oid 6205	2022-05-12 15:35:15 -04:00
Tom Lane	23e7b38bfe	Pre-beta mechanical code beautification. Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.	2022-05-12 15:17:30 -04:00
Tom Lane	2cb1272445	Rethink method for assigning OIDs to the template0 and postgres DBs. Commit `aa0105141` assigned fixed OIDs to template0 and postgres in a very ad-hoc way. Notably, instead of teaching Catalog.pm about these OIDs, the unused_oids script was just hacked to not show them as unused. That's problematic since, for example, duplicate_oids wouldn't report any future conflict. Hence, invent a macro DECLARE_OID_DEFINING_MACRO() that can be used to define an OID that is known to Catalog.pm and will participate in duplicate-detection as well as renumbering by renumber_oids.pl. (We don't anticipate renumbering these particular OIDs, but we might as well build out all the Catalog.pm infrastructure while we're here.) Another issue is that `aa0105141` neglected to touch IsPinnedObject, with the result that it now claimed template0 and postgres are pinned. The right thing to do there seems to be to teach it that no database is pinned, since in fact DROP DATABASE doesn't check for pinned-ness (and at least for these cases, that is an intentional choice). It's not clear whether this wrong answer had any visible effect, but perhaps it could have resulted in erroneous management of dependency entries. In passing, rename the TemplateDbOid macro to Template1DbOid to reduce confusion (likely we should have done that way back when we invented template0, but we didn't), and rename the OID macros for template0 and postgres to have a similar style. There are no changes to postgres.bki here, so no need for a catversion bump. Discussion: https://postgr.es/m/2935358.1650479692@sss.pgh.pa.us	2022-04-21 16:23:15 -04:00
Tom Lane	40eba064b2	Use DECLARE_TOAST_WITH_MACRO() to simplify toast-table declarations. This is needed so that renumber_oids.pl can handle renumbering shared catalog declarations, which need to provide C macros for the OIDs of the shared toast table and index. The previous method of writing a C macro separately was error-prone anyway. Also teach renumber_oids.pl about DECLARE_UNIQUE_INDEX_PKEY, as we missed doing when inventing that macro. There are no changes to postgres.bki here, so no need for a catversion bump. Discussion: https://postgr.es/m/2995325.1650487527@sss.pgh.pa.us	2022-04-21 12:02:23 -04:00
Robert Haas	8ec569479f	Apply PGDLLIMPORT markings broadly. Up until now, we've had a policy of only marking certain variables in the PostgreSQL header files with PGDLLIMPORT, but now we've decided to mark them all. This means that extensions running on Windows should no longer operate at a disadvantage as compared to extensions running on Linux: if the variable is present in a header file, it should be accessible. Discussion: http://postgr.es/m/CA+TgmoYanc1_FSfimhgiWSqVyP5KKmh5NP2BWNwDhO8Pg2vGYQ@mail.gmail.com	2022-04-08 08:16:38 -04:00
David Rowley	9d9c02ccd1	Teach planner and executor about monotonic window funcs Window functions such as row_number() always return a value higher than the previously returned value for tuples in any given window partition. Traditionally queries such as; SELECT * FROM ( SELECT , row_number() over (order by c) rn FROM t ) t WHERE rn <= 10; were executed fairly inefficiently. Neither the query planner nor the executor knew that once rn made it to 11 that nothing further would match the outer query's WHERE clause. It would blindly continue until all tuples were exhausted from the subquery. Here we implement means to make the above execute more efficiently. This is done by way of adding a pg_proc.prosupport function to various of the built-in window functions and adding supporting code to allow the support function to inform the planner if the window function is monotonically increasing, monotonically decreasing, both or neither. The planner is then able to make use of that information and possibly allow the executor to short-circuit execution by way of adding a "run condition" to the WindowAgg to allow it to determine if some of its execution work can be skipped. This "run condition" is not like a normal filter. These run conditions are only built using quals comparing values to monotonic window functions. For monotonic increasing functions, quals making use of the btree operators for <, <= and = can be used (assuming the window function column is on the left). You can see here that once such a condition becomes false that a monotonic increasing function could never make it subsequently true again. For monotonically decreasing functions the >, >= and = btree operators for the given type can be used for run conditions. The best-case situation for this is when there is a single WindowAgg node without a PARTITION BY clause. Here when the run condition becomes false the WindowAgg node can simply return NULL. No more tuples will ever match the run condition. It's a little more complex when there is a PARTITION BY clause. In this case, we cannot return NULL as we must still process other partitions. To speed this case up we pull tuples from the outer plan to check if they're from the same partition and simply discard them if they are. When we find a tuple belonging to another partition we start processing as normal again until the run condition becomes false or we run out of tuples to process. When there are multiple WindowAgg nodes to evaluate then this complicates the situation. For intermediate WindowAggs we must ensure we always return all tuples to the calling node. Any filtering done could lead to incorrect results in WindowAgg nodes above. For all intermediate nodes, we can still save some work when the run condition becomes false. We've no need to evaluate the WindowFuncs anymore. Other WindowAgg nodes cannot reference the value of these and these tuples will not appear in the final result anyway. The savings here are small in comparison to what can be saved in the top-level WingowAgg, but still worthwhile. Intermediate WindowAgg nodes never filter out tuples, but here we change WindowAgg so that the top-level WindowAgg filters out tuples that don't match the intermediate WindowAgg node's run condition. Such filters appear in the "Filter" clause in EXPLAIN for the top-level WindowAgg node. Here we add prosupport functions to allow the above to work for; row_number(), rank(), dense_rank(), count() and count(expr). It appears technically possible to do the same for min() and max(), however, it seems unlikely to be useful enough, so that's not done here. Bump catversion Author: David Rowley Reviewed-by: Andy Fan, Zhihong Yu Discussion: https://postgr.es/m/CAApHDvqvp3At8++yF8ij06sdcoo1S_b2YoaT9D4Nf+MObzsrLQ@mail.gmail.com	2022-04-08 10:34:36 +12:00
Tomas Vondra	2c7ea57e56	Revert "Logical decoding of sequences" This reverts a sequence of commits, implementing features related to logical decoding and replication of sequences: - `0da92dc530` - `80901b3291` - `b779d7d8fd` - `d5ed9da41d` - `a180c2b34d` - `75b1521dae` - `2d2232933b` - `002c9dd97a` - `05843b1aa4` The implementation has issues, mostly due to combining transactional and non-transactional behavior of sequences. It's not clear how this could be fixed, but it'll require reworking significant part of the patch. Discussion: https://postgr.es/m/95345a19-d508-63d1-860a-f5c2f41e8d40@enterprisedb.com	2022-04-07 20:06:36 +02:00
Thomas Munro	5dc0418fab	Prefetch data referenced by the WAL, take II. Introduce a new GUC recovery_prefetch. When enabled, look ahead in the WAL and try to initiate asynchronous reading of referenced data blocks that are not yet cached in our buffer pool. For now, this is done with posix_fadvise(), which has several caveats. Since not all OSes have that system call, "try" is provided so that it can be enabled where available. Better mechanisms for asynchronous I/O are possible in later work. Set to "try" for now for test coverage. Default setting to be finalized before release. The GUC wal_decode_buffer_size limits the distance we can look ahead in bytes of decoded data. The existing GUC maintenance_io_concurrency is used to limit the number of concurrent I/Os allowed, based on pessimistic heuristics used to infer that I/Os have begun and completed. We'll also not look more than maintenance_io_concurrency * 4 block references ahead. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> (earlier version) Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version) Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> (earlier version) Tested-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> (earlier version) Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> (earlier version) Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> (earlier version) Tested-by: Sait Talha Nisanci <Sait.Nisanci@microsoft.com> (earlier version) Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com	2022-04-07 19:42:14 +12:00
Andres Freund	ad401664b8	pgstat: add pg_stat_have_stats() test helper. Will be used by tests committed subsequently. Bumps catversion (this time for real, the one in `0f96965c65` got lost when rebasing over `5c279a6d35`). Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/CAAKRu_aNxL1WegCa45r=VAViCLnpOU7uNC7bTtGw+=QAPyYivw@mail.gmail.com	2022-04-07 00:21:54 -07:00
Andres Freund	0f96965c65	pgstat: add pg_stat_force_next_flush(), use it to simplify tests. In the stats collector days it was hard to write tests for the stats system, because fundamentally delivery of stats messages over UDP was not synchronous (nor guaranteed). Now we easily can force pending stats updates to be flushed synchronously. This moves stats.sql into a parallel group, there isn't a reason for it to run in isolation anymore. And it may shake out some bugs. Bumps catversion. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de	2022-04-06 23:35:56 -07:00
Jeff Davis	5c279a6d35	Custom WAL Resource Managers. Allow extensions to specify a new custom resource manager (rmgr), which allows specialized WAL. This is meant to be used by a Table Access Method or Index Access Method. Prior to this commit, only Generic WAL was available, which offers support for recovery and physical replication but not logical replication. Reviewed-by: Julien Rouhaud, Bharath Rupireddy, Andres Freund Discussion: https://postgr.es/m/ed1fb2e22d15d3563ae0eb610f7b61bb15999c0a.camel%40j-davis.com	2022-04-06 23:06:46 -07:00
Amit Kapila	79b716cfb7	Reorder subskiplsn in pg_subscription to avoid alignment issues. The column 'subskiplsn' uses TYPALIGN_DOUBLE (which has 4 bytes alignment on AIX) for storage. But the C Struct (Form_pg_subscription) has 8-byte alignment for this field, so retrieving it from storage causes an unaligned read. To fix this, we rearranged the 'subskiplsn' column in the catalog so that it naturally comes at an 8-byte boundary. We have fixed a similar problem in commit `f3b421da5f`. This patch adds a test to avoid a similar mistake in the future. Reported-by: Noah Misch Diagnosed-by: Noah Misch, Masahiko Sawada, Amit Kapila Author: Masahiko Sawada Reviewed-by: Noah Misch, Amit Kapila Discussion: https://postgr.es/m/20220401074423.GC3682158@rfd.leadboat.com https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com	2022-04-07 09:39:25 +05:30
Stephen Frost	e99546f566	Forgotten catversion bump for `39969e2a1e`	2022-04-06 15:00:07 -04:00
Stephen Frost	39969e2a1e	Remove exclusive backup mode Exclusive-mode backups have been deprecated since 9.6 (when non-exclusive backups were introduced) due to the issues they can cause should the system crash while one is running and generally because non-exclusive provides a much better interface. Further, exclusive backup mode wasn't really being tested (nor was most of the related code- like being able to log in just to stop an exclusive backup and the bits of the state machine related to that) and having to possibly deal with an exclusive backup and the backup_label file existing during pg_basebackup, pg_rewind, etc, added other complexities that we are better off without. This patch removes the exclusive backup mode, the various special cases for dealing with it, and greatly simplifies the online backup code and documentation. Authors: David Steele, Nathan Bossart Reviewed-by: Chapman Flack Discussion: https://postgr.es/m/ac7339ca-3718-3c93-929f-99e725d1172c@pgmasters.net https://postgr.es/m/CAHg+QDfiM+WU61tF6=nPZocMZvHDzCK47Kneyb0ZRULYzV5sKQ@mail.gmail.com	2022-04-06 14:41:03 -04:00
Tom Lane	a0ffa885e4	Allow granting SET and ALTER SYSTEM privileges on GUC parameters. This patch allows "PGC_SUSET" parameters to be set by non-superusers if they have been explicitly granted the privilege to do so. The privilege to perform ALTER SYSTEM SET/RESET on a specific parameter can also be granted. Such privileges are cluster-wide, not per database. They are tracked in a new shared catalog, pg_parameter_acl. Granting and revoking these new privileges works as one would expect. One caveat is that PGC_USERSET GUCs are unaffected by the SET privilege --- one could wish that those were handled by a revocable grant to PUBLIC, but they are not, because we couldn't make it robust enough for GUCs defined by extensions. Mark Dilger, reviewed at various times by Andrew Dunstan, Robert Haas, Joshua Brindle, and myself Discussion: https://postgr.es/m/3D691E20-C1D5-4B80-8BA5-6BEB63AF3029@enterprisedb.com	2022-04-06 13:24:33 -04:00
Peter Eisentraut	7ae1619bc5	Add range_agg with multirange inputs range_agg for normal ranges already existed. A lot of code can be shared. Author: Paul Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Chapman Flack <chap@anastigmatix.net> Discussion: https://www.postgresql.org/message-id/flat/007ef255-35ef-fd26-679c-f97e7a7f30c2@illuminatedcomputing.com	2022-03-30 20:16:23 +02:00
Robert Haas	9c08aea6a3	Add new block-by-block strategy for CREATE DATABASE. Because this strategy logs changes on a block-by-block basis, it avoids the need to checkpoint before and after the operation. However, because it logs each changed block individually, it might generate a lot of extra write-ahead logging if the template database is large. Therefore, the older strategy remains available via a new STRATEGY parameter to CREATE DATABASE, and a corresponding --strategy option to createdb. Somewhat controversially, this patch assembles the list of relations to be copied to the new database by reading the pg_class relation of the template database. Cross-database access like this isn't normally possible, but it can be made to work here because there can't be any connections to the database being copied, nor can it contain any in-doubt transactions. Even so, we have to use lower-level interfaces than normal, since the table scan and relcache interfaces will not work for a database to which we're not connected. The advantage of this approach is that we do not need to rely on the filesystem to determine what ought to be copied, but instead on PostgreSQL's own knowledge of the database structure. This avoids, for example, copying stray files that happen to be located in the source database directory. Dilip Kumar, with a fairly large number of cosmetic changes by me. Reviewed and tested by Ashutosh Sharma, Andres Freund, John Naylor, Greg Nancarrow, Neha Sharma. Additional feedback from Bruce Momjian, Heikki Linnakangas, Julien Rouhaud, Adam Brusselback, Kyotaro Horiguchi, Tomas Vondra, Andrew Dunstan, Álvaro Herrera, and others. Discussion: http://postgr.es/m/CA+TgmoYtcdxBjLh31DLxUXHxFVMPGzrU5_T=CYCvRyFHywSBUQ@mail.gmail.com	2022-03-29 11:48:36 -04:00
Michael Paquier	a2c84990be	Add system view pg_ident_file_mappings This view is similar to pg_hba_file_rules view, except that it is associated with the parsing of pg_ident.conf. Similarly to its cousin, this view is useful to check via SQL if changes planned in pg_ident.conf would work upon reload or restart, or to diagnose a previous failure. Bumps catalog version. Author: Julien Rouhaud Reviewed-by: Aleksander Alekseev, Michael Paquier Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud	2022-03-29 10:15:48 +09:00
Andres Freund	da4b56662f	Mark pg_stat_get_subscription_stats() strict. It accidentally was marked as non-strict. As it was introduced only in HEAD, we can just fix the catalog. Bumps catversion. Discussion: https://postgr.es/m/20220326212432.s5n2maw6kugnpyxw@alap3.anarazel.de	2022-03-27 21:47:26 -07:00
Andres Freund	43a7dc96eb	Fix NULL input behaviour of pg_stat_get_replication_slot(). pg_stat_get_replication_slot() accidentally was marked as non-strict, crashing when called with NULL input. As it's already released, introduce an explicit NULL check in 14, fix the catalog in HEAD. Bumps catversion in HEAD. Discussion: https://postgr.es/m/20220326212432.s5n2maw6kugnpyxw@alap3.anarazel.de Backpatch: 14-, where replication slot stats were introduced	2022-03-27 21:46:23 -07:00
Andrew Dunstan	f4fb45d15c	SQL/JSON constructors This patch introduces the SQL/JSON standard constructors for JSON: JSON() JSON_ARRAY() JSON_ARRAYAGG() JSON_OBJECT() JSON_OBJECTAGG() For the most part these functions provide facilities that mimic existing json/jsonb functions. However, they also offer some useful additional functionality. In addition to text input, the JSON() function accepts bytea input, which it will decode and constuct a json value from. The other functions provide useful options for handling duplicate keys and null values. This series of patches will be followed by a consolidated documentation patch. Nikita Glukhov Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru	2022-03-27 17:03:34 -04:00
Tomas Vondra	923def9a53	Allow specifying column lists for logical replication This allows specifying an optional column list when adding a table to logical replication. The column list may be specified after the table name, enclosed in parentheses. Columns not included in this list are not sent to the subscriber, allowing the schema on the subscriber to be a subset of the publisher schema. For UPDATE/DELETE publications, the column list needs to cover all REPLICA IDENTITY columns. For INSERT publications, the column list is arbitrary and may omit some REPLICA IDENTITY columns. Furthermore, if the table uses REPLICA IDENTITY FULL, column list is not allowed. The column list can contain only simple column references. Complex expressions, function calls etc. are not allowed. This restriction could be relaxed in the future. During the initial table synchronization, only columns included in the column list are copied to the subscriber. If the subscription has several publications, containing the same table with different column lists, columns specified in any of the lists will be copied. This means all columns are replicated if the table has no column list at all (which is treated as column list with all columns), or when of the publications is defined as FOR ALL TABLES (possibly IN SCHEMA that matches the schema of the table). For partitioned tables, publish_via_partition_root determines whether the column list for the root or the leaf relation will be used. If the parameter is 'false' (the default), the list defined for the leaf relation is used. Otherwise, the column list for the root partition will be used. Psql commands \dRp+ and \d <table-name> now display any column lists. Author: Tomas Vondra, Alvaro Herrera, Rahila Syed Reviewed-by: Peter Eisentraut, Alvaro Herrera, Vignesh C, Ibrar Ahmed, Amit Kapila, Hou zj, Peter Smith, Wang wei, Tang, Shi yu Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com	2022-03-26 01:01:27 +01:00
Tom Lane	ce95c54376	Fix pg_statio_all_tables view for multiple TOAST indexes. A TOAST table can normally have only one index, but there are corner cases where it has more; for example, transiently during REINDEX CONCURRENTLY. In such a case, the pg_statio_all_tables view produced multiple rows for the owning table, one per TOAST index. Refactor the view to avoid that, instead summing the stats across all the indexes, as we do for regular table indexes. While this has been wrong for a long time, back-patching seems unwise due to the difficulty of putting a system view change into back branches. Andrei Zubkov, tweaked a bit by me Discussion: https://postgr.es/m/acefef4189706971fc475f912c1afdab1c48d627.camel@moonset.ru	2022-03-24 16:33:13 -04:00
Tomas Vondra	75b1521dae	Add decoding of sequences to built-in replication This commit adds support for decoding of sequences to the built-in replication (the infrastructure was added by commit `0da92dc530`). The syntax and behavior mostly mimics handling of tables, i.e. a publication may be defined as FOR ALL SEQUENCES (replicating all sequences in a database), FOR ALL SEQUENCES IN SCHEMA (replicating all sequences in a particular schema) or individual sequences. To publish sequence modifications, the publication has to include 'sequence' action. The protocol is extended with a new message, describing sequence increments. A new system view pg_publication_sequences lists all the sequences added to a publication, both directly and indirectly. Various psql commands (\d and \dRp) are improved to also display publications including a given sequence, or sequences included in a publication. Author: Tomas Vondra, Cary Huang Reviewed-by: Peter Eisentraut, Amit Kapila, Hannu Krosing, Andres Freund, Petr Jelinek Discussion: https://postgr.es/m/d045f3c2-6cfb-06d3-5540-e63c320df8bc@enterprisedb.com Discussion: https://postgr.es/m/1710ed7e13b.cd7177461430746.3372264562543607781@highgo.ca	2022-03-24 18:49:27 +01:00
Andres Freund	3707e437c7	Add missing xlogdefs.h include to pg_subscription.h. Missed in `208c5d65bb`. The missing include causes headerscheck to fail. Per buildfarm member crake.	2022-03-22 16:46:24 -07:00
Andrew Dunstan	d11e84ea46	Add String object access hooks This caters for cases where the access is to an object identified by name rather than Oid. The first user of these is the GUC access controls Joshua Brindle and Mark Dilger Discussion: https://postgr.es/m/47F87A0E-C0E5-43A6-89F6-D403F2B45175@enterprisedb.com	2022-03-22 10:28:31 -04:00
Amit Kapila	208c5d65bb	Add ALTER SUBSCRIPTION ... SKIP. This feature allows skipping the transaction on subscriber nodes. If incoming change violates any constraint, logical replication stops until it's resolved. Currently, users need to either manually resolve the conflict by updating a subscriber-side database or by using function pg_replication_origin_advance() to skip the conflicting transaction. This commit introduces a simpler way to skip the conflicting transactions. The user can specify LSN by ALTER SUBSCRIPTION ... SKIP (lsn = XXX), which allows the apply worker to skip the transaction finished at specified LSN. The apply worker skips all data modification changes within the transaction. Author: Masahiko Sawada Reviewed-by: Takamichi Osumi, Hou Zhijie, Peter Eisentraut, Amit Kapila, Shi Yu, Vignesh C, Greg Nancarrow, Haiying Tang, Euler Taveira Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com	2022-03-22 07:11:19 +05:30
Tom Lane	cb02fcb4c9	Fix bogus dependency handling for GENERATED expressions. For GENERATED columns, we record all dependencies of the generation expression as AUTO dependencies of the column itself. This means that the generated column is silently dropped if any dependency is removed, even if CASCADE wasn't specified. This is at least a POLA violation, but I think it's actually based on a misreading of the standard. The standard does say that you can't drop a dependent GENERATED column in RESTRICT mode; but that's buried down in a subparagraph, on a different page from some pseudocode that makes it look like an AUTO drop is being suggested. Change this to be more like the way that we handle regular default expressions, ie record the dependencies as NORMAL dependencies of the pg_attrdef entry. Also, make the pg_attrdef entry's dependency on the column itself be INTERNAL not AUTO. That has two effects: * the column will go away, not just lose its default, if any dependency of the expression is dropped with CASCADE. So we don't need any special mechanism to make that happen. * it provides an additional cross-check preventing someone from dropping the default expression without dropping the column. catversion bump because of change in the contents of pg_depend (which also requires a change in one information_schema view). Per bug #17439 from Kevin Humphreys. Although this is a longstanding bug, it seems impractical to back-patch because of the need for catalog contents changes. Discussion: https://postgr.es/m/17439-7df4421197e928f0@postgresql.org	2022-03-21 14:58:49 -04:00
Tom Lane	17f3bc0928	Move pg_attrdef manipulation code into new file catalog/pg_attrdef.c. This is a pure refactoring commit: there isn't (I hope) any functional change. StoreAttrDefault and RemoveAttrDefault[ById] are moved from heap.c, reducing the size of that overly-large file by about 300 lines. I took the opportunity to trim unused #includes from heap.c, too. Two new functions for translating between a pg_attrdef OID and the relid/attnum of the owning column are created by extracting ad-hoc code from objectaddress.c. This already removes one copy of said code, and a follow-on bug fix will create more callers. The only other function directly manipulating pg_attrdef is AttrDefaultFetch. I judged it was better to leave that in relcache.c, since it shares special concerns about recursion and error handling with the rest of that module. Discussion: https://postgr.es/m/651168.1647451676@sss.pgh.pa.us	2022-03-21 14:38:23 -04:00
Peter Eisentraut	f2553d4306	Add option to use ICU as global locale provider This adds the option to use ICU as the default locale provider for either the whole cluster or a database. New options for initdb, createdb, and CREATE DATABASE are used to select this. Since some (legacy) code still uses the libc locale facilities directly, we still need to set the libc global locale settings even if ICU is otherwise selected. So pg_database now has three locale-related fields: the existing datcollate and datctype, which are always set, and a new daticulocale, which is only set if ICU is selected. A similar change is made in pg_collation for consistency, but in that case, only the libc-related fields or the ICU-related field is set, never both. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a%402ndquadrant.com	2022-03-17 11:13:16 +01:00
Tomas Vondra	c91f71b9dc	Fix publish_as_relid with multiple publications Commit `83fd4532a7` allowed publishing of changes via ancestors, for publications defined with publish_via_partition_root. But the way the ancestor was determined in get_rel_sync_entry() was incorrect, simply updating the same variable. So with multiple publications, replicating different ancestors, the outcome depended on the order of publications in the list - the value from the last loop was used, even if it wasn't the top-most ancestor. This is a probably rare situation, as in most cases publications do not overlap, so each partition has exactly one candidate ancestor to replicate as and there's no ambiguity. Fixed by tracking the "ancestor level" for each publication, and picking the top-most ancestor. Adds a test case, verifying the correct ancestor is used for publishing the changes and that this does not depend on order of publications in the list. Older releases have another bug in this loop - once all actions are replicated, the loop is terminated, on the assumption that inspecting additional publications is unecessary. But that misses the fact that those additional applications may replicate different ancestors. Fixed by removal of this break condition. We might still terminate the loop in some cases (e.g. when replicating all actions and the ancestor is the partition root). Backpatch to 13, where publish_via_partition_root was introduced. Initial report and fix by me, test added by Hou zj. Reviews and improvements by Amit Kapila. Author: Tomas Vondra, Hou zj, Amit Kapila Reviewed-by: Amit Kapila, Hou zj Discussion: https://postgr.es/m/d26d24dd-2fab-3c48-0162-2b7f84a9c893%40enterprisedb.com	2022-03-16 18:05:58 +01:00
Amit Kapila	705e20f855	Optionally disable subscriptions on error. Logical replication apply workers for a subscription can easily get stuck in an infinite loop of attempting to apply a change, triggering an error (such as a constraint violation), exiting with the error written to the subscription server log, and restarting. To partially remedy the situation, this patch adds a new subscription option named 'disable_on_error'. To be consistent with old behavior, this option defaults to false. When true, both the tablesync worker and apply worker catch any errors thrown and disable the subscription in order to break the loop. The error is still also written in the logs. Once the subscription is disabled, users can either manually resolve the conflict/error or skip the conflicting transaction by using pg_replication_origin_advance() function. After resolving the conflict, users need to enable the subscription to allow apply process to proceed. Author: Osumi Takamichi and Mark Dilger Reviewed-by: Greg Nancarrow, Vignesh C, Amit Kapila, Wang wei, Tang Haiying, Peter Smith, Masahiko Sawada, Shi Yu Discussion : https://postgr.es/m/DB35438F-9356-4841-89A0-412709EBD3AB%40enterprisedb.com	2022-03-14 09:32:40 +05:30
Michael Paquier	62ce0c758d	Fix catalog data of pg_stop_backup(), labelled v2 This function has been incorrectly marked as a set-returning function with prorows (estimated number of rows) set to 1 since its creation in `7117685`, that introduced non-exclusive backups. There is no need for that as the function is designed to return only one tuple. This commit fixes the catalog definition of pg_stop_backup_v2() so as it is not marked as proretset anymore, with prorows set to 0. This simplifies its internals by removing one tuplestore (used for one single record anyway) and by removing all the checks related to a set-returning function. Issue found during my quest to simplify some of the logic used in in-core system functions. Bump catalog version. Reviewed-by: Aleksander Alekseev, Kyotaro Horiguchi Discussion: https://postgr.es/m/Yh8guT78f1Ercfzw@paquier.xyz	2022-03-03 10:51:57 +09:00
Amit Kapila	7a85073290	Reconsider pg_stat_subscription_workers view. It was decided (refer to the Discussion link below) that the stats collector is not an appropriate place to store the error information of subscription workers. This patch changes the pg_stat_subscription_workers view (introduced by commit `8d74fc96db`) so that it stores only statistics counters: apply_error_count and sync_error_count, and has one entry for each subscription. The removed error information such as error-XID and the error message would be stored in another way in the future which is more reliable and persistent. After removing these error details, there is no longer any relation information, so the subscription statistics are now a cluster-wide statistics. The patch also changes the view name to pg_stat_subscription_stats since the word "worker" is an implementation detail that we use one worker for one tablesync and one apply. Author: Masahiko Sawada, based on suggestions by Andres Freund Reviewed-by: Peter Smith, Haiying Tang, Takamichi Osumi, Amit Kapila Discussion: https://postgr.es/m/20220125063131.4cmvsxbz2tdg6g65@alap3.anarazel.de	2022-03-01 06:17:52 +05:30
Amit Kapila	22eb12cfff	Fix few values in pg_proc for pg_stat_get_replication_slot. The function pg_stat_get_replication_slot() is not a SRF but marked incorrectly in the pg_proc. Reported-by: Michael Paquier Discussion: https://postgr.es/m/YhMk4RjoMK3CCXy2@paquier.xyz	2022-02-25 07:51:21 +05:30
Amit Kapila	52e4f0cd47	Allow specifying row filters for logical replication of tables. This feature adds row filtering for publication tables. When a publication is defined or modified, an optional WHERE clause can be specified. Rows that don't satisfy this WHERE clause will be filtered out. This allows a set of tables to be partially replicated. The row filter is per table. A new row filter can be added simply by specifying a WHERE clause after the table name. The WHERE clause must be enclosed by parentheses. The row filter WHERE clause for a table added to a publication that publishes UPDATE and/or DELETE operations must contain only columns that are covered by REPLICA IDENTITY. The row filter WHERE clause for a table added to a publication that publishes INSERT can use any column. If the row filter evaluates to NULL, it is regarded as "false". The WHERE clause only allows simple expressions that don't have user-defined functions, user-defined operators, user-defined types, user-defined collations, non-immutable built-in functions, or references to system columns. These restrictions could be addressed in the future. If you choose to do the initial table synchronization, only data that satisfies the row filters is copied to the subscriber. If the subscription has several publications in which a table has been published with different WHERE clauses, rows that satisfy ANY of the expressions will be copied. If a subscriber is a pre-15 version, the initial table synchronization won't use row filters even if they are defined in the publisher. The row filters are applied before publishing the changes. If the subscription has several publications in which the same table has been published with different filters (for the same publish operation), those expressions get OR'ed together so that rows satisfying any of the expressions will be replicated. This means all the other filters become redundant if (a) one of the publications have no filter at all, (b) one of the publications was created using FOR ALL TABLES, (c) one of the publications was created using FOR ALL TABLES IN SCHEMA and the table belongs to that same schema. If your publication contains a partitioned table, the publication parameter publish_via_partition_root determines if it uses the partition's row filter (if the parameter is false, the default) or the root partitioned table's row filter. Psql commands \dRp+ and \d <table-name> will display any row filters. Author: Hou Zhijie, Euler Taveira, Peter Smith, Ajin Cherian Reviewed-by: Greg Nancarrow, Haiying Tang, Amit Kapila, Tomas Vondra, Dilip Kumar, Vignesh C, Alvaro Herrera, Andres Freund, Wei Wang Discussion: https://www.postgresql.org/message-id/flat/CAHE3wggb715X%2BmK_DitLXF25B%3DjE6xyNCH4YOwM860JR7HarGQ%40mail.gmail.com	2022-02-22 08:11:50 +05:30
Peter Eisentraut	37851a8b83	Database-level collation version tracking This adds to database objects the same version tracking that collation objects have. There is a new pg_database column datcollversion that stores the version, a new function pg_database_collation_actual_version() to get the version from the operating system, and a new subcommand ALTER DATABASE ... REFRESH COLLATION VERSION. This was not originally added together with pg_collation.collversion, since originally version tracking was only supported for ICU, and ICU on a database-level is not currently supported. But we now have version tracking for glibc (since PG13), FreeBSD (since PG14), and Windows (since PG13), so this is useful to have now. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f0ff3190-29a3-5b39-a179-fa32eee57db6%40enterprisedb.com	2022-02-14 08:27:26 +01:00
Fujii Masao	400fc6b648	Add min() and max() aggregates for xid8. Bump catalog version. Author: Ken Kato Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/47d77b18c44f87f8222c4c7a3e2dee6b@oss.nttdata.com	2022-02-10 12:33:41 +09:00
Peter Eisentraut	94aa7cc5f7	Add UNIQUE null treatment option The SQL standard has been ambiguous about whether null values in unique constraints should be considered equal or not. Different implementations have different behaviors. In the SQL:202x draft, this has been formalized by making this implementation-defined and adding an option on unique constraint definitions UNIQUE [ NULLS [NOT] DISTINCT ] to choose a behavior explicitly. This patch adds this option to PostgreSQL. The default behavior remains UNIQUE NULLS DISTINCT. Making this happen in the btree code is pretty easy; most of the patch is just to carry the flag around to all the places that need it. The CREATE UNIQUE INDEX syntax extension is not from the standard, it's my own invention. I named all the internal flags, catalog columns, etc. in the negative ("nulls not distinct") so that the default PostgreSQL behavior is the default if the flag is false. Reviewed-by: Maxim Orlov <orlovmg@gmail.com> Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/84e5ee1b-387e-9a54-c326-9082674bde78@enterprisedb.com	2022-02-03 11:48:21 +01:00
Peter Eisentraut	87669de72c	Some cleanup for change of collate and ctype fields to type text Some cleanup for commit 54637508f87bd5f07fb9406bac6b08240283be3b: Reformat pg_database.dat to reflect the new field order. Also update the corresponding example in bki.sgml. Reorder the way the fields are filled in dbcommands.c to correspond to the new order.	2022-02-02 11:58:55 +01:00
Michael Paquier	d10e41d423	Introduce pg_settings_get_flags() to find flags associated to a GUC The most meaningful flags are shown, which are the ones useful for the user and for automating and extending the set of tests supported currently by check_guc. This script may actually be removed in the future, but we are not completely sure yet if and how we want to support the remaining sanity checks performed there, that are now integrated in the main regression test suite as of this commit. Thanks also to Peter Eisentraut and Kyotaro Horiguchi for the discussion. Bump catalog version. Author: Justin Pryzby Discussion: https://postgr.es/m/20211129030833.GJ17618@telsasoft.com	2022-01-31 08:56:41 +09:00
Peter Eisentraut	54637508f8	Change collate and ctype fields to type text This changes the data type of the catalog fields datcollate, datctype, collcollate, and collctype from name to text. There wasn't ever a really good reason for them to be of type name; presumably this was just carried over from when they were fixed-size fields in pg_control, first into the corresponding pg_database fields, and then to pg_collation. The values are not identifiers or object names, and we don't ever look them up that way. Changing to type text saves space in the typical case, since locale names are typically only a few bytes long. But it is also possible that an ICU locale name with several customization options appended could be longer than 63 bytes, so this also enables that case, which was previously probably broken. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a@2ndquadrant.com	2022-01-27 08:54:25 +01:00
Robert Haas	aa01051418	pg_upgrade: Preserve database OIDs. Commit `9a974cbcba` arranged to preserve relfilenodes and tablespace OIDs. For similar reasons, also arrange to preserve database OIDs. One problem is that, up until now, the OIDs assigned to the template0 and postgres databases have not been fixed. This could be a problem when upgrading, because pg_upgrade might try to migrate a database from the old cluster to the new cluster while keeping the OID and find a different database with that OID, resulting in a failure. If it finds a database with the same name and the same OID that's OK: it will be dropped and recreated. But the same OID and a different name is a problem. To prevent that, fix the OIDs for postgres and template0 to specific values less than 16384. To avoid running afoul of this rule, these values should not be changed in future releases. It's not a problem that these OIDs aren't fixed in existing releases, because the OIDs that we're assigning here weren't used for either of these databases in any previous release. Thus, there's no chance that an upgrade of a cluster from any previous release will collide with the OIDs we're assigning here. And going forward, the OIDs will always be fixed, so the only potential collision is with a system database having the same name and the same OID, which is OK. This patch lets users assign a specific OID to a database as well, provided however that it can't be less than 16384. I (rhaas) thought it might be better not to expose this capability to users, but the consensus was otherwise, so the syntax is documented. Letting users assign OIDs below 16384 would not be OK, though, because a user-created database with a low-numbered OID might collide with a system-created database in a future release. We therefore prohibit that. Shruthi KC, based on an earlier patch from Antonin Houska, reviewed and with some adjustments by me. Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com Discussion: http://postgr.es/m/CAASxf_Mnwm1Dh2vd5FAhVX6S1nwNSZUB1z12VddYtM++H2+p7w@mail.gmail.com	2022-01-24 14:23:43 -05:00
Robert Haas	ab4fd4f868	Remove 'datlastsysoid'. It hasn't been used for anything for a long time. Up until recently, we still queried it when dumping very old servers, but since commit `30e7c175b8`, there's no longer any code at all that cares about it. Discussion: http://postgr.es/m/CA+Tgmoa14=BRq0WEd0eevjEMn9EkghDB1FZEkBw7+UAb7tF49A@mail.gmail.com	2022-01-20 09:01:12 -05:00
Tom Lane	a3d6264bbc	interval_out() must be marked STABLE, not IMMUTABLE. Its results vary depending on the IntervalStyle GUC, so it cannot be considered immutable. This is an extremely ancient bug. AFAICT it was a sloppy mistake in `6f58115dd`, which marked it "cacheable" alongside marking several other interval functions that way. At the time, interval_out() depended on DateStyle not IntervalStyle, but it was still wrong. Back-patching this change doesn't look very practical, so I won't. Aside from the usual difficulties of getting catalog changes applied to existing databases, people might have indexes, generated columns, etc that depend on interval-to-text casts being considered immutable. (This'd not really give them any problem as long as they never change IntervalStyle.) They wouldn't appreciate us breaking such usage in minor releases. Per bug #17371 from Marcus Gartner. Discussion: https://postgr.es/m/17371-8f57e6e9ca5e35bf@postgresql.org	2022-01-19 17:17:55 -05:00
Robert Haas	9a974cbcba	pg_upgrade: Preserve relfilenodes and tablespace OIDs. Currently, database OIDs, relfilenodes, and tablespace OIDs can all change when a cluster is upgraded using pg_upgrade. It seems better to preserve them, because (1) it makes troubleshooting pg_upgrade easier, since you don't have to do a lot of work to match up files in the old and new clusters, (2) it allows 'rsync' to save bandwidth when used to re-sync a cluster after an upgrade, and (3) if we ever encrypt or sign blocks, we would likely want to use a nonce that depends on these values. This patch only arranges to preserve relfilenodes and tablespace OIDs. The task of preserving database OIDs is left for another patch, since it involves some complexities that don't exist in these cases. Database OIDs have a similar issue, but there are some tricky points in that case that do not apply to these cases, so that problem is left for another patch. Shruthi KC, based on an earlier patch from Antonin Houska, reviewed and with some adjustments by me. Discussion: http://postgr.es/m/CA+TgmoYgTwYcUmB=e8+hRHOFA0kkS6Kde85+UNdon6q7bt1niQ@mail.gmail.com	2022-01-17 13:40:27 -05:00
Tomas Vondra	269b532aef	Add stxdinherit flag to pg_statistic_ext_data Add pg_statistic_ext_data.stxdinherit flag, so that for each extended statistics definition we can store two versions of data - one for the relation alone, one for the whole inheritance tree. This is analogous to pg_statistic.stainherit, but we failed to include such flag in catalogs for extended statistics, and we had to work around it (see commits `859b3003de`, `36c4bc6e72` and `20b9fa308e`). This changes the relationship between the two catalogs storing extended statistics objects (pg_statistic_ext and pg_statistic_ext_data). Until now, there was a simple 1:1 mapping - for each definition there was one pg_statistic_ext_data row, and this row was inserted while creating the statistics (and then updated during ANALYZE). With the stxdinherit flag, we don't know how many rows there will be (child relations may be added after the statistics object is defined), so there may be up to two rows. We could make CREATE STATISTICS to always create both rows, but that seems wasteful - without partitioning we only need stxdinherit=false rows, and declaratively partitioned tables need only stxdinherit=true. So we no longer initialize pg_statistic_ext_data in CREATE STATISTICS, and instead make that a responsibility of ANALYZE. Which is what we do for regular statistics too. Patch by me, with extensive improvements and fixes by Justin Pryzby. Author: Tomas Vondra, Justin Pryzby Reviewed-by: Tomas Vondra, Justin Pryzby Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com	2022-01-16 13:38:01 +01:00
Alvaro Herrera	025b920a3d	Add index on pg_publication_rel.prpubid This should have been added for the benefit of GetPublicationRelations; let's add it now. I couldn't measure a performance difference in the TAP tests, but that may be because the tests use very few publications. Discussion: https://postgr.es/m/202201120041.p24wvsfcsope@alvherre.pgsql	2022-01-12 16:24:26 -03:00
Bruce Momjian	27b77ecf9f	Update copyright for 2022 Backpatch-through: 10	2022-01-07 19:04:57 -05:00
Michael Paquier	fc95d35b94	Correct comment and some documentation about REPLICA_IDENTITY_INDEX catalog/pg_class.h was stating that REPLICA_IDENTITY_INDEX with a dropped index is equivalent to REPLICA_IDENTITY_DEFAULT. The code tells a different story, as it is equivalent to REPLICA_IDENTITY_NOTHING. The behavior exists since the introduction of replica identities, and `fe7fd4e` even added tests for this case but I somewhat forgot to fix this comment. While on it, this commit reorganizes the documentation about replica identities on the ALTER TABLE page, and a note is added about the case of dropped indexes with REPLICA_IDENTITY_INDEX. Author: Michael Paquier, Wei Wang Reviewed-by: Euler Taveira Discussion: https://postgr.es/m/OS3PR01MB6275464AD0A681A0793F56879E759@OS3PR01MB6275.jpnprd01.prod.outlook.com Backpatch-through: 10	2021-12-22 16:37:58 +09:00
Tom Lane	189699dd36	Remove unimplemented/undocumented geometric functions & operators. Nobody has filled in these stubs for upwards of twenty years, so it's time to drop the idea that they might get implemented any day now. The associated pg_operator and pg_proc entries are just confusing wastes of space. Per complaint from Anton Voloshin. Discussion: https://postgr.es/m/3426566.1638832718@sss.pgh.pa.us	2021-12-13 18:08:28 -05:00
Tom Lane	07eee5a0dc	Create a new type category for "internal use" types. Historically we've put type "char" into the S (String) typcategory, although calling it a string is a stretch considering it can only store one byte. (In our actual usage, it's more like an enum.) This choice now seems wrong in view of the special heuristics that parse_func.c and parse_coerce.c have for TYPCATEGORY_STRING: it's not a great idea for "char" to have those preferential casting behaviors. Worse than that, recent patches inventing special-purpose types like pg_node_tree have assigned typcategory S to those types, meaning they also get preferential casting treatment that's designed on the assumption that they can hold arbitrary text. To fix, invent a new category TYPCATEGORY_INTERNAL for internal-use types, and assign that to all these types. I used code 'Z' for lack of a better idea ('I' was already taken). This change breaks one query in psql/describe.c, which now needs to explicitly cast a catalog "char" column to text before concatenating it with an undecorated literal. Also, a test case in contrib/citext now needs an explicit cast to convert citext to "char". Since the point of this change is to not have "char" be a surprisingly-available cast target, these breakages seem OK. Per report from Ian Campbell. Discussion: https://postgr.es/m/2216388.1638480141@sss.pgh.pa.us	2021-12-11 14:10:51 -05:00
Peter Eisentraut	d6f96ed94e	Allow specifying column list for foreign key ON DELETE SET actions Extend the foreign key ON DELETE actions SET NULL and SET DEFAULT by allowing the specification of a column list, like CREATE TABLE posts ( ... FOREIGN KEY (tenant_id, author_id) REFERENCES users ON DELETE SET NULL (author_id) ); If a column list is specified, only those columns are set to null/default, instead of all the columns in the foreign-key constraint. This is useful for multitenant or sharded schemas, where the tenant or shard ID is included in the primary key of all tables but shouldn't be set to null. Author: Paul Martinez <paulmtz@google.com> Discussion: https://www.postgresql.org/message-id/flat/CACqFVBZQyMYJV=njbSMxf+rbDHpx=W=B7AEaMKn8dWn9OZJY7w@mail.gmail.com	2021-12-08 11:13:57 +01:00
Amit Kapila	1a2aaeb0db	Fix changing the ownership of ALL TABLES IN SCHEMA publication. Ensure that the new owner of ALL TABLES IN SCHEMA publication must be a superuser. The same is already ensured during CREATE PUBLICATION. Author: Vignesh C Reviewed-by: Nathan Bossart, Greg Nancarrow, Michael Paquier, Haiying Tang Discussion: https://postgr.es/m/CALDaNm0E5U-RqxFuFrkZrQeG7ae5trGa=xs=iRtPPHULtT4zOw@mail.gmail.com	2021-12-08 11:31:16 +05:30
Peter Eisentraut	37b2764593	Some RELKIND macro refactoring Add more macros to group some RELKIND_* macros: - RELKIND_HAS_PARTITIONS() - RELKIND_HAS_TABLESPACE() - RELKIND_HAS_TABLE_AM() Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/flat/a574c8f1-9c84-93ad-a9e5-65233d6fc00f%40enterprisedb.com	2021-12-03 14:08:19 +01:00
Amit Kapila	8d74fc96db	Add a view to show the stats of subscription workers. This commit adds a new system view pg_stat_subscription_workers, that shows information about any errors which occur during the application of logical replication changes as well as during performing initial table synchronization. The subscription statistics entries are removed when the corresponding subscription is removed. It also adds an SQL function pg_stat_reset_subscription_worker() to reset single subscription errors. The contents of this view can be used by an upcoming patch that skips the particular transaction that conflicts with the existing data on the subscriber. This view can be extended in the future to track other xact related statistics like the number of xacts committed/aborted for subscription workers. Author: Masahiko Sawada Reviewed-by: Greg Nancarrow, Hou Zhijie, Tang Haiying, Vignesh C, Dilip Kumar, Takamichi Osumi, Amit Kapila Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com	2021-11-30 08:54:30 +05:30
Michael Paquier	1922d7c6e1	Add SQL functions to monitor the directory contents of replication slots This commit adds a set of functions able to look at the contents of various paths related to replication slots: - pg_ls_logicalsnapdir, for pg_logical/snapshots/ - pg_ls_logicalmapdir, for pg_logical/mappings/ - pg_ls_replslotdir, for pg_replslot/<slot_name>/ These are intended to be used by monitoring tools. Unlike pg_ls_dir(), execution permission can be granted to non-superusers. Roles members of pg_monitor gain have access to those functions. Bump catalog version. Author: Bharath Rupireddy Reviewed-by: Nathan Bossart, Justin Pryzby Discussion: https://postgr.es/m/CALj2ACWsfizZjMN6bzzdxOk1ADQQeSw8HhEjhmVXn_Pu+7VzLw@mail.gmail.com	2021-11-23 19:29:42 +09:00
Tom Lane	a148f8bc04	Add a planner support function for starts_with(). This fills in some gaps in planner support for starts_with() and the equivalent ^@ operator: * A condition such as "textcol ^@ constant" can now use a regular btree index, not only an SP-GiST index, so long as the index's collation is C. (This works just like "textcol LIKE 'foo%'".) * "starts_with(textcol, constant)" can be optimized the same as "textcol ^@ constant". * Fixed-prefix LIKE and regex patterns are now more like starts_with() in another way: if you apply one to an SPGiST-indexed column, you'll get an index condition using ^@ rather than two index conditions with >= and <. Per a complaint from Shay Rojansky. Patch by me; thanks to Nathan Bossart for review. Discussion: https://postgr.es/m/232599.1633800229@sss.pgh.pa.us	2021-11-17 16:54:12 -05:00
Jeff Davis	4168a47454	Add pg_checkpointer predefined role for CHECKPOINT command. Any user with the privileges of pg_checkpointer can issue a CHECKPOINT command. Reviewed-by: Stephen Frost Discussion: https://postgr.es/m/67a1d667e8ec228b5e07f232184c80348c5d93f4.camel%40j-davis.com	2021-11-09 16:59:14 -08:00
Jeff Davis	77ea4f9439	Grant memory views to pg_read_all_stats. Grant privileges on views pg_backend_memory_contexts and pg_shmem_allocations to the role pg_read_all_stats. Also grant on the underlying functions that those views depend on. Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Nathan Bossart <bossartn@amazon.com> Discussion: https://postgr.es/m/CALj2ACWAZo3Ar_EVsn2Zf9irG+hYK3cmh1KWhZS_Od45nd01RA@mail.gmail.com	2021-10-27 14:06:30 -07:00
Amit Kapila	5a2832465f	Allow publishing the tables of schema. A new option "FOR ALL TABLES IN SCHEMA" in Create/Alter Publication allows one or more schemas to be specified, whose tables are selected by the publisher for sending the data to the subscriber. The new syntax allows specifying both the tables and schemas. For example: CREATE PUBLICATION pub1 FOR TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2; OR ALTER PUBLICATION pub1 ADD TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2; A new system table "pg_publication_namespace" has been added, to maintain the schemas that the user wants to publish through the publication. Modified the output plugin (pgoutput) to publish the changes if the relation is part of schema publication. Updates pg_dump to identify and dump schema publications. Updates the \d family of commands to display schema publications and \dRp+ variant will now display associated schemas if any. Author: Vignesh C, Hou Zhijie, Amit Kapila Syntax-Suggested-by: Tom Lane, Alvaro Herrera Reviewed-by: Greg Nancarrow, Masahiko Sawada, Hou Zhijie, Amit Kapila, Haiying Tang, Ajin Cherian, Rahila Syed, Bharath Rupireddy, Mark Dilger Tested-by: Haiying Tang Discussion: https://www.postgresql.org/message-id/CALDaNm0OANxuJ6RXqwZsM1MSY4s19nuH3734j4a72etDwvBETQ@mail.gmail.com	2021-10-27 07:44:52 +05:30
Jeff Davis	f0b051e322	Allow GRANT on pg_log_backend_memory_contexts(). Remove superuser check, allowing any user granted permissions on pg_log_backend_memory_contexts() to log the memory contexts of any backend. Note that this could allow a privileged non-superuser to log the memory contexts of a superuser backend, but as discussed, that does not seem to be a problem. Reviewed-by: Nathan Bossart, Bharath Rupireddy, Michael Paquier, Kyotaro Horiguchi, Andres Freund Discussion: https://postgr.es/m/e5cf6684d17c8d1ef4904ae248605ccd6da03e72.camel@j-davis.com	2021-10-26 13:31:38 -07:00
Alvaro Herrera	ff9f111bce	Fix WAL replay in presence of an incomplete record Physical replication always ships WAL segment files to replicas once they are complete. This is a problem if one WAL record is split across a segment boundary and the primary server crashes before writing down the segment with the next portion of the WAL record: WAL writing after crash recovery would happily resume at the point where the broken record started, overwriting that record ... but any standby or backup may have already received a copy of that segment, and they are not rewinding. This causes standbys to stop following the primary after the latter crashes: LOG: invalid contrecord length 7262 at A8/D9FFFBC8 because the standby is still trying to read the continuation record (contrecord) for the original long WAL record, but it is not there and it will never be. A workaround is to stop the replica, delete the WAL file, and restart it -- at which point a fresh copy is brought over from the primary. But that's pretty labor intensive, and I bet many users would just give up and re-clone the standby instead. A fix for this problem was already attempted in commit `515e3d84a0`, but it only addressed the case for the scenario of WAL archiving, so streaming replication would still be a problem (as well as other things such as taking a filesystem-level backup while the server is down after having crashed), and it had performance scalability problems too; so it had to be reverted. This commit fixes the problem using an approach suggested by Andres Freund, whereby the initial portion(s) of the split-up WAL record are kept, and a special type of WAL record is written where the contrecord was lost, so that WAL replay in the replica knows to skip the broken parts. With this approach, we can continue to stream/archive segment files as soon as they are complete, and replay of the broken records will proceed across the crash point without a hitch. Because a new type of WAL record is added, users should be careful to upgrade standbys first, primaries later. Otherwise they risk the standby being unable to start if the primary happens to write such a record. A new TAP test that exercises this is added, but the portability of it is yet to be seen. This has been wrong since the introduction of physical replication, so backpatch all the way back. In stable branches, keep the new XLogReaderState members at the end of the struct, to avoid an ABI break. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Nathan Bossart <bossartn@amazon.com> Discussion: https://postgr.es/m/202108232252.dh7uxf6oxwcy@alvherre.pgsql	2021-09-29 11:21:51 -03:00
Amit Kapila	4548c76738	Invalidate all partitions for a partitioned table in publication. Updates/Deletes on a partition were allowed even without replica identity after the parent table was added to a publication. This would later lead to an error on subscribers. The reason was that we were not invalidating the partition's relcache and the publication information for partitions was not getting rebuilt. Similarly, we were not invalidating the partitions' relcache after dropping a partitioned table from a publication which will prohibit Updates/Deletes on its partition without replica identity even without any publication. Reported-by: Haiying Tang Author: Hou Zhijie and Vignesh C Reviewed-by: Vignesh C and Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com	2021-09-22 08:00:54 +05:30
Noah Misch	b073c3ccd0	Revoke PUBLIC CREATE from public schema, now owned by pg_database_owner. This switches the default ACL to what the documentation has recommended since CVE-2018-1058. Upgrades will carry forward any old ownership and ACL. Sites that declined the 2018 recommendation should take a fresh look. Recipes for commissioning a new database cluster from scratch may need to create a schema, grant more privileges, etc. Out-of-tree test suites may require such updates. Reviewed by Peter Eisentraut. Discussion: https://postgr.es/m/20201031163518.GB4039133@rfd.leadboat.com	2021-09-09 23:38:09 -07:00
Alvaro Herrera	0c6828fa98	Add PublicationTable and PublicationRelInfo structs These encapsulate a relation when referred from replication DDL. Currently they don't do anything useful (they're just wrappers around RangeVar and Relation respectively) but in the future they'll be used to carry column lists. Extracted from a larger patch by Rahila Syed. Author: Rahila Syed <rahilasyed90@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com	2021-09-06 14:24:50 -03:00
Tom Lane	388e71af88	Make timetz_zone() stable, and correct a bug for DYNTZ abbreviations. Historically, timetz_zone() has used time(NULL) as the reference point for deciding whether DST is active. That means its result can change intra-statement, requiring it to be marked VOLATILE (cf. `35979e6c3`). But that definition is pretty inconsistent with the way we deal with timestamps elsewhere. Let's make it use the transaction start time ("now()") as the reference point instead. That lets it be marked STABLE, and also saves a kernel call per invocation. While at it, remove the function's use of pg_time_t and pg_localtime. Those are inconsistent with the other code in this area, which indeed created a bug: timetz_zone() delivered completely wrong answers if the zone was specified by a dynamic TZ abbreviation. (We need to do something about that in the back branches, but the fix will look different from this.) Aleksander Alekseev and Tom Lane Discussion: https://postgr.es/m/CAJ7c6TOMG8zSNEZtCn5SPe+cCk3Lfxb71ZaQwT2F4T7PJ_t=KA@mail.gmail.com	2021-09-06 11:03:56 -04:00
John Naylor	0c6a6a0ab7	Set the volatility of the timestamptz version of date_bin() back to immutable `543f36b43d` was too hasty in thinking that the volatility of date_bin() had to match date_trunc(), since only the latter references session_timezone. Bump catversion Per feedback from Aleksander Alekseev Backpatch to v14, as the former commit was	2021-09-03 13:39:16 -04:00
Fujii Masao	e04267844a	Enhance pg_stat_reset_single_table_counters function. This commit allows pg_stat_reset_single_table_counters() to reset statistics for a single relation shared across all databases in the cluster to zero. Bump catalog version. Author: B Sadhu Prasad Patro Reviewed-by: Mahendra Singh Thalor, Himanshu Upadhyaya, Dilip Kumar, Fujii Masao Discussion: https://postgr.es/m/CAFF0-CGy7EHeF=AqqkGMF85cySPQBgDcvNk73G2O0vL94O5U5A@mail.gmail.com	2021-09-02 14:01:06 +09:00
John Naylor	543f36b43d	Mark the timestamptz variant of date_bin() as stable Previously, it was immutable by lack of marking. This is not correct, since the time zone could change. Bump catversion Discussion: https://www.postgresql.org/message-id/CAFBsxsG2UHk8mOWL0tca%3D_cg%2B_oA5mVRNLhDF0TBw980iOg5NQ%40mail.gmail.com Backpatch to v14, when this function came in	2021-08-31 15:18:26 -04:00
Amit Kapila	29b5905470	Fix toast rewrites in logical decoding. Commit `325f2ec555` introduced pg_class.relwrite to skip operations on tables created as part of a heap rewrite during DDL. It links such transient heaps to the original relation OID via this new field in pg_class but forgot to do anything about toast tables. So, logical decoding was not able to skip operations on internally created toast tables. This leads to an error when we tried to decode the WAL for the next operation for which it appeared that there is a toast data where actually it didn't have any toast data. To fix this, we set pg_class.relwrite for internally created toast tables as well which allowed skipping operations on them during logical decoding. Author: Bertrand Drouvot Reviewed-by: David Zhang, Amit Kapila Backpatch-through: 11, where it was introduced Discussion: https://postgr.es/m/b5146fb1-ad9e-7d6e-f980-98ed68744a7c@amazon.com	2021-08-25 09:53:07 +05:30
Tom Lane	6b71c925cb	Prevent ALTER TYPE/DOMAIN/OPERATOR from changing extension membership. If recordDependencyOnCurrentExtension is invoked on a pre-existing, free-standing object during an extension update script, that object will become owned by the extension. In our current code this is possible in three cases: * Replacing a "shell" type or operator. * CREATE OR REPLACE overwriting an existing object. * ALTER TYPE SET, ALTER DOMAIN SET, and ALTER OPERATOR SET. The first of these cases is intentional behavior, as noted by the existing comments for GenerateTypeDependencies. It seems like appropriate behavior for CREATE OR REPLACE too; at least, the obvious alternatives are not better. However, the fact that it happens during ALTER is an artifact of trying to share code (GenerateTypeDependencies and makeOperatorDependencies) between the CREATE and ALTER cases. Since an extension script would be unlikely to ALTER an object that didn't already belong to the extension, this behavior is not very troubling for the direct target object ... but ALTER TYPE SET will recurse to dependent domains, and it is very uncool for those to become owned by the extension if they were not already. Let's fix this by redefining the ALTER cases to never change extension membership, full stop. We could minimize the behavioral change by only changing the behavior when ALTER TYPE SET is recursing to a domain, but that would complicate the code and it does not seem like a better definition. Per bug #17144 from Alex Kozhemyakin. Back-patch to v13 where ALTER TYPE SET was added. (The other cases are older, but since they only affect the directly-named object, there's not enough of a problem to justify changing the behavior further back.) Discussion: https://postgr.es/m/17144-e67d7a8f049de9af@postgresql.org	2021-08-17 14:29:22 -04:00
Tom Lane	6424337073	Add assorted new regexp_xxx SQL functions. This patch adds new functions regexp_count(), regexp_instr(), regexp_like(), and regexp_substr(), and extends regexp_replace() with some new optional arguments. All these functions follow the definitions used in Oracle, although there are small differences in the regexp language due to using our own regexp engine -- most notably, that the default newline-matching behavior is different. Similar functions appear in DB2 and elsewhere, too. Aside from easing portability, these functions are easier to use for certain tasks than our existing regexp_match[es] functions. Gilles Darold, heavily revised by me Discussion: https://postgr.es/m/fc160ee0-c843-b024-29bb-97b5da61971f@darold.net	2021-08-03 13:08:49 -04:00
Dean Rasheed	085f931f52	Allow numeric scale to be negative or greater than precision. Formerly, when specifying NUMERIC(precision, scale), the scale had to be in the range [0, precision], which was per SQL spec. This commit extends the range of allowed scales to [-1000, 1000], independent of the precision (whose valid range remains [1, 1000]). A negative scale implies rounding before the decimal point. For example, a column might be declared with a scale of -3 to round values to the nearest thousand. Note that the display scale remains non-negative, so in this case the display scale will be zero, and all digits before the decimal point will be displayed. A scale greater than the precision supports fractional values with zeros immediately after the decimal point. Take the opportunity to tidy up the code that packs, unpacks and validates the contents of a typmod integer, encapsulating it in a small set of new inline functions. Bump the catversion because the allowed contents of atttypmod have changed for numeric columns. This isn't a change that requires a re-initdb, but negative scale values in the typmod would confuse old backends. Dean Rasheed, with additional improvements by Tom Lane. Reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCWdNLgpKihmURF8nfofP0RFtAKJ7ktY6GcZOPnMfUoRqA@mail.gmail.com	2021-07-26 14:13:47 +01:00
Alexander Korotkov	f157db8622	Forgotten catversion bump for `9e3c217bd9`	2021-07-18 21:08:44 +03:00
Alexander Korotkov	9e3c217bd9	Support for unnest(multirange) It has been spotted that multiranges lack of ability to decompose them into individual ranges. Subscription and proper expanded object representation require substantial work, and it's too late for v14. This commit provides the implementation of unnest(multirange), which is quite trivial. unnest(multirange) is defined as a polymorphic procedure. Catversion is bumped. Reported-by: Jonathan S. Katz Discussion: https://postgr.es/m/flat/60258efe-bd7e-4886-82e1-196e0cac5433%40postgresql.org Author: Alexander Korotkov Reviewed-by: Justin Pryzby, Jonathan S. Katz, Zhihong Yu, Tom Lane Reviewed-by: Alvaro Herrera	2021-07-18 21:07:24 +03:00
Tom Lane	a49d081235	Replace explicit PIN entries in pg_depend with an OID range test. As of v14, pg_depend contains almost 7000 "pin" entries recording the OIDs of built-in objects. This is a fair amount of bloat for every database, and it adds time to pg_depend lookups as well as initdb. We can get rid of all of those entries in favor of an OID range check, i.e. "OIDs below FirstUnpinnedObjectId are pinned". (template1 and the public schema are exceptions. Those exceptions are now wired into IsPinnedObject() instead of initdb's code for filling pg_depend, but it's the same amount of cruft either way.) The contents of pg_shdepend are modified likewise. Discussion: https://postgr.es/m/3737988.1618451008@sss.pgh.pa.us	2021-07-15 11:41:47 -04:00
Alexander Korotkov	768ea9bcf9	Fix small inconsistencies in catalog definition of multirange operators This commit fixes the description of a couple of multirange operators and oprjoin for another multirange operator. The change of oprjoin is more cosmetic since both old and new functions return the same constant. These cosmetic changes don't worth catalog incompatibility between 14beta2 and 14beta3. So, catversion isn't bumped. Discussion: https://postgr.es/m/CAPpHfdv9OZEuZDqOQoUKpXhq%3Dmc-qa4gKCPmcgG5Vvesu7%3Ds1w%40mail.gmail.com Backpatch-throgh: 14	2021-07-15 14:18:20 +03:00
Amit Kapila	a8fd13cab0	Add support for prepared transactions to built-in logical replication. To add support for streaming transactions at prepare time into the built-in logical replication, we need to do the following things: * Modify the output plugin (pgoutput) to implement the new two-phase API callbacks, by leveraging the extended replication protocol. * Modify the replication apply worker, to properly handle two-phase transactions by replaying them on prepare. * Add a new SUBSCRIPTION option "two_phase" to allow users to enable two-phase transactions. We enable the two_phase once the initial data sync is over. We however must explicitly disable replication of two-phase transactions during replication slot creation, even if the plugin supports it. We don't need to replicate the changes accumulated during this phase, and moreover, we don't have a replication connection open so we don't know where to send the data anyway. The streaming option is not allowed with this new two_phase option. This can be done as a separate patch. We don't allow to toggle two_phase option of a subscription because it can lead to an inconsistent replica. For the same reason, we don't allow to refresh the publication once the two_phase is enabled for a subscription unless copy_data option is false. Author: Peter Smith, Ajin Cherian and Amit Kapila based on previous work by Nikhil Sontakke and Stas Kelvich Reviewed-by: Amit Kapila, Sawada Masahiko, Vignesh C, Dilip Kumar, Takamichi Osumi, Greg Nancarrow Tested-By: Haiying Tang Discussion: https://postgr.es/m/02DA5F5E-CECE-4D9C-8B4B-418077E2C010@postgrespro.ru Discussion: https://postgr.es/m/CAA4eK1+opiV4aFTmWWUF9h_32=HfPOW9vZASHarT0UA5oBrtGw@mail.gmail.com	2021-07-14 07:33:50 +05:30
Peter Eisentraut	2ed532ee8c	Improve error messages about mismatching relkind Most error messages about a relkind that was not supported or appropriate for the command was of the pattern "relation \"%s\" is not a table, foreign table, or materialized view" This style can become verbose and tedious to maintain. Moreover, it's not very helpful: If I'm trying to create a comment on a TOAST table, which is not supported, then the information that I could have created a comment on a materialized view is pointless. Instead, write the primary error message shorter and saying more directly that what was attempted is not possible. Then, in the detail message, explain that the operation is not supported for the relkind the object was. To simplify that, add a new function errdetail_relkind_not_supported() that does this. In passing, make use of RELKIND_HAS_STORAGE() where appropriate, instead of listing out the relkinds individually. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/flat/dc35a398-37d0-75ce-07ea-1dd71d98f8ec@2ndquadrant.com	2021-07-08 09:44:51 +02:00
David Rowley	29f45e299e	Use a hash table to speed up NOT IN(values) Similar to `50e17ad28`, which allowed hash tables to be used for IN clauses with a set of constants, here we add the same feature for NOT IN clauses. NOT IN evaluates the same as: WHERE a <> v1 AND a <> v2 AND a <> v3. Obviously, if we're using a hash table we must be exactly equivalent to that and return the same result taking into account that either side of the condition could contain a NULL. This requires a little bit of special handling to make work with the hash table version. When processing NOT IN, the ScalarArrayOpExpr's operator will be the <> operator. To be able to build and lookup a hash table we must use the <>'s negator operator. The planner checks if that exists and is hashable and sets the relevant fields in ScalarArrayOpExpr to instruct the executor to use hashing. Author: David Rowley, James Coleman Reviewed-by: James Coleman, Zhihong Yu Discussion: https://postgr.es/m/CAApHDvoF1mum_FRk6D621edcB6KSHBi2+GAgWmioj5AhOu2vwQ@mail.gmail.com	2021-07-07 16:29:17 +12:00
Peter Eisentraut	6a6389a08b	Add index OID macro argument to DECLARE_INDEX Instead of defining symbols such as AmOidIndexId explicitly, include them as an argument of DECLARE_INDEX() and have genbki.pl generate the way as the table OID symbols from the CATALOG() declaration. Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/ccef1e46-a404-25b1-9b4c-85f2c08e1f28%40enterprisedb.com	2021-06-29 08:08:40 +02:00
Noah Misch	7ac10f6920	Dump COMMENT ON SCHEMA public. As we do for other attributes of the public schema, omit the COMMENT command when its payload would match what initdb had installed. For dumps that do carry this new COMMENT command, non-superusers restoring them are likely to get an error. Reviewed by Asif Rehman. Discussion: https://postgr.es/m/ab48a34c-60f6-e388-502a-3e5fe46a2dae@postgresfriends.org	2021-06-28 18:34:55 -07:00
Alexander Korotkov	817bb0a7d1	Revert `29854ee8d1` due to buildfarm failures Reported-by: Tom Lane Discussion: https://postgr.es/m/CAPpHfdvcnw3x7jdV3r52p4%3D5S4WUxBCzcQKB3JukQHoicv1LSQ%40mail.gmail.com	2021-06-15 21:44:40 +03:00
Alexander Korotkov	29854ee8d1	Support for unnest(multirange) and cast multirange as an array of ranges It has been spotted that multiranges lack of ability to decompose them into individual ranges. Subscription and proper expanded object representation require substantial work, and it's too late for v14. This commit provides the implementation of unnest(multirange) and cast multirange as an array of ranges, which is quite trivial. unnest(multirange) is defined as a polymorphic procedure. The catalog description of the cast underlying procedure is duplicated for each multirange type because we don't have anyrangearray polymorphic type to use here. Catversion is bumped. Reported-by: Jonathan S. Katz Discussion: https://postgr.es/m/flat/60258efe-bd7e-4886-82e1-196e0cac5433%40postgresql.org Author: Alexander Korotkov Reviewed-by: Justin Pryzby, Jonathan S. Katz, Zhihong Yu	2021-06-15 15:59:20 +03:00
Noah Misch	5f1df62a45	Remove pg_wait_for_backend_termination(). It was unable to wait on a backend that had already left the procarray. Users tolerant of that limitation can poll pg_stat_activity. Other users can employ the "timeout" argument of pg_terminate_backend(). Reviewed by Bharath Rupireddy. Discussion: https://postgr.es/m/20210605013236.GA208701@rfd.leadboat.com	2021-06-14 17:29:37 -07:00
Noah Misch	0aac73e6a2	Copy-edit text for the pg_terminate_backend() "timeout" parameter. Revert the pg_description entry to its v13 form, since those messages usually remain shorter and don't discuss individual parameters. No catversion bump, since pg_description content does not impair backend compatibility or application compatibility. Justin Pryzby Discussion: https://postgr.es/m/20210612182743.GY16435@telsasoft.com	2021-06-14 17:29:37 -07:00
Tom Lane	e56bce5d43	Reconsider the handling of procedure OUT parameters. Commit `2453ea142` redefined pg_proc.proargtypes to include the types of OUT parameters, for procedures only. While that had some advantages for implementing the SQL-spec behavior of DROP PROCEDURE, it was pretty disastrous from a number of other perspectives. Notably, since the primary key of pg_proc is name + proargtypes, this made it possible to have multiple procedures with identical names + input arguments and differing output argument types. That would make it impossible to call any one of the procedures by writing just NULL (or "?", or any other data-type-free notation) for the output argument(s). The change also seems likely to cause grave confusion for client applications that examine pg_proc and expect the traditional definition of proargtypes. Hence, revert the definition of proargtypes to what it was, and undo a number of complications that had been added to support that. To support the SQL-spec behavior of DROP PROCEDURE, when there are no argmode markers in the command's parameter list, we perform the lookup both ways (that is, matching against both proargtypes and proallargtypes), succeeding if we get just one unique match. In principle this could result in ambiguous-function failures that would not happen when using only one of the two rules. However, overloading of procedure names is thought to be a pretty rare usage, so this shouldn't cause many problems in practice. Postgres-specific code such as pg_dump can defend against any possibility of such failures by being careful to specify argmodes for all procedure arguments. This also fixes a few other bugs in the area of CALL statements with named parameters, and improves the documentation a little. catversion bump forced because the representation of procedures with OUT arguments changes. Discussion: https://postgr.es/m/3742981.1621533210@sss.pgh.pa.us	2021-06-10 17:11:36 -04:00
Tomas Vondra	d57ecebd12	Add transformed flag to nodes/funcs.c for CREATE STATISTICS Commit `a4d75c86bf` added a new flag, tracking if the statement was processed by transformStatsStmt(), but failed to add this flag to nodes/funcs.c. Catversion bump, due to adding a flag to copy/equal/out functions. Reported-by: Noah Misch Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com	2021-06-06 20:52:58 +02:00
Noah Misch	a2dee328bb	Standardize nodes/*funcs.c cosmetics for ForeignScan.resultRelation. catversion bump due to readfuncs.c field order change.	2021-06-06 00:08:21 -07:00
Tom Lane	3590680b85	Fix incorrect permissions on pg_subscription. The documented intent is for all columns except subconninfo to be publicly readable. However, this has been overlooked twice. subsynccommit has never been readable since it was introduced, nor has the oid column (which is important for joining). Given the lack of previous complaints, it's not clear that it's worth doing anything about this in the back branches. But there's still time to fix it inexpensively for v14. Per report from Israel Barth (via Euler Taveira). Patch by Euler Taveira, possibly-vain comment updates by me. Discussion: https://postgr.es/m/b8f7c17c-0041-46b6-acfe-2d1f5a985ab4@www.fastmail.com	2021-06-03 14:54:06 -04:00
Tom Lane	a4390abecf	Reduce the range of OIDs reserved for genbki.pl. Commit `ab596105b` increased FirstBootstrapObjectId from 12000 to 13000, but we've had some push-back about that. It's worrisome to reduce the daylight between there and FirstNormalObjectId, because the number of OIDs consumed during initdb for collation objects is hard to predict. We can improve the situation by abandoning the assumption that these OIDs must be globally unique. It should be sufficient for them to be unique per-catalog. (Any code that's unhappy about that is broken anyway, since no more than per-catalog uniqueness can be guaranteed once the OID counter wraps around.) With that change, the largest OID assigned during genbki.pl (starting from a base of 10000) is a bit under 11000. This allows reverting FirstBootstrapObjectId to 12000 with reasonable confidence that that will be sufficient for many years to come. We are not, at this time, abandoning the expectation that hand-assigned OIDs (below 10000) are globally unique. Someday that'll likely be necessary, but the need seems years away still. This is late for v14, but it seems worth doing it now so that downstream software doesn't have to deal with the consequences of a change in FirstBootstrapObjectId. In any case, we already bought into forcing an initdb for beta2, so another catversion bump won't hurt. Discussion: https://postgr.es/m/1665197.1622065382@sss.pgh.pa.us	2021-05-27 15:55:08 -04:00
Tom Lane	e6241d8e03	Rethink definition of pg_attribute.attcompression. Redefine '\0' (InvalidCompressionMethod) as meaning "if we need to compress, use the current setting of default_toast_compression". This allows '\0' to be a suitable default choice regardless of datatype, greatly simplifying code paths that initialize tupledescs and the like. It seems like a more user-friendly approach as well, because now the default compression choice doesn't migrate into table definitions, meaning that changing default_toast_compression is usually sufficient to flip an installation's behavior; one needn't tediously issue per-column ALTER SET COMPRESSION commands. Along the way, fix a few minor bugs and documentation issues with the per-column-compression feature. Adopt more robust APIs for SetIndexStorageProperties and GetAttributeCompression. Bump catversion because typical contents of attcompression will now be different. We could get away without doing that, but it seems better to ensure v14 installations all agree on this. (We already forced initdb for beta2, anyway.) Discussion: https://postgr.es/m/626613.1621787110@sss.pgh.pa.us	2021-05-27 13:24:27 -04:00
Tom Lane	f5024d8d7b	Re-order pg_attribute columns to eliminate some padding space. Now that attcompression is just a char, there's a lot of wasted padding space after it. Move it into the group of char-wide columns to save a net of 4 bytes per pg_attribute entry. While we're at it, swap the order of attstorage and attalign to make for a more logical grouping of these columns. Also re-order actions in related code to match the new field ordering. This patch also fixes one outright bug: equalTupleDescs() failed to compare attcompression. That could, for example, cause relcache reload to fail to adopt a new value following a change. Michael Paquier and Tom Lane, per a gripe from Andres Freund. Discussion: https://postgr.es/m/20210517204803.iyk5wwvwgtjcmc5w@alap3.anarazel.de	2021-05-23 12:12:09 -04:00
Tom Lane	1447244286	Do pre-release housekeeping on catalog data. Run renumber_oids.pl to move high-numbered OIDs down, as per pre-beta tasks specified by RELEASE_CHANGES. For reference, the command was ./renumber_oids.pl --first-mapped-oid 8000 --target-oid 6150	2021-05-12 13:36:06 -04:00
Tom Lane	def5b065ff	Initial pgindent and pgperltidy run for v14. Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.	2021-05-12 13:14:10 -04:00
Thomas Munro	c2dc19342e	Revert recovery prefetching feature. This set of commits has some bugs with known fixes, but at this late stage in the release cycle it seems best to revert and resubmit next time, along with some new automated test coverage for this whole area. Commits reverted: dc88460c: Doc: Review for "Optionally prefetch referenced data in recovery." 1d257577: Optionally prefetch referenced data in recovery. f003d9f8: Add circular WAL decoding buffer. 323cbe7c: Remove read_page callback from XLogReader. Remove the new GUC group WAL_RECOVERY recently added by `a55a9847`, as the corresponding section of config.sgml is now reverted. Discussion: https://postgr.es/m/CAOuzzgrn7iKnFRsB4MHp3UisEQAGgZMbk_ViTN4HV4-Ksq8zCg%40mail.gmail.com	2021-05-10 16:06:09 +12:00
Thomas Munro	ec48314708	Revert per-index collation version tracking feature. Design problems were discovered in the handling of composite types and record types that would cause some relevant versions not to be recorded. Misgivings were also expressed about the use of the pg_depend catalog for this purpose. We're out of time for this release so we'll revert and try again. Commits reverted: 1bf946bd: Doc: Document known problem with Windows collation versions. cf002008: Remove no-longer-relevant test case. ef387bed: Fix bogus collation-version-recording logic. 0fb0a050: Hide internal error for pg_collation_actual_version(<bad OID>). ff942057: Suppress "warning: variable 'collcollate' set but not used". d50e3b1f: Fix assertion in collation version lookup. f24b1569: Rethink extraction of collation dependencies. 257836a7: Track collation versions for indexes. cd6f479e: Add pg_depend.refobjversion. 7d1297df: Remove pg_collation.collversion. Discussion: https://postgr.es/m/CA%2BhUKGLhj5t1fcjqAu8iD9B3ixJtsTNqyCCD4V0aTO9kAKAjjA%40mail.gmail.com	2021-05-07 21:10:11 +12:00
Alvaro Herrera	d6b8d29419	Allow a partdesc-omitting-partitions to be cached Makes partition descriptor acquisition faster during the transient period in which a partition is in the process of being detached. This also adds the restriction that only one partition can be in pending-detach state for a partitioned table. While at it, return find_inheritance_children() API to what it was before `71f4c8c6f7`, and create a separate find_inheritance_children_extended() that returns detailed info about detached partitions. (This incidentally fixes a bug in `8aba932251` whereby a memory context holding a transient partdesc is reparented to a NULL PortalContext, leading to permanent leak of that memory. The fix is to no longer rely on reparenting contexts to PortalContext. Reported by Amit Langote.) Per gripe from Amit Langote Discussion: https://postgr.es/m/CA+HiwqFgpP1LxJZOBYGt9rpvTjXXkg5qG2+Xch2Z1Q7KrqZR1A@mail.gmail.com	2021-04-28 15:44:35 -04:00
Amit Kapila	3fa17d3771	Use HTAB for replication slot statistics. Previously, we used to use the array of size max_replication_slots to store stats for replication slots. But that had two problems in the cases where a message for dropping a slot gets lost: 1) the stats for the new slot are not recorded if the array is full and 2) writing beyond the end of the array if the user reduces the max_replication_slots. This commit uses HTAB for replication slot statistics, resolving both problems. Now, pgstat_vacuum_stat() search for all the dead replication slots in stats hashtable and tell the collector to remove them. To avoid showing the stats for the already-dropped slots, pg_stat_replication_slots view searches slot stats by the slot name taken from pg_replication_slots. Also, we send a message for creating a slot at slot creation, initializing the stats. This reduces the possibility that the stats are accumulated into the old slot stats when a message for dropping a slot gets lost. Reported-by: Andres Freund Author: Sawada Masahiko, test case by Vignesh C Reviewed-by: Amit Kapila, Vignesh C, Dilip Kumar Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de	2021-04-27 09:09:11 +05:30
Alexander Korotkov	6bbcff096f	Mark multirange_constructor0() and multirange_constructor2() strict These functions shouldn't receive null arguments: multirange_constructor0() doesn't have any arguments while multirange_constructor2() has a single array argument, which is never null. But mark them strict anyway for the sake of uniformity. Also, make checks for null arguments use elog() instead of ereport() as these errors should normally be never thrown. And adjust corresponding comments. Catversion is bumped. Reported-by: Peter Eisentraut Discussion: https://postgr.es/m/0f783a96-8d67-9e71-996b-f34a7352eeef%40enterprisedb.com	2021-04-23 13:25:45 +03:00
Alvaro Herrera	8aba932251	Fix relcache inconsistency hazard in partition detach During queries coming from ri_triggers.c, we need to omit partitions that are marked pending detach -- otherwise, the RI query is tricked into allowing a row into the referencing table whose corresponding row is in the detached partition. Which is bogus: once the detach operation completes, the row becomes an orphan. However, the code was not doing that in repeatable-read transactions, because relcache kept a copy of the partition descriptor that included the partition, and used it in the RI query. This commit changes the partdesc cache code to only keep descriptors that aren't dependent on a snapshot (namely: those where no detached partition exist, and those where detached partitions are included). When a partdesc-without- detached-partitions is requested, we create one afresh each time; also, those partdescs are stored in PortalContext instead of CacheMemoryContext. find_inheritance_children gets a new output *detached_exist boolean, which indicates whether any partition marked pending-detach is found. Its "include_detached" input flag is changed to "omit_detached", because that name captures desired the semantics more naturally. CreatePartitionDirectory() and RelationGetPartitionDesc() arguments are identically renamed. This was noticed because a buildfarm member that runs with relcache clobbering, which would not keep the improperly cached partdesc, broke one test, which led us to realize that the expected output of that test was bogus. This commit also corrects that expected output. Author: Amit Langote <amitlangote09@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/3269784.1617215412@sss.pgh.pa.us	2021-04-22 15:13:25 -04:00
Peter Eisentraut	d84ffffe58	Add DISTINCT to information schema usage views Since pg_depend can contain duplicate entries, we need to eliminate those in information schema views that build on pg_depend, using DISTINCT. Some of the older views already did that correctly, but some of the more recently added ones didn't. (In some of these views, it might not be possible to reproduce the issue because of how the implementation happens to deduplicate dependencies while recording them, but it seems better to keep this consistent in all cases.)	2021-04-21 11:54:47 +02:00
Peter Eisentraut	544b28088f	doc: Improve hyphenation consistency	2021-04-21 08:14:43 +02:00
Bruce Momjian	9660834dd8	adjust query id feature to use pg_stat_activity.query_id Previously, it was pg_stat_activity.queryid to match the pg_stat_statements queryid column. This is an adjustment to patch `4f0b0966c8`. This also adjusts some of the internal function calls to match. Catversion bumped. Reported-by: Álvaro Herrera, Julien Rouhaud Discussion: https://postgr.es/m/20210408032704.GA7498@alvherre.pgsql	2021-04-20 12:22:26 -04:00
Tom Lane	8a2df442b6	Update dummy prosrc values. Ooops, forgot to s/system_views.sql/system_functions.sql/g in this part of `767982e36`. No need for an additional catversion bump, I think, since these strings are gone by the time initdb finishes. Discussion: https://postgr.es/m/3956760.1618529139@sss.pgh.pa.us	2021-04-16 19:01:22 -04:00
Tom Lane	767982e362	Convert built-in SQL-language functions to SQL-standard-body style. Adopt the new pre-parsed representation for all built-in and information_schema SQL-language functions, except for a small number that can't presently be converted because they have polymorphic arguments. This eliminates residual hazards around search-path safety of these functions, and might provide some small performance benefits by reducing parsing costs. It seems useful also to provide more test coverage for the SQL-standard-body feature. Discussion: https://postgr.es/m/3956760.1618529139@sss.pgh.pa.us	2021-04-16 18:37:02 -04:00
Amit Kapila	f5fc2f5b23	Add information of total data processed to replication slot stats. This adds the statistics about total transactions count and total transaction data logically sent to the decoding output plugin from ReorderBuffer. Users can query the pg_stat_replication_slots view to check these stats. Suggested-by: Andres Freund Author: Vignesh C and Amit Kapila Reviewed-by: Sawada Masahiko, Amit Kapila Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de	2021-04-16 07:34:43 +05:30
Tom Lane	1111b2668d	Undo decision to allow pg_proc.prosrc to be NULL. Commit `e717a9a18` changed the longstanding rule that prosrc is NOT NULL because when a SQL-language function is written in SQL-standard style, we don't currently have anything useful to put there. This seems a poor decision though, as it could easily have negative impacts on external PLs (opening them to crashes they didn't use to have, for instance). SQL-function-related code can just as easily test "is prosqlbody not null" as "is prosrc null", so there's no real gain there either. Hence, revert the NOT NULL marking removal and adjust related logic. For now, we just put an empty string into prosrc for SQL-standard functions. Maybe we'll have a better idea later, although the history of things like pg_attrdef.adsrc suggests that it's not easy to maintain a string equivalent of a node tree. This also adds an assertion that queryDesc->sourceText != NULL to standard_ExecutorStart. We'd been silently relying on that for awhile, so let's make it less silent. Also fix some overlooked documentation and test cases. Discussion: https://postgr.es/m/2197698.1617984583@sss.pgh.pa.us	2021-04-15 17:17:20 -04:00
Noah Misch	df5efaf441	Standardize pg_authid oid_symbol values. Commit `c9c41c7a33` used two different naming patterns. Standardize on the majority pattern, which was the only pattern in the last reviewed version of that commit.	2021-04-10 12:01:41 -07:00
David Rowley	50e17ad281	Speedup ScalarArrayOpExpr evaluation ScalarArrayOpExprs with "useOr=true" and a set of Consts on the righthand side have traditionally been evaluated by using a linear search over the array. When these arrays contain large numbers of elements then this linear search could become a significant part of execution time. Here we add a new method of evaluating ScalarArrayOpExpr expressions to allow them to be evaluated by first building a hash table containing each element, then on subsequent evaluations, we just probe that hash table to determine if there is a match. The planner is in charge of determining when this optimization is possible and it enables it by setting hashfuncid in the ScalarArrayOpExpr. The executor will only perform the hash table evaluation when the hashfuncid is set. This means that not all cases are optimized. For example CHECK constraints containing an IN clause won't go through the planner, so won't get the hashfuncid set. We could maybe do something about that at some later date. The reason we're not doing it now is from fear that we may slow down cases where the expression is evaluated only once. Those cases can be common, for example, a single row INSERT to a table with a CHECK constraint containing an IN clause. In the planner, we enable this when there are suitable hash functions for the ScalarArrayOpExpr's operator and only when there is at least MIN_ARRAY_SIZE_FOR_HASHED_SAOP elements in the array. The threshold is currently set to 9. Author: James Coleman, David Rowley Reviewed-by: David Rowley, Tomas Vondra, Heikki Linnakangas Discussion: https://postgr.es/m/CAAaqYe8x62+=wn0zvNKCj55tPpg-JBHzhZFFc6ANovdqFw7-dA@mail.gmail.com	2021-04-08 23:51:22 +12:00
Thomas Munro	1d257577e0	Optionally prefetch referenced data in recovery. Introduce a new GUC recovery_prefetch, disabled by default. When enabled, look ahead in the WAL and try to initiate asynchronous reading of referenced data blocks that are not yet cached in our buffer pool. For now, this is done with posix_fadvise(), which has several caveats. Better mechanisms will follow in later work on the I/O subsystem. The GUC maintenance_io_concurrency is used to limit the number of concurrent I/Os we allow ourselves to initiate, based on pessimistic heuristics used to infer that I/Os have begun and completed. The GUC wal_decode_buffer_size is used to limit the maximum distance we are prepared to read ahead in the WAL to find uncached blocks. Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> (parts) Reviewed-by: Andres Freund <andres@anarazel.de> (parts) Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> (parts) Tested-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> Tested-by: Sait Talha Nisanci <Sait.Nisanci@microsoft.com> Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com	2021-04-08 23:20:42 +12:00
Magnus Hagander	aaf0432572	Add functions to wait for backend termination This adds a function, pg_wait_for_backend_termination(), and a new timeout argument to pg_terminate_backend(), which will wait for the backend to actually terminate (with or without signaling it to do so depending on which function is called). The default behaviour of pg_terminate_backend() remains being timeout=0 which does not waiting. For pg_wait_for_backend_termination() the default wait is 5 seconds. Author: Bharath Rupireddy Reviewed-By: Fujii Masao, David Johnston, Muhammad Usama, Hou Zhijie, Magnus Hagander Discussion: https://postgr.es/m/CALj2ACUBpunmyhYZw-kXCYs5NM+h6oG_7Df_Tn4mLmmUQifkqA@mail.gmail.com	2021-04-08 11:40:54 +02:00
Peter Eisentraut	e717a9a18b	SQL-standard function body This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova <jcasanov@systemguards.com.ec> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com	2021-04-07 21:47:55 +02:00
Bruce Momjian	4f0b0966c8	Make use of in-core query id added by commit `5fd9dfa5f5` Use the in-core query id computation for pg_stat_activity, log_line_prefix, and EXPLAIN VERBOSE. Similar to other fields in pg_stat_activity, only the queryid from the top level statements are exposed, and if the backends status isn't active then the queryid from the last executed statements is displayed. Add a %Q placeholder to include the queryid in log_line_prefix, which will also only expose top level statements. For EXPLAIN VERBOSE, if a query identifier has been computed, either by enabling compute_query_id or using a third-party module, display it. Bump catalog version. Discussion: https://postgr.es/m/20210407125726.tkvjdbw76hxnpwfi@nol Author: Julien Rouhaud Reviewed-by: Alvaro Herrera, Nitin Jadhav, Zhihong Yu	2021-04-07 14:04:06 -04:00
Peter Eisentraut	a2da77cdb4	Change return type of EXTRACT to numeric The previous implementation of EXTRACT mapped internally to date_part(), which returned type double precision (since it was implemented long before the numeric type existed). This can lead to imprecise output in some cases, so returning numeric would be preferrable. Changing the return type of an existing function is a bit risky, so instead we do the following: We implement a new set of functions, which are now called "extract", in parallel to the existing date_part functions. They work the same way internally but use numeric instead of float8. The EXTRACT construct is now mapped by the parser to these new extract functions. That way, dumps of views etc. from old versions (which would use date_part) continue to work unchanged, but new uses will map to the new extract functions. Additionally, the reverse compilation of EXTRACT now reproduces the original syntax, using the new mechanism introduced in `40c24bfef9`. The following minor changes of behavior result from the new implementation: - The column name from an isolated EXTRACT call is now "extract" instead of "date_part". - Extract from date now rejects inappropriate field names such as HOUR. It was previously mapped internally to extract from timestamp, so it would silently accept everything appropriate for timestamp. - Return values when extracting fields with possibly fractional values, such as second and epoch, now have the full scale that the value has internally (so, for example, '1.000000' instead of just '1'). Reported-by: Petr Fedorov <petr.fedorov@phystech.edu> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu	2021-04-06 07:20:42 +02:00
Fujii Masao	43620e3286	Add function to log the memory contexts of specified backend process. Commit `3e98c0bafb` added pg_backend_memory_contexts view to display the memory contexts of the backend process. However its target process is limited to the backend that is accessing to the view. So this is not so convenient when investigating the local memory bloat of other backend process. To improve this situation, this commit adds pg_log_backend_memory_contexts() function that requests to log the memory contexts of the specified backend process. This information can be also collected by calling MemoryContextStats(TopMemoryContext) via a debugger. But this technique cannot be used in some environments because no debugger is available there. So, pg_log_backend_memory_contexts() allows us to see the memory contexts of specified backend more easily. Only superusers are allowed to request to log the memory contexts because allowing any users to issue this request at an unbounded rate would cause lots of log messages and which can lead to denial of service. On receipt of the request, at the next CHECK_FOR_INTERRUPTS(), the target backend logs its memory contexts at LOG_SERVER_ONLY level, so that these memory contexts will appear in the server log but not be sent to the client. It logs one message per memory context. Because if it buffers all memory contexts into StringInfo to log them as one message, which may require the buffer to be enlarged very much and lead to OOM error since there can be a large number of memory contexts in a backend. When a backend process is consuming huge memory, logging all its memory contexts might overrun available disk space. To prevent this, now this patch limits the number of child contexts to log per parent to 100. As with MemoryContextStats(), it supposes that practical cases where the log gets long will typically be huge numbers of siblings under the same parent context; while the additional debugging value from seeing details about individual siblings beyond 100 will not be large. There was another proposed patch to add the function to return the memory contexts of specified backend as the result sets, instead of logging them, in the discussion. However that patch is not included in this commit because it had several issues to address. Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra, Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com	2021-04-06 13:44:15 +09:00
Stephen Frost	6c3ffd697e	Add pg_read_all_data and pg_write_all_data roles A commonly requested use-case is to have a role who can run an unfettered pg_dump without having to explicitly GRANT that user access to all tables, schemas, et al, without that role being a superuser. This address that by adding a "pg_read_all_data" role which implicitly gives any member of this role SELECT rights on all tables, views and sequences, and USAGE rights on all schemas. As there may be cases where it's also useful to have a role who has write access to all objects, pg_write_all_data is also introduced and gives users implicit INSERT, UPDATE and DELETE rights on all tables, views and sequences. These roles can not be logged into directly but instead should be GRANT'd to a role which is able to log in. As noted in the documentation, if RLS is being used then an administrator may (or may not) wish to set BYPASSRLS on the login role which these predefined roles are GRANT'd to. Reviewed-by: Georgios Kokolatos Discussion: https://postgr.es/m/20200828003023.GU29590@tamriel.snowman.net	2021-04-05 13:42:52 -04:00
Stephen Frost	c9c41c7a33	Rename Default Roles to Predefined Roles The term 'default roles' wasn't quite apt as these roles aren't able to be modified or removed after installation, so rename them to be 'Predefined Roles' instead, adding an entry into the newly added Obsolete Appendix to help users of current releases find the new documentation. Bruce Momjian and Stephen Frost Discussion: https://postgr.es/m/157742545062.1149.11052653770497832538%40wrigleys.postgresql.org and https://www.postgresql.org/message-id/20201120211304.GG16415@tamriel.snowman.net	2021-04-01 15:32:06 -04:00
Heikki Linnakangas	ea1b99a661	Add 'noError' argument to encoding conversion functions. With the 'noError' argument, you can try to convert a buffer without knowing the character boundaries beforehand. The functions now need to return the number of input bytes successfully converted. This is is a backwards-incompatible change, if you have created a custom encoding conversion with CREATE CONVERSION. This adds a check to pg_upgrade for that, refusing the upgrade if there are any user-defined encoding conversions. Custom conversions are very rare, there are no commonly used extensions that I know of that uses that feature. No other objects can depend on conversions, so if you do have one, you can fairly easily drop it before upgrading, and recreate it after the upgrade with an updated version. Add regression tests for built-in encoding conversions. This doesn't cover every conversion, but it covers all the internal functions in conv.c that are used to implement the conversions. Reviewed-by: John Naylor Discussion: https://www.postgresql.org/message-id/e7861509-3960-538a-9025-b75a61188e01%40iki.fi	2021-04-01 11:45:22 +03:00
Peter Eisentraut	f37fec837c	Add unistr function This allows decoding a string with Unicode escape sequences. It is similar to Unicode escape strings, but offers some more flexibility. Author: Pavel Stehule <pavel.stehule@gmail.com> Reviewed-by: Asif Rehman <asifr.rehman@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRA5GnKT+gDVwbVRH2ep451H_myBt+NTz8RkYUARE9+qOQ@mail.gmail.com	2021-03-29 11:56:53 +02:00
Tomas Vondra	a4d75c86bf	Extended statistics on expressions Allow defining extended statistics on expressions, not just just on simple column references. With this commit, expressions are supported by all existing extended statistics kinds, improving the same types of estimates. A simple example may look like this: CREATE TABLE t (a int); CREATE STATISTICS s ON mod(a,10), mod(a,20) FROM t; ANALYZE t; The collected statistics are useful e.g. to estimate queries with those expressions in WHERE or GROUP BY clauses: SELECT * FROM t WHERE mod(a,10) = 0 AND mod(a,20) = 0; SELECT 1 FROM t GROUP BY mod(a,10), mod(a,20); This introduces new internal statistics kind 'e' (expressions) which is built automatically when the statistics object definition includes any expressions. This represents single-expression statistics, as if there was an expression index (but without the index maintenance overhead). The statistics is stored in pg_statistics_ext_data as an array of composite types, which is possible thanks to `79f6a942bd`. CREATE STATISTICS allows building statistics on a single expression, in which case in which case it's not possible to specify statistics kinds. A new system view pg_stats_ext_exprs can be used to display expression statistics, similarly to pg_stats and pg_stats_ext views. ALTER TABLE ... ALTER COLUMN ... TYPE now treats indexes the same way it treats indexes, i.e. it drops and recreates the statistics. This means all statistics are reset, and we no longer try to preserve at least the functional dependencies. This should not be a major issue in practice, as the functional dependencies actually rely on per-column statistics, which were always reset anyway. Author: Tomas Vondra Reviewed-by: Justin Pryzby, Dean Rasheed, Zhihong Yu Discussion: https://postgr.es/m/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com	2021-03-27 00:01:11 +01:00
Noah Misch	a14a0118a1	Add "pg_database_owner" default role. Membership consists, implicitly, of the current database owner. Expect use in template databases. Once pg_database_owner has rights within a template, each owner of a database instantiated from that template will exercise those rights. Reviewed by John Naylor. Discussion: https://postgr.es/m/20201228043148.GA1053024@rfd.leadboat.com	2021-03-26 10:42:17 -07:00

... 2 3 4 5 6 ...

2747 Commits