postgresql

Commit Graph

Author	SHA1	Message	Date
Peter Geoghegan	f3c15cbe50	Generalize how VACUUM skips all-frozen pages. Non-aggressive VACUUMs were at a gratuitous disadvantage (relative to aggressive VACUUMs) around advancing relfrozenxid and relminmxid before now. The issue only came up when concurrent activity unset some heap page's visibility map bit right as VACUUM was considering if the page should get counted in frozenskipped_pages. The non-aggressive case would recheck the all-frozen bit at this point. The aggressive case reasoned that the page (a skippable page) must have at least been all-frozen in the recent past, so skipping it won't make relfrozenxid advancement unsafe (which is never okay for aggressive VACUUMs). The recheck created a window for some other backend to confuse matters for VACUUM. If the page's VM bit turned out to be unset, VACUUM would conclude that the page was _never_ all-frozen. frozenskipped_pages was not incremented, and yet VACUUM couldn't back out of skipping at this late stage (it couldn't choose to scan the page instead). This made it unsafe to advance relfrozenxid later on. Consistently avoid the issue by generalizing how we skip frozen pages during aggressive VACUUMs: take the same approach when skipping any skippable page range during aggressive and non-aggressive VACUUMs alike. The new approach makes ranges (not individual pages) the fundamental unit of skipping using the visibility map. frozenskipped_pages is replaced with a boolean flag that represents whether some skippable range with one or more all-visible pages was actually skipped. It is safe for VACUUM to treat a page as all-frozen provided it at least had its all-frozen bit set after the OldestXmin cutoff was established. VACUUM is only required to scan pages that might have XIDs < OldestXmin (unfrozen XIDs) to be able to safely advance relfrozenxid. Tuples concurrently inserted on "skipped" pages can be thought of as equivalent to tuples concurrently inserted on a block >= rel_pages. It's possible that the issue this commit fixes hardly ever came up in practice. But we only had to be unlucky once to lose out on advancing relfrozenxid -- a single affected heap page was enough to throw VACUUM off. That seems like something to avoid on general principle. This is similar to an issue fixed by commit `44fa8488`, which taught vacuumlazy.c to not give up on non-aggressive relfrozenxid advancement just because a cleanup lock wasn't immediately available on some heap page. Skipping an all-visible range is now explicitly structured as a choice made by non-aggressive VACUUMs, by weighing known costs (scanning extra skippable pages to freeze their tuples early) against known benefits (advancing relfrozenxid early). This works in essentially the same way as it always has (don't skip ranges < SKIP_PAGES_THRESHOLD). We could do much better here in the future by considering other relevant factors. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CAH2-Wzn6bGJGfOy3zSTJicKLw99PHJeSOQBOViKjSCinaxUKDQ@mail.gmail.com Discussion: https://postgr.es/m/CA%2BTgmoZiSOY6H7aadw5ZZGm7zYmfDzL6nwmL5V7GL4HgJgLF_w%40mail.gmail.com	2022-04-03 13:35:43 -07:00
Peter Geoghegan	0b018fabaa	Set relfrozenxid to oldest extant XID seen by VACUUM. When VACUUM set relfrozenxid before now, it set it to whatever value was used to determine which tuples to freeze -- the FreezeLimit cutoff. This approach was very naive. The relfrozenxid invariant only requires that new relfrozenxid values be <= the oldest extant XID remaining in the table (at the point that the VACUUM operation ends), which in general might be much more recent than FreezeLimit. VACUUM now carefully tracks the oldest remaining XID/MultiXactId as it goes (the oldest remaining values _after_ lazy_scan_prune processing). The final values are set as the table's new relfrozenxid and new relminmxid in pg_class at the end of each VACUUM. The oldest XID might come from a tuple's xmin, xmax, or xvac fields. It might even come from one of the table's remaining MultiXacts. Final relfrozenxid values must still be >= FreezeLimit in an aggressive VACUUM (FreezeLimit still acts as a lower bound on the final value that aggressive VACUUM can set relfrozenxid to). Since standard VACUUMs still make no guarantees about advancing relfrozenxid, they might as well set relfrozenxid to a value from well before FreezeLimit when the opportunity presents itself. In general standard VACUUMs may now set relfrozenxid to any value > the original relfrozenxid and <= OldestXmin. Credit for the general idea of using the oldest extant XID to set pg_class.relfrozenxid at the end of VACUUM goes to Andres Freund. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CAH2-WzkymFbz6D_vL+jmqSn_5q1wsFvFrE+37yLgL_Rkfd6Gzg@mail.gmail.com	2022-04-03 09:57:21 -07:00
Tom Lane	e39f990467	Fix overflow hazards in interval input and output conversions. DecodeInterval (interval input) was careless about integer-overflow hazards, allowing bogus results to be obtained for sufficiently large input values. Also, since it initially converted the input to a "struct tm", it was impossible to produce the full range of representable interval values. Meanwhile, EncodeInterval (interval output) and a few other functions could suffer failures if asked to process sufficiently large interval values, because they also relied on being able to represent an interval in "struct tm" which is not designed to handle that. Fix all this stuff by introducing new struct types that are more fit for purpose. While this is clearly a bug fix, it's also an API break for any code that's calling these functions directly. So back-patching doesn't seem wise, especially in view of the lack of field complaints. Joe Koshakow, editorialized a bit by me Discussion: https://postgr.es/m/CAAvxfHff0JLYHwyBrtMx_=6wr=k2Xp+D+-X3vEhHjJYMj+mQcg@mail.gmail.com	2022-04-02 16:12:29 -04:00
Peter Geoghegan	14bf1e8313	vacuumlazy.c: Clean up variable declarations. Move some of the heap_vacuum_rel() instrumentation related variables to the scope where they're actually needed. Also reorder some of the variable declarations at the start of heap_vacuum_rel() so that related variables appear together.	2022-04-02 10:33:21 -07:00
Joe Conway	9752436f04	Use has_privs_for_roles for predefined role checks: round 2 Similar to commit `6198420ad`, replace is_member_of_role with has_privs_for_role for predefined role access checks in recently committed basebackup code. In passing fix a double-word error in a nearby comment. Discussion: https://postgr.es/m/flat/CAGB+Vh4Zv_TvKt2tv3QNS6tUM_F_9icmuj0zjywwcgVi4PAhFA@mail.gmail.com	2022-04-02 13:24:38 -04:00
Alvaro Herrera	cfdd03f45e	Allow CLUSTER on partitioned tables This is essentially the same as applying VACUUM FULL to a partitioned table, which has been supported since commit `3c3bb99330` (March 2017). While there's no great use case in applying CLUSTER to partitioned tables, we don't have any strong reason not to allow it either. For now, partitioned indexes cannot be marked clustered, so an index must always be specified. While at it, rename some variables that were RangeVars during the development that led to `8bc717cb88` but never made it that way to the source tree; there's no need to perpetuate names that have always been more confusing than helpful. Author: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/20201028003312.GU9241@telsasoft.com Discussion: https://postgr.es/m/20200611153502.GT14879@telsasoft.com	2022-04-02 19:08:34 +02:00
John Naylor	6974924347	Specialize tuplesort routines for different kinds of abbreviated keys Previously, the specialized tuplesort routine inlined handling for reverse-sort and NULLs-ordering but called the datum comparator via a pointer in the SortSupport struct parameter. Testing has showed that we can get a useful performance gain by specializing datum comparison for the different representations of abbreviated keys -- signed and unsigned 64-bit integers and signed 32-bit integers. Almost all abbreviatable data types will benefit -- the only exception for now is numeric, since the datum comparison is more complex. The performance gain depends on data type and input distribution, but often falls in the range of 10-20% faster. Thomas Munro Reviewed by Peter Geoghegan, review and performance testing by me Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKKYttZZk-JMRQSVak%3DCXSJ5fiwtirFf%3Dn%3DPAbumvn1Ww%40mail.gmail.com	2022-04-02 15:22:25 +07:00
Michael Paquier	d16773cdc8	Add macros in hash and btree AMs to get the special area of their pages This makes the code more consistent with SpGiST, GiST and GIN, that already use this style, and the idea is to make easier the introduction of more sanity checks for each of these AM-specific macros. BRIN uses a different set of macros to get a page's type and flags, so it has no need for something similar. Author: Matthias van de Meent Discussion: https://postgr.es/m/CAEze2WjE3+tGO9Fs9+iZMU+z6mMZKo54W1Zt98WKqbEUHbHOBg@mail.gmail.com	2022-04-01 13:24:50 +09:00
Andrew Dunstan	9f91344223	Fix comments with "a expression"	2022-03-31 15:45:25 -04:00
Andrew Dunstan	49082c2cc3	RETURNING clause for JSON() and JSON_SCALAR() This patch is extracted from a larger patch that allowed setting the default returned value from these functions to json or jsonb. That had problems, but this piece of it is fine. For these functions only json or jsonb can be specified in the RETURNING clause. Extracted from an original patch from Nikita Glukhov Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru	2022-03-31 15:45:24 -04:00
Tom Lane	f3dd9fe1dd	Fix postgres_fdw to check shippability of sort clauses properly. postgres_fdw would push ORDER BY clauses to the remote side without verifying that the sort operator is safe to ship. Moreover, it failed to print a suitable USING clause if the sort operator isn't default for the sort expression's type. The net result of this is that the remote sort might not have anywhere near the semantics we expect, which'd be disastrous for locally-performed merge joins in particular. We addressed similar issues in the context of ORDER BY within an aggregate function call in commit `7012b132d`, but failed to notice that query-level ORDER BY was broken. Thus, much of the necessary logic already existed, but it requires refactoring to be usable in both cases. Back-patch to all supported branches. In HEAD only, remove the core code's copy of find_em_expr_for_rel, which is no longer used and really should never have been pushed into equivclass.c in the first place. Ronan Dunklau, per report from David Rowley; reviews by David Rowley, Ranier Vilela, and myself Discussion: https://postgr.es/m/CAApHDvr4OeC2DBVY--zVP83-K=bYrTD7F8SZDhN4g+pj2f2S-A@mail.gmail.com	2022-03-31 14:29:48 -04:00
Amit Kapila	8f2e2bbf14	Raise a WARNING for missing publications. When we create or alter a subscription to add publications raise a warning for non-existent publications. We don't want to give an error here because it is possible that users can later create the missing publications. Author: Vignesh C Reviewed-by: Bharath Rupireddy, Japin Li, Dilip Kumar, Euler Taveira, Ashutosh Sharma, Amit Kapila Discussion: https://postgr.es/m/CALDaNm0f4YujGW+q-Di0CbZpnQKFFrXntikaQQKuEmGG0=Zw=Q@mail.gmail.com	2022-03-31 08:25:50 +05:30
Tomas Vondra	db0d67db24	Optimize order of GROUP BY keys When evaluating a query with a multi-column GROUP BY clause using sort, the cost may be heavily dependent on the order in which the keys are compared when building the groups. Grouping does not imply any ordering, so we're allowed to compare the keys in arbitrary order, and a Hash Agg leverages this. But for Group Agg, we simply compared keys in the order as specified in the query. This commit explores alternative ordering of the keys, trying to find a cheaper one. In principle, we might generate grouping paths for all permutations of the keys, and leave the rest to the optimizer. But that might get very expensive, so we try to pick only a couple interesting orderings based on both local and global information. When planning the grouping path, we explore statistics (number of distinct values, cost of the comparison function) for the keys and reorder them to minimize comparison costs. Intuitively, it may be better to perform more expensive comparisons (for complex data types etc.) last, because maybe the cheaper comparisons will be enough. Similarly, the higher the cardinality of a key, the lower the probability we’ll need to compare more keys. The patch generates and costs various orderings, picking the cheapest ones. The ordering of group keys may interact with other parts of the query, some of which may not be known while planning the grouping. E.g. there may be an explicit ORDER BY clause, or some other ordering-dependent operation, higher up in the query, and using the same ordering may allow using either incremental sort or even eliminate the sort entirely. The patch generates orderings and picks those minimizing the comparison cost (for various pathkeys), and then adds orderings that might be useful for operations higher up in the plan (ORDER BY, etc.). Finally, it always keeps the ordering specified in the query, on the assumption the user might have additional insights. This introduces a new GUC enable_group_by_reordering, so that the optimization may be disabled if needed. The original patch was proposed by Teodor Sigaev, and later improved and reworked by Dmitry Dolgov. Reviews by a number of people, including me, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed and Zhihong Yu. Author: Dmitry Dolgov, Teodor Sigaev, Tomas Vondra Reviewed-by: Tomas Vondra, Andrey Lepikhov, Claudio Freire, Ibrar Ahmed, Zhihong Yu Discussion: https://postgr.es/m/7c79e6a5-8597-74e8-0671-1c39d124c9d6%40sigaev.ru Discussion: https://postgr.es/m/CA%2Bq6zcW_4o2NC0zutLkOJPsFt80megSpX_dVRo6GK9PC-Jx_Ag%40mail.gmail.com	2022-03-31 01:13:33 +02:00
Andrew Dunstan	606948b058	SQL JSON functions This Patch introduces three SQL standard JSON functions: JSON() (incorrectly mentioned in my commit message for `f4fb45d15c`) JSON_SCALAR() JSON_SERIALIZE() JSON() produces json values from text, bytea, json or jsonb values, and has facilitites for handling duplicate keys. JSON_SCALAR() produces a json value from any scalar sql value, including json and jsonb. JSON_SERIALIZE() produces text or bytea from input which containis or represents json or jsonb; For the most part these functions don't add any significant new capabilities, but they will be of use to users wanting standard compliant JSON handling. Nikita Glukhov Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru	2022-03-30 16:30:37 -04:00
Peter Eisentraut	7ae1619bc5	Add range_agg with multirange inputs range_agg for normal ranges already existed. A lot of code can be shared. Author: Paul Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Chapman Flack <chap@anastigmatix.net> Discussion: https://www.postgresql.org/message-id/flat/007ef255-35ef-fd26-679c-f97e7a7f30c2@illuminatedcomputing.com	2022-03-30 20:16:23 +02:00
Peter Eisentraut	f453d684ec	Change some internal error messages to elogs Author: Paul Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Chapman Flack <chap@anastigmatix.net> Discussion: https://www.postgresql.org/message-id/flat/007ef255-35ef-fd26-679c-f97e7a7f30c2@illuminatedcomputing.com	2022-03-30 17:53:54 +02:00
Robert Haas	51c0d186d9	Allow parallel zstd compression when taking a base backup. libzstd allows transparent parallel compression just by setting an option when creating the compression context, so permit that for both client and server-side backup compression. To use this, use something like pg_basebackup --compress WHERE-zstd:workers=N where WHERE is "client" or "server" and N is an integer. When compression is performed on the server side, this will spawn threads inside the PostgreSQL backend. While there is almost no PostgreSQL server code which is thread-safe, the threads here are used internally by libzstd and touch only data structures controlled by libzstd. Patch by me, based in part on earlier work by Dipesh Pandit and Jeevan Ladhe. Reviewed by Justin Pryzby. Discussion: http://postgr.es/m/CA+Tgmobj6u-nWF-j=FemygUhobhryLxf9h-wJN7W-2rSsseHNA@mail.gmail.com	2022-03-30 09:41:26 -04:00
Etsuro Fujita	f505bec711	Fix typo in comment.	2022-03-30 19:00:00 +09:00
Peter Eisentraut	072132f04e	Add header matching mode to COPY FROM COPY FROM supports the HEADER option to silently discard the header line from a CSV or text file. It is possible to load by mistake a file that matches the expected format, for example, if two text columns have been swapped, resulting in garbage in the database. This adds a new option value HEADER MATCH that checks the column names in the header line against the actual column names and errors out if they do not match. Author: Rémi Lapeyre <remi.lapeyre@lenstra.fr> Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/CAF1-J-0PtCWMeLtswwGV2M70U26n4g33gpe1rcKQqe6wVQDrFA@mail.gmail.com	2022-03-30 09:02:31 +02:00
Amit Kapila	d5a9d86d8f	Skip empty transactions for logical replication. The current logical replication behavior is to send every transaction to subscriber even if the transaction is empty. This can happen because transaction doesn't contain changes from the selected publications or all the changes got filtered. It is a waste of CPU cycles and network bandwidth to build/transmit these empty transactions. This patch addresses the above problem by postponing the BEGIN message until the first change is sent. While processing a COMMIT message, if there was no other change for that transaction, do not send the COMMIT message. This allows us to skip sending BEGIN/COMMIT messages for empty transactions. When skipping empty transactions in synchronous replication mode, we send a keepalive message to avoid delaying such transactions. Author: Ajin Cherian, Hou Zhijie, Euler Taveira Reviewed-by: Peter Smith, Takamichi Osumi, Shi Yu, Masahiko Sawada, Greg Nancarrow, Vignesh C, Amit Kapila Discussion: https://postgr.es/m/CAMkU=1yohp9-dv48FLoSPrMqYEyyS5ZWkaZGD41RJr10xiNo_Q@mail.gmail.com	2022-03-30 07:41:05 +05:30
Andrew Dunstan	1a36bc9dba	SQL/JSON query functions This introduces the SQL/JSON functions for querying JSON data using jsonpath expressions. The functions are: JSON_EXISTS() JSON_QUERY() JSON_VALUE() All of these functions only operate on jsonb. The workaround for now is to cast the argument to jsonb. JSON_EXISTS() tests if the jsonpath expression applied to the jsonb value yields any values. JSON_VALUE() must return a single value, and an error occurs if it tries to return multiple values. JSON_QUERY() must return a json object or array, and there are various WRAPPER options for handling scalar or multi-value results. Both these functions have options for handling EMPTY and ERROR conditions. Nikita Glukhov Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru	2022-03-29 16:57:13 -04:00
Robert Haas	9c08aea6a3	Add new block-by-block strategy for CREATE DATABASE. Because this strategy logs changes on a block-by-block basis, it avoids the need to checkpoint before and after the operation. However, because it logs each changed block individually, it might generate a lot of extra write-ahead logging if the template database is large. Therefore, the older strategy remains available via a new STRATEGY parameter to CREATE DATABASE, and a corresponding --strategy option to createdb. Somewhat controversially, this patch assembles the list of relations to be copied to the new database by reading the pg_class relation of the template database. Cross-database access like this isn't normally possible, but it can be made to work here because there can't be any connections to the database being copied, nor can it contain any in-doubt transactions. Even so, we have to use lower-level interfaces than normal, since the table scan and relcache interfaces will not work for a database to which we're not connected. The advantage of this approach is that we do not need to rely on the filesystem to determine what ought to be copied, but instead on PostgreSQL's own knowledge of the database structure. This avoids, for example, copying stray files that happen to be located in the source database directory. Dilip Kumar, with a fairly large number of cosmetic changes by me. Reviewed and tested by Ashutosh Sharma, Andres Freund, John Naylor, Greg Nancarrow, Neha Sharma. Additional feedback from Bruce Momjian, Heikki Linnakangas, Julien Rouhaud, Adam Brusselback, Kyotaro Horiguchi, Tomas Vondra, Andrew Dunstan, Álvaro Herrera, and others. Discussion: http://postgr.es/m/CA+TgmoYtcdxBjLh31DLxUXHxFVMPGzrU5_T=CYCvRyFHywSBUQ@mail.gmail.com	2022-03-29 11:48:36 -04:00
Alvaro Herrera	bf902c1393	Revert "Fix replay of create database records on standby" This reverts commit `49d9cfc68b`. The approach taken by this patch has problems, so we'll come up with a radically different fix. Discussion: https://postgr.es/m/CA+TgmoYcUPL+WOJL2ZzhH=zmrhj0iOQ=iCFM0SuYqBbqZEamEg@mail.gmail.com	2022-03-29 15:36:21 +02:00
Robert Haas	edea649afb	Explain why the startup process can't cause a shortage of sinval slots. Bharath Rupireddy, reviewed by Fujii Masao and Yura Sokolov. Lightly edited by me. Discussion: http://postgr.es/m/CALj2ACU=3_frMkDp9UUeuZoAMjaK1y0Z_q5RFNbGvwi8NM==AA@mail.gmail.com	2022-03-29 09:24:24 -04:00
Michael Paquier	a2c84990be	Add system view pg_ident_file_mappings This view is similar to pg_hba_file_rules view, except that it is associated with the parsing of pg_ident.conf. Similarly to its cousin, this view is useful to check via SQL if changes planned in pg_ident.conf would work upon reload or restart, or to diagnose a previous failure. Bumps catalog version. Author: Julien Rouhaud Reviewed-by: Aleksander Alekseev, Michael Paquier Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud	2022-03-29 10:15:48 +09:00
Andrew Dunstan	33a377608f	IS JSON predicate This patch intrdocuces the SQL standard IS JSON predicate. It operates on text and bytea values representing JSON as well as on the json and jsonb types. Each test has an IS and IS NOT variant. The tests are: IS JSON [VALUE] IS JSON ARRAY IS JSON OBJECT IS JSON SCALAR IS JSON WITH \| WITHOUT UNIQUE KEYS These are mostly self-explanatory, but note that IS JSON WITHOUT UNIQUE KEYS is true whenever IS JSON is true, and IS JSON WITH UNIQUE KEYS is true whenever IS JSON is true except it IS JSON OBJECT is true and there are duplicate keys (which is never the case when applied to jsonb values). Nikita Glukhov Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru	2022-03-28 15:37:08 -04:00
Joe Conway	6198420ad8	Use has_privs_for_roles for predefined role checks Generally if a role is granted membership to another role with NOINHERIT they must use SET ROLE to access the privileges of that role, however with predefined roles the membership and privilege is conflated. Fix that by replacing is_member_of_role with has_privs_for_role for predefined roles. Patch does not remove is_member_of_role from acl.h, but it does add a warning not to use that function for privilege checking. Not backpatched based on hackers list discussion. Author: Joshua Brindle Reviewed-by: Stephen Frost, Nathan Bossart, Joe Conway Discussion: https://postgr.es/m/flat/CAGB+Vh4Zv_TvKt2tv3QNS6tUM_F_9icmuj0zjywwcgVi4PAhFA@mail.gmail.com	2022-03-28 15:10:04 -04:00
Robert Haas	79de9842ab	Remove the ability of a role to administer itself. Commit `f9fd176461` effectively gave every role ADMIN OPTION on itself. However, this appears to be something that happened accidentally as a result of refactoring work rather than an intentional decision. Almost a decade later, it was discovered that this was a security vulnerability. As a result, commit `fea164a72a` restricted this implicit ADMIN OPTION privilege to be exercisable only when the role being administered is the same as the session user and when no security-restricted operation is in progress. That commit also documented the existence of this implicit privilege for what seems to be the first time. The effect of the privilege is to allow a login role to grant the privileges of that role, and optionally ADMIN OPTION on it, to some other role. That's an unusual thing to do, because generally membership is granted in roles used as groups, rather than roles used as users. Therefore, it does not seem likely that removing the privilege will break things for many PostgreSQL users. However, it will make it easier to reason about the permissions system. This is the only case where a user who has not been given any special permission (superuser, or ADMIN OPTION on some role) can modify role membership, so removing it makes things more consistent. For example, if a superuser sets up role A and B and grants A to B but no other privileges to anyone, she can now be sure that no one else will be able to revoke that grant. Without this change, that would have been true only if A was a non-login role. Patch by me. Reviewed by Tom Lane and Stephen Frost. Discussion: http://postgr.es/m/CA+Tgmoawdt03kbA+dNyBcNWJpRxu0f4X=69Y3+DkXXZqmwMDLg@mail.gmail.com	2022-03-28 13:38:13 -04:00
Robert Haas	61762426e6	Fix a few goofs in new backup compression code. When we try to set the zstd compression level either on the client or on the server, check for errors. For any algorithm, on the client side, don't try to set the compression level unless the user specified one. This was visibly broken for zstd, which managed to set -1 rather than 0 in this case, but tidy up the code for the other methods, too. On the client side, if we fail to create a ZSTD_CCtx, exit after reporting the error. Otherwise we'll dereference a null pointer. Patch by me, reviewed by Dipesh Pandit. Discussion: http://postgr.es/m/CA+TgmoZK3zLQUCGi1h4XZw4jHiAWtcACc+GsdJR1_Mc19jUjXA@mail.gmail.com	2022-03-28 12:19:05 -04:00
Tom Lane	d22646922d	Add public ruleutils.c entry point to deparse a Query. This has no in-core callers but will be wanted by extensions. It's just a thin wrapper around get_query_def, so it adds little code. Also, fix get_from_clause_item() to force insertion of an alias for a SUBQUERY RTE item. This is irrelevant to existing uses because RTE_SUBQUERY items made by the parser always have aliases already. However, if one tried to use pg_get_querydef() to inspect a post-rewrite Query, it could be an issue. In any case, get_from_clause_item already contained logic to force alias insertion for VALUES items, so the lack of the same for SUBQUERY is a pretty clear oversight. In passing, replace duplicated code for selection of pretty-print options with a common macro. Julien Rouhaud, reviewed by Pavel Stehule, Gilles Darold, and myself Discussion: https://postgr.es/m/20210627041138.zklczwmu3ms4ufnk@nol	2022-03-28 11:19:37 -04:00
Alvaro Herrera	7103ebb7aa	Add support for MERGE SQL command MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows -- a task that would otherwise require multiple PL statements. For example, MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular tables, partitioned tables and inheritance hierarchies, including column and row security enforcement, as well as support for row and statement triggers and transition tables therein. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used from PL/pgSQL. MERGE does not support targetting updatable views or foreign tables, and RETURNING clauses are not allowed either. These limitations are likely fixable with sufficient effort. Rewrite rules are also not supported, but it's not clear that we'd want to support them. Author: Pavan Deolasee <pavan.deolasee@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Amit Langote <amitlangote09@gmail.com> Author: Simon Riggs <simon.riggs@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions) Reviewed-by: Peter Geoghegan <pg@bowt.ie> (earlier versions) Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions) Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com Discussion: https://postgr.es/m/20201231134736.GA25392@alvherre.pgsql	2022-03-28 16:47:48 +02:00
Peter Eisentraut	e26114c817	Make JSON path numeric literals more correct Per ECMAScript standard (ECMA-262, referenced by SQL standard), the syntax forms .1 1. should be allowed for decimal numeric literals, but the existing implementation rejected them. Also, by the same standard, reject trailing junk after numeric literals. Note that the ECMAScript standard for numeric literals is in respects like these slightly different from the JSON standard, which might be the original cause for this discrepancy. A change is that this kind of syntax is now rejected: 1.type() This needs to be written as (1).type() This is correct; normal JavaScript also does not accept this syntax. We also need to fix up the jsonpath output function for this case. We put parentheses around numeric items if they are followed by another path item. Reviewed-by: Nikita Glukhov <n.gluhov@postgrespro.ru> Discussion: https://www.postgresql.org/message-id/flat/50a828cc-0a00-7791-7883-2ed06dfb2dbb@enterprisedb.com	2022-03-28 11:11:39 +02:00
Andres Freund	91c0570a79	Don't fail for > 1 walsenders in 019_replslot_limit, add debug messages. So far the first of the retries introduced in `f28bf667f6` resolves the issue. But I (Andres) am still suspicious that the start of the failures might indicate a problem. To reduce noise, stop reporting a failure if a retry resolves the problem. To allow figuring out what causes the slow slot drop, add a few more debug messages to ReplicationSlotDropPtr. See also commit `afdeff1052`, `fe0972ee5e` and `f28bf667f6`. Discussion: https://postgr.es/m/20220327213219.smdvfkq2fl74flow@alap3.anarazel.de	2022-03-27 22:35:42 -07:00
Tom Lane	cc7401d5ca	Fix up compiler warnings/errors from `f4fb45d15`. Per early buildfarm returns.	2022-03-27 18:32:40 -04:00
Andrew Dunstan	f4fb45d15c	SQL/JSON constructors This patch introduces the SQL/JSON standard constructors for JSON: JSON() JSON_ARRAY() JSON_ARRAYAGG() JSON_OBJECT() JSON_OBJECTAGG() For the most part these functions provide facilities that mimic existing json/jsonb functions. However, they also offer some useful additional functionality. In addition to text input, the JSON() function accepts bytea input, which it will decode and constuct a json value from. The other functions provide useful options for handling duplicate keys and null values. This series of patches will be followed by a consolidated documentation patch. Nikita Glukhov Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru	2022-03-27 17:03:34 -04:00
Andrew Dunstan	f79b803dcc	Common SQL/JSON clauses This introduces some of the building blocks used by the SQL/JSON constructor and query functions. Specifically, it provides node executor and grammar support for the FORMAT JSON [ENCODING foo] clause, and values decorated with it, and for the RETURNING clause. The following SQL/JSON patches will leverage these. Nikita Glukhov (who probably deserves an award for perseverance). Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby. Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru	2022-03-27 17:03:33 -04:00
Tom Lane	c2d81ee902	Remove useless variable. flatten_join_alias_vars_mutator counted attnums, but didn't do anything with the count (no doubt it did at one time). Noted by buildfarm member seawasp.	2022-03-27 15:05:22 -04:00
Tom Lane	0fb6954aa5	Fix breakage of get_ps_display() in the PS_USE_NONE case. Commit `8c6d30f21` caused this function to fail to set *displen in the PS_USE_NONE code path. If the variable's previous value had been negative, that'd lead to a memory clobber at some call sites. We'd managed not to notice due to very thin test coverage of such configurations, but this appears to explain buildfarm member lorikeet's recent struggles. Credit to Andrew Dunstan for spotting the problem. Back-patch to v13 where the bug was introduced. Discussion: https://postgr.es/m/136102.1648320427@sss.pgh.pa.us	2022-03-27 12:57:46 -04:00
Michael Paquier	411b91360f	Fix comment in execParallel.c `0f61727` has made this comment incorrect. Author: Julien Rouhaud Reviewed-by: Matthias van de Meent Discussion: https://postgr.es/m/20220326160117.qtp5nkuku6cvhcby@jrouhaud	2022-03-27 18:22:22 +09:00
Tom Lane	979cd655c1	Suppress compiler warning in pub_collist_to_bitmapset(). A fair percentage of the buildfarm doesn't recognize that oldcxt won't be used uninitialized; silence those warnings by initializing it. While here, upgrade the function's thoroughly inadequate header comment. Oversight in `923def9a5`, per buildfarm.	2022-03-26 11:53:37 -04:00
Tomas Vondra	923def9a53	Allow specifying column lists for logical replication This allows specifying an optional column list when adding a table to logical replication. The column list may be specified after the table name, enclosed in parentheses. Columns not included in this list are not sent to the subscriber, allowing the schema on the subscriber to be a subset of the publisher schema. For UPDATE/DELETE publications, the column list needs to cover all REPLICA IDENTITY columns. For INSERT publications, the column list is arbitrary and may omit some REPLICA IDENTITY columns. Furthermore, if the table uses REPLICA IDENTITY FULL, column list is not allowed. The column list can contain only simple column references. Complex expressions, function calls etc. are not allowed. This restriction could be relaxed in the future. During the initial table synchronization, only columns included in the column list are copied to the subscriber. If the subscription has several publications, containing the same table with different column lists, columns specified in any of the lists will be copied. This means all columns are replicated if the table has no column list at all (which is treated as column list with all columns), or when of the publications is defined as FOR ALL TABLES (possibly IN SCHEMA that matches the schema of the table). For partitioned tables, publish_via_partition_root determines whether the column list for the root or the leaf relation will be used. If the parameter is 'false' (the default), the list defined for the leaf relation is used. Otherwise, the column list for the root partition will be used. Psql commands \dRp+ and \d <table-name> now display any column lists. Author: Tomas Vondra, Alvaro Herrera, Rahila Syed Reviewed-by: Peter Eisentraut, Alvaro Herrera, Vignesh C, Ibrar Ahmed, Amit Kapila, Hou zj, Peter Smith, Wang wei, Tang, Shi yu Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com	2022-03-26 01:01:27 +01:00
Tomas Vondra	05843b1aa4	Minor improvements in sequence decoding code and docs A couple minor comment improvements and code cleanups, based on post-commit feedback to the sequence decoding patch. Author: Amit Kapila, vignesh C Discussion: https://postgr.es/m/aeb2ba8d-e6f4-5486-cc4c-0d4982c291cb@enterprisedb.com	2022-03-25 21:07:17 +01:00
Tomas Vondra	002c9dd97a	Handle sequences in preprocess_pubobj_list Commit `75b1521dae` added support for logical replication of sequences, including grammar changes, but it did not update preprocess_pubobj_list accordingly. This can cause segfaults with "continuations", i.e. when command specifies a list of objects: CREATE PUBLICATION p FOR SEQUENCE s1, s2; Reported by Amit Kapila, patch by me. Reported-by: Amit Kapila Discussion: https://postgr.es/m/CAA4eK1JxDNKGBSNTyN-Xj2JRjzFo+ziSqJbjH==vuO0YF_CQrg@mail.gmail.com	2022-03-25 14:29:56 +01:00
Alvaro Herrera	49d9cfc68b	Fix replay of create database records on standby Crash recovery on standby may encounter missing directories when replaying create database WAL records. Prior to this patch, the standby would fail to recover in such a case. However, the directories could be legitimately missing. Consider a sequence of WAL records as follows: CREATE DATABASE DROP DATABASE DROP TABLESPACE If, after replaying the last WAL record and removing the tablespace directory, the standby crashes and has to replay the create database record again, the crash recovery must be able to move on. This patch adds a mechanism similar to invalid-page tracking, to keep a tally of missing directories during crash recovery. If all the missing directory references are matched with corresponding drop records at the end of crash recovery, the standby can safely continue following the primary. Backpatch to 13, at least for now. The bug is older, but fixing it in older branches requires more careful study of the interactions with commit `e6d8069522`, which appeared in 13. A new TAP test file is added to verify the condition. However, because it depends on commit `d6d317dbf6`, it can only be added to branch master. I (Álvaro) manually verified that the code behaves as expected in branch 14. It's a bit nervous-making to leave the code uncovered by tests in older branches, but leaving the bug unfixed is even worse. Also, the main reason this fix took so long is precisely that we couldn't agree on a good strategy to approach testing for the bug, so perhaps this is the best we can do. Diagnosed-by: Paul Guo <paulguo@gmail.com> Author: Paul Guo <paulguo@gmail.com> Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Asim R Praveen <apraveen@pivotal.io> Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com	2022-03-25 13:16:21 +01:00
Peter Eisentraut	23119d51a1	Refactor DLSUFFIX handling Move DLSUFFIX from makefiles into header files for all platforms. Move the DLSUFFIX assignment from src/makefiles/ to src/templates/, have configure read it, and then substitute it into Makefile.global and pg_config.h. This avoids the need for all makefile rules that need it to locally set CPPFLAGS. It also resolves an inconsistent setup between the two Windows build systems. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/2f9861fb-8969-9005-7518-b8e60f2bead9@enterprisedb.com	2022-03-25 08:56:02 +01:00
Michael Paquier	ad8759bea0	Fix typos in standby.c xl_running_xacts exists, not xl_xact_running_xacts. Author: Hou Zhijie Discussion: https://postgr.es/m/OS0PR01MB57160D8B01097FFB5C175065941A9@OS0PR01MB5716.jpnprd01.prod.outlook.com	2022-03-25 14:11:18 +09:00
Amit Kapila	3e67a5cac6	Remove some useless free calls. These were introduced in recent commit `52e4f0cd47`. We were trying to free some transient space consumption and that too was not entirely correct and complete. We don't need this partial freeing of memory as it will be allocated just once for a query and will be freed at the end of the query. Author: Zhihong Yu Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CALNJ-vQORfQ=vicbKA_RmeGZGzm1y3WsEcZqXWi7qjN43Cz_vg@mail.gmail.com	2022-03-25 07:37:06 +05:30
Tom Lane	ce95c54376	Fix pg_statio_all_tables view for multiple TOAST indexes. A TOAST table can normally have only one index, but there are corner cases where it has more; for example, transiently during REINDEX CONCURRENTLY. In such a case, the pg_statio_all_tables view produced multiple rows for the owning table, one per TOAST index. Refactor the view to avoid that, instead summing the stats across all the indexes, as we do for regular table indexes. While this has been wrong for a long time, back-patching seems unwise due to the difficulty of putting a system view change into back branches. Andrei Zubkov, tweaked a bit by me Discussion: https://postgr.es/m/acefef4189706971fc475f912c1afdab1c48d627.camel@moonset.ru	2022-03-24 16:33:13 -04:00
Robert Haas	412ad7a556	Fix possible recovery trouble if TRUNCATE overlaps a checkpoint. If TRUNCATE causes some buffers to be invalidated and thus the checkpoint does not flush them, TRUNCATE must also ensure that the corresponding files are truncated on disk. Otherwise, a replay from the checkpoint might find that the buffers exist but have the wrong contents, which may cause replay to fail. Report by Teja Mupparti. Patch by Kyotaro Horiguchi, per a design suggestion from Heikki Linnakangas, with some changes to the comments by me. Review of this and a prior patch that approached the issue differently by Heikki Linnakangas, Andres Freund, Álvaro Herrera, Masahiko Sawada, and Tom Lane. Discussion: http://postgr.es/m/BYAPR06MB6373BF50B469CA393C614257ABF00@BYAPR06MB6373.namprd06.prod.outlook.com	2022-03-24 14:52:28 -04:00
Tomas Vondra	75b1521dae	Add decoding of sequences to built-in replication This commit adds support for decoding of sequences to the built-in replication (the infrastructure was added by commit `0da92dc530`). The syntax and behavior mostly mimics handling of tables, i.e. a publication may be defined as FOR ALL SEQUENCES (replicating all sequences in a database), FOR ALL SEQUENCES IN SCHEMA (replicating all sequences in a particular schema) or individual sequences. To publish sequence modifications, the publication has to include 'sequence' action. The protocol is extended with a new message, describing sequence increments. A new system view pg_publication_sequences lists all the sequences added to a publication, both directly and indirectly. Various psql commands (\d and \dRp) are improved to also display publications including a given sequence, or sequences included in a publication. Author: Tomas Vondra, Cary Huang Reviewed-by: Peter Eisentraut, Amit Kapila, Hannu Krosing, Andres Freund, Petr Jelinek Discussion: https://postgr.es/m/d045f3c2-6cfb-06d3-5540-e63c320df8bc@enterprisedb.com Discussion: https://postgr.es/m/1710ed7e13b.cd7177461430746.3372264562543607781@highgo.ca	2022-03-24 18:49:27 +01:00
Alvaro Herrera	e27f4ee0a7	Change fastgetattr and heap_getattr to inline functions They were macros previously, but recent callsite additions made Coverity complain about one of the assertions being always true. This change could have been made a long time ago, but the Coverity complain broke the inertia. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Discussion: https://postgr.es/m/202203241021.uts52sczx3al@alvherre.pgsql	2022-03-24 18:02:27 +01:00
Tom Lane	0bd7af082a	Invent recursive_worktable_factor GUC to replace hard-wired constant. Up to now, the planner estimated the size of a recursive query's worktable as 10 times the size of the non-recursive term. It's hard to see how to do significantly better than that automatically, but we can give users control over the multiplier to allow tuning for specific use-cases. The default behavior remains the same. Simon Riggs Discussion: https://postgr.es/m/CANbhV-EuaLm4H3g0+BSTYHEGxJj3Kht0R+rJ8vT57Dejnh=_nA@mail.gmail.com	2022-03-24 11:47:41 -04:00
Peter Eisentraut	a47651447f	Remove unnecessary translator comment Discussion: https://www.postgresql.org/message-id/flat/CALj2ACUfJKTmK5v%3DvF%2BH2iLkqM9Yvjsp6iXaCqAks6gDpzZh6g%40mail.gmail.com	2022-03-24 14:07:38 +01:00
Michael Paquier	d4781d8873	Refactor code related to pg_hba_file_rules() into new file hba.c is growing big, and more contents are planned for it. In order to prepare for this future work, this commit moves all the code related to the system function processing the contents of pg_hba.conf, pg_hba_file_rules() to a new file called hbafuncs.c, which will be used as the location for the SQL portion of the authentication file parsing. While on it, HbaToken, the structure holding a string token lexed from a configuration file related to authentication, is renamed to a more generic AuthToken, as it gets used not only for pg_hba.conf, but also for pg_ident.conf. TokenizedLine is now named TokenizedAuthLine. The size of hba.c is reduced by ~12%. Author: Julien Rouhaud Reviewed-by: Aleksander Alekseev, Michael Paquier Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud	2022-03-24 12:42:30 +09:00
Andres Freund	3ac7d02412	Don't try to translate NULL in GetConfigOptionByNum(). Noticed via -fsanitize=undefined. Introduced when a few columns in GetConfigOptionByNum() / pg_settings started to be translated in `72be8c29a` / PG 12. Backpatch to all affected branches, for the same reasons as `46ab07ffda`. Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de Backpatch: 12-	2022-03-23 13:05:59 -07:00
Andres Freund	1c6bb380e5	Don't call fwrite() with len == 0 when writing out relcache init file. Noticed via -fsanitize=undefined. Backpatch to all branches, for the same reasons as `46ab07ffda`. Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de Backpatch: 10-	2022-03-23 13:05:25 -07:00
Robert Haas	591767150f	pg_basebackup: Try to fix some failures on Windows. Commit `ffd53659c4` messed up the mechanism that was being used to pass parameters to LogStreamerMain() on Windows. It worked on Linux because only Windows was using threads. Repair by moving the additional parameters added by that commit into the 'logstreamer_param' struct. Along the way, fix a compiler warning on builds without HAVE_LIBZ. Discussion: http://postgr.es/m/CA+TgmoY5=AmWOtMj3v+cySP2rR=Bt6EGyF_joAq4CfczMddKtw@mail.gmail.com	2022-03-23 13:25:26 -04:00
Alvaro Herrera	9d92582abf	Fix "missing continuation record" after standby promotion Invalidate abortedRecPtr and missingContrecPtr after a missing continuation record is successfully skipped on a standby. This fixes a PANIC caused when a recently promoted standby attempts to write an OVERWRITE_RECORD with an LSN of the previously read aborted record. Backpatch to 10 (all stable versions). Author: Sami Imseih <simseih@amazon.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/44D259DE-7542-49C4-8A52-2AB01534DCA9@amazon.com	2022-03-23 18:22:10 +01:00
Robert Haas	607e75e8f0	Unbreak the build. Commit `ffd53659c4` broke the build for anyone not compiling with LZ4 and ZSTD enabled. Woops.	2022-03-23 10:22:54 -04:00
Robert Haas	ffd53659c4	Replace BASE_BACKUP COMPRESSION_LEVEL option with COMPRESSION_DETAIL. There are more compression parameters that can be specified than just an integer compression level, so rename the new COMPRESSION_LEVEL option to COMPRESSION_DETAIL before it gets released. Introduce a flexible syntax for that option to allow arbitrary options to be specified without needing to adjust the main replication grammar, and common code to parse it that is shared between the client and the server. This commit doesn't actually add any new compression parameters, so the only user-visible change is that you can now type something like pg_basebackup --compress gzip:level=5 instead of writing just pg_basebackup --compress gzip:5. However, it should make it easy to add new options. If for example gzip starts offering fries, we can support pg_basebackup --compress gzip:level=5,fries=true for the benefit of users who want fries with that. Along the way, this fixes a few things in pg_basebackup so that the pg_basebackup can be used with a server-side compression algorithm that pg_basebackup itself does not understand. For example, pg_basebackup --compress server-lz4 could still succeed even if only the server and not the client has LZ4 support, provided that the other options to pg_basebackup don't require the client to decompress the archive. Patch by me. Reviewed by Justin Pryzby and Dagfinn Ilmari Mannsåker. Discussion: http://postgr.es/m/CA+TgmoYvpetyRAbbg1M8b3-iHsaN4nsgmWPjOENu5-doHuJ7fA@mail.gmail.com	2022-03-23 09:19:14 -04:00
Andrew Dunstan	1460fc5942	Revert "Common SQL/JSON clauses" This reverts commit `865fe4d5df`. This has caused issues with a significant number of buildfarm members	2022-03-22 19:56:14 -04:00
Andrew Dunstan	865fe4d5df	Common SQL/JSON clauses This introduces some of the building blocks used by the SQL/JSON constructor and query functions. Specifically, it provides node executor and grammar support for the FORMAT JSON [ENCODING foo] clause, and values decorated with it, and for the RETURNING clause. The following SQL/JSON patches will leverage these. Nikita Glukhov (who probably deserves an award for perseverance). Reviewers have included (in no particular order) Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup. Erik Rijkers, Zihong Yu and Himanshu Upadhyaya. Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru	2022-03-22 17:32:54 -04:00
Andres Freund	ce8c978295	pgstat: fix function name in comment. Introduced in `3fa17d3771`.	2022-03-22 08:15:40 -07:00
Andrew Dunstan	d11e84ea46	Add String object access hooks This caters for cases where the access is to an object identified by name rather than Oid. The first user of these is the GUC access controls Joshua Brindle and Mark Dilger Discussion: https://postgr.es/m/47F87A0E-C0E5-43A6-89F6-D403F2B45175@enterprisedb.com	2022-03-22 10:28:31 -04:00
Tom Lane	29992a6a50	Revert "graceful shutdown" changes for Windows. This reverts commits `6051857fc` and `ed52c3707` in HEAD (they were already reverted in the back branches). Further testing has shown that while those changes do fix some things, they also break others; in particular, it looks like walreceivers fail to detect walsender-initiated connection close reliably if the walsender shuts down this way. A proper fix for this seems possible but won't be done in time for v15. Discussion: https://postgr.es/m/CA+hUKG+OeoETZQ=Qw5Ub5h3tmwQhBmDA=nuNO3KG=zWfUypFAw@mail.gmail.com Discussion: https://postgr.es/m/CA+hUKGKkp2XkvSe9nG+bsgkXVKCdTeGSa_TR0Qx1jafc_oqCVA@mail.gmail.com	2022-03-22 10:19:15 -04:00
Dean Rasheed	7faa5fc84b	Add support for security invoker views. A security invoker view checks permissions for accessing its underlying base relations using the privileges of the user of the view, rather than the privileges of the view owner. Additionally, if any of the base relations are tables with RLS enabled, the policies of the user of the view are applied, rather than those of the view owner. This allows views to be defined without giving away additional privileges on the underlying base relations, and matches a similar feature available in other database systems. It also allows views to operate more naturally with RLS, without affecting the assignments of policies to users. Christoph Heiss, with some additional hacking by me. Reviewed by Laurenz Albe and Wolfgang Walther. Discussion: https://postgr.es/m/b66dd6d6-ad3e-c6f2-8b90-47be773da240%40cybertec.at	2022-03-22 10:28:10 +00:00
Amit Kapila	208c5d65bb	Add ALTER SUBSCRIPTION ... SKIP. This feature allows skipping the transaction on subscriber nodes. If incoming change violates any constraint, logical replication stops until it's resolved. Currently, users need to either manually resolve the conflict by updating a subscriber-side database or by using function pg_replication_origin_advance() to skip the conflicting transaction. This commit introduces a simpler way to skip the conflicting transactions. The user can specify LSN by ALTER SUBSCRIPTION ... SKIP (lsn = XXX), which allows the apply worker to skip the transaction finished at specified LSN. The apply worker skips all data modification changes within the transaction. Author: Masahiko Sawada Reviewed-by: Takamichi Osumi, Hou Zhijie, Peter Eisentraut, Amit Kapila, Shi Yu, Vignesh C, Greg Nancarrow, Haiying Tang, Euler Taveira Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com	2022-03-22 07:11:19 +05:30
Andres Freund	315ae75e9b	pgstat: reorder pgstat.[ch] contents. Now that `13619598f1` has split pgstat up into multiple files it isn't quite as hard to come up with a sensible order for pgstat.[ch]. Inconsistent naming makes it still not quite right looking, but that's work for another commit. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de	2022-03-21 16:21:00 -07:00
Tom Lane	2591ee8ec4	Fix assorted missing logic for GroupingFunc nodes. The planner needs to treat GroupingFunc like Aggref for many purposes, in particular with respect to processing of the argument expressions, which are not to be evaluated at runtime. A few places hadn't gotten that memo, notably including subselect.c's processing of outer-level aggregates. This resulted in assertion failures or wrong plans for cases in which a GROUPING() construct references an outer aggregation level. Also fix missing special cases for GroupingFunc in cost_qual_eval (resulting in wrong cost estimates for GROUPING(), although it's not clear that that would affect plan shapes in practice) and in ruleutils.c (resulting in excess parentheses in pretty-print mode). Per bug #17088 from Yaoguang Chen. Back-patch to all supported branches. Richard Guo, Tom Lane Discussion: https://postgr.es/m/17088-e33882b387de7f5c@postgresql.org	2022-03-21 17:44:29 -04:00
Andres Freund	13619598f1	pgstat: split different types of stats into separate files. pgstat.c is very long, and it's hard to find an order that makes sense and is likely to be maintained over time. Splitting the different pieces into separate files makes that a lot easier. With a few exceptions, this commit just moves code around. Those exceptions are: - adding file headers for new files - removing 'static' from functions - adapting pgstat_assert_is_up() to work across TUs - minor comment adjustments git diff --color-moved=dimmed-zebra is very helpful separating code movement from code changes. The next commit in this series will reorder pgstat.[ch] contents to be a bit more coherent. Earlier revisions of this patch had "global" statistics (archiver, bgwriter, checkpointer, replication slots, SLRU, WAL) in one file, because each seemed small enough. However later commits will increase their size and their aggregate size is not insubstantial. It also just seems easier to split each type of statistic into its own file. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de	2022-03-21 12:02:25 -07:00
Tom Lane	cb02fcb4c9	Fix bogus dependency handling for GENERATED expressions. For GENERATED columns, we record all dependencies of the generation expression as AUTO dependencies of the column itself. This means that the generated column is silently dropped if any dependency is removed, even if CASCADE wasn't specified. This is at least a POLA violation, but I think it's actually based on a misreading of the standard. The standard does say that you can't drop a dependent GENERATED column in RESTRICT mode; but that's buried down in a subparagraph, on a different page from some pseudocode that makes it look like an AUTO drop is being suggested. Change this to be more like the way that we handle regular default expressions, ie record the dependencies as NORMAL dependencies of the pg_attrdef entry. Also, make the pg_attrdef entry's dependency on the column itself be INTERNAL not AUTO. That has two effects: * the column will go away, not just lose its default, if any dependency of the expression is dropped with CASCADE. So we don't need any special mechanism to make that happen. * it provides an additional cross-check preventing someone from dropping the default expression without dropping the column. catversion bump because of change in the contents of pg_depend (which also requires a change in one information_schema view). Per bug #17439 from Kevin Humphreys. Although this is a longstanding bug, it seems impractical to back-patch because of the need for catalog contents changes. Discussion: https://postgr.es/m/17439-7df4421197e928f0@postgresql.org	2022-03-21 14:58:49 -04:00
Tom Lane	17f3bc0928	Move pg_attrdef manipulation code into new file catalog/pg_attrdef.c. This is a pure refactoring commit: there isn't (I hope) any functional change. StoreAttrDefault and RemoveAttrDefault[ById] are moved from heap.c, reducing the size of that overly-large file by about 300 lines. I took the opportunity to trim unused #includes from heap.c, too. Two new functions for translating between a pg_attrdef OID and the relid/attnum of the owning column are created by extracting ad-hoc code from objectaddress.c. This already removes one copy of said code, and a follow-on bug fix will create more callers. The only other function directly manipulating pg_attrdef is AttrDefaultFetch. I judged it was better to leave that in relcache.c, since it shares special concerns about recursion and error handling with the rest of that module. Discussion: https://postgr.es/m/651168.1647451676@sss.pgh.pa.us	2022-03-21 14:38:23 -04:00
Tom Lane	7b6ec86532	Fix risk of deadlock failure while dropping a partitioned index. DROP INDEX needs to lock the index's table before the index itself, else it will deadlock against ordinary queries that acquire the relation locks in that order. This is correctly mechanized for plain indexes by RangeVarCallbackForDropRelation; but in the case of a partitioned index, we neglected to lock the child tables in advance of locking the child indexes. We can fix that by traversing the inheritance tree and acquiring the needed locks in RemoveRelations, after we have acquired our locks on the parent partitioned table and index. While at it, do some refactoring to eliminate confusion between the actual and expected relkind in RangeVarCallbackForDropRelation. We can save a couple of syscache lookups too, by having that function pass back info that RemoveRelations will need. Back-patch to v11 where partitioned indexes were added. Jimmy Yih, Gaurab Dey, Tom Lane Discussion: https://postgr.es/m/BYAPR05MB645402330042E17D91A70C12BD5F9@BYAPR05MB6454.namprd05.prod.outlook.com	2022-03-21 12:22:13 -04:00
Tom Lane	1f8bc44868	Remove workarounds for avoiding [U]INT64_FORMAT in translatable strings. Further code simplification along the same lines as `d914eb347` and earlier patches. Aleksander Alekseev, Japin Li Discussion: https://postgr.es/m/CAJ7c6TMSKi3Xs8h5MP38XOnQQpBLazJvVxVfPn++roitDJcR7g@mail.gmail.com	2022-03-21 11:11:55 -04:00
Magnus Hagander	c540d37157	Fix typo in file identification Clearly a simple copy/paste mistake when the file was created.	2022-03-21 12:35:48 +02:00
Andres Freund	d4ba8b51c7	pgstat: separate "xact level" handling out of relation specific functions. This is in preparation of a later commit moving relation stats handling into its own file. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de	2022-03-20 19:12:09 -07:00
Andres Freund	bff258a273	pgstat: rename pgstat_initstats() to pgstat_relation_init(). The old name was overly generic. An upcoming commit moves relation stats handling into its own file, making pgstat_initstats() look even more out of place. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de	2022-03-20 19:12:09 -07:00
Andres Freund	8363102009	pgstat: introduce pgstat_relation_should_count(). A later commit will make the check more complicated than the current (rel)->pgstat_info != NULL. It also just seems nicer to have a central copy of the logic, even while still simple. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de	2022-03-20 19:12:09 -07:00
Alvaro Herrera	2d655a08d5	Blind fix for uninitialized memory bug in `ba9a7e3921` Valgrind animal skink shows a crash in this new code. I couldn't reproduce the problem locally, but going by blind code inspection, initializing insert_destrel should be sufficient to fix the problem.	2022-03-20 22:10:24 +01:00
Alvaro Herrera	ba9a7e3921	Enforce foreign key correctly during cross-partition updates When an update on a partitioned table referenced in foreign key constraints causes a row to move from one partition to another, the fact that the move is implemented as a delete followed by an insert on the target partition causes the foreign key triggers to have surprising behavior. For example, a given foreign key's delete trigger which implements the ON DELETE CASCADE clause of that key will delete any referencing rows when triggered for that internal DELETE, although it should not, because the referenced row is simply being moved from one partition of the referenced root partitioned table into another, not being deleted from it. This commit teaches trigger.c to skip queuing such delete trigger events on the leaf partitions in favor of an UPDATE event fired on the root target relation. Doing so is sensible because both the old and the new tuple "logically" belong to the root relation. The after trigger event queuing interface now allows passing the source and the target partitions of a particular cross-partition update when registering the update event for the root partitioned table. Along with the two ctids of the old and the new tuple, the after trigger event now also stores the OIDs of those partitions. The tuples fetched from the source and the target partitions are converted into the root table format, if necessary, before they are passed to the trigger function. The implementation currently has a limitation that only the foreign keys pointing into the query's target relation are considered, not those of its sub-partitioned partitions. That seems like a reasonable limitation, because it sounds rare to have distinct foreign keys pointing to sub-partitioned partitions instead of to the root table. This misbehavior stems from commit `f56f8f8da6` (which added support for foreign keys to reference partitioned tables) not paying sufficient attention to commit `2f17844104` (which had introduced cross-partition updates a year earlier). Even though the former commit goes back to Postgres 12, we're not backpatching this fix at this time for fear of destabilizing things too much, and because there are a few ABI breaks in it that we'd have to work around in older branches. It also depends on commit `f4566345cf`, which had its own share of backpatchability issues as well. Author: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Eduard Català <eduard.catala@gmail.com> Discussion: https://postgr.es/m/CA+HiwqFvkBCmfwkQX_yBqv2Wz8ugUGiBDxum8=WvVbfU1TXaNg@mail.gmail.com Discussion: https://postgr.es/m/CAL54xNZsLwEM1XCk5yW9EqaRzsZYHuWsHQkA2L5MOSKXAwviCQ@mail.gmail.com	2022-03-20 18:43:40 +01:00
Peter Eisentraut	3a671e1f7c	Fix global ICU collations for ICU < 54 createdb() didn't check for collation attributes validity, which has to be done explicitly on ICU < 54. It also forgot to close the ICU collator opened during the check which leaks some memory. To fix both, add a new check_icu_locale() that does all the appropriate verification and close the ICU collator. initdb also had some partial check for ICU < 54. To have consistent error reporting across major ICU versions, and get rid of the need to include ucol.h, remove the partial check there. The backend will report an error if needed during the post-boostrap iniitialization phase. Author: Julien Rouhaud <julien.rouhaud@free.fr> Discussion: https://www.postgresql.org/message-id/20220319041459.qqqiqh335sga5ezj@jrouhaud	2022-03-20 10:21:45 +01:00
Andres Freund	78f9506b38	pgstat: split out WAL handling from pgstat_{initialize,report_stat}. A later commit will move the handling of the different kinds of stats into separate files. By splitting out WAL handling in this commit that later move will just move code around without other changes. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de	2022-03-19 11:42:22 -07:00
Andres Freund	89c546c294	pgstat: split relation, database handling out of pgstat_report_stat(). pgstat_report_stat() handles several types of stats, yet relation stats have so far been handled directly in pgstat_report_stat(). A later commit will move the handling of the different kinds of stats into separate files. By splitting out relation handling in this commit that later move will just move code around without other changes. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de	2022-03-19 11:42:22 -07:00
Andres Freund	a3a75b982b	pgstat: run pgindent on pgstat.c/h. Upcoming commits will touch a lot of the pgstats code. Reindenting separately makes it easier to keep the code in a well-formatted shape each step. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220303021600.hs34ghqcw6zcokdh@alap3.anarazel.de	2022-03-19 11:42:22 -07:00
Alvaro Herrera	a1fc50672c	Fix an outdated and grammatically wrong comment Authored by Amit Langote and myself independently Discussion: https://postgr.es/m/CA+HiwqGCjcH0gG-=tM7hhP7TEDmzrHMHJbPGSHtHgFmx9mnFkg@mail.gmail.com	2022-03-19 19:34:04 +01:00
Andres Freund	97bddda61b	Silence -Wmaybe-uninitialized compiler warning in dbcommands.c. Introduced in `f2553d4306`. See also `3f6b3be39c`, which did so for nearby variables. Discussion: https://postgr.es/m/20220319014707.kgtomqdzm6m2ulro@alap3.anarazel.de	2022-03-18 18:48:03 -07:00
Tom Lane	068739fb4f	Fix incorrect xmlschema output for types timetz and timestamptz. The output of table_to_xmlschema() and allied functions includes a regex describing valid values for these types ... but the regex was itself invalid, as it failed to escape a literal "+" sign. Report and fix by Renan Soares Lopes. Back-patch to all supported branches. Discussion: https://postgr.es/m/7f6fabaa-3f8f-49ab-89ca-59fbfe633105@me.com	2022-03-18 16:01:42 -04:00
Thomas Munro	3f1ce97346	Add circular WAL decoding buffer, take II. Teach xlogreader.c to decode the WAL into a circular buffer. This will support optimizations based on looking ahead, to follow in a later commit. * XLogReadRecord() works as before, decoding records one by one, and allowing them to be examined via the traditional XLogRecGetXXX() macros and certain traditional members like xlogreader->ReadRecPtr. * An alternative new interface XLogReadAhead()/XLogNextRecord() is added that returns pointers to DecodedXLogRecord objects so that it's now possible to look ahead in the WAL stream while replaying. * In order to be able to use the new interface effectively while streaming data, support is added for the page_read() callback to respond to a new nonblocking mode with XLREAD_WOULDBLOCK instead of waiting for more data to arrive. No direct user of the new interface is included in this commit, though XLogReadRecord() uses it internally. Existing code doesn't need to change, except in a few places where it was accessing reader internals directly and now needs to go through accessor macros. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> (earlier versions) Discussion: https://postgr.es/m/CA+hUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq=AovOddfHpA@mail.gmail.com	2022-03-18 18:45:47 +13:00
Tom Lane	d7b5c071dd	Don't bother to attach column name lists to RowExprs of named types. If a RowExpr is marked as returning a named composite type, we aren't going to consult its colnames list; we'll use the attribute names shown for the type in pg_attribute. Hence, skip storing that list, to save a few nanoseconds when copying the expression tree around. Discussion: https://postgr.es/m/2950001.1638729947@sss.pgh.pa.us	2022-03-17 18:25:44 -04:00
Tom Lane	ec62cb0aac	Revert applying column aliases to the output of whole-row Vars. In commit `bf7ca1587`, I had the bright idea that we could make the result of a whole-row Var (that is, foo.) track any column aliases that had been applied to the FROM entry the Var refers to. However, that's not terribly logically consistent, because now the output of the Var is no longer of the named composite type that the Var claims to emit. `bf7ca1587` tried to handle that by changing the output tuple values to be labeled with a blessed RECORD type, but that's really pretty disastrous: we can wind up storing such tuples onto disk, whereupon they're not readable by other sessions. The only practical fix I can see is to give up on what `bf7ca1587` tried to do, and say that the column names of tuples produced by a whole-row Var are always those of the underlying named composite type, query aliases or no. While this introduces some inconsistencies, it removes others, so it's not that awful in the abstract. What is* kind of awful is to make such a behavioral change in a back-patched bug fix. But corrupt data is worse, so back-patched it will be. (A workaround available to anyone who's unhappy about this is to introduce an extra level of sub-SELECT, so that the whole-row Var is referring to the sub-SELECT's output and not to a named table type. Then the Var is of type RECORD to begin with and there's no issue.) Per report from Miles Delahunty. The faulty commit dates to 9.5, so back-patch to all supported branches. Discussion: https://postgr.es/m/2950001.1638729947@sss.pgh.pa.us	2022-03-17 18:18:05 -04:00
Robert Haas	39f0c4bd67	Refactor code for reading and writing relation map files. Restructure things so that the functions which update the global variables shared_map and local_map are separate from the functions which just read and write relation map files without touching any global variables. In the new structure of things, write_relmap_file() writes a relmap file but no longer performs global variable updates. A symmetric function read_relmap_file() that just reads a file without changing any global variables is added, and load_relmap_file(), which does change the global variables, uses it as a subroutine. Because write_relmap_file() no longer updates shared_map and local_map, that logic is moved to perform_relmap_update(). However, no similar logic is added to relmap_redo() even though it also calls write_relmap_file(). That's because recovery must not rely on the contents of the relation map, and therefore there is no need to initialize it. In fact, doing so seems like a mistake, because we might then manage to rely on the in-memory map where we shouldn't. Patch by me, based on earlier work by Dilip Kumar. Reviewed by Ashutosh Sharma. Discussion: http://postgr.es/m/CA+TgmobQLgrt4AXsc0ru7aFFkzv=9fS-Q_yO69=k9WY67RCctg@mail.gmail.com	2022-03-17 13:21:07 -04:00
Tomas Vondra	5a07966225	Fix row filters with multiple publications When publishing changes through a artition root, we should use the row filter for the top-most ancestor. The relation may be added to multiple publications, using different ancestors, and `52e4f0cd47` handled this incorrectly. With `c91f71b9dc` we find the correct top-most ancestor, but the code tried to fetch the row filter from all publications, including those using a different ancestor etc. No row filter can be found for such publications, which was treated as replicating all rows. Similarly to `c91f71b9dc`, this seems to be a rare issue in practice. It requires multiple publications including the same partitioned relation, through different ancestors. Fixed by only passing publications containing the top-most ancestor to pgoutput_row_filter_init(), so that treating a missing row filter as replicating all rows is correct. Report and fix by me, test case by Hou zj. Reviews and improvements by Amit Kapila. Author: Tomas Vondra, Hou zj, Amit Kapila Reviewed-by: Amit Kapila, Hou zj Discussion: https://postgr.es/m/d26d24dd-2fab-3c48-0162-2b7f84a9c893%40enterprisedb.com	2022-03-17 17:03:48 +01:00
Alvaro Herrera	25e777cf8e	Split ExecUpdate and ExecDelete into reusable pieces Create subroutines ExecUpdatePrologue / ExecUpdateAct / ExecUpdateEpilogue, and similar for ExecDelete. Introduce a new struct to be used internally in nodeModifyTable.c, dubbed ModifyTableContext, which contains all context information needed to perform these operations, as well as ExecInsert and others. This allows using a different schedule and a different way of evaluating the results of these operations, which can be exploited by a later commit introducing support for MERGE. It also makes ExecUpdate and ExecDelete proper shorter and (hopefully) simpler. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/202202271724.4z7xv3cf46kv@alvherre.pgsql	2022-03-17 11:47:04 +01:00
Peter Eisentraut	f2553d4306	Add option to use ICU as global locale provider This adds the option to use ICU as the default locale provider for either the whole cluster or a database. New options for initdb, createdb, and CREATE DATABASE are used to select this. Since some (legacy) code still uses the libc locale facilities directly, we still need to set the libc global locale settings even if ICU is otherwise selected. So pg_database now has three locale-related fields: the existing datcollate and datctype, which are always set, and a new daticulocale, which is only set if ICU is selected. A similar change is made in pg_collation for consistency, but in that case, only the libc-related fields or the ICU-related field is set, never both. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a%402ndquadrant.com	2022-03-17 11:13:16 +01:00
Michael Paquier	f6f0db4d62	Fix pg_tablespace_location() with in-place tablespaces Using this system function with an in-place tablespace (created when allow_in_place_tablespaces is enabled by specifying an empty string as location) caused a failure when using readlink(), as the tablespace is, in this case, not a symbolic link in pg_tblspc/ but a directory. Rather than getting a failure, the commit changes pg_tablespace_location() so as a relative path to the data directory is returned for in-place tablespaces, to make a difference between tablespaces created when allow_in_place_tablespaces is enabled or not. Getting a path rather than an empty string that would match the CREATE TABLESPACE command in this case is more useful for tests that would like to rely on this function. While on it, a regression test is added for this case. This is simple to add in the main regression test suite thanks to regexp_replace() to mask the part of the tablespace location dependent on its OID. Author: Michael Paquier Reviewed-by: Kyotaro Horiguchi, Thomas Munro Discussion: https://postgr.es/m/YiG1RleON1WBcLnX@paquier.xyz	2022-03-17 11:25:02 +09:00
Tomas Vondra	c91f71b9dc	Fix publish_as_relid with multiple publications Commit `83fd4532a7` allowed publishing of changes via ancestors, for publications defined with publish_via_partition_root. But the way the ancestor was determined in get_rel_sync_entry() was incorrect, simply updating the same variable. So with multiple publications, replicating different ancestors, the outcome depended on the order of publications in the list - the value from the last loop was used, even if it wasn't the top-most ancestor. This is a probably rare situation, as in most cases publications do not overlap, so each partition has exactly one candidate ancestor to replicate as and there's no ambiguity. Fixed by tracking the "ancestor level" for each publication, and picking the top-most ancestor. Adds a test case, verifying the correct ancestor is used for publishing the changes and that this does not depend on order of publications in the list. Older releases have another bug in this loop - once all actions are replicated, the loop is terminated, on the assumption that inspecting additional publications is unecessary. But that misses the fact that those additional applications may replicate different ancestors. Fixed by removal of this break condition. We might still terminate the loop in some cases (e.g. when replicating all actions and the ancestor is the partition root). Backpatch to 13, where publish_via_partition_root was introduced. Initial report and fix by me, test added by Hou zj. Reviews and improvements by Amit Kapila. Author: Tomas Vondra, Hou zj, Amit Kapila Reviewed-by: Amit Kapila, Hou zj Discussion: https://postgr.es/m/d26d24dd-2fab-3c48-0162-2b7f84a9c893%40enterprisedb.com	2022-03-16 18:05:58 +01:00
Robert Haas	d0083c1d2a	Suppress compiler warnings. Michael Paquier Discussion: http://postgr.es/m/YjGvq4zPDT6j15go@paquier.xyz	2022-03-16 09:26:48 -04:00
Thomas Munro	46d9bfb0a6	Fix race between DROP TABLESPACE and checkpointing. Commands like ALTER TABLE SET TABLESPACE may leave files for the next checkpoint to clean up. If such files are not removed by the time DROP TABLESPACE is called, we request a checkpoint so that they are deleted. However, there is presently a window before checkpoint start where new unlink requests won't be scheduled until the following checkpoint. This means that the checkpoint forced by DROP TABLESPACE might not remove the files we expect it to remove, and the following ERROR will be emitted: ERROR: tablespace "mytblspc" is not empty To fix, add a call to AbsorbSyncRequests() just before advancing the unlink cycle counter. This ensures that any unlink requests forwarded prior to checkpoint start (i.e., when ckpt_started is incremented) will be processed by the current checkpoint. Since AbsorbSyncRequests() performs memory allocations, it cannot be called within a critical section, so we also need to move SyncPreCheckpoint() to before CreateCheckPoint()'s critical section. This is an old bug, so back-patch to all supported versions. Author: Nathan Bossart <nathandbossart@gmail.com> Reported-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220215235845.GA2665318%40nathanxps13	2022-03-16 17:20:24 +13:00
Thomas Munro	3390ef1b7b	Fix waiting in RegisterSyncRequest(). If we run out of space in the checkpointer sync request queue (which is hopefully rare on real systems, but common with very small buffer pool), we wait for it to drain. While waiting, we should report that as a wait event so that users know what is going on, and also handle postmaster death, since otherwise the loop might never terminate if the checkpointer has exited. Back-patch to 12. Although the problem exists in earlier releases too, the code is structured differently before 12 so I haven't gone any further for now, in the absence of field complaints. Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220226213942.nb7uvb2pamyu26dj%40alap3.anarazel.de	2022-03-16 15:35:16 +13:00
Thomas Munro	5e6368b42e	Wake up for latches in CheckpointWriteDelay(). The checkpointer shouldn't ignore its latch. Other backends may be waiting for it to drain the request queue. Hopefully real systems don't have a full queue often, but the condition is reached easily when shared_buffers is small. This involves defining a new wait event, which will appear in the pg_stat_activity view often due to spread checkpoints. Back-patch only to 14. Even though the problem exists in earlier branches too, it's hard to hit there. In 14 we stopped using signal handlers for latches on Linux, *BSD and macOS, which were previously hiding this problem by interrupting the sleep (though not reliably, as the signal could arrive before the sleep begins; precisely the problem latches address). Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220226213942.nb7uvb2pamyu26dj%40alap3.anarazel.de	2022-03-16 13:57:59 +13:00
Thomas Munro	a56e7b6601	Silence LLVM 14 API deprecation warnings. We are going to need to handle the upcoming opaque pointer API changes[1], possibly in time for LLVM 15, but in the meantime let's silence the warnings produced by LLVM 14. [1] https://llvm.org/docs/OpaquePointers.html Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA%2BhUKG%2Bp%3DfaBQR2PSAqWoWa%2B_tJdKPT0wjZPQe7XcDEttUCgdQ%40mail.gmail.com	2022-03-16 10:30:55 +13:00
Robert Haas	8ef1fa3ee0	Remove accidentally-committed file.	2022-03-15 13:41:36 -04:00
Robert Haas	e4ba69f3f4	Allow extensions to add new backup targets. Commit `3500ccc39b` allowed for base backup targets, meaning that we could do something with the backup other than send it to the client, but all of those targets had to be baked in to the core code. This commit makes it possible for extensions to define additional backup targets. Patch by me, reviewed by Abhijit Menon-Sen. Discussion: http://postgr.es/m/CA+TgmoaqvdT-u3nt+_kkZ7bgDAyqDB0i-+XOMmr5JN2Rd37hxw@mail.gmail.com	2022-03-15 13:22:04 -04:00
Robert Haas	75eae09087	Change HAVE_LIBLZ4 and HAVE_LIBZSTD tests to USE_LZ4 and USE_ZSTD. These tests were added recently, but older code tests USE_LZ4 rathr than HAVE_LIBLZ4, so let's follow the established precedent. It also seems more consistent with the intent of the configure tests, since I think that the USE_* symbols are intended to correspond to what the user requested, and the HAVE_* symbols to what configure found while probing. Discussion: http://postgr.es/m/CA+Tgmoap+hTD2-QNPJLH4tffeFE8MX5+xkbFKMU3FKBy=ZSNKA@mail.gmail.com	2022-03-15 13:06:25 -04:00
Amit Kapila	695f459f17	Fix compiler warning introduced in commit `705e20f855`. Reported-by: Nathan Bossart Author: Nathan Bossart Reviewed-by: Osumi Takamichi Discussion : https://postgr.es/m/20220314230424.GA1085716@nathanxps13	2022-03-15 08:11:17 +05:30
Michael Paquier	6bdf1a1400	Fix collection of typos in the code and the documentation Some words were duplicated while other places were grammatically incorrect, including one variable name in the code. Author: Otto Kekalainen, Justin Pryzby Discussion: https://postgr.es/m/7DDBEFC5-09B6-4325-B942-B563D1A24BDC@amazon.com	2022-03-15 11:29:35 +09:00
Thomas Munro	c6f2f01611	Fix pg_basebackup with in-place tablespaces. Previously, pg_basebackup from a cluster that contained an 'in-place' tablespace, as introduced by commit `7170f215`, would produce a harmless warning on Unix and fail completely on Windows. Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/20220304.165449.1200020258723305904.horikyota.ntt%40gmail.com	2022-03-15 14:01:23 +13:00
Robert Haas	9dde82899c	Support "of", "tzh", and "tzm" format codes. The upper case versions "OF", "TZH", and "TZM" are already supported, and all other format codes that are supported in upper case are also supported in lower case, so we should support these as well for consistency. Nitin Jadhav, with a tiny cosmetic change by me. Reviewed by Suraj Kharage and David Zhang. Discussion: http://postgr.es/m/CAMm1aWZ-oZyKd75+8D=VJ0sAoSwtdXWLP-MAWD4D8R1Dgandzw@mail.gmail.com	2022-03-14 16:50:54 -04:00
Amit Kapila	705e20f855	Optionally disable subscriptions on error. Logical replication apply workers for a subscription can easily get stuck in an infinite loop of attempting to apply a change, triggering an error (such as a constraint violation), exiting with the error written to the subscription server log, and restarting. To partially remedy the situation, this patch adds a new subscription option named 'disable_on_error'. To be consistent with old behavior, this option defaults to false. When true, both the tablesync worker and apply worker catch any errors thrown and disable the subscription in order to break the loop. The error is still also written in the logs. Once the subscription is disabled, users can either manually resolve the conflict/error or skip the conflicting transaction by using pg_replication_origin_advance() function. After resolving the conflict, users need to enable the subscription to allow apply process to proceed. Author: Osumi Takamichi and Mark Dilger Reviewed-by: Greg Nancarrow, Vignesh C, Amit Kapila, Wang wei, Tang Haiying, Peter Smith, Masahiko Sawada, Shi Yu Discussion : https://postgr.es/m/DB35438F-9356-4841-89A0-412709EBD3AB%40enterprisedb.com	2022-03-14 09:32:40 +05:30
Peter Geoghegan	6e20f4600a	VACUUM VERBOSE: tweak scanned_pages logic. Commit `872770fd6c` taught VACUUM VERBOSE and autovacuum logging to display the total number of pages scanned by VACUUM. This information was also displayed as a percentage of rel_pages in parenthesis, which makes it easy to spot trends over time and across tables. The instrumentation displayed "0 scanned (0.00% of total)" for totally empty tables. Tweak the instrumentation: have it show "0 scanned (100.00% of total)" for empty tables instead. This approach is clearer and more consistent.	2022-03-13 13:07:49 -07:00
Peter Geoghegan	e370f100f0	vacuumlazy.c: Standardize rel_pages terminology. VACUUM's rel_pages field indicates the size of the target heap rel just after the table_relation_vacuum() operation began. There are specific expectations around how rel_pages can be related to other nearby state. In particular, the range of rel_pages must contain every tuple in the relation whose tuple headers might contain an XID < OldestXmin. Consistently refer to the field as rel_pages to make this clearer and more discoverable. This is follow-up work to commit `73f6ec3d` from earlier today. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220311031351.sbge5m2bpvy2ttxg@alap3.anarazel.de	2022-03-12 13:20:45 -08:00
Peter Geoghegan	73f6ec3d3c	vacuumlazy.c: document vistest and OldestXmin. Explain the relationship between vacuumlazy.c's vistest and OldestXmin cutoffs. These closely related cutoffs are different in subtle but important ways. Also document a closely related rule: we must establish rel_pages _after_ OldestXmin to ensure that no XID < OldestXmin can be missed by lazy_scan_heap(). It's easier to explain these issues by initializing everything together, so consolidate initialization of vacrel state. Now almost every vacrel field is initialized by heap_vacuum_rel(). The only remaining exception is the dead_items array, which is still managed by lazy_scan_heap() due to interactions with how we initialize parallel VACUUM. Also move the process that updates pg_class entries for each index into heap_vacuum_rel(), and adjust related assertions. All pg_class updates now take place after lazy_scan_heap() returns, which seems clearer. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20211211045710.ljtuu4gfloh754rs@alap3.anarazel.de Discussion: https://postgr.es/m/CAH2-WznYsUxVT156rCQ+q=YD4S4=1M37hWvvHLz-H1pwSM8-Ew@mail.gmail.com	2022-03-12 12:52:38 -08:00
Peter Geoghegan	5b68f75e12	Normalize heap_prepare_freeze_tuple argument name. We called the argument totally_frozen in its function prototype as well as in code comments, even though totally_frozen_p was used in the function definition. Standardize on totally_frozen.	2022-03-11 19:30:21 -08:00
Alvaro Herrera	3a46a45f6f	Add API of sorts for transition table handling in trigger.c Preparatory patch for further additions in this area, particularly to allow MERGE to have separate transition tables for each action. Author: Pavan Deolasee <pavan.deolasee@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/CABOikdNj+8HEJ5D8tu56mrPkjHVRrBb2_cdKWwpiYNcjXgDw8g@mail.gmail.com Discussion: https://postgr.es/m/20201231134736.GA25392@alvherre.pgsql	2022-03-11 20:40:03 -03:00
Tom Lane	641f3dffcd	Restore the previous semantics of get_constraint_index(). Commit `8b069ef5d` changed this function to look at pg_constraint.conindid rather than searching pg_depend. That was a good performance improvement, but it failed to preserve the exact semantics. The old code would only return an index that was "owned by" (internally dependent on) the specified constraint, whereas the new code will also return indexes that are just referenced by foreign key constraints. This confuses ALTER TABLE, which was implicitly expecting the previous semantics, into failing with errors like ERROR: relation 146621 has multiple clustered indexes or ERROR: "pk_attbl" is not an index for table "atref" We can fix this without reverting the performance improvement by adding a contype check in get_constraint_index(). Another way could be to make ALTER TABLE check it, but I'm worried that extension code could also have subtle dependencies on the old semantics. Tom Lane and Japin Li, per bug #17409 from Holly Roberts. Back-patch to v14 where the error crept in. Discussion: https://postgr.es/m/17409-52871dda8b5741cb@postgresql.org	2022-03-11 13:47:29 -05:00
Peter Eisentraut	e94bb1473e	DefineCollation() code cleanup Reorganize the code in DefineCollation() so that the parts using the FROM clause and the parts not doing so are more cleanly separated. No functionality change intended. Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://www.postgresql.org/message-id/29ae752f-80e9-8d31-601c-62cf01cc93d8@enterprisedb.com	2022-03-11 08:32:52 +01:00
Michael Paquier	e9537321a7	Add support for zstd with compression of full-page writes in WAL wal_compression gains a new value, "zstd", to allow the compression of full-page images using the compression method of the same name. Compression is done using the default level recommended by the library, as of ZSTD_CLEVEL_DEFAULT = 3. Some benchmarking has shown that it could make sense to use a level lower for the FPI compression, like 1 or 2, as the compression rate did not change much with a bit less CPU consumed, but any tests done would only cover few scenarios so it is hard to come to a clear conclusion. Anyway, there is no reason to not use the default level instead, which is the level recommended by the library so it should be fine for most cases. zstd outclasses easily pglz, and is better than LZ4 where one wants to have more compression at the cost of extra CPU but both are good enough in their own scenarios, so the choice between one or the other of these comes to a study of the workload patterns and the schema involved, mainly. This commit relies heavily on `4035cd5`, that reshaped the code creating and restoring full-page writes to be aware of the compression type, making this integration straight-forward. This patch borrows some early work from Andrey Borodin, though the patch got a complete rewrite. Author: Justin Pryzby Discussion: https://postgr.es/m/20220222231948.GJ9008@telsasoft.com	2022-03-11 12:18:53 +09:00
Michael Paquier	0071fc7127	Fix header inclusion order in xloginsert.c with lz4.h Per project policy, all system and library headers need to be declared in the backend code after "postgres.h" and before the internal headers, but `4035cd5` broke this policy when adding support for LZ4 in wal_compression. Noticed while reviewing the patch to add support for zstd in this area. This only impacts HEAD, so there is no need for a back-patch.	2022-03-11 10:59:47 +09:00
Andres Freund	352d297dc7	dshash: Add sequential scan support. Add ability to scan all entries sequentially to dshash. The interface is similar but a bit different both from that of dynahash and simple dshash search functions. The most significant differences is that dshash's interfac always needs a call to dshash_seq_term when scan ends. Another is locking. Dshash holds partition lock when returning an entry, dshash_seq_next() also holds lock when returning an entry but callers shouldn't release it, since the lock is essential to continue a scan. The seqscan interface allows entry deletion while a scan is in progress using dshash_delete_current(). Reviewed-By: Andres Freund <andres@anarazel.de> Author: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>	2022-03-10 12:57:05 -08:00
Peter Eisentraut	df4c3cbd8f	Add parse_analyze_withcb() This extracts code from pg_analyze_and_rewrite_withcb() into a separate function that mirrors the existing parse_analyze_fixedparams() and parse_analyze_varparams(). Reviewed-by: Nathan Bossart <bossartn@amazon.com> Discussion: https://www.postgresql.org/message-id/flat/c67ce276-52b4-0239-dc0e-39875bf81840@enterprisedb.com	2022-03-09 11:08:16 +01:00
Robert Haas	1d4be6be65	Fix LZ4 tests for remaining buffer space. We should flush the buffer when the remaining space is less than the maximum amount that we might need, not when it is less than or equal to the maximum amount we might need. Jeevan Ladhe, per an observation from me. Discussion: http://postgr.es/m/CANm22CgVMa85O1akgs+DOPE8NSrT1zbz5_vYfS83_r+6nCivLQ@mail.gmail.com	2022-03-08 10:05:55 -05:00
Robert Haas	7cf085f077	Add support for zstd base backup compression. Both client-side compression and server-side compression are now supported for zstd. In addition, a backup compressed by the server using zstd can now be decompressed by the client in order to accommodate the use of -Fp. Jeevan Ladhe, with some edits by me. Discussion: http://postgr.es/m/CA+Tgmobyzfbz=gyze2_LL1ZumZunmaEKbHQxjrFkOR7APZGu-g@mail.gmail.com	2022-03-08 09:52:43 -05:00
Michael Paquier	c28839c832	Improve comment in execReplication.c Author: Peter Smith Reviewed-by: Julien Rouhaud Discussion: https://postgr.es/m/CAHut+PuRVf3ghNTg8EV5XOQu6unGSZma0ahsRoz-haaOFZe-1A@mail.gmail.com	2022-03-08 14:29:03 +09:00
Amit Kapila	d3e8368c4b	Add the additional information to the logical replication worker errcontext. This commits adds both the finish LSN (commit_lsn in case transaction got committed, prepare_lsn in case of a prepared transaction, etc.) and replication origin name to the existing error context message. This will help users in specifying the origin name and transaction finish LSN to pg_replication_origin_advance() SQL function to skip a particular transaction. Author: Masahiko Sawada Reviewed-by: Takamichi Osumi, Euler Taveira, and Amit Kapila Discussion: https://postgr.es/m/CAD21AoBarBf2oTF71ig2g_o=3Z_Dt6_sOpMQma1kFgbnA5OZ_w@mail.gmail.com	2022-03-08 08:08:32 +05:30
Tomas Vondra	d5ed9da41d	Call ReorderBufferProcessXid from sequence_decode Commit `0da92dc530` added sequence_decode() implementing logical decoding of sequences, but it failed to call ReorderBufferProcessXid() as it should. So add the missing call. Reported-by: Amit Kapila Discussion: https://postgr.es/m/CAA4eK1KGn6cQqJEsubOOENwQOANsExiV2sKL52r4U10J8NJEMQ%40mail.gmail.com	2022-03-07 20:53:16 +01:00
Peter Eisentraut	25751f54b8	Add pg_analyze_and_rewrite_varparams() This new function extracts common code from PrepareQuery() and exec_parse_message(). It is then exactly analogous to the existing pg_analyze_and_rewrite_fixedparams() and pg_analyze_and_rewrite_withcb(). To unify these two code paths, this makes PrepareQuery() now subject to log_parser_stats. Also, both paths now invoke TRACE_POSTGRESQL_QUERY_REWRITE_START(). PrepareQuery() no longer checks whether a utility statement was specified. The grammar doesn't allow that anyway, and exec_parse_message() supports it, so restricting it doesn't seem necessary. This also adds QueryEnvironment support to the *varparams functions, for consistency with its cousins, even though it is not used right now. Reviewed-by: Nathan Bossart <bossartn@amazon.com> Discussion: https://www.postgresql.org/message-id/flat/c67ce276-52b4-0239-dc0e-39875bf81840@enterprisedb.com	2022-03-07 08:13:30 +01:00
Amit Kapila	5e0e99a80b	Make the errcontext message in logical replication worker translation friendly. Previously, the message for logical replication worker errcontext is incrementally built, which was not translation friendly. Instead, we use complete sentences with if-else branches. We also remove the commit timestamp from the context message since it's not important information and made the message long. Author: Masahiko Sawada Reviewed-by: Takamichi Osumi, and Amit Kapila Discussion: https://postgr.es/m/CAD21AoBarBf2oTF71ig2g_o=3Z_Dt6_sOpMQma1kFgbnA5OZ_w@mail.gmail.com	2022-03-07 08:33:58 +05:30
Michael Paquier	9e98583898	Create routine able to set single-call SRFs for Materialize mode Set-returning functions that use the Materialize mode, creating a tuplestore to include all the tuples returned in a set rather than doing so in multiple calls, use roughly the same set of steps to prepare ReturnSetInfo for this job: - Check if ReturnSetInfo supports returning a tuplestore and if the materialize mode is enabled. - Create a tuplestore for all the tuples part of the returned set in the per-query memory context, stored in ReturnSetInfo->setResult. - Build a tuple descriptor mostly from get_call_result_type(), then stored in ReturnSetInfo->setDesc. Note that there are some cases where the SRF's tuple descriptor has to be the one specified by the function caller. This refactoring is done so as there are (well, should be) no behavior changes in any of the in-core functions refactored, and the centralized function that checks and sets up the function's ReturnSetInfo can be controlled with a set of bits32 options. Two of them prove to be necessary now: - SRF_SINGLE_USE_EXPECTED to use expectedDesc as tuple descriptor, as expected by the function's caller. - SRF_SINGLE_BLESS to validate the tuple descriptor for the SRF. The same initialization pattern is simplified in 28 places per my count as of src/backend/, shaving up to ~900 lines of code. These mostly come from the removal of the per-query initializations and the sanity checks now grouped in a single location. There are more locations that could be simplified in contrib/, that are left for a follow-up cleanup. `fcc2817`, `07daca5` and `d61a361` have prepared the areas of the code related to this change, to ease this refactoring. Author: Melanie Plageman, Michael Paquier Reviewed-by: Álvaro Herrera, Justin Pryzby Discussion: https://postgr.es/m/CAAKRu_azyd1Z3W_r7Ou4sorTjRCs+PxeHw1CWJeXKofkE6TuZg@mail.gmail.com	2022-03-07 10:26:29 +09:00
Peter Eisentraut	791b1b71da	Parse/analyze function renaming There are three parallel ways to call parse/analyze: with fixed parameters, with variable parameters, and by supplying your own parser callback. Some of the involved functions were confusingly named and made this API structure more confusing. This patch renames some functions to make this clearer: parse_analyze() -> parse_analyze_fixedparams() pg_analyze_and_rewrite() -> pg_analyze_and_rewrite_fixedparams() (Otherwise one might think this variant doesn't accept parameters, but in fact all three ways accept parameters.) pg_analyze_and_rewrite_params() -> pg_analyze_and_rewrite_withcb() (Before, and also when considering pg_analyze_and_rewrite(), one might think this is the only way to pass parameters. Moreover, the parser callback doesn't necessarily need to parse only parameters, it's just one of the things it could do.) parse_fixed_parameters() -> setup_parse_fixed_parameters() parse_variable_parameters() -> setup_parse_variable_parameters() (These functions don't actually do any parsing, they just set up callbacks to use during parsing later.) This patch also adds some const decorations to the fixed-parameters API, so the distinction from the variable-parameters API is more clear. Reviewed-by: Nathan Bossart <bossartn@amazon.com> Discussion: https://www.postgresql.org/message-id/flat/c67ce276-52b4-0239-dc0e-39875bf81840@enterprisedb.com	2022-03-04 14:50:22 +01:00
Tom Lane	f7ea240aa7	Tighten overflow checks in tidin(). This code seems to have been written on the assumption that "unsigned long" is 32 bits; or at any rate it ignored the possibility of conversion overflow. Rewrite, borrowing some logic from oidin(). Discussion: https://postgr.es/m/3441768.1646343914@sss.pgh.pa.us	2022-03-03 20:04:35 -05:00
Tom Lane	46ab07ffda	Clean up assorted failures under clang's -fsanitize=undefined checks. Most of these are cases where we could call memcpy() or other libc functions with a NULL pointer and a zero count, which is forbidden by POSIX even though every production version of libc allows it. We've fixed such things before in a piecemeal way, but apparently never made an effort to try to get them all. I don't claim that this patch does so either, but it gets every failure I observe in check-world, using clang 12.0.1 on current RHEL8. numeric.c has a different issue that the sanitizer doesn't like: "ln(-1.0)" will compute log10(0) and then try to assign the resulting -Inf to an integer variable. We don't actually use the result in such a case, so there's no live bug. Back-patch to all supported branches, with the idea that we might start running a buildfarm member that tests this case. This includes back-patching `c1132aae3` (Check the size in COPY_POINTER_FIELD), which previously silenced some of these issues in copyfuncs.c. Discussion: https://postgr.es/m/CALNJ-vT9r0DSsAOw9OXVJFxLENoVS_68kJ5x0p44atoYH+H4dg@mail.gmail.com	2022-03-03 18:13:24 -05:00
Michael Paquier	62ce0c758d	Fix catalog data of pg_stop_backup(), labelled v2 This function has been incorrectly marked as a set-returning function with prorows (estimated number of rows) set to 1 since its creation in `7117685`, that introduced non-exclusive backups. There is no need for that as the function is designed to return only one tuple. This commit fixes the catalog definition of pg_stop_backup_v2() so as it is not marked as proretset anymore, with prorows set to 0. This simplifies its internals by removing one tuplestore (used for one single record anyway) and by removing all the checks related to a set-returning function. Issue found during my quest to simplify some of the logic used in in-core system functions. Bump catalog version. Reviewed-by: Aleksander Alekseev, Kyotaro Horiguchi Discussion: https://postgr.es/m/Yh8guT78f1Ercfzw@paquier.xyz	2022-03-03 10:51:57 +09:00
Amit Kapila	7a85073290	Reconsider pg_stat_subscription_workers view. It was decided (refer to the Discussion link below) that the stats collector is not an appropriate place to store the error information of subscription workers. This patch changes the pg_stat_subscription_workers view (introduced by commit `8d74fc96db`) so that it stores only statistics counters: apply_error_count and sync_error_count, and has one entry for each subscription. The removed error information such as error-XID and the error message would be stored in another way in the future which is more reliable and persistent. After removing these error details, there is no longer any relation information, so the subscription statistics are now a cluster-wide statistics. The patch also changes the view name to pg_stat_subscription_stats since the word "worker" is an implementation detail that we use one worker for one tablesync and one apply. Author: Masahiko Sawada, based on suggestions by Andres Freund Reviewed-by: Peter Smith, Haiying Tang, Takamichi Osumi, Amit Kapila Discussion: https://postgr.es/m/20220125063131.4cmvsxbz2tdg6g65@alap3.anarazel.de	2022-03-01 06:17:52 +05:30
Tom Lane	54bd1e43ca	Handle integer overflow in interval justification functions. justify_interval, justify_hours, and justify_days didn't check for overflow when promoting hours to days or days to months; but that's possible when the upper field's value is already large. Detect and report any such overflow. Also, we can avoid unnecessary overflow in some cases in justify_interval by pre-justifying the days field. (Thanks to Nathan Bossart for this idea.) Joe Koshakow Discussion: https://postgr.es/m/CAAvxfHeNqsJ2xYFbPUf_8nNQUiJqkag04NW6aBQQ0dbZsxfWHA@mail.gmail.com	2022-02-28 15:36:54 -05:00
Tom Lane	a59c79564b	Allow root-owned SSL private keys in libpq, not only the backend. This change makes libpq apply the same private-key-file ownership and permissions checks that we have used in the backend since commit `9a83564c5`. Namely, that the private key can be owned by either the current user or root (with different file permissions allowed in the two cases). This allows system-wide management of key files, which is just as sensible on the client side as the server, particularly when the client is itself some application daemon. Sync the comments about this between libpq and the backend, too. David Steele Discussion: https://postgr.es/m/f4b7bc55-97ac-9e69-7398-335e212f7743@pgmasters.net	2022-02-28 14:12:52 -05:00
Tom Lane	12d768e704	Don't use static storage for SaveTransactionCharacteristics(). This is pretty queasy-making on general principles, and the more so once you notice that CommitTransactionCommand() is actually stomping on the values saved by _SPI_commit(). It's okay as long as the active values didn't change during HoldPinnedPortals(); but that's a larger assumption than I think we want to make, especially since the fix is so simple. Discussion: https://postgr.es/m/1533956.1645731245@sss.pgh.pa.us	2022-02-28 12:54:12 -05:00
Tom Lane	2e517818f4	Fix SPI's handling of errors during transaction commit. SPI_commit previously left it up to the caller to recover from any error occurring during commit. Since that's complicated and requires use of low-level xact.c facilities, it's not too surprising that no caller got it right. Let's move the responsibility for cleanup into spi.c. Doing that requires redefining SPI_commit as starting a new transaction, so that it becomes equivalent to SPI_commit_and_chain except that you get default transaction characteristics instead of preserving the prior transaction's characteristics. We can make this pretty transparent API-wise by redefining SPI_start_transaction() as a no-op. Callers that expect to do something in between might be surprised, but available evidence is that no callers do so. Having made that API redefinition, we can fix this mess by having SPI_commit[_and_chain] trap errors and start a new, clean transaction before re-throwing the error. Likewise for SPI_rollback[_and_chain]. Some cleanup is also needed in AtEOXact_SPI, which was nowhere near smart enough to deal with SPI contexts nested inside a committing context. While plperl and pltcl need no changes beyond removing their now-useless SPI_start_transaction() calls, plpython needs some more work because it hadn't gotten the memo about catching commit/rollback errors in the first place. Such an error resulted in longjmp'ing out of the Python interpreter, which leaks Python stack entries at present and is reported to crash Python 3.11 altogether. Add the missing logic to catch such errors and convert them into Python exceptions. We are probably going to have to back-patch this once Python 3.11 ships, but it's a sufficiently basic change that I'm a bit nervous about doing so immediately. Let's let it bake awhile in HEAD first. Peter Eisentraut and Tom Lane Discussion: https://postgr.es/m/3375ffd8-d71c-2565-e348-a597d6e739e3@enterprisedb.com Discussion: https://postgr.es/m/17416-ed8fe5d7213d6c25@postgresql.org	2022-02-28 12:45:36 -05:00
Dean Rasheed	d1b307eef2	Optimise numeric division for one and two base-NBASE digit divisors. Formerly div_var() had "fast path" short division code that was significantly faster when the divisor was just one base-NBASE digit, but otherwise used long division. This commit adds a new function div_var_int() that divides by an arbitrary 32-bit integer, using the fast short division algorithm, and updates both div_var() and div_var_fast() to use it for one and two digit divisors. In the case of div_var(), this is slightly faster in the one-digit case, because it avoids some digit array copying, and is much faster in the two-digit case where it replaces long division. For div_var_fast(), it is much faster in both cases because the main div_var_fast() algorithm is optimised for larger inputs. Additionally, optimise exp() and ln() by using div_var_int(), allowing a NumericVar to be replaced by an int in a couple of places, most notably in the Taylor series code. This produces a significant speedup of exp(), ln() and the numeric_big regression test. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCVwsBi-ND-t82Cuuh1=8ee6jdOpzsmGN+CUZB6yjLg9jw@mail.gmail.com	2022-02-27 11:12:30 +00:00
Dean Rasheed	d996d648f3	Simplify the inner loop of numeric division in div_var(). In the standard numeric division algorithm, the inner loop multiplies the divisor by the next quotient digit and subtracts that from the working dividend. As suggested by the original code comment, the separate "carry" and "borrow" variables (from the multiplication and subtraction steps respectively) can be folded together into a single variable. Doing so significantly improves performance, as well as simplifying the code. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCVwsBi-ND-t82Cuuh1=8ee6jdOpzsmGN+CUZB6yjLg9jw@mail.gmail.com	2022-02-27 10:41:12 +00:00
Dean Rasheed	e3d41d08a1	Apply auto-vectorization to the inner loop of div_var_fast(). This loop is basically the same as the inner loop of mul_var(), which was auto-vectorized in commit `8870917623`, but the compiler will only consider auto-vectorizing the div_var_fast() loop if the assignment target div[qi + i] is replaced by div_qi[i], where div_qi = &div[qi]. Additionally, since the compiler doesn't know that qdigit is guaranteed to fit in a 16-bit NumericDigit, cast it to NumericDigit before multiplying to make the resulting auto-vectorized code more efficient (avoiding unnecessary multiplication of the high 16 bits). While at it, per suggestion from Tom Lane, change var1digit in mul_var() to be a NumericDigit rather than an int for the same reason. This actually makes no difference with modern gcc, but it might help other compilers generate more efficient assembly. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCVwsBi-ND-t82Cuuh1=8ee6jdOpzsmGN+CUZB6yjLg9jw@mail.gmail.com	2022-02-27 10:15:46 +00:00
Andres Freund	d33aeefd9b	Fix warning on mingw due to pid_t width, introduced in `fe0972ee5e`.	2022-02-26 16:07:07 -08:00
Amit Kapila	a89850a57e	Fix typo in logicalfuncs.c. Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACX1mVtw8LWEnZgnpPdk2bPFR1xX2ZN+8GfXCffyip_9=Q@mail.gmail.com	2022-02-26 10:38:37 +05:30
Andres Freund	fe0972ee5e	Add further debug info to help debug 019_replslot_limit.pl failures. See also `afdeff1052`. Failures after that commit provided a few more hints, but not yet enough to understand what's going on. In 019_replslot_limit.pl shut down nodes with fast instead of immediate mode if we observe the failure mode. That should tell us whether the failures we're observing are just a timing issue under high load. PGCTLTIMEOUT should prevent buildfarm animals from hanging endlessly. Also adds a bit more logging to replication slot drop and ShutdownPostgres(). Discussion: https://postgr.es/m/20220225192941.hqnvefgdzaro6gzg@alap3.anarazel.de	2022-02-25 17:04:39 -08:00
Peter Geoghegan	73c61a50a1	vacuumlazy.c: Remove obsolete num_tuples field. Commit `49c9d9fc` unified VACUUM VERBOSE and autovacuum logging. It neglected to remove an old vacrel field that was only used by the old VACUUM VERBOSE, so remove it now. The previous num_tuples approach doesn't seem to have any real advantage over the approach VACUUM VERBOSE takes now (also the approach used by the autovacuum logging code), which is to show new_rel_tuples. new_rel_tuples is the possibly-estimated total number of tuples left in the table, whereas num_tuples meant the number of tuples encountered during the VACUUM operation, after pruning, without regard for tuples from pages skipped via the visibility map. In passing, reorder a related vacrel field for consistency.	2022-02-24 19:01:54 -08:00
Peter Geoghegan	cf879d3069	Remove unnecessary heap_tuple_needs_freeze argument. The buffer argument hasn't been used since the function was first added by commit `bbb6e559c4`. The sibling heap_prepare_freeze_tuple function doesn't have such an argument either. Remove it.	2022-02-24 18:31:07 -08:00
Heikki Linnakangas	6c46e8a5df	Fix data loss on crash after sorted GiST index build. If a checkpoint happens during sorted GiST index build, and the system crashes after the checkpoint and after the index build has finished, the data written to the index before the checkpoint started could be lost. The checkpoint won't fsync it, and it won't be replayed at crash recovery either. Fix by calling smgrimmedsync() after the index build, just like in B-tree index build. Backpatch to v14 where the sorted GiST index build was introduced. Reported-by: Melanie Plageman Discussion: https://www.postgresql.org/message-id/CAAKRu_ZJJynimxKj5xYBSziL62-iEtPE+fx-B=JzR=jUtP92mw@mail.gmail.com	2022-02-24 16:15:12 +02:00
Michael Paquier	e77216fcb0	Simplify more checks related to set-returning functions This makes more consistent the SRF-related checks in the area of PL/pgSQL, PL/Perl, PL/Tcl, pageinspect and some of the JSON worker functions, making it easier to grep for the same error patterns through the code, reducing a bit the translation work. It is worth noting that each_worker_jsonb()/each_worker() in jsonfuncs.c and pageinspect's brin_page_items() were doing a check on expectedDesc that is not required as they fetch their tuple descriptor directly from get_call_result_type(). This looks like a set of copy-paste errors that have spread over the years. This commit is a continuation of the changes begun in `07daca5`, for any remaining code paths on sight. Like `fcc2817`, this makes the code more consistent, easing the integration of a larger patch that will refactor the way tuplestores are created and checked in a good portion of the set-returning functions present in core. I have worked my way through the changes of this patch by myself, and Ranier has proposed the same changes in a different thread in parallel, though there were some inconsistencies related in expectedDesc in what was proposed by him. Author: Michael Paquier, Ranier Vilela Discussion: https://postgr.es/m/CAAKRu_azyd1Z3W_r7Ou4sorTjRCs+PxeHw1CWJeXKofkE6TuZg@mail.gmail.com Discussion: https://postgr.es/m/CAEudQApm=AFuJjEHLBjBcJbxcw4pBMwg2sHwXyCXYcbBOj3hpg@mail.gmail.com	2022-02-24 16:54:59 +09:00
Michael Paquier	fcc28178c6	Clean up and simplify code in a couple of set-returning functions The following set-returning functions have their logic simplified, to be more consistent with other in-core areas: - pg_prepared_statement()'s tuple descriptor is now created with get_call_result_type() instead of being created from scratch, saving from some duplication with pg_proc.dat. - show_all_file_settings(), similarly, now uses get_call_result_type() to build its tuple descriptor instead of creating it from scratch. - pg_options_to_table() made use of a static routine called only once. This commit removes this internal routine to make the function easier to follow. - pg_config() was using a unique logic style, doing checks on the tuple descriptor passed down in expectedDesc, but it has no need to do so. This switches the function to use a tuplestore with a tuple descriptor retrieved from get_call_result_type(), instead. This simplifies an upcoming patch aimed at refactoring the way tuplestores are created and checked in set-returning functions, this change making sense as its own independent cleanup by shaving some code. Author: Melanie Plageman, Michael Paquier Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/CAAKRu_azyd1Z3W_r7Ou4sorTjRCs+PxeHw1CWJeXKofkE6TuZg@mail.gmail.com	2022-02-24 16:11:34 +09:00
Tom Lane	bd74c4037c	Re-allow underscore as first character of custom GUC names. Commit `3db826bd5` intended that valid_custom_variable_name's rules for valid identifiers match those of scan.l. However, I (tgl) had some kind of brain fade and put "_" in the wrong list. Fix by Japin Li, per bug #17415 from Daniel Polski. Discussion: https://postgr.es/m/17415-ebdb683d7e09a51c@postgresql.org	2022-02-23 11:10:46 -05:00
Daniel Gustafsson	2313a3ee22	Fix statenames in mergejoin comments The names in the comments were on a few states not consistent with the documented state. Author: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/CALNJ-vQVthfQXVqmrHR8BKHtC4fMGbhM1xbvJNJAPexTq_dH=w@mail.gmail.com	2022-02-23 10:54:03 +01:00

1 2 3 4 5 ...

22859 Commits