postgresql

Commit Graph

Author	SHA1	Message	Date
Andrew Dunstan	b6fb534f10	Add IF NOT EXISTS for CREATE SERVER and CREATE USER MAPPING There is still some inconsistency with the error messages surrounding foreign servers. Some use the word "foreign" and some don't. My inclination is to remove all such uses of "foreign" on the basis that the CREATE/ALTER/DROP SERVER commands don't use the word. However, that is left for another day. In this patch I have kept to the existing usage in the affected commands, which omits "foreign". Anastasia Lubennikova, reviewed by Arthur Zakirov and Ashtosh Bapat. Discussion: http://postgr.es/m/7c2ab9b8-388a-1ce0-23a3-7acf2a0ed3c6@postgrespro.ru	2017-03-20 16:40:45 -04:00
Andrew Dunstan	839cb0649a	Use a consistent error message style for user mappings. User mappings are essentially anonymous, so messages referring to "user mapping foo on server bar" are wrong, and inconsistent with other error messages referring to user mappings. To be consistent with existing use, use "user mapping for foo on server bar" instead. I dropped the noise word "user" from the original suggestion to be consistent with other uses. Discussion: http://postgr.es/m/56c6f8ab-b2d6-f1fa-deb0-1d18cf67f7b9@2ndQuadrant.com	2017-03-20 16:01:45 -04:00
Tom Lane	e3044f6184	Avoid use of already-closed relcache entry. Oversight in commit `17f8ffa1e`. Per buildfarm member prion.	2017-03-18 18:43:06 -04:00
Tom Lane	17f8ffa1e3	Fix REFRESH MATERIALIZED VIEW to report activity to the stats collector. The non-concurrent code path for REFRESH MATERIALIZED VIEW failed to report its updates to the stats collector. This is bad since it means auto-analyze doesn't know there's any work to be done. Adjust it to report the refresh as a table truncate followed by insertion of an appropriate number of rows. Since a matview could contain more than INT_MAX rows, change the signature of pgstat_count_heap_insert() to accept an int64 rowcount. (The accumulator it's adding into is already int64, but existing callers could not insert more than a small number of rows at once, so the argument had been declared just "int n".) This is surely a bug fix, but changing pgstat_count_heap_insert()'s API seems too risky for the back branches. Given the lack of previous complaints, I'm not sure it's a big enough problem to justify a kluge solution that would avoid that. So, no back-patch, at least for now. Jim Mlodgenski, adjusted a bit by me Discussion: https://postgr.es/m/CAB_5SRchSz7-WmdO5szdiknG8Oj_GGqJytrk1KRd11yhcMs1KQ@mail.gmail.com	2017-03-18 17:49:39 -04:00
Robert Haas	88e66d193f	Rename "pg_clog" directory to "pg_xact". Names containing the letters "log" sometimes confuse users into believing that only non-critical data is present. It is hoped this renaming will discourage ill-considered removals of transaction status data. Michael Paquier Discussion: http://postgr.es/m/CA+Tgmoa9xFQyjRZupbdEFuwUerFTvC6HjZq1ud6GYragGDFFgA@mail.gmail.com	2017-03-17 09:48:38 -04:00
Andrew Gierth	1914c5ea7d	Avoid having vacuum set reltuples to 0 on non-empty relations in the presence of page pins, which leads to serious estimation errors in the planner. This particularly affects small heavily-accessed tables, especially where locking (e.g. from FK constraints) forces frequent vacuums for mxid cleanup. Fix by keeping separate track of pages whose live tuples were actually counted vs. pages that were only scanned for freezing purposes. Thus, reltuples can only be set to 0 if all pages of the relation were actually counted. Backpatch to all supported versions. Per bug #14057 from Nicolas Baccelli, analyzed by me. Discussion: https://postgr.es/m/20160331103739.8956.94469@wrigleys.postgresql.org	2017-03-16 22:28:03 +00:00
Peter Eisentraut	eb4da3e380	Add option to control snapshot export to CREATE_REPLICATION_SLOT We used to export snapshots unconditionally in CREATE_REPLICATION_SLOT in the replication protocol, but several upcoming patches want more control over what happens. Suppress snapshot export in pg_recvlogical, which neither needs nor can use the exported snapshot. Since snapshot exporting can fail this improves reliability. This also paves the way for allowing the creation of replication slots on standbys, which cannot export snapshots because they cannot allocate new XIDs. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-03-14 17:34:22 -04:00
Robert Haas	c11453ce0a	hash: Add write-ahead logging support. The warning about hash indexes not being write-ahead logged and their use being discouraged has been removed. "snapshot too old" is now supported for tables with hash indexes. Most importantly, barring bugs, hash indexes will now be crash-safe and usable on standbys. This commit doesn't yet add WAL consistency checking for hash indexes, as we now have for other index types; a separate patch has been submitted to cure that lack. Amit Kapila, reviewed and slightly modified by me. The larger patch series of which this is a part has been reviewed and tested by Álvaro Herrera, Ashutosh Sharma, Mark Kirkwood, Jeff Janes, and Jesper Pedersen. Discussion: http://postgr.es/m/CAA4eK1JOBX=YU33631Qh-XivYXtPSALh514+jR8XeD7v+K3r_Q@mail.gmail.com	2017-03-14 13:27:02 -04:00
Peter Eisentraut	f97a028d8e	Spelling fixes in code comments From: Josh Soref <jsoref@gmail.com>	2017-03-14 12:58:39 -04:00
Tom Lane	5ed6fff6b7	Make logging about multixact wraparound protection less chatty. The original messaging design, introduced in commit `068cfadf9`, seems too chatty now that some time has elapsed since the bug fix; most installations will be in good shape and don't really need a reminder about this on every postmaster start. Hence, arrange to suppress the "wraparound protections are now enabled" message during startup (specifically, during the TrimMultiXact() call). The message will still appear if protection becomes effective at some later point. Discussion: https://postgr.es/m/17211.1489189214@sss.pgh.pa.us	2017-03-14 12:47:53 -04:00
Noah Misch	3a0d473192	Use wrappers of PG_DETOAST_DATUM_PACKED() more. This makes almost all core code follow the policy introduced in the previous commit. Specific decisions: - Text search support functions with char* and length arguments, such as prsstart and lexize, may receive unaligned strings. I doubt maintainers of non-core text search code will notice. - Use plain VARDATA() on values detoasted or synthesized earlier in the same function. Use VARDATA_ANY() on varlenas sourced outside the function, even if they happen to always have four-byte headers. As an exception, retain the universal practice of using VARDATA() on return values of SendFunctionCall(). - Retain PG_GETARG_BYTEA_P() in pageinspect. (Page images are too large for a one-byte header, so this misses no optimization.) Sites that do not call get_page_from_raw() typically need the four-byte alignment. - For now, do not change btree_gist. Its use of four-byte headers in memory is partly entangled with storage of 4-byte headers inside GBT_VARKEY, on disk. - For now, do not change gtrgm_consistent() or gtrgm_distance(). They incorporate the varlena header into a cache, and there are multiple credible implementation strategies to consider.	2017-03-12 19:35:34 -04:00
Peter Eisentraut	cd603a4d6b	Use SQL standard error code for nextval	2017-03-09 10:56:44 -05:00
Robert Haas	355d3993c5	Add a Gather Merge executor node. Like Gather, we spawn multiple workers and run the same plan in each one; however, Gather Merge is used when each worker produces the same output ordering and we want to preserve that output ordering while merging together the streams of tuples from various workers. (In a way, Gather Merge is like a hybrid of Gather and MergeAppend.) This works out to a win if it saves us from having to perform an expensive Sort. In cases where only a small amount of data would need to be sorted, it may actually be faster to use a regular Gather node and then sort the results afterward, because Gather Merge sometimes needs to wait synchronously for tuples whereas a pure Gather generally doesn't. But if this avoids an expensive sort then it's a win. Rushabh Lathia, reviewed and tested by Amit Kapila, Thomas Munro, and Neha Sharma, and reviewed and revised by me. Discussion: http://postgr.es/m/CAGPqQf09oPX-cQRpBKS0Gq49Z+m6KBxgxd_p9gX8CKk_d75HoQ@mail.gmail.com	2017-03-09 07:49:29 -05:00
Stephen Frost	f9b1a0dd40	Expose explain's SUMMARY option This exposes the existing explain summary option to users to allow them to choose if they wish to have the planning time and totalled run time included in the EXPLAIN result. The existing default behavior is retained if SUMMARY is not specified- running explain without analyze will not print the summary lines (just the planning time, currently) while running explain with analyze will include the summary lines (both the planning time and the totalled execution time). Users who wish to see the summary information for plain explain can now use: EXPLAIN (SUMMARY ON) query; Users who do not want to have the summary printed for an analyze run can use: EXPLAIN (ANALYZE ON, SUMMARY OFF) query; With this, we can now also have EXPLAIN ANALYZE queries included in our regression tests by using: EXPLAIN (ANALYZE ON, TIMING OFF, SUMMARY off) query; I went ahead and added an example of this, which will hopefully not make the buildfarm complain. Author: Ashutosh Bapat Discussion: https://postgr.es/m/CAFjFpReE5z2h98U2Vuia8hcEkpRRwrauRjHmyE44hNv8-xk+XA@mail.gmail.com	2017-03-08 15:14:03 -05:00
Fujii Masao	4eafdcc276	Prevent logical rep workers with removed subscriptions from starting. Any logical rep workers must have their subscription entries in pg_subscription. To ensure this, we need to prevent the launcher from starting new worker corresponding to the subscription that DROP SUBSCRIPTION command is removing. To implement this, previously LogicalRepLauncherLock was introduced and held until the end of transaction running DROP SUBSCRIPTION. But using LWLock for that purpose was not valid. Instead, this commit changes DROP SUBSCRIPTION so that it takes AccessExclusiveLock on pg_subscription, in order to ensure that the launcher cannot see any subscriptions being removed. Also this commit gets rid of LogicalRepLauncherLock. Patch by me, reviewed by Petr Jelinek Discussion: https://www.postgresql.org/message-id/CAHGQGwHPi8ky-yANFfe0sgmhKtsYcQLTnKx07bW9S7-Rn1746w@mail.gmail.com	2017-03-09 01:44:23 +09:00
Alvaro Herrera	fcec6caafa	Support XMLTABLE query expression XMLTABLE is defined by the SQL/XML standard as a feature that allows turning XML-formatted data into relational form, so that it can be used as a <table primary> in the FROM clause of a query. This new construct provides significant simplicity and performance benefit for XML data processing; what in a client-side custom implementation was reported to take 20 minutes can be executed in 400ms using XMLTABLE. (The same functionality was said to take 10 seconds using nested PostgreSQL XPath function calls, and 5 seconds using XMLReader under PL/Python). The implemented syntax deviates slightly from what the standard requires. First, the standard indicates that the PASSING clause is optional and that multiple XML input documents may be given to it; we make it mandatory and accept a single document only. Second, we don't currently support a default namespace to be specified. This implementation relies on a new executor node based on a hardcoded method table. (Because the grammar is fixed, there is no extensibility in the current approach; further constructs can be implemented on top of this such as JSON_TABLE, but they require changes to core code.) Author: Pavel Stehule, Álvaro Herrera Extensively reviewed by: Craig Ringer Discussion: https://postgr.es/m/CAFj8pRAgfzMD-LoSmnMGybD0WsEznLHWap8DO79+-GTRAPR4qA@mail.gmail.com	2017-03-08 12:40:26 -03:00
Fujii Masao	77d21970ae	Fix connection leak in DROP SUBSCRIPTION command, take 2. Commit `898a792eb8` fixed the connection leak issue, but it was an unreliable way of bugfix. This bugfix was assuming that walrcv_command() subroutine cannot throw an error, but it's untenable assumption. For example, if it will be changed so that an error is thrown, connection leak issue will happen again. This patch ensures that the connection is closed even when walrcv_command() subroutine throws an error. Patch by me, reviewed by Petr Jelinek and Michael Paquier Discussion: https://www.postgresql.org/message-id/2058.1487704345@sss.pgh.pa.us	2017-03-08 23:43:38 +09:00
Robert Haas	d88d06cd07	Fix relcache reference leak. Reported by Kevin Grittner. Faulty commit identified by Tom Lane. Patch by Amit Langote, reviewed by Michael Paquier. Discussion: http://postgr.es/m/CACjxUsOHbH1=99u8mGxmLHfy5hov4ENEpvM6=3ARjos7wG7rtQ@mail.gmail.com	2017-03-07 11:27:21 -05:00
Heikki Linnakangas	818fd4a67d	Support SCRAM-SHA-256 authentication (RFC 5802 and 7677). This introduces a new generic SASL authentication method, similar to the GSS and SSPI methods. The server first tells the client which SASL authentication mechanism to use, and then the mechanism-specific SASL messages are exchanged in AuthenticationSASLcontinue and PasswordMessage messages. Only SCRAM-SHA-256 is supported at the moment, but this allows adding more SASL mechanisms in the future, without changing the overall protocol. Support for channel binding, aka SCRAM-SHA-256-PLUS is left for later. The SASLPrep algorithm, for pre-processing the password, is not yet implemented. That could cause trouble, if you use a password with non-ASCII characters, and a client library that does implement SASLprep. That will hopefully be added later. Authorization identities, as specified in the SCRAM-SHA-256 specification, are ignored. SET SESSION AUTHORIZATION provides more or less the same functionality, anyway. If a user doesn't exist, perform a "mock" authentication, by constructing an authentic-looking challenge on the fly. The challenge is derived from a new system-wide random value, "mock authentication nonce", which is created at initdb, and stored in the control file. We go through these motions, in order to not give away the information on whether the user exists, to unauthenticated users. Bumps PG_CONTROL_VERSION, because of the new field in control file. Patch by Michael Paquier and Heikki Linnakangas, reviewed at different stages by Robert Haas, Stephen Frost, David Steele, Aleksander Alekseev, and many others. Discussion: https://www.postgresql.org/message-id/CAB7nPqRbR3GmFYdedCAhzukfKrgBLTLtMvENOmPrVWREsZkF8g%40mail.gmail.com Discussion: https://www.postgresql.org/message-id/CAB7nPqSMXU35g%3DW9X74HVeQp0uvgJxvYOuA4A-A3M%2B0wfEBv-w%40mail.gmail.com Discussion: https://www.postgresql.org/message-id/55192AFE.6080106@iki.fi	2017-03-07 14:25:40 +02:00
Tom Lane	a8df75b0a4	Avoid dangling pointer to relation name in RLS code path in DoCopy(). With RLS active, "COPY tab TO ..." failed under -DRELCACHE_FORCE_RELEASE, and would sometimes fail without that, because it used the relation name directly from the relcache as part of the parsetree it's building. That becomes a potentially-dangling pointer as soon as the relcache entry is closed, a bit further down. Typical symptom if the relcache entry chanced to get cleared would be "relation does not exist" error with a garbage relation name, or possibly a core dump; but if you were really truly unlucky, the COPY might copy from the wrong table. Per report from Andrew Dunstan that regression tests fail with -DRELCACHE_FORCE_RELEASE. The core tests now pass for me (but have not tried "make check-world" yet). Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com	2017-03-06 16:50:47 -05:00
Peter Eisentraut	2ca64c6f71	Replace LookupFuncNameTypeNames() with LookupFuncWithArgs() The old function took function name and function argument list as separate arguments. Now that all function signatures are passed around as ObjectWithArgs structs, this is no longer necessary and can be replaced by a function that takes ObjectWithArgs directly. Similarly for aggregates and operators. Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-03-06 13:31:47 -05:00
Peter Eisentraut	8b6d6cf853	Remove objname/objargs split for referring to objects In simpler times, it might have worked to refer to all kinds of objects by a list of name components and an optional argument list. But this doesn't work for all objects, which has resulted in a collection of hacks to place various other nodes types into these fields, which have to be unpacked at the other end. This makes it also weird to represent lists of such things in the grammar, because they would have to be lists of singleton lists, to make the unpacking work consistently. The other problem is that keeping separate name and args fields makes it awkward to deal with lists of functions. Change that by dropping the objargs field and have objname, renamed to object, be a generic Node, which can then be flexibly assigned and managed using the normal Node mechanisms. In many cases it will still be a List of names, in some cases it will be a string Value, for types it will be the existing Typename, for functions it will now use the existing ObjectWithArgs node type. Some of the more obscure object types still use somewhat arbitrary nested lists. Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-03-06 13:31:47 -05:00
Peter Eisentraut	550214a4ef	Add operator_with_argtypes grammar rule This makes the handling of operators similar to that of functions and aggregates. Rename node FuncWithArgs to ObjectWithArgs, to reflect the expanded use. Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-03-06 13:31:47 -05:00
Peter Eisentraut	63ebd377a6	Use class_args field in opclass_drop This makes it consistent with the usage in opclass_item. Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-03-06 13:31:47 -05:00
Simon Riggs	8b4d582d27	Allow partitioned tables to be dropped without CASCADE Record partitioned table dependencies as DEPENDENCY_AUTO rather than DEPENDENCY_NORMAL, so that DROP TABLE just works. Remove all the tests for partitioned tables where earlier work had deliberately avoided using CASCADE. Amit Langote, reviewed by Ashutosh Bapat and myself	2017-03-06 15:50:53 +05:30
Tom Lane	dbca84f04e	In rebuild_relation(), don't access an already-closed relcache entry. This reliably fails with -DRELCACHE_FORCE_RELEASE, as reported by Andrew Dunstan, and could sometimes fail in normal operation, resulting in a wrong persistence value being used for the transient table. It's not immediately clear to me what effects that might have beyond the risk of a crash while accessing OldHeap->rd_rel->relpersistence, but it's probably not good. Bug introduced by commit `f41872d0c`, and made substantially worse by commit `85b506bbf`, which added a second such access significantly later than the heap_close. I doubt the first reference could fail in a production scenario, but the second one definitely could. Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com	2017-03-04 16:09:33 -05:00
Peter Eisentraut	272adf4f9c	Disallow CREATE/DROP SUBSCRIPTION in transaction block Disallow CREATE SUBSCRIPTION and DROP SUBSCRIPTION in a transaction block when the replication slot is to be created or dropped, since that cannot be rolled back. based on patch by Masahiko Sawada <sawada.mshk@gmail.com>	2017-03-03 23:29:13 -05:00
Peter Eisentraut	2357c12b49	Fix typo	2017-03-03 18:21:06 -05:00
Peter Eisentraut	6da9759a03	Add RENAME support for PUBLICATIONs and SUBSCRIPTIONs From: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-03-03 10:47:04 -05:00
Simon Riggs	9eb344faf5	Allow vacuums to report oldestxmin Allow VACUUM and Autovacuum to report the oldestxmin value they used while cleaning tables, helping to make better sense out of the other statistics we report in various cases.	2017-03-03 19:18:25 +05:30
Robert Haas	3c3bb99330	Don't uselessly rewrite, truncate, VACUUM, or ANALYZE partitioned tables. Also, recursively perform VACUUM and ANALYZE on partitions when the command is applied to a partitioned table. In passing, some related documentation updates. Amit Langote, reviewed by Michael Paquier, Ashutosh Bapat, and by me. Discussion: http://postgr.es/m/47288cf1-f72c-dfc2-5ff0-4af962ae5c1b@lab.ntt.co.jp	2017-03-02 17:23:44 +05:30
Tom Lane	9e3755ecb2	Remove useless duplicate inclusions of system header files. c.h #includes a number of core libc header files, such as <stdio.h>. There's no point in re-including these after having read postgres.h, postgres_fe.h, or c.h; so remove code that did so. While at it, also fix some places that were ignoring our standard pattern of "include postgres[_fe].h, then system header files, then other Postgres header files". While there's not any great magic in doing it that way rather than system headers last, it's silly to have just a few files deviating from the general pattern. (But I didn't attempt to enforce this globally, only in files I was touching anyway.) I'd be the first to say that this is mostly compulsive neatnik-ism, but over time it might save enough compile cycles to be useful.	2017-02-25 16:12:55 -05:00
Tom Lane	c29aff959d	Consistently declare timestamp variables as TimestampTz. Twiddle the replication-related code so that its timestamp variables are declared TimestampTz, rather than the uninformative "int64" that was previously used for meant-to-be-always-integer timestamps. This resolves the int64-vs-TimestampTz declaration inconsistencies introduced by commit `7c030783a`, though in the opposite direction to what was originally suggested. This required including datatype/timestamp.h in a couple more places than before. I decided it would be a good idea to slim down that header by not having it pull in <float.h> etc, as those headers are no longer at all relevant to its purpose. Unsurprisingly, a small number of .c files turn out to have been depending on those inclusions, so add them back in the .c files as needed. Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us Discussion: https://postgr.es/m/27694.1487456324@sss.pgh.pa.us	2017-02-23 15:57:08 -05:00
Tom Lane	b9d092c962	Remove now-dead code for !HAVE_INT64_TIMESTAMP. This is a basically mechanical removal of #ifdef HAVE_INT64_TIMESTAMP tests and the negative-case controlled code. Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us	2017-02-23 14:04:43 -05:00
Peter Eisentraut	74321d87fb	Fix whitespace	2017-02-21 15:44:07 -05:00
Fujii Masao	898a792eb8	Fix connection leak in DROP SUBSCRIPTION command. Previously the command forgot to close the connection to the publisher when it failed to drop the replication slot.	2017-02-22 03:36:02 +09:00
Peter Eisentraut	38d103763d	Make more use of castNode()	2017-02-21 11:59:09 -05:00
Robert Haas	a3dc8e495b	Make partitions automatically inherit OIDs. Previously, if the parent was specified as WITH OIDS, each child also had to be explicitly specified as WITH OIDS. Amit Langote, per a report from Simon Riggs. Some additional work on the documentation changes by me. Discussion: http://postgr.es/m/CANP8+jJBpWocfKrbJcaf3iBt9E3U=WPE_NC8YE6rye+YJ1sYnQ@mail.gmail.com	2017-02-19 21:29:27 +05:30
Peter Eisentraut	6d16ecc646	Add CREATE COLLATION IF NOT EXISTS clause The core of the functionality was already implemented when pg_import_system_collations was added. This just exposes it as an option in the SQL command.	2017-02-15 10:01:28 -05:00
Robert Haas	e28b115612	Don't disallow dropping NOT NULL for a list partition key. Range partitioning doesn't support nulls in the partitioning columns, but list partitioning does. Amit Langote, per a complaint from Amul Sul	2017-02-14 12:13:41 -05:00
Noah Misch	f30f34e589	Ignore tablespace ACLs when ignoring schema ACLs. The ALTER TABLE ALTER TYPE implementation can issue DROP INDEX and CREATE INDEX to refit existing indexes for the new column type. Since this CREATE INDEX is an implementation detail of an index alteration, the ensuing DefineIndex() should skip ACL checks specific to index creation. It already skips the namespace ACL check. Make it skip the tablespace ACL check, too. Back-patch to 9.2 (all supported versions). Reviewed by Tom Lane.	2017-02-12 16:03:41 -05:00
Peter Eisentraut	2ea5b06c7a	Add CREATE SEQUENCE AS <data type> clause This stores a data type, required to be an integer type, with the sequence. The sequences min and max values default to the range supported by the type, and they cannot be set to values exceeding that range. The internal implementation of the sequence is not affected. Change the serial types to create sequences of the appropriate type. This makes sure that the min and max values of the sequence for a serial column match the range of values supported by the table column. So the sequence can no longer overflow the table column. This also makes monitoring for sequence exhaustion/wraparound easier, which currently requires various contortions to cross-reference the sequences with the table columns they are used with. This commit also effectively reverts the pg_sequence column reordering in `f3b421da5f`, because the new seqtypid column allows us to fill the hole in the struct and create a more natural overall column ordering. Reviewed-by: Steve Singer <steve@ssinger.info> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-02-10 15:34:35 -05:00
Tom Lane	86d911ec0f	Allow index AMs to cache data across aminsert calls within a SQL command. It's always been possible for index AMs to cache data across successive amgettuple calls within a single SQL command: the IndexScanDesc.opaque field is meant for precisely that. However, no comparable facility exists for amortizing setup work across successive aminsert calls. This patch adds such a feature and teaches GIN, GIST, and BRIN to use it to amortize catalog lookups they'd previously been doing on every call. (The other standard index AMs keep everything they need in the relcache, so there's little to improve there.) For GIN, the overall improvement in a statement that inserts many rows can be as much as 10%, though it seems a bit less for the other two. In addition, this makes a really significant difference in runtime for CLOBBER_CACHE_ALWAYS tests, since in those builds the repeated catalog lookups are vastly more expensive. The reason this has been hard up to now is that the aminsert function is not passed any useful place to cache per-statement data. What I chose to do is to add suitable fields to struct IndexInfo and pass that to aminsert. That's not widening the index AM API very much because IndexInfo is already within the ken of ambuild; in fact, by passing the same info to aminsert as to ambuild, this is really removing an inconsistency in the AM API. Discussion: https://postgr.es/m/27568.1486508680@sss.pgh.pa.us	2017-02-09 11:52:12 -05:00
Robert Haas	a507b86900	Add WAL consistency checking facility. When the new GUC wal_consistency_checking is set to a non-empty value, it triggers recording of additional full-page images, which are compared on the standby against the results of applying the WAL record (without regard to those full-page images). Allowable differences such as hints are masked out, and the resulting pages are compared; any difference results in a FATAL error on the standby. Kuntal Ghosh, based on earlier patches by Michael Paquier and Heikki Linnakangas. Extensively reviewed and revised by Michael Paquier and by me, with additional reviews and comments from Amit Kapila, Álvaro Herrera, Simon Riggs, and Peter Eisentraut.	2017-02-08 15:45:30 -05:00
Heikki Linnakangas	181bdb90ba	Fix typos in comments. Backpatch to all supported versions, where applicable, to make backpatching of future fixes go more smoothly. Josh Soref Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com	2017-02-06 11:33:58 +02:00
Fujii Masao	39b8cc991f	Be sure to release LogicalRepLauncherLock in DROP SUBSCRIPTION command. Previously DROP SUBSCRIPTION command forgot to release the lock at all. Original patches by Kyotaro Horiguchi and Michael Paquier, but I didn't use them. Discussion: http://postgr.es/m/20170201.173623.66249355.horiguchi.kyotaro@lab.ntt.co.jp	2017-02-04 03:18:13 +09:00
Tom Lane	aedd554f84	Fix CatalogTupleInsert/Update abstraction for case of shared indstate. Add CatalogTupleInsertWithInfo and CatalogTupleUpdateWithInfo to let callers use the CatalogTupleXXX abstraction layer even in cases where we want to share the results of CatalogOpenIndexes across multiple inserts/updates for efficiency. This finishes the job begun in commit `2f5c9d9c9`, by allowing some remaining simple_heap_insert/update calls to be replaced. The abstraction layer is now complete enough that we don't have to export CatalogIndexInsert at all anymore. Also, this fixes several places in which `2f5c9d9c9` introduced performance regressions by using retail CatalogTupleInsert or CatalogTupleUpdate even though the previous coding had been able to amortize CatalogOpenIndexes work across multiple tuples. A possible future improvement is to arrange for the indexing.c functions to cache the CatalogIndexState somewhere, maybe in the relcache, in which case we could get rid of CatalogTupleInsertWithInfo and CatalogTupleUpdateWithInfo again. But that's a task for another day. Discussion: https://postgr.es/m/27502.1485981379@sss.pgh.pa.us	2017-02-01 17:18:36 -05:00
Tom Lane	ab02896510	Provide CatalogTupleDelete() as a wrapper around simple_heap_delete(). This extends the work done in commit `2f5c9d9c9` to provide a more nearly complete abstraction layer hiding the details of index updating for catalog changes. That commit only invented abstractions for catalog inserts and updates, leaving nearby code for catalog deletes still calling the heap-level routines directly. That seems rather ugly from here, and it does little to help if we ever want to shift to a storage system in which indexing work is needed at delete time. Hence, create a wrapper function CatalogTupleDelete(), and replace calls of simple_heap_delete() on catalog tuples with it. There are now very few direct calls of [simple_]heap_delete remaining in the tree. Discussion: https://postgr.es/m/462.1485902736@sss.pgh.pa.us	2017-02-01 16:13:30 -05:00
Heikki Linnakangas	dbd69118c0	Replace isMD5() with a more future-proof way to check if pw is encrypted. The rule is that if pg_authid.rolpassword begins with "md5" and has the right length, it's an MD5 hash, otherwise it's a plaintext password. The idiom has been to use isMD5() to check for that, but that gets awkward, when we add new kinds of verifiers, like the verifiers for SCRAM authentication in the pending SCRAM patch set. Replace isMD5() with a new get_password_type() function, so that when new verifier types are added, we don't need to remember to modify every place that currently calls isMD5(), to also recognize the new kinds of verifiers. Also, use the new plain_crypt_verify function in passwordcheck, so that it doesn't need to know about MD5, or in the future, about other kinds of hashes or password verifiers. Reviewed by Michael Paquier and Peter Eisentraut. Discussion: https://www.postgresql.org/message-id/2d07165c-1793-e243-a2a9-e45b624c7580@iki.fi	2017-02-01 13:11:37 +02:00
Alvaro Herrera	2f5c9d9c9c	Tweak catalog indexing abstraction for upcoming WARM Split the existing CatalogUpdateIndexes into two different routines, CatalogTupleInsert and CatalogTupleUpdate, which do both the heap insert/update plus the index update. This removes over 300 lines of boilerplate code all over src/backend/catalog/ and src/backend/commands. The resulting code is much more pleasing to the eye. Also, by encapsulating what happens in detail during an UPDATE, this facilitates the upcoming WARM patch, which is going to add a few more lines to the update case making the boilerplate even more boring. The original CatalogUpdateIndexes is removed; there was only one use left, and since it's just three lines, we can as well expand it in place there. We could keep it, but WARM is going to break all the UPDATE out-of-core callsites anyway, so there seems to be no benefit in doing so. Author: Pavan Deolasee Discussion: https://www.postgr.es/m/CABOikdOcFYSZ4vA2gYfs=M2cdXzXX4qGHeEiW3fu9PCfkHLa2A@mail.gmail.com	2017-01-31 18:42:24 -03:00
Stephen Frost	e54f75722c	Handle ALTER EXTENSION ADD/DROP with pg_init_privs In commit `6c268df`, pg_init_privs was added to track the initial privileges of catalog objects and extensions. Unfortunately, that commit didn't include understanding of ALTER EXTENSION ADD/DROP, which allows the objects associated with an extension to be changed after the initial CREATE EXTENSION script has been run. The result of this meant that ACLs for objects added through ALTER EXTENSION ADD were not recorded into pg_init_privs and we would end up including those ACLs in pg_dump when we shouldn't have. This commit corrects that by making sure to have pg_init_privs updated when ALTER EXTENSION ADD/DROP is run, recording the permissions as they are at ALTER EXTENSION ADD time, and removing any if/when ALTER EXTENSION DROP is called. This issue was pointed out by Moshe Jacobson as commentary on bug #14456 (which was actually a bug about versions prior to 9.6 not handling custom ACLs on extensions correctly, an issue now addressed with pg_init_privs in 9.6). Back-patch to 9.6 where pg_init_privs was introduced.	2017-01-29 23:05:07 -05:00
Tom Lane	7afd56c3c6	Use castNode() in a bunch of statement-list-related code. When I wrote commit `ab1f0c822`, I really missed the castNode() macro that Peter E. had proposed shortly before. This back-fills the uses I would have put it to. It's probably not all that significant, but there are more assertions here than there were before, and conceivably they will help catch any bugs associated with those representation changes. I left behind a number of usages like "(Query *) copyObject(query_var)". Those could have been converted as well, but Peter has proposed another notational improvement that would handle copyObject cases automatically, so I let that be for now.	2017-01-26 22:09:34 -05:00
Andres Freund	9ba8a9ce45	Use the new castNode() macro in a number of places. This is far from a pervasive conversion, but it's a good starting point. Author: Peter Eisentraut, with some minor changes by me Reviewed-By: Tom Lane Discussion: https://postgr.es/m/c5d387d9-3440-f5e0-f9d4-71d53b9fbe52@2ndquadrant.com	2017-01-26 16:47:03 -08:00
Peter Eisentraut	3d9e73ea5f	Update copyright years in some recently added files	2017-01-25 12:32:05 -05:00
Peter Eisentraut	65df150a18	Close replication connection when slot creation errors From: Petr Jelinek <pjmodos@pjmodos.net>	2017-01-25 10:47:53 -05:00
Robert Haas	27cdb3414b	Reindent table partitioning code. We've accumulated quite a bit of stuff with which pgindent is not quite happy in this code; clean it up to provide a less-annoying base for future pgindent runs.	2017-01-24 10:20:02 -05:00
Robert Haas	b1ecb9b3fc	Fix interaction of partitioned tables with BulkInsertState. When copying into a partitioned table, the target heap may change from one tuple to next. We must ask ReadBufferBI() to get a new buffer every time such change occurs. To do that, use new function ReleaseBulkInsertStatePin(). This fixes the bug that tuples ended up being inserted into the wrong partition, which occurred exactly because the wrong buffer was used. Amit Langote, per a suggestion from Robert Haas. Some cosmetic adjustments by me. Reports by 高增琦 (Gao Zengqi), Venkata B Nagothi, and Ragnar Ouchterlony. Discussion: http://postgr.es/m/CAFmBtr32FDOqofo8yG-4mjzL1HnYHxXK5S9OGFJ%3D%3DcJpgEW4vA%40mail.gmail.com Discussion: http://postgr.es/m/CAEyp7J9WiX0L3DoiNcRrY-9iyw%3DqP%2Bj%3DDLsAnNFF1xT2J1ggfQ%40mail.gmail.com Discussion: http://postgr.es/m/16d73804-c9cd-14c5-463e-5caad563ff77%40agama.tv Discussion: http://postgr.es/m/CA+TgmoaiZpDVUUN8LZ4jv1qFE_QyR+H9ec+79f5vNczYarg5Zg@mail.gmail.com	2017-01-24 08:50:16 -05:00
Peter Eisentraut	0bc1207aeb	Fix default minimum value for descending sequences For some reason that is lost in history, a descending sequence would default its minimum value to -2^63+1 (-PG_INT64_MAX) instead of -2^63 (PG_INT64_MIN), even though explicitly specifying a minimum value of -2^63 would work. Fix this inconsistency by using the full range by default. Reported-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-01-23 14:00:58 -05:00
Peter Eisentraut	46d482814c	Don't error when no system locales were found initdb used to warn about that, but it was changed to an error in pg_import_system_locales, but some build farm members failed because of that. Change it back to a warning.	2017-01-23 13:45:32 -05:00
Alvaro Herrera	7e26e02eec	Prefetch blocks during lazy vacuum's truncation scan Vacuum truncation scan can be sped up on rotating media by prefetching blocks in forward direction. That makes the blocks already present in memory by the time they are needed, while also letting OS read-ahead kick in. The truncate scan has been measured to be five times faster than without this patch (that was on a slow disk, but it shouldn't hurt on fast disks.) Author: Álvaro Herrera, loosely based on a submission by Claudio Freire Discussion: https://postgr.es/m/CAGTBQpa6NFGO_6g_y_7zQx8L9GcHDSQKYdo1tGuh791z6PYgEg@mail.gmail.com	2017-01-23 12:55:18 -03:00
Peter Eisentraut	f21a563d25	Move some things from builtins.h to new header files This avoids that builtins.h has to include additional header files.	2017-01-20 20:29:53 -05:00
Alvaro Herrera	50cf1c80e6	Record dependencies on owners for logical replication objects This was forgotten in `665d1fad99` and caused the whole buildfarm to become red for a little while. Author: Petr Jelínek Also fix a typo in a nearby error message.	2017-01-20 16:45:02 -03:00
Peter Eisentraut	665d1fad99	Logical replication - Add PUBLICATION catalogs and DDL - Add SUBSCRIPTION catalog and DDL - Define logical replication protocol and output plugin - Add logical replication workers From: Petr Jelinek <petr@2ndquadrant.com> Reviewed-by: Steve Singer <steve@ssinger.info> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Erik Rijkers <er@xs4all.nl> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>	2017-01-20 09:04:49 -05:00
Andres Freund	ea15e18677	Remove obsoleted code relating to targetlist SRF evaluation. Since `69f4b9c` plain expression evaluation (and thus normal projection) can't return sets of tuples anymore. Thus remove code dealing with that possibility. This will require adjustments in external code using ExecEvalExpr()/ExecProject() - that should neither be hard nor very common. Author: Andres Freund and Tom Lane Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de	2017-01-19 14:40:41 -08:00
Alvaro Herrera	8eace46d34	Fix race condition in reading commit timestamps If a user requests the commit timestamp for a transaction old enough that its data is concurrently being truncated away by vacuum at just the right time, they would receive an ugly internal file-not-found error message from slru.c rather than the expected NULL return value. In a primary server, the window for the race is very small: the lookup has to occur exactly between the two calls by vacuum, and there's not a lot that happens between them (mostly just a multixact truncate). In a standby server, however, the window is larger because the truncation is executed as soon as the WAL record for it is replayed, but the advance of the oldest-Xid is not executed until the next checkpoint record. To fix in the primary, simply reverse the order of operations in vac_truncate_clog. To fix in the standby, augment the WAL truncation record so that the standby is aware of the new oldest-XID value and can apply the update immediately. WAL version bumped because of this. No backpatch, because of the low importance of the bug and its rarity. Author: Craig Ringer Reviewed-By: Petr Jelínek, Peter Eisentraut Discussion: https://postgr.es/m/CAMsr+YFhVtRQT1VAwC+WGbbxZZRzNou=N9Ed-FrCqkwQ8H8oJQ@mail.gmail.com	2017-01-19 18:24:17 -03:00
Robert Haas	05bd889904	Fix RETURNING to work correctly with partition tuple routing. In ExecInsert(), do not switch back to the root partitioned table ResultRelInfo until after we finish ExecProcessReturning(), so that RETURNING projection is done using the partition's descriptor. For the projection to work correctly, we must initialize the same for each leaf partition during ModifyTableState initialization. Amit Langote	2017-01-19 13:20:11 -05:00
Robert Haas	39162b2030	Fix failure to enforce partitioning contraint for internal partitions. When a tuple is inherited into a partitioning root, no partition constraints need to be enforced; when it is inserted into a leaf, the parent's partitioning quals needed to be enforced. The previous coding got both of those cases right. When a tuple is inserted into an intermediate level of the partitioning hierarchy (i.e. a table which is both a partition itself and in turn partitioned), it must enforce the partitioning qual inherited from its parent. That case got overlooked; repair. Amit Langote	2017-01-19 12:30:27 -05:00
Andres Freund	69f4b9c85f	Move targetlist SRF handling from expression evaluation to new executor node. Evaluation of set returning functions (SRFs_ in the targetlist (like SELECT generate_series(1,5)) so far was done in the expression evaluation (i.e. ExecEvalExpr()) and projection (i.e. ExecProject/ExecTargetList) code. This meant that most executor nodes performing projection, and most expression evaluation functions, had to deal with the possibility that an evaluated expression could return a set of return values. That's bad because it leads to repeated code in a lot of places. It also, and that's my (Andres's) motivation, made it a lot harder to implement a more efficient way of doing expression evaluation. To fix this, introduce a new executor node (ProjectSet) that can evaluate targetlists containing one or more SRFs. To avoid the complexity of the old way of handling nested expressions returning sets (e.g. having to pass up ExprDoneCond, and dealing with arguments to functions returning sets etc.), those SRFs can only be at the top level of the node's targetlist. The planner makes sure (via split_pathtarget_at_srfs()) that SRF evaluation is only necessary in ProjectSet nodes and that SRFs are only present at the top level of the node's targetlist. If there are nested SRFs the planner creates multiple stacked ProjectSet nodes. The ProjectSet nodes always get input from an underlying node. We also discussed and prototyped evaluating targetlist SRFs using ROWS FROM(), but that turned out to be more complicated than we'd hoped. While moving SRF evaluation to ProjectSet would allow to retain the old "least common multiple" behavior when multiple SRFs are present in one targetlist (i.e. continue returning rows until all SRFs are at the end of their input at the same time), we decided to instead only return rows till all SRFs are exhausted, returning NULL for already exhausted ones. We deemed the previous behavior to be too confusing, unexpected and actually not particularly useful. As a side effect, the previously prohibited case of multiple set returning arguments to a function, is now allowed. Not because it's particularly desirable, but because it ends up working and there seems to be no argument for adding code to prohibit it. Currently the behavior for COALESCE and CASE containing SRFs has changed, returning multiple rows from the expression, even when the SRF containing "arm" of the expression is not evaluated. That's because the SRFs are evaluated in a separate ProjectSet node. As that's quite confusing, we're likely to instead prohibit SRFs in those places. But that's still being discussed, and the code would reside in places not touched here, so that's a task for later. There's a lot of, now superfluous, code dealing with set return expressions around. But as the changes to get rid of those are verbose largely boring, it seems better for readability to keep the cleanup as a separate commit. Author: Tom Lane and Andres Freund Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de	2017-01-18 13:40:27 -08:00
Alvaro Herrera	9a34123bc3	Make messages mentioning type names more uniform This avoids additional translatable strings for each distinct type, as well as making our quoting style around type names more consistent (namely, that we don't quote type names). This continues what started as `f402b99501`. Discussion: https://postgr.es/m/20160401170642.GA57509@alvherre.pgsql	2017-01-18 16:08:20 -03:00
Tom Lane	0333a73400	Avoid conflicts with collation aliases generated by stripping. This resulted in failures depending on the order of "locale -a" output. The original coding in initdb sorted the results, but that should be unnecessary as long as "locale -a" doesn't print duplicate names. The original entries will then all be non-dups, and while we might generate duplicate aliases by stripping, they should be for different encodings and thus not conflict. Even if the latter assumption fails somehow, it won't be fatal because we're using if_not_exists mode for the aliases. Discussion: https://postgr.es/m/26116.1484751196%40sss.pgh.pa.us	2017-01-18 13:44:19 -05:00
Peter Eisentraut	aa17c06fb5	Add function to import operating system collations Move this logic out of initdb into a user-callable function. This simplifies the code and makes it possible to update the standard collations later on if additional operating system collations appear. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Euler Taveira <euler@timbira.com.br>	2017-01-18 09:35:56 -05:00
Peter Eisentraut	352a24a1f9	Generate fmgr prototypes automatically Gen_fmgrtab.pl creates a new file fmgrprotos.h, which contains prototypes for all functions registered in pg_proc.h. This avoids having to manually maintain these prototypes across a random variety of header files. It also automatically enforces a correct function signature, and since there are warnings about missing prototypes, it will detect functions that are defined but not registered in pg_proc.h (or otherwise used). Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>	2017-01-17 14:06:07 -05:00
Tom Lane	ab1f0c8225	Change representation of statement lists, and add statement location info. This patch makes several changes that improve the consistency of representation of lists of statements. It's always been the case that the output of parse analysis is a list of Query nodes, whatever the types of the individual statements in the list. This patch brings similar consistency to the outputs of raw parsing and planning steps: * The output of raw parsing is now always a list of RawStmt nodes; the statement-type-dependent nodes are one level down from that. * The output of pg_plan_queries() is now always a list of PlannedStmt nodes, even for utility statements. In the case of a utility statement, "planning" just consists of wrapping a CMD_UTILITY PlannedStmt around the utility node. This list representation is now used in Portal and CachedPlan plan lists, replacing the former convention of intermixing PlannedStmts with bare utility-statement nodes. Now, every list of statements has a consistent head-node type depending on how far along it is in processing. This allows changing many places that formerly used generic "Node *" pointers to use a more specific pointer type, thus reducing the number of IsA() tests and casts needed, as well as improving code clarity. Also, the post-parse-analysis representation of DECLARE CURSOR is changed so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained SELECT remains a child of the DeclareCursorStmt rather than getting flipped around to be the other way. It's now true for both Query and PlannedStmt that utilityStmt is non-null if and only if commandType is CMD_UTILITY. That allows simplifying a lot of places that were testing both fields. (I think some of those were just defensive programming, but in many places, it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.) Because PlannedStmt carries a canSetTag field, we're also able to get rid of some ad-hoc rules about how to reconstruct canSetTag for a bare utility statement; specifically, the assumption that a utility is canSetTag if and only if it's the only one in its list. While I see no near-term need for relaxing that restriction, it's nice to get rid of the ad-hocery. The API of ProcessUtility() is changed so that what it's passed is the wrapper PlannedStmt not just the bare utility statement. This will affect all users of ProcessUtility_hook, but the changes are pretty trivial; see the affected contrib modules for examples of the minimum change needed. (Most compilers should give pointer-type-mismatch warnings for uncorrected code.) There's also a change in the API of ExplainOneQuery_hook, to pass through cursorOptions instead of expecting hook functions to know what to pick. This is needed because of the DECLARE CURSOR changes, but really should have been done in 9.6; it's unlikely that any extant hook functions know about using CURSOR_OPT_PARALLEL_OK. Finally, teach gram.y to save statement boundary locations in RawStmt nodes, and pass those through to Query and PlannedStmt nodes. This allows more intelligent handling of cases where a source query string contains multiple statements. This patch doesn't actually do anything with the information, but a follow-on patch will. (Passing this information through cleanly is the true motivation for these changes; while I think this is all good cleanup, it's unlikely we'd have bothered without this end goal.) catversion bump because addition of location fields to struct Query affects stored rules. This patch is by me, but it owes a good deal to Fabien Coelho who did a lot of preliminary work on the problem, and also reviewed the patch. Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre	2017-01-14 16:02:35 -05:00
Robert Haas	0563a3a8b5	Fix a bug in how we generate partition constraints. Move the code for doing parent attnos to child attnos mapping for Vars in partition constraint expressions to a separate function map_partition_varattnos() and call it from the appropriate places. Doing it in get_qual_from_partbound(), as is now, would produce wrong result in certain multi-level partitioning cases, because it only considers the current pair of parent-child relations. In certain multi-level partitioning cases, attnums for the same key attribute(s) might differ between various levels causing the same attribute to be numbered differently in different instances of the Var corresponding to a given attribute. With this commit, in generate_partition_qual(), we first generate the the whole partition constraint (considering all levels of partitioning) and then do the mapping, so that Vars in the final expression are numbered according the leaf relation (to which it is supposed to apply). Amit Langote, reviewed by me.	2017-01-13 14:04:35 -05:00
Alvaro Herrera	3957b58b88	Fix ALTER TABLE / SET TYPE for irregular inheritance If inherited tables don't have exactly the same schema, the USING clause in an ALTER TABLE / SET DATA TYPE misbehaves when applied to the children tables since commit `9550e8348b`. Starting with that commit, the attribute numbers in the USING expression are fixed during parse analysis. This can lead to bogus errors being reported during execution, such as: ERROR: attribute 2 has wrong type DETAIL: Table has type smallint, but query expects integer. Since it wouldn't do to revert to the original coding, we now apply a transformation to map the attribute numbers to the correct ones for each child. Reported by Justin Pryzby Analysis by Tom Lane; patch by me. Discussion: https://postgr.es/m/20170102225618.GA10071@telsasoft.com	2017-01-09 19:26:58 -03:00
Tom Lane	7c3abe3c92	Get rid of ParseState.p_value_substitute; use a columnref hook instead. I noticed that p_value_substitute, which is a single-purpose kluge I added in 2002 (commit `b0422b215`), could be replaced by having domainAddConstraint install a parser hook that looks for the name "value". The parser hook code only dates back to 2009, so it's not surprising that we had to kluge this in 2002, but we can do it more cleanly now.	2017-01-07 16:02:16 -05:00
Tom Lane	c52d37c8b3	Invalidate cached plans on FDW option changes. This fixes problems where a plan must change but fails to do so, as seen in a bug report from Rajkumar Raghuwanshi. For ALTER FOREIGN TABLE OPTIONS, do this through the standard method of forcing a relcache flush on the table. For ALTER FOREIGN DATA WRAPPER and ALTER SERVER, just flush the whole plan cache on any change in pg_foreign_data_wrapper or pg_foreign_server. That matches the way we handle some other low-probability cases such as opclass changes, and it's unclear that the case arises often enough to be worth working harder. Besides, that gives a patch that is simple enough to back-patch with confidence. Back-patch to 9.3. In principle we could apply the code change to 9.2 as well, but (a) we lack postgres_fdw to test it with, (b) it's doubtful that anyone is doing anything exciting enough with FDWs that far back to need this desperately, and (c) the patch doesn't apply cleanly. Patch originally by Amit Langote, reviewed by Etsuro Fujita and Ashutosh Bapat, who each contributed substantial changes as well. Discussion: https://postgr.es/m/CAKcux6m5cA6rRPTKkqVdJ-R=KKDfe35Q_ZuUqxDSV_4hwga=og@mail.gmail.com	2017-01-06 14:12:52 -05:00
Tom Lane	d86f40009b	Handle OID column inheritance correctly in ALTER TABLE ... INHERIT. Inheritance operations must treat the OID column, if any, much like regular user columns. But MergeAttributesIntoExisting() neglected to do that, leading to weird results after a table with OIDs is associated to a parent with OIDs via ALTER TABLE ... INHERIT. Report and patch by Amit Langote, reviewed by Ashutosh Bapat, some adjustments by me. It's been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/cb13cfe7-a48c-5720-c383-bb843ab28298@lab.ntt.co.jp	2017-01-04 18:00:11 -05:00
Robert Haas	18fc5192a6	Remove unnecessary arguments from partitioning functions. RelationGetPartitionQual() and generate_partition_qual() are always called with recurse = true, so we don't need an argument for that. Extracted by me from a larger patch by Amit Langote.	2017-01-04 14:56:37 -05:00
Robert Haas	f1b4c771ea	Fix reporting of constraint violations for table partitioning. After a tuple is routed to a partition, it has been converted from the root table's row type to the partition's row type. ExecConstraints needs to report the failure using the original tuple and the parent's tuple descriptor rather than the ones for the selected partition. Amit Langote	2017-01-04 14:36:34 -05:00
Robert Haas	345b2dcf07	Move partition_tuple_slot out of EState. Commit `2ac3ef7a01` added a TupleTapleSlot for partition tuple slot to EState (es_partition_tuple_slot) but it's more logical to have it as part of ModifyTableState (mt_partition_tuple_slot) and CopyState (partition_tuple_slot). Discussion: http://postgr.es/m/1bd459d9-4c0c-197a-346e-e5e59e217d97@lab.ntt.co.jp Amit Langote, per a gripe from me	2017-01-04 13:16:59 -05:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Peter Eisentraut	db779d941e	Fix typo in comment	2016-12-29 11:27:41 -05:00
Peter Eisentraut	2e254130d1	Make more use of RoleSpec struct Most code was casting this through a generic Node. By declaring everything as RoleSpec appropriately, we can remove a bunch of casts and ad-hoc node type checking. Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>	2016-12-29 10:49:39 -05:00
Magnus Hagander	6cfa54e384	Fix typo comments Erik Rijkers	2016-12-27 10:24:21 +01:00
Tom Lane	fe591f8bf6	Replace enum InhOption with simple boolean. Now that it has only INH_NO and INH_YES values, it's just weird that it's not a plain bool, so make it that way. Also rename RangeVar.inhOpt to "inh", to be like RangeTblEntry.inh. My recollection is that we gave it a different name specifically because it had a different representation than the derived bool value, but it no longer does. And this is a good forcing function to be sure we catch any places that are affected by the change. Bump catversion because of possible effect on stored RangeVar nodes. I'm not exactly convinced that we ever store RangeVar on disk, but we have a readfuncs function for it, so be cautious. (If we do do so, then commit `e13486eba` was in error not to bump catversion.) Follow-on to commit `e13486eba`. Discussion: http://postgr.es/m/CA+TgmoYe+EG7LdYX6pkcNxr4ygkP4+A=jm9o-CPXyOvRiCNwaQ@mail.gmail.com	2016-12-23 13:35:18 -05:00
Peter Eisentraut	158df30359	Remove unnecessary casts of makeNode() result makeNode() is already a macro that has the right result pointer type, so casting it again to the same type is unnecessary.	2016-12-23 13:17:20 -05:00
Robert Haas	e13486eba0	Remove sql_inheritance GUC. This backward-compatibility GUC is long overdue for removal. Discussion: http://postgr.es/m/CA+TgmoYe+EG7LdYX6pkcNxr4ygkP4+A=jm9o-CPXyOvRiCNwaQ@mail.gmail.com	2016-12-23 07:35:01 -05:00
Robert Haas	2ac3ef7a01	Fix tuple routing in cases where tuple descriptors don't match. The previous coding failed to work correctly when we have a multi-level partitioned hierarchy where tables at successive levels have different attribute numbers for the partition key attributes. To fix, have each PartitionDispatch object store a standalone TupleTableSlot initialized with the TupleDesc of the corresponding partitioned table, along with a TupleConversionMap to map tuples from the its parent's rowtype to own rowtype. After tuple routing chooses a leaf partition, we must use the leaf partition's tuple descriptor, not the root table's. To that end, a dedicated TupleTableSlot for tuple routing is now allocated in EState. Amit Langote	2016-12-22 17:36:37 -05:00
Stephen Frost	12bd7dd317	Use TSConfigRelationId in AlterTSConfiguration() When we are altering a text search configuration, we are getting the tuple from pg_ts_config and using its OID, so use TSConfigRelationId when invoking any post-alter hooks and setting the object address. Further, in the functions called from AlterTSConfiguration(), we're saving information about the command via EventTriggerCollectAlterTSConfig(), so we should be setting commandCollected to true. Also add a regression test to test_ddl_deparse for ALTER TEXT SEARCH CONFIGURATION. Author: Artur Zakirov, a few additional comments by me Discussion: https://www.postgresql.org/message-id/57a71eba-f2c7-e7fd-6fc0-2126ec0b39bd%40postgrespro.ru Back-patch the fix for the InvokeObjectPostAlterHook() call to 9.3 where it was introduced, and the fix for the ObjectAddressSet() call and setting commandCollected to true to 9.5 where those changes to ProcessUtilitySlow() were introduced.	2016-12-22 17:08:43 -05:00
Robert Haas	3ee8067284	Code review for ATExecAttachPartition. Amit Langote. Most of this reported by Álvaro Herrera.	2016-12-22 12:40:45 -05:00
Dean Rasheed	58b1362642	Fix order of operations in CREATE OR REPLACE VIEW. When CREATE OR REPLACE VIEW acts on an existing view, don't update the view options until after the view query has been updated. This is necessary in the case where CREATE OR REPLACE VIEW is used on an existing view that is not updatable, and the new view is updatable and specifies the WITH CHECK OPTION. In this case, attempting to apply the new options to the view before updating its query fails, because the options are applied using the ALTER TABLE infrastructure which checks that WITH CHECK OPTION is only applied to an updatable view. If new columns are being added to the view, that is also done using the ALTER TABLE infrastructure, but it is important that that still be done before updating the view query, because the rules system checks that the query columns match those on the view relation. Added a comment to explain that, in case someone is tempted to move that to where the view options are now being set. Back-patch to 9.4 where WITH CHECK OPTION was added. Report: https://postgr.es/m/CAEZATCUp%3Dz%3Ds4SzZjr14bfct_bdJNwMPi-gFi3Xc5k1ntbsAgQ%40mail.gmail.com	2016-12-21 16:58:18 +00:00
Robert Haas	cd510f0413	Convert elog() to ereport() and do some wordsmithing. It's not entirely clear that we should log a message here at all, but it's certainly wrong to use elog() for a message that should clearly be translatable. Amit Langote	2016-12-21 11:47:50 -05:00
Robert Haas	1fc5c49450	Refactor partition tuple routing code to reduce duplication. Amit Langote	2016-12-21 11:36:10 -05:00
Peter Eisentraut	f3b421da5f	Reorder pg_sequence columns to avoid alignment issue On AIX, doubles are aligned at 4 bytes, but int64 is aligned at 8 bytes. Our code assumes that doubles have alignment that can also be applied to int64, but that fails in this case. One effect is that heap_form_tuple() writes tuples in a different layout than Form_pg_sequence expects. Rather than rewrite the whole alignment code, work around the issue by reordering the columns in pg_sequence so that the first int64 column naturally comes out at an 8-byte boundary.	2016-12-21 09:06:49 -05:00
Peter Eisentraut	1753b1b027	Add pg_sequence system catalog Move sequence metadata (start, increment, etc.) into a proper system catalog instead of storing it in the sequence heap object. This separates the metadata from the sequence data. Sequence metadata is now operated on transactionally by DDL commands, whereas previously rollbacks of sequence-related DDL commands would be ignored. Reviewed-by: Andreas Karlsson <andreas@proxel.se>	2016-12-20 08:28:18 -05:00
Robert Haas	7cd0fd655d	Invalid parent's relcache after CREATE TABLE .. PARTITION OF. Otherwise, subsequent commands in the same transaction see the wrong partition descriptor. Amit Langote. Reported by Tomas Vondra and David Fetter. Reviewed by me. Discussion: http://postgr.es/m/22dd313b-d7fd-22b5-0787-654845c8f849%402ndquadrant.com Discussion: http://postgr.es/m/20161215090916.GB20659%40fetter.org	2016-12-19 22:53:30 -05:00
Robert Haas	4b9a98e154	Clean up code, comments, and formatting for table partitioning. Amit Langote, plus pgindent-ing by me. Inspired in part by review comments from Tomas Vondra.	2016-12-13 10:59:14 -05:00
Robert Haas	3856cf9607	Remove should_free arguments to tuplesort routines. Since commit `e94568ecc1`, the answer is always "false", and we do not need to complicate the API by arranging to return a constant value. Peter Geoghegan Discussion: http://postgr.es/m/CAM3SWZQWZZ_N=DmmL7tKy_OUjGH_5mN=N=A6h7kHyyDvEhg2DA@mail.gmail.com	2016-12-12 15:57:35 -05:00
Tom Lane	563d575fd7	Fix creative, but unportable, spelling of "ptr != NULL". Or at least I suppose that's what was really meant here. But even aside from the not-per-project-style use of "0" to mean "NULL", I doubt it's safe to assume that all valid pointers are > NULL. Per buildfarm member pademelon.	2016-12-12 11:23:23 -05:00
Robert Haas	f0e44751d7	Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.	2016-12-07 13:17:55 -05:00
Robert Haas	4212cb7326	Fix interaction of parallel query with prepared statements. Previously, a prepared statement created via a Parse message could get a parallel plan, but one created with a PREPARE statement could not. This state of affairs was due to confusion on my (rhaas) part: I erroneously believed that a CREATE TABLE .. AS EXECUTE statement could only be performed with a prepared statement by PREPARE, but in fact one created by a Prepare message works just as well. Therefore, it makes no sense to allow parallel query in one case but not the other. To fix, allow parallel query with all prepared statements, but run the parallel plan serially (i.e. without workers) in the case of CREATE TABLE .. AS EXECUTE. Also, document this. Amit Kapila and Tobias Bussman, plus an extra sentence of documentation by me.	2016-12-06 11:11:54 -05:00
Stephen Frost	093129c9d9	Add support for restrictive RLS policies We have had support for restrictive RLS policies since 9.5, but they were only available through extensions which use the appropriate hooks. This adds support into the grammer, catalog, psql and pg_dump for restrictive RLS policies, thus reducing the cases where an extension is necessary. In passing, also move away from using "AND"d and "OR"d in comments. As pointed out by Alvaro, it's not really appropriate to attempt to make verbs out of "AND" and "OR", so reword those comments which attempted to. Reviewed By: Jeevan Chalke, Dean Rasheed Discussion: https://postgr.es/m/20160901063404.GY4028@tamriel.snowman.net	2016-12-05 15:50:55 -05:00
Alvaro Herrera	fa2fa99552	Permit dump/reload of not-too-large >1GB tuples Our documentation states that our maximum field size is 1 GB, and that our maximum row size of 1.6 TB. However, while this might be attainable in theory with enough contortions, it is not workable in practice; for starters, pg_dump fails to dump tables containing rows larger than 1 GB, even if individual columns are well below the limit; and even if one does manage to manufacture a dump file containing a row that large, the server refuses to load it anyway. This commit enables dumping and reloading of such tuples, provided two conditions are met: 1. no single column is larger than 1 GB (in output size -- for bytea this includes the formatting overhead) 2. the whole row is not larger than 2 GB There are three related changes to enable this: a. StringInfo's API now has two additional functions that allow creating a string that grows beyond the typical 1GB limit (and "long" string). ABI compatibility is maintained. We still limit these strings to 2 GB, though, for reasons explained below. b. COPY now uses long StringInfos, so that pg_dump doesn't choke trying to emit rows longer than 1GB. c. heap_form_tuple now uses the MCXT_ALLOW_HUGE flag in its allocation for the input tuple, which means that large tuples are accepted on input. Note that at this point we do not apply any further limit to the input tuple size. The main reason to limit to 2 GB is that the FE/BE protocol uses 32 bit length words to describe each row; and because the documentation is ambiguous on its signedness and libpq does consider it signed, we cannot use the highest-order bit. Additionally, the StringInfo API uses "int" (which is 4 bytes wide in most platforms) in many places, so we'd need to change that API too in order to improve, which has lots of fallout. Backpatch to 9.5, which is the oldest that has MemoryContextAllocExtended, a necessary piece of infrastructure. We could apply to 9.4 with very minimal additional effort, but any further than that would require backpatching "huge" allocations too. This is the largest set of changes we could find that can be back-patched without breaking compatibility with existing systems. Fixing a bigger set of problems (for example, dumping tuples bigger than 2GB, or dumping fields bigger than 1GB) would require changing the FE/BE protocol and/or changing the StringInfo API in an ABI-incompatible way, neither of which would be back-patchable. Authors: Daniel Vérité, Álvaro Herrera Reviewed by: Tomas Vondra Discussion: https://postgr.es/m/20160229183023.GA286012@alvherre.pgsql	2016-12-02 00:34:01 -03:00
Tom Lane	4e026b32d4	Check for pending trigger events on far end when dropping an FK constraint. When dropping a foreign key constraint with ALTER TABLE DROP CONSTRAINT, we refuse the drop if there are any pending trigger events on the named table; this ensures that we won't remove the pg_trigger row that will be consulted by those events. But we should make the same check for the referenced relation, else we might remove a due-to-be-referenced pg_trigger row for that relation too, resulting in "could not find trigger NNN" or "relation NNN has no triggers" errors at commit. Per bug #14431 from Benjie Gillam. Back-patch to all supported branches. Report: <20161124114911.6530.31200@wrigleys.postgresql.org>	2016-11-25 13:44:47 -05:00
Peter Eisentraut	67dc4ccbb2	Add pg_sequences view Like pg_tables, pg_views, and others, this view contains information about sequences in a way that is independent of the system catalog layout but more comprehensive than the information schema. To help implement the view, add a new internal function pg_sequence_last_value() to return the last value of a sequence. This is kept separate from pg_sequence_parameters() to separate querying run-time state from catalog-like information. Reviewed-by: Andreas Karlsson <andreas@proxel.se>	2016-11-18 14:59:03 -05:00
Tom Lane	279c439c7f	Support "COPY view FROM" for views with INSTEAD OF INSERT triggers. We just pass the data to the INSTEAD trigger. Haribabu Kommi, reviewed by Dilip Kumar Patch: <CAJrrPGcSQkrNkO+4PhLm4B8UQQQmU9YVUuqmtgM=pmzMfxWaWQ@mail.gmail.com>	2016-11-10 14:13:43 -05:00
Robert Haas	dce429b117	Fix typo. Michael Paquier	2016-11-08 15:33:57 -05:00
Kevin Grittner	8c48375e5f	Implement syntax for transition tables in AFTER triggers. This is infrastructure for the complete SQL standard feature. No support is included at this point for execution nodes or PLs. The intent is to add that soon. As this patch leaves things, standard syntax can create tuplestores to contain old and/or new versions of rows affected by a statement. References to these tuplestores are in the TriggerData structure. C triggers can access the tuplestores directly, so they are usable, but they cannot yet be referenced within a SQL statement.	2016-11-04 10:49:50 -05:00
Tom Lane	a522fc3d80	Fix incorrect trigger-property updating in ALTER CONSTRAINT. The code to change the deferrability properties of a foreign-key constraint updated all the associated triggers to match; but a moment's examination of the code that creates those triggers in the first place shows that only some of them should track the constraint's deferrability properties. This leads to odd failures in subsequent exercise of the foreign key, as the triggers are fired at the wrong times. Fix that, and add a regression test comparing the trigger properties produced by ALTER CONSTRAINT with those you get by creating the constraint as-intended to begin with. Per report from James Parks. Back-patch to 9.4 where this ALTER functionality was introduced. Report: <CAJ3Xv+jzJ8iNNUcp4RKW8b6Qp1xVAxHwSXVpjBNygjKxcVuE9w@mail.gmail.com>	2016-10-26 17:05:06 -04:00
Tom Lane	709e461bef	Fix EXPLAIN so that it doesn't emit invalid XML in corner cases. With track_io_timing = on, EXPLAIN (ANALYZE, BUFFERS) will emit fields named like "I/O Read Time". The slash makes that invalid as an XML element name, so that adding FORMAT XML would produce invalid XML. We already have code in there to translate spaces to dashes, so let's generalize that to convert anything that isn't a valid XML name character, viz letters, digits, hyphens, underscores, and periods. We could just reject slashes, which would run a bit faster. But the fact that this went unnoticed for so long doesn't give me a warm feeling that we'd notice the next creative violation, so let's make it a permanent fix. Reported by Markus Winand, though this isn't his initial patch proposal. Back-patch to 9.2 where track_io_timing was added. The problem is only latent in 9.1, so I don't feel a need to fix it there. Discussion: <E0BF6A45-68E8-45E6-918F-741FB332C6BB@winand.at>	2016-10-20 17:17:50 -04:00
Tom Lane	2f1eaf87e8	Drop server support for FE/BE protocol version 1.0. While this isn't a lot of code, it's been essentially untestable for a very long time, because libpq doesn't support anything older than protocol 2.0, and has not since release 6.3. There's no reason to believe any other client-side code still uses that protocol, either. Discussion: <2661.1475849167@sss.pgh.pa.us>	2016-10-11 12:19:18 -04:00
Heikki Linnakangas	6fb12cbcd6	Remove some unnecessary #includes. Amit Langote	2016-10-10 12:22:58 +03:00
Tom Lane	e55a946a81	Fix two bugs in merging of inherited CHECK constraints. Historically, we've allowed users to add a CHECK constraint to a child table and then add an identical CHECK constraint to the parent. This results in "merging" the two constraints so that the pre-existing child constraint ends up with both conislocal = true and coninhcount > 0. However, if you tried to do it in the other order, you got a duplicate constraint error. This is problematic for pg_dump, which needs to issue separated ADD CONSTRAINT commands in some cases, but has no good way to ensure that the constraints will be added in the required order. And it's more than a bit arbitrary, too. The goal of complaining about duplicated ADD CONSTRAINT commands can be served if we reject the case of adding a constraint when the existing one already has conislocal = true; but if it has conislocal = false, let's just make the ADD CONSTRAINT set conislocal = true. In this way, either order of adding the constraints has the same end result. Another problem was that the code allowed creation of a parent constraint marked convalidated that is merged with a child constraint that is !convalidated. In this case, an inheritance scan of the parent table could emit some rows violating the constraint condition, which would be an unexpected result given the marking of the parent constraint as validated. Hence, forbid merging of constraints in this case. (Note: valid child and not-valid parent seems fine, so continue to allow that.) Per report from Benedikt Grundmann. Back-patch to 9.2 where we introduced possibly-not-valid check constraints. The second bug obviously doesn't apply before that, and I think the first doesn't either, because pg_dump only gets into this situation when dealing with not-valid constraints. Report: <CADbMkNPT-Jz5PRSQ4RbUASYAjocV_KHUWapR%2Bg8fNvhUAyRpxA%40mail.gmail.com> Discussion: <22108.1475874586@sss.pgh.pa.us>	2016-10-08 19:29:27 -04:00
Stephen Frost	814b9e9b8e	Fix RLS with COPY (col1, col2) FROM tab Attempting to COPY a subset of columns from a table with RLS enabled would fail due to an invalid query being constructed (using a single ColumnRef with the list of fields to exact in 'fields', but that's for the different levels of an indirection for a single column, not for specifying multiple columns). Correct by building a ColumnRef and then RestTarget for each column being requested and then adding those to the targetList for the select query. Include regression tests to hopefully catch if this is broken again in the future. Patch-By: Adam Brightwell Reviewed-By: Michael Paquier	2016-10-03 16:22:57 -04:00
Peter Eisentraut	967ed9205b	Remove dead line of code	2016-09-28 12:00:00 -04:00
Heikki Linnakangas	babe05bc2b	Turn password_encryption GUC into an enum. This makes the parameter easier to extend, to support other password-based authentication protocols than MD5. (SCRAM is being worked on.) The GUC still accepts on/off as aliases for "md5" and "plain", although we may want to remove those once we actually add support for another password hash type. Michael Paquier, reviewed by David Steele, with some further edits by me. Discussion: <CAB7nPqSMXU35g=W9X74HVeQp0uvgJxvYOuA4A-A3M+0wfEBv-w@mail.gmail.com>	2016-09-28 12:22:44 +03:00
Robert Haas	5225c66336	Clarify policy on marking inherited constraints as valid. Amit Langote and Robert Haas	2016-09-15 17:24:54 -04:00
Tom Lane	55c3391d1e	Be pickier about converting between Name and Datum. We were misapplying NameGetDatum() to plain C strings in some places. This worked, because it was just a pointer cast anyway, but it's a type cheat in some sense. Use CStringGetDatum instead, and modify the NameGetDatum macro so it won't compile if applied to something that's not a pointer to NameData. This should result in no changes to generated code, but it is logically cleaner. Mark Dilger, tweaked a bit by me Discussion: <EFD8AC94-4C1F-40C1-A5EA-304080089C1B@gmail.com>	2016-09-13 17:17:48 -04:00
Tom Lane	40b449ae84	Allow CREATE EXTENSION to follow extension update paths. Previously, to update an extension you had to produce both a version-update script and a new base installation script. It's become more and more obvious that that's tedious, duplicative, and error-prone. This patch attempts to improve matters by allowing the new base installation script to be omitted. CREATE EXTENSION will install a requested version if it can find a base script and a chain of update scripts that will get there. As in the existing update logic, shorter chains are preferred if there's more than one possibility, with an arbitrary tie-break rule for chains of equal length. Also adjust the pg_available_extension_versions view to show such versions as installable. While at it, refactor the code so that CASCADE processing works for extensions requested during ApplyExtensionUpdates(). Without this, addition of a new requirement in an updated extension would require creating a new base script, even if there was no other reason to do that. (It would be easy at this point to add a CASCADE option to ALTER EXTENSION UPDATE, to allow the same thing to happen during a manually-commanded version update, but I have not done that here.) Tom Lane, reviewed by Andres Freund Discussion: <20160905005919.jz2m2yh3und2dsuy@alap3.anarazel.de>	2016-09-11 14:15:07 -04:00
Tom Lane	0ab9c56d0f	Support renaming an existing value of an enum type. Not much to be said about this patch: it does what it says on the tin. In passing, rename AlterEnumStmt.skipIfExists to skipIfNewValExists to clarify what it actually does. In the discussion of this patch we considered supporting other similar options, such as IF EXISTS on the type as a whole or IF NOT EXISTS on the target name. This patch doesn't actually add any such feature, but it might happen later. Dagfinn Ilmari Mannsåker, reviewed by Emre Hasegeli Discussion: <CAO=2mx6uvgPaPDf-rHqG8=1MZnGyVDMQeh8zS4euRyyg4D35OQ@mail.gmail.com>	2016-09-07 16:11:56 -04:00
Tom Lane	4f405c8ef4	Add a HINT for client-vs-server COPY failure cases. Users often get confused between COPY and \copy and try to use client-side paths with COPY. The server then cannot find the file (if remote), or sees a permissions problem (if local), or some variant of that. Emit a hint about this in the most common cases. In future we might want to expand the set of errnos for which the hint gets printed, but be conservative for now. Craig Ringer, reviewed by Christoph Berg and Tom Lane Discussion: <CAMsr+YEqtD97qPEzQDqrCt5QiqPbWP_X4hmvy2pQzWC0GWiyPA@mail.gmail.com>	2016-09-06 23:55:55 -04:00
Peter Eisentraut	49eb0fd097	Add location field to DefElem Add a location field to the DefElem struct, used to parse many utility commands. Update various error messages to supply error position information. To propogate the error position information in a more systematic way, create a ParseState in standard_ProcessUtility() and pass that to interested functions implementing the utility commands. This seems better than passing the query string and then reassembling a parse state ad hoc, which violates the encapsulation of the ParseState type. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>	2016-09-06 12:00:00 -04:00
Simon Riggs	dcb12ce8d8	Fix VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL lazy_truncate_heap() was waiting for VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL, but in microseconds not milliseconds as originally intended. Found by code inspection. Simon Riggs	2016-09-06 15:35:47 +01:00
Tom Lane	25794e841e	Cosmetic code cleanup in commands/extension.c. Some of the comments added by the CREATE EXTENSION CASCADE patch were a bit sloppy, and I didn't care for redeclaring the same local variable inside a nested block either. No functional changes.	2016-09-05 18:53:33 -04:00
Tom Lane	15bc038f9b	Relax transactional restrictions on ALTER TYPE ... ADD VALUE. To prevent possibly breaking indexes on enum columns, we must keep uncommitted enum values from getting stored in tables, unless we can be sure that any such column is new in the current transaction. Formerly, we enforced this by disallowing ALTER TYPE ... ADD VALUE from being executed at all in a transaction block, unless the target enum type had been created in the current transaction. This patch removes that restriction, and instead insists that an uncommitted enum value can't be referenced unless it belongs to an enum type created in the same transaction as the value. Per discussion, this should be a bit less onerous. It does require each function that could possibly return a new enum value to SQL operations to check this restriction, but there aren't so many of those that this seems unmaintainable. Andrew Dunstan and Tom Lane Discussion: <4075.1459088427@sss.pgh.pa.us>	2016-09-05 12:59:55 -04:00
Heikki Linnakangas	ec136d19b2	Move code shared between libpq and backend from backend/libpq/ to common/. When building libpq, ip.c and md5.c were symlinked or copied from src/backend/libpq into src/interfaces/libpq, but now that we have a directory specifically for routines that are shared between the server and client binaries, src/common/, move them there. Some routines in ip.c were only used in the backend. Keep those in src/backend/libpq, but rename to ifaddr.c to avoid confusion with the file that's now in common. Fix the comment in src/common/Makefile to reflect how libpq actually links those files. There are two more files that libpq symlinks directly from src/backend: encnames.c and wchar.c. I don't feel compelled to move those right now, though. Patch by Michael Paquier, with some changes by me. Discussion: <69938195-9c76-8523-0af8-eb718ea5b36e@iki.fi>	2016-09-02 13:49:59 +03:00
Tom Lane	ea268cdc9a	Add macros to make AllocSetContextCreate() calls simpler and safer. I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls had typos in the context-sizing parameters. While none of these led to especially significant problems, they did create minor inefficiencies, and it's now clear that expecting people to copy-and-paste those calls accurately is not a great idea. Let's reduce the risk of future errors by introducing single macros that encapsulate the common use-cases. Three such macros are enough to cover all but two special-purpose contexts; those two calls can be left as-is, I think. While this patch doesn't in itself improve matters for third-party extensions, it doesn't break anything for them either, and they can gradually adopt the simplified notation over time. In passing, change TopMemoryContext to use the default allocation parameters. Formerly it could only be extended 8K at a time. That was probably reasonable when this code was written; but nowadays we create many more contexts than we did then, so that it's not unusual to have a couple hundred K in TopMemoryContext, even without considering various dubious code that sticks other things there. There seems no good reason not to let it use growing blocks like most other contexts. Back-patch to 9.6, mostly because that's still close enough to HEAD that it's easy to do so, and keeping the branches in sync can be expected to avoid some future back-patching pain. The bugs fixed by these changes don't seem to be significant enough to justify fixing them further back. Discussion: <21072.1472321324@sss.pgh.pa.us>	2016-08-27 17:50:38 -04:00
Tom Lane	b5bce6c1ec	Final pgindent + perltidy run for 9.6.	2016-08-15 13:42:51 -04:00
Tom Lane	ed0097e4f9	Add SQL-accessible functions for inspecting index AM properties. Per discussion, we should provide such functions to replace the lost ability to discover AM properties by inspecting pg_am (cf commit `65c5fcd35`). The added functionality is also meant to displace any code that was looking directly at pg_index.indoption, since we'd rather not believe that the bit meanings in that field are part of any client API contract. As future-proofing, define the SQL API to not assume that properties that are currently AM-wide or index-wide will remain so unless they logically must be; instead, expose them only when inquiring about a specific index or even specific index column. Also provide the ability for an index AM to override the behavior. In passing, document pg_am.amtype, overlooked in commit `473b93287`. Andrew Gierth, with kibitzing by me and others Discussion: <87mvl5on7n.fsf@news-spur.riddles.org.uk>	2016-08-13 18:31:14 -04:00
Tom Lane	4b234fd8bf	Fix inappropriate printing of never-measured times in EXPLAIN. EXPLAIN (ANALYZE, TIMING OFF) would print an elapsed time of zero for a trigger function, because no measurement has been taken but it printed the field anyway. This isn't what EXPLAIN does elsewhere, so suppress it. In the same vein, EXPLAIN (ANALYZE, BUFFERS) with non-text output format would print buffer I/O timing numbers even when no measurement has been taken because track_io_timing is off. That seems not per policy, either, so change it. Back-patch to 9.2 where these features were introduced. Maksim Milyutin Discussion: <081c0540-ecaa-bd29-3fd2-6358f3b359a9@postgrespro.ru>	2016-08-12 12:13:04 -04:00
Tom Lane	95bee941be	Fix misestimation of n_distinct for a nearly-unique column with many nulls. If ANALYZE found no repeated non-null entries in its sample, it set the column's stadistinct value to -1.0, intending to indicate that the entries are all distinct. But what this value actually means is that the number of distinct values is 100% of the table's rowcount, and thus it was overestimating the number of distinct values by however many nulls there are. This could lead to very poor selectivity estimates, as for example in a recent report from Andreas Joseph Krogh. We should discount the stadistinct value by whatever we've estimated the nulls fraction to be. (That is what will happen if we choose to use a negative stadistinct for a column that does have repeated entries, so this code path was just inconsistent.) In addition to fixing the stadistinct entries stored by several different ANALYZE code paths, adjust the logic where get_variable_numdistinct() forces an "all distinct" estimate on the basis of finding a relevant unique index. Unique indexes don't reject nulls, so there's no reason to assume that the null fraction doesn't apply. Back-patch to all supported branches. Back-patching is a bit of a judgment call, but this problem seems to affect only a few users (else we'd have identified it long ago), and it's bad enough when it does happen that destabilizing plan choices in a worse direction seems unlikely. Patch by me, with documentation wording suggested by Dean Rasheed Report: <VisenaEmail.26.df42f82acae38a58.156463942b8@tc7-visena> Discussion: <16143.1470350371@sss.pgh.pa.us>	2016-08-07 18:52:02 -04:00
Tom Lane	9ee1cf04ab	Fix TOAST access failure in RETURNING queries. Discussion of commit `3e2f3c2e4` exposed a problem that is of longer standing: since we don't detoast data while sticking it into a portal's holdStore for PORTAL_ONE_RETURNING and PORTAL_UTIL_SELECT queries, and we release the query's snapshot as soon as we're done loading the holdStore, later readout of the holdStore can do TOAST fetches against data that can no longer be seen by any of the session's live snapshots. This means that a concurrent VACUUM could remove the TOAST data before we can fetch it. Commit `3e2f3c2e4` exposed the problem by showing that sometimes we had no live snapshots while fetching TOAST data, but we'd be at risk anyway. I believe this code was all right when written, because our management of a session's exposed xmin was such that the TOAST references were safe until end of transaction. But that's no longer true now that we can advance or clear our PGXACT.xmin intra-transaction. To fix, copy the query's snapshot during FillPortalStore() and save it in the Portal; release it only when the portal is dropped. This essentially implements a policy that we must hold a relevant snapshot whenever we access potentially-toasted data. We had already come to that conclusion in other places, cf commits `08e261cbc9` and `ec543db77b`. I'd have liked to add a regression test case for this, but I didn't see a way to make one that's not unreasonably bloated; it seems to require returning a toasted value to the client, and those will be big. In passing, improve PortalRunUtility() so that it positively verifies that its ending PopActiveSnapshot() call will pop the expected snapshot, removing a rather shaky assumption about which utility commands might do their own PopActiveSnapshot(). There's no known bug here, but now that we're actively referencing the snapshot it's almost free to make this code a bit more bulletproof. We might want to consider back-patching something like this into older branches, but it would be prudent to let it prove itself more in HEAD beforehand. Discussion: <87vazemeda.fsf@credativ.de>	2016-08-07 17:46:08 -04:00
Peter Eisentraut	40fcfec82c	Message style improvements	2016-07-25 22:07:44 -04:00
Kevin Grittner	1c15aac53f	Add comment & docs about no vacuum truncation with sto. Omission noted by Andres Freund.	2016-07-19 16:25:53 -05:00
Andres Freund	eca0f1db14	Clear all-frozen visibilitymap status when locking tuples. Since `a892234` & `fd31cd265` the visibilitymap's freeze bit is used to avoid vacuuming the whole relation in anti-wraparound vacuums. Doing so correctly relies on not adding xids to the heap without also unsetting the visibilitymap flag. Tuple locking related code has not done so. To allow selectively resetting all-frozen - to avoid pessimizing heap_lock_tuple - allow to selectively reset the all-frozen with visibilitymap_clear(). To avoid having to use visibilitymap_get_status (e.g. via VM_ALL_FROZEN) inside a critical section, have visibilitymap_clear() return whether any bits have been reset. There's a remaining issue (denoted by XXX): After the PageIsAllVisible() check in heap_lock_tuple() and heap_lock_updated_tuple_rec() the page status could theoretically change. Practically that currently seems impossible, because updaters will hold a page level pin already. Due to the next beta coming up, it seems better to get the required WAL magic bump done before resolving this issue. The added flags field fields to xl_heap_lock and xl_heap_lock_updated require bumping the WAL magic. Since there's already been a catversion bump since the last beta, that's not an issue. Reviewed-By: Robert Haas, Amit Kapila and Andres Freund Author: Masahiko Sawada, heavily revised by Andres Freund Discussion: CAEepm=3fWAbWryVW9swHyLTY4sXVf0xbLvXqOwUoDiNCx9mBjQ@mail.gmail.com Backpatch: -	2016-07-18 02:01:13 -07:00
Tom Lane	4d042999f9	Print a given subplan only once in EXPLAIN. We have, for a very long time, allowed the same subplan (same member of the PlannedStmt.subplans list) to be referenced by more than one SubPlan node; this avoids problems for cases such as subplans within an IndexScan's indxqual and indxqualorig fields. However, EXPLAIN had not gotten the memo and would print each reference as though it were an independent identical subplan. To fix, track plan_ids of subplans we've printed and don't print the same plan_id twice. Per report from Pavel Stehule. BTW: the particular case of IndexScan didn't cause visible duplication in a plain EXPLAIN, only EXPLAIN ANALYZE, because in the former case we short-circuit executor startup before the indxqual field is processed by ExecInitExpr. That seems like it could easily lead to other EXPLAIN problems in future, but it's not clear how to avoid it without breaking the "EXPLAIN a plan using hypothetical indexes" use-case. For now I've left that issue alone. Although this is a longstanding bug, it's purely cosmetic (no great harm is done by the repeat printout) and we haven't had field complaints before. So I'm hesitant to back-patch it, especially since there is some small risk of ABI problems due to the need to add a new field to ExplainState. In passing, rearrange order of fields in ExplainState to be less random, and update some obsolete comments about when/where to initialize them. Report: <CAFj8pRAimq+NK-menjt+3J4-LFoodDD8Or6=Lc_stcFD+eD4DA@mail.gmail.com>	2016-07-11 18:14:29 -04:00
Robert Haas	10c0558ffe	Fix several mistakes around parallel workers and client_encoding. Previously, workers sent data to the leader using the client encoding. That mostly worked, but the leader the converted the data back to the server encoding. Since not all encoding conversions are reversible, that could provoke failures. Fix by using the database encoding for all communication between worker and leader. Also, while temporary changes to GUC settings, as from the SET clause of a function, are in general OK for parallel query, changing client_encoding this way inside of a parallel worker is not OK. Previously, that would have confused the leader; with these changes, it would not confuse the leader, but it wouldn't do anything either. So refuse such changes in parallel workers. Also, the previous code naively assumed that when it received a NotifyResonse from the worker, it could pass that directly back to the user. But now that worker-to-leader communication always uses the database encoding, that's clearly no longer correct - though, actually, the old way was always broken for V2 clients. So disassemble and reconstitute the message instead. Issues reported by Peter Eisentraut. Patch by me, reviewed by Peter Eisentraut.	2016-06-30 18:35:32 -04:00
Tom Lane	8ebb69f854	Fix some infelicities in EXPLAIN output for parallel query plans. In non-text output formats, parallelized aggregates were reporting "Partial" or "Finalize" as a field named "Operation", which might be all right in the absence of any context --- but other plan node types use that field to report SQL-visible semantics, such as Select/Insert/Update/Delete. So that naming choice didn't seem good to me. I changed it to "Partial Mode". Also, the field did not appear at all for a non-parallelized Agg plan node, which is contrary to expectation in non-text formats. We're notionally producing objects that conform to a schema, so the set of fields for a given node type and EXPLAIN mode should be well-defined. I set it up to fill in "Simple" in such cases. Other fields that were added for parallel query, namely "Parallel Aware" and Gather's "Single Copy", had not gotten the word on that point either. Make them appear always in non-text output. Also, the latter two fields were nominally producing boolean output, but were getting it wrong, because bool values shouldn't be quoted in JSON or YAML. Somehow we'd not needed an ExplainPropertyBool formatting subroutine before 9.6; but now we do, so invent it. Discussion: <16002.1466972724@sss.pgh.pa.us>	2016-06-29 18:51:20 -04:00
Tom Lane	874fe3aea1	Fix CREATE MATVIEW/CREATE TABLE AS ... WITH NO DATA to not plan the query. Previously, these commands always planned the given query and went through executor startup before deciding not to actually run the query if WITH NO DATA is specified. This behavior is problematic for pg_dump because it may cause errors to be raised that we would rather not see before a REFRESH MATERIALIZED VIEW command is issued. See for example bug #13907 from Marian Krucina. This change is not sufficient to fix that particular bug, because we also need to tweak pg_dump to issue the REFRESH later, but it's a necessary step on the way. A user-visible side effect of doing things this way is that the returned command tag for WITH NO DATA cases will now be "CREATE MATERIALIZED VIEW" or "CREATE TABLE AS", not "SELECT 0". We could preserve the old behavior but it would take more code, and arguably that was just an implementation artifact not intended behavior anyhow. In 9.5 and HEAD, also get rid of the static variable CreateAsReladdr, which was trouble waiting to happen; there is not any prohibition on nested CREATE commands. Back-patch to 9.3 where CREATE MATERIALIZED VIEW was introduced. Michael Paquier and Tom Lane Report: <20160202161407.2778.24659@wrigleys.postgresql.org>	2016-06-27 15:57:50 -04:00
Tom Lane	19e972d558	Rethink node-level representation of partial-aggregation modes. The original coding had three separate booleans representing partial aggregation behavior, which was confusing, unreadable, and error-prone, not least because the booleans weren't always listed in the same order. It was also inadequate for the allegedly-desirable future extension to support intermediate partial aggregation, because we'd need separate markers for serialization and deserialization in such a case. Merge these bools into an enum "AggSplit" to provide symbolic names for the supported operating modes (and document what those are). By assigning the values of the enum constants carefully, we can treat AggSplit values as options bitmasks so that tests of what to do aren't noticeably more expensive than before. While at it, get rid of Aggref.aggoutputtype. That's not needed since commit `59a3795c2` got rid of setrefs.c's special-purpose Aggref comparison code, and it likewise seemed more confusing than helpful. Assorted comment cleanup as well (there's still more that I want to do in that line). catversion bump for change in Aggref node contents. Should be the last one for partial-aggregation changes. Discussion: <29309.1466699160@sss.pgh.pa.us>	2016-06-26 14:33:38 -04:00
Tom Lane	f8ace5477e	Fix type-safety problem with parallel aggregate serial/deserialization. The original specification for this called for the deserialization function to have signature "deserialize(serialtype) returns transtype", which is a security violation if transtype is INTERNAL (which it always would be in practice) and serialtype is not (which ditto). The patch blithely overrode the opr_sanity check for that, which was sloppy-enough work in itself, but the indisputable reason this cannot be allowed to stand is that CREATE FUNCTION will reject such a signature and thus it'd be impossible for extensions to create parallelizable aggregates. The minimum fix to make the signature type-safe is to add a second, dummy argument of type INTERNAL. But to lock it down a bit more and make misuse of INTERNAL-accepting functions less likely, let's get rid of the ability to specify a "serialtype" for an aggregate and just say that the only useful serialtype is BYTEA --- which, in practice, is the only interesting value anyway, due to the usefulness of the send/recv infrastructure for this purpose. That means we only have to allow "serialize(internal) returns bytea" and "deserialize(bytea, internal) returns internal" as the signatures for these support functions. In passing fix bogus signature of int4_avg_combine, which I found thanks to adding an opr_sanity check on combinefunc signatures. catversion bump due to removing pg_aggregate.aggserialtype and adjusting signatures of assorted built-in functions. David Rowley and Tom Lane Discussion: <27247.1466185504@sss.pgh.pa.us>	2016-06-22 16:52:41 -04:00
Robert Haas	ede62e56fb	Add VACUUM (DISABLE_PAGE_SKIPPING) for emergencies. If you really want to vacuum every single page in the relation, regardless of apparent visibility status or anything else, you can use this option. In previous releases, this behavior could be achieved using VACUUM (FREEZE), but because we can now recognize all-frozen pages as not needing to be frozen again, that no longer works. There should be no need for routine use of this option, but maybe bugs or disaster recovery will necessitate its use. Patch by me, reviewed by Andres Freund.	2016-06-17 15:48:57 -04:00
Robert Haas	38e9f90a22	Fix lazy_scan_heap so that it won't mark pages all-frozen too soon. Commit `a892234f83` added a new bit per page to the visibility map fork indicating whether the page is all-frozen, but incorrectly assumed that if lazy_scan_heap chose to freeze a tuple then that tuple would not need to later be frozen again. This turns out to be false, because xmin and xmax (and conceivably xvac, if dealing with tuples from very old releases) could be frozen at separate times. Thanks to Andres Freund for help in uncovering and tracking down this issue.	2016-06-15 14:30:06 -04:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Peter Eisentraut	5c6d2a5e7c	Message style and wording fixes	2016-06-07 14:18:55 -04:00
Tom Lane	f64340e743	Don't reset changes_since_analyze after a selective-columns ANALYZE. If we ANALYZE only selected columns of a table, we should not postpone auto-analyze because of that; other columns may well still need stats updates. As committed, the counter is left alone if a column list is given, whether or not it includes all analyzable columns of the table. Per complaint from Tomasz Ostrowski. It's been like this a long time, so back-patch to all supported branches. Report: <ef99c1bd-ff60-5f32-2733-c7b504eb960c@ato.waw.pl>	2016-06-06 17:44:17 -04:00
Robert Haas	c6dbf1fe79	Stop the executor if no more tuples can be sent from worker to leader. If a Gather node has read as many tuples as it needs (for example, due to Limit) it may detach the queue connecting it to the worker before reading all of the worker's tuples. Rather than let the worker continue to generate and send all of the results, have it stop after sending the next tuple. More could be done here to stop the worker even quicker, but this is about as well as we can hope to do for 9.6. This is in response to a problem report from Andreas Seltenreich. Commit `44339b892a` should be actually be sufficient to fix that example even without this change, but it seems better to do this, too, since we might otherwise waste quite a large amount of effort in one or more workers. Discussion: CAA4eK1KOKGqmz9bGu+Z42qhRwMbm4R5rfnqsLCNqFs9j14jzEA@mail.gmail.com Amit Kapila	2016-06-06 14:52:58 -04:00
Robert Haas	6436a853f1	Fix comment to be more accurate. Now that we skip vacuuming all-frozen pages, this comment needs updating. Masahiko Sawada	2016-06-03 11:56:57 -04:00
Greg Stark	e1623c3959	Fix various common mispellings. Mostly these are just comments but there are a few in documentation and a handful in code and tests. Hopefully this doesn't cause too much unnecessary pain for backpatching. I relented from some of the most common like "thru" for that reason. The rest don't seem numerous enough to cause problems. Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings	2016-06-03 16:08:45 +01:00
Robert Haas	fdfaccfa79	Cosmetic improvements to freeze map code. Per post-commit review comments from Andres Freund, improve variable names, comments, and in one place, slightly improve the code structure. Masahiko Sawada	2016-06-03 08:43:41 -04:00
Tom Lane	83dbde94f7	Fix DROP ACCESS METHOD IF EXISTS. The IF EXISTS option was documented, and implemented in the grammar, but it didn't actually work for lack of support in does_not_exist_skipping(). Per bug #14160. Report and patch by Kouhei Sutou Report: <20160527070433.19424.81712@wrigleys.postgresql.org>	2016-05-27 11:03:18 -04:00
Tom Lane	2d2e40e3be	Fetch XIDs atomically during vac_truncate_clog(). Because vac_update_datfrozenxid() updates datfrozenxid and datminmxid in-place, it's unsafe to assume that successive reads of those values will give consistent results. Fetch each one just once to ensure sane behavior in the minimum calculation. Noted while reviewing Alexander Korotkov's patch in the same area. Discussion: <8564.1464116473@sss.pgh.pa.us>	2016-05-24 15:47:51 -04:00
Tom Lane	996d273978	Avoid consuming an XID during vac_truncate_clog(). vac_truncate_clog() uses its own transaction ID as the comparison point in a sanity check that no database's datfrozenxid has already wrapped around "into the future". That was probably fine when written, but in a lazy vacuum we won't have assigned an XID, so calling GetCurrentTransactionId() causes an XID to be assigned when otherwise one would not be. Most of the time that's not a big problem ... but if we are hard up against the wraparound limit, consuming XIDs during antiwraparound vacuums is a very bad thing. Instead, use ReadNewTransactionId(), which not only avoids this problem but is in itself a better comparison point to test whether wraparound has already occurred. Report and patch by Alexander Korotkov. Back-patch to all versions. Report: <CAPpHfdspOkmiQsxh-UZw2chM6dRMwXAJGEmmbmqYR=yvM7-s6A@mail.gmail.com>	2016-05-24 15:20:36 -04:00
Stephen Frost	a89505fd21	Remove various special checks around default roles Default roles really should be like regular roles, for the most part. This removes a number of checks that were trying to make default roles extra special by not allowing them to be used as regular roles. We still prevent users from creating roles in the "pg_" namespace or from altering roles which exist in that namespace via ALTER ROLE, as we can't preserve such changes, but otherwise the roles are very much like regular roles. Based on discussion with Robert and Tom.	2016-05-06 14:06:50 -04:00
Kevin Grittner	a343e223a5	Revert no-op changes to BufferGetPage() The reverted changes were intended to force a choice of whether any newly-added BufferGetPage() calls needed to be accompanied by a test of the snapshot age, to support the "snapshot too old" feature. Such an accompanying test is needed in about 7% of the cases, where the page is being used as part of a scan rather than positioning for other purposes (such as DML or vacuuming). The additional effort required for back-patching, and the doubt whether the intended benefit would really be there, have indicated it is best just to rely on developers to do the right thing based on comments and existing usage, as we do with many other conventions. This change should have little or no effect on generated executable code. Motivated by the back-patching pain of Tom Lane and Robert Haas	2016-04-20 08:31:19 -05:00
Tom Lane	c34df8a003	Disallow creation of indexes on system columns (except for OID). Although OID acts pretty much like user data, the other system columns do not, so an index on one would likely misbehave. And it's pretty hard to see a use-case for one, anyway. Let's just forbid the case rather than worry about whether it should be supported. David Rowley	2016-04-16 12:11:41 -04:00
Tom Lane	8f1911d5e6	Fix possible crash in ALTER TABLE ... REPLICA IDENTITY USING INDEX. Careless coding added by commit `07cacba983` could result in a crash or a bizarre error message if someone tried to select an index on the OID column as the replica identity index for a table. Back-patch to 9.4 where the feature was introduced. Discussion: CAKJS1f8TQYgTRDyF1_u9PVCKWRWz+DkieH=U7954HeHVPJKaKg@mail.gmail.com David Rowley	2016-04-15 12:11:40 -04:00
Robert Haas	5702277ca9	Tweak EXPLAIN for parallel query to show workers launched. The previous display was sort of confusing, because it didn't distinguish between the number of workers that we planned to launch and the number that actually got launched. This has already confused several people, so display both numbers and label them clearly. Julien Rouhaud, reviewed by me.	2016-04-15 11:52:18 -04:00
Tom Lane	92a30a7eb0	Fix broken dependency-mongering for index operator classes/families. For a long time, opclasscmds.c explained that "we do not create a dependency link to the AM [for an opclass or opfamily], because we don't currently support DROP ACCESS METHOD". Commit `473b932870` invented DROP ACCESS METHOD, but it batted only 1 for 2 on adding the dependency links, and 0 for 2 on updating the comments about the topic. In passing, undo the same commit's entirely inappropriate decision to blow away an existing index as a side-effect of create_am.sql.	2016-04-13 23:33:31 -04:00
Stephen Frost	bfed4ab824	Disallow SET SESSION AUTHORIZATION pg_* As part of reserving the pg_* namespace for default roles and in line with SET ROLE and other previous efforts, disallow settings the role to a default/reserved role using SET SESSION AUTHORIZATION. These checks and restrictions on what is allowed regarding default / reserved roles are under debate, but it seems prudent to ensure that the existing checks at least cover the intended cases while the debate rages on. On me to clean it up if the consensus decision is to remove these checks.	2016-04-13 21:31:24 -04:00
Alvaro Herrera	bd905a0d04	Fix possible NULL dereference in ExecAlterObjectDependsStmt I used the wrong variable here. Doesn't make a difference today because the only plausible caller passes a non-NULL variable, but someday it will be wrong, and even today's correctness is subtle: the caller that does pass a NULL is never invoked because of object type constraints. Surely not a condition to rely on. Noted by Coverity	2016-04-10 11:03:35 -03:00
Stephen Frost	293007898d	Reserve the "pg_" namespace for roles This will prevent users from creating roles which begin with "pg_" and will check for those roles before allowing an upgrade using pg_upgrade. This will allow for default roles to be provided at initdb time. Reviews by José Luis Tallón and Robert Haas	2016-04-08 16:56:27 -04:00
Kevin Grittner	848ef42bb8	Add the "snapshot too old" feature This feature is controlled by a new old_snapshot_threshold GUC. A value of -1 disables the feature, and that is the default. The value of 0 is just intended for testing. Above that it is the number of minutes a snapshot can reach before pruning and vacuum are allowed to remove dead tuples which the snapshot would otherwise protect. The xmin associated with a transaction ID does still protect dead tuples. A connection which is using an "old" snapshot does not get an error unless it accesses a page modified recently enough that it might not be able to produce accurate results. This is similar to the Oracle feature, and we use the same SQLSTATE and error message for compatibility.	2016-04-08 14:36:30 -05:00
Kevin Grittner	8b65cf4c5e	Modify BufferGetPage() to prepare for "snapshot too old" feature This patch is a no-op patch which is intended to reduce the chances of failures of omission once the functional part of the "snapshot too old" patch goes in. It adds parameters for snapshot, relation, and an enum to specify whether the snapshot age check needs to be done for the page at this point. This initial patch passes NULL for the first two new parameters and BGP_NO_SNAPSHOT_TEST for the third. The follow-on patch will change the places where the test needs to be made.	2016-04-08 14:30:10 -05:00
Teodor Sigaev	8b99edefca	Revert CREATE INDEX ... INCLUDING ... It's not ready yet, revert two commits `690c543550` - unstable test output `386e3d7609` - patch itself	2016-04-08 21:52:13 +03:00
Teodor Sigaev	386e3d7609	CREATE INDEX ... INCLUDING (column[, ...]) Now indexes (but only B-tree for now) can contain "extra" column(s) which doesn't participate in index structure, they are just stored in leaf tuples. It allows to use index only scan by using single index instead of two or more indexes. Author: Anastasia Lubennikova with minor editorializing by me Reviewers: David Rowley, Peter Geoghegan, Jeff Janes	2016-04-08 19:45:59 +03:00
Peter Eisentraut	339025c68f	Replace printf format %i by %d see also `ce8d7bb644`	2016-04-08 12:42:58 -04:00
Tom Lane	93c301fc4f	Fix multiple bugs in tablespace symlink removal. Don't try to examine S_ISLNK(st.st_mode) after a failed lstat(). It's undefined. Also, if the lstat() reported ENOENT, we do not wish that to be a hard error, but the code might nonetheless treat it as one (giving an entirely misleading error message, too) depending on luck-of-the-draw as to what S_ISLNK() returned. Don't throw error for ENOENT from rmdir(), either. (We're not really expecting ENOENT because we just stat'd the file successfully; but if we're going to allow ENOENT in the symlink code path, surely the directory code path should too.) Generate an appropriate errcode for its-the-wrong-type-of-file complaints. (ERRCODE_SYSTEM_ERROR doesn't seem appropriate, and failing to write errcode() around it certainly doesn't work, and not writing an errcode at all is not per project policy.) Valgrind noticed the undefined S_ISLNK result; the other problems emerged while reading the code in the area. All of this appears to have been introduced in `8f15f74a44`. Back-patch to 9.5 where that commit appeared.	2016-04-08 12:31:53 -04:00
Tom Lane	de94e2af18	Run pgindent on a batch of (mostly-planner-related) source files. Getting annoyed at the amount of unrelated chatter I get from pgindent'ing Rowley's unique-joins patch. Re-indent all the files it touches.	2016-04-06 11:34:02 -04:00
Alvaro Herrera	f2fcad27d5	Support ALTER THING .. DEPENDS ON EXTENSION This introduces a new dependency type which marks an object as depending on an extension, such that if the extension is dropped, the object automatically goes away; and also, if the database is dumped, the object is included in the dump output. Currently the grammar supports this for indexes, triggers, materialized views and functions only, although the utility code is generic so adding support for more object types is a matter of touching the parser rules only. Author: Abhijit Menon-Sen Reviewed-by: Alexander Korotkov, Álvaro Herrera Discussion: http://www.postgresql.org/message-id/20160115062649.GA5068@toroid.org	2016-04-05 18:38:54 -03:00
Robert Haas	41ea0c2376	Fix parallel-safety code for parallel aggregation. has_parallel_hazard() was ignoring the proparallel markings for aggregates, which is no good. Fix that. There was no way to mark an aggregate as actually being parallel-safe, either, so add a PARALLEL option to CREATE AGGREGATE. Patch by me, reviewed by David Rowley.	2016-04-05 16:06:15 -04:00
Tom Lane	3c69b33f45	Add a few comments about ANALYZE's strategy for collecting MCVs. Alex Shulgin complained that the underlying strategy wasn't all that apparent, particularly not the fact that we intentionally have two code paths depending on whether we think the column has a limited set of possible values or not. Try to make it clearer.	2016-04-04 17:06:33 -04:00
Tom Lane	391159e03a	Partially revert commit `3d3bf62f30`. On reflection, the pre-existing logic in ANALYZE is specifically meant to compare the frequency of a candidate MCV against the estimated frequency of a random distinct value across the whole table. The change to compare it against the average frequency of values actually seen in the sample doesn't seem very principled, and if anything it would make us less likely not more likely to consider a value an MCV. So revert that, but keep the aspect of considering only nonnull values, which definitely is correct. In passing, rename the local variables in these stanzas to "ndistinct_table", to avoid confusion with the "ndistinct" that appears at an outer scope in compute_scalar_stats.	2016-04-04 16:48:13 -04:00
Tom Lane	3d3bf62f30	Omit null rows when setting the threshold for what's a most-common value. As with the previous patch, large numbers of null rows could skew this calculation unfavorably, causing us to discard values that have a legitimate claim to be MCVs, since our definition of MCV is that it's most common among the non-null population of the column. Hence, make the numerator of avgcount be the number of non-null sample values not the number of sample rows; likewise for maxmincount in the compute_scalar_stats variant. Also, make the denominator be the number of distinct values actually observed in the sample, rather than reversing it back out of the computed stadistinct. This avoids depending on the accuracy of the Haas-Stokes approximation, and really it's what we want anyway; the threshold should depend only on what we see in the sample, not on what we extrapolate about the contents of the whole column. Alex Shulgin, reviewed by Tomas Vondra and myself	2016-04-01 17:03:27 -04:00
Tom Lane	be4b4dc759	Omit null rows when applying the Haas-Stokes estimator for ndistinct. Previously, we included null rows in the values of n and N that went into the formula, which amounts to considering null as a value in its own right; but the d and f1 values do not include nulls. This is inconsistent, and it contributes to significant underestimation of ndistinct when the column is mostly nulls. In any case stadistinct is defined as the number of distinct non-null values, so we should exclude nulls when doing this computation. This is an aboriginal bug in our application of the Haas-Stokes formula, but we'll refrain from back-patching for fear of destabilizing plan choices in released branches. While at it, make the code a bit more readable by omitting unnecessary casts and intermediate variables. Observation and original patch by Tomas Vondra, adjusted to fix both uses of the formula by Alex Shulgin, cosmetic improvements by me	2016-04-01 15:48:24 -04:00
Alvaro Herrera	f402b99501	Type names should not be quoted Our actual convention, contrary to what I said in `59a2111b23`, is not to quote type names, as evidenced by unquoted use of format_type_be() result value in error messages. Remove quotes from recently tweaked messages accordingly. Per note from Tom Lane	2016-04-01 13:35:48 -03:00
Robert Haas	5fe5a2cee9	Allow aggregate transition states to be serialized and deserialized. This is necessary infrastructure for supporting parallel aggregation for aggregates whose transition type is "internal". Such values can't be passed between cooperating processes, because they are just pointers. David Rowley, reviewed by Tomas Vondra and by me.	2016-03-29 15:04:05 -04:00
Robert Haas	f9143d102f	Rework custom scans to work more like the new extensible node stuff. Per discussion, the new extensible node framework is thought to be better designed than the custom path/scan/scanstate stuff we added in PostgreSQL 9.5. Rework the latter to be more like the former. This is not backward-compatible, but we generally don't promise that for C APIs, and there probably aren't many people using this yet anyway. KaiGai Kohei, reviewed by Petr Jelinek and me. Some further cosmetic changes by me.	2016-03-29 11:28:04 -04:00
Alvaro Herrera	59a2111b23	Improve internationalization of messages involving type names Change the slightly different variations of the message function FOO must return type BAR to a single wording, removing the variability in type name so that they all create a single translation entry; since the type name is not to be translated, there's no point in it being part of the message anyway. Also, change them all to use the same quoting convention, namely that the function name is not to be quoted but the type name is. (I'm not quite sure why this is so, but it's the clear majority.) Some similar messages such as "encoding conversion function FOO must ..." are also changed.	2016-03-28 14:24:37 -03:00
Tom Lane	c94959d411	Fix DROP OPERATOR to reset oprcom/oprnegate links to the dropped operator. This avoids leaving dangling links in pg_operator; which while fairly harmless are also unsightly. While we're at it, simplify OperatorUpd, which went through heap_modify_tuple for no very good reason considering it had already made a tuple copy it could just scribble on. Roma Sokolov, reviewed by Tomas Vondra, additional hacking by Robert Haas and myself.	2016-03-25 12:33:16 -04:00
Tom Lane	a376960c8f	Suppress compiler warning for get_am_type_string(). Compilers that don't know that elog(ERROR) doesn't return complained that this function might fail to return a value. Per buildfarm. While at it, const-ify the function's declaration, since the intent is evidently to always return a constant string.	2016-03-24 17:22:24 -04:00
Alvaro Herrera	473b932870	Support CREATE ACCESS METHOD This enables external code to create access methods. This is useful so that extensions can add their own access methods which can be formally tracked for dependencies, so that DROP operates correctly. Also, having explicit support makes pg_dump work correctly. Currently only index AMs are supported, but we expect different types to be added in the future. Authors: Alexander Korotkov, Petr Jelínek Reviewed-By: Teodor Sigaev, Petr Jelínek, Jim Nasby Commitfest-URL: https://commitfest.postgresql.org/9/353/ Discussion: https://www.postgresql.org/message-id/CAPpHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA@mail.gmail.com	2016-03-23 23:01:35 -03:00
Simon Riggs	8320c625d9	Change comment to describe correct lock level used	2016-03-23 11:32:34 +00:00
Robert Haas	0bf3ae88af	Directly modify foreign tables. postgres_fdw can now sent an UPDATE or DELETE statement directly to the foreign server in simple cases, rather than sending a SELECT FOR UPDATE statement and then updating or deleting rows one-by-one. Etsuro Fujita, reviewed by Rushabh Lathia, Shigeru Hanada, Kyotaro Horiguchi, Albe Laurenz, Thom Brown, and me.	2016-03-18 13:55:52 -04:00
Tom Lane	bd0ab28912	Remove useless double calls of make_parsestate(). Aleksander Alekseev	2016-03-17 16:46:35 -04:00
Robert Haas	bc55cc0b6a	Fix problems in commit `c16dc1aca5`. Vinayak Pokale provided a patch for a copy-and-paste error in a comment. I noticed that I'd use the word "automatically" nearby where I meant to talk about things being "atomic". Rahila Syed spotted a misplaced counter update. Fix all that stuff.	2016-03-16 13:54:04 -04:00
Robert Haas	c16dc1aca5	Add simple VACUUM progress reporting. There's a lot more that could be done here yet - in particular, this reports only very coarse-grained information about the index vacuuming phase - but even as it stands, the new pg_stat_progress_vacuum can tell you quite a bit about what a long-running vacuum is actually doing. Amit Langote and Robert Haas, based on earlier work by Vinayak Pokale and Rahila Syed.	2016-03-15 13:32:56 -04:00
Robert Haas	270b7daf5c	Fix EXPLAIN ANALYZE SELECT INTO not to choose a parallel plan. We don't support any parallel write operations at present, so choosing a parallel plan causes us to error out. Also, add a new regression test that uses EXPLAIN ANALYZE SELECT INTO; if we'd had this previously, force_parallel_mode testing would have caught this issue. Mithun Cy and Robert Haas	2016-03-14 19:48:46 -04:00
Tom Lane	23a27b039d	Widen query numbers-of-tuples-processed counters to uint64. This patch widens SPI_processed, EState's es_processed field, PortalData's portalPos field, FuncCallContext's call_cntr and max_calls fields, ExecutorRun's count argument, PortalRunFetch's result, and the max number of rows in a SPITupleTable to uint64, and deals with (I hope) all the ensuing fallout. Some of these values were declared uint32 before, and others "long". I also removed PortalData's posOverflow field, since that logic seems pretty useless given that portalPos is now always 64 bits. The user-visible results are that command tags for SELECT etc will correctly report tuple counts larger than 4G, as will plpgsql's GET GET DIAGNOSTICS ... ROW_COUNT command. Queries processing more tuples than that are still not exactly the norm, but they're becoming more common. Most values associated with FETCH/MOVE distances, such as PortalRun's count argument and the count argument of most SPI functions that have one, remain declared as "long". It's not clear whether it would be worth promoting those to int64; but it would definitely be a large dollop of additional API churn on top of this, and it would only help 32-bit platforms which seem relatively less likely to see any benefit. Andreas Scherbaum, reviewed by Christian Ullrich, additional hacking by me	2016-03-12 16:05:29 -05:00
Robert Haas	fd31cd2651	Don't vacuum all-frozen pages. Commit `a892234f83` gave us enough infrastructure to avoid vacuuming pages where every tuple on the page is already frozen. So, replace the notion of a scan_all or whole-table vacuum with the less onerous notion of an "aggressive" vacuum, which will pages that are all-visible, but still skip those that are all-frozen. This should greatly reduce the cost of anti-wraparound vacuuming on large clusters where the majority of data is never touched between one cycle and the next, because we'll no longer have to read all of those pages only to find out that we don't need to do anything with them. Patch by me, reviewed by Masahiko Sawada.	2016-03-10 16:14:42 -05:00
Tom Lane	364a9f47ab	Refactor pull_var_clause's API to make it less tedious to extend. In commit `1d97c19a0f` and later `c1d9579dd8`, we extended pull_var_clause's API by adding enum-type arguments. That's sort of a pain to maintain, though, because it means every time we add a new behavior we must touch every last one of the call sites, even if there's a reasonable default behavior that most of them could use. Let's switch over to using a bitmask of flags, instead; that seems more maintainable and might save a nanosecond or two as well. This commit changes no behavior in itself, though I'm going to follow it up with one that does add a new behavior. In passing, remove flatten_tlist(), which has not been used since 9.1 and would otherwise need the same API changes. Removing these enums means that optimizer/tlist.h no longer needs to depend on optimizer/var.h. Changing that caused a number of C files to need addition of #include "optimizer/var.h" (probably we can thank old runs of pgrminclude for that); but on balance it seems like a good change anyway.	2016-03-10 15:53:07 -05:00
Robert Haas	be060cbcd4	Re-pgindent vacuumlazy.c.	2016-03-09 13:51:11 -05:00
Robert Haas	b6fb6471f6	Add a generic command progress reporting facility. Using this facility, any utility command can report the target relation upon which it is operating, if there is one, and up to 10 64-bit counters; the intent of this is that users should be able to figure out what a utility command is doing without having to resort to ugly hacks like attaching strace to a backend. As a demonstration, this adds very crude reporting to lazy vacuum; we just report the target relation and nothing else. A forthcoming patch will make VACUUM report a bunch of additional data that will make this much more interesting. But this gets the basic framework in place. Vinayak Pokale, Rahila Syed, Amit Langote, Robert Haas, reviewed by Kyotaro Horiguchi, Jim Nasby, Thom Brown, Masahiko Sawada, Fujii Masao, and Masanori Oyama.	2016-03-09 12:08:58 -05:00
Robert Haas	dcfecaae9e	Fix parallel query on standby servers. Without this fix, it inevitably bombs out with "ERROR: failed to initialize transaction_read_only to 0". Repair. Ashutosh Sharma; comments adjusted by me.	2016-03-08 10:27:03 -05:00
Robert Haas	77a1d1e798	Department of second thoughts: remove PD_ALL_FROZEN. Commit `a892234f83` added a second bit per page to the visibility map, which still seems like a good idea, but it also added a second page-level bit alongside PD_ALL_VISIBLE to track whether the visibility map bit was set. That no longer seems like a clever plan, because we don't really need that bit for anything. We always clear both bits when the page is modified anyway. Patch by me, reviewed by Kyotaro Horiguchi and Masahiko Sawada.	2016-03-08 08:46:48 -05:00
Robert Haas	a892234f83	Change the format of the VM fork to add a second bit per page. The new bit indicates whether every tuple on the page is already frozen. It is cleared only when the all-visible bit is cleared, and it can be set only when we vacuum a page and find that every tuple on that page is both visible to every transaction and in no need of any future vacuuming. A future commit will use this new bit to optimize away full-table scans that would otherwise be triggered by XID wraparound considerations. A page which is merely all-visible must still be scanned in that case, but a page which is all-frozen need not be. This commit does not attempt that optimization, although that optimization is the goal here. It seems better to get the basic infrastructure in place first. Per discussion, it's very desirable for pg_upgrade to automatically migrate existing VM forks from the old format to the new format. That, too, will be handled in a follow-on patch. Masahiko Sawada, reviewed by Kyotaro Horiguchi, Fujii Masao, Amit Kapila, Simon Riggs, Andres Freund, and others, and substantially revised by me.	2016-03-01 21:49:41 -05:00
Robert Haas	7bea19d0a9	On second thought, disable parallelism for prepared statements. CREATE TABLE .. AS EXECUTE can turn an apparently read-only query into a write operation, which parallel query can't handle. It's a bit of a shame that requires us to avoid parallel query for queries prepared via PREPARE in all cases, but for right now it does.	2016-02-26 16:33:37 +05:30
Robert Haas	57a6a72b6b	Enable parallelism for prepared statements and extended query protocol. Parallel query can't handle running a query only partially rather than to completion. However, there seems to be no way to run a statement prepared via SQL PREPARE other than to completion, so we can enable it there without a problem. The situation is more complicated for the extend query protocol. libpq seems to provide no way to send an Execute message with a non-zero rowcount, but some other client might. If that happens, and a parallel plan was chosen, we'll execute the parallel plan without using any workers, which may be somewhat inefficient but should still work. Hopefully this won't be a problem; users can always set max_parallel_degree=0 to avoid choosing parallel plans in the first place. Amit Kapila, reviewed by me.	2016-02-25 13:02:18 +05:30
Fujii Masao	31b6606c48	Make concurrent refresh check early that there is a unique index on matview. In REFRESH MATERIALIZED VIEW command, CONCURRENTLY option is only allowed if there is at least one unique index with no WHERE clause on one or more columns of the matview. Previously, concurrent refresh checked the existence of a unique index on the matview after filling the data to new snapshot, i.e., after calling refresh_matview_datafill(). So, when there was no unique index, we could need to wait a long time before we detected that and got the error. It was a waste of time. To eliminate such wasting time, this commit changes concurrent refresh so that it checks the existence of a unique index at the beginning of the refresh operation, i.e., before starting any time-consuming jobs. If CONCURRENTLY option is not allowed due to lack of a unique index, concurrent refresh can immediately detect it and emit an error. Author: Masahiko Sawada Reviewed-by: Michael Paquier, Fujii Masao	2016-02-16 02:15:44 +09:00
Tom Lane	72eee410d4	Move pg_constraint.h function declarations to new file pg_constraint_fn.h. A pending patch requires exporting a function returning Bitmapset from catalog/pg_constraint.c. As things stand, that would mean including nodes/bitmapset.h in pg_constraint.h, which might be hazardous for the client-side includability of that header. It's not entirely clear whether any client-side code needs to include pg_constraint.h, but it seems prudent to assume that there is some such code somewhere. Therefore, split off the function definitions into a new file pg_constraint_fn.h, similarly to what we've done for some other catalog header files.	2016-02-11 15:51:28 -05:00
Robert Haas	7c944bd903	Introduce a new GUC force_parallel_mode for testing purposes. When force_parallel_mode = true, we enable the parallel mode restrictions for all queries for which this is believed to be safe. For the subset of those queries believed to be safe to run entirely within a worker, we spin up a worker and run the query there instead of running it in the original process. When force_parallel_mode = regress, make additional changes to allow the regression tests to run cleanly even though parallel workers have been injected under the hood. Taken together, this facilitates both better user testing and better regression testing of the parallelism code. Robert Haas, with help from Amit Kapila and Rushabh Lathia.	2016-02-07 11:41:33 -05:00
Robert Haas	7191ce8bea	Make all built-in lwlock tranche IDs fixed. This makes the values more stable, which seems like a good thing for anybody who needs to look at at them. Alexander Korotkov and Amit Kapila	2016-02-02 06:45:55 -05:00
Robert Haas	a7de3dc5c3	Support multi-stage aggregation. Aggregate nodes now have two new modes: a "partial" mode where they output the unfinalized transition state, and a "finalize" mode where they accept unfinalized transition states rather than individual values as input. These new modes are not used anywhere yet, but they will be necessary for parallel aggregation. The infrastructure also figures to be useful for cases where we want to aggregate local data and remote data via the FDW interface, and want to bring back partial aggregates from the remote side that can then be combined with locally generated partial aggregates to produce the final value. It may also be useful even when neither FDWs nor parallelism are in play, as explained in the comments in nodeAgg.c. David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki Linnakangas, Haribabu Kommi, and me.	2016-01-20 13:46:50 -05:00
Tom Lane	65c5fcd353	Restructure index access method API to hide most of it at the C level. This patch reduces pg_am to just two columns, a name and a handler function. All the data formerly obtained from pg_am is now provided in a C struct returned by the handler function. This is similar to the designs we've adopted for FDWs and tablesample methods. There are multiple advantages. For one, the index AM's support functions are now simple C functions, making them faster to call and much less error-prone, since the C compiler can now check function signatures. For another, this will make it far more practical to define index access methods in installable extensions. A disadvantage is that SQL-level code can no longer see attributes of index AMs; in particular, some of the crosschecks in the opr_sanity regression test are no longer possible from SQL. We've addressed that by adding a facility for the index AM to perform such checks instead. (Much more could be done in that line, but for now we're content if the amvalidate functions more or less replace what opr_sanity used to do.) We might also want to expose some sort of reporting functionality, but this patch doesn't do that. Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily editorialized on by me.	2016-01-17 19:36:59 -05:00
Tom Lane	f47b602df8	Fix bogus lock release in RemovePolicyById and RemoveRoleFromObjectPolicy. Can't release the AccessExclusiveLock on the target table until commit. Otherwise there is a race condition whereby other backends might service our cache invalidation signals before they can actually see the updated catalog rows. Just to add insult to injury, RemovePolicyById was closing the rel (with incorrect lock drop) and then passing the now-dangling rel pointer to CacheInvalidateRelcache. Probably the reason this doesn't fall over on CLOBBER_CACHE buildfarm members is that some outer level of the DROP logic is still holding the rel open ... but it'd have bit us on the arse eventually, no doubt.	2016-01-03 20:53:35 -05:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Tom Lane	0dab5ef39b	Fix ALTER OPERATOR to update dependencies properly. Fix an oversight in commit 321eed5f0f7563a0: replacing an operator's selectivity functions needs to result in a corresponding update in pg_depend. We have a function that can handle that, but it was not called by AlterOperator(). To fix this without enlarging pg_operator.h's #include list beyond what clients can safely include, split off the function definitions into a new file pg_operator_fn.h, similarly to what we've done for some other catalog header files. It's not entirely clear whether any client-side code needs to include pg_operator.h, but it seems prudent to assume that there is some such code somewhere.	2015-12-31 17:37:31 -05:00
Tom Lane	e5d06f2b12	Dept of second thoughts: the !scan_all exit mustn't increase scanned_pages. In the extreme edge case where contended pages are the only ones that escape being scanned, the previous commit would have allowed us to think that relfrozenxid could be advanced, which is exactly wrong.	2015-12-30 17:32:23 -05:00
Tom Lane	e842908233	Avoid useless truncation attempts during VACUUM. VACUUM can skip heap pages altogether when there's a run of consecutive pages that are all-visible according to the visibility map. This causes it to not update its nonempty_pages count, just as if those pages were empty, which means that at the end we will think they are candidates for deletion. Thus, we may take the table's AccessExclusive lock only to find that no pages are really truncatable. This usually causes no real problems on a master server, thanks to the lock being acquired only conditionally; but on hot-standby servers, the same lock must be acquired unconditionally which can result in unnecessary query cancellations. To improve matters, force examination of the table's last page whenever we reach there with a nonempty_pages count that would allow a truncation attempt. If it's not empty, we'll advance nonempty_pages and thereby prevent the truncation attempt. If we are unable to acquire cleanup lock on that page, there's no need to force it, unless we're doing an anti-wraparound vacuum. We can just check for tuples with a shared buffer lock and then give up. (When we are doing an anti-wraparound vacuum, and decide it's okay to skip the page because it contains no freezable tuples, this patch still improves matters because nonempty_pages is properly updated, which it was not before.) Since only the last page is special-cased in this way, we might attempt a truncation that will release many fewer pages than the normal heuristic would suggest; at worst, only one page would be truncated. But that seems all right, because the situation won't repeat during the next vacuum. The real problem with the old logic is that the useless truncation attempt happens every time we vacuum, so long as the state of the last few dozen pages doesn't change. This is a longstanding deficiency, but since the consequences aren't very severe in most scenarios, I'm not going to risk a back-patch. Jeff Janes and Tom Lane	2015-12-30 17:13:15 -05:00
Joe Conway	241448b23a	Rename (new\|old)estCommitTs to (new\|old)estCommitTsXid The variables newestCommitTs and oldestCommitTs sound as if they are timestamps, but in fact they are the transaction Ids that correspond to the newest and oldest timestamps rather than the actual timestamps. Rename these variables to reflect that they are actually xids: to wit newestCommitTsXid and oldestCommitTsXid respectively. Also modify related code in a similar fashion, particularly the user facing output emitted by pg_controldata and pg_resetxlog. Complaint and patch by me, review by Tom Lane and Alvaro Herrera. Backpatch to 9.5 where these variables were first introduced.	2015-12-28 12:34:11 -08:00
Tom Lane	fec1ad94df	Include typmod when complaining about inherited column type mismatches. MergeAttributes() rejects cases where columns to be merged have the same type but different typmod, which is correct; but the error message it printed didn't show either typmod, which is unhelpful. Changing this requires using format_type_with_typemod() in place of TypeNameToString(), which will have some minor side effects on the way some type names are printed, but on balance this is an improvement: the old code sometimes printed one type according to one set of rules and the other type according to the other set, which could be confusing in its own way. Oddly, there were no regression test cases covering any of this behavior, so add some. Complaint and fix by Amit Langote	2015-12-26 13:41:29 -05:00
Alvaro Herrera	756e7b4c9d	Rework internals of changing a type's ownership This is necessary so that REASSIGN OWNED does the right thing with composite types, to wit, that it also alters ownership of the type's pg_class entry -- previously, the pg_class entry remained owned by the original user, which caused later other failures such as the new owner's inability to use ALTER TYPE to rename an attribute of the affected composite. Also, if the original owner is later dropped, the pg_class entry becomes owned by a non-existant user which is bogus. To fix, create a new routine AlterTypeOwner_oid which knows whether to pass the request to ATExecChangeOwner or deal with it directly, and use that in shdepReassignOwner rather than calling AlterTypeOwnerInternal directly. AlterTypeOwnerInternal is now simpler in that it only modifies the pg_type entry and recurses to handle a possible array type; higher-level tasks are handled by either AlterTypeOwner directly or AlterTypeOwner_oid. I took the opportunity to add a few more objects to the test rig for REASSIGN OWNED, so that more cases are exercised. Additional ones could be added for superuser-only-ownable objects (such as FDWs and event triggers) but I didn't want to push my luck by adding a new superuser to the tests on a backpatchable bug fix. Per bug #13666 reported by Chris Pacejo. Backpatch to 9.5. (I would back-patch this all the way back, except that it doesn't apply cleanly in 9.4 and earlier because `59367fdf9` wasn't backpatched. If we decide that we need this in earlier branches too, we should backpatch both.)	2015-12-17 14:25:41 -03:00
Andres Freund	f54d0629ec	Fix ALTER TABLE ... SET TABLESPACE for unlogged relations. Changing the tablespace of an unlogged relation did not WAL log the creation and content of the init fork. Thus, after a standby is promoted, unlogged relation cannot be accessed anymore, with errors like: ERROR: 58P01: could not open file "pg_tblspc/...": No such file or directory Additionally the init fork was not synced to disk, independent of the configured wal_level, a relatively small durability risk. Investigation of that problem also brought to light that, even for permanent relations, the creation of !main forks was not WAL logged, i.e. no XLOG_SMGR_CREATE record were emitted. That mostly turns out not to be a problem, because these files were created when the actual relation data is copied; nonexistent files are not treated as an error condition during replay. But that doesn't work for empty files, and generally feels a bit haphazard. Luckily, outside init and main forks, empty forks don't occur often or are not a problem. Add the required WAL logging and syncing to disk. Reported-By: Michael Paquier Author: Michael Paquier and Andres Freund Discussion: 20151210163230.GA11331@alap3.anarazel.de Backpatch: 9.1, where unlogged relations were introduced	2015-12-12 14:17:39 +01:00
Stephen Frost	833728d4c8	Handle policies during DROP OWNED BY DROP OWNED BY handled GRANT-based ACLs but was not removing roles from policies. Fix that by having DROP OWNED BY remove the role specified from the list of roles the policy (or policies) apply to, or the entire policy (or policies) if it only applied to the role specified. As with ACLs, the DROP OWNED BY caller must have permission to modify the policy or a WARNING is thrown and no change is made to the policy.	2015-12-11 16:12:25 -05:00
Stephen Frost	ed8bec915e	Handle dependencies properly in ALTER POLICY ALTER POLICY hadn't fully considered partial policy alternation (eg: change just the roles on the policy, or just change one of the expressions) when rebuilding the dependencies. Instead, it would happily remove all dependencies which existed for the policy and then only recreate the dependencies for the objects referred to in the specific ALTER POLICY command. Correct that by extracting and building the dependencies for all objects referenced by the policy, regardless of if they were provided as part of the ALTER POLICY command or were already in place as part of the pre-existing policy.	2015-12-11 15:43:03 -05:00
Peter Eisentraut	a351705d8a	Improve some messages	2015-12-10 22:05:27 -05:00
Robert Haas	b287df70e4	Allow EXPLAIN (ANALYZE, VERBOSE) to display per-worker statistics. The original parallel sequential scan commit included only very limited changes to the EXPLAIN output. Aggregated totals from all workers were displayed, but there was no way to see what each individual worker did or to distinguish the effort made by the workers from the effort made by the leader. Per a gripe by Thom Brown (and maybe others). Patch by me, reviewed by Amit Kapila.	2015-12-09 13:21:19 -05:00
Teodor Sigaev	92e38182d7	COPY (INSERT/UPDATE/DELETE .. RETURNING ..) Attached is a patch for being able to do COPY (query) without a CTE. Author: Marko Tiikkaja Review: Michael Paquier	2015-11-27 19:11:22 +03:00
Tom Lane	074c5cfbfb	Fix handling of inherited check constraints in ALTER COLUMN TYPE (again). The previous way of reconstructing check constraints was to do a separate "ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance hierarchy. However, that way has no hope of reconstructing the check constraints' own inheritance properties correctly, as pointed out in bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do a regular "ALTER TABLE", allowing recursion, at the topmost table that has a particular constraint, and then suppress the work queue entries for inherited instances of the constraint. Annoyingly, we'd tried to fix this behavior before, in commit `5ed6546cf`, but we failed to notice that it wasn't reconstructing the pg_constraint field values correctly. As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to always schema-qualify the target table name; this seems like useful backup to the protections installed by commit `5f173040`. In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused. (I could alternatively have modified it to also return conislocal, but that seemed like a pretty single-purpose API, so let's not pretend it has some other use.) It's unused in the back branches as well, but I left it in place just in case some third-party code has decided to use it. In HEAD/9.5, also rename pg_get_constraintdef_string to pg_get_constraintdef_command, as the previous name did nothing to explain what that entry point did differently from others (and its comment was equally useless). Again, that change doesn't seem like material for back-patching. I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well. Otherwise, back-patch to all supported branches.	2015-11-20 14:55:47 -05:00
Robert Haas	bc4996e61b	Make ALTER .. SET SCHEMA do nothing, instead of throwing an ERROR. This was already true for CREATE EXTENSION, but historically has not been true for other object types. Therefore, this is a backward incompatibility. Per discussion on pgsql-hackers, everyone seems to agree that the new behavior is better. Marti Raudsepp, reviewed by Haribabu Kommi and myself	2015-11-19 10:49:25 -05:00
Peter Eisentraut	c5ec406412	Message style fix from Euler Taveira	2015-11-17 06:53:07 -05:00
Peter Eisentraut	5db837d3f2	Message improvements	2015-11-16 21:39:23 -05:00
Robert Haas	fe702a7b3f	Move each SLRU's lwlocks to a separate tranche. This makes it significantly easier to identify these lwlocks in LWLOCK_STATS or Trace_lwlocks output. It's also arguably better from a modularity standpoint, since lwlock.c no longer needs to know anything about the LWLock needs of the higher-level SLRU facility. Ildus Kurbangaliev, reviewd by Álvaro Herrera and by me.	2015-11-12 14:59:09 -05:00
Robert Haas	f0661c4e8c	Make sequential scans parallel-aware. In addition, this path fills in a number of missing bits and pieces in the parallel infrastructure. Paths and plans now have a parallel_aware flag indicating whether whatever parallel-aware logic they have should be engaged. It is believed that we will need this flag for a number of path/plan types, not just sequential scans, which is why the flag is generic rather than part of the SeqScan structures specifically. Also, execParallel.c now gives parallel nodes a chance to initialize their PlanState nodes from the DSM during parallel worker startup. Amit Kapila, with a fair amount of adjustment by me. Review of previous patch versions by Haribabu Kommi and others.	2015-11-11 08:57:52 -05:00
Peter Eisentraut	7bd099d511	Update spelling of COPY options The preferred spelling was changed from FORCE QUOTE to FORCE_QUOTE and the like, but some code was still referring to the old spellings.	2015-11-04 21:01:26 -05:00
Robert Haas	1efc7e5382	Fix problems with ParamListInfo serialization mechanism. Commit `d1b7c1ffe7` introduced a mechanism for serializing a ParamListInfo structure to be passed to a parallel worker. However, this mechanism failed to handle external expanded values, as pointed out by Noah Misch. Repair. Moreover, plpgsql_param_fetch requires adjustment because the serialization mechanism needs it to skip evaluating unused parameters just as we would do when it is called from copyParamList, but params == estate->paramLI in that case. To fix, make the bms_is_member test in that function unconditional. Finally, have setup_param_list set a new ParamListInfo field, paramMask, to the parameters actually used in the expression, so that we don't try to fetch those that are not needed when serializing a parameter list. This isn't necessary for correctness, but it makes the performance of the parallel executor code comparable to what we do for cases involving cursors. Design suggestions and extensive review by Noah Misch. Patch by me.	2015-11-02 18:11:29 -05:00
Peter Eisentraut	a8d585c091	Message style improvements Message style, plurals, quoting, spelling, consistency with similar messages	2015-10-28 20:38:36 -04:00
Robert Haas	d455651624	Add missing serial comma, for consistency. Amit Langote, per Etsuro Fujita	2015-10-28 12:19:14 +01:00
Robert Haas	9dcce7123e	Fix incorrect message in ATWrongRelkindError. Mistake introduced by commit `3bf3ab8c56`. Etsuro Fujita	2015-10-28 11:47:19 +01:00
Alvaro Herrera	531d21b75f	Cleanup commit timestamp module activaction, again Further tweak commit_ts.c so that on a standby the state is completely consistent with what that in the master, rather than behaving differently in the cases that the settings differ. Now in standby and master the module should always be active or inactive in lockstep. Author: Petr Jelínek, with some further tweaks by Álvaro Herrera. Backpatch to 9.5, where commit timestamps were introduced. Discussion: http://www.postgresql.org/message-id/5622BF9D.2010409@2ndquadrant.com	2015-10-27 15:06:50 -03:00
Robert Haas	872101bede	Add two missing cases to ATWrongRelkindError. This way, we produce a better error message if someone tries to do something like ALTER INDEX .. ALTER COLUMN .. SET STORAGE. Amit Langote	2015-10-22 17:00:53 -04:00
Robert Haas	bde39eed0c	Fix a couple of bugs in recent parallelism-related commits. Commit `816e336f12` added the wrong error check to async.c; sending restrictions is restricted to the leader, not altogether unsafe. Commit `3bd909b220` added ExecShutdownNode to traverse the planstate tree and call shutdown functions, but made a Gather node, the only node that actually has such a function, abort the tree traversal, which is wrong.	2015-10-22 10:49:20 -04:00
Robert Haas	816e336f12	Mark more functions parallel-restricted or parallel-unsafe. Commit `7aea8e4f2d` was overoptimistic about the degree of safety associated with running various functions in parallel mode. Functions that take a table name or OID as an argument are at least parallel-restricted, because the table might be temporary, and we currently don't allow parallel workers to touch temporary tables. Functions that take a query as an argument are outright unsafe, because the query could be anything, including a parallel-unsafe query. Also, the queue of pending notifications is backend-private, so adding to it from a worker doesn't behave correctly. We could fix this by transferring the worker's queue of pending notifications to the master during worker cleanup, but that seems like more trouble than it's worth for now. In addition to adjusting the pg_proc.h markings, also add an explicit check for this in async.c.	2015-10-16 11:49:31 -04:00
Robert Haas	82b37765c7	Fix a problem with parallel workers being unable to restore role. check_role() tries to verify that the user has permission to become the requested role, but this is inappropriate in a parallel worker, which needs to exactly recreate the master's authorization settings. So skip the check in that case. This fixes a bug in commit `924bcf4f16`.	2015-10-16 11:37:19 -04:00
Alvaro Herrera	817588bc2b	Fix bogus comments Author: Amit Langote	2015-10-15 12:20:11 -03:00
Stephen Frost	088c83363a	ALTER TABLE .. FORCE ROW LEVEL SECURITY To allow users to force RLS to always be applied, even for table owners, add ALTER TABLE .. FORCE ROW LEVEL SECURITY. row_security=off overrides FORCE ROW LEVEL SECURITY, to ensure pg_dump output is complete (by default). Also add SECURITY_NOFORCE_RLS context to avoid data corruption when ALTER TABLE .. FORCE ROW SECURITY is being used. The SECURITY_NOFORCE_RLS security context is used only during referential integrity checks and is only considered in check_enable_rls() after we have already checked that the current user is the owner of the relation (which should always be the case during referential integrity checks). Back-patch to 9.5 where RLS was added.	2015-10-04 21:05:08 -04:00
Andres Freund	b67aaf21e8	Add CASCADE support for CREATE EXTENSION. Without CASCADE, if an extension has an unfullfilled dependency on another extension, CREATE EXTENSION ERRORs out with "required extension ... is not installed". That is annoying, especially when that dependency is an implementation detail of the extension, rather than something the extension's user can make sense of. In addition to CASCADE this also includes a small set of regression tests around CREATE EXTENSION. Author: Petr Jelinek, editorialized by Michael Paquier, Andres Freund Reviewed-By: Michael Paquier, Andres Freund, Jeff Janes Discussion: 557E0520.3040800@2ndquadrant.com	2015-10-03 18:23:40 +02:00
Tom Lane	5884b92a84	Fix errors in commit `a04bb65f70`. Not a lot of commentary needed here really.	2015-09-30 23:37:26 -04:00
Tom Lane	07e4d03fb4	Improve LISTEN startup time when there are many unread notifications. If some existing listener is far behind, incoming new listener sessions would start from that session's read pointer and then need to advance over many already-committed notification messages, which they have no interest in. This was expensive in itself and also thrashed the pg_notify SLRU buffers a lot more than necessary. We can improve matters considerably in typical scenarios, without much added cost, by starting from the furthest-ahead read pointer, not the furthest-behind one. We do have to consider only sessions in our own database when doing this, which requires an extra field in the data structure, but that's a pretty small cost. Back-patch to 9.0 where the current LISTEN/NOTIFY logic was introduced. Matt Newell, slightly adjusted by me	2015-09-30 23:32:43 -04:00
Robert Haas	3bd909b220	Add a Gather executor node. A Gather executor node runs any number of copies of a plan in an equal number of workers and merges all of the results into a single tuple stream. It can also run the plan itself, if the workers are unavailable or haven't started up yet. It is intended to work with the Partial Seq Scan node which will be added in future commits. It could also be used to implement parallel query of a different sort by itself, without help from Partial Seq Scan, if the single_copy mode is used. In that mode, a worker executes the plan, and the parallel leader does not, merely collecting the worker's results. So, a Gather node could be inserted into a plan to split the execution of that plan across two processes. Nested Gather nodes aren't currently supported, but we might want to add support for that in the future. There's nothing in the planner to actually generate Gather nodes yet, so it's not quite time to break out the champagne. But we're getting close. Amit Kapila. Some designs suggestions were provided by me, and I also reviewed the patch. Single-copy mode, documentation, and other minor changes also by me.	2015-09-30 19:23:36 -04:00
Tom Lane	6057f61b4d	Small improvements in comments in async.c. We seem to have lost a line somewhere along the way in the comment block that discusses async.c's locks, because it suddenly refers to "both locks" without previously having mentioned more than one. Add a sentence to make that read more sanely. Also, refer to the "pos of the slowest backend" not the "tail of the slowest backend", since we have no per-backend value called "tail".	2015-09-29 22:07:16 -04:00
Alvaro Herrera	590e2d12f0	COPY: use pg_plan_query() instead of planner() While at it, trim the includes list in copy.c. The planner headers cannot be removed, but there are a few others that are not of any use.	2015-09-28 15:14:08 -03:00
Andres Freund	aa29c1ccd9	Remove legacy multixact truncation support. In 9.5 and master there is no need to support legacy truncation. This is just committed separately to make it easier to backpatch the WAL logged multixact truncation to 9.3 and 9.4 if we later decide to do so. I bumped master's magic from 0xD086 to 0xD088 and 9.5's from 0xD085 to 0xD087 to avoid 9.5 reusing a value that has been in use on master while keeping the numbers increasing between major versions. Discussion: 20150621192409.GA4797@alap3.anarazel.de Backpatch: 9.5	2015-09-26 19:04:25 +02:00
Andres Freund	4f627f8973	Rework the way multixact truncations work. The fact that multixact truncations are not WAL logged has caused a fair share of problems. Amongst others it requires to do computations during recovery while the database is not in a consistent state, delaying truncations till checkpoints, and handling members being truncated, but offset not. We tried to put bandaids on lots of these issues over the last years, but it seems time to change course. Thus this patch introduces WAL logging for multixact truncations. This allows: 1) to perform the truncation directly during VACUUM, instead of delaying it to the checkpoint. 2) to avoid looking at the offsets SLRU for truncation during recovery, we can just use the master's values. 3) simplify a fair amount of logic to keep in memory limits straight, this has gotten much easier During the course of fixing this a bunch of additional bugs had to be fixed: 1) Data was not purged from memory the member's SLRU before deleting segments. This happened to be hard or impossible to hit due to the interlock between checkpoints and truncation. 2) find_multixact_start() relied on SimpleLruDoesPhysicalPageExist - but that doesn't work for offsets that haven't yet been flushed to disk. Add code to flush the SLRUs to fix. Not pretty, but it feels slightly safer to only make decisions based on actual on-disk state. 3) find_multixact_start() could be called concurrently with a truncation and thus fail. Via SetOffsetVacuumLimit() that could lead to a round of emergency vacuuming. The problem remains in pg_get_multixact_members(), but that's quite harmless. For now this is going to only get applied to 9.5+, leaving the issues in the older branches in place. It is quite possible that we need to backpatch at a later point though. For the case this gets backpatched we need to handle that an updated standby may be replaying WAL from a not-yet upgraded primary. We have to recognize that situation and use "old style" truncation (i.e. looking at the SLRUs) during WAL replay. In contrast to before, this now happens in the startup process, when replaying a checkpoint record, instead of the checkpointer. Doing truncation in the restartpoint is incorrect, they can happen much later than the original checkpoint, thereby leading to wraparound. To avoid "multixact_redo: unknown op code 48" errors standbys would have to be upgraded before primaries. A later patch will bump the WAL page magic, and remove the legacy truncation codepaths. Legacy truncation support is just included to make a possible future backpatch easier. Discussion: 20150621192409.GA4797@alap3.anarazel.de Reviewed-By: Robert Haas, Alvaro Herrera, Thomas Munro Backpatch: 9.5 for now	2015-09-26 19:04:25 +02:00
Tom Lane	82e1ba7fd6	Make ANALYZE compute basic statistics even for types with no "=" operator. Previously, ANALYZE simply ignored columns of datatypes that have neither a btree nor hash opclass (which means they have no recognized equality operator). Without a notion of equality, we can't identify most-common values nor estimate the number of distinct values. But we can still count nulls and compute the average physical column width, and those stats might be of value. Moreover there are some tools out there that don't work so well if rows are missing from pg_statistic. So let's add suitable logic for this case. While this is arguably a bug fix, it also has the potential to change query plans, and the gain seems not worth taking a risk of that in stable branches. So back-patch into 9.5 but not further. Oleksandr Shulgin, rewritten a bit by me.	2015-09-23 18:26:49 -04:00
Robert Haas	8dd401aa07	Add new function planstate_tree_walker. ExplainPreScanNode knows how to iterate over a generic tree of plan states; factor that logic out into a separate walker function so that other code, such as upcoming patches for parallel query, can also use it. Patch by me, reviewed by Tom Lane.	2015-09-17 11:27:06 -04:00
Robert Haas	7aea8e4f2d	Determine whether it's safe to attempt a parallel plan for a query. Commit `924bcf4f16` introduced a framework for parallel computation in PostgreSQL that makes most but not all built-in functions safe to execute in parallel mode. In order to have parallel query, we'll need to be able to determine whether that query contains functions (either built-in or user-defined) that cannot be safely executed in parallel mode. This requires those functions to be labeled, so this patch introduces an infrastructure for that. Some functions currently labeled as safe may need to be revised depending on how pending issues related to heavyweight locking under paralllelism are resolved. Parallel plans can't be used except for the case where the query will run to completion. If portal execution were suspended, the parallel mode restrictions would need to remain in effect during that time, but that might make other queries fail. Therefore, this patch introduces a framework that enables consideration of parallel plans only when it is known that the plan will be run to completion. This probably needs some refinement; for example, at bind time, we do not know whether a query run via the extended protocol will be execution to completion or run with a limited fetch count. Having the client indicate its intentions at bind time would constitute a wire protocol break. Some contexts in which parallel mode would be safe are not adjusted by this patch; the default is not to try parallel plans except from call sites that have been updated to say that such plans are OK. This commit doesn't introduce any parallel paths or plans; it just provides a way to determine whether they could potentially be used. I'm committing it on the theory that the remaining parallel sequential scan patches will also get committed to this release, hopefully in the not-too-distant future. Robert Haas and Amit Kapila. Reviewed (in earlier versions) by Noah Misch.	2015-09-16 15:38:47 -04:00
Stephen Frost	22eaf35c1d	RLS refactoring This refactors rewrite/rowsecurity.c to simplify the handling of the default deny case (reducing the number of places where we check for and add the default deny policy from three to one) by splitting up the retrival of the policies from the application of them. This also allowed us to do away with the policy_id field. A policy_name field was added for WithCheckOption policies and is used in error reporting, when available. Patch by Dean Rasheed, with various mostly cosmetic changes by me. Back-patch to 9.5 where RLS was introduced to avoid unnecessary differences, since we're still in alpha, per discussion with Robert.	2015-09-15 15:49:31 -04:00
Tom Lane	9270d8db9a	Fix CreateTableSpace() so it will compile without HAVE_SYMLINK. This has been broken since 9.3 (commit `82b1b213ca` to be exact), which suggests that nobody is any longer using a Windows build system that doesn't provide a symlink emulation. Still, it's wrong on its own terms, so repair. YUriy Zhuravlev	2015-09-05 16:15:38 -04:00

... 3 4 5 6 7 ...

3345 Commits