postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	5e09280057	Make pg_statistic and related code account more honestly for collations. When we first put in collations support, we basically punted on teaching pg_statistic, ANALYZE, and the planner selectivity functions about that. They've just used DEFAULT_COLLATION_OID independently of the actual collation of the data. It's time to improve that, so: * Add columns to pg_statistic that record the specific collation associated with each statistics slot. * Teach ANALYZE to use the column's actual collation when comparing values for statistical purposes, and record this in the appropriate slot. (Note that type-specific typanalyze functions are now expected to fill stats->stacoll with the appropriate collation, too.) * Teach assorted selectivity functions to use the actual collation of the stats they are looking at, instead of just assuming it's DEFAULT_COLLATION_OID. This should give noticeably better results in selectivity estimates for columns with nondefault collations, at least for query clauses that use that same collation (which would be the default behavior in most cases). It's still true that comparisons with explicit COLLATE clauses different from the stored data's collation won't be well-estimated, but that's no worse than before. Also, this patch does make the first step towards doing better with that, which is that it's now theoretically possible to collect stats for a collation other than the column's own collation. Patch by me; thanks to Peter Eisentraut for review. Discussion: https://postgr.es/m/14706.1544630227@sss.pgh.pa.us	2018-12-14 12:52:49 -05:00
Andres Freund	09568ec3d3	Create a separate oid range for oids assigned by genbki.pl. The changes I made in `578b229718` assigned oids below FirstBootstrapObjectId to objects in include/catalog/*.dat files that did not have an oid assigned, starting at the max oid explicitly assigned. Tom criticized that for mainly two reasons: 1) It's not clear which values are manually and which explicitly assigned. 2) The space below FirstBootstrapObjectId gets pretty crowded, and some PostgreSQL forks have used oids >= 9000 for their own objects, to avoid conflicting. Thus create a new range for objects not assigned explicit oids, but assigned by genbki.pl. For now 1-9999 is for explicitly assigned oids, FirstGenbkiObjectId (10000) to FirstBootstrapObjectId (1200) -1 is for genbki.pl assigned oids, and < FirstNormalObjectId (16384) is for oids assigned during bootstrap. It's possible that we'll have to adjust these boundaries, but there's some headroom for now. Add a note suggesting that oids in forks should be assigned in the 9000-9999 range. Catversion bump for obvious reasons. Per complaint from Tom Lane. Author: Andres Freund Discussion: https://postgr.es/m/16845.1544393682@sss.pgh.pa.us	2018-12-13 14:50:57 -08:00
Michael Paquier	7fee252f6f	Add timestamp of last received message from standby to pg_stat_replication The timestamp generated by the standby at message transmission has been included in the protocol since its introduction for both the status update message and hot standby feedback message, but it has never appeared in pg_stat_replication. Seeing this timestamp does not matter much with a cluster which has a lot of activity, but on a mostly-idle cluster, this makes monitoring able to react faster than the configured timeouts. Author: MyungKyu LIM Reviewed-by: Michael Paquier, Masahiko Sawada Discussion: https://postgr.es/m/1657809367.407321.1533027417725.JavaMail.jboss@ep2ml404	2018-12-09 16:35:06 +09:00
Andres Freund	578b229718	Remove WITH OIDS support, change oid catalog column visibility. Previously tables declared WITH OIDS, including a significant fraction of the catalog tables, stored the oid column not as a normal column, but as part of the tuple header. This special column was not shown by default, which was somewhat odd, as it's often (consider e.g. pg_class.oid) one of the more important parts of a row. Neither pg_dump nor COPY included the contents of the oid column by default. The fact that the oid column was not an ordinary column necessitated a significant amount of special case code to support oid columns. That already was painful for the existing, but upcoming work aiming to make table storage pluggable, would have required expanding and duplicating that "specialness" significantly. WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0). Remove it. Removing includes: - CREATE TABLE and ALTER TABLE syntax for declaring the table to be WITH OIDS has been removed (WITH (oids[ = true]) will error out) - pg_dump does not support dumping tables declared WITH OIDS and will issue a warning when dumping one (and ignore the oid column). - restoring an pg_dump archive with pg_restore will warn when restoring a table with oid contents (and ignore the oid column) - COPY will refuse to load binary dump that includes oids. - pg_upgrade will error out when encountering tables declared WITH OIDS, they have to be altered to remove the oid column first. - Functionality to access the oid of the last inserted row (like plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed. The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false) for CREATE TABLE) is still supported. While that requires a bit of support code, it seems unnecessary to break applications / dumps that do not use oids, and are explicit about not using them. The biggest user of WITH OID columns was postgres' catalog. This commit changes all 'magic' oid columns to be columns that are normally declared and stored. To reduce unnecessary query breakage all the newly added columns are still named 'oid', even if a table's column naming scheme would indicate 'reloid' or such. This obviously requires adapting a lot code, mostly replacing oid access via HeapTupleGetOid() with access to the underlying Form_pg_->oid column. The bootstrap process now assigns oids for all oid columns in genbki.pl that do not have an explicit value (starting at the largest oid previously used), only oids assigned later by oids will be above FirstBootstrapObjectId. As the oid column now is a normal column the special bootstrap syntax for oids has been removed. Oids are not automatically assigned during insertion anymore, all backend code explicitly assigns oids with GetNewOidWithIndex(). For the rare case that insertions into the catalog via SQL are called for the new pg_nextoid() function can be used (which only works on catalog tables). The fact that oid columns on system tables are now normal columns means that they will be included in the set of columns expanded by (i.e. SELECT * FROM pg_class will now include the table's oid, previously it did not). It'd not technically be hard to hide oid column by default, but that'd mean confusing behavior would either have to be carried forward forever, or it'd cause breakage down the line. While it's not unlikely that further adjustments are needed, the scope/invasiveness of the patch makes it worthwhile to get merge this now. It's painful to maintain externally, too complicated to commit after the code code freeze, and a dependency of a number of other patches. Catversion bump, for obvious reasons. Author: Andres Freund, with contributions by John Naylor Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de	2018-11-20 16:00:17 -08:00
Alvaro Herrera	3f2393edef	Redesign initialization of partition routing structures This speeds up write operations (INSERT, UPDATE, DELETE, COPY, as well as the future MERGE) on partitioned tables. This changes the setup for tuple routing so that it does far less work during the initial setup and pushes more work out to when partitions receive tuples. PartitionDispatchData structs for sub-partitioned tables are only created when a tuple gets routed through it. The possibly large arrays in the PartitionTupleRouting struct have largely been removed. The partitions[] array remains but now never contains any NULL gaps. Previously the NULLs had to be skipped during ExecCleanupTupleRouting(), which could add a large overhead to the cleanup when the number of partitions was large. The partitions[] array is allocated small to start with and only enlarged when we route tuples to enough partitions that it runs out of space. This allows us to keep simple single-row partition INSERTs running quickly. Redesign The arrays in PartitionTupleRouting which stored the tuple translation maps have now been removed. These have been moved out into a PartitionRoutingInfo struct which is an additional field in ResultRelInfo. The find_all_inheritors() call still remains by far the slowest part of ExecSetupPartitionTupleRouting(). This commit just removes the other slow parts. In passing also rename the tuple translation maps from being ParentToChild and ChildToParent to being RootToPartition and PartitionToRoot. The old names mislead you into thinking that a partition of some sub-partitioned table would translate to the rowtype of the sub-partitioned table rather than the root partitioned table. Authors: David Rowley and Amit Langote, heavily revised by Álvaro Herrera Testing help from Jesper Pedersen and Kato Sho. Discussion: https://postgr.es/m/CAKJS1f_1RJyFquuCKRFHTdcXqoPX-PYqAd7nz=GVBwvGh4a6xA@mail.gmail.com	2018-11-16 15:01:05 -03:00
Andres Freund	6a744413ea	Make reformat-dat-files, reformat-dat-files VPATH safe. The reformat_dat_file.pl script, added by `372728b0d4`, supported all the necessary options to make it work in a VPATH build, but the makefile invocations didn't take VPATH into account. Fix that. Discussion: https://postgr.es/m/20181115185303.d2z7wonx23mdfvd3@alap3.anarazel.de Backpatch: 11-, where `372728b0d4` was merged	2018-11-15 11:38:03 -08:00
Tom Lane	600b04d6b5	Add a timezone-specific variant of date_trunc(). date_trunc(field, timestamptz, zone_name) performs truncation using the named time zone as reference, rather than working in the session time zone as is the default behavior. It's equivalent to date_trunc(field, timestamptz at time zone zone_name) at time zone zone_name but it's faster, easier to type, and arguably easier to understand. Vik Fearing and Tom Lane Discussion: https://postgr.es/m/6249ffc4-2b22-4c1b-4e7d-7af84fedd7c6@2ndquadrant.com	2018-11-14 15:41:07 -05:00
Tom Lane	fa2952d8eb	Fix missing role dependencies for some schema and type ACLs. This patch fixes several related cases in which pg_shdepend entries were never made, or were lost, for references to roles appearing in the ACLs of schemas and/or types. While that did no immediate harm, if a referenced role were later dropped, the drop would be allowed and would leave a dangling reference in the object's ACL. That still wasn't a big problem for normal database usage, but it would cause obscure failures in subsequent dump/reload or pg_upgrade attempts, taking the form of attempts to grant privileges to all-numeric role names. (I think I've seen field reports matching that symptom, but can't find any right now.) Several cases are fixed here: 1. ALTER DOMAIN SET/DROP DEFAULT would lose the dependencies for any existing ACL entries for the domain. This case is ancient, dating back as far as we've had pg_shdepend tracking at all. 2. If a default type privilege applies, CREATE TYPE recorded the ACL properly but forgot to install dependency entries for it. This dates to the addition of default privileges for types in 9.2. 3. If a default schema privilege applies, CREATE SCHEMA recorded the ACL properly but forgot to install dependency entries for it. This dates to the addition of default privileges for schemas in v10 (commit `ab89e465c`). Another somewhat-related problem is that when creating a relation rowtype or implicit array type, TypeCreate would apply any available default type privileges to that type, which we don't really want since such an object isn't supposed to have privileges of its own. (You can't, for example, drop such privileges once they've been added to an array type.) `ab89e465c` is also to blame for a race condition in the regression tests: privileges.sql transiently installed globally-applicable default privileges on schemas, which sometimes got absorbed into the ACLs of schemas created by concurrent test scripts. This should have resulted in failures when privileges.sql tried to drop the role holding such privileges; but thanks to the bug fixed here, it instead led to dangling ACLs in the final state of the regression database. We'd managed not to notice that, but it became obvious in the wake of commit `da906766c`, which allowed the race condition to occur in pg_upgrade tests. To fix, add a function recordDependencyOnNewAcl to encapsulate what callers of get_user_default_acl need to do; while the original call sites got that right via ad-hoc code, none of the later-added ones have. Also change GenerateTypeDependencies to generate these dependencies, which requires adding the typacl to its parameter list. (That might be annoying if there are any extensions calling that function directly; but if there are, they're most likely buggy in the same way as the core callers were, so they need work anyway.) While I was at it, I changed GenerateTypeDependencies to accept most of its parameters in the form of a Form_pg_type pointer, making its parameter list a bit less unwieldy and mistake-prone. The test race condition is fixed just by wrapping the addition and removal of default privileges into a single transaction, so that that state is never visible externally. We might eventually prefer to separate out tests of default privileges into a script that runs by itself, but that would be a bigger change and would make the tests run slower overall. Back-patch relevant parts to all supported branches. Discussion: https://postgr.es/m/15719.1541725287@sss.pgh.pa.us	2018-11-09 20:42:14 -05:00
Michael Paquier	8f045e242b	Switch pg_promote to be parallel-safe pg_promote uses nothing relying on a global state, so it is fine to mark it as parallel-safe, conclusion based on a detailed analysis from Robert Haas. This also fixes an inconsistency where pg_proc.dat missed to mark the function with its previous value for proparallel, update which does not matter now as the default is used. Based on a discussion between multiple folks: Laurenz Albe, Robert Haas, Amit Kapila, Tom Lane and myself. Discussion: https://postgr.es/m/20181029082530.GL14242@paquier.xyz	2018-11-06 14:11:21 +09:00
Tom Lane	55f3d10296	Remove unreferenced pg_opfamily entry. The entry with OID 4035, for GIST jsonb_ops, is unused; apparently it was added in preparation for index support that never materialized. Remove it, and add a regression test case to detect future mistakes of the same kind. Discussion: https://postgr.es/m/17188.1541379745@sss.pgh.pa.us	2018-11-05 12:02:27 -05:00
Peter Eisentraut	96b00c433c	Remove obsolete pg_constraint.consrc column This has been deprecated and effectively unused for a long time. Reviewed-by: Daniel Gustafsson <daniel@yesql.se>	2018-11-01 20:36:05 +01:00
Peter Eisentraut	fe5038236c	Remove obsolete pg_attrdef.adsrc column This has been deprecated and effectively unused for a long time. Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se>	2018-11-01 20:35:42 +01:00
Michael Paquier	d5eec4eefd	Add pg_partition_tree to display information about partitions This new function is useful to display a full tree of partitions with a partitioned table given in output, and avoids the need of any complex WITH RECURSIVE query when looking at partition trees which are deep multiple levels. It returns a set of records, one for each partition, containing the partition's name, its immediate parent's name, a boolean value telling if the relation is a leaf in the tree and an integer telling its level in the partition tree with given table considered as root, beginning at zero for the root, and incrementing by one each time the scan goes one level down. Author: Amit Langote Reviewed-by: Jesper Pedersen, Michael Paquier, Robert Haas Discussion: https://postgr.es/m/8d00e51a-9a51-ad02-d53e-ba6bf50b2e52@lab.ntt.co.jp	2018-10-30 10:25:06 +09:00
Michael Paquier	10074651e3	Add pg_promote function This function is able to promote a standby with this new SQL-callable function. Execution access can be granted to non-superusers so that failover tools can observe the principle of least privilege. Catalog version is bumped. Author: Laurenz Albe Reviewed-by: Michael Paquier, Masahiko Sawada Discussion: https://postgr.es/m/6e7c79b3ec916cf49742fb8849ed17cd87aed620.camel@cybertec.at	2018-10-25 09:46:00 +09:00
Michael Paquier	55853d666c	Clarify descriptions of relhassubclass and relispartition in pg_class Three places are fixed, one for each author. Reported-by: Tom Lane Author: Tom Lane, Amit Langote, Michael Paquier Discussion: https://postgr.es/m/82470.1540177167@sss.pgh.pa.us	2018-10-22 15:26:28 +09:00
Andres Freund	02a30a09f9	Correct constness of system attributes in heap.c & prerequisites. This allows the compiler / linker to mark affected pages as read-only. There's a fair number of pre-requisite changes, to allow the const properly be propagated. Most of consts were already required for correctness anyway, just not represented on the type-level. Arguably we could be more aggressive in using consts in related code, but.. This requires using a few of the types underlying typedefs that removes pointers (e.g. const NameData *) as declaring the typedefed type constant doesn't have the same meaning (it makes the variable const, not what it points to). Discussion: https://postgr.es/m/20181015200754.7y7zfuzsoux2c4ya@alap3.anarazel.de	2018-10-16 09:44:43 -07:00
Andres Freund	cda6a8d01d	Remove deprecated abstime, reltime, tinterval datatypes. These types have been deprecated for a long time. Catversion bump, for obvious reasons. Author: Andres Freund Discussion: https://postgr.es/m/20181009192237.34wjp3nmw7oynmmr@alap3.anarazel.de https://postgr.es/m/20171213080506.cwjkpcz3bkk6yz2u@alap3.anarazel.de https://postgr.es/m/25615.1513115237@sss.pgh.pa.us	2018-10-11 11:59:15 -07:00
Michael Paquier	c481016201	Add pg_ls_archive_statusdir function This function lists the contents of the WAL archive status directory, and is intended to be used by monitoring tools. Unlike pg_ls_dir(), access to it can be granted to non-superusers so that those monitoring tools can observe the principle of least privilege. Access is also given by default to members of pg_monitor. Author: Christoph Moench-Tegeder Reviewed-by: Aya Iwata Discussion: https://postgr.es/m/20180930205920.GA64534@elch.exwg.net	2018-10-09 22:29:09 +09:00
Thomas Munro	212fab9926	Relax transactional restrictions on ALTER TYPE ... ADD VALUE (redux). Originally committed as `15bc038f` (plus some follow-ups), this was reverted in `28e07270` due to a problem discovered in parallel workers. This new version corrects that problem by sending the list of uncommitted enum values to parallel workers. Here follows the original commit message describing the change: To prevent possibly breaking indexes on enum columns, we must keep uncommitted enum values from getting stored in tables, unless we can be sure that any such column is new in the current transaction. Formerly, we enforced this by disallowing ALTER TYPE ... ADD VALUE from being executed at all in a transaction block, unless the target enum type had been created in the current transaction. This patch removes that restriction, and instead insists that an uncommitted enum value can't be referenced unless it belongs to an enum type created in the same transaction as the value. Per discussion, this should be a bit less onerous. It does require each function that could possibly return a new enum value to SQL operations to check this restriction, but there aren't so many of those that this seems unmaintainable. Author: Andrew Dunstan and Tom Lane, with parallel query fix by Thomas Munro Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAEepm%3D0Ei7g6PaNTbcmAh9tCRahQrk%3Dr5ZWLD-jr7hXweYX3yg%40mail.gmail.com Discussion: https://postgr.es/m/4075.1459088427%40sss.pgh.pa.us	2018-10-09 12:51:01 +13:00
Alvaro Herrera	ad08006ba0	Fix event triggers for partitioned tables Index DDL cascading on partitioned tables introduced a way for ALTER TABLE to be called reentrantly. This caused an an important deficiency in event trigger support to be exposed: on exiting the reentrant call, the alter table state object was clobbered, causing a crash when the outer alter table tries to finalize its processing. Fix the crash by creating a stack of event trigger state objects. There are still ways to cause things to misbehave (and probably other crashers) with more elaborate tricks, but at least it now doesn't crash in the obvious scenario. Backpatch to 9.5, where DDL deparsing of event triggers was introduced. Reported-by: Marco Slot Authors: Michaël Paquier, Álvaro Herrera Discussion: https://postgr.es/m/CANNhMLCpi+HQ7M36uPfGbJZEQLyTy7XvX=5EFkpR-b1bo0uJew@mail.gmail.com	2018-10-06 19:17:46 -03:00
Tom Lane	07ee62ce9e	Propagate xactStartTimestamp and stmtStartTimestamp to parallel workers. Previously, a worker process would establish values for these based on its own start time. In v10 and up, this can trivially be shown to cause misbehavior of transaction_timestamp(), timestamp_in(), and related functions which are (perhaps unwisely?) marked parallel-safe. It seems likely that other behaviors might diverge from what happens in the parent as well. It's not as trivial to demonstrate problems in 9.6 or 9.5, but I'm sure it's still possible, so back-patch to all branches containing parallel worker infrastructure. In HEAD only, mark now() and statement_timestamp() as parallel-safe (other affected functions already were). While in theory we could still squeeze that change into v11, it doesn't seem important enough to force a last-minute catversion bump. Konstantin Knizhnik, whacked around a bit by me Discussion: https://postgr.es/m/6406dbd2-5d37-4cb6-6eb2-9c44172c7e7c@postgrespro.ru	2018-10-06 12:00:09 -04:00
Michael Paquier	9cd92d1a33	Add pg_ls_tmpdir function This lists the contents of a temporary directory associated to a given tablespace, useful to get information about on-disk consumption caused by temporary files used by a session query. By default, pg_default is scanned, and a tablespace can be specified as argument. This function is intended to be used by monitoring tools, and, unlike pg_ls_dir(), access to them can be granted to non-superusers so that those monitoring tools can observe the principle of least privilege. Access is also given by default to members of pg_monitor. Author: Nathan Bossart Reviewed-by: Laurenz Albe Discussion: https://postgr.es/m/92F458A2-6459-44B8-A7F2-2ADD3225046A@amazon.com	2018-10-05 09:21:48 +09:00
Tom Lane	fdba460a26	Create an RTE field to record the query's lock mode for each relation. Add RangeTblEntry.rellockmode, which records the appropriate lock mode for each RTE_RELATION rangetable entry (either AccessShareLock, RowShareLock, or RowExclusiveLock depending on the RTE's role in the query). This patch creates the field and makes all creators of RTE nodes fill it in reasonably, but for the moment nothing much is done with it. The plan is to replace assorted post-parser logic that re-determines the right lockmode to use with simple uses of rte->rellockmode. For now, just add Asserts in each of those places that the rellockmode matches what they are computing today. (In some cases the match isn't perfect, so the Asserts are weaker than you might expect; but this seems OK, as per discussion.) This passes check-world for me, but it seems worth pushing in this state to see if the buildfarm finds any problems in cases I failed to test. catversion bump due to change of stored rules. Amit Langote, reviewed by David Rowley and Jesper Pedersen, and whacked around a bit more by me Discussion: https://postgr.es/m/468c85d9-540e-66a2-1dde-fec2b741e688@lab.ntt.co.jp	2018-09-30 13:55:51 -04:00
Alvaro Herrera	62e533d3f1	Remove fmgr.h inclusion from partition.h It's not needed anymore.	2018-09-25 17:52:07 -03:00
Tom Lane	fd582317e1	Sync our Snowball stemmer dictionaries with current upstream. We haven't touched these since text search functionality landed in core in 2007 :-(. While the upstream project isn't a beehive of activity, they do make additions and bug fixes from time to time. Update our copies of these files. Also update our documentation about how to keep things in sync, since they're not making distribution tarballs these days. Fortunately, their source code turns out to be a breeze to build. Notable changes: * The non-UTF8 version of the hungarian stemmer now works in LATIN2 not LATIN1. * New stemmers have appeared for arabic, indonesian, irish, lithuanian, nepali, and tamil. These all work in UTF8, and the indonesian and irish ones also work in LATIN1. (There are some new stemmers that I did not incorporate, mainly because their names don't match the underlying languages, suggesting that they're not to be considered mainstream.) Worth noting: the upstream Nepali dictionary was contributed by Arthur Zakirov. initdb forced because the contents of snowball_create.sql have changed. Still TODO: see about updating the stopword lists. Arthur Zakirov, minor mods and doc work by me Discussion: https://postgr.es/m/20180626122025.GA12647@zakirov.localdomain Discussion: https://postgr.es/m/20180219140849.GA9050@zakirov.localdomain	2018-09-24 17:29:38 -04:00
Joe Conway	c62dd80cdf	Document aclitem functions and operators aclitem functions and operators have been heretofore undocumented. Fix that. While at it, ensure the non-operator aclitem functions have pg_description strings. Does not seem worthwhile to back-patch. Author: Fabien Coelho, with pg_description from John Naylor, and significant refactoring and editorialization by me. Reviewed by: Tom Lane Discussion: https://postgr.es/m/flat/alpine.DEB.2.21.1808010825490.18204%40lancre	2018-09-24 10:14:57 -04:00
Tom Lane	b09a64d602	Add missing pg_description strings for pg_type entries. I noticed that all non-composite, non-array entries in pg_type.dat had descr strings, except for "json" and the pseudo-types. The lack for json seems certainly an oversight, and there's surely little reason to not have entries for the pseudo-types either. So add some. "make reformat-dat-files" turned up some formatting issues in pg_amop.dat, too, so fix those in passing. No catversion bump since the backend doesn't care too much what is in pg_description.	2018-09-20 16:06:18 -04:00
Tom Lane	3dc820c43e	Teach genbki.pl to auto-generate pg_type entries for array types. This eliminates some more tedium in adding new catalog entries, specifically the need to set up an array type when adding a new built-in data type. Now it's sufficient to assign an OID for the array type and write it in an "array_type_oid" metadata field. You don't have to fill the base type's typarray link explicitly, either. No catversion bump since the contents of pg_type aren't changed. (Well, their order might be different, but that doesn't matter.) John Naylor, reviewed and whacked around a bit by Dagfinn Ilmari Mannsåker, and some more by me. Discussion: https://postgr.es/m/CAJVSVGVTb6m9pJF49b3SuA8J+T-THO9c0hxOmoyv-yGKh-FbNg@mail.gmail.com	2018-09-20 15:14:46 -04:00
Alexander Korotkov	2a6368343f	Add support for nearest-neighbor (KNN) searches to SP-GiST Currently, KNN searches were supported only by GiST. SP-GiST also capable to support them. This commit implements that support. SP-GiST scan stack is replaced with queue, which serves as stack if no ordering is specified. KNN support is provided for three SP-GIST opclasses: quad_point_ops, kd_point_ops and poly_ops (catversion is bumped). Some common parts between GiST and SP-GiST KNNs are extracted into separate functions. Discussion: https://postgr.es/m/570825e8-47d0-4732-2bf6-88d67d2d51c8%40postgrespro.ru Author: Nikita Glukhov, Alexander Korotkov based on GSoC work by Vlad Sterzhanov Review: Andrey Borodin, Alexander Korotkov	2018-09-19 01:54:10 +03:00
Tom Lane	db1071d4ee	Fix some minor issues exposed by outfuncs/readfuncs testing. A test patch to pass parse and plan trees through outfuncs + readfuncs exposed several issues that need to be fixed to get clean matches: Query.withCheckOptions failed to get copied; it's intentionally ignored by outfuncs/readfuncs on the grounds that it'd always be NIL anyway in stored rules. This seems less than future-proof, and it's not even saving very much, so just undo the decision and treat the field like all others. Several places that convert a view RTE into a subquery RTE, or similar manipulations, failed to clear out fields that were specific to the original RTE type and should be zero in a subquery RTE. Since readfuncs.c will leave such fields as zero, equalfuncs.c thinks the nodes are different leading to a reported mismatch. It seems like a good idea to clear out the no-longer-needed fields, even though in principle nothing should look at them; the node ought to be indistinguishable from how it would look if we'd built a new node instead of scribbling on the old one. BuildOnConflictExcludedTargetlist randomly set the resname of some TargetEntries to "" not NULL. outfuncs/readfuncs don't distinguish those cases, and so the string will read back in as NULL ... but equalfuncs.c does distinguish. Perhaps we ought to try to make things more consistent in this area --- but it's just useless extra code space for BuildOnConflictExcludedTargetlist to not use NULL here, so I fixed it for now by making it do that. catversion bumped because the change in handling of Query.withCheckOptions affects stored rules. Discussion: https://postgr.es/m/17114.1537138992@sss.pgh.pa.us	2018-09-18 15:08:28 -04:00
Michael Paquier	1d6fbc38d9	Refactor routines for subscription and publication lookups Those routines gain a missing_ok argument, allowing a caller to get a NULL result instead of an error if set to true. This is part of a larger refactoring effort for objectaddress.c where trying to check for non-existing objects does not result in cache lookup failures. Author: Michael Paquier Reviewed-by: Aleksander Alekseev, Álvaro Herrera Discussion: https://postgr.es/m/CAB7nPqSZxrSmdHK-rny7z8mi=EAFXJ5J-0RbzDw6aus=wB5azQ@mail.gmail.com	2018-09-18 12:00:18 +09:00
Tom Lane	ae5205c8a8	Make argument names of pg_get_object_address consistent, and fix docs. pg_get_object_address and pg_identify_object_as_address are supposed to be inverses, but they disagreed as to the names of the arguments representing the textual form of an object address. Moreover, the documented argument names didn't agree with reality at all, either for these functions or pg_identify_object. In HEAD and v11, I think we can get away with renaming the input arguments of pg_get_object_address to match the outputs of pg_identify_object_as_address. In theory that might break queries using named-argument notation to call pg_get_object_address, but it seems really unlikely that anybody is doing that, or that they'd have much trouble adjusting if they were. In older branches, we'll just live with the lack of consistency. Aside from fixing the documentation of these functions to match reality, I couldn't resist the temptation to do some copy-editing. Per complaint from Jean-Pierre Pelletier. Back-patch to 9.5 where these functions were introduced. (Before v11, this is a documentation change only.) Discussion: https://postgr.es/m/CANGqjDnWH8wsTY_GzDUxbt4i=y-85SJreZin4Hm8uOqv1vzRQA@mail.gmail.com	2018-09-05 13:47:28 -04:00
Tom Lane	17b7c302b5	Fully enforce uniqueness of constraint names. It's been true for a long time that we expect names of table and domain constraints to be unique among the constraints of that table or domain. However, the enforcement of that has been pretty haphazard, and it missed some corner cases such as creating a CHECK constraint and then an index constraint of the same name (as per recent report from André Hänsel). Also, due to the lack of an actual unique index enforcing this, duplicates could be created through race conditions. Moreover, the code that searches pg_constraint has been quite inconsistent about how to handle duplicate names if one did occur: some places checked and threw errors if there was more than one match, while others just processed the first match they came to. To fix, create a unique index on (conrelid, contypid, conname). Since either conrelid or contypid is zero, this will separately enforce uniqueness of constraint names among constraints of any one table and any one domain. (If we ever implement SQL assertions, and put them into this catalog, more thought might be needed. But it'd be at least as reasonable to put them into a new catalog; having overloaded this one catalog with two kinds of constraints was a mistake already IMO.) This index can replace the existing non-unique index on conrelid, though we need to keep the one on contypid for query performance reasons. Having done that, we can simplify the logic in various places that either coped with duplicates or neglected to, as well as potentially improve lookup performance when searching for a constraint by name. Also, as per our usual practice, install a preliminary check so that you get something more friendly than a unique-index violation report in the case complained of by André. And teach ChooseIndexName to avoid choosing autogenerated names that would draw such a failure. While it's not possible to make such a change in the back branches, it doesn't seem quite too late to put this into v11, so do so. Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de	2018-09-04 13:45:35 -04:00
Alvaro Herrera	c076f3d74a	Remove pg_constraint.conincluding This column was added in commit `8224de4f42` ("Indexes with INCLUDE columns and their support in B-tree") to ease writing the ruleutils.c supporting code for that feature, but it turns out to be unnecessary -- we can do the same thing with just one more syscache lookup. Even the documentation for the new column being removed in this commit is awkward. Discussion: https://postgr.es/m/20180902165018.33otxftp3olgtu4t@alvherre.pgsql	2018-09-03 12:59:26 -03:00
Peter Eisentraut	a4a232b1e7	Error position support for defaults and check constraints Add support for error position reporting for the expressions contained in defaults and check constraint definitions. This currently works only for CREATE TABLE, not ALTER TABLE, because the latter is not set up to pass around the original query string. Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>	2018-08-30 08:20:23 +02:00
Peter Eisentraut	9b39b799db	Add some not null constraints to catalogs Use BKI_FORCE_NOT_NULL on some catalog field declarations that are never null (according to the source code that accesses them).	2018-08-27 16:21:23 +02:00
Michael Paquier	246a6c8f7b	Make autovacuum more aggressive to remove orphaned temp tables Commit `dafa084`, added in 10, made the removal of temporary orphaned tables more aggressive. This commit makes an extra step into the aggressiveness by adding a flag in each backend's MyProc which tracks down any temporary namespace currently in use. The flag is set when the namespace gets created and can be reset if the temporary namespace has been created in a transaction or sub-transaction which is aborted. The flag value assignment is assumed to be atomic, so this can be done in a lock-less fashion like other flags already present in PGPROC like databaseId or backendId, still the fact that the temporary namespace and table created are still locked until the transaction creating those commits acts as a barrier for other backends. This new flag gets used by autovacuum to discard more aggressively orphaned tables by additionally checking for the database a backend is connected to as well as its temporary namespace in-use, removing orphaned temporary relations even if a backend reuses the same slot as one which created temporary relations in a past session. The base idea of this patch comes from Robert Haas, has been written in its first version by Tsunakawa Takayuki, then heavily reviewed by me. Author: Tsunakawa Takayuki Reviewed-by: Michael Paquier, Kyotaro Horiguchi, Andres Freund Discussion: https://postgr.es/m/0A3221C70F24FB45833433255569204D1F8A4DC6@G01JPEXMBYT05 Backpatch: 11-, as PGPROC gains a new flag and we don't want silent ABI breakages on already released versions.	2018-08-13 11:49:04 +02:00
Michael Paquier	f2b1316a94	Bump catalog version for recent toast table additions This has been forgotten in `96cdeae`.	2018-07-20 09:28:19 +09:00
Michael Paquier	96cdeae07f	Add toast tables to most system catalogs It has been project policy to create toast tables only for those catalogs that might reasonably need one. Since this judgment call can change over time, just create one for every catalog, as this can be useful when creating rather-long entries in catalogs, with recent examples being in the shape of policy expressions or customly-formatted SCRAM verifiers. To prevent circular dependencies and to avoid adding complexity to VACUUM FULL logic, exclude pg_class, pg_attribute, and pg_index. Also, to prevent pg_upgrade from seeing a non-empty new cluster, exclude pg_largeobject and pg_largeobject_metadata from the set as large object data is handled as user data. Those relations have no reason to use a toast table anyway. Author: Joe Conway, John Naylor Reviewed-by: Michael Paquier, Tom Lane Discussion: https://postgr.es/m/84ddff04-f122-784b-b6c5-3536804495f8@joeconway.com	2018-07-20 07:43:41 +09:00
Michael Paquier	ce89ad0fa0	Fix argument of pg_create_logical_replication_slot for slot name All attributes and arguments using a slot name map to the data type "name", but this function has been using "text". This is cosmetic, as even if text is used then the slot name would be truncated to 64 characters anyway and stored as such. The documentation already said so and the function already assumed that the argument was of this type when fetching its value. Bump catalog version. Author: Sawada Masahiko Discussion: https://postgr.es/m/CAD21AoADYz_-eAqH5AVFaCaojcRgwpo9PW=u8kgTMys63oB8Cw@mail.gmail.com	2018-07-13 09:32:12 +09:00
Tom Lane	39a96512b3	Mark built-in btree comparison functions as leakproof where it's safe. Generally, if the comparison operators for a datatype or pair of datatypes are leakproof, the corresponding btree comparison support function can be considered so as well. But we had not originally worried about marking support functions as leakproof, reasoning that they'd not likely be used in queries so the marking wouldn't matter. It turns out there's at least one place where it does matter: calc_arraycontsel() finds the target datatype's default btree comparison function and tries to use that to estimate selectivity, but it will be blocked in some cases if the function isn't leakproof. This leads to unnecessarily poor selectivity estimates and bad plans, as seen in bug #15251. Hence, run around and apply proleakproof markings where the corresponding btree comparison operators are leakproof. (I did eyeball each function to verify that it wasn't doing anything surprising, too.) This isn't a full solution to bug #15251, and it's not back-patchable because of the need for a catversion bump. A more useful response probably is to consider whether we can check permissions on the parent table instead of the child. However, this change will help in some cases where that won't, and it's easy enough to do in HEAD, so let's do so. Discussion: https://postgr.es/m/3876.1531261875@sss.pgh.pa.us	2018-07-11 18:47:31 -04:00
Andrew Dunstan	123efbccea	Mark binary_upgrade_set_missing_value as parallel_unsafe per buildfarm. Bump catalog version again although in practice nobody is going to use this in a parallel query.	2018-06-23 08:43:05 -04:00
Andrew Dunstan	2448adf29c	Allow for pg_upgrade of attributes with missing values Commit `16828d5c02` neglected to do this, so upgraded databases would silently get null instead of the specified default in rows without the attribute defined. A new binary upgrade function is provided to perform this and pg_dump is adjusted to output a call to the function if required in binary upgrade mode. Also included is code to drop missing attribute values for dropped columns. That way if the type is later dropped the missing value won't have a dangling reference to the type. Finally the regression tests are adjusted to ensure that there is a row with a missing value so that this code is exercised in upgrade testing. Catalog version unfortunately bumped. Regression test changes from Tom Lane. Remainder from me, reviewed by Tom Lane, Andres Freund, Alvaro Herrera Discussion: https://postgr.es/m/19987.1529420110@sss.pgh.pa.us	2018-06-22 08:42:36 -04:00
Andrew Dunstan	3a7cc727c7	Don't fall off the end of perl functions This complies with the perlcritic policy Subroutines::RequireFinalReturn, which is a severity 4 policy. Since we only currently check at severity level 5, the policy is raised to that level until we move to level 4 or lower, so that any new infringements will be caught. A small cosmetic piece of tidying of the pgperlcritic script is included. Mike Blackwell Discussion: https://postgr.es/m/CAESHdJpfFm_9wQnQ3koY3c91FoRQsO-fh02za9R3OEMndOn84A@mail.gmail.com	2018-05-27 09:08:42 -04:00
Tom Lane	71b349aef4	Update a couple of long-obsolete comments in pg_type.h.	2018-05-26 13:47:26 -04:00
Tom Lane	f755a152d4	Improve spelling of new FINALFUNC_MODIFY aggregate attribute. I'd used SHARABLE as a value originally, but Peter Eisentraut points out that dictionaries agree that SHAREABLE is the preferred spelling. Run around and change that before it's too late. Discussion: https://postgr.es/m/d2e1afd4-659c-50d6-1b20-7cfd3675e909@2ndquadrant.com	2018-05-21 11:41:42 -04:00
Tom Lane	e7a808f947	Assorted minor cleanups for bootstrap-data Perl scripts. FindDefinedSymbol was intended to take an array of possible include paths, but it never actually worked correctly for any but the first array element. Since there's no use-case for more than one path anyway, let's just simplify this code and its callers by redefining it as taking only one include path. Minor other code-beautification without functional effects, except that in one place we format the output as pgindent would do. John Naylor Discussion: https://postgr.es/m/CAJVSVGXM_n32hTTkircW4_K1LQFsJNb6xjs0pAP4QC0ZpyJfPQ@mail.gmail.com	2018-05-19 16:04:47 -04:00
Tom Lane	85475afdb6	Cosmetic improvement: use BKI_DEFAULT and BKI_LOOKUP in pg_language. The point of this is not really to remove redundancy in pg_language.dat; with only three entries, it's hardly worth it. Rather, it is to get to a point where there are exactly zero hard-coded numeric pg_proc OID references in the catalog .dat files. The lanvalidator column was the only remaining location of such references, and it seems like a good thing for future-proofing reasons to make it not be a special case. There are still a few places in the .dat files with numeric OID references to other catalogs, but after review I don't see any that seem worth changing at present. In each case there are just too few entries to make it worth the trouble to create lookup infrastructure. This doesn't change the emitted postgres.bki file, so no catversion bump.	2018-04-29 13:26:26 -04:00
Tom Lane	84549ebd4c	Tweak reformat_dat_file.pl to make it more easily hand-invokable. Use the same code we already applied in duplicate_oids and unused_oids to let this script find Catalog.pm without help. This removes the need to supply a -I switch in most cases. Also, mark the script executable, again to follow the precedent of duplicate_oids and unused_oids. Now you can just do "./reformat_dat_file.pl pg_proc.dat" if you want to reformat only one or a few .dat files rather than all. It'd be possible to remove the -I switches in the Makefile's convenience targets, but I chose to leave them: they don't hurt anything, and it's possible that in weird VPATH situations they might be of value.	2018-04-28 16:09:03 -04:00
Tom Lane	45c6d75f8c	Clarify handling of special-case values in bootstrap catalog data. I (tgl) originally coded the special case for pg_proc.pronargs as though it were a kind of default value. It seems better though to treat computable columns as an independent concern: this makes the code clearer, and probably a bit faster too since we needn't do work inside the per-column loop. Improve related comments, as well, in the expectation that there might be more cases like this in future. John Naylor, some additional comment-hacking by me Discussion: https://postgr.es/m/CAJVSVGW-D7OobzU=dybVT2JqZAx-4X1yvBJdavBmqQL05Q6CLw@mail.gmail.com	2018-04-28 15:27:16 -04:00
Tom Lane	bdf46af748	Post-feature-freeze pgindent run. Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us	2018-04-26 14:47:16 -04:00
Tom Lane	dd4cc9d706	Fix duplicate_oids and unused_oids so user needn't cd to catalog dir. Previously, you had to cd into src/include/catalog before running either of these scripts. That's a bit tedious, so let's make the scripts do it for you. In passing, improve the initial comments in both scripts. Also remove unused_oids' code to complain about duplicate oids. That was added in yesterday's commit `5602265f7`, but on second thought we shouldn't be randomly redefining the script's behavior that way. John Naylor and Tom Lane Discussion: https://postgr.es/m/37D774E4-FE1F-437E-B3D2-593F314B7505@postgrespro.ru	2018-04-26 11:20:01 -04:00
Tom Lane	5602265f77	Convert unused_oids and duplicate_oids to use Catalog.pm infrastructure. unused_oids was previously a shell script, which of course didn't work at all on Windows. Also, commit `372728b0d` introduced some other portability problems, as complained of by Stas Kelvich. We can improve matters by converting it to Perl. While we're at it, let's future-proof both this script and duplicate_oids to use Catalog.pm rather than having a bunch of ad-hoc logic for parsing catalog headers and .dat files. These scripts are thereby a bit slower, which doesn't seem like a problem for typical manual use. It is a little annoying for buildfarm purposes, but we should be able to fix that case by having genbki.pl make the check instead of parsing the headers twice. (That's not done in this commit, though.) Stas Kelvich, adjusted a bit by me Discussion: https://postgr.es/m/37D774E4-FE1F-437E-B3D2-593F314B7505@postgrespro.ru	2018-04-25 16:01:58 -04:00
Tom Lane	1eb3a09e93	Make Catalog.pm's representation of toast and index decls more abstract. Instead of immediately constructing the string we need to emit into the .BKI file, preserve the items we extracted from the header file in a hash. This eases using the info for other purposes. John Naylor (with cosmetic adjustments by me) Discussion: https://postgr.es/m/37D774E4-FE1F-437E-B3D2-593F314B7505@postgrespro.ru	2018-04-25 16:01:58 -04:00
Tom Lane	f04d4ac919	Reindent Perl files with perltidy version 20170521. Discussion: https://postgr.es/m/CABUevEzK3cNiHZQ18f5tK0guoT+cN_jWeVzhYYxY=r+1Q3SmoA@mail.gmail.com	2018-04-25 14:00:19 -04:00
Tom Lane	68c23cba34	Improve consistency of comments in system catalog headers. Use the term "system catalog" rather than "system relation" in assorted places where it's clearly referring to a table rather than, say, an index. Use more natural word order in the header boilerplate, improve some of the one-liner catalog descriptions, and fix assorted random deviations from the normal boilerplate. All purely neatnik-ism, but why not. John Naylor, some additional cleanup by me Discussion: https://postgr.es/m/CAJVSVGUeJmFB3h-NJ18P32NPa+kzC165nm7GSoGHfPaN80Wxcw@mail.gmail.com	2018-04-19 17:14:09 -04:00
Teodor Sigaev	ff4943042f	Fix datatype for number of heap tuples during last cleanup It appears that new fields introduced in `857f9c36` have inconsistent datatypes: BTMetaPageData.btm_last_cleanup_num_heap_tuples is of float4 type, while xl_btree_metadata.last_cleanup_num_heap_tuples is of double type. IndexVacuumInfo.num_heap_tuples, which is a source of values for both former fields is of double type. So, make both those fields in BTMetaPageData and xl_btree_metadata use float8 type in order to match the precision of the source. That shouldn't be double type, because we always use types with explicit width in WAL. Patch introduces incompatibility of on-disk format since `857f9c36` commit, but that versions never was released, so just bump catalog version to avoid possible confusion. Author: Alexander Korortkov	2018-04-19 11:28:03 +03:00
Tom Lane	55d26ff638	Rationalize handling of single and double quotes in bootstrap data. Change things around so that proper quoting of values interpolated into the BKI data by initdb is the responsibility of initdb, not something we half-heartedly handle by putting double quotes into the raw BKI data. (Note: experimentation shows that it still doesn't work to put a double quote into the initial superuser username, but that's the fault of inadequate quoting while interpolating the name into SQL scripts; the BKI aspect of it works fine now.) Having done that, we can remove the special-case handling of values that look like "something" from genbki.pl, and instead teach it to escape double --- and single --- quotes properly. This removes the nowhere-documented need to treat those specially in the BKI source data; whatever you write will be passed through unchanged into the inserted data value, modulo Perl's rules about single-quoted strings. Add documentation explaining the (pre-existing) handling of backslashes in the BKI data. Per an earlier discussion with John Naylor. Discussion: https://postgr.es/m/CAJVSVGUNao=-Q2-vAN3PYcdF5tnL5JAHwGwzZGuYHtq+Mk_9ng@mail.gmail.com	2018-04-17 19:53:50 -04:00
Alvaro Herrera	da6f3e45dd	Reorganize partitioning code There's been a massive addition of partitioning code in PostgreSQL 11, with little oversight on its placement, resulting in a catalog/partition.c with poorly defined boundaries and responsibilities. This commit tries to set a couple of distinct modules to separate things a little bit. There are no code changes here, only code movement. There are three new files: src/backend/utils/cache/partcache.c src/include/partitioning/partdefs.h src/include/utils/partcache.h The previous arrangement of #including catalog/partition.h almost everywhere is no more. Authors: Amit Langote and Álvaro Herrera Discussion: https://postgr.es/m/98e8d509-790a-128c-be7f-e48a5b2d8d97@lab.ntt.co.jp https://postgr.es/m/11aa0c50-316b-18bb-722d-c23814f39059@lab.ntt.co.jp https://postgr.es/m/143ed9a4-6038-76d4-9a55-502035815e68@lab.ntt.co.jp https://postgr.es/m/20180413193503.nynq7bnmgh6vs5vm@alvherre.pgsql	2018-04-14 21:12:14 -03:00
Tom Lane	2cdf359fc4	Make reformat_dat_file.pl preserve all blank lines. In its original form, reformat_dat_file.pl smashed consecutive blank lines to a single blank line, which was helpful for mopping up excess whitespace during the bootstrap data format conversion. But going forward, there seems little reason to do that; if developers want to put in multiple blank lines, let 'em. This makes it conform to the documentation I (tgl) wrote, too. In passing, clean up some sloppy markup choices in bki.sgml. John Naylor Discussion: https://postgr.es/m/28827.1523039259@sss.pgh.pa.us	2018-04-09 14:58:39 -04:00
Tom Lane	af1a949109	Further cleanup of client dependencies on src/include/catalog headers. In commit `9c0a0de4c`, I'd failed to notice that catalog/catalog.h should also be considered a frontend-unsafe header, because it includes (and needs) the full form of pg_class.h, not to mention relcache.h. However, various frontend code was depending on it to get TABLESPACE_VERSION_DIRECTORY, so refactoring of some sort is called for. The cleanest answer seems to be to move TABLESPACE_VERSION_DIRECTORY, as well as the OIDCHARS symbol, to common/relpath.h. Do that, and mop up inclusions as necessary. (I found that quite a few current users of catalog/catalog.h don't seem to need it at all anymore, apparently as a result of the refactorings that created common/relpath.[hc]. And initdb.c needed it only as a route to pg_class_d.h.) Discussion: https://postgr.es/m/6629.1523294509@sss.pgh.pa.us	2018-04-09 14:39:58 -04:00
Magnus Hagander	f5543d47bc	catversion bump for online-checksums revert Lack thereof pointed out by Tom Lane.	2018-04-09 19:26:58 +02:00
Magnus Hagander	a228cc13ae	Revert "Allow on-line enabling and disabling of data checksums" This reverts the backend sides of commit `1fde38beaa`. I have, at least for now, left the pg_verify_checksums tool in place, as this tool can be very valuable without the rest of the patch as well, and since it's a read-only tool that only runs when the cluster is down it should be a lot safer.	2018-04-09 19:03:42 +02:00
Tom Lane	4f85f66469	Cosmetic cleanups in initial catalog data. Write ',' and ';' for typdelim values instead of the obscurantist ASCII octal equivalents. Not sure why anybody ever thought the latter were better; maybe it had something to do with lack of a better quoting convention, twenty-plus years ago? Reassign a couple of high-numbered OIDs that were left in during yesterday's mad rush to commit stuff of uncertain internal temperature. The latter requires a catversion bump, though the former wouldn't since the end-result catalog data is unchanged.	2018-04-08 15:55:49 -04:00
Tom Lane	cefa387153	Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers. Traditionally, include/catalog/pg_foo.h contains extern declarations for functions in backend/catalog/pg_foo.c, in addition to its function as the authoritative definition of the pg_foo catalog's rowtype. In some cases, we'd been forced to split out those extern declarations into separate pg_foo_fn.h headers so that the catalog definitions could be #include'd by frontend code. That problem is gone as of commit `9c0a0de4c`, so let's undo the splits to make things less confusing. Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us	2018-04-08 14:35:29 -04:00
Tom Lane	372728b0d4	Replace our traditional initial-catalog-data format with a better design. Historically, the initial catalog data to be installed during bootstrap has been written in DATA() lines in the catalog header files. This had lots of disadvantages: the format was badly underdocumented, it was very difficult to edit the data in any mechanized way, and due to the lack of any abstraction the data was verbose, hard to read/understand, and easy to get wrong. Hence, move this data into separate ".dat" files and represent it in a way that can easily be read and rewritten by Perl scripts. The new format is essentially "key => value" for each column; while it's a bit repetitive, explicit labeling of each value makes the data far more readable and less error-prone. Provide a way to abbreviate entries by omitting field values that match a specified default value for their column. This allows removal of a large amount of repetitive boilerplate and also lowers the barrier to adding new columns. Also teach genbki.pl how to translate symbolic OID references into numeric OIDs for more cases than just "regproc"-like pg_proc references. It can now do that for regprocedure-like references (thus solving the problem that regproc is ambiguous for overloaded functions), operators, types, opfamilies, opclasses, and access methods. Use this to turn nearly all OID cross-references in the initial data into symbolic form. This represents a very large step forward in readability and error resistance of the initial catalog data. It should also reduce the difficulty of renumbering OID assignments in uncommitted patches. Also, solve the longstanding problem that frontend code that would like to use OID macros and other information from the catalog headers often had difficulty with backend-only code in the headers. To do this, arrange for all generated macros, plus such other declarations as we deem fit, to be placed in "derived" header files that are safe for frontend inclusion. (Once clients migrate to using these pg__d.h headers, it will be possible to get rid of the pg__fn.h headers, which only exist to quarantine code away from clients. That is left for follow-on patches, however.) The now-automatically-generated macros include the Anum_xxx and Natts_xxx constants that we used to have to update by hand when adding or removing catalog columns. Replace the former manual method of generating OID macros for pg_type entries with an automatic method, ensuring that all built-in types have OID macros. (But note that this patch does not change the way that OID macros for pg_proc entries are built and used. It's not clear that making that match the other catalogs would be worth extra code churn.) Add SGML documentation explaining what the new data format is and how to work with it. Despite being a very large change in the catalog headers, there is no catversion bump here, because postgres.bki and related output files haven't changed at all. John Naylor, based on ideas from various people; review and minor additional coding by me; previous review by Alvaro Herrera Discussion: https://postgr.es/m/CAJVSVGWO48JbbwXkJz_yBFyGYW-M9YWxnPdxJBUosDC9ou_F0Q@mail.gmail.com	2018-04-08 13:17:27 -04:00
Teodor Sigaev	8224de4f42	Indexes with INCLUDE columns and their support in B-tree This patch introduces INCLUDE clause to index definition. This clause specifies a list of columns which will be included as a non-key part in the index. The INCLUDE columns exist solely to allow more queries to benefit from index-only scans. Also, such columns don't need to have appropriate operator classes. Expressions are not supported as INCLUDE columns since they cannot be used in index-only scans. Index access methods supporting INCLUDE are indicated by amcaninclude flag in IndexAmRoutine. For now, only B-tree indexes support INCLUDE clause. In B-tree indexes INCLUDE columns are truncated from pivot index tuples (tuples located in non-leaf pages and high keys). Therefore, B-tree indexes now might have variable number of attributes. This patch also provides generic facility to support that: pivot tuples contain number of their attributes in t_tid.ip_posid. Free 13th bit of t_info is used for indicating that. This facility will simplify further support of index suffix truncation. The changes of above are backward-compatible, pg_upgrade doesn't need special handling of B-tree indexes for that. Bump catalog version Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes, David Rowley, Alexander Korotkov Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru	2018-04-07 23:00:39 +03:00
Teodor Sigaev	1c1791e000	Add json(b)_to_tsvector function Jsonb has a complex nature so there isn't best-for-everything way to convert it to tsvector for full text search. Current to_tsvector(json(b)) suggests to convert only string values, but it's possible to index keys, numerics and even booleans value. To solve that json(b)_to_tsvector has a second required argument contained a list of desired types of json fields. Second argument is a jsonb scalar or array right now with possibility to add new options in a future. Bump catalog version Author: Dmitry Dolgov with some editorization by me Reviewed by: Teodor Sigaev Discussion: https://www.postgresql.org/message-id/CA+q6zcXJQbS1b4kJ_HeAOoOc=unfnOrUEL=KGgE32QKDww7d8g@mail.gmail.com	2018-04-07 20:58:03 +03:00
Peter Eisentraut	039eb6e92f	Logical replication support for TRUNCATE Update the built-in logical replication system to make use of the previously added logical decoding for TRUNCATE support. Add the required truncate callback to pgoutput and a new logical replication protocol message. Publications get a new attribute to determine whether to replicate truncate actions. When updating a publication via pg_dump from an older version, this is not set, thus preserving the previous behavior. Author: Simon Riggs <simon@2ndquadrant.com> Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it> Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2018-04-07 11:34:11 -04:00
Tom Lane	cb1ff1e5af	Remove some unnecessary quote marks from catalog DATA lines. This has no functional impact whatsoever. However, it causes these unnecessary quote marks to disappear from the generated postgres.bki file, making it easier to verify that the upcoming bootstrap data conversion patch doesn't change the generated file.	2018-04-06 18:58:38 -04:00
Alvaro Herrera	9fdb675fc5	Faster partition pruning Add a new module backend/partitioning/partprune.c, implementing a more sophisticated algorithm for partition pruning. The new module uses each partition's "boundinfo" for pruning instead of constraint exclusion, based on an idea proposed by Robert Haas of a "pruning program": a list of steps generated from the query quals which are run iteratively to obtain a list of partitions that must be scanned in order to satisfy those quals. At present, this targets planner-time partition pruning, but there exist further patches to apply partition pruning at execution time as well. This commit also moves some definitions from include/catalog/partition.h to a new file include/partitioning/partbounds.h, in an attempt to rationalize partitioning related code. Authors: Amit Langote, David Rowley, Dilip Kumar Reviewers: Robert Haas, Kyotaro Horiguchi, Ashutosh Bapat, Jesper Pedersen. Discussion: https://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp	2018-04-06 16:44:05 -03:00
Stephen Frost	11523e860f	Support new default roles with adminpack This provides a newer version of adminpack which works with the newly added default roles to support GRANT'ing to non-superusers access to read and write files, along with related functions (unlinking files, getting file length, renaming/removing files, scanning the log file directory) which are supported through adminpack. Note that new versions of the functions are required because an environment might have an updated version of the library but still have the old adminpack 1.0 catalog definitions (where EXECUTE is GRANT'd to PUBLIC for the functions). This patch also removes the long-deprecated alternative names for functions that adminpack used to include and which are now included in the backend, in adminpack v1.1. Applications using the deprecated names should be updated to use the backend functions instead. Existing installations which continue to use adminpack v1.0 should continue to function until/unless adminpack is upgraded. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net	2018-04-06 14:47:10 -04:00
Stephen Frost	0fdc8495bf	Add default roles for file/program access This patch adds new default roles named 'pg_read_server_files', 'pg_write_server_files', 'pg_execute_server_program' which allow an administrator to GRANT to a non-superuser role the ability to access server-side files or run programs through PostgreSQL (as the user the database is running as). Having one of these roles allows a non-superuser to use server-side COPY to read, write, or with a program, and to use file_fdw (if installed by a superuser and GRANT'd USAGE on it) to read from files or run a program. The existing misc file functions are also changed to allow a user with the 'pg_read_server_files' default role to read any files on the filesystem, matching the privileges given to that role through COPY and file_fdw from above. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net	2018-04-06 14:47:10 -04:00
Peter Eisentraut	bcf79b5bb6	Split the SetSubscriptionRelState function into two We don't actually need the insert-or-update logic, so it's clearer to have separate functions for the inserting and updating. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>	2018-04-06 10:00:26 -04:00
Magnus Hagander	1fde38beaa	Allow on-line enabling and disabling of data checksums This makes it possible to turn checksums on in a live cluster, without the previous need for dump/reload or logical replication (and to turn it off). Enabling checkusm starts a background process in the form of a launcher/worker combination that goes through the entire database and recalculates checksums on each and every page. Only when all pages have been checksummed are they fully enabled in the cluster. Any failure of the process will revert to checksums off and the process has to be started. This adds a new WAL record that indicates the state of checksums, so the process works across replicated clusters. Authors: Magnus Hagander and Daniel Gustafsson Review: Tomas Vondra, Michael Banck, Heikki Linnakangas, Andrey Borodin	2018-04-05 22:04:48 +02:00
Teodor Sigaev	1664ae1978	Add websearch_to_tsquery Error-tolerant conversion function with web-like syntax for search query, it simplifies constraining search engine with close to habitual interface for users. Bump catalog version Authors: Victor Drobny, Dmitry Ivanov with editorization by me Reviewed by: Aleksander Alekseev, Tomas Vondra, Thomas Munro, Aleksandr Parfenov Discussion: https://www.postgresql.org/message-id/flat/fe931111ff7e9ad79196486ada79e268@postgrespro.ru	2018-04-05 19:55:11 +03:00
Alvaro Herrera	3de241dba8	Foreign keys on partitioned tables Author: Álvaro Herrera Discussion: https://postgr.es/m/20171231194359.cvojcour423ulha4@alvherre.pgsql Reviewed-by: Peter Eisentraut	2018-04-04 14:02:49 -03:00
Teodor Sigaev	710d90da1f	Add prefix operator for TEXT type. The prefix operator along with SP-GiST indexes can be used as an alternative for LIKE 'word%' commands and it doesn't have a limitation of string/prefix length as B-Tree has. Bump catalog version Author: Ildus Kurbangaliev with some editorization by me Review by: Arthur Zakirov, Alexander Korotkov, and me Discussion: https://www.postgresql.org/message-id/flat/20180202180327.222b04b3@wp.localdomain	2018-04-03 19:46:45 +03:00
Andres Freund	3e256e5506	Add SKIP_LOCKED option to RangeVarGetRelidExtended(). This will be used for VACUUM (SKIP LOCKED). Author: Nathan Bossart Reviewed-By: Michael Paquier and Andres Freund Discussion: https://postgr.es/m/20180306005349.b65whmvj7z6hbe2y@alap3.anarazel.de	2018-03-30 17:05:16 -07:00
Andres Freund	d87510a524	Combine options for RangeVarGetRelidExtended() into a flags argument. A followup patch will add a SKIP_LOCKED option. To avoid introducing evermore arguments, breaking existing callers each time, introduce a flags argument. This'll no doubt break a few external users... Also change the MISSING_OK behaviour so a DEBUG1 debug message is emitted when a relation is not found. Author: Nathan Bossart Reviewed-By: Michael Paquier and Andres Freund Discussion: https://postgr.es/m/20180306005349.b65whmvj7z6hbe2y@alap3.anarazel.de	2018-03-30 17:05:16 -07:00
Fujii Masao	9a895462d9	Enhance pg_stat_wal_receiver view to display host and port of sender server. Previously there was no way in the standby side to find out the host and port of the sender server that the walreceiver was currently connected to when multiple hosts and ports were specified in primary_conninfo. For that purpose, this patch adds sender_host and sender_port columns into pg_stat_wal_receiver view. They report the host and port that the active replication connection currently uses. Bump catalog version. Author: Haribabu Kommi Reviewed-by: Michael Paquier and me Discussion: https://postgr.es/m/CAJrrPGcV_aq8=cdqkFhVDJKEnDQ70yRTTdY9RODzMnXNrCz2Ow@mail.gmail.com	2018-03-31 07:51:22 +09:00
Tom Lane	11002f8afa	Fix bogus provolatile/proparallel markings on a few built-in functions. Richard Yen reported that pg_upgrade failed if the target cluster had force_parallel_mode = on, because binary_upgrade_create_empty_extension() is marked parallel restricted, allowing it to be executed in parallel mode, which complains because it tries to acquire an XID. In general, no function that might try to modify database data should be considered parallel safe or restricted, since execution of it might force XID acquisition. We found several other examples of this mistake. Furthermore, functions that execute user-supplied SQL queries or query fragments, or pull data from user-supplied cursors, had better be marked both volatile and parallel unsafe, because we don't know what the supplied query or cursor might try to do. There were several tsquery and XML functions that had the wrong proparallel marking for this, and some of them were even mislabeled as to volatility. All these bugs are old, dating back to 9.6 for the proparallel mistakes and much further for the provolatile mistakes. We can't force a catversion bump in the back branches, but we can at least ensure that installations initdb'd in future have the right values. Thomas Munro and Tom Lane Discussion: https://postgr.es/m/CAEepm=2sNDScSLTfyMYu32Q=ob98ZGW-vM_2oLxinzSABGQ6VA@mail.gmail.com	2018-03-30 18:14:51 -04:00
Teodor Sigaev	c0cbe00fee	Add casts from jsonb Add explicit cast from scalar jsonb to all numeric and bool types. It would be better to have cast from scalar jsonb to text too but there is already a cast from jsonb to text as just text representation of json. There is no way to have two different casts for the same type's pair. Bump catalog version Author: Anastasia Lubennikova with editorization by Nikita Glukhov and me Review by: Aleksander Alekseev, Nikita Glukhov, Darafei Praliaskouski Discussion: https://www.postgresql.org/message-id/flat/0154d35a-24ae-f063-5273-9ffcdf1c7f2e@postgrespro.ru	2018-03-29 16:33:56 +03:00
Andres Freund	b4013b8e4a	Add catversion bump missed in `16828d5c0`. Given that pg_attribute changed its layout...	2018-03-27 19:07:39 -07:00
Andrew Dunstan	16828d5c02	Fast ALTER TABLE ADD COLUMN with a non-NULL default Currently adding a column to a table with a non-NULL default results in a rewrite of the table. For large tables this can be both expensive and disruptive. This patch removes the need for the rewrite as long as the default value is not volatile. The default expression is evaluated at the time of the ALTER TABLE and the result stored in a new column (attmissingval) in pg_attribute, and a new column (atthasmissing) is set to true. Any existing row when fetched will be supplied with the attmissingval. New rows will have the supplied value or the default and so will never need the attmissingval. Any time the table is rewritten all the atthasmissing and attmissingval settings for the attributes are cleared, as they are no longer needed. The most visible code change from this is in heap_attisnull, which acquires a third TupleDesc argument, allowing it to detect a missing value if there is one. In many cases where it is known that there will not be any (e.g. catalog relations) NULL can be passed for this argument. Andrew Dunstan, heavily modified from an original patch from Serge Rielau. Reviewed by Tom Lane, Andres Freund, Tomas Vondra and David Rowley. Discussion: https://postgr.es/m/31e2e921-7002-4c27-59f5-51f08404c858@2ndQuadrant.com	2018-03-28 10:43:52 +10:30
Alvaro Herrera	555ee77a96	Handle INSERT .. ON CONFLICT with partitioned tables Commit `eb7ed3f306` enabled unique constraints on partitioned tables, but one thing that was not working properly is INSERT/ON CONFLICT. This commit introduces a new node keeps state related to the ON CONFLICT clause per partition, and fills it when that partition is about to be used for tuple routing. Author: Amit Langote, Álvaro Herrera Reviewed-by: Etsuro Fujita, Pavan Deolasee Discussion: https://postgr.es/m/20180228004602.cwdyralmg5ejdqkq@alvherre.pgsql	2018-03-26 10:43:54 -03:00
Alvaro Herrera	86f575948c	Allow FOR EACH ROW triggers on partitioned tables Previously, FOR EACH ROW triggers were not allowed in partitioned tables. Now we allow AFTER triggers on them, and on trigger creation we cascade to create an identical trigger in each partition. We also clone the triggers to each partition that is created or attached later. This means that deferred unique keys are allowed on partitioned tables, too. Author: Álvaro Herrera Reviewed-by: Peter Eisentraut, Simon Riggs, Amit Langote, Robert Haas, Thomas Munro Discussion: https://postgr.es/m/20171229225319.ajltgss2ojkfd3kp@alvherre.pgsql	2018-03-23 10:48:22 -03:00
Andres Freund	432bb9e04d	Basic JIT provider and error handling infrastructure. This commit introduces: 1) JIT provider abstraction, which allows JIT functionality to be implemented in separate shared libraries. That's desirable because it allows to install JIT support as a separate package, and because it allows experimentation with different forms of JITing. 2) JITContexts which can be, using functions introduced in follow up commits, used to emit JITed functions, and have them be cleaned up on error. 3) The outline of a LLVM JIT provider, which will be fleshed out in subsequent commits. Documentation for GUCs added, and for JIT in general, will be added in later commits. Author: Andres Freund, with architectural input from Jeff Davis Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de	2018-03-21 19:28:28 -07:00
Tom Lane	27ba260c73	Change oddly-chosen OID allocation. I noticed while fooling with John Naylor's bootstrap-data patch that we had one high-numbered manually assigned OID, 8888, which evidently came from a submission that the committer didn't bother to bring into line with usual OID allocation practices before committing. That's a bad idea, because it creates a hazard for other patches that may be temporarily using high OID numbers. Change it to something more in line with what we usually do. This evidently dates to commit `abb173392`. It's too late to change it in released branches, but we can fix it in HEAD.	2018-03-21 14:57:12 -04:00
Peter Eisentraut	325f2ec555	Handle heap rewrites even better in logical decoding Logical decoding should not publish anything about tables created as part of a heap rewrite during DDL. Those tables don't exist externally, so consumers of logical decoding cannot do anything sensible with that information. In `ab28feae2b`, we worked around this for built-in logical replication, but that was hack. This is a more proper fix: We mark such transient heaps using the new field pg_class.relwrite, linking to the original relation OID. By default, we ignore them in logical decoding before they get to the output plugin. Optionally, a plugin can register their interest in getting such changes, if they handle DDL specially, in which case the new field will help them get information about the actual table. Reviewed-by: Craig Ringer <craig@2ndquadrant.com>	2018-03-21 09:15:04 -04:00
Peter Eisentraut	f66e8bf875	Remove pg_class.relhaspkey It is not used for anything internally, and it cannot be relied on for external uses, so it can just be removed. To correct recommended way to check for a primary key is in pg_index. Discussion: https://www.postgresql.org/message-id/flat/b1a24c6c-6913-f89c-674e-0704f0ed69db@2ndquadrant.com	2018-03-14 15:31:34 -04:00
Peter Eisentraut	bcdd40538a	Fix typo Author: Daniel Gustafsson <daniel@yesql.se>	2018-03-07 09:02:57 -05:00
Tom Lane	a351679c80	Trivial adjustments in preparation for bootstrap data conversion. Rationalize a couple of macro names: * In catalog/pg_init_privs.h, rename Anum_pg_init_privs_privs to Anum_pg_init_privs_initprivs to match the column's actual name. * In ecpg, rename ZPBITOID to BITOID to match catalog/pg_type.h. This reduces reader confusion, and will allow us to generate these macros automatically in future. In catalog/pg_tablespace.h, fix the ordering of related DATA and #define lines to agree with how it's done elsewhere. This has no impact today, but simplifies life for the bootstrap data conversion scripts. John Naylor Discussion: https://postgr.es/m/CAJVSVGXnLH=BSo0x-aA818f=MyQqGS5nM-GDCWAMdnvQJTRC1A@mail.gmail.com	2018-03-03 11:23:33 -05:00
Peter Eisentraut	fd1a421fe6	Add prokind column, replacing proisagg and proiswindow The new column distinguishes normal functions, procedures, aggregates, and window functions. This replaces the existing columns proisagg and proiswindow, and replaces the convention that procedures are indicated by prorettype == 0. Also change prorettype to be VOIDOID for procedures. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz>	2018-03-02 13:48:33 -05:00
Tom Lane	8b29e88cdc	Add window RANGE support for float4, float8, numeric. Commit `0a459cec9` left this for later, but since time's running out, I went ahead and took care of it. There are more data types that somebody might someday want RANGE support for, but this is enough to satisfy all expectations of the SQL standard, which just says that "numeric, datetime, and interval" types should have RANGE support.	2018-02-24 13:23:38 -05:00
Peter Eisentraut	bc1adc651b	Fix filtering of unsupported relations in logical replication In the pgoutput plugin, skip changes for relations that are not publishable, per is_publishable_class(). This concerns in particular materialized views and information_schema tables. While those relations cannot be part of a publication, per existing checks, they will be considered by a FOR ALL TABLES publication. A subscription would not actually apply changes for those relations, again per existing checks, but trying to match incoming changes to local tables on the subscriber would lead to errors if no matching local table exists. Skipping those changes on the publisher avoids sending useless changes and eliminates the error. Bug: #15044 Reported-by: Chad Trabant <chad@iris.washington.edu> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2018-02-23 22:13:21 -05:00
Peter Eisentraut	10cfce34c0	Add user-callable SHA-2 functions Add the user-callable functions sha224, sha256, sha384, sha512. We already had these in the C code to support SCRAM, but there was no test coverage outside of the SCRAM tests. Adding these as user-callable functions allows writing some tests. Also, we have a user-callable md5 function but no more modern alternative, which led to wide use of md5 as a general-purpose hash function, which leads to occasional complaints about using md5. Also mark the existing md5 functions as leak-proof. Reviewed-by: Michael Paquier <michael@paquier.xyz>	2018-02-22 11:34:53 -05:00
Alvaro Herrera	eb7ed3f306	Allow UNIQUE indexes on partitioned tables If we restrict unique constraints on partitioned tables so that they must always include the partition key, then our standard approach to unique indexes already works --- each unique key is forced to exist within a single partition, so enforcing the unique restriction in each index individually is enough to have it enforced globally. Therefore we can implement unique indexes on partitions by simply removing a few restrictions (and adding others.) Discussion: https://postgr.es/m/20171222212921.hi6hg6pem2w2t36z@alvherre.pgsql Discussion: https://postgr.es/m/20171229230607.3iib6b62fn3uaf47@alvherre.pgsql Reviewed-by: Simon Riggs, Jesper Pedersen, Peter Eisentraut, Jaime Casanova, Amit Langote	2018-02-19 17:40:00 -03:00
Tom Lane	0a459cec96	Support all SQL:2011 options for window frame clauses. This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com	2018-02-07 00:06:56 -05:00
Robert Haas	9da0cc3528	Support parallel btree index builds. To make this work, tuplesort.c and logtape.c must also support parallelism, so this patch adds that infrastructure and then applies it to the particular case of parallel btree index builds. Testing to date shows that this can often be 2-3x faster than a serial index build. The model for deciding how many workers to use is fairly primitive at present, but it's better than not having the feature. We can refine it as we get more experience. Peter Geoghegan with some help from Rushabh Lathia. While Heikki Linnakangas is not an author of this patch, he wrote other patches without which this feature would not have been possible, and therefore the release notes should possibly credit him as an author of this feature. Reviewed by Claudio Freire, Heikki Linnakangas, Thomas Munro, Tels, Amit Kapila, me. Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com	2018-02-02 13:32:44 -05:00
Peter Eisentraut	8b9e9644dc	Replace AclObjectKind with ObjectType AclObjectKind was basically just another enumeration for object types, and we already have a preferred one for that. It's only used in aclcheck_error. By using ObjectType instead, we can also give some more precise error messages, for example "index" instead of "relation". Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2018-01-19 14:01:15 -05:00
Alvaro Herrera	8b08f7d482	Local partitioned indexes When CREATE INDEX is run on a partitioned table, create catalog entries for an index on the partitioned table (which is just a placeholder since the table proper has no data of its own), and recurse to create actual indexes on the existing partitions; create them in future partitions also. As a convenience gadget, if the new index definition matches some existing index in partitions, these are picked up and used instead of creating new ones. Whichever way these indexes come about, they become attached to the index on the parent table and are dropped alongside it, and cannot be dropped on isolation unless they are detached first. To support pg_dump'ing these indexes, add commands CREATE INDEX ON ONLY <table> (which creates the index on the parent partitioned table, without recursing) and ALTER INDEX ATTACH PARTITION (which is used after the indexes have been created individually on each partition, to attach them to the parent index). These reconstruct prior database state exactly. Reviewed-by: (in alphabetical order) Peter Eisentraut, Robert Haas, Amit Langote, Jesper Pedersen, Simon Riggs, David Rowley Discussion: https://postgr.es/m/20171113170646.gzweigyrgg6pwsg4@alvherre.pgsql	2018-01-19 11:49:22 -03:00
Robert Haas	29d58fd3ad	Transfer state pertaining to pending REINDEX operations to workers. This will allow the pending patch for parallel CREATE INDEX to work on system catalogs, and to provide the same level of protection against use of user indexes while they are being rebuilt that we have for non-parallel CREATE INDEX. Patch by me, reviewed by Peter Geoghegan. Discussion: http://postgr.es/m/CA+TgmoYN-YQU9JsGQcqFLovZ-C+Xgp1_xhJQad=cunGG-_p5gg@mail.gmail.com Discussion: http://postgr.es/m/CAH2-Wzkv4UNkXYhqQRqk-u9rS7h5c-4cCW+EqQ8K_WSeS43aZg@mail.gmail.com	2018-01-19 07:48:54 -05:00
Simon Riggs	9c7d06d606	Ability to advance replication slots Ability to advance both physical and logical replication slots using a new user function pg_replication_slot_advance(). For logical advance that means records are consumed as fast as possible and changes are not given to output plugin for sending. Makes 2nd phase (after we reached SNAPBUILD_FULL_SNAPSHOT) of replication slot creation faster, especially when there are big transactions as the reorder buffer does not have to deal with data changes and does not have to spill to disk. Author: Petr Jelinek Reviewed-by: Simon Riggs	2018-01-17 11:38:34 +00:00
Alvaro Herrera	49c784ece7	Remove hard-coded schema knowledge about pg_attribute from genbki.pl Add the ability to label a column's default value in the catalog header, and implement this for pg_attribute. A new function in Catalog.pm is used to fill in a tuple with defaults. The build process will complain loudly if a catalog entry is incomplete, Commit `8137f2c323` labeled variable length columns for the C preprocessor. Expose that label to genbki.pl so we can exclude those columns from schema macros in a general fashion. Also, format schema macro entries according to their types. This means slightly less code maintenance, but more importantly it's a proving ground for mechanisms intended to be used in later commits. While at it, I (Álvaro) couldn't resist making some changes in genbki.pl: rename some functions to actually indicate their purpose instead of actively misleading onlookers; and don't iterate on the whole of pg_type to find the entry for each catalog row, using a hash instead of an array. Author: John Naylor, some changes by Álvaro Herrera Discussion: https://postgr.es/m/CAJVSVGVJHwD8sfDfZW9TbCHWKf=C1YDRM-rF=2JenRU_y+VcFg@mail.gmail.com	2018-01-12 11:21:42 -03:00
Robert Haas	ef6087ee5f	Minor preparatory refactoring for UPDATE row movement. Generalize is_partition_attr to has_partition_attrs and make it accessible from outside tablecmds.c. Change map_partition_varattnos to clarify that it can be used for mapping between any two relations in a partitioning hierarchy, not just parent -> child. Amit Khandekar, reviewed by Amit Langote, David Rowley, and me. Some comment changes by me. Discussion: http://postgr.es/m/CAJ3gD9fWfxgKC+PfJZF3hkgAcNOy-LpfPxVYitDEXKHjeieWQQ@mail.gmail.com	2018-01-04 16:25:49 -05:00
Bruce Momjian	9d4649ca49	Update copyright for 2018 Backpatch-through: certain files through 9.3	2018-01-02 23:30:12 -05:00
Teodor Sigaev	ff963b393c	Add polygon opclass for SP-GiST Polygon opclass uses compress method feature of SP-GiST added earlier. For now it's a single operator class which uses this feature. SP-GiST actually indexes a bounding boxes of input polygons, so part of supported operations are lossy. Opclass uses most methods of corresponding opclass over boxes of SP-GiST and treats bounding boxes as point in 4D-space. Bump catalog version. Authors: Nikita Glukhov, Alexander Korotkov with minor editorization by me Reviewed-By: all authors + Darafei Praliaskouski Discussion: https://www.postgresql.org/message-id/flat/54907069.1030506@sigaev.ru	2017-12-25 18:59:38 +03:00
Alvaro Herrera	9373baa0f7	Minor edits to catalog files and scripts This fixes a few typos and small mistakes; it also cleans a few minor stylistic issues. The biggest functional change is that Gen_fmgrtab.pl no longer knows the OID of language 'internal'. Author: John Naylor Discussion: https://postgr.es/m/CAJVSVGXAkwbk-A9QHHHf00N905kKisyQbaYwKqaRpze_gPXGfg@mail.gmail.com	2017-12-21 19:07:32 -03:00
Peter Eisentraut	e4128ee767	SQL procedures This adds a new object type "procedure" that is similar to a function but does not have a return type and is invoked by the new CALL statement instead of SELECT or similar. This implementation is aligned with the SQL standard and compatible with or similar to other SQL implementations. This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well as ALTER/DROP ROUTINE that can refer to either a function or a procedure (or an aggregate function, as an extension to SQL). There is also support for procedures in various utility commands such as COMMENT and GRANT, as well as support in pg_dump and psql. Support for defining procedures is available in all the languages supplied by the core distribution. While this commit is mainly syntax sugar around existing functionality, future features will rely on having procedures as a separate object type. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>	2017-11-30 11:03:20 -05:00
Robert Haas	eaedf0df71	Update typedefs.list and re-run pgindent Discussion: http://postgr.es/m/CA+TgmoaA9=1RWKtBWpDaj+sF3Stgc8sHgf5z=KGtbjwPLQVDMA@mail.gmail.com	2017-11-29 09:24:24 -05:00
Robert Haas	be92769e4e	Set proargmodes for satisfies_hash_partition. It appears that proargmodes should always be set for variadic functions, but satifies_hash_partition had it as NULL. In addition to fixing the problem, add a regression test to guard against future mistakes of this type.	2017-11-17 11:53:00 -05:00
Robert Haas	4e5fe9ad19	Centralize executor-related partitioning code. Some code is moved from partition.c, which has grown very quickly lately; splitting the executor parts out might help to keep it from getting totally out of control. Other code is moved from execMain.c. All is moved to a new file execPartition.c. get_partition_for_tuple now has a new interface that more clearly separates executor concerns from generic concerns. Amit Langote. A slight comment tweak by me. Discussion: http://postgr.es/m/1f0985f8-3b61-8bc4-4350-baa6d804cb6d@lab.ntt.co.jp	2017-11-15 10:26:25 -05:00
Alvaro Herrera	a61f5ab986	Simplify index_[constraint_]create API Instead of passing large swaths of boolean arguments, define some flags that can be used in a bitmask. This makes it easier not only to figure out what each call site is doing, but also to add some new flags. The flags are split in two -- one set for index_create directly and another for constraints. index_create() itself receives both, and then passes down the latter to index_constraint_create(), which can also be called standalone. Discussion: https://postgr.es/m/20171023151251.j75uoe27gajdjmlm@alvherre.pgsql Reviewed-by: Simon Riggs	2017-11-14 15:19:05 +01:00
Peter Eisentraut	0e1539ba0d	Add some const decorations to prototypes Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>	2017-11-10 13:38:57 -05:00
Robert Haas	1aba8e651a	Add hash partitioning. Hash partitioning is useful when you want to partition a growing data set evenly. This can be useful to keep table sizes reasonable, which makes maintenance operations such as VACUUM faster, or to enable partition-wise join. At present, we still depend on constraint exclusion for partitioning pruning, and the shape of the partition constraints for hash partitioning is such that that doesn't work. Work is underway to fix that, which should both improve performance and make partitioning pruning work with hash partitioning. Amul Sul, reviewed and tested by Dilip Kumar, Ashutosh Bapat, Yugo Nagata, Rajkumar Raghuwanshi, Jesper Pedersen, and by me. A few final tweaks also by me. Discussion: http://postgr.es/m/CAAJ_b96fhpJAP=ALbETmeLk1Uni_GFZD938zgenhF49qgDTjaQ@mail.gmail.com	2017-11-09 18:07:44 -05:00
Tom Lane	5ecc0d738e	Restrict lo_import()/lo_export() via SQL permissions not hard-wired checks. While it's generally unwise to give permissions on these functions to anyone but a superuser, we've been moving away from hard-wired permission checks inside functions in favor of using the SQL permission system to control access. Bring lo_import() and lo_export() into compliance with that approach. In particular, this removes the manual configuration option ALLOW_DANGEROUS_LO_FUNCTIONS. That dates back to 1999 (commit `4cd4a54c8`); it's unlikely anyone has used it in many years. Moreover, if you really want such behavior, now you can get it with GRANT ... TO PUBLIC instead. Michael Paquier Discussion: https://postgr.es/m/CAB7nPqRHmNOYbETnc_2EjsuzSM00Z+BWKv9sy6tnvSd5gWT_JA@mail.gmail.com	2017-11-09 12:36:58 -05:00
Peter Eisentraut	2eb4a831e5	Change TRUE/FALSE to true/false The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-11-08 11:37:28 -05:00
Simon Riggs	4b0d28de06	Remove secondary checkpoint Previously server reserved WAL for last two checkpoints, which used too much disk space for small servers. Bumps PG_CONTROL_VERSION Author: Simon Riggs <simon@2ndQuadrant.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-11-07 12:56:30 -05:00
Dean Rasheed	87b2ebd352	Always require SELECT permission for ON CONFLICT DO UPDATE. The update path of an INSERT ... ON CONFLICT DO UPDATE requires SELECT permission on the columns of the arbiter index, but it failed to check for that in the case of an arbiter specified by constraint name. In addition, for a table with row level security enabled, it failed to check updated rows against the table's SELECT policies when the update path was taken (regardless of how the arbiter index was specified). Backpatch to 9.5 where ON CONFLICT DO UPDATE and RLS were introduced. Security: CVE-2017-15099	2017-11-06 09:19:22 +00:00
Tom Lane	be0ebb65f5	Allow the built-in ordered-set aggregates to share transition state. The built-in OSAs all share the same transition function, so they can share transition state as long as the final functions cooperate to not do the sort step more than once. To avoid running the tuplesort object in randomAccess mode unnecessarily, add a bit of infrastructure to nodeAgg.c to let the aggregate functions find out whether the transition state is actually being shared or not. This doesn't work for the hypothetical aggregates, since those inject a hypothetical row that isn't traceable to the shared input state. So they remain marked aggfinalmodify = 'w'. Discussion: https://postgr.es/m/CAB4ELO5RZhOamuT9Xsf72ozbenDLLXZKSk07FiSVsuJNZB861A@mail.gmail.com	2017-10-16 15:51:23 -04:00
Tom Lane	4de2d4fba3	Explicitly track whether aggregate final functions modify transition state. Up to now, there's been hard-wired assumptions that normal aggregates' final functions never modify their transition states, while ordered-set aggregates' final functions always do. This has always been a bit limiting, and in particular it's getting in the way of improving the built-in ordered-set aggregates to allow merging of transition states. Therefore, let's introduce catalog and CREATE AGGREGATE infrastructure that lets the finalfn's behavior be declared explicitly. There are now three possibilities for the finalfn behavior: it's purely read-only, it trashes the transition state irrecoverably, or it changes the state in such a way that no more transfn calls are possible but the state can still be passed to other, compatible finalfns. There are no examples of this third case today, but we'll shortly make the built-in OSAs act like that. This change allows user-defined aggregates to explicitly disclaim support for use as window functions, and/or to prevent transition state merging, if their implementations cannot handle that. While it was previously possible to handle the window case with a run-time error check, there was not any way to prevent transition state merging, which in retrospect is something commit `804163bc2` should have provided for. But better late than never. In passing, split out pg_aggregate.c's extern function declarations into a new header file pg_aggregate_fn.h, similarly to what we've done for some other catalog headers, so that pg_aggregate.h itself can be safe for frontend files to include. This lets pg_dump use the symbolic names for relevant constants. Discussion: https://postgr.es/m/4834.1507849699@sss.pgh.pa.us	2017-10-14 15:21:39 -04:00
Robert Haas	45866c7550	Copy information from the relcache instead of pointing to it. We have the relations continuously locked, but not open, so relcache pointers are not guaranteed to be stable. Per buildfarm member prion. Ashutosh Bapat. I fixed a typo. Discussion: http://postgr.es/m/CAFjFpRcRBqoKLZSNmRsjKr81uEP=ennvqSQaXVCCBTXvJ2rW+Q@mail.gmail.com	2017-10-06 15:28:07 -04:00
Tom Lane	c12d570fa1	Support arrays over domains. Allowing arrays with a domain type as their element type was left un-done in the original domain patch, but not for any very good reason. This omission leads to such surprising results as array_agg() not working on a domain column, because the parser can't identify a suitable output type for the polymorphic aggregate. In order to fix this, first clean up the APIs of coerce_to_domain() and some internal functions in parse_coerce.c so that we consistently pass around a CoercionContext along with CoercionForm. Previously, we sometimes passed an "isExplicit" boolean flag instead, which is strictly less information; and coerce_to_domain() didn't even get that, but instead had to reverse-engineer isExplicit from CoercionForm. That's contrary to the documentation in primnodes.h that says that CoercionForm only affects display and not semantics. I don't think this change fixes any live bugs, but it makes things more consistent. The main reason for doing it though is that now build_coercion_expression() receives ccontext, which it needs in order to be able to recursively invoke coerce_to_target_type(). Next, reimplement ArrayCoerceExpr so that the node does not directly know any details of what has to be done to the individual array elements while performing the array coercion. Instead, the per-element processing is represented by a sub-expression whose input is a source array element and whose output is a target array element. This simplifies life in parse_coerce.c, because it can build that sub-expression by a recursive invocation of coerce_to_target_type(). The executor now handles the per-element processing as a compiled expression instead of hard-wired code. The main advantage of this is that we can use a single ArrayCoerceExpr to handle as many as three successive steps per element: base type conversion, typmod coercion, and domain constraint checking. The old code used two stacked ArrayCoerceExprs to handle type + typmod coercion, which was pretty inefficient, and adding yet another array deconstruction to do domain constraint checking seemed very unappetizing. In the case where we just need a single, very simple coercion function, doing this straightforwardly leads to a noticeable increase in the per-array-element runtime cost. Hence, add an additional shortcut evalfunc in execExprInterp.c that skips unnecessary overhead for that specific form of expression. The runtime speed of simple cases is within 1% or so of where it was before, while cases that previously required two levels of array processing are significantly faster. Finally, create an implicit array type for every domain type, as we do for base types, enums, etc. Everything except the array-coercion case seems to just work without further effort. Tom Lane, reviewed by Andrew Dunstan Discussion: https://postgr.es/m/9852.1499791473@sss.pgh.pa.us	2017-09-30 13:40:56 -04:00
Tom Lane	28e0727076	Revert to 9.6 treatment of ALTER TYPE enumtype ADD VALUE. This reverts commit `15bc038f9`, along with the followon commits `1635e80d3` and `984c92074` that tried to clean up the problems exposed by bug #14825. The result was incomplete because it failed to address parallel-query requirements. With 10.0 release so close upon us, now does not seem like the time to be adding more code to fix that. I hope we can un-revert this code and add the missing parallel query support during the v11 cycle. Back-patch to v10. Discussion: https://postgr.es/m/20170922185904.1448.16585@wrigleys.postgresql.org	2017-09-27 16:14:43 -04:00
Tom Lane	1635e80d30	Use a blacklist to distinguish original from add-on enum values. Commit `15bc038f9` allowed ALTER TYPE ADD VALUE to be executed inside transaction blocks, by disallowing the use of the added value later in the same transaction, except under limited circumstances. However, the test for "limited circumstances" was heuristic and could reject references to enum values that were created during CREATE TYPE AS ENUM, not just later. This breaks the use-case of restoring pg_dump scripts in a single transaction, as reported in bug #14825 from Balazs Szilfai. We can improve this by keeping a "blacklist" table of enum value OIDs created by ALTER TYPE ADD VALUE during the current transaction. Any visible-but-uncommitted value whose OID is not in the blacklist must have been created by CREATE TYPE AS ENUM, and can be used safely because it could not have a lifespan shorter than its parent enum type. This change also removes the restriction that a renamed enum value can't be used before being committed (unless it was on the blacklist). Andrew Dunstan, with cosmetic improvements by me. Back-patch to v10. Discussion: https://postgr.es/m/20170922185904.1448.16585@wrigleys.postgresql.org	2017-09-26 13:14:46 -04:00
Andres Freund	fc49e24fa6	Make WAL segment size configurable at initdb time. For performance reasons a larger segment size than the default 16MB can be useful. A larger segment size has two main benefits: Firstly, in setups using archiving, it makes it easier to write scripts that can keep up with higher amounts of WAL, secondly, the WAL has to be written and synced to disk less frequently. But at the same time large segment size are disadvantageous for smaller databases. So far the segment size had to be configured at compile time, often making it unrealistic to choose one fitting to a particularly load. Therefore change it to a initdb time setting. This includes a breaking changes to the xlogreader.h API, which now requires the current segment size to be configured. For that and similar reasons a number of binaries had to be taught how to recognize the current segment size. Author: Beena Emerson, editorialized by Andres Freund Reviewed-By: Andres Freund, David Steele, Kuntal Ghosh, Michael Paquier, Peter Eisentraut, Robert Hass, Tushar Ahuja Discussion: https://postgr.es/m/CAOG9ApEAcQ--1ieKbhFzXSQPw_YLmepaa4hNdnY5+ZULpt81Mw@mail.gmail.com	2017-09-19 22:03:48 -07:00
Tom Lane	2d484f9b05	Remove no-op GiST support functions in the core GiST opclasses. The preceding patch allowed us to remove useless GiST support functions. This patch actually does that for all the no-op cases in the core GiST code. This buys us whatever performance gain is to be had, and more importantly exercises the preceding patch. There remain no-op functions in the contrib GiST opclasses, but those will take more work to remove. Discussion: https://postgr.es/m/CAJEAwVELVx9gYscpE=Be6iJxvdW5unZ_LkcAaVNSeOwvdwtD=A@mail.gmail.com	2017-09-19 23:32:59 -04:00
Tom Lane	7d08ce286c	Distinguish selectivity of < from <= and > from >=. Historically, the selectivity functions have simply not distinguished < from <=, or > from >=, arguing that the fraction of the population that satisfies the "=" aspect can be considered to be vanishingly small, if the comparison value isn't any of the most-common-values for the variable. (If it is, the code path that executes the operator against each MCV will take care of things properly.) But that isn't really true unless we're dealing with a continuum of variable values, and in practice we seldom are. If "x = const" would estimate a nonzero number of rows for a given const value, then it follows that we ought to estimate different numbers of rows for "x < const" and "x <= const", even if the const is not one of the MCVs. Handling this more honestly makes a significant difference in edge cases, such as the estimate for a tight range (x BETWEEN y AND z where y and z are close together). Hence, split scalarltsel into scalarltsel/scalarlesel, and similarly split scalargtsel into scalargtsel/scalargesel. Adjust <= and >= operator definitions to reference the new selectivity functions. Improve the core ineq_histogram_selectivity() function to make a correction for equality. (Along the way, I learned quite a bit about exactly why that function gives good answers, which I tried to memorialize in improved comments.) The corresponding join selectivity functions were, and remain, just stubs. But I chose to split them similarly, to avoid confusion and to prevent the need for doing this exercise again if someone ever makes them less stubby. In passing, change ineq_histogram_selectivity's clamp for extreme probability estimates so that it varies depending on the histogram size, instead of being hardwired at 0.0001. With the default histogram size of 100 entries, you still get the old clamp value, but bigger histograms should allow us to put more faith in edge values. Tom Lane, reviewed by Aleksander Alekseev and Kuntal Ghosh Discussion: https://postgr.es/m/12232.1499140410@sss.pgh.pa.us	2017-09-13 11:12:39 -04:00
Peter Eisentraut	821fb8cdbf	Message style fixes	2017-09-11 11:21:27 -04:00
Robert Haas	6f6b99d133	Allow a partitioned table to have a default partition. Any tuples that don't route to any other partition will route to the default partition. Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert Haas, with review and testing at various stages by (at least) Rushabh Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar. Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com	2017-09-08 17:28:04 -04:00
Robert Haas	81c5e46c49	Introduce 64-bit hash functions with a 64-bit seed. This will be useful for hash partitioning, which needs a way to seed the hash functions to avoid problems such as a hash index on a hash partitioned table clumping all values into a small portion of the bucket space; it's also useful for anything that wants a 64-bit hash value rather than a 32-bit hash value. Just in case somebody wants a 64-bit hash value that is compatible with the existing 32-bit hash values, make the low 32-bits of the 64-bit hash value match the 32-bit hash value when the seed is 0. Robert Haas and Amul Sul Discussion: http://postgr.es/m/CA+Tgmoafx2yoJuhCQQOL5CocEi-w_uG4S2xT0EtgiJnPGcHW3g@mail.gmail.com	2017-08-31 22:21:21 -04:00
Robert Haas	54cde0c4c0	Don't lock tables in RelationGetPartitionDispatchInfo. Instead, lock them in the caller using find_all_inheritors so that they get locked in the standard order, minimizing deadlock risks. Also in RelationGetPartitionDispatchInfo, avoid opening tables which are not partitioned; there's no need. Amit Langote, reviewed by Ashutosh Bapat and Amit Khandekar Discussion: http://postgr.es/m/91b36fa1-c197-b72f-ca6e-56c593bae68c@lab.ntt.co.jp	2017-08-17 15:43:09 -04:00
Robert Haas	e139f1953f	Assorted preparatory refactoring for partition-wise join. Instead of duplicating the logic to search for a matching ParamPathInfo in multiple places, factor it out into a separate function. Pass only the relevant bits of the PartitionKey to partition_bounds_equal instead of the whole thing, because partition-wise join will want to call this without having a PartitionKey available. Adjust allow_star_schema_join and calc_nestloop_required_outer to take relevant Relids rather than the entire Path, because partition-wise join will want to call it with the top-parent relids to determine whether a child join is allowable. Ashutosh Bapat. Review and testing of the larger patch set of which this is a part by Amit Langote, Rajkumar Raghuwanshi, Rafia Sabih, Thomas Munro, Dilip Kumar, and me. Discussion: http://postgr.es/m/CA+TgmobQK80vtXjAsPZWWXd7c8u13G86gmuLupN+uUJjA+i4nA@mail.gmail.com	2017-08-15 12:30:38 -04:00
Robert Haas	610e8ebb0f	Teach map_partition_varattnos to handle whole-row expressions. Otherwise, partitioned tables with RETURNING expressions or subject to a WITH CHECK OPTION do not work properly. Amit Langote, reviewed by Amit Khandekar and Etsuro Fujita. A few comment changes by me. Discussion: http://postgr.es/m/9a39df80-871e-6212-0684-f93c83be4097@lab.ntt.co.jp	2017-08-03 11:21:29 -04:00
Dean Rasheed	d363d42bb9	Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition bounds. Previously, UNBOUNDED meant no lower bound when used in the FROM list, and no upper bound when used in the TO list, which was OK for single-column range partitioning, but problematic with multiple columns. For example, an upper bound of (10.0, UNBOUNDED) would not be collocated with a lower bound of (10.0, UNBOUNDED), thus making it difficult or impossible to define contiguous multi-column range partitions in some cases. Fix this by using MINVALUE and MAXVALUE instead of UNBOUNDED to represent a partition column that is unbounded below or above respectively. This syntax removes any ambiguity, and ensures that if one partition's lower bound equals another partition's upper bound, then the partitions are contiguous. Also drop the constraint prohibiting finite values after an unbounded column, and just document the fact that any values after MINVALUE or MAXVALUE are ignored. Previously it was necessary to repeat UNBOUNDED multiple times, which was needlessly verbose. Note: Forces a post-PG 10 beta2 initdb. Report by Amul Sul, original patch by Amit Langote with some additional hacking by me. Discussion: https://postgr.es/m/CAAJ_b947mowpLdxL3jo3YLKngRjrq9+Ej4ymduQTfYR+8=YAYQ@mail.gmail.com	2017-07-21 09:20:47 +01:00
Tom Lane	3cb29c42f9	Add static assertions about pg_control fitting into one disk sector. When pg_control was first designed, sizeof(ControlFileData) was small enough that a comment seemed like plenty to document the assumption that it'd fit into one disk sector. Now it's nearly 300 bytes, raising the possibility that somebody would carelessly add enough stuff to create a problem. Let's add a StaticAssertStmt() to ensure that the situation doesn't pass unnoticed if it ever occurs. While at it, rename PG_CONTROL_SIZE to PG_CONTROL_FILE_SIZE to make it clearer what that symbol means, and convert the existing runtime comparisons of sizeof(ControlFileData) vs. PG_CONTROL_FILE_SIZE to be static asserts --- we didn't have that technology when this code was first written. Discussion: https://postgr.es/m/9192.1500490591@sss.pgh.pa.us	2017-07-19 16:16:57 -04:00
Andrew Gierth	501ed02cf6	Fix transition tables for partition/inheritance. We disallow row-level triggers with transition tables on child tables. Transition tables for triggers on the parent table contain only those columns present in the parent. (We can't mix tuple formats in a single transition table.) Patch by Thomas Munro Discussion: https://postgr.es/m/CA%2BTgmoZzTBBAsEUh4MazAN7ga%3D8SsMC-Knp-6cetts9yNZUCcg%40mail.gmail.com	2017-06-28 18:55:03 +01:00
Tom Lane	ddb5fdc068	Further hacking on ICU collation creation and usage. pg_import_system_collations() refused to create any ICU collations if the current database's encoding didn't support ICU. This is wrongheaded: initdb must initialize pg_collation in an encoding-independent way since it might be used in other databases with different encodings. The reason for the restriction seems to be that get_icu_locale_comment() used icu_from_uchar() to convert the UChar-format display name, and that unsurprisingly doesn't know what to do in unsupported encodings. But by the same token that the initial catalog contents must be encoding-independent, we can't allow non-ASCII characters in the comment strings. So we don't really need icu_from_uchar() here: just check for Unicode codes outside the ASCII range, and if there are none, the format conversion is trivial. If there are some, we can simply not install the comment. (In my testing, this affects only Norwegian Bokmål, which has given us trouble before.) For paranoia's sake, also check for non-ASCII characters in ICU locale names, and skip such locales, as we do for libc locales. I don't currently have a reason to believe that this will ever reject anything, but then again the libc maintainers should have known better too. With just the import changes, ICU collations can be found in pg_collation in databases with unsupported encodings. This resulted in more or less clean failures at runtime, but that's not how things act for unsupported encodings with libc collations. Make it work the same as our traditional behavior for libc collations by having collation lookup take into account whether is_encoding_supported_by_icu(). Adjust documentation to match. Also, expand Table 23.1 to show which encodings are supported by ICU. catversion bump because of likely change in pg_collation/pg_description initial contents in ICU-enabled builds. Discussion: https://postgr.es/m/20c74bc3-d6ca-243d-1bbc-12f17fa4fe9a@gmail.com	2017-06-24 13:54:23 -04:00
Tom Lane	0b13b2a771	Rethink behavior of pg_import_system_collations(). Marco Atzeri reported that initdb would fail if "locale -a" reported the same locale name more than once. All previous versions of Postgres implicitly de-duplicated the results of "locale -a", but the rewrite to move the collation import logic into C had lost that property. It had also lost the property that locale names matching built-in collation names were silently ignored. The simplest way to fix this is to make initdb run the function in if-not-exists mode, which means that there's no real use-case for non if-not-exists mode; we might as well just drop the boolean argument and simplify the function's definition to be "add any collations not already known". This change also gets rid of some odd corner cases caused by the fact that aliases were added in if-not-exists mode even if the function argument said otherwise. While at it, adjust the behavior so that pg_import_system_collations() doesn't spew "collation foo already exists, skipping" messages during a re-run; that's completely unhelpful, especially since there are often hundreds of them. And make it return a count of the number of collations it did add, which seems like it might be helpful. Also, re-integrate the previous coding's property that it would make a deterministic selection of which alias to use if there were conflicting possibilities. This would only come into play if "locale -a" reports multiple equivalent locale names, say "de_DE.utf8" and "de_DE.UTF-8", but that hardly seems out of the question. In passing, fix incorrect behavior in pg_import_system_collations()'s ICU code path: it neglected CommandCounterIncrement, which would result in failures if ICU returns duplicate names, and it would try to create comments even if a new collation hadn't been created. Also, reorder operations in initdb so that the 'ucs_basic' collation is created before calling pg_import_system_collations() not after. This prevents a failure if "locale -a" were to report a locale named that. There's no reason to think that that ever happens in the wild, but the old coding would have survived it, so let's be equally robust. Discussion: https://postgr.es/m/20c74bc3-d6ca-243d-1bbc-12f17fa4fe9a@gmail.com	2017-06-23 14:19:58 -04:00
Tom Lane	382ceffdf7	Phase 3 of pgindent updates. Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:35:54 -04:00
Tom Lane	c7b8998ebb	Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit `e3860ffa4d` wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:19:25 -04:00
Tom Lane	e3860ffa4d	Initial pgindent run with pg_bsd_indent version 2.0. The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are: * Nicer formatting of function-pointer declarations. * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof. * No longer wants to add a space in "struct structname varname", as well as some similar cases for const- or volatile-qualified pointers. Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely. * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code. * Fixes some odd decisions for comments following case labels. * Fixes some cases where comments following code were indented to less than the expected column 33. On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself. There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 14:39:04 -04:00
Tom Lane	9ef2dbefc7	Final pgindent run with old pg_bsd_indent (version 1.3). This is just to have a clean basis for comparison with the results of the new version (which will indeed end up reverting some of these changes...) Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 14:09:24 -04:00
Peter Eisentraut	a2141c42f9	Tweak publication fetching in psql Viewing a table with \d in psql also shows the publications at table is in. If a publication is concurrently dropped, this shows an error, because the view pg_publication_tables internally uses pg_get_publication_tables(), which uses a catalog snapshot. This can be particularly annoying if a for-all-tables publication is concurrently dropped. To avoid that, write the query in psql differently. Expose the function pg_relation_is_publishable() to SQL and write the query using that. That still has a risk of being affected by concurrent catalog changes, but in this case it would be a table drop that causes problems, and then the psql \d command wouldn't be interesting anymore anyway. Reported-by: Tom Lane <tgl@sss.pgh.pa.us>	2017-06-20 12:35:02 -04:00
Peter Eisentraut	20d7d68b09	Change pg_get_publication_tables to prosecdef false This was apparently a mistake in the original commit. Reported-by: Tom Lane <tgl@sss.pgh.pa.us>	2017-06-20 10:03:35 -04:00
Noah Misch	39ac55918f	Reconcile nodes/*funcs.c with PostgreSQL 10 work. The _equalTableFunc() omission of coltypmods has semantic significance, but I did not track down resulting user-visible bugs, if any. The other changes are cosmetic only, affecting order. catversion bump due to readfuncs.c field order change.	2017-06-16 00:16:11 -07:00
Tom Lane	0436f6bde8	Disallow set-returning functions inside CASE or COALESCE. When we reimplemented SRFs in commit `69f4b9c85`, our initial choice was to allow the behavior to vary from historical practice in cases where a SRF call appeared within a conditional-execution construct (currently, only CASE or COALESCE). But that was controversial to begin with, and subsequent discussion has resulted in a consensus that it's better to throw an error instead of executing the query differently from before, so long as we can provide a reasonably clear error message and a way to rewrite the query. Hence, add a parser mechanism to allow detection of such cases during parse analysis. The mechanism just requires storing, in the ParseState, a pointer to the set-returning FuncExpr or OpExpr most recently emitted by parse analysis. Then the parsing functions for CASE and COALESCE can detect the presence of a SRF in their arguments by noting whether this pointer changes while analyzing their arguments. Furthermore, if it does, it provides a suitable error cursor location for the complaint. (This means that if there's more than one SRF in the arguments, the error will point at the last one to be analyzed not the first. While connoisseurs of parsing behavior might find that odd, it's unlikely the average user would ever notice.) While at it, we can also provide more specific error messages than before about some pre-existing restrictions, such as no-SRFs-within-aggregates. Also, reject at parse time cases where a NULLIF or IS DISTINCT FROM construct would need to return a set. We've never supported that, but the restriction is depended on in more subtle ways now, so it seems wise to detect it at the start. Also, provide some documentation about how to rewrite a SRF-within-CASE query using a custom wrapper SRF. It turns out that the information_schema.user_mapping_options view contained an instance of exactly the behavior we're now forbidding; but rewriting it makes it more clear and safer too. initdb forced because of user_mapping_options change. Patch by me, with error message suggestions from Alvaro Herrera and Andres Freund, pursuant to a complaint from Regina Obe. Discussion: https://postgr.es/m/000001d2d5de$d8d66170$8a832450$@pcorp.us	2017-06-13 23:46:39 -04:00
Peter Eisentraut	ec7129b781	Fix collprovider of predefined collations An earlier version of the patch had collprovider as an integer and thus set these to 0, but the correct setting is now null.	2017-06-13 08:55:09 -04:00
Andrew Dunstan	f7e6853e1a	Mark to_tsvector(regconfig,json[b]) functions immutable This make them consistent with the text function and means they can be used in functional indexes. Catalog version bumped. Per gripe from Josh Berkus.	2017-06-08 15:47:10 -04:00
Peter Eisentraut	644ea35fc1	Fix updating of pg_subscription_rel from workers A logical replication worker should not insert new rows into pg_subscription_rel, only update existing rows, so that there are no races if a concurrent refresh removes rows. Adjust the API to be able to choose that behavior. Author: Masahiko Sawada <sawada.mshk@gmail.com> Reported-by: tushar <tushar.ahuja@enterprisedb.com>	2017-06-07 13:49:14 -04:00
Tom Lane	80f583ffe9	Fix omission of locations in outfuncs/readfuncs partitioning node support. We could have limped along without this for v10, which was my intention when I annotated the bug in commit `76a3df6e5`. But consensus is that it's better to fix it now and take the cost of a post-beta1 initdb (which is needed because these node types are stored in pg_class.relpartbound). Since we're forcing initdb anyway, take the opportunity to make the node type identification strings match the node struct names, instead of being randomly different from them. Discussion: https://postgr.es/m/E1dFBEX-0004wt-8t@gemulon.postgresql.org	2017-05-30 11:32:41 -04:00
Tom Lane	76a3df6e5e	Code review focused on new node types added by partitioning support. Fix failure to check that we got a plain Const from const-simplification of a coercion request. This is the cause of bug #14666 from Tian Bing: there is an int4 to money cast, but it's only stable not immutable (because of dependence on lc_monetary), resulting in a FuncExpr that the code was miserably unequipped to deal with, or indeed even to notice that it was failing to deal with. Add test cases around this coercion behavior. In view of the above, sprinkle the code liberally with castNode() macros, in hope of catching the next such bug a bit sooner. Also, change some functions that were randomly declared to take Node* to take more specific pointer types. And change some struct fields that were declared Node* but could be given more specific types, allowing removal of assorted explicit casts. Place PARTITION_MAX_KEYS check a bit closer to the code it's protecting. Likewise check only-one-key-for-list-partitioning restriction in a less random place. Avoid not-per-project-style usages like !strcmp(...). Fix assorted failures to avoid scribbling on the input of parse transformation. I'm not sure how necessary this is, but it's entirely silly for these functions to be expending cycles to avoid that and not getting it right. Add guards against partitioning on system columns. Put backend/nodes/ support code into an order that matches handling of these node types elsewhere. Annotate the fact that somebody added location fields to PartitionBoundSpec and PartitionRangeDatum but forgot to handle them in outfuncs.c/readfuncs.c. This is fairly harmless for production purposes (since readfuncs.c would just substitute -1 anyway) but it's still bogus. It's not worth forcing a post-beta1 initdb just to fix this, but if we have another reason to force initdb before 10.0, we should go back and clean this up. Contrariwise, somebody added location fields to PartitionElem and PartitionSpec but forgot to teach exprLocation() about them. Consolidate duplicative code in transformPartitionBound(). Improve a couple of error messages. Improve assorted commentary. Re-pgindent the files touched by this patch; this affects a few comment blocks that must have been added quite recently. Report: https://postgr.es/m/20170524024550.29935.14396@wrigleys.postgresql.org	2017-05-28 23:20:28 -04:00
Bruce Momjian	a6fd7b7a5f	Post-PG 10 beta1 pgindent run perltidy run not included.	2017-05-17 16:31:56 -04:00
Tom Lane	c079673dcb	Preventive maintenance in advance of pgindent run. Reformat various places in which pgindent will make a mess, and fix a few small violations of coding style that I happened to notice while perusing the diffs from a pgindent dry run. There is one actual bug fix here: the need-to-enlarge-the-buffer code path in icu_convert_case was obviously broken. Perhaps it's unreachable in our usage? Or maybe this is just sadly undertested.	2017-05-16 20:36:35 -04:00
Tom Lane	f04c9a6146	Standardize terminology for pg_statistic_ext entries. Consistently refer to such an entry as a "statistics object", not just "statistics" or "extended statistics". Previously we had a mismash of terms, accompanied by utter confusion as to whether the term was singular or plural. That's not only grating (at least to the ear of a native English speaker) but could be outright misleading, eg in error messages that seemed to be referring to multiple objects where only one could be meant. This commit fixes the code and a lot of comments (though I may have missed a few). I also renamed two new SQL functions, pg_get_statisticsextdef -> pg_get_statisticsobjdef pg_statistic_ext_is_visible -> pg_statistics_obj_is_visible to conform better with this terminology. I have not touched the SGML docs other than fixing those function names; the docs certainly need work but it seems like a separable task. Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us	2017-05-14 10:55:01 -04:00
Robert Haas	1848b73d45	Teach \d+ to show partitioning constraints. The fact that we didn't have this in the first place is likely why the problem fixed by `f8bffe9e6d` escaped detection. Patch by Amit Langote, reviewed and slightly adjusted by me. Discussion: http://postgr.es/m/CA+TgmoYWnV2GMnYLG-Czsix-E1WGAbo4D+0tx7t9NdfYBDMFsA@mail.gmail.com	2017-05-13 12:04:53 -04:00
Alvaro Herrera	d99d58cdc8	Complete tab completion for DROP STATISTICS Tab-completing DROP STATISTICS would only work if you started writing the schema name containing the statistics object, because the visibility clause was missing. To add it, we need to add SQL-callable support for testing visibility of a statistics object, like all other object types already have. Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us	2017-05-13 01:05:48 -03:00
Tom Lane	928c4de309	Fix dependencies for extended statistics objects. A stats object ought to have a dependency on each individual column it reads, not the entire table. Doing this honestly lets us get rid of the hard-wired logic in RemoveStatisticsExt, which seems to have been misguidedly modeled on RemoveStatistics; and it will be far easier to extend to multiple tables later. Also, add overlooked dependency on owner, and make the dependency on schema be NORMAL like every other such dependency. There remains some unfinished work here, which is to allow statistics objects to be extension members. That takes more effort than just adding the dependency call, though, so I left it out for now. initdb forced because this changes the set of pg_depend records that should exist for a statistics object. Discussion: https://postgr.es/m/22676.1494557205@sss.pgh.pa.us	2017-05-12 16:26:31 -04:00
Tom Lane	d10c626de4	Rename WAL-related functions and views to use "lsn" not "location". Per discussion, "location" is a rather vague term that could refer to multiple concepts. "LSN" is an unambiguous term for WAL locations and should be preferred. Some function names, view column names, and function output argument names used "lsn" already, but others used "location", as well as yet other terms such as "wal_position". Since we've already renamed a lot of things in this area from "xlog" to "wal" for v10, we may as well incur a bit more compatibility pain and make these names all consistent. David Rowley, minor additional docs hacking by me Discussion: https://postgr.es/m/CAKJS1f8O0njDKe8ePFQ-LK5-EjwThsDws6ohJ-+c6nWK+oUxtg@mail.gmail.com	2017-05-11 11:49:59 -04:00
Peter Eisentraut	013c1178fd	Remove the NODROP SLOT option from DROP SUBSCRIPTION It turned out this approach had problems, because a DROP command should not have any options other than CASCADE and RESTRICT. Instead, always attempt to drop the slot if there is one configured, but also add an ALTER SUBSCRIPTION action to set the slot to NONE. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/29431.1493730652@sss.pgh.pa.us	2017-05-09 10:20:42 -04:00
Tom Lane	da07596006	Further patch rangetypes_selfuncs.c's statistics slot management. Values in a STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM slot are float8, not of the type of the column the statistics are for. This bug is at least partly the fault of sloppy specification comments for get_attstatsslot()/free_attstatsslot(): the type OID they want is that of the stavalues entries, not of the underlying column. (I double-checked other callers and they seem to get this right.) Adjust the comments to be more correct. Per buildfarm. Security: CVE-2017-7484	2017-05-08 15:03:14 -04:00
Heikki Linnakangas	68e61ee72e	Change the on-disk format of SCRAM verifiers to conform to RFC 5803. It doesn't make any immediate difference to PostgreSQL, but might as well follow the standard, since one exists. (I looked at RFC 5803 earlier, but didn't fully understand it back then.) The new format uses Base64 instead of hex to encode StoredKey and ServerKey, which makes the verifiers slightly smaller. Using the same encoding for the salt and the keys also means that you only need one encoder/decoder instead of two. Although we have code in the backend to do both, we are talking about teaching libpq how to create SCRAM verifiers for PQencodePassword(), and libpq doesn't currently have any code for hex encoding. Bump catversion, because this renders any existing SCRAM verifiers in pg_authid invalid. Discussion: https://www.postgresql.org/message-id/351ba574-85ea-d9b8-9689-8c928dd0955d@iki.fi	2017-04-21 22:51:57 +03:00
Fujii Masao	88b0a31926	Mark some columns in pg_subscription as NOT NULL. In pg_subscription, subconninfo, subslotname, subsynccommit and subpublications are expected not to be NULL. Therefore this patch adds BKI_FORCE_NOT_NULL markings to them. This patch is basically unnecessary unless the code has a bug which wrongly sets either of those columns to NULL. But it's good to have this as a safeguard. Author: Masahiko Sawada Reviewed-by: Kyotaro Horiguchi Reported-by: Fujii Masao Discussion: http://postgr.es/m/CAHGQGwFDWh_Qr-q_GEMpD+qH=vYPMdVqw=ZOSY3kX_Pna9R9SA@mail.gmail.com	2017-04-20 23:35:30 +09:00
Alvaro Herrera	ee6922112e	Rename columns in new pg_statistic_ext catalog The new catalog reused a column prefix "sta" from pg_statistic, but this is undesirable, so change the catalog to use prefix "stx" instead. Also, rename the column that lists enabled statistic kinds as "stxkind" rather than "enabled". Discussion: https://postgr.es/m/CAKJS1f_2t5jhSN7huYRFH3w3rrHfG2QU7hiUHsu-Vdjd1rYT3w@mail.gmail.com	2017-04-17 18:34:29 -03:00
Peter Eisentraut	67c2def11d	Catversion bump for commit `887227a1cc`	2017-04-14 14:24:01 -04:00
Peter Eisentraut	887227a1cc	Add option to modify sync commit per subscription This also changes default behaviour of subscription workers to synchronous_commit = off. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2017-04-14 13:58:46 -04:00
Alvaro Herrera	27bcc372b1	Catversion bump forgotten in previous commit	2017-04-13 11:54:28 -03:00
Tom Lane	511540dadf	Move isolationtester's is-blocked query into C code for speed. Commit `4deb41381` modified isolationtester's query to see whether a session is blocked to also check for waits occurring in GetSafeSnapshot. However, it did that in a way that enormously increased the query's runtime under CLOBBER_CACHE_ALWAYS, causing the buildfarm members that use that to run about four times slower than before, and in some cases fail entirely. To fix, push the entire logic into a dedicated backend function. This should actually reduce the CLOBBER_CACHE_ALWAYS runtime from what it was previously, though I've not checked that. In passing, expose a SQL function to check for safe-snapshot blockage, comparable to pg_blocking_pids. This is more or less free given the infrastructure built to solve the other problem, so we might as well. Thomas Munro Discussion: https://postgr.es/m/20170407165749.pstcakbc637opkax@alap3.anarazel.de	2017-04-10 10:26:54 -04:00
Peter Eisentraut	5f21f5292c	Mark immutable functions in information schema as parallel safe Also add opr_sanity check that all preloaded immutable functions are parallel safe. (Per discussion, this does not necessarily have to be true for all possible such functions, but deviations would be unlikely enough that maintaining such a test is reasonable.) Reported-by: David Rowley <david.rowley@2ndquadrant.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>	2017-04-06 14:30:13 -04:00
Peter Eisentraut	3217327053	Identity columns This is the SQL standard-conforming variant of PostgreSQL's serial columns. It fixes a few usability issues that serial columns have: - CREATE TABLE / LIKE copies default but refers to same sequence - cannot add/drop serialness with ALTER TABLE - dropping default does not drop sequence - need to grant separate privileges to sequence - other slight weirdnesses because serial is some kind of special macro Reviewed-by: Vitaly Burovoy <vitaly.burovoy@gmail.com>	2017-04-06 08:41:37 -04:00
Simon Riggs	2686ee1b7c	Collect and use multi-column dependency stats Follow on patch in the multi-variate statistics patch series. CREATE STATISTICS s1 WITH (dependencies) ON (a, b) FROM t; ANALYZE; will collect dependency stats on (a, b) and then use the measured dependency in subsequent query planning. Commit `7b504eb282` added CREATE STATISTICS with n-distinct coefficients. These are now specified using the mutually exclusive option WITH (ndistinct). Author: Tomas Vondra, David Rowley Reviewed-by: Kyotaro HORIGUCHI, Álvaro Herrera, Dean Rasheed, Robert Haas and many other comments and contributions Discussion: https://postgr.es/m/56f40b20-c464-fad2-ff39-06b668fac47c@2ndquadrant.com	2017-04-05 18:00:42 -04:00
Alvaro Herrera	c655899ba9	BRIN de-summarization When the BRIN summary tuple for a page range becomes too "wide" for the values actually stored in the table (because the tuples that were present originally are no longer present due to updates or deletes), it can be useful to remove the outdated summary tuple, so that a future summarization can install a tighter summary. This commit introduces a SQL-callable interface to do so. Author: Álvaro Herrera Reviewed-by: Eiji Seki Discussion: https://postgr.es/m/20170228045643.n2ri74ara4fhhfxf@alvherre.pgsql	2017-04-01 16:10:04 -03:00
Alvaro Herrera	7526e10224	BRIN auto-summarization Previously, only VACUUM would cause a page range to get initially summarized by BRIN indexes, which for some use cases takes too much time since the inserts occur. To avoid the delay, have brininsert request a summarization run for the previous range as soon as the first tuple is inserted into the first page of the next range. Autovacuum is in charge of processing these requests, after doing all the regular vacuuming/ analyzing work on tables. This doesn't impose any new tasks on autovacuum, because autovacuum was already in charge of doing summarizations. The only actual effect is to change the timing, i.e. that it occurs earlier. For this reason, we don't go any great lengths to record these requests very robustly; if they are lost because of a server crash or restart, they will happen at a later time anyway. Most of the new code here is in autovacuum, which can now be told about "work items" to process. This can be used for other things such as GIN pending list cleaning, perhaps visibility map bit setting, both of which are currently invoked during vacuum, but do not really depend on vacuum taking place. The requests are at the page range level, a granularity for which we did not have SQL-level access; we only had index-level summarization requests via brin_summarize_new_values(). It seems reasonable to add SQL-level access to range-level summarization too, so add a function brin_summarize_range() to do that. Authors: Álvaro Herrera, based on sketch from Simon Riggs. Reviewed-by: Thomas Munro. Discussion: https://postgr.es/m/20170301045823.vneqdqkmsd4as4ds@alvherre.pgsql	2017-04-01 14:00:53 -03:00
Kevin Grittner	18ce3a4ab2	Add infrastructure to support EphemeralNamedRelation references. A QueryEnvironment concept is added, which allows new types of objects to be passed into queries from parsing on through execution. At this point, the only thing implemented is a collection of EphemeralNamedRelation objects -- relations which can be referenced by name in queries, but do not exist in the catalogs. The only type of ENR implemented is NamedTuplestore, but provision is made to add more types fairly easily. An ENR can carry its own TupleDesc or reference a relation in the catalogs by relid. Although these features can be used without SPI, convenience functions are added to SPI so that ENRs can easily be used by code run through SPI. The initial use of all this is going to be transition tables in AFTER triggers, but that will be added to each PL as a separate commit. An incidental effect of this patch is to produce a more informative error message if an attempt is made to modify the contents of a CTE from a referencing DML statement. No tests previously covered that possibility, so one is added. Kevin Grittner and Thomas Munro Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro with valuable comments and suggestions from many others	2017-03-31 23:17:18 -05:00
Robert Haas	c94e6942ce	Don't allocate storage for partitioned tables. Also, don't allow setting reloptions on them, since that would have no effect given the lack of storage. The patch does this by introducing a new reloption kind for which there are currently no reloptions -- we might have some in the future -- so it adjusts parseRelOptions to handle that case correctly. Bumped catversion. System catalogs that contained reloptions for partitioned tables are no longer valid; plus, there are now fewer physical files on disk, which is not technically a catalog change but still a good reason to re-initdb. Amit Langote, reviewed by Maksim Milyutin and Kyotaro Horiguchi and revised a bit by me. Discussion: http://postgr.es/m/20170331.173326.212311140.horiguchi.kyotaro@lab.ntt.co.jp	2017-03-31 16:28:51 -04:00
Andrew Dunstan	e306df7f9c	Full Text Search support for json and jsonb The new functions are ts_headline() and to_tsvector. Dmitry Dolgov, edited and documented by me.	2017-03-31 14:26:03 -04:00
Simon Riggs	25fff40798	Default monitoring roles Three nologin roles with non-overlapping privs are created by default * pg_read_all_settings - read all GUCs. * pg_read_all_stats - pg_stat_, pg_database_size(), pg_tablespace_size() pg_stat_scan_tables - may lock/scan tables Top level role - pg_monitor includes all of the above by default, plus others Author: Dave Page Reviewed-by: Stephen Frost, Robert Haas, Peter Eisentraut, Simon Riggs	2017-03-30 14:18:53 -04:00
Teodor Sigaev	f90d23d0c5	Implement SortSupport for macaddr data type Introduces a scheme to produce abbreviated keys for the macaddr type. Bump catalog version. Author: Brandur Leach Reviewed-by: Julien Rouhaud, Peter Geoghegan https://commitfest.postgresql.org/13/743/	2017-03-29 23:28:56 +03:00
Peter Eisentraut	4fdb8a82e3	Update copyright year in recently added files Author: Masahiko Sawada <sawada.mshk@gmail.com>	2017-03-29 14:54:10 -04:00
Robert Haas	9a09527164	Mark more functions parallel-restricted. Commit `61c2e1a95f` allowed parallel query to be used in more places, revealing via buildfarm member mandrill that several functions intended to be called from triggers were incorrectly marked parallel-safe rather than parallel-restricted. Report by Tom Lane. Patch by Rafia Sabih. Reviewed by me. Discussion: http://postgr.es/m/16061.1490479253@sss.pgh.pa.us	2017-03-29 10:59:27 -04:00
Teodor Sigaev	ab89e465cb	Altering default privileges on schemas Extend ALTER DEFAULT PRIVILEGES command to schemas. Author: Matheus Oliveira Reviewed-by: Petr Jelínek, Ashutosh Sharma https://commitfest.postgresql.org/13/887/	2017-03-28 18:58:55 +03:00
Robert Haas	fc70a4b0df	Show more processes in pg_stat_activity. Previously, auxiliary processes and background workers not connected to a database (such as the logical replication launcher) weren't shown. Include them, so that we can see the associated wait state information. Add a new column to identify the processes type, so that people can filter them out easily using SQL if they wish. Before this patch was written, there was discussion about whether we should expose this information in a separate view, so as to avoid contaminating pg_stat_activity with things people might not want to see. But putting everything in pg_stat_activity was a more popular choice, so that's what the patch does. Kuntal Ghosh, reviewed by Amit Langote and Michael Paquier. Some revisions and bug fixes by me. Discussion: http://postgr.es/m/CA+TgmoYES5nhkEGw9nZXU8_FhA8XEm8NTm3-SO+3ML1B81Hkww@mail.gmail.com	2017-03-26 22:02:22 -04:00
Peter Eisentraut	7678fe1c6d	Make header self-contained Add necessary include files for things used in the header.	2017-03-24 23:36:10 -04:00
Alvaro Herrera	7b504eb282	Implement multivariate n-distinct coefficients Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations that individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions. This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used. (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.) Author: Tomas Vondra. Some code rework by Álvaro. Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql	2017-03-24 14:06:10 -03:00
Robert Haas	857ee8e391	Add a txid_status function. If your connection to the database server is lost while a COMMIT is in progress, it may be difficult to figure out whether the COMMIT was successful or not. This function will tell you, provided that you don't wait too long to ask. It may be useful in other situations, too. Craig Ringer, reviewed by Simon Riggs and by me Discussion: http://postgr.es/m/CAMsr+YHQiWNEi0daCTboS40T+V5s_+dst3PYv_8v2wNVH+Xx4g@mail.gmail.com	2017-03-24 12:00:53 -04:00
Peter Eisentraut	eccfef81e1	ICU support Add a column collprovider to pg_collation that determines which library provides the collation data. The existing choices are default and libc, and this adds an icu choice, which uses the ICU4C library. The pg_locale_t type is changed to a union that contains the provider-specific locale handles. Users of locale information are changed to look into that struct for the appropriate handle to use. Also add a collversion column that records the version of the collation when it is created, and check at run time whether it is still the same. This detects potentially incompatible library upgrades that can corrupt indexes and other structures. This is currently only supported by ICU-provided collations. initdb initializes the default collation set as before from the `locale -a` output but also adds all available ICU locales with a "-x-icu" appended. Currently, ICU-provided collations can only be explicitly named collations. The global database locales are still always libc-provided. ICU support is enabled by configure --with-icu. Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com> Reviewed-by: Andreas Karlsson <andreas@proxel.se>	2017-03-23 15:28:48 -04:00
Simon Riggs	6912acc04f	Replication lag tracking for walsenders Adds write_lag, flush_lag and replay_lag cols to pg_stat_replication. Implements a lag tracker module that reports the lag times based upon measurements of the time taken for recent WAL to be written, flushed and replayed and for the sender to hear about it. These times represent the commit lag that was (or would have been) introduced by each synchronous commit level, if the remote server was configured as a synchronous standby. For an asynchronous standby, the replay_lag column approximates the delay before recent transactions became visible to queries. If the standby server has entirely caught up with the sending server and there is no more WAL activity, the most recently measured lag times will continue to be displayed for a short time and then show NULL. Physical replication lag tracking is automatic. Logical replication tracking is possible but is the responsibility of the logical decoding plugin. Tracking is a private module operating within each walsender individually, with values reported to shared memory. Module not used outside of walsender. Design and code is good enough now to commit - kudos to the author. In many ways a difficult topic, with important and subtle behaviour so this shoudl be expected to generate discussion and multiple open items: Test now! Author: Thomas Munro, following designs by Fujii Masao and Simon Riggs Review: Simon Riggs, Ian Barwick and Craig Ringer	2017-03-23 14:05:28 +00:00
Peter Eisentraut	7c4f52409a	Logical replication support for initial data copy Add functionality for a new subscription to copy the initial data in the tables and then sync with the ongoing apply process. For the copying, add a new internal COPY option to have the COPY source data provided by a callback function. The initial data copy works on the subscriber by receiving COPY data from the publisher and then providing it locally into a COPY that writes to the destination table. A WAL receiver can now execute full SQL commands. This is used here to obtain information about tables and publications. Several new options were added to CREATE and ALTER SUBSCRIPTION to control whether and when initial table syncing happens. Change pg_dump option --no-create-subscription-slots to --no-subscription-connect and use the new CREATE SUBSCRIPTION ... NOCONNECT option for that. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Tested-by: Erik Rijkers <er@xs4all.nl>	2017-03-23 08:55:37 -04:00
Stephen Frost	017e4f2588	Expose waitforarchive option through pg_stop_backup() Internally, we have supported the option to either wait for all of the WAL associated with a backup to be archived, or to return immediately. This option is useful to users of pg_stop_backup() as well, when they are reading the stop backup record position and checking that the WAL they need has been archived independently. This patch adds an additional, optional, argument to pg_stop_backup() which allows the user to indicate if they wish to wait for the WAL to be archived or not. The default matches current behavior, which is to wait. Author: David Steele, with some minor changes, doc updates by me. Reviewed by: Takayuki Tsunakawa, Fujii Masao Discussion: https://postgr.es/m/758e3fd1-45b4-5e28-75cd-e9e7f93a4c02@pgmasters.net	2017-03-22 23:44:58 -04:00
Robert Haas	befd73c50f	Add pg_ls_logdir() and pg_ls_waldir() functions. These functions are intended to be used by monitoring tools, and, unlike pg_ls_dir(), access to them can be granted to non-superusers, so that those monitoring tools can observe the principle of least privilege. Dave Page, revised by me, and also reviewed a bit by Thomas Munro. Discussion: http://postgr.es/m/CA+OCxow-X=D2fWdKy+HP+vQ1LtrgbsYQ=CshzZBqyFT5jOYrFw@mail.gmail.com	2017-03-16 15:05:02 -04:00
Stephen Frost	c583234662	Bump catversion for MACADDR8 Pointed out by Robert.	2017-03-15 11:19:39 -04:00
Stephen Frost	c7a9fa399d	Add support for EUI-64 MAC addresses as macaddr8 This adds in support for EUI-64 MAC addresses by adding a new data type called 'macaddr8' (using our usual convention of indicating the number of bytes stored). This was largely a copy-and-paste from the macaddr data type, with appropriate adjustments for having 8 bytes instead of 6 and adding support for converting a provided EUI-48 (6 byte format) to the EUI-64 format. Conversion from EUI-48 to EUI-64 inserts FFFE as the 4th and 5th bytes but does not perform the IPv6 modified EUI-64 action of flipping the 7th bit, but we add a function to perform that specific action for the user as it may be commonly done by users who wish to calculate their IPv6 address based on their network prefix and 48-bit MAC address. Author: Haribabu Kommi, with a good bit of rework of macaddr8_in by me. Reviewed by: Vitaly Burovoy, Kuntal Ghosh Discussion: https://postgr.es/m/CAJrrPGcUi8ZH+KkK+=TctNQ+EfkeCEHtMU_yo1mvX8hsk_ghNQ@mail.gmail.com	2017-03-15 11:16:25 -04:00
Tom Lane	8b358b42f8	Change the relkind for partitioned tables from 'P' to 'p'. Seven of the eight other relkind codes are lower-case, so it wasn't consistent for this one to be upper-case. Fix it while we still can. Historical notes: the reason for the lone exception, i.e. sequences being 'S', is that 's' was once used for "special" relations. Also, at one time the partitioned-tables patch used both 'P' and 'p', but that got changed, leaving only a surprising choice behind. This also fixes a couple little bits of technical debt, such as type_sanity.sql not knowing that 'm' is a legal value for relkind. Discussion: https://postgr.es/m/27899.1488909319@sss.pgh.pa.us	2017-03-10 13:15:47 -05:00
Alvaro Herrera	fcec6caafa	Support XMLTABLE query expression XMLTABLE is defined by the SQL/XML standard as a feature that allows turning XML-formatted data into relational form, so that it can be used as a <table primary> in the FROM clause of a query. This new construct provides significant simplicity and performance benefit for XML data processing; what in a client-side custom implementation was reported to take 20 minutes can be executed in 400ms using XMLTABLE. (The same functionality was said to take 10 seconds using nested PostgreSQL XPath function calls, and 5 seconds using XMLReader under PL/Python). The implemented syntax deviates slightly from what the standard requires. First, the standard indicates that the PASSING clause is optional and that multiple XML input documents may be given to it; we make it mandatory and accept a single document only. Second, we don't currently support a default namespace to be specified. This implementation relies on a new executor node based on a hardcoded method table. (Because the grammar is fixed, there is no extensibility in the current approach; further constructs can be implemented on top of this such as JSON_TABLE, but they require changes to core code.) Author: Pavel Stehule, Álvaro Herrera Extensively reviewed by: Craig Ringer Discussion: https://postgr.es/m/CAFj8pRAgfzMD-LoSmnMGybD0WsEznLHWap8DO79+-GTRAPR4qA@mail.gmail.com	2017-03-08 12:40:26 -03:00
Heikki Linnakangas	818fd4a67d	Support SCRAM-SHA-256 authentication (RFC 5802 and 7677). This introduces a new generic SASL authentication method, similar to the GSS and SSPI methods. The server first tells the client which SASL authentication mechanism to use, and then the mechanism-specific SASL messages are exchanged in AuthenticationSASLcontinue and PasswordMessage messages. Only SCRAM-SHA-256 is supported at the moment, but this allows adding more SASL mechanisms in the future, without changing the overall protocol. Support for channel binding, aka SCRAM-SHA-256-PLUS is left for later. The SASLPrep algorithm, for pre-processing the password, is not yet implemented. That could cause trouble, if you use a password with non-ASCII characters, and a client library that does implement SASLprep. That will hopefully be added later. Authorization identities, as specified in the SCRAM-SHA-256 specification, are ignored. SET SESSION AUTHORIZATION provides more or less the same functionality, anyway. If a user doesn't exist, perform a "mock" authentication, by constructing an authentic-looking challenge on the fly. The challenge is derived from a new system-wide random value, "mock authentication nonce", which is created at initdb, and stored in the control file. We go through these motions, in order to not give away the information on whether the user exists, to unauthenticated users. Bumps PG_CONTROL_VERSION, because of the new field in control file. Patch by Michael Paquier and Heikki Linnakangas, reviewed at different stages by Robert Haas, Stephen Frost, David Steele, Aleksander Alekseev, and many others. Discussion: https://www.postgresql.org/message-id/CAB7nPqRbR3GmFYdedCAhzukfKrgBLTLtMvENOmPrVWREsZkF8g%40mail.gmail.com Discussion: https://www.postgresql.org/message-id/CAB7nPqSMXU35g%3DW9X74HVeQp0uvgJxvYOuA4A-A3M%2B0wfEBv-w%40mail.gmail.com Discussion: https://www.postgresql.org/message-id/55192AFE.6080106@iki.fi	2017-03-07 14:25:40 +02:00
Peter Eisentraut	8b6d6cf853	Remove objname/objargs split for referring to objects In simpler times, it might have worked to refer to all kinds of objects by a list of name components and an optional argument list. But this doesn't work for all objects, which has resulted in a collection of hacks to place various other nodes types into these fields, which have to be unpacked at the other end. This makes it also weird to represent lists of such things in the grammar, because they would have to be lists of singleton lists, to make the unpacking work consistently. The other problem is that keeping separate name and args fields makes it awkward to deal with lists of functions. Change that by dropping the objargs field and have objname, renamed to object, be a generic Node, which can then be flexibly assigned and managed using the normal Node mechanisms. In many cases it will still be a List of names, in some cases it will be a string Value, for types it will be the existing Typename, for functions it will now use the existing ObjectWithArgs node type. Some of the more obscure object types still use somewhat arbitrary nested lists. Reviewed-by: Jim Nasby <Jim.Nasby@BlueTreble.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-03-06 13:31:47 -05:00
Robert Haas	9fe3c644a7	Mark pg_start_backup and pg_stop_backup as parallel-restricted. They depend on backend-private state that will not be synchronized by the parallel machinery, so they should not be marked parallel-safe. This issue also exists in 9.6, but we obviously can't do anything about 9.6 clusters that already exist. Possibly this could be back-patched so that future 9.6 clusters would come out OK, or possibly we should back-patch some other fix, but that would need more discussion. David Steele, reviewed by Michael Paquier Discussion: http://postgr.es/m/CA+TgmoYCWfO2UM-t=HUMFJyxJywLDiLL0nAJpx88LKtvBvNECw@mail.gmail.com	2017-03-06 12:41:55 -05:00
Peter Eisentraut	6f236e1eb8	psql: Add tab completion for logical replication Add tab completion for publications and subscriptions. Also, to be able to get a list of subscriptions, make pg_subscription world-readable but revoke access to subconninfo using column privileges. From: Michael Paquier <michael.paquier@gmail.com>	2017-03-03 14:13:48 -05:00
Robert Haas	19dc233c32	Add pg_current_logfile() function. The syslogger will write out the current stderr and csvlog names, if it's running and there are any, to a new file in the data directory called "current_logfiles". We take care to remove this file when it might no longer be valid (but not at shutdown). The function pg_current_logfile() can be used to read the entries in the file. Gilles Darold, reviewed and modified by Karl O. Pinc, Michael Paquier, and me. Further review by Álvaro Herrera and Christoph Berg.	2017-03-03 11:43:11 +05:30
Robert Haas	5a73e17317	Improve error reporting for tuple-routing failures. Currently, the whole row is shown without column names. Instead, adopt a style similar to _bt_check_unique() in ExecFindPartition() and show the failing key: (key1, ...) = (val1, ...). Amit Langote, per a complaint from Simon Riggs. Reviewed by me; I also adjusted the grammar in one of the comments. Discussion: http://postgr.es/m/9f9dc7ae-14f0-4a25-5485-964d9bfc19bd@lab.ntt.co.jp	2017-03-03 09:09:52 +05:30
Peter Eisentraut	005638e988	Fix naming inconsistency subobjid -> objsubid From: Jim Nasby <Jim.Nasby@BlueTreble.com>	2017-03-01 12:22:33 -05:00
Tom Lane	d28aafb6dd	Remove pg_control's enableIntTimes field. We don't need it any more. pg_controldata continues to report that date/time type storage is "64-bit integers", but that's now a hard-wired behavior not something it sees in the data. This avoids breaking pg_upgrade, and perhaps other utilities that inspect pg_control this way. Ditto for pg_resetwal. I chose to remove the "bigint_timestamps" output column of pg_control_init(), though, as that function hasn't been around long and probably doesn't have ossified users. Discussion: https://postgr.es/m/26788.1487455319@sss.pgh.pa.us	2017-02-23 12:23:12 -05:00
Robert Haas	5d40286985	Fix wrong articles in pg_proc descriptions. This technically should involve a catversion bump, but that seems pedantic, so I skipped it. Report and patch by David Christensen.	2017-02-15 12:13:38 -05:00
Peter Eisentraut	2ea5b06c7a	Add CREATE SEQUENCE AS <data type> clause This stores a data type, required to be an integer type, with the sequence. The sequences min and max values default to the range supported by the type, and they cannot be set to values exceeding that range. The internal implementation of the sequence is not affected. Change the serial types to create sequences of the appropriate type. This makes sure that the min and max values of the sequence for a serial column match the range of values supported by the table column. So the sequence can no longer overflow the table column. This also makes monitoring for sequence exhaustion/wraparound easier, which currently requires various contortions to cross-reference the sequences with the table columns they are used with. This commit also effectively reverts the pg_sequence column reordering in `f3b421da5f`, because the new seqtypid column allows us to fill the hole in the struct and create a more natural overall column ordering. Reviewed-by: Steve Singer <steve@ssinger.info> Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-02-10 15:34:35 -05:00
Robert Haas	806091c96f	Remove all references to "xlog" from SQL-callable functions in pg_proc. Commit `f82ec32ac3` renamed the pg_xlog directory to pg_wal. To make things consistent, and because "xlog" is terrible terminology for either "transaction log" or "write-ahead log" rename all SQL-callable functions that contain "xlog" in the name to instead contain "wal". (Note that this may pose an upgrade hazard for some users.) Similarly, rename the xlog_position argument of the functions that create slots to be called wal_position. Discussion: https://www.postgresql.org/message-id/CA+Tgmob=YmA=H3DbW1YuOXnFVgBheRmyDkWcD9M8f=5bGWYEoQ@mail.gmail.com	2017-02-09 15:10:09 -05:00
Heikki Linnakangas	181bdb90ba	Fix typos in comments. Backpatch to all supported versions, where applicable, to make backpatching of future fixes go more smoothly. Josh Soref Discussion: https://www.postgresql.org/message-id/CACZqfqCf+5qRztLPgmmosr-B0Ye4srWzzw_mo4c_8_B_mtjmJQ@mail.gmail.com	2017-02-06 11:33:58 +02:00
Tom Lane	aedd554f84	Fix CatalogTupleInsert/Update abstraction for case of shared indstate. Add CatalogTupleInsertWithInfo and CatalogTupleUpdateWithInfo to let callers use the CatalogTupleXXX abstraction layer even in cases where we want to share the results of CatalogOpenIndexes across multiple inserts/updates for efficiency. This finishes the job begun in commit `2f5c9d9c9`, by allowing some remaining simple_heap_insert/update calls to be replaced. The abstraction layer is now complete enough that we don't have to export CatalogIndexInsert at all anymore. Also, this fixes several places in which `2f5c9d9c9` introduced performance regressions by using retail CatalogTupleInsert or CatalogTupleUpdate even though the previous coding had been able to amortize CatalogOpenIndexes work across multiple tuples. A possible future improvement is to arrange for the indexing.c functions to cache the CatalogIndexState somewhere, maybe in the relcache, in which case we could get rid of CatalogTupleInsertWithInfo and CatalogTupleUpdateWithInfo again. But that's a task for another day. Discussion: https://postgr.es/m/27502.1485981379@sss.pgh.pa.us	2017-02-01 17:18:36 -05:00
Tom Lane	ab02896510	Provide CatalogTupleDelete() as a wrapper around simple_heap_delete(). This extends the work done in commit `2f5c9d9c9` to provide a more nearly complete abstraction layer hiding the details of index updating for catalog changes. That commit only invented abstractions for catalog inserts and updates, leaving nearby code for catalog deletes still calling the heap-level routines directly. That seems rather ugly from here, and it does little to help if we ever want to shift to a storage system in which indexing work is needed at delete time. Hence, create a wrapper function CatalogTupleDelete(), and replace calls of simple_heap_delete() on catalog tuples with it. There are now very few direct calls of [simple_]heap_delete remaining in the tree. Discussion: https://postgr.es/m/462.1485902736@sss.pgh.pa.us	2017-02-01 16:13:30 -05:00
Alvaro Herrera	2f5c9d9c9c	Tweak catalog indexing abstraction for upcoming WARM Split the existing CatalogUpdateIndexes into two different routines, CatalogTupleInsert and CatalogTupleUpdate, which do both the heap insert/update plus the index update. This removes over 300 lines of boilerplate code all over src/backend/catalog/ and src/backend/commands. The resulting code is much more pleasing to the eye. Also, by encapsulating what happens in detail during an UPDATE, this facilitates the upcoming WARM patch, which is going to add a few more lines to the update case making the boilerplate even more boring. The original CatalogUpdateIndexes is removed; there was only one use left, and since it's just three lines, we can as well expand it in place there. We could keep it, but WARM is going to break all the UPDATE out-of-core callsites anyway, so there seems to be no benefit in doing so. Author: Pavan Deolasee Discussion: https://www.postgr.es/m/CABOikdOcFYSZ4vA2gYfs=M2cdXzXX4qGHeEiW3fu9PCfkHLa2A@mail.gmail.com	2017-01-31 18:42:24 -03:00
Tom Lane	de16ab7238	Invent pg_hba_file_rules view to show the content of pg_hba.conf. This view is designed along the same lines as pg_file_settings, to wit it shows what is currently in the file, not what the postmaster has loaded as the active settings. That allows it to be used to pre-vet edits before issuing SIGHUP. As with the earlier view, go out of our way to allow errors in the file to be reflected in the view, to assist that use-case. (We might at some point invent a view to show the current active settings, but this is not that patch; and it's not trivial to do.) Haribabu Kommi, reviewed by Ashutosh Bapat, Michael Paquier, Simon Riggs, and myself Discussion: https://postgr.es/m/CAJrrPGerH4jiwpcXT1-46QXUDmNp2QDrG9+-Tek_xC8APHShYw@mail.gmail.com	2017-01-30 18:00:26 -05:00
Fujii Masao	bdadf36eb4	Fix typo in description for pg_replication_origin_advance function.	2017-01-27 00:42:33 +09:00
Peter Eisentraut	3d9e73ea5f	Update copyright years in some recently added files	2017-01-25 12:32:05 -05:00
Tom Lane	d8d32d9a56	Make UNKNOWN into an actual pseudo-type. Previously, type "unknown" was labeled as a base type in pg_type, which perhaps had some sense to it because you were allowed to create tables with unknown-type columns. But now that we don't allow that, it makes more sense to label it a pseudo-type. This has the additional effects of forbidding use of "unknown" as a domain base type, cast source or target type, PL function argument or result type, or plpgsql local variable type; all of which seem like good holes to plug. Discussion: https://postgr.es/m/CAH2L28uwwbL9HUM-WR=hromW1Cvamkn7O-g8fPY2m=_7muJ0oA@mail.gmail.com	2017-01-25 09:27:09 -05:00
Robert Haas	27cdb3414b	Reindent table partitioning code. We've accumulated quite a bit of stuff with which pgindent is not quite happy in this code; clean it up to provide a less-annoying base for future pgindent runs.	2017-01-24 10:20:02 -05:00
Peter Eisentraut	b480086760	Add more includes so header files are self-contained	2017-01-21 15:49:53 -05:00
Peter Eisentraut	e4c27f5bef	Bump catversion	2017-01-20 09:07:13 -05:00
Peter Eisentraut	665d1fad99	Logical replication - Add PUBLICATION catalogs and DDL - Add SUBSCRIPTION catalog and DDL - Define logical replication protocol and output plugin - Add logical replication workers From: Petr Jelinek <petr@2ndquadrant.com> Reviewed-by: Steve Singer <steve@ssinger.info> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Erik Rijkers <er@xs4all.nl> Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>	2017-01-20 09:04:49 -05:00
Robert Haas	05bd889904	Fix RETURNING to work correctly with partition tuple routing. In ExecInsert(), do not switch back to the root partitioned table ResultRelInfo until after we finish ExecProcessReturning(), so that RETURNING projection is done using the partition's descriptor. For the projection to work correctly, we must initialize the same for each leaf partition during ModifyTableState initialization. Amit Langote	2017-01-19 13:20:11 -05:00
Magnus Hagander	d00ca333c3	Implement array version of jsonb_delete and operator This makes it possible to delete multiple keys from a jsonb value by passing in an array of text values, which makes the operaiton much faster than individually deleting the keys (which would require copying the jsonb structure over and over again. Reviewed by Dmitry Dolgov and Michael Paquier	2017-01-18 21:37:59 +01:00
Peter Eisentraut	aa17c06fb5	Add function to import operating system collations Move this logic out of initdb into a user-callable function. This simplifies the code and makes it possible to update the standard collations later on if additional operating system collations appear. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Euler Taveira <euler@timbira.com.br>	2017-01-18 09:35:56 -05:00
Peter Eisentraut	352a24a1f9	Generate fmgr prototypes automatically Gen_fmgrtab.pl creates a new file fmgrprotos.h, which contains prototypes for all functions registered in pg_proc.h. This avoids having to manually maintain these prototypes across a random variety of header files. It also automatically enforces a correct function signature, and since there are warnings about missing prototypes, it will detect functions that are defined but not registered in pg_proc.h (or otherwise used). Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>	2017-01-17 14:06:07 -05:00
Peter Eisentraut	323b96aa34	Register missing money operators in system catalogs The operators moneyint8, int8money, and money/int8 were implemented in code but not registered in pg_operator or pg_proc. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>	2017-01-17 12:36:02 -05:00
Peter Eisentraut	6fc547960d	Rename C symbols for backend lo_ functions Rename the C symbols for lo_* to be_lo_*, so they don't conflict with libpq prototypes. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>	2017-01-17 12:35:30 -05:00
Tom Lane	ab1f0c8225	Change representation of statement lists, and add statement location info. This patch makes several changes that improve the consistency of representation of lists of statements. It's always been the case that the output of parse analysis is a list of Query nodes, whatever the types of the individual statements in the list. This patch brings similar consistency to the outputs of raw parsing and planning steps: * The output of raw parsing is now always a list of RawStmt nodes; the statement-type-dependent nodes are one level down from that. * The output of pg_plan_queries() is now always a list of PlannedStmt nodes, even for utility statements. In the case of a utility statement, "planning" just consists of wrapping a CMD_UTILITY PlannedStmt around the utility node. This list representation is now used in Portal and CachedPlan plan lists, replacing the former convention of intermixing PlannedStmts with bare utility-statement nodes. Now, every list of statements has a consistent head-node type depending on how far along it is in processing. This allows changing many places that formerly used generic "Node *" pointers to use a more specific pointer type, thus reducing the number of IsA() tests and casts needed, as well as improving code clarity. Also, the post-parse-analysis representation of DECLARE CURSOR is changed so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained SELECT remains a child of the DeclareCursorStmt rather than getting flipped around to be the other way. It's now true for both Query and PlannedStmt that utilityStmt is non-null if and only if commandType is CMD_UTILITY. That allows simplifying a lot of places that were testing both fields. (I think some of those were just defensive programming, but in many places, it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.) Because PlannedStmt carries a canSetTag field, we're also able to get rid of some ad-hoc rules about how to reconstruct canSetTag for a bare utility statement; specifically, the assumption that a utility is canSetTag if and only if it's the only one in its list. While I see no near-term need for relaxing that restriction, it's nice to get rid of the ad-hocery. The API of ProcessUtility() is changed so that what it's passed is the wrapper PlannedStmt not just the bare utility statement. This will affect all users of ProcessUtility_hook, but the changes are pretty trivial; see the affected contrib modules for examples of the minimum change needed. (Most compilers should give pointer-type-mismatch warnings for uncorrected code.) There's also a change in the API of ExplainOneQuery_hook, to pass through cursorOptions instead of expecting hook functions to know what to pick. This is needed because of the DECLARE CURSOR changes, but really should have been done in 9.6; it's unlikely that any extant hook functions know about using CURSOR_OPT_PARALLEL_OK. Finally, teach gram.y to save statement boundary locations in RawStmt nodes, and pass those through to Query and PlannedStmt nodes. This allows more intelligent handling of cases where a source query string contains multiple statements. This patch doesn't actually do anything with the information, but a follow-on patch will. (Passing this information through cleanly is the true motivation for these changes; while I think this is all good cleanup, it's unlikely we'd have bothered without this end goal.) catversion bump because addition of location fields to struct Query affects stored rules. This patch is by me, but it owes a good deal to Fabien Coelho who did a lot of preliminary work on the problem, and also reviewed the patch. Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre	2017-01-14 16:02:35 -05:00
Robert Haas	0563a3a8b5	Fix a bug in how we generate partition constraints. Move the code for doing parent attnos to child attnos mapping for Vars in partition constraint expressions to a separate function map_partition_varattnos() and call it from the appropriate places. Doing it in get_qual_from_partbound(), as is now, would produce wrong result in certain multi-level partitioning cases, because it only considers the current pair of parent-child relations. In certain multi-level partitioning cases, attnums for the same key attribute(s) might differ between various levels causing the same attribute to be numbered differently in different instances of the Var corresponding to a given attribute. With this commit, in generate_partition_qual(), we first generate the the whole partition constraint (considering all levels of partitioning) and then do the mapping, so that Vars in the final expression are numbered according the leaf relation (to which it is supposed to apply). Amit Langote, reviewed by me.	2017-01-13 14:04:35 -05:00
Robert Haas	18fc5192a6	Remove unnecessary arguments from partitioning functions. RelationGetPartitionQual() and generate_partition_qual() are always called with recurse = true, so we don't need an argument for that. Extracted by me from a larger patch by Amit Langote.	2017-01-04 14:56:37 -05:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Peter Eisentraut	27866bd1e8	Expand ad-hoc unit abbreviations in function descriptions There is no need to use abbreviations here, so just write it out for consistency.	2016-12-29 11:15:01 -05:00
Tom Lane	fe591f8bf6	Replace enum InhOption with simple boolean. Now that it has only INH_NO and INH_YES values, it's just weird that it's not a plain bool, so make it that way. Also rename RangeVar.inhOpt to "inh", to be like RangeTblEntry.inh. My recollection is that we gave it a different name specifically because it had a different representation than the derived bool value, but it no longer does. And this is a good forcing function to be sure we catch any places that are affected by the change. Bump catversion because of possible effect on stored RangeVar nodes. I'm not exactly convinced that we ever store RangeVar on disk, but we have a readfuncs function for it, so be cautious. (If we do do so, then commit `e13486eba` was in error not to bump catversion.) Follow-on to commit `e13486eba`. Discussion: http://postgr.es/m/CA+TgmoYe+EG7LdYX6pkcNxr4ygkP4+A=jm9o-CPXyOvRiCNwaQ@mail.gmail.com	2016-12-23 13:35:18 -05:00
Robert Haas	2ac3ef7a01	Fix tuple routing in cases where tuple descriptors don't match. The previous coding failed to work correctly when we have a multi-level partitioned hierarchy where tables at successive levels have different attribute numbers for the partition key attributes. To fix, have each PartitionDispatch object store a standalone TupleTableSlot initialized with the TupleDesc of the corresponding partitioned table, along with a TupleConversionMap to map tuples from the its parent's rowtype to own rowtype. After tuple routing chooses a leaf partition, we must use the leaf partition's tuple descriptor, not the root table's. To that end, a dedicated TupleTableSlot for tuple routing is now allocated in EState. Amit Langote	2016-12-22 17:36:37 -05:00
Peter Eisentraut	f3b421da5f	Reorder pg_sequence columns to avoid alignment issue On AIX, doubles are aligned at 4 bytes, but int64 is aligned at 8 bytes. Our code assumes that doubles have alignment that can also be applied to int64, but that fails in this case. One effect is that heap_form_tuple() writes tuples in a different layout than Form_pg_sequence expects. Rather than rewrite the whole alignment code, work around the issue by reordering the columns in pg_sequence so that the first int64 column naturally comes out at an 8-byte boundary.	2016-12-21 09:06:49 -05:00
Peter Eisentraut	1753b1b027	Add pg_sequence system catalog Move sequence metadata (start, increment, etc.) into a proper system catalog instead of storing it in the sequence heap object. This separates the metadata from the sequence data. Sequence metadata is now operated on transactionally by DDL commands, whereas previously rollbacks of sequence-related DDL commands would be ignored. Reviewed-by: Andreas Karlsson <andreas@proxel.se>	2016-12-20 08:28:18 -05:00
Robert Haas	7cd0fd655d	Invalid parent's relcache after CREATE TABLE .. PARTITION OF. Otherwise, subsequent commands in the same transaction see the wrong partition descriptor. Amit Langote. Reported by Tomas Vondra and David Fetter. Reviewed by me. Discussion: http://postgr.es/m/22dd313b-d7fd-22b5-0787-654845c8f849%402ndquadrant.com Discussion: http://postgr.es/m/20161215090916.GB20659%40fetter.org	2016-12-19 22:53:30 -05:00
Robert Haas	4b9a98e154	Clean up code, comments, and formatting for table partitioning. Amit Langote, plus pgindent-ing by me. Inspired in part by review comments from Tomas Vondra.	2016-12-13 10:59:14 -05:00
Tom Lane	9b3d02c2a9	Catversion bump for temporary replication slots. Missed in commit `a924c327e2`. Per Fujii Masao.	2016-12-12 14:41:49 -05:00
Peter Eisentraut	a924c327e2	Add support for temporary replication slots This allows creating temporary replication slots that are removed automatically at the end of the session or on error. From: Petr Jelinek <petr.jelinek@2ndquadrant.com>	2016-12-12 08:38:17 -05:00
Tom Lane	0b78106cd4	Fix reporting of column typmods for multi-row VALUES constructs. expandRTE() and get_rte_attribute_type() reported the exprType() and exprTypmod() values of the expressions in the first row of the VALUES as being the column type/typmod returned by the VALUES RTE. That's fine for the data type, since we coerce all expressions in a column to have the same common type. But we don't coerce them to have a common typmod, so it was possible for rows after the first one to return values that violate the claimed column typmod. This leads to the incorrect result seen in bug #14448 from Hassan Mahmood, as well as some other corner-case misbehaviors. The desired behavior is the same as we use in other type-unification cases: report the common typmod if there is one, but otherwise return -1 indicating no particular constraint. It's cheap for transformValuesClause to determine the common typmod while transforming a multi-row VALUES, but it'd be less cheap for expandRTE() and get_rte_attribute_type() to re-determine that info every time they're asked --- possibly a lot less cheap, if the VALUES has many rows. Therefore, the best fix is to record the common typmods explicitly in a list in the VALUES RTE, as we were already doing for column collations. This looks quite a bit like what we're doing for CTE RTEs, so we can save a little bit of space and code by unifying the representation for those two RTE types. They both now share coltypes/coltypmods/colcollations fields. (At some point it might seem desirable to populate those fields for all RTE types; but right now it looks like constructing them for other RTE types would add more code and cycles than it would save.) The RTE change requires a catversion bump, so this fix is only usable in HEAD. If we fix this at all in the back branches, the patch will need to look quite different. Report: https://postgr.es/m/20161205143037.4377.60754@wrigleys.postgresql.org Discussion: https://postgr.es/m/27429.1480968538@sss.pgh.pa.us	2016-12-08 11:40:02 -05:00
Robert Haas	f0e44751d7	Implement table partitioning. Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.	2016-12-07 13:17:55 -05:00
Stephen Frost	cb9dcbc1ee	Bump catversion for restrictive RLS changes Mea culpa. Pointed out by Andres.	2016-12-06 10:12:31 -05:00
Stephen Frost	093129c9d9	Add support for restrictive RLS policies We have had support for restrictive RLS policies since 9.5, but they were only available through extensions which use the appropriate hooks. This adds support into the grammer, catalog, psql and pg_dump for restrictive RLS policies, thus reducing the cases where an extension is necessary. In passing, also move away from using "AND"d and "OR"d in comments. As pointed out by Alvaro, it's not really appropriate to attempt to make verbs out of "AND" and "OR", so reword those comments which attempted to. Reviewed By: Jeevan Chalke, Dean Rasheed Discussion: https://postgr.es/m/20160901063404.GY4028@tamriel.snowman.net	2016-12-05 15:50:55 -05:00
Tom Lane	b3427dade1	Delete deleteWhatDependsOn() in favor of more performDeletion() flag bits. deleteWhatDependsOn() had grown an uncomfortably large number of assumptions about what it's used for. There are actually only two minor differences between what it does and what a regular performDeletion() call can do, so let's invent additional bits in performDeletion's existing flags argument that specify those behaviors, and get rid of deleteWhatDependsOn() as such. (We'd probably have done it this way from the start, except that performDeletion didn't originally have a flags argument, IIRC.) Also, add a SKIP_EXTENSIONS flag bit that prevents ever recursing to an extension, and use that when dropping temporary objects at session end. This provides a more general solution to the problem addressed in a hacky way in commit 08dd23cec: if an extension script creates temp objects and forgets to remove them again, the whole extension went away when its contained temp objects were deleted. The previous solution only covered temp relations, but this solves it for all object types. These changes require minor additions in dependency.c to pass the flags to subroutines that previously didn't get them, but it's still a net savings of code, and it seems cleaner than before. Having done this, revert the special-case code added in `08dd23cec` that prevented addition of pg_depend records for temp table extension membership, because that caused its own oddities: dropping an extension that had created such a table didn't automatically remove the table, leading to a failure if the table had another dependency on the extension (such as use of an extension data type), or to a duplicate-name failure if you then tried to recreate the extension. But we keep the part that prevents the pg_temp_nnn schema from becoming an extension member; we never want that to happen. Add a regression test case covering these behaviors. Although this fixes some arguable bugs, we've heard few field complaints, and any such problems are easily worked around by explicitly dropping temp objects at the end of extension scripts (which seems like good practice anyway). So I won't risk a back-patch. Discussion: https://postgr.es/m/e51f4311-f483-4dd0-1ccc-abec3c405110@BlueTreble.com	2016-12-02 14:57:55 -05:00
Peter Eisentraut	67dc4ccbb2	Add pg_sequences view Like pg_tables, pg_views, and others, this view contains information about sequences in a way that is independent of the system catalog layout but more comprehensive than the information schema. To help implement the view, add a new internal function pg_sequence_last_value() to return the last value of a sequence. This is kept separate from pg_sequence_parameters() to separate querying run-time state from catalog-like information. Reviewed-by: Andreas Karlsson <andreas@proxel.se>	2016-11-18 14:59:03 -05:00
Kevin Grittner	8c48375e5f	Implement syntax for transition tables in AFTER triggers. This is infrastructure for the complete SQL standard feature. No support is included at this point for execution nodes or PLs. The intent is to add that soon. As this patch leaves things, standard syntax can create tuplestores to contain old and/or new versions of rows affected by a statement. References to these tuplestores are in the TriggerData structure. C triggers can access the tuplestores directly, so they are usable, but they cannot yet be referenced within a SQL statement.	2016-11-04 10:49:50 -05:00
Robert Haas	f82ec32ac3	Rename "pg_xlog" directory to "pg_wal". "xlog" is not a particularly clear abbreviation for "write-ahead log", and it sometimes confuses users into believe that the contents of the "pg_xlog" directory are not critical data, leading to unpleasant consequences. So, rename the directory to "pg_wal". This patch modifies pg_upgrade and pg_basebackup to understand both the old and new directory layouts; the former is necessary given the purpose of the tool, while the latter merely avoids an unnecessary backward-compatibility break. We may wish to consider renaming other programs, switches, and functions which still use the old "xlog" naming to also refer to "wal". However, that's still under discussion, so let's do just this much for now. Discussion: CAB7nPqTeC-8+zux8_-4ZD46V7YPwooeFxgndfsq5Rg8ibLVm1A@mail.gmail.com Michael Paquier	2016-10-20 11:32:18 -04:00
Tom Lane	5c80642aa8	Remove unnecessary int2vector-specific hash function and equality operator. These functions were originally added in commit `d8cedf67a` to support use of int2vector columns as catcache lookup keys. However, there are no catcaches that use such columns. (Indeed I now think it must always have been dead code: a catcache with such a key column would need an underlying unique index on the column, but we've never had an int2vector btree opclass.) Getting rid of the int2vector-specific operator and function does not lose any functionality, because operations on int2vectors will now fall back to the generic anyarray support. This avoids a wart that a btree index on an int2vector column (made using anyarray_ops) would fail to match equality searches, because int2vectoreq wasn't a member of the opclass. We don't really care much about that, since int2vector is not meant as a type for users to use, but it's silly to have extra code and less functionality. If we ever do want a catcache to be indexed by an int2vector column, we'd need to put back full btree and hash opclasses for int2vector, comparable to the support for oidvector. (The anyarray code can't be used at such a low level, because it needs to do catcache lookups.) But we'll deal with that if/when the need arises. Also worth noting is that removal of the hash int2vector_ops opclass will break any user-created hash indexes on int2vector columns. While hash anyarray_ops would serve the same purpose, it would probably not compute the same hash values and thus wouldn't be on-disk-compatible. Given that int2vector isn't a user-facing type and we're planning other incompatible changes in hash indexes for v10 anyway, this doesn't seem like something to worry about, but it's probably worth mentioning here. Amit Langote Discussion: <d9bb74f8-b194-7307-9ebd-90645d377e45@lab.ntt.co.jp>	2016-10-12 14:54:08 -04:00
Heikki Linnakangas	bb55dd6059	Fix copy-pasto in comment. Amit Langote	2016-10-12 12:07:54 +03:00
Tom Lane	fdc9186f7e	Replace the built-in GIN array opclasses with a single polymorphic opclass. We had thirty different GIN array opclasses sharing the same operators and support functions. That still didn't cover all the built-in types, nor did it cover arrays of extension-added types. What we want is a single polymorphic opclass for "anyarray". There were two missing features needed to make this possible: 1. We have to be able to declare the index storage type as ANYELEMENT when the opclass is declared to index ANYARRAY. This just takes a few more lines in index_create(). Although this currently seems of use only for GIN, there's no reason to make index_create() restrict it to that. 2. We have to be able to identify the proper GIN compare function for the index storage type. This patch proceeds by making the compare function optional in GIN opclass definitions, and specifying that the default btree comparison function for the index storage type will be looked up when the opclass omits it. Again, that seems pretty generically useful. Since the comparison function lookup is done in initGinState(), making use of the second feature adds an additional cache lookup to GIN index access setup. It seems unlikely that that would be very noticeable given the other costs involved, but maybe at some point we should consider making GinState data persist longer than it now does --- we could keep it in the index relcache entry, perhaps. Rather fortuitously, we don't seem to need to do anything to get this change to play nice with dump/reload or pg_upgrade scenarios: the new opclass definition is automatically selected to replace existing index definitions, and the on-disk data remains compatible. Also, if a user has created a custom opclass definition for a non-builtin type, this doesn't break that, since CREATE INDEX will prefer an exact match to opcintype over a match to ANYARRAY. However, if there's anyone out there with handwritten DDL that explicitly specifies _bool_ops or one of the other replaced opclass names, they'll need to adjust that. Tom Lane, reviewed by Enrique Meneses Discussion: <14436.1470940379@sss.pgh.pa.us>	2016-09-26 14:52:44 -04:00
Tom Lane	a4c35ea1c2	Improve parser's and planner's handling of set-returning functions. Teach the parser to reject misplaced set-returning functions during parse analysis using p_expr_kind, in much the same way as we do for aggregates and window functions (cf commit `eaccfded9`). While this isn't complete (it misses nesting-based restrictions), it's much better than the previous error reporting for such cases, and it allows elimination of assorted ad-hoc expression_returns_set() error checks. We could add nesting checks later if it seems important to catch all cases at parse time. There is one case the parser will now throw error for although previous versions allowed it, which is SRFs in the tlist of an UPDATE. That never behaved sensibly (since it's ill-defined which generated row should be used to perform the update) and it's hard to see why it should not be treated as an error. It's a release-note-worthy change though. Also, add a new Query field hasTargetSRFs reporting whether there are any SRFs in the targetlist (including GROUP BY/ORDER BY expressions). The parser can now set that basically for free during parse analysis, and we can use it in a number of places to avoid expression_returns_set searches. (There will be more such checks soon.) In some places, this allows decontorting the logic since it's no longer expensive to check for SRFs in the tlist --- so I made the checks parallel to the handling of hasAggs/hasWindowFuncs wherever it seemed appropriate. catversion bump because adding a Query field changes stored rules. Andres Freund and Tom Lane Discussion: <24639.1473782855@sss.pgh.pa.us>	2016-09-13 13:54:24 -04:00
Tom Lane	0ab9c56d0f	Support renaming an existing value of an enum type. Not much to be said about this patch: it does what it says on the tin. In passing, rename AlterEnumStmt.skipIfExists to skipIfNewValExists to clarify what it actually does. In the discussion of this patch we considered supporting other similar options, such as IF EXISTS on the type as a whole or IF NOT EXISTS on the target name. This patch doesn't actually add any such feature, but it might happen later. Dagfinn Ilmari Mannsåker, reviewed by Emre Hasegeli Discussion: <CAO=2mx6uvgPaPDf-rHqG8=1MZnGyVDMQeh8zS4euRyyg4D35OQ@mail.gmail.com>	2016-09-07 16:11:56 -04:00
Tom Lane	b899ccbb49	Fix stray reference to the old genbki.sh script. Per Tomas Vondra.	2016-08-28 17:44:29 -04:00
Tom Lane	77e2906821	Create an SP-GiST opclass for inet/cidr. This seems to offer significantly better search performance than the existing GiST opclass for inet/cidr, at least on data with a wide mix of network mask lengths. (That may suggest that the data splitting heuristics in the GiST opclass could be improved.) Emre Hasegeli, with mostly-cosmetic adjustments by me Discussion: <CAE2gYzxtth9qatW_OAqdOjykS0bxq7AYHLuyAQLPgT7H9ZU0Cw@mail.gmail.com>	2016-08-23 15:16:30 -04:00
Robert Haas	86f31695f3	Add txid_current_ifassigned(). Add a variant of txid_current() that returns NULL if no transaction ID is assigned. This version can be used even on a standby server, although it will always return NULL since no transaction IDs can be assigned during recovery. Craig Ringer, per suggestion from Jim Nasby. Reviewed by Petr Jelinek and by me.	2016-08-23 10:30:52 -04:00
Tom Lane	8299471c37	Use LEFT JOINs in some system views in case referenced row doesn't exist. In particular, left join to pg_authid so that rows in pg_stat_activity don't disappear if the session's owning user has been dropped. Also convert a few joins to pg_database to left joins, in the same spirit, though that case might be harder to hit. We were doing this in other views already, so it was a bit inconsistent that these views didn't. Oskari Saarenmaa, with some further tweaking by me Discussion: <56E87CD8.60007@ohmu.fi>	2016-08-19 17:13:47 -04:00
Tom Lane	cf9b0fea5f	Implement regexp_match(), a simplified alternative to regexp_matches(). regexp_match() is like regexp_matches(), but it disallows the 'g' flag and in consequence does not need to return a set. Instead, it returns a simple text array value, or NULL if there's no match. Previously people usually got that behavior with a sub-select, but this way is considerably more efficient. Documentation adjusted so that regexp_match() is presented first and then regexp_matches() is introduced as a more complicated version. This is a bit historically revisionist but seems pedagogically better. Still TODO: extend contrib/citext to support this function. Emre Hasegeli, reviewed by David Johnston Discussion: <CAE2gYzy42sna2ME_e3y1KLQ-4UBrB-eVF0SWn8QG39sQSeVhEw@mail.gmail.com>	2016-08-17 18:33:01 -04:00
Tom Lane	0bb51aa967	Improve parsetree representation of special functions such as CURRENT_DATE. We implement a dozen or so parameterless functions that the SQL standard defines special syntax for. Up to now, that was done by converting them into more or less ad-hoc constructs such as "'now'::text::date". That's messy for multiple reasons: it exposes what should be implementation details to users, and performance is worse than it needs to be in several cases. To improve matters, invent a new expression node type SQLValueFunction that can represent any of these parameterless functions. Bump catversion because this changes stored parsetrees for rules. Discussion: <30058.1463091294@sss.pgh.pa.us>	2016-08-16 20:33:01 -04:00
Tom Lane	ed0097e4f9	Add SQL-accessible functions for inspecting index AM properties. Per discussion, we should provide such functions to replace the lost ability to discover AM properties by inspecting pg_am (cf commit `65c5fcd35`). The added functionality is also meant to displace any code that was looking directly at pg_index.indoption, since we'd rather not believe that the bit meanings in that field are part of any client API contract. As future-proofing, define the SQL API to not assume that properties that are currently AM-wide or index-wide will remain so unless they logically must be; instead, expose them only when inquiring about a specific index or even specific index column. Also provide the ability for an index AM to override the behavior. In passing, document pg_am.amtype, overlooked in commit `473b93287`. Andrew Gierth, with kibitzing by me and others Discussion: <87mvl5on7n.fsf@news-spur.riddles.org.uk>	2016-08-13 18:31:14 -04:00
Tom Lane	95bee941be	Fix misestimation of n_distinct for a nearly-unique column with many nulls. If ANALYZE found no repeated non-null entries in its sample, it set the column's stadistinct value to -1.0, intending to indicate that the entries are all distinct. But what this value actually means is that the number of distinct values is 100% of the table's rowcount, and thus it was overestimating the number of distinct values by however many nulls there are. This could lead to very poor selectivity estimates, as for example in a recent report from Andreas Joseph Krogh. We should discount the stadistinct value by whatever we've estimated the nulls fraction to be. (That is what will happen if we choose to use a negative stadistinct for a column that does have repeated entries, so this code path was just inconsistent.) In addition to fixing the stadistinct entries stored by several different ANALYZE code paths, adjust the logic where get_variable_numdistinct() forces an "all distinct" estimate on the basis of finding a relevant unique index. Unique indexes don't reject nulls, so there's no reason to assume that the null fraction doesn't apply. Back-patch to all supported branches. Back-patching is a bit of a judgment call, but this problem seems to affect only a few users (else we'd have identified it long ago), and it's bad enough when it does happen that destabilizing plan choices in a worse direction seems unlikely. Patch by me, with documentation wording suggested by Dean Rasheed Report: <VisenaEmail.26.df42f82acae38a58.156463942b8@tc7-visena> Discussion: <16143.1470350371@sss.pgh.pa.us>	2016-08-07 18:52:02 -04:00
Tom Lane	a3c7a993d5	Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. Previously, if an INSERT with multiple rows of VALUES had indirection (array subscripting or field selection) in its target-columns list, the parser handled that by applying transformAssignedExpr() to each element of each VALUES row independently. This led to having ArrayRef assignment nodes or FieldStore nodes in each row of the VALUES RTE. That works for simple cases, but in bug #14265 Nuri Boardman points out that it fails if there are multiple assignments to elements/fields of the same target column. For such cases to work, rewriteTargetListIU() has to nest the ArrayRefs or FieldStores together to produce a single expression to be assigned to the column. But it failed to find them in the top-level targetlist and issued an error about "multiple assignments to same column". We could possibly fix this by teaching the rewriter to apply rewriteTargetListIU to each VALUES row separately, but that would be messy (it would change the output rowtype of the VALUES RTE, for example) and inefficient. Instead, let's fix the parser so that the VALUES RTE outputs are just the user-specified values, cast to the right type if necessary, and then the ArrayRefs or FieldStores are applied in the top-level targetlist to Vars representing the RTE's outputs. This is the same parsetree representation already used for similar cases with INSERT/SELECT syntax, so it allows simplifications in ruleutils.c, which no longer needs to treat INSERT-from-multiple-VALUES as its own special case. This implementation works by applying transformAssignedExpr to the VALUES entries as before, and then stripping off any ArrayRefs or FieldStores it adds. With lots of VALUES rows it would be noticeably more efficient to not add those nodes in the first place. But that's just an optimization not a bug fix, and there doesn't seem to be any good way to do it without significant refactoring. (A non-invasive answer would be to apply transformAssignedExpr + stripping to just the first VALUES row, and then just forcibly cast remaining rows to the same data types exposed in the first row. But this way would lead to different, not-INSERT-specific errors being reported in casting failure cases, so it doesn't seem very nice.) So leave that for later; this patch at least isn't making the per-row parsing work worse, and it does make the finished parsetree smaller, saving rewriter and planner work. Catversion bump because stored rules containing such INSERTs would need to change. Because of that, no back-patch, even though this is a very long-standing bug. Report: <20160727005725.7438.26021@wrigleys.postgresql.org> Discussion: <9578.1469645245@sss.pgh.pa.us>	2016-08-03 16:37:03 -04:00
Fujii Masao	dd5eb805d5	Remove unused arguments from pg_replication_origin_xact_reset function. The document specifies that pg_replication_origin_xact_reset function doesn't have any argument variables. But previously it was actually defined so as to have two argument variables, though they were not used at all. That is, the pg_proc entry for that function was incorrect. This patch fixes the pg_proc entry and removes those two arguments from the function definition. No back-patch because this change needs a catalog version bump although the issue exists in 9.5 as well. Instead, a note about those unused argument variables will be added to 9.5 document later. Catalog version bumped due to the change of pg_proc.	2016-08-02 02:43:17 +09:00
Tom Lane	99dd8b05aa	Advance PG_CONTROL_VERSION. This should have been done in commit `73c986adde` which added several new fields to pg_control, and again in commit `5028f22f6e` which changed the CRC algorithm, but it wasn't. It's far too late to fix it in the 9.5 branch, but let's do so in 9.6, so that if a 9.6 postmaster is started against a 9.4-era pg_control it will complain about a versioning problem rather than a CRC failure. We already forced initdb/pg_upgrade for beta3, so there's no downside to doing this now. Discussion: <7615.1468598094@sss.pgh.pa.us>	2016-07-16 12:49:14 -04:00
Fujii Masao	60d50769b7	Rename pg_stat_wal_receiver.conn_info to conninfo. Per discussion on pgsql-hackers, conninfo is better as the column name because it's more commonly used in PostgreSQL. Catalog version bumped due to the change of pg_proc. Author: Michael Paquier	2016-07-07 12:59:39 +09:00
Alvaro Herrera	9ed551e0a4	Add conninfo to pg_stat_wal_receiver Commit `b1a9bad9e7` introduced a stats view to provide insight into the running WAL receiver, but neglected to include the connection string in it, as reported by Michaël Paquier. This commit fixes that omission. (Any security-sensitive information is not disclosed). While at it, close the mild security hole that we were exposing the password in the connection string in shared memory. This isn't user-accessible, but it still looks like a good idea to avoid having the cleartext password in memory. Author: Michaël Paquier, Álvaro Herrera Review by: Vik Fearing Discussion: https://www.postgresql.org/message-id/CAB7nPqStg4M561obo7ryZ5G+fUydG4v1Ajs1xZT1ujtu+woRag@mail.gmail.com	2016-06-29 16:57:17 -04:00
Tom Lane	19e972d558	Rethink node-level representation of partial-aggregation modes. The original coding had three separate booleans representing partial aggregation behavior, which was confusing, unreadable, and error-prone, not least because the booleans weren't always listed in the same order. It was also inadequate for the allegedly-desirable future extension to support intermediate partial aggregation, because we'd need separate markers for serialization and deserialization in such a case. Merge these bools into an enum "AggSplit" to provide symbolic names for the supported operating modes (and document what those are). By assigning the values of the enum constants carefully, we can treat AggSplit values as options bitmasks so that tests of what to do aren't noticeably more expensive than before. While at it, get rid of Aggref.aggoutputtype. That's not needed since commit `59a3795c2` got rid of setrefs.c's special-purpose Aggref comparison code, and it likewise seemed more confusing than helpful. Assorted comment cleanup as well (there's still more that I want to do in that line). catversion bump for change in Aggref node contents. Should be the last one for partial-aggregation changes. Discussion: <29309.1466699160@sss.pgh.pa.us>	2016-06-26 14:33:38 -04:00
Tom Lane	f8ace5477e	Fix type-safety problem with parallel aggregate serial/deserialization. The original specification for this called for the deserialization function to have signature "deserialize(serialtype) returns transtype", which is a security violation if transtype is INTERNAL (which it always would be in practice) and serialtype is not (which ditto). The patch blithely overrode the opr_sanity check for that, which was sloppy-enough work in itself, but the indisputable reason this cannot be allowed to stand is that CREATE FUNCTION will reject such a signature and thus it'd be impossible for extensions to create parallelizable aggregates. The minimum fix to make the signature type-safe is to add a second, dummy argument of type INTERNAL. But to lock it down a bit more and make misuse of INTERNAL-accepting functions less likely, let's get rid of the ability to specify a "serialtype" for an aggregate and just say that the only useful serialtype is BYTEA --- which, in practice, is the only interesting value anyway, due to the usefulness of the send/recv infrastructure for this purpose. That means we only have to allow "serialize(internal) returns bytea" and "deserialize(bytea, internal) returns internal" as the signatures for these support functions. In passing fix bogus signature of int4_avg_combine, which I found thanks to adding an opr_sanity check on combinefunc signatures. catversion bump due to removing pg_aggregate.aggserialtype and adjusting signatures of assorted built-in functions. David Rowley and Tom Lane Discussion: <27247.1466185504@sss.pgh.pa.us>	2016-06-22 16:52:41 -04:00
Tom Lane	915b703e16	Fix handling of argument and result datatypes for partial aggregation. When doing partial aggregation, the args list of the upper (combining) Aggref node is replaced by a Var representing the output of the partial aggregation steps, which has either the aggregate's transition data type or a serialized representation of that. However, nodeAgg.c blindly continued to use the args list as an indication of the user-level argument types. This broke resolution of polymorphic transition datatypes at executor startup (though it accidentally failed to fail for the ANYARRAY case, which is likely the only one anyone had tested). Moreover, the constructed FuncExpr passed to the finalfunc contained completely wrong information, which would have led to bogus answers or crashes for any case where the finalfunc examined that information (which is only likely to be with polymorphic aggregates using a non-polymorphic transition type). As an independent bug, apply_partialaggref_adjustment neglected to resolve a polymorphic transition datatype before assigning it as the output type of the lower-level Aggref node. This again accidentally failed to fail for ANYARRAY but would be unlikely to work in other cases. To fix the first problem, record the user-level argument types in a separate OID-list field of Aggref, and look to that rather than the args list when asking what the argument types were. (It turns out to be convenient to include any "direct" arguments in this list too, although those are not currently subject to being overwritten.) Rather than adding yet another resolve_aggregate_transtype() call to fix the second problem, add an aggtranstype field to Aggref, and store the resolved transition type OID there when the planner first computes it. (By doing this in the planner and not the parser, we can allow the aggregate's transition type to change from time to time, although no DDL support yet exists for that.) This saves nothing of consequence for simple non-polymorphic aggregates, but for polymorphic transition types we save a catalog lookup during executor startup as well as several planner lookups that are new in 9.6 due to parallel query planning. In passing, fix an error that was introduced into count_agg_clauses_walker some time ago: it was applying exprTypmod() to something that wasn't an expression node at all, but a TargetEntry. exprTypmod silently returned -1 so that there was not an obvious failure, but this broke the intended sensitivity of aggregate space consumption estimates to the typmod of varchar and similar data types. This part needs to be back-patched. Catversion bump due to change of stored Aggref nodes. Discussion: <8229.1466109074@sss.pgh.pa.us>	2016-06-17 21:44:37 -04:00
Robert Haas	71d05a2c7b	pg_visibility: Add pg_truncate_visibility_map function. This requires some core changes as well so that we can properly WAL-log the truncation. Specifically, it changes the format of the XLOG_SMGR_TRUNCATE WAL record, so bump XLOG_PAGE_MAGIC. Patch by me, reviewed but not fully endorsed by Andres Freund.	2016-06-17 17:37:30 -04:00
Robert Haas	c7a25c242f	Mark some functions parallel-unsafe. currtid() and currtid2() call GetLatestSnapshot(), which fails in parallel mode. pg_export_snapshot() calls ExportSnapshot() which attempts to assign an XID for the current transaction if it does not already have one; that, too, will fail in parallel mode. Andreas Seltenreich	2016-06-15 11:40:07 -04:00
Tom Lane	cae1c788b9	Improve the situation for parallel query versus temp relations. Transmit the leader's temp-namespace state to workers. This is important because without it, the workers do not really have the same search path as the leader. For example, there is no good reason (and no extant code either) to prevent a worker from executing a temp function that the leader created previously; but as things stood it would fail to find the temp function, and then either fail or execute the wrong function entirely. We still prohibit a worker from creating a temp namespace on its own. In effect, a worker can only see the session's temp namespace if the leader had created it before starting the worker, which seems like the right semantics. Also, transmit the leader's BackendId to workers, and arrange for workers to use that when determining the physical file path of a temp relation belonging to their session. While the original intent was to prevent such accesses entirely, there were a number of holes in that, notably in places like dbsize.c which assume they can safely access temp rels of other sessions anyway. We might as well get this right, as a small down payment on someday allowing workers to access the leader's temp tables. (With this change, directly using "MyBackendId" as a relation or buffer backend ID is deprecated; you should use BackendIdForTempRelations() instead. I left a couple of such uses alone though, as they're not going to be reachable in parallel workers until we do something about localbuf.c.) Move the thou-shalt-not-access-thy-leader's-temp-tables prohibition down into localbuf.c, which is where it actually matters, instead of having it in relation_open(). This amounts to recognizing that access to temp tables' catalog entries is perfectly safe in a worker, it's only the data in local buffers that is problematic. Having done all that, we can get rid of the test in has_parallel_hazard() that says that use of a temp table's rowtype is unsafe in parallel workers. That test was unduly expensive, and if we really did need such a prohibition, that was not even close to being a bulletproof guard for it. (For example, any user-defined function executed in a parallel worker might have attempted such access.)	2016-06-09 20:16:11 -04:00
Robert Haas	4bc424b968	pgindent run for 9.6	2016-06-09 18:02:36 -04:00
Tom Lane	7feb60c1bb	Clarify documentation of ceil/ceiling/floor functions. Document these as "nearest integer >= argument" and "nearest integer <= argument", which will hopefully be less confusing than the old formulation. New wording is from Matlab via Dean Rasheed. I changed the pg_description entries as well as the SGML docs. In the back branches, this will only affect installations initdb'd in the future, but it should be harmless otherwise. Discussion: <CAEZATCW3yzJo-NMSiQs5jXNFbTsCEftZS-Og8=FvFdiU+kYuSA@mail.gmail.com>	2016-06-09 11:58:00 -04:00
Tom Lane	16ea51a263	Pin the built-in index access methods. This was overlooked in commit `473b93287`, which introduced DROP ACCESS METHOD. Although that command is restricted to superusers, we don't want even superusers dropping the built-in methods; "DROP ACCESS METHOD btree" in particular is unrecoverable from. Pin these objects in the same way that other initdb-created objects are pinned. I chose to bump catversion for this fix. That's not absolutely necessary perhaps, but it will ensure that no 9.6 production systems are missing the pin entries.	2016-05-19 14:40:02 -04:00
Tom Lane	1a2c17f8e2	Fix pg_upgrade to not fail when new-cluster TOAST rules differ from old. This patch essentially reverts commit `4c6780fd17`, in favor of a much simpler solution for the case where the new cluster would choose to create a TOAST table but the old cluster doesn't have one: just don't create a TOAST table. The existing code failed in at least two different ways if the situation arose: (1) ALTER TABLE RESET didn't grab an exclusive lock, so that the lock sanity check in create_toast_table failed; (2) pg_upgrade did not provide a pg_type OID for the new toast table, so that the crosscheck in TypeCreate failed. While both these problems were introduced by later patches, they show that the hack being used to cause TOAST table creation is overwhelmingly fragile (and untested). I also note that before the TypeCreate crosscheck was added, the code would have resulted in assigning an indeterminate pg_type OID to the toast table, possibly causing a later OID conflict in that catalog; so that it didn't really work even when committed. If we simply don't create a TOAST table, there will only be a problem if the code tries to store a tuple that's wider than a page, and field compression isn't sufficient to get it under a page. Given that the TOAST creation threshold is intended to be about a quarter of a page, it's very hard to believe that cross-version differences in the do-we-need-a-toast- table heuristic could result in an observable problem. So let's just follow the old version's conclusion about whether a TOAST table is needed. (If we ever do change needs_toast_table() so much that this conclusion doesn't apply, we can devise a solution at that time, and hopefully do it in a less klugy way than `4c6780fd17` did.) Back-patch to 9.3, like the previous patch. Discussion: <8110.1462291671@sss.pgh.pa.us>	2016-05-06 22:05:56 -04:00
Tom Lane	0b9a234432	Rename tsvector delete() to ts_delete(), and filter() to ts_filter(). The similarity of the original names to SQL keywords seems like a bad idea. Rename them before we're stuck with 'em forever. In passing, minor code and docs cleanup. Discussion: <4875.1462210058@sss.pgh.pa.us>	2016-05-05 19:43:32 -04:00
Robert Haas	9888b34fdb	Fix more things to be parallel-safe. Conversion functions were previously marked as parallel-unsafe, since that is the default, but in fact they are safe. Parallel-safe functions defined in pg_proc.h and redefined in system_views.sql were ending up as parallel-unsafe because the redeclarations were not marked PARALLEL SAFE. While editing system_views.sql, mark ts_debug() parallel safe also. Andreas Karlsson	2016-05-03 14:36:38 -04:00
Robert Haas	37d0c2cb1a	Fix parallel safety markings for pg_start_backup. Commit `7117685461` made pg_start_backup parallel-restricted rather than parallel-safe, because it now relies on backend-private state that won't be synchronized with the parallel worker. However, it didn't update pg_proc.h. Separately, Andreas Karlsson observed that system_views.sql neglected to reiterate the parallel-safety markings whe redefining various functions, including this one; so add a PARALLEL RESTRICTED declaration there to match the new value in pg_proc.h.	2016-05-02 10:42:34 -04:00
Stephen Frost	7a542700df	Create default roles This creates an initial set of default roles which administrators may use to grant access to, historically, superuser-only functions. Using these roles instead of granting superuser access reduces the number of superuser roles required for a system. Documention for each of the default roles has been added to user-manag.sgml. Bump catversion to 201604082, as we had a commit that bumped it to 201604081 and another that set it back to 201604071... Reviews by José Luis Tallón and Robert Haas	2016-04-08 16:56:27 -04:00
Teodor Sigaev	8b99edefca	Revert CREATE INDEX ... INCLUDING ... It's not ready yet, revert two commits `690c543550` - unstable test output `386e3d7609` - patch itself	2016-04-08 21:52:13 +03:00
Robert Haas	af025eed53	Add combine functions for various floating-point aggregates. This allows parallel aggregation to use them. It may seem surprising that we use float8_combine for both float4_accum and float8_accum transition functions, but that's because those functions differ only in the type of the non-transition-state argument. Haribabu Kommi, reviewed by David Rowley and Tomas Vondra	2016-04-08 13:47:06 -04:00
Teodor Sigaev	386e3d7609	CREATE INDEX ... INCLUDING (column[, ...]) Now indexes (but only B-tree for now) can contain "extra" column(s) which doesn't participate in index structure, they are just stored in leaf tuples. It allows to use index only scan by using single index instead of two or more indexes. Author: Anastasia Lubennikova with minor editorializing by me Reviewers: David Rowley, Peter Geoghegan, Jeff Janes	2016-04-08 19:45:59 +03:00
Teodor Sigaev	bb140506df	Phrase full text search. Patch introduces new text search operator (<-> or <DISTANCE>) into tsquery. On-disk and binary in/out format of tsquery are backward compatible. It has two side effect: - change order for tsquery, so, users, who has a btree index over tsquery, should reindex it - less number of parenthesis in tsquery output, and tsquery becomes more readable Authors: Teodor Sigaev, Oleg Bartunov, Dmitry Ivanov Reviewers: Alexander Korotkov, Artur Zakirov	2016-04-07 18:44:18 +03:00
Stephen Frost	29dd1504a1	Bump catversion for pg_dump dump catalog ACL patches Pointed out by Tom.	2016-04-06 23:04:48 -04:00
Stephen Frost	23f34fa4ba	In pg_dump, include pg_catalog and extension ACLs, if changed Now that all of the infrastructure exists, add in the ability to dump out the ACLs of the objects inside of pg_catalog or the ACLs for objects which are members of extensions, but only if they have been changed from their original values. The original values are tracked in pg_init_privs. When pg_dump'ing 9.6-and-above databases, we will dump out the ACLs for all objects in pg_catalog and the ACLs for all extension members, where the ACL has been changed from the original value which was set during either initdb or CREATE EXTENSION. This should not change dumps against pre-9.6 databases. Reviews by Alexander Korotkov, Jose Luis Tallon	2016-04-06 21:45:32 -04:00
Stephen Frost	6c268df127	Add new catalog called pg_init_privs This new catalog holds the privileges which the system was initialized with at initdb time, along with any permissions set by extensions at CREATE EXTENSION time. This allows pg_dump (and any other similar use-cases) to detect when the privileges set on initdb-created or extension-created objects have been changed from what they were set to at initdb/extension-creation time and handle those changes appropriately. Reviews by Alexander Korotkov, Jose Luis Tallon	2016-04-06 21:45:32 -04:00
Teodor Sigaev	0b62fd036e	Add jsonb_insert It inserts a new value into an jsonb array at arbitrary position or a new key to jsonb object. Author: Dmitry Dolgov Reviewers: Petr Jelinek, Vitaly Burovoy, Andrew Dunstan	2016-04-06 19:25:00 +03:00
Simon Riggs	3fe3511d05	Generic Messages for Logical Decoding API and mechanism to allow generic messages to be inserted into WAL that are intended to be read by logical decoding plugins. This commit adds an optional new callback to the logical decoding API. Messages are either text or bytea. Messages can be transactional, or not, and are identified by a prefix to allow multiple concurrent decoding plugins. (Not to be confused with Generic WAL records, which are intended to allow crash recovery of extensible objects.) Author: Petr Jelinek and Andres Freund Reviewers: Artur Zakirov, Tomas Vondra, Simon Riggs Discussion: 5685F999.6010202@2ndquadrant.com	2016-04-06 10:05:41 +01:00
Alvaro Herrera	f2fcad27d5	Support ALTER THING .. DEPENDS ON EXTENSION This introduces a new dependency type which marks an object as depending on an extension, such that if the extension is dropped, the object automatically goes away; and also, if the database is dumped, the object is included in the dump output. Currently the grammar supports this for indexes, triggers, materialized views and functions only, although the utility code is generic so adding support for more object types is a matter of touching the parser rules only. Author: Abhijit Menon-Sen Reviewed-by: Alexander Korotkov, Álvaro Herrera Discussion: http://www.postgresql.org/message-id/20160115062649.GA5068@toroid.org	2016-04-05 18:38:54 -03:00
Robert Haas	41ea0c2376	Fix parallel-safety code for parallel aggregation. has_parallel_hazard() was ignoring the proparallel markings for aggregates, which is no good. Fix that. There was no way to mark an aggregate as actually being parallel-safe, either, so add a PARALLEL option to CREATE AGGREGATE. Patch by me, reviewed by David Rowley.	2016-04-05 16:06:15 -04:00
Robert Haas	11c8669c0c	Add parallel query support functions for assorted aggregates. This lets us use parallel aggregate for a variety of useful cases that didn't work before, like sum(int8), sum(numeric), several versions of avg(), and various other functions. Add some regression tests, as well, testing the general sanity of these and future catalog entries. David Rowley, reviewed by Tomas Vondra, with a few further changes by me.	2016-04-05 14:32:53 -04:00
Magnus Hagander	7117685461	Implement backup API functions for non-exclusive backups Previously non-exclusive backups had to be done using the replication protocol and pg_basebackup. With this commit it's now possible to make them using pg_start_backup/pg_stop_backup as well, as long as the backup program can maintain a persistent connection to the database. Doing this, backup_label and tablespace_map are returned as results from pg_stop_backup() instead of being written to the data directory. This makes the server safe from a crash during an ongoing backup, which can be a problem with exclusive backups. The old syntax of the functions remain and work exactly as before, but since the new syntax is safer this should eventually be deprecated and removed. Only reference documentation is included. The main section on backup still needs to be rewritten to cover this, but since that is already scheduled for a separate large rewrite, it's not included in this patch. Reviewed by David Steele and Amit Kapila	2016-04-05 20:03:49 +02:00
Teodor Sigaev	2d02a856e8	Bump catalog version, forget in `acdf2a8b37`	2016-03-30 18:56:21 +03:00
Teodor Sigaev	acdf2a8b37	Introduce SP-GiST operator class over box. Patch implements quad-tree over boxes, naive approach of 2D quad tree will not work for any non-point objects because splitting space on node is not efficient. The idea of pathc is treating 2D boxes as 4D points, so, object will not overlap (in 4D space). The performance tests reveal that this technique especially beneficial with too much overlapping objects, so called "spaghetti data". Author: Alexander Lebedev with editorization by Emre Hasegeli and me	2016-03-30 18:42:36 +03:00
Tom Lane	e511d878f3	Allow to_timestamp(float8) to convert float infinity to timestamp infinity. With the original SQL-function implementation, such cases failed because we don't support infinite intervals. Converting the function to C lets us bypass the interval representation, which should be a bit faster as well as more flexible. Vitaly Burovoy, reviewed by Anastasia Lubennikova	2016-03-29 17:09:29 -04:00
Robert Haas	5fe5a2cee9	Allow aggregate transition states to be serialized and deserialized. This is necessary infrastructure for supporting parallel aggregation for aggregates whose transition type is "internal". Such values can't be passed between cooperating processes, because they are just pointers. David Rowley, reviewed by Tomas Vondra and by me.	2016-03-29 15:04:05 -04:00
Tom Lane	c94959d411	Fix DROP OPERATOR to reset oprcom/oprnegate links to the dropped operator. This avoids leaving dangling links in pg_operator; which while fairly harmless are also unsightly. While we're at it, simplify OperatorUpd, which went through heap_modify_tuple for no very good reason considering it had already made a tuple copy it could just scribble on. Roma Sokolov, reviewed by Tomas Vondra, additional hacking by Robert Haas and myself.	2016-03-25 12:33:16 -04:00
Alvaro Herrera	473b932870	Support CREATE ACCESS METHOD This enables external code to create access methods. This is useful so that extensions can add their own access methods which can be formally tracked for dependencies, so that DROP operates correctly. Also, having explicit support makes pg_dump work correctly. Currently only index AMs are supported, but we expect different types to be added in the future. Authors: Alexander Korotkov, Petr Jelínek Reviewed-By: Teodor Sigaev, Petr Jelínek, Jim Nasby Commitfest-URL: https://commitfest.postgresql.org/9/353/ Discussion: https://www.postgresql.org/message-id/CAPpHfdsXwZmojm6Dx+TJnpYk27kT4o7Ri6X_4OSWcByu1Rm+VA@mail.gmail.com	2016-03-23 23:01:35 -03:00
Robert Haas	e06a38965b	Support parallel aggregation. Parallel workers can now partially aggregate the data and pass the transition values back to the leader, which can combine the partial results to produce the final answer. David Rowley, based on earlier work by Haribabu Kommi. Reviewed by Álvaro Herrera, Tomas Vondra, Amit Kapila, James Sewell, and me.	2016-03-21 09:30:18 -04:00
Peter Eisentraut	b555ed8102	Merge wal_level "archive" and "hot_standby" into new name "replica" The distinction between "archive" and "hot_standby" existed only because at the time "hot_standby" was added, there was some uncertainty about stability. This is now a long time ago. We would like to move forward with simplifying the replication configuration, but this distinction is in the way, because a primary server cannot tell (without asking a standby or predicting the future) which one of these would be the appropriate level. Pick a new name for the combined setting to make it clearer that it covers all (non-logical) backup and replication uses. The old values are still accepted but are converted internally. Reviewed-by: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: David Steele <david@pgmasters.net>	2016-03-18 23:56:03 +01:00
Teodor Sigaev	3187d6de0e	Introduce parse_ident() SQL-layer function to split qualified identifier into array parts. Author: Pavel Stehule with minor editorization by me and Jim Nasby	2016-03-18 18:16:14 +03:00
Robert Haas	c16dc1aca5	Add simple VACUUM progress reporting. There's a lot more that could be done here yet - in particular, this reports only very coarse-grained information about the index vacuuming phase - but even as it stands, the new pg_stat_progress_vacuum can tell you quite a bit about what a long-running vacuum is actually doing. Amit Langote and Robert Haas, based on earlier work by Vinayak Pokale and Rahila Syed.	2016-03-15 13:32:56 -04:00
Tom Lane	2da7549987	pg_stat_get_progress_info() should be marked STRICT. I didn't bother with a catversion bump. Report and patch by Thomas Munro	2016-03-14 12:51:55 -04:00
Teodor Sigaev	a9eb6c83ef	Bump catalog version missed in `6943a946c7`	2016-03-11 19:31:04 +03:00
Teodor Sigaev	6943a946c7	Tsvector editing functions Adds several tsvector editting function: convert tsvector to/from text array, set weight for given lexemes, delete lexeme(s), unnest, filter lexemes with given weights Author: Stas Kelvich with some editorization by me Reviewers: Tomas Vondram, Teodor Sigaev	2016-03-11 19:22:36 +03:00
Robert Haas	53be0b1add	Provide much better wait information in pg_stat_activity. When a process is waiting for a heavyweight lock, we will now indicate the type of heavyweight lock for which it is waiting. Also, you can now see when a process is waiting for a lightweight lock - in which case we will indicate the individual lock name or the tranche, as appropriate - or for a buffer pin. Amit Kapila, Ildus Kurbangaliev, reviewed by me. Lots of helpful discussion and suggestions by many others, including Alexander Korotkov, Vladimir Borodin, and many others.	2016-03-10 12:44:09 -05:00
Robert Haas	b6fb6471f6	Add a generic command progress reporting facility. Using this facility, any utility command can report the target relation upon which it is operating, if there is one, and up to 10 64-bit counters; the intent of this is that users should be able to figure out what a utility command is doing without having to resort to ugly hacks like attaching strace to a backend. As a demonstration, this adds very crude reporting to lazy vacuum; we just report the target relation and nothing else. A forthcoming patch will make VACUUM report a bunch of additional data that will make this much more interesting. But this gets the basic framework in place. Vinayak Pokale, Rahila Syed, Amit Langote, Robert Haas, reviewed by Kyotaro Horiguchi, Jim Nasby, Thom Brown, Masahiko Sawada, Fujii Masao, and Masanori Oyama.	2016-03-09 12:08:58 -05:00
Joe Conway	dc7d70ea05	Expose control file data via SQL accessible functions. Add four new SQL accessible functions: pg_control_system(), pg_control_checkpoint(), pg_control_recovery(), and pg_control_init() which expose a subset of the control file data. Along the way move the code to read and validate the control file to src/common, where it can be shared by the new backend functions and the original pg_controldata frontend program. Patch by me, significant input, testing, and review by Michael Paquier.	2016-03-05 11:10:19 -08:00
Tom Lane	eb43e851d6	Create stub functions to support pg_upgrade of old contrib/tsearch2. Commits `9ff60273e3` and `dbe2328959` adjusted the declarations of some core functions referenced by contrib/tsearch2's install script, forgetting that in a pg_upgrade situation, we'll be trying to restore operator class definitions that reference the old signatures. We've hit this problem before; solve it in the same way as before, namely by installing stub functions that have the expected signature and just invoke the correct function. Per report from Jeff Janes. (Someday we ought to stop supporting contrib/tsearch2, but I'm not sure today is that day.)	2016-03-02 17:37:54 -05:00
Robert Haas	a892234f83	Change the format of the VM fork to add a second bit per page. The new bit indicates whether every tuple on the page is already frozen. It is cleared only when the all-visible bit is cleared, and it can be set only when we vacuum a page and find that every tuple on that page is both visible to every transaction and in no need of any future vacuuming. A future commit will use this new bit to optimize away full-table scans that would otherwise be triggered by XID wraparound considerations. A page which is merely all-visible must still be scanned in that case, but a page which is all-frozen need not be. This commit does not attempt that optimization, although that optimization is the goal here. It seems better to get the basic infrastructure in place first. Per discussion, it's very desirable for pg_upgrade to automatically migrate existing VM forks from the old format to the new format. That, too, will be handled in a follow-on patch. Masahiko Sawada, reviewed by Kyotaro Horiguchi, Fujii Masao, Amit Kapila, Simon Riggs, Andres Freund, and others, and substantially revised by me.	2016-03-01 21:49:41 -05:00
Tom Lane	52f5d578d6	Create a function to reliably identify which sessions block which others. This patch introduces "pg_blocking_pids(int) returns int[]", which returns the PIDs of any sessions that are blocking the session with the given PID. Historically people have obtained such information using a self-join on the pg_locks view, but it's unreasonably tedious to do it that way with any modicum of correctness, and the addition of parallel queries has pretty much broken that approach altogether. (Given some more columns in the view than there are today, you could imagine handling parallel-query cases with a 4-way join; but ugh.) The new function has the following behaviors that are painful or impossible to get right via pg_locks: 1. Correctly understands which lock modes block which other ones. 2. In soft-block situations (two processes both waiting for conflicting lock modes), only the one that's in front in the wait queue is reported to block the other. 3. In parallel-query cases, reports all sessions blocking any member of the given PID's lock group, and reports a session by naming its leader process's PID, which will be the pg_backend_pid() value visible to clients. The motivation for doing this right now is mostly to fix the isolation tests. Commit `38f8bdcac4` lobotomized isolationtester's is-it-waiting query by removing its ability to recognize nonconflicting lock modes, as a crude workaround for the inability to handle soft-block situations properly. But even without the lock mode tests, the old query was excessively slow, particularly in CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new deadlock-hard test because the deadlock timeout elapses before they can probe the waiting status of all eight sessions. Replacing the pg_locks self-join with use of pg_blocking_pids() is not only much more correct, but a lot faster: I measure it at about 9X faster in a typical dev build with Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds. That should provide enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the test, without having to lengthen deadlock_timeout yet more and thus slow down the test for everyone else.	2016-02-22 14:31:43 -05:00
Dean Rasheed	53874c5228	Add pg_size_bytes() to parse human-readable size strings. This will parse strings in the format produced by pg_size_pretty() and return sizes in bytes. This allows queries to be written with clauses like "pg_total_relation_size(oid) > pg_size_bytes('10 GB')". Author: Pavel Stehule with various improvements by Vitaly Burovoy Discussion: http://www.postgresql.org/message-id/CAFj8pRD-tGoDKnxdYgECzA4On01_uRqPrwF-8LdkSE-6bDHp0w@mail.gmail.com Reviewed-by: Vitaly Burovoy, Oleksandr Shulgin, Kyotaro Horiguchi, Michael Paquier and Robert Haas	2016-02-20 09:57:27 +00:00
Joe Conway	a5c43b8869	Add new system view, pg_config Move and refactor the underlying code for the pg_config client application to src/common in support of sharing it with a new system information SRF called pg_config() which makes the same information available via SQL. Additionally wrap the SRF with a new system view, as called pg_config. Patch by me with extensive input and review by Michael Paquier and additional review by Alvaro Herrera.	2016-02-17 09:12:06 -08:00
Joe Conway	851636bfda	Move DATA entry to correct position In commit `7b4bfc87` the DATA and DESCR entries for the new row_security_active() function were inadvertantly put after the PROVOLATILE defines, rather than before as they should have been placed. Move them up where they belong. Backpatch to 9.5 where the new entries were introduced.	2016-02-15 16:38:47 -08:00
Tom Lane	d4c3a156cb	Remove GROUP BY columns that are functionally dependent on other columns. If a GROUP BY clause includes all columns of a non-deferred primary key, as well as other columns of the same relation, those other columns are redundant and can be dropped from the grouping; the pkey is enough to ensure that each row of the table corresponds to a separate group. Getting rid of the excess columns will reduce the cost of the sorting or hashing needed to implement GROUP BY, and can indeed remove the need for a sort step altogether. This seems worth testing for since many query authors are not aware of the GROUP-BY-primary-key exception to the rule about queries not being allowed to reference non-grouped-by columns in their targetlists or HAVING clauses. Thus, redundant GROUP BY items are not uncommon. Also, we can make the test pretty cheap in most queries where it won't help by not looking up a rel's primary key until we've found that at least two of its columns are in GROUP BY. David Rowley, reviewed by Julien Rouhaud	2016-02-11 17:34:59 -05:00
Tom Lane	72eee410d4	Move pg_constraint.h function declarations to new file pg_constraint_fn.h. A pending patch requires exporting a function returning Bitmapset from catalog/pg_constraint.c. As things stand, that would mean including nodes/bitmapset.h in pg_constraint.h, which might be hazardous for the client-side includability of that header. It's not entirely clear whether any client-side code needs to include pg_constraint.h, but it seems prudent to assume that there is some such code somewhere. Therefore, split off the function definitions into a new file pg_constraint_fn.h, similarly to what we've done for some other catalog header files.	2016-02-11 15:51:28 -05:00
Robert Haas	d89f06f048	Fix parallel-safety markings for pg_upgrade functions. These establish backend-local state which will not be copied to parallel workers, so they must be marked parallel-restricted, not parallel-safe.	2016-02-07 11:45:21 -05:00
Tom Lane	6819514fca	Add num_nulls() and num_nonnulls() to count NULL arguments. An example use-case is "CHECK(num_nonnulls(a,b,c) = 1)" to assert that exactly one of a,b,c isn't NULL. The functions are variadic, so they can also be pressed into service to count the number of null or nonnull elements in an array. Marko Tiikkaja, reviewed by Pavel Stehule	2016-02-04 23:03:37 -05:00
Robert Haas	b47b4dbf68	Extend sortsupport for text to more opclasses. Have varlena.c expose an interface that allows the char(n), bytea, and bpchar types to piggyback on a now-generalized SortSupport for text. This pushes a little more knowledge of the bpchar/char(n) type into varlena.c than might be preferred, but that seems like the approach that creates least friction. Also speed things up for index builds that use text_pattern_ops or varchar_pattern_ops. This patch does quite a bit of renaming, but it seems likely to be worth it, so as to avoid future confusion about the fact that this code is now more generally used than the old names might have suggested. Peter Geoghegan, reviewed by Álvaro Herrera and Andreas Karlsson, with small tweaks by me.	2016-02-03 14:29:53 -05:00
Tom Lane	2ad83fff22	Remove unnecessary "implementation of FOO operator" DESCR() entries. Apparently at least one committer hasn't gotten the word that these do not need to be maintained by hand, since initdb will create them automatically. Noted while fixing bug #13905. No catversion bump since the post-initdb state is exactly the same either way. I don't see a need for back-patch, either.	2016-02-02 11:52:27 -05:00
Tom Lane	a4627e8fd4	Fix pg_description entries for jsonb_to_record() and jsonb_to_recordset(). All the other jsonb function descriptions refer to the arguments as being "jsonb", but these two said "json". Make it consistent. Per bug #13905 from Petru Florin Mihancea. No catversion bump --- we can't force one in the back branches, and this isn't very critical anyway.	2016-02-02 11:39:50 -05:00
Fujii Masao	7f46eaf035	Add gin_clean_pending_list function to clean up GIN pending list This function cleans up the pending list of the GIN index by moving entries in it to the main GIN data structure in bulk. It returns the number of pages cleaned up from the pending list. This function is useful, for example, when the pending list needs to be cleaned up quickly to improve the performance of the search using GIN index. VACUUM can do the same thing, too, but it may take days to run on a large table. Jeff Janes, reviewed by Julien Rouhaud, Jaime Casanova, Alvaro Herrera and me. Discussion: CAMkU=1x8zFkpfnozXyt40zmR3Ub_kHu58LtRmwHUKRgQss7=iQ@mail.gmail.com	2016-01-28 12:57:52 +09:00
Fujii Masao	e09507a272	Fix volatility marking of pg_size_pretty function pg_size_pretty function should be marked immutable rather than volatile because it always returns the same result given the same argument. Pavel Stehule	2016-01-27 11:13:31 +09:00
Tom Lane	e1bd684a34	Add trigonometric functions that work in degrees. The implementations go to some lengths to deliver exact results for values where an exact result can be expected, such as sind(30) = 0.5 exactly. Dean Rasheed, reviewed by Michael Paquier	2016-01-22 15:46:22 -05:00
Robert Haas	a7de3dc5c3	Support multi-stage aggregation. Aggregate nodes now have two new modes: a "partial" mode where they output the unfinalized transition state, and a "finalize" mode where they accept unfinalized transition states rather than individual values as input. These new modes are not used anywhere yet, but they will be necessary for parallel aggregation. The infrastructure also figures to be useful for cases where we want to aggregate local data and remote data via the FDW interface, and want to bring back partial aggregates from the remote side that can then be combined with locally generated partial aggregates to produce the final value. It may also be useful even when neither FDWs nor parallelism are in play, as explained in the comments in nodeAgg.c. David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki Linnakangas, Haribabu Kommi, and me.	2016-01-20 13:46:50 -05:00
Tom Lane	dbe2328959	Fix assorted inconsistencies in GIN opclass support function declarations. GIN had some minor issues too, mostly using "internal" where something else would be more appropriate. I went with the same approach as in `9ff60273e3`, namely preferring the opclass' indexed datatype for arguments that receive an operator RHS value, even if that's not necessarily what they really are. Again, this is with an eye to having a uniform rule for ginvalidate() to check support function signatures.	2016-01-19 22:32:22 -05:00
Tom Lane	9ff60273e3	Fix assorted inconsistencies in GiST opclass support function declarations. The conventions specified by the GiST SGML documentation were widely ignored. For example, the strategy-number argument for "consistent" and "distance" functions is specified to be a smallint, but most of the built-in support functions declared it as an integer, and for that matter the core code passed it using Int32GetDatum not Int16GetDatum. None of that makes any real difference at runtime, but it's quite confusing for newcomers to the code, and it makes it very hard to write an amvalidate() function that checks support function signatures. So let's try to instill some consistency here. Another similar issue is that the "query" argument is not of a single well-defined type, but could have different types depending on the strategy (corresponding to search operators with different righthand-side argument types). Some of the functions threw up their hands and declared the query argument as being of "internal" type, which surely isn't right ("any" would have been more appropriate); but the majority position seemed to be to declare it as being of the indexed data type, corresponding to a search operator with both input types the same. So I've specified a convention that that's what to do always. Also, the result of the "union" support function actually must be of the index's storage type, but the documentation suggested declaring it to return "internal", and some of the functions followed that. Standardize on telling the truth, instead. Similarly, standardize on declaring the "same" function's inputs as being of the storage type, not "internal". Also, somebody had forgotten to add the "recheck" argument to both the documentation of the "distance" support function and all of their SQL declarations, even though the C code was happily using that argument. Clean that up too. Fix up some other omissions in the docs too, such as documenting that union's second input argument is vestigial. So far as the errors in core function declarations go, we can just fix pg_proc.h and bump catversion. Adjusting the erroneous declarations in contrib modules is more debatable: in principle any change in those scripts should involve an extension version bump, which is a pain. However, since these changes are purely cosmetic and make no functional difference, I think we can get away without doing that.	2016-01-19 12:04:36 -05:00
Tom Lane	65c5fcd353	Restructure index access method API to hide most of it at the C level. This patch reduces pg_am to just two columns, a name and a handler function. All the data formerly obtained from pg_am is now provided in a C struct returned by the handler function. This is similar to the designs we've adopted for FDWs and tablesample methods. There are multiple advantages. For one, the index AM's support functions are now simple C functions, making them faster to call and much less error-prone, since the C compiler can now check function signatures. For another, this will make it far more practical to define index access methods in installable extensions. A disadvantage is that SQL-level code can no longer see attributes of index AMs; in particular, some of the crosschecks in the opr_sanity regression test are no longer possible from SQL. We've addressed that by adding a facility for the index AM to perform such checks instead. (Much more could be done in that line, but for now we're content if the amvalidate functions more or less replace what opr_sanity used to do.) We might also want to expose some sort of reporting functionality, but this patch doesn't do that. Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily editorialized on by me.	2016-01-17 19:36:59 -05:00
Simon Riggs	e63bb4549a	Add new user fn pg_current_xlog_flush_location() Tomas Vondra, reviewed by Michael Paquier and Amit Kapila Minor edits by me	2016-01-12 07:54:52 +00:00
Tom Lane	26d538dc93	Clean up some lack-of-STRICT issues in the core code, too. A scan for missed proisstrict markings in the core code turned up these functions: brin_summarize_new_values pg_stat_reset_single_table_counters pg_stat_reset_single_function_counters pg_create_logical_replication_slot pg_create_physical_replication_slot pg_drop_replication_slot The first three of these take OID, so a null argument will normally look like a zero to them, resulting in "ERROR: could not open relation with OID 0" for brin_summarize_new_values, and no action for the pg_stat_reset_XXX functions. The other three will dump core on a null argument, though this is mitigated by the fact that they won't do so until after checking that the caller is superuser or has rolreplication privilege. In addition, the pg_logical_slot_get/peek[_binary]_changes family was intentionally marked nonstrict, but failed to make nullness checks on all the arguments; so again a null-pointer-dereference crash is possible but only for superusers and rolreplication users. Add the missing ARGISNULL checks to the latter functions, and mark the former functions as strict in pg_proc. Make that change in the back branches too, even though we can't force initdb there, just so that installations initdb'd in future won't have the issue. Since none of these bugs rise to the level of security issues (and indeed the pg_stat_reset_XXX functions hardly misbehave at all), it seems sufficient to do this. In addition, fix some order-of-operations oddities in the slot_get_changes family, mostly cosmetic, but not the part that moves the function's last few operations into the PG_TRY block. As it stood, there was significant risk for an error to exit without clearing historical information from the system caches. The slot_get_changes bugs go back to 9.4 where that code was introduced. Back-patch appropriate subsets of the pg_proc changes into all active branches, as well.	2016-01-09 16:58:32 -05:00
Alvaro Herrera	b1a9bad9e7	pgstat: add WAL receiver status view & SRF This new view provides insight into the state of a running WAL receiver in a HOT standby node. The information returned includes the PID of the WAL receiver process, its status (stopped, starting, streaming, etc), start LSN and TLI, last received LSN and TLI, timestamp of last message send and receipt, latest end-of-WAL LSN and time, and the name of the slot (if any). Access to the detailed data is only granted to superusers; others only get the PID. Author: Michael Paquier Reviewer: Haribabu Kommi	2016-01-07 16:21:19 -03:00
Alvaro Herrera	abb1733922	Add scale(numeric) Author: Marko Tiikkaja	2016-01-05 19:02:13 -03:00
Tom Lane	ea0d494dae	Make the to_reg*() functions accept text not cstring. Using cstring as the input type was a poor decision, because that's not really a full-fledged type. In particular, it lacks implicit coercions from text or varchar, meaning that usages like to_regproc('foo'\|\|'bar') wouldn't work; basically the only case that did work without explicit casting was a simple literal constant argument. The lack of field complaints about this suggests that hardly anyone is using these functions, so hopefully fixing it won't cause much of a compatibility problem. They've only been there since 9.4, anyway. Petr Korobeinikov	2016-01-05 13:02:43 -05:00
Alvaro Herrera	efa318bcfa	Make pg_shseclabel available in early backend startup While the in-core authentication mechanism doesn't need to access pg_shseclabel at all, it's reasonable to think that an authentication hook will want to look at the label for the role logging in, or for rows in other catalogs used during the authentication phase of startup. Catalog version bumped, because this changes the "is nailed" status for pg_shseclabel. Author: Adam Brightwell	2016-01-05 14:50:53 -03:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Tom Lane	0dab5ef39b	Fix ALTER OPERATOR to update dependencies properly. Fix an oversight in commit 321eed5f0f7563a0: replacing an operator's selectivity functions needs to result in a corresponding update in pg_depend. We have a function that can handle that, but it was not called by AlterOperator(). To fix this without enlarging pg_operator.h's #include list beyond what clients can safely include, split off the function definitions into a new file pg_operator_fn.h, similarly to what we've done for some other catalog header files. It's not entirely clear whether any client-side code needs to include pg_operator.h, but it seems prudent to assume that there is some such code somewhere.	2015-12-31 17:37:31 -05:00
Joe Conway	241448b23a	Rename (new\|old)estCommitTs to (new\|old)estCommitTsXid The variables newestCommitTs and oldestCommitTs sound as if they are timestamps, but in fact they are the transaction Ids that correspond to the newest and oldest timestamps rather than the actual timestamps. Rename these variables to reflect that they are actually xids: to wit newestCommitTsXid and oldestCommitTsXid respectively. Also modify related code in a similar fashion, particularly the user facing output emitted by pg_controldata and pg_resetxlog. Complaint and patch by me, review by Tom Lane and Alvaro Herrera. Backpatch to 9.5 where these variables were first introduced.	2015-12-28 12:34:11 -08:00
Tom Lane	074c5cfbfb	Fix handling of inherited check constraints in ALTER COLUMN TYPE (again). The previous way of reconstructing check constraints was to do a separate "ALTER TABLE ONLY tab ADD CONSTRAINT" for each table in an inheritance hierarchy. However, that way has no hope of reconstructing the check constraints' own inheritance properties correctly, as pointed out in bug #13779 from Jan Dirk Zijlstra. What we should do instead is to do a regular "ALTER TABLE", allowing recursion, at the topmost table that has a particular constraint, and then suppress the work queue entries for inherited instances of the constraint. Annoyingly, we'd tried to fix this behavior before, in commit `5ed6546cf`, but we failed to notice that it wasn't reconstructing the pg_constraint field values correctly. As long as I'm touching pg_get_constraintdef_worker anyway, tweak it to always schema-qualify the target table name; this seems like useful backup to the protections installed by commit `5f173040`. In HEAD/9.5, get rid of get_constraint_relation_oids, which is now unused. (I could alternatively have modified it to also return conislocal, but that seemed like a pretty single-purpose API, so let's not pretend it has some other use.) It's unused in the back branches as well, but I left it in place just in case some third-party code has decided to use it. In HEAD/9.5, also rename pg_get_constraintdef_string to pg_get_constraintdef_command, as the previous name did nothing to explain what that entry point did differently from others (and its comment was equally useless). Again, that change doesn't seem like material for back-patching. I did a bit of re-pgindenting in tablecmds.c in HEAD/9.5, as well. Otherwise, back-patch to all supported branches.	2015-11-20 14:55:47 -05:00
Robert Haas	bc4996e61b	Make ALTER .. SET SCHEMA do nothing, instead of throwing an ERROR. This was already true for CREATE EXTENSION, but historically has not been true for other object types. Therefore, this is a backward incompatibility. Per discussion on pgsql-hackers, everyone seems to agree that the new behavior is better. Marti Raudsepp, reviewed by Haribabu Kommi and myself	2015-11-19 10:49:25 -05:00
Tom Lane	c5e86ea932	Add "xid <> xid" and "xid <> int4" operators. The corresponding "=" operators have been there a long time, and not having their negators is a bit of a nuisance. Michael Paquier	2015-11-07 16:40:15 -05:00
Robert Haas	a76ef15d9f	Add sort support routine for the UUID data type. This introduces a simple encoding scheme to produce abbreviated keys: pack as many bytes of each UUID as will fit into a Datum. On little-endian machines, a byteswap is also performed; the abbreviated comparator can therefore just consist of a simple 3-way unsigned integer comparison. The purpose of this change is to speed up sorting data on a column of type UUID. Peter Geoghegan	2015-11-06 12:14:35 -05:00
Robert Haas	816e336f12	Mark more functions parallel-restricted or parallel-unsafe. Commit `7aea8e4f2d` was overoptimistic about the degree of safety associated with running various functions in parallel mode. Functions that take a table name or OID as an argument are at least parallel-restricted, because the table might be temporary, and we currently don't allow parallel workers to touch temporary tables. Functions that take a query as an argument are outright unsafe, because the query could be anything, including a parallel-unsafe query. Also, the queue of pending notifications is backend-private, so adding to it from a worker doesn't behave correctly. We could fix this by transferring the worker's queue of pending notifications to the master during worker cleanup, but that seems like more trouble than it's worth for now. In addition to adjusting the pg_proc.h markings, also add an explicit check for this in async.c.	2015-10-16 11:49:31 -04:00
Bruce Momjian	b852dc4cbd	docs: clarify JSONB operator descriptions No catalog bump as the catalog changes are for SQL operator comments. Backpatch through 9.5	2015-10-07 09:06:49 -04:00
Stephen Frost	4158cc3793	Do not write out WCOs in Query The WithCheckOptions list in Query are only populated during rewrite and do not need to be written out or read in as part of a Query structure. Further, move WithCheckOptions to the bottom and add comments to clarify that it is only populated during rewrite. Back-patch to 9.5 with a catversion bump, as we are still in alpha.	2015-10-05 07:38:58 -04:00
Stephen Frost	088c83363a	ALTER TABLE .. FORCE ROW LEVEL SECURITY To allow users to force RLS to always be applied, even for table owners, add ALTER TABLE .. FORCE ROW LEVEL SECURITY. row_security=off overrides FORCE ROW LEVEL SECURITY, to ensure pg_dump output is complete (by default). Also add SECURITY_NOFORCE_RLS context to avoid data corruption when ALTER TABLE .. FORCE ROW SECURITY is being used. The SECURITY_NOFORCE_RLS security context is used only during referential integrity checks and is only considered in check_enable_rls() after we have already checked that the current user is the owner of the relation (which should always be the case during referential integrity checks). Back-patch to 9.5 where RLS was added.	2015-10-04 21:05:08 -04:00
Noah Misch	3cb0a7e75a	Make BYPASSRLS behave like superuser RLS bypass. Specifically, make its effect independent from the row_security GUC, and make it affect permission checks pertinent to views the BYPASSRLS role owns. The row_security GUC thereby ceases to change successful-query behavior; it can only make a query fail with an error. Back-patch to 9.5, where BYPASSRLS was introduced.	2015-10-03 20:19:57 -04:00
Robert Haas	7aea8e4f2d	Determine whether it's safe to attempt a parallel plan for a query. Commit `924bcf4f16` introduced a framework for parallel computation in PostgreSQL that makes most but not all built-in functions safe to execute in parallel mode. In order to have parallel query, we'll need to be able to determine whether that query contains functions (either built-in or user-defined) that cannot be safely executed in parallel mode. This requires those functions to be labeled, so this patch introduces an infrastructure for that. Some functions currently labeled as safe may need to be revised depending on how pending issues related to heavyweight locking under paralllelism are resolved. Parallel plans can't be used except for the case where the query will run to completion. If portal execution were suspended, the parallel mode restrictions would need to remain in effect during that time, but that might make other queries fail. Therefore, this patch introduces a framework that enables consideration of parallel plans only when it is known that the plan will be run to completion. This probably needs some refinement; for example, at bind time, we do not know whether a query run via the extended protocol will be execution to completion or run with a limited fetch count. Having the client indicate its intentions at bind time would constitute a wire protocol break. Some contexts in which parallel mode would be safe are not adjusted by this patch; the default is not to try parallel plans except from call sites that have been updated to say that such plans are OK. This commit doesn't introduce any parallel paths or plans; it just provides a way to determine whether they could potentially be used. I'm committing it on the theory that the remaining parallel sequential scan patches will also get committed to this release, hopefully in the not-too-distant future. Robert Haas and Amit Kapila. Reviewed (in earlier versions) by Noah Misch.	2015-09-16 15:38:47 -04:00
Andres Freund	6fcd88511f	Allow pg_create_physical_replication_slot() to reserve WAL. When creating a physical slot it's often useful to immediately reserve the current WAL position instead of only doing after the first feedback message arrives. That e.g. allows slots to guarantee that all the WAL for a base backup will be available afterwards. Logical slots already have to reserve WAL during creation, so generalize that logic into being usable for both physical and logical slots. Catversion bump because of the new parameter. Author: Gurjeet Singh Reviewed-By: Andres Freund Discussion: CABwTF4Wh_dBCzTU=49pFXR6coR4NW1ynb+vBqT+Po=7fuq5iCw@mail.gmail.com	2015-08-11 12:34:31 +02:00
Andres Freund	3f811c2d6f	Add confirmed_flush column to pg_replication_slots. There's no reason not to expose both restart_lsn and confirmed_flush since they have rather distinct meanings. The former is the oldest WAL still required and valid for both physical and logical slots, whereas the latter is the location up to which a logical slot's consumer has confirmed receiving data. Most of the time a slot will require older WAL (i.e. restart_lsn) than the confirmed position (i.e. confirmed_flush_lsn). Author: Marko Tiikkaja, editorialized by me Discussion: 559D110B.1020109@joh.to	2015-08-10 13:28:18 +02:00
Andres Freund	4eda0a6470	Don't include low level locking code from frontend code. Some frontend code like e.g. pg_xlogdump or pg_resetxlog, has to use backend headers. Unfortunately until now that code includes most of the locking code. It's generally not nice to expose such low level details, but `de6fd1c898` made that a hard problem. We fall back to defining 'inline' away if the compiler doesn't support it - that can cause linker errors like on buildfarm animal pademelon if a inline function references backend only code. To fix that problem separate definitions from lock.h that are required from frontend code into lockdefs.h and use it in the relevant places. I've only removed the minimal amount of necessary definitions for now - it might turn out that we want more for other reasons. To avoid such details being exposed again put some checks against being included from frontend code into atomics.h, lock.h, lwlock.h and s_lock.h. It's otherwise fairly easy to indirectly include these headers. Discussion: 20150806070902.GE12214@awork2.anarazel.de	2015-08-07 15:10:56 +02:00
Noah Misch	b8fe12a836	Reconcile nodes/*funcs.c with recent work. A few of the discrepancies had semantic significance, but I did not track down the resulting user-visible bugs, if any. Back-patch to 9.5, where all but one discrepancy appeared. The _equalCreateEventTrigStmt() situation dates to 9.3 but does not affect semantics. catversion bump due to readfuncs.c field order changes.	2015-08-05 20:44:27 -04:00
Alvaro Herrera	2834855cb9	Fix BRIN to use SnapshotAny during summarization For correctness of summarization results, it is critical that the snapshot used during the summarization scan is able to see all tuples that are live to all transactions -- including tuples inserted or deleted by in-progress transactions. Otherwise, it would be possible for a transaction to insert a tuple, then idle for a long time while a concurrent transaction executes summarization of the range: this would result in the inserted value not being considered in the summary. Previously we were trying to use a MVCC snapshot in conjunction with adding a "placeholder" tuple in the index: the snapshot would see all committed tuples, and the placeholder tuple would catch insertions by any new inserters. The hole is that prior insertions by transactions that are still in progress by the time the MVCC snapshot was taken were ignored. Kevin Grittner reported this as a bogus error message during vacuum with default transaction isolation mode set to repeatable read (because the error report mentioned a function name not being invoked during), but the problem is larger than that. To fix, tweak IndexBuildHeapRangeScan to have a new mode that behaves the way we need using SnapshotAny visibility rules. This change simplifies the BRIN code a bit, mainly by removing large comments that were mistaken. Instead, rely on the SnapshotAny semantics to provide what it needs. (The business about a placeholder tuple needs to remain: that covers the case that a transaction inserts a a tuple in a page that summarization already scanned.) Discussion: https://www.postgresql.org/message-id/20150731175700.GX2441@postgresql.org In passing, remove a couple of unused declarations from brin.h and reword a comment to be proper English. This part submitted by Kevin Grittner. Backpatch to 9.5, where BRIN was introduced.	2015-08-05 16:20:50 -03:00
Alvaro Herrera	e8e86fbc8b	Fix volatility marking of commit timestamp functions They are marked stable, but since they act on instantaneous state and it is possible to consult state of transactions as they commit, the results could change mid-query. They need to be marked volatile, and this commit does so. There would normally be a catversion bump here, but this is so much a niche feature and I don't believe there's real damage from the incorrect marking, that I refrained. Backpatch to 9.5, where commit timestamps where introduced. Per note from Fujii Masao.	2015-07-30 15:19:49 -03:00
Joe Conway	f781a0f1d8	Create a pg_shdepend entry for each role in TO clause of policies. CreatePolicy() and AlterPolicy() omit to create a pg_shdepend entry for each role in the TO clause. Fix this by creating a new shared dependency type called SHARED_DEPENDENCY_POLICY and assigning it to each role. Reported by Noah Misch. Patch by me, reviewed by Alvaro Herrera. Back-patch to 9.5 where RLS was introduced.	2015-07-28 16:01:53 -07:00
Joe Conway	1e2bd43b31	Bump catversion so that HEAD is beyond 9.5 As pointed out by Tom, since HEAD has progressed beyond 9.5 in terms of its catalog, we need to be sure catversion of HEAD is advanced beyond that of 9.5. Corrects my mistake in the pg_stats view commit `cfa928ff`.	2015-07-28 13:59:23 -07:00
Joe Conway	7b4bfc87d5	Plug RLS related information leak in pg_stats view. The pg_stats view is supposed to be restricted to only show rows about tables the user can read. However, it sometimes can leak information which could not otherwise be seen when row level security is enabled. Fix that by not showing pg_stats rows to users that would be subject to RLS on the table the row is related to. This is done by creating/using the newly introduced SQL visible function, row_security_active(). Along the way, clean up three call sites of check_enable_rls(). The second argument of that function should only be specified as other than InvalidOid when we are checking as a different user than the current one, as in when querying through a view. These sites were passing GetUserId() instead of InvalidOid, which can cause the function to return incorrect results if the current user has the BYPASSRLS privilege and row_security has been set to OFF. Additionally fix a bug causing RI Trigger error messages to unintentionally leak information when RLS is enabled, and other minor cleanup and improvements. Also add WITH (security_barrier) to the definition of pg_stats. Bumped CATVERSION due to new SQL functions and pg_stats view definition. Back-patch to 9.5 where RLS was introduced. Reported by Yaroslav. Patch by Joe Conway and Dean Rasheed with review and input by Michael Paquier and Stephen Frost.	2015-07-28 13:21:22 -07:00
Tom Lane	dd7a8f66ed	Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.	2015-07-25 14:39:00 -04:00
Alvaro Herrera	149b1dd840	Fix omission of OCLASS_TRANSFORM in object_classes[] This was forgotten in `cac7658205` (and its fixup `ad89a5d115`). Since it seems way too easy to miss this, this commit also introduces a mechanism to enforce that the array is consistent with the enum. Problem reported independently by Robert Haas and Jaimin Pan. Patches proposed by Jaimin Pan, Jim Nasby, Michael Paquier and myself, though I didn't use any of these and instead went with a cleaner approach suggested by Tom Lane. Backpatch to 9.5. Discussion: https://www.postgresql.org/message-id/CA+Tgmoa6SgDaxW_n_7SEhwBAc=mniYga+obUj5fmw4rU9_mLvA@mail.gmail.com https://www.postgresql.org/message-id/29788.1437411581@sss.pgh.pa.us	2015-07-21 13:20:53 +02:00
Robert Haas	a04bb65f70	Add new function pg_notification_queue_usage. This tells you what fraction of NOTIFY's queue is currently filled. Brendan Jurd, reviewed by Merlin Moncure and Gurjeet Singh. A few further tweaks by me.	2015-07-17 09:12:03 -04:00
Tom Lane	10fb48d66d	Add an optional missing_ok argument to SQL function current_setting(). This allows convenient checking for existence of a GUC from SQL, which is particularly useful when dealing with custom variables. David Christensen, reviewed by Jeevan Chalke	2015-07-02 16:41:07 -04:00
Heikki Linnakangas	7931622d1d	Fix name of argument to pg_stat_file. It's called "missing_ok" in the docs and in the C code. I refrained from doing a catversion bump for this, because the name of an input argument is just documentation, it has no effect on any callers. Michael Paquier	2015-07-02 12:15:13 +03:00
Tom Lane	62d16c7fc5	Improve design and implementation of pg_file_settings view. As first committed, this view reported on the file contents as they were at the last SIGHUP event. That's not as useful as reporting on the current contents, and what's more, it didn't work right on Windows unless the current session had serviced at least one SIGHUP. Therefore, arrange to re-read the files when pg_show_all_settings() is called. This requires only minor refactoring so that we can pass changeVal = false to set_config_option() so that it won't actually apply any changes locally. In addition, add error reporting so that errors that would prevent the configuration files from being loaded, or would prevent individual settings from being applied, are visible directly in the view. This makes the view usable for pre-testing whether edits made in the config files will have the desired effect, before one actually issues a SIGHUP. I also added an "applied" column so that it's easy to identify entries that are superseded by later entries; this was the main use-case for the original design, but it seemed unnecessarily hard to use for that. Also fix a 9.4.1 regression that allowed multiple entries for a PGC_POSTMASTER variable to cause bogus complaints in the postmaster log. (The issue here was that commit `bf007a27ac` unintentionally reverted `3e3f65973a`, which suppressed any duplicate entries within ParseConfigFp. However, since the original coding of the pg_file_settings view depended on such suppression not happening, we couldn't have fixed this issue now without first doing something with pg_file_settings. Now we suppress duplicates by marking them "ignored" within ProcessConfigFileInternal, which doesn't hide them in the view.) Lesser changes include: Drive the view directly off the ConfigVariable list, instead of making a basically-equivalent second copy of the data. There's no longer any need to hang onto the data permanently, anyway. Convert show_all_file_settings() to do its work in one call and return a tuplestore; this avoids risks associated with assuming that the GUC state will hold still over the course of query execution. (I think there were probably latent bugs here, though you might need something like a cursor on the view to expose them.) Arrange to run SIGHUP processing in a short-lived memory context, to forestall process-lifespan memory leaks. (There is one known leak in this code, in ProcessConfigDirectory; it seems minor enough to not be worth back-patching a specific fix for.) Remove mistaken assignment to ConfigFileLineno that caused line counting after an include_dir directive to be completely wrong. Add missed failure check in AlterSystemSetConfigFile(). We don't really expect ParseConfigFp() to fail, but that's not an excuse for not checking.	2015-06-28 18:06:14 -04:00
Heikki Linnakangas	cb2acb1081	Add missing_ok option to the SQL functions for reading files. This makes it possible to use the functions without getting errors, if there is a chance that the file might be removed or renamed concurrently. pg_rewind needs to do just that, although this could be useful for other purposes too. (The changes to pg_rewind to use these functions will come in a separate commit.) The read_binary_file() function isn't very well-suited for extensions.c's purposes anymore, if it ever was. So bite the bullet and make a copy of it in extension.c, tailored for that use case. This seems better than the accidental code reuse, even if it's a some more lines of code. Michael Paquier, with plenty of kibitzing by me.	2015-06-28 21:35:46 +03:00
Andrew Dunstan	908e234733	Rename jsonb - text[] operator to #- to avoid ambiguity. Following recent discussion on -hackers. The underlying function is also renamed to jsonb_delete_path. The regression tests now don't need ugly type casts to avoid the ambiguity, so they are also removed. Catalog version bumped.	2015-06-11 10:06:58 -04:00
Fujii Masao	ea9c4c1e4a	Fix typo in comment. David Rowley	2015-06-10 15:26:02 +09:00
Andrew Dunstan	37def42245	Rename jsonb_replace to jsonb_set and allow it to add new values The function is given a fourth parameter, which defaults to true. When this parameter is true, if the last element of the path is missing in the original json, jsonb_set creates it in the result and assigns it the new value. If it is false then the function does nothing unless all elements of the path are present, including the last. Based on some original code from Dmitry Dolgov, heavily modified by me. Catalog version bumped.	2015-05-31 20:34:10 -04:00
Tom Lane	1c8c656b3c	Check that all aliases of a built-in function have same leakproof property. opr_sanity.sql has a test checking that relevant properties of built-in functions match when the same C function is referenced by multiple pg_proc entries. The test neglected to check proleakproof, though, and when I added that condition it exposed that xideqint4 hadn't been updated to match xideq. So fix that as well, and in consequence bump catversion. This isn't very critical, so no need to worry about fixing back branches.	2015-05-29 13:26:21 -04:00
Bruce Momjian	807b9e0dff	pgindent run for 9.5	2015-05-23 21:35:49 -04:00
Andres Freund	631d749007	Remove the new UPSERT command tag and use INSERT instead. Previously, INSERT with ON CONFLICT DO UPDATE specified used a new command tag -- UPSERT. It was introduced out of concern that INSERT as a command tag would be a misrepresentation for ON CONFLICT DO UPDATE, as some affected rows may actually have been updated. Alvaro Herrera noticed that the implementation of that new command tag was incomplete; in subsequent discussion we concluded that having it doesn't provide benefits that are in line with the compatibility breaks it requires. Catversion bump due to the removal of PlannedStmt->isUpsert. Author: Peter Geoghegan Discussion: 20150520215816.GI5885@postgresql.org	2015-05-23 00:58:45 +02:00
Heikki Linnakangas	4fc72cc7bb	Collection of typo fixes. Use "a" and "an" correctly, mostly in comments. Two error messages were also fixed (they were just elogs, so no translation work required). Two function comments in pg_proc.h were also fixed. Etsuro Fujita reported one of these, but I found a lot more with grep. Also fix a few other typos spotted while grepping for the a/an typos. For example, "consists out of ..." -> "consists of ...". Plus a "though"/ "through" mixup reported by Euler Taveira. Many of these typos were in old code, which would be nice to backpatch to make future backpatching easier. But much of the code was new, and I didn't feel like crafting separate patches for each branch. So no backpatching.	2015-05-20 16:56:22 +03:00
Tom Lane	0b28ea79c0	Avoid collation dependence in indexes of system catalogs. No index in template0 should have collation-dependent ordering, especially not indexes on shared catalogs. For most textual columns we avoid this issue by using type "name" (which sorts per strcmp()). However there are a few indexed columns that we'd prefer to use "text" for, and for that, the default opclass text_ops is unsafe. Fortunately, text_pattern_ops is safe (it sorts per memcmp()), and it has no real functional disadvantage for our purposes. So change the indexes on pg_seclabel.provider and pg_shseclabel.provider to use text_pattern_ops. In passing, also mark pg_replication_origin.roname as using text_pattern_ops --- for some reason it was labeled varchar_pattern_ops which is just wrong, even though it accidentally worked. Add regression test queries to catch future errors of these kinds. We still can't do anything about the misdeclared pg_seclabel and pg_shseclabel indexes in back branches :-(	2015-05-19 11:47:42 -04:00
Tom Lane	afee04352b	Revert "Change pg_seclabel.provider and pg_shseclabel.provider to type "name"." This reverts commit `b82a7be603`. There is a better (less invasive) way to fix it, which I will commit next.	2015-05-19 10:40:04 -04:00
Tom Lane	b82a7be603	Change pg_seclabel.provider and pg_shseclabel.provider to type "name". These were "text", but that's a bad idea because it has collation-dependent ordering. No index in template0 should have collation-dependent ordering, especially not indexes on shared catalogs. There was general agreement that provider names don't need to be longer than other identifiers, so we can fix this at a small waste of table space by changing from text to name. There's no way to fix the problem in the back branches, but we can hope that security labels don't yet have widespread-enough usage to make it urgent to fix. There needs to be a regression sanity test to prevent us from making this same mistake again; but before putting that in, we'll need to get rid of similar brain fade in the recently-added pg_replication_origin catalog. Note: for lack of a suitable testing environment, I've not really exercised this change. I trust the buildfarm will show up any mistakes.	2015-05-18 20:07:53 -04:00
Andres Freund	f3d3118532	Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com	2015-05-16 03:46:31 +02:00
Alvaro Herrera	b0b7be6133	Add BRIN infrastructure for "inclusion" opclasses This lets BRIN be used with R-Tree-like indexing strategies. Also provided are operator classes for range types, box and inet/cidr. The infrastructure provided here should be sufficient to create operator classes for similar datatypes; for instance, opclasses for PostGIS geometries should be doable, though we didn't try to implement one. (A box/point opclass was also submitted, but we ripped it out before commit because the handling of floating point comparisons in existing code is inconsistent and would generate corrupt indexes.) Author: Emre Hasegeli. Cosmetic changes by me Review: Andreas Karlsson	2015-05-15 18:05:22 -03:00
Simon Riggs	f6d208d6e5	TABLESAMPLE, SQL Standard and extensible Add a TABLESAMPLE clause to SELECT statements that allows user to specify random BERNOULLI sampling or block level SYSTEM sampling. Implementation allows for extensible sampling functions to be written, using a standard API. Basic version follows SQLStandard exactly. Usable concrete use cases for the sampling API follow in later commits. Petr Jelinek Reviewed by Michael Paquier and Simon Riggs	2015-05-15 14:37:10 -04:00
Heikki Linnakangas	35fcb1b3d0	Allow GiST distance function to return merely a lower-bound. The distance function can now set *recheck = false, like index quals. The executor will then re-check the ORDER BY expressions, and use a queue to reorder the results on the fly. This makes it possible to do kNN-searches on polygons and circles, which don't store the exact value in the index, but just a bounding box. Alexander Korotkov and me	2015-05-15 14:26:51 +03:00
Fujii Masao	ecd222e770	Support VERBOSE option in REINDEX command. When this option is specified, a progress report is printed as each index is reindexed. Per discussion, we agreed on the following syntax for the extensibility of the options. REINDEX (flexible options) { INDEX \| ... } name Sawada Masahiko. Reviewed by Robert Haas, Fabrízio Mello, Alvaro Herrera, Kyotaro Horiguchi, Jim Nasby and me. Discussion: CAD21AoA0pK3YcOZAFzMae+2fcc3oGp5zoRggDyMNg5zoaWDhdQ@mail.gmail.com	2015-05-15 20:09:57 +09:00
Peter Eisentraut	a486e35706	Add pg_settings.pending_restart column with input from David G. Johnston, Robert Haas, Michael Paquier	2015-05-14 20:08:51 -04:00
Andrew Dunstan	5c7df74204	Fix some errors from jsonb functions patch. The catalog version should have been bumped, and the alternative regression result file was not up to date with the name of jsonb_pretty.	2015-05-12 16:54:38 -04:00
Andrew Dunstan	c6947010ce	Additional functions and operators for jsonb jsonb_pretty(jsonb) produces nicely indented json output. jsonb \|\| jsonb concatenates two jsonb values. jsonb - text removes a key and its associated value from the json jsonb - int removes the designated array element jsonb - text[] removes a key and associated value or array element at the designated path jsonb_replace(jsonb,text[],jsonb) replaces the array element designated by the path or the value associated with the key designated by the path with the given value. Original work by Dmitry Dolgov, adapted and reworked for PostgreSQL core by Andrew Dunstan, reviewed and tidied up by Petr Jelinek.	2015-05-12 15:52:45 -04:00
Alvaro Herrera	b488c580ae	Allow on-the-fly capture of DDL event details This feature lets user code inspect and take action on DDL events. Whenever a ddl_command_end event trigger is installed, DDL actions executed are saved to a list which can be inspected during execution of a function attached to ddl_command_end. The set-returning function pg_event_trigger_ddl_commands can be used to list actions so captured; it returns data about the type of command executed, as well as the affected object. This is sufficient for many uses of this feature. For the cases where it is not, we also provide a "command" column of a new pseudo-type pg_ddl_command, which is a pointer to a C structure that can be accessed by C code. The struct contains all the info necessary to completely inspect and even reconstruct the executed command. There is no actual deparse code here; that's expected to come later. What we have is enough infrastructure that the deparsing can be done in an external extension. The intention is that we will add some deparsing code in a later release, as an in-core extension. A new test module is included. It's probably insufficient as is, but it should be sufficient as a starting point for a more complete and future-proof approach. Authors: Álvaro Herrera, with some help from Andres Freund, Ian Barwick, Abhijit Menon-Sen. Reviews by Andres Freund, Robert Haas, Amit Kapila, Michael Paquier, Craig Ringer, David Steele. Additional input from Chris Browne, Dimitri Fontaine, Stephen Frost, Petr Jelínek, Tom Lane, Jim Nasby, Steven Singer, Pavel Stěhule. Based on original work by Dimitri Fontaine, though I didn't use his code. Discussion: https://www.postgresql.org/message-id/m2txrsdzxa.fsf@2ndQuadrant.fr https://www.postgresql.org/message-id/20131108153322.GU5809@eldon.alvh.no-ip.org https://www.postgresql.org/message-id/20150215044814.GL3391@alvh.no-ip.org	2015-05-11 19:14:31 -03:00
Andrew Dunstan	cb9fa802b3	Add new OID alias type regnamespace Catalog version bumped Kyotaro HORIGUCHI	2015-05-09 13:36:52 -04:00
Andrew Dunstan	0c90f6769d	Add new OID alias type regrole The new type has the scope of whole the database cluster so it doesn't behave the same as the existing OID alias types which have database scope, concerning object dependency. To avoid confusion constants of the new type are prohibited from appearing where dependencies are made involving it. Also, add a note to the docs about possible MVCC violation and optimization issues, which are general over the all reg* types. Kyotaro Horiguchi	2015-05-09 13:06:49 -04:00
Stephen Frost	4b342fb591	Bump catversion for pg_file_settings Pointed out by Andres (thanks!) Apologies for not including it in the initial patch.	2015-05-08 19:14:32 -04:00
Stephen Frost	a97e0c3354	Add pg_file_settings view and function The function and view added here provide a way to look at all settings in postgresql.conf, any #include'd files, and postgresql.auto.conf (which is what backs the ALTER SYSTEM command). The information returned includes the configuration file name, line number in that file, sequence number indicating when the parameter is loaded (useful to see if it is later masked by another definition of the same parameter), parameter name, and what it is set to at that point. This information is updated on reload of the server. This is unfiltered, privileged, information and therefore access is restricted to superusers through the GRANT system. Author: Sawada Masahiko, various improvements by me. Reviewers: David Steele	2015-05-08 19:09:26 -04:00
Andres Freund	168d5805e4	Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE. The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.	2015-05-08 05:43:10 +02:00
Andres Freund	2c8f4836db	Represent columns requiring insert and update privileges indentently. Previously, relation range table entries used a single Bitmapset field representing which columns required either UPDATE or INSERT privileges, despite the fact that INSERT and UPDATE privileges are separately cataloged, and may be independently held. As statements so far required either insert or update privileges but never both, that was sufficient. The required permission could be inferred from the top level statement run. The upcoming INSERT ... ON CONFLICT UPDATE feature needs to independently check for both privileges in one statement though, so that is not sufficient anymore. Bumps catversion as stored rules change. Author: Peter Geoghegan Reviewed-By: Andres Freund	2015-05-08 00:20:46 +02:00
Alvaro Herrera	db5f98ab4f	Improve BRIN infra, minmax opclass and regression test The minmax opclass was using the wrong support functions when cross-datatypes queries were run. Instead of trying to fix the pg_amproc definitions (which apparently is not possible), use the already correct pg_amop entries instead. This requires jumping through more hoops (read: extra syscache lookups) to obtain the underlying functions to execute, but it is necessary for correctness. Author: Emre Hasegeli, tweaked by Álvaro Review: Andreas Karlsson Also change BrinOpcInfo to record each stored type's typecache entry instead of just the OID. Turns out that the full type cache is necessary in brin_deform_tuple: the original code used the indexed type's byval and typlen properties to extract the stored tuple, which is correct in Minmax; but in other implementations that want to store something different, that's wrong. The realization that this is a bug comes from Emre also, but I did not use his patch. I also adopted Emre's regression test code (with smallish changes), which is more complete.	2015-05-07 13:02:22 -03:00
Alvaro Herrera	3b6db1f445	Add geometry/range functions to support BRIN inclusion This commit adds the following functions: box(point) -> box bound_box(box, box) -> box inet_same_family(inet, inet) -> bool inet_merge(inet, inet) -> cidr range_merge(anyrange, anyrange) -> anyrange The first of these is also used to implement a new assignment cast from point to box. These functions are the first part of a base to implement an "inclusion" operator class for BRIN, for multidimensional data types. Author: Emre Hasegeli Reviewed by: Andreas Karlsson	2015-05-05 15:22:24 -03:00
Tom Lane	2503982be4	Improve procost estimates for some text search functions. The text search functions that involve parsing raw text into lexemes are remarkably CPU-intensive, so estimating them at the same cost as most other built-in functions seems like a mistake; moreover, doing so turns out to discourage the optimizer from using functional indexes on these functions. After some debate, we've agreed to raise procost from 1 to 100 for to_tsvector(), plainto_tsvector(), to_tsquery(), ts_headline(), ts_match_tt(), and ts_match_tq(), which are all the text search functions that parse raw text. Also increase procost for the 2-argument form of ts_rewrite() (tsquery_rewrite_query); while this function doesn't do text parsing, it does execute a user-supplied SQL query, so its previous procost of 1 is clearly a drastic underestimate. It seems reasonable to assign it the same cost we assign to PL functions by default, so 100 is the number here too. I did not bother bumping catversion for this change, since it does not break catalog compatibility with the server executable nor result in any regression test changes. Per complaint from Andrew Gierth and subsequent discussion.	2015-05-04 15:38:57 -04:00
Andres Freund	2b22795b32	Copy editing of the replication origins patch. Michael Paquier and myself.	2015-05-01 12:22:13 +02:00
Robert Haas	924bcf4f16	Create an infrastructure for parallel computation in PostgreSQL. This does four basic things. First, it provides convenience routines to coordinate the startup and shutdown of parallel workers. Second, it synchronizes various pieces of state (e.g. GUCs, combo CID mappings, transaction snapshot) from the parallel group leader to the worker processes. Third, it prohibits various operations that would result in unsafe changes to that state while parallelism is active. Finally, it propagates events that would result in an ErrorResponse, NoticeResponse, or NotifyResponse message being sent to the client from the parallel workers back to the master, from which they can then be sent on to the client. Robert Haas, Amit Kapila, Noah Misch, Rushabh Lathia, Jeevan Chalke. Suggestions and review from Andres Freund, Heikki Linnakangas, Noah Misch, Simon Riggs, Euler Taveira, and Jim Nasby.	2015-04-30 15:02:14 -04:00
Andres Freund	5aa2350426	Introduce replication progress tracking infrastructure. When implementing a replication solution ontop of logical decoding, two related problems exist: * How to safely keep track of replication progress * How to change replication behavior, based on the origin of a row; e.g. to avoid loops in bi-directional replication setups The solution to these problems, as implemented here, consist out of three parts: 1) 'replication origins', which identify nodes in a replication setup. 2) 'replication progress tracking', which remembers, for each replication origin, how far replay has progressed in a efficient and crash safe manner. 3) The ability to filter out changes performed on the behest of a replication origin during logical decoding; this allows complex replication topologies. E.g. by filtering all replayed changes out. Most of this could also be implemented in "userspace", e.g. by inserting additional rows contain origin information, but that ends up being much less efficient and more complicated. We don't want to require various replication solutions to reimplement logic for this independently. The infrastructure is intended to be generic enough to be reusable. This infrastructure also replaces the 'nodeid' infrastructure of commit timestamps. It is intended to provide all the former capabilities, except that there's only 2^16 different origins; but now they integrate with logical decoding. Additionally more functionality is accessible via SQL. Since the commit timestamp infrastructure has also been introduced in 9.5 (commit `73c986add`) changing the API is not a problem. For now the number of origins for which the replication progress can be tracked simultaneously is determined by the max_replication_slots GUC. That GUC is not a perfect match to configure this, but there doesn't seem to be sufficient reason to introduce a separate new one. Bumps both catversion and wal page magic. Author: Andres Freund, with contributions from Petr Jelinek and Craig Ringer Reviewed-By: Heikki Linnakangas, Petr Jelinek, Robert Haas, Steve Singer Discussion: 20150216002155.GI15326@awork2.anarazel.de, 20140923182422.GA15776@alap3.anarazel.de, 20131114172632.GE7522@alap2.anarazel.de	2015-04-29 19:30:53 +02:00
Peter Eisentraut	cac7658205	Add transforms feature This provides a mechanism for specifying conversions between SQL data types and procedural languages. As examples, there are transforms for hstore and ltree for PL/Perl and PL/Python. reviews by Pavel Stěhule and Andres Freund	2015-04-26 10:33:14 -04:00
Andres Freund	cef939c347	Rename pg_replication_slot's new active_in to active_pid. In `d811c037ce` active_in was added but discussion since showed that active_pid is preferred as a name. Discussion: CAMsr+YFKgZca5_7_ouaMWxA5PneJC9LNViPzpDHusaPhU9pA7g@mail.gmail.com	2015-04-22 09:43:40 +02:00
Andres Freund	d811c037ce	Add 'active_in' column to pg_replication_slots. Right now it is visible whether a replication slot is active in any session, but not in which. Adding the active_in column, containing the pid of the backend having acquired the slot, makes it much easier to associate pg_replication_slots entries with the corresponding pg_stat_replication/pg_stat_activity row. This should have been done from the start, but I (Andres) dropped the ball there somehow. Author: Craig Ringer, revised by me Discussion: CAMsr+YFKgZca5_7_ouaMWxA5PneJC9LNViPzpDHusaPhU9pA7g@mail.gmail.com	2015-04-21 11:51:06 +02:00
Bruce Momjian	f92fc4c95d	pg_upgrade: binary_upgrade_create_empty_extension() is strict Was broken by commit `30982be4e5`. Patch by Jeff Janes	2015-04-17 20:08:42 -04:00
Peter Eisentraut	30982be4e5	Integrate pg_upgrade_support module into backend Previously, these functions were created in a schema "binary_upgrade", which was deleted after pg_upgrade was finished. Because we don't want to keep that schema around permanently, move them to pg_catalog but rename them with a binary_upgrade_... prefix. The provided functions are only small wrappers around global variables that were added specifically for pg_upgrade use, so keeping the module separate does not create any modularity. The functions still check that they are only called in binary upgrade mode, so it is not possible to call these during normal operation. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2015-04-14 19:26:37 -04:00
Heikki Linnakangas	4f700bcd20	Reorganize our CRC source files again. Now that we use CRC-32C in WAL and the control file, the "traditional" and "legacy" CRC-32 variants are not used in any frontend programs anymore. Move the code for those back from src/common to src/backend/utils/hash. Also move the slicing-by-8 implementation (back) to src/port. This is in preparation for next patch that will add another implementation that uses Intel SSE 4.2 instructions to calculate CRC-32C, where available.	2015-04-14 17:03:42 +03:00
Magnus Hagander	9029f4b374	Add system view pg_stat_ssl This view shows information about all connections, such as if the connection is using SSL, which cipher is used, and which client certificate (if any) is used. Reviews by Alex Shulgin, Heikki Linnakangas, Andres Freund & Michael Paquier	2015-04-12 19:07:46 +02:00
Alvaro Herrera	e9a077cad3	pg_event_trigger_dropped_objects: add is_temp column It now also reports temporary objects dropped that are local to the backend. Previously we weren't reporting any temp objects because it was deemed unnecessary; but as it turns out, it is necessary if we want to keep close track of DDL command execution inside one session. Temp objects are reported as living in schema pg_temp, which works because such a schema-qualification always refers to the temp objects of the current session.	2015-04-06 11:40:55 -03:00
Robert Haas	abd94bcac4	Use abbreviated keys for faster sorting of numeric datums. Andrew Gierth, reviewed by Peter Geoghegan, with further tweaks by me.	2015-04-02 14:04:26 -04:00
Heikki Linnakangas	f770870d9e	Move inet/cidr GiST opclass functions to correct place in header file. They were accidentally placed under the GIN heading. Andreas Karlsson	2015-04-01 19:20:45 +03:00
Alvaro Herrera	97690ea6e8	Change array_offset to return subscripts, not offsets ... and rename it and its sibling array_offsets to array_position and array_positions, to account for the changed behavior. Having the functions return subscripts better matches existing practice, and is better suited to using the result value as a subscript into the array directly. For one-based arrays, the new definition is identical to what was originally committed. (We use the term "subscript" in the documentation, which is what we use whenever we talk about arrays; but the functions themselves are named using the word "position" to match the standard-defined POSITION() functions.) Author: Pavel Stěhule Behavioral problem noted by Dean Rasheed.	2015-03-30 16:13:21 -03:00
Heikki Linnakangas	0633a60f4d	Add index-only scan support to range type GiST opclass. Andreas Karlsson	2015-03-30 13:22:38 +03:00
Heikki Linnakangas	3a20b0e7b6	Add index-only scan support to inet GiST opclass. Andreas Karlsson	2015-03-28 15:11:53 +02:00
Heikki Linnakangas	d04c8ed904	Add support for index-only scans in GiST. This adds a new GiST opclass method, 'fetch', which is used to reconstruct the original Datum from the value stored in the index. Also, the 'canreturn' index AM interface function gains a new 'attno' argument. That makes it possible to use index-only scans on a multi-column index where some of the opclasses support index-only scans but some do not. This patch adds support in the box and point opclasses. Other opclasses can added later as follow-on patches (btree_gist would be particularly interesting). Anastasia Lubennikova, with additional fixes and modifications by me.	2015-03-26 19:12:00 +02:00
Alvaro Herrera	bdc3d7fa23	Return ObjectAddress in many ALTER TABLE sub-routines Since commit `a2e35b53c3`, most CREATE and ALTER commands return the ObjectAddress of the affected object. This is useful for event triggers to try to figure out exactly what happened. This patch extends this idea a bit further to cover ALTER TABLE as well: an auxiliary ObjectAddress is returned for each of several subcommands of ALTER TABLE. This makes it possible to decode with precision what happened during execution of any ALTER TABLE command; for instance, which constraint was added by ALTER TABLE ADD CONSTRAINT, or which parent got dropped from the parents list by ALTER TABLE NO INHERIT. As with the previous patch, there is no immediate user-visible change here. This is all really just continuing what `c504513f83` started. Reviewed by Stephen Frost.	2015-03-25 17:17:56 -03:00
Bruce Momjian	1c7087af42	Add TOAST table to pg_shseclabel for long label use Report by Andres Freund	2015-03-21 22:14:49 -04:00
Andres Freund	959277a4f5	Use 128-bit math to accelerate some aggregation functions. On platforms where we support 128bit integers, use them to implement faster transition functions for sum(int8), avg(int8), var_(int2/int4),stdev_(int2/int4). Where not supported continue to use numeric as a transition type. In some synthetic benchmarks this has been shown to provide significant speedups. Bumps catversion. Discussion: 544BB5F1.50709@proxel.se Author: Andreas Karlsson Reviewed-By: Peter Geoghegan, Petr Jelinek, Andres Freund, Oskari Saarenmaa, David Rowley	2015-03-20 10:29:32 +01:00
Alvaro Herrera	13dbc7a824	array_offset() and array_offsets() These functions return the offset position or positions of a value in an array. Author: Pavel Stěhule Reviewed by: Jim Nasby	2015-03-18 16:01:34 -03:00
Tom Lane	7b8b8a4331	Improve representation of PlanRowMark. This patch fixes two inadequacies of the PlanRowMark representation. First, that the original LockingClauseStrength isn't stored (and cannot be inferred for foreign tables, which always get ROW_MARK_COPY). Since some PlanRowMarks are created out of whole cloth and don't actually have an ancestral RowMarkClause, this requires adding a dummy LCS_NONE value to enum LockingClauseStrength, which is fairly annoying but the alternatives seem worse. This fix allows getting rid of the use of get_parse_rowmark() in FDWs (as per the discussion around commits `462bd95705` and `8ec8760fc8`), and it simplifies some things elsewhere. Second, that the representation assumed that all child tables in an inheritance hierarchy would use the same RowMarkType. That's true today but will soon not be true. We add an "allMarkTypes" field that identifies the union of mark types used in all a parent table's children, and use that where appropriate (currently, only in preprocess_targetlist()). In passing fix a couple of minor infelicities left over from the SKIP LOCKED patch, notably that _outPlanRowMark still thought waitPolicy is a bool. Catversion bump is required because the numeric values of enum LockingClauseStrength can appear in on-disk rules. Extracted from a much larger patch to support foreign table inheritance; it seemed worth breaking this out, since it's a separable concern. Shigeru Hanada and Etsuro Fujita, somewhat modified by me	2015-03-15 18:41:47 -04:00
Peter Eisentraut	bb8582abf3	Remove rolcatupdate This role attribute is an ancient PostgreSQL feature, but could only be set by directly updating the system catalogs, and it doesn't have any clearly defined use. Author: Adam Brightwell <adam.brightwell@crunchydatasolutions.com>	2015-03-06 23:42:38 -05:00
Alvaro Herrera	a2e35b53c3	Change many routines to return ObjectAddress rather than OID The changed routines are mostly those that can be directly called by ProcessUtilitySlow; the intention is to make the affected object information more precise, in support for future event trigger changes. Originally it was envisioned that the OID of the affected object would be enough, and in most cases that is correct, but upon actually implementing the event trigger changes it turned out that ObjectAddress is more widely useful. Additionally, some command execution routines grew an output argument that's an object address which provides further info about the executed command. To wit: * for ALTER DOMAIN / ADD CONSTRAINT, it corresponds to the address of the new constraint * for ALTER OBJECT / SET SCHEMA, it corresponds to the address of the schema that originally contained the object. * for ALTER EXTENSION {ADD, DROP} OBJECT, it corresponds to the address of the object added to or dropped from the extension. There's no user-visible change in this commit, and no functional change either. Discussion: 20150218213255.GC6717@tamriel.snowman.net Reviewed-By: Stephen Frost, Andres Freund	2015-03-03 14:10:50 -03:00
Tom Lane	b67f1ce181	Reduce json <=> jsonb casts from explicit-only to assignment level. There's no reason to make users write an explicit cast to store a json value in a jsonb column or vice versa. We could probably even make these implicit, but that might open us up to problems with ambiguous function calls, so for now just do this.	2015-03-03 11:26:04 -05:00
Noah Misch	b8a18ad485	Add transform functions for AT TIME ZONE. This makes "ALTER TABLE tabname ALTER tscol TYPE ... USING tscol AT TIME ZONE 'UTC'" skip rewriting the table when altering from "timestamp" to "timestamptz" or vice versa. While it would be nicer still to optimize this in the absence of the USING clause given timezone==UTC, transform functions must consult IMMUTABLE facts only.	2015-03-01 13:22:34 -05:00
Tom Lane	c063da1769	Add parse location fields to NullTest and BooleanTest structs. We did not need a location tag on NullTest or BooleanTest before, because no error messages referred directly to their locations. That's planned to change though, so add these fields in a separate housekeeping commit. Catversion bump because stored rules may change.	2015-02-22 14:40:27 -05:00
Andres Freund	82a532b34d	Force some system catalog table columns to be marked NOT NULL. In a manual pass over the catalog declaration I found a number of columns which the boostrap automatism didn't mark NOT NULL even though they actually were. Add BKI_FORCE_NOT_NULL markings to them. It's usually not critical if a system table column is falsely determined to be nullable as the code should always catch relevant cases. But it's good to have a extra layer in place. Discussion: 20150215170014.GE15326@awork2.anarazel.de	2015-02-21 22:37:05 +01:00
Andres Freund	eb68379c38	Allow forcing nullness of columns during bootstrap. Bootstrap determines whether a column is null based on simple builtin rules. Those work surprisingly well, but nonetheless a few existing columns aren't set correctly. Additionally there is at least one patch sent to hackers where forcing the nullness of a column would be helpful. The boostrap format has gained FORCE [NOT] NULL for this, which will be emitted by genbki.pl when BKI_FORCE_(NOT_)?NULL is specified for a column in a catalog header. This patch doesn't change the marking of any existing columns. Discussion: 20150215170014.GE15326@awork2.anarazel.de	2015-02-21 22:31:54 +01:00
Tom Lane	692bd09ad1	Use "#ifdef CATALOG_VARLEN" to protect nullable fields of pg_authid. This gives a stronger guarantee than a mere comment against accessing these fields as simple struct members. Since rolpassword is in fact varlena, it's not clear why these didn't get marked from the beginning, but let's do it now. Michael Paquier	2015-02-20 00:23:48 -05:00
Tom Lane	09d8d110a6	Use FLEXIBLE_ARRAY_MEMBER in a bunch more places. Replace some bogus "x[1]" declarations with "x[FLEXIBLE_ARRAY_MEMBER]". Aside from being more self-documenting, this should help prevent bogus warnings from static code analyzers and perhaps compiler misoptimizations. This patch is just a down payment on eliminating the whole problem, but it gets rid of a lot of easy-to-fix cases. Note that the main problem with doing this is that one must no longer rely on computing sizeof(the containing struct), since the result would be compiler-dependent. Instead use offsetof(struct, lastfield). Autoconf also warns against spelling that offsetof(struct, lastfield[0]). Michael Paquier, review and additional fixes by me.	2015-02-20 00:11:42 -05:00
Tom Lane	2fb7a75f37	Add pg_stat_get_snapshot_timestamp() to show statistics snapshot timestamp. Per discussion, this could be useful for purposes such as programmatically detecting a nonresponding stats collector. We already have the timestamp anyway, it's just a matter of providing a SQL-accessible function to fetch it. Matt Kelly, reviewed by Jim Nasby	2015-02-19 21:36:50 -05:00
Tom Lane	56a79a869b	Split array_push into separate array_append and array_prepend functions. There wasn't any good reason for a single C function to implement both these SQL functions: it saved very little code overall, and it required significant pushups to re-determine at runtime which case applied. Redoing it as two functions ends up with just slightly more lines of code, but it's simpler to understand, and faster too because we need not repeat syscache lookups on every call. An important side benefit is that this eliminates the only case in which different aliases of the same C function had both anyarray and anyelement arguments at the same position, which would almost always be a mistake. The opr_sanity regression test will now notice such mistakes since there's no longer a valid case where it happens.	2015-02-18 20:53:33 -05:00
Heikki Linnakangas	c619c2351f	Move pg_crc.c to src/common, and remove pg_crc_tables.h To get CRC functionality in a client program, you now need to link with libpgcommon instead of libpgport. The CRC code has nothing to do with portability, so libpgcommon is a better home. (libpgcommon didn't exist when pg_crc.c was originally moved to src/port.) Remove the possibility to get CRC functionality by just #including pg_crc_tables.h. I'm not aware of any extensions that actually did that and couldn't simply link with libpgcommon. This also moves the pg_crc.h header file from src/include/utils to src/include/common, which will require changes to any external programs that currently does #include "utils/pg_crc.h". That seems acceptable, as include/common is clearly the right home for it now, and the change needed to any such programs is trivial.	2015-02-09 11:17:56 +02:00
Tom Lane	3d660d33aa	Fix assorted oversights in range selectivity estimation. calc_rangesel() failed outright when comparing range variables to empty constant ranges with < or >=, as a result of missing cases in a switch. It also produced a bogus estimate for > comparison to an empty range. On top of that, the >= and > cases were mislabeled throughout. For nonempty constant ranges, they managed to produce the right answers anyway as a result of counterbalancing typos. Also, default_range_selectivity() omitted cases for elem <@ range, range &< range, and range &> range, so that rather dubious defaults were applied for these operators. In passing, rearrange the code in rangesel() so that the elem <@ range case is handled in a less opaque fashion. Report and patch by Emre Hasegeli, some additional work by me	2015-01-30 12:30:59 -05:00
Stephen Frost	c7cf9a2433	Add usebypassrls to pg_user and pg_shadow The row level security patches didn't add the 'usebypassrls' columns to the pg_user and pg_shadow views on the belief that they were deprecated, but we havn't actually said they are and therefore we should include it. This patch corrects that, adds missing documentation for rolbypassrls into the system catalog page for pg_authid, along with the entries for pg_user and pg_shadow, and cleans up a few other uses of 'row-level' cases to be 'row level' in the docs. Pointed out by Amit Kapila. Catalog version bump due to system view changes.	2015-01-28 21:47:15 -05:00
Tom Lane	fd496129d1	Clean up some mess in row-security patches. Fix unsafe coding around PG_TRY in RelationBuildRowSecurity: can't change a variable inside PG_TRY and then use it in PG_CATCH without marking it "volatile". In this case though it seems saner to avoid that by doing a single assignment before entering the TRY block. I started out just intending to fix that, but the more I looked at the row-security code the more distressed I got. This patch also fixes incorrect construction of the RowSecurityPolicy cache entries (there was not sufficient care taken to copy pass-by-ref data into the cache memory context) and a whole bunch of sloppiness around the definition and use of pg_policy.polcmd. You can't use nulls in that column because initdb will mark it NOT NULL --- and I see no particular reason why a null entry would be a good idea anyway, so changing initdb's behavior is not the right answer. The internal value of '\0' wouldn't be suitable in a "char" column either, so after a bit of thought I settled on using '*' to represent ALL. Chasing those changes down also revealed that somebody wasn't paying attention to what the underlying values of ACL_UPDATE_CHR etc really were, and there was a great deal of lackadaiscalness in the catalogs.sgml documentation for pg_policy and pg_policies too. This doesn't pretend to be a complete code review for the row-security stuff, it just fixes the things that were in my face while dealing with the bugs in RelationBuildRowSecurity.	2015-01-24 16:16:22 -05:00
Alvaro Herrera	972bf7d6f1	Tweak BRIN minmax operator class In the union support proc, we were not checking the hasnulls flag of value A early enough, so it could be skipped if the "allnulls" flag in value B is set. Also, a check on the allnulls flag of value "B" was redundant, so remove it. Also change inet_minmax_ops to not be the default opclass for type inet, as a future inclusion operator class would be more useful and it's pretty difficult to change default opclass for a datatype later on. (There is no catversion bump for this catalog change; this shouldn't be a problem.) Extracted from a larger patch to add an "inclusion" operator class. Author: Emre Hasegeli	2015-01-22 17:01:09 -03:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Alvaro Herrera	72dd233d3e	pg_event_trigger_dropped_objects: Add name/args output columns These columns can be passed to pg_get_object_address() and used to reconstruct the dropped objects identities in a remote server containing similar objects, so that the drop can be replicated. Reviewed by Stephen Frost, Heikki Linnakangas, Abhijit Menon-Sen, Andres Freund.	2014-12-30 17:41:46 -03:00
Alvaro Herrera	a676201490	Add pg_identify_object_as_address This function returns object type and objname/objargs arrays, which can be passed to pg_get_object_address. This is especially useful because the textual representation can be copied to a remote server in order to obtain the corresponding OID-based address. In essence, this function is the inverse of recently added pg_get_object_address(). Catalog version bumped due to the addition of the new function. Also add docs to pg_get_object_address.	2014-12-30 15:41:50 -03:00
Alvaro Herrera	a609d96778	Revert "Use a bitmask to represent role attributes" This reverts commit `1826987a46`. The overall design was deemed unacceptable, in discussion following the previous commit message; we might find some parts of it still salvageable, but I don't want to be on the hook for fixing it, so let's wait until we have a new patch.	2014-12-23 15:35:49 -03:00
Alvaro Herrera	d7ee82e50f	Add SQL-callable pg_get_object_address This allows access to get_object_address from SQL, which is useful to obtain OID addressing information from data equivalent to that emitted by the parser. This is necessary infrastructure of a project to let replication systems propagate object dropping events to remote servers, where the schema might be different than the server originating the DROP. This patch also adds support for OBJECT_DEFAULT to get_object_address; that is, it is now possible to refer to a column's default value. Catalog version bumped due to the new function. Reviewed by Stephen Frost, Heikki Linnakangas, Robert Haas, Andres Freund, Abhijit Menon-Sen, Adam Brightwell.	2014-12-23 15:31:29 -03:00
Alvaro Herrera	1826987a46	Use a bitmask to represent role attributes The previous representation using a boolean column for each attribute would not scale as well as we want to add further attributes. Extra auxilliary functions are added to go along with this change, to make up for the lost convenience of access of the old representation. Catalog version bumped due to change in catalogs and the new functions. Author: Adam Brightwell, minor tweaks by Álvaro Reviewed by: Stephen Frost, Andres Freund, Álvaro Herrera	2014-12-23 10:22:09 -03:00
Alvaro Herrera	0ee98d1cbf	pg_event_trigger_dropped_objects: add behavior flags Add "normal" and "original" flags as output columns to the pg_event_trigger_dropped_objects() function. With this it's possible to distinguish which objects, among those listed, need to be explicitely referenced when trying to replicate a deletion. This is necessary so that the list of objects can be pruned to the minimum necessary to replicate the DROP command in a remote server that might have slightly different schema (for instance, TOAST tables and constraints with different names and such.) Catalog version bumped due to change of function definition. Reviewed by: Abhijit Menon-Sen, Stephen Frost, Heikki Linnakangas, Robert Haas.	2014-12-19 15:00:45 -03:00
Heikki Linnakangas	4520ba6769	Add point <-> polygon distance operator. Alexander Korotkov, reviewed by Emre Hasegeli.	2014-12-15 17:06:21 +02:00
Andrew Dunstan	7e354ab9fe	Add several generator functions for jsonb that exist for json. The functions are: to_jsonb() jsonb_object() jsonb_build_object() jsonb_build_array() jsonb_agg() jsonb_object_agg() Also along the way some better logic is implemented in json_categorize_type() to match that in the newly implemented jsonb_categorize_type(). Andrew Dunstan, reviewed by Pavel Stehule and Alvaro Herrera.	2014-12-12 15:31:14 -05:00
Andrew Dunstan	237a882443	Add json_strip_nulls and jsonb_strip_nulls functions. The functions remove object fields, including in nested objects, that have null as a value. In certain cases this can lead to considerably smaller datums, with no loss of semantic information. Andrew Dunstan, reviewed by Pavel Stehule.	2014-12-12 09:00:43 -05:00
Simon Riggs	618c9430a8	Event Trigger for table_rewrite Generate a table_rewrite event when ALTER TABLE attempts to rewrite a table. Provide helper functions to identify table and reason. Intended use case is to help assess or to react to schema changes that might hold exclusive locks for long periods. Dimitri Fontaine, triggering an edit by Simon Riggs Reviewed in detail by Michael Paquier	2014-12-08 00:55:28 +09:00
Alvaro Herrera	73c986adde	Keep track of transaction commit timestamps Transactions can now set their commit timestamp directly as they commit, or an external transaction commit timestamp can be fed from an outside system using the new function TransactionTreeSetCommitTsData(). This data is crash-safe, and truncated at Xid freeze point, same as pg_clog. This module is disabled by default because it causes a performance hit, but can be enabled in postgresql.conf requiring only a server restart. A new test in src/test/modules is included. Catalog version bumped due to the new subdirectory within PGDATA and a couple of new SQL functions. Authors: Álvaro Herrera and Petr Jelínek Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven Singer, Peter Eisentraut	2014-12-03 11:53:02 -03:00
Tom Lane	1511521a36	Minor cleanup of function declarations for BRIN. Get rid of PG_FUNCTION_INFO_V1() macros, which are quite inappropriate for built-in functions (possibly leftovers from testing as a loadable module?). Also, fix gratuitous inconsistency between SQL-level and C-level names of the minmax support functions.	2014-12-02 14:07:54 -05:00
Tom Lane	866737c923	Add a #define for the inet overlaps operator. Extracted from pending inet selectivity patch. The rest of it isn't quite ready to commit, but we might as well push this part so the patch doesn't have to track the moving target of pg_operator.h.	2014-11-30 19:43:43 -05:00
Alvaro Herrera	816e10d800	Fix BRIN operator family definitions The original definitions were leaving no room for cross-type operators, so queries that compared a column of one type against something of a different type were not taking advantage of the index. Fix by making the opfamilies more like the ones for Btree, and include a few cross-type operator classes. Catalog version bumped. Per complaints from Hubert Lubaczewski, Mark Wong, Heikki Linnakangas.	2014-11-28 18:09:19 -03:00
Stephen Frost	143b39c185	Rename pg_rowsecurity -> pg_policy and other fixes As pointed out by Robert, we should really have named pg_rowsecurity pg_policy, as the objects stored in that catalog are policies. This patch fixes that and updates the column names to start with 'pol' to match the new catalog name. The security consideration for COPY with row level security, also pointed out by Robert, has also been addressed by remembering and re-checking the OID of the relation initially referenced during COPY processing, to make sure it hasn't changed under us by the time we finish planning out the query which has been built. Robert and Alvaro also commented on missing OCLASS and OBJECT entries for POLICY (formerly ROWSECURITY or POLICY, depending) in various places. This patch fixes that too, which also happens to add the ability to COMMENT on policies. In passing, attempt to improve the consistency of messages, comments, and documentation as well. This removes various incarnations of 'row-security', 'row-level security', 'Row-security', etc, in favor of 'policy', 'row level security' or 'row_security' as appropriate. Happy Thanksgiving!	2014-11-27 01:15:57 -05:00
Tom Lane	bac27394a1	Support arrays as input to array_agg() and ARRAY(SELECT ...). These cases formerly failed with errors about "could not find array type for data type". Now they yield arrays of the same element type and one higher dimension. The implementation involves creating functions with API similar to the existing accumArrayResult() family. I (tgl) also extended the base family by adding an initArrayResult() function, which allows callers to avoid special-casing the zero-inputs case if they just want an empty array as result. (Not all do, so the previous calling convention remains valid.) This allowed simplifying some existing code in xml.c and plperl.c. Ali Akbar, reviewed by Pavel Stehule, significantly modified by me	2014-11-25 12:21:28 -05:00
Heikki Linnakangas	0bd624d63b	Distinguish XLOG_FPI records generated for hint-bit updates. Add a new XLOG_FPI_FOR_HINT record type, and use that for full-page images generated for hint bit updates, when checksums are enabled. The new record type is replayed exactly the same as XLOG_FPI, but allows them to be tallied separately e.g. in pg_xlogdump.	2014-11-24 11:09:08 +02:00
Tom Lane	c5111ea9ca	Remove no-longer-needed phony typedefs in genbki.h. Now that we have a policy of hiding varlena catalog fields behind "#ifdef CATALOG_VARLEN", there is no need for their type names to be acceptable to the C compiler. And experimentation shows that it does not matter to pgindent either. (If it did, we'd have problems anyway, since these typedefs are unreferenced so far as the C compiler is concerned, and find_typedef fails to identify such typedefs.) Hence, remove the phony typedefs that genbki.h provided to make some varlena field definitions compilable. In passing, rearrange #define's into what seemed a more logical order.	2014-11-20 13:16:14 -05:00
Heikki Linnakangas	2c03216d83	Revamp the WAL record format. Each WAL record now carries information about the modified relation and block(s) in a standardized format. That makes it easier to write tools that need that information, like pg_rewind, prefetching the blocks to speed up recovery, etc. There's a whole new API for building WAL records, replacing the XLogRecData chains used previously. The new API consists of XLogRegister* functions, which are called for each buffer and chunk of data that is added to the record. The new API also gives more control over when a full-page image is written, by passing flags to the XLogRegisterBuffer function. This also simplifies the XLogReadBufferForRedo() calls. The function can dig the relation and block number from the WAL record, so they no longer need to be passed as arguments. For the convenience of redo routines, XLogReader now disects each WAL record after reading it, copying the main data part and the per-block data into MAXALIGNed buffers. The data chunks are not aligned within the WAL record, but the redo routines can assume that the pointers returned by XLogRecGet* functions are. Redo routines are now passed the XLogReaderState, which contains the record in the already-disected format, instead of the plain XLogRecord. The new record format also makes the fixed size XLogRecord header smaller, by removing the xl_len field. The length of the "main data" portion is now stored at the end of the WAL record, and there's a separate header after XLogRecord for it. The alignment padding at the end of XLogRecord is also removed. This compansates for the fact that the new format would otherwise be more bulky than the old format. Reviewed by Andres Freund, Amit Kapila, Michael Paquier, Alvaro Herrera, Fujii Masao.	2014-11-20 18:46:41 +02:00
Alvaro Herrera	85b506bbfc	Get rid of SET LOGGED indexes persistence kludge This removes ATChangeIndexesPersistence() introduced by `f41872d0c1` which was too ugly to live for long. Instead, the correct persistence marking is passed all the way down to reindex_index, so that the transient relation built to contain the index relfilenode can get marked correctly right from the start. Author: Fabrízio de Royes Mello Review and editorialization by Michael Paquier and Álvaro Herrera	2014-11-15 01:19:49 -03:00
Fujii Masao	1871c89202	Add generate_series(numeric, numeric). Платон Малюгин Reviewed by Michael Paquier, Ali Akbar and Marti Raudsepp	2014-11-11 21:44:46 +09:00
Alvaro Herrera	7516f52594	BRIN: Block Range Indexes BRIN is a new index access method intended to accelerate scans of very large tables, without the maintenance overhead of btrees or other traditional indexes. They work by maintaining "summary" data about block ranges. Bitmap index scans work by reading each summary tuple and comparing them with the query quals; all pages in the range are returned in a lossy TID bitmap if the quals are consistent with the values in the summary tuple, otherwise not. Normal index scans are not supported because these indexes do not store TIDs. As new tuples are added into the index, the summary information is updated (if the block range in which the tuple is added is already summarized) or not; in the latter case, a subsequent pass of VACUUM or the brin_summarize_new_values() function will create the summary information. For data types with natural 1-D sort orders, the summary info consists of the maximum and the minimum values of each indexed column within each page range. This type of operator class we call "Minmax", and we supply a bunch of them for most data types with B-tree opclasses. Since the BRIN code is generalized, other approaches are possible for things such as arrays, geometric types, ranges, etc; even for things such as enum types we could do something different than minmax with better results. In this commit I only include minmax. Catalog version bumped due to new builtin catalog entries. There's more that could be done here, but this is a good step forwards. Loosely based on ideas from Simon Riggs; code mostly by Álvaro Herrera, with contribution by Heikki Linnakangas. Patch reviewed by: Amit Kapila, Heikki Linnakangas, Robert Haas. Testing help from Jeff Janes, Erik Rijkers, Emanuel Calvo. PS: The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 318633.	2014-11-07 16:38:14 -03:00

... 7 8 9 10 11 ...

2515 Commits